news
LWN on Linux Kernel and Programming
-
LWN ☛ Inline socket-local storage for BPF
Martin Lau gave a talk in the BPF track of the 2025 Linux Storage, Filesystem, Memory-Management, and BPF Summit about a performance problem plaguing the networking subsystem, and some potential ways to fix it. He works on BPF programs that need to store socket-local data; amid other improvements to the networking and BPF subsystems, retrieving that data has become a noticeable bottleneck for his use case. His proposed fix prompted a good deal of discussion about how the data should be laid out.
One day, Lau said, Yonghong Song showed him an instruction-level profile of some kernel code from the networking subsystem. Two instructions in particular were much hotter than it seemed like they should be. In bpf_sk_storage_get() (which looks up socket-local data for a BPF program), the inline function bpf_local_storage_lookup() needs to dereference two pointers in order to retrieve the user data associated with a given socket. As it turns out, both of those pointer indirections were causing expensive cache misses.
-
LWN ☛ Better debugging information for inlined kernel functions
Modern compilers perform a lot of optimizations, which can complicate debugging. Song Liu and Thierry Treyer spoke about a potential improvement to BPF Type Format (BTF) debugging information that could partially combat that problem at the 2025 Linux Storage, Filesystem, Memory-Management, and BPF Summit. They want to add information on selectively inlined functions to BTF in order to better support tracing tools. Treyer participated remotely.
One of the most common compiler optimizations is inlining, which embeds the code of a function directly into its caller, avoiding the overhead of a function call and potentially exposing other hidden optimization opportunities. Modern compilers don't always inline every call to a particular function that would benefit from inlining, however. The compiler uses heuristics to decide which call sites will benefit from inlining. This means that a programmer can easily end up with a situation where a function still appears in a binary's symbol table (because some calls were not inlined), but tracing that function won't show calls to it (because the hot calls were inlined, and therefore the function's symbol no longer referrs to them).
-
LWN ☛ Freezing filesystems for suspend
Sometimes worms have a tendency to multiply once their can is opened. James Bottomley recently encountered that situation; he led a session in the filesystem track at the 2025 Linux Storage, Filesystem, Memory Management, and BPF Summit (LSFMM+BPF) to discuss filesystem behavior with respect to suspending and resuming the system. As he noted in his topic proposal, he came at the problem because he needed a way to resynchronize the contents of efivarfs after a system resume and thought there should be an API available to use. But, as the resulting thread shows, the filesystem freeze and thaw code had never been used by the system-wide suspend and resume code. Due to a scheduling mixup, though, several of us missed Bottomley's session, including Luis Chamberlain who has been working on hooking those two pieces up; what follows is largely from a second session that Chamberlain led, with some background information from the topic-proposal discussion and an email exchange with Bottomley.
-
LWN ☛ Cache awareness for the CPU scheduler
The kernel's CPU scheduler has to balance a wide range of objectives. The tasks in the system must be scheduled fairly, with latency for any given task kept within bounds. All of the CPUs in the system should be kept busy if there is enough work to do, but unneeded CPUs should be shut down to reduce power consumption. A task should also run on the CPU that is most likely to have cached the memory that task is using. This patch series from Chen Yu aims to improve how the scheduler handles cache locality for multi-threaded processes.
RAM is fast, but it is still unable to provide data at anything resembling the rate that a CPU can consume it. For this reason, systems are built with multiple layers of cache that are meant to hold frequently used data and make it available more quickly. Reading a value from cache is relatively fast; a read that goes all the way to RAM, instead, can stall a CPU for the time it takes to execute hundreds of instructions. Making effective use of cache is, thus, important for an application to perform well. Well-written applications are implemented with cache behavior in mind, but the kernel has a role to play as well.
-
LWN ☛ Some __nonstring__ turbulence
New compiler releases often bring with them new warnings; those warnings are usually welcome, since they help developers find problems before they turn into nasty bugs. Adapting to new warnings can also create disruption in the development process, though, especially when an important developer upgrades to a new compiler at an unfortunate time. This is just the scenario that played out with the 6.15-rc3 kernel release and the implementation of -Wunterminated-string-initialization in GCC 15.
[...]
Torvalds stood his ground, though, blaming Cook for not having gotten the fixes into the mainline quickly enough.
That is where the situation stands, as of this writing. Others will undoubtedly take the time to fix the problems properly, adding the changes that were intended all along. But this course of events has created some bad feelings all around, feelings that could maybe have been avoided with a better understanding of just when a future version of GCC is expected to be able to build the kernel.
As a sort of coda, it is worth saying that Torvalds also has a fundamental disagreement with how this attribute is implemented. The __nonstring__ attribute applies to variables, not types, so it must be used in every place where a char array is used without trailing NUL bytes. He would rather annotate the type, indicating that every instance of that type holds bytes rather than a character string, and avoid the need to mark rather larger numbers of variable declarations. But that is not how the attribute works, so the kernel will have to include __nonstring markers for every char array that is used in that way.