Kernel Savings, Linux 6.14, and uretprobes
-
Tux Digital ☛ This tiny Linux kernel tweak could SAVE 30% on Power Use for Datacenters!
Computer hardware is always getting faster and more powerful. With each new generation we get more cores, more bandwidth, and just… more power. But the more powerful hardware gets, the more power-hungry it becomes.
-
LWN ☛ The first part of the 6.14 merge window
As of this writing, just over 4,300 non-merge changesets have been pulled into the mainline repository for the 6.14 release. Many of the pull requests include remarks saying that activity has been relatively low this time around, presumably due to the holidays. So those 4,300 changesets are probably closer to the merge-window halfway point than usual. Much of the work merged thus far looks more like incremental improvements than major new initiatives, but there still have been a number of interesting changes in the mix.
-
LWN ☛ The trouble with the new uretprobes
A "uretprobe" is a dynamic, user-space tracepoint injected by the kernel into a running process; this document tersely describes their use. Among other things, uretprobes are used by the perf utility to time function calls. The 6.11 kernel saw a significant change to uretprobes that improved their performance, but that change is also creating trouble for some users. The best way to solve the problem is not entirely clear.
Specifically, a uretprobe exists to gain information at the return from a function in the process of interest. Older kernels implemented uretprobes by injecting code that, on entry to a function, changed the return address to a special trampoline that, in turn, contained a breakpoint trap instruction. When the target process executed that instruction, it would trap back into the kernel, which would then extract the information of interest (such as the function's return value) and run any other attached code (a BPF program, perhaps) before allowing the process to resume. This method worked, but it also had a noticeable performance impact on the probed process.
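The return-address swap described above can be illustrated with a small toy model. This is only a sketch in plain Python, not kernel code: the `ToyProcess` class, the `TRAMPOLINE` marker, and the address strings are all hypothetical names invented for illustration, and the "trap" is simulated by an ordinary method call.

```python
# Toy model of the older uretprobe scheme: hijack the return address on
# function entry, then intercept the return through a trampoline.
# All names here are hypothetical; nothing below is a kernel API.

TRAMPOLINE = "uretprobe_trampoline"  # stands in for the trampoline address


class ToyProcess:
    def __init__(self):
        self.stack = []   # simulated stack of return addresses
        self.saved = []   # real return addresses stashed by the "kernel"
        self.events = []  # (function name, return value) seen by the probe

    def call(self, func, arg, return_to, probed=False):
        # A normal call pushes the caller's return address.
        self.stack.append(return_to)
        if probed:
            # Entry probe: remember the real address, swap in the trampoline.
            self.saved.append(self.stack[-1])
            self.stack[-1] = TRAMPOLINE
        value = func(arg)
        target = self.stack.pop()
        if target == TRAMPOLINE:
            # The function "returned" into the trampoline; in the older
            # scheme this is where the breakpoint trap would fire.
            target = self.handle_return(func.__name__, value)
        return value, target

    def handle_return(self, name, value):
        # Simulated kernel handler: record the return value, then hand
        # back the real return address so execution can resume.
        self.events.append((name, value))
        return self.saved.pop()


def square(x):
    return x * x


proc = ToyProcess()
value, resumed_at = proc.call(square, 7, "main+0x2a", probed=True)
```

After the probed call, `proc.events` holds the observed return value and `resumed_at` is the original return address, which is the property the real mechanism has to preserve.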
In an attempt to improve uretprobe performance, Jiri Olsa put together a patch set that changed the implementation on x86 systems. The return trampoline still exists but, rather than triggering a trap, it just calls the new uretprobe() system call, which then takes care of all of the associated work. Since system-call handling is faster than taking a trap, the cost to the probed process is lower when uretprobe() is used. This new system call takes no arguments, and it can only be called from the kernel-injected special trampoline; otherwise it will just deliver a SIGILL signal to the calling process.
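The caller check described above can also be sketched as a toy model. This is an assumption-laden illustration, not the kernel's implementation: `toy_sys_uretprobe`, `TRAMPOLINE_ADDR`, and the explicit `caller_ip` parameter are hypothetical. The real system call takes no arguments; here the caller's address is passed in explicitly only to stand in for state the kernel would inspect on its own.

```python
import signal

# Hypothetical address of the kernel-injected trampoline in the toy model.
TRAMPOLINE_ADDR = 0x7FFFF000


def toy_sys_uretprobe(caller_ip, pending_returns):
    """Toy model of the caller check; not the real kernel logic.

    caller_ip       -- address the call was made from (modeled kernel state)
    pending_returns -- stack of real return addresses saved at function entry
    """
    if caller_ip != TRAMPOLINE_ADDR:
        # Called from anywhere other than the trampoline: deliver SIGILL.
        return ("signal", signal.SIGILL)
    # Called from the trampoline: probe handlers would run here, after
    # which the process resumes at the saved return address.
    return ("resume", pending_returns.pop())


rejected = toy_sys_uretprobe(0x401000, [0x401234])
accepted = toy_sys_uretprobe(TRAMPOLINE_ADDR, [0x401234])
```

In this model `rejected` carries the SIGILL signal number, while `accepted` carries the address at which the probed process would continue.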