Linux in LWN
-
Yet another try at the BPF program allocator [LWN.net]
The BPF subsystem, which allows code to be loaded into the kernel from user space and safely executed in the kernel context, is bound to create a number of challenges for the kernel as a whole. One might not think that allocating memory for BPF programs would be high on the list of problems, but life (and memory management) can be surprising. The attempts to do a better job of providing space for compiled BPF code have, to date, only been partially successful; now Song Liu is back with a new approach to finish the job.
-
Averting excessive oopses [LWN.net]
Even a single kernel oops is never a good thing; it is an indication that something has gone badly wrong in the system somewhere and a straightforward recovery is not possible. But it seems that oopsing a large number of times has the potential to be even worse. To head off problems that might result from repeated oopsing, there is currently work afoot to put an upper limit on the number of times that the kernel can be allowed to oops before just giving up and rebooting.
An oops in the kernel is the equivalent of a crash in user space. It can come about for a number of reasons, including dereferencing a stray pointer, hardware problems, or a bug detected by checks within the kernel code itself. The normal response to an oops is to output a bunch of diagnostic information to the system log and kill the process that was running when the problem occurred.
The system as a whole, however, will continue on after an oops if at all possible. Killing the system would deprive the users of the ability to save any outstanding work and can also make problems much harder to debug than they would otherwise be. So the kernel will do its best to continue executing even when something has clearly gone badly wrong. An immediate result of that design decision is that any given system can oops more than once. Indeed, for some types of problems, multiple oopses are common and may continue until somebody gets fed up and reboots the system.
Jann Horn recently started to wonder whether perhaps the kernel should just give up and go into a panic (which will cause a reboot) if it oopses too many times. This could be a wise course of action in general; a kernel that is oopsing frequently is clearly not in a good condition and allowing it to continue could lead to problems like data corruption. But Horn had another concern: oopsing a system enough times might be a way to exploit security problems.
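The idea of giving up after too many oopses can be sketched in a few lines of user-space C. This is only an illustration of the bookkeeping, not the kernel's implementation: the names `oops_state`, `record_oops()`, and `OOPS_LIMIT`, as well as the limit value itself, are hypothetical and chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit, picked only for illustration. */
#define OOPS_LIMIT 10000u

struct oops_state {
    uint32_t count;  /* oopses recorded so far */
    bool panicked;   /* set once the limit has been reached */
};

/*
 * Record one oops. Returns true when the caller should stop trying to
 * limp along and panic instead. Once the limit has been reached, the
 * counter is no longer incremented.
 */
bool record_oops(struct oops_state *st)
{
    assert(st != NULL);

    if (st->panicked)
        return true;

    st->count++;
    if (st->count >= OOPS_LIMIT)
        st->panicked = true;

    return st->panicked;
}
```

In this sketch the first `OOPS_LIMIT - 1` calls return false, so the system carries on as it does today; the call that reaches the limit returns true, and every later call does the same.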
An oops, almost by definition, will leave an operation halfway completed; there is usually no way to clean up everything that might need cleaning when something has gone wrong in an unexpected place. So an oops might cause locks to be left in a held state or might lead to the failure to decrement counters that have been incremented. Counters are a particular concern; if an oops causes a counter to not be properly decremented, oopsing repeatedly might well become a way to overflow that counter, creating an exploitable situation.
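The counter concern can be made concrete with a small user-space simulation. It assumes a deliberately tiny 8-bit reference count so that the wrap is easy to observe; the `object` type and the `get_ref()`, `put_ref()`, and `oopsing_operation()` helpers are hypothetical stand-ins, not kernel interfaces. A second helper works out how many leaked increments a 32-bit counter would need before wrapping to zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object with an intentionally small reference count. */
struct object {
    uint8_t refcount;
};

/* Take and drop a reference; normally these calls come in pairs. */
void get_ref(struct object *obj) { assert(obj != NULL); obj->refcount++; }
void put_ref(struct object *obj) { assert(obj != NULL); obj->refcount--; }

/*
 * Simulates an operation that oopses after taking its reference:
 * the matching put_ref() never runs, so one increment is leaked.
 */
void oopsing_operation(struct object *obj)
{
    get_ref(obj);
    /* The oops happens here; cleanup is skipped. */
}

/*
 * Number of leaked increments needed before a 32-bit counter holding
 * `current` wraps around to zero.
 */
uint64_t leaks_to_wrap_u32(uint32_t current)
{
    return (UINT64_C(1) << 32) - current;
}
```

Starting from a count of 1, 255 calls to `oopsing_operation()` bring the 8-bit counter back to zero even though the object is still in use. A 32-bit counter starting at 1 would need 4,294,967,295 leaked increments to do the same, which is why a cap on the total number of oopses, if set well below that figure, would close off this particular route.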
-
Rust in the 6.2 kernel [LWN.net]
The merge window for the 6.1 release brought in basic support for writing kernel code in Rust, with an emphasis on "basic". It is possible to create a "hello world" module for 6.1, but not much can be done beyond that. There is, however, a lot more Rust code for the kernel out there; it's just waiting for its turn to be reviewed and merged into the mainline. Miguel Ojeda has now posted the next round of Rust patches, adding to the support infrastructure in the kernel. This 28-part patch series is focused on low-level support code, still without much in the way of abstractions for dealing with the rest of the kernel. There will be no shiny new drivers built on this base alone. But the series does show another step toward the creation of a workable environment for the development of Rust code in the Linux kernel.
As an example of how stripped-down the initial Rust support is, consider that the kernel has eight different logging levels, from "debug" through "emergency". There is a macro defined for each level to make printing simple; screaming about an imminent crash can be done with pr_emerg(), for example. The Rust code in 6.1 defines equivalent macros, but only two of them: pr_info!() and pr_emerg!(); the macros for the other log levels were left out. The first order of business for 6.2 appears to be to fill in the rest of the set, from pr_debug!() at one end through pr_alert!() at the other. There is also pr_cont!() for messages that are pieced together from multiple calls. This sample kernel module shows all of the print macros in action.
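For reference, the eight levels and their print helpers can be laid out as a small table. The sketch below is plain user-space C rather than kernel code. Only the `pr_emerg`, `pr_info`, `pr_debug`, `pr_alert`, and `pr_cont` names appear in the text above; the remaining macro names follow the kernel's C-side helpers as I understand them and should be read as assumptions. The `log_level` structure and both functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor for one kernel log level. */
struct log_level {
    int level;               /* numeric severity; 0 is the most severe */
    const char *c_macro;     /* C-side print helper */
    const char *rust_macro;  /* Rust-side print macro */
    bool rust_in_6_1;        /* per the text, only two existed in 6.1 */
};

const struct log_level log_levels[] = {
    { 0, "pr_emerg",  "pr_emerg!",  true  },
    { 1, "pr_alert",  "pr_alert!",  false },
    { 2, "pr_crit",   "pr_crit!",   false },
    { 3, "pr_err",    "pr_err!",    false },
    { 4, "pr_warn",   "pr_warn!",   false },
    { 5, "pr_notice", "pr_notice!", false },
    { 6, "pr_info",   "pr_info!",   true  },
    { 7, "pr_debug",  "pr_debug!",  false },
};

#define NUM_LOG_LEVELS (sizeof(log_levels) / sizeof(log_levels[0]))

/* Count the Rust print macros that were already present in 6.1. */
size_t rust_macros_in_6_1(void)
{
    size_t n = 0;

    for (size_t i = 0; i < NUM_LOG_LEVELS; i++)
        if (log_levels[i].rust_in_6_1)
            n++;
    return n;
}

/* Look up a level by number; returns NULL if it is out of range. */
const struct log_level *find_level(int level)
{
    for (size_t i = 0; i < NUM_LOG_LEVELS; i++)
        if (log_levels[i].level == level)
            return &log_levels[i];
    return NULL;
}
```

The table leaves out `pr_cont!()`, which continues a previous message rather than naming a severity of its own.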