LWN Articles on Linux Kernel
-
LWN ☛ Exposing concurrency bugs with a custom scheduler
Jake Hillion gave a presentation at FOSDEM about using sched_ext, the BPF scheduling framework that was introduced in kernel version 6.12, to help find elusive concurrency problems. In collaboration with Johannes Bechberger, he has built a scheduler that can reveal theoretically possible but unobserved concurrency bugs in test code in a few minutes. Since their scheduler relies only on mainline kernel features, it can in principle be applied to any application that runs on Linux, although there are a number of caveats since the project is still in its early days.
Bechberger, who unfortunately could not be present for the talk, is an OpenJDK developer. Since Java has its own concurrency model that OpenJDK is responsible for upholding, Bechberger often has to spend time debugging nasty concurrency problems. After wrestling with one such bug and wasting a lot of time trying to reproduce it, he came up with the idea of making a scheduler that deliberately schedules a process "badly" in order to make misbehavior more likely, and therefore easier to debug.
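The idea of scheduling "badly" on purpose can be illustrated with a toy model. The sketch below is not sched_ext and not the scheduler from the talk; it is a user-space simulation in which generator functions stand in for threads and each `yield` marks a point where a task could be preempted. The names `increment`, `run`, and `simulate` are hypothetical. A friendly scheduler that lets each task run to completion hides the lost-update bug, while one that switches tasks at every opportunity exposes it.

```python
from collections import deque


def increment(shared, times):
    """Non-atomic increment; each yield is a possible preemption point."""
    for _ in range(times):
        value = shared["counter"]      # read the shared value
        yield                          # the scheduler may switch tasks here
        shared["counter"] = value + 1  # write back a possibly stale result


def run(tasks, switch_every_step):
    """Toy scheduler: advance one task by one step at a time."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue                   # task finished, drop it
        if switch_every_step:
            ready.append(task)         # "bad" schedule: always switch away
        else:
            ready.appendleft(task)     # friendly schedule: keep running it


def simulate(switch_every_step, times=3):
    """Run two incrementing tasks and return the final counter value."""
    shared = {"counter": 0}
    tasks = [increment(shared, times), increment(shared, times)]
    run(tasks, switch_every_step)
    return shared["counter"]


if __name__ == "__main__":
    print("run to completion:", simulate(False))   # 6, the bug stays hidden
    print("switch every step:", simulate(True))    # 3, updates are lost
```

With two tasks of three increments each, the correct total is 6. The adversarial schedule makes both tasks read the same value before either writes, so half of the updates disappear. A real scheduler working on real threads has far less control over where switches happen, which is why provoking such interleavings usually takes many runs.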
-
LWN ☛ Resistance to Rust abstractions for DMA mapping
While the path toward the ability to write device drivers in Rust has been anything but smooth, steady progress has been made and that goal is close to being achieved — for some types of drivers at least. Device drivers need to be able to set up memory areas for direct memory access (DMA) transfers, though; that means Rust drivers will need a set of abstractions to interface with the kernel's DMA-mapping subsystem. Those abstractions have run into resistance that has the potential to block progress on the Rust-for-Linux project as a whole.
DMA transfers move data directly between RAM and the device of interest, without involving the CPU. It is difficult to get any sort of reasonable I/O performance without DMA, so almost all devices support it. Making DMA work, though, is not just a matter of handing a memory address to a peripheral device; there are many concerns that must be dealt with. These include maintaining cache coherency, ensuring that pages are resident in RAM, handling device-specific addressing limitations, programming I/O memory-management units, and more. Plus, of course, every architecture does things differently. The DMA-mapping layer exists to hide most of these problems from device drivers behind an architecture-independent interface.
-
LWN ☛ The rest of the 6.14 merge window
By the time that Linus Torvalds released 6.14-rc1 and closed the merge window for this development cycle, some 9,307 non-merge changesets had been pulled into the mainline repository — the lowest level of merge-window activity seen in years. There were, nonetheless, a number of interesting changes in the 5,000 commits pulled since the first-half merge-window summary was written.
-
LWN ☛ An update on sealed system mappings
Jeff Xu has been working on a patch set that makes certain mappings in a process's address space impossible to change, sealing them against tampering. This has some potential security benefits — mainly, making sure that someone cannot relocate the vsyscall and vDSO mappings — but some kernel developers haven't been impressed with the patches. While the core functionality (sealing the mappings) is sound, some of the supporting code for enabling and disabling the new feature caused concern by going against the normal design for such things. Reviewers also questioned how this feature would interact with checkpointing and with sandboxing.
Unlike the mseal() system call, which can be used to seal any memory mapping, Xu's patch set is focused specifically on sealing mappings that the kernel uses, before any user-space code starts executing. The patch set seals the memory mappings for five things: the vDSO (code to implement some system calls in user space), vvar (data for vDSO calls), sigpage (code for implementing signal handling on Arm), uprobes (user-space tracing), and vsyscall (an older and obsolete system-call mechanism). Each of these facilities involves having the kernel map some additional pages into a user-space process; all of them except the uprobe pages are created on process startup, and should by and large remain unmodified until the process dies. Uprobes are inserted dynamically, and the kernel maps the pages at that time, but that mapping also lives until the process is terminated.
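To see which of these kernel-provided mappings exist in a given process, one can look at `/proc/<pid>/maps`, where they appear under bracketed names such as `[vdso]` and `[vvar]`. The following is a minimal sketch, assuming a Linux system; the helper `special_mappings` and the `SAMPLE` text are made up for illustration, and the addresses in the sample are not taken from any real process. The listing only shows that the mappings are present; it does not say whether they are sealed.

```python
# Bracketed names used in /proc/<pid>/maps for the five mappings discussed above.
SPECIAL = ("[vdso]", "[vvar]", "[sigpage]", "[uprobes]", "[vsyscall]")

# Illustrative input only; these addresses are invented for the example.
SAMPLE = """\
55d0c0a00000-55d0c0a21000 rw-p 00000000 00:00 0                          [heap]
7ffd1c5f2000-7ffd1c5f6000 r--p 00000000 00:00 0                          [vvar]
7ffd1c5f6000-7ffd1c5f8000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]
"""


def special_mappings(maps_text):
    """Return (name, address_range, permissions) for each special mapping."""
    found = []
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6 and fields[5].strip() in SPECIAL:
            found.append((fields[5].strip(), fields[0], fields[1]))
    return found


if __name__ == "__main__":
    try:
        with open("/proc/self/maps") as maps:
            text = maps.read()
    except OSError:
        text = SAMPLE  # fall back to the sample on systems without /proc
    for name, addresses, perms in special_mappings(text):
        print(f"{name:12} {perms} {addresses}")
```

Which names show up depends on the architecture and kernel configuration; for example, `[sigpage]` is specific to Arm, and `[uprobes]` appears only once a uprobe has been inserted into the process.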