Kernel: OSPM, XFS, and More
-
Reports from OSPM 2023, part 2
The fifth conference on Power Management and Scheduling in the Linux Kernel (abbreviated "OSPM") was held from April 17 to 19 in Ancona, Italy. LWN was not there, unfortunately, but the attendees of the event have gotten together to write up summaries of the discussions that took place, and LWN has the privilege of being able to publish them. Reports from the second day of the event appear below.
-
Merging copy offload
Kernel support for copy offload is a feature that has been floating around in limbo for a decade or more at this point; it has been implemented along the way, but never merged. The idea is that the host system can simply ask a block storage device to copy some data within the device and it will do so without further involving the host; instead of reading data into the host so that it can be written back out again, the device circumvents that process. At the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit, Nitesh Shetty led a storage and filesystem session to discuss the current status of a patch set that he and others have been working on, with an eye toward getting something merged fairly soon.
The overall concept of copy offload is that you issue a command to a device and it copies the data from one place on the device to another, though the copy can also be between NVMe namespaces on a device. The advantages are in saving CPU resources, PCI bandwidth, and, on fabrics, network bandwidth, because the copy stays local to the device. The first approach, from Martin Petersen in 2014, was ioctl()-based; another, based on using two BIOs, was developed by Mikulas Patocka in 2015. The ioctl() approach had problems with scalability, Shetty said. Patocka's approach was compatible with the device mapper, but neither of the two patch sets gained any traction in the community.
-
Backporting XFS fixes to stable
Backporting fixes to stable kernels is an ongoing process that, in general, is handled by the stable maintainers or the developers of the fixes. However, due to some unhappiness in the XFS development community with the process of handling stable fixes for that filesystem, a different process has come about for backporting XFS patches to the stable kernels. The three developers doing that work, Leah Rumancik, Amir Goldstein, and Chandan Babu Rajendra, led a plenary session at the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit (with Rajendra participating remotely) to discuss that process.
Goldstein began by noting that each of the presenters is responsible for a different stable kernel; he does 5.10, Rumancik handles 5.15, and Rajendra is responsible for 5.4. The session was meant to be something of a case study, because other filesystems (and subsystems) have similar issues. He was "very happy to see" that stable maintainer Sasha Levin was present for the session so that he could offer his perspective as well.
-
Merging bcachefs
The bcachefs filesystem, and the process for getting it upstream, were the topics of a session led remotely by Kent Overstreet, creator of bcachefs, at the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit. He has also discussed bcachefs in previous editions of the summit, first in 2018 and again at last year's event; in both of those cases, the question of getting bcachefs merged into the mainline kernel came up, but that merge has not happened yet. This time around, though, Overstreet seemed closer than ever to being ready to actually start that process.
He began his talk by noting that he had been saying bcachefs is almost ready for merging for some time now; "now I'm saying, let's finally do it". He wanted to report on the status of the filesystem and on why it is ready now for upstreaming, but he wanted to use the bulk of the session to discuss the process of doing so. "It's a massive, 90,000-lines-of-code beast" that needs to get reviewed, so there is a need to figure out the process to do that review.
His goal with bcachefs is to have the "performance, reliability, scalability, and robustness of XFS with modern features". That's a high bar, and one that bcachefs has not yet reached, but "I think we're pretty far along". People are running bcachefs on 100TB filesystems "without any issues or complaints"; he is waiting for the first 1PB filesystem. "Snapshots scale beautifully", which is not true for Btrfs, based on user complaints, he said.
-
XFS online filesystem check and repair
Darrick Wong has been working on XFS online repair for a number of years, and most of the filesystem-internal work has now been completed and is under review. The remaining work mostly concerns the user-space side, which needs to set up a periodic scan and repair cycle. In a filesystem session that he led remotely at the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit, he wanted to discuss what user space needs from this kind of feature. The session may not have gone quite as he hoped, as it got somewhat derailed by topics that spilled over from the earlier session on unprivileged image mounts.
His current patch set for XFS online repair is "out for review on Dave Chinner's laptop right now", so it is time to start talking about the missing pieces. That means that he will be talking more about user space than he normally would; there is a user-space driver program that controls how often the online fsck mechanism runs. There is nothing yet for notifying user space of problems that were found by an online fsck pass, nor is there a daemon that monitors for such notifications and acts on them, for example by issuing repair requests. There is no good infrastructure in the kernel for handling and dispatching such things, he said.
-
Scope-based resource management for the kernel
The C language does not provide the sort of resource-management features found in more recent languages. As a result, bugs involving leaked memory or failure to release a lock are relatively common in programs written in C — including the kernel. The kernel project has never limited itself to the language features found in the C standard, though; kernel developers will happily use extensions provided by compilers if they prove helpful. It looks like a relatively simple compiler-provided feature may lead to a significant change in some common kernel coding patterns.
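The paragraph above does not name the compiler feature in question. As an illustration only, the sketch below assumes an extension of this kind: the cleanup variable attribute supported by GCC and Clang, which calls a handler when the annotated variable goes out of scope. The names release_buffer(), scoped_free, copy_length(), and release_count are hypothetical and are not kernel APIs; user-space malloc() and free() stand in for kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counter, present only so the cleanup is observable. */
static int release_count;

/*
 * Cleanup handler: the compiler passes a pointer to the variable
 * that is leaving scope, so a "char *" variable arrives as "char **".
 */
static void release_buffer(char **p)
{
    assert(p != NULL);
    free(*p);            /* free(NULL) is a no-op */
    release_count++;
}

/* Hypothetical convenience macro wrapping the GCC/Clang attribute. */
#define scoped_free __attribute__((cleanup(release_buffer)))

/*
 * Return the length of a private copy of s, or -1 if allocation
 * fails or the string is longer than 16 bytes. No path calls free()
 * explicitly; the compiler inserts release_buffer() at every exit.
 */
static int copy_length(const char *s)
{
    char *copy scoped_free = malloc(strlen(s) + 1);

    if (!copy)
        return -1;       /* handler still runs on this path */
    strcpy(copy, s);
    if (strlen(copy) > 16)
        return -1;       /* early return, buffer is still freed */
    return (int)strlen(copy);  /* value is computed before cleanup */
}
```

Calling copy_length("kernel") yields 6 and runs the handler once; a string longer than 16 bytes takes the early-return path and the handler still runs. The point of the pattern is that error paths no longer need a chain of goto labels to release what was acquired earlier in the function.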