LWN on Python, Debian, and Kernel
-
Python
-
LWN ☛ MemHive: sharing immutable data between Python subinterpreters
Immutable data makes concurrent access easier, since it eliminates the data-race conditions that can plague multithreaded programs. At PyCon 2024, Yury Selivanov introduced an early-stage project called MemHive, which uses Python subinterpreters and immutable data to overcome the problems of thread serialization that are caused by the language's Global Interpreter Lock (GIL). Recent developments in the Python world have opened up different strategies for avoiding the longstanding problems with the GIL.
Selivanov began by displaying the output of top showing what was running on his laptop, which consisted of a single Python process running ten asyncio event loops. Each event loop was running concurrently in its own thread and they were saturating the ten CPU cores that his laptop has. They were sharing a single mapping data structure (similar to a dictionary) between them to exchange values; the mapping had one million keys and values.
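The shape of that demo can be approximated with the standard library alone. The following is a minimal sketch and not MemHive's API: the names `run_workers` and `sum_values` are hypothetical, the threads all live in one interpreter (so they still share a single GIL, unlike subinterpreters), and `types.MappingProxyType` stands in for a true immutable mapping by offering a read-only view.

```python
import asyncio
import threading
from types import MappingProxyType


def run_workers(shared, n_threads=4):
    """Run one asyncio event loop per thread, all reading the same mapping."""
    results = [None] * n_threads

    async def sum_values(offset):
        # Each worker reads a disjoint slice of the keys; nobody writes,
        # so no locking is needed around the shared mapping.
        total = 0
        for key in range(offset, len(shared), n_threads):
            total += shared[key]
        await asyncio.sleep(0)  # yield to the event loop once
        return total

    def thread_main(index):
        # asyncio.run() creates a fresh event loop in the calling thread.
        results[index] = asyncio.run(sum_values(index))

    threads = [
        threading.Thread(target=thread_main, args=(i,))
        for i in range(n_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# Read-only view over a dictionary; item assignment raises TypeError.
shared = MappingProxyType({i: i * 2 for i in range(1000)})
totals = run_workers(shared)
```

Because the workers only read, the per-thread totals can be combined afterwards without any synchronization beyond `join()`.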
-
-
Debian Family
-
LWN ☛ The history, status, and plans for reproducible builds
On the second day of DebConf24 in Busan, South Korea, Holger Levsen provided a history lesson on the "first 11 years" of the Reproducible Builds project. He has been involved in the project for most of that time and has been a Debian user since the mid-1990s, contributor since 2001, and a Debian member since 2007; "I love Debian". Meanwhile, his aim is to make all free software reproducible, so that anyone can check that a binary program comes from the source code it purports to.
He began by noting that the talk was not really only his, but was instead a talk that comes from the work of more than 100 people listed on the Reproducible Builds web site. He asked a few questions of the audience, such as who knows about the project, who has contributed to it, and who knows that the project itself is more than ten years old but that the idea of reproducible builds goes back more than 30 years? The goal of the talk is to recap and celebrate what has been done, he said, in order to get attendees excited and, thus, involved in the project. "Because there is still a lot of work to do."
The problem is that, while the source code of free software is available, most people install pre-compiled binaries. "No one really knows how they really correspond, even those building the binaries." The machine doing the build might have been compromised, for example. Because of this problem, there are various types of supply-chain attacks that can result.
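The check that reproducibility makes possible can be illustrated with a toy example: if two independent builds of the same source are bit-for-bit identical, their hashes match, and anyone can compare them. This is only a sketch; `toy_build` is a hypothetical stand-in for a compiler, and the embedded build time is an assumed example of the kind of input that makes real builds differ.

```python
import hashlib


def toy_build(source: bytes, build_time: int) -> bytes:
    """Hypothetical 'compiler' that embeds the build time in its output."""
    return b"BIN:" + source + b":built-at=" + str(build_time).encode()


def digest(artifact: bytes) -> str:
    """SHA-256 hex digest of a build artifact."""
    return hashlib.sha256(artifact).hexdigest()


def is_reproducible(first: bytes, second: bytes) -> bool:
    """Two builds agree only if they are bit-for-bit identical."""
    return digest(first) == digest(second)


source = b"int main(void) { return 0; }"

# Same source, different embedded build times: the artifacts differ.
drifting = is_reproducible(toy_build(source, 1000), toy_build(source, 2000))

# Same source with the build time pinned: the artifacts match.
pinned = is_reproducible(toy_build(source, 1000), toy_build(source, 1000))
```

A mismatch does not say which build is wrong, only that the two cannot both be taken on trust.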
-
LWN ☛ Debian discusses principles for package maintenance
Achieving consensus among Debian Developers on technical topics and procedures can be, to put it mildly, challenging. Nevertheless, that is exactly what Otto Kekäläinen has tried to do with a proposal that would set up "principles all Debian packages should follow to be open for collaboration in package maintenance". In the near term, it seems unlikely that the proposal will be accepted, but the discussion may be effective at improving collaboration nonetheless.
Ending single-developer maintainership of Debian packages has been a popular topic of discussion this year. Current Debian Project Leader (DPL) Andreas Tille made building redundancy, "whether it's maintaining infrastructure or managing non-leaf packages", part of his platform during the 2024 DPL election. He also spoke about this in his "Bits from the DPL" talk at DebConf 2024 in Busan, South Korea. Video of the talk is available on the DebConf site.
Tille wrote in his platform that he envisioned a future where "every crucial task in Debian" is handled by at least two people to "ensure comprehensive backup and support". He would also like to see packaging standards adopted, and to make it mandatory both to maintain packages on Debian's GitLab instance, called Salsa, and to use its continuous-integration tools. If voters were attached to single-maintainership of packages, he suggested that they should "probably rank me below 'None of the above'".
-
-
Kernel Space
-
LWN ☛ A new version of modversions
The genksyms tool has long been buried deeply within the kernel's build system; it is one of the two C-code parsers shipped with the kernel (the other being the horrifying kernel-doc script). It is a key part of how the kernel's module-loading infrastructure works. While genksyms has quietly done its job for decades, that period may soon be coming to an end. It would seem that genksyms is not up to the task of handling Rust code, so Sami Tolvanen is proposing a new tool to handle this task going forward.
In the early days, the kernel only supported monolithic builds; there was no concept of loadable modules. That changed with the 0.99.15 release in early 1994, which added module support along with a number of other features.
-
LWN ☛ A review of file descriptor memory safety in the kernel
On July 30, Al Viro sent a patch set to the linux-fsdevel mailing list with a comprehensive cover letter explaining his recent work on ensuring that the kernel's internal representation of file descriptors is used correctly throughout the kernel. File descriptors are ubiquitous; many system calls need to handle them. Viro's review identified a few existing bugs, and may prevent more in the future. He also had suggestions for ways to keep uses consistent throughout the kernel.
File descriptors are represented in user space as non-negative integers. In the kernel, these are actually indexes into the process's file-descriptor table, which stores entries of type struct file. Most system calls that take a file descriptor — with the exception of things such as dup2() that only touch the file-descriptor table, not the files themselves — need to refer to the associated struct file to determine how to handle the call. These structures are, like many things in the kernel, reference counted.
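The relationship between descriptors and the underlying open file can be observed from user space. The following is a sketch of that observable behavior, not kernel code: the function name `demo_shared_file` is hypothetical, and it uses `os.dup()` (a sibling of `dup2()`) to create a second table entry that refers to the same open file. The two descriptors share one file offset, and the file stays open after the first descriptor is closed, which is the user-visible effect of the reference counting described above.

```python
import os
import tempfile


def demo_shared_file():
    """Show that two descriptors can refer to one underlying open file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt")

        # Descriptors are small non-negative integers.
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)

        # os.dup() adds a second descriptor-table entry for the same file.
        dup_fd = os.dup(fd)

        # Writing through one descriptor moves the offset seen by the other.
        os.write(fd, b"hello")
        offset_via_dup = os.lseek(dup_fd, 0, os.SEEK_CUR)

        # Closing one descriptor drops one reference; the file stays open
        # because the duplicate still refers to it.
        os.close(fd)
        os.write(dup_fd, b" world")

        os.lseek(dup_fd, 0, os.SEEK_SET)
        data = os.read(dup_fd, 100)
        os.close(dup_fd)
        return fd, dup_fd, offset_via_dup, data


fd, dup_fd, offset_via_dup, data = demo_shared_file()
```

Only when the last descriptor referring to the file is closed does the open file itself go away.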
-