Programming Leftovers
-
Improving C-library scalability with restartable sequences
The Linux kernel has supported restartable sequences (sometimes referred to as "RSEQ") since 2018, but it remains a bit of a niche feature, mostly useful to performance-oriented developers who do not mind writing assembly code. According to Mathieu Desnoyers, the developer behind the kernel's implementation of restartable sequences, this feature can be applicable to a much wider range of performance-sensitive code with proper library support. He came to the 2023 GNU Tools Cauldron to present the case for use of restartable sequences within the GNU C Library (glibc).
There are, he began, a number of approaches that are used to improve the scalability of user-space code; most of them revolve around partitioning the workload in one way or another. Use of thread-local storage to minimize contention for shared data is one example. Applications can also use read-copy-update, hazard pointers, or reference counting (which works best in the absence of frequent changes, he said). Another approach is per-CPU data structures; they are heavily used in the kernel, he said, but can be made to work in user space as well. The kernel can rely on techniques like disabling preemption to guarantee exclusive access to a per-CPU data structure, but user space has no such luxury. That is where restartable sequences can help.
-
Presumed technical debt: how to recognise it and avoid it
The programming community unanimously considers technical debt an aspect of our work to keep under control and reduce.
Personally, I’ve been vocal about the perils of technical debt in one of my early blog posts about the organizational issues it can cause.
While I still stand by the majority of what I described in that article, I want to clarify my take on what I call the presumed technical debt.
-
Optimism vs Pessimism in Distributed Systems
Avoiding coordination is the one fundamental thing that allows us to build distributed systems that out-scale the performance of a single machine. When we build systems that avoid coordinating, we end up building components that make assumptions about what other components are doing. This, too, is fundamental. If two components can’t check in with each other after every single step, they need to make assumptions about the ongoing behavior of the other component.
One way to classify these assumptions is into optimistic and pessimistic assumptions. I find it very useful, when thinking through the design of a distributed system, to be explicit about each assumption each component is making, whether that assumption is optimistic or pessimistic, and what exactly happens if the assumption is wrong. The choice between pessimistic and optimistic assumptions can make a huge difference to the scalability and performance of systems.
I generally think of optimistic assumptions as ones that avoid or delay coordination, and pessimistic assumptions as ones that require or seek coordination. The optimistic assumption assumes it’ll get away with its plans. The pessimistic assumption takes the bull by the horns and makes sure it will.
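As a rough single-process illustration of that distinction (not taken from the linked article), the sketch below contrasts a counter that coordinates before every update with a cell that proceeds without coordinating and only checks a version number when it commits. The class and function names are hypothetical, and a real distributed system would replace the in-memory version check with something like a conditional write.

```python
import threading


class PessimisticCounter:
    """Pessimistic: seek coordination (a lock) before every update."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        with self._lock:  # nobody else can interfere while we hold this
            self.value += 1


class OptimisticCell:
    """Optimistic: work from a snapshot, detect conflicts at commit time."""

    def __init__(self, value=0):
        self._lock = threading.Lock()  # guards only the short read/commit steps
        self.value = value
        self.version = 0

    def read(self):
        with self._lock:
            return self.value, self.version

    def try_commit(self, new_value, expected_version):
        with self._lock:
            if self.version != expected_version:
                return False  # the assumption was wrong: someone else committed
            self.value = new_value
            self.version += 1
            return True


def optimistic_increment(cell):
    """Retry until the commit succeeds; return how many retries were needed."""
    retries = 0
    while True:
        value, version = cell.read()
        if cell.try_commit(value + 1, version):
            return retries
        retries += 1
```

The cost of a wrong optimistic assumption here is a retry; the cost of the pessimistic approach is paid on every call, whether or not anyone else was contending.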
-
New Library: Simple Router
Simple story, really. I wanted an HTTP router for Clojure that
1. is order-independent
2. and allows for overlapping routes.
I didn’t find one, so I had to write my own.
-
Recent improvements in GCC diagnostics
The primary job of a compiler is to translate source code into a binary form that can be run by a computer. Increasingly, though, developers want more from their tools, compilers included. Since the compiler must understand the code it is being asked to translate, it is in a good position to provide information about how that code will execute — and where things might go wrong. At the 2023 GNU Tools Cauldron, David Malcolm talked about recent work to improve the diagnostic output from the GCC compiler.
Much of the talk was dedicated to improvements in the ASCII-art output created by the compiler's static analyzer. In the existing GCC 13 release, the compiler is able to quote source code, underline and label source ranges, and provide hints for improving the code. All of this output is created by a module called pretty-print.cc, which has a lot of nice capabilities but which is proving increasingly hard to extend. It does not create two-dimensional layouts well, is not good with non-ASCII text, and its colorization support falls short.
-
WordPress 6.4’s PHP Compatibility [Ed: PHP adoption means slow-motion digital obsolescence. What runs today won't run for long. More reasons to dump WordPress... and PHP-dependent 'suites' that won't last long. WordPress has basically become kitchensinkware. b2 already had many of the features a blogger may need, sans all the bloat and underlying complexity that almost nobody asked for.]
In an effort to keep the WordPress community up to date, this post provides an update on the PHP compatibility of the upcoming WordPress 6.4 release scheduled for November 7, 2023.
Recommended PHP version for WordPress 6.4
It’s recommended to use PHP 8.1 or 8.2 with this upcoming release.
-
Perl / Raku
-
Raku is surprisingly good for CLIs
A while back I wrote Raku: a Language for Gremlins about my first experiences with the language. After three more months of using it I've found that it's quite nice for writing CLIs! This is because of a couple features: [...]
-
-
Python
-
PEP 703 (Making the Global Interpreter Lock Optional in CPython) acceptance
As we’ve announced before, the Steering Council has decided to accept PEP 703 (Making the Global Interpreter Lock Optional in CPython). We want to make it clear why, and under what expectations we’re doing so.
It is clear to the Steering Council that theoretically, a no-GIL (or free-threaded) Python would be of great benefit, and the majority of the community seems in agreement. Threads have significant downsides and caveats, but they are widely adopted, both by software and hardware, and they do enable more scalable solutions to problems. The GIL clearly inhibits CPython in this, and removing that barrier would be a good thing.
At the same time we’re not sure if it’s possible to remove the GIL without fundamentally breaking all extension modules out there, or significantly reducing the performance or maintainability of CPython. The third-party/PyPI package ecosystem is one of Python’s strengths, and the tight, efficient integration with C libraries is one of CPython’s. It has enabled the existence of a diverse selection of packages that’s a unique selling point for Python. We need to be careful that we do not destroy those benefits, or discard decades worth of package development.
-
The path toward a no-GIL Python
The Python Steering Council has posted a detailed plan for the addition of "free-threaded" (no global interpreter lock) support into the Python mainline. It will not be a short process and does not have a guaranteed successful outcome.
-
How to Run JavaScript in Python (with an Example)
Polyglot programming is uncommon but can be a lifesaver in specific situations. For example, we recently wrote an article about executing Python scripts within a PHP/HTML file, which can be valuable for certain programmers.
-
Python Pickle Dump
The “pickle.dump()” function in Python serializes an object, such as a list or dictionary, and writes it to a file.
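A minimal sketch of that round trip, using only the standard library. The file name `data.pkl`, the helper `roundtrip`, and the sample record are made up for illustration.

```python
import os
import pickle
import tempfile


def roundtrip(obj):
    """Dump obj to a temporary file with pickle.dump, then load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.pkl")  # hypothetical file name
        with open(path, "wb") as fh:  # pickle needs a binary-mode file
            pickle.dump(obj, fh)
        with open(path, "rb") as fh:
            return pickle.load(fh)


record = {"name": "example", "values": [1, 2, 3]}
restored = roundtrip(record)
```

The restored object is an equal copy rather than the same object. Note that unpickling can execute arbitrary code, so `pickle.load` should only be used on data from a trusted source.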
-
Pandas Reshape
The “series.values.reshape()”, “pandas.pivot()” and “pandas.melt()” methods are used to reshape the Pandas Series and DataFrame.
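A small sketch of the three calls named above, assuming pandas is installed. The column names and values are invented for the example.

```python
import pandas as pd

# Series -> 2x3 NumPy array via the underlying values
series = pd.Series([1, 2, 3, 4, 5, 6])
grid = series.values.reshape(2, 3)

# Long table: one row per (day, city) observation
long_df = pd.DataFrame({
    "day": ["mon", "mon", "tue", "tue"],
    "city": ["A", "B", "A", "B"],
    "temp": [10, 12, 11, 13],
})

# Long -> wide: one row per day, one column per city
wide = pd.pivot(long_df, index="day", columns="city", values="temp")

# Wide -> long again
back = pd.melt(
    wide.reset_index(), id_vars="day", var_name="city", value_name="temp"
)
```

Here `pivot` and `melt` act as rough inverses: the melted frame holds the same observations as `long_df`, though not necessarily in the same row order.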
-
Pandas Interpolate
The “DataFrame.interpolate()” method is used in Python to fill the missing (NaN) values of a DataFrame or Series based on the specified method.
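For instance, with linear interpolation each gap is filled from the known values on either side. This is a short sketch assuming pandas and NumPy are installed; the column name `reading` and the data are invented.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"reading": [1.0, np.nan, 3.0, np.nan, np.nan, 6.0]})

# Linear interpolation treats the rows as equally spaced
filled = df.interpolate(method="linear")
```

The call returns a new DataFrame and leaves `df` unchanged, so the original gaps remain available for comparison.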
-
Python Iterator
In Python, iterators are used to step through an iterable object such as a list, string, or tuple, retrieving each element in turn.
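A brief sketch of the iterator protocol this refers to: `iter()` obtains an iterator from an iterable, `next()` pulls one element at a time, and a class can take part by defining `__iter__` and `__next__`. The `Countdown` class is a made-up example.

```python
class Countdown:
    """Iterator that yields start, start - 1, ..., 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self  # an iterator returns itself from __iter__

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signals that the iterator is exhausted
        value = self.current
        self.current -= 1
        return value


letters = iter("abc")   # built-in iterables work the same way
first = next(letters)   # consume one element manually
rest = list(letters)    # the remaining elements

countdown = list(Countdown(3))
```

A `for` loop performs the same steps implicitly, calling `next()` until `StopIteration` is raised.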
-
-
Shell/Bash/Zsh/Ksh
-
Jon Chiappetta: Dynamic Bash Shell Prompt (dash)
So it’s been a busy year this year and I haven’t gotten a chance to post much on the good old blog here but I wanted to create something that provided me with more dynamically updating information in my bash shell.
-
-
R
-
Plotting a Logistic Regression In Base R
Logistic regression is a statistical method used for predicting the probability of a binary outcome.
-
Ted Laderas Discusses CascadiaR and the Diverse R Community in Portland
Ted Laderas of the Portland R User Group shared his experience of pioneering the Cascadia R Conference for the Pacific Northwest and the West Coast.
-
-
Rust
-
This Week In Rust: This Week in Rust 518
Hello and welcome to another issue of This Week in Rust!
-
The Rust Programming Language Blog: A tale of broken badges and 23,000 features
Around mid-October of 2023 the crates.io team was notified by one of our users that a shields.io badge for their crate stopped working. The issue reporter was kind enough to already debug the problem and figured out that the API request that shields.io sends to crates.io was most likely the problem. Here is a quote from the original issue:
-