Programming Leftovers
-
MaskRay ☛ Light ELF: exploring potential size reduction
ELF's design emphasizes natural size and alignment guidelines for its control structures. While this ensured efficient processing in the old days, it can lead to larger file sizes. I propose "Light ELF" (EV_LIGHT, version 2) – a whimsical exploration inspired by Light Elves of Tolkien's legendarium (who had seen the light of the Two Trees in Valinor).
-
Luke Harris ☛ Learning Date Comparison in Go
I spent time messing around with Go today after reading this article on ByteSizeGo. The “Beginner Gopher” section had requirements for an age calculator project, and I gave it a shot. “Dates are hard but not that hard”, I thought, confident in Go’s date handling tools. My confidence wasn’t misplaced; the tools are great. But like all tools, when you don’t know what you’re doing they can only get you so far. I quickly got stuck doing date math in my head and trying to translate it into code.
-
Rlang ☛ organize blocks of code in R with with() ?
The variables you specify come from the enclosing scope and would be available as copies within a separate scope defined by the curly braces. Basically, it is an anonymous function with more intuitive (?) syntax. One benefit would be that if you did want to reuse this code block later, it’d be very easy to convert to a full-fledged function, and if it was one-off code you could just leave it. You also get some scoping control and added readability (maybe?).
-
Jack Kelly ☛ Which Build Tool For A Bootstrappable Project?
As far as I can see, the best choice for writing bootstrap-related software in 2024 is still C99, with as few dependencies as possible. Any (hopefully few) necessary dependencies should also be bootstrappable, written in C99 and ideally provide pkg-config-style .pc files to describe the necessary compiler/linker flags. But at least there are several C compilers as well as several implementations of pkg-config (the FreeDesktop one, pkgconf, u-config, etc.).
Since we are compiling C, what should we use for the build system? Autotools is under scrutiny again in the wake of the xz-utils compromise, as code to trigger the payload was smuggled into the dist tarball as “autotools junk” that nobody looks at. Should bootstrappable projects still use autotools, or is there something better in 2024?
-
Simon Josefsson ☛ Towards reproducible minimal source code tarballs? On *-src.tar.gz
While the work to analyze the xz backdoor is in progress, several ideas have been suggested to improve the entire software supply chain ecosystem. Some of those ideas are good, some of the ideas are at best irrelevant and harmless, and some suggestions are plain bad. I’d like to attempt to formalize one idea (it remains to be seen in which category it belongs), which has been discussed before, but the context in which the idea can be appreciated has not been as clear as it is today.
-
Education
-
Archipylago ☛ It's my first time at a meetup - how does it work?
If you're new here, welcome! We're archipylago, a community for Python developers and we organize meetups. If you're new to developer communities, you might have some questions on what meetups are and how they work.
We host our meetups on the second Thursday of the month, on odd-numbered months.
We partner up with local tech companies who sponsor and host our events, usually in their offices. This gives you a wonderful opportunity to get to learn about the companies here in Turku, their company cultures and people there. The companies invite us to their offices and offer food and drinks during the events.
-
Python
-
Daniel Lemire ☛ Fast and concise probabilistic filters in Python
Sometimes you need to filter out or filter in data quickly. Suppose that your employer maintains a list of forbidden passwords or URLs or words. You may store them in a relational database and query them as needed. Unfortunately, this process can be slow and inefficient.
A better approach might be to use a probabilistic filter. A probabilistic filter is a sort of ‘approximate set’. You can ask it whether a key is present in the set, and if it is present, then you will always get ‘true’ (the correct answer). However, when the key is not present, you may still get ‘true’, although with a low probability. So the probabilistic filter is sometimes wrong. Why would you accept a data structure that is sometimes wrong? Because it can be several times smaller and faster than querying the actual set directly.
The best known probabilistic filter is the Bloom filter, but there are many others. For example, we recently presented the binary fuse filters which are smaller and faster than Bloom filters, for large and immutable sets.
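To make the ‘approximate set’ behaviour concrete, here is a minimal Bloom filter sketch using only the Python standard library. It is not the library or the binary fuse filter discussed in the article; the class name `BloomFilter`, its parameters, and the sample password list are assumptions made up for illustration. It relies on the standard Bloom filter sizing formulas and on double hashing of a SHA-256 digest.

```python
import hashlib
import math


class BloomFilter:
    """Toy Bloom filter (hypothetical helper): no false negatives,
    a tunable rate of false positives."""

    def __init__(self, capacity, error_rate=0.01):
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.num_bits = max(8, math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key):
        # Double hashing: derive k bit positions from two 64-bit halves of one digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:16], "little") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key):
        # Set every bit position derived from the key.
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # True only if all of the key's bits are set; may be a false positive.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


# Example data (made up): a small list of forbidden passwords.
forbidden = ["123456", "password", "qwerty", "letmein"]
bf = BloomFilter(capacity=len(forbidden), error_rate=0.01)
for word in forbidden:
    bf.add(word)

# Keys that were added are always reported as present.
print(all(word in bf for word in forbidden))
```

A key that was never added will usually be reported as absent, but it can occasionally test as present; lowering `error_rate` reduces that chance at the cost of more bits per key.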
-