Programming Leftovers
-
Aleksandar Vacić ☛ AI sceptic in LLM adventure land
Ever since ChatGPT and its gen-AI ilk arrived, I have been very vocal and adamant that these are bullshit generators: they try to guess what you want and will continually hallucinate things until you say it’s OK. Depending on the body of existing knowledge they were trained on, they’ve become ever more successful in that endeavour. In the span of just a few years, and after enormous amounts of energy spent on their development and training, those tools have become quite capable of delivering amazing, believable results in certain areas.
-
Zig ☛ Parallel Self-Hosted Code Generation
Less than a week ago, we finally turned on the x86_64 backend by default for Debug builds on Linux and macOS. Today, we’ve got a big performance improvement to it: we’ve parallelized the compiler pipeline even more!
-
Daniel Lemire ☛ Metcalfe’s Law against Brooks’ Law
Software thrives on the network effect, or Metcalfe’s Law, where a system’s value scales with the square of the number of its users. Linux excels because its vast user base fuels adoption, documentation, and compatibility everywhere.
But larger teams don’t build better software—often the reverse. Brooks’ Law, from Fred Brooks’ The Mythical Man-Month, shows that adding people increases communication overhead, slowing progress. The Pareto Principle (80/20 rule) also applies: a small minority drives most meaningful contributions. Great software often stems from a single visionary or a small, cohesive team, not a crowd.
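As a rough illustration of the two laws contrasted above, the arithmetic can be sketched in a few lines of Python. This is a minimal sketch, not something taken from the linked post: the function names are hypothetical, "value" is in arbitrary units, and the team overhead is modelled simply as the number of pairwise communication channels, n(n - 1)/2, which is the usual reading of Brooks' argument.

```python
def metcalfe_value(users: int) -> int:
    """Network value under Metcalfe's Law, in arbitrary units (users squared)."""
    return users * users


def communication_channels(team_size: int) -> int:
    """Pairwise communication channels in a team of n people: n * (n - 1) / 2."""
    return team_size * (team_size - 1) // 2


if __name__ == "__main__":
    # Both quantities grow quadratically: one is a benefit (users),
    # the other is a cost (coordination inside the team).
    for n in (2, 5, 10, 50):
        print(f"n={n}: value={metcalfe_value(n)}, channels={communication_channels(n)}")
```

Under these assumptions both curves grow quadratically, which is the tension the excerpt points at: more users add value, while more developers add coordination cost.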
-
Jussi Pakkanen ☛ A custom C++ standard library part 4: using it for real
Writing your own standard library is all fun and games until someone (which is to say yourself) asks the important question: could this be actually used for real? Theories and opinions can be thrown about the issue pretty much forever, but the only way to actually know for sure is to do it.
-
Nicolas Fränkel ☛ Improving my previous OpenRewrite recipe
I started discovering OpenRewrite last week by writing a Kotlin recipe that moves Kotlin files according to the official directory structure recommendation. I mentioned some future work, and here it is. In this post, I want to describe how to compute the root package instead of letting the user set it.
-
Henrik Warne ☛ Lessons From 9 More Years of Tricky Bugs
Since 2002, I have been keeping track of all the tricky bugs I have come across. Nine years ago, I wrote a blog post with the lessons learned from the bugs up till then. Now I have reviewed all the bugs I have tracked since then. I wanted to see if I have learnt the lessons I listed in the first review. I also wanted to see what kind of bugs I have encountered since then. Like before, I have divided the lessons into the categories of coding, testing and debugging: [...]
-
Clazy 1.15 Released – New Checks, Better Stability
🎉 New Clazy Release: Stability Boost & New Checks!
We’re excited to roll out a new Clazy release packed with bug fixes, a new check, and improvements to existing checks. This release included 34 commits from 5 contributors.
-
R / R-Script
-
Rlang ☛ Use the duplicated Function in R: Find & Remove Duplicates
Your statistical model is built, and your p-values are perfect, but is your conclusion valid? What if a single, overlooked duplicate entry in your dataset is silently skewing your results, leading to flawed insights? How can you be certain that the data you're analyzing is clean, accurate, and trustworthy?
-
Rlang ☛ Celebrating 18 Years of ggplot2: A Special Bundle Offer
It’s been 18 years since ggplot2 revolutionized data visualization in R. To celebrate this milestone, I’m offering a special bundle deal on my data visualization books.
-
Rlang ☛ 📦 {alone} v0.6 is now available
Alone: Australia season 3 has finished and has been added to the {alone} R package 👍 Season 3 was awesome.
-
Java
-
Alisa Sireneva ☛ Splitting independent variables without SSA
I’m making progress on the Java decompiler I’ve mentioned in a previous post, and I want to share the next couple of tricks I’m using to speed it up.
Java bytecode is a stack-based language, and so data flow is a bit cursed, especially when the control flow is complicated. I need to analyze data flow globally for expression inlining and some other stuff. Static single-assignment (SSA) form produces basically everything I need as a byproduct… but it’s not very fast.
-