Programming Leftovers
-
Five Questions
Many years ago, there was a blog post containing five programming problems every software engineer should be able to solve in less than 1 hour. I had bookmarked it at the time and didn't notice the controversy it created on Reddit. The original link seems to be down, but there are various solutions posted online, including a solution in Python.
I finally got around to looking at it and writing up some solutions to the problems listed. Apparently, instead of solving this in 1 hour in Factor, it took me almost 8 years: [...]
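The original post is no longer online, so the exact wording of the problems is an assumption here, but widely mirrored copies open with a warm-up along these lines: write three functions that compute the sum of a list of numbers, using a for-loop, a while-loop, and recursion. A minimal Python sketch of that first problem:

```python
# Problem 1, as it appears in commonly mirrored copies of the list
# (an assumption, since the original post is down): sum a list three ways.

def sum_for(nums):
    """Sum using a for-loop."""
    total = 0
    for n in nums:
        total += n
    return total

def sum_while(nums):
    """Sum using a while-loop."""
    total, i = 0, 0
    while i < len(nums):
        total += nums[i]
        i += 1
    return total

def sum_rec(nums):
    """Sum using recursion."""
    return 0 if not nums else nums[0] + sum_rec(nums[1:])

print(sum_for([1, 2, 3]), sum_while([1, 2, 3]), sum_rec([1, 2, 3]))
```

The later problems in the set escalate quickly, which is part of why the "under 1 hour" claim drew controversy.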
-
Do Large Language Models learn world models or just surface statistics?
From various philosophical [1] and mathematical [2] perspectives, some researchers argue that it is fundamentally impossible for models trained with guess-the-next-word to learn the “meanings” of language, and that their performance is merely the result of memorizing “surface statistics”, i.e., a long list of correlations that do not reflect a causal model of the process generating the sequence. Without knowing whether this is the case, it becomes difficult to align the model to human values and purge spurious correlations picked up by the model [3, 4]. This issue is of practical concern, since relying on spurious correlations may lead to problems on out-of-distribution data.
The goal of our paper [5] (notable-top-5% at ICLR 2023) is to explore this question in a carefully controlled setting. As we will discuss, we find interesting evidence that simple sequence prediction can lead to the formation of a world model. But before we dive into technical details, we start with a parable.
-
AI chatbots learned to write before they could learn to think
The internet can't stop talking about an AI program that can write such artful prose that it seems to pass the Turing Test. College students are writing papers with it, internet marketers are using it to write marketing copy, and numerous others are just having earnest and fun conversations with it about the meaning of life. The AI chatbot in question is called GPT-3, and it's the latest iteration of a long project from the company OpenAI. Short for "Generative Pre-trained Transformer 3," GPT-3 is what is known to computer scientists as a large language model (LLM).
-
OpenAI’s Purpose is to Build AGI, and What That Means
Anyway, the point of all this is to say that this isn’t something that might fall out of ChatGPT. It’s not a conspiracy that they’re trying to build AGI. It’s not a rumor. It’s their stated goal.
-
Adding restaurant review metadata to WordPress
I've started adding Restaurant Reviews to this blog - with delicious semantic metadata. Previously I'd been posting all my reviews to HappyCow. It's a great site for finding veggie-friendly food around the world, but I wanted to experiment more with the IndieWeb idea of POSSE. So now I can Post on my Own Site and Syndicate Elsewhere.
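The post doesn't show its exact markup, but semantic review metadata of this kind is typically expressed as a schema.org `Review` in JSON-LD embedded in the page. A hedged sketch of what such a block might look like, generated with Python (the restaurant, rating, and author values are illustrative, not from the article):

```python
import json

# Hypothetical example of schema.org Review metadata for a restaurant post.
# Type and property names (Review, itemReviewed, Restaurant, reviewRating)
# are real schema.org vocabulary; the values are made up for illustration.
review = {
    "@context": "https://schema.org",
    "@type": "Review",
    "itemReviewed": {
        "@type": "Restaurant",
        "name": "Example Veggie Cafe",  # hypothetical restaurant
        "servesCuisine": "Vegetarian",
    },
    "reviewRating": {
        "@type": "Rating",
        "ratingValue": "4",
        "bestRating": "5",
    },
    "author": {"@type": "Person", "name": "Example Author"},
}

# A WordPress theme or plugin would embed this in the page head or body:
print('<script type="application/ld+json">')
print(json.dumps(review, indent=2))
print("</script>")
```

Search engines and IndieWeb tools can then parse the review data without scraping the prose.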
-
Funny Programming Languages • Buttondown
One of the weirdest and most wonderful things about people is that they can make a joke out of anything. For any human discipline there are people making jokes about that discipline. In programming, that starts with memes like “how do I exit vim” (as typified in places like r/programmerhumor), or funny examples of awful code (such as from TheDailyWTF).
-
Use the Wrong Tool for the Job • Buttondown
I’ve recently been really fascinated by the topic of complexity and what keeps us from keeping software simple. The wider net likes to blame “lazy programmers” and “evil managers” for this, as if any software could, with sufficient time, be made as simple as “hello world”. I’ve instead been looking at how various factors create complexity “pressure”. Code that needs to satisfy a physical constraint is more likely to be complex than code that doesn’t, etc.
One complexity pressure is “impedance”: when the problem you are solving isn’t well suited to the means you have to solve it. For example, if you need to write really fast software, then Python will be too slow. You can get around this by using a foreign function interface, as scientific libraries do, or by running multiple processes, as webdevs do, but these are solutions you might not need if you were using a faster language in the first place. In a sense, impedance is complexity that comes from using “the wrong tool for the job.”
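The multi-process workaround mentioned above can be made concrete with a generic sketch (not from the article): CPU-bound work that Python's GIL would serialize in threads gets spread across worker processes instead, at the cost of extra machinery.

```python
import multiprocessing as mp

def cpu_heavy(n):
    """A stand-in for CPU-bound work that the GIL would serialize."""
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    inputs = [100_000] * 4
    # Spread the work across processes to sidestep the GIL -- this pool
    # setup is exactly the kind of extra complexity you wouldn't need
    # in a language that was fast enough to begin with.
    with mp.Pool(processes=4) as pool:
        results = pool.map(cpu_heavy, inputs)
    print(results)
```

The pool, the pickling of arguments between processes, and the `__main__` guard are all "impedance" costs: accidental complexity introduced purely to work around the tool's mismatch with the problem.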