Programming Leftovers
-
Rachel ☛ A common bug in a bunch of feed readers
Therein lies the shared bug: they're not designed around the notion of "always return the values you got last time from the web server". If they had been, they would not throw them away just because the hash matched.
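To make the idea concrete, here is a minimal client-side sketch in Python. The function names and the `cached` dictionary layout are hypothetical, not taken from any particular feed reader; the point is only that the validators from the last response are stored and sent back verbatim, whether or not the body's hash changed.

```python
def remember_validators(response_headers):
    """Save the validators exactly as the server sent them (hypothetical helper)."""
    return {
        "etag": response_headers.get("ETag"),
        "last_modified": response_headers.get("Last-Modified"),
    }


def conditional_headers(cached):
    """Build the next request's headers from the previously saved validators."""
    headers = {}
    if cached.get("etag"):
        # Echo the ETag back unchanged; do not recompute or reformat it.
        headers["If-None-Match"] = cached["etag"]
    if cached.get("last_modified"):
        # Echo the date string back unchanged; do not substitute the local clock.
        headers["If-Modified-Since"] = cached["last_modified"]
    return headers


# Update the cache on every successful fetch, even if the body hash is unchanged.
cached = remember_validators(
    {"ETag": '"abc123"', "Last-Modified": "Sun, 01 Sep 2024 12:00:00 GMT"}
)
next_request_headers = conditional_headers(cached)
```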
-
Sean Conner ☛ For a few hours yesterday, I felt as if I was taking crazy pills. Then again, I was dealing with time in C
Over the past year or two, I've been cleaning up my posts here. Largely making sure the HTML is valid, but also that all internal links (links where I link to previous posts) are valid to cut down on needless redirects or “404 Not Found” responses, in addition to fixing errors with my web server configuration. So along those lines, yesterday, I thought it might be time to add conditional responses to mod_blog. Given that it's mostly autonomous web crawling agents that read my site, I might as well tell them that most of the links here haven't changed since the last time they came by.
There are two headers involved with this: Last-Modified and If-Modified-Since. The server sends a Last-Modified header with the last modification date of the resource in question. The client can then send in an If-Modified-Since header with this date, and if the resource hasn't changed, then the server can send back a “304 Not Modified” response, saving a lot of bandwidth. So all I had to do was generate a Last-Modified header (easy, as I already read that information) and then deal with the If-Modified-Since header.
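The exchange described above can be sketched on the server side as follows. This is an illustration in Python rather than the C used by mod_blog, and the function names are hypothetical. It assumes the modification time is a timezone-aware `datetime`.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def last_modified_header(mtime):
    """Format an aware datetime as an HTTP date for the Last-Modified header."""
    return format_datetime(mtime.astimezone(timezone.utc), usegmt=True)


def is_not_modified(if_modified_since, mtime):
    """Return True when a 304 Not Modified response would be appropriate."""
    if not if_modified_since:
        return False
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        # An unparseable date is ignored, so the full response is sent.
        return False
    if since.tzinfo is None:
        # Treat a date without a usable zone as UTC (an assumption of this sketch).
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop sub-second precision.
    return mtime.replace(microsecond=0) <= since


mtime = datetime(2024, 9, 1, 12, 0, 0, tzinfo=timezone.utc)
header = last_modified_header(mtime)
status = 304 if is_not_modified(header, mtime) else 200
```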
-
Michal Pitr ☛ Inference Engine: Optimizing Performance
Today I wanted to add graph optimizations to my inference engine, hoping for maybe a 5-10% performance improvement. Instead, I accidentally found a critical bottleneck!
If you haven’t already, you can read about how I wrote an inference engine from scratch here.
-
Jonathan Y Chan ☛ Generating random unit vectors in Elixir Nx
Here’s an Elixir Nx function that implements the algorithm.
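The excerpt does not say which algorithm the post uses. As a point of comparison only, one widely used way to draw a uniformly distributed unit vector is to sample each coordinate from a standard normal distribution and normalize the result. A minimal Python sketch of that approach (not the article's Nx code; `random_unit_vector` is a hypothetical name):

```python
import math
import random


def random_unit_vector(dim, rng=random):
    """Sample a point uniformly on the unit sphere in `dim` dimensions."""
    while True:
        # Independent standard normals give a rotationally symmetric vector.
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:  # Retry in the (vanishingly rare) all-zero case.
            return [x / norm for x in v]


vector = random_unit_vector(3)
```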
-
Noel Rappin ☛ What About Static Typing in Ruby?
I’ve tried writing this literally a half-dozen times. And it always feels like it slips out of control and gets too abstract to be useful.
So, let’s start with something concrete. And we’re going to wind up splitting this into multiple parts. Probably two, but honestly, at this point who knows?
This all got started because I was discussing the use of runtime checking using Sorbet. The other person gave me a code snippet and asked how I would manage it without type checking. We kind of got distracted and I never really answered, but then I spent literally the next month trying to answer the question in my head. It’s echoey in there.
-
Rlang ☛ Post-hoc Adjustment for Zero-Thresholded Linear Models
Suppose you are modeling a process that you believe is well approximated as being linear in its inputs, but only within a certain range. Outside that range, the output might saturate or threshold: for example, if you are modeling a count or a physical process, you likely can never get a negative outcome. Similarly, a process can saturate to an upper bound value outside a given range of the input data.
However, you may still want to model the process as linear under the assumption that you don’t expect the process to hit the saturation point too often. But what if it does? For simplicity we’ll look specifically at the case where you expect the process to return non-negative values, and you hope it doesn’t saturate to zero very often.
When you don’t expect to see too many zeros in practice, modeling the process as linear and thresholding negative predictions at zero is not unreasonable. But the more zeros (saturations) you expect to see, the less well a linear model will perform.
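The baseline described here, a linear fit whose negative predictions are thresholded at zero, can be sketched in a few lines. This is a single-input ordinary least squares fit in plain Python with hypothetical function names and made-up data; it does not show the post-hoc adjustment the article goes on to discuss.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_thresholded(x, slope, intercept):
    """Linear prediction, with negative values clipped to zero."""
    return max(0.0, slope * x + intercept)


# Illustrative data only: y = 2x - 1, which goes negative for small x.
slope, intercept = fit_line([1, 2, 3, 4], [1, 3, 5, 7])
at_zero = predict_thresholded(0, slope, intercept)  # raw prediction is negative
at_two = predict_thresholded(2, slope, intercept)
```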
-
Daniel Lemire ☛ Faster random integer generation with batching
We often generate random integers. Quite often these numbers must be within an interval: e.g., an integer between 0 and 100. One application is a random shuffle. A standard algorithm for a fair random shuffle is the Knuth algorithm: [...]
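For reference, the Knuth (Fisher-Yates) shuffle can be sketched in Python as below. Each step needs one random integer from a shrinking interval, which is why the cost of bounded random integer generation matters for shuffling. This sketch draws the integers one at a time and does not implement the batching the article describes.

```python
import random


def knuth_shuffle(items, rng=random):
    """Shuffle a list in place so that every permutation is equally likely."""
    for i in range(len(items) - 1, 0, -1):
        # Pick j uniformly from the interval [0, i].
        j = rng.randrange(i + 1)
        items[i], items[j] = items[j], items[i]
    return items


deck = knuth_shuffle(list(range(10)))
```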
-
Medevel ☛ Charts.css is an open source CSS framework for data visualization.
Visualization helps end users understand data. Charts.css helps frontend developers turn data into beautiful charts and graphs using simple CSS classes.
-
Medevel ☛ Why PNPM Should Be Your Go-To Node Package Manager: Installation and Usage Guide for Developers
In the world of Node.js development, managing packages efficiently is crucial. For years, NPM (Node Package Manager) has been the standard choice, but recently, PNPM has emerged as a strong alternative, offering significant improvements in performance, storage efficiency, and developer experience.
-
Python
-
James G ☛ How to implement TF-IDF in Python
Once you have a search index, the next step is to implement a ranking algorithm. A ranking algorithm takes documents that meet the criteria in the search (called “candidates”) and ranks them according to a specific formula. There are many formulas that are widely implemented for document ranking, including TF-IDF and BM25.
In this guide, we are going to implement the Term Frequency / Inverse Document Frequency (TF-IDF) algorithm.
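As a rough idea of what such an implementation involves, here is a minimal sketch. It assumes one common variant of the formula (term count divided by document length, multiplied by the natural log of total documents over documents containing the term) and whitespace tokenization. The guide itself may use a different weighting, and `tf_idf` is a hypothetical name.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: the number of documents each term appears in.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


scores = tf_idf(["the cat sat", "the dog ran"])
```

With this variant, a term that appears in every document ("the") scores zero, while a term unique to one document ("cat") scores above zero.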
-
James G ☛ How to build a query language in Python
In this guide, I walk through how to build a query language in Python. No prior knowledge of query languages is required to follow this guide. You will find this article easier to understand if you have some knowledge of trees.
-
Simon Willison ☛ Upgrading my cookiecutter templates to use python -m pytest
Every now and then I get caught out by weird test failures when I run pytest and it turns out I'm running the wrong installation of that tool, so my tests fail because that pytest is executing in a different virtual environment from the one needed by the tests.
-
James G ☛ Designing a fuzzer for Knowledge Graph Language
I have been writing a test suite for Knowledge Graph Language (KGL), a concise syntax for querying knowledge graphs. My test suite ensures that, given a specific input, the language execution engine returns the correct response. Specific functionalities are tested, too, such as data imports and class methods that allow someone to manipulate a knowledge graph in Python.
-
Juha-Matti Santala ☛ Combine iterables with zip
Zip allows you to combine two or more iterables into one with each corresponding item being grouped: [...]
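A short illustration, with made-up data:

```python
names = ["Ada", "Grace", "Linus"]
languages = ["Python", "COBOL", "C"]

# Items at the same position are grouped into tuples.
pairs = list(zip(names, languages))

# zip stops at the shortest iterable, so the trailing 3 is dropped here.
truncated = list(zip([1, 2, 3], "ab"))

# It also works with more than two iterables.
triples = list(zip(names, languages, [1, 2, 3]))
```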
-
Anže Pečar ☛ Go-like Error Handling Makes no Sense in JavaScript or Python
Yesterday, I saw this proposal to add Go-like error handling to JavaScript, which got me thinking about whether or not this would make sense in my go-to language, Python.
TLDR: Even though I am a fan of Go’s error handling, I don’t think the safe assignment operator adds any value to Python or JavaScript. For the real solution, we’d probably have to look at Java instead 😅
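For readers who have not seen the pattern, this is roughly what errors-as-values looks like when emulated in Python. `safe_call` is a hypothetical helper written for this illustration, not part of Python or of the proposal, and it assumes the convention of returning an (error, value) pair.

```python
def safe_call(func, *args, **kwargs):
    """Call func and return (error, value) instead of raising."""
    try:
        return None, func(*args, **kwargs)
    except Exception as exc:
        return exc, None


# Success: the error slot is None.
err, value = safe_call(int, "42")

# Failure: the exception comes back as a value and must be checked by hand.
err2, value2 = safe_call(int, "not a number")
if err2 is not None:
    value2 = 0  # The caller chooses a fallback explicitly.
```

Nothing forces the caller to check `err`, which is one reason to weigh this against a plain `try`/`except`.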
-
Medevel ☛ Streamlit: Build Data Apps from Simple Python Scripts
Streamlit is an open-source, self-hosted Python platform that makes it incredibly easy to create and share web applications for machine learning and data science.
-
Medevel ☛ Mercury: Convert Python Jupyter Notebook to Web App
Mercury is a free and open-source app that allows you to add interactive widgets in Python notebooks, so you can share notebooks as web applications.
-