Programming Leftovers
-
Ben Hoyt ☛ Using a Markov chain to generate readable nonsense with 20 lines of Python
I recently learned how to generate text using a simple Markov chain. The generated text is readable but is also complete nonsense; as prose it’s not worth much, but for predicting the next word, like your phone keyboard’s suggestions, it’s surprisingly useful.
I learned about this algorithm in chapter 3 of Kernighan and Pike’s book The Practice of Programming, where they implement such a generator in various programming languages to discuss program design and data structures.
Note that this algorithm is only one small use for Markov chains, which are a much more general statistical concept.
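As a rough illustration of the idea, here is a minimal sketch of a word-level Markov chain generator. It is not the code from the linked post or from the book: the function names `build_chain` and `generate`, the sample text, and the choice of a single-word state are all assumptions made for this example.

```python
import collections
import random


def build_chain(words):
    """Map each word to the list of words that follow it in the input."""
    chain = collections.defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, max_words, rng=random):
    """Walk the chain from `start`, picking each next word at random."""
    word = start
    output = [word]
    for _ in range(max_words - 1):
        followers = chain.get(word)
        if not followers:
            # Dead end: this word was never followed by anything.
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


# Hypothetical sample text, only for demonstration.
text = "the quick brown fox jumps over the lazy dog and the quick cat"
chain = build_chain(text.split())
print(generate(chain, "the", 8))
```

Because repeated followers are kept in the list, a word that follows "the" twice is twice as likely to be chosen, which is what makes the output loosely resemble the input. A longer state (for example two words) would probably give more readable text at the cost of a larger table.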
-
TecAdmin ☛ Creating MySQL User with GRANT OPTION
In the realm of database management, user privileges are the backbone of security and access control. MySQL, as one of the most popular relational database management systems, offers a comprehensive suite of commands for managing user permissions, tailored to safeguard data integrity and confidentiality.
-
Chris Hannah ☛ More Thoughts on the Type of Programmer I Am
I wrote back in July about my programming career, how it had changed, what kind of programmer I see myself as now, and what I want to become in the future. It’s a reasonably long post (~1000 words), but the tl;dr is essentially this: I joined as an iOS developer, but after some internal changes I’m now primarily writing server applications in Java, alongside other fairly random projects, e.g. JavaScript scripts for NetSuite, Python scripts for data operations, etc.
-
Michael's and Christian's blog ☛ Permutation SHAP versus Kernel SHAP
SHAP is the predominant way to interpret black-box ML models, especially for tree-based models with the blazingly fast TreeSHAP algorithm.
-
Lee Yingtong Li ☛ Efficiently reading a CSV of floats in Rust
In hpstat, we are often required to read large CSV files consisting of a header of string column names, followed by data consisting entirely of floating-point numbers. Profiling reveals that a naive approach based on generic CSV parsers is inefficient at this task.
-
University of Toronto ☛ Go modules and the domain expiry problem
Every programming language with aspirations of having a usable system for third-party packages has some sort of a namespace problem. Today, Tony Arcieri posted something about Rust's package namespace issues, which caused me to think about Go's approach to the problem. The concise summary is that Go outsources the problem to other people by making package names URLs. Filippo Valsorda noted that this doesn't solve the domain expiration problem, which is true.