Programming Leftovers
-
Some notes (to myself) about formatting text in jq
These days I'm having to deal with a steadily increasing number of commands that either output JSON only or where JSON is their best output option, and I want to reformat some of that JSON to a more useful or more readable text-based format. The obvious tool to do this with is jq, at least for simple reformatting (I think there are some things that are too tangled for jq). However, every time I need to do this, I keep having to look up how to format text in jq. Jq has a very big manual and a lot of features, so here are some notes to my future self about this.
-
Kernel SHAP
Our last posts were on SHAP, one of the major ways to shed light into black-box Machine Learning models. SHAP values decompose predictions in a fair way into additive contributions from each feature. Decomposing many predictions and then analyzing the SHAP values gives a relatively quick and informative picture of the fitted model at hand.
In their 2017 paper on SHAP, Scott Lundberg and Su-In Lee presented Kernel SHAP, an algorithm to calculate SHAP values for any model with numeric predictions. Compared to Monte-Carlo sampling (e.g. as implemented in the R package “fastshap”), Kernel SHAP is much more efficient.
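To make the "additive contributions" idea concrete, here is a minimal sketch that computes exact Shapley values for a single row by brute-force enumeration of feature subsets. It is not Kernel SHAP (which approximates these values through a weighted regression) and not the fastshap sampler; the toy model, the background rows, and the function names are all made up for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, background):
    """Exact Shapley values for one row x (hypothetical helper, brute force)."""
    p = len(x)

    def value(subset):
        # Average prediction when the features in `subset` are fixed to x
        # and the remaining features are taken from each background row.
        total = 0.0
        for row in background:
            mixed = [x[j] if j in subset else row[j] for j in range(p)]
            total += predict(mixed)
        return total / len(background)

    phi = []
    for i in range(p):
        others = [j for j in range(p) if j != i]
        contrib = 0.0
        for size in range(p):
            # Shapley weight for subsets of this size that exclude feature i.
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for s in combinations(others, size):
                s = set(s)
                contrib += weight * (value(s | {i}) - value(s))
        phi.append(contrib)

    baseline = value(set())  # average prediction over the background rows
    return baseline, phi


def toy_model(row):
    # Assumed toy model: one linear term plus one interaction term.
    return 2.0 * row[0] + row[1] * row[2]


background = [[0.0, 0.0, 0.0], [1.0, 2.0, 1.0], [2.0, 1.0, 3.0]]
x = [3.0, 2.0, 4.0]

baseline, phi = shapley_values(toy_model, x, background)
print("baseline:", baseline)
print("contributions:", phi)
print("baseline + sum:", baseline + sum(phi), "prediction:", toy_model(x))
```

The baseline plus the per-feature contributions adds up to the model's prediction for the row, which is the additive decomposition described above. The enumeration visits every subset of features, so its cost grows exponentially with the number of features; that cost is what approximations such as sampling or Kernel SHAP are meant to avoid.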
-
Bring Your Own Binary Packages with RSPM
Installing R packages from source can be a slow process. This is compounded by the challenge of making sure you have all the right system libraries and compilers installed. CRAN eases the burden on most desktop R users by providing pre-built binary packages for both Windows and MacOS, but Linux users (or anyone using a Linux-based environment like Docker) are still expected to build from source.
-
Highlights from rstudio::conf(2022)
July 25 – 28 2022 saw thousands of people attend rstudio::conf(2022) both in-person in Washington D.C. and virtually from all over the world, including a few of us from Jumping Rivers. Here’s a recap of the big news, and a few of our personal highlights from the conference!
-
Dirk Eddelbuettel: RcppArmadillo used by 1001 CRAN Packages
It is with a mix of pride and joy, but also some genuine astonishment and amazement, that we can share that the counter of reverse dependencies at CRAN for our RcppArmadillo package for R just crossed 1000 packages [1]:
Conrad actually posted this a few weeks ago; by my count we were then still a few packages shy. In any event, having crossed this marker this summer, either then or now, and after more than a dozen years of working on the package is a really nice moment. Google Scholar counts nearly 500 citations for our CSDA paper (also this vignette), and that ratio of nearly a citation for every two packages used is certainly impressive. We have had the pleasure of working with so many other researchers and scientists using RcppArmadillo. Its combination of performance (C++, after all, and heavily tuned) and ease-of-use (inspired by ‘another popular flavour for matrix computing’ that is however mostly interpreted) makes for a powerful package, and we are delighted to see it used so widely.
-
Top 7 Python Developer Tools
Believe it or not, today Python is considered one of the most powerful programming languages, and its adoption is spreading widely. The number of Python developers has surged over the past couple of years at a rate of 27% YoY (year on year). Last year Python marked 30 years of success, a clear sign that it is going to disrupt the market in the upcoming few years.
-
p6steve: TRC Slides
-
Symbolism | Playing Perl 6␛b6xA Raku
On IRC deoac wished to know how to print the name of a variable. This question is ambiguous. To get the name of the container is easy.