Programming Leftovers
-
Rlang ☛ R dtplyr: How to Efficiently Process Huge Datasets with a data.table Backend
In a world where compute time is billed by the second, make every one of them count. There are zero valid reasons to utilize a quarter of your CPU and memory, but achieving complete resource utilization isn’t always a straightforward task. That is, if you don’t know about R dtplyr.
One option is to use dplyr. It’s simple to use and has intuitive syntax. But it’s slow. The other option is to use data.table. It’s lightning-fast but has a steep learning curve and syntax that’s not too friendly to follow. The third – and your best option – is to combine the simplicity of dplyr with the efficiency of data.table. And that’s where R dtplyr chimes in!
Today you’ll learn just how easy it is to switch from dplyr to dtplyr, and you’ll see hands-on the performance differences between the two. Let’s dig in!
-
Rlang ☛ Learning Path: Introduction to R
R stands out as a programming language for statistical computing and data visualization, offering advanced capabilities for data exploration, manipulation, visualization, and analysis. This online workshop path is the ideal opportunity to strengthen your foundations in R programming.
It can cater to both total beginners looking for a start-up course in R and experienced users seeking to enhance their approach to professional R development. Our goal is to provide a comprehensive overview of R programming, offering practical examples, and optimal development methodology with the correct tools provided by the R ecosystem. Attendees will reinforce their understanding through hands-on exercises in experiential, agile, and dynamic workshop sessions.
-
Armin Ronacher ☛ On Tech Debt: My Rust Library is now a CDO
You're probably familiar with tech debt. There is a joke that if there is tech debt, surely there must be derivatives to work with that debt? I'm happy to say that the Rust ecosystem has created an environment where it looks like one solution for tech debt is collateralization.
Here is how this miracle works. Say you have a library `stuff` which depends on some other library `learned-rust-this-way`. The author of `learned-rust-this-way` at one point lost interest in this thing and issues keep piling up. Some of those issues are feature requests, others are legitimate bugs. However, you as the person who wrote `stuff` never ran into any of those problems. Yet it's hard to argue that `learned-rust-this-way` isn't tech debt. It's one that does not bother you all that much, but it's debt nonetheless.
-
Marc Brooker ☛ Finding Needles in a Haystack with Best-of-K
As I’ve written about before, best of two and best of k are surprisingly powerful tools for load balancing in distributed systems. I have deployed them many times in large-scale production systems, and been happy with the performance nearly every time. There is one case where they don’t perform so well, though: when the bins are very limited in size.
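For readers unfamiliar with the technique, here is a minimal simulation sketch of best-of-k placement: each request samples k bins at random and goes to the least loaded of them. The function name `place_balls`, the bin and request counts, and the seed are assumptions chosen for illustration and are not taken from the linked post. The sketch also does not model the capacity-limited bins that the excerpt identifies as the hard case.

```python
import random


def place_balls(num_bins, num_balls, k, rng):
    """Place each ball in the least-loaded of k randomly sampled bins.

    Bins are sampled with replacement, so the same bin may be drawn twice.
    With k=1 this degenerates to purely random placement.
    """
    loads = [0] * num_bins
    for _ in range(num_balls):
        # Sample k candidate bins uniformly at random.
        candidates = [rng.randrange(num_bins) for _ in range(k)]
        # Pick the candidate with the smallest current load.
        target = min(candidates, key=lambda b: loads[b])
        loads[target] += 1
    return loads


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so runs are repeatable
    for k in (1, 2, 4):
        loads = place_balls(num_bins=1000, num_balls=1000, k=k, rng=rng)
        print(f"k={k}: max load = {max(loads)}")
```

Comparing the maximum load for k=1 against k=2 is one way to see why sampling even a single extra bin tends to even out the distribution.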
-
Buttondown ☛ Why do regexes use `$` and `^` as line anchors?
Next week is April Cools! A bunch of tech bloggers will be writing about a bunch of non-tech topics. If you've got a blog come join us! You don't need to drive yourself crazy with a 3000-word hell essay, just write something fun and genuine and out of character for you.
But I am writing a 3000-word hell essay, so I'll keep this one short. Last week I fell into a bit of a rabbit hole: why do regular expressions use `$` and `^` as line anchors?
-
Rlang ☛ Canadamaps 0.3.0
The creation of Canadamaps is deeply rooted in a journey from adversity to contribution.
-
Python
-
Pyright: A Static Type Checker for Python (Install + Use)
Python is the most popular programming language in the world, as it provides great flexibility and simplicity for writing and building large applications in AI/ML, automation, web, desktop, and more.
-
[Repeat] The Register UK ☛ Over 170K users caught up in poisoned Python package ruse
That malware stole data from people's browsers, Discord app, crypto wallets, and files that matched certain keywords. As of now, it's not clear where this data was sent.
There were multiple prongs to this remarkably complicated attack: clones of popular Python packages such as Colorama, a doppelganger or typosquatted domain for Python packages, and code obfuscation. Also reported are account break-ins across trusted GitHub community members. All of these tactics were used to successfully steal user data from an undetermined number of developers.
-
Standards/Consortia
-
404 Media ☛ 404 Media Now Has a Full Text RSS Feed
We paid for the development of full text RSS feeds for Ghost-based publishers. Now we can offer them to our paid subscribers, and other Ghost sites can use the service too.
-
Aral Balkan ☛ Draw Together
-