Programming Leftovers
-
Modus Create LLC ☛ Evaluating retrieval in RAGs: a practical framework
Evaluation of Retrieval-Augmented Generation (RAG) systems is paramount for any industry-quality usage. Without proper evaluation we end up in the world of “it works on my machine”. In the realm of AI, this would be called “it works on my questions”.
Whether you are an engineer seeking to refine your RAG systems, are just intrigued by the nuances of RAG evaluation or are eager to read more after the first part of the series (Evaluating retrieval in RAGs: a gentle introduction) — you are in the right place.
This article equips you with the knowledge needed to navigate evaluation in RAGs and the framework to systematically compare and contrast existing evaluation libraries. This framework covers benchmark creation, evaluation metrics, parameter space and experiment tracking.
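As a rough illustration of the "evaluation metrics" piece of such a framework, here is a minimal sketch of two widely used retrieval metrics, recall@k and mean reciprocal rank. The excerpt does not say which metrics the article uses, so the function names, the benchmark layout (question ID mapped to a set of relevant document IDs) and the sample data are all assumptions made for this example.

```python
from typing import Dict, List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved: List[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(benchmark: Dict[str, Set[str]],
             runs: Dict[str, List[str]],
             k: int) -> Dict[str, float]:
    """Average both metrics over every question in the benchmark."""
    n = len(benchmark)
    if n == 0:
        return {f"recall@{k}": 0.0, "mrr": 0.0}
    recall = sum(recall_at_k(runs.get(q, []), rel, k)
                 for q, rel in benchmark.items()) / n
    mrr = sum(reciprocal_rank(runs.get(q, []), rel)
              for q, rel in benchmark.items()) / n
    return {f"recall@{k}": recall, "mrr": mrr}


# Hypothetical benchmark: question ID -> IDs of documents judged relevant.
benchmark = {"q1": {"d1", "d4"}, "q2": {"d7"}}
# Hypothetical retriever output: question ID -> ranked document IDs.
runs = {"q1": ["d4", "d2", "d1"], "q2": ["d3", "d7", "d9"]}

scores = evaluate(benchmark, runs, k=2)
print(scores)  # {'recall@2': 0.75, 'mrr': 0.75}
```

Scoring a fixed benchmark this way, rather than a handful of ad hoc questions, is what makes two retriever configurations comparable.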
-
Bartosz Milewski ☛ Neural Networks, Pre-Lenses, and Triple Tambara Modules
Neural networks are an example of composable systems, so it’s no surprise that they can be modeled in category theory, which is the ultimate science of composition. Moreover, the categorical ideas behind neural networks can be immediately implemented and tested in a programming language. In this post I will present the Haskell implementation of parametric lenses, generalize them to pre-lenses and introduce their profunctor representation. Using the profunctor representation I will build a working multi-layer perceptron.
In the second part of this post I will introduce the bicategory PreLens of pre-lenses and the bicategory of triple Tambara profunctors, and show how they relate to pre-lenses.
The complete Haskell implementation is available on GitHub, where you can also find the PDF version of this post, complete with the categorical picture.
-
Rlang ☛ R doParallel: How to Parallelize R DataFrame Computations
Parallelizing R dataframe computation is a guaranteed way to shave minutes or even hours from your data processing pipeline compute time. Sure, it adds more complexity to the code, but it can drastically reduce your computing bills, especially if you’re doing everything in the cloud.
The R doParallel package provides a significant speed increase to your dataframe calculations while minimizing code changes. It has all you need and more to get your feet wet in the world of dataframe parallelization, and today you’ll learn all about it. After reading, you’ll know what changes you need to make to run your code in parallel, and how your CPU core count affects total compute time and overhead (initialization) time.
-
Daniel Lemire ☛ Passing recursive C++ lambdas as function pointers
In modern C++, as in many popular languages, you can create ‘lambdas’. Effectively, they are potentially anonymous function instances that you can create on the fly as you are programming, possibly inside another function. The following is a simple example.
-
Trail of Bits ☛ Why fuzzing over formal verification?
We recently introduced our new offering, invariant development as a service. A recurring question that we are asked is, “Why fuzzing instead of formal verification?” And the answer is, “It’s complicated.”
We use fuzzing for most of our audits but have used formal verification methods in the past. In particular, we found symbolic execution useful in audits such as Sai, Computable, and Balancer. However, we realized through experience that fuzzing tools produce similar results but require significantly less skill and time.
In this blog post, we will examine why the two principal assertions in favor of formal verification often fall short: proving the absence of bugs is typically unattainable, and fuzzing can identify the same bugs that formal verification uncovers.
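To make the idea of fuzzing an invariant concrete, here is a toy sketch. It is not Trail of Bits' tooling and it does not target a smart contract; the `Ledger` classes, the `fuzz` helper and the "total supply never changes" invariant are all hypothetical, chosen only to show the shape of the technique: apply random operations and check the invariant after each one.

```python
import random
from typing import List, Optional, Tuple, Type

Step = Tuple[int, int, int]  # (source account, destination account, amount)


class Ledger:
    """Toy ledger whose transfers should never change the total supply."""

    def __init__(self, n_accounts: int = 3, initial: int = 100) -> None:
        self.balances = [initial] * n_accounts

    def transfer(self, src: int, dst: int, amount: int) -> None:
        if self.balances[src] < amount:
            return  # insufficient funds: no state change
        self.balances[src] -= amount
        self.balances[dst] += amount


class BuggyLedger(Ledger):
    """Same ledger with a deliberately planted self-transfer bug."""

    def transfer(self, src: int, dst: int, amount: int) -> None:
        if self.balances[src] < amount:
            return
        new_src = self.balances[src] - amount
        new_dst = self.balances[dst] + amount
        self.balances[src] = new_src
        self.balances[dst] = new_dst  # if src == dst, this overwrites the debit


def fuzz(ledger_cls: Type[Ledger], steps: int = 1000,
         seed: int = 0) -> Optional[List[Step]]:
    """Apply random transfers; return the failing trace, or None if none found."""
    rng = random.Random(seed)  # seeded so that a failure can be replayed
    ledger = ledger_cls()
    n = len(ledger.balances)
    expected_total = sum(ledger.balances)
    history: List[Step] = []
    for _ in range(steps):
        step = (rng.randrange(n), rng.randrange(n), rng.randint(1, 50))
        ledger.transfer(*step)
        history.append(step)
        if sum(ledger.balances) != expected_total:  # the invariant under test
            return history
    return None


clean = fuzz(Ledger)
trace = fuzz(BuggyLedger)
print("correct ledger:", "no violation found" if clean is None else clean)
print("buggy ledger: violation found" if trace else "buggy ledger: nothing found")
```

A `None` result only means that no violation was found within the sampled inputs. It is not a proof that none exists.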
-
James G ☛ Designing a knowledge graph query language
In searching existing graph databases, I was a bit let down by the number of options that were proprietary, and the relative complexity of the syntax. Thus, with an itch to make something new, I set out to make my own graph query language and index. I have called it Knowledge Graph Language (KGL).
In this blog post, I am going to discuss some of the history behind this project and how I have designed the syntax.
-
Fermyon to Donate Open Source Wasm Platform to CNCF
Fermyon applied to donate an open source platform for building, deploying and managing Wasm apps on Kubernetes clusters to the CNCF.
-
Rlang ☛ ggbrick is now on CRAN
If you’re looking for something a little different, ggbrick creates a ‘waffle’ style chart with the aesthetic of a brick […]
The post ggbrick is now on CRAN appeared first on Dan Oehm | Gradient Descending.
-
Rlang ☛ Causal Effect of Approval of ETF for Bitcoin on the Prices
As is well known, the US Securities and Exchange Commission (SEC) made a historic decision for the cryptocurrency markets on January 10, approving spot Bitcoin ETF applications.
-
SANS ☛ Whois "geofeed" Data, (Thu, Mar 21st)
Attributing a particular IP address to a specific location is hard and often fails miserably. There are several difficulties that I have talked about before: out-of-date whois data, and data that is outright fake or was never correct in the first place. Companies that have been allocated a larger address range are splitting it up into different geographic regions, but do not reflect this in their whois records.
-
Chris ☛ Intrusive Unit Testing
-
Rust
-
Rust Blog ☛ Announcing Rust 1.77.0
The Rust team is happy to announce a new version of Rust, 1.77.0. Rust is a programming language empowering everyone to build reliable and efficient software.
-
LWN ☛ Rust 1.77.0 released
Version 1.77.0 of the Rust language has been released. Changes include support for NUL-terminated C-string literals, the ability for async functions to call themselves recursively, the stabilization of the offset_of!() macro, and more.
-
-
Shell/Bash/Zsh/Ksh
-
Oil Shell ☛ Oils 0.21.0 - Flags, Integers, Starship Bug, and Speed
This is the latest version of Oils, a Unix shell. It's our upgrade path from bash to a better language and runtime:
"Oils version 0.21.0 - Source tarballs and documentation."
If you're new to the project, see the Oils 2023 FAQ and posts tagged #FAQ.
-
-
Standards/Consortia
-
James G ☛ HTML's readability, robustness, and intuitiveness
I find HTML readable. And it is explicit. After a bit of learning, I developed the mental mapping that p was a paragraph, h1 was a top-level heading, etc. While my initial knowledge may have spanned only the foundational text elements, and less of the semantic layout elements available -- main, section, header, etc. -- I felt I could do a lot with a little knowledge. This feeling of empowerment feels significant: with a bit of studying, you can get a long way with HTML.
-
Rodrigo Ghedin ☛ Plain text email
HTML email has some obvious disadvantages, such as less security due to hiding links and loading remote media. An incidental problem is that, unlike web browsers, email clients/apps do not follow web standards — each one renders HTML differently, which makes the design of newsletter layouts, for example, a hellish endeavor.
Another problem with HTML is that messages in this format are heavier, because they have invisible parts (headers and the HTML code itself) and visible ones (images, in particular) that the plain text counterpart doesn’t have.
-