Programming Leftovers
-
Python is my default choice for scripts that process text
Every so often I wind up writing something that needs to do something more complicated than can be readily handled in Bourne shell, awk, or other basic Unix scripting tools. When this happens, the language I most often turn to is Python, and Python is especially my default choice when the work involves processing text in some way (or, often, generating text). For example, if I want to analyze the output of some command and generate Prometheus metrics from it, Python is often my choice. These days, this is Python 3, even with its warts in handling non-Unicode input (which usually don't come up in this context).
(What a lot of these programs do could be summarized as string processing with logic.)
In theory there's no obvious reason that my language of choice couldn't be, say, Go. But in practice, Python has much less friction than something like Go while still having enough structure and capabilities to be better than a much more limited tool like awk. One part of this is Python's casualness about typing, especially typing in dicts. In Python, you can shove anything you want into a dict and it's completely routine to have dicts with heterogeneous values (usually your keys are homogeneous, eg all strings). This might be madness in a large program, but for small, quickly written things it's a great speedup.
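As a rough illustration of the kind of script described above, here is a minimal sketch that parses some command output into dicts with mixed value types and prints lines in the Prometheus text format. The sample output, the `demo_pool_*` metric names, and the helper functions `parse` and `to_metrics` are all made up for this example; they are not taken from the original post.

```python
import re

# Hypothetical command output; a real script would read this from a
# subprocess or from standard input.
SAMPLE_OUTPUT = """\
pool tank state ONLINE size 1024 errors 0
pool scratch state DEGRADED size 512 errors 3
"""

# One line per pool: name, state, size, and error count.
LINE_RE = re.compile(r"^pool (\S+) state (\S+) size (\d+) errors (\d+)$")


def parse(text):
    """Return one dict per pool; values are a mix of str, bool, and int."""
    pools = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # silently skip lines we don't understand
        name, state, size, errors = m.groups()
        pools.append({
            "name": name,                 # str
            "healthy": state == "ONLINE", # bool
            "size": int(size),            # int
            "errors": int(errors),        # int
        })
    return pools


def to_metrics(pools):
    """Format the parsed pools as Prometheus text-format sample lines."""
    lines = []
    for p in pools:
        label = '{pool="%s"}' % p["name"]
        # %d renders True/False as 1/0, which suits a gauge.
        lines.append("demo_pool_healthy%s %d" % (label, p["healthy"]))
        lines.append("demo_pool_size%s %d" % (label, p["size"]))
        lines.append("demo_pool_errors%s %d" % (label, p["errors"]))
    return lines


if __name__ == "__main__":
    print("\n".join(to_metrics(parse(SAMPLE_OUTPUT))))
```

Nothing here declares what a "pool" looks like; the dict simply holds whatever was convenient to extract, which is the low-friction style the excerpt is praising.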
-
Comments on the New R OOP System, R7
Object-Oriented Programming (OOP) is more than just a programming style; it’s a philosophy. R has offered various forms of OOP, starting with S3, then (among others) S4, reference classes, and R6, and now R7. The latter has been under development by a team broadly drawn from the R community leadership, not only the “directors” of R development, the R Core Team, but also the prominent R services firm RStudio and so on.
I’ll start this report with a summary, followed by details (definition of OOP, my “safety” concerns etc.). The reader need not have an OOP background for this material; an overview will be given here (though I dare say some readers who have this background may learn something too).
This will not be a tutorial on how to use R7, nor an evaluation of its specific features. Instead, I’ll first discuss the goals of the S3 and S4 OOP systems, which R7 replaces, especially in terms of whether OOP is the best way to meet those goals. These comments then apply to R7 as well.
-
New Package yfR
Package yfR recently passed peer review at rOpenSci and is all about downloading stock price data from Yahoo Finance (YF). I wrote this package to solve a particular problem I had as a teacher: I needed a large volume of clean stock price data to use in my classes, either for explaining how financial markets work or for class exercises. While there are several R packages to import raw data from YF, none solved my problem.
Package yfR facilitates the importation of data, organizing it in the tidy format and speeding up the process using a cache system and parallel computing. yfR is a backwards-incompatible substitute for BatchGetSymbols, released in 2016 (see vignette yfR and BatchGetSymbols for details).
-
R Ladies Philly is Making a Difference with its Annual Datathon Focused on Local Issues
Alice Walsh and Karla Fettich of R Ladies Philly talked to the R Consortium about the thriving R Community in Philadelphia. The group has broadened its reach both locally and internationally during the pandemic. However, they have a deep commitment to the local community and remain focused on local issues. Every year, the group partners with local non-profit organizations to host a Datathon to promote learning while contributing to the local community.
-
Announcing Quarto, a new scientific and technical publishing system
Today we’re excited to announce Quarto, a new open-source scientific and technical publishing system. Quarto is the next generation of R Markdown, and has been re-built from the ground up to support more languages and environments, as well as to take what we’ve learned from 10 years of R Markdown and weave it into a more complete, cohesive whole. While Quarto is a “new” system, it’s important to note that it’s highly compatible with what’s come before. Like R Markdown, Quarto is also based on Knitr and Pandoc, and despite the fact that Quarto does some things differently, most existing R Markdown documents can be rendered unmodified with Quarto. Quarto also supports Jupyter as an alternate computational engine to Knitr, and can also render existing Jupyter notebooks unmodified.
-
Which Database Should You Choose for Web Development?
Huge volumes of data are generated every day, and companies store their valuable data in databases. A database is organized information stored in a dedicated system. Processing the data stored in that system is the job of the database management system. By analogy, it’s like an office with a number of files stored in it.