GitHub Copilot and open source laundering


![]()
We have seen an explosion in machine learning in the past decade, alongside an explosion in the popularity of free software. At the same time as FOSS has come to dominate software and found its place in almost all new software products, machine learning has increased dramatically in sophistication, facilitating more natural interactions between humans and computers. However, despite their parallel rise in computing, these two domains remain philosophically distant.
Though some audaciously-named companies might suggest otherwise, the machine learning space has enjoyed almost none of the freedoms forwarded by the free and open source software movement. Much of the actual code related to machine learning is publicly available, and there are many public access research papers available for anyone to read. However, the key to machine learning is access to a high-quality dataset and heaps of computing power to process that data, and these two resources are still kept under lock and key by almost all participants in the space.1
The essential barrier to entry for machine learning projects is overcoming these two problems, which are often very costly to secure. A high-quality, well tagged data set generally requires thousands of hours of labor to produce,2 a task which can potentially cost millions of dollars. Any approach which lowers this figure is thus very desirable, even if the cost is making ethical compromises. With Amazon, it takes the form of gig economy exploitation. With GitHub, it takes the form of disregarding the terms of free software licenses. In the process, they built a tool which facilitates the large-scale laundering of free software into non-free software by their customers, who GitHub offers plausible deniability through an inscrutable algorithm.
-

- Login or register to post comments
Printer-friendly version- 3652 reads
PDF version
More in Tux Machines
- Highlights
- Front Page
- Latest Headlines
- Archive
- Recent comments
- All-Time Popular Stories
- Hot Topics
- New Members
digiKam 7.7.0 is released
After three months of active maintenance and another bug triage, the digiKam team is proud to present version 7.7.0 of its open source digital photo manager. See below the list of most important features coming with this release.
|
Dilution and Misuse of the "Linux" Brand
|
Samsung, Red Hat to Work on Linux Drivers for Future Tech
The metaverse is expected to uproot system design as we know it, and Samsung is one of many hardware vendors re-imagining data center infrastructure in preparation for a parallel 3D world.
Samsung is working on new memory technologies that provide faster bandwidth inside hardware for data to travel between CPUs, storage and other computing resources. The company also announced it was partnering with Red Hat to ensure these technologies have Linux compatibility.
|
today's howtos
|








.svg_.png)
Content (where original) is available under CC-BY-SA, copyrighted by original author/s.

LWN comments
DeVault: GitHub Copilot and open source laundering