Apache: Spark, ASF (Apache Software Foundation) and Apache Kafka vs. Apache Pulsar

  • Apache Spark Turns 10: The Secret Sauce Behind One Of The World’s Most Popular Open Source Projects

    It was the changing nature of big data technology and architectural models that wrote the story for Hadoop. Infrastructure architecture moved towards edge computing, IoT, cloud computing and especially containers, where the market is seeing an increase in Kubernetes workloads. As analytical and machine learning workloads grew, so did the need for a unified analytics platform. That is how Spark outperformed Hadoop on in-memory versus disk-based processing and on real-time streaming alongside batch processing, while also providing a layer for integrating machine learning.

    As Apache Spark turns 10 years old, let’s look at the strong drivers that led to Spark adoption and what keeps it going. Spark was dubbed the “in-memory replacement for MapReduce”, the disk-based computational engine at the heart of early Hadoop clusters. Spark took off because it reflects the shift in processing paradigm towards more memory-intensive pipelines: if your cluster has a decent amount of memory, processing in Spark will be faster, and its API is simpler than MapReduce’s. Spark is also fast because the processing time of most operations (including reads) decreases roughly linearly with the number of machines, since the work is distributed.

  • The Apache Software Foundation Celebrates 20 Years of Community-led Development “The Apache Way”

    World's largest Open Source foundation provides $20B+ worth of software for the public good at 100% no cost...

  • 20 milestones at the Apache Software Foundation

    Not at all a question of parts unknown, more a case of parts where some are better known than others.

    The Apache Software Foundation (ASF)’s Jim Jagielski and Sally Khudairi have detailed 20 major milestones that exist under the auspicious auspices of the ASF today.

    Without detailing every project (and the ASF holds stewardship over 350 community-led projects and initiatives) and repeating the entire story linked above… we’ll tour a handful in celebration of the fact that the ASF passed its 20th Anniversary on 26 March 2019.

    It would be tough not to mention Apache HTTP Server. This is the most popular open source HTTP server on the planet: a secure and extensible server that provides HTTP services in line with the latest HTTP standards.

  • Pub/sub messaging: Apache Kafka vs. Apache Pulsar

    These days, massively scalable pub/sub messaging is virtually synonymous with Apache Kafka. Apache Kafka continues to be the rock-solid, open-source, go-to choice for distributed streaming applications, whether you’re adding something like Apache Storm or Apache Spark for processing or using the processing tools provided by Apache Kafka itself. But Kafka isn’t the only game in town.

    Developed by Yahoo and now an Apache Software Foundation project, Apache Pulsar is going for the crown of messaging that Apache Kafka has worn for many years. Apache Pulsar offers the potential of faster throughput and lower latency than Apache Kafka in many situations, along with a compatible API that allows developers to switch from Kafka to Pulsar with relative ease.
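
For readers new to the pattern, the publish/subscribe model that both Kafka and Pulsar implement at scale can be sketched in a few lines of plain Python. This is only an in-process illustration of the idea: the `Broker` class, its method names and the topic names are hypothetical, and it is not the API of Kafka, Pulsar or any client library. Real brokers add persistence, partitioning and network transport on top.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class Broker:
    """Toy in-memory broker: routes each message to every subscriber of a topic."""

    def __init__(self) -> None:
        # Maps a topic name to the handlers subscribed to it.
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        """Register a handler to be called for every message on the topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Deliver the message to all subscribers; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = Broker()
received_a: List[str] = []
received_b: List[str] = []

# Two independent consumers subscribe to the same (hypothetical) topic.
broker.subscribe("orders", received_a.append)
broker.subscribe("orders", received_b.append)

delivered = broker.publish("orders", "order-1 created")
ignored = broker.publish("payments", "no one is listening")
```

The publisher never refers to its consumers directly; it only names a topic. That decoupling is what lets a system add or remove consumers without touching producers.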

More in Tux Machines

digiKam 7.7.0 is released

After three months of active maintenance and another bug triage, the digiKam team is proud to present version 7.7.0 of its open source digital photo manager. See below for the list of the most important features in this release.

Dilution and Misuse of the "Linux" Brand

Samsung, Red Hat to Work on Linux Drivers for Future Tech

The metaverse is expected to uproot system design as we know it, and Samsung is one of many hardware vendors re-imagining data center infrastructure in preparation for a parallel 3D world. Samsung is working on new memory technologies that provide higher bandwidth inside hardware for data travelling between CPUs, storage and other computing resources. The company also announced it was partnering with Red Hat to ensure these technologies are compatible with Linux.

today's howtos

  • How to install go1.19beta on Ubuntu 22.04 – NextGenTips

    In this tutorial, we are going to explore how to install Go on Ubuntu 22.04. Golang is an open-source programming language that is easy to learn and use. It has built-in concurrency and a robust standard library. It is reliable, builds fast, and produces efficient software that scales well. Its concurrency mechanisms make it easy to write programs that get the most out of multicore and networked machines, while its novel type system enables flexible and modular program construction. Go compiles quickly to machine code and has the convenience of garbage collection and the power of run-time reflection. In this guide, we are going to learn how to install Golang 1.19beta on Ubuntu 22.04. Go 1.19 has not yet had a stable release, and much of the documentation is still a work in progress.

  • molecule test: failed to connect to bus in systemd container - openQA bites

    Ansible Molecule is a project to help you test your Ansible roles. I’m using Molecule to automatically test the Ansible roles of geekoops.

  • How To Install MongoDB on AlmaLinux 9 - idroot

    In this tutorial, we will show you how to install MongoDB on AlmaLinux 9. For those of you who didn’t know, MongoDB is a high-performance, highly scalable document-oriented NoSQL database. Unlike SQL databases, where data is stored in rows and columns inside tables, MongoDB structures data in a JSON-like format inside records referred to as documents. The open-source nature of MongoDB makes it an ideal candidate for almost any database-related project. This article assumes you have at least basic knowledge of Linux, know how to use the shell, and most importantly, host your site on your own VPS. The installation is quite simple and assumes you are running as the root account; if not, you may need to add ‘sudo’ to the commands to get root privileges. I will show you the step-by-step installation of the MongoDB NoSQL database on AlmaLinux 9. You can follow the same instructions for CentOS and Rocky Linux.

  • An introduction (and how-to) to Plugin Loader for the Steam Deck. - Invidious
  • Self-host a Ghost Blog With Traefik

    Ghost is a very popular open-source content management system. It started as an alternative to WordPress and went on to become an alternative to Substack by focusing on memberships and newsletters. The creators of Ghost offer managed Pro hosting, but it may not fit everyone's budget. Alternatively, you can self-host it on your own cloud servers. On Linux Handbook, we already have a guide on deploying Ghost with Docker in a reverse proxy setup. Instead of an Nginx reverse proxy, you can also use another piece of software called Traefik with Docker. It is a popular open-source cloud-native application proxy, API gateway, edge router, and more. I use Traefik to secure my websites using an SSL certificate obtained from Let's Encrypt. Once deployed, Traefik can automatically manage your certificates and their renewals. In this tutorial, I'll share the necessary steps for deploying a Ghost blog with Docker and Traefik.
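
As an aside on the MongoDB item above, the difference between a table row and a JSON-like document can be sketched with nothing but the Python standard library. This is a rough illustration under stated assumptions: the field names and values are made up, and no MongoDB driver or server is involved.

```python
import json

# Relational view: a fixed set of columns, and every row has the same shape.
columns = ("id", "name", "city")
row = (1, "Ada", "London")

# Document view: a self-describing, JSON-like record that can nest
# sub-documents and arrays. All field names here are hypothetical.
document = {
    "_id": 1,
    "name": "Ada",
    "address": {"city": "London"},
    "tags": ["admin", "beta"],
}


def row_to_document(column_names, values):
    """Pair each column name with its value to get a flat document."""
    return dict(zip(column_names, values))


flat = row_to_document(columns, row)

# Documents round-trip through JSON text without losing their nesting.
encoded = json.dumps(document, sort_keys=True)
decoded = json.loads(encoded)
```

A row only makes sense together with its table's column list, whereas a document carries its own field names and can hold nested structure that a relational schema would usually split across several tables.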