news
Red Hat Leftovers
-
Red Hat ☛ What’s new for developers in Red Hat OpenShift 4.19
Red Hat OpenShift 4.19, based on Kubernetes 1.32 and CRI-O 1.32, is now Generally Available (GA).
-
Red Hat Official ☛ Red Hat OpenShift 4.19 accelerates virtualization and enterprise AI innovation
Available in self-managed or fully managed cloud service editions, OpenShift offers a complete set of integrated tools and services for cloud-native, AI, virtual and traditional workloads alike. This article highlights the latest OpenShift 4.19 innovations and key enhancements. For a comprehensive list of updates and improvements, refer to the official release notes.
-
Red Hat ☛ How to run vLLM on CPUs with OpenShift for GPU-free inference
vLLM has rapidly emerged as the de facto inference engine for serving large language models, celebrated for its high throughput, low latency, and efficient use of memory through paged attention. While much of the spotlight has focused on GPU-based deployments, the absence of GPUs shouldn't stop you from experimenting with vLLM or understanding its capabilities.
vLLM and LLMs: A match made in heaven
In this article, I’ll walk you through how to run vLLM entirely on CPUs in a bare OpenShift cluster using nothing but standard Kubernetes and open source tooling. Because I am a performance engineer by craft, we’ll also dive into some fun performance-focused experiments that help explain the current state of the art in LLM inference benchmarking.
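To give a flavor of what CPU-only serving looks like before moving it into OpenShift, here is a minimal sketch using vLLM's offline Python API. It assumes a vLLM build with the CPU backend installed; the model name and sampling settings are illustrative choices, not taken from the article.

```python
# Minimal vLLM offline-inference sketch (illustrative; assumes a CPU-enabled vLLM install).
from vllm import LLM, SamplingParams

# Hypothetical small model picked for CPU experimentation; swap in any model you can access.
llm = LLM(model="facebook/opt-125m")

# Basic sampling settings for a short completion.
sampling = SamplingParams(temperature=0.8, max_tokens=64)

# Run a single prompt and print the generated text.
outputs = llm.generate(["Explain paged attention in one sentence."], sampling)
for out in outputs:
    print(out.outputs[0].text)
```

Wrapped in a container image, the same script (or vLLM's OpenAI-compatible server) can be deployed to a plain OpenShift cluster as an ordinary CPU-only workload, which is the setup the article explores.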