Bridging the synchronization gap on Linux




With older graphics APIs like OpenGL, the client makes a series of API calls, each of which either mutates some bit of state or performs a draw operation. There are a number of techniques that have been used over the years to parallelize the rendering work, but the implementation has to ensure that everything appears to happen in order from the client's perspective. While this served us well for years, it's become harder and harder to keep the GPU fully occupied. Games and other 3D applications have gotten more complex and need multiple CPU cores in order to have enough processing power to reliably render their entire scene in less than 16 milliseconds and achieve a smooth 60 frames per second. GPUs have also gotten larger with more parallelism, and there's only so much a driver can do behind the client's back to parallelize things.
To improve both GPU and CPU utilization, modern APIs like Vulkan take a different approach. Most Vulkan objects such as images are immutable: while the underlying image contents may change, the fundamental properties of the image such as its dimensions, color format, and number of miplevels do not. This is different from OpenGL where the application can change any property of anything at any time. To draw, the client records sequences of rendering commands in command buffers which are submitted to the GPU as a separate step. The command buffers themselves are still stateful, and the recorded commands have the same in-order guarantees as OpenGL. However, the state and ordering guarantees only apply within the command buffer, making it safe to record multiple command buffers simultaneously from different threads. The client only needs to synchronize between threads at the last moment when they submit those command buffers to the GPU. Vulkan also allows the driver to expose multiple hardware work queues of different types which all run in parallel. Getting the most out of a large desktop GPU often requires having 3D rendering, compute, and image/buffer copy (DMA) work happening all at the same time and in parallel with the CPU prep work for the next batch of GPU work.
Enabling all this additional CPU and GPU parallelism comes at a cost: synchronization. One piece of GPU work may depend on other pieces of GPU work, possibly on a different queue. For instance, you may upload a texture on a copy queue and then use that texture on a 3D queue. Because command buffers can be built in parallel and the driver has no idea what the client is actually trying to do, the client has to explicitly provide that dependency information to the driver. In Vulkan, this is done through VkSemaphore objects. If command buffers are the nodes in the dependency graph of work to be done, semaphores are the edges. When a command buffer is submitted to a queue, the client provides two sets of semaphores: a set to wait on before executing the command buffer and a set to signal when the command buffer completes. In our texture upload example, the client would tell the driver to signal a semaphore when the texture upload operation completes and then have it wait on that same semaphore before doing the 3D rendering which uses the texture. This allows the client to take advantage of as much parallelism as it can manage while still having things happen in the correct order as needed.
-

- Login or register to post comments
Printer-friendly version- 2082 reads
PDF version
More in Tux Machines
- Highlights
- Front Page
- Latest Headlines
- Archive
- Recent comments
- All-Time Popular Stories
- Hot Topics
- New Members
digiKam 7.7.0 is released
After three months of active maintenance and another bug triage, the digiKam team is proud to present version 7.7.0 of its open source digital photo manager. See below the list of most important features coming with this release.
|
Dilution and Misuse of the "Linux" Brand
|
Samsung, Red Hat to Work on Linux Drivers for Future Tech
The metaverse is expected to uproot system design as we know it, and Samsung is one of many hardware vendors re-imagining data center infrastructure in preparation for a parallel 3D world.
Samsung is working on new memory technologies that provide faster bandwidth inside hardware for data to travel between CPUs, storage and other computing resources. The company also announced it was partnering with Red Hat to ensure these technologies have Linux compatibility.
|
today's howtos
|








.svg_.png)
Content (where original) is available under CC-BY-SA, copyrighted by original author/s.

Recent comments
43 weeks 21 hours ago
43 weeks 21 hours ago
43 weeks 23 hours ago
43 weeks 1 day ago
43 weeks 1 day ago
43 weeks 1 day ago
43 weeks 1 day ago
43 weeks 2 days ago
43 weeks 2 days ago
43 weeks 2 days ago