Date: 2024-08-16

In Part 1 of this series, I explained what GPU programming is at a high level. Now we can move on to your first GPU algorithm: scan, also known as prefix sum.

If you've ever tried programming a GPU before, you're probably familiar with vector addition: given two large arrays of numbers, you build a new array holding their element-wise sum. This algorithm is the usual way to demonstrate the power of GPU programming, because every output element can be computed independently, which maps directly onto the GPU's massively parallel hardware. However, although this example is useful, it misses the techniques involved in more advanced GPU programming. It treats parallel programming as simply a tool for parallelizing certain workloads, rather than as a language or set of computational tools in its own right.
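To make the vector addition example concrete, here is a minimal CPU-side sketch in plain Python. It is not GPU code: the names `add_kernel` and `vector_add` are hypothetical, chosen only for illustration, and an ordinary loop stands in for what would be a parallel launch on real hardware. The point it tries to show is that each output element depends only on its own index.

```python
def add_kernel(i, a, b, out):
    # Work done by one hypothetical GPU thread: it touches index i only,
    # so it never needs to communicate with any other thread.
    out[i] = a[i] + b[i]


def vector_add(a, b):
    # Assumption for this sketch: both inputs have the same length.
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    out = [0] * len(a)
    # On a GPU these iterations could all run at the same time, one per
    # thread. Here a sequential loop stands in for that launch.
    for i in range(len(a)):
        add_kernel(i, a, b, out)
    return out


result = vector_add([1, 2, 3], [10, 20, 30])
```

Because no iteration reads anything another iteration writes, the loop body could run in any order, or all at once, and produce the same result. That independence is what makes this example easy, and also why it teaches little about how threads cooperate.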