1 hour ago, trojanfoe said:
GPU depth buffer in order to avoid sorting back-to-front on the CPU
In a very basic setting, your sprites will contain both opaque and transparent fragments (e.g. text). The transparent fragments need to be blended correctly with the fragments behind them so that the correct background "leaks" through. With the depth buffer, this takes at least two separate passes for one layer of transparency (and more passes if you want to support multiple transparent layers on top of each other). Alternatively, you can sort the sprites on the CPU and render them in that order, which requires only a single pass.
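To make the CPU alternative concrete, here is a minimal sketch of the sort step. The `Sprite` struct, its fields, and the "larger depth means farther away" convention are assumptions for illustration, not anything from your code base.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical sprite record; the field names are illustrative only.
struct Sprite {
    int   id;
    float depth; // assumed convention: larger value = farther from the camera
};

// Sort back-to-front so that a single pass can blend each sprite over
// everything that was already drawn behind it.
// stable_sort keeps the submission order of sprites with equal depth,
// which avoids flickering between sprites on the same layer.
inline void SortBackToFront(std::vector<Sprite>& sprites) {
    std::stable_sort(sprites.begin(), sprites.end(),
                     [](const Sprite& a, const Sprite& b) {
                         return a.depth > b.depth;
                     });
}
```

After the sort you would draw the sprites in vector order with blending enabled; no depth test is needed for correctness in that case.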
For transparent 3D objects, you can use the depth buffer as well, but CPU sorting is not guaranteed to be possible in all cases. You can have interlocked transparent triangles, for example, which cannot be sorted once for the whole image and would instead have to be sorted per pixel. Sprites, on the other hand, are just stacked on top of each other, so you can always sort once for the whole image instead of per pixel, which makes CPU sorting possible in all cases.
So given the above, I would carefully profile the GPU depth buffer approach for sprites, because I expect CPU sorting to be faster for a common load of sprites. Even with an extreme number of sprites, you can rely on insertion-sort-based algorithms that exploit coherency between frames: the draw order rarely changes much from one frame to the next, so the list is already nearly sorted and insertion sort runs in close to linear time.
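A rough sketch of the frame-coherent variant, assuming you keep the sprite list alive between frames and only update the depths. Again, `Sprite` and the depth convention are made up for the example; the returned move count is just there so you can see how little work a nearly sorted frame costs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sprite record; the field names are illustrative only.
struct Sprite {
    int   id;
    float depth; // assumed convention: larger value = farther from the camera
};

// Insertion sort into back-to-front (descending depth) order.
// Call this every frame on the list kept from the previous frame.
// Returns the number of element shifts, which stays small when
// only a few sprites changed their relative order.
inline std::size_t InsertionSortBackToFront(std::vector<Sprite>& sprites) {
    std::size_t moves = 0;
    for (std::size_t i = 1; i < sprites.size(); ++i) {
        Sprite key = sprites[i];
        std::size_t j = i;
        // Shift nearer sprites one slot back until key's slot is found.
        // The strict '<' leaves equal depths in place, so the sort is stable.
        while (j > 0 && sprites[j - 1].depth < key.depth) {
            sprites[j] = sprites[j - 1];
            --j;
            ++moves;
        }
        sprites[j] = key;
    }
    return moves;
}
```

The worst case is still quadratic (e.g. when the whole order flips in one frame), so this is worth profiling against a general-purpose sort for your actual sprite counts.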