Hello!
I recently added a new Diligent Engine tutorial that demonstrates the usage of compute shaders and may be useful on its own. The example app implements a simple GPU particle system that consists of a number of spherical particles moving in random directions and encountering elastic collisions. The simulation and collision detection is performed on the GPU by compute shaders. To accelerate collision detection, the shader subdivides the screen into bins and for every bin creates a list of particles residing in the bin. The number of bins is the same as the number of particles and the bins are distributed evenly on the screen, thus every bin on average contains one particle. The size of the particle does not exceed the bin size, so a particle should only be tested for collision against particles residing in its own or eight neighboring bins, resulting in O(1) algorithmic complexity.
The full description of the implementation of the method is here.