I'm not sure that any book will teach this. In general, books teach the fundamentals (how to make transformation matrices, how to project data to the screen, and so on) and then you have to figure out the specific details of how to combine those tools to get the end result you want.
There are two ways to make a manipulator have constant size in screen space, btw.
The first is to project the 3D world position to screen space manually, and then draw the manipulator in screen space, using something like a glOrtho() projection. This will look un-warped (always the same) because it's orthographic, but it won't interact perfectly with the depth (Z) buffer. If you're OK with just turning off the depth test for the manipulator, or with inexact depth culling, that's fine.
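A minimal sketch of the "project manually" step, in case it helps. It assumes a combined view-projection matrix in OpenGL's column-major layout and OpenGL-style clip space (NDC in [-1, 1], depth mapped to [0, 1] as with the default depth range). The types `Vec3`, `ScreenPoint`, `Mat4` and the function `projectToScreen` are made up for illustration; they are not from any library.

```cpp
#include <array>

// Hypothetical helper types for this sketch; not from any particular library.
struct Vec3 { float x, y, z; };
struct ScreenPoint { float x, y, depth; };

// 4x4 matrix stored in column-major order (the layout OpenGL uses).
using Mat4 = std::array<float, 16>;

// Projects a world-space point to pixel coordinates (origin at lower left).
// Returns false if the point is on or behind the camera plane (w <= 0),
// in which case 'out' is left untouched.
bool projectToScreen(const Mat4& viewProj, const Vec3& p,
                     float viewportWidth, float viewportHeight,
                     ScreenPoint& out)
{
    const Mat4& m = viewProj;

    // clip = viewProj * (p.x, p.y, p.z, 1)
    const float cx = m[0] * p.x + m[4] * p.y + m[8]  * p.z + m[12];
    const float cy = m[1] * p.x + m[5] * p.y + m[9]  * p.z + m[13];
    const float cz = m[2] * p.x + m[6] * p.y + m[10] * p.z + m[14];
    const float cw = m[3] * p.x + m[7] * p.y + m[11] * p.z + m[15];

    if (cw <= 0.0f) {
        return false;
    }

    // Perspective divide: clip space -> normalized device coordinates.
    const float ndcX = cx / cw;
    const float ndcY = cy / cw;
    const float ndcZ = cz / cw;

    // NDC [-1, 1] -> pixels for x/y, and [0, 1] for depth.
    out.x = (ndcX * 0.5f + 0.5f) * viewportWidth;
    out.y = (ndcY * 0.5f + 0.5f) * viewportHeight;
    out.depth = ndcZ * 0.5f + 0.5f;
    return true;
}
```

With an orthographic projection set up as something like glOrtho(0, width, 0, height, -1, 1), the returned x and y can be used directly as the position to draw the manipulator at.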
The second is to make the manipulator model the right size at 1 unit away from the camera, and then scale it by its Z depth from the camera, because the perspective divide is “divide by Z depth.” This will interact correctly with the depth buffer, but the manipulator may look skewed if it's towards the edges of the screen, especially if your field of view is very large and the on-screen size is large.
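Roughly, the scale computation for the second approach could look like the sketch below. It assumes a symmetric perspective projection with a known vertical field of view, a unit-length camera forward vector, and a manipulator model that is 1 unit long in model space. All the names here (`viewDepth`, `sizeAtUnitDepth`, `manipulatorScale`) are hypothetical.

```cpp
#include <cmath>

// Hypothetical helper type for this sketch.
struct Vec3 { float x, y, z; };

// Depth of worldPos along the camera's viewing direction.
// cameraForward is assumed to be unit length. This is the "Z depth"
// the perspective divide uses, not the straight-line distance.
float viewDepth(const Vec3& cameraPos, const Vec3& cameraForward,
                const Vec3& worldPos)
{
    const float dx = worldPos.x - cameraPos.x;
    const float dy = worldPos.y - cameraPos.y;
    const float dz = worldPos.z - cameraPos.z;
    return dx * cameraForward.x + dy * cameraForward.y + dz * cameraForward.z;
}

// World-space size that covers 'pixels' on screen at depth 1.
// At depth 1 the view is 2 * tan(fovY / 2) units tall, spread over
// viewportHeight pixels.
float sizeAtUnitDepth(float pixels, float fovYRadians, float viewportHeight)
{
    return pixels * 2.0f * std::tan(fovYRadians * 0.5f) / viewportHeight;
}

// Uniform scale to apply to a 1-unit manipulator model so that it stays
// about 'pixels' tall on screen wherever it sits in the scene.
float manipulatorScale(const Vec3& cameraPos, const Vec3& cameraForward,
                       const Vec3& manipulatorPos, float pixels,
                       float fovYRadians, float viewportHeight)
{
    return sizeAtUnitDepth(pixels, fovYRadians, viewportHeight)
         * viewDepth(cameraPos, cameraForward, manipulatorPos);
}
```

The result would go into the manipulator's model matrix as a uniform scale, recomputed each frame as the camera moves.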
Btw, it's probably possible to make the orthographic manipulator interact correctly with depth in a modern shader, by doing fancy stuff to output the fragment Z value (basically, running the projection yourself), but that has a few other problems that I don't recommend you tackle right now…