IMO, if it's slower than picking, that is only because picking does not render to the framebuffer. That saves a bit of fillrate, but since you can render a "mini-frame" to speed up this invisible rendering, fillrate is not really a problem.
Your technique for grouping objects is exactly what the selection buffer is meant to do with the name stack.
In fact, your technique emulates something that OpenGL already does.
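To make the name stack point concrete, here is a rough sketch of how the hit records that OpenGL writes into the selection buffer can be decoded. Each record holds the number of names on the stack at the time of the hit, the minimum and maximum depth, and then the names themselves, which is where the grouping comes from. The function names `parse_hits` and `nearest_hit` are made up for illustration, and the buffer is a plain Python list standing in for whatever `glSelectBuffer` would have filled in.

```python
def parse_hits(buffer, hit_count):
    """Split a selection-style buffer into (z_min, z_max, names) records.

    Assumed layout per record: [name_count, z_min, z_max, name_0, ..., name_n-1].
    """
    hits = []
    i = 0
    for _ in range(hit_count):
        name_count = buffer[i]
        z_min = buffer[i + 1]
        z_max = buffer[i + 2]
        # The names are the contents of the name stack, bottom to top,
        # so names[0] is the outermost group and names[-1] the innermost object.
        names = list(buffer[i + 3 : i + 3 + name_count])
        hits.append((z_min, z_max, names))
        i += 3 + name_count
    return hits


def nearest_hit(hits):
    """Return the record with the smallest minimum depth, or None if no hits."""
    if not hits:
        return None
    return min(hits, key=lambda hit: hit[0])


if __name__ == "__main__":
    # Two fake hits: object 3 inside group 7, and a standalone object 9 in front.
    fake_buffer = [2, 100, 200, 7, 3, 1, 50, 60, 9]
    records = parse_hits(fake_buffer, 2)
    print(records)
    print(nearest_hit(records))
```

With a real context, the hit count would come from the value returned by `glRenderMode(GL_RENDER)` after rendering in `GL_SELECT` mode.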
Anyway, I find the "color-picking" thingy very elegant, since you can (kind of) debug your picking by rendering the flat-colored scene on screen. It is very natural for the human eye to tell objects apart by their colors. I bet that sometimes you swap this flat-colored buffer just to see how awesome it looks!
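For anyone who has not tried color-picking, the core of it is just a reversible mapping between object IDs and flat colors. Below is a minimal sketch, assuming 8 bits per channel, ID 0 reserved for the background, and a plain 2D list of RGB tuples standing in for the pixels you would normally read back with `glReadPixels`. The names `id_to_rgb`, `rgb_to_id` and `pick` are hypothetical, not part of any library.

```python
BACKGROUND_ID = 0  # assumption: ID 0 means "nothing under the cursor"


def id_to_rgb(object_id):
    """Encode a 24-bit object ID as an (r, g, b) tuple of 8-bit channels."""
    if not 0 <= object_id < (1 << 24):
        raise ValueError("object_id must fit in 24 bits")
    return ((object_id >> 16) & 0xFF, (object_id >> 8) & 0xFF, object_id & 0xFF)


def rgb_to_id(r, g, b):
    """Decode an (r, g, b) pixel back into the object ID it encodes."""
    return (r << 16) | (g << 8) | b


def pick(framebuffer, x, y):
    """Return the object ID at pixel (x, y), or None for the background.

    framebuffer is a list of rows, each row a list of (r, g, b) tuples.
    """
    r, g, b = framebuffer[y][x]
    object_id = rgb_to_id(r, g, b)
    return None if object_id == BACKGROUND_ID else object_id


if __name__ == "__main__":
    background = id_to_rgb(BACKGROUND_ID)
    # A tiny 3x2 "mini-frame" with object 5 at (1, 0) and object 70000 at (2, 1).
    frame = [
        [background, id_to_rgb(5), background],
        [background, background, id_to_rgb(70000)],
    ]
    print(pick(frame, 1, 0))
    print(pick(frame, 2, 1))
    print(pick(frame, 0, 0))
```

In a real renderer the colors only come back exact if lighting, blending, dithering, fog, texturing and antialiasing are turned off for the picking pass, otherwise the decoded IDs will be wrong.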