I'd imagine a lot of sound APIs work like this: play a sound at 3D position such-and-such, with the listener at 3D position such-and-such.
They do exactly that, and they mix the sounds according to the speaker configuration they support. In theory it would be very easy for them to output the channels individually, because they already do this internally, but whether they expose it I don't know. If you want to find out whether you can get access to this info, you need to look at the docs of the various sound APIs that might be used in the games you're interested in. You're most likely to have success with something like OpenAL, because it's open source.
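For reference, here's roughly what that "source at 3D position, listener at 3D position" model looks like in OpenAL. This is just a minimal sketch to illustrate the concept; it skips buffer loading and error checking, and the actual mixdown to the speaker layout happens inside the library:

```c
/* Minimal OpenAL sketch: position one source and the listener in 3D.
   The library mixes this down to whatever speaker layout the device reports. */
#include <AL/al.h>
#include <AL/alc.h>

int main(void)
{
    ALCdevice  *device  = alcOpenDevice(NULL);           /* default output device */
    ALCcontext *context = alcCreateContext(device, NULL);
    alcMakeContextCurrent(context);

    ALuint source;
    alGenSources(1, &source);

    /* "play sound at 3D position X, listener at 3D position Y" */
    alSource3f(source, AL_POSITION, 10.0f, 0.0f, -2.0f);
    alListener3f(AL_POSITION, 0.0f, 0.0f, 0.0f);

    /* ... attach a buffer here and call alSourcePlay(source) ... */

    alcMakeContextCurrent(NULL);
    alcDestroyContext(context);
    alcCloseDevice(device);
    return 0;
}
```

The per-source positions are all there before mixing, which is exactly the information you'd want to get at.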
You may also be able to make a compatibility/shim layer that intercepts calls to the sound API and does your own stuff with them (see the sketch below). But if the game doesn't use a shared library for sound, then all bets are off.
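As a sketch of that interception idea, assuming a Linux game that links OpenAL dynamically, you could use an `LD_PRELOAD` shim to hook the position calls and forward them to the real library:

```c
/* alshim.c — intercept alSource3f and log source positions, then forward.
   Assumes the game loads OpenAL as a shared library on Linux. */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <AL/al.h>

typedef void (*alSource3f_t)(ALuint, ALenum, ALfloat, ALfloat, ALfloat);

void alSource3f(ALuint source, ALenum param, ALfloat v1, ALfloat v2, ALfloat v3)
{
    static alSource3f_t real = NULL;
    if (!real)
        real = (alSource3f_t)dlsym(RTLD_NEXT, "alSource3f");

    if (param == AL_POSITION)
        fprintf(stderr, "source %u positioned at (%f, %f, %f)\n",
                source, v1, v2, v3);

    real(source, param, v1, v2, v3);   /* hand off to the real OpenAL */
}
```

Build it with something like `gcc -shared -fPIC alshim.c -o alshim.so -ldl` and launch the game with `LD_PRELOAD=./alshim.so`. You'd want to hook the other position/listener entry points too; this is just the idea, not a complete solution.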
You could do it as a proof of concept with an open source game; that's the route most likely to succeed. There are a number of hurdles to overcome, and the fact that you're asking the question suggests you may not yet have the technical chops for all of them.
Overall, I'd question how 'innovative' this whole thing really is. That's not to say it isn't worth exploring, but it isn't in any way novel. The whole media-sound business is built around exactly what you describe; they've just figured out it works better to have a few speakers and use the balance between them to position the sound (since we only have two ears). Placing a bunch of speakers around an area or surface instead was probably the first thing they tried, and it has a number of disadvantages, as I'm sure you're aware.