Some of this will duplicate things that've already been said.
Out of curiosity, are you using a perspective or orthographic projection? I'll assume orthographic (which seems most appropriate if it's purely 2D), but you mentioned a camera position with a non-zero z value, so I wasn't sure.
The first thing I'd do is answer these two questions:
- What do you want your units to be (e.g. meters)?
- What's the minimum distance in these units that you want to be visible in the x and y directions at all times (aside from zooming or whatever)?
Barring possible practical limitations, both of these are entirely up to you. For example, for a game with human-sized characters, you might want the units to be meters, and be able to see at least a 10-meter-by-10-meter square at all times. For a city-building game, you might want the units to be kilometers, and be able to see at least a 20-km-by-20-km square at all times. Of course the code doesn't care what the units are - that's just conceptual. The minimum required visible size is what's important here.
Your desired minimum game size gives you your desired aspect ratio (width / height). Obviously the window aspect ratio won't necessarily be the same as the desired game aspect ratio, so you'll probably have extra space to fill along one of the axes. For example, your minimum desired size might be 10x10, but you might have to fill 15x10 (which is basically an art and design problem).
So we can start to collect some data:
- Desired game size (e.g. 10x10).
- Desired game aspect ratio (desired game width / desired game height).
- Window size (e.g. 1024x768).
- Window aspect ratio (window width / window height).
The next thing is to compute the actual game size. It should have the same aspect ratio as the window, while ensuring that at minimum, the desired game size is visible. This post is getting long, so I'll skip over the details, but computing this reduces to some simple arithmetic.
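For what it's worth, here's a rough sketch of that arithmetic in Python. The function name and parameters are made up for illustration. The idea is to compare the two aspect ratios and stretch whichever axis has room to spare, so the desired size is always fully visible:

```python
def fit_game_size(desired_w, desired_h, window_w, window_h):
    """Return (actual_w, actual_h) in game units.

    The result has the same aspect ratio as the window and is at least
    as large as the desired game size along both axes.
    (Hypothetical helper; names are not from any particular engine.)
    """
    desired_aspect = desired_w / desired_h
    window_aspect = window_w / window_h

    if window_aspect >= desired_aspect:
        # Window is relatively wider than the desired area:
        # keep the desired height and extend the width.
        return desired_h * window_aspect, desired_h
    # Window is relatively taller than the desired area:
    # keep the desired width and extend the height.
    return desired_w, desired_w / window_aspect


# A 10x10 desired area in a 1500x1000 window becomes 15x10.
print(fit_game_size(10, 10, 1500, 1000))
```

You'd typically re-run this whenever the window is resized.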
Finally, you render everything in your desired units. Assuming orthographic, set up an orthographic projection with width and height equal to the actual game width and height. The view transform can be identity, or it can have a translation if the game scrolls, etc.
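To make that a bit more concrete, here's a minimal sketch of the matrices involved, written in plain Python so it doesn't depend on any particular graphics API. It assumes OpenGL-style conventions (clip space running from -1 to 1 on each axis, column vectors) and a projection centered on the origin; the helper names are hypothetical, and in practice your math library or graphics API will probably provide equivalents:

```python
def ortho_matrix(width, height, near=-1.0, far=1.0):
    """Orthographic projection centered on the origin (row-major 4x4).

    Maps x in [-width/2, width/2] and y in [-height/2, height/2] to [-1, 1].
    Assumes OpenGL-style clip space.
    """
    return [
        [2.0 / width, 0.0, 0.0, 0.0],
        [0.0, 2.0 / height, 0.0, 0.0],
        [0.0, 0.0, -2.0 / (far - near), -(far + near) / (far - near)],
        [0.0, 0.0, 0.0, 1.0],
    ]


def translation_matrix(tx, ty):
    """2D translation as a 4x4 matrix; used here for the view transform."""
    return [
        [1.0, 0.0, 0.0, tx],
        [0.0, 1.0, 0.0, ty],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def mat_mul(a, b):
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform_point(m, x, y, z=0.0):
    """Apply a 4x4 matrix to the point (x, y, z, 1) and return (x, y, z)."""
    v = (x, y, z, 1.0)
    out = [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]
    return out[0], out[1], out[2]


# Actual game size of 20x10 units, camera scrolled to (3, 4).
proj = ortho_matrix(20.0, 10.0)
view = translation_matrix(-3.0, -4.0)  # the view moves the world opposite to the camera
view_proj = mat_mul(proj, view)

# The camera position lands in the middle of the screen,
# and the point 10 units right / 5 units up lands in the top-right corner.
print(transform_point(view_proj, 3.0, 4.0))
print(transform_point(view_proj, 13.0, 9.0))
```

With an identity view transform you'd just use `proj` on its own, and the world origin would sit at the center of the window.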
There's also the matter of choosing appropriate art sizes, but that's a separate issue.
Personally I think this topic can be a little confusing. I doubt what I described above is sufficient to put together an implementation from, but maybe along with the other information provided in this thread it'll give you some ideas.