Hello all,
I am currently working on a game engine for my own game development, and I would like it to be as flexible as possible. As such, the exact requirements for how things should work can't be nailed down to a specific implementation, so for now I am looking for a good default design for the average case.
Here is what I have implemented:
- Deferred rendering using OpenGL
- Arbitrary number of lights and shadow mapping
- Each rendered object, as defined by a set of geometry, textures, animation data, and a model matrix, is rendered with its own draw call
- Skeletal animation implemented on the GPU
- Model matrix transformation implemented on the GPU
- Frustum and octree culling for optimization
Here are my questions and concerns:
- Doing the skeletal animation on the GPU currently requires skinning each object multiple times per frame: once for the initial geometry pass, and once more for the shadow map pass of each light for which the object is not culled. This seems very inefficient. Is there a way to do the skeletal animation on the GPU only once and reuse the result across these render calls?
- Without doing the model matrix transformation on the CPU, I fail to see how I can easily batch objects with the same textures and shaders into a single draw call without passing a ton of matrix data to the GPU (an array of model matrices, plus an index per vertex into that array for transformation purposes?).
- If I do the matrix transformations on the CPU, it seems I can't really do the skinning on the GPU, as the pre-transformed vertices will wreak havoc with the calculations. So this does not seem viable unless I am missing something.
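
To make the batching question concrete, here is a minimal C++ sketch of the layout I am describing: one shared array of model matrices plus an index stored on every vertex. All names (`Vertex`, `Mesh`, `Batch`, `buildBatch`, `worldPosition`) are hypothetical and only for illustration, and `worldPosition` just mimics on the CPU what I assume the vertex shader would do with the index.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Column-major 4x4 matrix, matching OpenGL's convention.
using Mat4 = std::array<float, 16>;

struct Vertex {
    float position[3];
    float uv[2];
    std::uint32_t modelIndex; // index into Batch::modelMatrices
};

// Hypothetical per-object input: geometry plus its model matrix.
struct Mesh {
    std::vector<Vertex> vertices; // modelIndex is overwritten when batching
    Mat4 model;
};

// Everything one draw call would need: merged vertices and the matrix array.
struct Batch {
    std::vector<Vertex> vertices;
    std::vector<Mat4> modelMatrices;
};

// Merge meshes that share textures and shaders into one batch.
Batch buildBatch(const std::vector<Mesh>& meshes) {
    Batch batch;
    for (const Mesh& mesh : meshes) {
        const auto index = static_cast<std::uint32_t>(batch.modelMatrices.size());
        batch.modelMatrices.push_back(mesh.model);
        for (Vertex v : mesh.vertices) { // copy, then tag with the matrix index
            v.modelIndex = index;
            batch.vertices.push_back(v);
        }
    }
    return batch;
}

// CPU stand-in for the per-vertex work the vertex shader would do.
std::array<float, 3> worldPosition(const Batch& batch, std::size_t vertex) {
    const Vertex& v = batch.vertices[vertex];
    assert(v.modelIndex < batch.modelMatrices.size());
    const Mat4& m = batch.modelMatrices[v.modelIndex];
    const float* p = v.position;
    return {m[0] * p[0] + m[4] * p[1] + m[8] * p[2] + m[12],
            m[1] * p[0] + m[5] * p[1] + m[9] * p[2] + m[13],
            m[2] * p[0] + m[6] * p[1] + m[10] * p[2] + m[14]};
}
```

The cost I am worried about is visible here: every batched object adds 16 floats of matrix data, and every vertex grows by one index.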
Overall, it seems like the simplest solution is to just do all of the vertex manipulation on the CPU and pass the pre-transformed data to the GPU, using vertex shaders that do basically nothing. This doesn't seem like the most efficient use of the graphics hardware, but it could potentially reduce the number of draw calls needed.
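
For reference, the all-CPU path would look roughly like the sketch below: standard linear blend skinning in model space, followed by the model matrix, done once per vertex per frame. The names (`SkinnedVertex`, `skinAndTransform`, `mulPoint`) are made up for illustration, and I am assuming each bone palette matrix already includes the inverse bind pose and that the weights sum to 1.

```cpp
#include <array>
#include <cstddef>
#include <vector>

using Matrix4 = std::array<float, 16>; // column-major, as in OpenGL

struct Vec3 { float x, y, z; };

struct SkinnedVertex {
    Vec3 bindPosition;                // position in the bind pose (model space)
    std::array<std::size_t, 4> bones; // indices into the bone palette
    std::array<float, 4> weights;     // assumed to sum to 1
};

// Transform a point (w = 1) by a column-major 4x4 matrix.
Vec3 mulPoint(const Matrix4& m, Vec3 p) {
    return {m[0] * p.x + m[4] * p.y + m[8] * p.z + m[12],
            m[1] * p.x + m[5] * p.y + m[9] * p.z + m[13],
            m[2] * p.x + m[6] * p.y + m[10] * p.z + m[14]};
}

// Skin first (in model space), then apply the model matrix.
// Assumption: bonePalette[i] already contains the inverse bind pose.
std::vector<Vec3> skinAndTransform(const std::vector<SkinnedVertex>& vertices,
                                   const std::vector<Matrix4>& bonePalette,
                                   const Matrix4& model) {
    std::vector<Vec3> out;
    out.reserve(vertices.size());
    for (const SkinnedVertex& v : vertices) {
        Vec3 skinned{0.0f, 0.0f, 0.0f};
        for (std::size_t i = 0; i < 4; ++i) {
            if (v.weights[i] == 0.0f) continue; // unused influence slot
            const Vec3 p = mulPoint(bonePalette[v.bones[i]], v.bindPosition);
            skinned.x += v.weights[i] * p.x;
            skinned.y += v.weights[i] * p.y;
            skinned.z += v.weights[i] * p.z;
        }
        out.push_back(mulPoint(model, skinned));
    }
    return out;
}
```

The output of `skinAndTransform` is what would be uploaded each frame, which is where my concern about CPU time comes from.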
Really, I am looking for some advice on how to proceed with this, and on how something like this is typically handled. Are the multiple draw calls and skinning calculations not a huge deal? I would LIKE to save as much of the CPU's time per frame as possible so it can be tasked with other things, so that CPU resources stay available to whatever is implemented on top of the engine. However, that becomes a moot point if the GPU becomes a bottleneck.