
glTF Skeleton + Keyframes to Left-handed Coordinate System

Started May 10, 2024 02:08 AM
8 comments, last by Aressera 6 months, 3 weeks ago

I have been adding support for exporting skeleton and animation data from glTF (2.0) to my custom engine. glTF utilizes a right-handed coordinate system (+y up, +z in, +x left), whereas my engine utilizes a left-handed coordinate system (+y up, +z in, +x right).

For mesh import, this is a simple conversion (flip x, reverse winding order). Skeleton and animation (keyframe) data, however, are a bit more complicated. glTF provides all of these transforms in bone-local space (both skeleton nodes and keyframe data). If it were just a matter of the skeleton, I could compose the model-space transforms, flip the x translation, zero out the rotations, and then re-calculate the bone-local transforms; that's simple enough and produces correct results. Of course, it yields a zero-rotation bind pose, which is not really en vogue these days for various reasons. The keyframe animation data is another matter, though. Obviously, the zero-rotation bind pose is no longer in the same reference space as the animation keyframes. I can't really wrap my head around how to transform the keyframes into the new zero-rotation, LHCS base-pose reference frame, or, preferably, how to do the LHCS conversion on the skeleton while maintaining rotations, and then apply the same conversion to the animation keyframes.
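For reference, the mesh-side conversion mentioned above can be sketched like this (a minimal illustration, assuming a hypothetical data layout of positions/normals as (x, y, z) tuples and triangles as index triples; function and variable names are mine, not from any particular engine):

```python
def mesh_rh_to_lh(positions, normals, triangles):
    # Mirror across the YZ plane: negate the x component of each vector.
    flipped_pos = [(-x, y, z) for (x, y, z) in positions]
    flipped_nrm = [(-x, y, z) for (x, y, z) in normals]
    # The reflection inverts winding, so swap two indices per triangle
    # to restore the original front-facing order.
    flipped_tris = [(a, c, b) for (a, b, c) in triangles]
    return flipped_pos, flipped_nrm, flipped_tris
```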

I feel like I might be over-thinking this, but I'm having trouble seeing how to make it work without doing a laborious full local->model->convert->local transformation for every keyframe pose. It gets especially hard to reason about when you consider that we can't guarantee a full set of bone transforms for every keyframe, or that keyframes even line up across the bone set (they could be at different/unaligned times).

Anyone have thoughts/experience with this particular problem?

This is probably not what you want to hear, but have you considered switching your engine to use right-handed coordinates? Left-handed coordinate systems are a source of endless confusion because they operate contrary to reality (all of math/physics is right-handed by convention, e.g. the "right-hand rule" for the cross product, electromagnetism, etc.). glTF uses a right-handed system for a reason. Left-handed systems are only convenient in rendering because depth increases with Z; that's the only benefit, and it's not that useful. To make left-handed rendering work you have to flip the model data to make it display correctly, which, as you've experienced, can be a big headache with non-trivial transformations. It's like you are rendering your scene through a mirror. To me at least it seems completely backwards with no tangible benefit. In a right-handed system, everything "just works", without any fussing with flipping models or transforms.

I would try implementing all animation using the existing coordinates (no conversion at all, as if you were animating in right-handed space), then apply a 4x4 right-to-left transformation to the skinned vertices in the shader as part of your local-to-world model transform (compose the matrices on the CPU before submitting to the shader). This should give a similar result to what you currently do for static models; for animated models, however, you would not modify the vertices at load time, since that change happens in the vertex shader instead. To keep things consistent, you could do the right-to-left transform in the shader for all model types, though that may cause problems for physics geometry. The key thing to realize is that the animation math is the same for both handednesses; you just have to flip the result right before display if you're left-handed. It's only the camera's viewing matrix that is left-handed.
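The matrix composition suggested above might look something like this (a sketch only, assuming column-vector convention; the plain-Python 4x4 helpers stand in for whatever math library the engine uses, and the names are hypothetical):

```python
def mat4_mul(a, b):
    # Naive 4x4 matrix product (a * b), rows-of-lists representation.
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# Reflection across the YZ plane: negates x of any transformed point.
FLIP_X = [[-1, 0, 0, 0],
          [ 0, 1, 0, 0],
          [ 0, 0, 1, 0],
          [ 0, 0, 0, 1]]

def make_final_model_matrix(local_to_world):
    # Animate entirely in right-handed space, then fold the right-to-left
    # reflection into the model matrix on the CPU before uploading
    # (column-vector convention: v' = FLIP_X * local_to_world * v).
    return mat4_mul(FLIP_X, local_to_world)
```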


I would certainly agree that a fully right-handed engine is the way to go these days. Unfortunately, the left-handed convention is pretty firmly baked in on my end of things (blame DirectX for driving THAT into my brain), as are a fair bit of existing assets, etc. So converting everything to right-handed is not really a feasible option.

That said, your comment and ideas about doing the conversion at runtime/presentation time got me thinking. I think a decent compromise is simply having a flag on the skeleton that indicates it's right-handed. Then, when composing the final pose into model space, I do an additional pass to transform those model-space transforms into LHCS. I think that's potentially as simple as just flipping the X translation, but I need to think some more about whether there's an additional rotation I need to apply. This certainly gives a proper (translational) bind pose skeleton, but I won't know if the rotations need fiddling with until I get animations coming through and can see the mesh/skinning result with rotational deltas.

This has the benefit of keeping everything else in the consistent LHCS that is expected, and removes the need for any shader trickery/complications.

Now that I have animation data coming through, I was able to take a few whacks at doing this model-space pose conversion. Simply doing nothing (RHCS model, skeleton, and animation) gives exactly what you'd expect: everything appears and animates correctly, it's just mirrored on the X axis relative to what you'd see in the modeling software (or the glTF viewer, for instance). Using a LHCS model with a RHCS skeleton and animation, if we only flip the translation of the final pose transforms, the bones are positionally correct, but the rotations are not, which means the mesh vertices skinned to bones rotate in the opposite direction about the "up"/Y axis. I attempted to introduce a reversed rotation by deriving the model-space Y-axis rotation, then adding -(2.0 * angle) of Y-axis rotation to effectively flip it, but that did not produce a correct result. Clearly my math and/or my reasoning about those model-space transform corrections is off somehow.

Did you try what I suggested, to use RHCS for everything (including not modifying the vertices on import), then just modify the final model matrix (applied after animation skinning) by applying X axis reflection?

@Aressera I'm sure the shader approach is a perfectly reasonable way to address this strictly from a rendering perspective. Even then, though, it does have some other wrinkles like triangle winding order that need to be taken into account and potentially tweaked (and normals, etc.).

The other issues relate to the skeleton and pose as they interact with other game systems, such as bone attachments. In those cases we need the final pose to be reflective (pun intended) of our LHCS, which is why I'm focusing on solving this as early in the pipeline as possible. Ideally, I can figure out how to process the skeleton + animations so that the loaded data is LHCS: no runtime overhead, and everything is in the correct CS. The model-space pose adjustment at least has the benefit of giving anything that needs to query/use bone transforms the proper LHCS data, albeit with a little extra cost when generating the final model-space pose.

There's probably a workable solution all the way at the authoring side of things (flip things in Maya/Blender/whatever before export), but I'd love to be able to ingest “pure” glTF as an intermediary format.


I managed to get the model-space pose LHCS conversion working properly.

Once I considered the problem from a transform-matrix perspective (with lots of hand and three-finger-axes visualizing), it became clear that the answer is actually quite simple. Recall that for an orthonormal (row-major) transformation matrix, the first three rows are the basis vectors of the rotation; simply flipping the x component of each basis vector, along with the x translation, gives the correct result. For my row-major matrices, that means flipping the first component of each row. After this operation I also invert the entire X-axis row (row 0) to make it a proper "left-handed" rotation for visualization purposes, but this is actually unnecessary, as the end rotational result is the same.
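A minimal sketch of the operation just described, assuming row-major 4x4 matrices stored as lists of rows (rows 0-2 are the rotation basis vectors, row 3 holds the translation); the layout and function name are my assumptions, not the poster's actual code:

```python
def pose_rh_to_lh(m):
    # Negate the x component of each basis row (rows 0-2) and of the
    # translation (row 3), i.e. flip the first component of every row.
    out = [row[:] for row in m]
    for r in range(4):
        out[r][0] = -out[r][0]
    return out
```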

I need to have a think about how scale may factor into this scheme, but I believe it handles that case as well.

So, thankfully, this turned out to be a reasonably fast and uncomplicated operation. Obviously, specific to my particular coordinate system requirements, but hopefully useful for anyone else grappling with a similar problem.

I'm still pursuing a load-time conversion for the skeleton and animation data, but for the time being this gives me the correct results with minimal overhead.

Aressera said:
Left handed coordinates systems are a source of endless confusion because they operate contrary to reality (all of math/physics disciplines are right-handed in convention, e.g. “right-hand rule” for cross product, electromagnetism, etc)

Never, ever, try to say that at some interviews. Black and violent storms is all what you'll get.

_Silence_ said:
Never, ever, try to say that at some interviews. Black and violent storms is all what you'll get.

Luckily I'm the one giving interviews.

This topic is closed to new replies.
