You're right, and I agree: my matrix code works. To give some more context, though, I'll explain how this idea came to be.
My skeleton file format stores transforms as SRT, as I explained, and my animation file format stores a "channel" per vector element, so animations are applied to the SRT vectors as well. This works great, BUT at runtime I need to generate three arrays: a temp array of matrices (that is, an array of arrays) for the skeleton's bind pose, a temp array of matrices for the animated joints, which gets multiplied by the bind pose array, and a final array for the result of that multiplication.
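To make the runtime step concrete, here is a minimal sketch of that three-array pipeline. Everything in it is an assumption for illustration rather than my actual code: the names (`srt_to_matrix`, `mat_mul`, `bind_srts`, `anim_srts`), the rotation stored as a unit quaternion `(w, x, y, z)`, the column-vector convention `M = T * R * S`, the `bind * anim` multiplication order, and the made-up joint values.

```python
# Hypothetical sketch; names, conventions and values are assumptions.
# An SRT is (scale, rotation, translation), rotation a unit quaternion (w, x, y, z).

def srt_to_matrix(scale, rot, trans):
    """Build a row-major 4x4 matrix M = T * R * S (column-vector convention)."""
    sx, sy, sz = scale
    w, x, y, z = rot
    tx, ty, tz = trans
    # 3x3 rotation matrix from a unit quaternion.
    r = [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]
    # Scaling on the right multiplies each column of R by its scale factor.
    return [
        [r[0][0] * sx, r[0][1] * sy, r[0][2] * sz, tx],
        [r[1][0] * sx, r[1][1] * sy, r[1][2] * sz, ty],
        [r[2][0] * sx, r[2][1] * sy, r[2][2] * sz, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]

def mat_mul(a, b):
    """Plain 4x4 matrix product a * b."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

IDENTITY_ROT = (1.0, 0.0, 0.0, 0.0)

# Two joints with made-up values.
bind_srts = [
    ((1.0, 1.0, 1.0), IDENTITY_ROT, (0.0, 1.0, 0.0)),
    ((1.0, 1.0, 1.0), IDENTITY_ROT, (0.0, 2.0, 0.0)),
]
anim_srts = [
    ((2.0, 2.0, 2.0), IDENTITY_ROT, (1.0, 0.0, 0.0)),
    ((1.0, 1.0, 1.0), IDENTITY_ROT, (0.0, 0.0, 3.0)),
]

bind_matrices = [srt_to_matrix(*s) for s in bind_srts]  # built once, cached
anim_matrices = [srt_to_matrix(*s) for s in anim_srts]  # rebuilt every frame
final_matrices = [mat_mul(b, a) for b, a in zip(bind_matrices, anim_matrices)]
```

The point is only the memory layout: the SRT data exists once in the file formats and then again as `bind_matrices` and `anim_matrices`, plus the `final_matrices` output.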
I cache the bind pose array since it never changes, but keeping duplicate data in memory kind of bothers me, and having a temp array with the SRTs converted to matrices provokes the same nagging annoyance.
So I thought that if I could do the operations on the SRTs directly, I could do without the auxiliary arrays: no duplication of data in memory, and at the same time I'd avoid a runtime pre-computation (I am trying to do all preprocessing offline so my file formats are ready to use as soon as they're loaded into memory and some pointers are set) and some matrix multiplications here and there.
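As a rough illustration of what "operating on the SRTs directly" could look like, the sketch below applies an SRT to a point and composes two SRTs without ever building a matrix. The function names and the quaternion `(w, x, y, z)` rotation are assumptions, not my real code. One known limit is baked in: the composition is only exact when the parent's scale is uniform, because a non-uniform scale followed by another rotation can produce shear, which a single SRT cannot represent.

```python
import math

# Hypothetical sketch; an SRT is (scale, rotation, translation),
# rotation a unit quaternion (w, x, y, z).

def quat_mul(a, b):
    """Hamilton product a * b."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def quat_rotate(q, v):
    """Rotate vector v by unit quaternion q: q * (0, v) * conj(q)."""
    w, x, y, z = q
    p = quat_mul(quat_mul(q, (0.0, v[0], v[1], v[2])), (w, -x, -y, -z))
    return (p[1], p[2], p[3])

def srt_apply(srt, p):
    """Transform point p: scale first, then rotate, then translate."""
    scale, rot, trans = srt
    scaled = (p[0] * scale[0], p[1] * scale[1], p[2] * scale[2])
    r = quat_rotate(rot, scaled)
    return (r[0] + trans[0], r[1] + trans[1], r[2] + trans[2])

def srt_compose(parent, child):
    """SRT equivalent to applying child, then parent.

    Exact only when the parent's scale is uniform (sx == sy == sz).
    """
    p_scale, p_rot, _ = parent
    c_scale, c_rot, c_trans = child
    if not (p_scale[0] == p_scale[1] == p_scale[2]):
        raise ValueError("parent scale must be uniform to stay a pure SRT")
    s = p_scale[0]
    scale = (s * c_scale[0], s * c_scale[1], s * c_scale[2])
    rot = quat_mul(p_rot, c_rot)
    trans = srt_apply(parent, c_trans)  # parent moves the child's origin
    return (scale, rot, trans)

# Made-up example: parent scales by 2 and rotates 90 degrees about Z.
h = math.sqrt(0.5)
parent = ((2.0, 2.0, 2.0), (h, 0.0, 0.0, h), (1.0, 0.0, 0.0))
child = ((1.0, 2.0, 3.0), (1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
point = (1.0, 2.0, 3.0)

nested = srt_apply(parent, srt_apply(child, point))
combined = srt_apply(srt_compose(parent, child), point)
```

With these values `nested` and `combined` agree up to floating-point rounding, so for uniformly scaled parents the bind and animated SRTs could in principle be combined without the intermediate matrix arrays. Whether that is actually cheaper than the matrix path is something I'd have to measure.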
To me, it is an interesting topic. I have not dumped my matrix code yet, and I won't, but I'd like to know what I could get away with.