So just to make things clear, there are two ways to do matrix math on paper. One is putting the basis vectors in the rows, treating vectors as rows, and doing left-to-right multiplication: \(\begin{bmatrix} Vx & Vy & Vz & 1 \end{bmatrix} \cdot \begin{bmatrix} Xx & Xy & Xz & 0\\ Yx & Yy & Yz & 0\\ Zx & Zy & Zz & 0\\ Px & Py & Pz & 1 \end{bmatrix}\)
Or putting basis vectors in the columns, treating the vectors as columns, and doing right-to-left multiplication: \(\begin{bmatrix} Xx & Yx & Zx & Px\\ Xy & Yy & Zy & Py\\ Xz & Yz & Zz & Pz\\ 0 & 0 & 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} Vx \\ Vy \\ Vz \\ 1 \end{bmatrix} \)
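To make that concrete, here's a minimal C++ sketch showing that the two on-paper conventions give the same transformed point, as long as the matrix is transposed to match. The `Vec4`/`Mat4` structs, the function names, and the example translation by (10, 20, 30) are all made up for illustration; they are not GLM or HLSL types.

```cpp
struct Vec4 { float x, y, z, w; };
struct Mat4 { float m[4][4]; };  // m[row][col], exactly as written on paper

// Row-vector convention: v * M, basis vectors in the rows of M.
constexpr Vec4 mulRowVector(const Vec4& v, const Mat4& M) {
    return { v.x * M.m[0][0] + v.y * M.m[1][0] + v.z * M.m[2][0] + v.w * M.m[3][0],
             v.x * M.m[0][1] + v.y * M.m[1][1] + v.z * M.m[2][1] + v.w * M.m[3][1],
             v.x * M.m[0][2] + v.y * M.m[1][2] + v.z * M.m[2][2] + v.w * M.m[3][2],
             v.x * M.m[0][3] + v.y * M.m[1][3] + v.z * M.m[2][3] + v.w * M.m[3][3] };
}

// Column-vector convention: M * v, basis vectors in the columns of M.
constexpr Vec4 mulColumnVector(const Mat4& M, const Vec4& v) {
    return { M.m[0][0] * v.x + M.m[0][1] * v.y + M.m[0][2] * v.z + M.m[0][3] * v.w,
             M.m[1][0] * v.x + M.m[1][1] * v.y + M.m[1][2] * v.z + M.m[1][3] * v.w,
             M.m[2][0] * v.x + M.m[2][1] * v.y + M.m[2][2] * v.z + M.m[2][3] * v.w,
             M.m[3][0] * v.x + M.m[3][1] * v.y + M.m[3][2] * v.z + M.m[3][3] * v.w };
}

// Swap rows and columns: turns one on-paper layout into the other.
constexpr Mat4 transpose(const Mat4& M) {
    Mat4 t{};
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            t.m[r][c] = M.m[c][r];
    return t;
}

// Hypothetical example: identity basis, translation P = (10, 20, 30),
// written in the row-vector layout (P in the last row).
constexpr Mat4 kRowForm{{ {1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}, {10, 20, 30, 1} }};
constexpr Vec4 kPoint{1, 2, 3, 1};

// Both conventions move (1, 2, 3) to (11, 22, 33).
static_assert(mulRowVector(kPoint, kRowForm).x == 11.0f, "row-vector form");
static_assert(mulColumnVector(transpose(kRowForm), kPoint).x == 11.0f, "column-vector form");
```

Nothing here says anything about memory layout yet; `m[row][col]` is just the on-paper grid.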
Then a completely separate issue is how you decide to store 2D arrays in linear memory. Row-major: \(\begin{bmatrix} 0&1&2&3\\ 4&5&6&7\\ 8&9&10&11\\ 12&13&14&15 \end{bmatrix}\)
Or column-major: \(\begin{bmatrix} 0&4&8&12\\ 1&5&9&13\\ 2&6&10&14\\ 3&7&11&15 \end{bmatrix}\)
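The numbers in the two grids above are just offsets into a flat 16-element array. As a small sketch (the function names are my own, not from any library), the two storage conventions differ only in how a (row, column) pair maps to an offset:

```cpp
#include <cstddef>

// Offset of element (row, col) of a 4x4 matrix stored in a flat array of 16 values.
constexpr std::size_t rowMajorIndex(std::size_t row, std::size_t col) {
    return row * 4 + col;  // each row is contiguous in memory
}

constexpr std::size_t colMajorIndex(std::size_t row, std::size_t col) {
    return col * 4 + row;  // each column is contiguous in memory
}

// Matches the grids: at row 1, column 2 the row-major grid shows 6
// and the column-major grid shows 9.
static_assert(rowMajorIndex(1, 2) == 6, "row-major offset");
static_assert(colMajorIndex(1, 2) == 9, "column-major offset");
```

Note that `rowMajorIndex(r, c) == colMajorIndex(c, r)`, which is why reading data with the wrong storage convention looks exactly like a transpose.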
That results in four different conventions for doing matrix math in a computer (row-major/column-major array indexing, and row-vector/column-vector math).
GLM uses column-vector math and column-major array indexing.
In the HLSL code that you posted, the math is written assuming row-vector math (left-to-right multiplication order), which is the opposite of the convention GLM uses. Your HLSL code also expects the data in column-major array order (HLSL's default).
If, on the CPU side before the shader runs, you rearrange your data from column-major to row-major array ordering, then HLSL is going to misinterpret your data: it will read rows as columns and columns as rows. That has the same effect as a mathematical transpose, which cancels out the fact that you're using the opposite mathematical convention.
In other words, the mathematical convention in your vertex shader is the opposite of GLM's, but because you're also using the opposite array storage convention, the two cancel each other out. Two wrongs make a right, so it works.
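Here's a rough C++ simulation of that cancellation. It assumes the data starts out GLM-style (column-vector math, column-major storage) and that the shader side always reads the flat array as column-major. All of the names and the example translation by (10, 20, 30) are hypothetical; this only models the arithmetic, not real GLM or HLSL calls.

```cpp
struct Float4 { float v[4]; };
struct Flat16 { float f[16]; };  // 16 floats as they sit in memory

// Element (row, col) when the flat array is interpreted as column-major.
constexpr float atColMajor(const Flat16& a, int row, int col) {
    return a.f[col * 4 + row];
}

// Column-vector math, M * x (right-to-left order).
constexpr Float4 matTimesVec(const Flat16& a, const Float4& x) {
    Float4 out{};
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            out.v[r] += atColMajor(a, r, c) * x.v[c];
    return out;
}

// Row-vector math, x * M (left-to-right order).
constexpr Float4 vecTimesMat(const Float4& x, const Flat16& a) {
    Float4 out{};
    for (int c = 0; c < 4; ++c)
        for (int r = 0; r < 4; ++r)
            out.v[c] += x.v[r] * atColMajor(a, r, c);
    return out;
}

// CPU-side rearrangement from column-major to row-major ordering.
constexpr Flat16 rearrangeToRowMajor(const Flat16& a) {
    Flat16 out{};
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            out.f[r * 4 + c] = a.f[c * 4 + r];
    return out;
}

// Translation by (10, 20, 30): columns are contiguous, P is the last column.
constexpr Flat16 kGlmData{{1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  10, 20, 30, 1}};
constexpr Float4 kPoint{{1, 2, 3, 1}};

// Matching conventions: (1, 2, 3, 1) -> (11, 22, 33, 1).
static_assert(matTimesVec(kGlmData, kPoint).v[0] == 11.0f, "M * x is correct");
// Opposite math convention on untouched data: the translation lands in w instead.
static_assert(vecTimesMat(kPoint, kGlmData).v[3] == 141.0f, "x * M is wrong");
// Opposite math AND rearranged storage: the two mistakes cancel.
static_assert(vecTimesMat(kPoint, rearrangeToRowMajor(kGlmData)).v[0] == 11.0f, "cancels");
```

The middle case gives (1, 2, 3, 141), i.e. the point doesn't move and w is garbage, which is the kind of result you'd see with mismatched conventions and no transposing.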
You should be able to get rid of all your transposing and just rewrite the VS to use right-to-left multiplication order, e.g. position = mul(worldMatrix, position);