I have a function which works correctly with asCALL_CDECL_OBJFIRST. self is the correct pointer and is passed via rcx, as specified in the x64 calling convention:
static void ScriptTextureTest(OverlayTexture* self)
Now I also have this function:
static glm::vec2 ScriptTextureGetSize(OverlayTexture* self)
This one is broken - "self" points to the value passed in rdx rather than the one in rcx. MSVC seems to compile this function as though rcx holds a pointer to caller-allocated space for the 8-byte return value (the 2 floats in glm::vec2), so self is read from rdx and ends up being the wrong pointer.
Microsoft's calling convention documentation mentions the following:
struct Struct1 {
int j, k, l; // Struct1 exceeds 64 bits.
};
Struct1 func3(int a, double b, int c, float d);
// Caller allocates memory for Struct1 returned and passes pointer in RCX,
// a in RDX, b in XMM2, c in R9, d pushed on the stack;
// callee returns pointer to Struct1 result in RAX.
Specifically, see the first comment. This seems to indicate that the caller has to make room for the return value (on the stack, presumably) and pass a pointer to it in rcx, but only if the size exceeds 64 bits.
Personally, I haven't seen this calling convention before, and it is strange that the compiler is even doing this, considering sizeof(glm::vec2) is 8, so it should fit in 64 bits just fine. Maybe this is also a bug in MSVC? Not sure.
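One thing that may be relevant: the same Microsoft documentation also says that a user-defined type is only returned by value in RAX if, besides being 1, 2, 4, 8, 16, 32, or 64 bits long, it has no user-defined constructor, destructor, or copy assignment operator, no private or protected non-static data members, no base classes, and no virtual functions. Otherwise the caller passes a hidden pointer to the return slot as the first argument. The sketch below is only a rough approximation of that rule using standard type traits (it is not the compiler's actual check, and it needs C++17 for std::is_aggregate). PlainVec2, CtorVec2 and LooksRegisterReturnable are hypothetical names made up for illustration, not part of glm or AngelScript.

```cpp
#include <type_traits>

// Hypothetical stand-in for a simple hand-written vec2: a plain aggregate.
struct PlainVec2 {
    float x, y;
};

// Hypothetical stand-in for a glm::vec2-style type: same memory layout,
// but with user-provided constructors.
struct CtorVec2 {
    float x, y;
    CtorVec2() : x(0.0f), y(0.0f) {}
    CtorVec2(float x_, float y_) : x(x_), y(y_) {}
};

// Both types are 8 bytes, so size alone cannot explain a difference.
static_assert(sizeof(PlainVec2) == 8, "two floats, no padding expected");
static_assert(sizeof(CtorVec2) == 8, "two floats, no padding expected");

// Rough approximation of the documented "return in RAX" conditions.
// Assumption: std::is_aggregate plus trivial copy/destroy is close enough
// for illustration; it does not exclude public base classes, for example.
template <typename T>
constexpr bool LooksRegisterReturnable() {
    return std::is_aggregate<T>::value
        && std::is_trivially_copyable<T>::value
        && std::is_trivially_destructible<T>::value
        && (sizeof(T) == 1 || sizeof(T) == 2 || sizeof(T) == 4 || sizeof(T) == 8);
}

// The plain struct passes the approximation; the one with constructors does not.
static_assert(LooksRegisterReturnable<PlainVec2>(), "plain aggregate");
static_assert(!LooksRegisterReturnable<CtorVec2>(), "has user-provided constructors");
```

Adding a similar static_assert for glm::vec2 in the actual project would be a quick way to see which side of the rule it falls on in a given build. I have not verified the result for any particular glm version.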
Here's the disassembly of the GetSize function:
static glm::vec2 ScriptTextureGetSize(OverlayTexture* self)
{
00007FFA10DA5F80 mov qword ptr [rsp+10h],rdx
00007FFA10DA5F85 mov qword ptr [rsp+8],rcx
00007FFA10DA5F8A sub rsp,28h
00007FFA10DA5F8E lea rcx,[00007FFA110A5B4Ah]
00007FFA10DA5F95 call 00007FFA10EE8BA4
return glm::vec2((float)self->m_width, (float)self->m_height);
00007FFA10DA5F9A mov rax,qword ptr [rsp+38h]
00007FFA10DA5F9F cvtsi2ss xmm0,dword ptr [rax+0Ch]
00007FFA10DA5FA4 mov rax,qword ptr [rsp+38h]
00007FFA10DA5FA9 cvtsi2ss xmm1,dword ptr [rax+8]
00007FFA10DA5FAE movaps xmm2,xmm0
00007FFA10DA5FB1 mov rcx,qword ptr [rsp+30h]
00007FFA10DA5FB6 call 00007FFA10C18180
00007FFA10DA5FBB mov rax,qword ptr [rsp+30h]
}
00007FFA10DA5FC0 add rsp,28h
00007FFA10DA5FC4 ret
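Note the [rsp+38h], which is 8 bytes past [rsp+30h], where the first parameter is stored. Before the sub rsp,28h these are the [rsp+10h] and [rsp+8] slots that rdx and rcx were saved to, so self is being read from the rdx slot and the rcx slot is treated as the pointer to the return value. Compare the disassembly of the correct test function, where self is read from [rsp+30h]: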
static void ScriptTextureTest(OverlayTexture* self)
{
00007FFA10DA5FD0 mov qword ptr [rsp+8],rcx
00007FFA10DA5FD5 sub rsp,28h
00007FFA10DA5FD9 lea rcx,[00007FFA110A5B4Ah]
00007FFA10DA5FE0 call 00007FFA10EE8BA4
WriteLog(log_Overlay, "Test: %d", self->m_width);
00007FFA10DA5FE5 mov rax,qword ptr [rsp+30h]
00007FFA10DA5FEA mov r8d,dword ptr [rax+8]
00007FFA10DA5FEE lea rdx,[00007FFA10F815C8h]
00007FFA10DA5FF5 mov ecx,5
00007FFA10DA5FFA call 00007FFA10C169B0
return;
}
00007FFA10DA5FFF add rsp,28h
00007FFA10DA6003 ret
This behavior seems to have started after I switched from my own simple vec2 structure to the more advanced (but memory-compatible) glm::vec2. It worked fine before with my simpler vec2 structure.
It's all really strange to me. Open to suggestions of things to try.