There was a topic about a problem similar to the one we are experiencing: http://www.gamedev.net/topic/607167-ogre-vector3-and-amd64/
What is different in our case is that our vector type is pure C, so writing explicit constructors/copy constructors doesn't work for us.
The problem is returning a simple struct type by value on Linux x64 (GCC 4.6.1): the vector values come back scrambled.
For example, returning a vector with values (1,2,3) gives (3,0,1).
The vector struct looks like this:
typedef float vec_t;
typedef vec_t vec3_t[3];
typedef struct asvec3_s
{
    vec3_t v;
} asvec3_t;
We register the type with flags (asOBJ_VALUE | asOBJ_POD | asOBJ_APP_CLASS).
What options do we have to fix this?
64-bit Linux will probably return this type in the registers RAX:RDX. However, it looks like the values might be switched, so the higher elements end up where the lower ones should be. The 0 is probably just a default value, as the type only has 3 floats and not 4.
This might be a bug in as_callfunc.cpp (lines 513-514), or it may be a specific situation for this type that is not handled yet. I'll try to investigate this and see if I can figure it out.
In the meantime, you should be able to use the autowrappers to work around this problem.
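The idea behind a wrapper can be sketched in plain C. This is not the autowrapper add-on's actual API; `vec3_make` and `vec3_make_wrapped` are hypothetical names used only for illustration. The point is that the wrapped version hands the result back through a pointer, so whoever calls it never has to pick a struct return value out of registers:

```c
#include <assert.h>
#include <stddef.h>

typedef float vec_t;
typedef vec_t vec3_t[3];
typedef struct asvec3_s
{
    vec3_t v;
} asvec3_t;

/* Hypothetical application function that returns the struct by value.
   How the value travels back (registers or memory) is decided by the ABI. */
asvec3_t vec3_make(float x, float y, float z)
{
    asvec3_t r = { { x, y, z } };
    return r;
}

/* Hypothetical hand-written wrapper. The by-value call happens between two
   functions built by the same compiler, and the result is then stored
   through a pointer, so no register-return convention is exposed. */
void vec3_make_wrapped(float x, float y, float z, asvec3_t *out)
{
    assert(out != NULL);
    *out = vec3_make(x, y, z);
}
```

A wrapper like this costs one extra copy per call, which is usually acceptable as a stopgap until the native calling convention is handled.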
Small update: I found something that temporarily fixes our problem. Any of the following methods produces the vector (1,2,3) correctly:
struct vector {
    union {
        float xyz[3];
        double _as_64bit_hack[3];
    };
};

struct vector {
    float xyz[3];
    float pad[2]; // <- has to be at least 2 elements
};

struct vector {
    double xyz[3]; // works as double, see first method
};
I tried different alignments with the GCC aligned attribute, but none of them worked like these do, so I'm not sure whether this is an alignment issue, a frame size issue, or something else.
Naturally this isn't a 100% satisfactory workaround, and I hope that you find a "correct" solution for this.
All of these make the structure larger than 128 bits, which makes gcc return the type in memory rather than in registers.
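To put numbers on that, here is a small sketch that checks the sizes of the three workarounds against the 16-byte (128-bit) limit. It assumes 4-byte floats and 8-byte doubles, as on x86-64 Linux, and it only looks at size; the struct and function names are made up for the example, and the union is given a member name so the code compiles as plain C.

```c
#include <assert.h>
#include <stddef.h>

struct vec_plain  { float xyz[3]; };                              /* 12 bytes */
struct vec_pad1   { float xyz[3]; float pad[1]; };                /* 16 bytes */
struct vec_pad2   { float xyz[3]; float pad[2]; };                /* 20 bytes */
struct vec_union  { union { float xyz[3]; double hack[3]; } u; }; /* 24 bytes */
struct vec_double { double xyz[3]; };                             /* 24 bytes */

/* Size test only: anything above two eightbytes (16 bytes) cannot be
   returned in a register pair and goes through memory instead. */
int exceeds_two_eightbytes(size_t size_in_bytes)
{
    return size_in_bytes > 16;
}

void check_workaround_sizes(void)
{
    /* The original layout and a single pad element still fit in 16 bytes. */
    assert(!exceeds_two_eightbytes(sizeof(struct vec_plain)));
    assert(!exceeds_two_eightbytes(sizeof(struct vec_pad1)));
    /* Each of the three workarounds crosses the limit. */
    assert(exceeds_two_eightbytes(sizeof(struct vec_pad2)));
    assert(exceeds_two_eightbytes(sizeof(struct vec_union)));
    assert(exceeds_two_eightbytes(sizeof(struct vec_double)));
}
```

This also matches the observation that the padding has to be at least 2 elements: one extra float gives exactly 16 bytes, which still fits in registers.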
What stumps me is that I already have a test for validating that a class similar to this works properly on linux 64bit, and it is working as it should.
Unfortunately I haven't had the time to investigate this problem in detail yet, but hopefully before the end of the week I will at least be able to understand the cause.
I made some tests with asvec3_t, and I get the same result as you do, i.e. {3,0,1} where {1,2,3} is expected.
For some reason that I have yet to determine, gcc is treating the following class:
class Class3
{
    asDWORD a;
    asDWORD b;
    asDWORD c;
};
differently from the asvec3_t type, even though both are the same size and both contain only primitives. Both seem to be returned in the RAX and RDX registers; however, the order of the registers is swapped for asvec3_t versus Class3.
Apparently I need a flag other than asOBJ_APP_CLASS to identify that the asvec3_t type should be returned in a swapped order; however, I do not know how the application developer would know when to use one or the other.
If you have any idea what the rule might be for when gcc does it one way or the other, I would really like to hear it.
[quote]
class Class3
{
    asDWORD a;
    asDWORD b;
    asDWORD c;
};
[/quote]
Going by that, at least this lets us keep the vector at 12 bytes: union { float v[3]; int i[3]; };
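Spelled out as a complete type, the union variant might look like the sketch below. The type name `asvec3_u`, the member name `u`, and the helper functions are assumptions added for illustration; the union is named only so the code compiles as plain C.

```c
#include <assert.h>
#include <stddef.h>

/* Same 12-byte payload as asvec3_t, but with an int view overlaid on the
   floats so the struct is no longer made up of floats only. */
typedef struct asvec3_u
{
    union { float v[3]; int i[3]; } u;
} asvec3_u;

/* Hypothetical function returning the union-based vector by value. */
asvec3_u vec3u_make(float x, float y, float z)
{
    asvec3_u r;
    r.u.v[0] = x;
    r.u.v[1] = y;
    r.u.v[2] = z;
    return r;
}

/* Size check, assuming 4-byte float and int as on x86-64 Linux. */
size_t vec3u_size(void)
{
    assert(sizeof(asvec3_u) == 3 * sizeof(float));
    return sizeof(asvec3_u);
}
```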
[quote]
Apparently I need a flag other than asOBJ_APP_CLASS to identify that the asvec3_t type should be returned in a swapped order; however, I do not know how the application developer would know when to use one or the other.
[/quote]
Yes, this is a bad idea. Even Vicious doesn't approve.
[quote]
If you have any idea what the rule might be for when gcc does it one way or the other, I would really like to hear it.
[/quote]
I'm afraid I'm not that well equipped to investigate the bug at the gcc/asm level.
I wonder whether the issues in this thread are related: http://www.gamedev.n...urning-doubles/
Especially those "bigger changes to the code that implements the native calling conventions in version 2.20.2"?
Our version is 2.21.0, by the way.
No, this problem is different from the one described in the other thread. Floats and doubles are returned in the XMM register, and for some reason the value gets lost when compiling with optimizations. Probably gcc doesn't see that the XMM register is used and ends up removing the instructions that set it.
I'm trying to find some documentation that explains why gcc behaves this way for this type so I can add proper support for it.
After reading the documentation at http://www.x86-64.or...ntation/abi.pdf I think this structure might actually be returned in XMM0 and not RAX:RDX, because it only contains float values. The fact that we get {3,0,1} might be a coincidence.
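My reading of the classification rule in that ABI document can be sketched as follows. This is a heavily simplified version that only handles structs made of 4-byte float and int fields; the type and function names are invented for the example. Each 8-byte chunk ("eightbyte") of the struct gets a class: INTEGER if any field in it is an integer, SSE if it holds only floats, and anything over 16 bytes goes to memory.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { CLS_NONE, CLS_SSE, CLS_INTEGER, CLS_MEMORY } arg_class;
typedef enum { F_FLOAT, F_INT } field_kind;      /* every field is 4 bytes */

typedef struct { size_t offset; field_kind kind; } field;

/* Simplified classification: out[0] and out[1] receive the class of the
   first and second eightbyte of the struct. */
void classify(const field *fields, size_t count, size_t size, arg_class out[2])
{
    size_t n;

    out[0] = out[1] = CLS_NONE;
    if (size > 16) {                 /* too large for two registers */
        out[0] = out[1] = CLS_MEMORY;
        return;
    }
    for (n = 0; n < count; ++n) {
        size_t eb = fields[n].offset / 8;     /* which eightbyte */
        arg_class c = (fields[n].kind == F_INT) ? CLS_INTEGER : CLS_SSE;

        assert(eb < 2);
        /* Merge rule: INTEGER wins over SSE within the same eightbyte. */
        if (out[eb] == CLS_NONE || c == CLS_INTEGER)
            out[eb] = c;
    }
}
```

Under this sketch asvec3_t comes out as SSE, SSE (as far as I read the ABI, XMM0 for the first two floats and XMM1 for the third), while Class3 comes out as INTEGER, INTEGER (RAX and RDX). A union that overlays ints on the floats also classifies as INTEGER, which would explain why it behaves like Class3.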
Would it be possible for you to compile the following function:
asvec3_t vec3_123()
{
    asvec3_t v = {1,2,3};
    return v;
}
into assembler, so I can see how the return value is loaded into the registers?
You should be able to do this by compiling with 'gcc -S test.c'. It will generate the assembly file test.s instead of an object file.
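For convenience, a complete test.c for that command could look like the sketch below, with the typedefs from the first post included so the file compiles on its own. The `vec3_123_self_check` helper is my addition and is not needed for the assembly listing; it is only there so the function can be sanity-checked when the file is linked into a program.

```c
/* test.c */
#include <assert.h>

typedef float vec_t;
typedef vec_t vec3_t[3];
typedef struct asvec3_s
{
    vec3_t v;
} asvec3_t;

/* The function whose return sequence we want to inspect in test.s. */
asvec3_t vec3_123(void)
{
    asvec3_t v = { { 1, 2, 3 } };
    return v;
}

/* Optional helper (not part of the original request): verifies that a
   normal C caller sees the expected values. */
void vec3_123_self_check(void)
{
    asvec3_t r = vec3_123();
    assert(r.v[0] == 1.0f && r.v[1] == 2.0f && r.v[2] == 3.0f);
}
```

In the generated test.s, the interesting part is the end of vec3_123: whether the values are moved into %xmm registers or into %rax/%rdx just before the ret.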
Yes, it seems to use the SSE registers for the return value. I found a GCC function attribute, sseregparm, but unfortunately I can't find a complementary attribute to it.
Here is a nice document that lists the ABIs of various compilers (including MSVC and GCC) for various platforms (including 16-bit, 32-bit and 64-bit PC): http://www.agner.org/optimize/calling_conventions.pdf (page 16)