How much overhead does vector normalisation cost?
Hi, sometimes vectors need to be normalised (i.e. converted to unit length), sometimes they don't, and it is very tempting to normalise them all by default, for example by including a normalisation routine within the vector class definition itself. I decided to measure exactly what overhead this would cost, and was surprised at the result. In my test, a normalised vector costs about 30x more processing time to define than a non-normalised vector. In other words: only normalise vectors if you really need to.
Here is some testing code to demonstrate that. Launch it in a DOS prompt window and choose 1'000'000 vectors. I get 13 milliseconds without normalisation and 380 milliseconds with normalisation.
Let me know what you think of the test method. Is it valid or not?
Edited by - Keermalec on September 23, 2001 12:43:35 PM
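The test code itself is not reproduced in this copy of the thread. As a rough idea of what such a test can look like (not Keermalec's actual code), here is a minimal sketch. The names `Vec3`, `make_raw`, `make_normalised` and `time_ms` are made up for illustration, and timings will vary a lot with compiler, optimisation flags and hardware.

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };  // hypothetical minimal vector type

// Build a vector without touching its length.
Vec3 make_raw(float x, float y, float z) { return Vec3{x, y, z}; }

// Build a vector and scale it to unit length (assumes non-zero length).
Vec3 make_normalised(float x, float y, float z) {
    float inv = 1.0f / std::sqrt(x * x + y * y + z * z);
    return Vec3{x * inv, y * inv, z * inv};
}

// Time the construction of `count` vectors, in milliseconds.
// The results are stored and summed into *checksum so the compiler
// cannot simply drop the loop as dead code.
double time_ms(std::size_t count, bool normalise, float* checksum) {
    std::vector<Vec3> out(count);
    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < count; ++i) {
        float f = static_cast<float>(i % 1000 + 1);
        out[i] = normalise ? make_normalised(f, f + 1.0f, f + 2.0f)
                           : make_raw(f, f + 1.0f, f + 2.0f);
    }
    auto stop = std::chrono::steady_clock::now();
    float sum = 0.0f;
    for (const Vec3& v : out) sum += v.x;
    *checksum = sum;
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `time_ms(1000000, false, &c)` and then `time_ms(1000000, true, &c)` and printing both results gives the two figures to compare. Running each case several times and ignoring the first run makes the numbers more trustworthy.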
Write &lt; and &gt; to get < and > respectively.
Since normalisation of an arbitrary matrix requires the use of sqrt, it is bound to be slow (sqrt is always a relatively slow operation on today's hardware).
If you know that your matrices are orthogonal then you can normalize them in a much cheaper way, utilizing the fact that:
A^-1 = A^T if A is orthogonal.
(A^-1 is the inverse of A and A^T is the transpose of A.)
Just remember that if your matrices are 4x4, only the upper 3x3 corner is the orthogonal 'vector-space base'. The position must be treated separately.
[EDIT: The "Edit" button is in the upper right corner of every post]
[EDIT2: Messed up the > and < advice]
Edited by - Dactylos on September 21, 2001 4:59:51 PM
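To make the transpose identity concrete, here is a minimal sketch of inverting a rotation-plus-position transform without a general matrix inverse. The `Rigid` struct and `inverse` function are hypothetical names, and the sketch assumes the column-vector convention p' = R * p + t with R orthonormal. The 3x3 part is transposed and the position is handled separately as -R^T * t.

```cpp
#include <cstddef>

// Hypothetical rigid transform: p' = R * p + t (column-vector convention).
struct Rigid {
    float r[3][3];  // rotation, indexed r[row][col], assumed orthonormal
    float t[3];     // translation (the "position" part of a 4x4 matrix)
};

// Invert a rigid transform using R^-1 = R^T. No sqrt or division needed.
Rigid inverse(const Rigid& m) {
    Rigid inv{};
    // The inverse rotation is just the transpose.
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 3; ++j)
            inv.r[i][j] = m.r[j][i];
    // The inverse translation is -R^T * t.
    for (std::size_t i = 0; i < 3; ++i)
        inv.t[i] = -(inv.r[i][0] * m.t[0] +
                     inv.r[i][1] * m.t[1] +
                     inv.r[i][2] * m.t[2]);
    return inv;
}
```

This only holds while the 3x3 part really is orthonormal. If it picks up scale or drifts through accumulated rounding, the transpose is no longer the inverse.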
Keermalec: try using some more optimized methods...
// Fast normalization of 3 component vector. Does not test if the vector has 0 length
__inline void FastNormVect3(float *v) {
    float ilength;
    ilength = RSqrt(FastDotProduct(v, v));
    v[0] *= ilength;
    v[1] *= ilength;
    v[2] *= ilength;
}
// Fast 15 cycle asm dot product, by http://talika.fie.us.es/~titan/ (Titan engine)
__forceinline float __cdecl FastDotProduct(const float v1[3], const float v2[3]) {
    FLOAT dotret;
    __asm {
        mov ecx, v1
        mov eax, v2
        // optimized dot product (15 cycles)
        fld dword ptr [eax+0]     // starts & ends on cycle 0
        fmul dword ptr [ecx+0]    // starts on cycle 1
        fld dword ptr [eax+4]     // starts & ends on cycle 2
        fmul dword ptr [ecx+4]    // starts on cycle 3
        fld dword ptr [eax+8]     // starts & ends on cycle 4
        fmul dword ptr [ecx+8]    // starts on cycle 5
        fxch st(1)                // no cost
        faddp st(2),st(0)         // starts on cycle 6, stalls for cycles 7-8
        faddp st(1),st(0)         // starts on cycle 9, stalls for cycles 10-12
        fstp dword ptr [dotret]   // starts on cycle 13, ends on cycle 14
    }
    return dotret;
}
RSqrt is an asm-optimized 1/Sqrt function,
but your vector has to be defined something like this:
class Vector3 {
public:                 // class members are private by default
    union {             // the union is a nice trick: access the components
        struct {        // as vector.x, vector.y, vector.z, or pass vector.v
            GLfloat x;  // wherever a pointer to a float v[3] array is expected
            GLfloat y;
            GLfloat z;
        };
        GLfloat v[3];
    };
    // add functions here...
};
And one more thing: never use something like
a=....
x/=a;
y/=a;
z/=a;
Instead, do one division and three multiplications:
a=....
a=1/a;
x*=a;
y*=a;
z*=a;
Edited by - _DarkWIng_ on September 22, 2001 2:39:17 PM
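Putting the reciprocal trick together with a plain C++ dot product gives something like the sketch below. It is a portable stand-in for the asm version above, with a zero-length check added. The names `Vec3`, `dot` and `normalise` and the threshold value are assumptions made for the example.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };  // hypothetical minimal vector type

inline float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Scale v to unit length using one division and three multiplications.
// Returns false and leaves v untouched if it is too short to normalise.
inline bool normalise(Vec3& v) {
    float len2 = dot(v, v);
    if (len2 < 1e-20f) return false;    // arbitrary "practically zero" threshold
    float inv = 1.0f / std::sqrt(len2); // the only division
    v.x *= inv;
    v.y *= inv;
    v.z *= inv;
    return true;
}
```

Note that x*(1/a) is not always bit-identical to x/a in floating point, which is why compilers do not make this substitution on their own unless told that relaxed math is acceptable.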
You should never let your fears become the boundaries of your dreams.
Wow, thanks for the */ trick, Darkwing. That actually speeds up my
vector normalisations from about 380 to 250 milliseconds for
1'000'000 operations. The FastDotProduct stuff is way over my
head at the moment but I'll come back to it in due time. What
sort of time gain do you think it makes?
BTW, I also tested the classic dot product function. If I
normalise two vectors and THEN do a dot product, it takes much
longer than if I do a dot product of two unnormalised vectors and
then divide the result by the two vectors' lengths, even if I
calculate those two lengths at that moment.
I now store a bool parameter in my vector class whose sole
function is to indicate whether the cvector is normalised or
not. The dotproduct function therefore looks like this:
Edited by - Keermalec on September 23, 2001 12:50:16 PM
Edited by - Keermalec on September 23, 2001 12:51:39 PM
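Keermalec's function is not preserved in this copy of the thread. A minimal sketch of the idea as described, with made-up names (`Vec3`, `unit`, `cos_angle`, `scale`) and not the poster's actual code, could look like this. It assumes both vectors are non-zero.

```cpp
#include <cmath>

struct Vec3 {
    float x, y, z;
    bool unit;  // true only when the vector is known to have length 1
};

inline float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Cosine of the angle between a and b (both assumed non-zero).
inline float cos_angle(const Vec3& a, const Vec3& b) {
    float d = dot(a, b);
    if (a.unit && b.unit) return d;  // already unit length: nothing to divide by
    // One sqrt over the product of squared lengths, instead of
    // normalising both vectors first (two sqrts plus six multiplies).
    return d / std::sqrt(dot(a, a) * dot(b, b));
}

// Any function that changes the components has to clear the flag.
inline void scale(Vec3& v, float s) {
    v.x *= s;
    v.y *= s;
    v.z *= s;
    v.unit = false;
}
```

The `scale` helper shows the maintenance cost of the flag: every mutating function has to remember to reset it, otherwise `cos_angle` silently returns wrong results.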
Using a boolean "unit" gives some speed-up in this function, but remember: you'll have to add unit=false to just about every other function in the vector class (in my case that's a lot of functions), so in a real-world example you don't gain anything. In my testing the most expensive operation is 1/Sqrt, so try optimizing that.
Another thing: you could use asm (+) MMX functions to speed up multiplication (multiplying x, y, z at the same time) to get x1*x2, y1*y2, z1*z2 in just two cycles. [Try looking at Gamasutra's optimization articles.]
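One caveat: MMX itself only operates on integers, so for float vectors the equivalent instructions come from SSE (or 3DNow! on AMD chips of that era). Here is a minimal sketch of the "multiply all components at once" idea using SSE intrinsics instead of hand-written asm. It is x86-only, the `Vec4` layout with a padding component and the function names are assumptions, and no cycle counts are claimed.

```cpp
#include <xmmintrin.h>  // SSE intrinsics (x86 / x86-64 only)

// Hypothetical 4-float vector: v[0..2] are x, y, z and v[3] is padding.
struct Vec4 { float v[4]; };

// Multiply all components pairwise with a single SSE multiply.
inline Vec4 mul_components(const Vec4& a, const Vec4& b) {
    __m128 va = _mm_loadu_ps(a.v);              // unaligned load of 4 floats
    __m128 vb = _mm_loadu_ps(b.v);
    Vec4 out;
    _mm_storeu_ps(out.v, _mm_mul_ps(va, vb));   // x1*x2, y1*y2, z1*z2, w1*w2
    return out;
}

// Dot product of the x, y, z parts, built on the packed multiply.
inline float dot3(const Vec4& a, const Vec4& b) {
    Vec4 p = mul_components(a, b);
    return p.v[0] + p.v[1] + p.v[2];
}
```

For a single dot product the load/store overhead can eat most of the gain. SIMD tends to pay off when many vectors are processed in a batch.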
You should never let your fears become the boundaries of your dreams.
I agree with the statement "Only normalize if you really have to", but there are many cases in which normalization is mandatory:
-Movement Vectors (so everything moves at the same speed)
-Normal Vectors (lighting is MUCH faster)
-Physics detection is harder without vector normalization.
It's not much, but there isn't much else you can use vectors for in 3D.
Open mouth, insert foot
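As an example of the movement case, here is a minimal sketch of moving an object towards a target at a fixed speed. The direction has to be scaled by 1/length, otherwise far-away targets would make the object move faster. `Vec3` and `step_towards` are hypothetical names.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };  // hypothetical minimal vector type

// Move `pos` towards `target` by at most speed * dt units.
inline Vec3 step_towards(Vec3 pos, Vec3 target, float speed, float dt) {
    Vec3 d{target.x - pos.x, target.y - pos.y, target.z - pos.z};
    float len = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    float step = speed * dt;
    if (len <= step) return target;  // close enough: avoid overshoot and divide-by-zero
    float s = step / len;            // normalisation and speed folded into one factor
    return Vec3{pos.x + d.x * s, pos.y + d.y * s, pos.z + d.z * s};
}
```

The step covers the same distance whether the target is 10 or 100 units away, which is exactly what the normalisation buys here.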