How much overhead does vector normalisation cost?

Started by Keermalec September 21, 2001 03:42 PM

4 comments, last by Keermalec 23 years, 5 months ago

Author

122

September 21, 2001 03:42 PM

Hi, sometimes vectors need to be normalised (i.e. converted to unit length) and sometimes they don't, and it is very tempting to normalise them all by default, for example by calling a normalisation routine from the vector class constructor itself. I decided to measure exactly what overhead this would cost, and was surprised at the result: a normalised vector costs about 30x more processing time to define than a non-normalised vector. In other words, only normalise vectors if you really need to. Here is some test code to demonstrate that. Launch it in a DOS prompt window and choose 1'000'000 vectors. I get 13 milliseconds without normalisation and 380 milliseconds with normalisation. Let me know what you think of the test method. Is it valid or not?

#include <stdlib.h>			// for rand() and srand() functions
#include <windows.h>			// Standard Header For MSWindows Applications
#include <gl/glut.h>			// The GL Utility Toolkit (GLUT) Header
#include <math.h>			// Math functions such as sqrt(), sin(), cos()
#include <iostream>			// for cin and cout
#include <stdio.h>			// for FILE, fopen(), sprintf(), fwrite(), fclose() functions and objects
#include <mmsystem.h>			// for timeGetTime

using std::cin;
using std::cout;


struct vector
{
	float dx;
	float dy;
	float dz;
	void Normalised()				// declaration and definition of normalising method
	{
		float hypothenuse = sqrt((dx*dx) + (dy*dy) + (dz*dz));
		if (hypothenuse == 0.0f) hypothenuse = 1.0f;	// avoid dividing by zero for a null vector
		dx /= hypothenuse;
		dy /= hypothenuse;
		dz /= hypothenuse;
	}
	vector (float dX, float dY, float dZ)	// declaration and definition of constructor
	{
		dx = dX;
		dy = dY;
		dz = dZ;
//		Normalised();				// normalising increases calculation time by 18x (10^8 vectors in 18 seconds instead of 1)
	}
	vector () {}					// declaration and definition of default constructor
	~vector () {}					// declaration and definition of destructor
};

int main()
{
	long int a, b;
	cout << "It usually takes 1 second to define 100'000'000 non-normalised vectors\n";
	cout << "And 18 seconds for normalised vectors, on a 1GHz Pentium III\n\n";
	cout << "How many vectors would you like to define?\n\n";
	cin >> a;
	cout << "\nWould you like to normalise them? Type 1 for yes, 0 for no\n\n";
	cin >> b;
	cout << "\nCALCULATING...\n";

	DWORD Start = timeGetTime();		// millisecond timer from mmsystem.h (link with winmm.lib)

	for (long int i = 0; i < a; i++)
	{
		vector v1(14.378478f, 25.453625f, 46.672398f);
		if (b == 1) v1.Normalised();
	}

	DWORD End = timeGetTime();
	DWORD Span = (End - Start);

	char string[80];
	sprintf(string, "%ld vectors defined in %lu milliseconds", a, Span);	// %ld for long, %lu for DWORD
	cout << "\n\n" << string << "\n\n";
	cout << "END OF CALCULATION \n\n";
	return 0;
}
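
On the question of whether the test method is valid: one possible concern is that an optimising compiler may notice that v1 is never read and drop the loop body entirely, and timeGetTime() can have a resolution of several milliseconds. Below is a minimal sketch of a more portable measurement, assuming a C++11 compiler. The names Vec3, normalise, BenchResult and run_benchmark are hypothetical and not part of the code above.

```cpp
#include <chrono>
#include <cmath>

struct Vec3 { float x, y, z; };

// Scale v to unit length; a zero-length vector is left unchanged.
inline void normalise(Vec3& v) {
    float length = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    if (length == 0.0f) return;
    v.x /= length;
    v.y /= length;
    v.z /= length;
}

struct BenchResult {
    double milliseconds;  // elapsed wall-clock time
    float checksum;       // sum of results, returned so the work cannot be discarded
};

// Build `count` vectors, optionally normalising each one, and time the loop.
BenchResult run_benchmark(long count, bool do_normalise) {
    float checksum = 0.0f;
    auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < count; ++i) {
        // Vary the input with i so the result is not a loop-invariant constant.
        Vec3 v{14.378478f + static_cast<float>(i & 7), 25.453625f, 46.672398f};
        if (do_normalise) normalise(v);
        checksum += v.x;
    }
    auto end = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::milli> span = end - start;
    return {span.count(), checksum};
}
```

Calling run_benchmark(1000000, false) and run_benchmark(1000000, true) and comparing the two milliseconds fields gives the same kind of comparison as the program above. Printing the checksum as well makes sure the compiler has to keep the arithmetic.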

Edited by - Keermalec on September 23, 2001 12:43:35 PM

Dactylos

122

September 21, 2001 03:52 PM

Write &lt; and &gt; to get < and > respectively.

Since normalisation of an arbitrary matrix requires the use of sqrt, it is bound to be slow (sqrt is always a relatively slow operation on today's hardware).
If you know that your matrices are orthogonal then you can normalize them in a much cheaper way, utilizing the fact that:

A^-1=A^T if A is orthogonal.

(A^-1 is the inverse of A and A^T is the transpose of A).

Just remember that if your matrices are 4x4, only the upper-left 3x3 corner of them is the orthogonal 'vector-space base'. The position must be treated separately.
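
To illustrate the identity A^-1=A^T: a rotation matrix is orthogonal, so multiplying its transpose by the original should give the identity matrix, which means the transpose can stand in for a full inverse. A minimal sketch, assuming a 3x3 matrix stored row by row. Mat3, transpose, multiply and rotation_z are hypothetical helper names used only for this example.

```cpp
#include <array>
#include <cmath>

using Mat3 = std::array<std::array<float, 3>, 3>;  // m[row][column]

// Swap rows and columns.
Mat3 transpose(const Mat3& m) {
    Mat3 t{};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            t[c][r] = m[r][c];
    return t;
}

// Standard 3x3 matrix product a * b.
Mat3 multiply(const Mat3& a, const Mat3& b) {
    Mat3 out{};  // zero-initialised
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            for (int k = 0; k < 3; ++k)
                out[r][c] += a[r][k] * b[k][c];
    return out;
}

// Rotation about the z axis: an example of an orthogonal matrix.
Mat3 rotation_z(float radians) {
    float c = std::cos(radians);
    float s = std::sin(radians);
    Mat3 m{};
    m[0][0] = c;  m[0][1] = -s;
    m[1][0] = s;  m[1][1] = c;
    m[2][2] = 1.0f;
    return m;
}
```

With R = rotation_z(angle), multiply(transpose(R), R) comes out as the identity up to floating-point rounding, with no sqrt or division involved in the inverse.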

[EDIT: The "Edit" button is in the upper right corner of every post]
[EDIT2: Messed up the > and < advice]

Edited by - Dactylos on September 21, 2001 4:59:51 PM

_DarkWIng_

602

September 22, 2001 01:36 PM

Keermalec: Try using some more optimized methods...



// Fast normalization of 3 component vector. Does not test if the vector has 0 length
__inline void FastNormVect3(float *v) {
		float ilength;
		ilength = RSqrt(FastDotProduct(v, v)); 

		v[0] *= ilength;
		v[1] *= ilength;
		v[2] *= ilength;
}


// Fast 15 cycle asm dot product, by http://talika.fie.us.es/~titan/ (Titan engine)
__forceinline float __cdecl FastDotProduct(const float v1[3], const float v2[3]) {
		FLOAT dotret;
		__asm {
				mov ecx, v1
				mov eax, v2

				// optimized dot product (15 cycles)
				fld dword ptr   [eax+0]     // starts & ends on cycle 0
				fmul dword ptr  [ecx+0]     // starts on cycle 1
				fld dword ptr   [eax+4]     // starts & ends on cycle 2
				fmul dword ptr  [ecx+4]     // starts on cycle 3
				fld dword ptr   [eax+8]     // starts & ends on cycle 4
				fmul dword ptr  [ecx+8]     // starts on cycle 5
				fxch            st(1)       // no cost
				faddp           st(2),st(0) // starts on cycle 6, stalls for cycles 7-8
				faddp           st(1),st(0) // starts on cycle 9, stalls for cycles 10-12
				fstp dword ptr  [dotret]    // starts on cycle 13, ends on cycle 14
		}
		return dotret;
}

RSqrt is some asm optimized 1/Sqrt function...

but your vector has to be defined something like


class Vector3 {
public:
	union {				// union makes a nice trick to access variables
		struct {		// by vector.x vector.y vector.z or you can give
			GLfloat x;	// pointer to float v[3] structure
			GLfloat y;
			GLfloat z;
		};
		GLfloat v[3];
	};

	// add functions here...
};
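
The RSqrt helper called by FastNormVect3 is not shown in this post. As a stand-in, here is a minimal portable sketch with two hypothetical functions: RSqrtExact, which simply uses the standard library, and RSqrtApprox, which uses the well-known integer bit trick for an initial guess followed by one Newton-Raphson step. Neither is the asm routine referred to above, and the approximate version assumes 32-bit IEEE 754 floats and x > 0.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Reference version: one square root and one division.
inline float RSqrtExact(float x) {
    return 1.0f / std::sqrt(x);
}

// Approximate 1/sqrt(x) for x > 0.
// Assumes float is a 32-bit IEEE 754 value.
inline float RSqrtApprox(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);   // reinterpret the float's bits safely
    bits = 0x5f3759dfu - (bits >> 1);      // magic-constant initial guess
    float y;
    std::memcpy(&y, &bits, sizeof y);
    return y * (1.5f - 0.5f * x * y * y);  // one Newton-Raphson refinement
}
```

The approximate version is only accurate to a fraction of a percent, which is often fine for lighting normals but may not be for physics. Whether it is actually faster than 1.0f / sqrt on a given CPU and compiler is something to measure, not assume.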

And one more thing: never use something like

a=....
x/=a;
y/=a;
z/=a;

instead use....

a=....
a=1/a;
x*=a;
y*=a;
z*=a;
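
Written out as a complete function, the idea is that a division is typically much slower than a multiplication, so computing the reciprocal once and multiplying three times replaces three divides with one. A minimal sketch: NormaliseVec3 is a hypothetical name, and unlike FastNormVect3 above it does check for a zero-length vector.

```cpp
#include <cmath>

// Normalise v[0..2] in place using a single division.
// Returns false (and leaves v untouched) if the vector has zero length.
inline bool NormaliseVec3(float v[3]) {
    float lengthSquared = v[0] * v[0] + v[1] * v[1] + v[2] * v[2];
    if (lengthSquared == 0.0f) return false;

    float inverseLength = 1.0f / std::sqrt(lengthSquared);  // the only divide
    v[0] *= inverseLength;
    v[1] *= inverseLength;
    v[2] *= inverseLength;
    return true;
}
```

The result can differ from the three-divide version in the last bit or so because of rounding, which normally does not matter for this kind of use.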

Edited by - _DarkWIng_ on September 22, 2001 2:39:17 PM

You should never let your fears become the boundaries of your dreams.

Keermalec

Author

122

September 23, 2001 11:39 AM

Wow thanks for the */ trick Darkwing, that actually speeds up my
vector normalisations from about 380 to 250 milliseconds for
1'000'000 operations. The fastdotproduct stuff is way over my
head at the moment but I'll come back to it in due time. What
sort of time gain do you think it makes?

BTW, I also tested the classic dotproduct function. If I
normalise two vectors and THEN do a dotproduct, it takes much
longer than if I do a dotproduct of two unnormalised vectors and
then divide the result by the two vectors' lengths, even if I
calculate those two lengths at that moment.

I now store a bool parameter in my vector class whose sole
function is to indicate whether the vector is normalised or
not. The dotproduct function therefore looks like this:

GLfloat dotproduct(vector v1, vector v2)
{
	// if both vectors have been normalised:
	if ((v1.unit == TRUE) && (v2.unit == TRUE))
		return v1.dx*v2.dx + v1.dy*v2.dy + v1.dz*v2.dz;
	// if not:
	else
		return (v1.dx*v2.dx + v1.dy*v2.dy + v1.dz*v2.dz) / (v1.len()*v2.len());
}
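
One possible refinement of the unnormalised branch: since |v1|*|v2| equals sqrt(|v1|^2 * |v2|^2), the two length calculations can be folded into a single square root. A minimal sketch under that assumption. Vec3f, Dot and CosAngle are hypothetical names, and the zero-length handling (returning 0) is just one possible choice.

```cpp
#include <cmath>

struct Vec3f { float dx, dy, dz; };

// Plain dot product, no normalisation.
inline float Dot(const Vec3f& a, const Vec3f& b) {
    return a.dx * b.dx + a.dy * b.dy + a.dz * b.dz;
}

// Cosine of the angle between a and b, using one sqrt instead of two.
// Returns 0 if either vector has zero length.
inline float CosAngle(const Vec3f& a, const Vec3f& b) {
    float lengthsSquared = Dot(a, a) * Dot(b, b);  // |a|^2 * |b|^2
    if (lengthsSquared == 0.0f) return 0.0f;
    return Dot(a, b) / std::sqrt(lengthsSquared);
}
```

For very long or very short vectors the product of squared lengths can overflow or underflow a float, so the two-sqrt form may be safer there.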

Edited by - Keermalec on September 23, 2001 12:50:16 PM

Edited by - Keermalec on September 23, 2001 12:51:39 PM

_DarkWIng_

602

September 24, 2001 03:12 AM

Using the boolean "unit" gives some speed-up in this function, but remember: you'll have to add unit=false to just about every other function in the vector class (in my case that's a lot of functions), so in a real-world example you don't gain anything. In my testing the most expensive operation is 1/Sqrt, so try optimizing that...
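
To make the bookkeeping cost concrete, here is a minimal sketch of a vector class with a unit flag. FlaggedVector and its member names are hypothetical. The point is only that every member that changes the components has to remember to clear the flag.

```cpp
#include <cmath>

class FlaggedVector {
public:
    FlaggedVector(float x, float y, float z)
        : dx_(x), dy_(y), dz_(z), unit_(false) {}

    bool isUnit() const { return unit_; }
    float x() const { return dx_; }
    float y() const { return dy_; }
    float z() const { return dz_; }

    // Every mutating operation must reset the flag.
    void scale(float s) {
        dx_ *= s; dy_ *= s; dz_ *= s;
        unit_ = false;
    }
    void add(const FlaggedVector& other) {
        dx_ += other.dx_; dy_ += other.dy_; dz_ += other.dz_;
        unit_ = false;
    }

    // Normalise only if not already known to be unit length.
    void normalise() {
        if (unit_) return;
        float lengthSquared = dx_ * dx_ + dy_ * dy_ + dz_ * dz_;
        if (lengthSquared == 0.0f) return;  // cannot normalise; flag stays false
        float inverseLength = 1.0f / std::sqrt(lengthSquared);
        dx_ *= inverseLength; dy_ *= inverseLength; dz_ *= inverseLength;
        unit_ = true;
    }

private:
    float dx_, dy_, dz_;
    bool unit_;
};
```

Keeping the components private is what makes this workable: with public dx, dy, dz members any outside code could change a component without touching the flag.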

Another thing: you could use asm (+ MMX) functions to speed up multiplication (multiplying x,y,z at the same time) to get x1*x2, y1*y2, z1*z2 in just two cycles... [try looking at Gamasutra's optimization articles]

You should never let your fears become the boundaries of your dreams.

oglman

122

September 25, 2001 09:10 AM

I agree with the statement "Only normalize if you really have to", but there are many cases in which normalization is mandatory:

-Movement Vectors (so everything moves at the same speed)
-Normal Vectors (Lighting is MUCH faster)
-Physics detection is harder without vector normalization.

It's not much, but there isn't much else you can use vectors for in 3D
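
As an example of the movement case: if the direction to a target is not normalised, objects further from their target move faster. A minimal sketch, in which Position and StepTowards are hypothetical names and the speed is in units per second:

```cpp
#include <cmath>

struct Position { float x, y, z; };

// Move `from` towards `to` by speed * dt, regardless of how far away `to` is.
// Note: this sketch does not clamp, so it can overshoot a nearby target.
inline Position StepTowards(Position from, Position to, float speed, float dt) {
    float dx = to.x - from.x;
    float dy = to.y - from.y;
    float dz = to.z - from.z;
    float distanceSquared = dx * dx + dy * dy + dz * dz;
    if (distanceSquared == 0.0f) return from;  // already there, no direction

    // Normalise the direction and scale by the step length in one go.
    float scale = speed * dt / std::sqrt(distanceSquared);
    return {from.x + dx * scale, from.y + dy * scale, from.z + dz * scale};
}
```

Here the normalisation is folded into a single scale factor, so there is still only one sqrt and one divide per step.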

Open mouth, insert foot
