
cosinus function

Started by November 02, 2001 03:05 AM
25 comments, last by Jesper T 23 years, 2 months ago
Here's a fast angle clamperizer

float ClampAngle(float fAngle)
{
	if (fAngle > PI)
		fAngle -= floorf( (fAngle+PI)/PI2 ) * PI2;
	if (fAngle < -PI)
		fAngle -= ceilf( (fAngle-PI)/PI2 ) * PI2;
	return fAngle;
}



Edited by - Thrump on November 2, 2001 10:15:53 PM
Thrump, if you want something to be fast NEVER call floor or ceil. They can take over 100 clock cycles on an x86 CPU (that's worse than a single cos, sin, or sqrt). Use integer truncation instead.
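
A minimal sketch of what "use integer truncation" could look like for the clamp above. The name ClampAngleTrunc and the PI/PI2 definitions are assumptions made for illustration, not code from the thread. A float-to-int cast truncates toward zero, which matches floorf for the positive branch and ceilf for the negative branch, so the result should be the same as long as the number of whole turns fits in an int.

```c
#define PI  3.14159265358979f
#define PI2 (2.0f * PI)

/* Wrap an angle into [-PI, PI] using a float-to-int cast
   instead of floorf/ceilf (hypothetical variant of ClampAngle). */
float ClampAngleTrunc(float fAngle)
{
    /* The cast truncates toward zero: same as floorf for positive
       values and same as ceilf for negative values. */
    if (fAngle > PI)
        fAngle -= (float)(int)((fAngle + PI) / PI2) * PI2;
    else if (fAngle < -PI)
        fAngle -= (float)(int)((fAngle - PI) / PI2) * PI2;
    return fAngle;
}
```

Whether the cast is actually cheaper depends on the compiler and CPU, so it is worth timing on the target machine.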

[Resist Windows XP's Invasive Production Activation Technology!]
Good to know. Those 2 are from the ps2 libs. Yeah, if I'd done that on the pc, I would have been in for a rude (and chunky) surprise.
I don't know much about MIPS processors, so I couldn't say anything about them. I guess they switch rounding modes faster, or something.

For interest's sake, I just timed them. Integer casting is still quite a bit faster.

1000 casts = 2100 cycles
1000 ceilf = 15000 cycles
1000 floorf = 24000 cycles

Not sure if my testing methods are sound. I did this. (T1 is a built in bus counter)
*T1_COUNT = 0;
for(i=0;i<1000;i++)
{
	//x = (int)(x+1.0f);
	//x = ceilf(x);
	x = floorf(x);
}
duration = *T1_COUNT;
printf("counter %f\n", duration);
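
For anyone without the PS2 bus counter, a rough portable sketch of the same kind of measurement, using clock() from <time.h>. The names time_op and cast_op are made up for this example. The volatile variable is there so the compiler does not optimize the loop away, and the function-pointer call adds its own overhead to every iteration, so treat the numbers as relative rather than absolute.

```c
#include <time.h>

/* Hypothetical operation under test: the int-cast round trip
   from the commented-out line in the loop above. */
float cast_op(float x)
{
    return (float)(int)(x + 1.0f);
}

/* Returns the CPU time in seconds spent applying op `iterations` times.
   Pass a different function (e.g. floorf) to compare. */
double time_op(float (*op)(float), int iterations)
{
    volatile float x = 0.5f;   /* volatile keeps the loop from being removed */
    clock_t start = clock();
    for (int i = 0; i < iterations; i++)
        x = op(x);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

clock() is far coarser than a cycle counter, so a much larger iteration count than 1000 is probably needed to get a usable reading.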


One question. Why do you think doing both ceilf and floorf would only take 28000 cycles?
15000 + 24000 = 28000?
I could hazard a few guesses, but I'm not sure.



Edited by - Thrump on November 2, 2001 11:59:30 PM
Weird. Doing all 3 takes 51000 cycles.
Your compiler probably realizes that it only has to switch the FPU's rounding mode once before running the many ceils, floors, or casts. If you do all three in one test it has to do 3 switches per loop.

Is that on the ps2 you timed them?

On 586+ architectures, ifs are evil. In tight loops it can be faster to do more work if you can eliminate an if.
Next, destroy loops if possible.

For instance:
//this
((DWORD*)&ang)[1] &= 0x7FFFFFFF;
//can replace this
if(ang<0.0) {ang=-ang;}

That shaves off 6 ticks
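
To spell out what the mask does: in an IEEE 754 double the top bit is the sign, so clearing it gives the absolute value with no branch. The one-liner above assumes a little-endian machine where element [1] of the DWORD view is the high 32 bits. Here is a sketch of the same idea that does not depend on that layout, using memcpy and a 64-bit integer. The name abs_by_mask is hypothetical, and the code assumes double is the 64-bit IEEE 754 format.

```c
#include <stdint.h>
#include <string.h>

/* Branch-free absolute value: clear the sign bit of a 64-bit IEEE 754 double. */
double abs_by_mask(double ang)
{
    uint64_t bits;
    memcpy(&bits, &ang, sizeof bits);         /* view the double as raw bits */
    bits &= UINT64_C(0x7FFFFFFFFFFFFFFF);     /* keep everything except the sign bit */
    memcpy(&ang, &bits, sizeof ang);          /* back to a double */
    return ang;
}
```

Compilers usually turn the two memcpy calls into plain register moves, so this should cost about the same as the pointer cast while avoiding the aliasing problem.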

If you have a number of values that you need to take the cosine of, make a cosine function that takes an array of doubles, takes the cosine then sticks the results back into the array.

You could try making that array of coefficients constant.
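
A rough sketch of those two suggestions together: a constant coefficient table and a function that works on a whole array in place. The names cos_coeff and cos_array are invented for this example. It also swaps the repeated power computation for Horner's scheme on x*x, which is my substitution and not something from the thread. It is still only a 6-term Taylor series, so accuracy drops off as |x| grows and the input should be clamped to a small range first.

```c
#include <stddef.h>

/* Taylor coefficients of cos(x): 1, -1/2!, 1/4!, -1/6!, 1/8!, -1/10! */
static const double cos_coeff[6] = {
    1.0, -1.0 / 2.0, 1.0 / 24.0, -1.0 / 720.0, 1.0 / 40320.0, -1.0 / 3628800.0
};

/* Replace each element of values[0..count-1] with its approximate cosine. */
void cos_array(double *values, size_t count)
{
    for (size_t n = 0; n < count; n++) {
        double x2 = values[n] * values[n];
        double res = cos_coeff[5];
        /* Horner's scheme in x^2: one multiply and one add per term */
        for (int i = 4; i >= 0; i--)
            res = res * x2 + cos_coeff[i];
        values[n] = res;
    }
}
```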


If you really want to try to out-do the math cos, get the Intel instruction set reference manual and learn x786
manuals

On my K6-3, I have your Cos function clocked at 973 ticks and math's cos at 124. There is an opcode for cosines (fcos); it has been part of the x87 instruction set since the 80387.

Magmai Kai Holmlor
- Not For Rent
- The trade-off between price and quality does not exist in Japan. Rather, the idea that high quality brings on cost reduction is widely accepted.-- Tajima & Matsubara
Ok, not that I actually understood this:

((DWORD*))[1] &= 0x7FFFFFFF;

But is it like a bit mask or something ?

And this: ∠ ..caused trouble when I tried it

my compiler (MSVC) doesn't recognize that character or something..






(I want to guive yuo teh impresion that I am very intelgient)
what do u think?

double cos_table[6]=
{
1,
-0.5,
0.041666666666666666666666666666667,
-0.0013888888888888888888888888888889,
2.4801587301587301587301587301587e-5,
-2.7557319223985890652557319223986e-7
};

double sin_table[6]=
{
1,
-0.16666666666666666666666666666667,
0.0083333333333333333333333333333333,
-0.0001984126984126984126984126984127,
2.7557319223985890652557319223986e-6,
-2.5052108385441718775052108385442e-8
};

double fast_int_power(double x,int y)
{
double r=1;
for (int i=1;i<=y;i++)
r*=x;
return (r);
}

double _cos(double x)
{
double res;
res = 0;
for (int i=0;i<6;i++)
{
res += fast_int_power(x,i<<1)*cos_table[i];
}
return res;
}

double _sin(double x)
{
double res;
res = 0;
for (int i=0;i<6;i++)
{
res += fast_int_power(x,(i<<1)+1)*sin_table[i];
}
return res;
}

This topic is closed to new replies.
