
Optimization: Benefit for using floats vs doubles?

Started by August 10, 2000 01:22 AM
14 comments, last by Dark Druid 24 years, 4 months ago
I'm trying to optimize my program right now (a 3D space flight simulator). I made an initial call to use doubles for my variables because I wanted the most precision possible for my physics calculations. Now that I've been optimizing my code, I'm wondering if that was a good idea. Using floats will probably help performance, but will there be any hidden cost? I'm guessing the difference is negligible and the benefits will outweigh the cost, but I wanted to ask if anyone knows exactly how much this would help, and what the standard is in the gaming industry when it comes to 3D engines. Thanks
I think the floating point registers default to 80 bits, so I don't think there's any penalty. I don't *think* it would matter either way, but I'm almost sure that there wouldn't be a penalty for using doubles.
I don't know how the registers in current CPUs are designed for floating point numbers, but I also know I don't care. You are paying a hefty price to use doubles anyway. You're wasting space in the cache that could be holding other frequently used data, and efficient cache usage is one of the best ways to speed up a program. You're wasting memory bus bandwidth to send bigger data back and forth. Most likely, you'd not see any drawbacks using floats, unless you're performing calculations with very different orders of magnitude. 3D hardware generally expects to get floats (since that's what most programmers use...), not doubles, and is designed to work best that way. I would recommend you use floats, unless the accuracy becomes a problem, which is doubtful.
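To make the "very different orders of magnitude" caveat concrete, here is a minimal sketch. The function names are made up for illustration, and it assumes float and double are the usual IEEE 754 32-bit and 64-bit formats. A float has a 24-bit significand, so adding 1 to 2^24 is lost entirely, while a double still holds the exact result.

```cpp
#include <cassert>

// Adds a small increment to a large base in single precision.
// The volatile store forces the result to be rounded to a 32-bit float,
// even on FPUs that compute with wider intermediate registers.
float add_as_float(float base, float increment) {
    volatile float sum = base + increment;
    return sum;
}

// Same operation in double precision, for comparison.
double add_as_double(double base, double increment) {
    volatile double sum = base + increment;
    return sum;
}

// Illustrative self-check: 16777216 is 2^24, the point where a float
// can no longer represent every integer.
void check_float_absorbs_small_increment() {
    assert(add_as_float(16777216.0f, 1.0f) == 16777216.0f);  // increment lost
    assert(add_as_double(16777216.0, 1.0) == 16777217.0);    // increment kept
}
```

In a space sim this shows up when a small per-frame displacement is added to a very large world coordinate.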
I would highly recommend using floats.
Then you can convert your data structures to make use of the P3 Streaming SIMD Extensions, whose registers each hold 4 packed 32-bit floats.
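As a rough sketch of what "converting your data structures" could look like, here is a hypothetical 4-float vector type (`Vec4f` and `add` are made-up names, not from any library) sized and aligned like one 128-bit SSE register. The add is written as a plain loop rather than with intrinsics, so whether it becomes a single packed instruction is up to the compiler and its flags.

```cpp
#include <cstddef>

// Hypothetical 4-wide float vector, sized and aligned to match one
// 128-bit SSE register (4 packed 32-bit floats).
struct alignas(16) Vec4f {
    float v[4];
};

static_assert(sizeof(Vec4f) == 16, "Vec4f should fill exactly one 128-bit register");

// Element-wise add written as a plain loop; a compiler with SSE enabled
// may turn this into one packed add, but that is not guaranteed.
Vec4f add(const Vec4f& a, const Vec4f& b) {
    Vec4f r;
    for (std::size_t i = 0; i < 4; ++i) {
        r.v[i] = a.v[i] + b.v[i];
    }
    return r;
}
```

The same layout with doubles would only fit two values in 16 bytes, which is the practical reason floats pair well with this kind of SIMD.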

I would recommend that you TEST what the benefits of both are, in terms of precision and performance. Define your own type name:

        typedef double UNIT;

Then it's easy to change to another type.
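One way to run that kind of test without rebuilding is to template the calculation on the number type. This is only a sketch: `integrate_position` and `position_error` are hypothetical names, and the fixed-step integration stands in for whatever the real physics code does.

```cpp
#include <cmath>

// Accumulates position over `steps` fixed time steps, doing all the
// arithmetic in type T (float or double).
template <typename T>
T integrate_position(T velocity, T dt, long steps) {
    T position = 0;
    for (long i = 0; i < steps; ++i) {
        position += velocity * dt;
    }
    return position;
}

// Absolute error of the T-precision result against velocity * dt * steps
// evaluated once in double.
template <typename T>
double position_error(double velocity, double dt, long steps) {
    const T result = integrate_position<T>(static_cast<T>(velocity),
                                           static_cast<T>(dt), steps);
    const double exact = velocity * dt * static_cast<double>(steps);
    return std::fabs(static_cast<double>(result) - exact);
}
```

Calling `position_error<float>(1.0, 0.001, 1000000)` and `position_error<double>(1.0, 0.001, 1000000)` side by side gives a feel for how much drift each type accumulates over a long run.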

Hmmm, did this piece of code change to source-type? (I just tested source /source)

Sorry, but I must try quote too.

quote:
I'm trying to optimize my program right now (3d space flight simulator),


I now post it and check!!!!



/Mankind gave birth to God.

Edited by - DDnewbie on August 11, 2000 4:27:47 AM
Just to set the record straight here...

On a PC:

In General -
float is defined as a 32 bit format - 1 for sign, 8 for exponent, and 23 for the mantissa
double is defined as a 64 bit format - 1 for sign, 11 for the exponent, and 52 for the mantissa

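x87 extended precision (what long double maps to on compilers that expose it; 32-bit MSVC treats long double as the same 64 bit format as double) -
the extended format is 80 bits - 1 for sign, 15 for exponent, and 64 for the mantissa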
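To see the 1/8/23 split for a float directly, here is a small sketch that pulls the three fields out of the raw bits. `FloatFields` and `decompose` are illustrative names, and the code assumes float is the IEEE 754 32-bit format described above.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");

// The three fields of a 32-bit float, as raw unsigned values.
struct FloatFields {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, stored with a bias of 127
    std::uint32_t mantissa;  // 23 stored bits (the leading 1 is implicit)
};

FloatFields decompose(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined way to read the bits
    FloatFields f;
    f.sign = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.mantissa = bits & 0x7FFFFFu;
    return f;
}
```

For example, `decompose(1.0f)` gives sign 0, exponent 127 and mantissa 0, since 1.0 is 1.0 x 2^0 and the exponent is stored biased by 127.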

Generally, most game engines use floats for most things. But in applications like matrices, where you can begin to see degradation after 3 or 4 levels of hierarchy (like in a finger matrix that has been built from the hand, then arm, then body matrix), you should use doubles. Depending on what you do with the numbers, floats should be sufficient for physics calculations.

PreManDrake
The issue here is memory. Floats are four bytes. Doubles are eight bytes. There's no speed difference in terms of calculation, but there *is* a difference if you write lots of data out to memory as doubles. That is, an array of vectors will take twice as much space if the values are double precision. If you're talking about a few million values, then the space is significant. The doubling of space may cause a noticeable slowdown because of the additional cache misses.
It's a case of precision vs memory... The FPU pipeline itself is designed to handle large floats (64-bit+), and using 32-bit floats isn't going to get them through any faster - the only way to do that is to switch to SIMD (in which case the data size will probably be predetermined).

As far as cache misses go, you'd have to saturate even the smallest cache with over 8000 doubles to generate a _miss_ when analysing large blocks of data. You're more likely to get cache misses by distributing your data all over memory - in which case the size of the data items doesn't really matter...

A random example would be 500 mesh definitions, each with 500 points of 3 coordinates, which would take up

500 x 500 x 3 x 8 = 6,000,000 bytes = about 5.7 meg - as (64-bit) doubles
500 x 500 x 3 x 4 = 3,000,000 bytes = about 2.9 meg - as (32-bit) floats

I don't think you're going to miss the memory...
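The arithmetic for that example can be written as a tiny helper, if you want to plug in your own mesh counts. `mesh_bytes` is a made-up name, and the sketch assumes 3 coordinates per point and nothing else stored per vertex.

```cpp
#include <cstddef>

// Bytes needed for `meshes` meshes of `points_per_mesh` points,
// with 3 coordinates per point and no other per-vertex data.
std::size_t mesh_bytes(std::size_t meshes,
                       std::size_t points_per_mesh,
                       std::size_t bytes_per_coordinate) {
    return meshes * points_per_mesh * 3 * bytes_per_coordinate;
}
```

With `sizeof(double)` equal to 8 and `sizeof(float)` equal to 4, `mesh_bytes(500, 500, sizeof(double))` is 6,000,000 and `mesh_bytes(500, 500, sizeof(float))` is 3,000,000, matching the figures in the post.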

Another consideration, however, may be how the data gets where it's going, e.g. submitting vertices to a hardware OpenGL implementation may be faster if the amount of data to be sent across the bus is smaller...

Jans.


-----------------
Janucybermetaltvgothmogbunny
/source int x = 0;
quote: Original post by Jansic

As far as cache misses go, you'd have to saturate even the smallest cache with over 8000 doubles to generate a _miss_ when analysing large blocks of data. You're more likely to get cache misses by distributing your data all over memory - in which case the size of the data items doesn't really matter...

Jansic, you're way off on the "smallest" cache holding 8000 doubles. A lot of CPUs only have 32K of L1 cache. And older CPUs, which many people still use, only have 16K. You shouldn't be writing programs that only work well on Athlons or something.
And let's not forget, normally this cache is divided so you only have HALF that much for data; the other half is for instructions only. This means we're down to only 16K of data space on most CPUs. That's 2048 doubles versus 4096 floats fitting in there, assuming that's the only data you're using. More often than not, you want various other data also sitting in that cache right along with it. That's not a lot of space as far as modern data sets go. Also, all recent and current CPUs, memory chips, etc. are optimized for sending 32-bit words, not double words, because this is the native word size for the CPU. Using anything bigger than that (regardless of what type) will result in some performance loss for that reason also. The point is that the extra 4 bytes per value can add up to a lot of cache misses, wasted bus bandwidth, wasted CPU cycles, etc.
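For anyone checking the numbers, the capacity figures are just the data cache size divided by the value size. A minimal sketch, with `values_that_fit` as a made-up helper and the 16K figure taken from the post:

```cpp
#include <cassert>
#include <cstddef>

// How many values of a given size fit in a cache of `cache_bytes` bytes,
// assuming nothing else is competing for the space.
std::size_t values_that_fit(std::size_t cache_bytes, std::size_t bytes_per_value) {
    assert(bytes_per_value > 0);
    return cache_bytes / bytes_per_value;
}
```

`values_that_fit(16 * 1024, 8)` gives 2048 and `values_that_fit(16 * 1024, 4)` gives 4096. By the same arithmetic, 8000 doubles need 64,000 bytes, which is more than a 16K data cache can hold.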
I can't say definitively that floats are going to offer better performance because there are too many other aspects to your program, but in general, they will. The best thing you can do is profile both ways if you really need to find out, but you can't do that meaningfully until the rest of your code is solidified.
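A very rough sketch of "profile both ways" could look like the following. `TimedSum` and `time_sum` are hypothetical names, and summing an array is only a stand-in for real workload code, so the timings say something about memory traffic on your machine and little else.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Result of one timed run: the computed sum and the elapsed wall time.
template <typename T>
struct TimedSum {
    T sum;
    double milliseconds;
};

// Sums `data` `passes` times and reports how long it took.
// Returning the sum keeps the compiler from discarding the loop.
template <typename T>
TimedSum<T> time_sum(const std::vector<T>& data, int passes) {
    using Clock = std::chrono::steady_clock;
    T sum = 0;
    const Clock::time_point start = Clock::now();
    for (int p = 0; p < passes; ++p) {
        for (std::size_t i = 0; i < data.size(); ++i) {
            sum += data[i];
        }
    }
    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
    return TimedSum<T>{sum, elapsed.count()};
}
```

Run it with a `std::vector<float>` and a `std::vector<double>` of the same length, at sizes both smaller and larger than your cache, and compare the `milliseconds` fields over several repetitions.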

This topic is closed to new replies.
