How to calculate a float from bytes?

Started by June 03, 2000 06:14 PM
5 comments, last by XBTC 24 years, 6 months ago
I know how to calculate a 32-bit int from four bytes: int = byte0 + byte1*2^8 + byte2*2^16 + byte3*2^24. But how do I calculate a float from 8 bytes? Thanx in advance, XBTC!
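For reference, the integer case from the question can be sketched in C roughly as follows. The function name u32_from_bytes is made up for illustration, and the sketch assumes byte0 is the least significant byte, as in the formula in the post.

```c
#include <stdint.h>

/* Hypothetical helper: combine four bytes, least significant first,
   into one 32-bit unsigned value. Each byte is widened to uint32_t
   before shifting so the shift by 24 cannot overflow a signed int. */
uint32_t u32_from_bytes(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
    return (uint32_t)b0
         | ((uint32_t)b1 << 8)
         | ((uint32_t)b2 << 16)
         | ((uint32_t)b3 << 24);
}
```

Shifting left by 8, 16 and 24 is the same as multiplying by 2^8, 2^16 and 2^24.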
I'm quoting the IEEE Standard for Binary Floating-Point Arithmetic, 754-1985. The number = (-1)^s * m * 2^k, where s is 0 or 1, m = 1.m1m2m3...m23 (in binary), and the stored exponent = k + 2^7 - 1.
The first bit is s (the sign). Bits 2 to 9 are the exponent, and bits 10 to 32 are the mantissa (I don't know if that is the right English word for it).

a float is 4 bytes and a double is 8 bytes.
To the vast majority of mankind, nothing is more agreeable than to escape the need for mental exertion... To most people, nothing is more troublesome than the effort of thinking.
Maybe I can explain the IEEE FPS a little more clearly. A 32-bit floating point value is set up like this:

SEEE EEEE EFFF FFFF FFFF FFFF FFFF FFFF

where S is the sign bit, E is the 8-bit exponent, and F is the 23-bit mantissa.
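As a rough sketch, the three fields in that layout can be pulled out of a 32-bit word with shifts and masks. The struct and function names here are invented for illustration, and the word is assumed to already hold the four bytes in the right order.

```c
#include <stdint.h>

/* Hypothetical container for the raw fields of a 32-bit IEEE 754 single. */
struct float_fields {
    uint32_t sign;      /* S: 1 bit                 */
    uint32_t exponent;  /* E: 8 bits, still biased  */
    uint32_t fraction;  /* F: 23 bits               */
};

/* Split a 32-bit pattern laid out as S EEEEEEEE FFF...F (23 F bits). */
struct float_fields split_float_bits(uint32_t bits)
{
    struct float_fields f;
    f.sign     = bits >> 31;             /* top bit           */
    f.exponent = (bits >> 23) & 0xFFu;   /* next 8 bits       */
    f.fraction = bits & 0x7FFFFFu;       /* low 23 bits       */
    return f;
}
```

For the pattern 0x3F800000 (the value 1.0) this gives S = 0, E = 127 and F = 0.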

The sign bit is simple enough: it's set to 0 if the number is positive, and 1 if the number is negative.

The exponent is stored in biased-127 format. For representable numbers, its value may range from -126 to 127; the lowest and highest values, -127 and 128, are reserved for special representations. Read in the value of the exponent bits and subtract the bias of 127 to get the exponent, which I'll call e.

The mantissa is actually a string of binary digits meant to come after a one and a binary point, so if the 23 bits stored for a number's mantissa are:

001 1010 0011 0000 0111 1100

then the number the mantissa is representing is:

1.001 1010 0011 0000 0111 1100

So all you have to do is determine this mantissa, and multiply it by 2^e, where e is the value of the exponent. In other words, just shift the binary point by e places, and from there, you can convert directly to the decimal representation.
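A minimal sketch of that calculation in C might look like the following. It only covers ordinary (normalized) numbers, the function name decode_normal_float is made up for illustration, and it returns a double so that the arithmetic itself does not round.

```c
#include <stdint.h>

/* Hypothetical helper: decode a normalized single, i.e. one whose
   stored exponent E is in 1..254. Zero, infinity, NaN and other
   special bit patterns are NOT handled here. */
double decode_normal_float(uint32_t bits)
{
    uint32_t sign = bits >> 31;
    int      e    = (int)((bits >> 23) & 0xFFu) - 127;  /* remove the bias */
    uint32_t frac = bits & 0x7FFFFFu;

    /* 1.F as a real number: the implied leading one plus F / 2^23 */
    double value = 1.0 + (double)frac / 8388608.0;

    /* Multiply by 2^e, one factor of two at a time (exact in a double). */
    int i;
    for (i = 0; i < e; i++) value *= 2.0;
    for (i = 0; i > e; i--) value /= 2.0;

    return sign ? -value : value;
}
```

For example, 0xC0200000 has S = 1, E = 128 (so e = 1) and a mantissa of 1.01 in binary, which is 1.25; the result is -(1.25 * 2^1) = -2.5.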

Lastly, there are several special representations that you should be aware of when making the conversions:

If e = -127 (E = 0), and F = 0, the represented number is zero.
If e = 128 (E = 255), and F = 0, the represented value is either positive or negative infinity, depending on the sign bit. This occurs when a result was too large in magnitude to store in a 32-bit value.
If e = 128 (E = 255), and F is not 0, the represented value is "NaN", or "not a number". Divide zero by zero, for example, and you'll get NaN as a result (a nonzero value divided by zero gives infinity instead).
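The special cases above could be checked before decoding with something like the sketch below. The enum and function names are hypothetical. One case goes beyond the list in the post: E = 0 with a nonzero F, which IEEE 754 defines as a denormalized number (no implied leading one).

```c
#include <stdint.h>

/* Hypothetical classification of a 32-bit IEEE 754 single. */
enum float_kind {
    FLOAT_KIND_ZERO,
    FLOAT_KIND_DENORMAL,   /* E = 0, F != 0: not covered in the post */
    FLOAT_KIND_NORMAL,
    FLOAT_KIND_INFINITY,
    FLOAT_KIND_NAN
};

enum float_kind classify_float_bits(uint32_t bits)
{
    uint32_t E = (bits >> 23) & 0xFFu;   /* biased exponent */
    uint32_t F = bits & 0x7FFFFFu;       /* 23-bit mantissa */

    if (E == 0)
        return F == 0 ? FLOAT_KIND_ZERO : FLOAT_KIND_DENORMAL;
    if (E == 255)
        return F == 0 ? FLOAT_KIND_INFINITY : FLOAT_KIND_NAN;
    return FLOAT_KIND_NORMAL;
}
```

The sign bit is ignored here, so both +0 and -0 report zero, and both infinities report infinity.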

I hope that helps a little.

-Ironblayde
Aeon Software
"Your superior intellect is no match for our puny weapons!"
Hey, thanx a lot, guys!
I think I understand it now, although it looks a little "not-so-nice".
Perhaps someone knows an easy C function which reads a float from a binary file or makes one from four bytes?

P.S.: I wanted to say 4 bytes, not 8, sorry!

Thanx again, XBTC!
Hi!

You could just use fread()! Just read 4 bytes to the address of a float variable.
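A minimal sketch of the fread() approach, with a made-up wrapper name read_float. It assumes the file stores IEEE 754 singles in the same byte order as the machine reading them, and that the file was opened in binary mode ("rb").

```c
#include <stdio.h>

/* Hypothetical helper: read one float stored as raw bytes at the
   current file position. Returns 1 on success, 0 on a short read
   or error. No byte-order conversion is done. */
int read_float(FILE *fp, float *out)
{
    return fread(out, sizeof *out, 1, fp) == 1;
}
```

If the file was written on a machine with the opposite byte order, the four bytes would have to be swapped before they are interpreted as a float.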

Hope this solves the prob,

MK42
erm, not sure if this is what you're looking for, but..

DWORD dw = b0 + (b1 << 8) + (b2 << 16) + ((DWORD)b3 << 24);
float *f = (float *)&dw;
// f now points to a float made up of four bytes that have
// to conform to the IEEE standard.  you can do the same
// thing with eight bytes and a double.
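A variant of the same idea, sketched with memcpy instead of a pointer cast, since reading a DWORD through a float pointer can run afoul of C's aliasing rules on some compilers. The name float_from_bytes is made up, and the sketch assumes float is a 32-bit IEEE 754 type, that b0 is the least significant byte, and that integers and floats use the same byte order on the target machine.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a float from four bytes, least
   significant byte first. Assumes sizeof(float) == 4. */
float float_from_bytes(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
    uint32_t bits = (uint32_t)b0
                  | ((uint32_t)b1 << 8)
                  | ((uint32_t)b2 << 16)
                  | ((uint32_t)b3 << 24);
    float f;
    memcpy(&f, &bits, sizeof f);   /* copy the bit pattern, no conversion */
    return f;
}
```

For instance, the bytes 0x00, 0x00, 0x80, 0x3F form the pattern 0x3F800000, which is 1.0.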




--
Float like a butterfly, bite like a crocodile.


This topic is closed to new replies.
