
Double to float C++

Started by May 13, 2024 04:27 PM
181 comments, last by JoeJ 6 months, 1 week ago

I actually received some help there! I'm still working at it. Thanks again for your guys' insight.

I found this page:

https://stackoverflow.com/questions/16737615/how-is-floating-point-conversion-actually-done-in-cdouble-to-float-or-float

How would I go about manually doing the assembly in Visual Studio?
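One possible route, offered as a sketch rather than a definitive answer: Visual C++ only accepts inline `__asm` blocks when targeting 32-bit x86, so in an x64 build the usual way to issue a specific instruction is through compiler intrinsics. The SSE2 instruction that narrows a double to a float is `cvtsd2ss`, exposed as `_mm_cvtsd_ss` in `<emmintrin.h>`. The helper name `double_to_float_sse2` is made up for illustration, and the code assumes an x86 or x64 target with SSE2 and the default rounding mode.

```cpp
#include <emmintrin.h> // SSE2 intrinsics (x86/x64 only)

// Hypothetical helper: narrows a double to a float with the SSE2
// conversion instruction instead of a plain static_cast.
float double_to_float_sse2(double d)
{
    const __m128d src = _mm_set_sd(d);                      // d in the low lane
    const __m128 dst = _mm_cvtsd_ss(_mm_setzero_ps(), src); // cvtsd2ss
    return _mm_cvtss_f32(dst);                              // extract the low float
}
```

With the default round-to-nearest mode this should return the same value as `static_cast<float>(d)`, so it is mainly useful for stepping through the conversion in the debugger's disassembly view.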


I altered the source so that you can enable/disable GLUT using a #define at the top of main.cpp:

https://github.com/sjhalayka/mercury_gr

The line in question is in main.cpp, line 100:

https://github.com/sjhalayka/mercury_gr/blob/7586d4c4f916f058edd7bb588ff2c1146fc483ac/main.cpp#L100

taby said:
I altered the source so that you can enable/disable GLUT using a #define at the top of main.cpp:

But it's still some effort to make a project, include the files, and run it.
To avoid that work I copied your code earlier, but I could not reproduce the issue.

So you might want to make it easier to help you.

Basically, all that's needed would be this:

void proceed_Euler(/*custom_math::vector_3& pos, custom_math::vector_3& vel, const long double G, const long double dt*/)
{
	//const custom_math::vector_3 grav_dir = sun_pos - pos;
	//const double distance = grav_dir.length();
	//const double Rs = 2 * grav_constant * sun_mass / (speed_of_light * speed_of_light);
	

	const double speed_of_light = 10000; //?
	const double distance = 1000; //?
	const double Rs = 100 / (speed_of_light * speed_of_light); //?

	const double v = 10; //?

	//const double alpha = 2.0 - sqrt(1 - (vel.length() * vel.length()) / (speed_of_light * speed_of_light));

	const double alpha = 2.0 - sqrt(1 - (v * v) / (speed_of_light * speed_of_light));

	double beta = sqrt(1.0 - Rs / distance);


	// beta is currently X?, which is wrong imo

	
	beta = static_cast<float>(beta);
	
	// after the cast, beta is now Y?, which is right imo. wtf?


	//custom_math::vector_3 accel = grav_acceleration(pos, vel, G);
	//vel += accel * dt * alpha;
	//pos += vel * dt * beta;
}

Afaict, you only need to fill in the constants marked with a ?. Then we could reproduce the test case precisely and tell for sure what the cast actually changes here.
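As a rough illustration of what the cast does, here is the beta computation on its own, using only the placeholder values from the stripped-down function above. These are not the real constants from main.cpp, so this sketch does not reproduce the actual test case. The struct and function names are made up for the example.

```cpp
#include <cmath>

// Hypothetical container for the two versions of beta.
struct BetaResult {
    double as_double; // beta computed in double precision
    float as_float;   // the same value after static_cast<float>
};

// Uses the placeholder inputs marked with "?" above,
// NOT the real constants from main.cpp.
BetaResult compute_beta_placeholder()
{
    const double speed_of_light = 10000;
    const double distance = 1000;
    const double Rs = 100 / (speed_of_light * speed_of_light); // about 1e-6

    // Rs / distance is about 1e-9, so beta is about 1 - 5e-10.
    const double beta = std::sqrt(1.0 - Rs / distance);

    return { beta, static_cast<float>(beta) };
}
```

With these inputs the double result is slightly below 1, while the float result is exactly 1.0f: the floats just below 1 are spaced about 6e-8 apart, far coarser than the 5e-10 offset. If the real constants behave similarly, the cast turns `pos += vel * dt * beta` into a plain `pos += vel * dt`.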

Well, the closest thing that I can do is this:

double beta = sqrt(1.0 - Rs / distance);
double x = beta / std::numeric_limits<float>::epsilon();
beta = static_cast<float>(x) * std::numeric_limits<float>::epsilon();

// Edit: added this	
if (beta == 1)
	beta = 0.99999999999;
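A note on the scaled version above, as far as I can tell: `std::numeric_limits<float>::epsilon()` is a power of two (2^-23), so dividing by it and multiplying by it again only shifts the exponent. For values well inside the normal float range the rounding happens at the same mantissa bit, so the result should equal a direct `static_cast<float>(beta)`. That would also explain why the `beta == 1` check became necessary. The sketch below puts both variants side by side; the function names are made up for the comparison.

```cpp
#include <limits>

// Direct narrowing conversion, widened back to double.
double quantize_direct(double beta)
{
    return static_cast<float>(beta);
}

// The scaled variant from the post above: divide by float epsilon,
// cast to float, then multiply by float epsilon again.
double quantize_scaled(double beta)
{
    const double x = beta / std::numeric_limits<float>::epsilon();
    return static_cast<float>(x) * std::numeric_limits<float>::epsilon();
}
```

Calling both with the same beta, for example 0.9999999995, should give identical results, which suggests the scaling does not add any quantization beyond what the plain cast already does.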

I also tried:

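// Walks down from 1.0f in steps of float epsilon until the value
// no longer exceeds d. Intended for d in the range [0, 1].
float normalized_double_to_float(double d)
{
	float tempf = 1.0f;

	while (tempf > d && tempf > 0.0f)
		tempf -= std::numeric_limits<float>::epsilon();

	return tempf;
}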

Looks like beta being close to 1 is expected?

But how can this cause a large difference as said here:

If I comment the casting out, I get an answer of 7.75. If the casting remains, I get an answer of 42.66. The analytical solution gives an answer of 42.94.

I want a little code snippet producing those numbers, with all constants given and no dependencies.

However, I am very sure that hardware precision is not even close to enough for your needs.
You do need a library for higher-precision numbers, I guess.

I also assume that your ‘closely correct answer’ mentioned above was just random luck, but only a higher-precision library could confirm whether your math is right or wrong.

Which library would you recommend?

I don't believe it's pure luck – otherwise, God is the biggest trickster of them all. What I believe was lucky was finding out that if you further quantize gravitation, the effects of general relativity pop into existence.

taby said:
Which library would you recommend?

Never had a need for this. I'm a newtonian flatlander gamedev. :D

Here's a list: https://en.wikipedia.org/wiki/List_of_arbitrary-precision_arithmetic_software

I'd just try Boost.
But if you can be sure 128 bits is enough, a fixed 128-bit type is surely faster than arbitrary precision.

I'm finding that Boost.Multiprecision does not work with Visual C++.

This topic is closed to new replies.
