More variance with larger dice

Started by May 09, 2015 07:33 PM
6 comments, last by Brain 9 years, 8 months ago

Either I'm doing something stupid in my code or this is what should be expected based on the math. I think it's the former.

Why is it that when I add more faces, i.e. go from a 4-sided die to a 6-sided die, my variance explodes? Did I do something stupid in my code?

When I have a 4-sided die with two 1s and two 0s and run a Monte Carlo simulation totaling ten million rolls, my total bounces around zero, give or take ~300.

When I run the simulation with a 6-sided die, my total bounces around 0 with a variance in the tens of thousands.

I'm almost 100% sure there's something wrong with my code, because when I make the prizes { 1, -2, 3, -4, 5, -3 }, my EV should be near zero, yet it's almost always a large negative number.

Here's my code

http://pastie.org/10180218
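
As a sanity check on what the math alone predicts, here is a minimal sketch that computes the per-roll mean and variance of a fair die from its prize list, and the standard deviation of the total after a given number of independent rolls. The names rollStats, RollStats and totalStdDev are made up for this illustration and are not taken from the linked code.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Mean and variance of a single roll of a fair die whose faces pay `prizes`.
struct RollStats {
    double mean;
    double variance;
};

RollStats rollStats(const std::vector<int>& prizes) {
    assert(!prizes.empty());
    double sum = 0.0;
    double sumSq = 0.0;
    for (int p : prizes) {
        sum += p;
        sumSq += static_cast<double>(p) * p;
    }
    const double n = static_cast<double>(prizes.size());
    const double mean = sum / n;
    // Var(X) = E[X^2] - (E[X])^2
    return {mean, sumSq / n - mean * mean};
}

// Standard deviation of the running total after `rolls` independent rolls:
// variances of independent rolls add, so it grows like sqrt(rolls).
double totalStdDev(const std::vector<int>& prizes, double rolls) {
    return std::sqrt(rolls * rollStats(prizes).variance);
}
```

For the prize list { 1, -2, 3, -4, 5, -3 } this gives a mean of 0 and a per-roll variance of 64/6, so after ten million rolls totalStdDev comes out at roughly 10,000. A total that wanders by that much around zero would be expected from the math; a total that is almost always negative would not.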

Don't use rand(); it's statistically not very good at being random. C++ has the <random> header, which provides a number of generators you can use:
http://www.cplusplus.com/reference/random/

Using ideone.com, we can run your code and see results, with a minor modification to avoid a timeout (making simulationSize smaller):

http://ideone.com/ZlBFyG

This is one result from using the Mersenne Twister.

There are other issues with your implementation as well. I recommend that you read this article: http://eternallyconfuzzled.com/arts/jsw_art_rand.aspx

Actually, this video explains why there *is* something wrong with rand:

http://channel9.msdn.com/Events/GoingNative/2013/rand-Considered-Harmful

In a nutshell: use mt19937, uniform_int_distribution and uniform_real_distribution. They are available in Boost if you can't use C++11.
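
A minimal sketch of what that advice could look like for the dice simulation, assuming C++11. The function name simulateTotal and its parameters are hypothetical, since the original code is only linked; the seed is passed in so a run can be repeated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Rolls a fair die `rolls` times and returns the sum of the prizes.
// prizes[i] is the payout for face i.
long long simulateTotal(const std::vector<int>& prizes,
                        std::size_t rolls,
                        std::uint32_t seed) {
    assert(!prizes.empty());
    std::mt19937 engine(seed);  // Mersenne Twister instead of rand()
    // The distribution picks a face index without the bias of rand() % n.
    std::uniform_int_distribution<std::size_t> face(0, prizes.size() - 1);
    long long total = 0;
    for (std::size_t i = 0; i < rolls; ++i) {
        total += prizes[face(engine)];
    }
    return total;
}
```

A call such as simulateTotal({1, -2, 3, -4, 5, -3}, 10000000, std::random_device{}()) would use a fresh seed on each run; passing a fixed seed instead makes the result reproducible while debugging.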


I thought that the article did a good job explaining the basic concepts but the conclusion about rand() is not right as you have noted. It's worth pointing out that messing up the seed can be just as harmful as using a number generator with poor statistical properties. I've played console games where clever players somehow managed to find out how the game created the seed from in-game actions and then managed to manipulate it to get extremely rare drops whenever they wanted to.

Trying to keep a good random distribution is tricky.

Even apart from the direct problem of having a solid number generator covered above, you can start with a good distribution and still break it with inappropriate operations.

Consider your mod operation, %, in your code. You may not think about it, but it destroys randomness. Even if you start with a perfectly random source of bits you can undo that distribution.

Say you extract a single 32-bit word that is statistically random, then run it through modulo division as you did, x % 6. A 32-bit word has 2^32 possible values, and 2^32 is not evenly divisible by six. Neither is 2^16, or any other power of two. Think back to when you studied prime factors: 6 is 2*3, while every power of two is 2, 2*2, 2*2*2, 2*2*2*...*2. At no point will it have a factor of 3 in it; it will only ever be composed of 2's, one per bit.

So if you had 16 random bits in a word and used mod six, the values 4 and 5 would be slightly less common: 65536 = 6*10922 + 4, so 0 through 3 each get one extra hit.
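
A brute-force count makes the size of the effect concrete. This is only a sketch; countResidues is a name invented for the example, and it simply walks every value that fits in the given number of bits and tallies x % modulus.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Counts how often each result of x % modulus occurs when x runs over
// every value representable in `bits` bits.
std::vector<std::uint64_t> countResidues(unsigned bits, unsigned modulus) {
    assert(bits >= 1 && bits <= 24);  // keep the brute-force loop small
    assert(modulus >= 1);
    std::vector<std::uint64_t> counts(modulus, 0);
    const std::uint64_t limit = std::uint64_t{1} << bits;
    for (std::uint64_t x = 0; x < limit; ++x) {
        ++counts[x % modulus];
    }
    return counts;
}
```

For countResidues(16, 6) the results 0 through 3 each occur 10923 times and the results 4 and 5 each occur 10922 times, so the short-changed values are the last two.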

It is very easy for programmers to accidentally add patterns and bias through coding mistakes. That is one reason for heavy regulation of gambling.

I don't know if mod "destroys randomness". If you start with a 32-bit number and use % 6 to get a random number in {0,1,2,3,4,5}, the first 4 results will happen with probability .16666666674427688121 and the last 2 with probability .16666666651144623756. Now, if I give you a PRNG in a black box and I ask you to tell me whether it's completely fair or it has had its randomness "destroyed" by this issue, how many times do you think you'll have to query the PRNG to be confident of your answer?
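
Those two probabilities can be reproduced without looping over all 2^32 values. A small sketch, with residueProbability being a hypothetical helper name:

```cpp
#include <cassert>
#include <cstdint>

// Probability that x % modulus == residue when x is uniform over all
// 2^32 values of a 32-bit word.
long double residueProbability(std::uint32_t residue, std::uint32_t modulus) {
    assert(modulus >= 1 && residue < modulus);
    const std::uint64_t range = std::uint64_t{1} << 32;
    // Every residue appears floor(range / modulus) times; the first
    // (range % modulus) residues appear once more.
    std::uint64_t hits = range / modulus;
    if (residue < range % modulus) {
        ++hits;
    }
    return static_cast<long double>(hits) / static_cast<long double>(range);
}
```

With modulus 6, residues 0 to 3 get 715827883 hits each and residues 4 and 5 get 715827882, which is a difference of exactly 1/2^32 in probability.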

Besides that, notice how in his code he is mapping {0,1,2,3,4,5} to {-1,1} using an array in which 4 and 5 are given different results. So even after "destroying randomness", he should get exactly 50% of -1 and 50% of +1.
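
To illustrate that cancellation, here is a sketch that sums a payout table over every possible input word. The table used in the note after the code is an assumption for illustration; it is not copied from the linked code, it only shares the property that faces 4 and 5 pay opposite amounts.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Sums payout[x % 6] over every value x representable in `bits` bits.
// A result of 0 means +1 and -1 come up exactly equally often.
long long payoutBalance(const std::array<int, 6>& payout, unsigned bits) {
    assert(bits >= 1 && bits <= 24);  // keep the brute-force loop small
    long long balance = 0;
    const std::uint64_t limit = std::uint64_t{1} << bits;
    for (std::uint64_t x = 0; x < limit; ++x) {
        balance += payout[x % 6];
    }
    return balance;
}
```

With the hypothetical table { 1, -1, 1, -1, 1, -1 } and 16 bits the balance is exactly 0. A table that gives 4 and 5 the same sign, such as { 1, -1, -1, -1, 1, 1 }, ends up at -2 over the same 65536 inputs, which is the tiny bias the modulo introduces.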

@Ed Welch, thanks for this. Watching this video was half an hour of my time well spent, and very useful in all programming. I wasn't even aware of the mathematical issues of using modulus or 1.0 / RAND_MAX to create a ranged distribution of rand(). You learn something new every day! :)

This topic is closed to new replies.
