Advertisement

PRNG Question

Started by December 01, 2015 04:52 PM
14 comments, last by Waterlimon 9 years ago

Does anyone know a good example of a pseudo-random number generator that has the following properties:

- Efficient in time and memory

- Fairly long period

- Able to generate 32 bit numbers

- Randomizing once from each of a list of adjacent seeds (1, 2, 3, 4, 5, 6, ...) will produce pseudo-random numbers

If possible, please provide either an explanation of the algorithm, or just the math of it, or just a link to such.

Note: this does NOT need to be cryptographically secure (why does spell check think I misspelled that? Oh well), but I don't mind if it is.

Xorshift is probably a good candidate. If you are going to use consecutive seeds, I suggest you generate some numbers (say, 64 of them) and discard them as part of the seeding procedure.

Advertisement

I second XorShift. There are a number of variants depending on your needs, utilizing different numbers of state bits, providing different number of output bits, and involving some extra operations to improve quality, such as XorShift+ and XorShift*.

I'll also recommend the following lecture by Melissa O'Neill of Harvey Mudd College: "PCG: A Family of Better Random Number Generators" In particular, during the second half she talks about taking a variety of imperfect generators and applying some simple operations on them to substantially improve their quality.

"We should have a great fewer disputes in the world if words were taken for what they are, the signs of our ideas only, and not for things themselves." - John Locke

The C++ standard library provides a number of such PRNG engines and the code is entirely readable. Well, the code for ther Mersenne Twister can get obscure, but the basic linear congruential and lagged Fibonnaci generators are straightforward.

If you look at the sources for the GCC <random> library I even included some references in the comments.

Stephen M. Webb
Professional Free Software Developer

As far as I understand, numbers produced by a PRNG will only be random relative to other numbers from that same sequence of numbers, not relative to numbers from a different sequence (with a different seed).

This might give you patterns if you do something like plot numbers from 2 random sequences as (x,y) coordinates or such.

If you want to generate random numbers from a sequence of seeds (not a sequence of random numbers from a singular seed), look into hash functions instead.

o3o

As far as I understand, numbers produced by a PRNG will only be random relative to other numbers from that same sequence of numbers, not relative to numbers from a different sequence (with a different seed).

Only true for crappy PRNG's.


If you want to generate random numbers from a sequence of seeds (not a sequence of random numbers from a singular seed), look into hash functions instead.

Sure, what you're looking for is the term "DRBG" (deterministic random bit generator) which can easily be built from hash functions, they tend to be a bit slow with hash functions though. Better to build them from block ciphers, it's more efficient.

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

Advertisement

Bacterius, so you're implying I should use a ctyptographic algorithm to create my random numbers?

Alvaro, I don't understand - why would I make a bunch of numbers and discard them? Anyway, I can't make a whole lot of numbers ahead of time. I need to make them as needed.

Alvaro, I don't understand - why would I make a bunch of numbers and discard them? Anyway, I can't make a whole lot of numbers ahead of time. I need to make them as needed.


If you use the algorithm I pointed to (say, the first piece of code in that Wikipedia page) and you use a naive way to seed it, the first pseudo-random number you get out of it when you use similar seeds won't look random at all. So I was suggesting generating a bunch of numbers and discarding them right after you seed it, as a method to create a less predictable initial state.

You don't need to make a whole lot of numbers ahead of time. Are you working in C++? I'll give you some code so you understand what I mean.

It might be useful to explain to us why you need it.

Trying to hopefully clean some of this up:

* NONE of the generators so far mentioned result in unpredictable numbers. You can store the internal state of any one of them, then use that state and get the same pattern of numbers on many machines.

* ALL of these generators allow you to specify the seed, or the starting number for the internal state.

* SOME of these generators allow user-provided parameters for number generation. These numbers have sometimes complex interactions with the PRNG. Often they are set by someone studying the details of the machine and understanding the math, then determining values that work well. For example, one common Mersenne Twister implementation by default uses some internal numbers of 0x9908B0DF, 0x9D2C5680, and 0xEFC60000. You can adjust these if you happen to know what they are doing and why they do what they do.

Remember that these are just number generators. They make a series of numbers. Hopefully they are random enough for your purposes, but they might not fit your purposes. There are all kinds of fancy statistical tests people can use, but those tests are not the same as your needs.

For which you should use, it depends a great deal on what you are using it for.

Some algorithms are just fine for generating data in game but terrible if you are looking for certain statistical probabilities.

For example, you may be just fine with a simple built-in numeric generator. In one game I worked on there was limited availability of items in the shop; we took the in-game date which was a simple integer, took the date's hash, and used that for an RNG seed, then shuffled the store's inventory. This gave us a consistent value that was random enough and easy to reproduce. Day 1 all the shops had the same items on sale, day 2 all the shops had different items on sale, etc. The seed number could be generated as many times as we needed for testing, meaning the store's content would be the same every time the player entered the shop on that in-game date.

Other times those built-in numeric generators won't work for you. You may pick a fancy algorithm and use it to generate a lot of data. You try it out, but over the course of development you discover it is not appealing for some reason. You can adjust and tune the parameters if you know what you are doing, but if you don't, you might need to pick a different generator. It isn't that the PRNG is itself flawed, just that it doesn't meet the properties you happen to need.

Depending on your specifics, Boost provides a bunch of number generators that may meet your needs.

So... What are you going to use it for?


Bacterius, so you're implying I should use a ctyptographic algorithm to create my random numbers?

I'm saying that it's one surefire way to do it (that also provides a bunch more advantages when done right, such as an extremely small memory footprint, the ability to seed, zero issues with poorly distributed seeds, good parallelization properties, optimal quality of output, and effectively infinite period). Sure, they may be slightly slower than a highly optimized dedicated generator, but:

- random number generation is almost never the bottleneck in software as almost all generators are completely CPU-bound (and if it is, there are ways around that, and if you seriously need every last clock cycle for some embedded or HPC task then you are definitely an outlier and already know the tradeoff between output quality and the amount of work you'll need to put in to actually achieve that)

- on the other hand, random number quality is actually a much bigger deal, and you avoid plenty of subtle issues with the way "dedicated" generators expect to be used, e.g. "weak seeds" or the all-too-common "yeah, the first few bytes kinda don't really look random, just skip them" defect

So, yes, it's a heavy-duty solution, but it's a flexible one that I've come to rely on and that has never once failed me. Why settle for anything less when the cost is so low, and the benefits so great? I have nothing against things like the Mersenne twister or xorshift, they are nice and noteworthy algorithms in their own right, I just find them inferior in almost every way to the above when it comes to practical matters. Necessary reading if you are interested in knowing more.

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

This topic is closed to new replies.

Advertisement