Random number generator
this wiki
See also w:c:NetHack:random number generator.
The random number generator is a deity who decides your fate. You can never guess what this god will do next. This divine being controls the flip of the coin and the roll of the dice. Random Number God is the creator of chance, the author of luck, the curtain who conceals the future. The priests of this god can only use probability to guess what might happen.
Contents
[show]Pseudo-random number generator Edit
Computer games use Random Number God to introduce luck and unpredictability. The problem is that computer processors only do predictable maths like addition and subtraction.
So the computer uses a pseudo-random number generator. The PRNG uses a formula to compute the number. The PRNG also keeps n bits of state, so that multiple uses of the formula compute a sequence of numbers. The PRNG has at most 2**n states, so the sequence must repeat and become a cycle after at 2**n generated numbers. We say that the period of the PRNG is at most 2**n (read 2 to the power of n); the period might be less than 2**n. If we know the state and the formula, then we can predict the next number. If we only know the period, then we can predict the next number after we observe one entire cycle.
The PRNG must not always start in the same position in the sequence. A program can use a seed to set the initial state of the PRNG.
- A typical seed is a function of the current time and the process ID. The first caveat is that players who know the current time and the process ID can start a duplicate PRNG to predict the numbers. The second caveat is that some systems have only one process and no current time. These systems must use some other seed.
- Some programs use entropy to make a seed. Entropy refers to random variations in the hardware. These variations might occur in uninitialized RAM, key presses, mouse movements or incoming network packets.
Uniform distribution Edit
The typical PRNG uses the uniform distribution, which gives an equal probability to every number inside the range, and a probability of zero to every number outside the range. If the range contains m numbers, then the probability to generate each number is 1/m.
Consider a generator that yields the integers 0..255 in a uniform distribution. The probability to generate 13 equals 1/256. The probability to generate 20 equals 1/256. The probability to generate 280 equals zero, because 280 is greater than 255. The generator yields 13 or 20 with equal probability, but never generates 280.
Modulo bias Edit
Suppose rng() yields 0..7, but you want 0..2. A modulo bias occurs when you use rng() modulo 3 to yield 0..2. Consider all eight cases:
- rng() yields 0, so rng() modulo 3 is 0.
- rng() yields 1, so rng() modulo 3 is 1.
- rng() yields 2, so rng() modulo 3 is 2.
- rng() yields 3, so rng() modulo 3 is 0.
- rng() yields 4, so rng() modulo 3 is 1.
- rng() yields 5, so rng() modulo 3 is 2.
- rng() yields 6, so rng() modulo 3 is 0.
- rng() yields 7, so rng() modulo 3 is 1.
If rng() uses the uniform distribution, each case has probability 1/8. Three cases yield 0, so 0 has probability 3/8. Three cases yield 1, so 1 has probability 3/8. Two and only two cases yield 2, so 2 has probability 2/8. The modulo bias is the bias against yielding 2. To eliminate the modulo bias, we would need to reroll if rng() yields 6 or yields 7.
Suppose rbyte() yields 0..255. A modulo bias occurs when you use rng() modulo 6 to yield 0..5. This method yields each of 0..3 with probability 43/256. This method yields each of 5..6 with probability 42/246. The modulo bias is the bias against yielding 5 or yields 6. To eliminate the modulo bias, we would need to reroll if rng() yields 252..255. (Notice that 6 * 42 => 252. We eliminate the modulo bias by using 0..251, because this yields each of 0..6 with probability 42/252 => 1/6.)
You can choose to ignore the modulo bias. If you find a generator that yields 0..255, and you need to roll a six-sided die, then you can use rng() modulo 6. Your players might believe that the die is fair, and never notice the difference between 42/256 and 43/256. Many existing games ignore the modulo bias.
Examples Edit
We enumerate some typical random number generators.
BSD rand_r() Edit
The Posix/Unix standards specify a rand_r() function. BSD systems have this implementation, which I quote from /usr/include/stdlib.h and /usr/src/lib/libc/stdlib/rand.c of an OpenBSD system.
#define RAND_MAX 0x7fffffff int rand_r(u_int *seed) { *seed = *seed * 1103515245 + 12345; return (*seed % ((u_int)RAND_MAX + 1)); }
The u_int *seed is the entire state of the rand_r() generator. In most BSD systems, a u_int holds only 32 bits.
rand_r() uses a simple formula called a "linear congruential algorithm". The use of 1103515245 and 12345 seems to make a uniform distribution.
Here is a port of rand_r() to the Ruby language:
def rand_r(seed) seed[0] = seed[0] * 1103515245 + 12345 seed[0] % (0x7fffffff + 1) end
We can test our port from irb, the interactive Ruby prompt:
irb(main):044:0> state = [40] => [40] irb(main):045:0> (1..5).map { rand_r(state) } => [1190949185, 1027880678, 984196135, 124319444, 1591599229]
The first big problem with rand_r() is that the lower bits have shorter cycles. This is obvious we try to use the low bit from rand_r() for coin flips:
irb(main):046:0> state = [40] => [40] irb(main):047:0> (1..20).map { rand_r(state) & 1 } => [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
Heads, tails, heads, tails, heads, tails is so predictable. The period of 2 is so obvious.
The second big problem is that the random number contains 31 bits of a 32-bit state. With only two possibilities for the other bit, we can too easily guess the state of the RNG (and thus predict the future numbers) after looking at only one number.
For example, if the random number is 1493459905 (which equals 0x590463c1), then the state must be either 0x590463c1 or 0xd90463c1. Then the next number is either 1703507814 or 1703507814.
irb(main):056:0> rand_r(state).to_s 16 => "590463c1" irb(main):057:0> rand_r([0x590463c1]) => 1703507814 irb(main):058:0> rand_r([0xd90463c1])' => 1703507814 irb(main):059:0> rand_r(state) => 1703507814
Yes, both guesses are correct, the next number after 1493459905 is 1703507814.
The OpenBSD manual page warns that rand_r() is obsolete, and so are rand() and srand(), because they use rand_r() in OpenBSD. Other random number generators are better.
BSD random() Edit
BSD also has the random() function, which is a much better random number generator. The random() function is a BSD tradition from 1983. The manual page says:
- "The random() and srandom() functions have (almost) the same calling sequence and initialization properties as rand(3)/srand(3). The difference is that rand produces a much less random sequence -- in fact, the low dozen bits generated by rand go through a cyclic pattern. All the bits generated by random() are usable. For example, `random()&01' will produce a random binary value."
With the default configuration, the random() generator uses a state table of 124 bytes, plus 4 bytes for miscellaneous information, plus two pointers to the state table. The seed is the 992-bit initial value of the state table.
The state table holds 31 values, each 4 bytes. The front pointer always points three values ahead of the rear pointer. The pointers wrap to stay in the table. To generate a random number, the random() generator
- adds the rear value to the front value, then
- copies the high 31 bits of the front value to the random value, then
- moves each pointer to the next value, then
- returns the random value.
Here is a port of the 992-bit configuration of random() to Ruby language:
def random() value = (@state[@front] + @state[@rear]) & 0xffffffff @state[@front] = value @front = (@front + 1) % 31 @rear = (@rear + 1) % 31 value >> 1 end
To seed the Ruby version, fill @state with 31 values, each 32-bit; then do @front = 4 and @rear = 1.
This generator normally produces a uniform distribution in range 0..(2**31 - 1). However, if @state becomes all zeros, then the generator yields all zeros. If @state contains a pattern, then the generator might yield a pattern.
irb(main):030:0> @state = (0..30).to_a => [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21 , 22, 23, 24, 25, 26, 27, 28, 29, 30] irb(main):031:0> @front = 4 => 4 irb(main):032:0> @rear = 1 => 1 irb(main):033:0> (0..9).map { random } => [2, 3, 4, 6, 7, 9, 11, 13, 15, 17] irb(main):034:0> 990.times { random } => 990 irb(main):035:0> (0..4).map { random } => [144751481, 34014938, 894873289, 1887256469, 1684676240]
For example, if @state contains the pattern [0, 1, 2, ..., 30], then the first ten numbers in the generated sequence are [2, 3, 4, 6, 7, 9, 11, 13, 15, 17]. The pattern eventually ends. If you discard the first 1000 numbers in the sequence, then you find no pattern.
This is not a flaw in the generator; [2, 3, 4, 6, 7, 9, 11, 13, 15, 17] is as valid as any other random sequence of ten numbers. You create a flaw if you always seed @state with a pattern, such that the generator always yields a pattern.
One solution is to use another random number generator to yield the 992-bit seed for this generator! In BSD, the srandom() function takes a small seed and uses a linear congruential generator to yield the 992-bit seed for the random() generator. The newer srandomdev() function grabs a 992-bit seed from a generator in the BSD kernel.
ARC4 Edit
The ARC4 generator resembles the BSD random() generator. Both generators use a state table with two pointers. The major difference is how ARC4 uses its own random numbers to stir its table.
ARC4 has two components: an algorithm to seed the 1024-bit table from a key, and an algorithm to generate a random number. arc4.cpp and arc4.h in Crypto++ provide a good reference to ARC4.
The state table holds 256 values, each 1 byte. The pointers wrap to stay in the table. To generate a random value, the ARC4 generator
- copies the front value to a,
- adds a to the rear pointer,
- copies the rear value to b,
- swaps the front value with the rear value in the table,
- moves the front pointer to the next value,
- moves the final pointer to index (a + b) modulo 256 in the table, then
- returns the final value.
ARC4 yields a byte in range 0..255, but you can form a larger value from multiple bytes.
Here is a port of the ARC4 generator from Crypto++ to Ruby:
def arc4 a = @state[@front] @rear = (@rear + a) & 0xff b = @state[@rear] @state[@front] = b @state[@rear] = a @front = (@front + 1) & 0xff @state[(a + b) & 0xff] end
To seed the Ruby version, fill @state with 256 values, each 8-bit; then do @front = 1 and @rear = 0.
Like the BSD random() generator, the ARC4 generator yields all zeros if the @state becomes all zeros, and the ARC4 generator might yield a pattern if @state contains a pattern.
irb(main):086:0> @state = (0..255).to_a; @front = 1; @rear = 0 => 0 irb(main):087:0> (0..9).map { arc4 } => [2, 5, 7, 13, 13, 23, 31, 40, 40, 56] irb(main):088:0> 990.times { arc4 } => 990 irb(main):089:0> (0..9).map { arc4 } => [108, 190, 113, 236, 71, 43, 77, 164, 169, 26]
Mersenne Twister Edit
Mersenne Twister is a random number generator with a very long period. Many newer programs use Mersenne Twister because it seems to be the best available generator. WLA DX embeds Mersenne Twister in main.c and uses it for the .seed, .dbrnd and .dwrnd assembler directives.
Mersenne Twister uses a large state table, which contains 624 values, each 4 bytes. This table occupies 2496 bytes. (Contrast 256 bytes for ARC4, 124 bytes for BSD random().) The front pointer always points one value ahead of the rear pointer. A side pointer always points 397 values ahead of the rear pointer. The pointers wrap to stay in the table. To generate a random value, Mersenne Twister
- sets a to zero,
- moves bit 31 of the high value to bit 30 of a,
- moves bits 30..1 of the low value to bits 29..0 of a,
- sets a to the bitwise-xor of a and the side value,
- sets a to the bitwise-xor of a and one of two constant values,
- moves a to the rear value, then
- sets the random value to a function of the rear value,
- moves each pointer to the next value in the table, then
- returns the random value.
Mersenne Twister yields an unsigned 32-bit number.
Ruby uses Mersenne Twister for its built-in Kernel#srand and Kernel#rand methods.
$ irb irb(main):001:0> (0..4).map { rand(1 << 32) } => [3197605409, 4184161947, 2237093816, 3773594805, 4068277585]