## MATLAB’s Mersenne Twister Random Number Generator: Seed 0 gives the same numbers as Seed 5489

Something that became clear from my recent comparison of Numpy’s Mersenne Twister implementation with MATLAB’s is that there is something funky going on with seed 0 in MATLAB. A discussion in the comments thread helped uncover what was going on. In short, seed 0 gives exactly the same random numbers as seed 5489 in MATLAB (unless you use their deprecated rand(‘twister’,0) syntax).

This is a potential problem for anyone who performs lots of simulations that make use of random numbers such as monte-carlo simulations. One common work-flow is to run the same program hundreds of times where only the seed differs between runs. This is probably good enough to ensure that each simulation uses a random number stream that is statistically independent from all of the others — There is a risk that some streams will overlap but the probability is low and most people are content to live with that risk.

The practical upshot of this is that if you intend on sticking with Mersenne Twister for your MATLAB monte-carlo simulations, it might be wise to avoid seed 0. Alternatively, move to a random number generator that guarantees non-overlapping, independent streams – something that any implementation of Mersenne Twister cannot do.

Here’s a demo run in MATLAB 2014a on Windows 7.

>> format long >> rng(0) >> rand(1,5)' ans = 0.814723686393179 0.905791937075619 0.126986816293506 0.913375856139019 0.632359246225410 >> rng(5489) >> rand(1,5)' ans = 0.814723686393179 0.905791937075619 0.126986816293506 0.913375856139019 0.632359246225410

This shows one way of dealing with the need for many independent streams

Skipping Ahead the Mersenne Twister Random Number Generator

http://www.nag.co.uk/IndustryArticles/mersenne_twister_rng.pdf

Reading your post I checked the C version of Mersenne Twister: http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/CODES/mt19937ar.c

I commented the code in main, and set succesively the two seeds and printed 5 values for each. As expected the two seeds led to different sequences of random numbers.

but with these lines:

int i;

init_genrand(5489);//set the seed

for(i=0;i<10;i++)

printf("\n %1.15lf ", genrand_real2());

I've seen that compared with the 5 values returned by MATLAB rand function (and np.random.random) it appears that MATLAB (Numpy) returns vals[2*i], i=0, … from those given by C code

~/AA2014$ gcc mt19937ar.c -o mersenne

~/AA2014$ ./mersenne

0.814723691903055

0.135477004107088

0.905791934113950

0.835008589783683

0.126986811868846

0.968867771094665

0.913375855656341

0.221034042770043

0.632359249982983

0.308167050359771

@Themos Say I use that routine to produce independent streams by skipping ahead by 2^N. I now know for sure that if I generate more than 2^N numbers in a stream, I am going to have problems. So, I just need to make sure I generate less than 2^N. All good stuff.

I have a question, however. Does the routine keep track of the number of calls I’ve made to it, and warn if I’ve made too many, or do I need to do it manually?

Hello Mike, the answer is “manually”, we don’t track consumption. But it should be pretty straightforward to encapsulate the underlying functionality in a module that does.

By the way, I recently came across a case where a RNG user unwittingly “sabotaged” the RNG by selecting a starting state (a “seed”) with extremely low entropy. In modern very-long-period RNGs this is something users need to be educated about. As a colleague put it

“[setting the seed] is useful for restarting the generator with the same sequence it had before, unless you are an expert in THAT PARTICULAR GENERATOR the only possible result of [starting the generator] with a user-specified seed is to make things worse.”

@Themos – which RNG was that for?

In the implementations of MT used in MATLAB, Python etc, the entire state is encoded in just one integer. How this maps onto the actual state, I have no idea.

@Mike, the Mersenne Twister inside NAG’s Fortran compiler. It provides full access to the entire RNG state, all 634 32-bit integers. What follows is the part of the manpage nagfor(1)

RANDOM NUMBER ALGORITHM

The random number generator supplied as the intrinsic subroutine RANDOM_NUMBER is the “Mersenne Twister”.

Note that this generator has a large state (630 32-bit integers) and an extremely long period (approx 10**6000), and therefore it is strongly recommended that the RANDOM_SEED routine only be used with a PUT argument that is the value returned by a previous call with GET; i.e., only to repeat a previous sequence. This is because if a user-specified seed has low entropy (likely since there are 630 values to be supplied), it is highly likely to set the generator to an apparently-low-entropy part of the sequence.

If you do want to provide your own seed (and thus entropy), you should store your values in the initial elements of the seed array and set all the remaining elements to zero — trailing zero elements will be ignored and not used to initialise the generator.

Note that the seed is a random bitstream, and is therefore expected to have approximately half of its bits nonzero (thus providing many small integer values will likely result in a low-entropy part of the Mersenne Twister sequence being reached).

Do you know how those 630 integers get mapped to 1 in the implementations discussed here?

Badly? :-)

I hope its not too bad since it seems that this mapping is used by several major implementations.

Hm afaiu, this is a special matlab behavior that you’ve got nowhere else! Not in ruby, nor in numpy, nor in octave.

in octave, the seed function is made by old fortran fuction.

https://github.com/markuman/linux/blob/master/octave/twister_seed.m will fix it.

the result is equal to numpy and ruby than (and for 5489 to matlab too).

octave:8> rand(‘twister’,twister_seed(5489))

octave:9> rand(5,1)

ans =

0.814723686393179

0.905791937075619

0.126986816293506

0.913375856139019

0.63235924622541

octave:10> rand(‘twister’,twister_seed(0))

octave:11> rand(5,1)

ans =

0.548813503927325

0.715189366372419

0.602763376071644

0.544883182996897

0.423654799338905

You can get the 624 (623?) integers that constitute the state of Mersenne-Twister PRNG:

### MATLAB:

% http://www.mathworks.com/help/matlab/ref/rng.html#btjqbki-2

rng(1) % seed

r = rng; disp(r.State) % 625×1 uint32 array (624 + 1 integer position)

### Python:

# http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.get_state.html

import numpy as np

np.random.seed(1) # seed

np.random.get_state() # tuple contains array of length 624, position, other things ..

I tried this in R, but the state is encoded as an array of signed integers, so I’m not sure how to it matches with the unsigned integers from MATLAB and NumPy:

### R

# http://www.inside-r.org/r-doc/base/.Random.seed

set.seed(1) # seed

.Random.seed # state vector of legnth 625

# (I think the second element denotes the current position)

As to how we map the seed number into the state vector, the Wikipedia article has a pseudocode for the mapping function:

// https://en.wikipedia.org/wiki/Mersenne_twister#Pseudocode

// Initialize the generator from a seed

function initialize_generator(int seed) {

index := 0

MT[0] := seed

for i from 1 to 623 { // loop over each other element

MT[i] := lowest 32 bits of(1812433253 * (MT[i-1] xor (right shift by 30 bits(MT[i-1]))) + i) // 0x6c078965

}

}

If you have Matlab Coder, you can see the underlying C code (or something fairly close) used by rng() to seed the default uniform random number generator. Create a simple M-file function like this:

function rngc(s) %#codegen

rng(s);

Then compile using codegen:

codegen -c -config:lib rngc -args uint32(0)

The generated code of interest will then be located at:

codegen/lib/rngc/rngc.c

Note that the generated code may be different if different input argument options are specified.