MATLAB’s Mersenne Twister Random Number Generator: Seed 0 gives the same numbers as Seed 5489

June 18th, 2014 | Categories: math software, matlab, random numbers | Tags:

Something that became clear from my recent comparison of Numpy’s Mersenne Twister implementation with MATLAB’s is that there is something funky going on with seed 0 in MATLAB. A discussion in the comments thread helped uncover what was going on. In short, seed 0 gives exactly the same random numbers as seed 5489 in MATLAB (unless you use their deprecated rand(‘twister’,0) syntax).

This is a potential problem for anyone who performs lots of simulations that make use of random numbers such as monte-carlo simulations. One common work-flow is to run the same program hundreds of times where only the seed differs between runs. This is probably good enough to ensure that each simulation uses a random number stream that is statistically independent from all of the others — There is a risk that some streams will overlap but the probability is low and most people are content to live with that risk.

The practical upshot of this is that if you intend on sticking with Mersenne Twister for your MATLAB monte-carlo simulations, it might be wise to avoid seed 0. Alternatively, move to a random number generator that guarantees non-overlapping, independent streams – something that any implementation of Mersenne Twister cannot do.

Here’s a demo run in MATLAB 2014a on Windows 7.

>> format long
>> rng(0)
>> rand(1,5)'

ans =

   0.814723686393179
   0.905791937075619
   0.126986816293506
   0.913375856139019
   0.632359246225410

>> rng(5489)
>> rand(1,5)'

ans =

   0.814723686393179
   0.905791937075619
   0.126986816293506
   0.913375856139019
   0.632359246225410
  1. June 18th, 2014 at 11:40
    Reply | Quote | #1

    This shows one way of dealing with the need for many independent streams

    Skipping Ahead the Mersenne Twister Random Number Generator

    http://www.nag.co.uk/IndustryArticles/mersenne_twister_rng.pdf

  2. Em
    June 18th, 2014 at 13:11
    Reply | Quote | #2

    Reading your post I checked the C version of Mersenne Twister: http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/CODES/mt19937ar.c
    I commented the code in main, and set succesively the two seeds and printed 5 values for each. As expected the two seeds led to different sequences of random numbers.

    but with these lines:

    int i;
    init_genrand(5489);//set the seed
    for(i=0;i<10;i++)
    printf("\n %1.15lf ", genrand_real2());

    I've seen that compared with the 5 values returned by MATLAB rand function (and np.random.random) it appears that MATLAB (Numpy) returns vals[2*i], i=0, … from those given by C code

    ~/AA2014$ gcc mt19937ar.c -o mersenne
    ~/AA2014$ ./mersenne

    0.814723691903055
    0.135477004107088
    0.905791934113950
    0.835008589783683
    0.126986811868846
    0.968867771094665
    0.913375855656341
    0.221034042770043
    0.632359249982983
    0.308167050359771

  3. Mike Croucher
    June 18th, 2014 at 13:20
    Reply | Quote | #3

    @Themos Say I use that routine to produce independent streams by skipping ahead by 2^N. I now know for sure that if I generate more than 2^N numbers in a stream, I am going to have problems. So, I just need to make sure I generate less than 2^N. All good stuff.

    I have a question, however. Does the routine keep track of the number of calls I’ve made to it, and warn if I’ve made too many, or do I need to do it manually?

  4. June 18th, 2014 at 13:26
    Reply | Quote | #4

    Hello Mike, the answer is “manually”, we don’t track consumption. But it should be pretty straightforward to encapsulate the underlying functionality in a module that does.

  5. June 18th, 2014 at 14:14
    Reply | Quote | #5

    By the way, I recently came across a case where a RNG user unwittingly “sabotaged” the RNG by selecting a starting state (a “seed”) with extremely low entropy. In modern very-long-period RNGs this is something users need to be educated about. As a colleague put it

    “[setting the seed] is useful for restarting the generator with the same sequence it had before, unless you are an expert in THAT PARTICULAR GENERATOR the only possible result of [starting the generator] with a user-specified seed is to make things worse.”

  6. Mike Croucher
    June 18th, 2014 at 14:42
    Reply | Quote | #6

    @Themos – which RNG was that for?

    In the implementations of MT used in MATLAB, Python etc, the entire state is encoded in just one integer. How this maps onto the actual state, I have no idea.

  7. June 18th, 2014 at 14:59
    Reply | Quote | #7

    @Mike, the Mersenne Twister inside NAG’s Fortran compiler. It provides full access to the entire RNG state, all 634 32-bit integers. What follows is the part of the manpage nagfor(1)

    RANDOM NUMBER ALGORITHM
    The random number generator supplied as the intrinsic subroutine RANDOM_NUMBER is the “Mersenne Twister”.

    Note that this generator has a large state (630 32-bit integers) and an extremely long period (approx 10**6000), and therefore it is strongly recommended that the RANDOM_SEED routine only be used with a PUT argument that is the value returned by a previous call with GET; i.e., only to repeat a previous sequence. This is because if a user-specified seed has low entropy (likely since there are 630 values to be supplied), it is highly likely to set the generator to an apparently-low-entropy part of the sequence.

    If you do want to provide your own seed (and thus entropy), you should store your values in the initial elements of the seed array and set all the remaining elements to zero — trailing zero elements will be ignored and not used to initialise the generator.

    Note that the seed is a random bitstream, and is therefore expected to have approximately half of its bits nonzero (thus providing many small integer values will likely result in a low-entropy part of the Mersenne Twister sequence being reached).

  8. Mike Croucher
    June 18th, 2014 at 15:58
    Reply | Quote | #8

    Do you know how those 630 integers get mapped to 1 in the implementations discussed here?

  9. June 18th, 2014 at 16:40
    Reply | Quote | #9

    Badly? :-)

  10. Mike Croucher
    June 19th, 2014 at 12:14

    I hope its not too bad since it seems that this mapping is used by several major implementations.

  11. June 20th, 2014 at 18:46

    Hm afaiu, this is a special matlab behavior that you’ve got nowhere else! Not in ruby, nor in numpy, nor in octave.

    in octave, the seed function is made by old fortran fuction.
    https://github.com/markuman/linux/blob/master/octave/twister_seed.m will fix it.

    the result is equal to numpy and ruby than (and for 5489 to matlab too).

    octave:8> rand(‘twister’,twister_seed(5489))
    octave:9> rand(5,1)
    ans =

    0.814723686393179
    0.905791937075619
    0.126986816293506
    0.913375856139019
    0.63235924622541

    octave:10> rand(‘twister’,twister_seed(0))
    octave:11> rand(5,1)
    ans =

    0.548813503927325
    0.715189366372419
    0.602763376071644
    0.544883182996897
    0.423654799338905

  12. Amro
    June 21st, 2014 at 05:55

    You can get the 624 (623?) integers that constitute the state of Mersenne-Twister PRNG:

    ### MATLAB:
    % http://www.mathworks.com/help/matlab/ref/rng.html#btjqbki-2
    rng(1) % seed
    r = rng; disp(r.State) % 625×1 uint32 array (624 + 1 integer position)

    ### Python:
    # http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.get_state.html
    import numpy as np
    np.random.seed(1) # seed
    np.random.get_state() # tuple contains array of length 624, position, other things ..

    I tried this in R, but the state is encoded as an array of signed integers, so I’m not sure how to it matches with the unsigned integers from MATLAB and NumPy:

    ### R
    # http://www.inside-r.org/r-doc/base/.Random.seed
    set.seed(1) # seed
    .Random.seed # state vector of legnth 625
    # (I think the second element denotes the current position)

    As to how we map the seed number into the state vector, the Wikipedia article has a pseudocode for the mapping function:

    // https://en.wikipedia.org/wiki/Mersenne_twister#Pseudocode
    // Initialize the generator from a seed
    function initialize_generator(int seed) {
    index := 0
    MT[0] := seed
    for i from 1 to 623 { // loop over each other element
    MT[i] := lowest 32 bits of(1812433253 * (MT[i-1] xor (right shift by 30 bits(MT[i-1]))) + i) // 0x6c078965
    }
    }

  13. March 23rd, 2015 at 19:39

    If you have Matlab Coder, you can see the underlying C code (or something fairly close) used by rng() to seed the default uniform random number generator. Create a simple M-file function like this:

    function rngc(s) %#codegen
    rng(s);

    Then compile using codegen:

    codegen -c -config:lib rngc -args uint32(0)

    The generated code of interest will then be located at:

    codegen/lib/rngc/rngc.c

    Note that the generated code may be different if different input argument options are specified.

  14. myschizobuddy
    May 4th, 2015 at 16:03

    Not related to this post but i stumbled upon a website that has prebuild true random numbers from atmospheric noise https://www.random.org/files/