Reproducing MATLAB random numbers in Python

June 16th, 2014 | Categories: math software, matlab, programming, python, random numbers | Tags:

When porting code between MATLAB and Python, it is sometimes useful to produce the exact same set of random numbers for testing purposes.  Both Python and MATLAB currently use the Mersenne Twister generator by default so one assumes this should be easy…and it is…provided you use the generator in Numpy and avoid the seed 0.

Generate some random numbers in MATLAB

Here, we generate the first 5 numbers for 3 different seeds in MATLAB. Our aim is to reproduce these in Python.

>> format long
>> rng(0)
>> rand(1,5)'

ans =

   0.814723686393179
   0.905791937075619
   0.126986816293506
   0.913375856139019
   0.632359246225410

>> rng(1)
>> rand(1,5)'

ans =

   0.417022004702574
   0.720324493442158
   0.000114374817345
   0.302332572631840
   0.146755890817113

>> rng(2)
>> rand(1,5)'

ans =

   0.435994902142004
   0.025926231827891
   0.549662477878709
   0.435322392618277
   0.420367802087489

Python’s default random module

According to the documentation,Python’s random module uses the Mersenne Twister algorithm but the implementation seems to be different from MATLAB’s since the results are different.  Here’s the output from a fresh ipython session:

In [1]: import random

In [2]: random.seed(0)

In [3]: [random.random() for _ in range(5)]
Out[3]: 
[0.8444218515250481,
 0.7579544029403025,
 0.420571580830845,
 0.25891675029296335,
 0.5112747213686085]

In [4]: random.seed(1)

In [5]: [random.random() for _ in range(5)]
Out[5]: 
[0.13436424411240122,
 0.8474337369372327,
 0.763774618976614,
 0.2550690257394217,
 0.49543508709194095]

In [6]: random.seed(2)

In [7]: [random.random() for _ in range(5)]
Out[7]: 
[0.9560342718892494,
 0.9478274870593494,
 0.05655136772680869,
 0.08487199515892163,
 0.8354988781294496]

The Numpy random module

Numpy’s random module, on the other hand, seems to use an identical implementation to MATLAB for seeds other than 0. In the below, notice that for seeds 1 and 2, the results are identical to MATLAB’s. For a seed of zero, they are different.

In [1]: import numpy as np

In [2]: np.set_printoptions(suppress=True)

In [3]: np.set_printoptions(precision=15)

In [4]: np.random.seed(0)

In [5]: np.random.random((5,1))
Out[5]: 
array([[ 0.548813503927325],
       [ 0.715189366372419],
       [ 0.602763376071644],
       [ 0.544883182996897],
       [ 0.423654799338905]])

In [6]: np.random.seed(1)

In [7]: np.random.random((5,1))
Out[7]: 
array([[ 0.417022004702574],
       [ 0.720324493442158],
       [ 0.000114374817345],
       [ 0.30233257263184 ],
       [ 0.146755890817113]])

In [8]: np.random.seed(2)

In [9]: np.random.random((5,1))
Out[9]: 
array([[ 0.435994902142004],
       [ 0.025926231827891],
       [ 0.549662477878709],
       [ 0.435322392618277],
       [ 0.420367802087489]])

Checking a lot more seeds

Although the above interactive experiments look convincing, I wanted to check a few more seeds. All seeds from 0 to 1 million would be a good start so I wrote a MATLAB script that generated 10 random numbers for each seed from 0 to 1 million and saved the results as a .mat file.

A subsequent Python script loads the .mat file and ensures that numpy generates the same set of numbers for each seed. It outputs every seed for which Python and MATLAB differ.

On my mac, I opened a bash prompt and ran the two scripts as follows

matlab -nodisplay -nodesktop -r "generate_matlab_randoms"
python python_randoms.py

The output was

MATLAB file contains 1000001 seeds and 10 samples per seed
Random numbers for seed 0 differ between MATLAB and Numpy

System details

  • Late 2013 Macbook Air
  • MATLAB 2014a
  • Python 2.7.7
  • Numpy 1.8.1
  1. Júlio Hoffimann Mendes
    June 16th, 2014 at 22:47
    Reply | Quote | #1

    What is special about seed 0? I remember reading 3rd-party code that considered seed 0 a special case. Does anyone have an explanation or reference?

  2. June 17th, 2014 at 06:30
    Reply | Quote | #2

    Matlab and Numpy has already equal random numbers.

    In Matlab

    >> rand(‘twister’,0)
    >> rand

    ans =

    0.548813503927325

    In Numpy

    >>> import numpy
    >>> numpy.random.seed(0)
    >>> numpy.random.random()
    0.5488135039273248

  3. sam
    June 17th, 2014 at 09:29
    Reply | Quote | #3

    I wonder if it is also worth checking that if you pick one seed and then generate a million random numbers that they are the same.

  4. Mike Croucher
    June 17th, 2014 at 10:41
    Reply | Quote | #4

    @markuman – That syntax is discouraged by Mathworks.

    http://www.mathworks.co.uk/help/matlab/math/updating-your-random-number-generator-syntax.html

    Instead of

    rand(‘twister’,seed)

    They suggest that we should use

    rng(seed)

    The document linked to above gives us the key as to why a seed of zero is special in MATLAB. Mathworks have chosen a seed of zero to be ‘default’ and it seems that they have chosen a seed of 5489 to be this default and so, in MATLAB, a seed of 0 is equivalent to 5489….at least if you use the rng(seed) syntax”

    >> rng default
    >> rand(1,5)’

    ans =

    0.8147
    0.9058
    0.1270
    0.9134
    0.6324

    >> rng(0)
    >> rand(1,5)’

    ans =

    0.8147
    0.9058
    0.1270
    0.9134
    0.6324

    >>rng(5489)
    >> rand(1,5)’

    ans =

    0.8147
    0.9058
    0.1270
    0.9134
    0.6324

    Thanks for the questions, they helped me realise why 0 is special :)

  5. Amro
    June 21st, 2014 at 06:58
    Reply | Quote | #5

    In the reference implementation (http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/emt19937ar.html), if init_genrand() had not been called before to initialize the state vector, it is then initialized with a hardcoded seed value of 5489 upon the first call to genrand_int32().

    Now MATLAB interprets rng(0) as a special case to initializing the state vector with the default hardcoded value 5489 that the original authors used. This is how the RNG is initialized when MATLAB starts.

    NumPy on the other hand does not have a fixed default value for the seed (the default is “None” which tells it to get a random seed from /dev/urandom or some equivalent). So a seed value of 0 is literally interpreted as such, and used as-is to initialize the internal state vector.

    Reference:
    http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.RandomState.html
    https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/randomkit.c#L138
    https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/mtrand.pyx#L640

    In the old MATLAB syntax of rand(‘twister’,0), apparently the seed value of zero is interpreted as is as well. You can verify this fact by inspecting the state vector:

    >> rand(‘twister’,0) % old style
    >> r=rng;
    >> r.State{4} % r.State is a 1×6 cell array, contains both integers and floats (?)

    versus:

    >>> np.random.seed(0)
    >>> np.random.get_state()[1]

  6. Mike Croucher
    June 23rd, 2014 at 09:42
    Reply | Quote | #6

    Thanks for the follow up, Amro. Greatly appreciated.