Repeated reseeding of a PRNG from <random> isn't reproducible

130 Views Asked by At

Is there a deterministic random number generator in <random> in C++?

The point in question is that the below code on my Windows machine:

#include<iostream>
#include<random>
int main(){
  std::mt19937 g;
  std::normal_distribution<double> d;
  for(int i=0;i<100;++i){
    g.seed(65472381);
    std::cout << "List[65472381] = " << d(g) << "\n";
  }
}

produces the following result:

List[65472381]=0.972683
List[65472381]=-0.773812
List[65472381]=0.972683
List[65472381]=-0.773812
List[65472381]=0.972683
List[65472381]=-0.773812
...

My confusion is with the fact that 0.972683 != -0.773812 albeit the seed it reset to 65472381 each time before g is used.

My processor is a Zen 2 and the OS is Windows 10 Pro, Version 22H2. The compiler is GCC (x86_64-w64-mingw32/12.2.0). But from testing the code online on different virtual machines and compilers, it seems the result will likely be the same on your machine as well.

What I actually seek is a way to acquire the i-th number from an arbitrary fixed universal list of length 4,294,967,295 of randomly distributed numbers in SPACETIME O(1), implying that no element of the list be ever stored.

2

There are 2 best solutions below

3
tbxfreeware On BEST ANSWER

The distribution object has internal state. So, after you reseed the random number engine, you must reset the distribution to clear its internal state. This is true for all the engines and distributions in the Standard Library. In general, when you seed an engine, you should also reset the distribution.

#include<iostream>
#include<random>

int main() {
    std::mt19937 g;
    std::normal_distribution<double> d;
    for (int i = 0; i < 10; ++i) {
        g.seed(65472381);
        d.reset();
        std::cout << "List[65472381] = " << d(g) << "\n";
    }
}

Here is the output:

List[65472381] = -0.773812
List[65472381] = -0.773812
List[65472381] = -0.773812
List[65472381] = -0.773812
List[65472381] = -0.773812
List[65472381] = -0.773812
List[65472381] = -0.773812
List[65472381] = -0.773812
List[65472381] = -0.773812
List[65472381] = -0.773812
Efficient discard function for Mersenne Twister?

Which C++ random number engines have a O(1) discard function?

The answers to this Stack Overflow question explain that there is a "fast jump" algorithm for Mersenne Twister. That means that an efficient discard function is possible. Unfortunately, the C++ Standard does not mandate that an implementation use it.

You may discover that your system is one that has an efficient discard. You cannot, however, assume that every system will.

If you decide to use discard, be sure to reset the distribution afterwards. Otherwise, the values generated by the distribution are not guaranteed to be repeatable.

Is std::mt19937 portable?

As I noted in the comments below, the C++ Standard mandates that std::mt19937 be portable. It must generate the same sequences of random numbers on every implementation.

I used the following program to generate 10 values from std::mt19937. I ran it with the latest version of MSVC. If you run it on your system, you should get the same output.

#include<iostream>
#include<random>
int main() {
    std::mt19937 g;
    unsigned seed{ 65472381u };
    g.seed(seed);
    std::cout << "Seed : " << seed << "\n\n";
    for (int i = 0; i < 10; ++i) {
        std::cout << "g() : " << g() << "\n";
    }
}

Here is the output.

Seed : 65472381

g() : 3522518833
g() : 1238868101
g() : 1353561095
g() : 3289615924
g() : 1455032182
g() : 573730142
g() : 700682001
g() : 2371867773
g() : 3721872455
g() : 2742745620
Random number distributions are not portable

For better or worse, the C++ Standard does not require that distributions such as std::normal_distribution be the same in every implementation. In general, they are not portable.

1
Yksisarvinen On

Every generator (aside from possibly std::random_device) is deterministic in C++.

However, distributions can (and do) hold state. On my compiler, std::normal_distribution calls generator 4 times and uses that to generate 2 numbers for the distribution. For the following code:

#include <iostream>
#include <random>

std::mt19937 g;

struct MyGenerator {
    auto operator()() {
        std::cout << "I'm called!\n";
        return g();
    }
    static auto max() { return g.max(); }
    static auto min() { return g.min(); }
};

int main() {
    std::normal_distribution<double> d;
    MyGenerator gen;
    g.seed(65472381);
    std::cout << "List[0] = " << d(gen) << "\n";
    g.seed(65472381);
    std::cout << "List[1] = " << d(gen) << "\n";
    g.seed(65472381);
    std::cout << "List[2] = " << d(gen) << "\n";
}

I get the following output:

List[0] = I'm called!
I'm called!
I'm called!
I'm called!
0.972683
List[1] = -0.773812
List[2] = I'm called!
I'm called!
I'm called!
I'm called!
0.972683

That explains the output you see. Every second call to d(g) doesn't request any new entropy from g, instead it relies on the previously generated entropy.

You can call reset() method on your distribution object to reset any internal state it holds and make sure you get consistent results when generator is seeded with same value.


Getting i-th number is (likely) impossible in O(1) time. You can call discard() method on your std::mt19937 engine to skip some entropy it generates, but it has 2 problems:

  • it is not guaranteed that it will be faster than a loop (it may or may not be, no guarantees)
  • you don't actually know how may bits of entropy are required by your distribution, so you don't know how many to discard to reach i-th number it would produce, it's all internal behaviour of your standard library implementation and can change at any time