
Generating pseudorandom numbers with C++ rand() and srand()


Aug 09, 2023

content

Using rand() function and its problem

How the rand() function works

The time(0) function

How does time(0) solve our problem?

Can time(0) be used as the seed to start the sequence generation process?

A common programming error

Remember

Food for thought (for the more interested reader)

Limitations of PRGs

Conclusion

How the rand() function works#

The rand() function in C++ generates a pseudorandom number by performing arithmetic on the previously generated number, or on the seed value if it is the first call. The C++ standard does not prescribe a particular algorithm, but a common choice is a linear congruential generator (LCG): multiply the previous number by a constant, add another constant, and take the result modulo some value. The formula can be written as:

\begin{equation} X_{n+1} = (a X_n + c) \space mod \space m \end{equation}

Here $Z_p$ refers to the set of integers modulo $p$, where $p$ is a prime number. This set consists of the integers from 0 to $p-1$. For example, $Z_7$ consists of the integers $\{0, 1, 2, 3, 4, 5, 6\}$. In this set, addition and multiplication are performed modulo $p$. For instance, in $Z_7$, $4+5$ equals $2$ because $(4+5) \space mod \space 7 = 2$, and $4 \times 5$ equals $6$ because $(4 \times 5) \space mod \space 7 = 6$.

Another useful property of modular arithmetic is that, when working in any $Z_p$, the operands can be reduced before multiplying without changing the result:
$(a \times b) \space mod \space p \equiv ((a \space mod \space p) \times (b \space mod \space p)) \space mod \space p$
For example, in modulus 13:
$(20 \times 14) \space mod \space 13 \equiv ((20 \space mod \space 13) \times (14 \space mod \space 13)) \space mod \space 13 \equiv (7 \times 1) \space mod \space 13 \equiv 7$

Some implementations use a polynomial of small degree $d$ whose $d+1$ coefficients are, preferably, prime. The motivation is that prime numbers are considered to have good properties for generating seemingly random sequences within a prime modular field, such as being relatively far apart from each other and having a low correlation with other numbers. The polynomial looks like the following:

\begin{equation} X_{n+1} = (a_1 X_n^d + a_2 X_n^{d-1} + ... + a_{d+1}) \space mod \space m \end{equation}

Using prime numbers in the LCG can help to produce a more uniform distribution of random numbers.

In this blog, we will stick to a polynomial-based implementation of random number generation. Let’s revisit polynomials:

Polynomial

A polynomial is a mathematical function of the following form:
\begin{equation} f(x) = a_n x^n + a_{n-1} x^{n-1} + \ldots + a_1 x + a_0 \end{equation}
In this equation, $a_0, a_1, \ldots, a_n$ are the coefficients of the polynomial, $n$ is a non-negative integer that represents the degree of the polynomial (i.e., the highest power of $x$ in the expression), and $x$ is the independent variable.

Here’s an example of evaluating a polynomial at a particular value of the independent variable $x$:

Consider the polynomial:
\begin{equation} f(x) = 2x^3 + 3x^2 + 5x + 1 \end{equation}
To evaluate $f(x)$ at $x = 2$, we substitute $x = 2$ into the polynomial expression:
\begin{equation} f(2) = 2(2)^3 + 3(2)^2 + 5(2) + 1 = 16 + 12 + 10 + 1 = 39 \end{equation}
Therefore, $f(2) = 39$.

In our example, rand() is implemented using a polynomial evaluator called evals(), which evaluates the polynomial at a given argument. The rand() function also uses a global variable called seed. When rand() is called, it passes the current value of seed to evals(), takes the result modulo a large number, preferably a prime (in our example we use 593 so that the values stay within range), stores the result back in seed, and returns it. Since seed is a global variable, the next call to rand() passes this updated seed to evals() and generates the next number.

Look at the example implementation:

C++

#include <iostream>
using namespace std;

// Wrapped in a namespace so this toy rand() does not clash with the
// standard library's rand() declared in <cstdlib>.
namespace toy {

// Evaluate the polynomial 113 x^3 + 119 x^2 + 53 x + 13.
// Integer arithmetic avoids the rounding risk of the floating-point pow().
long long evals(long long x)
{
   return 113*x*x*x + 119*x*x + 53*x + 13;
}

// A global variable seed, with 0 as its initial value
int seed = 0;

int rand()
{
   return seed = evals(seed)%593; // Store the new value in seed and return it
}

} // namespace toy

int main()
{
   for(int i=1; i<=10; i++)
      cout<<toy::rand()<<endl;
   return 0;
}

Did you observe that the number sequence remains the same every time the program is executed?

Let us see why that is the case.

Every time the program is executed, seed is initialized to 0, so the first call evaluates the polynomial at 0 and every later value follows deterministically from that starting point.

Ideally, we do not want the sequence to repeat when we execute the program again. It should be random, or at least appear random.

One way around this issue is to change the seed value every time the program starts. That can be done with the srand() function. Here is a possible implementation:

C++

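#include <iostream>
using namespace std;

// Wrapped in a namespace so these toy functions do not clash with the
// standard library's rand() and srand() declared in <cstdlib>.
namespace toy {

// Evaluate the polynomial 113 x^3 + 119 x^2 + 53 x + 13.
// Integer arithmetic avoids the rounding risk of the floating-point pow().
long long evals(long long x)
{
   return 113*x*x*x + 119*x*x + 53*x + 13;
}

// A global variable seed, with 0 as its initial value
int seed = 0;

void srand(int v)
{
   seed = v; // Let the caller choose the starting point of the sequence
}

int rand()
{
   return seed = evals(seed)%593; // Store the new value in seed and return it
}

} // namespace toy

int main()
{
   int s;
   s = 10;
   toy::srand(s);
   for(int i=1; i<=10; i++)
      cout<<toy::rand()<<endl;
   return 0;
}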

Does this approach solve the problem?

The problem remains unresolved because we still need to change the seed value manually every time. The user (the programmer) has complete responsibility for, and control over, the seed value, which determines the entire sequence that follows. Anyone who knows the polynomial coded in the program above and also knows the seed knows the whole sequence, so there is no randomness.

Therefore, we need a solution in which the seed value is not under the user's control. So we need to answer the following question:

How can we create a different seed value at every program launch without leaving the choice to the user?

We aim to automatically assign a new seed each time the program starts, without requiring the user to pick it. However, if we want a completely random value for the seed each time, we run into the very problem we were trying to solve: we would need another random number generator just to initialize the seed at the beginning of each run.

Are we really back at the same problem?

The Unix epoch is a reference point that is used as a standard in computer systems to measure and compare time. It represents the starting point of the Unix timekeeping system, which measures time as the number of seconds that have elapsed since January 1st, 1970, 00:00:00 UTC. This system provides a universal and consistent representation of time across different platforms and programming languages. The Unix epoch is particularly useful for computer systems as it provides a standard for measuring time unaffected by different time zones or daylight saving time adjustments. This allows for more accurate and reliable time comparisons across different systems and applications.

A common programming error#

One thing to watch out for when generating random numbers, even just a few of them (for example, 10), is that the srand(time(0)) line must not be placed inside the loop. If it is, the program generates the same number again and again. The loop usually runs very fast, so it executes many times within a single second and time(0) returns the same value on each iteration. The seed is therefore reset to that same value every time, and the same number is generated repeatedly.

Look at the following erroneous code and also its correct implementation:

To solve this problem, our rand() function maintains two primes, $p_{small}$ and $p_{big}$, where $p_{small} \ll p_{big}$ (read as: $p_{small}$ is much smaller than $p_{big}$). Each value of the sequence is generated as follows:

The number $x$ is first generated in the set $Z_{p_{big}} = \{0,1,2,3,...,p_{big}-1\}$.
$x$ is saved as the seed, then reduced modulo $p_{small}$ to yield one of the numbers in $Z_{p_{small}} = \{0,1,2,3,...,p_{small}-1\}$, which is returned as the pseudorandom number.

Notice that the reduced value can now repeat in $Z_{p_{small}}$ even though $x$ has not yet repeated in $Z_{p_{big}}$, because multiple numbers in $Z_{p_{big}}$ map to the same number in $Z_{p_{small}}$.

C++

#include <iostream>
using namespace std;

// Wrapped in a namespace so these toy functions do not clash with the
// standard library's pow(), rand(), and srand().
namespace toy {

// Compute (x^y) mod `mod`, reducing after every multiplication so the
// intermediate result never grows out of range.
long long pow(long long x, int y, int mod)
{
    long long result = 1;
    for(int i=1; i<=y; i++)
    {
        result = (result*x)%mod;
    }
    return result;
}

// A global variable seed (initially 0) and the two primes
int seed = 0, psmall=593, pbig=1000000007;

// Polynomial 113 x^3 + 119 x^2 + 53 x + 13, each power reduced modulo pbig
long long evals(int x)
{
   long long ans = (113*pow(x,3,pbig) + 119*pow(x,2,pbig)
                     + 53*pow(x,1,pbig) + 13);
   return ans;
}

void srand(int v)
{
   seed = v;
}

int rand()
{
   seed = evals(seed)%pbig; // The new seed lives in the larger prime field
   return seed%psmall;      // Return the value shrunk into the smaller prime field
}

} // namespace toy

int main()
{
   int s;
   s = 10;
   toy::srand(s);
   for(int i=1; i<=1000; i++)
      cout<<toy::rand()<<endl;
   return 0;
}

In the pow() function, we pass a mod argument so that the result is reduced after every multiplication and never grows beyond the range of its integer type.
The global variables psmall=593 and pbig=1000000007 hold the two primes.
In rand(), we first compute the new seed modulo pbig, and then shrink it into the range of the smaller prime by taking it modulo psmall.

Visually, it can be seen in the following animation:

As an example, the outer ring works in modulo 29 and the inner ring works in modulo 5. Each number is first generated in modulo 29 and then mapped to modulo 5. Because several values on the outer ring map to the same value on the inner ring, the same output can appear multiple times.

Limitations of PRGs#

Pseudorandom number generators (PRNGs) driven by srand(time(0)) and rand() have the following limitations:

Limited period: Pseudorandom sequences repeat after a certain number of iterations (this is exactly why we work with two prime numbers).
Seed dependency: The same seed always generates the same sequence of numbers.
Lack of true randomness: Pseudorandom numbers are deterministic once the first seed is decided; they are not truly random.
Statistical properties: The quality and statistical properties of the generated numbers may be inadequate.

Conclusion#

In conclusion, we discussed pseudorandom number generators, which produce numbers that appear random but are in fact deterministic. We looked at the rand() function in C++ and the use of the srand() function to set a specific seed for generating pseudorandom numbers. We also explored using time-based seeds (using time(0)) to create pseudorandom sequences. Furthermore, while diving into how polynomials are used in generating pseudorandom sequences, we discussed the problem of generating duplicate sequences and how to avoid it by working with two primes: a smaller prime number and a much larger prime number.
