Benchmark Fastest Way To Create NumPy Random Numbers

November 10, 2023 Python Benchmarking

You can benchmark NumPy random number array functions and discover the fastest approaches to use in different circumstances.

Generally, the modern numpy.random.Generator should be used over the legacy numpy.random.RandomState because it is significantly faster.

When generating random floats, using a type of float32 is faster than float64. When generating random integers, using int16 and int32 can be faster than other types, and perhaps faster again if unsigned. When generating random booleans, generating 0 and 1 integers and storing them in an array with the type numpy.bool_ is the fastest.

In this tutorial, you will discover how to benchmark and discover the fastest way to generate NumPy arrays of random values.

Let's get started.

Need Fast NumPy Random Numbers

Random numbers are a big part of many NumPy programs.

We need randomness in many programs such as simulations, optimization algorithms, learning algorithms, and more.

Generating random values is typically slow given that the pseudorandom number generator must use a complex mathematical operation. Therefore, we are interested in ways of generating the randomness we require in the fastest way possible.

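There are two main randomness APIs in NumPy:

- Legacy API: numpy.random.RandomState and the module-level numpy.random functions.
- Modern API: numpy.random.Generator, created via numpy.random.default_rng().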

Which one is faster?

Further, there are several functions that we can use to generate numbers. Which one is the fastest?

We can explore this question from a few angles.

Firstly, we will explore how to create arrays of random floating point values using functions such as:

- numpy.random.rand()
- numpy.random.random()
- numpy.random.random_sample()
- rng.random()

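We will then explore how to create arrays of random integer values, with functions such as:

- numpy.random.randint()
- numpy.random.random_integers()
- numpy.random.choice()
- rng.integers()
- rng.choice()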

We will then use a mixture of these functions to create arrays of random boolean values, a capability not provided by the NumPy random APIs.
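
As a rough illustration of the two APIs, the sketch below creates small arrays of floats, integers, and booleans with each. The size of 5 and the seed of 1 are arbitrary values chosen for this sketch, not the benchmark settings.

```python
# sketch: the same kinds of arrays from the legacy and modern APIs
import numpy

# small illustrative size (an assumption, not the benchmark size)
SIZE = 5

# legacy API: module-level functions backed by a global RandomState
numpy.random.seed(1)
legacy_floats = numpy.random.rand(SIZE)            # floats in [0, 1)
legacy_ints = numpy.random.randint(0, 101, SIZE)   # integers in [0, 100]

# modern API: an explicit Generator object
rng = numpy.random.default_rng(1)
modern_floats = rng.random(SIZE)                   # floats in [0, 1)
modern_ints = rng.integers(0, 101, SIZE)           # integers in [0, 100]
modern_bools = rng.integers(0, 2, SIZE, dtype=numpy.bool_)  # True or False

print(legacy_floats.dtype, legacy_ints.dtype)
print(modern_floats.dtype, modern_ints.dtype, modern_bools.dtype)
```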

Benchmark NumPy Random Numbers

We can explore the question of how fast the different approaches to creating NumPy random numbers are using benchmarking.

In this case, we will use an approach to creating random number arrays of a modest fixed size, then repeat this process many times to give an estimated time. We can then compare the times to see the relative performance of the approaches tested.

You can use this approach to benchmark your own favorite NumPy array operations.

If you use or extend the NumPy benchmarking approach used in this tutorial, let me know in the comments below. I'd love to see what you come up with.

We could use the time.perf_counter() function directly and develop a helper function to perform the benchmarking and report results.
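
A minimal sketch of such a helper might look like the following. The function name benchmark() and its parameters are hypothetical, and the array size and repeat count are kept small so the sketch runs quickly.

```python
# sketch: a hand-rolled benchmark helper using time.perf_counter()
import time
import numpy

def benchmark(func, repeats):
    """Return the total seconds taken to call func() repeats times."""
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    return time.perf_counter() - start

# example use with the modern generator (small sizes, for illustration only)
rng = numpy.random.default_rng(1)
duration = benchmark(lambda: rng.random(100_000), 20)
print(f'rng.random() {duration:.3f} seconds')
```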

Instead, in this case, we will use the timeit API, specifically the timeit.timeit() function and specify the string of array code to run and a fixed number of times to run it.

We will also provide the globals argument for any constants defined in our benchmark code, such as array size or shape.

For example:

...
# benchmark a thing
result = timeit.timeit('...', globals=globals(), number=N)
print(f'approach {result:.3f} seconds')

The number of runs in each benchmark was tuned to ensure that each snippet was executed in more than one second and less than about 10 seconds.
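
The tutorial does not say how the run counts were tuned. One possible aid, sketched below, is the standard library's timeit.Timer.autorange(), which picks a run count that takes at least 0.2 seconds in total. Timer.repeat() can then repeat the measurement to reduce noise. The array size and the explicit namespace dictionary are choices made for this sketch.

```python
# sketch: let timeit suggest a run count, then repeat the measurement
import timeit
import numpy

# smaller array than in the benchmarks, to keep the sketch quick
SHAPE = 100_000
# explicit namespace passed via the globals argument
namespace = {'numpy': numpy, 'SHAPE': SHAPE}
timer = timeit.Timer('numpy.random.rand(SHAPE)', globals=namespace)
# autorange() increases the run count until the total time is at least 0.2 seconds
number, total = timer.autorange()
print(f'{number} runs took {total:.3f} seconds')
# repeat the timing a few times and keep the fastest as the least noisy estimate
times = timer.repeat(repeat=3, number=number)
print(f'best of 3: {min(times):.3f} seconds')
```

The suggested count can then be scaled up by hand until each snippet runs for the desired duration.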

Let's get started.

Fastest Way to Create 1D NumPy Array of Random Floats

We can explore the fastest way to create a modestly sized NumPy array of random floating point values in [0,1).

In this case, we will create a fixed size 1d array with one million elements (1,000,000) of random floats with the default data type, float64 on most platforms. Each approach will be used to create an array 2,000 times.

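The approaches we will compare include the most common NumPy functions for creating a 1d array of random floats, including:

- numpy.random.rand()
- numpy.random.random_sample()
- rng.random()
- rng.random(out=A), which fills an existing array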

The complete example is listed below.

# SuperFastPython.com
# benchmark creating a 1d array of random floats
import numpy
import timeit
# size and shape of the arrays to create
SHAPE = 1000000
# number of times to run each snippet
N = 2000
# numpy.random.rand()
result = timeit.timeit('numpy.random.rand(SHAPE)', globals=globals(), number=N)
print(f'numpy.random.rand() {result:.3f} seconds')
# numpy.random.random_sample()
result = timeit.timeit('numpy.random.random_sample(SHAPE)', globals=globals(), number=N)
print(f'numpy.random.random_sample() {result:.3f} seconds')
# rng.random()
result = timeit.timeit('rng=numpy.random.default_rng(1);rng.random(SHAPE)', globals=globals(), number=N)
print(f'rng.random() {result:.3f} seconds')
# rng.random(out=A)
result = timeit.timeit('A=numpy.empty(SHAPE);rng=numpy.random.default_rng(1);rng.random(out=A)', globals=globals(), number=N)
print(f'rng.random(A) {result:.3f} seconds')

Running the example benchmarks each approach and reports the total execution time.

numpy.random.rand() 11.946 seconds
numpy.random.random_sample() 11.740 seconds
rng.random() 6.538 seconds
rng.random(A) 6.660 seconds

We can restructure the output into a table for comparison.

Approach                     | Time (sec)
-----------------------------|------------
numpy.random.rand()          | 11.946
numpy.random.random_sample() | 11.740
rng.random()                 | 6.538
rng.random(A)                | 6.660

We can see that the two approaches that use the legacy API have a similar execution time of around 12 seconds, whereas the two approaches that use the more modern API have an execution time that is a little more than half the time.

This highlights that we should be using the modern NumPy random number generation API to generate floats if speed is important.

We can also see that it may be slightly faster to have the rng.random() function create and populate the array than to create an empty array ourselves and have rng.random() populate it for us.

The difference is small, although re-running the benchmark test shows a similar pattern in performance.

numpy.random.rand() 11.846 seconds
numpy.random.random_sample() 11.833 seconds
rng.random() 6.552 seconds
rng.random(A) 6.760 seconds
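
Note that the snippets above create the generator, and in the out= case the empty array, on every run. A variation that might be worth trying is to move that one-off work into the setup argument of timeit.timeit(), so only the generation itself is timed and the out= buffer is reused. The sketch below assumes a smaller array and run count than the benchmark, and no timings are claimed for it.

```python
# sketch: time only the generation step by creating rng and A once in setup
import timeit
import numpy

# smaller values than the benchmark, to keep the sketch quick
SHAPE = 100_000
N = 20
namespace = {'numpy': numpy, 'SHAPE': SHAPE}
# the setup code runs once per timeit() call and is not included in the timing
setup = 'rng = numpy.random.default_rng(1); A = numpy.empty(SHAPE)'
# allocate a new array on each run
new_array = timeit.timeit('rng.random(SHAPE)', setup=setup, globals=namespace, number=N)
print(f'rng.random() {new_array:.3f} seconds')
# reuse the same preallocated array on each run
reused = timeit.timeit('rng.random(out=A)', setup=setup, globals=namespace, number=N)
print(f'rng.random(out=A) {reused:.3f} seconds')
```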

Fastest Way to Create 2D NumPy Array of Random Floats

We can explore the fastest way to create a modestly sized two-dimensional NumPy array of random floats, e.g. a matrix.

Each array will have the size (1000,1000) and we will run each method 1,000 times.

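The 2d nature of the array allows us to explore additional approaches, such as:

- numpy.random.rand() followed by reshape()
- numpy.random.random()
- numpy.random.random_sample()
- rng.random()
- rng.random(out=A)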

The complete example is listed below.

# SuperFastPython.com
# benchmark creating a 2d array of random floats
import numpy
import timeit
# size and shape of the arrays to create
SHAPE = (1000,1000)
# number of times to run each snippet
N = 1000
# numpy.random.rand()
result = timeit.timeit('numpy.random.rand(SHAPE[0]*SHAPE[1]).reshape(SHAPE)', globals=globals(), number=N)
print(f'numpy.random.rand() {result:.3f} seconds')
# numpy.random.random()
result = timeit.timeit('numpy.random.random(SHAPE)', globals=globals(), number=N)
print(f'numpy.random.random() {result:.3f} seconds')
# numpy.random.random_sample()
result = timeit.timeit('numpy.random.random_sample(SHAPE)', globals=globals(), number=N)
print(f'numpy.random.random_sample() {result:.3f} seconds')
# rng.random()
result = timeit.timeit('rng=numpy.random.default_rng(1);rng.random(SHAPE)', globals=globals(), number=N)
print(f'rng.random() {result:.3f} seconds')
# rng.random(out=A)
result = timeit.timeit('A=numpy.empty(SHAPE);rng=numpy.random.default_rng(1);rng.random(out=A)', globals=globals(), number=N)
print(f'rng.random(A) {result:.3f} seconds')

Running the example benchmarks each approach and reports the total execution time.

numpy.random.rand() 5.894 seconds
numpy.random.random() 5.854 seconds
numpy.random.random_sample() 5.826 seconds
rng.random() 3.248 seconds
rng.random(A) 3.319 seconds

We can restructure the output into a table for comparison.

Approach                     | Time (sec)
-----------------------------|------------
numpy.random.rand()          | 5.894
numpy.random.random()        | 5.854
numpy.random.random_sample() | 5.826
rng.random()                 | 3.248
rng.random(A)                | 3.319

Again, we see a clear distinction between the execution time of the legacy API at nearly 6 seconds and the modern API at just over 3 seconds.

It seems all of the functions used in the legacy API have a similar performance of about 5.8 seconds. It is likely that, behind the scenes, each function calls the same internal function for generating the random floating point values.

As with the previous example, the modern random number generator that creates the array for us and populates it is slightly faster than us creating an empty array and having it populated.
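
The suggestion above that the legacy functions share an internal routine can be probed informally. The sketch below seeds the legacy generator identically before each call and checks whether the three functions return the same values. The helper name sample() is hypothetical, and matching output is consistent with a shared routine rather than proof of one.

```python
# sketch: check whether the legacy float functions draw the same values
import numpy

def sample(func, seed=1, size=5):
    """Seed the legacy global generator, then draw size floats with func."""
    numpy.random.seed(seed)
    return func(size)

a = sample(numpy.random.rand)
b = sample(numpy.random.random)
c = sample(numpy.random.random_sample)
# report whether all three functions produced identical arrays
print(numpy.array_equal(a, b) and numpy.array_equal(b, c))
```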

Float Data Types Matter

The data type of the array matters.

We would expect that a larger data type requires more random bits to be generated.

Therefore, we might expect that an array of float32 will be faster to create than an array of random float64 values.

We can explore this with the rng.random() function. We will generate a 1d array with one million elements with float32 and then again with float64 random values and repeat the process 2,000 times.

The complete example is listed below.

# SuperFastPython.com
# benchmark creating a 1d array of random floats
import numpy
import timeit
# size and shape of the arrays to create
SHAPE = 1000000
# number of times to run each snippet
N = 2000
# rng.random(float32)
result = timeit.timeit('rng=numpy.random.default_rng(1);rng.random(SHAPE, dtype=numpy.float32)', globals=globals(), number=N)
print(f'rng.random(float32) {result:.3f} seconds')
# rng.random(float64)
result = timeit.timeit('rng=numpy.random.default_rng(1);rng.random(SHAPE, dtype=numpy.float64)', globals=globals(), number=N)
print(f'rng.random(float64) {result:.3f} seconds')

Running the example benchmarks each approach and reports the total execution time.

rng.random(float32) 5.200 seconds
rng.random(float64) 6.766 seconds

We can restructure the output into a table for comparison.

Approach            | Time (sec)
--------------------|------------
rng.random(float32) | 5.200
rng.random(float64) | 6.766

We can see that our expectations were confirmed.

It is faster to create an array of floats with the smaller data type of float32 compared to the larger data type of float64.

Where possible we should use the smallest possible data type when generating floating point values in order to reduce execution time.
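
Besides generation time, the smaller type also halves the memory used by the array. A short sketch confirming the data type and size in bytes of each array:

```python
# sketch: compare the memory footprint of float32 and float64 random arrays
import numpy

SHAPE = 1_000_000
rng = numpy.random.default_rng(1)
small = rng.random(SHAPE, dtype=numpy.float32)
large = rng.random(SHAPE, dtype=numpy.float64)
# 4 bytes per float32 value, 8 bytes per float64 value
print(small.dtype, small.nbytes)  # float32 4000000
print(large.dtype, large.nbytes)  # float64 8000000
```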

Fastest Way to Create 1D NumPy Array of Random Integers

We can explore the fastest way to create a modestly sized NumPy array of random integer values.

In this case, we will create a fixed size 1d array with one million elements (1,000,000) of random integers between 0 and 100 (inclusive) with the default data type, int64 on most platforms. Each approach will be used to create an array 1,000 times.

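The approaches we will compare include the most common NumPy functions for creating a 1d array of random integers, including:

- numpy.random.randint()
- numpy.random.random_integers()
- numpy.random.choice()
- rng.integers()
- rng.choice()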

The complete example is listed below.

# SuperFastPython.com
# benchmark creating a 1d array of random integers
import numpy
import timeit
# size and shape of the arrays to create
SHAPE = 1000000
# range of values
LOW, HIGH = 0, 100
# number of times to run each snippet
N = 1000
# numpy.random.randint()
result = timeit.timeit('numpy.random.randint(LOW, HIGH+1, SHAPE)', globals=globals(), number=N)
print(f'numpy.random.randint() {result:.3f} seconds')
# numpy.random.random_integers() (deprecated since NumPy 1.11 in favor of randint())
result = timeit.timeit('numpy.random.random_integers(LOW, HIGH, SHAPE)', globals=globals(), number=N)
print(f'numpy.random.random_integers() {result:.3f} seconds')
# numpy.random.choice()
result = timeit.timeit('numpy.random.choice(numpy.arange(HIGH+1), SHAPE)', globals=globals(), number=N)
print(f'numpy.random.choice() {result:.3f} seconds')
# rng.integers()
result = timeit.timeit('rng=numpy.random.default_rng(1);rng.integers(LOW, HIGH+1, SHAPE)', globals=globals(), number=N)
print(f'rng.integers() {result:.3f} seconds')
# rng.choice()
result = timeit.timeit('rng=numpy.random.default_rng(1);rng.choice(numpy.arange(HIGH+1), SHAPE)', globals=globals(), number=N)
print(f'rng.choice() {result:.3f} seconds')

Running the example benchmarks each approach and reports the total execution time.

numpy.random.randint() 7.353 seconds
numpy.random.random_integers() 7.237 seconds
numpy.random.choice() 9.864 seconds
rng.integers() 2.630 seconds
rng.choice() 5.877 seconds

We can restructure the output into a table for comparison.

Approach                       | Time (sec)
-------------------------------|------------
numpy.random.randint()         | 7.353
numpy.random.random_integers() | 7.237
numpy.random.choice()          | 9.864
rng.integers()                 | 2.630
rng.choice()                   | 5.877

We can see a distinction in execution time between the legacy and modern APIs, as we did when generating random floats.

We can also see that the choice() approach is generally slower than generating random integers directly.

In this case, the fastest approach was rng.integers() and is the preferred approach when generating an array of random integers.
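
The benchmark passes HIGH+1 because the upper bound of rng.integers() is exclusive by default. As an alternative sketch, the endpoint=True argument makes the upper bound inclusive. The sample size of 1,000 is arbitrary.

```python
# sketch: two equivalent ways to draw integers in [LOW, HIGH] with rng.integers()
import numpy

LOW, HIGH = 0, 100
rng = numpy.random.default_rng(1)
# exclusive upper bound, so add one to include HIGH
exclusive = rng.integers(LOW, HIGH + 1, 1000)
# inclusive upper bound via endpoint=True
inclusive = rng.integers(LOW, HIGH, 1000, endpoint=True)
print(exclusive.min(), exclusive.max())
print(inclusive.min(), inclusive.max())
```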

Integer Data Types Matter

The data type of the integer array matters.

We would expect that a larger data type requires more random bits to be generated.

Therefore, we might expect that an array of int32 will be faster to create than an array of random int64 values. Similarly, we may expect int16 to be faster again, and int8 to be the fastest of all.

We can explore this with the rng.integers() function. We will generate a 1d array with one million random integer values between 0 and 100 with each integer type (8, 16, 32, and 64 bits) and repeat the process 2,000 times.

It may also be interesting to contrast the results between signed and unsigned data types. Recall that signed types allow negative values, whereas unsigned types only allow non-negative values and offer a larger range in the positive domain.

The complete example is listed below.

# SuperFastPython.com
# benchmark creating a 1d array of random integers with different types
import numpy
import timeit
# size and shape of the arrays to create
SHAPE = 1000000
# range of values
LOW, HIGH = 0, 100
# number of times to run each snippet
N = 2000
# list of types to compare
types = ['int8', 'uint8', 'int16', 'uint16', 'int32', 'uint32', 'int64', 'uint64']
# benchmark each data type
for t in types:
    result = timeit.timeit(f'rng=numpy.random.default_rng(1);rng.integers(LOW, HIGH+1, SHAPE, dtype=numpy.{t})', globals=globals(), number=N)
    print(f'rng.integers({t}) {result:.3f} seconds')

Running the example benchmarks each approach and reports the total execution time.

rng.integers(int8) 14.848 seconds
rng.integers(uint8) 14.808 seconds
rng.integers(int16) 4.478 seconds
rng.integers(uint16) 4.466 seconds
rng.integers(int32) 4.582 seconds
rng.integers(uint32) 4.568 seconds
rng.integers(int64) 5.132 seconds
rng.integers(uint64) 5.148 seconds

We can restructure the output into a table for comparison.

Approach             | Time (sec)
---------------------|------------
rng.integers(int8)   | 14.848
rng.integers(uint8)  | 14.808
rng.integers(int16)  | 4.478
rng.integers(uint16) | 4.466
rng.integers(int32)  | 4.582
rng.integers(uint32) | 4.568
rng.integers(int64)  | 5.132
rng.integers(uint64) | 5.148

The results are fascinating.

Firstly, we can see that generally generating unsigned integers was slightly faster in most cases (except int64 types).

The expectation is that fewer bits would be faster to generate.

Contrary to that expectation, we can see that the int8 types were the slowest to generate, taking about three times as long as the other types.

We can see that there was very little difference between int16 and int32 types and int64 random integers were slower to generate by about half a second.

It suggests that we may want to use an unsigned int16 or int32 type when generating random ints, as long as the type can hold the range required.
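
The check that a type can hold the required range can be made explicit with numpy.iinfo(). In the sketch below, the helper name fits() and the order of candidate types are assumptions based on the timings above, and may not be the best order on other systems.

```python
# sketch: pick the first candidate integer type that can hold the range
import numpy

def fits(dtype, low, high):
    """Return True if every value in [low, high] is representable in dtype."""
    info = numpy.iinfo(dtype)
    return info.min <= low and high <= info.max

LOW, HIGH = 0, 100
# candidate order is an assumption drawn from the benchmark results above
candidates = [numpy.uint16, numpy.uint32, numpy.int64]
dtype = next(d for d in candidates if fits(d, LOW, HIGH))
rng = numpy.random.default_rng(1)
values = rng.integers(LOW, HIGH, 10, dtype=dtype, endpoint=True)
print(values.dtype)
```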

Fastest Way to Create 1D NumPy Array of Random Booleans

We can explore the fastest way to create a modestly sized NumPy array of random boolean values.

These are values that are either True or False.

In this case, we will create a fixed-size 1d array with one million elements (1,000,000) of random boolean values or integers between 0 and 1 (inclusive). If possible, we will try and set the type to be numpy.bool_. Each approach will be used to create an array 2,000 times.

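The numpy.random APIs do not provide a dedicated function for creating arrays of random booleans, therefore we will explore a few approaches that involve generating integers, using choice, and converting floats to booleans, including:

- numpy.random.rand() < 0.5
- numpy.random.choice([True,False])
- numpy.random.randint(0, 2) with the type numpy.bool_
- rng.random() < 0.5
- rng.choice([True,False])
- rng.integers(0, 1) with the type numpy.bool_ and an inclusive endpoint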

The complete example is listed below.

# SuperFastPython.com
# benchmark creating a 1d array of random booleans
import numpy
import timeit
# size and shape of the arrays to create
SHAPE = 1000000
# number of times to run each snippet
N = 2000
# numpy.random.rand()<0.5
result = timeit.timeit('numpy.random.rand(SHAPE)<0.5', globals=globals(), number=N)
print(f'numpy.random.rand()<0.5 {result:.3f} seconds')
# numpy.random.choice([True,False])
result = timeit.timeit('numpy.random.choice([True,False], SHAPE)', globals=globals(), number=N)
print(f'numpy.random.choice([True,False]) {result:.3f} seconds')
# numpy.random.randint(0,2)
result = timeit.timeit('numpy.random.randint(0, 2, SHAPE,numpy.bool_)', globals=globals(), number=N)
print(f'numpy.random.randint(0,2) {result:.3f} seconds')
# rng.random()<0.5
result = timeit.timeit('rng=numpy.random.default_rng(1);rng.random(SHAPE)<0.5', globals=globals(), number=N)
print(f'rng.random()<0.5 {result:.3f} seconds')
# rng.choice([True,False])
result = timeit.timeit('rng=numpy.random.default_rng(1);rng.choice([True,False],SHAPE)', globals=globals(), number=N)
print(f'rng.choice([True,False]) {result:.3f} seconds')
# rng.integers(0,1)
result = timeit.timeit('rng=numpy.random.default_rng(1);rng.integers(0,1,SHAPE,numpy.bool_,True)', globals=globals(), number=N)
print(f'rng.integers(0,1) {result:.3f} seconds')

Running the example benchmarks each approach and reports the total execution time.

numpy.random.rand()<0.5 13.041 seconds
numpy.random.choice([True,False]) 10.334 seconds
numpy.random.randint(0,2) 1.902 seconds
rng.random()<0.5 7.569 seconds
rng.choice([True,False]) 12.157 seconds
rng.integers(0,1) 1.549 seconds

We can restructure the output into a table for comparison.

Approach                          | Time (sec)
----------------------------------|------------
numpy.random.rand()<0.5           | 13.041
numpy.random.choice([True,False]) | 10.334
numpy.random.randint(0,2)         | 1.902
rng.random()<0.5                  | 7.569
rng.choice([True,False])          | 12.157
rng.integers(0,1)                 | 1.549

Generally, we can see that using the choice() approach is slow with both the legacy and modern APIs, taking more than 10 seconds in each case.

We can also see that creating an array of booleans by comparing an array of floating point values to a threshold is inefficient with both APIs, and was the slowest approach overall with the legacy API.

Generally, the fastest approach was to generate 0 and 1 integers and to store the results in an array with the type numpy.bool_.

Of the two approaches of this type tested, the approach that uses the modern API is about a third of a second faster (1.549 vs 1.902 seconds).
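
The fastest approach can be wrapped in a small helper. The function name random_bools() is hypothetical. It uses an exclusive upper bound of 2, which should be equivalent to the inclusive form with endpoint=True used in the benchmark.

```python
# sketch: helper for random booleans using rng.integers() with a bool type
import numpy

def random_bools(rng, size):
    """Return an array of random True/False values with dtype numpy.bool_."""
    return rng.integers(0, 2, size, dtype=numpy.bool_)

rng = numpy.random.default_rng(1)
flags = random_bools(rng, 10)
print(flags.dtype, flags.shape)  # bool (10,)
```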

Recommendations

The best recommendation is to identify the specific random number array tasks you need in your program, then benchmark them in isolation to discover what has the lowest execution time on your system with your hardware and library versions.

I cannot stress this enough. The numbers above are highly specific and the patterns in performance observed may or may not hold on your specific platform.

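That being said, if performance matters, you probably want to:

- Use the modern numpy.random.Generator API (e.g. via numpy.random.default_rng()) rather than the legacy API.
- Use float32 rather than float64 when generating random floats, if the precision is sufficient.
- Use an int16 or int32 type (perhaps unsigned) when generating random integers, if it can hold the required range.
- Generate 0 and 1 integers with the type numpy.bool_ when generating random booleans.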

Don't rely on assumptions about performance, such as which data types or functions to use.

Always benchmark.
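
As a starting point for your own comparisons, the sketch below times a few candidate snippets and prints them from fastest to slowest. The candidate list, array size, and run count are placeholders to be replaced with the tasks that matter in your program.

```python
# sketch: rank a set of candidate snippets by total execution time
import timeit
import numpy

# placeholder settings, smaller than those used in the tutorial
SHAPE = 100_000
N = 20
# label -> snippet to time (placeholders for your own tasks)
candidates = {
    'numpy.random.rand()': 'numpy.random.rand(SHAPE)',
    'rng.random()': 'rng.random(SHAPE)',
    'rng.random(float32)': 'rng.random(SHAPE, dtype=numpy.float32)',
}
# names available to the snippets, including a generator created once
namespace = {'numpy': numpy, 'SHAPE': SHAPE, 'rng': numpy.random.default_rng(1)}
results = {
    name: timeit.timeit(stmt, globals=namespace, number=N)
    for name, stmt in candidates.items()
}
# report from fastest to slowest
for name, seconds in sorted(results.items(), key=lambda item: item[1]):
    print(f'{name} {seconds:.3f} seconds')
```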

Takeaways

You now know how to benchmark and discover the fastest way to generate NumPy arrays of random values.
