Benchmark Fastest Way To Create NumPy Array

You can benchmark NumPy array creation functions and discover the fastest approaches to use in different circumstances.

Generally, the numpy.empty() function is the fastest when an uninitialized array is needed. If an initialized NumPy array is required, then numpy.zeros() or numpy.full() are the fastest.

An important consideration is the data type of the array. A type should be chosen that fits the data the array is intended to hold, and is fast to create on the specific system on which the code will be run. Interestingly, in some cases it can be faster to create arrays with a larger data type than is needed.

In this tutorial, you will discover how to benchmark NumPy array creation functions and discover the fastest approaches you can use in your own programs.

Let’s get started.


Need To Create NumPy Arrays Fast

Creating arrays is one of the most common activities when using NumPy.

As such, we need to know how to create arrays fast.

There are many ways to create new NumPy arrays, although perhaps the main division between the approaches is:

1. Create an empty uninitialized array:
- numpy.empty()
2. Create an array initialized to a given value:
- numpy.zeros()
- numpy.ones()
- numpy.full()

In this case, we will ignore creating arrays from or like other arrays and creating arrays as the outcome of an algorithm.

That is, the key distinction is between creating an empty array and creating an initialized array.

Given that we need to create new arrays all the time in our NumPy programs, which approach or approaches are fastest?

Specific requirements aside, what function should we use when creating new arrays in our programs?


Benchmark NumPy Array Creation

We can explore the question of how fast the different array creation methods are using benchmarking.

In this case, we will use an approach to create an array of a modest fixed size, then repeat this process many times to give an estimated time. We can then compare the times to see the relative performance of the methods tested.

You can use this approach to test your own favorite NumPy array creation methods.

If you use or extend the NumPy benchmarking approach used in this tutorial, let me know in the comments below. I’d love to see what you come up with.

We could use the time.perf_counter() function directly and develop a helper function to perform the benchmarking and report results.

You can learn more about benchmarking with the time.perf_counter() function in the tutorial:

Benchmark Python with time.perf_counter()
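As a rough idea of what such a helper might look like, here is a minimal sketch. The function name benchmark() and its repeats parameter are hypothetical and are not used elsewhere in this tutorial.

```python
# minimal sketch of a benchmarking helper built on time.perf_counter()
import time
import numpy

def benchmark(task, repeats=100):
    """Return the total seconds taken to call task() repeats times."""
    start = time.perf_counter()
    for _ in range(repeats):
        task()
    return time.perf_counter() - start

# hypothetical usage: time the creation of a small zeroed array
duration = benchmark(lambda: numpy.zeros(1000), repeats=1000)
print(f'numpy.zeros() {duration:.6f} seconds')
```

Note that wrapping the code in a function call adds a small overhead to every repetition, which matters when the task itself is very fast.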

Instead, in this case, we will use the timeit API, specifically the timeit.timeit() function and specify the string of array creation code to run and a fixed number of times to run it.

We will also provide the globals argument for any constants defined in our benchmark code, such as array size or shape.

For example:

...

# benchmark a thing

result = timeit.timeit('...', globals=globals(), number=N)

print(f'approach {result:.3f} seconds')

You can learn more about benchmarking with the timeit.timeit() function in the tutorial:

Benchmark Python with timeit.timeit()

The number of runs in each benchmark was tuned to ensure that each snippet was executed in more than one second and less than about 10 seconds.
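If you would rather not tune the number of runs by hand, one option is the autorange() method on timeit.Timer, which grows the run count until the total time reaches at least 0.2 seconds. The sketch below uses that measurement to suggest a run count for a target duration. The TARGET constant and the small SHAPE are assumptions chosen only so the example runs quickly.

```python
# sketch: let timeit pick a run count, then scale it to a target duration
import timeit
import numpy

# small array so the example finishes quickly (assumption for illustration)
SHAPE = 1000
# desired total benchmark time in seconds (hypothetical target)
TARGET = 2.0

timer = timeit.Timer('numpy.zeros(SHAPE)', globals=globals())
# autorange() increases the run count until the total time is >= 0.2 seconds
number, total = timer.autorange()
print(f'{number} runs took {total:.3f} seconds')

# estimate how many runs would take about TARGET seconds
suggested = max(1, int(number * TARGET / total))
print(f'suggested number of runs: {suggested}')
```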

Let’s get started.


Fastest Way To Create NumPy 1D Array (uninitialized)

We can explore the fastest way to create a modestly sized uninitialized one-dimensional NumPy array.

In this case, we will define a fixed-size 1d array with one million elements (1,000,000). Each approach will be used to create an array 10,000 times.

The approaches we will compare include the most common NumPy functions for creating a new 1d array of a given size:

numpy.empty()
numpy.zeros()
numpy.ones()
numpy.arange()

The complete example is listed below.

# SuperFastPython.com

# benchmark creating an empty 1d numpy array

import numpy

import timeit

# size and shape of the arrays to create

SHAPE = 1000000

# number of times to run each snippet

N = 10000

# numpy.empty()

result = timeit.timeit('numpy.empty(SHAPE)', globals=globals(), number=N)

print(f'numpy.empty() {result:.3f} seconds')

# numpy.zeros()

result = timeit.timeit('numpy.zeros(SHAPE)', globals=globals(), number=N)

print(f'numpy.zeros() {result:.3f} seconds')

# numpy.ones()

result = timeit.timeit('numpy.ones(SHAPE)', globals=globals(), number=N)

print(f'numpy.ones() {result:.3f} seconds')

# numpy.arange()

result = timeit.timeit('numpy.arange(SHAPE)', globals=globals(), number=N)

print(f'numpy.arange() {result:.3f} seconds')

Running the example benchmarks each approach and reports the total execution time.

numpy.empty() 0.182 seconds

numpy.zeros() 8.103 seconds

numpy.ones() 9.022 seconds

numpy.arange() 8.528 seconds

We can restructure the output into a table for comparison.

Approach | Time (sec)

---------------|------------

numpy.empty() | 0.182

numpy.zeros() | 8.103

numpy.ones() | 9.022

numpy.arange() | 8.528

We can see that creating an empty array via numpy.empty() was the fastest approach by a massive margin. It was about 44x faster than the next fastest method, which was numpy.zeros().

This highlights that if we need an uninitialized 1d array, we should strongly consider using the numpy.empty() function.

Interestingly, we can see that numpy.ones() was the slowest method in this run. Surprisingly, there was a difference of nearly one second between numpy.zeros() and numpy.ones(), as I would have guessed that internally they perform nearly identical tasks (e.g., create an empty array and assign a value to each element in bulk).
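One way to check whether a gap like the one between numpy.zeros() and numpy.ones() is real or just noise is to repeat each benchmark several times and compare the best and worst times. The sketch below does this with timeit.repeat(). The smaller SHAPE, N and REPEATS values are assumptions chosen so it runs quickly, and they are not the settings used for the results above.

```python
# sketch: repeat each benchmark to see how much the timings vary
import timeit
import numpy

# reduced sizes so the example runs quickly (assumption for illustration)
SHAPE = 100000
N = 100
REPEATS = 5

results = {}
for snippet in ('numpy.zeros(SHAPE)', 'numpy.ones(SHAPE)'):
    # each entry in times is the total time for N runs of the snippet
    times = timeit.repeat(snippet, globals=globals(), number=N, repeat=REPEATS)
    results[snippet] = times
    print(f'{snippet} best={min(times):.4f} worst={max(times):.4f} seconds')
```

If the best time of one function is consistently slower than the worst time of the other, the difference is less likely to be noise.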


Fastest Way To Create NumPy 2D Array (uninitialized)

We can explore the fastest way to create a modestly sized uninitialized two-dimensional NumPy array.

In this case, we will only compare the two fastest approaches for creating a one-dimensional array in the last section:

numpy.empty()
numpy.zeros()

Each array will have the shape (100000,100000) and we will run each method 100,000 times.

The complete example is listed below.

# SuperFastPython.com

# benchmark creating an empty 2d numpy array (matrix)

import numpy

import timeit

# size and shape of the arrays to create

SHAPE = (100000,100000)

# number of times to run each snippet

N = 100000

# numpy.empty()

result = timeit.timeit('numpy.empty(SHAPE)', globals=globals(), number=N)

print(f'numpy.empty() {result:.3f} seconds')

# numpy.zeros()

result = timeit.timeit('numpy.zeros(SHAPE)', globals=globals(), number=N)

print(f'numpy.zeros() {result:.3f} seconds')

Running the example executes each approach many times and reports the total time in seconds.

numpy.empty() 12.340 seconds

numpy.zeros() 12.081 seconds

We can restructure the results into a table for comparison.

Approach | Time (sec)

---------------|------------

numpy.empty() | 12.340

numpy.zeros() | 12.081

Interestingly, the results suggest that it may be slightly faster to use numpy.zeros() to create a 2d array than numpy.empty().

The difference is small, less than 300 ms in this case. Nevertheless, if real, it is surprising, as we might expect that creating an empty array would be as fast as or faster than creating an array and initializing it.

Repeating the same benchmark a few times, I see a similar pattern, although with different specific benchmark results.

For example:

numpy.empty() 11.917 seconds

numpy.zeros() 11.700 seconds

It may be a real effect and the reason may require further investigation.
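One possible line of investigation, offered as an assumption rather than something measured above, is that for large arrays the allocator and operating system may hand out memory lazily, so that neither function touches every element at creation time. A sketch to probe this compares creation alone against creation followed by a write to every element. The much smaller SHAPE and N are assumptions chosen so the example runs quickly and fits in memory.

```python
# sketch: compare array creation alone vs creation plus writing every element
import timeit
import numpy

# reduced sizes so the example is quick and fits in memory (assumption)
SHAPE = (1000, 1000)
N = 50

SNIPPETS = [
    'numpy.empty(SHAPE)',
    'numpy.zeros(SHAPE)',
    'a=numpy.empty(SHAPE);a.fill(1.0)',
    'a=numpy.zeros(SHAPE);a.fill(1.0)',
]

timings = {}
for snippet in SNIPPETS:
    timings[snippet] = timeit.timeit(snippet, globals=globals(), number=N)
    print(f'{snippet} {timings[snippet]:.4f} seconds')
```

If the two creation-only snippets are close but both become much slower once the array is written to, that would be consistent with the cost being paid when the memory is first used rather than when it is requested.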


Fastest Way To Create and Initialize NumPy Array

We can explore the fastest way to create an initialized one-dimensional NumPy array.

This may be a more common situation.

We have seen that numpy.empty() is fast and numpy.zeros() is faster than numpy.ones(). Therefore, we will focus our attention on these approaches, as well as the numpy.full() function.

Specifically, we will directly compare the approaches:

numpy.zeros()
numpy.full()
numpy.empty() + slice assign
numpy.empty() + fill()
numpy.empty() + numpy.zeros_like()

We may need to initialize with an arbitrary value, so this is the focus of the methods explored with numpy.zeros() provided as a baseline.

Each approach is tested with an array with one million elements (1,000,000) and each test is run 10,000 times.

The complete example is listed below.

# SuperFastPython.com

# benchmark creating and initializing a 1d numpy array

import numpy

import timeit

# size and shape of the arrays to create

SHAPE = 1000000

# number of times to run each snippet

N = 10000

# numpy.zeros()

result = timeit.timeit('numpy.zeros(SHAPE)', globals=globals(), number=N)

print(f'numpy.zeros() {result:.3f} seconds')

# numpy.full()

result = timeit.timeit('numpy.full(SHAPE, 0)', globals=globals(), number=N)

print(f'numpy.full() {result:.3f} seconds')

# a=numpy.empty(SHAPE);a[:]=0

result = timeit.timeit('a=numpy.empty(SHAPE);a[:]=0', globals=globals(), number=N)

print(f'a=numpy.empty(SHAPE);a[:]=0 {result:.3f} seconds')

# a=numpy.empty(SHAPE);a.fill(0)

result = timeit.timeit('a=numpy.empty(SHAPE);a.fill(0)', globals=globals(), number=N)

print(f'a=numpy.empty(SHAPE);a.fill(0) {result:.3f} seconds')

# a=numpy.empty(SHAPE);numpy.zeros_like(a)

result = timeit.timeit('a=numpy.empty(SHAPE);numpy.zeros_like(a)', globals=globals(), number=N)

print(f'a=numpy.empty(SHAPE);numpy.zeros_like(a) {result:.3f} seconds')

Running the example executes each approach many times and reports the total time in seconds.

numpy.zeros() 8.022 seconds

numpy.full() 8.182 seconds

a=numpy.empty(SHAPE);a[:]=0 10.102 seconds

a=numpy.empty(SHAPE);a.fill(0) 10.403 seconds

a=numpy.empty(SHAPE);numpy.zeros_like(a) 11.040 seconds

We can restructure the results into a table for direct comparison.

Approach | Time (sec)

-----------------------------------------|------------

numpy.zeros() | 8.022

numpy.full() | 8.182

a=numpy.empty(SHAPE);a[:]=0 | 10.102

a=numpy.empty(SHAPE);a.fill(0) | 10.403

a=numpy.empty(SHAPE);numpy.zeros_like(a) | 11.040

The results show that numpy.zeros() is the fastest overall, as we might have expected. It probably benefits from an optimization in the underlying C implementation, such as requesting memory that is already zeroed.

The next fastest was numpy.full(), which was about 2 or more seconds faster than all of the approaches that involved creating an empty array and populating it.

This highlights that we should use the built-in NumPy functions where possible as they are likely faster than our attempts to re-implement them with multiple NumPy statements.
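The benchmark above filled with 0 so that numpy.zeros() could serve as a baseline. As a quick check that the approaches are interchangeable for an arbitrary value, the sketch below builds the same array three ways and compares them. The VALUE constant and the tiny SHAPE are assumptions for illustration.

```python
# sketch: three ways to create an array filled with an arbitrary value
import numpy

# tiny array and arbitrary fill value (assumptions for illustration)
SHAPE = 10
VALUE = 7.5

# built-in function
a = numpy.full(SHAPE, VALUE)
# empty array plus slice assignment
b = numpy.empty(SHAPE)
b[:] = VALUE
# empty array plus fill()
c = numpy.empty(SHAPE)
c.fill(VALUE)

# confirm all three approaches produce the same array
print(numpy.array_equal(a, b) and numpy.array_equal(a, c))
# numpy.full() infers its dtype from the fill value unless dtype is given
print(a.dtype, numpy.full(SHAPE, 0).dtype)
```

One detail worth keeping in mind when comparing timings: numpy.full(SHAPE, 0) infers an integer data type from the fill value, whereas numpy.zeros(SHAPE) and numpy.empty(SHAPE) default to float64. Pass dtype explicitly if you need the arrays to match.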


Array DataType Matters

We can explore the data type used when creating a NumPy array.

The data type matters as it defines the amount of main memory to allocate to the array, the space required.

Allocating more memory is slower, therefore it is faster if we can choose the smallest data type size that will best hold the data we intend to hold.

For example, if we only need True and False values, such as for an array mask, we should use numpy.bool_. If we only need non-negative integers less than 256, we should use numpy.uint8, and so on.

You can see a list of all NumPy data types and the bounds of data they hold here:

Array types and conversions between types
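To make the link between data type and memory concrete, the sketch below reports the bytes per element and total bytes for a few types, then checks the bounds of an integer type with numpy.iinfo(). The selection of types and the SHAPE are assumptions for illustration.

```python
# sketch: inspect how much memory each data type needs and what it can hold
import numpy

# array size used for the memory report (assumption for illustration)
SHAPE = 1000000

for dtype in (numpy.bool_, numpy.uint8, numpy.int32, numpy.float64):
    a = numpy.empty(SHAPE, dtype=dtype)
    print(f'{a.dtype}: {a.itemsize} bytes per element, {a.nbytes} bytes total')

# report the smallest and largest values an 8-bit unsigned integer can hold
info = numpy.iinfo(numpy.uint8)
print(f'uint8 range: {info.min} to {info.max}')

# ask NumPy for the smallest type that can hold a given scalar value
print(numpy.min_scalar_type(255), numpy.min_scalar_type(256))
```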

I will use the aliases for the data types that give an idea of the number of bits used in the type, as this helps in the analysis.

We can highlight the impact of the data type on array creation, by creating an empty array with each NumPy data type and comparing the execution time.

We will focus on the numerical data types and exclude strings and objects as they are orders of magnitude slower than the numerical types.

The complete example is listed below.

# SuperFastPython.com

# benchmark creating an empty 1d numpy array with different data types

import numpy

import timeit

# size and shape of the arrays to create

SHAPE = 1000000

# number of times to run each snippet

N = 1000000

# data types

TYPES = [

'numpy.int8',

'numpy.int16',

'numpy.int32',

'numpy.int64',

'numpy.uint8',

'numpy.uint16',

'numpy.uint32',

'numpy.uint64',

'numpy.float16',

'numpy.float32',

'numpy.float64',

'numpy.complex64',

'numpy.complex128',

'numpy.bool_']

# create an empty array with each data type

for numpy_type in TYPES:

    result = timeit.timeit(f'numpy.empty(SHAPE, dtype={numpy_type})', globals=globals(), number=N)

    print(f'{numpy_type}: {result:.3f} seconds')

Running the example creates an empty NumPy array with each NumPy data type and reports the overall execution time.

numpy.int8: 0.312 seconds

numpy.int16: 5.161 seconds

numpy.int32: 9.590 seconds

numpy.int64: 1.748 seconds

numpy.uint8: 0.309 seconds

numpy.uint16: 5.201 seconds

numpy.uint32: 9.454 seconds

numpy.uint64: 2.233 seconds

numpy.float16: 5.888 seconds

numpy.float32: 9.723 seconds

numpy.float64: 4.225 seconds

numpy.complex64: 4.074 seconds

numpy.complex128: 0.687 seconds

numpy.bool_: 0.307 seconds

We can restructure the output as a table for direct comparison.

Data Type | Time (sec)

-----------------|------------

numpy.int8 | 0.312

numpy.int16 | 5.161

numpy.int32 | 9.590

numpy.int64 | 1.748

numpy.uint8 | 0.309

numpy.uint16 | 5.201

numpy.uint32 | 9.454

numpy.uint64 | 2.233

numpy.float16 | 5.888

numpy.float32 | 9.723

numpy.float64 | 4.225

numpy.complex64 | 4.074

numpy.complex128 | 0.687

numpy.bool_ | 0.307

The pattern of speed difference will be different on your platform, depending on how different-sized data types are handled by the operating system and/or how NumPy was compiled on your system.

Generally, we might expect that as the number of bits per data type is increased, we see a linear increase in execution time.

This is not the case. Looking at signed integers, the int8 type is the fastest followed by int64 and int32 is the slowest. A similar pattern is seen with the unsigned integers.

This is surprising and highlights the need to benchmark on the platform where the code will run.

It suggests we may get better performance using a larger data type than needed when creating many empty arrays, e.g. int64 when we only need int32 or int16.

We see a similar artifact in floating point arrays, where a float32 array is roughly twice as slow to create as a float64 or float16 array. Again, we may want to use an array with more precision than is required in some situations.

Recommendations

The best recommendation is to identify the array creation tasks you need in your program, then benchmark them in isolation to discover what has the lowest execution time on your system with your hardware and library versions.

I cannot stress this enough. The numbers above are highly specific and the patterns in performance observed may or may not hold on your specific platform.
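As a starting point for your own measurements, here is a small sketch of a reusable function that times numpy.empty() for a set of data types and keeps the best of several repeats. The function name benchmark_dtypes() and its default arguments are hypothetical, and the small sizes are chosen so it runs quickly.

```python
# sketch: benchmark numpy.empty() for several data types on your own system
import timeit
import numpy

def benchmark_dtypes(shape, dtypes, number=1000, repeat=3):
    """Return the best total time in seconds for each data type."""
    results = {}
    for dtype in dtypes:
        # bind dtype as a default argument so each timer uses its own type
        timer = timeit.Timer(lambda dtype=dtype: numpy.empty(shape, dtype=dtype))
        # keep the fastest repeat as the least disturbed measurement
        results[numpy.dtype(dtype).name] = min(timer.repeat(repeat=repeat, number=number))
    return results

# hypothetical usage with a modest array size
results = benchmark_dtypes(100000, [numpy.int16, numpy.int32, numpy.int64])
for name, seconds in sorted(results.items(), key=lambda item: item[1]):
    print(f'{name}: {seconds:.4f} seconds')
```

Passing a callable to timeit.Timer adds a little function call overhead to each run compared to a code string, but it is the same for every data type, so the relative ordering should still be informative.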

That being said, you probably want to:

Use numpy.empty() to create uninitialized arrays.
Use numpy.zeros() to create arrays initialized to 0.
Use numpy.full() to create arrays initialized to a given value.
Use a data type that fits the data you require and is fastest to create.

Don’t rely on assumptions about performance, such as which data types or functions are fastest.

Always benchmark.

Takeaways

You now know how to benchmark NumPy array creation functions and discover the fastest approaches you can use in your own programs.

Did I make a mistake? See a typo?
I’m a simple humble human. Correct me, please!

Do you have any additional tips?
I’d love to hear about them!

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

Photo by Emeric Deroubaix on Unsplash

About Jason Brownlee

Parallel Loops in Python

Reader Interactions

Leave a Reply Cancel reply

Footer

Learn Python Benchmarking Fast (without the frustration)

Learn Python Benchmarking Fast
(without the frustration)