You can benchmark NumPy array creation functions and discover the fastest approaches to use in different circumstances.
Generally, the numpy.empty() function is the fastest when an uninitialized array is needed. If an initialized NumPy array is required, then numpy.zeros() and numpy.full() are the fastest.
An important consideration is the data type of the array. A type should be chosen that fits the data the array intends to hold, and is fast to create on the specific system on which the code will be run. Interestingly, it can be faster to create arrays with a larger data type than is needed in some cases.
In this tutorial, you will discover how to benchmark NumPy array creation functions and discover the fastest approaches you can use in your own programs.
Let’s get started.
Need To Create NumPy Arrays Fast
Creating arrays is one of the most common activities when using NumPy.
As such, we need to know how to create arrays fast.
There are many ways to create new NumPy arrays, although perhaps the main division between the approaches is:
- 1. Create an empty uninitialized array.
- 2. Create an array initialized to a given value.
In this case, we will ignore creating arrays from or like other arrays and creating arrays as the outcome of an algorithm.
That leaves the choice between creating an empty array and creating an initialized array.
Given that we need to create new arrays all the time in our NumPy programs, which approach or approaches are the fastest?
Specific requirements aside, what function should we use when creating new arrays in our programs?
Benchmark NumPy Array Creation
We can explore the question of how fast the different array creation methods are using benchmarking.
In this case, we will use an approach to create an array of a modest fixed size, then repeat this process many times to give an estimated time. We can then compare the times to see the relative performance of the methods tested.
You can use this approach to test your own favorite NumPy array creation methods.
If you use or extend the NumPy benchmarking approach used in this tutorial, let me know in the comments below. I’d love to see what you come up with.
We could use the time.perf_counter() function directly and develop a helper function to perform the benchmarking and report results.
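As a rough illustration of that alternative, a helper built on time.perf_counter() might look like the sketch below. The benchmark() function name, its repeats argument, and the small array size are assumptions made for this example, not code from the tutorial.

```python
# sketch of a perf_counter-based benchmark helper (names are hypothetical)
import time
import numpy

def benchmark(task, repeats=100):
    """Call task() repeats times and return the total elapsed time in seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        task()
    return time.perf_counter() - start

# time the creation of a small uninitialized array
duration = benchmark(lambda: numpy.empty(1000), repeats=100)
print(f'numpy.empty() {duration:.6f} seconds')
```

A helper like this measures the function call overhead of the lambda as well as the array creation, which is one reason the timeit API is convenient for very short snippets.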
Instead, in this case, we will use the timeit API, specifically the timeit.timeit() function and specify the string of array creation code to run and a fixed number of times to run it.
We will also provide the globals argument for any constants defined in our benchmark code, such as array size or shape.
For example:
```python
...
# benchmark a thing
result = timeit.timeit('...', globals=globals(), number=N)
print(f'approach {result:.3f} seconds')
```
The number of runs in each benchmark was tuned to ensure that each snippet was executed in more than one second and less than about 10 seconds.
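The run counts in this tutorial were tuned by hand. One possible starting point for tuning is the autorange() method of timeit.Timer, which increases the loop count until the total time reaches at least 0.2 seconds. The sketch below is an assumption about how that could be used here: the small SHAPE, the explicit NAMESPACE dictionary (used in place of globals() so the snippet is self-contained), and the one-second scaling are all illustrative.

```python
# sketch: estimate a run count with timeit.Timer.autorange()
import timeit
import numpy

# small array size so the sketch runs quickly (assumption)
SHAPE = 1000
# explicit namespace for the timed snippet
NAMESPACE = {'numpy': numpy, 'SHAPE': SHAPE}

timer = timeit.Timer('numpy.empty(SHAPE)', globals=NAMESPACE)
# returns the loop count used and the total time it took (at least 0.2 seconds)
number, total = timer.autorange()
print(f'{number} runs took {total:.3f} seconds')

# scale the loop count to roughly one second of total run time
suggested = max(1, int(number / total))
print(f'about {suggested} runs for one second')
```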
Let’s get started.
Fastest Way To Create NumPy 1D Array (uninitialized)
We can explore the fastest way to create a modestly sized uninitialized one-dimensional NumPy array.
In this case, we will define a fixed-size 1d array with one million elements (1,000,000). Each approach will be used to create an array 10,000 times.
The approaches we will compare include the most common NumPy functions for creating a new 1d array of a given size:
- numpy.empty()
- numpy.zeros()
- numpy.ones()
- numpy.arange()
The complete example is listed below.
```python
# SuperFastPython.com
# benchmark creating an empty 1d numpy array
import numpy
import timeit
# size and shape of the arrays to create
SHAPE = 1000000
# number of times to run each snippet
N = 10000
# numpy.empty()
result = timeit.timeit('numpy.empty(SHAPE)', globals=globals(), number=N)
print(f'numpy.empty() {result:.3f} seconds')
# numpy.zeros()
result = timeit.timeit('numpy.zeros(SHAPE)', globals=globals(), number=N)
print(f'numpy.zeros() {result:.3f} seconds')
# numpy.ones()
result = timeit.timeit('numpy.ones(SHAPE)', globals=globals(), number=N)
print(f'numpy.ones() {result:.3f} seconds')
# numpy.arange()
result = timeit.timeit('numpy.arange(SHAPE)', globals=globals(), number=N)
print(f'numpy.arange() {result:.3f} seconds')
```
Running the example benchmarks each approach and reports the total execution time.
```
numpy.empty() 0.182 seconds
numpy.zeros() 8.103 seconds
numpy.ones() 9.022 seconds
numpy.arange() 8.528 seconds
```
We can restructure the output into a table for comparison.
```
Approach       | Time (sec)
---------------|------------
numpy.empty()  | 0.182
numpy.zeros()  | 8.103
numpy.ones()   | 9.022
numpy.arange() | 8.528
```
We can see that creating an empty array via numpy.empty() was the fastest approach by a massive margin. It was about 44x faster than the next fastest method which was numpy.zeros().
This highlights that if we need an uninitialized 1d array, we should strongly consider using the numpy.empty() function.
Interestingly, we can see that numpy.ones() may have been the slowest method. Surprisingly, there was a one-second difference between numpy.zeros() and numpy.ones() as internally I would have guessed they perform nearly identical tasks (e.g., create an empty array and assign a value to each element in bulk).
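The relative speedups discussed above can be computed directly from the reported times. The sketch below copies the timings from the results in this section into a dictionary (the results and fastest names are just for illustration) and prints each approach with its slowdown factor relative to the fastest.

```python
# sketch: compute relative slowdown factors from the reported timings
# timings in seconds, copied from the benchmark results above
results = {
    'numpy.empty()': 0.182,
    'numpy.zeros()': 8.103,
    'numpy.ones()': 9.022,
    'numpy.arange()': 8.528,
}

# the fastest approach is the baseline
fastest = min(results.values())

# report approaches from fastest to slowest with a factor relative to the baseline
for name, seconds in sorted(results.items(), key=lambda item: item[1]):
    print(f'{name:<15} | {seconds:>6.3f} | {seconds / fastest:>5.1f}x')
```

With these numbers, numpy.zeros() works out to about 44.5 times the total time of numpy.empty().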
Fastest Way To Create NumPy 2D Array (uninitialized)
We can explore the fastest way to create a modestly sized uninitialized two-dimensional NumPy array.
In this case, we will only compare the two fastest approaches for creating a one-dimensional array in the last section:
- numpy.empty()
- numpy.zeros()
Each array will have a shape of (100000, 100000) and we will run each method 100,000 times.
The complete example is listed below.
```python
# SuperFastPython.com
# benchmark creating an empty 2d numpy array (matrix)
import numpy
import timeit
# size and shape of the arrays to create
SHAPE = (100000,100000)
# number of times to run each snippet
N = 100000
# numpy.empty()
result = timeit.timeit('numpy.empty(SHAPE)', globals=globals(), number=N)
print(f'numpy.empty() {result:.3f} seconds')
# numpy.zeros()
result = timeit.timeit('numpy.zeros(SHAPE)', globals=globals(), number=N)
print(f'numpy.zeros() {result:.3f} seconds')
```
Running the example executes each approach many times and reports the total time in seconds.
```
numpy.empty() 12.340 seconds
numpy.zeros() 12.081 seconds
```
We can restructure the results into a table for comparison.
```
Approach       | Time (sec)
---------------|------------
numpy.empty()  | 12.340
numpy.zeros()  | 12.081
```
Interestingly, the results suggest that it may be slightly faster to use numpy.zeros() to create a 2d array than numpy.empty().
The difference is small, less than 300 ms in this case. Nevertheless, if real, it is surprising as we might expect that creating an empty array would be as fast or faster than creating an array and initializing it.
Repeating the same benchmark a few times, I see a similar pattern, although with different specific benchmark results.
For example:
```
numpy.empty() 11.917 seconds
numpy.zeros() 11.700 seconds
```
It may be a real effect and the reason may require further investigation.
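Repeating a benchmark by hand can be automated with timeit.repeat(), which runs the same snippet several times and returns one total per repetition. The sketch below is one way to do that. The much smaller shape, the run counts, the NAMESPACE dictionary, and the all_times name are assumptions chosen so the example finishes quickly, and they will not reproduce the numbers above.

```python
# sketch: repeat a benchmark several times and summarize the spread
import statistics
import timeit
import numpy

# small 2d shape so the sketch runs quickly (assumption)
NAMESPACE = {'numpy': numpy, 'SHAPE': (100, 100)}

all_times = {}
for snippet in ['numpy.empty(SHAPE)', 'numpy.zeros(SHAPE)']:
    # five repetitions of 1,000 runs each, one total time per repetition
    times = timeit.repeat(snippet, globals=NAMESPACE, number=1000, repeat=5)
    all_times[snippet] = times
    print(f'{snippet}: min={min(times):.4f} mean={statistics.mean(times):.4f} seconds')
```

Comparing the minimum and the mean across repetitions gives a sense of whether a small difference between two approaches is larger than the run-to-run noise.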
Fastest Way To Create and Initialize NumPy Array
We can explore the fastest way to create an initialized one-dimensional NumPy array.
This may be a more common situation.
We have seen that numpy.empty() is fast and numpy.zeros() is faster than numpy.ones(). Therefore, we will focus our attention on these approaches, as well as the numpy.full() function.
Specifically, we will directly compare the approaches:
- numpy.zeros()
- numpy.full()
- numpy.empty() + slice assign
- numpy.empty() + fill()
- numpy.empty() + numpy.zeros_like()
We may need to initialize with an arbitrary value, so this is the focus of the methods explored with numpy.zeros() provided as a baseline.
Each approach is tested with an array with one million elements (1,000,000) and each test is run 10,000 times.
The complete example is listed below.
```python
# SuperFastPython.com
# benchmark creating and initializing a 1d numpy array
import numpy
import timeit
# size and shape of the arrays to create
SHAPE = 1000000
# number of times to run each snippet
N = 10000
# numpy.zeros()
result = timeit.timeit('numpy.zeros(SHAPE)', globals=globals(), number=N)
print(f'numpy.zeros() {result:.3f} seconds')
# numpy.full()
result = timeit.timeit('numpy.full(SHAPE, 0)', globals=globals(), number=N)
print(f'numpy.full() {result:.3f} seconds')
# a=numpy.empty(SHAPE);a[:]=0
result = timeit.timeit('a=numpy.empty(SHAPE);a[:]=0', globals=globals(), number=N)
print(f'a=numpy.empty(SHAPE);a[:]=0 {result:.3f} seconds')
# a=numpy.empty(SHAPE);a.fill(0)
result = timeit.timeit('a=numpy.empty(SHAPE);a.fill(0)', globals=globals(), number=N)
print(f'a=numpy.empty(SHAPE);a.fill(0) {result:.3f} seconds')
# a=numpy.empty(SHAPE);numpy.zeros_like(a)
result = timeit.timeit('a=numpy.empty(SHAPE);numpy.zeros_like(a)', globals=globals(), number=N)
print(f'a=numpy.empty(SHAPE);numpy.zeros_like(a) {result:.3f} seconds')
```
Running the example executes each approach many times and reports the total time in seconds.
```
numpy.zeros() 8.022 seconds
numpy.full() 8.182 seconds
a=numpy.empty(SHAPE);a[:]=0 10.102 seconds
a=numpy.empty(SHAPE);a.fill(0) 10.403 seconds
a=numpy.empty(SHAPE);numpy.zeros_like(a) 11.040 seconds
```
We can restructure the results into a table for direct comparison.
```
Approach                                 | Time (sec)
-----------------------------------------|------------
numpy.zeros()                            | 8.022
numpy.full()                             | 8.182
a=numpy.empty(SHAPE);a[:]=0              | 10.102
a=numpy.empty(SHAPE);a.fill(0)           | 10.403
a=numpy.empty(SHAPE);numpy.zeros_like(a) | 11.040
```
The results show that numpy.zeros() is the fastest overall, as we might have expected. It is probably using a special trick down in the C implementation.
The next fastest was numpy.full(), which was about 1.9 seconds or more faster than all of the approaches that involved creating an empty array and populating it.
This highlights that we should use the built-in NumPy functions where possible as they are likely faster than our attempts to re-implement them with multiple NumPy statements.
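The benchmark above initializes with 0 so that numpy.zeros() can act as a baseline. For an arbitrary value, the three general approaches produce the same array, as the minimal sketch below suggests. The VALUE constant and the tiny array size are assumptions for illustration only.

```python
# sketch: three ways to create an array initialized to an arbitrary value
import numpy

# tiny array and an arbitrary fill value (assumptions for illustration)
SHAPE = 10
VALUE = 7.5

# built-in function
a = numpy.full(SHAPE, VALUE)
# empty array plus slice assignment
b = numpy.empty(SHAPE)
b[:] = VALUE
# empty array plus fill()
c = numpy.empty(SHAPE)
c.fill(VALUE)

# all three arrays hold the same values
print(numpy.array_equal(a, b), numpy.array_equal(a, c))
# numpy.full() infers the data type from the fill value unless dtype is given
print(a.dtype, numpy.full(SHAPE, 0).dtype)
```

One detail worth keeping in mind when comparing timings: numpy.full() infers its data type from the fill value, so numpy.full(SHAPE, 0) creates an integer array, whereas numpy.zeros(SHAPE) and numpy.empty(SHAPE) default to float64. Passing an explicit dtype argument makes the comparison like for like.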
Array DataType Matters
We can explore the data type used when creating a NumPy array.
The data type matters as it defines how much main memory must be allocated for the array.
Allocating more memory is generally expected to be slower, therefore it should be faster if we choose the smallest data type that will hold the data we intend to store.
For example, if we only need True and False values, such as for an array mask, we should use numpy.bool_. If we only need non-negative integers less than 256, we should use numpy.uint8, and so on.
The NumPy documentation lists all NumPy data types and the bounds of the data they hold.
I will use the aliases for the data types that indicate the number of bits used by the type, as this helps in the analysis.
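The size and bounds of a data type can be checked in code. The short sketch below reports the bytes per element for a handful of types via numpy.dtype(...).itemsize and the integer bounds via numpy.iinfo(). The selection of types shown is arbitrary.

```python
# sketch: inspect the size and bounds of some numpy data types
import numpy

# report the memory required per element and per one million elements
for dtype in [numpy.bool_, numpy.uint8, numpy.int16, numpy.int64, numpy.float32, numpy.float64]:
    itemsize = numpy.dtype(dtype).itemsize
    print(f'{numpy.dtype(dtype).name}: {itemsize} byte(s) per element, '
          f'{itemsize * 1_000_000:,} bytes per million elements')

# report the smallest and largest values an integer type can hold
info = numpy.iinfo(numpy.uint8)
print(f'uint8 holds values from {info.min} to {info.max}')
```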
We can highlight the impact of the data type on array creation, by creating an empty array with each NumPy data type and comparing the execution time.
We will focus on the numerical data types and exclude strings and objects as they are orders of magnitude slower than the numerical types.
The complete example is listed below.
```python
# SuperFastPython.com
# benchmark creating an empty 1d numpy array with different data types
import numpy
import timeit
# size and shape of the arrays to create
SHAPE = 1000000
# number of times to run each snippet
N = 1000000
# data types
TYPES = [
    'numpy.int8',
    'numpy.int16',
    'numpy.int32',
    'numpy.int64',
    'numpy.uint8',
    'numpy.uint16',
    'numpy.uint32',
    'numpy.uint64',
    'numpy.float16',
    'numpy.float32',
    'numpy.float64',
    'numpy.complex64',
    'numpy.complex128',
    'numpy.bool_']
# create an empty array with each data type
for numpy_type in TYPES:
    result = timeit.timeit(f'numpy.empty(SHAPE, dtype={numpy_type})', globals=globals(), number=N)
    print(f'{numpy_type}: {result:.3f} seconds')
```
Running the example creates an empty NumPy array with each NumPy data type and reports the overall execution time.
```
numpy.int8: 0.312 seconds
numpy.int16: 5.161 seconds
numpy.int32: 9.590 seconds
numpy.int64: 1.748 seconds
numpy.uint8: 0.309 seconds
numpy.uint16: 5.201 seconds
numpy.uint32: 9.454 seconds
numpy.uint64: 2.233 seconds
numpy.float16: 5.888 seconds
numpy.float32: 9.723 seconds
numpy.float64: 4.225 seconds
numpy.complex64: 4.074 seconds
numpy.complex128: 0.687 seconds
numpy.bool_: 0.307 seconds
```
We can restructure the output as a table for direct comparison.
```
Data Type        | Time (sec)
-----------------|------------
numpy.int8       | 0.312
numpy.int16      | 5.161
numpy.int32      | 9.590
numpy.int64      | 1.748
numpy.uint8      | 0.309
numpy.uint16     | 5.201
numpy.uint32     | 9.454
numpy.uint64     | 2.233
numpy.float16    | 5.888
numpy.float32    | 9.723
numpy.float64    | 4.225
numpy.complex64  | 4.074
numpy.complex128 | 0.687
numpy.bool_      | 0.307
```
The pattern of speed difference will be different on your platform, depending on how different-sized data types are handled by the operating system and/or how NumPy was compiled on your system.
Generally, we might expect that as the number of bits per data type is increased, we see a linear increase in execution time.
This is not the case. Looking at signed integers, the int8 type is the fastest followed by int64 and int32 is the slowest. A similar pattern is seen with the unsigned integers.
This is surprising and highlights the need to benchmark on the platform where the code will run.
It suggests we may get better performance using a larger data type than needed when creating many empty arrays, e.g. int64 when we only need int32 or int16.
We see a similar artifact in floating point arrays, where a float32 array is more than twice as slow to create as a float64 array and about 1.7 times as slow as a float16 array. Again, we may want to use an array with more precision than is required in some situations.
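One way to act on this is to benchmark only the data types that can hold the data and pick the fastest on the current system. The sketch below is an illustration under stated assumptions: MAX_VALUE is a hypothetical largest value the array must store, the candidate list is limited to signed integers, and the array size and run count are kept small so the example runs quickly.

```python
# sketch: benchmark candidate data types that fit the data and pick the fastest
import timeit
import numpy

# small array size so the sketch runs quickly (assumption)
SHAPE = 1000
# hypothetical largest value the array must hold
MAX_VALUE = 1000
# candidate signed integer types, smallest first
CANDIDATES = [numpy.int8, numpy.int16, numpy.int32, numpy.int64]

# keep only the types whose upper bound can hold the data
fitting = [t for t in CANDIDATES if numpy.iinfo(t).max >= MAX_VALUE]

# time the creation of an empty array for each remaining type
timings = {}
for t in fitting:
    name = numpy.dtype(t).name
    timings[name] = timeit.timeit(lambda t=t: numpy.empty(SHAPE, dtype=t), number=10000)
    print(f'{name}: {timings[name]:.4f} seconds')

# report the type with the lowest total time on this system
best = min(timings, key=timings.get)
print(f'fastest fitting type: {best}')
```

Creation time is only one consideration. A larger data type also uses more memory and may slow down later operations on the array, so the whole workload is worth measuring, not just the allocation.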
Recommendations
The best recommendation is to identify the array creation tasks you need in your program, then benchmark them in isolation to discover what has the lowest execution time on your system with your hardware and library versions.
I cannot stress this enough. The numbers above are highly specific and the patterns in performance observed may or may not hold on your specific platform.
That being said, you probably want to:
- Use numpy.empty() to create uninitialized arrays.
- Use numpy.zeros() to create arrays initialized to 0.
- Use numpy.full() to create arrays initialized to a given value.
- Use a data type that fits the data you require and is fastest to create.
Don’t rely on assumptions about performance, such as which data types or functions to use.
Always benchmark.
Takeaways
You now know how to benchmark NumPy array creation functions and discover the fastest approaches you can use in your own programs.
Did I make a mistake? See a typo?
I’m a simple humble human. Correct me, please!
Do you have any additional tips?
I’d love to hear about them!
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.