You can benchmark the functions and manual calculations that compute the mean (average) of NumPy arrays to discover the fastest approach to use.
Generally, it can be slightly faster to calculate the mean of a NumPy array of float values manually, rather than use a built-in function like numpy.mean().
Importantly, this benefit is amplified when calculating the mean of integer arrays, where using a manual calculation can be more than 4x faster. This highlights that if we need to calculate many mean values of integer arrays in our programs and execution time is critical, we should prefer manual calculations over built-in functions.
In this tutorial, you will discover how to benchmark the fastest approach to calculate the mean of a NumPy array in Python.
Let’s get started.
Need Fast Mean of NumPy Arrays
Calculating the mean or average of NumPy arrays in our Python program is a common operation.
It is perhaps the most commonly used summary statistic.
It seems straightforward: call mean() and we have the mean.
But is this the fastest method?
In fact, there are a few ways we can calculate the mean of NumPy arrays, including methods that use weightings, methods that ignore NaNs, and manual methods.
For example, some approaches include:
- numpy.mean()
- numpy.average()
- numpy.nanmean()
- numpy.divide(numpy.sum(array), len(array))
- numpy.sum()/len()
- numpy.sum()/shape[0]
There are also more exotic methods we could devise.
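As a quick sanity check before timing anything, the sketch below confirms that the approaches listed above agree on the same value. The small array size and the results dictionary are assumptions made for illustration and are not part of the benchmarks in this tutorial.

```python
# sanity check: every approach should report (almost exactly) the same mean
import numpy

# small illustrative array of random floats in [0,1)
rng = numpy.random.default_rng(1)
A = rng.random(1000)

# calculate the mean with each approach
results = {
    'numpy.mean()': numpy.mean(A),
    'numpy.average()': numpy.average(A),
    'numpy.nanmean()': numpy.nanmean(A),
    'numpy.divide(numpy.sum(), len())': numpy.divide(numpy.sum(A), len(A)),
    'numpy.sum()/len()': numpy.sum(A) / len(A),
    'numpy.sum()/shape[0]': numpy.sum(A) / A.shape[0],
}

# compare each result against numpy.mean() within floating point tolerance
reference = results['numpy.mean()']
all_close = all(numpy.isclose(value, reference) for value in results.values())
for name, value in results.items():
    print(f'{name:<33} {value:.6f}')
print(f'all approaches agree: {all_close}')
```

If the approaches did not agree, comparing their speed would not be meaningful, so a check like this is a sensible first step.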
Nevertheless, if we need to calculate the mean of a NumPy array frequently in our program, such as within a loop, what is the fastest way we should use?
What is the fastest way to calculate the mean of a NumPy array?
Benchmark Mean of NumPy Arrays
We can explore the question of how fast the different approaches to calculate the mean of NumPy arrays are using benchmarking.
In this case, we will use an approach to calculate the mean of an array of random numbers of a modest fixed size, then repeat this process many times to give an estimated time. We can then compare the times to see the relative performance of the approaches tested.
You can use this approach to benchmark your own favorite NumPy array operations.
If you use or extend the NumPy benchmarking approach used in this tutorial, let me know in the comments below. I’d love to see what you come up with.
We could use the time.perf_counter() function directly and develop a helper function to perform the benchmarking and report results.
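For reference, a helper of this kind might look like the minimal sketch below. The function name benchmark(), its arguments, and the small array and repeat count are hypothetical choices for illustration, not code used elsewhere in this tutorial.

```python
# sketch of a manual benchmarking helper built on time.perf_counter()
import time
import numpy

def benchmark(func, repeats):
    # hypothetical helper: call func() repeats times and return total seconds
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    return time.perf_counter() - start

# small illustrative array and repeat count so the sketch runs quickly
rng = numpy.random.default_rng(1)
A = rng.random(10000)
duration = benchmark(lambda: numpy.mean(A), 1000)
print(f'numpy.mean() {duration:.3f} seconds')
```

Note that the lambda adds a small function call overhead to every iteration, which is one reason to prefer a purpose-built tool when the code being timed is very short.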
Instead, in this case, we will use the timeit API, specifically the timeit.timeit() function and specify the string of mean calculation code to run and a fixed number of times to run it.
We will also provide the “globals” argument for any constants defined in our benchmark code, such as the defined array to operate upon.
For example:
    ...
    # benchmark a thing
    result = timeit.timeit('...', globals=globals(), number=N)
    print(f'approach {result:.3f} seconds')
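The number of runs in each benchmark should be large enough that each snippet executes for more than about one second, so that timer resolution and start-up noise are negligible. With the N = 100,000 used below, most snippets take roughly 6 to 30 seconds, and the slowest takes over two minutes.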
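If you would rather not tune N by hand, the timeit.Timer class offers an autorange() method that picks a loop count for you. The sketch below shows one way to get a starting value; the target of two seconds and the variable names are illustrative assumptions.

```python
# sketch: let timeit suggest a loop count instead of tuning N by hand
import timeit
import numpy

# define the data used in the run
rng = numpy.random.default_rng(1)
A = rng.random(500000)

# pass the names the snippet needs explicitly via the globals argument
timer = timeit.Timer('numpy.mean(A)', globals={'numpy': numpy, 'A': A})
# autorange() increases the loop count until the total time is at least 0.2 seconds
number, elapsed = timer.autorange()
print(f'{number} loops took {elapsed:.3f} seconds')

# scale the loop count to an (assumed) target of about 2 seconds per snippet
target_seconds = 2.0
suggested_n = max(1, int(number * target_seconds / elapsed))
print(f'suggested N for about {target_seconds:.0f} seconds: {suggested_n}')
```

The same N should then be used for every snippet in a comparison so that the reported totals remain directly comparable.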
Let’s get started.
Fastest Way to Calculate Mean of 1D Float NumPy Arrays
We can explore the fastest way to calculate the mean of a modestly sized NumPy array of random floating point values in [0,1).
In this case, we will create a fixed-size 1D array of half a million (500,000) random floats with the default data type, float64. Each approach will be used to calculate the mean of the array 100,000 times.
We will benchmark and directly compare the following approaches:
- numpy.mean()
- numpy.average()
- numpy.nanmean()
- numpy.divide(numpy.sum(array), len(array))
- numpy.sum()/len()
- numpy.sum()/shape[0]
Generally, we may expect numpy.mean() to be the fastest, although it is possible that the more manual approaches may be slightly faster due to fewer function calls and checks required.
The complete example is listed below.
    # SuperFastPython.com
    # benchmark calculating the mean of a 1d array
    import numpy
    import timeit
    # define the data used in all runs
    rng = numpy.random.default_rng(1)
    A = rng.random(500000)
    # number of times to run each snippet
    N = 100000
    # numpy.mean()
    result = timeit.timeit('numpy.mean(A)', globals=globals(), number=N)
    print(f'numpy.mean() {result:.3f} seconds')
    # numpy.average()
    result = timeit.timeit('numpy.average(A)', globals=globals(), number=N)
    print(f'numpy.average() {result:.3f} seconds')
    # numpy.nanmean()
    result = timeit.timeit('numpy.nanmean(A)', globals=globals(), number=N)
    print(f'numpy.nanmean() {result:.3f} seconds')
    # numpy.divide(numpy.sum(), len())
    result = timeit.timeit('numpy.divide(numpy.sum(A), len(A))', globals=globals(), number=N)
    print(f'numpy.divide(numpy.sum(), len()) {result:.3f} seconds')
    # numpy.sum()/len()
    result = timeit.timeit('numpy.sum(A)/len(A)', globals=globals(), number=N)
    print(f'numpy.sum()/len() {result:.3f} seconds')
    # numpy.sum()/shape[0]
    result = timeit.timeit('numpy.sum(A)/A.shape[0]', globals=globals(), number=N)
    print(f'numpy.sum()/shape[0] {result:.3f} seconds')
Running the example benchmarks each approach and reports the total execution time.

    numpy.mean() 14.598 seconds
    numpy.average() 15.233 seconds
    numpy.nanmean() 132.891 seconds
    numpy.divide(numpy.sum(), len()) 14.281 seconds
    numpy.sum()/len() 13.948 seconds
    numpy.sum()/shape[0] 13.900 seconds
We can restructure the output into a table for comparison.
    Approach                         | Time (sec)
    ---------------------------------|------------
    numpy.mean()                     | 14.598
    numpy.average()                  | 15.233
    numpy.nanmean()                  | 132.891
    numpy.divide(numpy.sum(), len()) | 14.281
    numpy.sum()/len()                | 13.948
    numpy.sum()/shape[0]             | 13.900
We can see that most approaches had a similar execution time.
The one outlier was numpy.nanmean() at over 132 seconds, about 9.103x the time taken by numpy.mean(). This is a good reminder not to use numpy.nanmean() unless the data has NaN values, and it may encourage us to replace the NaNs or filter them out of the array before calculating the mean value.
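To make the NaN handling concrete, the sketch below compares numpy.mean(), numpy.nanmean(), and a manual mean over a filtered array. The tiny four-element array is an illustrative assumption; note that boolean indexing creates a copy of the data, and the cost of that copy is not measured here.

```python
# sketch: alternatives to numpy.nanmean() when an array contains NaN values
import numpy

# tiny illustrative array with one NaN value
A = numpy.array([1.0, 2.0, numpy.nan, 4.0])

# numpy.mean() propagates the NaN into the result
plain = numpy.mean(A)
# numpy.nanmean() ignores the NaN
ignoring = numpy.nanmean(A)
# filter the NaN out once (this makes a copy), then calculate the mean manually
filtered = A[~numpy.isnan(A)]
manual = numpy.sum(filtered) / filtered.shape[0]

print(f'numpy.mean():    {plain}')
print(f'numpy.nanmean(): {ignoring:.6f}')
print(f'manual filtered: {manual:.6f}')
```

If the same array is averaged many times, filtering it once up front and reusing the filtered copy may be worth benchmarking against repeated calls to numpy.nanmean().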
We can see that numpy.average() was slightly slower than numpy.mean(), likely because of the additional checks for weightings performed before the calculation. Again, we should only use numpy.average() if we require a weighted mean.
We can see that both numpy.mean() and numpy.divide(numpy.sum(), len()) had a similar speed, with the latter showing a slightly faster performance.
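Interestingly, as we simplify the mean calculation we see small reductions in overall execution time from numpy.mean(), to numpy.divide(numpy.sum(), len()), to numpy.sum()/len() and finally to numpy.sum()/shape[0]. The first two drops are each about 300 milliseconds and the last is about 50 milliseconds, which amounts to only a few percent of the roughly 14 second total.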
This finding suggests that if we are calculating a mean value many times in our program, such as part of a loop or similar, we can use a manual calculation in order to achieve a modest speed-up.
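Because the differences between these approaches are small, it can help to repeat each benchmark several times and compare the best time from each. The sketch below uses timeit.repeat() for this; the reduced loop count, the number of repeats, and the choice of three snippets are assumptions made so the example runs quickly.

```python
# sketch: repeat each benchmark and keep the minimum to reduce the effect of noise
import timeit
import numpy

# define the data used in all runs
rng = numpy.random.default_rng(1)
A = rng.random(500000)

# snippets to compare (a subset of the approaches, chosen for illustration)
snippets = ['numpy.mean(A)', 'numpy.sum(A)/len(A)', 'numpy.sum(A)/A.shape[0]']
# names made available to each snippet
namespace = {'numpy': numpy, 'A': A}

best = {}
for stmt in snippets:
    # five repeats of 200 loops each (both values are illustrative)
    times = timeit.repeat(stmt, globals=namespace, repeat=5, number=200)
    best[stmt] = min(times)
    print(f'{stmt:<24} best of 5: {best[stmt]:.3f} seconds')
```

The minimum is commonly used because slower repeats usually reflect interference from other processes rather than the code being measured.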
This might hold for other calculations we may be required to perform.
Next, let’s repeat this benchmark with integer rather than floating point data.
Fastest Way to Calculate Mean of 1D Integer NumPy Arrays
We can explore whether the same findings from the previous benchmark hold for arrays of integer values.
In this case, we will update the example to generate an array of random integer values between 0 and 100 (inclusive) and then calculate the mean of the array using the same approaches.
The complete example is listed below.
    # SuperFastPython.com
    # benchmark calculating the mean of a 1d array of integers
    import numpy
    import timeit
    # define the data used in all runs
    rng = numpy.random.default_rng(1)
    A = rng.integers(0, 100+1, 500000)
    # number of times to run each snippet
    N = 100000
    # numpy.mean()
    result = timeit.timeit('numpy.mean(A)', globals=globals(), number=N)
    print(f'numpy.mean() {result:.3f} seconds')
    # numpy.average()
    result = timeit.timeit('numpy.average(A)', globals=globals(), number=N)
    print(f'numpy.average() {result:.3f} seconds')
    # numpy.nanmean()
    result = timeit.timeit('numpy.nanmean(A)', globals=globals(), number=N)
    print(f'numpy.nanmean() {result:.3f} seconds')
    # numpy.divide(numpy.sum(), len())
    result = timeit.timeit('numpy.divide(numpy.sum(A), len(A))', globals=globals(), number=N)
    print(f'numpy.divide(numpy.sum(), len()) {result:.3f} seconds')
    # numpy.sum()/len()
    result = timeit.timeit('numpy.sum(A)/len(A)', globals=globals(), number=N)
    print(f'numpy.sum()/len() {result:.3f} seconds')
    # numpy.sum()/shape[0]
    result = timeit.timeit('numpy.sum(A)/A.shape[0]', globals=globals(), number=N)
    print(f'numpy.sum()/shape[0] {result:.3f} seconds')
Running the example benchmarks each approach and reports the total execution time.

    numpy.mean() 29.992 seconds
    numpy.average() 30.485 seconds
    numpy.nanmean() 30.855 seconds
    numpy.divide(numpy.sum(), len()) 6.823 seconds
    numpy.sum()/len() 6.537 seconds
    numpy.sum()/shape[0] 6.528 seconds
We can restructure the output into a table for comparison.
    Approach                         | Time (sec)
    ---------------------------------|------------
    numpy.mean()                     | 29.992
    numpy.average()                  | 30.485
    numpy.nanmean()                  | 30.855
    numpy.divide(numpy.sum(), len()) | 6.823
    numpy.sum()/len()                | 6.537
    numpy.sum()/shape[0]             | 6.528
Interestingly, we can see that the numpy.nanmean() has the same general performance as numpy.mean() and numpy.average(). This is surprising until we realize that we cannot store numpy.nan values in integer arrays, simplifying any checks within numpy.nanmean().
The same general trend of manual calculation being faster than built-in functions holds with the integer data, but even more so.
We can see that the manual methods are all significantly faster than the function approaches. For example, numpy.sum()/shape[0] takes about 6.528 seconds compared to numpy.mean() at 29.992 seconds, a speed-up of about 4.594x.
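The speed-up figures can be reproduced directly from the reported times. The short sketch below divides the numpy.mean() time by the time for each approach, using the integer benchmark results from the table above; the dictionary and variable names are illustrative.

```python
# reproduce the speed-up factors from the reported integer benchmark times
times = {
    'numpy.mean()': 29.992,
    'numpy.average()': 30.485,
    'numpy.nanmean()': 30.855,
    'numpy.divide(numpy.sum(), len())': 6.823,
    'numpy.sum()/len()': 6.537,
    'numpy.sum()/shape[0]': 6.528,
}

# speed-up is the baseline time divided by the time for each approach
baseline = times['numpy.mean()']
for name, seconds in times.items():
    print(f'{name:<33} {baseline / seconds:.3f}x')

# speed-up of the fastest manual approach relative to numpy.mean()
speedup = baseline / times['numpy.sum()/shape[0]']
```

A factor above 1 means the approach was faster than numpy.mean() and a factor below 1 means it was slower.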
This highlights that if we have a lot of mean calculations on integer data, we should be performing these calculations manually if performance is important.
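One plausible contributor to the gap, offered as an assumption rather than something measured in this tutorial, is data type handling. The NumPy documentation states that numpy.mean() uses float64 for intermediate and return values when given integer input, whereas numpy.sum() keeps an integer accumulator for integer arrays. The sketch below inspects the data types involved and confirms both approaches give the same mean.

```python
# sketch: inspect the data types involved in the mean of an integer array
import numpy

# same kind of data as the integer benchmark
rng = numpy.random.default_rng(1)
A = rng.integers(0, 100+1, 500000)

# manual approach: integer sum, then a single true division
total = numpy.sum(A)
manual = total / A.shape[0]
# built-in approach: float64 is used for integer input
builtin = numpy.mean(A)

print(f'array dtype: {A.dtype}')
print(f'sum dtype:   {total.dtype}')
print(f'mean dtype:  {builtin.dtype}')
print(f'results match: {numpy.isclose(manual, builtin)}')
```

One caveat of the manual approach is that an integer sum can overflow if the values are very large, so it is worth checking the range of the data before relying on it.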
Further Reading
This section provides additional resources that you may find helpful.
Books
- Python Benchmarking, Jason Brownlee (my book!)
Also, the following Python books have chapters on benchmarking that may be helpful:
- Python Cookbook, 2013. (sections 9.1, 9.10, 9.22, 13.13, and 14.13)
- High Performance Python, 2020. (chapter 2)
Guides
- 4 Ways to Benchmark Python Code
- 5 Ways to Measure Execution Time in Python
- Python Benchmark Comparison Metrics
Benchmarking APIs
- time — Time access and conversions
- timeit — Measure execution time of small code snippets
- The Python Profilers
Takeaways
You now know how to benchmark the fastest approach to calculate the mean of a NumPy array in Python.
Did I make a mistake? See a typo?
I’m a simple humble human. Correct me, please!
Do you have any additional tips?
I’d love to hear about them!
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.