Benchmark Fastest Way To Sort NumPy Arrays

November 11, 2023 Python Benchmarking

You can benchmark functions and algorithms to sort NumPy arrays to discover the fastest approaches to use.

Generally, it is slightly faster to sort a NumPy array in place rather than to create a sorted copy. This approach should be preferred if the program requirements can support it.

The choice of sorting algorithm used by the sort() function matters. Generally, the default of quicksort is probably the best for randomly ordered numerical data, although it is a good idea to test different algorithms with samples of data to confirm this assumption holds for your specific data.

In this tutorial, you will discover how to benchmark and discover the fastest way to sort NumPy arrays in Python.

Let's get started.

Need Fast Sorting of NumPy Arrays

Sorting NumPy arrays in our Python program is a common operation.

It seems straightforward, call sort(), and the array is sorted.

But is this the fastest method?

In fact, there are a few ways we can sort NumPy arrays, from in-place to a sorted copy, to an array of indexes of sorted elements.

For example, some functions include:

Additionally, there are different sorting algorithms we can use that have different performance characteristics.

This includes algorithms such as:

In fact, internally, a NumPy sort() function may choose an algorithm based on the properties of the data.

The default is 'quicksort'. Note that both 'stable' and 'mergesort' use timsort or radix sort under the covers and, in general, the actual implementation will vary with data type. The 'mergesort' option is retained for backwards compatibility.

-- numpy.sort

These algorithms have different theoretical capabilities, but what is their performance on your specific array with your specific data?

Given that the algorithm may change based on the data, we cannot know the expected performance profile beforehand.

We must benchmark.

What is the fastest way to sort your NumPy arrays?

Benchmark Sort NumPy Arrays

We can explore the question of how fast the different approaches to sort NumPy arrays are using benchmarking.

In this case, we will use an approach to sort an array of random numbers of a modest fixed size, then repeat this process many times to give an estimated time. We can then compare the times to see the relative performance of the approaches tested.

You can use this approach to benchmark your own favorite NumPy array operations.

If you use or extend the NumPy benchmarking approach used in this tutorial, let me know in the comments below. I'd love to see what you come up with.

We could use the time.perf_counter() function directly and develop a helper function to perform the benchmarking and report results.

You can learn more about benchmarking with the time.perf_counter() function in the tutorial:

Instead, in this case, we will use the timeit API.

We cannot use the timeit.timeit() function as if we use the "setup" argument to prepare an array of random data to sort, it will only be prepared once and then sorted again and again.

Instead, we will use the timeit.repeat() function as it allows us to repeat the same benchmark test many times. Each repeat of the test will perform the same setup, allowing us to create a new array of random data to sort for each benchmark.

For example:

...
# define string to create data array to sort
DATA_STR = "rng=numpy.random.default_rng(1);A=..."
# number of times to run each snippet
N = 2000
# benchmark a thing
result = timeit.repeat('...', setup=DATA_STR, globals=globals(), number=1, repeat=N)
print(f'result {sum(result):.3f} seconds')

You can learn more about benchmarking with the timeit.repeat() function in the tutorial:

The number of runs in each benchmark was tuned to ensure that each snippet was executed in more than one second and less than about 10 seconds.

Let's get started.

Fastest Way to Sort 1D NumPy Array of Floats

We can explore the fastest way to sort a modestly sized NumPy array of random floating point values in [0,1).

In this case, we will use a pref-defined "setup" code snippet that creates a random number generator with a given seed and creates a 1D array of 100,000 random floats. We will then execute the given sorting approach 2,000 times and report the total execution time.

The approaches we will compare include the most common NumPy functions for sorting a 1d array, including:

Generally, we may expect that sorting in place would be slightly faster as no new array has to be created with sorted values or indexes of sorted values.

The complete example is listed below.

# SuperFastPython.com
# benchmark sorting a 1d array of random floats
import numpy
import timeit
# define string to create data array to sort
DATA_STR = "rng=numpy.random.default_rng(1);A=rng.random(100000)"
# number of times to run each snippet
N = 2000
# numpy.sort()
result = timeit.repeat('numpy.sort(A)', setup=DATA_STR, globals=globals(), number=1, repeat=N)
print(f'numpy.sort() {sum(result):.3f} seconds')
# ndarray.sort()
result = timeit.repeat('A.sort()', setup=DATA_STR, globals=globals(), number=1, repeat=N)
print(f'ndarray.sort() {sum(result):.3f} seconds')
# numpy.argsort()
result = timeit.repeat('A[numpy.argsort(A)]', setup=DATA_STR, globals=globals(), number=1, repeat=N)
print(f'numpy.argsort() {sum(result):.3f} seconds')

Running the example benchmarks each approach and reports the sum execution time.

numpy.sort() 10.568 seconds
ndarray.sort() 10.457 seconds
numpy.argsort() 15.666 seconds

We can restructure the output into a table for comparison.

Approach        | Time (sec)
----------------|------------
numpy.sort()    | 10.568
ndarray.sort()  | 10.457
numpy.argsort() | 15.666

We can see that we expected the ndarray.sort() approach of sorting the array in place was the fastest.

The difference was minor, perhaps 100 milliseconds, or less given random variation. Any difference between ndarray.sort() and numpy.sort() will be due to the creation of a new array required for the result.

Nevertheless, this confirms that in general, we should try to sort data in place if our program can accommodate it.

Next, let's see if the same expectations and confirmation hold for arrays of integers.

Fastest Way to Sort 1D NumPy Array of Integers

We can explore the fastest way to sort a modestly sized NumPy array of random integer values in [0,100].

Again, we will use the "setup" argument to create a random number generator and create an array to be sorted, in this case, an array of 100,000 random integers between 0 and 100 (inclusive).

We will benchmark and compare the same three approaches used for sorting a 1D array.

The complete example is listed below.

# SuperFastPython.com
# benchmark sorting a 1d array of random integers
import numpy
import timeit
# define string to create data array to sort
DATA_STR = "rng=numpy.random.default_rng(1);A=rng.integers(0,100+1,100000)"
# number of times to run each snippet
N = 2000
# numpy.sort()
result = timeit.repeat('numpy.sort(A)', setup=DATA_STR, globals=globals(), number=1, repeat=N)
print(f'numpy.sort() {sum(result):.3f} seconds')
# ndarray.sort()
result = timeit.repeat('A.sort()', setup=DATA_STR, globals=globals(), number=1, repeat=N)
print(f'ndarray.sort() {sum(result):.3f} seconds')
# numpy.argsort()
result = timeit.repeat('A[numpy.argsort(A)]', setup=DATA_STR, globals=globals(), number=1, repeat=N)
print(f'numpy.argsort() {sum(result):.3f} seconds')

Running the example benchmarks each approach and reports the sum execution time.

numpy.sort() 6.108 seconds
ndarray.sort() 5.986 seconds
numpy.argsort() 7.077 seconds

We can restructure the output into a table for comparison.

Approach        | Time (sec)
----------------|------------
numpy.sort()    | 6.108
ndarray.sort()  | 5.986
numpy.argsort() | 7.077

We can see that the same general pattern holds as we saw with sorting arrays of floating point values.

It is faster to sort the array of integers in place, then to create a copy of the sorted array, and finally to create and use an array of indices to sorted values.

This confirms that when sorting arrays of integers we should sort the array in place, if possible.

Next, let's explore the effect of different sorting algorithms.

Fastest Sort Algorithm to Sort 1D NumPy Array of Floats

We can explore the benchmark execution time of choosing different sort algorithms when sorting a 1D array of random floating point values.

The numpy.sort() function offers four algorithms and may choose or change the algorithm used based on the type of data provided.

In this case, we will explore sorting the same array of random floats with each sort algorithm (created anew for each test).

We may expect that the default algorithm of quicksort will perform best for an array of randomly ordered numerical values.

Overall, it is slightly faster than merge sort and heapsort for randomized data, particularly on larger distributions

-- Quicksort, Wikipedia.

The complete example is listed below.

# SuperFastPython.com
# benchmark sort algorithms for a 1d array of random floats
import numpy
import timeit
# define string to create data array to sort
DATA_STR = "rng=numpy.random.default_rng(1);A=rng.random(100000)"
# number of times to run each snippet
N = 3000
# kinds of sort algorithms
kinds = ['quicksort', 'mergesort', 'heapsort', 'stable']
# sort array with each algorithm
for kind in kinds:
    # ndarray.sort()
    result = timeit.repeat(f"A.sort(kind='{kind}')", setup=DATA_STR, globals=globals(), number=1, repeat=N)
    print(f'ndarray.sort({kind}) {sum(result):.3f} seconds')

Running the example benchmarks each approach and reports the sum execution time.

ndarray.sort(quicksort) 15.801 seconds
ndarray.sort(mergesort) 19.560 seconds
ndarray.sort(heapsort) 26.911 seconds
ndarray.sort(stable) 19.600 seconds

We can restructure the output into a table for comparison.

Approach                | Time (sec)
------------------------|------------
ndarray.sort(quicksort) | 15.801
ndarray.sort(mergesort) | 19.560
ndarray.sort(heapsort)  | 26.911
ndarray.sort(stable)    | 19.600

The result is interesting, there is a notable difference between the results.

We can see that the default, "quicksort" gave the fastest result in this case.

We can see that "mergesort" and "stable" follow closely, and may be using the same algorithm under the covers given the closeness of the result and the comment in the function definition that they may both be implemented using "radix sort".

The slowest, in this case, was "heapsort".

One approach in choosing a sorting algorithm is analytically, based on the data in the array (how sorted it may already be, etc.) and the expected performance of an algorithm for the data.

We should still do this, but we should also benchmark to confirm our analytical analysis (assumptions) was correct.

Takeaways

You now know how to benchmark and discover the fastest way to sort NumPy arrays in Python.



If you enjoyed this tutorial, you will love my book: Python Benchmarking. It covers everything you need to master the topic with hands-on examples and clear explanations.