You can benchmark a Python program using the time.perf_counter() function, the timeit module, or the time Unix command.
Any of these approaches can be used to estimate the execution time of Python code.
Benchmarking is an important step when improving the execution speed of Python programs. Benchmark results provide hard numbers that can be reported and compared directly. This can help in choosing the fastest among variations of the same code and in checking whether the code meets performance requirements.
In this tutorial, you will discover how to benchmark a program in Python.
Let’s get started.
Need to Benchmark a Python Program
A Python program is a .py file or script executed by the Python interpreter.
A program can do many things and can be as long or short as we like. It may be a single statement, a function, many functions, and/or many objects.
There are many changes we could make to a Python program.
How do we know which version of a Python program is the fastest?
How can we benchmark a program in Python?
How to Benchmark a Python Program
There are three main ways to benchmark a Python program:
- Use time.perf_counter() function.
- Use the timeit module.
- Use the time Unix command.
Let’s take a closer look at each in turn.
Benchmark a Program With time.perf_counter()
We can benchmark a program in Python using the time.perf_counter() function.
The time.perf_counter() function is provided specifically for benchmarking.
It has three properties that make it the preferred function for manual benchmarking in Python over other functions, such as time.time(). They are:
- Non-Adjustable.
- Monotonic.
- High-precision.
The time.perf_counter() function reports the time from a clock that is not adjustable. This means that the clock cannot be changed either manually by the system administrator or automatically by clock synchronization or changes for daylight saving.
The time.perf_counter() function is monotonic, meaning that each value retrieved will be equal to or greater than the previous value. It will never return a value in the past relative to the last value returned.
Finally, the clock used by time.perf_counter() is high-precision, if available. This means we can benchmark Python code that executes very quickly, such as on the order of nanoseconds.
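We can confirm these properties on our own system using the time.get_clock_info() function from the standard library. For example (the exact implementation and resolution reported will vary by platform):

```python
# inspect the clock used by time.perf_counter()
import time

# report whether the clock is adjustable, monotonic, and its resolution
info = time.get_clock_info('perf_counter')
print(info)
```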
The procedure for benchmarking with the time.perf_counter() function is as follows:
- Move the entry point of the program into a main() function (if needed).
- Record time.perf_counter() before the main() function.
- Call the main() function.
- Record time.perf_counter() after the main() function.
- Subtract the start time from the end time to give the duration.
- Report the duration using print().
For example:
```python
# assumes the program has been reduced to a main() function
from time import perf_counter

# protect the entry point
if __name__ == '__main__':
    # record start time
    time_start = perf_counter()
    # execute the program
    main()
    # record end time
    time_end = perf_counter()
    # calculate the duration
    time_duration = time_end - time_start
    # report the duration
    print(f'Took {time_duration:.3f} seconds')
```
Next, let’s look at how we might benchmark a program using timeit.
Benchmark a Program With timeit.timeit()
The timeit module is provided in the Python standard library.
It provides an easy way to benchmark snippets of Python code.
It provides two interfaces for benchmarking: a command line interface and an API. In this case, we will focus on the API, specifically the timeit.timeit() function.
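For reference, the command line interface runs the timeit module directly via the Python interpreter. A sketch, using a hypothetical module and function name:

```
python -m timeit -s "from mymodule import myfunc" "myfunc()"
```

Here, the -s flag provides the setup code, and the statement to benchmark is given last.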
The timeit.timeit() function takes the Python statement to be benchmarked, such as a function call, as a string.
For example:
```python
...
# benchmark a python function
result = timeit.timeit('myfunc()')
```
Any Python code required to execute the benchmark code can be provided as a string to the “setup” argument.
This might include importing from the __main__ module so that required functions are available.
For example:
```python
...
# benchmark a python function with import in setup
result = timeit.timeit('main()', setup='from __main__ import main')
```
We can achieve the same thing via the "globals" argument by specifying the globals() collection.
For example:
```python
...
# benchmark a python function with globals collection
result = timeit.timeit('main()', globals=globals())
```
Finally, we can specify the number of repetitions of the benchmark code via the “number” argument.
By default, this is set to one million (1,000,000), although it can be set to a smaller number if the benchmark code takes a long time to execute.
For example:
```python
...
# benchmark a python function with a smaller number
result = timeit.timeit('main()', number=100)
```
Generally, the timeit module is intended to benchmark snippets of code. Typically, this is a single statement or a single function call. The timeit module is not well suited to benchmarking entire programs.
We can force the timeit module to benchmark an entire program via the timeit.timeit() function.
This can be done by reducing the program down to a main() function, then calling the main function as the “statement” to be benchmarked by timeit.
Benchmark a Program With the time Unix Command
The “time” command is a tool that can be used to measure the execution time of any program on the command line interface.
The command line or command line interface is a way of interacting with the computer using text commands, as opposed to clicking around on a graphical interface with a mouse.
The time command, also called the time Unix command, is a command line program for reporting the execution time of programs.
It is referred to as the “time Unix command” because it was originally developed for the Unix operating system.
The command is available on almost all Unix-like operating systems and may be implemented as an executable command (e.g. GNU time) or as a keyword in the shell (e.g. bash time).
The “time” command runs a program, such as a Python script, and reports the overall execution time.
When using time to benchmark the execution time of Python programs, it is prepended to the “python” command.
For example:
```
time python myscript.py
```
This will run the script normally.
Once the script is completed, a summary of the execution time will be reported.
And that’s all there is to it.
This approach is recommended if you want to benchmark the entire program without modifying it in any way.
It reports three benchmark results: real time, user time, and system time.
Real time is the wall-clock time, the overall duration of the program. The CPU time is calculated as the sum of user time and system time, which specify how long the program spent executing in each mode.
If the program sleeps or is blocked by other programs running at the same time, the CPU time may be shorter than the real time. This is because the clocks that record CPU time are paused while the program is blocked, and resume once the program resumes executing.
- Real-time: Wall-clock execution time of the program.
- CPU-time: Sum of user time and system time when executing the program.
- User-time: Time the CPU spent executing user code (your program and dependencies).
- System-time: Time the CPU spent executing in the kernel (operating system).
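We can observe the difference between real time and CPU time directly in Python. A minimal sketch using time.process_time(), which measures CPU time, alongside time.perf_counter(), for a program that mostly sleeps:

```python
# contrast wall-clock time with CPU time for a sleeping program
from time import perf_counter, process_time, sleep

start_real = perf_counter()   # wall-clock timer
start_cpu = process_time()    # CPU timer (user + system)
sleep(1)                      # blocked: consumes almost no CPU
real = perf_counter() - start_real
cpu = process_time() - start_cpu
print(f'real: {real:.3f}s, cpu: {cpu:.3f}s')  # cpu is close to zero
```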
The time command is not available on all platforms.
Generally, it is available on Unix and Unix-like systems, such as:
- Unix
- Linux
- macOS
It is not available on Windows platforms by default. Nevertheless, it can be installed as part of a third-party software package, such as Cygwin, or used via the Windows Subsystem for Linux (WSL).
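Alternatively, if you use PowerShell on Windows, the built-in Measure-Command cmdlet offers similar functionality, although it reports elapsed (wall-clock) time only. For example:

```
Measure-Command { python program.py }
```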
Now that we know how to benchmark a Python program, let’s look at some worked examples.
Example of Benchmarking a Python Program with time.perf_counter()
We can explore how to benchmark a Python program with time.perf_counter() in a worked example.
In this case, we will benchmark a function that creates a list of 100 million squared integers in a list comprehension.
The task() function implements this below.
```python
# define a custom function
def task():
    # do some work
    data = [i*i for i in range(100000000)]
```
We will call this function from a new main() function that captures the entry point to our program.
```python
# main function for script
def main():
    # call a function
    task()
```
We will record the time before and after the call to the main() function using time.perf_counter(), then calculate and report the difference as the benchmark time.
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# example of benchmarking a program with time.perf_counter()
from time import perf_counter

# function to benchmark
def task():
    # create a large list
    data = [i*i for i in range(100000000)]

# main function for script
def main():
    # call a function
    task()

# protect the entry point
if __name__ == '__main__':
    # record start time
    time_start = perf_counter()
    # execute the script
    main()
    # record end time
    time_end = perf_counter()
    # calculate the duration
    time_duration = time_end - time_start
    # report the duration
    print(f'Took {time_duration} seconds')
```
Running the example records the start time, executes the main() function, and then records the end time.
In this case, we can see that the benchmark time, calculated as the difference between the two times, is about 6.144 seconds.
Your results may vary. This is due to the natural variability in running a program on a computer. You can try to repeat the benchmark measurement many times and calculate the average result.
```
Took 6.144454502034932 seconds
```
Example of Benchmarking a Python Program with timeit.timeit()
We can explore how to benchmark a Python program with timeit.timeit() in a worked example.
In this case, we can update the above example to benchmark our main() function using the timeit.timeit() function.
This involves defining the target main function call as a string and providing it as an argument to the timeit() function.
It also requires setting the “globals” argument to be equal to the globals() collection so that the function definition is available to the timeit() function.
We will set the number of times to run the function to one via the “number” argument.
```python
...
# benchmark the program
time_duration = timeit('main()', globals=globals(), number=1)
```
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# example of benchmarking a program with timeit.timeit()
from timeit import timeit

# function to benchmark
def task():
    # create a large list
    data = [i*i for i in range(100000000)]

# main function for script
def main():
    # call a function
    task()

# protect the entry point
if __name__ == '__main__':
    # benchmark the program
    time_duration = timeit('main()', globals=globals(), number=1)
    # report the duration
    print(f'Took {time_duration} seconds')
```
Running the example executes the program one time using the timeit.timeit() function.
In this case, we can see that the benchmark result was about 6.206 seconds.
Your results may vary. This is due to the natural variability in running a program on a computer. You can try to repeat the benchmark measurement many times and calculate the average result.
```
Took 6.2056802730076015 seconds
```
Next, let’s look at an example of benchmarking the same program using the time Unix command.
Example of Benchmarking a Python Program with time Unix Command
We can benchmark our program using the time Unix command.
First, we must save our program to a file; any name will do. In this case, we use the name program.py.
The content of the file is as follows:
```python
# SuperFastPython.com
# example of our program
# function to benchmark
def task():
    # create a large list
    data = [i*i for i in range(100000000)]

# main function for script
def main():
    # call a function
    task()

# protect the entry point
if __name__ == '__main__':
    # execute our program
    main()
```
Normally we would execute the program from the command line using the python command.
For example:
```
python program.py
```
In order to benchmark the execution time of the Python program, we can prepend the “time” command to the execution of our script.
For example:
```
time python program.py
```
Running the example reports the duration both in terms of real-time (wall clock time) and CPU time (user + sys).
I recommend focusing on real-time.
In this case, we can see that the script took about 6.214 seconds to complete.
This highlights how we can benchmark a Python script using the time command.
Your results may vary. This is due to the natural variability in running a program on a computer. You can try to repeat the benchmark measurement many times and calculate the average result.
```
real    0m6.214s
user    0m5.172s
sys     0m1.006s
```
Frequently Asked Questions
This section lists frequently asked questions about benchmarking and their answers.
Do you have a question?
Share it in the comments below and I will do my best to help.
Why Does Each Approach Give Different Results?
Shouldn’t we get the same score regardless of the method used to benchmark?
No.
Each method introduces small differences in what code is executed and when. These differences show up as variation in benchmark scores and as differences between benchmark methods.
We must expect measures to differ between methods.
Instead of aiming for an absolute measure of the execution time of code, we should aim for relative differences, using one method consistently to measure different variations of the same code.
This allows us to compare relative execution time and choose code that performs “fastest” when measured in a consistent manner.
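For example, a sketch that benchmarks two hypothetical variants of the same code with the same method and settings, so that the results can be compared fairly:

```python
# compare two variants of the same code with a consistent method
from timeit import timeit

# variant 1: a list comprehension (hypothetical example)
def variant1():
    return [i*i for i in range(1000000)]

# variant 2: map with a lambda (hypothetical example)
def variant2():
    return list(map(lambda i: i*i, range(1000000)))

# benchmark both the same way on the same machine
t1 = timeit('variant1()', globals=globals(), number=10)
t2 = timeit('variant2()', globals=globals(), number=10)
print(f'variant1: {t1:.3f}s, variant2: {t2:.3f}s, ratio: {t2/t1:.2f}x')
```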
Why Do We Get Different Results Each Time?
We will get a different benchmark result each time.
This is because of the natural variability of running computer programs, caused by factors such as other programs running at the same time.
We can reduce this variability by repeating a benchmark measurement many times and calculating the average score. This will be a more stable and reliable estimate of the performance of the program.
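For example, the timeit.repeat() function runs the same benchmark many times. A minimal sketch that averages five runs of a hypothetical task() function:

```python
# repeat a benchmark and report the average for a more stable estimate
from timeit import repeat
from statistics import mean

# hypothetical function to benchmark
def task():
    return sum(i*i for i in range(100000))

# run the benchmark five times, one call per run
results = repeat('task()', globals=globals(), repeat=5, number=1)
print(f'mean: {mean(results):.6f} seconds')
```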
Which Benchmark Method Should I Use?
Use the method that best suits you.
Perhaps you prefer the manual control that time.perf_counter() provides.
Perhaps you prefer to use the timeit API.
Perhaps you prefer the command line interface.
Use the method that is the best fit for your style, then use it consistently when comparing variations of the same code.
How Do We Present Benchmark Results?
It is important to choose a numerical precision (resolution) that best suits your benchmark results.
Too much precision is confusing and too little can hide detail. You must strike the right balance.
It is also important to choose a unit of measure that best suits your measures.
This might be hours, minutes, seconds, milliseconds, microseconds, or nanoseconds.
The choice of the unit of measure also interacts with the numerical precision of the results.
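As a sketch, a hypothetical helper function might scale a result in seconds to a readable unit with sensible precision:

```python
# choose a unit of measure to suit the magnitude of a benchmark result
def format_duration(seconds):
    # pick the largest unit that yields a readable number
    if seconds >= 60:
        return f'{seconds / 60:.2f} minutes'
    if seconds >= 1:
        return f'{seconds:.3f} seconds'
    if seconds >= 1e-3:
        return f'{seconds * 1e3:.3f} milliseconds'
    if seconds >= 1e-6:
        return f'{seconds * 1e6:.3f} microseconds'
    return f'{seconds * 1e9:.1f} nanoseconds'

print(format_duration(6.144454502034932))  # 6.144 seconds
```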
What About the profile and cProfile Modules?
Profiling is different from benchmarking.
Benchmarking is about estimating the performance of code, typically for the purposes of comparison.
Profiling is about discovering why code is slow, and about what is occupying time when code executes.
We can use profiling to understand code and change it in order to improve a benchmark result.
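For example, we could profile the program from this tutorial (saved as program.py) with the standard library cProfile module to see which functions occupy the most time:

```
python -m cProfile -s cumulative program.py
```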
Profiling should not be used for benchmarking, as the profiler itself adds overhead that skews the timings.
Further Reading
This section provides additional resources that you may find helpful.
Books
- Python Benchmarking, Jason Brownlee (my book!)
Also, the following Python books have chapters on benchmarking that may be helpful:
- Python Cookbook, 2013. (sections 9.1, 9.10, 9.22, 13.13, and 14.13)
- High Performance Python, 2020. (chapter 2)
Guides
- 4 Ways to Benchmark Python Code
- 5 Ways to Measure Execution Time in Python
- Python Benchmark Comparison Metrics
Benchmarking APIs
- time — Time access and conversions
- timeit — Measure execution time of small code snippets
- The Python Profilers
Takeaways
You now know how to benchmark a program in Python.
Did I make a mistake? See a typo?
I’m a simple humble human. Correct me, please!
Do you have any additional tips?
I’d love to hear about them!
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.
Photo by Josh Sorenson on Unsplash