Python code can be slow.
We can benchmark Python code to discover exactly how slow it is, and then test changes to the code to confirm that they have the desired effect.
This course provides you with a 7-day crash course in Python Benchmarking.
You will get a taste of what is possible and hopefully, skills that you can bring to your next project.
Let’s get started.
Course Overview
Hi, thanks for your interest in Python Benchmarking.
As a bonus, you get FREE access to my Python Benchmarking 7-day email course.
You have a lot of fun ahead, including:
- Lesson 01: Importance of Python Benchmarking (today)
- Lesson 02: Benchmark with time.perf_counter()
- Lesson 03: Develop a Benchmark Helper Function
- Lesson 04: Develop a Benchmark Context Manager
- Lesson 05: Tips for Reporting Benchmark Results
- Lesson 06: Benchmark with timeit API
- Lesson 07: Benchmark with timeit Command Line
You can download a .zip of all the source code for this course here:
Your first lesson is just below.
Before you dive in, a quick question:
Why are you interested in Python Benchmarking?
Let me know in the comments below.
Lesson 01: Importance of Python Benchmarking
Hi, benchmarking execution time is an essential step in improving the performance of code.
Benchmarking is a systematic and methodical process of evaluating the performance of software by measuring how it executes specific tasks or processes.
Code performance is a critical aspect of modern software development. Typically the performance of a program is a requirement of the project from the beginning, ensuring responses are timely and user experience is consistent.
Therefore we cannot neglect performance benchmarking.
Benchmarking code involves many aspects to ensure accurate and meaningful performance measurements.
Nevertheless, the three primary aspects of benchmarking code are:
- Measurement and metrics.
- Controlled testing conditions.
- Interpretation and analysis.
There are many aspects of a Python program that we could benchmark, so why focus on execution time?
Benchmarking code execution time is a fundamental practice for delivering efficient and responsive software applications.
The main reason we benchmark execution time is to improve performance.
- Performance Optimization: Benchmarking helps identify performance bottlenecks and areas where code execution can be optimized. By measuring and analyzing execution times, we can focus optimization efforts on the most critical sections of code.
We measure the performance so that we can improve it.
This typically involves a baseline measurement, then a measurement of performance after each change to confirm the change moved performance in the right direction.
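For example, a minimal sketch of this workflow (not part of the lesson, and using time.perf_counter(), which is covered in Lesson 02) might record a baseline for one implementation and then re-measure an alternative:

# sketch: measure a baseline, then re-measure after a change (hypothetical implementations)
import time

# baseline implementation
def baseline():
    return [i**2 for i in range(1000000)]

# changed implementation we hope is faster
def changed():
    return [i*i for i in range(1000000)]

# record the baseline duration
time_start = time.perf_counter()
baseline()
print(f'baseline: {time.perf_counter() - time_start:.3f} seconds')

# record the duration after the change and compare against the baseline
time_start = time.perf_counter()
changed()
print(f'changed: {time.perf_counter() - time_start:.3f} seconds')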
Neglecting the benchmarking of code execution time can have several significant consequences, which can impact both the development process and the quality of the software being developed.
Some of the key consequences include:
- Performance Issues
- Resource Inefficiency
- Scalability Problems
- Poor User Experience
- Long-Term Costs
To avoid these consequences, it’s crucial to take code benchmarking seriously as an integral part of the software development process.
You can learn more about the importance of Python Benchmarking here:
Lesson 02: Benchmark with time.perf_counter()
Hi, we can benchmark Python code using the time.perf_counter() function.
The time.perf_counter() function reports the value of a performance counter on the system. It does not report the time since epoch like time.time().
The returned value is in seconds with fractional components (e.g. milliseconds and nanoseconds), providing a high-resolution timestamp.
Calculating the difference between two timestamps from the time.perf_counter() allows high-resolution execution time benchmarking.
The timestamp from the time.perf_counter() function is consistent, meaning that two durations can be compared relative to each other in a meaningful way.
The perf_counter() function was specifically designed to overcome the limitations of other time functions like time.time().
Specifically, we should prefer to use time.perf_counter() over time.time() for 3 reasons:
- Precision. The clock used by time.perf_counter() has higher precision than time.time().
- Adjustment. The clock used by time.perf_counter() cannot be adjusted, whereas the clock used by time.time() can be.
- Monotonic. The clock used by time.perf_counter() will never return a time in the past, whereas time.time() may.
A key limitation of time.time() is that it is based on the system time. The system time can be changed at any time by the user, by daylight savings, and by synchronization with a time server.
Therefore the time.time() function is not appropriate for benchmarking.
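As a quick check (a minimal sketch, not part of the lesson), the standard library time.get_clock_info() function reports the resolution, adjustability, and monotonicity of both clocks:

# compare the clocks behind time.time() and time.perf_counter()
import time

# report resolution, adjustable and monotonic properties of the system clock
print(time.get_clock_info('time'))
# report the same properties of the performance counter
print(time.get_clock_info('perf_counter'))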
You can learn more about time.perf_counter() vs time.time() in the tutorial:
We can use the time.perf_counter() function to benchmark arbitrary Python statements.
The procedure is as follows:
- Record time.perf_counter() before the statement.
- Execute the statement.
- Record time.perf_counter() after the statement.
- Subtract the start time from the end time to give the duration.
- Report the duration using print().
For example:
# SuperFastPython.com
# example of benchmarking a statement with time.perf_counter()
import time
# record start time
time_start = time.perf_counter()
# execute the statement
data = [i*i for i in range(100000000)]
# record end time
time_end = time.perf_counter()
# calculate the duration
time_duration = time_end - time_start
# report the duration
print(f'Took {time_duration} seconds')
Run the program and note how long it takes to execute.
Vary the statement that is being benchmarked.
Let me know what you discover.
You can learn more about how to use the perf_counter() function for benchmarking in the tutorial:
Lesson 03: Develop a Benchmark Helper Function
Hi, we can develop a helper function to benchmark target functions in Python.
This is a good practice as it can encode best practices, avoid bugs, and can be reused on any of our projects in the future.
Our benchmark helper function can take the target function that we wish to benchmark as an argument, along with any arguments the target function requires.
Our function can then record the start time, call the target function, record the end time, and report the overall duration.
We can use our helper function to benchmark arbitrary functions.
The example below defines a task() function and then uses our helper function to benchmark it.
# SuperFastPython.com
# example of benchmarking using a custom function
import time

# benchmark function
def benchmark(fun, *args):
    # record start time
    time_start = time.perf_counter()
    # call the custom function
    fun(*args)
    # record end time
    time_end = time.perf_counter()
    # calculate the duration
    time_duration = time_end - time_start
    # report the duration
    print(f'Took {time_duration} seconds')

# function to benchmark
def task():
    # create a large list
    data = [i*i for i in range(100000000)]

# protect the entry point
if __name__ == '__main__':
    # benchmark the task() function
    benchmark(task)
Run the program and note how long it takes to execute.
Vary the function that is being benchmarked.
Let me know what you discover.
You can learn more about how to develop a helper benchmark function in the tutorial:
Lesson 04: Develop a Benchmark Context Manager
Hi, we can develop a custom context manager to automatically benchmark code in Python.
Recall that a context manager is an object that defines a runtime context to be established when executing a with statement.
We can define a context manager that automatically measures and reports the execution time of all code within the body block.
This means that we can easily benchmark arbitrary blocks of code, not just a single statement or function.
This requires that we define a new class that implements the __init__() constructor along with the __enter__() and __exit__() methods.
- The __init__() constructor can take a name argument for the benchmark case and store it in an object attribute.
- The __enter__() method can record the start time and store it in an object attribute.
- The __exit__() method can then record the end time, calculate and store the duration, and report the calculated duration along with the name of the benchmark case.
We can then use the Benchmark context manager to measure and report the execution time of arbitrary blocks of code.
For example:
# SuperFastPython.com
# example of a benchmark context manager
import time

# define the benchmark context manager
class Benchmark(object):
    # constructor
    def __init__(self, name):
        # store the name of this benchmark
        self.name = name

    # enter the context manager
    def __enter__(self):
        # record the start time
        self.time_start = time.perf_counter()
        # return this object
        return self

    # exit the context manager
    def __exit__(self, exc_type, exc_value, traceback):
        # record the end time
        self.time_end = time.perf_counter()
        # calculate the duration
        self.duration = self.time_end - self.time_start
        # report the duration
        print(f'{self.name} took {self.duration:.3f} seconds')
        # do not suppress any exception
        return False

# function to benchmark
def task():
    # create a large list
    data = [i*i for i in range(100000000)]

# protect the entry point
if __name__ == '__main__':
    # create the benchmark context
    with Benchmark('Task'):
        # run the task
        task()
Run the program and note how long it takes to execute.
Vary the code that is being benchmarked.
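For example (a sketch that reuses the Benchmark class defined above), the context manager can wrap any inline block of code, not just a single function call:

# benchmark an arbitrary block of code with the Benchmark context manager
with Benchmark('Squares and sum'):
    # create a large list
    data = [i*i for i in range(10000000)]
    # sum the list
    total = sum(data)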
Let me know what you discover.
You can learn more about how to develop a helper benchmark context manager in the tutorial:
Lesson 05: Tips for Reporting Benchmark Results
Hi, we must carefully choose the level of precision and units of measure when presenting benchmark results.
These are the two main areas that can introduce confusion and unnecessary cognitive load when attempting to interpret, analyze, and compare benchmark results.
Getting precision and units of measure correct will go a long way to ensuring execution time benchmark results are presented well.
We may need to report benchmark results to many people such as:
- A team lead or manager.
- Peer developers in the same team.
- Project stakeholders.
Presenting raw results can be a problem.
This is typically for two main reasons:
- The precision of the benchmark results is often too high, leading to confusion.
- The units of measure may be missing or limited to the default of seconds, which may not be appropriate.
We can therefore focus on these two areas when presenting results: measurement precision and units of measure.
Tips for Measurement Precision
The precision of the result refers to the number of decimal places used to present results.
The default for benchmark results will be full double-precision floating point, which is roughly 16 significant digits on most platforms.
This can be confusing to managers, fellow developers, and stakeholders alike.
Below are some tips with regard to measurement precision when presenting results.
- Tip 01: Don’t show too much precision as it can be a distraction.
- Tip 02: Don’t show too little precision as it can hide detail.
- Tip 03: Be consistent when presenting all results.
- Tip 04: Prefer to truncate over rounding as the results of rounding can be confusing.
- Tip 05: Avoid scientific notation as few people understand it.
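To illustrate Tips 01 and 04 (a minimal sketch, not from the lesson, using a hypothetical duration value), an f-string can limit the reported precision, and simple integer arithmetic can truncate rather than round:

# hypothetical duration in seconds
duration = 1.23456789
# f-string formatting rounds to the requested number of decimal places
print(f'Took {duration:.3f} seconds')
# truncate to 3 decimal places instead of rounding
truncated = int(duration * 1000) / 1000
print(f'Took {truncated:.3f} seconds')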
Tips for Measurement Units
Units of measure refer to what the measure represents.
The default measure for almost all measurement functions is seconds, although nanosecond versions of most functions do exist.
Seconds may or may not be the best measure to use for a given set of benchmark results.
- Tip 01: Know the difference between measures (nano, micro, milli, etc.).
- Tip 02: Don’t use a measure that is too low in scale as the numbers will be too large.
- Tip 03: Don’t use a measure that is too high in scale as the detail will be hidden.
- Tip 04: Default to seconds if you’re unsure of what units to use.
- Tip 05: Always include the units when reporting results.
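For example (a minimal sketch, not from the lesson, using a hypothetical duration value), a duration recorded in seconds can be scaled to milliseconds or microseconds before reporting, always stating the unit:

# hypothetical duration in seconds
duration_sec = 0.000123456
# scale to milliseconds and microseconds
duration_ms = duration_sec * 1000
duration_us = duration_sec * 1000000
# always include the units in the report
print(f'Took {duration_sec:.6f} seconds')
print(f'Took {duration_ms:.3f} milliseconds')
print(f'Took {duration_us:.1f} microseconds')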
You can learn more about Python Benchmarking best practices here:
Lesson 06: Benchmark with timeit API
Hi, we can benchmark snippets of Python code using the timeit.timeit() function.
The timeit module is provided in the Python standard library.
It provides an easy way to benchmark single statements and snippets of Python code.
The timeit.timeit() function benchmarks Python code and reports the duration in seconds.
The function takes the Python statement to be benchmarked as a string.
Any Python code required to execute the benchmark code can be provided as a string to the “setup” argument. This might include defining a variable.
Alternatively, if we have defined code in our program that is required to execute the benchmark code, we can specify the “globals” argument to provide the entire namespace.
For example, we can pass globals() or locals() so that the namespace of our current program, including any custom variables and functions, is available to the benchmarked code.
Finally, we can specify the number of repetitions of the benchmark code via the “number” argument.
By default, this is set to one million (1,000,000), although it can be set to a smaller number if the benchmark code takes a long time to execute.
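For example (a sketch, not the lesson's example; the data variable and task() function here are hypothetical), the "setup" and "globals" arguments might be used like this:

# sketch of the setup and globals arguments to timeit.timeit()
import timeit

# provide setup code as a string via the setup argument
time_setup = timeit.timeit('sum(data)', setup='data = list(range(1000))', number=10000)
print(f'setup: {time_setup} seconds')

# function defined in our own program (hypothetical)
def task():
    return [i*i for i in range(1000)]

# provide our program namespace via the globals argument
time_globals = timeit.timeit('task()', globals=globals(), number=10000)
print(f'globals: {time_globals} seconds')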
The example below uses the timeit.timeit() function to benchmark and compare the execution time of two ways of creating a list of squared integer values.
# SuperFastPython.com
# example of benchmarking a list of numbers with timeit.timeit()
import timeit
# benchmark the i*i method
time_method1 = timeit.timeit('[i*i for i in range(1000)]', number=100000)
# report the duration
print(f'i*i: {time_method1} seconds')
# benchmark the i**2 method
time_method2 = timeit.timeit('[i**2 for i in range(1000)]', number=100000)
# report the duration
print(f'i**2: {time_method2} seconds')
Run the program and note how long it takes to execute.
Vary the code that is being benchmarked.
Let me know what you discover.
You can learn more about how to benchmark Python with the timeit.timeit() function in the tutorial:
Lesson 07: Benchmark with timeit Command Line
Hi, we can benchmark snippets of Python code on the command line by using the timeit module.
Recall that the timeit module is provided in the Python standard library. It provides an easy way to benchmark single statements and snippets of Python code.
We saw in the previous lesson how we can use the timeit Python API.
The module also provides a command line interface that can be used to benchmark code snippets. It is probably the more popular interface.
Recall that the command line interface is a way of interacting with the computer using text commands, as opposed to clicking around on a graphical interface with a mouse.
Python can be used on the command line directly via the “python” command.
A Python module can be run as a command on the command line directly via the -m flag, followed by the module name.
The timeit module can be run directly in this way, for example:
python -m timeit [-n N] [-r N] [-u U] [-s S] [-h] [statement ...]
We can use the -n flag to specify the number of times to run a statement, the -r flag to specify the number of times to repeat a test and the -s flag to specify any setup statements for the target code.
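For example (a hypothetical command, not from the lesson), we might benchmark a statement with explicit setup, loop, and repeat counts:

python -m timeit -n 1000 -r 5 -s "data = list(range(1000))" "sum(data)"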
The result is a benchmark result with the format:
[n] loops, best of [r]: [time] [units] per loop
Here, n is the number of loop iterations, r is the number of repeats, time is the best per-loop time across the repeats, and units is the unit of measure.
We can use the command line interface to benchmark different ways to create a list of squared integers.
For example, we can benchmark the i*i method:
python -m timeit "[i*i for i in range(1000)]"
We can also benchmark the i**2 method:
python -m timeit "[i**2 for i in range(1000)]"
Run both timeit benchmark commands on the command line and compare the results.
Vary the code that is being benchmarked.
Let me know what you discover.
You can learn more about how to use the timeit command line interface in the tutorial:
Further Reading
This section provides additional resources that you may find helpful.
Books
- Python Benchmarking, Jason Brownlee (my book!)
Also, the following Python books have chapters on benchmarking that may be helpful:
- Python Cookbook, 2013. (sections 9.1, 9.10, 9.22, 13.13, and 14.13)
- High Performance Python, 2020. (chapter 2)
Guides
- 4 Ways to Benchmark Python Code
- 5 Ways to Measure Execution Time in Python
- Python Benchmark Comparison Metrics
Benchmarking APIs
- time — Time access and conversions
- timeit — Measure execution time of small code snippets
- The Python Profilers
Thank You
Hi, thank you for letting me help you learn more about Python Benchmarking.
If you ever have any questions about this course or Python concurrency in general, please reach out. Just ask in the comments below.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.
Did you enjoy this course?
Let me know in the comments below.