You can use the timeit module to perform microbenchmarking in Python.
The timeit module supports both a Python API and a command line interface and encapsulates best practices when benchmarking such as using a high-precision timer and repeating a benchmark many times.
In this tutorial, you will discover how to perform microbenchmarking in Python.
Let’s get started.
What is Microbenchmarking?
Microbenchmarking is a specialized form of benchmarking focused on evaluating the performance of very small and specific code snippets or functions within a program.
Unlike traditional benchmarking, which assesses the overall performance of an entire application or system, microbenchmarking zooms in on isolated, fine-grained code segments.
As such, it is sometimes called a component benchmark or a snippet benchmark.
Key characteristics of microbenchmarking include:
- Small Code Samples: Microbenchmarks typically examine short and self-contained code segments, often limited to a single function or operation.
- Precise Measurement: The goal is to obtain highly precise performance measurements, often at the microsecond or nanosecond level, to uncover subtle performance differences.
- Isolation: Microbenchmarks are carefully isolated from external factors that might influence the results, such as I/O operations, network access, or complex dependencies.
- Focused Analysis: They provide insights into low-level performance details, like function call overhead, loop iteration times, or the cost of arithmetic operations.
- Repeated Runs: Microbenchmarks are typically run multiple times to reduce the impact of variability in performance measurements (a minimal sketch of this pattern follows the list).
- Tooling: Specialized tools and libraries designed for microbenchmarking are used to ensure accurate measurements and reduce sources of error.
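To make the last two characteristics concrete, below is a minimal sketch of the general pattern: use time.perf_counter() as a high-precision timer, repeat the measurement several times, and keep the best (lowest) result. The snippet() function is a hypothetical stand-in for the code under test.

import time

def snippet():
    # hypothetical code under test: a small, self-contained operation
    return sum(i * i for i in range(1000))

# repeat the measurement to reduce the impact of variability
results = []
for _ in range(5):
    start = time.perf_counter()
    for _ in range(1000):
        snippet()
    results.append(time.perf_counter() - start)

# report the best (lowest) time, which has the least interference
print(f'Best of 5: {min(results) / 1000 * 1e6:.2f} usec per call')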
Microbenchmarking is particularly useful in scenarios where absolute performance and optimization at the smallest scale are critical, such as when developing low-level libraries, evaluating the impact of compiler optimizations, or comparing the efficiency of different algorithms for specific operations.
It’s a valuable technique for developers aiming to fine-tune code for maximum efficiency in situations where even minor performance gains matter.
History of Microbenchmarking
Microbenchmarking has a relatively recent history in the realm of software development and performance analysis.
It emerged as a response to the growing need for fine-grained performance measurement and optimization in computing applications. Here’s a brief history of microbenchmarking:
In the early days of software development, performance optimization focused on the macro scale. Developers were primarily concerned with optimizing entire applications, and detailed performance analysis of smaller code segments was often overlooked. However, as technology advanced and the demand for high-performance computing increased, the need for more granular performance analysis became evident.
The term “microbenchmarking” gained recognition in the early 21st century, coinciding with the development of specialized tools and libraries for conducting fine-grained performance tests. These tools allowed developers to isolate and measure the performance of small, specific code segments within their applications. Microbenchmarking became especially important in fields where milliseconds or even nanoseconds mattered, such as real-time systems, high-frequency trading, and scientific computing.
One notable development in microbenchmarking was the introduction of benchmarking libraries for various programming languages. These libraries offered standardized and repeatable methods for measuring the performance of code segments, enabling developers to share and compare their results with a broader community. These tools often included features for statistical analysis, allowing developers to assess the stability and reliability of their benchmarking results.
In recent years, the use of microbenchmarking has expanded beyond low-level programming tasks. It has become an essential tool for assessing the efficiency of common programming constructs, algorithms, and data structures. Microbenchmarking is now standard practice for evaluating the performance of arithmetic operations, string manipulation, data structure access, and more.
Microbenchmarking also plays a crucial role in the development and optimization of programming languages themselves. Language designers and compiler developers use microbenchmarking to evaluate the impact of language features and compiler optimizations on code performance. This helps refine language specifications and identify areas for improvement.
Today, microbenchmarking remains a critical component of software development, contributing to the creation of faster, more efficient applications and the continuous advancement of programming languages and compilers. As computing technology continues to evolve, the role of microbenchmarking in fine-tuning code for optimal performance will likely become even more prominent.
Cases For Microbenchmarking
Microbenchmarking involves assessing the performance of very specific, fine-grained code segments.
Below are some examples of areas where microbenchmarking can be applied:
- Arithmetic Operations: Evaluating the speed of basic arithmetic operations like addition, multiplication, or division to determine which is the most efficient for a given context.
- String Manipulation: Comparing the performance of different string manipulation operations, such as concatenation, substring extraction, or searching.
- Data Structure Operations: Assessing the efficiency of common data structure operations, like appending to a list, adding elements to a dictionary, or accessing elements in an array.
- Function Call Overhead: Measuring the overhead associated with function calls, especially for small functions, to understand the cost of abstraction.
- Loop Iterations: Analyzing the time it takes to execute loops with various numbers of iterations to identify the impact of loop size on performance.
- Regular Expressions: Testing the performance of regular expression patterns against different input strings to optimize text processing tasks.
- Bitwise Operations: Assessing the speed of bitwise operations, such as bitwise AND, OR, XOR, or bit shifting, for operations like data compression or encryption.
- Memory Allocation: Evaluating the time taken for memory allocation and deallocation, comparing techniques like stack allocation and heap allocation.
- File I/O: Measuring file read and write performance for small or specific file sizes to optimize data access operations.
- Mathematical Functions: Benchmarking mathematical functions like square roots, trigonometric operations, and exponentiation to identify the fastest implementations (see the sketch after this list).
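For example, below is a minimal sketch of the last case using the timeit module, comparing math.sqrt() against the ** operator for computing a square root. The value 12345.0 and the number of executions are arbitrary choices for illustration.

import timeit

# benchmark math.sqrt() versus the ** operator for square roots
t_sqrt = timeit.timeit('math.sqrt(x)', setup='import math; x = 12345.0', number=1000000)
t_pow = timeit.timeit('x ** 0.5', setup='x = 12345.0', number=1000000)

print(f'math.sqrt(x): {t_sqrt:.3f} seconds')
print(f'x ** 0.5:     {t_pow:.3f} seconds')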
Microbenchmarking is valuable in cases where even minor performance differences can have a significant impact, such as in high-frequency trading applications, embedded systems, or real-time data processing.
It helps developers fine-tune code to achieve optimal performance for specific, critical operations.
Now that we know about microbenchmarking, let’s look at how we can use it in Python.
Microbenchmarking in Python
We can use microbenchmarking in Python via the timeit module.
In fact, the timeit module was specifically designed to benchmark small snippets of Python code, that is, for microbenchmarking before the term was widely used.
This module provides a simple way to time small bits of Python code.
— timeit — Measure execution time of small code snippets
The timeit module provides two interfaces for microbenchmarking:
- The timeit API interface.
- The timeit Command-line interface.
The first is an API that can be used via the timeit.Timer class or the timeit.timeit() and timeit.repeat() module functions.
For example, we can use the timeit.timeit() function and pass it a string of Python code to execute and microbenchmark. It will then return the total duration in seconds for running the statement a fixed number of times (one million by default).
For example:
...
# benchmark a python statement
result = timeit.timeit('[i*i for i in range(1000)]')
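The timeit.Timer class and the timeit.repeat() function mentioned above can be used in the same way. For example, a minimal sketch that benchmarks the same statement 10,000 times per run:

import timeit

# benchmark via the Timer class, running the statement 10,000 times
timer = timeit.Timer('[i*i for i in range(1000)]')
total = timer.timeit(number=10000)
print(f'Total: {total:.3f} seconds for 10,000 runs')

# repeat the whole benchmark 5 times and report the best (lowest) total
results = timeit.repeat('[i*i for i in range(1000)]', repeat=5, number=10000)
print(f'Best: {min(results):.3f} seconds for 10,000 runs')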
You can learn more about how to use the timeit Python API in a separate tutorial.
The second is a command line interface.
The timeit module can be run directly from the command line via the -m flag to the Python interpreter.
For example:
python -m timeit "[i*i for i in range(1000)]"
This will then perform a microbenchmark of the snippet of code and report the result.
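The command line interface also accepts options that control how the benchmark is run. For example, -s supplies a setup statement executed once before the timing loop, -n sets the number of executions per loop, -r sets the number of repeats, and -u selects the reporting unit (such as usec or msec) in recent Python 3 releases. For example:

python -m timeit -s "import math" "math.sqrt(12345.0)"
python -m timeit -n 10000 -r 5 -u usec "[i*i for i in range(1000)]"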
You can learn more about how to use the timeit command line interface in a separate tutorial.
Now that we know how to perform microbenchmarking in Python, let’s look at some worked examples.
Example of Microbenchmark Creating a List
We can explore how to microbenchmark different ways of creating a list of 1,000 integers.
Some approaches include:
- Create a list from a range.
- Use a list comprehension.
- Append to a list in a for loop.
- Concatenate lists in a for loop.
The first approach is to create a list from a range.
python -m timeit "list(range(1000))"
With results:
20000 loops, best of 5: 11.1 usec per loop
We can try a list comprehension.
python -m timeit "[i for i in range(1000)]"
With results:
10000 loops, best of 5: 23 usec per loop
We can try appending a list in a for loop.
python -m timeit -s "data = []" "for i in range(1000): data.append(i)"
With results:
10000 loops, best of 5: 33.9 usec per loop
We can also try list concatenation in a for loop.
python -m timeit -s "data = []" "for i in range(1000): data += [i]"
With results:
5000 loops, best of 5: 68.6 usec per loop
Let’s compare the results in a table.
Method             | Time (usec)
-------------------|------------
List from Range    | 11.1
List Comprehension | 23
List Append        | 33.9
List Concatenation | 68.6
Recall that “usec” is microseconds and that there are 1,000 microseconds in one millisecond and 1,000 milliseconds in one second.
Generally, we discovered that we should create a list from a range directly as this is the fastest approach.
We found that appending to a list or concatenating lists is very slow and should be avoided if possible.
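The same comparison can be reproduced with the timeit Python API, for example by passing each statement to timeit.repeat() and reporting the best time per run. Below is a minimal sketch; note that the list is re-created inside each statement here rather than via a setup argument, and the exact numbers you see will differ from the table above.

import timeit

# the four approaches as statements, matching the commands above
statements = {
    'List from Range': 'list(range(1000))',
    'List Comprehension': '[i for i in range(1000)]',
    'List Append': 'data = []\nfor i in range(1000): data.append(i)',
    'List Concatenation': 'data = []\nfor i in range(1000): data += [i]',
}

# benchmark each statement and report the best of 5 repeats
for name, stmt in statements.items():
    results = timeit.repeat(stmt, repeat=5, number=1000)
    best = min(results) / 1000 * 1e6  # convert to microseconds per run
    print(f'{name}: {best:.1f} usec')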
Example of Microbenchmark String Concatenation
We can explore how to microbenchmark different ways of concatenating or joining a sequence of values into a single string.
Note: some of these examples are inspired by an example in the timeit module’s documentation.
Some approaches include:
- String concatenation.
- Join using a generator.
- Join using a list comprehension.
- Join using map().
We can start with string concatenation.
python -m timeit -s 'data = ""' 'for i in range(1000): data += str(i)'
With results:
2000 loops, best of 5: 149 usec per loop
Next, we can try a generator directly in the join() function.
python -m timeit '"".join(str(n) for n in range(1000))'
With results:
2000 loops, best of 5: 109 usec per loop
Next, we can try a list comprehension in the join() function.
python -m timeit '"".join([str(n) for n in range(1000)])'
With results:
5000 loops, best of 5: 90.8 usec per loop
Finally, we can try the map() built-in function to apply the str() function to each value in the range.
python -m timeit '"".join(map(str, range(1000)))'
With results:
5000 loops, best of 5: 98.2 usec per loop
Let’s compare the results in a table.
Method                       | Time (usec)
-----------------------------|------------
Concatenation                | 149
Join With Generator          | 109
Join With List Comprehension | 90.8
Join With map()              | 98.2
Recall that “usec” is microseconds and that there are 1,000 microseconds in one millisecond and 1,000 milliseconds in one second.
Generally, we can see that string concatenation is the slowest, as we might have suspected.
We found that it is slightly faster to use a list comprehension than to use the map() built-in function.
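As an aside, the timeit API also accepts a callable instead of a string, which can be convenient when the code under test already exists as a function. Below is a minimal sketch of the same join comparisons using lambdas; note that calling a lambda adds a small function call overhead to each execution, and the numbers you see will vary.

import timeit

# pass callables rather than strings to timeit.timeit()
t_generator = timeit.timeit(lambda: ''.join(str(n) for n in range(1000)), number=10000)
t_listcomp = timeit.timeit(lambda: ''.join([str(n) for n in range(1000)]), number=10000)
t_map = timeit.timeit(lambda: ''.join(map(str, range(1000))), number=10000)

print(f'Join With Generator:          {t_generator / 10000 * 1e6:.1f} usec')
print(f'Join With List Comprehension: {t_listcomp / 10000 * 1e6:.1f} usec')
print(f'Join With map():              {t_map / 10000 * 1e6:.1f} usec')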
Further Reading
This section provides additional resources that you may find helpful.
Books
- Python Benchmarking, Jason Brownlee (my book!)
Also, the following Python books have chapters on benchmarking that may be helpful:
- Python Cookbook, 2013. (sections 9.1, 9.10, 9.22, 13.13, and 14.13)
- High Performance Python, 2020. (chapter 2)
Guides
- 4 Ways to Benchmark Python Code
- 5 Ways to Measure Execution Time in Python
- Python Benchmark Comparison Metrics
Benchmarking APIs
- time — Time access and conversions
- timeit — Measure execution time of small code snippets
- The Python Profilers
Takeaways
You now know how to perform microbenchmarking in Python.
Did I make a mistake? See a typo?
I’m a simple humble human. Correct me, please!
Do you have any additional tips?
I’d love to hear about them!
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.