Last Updated on September 12, 2022
You can configure the number of threads in the ThreadPoolExecutor in Python by setting the max_workers argument.
In this tutorial, you will discover how to configure the number of worker threads in Python thread pools.
Let’s get started.
Need to Configure the Number of Worker Threads in ThreadPoolExecutor
The ThreadPoolExecutor in Python provides a pool of reusable threads for executing ad hoc tasks.
You can submit tasks to the thread pool by calling the submit() function and passing in the name of the function you wish to execute on another thread. You can also submit tasks by calling the map() function and specifying the name of the function to execute and the iterable of items to which your function will be applied.
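As a minimal sketch of the two submission styles described above, the example below uses a hypothetical task function named square() (not part of the original tutorial) and an arbitrary pool size of four.

```python
from concurrent.futures import ThreadPoolExecutor


def square(value):
    """Hypothetical task used only for illustration."""
    return value * value


with ThreadPoolExecutor(max_workers=4) as executor:
    # submit() schedules a single call and returns a Future immediately
    future = executor.submit(square, 3)
    single_result = future.result()  # blocks until the task has finished

    # map() applies the function to each item and yields results in input order
    mapped_results = list(executor.map(square, range(5)))

print(single_result)
print(mapped_results)
```

Note that submit() gives you a Future for each task, whereas map() hides the Future objects and returns the results directly.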
The thread pool has a fixed maximum number of worker threads.
It is important to match the number of worker threads to your workload: the number of asynchronous tasks you wish to complete, the resources available on your system, and the capacity of any resources your tasks will use.
Alternatively, you may wish to increase the number of worker threads dramatically if the resources you intend to use have greater capacity.
How do you configure the number of worker threads in the ThreadPoolExecutor?
How to Configure the Number of Worker Threads
You can configure the number of worker threads in the ThreadPoolExecutor by setting the max_workers in the constructor.
For example:
```python
...
# create a thread pool and set the number of worker threads
executor = ThreadPoolExecutor(max_workers=100)
# ...
# shutdown the thread pool
executor.shutdown()
```
The max_workers argument is the first argument in the constructor and does not need to be specified by name to be set; for example:
```python
...
# create a thread pool and set the number of worker threads
executor = ThreadPoolExecutor(100)
```
If you are using the context manager to create the thread pool so that it is shut down automatically, then you can configure the number of threads in the same manner.
For example:
```python
...
# create a thread pool using the context manager and set the number of workers
with ThreadPoolExecutor(50) as executor:
    # ...
```
The max_workers argument takes a positive integer. If it is not specified, it defaults to the number of CPUs in your system plus four, up to a maximum of 32 (as of Python 3.8).
- Default Number of Worker Threads = min(32, (CPUs in Your System) + 4)
For example, if you have 2 physical CPU cores in your system and each core has hyperthreading (common in modern CPUs), then you have 2 physical and 4 logical CPUs. Python will see 4 CPUs. The default number of worker threads on your system will then be (4 + 4) or 8.
If this number comes out to be more than 32 (e.g. 16 physical cores, 32 logical cores, plus four), the default is capped at 32 threads.
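The arithmetic above can be sketched in a few lines. The helper name expected_default_workers() is hypothetical and simply mirrors the rule described in the text for Python 3.8 and later. The value your interpreter actually picks may differ on other versions or when the process is restricted to a subset of CPUs.

```python
import os


def expected_default_workers(cpus):
    """Hypothetical helper mirroring the default rule: min(32, cpus + 4)."""
    # fall back to 1 if the CPU count could not be determined (None)
    return min(32, (cpus or 1) + 4)


# worked examples from the text
print(expected_default_workers(4))   # 2 cores with hyperthreading: 4 + 4
print(expected_default_workers(32))  # 32 + 4 exceeds the cap of 32

# estimate for the current machine
print(expected_default_workers(os.cpu_count()))
```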
It is common to have more threads than CPUs (physical or logical) in your system.
The reason for this is that threads are best suited to IO-bound tasks rather than CPU-bound tasks. IO-bound tasks spend most of their time waiting for relatively slow resources to respond, like hard drives, DVD drives, printers, and network connections, so many threads can wait at the same time without competing for the CPU.
Therefore, it is not uncommon to have tens, hundreds, or even thousands of threads in your application, depending on your specific needs. It is unusual to need more than a few thousand threads. If you do, an alternative such as asyncio may be a better fit.
Now that we know how to configure the number of worker threads in the ThreadPoolExecutor, let’s look at a worked example.
Check the Default Number of Worker Threads
Let’s check how many threads are created for thread pools on your system.
Looking at the source code for the ThreadPoolExecutor, we can see that the number of worker threads chosen by default is stored in the _max_workers attribute, which we can access and report after a thread pool is created.
Note that _max_workers is a private attribute (indicated by the leading underscore). It is not part of the documented API and may change in the future.
The example below reports the number of default threads in a thread pool on your system.
```python
# SuperFastPython.com
# report the default number of worker threads on your system
from concurrent.futures import ThreadPoolExecutor
# create a thread pool with the default number of worker threads
executor = ThreadPoolExecutor()
# report the number of worker threads chosen by default
print(executor._max_workers)
```
Running the example reports the number of worker threads used by default on your system.
I have four physical CPU cores and eight logical cores, so the default is 8 + 4 or 12 threads.

```
12
```
How many worker threads are allocated by default on your system?
Let me know in the comments below.
Example of Configuring the Number of Worker Threads
We can specify the number of worker threads directly, and this is a good idea in most applications.
The example below demonstrates how to configure 500 worker threads.
```python
# SuperFastPython.com
# configure and report the number of worker threads
from concurrent.futures import ThreadPoolExecutor
# create a thread pool with a large number of worker threads
with ThreadPoolExecutor(500) as executor:
    # report the number of worker threads
    print(executor._max_workers)
```
Running the example configures the thread pool to use up to 500 threads and reports the configured limit. The worker threads themselves are only started as tasks are submitted.
```
500
```
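It may help to see that max_workers is an upper limit rather than a count of running threads, because the pool starts worker threads on demand. The sketch below is an illustration only: the task wait_for_release() and the "demo" thread name prefix are assumptions made for this example. Each task blocks on an event so that every submission needs its own thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# event used to hold the tasks open while we count threads
release = threading.Event()


def wait_for_release():
    """Hypothetical task that blocks until the main thread sets the event."""
    release.wait()


# thread_name_prefix lets us find the pool's threads by name
with ThreadPoolExecutor(max_workers=500, thread_name_prefix="demo") as executor:
    futures = [executor.submit(wait_for_release) for _ in range(3)]
    # count only the threads that belong to this pool
    started = sum(1 for t in threading.enumerate() if t.name.startswith("demo"))
    print(f"limit={executor._max_workers}, started={started}")
    # let the tasks finish so the pool can shut down
    release.set()
```

With three blocked tasks, only three worker threads are started even though the limit is 500.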
Common Questions
This section lists common questions related to the number of worker threads in the ThreadPoolExecutor.
Do you have a question about setting the number of threads?
Let me know in the comments and I will do my best to answer it and add it to this section.
What Is the Default Number of Threads in the ThreadPoolExecutor?
The default number of threads in the ThreadPoolExecutor (as of Python 3.8) is calculated as follows:
- Default Number of Worker Threads = min(32, (CPUs in Your System) + 4)
Where the number of CPUs in your system is determined by Python and will take hyperthreading into account.
For example, if you have two CPU cores each with hyperthreading (which is common), then Python will “see” four CPUs in your system.
How Many CPUs or CPU Cores Do I Have?
You can check the number of CPUs or CPU cores that are visible to Python via the cpu_count() function in the os module.
For example, the following program will report the number of CPU cores in your system that are visible to Python:
```python
# report the number of CPUs in your system visible to Python
import os
print(os.cpu_count())
```
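A slightly more defensive variant is sketched below, assuming you want a usable integer in all cases. os.cpu_count() can return None if the count cannot be determined, and on some platforms (such as Linux) os.sched_getaffinity() reports the CPUs the current process is actually allowed to use, which may be fewer.

```python
import os

# os.cpu_count() may return None, so fall back to 1
logical_cpus = os.cpu_count() or 1

# sched_getaffinity() is not available on every platform, so check first
if hasattr(os, "sched_getaffinity"):
    usable_cpus = len(os.sched_getaffinity(0))
else:
    usable_cpus = logical_cpus

print(f"logical CPUs: {logical_cpus}")
print(f"CPUs usable by this process: {usable_cpus}")
```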
Does the Number of Threads in the ThreadPoolExecutor Match the Number of CPUs or Cores?
The number of worker threads in the ThreadPoolExecutor does not need to match the number of CPUs or CPU cores in your system. Only the default value is derived from the CPU count.
You can configure the number of threads based on the number of tasks you need to execute, the amount of local system resources you have available (e.g. memory), and the limitations of resources you intend to access within your tasks (e.g. connections to remote servers).
How Many Threads Should I Use?
If you have up to a few hundred tasks, a reasonable starting point is to set the number of threads equal to the number of tasks.
If you have thousands of tasks, you should probably cap the number of threads at several hundred to about 1,000.
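These rules of thumb can be written down as a small helper. This is a sketch only: the function choose_max_workers(), its thread_cap default of 1,000, and the optional resource_limit parameter (for example, a limit on connections to a remote server) are assumptions for illustration, not part of the ThreadPoolExecutor API.

```python
def choose_max_workers(num_tasks, thread_cap=1000, resource_limit=None):
    """Hypothetical helper: one thread per task, up to the given caps."""
    limit = min(num_tasks, thread_cap)
    # respect any external limit, such as allowed connections to a server
    if resource_limit is not None:
        limit = min(limit, resource_limit)
    # max_workers must be a positive integer
    return max(1, limit)


print(choose_max_workers(200))                      # one thread per task
print(choose_max_workers(5000))                     # capped at thread_cap
print(choose_max_workers(5000, resource_limit=50))  # capped by the resource
```

The result can then be passed to the constructor, for example ThreadPoolExecutor(choose_max_workers(len(tasks))), where tasks is your own collection of work items.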
If your application is intended to be executed multiple times in the future, you can test different numbers of threads and compare overall execution time, then choose a number of threads that gives approximately the best performance. You may want to mock the task in these tests with a random sleep operation.
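One way to run such a comparison is sketched below. The task mock_task(), the helper time_pool(), the sleep range, the task count, and the candidate pool sizes are all assumptions chosen to keep the example quick. Substitute values that resemble your real workload.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def mock_task(_):
    """Hypothetical stand-in for an IO-bound task: sleep for a random interval."""
    time.sleep(random.uniform(0.01, 0.03))


def time_pool(num_workers, num_tasks=40):
    """Return the seconds taken to run all mock tasks with the given pool size."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_workers) as executor:
        # consume the iterator so that all tasks are complete before timing stops
        list(executor.map(mock_task, range(num_tasks)))
    return time.perf_counter() - start


# compare a few candidate pool sizes
timings = {n: time_pool(n) for n in (1, 4, 20)}
for n, seconds in timings.items():
    print(f"{n:>3} workers: {seconds:.3f} seconds")
```

Exact timings will vary from run to run, so it is worth repeating each measurement a few times before choosing a value.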
What Is the Maximum Number of Worker Threads in the ThreadPoolExecutor?
There is no maximum number of worker threads in the ThreadPoolExecutor.
Nevertheless, your system will have an upper limit of the number of threads you can create based on how much main memory (RAM) you have available.
Before you exceed main memory, you will reach a point of diminishing returns in terms of adding new threads and executing more tasks. This is because your operating system must switch between the threads, called context switching. With too many threads active at once, your program may spend more time context switching than actually executing tasks.
A sensible upper limit for many applications is hundreds of threads to perhaps a few thousand threads. More than a few thousand threads on a modern system may result in too much context switching, depending on your system and on the types of tasks that are being executed.
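As a quick check of the limits on the argument itself, the sketch below shows that a very large value is accepted by the constructor, while a value that is not positive is rejected with a ValueError. The value of 10,000 is arbitrary and chosen only for illustration. No tasks are submitted, so no threads are started.

```python
from concurrent.futures import ThreadPoolExecutor

# a very large limit is accepted; threads are only started when tasks arrive
big_pool = ThreadPoolExecutor(max_workers=10_000)
big_pool.shutdown()

# zero or negative values are rejected by the constructor
try:
    ThreadPoolExecutor(max_workers=0)
    rejected = False
except ValueError as error:
    rejected = True
    print(f"Rejected: {error}")
```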
Further Reading
This section provides additional resources that you may find helpful.
Books
- ThreadPoolExecutor Jump-Start, Jason Brownlee, (my book!)
- Concurrent Futures API Interview Questions
- ThreadPoolExecutor Class API Cheat Sheet
I also recommend specific chapters from the following books:
- Effective Python, Brett Slatkin, 2019.
- See Chapter 7: Concurrency and Parallelism
- Python in a Nutshell, Alex Martelli, et al., 2017.
- See Chapter 14: Threads and Processes
Guides
- Python ThreadPoolExecutor: The Complete Guide
- Python ProcessPoolExecutor: The Complete Guide
- Python Threading: The Complete Guide
- Python ThreadPool: The Complete Guide
Takeaways
You now know how to configure the number of threads for the ThreadPoolExecutor in Python.
Do you have any questions about how to configure the number of worker threads?
Ask your question in the comments below and I will do my best to answer.