Last Updated on September 12, 2022
You can configure the number of workers in the ProcessPoolExecutor in Python by setting the “max_workers” argument.
In this tutorial you will discover how to configure the number of worker processes in Python process pools.
Let’s get started.
Need to Configure The Number of Worker Processes
The ProcessPoolExecutor in Python provides a pool of reusable processes for executing ad hoc tasks.
You can submit tasks to the process pool by calling the submit() function and passing in the name of the function you wish to execute on another process. You can also submit tasks by calling the map() function and specifying the name of the function to execute and the iterable of items to which your function will be applied.
The process pool has a fixed number of worker processes.
It is important to limit the number of worker processes in the process pools to perhaps the number of logical CPU cores or the number of physical CPU cores in your system, depending on the types of tasks you will be executing.
How do you configure the number of worker processes in the ProcessPoolExecutor?
How to Configure The Number of Workers
You can configure the number of worker processes in the ProcessPoolExecutor by setting the “max_workers” argument in the constructor.
For example:
```python
...
# create a process pool and set the number of worker processes
executor = ProcessPoolExecutor(max_workers=4)
# ...
# shutdown the process pool
executor.shutdown()
```
The “max_workers” argument is the first argument in the constructor and does not need to be specified by name to be set, for example:
```python
...
# create a process pool and set the number of worker processes
executor = ProcessPoolExecutor(4)
```
If you are using the context manager to create the process pool so that it is automatically shutdown, then you can configure the number of processes in the same manner.
For example:
```python
...
# create a process pool using the context manager and set the number of workers
with ProcessPoolExecutor(4) as executor:
    # ...
```
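The snippets above can be combined into a small runnable program. This is a minimal sketch: the square() task is a hypothetical function used only for illustration, and the worker count of 4 is an arbitrary choice.

```python
# sketch: configure the number of workers, then issue tasks with submit() and map()
from concurrent.futures import ProcessPoolExecutor

# hypothetical task used only for illustration
def square(value):
    return value * value

# entry point
def main():
    # create a pool with four workers; it is shut down when the block exits
    with ProcessPoolExecutor(max_workers=4) as executor:
        # submit() issues one task and returns a Future
        future = executor.submit(square, 3)
        print(future.result())
        # map() applies the function to each item and yields results in order
        for result in executor.map(square, range(5)):
            print(result)

if __name__ == '__main__':
    main()
```

The task function is defined at module level and the pool is created under the entry-point guard so that worker processes can import the function safely.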
The argument takes a positive integer and defaults to the number of logical CPU cores in your system.
- Total Number Worker Processes = (CPUs in Your System)
For example, if you had 2 physical CPU cores in your system and each core has hyperthreading (common in modern CPUs) then you would have 2 physical and 4 logical CPU cores. Python would see 4 CPUs. The default number of worker processes on your system would then be 4.
The number of workers must be less than or equal to 61 if Windows is your operating system.
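If the same program must run on Windows and other platforms, one option is to cap the requested number of workers before creating the pool. The sketch below assumes the limit of 61 stated above; the capped_workers() helper and the WINDOWS_MAX_WORKERS constant are hypothetical names, not part of the standard library.

```python
# sketch: cap the requested number of workers on Windows
import sys
from concurrent.futures import ProcessPoolExecutor

# limit on worker processes on Windows, as stated above
WINDOWS_MAX_WORKERS = 61

# hypothetical helper: return a worker count that is valid on the given platform
def capped_workers(requested, platform=sys.platform):
    if requested < 1:
        raise ValueError('the number of workers must be a positive integer')
    # sys.platform is 'win32' on Windows
    if platform == 'win32':
        return min(requested, WINDOWS_MAX_WORKERS)
    return requested

# entry point
def main():
    # request 100 workers, capped where the platform requires it
    with ProcessPoolExecutor(capped_workers(100)) as pool:
        # _max_workers is a protected member and may change in the future
        print(pool._max_workers)

if __name__ == '__main__':
    main()
```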
It is common to have more processes than CPUs (physical or logical) in your system, if the target task function is performing blocking IO operations.
The reason is that a process blocked on IO is not using a CPU core. While it waits for a relatively slow resource to respond, such as a hard drive, printer, or network connection, the operating system can run another process on that core.
If you require hundreds of processes for IO-bound tasks, you might want to consider using threads instead and the ThreadPoolExecutor. If you require thousands of processes for IO-bound tasks, you might want to consider using the asyncio module.
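As a rough illustration of the thread-based alternative, the sketch below runs 200 simulated IO-bound tasks in a ThreadPoolExecutor with 200 worker threads. The fake_io_task() function is a hypothetical stand-in that only sleeps, and the task and worker counts are arbitrary assumptions.

```python
# sketch: many concurrent IO-bound tasks using threads instead of processes
import time
from concurrent.futures import ThreadPoolExecutor

# hypothetical IO-bound task: sleeps to simulate waiting on a slow resource
def fake_io_task(identifier):
    time.sleep(0.1)
    return identifier

# run the given number of tasks with the given number of worker threads
def run_io_tasks(n_tasks, n_workers):
    with ThreadPoolExecutor(max_workers=n_workers) as executor:
        # map() returns results in the order the tasks were issued
        return list(executor.map(fake_io_task, range(n_tasks)))

if __name__ == '__main__':
    start = time.perf_counter()
    results = run_io_tasks(200, 200)
    duration = time.perf_counter() - start
    print(f'{len(results)} tasks took {duration:.2f} seconds')
```

Because each task spends its time waiting rather than computing, the threads overlap their waits and the whole batch should finish far sooner than running the tasks one after another.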
Now that we know how to configure the number of worker processes in the ProcessPoolExecutor, let’s look at a worked example.
Example of Configuring The Number of Workers
Let’s explore how to configure the number of worker processes with a worked example.
Check The Default Number of Worker Processes
First, let’s check how many processes are created for process pools on your system.
Looking at the source code for the ProcessPoolExecutor we can see that the number of worker processes chosen by default is stored in the _max_workers attribute, which we can access and report after a process pool is created.
Note, “_max_workers” is a protected member and may change in the future.
The example below reports the number of default processes in a process pool on your system.
```python
# SuperFastPython.com
# report the default number of worker processes on your system
from concurrent.futures import ProcessPoolExecutor

# entry point
def main():
    # create a process pool with the default number of worker processes
    pool = ProcessPoolExecutor()
    # report the number of worker processes chosen by default
    print(pool._max_workers)

if __name__ == '__main__':
    main()
```
Running the example reports the number of worker processes used by default on your system.
I have four physical CPU cores and eight logical cores, therefore the default is 8 processes.
```
8
```
How many worker processes are allocated by default on your system?
Let me know in the comments below.
Set The Number of Worker Processes
We can specify the number of worker processes directly and this is a good idea in most applications.
The example below demonstrates how to configure 60 worker processes.
```python
# SuperFastPython.com
# configure and report the number of worker processes
from concurrent.futures import ProcessPoolExecutor

# entry point
def main():
    # create a process pool with a large number of worker processes
    pool = ProcessPoolExecutor(60)
    # report the number of worker processes
    print(pool._max_workers)

if __name__ == '__main__':
    main()
```
Running the example configures the process pool to use 60 processes and confirms that it will create 60 processes.
```
60
```
Common Questions
This section lists common questions related to the number of worker processes in the ProcessPoolExecutor.
Do you have a question about setting the number of processes?
Let me know in the comments and I will do my best to answer it and add it to this section.
What is the Default Number of Processes in the ProcessPoolExecutor?
The default number of processes in the ProcessPoolExecutor is equal to the number of logical CPU cores in your system.
For example:
- Total Number Worker Processes = CPUs in Your System
Where the number of CPUs in your system is determined by Python and will take hyperthreading into account.
For example if you have two CPU cores each with hyperthreading (which is common), then Python will detect four CPUs in your system.
How Many CPU Cores Do I Have?
You can check the number of CPU cores that are visible to Python via the os.cpu_count() function.
For example, the following program will report the number of CPU cores in your system that are available to your Python interpreter:
```python
# report the number of CPUs in your system visible to Python
import os
print(os.cpu_count())
```
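Two details are worth handling in real programs: os.cpu_count() may return None if the count cannot be determined, and on some Unix platforms the current process may be restricted to a subset of the CPUs. The sketch below is one possible way to handle both; the usable_cpu_count() helper is a hypothetical name, and os.sched_getaffinity() is only available on some platforms, so its presence is checked first.

```python
# sketch: report a CPU count that is safe to use as a worker count
import os

# hypothetical helper: number of CPUs the current process can use, at least 1
def usable_cpu_count():
    # on platforms that support it, respect the CPU affinity of this process
    if hasattr(os, 'sched_getaffinity'):
        return len(os.sched_getaffinity(0))
    # os.cpu_count() may return None, so fall back to 1
    return os.cpu_count() or 1

if __name__ == '__main__':
    print(usable_cpu_count())
```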
Does The Number of Processes in the ProcessPoolExecutor Match the Number of CPUs or Cores?
The number of worker processes in the ProcessPoolExecutor should probably match the number of CPU cores in your system if your tasks are CPU-bound.
This is a good default.
If your tasks are IO-bound you may set the number of processes to be equal to or a factor of the number of tasks you wish to complete. Although, your operating system may limit the number of processes you’re able to create, e.g. 61 on Windows.
If you require hundreds or thousands of concurrent tasks executed and they are IO-bound, consider using the ThreadPoolExecutor instead.
How Many Processes Should I Use?
You should probably set the number of processes to be equal to the number of logical CPU cores in your system, i.e. the default.
- By default: Set to the number of logical CPU cores.
If you are expecting to perform computational work in the main process in addition to the process pool, consider setting the number of processes in the pool to be equal to the number of logical CPUs in your system minus one, to allow the main process to execute.
- If the main process is computationally intensive: Set to the number of logical CPU cores minus one.
If you have particularly CPU intensive tasks, consider configuring the number of processes to be equal to the number of physical CPUs instead of the number of logical CPUs.
- If tasks are computationally intensive: Set to the number of physical CPU cores.
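The three rules of thumb above can be sketched as a small helper. The choose_workers() function and its parameters are hypothetical names. The standard library does not report the number of physical cores, so this sketch assumes the caller supplies that value when it is known.

```python
# sketch: choose a number of worker processes using the rules of thumb above
import os
from concurrent.futures import ProcessPoolExecutor

# hypothetical helper that applies the three rules
def choose_workers(main_is_busy=False, physical_cores=None):
    # number of logical CPU cores, falling back to 1 if it cannot be determined
    logical = os.cpu_count() or 1
    # computationally intensive tasks: use the physical core count if supplied
    if physical_cores is not None:
        return max(1, physical_cores)
    # computationally intensive main process: leave one logical core free
    if main_is_busy:
        return max(1, logical - 1)
    # otherwise use all logical cores, which matches the default
    return logical

# entry point
def main():
    with ProcessPoolExecutor(choose_workers(main_is_busy=True)) as pool:
        # _max_workers is a protected member and may change in the future
        print(pool._max_workers)

if __name__ == '__main__':
    main()
```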
What is the Maximum Number of Worker Processes in the ProcessPoolExecutor?
The maximum number of worker processes may be limited by your operating system.
For example, on Windows, the ProcessPoolExecutor will not accept more than 61 worker processes; a larger max_workers value raises a ValueError.
Other operating systems like macOS and Linux may impose an upper limit on the number of processes that may be spawned or forked.
Additionally, your system will have an upper limit of the number of processes you can create based on how much main memory (RAM) you have available.
Nevertheless, before you exceed main memory, you will reach a point of diminishing returns in terms of adding new processes and executing more tasks. This is because your operating system must switch between the processes, called context switching. With too many processes active at once, your program may spend more time context switching than actually executing tasks.
A sensible upper limit for most applications is to set the number of processes to be equal to the number of logical CPU cores or the number of physical CPU cores in your system.
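One way to find the point of diminishing returns on your own system is to time the same workload with different numbers of workers. The sketch below is a rough benchmark, not a rigorous one: the cpu_task() function is a hypothetical CPU-bound task, and the task sizes and worker counts are arbitrary assumptions that you would replace with your own.

```python
# sketch: compare the run time of a fixed workload for different worker counts
import time
from concurrent.futures import ProcessPoolExecutor

# hypothetical CPU-bound task: sum of squares below n
def cpu_task(n):
    return sum(i * i for i in range(n))

# time a fixed batch of tasks using a pool with the given number of workers
def time_pool(n_workers, n_tasks=8, size=200_000):
    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=n_workers) as executor:
        # consume the results so that all tasks complete before timing stops
        list(executor.map(cpu_task, [size] * n_tasks))
    return time.perf_counter() - start

# entry point
def main():
    for n_workers in (1, 2, 4):
        print(f'{n_workers} workers: {time_pool(n_workers):.3f} seconds')

if __name__ == '__main__':
    main()
```

Timings vary between runs and systems, so it is a good idea to repeat each measurement a few times before choosing a value.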
Takeaways
You now know how to configure the number of processes for the ProcessPoolExecutor in Python.
Do you have any questions?
Ask your question in the comments below and I will do my best to answer.
Charlie Carroll says
Many (older) sources warn that multiprocessing is prone to errors if multiple processors are operating on the same memory. This is a problem since the two logical cores in one physical core share the same cache. Hence, multiprocessing should be limited to the number of physical cores, while multithreading can exploit the total number of logical cores. Further, one should leave one core free to manage the scheduling.
Despite all of the online advice that contradicts your information, I believe you since the current default in Python is in fact that the default for the ProcessPoolExecutor is to use all logical cores. I have to believe that they would only set this as the default if they found a way to partition the cache so that the logical processors were not operating on the same objects in memory (the cache). It would be really helpful if you would explain why the advice now is different from the many, many older posts that give the opposite advice. I was searching for days before I found your site.
Also, I am running a workstation with 10 Intel i9 cores. One tip is to leave one (logical) core free to handle the scheduling, etc. However, when I look at the Resource Monitor in Windows 10, all 20 logical cores ( CPU 0 to CPU 19) are running at 100% and there is a Service CPU that appears to be handling the scheduling, and it is running at roughly 60%. So at this point, I don’t understand why it works, but I am just leaving everything to the default values and trusting that the latest hardware and software handles everything correctly despite the fact that it contradicts the vast majority of advice online. If you could post a clarification of this apparent change in recommended practices, that would be a valuable contribution to the online advice.
Thank you for your posts to date. None of this made any sense until I found your post. Keep up the good work!
Jason Brownlee says
Thanks for sharing and for your support!
You raise a good point. In practice, I recommend testing max_workers equal to all logical vs all physical CPUs and comparing performance. It really depends on the specifics of the application.
Agreed, leaving one core free for main, if main is doing something important, is a good idea.
Pranav Khabale says
Hi Jason. Quite a helpful post. A doubt I had to ask. Just like how 61 is the max limit of workers for Windows OS , is there any rough estimate for Mac OS. If not, would it be likely that the limit be higher than that of Windows OS.
Jason Brownlee says
I don’t believe there is a fixed limit for MacOS. Instead, the limit would be whatever your system hardware can handle for the types of tasks running.
Perhaps run some experiments to see how many processes you can run on your system.