Last Updated on October 29, 2022
You can configure the thread pool via arguments to the multiprocessing.pool.ThreadPool class constructor.
In this tutorial you will discover how to configure the ThreadPool in Python.
Let’s get started.
Need to Configure the ThreadPool
The multiprocessing.pool.ThreadPool in Python provides a pool of reusable threads for executing ad hoc tasks.
A thread pool object which controls a pool of worker threads to which jobs can be submitted.
— multiprocessing — Process-based parallelism
The ThreadPool class extends the Pool class. The Pool class provides a pool of worker processes for process-based concurrency.
Although the ThreadPool class is in the multiprocessing module it offers thread-based concurrency and is best suited to IO-bound tasks, such as reading or writing from sockets or files.
A ThreadPool can be configured when it is created, which will prepare the new threads.
We can issue one-off tasks to the ThreadPool using functions such as apply() or we can apply the same function to an iterable of items using functions such as map().
Results for issued tasks can then be retrieved synchronously, or we can retrieve the result of tasks later by using asynchronous versions of the functions such as apply_async() and map_async().
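As a minimal sketch of these four functions, the example below issues tasks to a small pool both synchronously and asynchronously. The square() and demo() functions and the pool size of 2 are assumptions made for illustration only.

```python
from multiprocessing.pool import ThreadPool

# hypothetical task used for illustration
def square(x):
    return x * x

def demo():
    # create a small pool and shut it down automatically on exit
    with ThreadPool(2) as pool:
        # one-off task, blocks until the result is available
        one = pool.apply(square, (3,))
        # same function applied to each item, blocks until all are done
        many = pool.map(square, [1, 2, 3])
        # asynchronous versions return an AsyncResult immediately
        async_one = pool.apply_async(square, (4,))
        async_many = pool.map_async(square, [5, 6])
        # retrieve the results later via get()
        return one, many, async_one.get(), async_many.get()

# protect the entry point
if __name__ == '__main__':
    print(demo())
```

The asynchronous calls return immediately, so the caller can do other work before calling get() on each result.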
The ThreadPool can be customized for the application.
What can be configured in the ThreadPool, and how can we configure it?
How to Configure the ThreadPool
The ThreadPool can be configured by specifying arguments to the multiprocessing.pool.ThreadPool class constructor.
ThreadPool Constructor Arguments
The arguments to the constructor are as follows:
- processes: Maximum number of worker threads (not processes) to use in the pool.
- initializer: Function executed after each worker thread is created.
- initargs: Arguments to the worker thread initialization function.
Unlike the multiprocessing.pool.Pool class that the ThreadPool extends, the ThreadPool does not have a “maxtasksperchild” argument to limit the number of tasks per worker. Also, because we are using threads instead of processes, we cannot configure the multiprocessing “context” used by the pool.
Next, let’s look at the default configuration for the ThreadPool.
Default Configuration
The multiprocessing.pool.ThreadPool class constructor does not require any arguments, as every argument has a default value.
For example:
```python
...
# create a default thread pool
pool = multiprocessing.pool.ThreadPool()
```
This will create a thread pool that will use a number of worker threads that matches the number of logical CPU cores in your system.
It will not call a function that initializes the worker threads when they are created.
Each worker thread will be able to execute an unlimited number of tasks within the pool.
We can create a ThreadPool with the default configuration and report its details.
The complete example is listed below.
```python
# SuperFastPython.com
# create a threadpool with default configuration
from multiprocessing.pool import ThreadPool

# protect the entry point
if __name__ == '__main__':
    # create a thread pool
    pool = ThreadPool()
    # report details of the thread pool
    print(pool)
    # close the thread pool
    pool.close()
```
Running the example creates the thread pool and reports its details.
In this case we can see that the pool is started and has 8 worker threads to match the 8 logical CPU cores in my system.
Note, the number of workers will differ based on the number of logical CPU cores in your system.
```
<multiprocessing.pool.ThreadPool state=RUN pool_size=8>
```
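To illustrate that a worker thread is reused for many tasks rather than being replaced, the sketch below runs several tasks in a pool with a single worker and collects the name of the thread that executed each one. The task() and thread_names_used() functions are hypothetical names chosen for this example.

```python
import threading
from multiprocessing.pool import ThreadPool

# hypothetical task that reports which thread executed it
def task(item):
    return threading.current_thread().name

def thread_names_used(n_tasks=5):
    # a pool with one worker, so every task runs on that worker
    with ThreadPool(1) as pool:
        # collect the distinct thread names seen across all tasks
        return set(pool.map(task, range(n_tasks)))

# protect the entry point
if __name__ == '__main__':
    print(thread_names_used())
```

Only one thread name should be reported, no matter how many tasks are issued.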
Now that we know what configuration the ThreadPool takes, let’s look at how we might configure each aspect of the ThreadPool.
How to Configure the Number of Worker Threads
We can configure the number of worker threads in the multiprocessing.pool.ThreadPool by setting the “processes” argument in the constructor.
Although the argument is called “processes”, it actually controls the number of worker threads.
processes is the number of worker threads to use. If processes is None then the number returned by os.cpu_count() is used.
— multiprocessing — Process-based parallelism
We can set the “processes” argument to specify the number of worker threads to create and use as workers in the ThreadPool.
For example:
```python
...
# create a thread pool with 4 workers
pool = multiprocessing.pool.ThreadPool(processes=4)
```
The “processes” argument is the first argument in the constructor and does not need to be specified by name to be set, for example:
```python
...
# create a thread pool with 4 workers
pool = multiprocessing.pool.ThreadPool(4)
```
If we use the context manager interface to create the thread pool so that it is shut down automatically, we can configure the number of threads in the same manner.
For example:
```python
...
# create a thread pool with 4 workers
with multiprocessing.pool.ThreadPool(4) as pool:
    # ...
    pass
```
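One way to check that the “processes” argument really sets the number of worker threads is sketched below. It issues one task per worker and makes every task wait on a threading.Barrier, so the tasks can only finish if all four workers are running at the same time. The barrier, the task() and worker_names() functions, and the 10 second timeout are assumptions made for illustration, not part of the ThreadPool API.

```python
import threading
from multiprocessing.pool import ThreadPool

# number of worker threads to request (illustrative value)
N_WORKERS = 4
# tasks block here until N_WORKERS threads are waiting (or the timeout expires)
barrier = threading.Barrier(N_WORKERS, timeout=10)

# hypothetical task that waits for its peers, then reports its thread
def task(item):
    barrier.wait()
    return threading.current_thread().name

def worker_names():
    # create the pool with a fixed number of worker threads
    with ThreadPool(processes=N_WORKERS) as pool:
        # one task per worker, issued individually
        return set(pool.map(task, range(N_WORKERS), chunksize=1))

# protect the entry point
if __name__ == '__main__':
    names = worker_names()
    print(len(names), 'distinct worker threads')
```

If the pool had fewer than four workers, the barrier would time out and the call to map() would raise an exception instead of returning.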
Next, let’s look at how we might configure the worker thread initialization function.
How to Configure the Initialization Function
We can configure worker threads in the ThreadPool to execute an initialization function prior to executing tasks.
This can be achieved by setting the “initializer” argument when configuring the ThreadPool via the class constructor.
The “initializer” argument can be set to the name of a function that will be called to initialize the worker threads.
If initializer is not None then each worker process will call initializer(*initargs) when it starts.
— multiprocessing — Process-based parallelism
Although the API documentation describes worker processes, the function is used to initialize worker threads.
For example:
```python
# worker thread initialization function
def worker_init():
    # ...
    pass

...
# create a thread pool and initialize workers
pool = multiprocessing.pool.ThreadPool(initializer=worker_init)
```
If our worker thread initialization function takes arguments, they can be specified to the ThreadPool constructor via the “initargs” argument, which takes an ordered list or tuple of arguments for the custom initialization function.
For example:
```python
# worker thread initialization function
def worker_init(arg1, arg2, arg3):
    # ...
    pass

...
# create a thread pool and initialize workers
pool = multiprocessing.pool.ThreadPool(initializer=worker_init, initargs=(arg1, arg2, arg3))
```
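A runnable sketch of the same idea is shown below. It assumes the initializer stores its arguments in a threading.local object so that each worker thread has its own copy, which tasks can then read. The names local, worker_init(), task() and run(), and the "job" prefix, are hypothetical choices for this example.

```python
import threading
from multiprocessing.pool import ThreadPool

# per-thread storage, populated by the initializer (an illustrative choice)
local = threading.local()

# worker thread initialization function, called once in each new worker
def worker_init(prefix, start):
    local.prefix = prefix
    local.count = start

# hypothetical task that uses the state prepared by the initializer
def task(item):
    local.count += 1
    return f'{local.prefix}-{item}'

def run():
    # create a thread pool and initialize each worker with two arguments
    with ThreadPool(2, initializer=worker_init, initargs=('job', 0)) as pool:
        return pool.map(task, range(4))

# protect the entry point
if __name__ == '__main__':
    print(run())
```

Because the initializer runs inside each worker thread before it executes any tasks, every task can rely on local.prefix and local.count being set.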
Next, let’s look at how we might configure the maximum tasks per worker thread.
How to Configure the Max Tasks Per Child
The ThreadPool does not support the “maxtasksperchild” argument.
This argument is supported in the multiprocessing.pool.Pool parent class.
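The difference can be checked by inspecting the two constructors, as in the short sketch below. The constructor_args() helper is a hypothetical name used only for this example.

```python
import inspect
from multiprocessing.pool import Pool, ThreadPool

# hypothetical helper that lists the argument names of a class constructor
def constructor_args(cls):
    return list(inspect.signature(cls).parameters)

# protect the entry point
if __name__ == '__main__':
    # arguments accepted by the ThreadPool constructor
    print(constructor_args(ThreadPool))
    # arguments accepted by the Pool constructor
    print(constructor_args(Pool))
```

Passing “maxtasksperchild” to the ThreadPool constructor is expected to fail with a TypeError, because the argument is not part of its signature.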
Further Reading
This section provides additional resources that you may find helpful.
Books
- Python ThreadPool Jump-Start, Jason Brownlee (my book!)
- Threading API Interview Questions
- ThreadPool PDF Cheat Sheet
I also recommend specific chapters from the following books:
- Python Cookbook, David Beazley and Brian Jones, 2013.
  - See: Chapter 12: Concurrency
- Effective Python, Brett Slatkin, 2019.
  - See: Chapter 7: Concurrency and Parallelism
- Python in a Nutshell, Alex Martelli, et al., 2017.
  - See: Chapter 14: Threads and Processes
Guides
- Python ThreadPool: The Complete Guide
- Python Multiprocessing Pool: The Complete Guide
- Python ThreadPoolExecutor: The Complete Guide
- Python Threading: The Complete Guide
Takeaways
You now know how to configure the ThreadPool in Python.