Last Updated on September 12, 2022
You can adopt one of the common usage patterns to get the most out of the multiprocessing.Pool in Python.
In this tutorial, you will discover the common usage patterns for Python multiprocessing pools.
Let’s get started.
Multiprocessing Pool Usage Patterns
The multiprocessing.Pool provides a lot of flexibility for executing concurrent tasks in Python.
Nevertheless, there are a handful of common usage patterns that will fit most program scenarios.
This section lists the common usage patterns with worked examples that you can copy and paste into your own project and adapt as needed.
The patterns we will look at are as follows:
- Pattern 1: map() and Iterate Results Pattern
- Pattern 2: apply_async() and Forget Pattern
- Pattern 3: map_async() and Forget Pattern
- Pattern 4: imap_unordered() and Use as Completed Pattern
- Pattern 5: imap_unordered() and Wait for First Pattern
We will use a contrived task in each example that sleeps for a random fraction of a second. You can easily replace this example task with your own task in each pattern.
Let’s start with the first usage pattern.
Pattern 1: map() and Iterate Results Pattern
This pattern involves calling the same function with different arguments, then iterating over the results.
It is a concurrent and parallel version of the built-in map() function with the main difference that all function calls are issued to the process pool immediately and we cannot process results until all tasks are completed.
It requires that we call the map() function with our target function and an iterable of arguments, then process the return values from each function call in a for loop.
```python
...
# issue tasks and process results
for result in pool.map(task, range(10)):
    print(f'>got {result}')
```
You can learn more about how to use the map() function on the process pool in the tutorial:
This pattern can be used for target functions that take multiple arguments by swapping the map() function for the starmap() function.
You can learn more about the starmap() function in the tutorial:
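For example, a minimal sketch of the starmap() variant of this pattern, assuming a hypothetical two-argument task() function (not the one-argument task used elsewhere in this tutorial):

```python
# sketch: map() and iterate results pattern adapted for starmap()
# (assumption: a hypothetical task that takes two arguments)
from time import sleep
from random import random
from multiprocessing import Pool

# task that takes multiple arguments
def task(value, multiplier):
    # block for a fraction of a second
    sleep(random() * 0.1)
    # return the inputs alongside their product
    return (value, multiplier, value * multiplier)

# protect the entry point
if __name__ == '__main__':
    # create the process pool
    with Pool() as pool:
        # each tuple in the iterable is unpacked as the task's arguments
        for result in pool.starmap(task, [(i, 2) for i in range(5)]):
            print(f'>got {result}')
```

Note that starmap() expects an iterable of argument tuples, one tuple per task.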
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# example of the map and iterate results usage pattern
from time import sleep
from random import random
from multiprocessing import Pool

# task to execute in a new process
def task(value):
    # generate a random value
    random_value = random()
    # block for a moment
    sleep(random_value)
    # return a value
    return (value, random_value)

# protect the entry point
if __name__ == '__main__':
    # create the process pool
    with Pool() as pool:
        # issue tasks and process results
        for result in pool.map(task, range(10)):
            print(f'>got {result}')
```
Running the example, we can see that the map() function called the task() function once for each argument in the range 0 to 9.
Watching the example run, we can see that all tasks are issued to the process pool and run to completion; only once all results are available does the main process iterate over the return values.
```
>got (0, 0.5223115198151266)
>got (1, 0.21783676257361628)
>got (2, 0.2987824437365636)
>got (3, 0.7878833057358723)
>got (4, 0.3656686303407395)
>got (5, 0.19329669829989338)
>got (6, 0.8684106781905665)
>got (7, 0.19365670382002365)
>got (8, 0.6705626483476922)
>got (9, 0.036792658761421904)
```
Pattern 2: apply_async() and Forget Pattern
This pattern involves issuing one task to the process pool and then not waiting for the result. Fire and forget.
This is a helpful approach for issuing ad hoc tasks asynchronously to the process pool, allowing the main process to continue on with other aspects of the program.
This can be achieved by calling the apply_async() function with the name of the target function and any arguments the target function may take.
The apply_async() function will return an AsyncResult object that can be ignored.
For example:
```python
...
# issue task
_ = pool.apply_async(task, args=(1,))
```
You can learn more about the apply_async() function in the tutorial:
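One caveat of fire-and-forget: if the task raises an exception, it is silently discarded along with the ignored AsyncResult. A minimal sketch of one way to surface failures, using the error_callback argument to apply_async() and a hypothetical failing task for illustration:

```python
# sketch: fire-and-forget with an error_callback so failures are not lost
# (assumption: a hypothetical task that fails for negative values)
from multiprocessing import Pool

# task that raises for bad input
def task(value):
    if value < 0:
        raise ValueError(f'bad value: {value}')
    return value * 2

# called in the main process if the task raises an exception
def on_error(error):
    print(f'>task failed: {error}', flush=True)

# protect the entry point
if __name__ == '__main__':
    # create the process pool
    with Pool() as pool:
        # issue a task that will fail; without error_callback
        # the exception would be silently discarded
        _ = pool.apply_async(task, args=(-1,), error_callback=on_error)
        # close the pool and wait for the task to finish
        pool.close()
        pool.join()
```

The error_callback is executed in the main process, so it is a convenient place to log or count failed fire-and-forget tasks.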
Once all ad hoc tasks have been issued, we may want to wait for the tasks to complete before closing the process pool.
This can be achieved by calling the close() function on the pool to prevent it from receiving any further tasks, then joining the pool to wait for the issued tasks to complete.
```python
...
# close the pool
pool.close()
# wait for all tasks to complete
pool.join()
```
You can learn more about joining the process pool in the tutorial:
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# example of the apply_async and forget usage pattern
from time import sleep
from random import random
from multiprocessing import Pool

# task to execute in a new process
def task(value):
    # generate a random value
    random_value = random()
    # block for a moment
    sleep(random_value)
    # prepare result
    result = (value, random_value)
    # report results
    print(f'>task got {result}', flush=True)

# protect the entry point
if __name__ == '__main__':
    # create the process pool
    with Pool() as pool:
        # issue task
        _ = pool.apply_async(task, args=(1,))
        # close the pool
        pool.close()
        # wait for all tasks to complete
        pool.join()
```
Running the example fires a task into the process pool and forgets about it, allowing it to complete in the background.
The task is issued and the main process is free to continue on with other parts of the program.
In this simple example, there is nothing else to go on with, so the main process then closes the pool and waits for all ad hoc fire-and-forget tasks to complete before terminating.
```
>task got (1, 0.1278130542799114)
```
Pattern 3: map_async() and Forget Pattern
This pattern involves issuing many tasks to the process pool and then moving on. Fire-and-forget for multiple tasks.
This is helpful for applying the same function to each item in an iterable and then not being concerned with the result or return values.
The tasks are issued asynchronously, allowing the caller to continue on with other parts of the program.
This can be achieved with the map_async() function that takes the name of the target task and an iterable of arguments for each function call.
The function returns an AsyncResult object that provides a handle on the issued tasks, which can be ignored in this case.
For example:
```python
...
# issue tasks to the process pool
_ = pool.map_async(task, range(10))
```
You can learn more about the map_async() function in the tutorial:
Once all asynchronous tasks have been issued and there is nothing else in the program to do, we can close the process pool and wait for all issued tasks to complete.
```python
...
# close the pool
pool.close()
# wait for all tasks to complete
pool.join()
```
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# example of the map_async and forget usage pattern
from time import sleep
from random import random
from multiprocessing import Pool

# task to execute in a new process
def task(value):
    # generate a random value
    random_value = random()
    # block for a moment
    sleep(random_value)
    # prepare result
    result = (value, random_value)
    # report results
    print(f'>task got {result}', flush=True)

# protect the entry point
if __name__ == '__main__':
    # create the process pool
    with Pool() as pool:
        # issue tasks to the process pool
        _ = pool.map_async(task, range(10))
        # close the pool
        pool.close()
        # wait for all tasks to complete
        pool.join()
```
Running the example issues ten tasks to the process pool.
The call returns immediately and the tasks are executed asynchronously. This allows the main process to continue on with other parts of the program.
There is nothing else to do in this simple example, so the process pool is then closed and the main process blocks, waiting for the issued tasks to complete.
```
>task got (1, 0.07000157647675087)
>task got (0, 0.23377533908752213)
>task got (4, 0.5817185149247178)
>task got (3, 0.592827746280798)
>task got (9, 0.39735803187389696)
>task got (5, 0.6693476274660454)
>task got (6, 0.7423437379725698)
>task got (7, 0.8881483088702092)
>task got (2, 0.9846685764130632)
>task got (8, 0.9740735804232945)
```
Pattern 4: imap_unordered() and Use as Completed Pattern
This pattern is about issuing tasks to the pool and using results for tasks as they become available.
This means that if tasks take a variable amount of time, results are received in the order the tasks complete, rather than the order in which they were issued to the process pool.
This can be achieved with the imap_unordered() function. It takes a function and an iterable of arguments, just like the map() function.
It returns an iterable that yields return values from the target function as the tasks are completed.
We can call the imap_unordered() function and iterate the return values directly in a for-loop.
For example:
```python
...
# issue tasks and process results
for result in pool.imap_unordered(task, range(10)):
    print(f'>got {result}')
```
You can learn more about the imap_unordered() function in the tutorial:
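For comparison, the pool also provides an imap() function with the same signature that yields results in the order tasks were issued, even if they complete out of order. A quick sketch of the difference, using a trivial stand-in task rather than the tutorial's task:

```python
# sketch: imap() yields in issue order, imap_unordered() in completion order
# (assumption: a trivial stand-in task used only for this comparison)
from time import sleep
from random import random
from multiprocessing import Pool

# task that sleeps a random fraction of a second
def task(value):
    sleep(random() * 0.1)
    return value

# protect the entry point
if __name__ == '__main__':
    # create the process pool
    with Pool() as pool:
        # imap() preserves issue order: always [0, 1, 2, 3, 4]
        print(f'imap: {list(pool.imap(task, range(5)))}')
        # imap_unordered() yields whichever task finishes first
        print(f'imap_unordered: {list(pool.imap_unordered(task, range(5)))}')
```

Use imap() when the downstream code depends on ordering, and imap_unordered() when latency to the first available result matters more.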
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# example of the imap_unordered and use as completed usage pattern
from time import sleep
from random import random
from multiprocessing import Pool

# task to execute in a new process
def task(value):
    # generate a random value
    random_value = random()
    # block for a moment
    sleep(random_value)
    # return result
    return (value, random_value)

# protect the entry point
if __name__ == '__main__':
    # create the process pool
    with Pool() as pool:
        # issue tasks and process results
        for result in pool.imap_unordered(task, range(10)):
            print(f'>got {result}')
```
Running the example issues all tasks to the pool, then receives and processes results in the order that tasks are completed, not the order that tasks were issued to the pool, i.e. unordered.
```
>got (6, 0.27185692519830873)
>got (7, 0.30517408991009)
>got (2, 0.4565919197158417)
>got (0, 0.4866540025699637)
>got (5, 0.5594145856578583)
>got (3, 0.6073766993405534)
>got (1, 0.6665710827894051)
>got (8, 0.4987608917896833)
>got (4, 0.8036914328418536)
>got (9, 0.49972284685751034)
```
Pattern 5: imap_unordered() and Wait for First Pattern
This pattern involves issuing many tasks to the process pool asynchronously, then waiting for the first result or first task to finish.
It is a helpful pattern when there may be multiple ways of getting a result but only the first result is required, after which all other tasks become irrelevant.
This can be achieved with the imap_unordered() function that, like the map() function, takes the name of a target function and an iterable of arguments.
It returns an iterable that yields return values in the order that tasks complete.
This iterable can then be traversed once manually via the built-in next() function, which blocks until the first task finishes and then returns its value.
For example:
```python
...
# issue tasks and process results
it = pool.imap_unordered(task, range(10))
# get the result from the first task to complete
result = next(it)
```
The result can then be processed and the process pool can be terminated, forcing any remaining tasks to stop immediately. This happens automatically via the context manager interface.
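For reference, a minimal sketch of the manual equivalent without a context manager, calling terminate() explicitly (the task() function here is the same contrived sleeper used throughout):

```python
# sketch: wait-for-first with explicit pool termination
# (assumption: a contrived sleeper task as used elsewhere in the tutorial)
from time import sleep
from random import random
from multiprocessing import Pool

# task to execute in a new process
def task(value):
    # block for a random fraction of a second
    sleep(random())
    # return result
    return (value, value * 10)

# protect the entry point
if __name__ == '__main__':
    # create the pool without a context manager
    pool = Pool()
    # issue all tasks asynchronously
    it = pool.imap_unordered(task, range(10))
    # block for the first completed task only
    result = next(it)
    print(f'>got {result}')
    # stop all remaining tasks immediately
    pool.terminate()
    # wait for the worker processes to exit
    pool.join()
```

The context manager form in the complete example below does the same terminate-and-join for us when the with-block exits.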
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# example of the imap_unordered and wait for first result usage pattern
from time import sleep
from random import random
from multiprocessing import Pool

# task to execute in a new process
def task(value):
    # generate a random value
    random_value = random()
    # block for a moment
    sleep(random_value)
    # return result
    return (value, random_value)

# protect the entry point
if __name__ == '__main__':
    # create the process pool
    with Pool() as pool:
        # issue tasks and process results
        it = pool.imap_unordered(task, range(10))
        # get the result from the first task to complete
        result = next(it)
        # report first result
        print(f'>got {result}')
```
Running the example first issues all of the tasks asynchronously.
The result from the first task to complete is then requested, which blocks until a result is available.
One task completes and returns a value, which is processed; the process pool and all remaining tasks are then terminated automatically.
```
>got (4, 0.41272860928850164)
```
Further Reading
This section provides additional resources that you may find helpful.
Books
- Multiprocessing Pool Jump-Start, Jason Brownlee (my book!)
- Multiprocessing API Interview Questions
- Pool Class API Cheat Sheet
I would also recommend specific chapters from these books:
- Effective Python, Brett Slatkin, 2019.
- See: Chapter 7: Concurrency and Parallelism
- High Performance Python, Ian Ozsvald and Micha Gorelick, 2020.
- See: Chapter 9: The multiprocessing Module
- Python in a Nutshell, Alex Martelli, et al., 2017.
- See: Chapter 14: Threads and Processes
Guides
- Python Multiprocessing Pool: The Complete Guide
- Python ThreadPool: The Complete Guide
- Python Multiprocessing: The Complete Guide
- Python ProcessPoolExecutor: The Complete Guide
Takeaways
You now know the common usage patterns for the multiprocessing.Pool in Python.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.
Photo by Faiz Prasla on Unsplash