Last Updated on September 13, 2022
You can apply a function to each item in an iterable in parallel using the Pool map() method.
In this tutorial you will discover how to use a parallel version of map() with the process pool in Python.
Let’s get started.
Need a Parallel Version of map()
The multiprocessing.pool.Pool in Python provides a pool of reusable processes for executing ad hoc tasks.
A process pool can be configured when it is created, which will prepare the child workers.
A process pool object which controls a pool of worker processes to which jobs can be submitted. It supports asynchronous results with timeouts and callbacks and has a parallel map implementation.
— multiprocessing — Process-based parallelism
The built-in map() function allows you to apply a function to each item in an iterable.
The Python process pool provides a parallel version of the map() function.
How can we use the parallel version of map() with the process pool?
How to Use Pool.map()
The process pool provides a parallel map function via Pool.map().
Recall that the built-in map() function will apply a given function to each item in a given iterable.
Return an iterator that applies function to every item of iterable, yielding the results.
— Built-in Functions
It yields one result for each call of the given function with one item from the iterable. It is common to call map() and iterate over the results in a for-loop.
For example:
...
# iterates results from map
for result in map(task, items):
    # ...
The multiprocessing.pool.Pool process pool provides a version of the map() function where the target function is called for each item in the provided iterable in parallel.
A parallel equivalent of the map() built-in function […]. It blocks until the result is ready.
— multiprocessing — Process-based parallelism
For example:
...
# iterates results from map
for result in pool.map(task, items):
    # ...
Each item in the iterable is taken as a separate task in the process pool.
Like the built-in map() function, the returned results will be in the order of the provided iterable. This means that results are returned in the same order that tasks were issued, regardless of the order in which the tasks complete.
Unlike the built-in map() function, the Pool.map() function only takes one iterable as an argument. This means that the target function executed in a worker process can only take a single argument.
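If your target function requires multiple arguments, one workaround is the Pool.starmap() method, which unpacks each item in the iterable into the arguments of a call. A minimal sketch, where task_with_args() is a hypothetical two-argument function:

...
# hypothetical target function that takes two arguments
def task_with_args(identifier, scale):
    return identifier * scale

# each tuple is unpacked into the arguments of one call
for result in pool.starmap(task_with_args, [(0, 2), (1, 2), (2, 2)]):
    # ...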
The iterable of items that is passed is iterated in full in order to issue all tasks to the process pool. Therefore, if the iterable is very long, it may result in many tasks waiting in memory to execute, e.g. one task per item in the iterable.
It is possible to split the items in the iterable evenly among the worker processes.
For example, if we had a process pool with 4 child worker processes and an iterable with 40 items, we can split up the items into 4 chunks of 10 items, with one chunk allocated to each worker process.
The effect is less overhead in transmitting tasks to worker processes and collecting results.
This can be achieved via the “chunksize” argument to map().
This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks. The (approximate) size of these chunks can be specified by setting chunksize to a positive integer.
— multiprocessing — Process-based parallelism
For example:
...
# iterates results from map with chunksize
for result in pool.map(task, items, chunksize=10):
    # ...
You can learn more about how to configure the chunksize in the tutorial:
https://superfastpython.com/multiprocessing-pool-map-chunksize/
Difference Between map() and map_async()
How does the Pool.map() function compare to the Pool.map_async() for issuing tasks to the process pool?
Both the map() and map_async() functions may be used to issue tasks that call a function for each item in an iterable via the process pool.
The following summarizes the key differences between these two functions:
- The map() function blocks, whereas the map_async() function does not block.
- The map() function returns a list of return values from the target function, whereas the map_async() function returns an AsyncResult.
- The map() function does not support callback functions, whereas the map_async() function can execute callback functions on return values and errors.
The map() function should be used for issuing target task functions to the process pool where the caller can or must block until all function calls are complete.
The map_async() function should be used for issuing target task functions to the process pool where the caller cannot or must not block while the task is executing.
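For example, a minimal sketch of map_async() with a callback; the handler() function name here is hypothetical:

...
# hypothetical callback, called once with the list of all return values
def handler(results):
    print(f'Got results: {results}', flush=True)

# issue all tasks to the pool without blocking
async_result = pool.map_async(task, range(10), callback=handler)
# the caller is free to do other things here...
# wait for all tasks to complete
async_result.wait()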
Now that we know how to use the map() function to execute tasks in the process pool, let’s look at some worked examples.
Example of Pool.map()
We can explore how to use the parallel version of map() on the process pool.
In this example, we can define a target task function that takes an integer as an argument, generates a random number, reports the value then returns the value that was generated. We can then call this function for each integer between 0 and 9 using the process pool map().
This will apply the function to each integer in parallel using as many cores as are available in the system.
Firstly, we can define the target task function.
The function takes an argument, generates a random number between 0 and 1, reports the integer and generated number. It then blocks for a fraction of a second to simulate computational effort, then returns the number that was generated.
The task() function below implements this.
# task executed in a worker process
def task(identifier):
    # generate a value
    value = random()
    # report a message
    print(f'Task {identifier} executing with {value}', flush=True)
    # block for a moment
    sleep(value)
    # return the generated value
    return value
We can then create and configure a process pool.
We will use the context manager interface to ensure the pool is shut down automatically once we are finished with it.
If you are new to the context manager interface of the process pool, you can learn more in the tutorial:
...
# create and configure the process pool
with Pool() as pool:
    # ...
We can then call the map() function on the process pool to apply our task() function to each value in the range 0 to 9.
This returns the results from the task() function, in the order that the tasks were issued (not the order in which the calls are completed). We will iterate over the results and report each in turn.
This can all be achieved in a for-loop.
...
# execute tasks in order
for result in pool.map(task, range(10)):
    print(f'Got result: {result}', flush=True)
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of parallel map() with the process pool
from random import random
from time import sleep
from multiprocessing.pool import Pool

# task executed in a worker process
def task(identifier):
    # generate a value
    value = random()
    # report a message
    print(f'Task {identifier} executing with {value}', flush=True)
    # block for a moment
    sleep(value)
    # return the generated value
    return value

# protect the entry point
if __name__ == '__main__':
    # create and configure the process pool
    with Pool() as pool:
        # execute tasks in order
        for result in pool.map(task, range(10)):
            print(f'Got result: {result}', flush=True)
    # process pool is closed automatically
Running the example first creates the process pool with a default configuration.
It will have one child worker process for each logical CPU in your system.
The map() function is then called for the range.
This issues ten calls to the task() function, one for each integer between 0 and 9. The results are returned for each function call, in the order the tasks were issued.
Each call to the task function generates a random number between 0 and 1, reports a message, blocks, then returns a value.
The main process iterates over the values returned from the calls to the task() function and reports the generated values, matching those generated in each child process.
Importantly, all task() function calls are issued and executed before the results are returned. The caller cannot iterate over results as they are completed.
Task 0 executing with 0.25274306680283887
Task 1 executing with 0.8117380416906219
Task 2 executing with 0.2249383542314648
Task 3 executing with 0.5441268743153568
Task 4 executing with 0.20938554670136755
Task 5 executing with 0.21474682340642592
Task 6 executing with 0.5280425396041588
Task 7 executing with 0.6644217040298188
Task 8 executing with 0.62318604008126
Task 9 executing with 0.21148696187351357
Got result: 0.25274306680283887
Got result: 0.8117380416906219
Got result: 0.2249383542314648
Got result: 0.5441268743153568
Got result: 0.20938554670136755
Got result: 0.21474682340642592
Got result: 0.5280425396041588
Got result: 0.6644217040298188
Got result: 0.62318604008126
Got result: 0.21148696187351357
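If you need to handle results as tasks complete, the process pool also provides the imap() and imap_unordered() methods. For example, a minimal sketch using imap_unordered(), which yields return values in completion order:

...
# iterate results in the order that tasks complete
for result in pool.imap_unordered(task, range(10)):
    print(f'Got result: {result}', flush=True)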
Next, let’s look at an example where we call map() for a function with no return value.
Example of Pool.map() with No Return Value
We can explore using the map() function to call a function for each item in an iterable that does not have a return value.
This means that we are not interested in the iterable of results returned by the call to map() and instead are only interested that all issued tasks get executed.
This can be achieved by updating the previous example so that the task() function does not return a value.
The updated task() function with this change is listed below.
# task executed in a worker process
def task(identifier):
    # generate a value
    value = random()
    # report a message
    print(f'Task {identifier} executing with {value}', flush=True)
    # block for a moment
    sleep(value)
Then, in the main process, we can call map() with our task() function and the range, and not iterate the results.
...
# execute tasks, block until all completed
pool.map(task, range(10))
Importantly, the call to map() on the process pool will block the main process until all issued tasks are completed.
Once completed, the call will return and the process pool will be closed by the context manager.
This is a helpful pattern to issue many tasks to the process pool with a single function call.
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of parallel map() with the process pool and a task that does not return a value
from random import random
from time import sleep
from multiprocessing.pool import Pool

# task executed in a worker process
def task(identifier):
    # generate a value
    value = random()
    # report a message
    print(f'Task {identifier} executing with {value}', flush=True)
    # block for a moment
    sleep(value)

# protect the entry point
if __name__ == '__main__':
    # create and configure the process pool
    with Pool() as pool:
        # execute tasks, block until all completed
        pool.map(task, range(10))
    # process pool is closed automatically
Running the example first creates the process pool with a default configuration.
The map() function is then called for the range. This issues ten calls to the task() function, one for each integer between 0 and 9. A result is returned for each function call (None in this case), but it is ignored.
Each call to the task function generates a random number between 0 and 1, reports a message, then blocks.
The main process blocks until the map() function returns.
The tasks finish, map() returns, then the process pool is closed.
This example again highlights that the call to map() blocks until all issued tasks are completed.
Task 0 executing with 0.3430883422801846
Task 1 executing with 0.505941095387199
Task 2 executing with 0.2950340395105896
Task 3 executing with 0.49912260710720713
Task 4 executing with 0.4541276718378785
Task 5 executing with 0.2993272526844628
Task 6 executing with 0.06380444907074578
Task 7 executing with 0.2387238116999184
Task 8 executing with 0.3311496041557106
Task 9 executing with 0.8087950700153486
Next, let’s explore the chunksize argument to the map() function.
Example of Pool.map() with chunksize
The map() function will apply a function to each item in an iterable.
If the iterable has a large number of items, it may be inefficient to issue function calls to the target function for each item.
A more efficient approach would be to divide the items in the iterable into chunks and issue chunks of items to each worker process to which the target function can be applied.
This can be achieved with the “chunksize” argument to the map() function.
We can demonstrate this with an example, first without using the “chunksize” argument, then using the “chunksize” to speed up the division of work.
If a chunksize is not specified, a default is used that has been found to be very effective. You can learn more about the default chunksize and how to configure it in the tutorial:
https://superfastpython.com/multiprocessing-pool-map-chunksize/
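For reference, the default is computed from the length of the iterable and the number of workers. The sketch below is roughly equivalent to the heuristic in CPython's multiprocessing/pool.py; details may vary between Python versions:

# approximation of the default chunksize heuristic used by Pool.map()
def default_chunksize(num_items, num_workers):
    # target about 4 chunks per worker process
    chunksize, extra = divmod(num_items, num_workers * 4)
    # round up if the division is not exact
    if extra:
        chunksize += 1
    return chunksize

# e.g. 100 items and 8 workers gives a chunksize of 4
print(default_chunksize(100, 8))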
Example Without Chunks
Before we demonstrate the “chunksize” argument, we can devise an example that has a reasonably large iterable.
In this example, we can update the previous example to call the task() function that returns a value as before, but to have 4 child worker processes and to call the task() function 40 times, for the integers 0 to 39.
We will also set the chunksize to 1 explicitly, so that each function call is issued to the process pool as a separate task, rather than relying on the default chunksize.
This number of child workers and calls to task() were chosen so that we can test the “chunksize” argument in the next section. If you have fewer than 4 logical CPU cores in your system, change the example accordingly, e.g. 2 worker processes and 20 tasks.
...
# create and configure the process pool
with Pool(4) as pool:
    # execute tasks, block until all complete
    pool.map(task, range(40), chunksize=1)
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of parallel map() with the process pool with a larger iterable
from random import random
from time import sleep
from multiprocessing.pool import Pool

# task executed in a worker process
def task(identifier):
    # generate a value
    value = random()
    # report a message
    print(f'Task {identifier} executing with {value}', flush=True)
    # block for a moment
    sleep(1)
    # return the generated value
    return value

# protect the entry point
if __name__ == '__main__':
    # create and configure the process pool
    with Pool(4) as pool:
        # execute tasks, block until all complete
        pool.map(task, range(40), chunksize=1)
    # process pool is closed automatically
Running the example first creates the process pool with 4 child process workers.
The map() function is then called for the range.
This issues 40 calls to the task() function, one for each integer between 0 and 39. A result is returned for each function call, although the return values are not used in this example.
Each call to the task function generates a random number between 0 and 1, reports a message, blocks, then returns a value.
The main process blocks until all tasks are complete.
On my system, the example took about 12.2 seconds to complete.
Task 0 executing with 0.7061242557983359
Task 3 executing with 0.05859203085511244
Task 6 executing with 0.40769994096580686
Task 9 executing with 0.975808777668726
Task 7 executing with 0.3606366646880488
Task 4 executing with 0.4615046388849908
Task 1 executing with 0.6433608731245162
Task 10 executing with 0.2956957558920491
Task 5 executing with 0.12872460583527878
Task 8 executing with 0.4309493777029251
Task 2 executing with 0.9967886503954435
Task 11 executing with 0.79955045339096
Task 12 executing with 0.7733438980156998
Task 15 executing with 0.7297915340208303
Task 18 executing with 0.30748964608813956
Task 21 executing with 0.4892827544960867
Task 13 executing with 0.018267951671115612
Task 16 executing with 0.3476333209014224
Task 19 executing with 0.3672864184918577
Task 22 executing with 0.5694450561676629
Task 20 executing with 0.7252998536028592
Task 14 executing with 0.3629754469857971
Task 17 executing with 0.5656610127828164
Task 23 executing with 0.7373224296791655
Task 24 executing with 0.7512826991090196
Task 27 executing with 0.2518482123784175
Task 30 executing with 0.9687271449118141
Task 33 executing with 0.5440843351926187
Task 28 executing with 0.4330456383203588
Task 25 executing with 0.6393762333394067
Task 31 executing with 0.4132713320426247
Task 34 executing with 0.03741703848483546
Task 29 executing with 0.6998057838359145
Task 32 executing with 0.33413878862216695
Task 26 executing with 0.07988830486590015
Task 35 executing with 0.4874276457218072
Task 36 executing with 0.8992767229821009
Task 39 executing with 0.6591309056700052
Task 37 executing with 0.6445276144216043
Task 38 executing with 0.2666553360176942
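If you want to reproduce this timing yourself, one simple approach is to record the overall duration around the call to map() using time.perf_counter(). A minimal sketch:

...
from time import perf_counter
# record the start time
start = perf_counter()
# execute tasks, block until all complete
pool.map(task, range(40), chunksize=1)
# report the overall duration
print(f'Took {perf_counter() - start:.1f} seconds')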
Next, let’s look at the same example using a chunksize.
Example With Chunks
We can update the previous example to use a larger chunksize.
This can be achieved by changing the “chunksize” argument in the call to map().
Given that we are issuing 40 calls to the task() function to the process pool and that we have 4 child worker processes, we can divide the work evenly into 4 groups of ten function calls.
This even division is a good starting point, but it is always a good idea to test different values of the chunksize and optimize it for the performance of your specific application with some trial and error.
This means we can set the chunksize to 10.
...
# execute tasks in chunks, block until all complete
pool.map(task, range(40), chunksize=10)
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of parallel map() with the process pool with a larger iterable and chunksize
from random import random
from time import sleep
from multiprocessing.pool import Pool

# task executed in a worker process
def task(identifier):
    # generate a value
    value = random()
    # report a message
    print(f'Task {identifier} executing with {value}', flush=True)
    # block for a moment
    sleep(1)
    # return the generated value
    return value

# protect the entry point
if __name__ == '__main__':
    # create and configure the process pool
    with Pool(4) as pool:
        # execute tasks in chunks, block until all complete
        pool.map(task, range(40), chunksize=10)
    # process pool is closed automatically
Running the example first creates the process pool with 4 child process workers.
The map() function is then called for the range with a chunksize of 10.
This issues 4 units of work to the process pool, one for each child worker process and each composed of 10 calls to the task() function.
Each call to the task function generates a random number between 0 and 1, reports a message, blocks, then returns a value.
The main process blocks until all tasks are complete.
On my system, the example took about 10.2 seconds to complete. This is about 1.20x faster.
Task 0 executing with 0.794491382811209
Task 10 executing with 0.5477217960378873
Task 20 executing with 0.15192050102920973
Task 30 executing with 0.7939659194505554
Task 11 executing with 0.1485413807153544
Task 1 executing with 0.7242337657726589
Task 21 executing with 0.05628878686776417
Task 31 executing with 0.28188516025547705
Task 12 executing with 0.9181996728216912
Task 2 executing with 0.4527842285444894
Task 22 executing with 0.23260582481210912
Task 32 executing with 0.15269151485471666
Task 3 executing with 0.09408707501848723
Task 13 executing with 0.8526576467474918
Task 33 executing with 0.6428449205480097
Task 23 executing with 0.798258040893137
Task 4 executing with 0.30309622660884106
Task 14 executing with 0.1981239223133333
Task 34 executing with 0.8958626267543908
Task 24 executing with 0.8147988483026208
Task 5 executing with 0.49226446286312964
Task 15 executing with 0.5638858505977956
Task 35 executing with 0.5455297420654162
Task 25 executing with 0.3405217036459983
Task 6 executing with 0.7663732832136095
Task 16 executing with 0.668594165814806
Task 36 executing with 0.7119646745796702
Task 26 executing with 0.557524963023561
Task 17 executing with 0.5460869840407998
Task 7 executing with 0.12773650330181796
Task 27 executing with 0.7306083569133269
Task 37 executing with 0.4922912795009039
Task 18 executing with 0.7748822500544968
Task 8 executing with 0.38025694297438917
Task 38 executing with 0.20040403315059507
Task 28 executing with 0.5279419278818743
Task 19 executing with 0.9922071902982653
Task 9 executing with 0.828102916388147
Task 39 executing with 0.4737291088697473
Task 29 executing with 0.786178588857227
Further Reading
This section provides additional resources that you may find helpful.
Books
- Multiprocessing Pool Jump-Start, Jason Brownlee (my book!)
- Multiprocessing API Interview Questions
- Pool Class API Cheat Sheet
I would also recommend specific chapters from these books:
- Effective Python, Brett Slatkin, 2019.
- See: Chapter 7: Concurrency and Parallelism
- High Performance Python, Ian Ozsvald and Micha Gorelick, 2020.
- See: Chapter 9: The multiprocessing Module
- Python in a Nutshell, Alex Martelli, et al., 2017.
- See: Chapter 14: Threads and Processes
Guides
- Python Multiprocessing Pool: The Complete Guide
- Python ThreadPool: The Complete Guide
- Python Multiprocessing: The Complete Guide
- Python ProcessPoolExecutor: The Complete Guide
APIs
- multiprocessing — Process-based parallelism
Takeaways
You now know how to use a parallel version of map() with the process pool.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.
ishaanforthewin says
It is important to remember (and this article should mention it) that if you don’t specify a custom chunksize for map(), a default chunksize is computed as: chunksize, extra = divmod(len(iterable), len(self.pool) * 4); if extra: chunksize += 1. So, if you’ve got a pool of 8 workers and 100 jobs, the chunksize will be 4.
This has been found to give the best performance in the average case.
Jason Brownlee says
Great point!
I updated the first example to set the chunksize to 1, instead of using the default. Thanks!
I cover chunksize including the default and how to configure it for map() in this tutorial:
https://superfastpython.com/multiprocessing-pool-map-chunksize/
I’ve updated the above to link to the tutorial in the relevant places.
Hua says
If my function’s input and output are arrays, pool.map() returns None instead of an array. Why does that happen?
Jason Brownlee says
The Pool.map() method can support target functions that take an array and return an array, e.g. a numpy array.
I suspect the problem is not with map(), but perhaps a bug in the way you are using map().
Perhaps you can try to reduce your code example to a minimum possible example and post it below?