Last Updated on September 13, 2022
You can apply a function to each item in an iterable in parallel using the Pool map() method.
In this tutorial you will discover how to use a parallel version of map() with the process pool in Python.
Let’s get started.
Need a Parallel Version of map()
The multiprocessing.pool.Pool in Python provides a pool of reusable processes for executing ad hoc tasks.
A process pool can be configured when it is created, which will prepare the child workers.
A process pool object which controls a pool of worker processes to which jobs can be submitted. It supports asynchronous results with timeouts and callbacks and has a parallel map implementation.
— multiprocessing — Process-based parallelism
The built-in map() function allows you to apply a function to each item in an iterable.
The Python process pool provides a parallel version of the map() function.
How can we use the parallel version of map() with the process pool?
How to Use Pool.map()
The process pool provides a parallel map function via Pool.map().
Recall that the built-in map() function will apply a given function to each item in a given iterable.
Return an iterator that applies function to every item of iterable, yielding the results.
— Built-in Functions
It yields one result for each call of the given function with one item from the iterable. It is common to call map() and iterate over the results in a for-loop.
For example:
...
# iterates results from map
for result in map(task, items):
    # ...
The multiprocessing.pool.Pool process pool provides a version of the map() function where the target function is called for each item in the provided iterable in parallel.
A parallel equivalent of the map() built-in function […]. It blocks until the result is ready.
— multiprocessing — Process-based parallelism
For example:
...
# iterates results from map
for result in pool.map(task, items):
    # ...
Each item in the iterable is taken as a separate task in the process pool.
Like the built-in map() function, the returned results will be in the order of the provided iterable. This means that results are returned in the same order that tasks were issued, regardless of the order in which the tasks complete.
Unlike the built-in map() function, the Pool.map() function only takes one iterable as an argument. This means that the target function executed in a worker process can only take a single argument.
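If your target function requires multiple arguments, one workaround is the Pool.starmap() method, which unpacks each item in the iterable into the arguments of a call. A minimal sketch, where task_with_args() is a hypothetical two-argument function:

...
# hypothetical target function that takes two arguments
def task_with_args(identifier, scale):
    return identifier * scale

# each tuple is unpacked into the arguments of one call
for result in pool.starmap(task_with_args, [(0, 2), (1, 2), (2, 2)]):
    # ...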
The iterable of items that is passed is iterated in full in order to issue all tasks to the process pool. Therefore, if the iterable is very long, it may result in many tasks waiting in memory to execute, e.g. one task per item in the iterable.
It is possible to split the items in the iterable evenly among the worker processes.
For example, if we had a process pool with 4 child worker processes and an iterable with 40 items, we can split up the items into 4 chunks of 10 items, with one chunk allocated to each worker process.
The effect is less overhead in transmitting tasks to worker processes and collecting results.
This can be achieved via the “chunksize” argument to map().
This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks. The (approximate) size of these chunks can be specified by setting chunksize to a positive integer.
— multiprocessing — Process-based parallelism
For example:
...
# iterates results from map with chunksize
for result in pool.map(task, items, chunksize=10):
    # ...
You can learn more about how to configure the chunksize in the tutorial:
https://superfastpython.com/multiprocessing-pool-map-chunksize/
Difference Between map() and map_async()
How does the Pool.map() function compare to the Pool.map_async() for issuing tasks to the process pool?
Both the map() and map_async() functions may be used to issue tasks that call a function for each item in an iterable via the process pool.
The following summarizes the key differences between these two functions:
- The map() function blocks, whereas the map_async() function does not block.
- The map() function returns a list of return values from the target function, whereas the map_async() function returns an AsyncResult.
- The map() function does not support callback functions, whereas the map_async() function can execute callback functions on return values and errors.
The map() function should be used for issuing target task functions to the process pool where the caller can or must block until all function calls are complete.
The map_async() function should be used for issuing target task functions to the process pool where the caller cannot or must not block while the task is executing.
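For example, a minimal sketch of map_async() with a callback; the handler() function name here is hypothetical:

...
# hypothetical callback, called once with the list of all return values
def handler(results):
    print(f'Got results: {results}', flush=True)

# issue all tasks to the pool without blocking
async_result = pool.map_async(task, range(10), callback=handler)
# the caller is free to do other things here...
# wait for all tasks to complete
async_result.wait()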
Now that we know how to use the map() function to execute tasks in the process pool, let’s look at some worked examples.
Example of Pool.map()
We can explore how to use the parallel version of map() on the process pool.
In this example, we can define a target task function that takes an integer as an argument, generates a random number, reports the value then returns the value that was generated. We can then call this function for each integer between 0 and 9 using the process pool map().
This will apply the function to each integer in parallel using as many cores as are available in the system.
Firstly, we can define the target task function.
The function takes an argument, generates a random number between 0 and 1, reports the integer and generated number. It then blocks for a fraction of a second to simulate computational effort, then returns the number that was generated.
The task() function below implements this.
# task executed in a worker process
def task(identifier):
    # generate a value
    value = random()
    # report a message
    print(f'Task {identifier} executing with {value}', flush=True)
    # block for a moment
    sleep(value)
    # return the generated value
    return value
We can then create and configure a process pool.
We will use the context manager interface to ensure the pool is shut down automatically once we are finished with it.
If you are new to the context manager interface of the process pool, you can learn more in the tutorial:
...
# create and configure the process pool
with Pool() as pool:
    # ...
We can then call the map() function on the process pool to apply our task() function to each value in the range 0 to 9.
This returns the results from the task() function, in the order that the tasks were issued (not the order in which the calls are completed). We will iterate over the results and report each in turn.
This can all be achieved in a for-loop.
...
# execute tasks in order
for result in pool.map(task, range(10)):
    print(f'Got result: {result}', flush=True)
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of parallel map() with the process pool
from random import random
from time import sleep
from multiprocessing.pool import Pool

# task executed in a worker process
def task(identifier):
    # generate a value
    value = random()
    # report a message
    print(f'Task {identifier} executing with {value}', flush=True)
    # block for a moment
    sleep(value)
    # return the generated value
    return value

# protect the entry point
if __name__ == '__main__':
    # create and configure the process pool
    with Pool() as pool:
        # execute tasks in order
        for result in pool.map(task, range(10)):
            print(f'Got result: {result}', flush=True)
    # process pool is closed automatically
Running the example first creates the process pool with a default configuration.
It will have one child worker process for each logical CPU in your system.
The map() function is then called for the range.
This issues ten calls to the task() function, one for each integer between 0 and 9. The results are returned for each function call, in the order the tasks were issued.
Each call to the task function generates a random number between 0 and 1, reports a message, blocks, then returns a value.
The main process iterates over the values returned from the calls to the task() function and reports the generated values, matching those generated in each child process.
Importantly, all task() function calls are issued and executed before the results are returned. The caller cannot iterate over results as they are completed.
Task 0 executing with 0.25274306680283887
Task 1 executing with 0.8117380416906219
Task 2 executing with 0.2249383542314648
Task 3 executing with 0.5441268743153568
Task 4 executing with 0.20938554670136755
Task 5 executing with 0.21474682340642592
Task 6 executing with 0.5280425396041588
Task 7 executing with 0.6644217040298188
Task 8 executing with 0.62318604008126
Task 9 executing with 0.21148696187351357
Got result: 0.25274306680283887
Got result: 0.8117380416906219
Got result: 0.2249383542314648
Got result: 0.5441268743153568
Got result: 0.20938554670136755
Got result: 0.21474682340642592
Got result: 0.5280425396041588
Got result: 0.6644217040298188
Got result: 0.62318604008126
Got result: 0.21148696187351357
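If you need to handle results as tasks complete, the process pool also provides the imap() and imap_unordered() methods. For example, a minimal sketch using imap_unordered(), which yields return values in completion order:

...
# iterate results in the order that tasks complete
for result in pool.imap_unordered(task, range(10)):
    print(f'Got result: {result}', flush=True)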
Next, let’s look at an example where we call map() for a function with no return value.
Example of Pool.map() with No Return Value
We can explore using the map() function to call a function for each item in an iterable that does not have a return value.
This means that we are not interested in the iterable of results returned by the call to map() and instead are only interested that all issued tasks get executed.
This can be achieved by updating the previous example so that the task() function does not return a value.
The updated task() function with this change is listed below.
# task executed in a worker process
def task(identifier):
    # generate a value
    value = random()
    # report a message
    print(f'Task {identifier} executing with {value}', flush=True)
    # block for a moment
    sleep(value)
Then, in the main process, we can call map() with our task() function and the range, and not iterate the results.
...
# execute tasks, block until all completed
pool.map(task, range(10))
Importantly, the call to map() on the process pool will block the main process until all issued tasks are completed.
Once completed, the call will return and the process pool will be closed by the context manager.
This is a helpful pattern to issue many tasks to the process pool with a single function call.
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of parallel map() with the process pool and a task that does not return a value
from random import random
from time import sleep
from multiprocessing.pool import Pool

# task executed in a worker process
def task(identifier):
    # generate a value
    value = random()
    # report a message
    print(f'Task {identifier} executing with {value}', flush=True)
    # block for a moment
    sleep(value)

# protect the entry point
if __name__ == '__main__':
    # create and configure the process pool
    with Pool() as pool:
        # execute tasks, block until all completed
        pool.map(task, range(10))
    # process pool is closed automatically
Running the example first creates the process pool with a default configuration.
The map() function is then called for the range. This issues ten calls to the task() function, one for each integer between 0 and 9. A result is returned for each function call (None in this case), but it is ignored.
Each call to the task function generates a random number between 0 and 1, reports a message, then blocks.
The main process blocks until the map() function returns.
The tasks finish, map() returns, then the process pool is closed.
This example again highlights that the call to map() blocks until all issued tasks are completed.
Task 0 executing with 0.3430883422801846
Task 1 executing with 0.505941095387199
Task 2 executing with 0.2950340395105896
Task 3 executing with 0.49912260710720713
Task 4 executing with 0.4541276718378785
Task 5 executing with 0.2993272526844628
Task 6 executing with 0.06380444907074578
Task 7 executing with 0.2387238116999184
Task 8 executing with 0.3311496041557106
Task 9 executing with 0.8087950700153486
Next, let’s explore the chunksize argument to the map() function.
Example of Pool.map() with chunksize
The map() function will apply a function to each item in an iterable.
If the iterable has a large number of items, it may be inefficient to issue function calls to the target function for each item.
A more efficient approach would be to divide the items in the iterable into chunks and issue chunks of items to each worker process to which the target function can be applied.
This can be achieved with the “chunksize” argument to the map() function.
We can demonstrate this with an example, first without using the “chunksize” argument, then using the “chunksize” to speed up the division of work.
If a chunksize is not specified, a default is used that has been found to be very effective. You can learn more about the default chunksize and how to configure it in the tutorial:
https://superfastpython.com/multiprocessing-pool-map-chunksize/
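For reference, the default is computed from the length of the iterable and the number of workers. The sketch below is roughly equivalent to the heuristic in CPython's multiprocessing/pool.py; details may vary between Python versions:

# approximation of the default chunksize heuristic used by Pool.map()
def default_chunksize(num_items, num_workers):
    # target about 4 chunks per worker process
    chunksize, extra = divmod(num_items, num_workers * 4)
    # round up if the division is not exact
    if extra:
        chunksize += 1
    return chunksize

# e.g. 100 items and 8 workers gives a chunksize of 4
print(default_chunksize(100, 8))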
Example Without Chunks
Before we demonstrate the “chunksize” argument, we can devise an example that has a reasonably large iterable.
In this example, we can update the previous example to call the task() function that returns a value as before, but to have 4 child worker processes and to call the task() function 40 times, for the integers 0 to 39.
We will also set the chunksize to 1 explicitly, so that each function call is issued to the process pool as a separate task, rather than relying on the default chunksize.
This number of child workers and calls to task() were chosen so that we can test the “chunksize” argument in the next section. If you have fewer than 4 logical CPU cores in your system, change the example accordingly, e.g. 2 worker processes and 20 tasks.
...
# create and configure the process pool
with Pool(4) as pool:
    # execute tasks, block until all complete
    pool.map(task, range(40), chunksize=1)
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of parallel map() with the process pool with a larger iterable
from random import random
from time import sleep
from multiprocessing.pool import Pool

# task executed in a worker process
def task(identifier):
    # generate a value
    value = random()
    # report a message
    print(f'Task {identifier} executing with {value}', flush=True)
    # block for a moment
    sleep(1)
    # return the generated value
    return value

# protect the entry point
if __name__ == '__main__':
    # create and configure the process pool
    with Pool(4) as pool:
        # execute tasks, block until all complete
        pool.map(task, range(40), chunksize=1)
    # process pool is closed automatically
Running the example first creates the process pool with 4 child process workers.
The map() function is then called for the range.
This issues 40 calls to the task() function, one for each integer between 0 and 39. A result is returned for each function call, although the return values are not used in this example.
Each call to the task function generates a random number between 0 and 1, reports a message, blocks, then returns a value.
The main process blocks until all tasks are complete.
On my system, the example took about 12.2 seconds to complete.
Task 0 executing with 0.7061242557983359
Task 3 executing with 0.05859203085511244
Task 6 executing with 0.40769994096580686
Task 9 executing with 0.975808777668726
Task 7 executing with 0.3606366646880488
Task 4 executing with 0.4615046388849908
Task 1 executing with 0.6433608731245162
Task 10 executing with 0.2956957558920491
Task 5 executing with 0.12872460583527878
Task 8 executing with 0.4309493777029251
Task 2 executing with 0.9967886503954435
Task 11 executing with 0.79955045339096
Task 12 executing with 0.7733438980156998
Task 15 executing with 0.7297915340208303
Task 18 executing with 0.30748964608813956
Task 21 executing with 0.4892827544960867
Task 13 executing with 0.018267951671115612
Task 16 executing with 0.3476333209014224
Task 19 executing with 0.3672864184918577
Task 22 executing with 0.5694450561676629
Task 20 executing with 0.7252998536028592
Task 14 executing with 0.3629754469857971
Task 17 executing with 0.5656610127828164
Task 23 executing with 0.7373224296791655
Task 24 executing with 0.7512826991090196
Task 27 executing with 0.2518482123784175
Task 30 executing with 0.9687271449118141
Task 33 executing with 0.5440843351926187
Task 28 executing with 0.4330456383203588
Task 25 executing with 0.6393762333394067
Task 31 executing with 0.4132713320426247
Task 34 executing with 0.03741703848483546
Task 29 executing with 0.6998057838359145
Task 32 executing with 0.33413878862216695
Task 26 executing with 0.07988830486590015
Task 35 executing with 0.4874276457218072
Task 36 executing with 0.8992767229821009
Task 39 executing with 0.6591309056700052
Task 37 executing with 0.6445276144216043
Task 38 executing with 0.2666553360176942
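If you want to reproduce this timing yourself, one simple approach is to record the overall duration around the call to map() using time.perf_counter(). A minimal sketch:

...
from time import perf_counter
# record the start time
start = perf_counter()
# execute tasks, block until all complete
pool.map(task, range(40), chunksize=1)
# report the overall duration
print(f'Took {perf_counter() - start:.1f} seconds')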
Next, let’s look at the same example using a chunksize.
Example With Chunks
We can update the previous example to use a larger chunksize.
This can be achieved by changing the “chunksize” argument in the call to map().
Given that we are issuing 40 calls to the task() function to the process pool and that we have 4 child worker processes, we can divide the work evenly into 4 groups of ten function calls.
This even division is a good starting point, but it is always a good idea to test different values of the chunksize and optimize it for the performance of your specific application with some trial and error.
This means we can set the chunksize to 10.
...
# execute tasks in chunks, block until all complete
pool.map(task, range(40), chunksize=10)
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of parallel map() with the process pool with a larger iterable and chunksize
from random import random
from time import sleep
from multiprocessing.pool import Pool

# task executed in a worker process
def task(identifier):
    # generate a value
    value = random()
    # report a message
    print(f'Task {identifier} executing with {value}', flush=True)
    # block for a moment
    sleep(1)
    # return the generated value
    return value

# protect the entry point
if __name__ == '__main__':
    # create and configure the process pool
    with Pool(4) as pool:
        # execute tasks in chunks, block until all complete
        pool.map(task, range(40), chunksize=10)
    # process pool is closed automatically
Running the example first creates the process pool with 4 child process workers.
The map() function is then called for the range with a chunksize of 10.
This issues 4 units of work to the process pool, one for each child worker process and each composed of 10 calls to the task() function.
Each call to the task function generates a random number between 0 and 1, reports a message, blocks, then returns a value.
The main process blocks until all tasks are complete.
On my system, the example took about 10.2 seconds to complete. This is about 1.20x faster.
Task 0 executing with 0.794491382811209
Task 10 executing with 0.5477217960378873
Task 20 executing with 0.15192050102920973
Task 30 executing with 0.7939659194505554
Task 11 executing with 0.1485413807153544
Task 1 executing with 0.7242337657726589
Task 21 executing with 0.05628878686776417
Task 31 executing with 0.28188516025547705
Task 12 executing with 0.9181996728216912
Task 2 executing with 0.4527842285444894
Task 22 executing with 0.23260582481210912
Task 32 executing with 0.15269151485471666
Task 3 executing with 0.09408707501848723
Task 13 executing with 0.8526576467474918
Task 33 executing with 0.6428449205480097
Task 23 executing with 0.798258040893137
Task 4 executing with 0.30309622660884106
Task 14 executing with 0.1981239223133333
Task 34 executing with 0.8958626267543908
Task 24 executing with 0.8147988483026208
Task 5 executing with 0.49226446286312964
Task 15 executing with 0.5638858505977956
Task 35 executing with 0.5455297420654162
Task 25 executing with 0.3405217036459983
Task 6 executing with 0.7663732832136095
Task 16 executing with 0.668594165814806
Task 36 executing with 0.7119646745796702
Task 26 executing with 0.557524963023561
Task 17 executing with 0.5460869840407998
Task 7 executing with 0.12773650330181796
Task 27 executing with 0.7306083569133269
Task 37 executing with 0.4922912795009039
Task 18 executing with 0.7748822500544968
Task 8 executing with 0.38025694297438917
Task 38 executing with 0.20040403315059507
Task 28 executing with 0.5279419278818743
Task 19 executing with 0.9922071902982653
Task 9 executing with 0.828102916388147
Task 39 executing with 0.4737291088697473
Task 29 executing with 0.786178588857227
Further Reading
This section provides additional resources that you may find helpful.
Books
- Multiprocessing Pool Jump-Start, Jason Brownlee (my book!)
- Multiprocessing API Interview Questions
- Pool Class API Cheat Sheet
I would also recommend specific chapters from these books:
- Effective Python, Brett Slatkin, 2019.
- See: Chapter 7: Concurrency and Parallelism
- High Performance Python, Ian Ozsvald and Micha Gorelick, 2020.
- See: Chapter 9: The multiprocessing Module
- Python in a Nutshell, Alex Martelli, et al., 2017.
- See: Chapter 14: Threads and Processes
Guides
- Python Multiprocessing Pool: The Complete Guide
- Python ThreadPool: The Complete Guide
- Python Multiprocessing: The Complete Guide
- Python ProcessPoolExecutor: The Complete Guide
APIs
- multiprocessing — Process-based parallelism
Takeaways
You now know how to use a parallel version of map() with the process pool.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.
ishaanforthewin says
It is important to remember (and this article should mention it) that if you don’t specify a custom chunksize for map(), a default chunksize is computed as: chunksize, extra = divmod(len(iterable), len(self.pool) * 4); if extra: chunksize += 1. So, if you’ve got a pool of 8 workers and 100 jobs, the chunksize will be 4.
This has been found to give the best performance in the average case.
Jason Brownlee says
Great point!
I updated the first example to set the chunksize to 1, instead of using the default. Thanks!
I cover chunksize including the default and how to configure it for map() in this tutorial:
https://superfastpython.com/multiprocessing-pool-map-chunksize/
I’ve updated the above to link to the tutorial in the relevant places.
Hua says
If my function’s input and output are arrays, pool.map() returns None instead of an array. Why does that happen?
Jason Brownlee says
The Pool.map() method can support target functions that take an array and return an array, e.g. a numpy array.
I suspect the problem is not with map(), but perhaps a bug in the way you are using map().
Perhaps you can try to reduce your code example to a minimum possible example and post it below?