7 Multiprocessing Pool Common Errors in Python
You may encounter one of a number of common errors when using the multiprocessing.Pool in Python. These errors are typically easy to identify and quick to fix.
In this tutorial you will discover the common errors when using multiprocessing pools in Python and how to fix each in turn.
Let's get started.
Common Errors When Using Multiprocessing Pool
There are a number of common errors when using the multiprocessing.Pool.
These errors are typically made because of bugs introduced by copy-and-pasting code, or from a slight misunderstanding in how the multiprocessing.Pool works.
We will take a closer look at some of the more common errors made when using the multiprocessing.Pool, such as:
- Forgetting __main__
- Using a Function Call in apply_async()
- Using a Function Call in map()
- Incorrect Function Signature for map()
- Incorrect Function Signature for Callbacks
- Arguments or Shared Data that Does Not Pickle
- Not Flushing print() Statements
Do you have an error using the multiprocessing.Pool?
Let me know in the comments so I can recommend a fix and add the case to this tutorial.
Error 1: Forgetting __main__
By far the biggest error when using the multiprocessing Pool is forgetting to protect the entry point, i.e. failing to check for the __main__ module.
Recall that when using processes in Python, such as the Process class or the multiprocessing.Pool class, we must include a check for the top-level environment. This is specifically the case when using the 'spawn' start method, the default on Windows and macOS, but it is good practice regardless.
We can check for the top-level environment by checking if the module name variable __name__ is equal to the string '__main__'.
This indicates that the code is running at the top-level code environment, rather than being imported by a program or script.
For example:
# entry point
if __name__ == '__main__':
    # ...
Forgetting to protect the entry point will result in an error that can be quite confusing.
A complete example of using the multiprocessing.Pool without a check for the __main__ module is listed below.
# SuperFastPython.com
# example of not having a check for the main top-level environment
from time import sleep
from multiprocessing import Pool

# custom task that will sleep for a moment
def task(value):
    # block for a moment
    sleep(1)
    return value

# start the process pool
with Pool() as pool:
    # submit all tasks
    for result in pool.map(task, range(5)):
        print(result)
Running this example will fail with a RuntimeError.
Traceback (most recent call last):
...
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
The error message does include information about the need to protect the entry point of the program, but it also mentions freeze_support(), which can be confusing for beginners.
This error can be fixed by protecting the entry point of the program with an if-statement:
if __name__ == '__main__':
    # ...
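For completeness, here is the example from above with the entry point protected. It should print the values 0 through 4 without error (the sleep is shortened here so it runs quickly).

```python
# example with a check for the main top-level environment
from time import sleep
from multiprocessing import Pool

# custom task that will sleep for a moment
def task(value):
    # block for a moment
    sleep(0.1)
    return value

# protect the entry point
if __name__ == '__main__':
    # start the process pool
    with Pool() as pool:
        # submit all tasks and report results
        for result in pool.map(task, range(5)):
            print(result)
```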
Error 2: Using a Function Call in apply_async()
A common error is to call your function when using the apply_async() function.
For example:
...
# issue the task
result = pool.apply_async(task())
A complete example with this error is listed below.
# SuperFastPython.com
# example of calling apply_async() with a function call
from time import sleep
from multiprocessing import Pool

# custom function executed in another process
def task():
    # block for a moment
    sleep(1)
    return 'all done'

# protect the entry point
if __name__ == '__main__':
    # start the process pool
    with Pool() as pool:
        # issue the task (note the erroneous function call)
        result = pool.apply_async(task())
        # get the result
        value = result.get()
        print(value)
Running this example will fail with an error.
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
...
TypeError: 'str' object is not callable
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
...
TypeError: 'str' object is not callable
You can fix the error by updating the call to apply_async() to take the name of your function, rather than a call to the function.
For example:
...
# issue the task
result = pool.apply_async(task)
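If you called the function because you needed to pass arguments, note that apply_async() takes arguments separately as a tuple via its 'args' parameter. A minimal sketch, using a hypothetical one-argument task() that doubles its input:

```python
from multiprocessing import Pool

# hypothetical task that takes one argument
def task(value):
    return value * 2

if __name__ == '__main__':
    with Pool() as pool:
        # pass the function name and its arguments as a tuple
        result = pool.apply_async(task, args=(21,))
        print(result.get())
```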
Error 3: Using a Function Call in map()
A common error is to call your function when using the map() function.
For example:
...
# issue all tasks
for result in pool.map(task(), range(5)):
    print(result)
A complete example with this error is listed below.
# SuperFastPython.com
# example of calling map with a function call
from time import sleep
from multiprocessing import Pool

# custom function executed in another process
def task(value):
    # block for a moment
    sleep(1)
    return 'all done'

# protect the entry point
if __name__ == '__main__':
    # start the process pool
    with Pool() as pool:
        # issue all tasks
        for result in pool.map(task(), range(5)):
            print(result)
Running the example results in a TypeError.
Traceback (most recent call last):
...
for result in pool.map(task(), range(5)):
TypeError: task() missing 1 required positional argument: 'value'
This error can be fixed by changing the call to map() to pass the name of the target task function instead of a call to the function.
...
# issue all tasks
for result in pool.map(task, range(5)):
    print(result)
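A related stumbling block: if you were calling the function in order to supply an extra argument, you can pre-bind arguments with functools.partial instead of calling the function. A sketch, using a hypothetical two-argument task() that multiplies its inputs:

```python
from functools import partial
from multiprocessing import Pool

# hypothetical task that takes two arguments
def task(multiplier, value):
    return multiplier * value

if __name__ == '__main__':
    with Pool() as pool:
        # bind the first argument without calling the function
        results = pool.map(partial(task, 10), range(5))
        print(results)
```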
Error 4: Incorrect Function Signature for map()
Another common error when using map() is to provide no second argument to the function, e.g. the iterable.
For example:
...
# issue all tasks
for result in pool.map(task):
    print(result)
A complete example with this error is listed below.
# SuperFastPython.com
# example of calling map without an iterable
from time import sleep
from multiprocessing import Pool

# custom function executed in another process
def task(value):
    # block for a moment
    sleep(1)
    return 'all done'

# protect the entry point
if __name__ == '__main__':
    # start the process pool
    with Pool() as pool:
        # issue all tasks
        for result in pool.map(task):
            print(result)
Running the example does not issue any tasks to the process pool, as there is no iterable for the map() function to iterate over. Instead, it fails with a TypeError.
Traceback (most recent call last):
...
TypeError: map() missing 1 required positional argument: 'iterable'
The fix involves providing an iterable in the call to map() along with your function name.
...
# issue all tasks
for result in pool.map(task, range(5)):
    print(result)
Error 5: Incorrect Function Signature for Callbacks
Another common error is to forget to include the result in the signature for the callback function when issuing tasks asynchronously.
For example:
# result callback function
def handler():
    print(f'Callback got: {result}', flush=True)
A complete example with this error is listed below.
# SuperFastPython.com
# example of a callback function for apply_async()
from time import sleep
from multiprocessing.pool import Pool

# result callback function
def handler():
    print(f'Callback got: {result}', flush=True)

# custom function executed in another process
def task():
    # block for a moment
    sleep(1)
    return 'all done'

# protect the entry point
if __name__ == '__main__':
    # create and configure the process pool
    with Pool() as pool:
        # issue tasks to the process pool
        result = pool.apply_async(task, callback=handler)
        # get the result
        value = result.get()
        print(value)
Running this example will result in an error when the callback is called by the process pool.
This will break the process pool and the program will have to be killed manually with a Control-C.
Exception in thread Thread-3:
Traceback (most recent call last):
...
TypeError: handler() takes 0 positional arguments but 1 was given
Fixing this error involves updating the signature of your callback function to include the result from the task.
# result callback function
def handler(result):
    print(f'Callback got: {result}', flush=True)
This error can also happen with the error callback, e.g. forgetting to take the error as an argument in the error callback function.
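A sketch of the corrected signatures for both callbacks: the result callback receives the task's return value, and the error callback receives the raised exception.

```python
from multiprocessing import Pool

# result callback function, takes the task's return value
def handler(result):
    print(f'Callback got: {result}', flush=True)

# error callback function, takes the raised exception
def error_handler(error):
    print(f'Error callback got: {error}', flush=True)

# custom function executed in another process
def task():
    return 'all done'

if __name__ == '__main__':
    with Pool() as pool:
        # register both callbacks when issuing the task
        result = pool.apply_async(task, callback=handler,
                                  error_callback=error_handler)
        print(result.get())
```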
Error 6: Arguments or Shared Data that Does Not Pickle
A common error is attempting to share data between processes that cannot be serialized.
Python has a built-in object serialization mechanism called pickle, in which objects are pickled (serialized) and unpickled (deserialized).
When sharing data between processes, the data will be pickled automatically.
This includes arguments passed to target task functions, data returned from target task functions, and data accessed directly, such as global variables.
If data that is shared between processes cannot be automatically pickled, a PicklingError will be raised.
Most normal Python objects can be pickled.
Examples of objects that cannot pickle are those that might have an open connection, such as to a file, database, server or similar.
We can demonstrate this with an example below that attempts to pass a file handle as an argument to a target task function.
# SuperFastPython.com
# example of an argument that does not pickle
from time import sleep
from multiprocessing import Pool

# custom function executed in another process
def task(file):
    # write to the file
    file.write('hi there')
    return 'all done'

# protect the entry point
if __name__ == '__main__':
    # open the file
    with open('tmp.txt', 'w') as file:
        # start the process pool
        with Pool() as pool:
            # issue the task
            result = pool.apply_async(task, (file,))
            # get the result
            value = result.get()
            print(value)
Running the example, we can see that it fails with an error indicating that the argument cannot be pickled for transmission to the worker process.
Traceback (most recent call last):
...
TypeError: cannot pickle '_io.TextIOWrapper' object
This is a contrived example, but it is indicative of cases where you cannot pass active objects, such as open files or connections, to child processes because they cannot be pickled.
In general, if you experience this error and you are attempting to pass around a connection or open file, perhaps try to open the connection within the task or use threads instead of processes.
If you experience this type of error with custom data types that are being passed around, you may need to implement code to manually serialize and deserialize your types. I recommend reading the documentation for the pickle module.
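As one sketch of the "open the connection within the task" fix, the example above could pass the file path (a plain string, which pickles fine) and open the file inside the task:

```python
from multiprocessing import Pool

# open the file inside the task; the path argument is a picklable str
def task(path):
    with open(path, 'w') as file:
        file.write('hi there')
    return 'all done'

if __name__ == '__main__':
    with Pool() as pool:
        # pass the path, not the open file handle
        result = pool.apply_async(task, ('tmp.txt',))
        print(result.get())
```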
Error 7: Not Flushing print() Statements
A common error is to not flush standard output (stdout) when calling the built-in print() function from target task functions.
By default, the built-in print() function in Python does not flush output.
The standard output stream (stdout) will flush automatically in the main process, typically when the internal buffer is full or a newline is detected. This means you see your print statements reported almost immediately after the print() function is called in code.
There is a problem when calling the print() function from spawned or forked processes because standard out will buffer output by default.
This means if you call print() from target task functions in the multiprocessing.Pool, you probably will not see the print statements on standard out until the worker processes are closed.
This will be confusing because it will look like your program is not working correctly, e.g. buggy.
The example below demonstrates this with a target task function that will call print() to report some status.
# SuperFastPython.com
# example of not flushing output when calling print() from tasks in new processes
from time import sleep
from random import random
from multiprocessing import Pool

# custom function executed in another process
def task(value):
    # block for a moment
    sleep(random())
    # report a message
    print(f'Done: {value}')

# protect the entry point
if __name__ == '__main__':
    # start the process pool
    with Pool() as pool:
        # submit all tasks
        pool.map(task, range(5))
    # report that all tasks are done
    print('All done!')
Running the example will wait until all tasks in the process pool have completed before printing all messages on standard out.
Done: 0
Done: 1
Done: 2
Done: 3
Done: 4
All done!
This can be fixed by updating all calls to print() in target task functions to flush output after each call.
This can be achieved by setting the "flush" argument to True, for example:
...
# report a message
print(f'Done: {value}', flush=True)
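Alternatively, you might flush the stream explicitly after printing by calling sys.stdout.flush(). A sketch, using a hypothetical report() helper:

```python
import sys

# report a message, then flush stdout explicitly
def report(value):
    print(f'Done: {value}')
    sys.stdout.flush()
```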
More Errors
There are other, less common errors that you may encounter; this section lists some of them.
Error Sharing Pool With Workers
You may get an error when you attempt to share a process pool with the child workers directly.
pool objects cannot be passed between processes or pickled
This can be fixed using a multiprocessing.Manager.
Error Where Tasks Fail Silently
When issuing tasks to the process pool, they may fail silently and not give any indication of what happened.
A quick fix is to add an error callback function to report any errors, in the case where you are issuing tasks asynchronously.
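A sketch of this quick fix, with a task that deliberately raises an exception; without the error_callback, the failure would be invisible unless get() were called on the result:

```python
from multiprocessing import Pool

# task that fails deliberately
def task():
    raise RuntimeError('something went wrong')

# error callback function, reports the raised exception
def on_error(error):
    print(f'Task failed: {error}', flush=True)

if __name__ == '__main__':
    with Pool() as pool:
        # issue the task asynchronously with an error callback
        pool.apply_async(task, error_callback=on_error)
        # close the pool and wait for the task to finish
        pool.close()
        pool.join()
```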
Error When Sharing Synchronization Primitives
You may get an error when attempting to share and use a synchronization primitive in the process pool.
This includes:
- Lock
- RLock
- Semaphore
- Barrier
- Condition
- Event
Typically a RuntimeError is raised with an error that may look like the following:
Condition objects should only be shared between processes through inheritance
Or:
Semaphore objects should only be shared between processes through inheritance
Or:
Lock objects should only be shared between processes through inheritance
This error occurs because these synchronization primitive objects cannot be pickled and shared with child worker processes directly.
Instead, you must use a multiprocessing.Manager to create a centralized version of the primitive and share the proxy object that is returned.
You can learn more about how to safely share synchronization primitive objects in the process pool in the tutorials:
- Use a Lock in the Multiprocessing Pool
- Use a Semaphore in the Multiprocessing Pool
- Use an Event in the Multiprocessing Pool
- Use a Condition Variable in the Multiprocessing Pool
- Use a Barrier in the Process Pool
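A sketch of the Manager-based fix for a lock: the manager process hosts the real Lock and hands out a picklable proxy object that can be passed to pool workers.

```python
from multiprocessing import Manager, Pool

# task that acquires the shared lock proxy
def task(lock, value):
    with lock:
        print(f'Task {value} has the lock', flush=True)

if __name__ == '__main__':
    # the manager hosts the lock and returns a picklable proxy
    with Manager() as manager:
        lock = manager.Lock()
        with Pool() as pool:
            # pass the lock proxy to each task
            pool.starmap(task, [(lock, i) for i in range(5)])
```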
Error When Joining the Process Pool
You may get an error when you attempt to join the process pool by calling join().
The error may look as follows:
ValueError: Pool is still running
This error occurs because you attempt to join the process pool while it is still running.
You can fix this error by first closing the pool by calling close() or terminate().
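A sketch of the correct close-then-join ordering:

```python
from multiprocessing import Pool

# custom task
def task(value):
    return value * 2

if __name__ == '__main__':
    # create the process pool
    pool = Pool()
    # issue tasks asynchronously
    result = pool.map_async(task, range(5))
    # close the pool so no further tasks can be issued
    pool.close()
    # wait for all issued tasks to complete
    pool.join()
    print(result.get())
```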
Error When Issuing Tasks
You may get an error when issuing tasks to the process pool.
The error may look as follows:
ValueError: Pool not running
This error occurs because you have closed the process pool and then attempted to issue tasks for it to execute.
The pool cannot execute tasks once it is no longer running.
You must issue all tasks before closing the pool, or else start a new pool.
Takeaways
You now know about the common errors when using the multiprocessing.Pool in Python.
If you enjoyed this tutorial, you will love my book: Python Multiprocessing Pool Jump-Start. It covers everything you need to master the topic with hands-on examples and clear explanations.