Last Updated on September 12, 2022
It is important to follow best practices when using the multiprocessing.Process class in Python.
Best practices allow you to side-step the most common errors and bugs when using processes for concurrent tasks in your programs.
In this tutorial you will discover best practices for using Python processes.
Let’s get started.
Multiprocessing Best Practices
The multiprocessing.Process class is a flexible and powerful tool for executing tasks concurrently in child processes.
Once you know how the multiprocessing.Process class works, it is important to review some best practices to consider when bringing child processes into your Python programs.
To keep things simple, there are five best practices when creating new child processes. They are:
- Use Context Managers
- Use Timeouts When Waiting
- Use Main Module Idiom
- Use Shared ctypes
- Use Pipes and Queues
Let’s get started with the first practice, which is to use the context manager.
Tip 1: Use Context Managers
Acquire and release locks using a context manager, wherever possible.
Locks can be acquired manually via a call to acquire() at the beginning of the critical section followed by a call to release() at the end of the critical section.
For example:
...
# acquire the lock manually
lock.acquire()
# critical section
...
# release the lock
lock.release()
This approach should be avoided wherever possible.
Traditionally, it was recommended to always acquire and release a lock in a try-finally structure.
The lock is acquired, the critical section is executed in the try block, and the lock is always released in the finally block.
For example:
...
# acquire the lock
lock.acquire()
try:
    # critical section
    ...
finally:
    # always release the lock
    lock.release()
This has since been superseded by the context manager interface, which achieves the same thing with less code.
For example:
...
# acquire the lock
with lock:
    # critical section
    ...
The benefit of the context manager is that the lock is always released as soon as the block is exited, regardless of how it is exited, e.g. normally, via a return statement, or via a raised exception.
This applies to a number of synchronization primitives, such as:
- Acquiring a mutex lock via the multiprocessing.Lock class.
- Acquiring a reentrant mutex lock via the multiprocessing.RLock class.
- Acquiring a semaphore via the multiprocessing.Semaphore class.
- Acquiring a condition via the multiprocessing.Condition class.
The context manager interface is also supported on other multiprocessing utilities, such as:
- Opening a connection via the multiprocessing.connection.Connection class.
- Creating a manager via the multiprocessing.Manager class.
- Creating a process pool via the multiprocessing.pool.Pool class.
- Creating a listener via the multiprocessing.connection.Listener class.
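Tying this together, below is a minimal sketch (the task() function and the number of child processes are illustrative only) in which child processes acquire a shared lock via the context manager:

from multiprocessing import Process, Lock

# hypothetical task that protects a critical section with a lock
def task(lock, number):
    # acquire the lock via the context manager
    with lock:
        # critical section, e.g. report a message
        print(f'Process {number} has the lock')
    # the lock is released automatically here, even on error

# protect the entry point
if __name__ == '__main__':
    # create a shared lock
    lock = Lock()
    # create and configure five child processes
    processes = [Process(target=task, args=(lock, i)) for i in range(5)]
    # start the child processes
    for process in processes:
        process.start()
    # wait for the child processes to finish
    for process in processes:
        process.join()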
Tip 2: Use Timeouts When Waiting
Always use a timeout when waiting on a blocking call.
Many calls made on synchronization primitives may block.
For example:
- Waiting to acquire a multiprocessing.Lock via acquire().
- Waiting for a process to terminate via join().
- Waiting to be notified on a multiprocessing.Condition via wait().
And more.
Most blocking calls on concurrency primitives take a “timeout” argument. Calls such as acquire() and wait() return True if the call completed successfully and False if the timeout elapsed, whereas join() always returns None, so you must check is_alive() afterwards to determine the outcome.
Avoid making a blocking call without a timeout, wherever possible.
For example:
...
# acquire the lock
if not lock.acquire(timeout=2*60):
    # handle failure case
    ...
This allows the waiting process to give up after a fixed time limit and then attempt to rectify the situation, e.g. report an error, force termination, and so on.
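For example, below is a minimal sketch (the task() function and the timeout values are illustrative only) that uses timeouts both when acquiring a lock and when joining a child process:

import time
from multiprocessing import Process, Lock

# hypothetical task that holds the lock briefly
def task(lock):
    with lock:
        # simulate work while holding the lock
        time.sleep(1)

if __name__ == '__main__':
    lock = Lock()
    process = Process(target=task, args=(lock,))
    process.start()
    # wait to acquire the lock, but give up after 2 minutes
    if not lock.acquire(timeout=2*60):
        print('Unable to acquire the lock, taking corrective action...')
    else:
        # critical section
        lock.release()
    # wait for the process to terminate, but give up after 10 seconds
    process.join(timeout=10)
    if process.is_alive():
        # handle the failure case, e.g. force termination
        process.terminate()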
Tip 3: Use Main Module Idiom
A Python program that uses multiprocessing should protect the entry point of the program.
This can be achieved by using an if-statement to check that the entry point is the top-level environment.
For example:
...
# check for top-level environment
if __name__ == '__main__':
    # application entry point
    ...
This will help to avoid a RuntimeError when creating child processes using the ‘spawn’ start method, which is the default on Windows and macOS.
You can learn more about protecting the entry point when using multiprocessing in the tutorial:
Additionally, it is a good practice to add freeze support as the first line of a Python program that uses multiprocessing.
Freezing a Python program bundles the code, together with a Python interpreter, into a standalone executable for packaging and distribution.
When a program is frozen in order to be distributed, some features of Python are not included or disabled by default.
This is for performance and/or security reasons.
One feature that does not work by default in a frozen Python program is multiprocessing.
That is, we cannot create new Python processes via multiprocessing.Process instances in a program that has been frozen for distribution.
Creating a process in a frozen application results in a RuntimeError.
We can add support for multiprocessing in our program when freezing code via the multiprocessing.freeze_support() function.
For example:
...
# enable support for multiprocessing
multiprocessing.freeze_support()
This will have no effect on programs that are not frozen.
You can learn more about adding freeze support in the tutorial:
Protecting the entry point and adding freeze support together are referred to as the “main module” idiom when using multiprocessing.
Using this idiom is a best practice when using multiprocessing.
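Putting both parts together, below is a minimal sketch of the main module idiom (the task() function is illustrative only):

from multiprocessing import Process, freeze_support

# hypothetical task executed in a child process
def task():
    print('Hello from a child process')

# protect the entry point
if __name__ == '__main__':
    # enable support for frozen executables
    freeze_support()
    # create, start, and wait on a child process
    process = Process(target=task)
    process.start()
    process.join()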
If the entry point is not protected, creating a child process may fail with an error like the following:

An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.

This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:

    if __name__ == '__main__':
        freeze_support()
        ...

The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
Tip 4: Use Shared ctypes
Processes do not share memory by default.
Instead, data must be exchanged between processes, e.g. via sockets or files, or memory must be explicitly shared.
If you need to share simple data variables or arrays of variables between processes, this can be achieved using shared ctypes.
Shared ctypes provide a mechanism to share data between processes in a process-safe manner.
You can share ctypes among processes using the multiprocessing.Value and multiprocessing.Array classes.
- The multiprocessing.Value class is used to share a ctype of a given type among multiple processes.
- The multiprocessing.Array class is used to share an array of ctypes of a given type among multiple processes.
Shared ctypes provide a simple and easy-to-use way of sharing data between processes.
For example, a shared ctype value can be defined in a parent process, then shared with multiple child processes. All child processes and the parent process can then safely read and modify the data within the shared value.
This can be useful in a number of use cases, such as:
- A counter shared among multiple processes.
- Returning data from a child process to a parent process.
- Sharing results of computation among processes.
Shared ctypes can only be shared among processes on a single system. Sharing data between processes running on different machines requires a different mechanism, such as sockets.
The multiprocessing.Value class will create a shared ctype with a specified data type and initial value.
For example:
...
# create a value
value = multiprocessing.Value(...)
The first argument defines the data type for the value; it may be a string type code or a Python ctypes class. The second argument is an optional initial value.
For example, we can define a signed integer type with the ‘i’ type code and an initial value of zero as follows:
...
# create an integer value
variable = multiprocessing.Value('i', 0)
Once defined, the value can then be shared and used within multiple processes, such as between a parent and a child process.
Internally, the multiprocessing.Value makes use of a multiprocessing.RLock that ensures access to and modification of the data inside the class is mutually exclusive, i.e. process-safe.
This means that only one process at a time can access or change the data within the multiprocessing.Value object.
The data within the multiprocessing.Value object can be accessed via the “value” attribute.
For example:
...
# get the data
data = variable.value
The data within the multiprocessing.Value can be changed via the same “value” attribute.
For example:
...
# change the data
variable.value = 100
You can learn more about using shared ctypes in the tutorial:
Tip 5: Use Pipes and Queues
Processes can share messages with each other directly using pipes or queues.
These are process-safe data structures that allow processes to send or receive pickleable Python objects.
In multiprocessing, a pipe is a connection between two processes in Python.
Python provides a simple pipe via the multiprocessing.Pipe() function.
A pipe can be created by calling multiprocessing.Pipe(), which returns a pair of multiprocessing.connection.Connection objects.
For example:
...
# create a pipe
conn1, conn2 = multiprocessing.Pipe()
Objects can be shared between processes using the Pipe.
The Connection.send() function can be used to send objects from one process to another.
The objects sent must be pickleable.
For example:
...
# send an object
conn2.send('Hello world')
The Connection.recv() function can be used to receive objects in one process sent by another.
The objects received will be automatically un-pickled.
For example:
...
# receive an object
data = conn1.recv()
You can learn more about multiprocessing pipes in the tutorial:
Python provides a process-safe queue in the multiprocessing.Queue class.
A queue is a data structure on which items can be added by a call to put() and from which items can be retrieved by a call to get().
The multiprocessing.Queue provides a first-in, first-out (FIFO) queue, which means that items are retrieved in the order they were added: the first items added to the queue will be the first items retrieved. This is opposed to other queue types, such as last-in, first-out (LIFO) and priority queues.
The multiprocessing.Queue can be used by first creating an instance of the class. This will create an unbounded queue by default, that is, a queue with no size limit.
For example:
...
# create an unbounded queue
queue = multiprocessing.Queue()
Items can be added to the queue via a call to put(), for example:
...
# add an item to the queue
queue.put(item)
Items can be retrieved from the queue by calls to get().
For example:
...
# get an item from the queue
item = queue.get()
You can learn more about multiprocessing queues in the tutorial:
Further Reading
This section provides additional resources that you may find helpful.
Python Multiprocessing Books
- Python Multiprocessing Jump-Start, Jason Brownlee (my book!)
- Multiprocessing API Interview Questions
- Multiprocessing API Cheat Sheet
I would also recommend specific chapters in the books:
- Effective Python, Brett Slatkin, 2019.
- See: Chapter 7: Concurrency and Parallelism
- High Performance Python, Ian Ozsvald and Micha Gorelick, 2020.
- See: Chapter 9: The multiprocessing Module
- Python in a Nutshell, Alex Martelli, et al., 2017.
- See: Chapter 14: Threads and Processes
Guides
- Python Multiprocessing: The Complete Guide
- Python Multiprocessing Pool: The Complete Guide
- Python ProcessPoolExecutor: The Complete Guide
Takeaways
You now know some best practices when creating new child processes in Python.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.
Photo by Ilse Orsel on Unsplash