How to Use a Manager Namespace to Share Data with Processes

Last Updated on September 12, 2022

You can use a manager to create a namespace that may be used to share primitive variables safely with processes.

In this tutorial you will discover how to use a namespace to share data among processes in Python.

Let’s get started.

What is a Multiprocessing Manager

A manager in the multiprocessing module provides a way to create Python objects that can be shared easily between processes.

Managers provide a way to create data which can be shared between different processes, including sharing over a network between processes running on different machines. A manager object controls a server process which manages shared objects. Other processes can access the shared objects by using proxies.
— multiprocessing — Process-based parallelism

A manager creates a server process that hosts a centralized version of objects that can be shared among multiple processes.

The objects are not shared directly. Instead, the manager creates a proxy object for each object that it manages and the proxy objects are shared among processes.

The proxy objects are used and operate just like the original objects, except that they serialize data, synchronize and coordinate with the centralized version of the object hosted in the manager server process.

A proxy is an object which refers to a shared object which lives (presumably) in a different process. […] A proxy object has methods which invoke corresponding methods of its referent (although not every method of the referent will necessarily be available through the proxy).
— multiprocessing — Process-based parallelism

This makes managers a process-safe and preferred way to share Python objects among processes.
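As a minimal sketch of the proxy idea described above, the snippet below creates a manager, asks it for a namespace, and inspects the object that comes back. The function name describe_proxy() is illustrative only and is not part of the multiprocessing API.

```python
from multiprocessing import Manager


def describe_proxy():
    """Return the type name of a managed namespace and a value read through it."""
    # Manager() returns a manager whose server process is already running
    with Manager() as manager:
        # the namespace itself lives in the server process; we receive a proxy
        ns = manager.Namespace()
        # the write is forwarded to the server process by the proxy
        ns.value = 42
        # the read is also forwarded and the result is sent back
        return type(ns).__name__, ns.value


if __name__ == '__main__':
    type_name, value = describe_proxy()
    print(f'type={type_name} value={value}')
```

The type name shows that the calling process holds a proxy rather than the namespace object itself.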

You can learn more about multiprocessing managers in the tutorial:

What is a Multiprocessing Manager

Next, let’s look at namespaces.

What is a Multiprocessing Namespace

A namespace is a Python object used to share primitive variables among multiple processes.

A namespace object has no public methods, but does have writable attributes. Its representation shows the values of its attributes.
— multiprocessing — Process-based parallelism

It may be used to share primitive variables such as:

Integer values
Floating point values
Strings
Characters (single-character strings)

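It is not suited to sharing mutable Python data structures such as lists and dicts: they can be assigned to a namespace, but changes made to them in place are not propagated to other processes.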

A namespace must be created by a manager. This means it is hosted in a manager’s server process and shared via proxy objects.

The proxy objects can be shared with child processes and automatically handle serialization (pickling) of data to and from the manager process and provide process-safety.

Centralized: A single copy of the namespace is hosted in the server process of a manager and interacted with via proxy objects, meaning all processes see the same data.
Picklability: Namespace (proxy) objects can be pickled, allowing them to be put on queues or passed as arguments to the Pool class.
Safety: Primitive variables on a namespace can be read and written concurrently in a process-safe manner.

This means that the proxy objects for the namespace can be shared via queues and as arguments to multiprocessing pools, where they must be pickled.

It also means that multiple child processes can read and write the variables on the namespace concurrently without fear of race conditions.
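The sketch below illustrates the point about pickling, assuming a namespace proxy is handed to a child process via a multiprocessing.Queue rather than as a direct argument. The names worker() and share_via_queue(), and the attributes greeting and reply, are hypothetical and chosen only for this example.

```python
from multiprocessing import Manager, Process, Queue


def worker(queue):
    # the proxy was pickled when put on the queue and is rebuilt here
    shared_ns = queue.get()
    # read a value from the hosted namespace and write a new one back
    shared_ns.reply = shared_ns.greeting.upper()


def share_via_queue():
    """Send a namespace proxy to a child over a queue and return what the child wrote."""
    with Manager() as manager:
        ns = manager.Namespace()
        ns.greeting = 'hello'
        queue = Queue()
        # putting the proxy on the queue pickles the proxy, not the namespace
        queue.put(ns)
        process = Process(target=worker, args=(queue,))
        process.start()
        process.join()
        # the value written by the child is visible in the parent
        return ns.reply


if __name__ == '__main__':
    print(share_via_queue())
```

Because both processes hold proxies to the same hosted namespace, the value written by the child can be read by the parent once the child has finished.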

Now that we know about a namespace, let’s look at how we might use it.

How to Create and Use a Namespace to Share Data

Using a namespace requires first creating a manager, then using the manager to create the namespace.

The namespace can be shared among processes and used to read and write primitive variables directly.

The first step is to create a manager and start it.

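This can be achieved manually by calling the multiprocessing.Manager() function, which creates a manager and starts its server process, for example:

...
# create and start the manager
manager = Manager()

The multiprocessing.Manager() function is not a class; it returns a started instance of the multiprocessing.managers.SyncManager class, which provides the capability to create and host a Namespace instance. Because the returned manager is already started, calling start() on it again raises an error. An explicit call to start() is only needed when the SyncManager class is instantiated directly.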

Later we must shut down the manager to terminate the server process.

...
# shut down the manager
manager.shutdown()
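As a hedged sketch of the manual lifecycle, the snippet below instantiates multiprocessing.managers.SyncManager directly, starts it with start(), and releases it with shutdown() in a finally block so the server process is terminated even if an error occurs. The function name manual_lifecycle() is illustrative only.

```python
from multiprocessing.managers import SyncManager


def manual_lifecycle():
    """Start a manager by hand, use a namespace, then shut the manager down."""
    # a directly instantiated SyncManager is not yet started
    manager = SyncManager()
    # start the server process
    manager.start()
    try:
        ns = manager.Namespace()
        ns.value1 = 100
        return ns.value1
    finally:
        # terminate the server process regardless of what happened above
        manager.shutdown()


if __name__ == '__main__':
    print(manual_lifecycle())
```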

The preferred approach is to use the context manager interface and limit usage of the manager to the context manager block.

This will ensure the manager is closed automatically regardless of what happens in our program.

For example:

...
# create and start the manager
with Manager() as manager:
    # ...

Once created and started, we can use the manager to create an instance of the multiprocessing.managers.Namespace class.

For example:

...
# create and start the manager
with Manager() as manager:
    namespace = manager.Namespace()

This will create a Namespace object in the Manager's server process and return a proxy object that can be used to interact with the centralized version of the object safely.

Primitive variables can be added to the namespace directly by defining them.

Any arbitrary variable names may be used, except those that start with an underscore as they will become members on the proxy object itself instead of the hosted namespace.

For example:

...
# define primitive values on the namespace
namespace.value1 = 100
namespace.config = 33.3
namespace.name = 'Tester'
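The rule about names that start with an underscore can be checked with a short sketch. The attribute names visible and _hidden, and the function name underscore_demo(), are hypothetical and used only for illustration.

```python
from multiprocessing import Manager


def underscore_demo():
    """Show that underscore-prefixed names stay on the local proxy object."""
    with Manager() as manager:
        ns = manager.Namespace()
        # stored on the namespace hosted in the manager's server process
        ns.visible = 2
        # stored on the local proxy object only, not on the hosted namespace
        ns._hidden = 1
        # the string form is produced from the hosted namespace
        return str(ns), ns._hidden


if __name__ == '__main__':
    text, hidden = underscore_demo()
    print(text)
    print(hidden)
```

The underscore-prefixed name can still be read locally, but it does not appear in the representation of the hosted namespace.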

Primitive variables may be read by accessing them as attributes of the namespace.

For example:

...
# read primitive values from the namespace
if namespace.value1 > 50:
    # ...

Although reading and writing values is process-safe, some operations are not.

For example, it is not process-safe to add or subtract values from primitives on the namespace. This is because modifying a value is not atomic. Instead, it is a three-step process of read, compute, and write, and another process may write to the variable between the read and the write.

As such, modification operations like adding, subtracting, and incrementing values are not process-safe and should not be performed without additional protection.

For example:

...
# this is not process-safe
namespace.value1 += 1

You can learn more about process-safety in the tutorial:

Process-Safe in Python

Instead, a mutex lock can be created on the manager and used to limit modification of namespace variables to critical sections.

For example:

...
# create a managed lock
lock = manager.Lock()
# ...
# acquire the lock
with lock:
    # increment variable on a namespace (this is safe)
    namespace.value1 += 1
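The following is a minimal sketch of the locking pattern with several child processes incrementing one shared counter. The names add_many(), locked_increment(), and the attribute counter are assumptions made for this example. Without the lock, some increments could be lost.

```python
from multiprocessing import Manager, Process


def add_many(shared_ns, lock, count):
    # increment the shared counter, one protected step at a time
    for _ in range(count):
        # the lock makes the read, compute, and write steps a critical section
        with lock:
            shared_ns.counter += 1


def locked_increment(workers=4, count=50):
    """Increment one namespace variable from several processes under a managed lock."""
    with Manager() as manager:
        ns = manager.Namespace()
        ns.counter = 0
        # a managed lock is a proxy too, so it can be passed to child processes
        lock = manager.Lock()
        processes = [Process(target=add_many, args=(ns, lock, count))
                     for _ in range(workers)]
        for process in processes:
            process.start()
        for process in processes:
            process.join()
        # every increment is accounted for
        return ns.counter


if __name__ == '__main__':
    print(locked_increment())
```

Each increment costs several round trips to the manager's server process, so this approach suits occasional updates better than tight loops.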

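Finally, the namespace does not track changes made inside mutable data structures like list or dict.

Such objects can be assigned to a namespace attribute readily enough, and the assigned value is sent to the manager. However, each read returns a copy, so changes made in place (such as appending to a list) affect only the local copy in the current process and are not made available to other processes. To update a stored data structure, the modified object must be assigned back to the namespace attribute.

For example:

...
# define a list (later in-place changes are local only)
namespace.mylist = [1, 2, 3]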
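A small sketch of what to expect with a list stored on a namespace: reading the attribute returns a copy, so an in-place append is not seen by the manager, whereas assigning a modified list back to the attribute is. The function name list_on_namespace() and the attribute numbers are hypothetical.

```python
from multiprocessing import Manager


def list_on_namespace():
    """Compare an in-place change with a reassignment of a list on a namespace."""
    with Manager() as manager:
        ns = manager.Namespace()
        ns.numbers = [1, 2, 3]
        # this appends to a temporary local copy, which is then discarded
        ns.numbers.append(4)
        after_append = ns.numbers
        # read a copy, change it, then assign it back to the namespace
        numbers = ns.numbers
        numbers.append(4)
        ns.numbers = numbers
        after_reassign = ns.numbers
        return after_append, after_reassign


if __name__ == '__main__':
    after_append, after_reassign = list_on_namespace()
    print(after_append)
    print(after_reassign)
```

If a data structure must be changed in place by several processes, a hosted list or dict created with manager.list() or manager.dict() may be a better fit than a namespace attribute.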

Now that we know how to use a namespace, let’s look at a worked example.

Example of Using a Manager Namespace to Share Data

We can explore how to use a namespace to share primitive variable data among child processes.

In this example we will use a manager to create a centralized hosted namespace. The main process will then define a variable on the namespace. A child process will then define its own variable and access the variable written by the main process. A second child process will define its own variable on the namespace and access the variable defined by the first child. Finally, the main process will access all variables.

This example highlights a few things:

How to read and write variables in a namespace.
Multiple different processes can define variables in the same namespace.
Variables on the namespace are available to all processes.

Firstly, we can define the function to execute for the first child process.

The task takes the shared namespace as an argument, generates a random number between 0 and 1, blocks for a fraction of a second to simulate computational effort, stores the generated number as “task1” on the namespace, then reports its generated number and a variable stored by the main process called “main”.

The task1() function below implements this.

# task executed in a new process
def task1(shared_ns):
    # generate a number between 0 and 1
    value = random()
    # block for a fraction of a second
    sleep(value)
    # update the namespace variables
    shared_ns.task1 = value
    # report the data
    print(f'Task1 sees {shared_ns.main} got {value}', flush=True)

Next, we can define the custom function for the second task.

The function takes the shared namespace as an argument. Like the first task, it generates a random number between 0 and 1, then blocks to simulate computational effort. It then reports its generated value and the variable “task1” stored by the first task.

The task2() function below implements this.

# task executed in a new process
def task2(shared_ns):
    # generate a number between 0 and 1
    value = random()
    # block for a fraction of a second
    sleep(value)
    # report the data
    print(f'Task2 sees {shared_ns.task1} got {value}', flush=True)
    # update the namespace variables
    shared_ns.task2 = value

Next, in the main process, a manager is created and started using the context manager interface.

...
# create the manager
with Manager() as manager:
    # ...

The manager is then used to create the hosted namespace.

...
# create the shared namespace
namespace = manager.Namespace()

The main process then defines a variable on the namespace that the task1() function will access later.

...
# initialize the attribute
namespace.main = 55

A new multiprocessing.Process instance is created and configured to execute the task1() function and pass the shared namespace as an argument. Once configured, the process is started and the main process blocks until the process has terminated.

...
# start and run a process for task 1
process = Process(target=task1, args=(namespace,))
process.start()
process.join()

A second multiprocessing.Process instance is created and this time configured to execute our task2() function. It is started and the main process blocks until the new child process has terminated.

...
# start and run a process for task 2
process = Process(target=task2, args=(namespace,))
process.start()
process.join()

Finally, the main process reports all variables defined on the shared namespace.

This is achieved by reporting the string representation of the namespace object directly.

...
# report everything
print(f'Main sees: {namespace}')

Tying this together, the complete example is listed below.

# SuperFastPython.com
# example of sharing data with a manager namespace
from time import sleep
from random import random
from multiprocessing import Process
from multiprocessing import Manager

# task executed in a new process
def task1(shared_ns):
    # generate a number between 0 and 1
    value = random()
    # block for a fraction of a second
    sleep(value)
    # update the namespace variables
    shared_ns.task1 = value
    # report the data
    print(f'Task1 sees {shared_ns.main} got {value}', flush=True)

# task executed in a new process
def task2(shared_ns):
    # generate a number between 0 and 1
    value = random()
    # block for a fraction of a second
    sleep(value)
    # report the data
    print(f'Task2 sees {shared_ns.task1} got {value}', flush=True)
    # update the namespace variables
    shared_ns.task2 = value

# protect the entry point
if __name__ == '__main__':
    # create the manager
    with Manager() as manager:
        # create the shared namespace
        namespace = manager.Namespace()
        # initialize the attribute
        namespace.main = 55
        # start and run a process for task 1
        process = Process(target=task1, args=(namespace,))
        process.start()
        process.join()
        # start and run a process for task 2
        process = Process(target=task2, args=(namespace,))
        process.start()
        process.join()
        # report everything
        print(f'Main sees: {namespace}')

Running the example first creates the manager and starts the manager’s server process.

The manager is then used to create a namespace. The namespace is created in the Manager’s server process and a proxy object is returned that may be shared and used to access the namespace in a process-safe manner.

The main process then defines a variable on the namespace and assigns it an integer value.

A child process is configured and started to execute our custom task1() function. The main process blocks until the process terminates.

The first child process generates a random number, blocks, defines a new variable on the shared namespace, then reports the value defined by the main process as well as the random number that was generated.

The child process terminates and the main process continues on.

The main process configures and starts a second child process, this time to execute our task2() function. It blocks until the process terminates.

The second child process generates a random number, blocks for a fraction of a second, then reports the value on the namespace defined by the first child process, as well as the random number that was generated. It then defines a new variable on the namespace with the generated floating point value.

The second process terminates and the main process continues on and reports the content of the shared namespace.

We can see the integer value defined by the main process, the floating point value defined by the first child process and the floating point value defined by the second child process.

This highlights that variables can be defined and accessed on the shared namespace arbitrarily by different processes.

Note: output will vary each time the program is run given the use of random numbers.

Task1 sees 55 got 0.19925402509706536
Task2 sees 0.19925402509706536 got 0.9487094130571998
Main sees: Namespace(main=55, task1=0.19925402509706536, task2=0.9487094130571998)
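As a variation on the worked example, the sketch below has several child processes running at the same time, each defining its own variable on the shared namespace. Because each process writes a different variable with a single assignment, no lock is used here. The names record_square(), collect_squares(), and the square0, square1, ... attributes are hypothetical.

```python
from multiprocessing import Manager, Process


def record_square(shared_ns, index):
    # each process defines its own variable on the shared namespace
    setattr(shared_ns, f'square{index}', index * index)


def collect_squares(count=3):
    """Run several writers concurrently and read back every variable they defined."""
    with Manager() as manager:
        ns = manager.Namespace()
        processes = [Process(target=record_square, args=(ns, i))
                     for i in range(count)]
        for process in processes:
            process.start()
        for process in processes:
            process.join()
        # the main process sees the variables defined by all children
        return [getattr(ns, f'square{i}') for i in range(count)]


if __name__ == '__main__':
    print(collect_squares())
```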

Takeaways

You now know how to use a namespace to share data primitives with processes in Python.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

Photo by Alex Shutin on Unsplash

About Jason Brownlee