Last Updated on September 12, 2022
You can use a manager to create a namespace that may be used to share primitive variables safely with processes.
In this tutorial you will discover how to use a namespace to share data among processes in Python.
Let’s get started.
What is a Multiprocessing Manager
A manager in the multiprocessing module provides a way to create Python objects that can be shared easily between processes.
Managers provide a way to create data which can be shared between different processes, including sharing over a network between processes running on different machines. A manager object controls a server process which manages shared objects. Other processes can access the shared objects by using proxies.
— multiprocessing — Process-based parallelism
A manager creates a server process that hosts a centralized version of objects that can be shared among multiple processes.
The objects are not shared directly. Instead, the manager creates a proxy object for each object that it manages and the proxy objects are shared among processes.
The proxy objects are used and operate just like the original objects, except that they serialize data, synchronize and coordinate with the centralized version of the object hosted in the manager server process.
A proxy is an object which refers to a shared object which lives (presumably) in a different process. […] A proxy object has methods which invoke corresponding methods of its referent (although not every method of the referent will necessarily be available through the proxy).
— multiprocessing — Process-based parallelism
This makes managers a process-safe and preferred way to share Python objects among processes.
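The relationship between a proxy and its hosted object can be sketched with a managed list. This is a minimal illustration rather than a definitive recipe; the function names append_item() and run_proxy_demo() are made up for this example.

```python
from multiprocessing import Manager, Process


def append_item(shared_list):
    # runs in a child process: the call is forwarded to the manager's server process
    shared_list.append('from child')


def run_proxy_demo():
    # hypothetical helper that returns the proxy type name and the hosted contents
    with Manager() as manager:
        # manager.list() returns a proxy object, not a plain list
        shared_list = manager.list(['from parent'])
        proxy_type = type(shared_list).__name__
        child = Process(target=append_item, args=(shared_list,))
        child.start()
        child.join()
        # copy the hosted contents into a regular list before the manager shuts down
        return proxy_type, list(shared_list)


# protect the entry point
if __name__ == '__main__':
    print(run_proxy_demo())
```

The change made by the child is visible to the parent because both proxies talk to the same list hosted in the manager's server process.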
Next, let’s look at namespaces.
What is a Multiprocessing Namespace
A namespace is a Python object used to share primitive variables among multiple processes.
A namespace object has no public methods, but does have writable attributes. Its representation shows the values of its attributes.
— multiprocessing — Process-based parallelism
It may be used to share primitive variables such as:
- Integer values
- Floating point values
- Strings
- Characters
It is not suited to sharing mutable Python data structures such as lists and dicts, because in-place changes to them are not propagated to other processes.
A namespace must be created by a manager. This means it is hosted in a manager’s server process and shared via proxy objects.
The proxy objects can be shared with child processes and automatically handle serialization (pickling) of data to and from the manager process and provide process-safety. This gives a namespace three useful properties:
- Centralized: A single copy of the namespace is hosted in the server process of a manager and interacted with via proxy objects, meaning all processes see the same data.
- Picklability: Namespace (proxy) objects can be pickled, allowing them to be put on queues or passed as arguments to the Pool class.
- Safety: Primitive variables on a namespace can be read and written concurrently in a process-safe manner.
This means that the proxy objects for the namespace can be shared via queues and as arguments to multiprocessing pools, where they must be pickled.
It also means that multiple child processes can read and write the variables on the namespace concurrently without corrupting them, although compound updates such as incrementing a value still require a lock.
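As a rough sketch of the picklability point, a namespace proxy can be handed to the workers of a multiprocessing.Pool. The helper names store_square() and run_pool_demo() and the attribute names sq0 to sq3 are invented for illustration.

```python
from multiprocessing import Manager, Pool


def store_square(args):
    # each task receives a pickled copy of the namespace proxy plus a number
    shared_ns, number = args
    setattr(shared_ns, f'sq{number}', number * number)


def run_pool_demo():
    # hypothetical helper: every pool task writes one attribute on the namespace
    with Manager() as manager:
        shared_ns = manager.Namespace()
        with Pool(processes=2) as pool:
            pool.map(store_square, [(shared_ns, n) for n in range(4)])
        # read the values back in the parent process
        return [getattr(shared_ns, f'sq{n}') for n in range(4)]


# protect the entry point
if __name__ == '__main__':
    print(run_pool_demo())
```

Each task writes a different attribute, so no lock is needed here; only single reads and writes are performed.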
Now that we know about a namespace, let’s look at how we might use it.
How to Create and Use a Namespace to Share Data
Using a namespace requires first creating a manager, then using the manager to create the namespace.
The namespace can be shared among processes and used to read and write primitive variables directly.
The first step is to create a manager.
This can be achieved by calling multiprocessing.Manager(), which creates the manager and starts its server process. There is no need to call start() afterwards; calling it on a manager that is already running raises an error. For example:
```python
...
# create the manager (its server process is started automatically)
manager = Manager()
```
multiprocessing.Manager() is in fact a function that returns a started instance of the multiprocessing.managers.SyncManager class, which provides the capability to create and host a Namespace instance.
Later we must shut down the manager to terminate the server process.
```python
...
# shut down the manager
manager.shutdown()
```
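The manual lifecycle might look like the sketch below, which assumes the shutdown() method documented for manager objects and wraps the work in try/finally so the server process is stopped even if an error occurs. The function name manual_lifecycle() is hypothetical.

```python
from multiprocessing import Manager
from multiprocessing.managers import SyncManager


def manual_lifecycle():
    # Manager() hands back a manager whose server process is already running
    manager = Manager()
    try:
        namespace = manager.Namespace()
        namespace.value = 42
        # report whether the manager is a SyncManager, plus the stored value
        return isinstance(manager, SyncManager), namespace.value
    finally:
        # always stop the server process, even if the body raises
        manager.shutdown()


# protect the entry point
if __name__ == '__main__':
    print(manual_lifecycle())
```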
The preferred approach is to use the context manager interface and limit usage of the manager to the context manager block.
This will ensure the manager is closed automatically regardless of what happens in our program.
For example:
```python
...
# create and start the manager
with Manager() as manager:
    # ...
```
Once created and started, we can use the manager to create an instance of the multiprocessing.managers.Namespace class.
For example:
```python
...
# create and start the manager
with Manager() as manager:
    namespace = manager.Namespace()
```
This will create a Namespace object in the Manager's server process and return a proxy object that can be used to interact with the centralized version of the object safely.
Primitive variables can be added to the namespace directly by defining them.
Any arbitrary variable names may be used, except those that start with an underscore as they will become members on the proxy object itself instead of the hosted namespace.
For example:
```python
...
# define primitive values on the namespace
namespace.value1 = 100
namespace.config = 33.3
namespace.name = 'Tester'
```
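The underscore rule can be checked with a small experiment. The sketch below assumes that arguments sent to a Pool worker are pickled, so the worker receives a fresh proxy for the same hosted namespace. The names inspect_namespace(), run_underscore_demo(), visible and _hidden are invented for this example.

```python
from multiprocessing import Manager, Pool


def inspect_namespace(shared_ns):
    # runs in a pool worker, which receives a pickled copy of the proxy
    return shared_ns.visible, hasattr(shared_ns, '_hidden')


def run_underscore_demo():
    with Manager() as manager:
        shared_ns = manager.Namespace()
        # stored on the hosted namespace in the server process
        shared_ns.visible = 1
        # stored on this process's proxy object only
        shared_ns._hidden = 2
        with Pool(processes=1) as pool:
            seen_by_worker = pool.apply(inspect_namespace, (shared_ns,))
        # the parent still sees _hidden on its own proxy; the worker does not
        return shared_ns._hidden, seen_by_worker


# protect the entry point
if __name__ == '__main__':
    print(run_underscore_demo())
```

The worker can read visible but finds no _hidden attribute, because that value never left the parent's proxy object.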
Primitive variables may be read by accessing them directly as attributes of the namespace.
For example:
```python
...
# read primitive values from the namespace
if namespace.value1 > 50:
    # ...
```
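Putting the write and read steps together, a minimal single-process sketch might look like this. It also deletes an attribute with del, which the namespace proxy forwards to the hosted object. The function name namespace_basics() and the attribute names are illustrative only.

```python
from multiprocessing import Manager


def namespace_basics():
    with Manager() as manager:
        namespace = manager.Namespace()
        # define two variables on the namespace
        namespace.count = 1
        namespace.label = 'demo'
        # str() of the proxy asks the server for the namespace's representation
        before = str(namespace)
        # overwrite one variable and delete the other
        namespace.count = 2
        del namespace.label
        after = str(namespace)
        return before, after


# protect the entry point
if __name__ == '__main__':
    print(namespace_basics())
```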
Although reading and writing values is process-safe, some operations are not.
For example, it is not process-safe to add to or subtract from primitives on the namespace. This is because modifying a value is not atomic. Instead, it is a three-step process of read, compute, and write, and another process may change the value between the read and the write.
As such, modification operations like adding, subtracting, and incrementing values are not process-safe and should be avoided unless protected by a lock.
For example:
```python
...
# this is not process-safe
namespace.value1 += 1
```
Instead, a mutex lock can be created on the manager and used to limit modification of namespace variables to critical sections.
For example:
```python
...
# create a managed lock
lock = manager.Lock()
# ...
# acquire the lock
with lock:
    # increment variable on a namespace (this is safe)
    namespace.value1 += 1
```
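A fuller sketch of the lock-protected increment is shown below, with several processes updating one counter. The names add_many() and run_locked_counter(), the attribute total, and the worker and iteration counts are assumptions chosen for illustration.

```python
from multiprocessing import Manager, Process


def add_many(shared_ns, lock, count):
    # each increment is a read-modify-write, so it is wrapped in the lock
    for _ in range(count):
        with lock:
            shared_ns.total += 1


def run_locked_counter(workers=4, count=25):
    with Manager() as manager:
        shared_ns = manager.Namespace()
        shared_ns.total = 0
        # the lock is hosted by the same manager and shared via a proxy
        lock = manager.Lock()
        processes = [Process(target=add_many, args=(shared_ns, lock, count))
                     for _ in range(workers)]
        for process in processes:
            process.start()
        for process in processes:
            process.join()
        return shared_ns.total


# protect the entry point
if __name__ == '__main__':
    print(run_locked_counter())
```

With the lock in place the final total equals workers * count. Without it, some increments could be lost when two processes read the same value before either writes back.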
Finally, the namespace does not properly support mutable data structures like list or dict.
Although they may be assigned readily enough, each read returns a copy of the stored object. In-place changes, such as appending to a list, are made to that local copy only, so they are not centralized or made available to other processes unless the modified object is assigned back to the namespace.
For example:
```python
...
# define a list (in-place changes to it will not be shared)
namespace.mylist = [1, 2, 3]
```
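A small experiment, assuming standard CPython behavior, shows what happens to a list stored on a namespace: reading the attribute returns a copy, so an in-place append is lost unless the modified list is assigned back. The function name list_on_namespace() is hypothetical.

```python
from multiprocessing import Manager


def list_on_namespace():
    with Manager() as manager:
        namespace = manager.Namespace()
        namespace.mylist = [1, 2, 3]
        # this appends to a temporary local copy, not to the hosted list
        namespace.mylist.append(4)
        after_append = namespace.mylist
        # read, modify, then assign back so the change reaches the server
        items = namespace.mylist
        items.append(4)
        namespace.mylist = items
        after_reassign = namespace.mylist
        return after_append, after_reassign


# protect the entry point
if __name__ == '__main__':
    print(list_on_namespace())
```

If a shared list or dict is needed, a managed list or dict created with manager.list() or manager.dict() is usually a better fit than a namespace attribute.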
Now that we know how to use a namespace, let’s look at a worked example.
Example of Using a Manager Namespace to Share Data
We can explore how to use a namespace to share primitive variable data among child processes.
In this example we will use a manager to create a centralized hosted namespace. The main process will then define a variable on the namespace. A child process will then define its own variable and access the variable written by the main process. A second child process will define its own variable on the namespace and access the variable defined by the first child. Finally, the main process will access all variables.
This example highlights a few things:
- How to read and write variables in a namespace.
- Multiple different processes can define variables in the same namespace.
- Variables on the namespace are available to all processes.
Firstly, we can define the function to execute for the first child process.
The task takes the shared namespace as an argument, generates a random number between 0 and 1, blocks for a fraction of a second to simulate computational effort, stores the generated number as “task1” on the namespace, then reports its generated number and a variable stored by the main process called “main”.
The task1() function below implements this.
```python
# task executed in a new process
def task1(shared_ns):
    # generate a number between 0 and 1
    value = random()
    # block for a fraction of a second
    sleep(value)
    # update the namespace variables
    shared_ns.task1 = value
    # report the data
    print(f'Task1 sees {shared_ns.main} got {value}', flush=True)
```
Next, we can define the custom function for the second task.
The function takes the shared namespace as an argument. Like the first task, it generates a random number between 0 and 1, then blocks to simulate computational effort. It then reports its generated value and the variable “task1” stored by the first task, and finally stores its own value as “task2” on the namespace.
The task2() function below implements this.
```python
# task executed in a new process
def task2(shared_ns):
    # generate a number between 0 and 1
    value = random()
    # block for a fraction of a second
    sleep(value)
    # report the data
    print(f'Task2 sees {shared_ns.task1} got {value}', flush=True)
    # update the namespace variables
    shared_ns.task2 = value
```
Next, in the main process, a manager is created and started using the context manager interface.
```python
...
# create the manager
with Manager() as manager:
    # ...
```
The manager is then used to create the hosted namespace.
```python
...
# create the shared namespace
namespace = manager.Namespace()
```
The main process then defines a variable on the namespace that the task1() function will access later.
```python
...
# initialize the attribute
namespace.main = 55
```
A new multiprocessing.Process instance is created and configured to execute the task1() function and pass the shared namespace as an argument. Once configured, the process is started and the main process blocks until the process has terminated.
```python
...
# start and run a process for task 1
process = Process(target=task1, args=(namespace,))
process.start()
process.join()
```
A second multiprocessing.Process instance is created and this time configured to execute our task2() function. It is started and the main process blocks until the new child process has terminated.
```python
...
# start and run a process for task 2
process = Process(target=task2, args=(namespace,))
process.start()
process.join()
```
Finally, the main process reports all variables defined on the shared namespace.
This is achieved by reporting the string representation of the namespace object directly.
```python
...
# report everything
print(f'Main sees: {namespace}')
```
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# example of sharing data with a manager namespace
from time import sleep
from random import random
from multiprocessing import Process
from multiprocessing import Manager

# task executed in a new process
def task1(shared_ns):
    # generate a number between 0 and 1
    value = random()
    # block for a fraction of a second
    sleep(value)
    # update the namespace variables
    shared_ns.task1 = value
    # report the data
    print(f'Task1 sees {shared_ns.main} got {value}', flush=True)

# task executed in a new process
def task2(shared_ns):
    # generate a number between 0 and 1
    value = random()
    # block for a fraction of a second
    sleep(value)
    # report the data
    print(f'Task2 sees {shared_ns.task1} got {value}', flush=True)
    # update the namespace variables
    shared_ns.task2 = value

# protect the entry point
if __name__ == '__main__':
    # create the manager
    with Manager() as manager:
        # create the shared namespace
        namespace = manager.Namespace()
        # initialize the attribute
        namespace.main = 55
        # start and run a process for task 1
        process = Process(target=task1, args=(namespace,))
        process.start()
        process.join()
        # start and run a process for task 2
        process = Process(target=task2, args=(namespace,))
        process.start()
        process.join()
        # report everything
        print(f'Main sees: {namespace}')
```
Running the example first creates the manager and starts the manager’s server process.
The manager is then used to create a namespace. The namespace is created in the Manager’s server process and a proxy object is returned that may be shared and used to access the namespace in a process-safe manner.
The main process then defines a variable on the namespace and assigns it an integer value.
A child process is configured and started to execute our custom task1() function. The main process blocks until the process terminates.
The first child process generates a random number, blocks, defines a new variable on the shared namespace, then reports the value defined by the main process as well as the random number that was generated.
The child process terminates and the main process continues on.
The main process configures and starts a second child process, this time to execute our task2() function. It blocks until the process terminates.
The second child process generates a random number, blocks for a fraction of a second then reports the value on the namespace defined by the first child process, as well as the random number that was generated. It then defines a new variable on the namespace with the generated floating point value.
The second process terminates and the main process continues on and reports the content of the shared namespace.
We can see the integer value defined by the main process, the floating point value defined by the first child process and the floating point value defined by the second child process.
This highlights that variables can be defined and accessed on the shared namespace arbitrarily by different processes.
Note: output will vary each time the program is run given the use of random numbers.
```
Task1 sees 55 got 0.19925402509706536
Task2 sees 0.19925402509706536 got 0.9487094130571998
Main sees: Namespace(main=55, task1=0.19925402509706536, task2=0.9487094130571998)
```
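As a variation on the worked example, the child processes could also run at the same time, provided each one writes to its own variable and does not depend on the others. The sketch below assumes that setup; the names record_result() and run_concurrent_writers() and the attributes worker0 to worker2 are invented for illustration.

```python
from multiprocessing import Manager, Process


def record_result(shared_ns, worker_id):
    # each worker writes to its own attribute, so no lock is needed
    setattr(shared_ns, f'worker{worker_id}', worker_id * 10)


def run_concurrent_writers(workers=3):
    with Manager() as manager:
        shared_ns = manager.Namespace()
        shared_ns.main = 55
        processes = [Process(target=record_result, args=(shared_ns, i))
                     for i in range(workers)]
        # start all workers before waiting, so they run concurrently
        for process in processes:
            process.start()
        for process in processes:
            process.join()
        # the representation lists the attributes in sorted order
        return str(shared_ns)


# protect the entry point
if __name__ == '__main__':
    print(f'Main sees: {run_concurrent_writers()}')
```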
Takeaways
You now know how to use a namespace to share data primitives with processes in Python.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.