Last Updated on September 12, 2022
You can use a Manager to host a centralized Python object that can be shared with multiple processes. Access to the object is process-safe, and changes made to the object are propagated and made available to all processes seamlessly.
In this tutorial you will discover how to use a Manager to share an ad hoc Python object with multiple processes.
Let’s get started.
Need a Manager to Share an Object
A manager in the multiprocessing module provides a way to create Python objects that can be shared easily between processes.
Managers provide a way to create data which can be shared between different processes, including sharing over a network between processes running on different machines. A manager object controls a server process which manages shared objects. Other processes can access the shared objects by using proxies.
— multiprocessing — Process-based parallelism
A manager creates a server process that hosts a centralized version of objects that can be shared among multiple processes.
The objects are not shared directly. Instead, the manager creates a proxy object for each object that it manages and the proxy objects are shared among processes.
The proxy objects can be used just like the original objects, except that they serialize data and synchronize and coordinate with the centralized version of the object hosted in the manager server process.
A proxy is an object which refers to a shared object which lives (presumably) in a different process. […] A proxy object has methods which invoke corresponding methods of its referent (although not every method of the referent will necessarily be available through the proxy).
— multiprocessing — Process-based parallelism
This makes managers a process-safe and preferred way to share Python objects among processes.
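For common container types, the multiprocessing module already provides a ready-made manager. As a quick sketch of the proxy idea (using the built-in Manager and a shared list, before we move on to ad hoc objects):

```python
# demonstrate a manager-hosted list shared with a child process
from multiprocessing import Process, Manager

# task that updates the shared list via its proxy
def add_message(shared_list):
    # the call is forwarded to the list hosted in the manager process
    shared_list.append('hello from child')

# protect the entry point
if __name__ == '__main__':
    # create and start a built-in manager
    with Manager() as manager:
        # create a hosted list and get a proxy for it
        shared_list = manager.list()
        # run a child process that modifies the hosted list
        process = Process(target=add_message, args=(shared_list,))
        process.start()
        process.join()
        # the change made by the child is visible to the parent
        print(shared_list)  # ['hello from child']
```

The child appends via the proxy, so the update lands in the manager's server process and the parent sees it after join().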
You can learn more about multiprocessing managers in the guides listed in the Further Reading section below.
When using multiprocessing, we may need to share an arbitrary Python object with child processes.
It may be a data structure or similar object whose internal state may need to be updated from multiple child processes concurrently, in a process-safe manner, in order to avoid race conditions.
How can we use a multiprocessing Manager to share a Python object with child processes?
How to Use a Manager to Share an Object with Processes
We can use a multiprocessing manager to share an ad hoc Python object with multiple processes.
This requires a few steps:
- Defining a custom Manager class that extends BaseManager.
- Registering the Python class with the custom manager.
- Creating an instance of the custom manager.
- Creating an instance of the Python object from the manager instance.
Let’s take a closer look at each step in turn.
Step 1. Define a Custom Manager
We must define a custom manager in order to use the manager to create and manage ad hoc Python objects.
This involves defining a new class that extends the multiprocessing.managers.BaseManager class.
The custom class does not need to override the constructor or any methods, nor define any functionality.
For example:
```python
# custom manager to support custom classes
class CustomManager(BaseManager):
    # nothing
    pass
```
Step 2. Register the Python Class
The next step involves registering the Python class with the custom manager.
This is so when the program requests a new instance of the class from the manager, it knows how to create an instance of that class.
This involves calling a class method on the custom manager class called register().
The register() function takes two arguments: the string name used by the program when requesting an instance, and the Python class to create.
The two names can be the same, but do not have to be.
For example:
```python
...
# register a python class with the custom manager
CustomManager.register('set', set)
```
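Since the registered name is arbitrary, we could just as well register the set class under a different name. A minimal sketch (the name 'shared_set' here is my own choice, not anything required by the API):

```python
# register and create a set under a name that differs from the class name
from multiprocessing.managers import BaseManager

# custom manager to support custom classes
class CustomManager(BaseManager):
    # nothing
    pass

# the first argument names the factory on the manager, the second is the class
CustomManager.register('shared_set', set)

# protect the entry point
if __name__ == '__main__':
    # create and start the custom manager
    with CustomManager() as manager:
        # request an instance using the registered name, not the class name
        proxy = manager.shared_set()
        proxy.add('hello')
        print(len(proxy._getvalue()))  # 1
```

The program requests instances via the registered name, so manager.shared_set() is what creates the hosted set here.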
Step 3. Create the Manager
The next step is to create an instance of the custom manager, and start it.
This is so that the program can request an instance of the custom class.
This can be achieved manually by instantiating the class and calling start, for example:
```python
...
# create the custom manager
manager = CustomManager()
# start the custom manager
manager.start()
```
Later, we must shut down the manager to terminate the server process.
```python
...
# shut down the manager
manager.shutdown()
```
The preferred approach is to use the context manager interface and limit usage of the manager to the context manager block.
This will ensure the manager is closed automatically regardless of what happens in our program.
For example:
```python
...
# create and start the custom manager
with CustomManager() as manager:
    # ...
```
Step 4. Create a Custom Class Instance
Finally, we can create an instance of our ad hoc Python class using the manager.
This will create the object in the Manager’s server process and return a proxy object.
The proxy object can then be shared among processes and used to interact with the centralized version of the custom class instance, with all data serialization and process-safety handled automatically under the covers.
For example:
```python
...
# create a shared python object
shared_object = manager.set()
```
Now that we know how to share an ad hoc class with processes using a manager, let’s look at a worked example.
Example of Using a Manager to Share an Object with Processes
We can explore how to use a Manager to share an ad hoc Python object.
In this example, we will use a Manager to host a Python set and share the set among multiple processes.
Recall, we can create a set via the set() built-in function and add items to the set by calling the add() method.
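As a quick refresher on plain (non-shared) sets: duplicates are discarded, and hashable tuples can be stored as items, which is what each task will add later.

```python
# quick refresher on the built-in set type
numbers = set()
# add some (identifier, value) tuples
numbers.add((0, 0.5))
numbers.add((1, 0.25))
# adding a duplicate has no effect
numbers.add((0, 0.5))
print(len(numbers))  # 2
```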
Importantly, if we just created a set and shared it with multiple processes to modify concurrently, it would not have the desired effect. The changes would be made locally within each process and not shared among all processes.
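We can sketch this pitfall to make it concrete: below, a plain set is passed to a child process, the child modifies its own copy, and the parent's set is unchanged (a deliberately broken example, not something to copy):

```python
# demonstrate that a plain (unmanaged) set is not shared between processes
from multiprocessing import Process

# task that adds an item to the set it is given
def add_item(data):
    # this modifies the child process's local copy only
    data.add('from child')

# protect the entry point
if __name__ == '__main__':
    # create a regular set in the parent process
    data = set()
    # run a child process that attempts to update the set
    process = Process(target=add_item, args=(data,))
    process.start()
    process.join()
    # the parent's set was never updated
    print(data)  # set()
```

Hosting the set on a manager and passing the proxy instead, as in the rest of this example, is what makes the update visible across processes.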
Firstly, we can define our custom Manager instance that we will use to register and host our set instance.
The CustomManager class is defined below.
```python
# custom manager to support custom classes
class CustomManager(BaseManager):
    # nothing
    pass
```
Next, we can define a function to execute in each child process.
The function will take a unique integer identifier and the shared set as arguments. It will generate a random number between 0 and 1, block for a fraction of a second to simulate computational effort, then add a tuple of its integer argument and generated number to the shared set.
The task() function below implements this.
```python
# custom function to be executed in a child process
def task(number, shared_set):
    # generate a number
    value = random()
    # block for a moment
    sleep(value)
    # store the result
    shared_set.add((number, value))
```
Finally, in the main process, we will first register our set class with our custom Manager class.
This can be achieved by calling the register() class method on our CustomManager class and specifying the name of the class we are registering (which is arbitrary, but we will match the actual class name) and the actual Python class it maps to.
```python
...
# register the set with the custom manager
CustomManager.register('set', set)
```
Next, we can create and start an instance of our custom manager.
In this case, we will use the context manager interface.
```python
...
# create a new manager instance
with CustomManager() as manager:
    # ...
```
We will then use the manager instance to create a new instance of our Python set class.
```python
...
# create a shared set instance
shared_set = manager.set()
```
We will share this set object among many child processes.
Next, we will create and configure 50 child processes to execute our custom task() function with an integer and the shared set as arguments.
This can be achieved in a list comprehension which will create a list of configured Process instances.
```python
...
# create some child processes
processes = [Process(target=task, args=(i, shared_set)) for i in range(50)]
```
The main process can then traverse the list of Process instances and start each, then iterate the same list again and join each process in turn, blocking until all tasks have completed.
```python
...
# start processes
for process in processes:
    process.start()
# wait for processes to finish
for process in processes:
    process.join()
```
If you are new to joining processes and waiting for them to terminate, see the guides listed in the Further Reading section below.
Once all tasks are completed, the main process will report the total number of items in the shared set.
If the set was not hosted on the Manager, it would be empty at this point, as each process would have added an item to its own local copy of the set object.
We cannot check the size of the shared set object directly, because we only have access to a proxy object. We can get access to the underlying set object via the _getvalue() method, then take the length of the returned value.
We can also report the contents of the set.
```python
...
# all done
print('Done')
# report the results
print(len(shared_set._getvalue()))
print(shared_set)
```
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# example of sharing a python object using a manager
from time import sleep
from random import random
from multiprocessing import Process
from multiprocessing.managers import BaseManager

# custom manager to support custom classes
class CustomManager(BaseManager):
    # nothing
    pass

# custom function to be executed in a child process
def task(number, shared_set):
    # generate a number
    value = random()
    # block for a moment
    sleep(value)
    # store the result
    shared_set.add((number, value))

# protect the entry point
if __name__ == '__main__':
    # register the set with the custom manager
    CustomManager.register('set', set)
    # create a new manager instance
    with CustomManager() as manager:
        # create a shared set instance
        shared_set = manager.set()
        # create some child processes
        processes = [Process(target=task, args=(i, shared_set)) for i in range(50)]
        # start processes
        for process in processes:
            process.start()
        # wait for processes to finish
        for process in processes:
            process.join()
        # all done
        print('Done')
        # report the results
        print(len(shared_set._getvalue()))
        print(shared_set)
```
Running the example first registers the set class with the custom manager class via the register() class method.
Next, an instance of the custom manager class is created and started, starting a new server process for the manager.
A single centralized instance of a Python set object is then created using the manager and a proxy object for the instance is then returned.
A total of 50 child processes are configured to execute our custom task() function, passing an integer argument and the proxy object for the shared set.
The child processes are then started and the main process blocks, waiting for all child processes to terminate.
Each child process generates a random number, blocks for a fraction of a second, then adds a tuple to the shared set containing its integer argument and the generated value.
The proxy objects ensure that all method calls on the centralized set object are process-safe. This ensures two things:
- No race condition is possible by two processes modifying the set object at the same time.
- Changes made to the hosted and centralized object are available to all other processes.
All tasks complete and the main process continues on.
It first reports the size of the shared set, which is 50, one item for each of the 50 tasks that were started.
Next, the content of the set is reported, showing one tuple for each of the 50 tasks, with integer identifiers from 0 to 49.
```
Done
50
{(26, 0.9683583780033496), (22, 0.13686892488319335), (39, 0.1414689707523733), (13, 0.5951166827165203), (17, 0.8494358251755124), (35, 0.40237396081453247), (4, 0.2522874971557658), (31, 0.8332097787030488), (23, 0.18722977072712244), (5, 0.33779828541166634), (28, 0.7439113416210278), (1, 0.17322276676116777), (24, 0.727053996708004), (47, 0.2533012542203056), (25, 0.8342307237961513), (29, 0.6187359680376543), (42, 0.2968688956217582), (12, 0.5676851016265028), (37, 0.14546902635045955), (20, 0.3868711282904791), (9, 0.3327623955067941), (2, 0.41510857673265966), (10, 0.6344327379460593), (8, 0.9047725964902997), (6, 0.8622076866372874), (48, 0.3895996433092157), (43, 0.6288467330940722), (14, 0.6041826196535989), (19, 0.1610412684825725), (18, 0.6897069725680317), (36, 0.028888642904620676), (21, 0.4839101428301853), (49, 0.8903931689087147), (7, 0.6652282602882754), (46, 0.8637619025440303), (33, 0.3104412938620833), (15, 0.4429830747758847), (38, 0.9206606902877741), (32, 0.8816867571018366), (11, 0.8945218098684898), (44, 0.05677471724926908), (27, 0.19586927234967577), (41, 0.7706467554224696), (45, 0.6759663716448222), (0, 0.963282709368371), (40, 0.7823591820906549), (16, 0.10105616728272415), (34, 0.8210198749842155), (30, 0.9136410079319093), (3, 0.12156928070685225)}
```
Further Reading
This section provides additional resources that you may find helpful.
Python Multiprocessing Books
- Python Multiprocessing Jump-Start, Jason Brownlee (my book!)
- Multiprocessing API Interview Questions
- Multiprocessing API Cheat Sheet
I would also recommend specific chapters in the books:
- Effective Python, Brett Slatkin, 2019.
- See: Chapter 7: Concurrency and Parallelism
- High Performance Python, Ian Ozsvald and Micha Gorelick, 2020.
- See: Chapter 9: The multiprocessing Module
- Python in a Nutshell, Alex Martelli, et al., 2017.
- See: Chapter 14: Threads and Processes
Guides
- Python Multiprocessing: The Complete Guide
- Python Multiprocessing Pool: The Complete Guide
- Python ProcessPoolExecutor: The Complete Guide
Takeaways
You now know how to use a Manager to share an ad hoc Python object with multiple processes.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.
Photo by Arnaud Mesureur on Unsplash