Last Updated on September 12, 2022
You can create a managed Python object and add another managed object to it, nesting one proxy object within another.
This allows hosted objects created via a multiprocessing Manager to be nested one within another and behave as expected when shared across processes.
In this tutorial you will discover how to nest proxy objects when sharing them between processes in Python.
Let’s get started.
What is a Multiprocessing Manager
A manager in the multiprocessing module provides a way to create Python objects that can be shared easily between processes.
Managers provide a way to create data which can be shared between different processes, including sharing over a network between processes running on different machines. A manager object controls a server process which manages shared objects. Other processes can access the shared objects by using proxies.
— multiprocessing — Process-based parallelism
A manager creates a server process that hosts a centralized version of objects that can be shared among multiple processes.
The objects are not shared directly. Instead, the manager creates a proxy object for each object that it manages and the proxy objects are shared among processes.
The proxy objects are used and operate just like the original objects, except that they serialize data, synchronize and coordinate with the centralized version of the object hosted in the manager server process.
A proxy is an object which refers to a shared object which lives (presumably) in a different process. […] A proxy object has methods which invoke corresponding methods of its referent (although not every method of the referent will necessarily be available through the proxy).
— multiprocessing — Process-based parallelism
This makes managers a process-safe and preferred way to share Python objects among processes.
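As a quick illustration (a minimal sketch, not part of the worked examples below), the object returned by a manager method such as manager.list() is a proxy, not a plain list, yet it can be used much like one:

```python
from multiprocessing import Manager

if __name__ == '__main__':
    # create and start the manager's server process
    with Manager() as manager:
        # create a hosted list; a proxy is returned, not a plain list
        proxy = manager.list()
        # the proxy forwards operations to the list in the server process
        proxy.append(10)
        print(type(proxy).__name__)  # ListProxy
        print(list(proxy))           # [10]
```

The proxy type name may vary slightly between Python versions, but the behavior of forwarding operations to the hosted object is the same.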
Next, let’s consider why nested proxy objects are needed.
Need for Manager Nested Proxy Objects
A problem when using hosted objects via their proxies is nested objects.
Recall, proxy objects are returned from the manager when we use the manager to create a hosted Python object.
We may use a manager to host a list and interact with the list from multiple processes via the proxies.
For example:
```python
...
# create a hosted list
proxy = manager.list()
```
We may then add objects to the list, such as dicts.
For example:
```python
...
# add a dict to the list
d = dict()
proxy.append(d)
```
This is a problem: the dict is not hosted by the manager and is only local to the process that created and added it to the list.
When another process retrieves and modifies the dict, it operates on a local copy, so its changes are not shared with the process that added it.
For example:
```python
...
# get the dict from the shared list
d = proxy[0]
```
Instead, we must use nested proxy objects.
Support for nested proxy objects was added in Python 3.6 and is required in situations like this.
Changed in version 3.6: Shared objects are capable of being nested. For example, a shared container object such as a shared list can contain other shared objects which will all be managed and synchronized by the SyncManager.
— multiprocessing — Process-based parallelism
How to Nest Manager Proxy Objects
Nesting proxy objects is straightforward.
Firstly, the container object must be created by the manager as per normal.
For example, we might create a list via the manager:
```python
...
# create a hosted list
proxy_list = manager.list()
```
We must then create the nested object via the manager and add it to the container object.
For example:
```python
...
# create a hosted dict
proxy_dict = manager.dict()
# add dict to list
proxy_list.append(proxy_dict)
```
And that’s all there is to it.
We may now share the managed list with processes and changes to both the list and to the nested dict will be centralized in the manager process and available to all processes.
Now that we know how to nest proxy objects, let’s look at some worked examples.
Example of Manager Without Nested Proxy Objects
Before we explore how to use nested proxy objects, let’s look at a case where we get unexpected behavior from not nesting proxy objects, that is, a failure mode.
In this example, we will create a manager and use the manager to create a hosted list object. We will then add a dict to the list. The list will then be shared with a child process, which will modify the content of the dict in the list. Finally, the main process will check the content of the dict in the shared list again.
The expectation is that the child process would modify the dict within the list and that the main process would see the change. Instead, we will get unexpected behavior because we will not use nested proxy objects.
Firstly, we can define a function to be executed by child processes.
The function will take a shared list as an argument. It will then retrieve a dict from the shared list, report its contents, change its contents, then report the contents again as confirmation that the changes were made, at least locally.
The task() function below implements this.
```python
# task executed in child process
def task(shared_list):
    # get the first item from the list
    item = shared_list[0]
    # report the list of dicts
    print(f'Task Before: {item}', flush=True)
    # update the dict in the shared list
    item['a'] = 1
    item['b'] = 2
    item['c'] = 3
    # report the list of dicts
    print(f'Task After: {item}', flush=True)
```
Next, in the main process we will create a manager via the context manager interface to ensure it is started and closed automatically for us.
```python
...
# create the manager
with Manager() as manager:
    # ...
```
Next, we will use the manager to create a shared list.
```python
...
# create the shared list
list_proxy = manager.list()
```
We will then create a dict inline with three items all mapped to the initial value zero.
```python
...
# create a dict
dict_item = {'a':0, 'b':0, 'c':0}
```
This dict is then added to the shared list hosted in the manager’s server process. The content of the dict is then reported as confirmation by the main process.
```python
...
# add the dict to the list
list_proxy.append(dict_item)
print(f'Main Before: {list_proxy[0]}', flush=True)
```
Next, a child process is created and configured to execute our custom task() function and is passed the shared list as an argument. The process is then started and the main process blocks until the child process terminates.
```python
...
# start a child process
process = Process(target=task, args=(list_proxy,))
process.start()
process.join()
```
Finally, the main process reports the content of the dict again, expecting to see the changes made by the child process.
```python
...
# report the list of dicts
print(f'Main After: {list_proxy[0]}', flush=True)
```
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# example of unexpected behavior from not using nested proxy objects
from multiprocessing import Process
from multiprocessing import Manager

# task executed in child process
def task(shared_list):
    # get the first item from the list
    item = shared_list[0]
    # report the list of dicts
    print(f'Task Before: {item}', flush=True)
    # update the dict in the shared list
    item['a'] = 1
    item['b'] = 2
    item['c'] = 3
    # report the list of dicts
    print(f'Task After: {item}', flush=True)

# protect the entry point
if __name__ == '__main__':
    # create the manager
    with Manager() as manager:
        # create the shared list
        list_proxy = manager.list()
        # create a dict
        dict_item = {'a':0, 'b':0, 'c':0}
        # add the dict to the list
        list_proxy.append(dict_item)
        print(f'Main Before: {list_proxy[0]}', flush=True)
        # start a child process
        process = Process(target=task, args=(list_proxy,))
        process.start()
        process.join()
        # report the list of dicts
        print(f'Main After: {list_proxy[0]}', flush=True)
```
Running the example first creates the manager which starts the manager’s server process.
Next, the shared list is created via the manager. The main process then defines a new dict with three initial values set to zero and adds the dict to the shared list. It reports the contents of the dict, matching what was initialized.
Next, the main process configures and starts a new child process and passes the shared list to it. It then blocks until the child process terminates.
The child process runs, first retrieving the dict from the shared list. It reports the content of the dict, which matches the content stored by the main process.
This highlights that a dict was added to the list and was stored in the manager’s process.
Next, the child process changes the content of the dict and reports the new values, confirming that the change was made, at least to the local copy of the dict.
The child process terminates and the main process continues on and reports the content of the dict within the shared list.
Here, we see an unexpected result. The changes made to the dict within the shared list by the child process are not reflected in the main process.
This suggests that the child process got a local copy of the list and the changes were not propagated to the manager and therefore remain unavailable to other processes.
```
Main Before: {'a': 0, 'b': 0, 'c': 0}
Task Before: {'a': 0, 'b': 0, 'c': 0}
Task After: {'a': 1, 'b': 2, 'c': 3}
Main After: {'a': 0, 'b': 0, 'c': 0}
```
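We can demonstrate the root cause directly, even without a child process: retrieving a plain (unmanaged) dict from a managed list returns a local copy. A minimal sketch, assuming only the standard multiprocessing module:

```python
from multiprocessing import Manager

if __name__ == '__main__':
    with Manager() as manager:
        # create a hosted list and add a plain, unmanaged dict
        proxy = manager.list()
        proxy.append({'a': 0})
        # retrieving the item returns a local copy, an ordinary dict
        item = proxy[0]
        print(type(item).__name__)  # dict
        # changes to the local copy do not propagate to the manager
        item['a'] = 1
        print(proxy[0]['a'])  # 0
```

The same copy semantics apply when a child process retrieves the dict, which is why the changes made in the task() function above are lost.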
Next, we will explore how we might fix this failure case using nested proxy objects.
Example of Manager With Nested Proxy Objects
We can explore how to nest proxy objects in order to overcome the failure case seen in the previous example where changes to a nested object are not propagated.
In this example, we will update the previous example and make one small change. The dict that is nested within the managed list will itself be a proxy object, created via the manager.
This will ensure that changes made to it in isolation will propagate, along with changes to the list in which it is nested.
This requires a single line change in the way that the dict is created.
For example:
```python
...
# create a dict
dict_item = manager.dict({'a':0, 'b':0, 'c':0})
```
And that’s it.
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# example of nested proxy objects where changes are propagated
from multiprocessing import Process
from multiprocessing import Manager

# task executed in child process
def task(shared_list):
    # get the first item from the list
    item = shared_list[0]
    # report the list of dicts
    print(f'Task Before: {item}', flush=True)
    # update the dict in the shared list
    item['a'] = 1
    item['b'] = 2
    item['c'] = 3
    # report the list of dicts
    print(f'Task After: {item}', flush=True)

# protect the entry point
if __name__ == '__main__':
    # create the manager
    with Manager() as manager:
        # create the shared list
        list_proxy = manager.list()
        # create a dict
        dict_item = manager.dict({'a':0, 'b':0, 'c':0})
        # add the dict to the list
        list_proxy.append(dict_item)
        print(f'Main Before: {list_proxy[0]}', flush=True)
        # start a child process
        process = Process(target=task, args=(list_proxy,))
        process.start()
        process.join()
        # report the list of dicts
        print(f'Main After: {list_proxy[0]}', flush=True)
```
Running the example first creates the manager which starts the manager’s server process.
Next, the shared list is created via the manager. The main process then creates a dict via the manager, initialized with three values. It then adds this managed dict to the shared list. It reports the contents of the dict, matching what was initialized.
Next, the main process configures and starts a new child process and passes the shared list to it. It then blocks until the child process terminates.
The child process runs, first retrieving the managed dict from the shared list. It reports the content of the dict, which matches the content stored by the main process.
Next, the child process changes the content of the dict and reports the new values, confirming that the change was made. Because the dict is also managed, the changes are made on the hosted version of the object.
The child process terminates and the main process continues on and reports the content of the dict within the shared list.
In this case, we see that the main process sees the changes made by the child process.
This highlights how we might nest proxy objects in order to achieve expected behavior when multiple processes change and use nested Python objects.
Although we demonstrated one level of nesting, we may in fact nest to any depth; as long as the objects are hosted in the manager’s server process, changes will be propagated correctly.
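For example, a minimal sketch of two-level nesting (a managed dict inside a managed dict inside a managed list), assuming Python 3.6 or later:

```python
from multiprocessing import Manager, Process

# task executed in a child process
def task(shared_list):
    # traverse two levels of proxies and update the innermost dict
    shared_list[0]['inner']['a'] = 1

# protect the entry point
if __name__ == '__main__':
    with Manager() as manager:
        # hosted dict nested within a hosted dict nested within a hosted list
        inner = manager.dict({'a': 0})
        outer = manager.dict({'inner': inner})
        shared_list = manager.list()
        shared_list.append(outer)
        # run the child process and wait for it to finish
        process = Process(target=task, args=(shared_list,))
        process.start()
        process.join()
        # the change propagates through both levels of nesting
        print(shared_list[0]['inner']['a'])  # 1
```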
```
Main Before: {'a': 0, 'b': 0, 'c': 0}
Task Before: {'a': 0, 'b': 0, 'c': 0}
Task After: {'a': 1, 'b': 2, 'c': 3}
Main After: {'a': 1, 'b': 2, 'c': 3}
```
Further Reading
This section provides additional resources that you may find helpful.
Python Multiprocessing Books
- Python Multiprocessing Jump-Start, Jason Brownlee (my book!)
- Multiprocessing API Interview Questions
- Multiprocessing API Cheat Sheet
I would also recommend specific chapters in the books:
- Effective Python, Brett Slatkin, 2019.
- See: Chapter 7: Concurrency and Parallelism
- High Performance Python, Ian Ozsvald and Micha Gorelick, 2020.
- See: Chapter 9: The multiprocessing Module
- Python in a Nutshell, Alex Martelli, et al., 2017.
- See: Chapter: 14: Threads and Processes
Guides
- Python Multiprocessing: The Complete Guide
- Python Multiprocessing Pool: The Complete Guide
- Python ProcessPoolExecutor: The Complete Guide
Takeaways
You now know how to nest proxy objects when sharing them between processes.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.
Photo by Ivan Tsaregorodtsev on Unsplash