Last Updated on September 12, 2022
You can share objects among processes using a manager.
In this tutorial you will discover how to use managers to share access to centralized Python objects.
Let’s get started.
What Is a Manager
A manager in the multiprocessing module provides a way to create Python objects that can be shared easily between processes.
Managers provide a way to create data which can be shared between different processes, including sharing over a network between processes running on different machines. A manager object controls a server process which manages shared objects. Other processes can access the shared objects by using proxies.
— multiprocessing — Process-based parallelism
A manager creates a server process that hosts a centralized version of objects that can be shared among multiple processes.
The objects are not shared directly. Instead, the manager creates a proxy object for each object that it manages and the proxy objects are shared among processes.
The proxy objects can be used just like the original objects, except that each operation serializes data and is synchronized and coordinated with the centralized version of the object hosted in the manager's server process.
A proxy is an object which refers to a shared object which lives (presumably) in a different process. […] A proxy object has methods which invoke corresponding methods of its referent (although not every method of the referent will necessarily be available through the proxy).
— multiprocessing — Process-based parallelism
This makes managers a process-safe and preferred way to share simple data structures like lists and dicts among processes.
They are also a preferred way to share concurrency primitives among processes, specifically among workers in a multiprocessing pool.
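For example, the object returned by a manager is not the hosted object itself, but a proxy that forwards operations to it. A minimal sketch (the exact proxy class name printed may vary between Python versions):

from multiprocessing import Manager

if __name__ == '__main__':
    with Manager() as manager:
        # the manager hosts the list; we receive a proxy to it
        hosted_list = manager.list()
        print(type(hosted_list))  # e.g. <class 'multiprocessing.managers.ListProxy'>
        # operations on the proxy are forwarded to the hosted list
        hosted_list.append(1)
        print(list(hosted_list))  # [1]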
Next, let’s look at how we might create and use a manager.
How to Create and Use a Manager
A multiprocessing.Manager provides a way to create a centralized version of a Python object hosted on a server process.
Once created, the manager returns proxy objects that allow other processes to interact with the centralized objects; the serialization and coordination happen automatically behind the scenes.
The multiprocessing.Manager provides the full multiprocessing API, allowing Python objects and concurrency primitives to be shared among processes (a short sketch follows the lists below).
This includes Python objects we may want to share, such as:
- dict
- list
It includes shared ctypes for primitive values, such as:
- Value
- Array
It includes concurrency primitives for synchronizing and coordinating processes, such as:
- Lock
- Event
- Condition
- Semaphore
- Barrier
It even includes patterns and process-safe data structures, such as:
- Queue
- Pool
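As a quick illustration, a single manager can hand out proxies for several of these types at once. A minimal sketch, assuming default settings:

from multiprocessing import Manager

if __name__ == '__main__':
    with Manager() as manager:
        # centralized data structures
        shared_dict = manager.dict()
        shared_list = manager.list()
        # centralized primitive value (typecode 'i' for a signed int)
        shared_value = manager.Value('i', 0)
        # centralized concurrency primitives
        lock = manager.Lock()
        event = manager.Event()
        # centralized queue
        queue = manager.Queue()
        # use them like the originals via the proxies
        shared_dict['key'] = 123
        shared_list.append(456)
        shared_value.value = 789
        print(shared_dict['key'], shared_list[0], shared_value.value)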
First, we must create a manager instance.
For example:
...
# create a manager
manager = Manager()
The manager can then be used to create a centralized Python object and return a proxy object.
For example:
...
# create a centralized mutex lock
lock = manager.Lock()
Once we are finished with the manager, it should be shut down in order to release all resources, including the server process.
For example:
...
# shut down the manager
manager.shutdown()
The manager supports the context manager interface. This allows all centralized objects to be created and used within the context manager block. Once the block is exited, the manager is shut down automatically.
For example:
...
# create the manager
with Manager() as manager:
    # create a centralized mutex lock
    lock = manager.Lock()
    # ...
This is the preferred usage of the manager.
Now that we are familiar with how to use the manager, let’s look at a worked example.
Example of Using a Manager for Sharing a Data Structure
We can use a manager to share access to a Python data structure like a list or dict among multiple processes.
The shared data structure can be read and modified from multiple processes in a safe manner, meaning that it will not become corrupt, inconsistent or suffer data loss by concurrent modifications.
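For instance, a dict hosted on a manager can be updated concurrently from several processes via its proxy. A minimal sketch, where count_task() is a hypothetical helper for this illustration:

from multiprocessing import Manager, Process

# hypothetical task: record a result under this worker's key
def count_task(key, shared_dict):
    shared_dict[key] = key * 10

if __name__ == '__main__':
    with Manager() as manager:
        # create the shared dict
        shared_dict = manager.dict()
        # update the dict from several child processes
        workers = [Process(target=count_task, args=(i, shared_dict)) for i in range(4)]
        for worker in workers:
            worker.start()
        for worker in workers:
            worker.join()
        print(dict(shared_dict))  # e.g. {0: 0, 1: 10, 2: 20, 3: 30}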
In this example we will create a list on the manager and share it among a number of processes that will concurrently add objects.
Firstly, we can define a function to execute in child processes.
The function will take an integer number argument and the shared list. It will then generate a random number between 0 and 1, block for a fraction of a second, then append the number argument and the generated random number to the shared list as a tuple.
The task() function below implements this.
# task executed in a new process
def task(number, shared_list):
    # generate a number between 0 and 1
    value = random()
    # block for a fraction of a second
    sleep(value)
    # store the value in the shared list
    shared_list.append((number, value))
Next, in the main process, we will create and start a Manager instance using the context manager interface.
...
# create the manager
with Manager() as manager:
    # ...
We can then use the manager to create a shared list.
The list will be hosted in the manager's server process. The manager returns a proxy object that can be shared among child processes, pickled if needed, and used to interact with the list in a process-safe manner.
...
# create the shared list
shared_list = manager.list()
Next, we can create many multiprocessing.Process instances configured to run our task() function, passing an integer and the shared list as arguments.
This can be achieved in a list comprehension that will create a list of configured Process instances.
...
# create many child processes
processes = [Process(target=task, args=(i, shared_list)) for i in range(50)]
The main process can then iterate through the processes and start each in turn.
...
# start all processes
for process in processes:
    process.start()
The main process can then wait until all the processes finish.
This can be achieved by traversing the list of running Process instances and calling the join() function on each. This will block until the Process instance finishes, and once all processes have been joined we know that they have all terminated.
...
# wait for all processes to complete
for process in processes:
    process.join()
Finally, we can report the size of the shared list.
We expect the list to contain 50 items, one item added by each child process.
...
# report the number of items stored
print(f'List: {len(shared_list)}')
The complete example is listed below.
# SuperFastPython.com
# example of shared list among processes using a manager
from time import sleep
from random import random
from multiprocessing import Process
from multiprocessing import Manager

# task executed in a new process
def task(number, shared_list):
    # generate a number between 0 and 1
    value = random()
    # block for a fraction of a second
    sleep(value)
    # store the value in the shared list
    shared_list.append((number, value))

# protect the entry point
if __name__ == '__main__':
    # create the manager
    with Manager() as manager:
        # create the shared list
        shared_list = manager.list()
        # create many child processes
        processes = [Process(target=task, args=(i, shared_list)) for i in range(50)]
        # start all processes
        for process in processes:
            process.start()
        # wait for all processes to complete
        for process in processes:
            process.join()
        # report the number of items stored
        print(f'List: {len(shared_list)}')
Running the example first creates the manager using the context manager interface.
Next, the manager is used to create the shared list.
This creates a centralized version of the list on the manager’s server process and returns proxy objects for interacting with the list.
Next, 50 child processes are created and configured to call our custom task() function, then the processes are started. The main process then blocks until all child processes complete.
Each process generates a random number, blocks for a fraction of a second then appends a tuple with the task number and the generated value to the shared list.
Once all tasks complete, the main process unblocks and reports the total number of items in the shared list, which matches the number of processes that were started.
If the list were not created using a manager, changes to the list made in each child process would not be shared; each child would modify its own copy of the list.
List: 50
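To see why the manager matters here, consider a variation (a minimal sketch) where the managed list is replaced with a plain Python list. Each child process appends to its own copy, so the list in the main process stays empty:

from multiprocessing import Process

# task executed in a new process
def task(number, plain_list):
    # this modifies the child's copy of the list only
    plain_list.append(number)

if __name__ == '__main__':
    # a plain list, not hosted by a manager
    plain_list = []
    processes = [Process(target=task, args=(i, plain_list)) for i in range(50)]
    for process in processes:
        process.start()
    for process in processes:
        process.join()
    print(f'List: {len(plain_list)}')  # List: 0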
Further Reading
This section provides additional resources that you may find helpful.
Python Multiprocessing Books
- Python Multiprocessing Jump-Start, Jason Brownlee (my book!)
- Multiprocessing API Interview Questions
- Multiprocessing API Cheat Sheet
I would also recommend specific chapters in the books:
- Effective Python, Brett Slatkin, 2019.
- See: Chapter 7: Concurrency and Parallelism
- High Performance Python, Ian Ozsvald and Micha Gorelick, 2020.
- See: Chapter 9: The multiprocessing Module
- Python in a Nutshell, Alex Martelli, et al., 2017.
- See: Chapter 14: Threads and Processes
Guides
- Python Multiprocessing: The Complete Guide
- Python Multiprocessing Pool: The Complete Guide
- Python ProcessPoolExecutor: The Complete Guide
Takeaways
You now know how to use managers to share access to centralized Python objects.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.
Photo by Johnny Such on Unsplash
Alain C says
How to use a Manager or SyncManager to share objects between processes started (manually or otherwise) on separate machines, e.g. two distinct desktops?
Jason Brownlee says
Good question.
I don’t have a tutorial on this exact topic, but the API docs have an example right here, under “Using a remote manager”:
https://docs.python.org/3/library/multiprocessing.html#manager
David L says
Hi Jason, I like your explanations! Here is a question: I am trying to set up a bunch of parallel processes and attach a queue to each one. I cribbed one of your examples and modified it to create an array of queues:
def task(number, shared_queues):
    print("here i am in ", number)
    # generate a number between 0 and 1
    value = random()
    # block for a fraction of a second
    sleep(value)
    # store the value in the shared queue
    if (number > 0) and (number < 49):
        shared_queues[number].put((number, value))
        print(number, " put ", value)

# protect the entry point
if __name__ == '__main__':
    total_task_num = 44
    # create the manager
    with Manager() as manager:
        # create the array of shared queues
        shared_queues = [manager.Queue() for i in range(total_task_num)]
        # create many child processes
        processes = [Process(target=task, args=(i, shared_queues)) for i in range(total_task_num)]
        # start all processes
        for process in processes:
            process.start()
        # wait for all processes to complete
        for process in processes:
            process.join()
When I run this with total_task_num <= 44, no problem. When I go above that number, I get runtime errors from Python:
AttributeError: 'ForkAwareLocal' object has no attribute 'connection'
I suspect it is some kind of race condition or memory failure. Any thoughts about a better way to manage queues for sharing data among processes? thanks!
Jason Brownlee says
Thank you David!
It's really hard to read your code, David. Perhaps you can try to paste it again with formatting and use pre tags?
I note that each multiprocessing queue uses a lot more resources than a simple threading queue. You may not want to have many open within a program, e.g. using one queue with wrapped messages and many producers/consumers might be effective.
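For example, a minimal sketch of that single-queue pattern, where each message is wrapped with the id of the producer that sent it (the message format is just an assumption for illustration):

from random import random
from multiprocessing import Manager, Process

# producer: wrap each message with this producer's id
def producer(ident, shared_queue):
    shared_queue.put((ident, random()))

if __name__ == '__main__':
    with Manager() as manager:
        # one shared queue for all producers
        shared_queue = manager.Queue()
        producers = [Process(target=producer, args=(i, shared_queue)) for i in range(50)]
        for process in producers:
            process.start()
        for process in producers:
            process.join()
        # consume all messages, dispatching on the producer id
        while not shared_queue.empty():
            ident, value = shared_queue.get()
            print(ident, value)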