Last Updated on September 29, 2023
You can share a numpy array between processes by using multiprocessing SharedMemory.
In this tutorial, you will discover how to share a numpy array between processes using multiprocessing SharedMemory.
Let’s get started.
Need to Share Numpy Array Between Processes
Python offers process-based concurrency via the multiprocessing module.
Process-based concurrency is appropriate for CPU-bound tasks, whereas thread-based concurrency in Python is generally suited to IO-bound tasks, given the presence of the Global Interpreter Lock (GIL).
You can learn more about process-based concurrency and the multiprocessing module in the tutorial:
Consider the situation where we need to share numpy arrays between processes.
This may be for many reasons, such as:
- Data is loaded as an array in one process and analyzed differently in different subprocesses.
- Many child processes load small amounts of data as arrays that are sent to a parent process for handling.
- Data arrays are loaded in the parent process and processed in a suite of child processes.
Sharing Python objects and data between processes is slow.
This is because any data shared between processes, like numpy arrays, must be transmitted using inter-process communication (IPC), requiring that the data first be pickled by the sender and then unpickled by the receiver.
You can learn more about this in the tutorial:
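As a quick illustration of the copying involved, a numpy array passed to a child process as an argument is transmitted as a copy, so changes made by the child are not visible in the parent. Below is a minimal sketch of this behavior (the names used are illustrative, not part of the examples that follow):
# demonstrate that an array passed to a child process is copied, not shared
from multiprocessing import Process
from numpy import ones

# task that increments its own private copy of the array
def increment(data):
    data[:] += 1

# protect the entry point
if __name__ == '__main__':
    # create an array of ones in the parent process
    data = ones(5)
    # pass the array to a child process as an argument
    child = Process(target=increment, args=(data,))
    child.start()
    child.join()
    # prints all ones: the child incremented a copy, not the parent's array
    print(data)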
This means that sharing numpy arrays between processes only makes sense if we receive some benefit, such as a speedup, that overcomes the slow speed of data transmission.
For example, it may be the case that the arrays are relatively small and fast to transmit, whereas the computation performed on each array is slow and can benefit from being performed in separate processes.
Alternatively, preparing the array may be computationally expensive and benefit from being performed in a separate process, and once prepared, the arrays are small and fast to transmit to another process that requires them.
Given these situations, how can we share data between processes in Python?
How to Share Numpy Array Between Processes Using SharedMemory
One approach to sharing numpy arrays between processes is to use shared memory.
Python processes do not share memory by default; each process has its own private address space. Instead, memory must be explicitly shared between processes.
Since Python 3.8, a new approach to shared memory has been available via the multiprocessing.shared_memory module.
The benefit of shared memory is speed and efficiency.
No pickling and transmission of data is required. Instead, processes are able to read and write the same shared memory block directly, albeit within some constraints.
Python provides three classes to support shared memory:
- multiprocessing.shared_memory.SharedMemory
- multiprocessing.shared_memory.ShareableList
- multiprocessing.managers.SharedMemoryManager
The SharedMemory class is the core class for providing shared memory.
It allows a shared memory block to be created, named, and attached to. It provides a “buf” attribute to read and write data in an array-like structure and can be closed and destroyed.
You can learn more about shared memory in the tutorial:
A SharedMemory can be created in a process by calling the constructor, specifying a “size” in bytes and setting the “create” argument to True.
For example:
...
# create a shared memory
shared_mem = SharedMemory(size=1024, create=True)
A shared memory object can be assigned a meaningful name via the “name” argument to the constructor.
For example:
...
# create a shared memory with a name
shared_mem = SharedMemory(name='MyMemory', size=1024, create=True)
Another process can access a shared memory via its name. This is called attaching to a shared memory.
This can be achieved by specifying the name of the shared memory that has already been created and setting the “create” argument to False (the default).
For example:
...
# attach to a shared memory
shared_mem = SharedMemory(name='MyMemory', create=False)
Once created, data can be stored in the shared memory via the “buf” attribute, which acts like an array of bytes.
For example:
...
# write data to shared memory
shared_mem.buf[0] = 1
You can learn more about creating and using the SharedMemory class in the tutorial:
We can create a shared memory and use it as the basis for a numpy array.
This means that multiple processes can interact with the same numpy array directly via the shared memory, rather than passing copies of the array around. This is significantly faster and mimics shared memory available between threads.
A new numpy.ndarray instance can be created and we can configure it to use the shared memory as the buffer via the “buffer” argument.
For example:
...
# create a new numpy array that uses the shared memory
data = ndarray((n,), dtype=numpy.double, buffer=shared_mem.buf)
We must ensure that the SharedMemory has a “size” that is large enough for our numpy array.
For example, if we create a numpy array to hold double values, each double is 8 bytes. Therefore, we can set the size of the SharedMemory to be 8 multiplied by the size of the array we need.
For example:
...
# define the size of the numpy array
n = 10000
# bytes required for the array (8 bytes per value for doubles)
n_bytes = n * 8
# create the shared memory
shared_mem = SharedMemory(name='MyMemory', create=True, size=n_bytes)
Now that we know how to share a numpy array using multiprocessing SharedMemory, let’s look at a worked example.
Example of Sharing a Numpy Array Using SharedMemory
We can explore an example of sharing a numpy array between processes using shared memory.
In this example, we will create a shared memory large enough to hold our array, then create an array backed by the shared memory. We will then start a child process and have it connect to the same shared memory and create an array backed by the shared memory and access the same data.
This example will highlight how multiple processes can interact with the same numpy array in memory directly.
Firstly, we will define a task to execute in the child process.
The task will not take any arguments. It will first connect to the shared memory, then create a new numpy array of double values backed by the shared memory.
...
# define the size of the numpy array
n = 10000
# attach another shared memory block
sm = SharedMemory('MyMemory')
# create a new numpy array that uses the shared memory
data = ndarray((n,), dtype=numpy.double, buffer=sm.buf)
Next, the task will report the first 10 values of the array, increment all data in the array and then confirm that the data in the array was changed.
...
# check the contents
print(f'Child {data[:10]}')
# increment the data
data[:] += 1
# confirm change
print(f'Child {data[:10]}')
Finally, the task will close its access to the shared memory.
...
# close the shared memory
sm.close()
The task() function below implements this.
# task executed in a child process
def task():
    # define the size of the numpy array
    n = 10000
    # attach another shared memory block
    sm = SharedMemory('MyMemory')
    # create a new numpy array that uses the shared memory
    data = ndarray((n,), dtype=numpy.double, buffer=sm.buf)
    # check the contents
    print(f'Child {data[:10]}')
    # increment the data
    data[:] += 1
    # confirm change
    print(f'Child {data[:10]}')
    # close the shared memory
    sm.close()
Next, the main process will create the shared memory with the required size.
Given that we know we want a numpy array of a given size (10,000 elements) containing double values, we can calculate the size in memory required directly as 8 bytes (per double value) multiplied by the size of the array.
...
# define the size of the numpy array
n = 10000
# bytes required for the array (8 bytes per value for doubles)
n_bytes = n * 8
# create the shared memory
sm = SharedMemory(name='MyMemory', create=True, size=n_bytes)
Note that double values may not always be 8 bytes; the size is platform dependent.
You can learn more about numpy data types here:
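If you are unsure, you can compute the bytes per element from the numpy dtype itself rather than hardcoding 8. A minimal sketch:
...
# compute the bytes per element from the dtype rather than hardcoding 8
n_bytes = n * numpy.dtype(numpy.double).itemsize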
Note, we can create a shared memory with more space than that required by our array, but not less space. Ensure you allocate enough memory for your array and do some basic checks and calculations first if you’re not sure.
If the buffer is too small to hold the array, you will get an error such as:
TypeError: buffer is too small for requested array
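One way to guard against this is to confirm the allocated block is large enough before constructing the array. A minimal sketch of such a check (note that a SharedMemory block may be slightly larger than requested, as the size may be rounded up to a multiple of the platform's memory page size):
...
# confirm the allocated block can hold the planned array
assert sm.size >= n * numpy.dtype(numpy.double).itemsize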
Next, the main process will create a new numpy array with the given size, containing double values, and backed by our shared memory with a specific size.
...
# create a new numpy array that uses the shared memory
data = ndarray((n,), dtype=numpy.double, buffer=sm.buf)
We will then fill the array with the value 1.0 and then confirm the data in the array was changed and that the array has the expected size, e.g. 10k elements.
...
# populate the array
data.fill(1.0)
# confirm contents of the new array
print(data[:10], len(data))
Next, we will create, configure and start a child process to execute our task() function and block until it is terminated.
...
# create a child process
child = Process(target=task)
# start the child process
child.start()
# wait for the child process to complete
child.join()
If you are new to starting a child process to execute a target function, you can learn more in the tutorial:
Finally, we will confirm the contents of the shared array were changed by the child process, then close access to the shared memory and release the shared memory.
...
# check some data in the shared array
print(data[:10])
# close the shared memory
sm.close()
# release the shared memory
sm.unlink()
Tying this together, the complete example is listed below.
# share numpy array via a shared memory
from multiprocessing import Process
from multiprocessing.shared_memory import SharedMemory
from numpy import ndarray
import numpy

# task executed in a child process
def task():
    # define the size of the numpy array
    n = 10000
    # attach another shared memory block
    sm = SharedMemory('MyMemory')
    # create a new numpy array that uses the shared memory
    data = ndarray((n,), dtype=numpy.double, buffer=sm.buf)
    # check the contents
    print(f'Child {data[:10]}')
    # increment the data
    data[:] += 1
    # confirm change
    print(f'Child {data[:10]}')
    # close the shared memory
    sm.close()

# protect the entry point
if __name__ == '__main__':
    # define the size of the numpy array
    n = 10000
    # bytes required for the array (8 bytes per value for doubles)
    n_bytes = n * 8
    # create the shared memory
    sm = SharedMemory(name='MyMemory', create=True, size=n_bytes)
    # create a new numpy array that uses the shared memory
    data = ndarray((n,), dtype=numpy.double, buffer=sm.buf)
    # populate the array
    data.fill(1.0)
    # confirm contents of the new array
    print(data[:10], len(data))
    # create a child process
    child = Process(target=task)
    # start the child process
    child.start()
    # wait for the child process to complete
    child.join()
    # check some data in the shared array
    print(data[:10])
    # close the shared memory
    sm.close()
    # release the shared memory
    sm.unlink()
Running the example first creates the shared memory with the name “MyMemory” and 80,000 bytes in size (e.g. 10k * 8 bytes per double).
A new numpy array is then created with the given size and backed by the shared memory.
The array is filled with the value 1.0 and the contents and size of the array are confirmed.
A child process is then created and configured to execute our task() function. The process is started and the main process blocks until the child process terminates.
The child process runs and connects to the shared memory by name, e.g. “MyMemory“. It then creates a new numpy array with the same size and data type as the parent process and backs it with the shared memory.
The child process first confirms that the array contains ones, as set by the parent process. This confirms that the child process is operating on the same memory (the same array) as the parent process.
It increments all values in the array and confirms that the data was changed.
The child process closes its access to the shared memory and terminates.
The main process resumes and confirms that the contents of the array were changed by the child process. This confirms that changes made to the array by the child process are reflected in the parent process.
Finally, the main process closes access to the shared memory and then releases the shared memory.
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 10000
Child [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
Child [2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
[2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
Simplified Example of Sharing a Numpy Array Using SharedMemory
We can simplify the example a little and leave it functionally the same.
For example, we can define a function that creates a numpy array backed by a shared memory.
The function can take the shared memory object, the shape of the array, and the data type, then return a numpy array backed by the shared memory.
For example:
# define a numpy array backed by shared memory
def sm_array(shared_mem, shape, dtype=numpy.double):
    # create a new numpy array that uses the shared memory
    return ndarray(shape, dtype=dtype, buffer=shared_mem.buf)
This function would ensure that the same numpy array is created each time, regardless of the process using the array.
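Because the helper takes a shape tuple, the same pattern extends to multi-dimensional arrays. For example, a minimal sketch (the 100x50 shape and the “sm2” name are illustrative assumptions):
...
# e.g. a two-dimensional array of doubles backed by its own shared memory block
rows, cols = 100, 50
sm2 = SharedMemory(create=True, size=rows * cols * 8)
matrix = sm_array(sm2, (rows, cols))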
We can also pass the size of the array and the name of the shared memory to the child process as arguments, to ensure the parent and child processes are consistent.
For example:
# task executed in a child process
def task(n, name):
    # attach another shared memory block
    sm = SharedMemory(name)
    # create a new numpy array that uses the shared memory
    data = sm_array(sm, (n,))
    # check the contents
    print(f'Child {data[:10]}')
    # increment the data
    data[:] += 1
    # confirm change
    print(f'Child {data[:10]}')
    # close the shared memory
    sm.close()
Finally, we can create the SharedMemory in the main process using the SharedMemoryManager and the context manager interface. This ensures that the memory is always closed and unlinked by the program, even if there are errors.
For example:
...
# define the size of the numpy array
n = 10000000
# bytes required for the array (8 bytes per value for doubles)
n_bytes = n * 8
with SharedMemoryManager() as smm:
    # create the shared memory
    sm = smm.SharedMemory(size=n_bytes)
    # create a new numpy array that uses the shared memory
    data = sm_array(sm, (n,))
    # populate the array
    data.fill(1.0)
    # confirm contents of the new array
    print(data[:10], len(data))
    # create a child process
    child = Process(target=task, args=(n, sm.name))
    # start the child process
    child.start()
    # wait for the child process to complete
    child.join()
    # check some data in the shared array
    print(data[:10])
You can learn more about the SharedMemoryManager in the tutorial:
Tying this together, the complete example with these changes is listed below.
# share numpy array via a shared memory
from multiprocessing import Process
from multiprocessing.shared_memory import SharedMemory
from multiprocessing.managers import SharedMemoryManager
from numpy import ndarray
import numpy

# define a numpy array backed by shared memory
def sm_array(shared_mem, shape, dtype=numpy.double):
    # create a new numpy array that uses the shared memory
    return ndarray(shape, dtype=dtype, buffer=shared_mem.buf)

# task executed in a child process
def task(n, name):
    # attach another shared memory block
    sm = SharedMemory(name)
    # create a new numpy array that uses the shared memory
    data = sm_array(sm, (n,))
    # check the contents
    print(f'Child {data[:10]}')
    # increment the data
    data[:] += 1
    # confirm change
    print(f'Child {data[:10]}')
    # close the shared memory
    sm.close()

# protect the entry point
if __name__ == '__main__':
    # define the size of the numpy array
    n = 10000000
    # bytes required for the array (8 bytes per value for doubles)
    n_bytes = n * 8
    with SharedMemoryManager() as smm:
        # create the shared memory
        sm = smm.SharedMemory(size=n_bytes)
        # create a new numpy array that uses the shared memory
        data = sm_array(sm, (n,))
        # populate the array
        data.fill(1.0)
        # confirm contents of the new array
        print(data[:10], len(data))
        # create a child process
        child = Process(target=task, args=(n, sm.name))
        # start the child process
        child.start()
        # wait for the child process to complete
        child.join()
        # check some data in the shared array
        print(data[:10])
Running the example is functionally the same as the first version and produces the same result.
It offers a little more safety by ensuring the creation of the array is consistent across both processes.
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 10000000
Child [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
Child [2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
[2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
Further Reading
This section provides additional resources that you may find helpful.
Books
- Concurrent NumPy in Python, Jason Brownlee (my book!)
Guides
- Concurrent NumPy 7-Day Course
- Which NumPy Functions Are Multithreaded
- Numpy Multithreaded Matrix Multiplication (up to 5x faster)
- NumPy vs the Global Interpreter Lock (GIL)
- ThreadPoolExecutor Fill NumPy Array (3x faster)
- Fastest Way To Share NumPy Array Between Processes
Documentation
- Parallel Programming with numpy and scipy, SciPy Cookbook, 2015
- Parallel Programming with numpy and scipy (older archived version)
- Parallel Random Number Generation, NumPy API
Concurrency APIs
- threading — Thread-based parallelism
- multiprocessing — Process-based parallelism
- concurrent.futures — Launching parallel tasks
Takeaways
You now know how to share a numpy array between processes using multiprocessing SharedMemory.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.
Photo by Denis Chick on Unsplash