Last Updated on September 29, 2023
You can share numpy arrays between processes in Python.
There are many ways to share a numpy array between processes, such as passing it as a function argument, inheriting it as a global variable, transmitting it via a queue or a pipe, or sharing it via a ctypes Array or RawArray, a memory-mapped file, a SharedMemory-backed array, or a Manager.
In this tutorial, you will discover a suite of approaches that you can use to share a numpy array between python processes.
Let’s get started.
Updated August 2023: Added memory-mapped file method, added ctype Array section, added how to choose section.
Need to Share Numpy Array Between Processes
Python offers process-based concurrency via the multiprocessing module.
Process-based concurrency is appropriate for those tasks that are CPU-bound, as opposed to thread-based concurrency in Python which is generally suited to IO-bound tasks given the presence of the Global Interpreter Lock (GIL).
You can learn more about process-based concurrency and the multiprocessing module in the tutorial:
Consider the situation where we need to share numpy arrays between processes.
This may be for many reasons, such as:
- Data is loaded as an array in one process and analyzed differently in different subprocesses.
- Many child processes load small data as arrays that are sent to a parent process for handling.
- Data arrays are loaded in the parent process and processed in a suite of child processes.
Sharing Python objects and data between processes is slow.
This is because any data, like numpy arrays, shared between processes must be transmitted using inter-process communication (IPC), requiring the data first be pickled by the sender and then unpickled by the receiver.
You can learn more about this in the tutorial:
This means that if we share numpy arrays between processes, it assumes that we receive some benefit, such as a speedup, that overcomes the slow speed of data transmission.
For example, it may be the case that the arrays are relatively small and fast to transmit, whereas the computation performed on each array is slow and can benefit from being performed in separate processes.
Alternatively, preparing the array may be computationally expensive and benefit from being performed in a separate process, and once prepared, the arrays are small and fast to transmit to another process that requires them.
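If you are unsure whether the transmission cost matters for your arrays, you can estimate it directly. Below is a minimal sketch (the array size is an arbitrary example) that times a pickle/unpickle round trip, which approximates the per-transfer overhead of the argument, queue, and pipe approaches.

# estimate the cost of pickling and unpickling an array (approximates IPC overhead)
import pickle
from time import perf_counter
from numpy import ones

# example array, change the size to match your workload
data = ones((1000, 1000))
# time a pickle/unpickle round trip
start = perf_counter()
payload = pickle.dumps(data)
restored = pickle.loads(payload)
duration = perf_counter() - start
# report the payload size and the round-trip time
print(f'{len(payload)} bytes in {duration:.4f} seconds')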
Given these situations, how can we share data between Processes in Python?
How to Share a Numpy Array Between Processes
There are many ways to share a numpy array between processes.
Nevertheless, there are perhaps 9 main approaches that we can implement using the Python standard library.
They are:
- Share NumPy Array Via Function Argument
- Share NumPy Array Via Inherited Variable
- Share NumPy Array Via Queue
- Share NumPy Array Via Pipe
- Share NumPy Array Via ctype Array
- Share NumPy Array Via ctype RawArray
- Share NumPy Array Via SharedMemory
- Share NumPy Array via Memory Mapped File
- Share NumPy Array Via a Manager
Do you know of another approach to sharing numpy arrays?
Let me know in the comments below.
Some approaches are slow, requiring that the numpy array be transmitted using inter-process communication, e.g. passing the array as an argument or using a queue or a pipe.
Other approaches are fast, using a direct memory copy, such as inheriting the array using a global variable.
The preferred approach is often to use a simulated shared memory model that is fast and allows changes to be reflected in each process, such as a RawArray or SharedMemory backed numpy array or an array hosted in a Manager server process.
We will take a close look at each approach and explore each with a worked example.
Let’s dive in.
Method 01: Share a NumPy Array Via Function Argument
We can share numpy arrays with child processes via function arguments.
Recall that when we start a child process, we can configure it to run a target function via the “target” argument. We can also pass arguments to the target function via the “args” keyword.
As such, we can define a target function that takes a numpy array as an argument and execute it in a child process.
For example:
...
# create a child process
child = Process(target=task, args=(data,))
You can learn more about running a function in a child process in the tutorial:
Passing a numpy array to a target function executed in a child process is slow.
It requires that the numpy array be pickled in the parent process and then unpickled in the child process. This means it may only be appropriate for modestly sized arrays that are fast to pickle and unpickle.
Nevertheless, it is a simple and effective approach.
You can learn more about sharing a numpy array as an argument to a process in the tutorial:
We can explore the case of sharing a numpy array with a child process via a function argument.
In this example, we will create an array in the parent process then pass it to a task executed in a child process. The array will be copied and transmitted to the child process and as such changes to it won’t be reflected in the parent process.
Firstly, we can define a task function to be executed in the child process.
The function will take a numpy array as an argument. It will report on some of the content of the array, change all values to zero, then report that the contents of the array were changed.
The task() function below implements this.
# task executed in a child process
def task(data):
    # check some data in the array
    print(data[:5,:5])
    # change data in the array
    data.fill(0.0)
    # confirm the data was changed
    print(data[:5,:5])
Next, in the main process, we can create an array of modest size, filled with ones.
...
# define the size of the numpy array
n = 10000
# create the numpy array
data = ones((n,n))
We can then create and configure a child process to execute our task() function and pass the numpy array as an argument.
...
# create a child process
child = Process(target=task, args=(data,))
We can then start the child process and block until it has completed.
...
# start the child process
child.start()
# wait for the child process to complete
child.join()
Finally, we can check the contents of the array.
...
# check some data in the array
print(data[:5,:5])
Tying this together, the complete example is listed below.
# share numpy array via function argument
from multiprocessing import Process
from numpy import ones

# task executed in a child process
def task(data):
    # check some data in the array
    print(data[:5,:5])
    # change data in the array
    data.fill(0.0)
    # confirm the data was changed
    print(data[:5,:5])

# protect the entry point
if __name__ == '__main__':
    # define the size of the numpy array
    n = 10000
    # create the numpy array
    data = ones((n,n))
    # create a child process
    child = Process(target=task, args=(data,))
    # start the child process
    child.start()
    # wait for the child process to complete
    child.join()
    # check some data in the array
    print(data[:5,:5])
Running the example first creates the numpy array populated with one values.
The child process is then created, configured and started. The main process then blocks until the child process is terminated.
The child process runs, confirming the array was populated with one values. It then fills the array with zero values and confirms the contents of the array were changed.
The child process terminates and the main process resumes. It confirms that the contents of the array in the parent process have not changed.
This highlights how we can share a copy of an array with a child process and how changes to the copy of the array in the child process are not reflected in the parent process.
[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]
[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]
[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]
Method 02: Share a NumPy Array Via Inheritance
We can share a numpy array with other processes via an inherited global variable.
That is, a process can load or create a numpy array, store it in a global variable, and other processes can access it directly.
The benefit of this approach is that it is really fast.
It can be up to 34x faster than sending the array between processes using other methods, such as a function argument.
You can learn more about this approach to sharing data between processes in the tutorial:
There are some limitations on this approach though, they are:
- The array must be stored in a global variable by a parent process.
- The array can only be accessed by child processes.
- Child processes must be created using the ‘fork’ start method.
- Changes to the array made by child processes will not be reflected in each other or in the parent process.
Let’s consider these concerns in detail.
A python program can declare an explicit global variable using the ‘global‘ keyword.
For example:
...
# declare the global variable
global data
This is not needed in the parent process when preparing the numpy array, but may be needed in the function executed in the child process to ensure that the function refers to the inherited global variable and not a newly created local variable.
This is called inheriting a global variable.
You can learn more about inheriting global variables in child processes in the tutorial:
Child processes can be created either by spawning a new instance of the Python interpreter or by forking an existing instance of the Python interpreter. Spawning is the current default on Windows and macOS, whereas forking is the default on Linux.
We can explicitly set the method used to start child processes via the multiprocessing.set_start_method() function and specify the method as a string, such as ‘fork’.
For example:
...
# ensure we are using fork start method
set_start_method('fork')
The fork start method is not supported on Windows at the time of writing.
You can learn more about setting the start method used for creating child processes in the tutorial:
The child process will inherit a copy of all global variables from the parent process.
Because they are a copy, any changes made to the global variable in the child process will only be available to the child process. Similarly, any changes to the global variable made in the parent process after the child processes are created will not be reflected in the child processes.
As such, this approach to sharing an array between processes is appropriate in certain situations, such as:
- The array is large, but the machine can afford to store multiple copies in memory at the same time.
- Child processes compute something using the array, but changes to the array itself are not required in the parent process or other processes.
A common example that may be appropriate for this approach would be where one or more arrays are prepared or loaded and a suite of computationally expensive statistics needs to be calculated on each. Each statistic or set of statistics can be calculated using a copy of the array in a separate process.
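As a rough illustration only, a minimal sketch of this pattern might look like the following, where each child process computes a different statistic on the same inherited array (the statistics and array size are arbitrary, and the ‘fork’ start method is assumed to be available):

# sketch: each child computes a different statistic on an inherited global array
from multiprocessing import Process
from multiprocessing import set_start_method
from numpy import ones

# compute and report one named statistic on the inherited global array
def compute(stat):
    # 'data' is inherited from the parent process; look up the method by name, e.g. data.sum()
    result = getattr(data, stat)()
    print(f'{stat}: {result}', flush=True)

# protect the entry point
if __name__ == '__main__':
    # inheritance requires the fork start method
    set_start_method('fork')
    # array created in the parent and inherited by each child
    data = ones((2000, 2000))
    # one child process per statistic
    children = [Process(target=compute, args=(stat,)) for stat in ['sum', 'mean', 'max', 'std']]
    for child in children:
        child.start()
    for child in children:
        child.join()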
You can learn more about inheriting a numpy array as a global variable in the tutorial:
We can explore how to share a numpy array as an inherited global variable with child processes.
In this example, we will create a modestly sized numpy array in a parent process and store it as a global variable. We will then start a child process using the fork start method and have it access the inherited numpy array.
Firstly, we can define a task to be executed in a child process.
In this case, the task does not take any arguments. Instead, it declares an inherited global variable, then accesses a portion of it directly.
It then changes the contents of the array, by assigning all values to zero and confirming that the change took effect.
The task() function below implements this.
# task executed in a child process
def task():
    # declare the global variable
    global data
    # check some data in the array
    print(data[:5,:5])
    # change data in the array
    data.fill(0.0)
    # confirm the data was changed
    print(data[:5,:5])
Next, we can set the start method to fork.
...
# ensure we are using fork start method
set_start_method('fork')
We can then create a modestly sized numpy array, stored in a variable.
...
# define the size of the numpy array
n = 10000
# create the numpy array
data = ones((n,n))
We can then create a child process, configured to execute our task() function, and then start the process.
This will make a copy of the parent process, including the numpy array stored in the ‘data’ variable.
...
# create a child process
child = Process(target=task)
# start the child process
child.start()
You can learn more about running a function in a child process in the tutorial:
Finally, we can wait for the child process to terminate and then report the contents of the array.
...
# wait for the child process to complete
child.join()
# check some data in the array
print(data[:5,:5])
Tying this together, the complete example is listed below.
Note, this example may not run on Windows as it requires support for the ‘fork’ start method.
# share a numpy array as a global variable
from multiprocessing import set_start_method
from multiprocessing import Process
from numpy import ones
from numpy import zeros

# task executed in a child process
def task():
    # declare the global variable
    global data
    # check some data in the array
    print(data[:5,:5])
    # change data in the array
    data.fill(0.0)
    # confirm the data was changed
    print(data[:5,:5])

# protect the entry point
if __name__ == '__main__':
    # ensure we are using fork start method
    set_start_method('fork')
    # define the size of the numpy array
    n = 10000
    # create the numpy array
    data = ones((n,n))
    # create a child process
    child = Process(target=task)
    # start the child process
    child.start()
    # wait for the child process to complete
    child.join()
    # check some data in the array
    print(data[:5,:5])
Running the example first sets the ‘fork’ start method for creating child processes.
Next, the numpy array is created and initialized to one values.
The child process is then created and started, and the main process blocks until it is terminated.
A forked copy of the parent process is created and runs the task() function. The inherited global variable is declared and then accessed, reporting a small subset of the data.
This confirms that the array was inherited correctly and contains the data initialized in the parent process.
The child process then fills the array with zero values and accesses them to confirm the data was changed.
The child process terminates and the main process then resumes. It reports the contents of a small portion of the array, confirming that the data change made in the child process was not propagated to the parent process.
This highlights how to share a numpy array with child processes by inheritance and that changes made in the child process to the inherited array do not propagate.
[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]
[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]
[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]
Method 03: Share a NumPy Array Via a Queue
Another way to share numpy arrays between python processes is to use a queue.
Python provides a process-safe queue in the multiprocessing.Queue class.
A queue is a data structure on which items can be added by a call to put() and from which items can be retrieved by a call to get().
The multiprocessing.Queue provides a first-in, first-out (FIFO) queue, which means that items are retrieved from the queue in the order they were added. The first items added to the queue will be the first items retrieved. This is opposed to other queue types, such as last-in, first-out and priority queues.
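For example, a minimal sketch of moving a numpy array through a shared queue might look like the following (assuming a Queue has already been created and shared with both processes):

...
# in the producing process: add the array to the shared queue
queue.put(data)
...
# in the consuming process: retrieve the next array, blocking until one is available
data = queue.get()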
You can learn more about how to use the multiprocessing.Queue class in the tutorial:
You can learn more about sharing numpy arrays between processes via a queue in the tutorial:
We can explore an example of sending numpy arrays to child processes for processing using a shared queue.
In this example, the parent process will create many arrays with different dimensions and filled with random numbers. The arrays will be placed on a shared queue. Multiple child processes will then consume the arrays from the queue and perform a mathematical operation on them.
This may be a good template for those cases where loading or preparing the numpy arrays is relatively easy and the arrays are modest in size, yet each array requires non-trivial computation (which we will simulate in this case).
Firstly, we can define the task executed in child processes.
This task will run in a loop. Each iteration it will retrieve one array from the queue and then perform a computational operation on it. In this case, it will simply compute the max value in the array and report it. The loop is exited if a None message is received, signaling no further arrays.
The task() function below implements this, taking the shared queue as an argument.
# read numpy arrays from the queue and compute something
def task(queue):
    # read arrays
    while True:
        # read one array
        data = queue.get()
        # check for final item
        if data is None:
            # report message
            print('Task done.', flush=True)
            # push signal back into queue
            queue.put(data)
            # exit the task
            break
        # compute max of array
        result = data.max()
        # report a message
        print(f'Task read an array {data.shape}, max={result}', flush=True)
The main process will first create the shared queue, then start 4 child processes, each executing the task() function.
...
# create the shared queue
queue = Queue()
# issue task processes
tasks = [Process(target=task, args=(queue,)) for _ in range(4)]
for t in tasks:
    t.start()
The main process then loops 20 times, each iteration creating a numpy array with random dimensions and filled with random values which is then put on the shared queue.
...
# generate many arrays of random numbers
for _ in range(20):
    # generate random dimensions
    dim = (randint(500,2000),randint(500,2000))
    # generate array of random floats with random dimensions
    data = random(dim)
    # push into queue
    queue.put(data)
The main process then signals that no further arrays are expected via a None value, then waits for all child processes to terminate.
...
# signal no further arrays
queue.put(None)
# wait for task processes to be done
for t in tasks:
    t.join()
# report a final message
print('Done.', flush=True)
Tying this together, the complete example is listed below.
# create arrays in a parent process and read in children via queue
from multiprocessing import Process
from multiprocessing import Queue
from random import randint
from numpy.random import random

# read numpy arrays from the queue and compute something
def task(queue):
    # read arrays
    while True:
        # read one array
        data = queue.get()
        # check for final item
        if data is None:
            # report message
            print('Task done.', flush=True)
            # push signal back into queue
            queue.put(data)
            # exit the task
            break
        # compute max of array
        result = data.max()
        # report a message
        print(f'Task read an array {data.shape}, max={result}', flush=True)

# protect the entry point
if __name__ == '__main__':
    # create the shared queue
    queue = Queue()
    # issue task processes
    tasks = [Process(target=task, args=(queue,)) for _ in range(4)]
    for t in tasks:
        t.start()
    # generate many arrays of random numbers
    for _ in range(20):
        # generate random dimensions
        dim = (randint(500,2000),randint(500,2000))
        # generate array of random floats with random dimensions
        data = random(dim)
        # push into queue
        queue.put(data)
    # signal no further arrays
    queue.put(None)
    # wait for task processes to be done
    for t in tasks:
        t.join()
    # report a final message
    print('Done.', flush=True)
Running the program first creates the shared queue.
The 4 child processes are then created and configured to execute our task() function before being started. Each process waits on items to appear on the queue.
The main process then generates and puts 20 numpy arrays on the queue.
It then puts the None signal on the queue to indicate that no further arrays are expected and waits for the child processes to exit.
Each child process loops, retrieving one array from the queue. It computes the max value and then reports it along with the dimensions of the array.
This is repeated by all child processes until no further arrays are available.
Each child process encounters the None message, adds it back to the queue for other child processes, exits its loop, and reports a message, terminating the child process.
Once all child processes are terminated the main process resumes and reports a final message before terminating itself.
Task read an array (827, 1388), max=0.999999824661523
Task read an array (1745, 1231), max=0.9999998440051578
Task read an array (823, 1081), max=0.999999653298093
Task read an array (1364, 1886), max=0.9999994826187178
Task read an array (1960, 1190), max=0.9999999890428217
Task read an array (1272, 1634), max=0.9999992473453926
Task read an array (1596, 1742), max=0.9999996946372763
Task read an array (1035, 1511), max=0.9999988748078102
Task read an array (1358, 951), max=0.9999996488860812
Task read an array (586, 564), max=0.999998583415731
Task read an array (1852, 673), max=0.9999997326673665
Task read an array (1273, 1114), max=0.9999999614124049
Task read an array (1920, 775), max=0.9999997603830041
Task read an array (1175, 1762), max=0.9999998728441418
Task read an array (1567, 843), max=0.9999978316858225
Task read an array (1579, 1028), max=0.9999996224604828
Task read an array (784, 1873), max=0.9999998776214708
Task read an array (1770, 1099), max=0.9999998975119301
Task read an array (1511, 788), max=0.9999996664224969
Task done.
Task done.
Task done.
Task read an array (1851, 1027), max=0.9999997882123821
Task done.
Done.
Method 04: Share a NumPy Array Via a Pipe
We can share a numpy array between processes using a pipe.
This is much like sharing a numpy array using a queue. A numpy array must be written to the pipe in one process and read from the pipe in another process. A copy of the array is transmitted between processes using inter-process communication, which can be slow for large arrays.
The multiprocessing.Pipe class provides a simple way to share data between processes. Creating a pipe returns two connection objects, one for each end of the pipe; by default both ends can send and receive, although it is common to use one end for sending and the other for receiving.
...
# create the shared pipe
conn1, conn2 = Pipe()
We can then share the sending connection with the child process and keep the receiving connection in the parent process, and use them to send an array from the child back to the parent process.
...
# send the array via a pipe
pipe.send(data)
You can learn more about how to use pipes for interprocess communication in the tutorial:
You can learn more about sharing a numpy array between processes using a pipe in the tutorial:
We can explore the case of simulating the return of a numpy array from a child process using a pipe.
In this example, we will create a pipe and share the sending connection with the child process and keep the receive connection in the parent process. The child process will then create a numpy array and send it to the parent process via the pipe.
Firstly, we can define a task function that takes a pipe connection as an argument and creates and transmits the array.
The function first creates an array initialized with one values. It then confirms that the array contains one values and sends it via the pipe.
The task() function below implements this.
# task executed in a child process
def task(pipe):
    # define the size of the numpy array
    n = 10000
    # create the numpy array
    data = ones((n,n))
    # check some data in the array
    print(data[:5,:5])
    # send the array via a pipe
    pipe.send(data)
Next, the main process will create the pipe.
...
# create the shared pipe
conn1, conn2 = Pipe()
It then creates the child process and configures it to execute our task function, passing it the sending connection as an argument.
...
# create a child process
child = Process(target=task, args=(conn2,))
The child process is then started and the main process blocks waiting to receive the array via the pipe.
...
# start the child process
child.start()
# read the data from the pipe
data = conn1.recv()
Finally, the main process confirms the contents of the received numpy array.
...
# check some data in the array
print(data[:5,:5])
Tying this together, the complete example is listed below.
# share numpy array via pipe
from multiprocessing import Process
from multiprocessing import Pipe
from numpy import ones

# task executed in a child process
def task(pipe):
    # define the size of the numpy array
    n = 10000
    # create the numpy array
    data = ones((n,n))
    # check some data in the array
    print(data[:5,:5])
    # send the array via a pipe
    pipe.send(data)

# protect the entry point
if __name__ == '__main__':
    # create the shared pipe
    conn1, conn2 = Pipe()
    # create a child process
    child = Process(target=task, args=(conn2,))
    # start the child process
    child.start()
    # read the data from the pipe
    data = conn1.recv()
    # check some data in the array
    print(data[:5,:5])
Running the example first creates the pipe with the sending and receiving connections.
Next, the child process is created and configured, and started.
The main process then blocks, waiting to read the numpy array from the pipe.
The child process runs, first creating a numpy array initialized with one values.
It then confirms the contents of the array and transmits it to the parent process via the pipe. The child process then terminates.
The main process reads the numpy array from the pipe and confirms it contains one values.
[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]
[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]
Method 05: Share a NumPy Array Via a ctype Array
One way to share a numpy array between processes is via shared ctypes.
The ctypes module provides tools for working with C data types.
Python provides the capability to share ctypes between processes on one system.
This is primarily achieved via the following classes:
- multiprocessing.Value: manage a shared value.
- multiprocessing.Array: manage an array of shared values.
A single Value or Array instance can be created in memory and shared among processes without duplicating the data.
You can learn more about sharing ctypes between processes in the tutorial:
We can copy a numpy array into a multiprocessing.Array and share it among multiple processes that can then read and write the same data.
A multiprocessing.Array can be created by specifying the data type and initial values.
For example:
...
# create an integer array
data = multiprocessing.Array(ctypes.c_int, (1, 2, 3, 4, 5))
We can use the numpy.ctypeslib.as_ctypes() function to determine the ctype for a given numpy array.
For example:
...
# get ctype for our array
ctype = as_ctypes(data)
We can then use this type along with the array data to initialize a new multiprocessing.Array.
For example:
...
# create ctype array initialized from our array
array = Array(ctype._type_, data)
This will make a copy of the data in the numpy array into the multiprocessing.Array.
Note that the shared ctype Array only supports one-dimensional arrays. This means that if you have a two-dimensional array, or more dimensions, you must flatten the array first before you copy it into the multiprocessing.Array, e.g. via the flatten() method.
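For example, the sketch below (with an illustrative shape) flattens a two-dimensional array before copying it into the shared ctype Array, then rebuilds a two-dimensional numpy view of the shared data afterwards, e.g. in another process:

# sketch: a 2D array must be flattened before copying into a shared ctype Array
from multiprocessing.sharedctypes import Array
from numpy import ones
from numpy import double
from numpy import frombuffer
from numpy.ctypeslib import as_ctypes

# example two-dimensional array (shape is illustrative)
data = ones((100, 100))
# flatten to one dimension before the copy
flat = data.flatten()
# create the shared ctype Array initialized from the flattened data
ctype = as_ctypes(flat)
array = Array(ctype._type_, flat, lock=False)
# later, e.g. in another process, view the shared data as two-dimensional again
shared = frombuffer(array, dtype=double, count=len(array)).reshape((100, 100))
print(shared.shape)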
You can learn more about how to share NumPy arrays between processes using a ctype Array in the tutorial:
We can explore the case of sharing a numpy array between processes using a shared ctype array.
In this example, we will create a one-dimensional numpy array initialized with one values. We will then copy it into a shared ctype Array and share the array with a child process. The child process will then change the contents of the array. The parent process will then confirm that the contents of the shared ctype array were changed.
First, we will define a function to execute in a child process.
The function will take the shared ctype array as an argument.
Firstly, it will check that the contents of the array match what was expected, e.g. what was passed in from the parent process. It will then change the content of the array to all zero values and confirm that the content of the array was changed.
The task() function listed below implements this.
# task executed in a child process
def task(array):
    # check some data in the array
    print(array[:10], len(array))
    # change data in the array
    for i in range(len(array)):
        array[i] = 0.0
    # confirm the data was changed
    print(array[:10], len(array))
Next, the main process will create a new array with a modest size, initialized to all one values.
It then reports the contents of the array, to confirm it indeed contains all one values.
...
# define the size of the numpy array
n = 10000
# create the numpy array
data = ones((n,))
print(data[:10], data.shape)
Next, the main process gets the ctype equivalent for the array type and uses this type, along with the content of the array to create a new shared ctype Array without a mutex lock.
A lock is not required in this case as we know that the two processes will not be modifying the array at the same time.
...
# get ctype for our array
ctype = as_ctypes(data)
# create ctype array initialized from our array
array = Array(ctype._type_, data, lock=False)
We can then confirm that the new shared ctype Array contains the same data as the numpy array.
...
# confirm the contents of the shared array
print(array[:10], len(array))
The parent process then creates a new child process, configured to execute our task() function, and passes it the shared ctype Array as an argument.
The child process is started and the main process blocks until the child process terminates.
...
# create a child process
child = Process(target=task, args=(array,))
# start the child process
child.start()
# wait for the child process to complete
child.join()
Finally, the parent process checks the contents of the array.
...
# check some data in the shared array
print(array[:10], len(array))
Tying this together, the complete example is listed below.
# share numpy array via a shared ctype
from multiprocessing import Process
from multiprocessing.sharedctypes import Array
from numpy import ones
from numpy.ctypeslib import as_ctypes

# task executed in a child process
def task(array):
    # check some data in the array
    print(array[:10], len(array))
    # change data in the array
    for i in range(len(array)):
        array[i] = 0.0
    # confirm the data was changed
    print(array[:10], len(array))

# protect the entry point
if __name__ == '__main__':
    # define the size of the numpy array
    n = 10000
    # create the numpy array
    data = ones((n,))
    print(data[:10], data.shape)
    # get ctype for our array
    ctype = as_ctypes(data)
    # create ctype array initialized from our array
    array = Array(ctype._type_, data, lock=False)
    # confirm contents of the shared array
    print(array[:10], len(array))
    # create a child process
    child = Process(target=task, args=(array,))
    # start the child process
    child.start()
    # wait for the child process to complete
    child.join()
    # check some data in the shared array
    print(array[:10], len(array))
Running the example first creates a numpy array with 10,000 elements, initialized to one values.
The content of the numpy array is then confirmed.
Next, the ctype of the array is determined and is used along with the numpy array itself to create a new shared ctype Array.
The content of the numpy array is copied into the shared ctype array and we confirm that the shared ctype array contains all one values.
Next, the child process is created and started and the parent process blocks.
The child process runs. It first confirms the contents of the shared ctype Array contains all one values.
It then updates the shared array to contain all zero values and confirms the array’s contents were changed.
The child process terminates and the parent process resumes.
The parent process checks the contents of the shared array and confirms that it was changed by the child process.
This highlights that both parent and child processes operated upon the same single array in memory.
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] (10000,)
[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0] 10000
[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0] 10000
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] 10000
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] 10000
Method 06: Share a NumPy Array Via a ctype RawArray
We can share a numpy array between processes via shared ctypes.
The ctypes module provides tools for working with C data types.
Python provides the capability to share ctypes between processes on one system.
You can learn more about sharing ctypes between processes in the tutorial:
We can use a shared ctype array as a buffer that backs a numpy array.
Numpy arrays can be created in each Python process backed by the same shared ctype array and share data directly.
This can be achieved by first creating a multiprocessing.sharedctypes.RawArray with the required type and large enough to hold the data required by the numpy array.
For example:
...
# create the shared array
array = RawArray('d', 10000)
A new numpy array can then be created with the given type, specifying the RawArray as the buffer, e.g. the pre-allocated memory for the array.
This can be achieved by creating a new numpy.ndarray directly and specifying the RawArray via the “buffer” argument.
For example:
...
# create a new numpy array that uses the raw array
data = ndarray((len(array),), dtype=numpy.double, buffer=array)
It is generally not recommended to create ndarrays directly.
Instead, we can use the numpy.frombuffer() method to create the array for us with the given size, type, and backed by the RawArray.
For example:
...
# create a new numpy array backed by the raw array
data = frombuffer(array, dtype=double, count=len(array))
The RawArray instance can then be made available to other Python processes and used to create numpy arrays backed by the same memory.
This allows each process to read and write the same array data directly without having to copy it between processes.
You can learn more about sharing a numpy array backed by a RawArray between processes in the tutorial:
We can explore the case of creating a numpy array backed by a RawArray and sharing it among Python processes.
In this example, we will create a RawArray and then create a numpy array from the RawArray. We will then populate it with data in the main process. The RawArray will then be shared with a child process which will use it to create another numpy array, backed by the same data. It will then change the data and terminate. Finally, the main process will confirm that the data in the array was changed by the child process.
Firstly, we can define a function to execute in the child process.
The function will take a RawArray as an argument and use the RawArray to create a new numpy array. It then confirms the content of the array, increments all values in the array, and confirms that the data in the array was modified.
The task() function below implements this.
# task executed in a child process
def task(array):
    # create a new numpy array backed by the raw array
    data = frombuffer(array, dtype=double, count=len(array))
    # check the contents
    print(f'Child {data[:10]}')
    # increment the data
    data[:] += 1
    # confirm change
    print(f'Child {data[:10]}')
Next, the main process will create a new RawArray of double values with 10,000,000 elements.
...
# define the size of the numpy array
n = 10000000
# create the shared array
array = RawArray('d', n)
Next, the main process will create a new numpy array from the RawArray and populate it with one values and confirm that the data in the array was changed.
...
# create a new numpy array backed by the raw array
data = frombuffer(array, dtype=double, count=len(array))
# populate the array
data.fill(1.0)
# confirm contents of the new array
print(data[:10], len(data))
A child process is then configured to execute our task() function and pass it the RawArray. The process is started and the main process blocks until the child process terminates.
...
# create a child process
child = Process(target=task, args=(array,))
# start the child process
child.start()
# wait for the child process to complete
child.join()
Finally, the main process confirms that the child process changed the content of the shared array.
...
# check some data in the shared array
print(data[:10])
Tying this together, the complete example is listed below.
# share numpy array between processes via a shared raw array
from multiprocessing import Process
from multiprocessing.sharedctypes import RawArray
from numpy import frombuffer
from numpy import double

# task executed in a child process
def task(array):
    # create a new numpy array backed by the raw array
    data = frombuffer(array, dtype=double, count=len(array))
    # check the contents
    print(f'Child {data[:10]}')
    # increment the data
    data[:] += 1
    # confirm change
    print(f'Child {data[:10]}')

# protect the entry point
if __name__ == '__main__':
    # define the size of the numpy array
    n = 10000000
    # create the shared array
    array = RawArray('d', n)
    # create a new numpy array backed by the raw array
    data = frombuffer(array, dtype=double, count=len(array))
    # populate the array
    data.fill(1.0)
    # confirm contents of the new array
    print(data[:10], len(data))
    # create a child process
    child = Process(target=task, args=(array,))
    # start the child process
    child.start()
    # wait for the child process to complete
    child.join()
    # check some data in the shared array
    print(data[:10])
Running the example first creates the RawArray with 10 million double elements.
Next, a numpy array is created that holds double values and is backed by the RawArray. That is, it does not allocate new memory for the array and instead reuses the memory already allocated by the RawArray.
Next, the main process fills the numpy array with one values and confirms the content of the array changed and that the shape of the array is as we expect.
Next, a child process is started to execute our task() function and the main process blocks until the process terminates.
The child process runs. It first creates a new numpy array using the RawArray passed in as an argument.
Passing the RawArray to the child process does not make a copy of the array. Instead, it passes a reference to the RawArray to the process.
The child process confirms the array contains one values, as set by the parent process. Next, it increments all values in the array and confirms that the values were changed.
This highlights both that the child process can directly access the same array as the parent process and that it is able to read and write to this data.
The child process terminates and the main process resumes. It confirms that the child process changed the content of the array.
This highlights that changes made to the array in the child process are reflected in other processes, such as the parent process.
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 10000000
Child [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
Child [2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
[2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
Method 07: Share a NumPy Array Via SharedMemory
We can share numpy arrays between processes using shared memory.
Python processes do not share memory by default. Instead, shared memory must be provided explicitly.
Since Python 3.8, a new approach to shared memory is available in the multiprocessing.shared_memory module.
The benefit of shared memory is speed and efficiency.
No inter-process communication is required. Instead, processes are able to read and write the same shared memory block directly, although within constraints.
The module provides three classes to support shared memory, they are:
- multiprocessing.shared_memory.SharedMemory
- multiprocessing.shared_memory.ShareableList
- multiprocessing.managers.SharedMemoryManager
The SharedMemory class is the core class for providing shared memory.
It allows a shared memory block to be created, named, and attached to. It provides a “buf” attribute to read and write data in an array-like structure and can be closed and destroyed.
You can learn more about shared memory in the tutorial:
A SharedMemory can be created in a process by calling the constructor, specifying a “size” in bytes and setting the “create” argument to True.
For example:
...
# create a shared memory
shared_mem = SharedMemory(size=1024, create=True)
A shared memory object can be assigned a meaningful name via the “name” argument to the constructor.
For example:
...
# create a shared memory with a name
shared_mem = SharedMemory(name='MyMemory', size=1024, create=True)
Another process can access a shared memory via its name. This is called attaching to a shared memory.
This can be achieved by specifying the name of the shared memory that has already been created and setting the “create” argument to False (the default).
For example:
...
# attach to a shared memory
shared_mem = SharedMemory(name='MyMemory', create=False)
Once created, data can be stored in the shared memory via the “buf” attribute, which acts like an array of bytes.
For example:
...
# write data to shared memory
shared_mem.buf[0] = 1
You can learn more about creating and using the SharedMemory class in the tutorial:
We can create a shared memory and use it as the basis for a numpy array.
This means that multiple processes can interact with the same numpy array directly via the shared memory, rather than passing copies of the array around. This is significantly faster and mimics shared memory available between threads.
A new numpy.ndarray instance can be created and we can configure it to use the shared memory as the buffer via the “buffer” argument.
For example:
...
# create a new numpy array that uses the shared memory
data = ndarray((n,), dtype=numpy.double, buffer=shared_mem.buf)
We must ensure that the SharedMemory has a “size” that is large enough for our numpy array.
For example, if we create a numpy array to hold double values, each double is 8 bytes. Therefore, we can set the size of the SharedMemory to be 8 multiplied by the size of the array we need.
For example:
...
# define the size of the numpy array
n = 10000
# bytes required for the array (8 bytes per value for doubles)
n_bytes = n * 8
# create the shared memory
shared_mem = SharedMemory(name='MyMemory', create=True, size=n_bytes)
You can learn more about sharing a SharedMemory-backed numpy array between processes in the tutorial:
We can explore an example of sharing a numpy array between processes using shared memory.
In this example, we will create a shared memory large enough to hold our array, then create an array backed by the shared memory. We will then start a child process and have it connect to the same shared memory and create an array backed by the shared memory and access the same data.
This example will highlight how multiple processes can interact with the same numpy array in memory directly.
Firstly, we will define a task to execute in the child process.
The task will not take any arguments. It will first connect to the shared memory, then create a new numpy array of double values backed by the shared memory.
...
# define the size of the numpy array
n = 10000
# attach another shared memory block
sm = SharedMemory('MyMemory')
# create a new numpy array that uses the shared memory
data = ndarray((n,), dtype=numpy.double, buffer=sm.buf)
Next, the task will report the first 10 values of the array, increment all data in the array and then confirm that the data in the array was changed.
...
# check the contents
print(f'Child {data[:10]}')
# increment the data
data[:] += 1
# confirm change
print(f'Child {data[:10]}')
Finally, the task will close its access to the shared memory.
...
# close the shared memory
sm.close()
The task() function below implements this.
# task executed in a child process
def task():
    # define the size of the numpy array
    n = 10000
    # attach another shared memory block
    sm = SharedMemory('MyMemory')
    # create a new numpy array that uses the shared memory
    data = ndarray((n,), dtype=numpy.double, buffer=sm.buf)
    # check the contents
    print(f'Child {data[:10]}')
    # increment the data
    data[:] += 1
    # confirm change
    print(f'Child {data[:10]}')
    # close the shared memory
    sm.close()
Next, the main process will create the shared memory with the required size.
Given that we know we want a numpy array of a given size (10,000 elements) containing double values, we can calculate the size in memory required directly as 8 bytes (per double value) multiplied by the size of the array.
...
# define the size of the numpy array
n = 10000
# bytes required for the array (8 bytes per value for doubles)
n_bytes = n * 8
# create the shared memory
sm = SharedMemory(name='MyMemory', create=True, size=n_bytes)
Double values may not always be 8 bytes; the exact size is platform dependent.
You can learn more about numpy data types here:
Note, we can create a shared memory with more space than that required by our array, but not less space. Ensure you allocate enough memory for your array and do some basic checks and calculations first if you’re not sure.
If the buffer is too small to hold the array, you will get an error like:
TypeError: buffer is too small for requested array
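Rather than hard-coding 8 bytes per element, one option is to calculate the required size from the numpy dtype itself. A minimal sketch, assuming double (float64) values:

# calculate the shared memory size from the numpy dtype, rather than assuming 8 bytes
from multiprocessing.shared_memory import SharedMemory
from numpy import dtype
from numpy import double

# define the size of the numpy array
n = 10000
# bytes per element for the chosen dtype (typically 8 for double/float64)
n_bytes = n * dtype(double).itemsize
# create the shared memory with the calculated size
sm = SharedMemory(name='MyMemory', create=True, size=n_bytes)
# ...use the shared memory, then clean up
sm.close()
sm.unlink()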
Next, the main process will create a new numpy array with the given size, containing double values, and backed by our shared memory with a specific size.
...
# create a new numpy array that uses the shared memory
data = ndarray((n,), dtype=numpy.double, buffer=sm.buf)
We will then fill the array with one values and confirm that the data in the array was set and that the array has the expected size, e.g. 10k elements.
...
# populate the array
data.fill(1.0)
# confirm contents of the new array
print(data[:10], len(data))
Next, we will create, configure and start a child process to execute our task() function and block until it is terminated.
...
# create a child process
child = Process(target=task)
# start the child process
child.start()
# wait for the child process to complete
child.join()
If you are new to starting a child process to execute a target function, you can learn more in the tutorial:
Finally, we will confirm the contents of the shared array were changed by the child process, then close access to the shared memory and release the shared memory.
...
# check some data in the shared array
print(data[:10])
# close the shared memory
sm.close()
# release the shared memory
sm.unlink()
Tying this together, the complete example is listed below.
# share numpy array via a shared memory
from multiprocessing import Process
from multiprocessing.shared_memory import SharedMemory
from numpy import ones
from numpy import ndarray
import numpy

# task executed in a child process
def task():
    # define the size of the numpy array
    n = 10000
    # attach another shared memory block
    sm = SharedMemory('MyMemory')
    # create a new numpy array that uses the shared memory
    data = ndarray((n,), dtype=numpy.double, buffer=sm.buf)
    # check the contents
    print(f'Child {data[:10]}')
    # increment the data
    data[:] += 1
    # confirm change
    print(f'Child {data[:10]}')
    # close the shared memory
    sm.close()

# protect the entry point
if __name__ == '__main__':
    # define the size of the numpy array
    n = 10000
    # bytes required for the array (8 bytes per value for doubles)
    n_bytes = n * 8
    # create the shared memory
    sm = SharedMemory(name='MyMemory', create=True, size=n_bytes)
    # create a new numpy array that uses the shared memory
    data = ndarray((n,), dtype=numpy.double, buffer=sm.buf)
    # populate the array
    data.fill(1.0)
    # confirm contents of the new array
    print(data[:10], len(data))
    # create a child process
    child = Process(target=task)
    # start the child process
    child.start()
    # wait for the child process to complete
    child.join()
    # check some data in the shared array
    print(data[:10])
    # close the shared memory
    sm.close()
    # release the shared memory
    sm.unlink()
Running the example first creates the shared memory with the name “MyMemory” and 80,000 bytes in size (e.g. 10k * 8 bytes per double).
A new numpy array is then created with the given size and backed by the shared memory.
The array is filled with one values and the contents and size of the array are confirmed.
A child process is then created and configured to execute our task() function. The process is started and the main process blocks until the child process terminates.
The child process runs and connects to the shared memory by name, e.g. “MyMemory“. It then creates a new numpy array with the same size and data type as the parent process and backs it with the shared memory.
The child process first confirms that the array contains one values, as set by the parent process. This confirms that the child process is operating with the same memory (same array) as the parent process.
It increments all values in the array and confirms that the data was changed.
The child process closes its access to the shared memory and terminates.
The main process resumes and confirms that the contents of the array were changed by the child process. This confirms that changes made to the array by the child process are reflected in the parent process.
Finally, the main process closes access to the shared memory and then releases the shared memory.
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.] 10000
Child [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
Child [2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
[2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
Method 08: Share a NumPy Array Via Memory-Mapped File
We can share a NumPy array via a memory-mapped file.
A memory-mapped file is a structure that allows data to look and be used as though it exists in main memory, when in fact it is stored in a file on disk.
It allows very large data structures to be read and written without having to hold the entire contents of the structure in main memory, as we normally would.
Numpy offers the ability for a NumPy array to be stored in a memory-mapped file and used as though the array exists in main memory.
This capability is provided via the numpy.memmap() function.
Memory-mapped files are used for accessing small segments of large files on disk, without reading the entire file into memory. NumPy’s memmap’s are array-like objects.
— numpy.memmap API.
To use a memory-mapped file NumPy array, the numpy.memmap() function can be called to define the array.
This includes the filename for where the array data will be stored, the data type of the array, the mode in which the array will be used (e.g. read, write, etc.), and the dimensions or shape of the array.
For example, we can create a memory-mapped file NumPy array in the current working directory with the filename “data.np“, to hold floating point values, that can be read and written and is one dimensional in shape with 1000 elements.
...
# create a memory mapped numpy array
data = memmap('data.np', dtype='float32', mode='w+', shape=(1000,))
The open modes are just like those used to open a file on disk, for example:
- ‘r’ Open the array file in read-only mode.
- ‘r+’ Open an existing array file for reading and writing.
- ‘w+’ Create or overwrite an existing file for reading and writing.
- ‘c’ Only make changes to the data in memory, called copy-on-write.
The array can then be used as normal, such as populating it with initial values via the fill() method.
...
# populate the array
data.fill(1.0)
Generally, changes to the array are written to the backing file when we are finished with it, such as by calling flush().
We can use a NumPy memory-mapped file to share a NumPy array between processes.
If no array exists, one process can be responsible for creating the array.
This can be achieved by specifying the filename and dimensions of the array and opening it in create mode, e.g. ‘w+’.
For example:
...
# create a memory mapped numpy array
data = memmap('data.np', dtype='float32', mode='w+', shape=(1000,))
Changes can be made to the file and then flushed with a call to the flush method.
For example:
...
# write changes to file
data.flush()
Once the file exists on disk, other processes can then open the file.
This might be in read-only mode, e.g. ‘r’ if the other processes need only read the data.
For example:
...
# open existing memory mapped array in read-only mode
data = memmap('data.np', dtype='float32', mode='r', shape=(1000,))
Alternatively, if the processes need to read and then write the data in the array, it can be opened in read-write mode and changes can then be flushed once the process is finished.
For example:
...
# open existing memory-mapped array in read/write mode
data = memmap('data.np', dtype='float32', mode='r+', shape=(1000,))
# make changes
...
# write changes to file
data.flush()
The other processes must open the same file, and use the same data types and array dimensions.
This information could be provided to other processes from the main process via function arguments.
For example:
# task to work on shared numpy array
def task(array_filename, array_type, array_shape):
    # open existing memory mapped array in read/write mode
    data = memmap(array_filename, dtype=array_type, mode='r+', shape=array_shape)
    ...
You can learn more about sharing a numpy array between processes using a memory-mapped file in the tutorial:
We can explore an example of sharing a NumPy array between processes using a memory-mapped file.
In this case, we will define a NumPy array backed by a memory-mapped file. The main process will then initialize the array and flush the changes. A new child process will then be created and passed the details of the NumPy array backed by the memory-mapped file. It will load the array in a read/write mode, make changes, and then have those changes flushed. Finally, the main process will resume and confirm the changes made to the array by the child process.
Firstly, we can define the task function to execute in the child process.
The function will take the details of the array, including the name of the file and the size, assuming it is one-dimensional.
# load shared array
def task(filename, n):
    # ...
The function will then open the file-backed NumPy array in read/write mode and inspect the first few values.
...
# load the memory-mapped file
data = memmap(filename, dtype='float32', mode='r+', shape=(n,))
# check the status of the data
print(f'Child: {data[:10]}')
Next, the child process will update all data in the array, flush the change to the file, and confirm the array was updated.
...
# change the data
data[:] += 1
# flush the changes
data.flush()
# check the status of the data
print(f'Child: {data[:10]}')
Tying this together, the task() function below implements this.
# load shared array
def task(filename, n):
    # load the memory-mapped file
    data = memmap(filename, dtype='float32', mode='r+', shape=(n,))
    # check the status of the data
    print(f'Child: {data[:10]}')
    # change the data
    data[:] += 1
    # flush the changes
    data.flush()
    # check the status of the data
    print(f'Child: {data[:10]}')
Next, the main process will define the size of the array and where the array file will be stored. It then creates the array.
...
# define the size of the data
n = 1000
# define the filename
filename = 'data.np'
# create the memory-mapped file
data = memmap(filename, dtype='float32', mode='w+', shape=(n,))
Next, the main process initializes the array, flushes the change from memory to file, and confirms that the array data change was made.
...
# populate the array
data.fill(1.0)
# flush the changes
data.flush()
# check the status of the data
print(data[:10])
The main process then creates and starts the child process, passing the details of the memory-mapped file, including the filename and size of the array.
The main process blocks until the child process is done.
...
# create a child process
child = Process(target=task, args=(filename,n))
# start the child process
child.start()
# wait for the child process to complete
child.join()
Once done, the main process then inspects the content of the array and confirms that the child process made changes.
...
# check the status of the data
print(data[:10])
Tying this together, the complete example of sharing a memory-mapped file NumPy array is listed below.
# SuperFastPython.com
# example of sharing a memory-mapped numpy array between processes
from multiprocessing import Process
from numpy import memmap

# load shared array
def task(filename, n):
    # load the memory mapped file
    data = memmap(filename, dtype='float32', mode='r+', shape=(n,))
    # check the status of the data
    print(f'Child: {data[:10]}')
    # change the data
    data[:] += 1
    # flush the changes
    data.flush()
    # check the status of the data
    print(f'Child: {data[:10]}')

# protect the entry point
if __name__ == '__main__':
    # define the size of the data
    n = 1000
    # define the filename
    filename = 'data.np'
    # create the memory mapped file
    data = memmap(filename, dtype='float32', mode='w+', shape=(n,))
    # populate the array
    data.fill(1.0)
    # flush the changes
    data.flush()
    # check the status of the data
    print(data[:10])
    # create a child process
    child = Process(target=task, args=(filename,n))
    # start the child process
    child.start()
    # wait for the child process to complete
    child.join()
    # check the status of the data
    print(data[:10])
Running the example first defines the properties of the memory-mapped file, then creates the file on disk with the given data type and size.
If the file already exists, it is overwritten. This allows the program to be re-run many times without error.
Next, the array is initialized to one values and the changes are written to the file. The main process then reports the first 10 values of the array, and we can see that it was initialized correctly.
Next, the child process is created and configured to execute our task() function with the filename and array size as arguments.
The child process is started and the main process blocks until it is done.
The child process runs the task() function. It starts by opening the memory-mapped file with the given filename and size.
It is careful to open the file in read-write mode, assuming the file exists. If the file is opened in write/overwrite mode, the contents of the file written by the main process will be lost.
The child process reports the first 10 values in the array, confirming that the change made by the main process was written correctly and that the child process is able to see that change.
The child process then updates the content of the array, adding one to all values, then flushes the change to file.
The child process then reports the first 10 values of the array again, confirming that the content of the array was updated from one values to two values.
The child process is terminated and the main process resumes.
The main process finally reports the first 10 values of the array, showing that the change made by the child process was made and stored as expected and is visible to the main process.
This highlights how we can share a NumPy array between processes using a memory-mapped file.
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
Child: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
Child: [2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
[2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
Method 09: Share a NumPy Array Via a Manager
Another way we can share a numpy array efficiently between processes is to use a manager.
A multiprocessing Manager provides a way of creating centralized Python objects that can be shared safely among processes.
Manager objects create a server process that is used to host Python objects. Managers then return proxy objects used to interact with the hosted objects.
You can learn more about multiprocessing Managers in the tutorial:
A numpy array can be hosted by defining a custom Manager and configuring it to support numpy arrays.
This requires first defining a custom Manager that extends the BaseManager.
For example:
# custom manager to support custom classes
class CustomManager(BaseManager):
    # nothing
    pass
We can then register our numpy array with the custom manager via the register() function.
One approach is to register a numpy function used to create a numpy array on the server process, such as the numpy.ones() function.
For example:
...
# register a function for creating numpy arrays on the manager
CustomManager.register('shared_array', ones)
We can then create the custom manager to start the server process.
...
# create and start the custom manager
with CustomManager() as manager:
    # ...
The numpy array can then be created in the server process by calling the registered function, after which a proxy object is returned. The hosted numpy array can then be interacted with via the proxy object.
For example:
...
# create a shared numpy array
data_proxy = manager.shared_array((10,10))
The proxy object can then be passed between processes, allowing multiple processes to manipulate the same hosted numpy array.
You can learn more about hosting custom objects in manager processes in the tutorial:
You can learn more about sharing a numpy array between processes using a multiprocessing manager in the tutorial:
We can explore the case of sharing a numpy array hosted in a manager between processes.
In this example, we will create an array hosted in a manager process and report the sum of the values in the array. We will then start a child process, pass it the proxy object for the array, and have it perform the same sum operation on the array.
This will highlight how easy it is for multiple processes to operate on the same array efficiently via proxy objects.
Firstly, we will define the custom manager class so that we can register numpy arrays.
# custom manager to support custom classes
class CustomManager(BaseManager):
    # nothing
    pass
Next, we will define a function to execute in a child process. The function will take the proxy object for the numpy array and calculate the sum of values in the array.
# task executed in a child process
def task(data_proxy):
    # report details of the array
    print(f'Array sum (in child): {data_proxy.sum()}')
Next, in the main process, we will register a function for creating the hosted numpy array with the custom manager.
...
# register a function for creating numpy arrays on the manager
CustomManager.register('shared_array', ones)
Next, we will create and start the custom manager and create the hosted numpy array with 100,000,000 elements.
...
# create and start the custom manager
with CustomManager() as manager:
    # define the size of the numpy array
    n = 100000000
    # create a shared numpy array
    data_proxy = manager.shared_array((n,))
    print(f'Array created on host: {data_proxy}')
We will then calculate the sum of the values in the array on the server process, which we expect to equal 100,000,000, as all values equal one.
...
# confirm content
print(f'Array sum: {data_proxy.sum()}')
Finally, we will create and configure a child process, configured to execute our task() function, then start it and wait for it to terminate.
...
# start a child process
process = Process(target=task, args=(data_proxy,))
process.start()
process.join()
Tying this together, the complete example is listed below.
# share a numpy array between processes using a manager
from multiprocessing import Process
from multiprocessing.managers import BaseManager
from numpy import ones

# custom manager to support custom classes
class CustomManager(BaseManager):
    # nothing
    pass

# task executed in a child process
def task(data_proxy):
    # report details of the array
    print(f'Array sum (in child): {data_proxy.sum()}')

# protect the entry point
if __name__ == '__main__':
    # register a function for creating numpy arrays on the manager
    CustomManager.register('shared_array', ones)
    # create and start the custom manager
    with CustomManager() as manager:
        # define the size of the numpy array
        n = 100000000
        # create a shared numpy array
        data_proxy = manager.shared_array((n,))
        print(f'Array created on host: {data_proxy}')
        # confirm content
        print(f'Array sum: {data_proxy.sum()}')
        # start a child process
        process = Process(target=task, args=(data_proxy,))
        process.start()
        process.join()
Running the example first registers the function for creating numpy arrays with the custom manager.
Next, the custom manager is created and started.
The numpy array is created on the manager’s server process and a proxy object is returned. The proxy object is printed, providing a string representation of the hosted array, confirming it has one values.
The parent process then reports the sum of the values in the array. This is calculated on the server process and the value is then reported as 100 million, matching our expectations.
Next, a child process is configured to execute our task() function and pass the proxy object. The parent process then blocks until the child process terminates.
The child process runs and executes the task() function. The sum of the array is calculated on the server process and reported via the child process.
This highlights that both processes are easily able to operate directly upon the same hosted array.
Array created on host: array([1., 1., 1., ..., 1., 1., 1.])
Array sum: 100000000.0
Array sum (in child): 100000000.0
What Method Should You Use?
We have seen a number of ways that we might share NumPy arrays between processes.
These methods could be divided into two main approaches:
- Methods that copy the array
- Methods that share access to the same array.
Let’s review the methods through these groupings and some recommendations, assuming we need or want to share arrays between processes.
Methods That Copy NumPy Arrays
There are four methods that give the child process its own copy of the array, either inherited when the process is created or transmitted to the other process.
They are:
- Inherited array
- Function argument
- Pipe data structure
- Queue data structure
Other than inheritance, these approaches are generally slow because a copy of the array must be made in memory, then pickled, transmitted between processes, and unpickled at the other end.
The inter-process communication is generally slow, particularly because of the serialization and deserialization of the arrays.
As such, these methods are appropriate when the arrays to be transmitted are small, when the transmission is not frequent, and when changes to arrays in one process do not need to be visible in other processes.
- If many processes need access to the same data in an array, then inheritance should be used because it is faster than the other methods.
- If a small array needs to be shared with another process in a one-off manner, then a function argument is probably preferred for its simplicity.
- If a multi-step pipeline is operating on small NumPy arrays, like images or similar, then perhaps queues are appropriate (see the sketch after this list).
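To make that last recommendation concrete, below is a minimal sketch of a two-stage pipeline that passes small NumPy arrays between processes via a multiprocessing Queue. The array size, item count, and processing step are arbitrary placeholders for illustration, not a recommendation.

# sketch: a two-stage pipeline passing small numpy arrays via a queue
from multiprocessing import Process, Queue
from numpy import ones

# first stage: produce small arrays and put them on the queue
def producer(queue):
    for i in range(5):
        # create a small array, e.g. a tiny image-like block
        item = ones((64, 64)) * i
        queue.put(item)
    # signal that no more items are coming
    queue.put(None)

# second stage: consume arrays from the queue and process them
def consumer(queue):
    while True:
        item = queue.get()
        # stop on the sentinel that marks the end of the stream
        if item is None:
            break
        print(f'Consumed array with sum {item.sum()}')

# protect the entry point
if __name__ == '__main__':
    # shared queue connecting the two stages
    queue = Queue()
    # create and start both stages
    first = Process(target=producer, args=(queue,))
    second = Process(target=consumer, args=(queue,))
    first.start()
    second.start()
    # wait for both stages to finish
    first.join()
    second.join()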
Methods That Use Shared Memory
There are perhaps five methods that make use of an array backed by shared memory, or simulate this effect.
They are:
- Shared ctype Array
- Shared ctype RawArray
- SharedMemory
- Memory-Mapped File
- Manager
The benefit of these approaches is they are not limited by slow transmission of arrays via inter-process communication.
The shared downside is that some methods are limited to one-dimensional structures, requiring a transform to simulate multiple dimensions (a sketch of this follows the recommendations below). Other methods may require additional mechanisms, like mutex locks, to avoid race conditions.
- If we are not strictly limited to using a NumPy array, e.g. we don’t need the NumPy API methods and functions, then the shared ctype Array is preferred for speed, at the cost of making a copy of the array.
- If we want to operate on a structure that looks and feels like a NumPy array, then an array backed by a RawArray or SharedMemory buffer is preferred.
- If we don’t need to pass the array around via queues or function arguments, then reconstructing the array from the named SharedMemory buffer is an elegant solution.
- If the array is very large, then not loading it into memory and instead accessing it via a memory-mapped file is a preferred approach.
- If we just need to perform operations on the array or get summary statistics from the data, then hosting the array in a Manager and creating some helpful wrapper methods might be a preferred approach.
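As a concrete illustration of the one-dimensional limitation mentioned above, the sketch below allocates a flat shared ctype RawArray and views it as a two-dimensional NumPy array by reshaping. The dimensions and the 'd' (double/float64) type code are arbitrary placeholders for illustration.

# sketch: view a flat shared RawArray as a two-dimensional numpy array
from multiprocessing.sharedctypes import RawArray
from numpy import frombuffer, float64

# desired two-dimensional shape (placeholder values)
rows, cols = 100, 50
# allocate a flat shared buffer with one slot per element
buffer = RawArray('d', rows * cols)
# wrap the shared buffer as a numpy array and reshape it to two dimensions
data = frombuffer(buffer, dtype=float64).reshape(rows, cols)
# changes made via the numpy view are stored in the shared buffer
data[0, 0] = 99.0
print(data.shape, data[0, 0])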
Choose Based On Performance
The performance of these methods is a big concern.
Often, we want to choose a method for sharing a NumPy array that meets our requirements and is the fastest. This is because we are using multiprocessing to achieve a speedup through parallelism.
Here are some recommendations based on performance:
- If the shared array can be read-only, then the fastest method to share the array between processes is via inheritance.
- If multiple processes need to read and write the shared NumPy array then a memory-mapped file or shared memory-backed array is the fastest.
- Sharing NumPy arrays using queues and pipes is nearly two orders of magnitude slower than other methods (see the rough timing sketch after this list).
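As a rough timing sketch (not the benchmark from the tutorial linked below), the example underneath compares passing a large array to a child process as a pickled function argument versus attaching to the same data via a memory-mapped file. The array size and filename are arbitrary choices for illustration.

# rough timing sketch: copied argument vs memory-mapped file (illustrative only)
from multiprocessing import Process, set_start_method
from time import perf_counter
from numpy import ones, memmap

# task that receives its own pickled copy of the array
def task_copy(data):
    # touch the data so the copy is actually used
    print(f'Copy task sum: {data.sum()}')

# task that attaches to the shared array via a memory-mapped file
def task_mmap(filename, n):
    # open the existing file-backed array in read-only mode
    data = memmap(filename, dtype='float32', mode='r', shape=(n,))
    print(f'Memmap task sum: {data.sum()}')

# protect the entry point
if __name__ == '__main__':
    # use the spawn start method so function arguments are pickled to the child
    set_start_method('spawn')
    # number of elements in the test array (arbitrary size, about 40 MB of float32)
    n = 10000000
    # time sending a copy of the array as a function argument
    data = ones((n,), dtype='float32')
    start = perf_counter()
    child = Process(target=task_copy, args=(data,))
    child.start()
    child.join()
    print(f'Copy via argument: {perf_counter() - start:.3f} seconds')
    # time sharing the same data via a memory-mapped file
    filename = 'bench_data.np'
    shared = memmap(filename, dtype='float32', mode='w+', shape=(n,))
    shared.fill(1.0)
    shared.flush()
    start = perf_counter()
    child = Process(target=task_mmap, args=(filename, n))
    child.start()
    child.join()
    print(f'Memmap via file: {perf_counter() - start:.3f} seconds')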
You can learn more about how each approach to sharing a NumPy array compares to the others and which approach is the fastest in the tutorial:
Further Reading
This section provides additional resources that you may find helpful.
Books
- Concurrent NumPy in Python, Jason Brownlee (my book!)
Guides
- Concurrent NumPy 7-Day Course
- Which NumPy Functions Are Multithreaded
- Numpy Multithreaded Matrix Multiplication (up to 5x faster)
- NumPy vs the Global Interpreter Lock (GIL)
- ThreadPoolExecutor Fill NumPy Array (3x faster)
- Fastest Way To Share NumPy Array Between Processes
Documentation
- Parallel Programming with numpy and scipy, SciPy Cookbook, 2015
- Parallel Programming with numpy and scipy (older archived version)
- Parallel Random Number Generation, NumPy API
NumPy APIs
Concurrency APIs
- threading — Thread-based parallelism
- multiprocessing — Process-based parallelism
- concurrent.futures — Launching parallel tasks
Takeaways
You now know a suite of approaches that you can use to share a NumPy array between Python processes.
Join the discussion on HackerNews.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.
Photo by Karsten Winegeart on Unsplash
Dave says
So which one was fastest? Bang for the buck? Thanks for the very thorough article btw!
Jason Brownlee says
Great question.
Generally, I’d recommend benchmarking a few approaches for your use case.
Off the cuff, inheriting data from a parent process is VERY fast, for example:
https://superfastpython.com/multiprocessing-inherit-vs-sending-data/
Sharing a RawArray-backed or SharedMemory-backed data around can be efficient, test this. For example:
https://superfastpython.com/numpy-array-shared-ctype-rawarray/
And:
https://superfastpython.com/numpy-array-sharedmemory/
Avoid a queue and pipe, both are very slow, requiring pickling.
Generally, threads can be used instead of processes for most numpy functions, avoiding the problem entirely!
https://superfastpython.com/numpy-vs-gil/
idnsunset says
Hi, Json, this article really helps. My scenario is usually sharing numpy arrays (images in big size) between different program languages, e.g. between Python and C/C++ applications. Python had no built-in shared memory support before Python 3.8, considering there are some differences between posix and sysv shared memory implementations, plus multi-process synchronization requiring some extra efforts, it is unclear for inexperienced developers how to handle it correctly. So, if possible, could you please give some instructions on how to use shared memory properly between Python and C/C++.
Jason Brownlee says
I’m happy to hear that it helps!
Great question. Sorry, I don’t have any tutorials on this topic. I may write about it in the future.
Off the cuff, I would prefer to use a shared resource (disk/database) when sharing data between programs, and allow the resource to manage the high performance aspect, e.g. keep data in RAM/cache.
Paul Ross says
Great article Jason,
Regarding method 07 using multiprocessing.shared_memory I have an example of computing a rolling median on a large numpy array using one process per column (with benchmarks). Here: https://skiplist.readthedocs.io/en/latest/rolling_median.html#rolling-median-mp-shared-memory-label
Project code is here: https://github.com/paulross/skiplist/
PyPi: https://pypi.org/project/orderedstructs/
I hope that it is useful.
Paul.
Jason Brownlee says
Very cool Paul, thank you for sharing.