Last Updated on September 29, 2023
You can share a NumPy array between processes by using a memory-mapped file.
Each process is able to access the memory-mapped file directly. This means we avoid the overhead of transmitting the array itself between processes; each process only needs the details of the file, such as its name, in order to open the same array.
In this tutorial, you will discover how to share a NumPy array between processes using a memory-mapped file.
Let’s get started.
Need to Share NumPy Array Between Processes
Python offers process-based concurrency via the multiprocessing module.
Process-based concurrency is appropriate for those tasks that are CPU-bound, as opposed to thread-based concurrency in Python which is generally suited to IO-bound tasks given the presence of the Global Interpreter Lock (GIL).
Consider the situation where we need to share NumPy arrays between processes.
This may be for many reasons, such as:
- Data is loaded as an array in one process and analyzed differently in different subprocesses.
- Many child processes each load small amounts of data as arrays that are sent to a parent process for handling.
- Data arrays are loaded in the parent process and processed in a suite of child processes.
Sharing Python objects and data between processes is slow.
This is because any data, like NumPy arrays, shared between processes must be transmitted using inter-process communication (IPC), requiring the data to first be pickled by the sender and then unpickled by the receiver.
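As a minimal sketch of this copy behavior (the work() function and variable names here are my own, not from the original), changes made by a child process to an array it received as an argument are not visible in the parent:

# sketch: an array passed to a child process is copied, not shared
from multiprocessing import Process
from numpy import ones

# child task that modifies the array it received
def work(arr):
    # this changes the child's private copy only
    arr[:] = 0

# protect the entry point
if __name__ == '__main__':
    data = ones(10)
    # pass the array to the child, which copies it (e.g. via pickling)
    child = Process(target=work, args=(data,))
    child.start()
    child.join()
    # still all ones: the child modified its own copy of the array
    print(data)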
This means that sharing NumPy arrays between processes only makes sense if we receive some benefit, such as a speedup, that overcomes the slow speed of data transmission.
For example, it may be the case that the arrays are relatively small and fast to transmit, whereas the computation performed on each array is slow and can benefit from being performed in separate processes.
Alternatively, preparing the array may be computationally expensive and benefit from being performed in a separate process, and once prepared, the arrays are small and fast to transmit to another process that requires them.
Given these situations, how can we share data between processes in Python?
What is a NumPy Memory-Mapped File
A memory-mapped file is a structure that allows data to look and be used as though it exists in main memory, when in fact it is stored in a file on disk.
It allows very large data structures to be read and written without having to hold the entire contents of the structure in main memory, as is normally required.
A memory-mapped file is a segment of virtual memory[1] that has been assigned a direct byte-for-byte correlation with some portion of a file or file-like resource. This resource is typically a file that is physically present on disk, but can also be a device, shared memory object, or other resource that the operating system can reference through a file descriptor.
— Memory-mapped file, Wikipedia.
NumPy offers the ability for an array to be stored in a memory-mapped file and used as though the array exists in main memory.
This capability is provided via the numpy.memmap() function.
Memory-mapped files are used for accessing small segments of large files on disk, without reading the entire file into memory. NumPy’s memmap’s are array-like objects.
— numpy.memmap API.
To use a memory-mapped file NumPy array, the numpy.memmap() function can be called to define the array.
This includes the filename where the array data will be stored, the data type of the array, the mode in which the array will be used (e.g. read, write, etc.), and the dimensions or shape of the array.
For example, we can create a memory-mapped NumPy array in the current working directory with the filename “data.np” that holds floating-point values, can be both read and written, and is one-dimensional with 1,000 elements.
...
# create a memory-mapped numpy array
data = memmap('data.np', dtype='float32', mode='w+', shape=(1000,))
The open modes are just like those used to open a file on disk, for example:
- ‘r’ Open the array file in read-only mode.
- ‘r+’ Open an existing array file for reading and writing.
- ‘w+’ Create or overwrite an existing file for reading and writing.
- ‘c’ Only make changes to the data in memory, called copy-on-write.
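As a minimal sketch of these modes (using the same ‘data.np’ file as above; the variable names are mine):

...
# create or overwrite the backing file for reading and writing
data = memmap('data.np', dtype='float32', mode='w+', shape=(1000,))
data.fill(1.0)
data.flush()
# reopen the existing file for reading and writing, preserving contents
shared = memmap('data.np', dtype='float32', mode='r+', shape=(1000,))
# reopen in read-only mode; any assignment would raise an error
readonly = memmap('data.np', dtype='float32', mode='r', shape=(1000,))
# reopen copy-on-write; assignments change memory only, never the file
cow = memmap('data.np', dtype='float32', mode='c', shape=(1000,))
cow.fill(2.0)  # visible via cow only, 'data.np' on disk is unchanged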
The array can then be used as normal, such as populating it with initial values via the fill() method.
...
# populate the array
data.fill(1.0)
Generally, changes to the array are flushed to the file automatically when we are finished with it and the array object is deleted.
Nevertheless, it is good practice to flush changes from memory back to the file storage via the flush() method once a block of atomic changes has been made.
Flush the memmap instance to write the changes to the file.
— numpy.memmap API.
For example:
...
# write changes to file
data.flush()
Now that we know what a NumPy memory-mapped file is, let’s look at how we might use it to share an array between processes.
How to Share NumPy Array Using Memory-Mapped File
We can use a NumPy memory-mapped file to share a NumPy array between processes.
If no array exists, one process can be responsible for creating the array.
This can be achieved by specifying the filename and the dimensions of the array and opening it in create mode, e.g. ‘w+’.
For example:
...
# create a memory-mapped numpy array
data = memmap('data.np', dtype='float32', mode='w+', shape=(1000,))
Changes can then be made to the array and flushed to the file with a call to the flush() method.
For example:
...
# write changes to file
data.flush()
Once the file exists on disk, other processes can then open the file.
This might be in read-only mode, e.g. ‘r’, if the other processes only need to read the data.
For example:
...
# open existing memory-mapped array in read-only mode
data = memmap('data.np', dtype='float32', mode='r', shape=(1000,))
Alternatively, if the processes need to read and then write the data in the array, it can be opened in read-write mode and changes can then be flushed once the process is finished.
For example:
...
# open existing memory-mapped array in read/write mode
data = memmap('data.np', dtype='float32', mode='r+', shape=(1000,))
# make changes
...
# write changes to file
data.flush()
The other processes must open the same file, and use the same data types and array dimensions.
This information could be provided to other processes from the main process via function arguments.
For example:
# task to work on shared numpy array
def task(array_filename, array_type, array_shape):
    # open existing memory-mapped array in read/write mode
    data = memmap(array_filename, dtype=array_type, mode='r+', shape=array_shape)
    ...
Finally, it is possible for two or more processes to open the same memory-mapped file array and attempt to make changes at the same time.
Therefore, it may be good practice to protect critical sections that operate on the shared array using a mutex lock. This ensures that only one process is able to read or write the shared array at a time.
This can be achieved via a shared Lock instance that could be provided to task functions, along with the details of the memory-mapped array.
For example:
# task to work on shared numpy array
def task(array_filename, array_type, array_shape, array_lock):
    # acquire lock on file first
    with array_lock:
        # open existing memory-mapped array in read/write mode
        data = memmap(array_filename, dtype=array_type, mode='r+', shape=array_shape)
        ...
Now that we know how to use the NumPy memory-mapped array, let’s look at some worked examples.
Example of Sharing a NumPy Array Using Memory-Mapped File
We can explore an example of sharing a NumPy array between processes using a memory-mapped file.
In this case, we will define a NumPy array backed by a memory-mapped file. The main process will then initialize the array and flush the changes. A new child process will then be created and passed the details of the NumPy array backed by the memory-mapped file. It will load the array in a read/write mode, make changes, and then have those changes flushed. Finally, the main process will resume and confirm the changes made to the array by the child process.
Firstly, we can define the task function to execute in the child process.
The function will take the details of the array, including the name of the file and the size, assuming it is one-dimensional.
# load shared array
def task(filename, n):
    # ...
The function will then open the file-backed NumPy array in read/write mode and inspect the first few values.
...
# load the memory-mapped file
data = memmap(filename, dtype='float32', mode='r+', shape=(n,))
# check the status of the data
print(f'Child: {data[:10]}')
Next, the child process will update all data in the array, flush the change to the file, and confirm the array was updated.
...
# change the data
data[:] += 1
# flush the changes
data.flush()
# check the status of the data
print(f'Child: {data[:10]}')
Tying this together, the task() function below implements this.
# load shared array
def task(filename, n):
    # load the memory-mapped file
    data = memmap(filename, dtype='float32', mode='r+', shape=(n,))
    # check the status of the data
    print(f'Child: {data[:10]}')
    # change the data
    data[:] += 1
    # flush the changes
    data.flush()
    # check the status of the data
    print(f'Child: {data[:10]}')
Next, the main process will define the size of the array and where the array file will be stored. It then creates the array.
...
# define the size of the data
n = 1000
# define the filename
filename = 'data.np'
# create the memory-mapped file
data = memmap(filename, dtype='float32', mode='w+', shape=(n,))
Next, the main process initializes the array, flushes the change from memory to file, and confirms that the array data change was made.
...
# populate the array
data.fill(1.0)
# flush the changes
data.flush()
# check the status of the data
print(data[:10])
The main process then creates and starts the child process, passing the details of the memory-mapped file, including the filename and size of the array.
The main process blocks until the child process is done.
...
# create a child process
child = Process(target=task, args=(filename, n))
# start the child process
child.start()
# wait for the child process to complete
child.join()
Once done, the main process then inspects the content of the array and confirms that the child process made changes.
...
# check the status of the data
print(data[:10])
Tying this together, the complete example of sharing a memory-mapped file NumPy array is listed below.
# SuperFastPython.com
# example of sharing a memory-mapped numpy array between processes
from multiprocessing import Process
from numpy import memmap

# load shared array
def task(filename, n):
    # load the memory-mapped file
    data = memmap(filename, dtype='float32', mode='r+', shape=(n,))
    # check the status of the data
    print(f'Child: {data[:10]}')
    # change the data
    data[:] += 1
    # flush the changes
    data.flush()
    # check the status of the data
    print(f'Child: {data[:10]}')

# protect the entry point
if __name__ == '__main__':
    # define the size of the data
    n = 1000
    # define the filename
    filename = 'data.np'
    # create the memory-mapped file
    data = memmap(filename, dtype='float32', mode='w+', shape=(n,))
    # populate the array
    data.fill(1.0)
    # flush the changes
    data.flush()
    # check the status of the data
    print(data[:10])
    # create a child process
    child = Process(target=task, args=(filename, n))
    # start the child process
    child.start()
    # wait for the child process to complete
    child.join()
    # check the status of the data
    print(data[:10])
Running the example first defines the properties of the memory-mapped file, then creates the file on disk with the given data type and size.
If the file already exists, it is overwritten. This allows the program to be re-run many times without error.
Next, the array is initialized to ones and the changes are written to the file. The main process then reports the first 10 values of the array, and we can see that it was initialized correctly.
Next, the child process is created and configured to execute our task() function with the filename and array size as arguments.
The child process is started and the main process blocks until it is done.
The child process runs the task() function. It starts by opening the memory-mapped file with the given filename and size.
It is careful to open the file in read-write mode (‘r+’), which assumes the file already exists. If the file were opened in write/overwrite mode (‘w+’), the contents written by the main process would be lost.
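To make this concrete, a hedged sketch of the wrong and right mode choices inside the task() function:

...
# WRONG: 'w+' would create a new zero-filled file, destroying the
# values already written by the main process
# data = memmap(filename, dtype='float32', mode='w+', shape=(n,))
# RIGHT: 'r+' opens the existing file and preserves its contents
data = memmap(filename, dtype='float32', mode='r+', shape=(n,))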
The child process reports the first 10 values in the array, confirming that the change made by the main process was written correctly and that the child process is able to see that change.
The child process then updates the content of the array, adding one to all values, then flushes the change to file.
The child process then reports the first 10 values of the array again, confirming that the content of the array was updated from ones to twos.
The child process is terminated and the main process resumes.
The main process finally reports the first 10 values of the array, showing that the change made by the child process was made and stored as expected and is visible to the main process.
This highlights how we can share a NumPy array between processes using a memory-mapped file.
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
Child: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
Child: [2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
[2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
Example of Sharing Memory-Mapped File Array With Many Processes
We can explore the case of having many processes updating the same memory-mapped file NumPy array in parallel.
In this case, we will update the above example to have a larger array of 100 million items and then have 100 child processes all attempt to update the content of the array at the same time by adding 1 to the values.
We expect the final array to contain the value 100 in every position, with one added by each child process.
All explicit flushes are also removed from the example, in an effort to force a race condition.
The complete example is listed below.
# SuperFastPython.com
# example of many child processes changing the same memory-mapped file array
from multiprocessing import Process
from numpy import memmap

# load shared array
def task(filename, shape):
    # load the memory-mapped file
    data = memmap(filename, dtype='float32', mode='r+', shape=shape)
    # change the data
    data[:] += 1

# protect the entry point
if __name__ == '__main__':
    # define the size of the data
    shape = (100000000,)
    # define the filename
    filename = 'data.np'
    # create the memory-mapped file
    data = memmap(filename, dtype='float32', mode='w+', shape=shape)
    # populate the array
    data.fill(0.0)
    # check the status of the data
    print(data[:10])
    # create many child tasks
    children = [Process(target=task, args=(filename, shape)) for _ in range(100)]
    # start all child tasks
    for child in children:
        child.start()
    # wait for all child tasks to complete
    for child in children:
        child.join()
    # check the status of the data
    print(data[:10])
Running the example defines the array with 100 million items initialized to zero values.
A total of 100 child processes are then created and started, all configured to load the same memory-mapped file NumPy array and add one to all values in the array.
The main process waits for all child processes to complete and then reports the first 10 values.
We can see that the array appears to have the expected values of 100, with one added for each child process that modified the shared array.
It seems that even with all 100 child processes modifying the same memory-mapped file and without explicit flushing of changes, no race condition occurred.
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[100. 100. 100. 100. 100. 100. 100. 100. 100. 100.]
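As an aside, inspecting only the first 10 values does not confirm the entire array. A check over all 100 million elements (my addition, not part of the original example) could be added in the main process after the join:

...
# confirm that every element of the shared array equals 100
assert (data == 100).all()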
It is possible that the underlying memory-mapped file used by NumPy is thread- and process-safe, either generally or on some platforms. Or it may be that it is not thread- and process-safe, but that this simple example is not enough to trigger a race condition.
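For instance, a variation of the task() function that performs many small read-modify-write updates (a sketch of my own, not from the original) might give the operating system more opportunities to interleave processes and expose lost updates:

# load shared array and update it many times without protection
def task(filename, shape):
    # load the memory-mapped file
    data = memmap(filename, dtype='float32', mode='r+', shape=shape)
    # many unprotected read-modify-write cycles on one element
    for _ in range(1000):
        data[0] += 1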
Nevertheless, we can enforce safety as an exercise. This can be achieved by updating the example so that all changes to the memory-mapped file NumPy array are mutually exclusive.
This can be achieved by creating a mutex lock and sharing it with each child process.
...
# create the shared lock
lock = Lock()
# create many child tasks
children = [Process(target=task, args=(filename, shape, lock)) for _ in range(100)]
Each child process can then explicitly acquire the lock before operating on the shared array, ensuring only one process changes the array at a time.
# load shared array
def task(filename, shape, lock):
    # acquire lock
    with lock:
        # load the memory-mapped file
        data = memmap(filename, dtype='float32', mode='r+', shape=shape)
        # change the data
        data[:] += 1
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of many child processes changing the same memory-mapped file array
from multiprocessing import Process
from multiprocessing import Lock
from numpy import memmap

# load shared array
def task(filename, shape, lock):
    # acquire lock
    with lock:
        # load the memory-mapped file
        data = memmap(filename, dtype='float32', mode='r+', shape=shape)
        # change the data
        data[:] += 1

# protect the entry point
if __name__ == '__main__':
    # define the size of the data
    shape = (100000000,)
    # define the filename
    filename = 'data.np'
    # create the memory-mapped file
    data = memmap(filename, dtype='float32', mode='w+', shape=shape)
    # populate the array
    data.fill(0.0)
    # check the status of the data
    print(data[:10])
    # create the shared lock
    lock = Lock()
    # create many child tasks
    children = [Process(target=task, args=(filename, shape, lock)) for _ in range(100)]
    # start all child tasks
    for child in children:
        child.start()
    # wait for all child tasks to complete
    for child in children:
        child.join()
    # check the status of the data
    print(data[:10])
Running the example defines the memory-mapped file NumPy array with 100 million items initialized to zero values as before.
A shared mutex lock is created and shared with all 100 child processes that are created and started.
The child processes make use of the lock, acquiring it before updating the content of the shared array, and then releasing the lock again before terminating.
The main process reports the first 10 values of the shared array, confirming that all 100 child processes performed their updates as before.
Although there is no change in the results for this example, we can be confident that all updates to the shared array by child processes are mutually exclusive and not open to a race condition.
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[100. 100. 100. 100. 100. 100. 100. 100. 100. 100.]
Further Reading
This section provides additional resources that you may find helpful.
Books
- Concurrent NumPy in Python, Jason Brownlee (my book!)
Guides
- Concurrent NumPy 7-Day Course
- Which NumPy Functions Are Multithreaded
- Numpy Multithreaded Matrix Multiplication (up to 5x faster)
- NumPy vs the Global Interpreter Lock (GIL)
- ThreadPoolExecutor Fill NumPy Array (3x faster)
- Fastest Way To Share NumPy Array Between Processes
Documentation
- Parallel Programming with numpy and scipy, SciPy Cookbook, 2015
- Parallel Programming with numpy and scipy (older archived version)
- Parallel Random Number Generation, NumPy API
NumPy APIs
- numpy.memmap API
Concurrency APIs
- threading — Thread-based parallelism
- multiprocessing — Process-based parallelism
- concurrent.futures — Launching parallel tasks
Takeaways
You now know how to share a NumPy array between processes using a memory-mapped file.
Did I make a mistake? See a typo?
I’m a simple humble human. Correct me, please!
Do you have any additional tips?
I’d love to hear about them!
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.