Transmitting data between processes using a queue or a pipe requires that the data be pickled.
This is much slower than having a child process inherit the data from the parent process, either as a global variable or as an argument passed when the process is created.
In this tutorial, you will discover the speed differences between inheriting and transmitting data between processes.
Let’s get started.
Inherit Data vs Transmit Data Between Processes
Sharing data between processes in Python is slow.
It is slow because data must be serialized (pickled) before it is transmitted, then deserialized (unpickled) at the other end.
This adds a computational cost to every byte of data sent from one process to another.
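To get a feel for this overhead, the short sketch below, which I have added as an illustration and which is not one of the experiments in this tutorial, times a pickle/unpickle round trip of a list of one million integers. This is roughly the serialization work a queue or pipe performs behind the scenes for each transmission.

# SuperFastPython.com
# a minimal sketch of the pickle/unpickle round-trip cost
import pickle
from time import time

# create a payload of 1,000,000 integers
data = [i for i in range(1000000)]
# record the start time
time_start = time()
# serialize (pickle) the data, as a queue or pipe would
packed = pickle.dumps(data)
# deserialize (unpickle) the data at the other end
data2 = pickle.loads(packed)
# report how long the round trip took
print(f'Round trip took {time() - time_start} seconds')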
For example, recent experiments suggest that it is up to 4x slower to transmit data between processes using a queue than to share data between threads using an equivalent program structure.
You can learn more about the experiments that produced this result in the tutorial:
The Python multiprocessing API documentation warns about this cost. It suggests generally avoiding sharing data between processes and instead inheriting data from a parent process whenever possible.
Better to inherit than pickle/unpickle
When using the spawn or forkserver start methods many types from multiprocessing need to be picklable so that child processes can use them. However, one should generally avoid sending shared objects to other processes using pipes or queues. Instead you should arrange the program so that a process which needs access to a shared resource created elsewhere can inherit it from an ancestor process.
— multiprocessing — Process-based parallelism
This raises some important questions, such as:
- How much slower is it to transmit data to another process than to inherit data from a parent process?
It also raises related questions, such as:
- How does transmitting data between processes via a queue compare to using a pipe?
- How does transmitting data via a function argument compare to sharing data via a queue?
We can develop some experiments to explore these questions and gather some real numbers for comparison.
Benchmark Inherit Data From Parent Process
We can benchmark the speed of a child process inheriting data from a parent process.
In this example, we will repeatedly create a new child process that inherits data, and time how long it takes.
Each process started will execute a target function that does nothing other than access the inherited data via a global variable.
For example:
# task to run in a new process
def task():
    # declare global variable
    global data
    # access and copy global data
    data2 = data
We can define a function that creates a process to execute our target function, starts the process, then waits for it to complete, in a loop that repeats a specified number of times.
Importantly, before running the loop of tests, we will declare a global variable and then assign it some data. In this case, a list of 1,000,000 integer values.
# run a test and time how long it takes
def test(n_repeats):
    # declare global variable
    global data
    # create the data
    data = [i for i in range(1000000)]
    # repeat many times
    for i in range(n_repeats):
        # create the process
        process = Process(target=task)
        # start the process
        process.start()
        # wait for the process to complete
        process.join()
You can learn more about starting a child process to run a target function in the tutorial:
We can then call our test() function with a given number of loop iterations, in this case 1,000, and time how long it takes to complete in seconds.
Importantly, we must set the ‘fork’ start method to be used when creating the child process.
This allows the global variable defined in the parent process to be inherited by a child process when it is created.
Note, the fork start method is not supported on Windows. If you are a Windows user, you will not be able to execute this part of the experiment.
You can learn more about process start methods in the tutorial:
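As an aside, you can check at runtime whether fork is available on your platform. Below is a minimal sketch using the get_all_start_methods() function from the multiprocessing module.

# SuperFastPython.com
# a minimal sketch that checks whether the fork start method is available
from multiprocessing import get_all_start_methods
from multiprocessing import set_start_method

# entry point
if __name__ == '__main__':
    # report the start methods supported on this platform
    methods = get_all_start_methods()
    print(methods)
    # only set fork if the platform supports it (e.g. not Windows)
    if 'fork' in methods:
        set_start_method('fork')
    else:
        print('fork is not supported on this platform')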
# entry point
if __name__ == '__main__':
    # set the fork start method so we can inherit data
    set_start_method('fork')
    # record the start time
    time_start = time()
    # perform the test
    n_repeats = 1000
    test(n_repeats)
    # record the end time
    time_end = time()
    # report the total time
    duration = time_end - time_start
    print(f'Total Time {duration} seconds')
    # report estimated time per test
    per_test = duration / n_repeats
    print(f'About {per_test} seconds per process')
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of inheriting data from parent process
from time import time
from multiprocessing import Process
from multiprocessing import set_start_method

# task to run in a new process
def task():
    # declare global variable
    global data
    # access and copy global data
    data2 = data

# run a test and time how long it takes
def test(n_repeats):
    # declare global variable
    global data
    # create the data
    data = [i for i in range(1000000)]
    # repeat many times
    for i in range(n_repeats):
        # create the process
        process = Process(target=task)
        # start the process
        process.start()
        # wait for the process to complete
        process.join()

# entry point
if __name__ == '__main__':
    # set the fork start method so we can inherit data
    set_start_method('fork')
    # record the start time
    time_start = time()
    # perform the test
    n_repeats = 1000
    test(n_repeats)
    # record the end time
    time_end = time()
    # report the total time
    duration = time_end - time_start
    print(f'Total Time {duration} seconds')
    # report estimated time per test
    per_test = duration / n_repeats
    print(f'About {per_test} seconds per process')
Running the example first sets the start method to ‘fork’.
It then records the start time and executes the test() function that starts a new child process 1,000 times.
Each child process that is created inherits the data created in the parent process.
This is a copy of the data, given that a forked child process is a copy of the parent process; the sketch after the results below confirms that the child's copy is independent of the parent's data.
The end time is recorded and the total time in seconds is reported. Because we know the number of processes created, we can also estimate how long each process takes to create.
In this case, the experiment took about 2.4 seconds to create 1,000 child processes that inherited the data, which was about 2.4 milliseconds per process.
Total Time 2.4471421241760254 seconds
About 0.0024471421241760256 seconds per process
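As an aside, we can confirm that the child really does get its own copy of the inherited data rather than a shared reference. In the minimal sketch below, added for illustration, the child appends to the inherited list, yet the parent's list is unchanged afterward.

# SuperFastPython.com
# a minimal sketch showing the child gets its own copy of inherited data
from multiprocessing import Process
from multiprocessing import set_start_method

# task that modifies the inherited global data
def task():
    # declare global variable
    global data
    # change the child's copy of the data
    data.append(999)
    print(f'Child sees {len(data)} items')

# entry point
if __name__ == '__main__':
    # use fork so the child inherits the global variable
    set_start_method('fork')
    # create the data in the parent
    data = [1, 2, 3]
    # start the child and wait for it to finish
    process = Process(target=task)
    process.start()
    process.join()
    # the parent's copy is unchanged by the child
    print(f'Parent sees {len(data)} items')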
Next, let’s repeat the experiment, this time passing the data to the child process as an argument instead of inheriting it.
Benchmark Transmit Data From Parent Process As Argument
We can explore a benchmark of how long it takes to transmit data from the parent process to the child process as a function argument.
This can be achieved by creating the data list and passing it to the child process when it is created.
Firstly, we can update the task() target function to receive the data as a function argument.
# task to run in a new process
def task(data):
    # access the data
    data2 = data
Next, we can update the test() function to not declare “data” as a global variable, and instead pass it to the child process when it is created via the “args” argument.
# run a test and time how long it takes
def test(n_repeats):
    # create the data to share
    data = [i for i in range(1000000)]
    # repeat many times
    for i in range(n_repeats):
        # create the process
        process = Process(target=task, args=(data,))
        # start the process
        process.start()
        # wait for the process to complete
        process.join()
And that’s it.
The complete example is listed below.
# SuperFastPython.com
# example of transmitting data to child process via an argument
from time import time
from multiprocessing import Process
from multiprocessing import set_start_method

# task to run in a new process
def task(data):
    # access the data
    data2 = data

# run a test and time how long it takes
def test(n_repeats):
    # create the data to share
    data = [i for i in range(1000000)]
    # repeat many times
    for i in range(n_repeats):
        # create the process
        process = Process(target=task, args=(data,))
        # start the process
        process.start()
        # wait for the process to complete
        process.join()

# entry point
if __name__ == '__main__':
    # set the fork start method, to be consistent
    set_start_method('fork')
    # record the start time
    time_start = time()
    # perform the test
    n_repeats = 1000
    test(n_repeats)
    # record the end time
    time_end = time()
    # report the total time
    duration = time_end - time_start
    print(f'Total Time {duration} seconds')
    # report estimated time per test
    per_test = duration / n_repeats
    print(f'About {per_test} seconds per process')
Running the example first sets the start method to ‘fork’. This is done so that the method used to create processes is consistent across all experiments.
The start time is then recorded and the test() function is called that starts a new child process 1,000 times.
Each child process that is created receives the data list as an argument to the target function.
This is a copy of the list.
The end time is recorded and the total time in seconds is reported. Because we know the number of processes created, we can also estimate how long each process takes to create.
In this case, the experiment took about 2.2 seconds to create 1,000 child processes that received the data as an argument, which was about 2.2 milliseconds per process.
Total Time 2.243790864944458 seconds
About 0.002243790864944458 seconds per process
Next, let’s repeat the experiment and instead transmit the data to the child process via a queue.
Benchmark Transmit Data From Parent Process With Queue
We can update the experiment to transmit the data from the parent to the child process using a queue.
This can be achieved by creating the data and queue once for the experiment, then sharing the queue with each child process. The parent process sends the data to the child via the queue, while the task executed in the child process consumes the data from the queue.
For more on sharing data between processes using a queue, see the tutorial:
Firstly, we must update the task() function to take a queue as an argument, then block and read the data from the queue.
# task to run in a new process
def task(queue):
    # get data
    data = queue.get()
Next, we can update the test() function to create the shared queue, then pass the shared queue as an argument to each child process.
After creating the child process, the parent process can put the data on the queue, which will be consumed by the child process.
# run a test and time how long it takes
def test(n_repeats):
    # create the data to share
    data = [i for i in range(1000000)]
    # create the shared queue
    queue = Queue()
    # repeat many times
    for i in range(n_repeats):
        # create the process
        process = Process(target=task, args=(queue,))
        # start the process
        process.start()
        # transmit the data
        queue.put(data)
        # wait for the process to complete
        process.join()
And that’s it.
Tying this together, the complete example with these changes is listed below.
# SuperFastPython.com
# example of transmitting data to child process via a shared queue
from time import time
from multiprocessing import Process
from multiprocessing import Queue
from multiprocessing import set_start_method

# task to run in a new process
def task(queue):
    # get data
    data = queue.get()

# run a test and time how long it takes
def test(n_repeats):
    # create the data to share
    data = [i for i in range(1000000)]
    # create the shared queue
    queue = Queue()
    # repeat many times
    for i in range(n_repeats):
        # create the process
        process = Process(target=task, args=(queue,))
        # start the process
        process.start()
        # transmit the data
        queue.put(data)
        # wait for the process to complete
        process.join()

# entry point
if __name__ == '__main__':
    # set the fork start method, to be consistent
    set_start_method('fork')
    # record the start time
    time_start = time()
    # perform the test
    n_repeats = 1000
    test(n_repeats)
    # record the end time
    time_end = time()
    # report the total time
    duration = time_end - time_start
    print(f'Total Time {duration} seconds')
    # report estimated time per test
    per_test = duration / n_repeats
    print(f'About {per_test} seconds per process')
Running the example first sets the start method to ‘fork’. This is done so that the method used to create processes is consistent across all experiments.
The start time is then recorded and the test() function is called that starts a new child process 1,000 times.
The test() function first creates the data and then the shared queue.
Each child process that is created receives the shared queue as an argument. The child process runs and blocks, waiting for the data to arrive via the queue. The parent process puts the list on the queue and waits for the child process to complete. The child process consumes the data and terminates.
The data received by the child process is a copy of the list shared by the parent process.
The end time is recorded and the total time in seconds is reported. Because we know the number of processes created, we can also estimate how long each process takes to create.
In this case, the experiment took about 82.8 seconds to create 1,000 child processes that received the data via the queue, which was about 82.8 milliseconds per process.
Total Time 82.83423495292664 seconds
About 0.08283423495292663 seconds per process
Next, we can update the example to transmit data from parent to child process using a pipe.
Benchmark Transmit Data From Parent Process With Pipe
We can update the experiment to transmit data from parent to child using a pipe instead of a queue.
This requires creating a pipe in the test() function, sharing the receiving connection with the child process, and using the sending connection to transmit the data to the child process.
You can learn more about transmitting data between processes using a pipe in the tutorial:
Firstly, we must update the task() function to take the receiving connection as an argument, then block and read the data object by calling the recv() method.
# task to run in a new process
def task(conn_recv):
    # get data
    data = conn_recv.recv()
Next, we can update the test() function to create the Pipe which returns the send and receive connections. The receive connection is then provided as an argument to each child process that is created, and the send connection is used to transmit the data to the child process via the send() method.
# run a test and time how long it takes
def test(n_repeats):
    # create the data to share
    data = [i for i in range(1000000)]
    # create the shared pipe
    conn_recv, conn_send = Pipe()
    # repeat many times
    for i in range(n_repeats):
        # create the process
        process = Process(target=task, args=(conn_recv,))
        # start the process
        process.start()
        # transmit the data
        conn_send.send(data)
        # wait for the process to complete
        process.join()
And that’s it.
Tying this together, the complete example with these changes is listed below.
# SuperFastPython.com
# example of transmitting data to child process via a shared pipe
from time import time
from multiprocessing import Process
from multiprocessing import Pipe
from multiprocessing import set_start_method

# task to run in a new process
def task(conn_recv):
    # get data
    data = conn_recv.recv()

# run a test and time how long it takes
def test(n_repeats):
    # create the data to share
    data = [i for i in range(1000000)]
    # create the shared pipe
    conn_recv, conn_send = Pipe()
    # repeat many times
    for i in range(n_repeats):
        # create the process
        process = Process(target=task, args=(conn_recv,))
        # start the process
        process.start()
        # transmit the data
        conn_send.send(data)
        # wait for the process to complete
        process.join()

# entry point
if __name__ == '__main__':
    # set the fork start method, to be consistent
    set_start_method('fork')
    # record the start time
    time_start = time()
    # perform the test
    n_repeats = 1000
    test(n_repeats)
    # record the end time
    time_end = time()
    # report the total time
    duration = time_end - time_start
    print(f'Total Time {duration} seconds')
    # report estimated time per test
    per_test = duration / n_repeats
    print(f'About {per_test} seconds per process')
Running the example first sets the start method to ‘fork’. This is done so that the method used to create processes is consistent across all experiments.
The start time is then recorded and the test() function is called that starts a new child process 1,000 times.
The test() function first creates the data and then the Pipe with send and receive connections.
Each child process that is created receives the receive-side connection as an argument. The child process runs and blocks, waiting for the data to arrive via the pipe. The parent process sends the data on the pipe and waits for the child process to complete. The child process reads the data from the pipe and terminates.
The data received by the child process is a copy of the list shared by the parent process.
The end time is recorded and the total time in seconds is reported. Because we know the number of processes created, we can also estimate how long each process takes to create.
In this case, the experiment took about 86.7 seconds to create 1,000 child processes that received the data via the pipe, which was about 86.7 milliseconds per process.
Total Time 86.71212387084961 seconds
About 0.0867121238708496 seconds per process
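As an aside, Pipe() returns two duplex connections by default, meaning each end can both send and receive. Our experiment only ever sends in one direction, so a one-way pipe created with Pipe(duplex=False) would work just as well and makes the direction of data flow explicit. A minimal sketch, added for illustration:

# SuperFastPython.com
# a minimal sketch of a one-way pipe using duplex=False
from multiprocessing import Process
from multiprocessing import Pipe

# task that reads one object from the receive connection
def task(conn_recv):
    # block and read the data from the pipe
    data = conn_recv.recv()
    print(f'Child received {len(data)} items')

# entry point
if __name__ == '__main__':
    # a non-duplex pipe: the first connection can only receive,
    # the second can only send
    conn_recv, conn_send = Pipe(duplex=False)
    # start the child process
    process = Process(target=task, args=(conn_recv,))
    process.start()
    # transmit the data, then wait for the child to finish
    conn_send.send([i for i in range(1000000)])
    process.join()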
Next, let’s compare the different approaches for sending the same data to a child process.
Comparison of Data Sharing Methods Between Processes
First, let’s compare the results for the 4 methods for transmitting data from parent to child.
Method    | Time (sec) | Time/Process (sec)
--------------------------------------------
Inherited |        2.4 |             0.0024
Argument  |        2.2 |             0.0022
Queue     |       82.8 |             0.0828
Pipe      |       86.7 |             0.0867
We can see two clusters of results.
Inheriting data from the parent process and passing the data to the process as an argument are approximately equivalent in time.
Each took about 2.2 to 2.4 seconds to complete the experiment, or about 2.2 to 2.4 milliseconds per process.
In fact, transmission is probably not occurring in either of these cases. With inheritance, no data is transmitted at all: the child receives a copy of the parent's memory when it is forked. I suspect the same thing happens when the data is passed as an argument under the fork start method, although this is not clearly documented; the sketch below probes this suspicion.
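One quick way to probe this, though it is a sketch rather than proof, is to pass an object that cannot be pickled, such as a lambda, as an argument under the fork start method. If arguments were pickled, this program would fail; under fork, it runs fine, whereas under the spawn start method it raises a pickling error.

# SuperFastPython.com
# a minimal sketch suggesting arguments are not pickled under fork
from multiprocessing import Process
from multiprocessing import set_start_method

# task that calls a function received as an argument
def task(func):
    # use the unpicklable function passed from the parent
    print(func(21))

# entry point
if __name__ == '__main__':
    # use the fork start method
    set_start_method('fork')
    # a lambda cannot be pickled, yet it can be passed under fork
    process = Process(target=task, args=(lambda x: x * 2,))
    process.start()
    process.join()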
The second group of results is those for transmitting data using a queue or a pipe. These each took about 85 seconds, or roughly one and a half minutes, to complete the experiment, which is about 85 milliseconds per transmission.
These two methods did transmit data from parent to child, requiring the data first be pickled in the parent and then unpickled in the child, although this process happened automatically behind the scenes.
The difference between these two clusters of results is nearly 80 seconds. Put another way, inheriting data from a parent process is about 34x faster than transmitting it between processes.
The results confirm the comments made in the multiprocessing API documentation.
Specifically, inherit data wherever possible.
In addition to inheriting data by forking processes, we can also recommend passing data as arguments when executing a target function in a child process, at least when the fork start method is used; under the spawn start method, arguments are pickled just like data sent over a queue or pipe.
Transmitting data between processes using queues and pipes should be avoided if possible, or kept to a minimum if speed is critical and a lot of data must be transmitted.
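If a lot of data must be transmitted, one way to keep the cost down is to amortize it: start one long-lived worker process, pay the transmission cost once, then stream many small work items to it. The minimal sketch below illustrates this pattern; the lookup performed on each work item is a placeholder assumption, not part of the experiments above.

# SuperFastPython.com
# a minimal sketch of a long-lived worker that receives the data once
from multiprocessing import Process
from multiprocessing import Queue

# worker that receives the large data once, then many small tasks
def worker(queue):
    # pay the unpickling cost for the large dataset a single time
    data = queue.get()
    # process many small work items without re-transmitting the data
    while True:
        item = queue.get()
        # a None item signals the worker to stop
        if item is None:
            break
        # placeholder work: look up one value from the dataset
        _ = data[item]

# entry point
if __name__ == '__main__':
    # create the shared queue and start the worker
    queue = Queue()
    process = Process(target=worker, args=(queue,))
    process.start()
    # pay the transmission cost once
    queue.put([i for i in range(1000000)])
    # send many cheap work items
    for i in range(100):
        queue.put(i)
    # signal the worker to stop, then wait for it
    queue.put(None)
    process.join()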
Further Reading
This section provides additional resources that you may find helpful.
Python Multiprocessing Books
- Python Multiprocessing Jump-Start, Jason Brownlee (my book!)
- Multiprocessing API Interview Questions
- Multiprocessing API Cheat Sheet
I would also recommend specific chapters in the books:
- Effective Python, Brett Slatkin, 2019.
- See: Chapter 7: Concurrency and Parallelism
- High Performance Python, Ian Ozsvald and Micha Gorelick, 2020.
- See: Chapter 9: The multiprocessing Module
- Python in a Nutshell, Alex Martelli, et al., 2017.
- See: Chapter 14: Threads and Processes
Guides
- Python Multiprocessing: The Complete Guide
- Python Multiprocessing Pool: The Complete Guide
- Python ProcessPoolExecutor: The Complete Guide
Takeaways
You now know the speed differences between inheriting and transmitting data between processes.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.