Processes are slow to start; threads are much faster.
In fact, threads can be around 40x faster to create than processes in Python.
The exact difference depends on the specifics of the system and the start method used to create child processes. It can be measured with a simple benchmark and the results compared directly.
In this tutorial, you will discover how to benchmark the starting of threads vs processes and how to compare the speed between the two in Python.
Let’s get started.
Processes Slower Than Threads
Python offers both thread-based and process-based concurrency.
Threads are provided via the threading module. The threading.Thread class can execute a target function in another thread.
This can be achieved by creating an instance of the threading.Thread class and specifying the target function to execute via the "target" keyword argument. The thread can then be started by calling the start() function, and it will execute the target function in another thread.
Processes are provided via the multiprocessing module. The multiprocessing.Process class can execute a target function in another process.
The multiprocessing.Process class has much the same API as the threading.Thread class and can be used to execute a target function in a child process.
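To illustrate the similarity, here is a minimal sketch that runs the same function first in a new thread and then in a child process; the greet() function is a hypothetical stand-in, and note that only the class name changes:

```python
# minimal sketch: the near-identical APIs of threading.Thread
# and multiprocessing.Process
from threading import Thread
from multiprocessing import Process

# hypothetical target function for illustration
def greet(name):
    print(f'Hello {name}')

if __name__ == '__main__':
    # run the target in a new thread
    thread = Thread(target=greet, args=('thread',))
    thread.start()
    thread.join()
    # run the same target in a child process
    process = Process(target=greet, args=('process',))
    process.start()
    process.join()
```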
You can learn more about the similarities and differences between threads and processes in the tutorial:
It is generally considered that threads are lightweight compared to processes and are faster to start.
This may be an important issue in those programs that need to create many threads or processes as part of their normal execution.
If processes are slower than threads, how much slower are they?
In this tutorial, we will explore this question by benchmarking how long it takes to create many processes and many threads, then compare the results.
Benchmark Starting Many Processes
We can explore the benchmark speed of starting many processes, in this case using the spawn start method.
With the spawn start method, the child process is an entirely new instance of the Python interpreter, started from scratch. It is not a copy of another process.
The parent process starts a fresh Python interpreter process. The child process will only inherit those resources necessary to run the process object’s run() method. In particular, unnecessary file descriptors and handles from the parent process will not be inherited. Starting a process using this method is rather slow compared to using fork or forkserver. Available on Unix and Windows. The default on Windows and macOS.
— multiprocessing — Process-based parallelism
In this example, we will create a new process many times and time how long it takes.
Each process started will execute a target function that does nothing other than return immediately.
For example:
# task to run in a new process
def task():
    # do nothing interesting
    pass
We can define a function that creates a process to execute our target function, starts the process, then waits for it to complete, in a loop that repeats a specified number of times.
# run a test and time how long it takes
def test(n_repeats):
    # repeat many times
    for i in range(n_repeats):
        # create the process
        process = Process(target=task)
        # start the process
        process.start()
        # wait for the process to complete
        process.join()
We can then call our test() function with a given number of loop iterations, in this case 1,000, and time how long it takes to complete in seconds.
# entry point
if __name__ == '__main__':
    # set the start method
    set_start_method('spawn')
    # record the start time
    time_start = time()
    # perform the test
    n_repeats = 1000
    test(n_repeats)
    # record the end time
    time_end = time()
    # report the total time
    duration = time_end - time_start
    print(f'Total Time {duration:.3} seconds')
    # report estimated time per process
    per_process = duration / n_repeats
    print(f'About {per_process:.3} seconds per process')
Tying this together, the complete example is listed below.
# SuperFastPython.com
# create many processes using the spawn start method
from time import time
from multiprocessing import Process
from multiprocessing import set_start_method

# task to run in a new process
def task():
    # do nothing interesting
    pass

# run a test and time how long it takes
def test(n_repeats):
    # repeat many times
    for i in range(n_repeats):
        # create the process
        process = Process(target=task)
        # start the process
        process.start()
        # wait for the process to complete
        process.join()

# entry point
if __name__ == '__main__':
    # set the start method
    set_start_method('spawn')
    # record the start time
    time_start = time()
    # perform the test
    n_repeats = 1000
    test(n_repeats)
    # record the end time
    time_end = time()
    # report the total time
    duration = time_end - time_start
    print(f'Total Time {duration:.3} seconds')
    # report estimated time per process
    per_process = duration / n_repeats
    print(f'About {per_process:.3} seconds per process')
Running the example first sets the start method to ‘spawn’.
It then records the start time and executes the test() function that spawns a new process 1,000 times.
The end time is recorded and the total time in seconds is reported. Because we know the number of processes created, we can also estimate how long each process takes to create.
In this case, the experiment took about 42.3 seconds to spawn 1,000 processes, which was about 42.3 milliseconds per process.
Total Time 42.3 seconds
About 0.0423 seconds per process
Processes are slow to spawn.
A faster way to start child processes is to fork them.
The fork start method uses a system function call to copy an existing process to create a new process.
This means that the child process has a copy of all memory used by the original process, including all global variables.
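As a small sketch of this behavior (Unix only, since fork is unavailable on Windows), a forked child can read a global variable set in the parent; get_context() is used here to request the fork start method without changing the global default:

```python
# demonstrate that a forked child inherits the parent's globals
# (fork is available on Unix only; this will not run on Windows)
from multiprocessing import get_context

# global variable set in the parent before forking
DATA = 'set in the parent'

def task():
    # the forked child sees a copy of the parent's globals
    print(f'child sees: {DATA}')

if __name__ == '__main__':
    # request the fork start method via a context object
    ctx = get_context('fork')
    process = ctx.Process(target=task)
    process.start()
    process.join()
```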
The parent process uses os.fork() to fork the Python interpreter. The child process, when it begins, is effectively identical to the parent process. All resources of the parent are inherited by the child process. Note that safely forking a multithreaded process is problematic. Available on Unix only. The default on Unix.
— multiprocessing — Process-based parallelism
For more on how forking is faster than spawning child processes, see the tutorial:
We can update the above example to fork the main process in order to create each child process.
The complete example is listed below.
# SuperFastPython.com
# create many processes using the fork start method
from time import time
from multiprocessing import Process
from multiprocessing import set_start_method

# task to run in a new process
def task():
    # do nothing interesting
    pass

# run a test and time how long it takes
def test(n_repeats):
    # repeat many times
    for i in range(n_repeats):
        # create the process
        process = Process(target=task)
        # start the process
        process.start()
        # wait for the process to complete
        process.join()

# entry point
if __name__ == '__main__':
    # set the start method
    set_start_method('fork')
    # record the start time
    time_start = time()
    # perform the test
    n_repeats = 1000
    test(n_repeats)
    # record the end time
    time_end = time()
    # report the total time
    duration = time_end - time_start
    print(f'Total Time {duration:.3} seconds')
    # report estimated time per process
    per_process = duration / n_repeats
    print(f'About {per_process:.3} seconds per process')
Running the example first sets the start method to ‘fork’.
It then records the start time and executes the test() function that forks a new child process 1,000 times.
The end time is recorded and the total time in seconds is reported. Because we know the number of processes created, we can also estimate how long each process takes to create.
In this case, the experiment took about 2.07 seconds to fork 1,000 processes from the main process, which was about 2.07 milliseconds per process.
Total Time 2.07 seconds
About 0.00207 seconds per process
This is a fairer baseline, as it is about as fast as we can create child processes.
Next, let’s perform the same experiment by starting many threads in the main process.
Benchmark Starting Many Threads
We can explore the benchmark speed of starting many threads instead of processes.
In this example, we will update the previous example and change the creation of processes to threads. Given that the creation and usage of threads and processes are so similar, the changes are quite minimal.
For example:
# run a test and time how long it takes
def test(n_repeats):
    # repeat many times
    for i in range(n_repeats):
        # create the thread
        thread = Thread(target=task)
        # start the thread
        thread.start()
        # wait for the thread to complete
        thread.join()
Tying this together, the complete example is listed below.
# SuperFastPython.com
# benchmark the time to create many threads
from time import time
from threading import Thread

# task to run in a new thread
def task():
    # do nothing interesting
    pass

# run a test and time how long it takes
def test(n_repeats):
    # repeat many times
    for i in range(n_repeats):
        # create the thread
        thread = Thread(target=task)
        # start the thread
        thread.start()
        # wait for the thread to complete
        thread.join()

# entry point
if __name__ == '__main__':
    # record the start time
    time_start = time()
    # perform the test
    n_repeats = 1000
    test(n_repeats)
    # record the end time
    time_end = time()
    # report the total time
    duration = time_end - time_start
    print(f'Total Time {duration:.3} seconds')
    # report estimated time per thread
    per_thread = duration / n_repeats
    print(f'About {per_thread:.3} seconds per thread')
Running the example first records the start time.
It then executes the test() function that creates 1,000 new threads, one at a time.
The end time is recorded and the total time in seconds is reported. Because we know the number of threads created, we can also estimate how long each thread takes to create.
In this case, the experiment took about 0.048 seconds to create 1,000 threads, which was about 4.8e-05 seconds (0.000048 seconds) per thread.
That is, a small fraction of a second to create all 1,000 threads; the per-thread time is reported in scientific notation because it is so small.
Total Time 0.048 seconds
About 4.8e-05 seconds per thread
Next, let’s compare the speed of starting threads vs processes.
Comparison of Processes vs Threads Start Time
Creating threads is faster than creating processes.
Threads are not just a little faster to create; they are more than an order of magnitude faster.
For example, it took about 2.070 seconds to fork 1,000 child processes and only 0.048 seconds to create 1,000 threads.
That is, creating 1,000 threads was about 2.022 seconds faster than forking the same number of processes, or about 43 times faster.
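The comparison above can be reproduced with a few lines of arithmetic, using the totals reported by the two benchmarks:

```python
# reproduce the headline comparison from the benchmark results
n_repeats = 1000
fork_total = 2.07     # seconds to fork 1,000 child processes
thread_total = 0.048  # seconds to create 1,000 threads

# absolute and relative differences
difference = fork_total - thread_total
speedup = fork_total / thread_total
print(f'Threads were {difference:.3f} seconds faster in total')
print(f'Threads were about {speedup:.0f} times faster to create')
```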
Both approaches use function calls in the underlying operating system, although a process is a separate instance of the Python interpreter and requires more memory and computation to prepare.
This highlights that if your program needs to create many units of concurrency, you should make a good attempt to use threads rather than processes. This may only make sense for tasks that explicitly release the GIL during execution, such as IO-bound tasks.
In practice, if so many threads or processes are required to be created in your application, you should probably consider re-using threads or processes, such as via a thread pool or process pool.
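As a sketch of the reuse idea, a thread pool from the standard library creates its worker threads once and reuses them across many tasks; the work() function here is a hypothetical stand-in:

```python
# reuse a small, fixed set of worker threads instead of
# creating a new thread for every task
from concurrent.futures import ThreadPoolExecutor

# hypothetical task for illustration: double a value
def work(value):
    return value * 2

# the pool creates its worker threads once and reuses them
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(work, range(10)))
print(results)
```

The same program structure works with ProcessPoolExecutor for CPU-bound tasks, paying the process start-up cost only once per worker.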
Two process pools built into the Python standard library include:
- multiprocessing.pool.Pool
- concurrent.futures.ProcessPoolExecutor
Two thread pools built into the Python standard library include:
- multiprocessing.pool.ThreadPool
- concurrent.futures.ThreadPoolExecutor
Further Reading
This section provides additional resources that you may find helpful.
Python Multiprocessing Books
- Python Multiprocessing Jump-Start, Jason Brownlee (my book!)
- Multiprocessing API Interview Questions
- Multiprocessing API Cheat Sheet
I would also recommend specific chapters in the books:
- Effective Python, Brett Slatkin, 2019.
- See: Chapter 7: Concurrency and Parallelism
- High Performance Python, Ian Ozsvald and Micha Gorelick, 2020.
- See: Chapter 9: The multiprocessing Module
- Python in a Nutshell, Alex Martelli, et al., 2017.
- See: Chapter 14: Threads and Processes
Guides
- Python Multiprocessing: The Complete Guide
- Python Multiprocessing Pool: The Complete Guide
- Python ProcessPoolExecutor: The Complete Guide
Takeaways
You now know how much faster threads are to create than processes in Python.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.