Processes are slow to start; threads are much faster.
In fact, threads can be around 40x faster to create than processes in Python.
The exact difference depends on the specifics of the system and the start method used to create child processes. It can be measured with a simple benchmark and the results compared directly.
In this tutorial, you will discover how to benchmark the starting of threads vs processes and how to compare the speed between the two in Python.
Let’s get started.
Processes Slower Than Threads
Python offers both thread-based and process-based concurrency.
Threads are provided via the threading module. The threading.Thread class can execute a target function in another thread.
This can be achieved by creating an instance of the threading.Thread class and specifying the target function to execute via the "target" keyword argument. The thread can then be started by calling the start() function, and it will execute the target function in another thread.
Processes are provided via the multiprocessing module. The multiprocessing.Process class can execute a target function in another process.
The multiprocessing.Process class has much the same API as the threading.Thread class and can be used to execute a target function in a child process.
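To illustrate the similarity, here is a minimal sketch that runs the same function first in a new thread and then in a child process; the greet() function is a hypothetical stand-in, and note that only the class name changes:

```python
# minimal sketch: the near-identical APIs of threading.Thread
# and multiprocessing.Process
from threading import Thread
from multiprocessing import Process

# hypothetical target function for illustration
def greet(name):
    print(f'Hello {name}')

if __name__ == '__main__':
    # run the target in a new thread
    thread = Thread(target=greet, args=('thread',))
    thread.start()
    thread.join()
    # run the same target in a child process
    process = Process(target=greet, args=('process',))
    process.start()
    process.join()
```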
You can learn more about the similarities and differences between threads and processes in the tutorial:
It is generally considered that threads are lightweight compared to processes and are faster to start.
This may be an important issue in those programs that need to create many threads or processes as part of their normal execution.
If processes are slower than threads, how much slower are they?
In this tutorial, we will explore this question by benchmarking how long it takes to create many processes and many threads, then compare the results.
Benchmark Starting Many Processes
We can explore the benchmark speed of starting many processes, in this case using the spawn start method.
With the spawn start method, the child process is an entirely new instance of the Python interpreter, started from scratch. It is not a copy of another process.
The parent process starts a fresh Python interpreter process. The child process will only inherit those resources necessary to run the process object’s run() method. In particular, unnecessary file descriptors and handles from the parent process will not be inherited. Starting a process using this method is rather slow compared to using fork or forkserver. Available on Unix and Windows. The default on Windows and macOS.
— multiprocessing — Process-based parallelism
In this example, we will create a new process many times and time how long it takes.
Each process started will execute a target function that does nothing other than return immediately.
For example:
# task to run in a new process
def task():
    # do nothing interesting
    pass
We can define a function that creates a process to execute our target function, starts the process, then waits for it to complete, in a loop that repeats a specified number of times.
# run a test and time how long it takes
def test(n_repeats):
    # repeat many times
    for i in range(n_repeats):
        # create the process
        process = Process(target=task)
        # start the process
        process.start()
        # wait for the process to complete
        process.join()
We can then call our test() function with a given number of loop iterations, in this case 1,000, and time how long it takes to complete in seconds.
# entry point
if __name__ == '__main__':
    # set the start method
    set_start_method('spawn')
    # record the start time
    time_start = time()
    # perform the test
    n_repeats = 1000
    test(n_repeats)
    # record the end time
    time_end = time()
    # report the total time
    duration = time_end - time_start
    print(f'Total Time {duration:.3} seconds')
    # report estimated time per process
    per_process = duration / n_repeats
    print(f'About {per_process:.3} seconds per process')
Tying this together, the complete example is listed below.
# SuperFastPython.com
# create many processes using the spawn start method
from time import time
from multiprocessing import Process
from multiprocessing import set_start_method

# task to run in a new process
def task():
    # do nothing interesting
    pass

# run a test and time how long it takes
def test(n_repeats):
    # repeat many times
    for i in range(n_repeats):
        # create the process
        process = Process(target=task)
        # start the process
        process.start()
        # wait for the process to complete
        process.join()

# entry point
if __name__ == '__main__':
    # set the start method
    set_start_method('spawn')
    # record the start time
    time_start = time()
    # perform the test
    n_repeats = 1000
    test(n_repeats)
    # record the end time
    time_end = time()
    # report the total time
    duration = time_end - time_start
    print(f'Total Time {duration:.3} seconds')
    # report estimated time per process
    per_process = duration / n_repeats
    print(f'About {per_process:.3} seconds per process')
Running the example first sets the start method to ‘spawn’.
It then records the start time and executes the test() function that spawns a new process 1,000 times.
The end time is recorded and the total time in seconds is reported. Because we know the number of processes created, we can also estimate how long each process takes to create.
In this case, the experiment took about 42.3 seconds to spawn 1,000 processes, which was about 42.3 milliseconds per process.
Total Time 42.3 seconds
About 0.0423 seconds per process
Processes are slow to spawn.
A faster way to start child processes is to fork them.
The fork start method uses a system function call to copy an existing process to create a new process.
This means that the child process has a copy of all memory used by the original process, including all global variables.
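As a small sketch of this behavior (Unix only, since fork is unavailable on Windows), a forked child can read a global variable set in the parent; get_context() is used here to request the fork start method without changing the global default:

```python
# demonstrate that a forked child inherits the parent's globals
# (fork is available on Unix only; this will not run on Windows)
from multiprocessing import get_context

# global variable set in the parent before forking
DATA = 'set in the parent'

def task():
    # the forked child sees a copy of the parent's globals
    print(f'child sees: {DATA}')

if __name__ == '__main__':
    # request the fork start method via a context object
    ctx = get_context('fork')
    process = ctx.Process(target=task)
    process.start()
    process.join()
```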
The parent process uses os.fork() to fork the Python interpreter. The child process, when it begins, is effectively identical to the parent process. All resources of the parent are inherited by the child process. Note that safely forking a multithreaded process is problematic. Available on Unix only. The default on Unix.
— multiprocessing — Process-based parallelism
For more on how forking is faster than spawning child processes, see the tutorial:
We can update the above example to fork the main process in order to create each child process.
The complete example is listed below.
# SuperFastPython.com
# create many processes using the fork start method
from time import time
from multiprocessing import Process
from multiprocessing import set_start_method

# task to run in a new process
def task():
    # do nothing interesting
    pass

# run a test and time how long it takes
def test(n_repeats):
    # repeat many times
    for i in range(n_repeats):
        # create the process
        process = Process(target=task)
        # start the process
        process.start()
        # wait for the process to complete
        process.join()

# entry point
if __name__ == '__main__':
    # set the start method
    set_start_method('fork')
    # record the start time
    time_start = time()
    # perform the test
    n_repeats = 1000
    test(n_repeats)
    # record the end time
    time_end = time()
    # report the total time
    duration = time_end - time_start
    print(f'Total Time {duration:.3} seconds')
    # report estimated time per process
    per_process = duration / n_repeats
    print(f'About {per_process:.3} seconds per process')
Running the example first sets the start method to ‘fork’.
It then records the start time and executes the test() function that forks a new child process 1,000 times.
The end time is recorded and the total time in seconds is reported. Because we know the number of processes created, we can also estimate how long each process takes to create.
In this case, the experiment took about 2.07 seconds to fork 1,000 processes from the main process, which was about 2.07 milliseconds per process.
Total Time 2.07 seconds
About 0.00207 seconds per process
This is a fairer baseline, as it is about as fast as we can create child processes.
Next, let’s perform the same experiment by starting many threads in the main process.
Benchmark Starting Many Threads
We can explore the benchmark speed of starting many threads instead of processes.
In this example, we will update the previous example and change the creation of processes to threads. Given that the creation and usage of threads and processes are so similar, the changes are quite minimal.
For example:
# run a test and time how long it takes
def test(n_repeats):
    # repeat many times
    for i in range(n_repeats):
        # create the thread
        thread = Thread(target=task)
        # start the thread
        thread.start()
        # wait for the thread to complete
        thread.join()
Tying this together, the complete example is listed below.
# SuperFastPython.com
# benchmark the time to create many threads
from time import time
from threading import Thread

# task to run in a new thread
def task():
    # do nothing interesting
    pass

# run a test and time how long it takes
def test(n_repeats):
    # repeat many times
    for i in range(n_repeats):
        # create the thread
        thread = Thread(target=task)
        # start the thread
        thread.start()
        # wait for the thread to complete
        thread.join()

# entry point
if __name__ == '__main__':
    # record the start time
    time_start = time()
    # perform the test
    n_repeats = 1000
    test(n_repeats)
    # record the end time
    time_end = time()
    # report the total time
    duration = time_end - time_start
    print(f'Total Time {duration:.3} seconds')
    # report estimated time per thread
    per_thread = duration / n_repeats
    print(f'About {per_thread:.3} seconds per thread')
Running the example first records the start time.
It then executes the test() function that creates 1,000 new threads, one at a time.
The end time is recorded and the total time in seconds is reported. Because we know the number of threads created, we can also estimate how long each thread takes to create.
In this case, the experiment took about 0.048 seconds to create 1,000 threads, which was about 4.8e-05 seconds (0.000048 seconds) per thread.
That is, a small fraction of a second to create all 1,000 threads; the per-thread time is reported in scientific notation because it is so small.
Total Time 0.048 seconds
About 4.8e-05 seconds per thread
Next, let’s compare the speed of starting threads vs processes.
Comparison of Processes vs Threads Start Time
Creating threads is faster than creating processes.
Threads are not just a little faster to create; they are more than an order of magnitude faster.
For example, it took about 2.070 seconds to fork 1,000 child processes and only 0.048 seconds to create 1,000 threads.
That is, creating 1,000 threads was about 2.022 seconds faster than forking the same number of processes, or about 43 times faster.
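The comparison above can be reproduced with a few lines of arithmetic, using the totals reported by the two benchmarks:

```python
# reproduce the headline comparison from the benchmark results
n_repeats = 1000
fork_total = 2.07     # seconds to fork 1,000 child processes
thread_total = 0.048  # seconds to create 1,000 threads

# absolute and relative differences
difference = fork_total - thread_total
speedup = fork_total / thread_total
print(f'Threads were {difference:.3f} seconds faster in total')
print(f'Threads were about {speedup:.0f} times faster to create')
```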
Both approaches use function calls in the underlying operating system, although a process is a separate instance of the Python interpreter and requires more memory and computation to prepare.
This highlights that if your program needs to create many units of concurrency, you should make a good attempt to use threads rather than processes. This may only make sense for tasks that explicitly release the GIL during execution, such as IO-bound tasks.
In practice, if so many threads or processes are required to be created in your application, you should probably consider re-using threads or processes, such as via a thread pool or process pool.
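As a sketch of the reuse idea, a thread pool from the standard library creates its worker threads once and reuses them across many tasks; the work() function here is a hypothetical stand-in:

```python
# reuse a small, fixed set of worker threads instead of
# creating a new thread for every task
from concurrent.futures import ThreadPoolExecutor

# hypothetical task for illustration: double a value
def work(value):
    return value * 2

# the pool creates its worker threads once and reuses them
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(work, range(10)))
print(results)
```

The same program structure works with ProcessPoolExecutor for CPU-bound tasks, paying the process start-up cost only once per worker.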
Two process pools built into the Python standard library include:
- multiprocessing.pool.Pool
- concurrent.futures.ProcessPoolExecutor
Two thread pools built into the Python standard library include:
- multiprocessing.pool.ThreadPool
- concurrent.futures.ThreadPoolExecutor
Further Reading
This section provides additional resources that you may find helpful.
Python Multiprocessing Books
- Python Multiprocessing Jump-Start, Jason Brownlee (my book!)
- Multiprocessing API Interview Questions
- Multiprocessing API Cheat Sheet
I would also recommend specific chapters in the books:
- Effective Python, Brett Slatkin, 2019.
- See: Chapter 7: Concurrency and Parallelism
- High Performance Python, Ian Ozsvald and Micha Gorelick, 2020.
- See: Chapter 9: The multiprocessing Module
- Python in a Nutshell, Alex Martelli, et al., 2017.
- See: Chapter 14: Threads and Processes
Guides
- Python Multiprocessing: The Complete Guide
- Python Multiprocessing Pool: The Complete Guide
- Python ProcessPoolExecutor: The Complete Guide
Takeaways
You now know how much faster threads are to create than processes in Python.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.