The Python multiprocessing module allows you to create and manage new child processes in Python.
Although multiprocessing has been available since Python 2, it is not widely used, perhaps because of misunderstandings of the capabilities and limitations of threads and processes in Python.
This guide provides a detailed and comprehensive look at multiprocessing in Python, including how processes work, how to use processes in multiprocessor programming, concurrency primitives used with processes, common questions, and best practices.
This is a massive 24,000+ word guide. You may want to bookmark it so you can refer to it as you develop your concurrent programs.
Let’s dive in.
Python Processes
So what are processes and why do we care?
What Are Processes
A process refers to an instance of a computer program being executed.
Every Python program is a process and has one default thread called the main thread used to execute your program instructions. Each process is, in fact, one instance of the Python interpreter that executes Python instructions (Python byte-code), which is a slightly lower level than the code you type into your Python program.
Sometimes we may need to create new processes to run additional tasks concurrently.
Python provides real system-level processes via the multiprocessing.Process class in the multiprocessing module.
The underlying operating system controls how new processes are created. On some systems, that may require spawning a new process, and on others, it may require that the process is forked. The operating-system-specific method used for creating new processes in Python is not something we need to worry about as it is managed by your installed Python interpreter.
The code in new child processes may or may not be executed in parallel (at the same time), even though the processes are executed concurrently.
There are a number of reasons for this; for example, the underlying hardware may or may not support parallel execution (e.g. one vs. multiple CPU cores).
This highlights the distinction between code that can run out of order (concurrent) and the capability to execute code simultaneously (parallel).
- Concurrent: Code that can be executed out of order.
- Parallel: Capability to execute code simultaneously.
Processes and threads are different.
Next, let’s consider the important differences between processes and threads.
Thread vs Process
A process refers to an instance of a computer program being executed.
Each process is in fact one instance of the Python interpreter that executes Python instructions (Python byte-code), which is a slightly lower level than the code you type into your Python program.
Process: The operating system’s spawned and controlled entity that encapsulates an executing application. A process has two main functions. The first is to act as the resource holder for the application, and the second is to execute the instructions of the application.
— PAGE 271, THE ART OF CONCURRENCY, 2009.
A thread always exists within a process and represents the manner in which instructions or code is executed.
A process will have at least one thread, called the main thread. Any additional threads that we create within the process will belong to that process.
The Python process will terminate once all non-daemon (non-background) threads have terminated.
- Process: An instance of the Python interpreter has at least one thread called the MainThread.
- Thread: A thread of execution within a Python process, such as the MainThread or a new thread.
You can learn more about the differences between processes and threads in the tutorial:
Next, let’s take a look at processes in Python.
Life-Cycle of a Process
A process in Python is represented as an instance of the multiprocessing.Process class.
Once a process is started, the Python runtime will interface with the underlying operating system and request that a new native process be created. The multiprocessing.Process instance then provides a Python-based reference to this underlying native process.
Each process follows the same life cycle. Understanding the stages of this life-cycle can help when getting started with concurrent programming in Python.
For example:
- The difference between creating and starting a process.
- The difference between run and start.
- The difference between blocked and terminated.
And so on.
A Python process may progress through three steps of its life cycle: a new process, a running process, and a terminated process.
While running, the process may be executing code or may be blocked, waiting on something such as another process or an external resource. Not every process will block; blocking is optional and depends on the specific use case for the new process.
- New Child Process.
- Running Process.
- Blocked Process (optional).
- Terminated Process.
A new process is a process that has been constructed and configured by creating an instance of the multiprocessing.Process class.
A new process can transition to a running process by calling the start() method. This also creates and starts the main thread for the process that actually executes code in the process.
A running process may block in many ways if its main thread is blocked, such as when reading or writing from a file or a socket, or when waiting on a concurrency primitive such as a semaphore or a lock. Once unblocked, the process will run again.
Finally, a process may terminate once it has finished executing its code or by raising an error or exception.
A process cannot terminate until:
- All non-daemon threads have terminated, including the main thread.
- All of its non-daemon child processes have terminated.
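We can observe these stages directly. The following is a minimal sketch (an illustrative addition) that inspects a process while it is new, running, and terminated, using the is_alive() method and exitcode attribute covered later in this guide.

# sketch: observing the life-cycle stages of a process
from time import sleep
from multiprocessing import Process

# function to execute in a new child process
def task():
    # block for a moment
    sleep(1)

# entry point
if __name__ == '__main__':
    # new: constructed but not started
    process = Process(target=task)
    print(f'New: alive={process.is_alive()}, exitcode={process.exitcode}')
    # running: started, not yet finished
    process.start()
    print(f'Running: alive={process.is_alive()}, exitcode={process.exitcode}')
    # terminated: finished executing
    process.join()
    print(f'Terminated: alive={process.is_alive()}, exitcode={process.exitcode}')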
You can learn more about the life-cycle of processes in the tutorial:
Next, let’s take a closer look at the difference between parent and child processes.
Child vs Parent Process
Parent processes have one or more child processes, whereas a child process is created by a parent process.
Parent Process
A parent process is a process that is capable of starting child processes.
Typically, we would not refer to a process as a parent process until it has created one or more child processes. This is the commonly accepted definition.
- Parent Process: Has one or more child processes. May have a parent process, e.g. may also be a child process.
Recall, a process is an instance of a computer program. In Python, a process is an instance of the Python interpreter that executes Python code.
In Python, the first process created when we run our program is called the ‘MainProcess‘. It is also a parent process and may be called the main parent process.
The main process will create the first child process or processes.
A child process may also become a parent process if it in turn creates and starts a child process.
Child Process
A child process is a process that was created by another process.
A child process may also be called a subprocess, as it was created by another process rather than by the operating system directly. That being said, the creation and management of all processes is handled by the underlying operating system.
- Child Process: Has a parent process. May have its own child processes, e.g. may also be a parent.
The process that creates a child process is called a parent process. A child process only ever has one parent process.
There are three main techniques used to create a child process, referred to as process start methods.
They are: fork, spawn, and forkserver.
Depending on the technique used to start the child, the child process may or may not inherit properties of the parent process. For example, a forked process may inherit a copy of the global variables from the parent process.
A child process may become orphaned if the parent process that created it is terminated.
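A short sketch (an illustrative addition) makes the relationship concrete by reporting process identifiers from both sides, using os.getpid() and os.getppid():

# sketch: reporting the parent-child relationship via process identifiers
from os import getpid, getppid
from multiprocessing import Process

# function executed in the child process
def task():
    # the child's pid, and the pid of the parent that created it
    print(f'Child pid={getpid()}, created by parent pid={getppid()}')

# entry point
if __name__ == '__main__':
    # the parent's pid
    print(f'Parent pid={getpid()}')
    # create and start the child process
    process = Process(target=task)
    process.start()
    process.join()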
You can learn more about the differences between child and parent processes in the tutorial:
Now that we are familiar with child and parent processes, let’s look at how to run code in new processes.
Run a Function in a Process
Python functions can be executed in a separate process using the multiprocessing.Process class.
In this section we will look at a few examples of how to run functions in a child process.
How to Run a Function In a Process
To run a function in another process:
- Create an instance of the multiprocessing.Process class.
- Specify the name of the function via the “target” argument.
- Call the start() function.
First, we must create a new instance of the multiprocessing.Process class and specify the function we wish to execute in a new process via the “target” argument.
...
# create a process
process = multiprocessing.Process(target=task)
The function executed in another process may take arguments, in which case they can be specified as a tuple and passed to the “args” argument of the multiprocessing.Process class constructor, or as a dictionary passed to the “kwargs” argument.
...
# create a process
process = multiprocessing.Process(target=task, args=(arg1, arg2))
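For example, keyword arguments can be specified as a dictionary (the argument names here are only illustrative and must match the target function's signature):

...
# create a process with keyword arguments
process = multiprocessing.Process(target=task, kwargs={'sleep_time': 1.5, 'message': 'Hello'})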
We can then start executing the process by calling the start() function.
The start() function will return immediately and the operating system will execute the function in a separate process as soon as it is able.
...
# run the new process
process.start()
A new instance of the Python interpreter will be created and a new thread within the new process will be created to execute our target function.
And that’s all there is to it.
We do not have control over when the process will execute precisely or which CPU core will execute it. Both of these are low-level responsibilities that are handled by the underlying operating system.
Next, let’s look at a worked example of executing a function in a new process.
Example of Running a Function in a Process
First, we can define a custom function that will be executed in another process.
We will define a simple function that blocks for a moment then prints a statement.
The function can have any name we like, in this case we’ll name it “task“.
# a custom function that blocks for a moment
def task():
    # block for a moment
    sleep(1)
    # display a message
    print('This is from another process')
Next, we can create an instance of the multiprocessing.Process class and specify our function name as the “target” argument in the constructor.
...
# create a process
process = Process(target=task)
Once created we can run the process which will execute our custom function in a new native process, as soon as the operating system can.
...
# run the process
process.start()
The start() function does not block, meaning it returns immediately.
We can explicitly wait for the new process to finish executing by calling the join() function.
...
# wait for the process to finish
print('Waiting for the process...')
process.join()
Tying this together, the complete example of executing a function in another process is listed below.
# SuperFastPython.com
# example of running a function in another process
from time import sleep
from multiprocessing import Process

# a custom function that blocks for a moment
def task():
    # block for a moment
    sleep(1)
    # display a message
    print('This is from another process')

# entry point
if __name__ == '__main__':
    # create a process
    process = Process(target=task)
    # run the process
    process.start()
    # wait for the process to finish
    print('Waiting for the process...')
    process.join()
Running the example first creates the multiprocessing.Process then calls the start() function. This does not start the process immediately, but instead allows the operating system to schedule the function to execute as soon as possible.
At some point a new instance of the Python interpreter is created that has a new thread which will execute our target function.
The main thread of our initial process then prints a message waiting for the new process to complete, then calls the join() function to explicitly block and wait for the new process to terminate.
Once the custom function returns, the new process is closed. The join() function then returns and the main thread exits.
Waiting for the process...
This is from another process
Next, let’s look at how we might run a function with arguments in a new process.
Example of Running a Function in a Process With Arguments
We can execute functions in another process that takes arguments.
This can be demonstrated by first updating our task() function from the previous section to take two arguments, one for the time in seconds to block and the second for a message to display.
# a custom function that blocks for a moment
def task(sleep_time, message):
    # block for a moment
    sleep(sleep_time)
    # display a message
    print(message)
Next, we can update the call to the multiprocessing.Process constructor to specify the two arguments in the order that our task() function expects them as a tuple via the “args” argument.
...
# create a process
process = Process(target=task, args=(1.5, 'New message from another process'))
Tying this together, the complete example of executing a custom function that takes arguments in a separate process is listed below.
# SuperFastPython.com
# example of running a function with arguments in another process
from time import sleep
from multiprocessing import Process

# a custom function that blocks for a moment
def task(sleep_time, message):
    # block for a moment
    sleep(sleep_time)
    # display a message
    print(message)

# entry point
if __name__ == '__main__':
    # create a process
    process = Process(target=task, args=(1.5, 'New message from another process'))
    # run the process
    process.start()
    # wait for the process to finish
    print('Waiting for the process...')
    process.join()
Running the example creates the process specifying the function name and the arguments to the function.
The new process is started and the function blocks for the parameterized number of seconds and prints the parameterized message.
Waiting for the process...
New message from another process
You can learn more about running functions in new processes in this tutorial:
Next let’s look at how we might run a function in a child process by extending the multiprocessing.Process class.
Extend the Process Class
We can also execute functions in a child process by extending the multiprocessing.Process class and overriding the run() function.
In this section we will look at some examples of extending the multiprocessing.Process class.
How to Extend the Process Class
The multiprocessing.Process class can be extended to run code in another process.
This can be achieved by first extending the class, just like any other Python class.
For example:
# custom process class
class CustomProcess(multiprocessing.Process):
    # ...
Then the run() function of the multiprocessing.Process class must be overridden to contain the code that you wish to execute in another process.
For example:
# override the run function
def run(self):
    # ...
And that’s it.
Given that it is a custom class, you can define a constructor for the class and use it to pass in data that may be needed in the run() function, stored as instance variables (attributes).
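For example, a minimal sketch (the argument and attribute names are illustrative) of passing data in via the constructor:

# sketch: pass data to the run() function via the constructor
class CustomProcess(multiprocessing.Process):
    # constructor takes data needed by run()
    def __init__(self, message):
        # execute the base constructor
        multiprocessing.Process.__init__(self)
        # store the data as an instance variable
        self.message = message

    # override the run function
    def run(self):
        # use the data passed in via the constructor
        print(self.message)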
You can also define additional functions in the class to split up the work you may need to complete in another process.
Finally, attributes can also be used to store the results of any calculation or IO performed in another process that may need to be retrieved afterward.
Next, let’s look at a worked example of extending the multiprocessing.Process class.
Example of Extending the Process Class
First, we can define a class that extends the multiprocessing.Process class.
We will name the class something arbitrary such as “CustomProcess“.
We can then override the run() instance method and define the code that we wish to execute in another process.
In this case, we will block for a moment and then print a message.
# override the run function
def run(self):
    # block for a moment
    sleep(1)
    # display a message
    print('This is coming from another process')
Next, we can create an instance of our CustomProcess class and call the start() function to begin executing our run() function in another process. Internally, the start() function will call the run() function.
The code will then run in a new process as soon as the operating system can schedule it.
...
# create the process
process = CustomProcess()
# start the process
process.start()
Finally, we wait for the new process to finish executing.
...
# wait for the process to finish
print('Waiting for the process to finish')
process.join()
Tying this together, the complete example of executing code in another process by extending the multiprocessing.Process class is listed below.
# SuperFastPython.com
# example of extending the Process class
from time import sleep
from multiprocessing import Process

# custom process class
class CustomProcess(Process):
    # override the run function
    def run(self):
        # block for a moment
        sleep(1)
        # display a message
        print('This is coming from another process')

# entry point
if __name__ == '__main__':
    # create the process
    process = CustomProcess()
    # start the process
    process.start()
    # wait for the process to finish
    print('Waiting for the process to finish')
    process.join()
Running the example first creates an instance of the process, then starts it, executing the content of the run() function in a child process.
Meanwhile, the main thread waits for the new process to finish its execution, before exiting.
Waiting for the process to finish
This is coming from another process
You can learn more about extending the multiprocessing.Process class in this tutorial:
Next, let’s look at how we might return values from a child process.
Example of Extending the Process Class and Returning Values
Instance variable attributes can be shared between processes via the multiprocessing.Value and multiprocessing.Array classes.
These classes explicitly define data attributes designed to be shared between processes in a process-safe manner.
Shared variables mean that changes made in one process are always propagated and made available to other processes.
An instance of the multiprocessing.Value can be defined in the constructor of a custom class as a shared instance variable.
The constructor of the multiprocessing.Value class requires that we specify the data type and an initial value.
The data type can be specified using a ctypes type or a typecode.
We can define an instance attribute as an instance of the multiprocessing.Value which will automatically and correctly be shared between processes.
We can update the above example to use a multiprocessing.Value directly.
Firstly, we must update the constructor of the CustomProcess class to initialize the multiprocessing.Value instance. We will define it to be an integer with an initial value of zero.
...
# initialize integer attribute
self.data = Value('i', 0)
A class constructor with this addition is listed below.
# override the constructor
def __init__(self):
    # execute the base constructor
    Process.__init__(self)
    # initialize integer attribute
    self.data = Value('i', 0)
We can then update the run() method to change the “value” attribute on the “data” instance variable and then to report this value via a print statement.
...
# store the data variable
self.data.value = 99
# report stored value
print(f'Child stored: {self.data.value}')
The updated run() function with this change is listed below.
# override the run function
def run(self):
    # block for a moment
    sleep(1)
    # store the data variable
    self.data.value = 99
    # report stored value
    print(f'Child stored: {self.data.value}')
Tying this together, the updated CustomProcess class is listed below.
# custom process class
class CustomProcess(Process):
    # override the constructor
    def __init__(self):
        # execute the base constructor
        Process.__init__(self)
        # initialize integer attribute
        self.data = Value('i', 0)

    # override the run function
    def run(self):
        # block for a moment
        sleep(1)
        # store the data variable
        self.data.value = 99
        # report stored value
        print(f'Child stored: {self.data.value}')
Finally, in the parent process, we can update the print statement to correctly access the value of the shared instance variable attribute.
...
# report the process attribute
print(f'Parent got: {process.data.value}')
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of extending the Process class and adding shared attributes
from time import sleep
from multiprocessing import Process
from multiprocessing import Value

# custom process class
class CustomProcess(Process):
    # override the constructor
    def __init__(self):
        # execute the base constructor
        Process.__init__(self)
        # initialize integer attribute
        self.data = Value('i', 0)

    # override the run function
    def run(self):
        # block for a moment
        sleep(1)
        # store the data variable
        self.data.value = 99
        # report stored value
        print(f'Child stored: {self.data.value}')

# entry point
if __name__ == '__main__':
    # create the process
    process = CustomProcess()
    # start the process
    process.start()
    # wait for the process to finish
    print('Waiting for the child process to finish')
    # block until child process is terminated
    process.join()
    # report the process attribute
    print(f'Parent got: {process.data.value}')
Running the example first creates an instance of the custom class then starts the child process.
The constructor initializes the instance variable to be a multiprocessing.Value instance with an initial value of zero.
The parent process blocks until the child process terminates.
The child process blocks, then changes the value of the instance variable and reports the change. The change to the instance variable is propagated back to the parent process.
The child process terminates and the parent process continues on. It reports the value of the instance variable which correctly reflects the change made by the child process.
This demonstrates how to share instance variable attributes between processes via the multiprocessing.Value class.
Waiting for the child process to finish
Child stored: 99
Parent got: 99
You can learn more about returning values from a child process in this tutorial:
Next, let’s take a closer look at process start methods.
Process Start Methods
In multiprocessing, we may need to change the technique used to start child processes.
This is called the start method.
What is a Start Method
A start method is the technique used to start child processes in Python.
There are three start methods, they are:
- spawn: start a new Python process.
- fork: copy a Python process from an existing process.
- forkserver: new process from which future forked processes will be copied.
Each platform has a default start method.
The following lists the major platforms and the default start methods.
- Windows (win32): spawn
- macOS (darwin): spawn
- Linux (unix): fork
Not all platforms support all start methods.
The following lists the major platforms and the start methods that are supported.
- Windows (win32): spawn
- macOS (darwin): spawn, fork, forkserver.
- Linux (unix): spawn, fork, forkserver.
How to Change The Start Method
The multiprocessing module provides functions for getting and setting the start method for creating child processes.
The list of supported start methods can be retrieved via the multiprocessing.get_all_start_methods() function.
The function returns a list of string values, each representing a supported start method.
For example:
...
# get supported start methods
methods = multiprocessing.get_all_start_methods()
The current start method can be retrieved via the multiprocessing.get_start_method() function.
The function returns a string value for the currently configured start method.
For example:
...
# get the current start method
method = multiprocessing.get_start_method()
The start method can be set via the multiprocessing.set_start_method() function.
The function takes a string argument indicating the start method to use.
This must be one of the methods returned from the multiprocessing.get_all_start_methods() for your platform.
For example:
...
# set the start method
multiprocessing.set_start_method('spawn')
It is a best practice, and required on most platforms, that the start method be set first, prior to any other code, and within an if __name__ == '__main__' check, called a protected entry point or top-level code environment.
For example:
...
# protect the entry point
if __name__ == '__main__':
    # set the start method
    multiprocessing.set_start_method('spawn')
If the start method is not set within a protected entry point, it is possible to get a RuntimeError such as:
RuntimeError: context has already been set
It is also a good practice and required on some platforms that the start method only be set once.
In summary, the rules for setting the start method are as follows:
- Set the start method first prior to all other code.
- Set the start method only once in a program.
- Set the start method within a protected entry point.
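Tying these rules together, a minimal sketch (an illustrative addition, using the portable 'spawn' method) looks as follows:

# sketch: setting the start method within a protected entry point
from multiprocessing import set_start_method
from multiprocessing import Process

# function executed in a child process
def task():
    print('Hello from a spawned child process')

# protect the entry point
if __name__ == '__main__':
    # set the start method first, and only once
    set_start_method('spawn')
    # create, start, and wait for the child process
    process = Process(target=task)
    process.start()
    process.join()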
How to Set Start Method Via Context
A multiprocessing context configured with a given start method can be retrieved via the multiprocessing.get_context() function.
This function takes the name of the start method as an argument, then returns a multiprocessing context that can be used to create new child processes.
For example:
...
# get a context configured with a start method
context = multiprocessing.get_context('fork')
The context can then be used to create a child process, for example:
...
# create a child process via a context
process = context.Process(...)
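A fuller sketch (an illustrative addition) that creates and runs a child process via a context:

# sketch: creating a child process via a multiprocessing context
import multiprocessing

# function executed in a child process
def task():
    print('Hello from a context-created child process')

# entry point
if __name__ == '__main__':
    # get a context configured with the spawn start method
    context = multiprocessing.get_context('spawn')
    # create a child process via the context
    process = context.Process(target=task)
    process.start()
    process.join()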
It may also be possible to force the start method.
This can be achieved via the “force” argument provided on the set_start_method() implementation in the DefaultContext, although this is not documented.
For example:
...
# set the start method
context.set_start_method('spawn', force=True)
You can learn more about setting the process start method in the tutorial:
Next, let’s look at some properties of process instances.
Process Instance Attributes
An instance of the multiprocessing.Process class provides a handle on a new instance of the Python interpreter.
As such, it provides attributes that we can use to query properties and the status of the underlying process.
Let’s look at some examples.
Query Process Name
Each process has a name.
The parent process has the name “MainProcess“.
Child processes are automatically given unique names within the parent process, of the form “Process-%d”, where %d is an integer indicating the order in which the process was created by the parent, e.g. Process-1 for the first process created.
We can access the name of a process via the multiprocessing.Process.name attribute, for example:
...
# report the process name
print(process.name)
The example below creates an instance of the multiprocessing.Process class and reports the default name of the process.
# SuperFastPython.com
# example of accessing the child process name
from multiprocessing import Process

# entry point
if __name__ == '__main__':
    # create the process
    process = Process()
    # report the process name
    print(process.name)
Running the example creates a child process and reports the default name assigned to the process.
Process-1
You can learn more about configuring the process name in the tutorial:
Query Process Daemon
A process may be a daemon.
A daemon process is a background process. By default, processes are non-daemon processes because they inherit the daemon value from their parent process, which is set to False for the MainProcess.
A Python parent process will only exit when all non-daemon processes have finished exiting. For example, the MainProcess is a non-daemon process. This means that daemon processes can run in the background and do not have to finish or be explicitly exited for the program to end.
We can determine if a process is a daemon process via the multiprocessing.Process.daemon attribute.
...
# report the daemon attribute
print(process.daemon)
The example creates an instance of the multiprocessing.Process class and reports whether the process is a daemon or not.
# SuperFastPython.com
# example of assessing whether a process is a daemon
from multiprocessing import Process

# entry point
if __name__ == '__main__':
    # create the process
    process = Process()
    # report the daemon attribute
    print(process.daemon)
Running the example reports that the process is not a daemon process, the default for new child processes.
False
You can learn more about configuring daemon processes in the tutorial:
Query Process PID
Each process has a unique process identifier, called the PID, assigned by the operating system.
Python processes are real native processes, meaning that each process we create is actually created and managed by the underlying operating system. As such, the operating system will assign a unique integer to each process that is created on the system.
The process identifier can be accessed via the multiprocessing.Process.pid property and is assigned after the process has been started.
...
# report the pid
print(process.pid)
The example below creates an instance of a multiprocessing.Process and reports the assigned PID.
# SuperFastPython.com
# example of reporting the native process identifier
from multiprocessing import Process

# entry point
if __name__ == '__main__':
    # create the process
    process = Process()
    # report the process identifier
    print(process.pid)
    # start the process
    process.start()
    # report the process identifier
    print(process.pid)
Running the example first creates the process and confirms that it does not have a native PID before it was started.
The process is then started and the assigned PID is reported.
Note, your PID will differ as the process will have a different identifier each time the code is run.
None
16302
Query Process Alive
A process instance can be alive or dead.
An alive process means that the run() method of the multiprocessing.Process instance is currently executing.
This means that before the start() method is called and after the run() method has completed, the process will not be alive.
We can check if a process is alive via the multiprocessing.Process.is_alive() method.
...
# report the process is alive
print(process.is_alive())
The example below creates a multiprocessing.Process instance then checks whether it is alive.
# SuperFastPython.com
# example of assessing whether a process is alive
from multiprocessing import Process

# entry point
if __name__ == '__main__':
    # create the process
    process = Process()
    # report the process is alive
    print(process.is_alive())
Running the example creates a new multiprocessing.Process instance then reports that the process is not alive.
False
Query Process Exit Code
A child process will have an exit code once it has terminated.
An exit code provides an indication of whether processes completed successfully or not, and if not, the type of error that occurred that caused the termination.
Common exit codes include:
- 0: Normal (exit success)
- 1: Error (exit failure)
We can access the exit code for a child process via the multiprocessing.Process.exitcode attribute.
...
# report the process exit code
print(process.exitcode)
A process will not have an exit code until the process is terminated. This means that checking the exit code of a process before it has started or while it is running will return a None value.
We can demonstrate this with a worked example that executes a custom target function that blocks for one second.
First, we create a new process instance to execute our task() function, report the exit code, then start the process and report the exit code while the process is running.
...
# create the process
process = Process(target=task)
# report the exit status
print(process.exitcode)
# start the process
process.start()
# report the exit status
print(process.exitcode)
We can then block until the child process has terminated, then report the exit code.
...
# wait for the process to finish
process.join()
# report the exit status
print(process.exitcode)
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of checking the exit status of a child process
from time import sleep
from multiprocessing import Process

# function to execute in a new process
def task():
    sleep(1)

# entry point
if __name__ == '__main__':
    # create the process
    process = Process(target=task)
    # report the exit status
    print(process.exitcode)
    # start the process
    process.start()
    # report the exit status
    print(process.exitcode)
    # wait for the process to finish
    process.join()
    # report the exit status
    print(process.exitcode)
Running the example first creates a new process to execute our custom task function.
The status code of the child process is reported, which is None as expected. The process is started and blocks for one second. While it is running, the exit code is reported again, which again is None as expected.
The parent process then blocks until the child process terminates. The exit code is reported, and in this case we can see that a value of zero is reported indicating a normal or successful exit.
None
None
0
The exit code for a process can be set by calling sys.exit() and specifying an integer value, such as 1 for a non-successful exit.
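For example, a brief sketch (an illustrative addition) of a child process reporting failure via its exit code:

# sketch: setting a non-zero exit code in a child process
import sys
from multiprocessing import Process

# function that terminates the process with exit failure
def task():
    sys.exit(1)

# entry point
if __name__ == '__main__':
    # create, start, and wait for the child process
    process = Process(target=task)
    process.start()
    process.join()
    # report the exit code, expected to be 1
    print(process.exitcode)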
You can learn more about how to set exit codes in the tutorial:
Next, let’s look at how we might configure new child processes.
Configure Processes
Instances of the multiprocessing.Process class can be configured.
There are two properties of a process that can be configured, they are the name of the process and whether the process is a daemon or not.
Let’s take a closer look at each.
How to Configure Process Name
Processes can be assigned custom names.
The name of a process can be set via the “name” argument in the multiprocessing.Process constructor.
For example:
...
# create a process with a custom name
process = Process(name='MyProcess')
The example below demonstrates how to set the name of a process in the process class constructor.
# SuperFastPython.com
# example of setting the process name in the constructor
from multiprocessing import Process

# entry point
if __name__ == '__main__':
    # create a process with a custom name
    process = Process(name='MyProcess')
    # report process name
    print(process.name)
Running the example creates the child process with the custom name then reports the name of the process.
MyProcess
The name of the process can also be set via the “name” property.
For example:
...
# set the name
process.name = 'MyProcess'
The example below demonstrates this by creating an instance of a process and then setting the name via the property.
# SuperFastPython.com
# example of setting the process name via the property
from multiprocessing import Process

# entry point
if __name__ == '__main__':
    # create a process
    process = Process()
    # set the name
    process.name = 'MyProcess'
    # report process name
    print(process.name)
Running the example creates the process and sets the name then reports that the new name was assigned correctly.
MyProcess
You can learn more about how to configure the process name in this tutorial:
Next, let’s take a closer look at daemon processes.
How to Configure a Daemon Process
Processes can be configured to be “daemon” or “daemonic”, that is, they can be configured as background processes.
A parent Python process can only exit once all non-daemon child processes have exited.
This means that daemon child processes can run in the background and do not prevent a Python program from exiting when the “main parts of the program” have finished.
A process can be configured to be a daemon by setting the “daemon” argument to True in the multiprocessing.Process constructor.
For example:
...
# create a daemon process
process = Process(daemon=True)
The example below shows how to create a new daemon process.
# SuperFastPython.com
# example of setting a process to be a daemon via the constructor
from multiprocessing import Process

# entry point
if __name__ == '__main__':
    # create a daemon process
    process = Process(daemon=True)
    # report if the process is a daemon
    print(process.daemon)
Running the example creates a new process and configures it to be a daemon process via the constructor.
True
We can also configure a process to be a daemon process after it has been constructed via the “daemon” property.
For example:
...
# configure the process to be a daemon
process.daemon = True
The example below creates a new multiprocessing.Process instance then configures it to be a daemon process via the property.
# SuperFastPython.com
# example of setting a process to be a daemon via the property
from multiprocessing import Process

# entry point
if __name__ == '__main__':
    # create a process
    process = Process()
    # configure the process to be a daemon
    process.daemon = True
    # report if the process is a daemon
    print(process.daemon)
Running the example creates a new process instance then configures it to be a daemon process, then reports the daemon status.
True
You can learn more about daemon processes in this tutorial:
Now that we are familiar with how to configure new child processes, let’s take a closer look at the built-in main process.
Main Process
The main process is the parent process that executes your program.
In this section we will take a closer look at the main process in Python.
What is the Main Process?
The main process in Python is the process started when you run your Python program.
Recall that a process in Python is an instance of the Python interpreter. We can create and run new child processes via the multiprocessing.Process class.
What makes the main process unique is that it is the first process created when you run your program and the main thread of the process executes the entry point of your program.
As such the main process does not have a parent process.
The main process also has a distinct name, specifically “MainProcess“.
How Can the Main Process Be Identified?
There are a number of ways that the main process can be identified.
They are:
- The main process has a distinct name of “MainProcess“.
- The main process does not have a parent process.
- The main process is an instance of the multiprocessing.process._MainProcess class.
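For example, a small sketch (an illustrative addition; multiprocessing.parent_process() requires Python 3.8 or later) that applies the first two checks:

# sketch: identifying the main process
from multiprocessing import current_process
from multiprocessing import parent_process

# get the current process
process = current_process()
# check the distinct name and the absence of a parent process
if process.name == 'MainProcess' and parent_process() is None:
    print('This is the main process')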
How to Get the Main Process?
We can get a multiprocessing.Process instance for the main process.
There are a number of ways that we can do this, depending on where exactly we are in the code.
For example, if we are running code in the main process, we can get a multiprocessing.Process instance for the main process via the multiprocessing.current_process() function.
For example:
# SuperFastPython.com
# example of getting the main process from the main process
from multiprocessing import current_process

# get the process instance
process = current_process()
# report details of the main process
print(process)
Running the example gets the process instance for the main process.
<_MainProcess name='MainProcess' parent=None started>
If we are in a child process of the main process then we can get the main process via the multiprocessing.parent_process() function.
For example:
# SuperFastPython.com
# example of getting the main process from a child of the main process
from multiprocessing import parent_process
from multiprocessing import Process

# function executed in a child process
def task():
    # get the parent process instance
    process = parent_process()
    # report details of the main process
    print(process)

# entry point
if __name__ == '__main__':
    # create a new process
    process = Process(target=task)
    # start the new process
    process.start()
    # wait for the new process to terminate
    process.join()
Running the example starts a new child process to execute a custom task() function.
The child process gets the parent process instance which is the main process and reports its details.
<_ParentProcess name='MainProcess' parent=None unknown>
What is the Name of the Main Process?
The main process is assigned a distinct name when it is created.
The name of the parent process is ‘MainProcess‘.
It is distinct from the default name of child processes, that are named ‘Process-%d‘, where %d is an integer starting from 1 to indicate the child number created by the parent process, e.g. Process-1.
Nevertheless, process names can be changed any time and should not be relied upon to identify processes uniquely.
We can confirm the name of the main process by getting the process instance via the multiprocessing.current_process() function and checking the “name” attribute.
For example:
# SuperFastPython.com
# example of reporting the name of the main process
from multiprocessing import current_process

# get the main process
process = current_process()
# report the name of the main process
print(process.name)
Running the example gets the multiprocessing.Process instance for the main process, then reports the name of the process.
MainProcess
You can learn more about the main process in the tutorial:
Now that we are familiar with the main process, let’s look at some additional multiprocessing utilities.
Process Utilities
There are a number of utilities we can use when working with processes within a Python process.
These utilities are provided as multiprocessing module functions.
We have already seen two of these functions in the previous section, specifically multiprocessing.current_process() and multiprocessing.parent_process().
In this section we will review a number of additional utility functions.
Active Child Processes
We can get a list of all active child processes for a parent process.
This can be achieved via the multiprocessing.active_children() function that returns a list of all child processes that are currently running.
...
# get a list of all active child processes
children = active_children()
We can demonstrate this with a short example.
# SuperFastPython.com
# list all active child processes
from multiprocessing import active_children

# get a list of all active child processes
children = active_children()
# report a count of active children
print(f'Active Children Count: {len(children)}')
# report each in turn
for child in children:
    print(child)
Running the example reports the number of active child processes.
In this case, there are no active child processes.
Active Children Count: 0
We can update the example so that we first start a number of child processes, have them block for a moment, then get the list of active children.
This can be achieved by first defining a function to execute in a new process that blocks for a moment.
The task() function below implements this.
# function to execute in a new process
def task():
    # block for a moment
    sleep(1)
Next, we can create a number of processes configured to run our task() function, then start them.
...
# create a number of child processes
processes = [Process(target=task) for _ in range(5)]
# start the child processes
for process in processes:
    process.start()
We can then report the active child processes as before.
...
# get a list of all active child processes
children = active_children()
# report a count of active children
print(f'Active Children Count: {len(children)}')
# report each in turn
for child in children:
    print(child)
Tying this together, the complete example is listed below.
# SuperFastPython.com
# list all active child processes
from time import sleep
from multiprocessing import active_children
from multiprocessing import Process

# function to execute in a new process
def task():
    # block for a moment
    sleep(1)

# entry point
if __name__ == '__main__':
    # create a number of child processes
    processes = [Process(target=task) for _ in range(5)]
    # start the child processes
    for process in processes:
        process.start()
    # get a list of all active child processes
    children = active_children()
    # report a count of active children
    print(f'Active Children Count: {len(children)}')
    # report each in turn
    for child in children:
        print(child)
Running the example first creates five processes to run our task() function, then starts them.
A list of all active child processes is then retrieved.
The count is reported, which is five, as we expect, and the details of each process are then reported.
This highlights how we can access all active child processes from a parent process.
Active Children Count: 5
<Process name='Process-3' pid=16853 parent=16849 started>
<Process name='Process-4' pid=16854 parent=16849 started>
<Process name='Process-2' pid=16852 parent=16849 started>
<Process name='Process-1' pid=16851 parent=16849 started>
<Process name='Process-5' pid=16855 parent=16849 started>
Get The Number of CPU Cores
We may want to know the number of CPU cores available.
This can be determined via the multiprocessing.cpu_count() function.
The function returns an integer that indicates the number of logical CPU cores available in the system running the Python program.
Recall that a central processing unit or CPU executes program instructions. A CPU may have one or more physical CPU cores for executing code in parallel. Modern CPU cores may make use of hyperthreading, allowing each physical CPU core to operate like two (or more) logical CPU cores.
Therefore, a computer system with a CPU with four physical cores may report eight logical CPU cores via the multiprocessing.cpu_count() function.
It may be useful to know the number of CPU cores available to configure a process pool or to choose the number of child processes to create to execute tasks in parallel.
The example below demonstrates how to report the number of CPU cores.
# SuperFastPython.com
# example of reporting the number of logical cpu cores
from multiprocessing import cpu_count

# get the number of cpu cores
num_cores = cpu_count()
# report details
print(num_cores)
Running the example reports the number of logical CPU cores available in the system.
In this case, my system provides eight logical CPU cores, your result may differ.
8
You can learn more about getting the number of CPU Cores in the tutorial:
The Current Process
We can get a multiprocessing.Process instance for the process running the current code.
This can be achieved via the multiprocessing.current_process() function that returns a multiprocessing.Process instance.
...
# get the current process
process = current_process()
We can use this function to access the multiprocessing.Process for the MainProcess.
This can be demonstrated with a short example, listed below.
# SuperFastPython.com
# example of getting the current process
from multiprocessing import current_process

# get the current process
process = current_process()
# report details
print(process)
Running the example gets the process instance for the currently running process.
The details are then reported, showing that we have accessed the main process that has no parent process.
<_MainProcess name='MainProcess' parent=None started>
We can use this function within a child process, allowing the child process to access a process instance for itself.
First, we can define a custom function that gets the current process and reports its details.
The task() function below implements this.
# function executed in a new process
def task():
    # get the current process
    process = current_process()
    # report details
    print(process)
Next, we can create a new process to execute our custom task() function. Then start the process and wait for it to terminate.
...
# create a new process
process = Process(target=task)
# start the new process
process.start()
# wait for the child process to terminate
process.join()
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of getting the current process within a child process
from multiprocessing import Process
from multiprocessing import current_process

# function executed in a new process
def task():
    # get the current process
    process = current_process()
    # report details
    print(process)

# entry point
if __name__ == '__main__':
    # create a new process
    process = Process(target=task)
    # start the new process
    process.start()
    # wait for the child process to terminate
    process.join()
Running the example first creates a new process instance, then starts the process and waits for it to finish.
The new process then gets an instance of its own multiprocessing.Process instance and then reports the details.
The Parent Process
We may need to access the parent process for a current child process.
This can be achieved via the multiprocessing.parent_process() function. This will return a multiprocessing.Process instance for the parent of the current process.
The MainProcess does not have a parent, therefore attempting to get the parent of the MainProcess will return None.
We can demonstrate this with a worked example.
# SuperFastPython.com
# example of getting the parent process of the main process
from multiprocessing import parent_process

# get the parent process
process = parent_process()
# report details
print(process)
Running the example attempts to get the parent process of the MainProcess.
The function returns None, as expected, as the MainProcess does not have a parent process.
None
We can make the example more interesting by creating a new child process, then getting the parent of the child process, which will be the MainProcess.
This can be achieved by first defining a function to get the parent process and then report its details.
The task() function below implements this.
# function to execute in a new process
def task():
    # get the parent process
    process = parent_process()
    # report details
    print(process)
Next, we can create a new process instance and configure it to execute our task() function. We can then start the new process and wait for it to terminate.
...
# create a new process
process = Process(target=task)
# start the new process
process.start()
# wait for the new process to terminate
process.join()
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of getting the parent process of a child process
from multiprocessing import parent_process
from multiprocessing import Process

# function to execute in a new process
def task():
    # get the parent process
    process = parent_process()
    # report details
    print(process)

# entry point
if __name__ == '__main__':
    # create a new process
    process = Process(target=task)
    # start the new process
    process.start()
    # wait for the new process to terminate
    process.join()
Running the example first creates a new process configured to execute our task function, then starts the process and waits for it to terminate.
The new child process then gets a multiprocessing.Process instance for the parent process, then reports its details.
We can see that the parent of the new process is the MainProcess, as we expected.
<_ParentProcess name='MainProcess' parent=None unknown>
Process Mutex Lock
A mutex lock is used to protect critical sections of code from concurrent execution.
You can use a mutual exclusion (mutex) lock in Python via the multiprocessing.Lock class.
What is a Mutual Exclusion Lock
A mutual exclusion lock or mutex lock is a synchronization primitive intended to prevent a race condition.
A race condition is a concurrency failure case when two processes (or threads) run the same code and access or update the same resource (e.g. data variables, stream, etc.) leaving the resource in an unknown and inconsistent state.
Race conditions often result in unexpected behavior of a program and/or corrupt data.
These sensitive parts of code that can be executed by multiple processes concurrently and may result in race conditions are called critical sections. A critical section may refer to a single block of code, but it also refers to multiple accesses to the same data variable or resource from multiple functions.
A mutex lock can be used to ensure that only one process executes a critical section of code at a time, while all other processes trying to execute the same code must wait until the currently executing process finishes the critical section and releases the lock.
Each process must attempt to acquire the lock at the beginning of the critical section. If the lock has not yet been acquired, a process will acquire it; otherwise, the process must wait until the process holding the lock releases it.
If the lock has not been acquired, we might refer to it as being in the “unlocked” state. Whereas if the lock has been acquired, we might refer to it as being in the “locked” state.
- Unlocked: The lock has not been acquired and can be acquired by the next process that makes an attempt.
- Locked: The lock has been acquired by one process and any process that makes an attempt to acquire it must wait until it is released.
Locks are created in the unlocked state.
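To see why this matters, consider the following sketch (an illustrative addition, not part of the original text; it previews the multiprocessing.Lock and multiprocessing.Value APIs covered below). Several processes increment a shared value many times. The increment is a read-modify-write operation and is not atomic, so without a lock some updates may be lost; with a lock protecting the critical section, the total is always correct. Your results will vary with platform and timing.

# sketch: demonstrating a race condition and fixing it with a mutex lock
from multiprocessing import Process
from multiprocessing import Value
from multiprocessing import Lock

# increment the shared value without a lock (unsafe)
def unsafe_task(shared):
    for _ in range(100000):
        # read-modify-write: not atomic, a race condition
        shared.value += 1

# increment the shared value while holding a lock (safe)
def safe_task(shared, lock):
    for _ in range(100000):
        # the critical section is protected by the mutex
        with lock:
            shared.value += 1

# helper to run four processes against a shared value
def run(target, args):
    processes = [Process(target=target, args=args) for _ in range(4)]
    for process in processes:
        process.start()
    for process in processes:
        process.join()

# entry point
if __name__ == '__main__':
    # without a lock: often less than the expected 400000
    shared = Value('i', 0)
    run(unsafe_task, (shared,))
    print(f'Without lock: {shared.value}')
    # with a lock: always 400000
    shared = Value('i', 0)
    lock = Lock()
    run(safe_task, (shared, lock))
    print(f'With lock: {shared.value}')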
Now that we know what a mutex lock is, let’s take a look at how we can use it in Python.
How to Use a Mutex Lock
Python provides a mutual exclusion lock for use with processes via the multiprocessing.Lock class.
An instance of the lock can be created and then acquired by processes before accessing a critical section, and released after the critical section.
For example:
...
# create a lock
lock = multiprocessing.Lock()
# acquire the lock
lock.acquire()
# ...
# release the lock
lock.release()
Only one process can have the lock at any time. If a process does not release an acquired lock, it cannot be acquired again.
A process attempting to acquire an already-held lock will block until the lock becomes available, for example when the process currently holding it releases it.
We can attempt to acquire the lock without blocking by setting the “block” argument to False. If the lock cannot be acquired, a value of False is returned.
...
# acquire the lock without blocking
lock.acquire(block=False)
We can also attempt to acquire the lock with a timeout, that will wait the set number of seconds to acquire the lock before giving up. If the lock cannot be acquired, a value of False is returned.
...
# acquire the lock with a timeout
lock.acquire(timeout=10)
We can also use the lock via the context manager protocol with the with statement, making the critical section a block within the usage of the lock and releasing the lock automatically once the block has completed.
For example:
...
# create a lock
lock = multiprocessing.Lock()
# acquire the lock
with lock:
    # ...
This is the preferred usage as it makes it clear where the protected code begins and ends, and ensures that the lock is always released, even if there is an exception or error within the critical section.
Note that, unlike threading.Lock, the multiprocessing.Lock class does not provide a locked() method to check whether the lock is currently acquired. If we need to test the lock without blocking, we can attempt a non-blocking acquire, e.g. lock.acquire(block=False).
Now that we know how to use the multiprocessing.Lock class, let’s look at a worked example.
Example of Using a Mutex Lock
We can develop an example to demonstrate how to use the mutex lock.
First, we can define a target task function that takes a lock as an argument and uses the lock to protect a critical section.
In this case, the critical section involves reporting a message and blocking for a fraction of a second.
# work function
def task(lock, identifier, value):
    # acquire the lock
    with lock:
        print(f'>process {identifier} got the lock, sleeping for {value}')
        sleep(value)
Next, we can create one instance of the multiprocessing.Lock to be shared among the processes.
...
# create the shared lock
lock = Lock()
We can then create many processes, each configured to execute our task() function and compete to execute the critical section.
Each process will receive the shared lock as an argument as well as an integer id between 0 and 9 and a random time to sleep in seconds between 0 and 1.
We can implement this via a list comprehension, creating a list of ten configured multiprocessing.Process instances.
...
# create a number of processes with different sleep times
processes = [Process(target=task, args=(lock, i, random())) for i in range(10)]
Next, we can start all of the processes.
...
# start the processes
for process in processes:
    process.start()
Finally, we can wait for all of the new child processes to terminate.
...
# wait for all processes to finish
for process in processes:
    process.join()
Tying this together, the complete example of using a lock is listed below.
# SuperFastPython.com
# example of a mutual exclusion (mutex) lock for processes
from time import sleep
from random import random
from multiprocessing import Process
from multiprocessing import Lock

# work function
def task(lock, identifier, value):
    # acquire the lock
    with lock:
        print(f'>process {identifier} got the lock, sleeping for {value}')
        sleep(value)

# entry point
if __name__ == '__main__':
    # create the shared lock
    lock = Lock()
    # create a number of processes with different sleep times
    processes = [Process(target=task, args=(lock, i, random())) for i in range(10)]
    # start the processes
    for process in processes:
        process.start()
    # wait for all processes to finish
    for process in processes:
        process.join()
Running the example starts ten processes that are all configured to execute our custom function.
The child processes are then started and the main process blocks until all child processes finish.
Each child process attempts to acquire the lock within the task() function. Only one process can acquire the lock at a time and once they do, they report a message including their id and how long they will sleep. The process then blocks for a fraction of a second before releasing the lock.
Your specific results may vary given the use of random numbers. Try running the example a few times.
>process 2 got the lock, sleeping for 0.34493199862842716
>process 0 got the lock, sleeping for 0.1690829274493061
>process 1 got the lock, sleeping for 0.586700038562483
>process 3 got the lock, sleeping for 0.8439760508777033
>process 4 got the lock, sleeping for 0.49642440261633747
>process 6 got the lock, sleeping for 0.7291278047802177
>process 5 got the lock, sleeping for 0.4495745681185115
>process 7 got the lock, sleeping for 0.6844618818829677
>process 8 got the lock, sleeping for 0.21518155457911792
>process 9 got the lock, sleeping for 0.30577395898093285
Now that we are familiar with multiprocessing mutex locks, let’s take a look at the reentrant lock.
Process Reentrant Lock
A reentrant lock is a lock that can be acquired more than once by the same process.
You can use reentrant locks in Python via the multiprocessing.RLock class.
What is a Reentrant Lock
A reentrant mutual exclusion lock, “reentrant mutex” or “reentrant lock” for short, is like a mutex lock except it allows a process (or thread) to acquire the lock more than once.
A process may need to acquire the same lock more than once for many reasons.
We can imagine critical sections spread across a number of functions, each protected by the same lock. A process may call across these functions in the course of normal execution and may call into one critical section from another critical section.
A limitation of a (non-reentrant) mutex lock is that a process that has already acquired the lock cannot acquire it again. In fact, this situation will result in a deadlock: the process will wait forever for the lock to be released so that it can be acquired, yet it holds the lock and will never release it.
A reentrant lock will allow a process to acquire the same lock again if it has already acquired it. This allows the process to execute critical sections from within critical sections, as long as they are protected by the same reentrant lock.
Each time a process acquires the lock it must also release it, meaning that there are recursive levels of acquire and release for the owning process. As such, this type of lock is sometimes called a “recursive mutex lock”.
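For example, here is a minimal sketch of nested acquisition; with a plain multiprocessing.Lock the inner acquisition would deadlock, but a reentrant lock permits it:

...
# create a reentrant lock
lock = multiprocessing.RLock()
# first acquisition by this process
with lock:
    # nested acquisition by the same process, allowed by the RLock
    with lock:
        # critical section within a critical section...
        pass
# the lock is only fully released once every acquisition is released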
Now that we are familiar with the reentrant lock, let’s take a closer look at how to use it in Python.
How to Use the Reentrant Lock
Python provides a reentrant lock for processes via the multiprocessing.RLock class.
An instance of the multiprocessing.RLock can be created and then acquired by processes before accessing a critical section, and released after the critical section.
For example:
...
# create a reentrant lock
lock = multiprocessing.RLock()
# acquire the lock
lock.acquire()
# ...
# release the lock
lock.release()
A process attempting to acquire the lock will block until the lock becomes available, for example when another process that currently holds it (once or more than once) releases it.
We can attempt to acquire the lock without blocking by setting the “block” argument to False. If the lock cannot be acquired, a value of False is returned.
...
# acquire the lock without blocking
lock.acquire(block=False)
We can also attempt to acquire the lock with a timeout, which will wait up to the set number of seconds to acquire the lock before giving up. If the lock cannot be acquired within the timeout, a value of False is returned.
...
# acquire the lock with a timeout
lock.acquire(timeout=10)
We can also use the reentrant lock via the context manager protocol with the with statement, which makes the critical section a block within the usage of the lock and releases the lock automatically once the block is exited.
For example:
...
# create a reentrant lock
lock = multiprocessing.RLock()
# acquire the lock
with lock:
    # ...
Now that we know how to use the multiprocessing.RLock class, let’s look at a worked example.
Example of Using a Reentrant Lock
We can develop an example to demonstrate how to use the multiprocessing.RLock for processes.
First, we can define a function to report that a process is done that protects the print() statement with a lock.
# reporting function
def report(lock, identifier):
    # acquire the lock
    with lock:
        print(f'>process {identifier} done')
Next, we can define a task function that reports a message, blocks for a moment, then calls the reporting function. All of the work is protected with the lock.
# work function
def task(lock, identifier, value):
    # acquire the lock
    with lock:
        print(f'>process {identifier} sleeping for {value}')
        sleep(value)
        # report
        report(lock, identifier)
Given that the target task function is protected with a lock and calls the reporting function that is also protected by the same lock, we can use a reentrant lock so that if a process acquires the lock in task(), it will be able to re-enter the lock in the report() function.
Next, we can create the reentrant lock.
...
# create a shared reentrant lock
lock = RLock()
We can then create many processes, each configured to execute our task() function.
Each process will receive the shared multiprocessing.RLock as an argument as well as an integer id between 0 and 9 and a random time to sleep in seconds between 0 and 1.
We can implement this via a list comprehension, creating a list of ten configured multiprocessing.Process instances.
...
# create processes
processes = [Process(target=task, args=(lock, i, random())) for i in range(10)]
Next, we can start all of the processes.
...
# start child processes
for process in processes:
    process.start()
Finally, we can wait for all of the new child processes to terminate.
...
# wait for child processes to finish
for process in processes:
    process.join()
Tying this together, the complete example of using a reentrant lock is listed below.
# SuperFastPython.com
# example of a reentrant lock for processes
from time import sleep
from random import random
from multiprocessing import Process
from multiprocessing import RLock

# reporting function
def report(lock, identifier):
    # acquire the lock
    with lock:
        print(f'>process {identifier} done')

# work function
def task(lock, identifier, value):
    # acquire the lock
    with lock:
        print(f'>process {identifier} sleeping for {value}')
        sleep(value)
        # report
        report(lock, identifier)

# entry point
if __name__ == '__main__':
    # create a shared reentrant lock
    lock = RLock()
    # create processes
    processes = [Process(target=task, args=(lock, i, random())) for i in range(10)]
    # start child processes
    for process in processes:
        process.start()
    # wait for child processes to finish
    for process in processes:
        process.join()
Running the example starts up ten processes that execute the target task function.
Only one process can acquire the lock at a time; the process that does then blocks for a moment before re-entering the same lock to report the done message via the report() function.
If a non-reentrant lock (e.g. a multiprocessing.Lock) was used instead, the process would block forever waiting for the lock to become available, which it never will, because the process itself already holds it.
Note, your specific results will differ given the use of random numbers. Try running the example a few times.
>process 0 sleeping for 0.9703475136810683
>process 0 done
>process 1 sleeping for 0.10372469305828702
>process 1 done
>process 2 sleeping for 0.26627777997152036
>process 2 done
>process 3 sleeping for 0.9821832886127358
>process 3 done
>process 6 sleeping for 0.005591916432016064
>process 6 done
>process 5 sleeping for 0.6150762561153148
>process 5 done
>process 4 sleeping for 0.3145220383413917
>process 4 done
>process 7 sleeping for 0.8961655132345371
>process 7 done
>process 8 sleeping for 0.5968254072867757
>process 8 done
>process 9 sleeping for 0.8139723778675512
>process 9 done
Now that we are familiar with the reentrant lock, let’s look at condition variables.
Process Condition Variable
A condition allows processes to wait and be notified.
You can use a process condition object in Python via the multiprocessing.Condition class.
What is a Process Condition Variable
In concurrency, a condition variable (also called a monitor) allows multiple processes (or threads) to be notified about some result.
It combines a mutual exclusion lock (mutex) with the ability for processes to wait on, and be notified about, a condition.
A mutex can be used to protect a critical section, but it cannot be used to alert other processes that a condition has changed or been met.
A condition can be acquired by a process (like a mutex) after which it can wait to be notified by another process that something has changed. While waiting, the process is blocked and releases the lock for other processes to acquire.
Another process can then acquire the condition, make a change, and notify one, all, or a subset of processes waiting on the condition that something has changed. The waiting process can then wake up (be scheduled by the operating system), re-acquire the condition (mutex), perform checks on any changed state, and perform required actions.
This highlights that a condition makes use of a mutex internally (to acquire/release the condition), but it also offers additional features such as allowing processes to wait on the condition and to allow processes to notify other processes waiting on the condition.
Now that we know what a condition is, let’s look at how we might use it in Python.
How to Use a Condition Variable
Python provides a condition variable via the multiprocessing.Condition class.
We can create a condition variable, and by default it will create a new reentrant mutex lock (the multiprocessing.RLock class) to be used internally.
...
# create a new condition variable
condition = multiprocessing.Condition()
We may have a reentrant mutex or a non-reentrant mutex that we wish to reuse in the condition for some good reason, in which case we can provide it to the constructor.
I don’t recommend this unless you know your use case has this requirement. The chance of getting into trouble is high.
...
# create a new condition with a custom lock
condition = multiprocessing.Condition(lock=my_lock)
In order for a process to make use of the condition, it must acquire it and release it, like a mutex lock.
This can be achieved manually with the acquire() and release() functions.
For example, we can acquire the condition and then wait on the condition to be notified and finally release the condition as follows:
...
# acquire the condition
condition.acquire()
# wait to be notified
condition.wait()
# release the condition
condition.release()
An alternative to calling the acquire() and release() functions directly is to use the context manager, which will perform the acquire/release automatically for us, for example:
...
# acquire the condition
with condition:
    # wait to be notified
    condition.wait()
The wait() function will wait forever until notified by default. We can also pass a “timeout” argument which will allow the process to stop blocking after a time limit in seconds.
For example:
...
# acquire the condition
with condition:
    # wait to be notified
    condition.wait(timeout=10)
The multiprocessing.Condition class also provides a wait_for() function that will only unblock the waiting process once a condition is met, expressed as a callable that returns a boolean value.
The callable can be passed to the wait_for() function directly, and the function also takes an optional “timeout” argument in seconds.
...
# acquire the condition
with condition:
    # wait to be notified and a function to return true
    condition.wait_for(all_data_collected)
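As a fuller sketch (the names here, such as all_data_collected and counter, are hypothetical and not from the library), the condition function might check some shared state:

...
# hypothetical shared state updated by worker processes
counter = multiprocessing.Value('i', 0)

# hypothetical condition function, true once ten results are recorded
def all_data_collected():
    return counter.value >= 10

# acquire the condition
with condition:
    # block until the condition function returns True
    condition.wait_for(all_data_collected)

The wait_for() function checks the callable once up front and again each time the process is notified, so the process only unblocks once the condition holds (or the optional timeout expires).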
We also must acquire the condition in a process if we wish to notify waiting processes. This too can be achieved directly with the acquire/release function calls or via the context manager.
We can notify a single waiting process via the notify() function.
For example:
...
# acquire the condition
with condition:
    # notify a waiting process
    condition.notify()
The notified process will stop blocking as soon as it can re-acquire the mutex within the condition. This is attempted automatically as part of its call to wait() or wait_for(); you do not need to do anything extra.
If more than one process is waiting on the condition, we will not know which process will be notified.
We can also notify a subset of waiting processes by setting the “n” argument to an integer number of processes to notify, for example:
...
# acquire the condition
with condition:
    # notify 3 waiting processes
    condition.notify(n=3)
Finally, we can notify all processes waiting on the condition via the notify_all() function.
...
# acquire the condition
with condition:
    # notify all processes waiting on the condition
    condition.notify_all()
A final reminder: a process must acquire the condition before waiting on it or notifying waiting processes. A failure to acquire the condition (the lock within the condition) before performing these actions will result in a RuntimeError.
Now that we know how to use the multiprocessing.Condition class, let’s look at some worked examples.
Example of Wait and Notify With a Condition Variable
In this section we will explore using a multiprocessing.Condition to notify a waiting process that something has happened.
We will create a new child process to simulate performing some work that the main process is dependent upon. Once prepared, the child process will notify the waiting main process, then the main process will continue on.
First, we will define a target task function to execute in a new child process.
The function will take the condition variable. The function will block for a moment to simulate performing a task, then notify the waiting main process.
The complete target task() function is listed below.
# target function to prepare some work
def task(condition):
    # block for a moment
    sleep(1)
    # notify a waiting process that the work is done
    print('Child process sending notification...', flush=True)
    with condition:
        condition.notify()
    # do something else...
    sleep(1)
In the main process, first we can create the shared condition variable.
...
# create a condition
condition = Condition()
Next, we can acquire the condition variable, so that we can wait on it later.
...
# wait to be notified that the data is ready
print('Main process waiting for data...')
with condition:
    # ...
The main process will then create a new child process that performs some work and then notifies the waiting main process once the work is prepared.
Next, we can start a new child process calling our target task function and wait on the condition variable to be notified of the result.
...
# start a new process to perform some work
worker = Process(target=task, args=(condition,))
worker.start()
# wait to be notified
condition.wait()
Note, we must start the new process after we have acquired the mutex lock in the condition variable in this example.
If we did not acquire the lock first, it is possible that there would be a race condition. Specifically, if we started the new child process before acquiring the condition and waiting in the main process, then it is possible for the new process to execute and notify before the main process has had a chance to start waiting. In which case the main process would wait forever to be notified.
Finally, we can report that the main process is all done.
...
# we know the data is ready
print('Main process all done')
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of wait/notify with a condition for processes
from time import sleep
from multiprocessing import Process
from multiprocessing import Condition

# target function to prepare some work
def task(condition):
    # block for a moment
    sleep(1)
    # notify a waiting process that the work is done
    print('Child process sending notification...', flush=True)
    with condition:
        condition.notify()
    # do something else...
    sleep(1)

# entry point
if __name__ == '__main__':
    # create a condition
    condition = Condition()
    # wait to be notified that the data is ready
    print('Main process waiting for data...')
    with condition:
        # start a new process to perform some work
        worker = Process(target=task, args=(condition,))
        worker.start()
        # wait to be notified
        condition.wait()
    # we know the data is ready
    print('Main process all done')
Running the example first creates the condition variable.
The condition variable is acquired, then a new child process is created and started.
The child process blocks for a moment to simulate work, then notifies the waiting main process.
Meanwhile the main process waits to be notified by the child process, then once notified it continues on.
Main process waiting for data...
Child process sending notification...
Main process all done
Next, let’s look at how to use a semaphore.
Process Semaphore
A semaphore is essentially a counter protected by a mutex lock, used to limit the number of processes that can access a resource.
You can use a semaphore in Python via the multiprocessing.Semaphore class.
What is a Semaphore
A semaphore is a concurrency primitive that allows a limit on the number of processes (or threads) that can acquire a lock protecting a critical section.
It is an extension of a mutual exclusion (mutex) lock that adds a count of the number of processes that can acquire the lock before additional processes will block. Once full, new processes can only acquire a position on the semaphore once an existing process holding the semaphore releases a position.
Internally, the semaphore maintains a counter protected by a mutex lock that is decremented each time the semaphore is acquired and incremented each time it is released.
When a semaphore is created, the upper limit on the counter is set. If it is set to be 1, then the semaphore will operate like a mutex lock.
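For example (a sketch, not from the worked example below), a semaphore created with a count of one is effectively a mutex:

...
# a semaphore with a single position behaves like a mutex lock
semaphore = multiprocessing.Semaphore(1)
# only one process at a time can execute this block
with semaphore:
    # critical section...
    pass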
A semaphore provides a useful concurrency tool for limiting the number of processes that can access a resource concurrently. Some examples include:
- Limiting concurrent socket connections to a server.
- Limiting concurrent file operations on a hard drive.
- Limiting concurrent calculations.
Now that we know what a semaphore is, let’s look at how we might use it in Python.
How to Use a Semaphore
Python provides a semaphore for processes via the multiprocessing.Semaphore class.
The multiprocessing.Semaphore instance must be configured when it is created to set the limit on the internal counter. This limit will match the number of concurrent processes that can hold the semaphore.
For example, we might want to set it to 100:
...
# create a semaphore with a limit of 100
semaphore = multiprocessing.Semaphore(100)
In this implementation, each time the semaphore is acquired, the internal counter is decremented. Each time the semaphore is released, the internal counter is incremented. The semaphore cannot be acquired if it has no available positions, in which case processes attempting to acquire it must block until a position becomes available.
The semaphore can be acquired by calling the acquire() function, for example:
...
# acquire the semaphore
semaphore.acquire()
By default, it is a blocking call, which means that the calling process will block until access becomes available on the semaphore.
The “block” argument can be set to False in which case, if access is not available on the semaphore, the process will not block and the function will return immediately, returning a False value indicating that the semaphore was not acquired or a True value if access could be acquired.
...
# acquire the semaphore without blocking
semaphore.acquire(block=False)
The “timeout” argument can be set to a number of seconds that the calling process is willing to wait for access to the semaphore if one is not available, before giving up. Again, the acquire() function will return a value of True if access could be acquired or False otherwise.
...
# acquire the semaphore with a timeout
semaphore.acquire(timeout=10)
Once acquired, the semaphore can be released again by calling the release() function.
...
# release the semaphore
semaphore.release()
More than one position can be made available by calling release() multiple times. Note that, unlike the threading.Semaphore class (whose release() function accepts an “n” argument in Python 3.9+), the multiprocessing.Semaphore release() function does not take an argument.
Releasing extra positions might be helpful if it is known a process has died without correctly releasing the semaphore, or if one process acquired the same semaphore more than once.
Do not do this unless you have a clear use case; it is likely to leave the semaphore in an inconsistent state.
...
# release three positions on the semaphore
for _ in range(3):
    semaphore.release()
Finally, the multiprocessing.Semaphore class supports usage via the context manager, which will automatically acquire and release the semaphore for you. As such it is the preferred usage, if appropriate for your program.
For example:
...
# acquire the semaphore
with semaphore:
    # ...
Now that we know how to use the multiprocessing.Semaphore in Python, let’s look at a worked example.
Example of Using a Semaphore
We can explore how to use a multiprocessing.Semaphore with a worked example.
We will develop an example with a suite of processes but a limit on the number of processes that can perform an action simultaneously. A semaphore will be used to limit the number of concurrent processes which will be less than the total number of processes, allowing some processes to block, wait for access, then be notified and acquire access.
First, we can define a target task function that takes the shared semaphore and a unique integer as arguments. The function will attempt to acquire the semaphore, and once access is acquired it will simulate some processing by generating a random number and blocking for a moment, then report its data and release the semaphore.
The complete task() function is listed below.
# target function
def task(semaphore, number):
    # attempt to acquire the semaphore
    with semaphore:
        # simulate computational effort
        value = random()
        sleep(value)
        # report result
        print(f'Process {number} got {value}')
The main process will first create the multiprocessing.Semaphore instance and limit the number of concurrent processes to 2.
...
# create the shared semaphore
semaphore = Semaphore(2)
Next, we will create and start 10 processes and pass each the shared semaphore instance and a unique integer to identify the process.
We can implement this via a list comprehension, creating a list of ten configured multiprocessing.Process instances.
...
# create processes
processes = [Process(target=task, args=(semaphore, i)) for i in range(10)]
Next, we can start all of the processes.
...
# start child processes
for process in processes:
    process.start()
Finally, we can wait for all of the new child processes to terminate.
...
# wait for child processes to finish
for process in processes:
    process.join()
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of using a semaphore
from time import sleep
from random import random
from multiprocessing import Process
from multiprocessing import Semaphore

# target function
def task(semaphore, number):
    # attempt to acquire the semaphore
    with semaphore:
        # simulate computational effort
        value = random()
        sleep(value)
        # report result
        print(f'Process {number} got {value}')

# entry point
if __name__ == '__main__':
    # create the shared semaphore
    semaphore = Semaphore(2)
    # create processes
    processes = [Process(target=task, args=(semaphore, i)) for i in range(10)]
    # start child processes
    for process in processes:
        process.start()
    # wait for child processes to finish
    for process in processes:
        process.join()
Running the example first creates the shared semaphore instance then starts ten child processes.
All ten processes attempt to acquire the semaphore, but only two processes are granted access at a time. The processes holding the semaphore do their simulated work and release the semaphore, at random intervals, when they are done.
Each release of the semaphore (via the context manager) allows another process to acquire access and perform its simulated calculation, all the while allowing only two processes to be running within the critical section at any one time, even though all ten processes are executing.
Note, your specific values will differ given the use of random numbers.
Process 3 got 0.18383690831569133
Process 2 got 0.6897978479922813
Process 1 got 0.9585657826673842
Process 6 got 0.1590348392237605
Process 0 got 0.49322767126623646
Process 4 got 0.5704432231809451
Process 5 got 0.3505128460341568
Process 7 got 0.3135061835434463
Process 8 got 0.47103805023306533
Process 9 got 0.21838069804132387
Next, let’s look at event objects.
Process Event
An event is a process-safe boolean flag.
You can use an Event Object in Python via the multiprocessing.Event class.
How to Use an Event Object
Python provides an event object for processes via the multiprocessing.Event class.
An event is a simple concurrency primitive that allows communication between processes.
A multiprocessing.Event object wraps a boolean variable that can either be “set” (True) or “not set” (False). Processes sharing the event instance can check if the event is set, set the event, clear the event (make it not set), or wait for the event to be set.
The multiprocessing.Event provides an easy way to share a boolean variable between processes that can act as a trigger for an action.
First, an event object must be created and the event will be in the “not set” state.
...
# create an instance of an event
event = multiprocessing.Event()
Once created we can check if the event has been set via the is_set() function which will return True if the event is set, or False otherwise.
For example:
...
# check if the event is set
if event.is_set():
    # do something...
The event can be set via the set() function. Any processes waiting on the event to be set will be notified.
For example:
...
# set the event
event.set()
The event can be marked as “not set” (whether it is currently set or not) via the clear() function.
...
# mark the event as not set
event.clear()
Finally, processes can wait for the event to be set via the wait() function. Calling this function will block until the event is marked as set (e.g. by another process calling the set() function). If the event is already set, the wait() function will return immediately.
...
# wait for the event to be set
event.wait()
A “timeout” argument can be passed to the wait() function which will limit how long a process is willing to wait in seconds for the event to be marked as set.
The wait() function will return True if the event was set while waiting, otherwise a value of False returned indicates that the event was not set and the call timed-out.
...
# wait for the event to be set with a timeout
event.wait(timeout=10)
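Since wait() reports via its return value whether the event was set, a sketch of guarding against the timeout case might look like the following (the message is purely illustrative):

...
# wait up to 10 seconds for the event to be set
if event.wait(timeout=10):
    # the event was set while we were waiting
    pass
else:
    # the call timed out and the event is still not set
    print('Gave up waiting for the event')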
Now that we know how to use a multiprocessing.Event, let’s look at a worked example.
Example of Using an Event Object
We can explore how to use a multiprocessing.Event object.
In this example we will create a suite of processes that each will perform some processing and report a message. All processes will use an event to wait to be set before starting their work. The main process will set the event and trigger the child processes to start work.
First, we can define a target task function that takes the shared multiprocessing.Event instance and a unique integer to identify the process.
# target task function
def task(event, number):
    # ...
Next, the function will wait for the event to be set before starting the processing work.
...
# wait for the event to be set
print(f'Process {number} waiting...', flush=True)
event.wait()
Once triggered, the process will generate a random number, block for a moment and report a message.
...
# begin processing
value = random()
sleep(value)
print(f'Process {number} got {value}', flush=True)
Tying this together, the complete target task function is listed below.
# target task function
def task(event, number):
    # wait for the event to be set
    print(f'Process {number} waiting...', flush=True)
    event.wait()
    # begin processing
    value = random()
    sleep(value)
    print(f'Process {number} got {value}', flush=True)
The main process will first create the shared multiprocessing.Event instance, which will be in the “not set” state by default.
...
# create a shared event object
event = Event()
Next, we can create and configure five new processes specifying the target task() function with the event object and a unique integer as arguments.
This can be achieved in a list comprehension.
...
# create a suite of processes
processes = [Process(target=task, args=(event, i)) for i in range(5)]
We can then start all child processes.
...
# start all processes
for process in processes:
    process.start()
Next, the main process will block for a moment, then trigger the processing in all of the child processes via the event object.
...
# block for a moment
print('Main process blocking...')
sleep(2)
# trigger all child processes
event.set()
The main process will then wait for all child processes to terminate.
...
# wait for all child processes to terminate
for process in processes:
    process.join()
Tying this all together, the complete example is listed below.
# SuperFastPython.com
# example of using an event object with processes
from time import sleep
from random import random
from multiprocessing import Process
from multiprocessing import Event

# target task function
def task(event, number):
    # wait for the event to be set
    print(f'Process {number} waiting...', flush=True)
    event.wait()
    # begin processing
    value = random()
    sleep(value)
    print(f'Process {number} got {value}', flush=True)

# entry point
if __name__ == '__main__':
    # create a shared event object
    event = Event()
    # create a suite of processes
    processes = [Process(target=task, args=(event, i)) for i in range(5)]
    # start all processes
    for process in processes:
        process.start()
    # block for a moment
    print('Main process blocking...')
    sleep(2)
    # trigger all child processes
    event.set()
    # wait for all child processes to terminate
    for process in processes:
        process.join()
Running the example first creates and starts five child processes.
Each child process waits on the event before it starts its work, reporting a message that it is waiting.
The main process blocks for a moment, allowing all child processes to begin and start waiting on the event.
The main process then sets the event. This triggers all five child processes that perform their simulated work and report a message.
Note, your specific results will differ given the use of random numbers.
Main process blocking...
Process 0 waiting...
Process 1 waiting...
Process 2 waiting...
Process 3 waiting...
Process 4 waiting...
Process 0 got 0.06198821143561384
Process 4 got 0.219334069761699
Process 3 got 0.7335552378594119
Process 1 got 0.7948771419640999
Process 2 got 0.8713839353896263
Next, let’s take a closer look at process barriers.
Process Barrier
A barrier allows you to coordinate child processes.
You can use a process barrier in Python via the multiprocessing.Barrier class.
What is a Barrier
A barrier is a synchronization primitive.
It allows multiple processes (or threads) to wait on the same barrier object instance (e.g. at the same point in code) until a predefined fixed number of processes arrive (e.g. the barrier is full), after which all processes are then notified and released to continue their execution.
Internally, a barrier maintains a count of the number of processes waiting on the barrier and a configured maximum number of parties (processes) that are expected. Once the number of waiting parties reaches the configured maximum, all waiting processes are notified and released.
This provides a useful mechanism to coordinate actions between multiple processes.
Now that we know what a barrier is, let’s look at how we might use it in Python.
How to Use a Barrier
Python provides a barrier via the multiprocessing.Barrier class.
A barrier instance must first be created and configured via the constructor specifying the number of parties (processes) that must arrive before the barrier will be lifted.
For example:
...
# create a barrier
barrier = multiprocessing.Barrier(10)
We can also perform an action once all processes reach the barrier which can be specified via the “action” argument in the constructor.
This action must be a callable such as a function or a lambda that does not take any arguments and will be executed by one process once all processes reach the barrier but before the processes are released.
...
# configure a barrier with an action
barrier = multiprocessing.Barrier(10, action=my_function)
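For instance, a sketch with a concrete action (the function name here is hypothetical):

...
# hypothetical action run by one process once all parties arrive
def report_arrival():
    print('All parties have arrived at the barrier')

# configure the barrier so the action runs before processes are released
barrier = multiprocessing.Barrier(10, action=report_arrival)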
We can also set a default timeout used by all processes that reach the barrier and call the wait() function.
The default timeout can be set via the “timeout” argument in seconds in the constructor.
...
# configure a barrier with a default timeout
barrier = multiprocessing.Barrier(10, timeout=5)
Once configured, the barrier instance can be shared between processes and used.
A process can reach and wait on the barrier via the wait() function, for example:
...
# wait on the barrier for all other processes to arrive
barrier.wait()
This is a blocking call and will return once all other processes (the pre-configured number of parties) have reached the barrier.
The wait() function returns an integer, unique for each process, in the range 0 to one less than the number of parties; it does not report the number of parties remaining to arrive. This is helpful if you want exactly one process to perform an action once the barrier is released, an alternative to using the “action” argument in the constructor.
...
# wait on the barrier
index = barrier.wait()
# after release, have exactly one process perform an action
if index == 0:
    print('Only one of us prints this...')
A timeout can be set on the call to wait() in seconds via the “timeout” argument. If the timeout expires before all parties reach the barrier, a BrokenBarrierError will be raised in all processes waiting on the barrier and the barrier will be marked as broken.
If a timeout is used via the “timeout” argument or the default timeout in the constructor, then all calls to the wait() function may need to handle the BrokenBarrierError.
...
# wait on the barrier for all other processes to arrive
try:
    barrier.wait()
except BrokenBarrierError:
    # ...
We can also abort the barrier.
Aborting the barrier means that all processes waiting on the barrier via the wait() function will raise a BrokenBarrierError and the barrier will be put in the broken state.
This might be helpful if you wish to cancel the coordination effort.
...
# abort the barrier
barrier.abort()
A broken barrier cannot be used. Calls to wait() will raise a BrokenBarrierError.
A barrier can be fixed and made ready for use again by calling the reset() function.
This might be helpful if you cancel a coordination effort but wish to retry it with the same barrier instance.
...
# reset a broken barrier
barrier.reset()
Finally, the status of the barrier can be checked via attributes.
- parties: reports the configured number of parties that must reach the barrier for it to be lifted.
- n_waiting: reports the current number of processes waiting on the barrier.
- broken: indicates whether the barrier is currently broken or not.
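For example, a quick sketch of inspecting these attributes on an existing barrier instance:

...
# report the status of the barrier (assumes an existing barrier instance)
print(f'parties={barrier.parties}, waiting={barrier.n_waiting}, broken={barrier.broken}')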
Now that we know how to use the barrier in Python, let’s look at some worked examples.
Example of Using a Process Barrier
We can explore how to use a multiprocessing.Barrier with a worked example.
In this example we will create a suite of processes, each required to perform some blocking calculation. We will use a barrier to coordinate all processes after they have finished their work and perform some action in the main process. This is a good proxy for the types of coordination we may need to perform with a barrier.
First, let’s define a target task function to execute by each process. The function will take the barrier as an argument as well as a unique identifier for the process.
The process will generate a random value between 0 and 10, block for that many seconds, report the result, then wait on the barrier for all other processes to perform their computation.
The complete target task function is listed below.
# target function to prepare some work
def task(barrier, number):
    # generate a unique value
    value = random() * 10
    # block for a moment
    sleep(value)
    # report result
    print(f'Process {number} done, got: {value}', flush=True)
    # wait on all other processes to complete
    barrier.wait()
Next in the main process we can create the barrier.
We need one party for each process we intend to create, five in this case, as well as an additional party for the main process that will also wait on the barrier.
...
# create a barrier
barrier = Barrier(5 + 1)
Next, we can create and start our five child processes executing our target task() function.
...
# create the worker processes
for i in range(5):
    # start a new process to perform some work
    worker = Process(target=task, args=(barrier, i))
    worker.start()
Finally, the main process can wait on the barrier for all processes to perform their calculation.
...
# wait for all processes to finish
print('Main process waiting on all results...')
barrier.wait()
Once all processes are finished, the barrier will be lifted, the worker processes will exit, and the main process will report a message.
...
# report once all processes are done
print('All processes have their result')
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of using a barrier with processes
from time import sleep
from random import random
from multiprocessing import Process
from multiprocessing import Barrier

# target function to prepare some work
def task(barrier, number):
    # generate a unique value
    value = random() * 10
    # block for a moment
    sleep(value)
    # report result
    print(f'Process {number} done, got: {value}', flush=True)
    # wait on all other processes to complete
    barrier.wait()

# entry point
if __name__ == '__main__':
    # create a barrier
    barrier = Barrier(5 + 1)
    # create the worker processes
    for i in range(5):
        # start a new process to perform some work
        worker = Process(target=task, args=(barrier, i))
        worker.start()
    # wait for all processes to finish
    print('Main process waiting on all results...')
    barrier.wait()
    # report once all processes are done
    print('All processes have their result')
Running the example first creates the barrier then creates and starts the worker processes.
Each worker process performs its calculation and then waits on the barrier for all other processes to finish.
Finally, the processes finish and are all released, including the main process, reporting a final message.
Note, your specific results will differ given the use of random numbers. Try running the example a few times.
Main process waiting on all results...
Process 3 done, got: 0.19554467220314398
Process 4 done, got: 0.345718981628913
Process 2 done, got: 2.37158455232798
Process 1 done, got: 5.970308025592141
Process 0 done, got: 7.102904442921531
All processes have their result
Next, let’s look at best practices when using multiprocessing.
Python Multiprocessing Best Practices
Now that we know how processes work and how to use them, let’s review some best practices to consider when bringing multiprocessing into our Python programs.
Some best practices when using processes in Python are as follows:
- Use Context Managers
- Use Timeouts When Waiting
- Use Main Module Idiom
- Use Shared ctypes
- Use Pipes and Queues
Following best practices will help you avoid common concurrency failure modes like race conditions and deadlocks.
Let’s take a closer look at each in turn.
Tip 1: Use Context Managers
Acquire and release locks using a context manager, wherever possible.
Locks can be acquired manually via a call to acquire() at the beginning of the critical section followed by a call to release() at the end of the critical section.
For example:
...
# acquire the lock manually
lock.acquire()
# critical section...
# release the lock
lock.release()
This approach should be avoided wherever possible.
Traditionally, it was recommended to always acquire and release a lock in a try-finally structure.
The lock is acquired, the critical section is executed in the try block, and the lock is always released in the finally block.
For example:
...
# acquire the lock
lock.acquire()
try:
    # critical section...
finally:
    # always release the lock
    lock.release()
This has since been replaced by the context manager interface, which achieves the same thing with less code.
For example:
...
# acquire the lock
with lock:
    # critical section...