Last Updated on September 12, 2022
You can inherit global variables between forked processes, but not spawned processes.
In this tutorial you will discover how to inherit global variables between processes in Python.
Let’s get started.
Need to Share Global Variables
A process is a running instance of a computer program.
Every Python program is executed in a Process, which is a new instance of the Python interpreter. This process has the name MainProcess and has one thread used to execute the program instructions called the MainThread. Both processes and threads are created and managed by the underlying operating system.
Sometimes we may need to create new child processes in our program in order to execute code concurrently.
Python provides the ability to create and manage new processes via the multiprocessing.Process class.
In concurrent programming, we sometimes need to share data between processes.
One way to share data between processes is to inherit global variables from a parent process.
Can we inherit global variables between processes, and if so, how?
How to Inherit Global Variables
A forked child process can inherit global variables from a parent process.
Recall that a global variable is a variable defined and assigned at the top level of a module, i.e. outside of any function. Global variables differ from variables defined and assigned within a function, which are referred to as local variables.
In Python, variables that are only referenced inside a function are implicitly global. If a variable is assigned a value anywhere within the function’s body, it’s assumed to be a local unless explicitly declared as global.
— What are the rules for local and global variables in Python?
A global variable can be read within a function without any declaration. To assign a new value to it within a function, the variable must first be declared as global using the “global” keyword.
For example:
    # a function
    def myfunction():
        # access an existing global variable
        global data
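As a small illustration of these scoping rules, here is a sketch that involves no processes at all. The names counter, read_only(), shadow() and update() are made up for this example.

```python
# a module-level (global) variable; the name is hypothetical
counter = 0

def read_only():
    # reading a global variable requires no declaration
    return counter

def shadow():
    # assigning without "global" creates a new local variable,
    # leaving the module-level variable untouched
    counter = 100
    return counter

def update():
    # "global" makes the assignment target the module-level variable
    global counter
    counter += 1
    return counter
```

Calling shadow() returns its own local value and leaves the global alone, whereas calling update() changes the value that read_only() reports afterwards.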
A global variable can be inherited by a child process.
This means that we can define and assign a global variable in a parent process, then access and assign values to it in a function executed by a child process.
Importantly, changes made to the global variable in the child process will not propagate back up to the parent process.
Bear in mind that if code run in a child process tries to access a global variable, then the value it sees (if any) may not be the same as the value in the parent process at the time that Process.start was called.
— multiprocessing — Process-based parallelism
Global variables can only be shared or inherited by child processes that are forked from the parent process.
Specifically, this means that you must create child processes using the ‘fork’ start method.
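Recall that the 'fork' start method has historically been the default on Linux and other Unix platforms (macOS has defaulted to 'spawn' since Python 3.8). Because the default varies by platform and Python version, it is safest to set the start method explicitly prior to creating child processes via the multiprocessing.set_start_method() function.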
For example:
    ...
    # set the start method to fork
    set_start_method('fork')
The 'fork' start method is not available on all platforms, e.g. it is not available on Windows.
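One way to cope with platforms that lack 'fork' is to check which start methods are supported before choosing one. The sketch below uses multiprocessing.get_all_start_methods() and multiprocessing.get_context(). The helper pick_start_method() is a hypothetical name introduced for this example.

```python
import multiprocessing as mp

def pick_start_method(preferred='fork'):
    """Return the preferred start method if supported, else the platform default."""
    # the list of supported methods has the platform default first
    available = mp.get_all_start_methods()
    return preferred if preferred in available else available[0]

# protect entry point
if __name__ == '__main__':
    print('supported:', mp.get_all_start_methods())
    method = pick_start_method('fork')
    # a context object scopes the choice without changing the global default
    ctx = mp.get_context(method)
    print('using:', ctx.get_start_method())
```

A context object offers the same API as the multiprocessing module, e.g. ctx.Process, so it can be used in place of calling set_start_method() when only part of a program needs a specific start method.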
Although the global variables are inherited, changes to the variable are not propagated.
Specifically:
- Changes to a global variable in the parent process are not propagated to the child processes.
- Changes to the global variable in a child process are not propagated back to the parent process or to other child processes.
This means that the forked child process gets a snapshot of the global variables from the parent process at the time the child process was created.
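The snapshot behavior can be sketched as follows, assuming a platform where the 'fork' start method is available. The names state, child() and snapshot_demo() are hypothetical. An event holds the child back until the parent has changed its own copy of the variable, and a queue carries the value the child sees back to the parent.

```python
import multiprocessing as mp

# global variable inherited by the forked child (hypothetical name)
state = 'initial'

def child(ready, results):
    global state
    # wait until the parent has changed its own copy
    ready.wait()
    # the child still sees the value from the moment it was forked
    results.put(state)
    # this change stays in the child
    state = 'changed by child'

def snapshot_demo():
    global state
    # requires a platform that supports fork
    ctx = mp.get_context('fork')
    ready = ctx.Event()
    results = ctx.Queue()
    state = 'initial'
    process = ctx.Process(target=child, args=(ready, results))
    process.start()
    # change the parent's copy after the child was forked
    state = 'changed by parent'
    ready.set()
    seen_by_child = results.get(timeout=30)
    process.join()
    return seen_by_child, state

# protect entry point
if __name__ == '__main__':
    print(snapshot_demo())
```

The function returns the value the child saw together with the parent's final value, which shows that neither side observes the other's change.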
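A child process started with the 'spawn' start method does not inherit the parent's global variables. Instead, it starts a fresh Python interpreter and re-imports the main module.
Recall that the 'spawn' start method is the default on macOS and Windows.
As a result, a global variable that is assigned under the if __name__ == '__main__' guard, or assigned at runtime in the parent, does not exist in a spawned child, and attempting to access it will result in a NameError. Global variables assigned at the top level of the module are re-created by the import, with their initial values rather than the parent's current values.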
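What a spawned child can and cannot see depends on where the global variable is assigned, because the child re-imports the main module rather than copying the parent's memory. The sketch below is one way to observe this. It writes a small script to a temporary file and runs it in a separate interpreter, so that the spawned child has a real file to import. The names module_level, guarded and run_spawn_demo() are hypothetical.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# a small standalone program that uses the spawn start method
SCRIPT = textwrap.dedent("""
    from multiprocessing import Process, set_start_method

    # assigned at module level: re-executed when the child imports this file
    module_level = 'set at import'

    def task():
        # look the names up safely instead of raising a NameError
        print('module_level:', globals().get('module_level', 'missing'))
        print('guarded:', globals().get('guarded', 'missing'))

    if __name__ == '__main__':
        set_start_method('spawn')
        # assigned under the guard: never executed in the spawned child
        guarded = 'set in parent only'
        process = Process(target=task)
        process.start()
        process.join()
""")

def run_spawn_demo():
    # write the script to a real file so the spawned child can import it
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / 'spawn_demo.py'
        path.write_text(SCRIPT)
        done = subprocess.run([sys.executable, str(path)],
                              capture_output=True, text=True, timeout=60)
    return done.stdout.splitlines()

# protect entry point
if __name__ == '__main__':
    for line in run_spawn_demo():
        print(line)
```

The child reports a value for the variable assigned at module level and reports the guarded variable as missing, since the code under the guard only ever runs in the parent.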
Now that we know how to share global variables with child processes, let’s look at some worked examples.
Global Variables and Spawned Processes
We can explore what happens if we try to share global variables with a child process created using the ‘spawn‘ start method.
Recall that child processes created with the 'spawn' start method will not inherit global variables. Attempting to access a global variable that was assigned in the parent process under the if __name__ == '__main__' guard will result in an error in the child process.
We can demonstrate this with a worked example.
The parent process will explicitly use the ‘spawn‘ start method. It will then define a global variable and assign a value to the variable and report the global variable to confirm it was assigned. A child process will then run a custom function that will attempt to access and report the same global variable defined in the parent process. We expect this to result in an error.
First, we can define a custom function to execute in a child process.
    # function to be executed in a new process
    def task():
        # ...
The function will then define the variable “data” as being global.
    ...
    # declare global state
    global data
Next, the function will report the value of the global variable, assign it a new value, then report the variable again to confirm the new value was assigned.
    ...
    # report global state
    print(f'child process: {data}')
    # change global state
    data = 'hello hello!'
    # report global state
    print(f'child process: {data}')
We expect that accessing the global variable will fail with an error as the spawned child process will not inherit it.
Tying this together, the complete task() function is defined below.
    # function to be executed in a new process
    def task():
        # declare global state
        global data
        # report global state
        print(f'child process: {data}')
        # change global state
        data = 'hello hello!'
        # report global state
        print(f'child process: {data}')
Next, in the main process we can set the start method to ‘spawn‘.
    ...
    # set the start method to spawn
    set_start_method('spawn')
We can then define and assign a global variable named “data“, then report the value of the variable to confirm the assignment worked as expected.
    ...
    # define global state
    data = 'Hello there'
    # report global state
    print(f'main process: {data}')
We can then create a new multiprocessing.Process instance, configured to execute our custom task() function. The child process can be started and the main process will block until the child process has terminated.
    ...
    # start a child process
    process = Process(target=task)
    process.start()
    # wait for the child to terminate
    process.join()
Finally, the main process will report the value of the global variable again, to see if any change to it occurred.
    ...
    # report global state
    print(f'main process: {data}')
Tying this together, the complete example is listed below.
    # SuperFastPython.com
    # example of using global state shared between spawned processes
    from multiprocessing import Process
    from multiprocessing import set_start_method

    # function to be executed in a new process
    def task():
        # declare global state
        global data
        # report global state
        print(f'child process: {data}')
        # change global state
        data = 'hello hello!'
        # report global state
        print(f'child process: {data}')

    # protect entry point
    if __name__ == '__main__':
        # set the start method to spawn
        set_start_method('spawn')
        # define global state
        data = 'Hello there'
        # report global state
        print(f'main process: {data}')
        # start a child process
        process = Process(target=task)
        process.start()
        # wait for the child to terminate
        process.join()
        # report global state
        print(f'main process: {data}')
Running the example first sets the process start method to ‘spawn‘.
A global variable named ‘data‘ is then defined, assigned, and reported as per normal.
A new child process is then created using the ‘spawn‘ start method to execute our custom function. The process starts and declares a global variable named ‘data’. It then attempts to report the value of the global variable.
This results in a NameError, and the child process terminates. This is expected.
The main parent process blocks while the child process is running, then continues on once the child process terminates.
The main process then reports the value of the global variable again, reporting no change as expected.
This highlights that child processes created using the spawn method will not inherit global variables from the parent process.
    main process: Hello there
    Process Process-1:
    Traceback (most recent call last):
    ...
    NameError: name 'data' is not defined
    main process: Hello there
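Note that the NameError is raised in the child process only, so the parent carries on as normal. If the parent needs to know that the child failed, one option is to check the exitcode attribute of the process after joining it. The sketch below assumes hypothetical names broken_task(), missing_data and run_and_check(). A child that terminates with an unhandled exception reports an exit code of 1.

```python
import multiprocessing as mp

def broken_task():
    # "missing_data" is a hypothetical name that is never defined,
    # so this line raises a NameError in the child
    print(missing_data)

def run_and_check(method=None):
    # None selects the platform default start method
    ctx = mp.get_context(method)
    process = ctx.Process(target=broken_task)
    process.start()
    process.join()
    # 0 means success, non-zero means the child failed
    return process.exitcode

# protect entry point
if __name__ == '__main__':
    code = run_and_check()
    print('child failed' if code != 0 else 'child succeeded')
```

The traceback is still printed by the child, while the parent only sees the exit code.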
Next, let’s look at how forked child processes will inherit global variables.
Global Variables and Forked Processes
We can explore how child processes forked from a parent process will inherit the parent's global variables.
In this example, we can update the example from the previous section to use the 'fork' start method instead of the 'spawn' start method.
For example:
    ...
    # set the start method to fork
    set_start_method('fork')
This will allow the child process to inherit the ‘data‘ global variable from the parent process, report it, and assign values to it.
It will also highlight that changes made to the global variable in the child process are not propagated back to the parent process.
Recall, the ‘fork‘ start method is not available on all platforms, e.g. Windows.
Tying this together, the complete example is listed below.
    # SuperFastPython.com
    # example of using global state shared between forked processes
    from multiprocessing import Process
    from multiprocessing import set_start_method

    # function to be executed in a new process
    def task():
        # declare global state
        global data
        # report global state
        print(f'child process before: {data}')
        # change global state
        data = 'hello hello!'
        # report global state
        print(f'child process after: {data}')

    # protect entry point
    if __name__ == '__main__':
        # set the start method to fork
        set_start_method('fork')
        # define global state
        data = 'Hello there'
        # report global state
        print(f'main process: {data}')
        # start a child process
        process = Process(target=task)
        process.start()
        # wait for the child to terminate
        process.join()
        # report global state
        print(f'main process: {data}')
Running the example first sets the start method to ‘fork‘.
It then defines a global variable and assigns it the value ‘Hello there‘ and reports the variable to confirm that this is the current value.
A child process is then started using the ‘fork‘ start method and executes our custom function. The main process then blocks until the child process terminates.
The child process declares the ‘data‘ variable as being global.
It then reports the 'data' global variable. This works as expected, as the variable was inherited from the parent process. The value is the same as that assigned in the parent process, specifically 'Hello there'.
The child process then changes the value of the global variable, then reports the value again. This shows that the value was changed to ‘hello hello!‘.
The child process terminates and the parent process wakes up. It then reports the value of the global variable again, which is unchanged with the value ‘Hello there‘.
This highlights that indeed a forked child process will inherit global variables, but changes to the global variable are not propagated from the child back to the parent.
    main process: Hello there
    child process before: Hello there
    child process after: hello hello!
    main process: Hello there
Best Practice
Although a forked child process can inherit global variables from a parent process, this practice should be avoided.
The most important reason is that it is only supported on platforms that provide the 'fork' start method, which excludes Windows.
On Unix using the fork start method, a child process can make use of a shared resource created in a parent process using a global resource. However, it is better to pass the object as an argument to the constructor for the child process.
— multiprocessing — Process-based parallelism
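Passing the object as an argument might look like the sketch below. The names report() and run_with_args() are hypothetical. The data and a queue for the reply are handed to the child via the args parameter of the Process constructor, so the child never relies on an inherited global variable.

```python
import multiprocessing as mp

def report(data, results):
    # the data arrives as an argument, so no global lookup is needed
    results.put(f'child received: {data}')

def run_with_args(method=None):
    # None selects the platform default start method
    ctx = mp.get_context(method)
    results = ctx.Queue()
    # pass the data and the queue to the child explicitly
    process = ctx.Process(target=report, args=('Hello there', results))
    process.start()
    # read the reply before joining the child
    message = results.get(timeout=30)
    process.join()
    return message

# protect entry point
if __name__ == '__main__':
    print(run_with_args())
```

Because the arguments are sent to the child when it is started, this approach does not depend on the 'fork' start method.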
Another reason is that it can be confusing to some developers, e.g. the fact that changes to the global state are not propagated may be surprising.
As such, alternate techniques that work with all process start methods should be used to share data between processes, such as:
- Using shared ctypes, such as the multiprocessing.Value.
- Sending data as messages on a multiprocessing.Pipe or multiprocessing.Queue.
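As a sketch of the first technique, the example below shares a counter via multiprocessing.Value. Unlike an inherited global variable, changes made by the child processes are visible to the parent. The names increment() and run_shared_value() are hypothetical.

```python
import multiprocessing as mp

def increment(counter):
    # hold the lock so the read-modify-write is not interleaved
    with counter.get_lock():
        counter.value += 1

def run_shared_value(method=None, workers=3):
    # None selects the platform default start method
    ctx = mp.get_context(method)
    # a shared signed integer, initialized to zero
    counter = ctx.Value('i', 0)
    processes = [ctx.Process(target=increment, args=(counter,))
                 for _ in range(workers)]
    for process in processes:
        process.start()
    for process in processes:
        process.join()
    # the parent sees the changes made by the children
    return counter.value

# protect entry point
if __name__ == '__main__':
    print(run_shared_value())
```

The shared value is passed to each child as an argument rather than accessed as a global variable, which keeps the example independent of the start method.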
Further Reading
This section provides additional resources that you may find helpful.
Python Multiprocessing Books
- Python Multiprocessing Jump-Start, Jason Brownlee (my book!)
- Multiprocessing API Interview Questions
- Multiprocessing API Cheat Sheet
I would also recommend specific chapters in the books:
- Effective Python, Brett Slatkin, 2019.
- See: Chapter 7: Concurrency and Parallelism
- High Performance Python, Ian Ozsvald and Micha Gorelick, 2020.
- See: Chapter 9: The multiprocessing Module
- Python in a Nutshell, Alex Martelli, et al., 2017.
- See: Chapter 14: Threads and Processes
Guides
- Python Multiprocessing: The Complete Guide
- Python Multiprocessing Pool: The Complete Guide
- Python ProcessPoolExecutor: The Complete Guide
Takeaways
You now know how to inherit global variables from a parent process.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.