Multiprocessing Context for the ProcessPoolExecutor in Python
You can set the multiprocessing context used by the ProcessPoolExecutor via the "mp_context" argument in Python.
In this tutorial you will discover how to set the multiprocessing context for process pools in Python.
Let's get started.
Python MultiProcessing Context
Different operating systems provide different ways to create new processes.
Some operating systems support multiple ways to create processes.
Perhaps the two most common ways to create new processes are spawn and fork.
- spawn: Creates a new instance of the Python interpreter as a process. Available on Windows, Unix and MacOS.
- fork: Creates a fork of an existing Python interpreter process. Available on Unix.
There is also a forkserver context which is like fork.
- forkserver: Creates a server Python interpreter process that is used to create all forked processes for the life of the program. Available on Unix.
Your Python installation will select the most appropriate method for creating a new process for your operating system.
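Nevertheless, you can specify how new processes are created. The way a process is created is called a "start method", and the object that ties the multiprocessing API to one start method is called a "process context".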
This is important because we may want to specify the multiprocessing context used by new processes in a ProcessPoolExecutor.
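As a rough illustration of the idea, a context object can be requested by name and then queried. The sketch below uses only the standard library. It requests 'spawn' because that is the one method listed above as available on every platform, and the helper name context_for is made up for this example.

```python
# sketch: request a context for a named start method and inspect it
from multiprocessing import get_context

# hypothetical helper for illustration, not part of the standard library
def context_for(method):
    # returns a context object bound to the given start method
    return get_context(method)

if __name__ == '__main__':
    context = context_for('spawn')
    # the context knows which start method it represents
    print(context.get_start_method())
    # the context offers the same API as the multiprocessing module,
    # e.g. context.Process creates processes using this start method
    print(context.Process)
```

The same context object is what gets handed to a process pool later in this tutorial.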
List Supported MultiProcessing Contexts
Not all operating systems support all methods for creating new processes.
Let's see what process start methods are supported by your operating system and discover the default method that you are using.
We can call the multiprocessing.get_all_start_methods() function to get a list of all supported methods and the multiprocessing.get_start_method() function to get the currently configured (default) process start method.
The program below will report the process start methods and default start method on your system.
# SuperFastPython.com
from multiprocessing import get_all_start_methods
from multiprocessing import get_start_method
# list of all process start methods supported on the os
result = get_all_start_methods()
print(result)
# get the default process start method
result = get_start_method()
print(result)
Running the example first reports all of the process start methods supported by your system.
Next, the default process start method is reported.
In this case, running the program on MacOS (my operating system), we can see that the operating system supports all three process start methods and the default is the "spawn" method.
What result did you get?
Let me know in the comments below.
['spawn', 'fork', 'forkserver']
spawn
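Because the supported list differs between platforms, it can help to check it before committing to a method. The following is a minimal sketch, not a recommended policy: the helper name pick_start_method and the preference order are assumptions made for this example.

```python
# sketch: choose the first preferred start method that this platform supports
from multiprocessing import get_all_start_methods

# hypothetical helper; the default preference order is an arbitrary assumption
def pick_start_method(preferred=('fork', 'forkserver', 'spawn')):
    # start methods supported on this operating system
    supported = get_all_start_methods()
    for method in preferred:
        if method in supported:
            return method
    # nothing in the preference list is available here
    raise ValueError(f'none of {preferred!r} is supported, have {supported!r}')

if __name__ == '__main__':
    print(pick_start_method())
```

The returned string can be passed to multiprocessing.get_context() in the same way as a hard-coded method name.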
Check the ProcessPoolExecutor Multiprocessing Context
We can also check the default context used to start processes in the process pool.
The ProcessPoolExecutor will use the default context unless it is configured to use a different context.
We can check the start process context used by the ProcessPoolExecutor via the "_mp_context" protected attribute. Note that this attribute is not part of the public API, so it may change between Python versions.
The example below creates a process pool and reports the default context used by the process pool.
# SuperFastPython.com
# example of checking the process start context
from concurrent.futures import ProcessPoolExecutor
# entry point
def main():
    # create a process pool
    with ProcessPoolExecutor() as executor:
        # report the context used
        print(executor._mp_context)
if __name__ == '__main__':
    main()
Running the example creates a process pool and reports the default start process context used by the pool.
In this case we can see that it is the 'spawn' context, denoted by the "SpawnContext" object.
<multiprocessing.context.SpawnContext object at 0x1034fd4c0>
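The printed object identifies the context by its class name. If only the name of the start method is needed, the context object can be asked for it directly. The sketch below relies on the same protected "_mp_context" attribute used above, which is an implementation detail and may change between Python versions. The helper name pool_start_method is hypothetical.

```python
# sketch: report the start method name used by a process pool
from multiprocessing import get_start_method
from concurrent.futures import ProcessPoolExecutor

# hypothetical helper; reads a protected attribute of the executor
def pool_start_method(executor):
    # the context object reports the name of its start method
    return executor._mp_context.get_start_method()

# entry point
def main():
    # create a process pool with the default context
    with ProcessPoolExecutor() as executor:
        # name of the start method used by the pool
        print(pool_start_method(executor))
    # name of the default start method, for comparison
    print(get_start_method())

if __name__ == '__main__':
    main()
```

A plain string like this is easier to compare or log than the object representation.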
Configure ProcessPoolExecutor Multiprocessing Context
We can create a process context for a specific method (e.g. fork or spawn) and pass this context to the ProcessPoolExecutor.
This will allow all new processes created by the process pool to be created using the provided context and use your preferred method for starting processes.
This can be achieved by setting an argument named "mp_context" that defines the context used for creating processes in the pool.
By default it is set to None, in which case the default context is used.
We can set the context by first calling the multiprocessing.get_context() function and specifying the preferred method as a string that matches a string returned from calling the multiprocessing.get_all_start_methods() function, e.g. 'fork' or 'spawn'.
Perhaps we wanted to force all processes to be created using the 'fork' method, regardless of the default.
Note, using 'fork' will not work on Windows. You might want to change it to use 'spawn' or report the error message you see in the comments below.
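When a requested method is not supported, multiprocessing.get_context() raises a ValueError. A minimal sketch of guarding against that is shown here. The helper name context_or_default and the choice to fall back to the platform default are assumptions for illustration, and whether a silent fallback is appropriate depends on your program.

```python
# sketch: fall back to the default context if a start method is unavailable
from multiprocessing import get_context

# hypothetical helper for illustration
def context_or_default(method):
    try:
        # raises ValueError if the method is not supported on this platform
        return get_context(method)
    except ValueError:
        # get_context() with no argument returns the default context
        return get_context()

if __name__ == '__main__':
    context = context_or_default('fork')
    print(context.get_start_method())
```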
First, we would create a context, then pass this context to the process pool. We can then access and report the context used by the process pool, for example.
...
# create a start process context
context = get_context('fork')
# create a process pool
with ProcessPoolExecutor(mp_context=context) as executor:
    # report the context used
    print(executor._mp_context)
Tying this together, the complete example of setting the context for the process pool and then confirming it was changed is listed below.
# SuperFastPython.com
# example of setting the process start context
from multiprocessing import get_context
from concurrent.futures import ProcessPoolExecutor
# entry point
def main():
    # create a start process context
    context = get_context('fork')
    # create a process pool
    with ProcessPoolExecutor(mp_context=context) as executor:
        # report the context used
        print(executor._mp_context)
if __name__ == '__main__':
    main()
Running the example first creates a new start process context then passes it to the new ProcessPoolExecutor.
After the pool is created, the context used by the pool is reported, which in this case is 'fork', denoted by the 'ForkContext' object.
<multiprocessing.context.ForkContext object at 0x102d580a0>
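The examples so far only inspect the pool. As a rough sketch of a configured pool doing work, the code below submits os.getpid to the pool and reports the worker's process id alongside the parent's. It uses the 'spawn' method so that it is not limited to Unix, and the helper name worker_pid is hypothetical. os.getpid is used as the task because the child process can import it from the standard library.

```python
# sketch: run a task in a pool that uses an explicit start method
import os
from multiprocessing import get_context
from concurrent.futures import ProcessPoolExecutor

# hypothetical helper; returns the process id of a worker in the pool
def worker_pid(method='spawn'):
    # create a start process context
    context = get_context(method)
    # create a process pool with a single worker
    with ProcessPoolExecutor(max_workers=1, mp_context=context) as executor:
        # run os.getpid in the worker process and wait for the result
        future = executor.submit(os.getpid)
        return future.result()

# entry point
def main():
    print('parent pid:', os.getpid())
    print('worker pid:', worker_pid('spawn'))

if __name__ == '__main__':
    main()
```

The two process ids differ because the task runs in a child process created with the chosen start method.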
Takeaways
You now know how to configure the ProcessPoolExecutor multiprocessing context.