Last Updated on September 12, 2022
You can set the multiprocessing context via the “context” argument to the multiprocessing.pool.Pool class constructor.
In this tutorial you will discover how to configure the context for the process pool in Python.
Let’s get started.
Need to Configure Process Pool Context
The multiprocessing.pool.Pool in Python provides a pool of reusable processes for executing ad hoc tasks.
A process pool can be configured when it is created, which will prepare the child workers.
A process pool object which controls a pool of worker processes to which jobs can be submitted. It supports asynchronous results with timeouts and callbacks and has a parallel map implementation.
— multiprocessing — Process-based parallelism
We can issue one-off tasks to the process pool using functions such as apply() or we can apply the same function to an iterable of items using functions such as map(). Results for issued tasks can then be retrieved synchronously, or we can retrieve the result of tasks later by using asynchronous versions of the functions such as apply_async() and map_async().
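As a minimal sketch of the four methods mentioned above, the snippet below issues the same task through apply(), map(), apply_async() and map_async(). The square() task and the run_examples() helper are hypothetical names introduced only for illustration.

```python
from multiprocessing.pool import Pool


def square(value):
    """Hypothetical task used for illustration: return the square of a number."""
    return value * value


def run_examples():
    """Issue the same task through the four Pool methods mentioned above."""
    with Pool(processes=2) as pool:
        # apply(): issue a one-off task and block until the result is ready
        one_off = pool.apply(square, args=(3,))
        # map(): apply the function to every item and block for all results
        mapped = pool.map(square, range(5))
        # apply_async(): returns an AsyncResult immediately
        async_one = pool.apply_async(square, args=(4,))
        # map_async(): returns an AsyncResult for the whole iterable
        async_many = pool.map_async(square, range(5))
        # get() blocks until the result is available
        return one_off, mapped, async_one.get(), async_many.get()


# protect the entry point
if __name__ == '__main__':
    print(run_examples())
```

The asynchronous variants return immediately, so the results are collected with get() before the pool is closed by the context manager.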
There are different ways to create new child processes, such as forking the current process or spawning a new one.
How can we change the way new child worker processes in the process pool are created?
What is a Start Method
A start method is the technique used to start child processes in Python.
There are three start methods:
- spawn: start a fresh Python interpreter process.
- fork: copy the current Python process.
- forkserver: start a server process from which future child processes are forked.
Each platform has a default start method.
The following lists the major platforms and the default start methods.
- Windows (win32): spawn
- macOS (darwin): spawn
- Linux (unix): fork (forkserver from Python 3.14)
Not all platforms support all start methods.
The following lists the major platforms and the start methods that are supported.
- Windows (win32): spawn
- macOS (darwin): spawn, fork, forkserver
- Linux (unix): spawn, fork, forkserver
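Rather than relying on the lists above, the supported start methods can be queried at runtime. The following is a small sketch using multiprocessing.get_all_start_methods() and multiprocessing.get_start_method(); the output depends on your platform and Python version.

```python
import multiprocessing

# start methods supported on this platform; the default is listed first
supported = multiprocessing.get_all_start_methods()
print(f'Supported: {supported}')

# allow_none=True reports the configured method without fixing the default
# as a side effect; None means no start method has been fixed yet
configured = multiprocessing.get_start_method(allow_none=True)
print(f'Configured: {configured}')
```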
The start method can be set for a Python program via the multiprocessing.set_start_method() module function.
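A minimal sketch of setting the start method globally is shown below. The configure_start_method() helper is a hypothetical name for illustration; it only sets the method if none has been fixed yet, because set_start_method() raises a RuntimeError when the start method has already been set.

```python
import multiprocessing


def configure_start_method(method='spawn'):
    """Set the global start method once and return the method now in effect."""
    # only set the method when nothing has been configured yet
    if multiprocessing.get_start_method(allow_none=True) is None:
        multiprocessing.set_start_method(method)
    return multiprocessing.get_start_method()


# protect the entry point
if __name__ == '__main__':
    print(configure_start_method('spawn'))
```

Calling this once, early in the main module, keeps the setting in one place. If a different method was fixed earlier, the helper leaves it alone and reports it.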
Now that we know about process start methods, let’s take a look at a multiprocessing context.
What is a Multiprocessing Context
A multiprocessing context provides an alternate way of managing start methods within a Python program.
It is an object that is configured to use a specific start method and provides the entire multiprocessing module API for that start method.
We can configure a multiprocessing context for a given start method, then use that context for our multiprocessing needs within a part of our program. We can then configure a different multiprocessing context with an alternate start method for use in a different part of our application.
Multiprocessing contexts provide a more flexible way to manage process start methods directly within a program, and may be a preferred approach to changing start methods in general, especially within a Python library.
We can configure a new multiprocessing context to use a given start method using the multiprocessing.get_context() function.
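The short sketch below creates a context for the 'spawn' start method, which is supported on all major platforms, and inspects it. It is an illustration only and does not start any processes.

```python
import multiprocessing

# a context object is tied to one start method
spawn_ctx = multiprocessing.get_context('spawn')
print(spawn_ctx.get_start_method())

# the context mirrors the module API, e.g. Process, Queue and Pool
for name in ('Process', 'Queue', 'Pool'):
    print(name, hasattr(spawn_ctx, name))

# creating a context for a named method does not fix the global start method
print(multiprocessing.get_start_method(allow_none=True))
```

Because the context carries its own start method, different parts of a program can hold different contexts without touching the global setting.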
Now that we know about multiprocessing context, let’s look at how we can set the context for the process pool.
How to Configure the Process Pool Context
We can set the context for the process pool via the “context” argument to the multiprocessing.pool.Pool class constructor.
context can be used to specify the context used for starting the worker processes.
— multiprocessing — Process-based parallelism
The “context” is an instance of a multiprocessing context configured with a start method, created via the multiprocessing.get_context() function.
By default, “context” is None, which uses the current default context and start method configured for the application.
A new context can be created with a given start method and passed to the process pool.
For example:

    ...
    # create a process context
    ctx = multiprocessing.get_context('fork')
    # create a process pool with a given context
    pool = multiprocessing.pool.Pool(context=ctx)

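A context object also provides its own Pool() factory, which creates a process pool bound to that context. The sketch below assumes this is an acceptable alternative to passing the "context" argument; the double() task and run_with_context_pool() helper are hypothetical names for illustration.

```python
import multiprocessing


def double(value):
    """Hypothetical task used for illustration."""
    return value * 2


def run_with_context_pool(method=None):
    """Run a few tasks in a pool created from a context.

    method=None selects the default context for the platform.
    """
    ctx = multiprocessing.get_context(method)
    # ctx.Pool() creates a multiprocessing.pool.Pool bound to this context
    with ctx.Pool(processes=2) as pool:
        return pool.map(double, [1, 2, 3])


# protect the entry point
if __name__ == '__main__':
    print(run_with_context_pool())
```

Passing a name such as 'spawn' to run_with_context_pool() selects that start method instead of the platform default.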
Next, let’s look at how we might check the default context used by the process pool.
Check the Process Pool Default Start Method
We can explore how to check the default context, specifically the default start method, used by the process pool.
One approach would be to use the multiprocessing.get_start_method() module function in the application and report the start method directly.
For example:

    # SuperFastPython.com
    # example of getting the start method
    from multiprocessing import get_start_method
    # protect entry point
    if __name__ == '__main__':
        # get the start method
        method = get_start_method()
        print(method)

Running the example retrieves the currently configured start method.
In this case, the 'spawn' method is the default start method.
Your result may differ, depending on your platform.

    spawn

Another approach would be to get and report the start method from within a worker process.
This can be achieved by defining a function to run in the process pool that gets the start method used by the process pool and reports it directly.
The task() function below implements this.

    # task executed in a worker process
    def task():
        # get the start method
        method = get_start_method()
        # report a message
        print(f'Worker using {method}', flush=True)

Next, in the main process, we can first get the start method used by the application and report it.
We expect that this will be the same as the default start method used by the process pool.

    ...
    # get the start method
    method = get_start_method()
    print(f'Main process using {method}')

Next, we can create a process pool and issue a task to the pool to execute our custom function.
We can create the process pool using the context manager interface with the default multiprocessing context.

    ...
    # create and configure the process pool
    with Pool() as pool:
        # ...

We can then issue our task() function to the process pool and not wait for the function to be executed.

    ...
    # issue tasks to the process pool
    pool.apply_async(task)

Finally, we can close the process pool and wait for the issued task to complete.

    ...
    # close the process pool
    pool.close()
    # wait for all tasks to complete
    pool.join()

Tying this together, the complete example is listed below.

    # SuperFastPython.com
    # example of checking the default start method for the process pool
    from multiprocessing.pool import Pool
    from multiprocessing import get_start_method

    # task executed in a worker process
    def task():
        # get the start method
        method = get_start_method()
        # report a message
        print(f'Worker using {method}', flush=True)

    # protect the entry point
    if __name__ == '__main__':
        # get the start method
        method = get_start_method()
        print(f'Main process using {method}')
        # create and configure the process pool
        with Pool() as pool:
            # issue tasks to the process pool
            pool.apply_async(task)
            # close the process pool
            pool.close()
            # wait for all tasks to complete
            pool.join()

Running the example first gets and reports the default start method used by the application.
Next, it creates the process pool with the default multiprocessing context. The task is issued and the main process blocks until the task is completed and the process pool is shut down.
The task is executed in the process pool. It retrieves the start method of the default context and reports the value.
In this case we can see that the start method used in the application is 'spawn', which matches the start method used within the worker process.
Note, you may see different results depending on the default start method used by your system.

    Main process using spawn
    Worker using spawn

Next, let’s look at how we might configure the context used by the process pool.
Configure the Process Pool Context
We can explore how to configure the context used by the process pool.
This is important as it allows us to use a different start method within the process pool compared to the main application.
For example, we could use the 'spawn' start method in the application and 'fork' within the process pool, which might start worker processes faster on some systems than the other start methods.
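Whether one start method is faster than another on a given system can be checked empirically. The following is a rough timing sketch, not a rigorous benchmark; the identity() task and time_pool_startup() helper are hypothetical names, and the numbers will vary between runs and platforms.

```python
import time
import multiprocessing
from multiprocessing.pool import Pool


def identity(value):
    """Trivial task so that the timing is dominated by worker startup."""
    return value


def time_pool_startup(method, workers=2):
    """Return the seconds taken to start a pool and run a few trivial tasks."""
    ctx = multiprocessing.get_context(method)
    start = time.perf_counter()
    with Pool(processes=workers, context=ctx) as pool:
        pool.map(identity, range(workers))
    return time.perf_counter() - start


# protect the entry point
if __name__ == '__main__':
    # only time the start methods that this platform supports
    for method in multiprocessing.get_all_start_methods():
        print(f'{method}: {time_pool_startup(method):.3f} seconds')
```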
We can update the example from the previous section to create a new multiprocessing context with a different start method. This start method should then be reflected in the worker process and be different from the start method used by the main process.
Firstly, we can create a new multiprocessing context with a different start method, in this case the 'fork' start method.

    ...
    # create a process context
    ctx = get_context('fork')

This can then be passed to the process pool when it is configured.

    ...
    # create and configure the process pool
    with Pool(context=ctx) as pool:
        # ...

When worker processes are created within the process pool, they will use the new context with the 'fork' start method.
Tying this together, the complete example is listed below.

    # SuperFastPython.com
    # example of changing the start method used in the process pool
    from multiprocessing.pool import Pool
    from multiprocessing import get_start_method
    from multiprocessing import get_context

    # task executed in a worker process
    def task():
        # get the start method
        method = get_start_method()
        # report a message
        print(f'Worker using {method}', flush=True)

    # protect the entry point
    if __name__ == '__main__':
        # get the start method
        method = get_start_method()
        print(f'Main process using {method}')
        # create a process context
        ctx = get_context('fork')
        # create and configure the process pool
        with Pool(context=ctx) as pool:
            # issue tasks to the process pool
            pool.apply_async(task)
            # close the process pool
            pool.close()
            # wait for all tasks to complete
            pool.join()

Running the example first gets and reports the default start method used by the application.
Next, a new multiprocessing context is created with the 'fork' start method.
Note, this assumes that the 'fork' start method is supported on your system. It is not available on Windows, where get_context('fork') raises a ValueError.
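One way to guard against an unsupported start method is to check the supported methods before creating the context. The sketch below assumes that falling back to the platform default is acceptable; the choose_context() helper and the work() task are hypothetical names for illustration.

```python
import multiprocessing
from multiprocessing.pool import Pool


def choose_context(preferred='fork'):
    """Return a context for the preferred start method, or the default context."""
    if preferred in multiprocessing.get_all_start_methods():
        return multiprocessing.get_context(preferred)
    # get_context() with no argument returns the default context
    return multiprocessing.get_context()


def work(value):
    """Hypothetical task used for illustration."""
    return value + 1


# protect the entry point
if __name__ == '__main__':
    ctx = choose_context('fork')
    print(f'Pool will use {ctx.get_start_method()}')
    with Pool(processes=2, context=ctx) as pool:
        print(pool.map(work, [1, 2, 3]))
```

On Windows this would fall back to 'spawn' rather than failing when the context is created.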
Next, the process pool is created and configured with the new multiprocessing context. The task is issued and the main process blocks until the task is completed and the process pool is shut down.
The task is executed in the process pool. It retrieves the start method configured for the worker process and reports the value.
In this case we can see that the start method used in the application is 'spawn', which is different from the 'fork' start method used within the worker process.
Note, you may see different results depending on the default start method used by your system. If so, try changing the start methods used and compare the results.

    Main process using spawn
    Worker using fork

Further Reading
This section provides additional resources that you may find helpful.
Books
- Multiprocessing Pool Jump-Start, Jason Brownlee (my book!)
- Multiprocessing API Interview Questions
- Pool Class API Cheat Sheet
I would also recommend specific chapters from these books:
- Effective Python, Brett Slatkin, 2019.
- See: Chapter 7: Concurrency and Parallelism
- High Performance Python, Ian Ozsvald and Micha Gorelick, 2020.
- See: Chapter 9: The multiprocessing Module
- Python in a Nutshell, Alex Martelli, et al., 2017.
- See: Chapter: 14: Threads and Processes
Guides
- Python Multiprocessing Pool: The Complete Guide
- Python ThreadPool: The Complete Guide
- Python Multiprocessing: The Complete Guide
- Python ProcessPoolExecutor: The Complete Guide
Takeaways
You now know how to configure the context for the process pool.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.