Last Updated on September 12, 2022
You can get the PID of a worker process by calling the os.getpid() function when initializing the worker process or from within the target task function executed by a worker process.
In this tutorial you will discover how to get the PID of worker processes in the Python process pool.
Let’s get started.
Need PID of Worker Processes
The multiprocessing.pool.Pool in Python provides a pool of reusable processes for executing ad hoc tasks.
A process pool can be configured when it is created, which will prepare the child workers.
A process pool object which controls a pool of worker processes to which jobs can be submitted. It supports asynchronous results with timeouts and callbacks and has a parallel map implementation.
— multiprocessing — Process-based parallelism
We can issue one-off tasks to the process pool using functions such as apply() or we can apply the same function to an iterable of items using functions such as map().
Results for issued tasks can then be retrieved synchronously, or we can retrieve the result of tasks later by using asynchronous versions of the functions such as apply_async() and map_async().
When using the process pool, we may need the process identifiers (PIDs) of the worker processes.
This may be for many reasons, such as:
- To uniquely identify the worker in the application.
- To include the PID in logging.
- To debug which worker is completing which task.
How can we get the child worker process PIDs in Python?
What is a PID
PID is an acronym for Process ID or Process identifier.
A process identifier is a unique number assigned to a process by the underlying operating system.
Each time a process is started, it is assigned a unique positive integer identifier and the identifier may be different each time the process is started.
The pid uniquely identifies one process among all active processes running on the system, managed by the operating system.
As such, the pid can be used to interact with a process, e.g. to send a signal to the process to interrupt or kill a process.
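As a rough illustration of using a PID to signal a process, the sketch below starts a throwaway child interpreter with the standard subprocess module and then terminates it by PID with os.kill(). The helper name terminate_by_pid() and the 60-second sleep are illustrative assumptions and are not part of the process pool API.

```python
import os
import signal
import subprocess
import sys


def terminate_by_pid():
    # start a child Python process that would otherwise sleep for a minute
    child = subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(60)'])
    # the operating system assigned this child a pid
    pid = child.pid
    # use the pid to ask the operating system to terminate the process
    os.kill(pid, signal.SIGTERM)
    # wait for the child to exit and collect its return code
    return pid, child.wait()


if __name__ == '__main__':
    pid, code = terminate_by_pid()
    print(f'Process {pid} exited with code {code}')
```

On POSIX systems a process terminated by a signal typically reports a negative return code; the exact value depends on the platform.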
How to Get Worker PID
When working with a process pool, there are two situations where we may want to get the PID:
- Get the PID for each worker process in the process pool.
- Get the PID for the worker completing a given task.
Let’s take a closer look at each approach in turn.
Before we get the PID of worker processes, let’s take a brief look at how to get a process PID.
How to Get Process PID
There are two general ways to get the PID for a process:
- multiprocessing.Process.pid attribute.
- os.getpid() function.
We may also get the process instance for the current process via the multiprocessing.current_process() function.
For example:
    ...
    # get the process instance
    process = multiprocessing.current_process()
Once we have the process instance, we get the pid via the multiprocessing.Process.pid attribute.
For example:
    ...
    # get the pid
    pid = process.pid
Alternatively, we can get the pid for the current process via the os.getpid() function.
For example:
    ...
    # get the pid for the current process
    pid = os.getpid()
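The two approaches should agree when called from the same process. A minimal sketch that checks this is shown below; the helper name current_pid_both_ways() is a made-up name used only for illustration.

```python
import multiprocessing
import os


def current_pid_both_ways():
    # get the pid via the process instance for the current process
    process = multiprocessing.current_process()
    # also get the pid directly from the operating system
    return process.pid, os.getpid()


if __name__ == '__main__':
    pid_attribute, pid_function = current_pid_both_ways()
    print(f'Process.pid: {pid_attribute}, os.getpid(): {pid_function}')
```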
How to Get All Worker PID
We can get the PID for all child workers in the process pool.
This can be achieved from the parent process.
First, we can get a list of all active child processes via the multiprocessing.active_children() function. This will return a list of multiprocessing.Process instances.
    ...
    # get all active child processes
    children = multiprocessing.active_children()
We can then access the “pid” attribute of each.
    ...
    # get the pid of each child process
    for child in children:
        print(child.pid)
The downside of this approach is that the list of active children may include child processes that are not worker processes.
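The parent-side approach can be sketched as follows. It assumes the program has no child processes other than the pool workers, and it relies on the workers being started when the pool is created, which is how the CPython implementation behaves. The helper name worker_pids_from_parent() is hypothetical.

```python
from multiprocessing import active_children
from multiprocessing.pool import Pool


def worker_pids_from_parent(n_workers=4):
    # worker processes are started when the pool is created
    with Pool(n_workers) as pool:
        # all active children of this process, assumed to be pool workers only
        return sorted(child.pid for child in active_children())


# protect the entry point
if __name__ == '__main__':
    for pid in worker_pids_from_parent():
        print(f'Worker PID: {pid}')
```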
Another approach to getting the PID for all worker processes is to configure the process pool to use an initializer function and to get the worker process PID. This function will be called by each child worker process once when it is started in the process pool.
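By default, the initialization function is called with no arguments, although arguments can be supplied via the “initargs” argument of the multiprocessing.pool.Pool constructor. Within the function we can get and use the worker PID.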
For example:
    # initialize the worker process
    def init_worker():
        # get the pid for the current worker process
        pid = getpid()
We can then configure the process pool to call the initialization function as each worker is created.
This can be achieved by setting the “initializer” argument in the multiprocessing.pool.Pool constructor to the name of the initialization function.
For example:
    ...
    # create and configure a process pool
    pool = multiprocessing.pool.Pool(initializer=init_worker)
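One way to make use of the PID obtained in the initializer is to store it in a global variable so that later tasks in the same worker can read it. The sketch below is one possible arrangement, not the only one; the names worker_pid and run_tasks() are assumptions made for illustration.

```python
from os import getpid
from multiprocessing.pool import Pool

# set once in each worker by the initializer (hypothetical name)
worker_pid = None


# initialize the worker process
def init_worker():
    global worker_pid
    # store the pid for the current worker process
    worker_pid = getpid()


# task executed in a worker process
def task(identifier):
    # report the pid stored when this worker was initialized
    return identifier, worker_pid


def run_tasks(n_tasks=4, n_workers=2):
    # create and configure the process pool
    with Pool(n_workers, initializer=init_worker) as pool:
        return pool.map(task, range(n_tasks))


# protect the entry point
if __name__ == '__main__':
    for identifier, pid in run_tasks():
        print(f'Task {identifier} ran in worker {pid}')
```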
How to Get Task PID
We can get the worker PID within a given task executed by the process pool.
This can be achieved by calling the os.getpid() function within the target task function executed in the process pool.
For example:
    # task executed in a worker process
    def task(identifier):
        # get the pid for the current worker process
        pid = getpid()
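If the goal is to debug which worker completed which task, the task can return its PID alongside its result so the parent can group tasks by worker. This is a minimal sketch under that assumption; the helper name tasks_by_worker() is hypothetical.

```python
from collections import defaultdict
from os import getpid
from multiprocessing.pool import Pool


# task executed in a worker process
def task(identifier):
    # return the pid of the worker that ran this task
    return identifier, getpid()


def tasks_by_worker(n_tasks=10):
    # map each worker pid to the task identifiers it completed
    groups = defaultdict(list)
    with Pool() as pool:
        for identifier, pid in pool.map(task, range(n_tasks)):
            groups[pid].append(identifier)
    return dict(groups)


# protect the entry point
if __name__ == '__main__':
    for pid, identifiers in tasks_by_worker().items():
        print(f'Worker {pid} ran tasks {identifiers}')
```

How tasks are distributed across workers is not fixed, so the grouping may differ between runs.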
Now that we know how to get worker process PIDs, let’s look at some worked examples.
Example of Getting All Worker PIDs
We can explore how to get all worker PIDs using a worker initialization function.
In this example we will define a worker initialization function that will get and report the PID of each worker process. We will then create and configure a process pool to use the initialization function, then issue many tasks to the process pool in order that all worker processes are started and their PIDs are reported.
First, we must define a worker process initialization function. The function will not take any arguments and will call the os.getpid() function to get the PID of a worker, then report its value.
The init_worker() function below implements this.
    # initialize the worker process
    def init_worker():
        # get the pid for the current worker process
        pid = getpid()
        print(f'Worker PID: {pid}', flush=True)
Next, we can define a task that we will execute in the process pool.
The task will take an integer argument, then block for a fraction of a second.
The task() function below implements this.
    # task executed in a worker process
    def task(identifier):
        # block for a moment
        sleep(0.5)
Next, in the main process, we can create and configure a new process pool.
We will use the context manager interface to ensure the process pool is closed automatically once we are finished with it.
    ...
    # create and configure the process pool
    with Pool(initializer=init_worker) as pool:
        # ...
We will then issue 10 tasks to the process pool, each calling our task() function with an integer between 0 and 9. We will issue the tasks asynchronously using the map_async() function.
    ...
    # issues tasks to process pool
    result = pool.map_async(task, range(10))
Once issued, the main process will block on the AsyncResult returned from the map_async() function until the issued tasks are complete.
    ...
    # wait for tasks to complete
    result.wait()
Tying this together, the complete example is listed below.
    # SuperFastPython.com
    # example of getting child worker process pids
    from os import getpid
    from time import sleep
    from multiprocessing.pool import Pool

    # initialize the worker process
    def init_worker():
        # get the pid for the current worker process
        pid = getpid()
        print(f'Worker PID: {pid}', flush=True)

    # task executed in a worker process
    def task(identifier):
        # block for a moment
        sleep(0.5)

    # protect the entry point
    if __name__ == '__main__':
        # create and configure the process pool
        with Pool(initializer=init_worker) as pool:
            # issues tasks to process pool
            result = pool.map_async(task, range(10))
            # wait for tasks to complete
            result.wait()
        # process pool is closed automatically
Running the example first configures and creates the process pool.
The ten tasks are then issued to the process pool and the main process blocks.
Each worker process is initialized with a call to the init_worker() function. This gets the worker PID and reports the value.
Each task is then executed, blocking for a fraction of a second and then returning.
Once all tasks are complete, the main process continues, automatically closing the process pool and then exiting the application.
Note, the specific PIDs reported will be different each time the program is run.
    Worker PID: 94212
    Worker PID: 94213
    Worker PID: 94214
    Worker PID: 94216
    Worker PID: 94215
    Worker PID: 94217
    Worker PID: 94218
    Worker PID: 94219
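The example above only prints the PIDs from within each worker. If the parent process needs the values, one option is to pass a queue to the initializer through the “initargs” argument of the Pool constructor and have each worker put its PID on the queue. The sketch below assumes this arrangement; the names pid_queue and collect_worker_pids() and the 10-second timeout are illustrative choices.

```python
from os import getpid
from multiprocessing import Queue
from multiprocessing.pool import Pool


# initialize the worker process
def init_worker(pid_queue):
    # send the pid for the current worker process to the parent
    pid_queue.put(getpid())


def collect_worker_pids(n_workers=4):
    # queue shared with the workers via the initializer arguments
    pid_queue = Queue()
    with Pool(n_workers, initializer=init_worker, initargs=(pid_queue,)) as pool:
        # each worker reports exactly one pid when it starts
        return [pid_queue.get(timeout=10) for _ in range(n_workers)]


# protect the entry point
if __name__ == '__main__':
    for pid in collect_worker_pids():
        print(f'Worker PID: {pid}')
```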
Next, let’s explore getting the worker PID from within a task.
Example of Getting Task Worker PID
We can get the worker PID within a task executed in the process pool.
In this example we will get the PID of the current process in the custom task function executed by the process pool. We will then issue a single task to the process pool that reports the PID.
Firstly, we can define the custom task function that gets the PID for the current worker process then reports the value.
The task() function below implements this.
    # task executed in a worker process
    def task(identifier):
        # get the pid for the current worker process
        pid = getpid()
        print(f'Task PID: {pid}', flush=True)
Next, in the main process we can create the process pool with a default configuration using the context manager interface.
    ...
    # create and configure the process pool
    with Pool() as pool:
        # ...
Next, we can issue a single task asynchronously to the process pool using the apply_async() function on the process pool.
    ...
    # issues tasks to process pool
    result = pool.apply_async(task, (0,))
This will return a single AsyncResult object that we can wait on for the issued task to complete.
    ...
    # wait for tasks to complete
    result.wait()
Tying this together, the complete example is listed below.
    # SuperFastPython.com
    # example of getting child worker process pids
    from os import getpid
    from multiprocessing.pool import Pool

    # task executed in a worker process
    def task(identifier):
        # get the pid for the current worker process
        pid = getpid()
        print(f'Task PID: {pid}', flush=True)

    # protect the entry point
    if __name__ == '__main__':
        # create and configure the process pool
        with Pool() as pool:
            # issues tasks to process pool
            result = pool.apply_async(task, (0,))
            # wait for tasks to complete
            result.wait()
        # process pool is closed automatically
Running the example first creates the process pool.
A single task is issued to the process pool and the main process blocks until the task is complete.
The task runs, first getting the PID of the current worker process that is running the task, then reporting the value.
The task completes, then the main process continues, automatically closing the process pool and then terminating the application.
Note, the reported PID will differ each time the program is run.
    Task PID: 94241
Further Reading
This section provides additional resources that you may find helpful.
Books
- Multiprocessing Pool Jump-Start, Jason Brownlee (my book!)
- Multiprocessing API Interview Questions
- Pool Class API Cheat Sheet
I would also recommend specific chapters from these books:
- Effective Python, Brett Slatkin, 2019.
- See: Chapter 7: Concurrency and Parallelism
- High Performance Python, Ian Ozsvald and Micha Gorelick, 2020.
- See: Chapter 9: The multiprocessing Module
- Python in a Nutshell, Alex Martelli, et al., 2017.
- See: Chapter: 14: Threads and Processes
Guides
- Python Multiprocessing Pool: The Complete Guide
- Python ThreadPool: The Complete Guide
- Python Multiprocessing: The Complete Guide
- Python ProcessPoolExecutor: The Complete Guide
Takeaways
You now know how to get the PID of worker processes in the process pool.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.
Photo by Samuel Bryngelsson on Unsplash
Ehsan says
Hi and Thanks for your great explanations,
I am not very good in Python, but I have learned very well from your great content. As I want to run one function in parallel and block leakage among different jobs, I thought to run them by multiprocessing module. I’ve only had one question regarding having unique process for each job.
Suppose that I have 100 jobs and want to have max_workers = 10. I watched that we have 10 unique processes from their PIDs, but I need them to be 100. So, is there any solution to make a unique process for each of them?
Jason Brownlee says
You’re welcome.
If you need 100 processes, you can increase the max_workers to 100.