Multiprocessing Pool Worker PIDs in Python

July 4, 2022 Python Multiprocessing Pool

You can get the PID of workers in the process pool using os.getpid() from within the task or multiprocessing.active_children() from the parent process.

In this tutorial you will discover how to get the PID of process pool child worker processes in Python.

Let's get started.

Need Worker Process PID

The multiprocessing.pool.Pool in Python provides a pool of reusable processes for executing ad hoc tasks.

A process pool can be configured when it is created, which will prepare the child workers.

A process pool object which controls a pool of worker processes to which jobs can be submitted. It supports asynchronous results with timeouts and callbacks and has a parallel map implementation.

-- multiprocessing — Process-based parallelism

We can issue one-off tasks to the process pool using functions such as apply() or we can apply the same function to an iterable of items using functions such as map(). Results for issued tasks can then be retrieved synchronously, or we can retrieve the result of tasks later by using asynchronous versions of the functions such as apply_async() and map_async().
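The four ways of issuing tasks described above can be sketched as follows. This is a minimal illustration only; the square() task and the issue_tasks() helper are hypothetical names introduced for this example, and the pool size of two workers is an arbitrary choice.

```python
# sketch: issuing tasks with apply(), map(), apply_async() and map_async()
from multiprocessing.pool import Pool

# hypothetical task, used only for illustration
def square(value):
    return value * value

# issue the same work in four different ways and return all results
def issue_tasks():
    # create a small pool of two workers
    with Pool(2) as pool:
        # one-off task, blocks until the result is available
        single = pool.apply(square, args=(3,))
        # apply the function to each item, blocks until all results are available
        many = pool.map(square, range(5))
        # asynchronous versions return an AsyncResult immediately
        single_async = pool.apply_async(square, args=(3,))
        many_async = pool.map_async(square, range(5))
        # retrieve the results later, waiting at most 10 seconds for each
        return single, many, single_async.get(timeout=10), many_async.get(timeout=10)

# protect the entry point
if __name__ == '__main__':
    print(issue_tasks())
```

The synchronous calls return results directly, whereas the asynchronous calls return an AsyncResult whose get() method blocks until the result is ready.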

Each Python process is created and managed by the underlying operating system.

As such, each process is allocated a unique process identifier, referred to as a PID.

How can we get the PID for worker processes in the process pool?

How to Get Worker Process PID

We can get the worker process PIDs in two ways:

  1. From within the worker process itself.
  2. From the parent process that created the process pool.

Let's take a closer look at each approach.

How to Get Worker PID Within Worker

We can get the PID of worker processes from within the workers themselves.

The task that is executed by a worker process can get its own PID.

This can be achieved by calling the os.getpid() function.

For example:

...
# import the function
from os import getpid
# get the pid
pid = getpid()

Alternately, the task executed by the worker process can get the current process via the multiprocessing.current_process() function, then access the pid attribute.

For example:

...
# import the function
from multiprocessing import current_process
# get the current process
process = current_process()
# get the pid
pid = process.pid
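Both approaches should report the same PID when called from the same process. A minimal sketch that checks this is listed below; the pid_both_ways() helper is a hypothetical name used only for this illustration, and here it is called from the main process rather than from a pool worker.

```python
# sketch: compare os.getpid() with multiprocessing.current_process().pid
from os import getpid
from multiprocessing import current_process

# return the pid of the calling process obtained in both ways
def pid_both_ways():
    # pid reported by the operating system
    os_pid = getpid()
    # pid reported by the multiprocessing.Process instance for this process
    process_pid = current_process().pid
    return os_pid, process_pid

# protect the entry point
if __name__ == '__main__':
    os_pid, process_pid = pid_both_ways()
    print(f'os.getpid()={os_pid}, current_process().pid={process_pid}', flush=True)
```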

Next, let's look at how we might get the worker PID from the parent process.

How to Get Worker PID From Parent Process

The parent process that created the process pool can get the child worker PIDs.

This can be achieved by getting all active child processes via the multiprocessing.active_children() function and then accessing their pid attributes.

For example:

...
# import the function
from multiprocessing import active_children
# get the child processes
children = active_children()
# report the pid
for child in children:
    print(f'Worker pid={child.pid}', flush=True)

Alternatively, we can access the child processes directly on the pool via the _pool attribute on the multiprocessing.pool.Pool instance.

Note, the _pool attribute is a hidden member internal to the Pool class and may change in the future.

For example:

...
# report the pids
for worker in pool._pool:
    print(f'Worker pid={worker.pid}', flush=True)
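Because _pool is private, code that reads it may want a fallback. The sketch below is one possible way to do this, not an official API: the worker_pids() and demo() helpers are hypothetical names, and the fallback to active_children() assumes the parent has no other child processes.

```python
# sketch: read the private _pool attribute defensively
from multiprocessing import active_children
from multiprocessing.pool import Pool

# return the pids of the workers in the given pool
def worker_pids(pool):
    # _pool is private, so do not assume that it exists
    workers = getattr(pool, '_pool', None)
    if workers is None:
        # assumption: all child processes of the parent belong to the pool
        workers = active_children()
    return [worker.pid for worker in workers]

# create a small pool and report the worker pids
def demo(n_workers=2):
    with Pool(n_workers) as pool:
        return worker_pids(pool)

# protect the entry point
if __name__ == '__main__':
    for pid in demo():
        print(f'Worker pid={pid}', flush=True)
```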

Now that we know how to access the PIDs of child worker processes in the process pool, let's look at some worked examples.

Examples of Getting Worker PID From Within Task

We can get the PIDs of child worker processes within the process pool from within the tasks executed in the pool.

There are two main approaches to accessing the PID of a worker from within the task:

  1. Using os.getpid()
  2. Using multiprocessing.Process.pid

Let's take a closer look at each.

Get Worker PID using os.getpid()

We can access the PID of the worker child process that is executing a task via the os.getpid() function.

This function can be called directly from within the task executed within the process pool, which will return the PID for the child process that is currently executing the task.

The task() below implements this.

# task executed in a worker process
def task():
    # get the pid
    pid = getpid()
    # report the pid
    print(f'Worker pid={pid}', flush=True)

We can then create a process pool with the default configuration using the context manager interface.

...
# create and configure the process pool
with Pool() as pool:
    # ...

We can then issue a task that executes our custom task() function, then close the pool and wait for all issued tasks to complete.

...
# issue tasks to the process pool
pool.apply_async(task)
# close the process pool
pool.close()
# wait for all tasks to complete
pool.join()

Tying this together, the complete example is listed below.

# SuperFastPython.com
# example of getting the worker process pid from within the worker
from multiprocessing.pool import Pool
from os import getpid

# task executed in a worker process
def task():
    # get the pid
    pid = getpid()
    # report the pid
    print(f'Worker pid={pid}', flush=True)

# protect the entry point
if __name__ == '__main__':
    # create and configure the process pool
    with Pool() as pool:
        # issue tasks to the process pool
        pool.apply_async(task)
        # close the process pool
        pool.close()
        # wait for all tasks to complete
        pool.join()

Running the example first creates and configures the process pool.

The task is then issued to the process pool and the main process closes the pool and waits for all issued tasks to complete.

The custom function executes as a task within the process pool. The task gets the PID of the process that is executing the task, and reports it.

Note, the PID will be different each time the code example is executed.

Worker pid=62808
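A single task reports the PID of only the one worker that happened to execute it. If we want to see the PIDs of several workers, one option is to have each task return its PID to the parent process. The sketch below assumes a hypothetical worker_pids_seen() helper and a short sleep in each task so that the tasks tend to be spread across workers; the number of distinct PIDs seen may still be smaller than the number of workers.

```python
# sketch: collect worker pids by returning them from many tasks
from multiprocessing.pool import Pool
from os import getpid
from time import sleep

# task executed in a worker process, the argument is ignored
def task(_):
    # block for a moment so that other workers get a chance to pick up tasks
    sleep(0.1)
    # return the pid of the worker executing this task
    return getpid()

# issue many tasks and return the set of distinct worker pids seen
def worker_pids_seen(n_workers=2, n_tasks=8):
    with Pool(n_workers) as pool:
        return set(pool.map(task, range(n_tasks)))

# protect the entry point
if __name__ == '__main__':
    for pid in worker_pids_seen():
        print(f'Worker pid={pid}', flush=True)
```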

Get Worker PID using multiprocessing.Process.pid

We can get the PID of the worker process by first getting the current process, then reporting the "pid" attribute.

We can update the previous example to do this within the task() function.

First, we can call the multiprocessing.current_process() function to get the multiprocessing.Process instance for the child worker process that is currently executing the task.

...
# get the current process
process = current_process()

We can then access the "pid" attribute from the multiprocessing.Process instance.

...
# get the pid
pid = process.pid

The updated version of the task() function with these changes is listed below.

# task executed in a worker process
def task():
    # get the current process
    process = current_process()
    # get the pid
    pid = process.pid
    # report the pid
    print(f'Worker pid={pid}', flush=True)

Tying this together, the complete example is listed below.

# SuperFastPython.com
# example of getting the worker process pid from within the worker
from multiprocessing.pool import Pool
from multiprocessing import current_process

# task executed in a worker process
def task():
    # get the current process
    process = current_process()
    # get the pid
    pid = process.pid
    # report the pid
    print(f'Worker pid={pid}', flush=True)

# protect the entry point
if __name__ == '__main__':
    # create and configure the process pool
    with Pool() as pool:
        # issue tasks to the process pool
        pool.apply_async(task)
        # close the process pool
        pool.close()
        # wait for all tasks to complete
        pool.join()

Running the example first creates and configures the process pool.

The task is then issued to the process pool and the main process closes the pool and waits for all issued tasks to complete.

The custom function executes as a task within the process pool. The task gets the current process for the child worker process that is executing the task. It then gets the PID of the process that is executing the task, and reports it.

Note, the PID will be different each time the code example is executed.

Worker pid=62822
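Another option from within the worker is to report the PID once, when the worker starts, rather than in every task. The sketch below assumes this can be done with the initializer argument of the Pool class; the init_worker() and run_tasks() functions and the worker_pid global are hypothetical names used for illustration.

```python
# sketch: record the worker pid once per worker using a pool initializer
from multiprocessing.pool import Pool
from os import getpid

# pid recorded once in each worker process by the initializer
worker_pid = None

# initializer executed once in each worker process when it starts
def init_worker():
    global worker_pid
    # store and report the pid of this worker
    worker_pid = getpid()
    print(f'Worker started pid={worker_pid}', flush=True)

# task executed in a worker process, the argument is ignored
def task(_):
    # return the recorded pid and the pid at the time of the task
    return worker_pid, getpid()

# issue tasks to a pool whose workers run the initializer
def run_tasks(n_workers=2, n_tasks=6):
    with Pool(n_workers, initializer=init_worker) as pool:
        return pool.map(task, range(n_tasks))

# protect the entry point
if __name__ == '__main__':
    for recorded, current in run_tasks():
        print(f'recorded={recorded}, current={current}', flush=True)
```

Each pair of values should match, because a worker keeps the same PID for every task it executes.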

Next, let's look at how we can get the PID of worker processes from the parent process.

Example of Getting Worker Process PID From Parent

We can get the PIDs of child worker processes from the parent process that created and manages the process pool.

There are two main approaches we can use to do this:

  1. Using multiprocessing.active_children()
  2. Using multiprocessing.pool.Pool._pool

Let's take a closer look at each approach.

Get Worker PID using multiprocessing.active_children

The parent process can access the PID of child worker processes by first getting a list of all active child processes via the multiprocessing.active_children() function.

...
# get the child processes
children = active_children()

It can then access the "pid" attribute on each multiprocessing.Process instance.

This can be done after issuing the tasks in the process pool.

...
# report the pid
for child in children:
    print(f'Worker pid={child.pid}', flush=True)

The downside of this approach is that if the parent process has other child processes running at the same time, then we do not have an easy way of determining which processes belong to the process pool and which do not.

Tying this together, the complete example is listed below.

# SuperFastPython.com
# example of getting the worker process pid from the parent process
from time import sleep
from multiprocessing.pool import Pool
from multiprocessing import active_children

# task executed in a worker process
def task():
    # block for a moment
    sleep(1)

# protect the entry point
if __name__ == '__main__':
    # create and configure the process pool
    with Pool() as pool:
        # issue tasks to the process pool
        pool.apply_async(task)
        # get the child processes
        children = active_children()
        # report the pid
        for child in children:
            print(f'Worker pid={child.pid}', flush=True)
        # close the process pool
        pool.close()
        # wait for all tasks to complete
        pool.join()

Running the example first creates and configures the process pool.

The task is then issued to the process pool.

The custom function executes as a task within the process pool and blocks for a second.

The main process then gets a list of all active child processes, then reports the PID of each child process.

The main process then closes the pool and waits for all issued tasks to complete.

In this case, we can see that a total of 8 child worker processes were created.

Note, you may see a different number of worker processes, as the default pool size is based on the number of logical CPUs in your system.

Note, the PIDs will be different each time the code example is executed.

Worker pid=62856
Worker pid=62861
Worker pid=62857
Worker pid=62855
Worker pid=62860
Worker pid=62859
Worker pid=62858
Worker pid=62862
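As noted earlier in this section, active_children() returns every child of the parent, not only the pool workers. One possible workaround, sketched below under the assumption that no other child processes are started while the pool is being created, is to compare the set of child PIDs before and after creating the pool. The other_work() and pool_worker_pids() functions are hypothetical names used for illustration.

```python
# sketch: separate pool workers from other child processes
from multiprocessing import Process, active_children
from multiprocessing.pool import Pool
from time import sleep

# work executed by an unrelated child process
def other_work():
    sleep(0.5)

# return the pids of the pool workers and the pid of the unrelated child
def pool_worker_pids(n_workers=2):
    # start an unrelated child process before the pool is created
    other = Process(target=other_work)
    other.start()
    # record the child pids that exist before the pool is created
    before = {child.pid for child in active_children()}
    with Pool(n_workers) as pool:
        # record the child pids that exist after the pool is created
        after = {child.pid for child in active_children()}
    # wait for the unrelated child to finish
    other.join()
    # the new pids are assumed to belong to the pool workers
    return after - before, other.pid

# protect the entry point
if __name__ == '__main__':
    workers, other_pid = pool_worker_pids()
    print(f'Other child pid={other_pid}', flush=True)
    for pid in workers:
        print(f'Worker pid={pid}', flush=True)
```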

Get Worker PID using multiprocessing.pool.Pool._pool

The parent process can access the multiprocessing.Process instances in the process pool directly via the _pool attribute.

This is a private attribute used internally within the pool. As such, the name of this attribute may change in the future and relying on it in an application could be risky.

The parent process can access the list of worker processes directly via this attribute then report their "pid" attributes.

...
# report the pids
for worker in pool._pool:
    print(f'Worker pid={worker.pid}', flush=True)

Tying this together, the complete example is listed below.

# SuperFastPython.com
# example of getting the worker process pids from the parent process via Pool._pool
from time import sleep
from multiprocessing.pool import Pool

# task executed in a worker process
def task():
    # block for a moment
    sleep(1)

# protect the entry point
if __name__ == '__main__':
    # create and configure the process pool
    with Pool() as pool:
        # issue tasks to the process pool
        pool.apply_async(task)
        # report the pids
        for worker in pool._pool:
            print(f'Worker pid={worker.pid}', flush=True)
        # close the process pool
        pool.close()
        # wait for all tasks to complete
        pool.join()

Running the example first creates and configures the process pool.

The task is then issued to the process pool.

The custom function executes as a task within the process pool and blocks for a second.

The main process then directly accesses the worker processes and reports the PID of each child process.

The main process then closes the pool and waits for all issued tasks to complete.

In this case, we can see that a total of 8 child worker processes were created.

Note, you may see a different number of worker processes, as the default pool size is based on the number of logical CPUs in your system.

Note, the PIDs will be different each time the code example is executed.

Worker pid=62874
Worker pid=62875
Worker pid=62876
Worker pid=62877
Worker pid=62878
Worker pid=62879
Worker pid=62880
Worker pid=62881
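When the parent has no other child processes, both approaches should report the same set of PIDs. The sketch below checks this for a pool with a fixed number of workers, set via the first argument to the Pool class. The compare_pid_sources() helper is a hypothetical name, and the example relies on the private _pool attribute, which may change in the future.

```python
# sketch: compare worker pids from Pool._pool and active_children()
from multiprocessing import active_children
from multiprocessing.pool import Pool

# return the worker pids obtained in both ways
def compare_pid_sources(n_workers=2):
    # create a pool with a fixed number of workers
    with Pool(n_workers) as pool:
        # NOTE: _pool is a private attribute and may change in the future
        from_pool = {worker.pid for worker in pool._pool}
        # assumption: the parent process has no other child processes
        from_children = {child.pid for child in active_children()}
    return from_pool, from_children

# protect the entry point
if __name__ == '__main__':
    from_pool, from_children = compare_pid_sources()
    print(f'Same pids: {from_pool == from_children}', flush=True)
```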

Takeaways

You now know how to get the PID of process pool child worker processes.


