Get Multiprocessing Pool Worker PID in Python

Last Updated on September 12, 2022

You can get the PID of a worker process by calling the os.getpid() function when initializing the worker process or from within the target task function executed by a worker process.

In this tutorial you will discover how to get the PID of worker processes in the Python process pool.

Let’s get started.


Need PID of Worker Processes

The multiprocessing.pool.Pool in Python provides a pool of reusable processes for executing ad hoc tasks.

A process pool can be configured when it is created, which will prepare the child workers.

A process pool object which controls a pool of worker processes to which jobs can be submitted. It supports asynchronous results with timeouts and callbacks and has a parallel map implementation.
— multiprocessing — Process-based parallelism

We can issue one-off tasks to the process pool using functions such as apply() or we can apply the same function to an iterable of items using functions such as map().

Results for issued tasks can then be retrieved synchronously, or we can retrieve the result of tasks later by using asynchronous versions of the functions such as apply_async() and map_async().

When using the process pool, we may need the process identifiers (PIDs) of the worker processes.

This may be for many reasons, such as:

To uniquely identify the worker in the application.
To include the PID in logging.
To debug which worker is completing which task.

How can we get the child worker process PIDs in Python?


What is a PID

PID is an acronym for Process ID or Process identifier.

A process identifier is a unique number assigned to a process by the underlying operating system.

Each time a process is started, it is assigned a unique positive integer identifier and the identifier may be different each time the process is started.

The pid uniquely identifies one process among all active processes running on the system, managed by the operating system.

As such, the pid can be used to interact with a process, e.g. to send a signal to the process to interrupt or kill a process.
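As a minimal sketch of this idea, the snippet below starts a child process and then uses only its PID to send it a termination signal via os.kill(). The helper names start_sleeper() and terminate_by_pid() are hypothetical, the child is started with the subprocess module purely for brevity, and POSIX signal semantics are assumed (os.kill() behaves differently on Windows).

```python
import os
import signal
import subprocess
import sys

# hypothetical helper: start a child Python process that just sleeps
def start_sleeper():
    return subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(30)'])

# hypothetical helper: ask the process with the given pid to terminate
# ASSUMPTION: POSIX signal semantics
def terminate_by_pid(pid):
    os.kill(pid, signal.SIGTERM)

# protect the entry point
if __name__ == '__main__':
    child = start_sleeper()
    print(f'Child PID: {child.pid}')
    # interact with the child using nothing but its pid
    terminate_by_pid(child.pid)
    # wait for the child to stop and report its exit code
    child.wait()
    print(f'Exit code: {child.returncode}')
```

On a POSIX system the reported exit code should be the negative signal number, indicating that the child was stopped by the signal.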


How to Get Worker PID

When working with a process pool, there are two situations where we may want to get the PID:

Get the PID for each worker process in the process pool.
Get the PID for the worker completing a given task.

Let’s take a closer look at each approach in turn.

Before we get the PID of worker processes, let’s take a brief look at how to get a process PID.

How to Get Process PID

There are two general ways to get the PID for a process:

multiprocessing.Process.pid attribute.
os.getpid() function.

We may also get the process instance for the current process via the multiprocessing.current_process() function.

For example:

...
# get the process instance
process = multiprocessing.current_process()

Once we have the process instance, we get the pid via the multiprocessing.Process.pid attribute.

For example:

...
# get the pid
pid = process.pid

Alternatively, we can get the pid for the current process via the os.getpid() function.

For example:

...
# get the pid for the current process
pid = os.getpid()
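The two approaches can be compared directly. The sketch below is one way to check that they report the same value for the current process; the helper name current_pids() is hypothetical.

```python
import multiprocessing
import os

# hypothetical helper: get the current pid via both approaches
def current_pids():
    # pid via the process instance for the current process
    pid_from_instance = multiprocessing.current_process().pid
    # pid via the os module
    pid_from_os = os.getpid()
    return pid_from_instance, pid_from_os

# protect the entry point
if __name__ == '__main__':
    via_instance, via_os = current_pids()
    print(f'current_process().pid: {via_instance}')
    print(f'os.getpid(): {via_os}')
    print(f'Same value: {via_instance == via_os}')
```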

You can learn more about how to get a PID for a process in the tutorial:

How to Get the Process PID in Python

How to Get All Worker PID

We can get the PID for all child workers in the process pool.

This can be achieved from the parent process.

First, we can get a list of all active child processes via the multiprocessing.active_children() function. This will return a list of multiprocessing.Process instances.

...
# get all active child processes
children = multiprocessing.active_children()

We can then access the “pid” attribute of each.

...
# get the pid of each child process
for child in children:
    print(child.pid)

The downside of this approach is that the list of active children may include child processes that are not worker processes.
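One possible way to narrow the list down is to filter the children by name. The sketch below assumes CPython's current behavior of giving pool workers names that contain "PoolWorker" (such as "ForkPoolWorker-1"). This is an implementation detail rather than a documented guarantee, and the helper name pool_worker_pids() is hypothetical.

```python
from multiprocessing import active_children
from multiprocessing.pool import Pool

# hypothetical helper: keep only children that look like pool workers
# ASSUMPTION: CPython names pool workers with "PoolWorker" in the name
def pool_worker_pids(children):
    return [child.pid for child in children if 'PoolWorker' in child.name]

# protect the entry point
if __name__ == '__main__':
    # create a process pool with a fixed number of workers
    with Pool(processes=4) as pool:
        # get the pids of the children that appear to be pool workers
        pids = pool_worker_pids(active_children())
        print(f'Worker PIDs: {pids}')
```

If the naming convention differs on a given Python version, the initializer-based approach is likely the more robust choice.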

Another approach to getting the PID for all worker processes is to configure the process pool to use an initializer function and to get the worker process PID. This function will be called by each child worker process once when it is started in the process pool.

The initialization function does not need to take any arguments in this case (arguments can be supplied via the “initargs” argument if needed). Within the function we can get and use the worker PID.

For example:

# initialize the worker process
def init_worker():
    # get the pid for the current worker process
    pid = getpid()

We can then configure the process pool to call the initialization function as each worker is created.

This can be achieved by setting the “initializer” argument in the multiprocessing.pool.Pool constructor to the name of the initialization function.

For example:

...
# create and configure a process pool
pool = multiprocessing.pool.Pool(initializer=init_worker)
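If the parent process needs the worker PIDs as values rather than printed messages, one option is to have the initializer send its PID back over a queue. The sketch below is one way this might look, not the only way: it passes a multiprocessing.Queue to the initializer via the “initargs” argument, and the names pid_queue and n_workers are chosen here for illustration.

```python
import multiprocessing
import os
from multiprocessing.pool import Pool

# initialize the worker process
def init_worker(pid_queue):
    # runs once in each worker: send this worker's pid to the parent
    pid_queue.put(os.getpid())

# protect the entry point
if __name__ == '__main__':
    # illustrative fixed pool size so we know how many pids to expect
    n_workers = 4
    # queue shared with the workers via the initializer arguments
    pid_queue = multiprocessing.Queue()
    with Pool(processes=n_workers, initializer=init_worker, initargs=(pid_queue,)) as pool:
        # one pid is expected per worker started by the pool
        worker_pids = [pid_queue.get() for _ in range(n_workers)]
        print(f'Worker PIDs: {sorted(worker_pids)}')
```

The pool size is fixed here so that the parent knows how many values to read from the queue.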

You can learn more about how to initialize worker processes in the process pool in the tutorial:

Process Pool Initializer in Python

How to Get Task PID

We can get the worker PID within a given task executed by the process pool.

This can be achieved by calling the os.getpid() function within the target task function executed in the process pool.

For example:

# task executed in a worker process
def task(identifier):
    # get the pid for the current worker process
    pid = getpid()
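A task can also return the PID instead of only using it locally, which lets the parent process see which worker completed which task. The following is a minimal sketch under that assumption; the pool size of 2 and the six tasks are arbitrary choices for illustration.

```python
from os import getpid
from multiprocessing.pool import Pool

# task executed in a worker process
def task(identifier):
    # return the task identifier together with the pid of the worker that ran it
    return identifier, getpid()

# protect the entry point
if __name__ == '__main__':
    # create a small process pool so that workers are reused across tasks
    with Pool(processes=2) as pool:
        # issue the tasks and report which worker ran each one
        for identifier, pid in pool.map(task, range(6)):
            print(f'Task {identifier} ran in worker {pid}')
```

Because there are more tasks than workers, the same PID is expected to appear for more than one task.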

Now that we know how to get worker process PIDs, let’s look at some worked examples.


Example of Getting All Worker PIDs

We can explore how to get all worker PIDs using a worker initialization function.

In this example we will define a worker initialization function that will get and report the PID of each worker process. We will then create and configure a process pool to use the initialization function, then issue many tasks to the process pool in order that all worker processes are started and their PIDs are reported.

First, we must define a worker process initialization function. The function will not take any arguments and will call the os.getpid() function to get the PID of a worker, then report its value.

The init_worker() function below implements this.

# initialize the worker process
def init_worker():
    # get the pid for the current worker process
    pid = getpid()
    print(f'Worker PID: {pid}', flush=True)

Next, we can define a task that we will execute in the process pool.

The task will take an integer argument then block for a fraction of a second.

The task() function below implements this.

# task executed in a worker process
def task(identifier):
    # block for a moment
    sleep(0.5)

Next, in the main process, we can create and configure a new process pool.

We will use the context manager interface to ensure the process pool is closed automatically once we are finished with it.

...
# create and configure the process pool
with Pool(initializer=init_worker) as pool:
    # ...

You can learn more about the context manager interface in the tutorial:

Process Pool Context Manager

We will then issue 10 tasks to the process pool, each calling our task() function with an integer between 0 and 9. We will issue the tasks asynchronously using the map_async() function.

...
# issues tasks to process pool
result = pool.map_async(task, range(10))

Once issued, the main process will block on the AsyncResult returned from the map_async() function until the issued tasks are complete.

...
# wait for tasks to complete
result.wait()

You can learn more about the map_async() function in the tutorial:

Multiprocessing Pool.map_async() in Python

Tying this together, the complete example is listed below.

# SuperFastPython.com
# example of getting child worker process pids
from os import getpid
from time import sleep
from multiprocessing.pool import Pool

# initialize the worker process
def init_worker():
    # get the pid for the current worker process
    pid = getpid()
    print(f'Worker PID: {pid}', flush=True)

# task executed in a worker process
def task(identifier):
    # block for a moment
    sleep(0.5)

# protect the entry point
if __name__ == '__main__':
    # create and configure the process pool
    with Pool(initializer=init_worker) as pool:
        # issues tasks to process pool
        result = pool.map_async(task, range(10))
        # wait for tasks to complete
        result.wait()
    # process pool is closed automatically

Running the example first configures and creates the process pool.

The ten tasks are then issued to the process pool and the main process blocks.

Each worker process is initialized with a call to the init_worker() function. This gets the worker PID and reports the value.

Each task is then executed, blocking for a fraction of a second and then returning.

All tasks complete and the main process continues on, automatically closing the process pool and then exiting the application.

Note, the specific PIDs reported will be different each time the program is run.

Worker PID: 94212
Worker PID: 94213
Worker PID: 94214
Worker PID: 94216
Worker PID: 94215
Worker PID: 94217
Worker PID: 94218
Worker PID: 94219

Next, let’s explore getting the worker PID from within a task.


Example of Getting Task Worker PID

We can get the worker PID within a task executed in the process pool.

In this example we will get the PID of the current process in the custom task function executed by the process pool. We will then issue a single task to the process pool that reports the PID.

Firstly, we can define the custom task function that gets the PID for the current worker process then reports the value.

The task() function below implements this.

# task executed in a worker process
def task(identifier):
    # get the pid for the current worker process
    pid = getpid()
    print(f'Task PID: {pid}', flush=True)

Next, in the main process we can create the process pool with a default configuration using the context manager interface.

...
# create and configure the process pool
with Pool() as pool:
    # ...

Next, we can issue a single task asynchronously to the process pool using the apply_async() function on the process pool.

...
# issues tasks to process pool
result = pool.apply_async(task, (0,))

This will return a single AsyncResult object that we can wait on for the issued task to complete.

...
# wait for tasks to complete
result.wait()

Tying this together, the complete example is listed below.

# SuperFastPython.com
# example of getting child worker process pids
from os import getpid
from multiprocessing.pool import Pool

# task executed in a worker process
def task(identifier):
    # get the pid for the current worker process
    pid = getpid()
    print(f'Task PID: {pid}', flush=True)

# protect the entry point
if __name__ == '__main__':
    # create and configure the process pool
    with Pool() as pool:
        # issues tasks to process pool
        result = pool.apply_async(task, (0,))
        # wait for tasks to complete
        result.wait()
    # process pool is closed automatically

Running the example first creates the process pool.

A single task is issued to the process pool and the main process blocks until the task is complete.

The task runs, first getting the PID of the current worker process that is running the task, then reporting the value.

The task completes, then the main process continues on, automatically closing the process pool and then terminating the application.

Note, the reported PID will differ each time the program is run.

Task PID: 94241

Takeaways

You now know how to get the PID of worker processes in the process pool.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.


Comments

Ehsan says

June 8, 2023 at 6:06 am

Hi and Thanks for your great explanations,
I am not very good in Python, but I have learned very well from your great content. As I want to run one function in parallel and block leakage among different jobs, I thought to run them by multiprocessing module. I’ve only had one question regarding having unique process for each job.
Suppose that I have 100 jobs and want to have max_workers = 10. I watched that we have 10 unique processes from their PIDs, but I need them to be 100. So, is there any solution to make a unique process for each of them?

- Jason Brownlee says
  
  June 9, 2023 at 6:39 am
  
  You’re welcome.
  
  If you need 100 processes, you can increase the max_workers to 100.
