Last Updated on September 12, 2022
The ProcessPoolExecutor in Python creates internal processes and threads.
In this tutorial you will discover how to check the ids and names of processes and threads created in Python process pools.
Let’s get started.
Process Identifier and Main Thread
When you run a Python program you will create one Python process with one thread.
The process has a process identifier or a pid.
We can get the pid for our Python program via the os.getpid() function.
For example, we can report the pid for our program as follows:
    ...
    # report the process
    print(f'Main pid: {getpid()}')
Each process is created with one thread to execute the instructions, called the “MainThread”.
We can access the current thread context via the threading.current_thread() function.
The threading.current_thread() function returns a threading.Thread instance for the current working thread, on which we can access details like the (typically unique) name of the thread via the name property.
For example, we can report the name of the current thread in our Python program (the main thread) as follows:
    ...
    # report the thread
    print(f'Main thread: {current_thread().name}')
Tying this together, the complete example of checking the pid and thread name in a Python program is as follows.
    # SuperFastPython.com
    # check the pid and thread names of a program
    from os import getpid
    from threading import current_thread

    # entry point
    if __name__ == '__main__':
        # report the process
        print(f'Main pid: {getpid()}')
        # report the thread
        print(f'Main thread: {current_thread().name}')
Running the example reports the process id and thread name for our program.
In this case, we can see that the process id was 47791 and the name of the main thread was MainThread.
Note, you will have a different pid than the one listed below, and likely a different pid each time the program is run. This is because the pid is allocated by your operating system.
    Main pid: 47791
    Main thread: MainThread
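The name is not the only detail available on the current thread and process. As a minimal sketch, the function below gathers a few more identifiers into a dictionary. The function name describe_context() is hypothetical and used only for illustration, and the native_id attribute assumes Python 3.8 or later.

```python
# sketch: collect identifying details for the current process and thread
from os import getpid, getppid
from multiprocessing import current_process
from threading import current_thread, main_thread

# hypothetical helper: gather ids and names for the calling process and thread
def describe_context():
    thread = current_thread()
    return {
        'pid': getpid(),
        'ppid': getppid(),
        'process_name': current_process().name,
        'thread_name': thread.name,
        'thread_ident': thread.ident,
        # native_id is the id assigned by the operating system (Python 3.8+)
        'thread_native_id': thread.native_id,
        'is_main_thread': thread is main_thread(),
    }

# entry point
if __name__ == '__main__':
    for key, value in describe_context().items():
        print(f'{key}: {value}')
```

As with the pid, the thread identifiers are allocated at run time, so the values will likely differ from one run to the next.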
ProcessPoolExecutor Worker Processes and Main Threads
The ProcessPoolExecutor will create worker processes, by design.
Each worker process will also have its own main thread.
We can define a target task function that will report the pid and name of the thread it is running in.
We would expect each worker process to have a separate pid from the main process and perhaps the same thread name, given that all main threads for each process are named “MainThread”.
We can also check the parent process for each worker process via the os.getppid() function.
    # target task function
    def work(value):
        sleep(0.1)
        # report the worker process details
        print(f'Worker pid={getpid()}, ppid={getppid()} thread={current_thread().name}')
We can then create a process pool with two worker processes and use map() on the process pool to submit two tasks to report their details.
    ...
    # create a process pool
    with ProcessPoolExecutor(2) as executor:
        # submit some tasks
        _ = executor.map(work, range(2))
Tying this together, the complete example of reporting the details of worker processes in the ProcessPoolExecutor is listed below.
    # SuperFastPython.com
    # report pid and thread name for each worker process in the process pool
    from time import sleep
    from os import getpid
    from os import getppid
    from threading import current_thread
    from concurrent.futures import ProcessPoolExecutor

    # target task function
    def work(value):
        sleep(0.1)
        # report the worker process details
        print(f'Worker pid={getpid()}, ppid={getppid()} thread={current_thread().name}')

    # entry point
    if __name__ == '__main__':
        # report the main process details
        print(f'Main pid={getpid()}, ppid={getppid()} thread={current_thread().name}')
        # create a process pool
        with ProcessPoolExecutor(2) as executor:
            # submit some tasks
            _ = executor.map(work, range(2))
Running the example first reports the details of the main process, and we can see it has a process id, a parent process id, and a main thread.
We then report the details for each worker process.
We can see that each worker has a different pid as we expected, and that they both have the same parent, which is the main process in which we created the process pool.
Finally, we can see that each process has a main thread.
    Main pid=47838, ppid=472 thread=MainThread
    Worker pid=47840, ppid=47838 thread=MainThread
    Worker pid=47841, ppid=47838 thread=MainThread
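Printing from worker processes is fine for a quick check, but it can be handy to return the details to the main process so they can be compared there. The sketch below does this with a hypothetical task function named worker_details(). It also reports the process name via multiprocessing.current_process(). Note that it only reports whether each worker is a child of the main process rather than assuming it, because with the 'forkserver' start method workers are forked from a separate server process.

```python
# sketch: return worker details to the main process instead of printing them
from os import getpid, getppid
from multiprocessing import current_process
from threading import current_thread
from concurrent.futures import ProcessPoolExecutor

# hypothetical task: report where the task is running
def worker_details(value):
    return {
        'task': value,
        'pid': getpid(),
        'ppid': getppid(),
        'process': current_process().name,
        'thread': current_thread().name,
    }

# entry point
if __name__ == '__main__':
    main_pid = getpid()
    with ProcessPoolExecutor(2) as executor:
        results = list(executor.map(worker_details, range(4)))
    for details in results:
        # flag whether the worker's parent is the process that created the pool
        child_of_main = details['ppid'] == main_pid
        print(details, f'child_of_main={child_of_main}')
    # tasks may outnumber workers, so count the distinct worker pids
    worker_pids = set(details['pid'] for details in results)
    print(f'distinct workers: {len(worker_pids)}')
```

Four tasks are submitted to two workers here, so the same pid should appear more than once in the results.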
ProcessPoolExecutor Creates Internal Helper Threads
The ProcessPoolExecutor will also create two internal helper threads in the main process, used by the process pool for the internal management of tasks.
We can see this if we report all threads that exist within the main process.
This can be achieved by using the threading.enumerate() function that will return a threading.Thread instance for each thread in the process.
For example:
    ...
    # report all thread names
    for thread in threading.enumerate():
        print(thread.name)
We can create a process pool and then submit some tasks to ensure that the worker processes and the internal helper threads are created.
If we report all threads in the main process after submitting some tasks, we would expect to see three threads: the main thread and two internal helper threads.
Tying this together, the complete example of inspecting the ProcessPoolExecutor internal helper threads is listed below.
    # SuperFastPython.com
    # report helper threads created by the process pool
    import threading
    from concurrent.futures import ProcessPoolExecutor

    # target task function
    def work(value):
        pass

    # entry point
    if __name__ == '__main__':
        # create a process pool
        with ProcessPoolExecutor(4) as executor:
            # submit some tasks so that helper threads are created
            _ = [executor.submit(work, i) for i in range(4)]
            # report all thread names
            for thread in threading.enumerate():
                print(thread.name)
Running the example, we can see that three threads were reported in the main process as we expected.
The first is the main thread, created for every Python process.
The second is an internal helper thread for the ProcessPoolExecutor. Looking in the source code for the ProcessPoolExecutor class, this thread is an instance of the internal _ExecutorManagerThread class and is responsible for managing the communication between the main process and the worker processes, e.g. getting results.
The third is another internal helper thread for the ProcessPoolExecutor. Looking in the source code for multiprocessing.SimpleQueue and multiprocessing.Queue, we can see that a multiprocessing.Queue has its own internal feeder thread, whereas a SimpleQueue does not. In the process pool, this is the queue used to send tasks to the worker processes.
    MainThread
    Thread-1
    QueueFeederThread
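To get a feel for when the helper threads appear, we can report the thread names at a few points in the life of the pool. The following is a rough sketch, and thread_names() is a hypothetical helper. Exactly when the helper threads are started and stopped is an implementation detail of the standard library and may differ between Python versions, so no expected output is listed.

```python
# sketch: report the threads in the main process at different points
import threading
from concurrent.futures import ProcessPoolExecutor

# hypothetical helper: names of all threads alive in the current process
def thread_names():
    return [thread.name for thread in threading.enumerate()]

# trivial task, only here so the pool has work to do
def work(value):
    return value

# entry point
if __name__ == '__main__':
    print(f'before pool: {thread_names()}')
    with ProcessPoolExecutor(2) as executor:
        print(f'pool created: {thread_names()}')
        futures = [executor.submit(work, i) for i in range(2)]
        # wait for the tasks so any helper threads have had time to start
        for future in futures:
            future.result()
        print(f'tasks done: {thread_names()}')
    print(f'after shutdown: {thread_names()}')
```

Waiting on the results before reporting avoids a race where the threads are listed before a helper thread has been started.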
ProcessPoolExecutor Helper Thread Runs Callbacks
One question we might consider is what process and thread runs callbacks.
This is important to know if we need to access a resource shared between processes and threads from within the callback function. It will determine any locks and cross-process data sharing mechanisms that may be needed.
When we submit tasks to the process pool via the submit() function we receive a Future object in return.
The Future object can be used to check the status of the task, get the result, and add a callback function to call once the task is done. This can be achieved by calling the add_done_callback() function on the Future object and specifying the callback function to call.
The callback function must take a single argument which is the Future object for the task that is done.
We can define a callback function that reports the pid, parent pid and thread name to learn about the process and thread that is executing the callback.
    # callback function to call when a task is completed
    def custom_callback(future):
        print(f'Callback pid={getpid()}, ppid={getppid()} thread={current_thread().name}')
We can define a target task function that reports the same details, for example:
    # target task function
    def work():
        sleep(0.1)
        # report the worker process details
        print(f'Worker pid={getpid()}, ppid={getppid()} thread={current_thread().name}')
In our main process, we can first report the details of the process.
    ...
    # report the main process details
    print(f'Main pid={getpid()}, ppid={getppid()} thread={current_thread().name}')
Then, we can create the process pool, submit a single task in order to get a Future object, then register the callback function.
    ...
    # report the main process details
    print(f'Main pid={getpid()}, ppid={getppid()} thread={current_thread().name}')
    # create a process pool
    with ProcessPoolExecutor(2) as executor:
        # submit a task
        future = executor.submit(work)
        # add a callback function
        future.add_done_callback(custom_callback)
Tying this together, the complete example of checking the process and thread details of the process that executes callbacks for the ProcessPoolExecutor is listed below.
    # SuperFastPython.com
    # check the process and thread used to execute future callbacks
    from time import sleep
    from os import getpid
    from os import getppid
    from threading import current_thread
    from concurrent.futures import ProcessPoolExecutor

    # callback function to call when a task is completed
    def custom_callback(future):
        print(f'Callback pid={getpid()}, ppid={getppid()} thread={current_thread().name}')

    # target task function
    def work():
        sleep(0.1)
        # report the worker process details
        print(f'Worker pid={getpid()}, ppid={getppid()} thread={current_thread().name}')

    # entry point
    if __name__ == '__main__':
        # report the main process details
        print(f'Main pid={getpid()}, ppid={getppid()} thread={current_thread().name}')
        # create a process pool
        with ProcessPoolExecutor(2) as executor:
            # submit a task
            future = executor.submit(work)
            # add a callback function
            future.add_done_callback(custom_callback)
Running the example we can first see the main process has the pid of 48004 and the main thread as we expected.
We can then see that the worker process has a different pid of 48006 and a parent pid that matches the main process. This too is as we expected.
Finally, the details of the process that executed the callback function are reported.
In this case, we can see that the main process executed the callback using the “Thread-1” internal helper thread.
Note, your specific pids will differ because the numbers are allocated by your operating system at the time that the program is executed.
    Main pid=48004, ppid=472 thread=MainThread
    Worker pid=48006, ppid=48004 thread=MainThread
    Callback pid=48004, ppid=472 thread=Thread-1
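One caveat is worth keeping in mind. The documentation for add_done_callback() notes that if the future has already completed, the callback is called immediately. In that case we would expect the callback to run in the thread that registered it, not in the helper thread. The sketch below illustrates this without a process pool by creating a Future by hand and marking it done, which is something we would normally only do for testing. The names callback_threads and record_thread() are hypothetical.

```python
# sketch: a callback added to an already-done future runs in the calling thread
from threading import current_thread
from concurrent.futures import Future

# names of the threads that executed the callback
callback_threads = []

# hypothetical callback: record the name of the thread running it
def record_thread(future):
    callback_threads.append(current_thread().name)

# a Future created by hand, purely for illustration
future = Future()
future.set_result('done')
# the future is already done, so the callback is called right away
future.add_done_callback(record_thread)
# report the name of the thread that ran the callback
print(callback_threads)
```

In the earlier example the task sleeps for a fraction of a second, so the callback is very likely registered before the task finishes and is therefore run by the helper thread.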
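This means that the callback function can share data with the main process directly (e.g. global variables), and because callbacks are called one at a time by the same helper thread, we do not need to worry about protecting data from simultaneous calls to the callback function.
Put another way, data or variables used only by the callback function, such as a counter, can be updated without a lock. Data that is also changed by the main thread may still need protection, because the callback runs in a different thread of the same process.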
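As a small illustration of sharing data with a callback, the sketch below collects task results in a plain list in the main process. The names collected, collect() and square() are hypothetical. Each entry records the pid that ran the callback, so we can check that the callbacks all ran in the main process.

```python
# sketch: collect task results in the main process using a callback
from os import getpid
from concurrent.futures import Future, ProcessPoolExecutor

# results gathered by the callback, stored in the main process
collected = []

# hypothetical callback: store the pid running the callback and the task result
def collect(future: Future) -> None:
    collected.append((getpid(), future.result()))

# hypothetical task executed in a worker process
def square(value):
    return value * value

# entry point
if __name__ == '__main__':
    with ProcessPoolExecutor(2) as executor:
        for i in range(5):
            executor.submit(square, i).add_done_callback(collect)
    # the pool has shut down, so the tasks are done and callbacks were called
    results = sorted(result for _, result in collected)
    callback_pids = set(pid for pid, _ in collected)
    in_main = callback_pids == set([getpid()])
    print(f'results: {results}')
    print(f'callbacks ran in main process: {in_main}')
```

No lock is used around the list because only the callback appends to it, and the main thread reads it only after the pool has shut down.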
Takeaways
You now know how to check the process ids and thread names in the ProcessPoolExecutor.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.