How to Report The Number of Remaining Tasks in the ProcessPoolExecutor

February 3, 2022 Python ProcessPoolExecutor

You can check the number of remaining tasks in the ProcessPoolExecutor by taking the length of its internal _pending_work_items dict.

In this tutorial you will discover how to check the number of remaining tasks in the process pool in Python.

Let's get started.

Need to Report the Number of Remaining Tasks

The ProcessPoolExecutor in Python provides a pool of reusable processes for executing ad hoc tasks.

You can submit tasks to the process pool by calling the submit() method and passing in the function you wish to execute in another process.

Calling submit() returns a Future object that allows you to check on the status of the task and get the result from the task once it completes.
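As a minimal sketch of this pattern, the snippet below submits a single task and reads its status and result from the returned Future. The square() function is a hypothetical task used only for illustration.

```python
from concurrent.futures import ProcessPoolExecutor

# hypothetical task used only for illustration
def square(x):
    return x * x

# entry point
def main():
    with ProcessPoolExecutor(2) as executor:
        # submit() returns immediately with a Future for the task
        future = executor.submit(square, 7)
        # done() reports the status of the task without blocking
        print(f'Done yet? {future.done()}')
        # result() blocks until the task has finished
        print(f'Result: {future.result()}')

if __name__ == '__main__':
    main()
```

The value printed by done() depends on timing, whereas result() always waits for the task to finish.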

You may need to know the number of remaining tasks in the process pool.

This may be for many reasons, such as providing feedback to the user as to how much work remains to be completed before the program will be done.

It may not be straightforward to check how many tasks remain in the process pool, especially if you do not keep a reference to Future objects returned from calls to submit().

How can you check the number of remaining tasks in the process pool?

How to Check the Number of Remaining Tasks

There are two ways you can estimate the number of tasks that remain in the process pool.

In the first case, if all tasks are submitted from the main thread, you can keep a count of the number of tasks submitted.

You could then use a thread-safe counter in a Future done callback to keep track of the number of tasks that are done. Done callbacks run in a thread of the process that registered them, so a lock is enough to protect the counter. The difference between the two counts could be used as a crude progress indicator.
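One way the counter approach might look is sketched below. The ProgressCounter class and its method names are hypothetical and not part of the standard library. Future done callbacks are called in a thread belonging to the process that added them, so this sketch guards the counts with a threading.Lock.

```python
from concurrent.futures import ProcessPoolExecutor
from random import random
from threading import Lock
from time import sleep

class ProgressCounter:
    """Hypothetical helper that counts submitted and completed tasks."""

    def __init__(self):
        self._lock = Lock()
        self.submitted = 0
        self.done = 0

    def track(self, future):
        # call once for each Future returned by submit()
        with self._lock:
            self.submitted += 1
        future.add_done_callback(self._on_done)
        return future

    def _on_done(self, future):
        # called in the submitting process when the task finishes or fails
        with self._lock:
            self.done += 1

    def remaining(self):
        with self._lock:
            return self.submitted - self.done

# task that blocks for a moment
def task():
    sleep(random() * 0.1)

# entry point
def main():
    counter = ProgressCounter()
    with ProcessPoolExecutor(4) as executor:
        for _ in range(20):
            counter.track(executor.submit(task))
        # poll the counter as a crude progress indicator
        while counter.remaining() > 0:
            print(f'About {counter.remaining()} tasks remain')
            sleep(0.05)
    print('All tasks are done')

if __name__ == '__main__':
    main()
```

This version does not need to keep the Future objects, only the two counts.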

More simply, you can read the private attributes of the ProcessPoolExecutor directly in order to find out how many tasks remain to be processed. These attributes are implementation details of CPython rather than part of the public API, so they may change between Python versions.

Perhaps the simplest internal structure to check is the dictionary that maps work item identifiers to work items, i.e. the tasks that have been submitted but have not yet completed, whether they are still queued or currently running.

This is the _pending_work_items private attribute.

For example:

...
# access the dict of tasks that have not yet completed
executor._pending_work_items

We can report the size of this dictionary directly, for example:

...
# report the number of remaining tasks
print(f'About {len(executor._pending_work_items)} tasks remain')
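Because _pending_work_items is not part of the public API, we might wrap the lookup in a small helper that degrades gracefully if the attribute is missing. The remaining_tasks() function below is a hypothetical helper, a sketch rather than a definitive implementation.

```python
from concurrent.futures import ProcessPoolExecutor
from time import sleep

def remaining_tasks(executor):
    """Hypothetical helper: number of unfinished tasks, or None if unknown."""
    # _pending_work_items is a private implementation detail and may change
    pending = getattr(executor, '_pending_work_items', None)
    if pending is None:
        return None
    return len(pending)

# entry point
def main():
    with ProcessPoolExecutor(2) as executor:
        # submit a few short tasks
        for _ in range(6):
            executor.submit(sleep, 0.1)
        # report the number of remaining tasks
        print(f'About {remaining_tasks(executor)} tasks remain')

if __name__ == '__main__':
    main()
```

Returning None instead of raising an error lets the caller fall back to another progress indicator.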

Now that we know how to check the number of remaining tasks in the process pool, let's look at a worked example.

Example of Checking The Number of Remaining Tasks

Let's look at how we might check the number of tasks that remain in the ProcessPoolExecutor.

First, let's define a task that will block for a moment.

# task that blocks for a moment
def task():
    value = random()
    sleep(value)

Next, we can start the process pool and submit 50 tasks for execution.

...
# start the process pool
with ProcessPoolExecutor(4) as executor:
    # submit many tasks
    futures = [executor.submit(task) for _ in range(50)]
    print('Waiting for tasks to complete...')

While there are tasks remaining in the pool, we can report the number of tasks that remain.

An easy way to do this is to use the Future objects returned from calls to submit() and to report the progress of the pool each time a task is completed.

...
# update each time a task finishes
for _ in as_completed(futures):
    # report the number of remaining tasks
    print(f'About {len(executor._pending_work_items)} tasks remain')

Tying this together, the complete example below submits 50 tasks into a process pool with four worker processes, and reports the number of tasks that remain as each task finishes.

# SuperFastPython.com
# example of estimating the number of remaining tasks
from time import sleep
from random import random
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures import as_completed

# mock task that blocks for a moment
def task():
    value = random()
    sleep(value)

# entry point
def main():
    # start the process pool
    with ProcessPoolExecutor(4) as executor:
        # submit many tasks
        futures = [executor.submit(task) for _ in range(50)]
        print('Waiting for tasks to complete...')
        # update each time a task finishes
        for _ in as_completed(futures):
            # report the number of remaining tasks
            print(f'About {len(executor._pending_work_items)} tasks remain')

if __name__ == '__main__':
    main()

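Running the example executes the tasks quickly and reports the number of tasks that remain as each task is completed. The count includes tasks that are currently running in the worker processes, which is why the first report is 49 rather than 46.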

Waiting for tasks to complete...
About 49 tasks remain
About 48 tasks remain
About 47 tasks remain
About 46 tasks remain
About 45 tasks remain
About 44 tasks remain
About 43 tasks remain
About 42 tasks remain
About 41 tasks remain
About 40 tasks remain
About 39 tasks remain
About 38 tasks remain
About 37 tasks remain
About 36 tasks remain
About 35 tasks remain
About 34 tasks remain
About 33 tasks remain
About 32 tasks remain
About 31 tasks remain
About 30 tasks remain
About 29 tasks remain
About 28 tasks remain
About 27 tasks remain
About 26 tasks remain
About 25 tasks remain
About 24 tasks remain
About 23 tasks remain
About 22 tasks remain
About 21 tasks remain
About 20 tasks remain
About 19 tasks remain
About 18 tasks remain
About 17 tasks remain
About 16 tasks remain
About 15 tasks remain
About 14 tasks remain
About 13 tasks remain
About 12 tasks remain
About 11 tasks remain
About 10 tasks remain
About 9 tasks remain
About 8 tasks remain
About 7 tasks remain
About 6 tasks remain
About 5 tasks remain
About 4 tasks remain
About 3 tasks remain
About 2 tasks remain
About 1 tasks remain
About 0 tasks remain
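If we do keep the Future objects, a similar report can be produced with the public API only, by counting the futures that are not yet done. The count_unfinished() function below is a hypothetical helper, shown as a sketch of this alternative.

```python
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures import as_completed
from random import random
from time import sleep

def count_unfinished(futures):
    """Hypothetical helper: count the futures that are not yet done."""
    return sum(1 for future in futures if not future.done())

# task that blocks for a moment
def task():
    sleep(random() * 0.1)

# entry point
def main():
    with ProcessPoolExecutor(4) as executor:
        # submit many tasks
        futures = [executor.submit(task) for _ in range(20)]
        # update each time a task finishes
        for _ in as_completed(futures):
            print(f'About {count_unfinished(futures)} tasks remain')

if __name__ == '__main__':
    main()
```

This scans the whole list on every report, which is fine for a few hundred tasks but may be slow for very large batches.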

Takeaways

You now know how to check the number of tasks remaining in the ProcessPoolExecutor.