How to Shutdown the ThreadPool in Python

September 16, 2022 Python ThreadPool

You can close a ThreadPool via the close() or terminate() methods.

In this tutorial you will discover how to close a ThreadPool in Python.

Let's get started.

Need to Close a ThreadPool

The multiprocessing.pool.ThreadPool in Python provides a pool of reusable threads for executing ad hoc tasks.

A thread pool object which controls a pool of worker threads to which jobs can be submitted.

-- multiprocessing — Process-based parallelism

The ThreadPool class extends the Pool class. The Pool class provides a pool of worker processes for process-based concurrency.

Although the ThreadPool class is in the multiprocessing module it offers thread-based concurrency and is best suited to IO-bound tasks, such as reading or writing from sockets or files.

A ThreadPool can be configured when it is created, which will prepare the new threads.

We can issue one-off tasks to the ThreadPool using functions such as apply() or we can apply the same function to an iterable of items using functions such as map().

Results for issued tasks can then be retrieved synchronously, or we can retrieve the result of tasks later by using asynchronous versions of the functions such as apply_async() and map_async().
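
As a minimal sketch of these four ways to issue tasks, the listing below uses a hypothetical square() task and a hypothetical issue_tasks() helper (neither is part of the ThreadPool API) and assumes a pool of four worker threads.

```python
# sketch: synchronous and asynchronous ways to issue tasks to a ThreadPool
from multiprocessing.pool import ThreadPool

# hypothetical task used only for illustration
def square(value):
    return value * value

# hypothetical helper that issues tasks in each of the four ways
def issue_tasks():
    # create a small pool with an explicit number of worker threads
    pool = ThreadPool(4)
    try:
        # one-off task, blocks until the result is available
        one_off = pool.apply(square, (3,))
        # same function applied to each item, blocks until all are done
        mapped = pool.map(square, range(5))
        # asynchronous versions return an AsyncResult straight away
        async_one = pool.apply_async(square, (4,))
        async_many = pool.map_async(square, range(3))
        # retrieve the results of the asynchronous tasks later
        return one_off, mapped, async_one.get(), async_many.get()
    finally:
        # release the worker threads
        pool.close()
        pool.join()

# protect the entry point
if __name__ == '__main__':
    print(issue_tasks())
```

The synchronous calls return values directly, whereas the asynchronous calls return AsyncResult objects whose get() method blocks until the result is ready.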

The ThreadPool must be closed once we are finished with it in order to release the worker threads.

How can we safely close the ThreadPool once we are finished with it?

Problem With Not Closing ThreadPool

We may encounter problems if we do not close the ThreadPool once we are finished with it.

The ThreadPool maintains multiple worker threads. Each thread is an object maintained by the operating system with its own stack space.

In addition, the ThreadPool has helper threads responsible for dispatching tasks and gathering results from the worker threads.

If the ThreadPool is not explicitly closed, it means that the resources required to operate the ThreadPool, e.g. the worker threads and stack space, may not be released and made available to the program.

multiprocessing.pool objects have internal resources that need to be properly managed (like any other resource) by using the pool as a context manager or by calling close() and terminate() manually.

-- multiprocessing — Process-based parallelism

In addition, we might expect worker threads that are still running to prevent the main thread from exiting.

However, the worker threads are daemon threads, which means that the main thread can exit even while worker threads are running.
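
One way to check the daemon status ourselves is to compare the threads that exist before and after a pool is created. This is only a sketch: the new_pool_threads() helper is a hypothetical name, and it assumes no other part of the program starts threads at the same time.

```python
# sketch: check that the threads started by a ThreadPool are daemon threads
from multiprocessing.pool import ThreadPool
import threading

# hypothetical helper, returns (name, daemon) for each thread the pool started
def new_pool_threads(workers=2):
    # threads that exist before the pool is created
    before = set(threading.enumerate())
    pool = ThreadPool(workers)
    try:
        # threads started by the pool (workers plus internal helpers)
        started = [t for t in threading.enumerate() if t not in before]
        return [(t.name, t.daemon) for t in started]
    finally:
        # shut the pool down explicitly
        pool.close()
        pool.join()

# protect the entry point
if __name__ == '__main__':
    for name, daemon in new_pool_threads():
        print(f'{name}: daemon={daemon}')
```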


Nevertheless, there is a known issue where if we let the Python garbage collector shutdown and delete the ThreadPool for us (e.g. finalize), and an exception is raised during the shutdown procedure, then it is possible for the pool to not shutdown properly.

... Failure to do this can lead to the process hanging on finalization. Note that it is not correct to rely on the garbage collector to destroy the pool as CPython does not assure that the finalizer of the pool will be called (see object.__del__() for more information).

-- multiprocessing — Process-based parallelism

The documentation describes these issues in terms of processes in the multiprocessing.pool.Pool class. Nevertheless, we may have issues with threads as well.

Therefore, it is a best practice to always shutdown the ThreadPool.
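
One possible way to follow this practice, when a context manager does not fit, is a try-finally block. The sketch below assumes a hypothetical work() task and run_batch() helper. The pool is closed and joined even if a task raises an error.

```python
# sketch: always shut down the pool, even if a task fails
from multiprocessing.pool import ThreadPool
from threading import active_count

# hypothetical task that fails for negative items
def work(item):
    if item < 0:
        raise ValueError('negative item')
    return item + 1

# hypothetical helper that runs a batch of items and always releases the pool
def run_batch(items):
    pool = ThreadPool(2)
    try:
        # an error raised in a task is re-raised here by map()
        return pool.map(work, items)
    finally:
        # always executed, with or without an error
        pool.close()
        pool.join()

# protect the entry point
if __name__ == '__main__':
    print(run_batch([1, 2, 3]))
    print(f'Active threads: {active_count()}')
```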

How to Shutdown the ThreadPool

There are two ways to shutdown the ThreadPool.

They are to call close() or to call terminate().

We can also call terminate() automatically via the context manager interface.

Let's take a closer look at each of these cases in turn.

How to Close the ThreadPool

The ThreadPool can be shutdown by calling the close() method.

For example:

...
# close the thread pool
pool.close()

This will prevent the pool from accepting new tasks.

Once all issued tasks are completed, the resources of the ThreadPool, such as the worker threads, will be released.

Prevents any more tasks from being submitted to the pool. Once all the tasks have been completed the worker processes will exit.

-- multiprocessing — Process-based parallelism
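
To make this behavior concrete, the sketch below issues more tasks than there are workers, closes the pool, and then joins it. The slow_double() task and close_then_collect() helper are hypothetical names chosen for illustration.

```python
# sketch: close() lets all issued tasks finish, including queued ones
from multiprocessing.pool import ThreadPool
from time import sleep

# hypothetical task that takes a moment to complete
def slow_double(value):
    sleep(0.1)
    return value * 2

# hypothetical helper that closes the pool while tasks are still pending
def close_then_collect():
    pool = ThreadPool(2)
    # issue more tasks than workers, so some must wait in the queue
    results = [pool.apply_async(slow_double, (i,)) for i in range(4)]
    # no new tasks are accepted from here on
    pool.close()
    # block until every issued task has finished and the workers have exited
    pool.join()
    # report whether each result is ready, and the values themselves
    return [r.ready() for r in results], [r.get() for r in results]

# protect the entry point
if __name__ == '__main__':
    print(close_then_collect())
```

Calling join() after close() is what makes the main thread wait. The close() method itself returns immediately.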

How to Terminate the ThreadPool

The ThreadPool can be shutdown by calling the terminate() method.

For example:

...
# terminate the thread pool
pool.terminate()

This will prevent the pool from accepting new tasks.

It is documented on the Pool class as immediately terminating all workers, even if they are in the middle of executing a task.

Stops the worker processes immediately without completing outstanding work.

-- multiprocessing — Process-based parallelism

The problem is threads cannot be terminated immediately in the same way as processes.

Therefore the terminate() method will not actually terminate workers that are executing tasks. Instead, it will only close workers that are not executing tasks, then close workers that are executing tasks once they are complete.

In effect, terminate() is the same as close() on the ThreadPool, unlike the Pool class.
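
We can check this claim for a task that is already running with a small sketch. The terminate_while_running() helper and its inner task() function are hypothetical, and events are used so that the pool is only terminated once the task has definitely started.

```python
# sketch: terminate() does not interrupt a task that is already running
from multiprocessing.pool import ThreadPool
from threading import Event
from time import sleep

# hypothetical helper, returns True if the running task was allowed to finish
def terminate_while_running():
    started = Event()
    finished = Event()

    # task executed in a worker thread
    def task():
        started.set()
        sleep(0.2)
        finished.set()

    pool = ThreadPool(1)
    pool.apply_async(task)
    # make sure the task is running before terminating the pool
    started.wait()
    pool.terminate()
    # wait for the worker thread to exit
    pool.join()
    return finished.is_set()

# protect the entry point
if __name__ == '__main__':
    print(f'Task finished: {terminate_while_running()}')
```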

How to Automatically Terminate the ThreadPool

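The ThreadPool.terminate() function can be called automatically in two circumstances: when the ThreadPool object is deleted by the Python garbage collector, and when the block of the context manager is exited.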

It is not a good idea to rely on the terminate() function being called by the Python garbage collector in order to shutdown the ThreadPool as it can result in the main thread hanging in rare circumstances.

The ThreadPool class provides the context manager interface.

This means that we can create and configure a ThreadPool, use it within the context manager block, and once the block is exited, the ThreadPool will be closed automatically for us.

For example:

...
# create a thread pool
with ThreadPool() as pool:
    # use the thread pool
    # ...
# thread pool is closed automatically

Exiting the block of the context manager normally or via an error will result in the terminate() function of the ThreadPool being called automatically.

Pool objects now support the context management protocol [...]. __enter__() returns the pool object, and __exit__() calls terminate().

-- multiprocessing — Process-based parallelism
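
As a sketch of this pattern, the listing below uses a blocking map() call inside the block so that all tasks finish before terminate() is called, then calls join() after the block to wait for the pool's threads to exit. The shout() task and run_in_context() helper are hypothetical, and the thread count assumes nothing else in the program is starting threads.

```python
# sketch: use the pool as a context manager, then join it after the block
from multiprocessing.pool import ThreadPool
from threading import active_count

# hypothetical task used only for illustration
def shout(text):
    return text.upper()

# hypothetical helper, returns the results and the number of leftover threads
def run_in_context(words):
    baseline = active_count()
    with ThreadPool(2) as pool:
        # map() blocks, so all tasks finish before the block exits
        results = pool.map(shout, words)
    # terminate() has been called, join() waits for the threads to exit
    pool.join()
    return results, active_count() - baseline

# protect the entry point
if __name__ == '__main__':
    print(run_in_context(['a', 'b']))
```

Calling join() is optional here. It is a deterministic alternative to sleeping for a moment when we want to be sure the worker threads have exited.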


close() vs terminate()

The key difference between close() and terminate() according to the API documentation in the Pool class (the parent of the ThreadPool class) is that terminate() shuts down the pool immediately, even if tasks are running, whereas close() waits for issued tasks to complete before shutting down.

As we have seen, threads cannot be terminated while executing.

Therefore, there is no practical difference between close() and terminate() on the ThreadPool class.

Now that we know how to shutdown the ThreadPool, let's look at some worked examples.

Example of Closing the ThreadPool

We can explore how to close the ThreadPool safely.

In this example we will create a ThreadPool, issue a task, wait for the task to complete, then close the ThreadPool. We will define a new custom function to execute as a task in the pool which will report a message and sleep for a moment.

Firstly, we can define a function to run in a new task in the ThreadPool.

The task() function below implements this.

# task executed in a worker thread
def task():
    # report a message
    print(f'Task executing', flush=True)
    # block for a moment
    sleep(1)
    # report a message
    print(f'Task done', flush=True)

Next, in the main thread we can create a ThreadPool with a default configuration.

...
# create and configure the thread pool
pool = ThreadPool()

Next, we can issue our task() function to the ThreadPool asynchronously.

Notice that we don't have to wait for the task to complete.

...
# issue tasks to the thread pool
result = pool.apply_async(task)

Next, we can close the ThreadPool, then call join() to wait for the issued task to complete and for the worker threads to be released.

...
# close the thread pool
pool.close()
# wait for all tasks to finish and the worker threads to exit
pool.join()
# report a message
print('Main all done.')

Finally, we can report the number of active threads in the process, which we expect to be one (the main thread), as the ThreadPool has now shut down.

...
# report the number of worker threads that are still active
active_threads = active_count()
print(f'Active threads: {active_threads}')

Tying this together, the complete example is listed below.

# SuperFastPython.com
# example of closing the thread pool
from time import sleep
from multiprocessing.pool import ThreadPool
from threading import active_count

# task executed in a worker thread
def task():
    # report a message
    print(f'Task executing')
    # block for a moment
    sleep(1)
    # report a message
    print(f'Task done')

# protect the entry point
if __name__ == '__main__':
    # create and configure the thread pool
    pool = ThreadPool()
    # issue tasks to the thread pool
    result = pool.apply_async(task)
    # close the thread pool
    pool.close()
    # wait for all tasks to finish and the worker threads to exit
    pool.join()
    # report a message
    print('Main all done.')
    # report the number of worker threads that are still active
    active_threads = active_count()
    print(f'Active threads: {active_threads}')

Running the example first creates the ThreadPool then issues the task to the ThreadPool.

The ThreadPool begins executing the task in a worker thread.

The main thread then closes the ThreadPool while the task is running.

This prevents the pool from taking any further tasks, then closes all worker threads once all tasks are completed. The main thread blocks waiting for all worker threads to be released.

The task in the ThreadPool finishes and the worker threads in the pool are closed.

The main thread carries on and reports the number of active threads, which is one as expected (e.g. the main thread).

Task executing
Task done
Main all done.
Active threads: 1

Example of Terminating the ThreadPool

We can explore how to shutdown the ThreadPool by terminating the worker threads.

In this example, we can update the previous example and let the issued task run for a moment then terminate the pool before the task has had a chance to complete.

...
# issue tasks to the thread pool
result = pool.apply_async(task)
# wait a moment
sleep(0.5)
# terminate the thread pool
pool.terminate()

As before, we can wait for the worker threads in the pool to finish, then report the number of active threads, which we expect to be one (the main thread).

...
# wait for the worker threads to exit
pool.join()
# report a message
print('Main all done.')
# report the number of worker threads that are still active
active_threads = active_count()
print(f'Active threads: {active_threads}')

Tying this together, the complete example is listed below.

# SuperFastPython.com
# example of terminating the thread pool
from time import sleep
from multiprocessing.pool import ThreadPool
from threading import active_count

# task executed in a worker thread
def task():
    # report a message
    print(f'Task executing')
    # block for a moment
    sleep(1)
    # report a message
    print(f'Task done')

# protect the entry point
if __name__ == '__main__':
    # create and configure the thread pool
    pool = ThreadPool()
    # issue tasks to the thread pool
    result = pool.apply_async(task)
    # wait a moment
    sleep(0.5)
    # terminate the thread pool
    pool.terminate()
    # wait for the worker threads to exit
    pool.join()
    # report a message
    print('Main all done.')
    # report the number of worker threads that are still active
    active_threads = active_count()
    print(f'Active threads: {active_threads}')

Running the example first creates the ThreadPool then issues the task to the ThreadPool.

The ThreadPool begins executing the task in a worker thread.

The main thread blocks for a moment, then terminates the ThreadPool.

This prevents the pool from taking any further tasks, then closes any worker threads that are not executing a task. The main thread blocks waiting for all worker threads to be closed.

Note, the running tasks are not terminated, but are instead allowed to finish. This is because threading.Thread instances cannot be terminated mid-task, e.g. they do not have a terminate() method.

The task in the ThreadPool therefore finishes, as shown by the "Task done" message, and the worker threads in the pool are then closed.

The main thread carries on and reports the number of active threads, which is one as expected (e.g. the main thread).

Task executing
Task done
Main all done.
Active threads: 1

Automatically Terminating the ThreadPool with Garbage Collection

We can explore the Python garbage collector terminating the ThreadPool for us.

As mentioned above, this approach is not recommended.

Nevertheless, it is important to understand the behavior of the pool under these circumstances.

We can update the example above to issue the task to the pool, wait a moment, then exit the main thread without explicitly closing the ThreadPool.

...
# issue tasks to the thread pool
result = pool.apply_async(task)
# wait a moment
sleep(0.5)
# report a message
print('Main all done.')

This will trigger the Python garbage collector to call finalize() on the ThreadPool object.

At this time the ThreadPool will still be running and executing one task.

The terminate() function will be called to immediately terminate the pool, allowing the main thread to exit.

Note, the worker threads in the ThreadPool are daemon threads, therefore will not prevent the main thread from exiting.

We can demonstrate this by reporting the daemon status of all threads.

...
# report the number of threads that are still active
active_threads = threading.enumerate()
print(f'Active children: {len(active_threads)}')
# report the daemon status of each thread
for thread in active_threads:
    print(thread)

Tying this together, the complete example is listed below.

# SuperFastPython.com
# example of not closing the thread pool
from time import sleep
from multiprocessing.pool import ThreadPool
import threading

# task executed in a worker thread
def task():
    # report a message
    print(f'Task executing')
    # block for a moment
    sleep(1)
    # report a message
    print(f'Task done')

# protect the entry point
if __name__ == '__main__':
    # create and configure the thread pool
    pool = ThreadPool()
    # issue tasks to the thread pool
    result = pool.apply_async(task)
    # wait a moment
    sleep(0.5)
    # report a message
    print('Main all done.')
    # report the number of threads that are still active
    active_threads = threading.enumerate()
    print(f'Active children: {len(active_threads)}')
    # report the daemon status of each thread
    for thread in active_threads:
        print(thread)

Running the example first creates the ThreadPool then issues the task to the ThreadPool.

The ThreadPool begins executing the task in a worker thread.

The main thread blocks for a moment.

It then reports the number of active threads, which is 12 in this case.

On my system, there are 8 worker threads, 1 main thread, and 3 helper threads internal to the ThreadPool class.

Note, the number of active threads will differ depending on your system and the default number of worker threads that were created.
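
To make the thread count less dependent on the system, we can pass an explicit number of workers when creating the pool. The sketch below uses a hypothetical threads_added_by_pool() helper and assumes that no other threads are started or stopped while it runs. The exact number of internal helper threads is an implementation detail, so only the difference between two pool sizes is compared.

```python
# sketch: count the threads added by pools of different sizes
from multiprocessing.pool import ThreadPool
from threading import active_count

# hypothetical helper, returns how many threads creating the pool added
def threads_added_by_pool(workers):
    baseline = active_count()
    # create a pool with an explicit number of worker threads
    pool = ThreadPool(workers)
    try:
        return active_count() - baseline
    finally:
        # shut the pool down explicitly
        pool.close()
        pool.join()

# protect the entry point
if __name__ == '__main__':
    print(f'Threads added with 2 workers: {threads_added_by_pool(2)}')
    print(f'Threads added with 4 workers: {threads_added_by_pool(4)}')
```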

The daemon status of all worker threads is then reported.

We can see that all worker threads with the name "DummyProcess" are daemon threads, as expected. This means they will not prevent the main thread from exiting. We can also see the three helper threads are also daemon threads, as we would expect, and the main thread is not a daemon thread.

The main thread exits. The Python garbage collector triggers the multiprocessing.pool.ThreadPool object to be deleted and indirectly results in the terminate() function on the pool being called.

This prevents the pool from taking any further tasks, then closes all worker threads.

The task in the thread pool does not get a chance to finish. The worker threads in the pool are forcefully stopped.

Task executing
Main all done.
Active children: 12
<_MainThread(MainThread, started 4721761792)>
<DummyProcess(Thread-1, started daemon 123145500561408)>
<DummyProcess(Thread-2, started daemon 123145517350912)>
<DummyProcess(Thread-3, started daemon 123145534140416)>
<DummyProcess(Thread-4, started daemon 123145550929920)>
<DummyProcess(Thread-5, started daemon 123145567719424)>
<DummyProcess(Thread-6, started daemon 123145584508928)>
<DummyProcess(Thread-7, started daemon 123145601298432)>
<DummyProcess(Thread-8, started daemon 123145618087936)>
<Thread(Thread-9, started daemon 123145634877440)>
<Thread(Thread-10, started daemon 123145651666944)>
<Thread(Thread-11, started daemon 123145668456448)>

Automatically Terminating the ThreadPool with the Context Manager

We can automatically call terminate on the ThreadPool once we are finished with it via the context manager interface.

This is a preferred way to work with the ThreadPool, if the work with the pool can fit within the context manager's block.

We can update the example to create and configure the pool using the context manager interface.

...
# create and configure the thread pool
with ThreadPool() as pool:
    # ...

We can then issue the task within the pool, and wait for it to complete.

...
# issue tasks to the thread pool
result = pool.apply_async(task)
# wait for the task to complete
result.wait()

It is important that we wait for the task to complete within the context manager block.

This can be achieved by using blocking calls on the ThreadPool or waiting on the result directly as we are in this case.

If we carried on and the context manager block exited, then the pool would be terminated while the task was still running.

Once we are finished with the ThreadPool, we can report a message and check the number of active threads.

...
# report a message
print('Main all done.')
# wait a moment
sleep(0.5)
# report the number of threads that are active
active_threads = active_count()
print(f'Active threads: {active_threads}')

Tying this together, the complete example is listed below.

# SuperFastPython.com
# example of automatically terminating the thread pool
from time import sleep
from multiprocessing.pool import ThreadPool
from threading import active_count

# task executed in a worker thread
def task():
    # report a message
    print(f'Task executing')
    # block for a moment
    sleep(1)
    # report a message
    print(f'Task done')

# protect the entry point
if __name__ == '__main__':
    # create and configure the thread pool
    with ThreadPool() as pool:
        # issue tasks to the thread pool
        result = pool.apply_async(task)
        # wait for the task to complete
        result.wait()
    # report a message
    print('Main all done.')
    # wait a moment
    sleep(0.5)
    # report the number of threads that are active
    active_threads = active_count()
    print(f'Active threads: {active_threads}')

Running the example first creates the ThreadPool then issues the task to the ThreadPool.

The ThreadPool begins executing the task in a worker thread.

The main thread then blocks, waiting for the task to complete.

Once the task is finished, the main thread unblocks and continues on. It exits the context manager block which triggers the terminate() function to be called by the __exit__() function of the context manager.

The worker threads in the ThreadPool are then closed.

The main thread blocks for a moment to allow the worker threads to shutdown completely.

The main thread then reports the number of active threads, which is one as expected, e.g. the main thread.

Task executing
Task done
Main all done.
Active threads: 1

Error When Issuing Task to Closed ThreadPool

We cannot issue new tasks to a ThreadPool that has been shutdown.

That is, if the pool has been shutdown with a call to close() or terminate() and we try to issue a new task, it results in an error.

We can demonstrate this with a worked example.

The example above can be updated so that we close the pool, then issue a second task.

...
# issue tasks to the thread pool
result = pool.apply_async(task)
# close the thread pool
pool.close()
# issue another task
pool.apply_async(task)

This is expected to result in an error.

Tying this together, the complete example is listed below.

# SuperFastPython.com
# example of issuing a task to a pool that has already been closed.
from time import sleep
from multiprocessing.pool import ThreadPool

# task executed in a worker thread
def task():
    # report a message
    print(f'Task executing')
    # block for a moment
    sleep(1)
    # report a message
    print(f'Task done')

# protect the entry point
if __name__ == '__main__':
    # create and configure the thread pool
    pool = ThreadPool()
    # issue tasks to the thread pool
    result = pool.apply_async(task)
    # close the thread pool
    pool.close()
    # issue another task
    pool.apply_async(task)

Running the example first creates the ThreadPool then issues the task to the ThreadPool.

The ThreadPool begins executing the task in a worker thread.

The main thread then closes the ThreadPool while the task is running.

This prevents the pool from taking any further tasks.

The main thread then attempts to issue a new task to the pool. This results in an error, as expected, reporting "Pool not running".

Traceback (most recent call last):
  ...
Task executing
    pool.apply_async(task)
  ...
ValueError: Pool not running
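
If a program cannot be sure whether the pool is still running, one option is to catch the error when issuing the task. This is only a sketch: the try_submit() and demo() helpers are hypothetical names, and the task is simplified to return a value.

```python
# sketch: handle the error raised when issuing a task to a closed pool
from multiprocessing.pool import ThreadPool

# hypothetical task used only for illustration
def task():
    return 'ok'

# hypothetical helper, returns an AsyncResult or None if the pool is shut down
def try_submit(pool):
    try:
        return pool.apply_async(task)
    except ValueError as e:
        # raised when the pool is no longer running
        print(f'Task rejected: {e}')
        return None

# hypothetical helper that issues a task before and after closing the pool
def demo():
    pool = ThreadPool(2)
    # accepted, the pool is running
    first = try_submit(pool)
    # close the thread pool
    pool.close()
    # rejected, the pool has been closed
    second = try_submit(pool)
    # wait for the first task to finish and the worker threads to exit
    pool.join()
    return first.get(), second

# protect the entry point
if __name__ == '__main__':
    print(demo())
```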

Takeaways

You now know how to shutdown the ThreadPool in Python.


