ThreadPool When Are Workers Started

October 5, 2022 Python ThreadPool

Child worker threads are started automatically after creating an instance of the ThreadPool class.

In this tutorial, you will discover when the worker threads are created in the ThreadPool in Python.

Let's get started.

ThreadPool Workers

The ThreadPool provides a pool of reusable workers for executing ad hoc tasks with thread-based concurrency.

An instance of the ThreadPool class can be created with a specific number of workers. If no number is given, a default number of workers is created to match the number of logical CPUs in the system.

Once created, ad hoc tasks can be issued to the pool for execution via the apply() method. Multiple tasks may be executed in the pool by calling the same function with different arguments via the map() method.

Tasks may be executed asynchronously in the pool via apply_async() and map_async() methods, allowing the caller to carry on with other tasks.
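As a quick refresher, the four methods mentioned above can be sketched as follows. This is a minimal illustration, not part of the original example: the square() task and the run_tasks() helper are hypothetical names chosen for the sketch.

```python
from multiprocessing.pool import ThreadPool

# hypothetical task used only for illustration
def square(value):
    return value * value

# hypothetical helper that issues tasks in each of the four ways
def run_tasks():
    # create a pool with four workers, terminated when the block exits
    with ThreadPool(4) as pool:
        # issue one task and block until it completes
        single = pool.apply(square, args=(3,))
        # issue many tasks and block until all complete
        many = pool.map(square, range(5))
        # issue one task asynchronously, then wait for its result
        async_single = pool.apply_async(square, args=(4,)).get()
        # issue many tasks asynchronously, then wait for the results
        async_many = pool.map_async(square, range(3)).get()
    return single, many, async_single, async_many

# protect the entry point
if __name__ == '__main__':
    print(run_tasks())
```

The asynchronous variants return an AsyncResult object immediately. The sketch calls get() right away only to keep it short; in practice the caller would do other work first.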

One concern with the ThreadPool is when exactly the worker threads used internally within the pool are created.

There are two main options for when child workers could be created: immediately, when the pool itself is created, or on demand, when tasks are issued to the pool.

We may need to know this for many reasons, which leads to the question:

When are worker threads created in the ThreadPool?

When Are ThreadPool Workers Created

Workers in the ThreadPool are started immediately after the pool is created.

For example:

...
# create a pool
pool = multiprocessing.pool.ThreadPool(4)

As soon as the instance is created, it will internally create enough workers to populate the pool, in this case four.

This also applies when using the pool via the context manager interface.

For example:

...
# create a pool with the context manager interface
with multiprocessing.pool.ThreadPool(4) as pool:
	# ...

Again, as soon as the ThreadPool instance is created, it is populated with worker threads.
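One way to see this with the context manager interface is to compare the number of live threads before the pool exists and inside the with block. The sketch below assumes CPython and uses a hypothetical helper name, count_threads_started_by_pool(). No tasks are issued and no sleep is used.

```python
import threading
from multiprocessing.pool import ThreadPool

# hypothetical helper: how many threads appear once the pool is created
def count_threads_started_by_pool(workers=4):
    # number of live threads before the pool exists
    before = threading.active_count()
    # create the pool via the context manager interface
    with ThreadPool(workers) as pool:
        # number of live threads immediately inside the block
        inside = threading.active_count()
    return inside - before

# protect the entry point
if __name__ == '__main__':
    print(count_threads_started_by_pool(4))
```

The difference should be at least the number of workers requested. It may be larger, because the pool can also start helper threads of its own.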

Now that we know when the worker threads are created in the ThreadPool, let's look at a worked example.

Example of When Workers Are Created in the Pool

We can explore when the worker threads in the ThreadPool are created.

In this example, we will create a ThreadPool, wait a moment, then check whether the current process has any running threads other than the main thread.

If it does and the number of active worker threads matches the number of workers configured in the pool, it means that the child workers are created when the pool is created.

Otherwise, it means that worker threads are created on demand.

This is an experimental approach to the question. An alternate approach would be to review the source code of the multiprocessing.pool.Pool class (the base class for the ThreadPool class).
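If you do want to take the source code route, a small sketch like the following may help locate the code to read. It confirms the base class relationship and reports the file in which Pool is defined. The describe_threadpool() helper is a hypothetical name, and the reported path depends on your Python installation (it may be None if the source is not available).

```python
import inspect
from multiprocessing.pool import Pool, ThreadPool

# hypothetical helper: confirm the base class and locate its source file
def describe_threadpool():
    # check that ThreadPool extends Pool
    is_subclass = issubclass(ThreadPool, Pool)
    # path of the file that defines the Pool class, if available
    source_file = inspect.getsourcefile(Pool)
    return is_subclass, source_file

# protect the entry point
if __name__ == '__main__':
    is_subclass, source_file = describe_threadpool()
    print(f'ThreadPool extends Pool: {is_subclass}')
    print(f'Pool is defined in: {source_file}')
```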

Firstly, we can create a ThreadPool instance with four child worker threads. Don't worry if you have more or fewer CPU cores, as we will not be running any tasks.

...
# create a pool
pool = ThreadPool(4)

We will then wait a moment to allow any internal initialization by the pool to be completed.

...
# wait a moment
sleep(0.5)

Next, we can get access to all active threads for the current process and report their details.

This can be achieved via the threading.enumerate() module function.

...
# report all active child threads
for thread in threading.enumerate():
    print(thread)

This could be a list with one item, the main thread running the program. It could be a list with 5 or more threads, one for each worker, the main thread, and any other threads running.

Finally, we can close the pool now that we are finished with it.

...
# close the pool
pool.close()

Tying this together, the complete example is listed below.

# SuperFastPython.com
# when are worker threads started
from time import sleep
from multiprocessing.pool import ThreadPool
import threading

# protect the entry point
if __name__ == '__main__':
    # create a pool
    pool = ThreadPool(4)
    # wait a moment
    sleep(0.5)
    # report all active threads
    for thread in threading.enumerate():
        print(thread)
    # close the pool
    pool.close()

Running the example first creates an instance of the ThreadPool.

The caller then blocks to allow the pool to start up completely. This is probably not necessary.

Next, a list of all running threads is retrieved and traversed, reporting the details of each.

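In this case, we can see that the list of running threads contains many threads, including the main thread, four worker threads (reported as DummyProcess), and three additional helper threads used internally by the pool.

This confirms that worker threads are created as soon as the pool is created, and not on demand.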

Finally, the ThreadPool is closed.

<_MainThread(MainThread, started 4505034240)>
<DummyProcess(Thread-1, started daemon 123145323581440)>
<DummyProcess(Thread-2, started daemon 123145340370944)>
<DummyProcess(Thread-3, started daemon 123145357160448)>
<DummyProcess(Thread-4, started daemon 123145373949952)>
<Thread(Thread-5, started daemon 123145390739456)>
<Thread(Thread-6, started daemon 123145407528960)>
<Thread(Thread-7, started daemon 123145424318464)>
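The output above reports the four workers as DummyProcess objects, which suggests a more direct check than reading the list by eye. The sketch below counts only threads of that type, before and after the pool is created. It assumes the DummyProcess class from multiprocessing.dummy, as seen in the output, and the helper names are hypothetical.

```python
import threading
from multiprocessing.dummy import DummyProcess
from multiprocessing.pool import ThreadPool

# hypothetical helper: list live threads that are pool workers
def worker_threads():
    return [t for t in threading.enumerate() if isinstance(t, DummyProcess)]

# hypothetical helper: count the workers started by creating a pool
def count_workers_after_creation(workers=4):
    # workers that already exist, normally none
    before = len(worker_threads())
    # create the pool without issuing any tasks
    pool = ThreadPool(workers)
    try:
        # workers present immediately after creation
        after = len(worker_threads())
    finally:
        # release the worker threads
        pool.terminate()
    return after - before

# protect the entry point
if __name__ == '__main__':
    print(count_workers_after_creation(4))
```

Filtering by type excludes the main thread and the pool's helper threads, so the count can be compared directly with the number of workers requested.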

Takeaways

You now know when worker threads are created in the ThreadPool.


