Thread-Local With ThreadPoolExecutor in Python

Last Updated on September 12, 2022

You can use thread local data by passing an instance of threading.local to task functions in the ThreadPoolExecutor in Python.

In this tutorial, you will discover how to use thread local data in Python thread pools.

Let’s get started.


Need for Thread Local Data in ThreadPoolExecutor Tasks

The ThreadPoolExecutor in Python provides a pool of reusable threads for executing ad hoc tasks.

You can submit tasks to the thread pool by calling the submit() method and passing in the function you wish to execute on another thread. You can also submit tasks by calling the map() method, specifying the function to execute and the iterable of items to which it will be applied.
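As a minimal sketch of the two ways of submitting tasks, the snippet below uses a hypothetical square() function that is not part of this tutorial and exists only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# hypothetical task function, used only for illustration
def square(x):
    return x * x

with ThreadPoolExecutor(max_workers=2) as executor:
    # submit() schedules a single call and returns a Future
    future = executor.submit(square, 3)
    single = future.result()
    # map() applies the function to each item and yields results in input order
    many = list(executor.map(square, [1, 2, 3]))

print(single)
print(many)
```

Using the executor as a context manager shuts the pool down automatically once the block exits.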

When using a thread pool, we may need to initialize a variable, data, or resource to be used by each worker thread across all tasks executed by that thread.

For example, perhaps each thread is required to have its own handle for logging or connection to a remote server to be held open and reused when executing tasks.

This requires that we initialize data for each worker thread and store it in a way that only each worker thread can access.

We may be tempted to use global variables, but this would allow any thread to access the data rather than keeping the data private to each worker thread.

How can we store data for each worker thread in the ThreadPoolExecutor?


How to Use Thread Local Data With The ThreadPoolExecutor

Worker threads can store private data using thread local data.

Thread local data is a context that allows threads to store data that is thread-specific. That is, other threads cannot access it.

A thread local context can be created by creating an instance of the threading.local class.

For example:

...
# create a local context
local = threading.local()
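To see that the context really is private to each thread, the following sketch starts three plain threads that all write to the same attribute name. The worker() function, the seen dictionary, and the barrier are assumptions added for the demonstration. The barrier forces every thread to write before any thread reads.

```python
import threading

# one thread local context shared by every thread
local = threading.local()
# barrier so that all three threads write before any thread reads
barrier = threading.Barrier(3)
# hypothetical dict used only to collect what each thread read back
seen = {}

def worker(name):
    # every thread writes to the same attribute name
    local.data = name
    # wait until all threads have written their value
    barrier.wait()
    # each thread still reads back only its own value
    seen[name] = local.data

threads = [threading.Thread(target=worker, args=(f'thread-{i}',)) for i in range(3)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(seen)
```

If local were an ordinary shared object, every thread would read back whichever value was written last.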

Next, we need a way for each worker thread to initialize thread-specific data and store it in the thread local context.

This can be achieved by defining a custom worker thread initialization function and passing the thread local context as an argument. Each worker thread in the ThreadPoolExecutor can then initialize any required data and store it in the thread local context.

For example:

# function for initializing the worker threads
def initializer_worker(local):
    # prepare data
    value = 'test'
    # store data in local
    local.data = value

Variables with any names we like can be added to the thread local context; here, local.data is arbitrary.

The value of local.data will be specific to each thread that adds a variable to the thread local context.
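One consequence is that a thread which never assigned the attribute does not see it at all. The sketch below illustrates this with a plain thread and the main thread. The set_data() helper and the variable names are assumptions made for the example.

```python
import threading

local = threading.local()

def set_data():
    # only the thread running this function gets a 'data' attribute
    local.data = 'test'

thread = threading.Thread(target=set_data)
thread.start()
thread.join()

# the main thread never assigned local.data, so access raises AttributeError
try:
    local.data
    main_sees_data = True
except AttributeError:
    main_sees_data = False

# getattr() with a default avoids the exception
fallback = getattr(local, 'data', None)
print(main_sees_data, fallback)
```

This is why the data needs to be initialized in each worker thread rather than once in the main thread.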

We can then configure the ThreadPoolExecutor to call the custom initialization function when each worker thread is created.

This can be achieved via the initializer argument in the constructor that specifies the name of the custom initialization function, and the initargs argument that specifies a tuple of arguments to the function.

...

# configure the thread pool with a custom initialization function

executor = ThreadPoolExecutor(initializer=initializer_worker, initargs=(local,))
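As a rough check that the initializer runs once per worker thread rather than once per task, the sketch below records the name of each thread that runs it. The init_calls list and the trivial task() function are assumptions added for illustration. The pool may start fewer threads than max_workers if tasks finish quickly, so the exact count can vary.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

local = threading.local()
# hypothetical list used only to record which threads ran the initializer
init_calls = []

def initializer_worker(local):
    # store thread-specific data and note the thread that ran the initializer
    local.data = 'test'
    init_calls.append(threading.current_thread().name)

def task(local):
    # read the value stored by the initializer for this worker thread
    return local.data

with ThreadPoolExecutor(max_workers=2, initializer=initializer_worker, initargs=(local,)) as executor:
    futures = [executor.submit(task, local) for _ in range(6)]
    results = [future.result() for future in futures]

# six tasks ran, but the initializer ran at most once per worker thread
print(len(init_calls), results)
```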

Finally, the target task function can take the thread local context as an argument, allowing each worker thread to access the thread local data.

# target task function
def task(local):
    # access thread local data
    data = local.data
    # do things...

Now that we know how to store local data for each worker thread in the ThreadPoolExecutor, let’s look at a worked example.


Example of Using Thread Local Data in the ThreadPoolExecutor

Let’s develop an example of storing local data for each worker thread in the ThreadPoolExecutor.

First, we can define a custom initializer function that takes a thread local context and sets up a custom variable named key with a unique value between 0.0 and 1.0 for each worker thread.

We can also report the value stored by each thread, useful for reference later.

# function for initializing the worker thread
def initializer_worker(local):
    # generate a unique value and store it in a thread local variable
    local.key = random()
    # report the unique worker key
    print(f'Initializing worker thread {local.key}')

Next, we can define our target task function to take the same thread local context, access the thread local variable for the worker thread, and make use of it.

In this case, the task reads the value stored in the thread local context, which is specific to the worker thread executing the task, and sleeps for that many seconds to mimic an IO-bound task. The task then reports the thread-specific value it used, which will match one of the values printed during worker thread initialization.

# a mock task that sleeps for a random amount of time less than one second
def task(local):
    # access the unique key for the worker thread
    mykey = local.key
    # make use of it
    sleep(mykey)
    return f'Worker using {mykey}'

We can then configure our new ThreadPoolExecutor instance to use the initializer with the required local argument.

First, we create the single thread local context shared between the initializer functions and the task functions.

...

# get the local context

local = threading.local()

Next, we can create the thread pool with two worker threads. We pass the custom initializer function defined above via the initializer argument and its single argument, the thread local context, via the initargs argument.

...

# create a thread pool

executor = ThreadPoolExecutor(max_workers=2, initializer=initializer_worker, initargs=(local,))

We can then dispatch ten tasks into the thread pool with the same thread local context.

...

# dispatch tasks

futures = [executor.submit(task, local) for _ in range(10)]
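Because all ten tasks share two worker threads, they should see at most two distinct keys between them. The sketch below checks this under a few assumptions: the sleep is scaled down so the check finishes quickly, the task returns the key itself rather than a message, and the keys and distinct names are introduced only for this check.

```python
from random import random
from time import sleep
import threading
from concurrent.futures import ThreadPoolExecutor

def initializer_worker(local):
    # one random key per worker thread
    local.key = random()

def task(local):
    # sleep scaled down so the sketch finishes quickly
    sleep(local.key * 0.01)
    # return the key seen by the worker thread that ran this task
    return local.key

local = threading.local()
with ThreadPoolExecutor(max_workers=2, initializer=initializer_worker, initargs=(local,)) as executor:
    futures = [executor.submit(task, local) for _ in range(10)]
    keys = [future.result() for future in futures]

# ten results, but no more distinct keys than worker threads
distinct = set(keys)
print(len(keys), len(distinct))
```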

And that’s it.

Tying this together, the complete example of using a ThreadPoolExecutor that makes use of thread local data is listed below.

# SuperFastPython.com
# example of thread local storage for worker threads
from time import sleep
from random import random
import threading
from concurrent.futures import ThreadPoolExecutor

# function for initializing the worker thread
def initializer_worker(local):
    # generate a unique value and store it in a thread local variable
    local.key = random()
    # report the unique worker key
    print(f'Initializing worker thread {local.key}')

# a mock task that sleeps for a random amount of time less than one second
def task(local):
    # access the unique key for the worker thread
    mykey = local.key
    # make use of it
    sleep(mykey)
    return f'Worker using {mykey}'

# get the local context
local = threading.local()
# create a thread pool
executor = ThreadPoolExecutor(max_workers=2, initializer=initializer_worker, initargs=(local,))
# dispatch tasks
futures = [executor.submit(task, local) for _ in range(10)]
# wait for each task to complete and report its result
for future in futures:
    result = future.result()
    print(result)
# shutdown the thread pool
executor.shutdown()
print('done')

Running the example first configures the thread pool to use our custom initializer function, which sets up thread local data with a unique value for each worker thread, in this case two threads each with a value between 0 and 1.

Each worker thread then works through the tasks in the queue, ten in total, with each task using the value of the thread local variable that was set up for its thread in the initialization function.

Initializing worker thread 0.9360961457279074

Initializing worker thread 0.9075641843481475