Last Updated on September 12, 2022
You can use thread local data in the ThreadPoolExecutor in Python by passing an instance of threading.local to the worker initializer function and to the task functions.
In this tutorial, you will discover how to use thread local data in Python thread pools.
Let’s get started.
Need for Thread Local Data in ThreadPoolExecutor Tasks
The ThreadPoolExecutor in Python provides a pool of reusable threads for executing ad hoc tasks.
You can submit tasks to the thread pool by calling the submit() method and passing in the function you wish to execute on another thread. You can also submit tasks by calling the map() method and specifying the function to execute and the iterable of items to which your function will be applied.
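As a minimal sketch of the two submission styles, the snippet below submits one task with submit() and several with map(). The square() function and its inputs are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# hypothetical task function used only for this illustration
def square(x):
    return x * x

with ThreadPoolExecutor(max_workers=2) as executor:
    # submit() returns a Future; result() blocks until the task is done
    future = executor.submit(square, 3)
    single = future.result()
    # map() applies the function to each item and yields results in input order
    mapped = list(executor.map(square, [1, 2, 3]))

print(single)
print(mapped)
```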
When using a thread pool, we may need to initialize a variable, data, or resource to be used by each worker thread across all tasks executed by that thread.
For example, perhaps each thread is required to have its own handle for logging or connection to a remote server to be held open and reused when executing tasks.
This requires that we initialize data for each worker thread and store it in a way that only that worker thread can access.
We may be tempted to use global variables, but this would allow any thread to access the data rather than keeping the data private to each worker thread.
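The problem with a global can be sketched as follows. This illustration uses two plain threads and a barrier instead of a thread pool so that the outcome is deterministic; the names shared, seen, and worker are hypothetical.

```python
import threading

shared = {}                      # ordinary global, visible to every thread
barrier = threading.Barrier(2)   # makes both threads write before either reads
seen = {}

def worker(name):
    # each thread tries to keep "its own" value in the global
    shared['value'] = name
    # wait until both threads have written
    barrier.wait()
    # both threads now read the same slot, so one of them sees the other's value
    seen[name] = shared['value']

threads = [threading.Thread(target=worker, args=(name,)) for name in ('a', 'b')]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(seen)
```

Whichever thread wrote last wins, so both threads end up reading the same value rather than their own.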
How can we store data for each worker thread in the ThreadPoolExecutor?
How to Use Thread Local Data With The ThreadPoolExecutor
Worker threads can store private data using thread local data.
Thread local data is a context that allows threads to store data that is thread-specific. That is, each thread sees only the values it stored itself, not the values stored by other threads.
A thread local context can be created by creating an instance of the threading.local class.
For example:
    ...
    # create a local context
    local = threading.local()
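To make the isolation concrete, here is a small sketch using two plain threads rather than a thread pool. The worker() function, the barrier, and the seen dictionary are illustrative assumptions, added only so the result is deterministic and observable.

```python
import threading

local = threading.local()        # one context object shared by all threads
barrier = threading.Barrier(2)   # makes both threads write before either reads
seen = {}

def worker(name):
    # each thread stores a value under the same attribute name
    local.value = name
    # wait until both threads have stored their value
    barrier.wait()
    # each thread reads back its own value, not the other thread's
    seen[name] = local.value

threads = [threading.Thread(target=worker, args=(name,)) for name in ('a', 'b')]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(seen)
```

Although both threads assign to local.value, each one reads back the value it stored.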
Next, we need a way for each worker thread to initialize thread-specific data and store it in the thread local context.
This can be achieved by defining a custom worker thread initialization function and passing the thread local context as an argument. Each worker thread in the ThreadPoolExecutor can then initialize any required data and store it in the thread local context.
For example:
    # function for initializing the worker threads
    def initializer_worker(local):
        # prepare data
        value = 'test'
        # store data in local
        local.data = value
Variables with any names we like can be added to the thread local context; here, local.data is arbitrary.
The value of local.data will be specific to each thread that adds a variable to the thread local context.
We can then configure the ThreadPoolExecutor to call the custom initialization function when each worker thread is created.
This can be achieved via the initializer argument in the constructor that specifies the name of the custom initialization function, and the initargs argument that specifies a tuple of arguments to the function.
    ...
    # configure the thread pool with a custom initialization function
    executor = ThreadPoolExecutor(initializer=initializer_worker, initargs=(local,))
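One way to check that the initializer runs once per worker thread, rather than once per task, is to record the name of each thread that runs it. The sketch below is an illustration only: the initialized list, the lock, and the choice of three workers are assumptions. Note that the pool creates threads on demand, so it may start fewer than max_workers threads.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

local = threading.local()
initialized = []           # names of threads that ran the initializer
lock = threading.Lock()    # protects the list across threads

def initializer_worker(local):
    # store thread-specific data
    local.data = 'test'
    # record that this worker thread was initialized
    with lock:
        initialized.append(threading.current_thread().name)

def task(local):
    # report which thread ran the task and the data it found
    return threading.current_thread().name, local.data

with ThreadPoolExecutor(max_workers=3, initializer=initializer_worker,
                        initargs=(local,)) as executor:
    futures = [executor.submit(task, local) for _ in range(10)]
    results = [future.result() for future in futures]

print(f'{len(initialized)} initializer calls for {len(results)} tasks')
```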
Finally, the target task function can take the thread local context as an argument, allowing each worker thread to access the thread local data.
    # target task function
    def task(local):
        # access thread local data (the same attribute set in the initializer)
        data = local.data
        # do things...
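The attribute exists only on threads that have set it. As a hedged illustration, the sketch below calls the same task function from a worker thread and then directly from the main thread, which never ran the initializer. The variable names from_worker and main_failed are made up for this example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

local = threading.local()

def initializer_worker(local):
    # store data for this worker thread only
    local.data = 'test'

def task(local):
    # raises AttributeError on a thread that never set local.data
    return local.data

with ThreadPoolExecutor(max_workers=1, initializer=initializer_worker,
                        initargs=(local,)) as executor:
    # runs on a worker thread, where the initializer has set local.data
    from_worker = executor.submit(task, local).result()

# the main thread did not run the initializer, so the attribute is missing here
try:
    task(local)
    main_failed = False
except AttributeError:
    main_failed = True

print(from_worker, main_failed)
```

The same applies to a misspelled attribute name: it fails with an AttributeError inside the task, which surfaces when result() is called on the future.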
Now that we know how to store local data for each worker thread in the ThreadPoolExecutor, let’s look at a worked example.
Example of Using Thread Local Data in the ThreadPoolExecutor
Let’s develop an example of storing local data for each worker thread in the ThreadPoolExecutor.
First, we can define a custom initializer function that takes a thread local context and sets up a custom variable named key with a random value between 0.0 and 1.0 for each worker thread.
We can also report the value stored by each thread, useful for reference later.
    # function for initializing the worker thread
    def initializer_worker(local):
        # generate a random value and store it in the thread local context
        local.key = random()
        # report the key for this worker thread
        print(f'Initializing worker thread {local.key}')
Next, we can define our target task function to take the same thread local context, access the thread local variable for the worker thread, and make use of it.
In this case, the target task will sleep to mimic an IO-bound task, using the value stored in the thread local context as the sleep duration. This is the value specific to the worker thread executing the task. The task will then report the thread-specific value that was used, which will match a value printed during the initialization of the worker thread.
    # a mock task that sleeps for a random amount of time less than one second
    def task(local):
        # access the unique key for the worker thread
        mykey = local.key
        # make use of it
        sleep(mykey)
        return f'Worker using {mykey}'
We can then configure our new ThreadPoolExecutor instance to use the initializer with the required local argument.
First, we create the single thread local context shared between the initializer functions and the task functions.
    ...
    # get the local context
    local = threading.local()
Next, we can create the thread pool with two worker threads. We also specify the custom worker thread initializer function that we defined above and its single argument, which is the thread local context.
    ...
    # create a thread pool
    executor = ThreadPoolExecutor(max_workers=2, initializer=initializer_worker, initargs=(local,))
We can then dispatch ten tasks into the thread pool with the same thread local context.
    ...
    # dispatch tasks
    futures = [executor.submit(task, local) for _ in range(10)]
And that’s it.
Tying this together, the complete example of using a ThreadPoolExecutor that makes use of thread local data is listed below.
    # SuperFastPython.com
    # example of thread local storage for worker threads
    from time import sleep
    from random import random
    import threading
    from concurrent.futures import ThreadPoolExecutor

    # function for initializing the worker thread
    def initializer_worker(local):
        # generate a random value and store it in the thread local context
        local.key = random()
        # report the key for this worker thread
        print(f'Initializing worker thread {local.key}')

    # a mock task that sleeps for a random amount of time less than one second
    def task(local):
        # access the unique key for the worker thread
        mykey = local.key
        # make use of it
        sleep(mykey)
        return f'Worker using {mykey}'

    # get the local context
    local = threading.local()
    # create a thread pool
    executor = ThreadPoolExecutor(max_workers=2, initializer=initializer_worker, initargs=(local,))
    # dispatch tasks
    futures = [executor.submit(task, local) for _ in range(10)]
    # wait for all tasks to complete
    for future in futures:
        result = future.result()
        print(result)
    # shutdown the thread pool
    executor.shutdown()
    print('done')
Running the example first configures the thread pool to use our custom initializer function, which sets up thread local data for each worker thread with its own value. In this case, there are two threads, each with a value between 0 and 1. The specific values will differ each time the program is run.
Each worker thread then works on tasks in the queue, all ten of them, each using the value of the thread local variable set up for that thread in the initialization function.
    Initializing worker thread 0.9360961457279074
    Initializing worker thread 0.9075641843481475
    Worker using 0.9360961457279074
    Worker using 0.9075641843481475
    Worker using 0.9075641843481475
    Worker using 0.9360961457279074
    Worker using 0.9075641843481475
    Worker using 0.9360961457279074
    Worker using 0.9075641843481475
    Worker using 0.9360961457279074
    Worker using 0.9075641843481475
    Worker using 0.9360961457279074
    done
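To confirm that each worker thread consistently sees one key, we could also have each task return the name of the thread that ran it and then group the keys by thread. This is a sketch under assumptions rather than part of the example above: the keys_by_thread dictionary is hypothetical, and the sleep is shortened so the check finishes quickly.

```python
from random import random
from time import sleep
import threading
from concurrent.futures import ThreadPoolExecutor

def initializer_worker(local):
    # store a random key for this worker thread
    local.key = random()

def task(local):
    # shortened sleep (assumption) so the check runs quickly
    sleep(local.key / 10)
    # report which thread ran the task and the key it saw
    return threading.current_thread().name, local.key

local = threading.local()
keys_by_thread = {}
with ThreadPoolExecutor(max_workers=2, initializer=initializer_worker,
                        initargs=(local,)) as executor:
    futures = [executor.submit(task, local) for _ in range(10)]
    for future in futures:
        name, key = future.result()
        # collect every distinct key observed on each thread
        keys_by_thread.setdefault(name, set()).add(key)

for name, keys in keys_by_thread.items():
    print(name, keys)
```

Each thread name should map to a set containing exactly one key.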
Takeaways
You now know how to use thread local data with the ThreadPoolExecutor in Python.
Do you have any questions about how to use thread local data with thread pools?
Ask your question in the comments below and I will do my best to answer.