Last Updated on September 12, 2022
You can make a thread-safe list by using a mutual exclusion (mutex) lock via the threading.Lock class.
In this tutorial you will discover how to develop a thread-safe list in Python.
Let’s get started.
Need a Thread-Safe List
A thread is a thread of execution in a computer program.
Every Python program has at least one thread of execution called the main thread. Both processes and threads are created and managed by the underlying operating system.
Sometimes we may need to create additional threads in our program in order to execute code concurrently.
Python provides the ability to create and manage new threads via the threading module and the threading.Thread class.
You can learn more about Python threads in the guide:
In concurrent programming we may need to share a list data structure between threads.
Multiple threads may need to append data to the same list, other threads may wish to remove items or check the length of the list.
Is the list thread-safe in Python and if not, how can we make it thread-safe?
Most List Operations Are Atomic
Many common operations on a list are atomic, meaning that they are thread-safe.
Recall that a Python list is an instance of the built-in list class.
Atomic means that an operation either occurs completely or does not occur at all, with no in-between inconsistent state.
Operations such as adding, removing, and reading a value on a list are atomic.
In practice, it means that operations on shared variables of built-in data types (ints, lists, dicts, etc) that “look atomic” really are.
— What kinds of global value mutation are thread-safe?
Specifically:
- Appending a value via append().
- Adding a list to a list via extend().
- Assigning multiple values of a list via a slice.
- Getting a value from a list via bracket notation [].
- Removing the last value from a list via pop().
- Sorting a list via sort().
You can learn about atomic operations in Python here:
Because atomic operations either occur or do not occur, it means that they are thread-safe.
Specifically, using any of the above operations on a list shared between multiple threads will not result in a race condition, corruption of the list or corruption of the data within the list.
Next, let’s consider why relying on atomic list operations might be fragile.
Atomic List Operations Are Fragile
Common operations on a list are atomic, and therefore thread-safe, as we saw in the previous section.
This is only true at the time of writing because of a few specific implementation details, such as:
- The precise details of how these list operations are compiled to Python virtual machine bytecode.
- The use of CPython, the reference Python interpreter.
- The use of a Global Interpreter Lock (GIL) within the reference Python interpreter.
This means that relying on the thread-safety of these operations could be fragile in future versions of Python, or when executing your Python program with alternate Python interpreters.
This is not a minor concern.
There are frequent development efforts to improve the Python interpreter and even attempts to remove the GIL. These will likely change the specifics of the Python VM, bytecode compiling and thread-safety of built-in data structures.
It is also becoming more common to run Python code using third-party interpreters, mostly to achieve better performance. Alternate interpreters may or may not implement the same rules for atomic operations on lists.
Therefore, we may desire a thread-safe Python list that is future-proof to changes to Python interpreters and the GIL.
When in doubt, use a mutex!
— What kinds of global value mutation are thread-safe?
Next, let’s look at how we might develop a thread-safe list.
How to Develop a Thread-Safe List
A list can be made thread-safe using a mutual exclusion (mutex) lock.
Specifically, each add, delete, update, and read of the list must be protected by a lock.
This can be achieved by wrapping a list with a new class. The class can then expose the specific operations that we require on the list and protect those operations on the internal list using a lock.
An instance of the threading.Lock can be used to protect the list.
You can learn more about the mutex lock here:
This lock must be acquired prior to an operation on the list and released once the operation on the list has completed.
For example:
```python
...
# acquire the lock
lock.acquire()
# operate on the list
...
# release the lock
lock.release()
```
This ensures that only one thread at a time is able to operate on the list.
The context manager interface on the lock may be used to simplify this operation, allowing the lock to be released automatically.
For example:
```python
...
# acquire the lock
with lock:
    # operate on the list
    ...
```
This adds overhead, making each list operation slower than it otherwise would be without the lock. The benefit is confidence that race conditions on the list have been removed entirely.
If your design is to have some wrapped functions call other wrapped functions, then a reentrant lock is preferred, in order to avoid a deadlock.
This can be achieved via the threading.RLock class. It will allow the lock to be acquired by the same thread again.
You can learn more about the reentrant lock here:
Let’s develop a class that wraps some common list operations in a thread-safe manner.
One approach would be to extend the built-in list class and override all or a subset of its methods with thread-safe versions.
Another approach would be to define a new class with the same interface as the list class, except that the methods on the class are thread-safe.
A final approach would be to define a new class with list-like operations that may not resemble the interface of the list class.
I prefer the latter approach, to make it clear to the user of the class that they are working with something different. It is not a drop-in thread-safe replacement for list; it is a data structure designed for a thread-safe use case that must be treated with care.
First, we can define the class with a clear name like ThreadSafeList.
```python
# custom class wrapping a list in order to make it thread safe
class ThreadSafeList():
    # ...
```
Next, we can define a constructor that initializes both an internal list and a lock used to protect the list as instance variables on the class.
```python
# constructor
def __init__(self):
    # initialize the list
    self._list = list()
    # initialize the lock
    self._lock = Lock()
```
Next, we can start adding functionality that we may need in our application.
In this case, we will provide four operations on the list:
- Adding a value via append().
- Removing and returning a value via pop().
- Accessing a value at an index via get().
- Getting the length of the list via length().
Each operation on the list will be protected by the lock using the context manager interface.
The context manager interface is preferred as an operation on the list may raise an error (e.g. index out of bounds) and we need to ensure that the lock will always be released, regardless of the success or failure of the operation.
For example, the method for the append operation is listed below.
```python
# add a value to the list
def append(self, value):
    # acquire the lock
    with self._lock:
        # append the value
        self._list.append(value)
```
The methods for the pop, get, and length operations are listed below.
```python
# remove and return the last value from the list
def pop(self):
    # acquire the lock
    with self._lock:
        # pop a value from the list
        return self._list.pop()

# read a value from the list at an index
def get(self, index):
    # acquire the lock
    with self._lock:
        # read a value at the index
        return self._list[index]

# return the number of items in the list
def length(self):
    # acquire the lock
    with self._lock:
        return len(self._list)
```
Not a lot of surprises there.
The lock ensures only one thread can perform an operation on the list at a time.
For example, no thread can add, remove or even check the value at an index while another thread is checking the size of the list.
The full listing of the thread-safe list class is listed below.
```python
from threading import Lock

# custom class wrapping a list in order to make it thread safe
class ThreadSafeList():
    # constructor
    def __init__(self):
        # initialize the list
        self._list = list()
        # initialize the lock
        self._lock = Lock()

    # add a value to the list
    def append(self, value):
        # acquire the lock
        with self._lock:
            # append the value
            self._list.append(value)

    # remove and return the last value from the list
    def pop(self):
        # acquire the lock
        with self._lock:
            # pop a value from the list
            return self._list.pop()

    # read a value from the list at an index
    def get(self, index):
        # acquire the lock
        with self._lock:
            # read a value at the index
            return self._list[index]

    # return the number of items in the list
    def length(self):
        # acquire the lock
        with self._lock:
            return len(self._list)
```
As an extension, try adding some additional operations you’d like to use on your list like sort() and extend().
Also try to add some operations that involve composite operations on the list, like index() or reverse().
Next, let’s demonstrate the thread-safety of the list in a worked example.
Example of a Thread-Safe List
We can develop an example of using thread-safe lists from multiple threads.
In this example, we will create ten threads and have each thread attempt to add 100,000 items to the list concurrently. The expected result will be a list with 1,000,000 items.
Again, this can be achieved with a default Python list using the current (v3.10) version of the reference Python interpreter, but it may or may not be future proof to changes in the interpreter. Therefore, we will use our thread-safe list class developed in the previous section.
First, we can define a function to be executed by each worker thread.
The function will take an instance of the ThreadSafeList, then loop 100,000 times, adding a value to the list each iteration.
The add_items() function below implements this.
```python
# add items to the list
def add_items(safe_list):
    for i in range(100000):
        safe_list.append(i)
```
Next, in the main thread, we can define an instance of our thread-safe list.
```python
...
# create the thread safe list
safe_list = ThreadSafeList()
```
We can then create and configure ten threads, each of which will call our add_items() function and be passed the same ThreadSafeList as an argument.
This can be done in a list comprehension, to give a list of threads we can use later.
```python
...
# configure threads to add to the list
threads = [Thread(target=add_items, args=(safe_list,)) for i in range(10)]
```
The list of threads can then be iterated and started.
```python
...
# start threads
for thread in threads:
    thread.start()
```
The main thread can then wait for all new threads to terminate, by iterating the list of threads and joining each in turn. This will block the main thread until all tasks are completed.
```python
...
# wait for all threads to terminate
print('Main waiting for threads...')
for thread in threads:
    thread.join()
```
Finally, we can report the length of the new list, which we expect to be 1,000,000 items.
```python
...
# report the number of items in the list
print(f'List size: {safe_list.length()}')
```
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# example of a thread-safe list
from threading import Thread
from threading import Lock

# custom class wrapping a list in order to make it thread safe
class ThreadSafeList():
    # constructor
    def __init__(self):
        # initialize the list
        self._list = list()
        # initialize the lock
        self._lock = Lock()

    # add a value to the list
    def append(self, value):
        # acquire the lock
        with self._lock:
            # append the value
            self._list.append(value)

    # remove and return the last value from the list
    def pop(self):
        # acquire the lock
        with self._lock:
            # pop a value from the list
            return self._list.pop()

    # read a value from the list at an index
    def get(self, index):
        # acquire the lock
        with self._lock:
            # read a value at the index
            return self._list[index]

    # return the number of items in the list
    def length(self):
        # acquire the lock
        with self._lock:
            return len(self._list)

# add items to the list
def add_items(safe_list):
    for i in range(100000):
        safe_list.append(i)

# create the thread safe list
safe_list = ThreadSafeList()
# configure threads to add to the list
threads = [Thread(target=add_items, args=(safe_list,)) for i in range(10)]
# start threads
for thread in threads:
    thread.start()
# wait for all threads to terminate
print('Main waiting for threads...')
for thread in threads:
    thread.join()
# report the number of items in the list
print(f'List size: {safe_list.length()}')
```
Running the example first creates an instance of our thread-safe list, which internally defines a new list instance and a lock to protect the list.
We then configure and start ten threads, each of which aggressively attempts to add 100,000 items to the list.
Each call to append() on the thread-safe list is protected by a lock, ensuring that only one thread can add one item to the list at a time.
A helpful improvement to the ThreadSafeList class might be to add a function so that a thread can get the lock once and add many items to the list in batch.
The main thread blocks until all ten of the new threads terminate.
Finally, once all threads finish adding their items, the main thread unblocks and reports the length of the list.
In this case, we can see that it contains the expected 1,000,000 items. This will be the case every time the code is run, regardless of future changes to the reference interpreter or the specific interpreter on which the code is executed.
```
Main waiting for threads...
List size: 1000000
```
Next, let’s look at an alternate approach to developing a thread-safe list.
Alternate to a Thread-Safe List
Developing a new thread-safe list class has risks.
The more code you write, the more likely you are to introduce bugs, especially code that involves concurrency.
An alternative to developing a thread-safe list is to use a queue data structure.
Python provides thread-safe queue data structures in the queue module, such as the queue.Queue and queue.SimpleQueue classes.
These classes, such as the queue.Queue class, are designed specifically to be thread-safe in a concurrent programming environment.
For example, we can create a new instance of the queue.Queue class.
```python
...
# create a thread-safe queue
queue = queue.Queue()
```
We can add items to the queue in a thread-safe manner using the put() function.
For example:
```python
...
# add an item to the queue in a thread-safe manner
queue.put(item)
```
We can remove items from the queue in a thread-safe manner using the get() function.
For example:
```python
...
# get the next item from the queue in a thread-safe manner
item = queue.get()
```
These queue classes are designed to be fast, abstracting away how the structure is kept thread-safe while doing so in the fastest and most appropriate way for the current version of Python.
We can see this if we review the source code for the queue module and classes.
As such, the queue.Queue class stores its items in an internal collections.deque, not a plain list.
Composite operations on that internal structure, such as checking the size of the queue via qsize(), checking whether the queue is full via full(), or empty via empty(), are protected by an internal mutex lock.
Additionally, the queue provides a number of wait/notify capabilities via internal threading.Condition objects, allowing threads to wait to get items from the queue until new items arrive, or to wait to put items into the queue until it is no longer full. These conditions share the same mutex, so operations that acquire a condition first, such as put() and get(), are protected by that same lock.
The simpler queue.SimpleQueue performs its put() without an explicit lock, relying instead on the atomicity of the underlying operations.
I would recommend using a queue class instead of developing a thread-safe list class, if it provides all or most of the functionality you require in your application.
Example of Using a Thread-Safe Queue
We can update our thread-safe list example to use a queue.Queue instead of our ThreadSafeList class.
First, we must update our add_items() function to call put() to add items on the queue.
The updated version of the function is listed below.
```python
# add items to the list
def add_items(safe_list):
    for i in range(100000):
        safe_list.put(i)
```
We can then define our safe_list variable to be an instance of queue.Queue.
```python
...
# create the thread safe list (queue)
safe_list = Queue()
```
Finally, when checking the size of the structure at the end of the program we can call the qsize() function on the queue.
```python
...
# report the number of items in the list
print(f'List size: {safe_list.qsize()}')
```
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# example of a thread-safe list via a queue
from threading import Thread
from queue import Queue

# add items to the list
def add_items(safe_list):
    for i in range(100000):
        safe_list.put(i)

# create the thread safe list (queue)
safe_list = Queue()
# configure threads to add to the list
threads = [Thread(target=add_items, args=(safe_list,)) for i in range(10)]
# start threads
for thread in threads:
    thread.start()
# wait for all threads to terminate
print('Main waiting for threads...')
for thread in threads:
    thread.join()
# report the number of items in the list
print(f'List size: {safe_list.qsize()}')
```
Running the example first creates an instance of the queue.
The ten threads are then configured and started. Each thread adds 100,000 items to the queue concurrently as fast as they can.
The main thread blocks until the worker threads terminate.
Finally, the main thread unblocks and reports the size of the queue.
In this case, the size is reported as 1,000,000 items, as expected.
Further, you may find that this example executes faster, as the unbounded queue.Queue never has to block waiting for free capacity when adding items, and adding items is the majority of the work performed by the threads in this example.
```
Main waiting for threads...
List size: 1000000
```
Further Reading
This section provides additional resources that you may find helpful.
Python Threading Books
- Python Threading Jump-Start, Jason Brownlee (my book!)
- Threading API Interview Questions
- Threading Module API Cheat Sheet
I also recommend specific chapters in the following books:
- Python Cookbook, David Beazley and Brian Jones, 2013.
- See: Chapter 12: Concurrency
- Effective Python, Brett Slatkin, 2019.
- See: Chapter 7: Concurrency and Parallelism
- Python in a Nutshell, Alex Martelli, et al., 2017.
- See: Chapter: 14: Threads and Processes
Guides
- Python Threading: The Complete Guide
- Python ThreadPoolExecutor: The Complete Guide
- Python ThreadPool: The Complete Guide
Takeaways
You now know how to use a thread-safe list in Python.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.