Python uses a Global Interpreter Lock, or GIL, which makes the interpreter thread-safe at the cost of allowing only one thread to execute at a time, in most circumstances.
In this tutorial, you will discover the relationship between the ThreadPool and the Global Interpreter Lock in Python.
Let’s get started.
ThreadPool Affected By GIL?
The multiprocessing.pool.ThreadPool in Python provides a pool of reusable threads for executing ad hoc tasks.
A thread pool object which controls a pool of worker threads to which jobs can be submitted.— multiprocessing — Process-based parallelism
The ThreadPool class extends the Pool class. The Pool class provides a pool of worker processes for process-based concurrency.
Although the ThreadPool class is in the multiprocessing module it offers thread-based concurrency and is best suited to IO-bound tasks, such as reading or writing from sockets or files.
A ThreadPool can be configured when it is created, which will prepare the new threads.
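As a minimal sketch of configuring the pool at creation time, the example below asks for four worker threads and records which thread ran each task. The function name report_worker and the choice of four workers and twenty tasks are illustrative assumptions, not part of the ThreadPool API.

```python
import threading
from multiprocessing.pool import ThreadPool

def report_worker(task_id):
    # Return the name of the worker thread that ran this task.
    return threading.current_thread().name

# Create a pool with four worker threads. The context manager
# shuts the pool down when the block exits.
with ThreadPool(processes=4) as pool:
    names = pool.map(report_worker, range(20))

# At most four distinct threads can appear, however many tasks are issued.
distinct_workers = set(names)
print(f"{len(distinct_workers)} distinct worker threads handled {len(names)} tasks")
```

The same worker threads are reused across tasks, which is why the number of distinct names never exceeds the configured pool size.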
We can issue one-off tasks to the ThreadPool using methods such as apply() or we can apply the same function to an iterable of items using methods such as map().
Results for issued tasks can then be retrieved synchronously, or we can retrieve the result of tasks later by using asynchronous versions of the methods such as apply_async() and map_async().
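The four methods mentioned above can be sketched side by side. The task function square is a hypothetical stand-in for real work; the pool size of four is an arbitrary choice.

```python
from multiprocessing.pool import ThreadPool

def square(x):
    # Hypothetical task used only for illustration.
    return x * x

with ThreadPool(4) as pool:
    # Synchronous: each call blocks until the result is ready.
    one = pool.apply(square, (3,))
    many = pool.map(square, range(5))

    # Asynchronous: each call returns an AsyncResult immediately.
    async_one = pool.apply_async(square, (4,))
    async_many = pool.map_async(square, range(5))

    # Retrieve the results later; get() blocks until they are available.
    later_one = async_one.get()
    later_many = async_many.get()

print(one, many, later_one, later_many)
```

The asynchronous versions are useful when the caller has other work to do while the tasks run, since the results can be collected at a later point with get().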
One concern with the ThreadPool is whether it is affected by the Global Interpreter Lock.
If the workers in the ThreadPool are affected by the GIL, it limits the types of tasks that they can execute concurrently to those that release the GIL, such as blocking I/O.
Otherwise, if the ThreadPool workers are not affected by the GIL, then the workers can execute arbitrary tasks using full parallelism.
Is the ThreadPool subject to the GIL?
What is the Global Interpreter Lock?
The internals of the Python interpreter are not thread-safe.
This means that there can be race conditions between multiple threads within a single Python process, potentially resulting in unexpected behavior and corrupt data.
As such, the Python interpreter makes use of a Global Interpreter Lock, or GIL for short, to make instructions executed by the Python interpreter (called Python bytecodes) thread-safe.
The GIL is a programming pattern in the reference Python interpreter called CPython, although similar locks exist in other interpreted languages, such as Ruby. It is a lock in the sense that it uses a synchronization primitive called a mutual exclusion or mutex lock to ensure that only one thread of execution can execute instructions at a time within a Python process.
In CPython, the global interpreter lock, or GIL, is a mutex that protects access to Python objects, preventing multiple threads from executing Python bytecodes at once. The GIL prevents race conditions and ensures thread safety.— Global Interpreter Lock, Python Wiki.
The effect of the GIL is that whenever a thread within a Python program wants to run, it must acquire the lock before executing. This is not a problem for most Python programs that have a single thread of execution, called the main thread.
It can become a problem in multi-threaded Python programs, such as programs that make use of the threading.Thread class or the concurrent.futures.ThreadPoolExecutor class.
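The lock is released and re-acquired periodically by each Python thread. In older versions of CPython this happened after approximately every 100 bytecode instructions. Since Python 3.2, a running thread is instead asked to release the lock after a configurable time interval (5 milliseconds by default) when another thread is waiting for it. This allows other threads within the Python process to run, if present.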
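In CPython 3.2 and later, the periodic release is governed by a time-based switch interval, which can be inspected with sys.getswitchinterval(). The snippet below is a small sketch that only reads the value; the variable name interval is an arbitrary choice.

```python
import sys

# Read the interpreter's thread switch interval, in seconds.
# This is how long a thread may hold the GIL before it is asked
# to release it when other threads are waiting.
interval = sys.getswitchinterval()
print(f"switch interval: {interval} seconds")
```

The value can be changed with sys.setswitchinterval(), although tuning it is rarely needed in practice.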
The lock is also released in some circumstances, allowing other threads to run.
An important example is when a thread performs an I/O operation, such as reading or writing from an external resource like a file, socket, or device.
Luckily, many potentially blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL. Therefore it is only in multithreaded programs that spend a lot of time inside the GIL, interpreting CPython bytecode, that the GIL becomes a bottleneck.— Global Interpreter Lock, Python Wiki.
The lock is also explicitly released by some third-party Python libraries when performing computationally expensive operations in C-code, such as many array operations in NumPy.
In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation).— threading — Thread-based parallelism
The GIL is a simple and effective solution to thread safety in the Python interpreter, but it has the major downside that threads within a single CPython process cannot execute Python bytecode in parallel.
An alternative solution might be to explicitly make the interpreter thread-safe by protecting each critical section. This has been tried a number of times and typically makes single-threaded Python programs at least 30% slower.
Unfortunately, both experiments exhibited a sharp drop in single-thread performance (at least 30% slower), due to the amount of fine-grained locking necessary to compensate for the removal of the GIL.— Python Global Interpreter Lock, Python Wiki.
Now that we are familiar with the GIL, let’s look at how ThreadPool is impacted.
ThreadPool and the Global Interpreter Lock
The presence of the GIL in Python impacts the ThreadPool.
The ThreadPool maintains a fixed-sized pool of worker threads that supports concurrent tasks, but the presence of the GIL means that most tasks will not run in parallel.
You may recall that concurrency is a general term that suggests order independence between tasks, e.g. they can be completed at any time or at the same time. Parallel might be considered a subset of concurrency and explicitly suggests that tasks are executed simultaneously.
The GIL means that worker threads cannot run in parallel, in most cases.
Specifically, this applies when the target task functions are CPU-bound. These are tasks that are limited by the speed of the CPU in the system, such as working with data in memory or calculating something.
Nevertheless, worker threads can run in parallel in some special circumstances, one of which is when an IO task is being performed.
These are tasks that involve reading or writing from an external resource.
- Reading or writing a file from the hard drive.
- Reading or writing to standard output, input, or error (stdin, stdout, stderr).
- Printing a document.
- Downloading or uploading a file.
- Querying a server.
- Querying a database.
- Taking a photo or recording a video.
- And so much more.
When a Python thread executes a blocking IO task, it will release the GIL and allow another Python thread to execute.
This still means that only one Python thread can execute Python bytecodes at any one time. But it also means that we will achieve seemingly parallel execution of tasks if tasks perform blocking IO operations.
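The overlap of blocking tasks can be sketched with time.sleep(), which releases the GIL while waiting and is used here as a stand-in for blocking IO. The function name blocking_task, the 0.25 second delay, and the pool size are assumptions chosen for illustration.

```python
import time
from multiprocessing.pool import ThreadPool

def blocking_task(i):
    # Stand-in for blocking IO: the thread releases the GIL while it sleeps.
    time.sleep(0.25)
    return i

start = time.perf_counter()
with ThreadPool(4) as pool:
    # Four tasks, four workers: the sleeps overlap instead of adding up.
    results = pool.map(blocking_task, range(4))
elapsed = time.perf_counter() - start

print(f"results={results}, elapsed={elapsed:.2f} seconds")
```

Run one after another, the four tasks would take about one second in total. With four worker threads the elapsed time should be close to the duration of a single task, because each thread gives up the GIL while it waits.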
This suggests that the ThreadPool should be limited to those tasks that release the GIL.
It also suggests that if tasks that do not release the GIL are executed by worker threads, such as CPU-bound tasks, we may expect worse performance than running them sequentially, because each thread must acquire the GIL before executing and the interpreter periodically switches between threads.
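One way to check this on your own machine is to time the same CPU-bound work run sequentially and run in a ThreadPool. The sketch below assumes a standard CPython build with the GIL enabled; the function cpu_task and the workload sizes are hypothetical, and the timings will vary from system to system, so no expected numbers are given.

```python
import time
from multiprocessing.pool import ThreadPool

def cpu_task(n):
    # Pure-Python arithmetic: the thread holds the GIL while this runs.
    total = 0
    for i in range(n):
        total += i * i
    return total

work = [200_000] * 8

# Run the tasks one after another in the main thread.
start = time.perf_counter()
serial = [cpu_task(n) for n in work]
serial_time = time.perf_counter() - start

# Run the same tasks in a pool of four worker threads.
start = time.perf_counter()
with ThreadPool(4) as pool:
    threaded = pool.map(cpu_task, work)
threaded_time = time.perf_counter() - start

print(f"serial:   {serial_time:.3f} seconds")
print(f"threaded: {threaded_time:.3f} seconds")
```

Both versions compute identical results. If the threaded time is no better than the serial time, that is consistent with the worker threads taking turns holding the GIL rather than running in parallel.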
You now know how the ThreadPool relates to the Global Interpreter Lock in Python.