You can achieve full parallelism in Python with the multiprocessing pool, side-stepping the GIL.
In this tutorial you will discover the relationship between the multiprocessing pool and the Global Interpreter Lock in Python.
Let’s get started.
Multiprocessing Pool Affected By GIL?
The multiprocessing pool provides a pool of reusable workers for executing ad hoc tasks with process-based concurrency.
An instance of the multiprocessing.Pool class can be created with a specified number of workers; otherwise, a default number of workers is created to match the number of logical CPUs in the system.
Once created, ad hoc tasks can be issued to the pool for execution via the Pool.apply() method. Multiple tasks may be executed in the pool by calling the same function with different arguments via the Pool.map() method.
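As a minimal sketch of creating a pool and issuing tasks to it, the example below uses a hypothetical task function named square() and an arbitrary worker count of 4. Neither comes from the API itself; they are only for illustration.

```python
from multiprocessing import Pool


def square(value):
    """Hypothetical task: return the square of a number."""
    return value * value


def main():
    # Pool() with no argument sizes the pool from the CPU count;
    # processes=4 fixes the number of workers explicitly.
    with Pool(processes=4) as pool:
        # apply() issues one task and blocks until its result is ready.
        single = pool.apply(square, args=(3,))
        # map() calls square() once per item and returns results in order.
        many = pool.map(square, range(5))
    print(single)
    print(many)


if __name__ == "__main__":
    main()
```

The `if __name__ == "__main__"` guard matters on platforms that start workers by importing the main module, since it stops each worker from trying to create its own pool.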
Tasks may be executed asynchronously in the pool via Pool.apply_async() and Pool.map_async(), allowing the caller to carry on with other tasks.
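The asynchronous versions can be sketched in the same way. The cube() task below is a hypothetical stand-in; the point is that both calls return an AsyncResult immediately and the caller collects the values later with get().

```python
from multiprocessing import Pool


def cube(value):
    """Hypothetical task: return the cube of a number."""
    return value ** 3


def main():
    with Pool(processes=2) as pool:
        # apply_async() returns an AsyncResult without waiting for the task.
        single = pool.apply_async(cube, args=(2,))
        # map_async() does the same for a whole iterable of arguments.
        many = pool.map_async(cube, [1, 2, 3])
        # The caller could carry on with other work here.
        # get() blocks until the result is available; it is called inside
        # the with-block because leaving the block terminates the pool.
        print(single.get())
        print(many.get())


if __name__ == "__main__":
    main()
```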
One concern with the multiprocessing pool is whether it is affected by the Global Interpreter Lock.
If the workers in the multiprocessing pool are affected by the GIL, it limits the types of tasks that they can execute in parallel to those that release the GIL, such as blocking I/O.
Otherwise, if the multiprocessing pool workers are not affected by the GIL, then the workers can execute arbitrary tasks using true parallelism.
Is the multiprocessing pool subject to the GIL?
What is the Global Interpreter Lock?
The internals of the Python interpreter are not thread-safe.
This means that there can be race conditions between multiple threads within a single Python process, potentially resulting in unexpected behavior and corrupt data.
As such, the Python interpreter makes use of a Global Interpreter Lock, or GIL for short, to make instructions executed by the Python interpreter (called Python bytecodes) thread-safe.
The GIL is a programming pattern in the reference Python interpreter called CPython, although similar locks exist in other interpreted languages, such as Ruby. It is a lock in the sense that it uses a synchronization primitive called a mutual exclusion or mutex lock to ensure that only one thread of execution can execute instructions at a time within a Python process.
In CPython, the global interpreter lock, or GIL, is a mutex that protects access to Python objects, preventing multiple threads from executing Python bytecodes at once. The GIL prevents race conditions and ensures thread safety.— Global Interpreter Lock, Python Wiki.
The effect of the GIL is that whenever a thread within a Python program wants to run, it must acquire the lock before executing. This is not a problem for most Python programs that have a single thread of execution, called the main thread.
It can become a problem in multi-threaded Python programs, such as programs that make use of the threading.Thread class or the concurrent.futures.ThreadPoolExecutor class.
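The lock is released and re-acquired periodically by each running Python thread. In Python 3.2 and later, a thread is asked to release the lock once it has held it for a switch interval (5 milliseconds by default) while another thread is waiting; earlier versions released it approximately every 100 bytecode instructions. This allows other threads within the Python process to run, if present.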
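In CPython 3.2 and later the periodic release is time-based, and the interval can be inspected with sys.getswitchinterval(). A minimal sketch:

```python
import sys

# The switch interval is how long a thread may hold the GIL before it is
# asked to release it for a waiting thread (CPython 3.2 and later).
interval = sys.getswitchinterval()
print(f"switch interval: {interval} seconds")
```

The value can be changed with sys.setswitchinterval(), although this only affects how often threads take turns; it does not let them run Python bytecode in parallel.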
The lock is also released in some circumstances, allowing other threads to run.
An important example is when a thread performs an I/O operation, such as reading or writing from an external resource like a file, socket, or device.
Luckily, many potentially blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL. Therefore it is only in multithreaded programs that spend a lot of time inside the GIL, interpreting CPython bytecode, that the GIL becomes a bottleneck.— Global Interpreter Lock, Python Wiki.
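The effect of releasing the lock around blocking calls can be sketched with threads that only wait. In the example below, time.sleep() is an assumed stand-in for blocking I/O (it also releases the GIL while waiting), and blocking_task() and run_blocking_tasks() are hypothetical helper names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_task(seconds):
    # time.sleep() stands in for blocking I/O; the GIL is released while waiting.
    time.sleep(seconds)
    return seconds


def run_blocking_tasks(n_tasks=4, seconds=0.3):
    """Run n_tasks blocking tasks in threads and report the elapsed time."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_tasks) as executor:
        results = list(executor.map(blocking_task, [seconds] * n_tasks))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run_blocking_tasks()
    print(f"{len(results)} tasks finished in {elapsed:.2f} seconds")
```

Because the waits overlap, the total time should be close to that of a single task rather than the sum of all four.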
The lock is also explicitly released by some third-party Python libraries when performing computationally expensive operations in C-code, such as many array operations in NumPy.
In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation).— threading — Thread-based parallelism
The GIL is a simple and effective solution to thread safety in the Python interpreter, but it has the major downside that threads within a CPython process cannot execute Python code in parallel.
An alternative solution might be to explicitly make the interpreter thread-safe by protecting each critical section. This has been tried a number of times and typically makes single-threaded Python programs slower by 30% or more.
Unfortunately, both experiments exhibited a sharp drop in single-thread performance (at least 30% slower), due to the amount of fine-grained locking necessary to compensate for the removal of the GIL.— Python Global Interpreter Lock, Python Wiki.
Now that we are familiar with the GIL, let’s look at how multiprocessing is impacted.
Multiprocessing Pool and the Global Interpreter Lock
The multiprocessing module that provides process-based concurrency is not limited by the Global Interpreter Lock.
Both threads and processes can execute concurrently (out of order), but only Python processes are able to execute in parallel (simultaneously), not Python threads (with some caveats).
This means that if we want our Python code to run on all CPU cores and make the best use of our system hardware, we should use process-based concurrency.
The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads. Due to this, the multiprocessing module allows the programmer to fully leverage multiple processors on a given machine. It runs on both Unix and Windows.— multiprocessing — Process-based parallelism
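One way to see that pool workers are separate processes, each running its own interpreter with its own GIL, is to have tasks report their process ID. This is a rough sketch; worker_pid() is a hypothetical task name and the worker count of 2 is arbitrary.

```python
import os
from multiprocessing import Pool


def worker_pid(_):
    """Hypothetical task: report the ID of the process running it."""
    return os.getpid()


def main():
    parent = os.getpid()
    with Pool(processes=2) as pool:
        # Collect the distinct process IDs that handled the tasks.
        pids = set(pool.map(worker_pid, range(8)))
    print(f"parent process: {parent}")
    print(f"worker processes: {sorted(pids)}")
    print(f"parent among workers: {parent in pids}")


if __name__ == "__main__":
    main()
```

The tasks run in one or more child processes rather than in the parent, so the parent's ID should not appear among the worker IDs.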
In fact, Jesse Noller and Richard Oudkerk proposed and developed the multiprocessing module (originally called “pyprocessing“) in Python specifically to overcome the limitations and side-step the GIL.
The pyprocessing package offers a method to side-step the GIL allowing applications within CPython to take advantage of multi-core architectures without asking users to completely change their programming paradigm (i.e.: dropping threaded programming for another “concurrent” approach – Twisted, Actors, etc).— PEP 371 – Addition of the multiprocessing package to the standard library
This includes the multiprocessing.Pool class that was introduced with the multiprocessing module in Python 2.6 and 3.0.
The multiprocessing pool is not limited by the Global Interpreter Lock and can achieve true parallelism in Python.
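A rough way to observe the difference is to time the same CPU-bound task in a thread pool and in a process pool. The sketch below assumes a standard GIL-enabled CPython build on a machine with several cores; count_down() and timed() are hypothetical helpers, and the task size is arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from multiprocessing import Pool


def count_down(n):
    """CPU-bound task in pure Python, so a thread running it holds the GIL."""
    while n > 0:
        n -= 1
    return n


def timed(label, run):
    # Call run() and print how long it took.
    start = time.perf_counter()
    run()
    print(f"{label}: {time.perf_counter() - start:.2f} seconds")


def main():
    work = [2_000_000] * 4
    # Threads take turns holding the GIL, so the tasks do not run in parallel.
    with ThreadPoolExecutor(max_workers=4) as executor:
        timed("threads", lambda: list(executor.map(count_down, work)))
    # Each worker process has its own interpreter and its own GIL.
    with Pool(processes=4) as pool:
        timed("processes", lambda: pool.map(count_down, work))


if __name__ == "__main__":
    main()
```

Under those assumptions the process version is expected to finish sooner, although the actual timings depend on the machine and on the overhead of starting the worker processes.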
This section provides additional resources that you may find helpful.
- multiprocessing — Process-based parallelism
- Multiprocessing Pool: The Complete Guide
- Pool Class API Cheat Sheet
- Multiprocessing API Interview Questions
- Multiprocessing Pool Jump-Start (my 7-day course)
You now know how the multiprocessing pool relates to the Global Interpreter Lock in Python.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.