You can achieve full parallelism in Python with the multiprocessing module, side-stepping the GIL.
In this tutorial you will discover the relationship between the multiprocessing module for process-based concurrency and the Global Interpreter Lock in Python.
Let’s get started.
Is Multiprocessing Limited By The GIL?
The “multiprocessing” module provides process-based concurrency in Python.
A process is a running instance of a computer program.
Every Python program is a process and has one default thread, called the main thread, that executes your program instructions. Each process is, in fact, one instance of the Python interpreter that executes Python instructions (Python bytecode), which is a slightly lower level than the code you type into your Python program.
Central to the multiprocessing module is the multiprocessing.Process class that provides a Python handle on a native process (managed by the underlying operating system).
Sometimes we may need to create new child processes in our program in order to execute code concurrently.
Python provides the ability to create and manage new processes via the multiprocessing.Process class.
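As a minimal sketch of what creating a child process looks like, the example below starts one process and waits for it. The function name report and its label argument are made up for illustration; only multiprocessing.Process, start(), join(), and exitcode come from the standard library.

```python
import multiprocessing
import os


def report(label):
    # Builds and prints a message naming the process that ran this call.
    message = f"{label}: running in process {os.getpid()}"
    print(message)
    return message


if __name__ == "__main__":
    # The guard matters on platforms that start children by re-importing this file.
    child = multiprocessing.Process(target=report, args=("child",))
    child.start()  # ask the operating system to start the new process
    child.join()   # block until the child has finished
    report("parent")
    print(f"child exit code: {child.exitcode}")
```

The two printed process ids should differ, because the child is a separate operating system process with its own Python interpreter.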
One concern with the multiprocessing module, and with creating child processes via the multiprocessing.Process class, is whether they are affected by the Global Interpreter Lock (GIL).
If the child processes are affected by the GIL, it limits the types of tasks that they can execute in parallel to those that release the GIL, such as blocking I/O.
Otherwise, if child processes are not affected by the GIL, then the workers can execute arbitrary tasks using full parallelism.
Are the multiprocessing module and the multiprocessing.Process class subject to the GIL?
What is the Global Interpreter Lock?
The internals of the Python interpreter are not thread-safe.
This means that there can be race conditions between multiple threads within a single Python process, potentially resulting in unexpected behavior and corrupt data.
As such, the Python interpreter makes use of a Global Interpreter Lock, or GIL for short, to make instructions executed by the Python interpreter (called Python bytecodes) thread-safe.
The GIL is a programming pattern in the reference Python interpreter called CPython, although similar locks exist in other interpreted languages, such as Ruby. It is a lock in the sense that it uses a synchronization primitive called a mutual exclusion or mutex lock to ensure that only one thread of execution can execute instructions at a time within a Python process.
“In CPython, the global interpreter lock, or GIL, is a mutex that protects access to Python objects, preventing multiple threads from executing Python bytecodes at once. The GIL prevents race conditions and ensures thread safety.” (Global Interpreter Lock, Python Wiki)
The effect of the GIL is that whenever a thread within a Python program wants to run, it must acquire the lock before executing. This is not a problem for most Python programs that have a single thread of execution, called the main thread.
It can become a problem in multi-threaded Python programs, such as programs that make use of the threading.Thread class or the concurrent.futures.ThreadPoolExecutor class.
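The lock is explicitly released and re-acquired periodically by each Python thread. Older versions of CPython did this after approximately every 100 bytecode instructions. Since Python 3.2, a running thread is instead asked to release the lock after a fixed time interval (5 milliseconds by default) when other threads are waiting. This allows other threads within the Python process to run, if present.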
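In CPython 3.2 and later, this periodic release is governed by a time-based switch interval. As a small sketch, assuming a standard CPython build, the current value can be read with sys.getswitchinterval():

```python
import sys

# Seconds a thread may keep running before it is asked to give up the GIL
# when other threads are waiting (CPython 3.2 and later).
interval = sys.getswitchinterval()
print(f"switch interval: {interval} seconds")
```

The value can be changed with sys.setswitchinterval(), although tuning it does not make threads execute Python bytecode in parallel.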
The lock is also released in some circumstances, allowing other threads to run.
An important example is when a thread performs an I/O operation, such as reading or writing from an external resource like a file, socket, or device.
“Luckily, many potentially blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL. Therefore it is only in multithreaded programs that spend a lot of time inside the GIL, interpreting CPython bytecode, that the GIL becomes a bottleneck.” (Global Interpreter Lock, Python Wiki)
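The effect of releasing the lock during blocking calls can be sketched with threads that only wait. Here time.sleep() stands in for blocking I/O, which is an assumption of this example, and the helper names blocking_task and run_in_threads are made up for illustration.

```python
import threading
import time


def blocking_task(seconds):
    # time.sleep() stands in for blocking I/O; the GIL is released while waiting.
    time.sleep(seconds)


def run_in_threads(n_threads, seconds):
    # Start all threads, wait for them, and return the elapsed wall-clock time.
    threads = [
        threading.Thread(target=blocking_task, args=(seconds,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


elapsed = run_in_threads(4, 0.2)
print(f"4 threads sleeping 0.2s each took {elapsed:.2f}s")
```

Because the waits overlap, the total time should be close to one sleep rather than the sum of all four.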
The lock is also explicitly released by some third-party Python libraries when performing computationally expensive operations in C-code, such as many array operations in NumPy.
“In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation).” (threading — Thread-based parallelism)
The GIL is a simple and effective solution to thread safety in the Python interpreter, but it has the major downside that threads within a single Python process cannot execute Python bytecode in parallel.
An alternative solution might be to explicitly make the interpreter thread-safe by protecting each critical section. This has been tried before, and it typically makes single-threaded Python programs slower by 30% or more.
“Unfortunately, both experiments exhibited a sharp drop in single-thread performance (at least 30% slower), due to the amount of fine-grained locking necessary to compensate for the removal of the GIL.” (Python Global Interpreter Lock, Python Wiki)
Now that we are familiar with the GIL, let’s look at how multiprocessing is impacted.
Multiprocessing and the Global Interpreter Lock
The multiprocessing module that provides process-based concurrency is not limited by the Global Interpreter Lock.
Both threads and processes can execute concurrently (out of order), but only Python processes are able to execute in parallel (simultaneously), not Python threads (with some caveats).
This means that if we want our Python code to run on all CPU cores and make the best use of our system hardware, we should use process-based concurrency.
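A minimal sketch of process-based concurrency for CPU-bound work is shown below, using multiprocessing.Pool. The function sum_of_squares and the task sizes are made up for illustration, and any speed-up depends on the number of cores available and the cost of starting processes.

```python
import multiprocessing
import os


def sum_of_squares(n):
    # Pure-Python, CPU-bound work: a thread running this holds the GIL.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    sizes = [200_000] * 8
    # With no argument, the pool chooses its worker count from the CPUs available.
    with multiprocessing.Pool() as pool:
        results = pool.map(sum_of_squares, sizes)
    print(f"completed {len(results)} tasks on a machine with {os.cpu_count()} CPUs")
```

Each worker is a separate process with its own interpreter and its own GIL, so the tasks can run at the same time on different cores.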
“The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads. Due to this, the multiprocessing module allows the programmer to fully leverage multiple processors on a given machine. It runs on both Unix and Windows.” (multiprocessing — Process-based parallelism)
In fact, Jesse Noller and Richard Oudkerk proposed and developed the multiprocessing module (originally called “pyprocessing”) in Python specifically to overcome the limitations and side-step the GIL.
“The pyprocessing package offers a method to side-step the GIL allowing applications within CPython to take advantage of multi-core architectures without asking users to completely change their programming paradigm (i.e.: dropping threaded programming for another ‘concurrent’ approach – Twisted, Actors, etc).” (PEP 371 – Addition of the multiprocessing package to the standard library)
The multiprocessing module is not limited by the Global Interpreter Lock and can achieve full parallelism in Python.
This section provides additional resources that you may find helpful.
- multiprocessing — Process-based parallelism
- Multiprocessing: The Complete Guide
- Multiprocessing Module API Cheat Sheet
- Multiprocessing API Interview Questions
- Multiprocessing Jump-Start (my 7-day course)
You now know how the multiprocessing module relates to the Global Interpreter Lock in Python.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.