Last Updated on September 12, 2022
You may encounter one of several common errors when using the multiprocessing.Process class in Python.
These errors are typically easy to identify and often involve a quick fix.
In this tutorial, you will discover the common errors when creating child processes in Python and how to fix each in turn.
Let’s get started.
Common Multiprocessing Errors
The multiprocessing module and multiprocessing.Process class provide a flexible and powerful approach to concurrency using child processes.
When you are getting started with multiprocessing in Python, you may encounter one of many common errors.
These errors are typically caused by bugs introduced when copying and pasting code, or by a slight misunderstanding of how new child processes work.
We will take a closer look at some of the more common errors made when creating new child processes; they are:
- Error 1: RuntimeError Starting New Processes
- Error 2: print() Does Not Work In Child Processes
- Error 3: Adding Attributes to Classes that Extend Process
Do you have an error using the multiprocessing module?
Let me know in the comments so I can recommend a fix and add the case to this tutorial.
Error 1: RuntimeError Starting New Processes
It is common to get a RuntimeError when starting a new Process in Python.
The content of the error often looks as follows:
```
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.

This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:

    if __name__ == '__main__':
        freeze_support()
        ...

The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
```
This will happen on Windows and macOS, where the default start method is ‘spawn’. It may also happen when you configure your program to use the ‘spawn’ start method on other platforms.
This is a common error and is easy to fix.
The fix involves checking whether the code is running in the top-level environment and only then attempting to start a new process.
This is a best practice.
The idiom for this fix, as stated in the message of the RuntimeError, is to use an if-statement and check if the name of the module is equal to the string ‘__main__’.
For example:
```
...
# check for top-level environment
if __name__ == '__main__':
    # ...
```
This is called “protecting the entry point” of the program.
Recall that __name__ is a variable that holds the name of the module executing the current code.
Also recall that ‘__main__’ is the name of the top-level environment used to execute a Python program.
Using an if-statement to check if the module is the top-level environment and only starting child processes within that block will resolve the RuntimeError.
It means that if the Python file is imported, then the code protected by the if-statement will not run. It will only run when the Python file is run directly, i.e. when it is the top-level environment.
The if-statement idiom is required, even if the entry point of the program calls a function that itself starts a child process.
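A minimal sketch of the complete idiom may help here. The function names task() and main() are illustrative only; the point is that the child process is started from a function, and that function is only called from within the protected entry point.

```python
from multiprocessing import Process


def task():
    # code executed in the child process
    print('Hello from the child process', flush=True)


def main():
    # create, start, and wait for the child process
    process = Process(target=task)
    process.start()
    process.join()
    # an exit code of zero indicates the child finished normally
    return process.exitcode


# protect the entry point
if __name__ == '__main__':
    main()
```

If the call to main() were moved outside of the if-statement, a child started with the ‘spawn’ start method would import this module, call main() again, and trigger the RuntimeError.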
Error 2: print() Does Not Work In Child Processes
Printing to standard out (stdout) with the built-in print() function may not work properly from child processes.
You may print output messages for the user or debug messages from a child process and they may never appear, or may only appear when the child process is terminated.
For example:
```
...
# report a message from a child process
print('Hello from the child process')
```
This is a very common situation and the cause is well understood and easy to work around.
The print() function is a built-in function for displaying messages on standard output or stdout.
When you call print() from a child process created using the ‘spawn’ start method, the message may not appear right away.
This is because stdout in the child process is block buffered by default, so the buffer is not flushed after every message. This is unlike the main process, which is interactive and will flush messages after each line, i.e. it is line buffered.
Instead, the buffered messages are only flushed occasionally, such as when the child process terminates and the buffer is garbage collected.
We can flush stdout automatically with each call to print().
This can be achieved by setting the ‘flush’ argument to True.
For example:
```
...
# report a message from a child process
print('Hello from the child process', flush=True)
```
An alternate approach is to call the flush() method on the sys.stdout object directly.
For example:
```
import sys
...
# report a message from a child process
print('Hello from the child process')
# flush output
sys.stdout.flush()
```
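If a child process prints many messages, adding flush=True to every call can be tedious. One possible approach, sketched below, is to wrap print() once with functools.partial so that every call flushes. The name log is a hypothetical helper chosen for this example, not part of the multiprocessing API.

```python
import functools
from multiprocessing import Process

# hypothetical helper: a version of print() that always flushes
log = functools.partial(print, flush=True)


def task():
    # each message is flushed as soon as it is printed
    for i in range(3):
        log(f'Step {i} from the child process')


def run_child():
    # create, start, and wait for the child process
    process = Process(target=task)
    process.start()
    process.join()
    return process.exitcode


# protect the entry point
if __name__ == '__main__':
    run_child()
```

Because log accepts the same arguments as print(), existing calls can usually be swapped over by changing the function name only.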
The problem with the print() function only occurs when using the ‘spawn’ start method.
You can change the start method to ‘fork’, which will cause print() to work as expected.
Note that the ‘fork’ start method is not supported on Windows at the time of writing.
You can set the start method via the multiprocessing.set_start_method() function.
For example:
```
from multiprocessing import set_start_method
...
# set the start method to fork
set_start_method('fork')
```
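Because ‘fork’ is not available on every platform, it can be safer to check before selecting it. The sketch below assumes a hypothetical helper named pick_start_method() that falls back to ‘spawn’, and it uses multiprocessing.get_context() rather than set_start_method(), since set_start_method() raises a RuntimeError if it is called more than once in a program.

```python
from multiprocessing import get_all_start_methods, get_context


def pick_start_method(preferred='fork'):
    # hypothetical helper: use the preferred method if this platform
    # supports it, otherwise fall back to 'spawn'
    if preferred in get_all_start_methods():
        return preferred
    return 'spawn'


def task():
    # code executed in the child process
    print('Hello from the child process', flush=True)


def run_child(method):
    # a context exposes the same API as the multiprocessing module,
    # bound to one start method, without changing the global default
    context = get_context(method)
    process = context.Process(target=task)
    process.start()
    process.join()
    return process.exitcode


# protect the entry point
if __name__ == '__main__':
    run_child(pick_start_method())
```

The entry point is still protected here, so the same program runs whichever start method is selected.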
Error 3: Adding Attributes to Classes that Extend Process
Python provides the ability to create and manage new processes via the multiprocessing.Process class.
We can extend this class and override the run() method in order to run code in a new child process.
Extending the multiprocessing.Process class and adding instance attributes that you expect to be shared among multiple processes will fail with an error.
For example, suppose we define a new class that extends the multiprocessing.Process class and sets an attribute on the class instance from the run() method, which is executed in a new child process. This attribute will not be accessible by other processes, such as the parent process.
This is the case even if both parent and child processes share access to the “same” object.
This is because instance variables are not shared among processes by default. Instead, instance variables added to the multiprocessing.Process are private to the process that added them.
Each process operates on its own copy of the object and any changes made to that object are local to that process only, by default.
If you set instance attributes in the child process and try to access them in the parent process or another process, you will get an error.
For example:
```
Traceback (most recent call last):
...
AttributeError: 'CustomProcess' object has no attribute 'data'
```
This error occurred because the child process operates on a copy of the class instance that is different from the copy of the class instance used in the parent process.
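The failure can be reproduced with a short sketch. The class name CustomProcess and the attribute name data match the error message above; the helper parent_sees_data() is a hypothetical name used only for this example. It uses hasattr() to report the missing attribute rather than letting the AttributeError propagate.

```python
from multiprocessing import Process


class CustomProcess(Process):
    def run(self):
        # executed in the child process: this sets the attribute
        # on the child's copy of the object only
        self.data = 99


def parent_sees_data():
    # create, start, and wait for the child process
    process = CustomProcess()
    process.start()
    process.join()
    # the parent's copy of the object never received the attribute
    return hasattr(process, 'data')


# protect the entry point
if __name__ == '__main__':
    print(parent_sees_data())
```

The program should report False, because the assignment in run() never reaches the parent's copy of the object.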
Instance variable attributes can be shared between processes via the multiprocessing.Value and multiprocessing.Array classes.
These classes explicitly define data attributes designed to be shared between processes in a process-safe manner.
Shared variables mean that changes made in one process are always propagated and made available to other processes.
An instance of the multiprocessing.Value can be defined in the constructor of a custom class as a shared instance variable.
The constructor of the multiprocessing.Value class requires that we specify the data type and an initial value.
The data type can be specified using a ctypes type or a typecode.
Typecodes are familiar and easy to use, for example ‘i’ for a signed integer or ‘f’ for a single-precision floating-point value.
For example, we can define a multiprocessing.Value shared memory variable that holds a signed integer and is initialized to the value zero.
```
import multiprocessing
...
# initialize an integer shared variable
data = multiprocessing.Value('i', 0)
```
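As a small illustration of the two ways to specify the data type, the sketch below declares shared variables by typecode and by ctypes type, along with a multiprocessing.Array. The variable names are chosen for this example only.

```python
import ctypes
from multiprocessing import Array, Value

# shared signed integer declared with a typecode
count_by_code = Value('i', 0)
# the same kind of shared integer declared with a ctypes type
count_by_type = Value(ctypes.c_int, 0)
# shared double-precision float
ratio = Value('d', 0.0)
# shared array of three signed integers
samples = Array('i', [1, 2, 3])
```

The typecodes are the same single-character codes used by the standard library array module.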
The shared variable can be created in the constructor of the class that extends the multiprocessing.Process class.
We can change the value of the shared data variable via the “value” attribute.
For example:
```
...
# change the value of the shared variable
data.value = 100
```
We can access the value of the shared data variable via the same “value” attribute.
For example:
```
...
# access the shared variable
value = data.value
```
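Changes to the shared variable are propagated to other processes automatically behind the scenes. Each individual read or write of the “value” attribute is also protected by an internal lock, although compound updates such as data.value += 1 are not atomic and should be wrapped in the lock returned by data.get_lock().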
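Putting the pieces together, a minimal sketch of the fix might look as follows. It creates the multiprocessing.Value in the constructor, changes it from run() in the child process, and reads it back in the parent. The helper run_and_read() is a hypothetical name for this example, and the explicit get_lock() block is only needed because the increment is a read-modify-write.

```python
from multiprocessing import Process, Value


class CustomProcess(Process):
    def __init__(self):
        # initialize the parent class first
        super().__init__()
        # shared signed integer, visible to both parent and child
        self.data = Value('i', 0)

    def run(self):
        # executed in the child process: a single assignment
        self.data.value = 99
        # an increment reads then writes, so hold the lock for both steps
        with self.data.get_lock():
            self.data.value += 1


def run_and_read():
    # create, start, and wait for the child process
    process = CustomProcess()
    process.start()
    process.join()
    # the parent reads the value written by the child
    return process.data.value


# protect the entry point
if __name__ == '__main__':
    print(run_and_read())
```

Unlike a plain instance attribute, the shared value set in the child process is visible to the parent once the child has finished.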
Takeaways
You now know about the common errors when using multiprocessing in Python.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.