Fix FileNotFoundError With Multiprocessing in Python

May 26, 2022 Python Multiprocessing

You may get a FileNotFoundError when sharing concurrency primitives with child processes.

In this tutorial you will discover the preconditions for getting a FileNotFoundError when using multiprocessing in Python and how to fix the error in your programs.

Let's get started.

FileNotFoundError With New Process

It is common to get a FileNotFoundError when sharing a concurrency primitive object with a child process.

This error is most common on MacOS and Windows.

This error is well understood, repeatable and there are fixes we can use.

Before we dive into understanding and fixing this error, let's look at a worked example that recreates it.

The example below runs a custom function named task() in a child process. The function takes a mutex lock, then acquires the lock and blocks for one second. The main process starts first, creates the shared lock, then configures the child process to execute the task() function by passing the lock as an argument.

The complete example is listed below.

# SuperFastPython.com
# example of a common error when using processes
from time import sleep
from multiprocessing import Process
from multiprocessing import Lock

# executed in a new process
def task(lock):
    # acquire the lock
    with lock:
        # block for a moment
        sleep(1)

# entry point
if __name__ == '__main__':
    # create a shared lock
    lock = Lock()
    # create a process that uses the lock
    process = Process(target=task, args=(lock,))
    # start the process
    process.start()

Running the example on MacOS or Windows results in the error.

Traceback (most recent call last):
  ...
FileNotFoundError: [Errno 2] No such file or directory

We can create the same error on other platforms, like Linux.

This can be achieved by configuring the program to use the 'spawn' method when creating the child process, the default on MacOS and Windows.

For example:

# SuperFastPython.com
# example of a common error when using processes
from time import sleep
from multiprocessing import set_start_method
from multiprocessing import Process
from multiprocessing import Lock

# executed in a new process
def task(lock):
    # acquire the lock
    with lock:
        # block for a moment
        sleep(1)

# entry point
if __name__ == '__main__':
    # set the spawn start method
    set_start_method('spawn')
    # create a shared lock
    lock = Lock()
    # create a process that uses the lock
    process = Process(target=task, args=(lock,))
    # start the process
    process.start()

Running the example on any platform, e.g. Linux, Windows, or MacOS, results in the error.

Traceback (most recent call last):
  ...
FileNotFoundError: [Errno 2] No such file or directory

What are the preconditions for this error to occur?

Cause of FileNotFoundError With New Process

This error is known and has been reported.

It is sometimes reported as being specific to MacOS or to a specific concurrency primitive.

Nevertheless, there are three elements or pre-conditions that must be present for this error to occur.

They are:

  1. Use the 'spawn' start method to create new processes.
  2. Share a concurrency primitive with one or more processes.
  3. Main process exits before the child process starts up fully.

Let's take a closer look at each element in turn.

Precondition 1: Spawn Start Method

The error requires that you are using the 'spawn' start method.

Recall, the start method is the technique used by your system to start new child processes.

It may be 'fork' or 'spawn', or sometimes 'forkserver'.
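
Which of these start methods are supported depends on the platform. As a minimal sketch, the listing below reports the supported start methods alongside the one currently in use. The describe_start_methods() helper is a hypothetical name used only for illustration. Note that calling get_start_method() this way fixes the default start method for the rest of the program.

```python
# sketch: report the start methods supported on this platform
from multiprocessing import get_all_start_methods
from multiprocessing import get_start_method

# hypothetical helper, not part of the multiprocessing API
def describe_start_methods():
    # names of all start methods supported on this platform
    available = get_all_start_methods()
    # name of the start method currently in use
    current = get_start_method()
    return available, current

# entry point
if __name__ == '__main__':
    available, current = describe_start_methods()
    # report the supported start methods
    print(f'available: {available}')
    # report the current start method
    print(f'current: {current}')
```

If 'fork' does not appear in the list of available start methods, it cannot be selected on that platform.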

For the error to occur, you must be using the 'spawn' start method. This is the default start method for creating child processes on MacOS and Windows.

spawn: Available on Unix and Windows. The default on Windows and macOS.

-- multiprocessing — Process-based parallelism

You can check the child process start method you are using via the multiprocessing.get_start_method() function.

For example:

# SuperFastPython.com
# example of checking the start method
from multiprocessing import get_start_method
# get the start method
method = get_start_method()
# report the start method
print(method)

Running the example reports the current start method used for creating child processes. For example, on MacOS or Windows with the default configuration:

spawn

Precondition 2: Share Concurrency Primitive

The error requires that you share a concurrency primitive between the main process and one or more child processes.

Recall that the main process is the default or first process started to run your program.

Also, recall that concurrency primitives are those objects used to synchronize or coordinate behavior between processes, such as the multiprocessing.Lock used in the example above.

Precondition 3: Main Process Ends Early

The error requires that the main process finishes before the child process (or child processes) has had time to start up completely.

This typically involves the main process having no further instructions to execute after starting the child process or processes.

This causes the error because the main process will terminate too soon and attempt to clean-up or garbage collect objects that are no longer needed, such as the shared concurrency primitive. Then, when the child process starts and attempts to use the concurrency primitive, it gets an error.

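We get a FileNotFoundError because, under the covers, the shared primitive is backed by an operating system resource that the child process looks up by name, much like a file. Once the main process has cleaned up that resource, the lookup fails as if the file did not exist.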

Now that we know the preconditions for the error, let's look at how we might fix the error.

How to Fix FileNotFoundError With New Process

There are two ways we might fix the error.

They are:

  1. Change start method.
  2. Block the main process.

Let's take a closer look at each in turn.

Change Start Method

We can fix the error by changing the start method to 'fork', if it is available on your platform, e.g. on MacOS.

This can be achieved via the multiprocessing.set_start_method() function, which must be called before creating a child process.

For example:

...
# change start method
set_start_method('fork')

This start method is not available on all platforms, e.g. it is not available on Windows.

The complete example with this fix is listed below.

# SuperFastPython.com
# example of fix for common error with processes
from time import sleep
from multiprocessing import Process
from multiprocessing import Lock
from multiprocessing import set_start_method

# executed in a new process
def task(lock):
    # acquire the lock
    with lock:
        # block for a moment
        sleep(1)

# entry point
if __name__ == '__main__':
    # change start method
    set_start_method('fork')
    # create a shared lock
    lock = Lock()
    # create a process that uses the lock
    process = Process(target=task, args=(lock,))
    # start the process
    process.start()
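
Because the 'fork' start method is not supported everywhere, it may help to check for it before selecting it. The listing below is a minimal sketch of that idea, not a definitive implementation. The choose_context() helper is a hypothetical name introduced for illustration. It uses multiprocessing.get_context() to request 'fork' only when the platform supports it, and otherwise falls back to the default context.

```python
# sketch: use the 'fork' start method only where it is supported
from time import sleep
from multiprocessing import get_all_start_methods
from multiprocessing import get_context

# executed in a new process
def task(lock):
    # acquire the lock
    with lock:
        # block for a moment
        sleep(0.1)

# hypothetical helper, not part of the multiprocessing API
def choose_context(preferred='fork'):
    # use the preferred start method if this platform supports it
    if preferred in get_all_start_methods():
        return get_context(preferred)
    # otherwise fall back to the default context
    return get_context()

# entry point
if __name__ == '__main__':
    # get a context, preferring the 'fork' start method
    ctx = choose_context('fork')
    # create a shared lock from the same context
    lock = ctx.Lock()
    # create a process that uses the lock
    process = ctx.Process(target=task, args=(lock,))
    # start the process
    process.start()
    # wait for the child, in case we fell back to 'spawn'
    process.join()
```

The sketch joins the child process at the end, so it remains safe even when 'fork' is unavailable and the default context uses 'spawn'.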

Block Main Process

We can fix the error by blocking the main process long enough to allow the child processes to start up and run.

This could be achieved with a call to time.sleep(), although how long is long enough to sleep?

A preferred approach is to block the main process by waiting for the child process to terminate. This can be achieved via the Process.join() method.

For example:

...
# block main process
process.join()

The complete example with this fix is listed below.

# SuperFastPython.com
# example of fix for common error with processes
from time import sleep
from multiprocessing import Process
from multiprocessing import Lock

# executed in a new process
def task(lock):
    # acquire the lock
    with lock:
        # block for a moment
        sleep(1)

# entry point
if __name__ == '__main__':
    # create a shared lock
    lock = Lock()
    # create a process that uses the lock
    process = Process(target=task, args=(lock,))
    # start the process
    process.start()
    # block main process
    process.join()
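
The same approach extends to several child processes that share one lock. The listing below is a minimal sketch under that assumption. The start_and_join() helper and the hold parameter are hypothetical names introduced for illustration. All processes are started first and then joined, so the main process stays alive until every child has finished with the lock.

```python
# sketch: join several child processes that share one lock
from time import sleep
from multiprocessing import Process
from multiprocessing import Lock

# executed in a new process
def task(lock, hold=0.1):
    # acquire the lock
    with lock:
        # block for a moment
        sleep(hold)

# hypothetical helper, not part of the multiprocessing API
def start_and_join(count, lock):
    # create the child processes, all sharing the same lock
    processes = [Process(target=task, args=(lock,)) for _ in range(count)]
    # start all child processes
    for process in processes:
        process.start()
    # block until every child process has terminated
    for process in processes:
        process.join()
    # collect the exit code of each child process
    return [process.exitcode for process in processes]

# entry point
if __name__ == '__main__':
    # create a shared lock
    lock = Lock()
    # run three child processes and wait for them all
    exitcodes = start_and_join(3, lock)
    # report the exit codes
    print(exitcodes)
```

Starting every process before joining any of them lets the children run concurrently, rather than one after another.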

Takeaways

You now know the preconditions for getting FileNotFoundError when using multiprocessing in Python, and how to fix the error.


