You can fix and recover a mutex lock used in a terminated child process by first releasing it before using it again, and handling the case where the lock had not been acquired.
In this tutorial, you will discover how to recover a mutex lock used in a terminated child process.
Let’s get started.
Broken Mutex Lock When Terminating Child Processes
We can run a task in a child process.
This can be achieved by creating an instance of the multiprocessing.Process class, specifying the target function to execute via the “target” argument and any arguments to the target function via the “args” argument.
For example:
```python
...
# create and configure a child process
process = multiprocessing.Process(target=task)
```
We may create a mutex lock via the multiprocessing.Lock class and share it with the child process.
For example:
```python
...
# create a mutex lock
lock = multiprocessing.Lock()
```
It is possible to terminate a child process while it has acquired the lock.
This can be achieved by calling the terminate() method on the Process instance.
For example:
```python
...
# terminate the child process
process.terminate()
```
Terminating the child process while it holds a mutex lock may leave the lock in an unknown and inconsistent state.
This case is specifically warned against in the multiprocessing module documentation.
For example:
Using the Process.terminate method to stop a process is liable to cause any shared resources (such as locks, semaphores, pipes and queues) currently being used by the process to become broken or unavailable to other processes.
Therefore it is probably best to only consider using Process.terminate on processes which never use any shared resources.
—multiprocessing — Process-based parallelism
How can we fix a mutex lock left in an inconsistent state when terminating a child process?
How to Fix a Broken Mutex Lock
If a mutex lock is used in a child process that is terminated, the lock is not broken.
Instead, the lock is in an unknown state.
If the child process held the lock when it was terminated, the lock was never released.
Therefore, we must explicitly release the lock before using it.
For example:
```python
...
# recover the broken lock
lock.release()
# acquire the lock
with lock:
    # ...
```
If the lock had not been acquired, or had already been released, and we attempt to release it again, a ValueError exception will be raised.
Unlike the threading.Lock class, the multiprocessing.Lock class does not offer a locked() method to check if a mutex lock is currently locked.
Therefore, if we don’t know the state of the mutex lock we can try and release it before acquiring it, and wrap the call in a try-except block.
For example:
```python
...
# try and recover the broken lock
try:
    lock.release()
except ValueError:
    pass
# acquire the lock
with lock:
    # ...
```
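The release-and-ignore pattern above can be wrapped in a small reusable function. The following is a minimal sketch: the name recover_lock() is hypothetical and not part of the multiprocessing API, and the lock left held by a terminated child is simulated here by acquiring it in the same process. It assumes a multiprocessing.Lock rather than an RLock, which checks ownership on release and behaves differently.

```python
from multiprocessing import Lock


def recover_lock(lock):
    """Release a lock that may have been left held.

    Hypothetical helper (not part of multiprocessing).
    Returns True if the lock was held and has been released,
    False if it was not held and nothing needed to be done.
    """
    try:
        lock.release()
    except ValueError:
        # the lock was not held, so there is nothing to recover
        return False
    return True


if __name__ == '__main__':
    lock = Lock()
    # simulate a lock left held by a terminated child process
    lock.acquire()
    print(recover_lock(lock))  # the lock was held
    print(recover_lock(lock))  # the lock is already free
    # the lock can now be used normally
    with lock:
        print('Acquired lock')
```

Returning a boolean lets the caller log or count how often a lock actually needed recovering, which can help when deciding whether terminating the child process was safe.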
Now that we know how to recover a lock used in a terminated child process, let’s look at some worked examples.
Example of a Broken Mutex Lock When Terminating Child Process
Before we look at recovering a mutex lock from a child process, let’s look at a failure case.
In this example, we will share a lock with a child process, then terminate the child process while it is holding the lock. The parent process will then attempt to acquire the lock, which will fail with a deadlock as the lock will never be released by the terminated child process.
Firstly, we can define a task function that takes the lock as an argument, acquires it, and blocks for a few seconds to simulate computational effort.
```python
# task executed in a child process
def task(lock):
    # acquire the lock
    with lock:
        # block for a while to simulate work
        print('Task has lock...', flush=True)
        sleep(2)
```
Next, in the main process, we can create the shared lock, then create and configure the child process to execute our task() function and take the shared lock as an argument.
```python
...
# create the shared lock
lock = Lock()
# create the child process
child = Process(target=task, args=(lock,))
# start the child process
child.start()
```
The main process will then wait a moment, terminate the child process, and wait for it to close.
```python
...
# wait a moment
sleep(1)
# terminate the child process
child.terminate()
# wait for child to terminate
child.join()
```
Finally, the main process will attempt to acquire the shared lock.
```python
...
print(f'Child is terminated, still alive: {child.is_alive()}')
# try to acquire the lock
print('Main acquiring lock...')
with lock:
    print('Acquired lock')
# all done normally
print('Main done')
```
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# example of terminating a process and breaking a lock
from time import sleep
from multiprocessing import Process
from multiprocessing import Lock

# task executed in a child process
def task(lock):
    # acquire the lock
    with lock:
        # block for a while to simulate work
        print('Task has lock...', flush=True)
        sleep(2)

# protect the entry point
if __name__ == '__main__':
    # create the shared lock
    lock = Lock()
    # create the child process
    child = Process(target=task, args=(lock,))
    # start the child process
    child.start()
    # wait a moment
    sleep(1)
    # terminate the child process
    child.terminate()
    # wait for child to terminate
    child.join()
    print(f'Child is terminated, still alive: {child.is_alive()}')
    # try to acquire the lock
    print('Main acquiring lock...')
    with lock:
        print('Acquired lock')
    # all done normally
    print('Main done')
```
Running the example first creates the shared lock.
The child process is then created and configured to execute our task() function and is passed the shared lock as an argument.
The child process is started and the main process then blocks for one second.
The child process runs, acquires the lock, reports a message then sleeps.
The main process resumes and terminates the child process and waits for it to close. It then reports that the child process is no longer alive.
At this point, the lock is still held and the child process that held it is stopped.
The main process reports a message and attempts to acquire the lock.
This is a deadlock. The main process is unable to acquire the lock because it was acquired by the child process and never released.
It is important to note that even though the child process used the context manager interface, it did not release the lock because the exit of the context manager was never executed when the child process was terminated.
```
Task has lock...
Child is terminated, still alive: False
Main acquiring lock...
```
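A hang like this can be detected instead of waiting forever. The sketch below relies on the “timeout” argument of the acquire() method on multiprocessing.Lock, which returns False if the lock could not be acquired in time. The function name try_acquire() and the timeout values are assumptions for illustration, and the stuck lock is simulated by acquiring it in the same process rather than by terminating a child.

```python
from multiprocessing import Lock


def try_acquire(lock, timeout=1.0):
    """Return True if the lock was acquired within timeout seconds.

    Hypothetical helper: a thin wrapper around lock.acquire(timeout=...).
    """
    return lock.acquire(timeout=timeout)


if __name__ == '__main__':
    lock = Lock()
    # simulate a lock left held by a terminated child process
    lock.acquire()
    # attempt to acquire the lock, but give up after half a second
    if try_acquire(lock, timeout=0.5):
        print('Acquired lock')
        lock.release()
    else:
        print('Lock is still held, giving up instead of hanging')
```

A timeout only reports the problem; it does not release the lock. It is most useful as a diagnostic before applying a recovery step.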
Next, let’s look at how we might recover the lock.
Example of Fixing a Broken Mutex Lock
We can explore the case of fixing a broken lock before acquiring it.
In this example, we will update the previous example to first release the lock before acquiring it.
We know that the child process will be holding the lock when it is terminated, so all we need to do is release the lock before acquiring it again.
For example:
```python
...
# try and recover the broken lock
lock.release()
print(f'Child is terminated, still alive: {child.is_alive()}')
# try to acquire the lock
print('Main acquiring lock...')
with lock:
    print('Acquired lock')
# all done normally
print('Main done')
```
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# example of attempting to recover a broken lock
from time import sleep
from multiprocessing import Process
from multiprocessing import Lock

# task executed in a child process
def task(lock):
    # acquire the lock
    with lock:
        # block for a while to simulate work
        print('Task has lock...', flush=True)
        sleep(2)

# protect the entry point
if __name__ == '__main__':
    # create the shared lock
    lock = Lock()
    # create the child process
    child = Process(target=task, args=(lock,))
    # start the child process
    child.start()
    # wait a moment
    sleep(1)
    # terminate the child process
    child.terminate()
    # wait for child to terminate
    child.join()
    # try and recover the broken lock
    lock.release()
    print(f'Child is terminated, still alive: {child.is_alive()}')
    # try to acquire the lock
    print('Main acquiring lock...')
    with lock:
        print('Acquired lock')
    # all done normally
    print('Main done')
```
Running the example first creates the shared lock.
The child process is then created and configured to execute our task() function and is passed the shared lock as an argument.
The child process is started and the main process then blocks for one second.
The child process runs, acquires the lock, reports a message then sleeps.
The main process resumes, terminates the child process, and waits for it to close.
At this point, the lock is still held and the child process is stopped.
The main process then releases the lock and reports that the child process is no longer alive. It reports a second message and attempts to acquire the lock.
The lock is acquired normally, as expected.
```
Task has lock...
Child is terminated, still alive: False
Main acquiring lock...
Acquired lock
Main done
```
This is a contrived situation because we know the state of the lock in the terminated child process.
Next, let’s explore the case where we don’t know the state of the lock in the terminated child process.
Example of More Safely Fixing a Broken Mutex Lock
We can explore the case where we don’t know the status of the lock in the terminated child process.
If we assume the child process held the lock when it was terminated, but it in fact did not, then releasing the lock in the main process before acquiring it will raise a ValueError exception.
We can demonstrate this with a worked example.
The task() function can be updated to block before acquiring the lock.
The main process will then terminate the child process before the child process is able to acquire the lock. It will then release the lock and attempt to acquire it.
The complete example is listed below.
```python
# SuperFastPython.com
# example of attempting to recover a broken lock and failing
from time import sleep
from multiprocessing import Process
from multiprocessing import Lock

# task executed in a child process
def task(lock):
    # work before we get the lock
    sleep(2)
    # acquire the lock
    with lock:
        # block for a while to simulate work
        print('Task has lock...', flush=True)
        sleep(2)

# protect the entry point
if __name__ == '__main__':
    # create the shared lock
    lock = Lock()
    # create the child process
    child = Process(target=task, args=(lock,))
    # start the child process
    child.start()
    # wait a moment
    sleep(1)
    # terminate the child process
    child.terminate()
    # wait for child to terminate
    child.join()
    # try and recover the broken lock
    lock.release()
    print(f'Child is terminated, still alive: {child.is_alive()}')
    # try to acquire the lock
    print('Main acquiring lock...')
    with lock:
        print('Acquired lock')
    # all done normally
    print('Main done')
```
Running the example first creates the shared lock.
The child process is then created and configured to execute our task() function and is passed the shared lock as an argument.
The child process is started and the main process then blocks for one second.
The child process runs and sleeps.
The main process resumes and terminates the child process and waits for it to close.
It then attempts to release the lock before acquiring it again.
This fails with a ValueError exception.
```
Traceback (most recent call last):
...
ValueError: semaphore or lock released too many times
```
This highlights that we cannot assume the state of the lock in the terminated child process and unconditionally release it before acquiring it again.
Instead, we need a more robust solution.
We can update the example to still attempt to always release the lock before using it, but protect against a possible ValueError exception, in case the lock was not locked.
```python
...
# try and recover the broken lock
try:
    lock.release()
except ValueError:
    pass
```
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# example of attempting to recover a broken lock more safely
from time import sleep
from multiprocessing import Process
from multiprocessing import Lock

# task executed in a child process
def task(lock):
    # work before we get the lock
    sleep(2)
    # acquire the lock
    with lock:
        # block for a while to simulate work
        print('Task has lock...', flush=True)
        sleep(2)

# protect the entry point
if __name__ == '__main__':
    # create the shared lock
    lock = Lock()
    # create the child process
    child = Process(target=task, args=(lock,))
    # start the child process
    child.start()
    # wait a moment
    sleep(1)
    # terminate the child process
    child.terminate()
    # wait for child to terminate
    child.join()
    # try and recover the broken lock
    try:
        lock.release()
    except ValueError:
        pass
    print(f'Child is terminated, still alive: {child.is_alive()}')
    # try to acquire the lock
    print('Main acquiring lock...')
    with lock:
        print('Acquired lock')
    # all done normally
    print('Main done')
```
Running the example first creates the shared lock.
The child process is then created and configured to execute our task() function and is passed the shared lock as an argument.
The child process is started and the main process then blocks for one second.
The child process runs and sleeps.
The main process resumes and terminates the child process and waits for it to close.
It then attempts to release the lock before acquiring it again.
This fails with a ValueError exception, which is caught and ignored. The main process then reports a message, acquires the lock, and reports a message.
This highlights how we can safely recover a lock from a terminated child process with an unknown state.
```
Child is terminated, still alive: False
Main acquiring lock...
Acquired lock
Main done
```
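An alternative to catching the ValueError is to probe the lock with a non-blocking acquire. The following is a sketch only: the name acquire_after_termination() is hypothetical, the stuck lock is simulated in a single process, and the approach assumes that no other live process is using the lock, otherwise releasing it could take the lock away from a legitimate holder.

```python
from multiprocessing import Lock


def acquire_after_termination(lock):
    """Acquire a lock whose previous holder may have been terminated.

    Hypothetical helper. Assumes no other live process uses the lock.
    Returns True if the lock had been left held, False if it was free.
    The caller holds the lock when this function returns.
    """
    if lock.acquire(block=False):
        # the lock was free and is now held by the caller
        return False
    # the lock was left held: release it, then take it ourselves
    lock.release()
    lock.acquire()
    return True


if __name__ == '__main__':
    lock = Lock()
    # simulate a lock left held by a terminated child process
    lock.acquire()
    was_stuck = acquire_after_termination(lock)
    print(f'Lock had been left held: {was_stuck}')
    # release the lock once we are finished with it
    lock.release()
```

Unlike the try-except version, this variant leaves the lock held by the caller, so it must be paired with an explicit release() rather than the context manager interface.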
Takeaways
You now know how to recover a mutex lock used in a terminated child process.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.