Last Updated on August 21, 2023
You can run one-off file IO tasks in the background using a new thread or new child process.
Running a file IO task in the background allows the main thread of the program to continue on with other tasks.
In this tutorial, you will discover how to run one-off file IO tasks in the background.
Let’s get started.
Need to Run One-Off File I/O Tasks in the Background
We often need to perform a one-off file IO operation in our programs.
For example, we may need to load some data from a file, store state to a file, or append some data to a file.
If our program depends on this task being completed before moving on, the task cannot, or probably should not, be performed concurrently.
For example, a program may need to load data and then process that data. The processing task depends upon the load task and is blocked until the data is loaded. Concurrency offers no benefit in this case.
In other cases, the one-off operation may not be depended upon by subsequent tasks, in which case it can be performed asynchronously in a fire-and-forget manner.
For example, a program may need to save state on request by the user. This activity can be performed in the background.
The task can be dispatched and any error handling or notifications can be handled by the task itself.
For a one-off file IO task to be run in the background, we assume the task has certain properties, such as:
- Independence. The task does not depend upon another task and is not depended upon by another task, meaning the program is not waiting for the file IO operation to complete.
- Has Data. All data required by the task is provided to the task upfront, e.g. the data to save in the file and the name and path of the file where it is saved.
- Error Handling. The task should manage its own error handling, such as catching and reporting exceptions, because the rest of the program does not wait on the task or check its result.
- Isolated. There is only one or a very small number of tasks to perform, e.g. one-off. This means that we are not referring to batch file IO operations, like loading all files in a directory.
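As a rough illustration of the "Has Data" and "Error Handling" properties, a task function might look like the sketch below. The name save_state() and the choice to report failures with print() are assumptions made for this example, not a prescribed design.

```python
# sketch of a self-contained file IO task (save_state is a hypothetical name)
def save_state(filepath, data):
    # all required data arrives as arguments and nothing is returned
    try:
        # open the file for writing text and write the data
        with open(filepath, 'w', encoding='utf-8') as handle:
            handle.write(data)
    except OSError as e:
        # the task reports its own failure, as nobody is waiting on it
        print(f'Failed to save {filepath}: {e}')
    else:
        # report success
        print(f'Saved {filepath}')
```

Because the function catches OSError itself, a missing directory or a permissions problem is reported by the task rather than surfacing as an unhandled exception in a background worker.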
We are now familiar with the properties of a one-off file IO task.
How can we run one-off file IO tasks in the background?
How to Run One-Off File I/O Tasks in the Background
One-off file IO operations can be performed in the background of the main program.
This can be achieved using concurrency, such as a new thread or child process.
- Run file IO tasks in the background with a new thread.
- Run file IO tasks in the background with a child process.
Let’s take a closer look at each of these cases, as well as the default case of not running the task in the background.
One-Off I/O Tasks Block the Main Thread
When we perform a file IO task in a program, the calling thread is blocked.
This means that the function call made to perform the file IO does not return until the task is complete.
The program cannot continue until the function returns and, as such, is blocked from executing.
For example:
```python
...
# open a file for writing text
with open('path/to/file.ext', 'w', encoding='utf-8') as handle:
    # write a string to file
    handle.write('Hello world!')
```
In this case, the program is blocked in two places:
- Opening the file via the open() function
- Writing to the file via the write() method
If the program is not dependent upon the file being opened and data written to the file, this task can be performed in the background.
This would mean that the main thread would not be blocked and would be free to continue on with other tasks.
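To get a feel for how long the calling thread is held up, we could time the blocking calls. The sketch below is only an illustration: the helper name timed_write() and the temporary file location are assumptions, and the measured duration will vary with the system and the size of the data.

```python
import os
import tempfile
import time

# hypothetical helper: measure how long a blocking write holds up the caller
def timed_write(filepath, data):
    start = time.perf_counter()
    # both open() and write() block the calling thread
    with open(filepath, 'w', encoding='utf-8') as handle:
        handle.write(data)
    # return the elapsed time in seconds
    return time.perf_counter() - start

# protect the entry point
if __name__ == '__main__':
    # write a moderately large string to a file in the temp directory
    path = os.path.join(tempfile.gettempdir(), 'blocking_demo.txt')
    duration = timed_write(path, 'Hello world!\n' * 100_000)
    print(f'Main thread was blocked for {duration:.4f} seconds')
```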
Background Tasks with Thread
We can execute an independent file IO task in the background using a new thread.
This can be achieved by first defining an independent function to perform the task.
This function must take all required data as arguments and not return anything.
For example, if we are saving data to file, then the function would take the data and file path as arguments.
```python
# function to write a file
def write_file(filename, data):
    # open a file for writing text
    with open(filename, 'w', encoding='utf-8') as handle:
        # write string data to the file
        handle.write(data)
```
We can then create a new threading.Thread instance and configure it to call our write_file() function and pass it the required arguments.
This can be achieved via the “target” argument that takes the name of the function to run and the “args” argument that takes a tuple of the ordered arguments to the function.
For example:
```python
...
# create a new thread to run our function
thread = Thread(target=write_file, args=('data.txt', 'Hello world'))
```
We can then start the new thread.
This will request that the underlying operating system execute our function as soon as it is able.
For example:
```python
...
# start the new thread
thread.start()
```
The program is then free to continue on with other tasks.
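If the program later reaches a point where it does need the file to exist, it can wait for the thread at that point via join(). The following is a minimal sketch of that idea; the helper name save_in_background() and the temporary file path are assumptions made for the example.

```python
import os
import tempfile
from threading import Thread

# function to write a file
def write_file(filename, data):
    with open(filename, 'w', encoding='utf-8') as handle:
        handle.write(data)

# hypothetical helper: start the write in a new thread and return the thread
def save_in_background(filename, data):
    thread = Thread(target=write_file, args=(filename, data))
    thread.start()
    return thread

# protect the entry point
if __name__ == '__main__':
    path = os.path.join(tempfile.gettempdir(), 'data.txt')
    thread = save_in_background(path, 'Hello world')
    # the main thread carries on with unrelated work
    total = sum(range(1000))
    # wait only at the point where the saved file is actually needed
    thread.join()
    print(f'File exists: {os.path.exists(path)}')
```

Returning the thread keeps the task fire-and-forget by default, while still leaving the caller the option to wait.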
Background Tasks with Process
We can also execute independent file IO tasks in the background using a new child process.
As with a new thread, we first define a new function to execute our task that takes all required data for the task as arguments.
We can then create a new multiprocessing.Process object and configure it to execute our target function and take the required arguments.
The multiprocessing.Process has the same API as the threading.Thread, including a “target” argument for the name of the function to execute and an “args” argument for an ordered tuple of arguments to the function.
For example:
```python
...
# create a new child process
process = Process(target=write_file, args=('data.txt', 'Hello world'))
```
The new process can then be started via the start() method.
This requests that the operating system execute our function in the new child process as soon as it is able.
For example:
```python
...
# run the new process
process.start()
```
The program is then free to continue on and execute other tasks while the file IO operation occurs in the background.
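As a hedged sketch of how the pieces fit together, the snippet below starts the child process from within a protected entry point, then optionally waits for it and inspects its exit code. Waiting with join() is not required for a fire-and-forget task; it is included here only to show how the outcome could be checked. The temporary file path is an assumption for the example.

```python
import os
import tempfile
from multiprocessing import Process

# function to write a file
def write_file(filename, data):
    with open(filename, 'w', encoding='utf-8') as handle:
        handle.write(data)

# protect the entry point, needed when child processes are started
# using the 'spawn' start method
if __name__ == '__main__':
    path = os.path.join(tempfile.gettempdir(), 'data.txt')
    # create and start the child process
    process = Process(target=write_file, args=(path, 'Hello world'))
    process.start()
    # the main process is free to do other work here
    # optionally wait for the child and check how it ended
    process.join()
    # an exit code of 0 means the target function returned normally
    print(f'Child exit code: {process.exitcode}')
```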
Choose Between Thread vs Process
Why would we choose to run a file IO task in the background using a thread vs a process?
A thread is lightweight, meaning that it is fast to create and run and uses minimal memory. It is also able to access data and variables shared with the main thread, which may be helpful for error handling.
A process is heavyweight compared to a thread. For example, a process has one or more threads, whereas a thread does not have processes. A process is slower to start and takes up more memory. It is also much slower to share data with a new process via function arguments, as the arguments typically must be serialized and copied into the child process. This may mean that there will be overhead when using a child process if a file IO task involves saving a large amount of data in the background.
Generally, I would recommend using a thread for running one-off file IO tasks in the background.
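One way that shared variables can help with error handling in a thread is sketched below: the task records any failure in a list that the main thread can inspect later. The shared errors list and the deliberately invalid path are assumptions made for illustration.

```python
import os
import tempfile
from threading import Thread

# shared list that the background task appends failures to (an assumed design)
errors = []

# function to write a file and record any failure
def write_file(filename, data):
    try:
        with open(filename, 'w', encoding='utf-8') as handle:
            handle.write(data)
    except OSError as e:
        # record the failure where the main thread can see it
        errors.append((filename, e))

# protect the entry point
if __name__ == '__main__':
    # a path inside a directory that does not exist, to force a failure
    bad_path = os.path.join(tempfile.mkdtemp(), 'missing', 'data.txt')
    thread = Thread(target=write_file, args=(bad_path, 'Hello world'))
    thread.start()
    # wait for the task so that any error has been recorded
    thread.join()
    # report any failures recorded by the background task
    for filename, error in errors:
        print(f'Could not save {filename}: {error}')
```

This works because threads share the memory of the process. A child process would get its own copy of the list, so the main process would not see what the child appended.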
Nevertheless, sometimes the independence of a new process is required, such as if a large amount of CPU-bound work is required before saving the file. If run in a thread, this CPU-bound work would compete with the main thread for the global interpreter lock (GIL) and slow it down, whereas when run in a new process, the main thread is free to continue at full speed.
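A minimal sketch of this case is given below, assuming a hypothetical task prepare_and_save() that performs some CPU-bound preparation before writing the result. Note that only a small amount of data (a path and an integer) is passed to the child process, which keeps the cost of transferring arguments low.

```python
import os
import tempfile
from multiprocessing import Process

# hypothetical task: CPU-bound preparation followed by a file write
def prepare_and_save(filename, n):
    # CPU-bound step: build the text to save (a stand-in for real work)
    lines = [str(i * i) for i in range(n)]
    data = '\n'.join(lines)
    # file IO step: write the prepared text
    with open(filename, 'w', encoding='utf-8') as handle:
        handle.write(data)

# protect the entry point
if __name__ == '__main__':
    path = os.path.join(tempfile.gettempdir(), 'squares.txt')
    # the computation and the save both happen in the child process
    process = Process(target=prepare_and_save, args=(path, 100_000))
    process.start()
    print('Main continues while the child computes and saves')
    # optionally wait for the child to finish
    process.join()
```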
Now that we are familiar with how to run a one-off file IO task in the background, let’s look at some worked examples.
Example of Writing a File in the Background
In this section, we will explore an example of running a one-off file IO task in the background.
This example will first show how to save data to a file in a way that blocks the main thread. We will then run the same task in the background using a new thread, then finally run in the background using a new child process.
- Save the file by blocking the main thread.
- Save the file in the background with a new thread.
- Save a file in the background with a child process.
Let’s dive in.
Write a File And Block the Main Thread
In this example, we will write a file in a way that will block the main thread.
Firstly, we will define a function to perform the file IO task independently. The function takes the filename and the data that will be saved. It opens the file, writes the data, and closes the file. It then reports a final message to signal that the task is complete.
The write_file() function below implements this.
```python
# task to write a file
def write_file(filename, data):
    # open a file for writing text
    with open(filename, 'w', encoding='utf-8') as handle:
        # write string data to the file
        handle.write(data)
    # report a message
    print(f'File saved: {filename}')
```
Next, the main thread will report a message that the program is running, then another message that we are about to save the file. The write_file() function is then called and a final message is reported that the main thread is done.
```python
...
# report a message
print('Main is running')
# save the file
print('Saving file...')
write_file('data.txt', 'Hello world')
# report a message
print('Main is done.')
```
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# example of writing a file as a foreground task

# task to write a file
def write_file(filename, data):
    # open a file for writing text
    with open(filename, 'w', encoding='utf-8') as handle:
        # write string data to the file
        handle.write(data)
    # report a message
    print(f'File saved: {filename}')

# protect the entry point
if __name__ == '__main__':
    # report a message
    print('Main is running')
    # save the file
    print('Saving file...')
    write_file('data.txt', 'Hello world')
    # report a message
    print('Main is done.')
```
Running the program first reports a message that the program is running.
The main thread then reports a message that it is about to save the file.
The write_file() function is called, which blocks the main thread until the function returns.
The function opens the file using the context manager interface and writes the data to the file before closing the file again. A message is reported indicating that the file IO operation is complete.
The program resumes and reports a final message that it is done.
This example highlights that performing a file IO operation will block the main thread until the task is complete.
```
Main is running
Saving file...
File saved: data.txt
Main is done.
```
Next, let’s explore how we might perform the same operation in the background without blocking the main thread.
Write File in the Background with Thread
We can update the above example so that the file IO operation is performed in the background and does not block the main thread.
Firstly, we need to create a new threading.Thread configured to execute our function with the required arguments.
```python
...
# create and configure the new thread
thread = Thread(target=write_file, args=('data.txt', 'Hello world'))
```
The new thread can then be started.
```python
...
# start running the new thread in the background
thread.start()
```
And that’s it.
Tying this together, the updated version that runs the file IO task in the background using a new thread is listed below.
```python
# SuperFastPython.com
# example of writing a file as a background task using a thread
from threading import Thread

# task to write a file
def write_file(filename, data):
    # open a file for writing text
    with open(filename, 'w', encoding='utf-8') as handle:
        # write string data to the file
        handle.write(data)
    # report a message
    print(f'File saved: {filename}')

# protect the entry point
if __name__ == '__main__':
    # report a message
    print('Main is running')
    # save the file
    print('Saving file...')
    # create and configure the new thread
    thread = Thread(target=write_file, args=('data.txt', 'Hello world'))
    # start running the new thread in the background
    thread.start()
    # report a message
    print('Main is done.')
```
Running the program first reports a message that the program is running.
The main thread then reports a message that it is about to save the file.
The new thread is then created and configured to run our custom function, then the thread is started.
This asks the operating system to schedule the thread to run as soon as it is able.
The main thread continues on immediately, reporting the final message and terminating.
The program continues running because the new thread is still running in the background.
The write_file() function is executed in a new thread. It opens the file using the context manager interface and writes the data to the file before closing the file again. A message is reported indicating that the file IO operation is complete.
The program then terminates.
This example highlights how to perform a file IO operation in the background with a new thread, allowing the main thread to perform other tasks at the same time, or terminate early.
```
Main is running
Saving file...
Main is done.
File saved: data.txt
```
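The program waits for the background thread in the example above because threads created from the main thread are non-daemon by default, and the Python interpreter only exits once all non-daemon threads have finished. The sketch below simply inspects the daemon attribute to make this visible; the temporary file path is an assumption for the example.

```python
import os
import tempfile
from threading import Thread

# task to write a file
def write_file(filename, data):
    with open(filename, 'w', encoding='utf-8') as handle:
        handle.write(data)

# protect the entry point
if __name__ == '__main__':
    path = os.path.join(tempfile.gettempdir(), 'data.txt')
    thread = Thread(target=write_file, args=(path, 'Hello world'))
    # a thread created from the main thread is non-daemon by default,
    # so the interpreter waits for it before exiting
    print(f'Daemon thread: {thread.daemon}')
    thread.start()
    print('Main is done.')
```

If the thread were instead created with daemon=True, the program could exit before the write completes, so a non-daemon thread is probably the safer choice for saving files.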
Next, let’s look at how we might perform the same task in the background using a new child process.
Write File in the Background with Process
We can update the above example so that the file IO operation is performed in the background using a new child process.
Firstly, we need to create a new multiprocessing.Process configured to execute our function with the required arguments.
```python
...
# create and configure a new child process
process = Process(target=write_file, args=('data.txt', 'Hello world'))
```
The new child process can then be started.
```python
...
# start running the new child process
process.start()
```
And that’s it.
Tying this together, the updated version that runs the file IO task in the background using a new child process is listed below.
```python
# SuperFastPython.com
# example of writing a file as a background task using a process
from multiprocessing import Process

# task to write a file
def write_file(filename, data):
    # open a file for writing text
    with open(filename, 'w', encoding='utf-8') as handle:
        # write string data to the file
        handle.write(data)
    # report a message
    print(f'File saved: {filename}', flush=True)

# protect the entry point
if __name__ == '__main__':
    # report a message
    print('Main is running')
    # save the file
    print('Saving file...')
    # create and configure a new child process
    process = Process(target=write_file, args=('data.txt', 'Hello world'))
    # start running the new child process
    process.start()
    # report a message
    print('Main is done.')
```
Running the program first reports a message that the program is running.
The main thread then reports a message that it is about to save the file.
The new child process is then created and configured to run our custom function, then the process is started.
This asks the operating system to schedule the process to run as soon as it is able.
The main thread continues on immediately, reporting the final message and terminating.
The program continues running because the new child process is still running in the background.
The write_file() function is executed in a child process. It opens the file using the context manager interface and writes the data to the file before closing the file again. A message is reported indicating that the file IO operation is complete.
The program then terminates.
This example highlights how to perform a file IO operation in the background with a new process, allowing the main thread to perform other tasks at the same time, or terminate early.
```
Main is running
Saving file...
Main is done.
File saved: data.txt
```
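Because the thread and process versions differ only in the class that is created, the two could be folded into a single helper. The sketch below is one possible way to do this; the function name start_background_save() and its use_process flag are hypothetical and not part of either API.

```python
import os
import tempfile
from multiprocessing import Process
from threading import Thread

# task to write a file
def write_file(filename, data):
    with open(filename, 'w', encoding='utf-8') as handle:
        handle.write(data)

# hypothetical helper: choose the worker class, the rest is identical
def start_background_save(filename, data, use_process=False):
    # both classes accept the same target and args arguments
    worker_class = Process if use_process else Thread
    worker = worker_class(target=write_file, args=(filename, data))
    worker.start()
    return worker

# protect the entry point
if __name__ == '__main__':
    path = os.path.join(tempfile.gettempdir(), 'data.txt')
    # use a thread by default, pass use_process=True for a child process
    worker = start_background_save(path, 'Hello world')
    print('Main is done.')
    # both worker types also share the join() method
    worker.join()
```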
Takeaways
You now know how to run one-off file IO tasks in the background.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.