Last Updated on August 21, 2023
Manipulating files is perhaps one of the core activities in most Python programs.
Before we can explore how to manipulate files concurrently with threads or processes, we need to review the basics of file IO in Python.
In this tutorial, you will discover how to manipulate files in Python.
After completing this tutorial, you will know:
- How to list, open, read, write, rename, delete, move and copy files in Python.
- How to manipulate paths in Python, such as getting the basename, splitting off the file extension, and constructing paths.
- How to create zip files, and add files to an archive and unzip files from an archive.
Let’s dive in.
File IO
File IO refers to manipulating files on a hard disk.
This typically includes reading and writing data to files, but also includes a host of related operations such as renaming files, copying files, and deleting files. It may also refer to operations such as zipping files into common archive formats.
File IO might be one of the most common operations performed in Python programs, or programs generally. Files are where we store data, and programs need data to be useful.
Python provides a number of modules for manipulating files.
The most common Python file IO modules include the following:
- built-in: with functions such as the open() function for opening a file.
- os: with functions such as makedirs(), remove(), rename(), and many more.
- os.path: with functions such as basename(), join(), and many more.
- shutil: with functions such as copy(), move(), and many more.
We cannot cover all of the functions or all of the operations, but we can take a whirlwind tour of some of the most common file IO operations.
How to List Files
The contents of a directory can be listed using the os.listdir() function.
The function takes a directory path as an argument and returns string names for each file and directory it contains.
For example:
```python
...
# list the contents of a directory
names = listdir('/')
```
The “.” and “..” entries that represent the current and parent directory are omitted from the results.
The names returned may be files or directories. They will not be sorted.
The example below lists the contents of the root directory on your machine.
```python
# SuperFastPython.com
# list the contents of a directory
from os import listdir
# directory to list
directory = '/'
# report the contents of the directory
for name in listdir(directory):
    print(name)
```
Running the example will report the contents of the ‘/’ root directory on your machine.
On my machine, we see the following (your results will differ).
```
home
usr
bin
sbin
.file
etc
var
Library
System
.VolumeIcon.icns
private
.vol
Users
Applications
opt
dev
Volumes
tmp
cores
```
Alternatives to the os.listdir() function include os.scandir().
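As a rough sketch of the os.scandir() alternative, the snippet below lists a directory and reports whether each entry is a file or a directory. The temporary directory and the names 'a.txt', 'b.txt' and 'sub' are made up for illustration, and the entries are sorted because the order returned is arbitrary.

```python
# list a directory with os.scandir() and report the type of each entry
from os import makedirs, scandir
from os.path import join
from tempfile import TemporaryDirectory

with TemporaryDirectory() as directory:
    # create two empty files and one sub-directory (hypothetical names)
    for filename in ['b.txt', 'a.txt']:
        with open(join(directory, filename), 'x', encoding='utf-8') as handle:
            pass
    makedirs(join(directory, 'sub'))
    # scandir() yields DirEntry objects that know their own type
    with scandir(directory) as entries:
        listing = sorted((entry.name, entry.is_dir()) for entry in entries)
    # report each name and whether it is a file or a directory
    for name, is_dir in listing:
        print(name, 'directory' if is_dir else 'file')
```

Because each entry already carries its type, this approach can avoid a separate call per name when you need to tell files and directories apart.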
How to Open a File
Files are opened in Python using the open() built-in function.
The function takes a number of arguments, most notably the path of the file to open, the mode in which to open it and the encoding used when reading or writing from the file.
Common open mode string values include:
- ‘r’: open in read mode (default).
- ‘w’: open in write mode, truncating the file if it already exists.
- ‘a’: open in append mode, writing to the end of the file.
- ‘b’: open in binary mode.
- ‘t’: open in text mode (default).
- ‘x’: open in exclusive creation mode, failing if the file already exists.
Modes can be combined, for example ‘rb’ indicates opening a file for reading (r) binary (b) data.
The encoding names the codec used to convert between text and bytes, such as ‘utf-8’, which is a common choice and is backward compatible with ASCII text.
The open() function returns a file object or handle on which operations can be performed like reading and writing.
For example:
```python
...
# open a file
handle = open('/path/to/file.ext', 'w', encoding='utf-8')
# ...
handle.close()
```
Operations on the file object include read(), write(), and close(). It is important to close a file once you are finished using it, especially when writing, as closing ensures any buffered data is written to the file.
It is common to use the context manager when opening a file.
This involves using the “with” keyword, calling the function and specifying a name for the file handle.
For example:
```python
...
# open a file
with open('/path/to/file.ext', 'w', encoding='utf-8') as handle:
    # ...
    pass
```
This creates a block in which file operations can be performed. Once the block is exited, normally or by a raised exception, then the file is closed automatically.
Given that the file is closed automatically, opening files with the context manager is the preferred way to open files.
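To make the automatic closing concrete, here is a minimal sketch that raises an error inside the block and then checks the closed attribute of the handle. The file name 'example.txt' and the simulated RuntimeError are assumptions made for the demonstration.

```python
# confirm that the context manager closes the file, even after an error
from os.path import join
from tempfile import TemporaryDirectory

with TemporaryDirectory() as directory:
    # hypothetical file inside a temporary directory
    path = join(directory, 'example.txt')
    try:
        with open(path, 'w', encoding='utf-8') as handle:
            handle.write('partial data')
            # simulate a failure part way through the block
            raise RuntimeError('something went wrong')
    except RuntimeError:
        pass
    # the handle was closed when the block exited with the exception
    closed_after_error = handle.closed
    print(closed_after_error)
```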
The example below creates a new file but does not read or write any data.
```python
# SuperFastPython.com
# create a new file
# create a new file in the current working directory
with open('new_file.txt', 'x', encoding='utf-8') as handle:
    pass
```
Running the example creates a new file with the name ‘new_file.txt’ in the current working directory (the directory from which the script is run).
```
new_file.txt
```
If you try to create the file again after it already exists, the program will fail with a FileExistsError.
```
FileExistsError: [Errno 17] File exists: 'new_file.txt'
```
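If you would rather handle this case than let the program stop, one option is to catch the exception. The sketch below is only one way to do it; the helper name create_if_missing() is hypothetical and not part of the standard library.

```python
# create a file with 'x' mode and handle the case where it already exists
from os.path import join
from tempfile import TemporaryDirectory

def create_if_missing(path):
    """Hypothetical helper: create an empty file, return False if it exists."""
    try:
        with open(path, 'x', encoding='utf-8'):
            pass
        return True
    except FileExistsError:
        return False

with TemporaryDirectory() as directory:
    path = join(directory, 'new_file.txt')
    # the first call creates the file, the second finds it already there
    first = create_if_missing(path)
    second = create_if_missing(path)
    print(first, second)
```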
How to Create a Directory
A directory can be created using the os.makedirs() function.
The function takes the path of the directory to create. If one or more directories in the provided path do not exist, they will be created.
The function also takes an exist_ok argument, which defaults to False, meaning that an error is raised if the final directory in the path already exists. Setting exist_ok to True ignores the case where the directory already exists, so a script that creates the directory can safely be run more than once.
For example:
```python
...
# create directories
makedirs('create/all/of/these/dirs', exist_ok=True)
```
The following example will create a new subdirectory in the current working directory.
```python
# SuperFastPython.com
# create directories
from os import makedirs
# directory to create
path = 'tmp'
# create all directories in the path
makedirs(path, exist_ok=True)
```
Running the example creates a tmp/ subdirectory in the current working directory.
Alternatives to the os.makedirs() function include os.mkdir().
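The difference between the two functions can be seen in a small sketch: os.mkdir() creates a single directory and fails if the parent is missing, whereas os.makedirs() creates the missing parents too. The nested path 'a/b' inside a temporary directory is made up for illustration.

```python
# compare os.mkdir() and os.makedirs() on a nested path
from os import makedirs, mkdir
from os.path import isdir, join
from tempfile import TemporaryDirectory

with TemporaryDirectory() as directory:
    # hypothetical nested path whose parent 'a' does not exist yet
    nested = join(directory, 'a', 'b')
    try:
        # mkdir() creates one directory and needs the parent to exist
        mkdir(nested)
        mkdir_failed = False
    except FileNotFoundError:
        mkdir_failed = True
    # makedirs() creates the missing parent 'a' as well as 'b'
    makedirs(nested, exist_ok=True)
    created = isdir(nested)
    print(mkdir_failed, created)
```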
How to Write To File
Files can be written to after being opened by calling the write() function.
A file must first be opened by calling the open() built-in function with a mode that permits writing, such as ‘w’. This function returns a file handle on which the write() function can be called.
For example:
```python
...
# open a file for writing text
with open('path/to/file.ext', 'w', encoding='utf-8') as handle:
    # write a string to file
    handle.write('Hello world!')
```
If the file was opened in binary mode, then write() must take bytes, whereas if the file was opened in text mode then write() must take strings.
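A short sketch of the binary case may help. It writes bytes to a file opened in ‘wb’ mode, then shows that passing a string to a binary-mode file raises a TypeError. The file name 'data.bin' is an assumption for the example.

```python
# write bytes in binary mode and show that strings are rejected
from os.path import join
from tempfile import TemporaryDirectory

with TemporaryDirectory() as directory:
    path = join(directory, 'data.bin')
    # binary mode: write() takes bytes and no encoding argument is given
    with open(path, 'wb') as handle:
        handle.write('Hello world!'.encode('utf-8'))
    # passing a string to a binary-mode file raises a TypeError
    try:
        with open(path, 'ab') as handle:
            handle.write('more text')
        type_error = False
    except TypeError:
        type_error = True
    # read the bytes back to confirm only the first write succeeded
    with open(path, 'rb') as handle:
        data = handle.read()
    print(data, type_error)
```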
The example below writes a string to file. If the file already exists, the content is replaced because we are opening the file in write mode ‘w’ instead of append mode ‘a’.
```python
# SuperFastPython.com
# write a string to a file
# open a file for writing text
with open('new_file.txt', 'w', encoding='utf-8') as handle:
    # write string data to the file
    handle.write('We are writing to file')
```
Running the example creates a new file in the current working directory with the name “new_file.txt”.
Checking the file in a text editor, we can see the contents match the string that we wrote.
```
We are writing to file
```
Alternatives to the write() function include writelines().
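As an illustration, the sketch below uses writelines() to write a list of strings, then reopens the file in append mode to add one more line. Note that writelines() does not add line endings, so each string carries its own newline. The file name 'lines.txt' is made up.

```python
# write several lines with writelines(), then append one more line
from os.path import join
from tempfile import TemporaryDirectory

lines = ['first line\n', 'second line\n']
with TemporaryDirectory() as directory:
    path = join(directory, 'lines.txt')
    # writelines() does not add newlines, so each string includes its own
    with open(path, 'w', encoding='utf-8') as handle:
        handle.writelines(lines)
    # append mode adds to the end of the file instead of replacing it
    with open(path, 'a', encoding='utf-8') as handle:
        handle.write('third line\n')
    # read everything back to check the result
    with open(path, 'r', encoding='utf-8') as handle:
        content = handle.read()
    print(content)
```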
How to Read From File
Files can be read from after being opened by calling the read() function.
A file must first be opened by calling the open() built-in function with a mode that permits reading, such as ‘r’. This function returns a file object on which the read() function can be called.
For example:
```python
...
# open a file for reading text
with open('path/to/file.ext', 'r', encoding='utf-8') as handle:
    # read the contents of the file
    data = handle.read()
```
The example below opens the “/etc/services” file on a POSIX system and reports the length of the data in the file.
Note: if you are not on a POSIX system (e.g. macOS or Linux), change the path to another text file, such as the Python script itself.
```python
# SuperFastPython.com
# read a file into memory
# the file to open
path = '/etc/services'
# open a file for reading text
with open(path, 'r', encoding='utf-8') as handle:
    # read the contents of the file as string
    data = handle.read()
    # report details about the content
    print(f'{path} has {len(data)} characters')
```
Running the example loads the file, reads the contents into memory as a string and reports the length of the string in terms of the number of characters.
```
/etc/services has 677972 characters
```
Alternatives to the read() function include readline() and readlines().
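The sketch below shows these alternatives side by side, along with iterating over the file handle directly, which reads one line at a time. The file name and its three short lines are assumptions for the example.

```python
# read a file line by line with readline(), readlines() and iteration
from os.path import join
from tempfile import TemporaryDirectory

with TemporaryDirectory() as directory:
    path = join(directory, 'lines.txt')
    # prepare a small file with three lines
    with open(path, 'w', encoding='utf-8') as handle:
        handle.write('one\ntwo\nthree\n')
    with open(path, 'r', encoding='utf-8') as handle:
        # readline() returns the next line, including the trailing newline
        first = handle.readline()
        # readlines() returns the remaining lines as a list
        rest = handle.readlines()
    with open(path, 'r', encoding='utf-8') as handle:
        # iterating over the handle yields one line at a time
        stripped = [line.rstrip('\n') for line in handle]
    print(first, rest, stripped)
```

Iterating over the handle is often a good fit for large files, as the whole file does not need to be held in memory at once.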
How to Move Files
Files can be moved in Python using the shutil.move() function.
The shutil.move() function takes the source file path and the destination file path as arguments.
For example:
```python
...
# move a file
move('src/file.ext', 'dst/file.ext')
```
If the source is a directory, then the move() function will move the directory and its contents to the destination.
If the destination filename differs from the source filename, the file is renamed as part of the move.
If the destination is a directory, then the source will be moved under the destination directory.
The example below creates a new subdirectory and a file in the current directory, then moves the file under the subdirectory.
```python
# SuperFastPython.com
# move a file
from os import makedirs
from shutil import move
# create a new file in the current working directory
with open('moving_file.txt', 'x', encoding='utf-8') as handle:
    pass
# create a new sub-directory in the current working directory
makedirs('tmp', exist_ok=True)
# move the file under the sub-directory
move('moving_file.txt', 'tmp')
```
Running the example first creates a new file in the current directory named “moving_file.txt” and a new subdirectory under the current working directory named ‘tmp/’. It then moves ‘./moving_file.txt’ to ‘tmp/moving_file.txt’.
```
tmp/moving_file.txt
```
Alternatives to the shutil.move() function include os.rename() and os.replace().
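As a small sketch of moving and renaming in one step, the snippet below passes a full destination path with a different filename to shutil.move(), which returns the destination path. The names 'report.txt', 'archive' and 'report_old.txt' are made up for illustration.

```python
# move a file into a sub-directory and rename it in the same call
from os import makedirs
from os.path import exists, join
from shutil import move
from tempfile import TemporaryDirectory

with TemporaryDirectory() as directory:
    # create an empty source file and a destination sub-directory
    src = join(directory, 'report.txt')
    with open(src, 'x', encoding='utf-8') as handle:
        pass
    makedirs(join(directory, 'archive'))
    # the destination filename differs, so the file is renamed as it moves
    dst = move(src, join(directory, 'archive', 'report_old.txt'))
    # the file now exists at the destination and not at the source
    moved = exists(dst) and not exists(src)
    print(dst, moved)
```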
How to Copy Files
Files can be copied in Python using the shutil.copy() function.
The shutil.copy() function takes a source file path and a destination file path.
If the destination file path is a directory, the source file will be copied into the destination directory.
The example below creates a new file in the current working directory and then copies it to a file with a different name in the current working directory.
```python
# SuperFastPython.com
# copy a file
from shutil import copy
# create a new file in the current working directory
with open('copy_file.txt', 'x', encoding='utf-8') as handle:
    pass
# copy the file
copy('copy_file.txt', 'copy_file2.txt')
```
Running the example creates a file named “copy_file.txt” in the current working directory then copies it to a new file with the name “copy_file2.txt” in the current working directory.
Alternatives to the shutil.copy() function include os.sendfile(), shutil.copyfileobj(), shutil.copyfile(), and shutil.copy2().
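The sketch below shows two of the behaviors described in this section: copying into a directory, where the filename is kept, and using shutil.copy2(), which also attempts to preserve file metadata. The 'backup' directory name is an assumption for the example.

```python
# copy a file into a directory, then copy it again with copy2()
from os import makedirs
from os.path import basename, join
from shutil import copy, copy2
from tempfile import TemporaryDirectory

with TemporaryDirectory() as directory:
    # create a source file with some content
    src = join(directory, 'copy_file.txt')
    with open(src, 'w', encoding='utf-8') as handle:
        handle.write('some data')
    backup = join(directory, 'backup')
    makedirs(backup)
    # the destination is a directory, so the filename is kept
    copied = copy(src, backup)
    # copy2() also attempts to preserve metadata such as modification time
    copied2 = copy2(src, join(backup, 'copy_file2.txt'))
    # confirm the content arrived intact
    with open(copied, 'r', encoding='utf-8') as handle:
        content = handle.read()
    print(basename(copied), basename(copied2), content)
```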
How to Rename Files
Files can be renamed in Python using the os.rename() function.
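The os.rename() function takes a source file path and a destination file path.
For example:
```python
...
# rename a file
rename('src.ext', 'dst.ext')
```
If the destination file path already exists, the behavior depends on the platform: on Windows an error is raised, whereas on POSIX systems an existing destination file is silently replaced if the user has permission.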
The example below will create a new file in the current working directory and will then rename it.
```python
# SuperFastPython.com
# rename a file
from os import rename
# create a new file in the current working directory
with open('rename_file.txt', 'x', encoding='utf-8') as handle:
    pass
# rename the file
rename('rename_file.txt', 'test_file.txt')
```
Running the example creates a new file named “rename_file.txt” in the current working directory, then renames the file to “test_file.txt”.
```
test_file.txt
```
Alternatives to the os.rename() function include os.renames(), os.replace(), and shutil.move().
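When you want an existing destination file to be overwritten on every platform, os.replace() may be a better fit. The sketch below is a minimal illustration; the file names 'new.txt' and 'current.txt' are made up.

```python
# rename a file over an existing destination with os.replace()
from os import replace
from os.path import exists, join
from tempfile import TemporaryDirectory

with TemporaryDirectory() as directory:
    src = join(directory, 'new.txt')
    dst = join(directory, 'current.txt')
    # create both files so that the destination already exists
    with open(src, 'w', encoding='utf-8') as handle:
        handle.write('new content')
    with open(dst, 'w', encoding='utf-8') as handle:
        handle.write('old content')
    # replace() renames src to dst, overwriting dst if it already exists
    replace(src, dst)
    with open(dst, 'r', encoding='utf-8') as handle:
        content = handle.read()
    src_exists = exists(src)
    print(content, src_exists)
```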
How to Delete Files
Files can be deleted in Python via the os.remove() function.
The os.remove() function takes the path of the file to delete.
For example:
```python
...
# delete a file
remove('/path/to/file.ext')
```
If the file does not exist, an error is raised. Similarly, if the path is a directory an error is raised.
The example below creates a file in the current working directory, then deletes it.
```python
# SuperFastPython.com
# delete a file
from os import remove
# create a new file in the current working directory
with open('delete_file.txt', 'x', encoding='utf-8') as handle:
    pass
# delete the file
remove('delete_file.txt')
```
Running the example creates a new file with the name “delete_file.txt” in the current working directory, then deletes it.
Alternatives to the os.remove() function include os.unlink(). Related functions for deleting directories include os.rmdir(), os.removedirs(), and shutil.rmtree().
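Since deleting a missing file raises an error, a script that may run more than once can catch FileNotFoundError. The sketch below shows one way to do this, and also deletes an empty directory with os.rmdir(). The helper name remove_if_exists() is hypothetical.

```python
# delete a file without failing if it is already gone
from os import makedirs, remove, rmdir
from os.path import exists, join
from tempfile import TemporaryDirectory

def remove_if_exists(path):
    """Hypothetical helper: delete a file, return False if it was missing."""
    try:
        remove(path)
        return True
    except FileNotFoundError:
        return False

with TemporaryDirectory() as directory:
    path = join(directory, 'delete_file.txt')
    with open(path, 'x', encoding='utf-8') as handle:
        pass
    # the first call deletes the file, the second finds nothing to delete
    first = remove_if_exists(path)
    second = remove_if_exists(path)
    # an empty directory is deleted with rmdir() rather than remove()
    subdir = join(directory, 'empty')
    makedirs(subdir)
    rmdir(subdir)
    dir_exists = exists(subdir)
    print(first, second, dir_exists)
```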
How to Get Path Base Name
The base filename can be retrieved in Python via the os.path.basename() function.
A path string to a file or directory may contain one or more directories. The file or directory at the end of the path is called the basename.
The os.path.basename() function takes a path string and returns the basename file string.
For example:
```python
...
# get the basename of a path
base = basename('/path/to/a/file.ext')  # returns file.ext
```
The function operates on the string directly, so the path or path-like string does not need to exist on disk.
The example below returns the basename of a contrived long path string.
```python
# SuperFastPython.com
# get the basename of a path
from os.path import basename
# path to a file
path = '/this/is/the/path/to/a/file.ext'
# get the basename from the path
name = basename(path)
print(name)
```
Running the example reports the basename of the path which is the file component at the end of the path in this case.
```
file.ext
```
Alternatives to the os.path.basename() function include os.path.split().
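For instance, os.path.split() returns both the directory part and the basename in one call. The short sketch below reuses the contrived path from this section and checks the result against os.path.dirname() and os.path.basename().

```python
# split a path into its directory part and its basename
from os.path import basename, dirname, split

# the same contrived path string, which does not need to exist on disk
path = '/this/is/the/path/to/a/file.ext'
# split() returns a (head, tail) tuple
head, tail = split(path)
print(head)
print(tail)
# the two parts match what dirname() and basename() return
same = (head, tail) == (dirname(path), basename(path))
print(same)
```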
How to Join Paths
A file or directory can be joined to an existing path string via the os.path.join() function.
The os.path.join() function takes a path and one or more additional directories and/or files to join to the path.
For example:
```python
...
# join a filename to a directory
path = join('/path/to/', 'file.ext')  # returns /path/to/file.ext
```
Different operating systems use different characters to delimit directories and files in a path string, such as ‘/’ and ‘\’.
The join() function provides a platform agnostic way of constructing string paths in Python using the appropriate path delimiter for the platform on which the program is being run.
The example below will construct a path with a directory and ending in a file.
```python
# SuperFastPython.com
# construct a path
from os.path import join
# create a path
name = join('tmp', 'file.ext')
print(name)
```
Running the example constructs a path using the delimiter that is appropriate for the platform on which the Python script is being run.
In this case, it is being run on macOS where paths are delimited by ‘/’.
```
tmp/file.ext
```
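To see the platform difference without switching machines, the sketch below calls the join() function from the posixpath and ntpath standard library modules, which implement os.path for POSIX and Windows respectively. The extra 'data' component is made up to show that more than two parts can be joined.

```python
# compare how paths are joined on POSIX and on Windows
import ntpath
import posixpath
from os.path import join

# join several components using the current platform's delimiter
path = join('tmp', 'data', 'file.ext')
# force the POSIX and Windows behavior regardless of the current platform
posix_path = posixpath.join('tmp', 'data', 'file.ext')
windows_path = ntpath.join('tmp', 'data', 'file.ext')
print(path)
print(posix_path)
print(windows_path)
```

In normal programs you would call os.path.join() and let Python pick the right implementation; the explicit modules are used here only for the comparison.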
How to Split Filename and Extension
The name and extension parts of a filename can be separated using the os.path.splitext() function.
The os.path.splitext() function takes a path string, such as a filename, and returns a tuple that contains the name part and the extension part of the provided string.
The name part will include any directory elements and the extension part will include the ‘.’ extension delimiter.
For example:
```python
...
# separate filename into name and extension
name, extension = splitext('filename.ext')  # returns ('filename', '.ext')
```
The example below separates a filename into the name and extension parts.
```python
# SuperFastPython.com
# separate a filename into name and extension
from os.path import splitext
# separate into name and extension
name, ext = splitext('filename.ext')
print(name, ext)
```
Running the example splits the filename ‘filename.ext’ into the name (‘filename’) and extension (‘.ext’) elements.
```
filename .ext
```
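A few edge cases are worth knowing. The sketch below, using made-up filenames, shows that directories stay in the name part, that only the last extension is split off, and that a leading dot or a missing extension yields an empty extension.

```python
# explore how splitext() treats a few less obvious filenames
from os.path import splitext

# directory elements stay in the name part
with_dir = splitext('/path/to/filename.ext')
# only the final extension is separated
archive = splitext('archive.tar.gz')
# a leading dot is treated as part of the name, not an extension
hidden = splitext('.bashrc')
# a filename without a dot has an empty extension
plain = splitext('README')
print(with_dir, archive, hidden, plain)
```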
How to Open a Zip File
A zip file can be opened by creating an instance of the zipfile.ZipFile class.
The zipfile.ZipFile class constructor takes the path to the zip file and the mode in which it is opened. Modes are similar to those used when opening files generally, such as ‘r’ for read mode and ‘w’ for write mode.
Once opened, operations can be performed on the ZipFile instance like adding or extracting files from the archive, and closing the file.
For example:
```python
...
# open a zip file
handle = ZipFile('file.zip', 'r')
# ...
handle.close()
```
Once a file is open it must be closed, especially if files are being added to the archive.
The ZipFile class supports a context manager for opening and closing the file. Operations can be performed on the ZipFile instance within the context manager block and the file will be closed automatically once the block is exited, normally or otherwise.
As such, the context manager is the preferred way to open and use ZipFile instances.
For example:
```python
...
# open a zip file
with ZipFile('file.zip', 'r') as handle:
    # ...
    pass
```
The example below creates a new zip file in the current working directory.
```python
# SuperFastPython.com
# create a new zip file
from zipfile import ZipFile
# create a zip file in the current directory
with ZipFile('archive.zip', 'x') as handle:
    pass
```
Running the example creates a new zip file in the current working directory with the name “archive.zip” and no content.
```
archive.zip
```
How to Add Files to Zip
Files can be added to a zip file in Python via the write() function on an open ZipFile instance.
The write() function takes the path of the file to add to the archive.
For example:
```python
...
# add a file to a zip archive
handle.write('/path/to/file.ext')
```
The example below creates a new file in the current working directory then adds that file to a new zip file archive.
```python
# SuperFastPython.com
# add files to a zip file
from zipfile import ZipFile
# open a file for writing text
with open('new_file.txt', 'w', encoding='utf-8') as handle:
    # write string data to the file
    handle.write('We are writing to file')
# create a zip file in the current directory
with ZipFile('file_archive.zip', 'w') as handle:
    # add a text file to the archive
    handle.write('new_file.txt')
```
Running the example first creates a text file in the current working directory with the name “new_file.txt” that contains a string of data.
A new zip file is then created in the current working directory with the name “file_archive.zip” and the file “new_file.txt” is then added.
```
file_archive.zip
```
Alternatives to the write() function include the writestr() function and shutil.make_archive() function.
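As a sketch of the writestr() alternative, the snippet below adds a member to an archive directly from a string in memory, then lists the member names with namelist(). The member name 'notes.txt' is made up for illustration.

```python
# add a member to a zip archive from in-memory data with writestr()
from os.path import join
from tempfile import TemporaryDirectory
from zipfile import ZipFile

with TemporaryDirectory() as directory:
    archive = join(directory, 'file_archive.zip')
    with ZipFile(archive, 'w') as handle:
        # no file on disk is needed, the data comes from the string
        handle.writestr('notes.txt', 'We are writing to file')
    # reopen the archive and list the names of its members
    with ZipFile(archive, 'r') as handle:
        names = handle.namelist()
    print(names)
```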
How to Extract a Zip File
Files can be extracted from a zip file in Python via the extract() function on an open ZipFile instance.
The extract() function takes the name of the member in the zip file to extract and, by default, extracts the file into the current working directory.
For example:
```python
...
# unzip a file from a zip file
handle.extract('filename.ext')
```
The example below creates a new file in the current working directory, then adds that file to a new zip file archive, then extracts the same file from the archive into the current working directory.
```python
# SuperFastPython.com
# extract files from a zip file
from zipfile import ZipFile
# open a file for writing text
with open('new_file.txt', 'w', encoding='utf-8') as handle:
    # write string data to the file
    handle.write('We are writing to file')
# create a zip file in the current directory
with ZipFile('file_archive.zip', 'w') as handle:
    # add a text file to the archive
    handle.write('new_file.txt')
# open the zip file for extracting files
with ZipFile('file_archive.zip', 'r') as handle:
    # extract the file from the archive, returning the extracted path
    path = handle.extract('new_file.txt')
```
Running the example first creates a text file in the current working directory with the name “new_file.txt” that contains a string of data. A new zip file is then created in the current working directory with the name “file_archive.zip” and the file “new_file.txt” is then added.
Finally, the file “new_file.txt” is extracted from the archive “file_archive.zip” into the current working directory.
```
new_file.txt
```
Alternatives to the extract() function include extractall() and read().
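The sketch below illustrates both alternatives: read() returns the bytes of a single member without writing anything to disk, and extractall() extracts every member into a target directory. The member names and the 'unzipped' directory are assumptions for the example.

```python
# read one member into memory and extract all members to a directory
from os import listdir
from os.path import join
from tempfile import TemporaryDirectory
from zipfile import ZipFile

with TemporaryDirectory() as directory:
    # build a small archive with two members from in-memory strings
    archive = join(directory, 'file_archive.zip')
    with ZipFile(archive, 'w') as handle:
        handle.writestr('a.txt', 'first')
        handle.writestr('b.txt', 'second')
    target = join(directory, 'unzipped')
    with ZipFile(archive, 'r') as handle:
        # read() returns the bytes of one member without writing to disk
        data = handle.read('a.txt')
        # extractall() extracts every member under the given directory
        handle.extractall(target)
    extracted = sorted(listdir(target))
    print(data, extracted)
```

Only extract archives from sources you trust, as member names in an untrusted archive may not be what you expect.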
Further Reading
This section provides additional resources that you may find helpful.
Books
- Concurrent File I/O in Python, Jason Brownlee (my book!)
Python File I/O APIs
- Built-in Functions
- os - Miscellaneous operating system interfaces
- os.path - Common pathname manipulations
- shutil - High-level file operations
- zipfile — Work with ZIP archives
- Python Tutorial: Chapter 7. Input and Output
Python Concurrency APIs
- threading — Thread-based parallelism
- multiprocessing — Process-based parallelism
- concurrent.futures — Launching parallel tasks
- asyncio — Asynchronous I/O
Takeaways
In this tutorial you discovered how to manipulate files in Python.
- How to list, open, read, write, rename, delete, move and copy files in Python.
- How to manipulate paths in Python, such as getting the basename, splitting off the file extension, and constructing paths.
- How to create zip files, and add files to an archive and unzip files from an archive.
Do you have any questions?
Leave your question in a comment below and I will reply fast with my best advice.