When working with Python, you’ll often need to access files and directories on your computer. Python includes a useful os module that gives you access to your computer or server’s underlying file system, so you can search for files and list the directories and files found at a given path.
In this tutorial, you will learn how to list files and directories with Python using the os module. It’s easy to use and a powerful tool when creating Python automations.
To get started, open a Python script or Jupyter notebook and import the os module. The os module is part of the Python standard library, so it does not need to be installed separately. It provides a way to interact with the operating system and perform tasks such as listing files and directories.
import os
The listdir() function in the os module lists all files and directories in a given directory. It takes the path to the directory as an optional argument and, when called without one, uses the current working directory. The names are returned in arbitrary order. The following example lists all files and directories in the current working directory.
files = os.listdir()
files
To list the contents of a specific directory, we can pass the path to that directory as an argument to the listdir() function. The following example lists all files and directories in the directory named directory1.
files = os.listdir('directory1')
files
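Because listdir() returns names in arbitrary order, it can help to sort the result before displaying or comparing it. The following is a minimal sketch of that idea. The helper name list_entries and the throwaway directory built with tempfile are assumptions made for illustration only, so the example runs without a real directory1 on disk.

```python
import os
import tempfile


def list_entries(path="."):
    """Return the names found in `path`, sorted alphabetically."""
    # os.listdir() makes no promise about ordering, so sort explicitly.
    return sorted(os.listdir(path))


# Build a small temporary directory so the example runs anywhere.
with tempfile.TemporaryDirectory() as demo_dir:
    os.mkdir(os.path.join(demo_dir, "reports"))
    for name in ("b.csv", "a.txt"):
        open(os.path.join(demo_dir, name), "w").close()
    entries = list_entries(demo_dir)

print(entries)  # ['a.txt', 'b.csv', 'reports']
```

Note that the result still mixes files and directories, exactly as listdir() does.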
Another common task in data science is listing only files with a specific file extension or file suffix, such as .csv. We can use the listdir() function in combination with a list comprehension to achieve this. The following example lists all files with the .csv extension in the current working directory.
csv_files = [file for file in os.listdir() if file.endswith('.csv')]
csv_files
We can use the same method to return only the files with a specific file extension in a specific directory. The following example lists all files with the .csv extension in the directory named directory1 by passing this directory as an argument to the listdir() function.
csv_files = [file for file in os.listdir('directory1') if file.endswith('.csv')]
csv_files
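The endswith() check above is case-sensitive, so a file named DATA.CSV would be skipped. One possible way to handle that, and to match several extensions at once, is sketched below. It relies on the fact that str.endswith() accepts a tuple of suffixes. The function name list_by_extension and the sample file names are hypothetical and exist only for this example.

```python
import os
import tempfile


def list_by_extension(path, extensions):
    """Return names in `path` ending with any of `extensions`, ignoring case."""
    # str.endswith() accepts a tuple, so several suffixes can be tested at once.
    suffixes = tuple(ext.lower() for ext in extensions)
    return sorted(
        name for name in os.listdir(path) if name.lower().endswith(suffixes)
    )


# Build a small temporary directory so the example runs anywhere.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("sales.csv", "OLD.CSV", "notes.txt"):
        open(os.path.join(demo_dir, name), "w").close()
    csv_only = list_by_extension(demo_dir, [".csv"])
    csv_and_txt = list_by_extension(demo_dir, [".csv", ".txt"])

print(csv_only)     # ['OLD.CSV', 'sales.csv']
print(csv_and_txt)  # ['OLD.CSV', 'notes.txt', 'sales.csv']
```

As with the list comprehension above, this matches on the name alone, so a directory whose name ends in .csv would also be returned.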
By default, the listdir() function returns all directories and files found at a given path, or those in the current working directory if no path is specified. To get a list of only the files within a directory, and not any directories also present, we can use the isfile() function in combination with a list comprehension.
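The isfile() function in os.path takes a path as an argument and returns True if the path points to an existing regular file and False otherwise, for example when it points to a directory. Because listdir() returns bare names rather than full paths, each name is joined to the directory with os.path.join() before it is checked. The following example lists all files in the directory named directory1 and ignores any directories also present.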
files = [file for file in os.listdir('directory1') if os.path.isfile(os.path.join('directory1', file))]
files
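As an alternative sketch, the os module also provides scandir(), which yields entry objects that carry both the name and is_file() / is_dir() checks, so no os.path.join() call is needed. The example below separates files from directories in one pass. The function name split_entries and the temporary directory are assumptions made for illustration.

```python
import os
import tempfile


def split_entries(path):
    """Return (files, directories) found directly inside `path`."""
    files, directories = [], []
    # os.scandir() is used as a context manager so the iterator is closed.
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file():
                files.append(entry.name)
            elif entry.is_dir():
                directories.append(entry.name)
    return sorted(files), sorted(directories)


# Build a small temporary directory so the example runs anywhere.
with tempfile.TemporaryDirectory() as demo_dir:
    os.mkdir(os.path.join(demo_dir, "reports"))
    open(os.path.join(demo_dir, "a.txt"), "w").close()
    only_files, only_dirs = split_entries(demo_dir)

print(only_files)  # ['a.txt']
print(only_dirs)   # ['reports']
```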
To get a list of all files in a directory and all of its subdirectories, we can use the walk() function in the os module. The walk() function takes the path to a directory as an argument and returns a generator that yields a (dirpath, dirnames, filenames) tuple for each directory in the tree. Joining each filename to its dirpath gives the path to every file. The following example lists all files in the directory named directory1 and all subdirectories.
files = []
for dirpath, dirnames, filenames in os.walk('directory1'):
    for filename in filenames:
        # Join the directory being visited with the file name
        files.append(os.path.join(dirpath, filename))
files
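To make the shape of the walk() output easier to see, here is a small sketch that builds a nested temporary directory and collects every file path relative to the starting directory. The helper name walk_files and the sample tree are hypothetical. Sorting dirnames in place is an optional step used here only to make the traversal order predictable.

```python
import os
import tempfile


def walk_files(root):
    """Return every file path under `root`, relative to `root`."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting dirnames in place fixes the order in which walk() descends.
        dirnames.sort()
        for filename in sorted(filenames):
            full_path = os.path.join(dirpath, filename)
            paths.append(os.path.relpath(full_path, root))
    return paths


# Build a small nested tree so the example runs anywhere.
with tempfile.TemporaryDirectory() as demo_dir:
    os.makedirs(os.path.join(demo_dir, "sub", "deeper"))
    for parts in (("a.txt",), ("sub", "b.txt"), ("sub", "deeper", "c.csv")):
        open(os.path.join(demo_dir, *parts), "w").close()
    found = walk_files(demo_dir)

# Prints the three relative paths, using the platform's path separator.
print(found)
```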
Finally, let’s say you’ve got a whole load of directories and you want to search them all and return a list of all the files within them that have a specific file extension. We can use the walk() function in combination with an endswith() check to achieve this. The following example lists all files with the .txt extension in the directory named directory1 and all subdirectories.
txt_files = []
for dirpath, dirnames, filenames in os.walk('directory1'):
    for filename in filenames:
        # Keep only the files whose name ends with .txt
        if filename.endswith('.txt'):
            txt_files.append(os.path.join(dirpath, filename))
txt_files
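The same recursive search can also be written as a single nested list comprehension, which some readers may find more compact. The sketch below wraps it in a function. The name find_by_extension and the temporary directory tree are assumptions for illustration, not part of the os module.

```python
import os
import tempfile


def find_by_extension(root, extension):
    """Return sorted paths of all files under `root` ending with `extension`."""
    matches = [
        os.path.join(dirpath, filename)
        for dirpath, _dirnames, filenames in os.walk(root)
        for filename in filenames
        if filename.endswith(extension)
    ]
    return sorted(matches)


# Build a small nested tree so the example runs anywhere.
with tempfile.TemporaryDirectory() as demo_dir:
    os.mkdir(os.path.join(demo_dir, "sub"))
    for parts in (("notes.txt",), ("data.csv",), ("sub", "readme.txt")):
        open(os.path.join(demo_dir, *parts), "w").close()
    # Show paths relative to the temporary directory for readability.
    txt_found = [
        os.path.relpath(path, demo_dir)
        for path in find_by_extension(demo_dir, ".txt")
    ]

# Prints the two .txt paths, using the platform's path separator.
print(txt_found)
```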
Matt Clarke, Wednesday, October 12, 2022