How to rename columns in Pandas dataframes

Learn how to rename Pandas dataframe columns, add a prefix or suffix, convert them to lowercase or uppercase, and much more...

How to rename columns in Pandas dataframes
Picture by Roberto Nickson, Pexels.
11 minutes to read

Renaming Pandas dataframe columns is a common task for the data scientist. Neat, consistent column names make your dataframe easier to read and your code cleaner to write and maintain.

It’s a good idea to come up with your own consistent naming convention for formatting Pandas column names as it will tidy up your Pandas code and make it quicker to write code, as you’ll instantly know how the column name should be formatted.

In this quick tutorial, we’ll go over some example code to rename columns in Pandas dataframes. We’ll use the rename() function to set the column names, show how you can add prefixes and suffixes to column names, convert them to uppercase or lowercase, and even slugify them.

Import Pandas and create a dataframe

To get started, open a Jupyter notebook and import the Pandas package using import pandas as pd. Once that’s imported, we’ll create a Pandas dataframe from dictionary containing some content and the Pandas from_dict() function.

import pandas as pd
data = {'Model ': ['Defender 90', 'Defender 110', 'Defender 130'],
        'Wheel base': ['90 inches', '110 inches', '130 inches'], 
        'BHP': [140, 140, 140],
        'Likelihood to break down': ['High', 'High', 'High']
       }

df = pd.DataFrame.from_dict(data)
df
Model Wheel base BHP Likelihood to break down
0 Defender 90 90 inches 140 High
1 Defender 110 110 inches 140 High
2 Defender 130 130 inches 140 High

Rename Pandas columns

The most common way to rename columns in a Pandas dataframe is to use the rename() function. This is appended to the dataframe, i.e. df.rename() and is passed a dictionary that maps the current column name to the new column name.

Let’s change the Model column name to defender model, the Wheel base column name to wheelbase inches, the BHP column to bhp, and the Likelihood to break down column name to unreliability. Run the code and then print the head() to see the new column names.

df = df.rename(columns={
    'Model ': 'defender model',
    'Wheel base': 'wheelbase inches',
    'BHP': 'bhp',
    'Likelihood to break down': 'unreliability'
})
df.head()
defender model wheelbase inches bhp unreliability
0 Defender 90 90 inches 140 High
1 Defender 110 110 inches 140 High
2 Defender 130 130 inches 140 High

Convert Pandas column names to lowercase

Another common thing you may want to do is convert column names to lowercase. This can be done in a couple of different ways. The first way is to append the str.lower() function to the output of df.columns. The columns argument a list of column names on the Pandas dataframe, while appending str.lower() converts them to lowercase. Reassign the output back to df.columns and you’ll rename the column names to lowercase.

df.columns = df.columns.str.lower()
df.head()
defender model wheelbase inches bhp unreliability
0 Defender 90 90 inches 140 High
1 Defender 110 110 inches 140 High
2 Defender 130 130 inches 140 High

The cleanest way to rename column names is to combine the rename() function with str.lower(). To do this we reassign df.rename(columns=str.lower) back to df. Alternatively, you can use inplace=True and skip the df = bit at the beginning.

df = df.rename(columns=str.lower)
df.head()
defender model wheelbase inches bhp unreliability
0 Defender 90 90 inches 140 High
1 Defender 110 110 inches 140 High
2 Defender 130 130 inches 140 High

Convert Pandas column names to uppercase

The process for convert Pandas column names to uppercase is exactly the same as for lowercase, you just need to replace the lower() function with upper() instead.

df.columns = df.columns.str.upper()
df.head()
DEFENDER MODEL WHEELBASE INCHES BHP UNRELIABILITY
0 Defender 90 90 inches 140 High
1 Defender 110 110 inches 140 High
2 Defender 130 130 inches 140 High

You can also pass str.upper to the columns argument of rename() to perform the task of converting column names to uppercase using slightly cleaner code.

df = df.rename(columns=str.upper)
df.head()
DEFENDER MODEL WHEELBASE INCHES BHP UNRELIABILITY
0 Defender 90 90 inches 140 High
1 Defender 110 110 inches 140 High
2 Defender 130 130 inches 140 High

Add a prefix to Pandas column names

It’s also possible to add a prefix to column names using the add_prefix() function. You can apply this to specific columns if you provide a subset, or you can append add_prefix() to the dataframe and then pass in the prefix string. This is quite a common one during feature engineering, especially when working with highly dimensional datasets.

df1 = df.add_prefix('prefix_')
df1.head()
prefix_DEFENDER MODEL prefix_WHEELBASE INCHES prefix_BHP prefix_UNRELIABILITY
0 Defender 90 90 inches 140 High
1 Defender 110 110 inches 140 High
2 Defender 130 130 inches 140 High

Add a suffix to Pandas column names

There’s also an add_suffix() function that works in exactly the same way and appends a string to the column name.

df2 = df.add_suffix('_suffix')
df2.head()
DEFENDER MODEL_suffix WHEELBASE INCHES_suffix BHP_suffix UNRELIABILITY_suffix
0 Defender 90 90 inches 140 High
1 Defender 110 110 inches 140 High
2 Defender 130 130 inches 140 High

Strip whitespace from Pandas column names

One thing that often causes confusion when working with third-party data that’s been imported from a spreadsheet is the presence of whitespace in the column name. You’ll filter the Pandas dataframe only for it to return no data, or an error, because someone has inadvertently included whitespace in the column name. You can strip whitespace from Pandas column names using df.columns.str.strip().str.lower().

df.columns = df.columns.str.strip().str.lower()
df.columns
Index(['defender model', 'wheelbase inches', 'bhp', 'unreliability'], dtype='object')

Slugify Pandas column names

Finally, one thing I like to do is to slugify column names. This removes whitespace and replaces it with underscores and ensures column names contain only letters, numbers, and underscores. It’s a great way to tidy up column names and keep them consistent.

df.columns = df.columns.str.lower().str.replace('[^0-9a-zA-Z]+','_',regex=True)
df.columns
Index(['defender_model', 'wheelbase_inches', 'bhp', 'unreliability'], dtype='object')

Matt Clarke, Friday, June 10, 2022

Matt Clarke Matt is an Ecommerce and Marketing Director who uses data science to help in his work. Matt has a Master's degree in Internet Retailing (plus two other Master's degrees in different fields) and specialises in the technical side of ecommerce and marketing.