As you add new columns to a Pandas dataframe, it often grows wide and the columns end up in an order that no longer makes sense. To make your dataframes easier to read, you’ll often need to rearrange the columns into a specific order.
There are a couple of ways to do this. In this quick tutorial I’ll show you two easy techniques for selecting the columns you want to display in a Pandas dataframe and putting them in the order you want.
To get started, open a Jupyter notebook and import the Pandas library. Then either import data into a dataframe or create a dataframe from scratch.
import pandas as pd
df = pd.DataFrame({'col1': [1, 2, 3],
                   'col2': [4, 5, 6],
                   'col3': [7, 8, 9],
                   'col4': [10, 11, 12]})
df
| | col1 | col2 | col3 | col4 |
|---|---|---|---|---|
| 0 | 1 | 4 | 7 | 10 |
| 1 | 2 | 5 | 8 | 11 |
| 2 | 3 | 6 | 9 | 12 |
The easiest way to reorder columns in a Pandas dataframe is to pass a list of column names, in the new order, to df[]. This rearranges the columns into the order you specify. If you only want to show some of the columns, include only those names in the list.
On its own, this only displays the dataframe with the columns rearranged in your chosen order. If you want to modify the original dataframe, you’ll need to assign the reordered dataframe back to the original dataframe variable.
df[['col2', 'col3', 'col1']]
| | col2 | col3 | col1 |
|---|---|---|---|
| 0 | 4 | 7 | 1 |
| 1 | 5 | 8 | 2 |
| 2 | 6 | 9 | 3 |
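To make the "display only" behaviour concrete, here is a minimal sketch that stores the selection in a separate variable (the name `reordered` is just for illustration) and checks that the original dataframe keeps its columns.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3],
                   'col2': [4, 5, 6],
                   'col3': [7, 8, 9],
                   'col4': [10, 11, 12]})

# Selecting with a list of names returns a new dataframe
reordered = df[['col2', 'col3', 'col1']]

# The new dataframe has only the chosen columns, in the chosen order
print(list(reordered.columns))  # ['col2', 'col3', 'col1']

# The original dataframe still has all four columns in their original order
print(list(df.columns))  # ['col1', 'col2', 'col3', 'col4']
```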
You can also assign the list of column names to a variable and pass that to df[] to reorder or rearrange the columns. If you assign the result back to the original dataframe variable, the original dataframe is replaced by one with the columns in your preferred order.
cols = ['col3', 'col1', 'col2', 'col4']
df = df[cols]
df
| | col3 | col1 | col2 | col4 |
|---|---|---|---|---|
| 0 | 7 | 1 | 4 | 10 |
| 1 | 8 | 2 | 5 | 11 |
| 2 | 9 | 3 | 6 | 12 |
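When a dataframe has many columns, typing the full list by hand gets tedious. One possible approach, sketched here with a hypothetical helper called `move_to_front` (not part of Pandas), is to build the list from df.columns and pass it to df[] as before.

```python
import pandas as pd


def move_to_front(frame, name):
    """Return a new dataframe with column `name` first (hypothetical helper)."""
    # Keep every other column in its existing relative order
    rest = [col for col in frame.columns if col != name]
    return frame[[name] + rest]


df = pd.DataFrame({'col1': [1, 2, 3],
                   'col2': [4, 5, 6],
                   'col3': [7, 8, 9],
                   'col4': [10, 11, 12]})

moved = move_to_front(df, 'col3')
print(list(moved.columns))  # ['col3', 'col1', 'col2', 'col4']
```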
You can also use the Pandas reindex() method to reorder the columns in a dataframe. When you pass a list of column names to its columns argument, reindex() returns a new dataframe with the columns in the order specified in the list.
df = df.reindex(columns=['col3', 'col1', 'col2'])
df
| | col3 | col1 | col2 |
|---|---|---|---|
| 0 | 7 | 1 | 4 |
| 1 | 8 | 2 | 5 |
| 2 | 9 | 3 | 6 |
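One difference between the two techniques is worth knowing: if the list contains a name that is not in the dataframe, reindex() adds that column filled with NaN, whereas df[] raises a KeyError in recent Pandas versions. The sketch below illustrates this with a made-up column name, `col9`.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3],
                   'col2': [4, 5, 6]})

# reindex() accepts an unknown label ('col9' is made up for this example)
# and creates the column filled with NaN instead of raising an error
with_missing = df.reindex(columns=['col2', 'col1', 'col9'])
print(with_missing['col9'].isna().all())  # True

# df[] with the same list raises KeyError because 'col9' does not exist
try:
    df[['col2', 'col1', 'col9']]
    raised = False
except KeyError:
    raised = True

print(raised)  # True
```

So a typo in a column name fails loudly with df[] but silently produces an empty column with reindex().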
If you want to reorder the dataframe columns so they are in alphabetical order, you can fetch the column names using df.columns, sort them using sorted(), and then pass the sorted column names to reindex() with axis=1.
df = df.reindex(sorted(df.columns), axis=1)
df
| | col1 | col2 | col3 |
|---|---|---|---|
| 0 | 1 | 4 | 7 |
| 1 | 2 | 5 | 8 |
| 2 | 3 | 6 | 9 |
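Note that sorted() compares strings case-sensitively, so uppercase names sort before lowercase ones. The sketch below uses made-up column names ('b', 'a', 'C') to show this, along with two variations: passing key=str.lower for a case-insensitive order, and sort_index(axis=1), which sorts the column labels directly.

```python
import pandas as pd

# Hypothetical column names with mixed case
df = pd.DataFrame({'b': [1], 'a': [2], 'C': [3]})

# Default sorted() is case-sensitive: 'C' comes before 'a' and 'b'
case_sensitive = df.reindex(sorted(df.columns), axis=1)
print(list(case_sensitive.columns))  # ['C', 'a', 'b']

# key=str.lower ignores case when ordering the names
case_insensitive = df.reindex(sorted(df.columns, key=str.lower), axis=1)
print(list(case_insensitive.columns))  # ['a', 'b', 'C']

# sort_index(axis=1) sorts the column labels without calling sorted() yourself
by_index = df.sort_index(axis=1)
print(list(by_index.columns))  # ['C', 'a', 'b']
```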
Matt Clarke, Tuesday, November 29, 2022