When working with a Pandas dataframe you’ll sometimes need to convert the dataframe or a series to a list or dictionary. Some operations are easier to perform on a plain list or dictionary than on a dataframe or series, and they can sometimes be more efficient too.
In this post, we’ll show you how to convert a Pandas dataframe or series to a list using the tolist() method and to a dictionary using the to_dict() method.
To get started, open a Jupyter notebook, import the Pandas library and create a dataframe.
import pandas as pd
df = pd.DataFrame({
    'model': ['XF', 'XE', 'XJ'],
    'top_speed': [120, 121, 145]
})
df
| | model | top_speed |
|---|---|---|
| 0 | XF | 120 |
| 1 | XE | 121 |
| 2 | XJ | 145 |
To convert a specific Pandas series or column to a list, reference the column and call its tolist() method. You can assign the output to a variable so it can be manipulated easily.
models = df['model'].tolist()
models
['XF', 'XE', 'XJ']
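As a small sketch of why the resulting list is convenient, the example below converts the numeric column instead and works on it with ordinary Python functions. The variable names speeds and fastest are illustrative and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'model': ['XF', 'XE', 'XJ'],
    'top_speed': [120, 121, 145]
})

# tolist() returns an ordinary Python list, so built-in functions work on it
speeds = df['top_speed'].tolist()
fastest = max(speeds)

# The elements are native Python ints rather than NumPy integers
print(type(speeds[0]))
print(fastest)
```

Because the values come back as native Python types, the list can usually be passed straight to code that knows nothing about Pandas or NumPy.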
The easiest way to convert a Pandas dataframe to a list of lists is to access the dataframe’s values attribute and then call tolist() on the result. This returns a list of lists in which each inner list holds the values from one row of the dataframe.
df.values.tolist()
[['XF', 120], ['XE', 121], ['XJ', 145]]
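The list of lists does not include the column names. If you need them, one possible approach is to prepend the column labels as a header row, as in this minimal sketch. The names rows and table are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'model': ['XF', 'XE', 'XJ'],
    'top_speed': [120, 121, 145]
})

# Each inner list is one row of the dataframe
rows = df.values.tolist()

# Prepend the column labels so the header travels with the data
table = [df.columns.tolist()] + rows
print(table)
```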
Pandas now recommends using the to_numpy() method instead of the values attribute. Both return the dataframe’s data as a NumPy array, so you call tolist() on the result in exactly the same way.
df.to_numpy().tolist()
[['XF', 120], ['XE', 121], ['XJ', 145]]
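To illustrate what to_numpy() produces before tolist() is called, here is a short sketch. It assumes the same example dataframe; the names mixed and numeric are hypothetical. Since the dataframe mixes text and numbers, the full array uses the generic object dtype, while a selection of only the numeric column keeps a numeric dtype.

```python
import pandas as pd

df = pd.DataFrame({
    'model': ['XF', 'XE', 'XJ'],
    'top_speed': [120, 121, 145]
})

# Text and integer columns together are stored in an object array
mixed = df.to_numpy()
print(mixed.dtype)

# Selecting only the numeric column gives a two-dimensional integer array
numeric = df[['top_speed']].to_numpy()
print(numeric.tolist())
```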
When exporting a Pandas dataframe to another format, it’s often more useful to convert it to a dictionary instead of a list, since this keeps the column headers and makes the data easier to manipulate. The to_dict() method makes this very easy. By default, it returns a dictionary keyed by column name, in which each value is itself a dictionary mapping index labels to cell values.
data = df.to_dict()
data
{'model': {0: 'XF', 1: 'XE', 2: 'XJ'}, 'top_speed': {0: 120, 1: 121, 2: 145}}
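The nested layout shown above is only the default. The to_dict() method also accepts an orient argument, and the sketch below shows two commonly used values, 'records' and 'list'. The variable names records and columns are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'model': ['XF', 'XE', 'XJ'],
    'top_speed': [120, 121, 145]
})

# One dictionary per row, which suits JSON-style output
records = df.to_dict(orient='records')
print(records)

# One list per column, with the index labels dropped
columns = df.to_dict(orient='list')
print(columns)
```

The 'records' layout is often the more convenient one when each row represents a separate item, while 'list' stays closest to the way the dataframe was originally constructed.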
Matt Clarke, Saturday, November 05, 2022