While the Pandas drop() method is probably the most common way to remove columns from a Pandas dataframe, there is a lesser-known alternative: pop(). The pop() method removes a single column from a dataframe in place and, unlike drop(), returns the removed column as a series, so you can assign it to a variable for further processing.
In this quick and easy tutorial, I’ll show you how to use the Pandas pop() method to drop a column from a dataframe and capture the dropped series in a variable for later use.
First, open a Jupyter notebook, import the Pandas package, and then either import data into a dataframe or create one from scratch. I’ve created one below that includes the age, length, and weight of various fish species.
import pandas as pd
df = pd.DataFrame(
[('Pterophyllum altum', 3, 12.5, 13.3),
('Pterophyllum scalare', 2, 10.0, 11.0),
('Pterophyllum leopoldi', 1, 8.0, 9.0)],
columns=['species', 'age', 'length', 'weight']
)
df
| | species | age | length | weight |
|---|---|---|---|---|
| 0 | Pterophyllum altum | 3 | 12.5 | 13.3 |
| 1 | Pterophyllum scalare | 2 | 10.0 | 11.0 |
| 2 | Pterophyllum leopoldi | 1 | 8.0 | 9.0 |
To remove a column from the dataframe above with pop(), we simply pass the column name as an argument. Calling df.pop('weight') removes the weight column from the original dataframe and returns it as a series. If you reprint df, you’ll see that the column has been removed.
df.pop('weight')
0 13.3
1 11.0
2 9.0
Name: weight, dtype: float64
df
| | species | age | length |
|---|---|---|---|
| 0 | Pterophyllum altum | 3 | 12.5 |
| 1 | Pterophyllum scalare | 2 | 10.0 |
| 2 | Pterophyllum leopoldi | 1 | 8.0 |
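For comparison, here is a minimal sketch of how drop() and pop() differ when applied to the same fish data. The variable names fish, without_weight, and weight are illustrative and are not part of the example above. By default, drop() returns a new dataframe and leaves the original untouched, whereas pop() modifies the dataframe in place and returns the removed column.

```python
import pandas as pd

# Illustrative data, mirroring the fish dataframe used in this tutorial
fish = pd.DataFrame(
    [('Pterophyllum altum', 3, 12.5, 13.3),
     ('Pterophyllum scalare', 2, 10.0, 11.0),
     ('Pterophyllum leopoldi', 1, 8.0, 9.0)],
    columns=['species', 'age', 'length', 'weight']
)

# drop() returns a new dataframe by default; fish itself is unchanged
without_weight = fish.drop(columns=['weight'])
print(list(without_weight.columns))  # ['species', 'age', 'length']
print(list(fish.columns))            # ['species', 'age', 'length', 'weight']

# pop() removes the column from fish in place and returns it as a Series
weight = fish.pop('weight')
print(list(fish.columns))            # ['species', 'age', 'length']
print(weight.tolist())               # [13.3, 11.0, 9.0]
```

Note that pop() accepts one column label per call, so drop() is usually the more convenient choice when several columns need to go at once.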
Because pop() returns the dropped column as a series, you can also capture it instead of discarding it. This is useful when you want to extract a column from the dataframe and use it for further processing. To do this, simply assign the result of pop() to a variable.
Calling data = df.pop('length') drops the length column from our dataframe and assigns it to a variable called data, leaving us with a dataframe that no longer contains the length column.
data = df.pop('length')
data
0 12.5
1 10.0
2 8.0
Name: length, dtype: float64
df
| | species | age |
|---|---|---|
| 0 | Pterophyllum altum | 3 |
| 1 | Pterophyllum scalare | 2 |
| 2 | Pterophyllum leopoldi | 1 |
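As a rough sketch of what "further processing" might look like, the example below pops the length column, converts it from an assumed unit of centimetres to millimetres, and writes the result back under a new name. The unit, the length_mm column name, and the fish variable are assumptions made purely for illustration, since the data above does not state a unit. The final step shows that popping a column that no longer exists raises a KeyError.

```python
import pandas as pd

# Illustrative data, mirroring the fish dataframe used in this tutorial
fish = pd.DataFrame(
    [('Pterophyllum altum', 3, 12.5),
     ('Pterophyllum scalare', 2, 10.0),
     ('Pterophyllum leopoldi', 1, 8.0)],
    columns=['species', 'age', 'length']
)

# Pop the column and keep the returned Series for further work
length = fish.pop('length')

# Hypothetical processing step: treat the values as centimetres
# and convert them to millimetres
length_mm = length * 10

# Put the processed values back under a new (illustrative) column name
fish['length_mm'] = length_mm
print(list(fish.columns))          # ['species', 'age', 'length_mm']
print(fish['length_mm'].tolist())  # [125.0, 100.0, 80.0]

# The original column is gone, so popping it again raises KeyError
try:
    fish.pop('length')
except KeyError:
    print('length has already been removed')
```

If the dataframe might not contain the column, checking 'length' in fish.columns before calling pop() avoids the exception.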
Matt Clarke, Saturday, November 26, 2022