How to use the Pandas truncate() function

Pandas comes with two functions called truncate() that allow you to crop the top, bottom, or middle of a dataframe based on the index or values.

How to use the Pandas truncate() function
Picture by Marcus Spiske, Pexels.
10 minutes to read

Have you ever needed to chop the top or bottom off a Pandas dataframe, or extract a specific section from the middle? If so, there’s a Pandas function called truncate() to let you perform this task with ease.

There are technically two truncate() functions in Pandas: pd.DataFrame.truncate() works on the dataframe level, while pd.Series.truncate() works on the series level.

In this quick tutorial I’ll show you some code examples showing how you can use the pd.DataFrame.truncate() and pd.Series.truncate() methods to truncate a Pandas dataframe or series by removing rows from before, after, or on either side of specific index values.

pd.DataFrame.truncate()

The Pandas pd.DataFrame.truncate() functions can be used to truncate data from a specific section of a dataframe based on the value of its index. You can use it to select all rows after a specific index position, all rows before a specific index position, or all rows between two index positions.

As the name suggests, the pd.DataFrame.truncate() method need to be appended to your Pandas dataframe object. When run, it will transform the dataframe by truncating the data, and it returns a new dataframe. As a result, it will work fine with the method chaining technique.

Create a Pandas dataframe

To get started, either import some data into a Pandas dataframe or create a dataframe containing some dummy data. Importantly, the pd.DataFrame.truncate() method uses the index of the dataframe, so you’ll need to use the set_index() function to change the date column to a date index.

import pandas as pd
df = pd.DataFrame({'date': ['2023-01-01', '2023-01-02', '2023-01-03', 
                            '2023-01-04', '2023-01-05', '2023-01-06', 
                            '2023-01-07', '2023-01-08', '2023-01-09'], 
                   'orders': [3432, 3233, 3434, 3235, 3436, 
                              3237, 3438, 3239, 3440]})
df = df.set_index('date')
df
orders
date
2023-01-01 3432
2023-01-02 3233
2023-01-03 3434
2023-01-04 3235
2023-01-05 3436
2023-01-06 3237
2023-01-07 3438
2023-01-08 3239
2023-01-09 3440

Use the before parameter to truncate previous rows

The before parameter takes an index value, which is a date in the case of our dataframe, and truncates all values before this one. If we pass before='2023-01-02 to truncate() it will drop the 2023-01-01 row and return a subset dataframe with all rows after that one.

df1 = df.truncate(before='2023-01-02')
df1
orders
date
2023-01-02 3233
2023-01-03 3434
2023-01-04 3235
2023-01-05 3436
2023-01-06 3237
2023-01-07 3438
2023-01-08 3239
2023-01-09 3440

Use the after parameter to truncate subsequent rows

Similarly, we can use the after attribute to drop any rows after a given index position. So, if we enter after='2023-01-06 all dates after this position in the index will be dropped from the subset of the dataframe returned.

df2 = df.truncate(after='2023-01-06')
df2
orders
date
2023-01-01 3432
2023-01-02 3233
2023-01-03 3434
2023-01-04 3235
2023-01-05 3436
2023-01-06 3237

Use before and after to return a section of a dataframe

You can combine the before and after parameters to select a subset of rows from within a Pandas dataframe based on their index values. If we pass before='2023-01-03' and after='2023-01-05' to the truncate() function it will return the rows where the index values are between these two dates.

df3 = df.truncate(before='2023-01-03', after='2023-01-05')
df3
orders
date
2023-01-03 3434
2023-01-04 3235
2023-01-05 3436

Truncating a dataframe with a numeric index

The examples above show how to truncate a dataframe that has a date index. However, you can use the same technique to truncate dataframes with a regular numeric index in the same way.

df_fish = pd.DataFrame(
    [('Pterophyllum altum', 'Cichlidae', 15, 12.3),
        ('Pterophyllum scalare', 'Cichlidae', 10, 9.2),
        ('Pterophyllum leopoldi', 'Cichlidae', 8, 6.5), 
        ('Corydoras paleatus', 'Callichthyidae', 5, 0.5),
        ('Corydoras aeneus', 'Callichthyidae', 7, 2.5),
        ('Corydoras trilineatus', 'Callichthyidae', 5, 1.5),
        ('Corydoras habrosus', 'Callichthyidae', 3, 0.5),
    ],
    columns=['species', 'family', 'length', 'weight']
)
df_fish
species family length weight
0 Pterophyllum altum Cichlidae 15 12.3
1 Pterophyllum scalare Cichlidae 10 9.2
2 Pterophyllum leopoldi Cichlidae 8 6.5
3 Corydoras paleatus Callichthyidae 5 0.5
4 Corydoras aeneus Callichthyidae 7 2.5
5 Corydoras trilineatus Callichthyidae 5 1.5
6 Corydoras habrosus Callichthyidae 3 0.5

To truncate the dataframe and remove all rows before index 3, we simply use before=3 and we’ll get back a subset dataframe containing rows 3 upwards.

df4 = df_fish.truncate(before=3)
df4
species family length weight
3 Corydoras paleatus Callichthyidae 5 0.5
4 Corydoras aeneus Callichthyidae 7 2.5
5 Corydoras trilineatus Callichthyidae 5 1.5
6 Corydoras habrosus Callichthyidae 3 0.5

Use axis to truncate columns

You can also use the truncate() method to truncate a dataframe horizontally and drop or remove columns from before, after, or on either side of specific column names. To do this you pass the column names to before and after and set the axis parameter to axis='columns'.

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

df.truncate(before='B', axis='columns')
B C
0 4 7
1 5 8
2 6 9

pd.Series.truncate()

The pd.Series.truncate() function is slightly different to pd.DataFrame.truncate(). This method is applied to a Pandas series or column and returns a new series with the rows removed based on your chosen before and after` parameter arguments.

For example, let’s say we wanted to return only the species series from our dataframe where the index values were between 2 and 4. To do this we’d append the pd.Series.truncate() method to our species series, as df_fish['species'].truncate() and then pass in before=2, after=4 as the arguments.

If you want to return a dataframe instead of a series, you can append the Pandas to_frame() function to the end and you’ll get back a dataframe.

df_fish = pd.DataFrame(
    [('Pterophyllum altum', 'Cichlidae', 15, 12.3),
        ('Pterophyllum scalare', 'Cichlidae', 10, 9.2),
        ('Pterophyllum leopoldi', 'Cichlidae', 8, 6.5), 
        ('Corydoras paleatus', 'Callichthyidae', 5, 0.5),
        ('Corydoras aeneus', 'Callichthyidae', 7, 2.5),
        ('Corydoras trilineatus', 'Callichthyidae', 5, 1.5),
        ('Corydoras habrosus', 'Callichthyidae', 3, 0.5),
    ],
    columns=['species', 'family', 'length', 'weight']
)

df_fish['species'].truncate(before=2, after=4)
2    Pterophyllum leopoldi
3       Corydoras paleatus
4         Corydoras aeneus
Name: species, dtype: object
df_fish['species'].truncate(before=2, after=4).to_frame()
species
2 Pterophyllum leopoldi
3 Corydoras paleatus
4 Corydoras aeneus

Matt Clarke, Tuesday, January 17, 2023

Matt Clarke Matt is an Ecommerce and Marketing Director who uses data science to help in his work. Matt has a Master's degree in Internet Retailing (plus two other Master's degrees in different fields) and specialises in the technical side of ecommerce and marketing.