How to reformat dates in Pandas

Learn how to use Python and Pandas to reformat dates and datetimes so you can display them in your reports or use them in your models.

How to reformat dates in Pandas
Picture by Olya Kobruseva, Pexels.
19 minutes to read

If you regularly work with time series data in Pandas it’s probable that you’ll sometimes need to convert dates or datetimes and extract additional features from them.

This could be something as simple as extracting the month name from a date to make your data visualisations clearer, or you could be calculating the day of the week to see if there’s a correlation with your model’s target variable.

Whatever you’re doing with dates and datetimes, Pandas makes most things relatively simple. The built in Series.dt class lets you access the values of a series in your dataframe and can return various date and time properties, such as the weekday, or quarter.

However, there are some things you can’t do in Pandas without resorting to additional bits of Python code, as the Series.dt datetime properties don’t support every date or time feature you’re likely to need. Here’s a detailed guide showing how you can extract or calculate these features from dates, datetimes, and strings containing dates to help you with your projects.

Load packages and data

To get started, open a new Jupyter notebook, import the pandas package, and create a simple dataframe containing a few dates. I’ve deliberately used a non-standard format for the dates here. The 2021/04/29 dates shown are strings, not datetimes, but we’ll convert them in our next step.

import pandas as pd
df = pd.DataFrame({
    'event': ['April Pay Day', 'Mother\'s Day', 'May Pay Day'],
    'event_date': ['2021/04/29', '2021/04/29', '2021/05/09']
})
df.head()
event event_date
0 April Pay Day 2021/04/29
1 Mother's Day 2021/04/29
2 May Pay Day 2021/05/09

Check the data types

As you can see from the output of the info() function, both columns in our dataframe are detected with the object dtype, so our first step is going to be to convert these string values to datetimes, as all date manipulations in Pandas and Python require this common format.

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   event       3 non-null      object
 1   event_date  3 non-null      object
dtypes: object(2)
memory usage: 176.0+ bytes

Changing a date string to a datetime

To change a string or object data type to a datetime or datetime64[ns] data type we can use the Pandas to_datetime() function. We’ll create a new column in our dataframe called date and assignt the output of to_datetime() to this. The second argument, format='%Y/%m/%d', tells Pandas that our dates are in the YYYY/MM/DD format, so they get correctly reformatted.

df['date'] = pd.to_datetime(df['event_date'], format='%Y/%m/%d')
df.head()
event event_date date
0 April Pay Day 2021/04/29 2021-04-29
1 Mother's Day 2021/04/29 2021-04-29
2 May Pay Day 2021/05/09 2021-05-09

To confirm that we’ve successfully completed our first step of converting our date string to a datetime, we can re-run info() and check that the Dtype column of the new date column is set to datetime64[ns]. Now this is in the right format, we can reformat it to a wide range of different date features.

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column      Non-Null Count  Dtype         
---  ------      --------------  -----         
 0   event       3 non-null      object        
 1   event_date  3 non-null      object        
 2   date        3 non-null      datetime64[ns]
dtypes: datetime64[ns](1), object(2)
memory usage: 200.0+ bytes

Extracting day specific elements from a date

There are two main ways to reformat dates and extract features from them in Pandas. You can use the Pandas Series.dt class, or you can use Python’s strftime() function. We’ll start with the Series.dt method. To use this method we’ll access the date column, append the dt method to it and assign the value to a new column.

These should be self-explanatory, but df['date'].dt.dayofweek returns the day of the week, df['date'].dt.dayofyear returns the day of the year, df['date'].dt.weekday returns the weekday, and df['date'].dt.isocalendar().day returns the weekday using an ISO week format. The other day features, such as the day name, need to be accessed using Python’s strftime() function, which we’ll tackle in a moment.

df = df[['date']]
df['dayofweek'] = df['date'].dt.dayofweek
df['dayofyear'] = df['date'].dt.dayofyear
df['weekday'] = df['date'].dt.weekday
df['iso_weekday'] = df['date'].dt.isocalendar().day
df.head()
date dayofweek dayofyear weekday iso_weekday
0 2021-04-29 3 119 3 4
1 2021-04-29 3 119 3 4
2 2021-05-09 6 129 6 7

Extracting week, month, and year elements from a date

Next, we’ll calculate the week, month, year, and quarter for each date in our dataframe. These all use dt, apart from week. This used to use dt.week, but this is now deprecated, and you should instead use dt.isocalendar().week.

df = df[['date']]
df['iso_week'] = df['date'].dt.isocalendar().week
df['year'] = df['date'].dt.year
df['month'] = df['date'].dt.month
df['quarter'] = df['date'].dt.quarter
df.head()
date iso_week year month quarter
0 2021-04-29 17 2021 4 2
1 2021-04-29 17 2021 4 2
2 2021-05-09 18 2021 5 2

Extracting ISO date features from a date

For ISO dates, there are a number of new functions that can be used to extract the week, year, and day. These are formatted differently to the regular Series.dt methods and need to use isocalendar().

df = df[['date']]
df['iso_year'] = df['date'].dt.isocalendar().year
df['iso_week'] = df['date'].dt.isocalendar().week
df['iso_weekday'] = df['date'].dt.isocalendar().day
df.head()
date iso_year iso_week iso_weekday
0 2021-04-29 2021 17 4
1 2021-04-29 2021 17 4
2 2021-05-09 2021 18 7

Creating custom features using strftime

Since Pandas can’t do everything, we’ll need another solution to calculate other date features and reformat dates in different ways. The strftime (short for “string from time”) function, lets you create custom date features from an existing datetime column in your dataframe.

df = df[['date']]
df['month_year'] = df['date'].dt.strftime('%b-%Y')
df.head()
date month_year
0 2021-04-29 Apr-2021
1 2021-04-29 Apr-2021
2 2021-05-09 May-2021

Python date formats

Python uses the 1989 C standard implementation for dates, which are common in most programming languages. These use a directive, such as %Y, to tell the language how a date element should be reformatted. The most commonly used date directives are below, but you can see the full list of these in the Python documentation.

Day

Date directive Description Example output
%a Weekday with locale abbreviation Sat, Sun, Mon
%A Full name of weekday for locale Sunday
%w Decimal weekday (Sun = 0, Sat = 6) 0, 1, 2
%d Day of month with zero-padding 01, 02, 03
%j Day of year with zero-padding 001, 123, 365
df = df[['date']]
df['weekday_abbreviated'] = df['date'].dt.strftime('%a')
df['weekday_full'] = df['date'].dt.strftime('%A')
df['weekday_decimal'] = df['date'].dt.strftime('%w')
df['day_of_month'] = df['date'].dt.strftime('%d')
df['day_of_year'] = df['date'].dt.strftime('%j')
df.head()
date weekday_abbreviated weekday_full weekday_decimal day_of_month day_of_year
0 2021-04-29 Thu Thursday 4 29 119
1 2021-04-29 Thu Thursday 4 29 119
2 2021-05-09 Sun Sunday 0 09 129

Week

One thing to look out for with week numbers is that all days in a new year preceding the first Sunday (%U) or Monday (%W) are assigned to week 0, which can cause confusion if you’re expecting them to be in week 1.

Date directive Description Example output
%W Week (starting Mon) with zero-padding 00, 02
%U Week (starting Sun) with zero-padding 01, 52
%V ISO 8601 week (starting Mon) with zero-padding 01, 52
df = df[['date']]
df['week_starting_monday'] = df['date'].dt.strftime('%W')
df['week_starting_sunday'] = df['date'].dt.strftime('%U')
df['iso_week_starting_monday'] = df['date'].dt.strftime('%V')
df.head()
date week_starting_monday week_starting_sunday iso_week_starting_monday
0 2021-04-29 17 17 17
1 2021-04-29 17 17 17
2 2021-05-09 18 19 18

Month

Date directive Description Example output
%b Month name with locale abbreviation Jan, Feb
%B Full name of month for locale January
%m Month with zero-padding 01, 02, 03
df = df[['date']]
df['month_name_abbreviated'] = df['date'].dt.strftime('%b')
df['month_name_full'] = df['date'].dt.strftime('%B')
df['month_number'] = df['date'].dt.strftime('%m')
df.head()
date month_name_abbreviated month_name_full month_number
0 2021-04-29 Apr April 04
1 2021-04-29 Apr April 04
2 2021-05-09 May May 05

Year

Date directive Description Example output
%Y Four digit year with century 2021, 2050
%y Two digit year with zero-padding 01, 20
df = df[['date']]
df['year_four_digit'] = df['date'].dt.strftime('%Y')
df['year_two_digit'] = df['date'].dt.strftime('%y')
df.head()
date year_four_digit year_two_digit
0 2021-04-29 2021 21
1 2021-04-29 2021 21
2 2021-05-09 2021 21

Custom date strings

You can also create custom date strings and date formats by adding a combination of date directives as arguments to strftime(). For example, df['date'].dt.strftime('%A, %d %B, %Y') will return a date in the following format Thursday, 29 April, 2021.

df = df[['date']]
df['example_1'] = df['date'].dt.strftime('%A, %d %B, %Y')
df['example_2'] = df['date'].dt.strftime('%B %d, %Y')
df['example_3'] = df['date'].dt.strftime('%b %d, %y')
df.head()
date example_1 example_2 example_3
0 2021-04-29 Thursday, 29 April, 2021 April 29, 2021 Apr 29, 21
1 2021-04-29 Thursday, 29 April, 2021 April 29, 2021 Apr 29, 21
2 2021-05-09 Sunday, 09 May, 2021 May 09, 2021 May 09, 21

Converting Google Analytics yearMonth

The Google Analytics yearMonth API dimension returns a concatenated year-month value in the format 202110 or %Y%m. To convert this to something more readable, such as October, 2021 we can run two conversions like this.

df['date'] = pd.to_datetime(df['yearMonth'], format='%Y%m')
df['period'] = df['date'].dt.strftime('%B, %Y')

Removing the time from a datetime

If your dataframe contains dates with a time appended via the datetime format, you can remove the time part and return the date without the time by using dt.date. This is handy if you need to group your data by date, as it removes granularity.

df['order_date'] = df['order_date'].dt.date

Convert a timestamp to a datetime

The to_datetime() function can be used to convert a timestamp integer, i.e. 1569324548, to a datetime by passing the unit='s' argument.

df['datetime'] = pd.to_datetime(df['timestamp'], unit='s')

Matt Clarke, Saturday, March 06, 2021

Matt Clarke Matt is an Ecommerce and Marketing Director who uses data science to help in his work. Matt has a Master's degree in Internet Retailing (plus two other Master's degrees in different fields) and specialises in the technical side of ecommerce and marketing.