Like most people who work in ecommerce and marketing, I spend a lot of time in Google Analytics. It’s a great tool, but when reporting on the numbers, it helps to extract the key metrics, analyse them, and display them in a different format. Historically, I’d done this in Google Sheets, but I’ve now switched completely to Pandas.
Since most of my reports in Google Sheets used the excellent Google Analytics API extension for Sheets, I wanted the same functionality in Pandas, so I created GAPandas, a Python package that lets you query the GA API from within a Python script or Jupyter notebook and display the output in a Pandas dataframe.
In this project, I’ll show you how you can use some new functions I recently added to GAPandas to speed up the process of creating my monthly ecommerce and marketing reports, and cut down on the amount of code I needed to write. You can use these to quickly create a plethora of reports customised to your own business.
To get started, open up Jupyter and install the latest version of GAPandas. This package lets you query your Google Analytics data via the API, just like you can in Google Sheets. Once installed, load up the package.
!pip3 install --upgrade gapandas
import gapandas as gp
To use GAPandas you’ll need to create a client secrets JSON key file to authenticate against the Google Analytics
API. Use the get_service()
function to get a service object.
service = gp.get_service('client-secret.json')
To save the hassle of re-entering the same values in various functions, I’d recommend creating a few variables to hold the view ID (i.e. 123456789), and the start and date for your reporting period. I set use a 13-month window in my monthly reports, since this shows the whole year and the same month last year.
view = '123456789'
start_date = '2020-02-01'
end_date = '2021-03-31'
Next, use the monthly_ecommerce_overview()
function to fetch the basic ecommerce metrics for your site. You’ll
need to provide the service
object, and the view
, start_date
, and end_date
variables. You can leave the
segment
and filters
option blank.
df_all = gp.monthly_ecommerce_overview(service, view, start_date, end_date, segment=None, filters=None)
df_all.head(13)
This function will return the year month period (i.e. March, 2021), the Entrances, Sessions, Pageviews, Transactions, Conversion rate, Revenue, and average order value or AOV for each month in the 13-month window, allowing you to scan the data easily for potential issues.
Period | Entrances | Sessions | Pageviews | Transactions | Conversion rate | Revenue | AOV | |
---|---|---|---|---|---|---|---|---|
0 | March, 2021 | 23,376 | 23,376 | 37,238 | 0 | 0.00% | £0 | £0.00 |
1 | February, 2021 | 17,005 | 17,005 | 27,056 | 0 | 0.00% | £0 | £0.00 |
2 | January, 2021 | 18,682 | 18,683 | 30,110 | 0 | 0.00% | £0 | £0.00 |
3 | December, 2020 | 16,838 | 16,838 | 27,841 | 0 | 0.00% | £0 | £0.00 |
4 | November, 2020 | 18,671 | 18,671 | 30,893 | 0 | 0.00% | £0 | £0.00 |
5 | October, 2020 | 20,839 | 20,839 | 36,128 | 0 | 0.00% | £0 | £0.00 |
6 | September, 2020 | 23,158 | 23,159 | 39,408 | 0 | 0.00% | £0 | £0.00 |
7 | August, 2020 | 26,461 | 26,461 | 44,705 | 0 | 0.00% | £0 | £0.00 |
8 | July, 2020 | 30,106 | 30,106 | 49,009 | 0 | 0.00% | £0 | £0.00 |
9 | June, 2020 | 32,476 | 32,476 | 53,217 | 0 | 0.00% | £0 | £0.00 |
10 | May, 2020 | 28,552 | 28,553 | 48,794 | 0 | 0.00% | £0 | £0.00 |
11 | April, 2020 | 15,555 | 15,555 | 26,681 | 0 | 0.00% | £0 | £0.00 |
12 | March, 2020 | 17,029 | 17,029 | 29,515 | 0 | 0.00% | £0 | £0.00 |
Creating reports for other ecommerce and marketing data is as easy as re-running the query with a different filter
or segment. To fetch the data for organic search, for example, we can create a filter with the value
ga:medium==organic
and we’ll see just the data on organic search for each period.
filter_organic = 'ga:medium==organic'
df_organic = gp.monthly_ecommerce_overview(service, view, start_date, end_date, segment=None, filters=filter_organic)
df_organic.head(13)
Period | Entrances | Sessions | Pageviews | Transactions | Conversion rate | Revenue | AOV | |
---|---|---|---|---|---|---|---|---|
0 | March, 2021 | 20,709 | 20,709 | 32,579 | 0 | 0.00% | £0 | £0.00 |
1 | February, 2021 | 15,205 | 15,205 | 23,843 | 0 | 0.00% | £0 | £0.00 |
2 | January, 2021 | 16,639 | 16,640 | 26,535 | 0 | 0.00% | £0 | £0.00 |
3 | December, 2020 | 14,996 | 14,996 | 24,550 | 0 | 0.00% | £0 | £0.00 |
4 | November, 2020 | 16,778 | 16,778 | 27,516 | 0 | 0.00% | £0 | £0.00 |
5 | October, 2020 | 18,609 | 18,609 | 32,051 | 0 | 0.00% | £0 | £0.00 |
6 | September, 2020 | 20,507 | 20,508 | 34,379 | 0 | 0.00% | £0 | £0.00 |
7 | August, 2020 | 23,282 | 23,282 | 39,194 | 0 | 0.00% | £0 | £0.00 |
8 | July, 2020 | 27,168 | 27,168 | 43,596 | 0 | 0.00% | £0 | £0.00 |
9 | June, 2020 | 29,048 | 29,048 | 47,109 | 0 | 0.00% | £0 | £0.00 |
10 | May, 2020 | 25,326 | 25,326 | 42,703 | 0 | 0.00% | £0 | £0.00 |
11 | April, 2020 | 13,256 | 13,256 | 22,123 | 0 | 0.00% | £0 | £0.00 |
12 | March, 2020 | 14,640 | 14,640 | 24,821 | 0 | 0.00% | £0 | £0.00 |
To see the data for paid search, we can create a filter containing the medium cpc
. This will return all paid
search traffic assigned to this medium, which typically includes both your Google Ads and Bing Ads sessions.
filter_cpc = 'ga:medium==cpc'
df_cpc = gp.monthly_ecommerce_overview(service, view, start_date, end_date, segment=None, filters=filter_cpc)
df_cpc.head(13)
Period | Entrances | Sessions | Pageviews | Transactions | Conversion rate | Revenue | AOV | |
---|---|---|---|---|---|---|---|---|
0 | March, 2021 | 0 | 0 | 0 | 0 | 0.00% | £0 | £0.00 |
1 | February, 2021 | 0 | 0 | 0 | 0 | 0.00% | £0 | £0.00 |
2 | January, 2021 | 0 | 0 | 0 | 0 | 0.00% | £0 | £0.00 |
3 | December, 2020 | 0 | 0 | 0 | 0 | 0.00% | £0 | £0.00 |
4 | November, 2020 | 0 | 0 | 0 | 0 | 0.00% | £0 | £0.00 |
5 | October, 2020 | 0 | 0 | 0 | 0 | 0.00% | £0 | £0.00 |
6 | September, 2020 | 0 | 0 | 0 | 0 | 0.00% | £0 | £0.00 |
7 | August, 2020 | 0 | 0 | 0 | 0 | 0.00% | £0 | £0.00 |
8 | July, 2020 | 0 | 0 | 0 | 0 | 0.00% | £0 | £0.00 |
9 | June, 2020 | 0 | 0 | 0 | 0 | 0.00% | £0 | £0.00 |
10 | May, 2020 | 0 | 0 | 0 | 0 | 0.00% | £0 | £0.00 |
11 | April, 2020 | 0 | 0 | 0 | 0 | 0.00% | £0 | £0.00 |
12 | March, 2020 | 0 | 0 | 0 | 0 | 0.00% | £0 | £0.00 |
We can fetch the direct (or untracked) data by setting the filter the medium (none)
.
filter_direct = 'ga:medium==(none)'
df_direct = gp.monthly_ecommerce_overview(service, view, start_date, end_date, segment=None, filters=filter_direct)
df_direct.head(13)
Period | Entrances | Sessions | Pageviews | Transactions | Conversion rate | Revenue | AOV | |
---|---|---|---|---|---|---|---|---|
0 | March, 2021 | 2,444 | 2,444 | 4,202 | 0 | 0.00% | £0 | £0.00 |
1 | February, 2021 | 1,626 | 1,626 | 2,863 | 0 | 0.00% | £0 | £0.00 |
2 | January, 2021 | 1,854 | 1,854 | 3,212 | 0 | 0.00% | £0 | £0.00 |
3 | December, 2020 | 1,666 | 1,666 | 2,948 | 0 | 0.00% | £0 | £0.00 |
4 | November, 2020 | 1,698 | 1,698 | 3,067 | 0 | 0.00% | £0 | £0.00 |
5 | October, 2020 | 1,795 | 1,795 | 3,395 | 0 | 0.00% | £0 | £0.00 |
6 | September, 2020 | 2,385 | 2,385 | 4,590 | 0 | 0.00% | £0 | £0.00 |
7 | August, 2020 | 2,919 | 2,919 | 5,099 | 0 | 0.00% | £0 | £0.00 |
8 | July, 2020 | 2,628 | 2,628 | 4,850 | 0 | 0.00% | £0 | £0.00 |
9 | June, 2020 | 3,175 | 3,175 | 5,640 | 0 | 0.00% | £0 | £0.00 |
10 | May, 2020 | 2,830 | 2,831 | 5,321 | 0 | 0.00% | £0 | £0.00 |
11 | April, 2020 | 1,949 | 1,949 | 3,746 | 0 | 0.00% | £0 | £0.00 |
12 | March, 2020 | 2,073 | 2,073 | 3,861 | 0 | 0.00% | £0 | £0.00 |
Finally, to fetch our email data, we create a filter containing the API filter command ga:medium==email
. You can
use the same approach to create individual reports for any data you wish to display in Pandas.
filter_email = 'ga:medium==email'
df_email = gp.monthly_ecommerce_overview(service, view, start_date, end_date, segment=None, filters=filter_email)
df_email.head(13)
Period | Entrances | Sessions | Pageviews | Transactions | Conversion rate | Revenue | AOV | |
---|---|---|---|---|---|---|---|---|
0 | March, 2021 | 3 | 3 | 12 | 0 | 0.00% | £0 | £0.00 |
1 | February, 2021 | 3 | 3 | 72 | 0 | 0.00% | £0 | £0.00 |
2 | January, 2021 | 3 | 3 | 5 | 0 | 0.00% | £0 | £0.00 |
3 | December, 2020 | 19 | 19 | 57 | 0 | 0.00% | £0 | £0.00 |
4 | November, 2020 | 6 | 6 | 7 | 0 | 0.00% | £0 | £0.00 |
5 | October, 2020 | 7 | 7 | 7 | 0 | 0.00% | £0 | £0.00 |
6 | September, 2020 | 14 | 14 | 35 | 0 | 0.00% | £0 | £0.00 |
7 | August, 2020 | 8 | 8 | 18 | 0 | 0.00% | £0 | £0.00 |
8 | July, 2020 | 10 | 10 | 30 | 0 | 0.00% | £0 | £0.00 |
9 | June, 2020 | 13 | 13 | 80 | 0 | 0.00% | £0 | £0.00 |
10 | May, 2020 | 13 | 13 | 41 | 0 | 0.00% | £0 | £0.00 |
11 | April, 2020 | 25 | 25 | 104 | 0 | 0.00% | £0 | £0.00 |
12 | March, 2020 | 30 | 30 | 234 | 0 | 0.00% | £0 | £0.00 |
Here’s an example of referral traffic data, which uses the medium referral
.
filter_referral = 'ga:medium==referral'
df_referral = gp.monthly_ecommerce_overview(service, view, start_date, end_date, segment=None, filters=filter_referral)
df_referral.head(13)
Period | Entrances | Sessions | Pageviews | Transactions | Conversion rate | Revenue | AOV | |
---|---|---|---|---|---|---|---|---|
0 | March, 2021 | 220 | 220 | 445 | 0 | 0.00% | £0 | £0.00 |
1 | February, 2021 | 168 | 168 | 271 | 0 | 0.00% | £0 | £0.00 |
2 | January, 2021 | 186 | 186 | 358 | 0 | 0.00% | £0 | £0.00 |
3 | December, 2020 | 157 | 157 | 286 | 0 | 0.00% | £0 | £0.00 |
4 | November, 2020 | 189 | 189 | 303 | 0 | 0.00% | £0 | £0.00 |
5 | October, 2020 | 426 | 426 | 670 | 0 | 0.00% | £0 | £0.00 |
6 | September, 2020 | 252 | 252 | 404 | 0 | 0.00% | £0 | £0.00 |
7 | August, 2020 | 252 | 252 | 394 | 0 | 0.00% | £0 | £0.00 |
8 | July, 2020 | 300 | 300 | 533 | 0 | 0.00% | £0 | £0.00 |
9 | June, 2020 | 240 | 240 | 388 | 0 | 0.00% | £0 | £0.00 |
10 | May, 2020 | 383 | 383 | 729 | 0 | 0.00% | £0 | £0.00 |
11 | April, 2020 | 324 | 324 | 707 | 0 | 0.00% | £0 | £0.00 |
12 | March, 2020 | 286 | 286 | 599 | 0 | 0.00% | £0 | £0.00 |
If you run promotional coupons, the monthly_coupons_overview()
function does a similar thing. For each period,
this returns the number of coupon transactions, the percentage of transactions via coupons, the coupon revenue
generated, the coupon and non-coupon AOV, and the uplift that coupons added to your site-wide AOV. (This site
doesn’t run coupons, so ignore the zeros.)
df_coupons = gp.monthly_coupons_overview(service, view, start_date, end_date)
df_coupons.head(13)
Period | Coupon transactions | Transactions via coupons | Coupon revenue | Revenue via coupons | Coupon AOV | Non-coupon AOV | Coupon AOV uplift | |
---|---|---|---|---|---|---|---|---|
0 | March, 2021 | 0 | nan% | £0 | nan% | £0.00 | £0.00 | £0.00 |
1 | February, 2021 | 0 | nan% | £0 | nan% | £0.00 | £0.00 | £0.00 |
2 | January, 2021 | 0 | nan% | £0 | nan% | £0.00 | £0.00 | £0.00 |
3 | December, 2020 | 0 | nan% | £0 | nan% | £0.00 | £0.00 | £0.00 |
4 | November, 2020 | 0 | nan% | £0 | nan% | £0.00 | £0.00 | £0.00 |
5 | October, 2020 | 0 | nan% | £0 | nan% | £0.00 | £0.00 | £0.00 |
6 | September, 2020 | 0 | nan% | £0 | nan% | £0.00 | £0.00 | £0.00 |
7 | August, 2020 | 0 | nan% | £0 | nan% | £0.00 | £0.00 | £0.00 |
8 | July, 2020 | 0 | nan% | £0 | nan% | £0.00 | £0.00 | £0.00 |
9 | June, 2020 | 0 | nan% | £0 | nan% | £0.00 | £0.00 | £0.00 |
10 | May, 2020 | 0 | nan% | £0 | nan% | £0.00 | £0.00 | £0.00 |
11 | April, 2020 | 0 | nan% | £0 | nan% | £0.00 | £0.00 | £0.00 |
12 | March, 2020 | 0 | nan% | £0 | nan% | £0.00 | £0.00 | £0.00 |
The monthly_google_ads_overview()
does much the same thing, but specifically looks for the medium cpc
and the
source google
, and returns a range of commonly reported PPC metrics, plus the Cost of Sale metric some retailers
prefer to use, which isn’t included in GA.
df_adwords = gp.monthly_google_ads_overview(service, view, start_date, end_date)
df_adwords.head(13)
Period | Entrances | Sessions | Transactions | Conversion rate | Revenue | AOV | Costs | CPC | COS | |
---|---|---|---|---|---|---|---|---|---|---|
0 | March, 2021 | 0 | 0 | 0 | 0.00% | £0 | £0.00 | £0 | £0.00 | nan% |
1 | February, 2021 | 0 | 0 | 0 | 0.00% | £0 | £0.00 | £0 | £0.00 | nan% |
2 | January, 2021 | 0 | 0 | 0 | 0.00% | £0 | £0.00 | £0 | £0.00 | nan% |
3 | December, 2020 | 0 | 0 | 0 | 0.00% | £0 | £0.00 | £0 | £0.00 | nan% |
4 | November, 2020 | 0 | 0 | 0 | 0.00% | £0 | £0.00 | £0 | £0.00 | nan% |
5 | October, 2020 | 0 | 0 | 0 | 0.00% | £0 | £0.00 | £0 | £0.00 | nan% |
6 | September, 2020 | 0 | 0 | 0 | 0.00% | £0 | £0.00 | £0 | £0.00 | nan% |
7 | August, 2020 | 0 | 0 | 0 | 0.00% | £0 | £0.00 | £0 | £0.00 | nan% |
8 | July, 2020 | 0 | 0 | 0 | 0.00% | £0 | £0.00 | £0 | £0.00 | nan% |
9 | June, 2020 | 0 | 0 | 0 | 0.00% | £0 | £0.00 | £0 | £0.00 | nan% |
10 | May, 2020 | 0 | 0 | 0 | 0.00% | £0 | £0.00 | £0 | £0.00 | nan% |
11 | April, 2020 | 0 | 0 | 0 | 0.00% | £0 | £0.00 | £0 | £0.00 | nan% |
12 | March, 2020 | 0 | 0 | 0 | 0.00% | £0 | £0.00 | £0 | £0.00 | nan% |
Most of the reports above can be customised by using filters and segments, but you can easily create totally custom
reports using GAPandas. Just check out the source code in the reports.py
script to see how it’s done and modify
the code to suit your needs.
Matt Clarke, Saturday, April 24, 2021