How to classify Google Search Console data in EcommerceTools

Learn how to use ABCD classification to classify Google Search Console page data in EcommerceTools and help your SEO team prioritise their work.

Picture by Monoar Rahman, Pexels.

ABC analysis originally came from the field of inventory management, where it’s used by procurement staff to classify inventory items into three categories - A, B, and C - to help them control their inventory and avoid costly stock-outs. However, the ABC analysis technique is extremely powerful in other fields too. I’ve regularly used ABC analysis for customer segmentation, and for the segmentation or classification of site content, and typically add a fourth class - D - to denote items with zero contribution.

In this post, I’ll show you how to use ABC analysis to classify Google Search Console data using my EcommerceTools library. EcommerceTools is a Python data science toolkit for ecommerce, marketing, and SEO, and it makes it really easy to query the Google Search Console API.

ABC analysis of Google Search Console data

ABC analysis works by sorting the data in descending order, calculating a cumulative sum of the values, and then assigning the items that make up the first 80% of the total to Class A, the next 10% to Class B, and the final 10% to Class C. We’ll be calculating the cumulative sum of the number of clicks each page generates, and our Class D will contain those pages generating zero clicks.
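To make the mechanics concrete, here is a minimal Pandas sketch of the approach described above. It is not the EcommerceTools implementation: the abcd_classify() function, the running_pc column name, the sample data, and the exact boundary handling (a page whose running share is at or below 80% counts as Class A) are all assumptions made for illustration.

```python
import pandas as pd


def abcd_classify(df, metric="clicks"):
    """Return a copy of df sorted by metric, with a running share and an ABCD class.

    Hypothetical helper for illustration; boundary handling is an assumption.
    """
    out = df.sort_values(metric, ascending=False).reset_index(drop=True)
    total = out[metric].sum()
    # Running share of the total, as a percentage
    out["running_pc"] = out[metric].cumsum() * 100 / total

    def label(row):
        if row[metric] == 0:
            return "D"  # zero contribution
        if row["running_pc"] <= 80:
            return "A"  # first 80% of the cumulative total
        if row["running_pc"] <= 90:
            return "B"  # next 10%
        return "C"      # final 10%

    out["class"] = out.apply(label, axis=1)
    return out


# Made-up pages and click counts
pages = pd.DataFrame({
    "page": ["/a", "/b", "/c", "/d", "/e", "/f"],
    "clicks": [50, 30, 10, 6, 4, 0],
})

classified = abcd_classify(pages)
print(classified[["page", "clicks", "running_pc", "class"]])
```

With this toy data the two biggest pages fall into Class A, the third into Class B, the next two into Class C, and the page with no clicks into Class D.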

While this might seem trivial, it’s surprisingly useful for helping SEO and content teams understand a site’s traffic and prioritise their work. The Class A pages need to be watched carefully because they carry the most risk: they generate the bulk of the clicks, so any drop in their rankings will have the biggest impact on traffic.

The Class D group could include pages that you don’t intend to rank for, or pages that are poorly optimised and need to be fixed. The Class B and Class C groups both include pages that could have room for improvement. Combine the ABCD class data with impressions, and you’ve got a powerful dataset for SEOs to use.
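As a rough illustration of combining classes with impressions, the sketch below picks out Class C and D pages that are seen often in search but rarely clicked. It assumes a dataframe with page, clicks, impressions, and class columns; the sample rows and the min_impressions cut-off are made up, so adjust them to suit your site.

```python
import pandas as pd

# Made-up sample data with the assumed column names
df = pd.DataFrame({
    "page": ["/guide", "/faq", "/old-post", "/tag/misc"],
    "clicks": [900, 12, 0, 0],
    "impressions": [12000, 8000, 5000, 40],
    "class": ["A", "C", "D", "D"],
})

min_impressions = 1000  # hypothetical cut-off, not a recommended value

# Low-class pages that still get plenty of impressions are candidates for
# better titles, meta descriptions, or content
opportunities = (
    df[df["class"].isin(["C", "D"]) & (df["impressions"] >= min_impressions)]
    .sort_values("impressions", ascending=False)
    .reset_index(drop=True)
)
print(opportunities)
```

Pages that appear in this list are being shown to searchers but aren't earning the click, which often makes them a quicker win than pages with no visibility at all.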

Install the packages

To get started, open a Jupyter notebook and import Pandas and my EcommerceTools package. If you don’t have EcommerceTools installed, you can install it from the PyPI Python package repository by entering the command pip3 install --upgrade ecommercetools.

!pip3 install --upgrade ecommercetools
import pandas as pd
from ecommercetools import seo

Configure the Google Search Console API

We’re going to use EcommerceTools to fetch data from the Google Search Console API and then classify it using the EcommerceTools SEO module. To connect to Google Search Console you’ll need to create a service account, download its JSON credentials file, and add the service account’s email address as a user on your Search Console property so it has permission to read the data. You’ll also need the site URL of your Search Console property. This will usually be in the format https://www.example.com, but if you have a domain property it will be in the sc-domain:example.com format instead. Finally, define the start and end dates for your analysis.

key = "pds-client-secrets.json"
site_url = "sc-domain:practicaldatascience.co.uk"
start_date = '2022-10-01'
end_date = '2022-10-31'
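Before calling the API, it can save some head-scratching to sanity check these settings. The check_settings() helper below is a hypothetical convenience function, not part of EcommerceTools, and it assumes the dates are given in YYYY-MM-DD format as shown above.

```python
from datetime import date
from pathlib import Path


def check_settings(key, site_url, start_date, end_date):
    """Return a list of problems found; an empty list means the settings look usable."""
    problems = []

    # The JSON credentials file needs to exist at the given path
    if not Path(key).is_file():
        problems.append(f"Key file not found: {key}")

    # URL-prefix properties start with http(s)://, domain properties with sc-domain:
    if not site_url.startswith(("http://", "https://", "sc-domain:")):
        problems.append(f"Unexpected site_url format: {site_url}")

    # Dates should parse as ISO dates and be in the right order
    try:
        if date.fromisoformat(start_date) > date.fromisoformat(end_date):
            problems.append("start_date is after end_date")
    except ValueError:
        problems.append("Dates must be in YYYY-MM-DD format")

    return problems


print(check_settings(
    "pds-client-secrets.json",
    "sc-domain:practicaldatascience.co.uk",
    "2022-10-01",
    "2022-10-31",
))
```

An empty list means nothing obvious is wrong. It doesn't prove the service account has access to the property, which you'll only find out when you query the API.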

Create an ABCD classification of Google Search Console data

Next, we’ll run the classify_pages() function. This will query the Google Search Console API, fetch all the pages within your start and end date range, and then classify them into the four categories of A, B, C, and D. Class A will comprise the pages that generate the first 80% of cumulative clicks, Class B the next 10%, Class C the final 10%, and Class D all those pages generating zero clicks.

To run the function, pass in the key variable containing the path to your JSON client secrets key file, the site_url variable containing the site URL of your property, and the start_date and end_date variables defining the period you want to classify, then set the output parameter to 'summary'.

df_summary = seo.classify_pages(key, site_url, start_date, end_date, output='summary')

On my site, 63 pages are classified as Class A and generate roughly 80% of my clicks (79.7%). 46 Class B pages generate the next 10% of clicks, and 190 Class C pages generate the final 10%. A further 36 pages fall into Class D and generate no clicks.

df_summary
   class  pages  impressions  clicks   avg_ctr  avg_position  share_of_clicks  share_of_impressions
0      A     63       747643   36980  5.126349     22.706825             79.7                  43.7
1      B     46       639329    4726  3.228043     31.897826             10.2                  37.4
2      C    190       323385    4698  2.393632     38.259368             10.1                  18.9
3      D     36         1327       0  0.000000     25.804722              0.0                   0.1
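The summary gets even more useful when you look at how concentrated the traffic is. The sketch below rebuilds three columns of the summary shown above by hand, so it runs without an API connection, and adds two derived columns. The share_of_pages and clicks_per_page names are my own for this example and aren't returned by classify_pages().

```python
import pandas as pd

# Figures copied from the summary output above
summary = pd.DataFrame({
    "class": ["A", "B", "C", "D"],
    "pages": [63, 46, 190, 36],
    "clicks": [36980, 4726, 4698, 0],
})

# Percentage of all pages that fall into each class
summary["share_of_pages"] = (summary["pages"] / summary["pages"].sum() * 100).round(1)

# Average clicks generated per page in each class
summary["clicks_per_page"] = (summary["clicks"] / summary["pages"]).round(1)

print(summary)
```

On this data, fewer than a fifth of the pages sit in Class A, yet they generate around 80% of the clicks.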

View the ABCD page classifications

To view the pages and their ABCD classes we can run the same function but change the output parameter to ‘classes’. We get back a dataframe containing the raw Google Search Console data on each page on the site, plus the ABCD classes and their underlying metrics.

df_classes = seo.classify_pages(key, site_url, start_date, end_date, output='classes')
df_classes.head()
                                                page  clicks  impressions    ctr  position  clicks_cumsum  clicks_running_pc  pc_share  class  class_rank
0  https://practicaldatascience.co.uk/machine-lea...    3890        36577  10.64     12.64           3890           8.382898  8.382898      A           1
1  https://practicaldatascience.co.uk/data-scienc...    2414        16618  14.53     14.30           6304          13.585036  5.202138      A           2
2  https://practicaldatascience.co.uk/data-scienc...    2378        71496   3.33     16.39           8682          18.709594  5.124558      A           3
3  https://practicaldatascience.co.uk/data-scienc...    1942        14274  13.61     15.02          10624          22.894578  4.184984      A           4
4  https://practicaldatascience.co.uk/data-scienc...    1738        23979   7.25     11.80          12362          26.639945  3.745367      A           5
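The derived columns appear to relate to each other as follows: clicks_cumsum is the running total of clicks, clicks_running_pc is that running total as a percentage of all clicks, and pc_share is each page's own percentage of all clicks. That reading is inferred from the column names and the figures above rather than from the library source. The plain Python sketch below recomputes the values for the five rows shown, taking the click total from the summary table.

```python
from itertools import accumulate

# Total clicks across classes A to D, taken from the summary output
total_clicks = 36980 + 4726 + 4698 + 0

# Clicks for the five pages shown above, already sorted in descending order
clicks = [3890, 2414, 2378, 1942, 1738]

rows = []
for page_clicks, cumsum in zip(clicks, accumulate(clicks)):
    running_pc = cumsum / total_clicks * 100     # compare with clicks_running_pc
    pc_share = page_clicks / total_clicks * 100  # compare with pc_share
    rows.append((page_clicks, cumsum, running_pc, pc_share))

for page_clicks, cumsum, running_pc, pc_share in rows:
    print(f"{page_clicks:>6} {cumsum:>6} {running_pc:10.6f} {pc_share:10.6f}")
```

If the printed figures match the dataframe, you can be reasonably confident about what each column means before handing the data to someone else.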

Export the classifications to CSV

We can use the to_csv method to export the classifications to a CSV file and pass it on to our SEO team. They’ll be able to use the page classifications to improve the SEO of the website by focusing on the pages that need the most attention.


df_classes.to_csv('google_search_console_classifications.csv', index=False)
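If your team would rather work through one class at a time, you could also write a separate CSV for each class. The export_by_class() helper, the file naming pattern, and the sample rows below are assumptions for illustration; the sketch only relies on the dataframe having a class column.

```python
import tempfile
from pathlib import Path

import pandas as pd


def export_by_class(df, directory):
    """Write one CSV per ABCD class into directory and return the paths written."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    paths = []
    for label, group in df.groupby("class"):
        path = directory / f"google_search_console_class_{label}.csv"
        group.to_csv(path, index=False)
        paths.append(path)
    return paths


# Made-up sample rows standing in for df_classes
sample = pd.DataFrame({
    "page": ["/guide", "/faq", "/old-post"],
    "clicks": [900, 12, 0],
    "class": ["A", "C", "D"],
})

# A temporary folder keeps this sketch free of side effects;
# point it at a real folder in practice
with tempfile.TemporaryDirectory() as tmp:
    written = export_by_class(sample, tmp)
    names = [path.name for path in written]

print(names)
```

Only classes that actually contain pages get a file, so a site with no zero-click pages simply won't produce a Class D export.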

Matt Clarke, Friday, November 18, 2022

Matt Clarke

Matt is an Ecommerce and Marketing Director who uses data science to help in his work. Matt has a Master's degree in Internet Retailing (plus two other Master's degrees in different fields) and specialises in the technical side of ecommerce and marketing.