One common task you’ll perform in Google Search Console is to compare the data from two different time periods to see how impressions, clicks, click-through rate (CTR), or average position have changed over time.
If you’re getting your search engine optimisation right, and you haven’t fallen foul of one of Google’s many algorithm updates, then you’ll hopefully see the metrics trending in the right direction.
Since most data scientists, and now many SEOs, work in Python, it makes sense to be able to query your data using the Google Search Console API and display and manipulate your web metrics in this environment, something that’s made much easier using EcommerceTools.
In this simple project, I’ll show you how you can query two time periods using the Google Search Console API using EcommerceTools and generate a Pandas dataframe showing the changes.
We’ll only need to install one package for this project - EcommerceTools. This is my Python data science toolkit for those who work in ecommerce and marketing, which allows you to perform a wide range of common tasks.
You can install EcommerceTools via PyPi by entering the command !pip3 install --upgrade ecommercetools
in a cell in a Jupyter notebook. Once installed, import the seo
module from ecommercetools
.
!pip3 install --upgrade ecommercetools
from ecommercetools import seo
If you’ve not used the Google Search Console API via EcommerceTools before you’ll first need to create a client secrets JSON key file and save it to your machine. This will provide access to your GSC data via the application.
I’ve assigned the location of my client_secrets.json
file to a variable called key
, and have created a variable called site_url
to hold the URL of the Google Search Console property I want to access. This will typically be a URL, but if you are accessing a domain property, you will need to prefix it with sc-domain:
instead of https://
.
key = "client_secrets.json"
site_url = "sc-domain:practicaldatascience.co.uk"
Finally, we need to create two payload
dictionaries containing the API query parameters we wish to run. The payload_before
dictionary contains the API query for the earliest period, while the payload_after
dictionary contains the API query for the later period.
Both payloads need to query the same dimensions. The dimensions provided can include the page, query, or device, but obviously not the date, as this won’t be found in both the before and after periods.
payload_before = {
'startDate': "2021-08-11",
'endDate': "2021-08-31",
'dimensions': ["page","query", "device"],
}
payload_after = {
'startDate': "2021-07-21",
'endDate': "2021-08-10",
'dimensions': ["page","query", "device"],
}
Once the payloads are created, you can then pass the key
, site_url
and the two payloads to the query_google_search_console_compare()
query. This will fetch the data for the two periods and join it on the dimensions provided, returning a Pandas dataframe containing the metrics for each period, as well as the change between the two periods.
df = seo.query_google_search_console_compare(key, site_url, payload_before, payload_after)
df.head()
page | query | device | impressions_before | impressions_after | impressions_change | clicks_before | clicks_after | clicks_change | ctr_before | ctr_after | ctr_change | position_before | position_after | position_change | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | https://practicaldatascience.co.uk/data-scienc... | scrape google search results python | DESKTOP | 114 | 122.0 | 8.0 | 26 | 34.0 | 8.0 | 22.81 | 27.87 | 5.06 | 3.06 | 2.25 | -0.81 |
1 | https://practicaldatascience.co.uk/machine-lea... | xgbregressor hyperparameter tuning | DESKTOP | 382 | 187.0 | -195.0 | 22 | 6.0 | -16.0 | 5.76 | 3.21 | -2.55 | 7.43 | 8.10 | 0.67 |
2 | https://practicaldatascience.co.uk/machine-lea... | xgboost classifier | DESKTOP | 948 | 264.0 | -684.0 | 20 | 5.0 | -15.0 | 2.11 | 1.89 | -0.22 | 10.63 | 14.64 | 4.01 |
3 | https://practicaldatascience.co.uk/data-scienc... | pandas read google sheet | DESKTOP | 237 | 195.0 | -42.0 | 17 | 6.0 | -11.0 | 7.17 | 3.08 | -4.09 | 5.66 | 6.19 | 0.53 |
4 | https://practicaldatascience.co.uk/machine-lea... | xgboost classifier python | DESKTOP | 396 | 282.0 | -114.0 | 15 | 13.0 | -2.0 | 3.79 | 4.61 | 0.82 | 9.33 | 10.06 | 0.73 |
Matt Clarke, Monday, September 06, 2021