How to access the Google Search Console API using Python

By accessing Google Search Console data via the API using Python, you can extract whatever data you want and easily connect it to other data sources.

Google Search Console contains loads of really useful information for technical SEO. However, there are limits to what you can do using the front-end interface, and it takes time to extract the data you need for your analyses. You can make life much easier by accessing what you need via the Google Search Console API instead.

The API gives you direct access to all the data you need directly from within Python, so you can extract what you need, build automated scripts, create data pipelines to manipulate the data, connect it to other sources, and move the data into other systems. Here’s how it’s done.

For a quicker, easier way to access Google Search Console data, please check out my EcommerceTools Python package that contains a set of tools for extracting data from Google Search Console.
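
As a quick taste, here's roughly what a query looks like with EcommerceTools. The seo.query_google_search_console() function name comes from the package's documentation; check it against the version you install:

from ecommercetools import seo

# Assumes a service account key file and a verified property,
# both of which are set up in the steps below.
key = 'google-search-console.json'
site_url = 'http://flyandlure.org'
payload = {'startDate': '2019-01-01', 'endDate': '2019-12-31',
           'dimensions': ['page', 'device', 'query']}

df = seo.query_google_search_console(key, site_url, payload)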

Create a service account

To access data from the Google Search Console API you’ll need to create a Service Account using the Google APIs Console and download the JSON client secrets key file to your machine. The process for doing this is somewhat convoluted:

  1. Go to Google API Console > Credentials and select or create a project.
  2. Click Create Credentials > Service Account, fill in the form, and click Create.
  3. Select a role for your Service Account user, e.g. Viewer, then click Save.
  4. Copy the email address created for the service account, e.g. xxx@xxxx.iam.gserviceaccount.com.
  5. Create a JSON key and download it to your machine.
  6. Go to Google Search Console and select your Property.
  7. Click Settings > Users and permissions > Add user, then enter the service account email.
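
If you lose track of the service account email needed for step 7, it's also stored in the downloaded key file itself under client_email, so you can read it back with a couple of lines of Python:

import json

# Print the service account email from the key file; this is the
# address to add as a user in Google Search Console (step 7).
with open('google-search-console.json') as f:
    print(json.load(f)['client_email'])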

Install the packages

Open a Jupyter notebook or Python script and import google.oauth2, googleapiclient.discovery, requests, json, and pandas. The json module is part of the Python standard library; the other packages can be installed by entering pip3 install google-auth google-api-python-client requests pandas in your terminal.

from google.oauth2 import service_account
from googleapiclient.discovery import build
import requests
import json
import pandas as pd

To make the Pandas dataframes a bit easier to read, you may want to change the maximum column width from the default value to a higher number using pd.set_option('display.max_colwidth', 150).

pd.set_option('display.max_colwidth', 150)

Store your key path

Next, create a variable called key and add the path to the JSON client secrets keyfile that authenticates your Service Account with the Google Search Console API.

key = 'google-search-console.json'

Create an API connection using your key

To handle the connection, we'll create a basic function that passes the key and scope to the API and returns a service object we can use to run queries. This is fine for Jupyter notebook use, but you'll want to add some error handling if you use it in production.

def connect(key):
    """Create a connection to the Google Search Console API and return service object.
    
    Args:
        key (string): Google Search Console JSON client secrets path.
    
    Returns:
        service (object): Google Search Console service object.
    """
    
    scope = ['https://www.googleapis.com/auth/webmasters']
    credentials = service_account.Credentials.from_service_account_file(key, 
                                                                        scopes=scope)
    service = build(
        'webmasters',
        'v3',
        credentials=credentials
    )
    
    return service
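
As a rough sketch of what that error handling might look like, you could wrap the credential loading in a try/except block. from_service_account_file() raises FileNotFoundError when the key path is wrong, and ValueError when the file isn't a valid service account key:

def connect_safely(key):
    """Sketch of connect() with basic error handling for production use."""
    
    scope = ['https://www.googleapis.com/auth/webmasters']
    
    try:
        credentials = service_account.Credentials.from_service_account_file(key,
                                                                            scopes=scope)
        return build('webmasters', 'v3', credentials=credentials)
    except FileNotFoundError:
        # The key path doesn't point at a file
        print('Key file not found:', key)
    except ValueError as error:
        # The file exists but isn't a valid service account key
        print('Invalid key file:', error)
    
    return None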

Create a function to run a query

Next, we’ll create a function called query() that takes our authenticated service object, the site_url identifying the Search Console property we want to query, and a dictionary called payload which contains our API query.

We’ll use execute() to run this query on the API, and will then extract the rows, reformat the data so it is neat and tidy, and add the output to a Pandas dataframe using from_dict().

def query(service, site_url, payload):
    """Run a query on the Google Search Console API and return a dataframe of results.
    
    Args:
        service (object): Service object from connect()
        site_url (string): URL of Google Search Console property
        payload (dict): API query payload dictionary
    
    Returns:
        df (dataframe): Pandas dataframe containing requested data. 
    
    """
    
    response = service.searchanalytics().query(siteUrl=site_url, body=payload).execute()
    
    results = []
    
    # Queries that match no data return a response without a 'rows' key
    for row in response.get('rows', []):
        data = {}
        
        for i in range(len(payload['dimensions'])):
            data[payload['dimensions'][i]] = row['keys'][i]

        data['clicks'] = row['clicks']
        data['impressions'] = row['impressions']
        data['ctr'] = round(row['ctr'] * 100, 2)
        data['position'] = round(row['position'], 2)        
        results.append(data)
    
    return pd.DataFrame.from_dict(results)
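
For reference, here's roughly the shape of the dictionary execute() returns (the values below are invented). The keys list is ordered to match the dimensions in the payload, which is what the inner loop above relies on:

# Illustrative response with made-up values for a
# ['page', 'device', 'query'] dimensions payload.
example_response = {
    'rows': [
        {
            'keys': ['http://flyandlure.org/', 'MOBILE', 'fly and lure'],
            'clicks': 12,
            'impressions': 99,
            'ctr': 0.1212,
            'position': 3.4
        }
    ],
    'responseAggregationType': 'byPage'
}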

Run your Search Console API query

Finally, we can put these steps together. We’ll pass our key to connect() to get a service object. We’ll create a simple API query payload dictionary, define the site_url of the property we want to query, and then fetch the data using query(). This returns a Pandas dataframe that we can manipulate as we wish.

service = connect(key)

payload = {
    'startDate': "2019-01-01",
    'endDate': "2019-12-31",
    'dimensions': ["page","device","query"],
    'rowLimit': 100,
    'startRow': 0
}

site_url = "http://flyandlure.org"

df = query(service, site_url, payload)
df.head()
                                                page                          query  clicks  impressions    ctr  position
0                             http://flyandlure.org/                   fly and lure     144          467  30.84      2.73
1  http://flyandlure.org/articles/fly_fishing_gea...  simms freestone waders review     134          698  19.20      4.01
2  http://flyandlure.org/articles/fly_fishing_gea...  orvis encounter waders review      87          449  19.38      4.57
3  http://flyandlure.org/articles/fly_tying/9_zon...                     zonker fly      80         1542   5.19      2.76
4  http://flyandlure.org/articles/fly_fishing/how...               cats whisker fly      72          654  11.01      1.40
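
Note that the API returns a maximum of 25,000 rows per request, so for larger properties you'll need to paginate by incrementing startRow. Here's a minimal sketch of how that might look; query_all() is a wrapper of my own, not part of the API:

def query_all(service, site_url, payload, row_limit=25000):
    """Fetch every row for a query by paginating with startRow.
    
    Args:
        service (object): Service object from connect()
        site_url (string): URL of Google Search Console property
        payload (dict): API query payload dictionary
        row_limit (int): Rows per request; the API maximum is 25,000
    
    Returns:
        df (dataframe): Pandas dataframe containing all requested data.
    """
    
    payload = dict(payload, rowLimit=row_limit, startRow=0)
    frames = []
    
    while True:
        df = query(service, site_url, payload)
        frames.append(df)
        
        # A short (or empty) page means we've reached the end of the data
        if len(df) < row_limit:
            break
        
        payload['startRow'] += row_limit
    
    return pd.concat(frames, ignore_index=True)

From there, the dataframe can be written wherever you need it, e.g. using df.to_csv('gsc.csv', index=False).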

Matt Clarke, Friday, March 12, 2021

Matt Clarke is an Ecommerce and Marketing Director who uses data science to help in his work. Matt has a Master's degree in Internet Retailing (plus two other Master's degrees in different fields) and specialises in the technical side of ecommerce and marketing.