Neither Google Search Console nor Google Analytics gives you access to the data found in both systems in one place. However, with a bit of ingenuity and some relatively simple Python code, you can combine or blend data from both sources in a single report. Here’s how it’s done.
Open a Jupyter notebook and import the `google.oauth2` and `googleapiclient` packages, so we can connect to the Google Search Console API; the `requests` and `json` packages, so we can send our query; and the Pandas and GAPandas packages, so we can query the Google Analytics API.
```python
from google.oauth2 import service_account
from googleapiclient.discovery import build
import requests
import json
import pandas as pd
import gapandas as gp
```
Next, create some set-up variables containing the paths to your Google Search Console and Google Analytics client secrets JSON keyfiles, the ID of your Google Analytics view, the URL of your site, the period you want to examine, and the filenames of the CSV files to which we'll back up the data.
```python
key_gsc = "client-secrets-gsc.json"
key_ga = "client-secrets-ga.json"
view_ga = "123456789"
start_date = "2020-01-01"
end_date = "2020-01-31"
site_url = "http://flyandlure.org"
filename_gsc_pages = "gsc_pages.csv"
filename_ga_pages = "ga_pages.csv"
```
Now we’ll create a little function that connects to the Search Console API and returns an authenticated service object. If this is the first time you’ve used the Search Console API, I’d recommend checking out my beginner’s guide to using the Search Console API, which explains how to get things set up.
```python
def connect(key):
    """Create a connection to the Google Search Console API and return service object.

    Args:
        key (string): Google Search Console JSON client secrets path.

    Returns:
        service (object): Google Search Console service object.
    """

    scope = ['https://www.googleapis.com/auth/webmasters']
    credentials = service_account.Credentials.from_service_account_file(key,
                                                                        scopes=scope)
    service = build(
        'webmasters',
        'v3',
        credentials=credentials
    )
    return service

service = connect(key_gsc)
```
Next, we’ll create a function to send our query payload to the API and return the results in a Pandas dataframe. This takes the `service` object from the function above, the `site_url` containing the name of the property, and a `payload` dictionary containing your API query.
```python
def query(service, site_url, payload):
    """Run a query on the Google Search Console API and return a dataframe of results.

    Args:
        service (object): Service object from connect()
        site_url (string): URL of Google Search Console property
        payload (dict): API query payload dictionary

    Returns:
        df (dataframe): Pandas dataframe containing requested data.
    """

    response = service.searchanalytics().query(siteUrl=site_url, body=payload).execute()

    results = []
    for row in response['rows']:
        data = {}

        # Map each requested dimension to its corresponding key value
        for i in range(len(payload['dimensions'])):
            data[payload['dimensions'][i]] = row['keys'][i]

        data['clicks'] = row['clicks']
        data['impressions'] = row['impressions']
        data['ctr'] = round(row['ctr'] * 100, 2)
        data['position'] = round(row['position'], 2)
        results.append(data)

    return pd.DataFrame.from_dict(results)
```
Now that’s in place, we’ll run a Search Console query to return all the pages viewed within the selected date range and write them into a Pandas dataframe called `df_gsc`. We’ll also back this up to a CSV file, so we can use it elsewhere if we need to.
```python
payload = {
    'startDate': start_date,
    'endDate': end_date,
    'dimensions': ["page"],
    'rowLimit': 25000,
    'startRow': 0
}

df_gsc = query(service, site_url, payload)
df_gsc.to_csv(filename_gsc_pages, index=False)
df_gsc.head()
```
| | page | clicks | impressions | ctr | position |
|---|---|---|---|---|---|
| 0 | http://flyandlure.org/articles/fly_fishing/win... | 265 | 3456 | 7.67 | 13.02 |
| 1 | http://flyandlure.org/articles/fly_tying/9_zon... | 212 | 3980 | 5.33 | 12.00 |
| 2 | http://flyandlure.org/articles/fly_fishing/how... | 206 | 4430 | 4.65 | 9.63 |
| 3 | http://flyandlure.org/articles/fly_tying/7_of_... | 198 | 2883 | 6.87 | 7.27 |
| 4 | http://flyandlure.org/articles/fly_fishing/how... | 170 | 2049 | 8.30 | 6.08 |
Next we’ll use GAPandas to create a connection to the Google Analytics API. By passing the `key_ga` filepath for our client secrets JSON keyfile, we get back a `service` object just like the one above, which we can then use to run an API query. If it’s your first time using this package, please check out my beginner’s guide to using GAPandas.
```python
service = gp.get_service(key_ga)
```
Next, we’ll create a `payload` dictionary containing a Google Analytics Reporting API query. This fetches data from the same period as our GSC query, includes the `ga:landingPagePath` dimension, and is filtered to `ga:medium==organic;ga:source==google`, so it should match (approximately…) the data we got back from GSC. In reality, it never matches perfectly, but it is pretty close.
```python
payload = {
    'start_date': start_date,
    'end_date': end_date,
    'metrics': 'ga:entrances, ga:users, ga:pageviews, ga:bounces, ga:transactions, ga:transactionRevenue, ga:transactionsPerSession',
    'dimensions': 'ga:landingPagePath',
    'filters': 'ga:medium==organic;ga:source==google',
    'sort': '-ga:entrances'
}

df_ga = gp.run_query(service, view_ga, payload)
df_ga.to_csv(filename_ga_pages, index=False)
df_ga.head()
```
| | landingPagePath | entrances | users | pageviews | bounces | transactions | transactionRevenue | transactionsPerSession |
|---|---|---|---|---|---|---|---|---|
| 0 | /articles/fly_fishing/winter_fly_fishing_tips_... | 287 | 246 | 462 | 226 | 0 | 0.0 | 0.0 |
| 1 | /articles/fly_tying/9_zonker_fly_patterns_to_tie | 274 | 212 | 350 | 241 | 0 | 0.0 | 0.0 |
| 2 | /articles/fly_tying/7_of_the_best_boobies_youl... | 237 | 202 | 370 | 191 | 0 | 0.0 | 0.0 |
| 3 | /articles/fly_fishing/how_to_fly_fish_with_a_s... | 209 | 191 | 278 | 181 | 0 | 0.0 | 0.0 |
| 4 | /articles/fly_fishing/how_to_fish_the_blob_fly | 202 | 180 | 368 | 148 | 0 | 0.0 | 0.0 |
Since Google Search Console uses a dimension called `page`, which contains the absolute URL, while Google Analytics uses a relative URL (minus the protocol and site name), we need to prepend the start part of the URL to the `landingPagePath`. This is dead easy in Pandas.
```python
df_ga['page'] = site_url + df_ga['landingPagePath']
df_ga.head(10)
```
| | landingPagePath | entrances | users | pageviews | bounces | transactions | transactionRevenue | transactionsPerSession | page |
|---|---|---|---|---|---|---|---|---|---|
| 0 | /articles/fly_fishing/winter_fly_fishing_tips_... | 287 | 246 | 462 | 226 | 0 | 0.0 | 0.0 | http://flyandlure.org/articles/fly_fishing/win... |
| 1 | /articles/fly_tying/9_zonker_fly_patterns_to_tie | 274 | 212 | 350 | 241 | 0 | 0.0 | 0.0 | http://flyandlure.org/articles/fly_tying/9_zon... |
| 2 | /articles/fly_tying/7_of_the_best_boobies_youl... | 237 | 202 | 370 | 191 | 0 | 0.0 | 0.0 | http://flyandlure.org/articles/fly_tying/7_of_... |
| 3 | /articles/fly_fishing/how_to_fly_fish_with_a_s... | 209 | 191 | 278 | 181 | 0 | 0.0 | 0.0 | http://flyandlure.org/articles/fly_fishing/how... |
| 4 | /articles/fly_fishing/how_to_fish_the_blob_fly | 202 | 180 | 368 | 148 | 0 | 0.0 | 0.0 | http://flyandlure.org/articles/fly_fishing/how... |
| 5 | /articles/fly_tying/7_of_the_best_fly_patterns... | 194 | 124 | 300 | 143 | 0 | 0.0 | 0.0 | http://flyandlure.org/articles/fly_tying/7_of_... |
| 6 | / | 185 | 65 | 826 | 67 | 0 | 0.0 | 0.0 | http://flyandlure.org/ |
| 7 | /articles/fly_tying/6_stunning_freshwater_shri... | 175 | 144 | 215 | 153 | 0 | 0.0 | 0.0 | http://flyandlure.org/articles/fly_tying/6_stu... |
| 8 | /articles/fly_tying/5_pheasant_tail_nymph_patt... | 159 | 137 | 213 | 130 | 0 | 0.0 | 0.0 | http://flyandlure.org/articles/fly_tying/5_phe... |
| 9 | /articles/fly_fishing/how_to_fish_snake_flies_... | 149 | 126 | 192 | 128 | 0 | 0.0 | 0.0 | http://flyandlure.org/articles/fly_fishing/how... |
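One caveat worth flagging: Google Analytics landing page paths sometimes carry query strings (campaign tags, search parameters) that Google Search Console's `page` URLs never have, so those rows will fail to match in the merge. As a minimal sketch (the sample paths below are made up for illustration), you can strip anything after a `?` before building the `page` column:

```python
import pandas as pd

# Hypothetical sample of GA landing page paths, one with a query string
df_ga = pd.DataFrame({
    'landingPagePath': [
        '/articles/fly_fishing/how_to_fish_the_blob_fly',
        '/articles/fly_fishing/how_to_fish_the_blob_fly?utm_source=newsletter',
        '/',
    ]
})

site_url = "http://flyandlure.org"

# Drop everything after "?" so the path matches the GSC page URL
df_ga['page'] = site_url + df_ga['landingPagePath'].str.split('?').str[0]
print(df_ga['page'].tolist())
```

After this, the tagged and untagged variants of the same page resolve to the same URL, so they'll join against the same GSC row.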
Finally, we can use the Pandas `merge()` function to join the `df_gsc` dataframe to the `df_ga` dataframe, using the `page` column as our key. This should give us back a dataframe containing the data from both sources, neatly connected to show data from Google Search Console that we can’t easily obtain in GA. As I mentioned above, it’s never perfect, but it should be fairly close!
```python
df_all = pd.merge(df_gsc, df_ga, how='right', on=['page'])
df_all.head()
```
| | page | clicks | impressions | ctr | position | landingPagePath | entrances | users | pageviews | bounces | transactions | transactionRevenue | transactionsPerSession |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | http://flyandlure.org/articles/fly_fishing/win... | 265.0 | 3456.0 | 7.67 | 13.02 | /articles/fly_fishing/winter_fly_fishing_tips_... | 287 | 246 | 462 | 226 | 0 | 0.0 | 0.0 |
| 1 | http://flyandlure.org/articles/fly_tying/9_zon... | 212.0 | 3980.0 | 5.33 | 12.00 | /articles/fly_tying/9_zonker_fly_patterns_to_tie | 274 | 212 | 350 | 241 | 0 | 0.0 | 0.0 |
| 2 | http://flyandlure.org/articles/fly_tying/7_of_... | 198.0 | 2883.0 | 6.87 | 7.27 | /articles/fly_tying/7_of_the_best_boobies_youl... | 237 | 202 | 370 | 191 | 0 | 0.0 | 0.0 |
| 3 | http://flyandlure.org/articles/fly_fishing/how... | 206.0 | 4430.0 | 4.65 | 9.63 | /articles/fly_fishing/how_to_fly_fish_with_a_s... | 209 | 191 | 278 | 181 | 0 | 0.0 | 0.0 |
| 4 | http://flyandlure.org/articles/fly_fishing/how... | 170.0 | 2049.0 | 8.30 | 6.08 | /articles/fly_fishing/how_to_fish_the_blob_fly | 202 | 180 | 368 | 148 | 0 | 0.0 | 0.0 |
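Since the two sources never agree exactly, a quick sanity check on the blended data is to compare GSC clicks against GA entrances page by page. Here's a hedged sketch using a couple of made-up rows in place of the real merged dataframe, with the same column names the merge produces:

```python
import pandas as pd

# Made-up rows standing in for the merged df_all dataframe
df_all = pd.DataFrame({
    'page': ['http://flyandlure.org/a', 'http://flyandlure.org/b'],
    'clicks': [265.0, 212.0],
    'entrances': [287, 274],
})

# Absolute and percentage gap between GA entrances and GSC clicks
df_all['gap'] = df_all['entrances'] - df_all['clicks']
df_all['gap_pct'] = (df_all['gap'] / df_all['entrances'] * 100).round(2)
print(df_all[['page', 'gap', 'gap_pct']])
```

Pages with an unusually large gap are often the ones with URL mismatches (redirects, query strings, or trailing slashes), so this is a cheap way to find where the blend is least trustworthy.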
Matt Clarke, Saturday, March 13, 2021