How to analyse Google Analytics demographics and interests with GAPandas

Google Analytics demographics and interests data are a useful way to quickly understand the customers or markets who interact with your site. Here's how to access the data in Python using GAPandas.

Pictures by Kampus, Pexels.

The demographics and interests data provided in Google Analytics can be a useful way to understand who is visiting your site or purchasing your products, without the need to perform a complex and expensive demographic segmentation project that might not deliver an ROI.

While there are some limitations to the demographic data provided in Google Analytics, they are still quite useful for developing a simple overview of your customer base and can work alongside market segmentation or customer segmentation and reveal patterns of which you may not have been aware.

Having worked in businesses that segmented their customers using expensive commercial tools such as Mosaic, which use demographic segmentation to report on the overall customer base, I’m not convinced these are worth the money compared to what you can get for free using Google Analytics data.

If you only want an overview of customer demographics, such as the age brackets, gender, geographic location, and market interests of your customers, then the demographic and interests data in Google Analytics should be enough to give you such a high level view, for absolutely no cost.

They’re a great way to help your business owners understand who interacts with your website and how the demographics of your customer base may be changing over time, and they can help you identify what products to launch, what markets to target, and what marketing activity might work best.

In this project, we’ll be using my GAPandas package to query the Google Analytics API and retrieve the demographics and interests data for the visitors to one of my (non-ecommerce) websites. This will cover the basic techniques you need to apply to understand the demographic segments of your site visitors at a high level, just as Mosaic segmentation does.

Understanding demographics and interests data in Google Analytics

Google Analytics demographics and interests data are collected using Google’s advertising cookies. Therefore, in order to see any demographic and interests data in GA, you will first need to enable a couple of optional settings.

  1. Go to Google Analytics > Admin
  2. Select the Google Analytics property for which you want to enable data collection
  3. Set Remarketing to On
  4. Set Advertising Reporting Features to On
  5. Save

Once you’ve saved these settings, Google Analytics will use existing Google cookies for the DoubleClick, Android, and iOS advertising networks to capture some additional data on your site visitors and add it to your Google Analytics account in an anonymised form. Since the data is collected daily, you’ll need to leave the settings turned on for a period of time before running your analysis.

Data thresholds and anonymisation

One potential application of these data, you may be thinking, could be to capture demographic data on each of your customers to augment the customer profile you hold on each of them. This would be a great way to let you target customers based on their demographics, and could work brilliantly alongside the aggregate demographics and interests data.

However, while this would be fascinating from a data perspective, Google doesn’t want you to be able to fetch data on the specific demographics of individual users for privacy reasons. For example, you’re unable to write an API query to fetch the userAgeBracket or userGender of each customer when you select a transactionId dimension.

In addition, if a query returns data on only a small number of people, which could inadvertently reveal information about individuals, Google applies a minimum-volume threshold and withholds the data. At the higher-level queries we’re examining below, this rarely kicks in, but it might if you create more granular API queries.

Install the packages

First, open a Jupyter notebook and install the Pandas and GAPandas packages. You’ll probably have Pandas already installed, but you can install GAPandas by entering the command pip3 install gapandas in your terminal or !pip3 install gapandas in a cell in your Jupyter notebook. Any other packages we need can be installed later in our notebook.

!pip3 install gapandas
import gapandas as gp
import pandas as pd

Connect to the Google Analytics API

Next, we’ll use GAPandas to connect to the Google Analytics API. To authenticate you will need to download a client secrets JSON keyfile from the Google API Console. This gives you access to a Google Cloud Service Account, via which we can extract our Google Analytics data. There’s a step-by-step guide to creating a client secrets JSON keyfile in my other guide to using GAPandas.

service = gp.get_service('client-secrets.json')

To avoid entering the same values throughout our code, we’ll create some variables to hold the view ID for the Google Analytics property we want to query, as well as the start and end dates for the date period we want to extract via the GA API.

view = '1234567890'
start_date = '2021-07-01'
end_date = '2021-07-31'
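
If you would rather not hard-code the dates, they can be derived from the current date. The snippet below is a minimal sketch that returns the first and last day of the previous calendar month in the YYYY-MM-DD format used above; previous_month_range() is a hypothetical helper name introduced here, not part of GAPandas.

```python
from datetime import date, timedelta


def previous_month_range(today=None):
    """Return (start_date, end_date) strings for the last full calendar month."""
    today = today or date.today()
    # The day before the first of this month is the last day of the previous month
    last_day = today.replace(day=1) - timedelta(days=1)
    first_day = last_day.replace(day=1)
    return first_day.isoformat(), last_day.isoformat()


# A fixed date is passed in here so the result matches the period used in this article
start_date, end_date = previous_month_range(date(2021, 8, 10))
print(start_date, end_date)
```

Calling previous_month_range() with no argument uses today's date instead.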

Age bracket

The first piece of demographic data we’ll extract is the age bracket for our users, which is stored in an API dimension called userAgeBracket. To obtain this demographic data, we simply assemble a GAPandas payload dictionary containing the ga:userAgeBracket dimension and our choice of metrics, and then run the query using the run_query() function, remembering to pass in the service object and the view ID for our GA property.

As the data below show, the site has the fewest visitors in the 18-24 bracket (there is no lower age bracket in the data), and older visitors tend to dominate on this site. Although there’s no e-commerce data in my site’s data, I’ve included the metrics in the payload so you can apply this to analysing an e-commerce site. These will often show differences in average order value and conversion rate for each age bracket, which can reveal how well you cater for customers of different ages, or how easy or difficult they find your site to use.

payload = {
    'start_date': start_date, 
    'end_date': end_date, 
    'metrics': 'ga:sessions, ga:transactions, ga:transactionRevenue, '
               'ga:transactionsPerSession, ga:revenuePerTransaction',
    'dimensions': 'ga:userAgeBracket',
    'sort': '-ga:sessions'
}

df = gp.run_query(service, view, payload)
df.head(10)
userAgeBracket sessions transactions transactionRevenue transactionsPerSession revenuePerTransaction
0 65+ 1301 0 0.0 0.0 0.0
1 55-64 1291 0 0.0 0.0 0.0
2 45-54 1255 0 0.0 0.0 0.0
3 25-34 1236 0 0.0 0.0 0.0
4 35-44 1055 0 0.0 0.0 0.0
5 18-24 574 0 0.0 0.0 0.0
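
Since the queries in this article share the same dates, metrics, and sort order, the payload could be assembled by a small helper so that only the dimension changes each time. This is only a sketch: build_payload() and ECOMMERCE_METRICS are hypothetical names introduced here for illustration and are not part of GAPandas.

```python
# Metrics string shared by most of the queries in this article
ECOMMERCE_METRICS = (
    'ga:sessions, ga:transactions, ga:transactionRevenue, '
    'ga:transactionsPerSession, ga:revenuePerTransaction'
)


def build_payload(dimensions, start_date, end_date,
                  metrics=ECOMMERCE_METRICS, sort='-ga:sessions'):
    """Assemble a payload dictionary in the shape used throughout this article."""
    return {
        'start_date': start_date,
        'end_date': end_date,
        'metrics': metrics,
        'dimensions': dimensions,
        'sort': sort,
    }


payload = build_payload('ga:userAgeBracket', '2021-07-01', '2021-07-31')
print(payload)
```

The resulting dictionary can then be passed to gp.run_query(service, view, payload) in the same way as the hand-written payloads.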

Gender

We can repeat the process for the ga:userGender demographic data simply by adjusting the dimension we define in the payload. This reveals that the site has a strong male bias in the visitor profile. Again, there’s no e-commerce data for this site, but this also often shows big differences in customer behaviour according to gender.

payload = {
    'start_date': start_date, 
    'end_date': end_date, 
    'metrics': 'ga:sessions, ga:transactions, ga:transactionRevenue, '
               'ga:transactionsPerSession, ga:revenuePerTransaction',
    'dimensions': 'ga:userGender',
    'sort': '-ga:sessions'
}

df = gp.run_query(service, view, payload)
df.head(10)
userGender sessions transactions transactionRevenue transactionsPerSession revenuePerTransaction
0 male 5406 0 0.0 0.0 0.0
1 female 1427 0 0.0 0.0 0.0

Age and gender

To get more granular data on age and gender, you can define multiple dimensions. In the example below, I’ve included both the ga:userGender and ga:userAgeBracket dimensions, to get a more detailed breakdown of customer ages and genders. To analyse these more easily, you may wish to convert the raw metrics to percentages to better show the proportional split within each age bracket.

payload = {
    'start_date': start_date, 
    'end_date': end_date, 
    'metrics': 'ga:sessions, ga:transactions, ga:transactionRevenue, '
               'ga:transactionsPerSession, ga:revenuePerTransaction',
    'dimensions': 'ga:userGender, ga:userAgeBracket',
    'sort': '-ga:sessions'
}

df = gp.run_query(service, view, payload)
df.sort_values(by='userAgeBracket').head(10)
userGender userAgeBracket sessions transactions transactionRevenue transactionsPerSession revenuePerTransaction
5 male 18-24 400 0 0.0 0.0 0.0
11 female 18-24 174 0 0.0 0.0 0.0
3 male 25-34 984 0 0.0 0.0 0.0
8 female 25-34 252 0 0.0 0.0 0.0
4 male 35-44 846 0 0.0 0.0 0.0
10 female 35-44 209 0 0.0 0.0 0.0
2 male 45-54 987 0 0.0 0.0 0.0
6 female 45-54 268 0 0.0 0.0 0.0
0 male 55-64 1051 0 0.0 0.0 0.0
9 female 55-64 240 0 0.0 0.0 0.0
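
The conversion to percentages mentioned above can be done with a Pandas groupby. The following is a minimal sketch that uses a few rows copied from the output above rather than a live API call; the pct_of_bracket column name is my own choice for illustration.

```python
import pandas as pd

# Sample rows copied from the output above (sessions only)
df = pd.DataFrame({
    'userGender': ['male', 'female', 'male', 'female'],
    'userAgeBracket': ['18-24', '18-24', '25-34', '25-34'],
    'sessions': [400, 174, 984, 252],
})

# Total sessions per age bracket, repeated on each row of that bracket
bracket_total = df.groupby('userAgeBracket')['sessions'].transform('sum')

# Each row's share of its own age bracket, as a percentage
df['pct_of_bracket'] = (df['sessions'] / bracket_total * 100).round(1)
print(df)
```

Using transform() keeps the result aligned with the original rows, so the percentages within each age bracket sum to roughly 100.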

Continent

You can also perform geographical segmentation using the same technique. There are various levels of dimension provided that allow you to drill down through your data to understand where site visitors are located. From an e-commerce perspective, this can be a useful way to judge whether it may be worth considering offering delivery services to specific countries, or even offering local language content.

Do be aware, though, that all geographic data in Google Analytics should be taken with a pinch of salt. While Google has the ability to geolocate users, it would appear that these data may often be based upon the location of the network through which you are routed, so VPNs and other network systems can potentially falsify some data. However, it’s generally sufficient to give you a rough idea. Here’s the data on the ga:continent dimension, which shows most of my visitors are in Europe.

payload = {
    'start_date': start_date, 
    'end_date': end_date, 
    'metrics': 'ga:sessions, ga:transactions, ga:transactionRevenue, '
               'ga:transactionsPerSession, ga:revenuePerTransaction',
    'dimensions': 'ga:continent',
    'sort': '-ga:sessions'
}

df = gp.run_query(service, view, payload)
df.head(10)
continent sessions transactions transactionRevenue transactionsPerSession revenuePerTransaction
0 Europe 15924 0 0.0 0.0 0.0
1 Americas 4808 0 0.0 0.0 0.0
2 Oceania 382 0 0.0 0.0 0.0
3 Africa 202 0 0.0 0.0 0.0
4 Asia 171 0 0.0 0.0 0.0
5 (not set) 9 0 0.0 0.0 0.0

Country

To drill down through our geographic segmentation data we can use different dimensions. By running an API query with the ga:country dimension in the payload we can see that the UK dominates the site’s traffic, followed by the US and Canada.

payload = {
    'start_date': start_date, 
    'end_date': end_date, 
    'metrics': 'ga:sessions, ga:transactions, ga:transactionRevenue, '
               'ga:transactionsPerSession, ga:revenuePerTransaction',
    'dimensions': 'ga:country',
    'sort': '-ga:sessions'
}

df = gp.run_query(service, view, payload)
df.head(10)
country sessions transactions transactionRevenue transactionsPerSession revenuePerTransaction
0 United Kingdom 13580 0 0.0 0.0 0.0
1 United States 3807 0 0.0 0.0 0.0
2 Canada 910 0 0.0 0.0 0.0
3 Ireland 414 0 0.0 0.0 0.0
4 Sweden 405 0 0.0 0.0 0.0
5 Norway 343 0 0.0 0.0 0.0
6 Australia 250 0 0.0 0.0 0.0
7 South Africa 173 0 0.0 0.0 0.0
8 Finland 170 0 0.0 0.0 0.0
9 New Zealand 131 0 0.0 0.0 0.0
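
To judge how concentrated the traffic is, it can help to add each country's share of sessions and a running total. This is a rough sketch using the top five rows from the output above, so the percentages are relative to those sample rows only, not to the whole site; the share_pct and cumulative_pct column names are illustrative.

```python
import pandas as pd

# Top five rows copied from the output above (sessions only)
df = pd.DataFrame({
    'country': ['United Kingdom', 'United States', 'Canada', 'Ireland', 'Sweden'],
    'sessions': [13580, 3807, 910, 414, 405],
})

# Make sure the largest countries come first before accumulating
df = df.sort_values('sessions', ascending=False).reset_index(drop=True)

# Share of sessions per country, and the running total of that share
df['share_pct'] = df['sessions'] / df['sessions'].sum() * 100
df['cumulative_pct'] = df['share_pct'].cumsum()
print(df.round(1))
```

On the full query result, the same two lines would show how many countries you need to cover to reach a given proportion of your traffic.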

Region

You can also drill down to more granular levels. For example, by adding in the ga:region dimension we also get back the region (e.g. England, Wales, Scotland) for each country. To look at the regions for a specific country, you can perform a simple filter on the country column and set it to the country name.

payload = {
    'start_date': start_date, 
    'end_date': end_date, 
    'metrics': 'ga:sessions, ga:transactions, ga:transactionRevenue, '
               'ga:transactionsPerSession, ga:revenuePerTransaction',
    'dimensions': 'ga:country, ga:region',
    'sort': '-ga:sessions'
}

df = gp.run_query(service, view, payload)
df[df['country']=='United Kingdom'].head(10)
country region sessions transactions transactionRevenue transactionsPerSession revenuePerTransaction
0 United Kingdom England 9034 0 0.0 0.0 0.0
1 United Kingdom Scotland 3005 0 0.0 0.0 0.0
2 United Kingdom Wales 995 0 0.0 0.0 0.0
3 United Kingdom Northern Ireland 510 0 0.0 0.0 0.0
89 United Kingdom (not set) 19 0 0.0 0.0 0.0
95 United Kingdom Isle of Man 17 0 0.0 0.0 0.0

City

To get even more detail, you can also add in the ga:city dimension and use the same approach. As you’ll notice, there’s a big chunk of (not set) data, which may have been withheld to avoid revealing too much information. It looks like London and Glasgow are the main hotspots for my site.

payload = {
    'start_date': start_date, 
    'end_date': end_date, 
    'metrics': 'ga:sessions, ga:transactions, ga:transactionRevenue, '
               'ga:transactionsPerSession, ga:revenuePerTransaction',
    'dimensions': 'ga:country, ga:region, ga:city',
    'sort': '-ga:sessions'
}

df = gp.run_query(service, view, payload)
df[df['country']=='United Kingdom'].head(10)
country region city sessions transactions transactionRevenue transactionsPerSession revenuePerTransaction
0 United Kingdom England London 1655 0 0.0 0.0 0.0
1 United Kingdom England (not set) 1104 0 0.0 0.0 0.0
2 United Kingdom Scotland Glasgow 806 0 0.0 0.0 0.0
3 United Kingdom Scotland (not set) 477 0 0.0 0.0 0.0
4 United Kingdom Scotland Edinburgh 371 0 0.0 0.0 0.0
5 United Kingdom England Birmingham 283 0 0.0 0.0 0.0
6 United Kingdom Wales (not set) 276 0 0.0 0.0 0.0
7 United Kingdom England Manchester 261 0 0.0 0.0 0.0
8 United Kingdom England Newcastle upon Tyne 236 0 0.0 0.0 0.0
10 United Kingdom England Bristol 188 0 0.0 0.0 0.0
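
Before reading too much into city-level figures, it's worth checking how much of each region's traffic has no city attached. Here is a minimal sketch using a handful of rows from the output above; because only sample rows are included, the percentages it prints are illustrative rather than real figures for the site.

```python
import pandas as pd

# Sample rows copied from the output above (sessions only)
df = pd.DataFrame({
    'region': ['England', 'England', 'Scotland', 'Scotland', 'Scotland', 'Wales'],
    'city': ['London', '(not set)', 'Glasgow', '(not set)', 'Edinburgh', '(not set)'],
    'sessions': [1655, 1104, 806, 477, 371, 276],
})

# Sessions with no city, and all sessions, summed per region
unset = df[df['city'] == '(not set)'].groupby('region')['sessions'].sum()
total = df.groupby('region')['sessions'].sum()

# Percentage of each region's sessions where the city is (not set);
# regions with no (not set) rows become 0 instead of NaN
not_set_pct = (unset / total * 100).fillna(0).round(1)
print(not_set_pct)
```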

Designated Market Area (DMA) or Metro

You can also examine the Designated Market Area (DMA) or Metro. These are known as “media markets” or television market areas and define the TV or radio stations that a population receives. If you wanted to run TV or radio advertising, or advertise in other channels alongside your TV and radio ads, these DMAs could be very useful in helping you find the best places to advertise. On e-commerce sites, there’s sometimes a striking difference in customer behaviour between DMAs.

payload = {
    'start_date': start_date, 
    'end_date': end_date, 
    'metrics': 'ga:sessions, ga:transactions, ga:transactionRevenue, '
               'ga:transactionsPerSession, ga:revenuePerTransaction',
    'dimensions': 'ga:country, ga:region, ga:metro',
    'sort': '-ga:sessions'
}

df = gp.run_query(service, view, payload)
df[df['country']=='United Kingdom'].sort_values(by='region', ascending=False).head(50)
country region metro sessions transactions transactionRevenue transactionsPerSession revenuePerTransaction
9 United Kingdom Wales HTV Wales 490 0 0.0 0.0 0.0
7 United Kingdom Wales (not set) 505 0 0.0 0.0 0.0
8 United Kingdom Scotland (not set) 495 0 0.0 0.0 0.0
52 United Kingdom Scotland Border 44 0 0.0 0.0 0.0
11 United Kingdom Scotland North Scotland 462 0 0.0 0.0 0.0
1 United Kingdom Scotland Central Scotland 2004 0 0.0 0.0 0.0
10 United Kingdom Northern Ireland Ulster 463 0 0.0 0.0 0.0
48 United Kingdom Northern Ireland (not set) 47 0 0.0 0.0 0.0
120 United Kingdom Isle of Man (not set) 17 0 0.0 0.0 0.0
6 United Kingdom England North East 641 0 0.0 0.0 0.0
5 United Kingdom England Yorkshire 837 0 0.0 0.0 0.0
4 United Kingdom England North West 1208 0 0.0 0.0 0.0
12 United Kingdom England Meridian (exc. Channel Islands) 430 0 0.0 0.0 0.0
13 United Kingdom England HTV West 412 0 0.0 0.0 0.0
14 United Kingdom England East Of England 389 0 0.0 0.0 0.0
18 United Kingdom England South West 224 0 0.0 0.0 0.0
22 United Kingdom England Border 135 0 0.0 0.0 0.0
3 United Kingdom England Midlands 1355 0 0.0 0.0 0.0
2 United Kingdom England (not set) 1394 0 0.0 0.0 0.0
0 United Kingdom England London 2009 0 0.0 0.0 0.0
105 United Kingdom (not set) (not set) 19 0 0.0 0.0 0.0

Latitude and longitude

You can also get the approximate anonymised latitude and longitude coordinates for each visitor using the ga:latitude and ga:longitude dimensions. Judging from comparisons made to e-commerce data, where delivery postcodes are known, these aren’t 100% accurate as mentioned above, but they will give you a rough idea of where your visitors are physically located.

payload = {
    'start_date': start_date, 
    'end_date': end_date, 
    'metrics': 'ga:sessions',
    'dimensions': 'ga:latitude, ga:longitude',
    'sort': '-ga:sessions'
}

df = gp.run_query(service, view, payload)
df.head()
latitude longitude sessions
0 0.0000 0.0000 2553
1 51.5074 -0.1278 1655
2 55.8642 -4.2518 806
3 55.9533 -3.1883 371
4 52.4862 -1.8904 283
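
The first row, at latitude 0 and longitude 0, is worth handling before doing anything else with the coordinates. The sketch below assumes two things that I haven't confirmed: that the coordinates may arrive as strings, and that the 0, 0 row represents sessions Google could not locate rather than real visitors. It uses sample rows from the output above.

```python
import pandas as pd

# Sample rows copied from the output above; stored as strings here on the
# assumption that the API may return coordinates as text
df = pd.DataFrame({
    'latitude': ['0.0000', '51.5074', '55.8642'],
    'longitude': ['0.0000', '-0.1278', '-4.2518'],
    'sessions': [2553, 1655, 806],
})

# Convert the coordinates to numbers; anything unparseable becomes NaN
coords = ['latitude', 'longitude']
df[coords] = df[coords].apply(pd.to_numeric, errors='coerce')

# Drop the (0, 0) placeholder row and any rows with missing coordinates
has_location = ~((df['latitude'] == 0) & (df['longitude'] == 0))
located = df[has_location].dropna(subset=coords)
print(located)
```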

Distance from store

Once you have the visitor’s coordinates, you can calculate their distance from a given location using the excellent GeoPy package. This can be a useful way to identify roughly (possibly very roughly) how many visitors you have within the radius of a particular store or event.

!pip3 install geopy
from geopy.distance import geodesic

start_latitude = 52.98
start_longitude = -3.36

df['distance'] = df.apply(
    lambda x: geodesic((start_latitude, start_longitude),
                       (x.latitude, x.longitude)).miles,
    axis=1
)
df[df['distance'] < 10].head()
latitude longitude sessions distance
115 53.1149 -3.3103 26 9.555561
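
If you can't install GeoPy, a similar estimate can be made with the haversine formula. Note that this is a different technique from the geodesic() call above: it treats the Earth as a sphere, so its results differ slightly from GeoPy's, which is fine given how approximate the coordinates are. The sketch below uses the start location from above and two sample coordinates from the earlier output; haversine_miles() is a hypothetical helper name.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, spherical approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


store = (52.98, -3.36)

# (latitude, longitude, sessions) rows taken from the outputs above
rows = [(53.1149, -3.3103, 26), (51.5074, -0.1278, 1655)]

# Total sessions from locations within 10 miles of the store
sessions_within_10 = sum(
    sessions for lat, lon, sessions in rows
    if haversine_miles(store[0], store[1], lat, lon) < 10
)
print(sessions_within_10)
```

Summing sessions rather than counting rows gives a better feel for the volume of traffic near the location, since each row aggregates many visits.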

Interest category

Next, we have the interest categories, which are handled via the ga:interestOtherCategory, ga:interestAffinityCategory, and ga:interestInMarketCategory dimensions. These can tell you what interests your website visitors have, and let you drill down to get increasing levels of detail. Most of my visitors are into the outdoors, sporting goods, cars, and travel, which totally makes sense given the subject matter.

payload = {
    'start_date': start_date, 
    'end_date': end_date, 
    'metrics': 'ga:sessions, ga:transactions, ga:transactionRevenue, '
               'ga:transactionsPerSession, ga:revenuePerTransaction',
    'dimensions': 'ga:interestOtherCategory',
    'sort': '-ga:sessions'
}

df = gp.run_query(service, view, payload)
df.head(10)
interestOtherCategory sessions transactions transactionRevenue transactionsPerSession revenuePerTransaction
0 Hobbies & Leisure/Outdoors/Fishing 2831 0 0.0 0.0 0.0
1 Sports/Team Sports/Soccer 1588 0 0.0 0.0 0.0
2 News/Weather 1179 0 0.0 0.0 0.0
3 Arts & Entertainment/Celebrities & Entertainme... 1138 0 0.0 0.0 0.0
4 News/Sports News 1055 0 0.0 0.0 0.0
5 Arts & Entertainment/TV & Video/Online Video 787 0 0.0 0.0 0.0
6 Food & Drink/Cooking & Recipes 602 0 0.0 0.0 0.0
7 Travel & Transportation/Hotels & Accommodations 565 0 0.0 0.0 0.0
8 Home & Garden/Patio, Lawn & Garden/Gardening 494 0 0.0 0.0 0.0
9 Real Estate/Real Estate Listings/Residential S... 416 0 0.0 0.0 0.0

Interest affinity category

The ga:interestAffinityCategory gives broader information. For my site, we can see that “Lifestyles & Hobbies/Outdoor Enthusiasts” is the top one, which again fits perfectly with the subject matter.

payload = {
    'start_date': start_date, 
    'end_date': end_date, 
    'metrics': 'ga:sessions, ga:transactions, ga:transactionRevenue, '
               'ga:transactionsPerSession, ga:revenuePerTransaction',
    'dimensions': 'ga:interestAffinityCategory',
    'sort': '-ga:sessions'
}

df = gp.run_query(service, view, payload)
df.head(10)
interestAffinityCategory sessions transactions transactionRevenue transactionsPerSession revenuePerTransaction
0 Lifestyles & Hobbies/Outdoor Enthusiasts 4628 0 0.0 0.0 0.0
1 Home & Garden/Do-It-Yourselfers 3845 0 0.0 0.0 0.0
2 Food & Dining/Cooking Enthusiasts/30 Minute Chefs 3815 0 0.0 0.0 0.0
3 Sports & Fitness/Sports Fans 3769 0 0.0 0.0 0.0
4 Lifestyles & Hobbies/Green Living Enthusiasts 3294 0 0.0 0.0 0.0
5 News & Politics/Avid News Readers 2791 0 0.0 0.0 0.0
6 Banking & Finance/Avid Investors 2644 0 0.0 0.0 0.0
7 Sports & Fitness/Health & Fitness Buffs 2605 0 0.0 0.0 0.0
8 Shoppers/Value Shoppers 2601 0 0.0 0.0 0.0
9 Lifestyles & Hobbies/Business Professionals 2439 0 0.0 0.0 0.0

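In-market category

Finally, there’s the in-market category, held in the ga:interestInMarketCategory dimension. Whereas affinity categories describe broad, long-term interests, in-market categories flag visitors who appear to be actively researching or planning to buy within a product category.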

payload = {
    'start_date': start_date, 
    'end_date': end_date, 
    'metrics': 'ga:sessions, ga:transactions, ga:transactionRevenue, '
               'ga:transactionsPerSession, ga:revenuePerTransaction',
    'dimensions': 'ga:interestInMarketCategory',
    'sort': '-ga:sessions'
}

df = gp.run_query(service, view, payload)
df.head(10)
interestInMarketCategory sessions transactions transactionRevenue transactionsPerSession revenuePerTransaction
0 Sports & Fitness/Outdoor Recreational Equipmen... 2784 0 0.0 0.0 0.0
1 Sports & Fitness/Sporting Goods 1946 0 0.0 0.0 0.0
2 Autos & Vehicles/Motor Vehicles/Motor Vehicles... 1415 0 0.0 0.0 0.0
3 Travel/Hotels & Accommodations 1031 0 0.0 0.0 0.0
4 Real Estate/Residential Properties/Residential... 939 0 0.0 0.0 0.0
5 Real Estate/Residential Properties/Residential... 840 0 0.0 0.0 0.0
6 Home & Garden/Home & Garden Services/Landscape... 831 0 0.0 0.0 0.0
7 Travel/Trips by Destination/Trips to Europe/Tr... 818 0 0.0 0.0 0.0
8 Apparel & Accessories/Women's Apparel 620 0 0.0 0.0 0.0
9 Home & Garden/Home Improvement 590 0 0.0 0.0 0.0

Interest category by gender

The other thing you can do is analyse different combinations of dimensions. For example, in the code below I’m querying ga:interestOtherCategory with ga:userGender to reveal what gender differences exist within the interest categories. The topic of my site is male dominated, so the bulk of visitors are male in each category.

payload = {
    'start_date': start_date, 
    'end_date': end_date, 
    'metrics': 'ga:sessions, ga:transactions, ga:transactionRevenue, '
               'ga:transactionsPerSession, ga:revenuePerTransaction',
    'dimensions': 'ga:interestOtherCategory, ga:userGender',
    'sort': '-ga:sessions'
}

df = gp.run_query(service, view, payload)
df.head(20)
interestOtherCategory userGender sessions transactions transactionRevenue transactionsPerSession revenuePerTransaction
0 Hobbies & Leisure/Outdoors/Fishing male 2461 0 0.0 0.0 0.0
1 Sports/Team Sports/Soccer male 1319 0 0.0 0.0 0.0
2 News/Weather male 934 0 0.0 0.0 0.0
3 News/Sports News male 918 0 0.0 0.0 0.0
4 Arts & Entertainment/Celebrities & Entertainme... male 846 0 0.0 0.0 0.0
5 Arts & Entertainment/TV & Video/Online Video male 627 0 0.0 0.0 0.0
6 Food & Drink/Cooking & Recipes male 436 0 0.0 0.0 0.0
7 Travel & Transportation/Hotels & Accommodations male 374 0 0.0 0.0 0.0
8 Home & Garden/Patio, Lawn & Garden/Gardening male 352 0 0.0 0.0 0.0
9 Autos & Vehicles/Boats & Watercraft male 342 0 0.0 0.0 0.0
10 Real Estate/Real Estate Listings/Residential S... male 312 0 0.0 0.0 0.0
11 News/Politics male 302 0 0.0 0.0 0.0
12 Hobbies & Leisure/Outdoors/Fishing female 270 0 0.0 0.0 0.0
13 Arts & Entertainment/Celebrities & Entertainme... female 253 0 0.0 0.0 0.0
14 News/Weather female 216 0 0.0 0.0 0.0
15 Internet & Telecom/Email & Messaging/Email male 211 0 0.0 0.0 0.0
16 Autos & Vehicles/Vehicle Shopping/Used Vehicles male 210 0 0.0 0.0 0.0
17 Pets & Animals/Pets/Dogs male 207 0 0.0 0.0 0.0
18 Sports/Team Sports/Soccer female 206 0 0.0 0.0 0.0
19 Internet & Telecom/Search Engines male 203 0 0.0 0.0 0.0
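
Because male visitors dominate every category, the raw session counts make it hard to see where the gender split actually differs. One option is to pivot the data so each gender gets its own column and then compute the female share per category. The sketch below uses six rows from the output above; the female_pct column name is illustrative.

```python
import pandas as pd

# Sample rows copied from the output above (sessions only)
df = pd.DataFrame({
    'interestOtherCategory': [
        'Hobbies & Leisure/Outdoors/Fishing',
        'Sports/Team Sports/Soccer',
        'News/Weather',
        'Hobbies & Leisure/Outdoors/Fishing',
        'News/Weather',
        'Sports/Team Sports/Soccer',
    ],
    'userGender': ['male', 'male', 'male', 'female', 'female', 'female'],
    'sessions': [2461, 1319, 934, 270, 216, 206],
})

# One row per category, one column per gender
wide = df.pivot_table(index='interestOtherCategory', columns='userGender',
                      values='sessions', aggfunc='sum', fill_value=0)

# Female share of sessions within each category
wide['female_pct'] = (wide['female'] / (wide['female'] + wide['male']) * 100).round(1)
wide = wide.sort_values('female_pct', ascending=False)
print(wide)
```

Sorting by the share rather than by sessions brings the categories with the most balanced audience to the top.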

Interest category, gender, and age bracket

You can also add in the ga:userAgeBracket to get the age brackets added in. Collectively, the various queries above give you a really good general overview of the market segments this site serves, which could be useful when communicating to key stakeholders, can help you target marketing or advertising, and can help you spot potential opportunities with products or technologies. For free, the data are really quite useful.

payload = {
    'start_date': start_date, 
    'end_date': end_date, 
    'metrics': 'ga:sessions, ga:transactions, ga:transactionRevenue, '
               'ga:transactionsPerSession, ga:revenuePerTransaction',
    'dimensions': 'ga:interestOtherCategory, ga:userGender, ga:userAgeBracket',
    'sort': '-ga:sessions'
}

df = gp.run_query(service, view, payload)
df.head(20)
interestOtherCategory userGender userAgeBracket sessions transactions transactionRevenue transactionsPerSession revenuePerTransaction
0 Hobbies & Leisure/Outdoors/Fishing male 65+ 606 0 0.0 0.0 0.0
1 Hobbies & Leisure/Outdoors/Fishing male 55-64 569 0 0.0 0.0 0.0
2 Hobbies & Leisure/Outdoors/Fishing male 45-54 433 0 0.0 0.0 0.0
3 Hobbies & Leisure/Outdoors/Fishing male 35-44 364 0 0.0 0.0 0.0
4 Hobbies & Leisure/Outdoors/Fishing male 25-34 351 0 0.0 0.0 0.0
5 Sports/Team Sports/Soccer male 55-64 284 0 0.0 0.0 0.0
6 Sports/Team Sports/Soccer male 45-54 259 0 0.0 0.0 0.0
7 Sports/Team Sports/Soccer male 25-34 237 0 0.0 0.0 0.0
8 Sports/Team Sports/Soccer male 35-44 235 0 0.0 0.0 0.0
9 News/Weather male 65+ 220 0 0.0 0.0 0.0
10 News/Sports News male 55-64 213 0 0.0 0.0 0.0
11 Sports/Team Sports/Soccer male 65+ 213 0 0.0 0.0 0.0
12 News/Weather male 55-64 208 0 0.0 0.0 0.0
13 Arts & Entertainment/Celebrities & Entertainme... male 55-64 207 0 0.0 0.0 0.0
14 Arts & Entertainment/Celebrities & Entertainme... male 65+ 186 0 0.0 0.0 0.0
15 News/Sports News male 45-54 176 0 0.0 0.0 0.0
16 News/Sports News male 65+ 168 0 0.0 0.0 0.0
17 News/Weather male 45-54 165 0 0.0 0.0 0.0
18 News/Sports News male 35-44 160 0 0.0 0.0 0.0
19 Arts & Entertainment/Celebrities & Entertainme... male 45-54 154 0 0.0 0.0 0.0

Matt Clarke, Tuesday, August 10, 2021

Matt Clarke Matt is a Digital Director who uses data science to help in his work. He has a Master's degree in Internet Retailing (plus two other Master's degrees in different fields) and specialises in the technical side of ecommerce and marketing.
