How to use operators in Google Analytics API queries

Google Analytics API query operators let you select or filter specific data from your GA account. Here's a quick guide to using them.

To extract specific data from the Google Analytics API you will often need to use segments and filters to ensure you get the data you want. For example, you might want to find all the data on visitors with the userType of New Visitor, so you’d need to pass this to your Google Analytics API query using an operator.

Operators specify how a dimension or metric in a filter or segment is compared with a value. For example, you can use the == operator to find all the data on visitors with the userType of New Visitor, or the != operator to find all the data on visitors who are not New Visitors.

In this post we’ll look at the different operators and how to use them in your Google Analytics API queries. We’ll be using my GAPandas Python package designed to make it quick and easy to query the Google Analytics API using Python and Pandas.

Google Analytics API query operators

The Google Analytics API includes query operators to handle: equals; does not equal; greater than; less than; greater than or equal to; and less than or equal to. In addition, there are operators that identify whether a string contains a substring or not, or matches a regular expression (or regex) or not.

Since Google Analytics metrics contain numeric data and Google Analytics dimensions contain categorical data, you can only use certain operators on certain types of data. These are known as valid combinations. If you attempt to use an invalid combination, you’ll get an error.

Here’s a summary of the available operators and the data types they will work with.

Operator  Description                 Works with
==        Equals                      Metrics and dimensions
!=        Does not equal              Metrics and dimensions
>         Greater than                Metrics only
<         Less than                   Metrics only
>=        Greater than or equal to    Metrics only
<=        Less than or equal to       Metrics only
=@        Contains substring          Dimensions only
!@        Does not contain substring  Dimensions only
=~        Matches regex               Dimensions only
!~        Does not match regex        Dimensions only
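
To make the valid combinations above easier to check before sending a query, here is a minimal sketch of a lookup built from the table. The names VALID_FIELD_TYPES and is_valid_combination are hypothetical helpers for illustration and are not part of GAPandas or the Google Analytics API.

```python
# Hypothetical lookup built from the operator table; not part of GAPandas.
VALID_FIELD_TYPES = {
    '==': {'metric', 'dimension'},
    '!=': {'metric', 'dimension'},
    '>': {'metric'},
    '<': {'metric'},
    '>=': {'metric'},
    '<=': {'metric'},
    '=@': {'dimension'},
    '!@': {'dimension'},
    '=~': {'dimension'},
    '!~': {'dimension'},
}


def is_valid_combination(operator, field_type):
    """Return True if the operator can be used with a 'metric' or 'dimension'."""
    return field_type in VALID_FIELD_TYPES.get(operator, set())


print(is_valid_combination('>=', 'metric'))  # True
print(is_valid_combination('=~', 'metric'))  # False
```

A check like this only catches mismatched operator and field types locally; the API remains the final authority on what it accepts.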

Using operators with Google Analytics filters

You can apply Google Analytics operators to two places in an API query: the filters and the segment. As the name suggests, filters are used to filter your data and are the most basic of the two.

To use an operator with a Google Analytics filter via the API, you write the metric or dimension with its ga: prefix, followed by your operator and the value. For example, the filter ga:country==United Kingdom will return only sessions where the ga:country dimension was set to United Kingdom.

When passing operators to the API you ordinarily need to URL encode them, so == would become %3D%3D. However, GAPandas will automatically handle the URL encoding for you so you can just enter them in their unencoded form.
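
If you ever need to build the request yourself rather than through GAPandas, the encoding step can be sketched with Python's standard library. This is only an illustration of what URL encoding does to a filter expression; the variable names are made up for the example.

```python
from urllib.parse import quote

filter_expression = 'ga:country==United Kingdom'

# safe='' makes quote() percent-encode every reserved character,
# including ':' and '=', and it turns the space into %20.
encoded = quote(filter_expression, safe='')
print(encoded)  # ga%3Acountry%3D%3DUnited%20Kingdom
```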

Using operators with filters

In the simple example below we’ll use the equals operator == to select all data from the API where the ga:country dimension is equal to United Kingdom. GAPandas will fetch the data from your Google Analytics account and return it in a neatly formatted Pandas dataframe.

import gapandas as gp
service = gp.get_service('client_secrets.json')
view = '1234567'

payload = {
    'start_date': '30daysAgo',
    'end_date': 'today',
    'metrics': 'ga:sessions',
    'dimensions': 'ga:date, ga:country, ga:userType',
    'filters': 'ga:country==United Kingdom'
}

df = gp.run_query(service, view, payload)
df.head()
         date         country           userType  sessions
0  2021-12-26  United Kingdom        New Visitor      2627
1  2021-12-26  United Kingdom  Returning Visitor      3177
2  2021-12-27  United Kingdom        New Visitor      3467
3  2021-12-27  United Kingdom  Returning Visitor      3331
4  2021-12-28  United Kingdom        New Visitor      3562

Chaining multiple filters

If you want to run a more complex filter on your data, such as all the sessions from visitors in the United Kingdom who were using a mobile device running iOS, you can chain multiple filters together.

There are two main ways to chain filters: OR and AND. To combine filters with an OR operator, separate each filter expression with a comma. For example, ga:country==United Kingdom,ga:country==United States will return all sessions from either United Kingdom or United States.

To apply an AND operator to multiple filters, separate them with a semicolon. For example, ga:country==United Kingdom;ga:userType==New Visitor will return only sessions from New Visitors in the United Kingdom. Note that chaining two different values for the same dimension with AND, such as ga:country==United Kingdom;ga:country==United States, would match nothing, since a session cannot have both countries at once. The example below therefore uses the OR form to fetch sessions from both countries.

payload = {
    'start_date': '30daysAgo',
    'end_date': 'today',
    'metrics': 'ga:sessions',
    'dimensions': 'ga:date, ga:country, ga:userType',
    'filters': 'ga:country==United Kingdom,ga:country==United States'
}

df = gp.run_query(service, view, payload)
df.head()
         date         country           userType  sessions
0  2021-12-26  United Kingdom        New Visitor      2627
1  2021-12-26  United Kingdom  Returning Visitor      3177
2  2021-12-26   United States        New Visitor       264
3  2021-12-26   United States  Returning Visitor        41
4  2021-12-27  United Kingdom        New Visitor      3467
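
When filters get longer, it can help to build the string from smaller pieces instead of typing the separators by hand. The sketch below uses three hypothetical helpers (escape_value, any_of and all_of) that are not part of GAPandas. It assumes, as I understand the Core Reporting API documentation, that a literal comma or semicolon inside a value needs a backslash escape, and that OR is evaluated before AND.

```python
def escape_value(value):
    """Backslash-escape ';' and ',' so they are not read as AND/OR separators."""
    return value.replace(';', '\\;').replace(',', '\\,')


def any_of(*conditions):
    """Join conditions with a comma, which the API treats as OR."""
    return ','.join(conditions)


def all_of(*conditions):
    """Join conditions (or OR groups) with a semicolon, which the API treats as AND."""
    return ';'.join(conditions)


countries = any_of('ga:country==United Kingdom', 'ga:country==United States')
filters = all_of(countries, 'ga:userType==New Visitor')
print(filters)
# ga:country==United Kingdom,ga:country==United States;ga:userType==New Visitor

# A value containing a comma is escaped before it goes into the expression
print('ga:pageTitle==' + escape_value('Sale, now on'))
# ga:pageTitle==Sale\, now on
```

The resulting string can be passed as the 'filters' value in the payload. Read with OR binding first, it means sessions from either country that also came from New Visitors.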

Using operators with Google Analytics segments

The other way to select specific data from your Google Analytics account via the API is to use segments. The neat thing about segments is that you can set them to extract data based on either the session or the user, which groups the data differently. You can also chain several conditions together to create very sophisticated queries.

In the example below we’ll create a segment that extracts all sessions with a sessionDuration greater than 90 seconds. To do the same for users, you’d replace sessions::condition:: with users::condition:: before your first dimension or metric.

payload = {
    'start_date': '30daysAgo',
    'end_date': 'today',
    'metrics': 'ga:sessionDuration',
    'dimensions': 'ga:date, ga:country, ga:userType',
    'segment': 'sessions::condition::ga:sessionDuration>90'
}

df = gp.run_query(service, view, payload)
df.head()
         date   country           userType  sessionDuration
0  2021-12-26    France        New Visitor           3759.0
1  2021-12-26  Guernsey  Returning Visitor            104.0
2  2021-12-26   Ireland        New Visitor            463.0
3  2021-12-26    Israel  Returning Visitor            394.0
4  2021-12-26    Jersey        New Visitor            928.0
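
Since the only difference between a session segment and a user segment is the scope prefix, the string can be assembled by a small function. This is a minimal sketch; build_segment is a hypothetical helper for illustration, not a GAPandas function.

```python
def build_segment(scope, condition):
    """Build a segment string such as 'sessions::condition::ga:sessionDuration>90'."""
    if scope not in ('sessions', 'users'):
        raise ValueError("scope must be 'sessions' or 'users'")
    return f'{scope}::condition::{condition}'


session_segment = build_segment('sessions', 'ga:sessionDuration>90')
user_segment = build_segment('users', 'ga:sessionDuration>90')

print(session_segment)  # sessions::condition::ga:sessionDuration>90
print(user_segment)     # users::condition::ga:sessionDuration>90
```

Either string can then be used as the 'segment' value in the payload.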

Next, we’ll combine a few different operators to create a more complex segment. sessions::condition::ga:sessionDuration>90;ga:country==Guernsey,ga:country==Jersey will extract all sessions where the sessionDuration is greater than 90 seconds and the country is either Guernsey or Jersey.

Note that we used a semicolon ; as the AND operator between the two conditions, while the comma , is the OR operator. Because OR takes precedence over AND, the segment selects sessions longer than 90 seconds from either Guernsey or Jersey.

payload = {
    'start_date': '30daysAgo',
    'end_date': 'today',
    'metrics': 'ga:sessionDuration',
    'dimensions': 'ga:date, ga:country, ga:userType',
    'segment': 'sessions::condition::ga:sessionDuration>90;ga:country==Guernsey,ga:country==Jersey'
}

df = gp.run_query(service, view, payload)
df.head()
         date   country           userType  sessionDuration
0  2021-12-26  Guernsey  Returning Visitor            104.0
1  2021-12-26    Jersey        New Visitor            928.0
2  2021-12-26    Jersey  Returning Visitor           3674.0
3  2021-12-27  Guernsey  Returning Visitor            710.0
4  2021-12-27    Jersey        New Visitor            507.0
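
To see how a chained expression is grouped, you can split it locally: first on the semicolon (AND), then on the comma (OR). The sketch below is a simplified illustration of that grouping, assuming OR binds more tightly than AND. split_conditions is a hypothetical helper and does not handle backslash-escaped separators inside values.

```python
def split_conditions(expression):
    """Split on ';' (AND) first, then on ',' (OR), so each inner list is an OR group."""
    return [group.split(',') for group in expression.split(';')]


expression = 'ga:sessionDuration>90;ga:country==Guernsey,ga:country==Jersey'
groups = split_conditions(expression)
print(groups)
# [['ga:sessionDuration>90'], ['ga:country==Guernsey', 'ga:country==Jersey']]
```

Every inner list must have at least one matching condition, and all of the inner lists must match for a session to be included.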

Matt Clarke, Tuesday, January 25, 2022

Matt Clarke Matt is an Ecommerce and Marketing Director who uses data science to help in his work. Matt has a Master's degree in Internet Retailing (plus two other Master's degrees in different fields) and specialises in the technical side of ecommerce and marketing.