How to Use Regular Expressions in Google Analytics like a Pro

Let's start

Regular expressions are used in many fields to solve complex data-analysis problems. If you use Google Analytics, you know how much data it can give you, and you need something to help you deal with it. Regular expressions, or RegEx, are a good match here. In this article, I’ll explain how to use them to create pro reports and filters.

Some theory

Before creating and using regular expressions, you should speak their language. So here are the most important “words” that will help you bridge that gap.

. – this symbol matches any single character. Therefore, two such signs match two symbols, etc.

hou.e – house, hou7e, hou)e

ho..e – house, ho5re, ho%be

* – matches zero or more repetitions of the preceding character.

magent* – magen, magent, magentt, magenttt (and, as a partial match, any text that contains magen, such as magento or magenta)

| – the pipe works as OR: it separates alternative patterns, and a match on either side is enough

hou.e|magent* – house, magento
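A quick way to get a feel for these three symbols is to try them outside Google Analytics first. The sketch below uses Python's re module as a stand-in, which is my assumption for illustration: the helper name matches is made up, and the Google Analytics engine may differ in details. It uses re.search because the examples in this article rely on partial matching, where the pattern may be found anywhere in the field.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True if the pattern is found anywhere in the text (partial match)."""
    return re.search(pattern, text) is not None

# . stands for exactly one character of any kind
print(matches(r"hou.e", "house"))  # True
print(matches(r"hou.e", "hou7e"))  # True
print(matches(r"hou.e", "houe"))   # False: the dot needs one character

# * applies only to the character right before it (here the final "t")
print(matches(r"magent*", "magen"))    # True: zero "t"
print(matches(r"magent*", "magentt"))  # True: two "t"
print(matches(r"magent*", "magento"))  # True, but only "magent" is consumed

# | means "either side is enough"
print(matches(r"hou.e|magent*", "magento"))  # True
print(matches(r"hou.e|magent*", "horse"))    # False
```

Note that magento passes only because the match is partial: the pattern itself never accounts for the final "o".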

^ – requires the matching data be at the beginning of the field

^color – matches color21, colorkjir, color-23-67, etc. but does NOT match anything like 9color, ycolor, icolor, etc.

$ – requires the matching data to be at the end of the field

htm$ – matches any-url.htm but NOT any-url.html
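The two anchors are easy to verify with a few strings. This is a minimal sketch in Python's re module (my choice of tool, not something Google Analytics requires); the variable names starts_with_color and ends_with_htm are hypothetical.

```python
import re

starts_with_color = re.compile(r"^color")
ends_with_htm = re.compile(r"htm$")

# ^ pins the match to the start of the field
print(bool(starts_with_color.search("color-23-67")))  # True
print(bool(starts_with_color.search("9color")))       # False

# $ pins the match to the end of the field
print(bool(ends_with_htm.search("any-url.htm")))   # True
print(bool(ends_with_htm.search("any-url.html")))  # False
```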

() – groups the alternative variants of a matching item. Parentheses are usually combined with the pipe.

dear (Mr|Mrs|Ms|Miss) Smith – matches dear Mr Smith, dear Mrs Smith, dear Ms Smith, dear Miss Smith.

Parentheses are also often combined with the dot . and the asterisk * to form (.*), which matches any sequence of characters, including none:

/map/country/(.*) – /map/country/usa, /map/country/usa15, /map/country/belarus/, /map/country/uk/

\ – turns the special RegEx character that follows it into a literal symbol

my-url-(promo|campaign|ref)\.html – matches my-url-promo.html, my-url-campaign.html, my-url-ref.html
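Here is a small sketch of grouping, (.*) and escaping, again in Python's re module as an assumed stand-in for the Google Analytics engine. The variable names and the test strings "dear Dr Smith" and "my-url-promo-html" are mine, added to show a non-match.

```python
import re

greeting = re.compile(r"dear (Mr|Mrs|Ms|Miss) Smith")
print(bool(greeting.search("dear Mrs Smith")))  # True
print(bool(greeting.search("dear Dr Smith")))   # False: "Dr" is not in the group

# (.*) takes whatever follows the fixed prefix, including nothing at all
country = re.compile(r"/map/country/(.*)")
print(country.search("/map/country/belarus/").group(1))  # belarus/

# Without the backslash the dot is a wildcard again
escaped = re.compile(r"my-url-(promo|campaign|ref)\.html")
unescaped = re.compile(r"my-url-(promo|campaign|ref).html")
print(bool(escaped.search("my-url-promo-html")))    # False
print(bool(unescaped.search("my-url-promo-html")))  # True
```

The last two lines show why escaping matters: the unescaped pattern quietly accepts a URL you probably did not mean to include.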

Now that the theory is over, let’s play with Google Analytics filters.

RegEx for Google Analytics filters

The main reason to use regular expressions in Google Analytics is to narrow reports down to the data you need to explore. For example, you have 5000+ URLs and need stats for just 250 of them. If you try to get the stats for these URLs one by one, you’ll get bored soon and spend lots (I really mean LOTS here) of time. Instead, you can use a regular expression to get the stats in a few clicks.

When it comes to Google Analytics filters, you can create them in the existing reports or in custom reports that you create yourself.

The logic behind building a regular expression is to make a list of the needed URLs and find common parts in them. Here are some examples.

URLs in one category

Example 1:

You need all URLs in a single category:

site.com/magento-extensions/color-swatch

site.com/magento-extensions/search

site.com/magento-extensions/rma

site.com/magento-extensions/ajax-cart

site.com/magento-extensions/navigation

But not these:

site.com/promo/magento-extensions/ajax-cart

site.com/promo/magento-extensions/navigation

In these examples you see that /magento-extensions/ is the common part, so you should use it to create your RegEx:

^/magento-extensions/(.*)

^ – anchors the match to the start of the path that follows the host (“site.com” in this example), so URLs with anything in front of /magento-extensions/, such as /promo/, are excluded.
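To see the effect of the leading ^, you can run the pattern over a handful of the paths above. This is a sketch in Python, assuming the field being filtered holds the path without the host, as the explanation of ^ implies; the variable names are hypothetical.

```python
import re

category = re.compile(r"^/magento-extensions/(.*)")

# Paths without the "site.com" host (assumed to be what the filter sees)
paths = [
    "/magento-extensions/color-swatch",
    "/magento-extensions/ajax-cart",
    "/promo/magento-extensions/ajax-cart",
    "/promo/magento-extensions/navigation",
]

kept = [p for p in paths if category.search(p)]
print(kept)
# ['/magento-extensions/color-swatch', '/magento-extensions/ajax-cart']
```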

Example 2:

You need particular URLs within one category:

site.com/courses/acca/part-time.html

site.com/courses/acca/full-time.html

site.com/courses/acca/online.html

But not these:

site.com/courses/acca/part-time-promo.html

site.com/courses/acca/full-time-promo.html

site.com/courses/acca/online-promo.html

You can use the following regular expression:

/acca/(part-time|full-time|online)\.html
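The promo pages drop out because the pattern expects .html right after part-time, full-time or online, and "-promo" gets in the way. A quick check in Python (an assumed stand-in for the Google Analytics engine, with hypothetical variable names):

```python
import re

acca = re.compile(r"/acca/(part-time|full-time|online)\.html")

wanted = [
    "/courses/acca/part-time.html",
    "/courses/acca/full-time.html",
    "/courses/acca/online.html",
]
unwanted = [
    "/courses/acca/part-time-promo.html",
    "/courses/acca/full-time-promo.html",
    "/courses/acca/online-promo.html",
]

print(all(acca.search(p) for p in wanted))    # True
print(any(acca.search(p) for p in unwanted))  # False
```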

URLs from different categories

Example 1:

You need to include these URLs:

site.com/gifts-for-her/mugs

site.com/gifts-for-him/mugs

site.com/gifts-for-her/caps

site.com/gifts-for-him/caps

Here is what you should use:

/gifts-for-her/(mugs|caps)|/gifts-for-him/(mugs|caps)
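Since the two halves differ only in her and him, they can be folded into one more group. The sketch below compares the long form, written with single slashes, against a shorter variant, /gifts-for-(her|him)/(mugs|caps). The shorter pattern is my suggestion rather than part of the example above, and /gifts-for-kids/mugs is a hypothetical URL added to show a non-match.

```python
import re

long_form = re.compile(r"/gifts-for-her/(mugs|caps)|/gifts-for-him/(mugs|caps)")
short_form = re.compile(r"/gifts-for-(her|him)/(mugs|caps)")

paths = [
    "/gifts-for-her/mugs",
    "/gifts-for-him/mugs",
    "/gifts-for-her/caps",
    "/gifts-for-him/caps",
    "/gifts-for-kids/mugs",  # hypothetical URL that should not match
]

for p in paths:
    # Both patterns are expected to agree on every path
    print(p, bool(long_form.search(p)), bool(short_form.search(p)))
```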

Example 2:

To get info on these URLs:

site.com/promotions/promo-banners.htm

site.com/navigation/internal-links.htm

site.com/customers/segmentation.htm

site.com/sales/checkout.htm

you can use this:

/promotions/promo-banners\.htm|/navigation/internal-links\.htm|/customers/segmentation\.htm|/sales/checkout\.htm
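Typing a long alternation by hand is error-prone, so it can help to generate it from the URL list. A minimal sketch in Python, assuming the dot is the only special character in your URLs (if they contain others, such as ? or +, those need escaping too); the names urls and pattern are hypothetical.

```python
import re

urls = [
    "/promotions/promo-banners.htm",
    "/navigation/internal-links.htm",
    "/customers/segmentation.htm",
    "/sales/checkout.htm",
]

# Escape the dots, then join the URLs with the pipe
pattern = "|".join(u.replace(".", r"\.") for u in urls)
print(pattern)

# Every URL in the list should be matched by the combined pattern
print(all(re.search(pattern, u) for u in urls))  # True
```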

Excluding parameters from Google Analytics reports

You will find numerous URLs with parameters, recognizable by characters such as ? and =. They can be generated automatically by your store navigation or appear for reasons you can’t control. To exclude such URLs from your reports, you can use this regular expression:

\?|=
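The question mark has to be escaped because on its own it is a special RegEx character, while = is an ordinary one. Here is a sketch of the exclusion in Python (an assumed stand-in for the Google Analytics engine); the two URLs with parameters are hypothetical.

```python
import re

has_params = re.compile(r"\?|=")

paths = [
    "/magento-extensions/search",
    "/magento-extensions/search?q=cart",  # hypothetical URL with a query string
    "/catalog/color=red",                 # hypothetical URL with a bare "="
]

# An exclusion filter keeps only the rows that do NOT match
clean = [p for p in paths if not has_params.search(p)]
print(clean)  # ['/magento-extensions/search']
```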

Note that inclusion and exclusion filters can be combined in one report, which is quite handy.

IP addresses exclusion

When a team works on a site, each member opens its pages many times a day, which makes the data in Google Analytics inaccurate. You need to see the actions of real users and customers, not of team members. You can exclude internal traffic with an IP exclusion filter at the view level.

Make a list of all the internal IPs and create a regular expression from them. For example:

125.10.156.19

125.10.158.19

345.21.67.890

Here is what you can use:

125\.10\.(156|158)\.19|345\.21\.67\.890

Ranges can also be used here, but I prefer a simpler, more readable structure. It’s also easy to add or delete any IP address from the list if you need to.
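One thing worth testing with IP patterns is what happens to look-alike addresses. If the filter matches partially, the way Python's re.search does (an assumption about the filter field that you should verify in a test view), a pattern without anchors also catches longer addresses that merely contain one of yours. The sketch below compares the pattern above with an anchored variant wrapped in ^( )$; the anchored form is my suggestion. The addresses are the placeholders from this article, and 345.21.67.890 is not a valid IPv4 address.

```python
import re

internal = re.compile(r"125\.10\.(156|158)\.19|345\.21\.67\.890")
anchored = re.compile(r"^(125\.10\.(156|158)\.19|345\.21\.67\.890)$")

print(bool(internal.search("125.10.158.19")))   # True
print(bool(internal.search("125.10.157.19")))   # False: 157 is not in the group
print(bool(internal.search("125.10.156.190")))  # True: partial match on "125.10.156.19"
print(bool(anchored.search("125.10.156.190")))  # False: the whole field must match
print(bool(anchored.search("125.10.156.19")))   # True
```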

You can add your IP filter in Admin > Filters > New Filter > Custom


Important things to remember

  • You should know that view-level filters like IP exclusion (those that change the way your data is collected) cannot be undone: if you’ve mistakenly excluded too many IPs and lost Analytics data for the exclusion period, you won’t get it back by removing the exclusion filter. That’s why you should always keep a test GA view and try new filters there first.
  • Search filters in reports are safe: you can use them in any Google Analytics view. You can also create and save custom reports with them.
  • Your regular expressions should not contain any blanks as any part after a blank is ignored.
  • If you need to create a super big regular expression, create it part by part. You can test each part to make sure it works and then combine them using the pipe |. This way you won’t need to review a huge regular expression in case something goes wrong with it.
  • Testing is always the answer. This will help you in creating a particular RegEx for your site.
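The part-by-part approach can be sketched like this in Python (an assumed stand-in for the Google Analytics engine). Each part is paired with one URL it must match and one it must not, both taken from the examples earlier in this article; the names parts and combined are hypothetical.

```python
import re

# pattern -> (URL that must match, URL that must not match)
parts = {
    r"^/magento-extensions/": (
        "/magento-extensions/rma",
        "/promo/magento-extensions/ajax-cart",
    ),
    r"/acca/(part-time|full-time|online)\.html": (
        "/courses/acca/online.html",
        "/courses/acca/online-promo.html",
    ),
}

# Test every part on its own before combining anything
for pattern, (good, bad) in parts.items():
    assert re.search(pattern, good), f"{pattern} should match {good}"
    assert not re.search(pattern, bad), f"{pattern} should not match {bad}"

# Only then join the tested parts with the pipe
combined = "|".join(parts)
print(combined)
```

If one of the checks fails, you know exactly which part to fix instead of digging through the whole expression.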
