Google Analytics Architecture Explained for Beginners
To reach the top step you have to climb the first ones. The same goes for mastering Google Analytics: you first need a basic understanding of how it works, that is, how it tracks website visitors, processes the data, and presents it in well-formatted reports.
This article discusses the basic architecture behind Google Analytics. The architecture breaks down into four steps, which can also be called the pillars of Google Analytics: collection, configuration, processing, and reporting.
To understand data collection as a whole, you must first understand each step involved. And to understand how Google Analytics collects data, you need to understand the concepts of hits, sessions, and users.
A hit is the atomic unit of information in Google Analytics, the most granular piece of data it records. Every time a user interacts with the website, a hit request is generated. A hit captures all of the information about the interaction at that exact moment, a snapshot of information, and sends it to the collection server.
A hit doesn’t need to make you see stars in daylight. The interaction can be as simple as a user viewing a page or as significant as a purchase on the website; each time, a new snapshot of what is happening is sent to the GA server. The GA server combines this hit data into sessions, and these sessions are then tied to a particular user by matching the client ID saved in the browser cookie.
Whenever a new visitor lands on your website, the analytics tracking code assigns a unique client ID. The client ID is like DNA in the Analytics world: no two visitors are expected to share the same one. Analytics uses a single first-party cookie named _ga to store the client ID, so the next time the visitor lands on your site from the same browser, they are tracked as a returning visitor based on the information saved in the cookie.
Apart from user interaction data, a hit also combines data from other sources such as the IP address, server log files, and ad-serving data. From these additional sources, GA gets further information about the user, for example location, browser, operating system, age, gender, and the referral source. When processed later, this information helps in creating more detailed reports. (The information sent in a hit is usually processed into dimensions.)
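To make the cookie idea concrete, here is a minimal sketch of pulling the client ID out of a _ga cookie value. It assumes the commonly seen layout `GA1.<n>.<random number>.<timestamp>`, where the last two fields together form the client ID; the function name and the sample value are made up for illustration.

```python
def client_id_from_ga_cookie(cookie_value: str) -> str:
    """Return the client ID portion of a _ga cookie value.

    Assumes the layout GA1.<n>.<random>.<timestamp>; this layout is an
    assumption for illustration, not something stated in the article.
    """
    parts = cookie_value.split(".")
    if len(parts) < 4:
        raise ValueError(f"unexpected _ga cookie format: {cookie_value!r}")
    # The last two fields (random number and first-visit timestamp)
    # are joined with a dot to form the client ID.
    return ".".join(parts[-2:])


# Made-up cookie value, for illustration only.
sample_cookie = "GA1.2.1234567890.1600000000"
print(client_id_from_ga_cookie(sample_cookie))  # 1234567890.1600000000
```

As long as the cookie survives in the browser, the same client ID is read back on every visit, which is what lets Analytics recognise a returning visitor.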
A hit is a simple URL string with query parameters containing useful information. A hit is sent via GET or POST methods.
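As a rough illustration of what such a URL string can look like, the sketch below builds a pageview hit using Universal Analytics Measurement Protocol parameter names (`v`, `tid`, `cid`, `t`, `dp`). The tracking ID, client ID, and page path are placeholders, the helper names are hypothetical, and nothing is actually sent over the network.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

COLLECT_ENDPOINT = "https://www.google-analytics.com/collect"


def build_pageview_hit(tracking_id: str, client_id: str, page_path: str) -> str:
    """Build the URL for a pageview hit (illustrative helper, not sent anywhere)."""
    params = {
        "v": "1",            # protocol version
        "tid": tracking_id,  # property (tracking) ID
        "cid": client_id,    # client ID read from the _ga cookie
        "t": "pageview",     # hit type
        "dp": page_path,     # document path of the page viewed
    }
    return f"{COLLECT_ENDPOINT}?{urlencode(params)}"


def hit_parameters(hit_url: str) -> dict:
    """Decode a hit URL back into a flat {parameter: value} dictionary."""
    query = parse_qs(urlsplit(hit_url).query)
    return {name: values[0] for name, values in query.items()}


# Placeholder values for illustration.
hit_url = build_pageview_hit("UA-XXXXX-Y", "1234567890.1600000000", "/pricing")
print(hit_url)
print(hit_parameters(hit_url))
```

Each query parameter is one small fact about the interaction, which is why a single hit can be read as a snapshot of that moment.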
Configuration can be thought of as the settings you apply to customize the data being collected. It is about setting up the rules for data processing, which includes configuring Google Analytics features such as goals, filters, and custom dimensions and metrics.
You can also unlock powerful features at the property level, such as demographics and interest reports, in-page analytics, and enhanced link attribution. All of these features help you define your data in Analytics and enable you to analyze it more critically.
Once a hit carrying the interaction information reaches Google Analytics, the data it contains is processed. It is important to understand how data gets processed in order to make more informed decisions about data collection.
Google Analytics processes collected data in the following three steps.
– New vs Returning Users
– Sessions
– Other data sources
Identifies New vs Returning User
The first thing Google Analytics does is identify the user type: new or returning. It does this with the help of the information stored in the browser cookie. If it finds an existing client ID in the cookie, it identifies the visitor as a returning user; otherwise it assigns a client ID and identifies the visitor as a new user.
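The decision described above can be sketched in a few lines. This is a simplified model, not GA's actual implementation: the browser's cookie jar is a plain dictionary, the key name `client_id` is hypothetical, and the ID format (a random number plus a timestamp) is an assumption.

```python
import random
import time


def identify_visitor(cookies: dict, now=None):
    """Return (user_type, client_id), storing a new ID in `cookies` if none exists."""
    now = int(time.time()) if now is None else now
    client_id = cookies.get("client_id")
    if client_id:
        # A client ID is already stored: this browser has been seen before.
        return "returning", client_id
    # No ID yet: create one (hypothetical format) and remember it.
    client_id = f"{random.randint(1, 2**31 - 1)}.{now}"
    cookies["client_id"] = client_id
    return "new", client_id


browser_cookies = {}  # stands in for the browser's cookie storage
first_visit = identify_visitor(browser_cookies)
second_visit = identify_visitor(browser_cookies)
print(first_visit[0], second_visit[0])  # new returning
```

If the cookie is cleared or the visitor switches browsers, the lookup finds nothing, so the same person would be counted as a new user again.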
After the visitors are grouped into user types, their hits are grouped into sessions. To analyze the web traffic on the site, GA groups the hits generated within a particular time frame into a session. When GA detects that the user is no longer active, it ends the session and starts a new one when the user comes back. By default, this time frame is set to 30 minutes, but it can be changed in the configuration step to suit the objective and purpose of the website. With the data organized into visitors and sessions, GA can now calculate metrics such as bounce rate, pages per session, and time on site.
After the data is categorized into sessions, the next step in data processing is joining data from the other sources you have specified. This can be data sent through the Measurement Protocol, that is, data coming from any other internet-connected device, or data from another marketing tool such as AdWords.
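The grouping of hits into sessions can be sketched as follows. This is a simplified model under stated assumptions: it only considers the inactivity timeout, it treats a gap of exactly 30 minutes as the start of a new session, and it counts a session with a single hit as a bounce. The function names and sample timestamps are made up for illustration.

```python
from datetime import datetime, timedelta

DEFAULT_TIMEOUT = timedelta(minutes=30)


def group_into_sessions(hit_times, timeout=DEFAULT_TIMEOUT):
    """Group one user's hit timestamps into sessions based on inactivity."""
    sessions = []
    for ts in sorted(hit_times):
        if sessions and ts - sessions[-1][-1] < timeout:
            # Still active: the hit belongs to the current session.
            sessions[-1].append(ts)
        else:
            # First hit, or the user was inactive too long: start a new session.
            sessions.append([ts])
    return sessions


def bounce_rate(sessions):
    """Share of sessions that contain exactly one hit."""
    if not sessions:
        return 0.0
    return sum(1 for s in sessions if len(s) == 1) / len(sessions)


def hits_per_session(sessions):
    """Average number of hits per session."""
    if not sessions:
        return 0.0
    return sum(len(s) for s in sessions) / len(sessions)


# Made-up hit times for one user on a single day.
hits = [
    datetime(2024, 1, 15, 10, 0),
    datetime(2024, 1, 15, 10, 10),
    datetime(2024, 1, 15, 10, 50),  # 40 minutes after the previous hit
    datetime(2024, 1, 15, 13, 0),
]
sessions = group_into_sessions(hits)
print(len(sessions))  # 3
print(bounce_rate(sessions), hits_per_session(sessions))
```

Passing a different `timeout` shows how changing the session setting changes the session count, and with it every metric that is calculated per session.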
Reporting gives you access to all the processed data as visual reports through the web interface, and also lets you retrieve the processed data through the Reporting API.
In the web interface, you can look at reports of various types, including Real-Time, Acquisition, Audience, Behavior, and Conversions.
GA also offers great flexibility beyond these default reports: you can create your own custom reports to analyze the dimensions and metrics of your choice together.
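Conceptually, a custom report picks some dimensions to group by and a metric to total up. The sketch below illustrates that idea on a handful of made-up session rows; the field names, the sample values, and the `custom_report` helper are hypothetical and do not reflect how GA stores or queries data.

```python
from collections import defaultdict

# Made-up processed sessions, each with two dimensions and one metric.
session_rows = [
    {"country": "US", "device": "mobile", "pageviews": 3},
    {"country": "US", "device": "desktop", "pageviews": 5},
    {"country": "US", "device": "mobile", "pageviews": 1},
    {"country": "IN", "device": "mobile", "pageviews": 2},
]


def custom_report(rows, dimensions, metric):
    """Total `metric` for every combination of the chosen `dimensions`."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[name] for name in dimensions)
        totals[key] += row[metric]
    return dict(totals)


report = custom_report(session_rows, ("country", "device"), "pageviews")
for combination, pageviews in sorted(report.items()):
    print(combination, pageviews)
```

Swapping the `dimensions` tuple for a single dimension, such as `("country",)`, gives the kind of one-dimensional breakdown the default reports show.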
By now, you should have a good understanding of the Google Analytics architecture and be ready to kick off your Google Analytics learning plan. Understanding the fundamentals will always help you gain new insights from your data.