Written by Jose Vicente
Cohort analysis is a widely used technique in medicine: a group of healthy people is gathered and followed over a period of time, in order to study how a disease behaves in relation to their exposure to certain risk factors.
Applied to web analytics, it gives us a way to know whether a group of users characterised by a particular behaviour brings value to our business in the long term.
Experts talked about this kind of analysis and its impact on web analytics at the Google Analytics User Conference Spain 2013, which we attended this past May, and on the official Google Analytics blog. This new feature and other changes will soon be rolled out in the advanced segmentation of Google Analytics, providing everything needed to carry out this kind of analysis.
What is a cohort?
A cohort is a group of people who share a characteristic or common experience, within an established time frame. It’s important to highlight that this common characteristic must happen within a very well defined time period, but it doesn’t have to take place simultaneously with the analysis.
Examples of this technique are very easy to find in medicine. A cohort could be a group of people who started smoking during the 1990s and who are now the subject of a lung cancer study carried out between 2010 and 2020.
Now, the question is: how can we apply cohorts to an e-commerce store, for example? A good example could be a behaviour analysis of customers who made their first purchase during sale season. We could define the cohort as users who made their first purchase between 1 and 31 July, and then study their behaviour over the course of the following six months. During these six months, we could observe whether this cohort:
- Returns to our store.
- Makes purchases when it’s not sale season.
- Spends the same amount of money as other users.
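To make the idea concrete, here is a minimal sketch of the same cohort logic applied to raw order data outside Google Analytics. The record shape (userId, date, amount), the function names and the sample values are all assumptions made for illustration; dates are plain 'YYYY-MM-DD' strings, which sort chronologically when compared as text.

```javascript
// Hypothetical order records: { userId: string, date: 'YYYY-MM-DD', amount: number }.

// Earliest purchase date per user.
function firstPurchaseDates(orders) {
  var first = {};
  orders.forEach(function (o) {
    if (!(o.userId in first) || o.date < first[o.userId]) {
      first[o.userId] = o.date;
    }
  });
  return first;
}

// Cohort = users whose FIRST purchase falls inside [start, end], inclusive.
function defineCohort(orders, start, end) {
  var first = firstPurchaseDates(orders);
  return Object.keys(first).filter(function (id) {
    return first[id] >= start && first[id] <= end;
  });
}

// Summarise what the cohort did during a later observation window.
function observeCohort(orders, cohort, obsStart, obsEnd) {
  var returning = {};
  var revenue = 0;
  var count = 0;
  orders.forEach(function (o) {
    var inWindow = o.date >= obsStart && o.date <= obsEnd;
    if (inWindow && cohort.indexOf(o.userId) !== -1) {
      returning[o.userId] = true;
      revenue += o.amount;
      count += 1;
    }
  });
  var returned = Object.keys(returning).length;
  return {
    cohortSize: cohort.length,
    returnedUsers: returned,
    returnRate: cohort.length ? returned / cohort.length : 0,
    averageOrderValue: count ? revenue / count : 0
  };
}

// Illustrative data only.
var sampleOrders = [
  { userId: 'a', date: '2013-07-05', amount: 40 },
  { userId: 'a', date: '2013-09-10', amount: 60 },
  { userId: 'b', date: '2013-07-20', amount: 30 },
  { userId: 'c', date: '2013-05-02', amount: 80 },
  { userId: 'c', date: '2013-07-15', amount: 20 },
  { userId: 'd', date: '2013-08-01', amount: 50 }
];

var cohort = defineCohort(sampleOrders, '2013-07-01', '2013-07-31');
// cohort is ['a', 'b']: user 'c' bought in July too, but first purchased in May.

var summary = observeCohort(sampleOrders, cohort, '2013-08-01', '2014-01-31');
// summary: { cohortSize: 2, returnedUsers: 1, returnRate: 0.5, averageOrderValue: 60 }
```

Running the same summary for users outside the cohort would give the baseline to compare against.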
Analysing cohorts with Google Analytics
Presently, performing a cohort analysis in Google Analytics is complicated for two reasons:
- We cannot establish time frames in our segments.
- Google Analytics segmentation is applied to visits, not users.
We can overcome these obstacles by using custom variables. If we define a custom variable holding the date of each user’s first purchase, we can define advanced traffic segments according to this value.
// Scope 1 = visitor level; call this before _trackPageview so the value is sent with the hit.
_gaq.push(['_setCustomVar', 1, 'FirstPurchaseDate', 'DDMMYYYY', 1]);
Given that this custom variable is defined at visitor (user) level, which is what the final argument of 1 specifies, the second problem would thus be solved. However, this method is not foolproof and has issues of its own, namely:
- We lose a user’s custom variable information when they delete their cookies.
- If a user is not logged in to our store, we won’t be able to establish the date of their first purchase.
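As an illustration of how the value could be generated rather than typed by hand, the sketch below formats a date as DDMMYYYY and queues the custom variable. The helper names formatDDMMYYYY and recordFirstPurchase are hypothetical, and we assume the code runs only on the confirmation page of a user’s first purchase, before the _trackPageview call.

```javascript
// The asynchronous ga.js command queue; it is a plain array until ga.js loads.
var _gaq = _gaq || [];

// Format a Date as DDMMYYYY, e.g. 5 July 2013 -> "05072013".
function formatDDMMYYYY(date) {
  var dd = ('0' + date.getDate()).slice(-2);
  var mm = ('0' + (date.getMonth() + 1)).slice(-2); // getMonth() is zero-based
  return dd + mm + date.getFullYear();
}

// Hypothetical helper: queue the visitor-level custom variable.
// Assumption: the store only calls this for a user's FIRST purchase.
function recordFirstPurchase(date) {
  _gaq.push(['_setCustomVar',
    1,                     // slot
    'FirstPurchaseDate',   // variable name
    formatDDMMYYYY(date),  // value, e.g. "05072013"
    1                      // scope: 1 = visitor level
  ]);
}

// Example: a first purchase made on 5 July 2013 (months are zero-based in Date).
recordFirstPurchase(new Date(2013, 6, 5));
```

Zero-padding the day and month keeps every value exactly eight characters long, which is what makes matching on the end of the string reliable.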
Once we have the necessary data, we can create the cohort with an advanced segment, in which we select our custom variable “FirstPurchaseDate” and match the established time frame in July. Since we stored the date in DDMMYYYY format, the value only needs to end with ‘072013’ for us to see users who made their first purchase in July 2013.
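The matching rule can be checked outside the tool with a small sketch like the one below. The function name firstPurchasedIn is hypothetical, and using the equivalent pattern ^\d{2}072013$ inside a segment assumes the segment condition accepts regular expressions.

```javascript
// Returns true when a DDMMYYYY value falls in the given month and year.
function firstPurchasedIn(value, month, year) {
  var mm = ('0' + month).slice(-2);
  // Two digits for the day, then the month and year, and nothing else.
  var pattern = new RegExp('^\\d{2}' + mm + year + '$');
  return pattern.test(value);
}

var values = ['05072013', '31072013', '15082013', '07201320'];

// Keep only the values that belong to the July 2013 cohort.
var julyCohortValues = values.filter(function (v) {
  return firstPurchasedIn(v, 7, 2013);
});
// julyCohortValues is ['05072013', '31072013']
```

Anchoring the pattern at both ends is slightly stricter than a plain "ends with" condition, since it also rejects values that are not exactly eight digits long.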
Once we’ve defined this advanced traffic segment, we will be able to analyse it to obtain data such as:
- Do they spend more or less than other traffic segments?
- Do they still buy the same type of products?
- Which traffic sources brought them to our website when they made their first purchase?
The easiest way to answer these questions is by comparing this data with other traffic segments, in order to find out whether the analysed cohort brings any long-term value to our business.
How are we going to perform the cohort analysis in the future with Google Analytics?
Google has announced new features for advanced segmentation in Google Analytics. Among them, we highlight the ones that solve our problems when performing cohort analyses:
- User segmentation: as we’ve mentioned previously, advanced segments are currently based on sessions. This new user segmentation option will enable us to select all the sessions of users who fit certain criteria, such as demographics or specific behaviour. The new feature can be used in combination with the session-based features that already exist in the tool.
- Cohort definition: date ranges can now be added to advanced segments. This allows us to define cohorts of users with a specific behaviour during an established period of time, which saves us from making additional changes to our Google Analytics tracking code to obtain this data.
These changes will become available in Google Analytics in the coming months, and as usual, the update will be rolled out gradually, with some accounts receiving it before it reaches others.
The ability to perform this sort of analysis based on user behaviour, such as the one described in this post, represents one of the biggest conceptual changes for this tool. We will continue to try it out once we have access to these new features.