More Users than Sessions? Impossible! 172 Pageviews, but 0 Sessions? Probably an issue with the implementation. Unfortunately, it is the weird way in which Google Analytics, to this day, interprets Sessions, formerly known as Visits.
I was about to finish the second part of my Comparison of Custom Variables in Adobe, Webtrends and Google Analytics, but now this has to wait a bit. While writing about GA’s Custom Dimensions, I kept stumbling over the one reporting issue (besides sampling) that has been hounding and annoying me for years.
Since this issue is a bit complex, I decided to dedicate an entire article to it. I am talking about the weird Sessions metric in Google Analytics (formerly “Visits”, but with the same deficiencies).
The Root of Most Misinterpretations
Google Analytics has a general strength that is also a deficiency. It makes everything look super-easy and user-friendly – that makes it easier for newcomers. But it also makes people who have no idea of Analytics think they can use it without any deeper training. The devil is in the details, and these details can be very complex, especially with the often-misused Sessions metric. I would go as far as to say that this Sessions deficiency causes most Analysts and companies to interpret their data wrongly! The issue has been around forever, and I wonder when Google will finally fix this terribly misleading metric.
What’s Wrong Here?
Let’s start with a common screenshot. A report on a Hit-based Custom Dimension (I could show a Page or a Page Title or any other Hit-based Dimension here): What is wrong here?
As any Analyst learns in kindergarten: Users ≤ Sessions ≤ Pageviews. So why the heck do we have more Users here than Sessions? Did some people sit at the computer together while visiting this site – so they “shared their session”? It would be an interesting challenge to work out how such a “shared session” could be tracked at all, and certainly, Google knows a lot about you and your habits, but such session sharing is out of even Google’s reach. So in “normal” Analytics theory, more Users than Sessions should never happen.
Pageviews from Outer Space
I could provide many more examples where clients or even less experienced Analysts of my own team (including myself) created Custom Reports with Sessions and a Hit-based Dimension. In many cases, they didn’t add “Users” as a metric, so the problem often went unnoticed. Enter the next screenshot:
This report shows which options a user chose to filter a list. Positions 1 to 3 look ok, but in 4 to 6 we have the same problem as in our first example: More Users than Sessions. And “5. Un” is an even more extreme example because it has 0 Sessions, but 172 Pageviews! So what happened to those 172 Pageviews for “5. Un”? Did they fly in from outer space? Don’t we all agree that any Hit, and therefore also a Pageview, can only occur within a Session?
“Sessions” are actually “Entrances” (almost)!
Interestingly, even seasoned Analysts often do not know that GA counts Sessions for Hit-based dimensions (i.e., also for Pages!) only when the Hit that set the dimension was the first Hit in the Session. So why does Google call it “Sessions” when it actually is “Entrances”? And what does GA’s separate “Entrances” metric mean then? The difference is tiny, as this eye-opening article from a couple of years ago explained to me back then (I am grateful to this day to Jordan Louis!):
“Entrances count the first hit of the session that is also a page, while [Sessions] captures the first hit of the session, even if it is an Event, transaction, or any other type of hit.”
So both can happen: More Entrances than Sessions, or more Sessions than Entrances. The Google Analytics help section has in the meantime added quite a good explanation as well.
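To make the difference tangible, here is a minimal Python sketch of the counting rule as the quote above describes it. It is a simplified model, not GA's actual processing logic: the sample sessions, the hit tuples and the function name `ga_style_counts` are all invented for illustration.

```python
from collections import Counter

# Hypothetical sample data: each session is an ordered list of hits,
# and each hit is (hit_type, page). Invented for illustration only.
sessions = [
    [("pageview", "/home"), ("pageview", "/women"), ("pageview", "/women")],
    # Session starts with an Event fired on /home, then the user moves on.
    [("event", "/home"), ("pageview", "/women"), ("pageview", "/cart")],
    [("pageview", "/women"), ("event", "/women")],
]

def ga_style_counts(sessions):
    """Count Sessions, Entrances and Pageviews per page (simplified model)."""
    session_count = Counter()
    entrance_count = Counter()
    pageview_count = Counter()
    for hits in sessions:
        if not hits:
            continue
        # "Sessions": credited to the value on the first hit, whatever its type.
        session_count[hits[0][1]] += 1
        # "Entrances": credited to the first hit that is a pageview.
        first_pageview = next((h for h in hits if h[0] == "pageview"), None)
        if first_pageview is not None:
            entrance_count[first_pageview[1]] += 1
        # Pageviews: every pageview hit counts, wherever it occurs.
        for hit_type, page in hits:
            if hit_type == "pageview":
                pageview_count[page] += 1
    return session_count, entrance_count, pageview_count

session_count, entrance_count, pageview_count = ga_style_counts(sessions)
for page in sorted(pageview_count):
    print(page, session_count[page], entrance_count[page], pageview_count[page])
```

In this toy data, /home ends up with more Sessions than Entrances, /women with more Entrances than Sessions, and /cart with a Pageview but 0 Sessions, which is the same pattern as in the reports above.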
Here is an older screenshot I saved from the time when the metric was called “Visits” and “Users” were “Unique Visitors” (again, also compare Visits to the enormous number of Unique Visitors):
And here is a newer example where we have slightly more Entrances than Sessions:
These differences between Sessions and Entrances are just a side note here and usually not of greater concern. So let’s get back to our main problem: The deficient Sessions metric.
Since that evil Sessions metric seems so simple and would be so useful (a deduplicated count of the same value in a session is quite a common reporting demand), Custom Reports of unenlightened Analysts usually abound with Sessions to measure Hit-based Dimensions. And these reports in turn lead to wrong conclusions until someone wonders why this data is so “weird” and starts questioning the implementation.
But it’s never the implementation, it’s always the way Google Analytics processes the data!
Why only Google needs Bastard Metrics like “Unique Pageviews” and “Unique Content Group Views”
In my first years with Google Analytics, I often wondered why there were such strange metrics like “Unique Pageviews” – later came the equally weird “Unique Content Group Views”. Why can’t we simply use “Visits” to see which Page was viewed once or more during a user’s Visit? I then learned that Visits for Pages can be applied in Custom Reports, but that you shouldn’t do that because that does not produce sensible results. I didn’t really understand why, and a later article by Avinash Kaushik also only scratched the surface: It had something to do with the general advice that you shouldn’t combine Session-level Metrics with Hit-based Dimensions. That in itself is no problem and something you need to be aware of in any Analytics tool. And it makes sense for example not to add “Transactions” (a session-level metric) when viewing a report for Pages (unless you want to see the pages on which transactions occurred). But it makes no sense for “Sessions”, a metric where all other Analytics tools show what you’d expect: A deduplicated count of occurrences per Session for a value.
As even Avinash could not solve the mystery that was the Sessions metric for me back then, it was Jordan Louis who later enlightened me (see link above). Now I also knew why GA needs the weirdo “Unique Pageviews” metric, and why “Unique Content Group Views” had to follow later.
An example: If Content Group “women” was viewed 7 times during a Session, then you usually would like your Analytics tool to report Sessions = 1, Pageviews = 7. GA instead says: Count a Session only if that Content Group was the first the user viewed in his Session – so a user may visit the women section but still show up as 0 Sessions. So Sessions could be 1, it could also be 0 for this case.
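The Content Group example can be sketched in a few lines of Python. This is only an illustration under a simplified model: the two sample sessions and the function names `expected_sessions`, `ga_style_sessions` and `pageviews` are assumptions of mine, not anything GA exposes.

```python
# Hypothetical data: each session is the ordered list of Content Groups
# seen on its pageviews. Both sessions view "women" 7 times.
sessions = [
    ["home"] + ["women"] * 7,  # enters via "home", then browses "women"
    ["women"] * 7,             # enters directly in "women"
]

def expected_sessions(sessions, group):
    """Deduplicated count: sessions in which the group occurred at least once."""
    return sum(1 for s in sessions if group in s)

def ga_style_sessions(sessions, group):
    """Simplified GA-style count: only sessions whose first hit had the group."""
    return sum(1 for s in sessions if s and s[0] == group)

def pageviews(sessions, group):
    """Every view of the group counts."""
    return sum(s.count(group) for s in sessions)

print(expected_sessions(sessions, "women"))  # sessions that saw "women"
print(ga_style_sessions(sessions, "women"))  # sessions that started in "women"
print(pageviews(sessions, "women"))          # total views of "women"
```

The first function is what you would usually expect from a Sessions metric (2 here), while the entrance-style rule credits only the session that started in “women” (1 here), even though both sessions produced 7 Pageviews each.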
Session-based Analysis in Google Analytics is quite Limited
That means: To show something as basic as a deduplicated count of a value during a session, GA needs a special session-deduplicating metric each and every time. In this case we have “Unique Content Group Views” which, in our example here shows something much more sensible than Sessions:
But apart from Unique Pageviews and Unique Content Group Views, there are not many other such metrics. That makes the problem endemic because we keep using more and more Custom Dimensions for which we do not have special “Unique” metrics. This ultimately means that you often cannot do much sensible session-based analysis, and simple questions like “in how many sessions did this value occur at least once?” can often not be answered without resorting to an endless number of Advanced Segments and thus more time spent getting your report data – and of course more sampling and thus more distorted data (see http://www.webanalyticsworld.net/2014/05/why-pay-for-a-tool-when-there-is-google-analytics.html).
Why not Simply use a Dimension Scoped to Session-Level then?
“Ok”, you may say, “then I just create a new Custom Dimension scoped to session-level. Into this new Dimension, I track the same value which I am already tracking into my Hit-based Dimension that had the issue with those evil Sessions”.
The good: For Session-scoped Dimensions, GA does indeed show Sessions even if a value was not set on the first Hit.
The bad: If the Dimension gets set several times with various values during one session, only the last value prevails. As GA itself writes for Session-scoped Dimensions:
“When two values with session scope are set at the same index in a session, the last value set gets precedence and is applied to all hits in that session.”
In general, this makes sense and is nothing really “bad”. Other tools are more flexible here, but this “last value wins” principle is clear and understandable. However, together with the deficiency of the Sessions metric, it leads to more problems:
The Forms that Disappeared
Say you want to know which promotion ads a user has seen in his session before converting. When you track these promotion names into a Custom Dimension of session scope, only the last one will count, i.e. all Conversions will be attributed to this last ad. Now some clients have ads even in their shopping cart. That means that the ad in the shopping cart needs to be tracked with a different mechanism, or it will always be the one that gets the conversion. (With Enhanced Ecommerce Promotions, this can now be solved to a degree, but it means a lot of work!)
Another example: You are a bank and have a plethora of forms on your site, too many to track all of them as individual goals. It is common for your users to fill in more than one form during a session. You want to track the form name into a Session-based Custom Dimension and use a Custom Metric “Form Completions” to track the number of forms sent. So what happens if the user fills in two forms? The Session-scoped Custom Dimension for the Form Name will only report the last form that was filled in. The Custom Metric “Form Completions”, however, will count 2 Completions. So it looks as if that second form was filled in twice when in fact one of the two completions stemmed from an entirely different form.
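The form example can be reproduced with a small Python sketch of the “last value wins” rule quoted above. The form names, the hit structure and both function names are hypothetical, and the model is deliberately simplified. It is not how GA processes data internally.

```python
from collections import Counter

# Hypothetical hits of ONE session: (form_name, form_completions).
hits = [("loan_application", 1), ("account_opening", 1)]

def session_scoped_report(hits):
    """Session scope: the last value set is applied to all hits of the session."""
    report = Counter()
    if not hits:
        return report
    last_value = hits[-1][0]
    for _, completions in hits:
        # Every completion is attributed to the last form name.
        report[last_value] += completions
    return report

def hit_scoped_report(hits):
    """Hit scope: each completion stays with the form name set on its own hit."""
    report = Counter()
    for form_name, completions in hits:
        report[form_name] += completions
    return report

print(dict(session_scoped_report(hits)))
print(dict(hit_scoped_report(hits)))
```

Under the session-scoped rule, “account_opening” is reported with 2 Form Completions and “loan_application” disappears entirely. The hit-scoped variant keeps one completion per form, but in GA it then runs into the Sessions problem described earlier.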
We wouldn’t have this problem if the Sessions metric worked fine. Then we could simply set the form name into a Hit-based Custom Dimension and count the number of Sessions in which any of the forms were viewed or filled in. This is absolute standard in other Analytics tools, so I cannot understand why Google cannot do it the same way. Another solution would be to have a more flexible scope for Custom Dimensions – say like in Adobe Analytics where you can set the scope of a dimension (eVar) to anything you want, e.g. you can even say that a dimension’s value should persist until some event happens (e.g. another Content Group View or when a form is completed).
Not all is lost
To finish this off, let’s give you some hope (thanks to @peter_oneill for pointing this out): There IS a metric which, in many cases, seems to give you what we would expect from the “Sessions” metric. It is “Unique Events”. Long known from the Event Tracking Report, its description to this day reads: “The number of unique events per category, action, or label”. Google should add: “per Session”. And Google should write that this metric applies not only to Event Hits and Event Dimensions, but also to Pageview Hits and most other Dimensions. It cannot be combined with some metrics like Unique Pageviews, but for many purposes, it seems to do the job – an official confirmation from Google or a change of this terribly misleading metric description would be much appreciated. So far, I remain reluctant to recommend a metric that officially says it does something other than what we see it do when we use it.
Your Sessions Experience
Have you also had your fair share of problems with the Sessions metric in GA? What other issues should Google Analytics improve on? I am glad to read your feedback.