Google Analytics is just wrong!

The other day, I was looking through TubeBattle’s syslogs…

I don’t get much time, but I try to scan them quickly once a week, just to see who is trying to hack that server (you’d be amazed at the HTTP calls people try to make!). Anyway, as I was looking through them, it occurred to me that I was getting way more hits than Google Analytics was reporting!

My syslogs were coming back with the following statistics:

  • 5,927 unique visitors for the last 14 days of January (Google was reporting 5,556)
  • 16,684 page views for the last 14 days of January (Google was reporting 15,704)
  • 8,027 unique visitors for the first 9 days of February (Google was reporting 4,215)
  • 18,684 page views for the first 9 days of February (Google was reporting 9,452)

A quick calculation shows that Google was under-reporting by:

  • 6.3% unique visitor count in January
  • 47.5% unique visitor count in February
  • 5.9% page view count in January
  • 49.4% page view count in February
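For anyone who wants to check the arithmetic, here is a minimal sketch of the calculation. It assumes "under-reporting" means the share of log-recorded traffic that is missing from Google Analytics, measured against the syslog count; the function and variable names are made up for illustration.

```python
# Counts quoted in the post: (syslog count, Google Analytics count)
counts = {
    "unique visitors, January": (5927, 5556),
    "unique visitors, February": (8027, 4215),
    "page views, January": (16684, 15704),
    "page views, February": (18684, 9452),
}


def under_report_pct(log_count, ga_count):
    """Percentage of log-recorded traffic that Google Analytics did not report."""
    return (log_count - ga_count) / log_count * 100


for label, (log_count, ga_count) in counts.items():
    print(f"{label}: {under_report_pct(log_count, ga_count):.1f}%")
```

Dividing by the Google Analytics figure instead would give larger percentages, so it is worth stating which count is the baseline whenever you quote numbers like these.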

In summary, it looked like Google was under-reporting more and more as TubeBattle got busier.

So off I went to investigate the cause (using Google Search, ironically), and this is a summary of what I discovered:

  • Google Analytics requires the browser to make a JavaScript call to a Google server to register the visit.
    1. That call can get “lost”: if the Google Analytics servers are busy or offline, the visit is never counted.
    2. If the browser does not support JavaScript (old browsers, proxy servers that don’t support JavaScript, users who turn off JavaScript support, etc.), the call is never made and so that visit is never registered.
  • The syslog is the most comprehensive logging system, recording every visit.
    1. The problem is that it records visits by bots (e.g. if Google Search is scanning and indexing your site, it would still be counted as a visitor).
    2. The good news is that a bot, regardless of how many times it visits or how many pages it scans, is still only counted as one unique visitor.
    3. The bad news is that Google probably has a bunch of these bots. So even though one bot counts as one unique visitor, ultimately all the bots from all the search engines out there could probably throw off stats by a few hundred (or possibly even more).
    4. But then again, if the majority of your site’s stat count is from search engines, you’ve got bigger problems to worry about :P
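To see how much the bots matter, you can count visitors from the log yourself and leave out anything that looks like a crawler. The sketch below is only an illustration under several assumptions: the log is in the common "combined" access log format, a unique visitor is a distinct IP address, every request counts as a page view, and bots are identified by a short, incomplete list of user-agent substrings that I made up. It is not how awstats or webalizer do it.

```python
import re

# Assumed layout: ip ident user [time] "request" status size "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Hypothetical, incomplete list of substrings that suggest a crawler.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def summarize(lines):
    """Return (unique_visitors, page_views) for traffic that does not look like a bot."""
    visitors = set()
    page_views = 0
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        agent = match.group("agent").lower()
        if any(marker in agent for marker in BOT_MARKERS):
            continue  # skip requests whose user agent looks like a crawler
        visitors.add(match.group("ip"))
        page_views += 1
    return len(visitors), page_views


# Made-up sample lines using documentation-reserved IP addresses.
sample = [
    '203.0.113.5 - - [01/Feb/2008:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [01/Feb/2008:10:00:05 +0000] "GET /about HTTP/1.1" 200 734 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [01/Feb/2008:10:01:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
]
print(summarize(sample))
```

A real tool would also filter out requests for images, stylesheets and error responses before calling something a page view, so treat the numbers from a sketch like this as an upper bound.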

So all up, it looks like Google Analytics under-reports and syslogs report everything, whereas the number that people really care about is somewhere in between (but closer to the syslogs’ answer).

With this finding, I’m removing Google Analytics from all my sites and going back to syslog analysis. Right now, I’m using awstats and webalizer (both came pre-loaded on my server). If anyone has a better syslog-based analysis tool, please shoot me an email or leave a comment!

Comments
  1. Richard Janes
    July 30, 2008

    Google said that I had 14,000 page impressions last week but my server figures (GoDaddy) say I had 48,000!!! That’s a crazy difference.

    I’m just looking around to make sure I’m okay telling advertisers that we have the 48,000 page views. I don’t want to mislead, but by the same token, if Google is so far off then I don’t want to go by what they say!!!!

    Cheers,

    Richard
    Film Industry Bloggers
