Our server and Google Analytics really disagree on stats.
September 11, 2014 7:34 AM

Our Google Analytics numbers are hugely inflated but we don't know why.

In April 2014 we started seeing traffic spikes in Google Analytics. Suspecting a denial of service attack, we looked into it and found the traffic was coming from IE 11 on a desktop computer in a tiny town in Michigan, a place we'd expect to send us little to no traffic. Then we saw that the traffic spikes weren't matching up with server downtime, and since we don't have any ads on our site (and hence no Google Ads revenue at risk) we just kind of shrugged and decided we'd look into it further if it started causing site slowdowns and/or crashes.

The traffic spikes have continued, and now they've spread to five other cities around the country. We've also started getting questions about the unusual traffic.

Looking into it, it seems like Google's numbers are just very, very wrong.

On one day, Google Analytics lists over 4,000 pageviews on the search results page for one specific term with another 3,000 the following day. Our server logs show 23 searches for that term in that time.

On another day, Google Analytics shows over 3,000 pageviews on the search results page for one particular term, but the server logs show 5.
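
One way to get the server-side number for a comparison like this is to count search requests per term straight from the access log. The sketch below is only an illustration: it assumes Common Log Format lines, a hypothetical "/search" results path, and a hypothetical "q" query parameter, none of which are confirmed for this site (a search form that submits via POST would not show the term in the URL at all). The sample log lines are made up.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Assumptions (not from the thread): Common Log Format lines, a search
# results path of "/search", and the search term in a "q" parameter.
SEARCH_PATH = "/search"
TERM_PARAM = "q"


def count_searches(log_lines):
    """Count server-side search requests per (lowercased) term."""
    counts = Counter()
    for line in log_lines:
        try:
            # The request is the first quoted field: "GET /path HTTP/1.1"
            request = line.split('"')[1]
            _method, target, _protocol = request.split(" ")
        except (IndexError, ValueError):
            continue  # skip malformed lines
        url = urlsplit(target)
        if url.path != SEARCH_PATH:
            continue
        for term in parse_qs(url.query).get(TERM_PARAM, []):
            counts[term.lower()] += 1
    return counts


# Made-up sample lines using documentation-only IP addresses.
sample_log = [
    '203.0.113.5 - - [10/Sep/2014:10:00:00 -0400] "GET /search?q=kiwifruit HTTP/1.1" 200 5120',
    '203.0.113.5 - - [10/Sep/2014:10:00:09 -0400] "GET /search?q=Kiwifruit HTTP/1.1" 200 5120',
    '198.51.100.7 - - [10/Sep/2014:10:01:00 -0400] "GET /about HTTP/1.1" 200 2048',
]
server_counts = count_searches(sample_log)
print(server_counts["kiwifruit"])  # 2
```

The per-term counts can then be set side by side with the pageview figures Google Analytics reports for the same search results URLs.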

Manually excluding statistics from the affected cities is probably not the best answer because a) it seems to be spreading and b) we won't know what to do if it starts hitting cities where we do expect a significant amount of site traffic.

I've tried posting in the forum for the software running our site and have had no response. I've also tried posting in the Google Products forums, but the last time we had a problem with Google Analytics, AskMe had the answer and the Google forums didn't. I'm hoping AskMe will have the answer again.

Do you have any idea what's going on here? Any ways to correct the problem?
posted by johnofjack to Computers & Internet (9 answers total) 2 users marked this as a favorite
I am not an expert in this area, but I have noticed Analytics can be quite different from the server logs in unexpected ways. The two methods actually measure slightly different things.

Maybe the design of your site somehow triggers a refresh of the page on the client (maybe only with certain browsers) without hitting the server because everything is cached. The GA javascript would see this as multiple hits. The fact that one spike came from a single obscure location makes me think this is caused by a single browser that gets messed up somehow. I bet if you broke the search sessions down by browser or IP (in the GA console) you would see they all came from a single person.

If the spikes don't appear in the server logs then the server is not experiencing lots of traffic and it is business as usual.
posted by AndrewStephens at 8:12 AM on September 11, 2014

Response by poster: I think you're onto something.

Search on Kiwifruit + 4 refresh: server 1; Google Analytics 5
Search on Avocado + reclicking "Go" + 3 refresh: server 2; GA 5
Search on a-z + hold CTRL+R, stop, let page load; hold it some more, stop again, let page load; hold it some more: server 1, GA 8.
All of that last thing + clicking "Go" again: server 2; GA 9.

So it looks like Google Analytics adds one to the stats every time the page reloads--and sometimes before it has fully reloaded (e.g. when you hold down refresh/reload)--but the server only adds one whenever the form is resubmitted.
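
A tiny model of that counting rule, as a sketch only: it assumes every form submission reaches the server and also fires the GA snippet, while a refresh fires only the GA snippet. The event names "submit" and "refresh" are made up for illustration.

```python
def tally(events):
    """Return (server_count, ga_count) for a sequence of user actions.

    Model inferred from the tests above: a "submit" is logged by the
    server and counted by GA; a "refresh" is counted by GA only.
    """
    server = ga = 0
    for event in events:
        if event == "submit":
            server += 1
            ga += 1
        elif event == "refresh":
            ga += 1
        else:
            raise ValueError(f"unknown event: {event}")
    return server, ga


kiwifruit = ["submit"] + ["refresh"] * 4
avocado = ["submit", "submit"] + ["refresh"] * 3
print(tally(kiwifruit))  # (1, 5)
print(tally(avocado))    # (2, 5)
```

Under this model the "hold CTRL+R" test (server 1, GA 8) would correspond to seven refreshes that GA actually counted, and clicking "Go" once more adds one to each side, which matches the observed server 2, GA 9.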

The next question would be "why thousands of refreshes"? Unless there's a budding epidemic of impatience, there's maybe a problem with IE 11 and Google Analytics? (More likely: a problem with IE 11 + certain settings and/or plugins and Google Analytics.)
posted by johnofjack at 8:40 AM on September 11, 2014 [1 favorite]

"why thousands of refreshes"?

my guess is it's an unintentional refresh triggered by some framework element in the site's scripting, introduced by a script kiddie who thinks jquery === javascript. see the 'skills gap' thread.
posted by j_curiouser at 8:59 AM on September 11, 2014 [1 favorite]

It sounds like maybe your server is ignoring robot/spider hits and maybe GA isn't ignoring them? That would be one obvious possibility. Also important to know if you're measuring apples to apples. Unique pageviews vs. total pageviews is often a culprit, but likely not the case here. Just make sure you know, definitively, that the two metrics you are comparing are the same thing.
posted by turntraitor at 8:59 AM on September 11, 2014

Response by poster: Unfortunately it's not a pageviews vs. unique pageviews problem (I wish it were!)

Top 3 search results pages on a particular date:
Pageviews: 3467. Unique pageviews: 3467.
Pageviews: 3401. Unique pageviews: 3401.
Pageviews: 3094. Unique pageviews: 3094.
posted by johnofjack at 9:16 AM on September 11, 2014

The next question would be "why thousands of refreshes"

Someone could also be running one of those browser tab add-ons set to reload the page stupid-often, and then just left the window open and walked away. Or it could be a caching thing: if you have a carousel or clock or weather or something else on the page that keeps changing, the browser might think there's a "new" version and grab it all the time?
posted by jessamyn at 9:31 AM on September 11, 2014 [1 favorite]

If GA counts them as unique page hits, then my guess is that one person has either disabled cookies or is running some privacy browser plugin that messes with GA's idea of who viewed the page. This may also be why your site keeps refreshing: some script on your site might not play nice with the user's settings.
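
To make the pageviews vs. unique pageviews reasoning concrete, here is a small sketch. It assumes GA's "unique pageviews" means the number of sessions in which a page was viewed at least once; the session IDs and the URL are made up.

```python
from collections import defaultdict


def pageview_metrics(hits):
    """hits: iterable of (session_id, page).

    Returns {page: (pageviews, unique_pageviews)}, where unique pageviews
    is the number of distinct sessions that viewed the page.
    """
    views = defaultdict(int)
    sessions = defaultdict(set)
    for session_id, page in hits:
        views[page] += 1
        sessions[page].add(session_id)
    return {page: (views[page], len(sessions[page])) for page in views}


# One visitor refreshing five times within a single session.
same_session = [("s1", "/search?q=a")] * 5
# The same five views, each landing in a fresh session
# (for example because the tracking cookie is never kept).
new_sessions = [(f"s{i}", "/search?q=a") for i in range(5)]

print(pageview_metrics(same_session))  # {'/search?q=a': (5, 1)}
print(pageview_metrics(new_sessions))  # {'/search?q=a': (5, 5)}
```

Equal pageview and unique pageview counts, as in the figures above, fit the second pattern: every reload is being treated as a brand-new session.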
posted by AndrewStephens at 10:21 AM on September 11, 2014

Response by poster: Having cookies off doesn't trigger this behavior; the site doesn't let you do searches if you're logged out (no, I don't like that and no, it's not my choice). Even if you do load the search page, turn cookies off, and then submit your search, it reports a redirect loop, but there's no corresponding traffic spike in Google Analytics.

We put together a custom report and tried to map the spikes to dates, times, and locations. It looks like this is all conceivably one person who travels; we're seeing things like traffic spikes from a metropolitan area and then, an hour or two later, traffic spikes from a small town about 30 miles away, or a different small town about 30 miles away. We haven't yet seen traffic spikes from two different places at the same time.
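
The "never two places at once" check can be automated once the spikes are exported with start and end times. This is a sketch under assumptions: the tuple layout, place names, and timestamps below are invented for illustration and are not taken from the actual report.

```python
from datetime import datetime
from itertools import combinations


def overlapping_spikes(spikes):
    """Return pairs of spikes from different places whose time windows overlap.

    spikes: list of (place, start, end) tuples with datetime start/end.
    """
    clashes = []
    for a, b in combinations(spikes, 2):
        place_a, start_a, end_a = a
        place_b, start_b, end_b = b
        # Two intervals overlap when each starts before the other ends.
        if place_a != place_b and start_a < end_b and start_b < end_a:
            clashes.append((a, b))
    return clashes


# Hypothetical spikes: a metro area, then a small town two hours later.
spikes = [
    ("Metro area", datetime(2014, 9, 1, 9, 0), datetime(2014, 9, 1, 9, 40)),
    ("Small town A", datetime(2014, 9, 1, 11, 0), datetime(2014, 9, 1, 11, 30)),
    ("Small town B", datetime(2014, 9, 2, 14, 0), datetime(2014, 9, 2, 14, 20)),
]
print(overlapping_spikes(spikes))  # []
```

An empty result is consistent with a single source that shows up in different places at different times; any overlapping pair would point to more than one source.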

We found some spots in the logs showing this same behavior from Firefox so now we think it's probably caused by an extension. We don't have any info on how visiting browsers are set up so we'll probably never know which one.

We may just put some filters on our reports and have to hope the person doesn't visit places which do generate a lot of our traffic.
posted by johnofjack at 9:11 AM on September 12, 2014

Response by poster: We looked into it further and finally got some actual IP addresses, then determined that the person is using a satellite connection and so might not be traveling at all. The traffic spikes still could all be from one person and/or one machine; the one log showing that it was Firefox was wrong, as IE 11 actually reports itself as Mozilla.

We wrote the person's ISP and have heard nothing back.

I tried three different privacy-oriented extensions (Disconnect, Privacy Badger, and Ghostery) with all sorts of different settings and wasn't able to reproduce this behavior.

We still don't know what's causing the traffic spikes, but for now we'll just subtract the spike numbers from any stats before submitting them, and hope the spikes don't start showing up in areas which do generate a large percentage of our traffic.
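
The subtraction itself is simple once the spike entries are flagged. A minimal sketch, assuming the stats are exported as pageviews per (date, city); the figures and place names are made up.

```python
def adjusted_total(pageviews, spikes):
    """Sum pageviews, leaving out the (date, city) entries flagged as spikes.

    pageviews: dict mapping (date, city) -> pageview count
    spikes: set of (date, city) keys judged to be anomalous
    """
    return sum(count for key, count in pageviews.items() if key not in spikes)


# Hypothetical numbers for illustration only.
pageviews = {
    ("2014-09-01", "Tiny town"): 3000,
    ("2014-09-01", "Big city"): 800,
    ("2014-09-02", "Big city"): 750,
}
spikes = {("2014-09-01", "Tiny town")}

print(adjusted_total(pageviews, spikes))  # 1550
```

Note that this drops everything from a flagged entry, including any genuine visits, which is only a safe approximation for places that normally send little to no traffic.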
posted by johnofjack at 9:17 AM on October 6, 2014
