How do I correct a discrepancy between MRTG and Apache?
April 5, 2007 1:43 AM
How can I go about rectifying the discrepancy between AWStats/Analog/Apache log tools and MRTG graphs?
I've searched endlessly on Google, but everyone has the opposite problem to me. My MRTG graphs from my provider (LayeredTech, for what it's worth) are consistently lower than what is reported by the Apache logs, and therefore by tools like AWStats, Analog, Webalizer and the like. I figure this is due to people starting downloads (and thus being logged with HTTP 200 status codes) but cancelling them partway through.
The discrepancy is as much as 30GB a month (45GB in MRTG vs. 75GB elsewhere), so a solution to this would be greatly appreciated. Any pointers, AskMefi?
Response by poster: Hm. The provider only gives me MRTG graphs (it's an unmanaged dedicated box in a datacentre), so I trust that their data is accurate. I would have imagined that these statistics programmes take range requests (HTTP 206 responses) into account? Perhaps not. I'll have to research that.
I know that those three tools (AWStats, Analog and Webalizer) all give me the same results, which leads me to believe there's not much hope. Then again, I know it must be possible because others can do it!
I hope that helps somehow. I can do any research into my setup that you think might help. :-)
posted by PuGZ at 6:00 AM on April 5, 2007
Just remember, the provider's MRTG graph is generated from the byte counter on their switch port (or perhaps from flow data, if you don't have a dedicated port).
Anything your analysis software does will be an estimate, based on whatever Apache puts in the log files for the size of the data returned in response to a request, which doesn't count headers, as far as I can tell. As long as Apache reports the total file size when it doesn't send the whole file, there's not much the log analyzers can do, since the data just isn't there to begin with.
posted by wierdo at 6:19 AM on April 5, 2007
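To see where the logged bytes come from, one option is to total the size field of the access log per status code and compare, say, the 200 and 206 totals against the MRTG figure. The following is only a minimal sketch: it assumes the log is in Common or Combined Log Format, and the function name and sample lines are made up for illustration.

```python
import re
from collections import defaultdict

# Matches the quoted request, the status code and the size field of a
# Common/Combined Log Format line. The size is "-" when no body was sent.
LOG_LINE = re.compile(r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)')


def bytes_by_status(lines):
    """Sum the logged response size per HTTP status code (hypothetical helper)."""
    totals = defaultdict(int)
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        size = match.group("size")
        totals[match.group("status")] += 0 if size == "-" else int(size)
    return dict(totals)


# Made-up sample lines, for illustration only.
sample = [
    '192.0.2.1 - - [05/Apr/2007:01:43:00 +0000] "GET /big.zip HTTP/1.1" 200 1000000',
    '192.0.2.2 - - [05/Apr/2007:01:44:00 +0000] "GET /big.zip HTTP/1.1" 206 250000',
    '192.0.2.3 - - [05/Apr/2007:01:45:00 +0000] "GET /missing HTTP/1.1" 404 -',
]
totals = bytes_by_status(sample)
for status, total in sorted(totals.items()):
    print(status, total)
```

On a real log, the same function can be fed an open file object instead of the sample list. If the server runs Apache 2.x with mod_logio loaded, adding its `%O` item (bytes sent, including headers) to the log format should give numbers closer to what a port counter sees, though that depends on the setup.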
Best answer: Are you compressing with mod_gzip or the like? If so, the data actually sent is compressed, so MRTG, which reports the real I/O on your data port, will only see the gzipped content, while your Apache log reports the uncompressed size.
posted by cmm at 6:26 AM on April 5, 2007
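For a rough sense of how large the gap from compression could be on text-heavy traffic, the sizes before and after gzip can be compared directly. This is only a sketch: the sample body is made-up, highly repetitive text, so real files will compress less well, and whether the log records the compressed or uncompressed size depends on the server and module configuration.

```python
import gzip

# Made-up, highly repetitive text standing in for a served text file.
body = ("2007-04-05 01:43:00 GET /index.html 200\n" * 5000).encode("ascii")

compressed = gzip.compress(body)

logged_size = len(body)      # what a log would show if it recorded the uncompressed size
wire_size = len(compressed)  # roughly what a port counter would see for the body
ratio = wire_size / logged_size

print(f"uncompressed: {logged_size} bytes, gzipped: {wire_size} bytes, ratio: {ratio:.3f}")
```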
I was going to say what cmm just said... that's your most likely issue...
MRTG is far more trustworthy than AWStats.
posted by twiggy at 6:55 AM on April 5, 2007
Response by poster: Yeah, I know; that's why I want to fix the stats to match MRTG's output, which I know to be correct.
cmm's solution might explain it, actually! I send out a *lot* of text files.
posted by PuGZ at 6:32 PM on April 5, 2007
Response by poster: Further research shows that it's not mod_gzip funny business. Hm.
posted by PuGZ at 7:48 PM on April 5, 2007
This thread is closed to new comments.
You may have to configure your other software to disregard these hits.
Web statistic analysis is such an inexact science. Different analysis tools report different things in different ways. It's hard to help you without knowing more and seeing your setup.
posted by chillmost at 5:35 AM on April 5, 2007