Please recommend an Apache log analysis tool?
March 20, 2007 9:02 AM
What open source package do you like for Apache logfile analysis?
I need to set up some significant usage reporting for a web application. I'm looking for a reporting application for Apache style logs that's robust and customizable. Not looking just for some little report of top URLs for the day, but something that goes into detail on user agents, page load times, IP addresses, etc. Something with a pluggable report system would be ideal because I need to do some app-specific custom reports.
Much of my traffic is API calls, not web browsers, so I think Javascript-based logging like Mint and server-hosted things like Google Analytics won't work. I need something that works primarily off of logs. Storing data in MySQL is ok, but I'd prefer just to work straight from log files.
What's good, flexible, and not ugly? I know about Webalizer, Analog, and AWStats, but in the past when I've used them they've all seemed a bit awkward and limited. It's been a while, though; maybe they've improved? Or maybe there's an alternative?
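For concreteness, "working straight from log files" here just means parsing Apache's combined log format line by line, and a custom report along the lines described (per-user-agent and per-IP counts) can be a fairly short script. A minimal sketch, assuming the stock combined LogFormat; the regex, file path, and top-10 cutoff are illustrative, not anything specified in the question:

```python
#!/usr/bin/env python
# Minimal sketch (not a full reporting tool): parse Apache "combined"
# format access logs straight from disk and tally client IPs and user agents.
# Assumes the stock combined LogFormat; path and cutoffs are illustrative.
import re
import sys
from collections import Counter

# %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def tally(path):
    ips, agents = Counter(), Counter()
    with open(path) as fh:
        for line in fh:
            m = LINE_RE.match(line)
            if not m:
                continue  # skip lines that don't match (custom formats, junk)
            ips[m.group('ip')] += 1
            agents[m.group('agent')] += 1
    return ips, agents

if __name__ == '__main__':
    ips, agents = tally(sys.argv[1] if len(sys.argv) > 1 else 'access.log')
    print('Top client IPs:')
    for ip, n in ips.most_common(10):
        print(f'  {n:8d}  {ip}')
    print('Top user agents:')
    for ua, n in agents.most_common(10):
        print(f'  {n:8d}  {ua}')
```

Anything fancier (time windows, per-endpoint breakdowns for the API traffic) is the same pattern with more counters, which is roughly what the question means by a pluggable report system.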
(Not sure about custom reports, but if you write Perl you can mod it to do anything.)
posted by SpecialK at 9:09 AM on March 20, 2007
AWStats meets your needs. It certainly does custom reports (although you can't take that to every last degree).
One thing to note is that "page load times" is generally incalculable just based on server logs.
posted by Remy at 9:54 AM on March 20, 2007
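To make Remy's point concrete: an access log records when the server finished handling each request, not when the browser finished loading the page, so the closest a log-only tool can get is server-side service time. Apache can record that if you append %D (microseconds taken to serve the request) to the LogFormat; the sketch below summarizes such a field, assuming it is the last one on each line (the log path and the crude percentile helper are illustrative):

```python
#!/usr/bin/env python
# Sketch: summarize server-side request service times, assuming the access
# log was written with %D (microseconds to serve the request) appended, e.g.
#   LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %D" combined_timed
# Note this is time-to-serve on the server, not client-perceived page load time.
import sys

def service_times(path):
    times = []
    with open(path) as fh:
        for line in fh:
            fields = line.rsplit(' ', 1)  # %D is the last space-separated field
            try:
                times.append(int(fields[1]))
            except (IndexError, ValueError):
                continue  # line without the extra field; skip it
    return times

if __name__ == '__main__':
    times = sorted(service_times(sys.argv[1] if len(sys.argv) > 1 else 'access.log'))
    if not times:
        sys.exit('no timing data found')
    def pct(p):  # crude percentile: value at the p-th fraction of the sorted list
        return times[min(len(times) - 1, int(p * len(times)))]
    print('requests: %d' % len(times))
    print('median service time: %.1f ms' % (pct(0.50) / 1000.0))
    print('95th percentile:     %.1f ms' % (pct(0.95) / 1000.0))
```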
I came here to suggest Webalizer and Analog... Analog has really come a long way in the last 2-3 years.
posted by qvtqht at 12:12 PM on March 20, 2007
Thanks for the replies. Looks like I haven't overlooked any options. AWStats seems the best of the lot, but its graphing is awfully hard to read.
posted by Nelson at 7:36 PM on March 21, 2007
Nelson, quite frankly, the open source log analysis packages out there all pretty much suck.
If you can use commercial software, Sawmill is the best (affordable) log processing package I've found - it's reasonably priced and allows for very extensive customization.
It's a little bit of a bear to get going, but it's flexible enough to do just about everything you want, including customized scripting and, with the EE (Enterprise edition), MySQL queries (and on clusters, which was important at my last job, where we were crunching very large log sets).
Once you have your data parsed, you have all sorts of visualization options at your disposal (jpgraph and PHP/SWF charts are good for simple graphs, or there are lots of Flash or Processing options available for fancier infoviz).
As Remy mentioned, you're not going to be able to get page load times without beaconing.
I've always felt it's too bad that there's never been a focused open source project on making a better log analysis tool, but I guess it's one of those things that's both difficult enough and valuable enough that anyone who spends the time to write something decent is bound to commercialize it (or it stays in-house)?
posted by lhl at 1:19 AM on March 22, 2007
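On the visualization point, nothing ties the charting step to the PHP tools lhl names: once the counts are parsed out (as in the earlier sketches), any plotting library will do. A minimal sketch assuming matplotlib and a placeholder dict of daily hit counts (both assumptions, not anything from the thread):

```python
# Sketch: turn already-parsed log counts into a simple chart.
# Assumes matplotlib is installed and that `hits_per_day` came out of a
# parsing step like the earlier sketches; the data below is placeholder.
import matplotlib
matplotlib.use('Agg')  # render to a file, no display needed
import matplotlib.pyplot as plt

hits_per_day = {          # placeholder data, normally built from the logs
    '2007-03-14': 10234,
    '2007-03-15': 11480,
    '2007-03-16': 9821,
    '2007-03-17': 7302,
}

days = sorted(hits_per_day)
counts = [hits_per_day[d] for d in days]

plt.figure(figsize=(8, 3))
plt.bar(range(len(days)), counts)
plt.xticks(range(len(days)), days, rotation=45, ha='right')
plt.ylabel('requests')
plt.title('Requests per day (from access logs)')
plt.tight_layout()
plt.savefig('requests_per_day.png')
```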
This thread is closed to new comments.