Please recommend an Apache log analysis package
March 20, 2007 9:02 AM

What open source package do you like for Apache logfile analysis?

I need to set up some significant usage reporting for a web application. I'm looking for a reporting application for Apache style logs that's robust and customizable. Not looking just for some little report of top URLs for the day, but something that goes into detail on user agents, page load times, IP addresses, etc. Something with a pluggable report system would be ideal because I need to do some app-specific custom reports.

Much of my traffic is API calls, not web browsers, so I think JavaScript-based logging like Mint and externally hosted services like Google Analytics won't work. I need something that works primarily off of logs. Storing data in MySQL is OK, but I'd prefer to work straight from log files.

What's good, flexible, and not ugly? I know about Webalizer, Analog, and AWStats, but when I've used them in the past they've all seemed a bit awkward and limited. It's been a while, so maybe they've improved? Or maybe there's an alternative?
posted by Nelson to Computers & Internet (6 answers total) 6 users marked this as a favorite
I like AWStats. It's not the easiest to set up in a shared hosting environment, but it's a cakewalk if you're the server admin. Compared to Google Analytics, it's very thorough.
posted by SpecialK at 9:08 AM on March 20, 2007

(Not sure about custom reports, but if you write Perl you can mod it to do anything.)
posted by SpecialK at 9:09 AM on March 20, 2007

AWStats meets your needs. It certainly does custom reports (although you can't take that to every last degree).

One thing to note is that "page load times" is generally incalculable just based on server logs.
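
To illustrate what server logs can and can't tell you, here is a minimal Python sketch. It assumes the log uses Apache's combined format with `%D` (time taken to serve the request, in microseconds) appended as a trailing field via a custom LogFormat. That value is server-side handling time only, not the page load time a browser sees. The function names and sample lines are hypothetical, and the regex does not handle escaped quotes inside fields.

```python
import re
from collections import Counter

# Assumed custom format: combined log format plus a trailing %D field, e.g.
#   LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %D"
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
    r'(?: (?P<usec>\d+))?$'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    # %D is optional here; plain combined-format lines simply have no timing.
    rec["usec"] = int(rec["usec"]) if rec["usec"] is not None else None
    return rec


def summarize(lines):
    """Count requests per user agent and average server-side time in ms."""
    agents = Counter()
    total_usec = 0
    timed = 0
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue  # skip malformed lines
        agents[rec["agent"]] += 1
        if rec["usec"] is not None:
            total_usec += rec["usec"]
            timed += 1
    avg_ms = (total_usec / timed / 1000.0) if timed else None
    return agents, avg_ms


# Hypothetical sample lines, made up for illustration.
SAMPLE = [
    '192.0.2.1 - - [20/Mar/2007:09:02:00 -0700] "GET /api/items HTTP/1.1" '
    '200 512 "-" "ExampleClient/1.0" 1500',
    '192.0.2.2 - - [20/Mar/2007:09:02:01 -0700] "GET /index.html HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0" 2500',
]

if __name__ == "__main__":
    agents, avg_ms = summarize(SAMPLE)
    print(agents.most_common(), avg_ms)
```

Anything beyond that (rendering, images, client-side scripts) happens after the server has written its log line, so it never shows up in the log.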
posted by Remy at 9:54 AM on March 20, 2007

I came here to suggest Webalizer and Analog... Analog has really come a long way in the last 2-3 years.
posted by qvtqht at 12:12 PM on March 20, 2007

Response by poster: Thanks for the replies. Looks like I haven't overlooked any options. AWStats seems the best of the lot, but its graphing is awfully hard to read.
posted by Nelson at 7:36 PM on March 21, 2007

Nelson, quite frankly, the open source log analysis packages out there all pretty much suck.

If you can use commercial software, Sawmill is the best (affordable) log processing package I've found. It's reasonably priced and allows for very extensive customization.

It's a bit of a bear to get going, but it's flexible enough to do just about everything you want, including customized scripting and, with the EE, MySQL queries (and on clusters, which was important at my last job, where we were crunching very large log sets).

Once you have your data parsed, you have all sorts of visualization options at your disposal (JpGraph and PHP/SWF Charts are good for simple graphs, and there are lots of Flash or Processing options available for fancier infoviz).

As Remy mentioned, you're not going to be able to get page load times without beaconing.

I've always felt that it's too bad there's never been an open source project focused on making a better log analysis tool, but I guess it's one of those things that's both difficult enough and valuable enough that anyone who spends enough time to write something decent is bound to commercialize it (or it stays in-house)?
posted by lhl at 1:19 AM on March 22, 2007
