I need a freeware that will scan an NCSA logfile
December 20, 2004 8:36 AM

I want a utility (freeware?) which will scan an NCSA extended logfile and create a simple report on which referrers generated how many sales transactions.

Any recommendations?
posted by ZenMasterThis to Computers & Internet (3 answers total)
You'll probably have to write something for yourself. It might not be easy. Like, do the log files have something in them that will tell you when a sale was made?
posted by RustyBrooks at 9:25 AM on December 20, 2004

I don't think there is anything standalone out there that does this, though I know some commerce packages have this built in, as I've written this feature for at least one of them.

High precision might be somewhat difficult, given that nothing in a standard extended log file can reliably identify an individual. It shouldn't be too hard to do roughly, though. What you really want is to catch the two ends of a click trail: one that starts with a referer from a foreign domain, and one that ends with a request for a specific resource (a page or a script) that signifies a completed sale. You can use the remote host/IP field somewhat reliably to identify an individual's requests.
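As a rough illustration of those two ends of the trail, the sketch below pulls the host, request and referer out of a single line and tests for each end. It assumes the usual combined layout (host ident user [date] "request" status bytes "referer" "agent"); the site name www.example.com and the /checkout/complete path are made-up placeholders, and parse_line, is_foreign_referer and is_sale are hypothetical helper names, not part of any existing tool.

```perl
use strict;
use warnings;

# Assumed placeholders: replace with your own site and "sale complete" page.
my $OWN_DOMAIN    = 'www.example.com';
my $SALE_RESOURCE = '/checkout/complete';

# Split one combined-format line into the fields we care about.
# Returns undef if the line does not match the assumed layout.
sub parse_line {
    my ($line) = @_;
    return unless $line =~
        m/^(\S+) \S+ \S+ \[[^\]]*\] "([^"]*)" \d{3} \S+ "([^"]*)"/;
    return { host => $1, request => $2, referer => $3 };
}

# Start of a trail: a referer that is present and points at another site.
sub is_foreign_referer {
    my ($referer) = @_;
    return 0 if !defined $referer || $referer eq '' || $referer eq '-';
    return $referer =~ m{^https?://\Q$OWN_DOMAIN\E(?:[:/]|\z)}i ? 0 : 1;
}

# End of a trail: a request for the sale-complete resource.
sub is_sale {
    my ($request) = @_;
    return $request =~ m{^\S+ \Q$SALE_RESOURCE\E(?:[?\s]|\z)} ? 1 : 0;
}

# Made-up sample line for illustration only.
my $sample = '10.0.0.1 - - [20/Dec/2004:09:00:00 -0500] '
           . '"GET /index.html HTTP/1.0" 200 1234 '
           . '"http://search.example.org/?q=widgets" "Mozilla/4.0"';
my $hit = parse_line($sample);
print "host=$hit->{host} foreign=", is_foreign_referer($hit->{referer}), "\n";
```

The regular expression is deliberately strict, so lines in any other layout are skipped rather than misread; loosen it if your server writes extra fields.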

A quick and dirty algorithm in something like Perl would be to build a hash of all requests, keyed on host/IP. Make special note of the hosts/IPs from which a request to the "sale complete" resource is made. Then, for each of those hosts, find the initial request and print the referer, if any.

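use strict;
use warnings;

# Assumed placeholders: adjust the log path and the "sale complete" pattern.
my $saleResourceName = qr{/sale-complete};
my (%requestsByHost, %hostsThatBought);

open(my $logfile, '<', 'access.log') or die "Cannot open log: $!";
while (my $line = <$logfile>) {
    # host ident user [date] "request" status bytes "referer" ...
    next unless $line =~ m/^(\S+) \S+ \S+ \[[^\]]*\] "([^"]*)" \d{3} \S+ "([^"]*)"/;
    my ($host_or_ip, $request, $referer) = ($1, $2, $3);
    push @{ $requestsByHost{$host_or_ip} }, $referer;
    $hostsThatBought{$host_or_ip} = 1 if $request =~ $saleResourceName;
}
close $logfile;

foreach my $hostThatBought (sort keys %hostsThatBought) {
    # The first request from the host carries the entry referer, if any.
    my $firstReferer = $requestsByHost{$hostThatBought}[0];
    print "$hostThatBought\t$firstReferer\n" if $firstReferer ne '-';
}

This is inefficient and imprecise, but it might give you a start.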
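
The question asks for sales per referrer rather than a list of buyers. One possible way to roll the approach described above up into counts is sketched here. It assumes the log has already been parsed into records with host, request and referer fields; sales_by_referrer, the /sale-done pattern and the sample records are all invented for illustration.

```perl
use strict;
use warnings;

# Count sales per entry referer.
# $records: array ref of { host, request, referer } hashes in log order.
# $sale_re: compiled regex matching the "sale complete" request (assumed).
sub sales_by_referrer {
    my ($records, $sale_re) = @_;
    my (%entry_referer, %sales);
    for my $r (@$records) {
        # The first request seen from a host fixes that host's entry referer.
        $entry_referer{ $r->{host} } = $r->{referer}
            unless exists $entry_referer{ $r->{host} };
        # Each sale request is credited to the host's entry referer.
        $sales{ $entry_referer{ $r->{host} } }++
            if $r->{request} =~ $sale_re;
    }
    return \%sales;
}

# Made-up sample data for illustration only.
my @records = (
    { host => 'a', request => 'GET / HTTP/1.0',          referer => 'http://one.example/' },
    { host => 'b', request => 'GET / HTTP/1.0',          referer => 'http://two.example/' },
    { host => 'a', request => 'GET /sale-done HTTP/1.0', referer => 'http://www.example.com/cart' },
    { host => 'c', request => 'GET / HTTP/1.0',          referer => 'http://one.example/' },
    { host => 'c', request => 'GET /sale-done HTTP/1.0', referer => 'http://www.example.com/cart' },
);
my $report = sales_by_referrer(\@records, qr{/sale-done});
printf "%-25s %d\n", $_, $report->{$_} for sort keys %$report;
```

Only the first referer per host is remembered, so memory use grows with the number of distinct hosts rather than with the number of requests. A host that returns later via a different referrer is still credited to its first one, which is part of the imprecision.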
posted by weston at 9:59 AM on December 20, 2004

Be sure you remember that you can't rely on the referer being reported accurately.
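
One defensive way to act on that, offered only as a sketch: treat the referer field as untrusted input, reduce well-formed values to a lowercased hostname, and put everything else into an explicit "(unknown)" bucket so the report shows how many sales could not be attributed. The helper name referer_key and the sample values are hypothetical.

```perl
use strict;
use warnings;

# Reduce a raw referer field to a grouping key.
# Anything missing or not shaped like an http(s) URL goes into "(unknown)".
sub referer_key {
    my ($referer) = @_;
    return '(unknown)' if !defined $referer || $referer eq '' || $referer eq '-';
    return lc $1 if $referer =~ m{^https?://([^/:?#\s]+)}i;
    return '(unknown)';
}

# Made-up sample values for illustration only.
for my $raw ('http://Search.Example.org/?q=widgets', '-', 'not a url') {
    my $key = referer_key($raw);
    print "$key\n";
}
```

This does nothing about referers that are well-formed but false; it only keeps missing or malformed ones from being silently dropped or miscounted.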
posted by grouse at 10:07 AM on December 20, 2004
