How to programmatically access referrer data for an HTML page visit
August 2, 2007 6:29 PM
How can I programmatically access the "referring page" environment data for an html or shtml page visit?
I'm trying to save to a database the referring page ("HTTP_REFERER") for all accesses to a specific .shtml page. I'd like it to be triggered by the page visit, or at least update frequently. I tried calling CGI and PHP scripts via server-side includes, but the referer the script sees is replaced with the calling page. Is there some simple way to pass this data along, or to get it into a form that can be programmatically manipulated?
You can grab the referrer via javascript and then hit myloggingscript.php?page=foo&referrer=bar, so the php script would store them in a database.
posted by Firas at 6:58 PM on August 2, 2007
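A minimal sketch of the server-side half of Firas's suggestion. He names a PHP script; this version is in Perl to match the later answers in the thread, and the script name, database file, and schema are all made up for illustration. The client side would be a single line of JavaScript that requests this script with document.referrer in the query string.

#!/usr/bin/perl
# myloggingscript -- hypothetical endpoint for Firas's approach (a sketch, in Perl
# rather than the PHP he names; the database file and schema are assumptions)
use strict;
use warnings;
use CGI;
use DBI;

my $q        = CGI->new;
my $page     = $q->param('page')     || '';
my $referrer = $q->param('referrer') || '';

# Append the hit to a SQLite database (path and table are illustrative)
my $dbh = DBI->connect('dbi:SQLite:dbname=referrers.db', '', '', { RaiseError => 1 });
$dbh->do('CREATE TABLE IF NOT EXISTS hits (ts INTEGER, page TEXT, referrer TEXT)');
$dbh->do('INSERT INTO hits (ts, page, referrer) VALUES (?, ?, ?)',
         undef, time, $page, $referrer);

# 204 keeps the beacon request cheap -- nothing needs to render
print $q->header(-status => '204 No Content');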
Best answer: I guess you can't change the configuration of your Web server, or read its log files? You could put

<img style="display: none" src="/log_referers.cgi?<!--#echo var="HTTP_REFERER" -->">

in your .shtml page, and when a browser requests it, it will then request /log_referers.cgi with the referer in the query string.
posted by nicwolff at 7:33 PM on August 2, 2007 [1 favorite]
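For reference, a sketch of what /log_referers.cgi could look like on the receiving end, assuming Perl and a plain append-only log file; the file path is an assumption.

#!/usr/bin/perl
# log_referers.cgi -- sketch of the receiving script for nicwolff's <img> trick.
# The SSI echo drops the referer straight into the query string, so read
# $ENV{QUERY_STRING} rather than parsing named parameters.
use strict;
use warnings;

my $referer = $ENV{QUERY_STRING} || '-';

# Append to a flat log file (the path is an assumption)
open my $log, '>>', '/home/user/referers.log' or die "can't open log: $!";
print {$log} scalar(localtime) . "\t$referer\n";
close $log;

# Answer with 204 No Content; the hidden <img> never renders anyway
print "Status: 204 No Content\r\n\r\n";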
Response by poster: nicwolff's technique works and is exactly what I was looking for.
I do have access to the server logs (it's shared hosting). That would seem a better alternative, except that I'd like to update as often as every minute, and the log files are rather large. I'm thinking that grepping the whole log file every minute or so to get the new records would be a bad idea. Unless there is a better way to extract the new records from the log?
posted by Manjusri at 10:27 PM on August 2, 2007
"I'm thinking that grepping the whole log file every minute or so to get the new records would be a bad idea."
An old Pentium MMX machine with a 2 GB SCSI drive I have does a 10 MB log file in about 15 seconds using Webalizer, while serving 20 to 30 connections a minute on a database-backed Web site. That's old hardware; a decent shared hosting setup shouldn't be too taxed.
Even on a shared host, using tail to restrict your query to the last x lines of the log is simple. Why would you grep the whole file for each update?
posted by paulsc at 11:52 PM on August 2, 2007
Best answer: Uh, how would he know how many lines to ask "tail" for? You could pipe "tail -f" to a long-running process, but that will keep tailing the old file when the log is rotated, so you'd have to restart it from the log-rotating script.
Instead, use Perl and the File::Tail module. I do this on a very busy maillog to keep track of IPs that have recently authenticated for POP, which I then allow to connect for SMTP.
posted by nicwolff at 1:07 AM on August 3, 2007 [1 favorite]
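A sketch of that File::Tail loop adapted to an Apache access log, watching for hits on the one .shtml page. The log path, combined log format, and page name are assumptions.

#!/usr/bin/perl
# Sketch of nicwolff's File::Tail approach pointed at an Apache access log
# instead of a maillog. Log path, log format, and page name are assumptions.
use strict;
use warnings;
use File::Tail;

my $tail = File::Tail->new(
    name        => '/var/log/apache2/access.log',
    maxinterval => 60,    # look for new lines at least once a minute
);

# read() blocks until a new line arrives; File::Tail reopens the file when
# it shrinks, which is what survives log rotation (the problem with tail -f)
while (defined(my $line = $tail->read)) {
    # Combined log format: ... "GET /page HTTP/1.1" status bytes "referer" "agent"
    next unless $line =~ m{"(?:GET|POST) /mypage\.shtml[^"]*" \d+ \S+ "([^"]*)"};
    my $referer = $1;
    print "referer: $referer\n";    # or INSERT into the database here
}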
This thread is closed to new comments.