How to find the age of a page?
July 29, 2008 2:02 PM

For certain reasons, I need to get the time of birth of a certain web page on another company's site, down to the minute.

I've checked the source code for notes, archive.org, and a couple of SEO tools I use, but I'm drawing a blank on how to find the precise age of this page, down to the date and time (EST), without access to their server logs. The page is not even indexed or cached in Google at this time.

Is there a good tool or technique out there for finding the exact age of a web page?

And, no, unfortunately I cannot simply call them and ask them.

The url is formed like this:
http://example.com/?id=foo
posted by anonymous to Computers & Internet (10 answers total)
 
If it's a static page, it may get passed to you in the headers; use something like the Web Developer toolbar for Firefox, and then Information -> View Response Headers.

For example, a local static website:

Content-Length: 14145
Content-Type: text/html
Content-Location: {redacted}
Last-Modified: Fri, 18 Jul 2008 20:39:47 GMT
Accept-Ranges: bytes
Etag: "90a4406b16e9c81:a25"
Server: Microsoft-IIS/6.0
ServerName: IIS005
X-Powered-By: ASP.NET
Date: Tue, 29 Jul 2008 21:10:32 GMT

200 OK
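The same check can be scripted. Below is a minimal sketch, assuming Python 3 and only the standard library; the function names `fetch_headers` and `last_modified_est` are made up for this illustration. It sends a HEAD request, pulls out Last-Modified, and converts it to a fixed UTC-5 offset (EST, as asked; daylight saving is ignored).

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional
from urllib.request import Request, urlopen

# Fixed UTC-5 offset, since the question asks for EST. Daylight saving
# time is deliberately ignored (an assumption of this sketch).
EST = timezone(timedelta(hours=-5), "EST")


def fetch_headers(url: str, timeout: float = 10.0) -> dict:
    """Send a HEAD request and return the response headers as a dict."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return dict(response.headers.items())


def last_modified_est(headers: Mapping[str, str]) -> Optional[datetime]:
    """Return the Last-Modified time converted to EST, or None if absent."""
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("last-modified")
    if raw is None:
        return None  # common for dynamically generated pages
    return parsedate_to_datetime(raw).astimezone(EST)


# The Last-Modified value from the header dump above.
sample = {"Last-Modified": "Fri, 18 Jul 2008 20:39:47 GMT"}
print(last_modified_est(sample))
```

For a live page you would call `last_modified_est(fetch_headers("http://example.com/?id=foo"))`. A result of `None` means the server sent no Last-Modified header at all.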


If it's a dynamic site (which can include pages whose content never changes but that are served by a content management system), the "birth" of the page is the second you request it, since it's generated on the fly. Short of somehow getting access to their server, you're out of luck.
posted by fogster at 2:14 PM on July 29, 2008


(Web Developer toolbar as a link.)
posted by fogster at 2:15 PM on July 29, 2008


If you want to take a look at the response headers, Web Sniffer lets you enter a web address and see the response headers that come back.
posted by esd at 2:15 PM on July 29, 2008


Those headers will tell you the date of last change, not the date of origination.

As far as I know, there's no way to determine the information you want except eye-witness testimony and/or delving into the server log files -- which are not available across the internet.

That's what "discovery" is for, I'm afraid.
posted by Class Goat at 2:19 PM on July 29, 2008


Your example http://example.com/?id=foo is likely to be loaded with problems, since there might be a page called index.htm or index.php that was last modified three years ago, though that page could include dynamic content from yesterday, loaded from a database that doesn't have an inherent date-stamp. The age of the "page" is misleading in that case.

If your certain reason involves possible legal action, note that almost all of the timestamp information can be spoofed, since it relies on the web server's idea of what the date and time is, and file modification dates are pretty trivial to change. Even the server logs aren't guaranteed to be accurate if the host (or someone with access to the host) may have a reason to obscure this info.

Your best bet, from a reasonable-doubt standpoint, is to hope for a third-party copy that's in an untouchable place, such as a cached version at Google "cached at: dddd-tttttt".

(It's the closest thing to a printout dated and notarized.)
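One third-party source that can be queried programmatically is the Internet Archive's Wayback Machine availability endpoint. The sketch below is only an illustration, assuming Python 3; the helper names and the sample payload are hypothetical, and treating the capture timestamp as UTC is an assumption. A capture time only shows the page existed by that moment, not when it was created.

```python
import json
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(page_url: str, near: str = "19960101") -> str:
    """Build a query asking for the capture closest to an early date."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": page_url, "timestamp": near})


def closest_capture(payload: dict) -> Optional[datetime]:
    """Extract the capture time from an availability response, if any."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    # Timestamps look like YYYYMMDDhhmmss; UTC is assumed here.
    parsed = datetime.strptime(closest["timestamp"], "%Y%m%d%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)


def lookup(page_url: str) -> Optional[datetime]:
    """Query the live service (requires network access)."""
    with urlopen(build_query(page_url), timeout=10) as response:
        return closest_capture(json.load(response))


# Hypothetical response, shaped like the service's JSON, for illustration.
sample_payload = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "status": "200",
            "timestamp": "20080730010203",
            "url": "http://web.archive.org/web/20080730010203/http://example.com/?id=foo",
        }
    }
}
print(closest_capture(sample_payload))
```

An empty `archived_snapshots` object in the response means there is no capture, which matches what the asker already found at archive.org.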
posted by rokusan at 2:21 PM on July 29, 2008


If you want to be absolutely sure, you need to get access to the hard drive that hosted this web page. That would require a lawsuit, a computer expert and an attorney experienced with modern electronic discovery.
posted by abdulf at 2:30 PM on July 29, 2008


Server logs won't show when the page was created. Short of getting access to the server and checking the creation date, you're SOL.
posted by wongcorgi at 3:34 PM on July 29, 2008


There isn't a good way to do this -- even the server's "create time" may not be accurate depending on the situation. The best you can hope for is that it IS correct, or they have some sort of log, or revision control.

Our intranet is all Subversion-controlled, but our externally facing website is unmanaged PHP backed by a database. You could check what WordPress put in the database as a "create" date, but the create date of the PHP script that spits out the content is meaningless.

There's no carbon dating for bits.
posted by SirStan at 5:56 PM on July 29, 2008


Hmm. Can't think of much here that hasn't been covered. But perhaps there's another way to go about it. For instance, the company that hosts the site (the domain registrar itself normally only handles the name registration) might keep traffic logs, which could be analyzed for requests to that page. Assuming they've retained such logs, it shouldn't be too hard to pinpoint when traffic first started hitting it.

Of course, getting that information out of them would prove difficult. The first step would be to identify and contact the host (a WHOIS lookup, e.g. at whois.net, on the domain or its IP address is a place to start) and then maybe just do some social engineering ;)
posted by sprocket87 at 7:42 AM on July 30, 2008


Even if you had physical access to the file server, you wouldn't necessarily be able to determine the creation time of the file.

Typically web pages are created on a developer's workstation (or development server) and then copied to a production web server.
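A small local experiment shows why the file's timestamp on the production server says little about when the page was written. This is a sketch assuming Python 3 and a filesystem that stores modification times; the file names are made up. A plain copy stamps the new file with the time of the copy.

```python
import os
import shutil
import tempfile
from datetime import datetime, timezone

with tempfile.TemporaryDirectory() as workdir:
    original = os.path.join(workdir, "page_dev.html")    # "developer" copy
    deployed = os.path.join(workdir, "page_prod.html")   # "production" copy

    with open(original, "w") as handle:
        handle.write("<html>hello</html>")

    # Backdate the original to stand in for a file written in 2008.
    # (That this takes one call also shows how easily timestamps change.)
    authored = datetime(2008, 7, 18, 20, 39, 47, tzinfo=timezone.utc).timestamp()
    os.utime(original, (authored, authored))

    # shutil.copy copies the contents but not the modification time, so
    # the deployed file is stamped with the moment of the copy.
    shutil.copy(original, deployed)

    original_mtime = os.stat(original).st_mtime
    deployed_mtime = os.stat(deployed).st_mtime

print("original:", datetime.fromtimestamp(original_mtime, tz=timezone.utc))
print("deployed:", datetime.fromtimestamp(deployed_mtime, tz=timezone.utc))
```

Some deployment tools do preserve modification times (`shutil.copy2` does, for example), so the timestamp on the server may reflect either the authoring time or the deployment time depending on how the file got there.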
posted by kenliu at 4:30 PM on July 31, 2008

