Retrieving website caches
November 22, 2005 8:38 PM
How can I obtain copies (screenshot or cache) of government officials' websites as they appeared a week, two weeks, a month ago?
Is Google's cached website feature about the best I can do? Does Google retain multiple versions of cached data and, if so, how can I retrieve the various versions?
Methinks we have a trend...
I don't know whether government officials are bound by official information laws, or whether US laws are similar to ours, but you can request previous versions of government websites under official information requests. Here in New Zealand they have to keep snapshots of websites several times a week, plus diffs between versions, etc.
posted by holloway at 9:44 PM on November 22, 2005
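If you end up with two saved copies of a page (from an official information request, or from saving the page yourself on different days), a diff like the one holloway mentions can be sketched roughly as follows. This is only an illustration using Python's standard difflib; the function name snapshot_diff and the labels are made up for the example, and real pages would usually need their HTML stripped or normalised first.

```python
import difflib


def snapshot_diff(old_text, new_text, old_label="earlier", new_label="later"):
    """Return unified-diff lines describing how new_text differs from old_text.

    Both arguments are the full text of a saved page. The labels are
    hypothetical names used only in the diff header.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    # lineterm="" keeps the header lines free of trailing newlines,
    # matching the content lines produced by splitlines().
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile=old_label,
            tofile=new_label,
            lineterm="",
        )
    )


if __name__ == "__main__":
    before = "Office of the Minister\nStatement A\nContact us"
    after = "Office of the Minister\nStatement B\nContact us"
    for line in snapshot_diff(before, after):
        print(line)
```

Lines starting with "-" appear only in the earlier copy and lines starting with "+" only in the later one; an empty result means the two copies are identical.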
The wayback machine is archive.org, phredhead.
Unfortunately they have drastically cut back on the frequency of their updates. I find many pages haven't been archived since 2004 or even 2003, and very few that have been archived in the last year, period.
I sent mail asking what was up, if it was simply a money issue (/obvious), but they never got back to me. It's a huge shame if this service can't be maintained.
*resigns self*
posted by dhartung at 8:51 AM on November 23, 2005
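For anyone who wants to try archive.org for a specific date, here is a rough sketch of how the lookup addresses can be built. It assumes the Wayback Machine's address pattern of web.archive.org/web/ followed by a 14-digit timestamp and the page address, and its availability lookup at archive.org/wayback/available; the helper names and the example.gov address are placeholders invented for this example. Nothing is fetched here, and the archive serves the capture nearest to the requested time, which may be much older than the date asked for.

```python
from datetime import datetime, timedelta
from urllib.parse import urlencode

# Assumed address patterns for the Wayback Machine (see the note above).
WAYBACK_PREFIX = "https://web.archive.org/web"
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def wayback_timestamp(moment):
    """Format a datetime as the 14-digit YYYYMMDDhhmmss timestamp."""
    return moment.strftime("%Y%m%d%H%M%S")


def snapshot_url(page_url, moment):
    """Address asking for the capture of page_url closest to moment."""
    return f"{WAYBACK_PREFIX}/{wayback_timestamp(moment)}/{page_url}"


def availability_query(page_url, moment):
    """Address of the lookup that reports the closest capture, if any."""
    params = urlencode({"url": page_url, "timestamp": wayback_timestamp(moment)})
    return f"{AVAILABILITY_ENDPOINT}?{params}"


def lookback_urls(page_url, today, days_back=(7, 14, 30)):
    """Snapshot addresses for roughly a week, two weeks and a month ago."""
    return [snapshot_url(page_url, today - timedelta(days=d)) for d in days_back]


if __name__ == "__main__":
    # Placeholder site and date, for illustration only.
    for url in lookback_urls("http://www.example.gov/", datetime(2005, 11, 22)):
        print(url)
```

Opening one of the printed addresses in a browser shows whatever capture the archive holds nearest to that date, so it is worth checking the date shown on the capture itself.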
dhartung, I know there's been a small team rewriting their web crawler from the ground up -- the old one couldn't keep up with the number of web pages out there. I'm not sure where the project stands at the moment, but they were looking to improve the index with a more robust crawler.
posted by mathowie at 9:09 AM on November 23, 2005
The Internet Archive is kept, by design and contract, a minimum of six months behind. It's useless for "a week, two weeks, a month ago" searches.
posted by nakedcodemonkey at 11:03 AM on November 23, 2005
This thread is closed to new comments.
posted by mathowie at 8:40 PM on November 22, 2005