How to save all cached pages for a particular domain? June 28, 2007 10:53 PM
A web site with lots of useful information recently went offline, but the cached pages are still available from Google. Is there a program (for either OS X or Windows) that will automatically save all of the cached pages on Google associated with a particular domain? posted by Ø to computers & internet (8 comments total)
Note: the Wayback Machine doesn't show new pages until many months (6-12) later, so you may want to check it again several months from now. posted by zippy at 1:21 AM on June 29, 2007
Warrick is a command-line utility for reconstructing or recovering a website when a back-up is not available. Warrick will search the Internet Archive, Google, MSN, and Yahoo for stored pages and images and will save them to your filesystem.
Unfortunately the Wayback Machine was blocked by the site's robots.txt. Google has it all cached, though.
I've been reading the documentation for wget (which looks really cool -- thanks for the recommendation!), but can't figure out how to get it to archive just the cached page links returned by Google. I'm probably overlooking something very obvious... any pointers would be appreciated... posted by Ø at 1:45 AM on June 29, 2007
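For anyone stuck on the same step, here is a rough Python sketch of one way to pull just the "Cached" links out of a results page you have already saved to disk. It assumes the cache links carry a `q=cache:...` query parameter, which is a guess about the URL format rather than anything documented in this thread, and `extract_cache_links` and the `search.example` host are made-up names for illustration. The resulting list could then be written to a text file and handed to wget with `-i`.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs


class CacheLinkParser(HTMLParser):
    """Collects <a href> values whose query string has q=cache:... for a domain."""

    def __init__(self, domain):
        super().__init__()
        self.domain = domain
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Assumption: a cached-copy link looks like ...?q=cache:<id>:<original url>
        query = parse_qs(urlparse(href).query)
        for value in query.get("q", []):
            if value.startswith("cache:") and self.domain in value:
                self.links.append(href)
                break


def extract_cache_links(html, domain):
    """Return cached-copy links for `domain` found in a saved results page."""
    parser = CacheLinkParser(domain)
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    sample = (
        '<a href="http://search.example/search?q=cache:abc:www.example.org/a.html">Cached</a>'
        '<a href="http://www.example.org/a.html">Result title</a>'
    )
    for link in extract_cache_links(sample, "example.org"):
        print(link)
```

This only filters links; it does not fetch anything, and the real markup of a results page may differ from the sample.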
Wow, blag! Warrick looks perfect!!!! Thank you. posted by Ø at 1:47 AM on June 29, 2007
I was wondering if there's another program out there like Warrick, or if it's the only one (sorry to derail, but really curious about this one). posted by Merdryn at 8:01 AM on June 29, 2007
Happy to help.
Merdryn: as far as I know, it's the only one. posted by blag at 6:09 PM on July 1, 2007
posted by Blazecock Pileon at 11:06 PM on June 28, 2007