How to copy/archive a favorite website?
June 22, 2017 3:32 PM

One of my go-to websites for Japanese Recipes, Washoku Guide suddenly announced that they're closing up shop next week. That's not enough time to go through and copy all my favorites! Is there any way to copy/archive the whole thing myself?
posted by Caravantea to Computers & Internet (12 answers total) 15 users marked this as a favorite
 
If you're on a Mac there's SiteSucker. I think Safari has an archive feature, or used to, but I know iCab does.
posted by bongo_x at 3:43 PM on June 22, 2017


If you're at all handy with a terminal, wget will definitely do this:
wget --mirror --convert-links --adjust-extension --page-requisites --no-parent http://example.org
Whether one should do such things (bulk downloading of material that one doesn't own) is left as an exercise for the reader.
posted by mce at 4:46 PM on June 22, 2017 [3 favorites]


They mention in their notice that they're shutting down their social media, so in addition to wgetting their webpage I would also use youtube-dl to download their YouTube channel. It too is run from the terminal.
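A command along these lines should pull down a whole channel (untested; the <channel URL> bit is a placeholder for their actual channel address):
youtube-dl -i -o "%(uploader)s/%(title)s-%(id)s.%(ext)s" <channel URL>
The -i flag keeps it going past any individual videos that fail, and the output template sorts everything into a folder named after the uploader.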
posted by coleboptera at 5:06 PM on June 22, 2017 [1 favorite]


Firefox has the DownThemAll! extension.
posted by Emperor SnooKloze at 5:21 PM on June 22, 2017 [1 favorite]


I'd also recommend looking at the documentation that Archive Team has put together. (And if possible, contacting them to see if there can be a public backup, since this sounds like quite the resource to lose.)
posted by CrystalDave at 6:24 PM on June 22, 2017 [2 favorites]


When you're comparing tools, realize that some only download certain assets (pics, video, etc.) and some make an archive of the whole site.
posted by bongo_x at 6:52 PM on June 22, 2017


This sounds like a job for the Wayback Machine!
https://blog.archive.org/2017/01/25/see-something-save-something/

Looks like it's manual and page-by-page, but with multiple people and some time you could save a lot for everybody.
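If someone puts together a list of URLs it could even be scripted - a rough sketch (untested; assumes a urls.txt with one address per line) that hits the Save Page Now endpoint for each page, with a pause between requests to be polite:
while read url; do curl -s "https://web.archive.org/save/$url" > /dev/null; sleep 10; done < urls.txt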
posted by sacchan at 7:36 PM on June 22, 2017


Though on further thought, I'd use one solution that saves all/most of the content for you first, then work on saving it page by page for everybody if you like.
posted by sacchan at 8:16 PM on June 22, 2017


You might try HTTrack. From the command line/terminal this should do what you need:
httrack https://washoku.guide/ -O "./washoku" -%v +*amazonaws.com/*
This should download the photos too. I think the wget above will not download the photos, since it limits its download to washoku.guide, which doesn't host the photos.
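If you'd rather stick with wget, adding something like --span-hosts --domains=washoku.guide,amazonaws.com should let it follow the image URLs too (untested, and assuming the photos really are on an amazonaws.com host):
wget --mirror --convert-links --adjust-extension --page-requisites --no-parent --span-hosts --domains=washoku.guide,amazonaws.com https://washoku.guide/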
posted by gregr at 8:07 AM on June 23, 2017 [1 favorite]


Seconding HTTrack. Assuming you're on Windows, there's a handy GUI where you just enter the URL and it'll grab the site - no command line/terminal necessary. They'll be HTML files, though, not something easy to flip through.

It looks like they're running WordPress, so BlogBooker should also work. That will output the entire site as a PDF if you're willing to pay for the basic plan.
posted by okayokayigive at 9:43 AM on June 23, 2017


Damn. I love that site. Wonder how many of us are out here...
posted by Gusaroo at 11:55 AM on June 24, 2017


Thirding HTTrack. Worked well for me when I needed to keep a copy of a site my client had lost the logins for but didn't want to delete. We took a copy and archived it.
posted by harriet vane at 8:41 AM on June 27, 2017


This thread is closed to new comments.