What can I use to easily archive web pages that I've bookmarked to my hard drive?
April 5, 2011 9:04 AM
Like most people, I've built up an extensive library of useful bookmarked web pages, but it occurred to me that they're useless if the website ceases to exist or the content gets moved or removed.
I'd like to create an archive of web pages that I've bookmarked, in their entirety, as easily and autonomously as possible.
Any suggestions?
Based on your question history, you use OS X, right? Save the page as a "Web Archive" in Safari or "Web Page, Complete" in Firefox. Spotlight will index the pages stored on your hard drive just like any other document.
If you want to keep them in a dedicated app with tagging and the like, consider Yojimbo.
posted by bcwinters at 9:17 AM on April 5, 2011
Response by poster: I'm looking at archiving in excess of 500 pages, sorry, should have said.
posted by ilumos at 9:17 AM on April 5, 2011
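With a few hundred bookmarks, a plain list of URLs is usually the first thing any batch tool needs. The sketch below is one way to get there, assuming the bookmarks have been exported from the browser as an HTML file (Firefox and Safari can both do this). The function names and the file names in the usage comment are made up for illustration, and the script only collects links; it does not download anything.

```python
from html.parser import HTMLParser


class BookmarkLinkParser(HTMLParser):
    """Collects http(s) href values from <a> tags in an exported bookmarks file."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A HREF=...> matches too.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.startswith(("http://", "https://")):
                self.urls.append(value)


def extract_bookmark_urls(html_text):
    """Return the unique http(s) URLs found in html_text, in first-seen order."""
    parser = BookmarkLinkParser()
    parser.feed(html_text)
    parser.close()
    return list(dict.fromkeys(parser.urls))


def write_url_list(bookmarks_path, out_path):
    """Read an exported bookmarks file and write one URL per line to out_path."""
    with open(bookmarks_path, encoding="utf-8") as f:
        urls = extract_bookmark_urls(f.read())
    with open(out_path, "w", encoding="utf-8") as out:
        out.write("\n".join(urls) + "\n")
    return len(urls)


# Example (hypothetical file names):
# write_url_list("bookmarks.html", "urls.txt")
```

The resulting text file can then be handed to whichever archiving tool you settle on, or simply opened a batch at a time in browser tabs.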
Best answer: I did this last week for work. On my work PC I used HTTrack; on my personal computer (Mac) I used SiteSucker.
posted by OLechat at 10:11 AM on April 5, 2011 [1 favorite]
Seconding Pinboard's archiving option. You could upload your existing bookmarks to it, run the archiving process on those, and then move forward using Pinboard exclusively. Since details specific to your use case may determine whether Pinboard is the right fit, I'll reiterate episodic's link to their FAQ section on archiving.
posted by safetyfork at 10:38 AM on April 5, 2011
Best answer: HTTrack will scan one or more whole websites on its own, depending on your chosen settings, but you still must do it manually for each of your bookmarks. I recommend archiving each site separately, as you may need to fine-tune the configuration for some. It is also advisable to stay near the computer while archiving a site, because if you're careless you may inadvertently tell HTTrack to archive more than you intend and fill your disk with gigabytes of useless data; check the progress regularly to make sure the program is not misbehaving. When it's done, clear your browser's cache, set it to offline mode (or turn off your internet connection), and open your archive to confirm that it works as intended.
If you're not interested in the whole website but just a few pages, use Firefox with the MAF (Mozilla Archive Format) extension, as it is way simpler. Just open the pages you want to save in different tabs (once again, no automation for this), right-click and choose "Save All Tabs In Archive As...". The .maff file format is nothing more than a single zip file (it can be opened with programs like 7-zip) containing the HTML and other relevant files from the pages you archived. You can then open it directly in Firefox, or extract its contents and view them with any browser.
posted by Bangaioh at 11:02 AM on April 5, 2011
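Since a .maff file is described above as an ordinary zip file, it can also be inspected or unpacked without Firefox. Here is a minimal sketch using Python's standard zipfile module; the function names and the file names in the usage comment are placeholders, and nothing here assumes a particular folder layout inside the archive.

```python
import zipfile


def list_archive_pages(maff_path):
    """Return the names of the HTML files stored inside a .maff (zip) archive."""
    with zipfile.ZipFile(maff_path) as archive:
        return [
            name
            for name in archive.namelist()
            if name.lower().endswith((".html", ".htm"))
        ]


def extract_archive(maff_path, dest_dir):
    """Unpack everything in the archive into dest_dir for viewing in any browser."""
    with zipfile.ZipFile(maff_path) as archive:
        archive.extractall(dest_dir)


# Example (hypothetical file names):
# print(list_archive_pages("bookmarks.maff"))
# extract_archive("bookmarks.maff", "bookmarks_unpacked")
```

Only unpack archives you created yourself or otherwise trust, as with any zip file.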
If you're on OS X, look at DevonThink or any of its sister apps.
posted by webhund at 11:31 AM on April 5, 2011
Response by poster: Fantastic, I'm going to give MAF a try, but everyone has suggested tools that would definitely be up to the job. I was aware of HTTrack, but I think it would be overkill for my needs. SiteSucker looks like a very handy Mac alternative to HTTrack.
Thanks again mefi, I can't commend this community enough!
posted by ilumos at 11:56 AM on April 5, 2011
posted by backwards guitar at 9:16 AM on April 5, 2011