How to save online articles permanently, with the original format intact
August 16, 2015 5:47 AM

I've been writing for an online pop-culture magazine and at this point have hundreds of clips. The company is shifting to a new CMS soon and I'm worried some of my stories might disappear, either deliberately or not. What's the best way to preserve them for my own portfolio?

The best way and the fastest way, actually, since the move is happening soon. I have them as Word docs, but those are the pre-edited versions and I'd like to preserve the final copy-edited versions with the photos, etc. I'm not especially tech-savvy, so specific instructions would be much appreciated.

Related question: What's the best online portfolio for writers? Contently? Squarespace? How can I quickly and easily house these articles in an aesthetically pleasing way? Thank you!
posted by pipti to Computers & Internet (16 answers total) 11 users marked this as a favorite
 
Quite apart from the actual how-to, check your contract to see if you are legally able to post the content in full elsewhere. Portfolio-wise you should focus on relevant excerpts. Other people may own the rights to the post-edit content and the photographs.
posted by kariebookish at 5:52 AM on August 16, 2015


Do you still need the stories "as a webpage"?
The most straightforward way to preserve them visually would probably be to "Print to PDF" and keep as a static document.
posted by jozxyqk at 5:54 AM on August 16, 2015 [3 favorites]


For a question like this, it helps to identify your operating system and the browser you are using. Viewing this very page under Windows on Firefox, I chose the option to "save page," available with the 3-bar icon at upper right, and this created an HTM file and a folder that contains all of the non-HTML content referenced by this page: six JavaScript files, three CSS files, and a couple of image files. Opening the HTM file displays a faithful rendition of the page as seen online.
posted by megatherium at 5:59 AM on August 16, 2015


Hmm... If you have Evernote and an Evernote clipper extension in your browser, you can "save as article." Also, check out a program called Greenshot.

For writing clips, I use Squarespace: for each sample I include a screenshot of the article and then put the article text beneath it. The text is my own version, because editors sometimes introduce errors or make changes that I don't want to be taken as indicative of my writing.
posted by Leontine at 6:35 AM on August 16, 2015 [2 favorites]


Response by poster: Thanks for answers so far.

--Yes, I am allowed to post the content in full elsewhere.
--I use Chrome on a Mac.
--I have tried "print" and then "open pdf in preview" and it opens it up in Preview...but it doesn't look exactly like the online article. There's no cover photo, for example, and the web site's menu is listed as text, and the fonts are different. It looks clunky and not designed.
--I think that Evernote might be beyond me? But I'll check it out. And the Squarespace article screenshot + full text might be a good idea, will investigate.
posted by pipti at 6:42 AM on August 16, 2015


I would stick with the "print-to-PDF," but perma.cc might also be helpful to you: https://perma.cc/
posted by lusitania at 7:29 AM on August 16, 2015


I came here to suggest saving pages as MHTML files using either UnMHT or MAF but both are Firefox extensions and I don't know of Chrome alternatives.
I typically get better results using UnMHT due to its script handling but it really varies a lot depending on the source page, which is why I keep both extensions.

If you don't mind installing Firefox just for this, read on.

An MHTML file is a single-file version of megatherium's "save page" solution. MHTML files can be opened by Internet Explorer, old Opera (I don't know about the newer Chrome-derived versions), and Firefox with one of the extensions above. UnMHT also supports MHTML files containing several different pages, which display as multiple tabs when opened.

MAF has another option available, a .maff file that is simply a renamed ZIP archive of the standard HTML + folder structure of the regular "save page" function, that also allows saving multiple tabs in a single .maff file.
The advantage compared to MHTML is that if you find yourself wanting to open a saved page but have no compatible browser available, you can point any archiver (like 7-Zip) to the .maff file and extract it like you would a regular .zip, ending up with the standard .html + folder that can be opened by any browser. You can also browse the .maff archive contents and delete files you don't want, to keep the file size down (useful if the saved page has lots of images you don't care about).
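
If you would rather not extract by hand, the same thing can be scripted. This is only a minimal sketch in Python, relying on nothing more than the point above that a .maff file is a ZIP archive under another name. The function name extract_maff is made up for illustration, and the layout inside the archive is whatever MAF wrote there.

```python
import zipfile
from pathlib import Path


def extract_maff(maff_path, dest_dir):
    """Unpack a .maff archive (a ZIP file under another extension) into dest_dir.

    Returns the extracted .html/.htm files so they can be opened in any browser.
    """
    maff_path = Path(maff_path)
    dest_dir = Path(dest_dir)
    if not zipfile.is_zipfile(maff_path):
        raise ValueError(f"{maff_path} is not a ZIP-format archive")
    with zipfile.ZipFile(maff_path) as archive:
        archive.extractall(dest_dir)
        names = archive.namelist()
    # Keep only the page files; images, CSS and scripts stay in their folders.
    return sorted(
        dest_dir / name
        for name in names
        if name.lower().endswith((".html", ".htm"))
    )
```

Calling extract_maff("clips.maff", "clips_extracted") (both names are placeholders) should leave the regular .html + folder structure in clips_extracted.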

Whatever you choose, always test your archives with an offline browser with a cleared cache to make sure they do display (and behave, if they contain scripts) as intended.
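
As a rough complement to that manual test, a saved page can also be scanned for embedded resources that still point at the network, since those are the ones likely to go missing offline. The sketch below uses only Python's standard library. The names ReferenceCollector and remote_resources are made up for illustration, and the check is crude (for example, it will also flag a harmless link rel="canonical" tag), so treat it as a first pass and not a replacement for actually opening the file offline.

```python
from html.parser import HTMLParser


class ReferenceCollector(HTMLParser):
    """Collect (tag, value) pairs for every src and href attribute in a page."""

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                self.references.append((tag, value))


def remote_resources(html_text):
    """Return src/href values on non-link tags that still point at the network."""
    collector = ReferenceCollector()
    collector.feed(html_text)
    remote = []
    for tag, value in collector.references:
        # Ordinary <a> links are expected to point at the web; only embedded
        # resources (images, scripts, stylesheets, frames) affect offline display.
        if tag == "a":
            continue
        if value.startswith(("http://", "https://", "//")):
            remote.append(value)
    return remote
```

An empty list from remote_resources(open("saved_page.html", encoding="utf-8").read()) (the file name is a placeholder) suggests the embedded resources are all local copies.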
posted by Bangaioh at 7:52 AM on August 16, 2015 [1 favorite]


I'd say Evernote too. It's easy: set up a (free) account, install the web plugin for Chrome, browse to the website and click save.
posted by chrispy108 at 7:53 AM on August 16, 2015 [1 favorite]


On Chrome under OS X, the behavior I described is just about the same, except that the menu choice is "Save Page As."

Really, in my view the best option for a Mac user is to plunk down the $80 or $150 for one of the higher end versions of DevonThink and use that for your archive. One major advantage over the per-page approach is that DT will also archive the URL of the page for future direct access if needed.
posted by megatherium at 9:08 AM on August 16, 2015


You can use something like Awesome Screenshot (extension for Chrome or Firefox) that will allow you to take a single screenshot of the entire page. I think someone mentioned a similar but different tool already.
posted by aloysius on the mixing boards at 9:54 AM on August 16, 2015 [2 favorites]


In Firefox, you can use Pearl Crescent Page Saver for this, too, with the option to save an image of an entire page (not just the visible part). The screenshots I made that way are the only copy I have of some of my old articles after CMS changes at former publications. If you use a cloud backup solution to screenshot all your stories that currently live on someone else's server, make sure you use something that lets you also download a local copy (in case that cloud backup solution eventually goes out of business or also goes through a format change that would in the future lose your data), then upload them all to whatever server you use to host your own clips website as well. Start this soon!
posted by limeonaire at 10:22 AM on August 16, 2015


Response by poster: Thanks everyone! I am trying Awesome Screenshot and saving them locally...which seems to be going okay. I'm making PDFs as back-ups too--they're clunky and ugly but it's better than nothing. Thank you all! Further suggestions welcome.
posted by pipti at 1:07 PM on August 16, 2015


Pinboard does this if you pay for archiving.
posted by caek at 1:53 PM on August 16, 2015 [1 favorite]


I'm making PDFs as back-ups too--they're clunky and ugly but it's better than nothing.

This is highly advisable. Not necessarily PDFs but choose a file format you can easily open and at the very least copy text out of, with both future and past hardware/software.
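
If you have pages saved as HTML, one way to get such a copy is to pull the visible text out into a plain .txt file. A minimal sketch using only Python's standard library follows. TextExtractor and html_to_text are illustrative names, and the output will include menus and footers along with the article, so expect to trim it by hand.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Gather visible text, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html_text):
    """Return the page's visible text, one chunk per line."""
    extractor = TextExtractor()
    extractor.feed(html_text)
    extractor.close()
    return "\n".join(extractor.chunks)
```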
posted by Bangaioh at 2:30 PM on August 16, 2015


This is more general for everyone than specific to your already-solved needs, but consider supporting the Internet Archive!
posted by pos at 6:09 AM on August 17, 2015 [1 favorite]


I agree with megatherium, I love DevonThink on my Mac. I use their webarchive capture utility probably a dozen or two times a day. It's wonderfully searchable and I don't have to worry about the original page disappearing down the memory hole.

PDFs of each page (which could be done programmatically from a list of URLs) would also work, though sometimes the print styling of a page isn't the same as the on-screen styling of the page, so you might lose some things. Double-checking the output quickly is probably a good idea.
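
One possible way to do the "list of URLs" approach is sketched below in Python. It assumes a recent version of Chrome that supports headless mode with the --print-to-pdf flag, and it assumes Chrome lives at the usual Mac location; the function names and the file-naming scheme are made up for illustration. The PDFs still use the page's print styling, so the caveat above applies.

```python
import re
import subprocess
from pathlib import Path
from urllib.parse import urlparse

# Assumed default install location of Chrome on a Mac; adjust if yours differs.
CHROME = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"


def pdf_name(url):
    """Derive a filesystem-safe PDF file name from a URL."""
    parsed = urlparse(url)
    raw = (parsed.netloc + parsed.path).strip("/")
    safe = re.sub(r"[^A-Za-z0-9._-]+", "_", raw).strip("_")
    return (safe or "page") + ".pdf"


def chrome_command(url, out_dir, chrome=CHROME):
    """Build the headless-Chrome command that prints one URL to a PDF."""
    target = Path(out_dir) / pdf_name(url)
    return [chrome, "--headless", f"--print-to-pdf={target}", url]


def save_all(urls, out_dir, chrome=CHROME):
    """Print every URL to a PDF inside out_dir, one page at a time."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for url in urls:
        # check=True stops the run if Chrome reports a failure for a page.
        subprocess.run(chrome_command(url, out_dir, chrome), check=True)
```

With the article addresses collected in a list, save_all(urls, "clips_pdf") (the folder name is a placeholder) would write one PDF per URL.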

(FWIW, DevonThink could also index and search all of the PDFs if Spotlight's searching wasn't good enough.)
posted by Brian Puccio at 7:03 AM on August 17, 2015


This thread is closed to new comments.