What is the best way to download online magazine/newspaper articles?
December 28, 2007 5:17 PM

What is the best way to download online magazine/newspaper articles? In particular the Economist, which you can only access free for a certain length of time. Any plugins or software that can do this for free?

What I'd like is an easy way, given an article, to save the article text as a text document on my computer without any of the ads on the page. (Currently I open Word, copy and paste the text, delete the ads, and then save the file under the article title.)

I'd settle for something that would allow me to save the website as it is and allow me to view the page even after the article is no longer accessible.

Bonus points if it also works for the New York Times and the Washington Post, and/or makes it easy to search the articles I have downloaded.
posted by vegetableagony to Computers & Internet (8 answers total) 4 users marked this as a favorite
You might find some helpful ideas in this thread.

You can also use Screengrab, an extension for Firefox, that allows you to save webpages.
posted by HotPatatta at 5:48 PM on December 28, 2007

There's also a Greasemonkey script that lets you click on any NYT, WaPo, or LAT article and automatically takes you to the printer-friendly page. That way you don't have to click "next page" over and over when you're in the middle of reading an article. Maybe someone can help me out with a link for you.
posted by HotPatatta at 5:50 PM on December 28, 2007

I'm pretty sure this isn't what you are looking for... However, if you're looking for a way to get free access to copyrighted material or things you're supposed to pay for, that's not really what AskMe is for. That said, there are ways to access this sort of thing without paying for it by going to your (hopefully) local library. In many cases, periodicals like the Economist are available in full text from databases your library may subscribe to. So, for one example, one of the libraries I have a card at has access to the Business and Company Resource Center database, which is a Gale database.

If I log in with my library card, I can use the advanced search feature to search only articles from the Economist (US or UK, I think). Leaving the search box blank, I can get all articles from each issue of the Economist, perfectly formatted with no ads. With the interface that my library has, I can then get a list of all the articles individually, and I can opt to read them on screen and/or email them to myself. Once they're in your email, as text, they're keyword searchable. With Gmail, which is what I use, they're quickly keyword searchable, and you can filter them all into one folder for easier retrieval.

I'm sure there are ways to automate this process that are 1) outside my range of expertise and 2) probably against licensing terms for these databases, but I bet some other crafty library patron or librarian could figure them out using some "check all boxes on this list" plug-in and a few other simple tools.
posted by jessamyn at 5:54 PM on December 28, 2007

I'm sorry, I think I misunderstood your question, which indicates that you already have access to these periodicals.

I also wanted to mention that if you do have access to some of these databases (Expanded Academic ASAP is the one I am looking at now), you can create an RSS feed of all the new content in a particular journal, which in this case can include headlines for every Economist article. If you pick and choose which ones you want to read, this can be a great way to click through to the articles you want while still getting a table of contents for each issue. Again, I'm aware this may not be exactly what you're looking for, but if you click through the RSS feed [or automate a way to do it] you get, again, ad-free Economist with just a little finagling. I'd love to see someone automate this even further.
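
As a rough illustration of what automating the RSS step might look like, here is a minimal Python sketch that lists the headline and link of every item in an RSS 2.0 document. The sample feed, its URLs, and the function name list_headlines are made up for illustration; a real database feed will have its own address and fields, and its licensing terms may not allow automated access at all.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real database feed will have its own URL and fields.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example journal feed</title>
    <item>
      <title>First article</title>
      <link>http://example.com/articles/1</link>
    </item>
    <item>
      <title>Second article</title>
      <link>http://example.com/articles/2</link>
    </item>
  </channel>
</rss>"""


def list_headlines(rss_text):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    headlines = []
    for item in root.iter("item"):
        # findtext returns None when the child element is missing.
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        headlines.append((title, link))
    return headlines


if __name__ == "__main__":
    for title, link in list_headlines(SAMPLE_RSS):
        print(f"{title}\t{link}")
```

The output is effectively a table of contents; picking which links to follow is still left to the reader.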
posted by jessamyn at 6:17 PM on December 28, 2007

In Firefox, you could try using the "Work Offline" mode, to view the cached page:

Offline mode allows you to view web pages you've previously visited without being connected to the Internet.

You can search the page titles and dates by bringing up History, but this is not a full-text search.

With IE or Firefox you could save the pages to disk, with or without graphics, and then index them. Windows Desktop Search can index HTML files and other document types. It looks like you can index PDF files with an add-on from Adobe called iFilter.
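
If you would rather end up with plain text than saved HTML, the copy-and-paste routine can be roughly scripted. The following is only a sketch using Python's standard library: it keeps the page title and the text found inside <p> tags and drops scripts, styles, and anything outside paragraphs. That is a crude heuristic, not a real ad filter (ads placed inside paragraphs will survive), and the names ArticleTextExtractor, html_to_text, and safe_filename are made up for this example.

```python
import re
from html.parser import HTMLParser


class ArticleTextExtractor(HTMLParser):
    """Collect the page title and the text inside <p> tags, skipping scripts and styles."""

    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = []
        self._in_title = False
        self._in_paragraph = False
        self._skip_depth = 0
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True
        elif tag == "p":
            # A new <p> implicitly ends any paragraph that was left open.
            self._flush()
            self._in_paragraph = True

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False
        elif tag == "p":
            self._flush()

    def handle_data(self, data):
        if self._skip_depth:
            return
        if self._in_title:
            self.title += data
        elif self._in_paragraph:
            self._buffer.append(data)

    def _flush(self):
        # Collapse runs of whitespace and store the paragraph if it is non-empty.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.paragraphs.append(text)
        self._buffer = []
        self._in_paragraph = False


def html_to_text(html):
    """Return (title, body_text) for one saved page."""
    parser = ArticleTextExtractor()
    parser.feed(html)
    parser.close()
    title = " ".join(parser.title.split())
    return title, "\n\n".join(parser.paragraphs)


def safe_filename(title):
    """Turn an article title into a filename by dropping characters other than letters, digits, spaces, _ and -."""
    cleaned = re.sub(r"[^A-Za-z0-9 _-]+", "", title).strip()
    return (cleaned or "untitled") + ".txt"


if __name__ == "__main__":
    sample = (
        "<html><head><title>Sample article</title></head><body>"
        "<div class='ad'>Buy now!</div>"
        "<p>First paragraph.</p><p>Second paragraph.</p>"
        "<script>var x = 1;</script></body></html>"
    )
    title, text = html_to_text(sample)
    print(safe_filename(title))
    print(text)
```

Writing the returned text to a file named by safe_filename(title) gives one text file per article, which a desktop search tool can then index like any other document.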

Google Desktop has its own search feature.

There are programs that copy entire sites so that you can read them offline. One example is HTTrack (http://www.httrack.com/); also look at ProxyTrack to build up an archive that you can access later on.
posted by geekP1ng at 7:41 PM on December 28, 2007

Response by poster: Just for clarification: I can currently access the articles I want, but I want to be able to get back to them in the future even when they are no longer accessible (Economist articles stop being freely accessible after a few months).
posted by vegetableagony at 11:05 PM on December 28, 2007

Best answer: This solution lets you save the articles, but not as a text file on your computer. Still, you can access them and read them (online with Google Notebook, or offline with Scrapbook, both described below), which sounds like what you want. It also has the plus side of keeping images.

I use Google Notebook for this, with the Firefox extension. Once it's installed in Firefox, you can select the text, then right click and go to "Note this (Google Notebook)". Of course, you'll need a Google account.

It is annoying when the article you want spans several pages, but I usually look for the "Single Page" or "Print" mode (I'm not sure if the Economist has this).

You can definitely access the text after the page itself is no longer accessible, through either the Firefox extension or the Google Notebook website. I'm pretty sure that the pictures are actually saved into Google Notebook, rather than just linking to the original.

One downside is that you can't access the articles when offline (unless Google adds Google Gears support for it soon), but you can access the saved items whenever you have internet access.

There are times when it will tell you that the note is too long, although I find this very rare. In those cases, I use Scrapbook, a Firefox extension that saves the webpage (or selected text) to your local drive. This is a good option if you need to be able to access articles while offline.

Both of these save the original URL, so that you can go back to the webpage that you got the text from if you want to later on (assuming that it's still up).
posted by jasminerain at 11:53 PM on December 28, 2007

Best answer: Also: Both Google Notebook and Scrapbook let you search through what you've saved quite easily. I find that searching in GN is faster for large amounts of text/items though.
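
For articles that end up as plain text files on disk instead, a crude keyword search can also be scripted. This is just a sketch under that assumption: the function name search_articles and the one-folder-of-.txt-files layout are hypothetical, and it does a simple case-insensitive substring match with no indexing, so it will be slow on a large archive.

```python
import tempfile
from pathlib import Path


def search_articles(folder, keyword):
    """Return sorted names of .txt files directly in folder whose text contains keyword (case-insensitive)."""
    needle = keyword.lower()
    matches = []
    for path in sorted(Path(folder).glob("*.txt")):
        # errors="replace" keeps the search going if a file has odd encoding.
        text = path.read_text(encoding="utf-8", errors="replace")
        if needle in text.lower():
            matches.append(path.name)
    return matches


if __name__ == "__main__":
    # Demo with two throwaway files in a temporary folder.
    with tempfile.TemporaryDirectory() as folder:
        Path(folder, "rates.txt").write_text("Interest rates rose again.", encoding="utf-8")
        Path(folder, "other.txt").write_text("Nothing relevant here.", encoding="utf-8")
        print(search_articles(folder, "interest"))
```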
posted by jasminerain at 11:56 PM on December 28, 2007
