How do I download PDFs from online journals automatically?
April 26, 2010 1:44 PM

How do I automatically "subscribe" to my online professional journals? I want the PDFs from new issues downloaded automatically, is that possible?

I can go to the websites and download the articles by hand, but I want to have them automatically synced to my iPad for reading when new issues come out.

None of the Windows citation managers I know about (Zotero and Mendeley, ftw) have this feature, I think. Papers might, but I am not a Mac guy. The RSS feeds just have abstracts.

Any thoughts?
posted by blahblahblah to Computers & Internet (14 answers total) 7 users marked this as a favorite
Can you give an example of the sorts of journals you are talking about and the type of access you currently have? I know that some database vendors have RSS ability built into them [i.e. you can literally just subscribe to a feed for whatever the journal is] but this definitely isn't something that is standardized across journals at all. Anything behind a paywall makes this pretty difficult.
posted by jessamyn at 1:50 PM on April 26, 2010

I do subscribe to quite a few magazines in PDF format, and I find it most annoying to have to remember to go to their websites every month to download the latest issue.
If I subscribe to them in paper format, they send me a paper copy in the mail without my prompting them for it. Why can they not deliver me a PDF once the new issue has been released?
posted by digividal at 2:01 PM on April 26, 2010

Response by poster: Jessamyn- I am a sociologist at a business school. So, mostly academic journals hosted by EBSCO, JSTOR, Highwire and the like. The Academy of Management Journal, for example. All are behind paywalls, I think.
posted by blahblahblah at 2:46 PM on April 26, 2010

Ask the IT people at the places that provide the PDF files what they would suggest. They may be able to help, and if not it'll alert them to the fact that you want to do this.
posted by seanyboy at 2:57 PM on April 26, 2010

I threw up a trial balloon to my Twitter folks and they seem to be saying either "ask your librarian" because in a lot of places emailing you this sort of thing is what they do, or ask over at a more specialized librarian community like Unshelved Answers. My generalized searches are turning up nothing useful.
posted by jessamyn at 4:03 PM on April 26, 2010

I have RSS feeds set up for as many of the journals as have them. If the journal doesn't have a "new article" RSS feed, many have "new article" pages, and I just have Google Reader create a feed to update me whenever there are changes to the page.

It is definitely not automatically downloading PDFs (and I hope someone has a solution for that, so I'll be watching the thread for that as well), but at least I don't have to remember to check for new articles on a journal-by-journal basis...except for those journals that don't have a "new articles" page OR any RSS feeds...
posted by arnicae at 4:59 PM on April 26, 2010

People are mentioning RSS feeds, which are great for keeping up with abstracts, but they don't automatically download the PDFs as the original poster asked. I could see that being difficult, as you might need access from your institution to download those PDFs in the first place, and what if you're off campus or something?

There are certainly programs capable of automatically downloading TV shows out of RSS feeds, so it should be feasible. Find a programmer friend to write a script?
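For whoever ends up writing that script, here is a rough sketch of the part the TV-show downloaders rely on: they look for an `<enclosure>` element on each feed item and fetch whatever it points to. The feed below is made up (the journal.example.org URLs and the `list_items` helper are assumptions, not any real publisher's feed), and as noted upthread most journal feeds carry only abstract links, so the enclosure will often be missing.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed for illustration only; real journal feeds will differ.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Hypothetical Journal</title>
    <item>
      <title>Article with an attached file</title>
      <link>https://journal.example.org/abstract/101</link>
      <enclosure url="https://journal.example.org/pdf/101.pdf"
                 type="application/pdf" length="0"/>
    </item>
    <item>
      <title>Abstract-only article</title>
      <link>https://journal.example.org/abstract/102</link>
    </item>
  </channel>
</rss>"""


def list_items(feed_xml):
    """Return one dict per <item>: title, link, and enclosure URL (or None)."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # TV-show style feeds attach the file here; journal feeds often do not.
            "pdf": enclosure.get("url") if enclosure is not None else None,
        })
    return items


for entry in list_items(SAMPLE_FEED):
    print(entry["title"], "->", entry["pdf"] or "no enclosure, abstract link only")
```

Items that come back with no enclosure are the hard case: the script would still need some way to get from the abstract link to the PDF.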

Finally, just a quick note that Papers for Mac does not have this ability either, as far as I know. Great program otherwise, for anyone who might be curious.
posted by goateebird at 5:41 PM on April 26, 2010

Well, for some the pdfs are simple functions which would be easy to automate.
The pdf for

is just
posted by a robot made out of meat at 6:32 PM on April 26, 2010

You might be interested in Zotero. It won't automate downloading from RSS feeds, but it will allow you to click through to an interesting article, and then download the PDF and add the article to your personal library with one additional click. This will work with most journals/publishers, and it will save you a step or two, anyway.
posted by chrisamiller at 7:32 PM on April 26, 2010

Many journals let you sign up for table of contents alerts, where it emails you a list of what's in the latest edition. If it has links to all the articles then there's probably some way you can script them to download automatically (um, google pipes? some kind of third party software? open all function in firefox?). For my Uni I'd need to add a proxy into the link to get full text access but I'm sure that could be put in automatically along the way.
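A minimal sketch of the "add a proxy into the link" step, assuming the alert email is available as plain text and that your library's proxy accepts a login URL followed by the target address. The proxy hostname, the sample email, and both helper names are made up for illustration; check with your library for the real prefix.

```python
import re

# Hypothetical proxy prefix; your institution's will be different.
PROXY_PREFIX = "https://ezproxy.example.edu/login?url="

# Anything that looks like an http(s) URL, stopping at whitespace, quotes, or angle brackets.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_links(text):
    """Pull URLs out of an alert email body, trimming trailing punctuation."""
    return [url.rstrip(".,;)") for url in URL_PATTERN.findall(text)]


def add_proxy(url, prefix=PROXY_PREFIX):
    """Prepend the library proxy prefix so the link resolves with full-text access."""
    return prefix + url


# Made-up table of contents alert.
alert = """New issue available:
1. Title A https://journal.example.org/content/53/2/101.
2. Title B <https://journal.example.org/content/53/2/117>
"""

for link in extract_links(alert):
    print(add_proxy(link))
```

This only compiles the proxied links in one place; opening or downloading them is still a separate step.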

There are also ways to set up automatic searches in many journal databases. I don't know about the ones you use, but I've heard good things about PubCrawler for PubMed, for example, and Web of Science can email you publication alerts. I don't think it scrapes PDFs, but again, once you have a list of links in one place there should be some way of opening and downloading all the links at once. Personally I never read everything published in one journal, but rather papers with specific keywords published across many journals, so this is how I'd go, but your search could always just be on the journal name you're interested in.

I'm sorry I don't know the second step of how to get things to download but table of contents alerts and database search term notifications would be a place to start with compiling the links in the first place.

Endnote interfaces directly with many databases but I can't find any way to download pdfs from there. I vaguely recall grabbing pdfs via refman so maybe that's another one to look at. I don't think it's free though.
posted by shelleycat at 10:36 PM on April 26, 2010

Yes, Papers definitely DOES do this. I use it daily to review medical journals. NetNewsWire handles the RSS feeds, and there is a script available online that allows you to automatically import the reference into Papers.

Papers itself can be set up to authenticate through any library facility that you have access to which has an EZproxy service. Thus one click on the reference and voila... instant PDF download and filed.

You do sometimes have to authenticate once for the session with your library - probably some ip number voodoo, but it's basically orgasmically simple to use and a thing of beauty.

Bookends (another Mac programme) has a slightly clunkier way of getting PDFs, and I don't know if there is a way to handle RSS feeds to get the contents quite so automatically. (I use it purely for formatting bibliographies because Papers has one-click export to it.) Sente, the third Mac programme, doesn't yet have EZproxy support, but apparently it's coming.

Oh, and Papers has a v. nice iPad version which syncs with your awesome Mac library of pdfs.

Honestly worth getting a mac just for this.

Mister Bister
posted by bister at 6:30 AM on April 27, 2010 [1 favorite]

Best answer: I asked my smartie techie librarian friend and he said...
Everyone seemed very focused on RSS, which makes sense given what it
does for journals in terms of alerting readers to new content. I
cannot claim to have solved the problems, but had these thoughts:

- Anything built to auto-download the PDFs would have to do so very,
very carefully. As you know, vendors are very wary of active
downloaders, and anything that looks like a bot coming through a
university IP will get shut down right away. So, if one can do what I
suggest below, then I would suggest putting in a time delay between
fetches so as not to run afoul of the watchdogs.

- So the pieces that occurred to me are as follows. The person with
the original question wanted these to just synch automatically to his
(her?) iPad. Well, Dropbox instantly came to mind. If I could have
some scripty thing on my desktop or laptop grabbing these files and
tossing them into a Dropbox folder, then the synching issue is solved
since they will be available on the iPad and anywhere else this person
installs the Dropbox app. One could even share the folder with
colleagues, friends, etc. Neat, if somewhat dubious from a legal
standpoint.

- This is either a Greasemonkey script--super hard and he/she would
need a programmer of some skill to write something flexible enough to
work with various interfaces without constant tinkering--or perhaps
better for something like Yahoo Pipes, where mere mortals can patch
stuff together. I am thinking along the lines of taking the RSS feeds
and then using the String RegEx Module to extract the URL (which the
feed would have, albeit likely to the publisher's abstract page), and
then modifying it with other String Modules to build the URL to the
PDF based on predictable patterns from the abstract URL. Then one just
needs an action that retrieves what is at the other end of that URL.
Somewhere at this point, I personally would say, oh bloody hell, I
will just subscribe to the feed and click through to get the articles
I want, but a halfway competent programmer could likely build a really
spiffy tool along these lines, slap the GPL on it, and let us all
profit from their genius.
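
The pieces above can be roughed out in a few dozen lines. This is only a sketch under stated assumptions: the journal.example.org URL pattern, the ".abstract" to ".full.pdf" rewrite, the 30 second delay, and every function name here are invented for illustration, and a real publisher would need its own pattern plus whatever authentication the library requires.

```python
import re
import time
import urllib.request
from pathlib import Path

# Hypothetical abstract-page pattern: .../content/<volume>/<issue>/<page>.abstract
ABSTRACT_RE = re.compile(
    r"^https://journal\.example\.org/content/(\d+)/(\d+)/(\d+)\.abstract$"
)


def plan(abstract_url):
    """Map an abstract URL to (pdf_url, filename), or None if it doesn't match."""
    match = ABSTRACT_RE.match(abstract_url)
    if match is None:
        return None
    volume, issue, page = match.groups()
    pdf_url = f"https://journal.example.org/content/{volume}/{issue}/{page}.full.pdf"
    return pdf_url, f"{volume}-{issue}-{page}.pdf"


def download(url):
    """Fetch the raw bytes at url (no proxy or login handling in this sketch)."""
    with urllib.request.urlopen(url, timeout=60) as response:
        return response.read()


def sync_pdfs(abstract_urls, dest_dir, delay_seconds=30.0,
              fetch=download, sleep=time.sleep):
    """Save PDFs not already in dest_dir, pausing between consecutive fetches."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for abstract_url in abstract_urls:
        planned = plan(abstract_url)
        if planned is None:
            continue  # URL does not fit the assumed pattern
        pdf_url, filename = planned
        target = dest / filename
        if target.exists():
            continue  # fetched on an earlier run
        if saved:
            sleep(delay_seconds)  # time delay between fetches, per the first point
        target.write_bytes(fetch(pdf_url))
        saved.append(target)
    return saved
```

Pointing `dest_dir` at a folder inside Dropbox (for example `Path.home() / "Dropbox" / "Journals"`, again an assumption about where the folder lives) would cover the synching piece, and the abstract URLs could come from whatever the feed supplies.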

The person who wrote in about Papers is basically, if I read it
correctly, talking about manual functionality that Zotero also
possesses, i.e., the ability to go from citation to fulltext. But it
still requires user action. I cannot imagine the creators of Papers
building auto-download functionality into the software due to the
aforementioned watchdogs. One does not want to run afoul of some of
those larger firms and their lawyers.
posted by jessamyn at 2:01 PM on April 28, 2010

As those above have said, clicking through from the RSS or email TOC and then adding the reference is about as smooth as it gets, and to throw my hat in the ring too, Mendeley also works that way and has automatic syncing as well.
posted by Mr. Gunn at 5:43 PM on April 28, 2010

Response by poster: Thanks jessamyn, asking my library for help.
posted by blahblahblah at 7:07 PM on May 3, 2010
