How to automate a reverse DOI lookup
January 11, 2011 2:54 PM

I have a database containing more than 2000 scientific literature references with titles, authors, journal, publication year etc. I would like to add Digital Object Identifiers (DOIs). So far, I've done it by looking up the articles on Google (or on the publishers' sites), but that's obviously too time-consuming (and some references don't have DOIs). How can I automate the process?

CrossRef offers two reverse DOI lookup forms (metadata in, DOI out): one with a simple input box and one that accepts a text file containing references. The simple one works, but only with small batches. The upload form did not work: I got back a file without DOIs, even though some of the references I sent were correctly identified by the simple form. Does anyone know a better way to do this?
posted by elgilito to Computers & Internet (5 answers total) 1 user marked this as a favorite
 
Depending on what your end goals are, Mendeley might be the program for you. It automates the extraction of metadata and indexes everything--there are also ways to export some of this information. The program is free and runs on Windows, OSX, and Linux.
posted by Aanidaani at 3:00 PM on January 11, 2011


Response by poster: Unfortunately Mendeley just does the regular DOI lookup (DOI in, metadata out). It does find URLs, so that's a start, but I would definitely prefer DOIs (which are supposed to be more stable). BTW I found a CrossRef API to retrieve DOIs but it requires a CrossRef account.
posted by elgilito at 2:29 AM on January 12, 2011
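
A note for later readers: CrossRef also runs a public REST API at api.crossref.org which, as far as I know, accepts a free-text citation through its `query.bibliographic` parameter. This is not the account-based API mentioned above, and nobody in this thread used it. Here is a minimal Python sketch, assuming that endpoint and the usual `message` / `items` response layout. The function names and the `mailto` value are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public CrossRef REST endpoint, not the account-based API from the thread.
CROSSREF_WORKS = "https://api.crossref.org/works"


def crossref_query_url(citation, rows=1, mailto=None):
    """Build a search URL for one free-text citation (hypothetical helper)."""
    params = {"query.bibliographic": citation, "rows": rows}
    if mailto:
        # Optional contact address; the value is supplied by the caller.
        params["mailto"] = mailto
    return CROSSREF_WORKS + "?" + urllib.parse.urlencode(params)


def best_candidate(payload):
    """Return (doi, score) for the top hit, or None when there are no hits.

    The score is only a relevance ranking, so the match still needs a human check.
    """
    items = payload.get("message", {}).get("items", [])
    if not items:
        return None
    top = items[0]
    return top.get("DOI"), top.get("score")


def lookup(citation, mailto=None):
    """Fetch the top candidate for one citation (performs a network request)."""
    url = crossref_query_url(citation, mailto=mailto)
    with urllib.request.urlopen(url, timeout=30) as response:
        return best_candidate(json.load(response))
```

The top hit is only a candidate: a reference from an obscure journal with no DOI will still get back the closest-looking record, so comparing the returned title against the original is worth the extra step.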


If the papers are indexed by PubMed, you could obtain the DOIs by a two-step process: first find the PMIDs using the batch citation matcher form or the esearch API method, and then find the DOIs using the esummary API method.
posted by James Scott-Brown at 12:21 PM on January 13, 2011
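
The two-step PubMed route described above could look roughly like this in Python. It is a sketch, not a tested pipeline: it assumes the NCBI E-utilities JSON output (`retmode=json`) and builds the search term from title, first author and year, which is only one of several possible matching strategies. All function names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(title, first_author, year):
    """Step 1 URL: search PubMed for one citation by title, author and year."""
    term = f"{title}[ti] AND {first_author}[au] AND {year}[dp]"
    query = urllib.parse.urlencode({"db": "pubmed", "term": term, "retmode": "json"})
    return f"{EUTILS}/esearch.fcgi?{query}"


def esummary_url(pmid):
    """Step 2 URL: fetch the summary record for one PMID."""
    query = urllib.parse.urlencode({"db": "pubmed", "id": pmid, "retmode": "json"})
    return f"{EUTILS}/esummary.fcgi?{query}"


def pmids_from_esearch(payload):
    """Pull the list of matching PMIDs out of an esearch JSON response."""
    return payload.get("esearchresult", {}).get("idlist", [])


def doi_from_esummary(payload, pmid):
    """Pull the DOI, if any, out of an esummary JSON response."""
    record = payload.get("result", {}).get(str(pmid), {})
    for article_id in record.get("articleids", []):
        if article_id.get("idtype") == "doi":
            return article_id.get("value")
    return None


def fetch_json(url):
    """Download a URL and decode the JSON body (performs a network request)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def lookup_doi(title, first_author, year):
    """Run both steps; give up unless the search matches exactly one record."""
    pmids = pmids_from_esearch(fetch_json(esearch_url(title, first_author, year)))
    if len(pmids) != 1:
        return None  # no match, or too ambiguous to trust
    return doi_from_esummary(fetch_json(esummary_url(pmids[0])), pmids[0])
```

Returning nothing on zero or multiple hits is deliberate: with 2000 references, a wrong DOI is harder to spot afterwards than a missing one. NCBI also asks for a modest request rate, so a short pause between calls is a good idea when looping.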


Also, the CrossRef Simple Text Query Form states "Please contact us (info@crossref.org) if you represent an organization that needs to retrieve CrossRef DOIs in quantity". If you explain why you need to do 2000 look-ups, they might be willing to help.
posted by James Scott-Brown at 12:27 PM on January 13, 2011


Response by poster: Thanks for the input. In the end I used the simple CrossRef form and processed the references in batches of 50-60. It took a couple of hours. CrossRef retrieved both DOIs and PMIDs, and I got about 420 DOIs and 50 PMIDs (most of the references are from really obscure journals). For some reason CrossRef failed to find the DOIs for certain journals even though they use them, so I'll probably have to process those refs manually.
posted by elgilito at 1:24 PM on January 13, 2011
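
For anyone repeating the batch approach described above, splitting the reference list into paste-sized chunks is easy to script. This is a minimal sketch that assumes one reference per string. The batch size of 50 comes from the low end of the range mentioned above, and the function names are made up for illustration.

```python
def batches(references, size=50):
    """Yield consecutive slices of at most `size` references."""
    for start in range(0, len(references), size):
        yield references[start:start + size]


def batch_texts(references, size=50):
    """Join each batch into one block of text, one reference per line."""
    return ["\n".join(batch) for batch in batches(references, size)]


if __name__ == "__main__":
    # Placeholder data standing in for the real reference list.
    refs = [f"Reference {n}" for n in range(1, 2001)]
    blocks = batch_texts(refs)
    print(len(blocks), "batches to paste")
```

Each block can then be pasted into the form one at a time, and the results collected in the same order.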

