Automate putting PMCIDs into .bib files?
April 8, 2013 11:28 AM

BibTeX-filter: I have some large .bib files. I need to add PMCID entries to every article that has an ID in the pubmed database. What's the easiest way to automate this?

The offending entries generally do not have a PMID listed in the .bib, either -- it's not just a matter of converting, but of finding the IDs (I can convert PMID->PMCID if I have it). With the NIH rules for including PMCIDs, I can't be the only person in need of this....

Ideally, there's a tool specifically for this task. If not, can I use the bibtex import feature of some reference management software to pull in my bibs, make it look up the articles in pubmed and fetch the PMCIDs, and then export a .bib with the same cite keys as the original but with the pmcid field added to each record? If so: which? I am totally unfamiliar with tools like EndNote, Zotero, &c. -- I haven't changed my system for maintaining my bib files since their creation nearly 20 years ago (I know, I know), so I have no idea whether any of these tools are capable of what I describe. Hope me?
posted by Westringia F. to Computers & Internet (5 answers total) 2 users marked this as a favorite
I have no idea how to do what you want, but I would suggest that you try asking your question over at
posted by number9dream at 2:05 PM on April 8, 2013

Are you comfortable with BioPython/BioPerl/BioINTERCAL &c.? BioPython, at least, provides an interface to Entrez which should probably be all you'd need to associate the entry with a PMCID. Good luck!
posted by lambdaphage at 3:35 PM on April 8, 2013

Best answer: Is this the right idea?
That is, do your bib entries have the 'journal_title|year|volume|first_page|author_name|your_key|' information (and you have some sort of pubmed/ncbi key or something)?
Just looking out of curiosity, and there seem to be quite a few querying options and E-Utilities floating about the PubMed site, with some example code here and there. Likely if you can decently look up a bib entry and get the ID by hand, it could be automated at least a bit.
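
If the entries do carry those fields, building the query lines could look roughly like this in Python. This is only a sketch: it assumes each bib entry has already been parsed into a plain dict with the usual BibTeX field names (journal, year, volume, pages, author), and the function name citation_query_line is made up for illustration. It passes the first author through unchanged, so the name may still need massaging into whatever form the matcher expects.

```python
import re


def citation_query_line(entry, key):
    """Build one 'journal|year|volume|first_page|author|key|' query line.

    `entry` is assumed to be a dict of BibTeX fields (a hypothetical,
    already-parsed record); `key` is the cite key used to match results.
    """
    # "123--130" or "123-130" -> "123"
    first_page = re.split(r"[-\u2013]+", entry.get("pages", "").strip())[0]
    # BibTeX separates authors with " and "; keep only the first one.
    first_author = re.split(r"\s+and\s+", entry.get("author", "").strip())[0]
    fields = [
        entry.get("journal", ""),
        entry.get("year", ""),
        entry.get("volume", ""),
        first_page,
        first_author,
        key,
    ]
    # A literal "|" inside a field would break the format, so blank it out.
    # Missing fields are simply left empty.
    return "|".join(f.replace("|", " ").strip() for f in fields) + "|"


if __name__ == "__main__":
    example = {
        "journal": "J Foo",
        "year": "1988",
        "volume": "12",
        "pages": "123--130",
        "author": "Smith, J. and Doe, A.",
    }
    print(citation_query_line(example, "FOO88"))
```

Using the cite key as the last field means each result line can be tied back to its bib entry without any fuzzy matching.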

On preview, yeah that Entrez thing may be already wrapped up by some language bindings.
posted by zengargoyle at 3:40 PM on April 8, 2013 [1 favorite]

Best answer: And on reflection, with that batch lookup thing the 'your_key' might just be your bib key so that you can match results with the original query. The batch file query thing reads to me like you provide a file of queries:


And receive:

BAR78|whatever result
FOO88|whatever result

posted by zengargoyle at 3:44 PM on April 8, 2013 [1 favorite]

Response by poster: GLORIOUS! That NCBI Batch Citation Matcher is exactly the piece I needed. Should be able to write a .bst file to generate the query string format from my bibs pretty easily, and then moosh the output back in with perl or sed....

For future reference, here's the spec for the citation matcher's input & output formats, along with examples. Basically it's what you said, zengargoyle, 'cept the input has to end in a | and the output is the whole query string followed by the query result.
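
A rough sketch of the moosh-it-back-in step, written in Python rather than perl or sed. It assumes each output line is the six-field query string followed by one result field, and that a successful result is an all-digit ID (anything else is treated as "no match"); both are assumptions to check against the spec. The function names parse_matcher_output and add_field are invented for this example, and the regex only handles the common '@type{key,' entry opening.

```python
import re


def parse_matcher_output(text):
    """Map cite key -> matched ID from lines like 'j|y|v|p|a|KEY|RESULT'."""
    matches = {}
    for line in text.splitlines():
        parts = line.strip().split("|")
        if len(parts) < 7:
            continue  # not a complete query-plus-result line
        key, result = parts[5].strip(), parts[6].strip()
        # Assumption: a hit is an all-digit ID; skip everything else.
        if result.isdigit():
            matches[key] = result
    return matches


def add_field(bib_text, key, field, value):
    """Insert 'field = {value},' right after the '@type{key,' opening."""
    pattern = re.compile(r"(@\w+\s*\{\s*" + re.escape(key) + r"\s*,)")
    return pattern.sub(
        lambda m: m.group(1) + "\n  %s = {%s}," % (field, value),
        bib_text,
        count=1,
    )


if __name__ == "__main__":
    output = (
        "J Foo|1988|12|123|Smith J|FOO88|12345678\n"
        "J Bar|1978|3|9|Doe A|BAR78|no match\n"
    )
    bib = "@article{FOO88,\n  title = {X},\n}\n"
    for cite_key, pmid in parse_matcher_output(output).items():
        bib = add_field(bib, cite_key, "pmid", pmid)
    print(bib)
```

This writes the matched ID as a pmid field; the PMID->PMCID conversion would slot in between the two functions, with add_field then called with "pmcid" instead.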

And incidentally... AskMe wins the Best Place to Get an Obscure Question Answered award yet again! I actually did post this question on tex.stackexchange at the same time that I posted here. The hive mind came through first and best. :) Thank you all!
posted by Westringia F. at 10:15 PM on April 8, 2013
