APA --> EndNote (ugh!)
November 21, 2006 12:25 PM
I have a bunch of APA-style refs in a huge (580+ page) MS Word document. How do I get them all into EndNote?
This recent post made me decide to go ahead and ask this question, even though I have little hope.
Over the past ten years or so my boss has built up a hefty collection of journal / book references, and they have all been typed into MS Word by folks like me, over the years, and occasionally by work-study undergrads. I want to import them into EndNote, but before I do things the Really Hard Way, I want to know if there's an easier way.
Smith, J., Johnson, D., L., & Bikkle, R. (1998). What I did on my summer vacation. Cognitive Science, 9, 231-292.
Even if all of my entries were perfectly formatted (and they are not), I'm still not sure how to convert them.
My best idea so far is to write a PHP script that parses them, somehow, and determines if they are books or journals or unpublished manuscripts. Then it would spit out a tab-delimited list of refs, along with a list of references that were poorly formatted. Then someone would have to go through the first list and make sure that nothing got mangled (which it will) and go through the second list and fix each missing parenthesis by hand. Did I mention that I have around five thousand of these?
It's also possible that this PHP program will just screw things up from the get-go, and I should just have someone enter all of them by hand. I'm hoping, however, that someone out there has a Better Way. Maybe?
posted by Squid Voltaire to computers & internet (3 answers total)
Ugh. If parens are missing, they're missing around the date, so put them back in automatically.
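A minimal sketch of that automatic repair, in Python rather than the PHP or Perl mentioned in this thread. The function name restore_year_parens is made up for illustration, and it assumes the first year-like token (1800-2099, optionally with a letter suffix such as 1998a) in each reference is the publication date, which will not hold for every entry:

```python
import re

# Assumption: the first four-digit year in a reference is the publication date.
# Matches the year with or without either paren and with or without the period.
YEAR_RE = re.compile(r"\(?\b((?:1[89]|20)\d{2}[a-z]?)\b\)?\.?")

def restore_year_parens(ref):
    """Rewrite the first year-like token as '(YYYY).' and leave the rest alone."""
    return YEAR_RE.sub(lambda m: "(" + m.group(1) + ").", ref, count=1)
```

References whose date is already well formed pass through unchanged, so it should be safe to run over the whole list before any real parsing.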
Here's what you'll find, based on my doing something similar recently with precinct names and polling place locations: the failures will cluster in classes, such that one transformation will fix every reference in that class.
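One way to see those clusters is to bucket the rejected references by the first check they fail. This is only a sketch in Python: the three class names and their tests are hypothetical guesses, and the real classes would come from looking at your actual rejects.

```python
import re
from collections import defaultdict

# Hypothetical failure classes, tried in order; replace with whatever
# patterns actually show up in the rejected references.
CHECKS = [
    ("no_year", lambda r: not re.search(r"\b(?:1[89]|20)\d{2}[a-z]?\b", r)),
    ("year_missing_paren", lambda r: not re.search(r"\((?:1[89]|20)\d{2}[a-z]?\)", r)),
    ("no_final_period", lambda r: not r.rstrip().endswith(".")),
]

def failure_class(ref):
    """Return the name of the first check the reference fails."""
    for name, is_broken in CHECKS:
        if is_broken(ref):
            return name
    return "unclassified"

def group_failures(refs):
    """Map each failure class to the list of references that fall into it."""
    groups = defaultdict(list)
    for ref in refs:
        groups[failure_class(ref)].append(ref)
    return dict(groups)
```

Sorting the groups by size shows which transformation to write first; anything left in "unclassified" is what someone has to read by hand.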
OK, let's build a BNF:
APACITE ::= NAMELIST DATE ARTICLEORBOOK
NAMELIST ::= NAME | NAME [COMMA] AMPERSAND NAME | NAME COMMA NAMELIST
NAME ::= LASTNAME COMMA INITIALS
LASTNAME ::= LETTER | LETTER LASTNAME //forget init Caps so we match van den Bergs and MacIntoshes
LETTER ::= [A-Za-z] // Jennifer 8 Lee we don't match, sorry
COMMA ::= ","
AMPERSAND ::= "&"
INITIALS ::= LETTER PERIOD | LETTER PERIOD INITIALS
PERIOD ::= "."
DATE ::= [OPENPAREN] DIGITLIST [CLOSEPAREN] [PERIOD]
DIGITLIST ::= DIGIT | DIGIT DIGITLIST
DIGIT ::= [0-9]
And so forth. If you export from Word such that italics are preserved, it's all the easier to separate article titles from journal names.
Throw the BNF into Yacc or GOLD or JavaCC, or just write a regex in Perl.
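For the regex route, here is a rough sketch, written in Python instead of Perl, that handles only journal articles shaped like the example in the question. The names JOURNAL_RE, FIELDS and parse_refs are invented for illustration, and the column order is arbitrary; it is not EndNote's import layout, so the output would still need mapping to whatever import filter you use.

```python
import re

# Assumes well-formed APA journal entries:
# Authors (Year). Title. Journal, Volume(Issue), Pages.
JOURNAL_RE = re.compile(
    r"""^(?P<authors>.+?)\s+                          # author list, up to the date
        \((?P<year>\d{4}[a-z]?)\)\.\s+                # date in parens, then a period
        (?P<title>.+?[.?!])\s+                        # article title with end punctuation
        (?P<journal>[^,]+),\s*                        # journal name, up to the comma
        (?P<volume>\d+)(?:\((?P<issue>[^)]+)\))?,\s*  # volume, optional issue in parens
        (?P<pages>\d+\s*[-\u2013]\s*\d+)\.?\s*$       # page range, hyphen or en dash
    """,
    re.VERBOSE,
)

FIELDS = ("authors", "year", "title", "journal", "volume", "issue", "pages")

def parse_refs(lines):
    """Return (tab-delimited rows, rejected lines) for a list of references."""
    rows, rejects = [], []
    for line in lines:
        m = JOURNAL_RE.match(line.strip())
        if m is None:
            rejects.append(line)  # books, manuscripts, and anything malformed
            continue
        fields = m.groupdict(default="")
        fields["title"] = fields["title"].rstrip(".")
        rows.append("\t".join(fields[name] for name in FIELDS))
    return rows, rejects
```

Everything that fails to match lands in the reject list, which gives you the two piles described in the question: rows to spot-check and references to fix by hand or feed to a second pattern for books.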
Or, use an existing CPAN Perl module: Biblio::Citation::Parser
Or, use this one, that the author claims is superior: http://sunir.org/monkey/AcademicCitation/
posted by orthogonality at 1:57 PM on November 21, 2006