Smith, J., Johnson, D., L., & Bikkle, R. (1998). What I did on my summer vacation. Cognitive Science, 9, 231-292.
Even if all of my entries were perfectly formatted (and they are not), I'm still not sure how to convert them.
My best idea so far is to write a PHP script that parses them, somehow, and determines if they are books or journals or unpublished manuscripts. Then it would spit out a tab-delimited list of refs, along with a list of references that were poorly formatted. Then someone would have to go through the first list and make sure that nothing got mangled (which it will) and go through the second list and fix each missing parenthesis by hand. Did I mention that I have around five thousand of these?
It's also possible that this PHP program will just screw things up from the get-go, and I should just have someone enter all of them by hand. I'm hoping, however, that someone out there has a Better Way. Maybe?
Ugh. If parens are missing, they're missing around the date, so put them back in automatically.
Here's what you'll find, based on my doing something similar recently with precinct names and polling place locations: the failures will cluster in classes, such that one transformation will fix every entry in that class.
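To make the "put them back in automatically" step concrete, here is a minimal Python sketch of one such class-wide transformation. It assumes the first four-digit year in an entry (19xx or 20xx, optionally followed by a letter such as 1998a) is the publication date; the name fix_date_parens is made up for illustration, and entries with a year in an author name or at the start of a title would need their own rule.

```python
import re

# Assumption: the first 19xx/20xx year in the entry is the publication date.
# Optionally consumes a stray "(" before it and a stray ")" and "." after it.
YEAR = re.compile(r"\(?\b((?:19|20)\d{2}[a-z]?)\b\)?\.?")

def fix_date_parens(ref: str) -> str:
    """Rewrite the first year in ref as "(YYYY)." whatever parens it had."""
    return YEAR.sub(r"(\1).", ref, count=1)

if __name__ == "__main__":
    for raw in ("Smith, J. 1998. What I did.",
                "Smith, J. (1998. What I did.",
                "Smith, J. 1998). What I did."):
        print(fix_date_parens(raw))
```

Entries that already have a well-formed date pass through unchanged, so it should be safe to run over the whole list before any parsing.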
OK, let's build a BNF
APACITE ::= NAMELIST DATE ARTICLEORBOOK
NAMELIST ::= NAME | NAME AMPERSAND NAME | NAME COMMA NAMELIST
NAME ::= LASTNAME COMMA INITIALS
LASTNAME ::= LETTER | LETTER LASTNAME //forget init Caps so we match van den Bergs and MacIntoshes
LETTER ::= [A-Za-z] // Jennifer 8 Lee we don't match, sorry
COMMA ::= ","
AMPERSAND ::= "&"
INITIALS ::= LETTER PERIOD | LETTER PERIOD INITIALS // allow more than one initial
PERIOD ::= "."
DATE ::= [OPENPAREN] DIGITLIST [CLOSEPAREN] [PERIOD]
DIGITLIST ::= DIGIT | DIGIT DIGITLIST
DIGIT ::= [0-9]
And so forth. If you export from Word such that italics are preserved, it's all the easier to separate article titles from journal names.
Throw the BNF into Yacc or GOLD or JavaCC, or just write a regex in Perl.
Or, use an existing CPAN Perl module: Biblio::Citation::Parser
Or, use this one, that the author claims is superior: http://sunir.org/monkey/AcademicCitation/
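If you go the plain-regex route instead, here is a rough sketch in Python rather than Perl of the two-list workflow from the question: entries that parse come out tab-delimited, and everything else lands in a reject pile for hand fixing. It only recognizes journal articles shaped like the Smith example, so books and unpublished manuscripts will fall into the rejects until you add patterns for them. JOURNAL, FIELDS and split_refs are names invented for this sketch, not part of any library.

```python
import re

# Assumed shape: Authors (Year). Title. Journal, Volume(Issue), Pages.
JOURNAL = re.compile(
    r"^(?P<authors>.+?)\s+"
    r"\((?P<year>\d{4}[a-z]?)\)\.\s+"
    r"(?P<title>.+?[.?!])\s+"
    r"(?P<journal>[^,]+),\s*"
    r"(?P<volume>\d+)(?:\((?P<issue>[^)]+)\))?,\s*"
    r"(?P<pages>\d+\s*[-\u2013]\s*\d+)\.?\s*$"
)

FIELDS = ("authors", "year", "title", "journal", "volume", "issue", "pages")

def split_refs(refs):
    """Return (tab-delimited rows, entries that did not match)."""
    parsed, rejects = [], []
    for ref in refs:
        m = JOURNAL.match(ref.strip())
        if m:
            # A missing issue number becomes an empty column.
            parsed.append("\t".join(m.group(f) or "" for f in FIELDS))
        else:
            rejects.append(ref)
    return parsed, rejects

if __name__ == "__main__":
    good, bad = split_refs([
        "Smith, J., Johnson, D., L., & Bikkle, R. (1998). What I did on my "
        "summer vacation. Cognitive Science, 9, 231-292.",
        "Smith, J. 1998 What I did on my summer vacation",
    ])
    print("\n".join(good))
    print(len(bad), "entries need a human")
```

The pattern is deliberately strict: a wrong split that looks plausible is harder to catch in review than an entry that shows up in the reject list.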
posted by orthogonality at 1:57 PM on November 21, 2006