How to convert a dictionary in XML format to a text file for use in a flashcard program?
January 7, 2011 7:15 AM
I use Anki as a flashcard program to learn languages. I have an XML file of an open source English-Swedish dictionary. I'd like to turn this file into a text file that Anki can import.
I know nothing about XML files (I don't even quite know what to open it with. OSX tries to use Adobe Illustrator, but surely that can't be right?). Is there any way to do this more or less easily?
posted by snoogles to Computers & Internet (12 answers total) 7 users marked this as a favorite
Below is an example of the entry for the word "ord", meaning "word", as displayed on the website.
The current format that I use for my cards includes fields for the word in Swedish, the definition, the inflections, and examples. It would already be fantastic to be able to extract this data from the XML file and produce a comma-, semicolon-, or tab-separated text file.
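A minimal sketch of that extraction in Python, using only the standard library. The tag and attribute names here (word, translation, paradigm, inflection, example, value) and the sample data are assumptions made for illustration; the real dictionary file may be structured differently, so the names would need to be adjusted after looking at the file in a plain text editor.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample: the real file may use other tag and attribute names.
SAMPLE_XML = """<dictionary>
  <word value="ord" class="nn">
    <translation value="word"/>
    <paradigm>
      <inflection value="ordet"/>
      <inflection value="ord"/>
      <inflection value="orden"/>
    </paradigm>
    <example value="fula ord">
      <translation value="foul language, swearwords"/>
    </example>
  </word>
</dictionary>"""


def entry_to_row(word):
    """Turn one <word> element into [swedish, definition, inflections, examples]."""
    swedish = word.get("value", "")
    # findall("translation") only matches direct children, so translations
    # nested inside <example> are not mixed into the definition.
    definition = ", ".join(t.get("value", "") for t in word.findall("translation"))
    inflections = ", ".join(
        i.get("value", "") for i in word.findall("paradigm/inflection")
    )
    examples = []
    for ex in word.findall("example"):
        text = ex.get("value", "")
        tr = ex.find("translation")
        if tr is not None:
            text += " (" + tr.get("value", "") + ")"
        examples.append(text)
    return [swedish, definition, inflections, "; ".join(examples)]


def xml_to_tsv(xml_text):
    """Return tab-separated text with one line per dictionary entry."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for word in root.iter("word"):
        writer.writerow(entry_to_row(word))
    return out.getvalue()


tsv_text = xml_to_tsv(SAMPLE_XML)
print(tsv_text)
```

For a real file, ET.parse("dictionary.xml").getroot() would replace ET.fromstring (the filename is a placeholder), and the result can be written out with open("cards.txt", "w", encoding="utf-8") so that the Swedish characters survive.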
Even better would be to have a way to extract all the idioms, and separate the ones from the same entry every time there is a comma.
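One way to sketch the idiom splitting, assuming the idioms of an entry are available as a single comma-separated string like the one shown on the website. Because some translations contain commas inside their brackets, this sketch only splits at commas that sit outside all brackets. The function name split_top_level is made up for this example.

```python
def split_top_level(text, sep=","):
    """Split text at each separator that is not inside parentheses."""
    parts = []
    current = []
    depth = 0  # how many "(" are currently open
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        if ch == sep and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    # Drop empty pieces caused by a trailing or doubled separator.
    return [p for p in parts if p]


idioms = split_top_level(
    'begära el. ha ordet ("vilja hålla el. hålla ett anförande") '
    '(ask to speak (ask for the floor) or have the floor '
    '("want to address, or address, a gathering")), '
    'ordet är fritt ("vem som helst får yttra sig") '
    '(the debate is open ("anyone may speak"))'
)
for idiom in idioms:
    print(idiom)
```

The commas inside "want to address, or address, a gathering" are left alone because they occur while a bracket is still open, so this input yields two idioms rather than four fragments.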
Even more amazing (but now entering the realm of language learning OCD) would be to automatically capitalize the example sentences and add a period at the end of the sentence (which is indicated by a space followed by an opening bracket) if there's no other punctuation mark.
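A possible sketch of that clean-up step, assuming each example is a string of the form "sentence (translation)" as in the website display. It treats the first space followed by an opening bracket as the end of the Swedish sentence. The function name tidy_example is made up for this example.

```python
def tidy_example(example):
    """Capitalize the sentence and end it with a period if it has no final punctuation."""
    # Everything before the first " (" is treated as the sentence itself.
    sentence, sep, rest = example.partition(" (")
    sentence = sentence.strip()
    if not sentence:
        return example  # nothing to tidy
    # Upper-case only the first character; str.capitalize() would also
    # lower-case the rest, which would damage proper nouns.
    sentence = sentence[0].upper() + sentence[1:]
    if sentence[-1] not in ".!?":
        sentence += "."
    return sentence + sep + rest


print(tidy_example("fula ord (foul language, swearwords)"))
print(tidy_example("säg inte ett ord till någon! (don't say a word to anyone!)"))
```

The first call adds a period before the bracket, while the second keeps the existing exclamation mark and only capitalizes the first letter. The English translation inside the brackets is left untouched here.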
Being able to do this would save me a considerable amount of time. I'm therefore ready to commit a reasonable amount of effort to making it work...
Can anyone recommend the best tools for this (preferably on OSX, but I can find a computer that runs Windows if necessary)? If you can't/don't want to walk me through how to do this step by step, what are some good tutorials that might help me figure it out?
ord noun, word
See Saldo: associations inflections
Inflections: ordet, ord, orden
Synonyms: glosa, glosor
Explanation: minsta självständiga språkliga enhet
Example: fula ord (foul language, swearwords),
säg inte ett ord till någon! (don't say a word to anyone!)
Idiom: med andra ord ("annorlunda uttryckt") (in other words ("put in another way")),
ord för ord ("ordagrant") (word for word ("literally verbatim")),
ha ord om sig ("vara känd för") att vara snål (be known to be mean),
innan man vet ordet av ("mycket snabbt") (before I knew where I was ("very quickly")),
ta till orda ("börja tala") (begin to speak),
hålla sitt ord ("hålla vad man lovat") (keep one's word ("do what one has promised")),
begära el. ha ordet ("vilja hålla el. hålla ett anförande") (ask to speak (ask for the floor) or have the floor ("want to address, or address, a gathering")),
ordet är fritt ("vem som helst får yttra sig") (the debate is open ("anyone may speak")),
ta någon på orden ("tro på vad någon säger") (take sby at their word ("believe what sby says")),
ha sista ordet ("vara den som bestämmer") (have the last word ("be the one to decide"))
Compounds: glåpord (taunt, jeer),
ord|följd (word order),
ord|lista (word list, glossary)