How should I convert a Thunderbird inbox into text files?
March 14, 2008 11:57 AM
What's the best way to export all of the emails in a Thunderbird inbox to text files, one text file per email?
I'm planning on doing some linguistic analysis of emails and need to take a Thunderbird inbox and convert it into text files before I get to work. Google searches have revealed many mbox (which I think Thunderbird uses) parsers of, I am sure, varying quality. I'd love to skip the endless trial and error stage and start with the perfect tool. What's the best way to convert a Thunderbird inbox to text files?
ps - I'm comfortable using Perl and Python if you guys think a scripting solution is best.
What OS? On XP, you can install a generic "print to file" printer, select all the messages in the Inbox and print them directly to individual text files using that printer.
posted by bizwank at 12:14 PM on March 14, 2008
Thunderbird stores mail as mbox; Sylpheed stores it in MH format, which is essentially one file per message.
So, if you just install Sylpheed and use the file/import to import your mbox file, you will be done.
posted by gmarceau at 12:34 PM on March 14, 2008
Best answer: I recently exported a few years' worth of email to .eml format from Thunderbird using the SmartSave extension. An .eml file is basically a text file that includes some header information at the beginning. I think from there you could just do a batch rename of all your newly-created files (*.eml -> *.txt).
posted by good in a vacuum at 1:28 PM on March 14, 2008 [1 favorite]
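The batch rename mentioned above could be scripted in Python. This is only a sketch: the function name `rename_eml_to_txt` and the folder argument are made up for illustration, and it assumes the exported .eml files all sit in a single folder.

```python
from pathlib import Path


def rename_eml_to_txt(folder):
    """Rename every *.eml file in `folder` to *.txt and return the new paths."""
    renamed = []
    for eml in sorted(Path(folder).glob("*.eml")):
        target = eml.with_suffix(".txt")
        if target.exists():
            # Never overwrite an existing .txt file; leave the .eml alone.
            continue
        eml.rename(target)
        renamed.append(target)
    return renamed
```

Calling `rename_eml_to_txt("exported_mail")` (a hypothetical folder name) leaves the file contents untouched, so the headers at the top of each message are still there afterwards.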
good in a vacuum FTW!!
I've been looking to do this for a while. I just installed SmartSave, and it looks like it does a great job. You can even configure it to use the .txt extension automatically.
posted by ochenk at 1:40 PM on March 14, 2008
Response by poster: Thanks guys! Now I just have to strip out all the html people shove in their email and I'll be ready to go.
posted by eisenkr at 4:55 PM on March 14, 2008
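For the HTML-stripping step, one option is Python's standard `html.parser` module. The sketch below is an assumption about what "strip out the html" needs to cover: it keeps visible text, drops `<script>` and `<style>` contents, and turns a few block tags into line breaks. The names `_TextExtractor` and `strip_html` are made up for this example.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}
    BREAKS = {"br", "p", "div"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BREAKS:
            # Treat common block-level tags as line breaks.
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def strip_html(html):
    """Return the text content of an HTML string with tags removed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()
```

Real-world email HTML is often malformed, so for a large corpus it may be worth spot-checking the output before running the analysis.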
Perl's Mail::Mbox::MessageParser module should do what you want.
posted by zippy at 12:07 PM on March 14, 2008
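Since the poster mentioned Python as well, a similar scripted route is possible with the standard library's `mailbox` module. This is a rough sketch, not a tested recipe for any particular profile: `split_mbox` and `body_text` are names invented here, the output naming scheme is arbitrary, and it assumes the Inbox is a plain mbox file and that only the Subject header and the text/plain parts are wanted.

```python
import mailbox
from pathlib import Path


def body_text(msg):
    """Join the decoded text/plain parts of a message."""
    chunks = []
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)  # bytes, or None for containers
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            chunks.append(payload.decode(charset, errors="replace"))
        except LookupError:
            # Unknown charset label in the message: fall back to UTF-8.
            chunks.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(chunks)


def split_mbox(mbox_path, out_dir):
    """Write each message in the mbox to its own numbered .txt file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    box = mailbox.mbox(str(mbox_path), create=False)
    count = 0
    try:
        for count, msg in enumerate(box, start=1):
            subject = msg.get("Subject", "")
            text = f"Subject: {subject}\n\n{body_text(msg)}"
            (out / f"{count:05d}.txt").write_text(text, encoding="utf-8")
    finally:
        box.close()
    return count
```

Usage would look something like `split_mbox("path/to/Inbox", "inbox_txt")`, where both paths are placeholders. It is safer to run this on a copy of the Inbox file while Thunderbird is closed.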