Bulk Exporting E-Mail from Eudora Pro to TXT
May 9, 2007 1:53 PM

How would I batch export Eudora Pro e-mails with customized filenames?

After reading "Bit Literacy" recently (highly recommended, by the way), I deleted 8,000 e-mails from my Inbox. Now, though, I need to export to text files the remaining 1,000 messages that I need to keep.

I am using Eudora Pro 7.1 (although I could easily revert to Eudora 5.1, if necessary) on Windows XP. I found this program, which does the exporting quite nicely but, unfortunately, does not allow for detailed file names.

Specifically, I need to export each e-mail to its own text file, with a file name along the lines of:

"YYYY MM DD, Time, Author, Subject.txt"
or "2007 05 08, 19-27 PST, Bob Smith, Exporting E-Mail.txt"

The dates and times put the folder of text files in chronological order, and the author and subject make it easier to find things.
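For anyone reading along, here is a minimal Python sketch of how a name in that format could be assembled from a message's Date, From, and Subject headers. The function name `build_name` is made up for illustration, and it assumes the Date header is in the usual RFC 2822 form. The header normally carries a numeric offset rather than an abbreviation like "PST", so the zone name is left out, and characters that Windows rejects in file names are not stripped here.

```python
from email.utils import parsedate_to_datetime, parseaddr


def build_name(date_hdr, from_hdr, subject_hdr):
    """Return 'YYYY MM DD, HH-MM, Author, Subject.txt' from raw header values."""
    # Parse the Date header into a datetime (keeps the sender's local clock time).
    when = parsedate_to_datetime(date_hdr)
    # Split 'Bob Smith <bob@example.com>' into display name and address.
    name, addr = parseaddr(from_hdr)
    author = name or addr  # fall back to the address if there is no display name
    return f"{when:%Y %m %d}, {when:%H-%M}, {author}, {subject_hdr}.txt"


if __name__ == "__main__":
    print(build_name("Tue, 08 May 2007 19:27:00 -0800",
                     "Bob Smith <bob@example.com>",
                     "Exporting E-Mail"))
```

Because the year, month, and day come first and are zero-padded, sorting the folder by name also sorts it by date.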

What can I do to accomplish this?

Is there a program or plug-in that already exists?
Someone recommended a Python script, but that is not something with which I am familiar.
Is that something that, with instruction, I could do myself?
Or can I pay somebody to create the script or software that I need?

Thank you!
posted by stst399 to Computers & Internet (13 answers total) 1 user marked this as a favorite
 
Response by poster: Not that it probably matters much, but I actually have Windows Vista. Oops! ("Confirm or deny?")
posted by stst399 at 2:08 PM on May 9, 2007


I don't know the answer, but am curious why you'd want each email in a separate text file. There might be a simpler way to get what you want. For example, if it's for something like searching or archiving, Eudora's mailboxes are already text files, just one per mailbox. You can manipulate them with any good text editor.
posted by EllenC at 2:46 PM on May 9, 2007


At least with Eudora 5.x and 6.x, and I'm guessing with 7.x, you could do a File/Save As on all highlighted e-mail (Ctrl-A to select all in a folder) and it would save the entire e-mail dump to a text file without any need for plug-ins.

Once you have your e-mails in a text file, you're golden. Just checked with Eudora Pro 6.1 and it places the line "-----Original Message-----" between each e-mail in the text file. It's trivial to write a script that splits on that delimiter, reads each e-mail header, and outputs a separate file in your preferred name format for each e-mail. Myself, I'd probably use Perl to do it, but just about any modern language would suffice.
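To give a rough idea of the splitting step, here is a short sketch in Python (the language the poster mentioned) rather than Perl. It assumes the dump really does contain that delimiter line between messages; `split_dump` is a made-up name, and a message body that happens to contain the same delimiter text would be split in the wrong place.

```python
DELIMITER = "-----Original Message-----"


def split_dump(text, delimiter=DELIMITER):
    """Split one big text dump into a list of individual message strings."""
    parts = text.split(delimiter)
    # Drop surrounding blank space and skip any empty chunks.
    return [p.strip() for p in parts if p.strip()]


if __name__ == "__main__":
    dump = "From: a\nhi\n-----Original Message-----\nFrom: b\nyo\n"
    for number, message in enumerate(split_dump(dump), start=1):
        print(number, repr(message))
```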

First things first, though. Check whether 'Save As' dumps all your e-mails to a single monolithic text file, and whether each e-mail is separated by a recognizable pattern.

If the text dump works as expected, but you're not a programmer type, just post back. If someone else doesn't pop in here with a 3-line Perl program or 2-cent macro or 1 great plug-in to parse out the e-mails in the next few hours, my decaying talent remains sufficient to hack out a quick script for ya. You have a Perl-ready machine available without messing about with a new install, yes? Though a macro language is probably sufficient.
posted by mdevore at 2:52 PM on May 9, 2007


Response by poster: Yes, mdevore, I can export either as one mega text file or as individual files (but with the wrong file names). And since there actually is no separator in the mega file, it's probably going to be a lot easier to work with those individual files.

I know a lot about computers and macros, but next to nothing about writing or running scripts. How would I go about pulling the details out of those individual text e-mail files and creating the file names that I described above?
posted by stst399 at 3:08 PM on May 9, 2007


Welp, not to intrude on private affairs, but if you have a few random-idiot-reader-safe e-mails, can you send me a dump of the text file of those using File/Save As? Might be all you/me/we need to finish the task.

I would suggest uploading a few to a publicly accessible site for everybody who's interested in taking a crack at it, but umm, global Internet access lowers the bar for 'random-idiot-reader-safe' below even my own level of idiocy.
posted by mdevore at 3:14 PM on May 9, 2007


Response by poster: Good idea.
Here are the files.

Notice that the individual files are out of order since it's message #4 (the fifth one) that was actually the first one sent. That's why it's important to get the date and times into the file names so that everything will then sort correctly after being processed.

Also, I realized that certain characters would have to be stripped out. The "Re:" in subject names would need to become "Re" since colons aren't allowed in file names. (I wonder too about stripping out the HTML code or not.)
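As a sketch of that clean-up step in Python: Windows does not allow the characters \ / : * ? " < > | in file names, so one simple approach is to delete them and tidy the spacing. The helper name `clean_for_filename` is made up for illustration, and stripping HTML is deliberately not attempted.

```python
import re

# Characters Windows rejects in file names: \ / : * ? " < > |
ILLEGAL = r'[\\/:*?"<>|]'


def clean_for_filename(text):
    """Remove characters Windows rejects and collapse runs of whitespace."""
    cleaned = re.sub(ILLEGAL, "", text)
    cleaned = re.sub(r"\s+", " ", cleaned)
    # Windows also dislikes names that end in a space or a dot.
    return cleaned.strip(" .")


if __name__ == "__main__":
    print(clean_for_filename("Re: Exporting E-Mail"))
```

With this, "Re: Exporting E-Mail" becomes "Re Exporting E-Mail", which matches the "Re:" to "Re" change described above.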

Thanks,
Scott
posted by stst399 at 3:46 PM on May 9, 2007


What? Nobody came in with a script while I was taking my power nap? MetaFilter members are, like, 14.3% full-time programmers and nobody, mutter, mutter, buncha lazy, mutter, crab, crab, crab.

OK, I had a chance to briefly look over your files. It's a bit weird that Eudora 7.x no longer has any delimiters between e-mails on text dumps like with version 5 and 6. Plus there are definitely different starting patterns to match depending on e-mail source. No big deal as long as it's just a few like that, and you don't have e-mail content trying to spoof other e-mails.

Should have a chance to write, test, and post a Perl script up later tonight. I'll have it underscore illegal chars and spaces (because spaces are generally a pain everywhere) for date/fm/sub file naming unless I see a different format request from you before then.
posted by mdevore at 7:55 PM on May 9, 2007


Best answer: OK, emex.pl is up as a downloadable ZIP file. 87-line Perl app, hopefully you can use that.

emex.pl works on your test case file, breaking each e-mail into a separate file of five total. There are three patterns to recognize individual e-mails. You may need to add more but it is trivial to do if you can follow program logic. Simply add more patterns to the top targets array, following the example of the current three. The rest of the program isn't hard-coded to only the three patterns, so no worry there.
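For readers who want to see the general shape of that logic without opening the ZIP, here is a rough Python sketch of the same idea: a list of start-of-message patterns at the top, and a loop that begins a new message whenever a line matches any of them. This is not a port of emex.pl. The three patterns below are placeholders (the real ones are not shown in this thread), and the sketch assumes each pattern only ever matches the first line of a message.

```python
import re

# Placeholder patterns, NOT the ones in emex.pl. Add more to this list as needed.
TARGETS = [
    re.compile(r"^Return-Path: "),
    re.compile(r"^X-Sender: "),
    re.compile(r"^X-Mailer: "),
]


def split_messages(lines, targets=TARGETS):
    """Group lines into messages, starting a new one at each pattern match."""
    messages, current = [], []
    for line in lines:
        starts_new = any(t.match(line) for t in targets)
        if starts_new and current:
            messages.append(current)  # close the message collected so far
            current = []
        current.append(line)
    if current:
        messages.append(current)  # do not lose the final message
    return messages


if __name__ == "__main__":
    sample = ["X-Sender: a", "Subject: one", "hello",
              "Return-Path: <b>", "Subject: two", "bye"]
    for message in split_messages(sample):
        print(message)
```

Since the loop only consults the list, adding a fourth pattern means adding one line to `TARGETS` and nothing else.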

The output file name is long, you might want to trim it. Currently the app grabs all of the date, from, and subject fields, validates the chars, and makes the entire blob the output file name. Certainly could stand more intelligent parsing, but I figured what the heck, I didn't know exactly what you wanted and you probably wanted to customize it anyway. Left as an exercise for the reader, and all that.

Admittedly ugly. I've been away from and am rusty on Perl, so the app is pretty much a hack and whack job. However, successful behavior is worth something. Got any problems or further questions, drop a note.
posted by mdevore at 1:51 AM on May 10, 2007


Response by poster: Got the script. I guess I need to install Perl 5.8.8, huh? I'll be working on that next. Thank you!

If anyone has other suggestions too for this issue, I'm still open to alternatives. :-)
posted by stst399 at 1:33 PM on May 10, 2007


Any Perl installed within the past 12 years ought to work; the script is vanilla 5.x. Or if you have access to any type of *nix box, Perl is usually installed as a matter of course.

Alternatively, it's not too hard to move the logic to most modern programming languages, including Python as you mentioned earlier. I was tempted to use C, but you'd need a push/pop library and have to deal with compiling and linking, so no. If JavaScript had a seamless way to write to a file, I'd have been all over that. Perl makes my head hurt when going beyond the regex support.
posted by mdevore at 1:55 PM on May 10, 2007


If you really don't want to install Perl, here it is, run through the ActiveState tool to make it a standalone executable.
posted by phearlez at 3:25 PM on May 10, 2007


Incidentally, and because I only have a slight majority of posts in this topic so far, I've thought of a couple ways a subtle deviation in your e-mails could make the script misbehave. Few minutes to fix if that happens, though, so if you have problems with e-mails being broken in two with bad output files or running together, you could, should you so wish, post another example file and we can clear that up no sweat.
posted by mdevore at 9:40 PM on May 10, 2007


Response by poster: I've got Perl installed and working. I found someone here who is making a few tweaks to the script to narrow down the fields in the file name even more. When it's all done and properly working, I'll post it here for everyone.... Thanks!
posted by stst399 at 8:44 AM on May 11, 2007

