How to export Outlook PST mail archives to a text format
October 17, 2010 10:15 AM Subscribe
How do I export Outlook .pst mail archives to a text format for backup purposes?
I have Outlook 2000 mail folders stretching back 10 years. To prevent the Inbox getting too large, at the end of every year I archive the previous year's Inbox and Sent into a Mailarchive200x mailbox. Each such mailbox is a .pst file.
I want to export all that old mail into a text format. My goal is not portability, since I don't care about having mail over 2 years old on my mail server. (I have been on IMAP for the last 2 years, and am actually happy to stick with Outlook 2000, such is the wretched state of all other email clients.) But I do want to preserve all that mail, in a format that can't easily become corrupted or obsolete, is readable outside of the application that created it, and is easily indexable by a desktop search app (such as Copernic).
I'd prefer to have each mailbox as a single file, rather than individual mail messages, unless you can convince me otherwise. The most important mail headers should be retained. I'd prefer if HTML formatting was retained. Attachments I'm not too worried about, as long as the attachment filenames are visible in a message's mail header and the attachments themselves are dumped in the same directory with the mail archive.
Is there a solution to my problems? This previous AskMefi didn't help.
Best answer: Maybe there is an easier way, but at the least you can select all the messages in the archive, do File > Save As..., and save as txt. No idea if there's an easy way to save all attachments, though you could always do a search to find all messages with attachments.
posted by beyond_pink at 10:48 AM on October 17, 2010
How about importing the mail into Mozilla Thunderbird, which uses the mbox format (essentially one long text file for each folder)? It's a standard format and indexable by at least some search apps, though I can't say whether Copernic is one of them.
I'm not sure how that deals with attachments; using VBA to export them all to a folder first might be useful.
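For what it's worth, once the mail is in an mbox file the attachments can also be pulled out without Outlook. The following is only a rough sketch in Python using the standard `mailbox` module, not the VBA route suggested above; the function name `dump_attachments` and the numbered filename scheme are made up for illustration.

```python
import mailbox
import os


def dump_attachments(mbox_path, out_dir):
    """Save every attachment found in an mbox file into out_dir.

    Returns the list of filenames written, in the order they were found.
    """
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for index, msg in enumerate(box):
            for part in msg.walk():
                filename = part.get_filename()
                if not filename:
                    continue  # message bodies and multipart containers
                payload = part.get_payload(decode=True)
                if payload is None:
                    continue
                # Drop any directory components (including Windows-style
                # ones) and prefix the message index, so attachments with
                # the same name in different messages do not collide.
                base = os.path.basename(filename.replace("\\", "/"))
                safe = f"{index:05d}_{base}"
                with open(os.path.join(out_dir, safe), "wb") as fh:
                    fh.write(payload)
                saved.append(safe)
    finally:
        box.close()
    return saved
```

Calling something like `dump_attachments("Mailarchive2005", "Mailarchive2005_attachments")` (both names hypothetical) would leave the attachments in a folder next to the archive, while the attachment filenames stay visible in the MIME headers inside the mbox itself.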
posted by Boobus Tuber at 11:26 AM on October 17, 2010
Best answer: You can do the 'Save as Text' thing mentioned above, but I am skeptical about how portable that format is — it'll let you extract the content of the email, certainly, but there are other routes you can go which would let you load the email, in bulk, back into a mailserver at some later date.
Although there is more work involved, what I do is have a mailserver of my own set up (you can do this either on an actual spare PC you have around, running Linux or something, or using a virtual machine ... it doesn't need to be on all the time), running Dovecot exposed as IMAP. If you are running Ubuntu it is pretty easy to set this up (cf. #3b), and you don't have to worry about much in the way of security since you're only going to access it from within your home network. You want to set it to use 'mbox' as the storage format.
You then create a new mail account in Outlook for this server, create folders and copy messages to it however you like. You can copy all the mail from your various PSTs over to it, if you like. You can do this from as many client programs (assuming they are new enough to support IMAP) and as many computers as you own — making it a central archive repository for your mail.
What you'll end up with is a lot of plaintext files on the server in mbox format (described in RFC 4155), one for each subfolder that you create. Virtually every email program can deal with these; they're well-documented, and there are lots of utilities and libraries for parsing, loading, sorting, and otherwise manipulating them. It is probably the best archive format going, in my opinion. Copy them off to CDs or a portable hard drive or two, and you shouldn't ever have a problem reading that mail.
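As one illustration of the "utilities and libraries" point, Python's standard library ships a `mailbox` module that reads mbox files directly. This is a minimal sketch, assuming the server's mbox file has been copied somewhere readable; the function name `summarize_mbox` is hypothetical.

```python
import mailbox
from email.header import decode_header, make_header


def summarize_mbox(path):
    """Return a (date, sender, subject) tuple for every message in an mbox file."""
    rows = []
    # create=False makes a missing file an error instead of silently
    # creating an empty mailbox.
    box = mailbox.mbox(path, create=False)
    try:
        for msg in box:
            # Subjects may be MIME-encoded (e.g. =?utf-8?...?=); decode them
            # into readable text.
            raw_subject = str(msg.get("Subject", ""))
            subject = str(make_header(decode_header(raw_subject)))
            rows.append((str(msg.get("Date", "")), str(msg.get("From", "")), subject))
    finally:
        box.close()
    return rows
```

Looping over `summarize_mbox("Mailarchive2005")` (a hypothetical filename) and printing each row gives a quick index of the archive, with no mail client involved.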
Of course ... you can construct a pretty good argument that Outlook PSTs are so common, that despite being a crummy archival format on its merits, they're such a de facto standard (like Word .docs) that you'll really never have much trouble opening them because someone will always have a tool or conversion utility for sale. So if you just copied those files off to a CD right now, you'd probably be better than 90% of the people around. However if you really want to do it right, I'd say that mbox is the format you want.
posted by Kadin2048 at 11:30 AM on October 17, 2010 [1 favorite]
How do I export Outlook .pst mail archives to a text format for backup purposes?
You don't.
Don't save it as unstructured text, you'll be shooting yourself in the foot. If you want to save mail, save it as mbox. Simple, reliable, greppable, and most importantly pretty much every mail-manipulation tool anywhere will be able to read it back in. Pretty poor performance characteristics, but hey. Like you say, this is for backups.
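To make the "greppable" and "read it back in" points concrete, here is a rough Python sketch that searches an mbox file using only the standard library. The helper names `_plain_text` and `find_messages` are invented for this example, and falling back to latin-1 for unknown charsets is an assumption, not something mbox requires.

```python
import mailbox


def _plain_text(msg):
    """Concatenate the decoded text/plain parts of a message."""
    chunks = []
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "latin-1"
        try:
            chunks.append(payload.decode(charset, errors="replace"))
        except LookupError:
            # Charset name not known to Python; latin-1 never fails to decode.
            chunks.append(payload.decode("latin-1"))
    return "\n".join(chunks)


def find_messages(mbox_path, needle):
    """Return (sender, subject) for messages whose subject or plain-text
    body contains needle, compared case-insensitively."""
    needle = needle.lower()
    hits = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            subject = str(msg.get("Subject", ""))
            haystack = (subject + "\n" + _plain_text(msg)).lower()
            if needle in haystack:
                hits.append((str(msg.get("From", "")), subject))
    finally:
        box.close()
    return hits
```

Plain `grep` on the file works too for unencoded text; going through a parser mainly helps when bodies are base64 or quoted-printable encoded.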
What you want to do here is set up an IMAP server for yourself or use somebody else's (Google's totally fine) and then use Thunderbird plus this extension to export the folder to wherever you want to save it.
posted by mhoye at 11:32 AM on October 17, 2010
Response by poster: Three different answers are all right, in a way. Select All: Save As Text is something I can do quickly, and will do just enough to give me peace of mind that all that old email is preserved.
The best answer, by the looks of it, is as Kadin2048 says to set up my own IMAP mailserver and move the stuff there in mbox format. However, knowing my (lack of) sysadmin prowess and miserable time management, I don't know if I'll ever get that far. (The fact that I'm on Windows doesn't help either.)
And finally, as Kadin2048 also says, .pst file utilities abound and the mail should be safe in that format for a long time. (And I am in the habit of backing them up regularly.)
Thanks people!
posted by snarfois at 3:25 PM on October 17, 2010
You know, this question comes up often enough (seems like every few months) that I wonder if there's a market for this as a service.
Basically just an IMAP provider that, instead of giving you an address for incoming email, lets you upload messages from your client via IMAP and then allows you to download your mail as mbox files (or archives it for you on DVD or something).
Seems like the sort of thing somebody could set up on EC2 in a few days. For all I know it might already exist.
posted by Kadin2048 at 9:50 AM on October 18, 2010
This thread is closed to new comments.
Note: This works in Outlook 2007, but I don't have Outlook 2000 to try it out with. Nonetheless, this part of Outlook probably hasn't changed.
posted by Simon Barclay at 10:47 AM on October 17, 2010