Email in mbox format vs individual message storage ...
November 2, 2007 4:46 PM

What are the pros/cons of Apple's move away from using the standard .mbox format in their Mail software?

I am a long-time Mac user, now approaching the 'end-of-life' of my current system, a G4 running OSX 10.3.9. For the ten years that I have been using email, every program I have used to handle it, starting with Navigator 3.1, has used the standard mbox format for storing messages. In OSX 10.4, Apple switched to storing each email message as an individual file (for easier searching, I understand), and this seems to have continued in the newest upgrade.

But what are the negative implications of having hundreds of thousands of messages stored this way? Is such overhead really manageable for the file system? Could such scattered mail ever be converted back into mbox format for use with another system? Does it matter?

At present, I'm thinking that when the time comes to upgrade to a new Mac, I'd rather use Thunderbird, which does seem to use the mbox format. I would appreciate hearing advice/recommendations about potential problems with this.
posted by woodblock100 to Computers & Internet (15 answers total)
 
Mail.app can still import and export mbox files.
posted by mpls2 at 4:58 PM on November 2, 2007


Apple Mail has a "save as" feature, which in 10.4 allowed you to export mail into a mostly-mbox format; I assume it's still there in 10.5.

It's (almost) always possible to convert one format to another; it just assumes you're ready/willing/able to put in the time to figure out how to do it.
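
As a rough illustration of such a conversion, here is a minimal Python sketch that gathers a folder of one-message-per-file mail into a single mbox using the standard library's `mailbox` module. It assumes each file is a plain RFC 822 message; the function name `messages_to_mbox` and the `*.eml` pattern are made up for the example. Apple's own per-message files may carry extra wrapper data around the message, which this sketch does not attempt to handle.

```python
import mailbox
from pathlib import Path


def messages_to_mbox(message_dir, mbox_path, pattern="*.eml"):
    """Append every matching message file to one mbox file; return the count."""
    box = mailbox.mbox(str(mbox_path))  # the file is created if it is missing
    count = 0
    try:
        for path in sorted(Path(message_dir).glob(pattern)):
            # add() writes the "From " separator line and escapes any
            # body line that itself starts with "From ".
            box.add(path.read_bytes())
            count += 1
    finally:
        box.close()  # flushes pending writes to disk
    return count
```

Something like `messages_to_mbox("exported_mail", "archive.mbox")` would then produce a file that mbox-based programs should be able to open, though it is worth checking the result on a small folder first.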
posted by alana at 5:02 PM on November 2, 2007


I think you're worrying too much. I have many gigabytes of email stored under Mail.app and there's no performance issue at all.

Mail.app gives you so much more than Thunderbird does.
posted by schwa at 5:04 PM on November 2, 2007


My understanding is that Apple does this so they can return the individual email when searching in Spotlight.

The chief downside I've seen is in the scenario where home directories are mounted via NFS. Because Mail seems to create many, many temporary files when accessing the individual email files, the Spotlight indexer hammers the NFS server when scanning the email, causing problems for all the other systems trying to use the server.

If anyone knows a way around this problem (turning off indexing for the mail directory is not an option) I would be very interested to hear it.
posted by donpardo at 5:13 PM on November 2, 2007


Whoops - the temporary file issue was causing problems for the virus scanner.

Sorry about that. Wine with Friday night's dinner.
posted by donpardo at 5:24 PM on November 2, 2007


In addition to the Spotlight thing, it saves a ton of space when doing incremental backups. If you have a 1GB mbox file, and you add or delete a single 2k message, most backup programs (including Time Machine) will have to back up the entire file. If you add or delete a single file, the backup program only has to add or remove that one small file from the backup set.
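
To put numbers on that, here is a toy Python model of a backup tool that works at whole-file granularity: any new or changed file is copied in full. The function name, the version tags, and the file names are invented for the illustration, and real backup tools may behave differently (some work at block level, for instance).

```python
def bytes_to_back_up(before, after):
    """Bytes a file-granularity incremental backup has to copy.

    `before` and `after` map file name -> (size_in_bytes, version_tag).
    A file is copied in full if it is new or its tag has changed.
    """
    return sum(
        size
        for name, (size, tag) in after.items()
        if name not in before or before[name][1] != tag
    )


MESSAGE = 2_000   # one 2k message
COUNT = 500_000   # 500,000 such messages come to about 1GB

# Case 1: everything in one mbox file, which grows by one message.
mbox_before = {"INBOX.mbox": (COUNT * MESSAGE, "v1")}
mbox_after = {"INBOX.mbox": ((COUNT + 1) * MESSAGE, "v2")}

# Case 2: one file per message, and one new file appears.
files_before = {f"{i}.msg": (MESSAGE, "v1") for i in range(COUNT)}
files_after = dict(files_before)
files_after[f"{COUNT}.msg"] = (MESSAGE, "v1")

mbox_cost = bytes_to_back_up(mbox_before, mbox_after)     # 1_000_002_000 bytes
files_cost = bytes_to_back_up(files_before, files_after)  # 2_000 bytes
```

Under these assumptions, adding one message costs about a gigabyte of backup traffic in the mbox case and two kilobytes in the per-file case.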
posted by indyz at 5:27 PM on November 2, 2007


donpardo,
The only fix I know of for the NFS issue is to not host home directories on NFS or, alternatively, to keep the Mail directory on a local drive rather than in the user's home directory (where it normally lives, as you probably know, at ~/Library/Mail).

Simply create the folder Mail somewhere on the local disk where the user will have read/write access and copy the contents of ~/Library/Mail to that location. Then delete the folder ~/Library/Mail and replace it with an alias to the new Mail folder on the local disk. This allows mds (the background task that does all the indexing) to do its thing without hitting the NFS share.

Also, if you have the disk space, you can optionally use Mobile Users under Mac OS X 10.4 and higher, but you will need to customize the login/logout settings for copying the user's data back to the server and keeping it in sync. It uses AFP though, and can be a hog on the server if you have a lot of client machines logging in/out at the same time.

/derail
posted by daq at 6:26 PM on November 2, 2007


If you have an imap account you can access it from both Thunderbird and Mail.app, and thereby have it in both mbox and Apple's proprietary format.
posted by alms at 7:26 PM on November 2, 2007


It's a bit of a stretch to even call mbox a "format" - it's just all of your emails concatenated together one after another, separated by the regular expression "^From ". Many other desktop unix mail applications have moved on to use maildir instead (which also solves the NFS locking problem). I'm not certain if this is what Apple is using, but the idea of breaking up your mailbox into individual files per email has been around for a while, and in general seems like a good one.
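
To show how little there is to it, here is a naive Python sketch of that splitting rule. The name `split_mbox` is made up, and the sketch ignores the mbox variants and their differing escaping rules; for real work Python's standard `mailbox.mbox` class is the safer choice.

```python
def split_mbox(raw):
    """Split raw mbox text into messages at lines that begin with 'From '."""
    messages, current = [], None
    for line in raw.splitlines(keepends=True):
        if line.startswith("From "):
            # A separator line closes the previous message, if any.
            if current is not None:
                messages.append("".join(current))
            current = []  # the separator itself is not kept
        elif current is not None:
            current.append(line)
    if current is not None:
        messages.append("".join(current))
    return messages
```

A body line that genuinely starts with "From " has to be escaped when the message is written (commonly as ">From "), otherwise a splitter like this one would cut the message in two.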
posted by whir at 8:07 PM on November 2, 2007


Here is a benchmark comparing the performance of mbox and maildir.

Geared toward servers, but might still be useful.
posted by ydnagaj at 8:37 PM on November 2, 2007


Apple's one-message-per-file format is different from maildir, iirc.

(Back in the day I used to use mh, which I miss ... doesn't play well with imap, though.)
posted by hattifattener at 8:44 PM on November 2, 2007


daq: That's what we've done, but it's an unsatisfactory fix, in our estimation. It complicates doing complete backups. We're using imap, so it's not critical (the imap server is also backed up). Fortunately, we only have one user who insists on using Mail.app. Everyone else is using Thunderbird or the webmail client.

Somewhat back on the subject: I'm very interested to hear what happens when you try to export hundreds of thousands of email messages from Mail.app.
posted by donpardo at 8:48 PM on November 2, 2007


Response by poster: try to export hundreds of thousands of email messages from Mail.app

Well, of more immediate concern - when I make this upgrade - is how smoothly the messages will get into the app. But I wonder if it really is necessary to put all the old mail in there (ten years' worth). I found the old Netscape 3.1 to be very useful in handling archived email. I kept my email in topic folders within larger yearly folders, and periodically moved the older yearly folders out to an archive location. On those occasions when I wanted to look at something in the archive, Netscape allowed me to temporarily mount any of the old mailboxes to browse it, by using a 'View Mailbox' command (I didn't have to 'import' the stuff). Mail.app (at least the 10.3.9 version I have) doesn't allow this; it's 'import' or nothing.

I'd like to keep just the last couple of years' mail in the mail program itself (thus keeping things uncluttered), as that is adequate for 99.x% of my mail activity, but would like the ability to easily browse the archived stuff through the same interface. At present, I have to do this searching/browsing with a text editor like BBEdit; it works, but it's clumsy.
posted by woodblock100 at 9:38 PM on November 2, 2007


You might also consider some kind of archive system that does not require you having an app open with hundreds of thousands of email messages eating up lots of memory unnecessarily.
posted by softlord at 1:35 AM on November 3, 2007


Multiple files make searching easier, and there is less chance of corrupting an entire mailbox.

It does tend to make backups take longer (backing up 500,000 small files is time-consuming).

Another difference is the time it takes to open a given mailbox... with one large mbox file, you are constrained by the size of the file, although pre-indexing can help drastically.
With individual files for each message, you are only constrained by the number of messages, rather than the cumulative size as well.
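
For what pre-indexing can look like, here is a small Python sketch that scans an mbox once to record where each message starts, then uses those offsets to read a single message without touching the rest of the file. The function names are invented for the example, and it assumes the simple rule that any line starting with "From " begins a new message.

```python
def build_offset_index(mbox_path):
    """Return the byte offset of every 'From ' separator line in the file."""
    offsets = []
    pos = 0
    with open(mbox_path, "rb") as f:
        for line in f:
            if line.startswith(b"From "):
                offsets.append(pos)
            pos += len(line)
    return offsets


def read_message(mbox_path, offsets, i):
    """Read message i (separator line included) by seeking straight to it."""
    start = offsets[i]
    with open(mbox_path, "rb") as f:
        f.seek(start)
        if i + 1 < len(offsets):
            return f.read(offsets[i + 1] - start)
        return f.read()  # last message runs to the end of the file
```

The index costs one full pass over the file to build, so it only pays off if it is saved and reused; it also has to be rebuilt or patched whenever messages are deleted from the middle of the file.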
posted by TravellingDen at 8:00 AM on November 4, 2007

