I have over 10 years of sent email sitting in a folder. Are there any tools (preferably for OS X or *nix, but anything interesting is welcome) that I can use to generate interesting statistics, or draw pretty graphs, word clouds... basically anything interesting that works on a huge number of emails.
IBM has a neat site called Many Eyes. It's easy to use, so check it out.
posted by oceanjesse at 11:50 AM on June 28, 2013

You could build a searchable corpus! You basically want MySQL (to hold all the data in useful, cross-referenceable tables), OpenRefine (to easily edit, sort and view data), a bit of awk/sed knowledge (to make necessary changes to lots of text files) and Tableau (to make pretty pictures from the SQL database queries). All of these tools are freely available, with loads of videos and how-to instructions available online.
posted by iamkimiam at 12:31 PM on June 28, 2013
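
For anyone who wants to try the corpus idea above without installing a database server, here is a minimal sketch. It is not the setup described in the comment: it swaps MySQL for SQLite (which ships with Python), and the table layout and the function names `load_messages` and `messages_per_year` are made up for illustration. It assumes you can already get your mail as parsed message objects, for example by iterating over `mailbox.mbox(path)`.

```python
import sqlite3
from email.utils import getaddresses, parsedate_to_datetime

# Hypothetical two-table layout: one row per message, one row per recipient.
SCHEMA = """
CREATE TABLE IF NOT EXISTS messages (
    id INTEGER PRIMARY KEY,
    sent_at TEXT,
    subject TEXT
);
CREATE TABLE IF NOT EXISTS recipients (
    message_id INTEGER REFERENCES messages(id),
    address TEXT
);
"""


def parse_date(value):
    """Return the Date header as an ISO 8601 string, or None if unusable."""
    if not value:
        return None
    try:
        return parsedate_to_datetime(str(value)).isoformat()
    except (TypeError, ValueError):
        # Old or malformed mail often carries dates that cannot be parsed.
        return None


def load_messages(conn, messages):
    """Insert an iterable of email.message.Message objects into the database."""
    conn.executescript(SCHEMA)
    for msg in messages:
        cur = conn.execute(
            "INSERT INTO messages (sent_at, subject) VALUES (?, ?)",
            (parse_date(msg.get("Date")), str(msg.get("Subject", ""))),
        )
        headers = msg.get_all("To", []) + msg.get_all("Cc", [])
        rows = [
            (cur.lastrowid, addr.lower())
            for _name, addr in getaddresses(headers)
            if addr
        ]
        conn.executemany(
            "INSERT INTO recipients (message_id, address) VALUES (?, ?)", rows
        )
    conn.commit()


def messages_per_year(conn):
    """Count messages by the year prefix of the stored ISO date."""
    return conn.execute(
        "SELECT substr(sent_at, 1, 4) AS year, COUNT(*) FROM messages "
        "WHERE sent_at IS NOT NULL GROUP BY year ORDER BY year"
    ).fetchall()
```

Once the tables are filled, any tool that can read a SQL query result (a spreadsheet, a plotting library, or a charting program) can draw the graphs.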

mailstat (or mailstats) is available on most unix-like systems.
There is a Perl version as well as C versions.

In general, the various incarnations will tell you average message size, mailbox size, etc.
Some more advanced versions will tell you top sender, etc.

There are so many different versions about, it's hard to link to just one.
The output is never very pretty though, for what it's worth.
posted by madajb at 12:45 PM on June 28, 2013
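
If none of the mailstat variants is handy, the same kind of numbers (message count, average message size, most frequent correspondents) can be sketched with Python's standard library. This is not how any particular mailstat works; the function names `summarize` and `report` are hypothetical, and it assumes the mail is stored in mbox format. Since the folder in question holds sent mail, it counts recipients rather than senders.

```python
import mailbox
from collections import Counter
from email.utils import getaddresses


def summarize(messages):
    """Return (message count, average size in bytes, Counter of recipients)."""
    count = 0
    total_bytes = 0
    recipients = Counter()
    for msg in messages:
        count += 1
        total_bytes += len(msg.as_bytes())
        # Collect every address in To and Cc, normalised to lower case.
        headers = msg.get_all("To", []) + msg.get_all("Cc", [])
        for _name, addr in getaddresses(headers):
            if addr:
                recipients[addr.lower()] += 1
    average = total_bytes / count if count else 0.0
    return count, average, recipients


def report(mbox_path, top=10):
    """Print a plain-text summary for the mbox file at mbox_path."""
    box = mailbox.mbox(mbox_path, create=False)
    count, average, recipients = summarize(box)
    print(f"messages: {count}")
    print(f"average size: {average:.0f} bytes")
    for address, n in recipients.most_common(top):
        print(f"{n:6d}  {address}")
```

Calling `report("/path/to/Sent.mbox")` (the path is a placeholder) prints the summary. Mail kept in Maildir format could be read with `mailbox.Maildir` instead and passed straight to `summarize`.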

Stephen Wolfram (of WolframAlpha and Mathematica) wrote a blog post about his analysis of some 23 years of email with Mathematica.
posted by Nonsteroidal Anti-Inflammatory Drug at 3:39 PM on June 28, 2013 [1 favorite]

Xobni is an Outlook add-in that gives loads of stats on email. Not that handy for a Mac, though.
posted by Admira at 4:00 PM on June 28, 2013

There's a Thunderbird add-on, tbStats (I haven't tried it, but I'm intrigued), that will compute some stats and make some graphs from your email.
posted by limeonaire at 6:07 PM on June 28, 2013

