Spam Analysis Project (personal)
I have been looking for a way to analyze the spam I get.

One of my oldest email addresses gets beaucoup spam. I decided that, as a New Year's project, I would analyze it over a period of time to see what it is about: subjects, senders, etc.
I am looking for ideas on how to do this. I use Thunderbird for this account, and between it and my hosting service the spam already gets caught pretty well.
I have checked the Thunderbird extensions and Googled a little.
Any ideas?
posted by raildr to Technology
Does this oldest email address run on/from a box with SpamAssassin installed? SA adds headers to each email it has analyzed for spam content, so maybe that's a starting point?
posted by slater at 4:44 AM on January 16, 2006
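
As a rough sketch of the header idea: SpamAssassin normally writes an X-Spam-Status header listing the score and the names of the rules that fired, and Thunderbird normally keeps each folder as an mbox file, so Python's standard mailbox module can read it. The function names and the mbox path argument here are made up for illustration, and the parsing assumes the usual "Yes, score=... tests=A,B,C" layout (older versions write hits= instead of score=).

```python
import mailbox
import re
from collections import Counter


def parse_spam_status(value):
    """Split an X-Spam-Status value into (is_spam, score, rule_names)."""
    # Long headers are folded over several lines; flatten the whitespace first.
    flat = " ".join(value.split())
    is_spam = flat.lower().startswith("yes")

    score_match = re.search(r"(?:score|hits)=(-?\d+(?:\.\d+)?)", flat)
    score = float(score_match.group(1)) if score_match else None

    # The rule list runs until the next "key=" field or the end of the header.
    tests_match = re.search(r"tests=(.*?)(?:\s+\w+=|$)", flat)
    tests = []
    if tests_match:
        tests = [t.strip() for t in tests_match.group(1).split(",") if t.strip()]
        if tests == ["none"]:
            tests = []
    return is_spam, score, tests


def tally_rules(messages):
    """Count how often each SpamAssassin rule name appears across messages."""
    rules = Counter()
    for msg in messages:
        header = msg.get("X-Spam-Status")
        if header is None:
            continue  # message was never scanned by SpamAssassin
        _, _, tests = parse_spam_status(str(header))
        rules.update(tests)
    return rules


def tally_mbox(path):
    """Tally rules for every message in an mbox file (path is hypothetical)."""
    return tally_rules(mailbox.mbox(path, create=False))
```

Pointing tally_mbox at the Junk folder file in the Thunderbird profile and printing the result's most_common(20) would give a quick ranking of which rules your spam trips most often.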

Hmm. Keyword frequency? But so many of the keywords are obfuscated.

Google throws up this paper, which suggests that de-obfuscation is a somewhat achievable goal. Once you've "washed" your incoming spam with something like this, I guess it would be much easier to categorise.
posted by Leon at 6:16 AM on January 16, 2006
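
A minimal sketch of the "wash, then count keywords" idea. This is not the method from the paper; it is just a crude look-alike character table plus a rule for joining letters that have been split up with punctuation. The table, the function names, and the minimum word length are all assumptions for illustration, and it is meant for subject lines (it will mangle email addresses, for instance).

```python
import re
from collections import Counter

# Hypothetical look-alike table: 0->o, 1->i, 3->e, 4->a, 5->s, 7->t, @->a, $->s.
LOOKALIKES = str.maketrans("013457@$", "oieastas")

# Runs of single characters split by punctuation, e.g. "v.i.a.g.r.a" or "v*i*a*g*r*a".
SPACED = re.compile(r"\b(?:\w[.\-*_]){2,}\w\b")


def wash(text):
    """Return a lowercased, roughly de-obfuscated copy of text."""
    text = text.lower()
    # Join letters that were separated by dots, dashes, stars or underscores.
    text = SPACED.sub(lambda m: re.sub(r"[.\-*_]", "", m.group(0)), text)
    words = []
    for token in text.split():
        # Only touch tokens that contain a letter, so plain numbers survive.
        if re.search(r"[a-z]", token):
            token = token.translate(LOOKALIKES)
        words.append(token)
    return " ".join(words)


def keyword_counts(subjects, min_len=4):
    """Count washed words of at least min_len letters across subject lines."""
    pattern = re.compile(r"[a-z]{%d,}" % min_len)
    counts = Counter()
    for subject in subjects:
        counts.update(pattern.findall(wash(subject)))
    return counts
```

Feeding it the Subject headers from a junk folder and looking at most_common() would show which words dominate once the obvious disguises are stripped; anything cleverer than simple substitutions will slip through.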

SpamBayes is written in Python. I don't even know Python, but I was able to hack SpamBayes to do what I wanted (I forget what that was) with ease.
posted by orthogonality at 8:24 AM on January 16, 2006

