<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:admin="http://webns.net/mvcb/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
	<channel> 

	<title>Comments on: Spamassassin has failed me; what do I do?</title>
	<link>http://ask.metafilter.com/50586/Spamassassin-has-failed-me-what-do-I-do/</link>
	<description>Comments on Ask MetaFilter post Spamassassin has failed me; what do I do?</description>
	<pubDate>Fri, 10 Nov 2006 01:12:33 -0800</pubDate>
	<lastBuildDate>Fri, 10 Nov 2006 01:12:33 -0800</lastBuildDate>
	<language>en-us</language>
	<docs>http://blogs.law.harvard.edu/tech/rss</docs>
	<ttl>60</ttl>

	<item>
		<title>Question: Spamassassin has failed me; what do I do?</title>
		<link>http://ask.metafilter.com/50586/Spamassassin-has-failed-me-what-do-I-do</link>	
		<description>What&apos;s the state of the art in installable spam filters for Unix mail hosts? Spamassassin has failed me. &lt;br /&gt;&lt;br /&gt; &lt;a href=&quot;http://ask.metafilter.com/mefi/28684&quot;&gt;A year ago&lt;/a&gt; you guys convinced me not to use a challenge-response system for spam filtering. In the meantime &lt;a href=&quot;http://www.somebits.com/weblog/tech/bad/spamOverload.html&quot;&gt;my spamassassin setup has failed&lt;/a&gt; to the point that 80% of the mail that makes it through my filter is still spam. It&apos;s intolerable. What can I do?&lt;br&gt;
&lt;br&gt;
I&apos;m running postfix and dovecot on a Debian Linux box. I&apos;ve got spamassassin set up with bayesian filtering, razor, pyzor, and am running sa-update regularly. 85% of my incoming mail is filtered as spam immediately, but 80% of the remainder is still spam. I&apos;m a software engineer and capable of doing all sorts of hacks, but I&apos;m just looking for something simple that I can just install and be done with it.</description>
		<guid isPermaLink="false">post:ask.metafilter.com,2006:site.50586</guid>
		<pubDate>Fri, 10 Nov 2006 01:04:56 -0800</pubDate>
		<dc:creator>Nelson</dc:creator>
		
			<category>spam</category>
		
			<category>spamassassin</category>
		
			<category>email</category>
		
			<category>smtp</category>
		
			<category>bayesian</category>
		
	</item> <item>
		<title>By: mattdini</title>
		<link>http://ask.metafilter.com/50586/Spamassassin-has-failed-me-what-do-I-do#766611</link>	
		<description>On the front page of digg right now is:&lt;br&gt;
&lt;br&gt;
Enhance Your Mail Server With ASSP (Anti-Spam SMTP Proxy)&lt;br&gt;
http://www.howtoforge.com/antispam_smtp_proxy&lt;br&gt;
&lt;br&gt;
Something to look at!</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.50586-766611</guid>
		<pubDate>Fri, 10 Nov 2006 01:12:33 -0800</pubDate>
		<dc:creator>mattdini</dc:creator>
	</item><item>
		<title>By: aubilenon</title>
		<link>http://ask.metafilter.com/50586/Spamassassin-has-failed-me-what-do-I-do#766615</link>	
		<description>Greylisting works pretty well for me</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.50586-766615</guid>
		<pubDate>Fri, 10 Nov 2006 01:34:48 -0800</pubDate>
		<dc:creator>aubilenon</dc:creator>
	</item><item>
		<title>By: aye</title>
		<link>http://ask.metafilter.com/50586/Spamassassin-has-failed-me-what-do-I-do#766616</link>	
		<description>Try DNS blocklists through your mail server. I use dsn.rfc-ignorant.org : dul.dnsbl.sorbs.net : list.dsbl.org : sbl-xbl.spamhaus.org</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.50586-766616</guid>
		<pubDate>Fri, 10 Nov 2006 01:35:01 -0800</pubDate>
		<dc:creator>aye</dc:creator>
	</item><item>
		<title>By: mock</title>
		<link>http://ask.metafilter.com/50586/Spamassassin-has-failed-me-what-do-I-do#766620</link>	
		<description>You might want to try a free trial of &lt;a href=&quot;http://mailchannels.com&quot;&gt;MailChannels&apos;&lt;/a&gt; Traffic Control product.  It does traffic shaping for SMTP, which should reduce the amount of spam you see, as well as reducing the load on your content filter so you can tune it to be more aggressive.&lt;br&gt;
&lt;br&gt;
(Disclosure, I sit on the board of MailChannels and came up with some of the technology involved.  Obviously I think it&apos;s awesome but you should really find out for yourself)</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.50586-766620</guid>
		<pubDate>Fri, 10 Nov 2006 01:40:31 -0800</pubDate>
		<dc:creator>mock</dc:creator>
	</item><item>
		<title>By: donut</title>
		<link>http://ask.metafilter.com/50586/Spamassassin-has-failed-me-what-do-I-do#766621</link>	
		<description>And you&apos;re always using sa-learn to learn from spam that passes through your setup? You can also learn from non-spam (option --ham).</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.50586-766621</guid>
		<pubDate>Fri, 10 Nov 2006 01:45:03 -0800</pubDate>
		<dc:creator>donut</dc:creator>
	</item><item>
		<title>By: paulsc</title>
		<link>http://ask.metafilter.com/50586/Spamassassin-has-failed-me-what-do-I-do#766633</link>	
		<description>&lt;em&gt;&quot;... I&apos;m just looking for something simple that I can just install and be done with it.&quot;&lt;/em&gt;&lt;br&gt;
&lt;br&gt;
Sorry. The spammers beat SMTP long ago. Running a mail server is work, not fun, and has been for some time.&lt;br&gt;
&lt;br&gt;
The more tests you add, the more work your mail server is doing, and the more ways things can break. But if you&apos;re pissed enough about spam, here are some things you could be doing, beyond what you&apos;ve said you&apos;re doing. And it&apos;s important to review your filter implementations, informed by your spam statistics, so that you are throwing out spam in the most efficient way.&lt;br&gt;
&lt;br&gt;
1) Do &lt;a href=&quot;http://en.wikipedia.org/wiki/Reverse_DNS_lookup&quot;&gt;reverse DNS lookups&lt;/a&gt;, and add failures to your bayesian filtering. Failure on a reverse lookup doesn&apos;t drop a connection by itself, but it &quot;pre-weights&quot; the bayesian score for spam.&lt;br&gt;
&lt;br&gt;
2) You can implement sender verification schemes such as &lt;a href=&quot;http://experts.about.com/e/s/se/sender_policy_framework.htm&quot;&gt;SPF&lt;/a&gt;, and &lt;a href=&quot;http://experts.about.com/e/d/do/domainkeys.htm&quot;&gt;DomainKeys&lt;/a&gt;.&lt;br&gt;
&lt;br&gt;
3) &lt;a href=&quot;http://encyclopedia.kids.net.au/page/sp/Spamming#Tarpits_and_Honeypots&quot;&gt;Teergrube&lt;/a&gt;. High volume spammers and botnet operators will eventually mark you as a teergrube, and skip you. Frankly, I find teergrube is easier to set up and control on Exim than Postfix, but then again, I don&apos;t spend a lot of time with Postfix.&lt;br&gt;
&lt;br&gt;
4) I see some real benefit in ASSP for medium to large mail systems, but for operators of small single-server systems, the effort and expense may not pay off compared with maintaining a greylist system in SpamAssassin.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.50586-766633</guid>
		<pubDate>Fri, 10 Nov 2006 02:16:31 -0800</pubDate>
		<dc:creator>paulsc</dc:creator>
	</item><item>
		<title>By: Nelson</title>
		<link>http://ask.metafilter.com/50586/Spamassassin-has-failed-me-what-do-I-do#766647</link>	
		<description>Thanks for the advice so far. I am running a single-user mail system so the effort is a significant nuisance. The mail host is in no way overloaded, so I don&apos;t mind doing more processing. Of the suggestions here the one that seems the most hopeful + new for me so far is greylisting. I found &lt;a href=&quot;http://www.debian-administration.org/articles/168&quot;&gt;this article on Debian, greylisting, and postfix&lt;/a&gt; that looks like a place to start.&lt;br&gt;
&lt;br&gt;
Please keep the suggestions coming!</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.50586-766647</guid>
		<pubDate>Fri, 10 Nov 2006 02:57:49 -0800</pubDate>
		<dc:creator>Nelson</dc:creator>
	</item><item>
		<title>By: eriko</title>
		<link>http://ask.metafilter.com/50586/Spamassassin-has-failed-me-what-do-I-do#766693</link>	
		<description>&lt;i&gt;You can also learn from non-spam (option --ham).&lt;/i&gt;&lt;br&gt;
&lt;br&gt;
No, you *MUST* also learn non-spam email for the Bayesian filter to work. &lt;br&gt;
&lt;br&gt;
Quick test: run &quot;sa-learn --dump magic&quot; and look at these lines in particular.&lt;br&gt;
&lt;br&gt;
0.000          0     106378          0  non-token data: nspam&lt;br&gt;
0.000          0      10531          0  non-token data: nham&lt;br&gt;
0.000          0     498061          0  non-token data: ntokens&lt;br&gt;
&lt;br&gt;
The first two are &quot;number of spam messages that provided tokens&quot; and &quot;number of ham messages that provided tokens.&quot; Both numbers *must* be at least 200, or the Bayesian filter shuts down.&lt;br&gt;
&lt;br&gt;
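A minimal Python sketch of that check, not part of SpamAssassin itself. The helper names parse_magic and bayes_has_enough_training are made up for illustration, the sample text is just the three lines shown above, and counting exactly 200 messages as enough is an assumption.&lt;br&gt;
&lt;br&gt;
```python
import re

# The 200-message minimum is the figure given in the comment above;
# treating exactly 200 as enough is an assumption of this sketch.
MIN_MESSAGES = 200

# Matches lines such as "0.000  0  106378  0  non-token data: nspam".
# Entries whose label contains spaces are deliberately skipped.
MAGIC_LINE = re.compile(
    r"^\s*\S+\s+\d+\s+(\d+)\s+\d+\s+non-token data: (\w+)\s*$"
)

# The three lines quoted above, as plain text (hypothetical captured output).
SAMPLE = """\
0.000          0     106378          0  non-token data: nspam
0.000          0      10531          0  non-token data: nham
0.000          0     498061          0  non-token data: ntokens
"""


def parse_magic(dump_text):
    """Return a dict of counters, e.g. {"nspam": 106378, "nham": 10531}."""
    counts = {}
    for line in dump_text.splitlines():
        match = MAGIC_LINE.match(line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


def bayes_has_enough_training(counts, minimum=MIN_MESSAGES):
    """True when both the spam and ham message counts reach the minimum."""
    smallest = min(counts.get("nspam", 0), counts.get("nham", 0))
    # range(minimum) covers 0 .. minimum-1, i.e. every count that is too low.
    return smallest not in range(minimum)


counts = parse_magic(SAMPLE)
print(counts)
print("Bayes trained enough:", bayes_has_enough_training(counts))
```
&lt;br&gt;
To try it against a real database, the output of the command above could be saved to a file and passed to parse_magic in place of SAMPLE.&lt;br&gt;
&lt;br&gt;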
Another check is in the headers -- there should be a BAYES_XX (where XX is a number) in the X-Spam-Status: header. If not, Bayesian didn&apos;t run.&lt;br&gt;
&lt;br&gt;
Bayesian makes all the difference on my setup between useful and useless.&lt;br&gt;
&lt;br&gt;
Another issue -- if you get lots and lots of spam, and not that much non-spam, you&apos;ll find that the default token expire count is too low -- you end up expiring out your ham tokens almost as fast as you save them. The fix is in ~/.spamassassin/user_prefs; change or add this:&lt;br&gt;
&lt;br&gt;
bayes_expiry_max_db_size 500000&lt;br&gt;
&lt;br&gt;
The number is in tokens, so I&apos;m telling SA to expire old tokens only when there are more than 500,000 of them. Note how I&apos;m showing 498K tokens, but only ~117K messages with tokens -- 200 ham and spam tokens isn&apos;t enough, you need 200 ham *messages*, each providing at least one token, and 200 spam, ditto, for the filter to kick in.&lt;br&gt;
&lt;br&gt;
Finally, I use rbldnsd to block several countries. The rule I have is if I get 100,000 spam, and zero real email, I no longer accept email from that country.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.50586-766693</guid>
		<pubDate>Fri, 10 Nov 2006 05:29:26 -0800</pubDate>
		<dc:creator>eriko</dc:creator>
	</item><item>
		<title>By: SpecialK</title>
		<link>http://ask.metafilter.com/50586/Spamassassin-has-failed-me-what-do-I-do#766696</link>	
		<description>What eriko said. Every time I&apos;ve, as a contract sysadmin, taken a look at an installation of a &apos;broken&apos; spamassassin, it&apos;s broken because they&apos;re not training the bayesian filter.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.50586-766696</guid>
		<pubDate>Fri, 10 Nov 2006 05:39:09 -0800</pubDate>
		<dc:creator>SpecialK</dc:creator>
	</item><item>
		<title>By: fcain</title>
		<link>http://ask.metafilter.com/50586/Spamassassin-has-failed-me-what-do-I-do#766730</link>	
		<description>What about using Google Gmail for domains? Just run all your mail through them, and give all your users Gmail accounts. Let them handle the problem.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.50586-766730</guid>
		<pubDate>Fri, 10 Nov 2006 06:38:47 -0800</pubDate>
		<dc:creator>fcain</dc:creator>
	</item><item>
		<title>By: jacobian</title>
		<link>http://ask.metafilter.com/50586/Spamassassin-has-failed-me-what-do-I-do#766763</link>	
		<description>I&apos;ve been extremely happy with &lt;a href=&quot;http://crm114.sourceforge.net/&quot;&gt;CRM114&lt;/a&gt;, which is essentially a Bayesian analyzer on steroids.  After about a week of training, it&apos;s scarily accurate; I&apos;d estimate only about 1 spam in 1000 slips through, and I can&apos;t remember the last false positive I had.&lt;br&gt;
&lt;br&gt;
Silly name, yes, but give it a shot.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.50586-766763</guid>
		<pubDate>Fri, 10 Nov 2006 07:12:27 -0800</pubDate>
		<dc:creator>jacobian</dc:creator>
	</item><item>
		<title>By: finn</title>
		<link>http://ask.metafilter.com/50586/Spamassassin-has-failed-me-what-do-I-do#766804</link>	
		<description>&lt;i&gt;Of the suggestions here the one that seems the most hopeful + new for me so far is greylisting.&lt;/i&gt;&lt;br&gt;
&lt;br&gt;
One technique I read about recently was to use greylisting, but to exempt from the greylisting process any messages that can be verified via SPF. Major mail providers like Gmail and Yahoo use SPF and thus would avoid the delay in delivery that sometimes occurs with greylisting. You might also want to look at &lt;a href=&quot;http://smtpd.develooper.com/&quot;&gt;qpsmtpd&lt;/a&gt;, an SMTP daemon that can add an additional level of message filtering configurable via plugins written in Perl.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.50586-766804</guid>
		<pubDate>Fri, 10 Nov 2006 07:49:31 -0800</pubDate>
		<dc:creator>finn</dc:creator>
	</item><item>
		<title>By: purephase</title>
		<link>http://ask.metafilter.com/50586/Spamassassin-has-failed-me-what-do-I-do#766806</link>	
		<description>paulsc &lt;a href=&quot;http://ask.metafilter.com/mefi/50586#766633&quot;&gt;says&lt;/a&gt;: &lt;em&gt; You can implement sender verification schemes such as SPF, and DomainKeys.&lt;/em&gt;&lt;br&gt;
&lt;br&gt;
It&apos;s my understanding that these technologies will not necessarily reduce inbound SPAM but will prevent other hosts from munging the mail header to spoof your domain (provided the receiver actively checks for an SPF record). So, if more and more hosts implemented (and checked for) SPF records, it would be less likely that legitimate mail from your mail server would be labeled as SPAM, and mail from anyone else spoofing your domain would be dropped on the receiving host.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.50586-766806</guid>
		<pubDate>Fri, 10 Nov 2006 07:51:11 -0800</pubDate>
		<dc:creator>purephase</dc:creator>
	</item><item>
		<title>By: nicwolff</title>
		<link>http://ask.metafilter.com/50586/Spamassassin-has-failed-me-what-do-I-do#766969</link>	
		<description>Or, you could ignore the haters and use &lt;a href=&quot;http://angel.net/~nic/spam-x/&quot;&gt;my simple procmail C/R script&lt;/a&gt; which works quite well.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.50586-766969</guid>
		<pubDate>Fri, 10 Nov 2006 09:54:15 -0800</pubDate>
		<dc:creator>nicwolff</dc:creator>
	</item><item>
		<title>By: paulsc</title>
		<link>http://ask.metafilter.com/50586/Spamassassin-has-failed-me-what-do-I-do#766992</link>	
		<description>&lt;em&gt;&quot;It&apos;s my understanding that these technologies will not necessarily reduce inbound SPAM...&quot;&lt;/em&gt;&lt;br&gt;
posted by purephase at 10:51 AM EST on November 10&lt;br&gt;
&lt;br&gt;
purephase, I think your understanding above was originally right, in terms of the ambitions for SPF, but many people are now using SPF lookups as a test to combat botnets. Botnets may be sending messages with artfully spoofed headers, but if a botnet IP doesn&apos;t match the SPF records for the domain they say they are from, or there is no SPF record for that domain, the connection may be dropped into teergrube, or the message flagged for additional filter steps. Gmail, AOL, and Earthlink are pretty good now about shutting down spammers from their internal networks, mostly by internal throttles and list filtering, and while you may still get a lot of nuisance mail from addresses in those domains in aggregate, it&apos;s small scale compared to the botnets spoofing them. And that is what is making the SPF idea helpful, as you cover in your second sentence. ...:-)&lt;br&gt;
&lt;br&gt;
And make no mistake, &lt;a href=&quot;http://www.eweek.com/article2/0,1895,2029720,00.asp&quot;&gt;it&apos;s botnets that have caused the explosion in spam noted in the last 4 to 6 months&lt;/a&gt;.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.50586-766992</guid>
		<pubDate>Fri, 10 Nov 2006 10:10:52 -0800</pubDate>
		<dc:creator>paulsc</dc:creator>
	</item><item>
		<title>By: odinsdream</title>
		<link>http://ask.metafilter.com/50586/Spamassassin-has-failed-me-what-do-I-do#767215</link>	
		<description>nicwolff: That&apos;s a wonderful solution! It&apos;s exactly what I&apos;ve been looking for, and when I get a free weekend, I&apos;ll definitely be trying it out. Thanks for your efforts.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.50586-767215</guid>
		<pubDate>Fri, 10 Nov 2006 13:16:52 -0800</pubDate>
		<dc:creator>odinsdream</dc:creator>
	</item><item>
		<title>By: autojack</title>
		<link>http://ask.metafilter.com/50586/Spamassassin-has-failed-me-what-do-I-do#767289</link>	
		<description>Are you keeping your spamassassin up to date by running sa-update? I run it weekly. You must be running version 3 or better.&lt;br&gt;
&lt;br&gt;
Are you using any of the third-party rules that others have written? Those helped me immensely. Get them from &lt;a href=&quot;http://www.rulesemporium.com/&quot;&gt;here.&lt;/a&gt; I&apos;m running:&lt;br&gt;
&lt;br&gt;
-rw-r--r--  1 root root   3839 Jun  1  2005 70_sare_bayes_poison_nxm.cf&lt;br&gt;
-rw-r--r--  1 root root  24298 Oct  5  2005 70_sare_evilnum0.cf&lt;br&gt;
-rw-r--r--  1 root root 187643 Dec 26  2005 70_sare_genlsubj.cf&lt;br&gt;
-rw-r--r--  1 root root 384645 Oct 30  2005 70_sare_header.cf&lt;br&gt;
-rw-r--r--  1 root root  28066 Jun  3 22:00 70_sare_html0.cf&lt;br&gt;
-rw-r--r--  1 root root  39625 Jun  3 22:00 70_sare_html1.cf&lt;br&gt;
-rw-r--r--  1 root root     66 Feb 14  2005 70_sare_html1.cf.sig&lt;br&gt;
-rw-r--r--  1 root root 158513 Oct  1  2005 70_sare_obfu.cf&lt;br&gt;
-rw-r--r--  1 root root  18190 Dec 12  2005 70_sare_random.cf&lt;br&gt;
-rw-r--r--  1 root root  97820 May 27 20:00 70_sare_specific.cf&lt;br&gt;
-rw-r--r--  1 root root  59515 Oct 18 13:00 70_sare_stocks.cf&lt;br&gt;
-rw-r--r--  1 root root  15481 May 15 20:00 72_sare_redirect_post3.0.0.cf&lt;br&gt;
-rw-r--r--  1 root root  57580 Feb 14  2005 99_FVGT_Tripwire.cf&lt;br&gt;
-rw-r--r--  1 root root  14284 Feb 14  2005 antidrug.cf&lt;br&gt;
-rw-r--r--  1 root root  22546 Feb 14  2005 backhair.cf&lt;br&gt;
-rw-r--r--  1 root root  23422 Feb 14  2005 chickenpox.cf&lt;br&gt;
-rw-r--r--  1 root root   4883 Feb 14  2005 random.current.cf&lt;br&gt;
-rw-r--r--  1 root root  56238 Jun  1  2005 tripwire.cf&lt;br&gt;
-rw-r--r--  1 root root   3880 Feb 14  2005 weeds.cf&lt;br&gt;
&lt;br&gt;
sare_stocks in particular is pretty good at killing those damn stock market spams, which still get through greylisting for me. All those rules get updated weekly with rules_du_jour.&lt;br&gt;
&lt;br&gt;
You can&apos;t just expect to install something and forget it. Whether you use bayes, SA, or something else, you&apos;ve got to keep pace with the spammers or eventually their state of the art will supersede yours. Get sa-update and rules_du_jour running weekly in cron, set up greylisting, and you should see a marked improvement. I&apos;m not even using Bayes with my SA setup and I do pretty well. I&apos;m using Debian and Postfix, same as you.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.50586-767289</guid>
		<pubDate>Fri, 10 Nov 2006 14:24:44 -0800</pubDate>
		<dc:creator>autojack</dc:creator>
	</item><item>
		<title>By: autojack</title>
		<link>http://ask.metafilter.com/50586/Spamassassin-has-failed-me-what-do-I-do#767312</link>	
		<description>Oh, and I use Postgrey for my greylisting.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.50586-767312</guid>
		<pubDate>Fri, 10 Nov 2006 14:45:33 -0800</pubDate>
		<dc:creator>autojack</dc:creator>
	</item>
	</channel>
</rss>
