<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:admin="http://webns.net/mvcb/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
	<channel> 

	<title>Comments on: Need help squashing foreign spam</title>
	<link>http://ask.metafilter.com/46962/Need-help-squashing-foreign-spam/</link>
	<description>Comments on Ask MetaFilter post Need help squashing foreign spam</description>
	<pubDate>Thu, 21 Sep 2006 08:19:06 -0800</pubDate>
	<lastBuildDate>Thu, 21 Sep 2006 08:19:06 -0800</lastBuildDate>
	<language>en-us</language>
	<docs>http://blogs.law.harvard.edu/tech/rss</docs>
	<ttl>60</ttl>

	<item>
		<title>Question: Need help squashing foreign spam</title>
		<link>http://ask.metafilter.com/46962/Need-help-squashing-foreign-spam</link>	
		<description>I get tons of spam in languages other than English -- entire messages in other languages that I can&apos;t even read to determine what the spam is about. I&apos;d like to make a simple set of email filters to whack out any email in Russian, Chinese, Japanese, and Korean. Can someone give me a single character from each language that represents the most commonly used letter? (like the letter &quot;a&quot; in English) &lt;br /&gt;&lt;br /&gt; People I know never seem to send me stuff with other languages in it, so even if I go with the sledgehammer of looking for single popular characters, I don&apos;t think I&apos;ll get too many false positives.&lt;br&gt;
&lt;br&gt;
Or should I try using language message headers instead? Do spammers stick to language-specific character sets?</description>
		<guid isPermaLink="false">post:ask.metafilter.com,2006:site.46962</guid>
		<pubDate>Thu, 21 Sep 2006 08:12:41 -0800</pubDate>
		<dc:creator>mathowie</dc:creator>
		
			<category>spam</category>
		
			<category>language</category>
		
	</item> <item>
		<title>By: krautland</title>
		<link>http://ask.metafilter.com/46962/Need-help-squashing-foreign-spam#715368</link>	
		<description>Throw a sample paragraph in here or into a translation site such as &lt;a href=&quot;http://babelfish.altavista.com/&quot;&gt;babelfish.&lt;/a&gt;&lt;br&gt;
&lt;br&gt;
I&apos;m getting tons of Japanese spam myself.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.46962-715368</guid>
		<pubDate>Thu, 21 Sep 2006 08:19:06 -0800</pubDate>
		<dc:creator>krautland</dc:creator>
	</item><item>
		<title>By: kindall</title>
		<link>http://ask.metafilter.com/46962/Need-help-squashing-foreign-spam#715378</link>	
		<description>I&apos;d block using the content-type header. If you have control over your mail server, you can also just use country-specific blacklists and block entire countries. This is the approach I use; the bounce message includes instructions for getting whitelisted in case I unintentionally bounce any human correspondents. In the last year or so, exactly zero senders in Asia have added themselves to my whitelist.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.46962-715378</guid>
		<pubDate>Thu, 21 Sep 2006 08:27:59 -0800</pubDate>
		<dc:creator>kindall</dc:creator>
	</item><item>
		<title>By: sonofslim</title>
		<link>http://ask.metafilter.com/46962/Need-help-squashing-foreign-spam#715379</link>	
		<description>I thought the letter &apos;e&apos; was the most commonly used letter in English?</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.46962-715379</guid>
		<pubDate>Thu, 21 Sep 2006 08:28:56 -0800</pubDate>
		<dc:creator>sonofslim</dc:creator>
	</item><item>
		<title>By: majick</title>
		<link>http://ask.metafilter.com/46962/Need-help-squashing-foreign-spam#715388</link>	
		<description>It&apos;s perhaps a bit of a big hammer to apply, but I&apos;ve found that filtering messages having any non-ASCII character (or any Base64 escaped character) in the Subject: header pretty much clobbers all of the non-ASCII spam.&lt;br&gt;
&lt;br&gt;
I used to do this with a regular expression in Procmail, but SpamAssassin has long supported language filtering, so I no longer have a convenient filter expression to paste for you.  Sorry.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.46962-715388</guid>
		<pubDate>Thu, 21 Sep 2006 08:34:08 -0800</pubDate>
		<dc:creator>majick</dc:creator>
	</item><item>
		<title>By: majick</title>
		<link>http://ask.metafilter.com/46962/Need-help-squashing-foreign-spam#715393</link>	
		<description>(Or any MIME-escaped character, for that matter: /=\d\d/ appearing in the Subject: header matches a certain chunk of my non-ASCII spam.)</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.46962-715393</guid>
		<pubDate>Thu, 21 Sep 2006 08:37:13 -0800</pubDate>
		<dc:creator>majick</dc:creator>
	</item><item>
		<title>By: Eater</title>
		<link>http://ask.metafilter.com/46962/Need-help-squashing-foreign-spam#715409</link>	
		<description>&lt;a href=&quot;http://web.archive.org/web/20010507012940/http://www3.sympatico.ca/walter.dnes/email/chinese/index.html&quot;&gt;Here&lt;/a&gt; is an archived version of a now-gone page that addresses this issue for Procmail users.&lt;br&gt;
&lt;br&gt;
Here&apos;s a simpler filter recipe based on it, which I found somewhere long ago:&lt;br&gt;
&lt;code&gt;&lt;br&gt;
:0 BD&lt;br&gt;
* -1^1 .&lt;br&gt;
*  2^1 =[0-9A-F][0-9A-F]&lt;br&gt;
* 33^1 [&#161;&#162;&#163;&#164;&#165;&#166;&#167;&#168;&#169;&#170;&#171;&#172;&#173;&#174;&#175;&#176;&#177;&#178;&#179;&#180;&#181;&#182;&#183;&#184;&#185;&#186;&#187;&#188;&#189;&#190;&#191;]&lt;br&gt;
* 33^1 [&#192;&#193;&#194;&#195;&#196;&#197;&#198;&#199;&#200;&#201;&#202;&#203;&#204;&#205;&#206;&#207;&#208;&#209;&#210;&#211;&#212;&#213;&#214;&#215;&#216;&#217;&#218;&#219;&#220;&#221;&#222;&#223;]&lt;br&gt;
* 33^1 [&#224;&#225;&#226;&#227;&#228;&#229;&#230;&#231;&#232;&#233;&#234;&#235;&#236;&#237;&#238;&#239;&#240;&#241;&#242;&#243;&#244;&#245;&#246;&#247;&#248;&#249;&#250;&#251;&#252;&#253;&#254;&#255;]&lt;br&gt;
* 33^1 =[A-F][0-9A-F]&lt;br&gt;
| formail -i &quot;Subject:[Asian character set]&quot;&lt;br&gt;
&lt;/code&gt;&lt;br&gt;
(not sure how well that&apos;ll appear in y&apos;all&apos;s browsers)</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.46962-715409</guid>
		<pubDate>Thu, 21 Sep 2006 08:48:11 -0800</pubDate>
		<dc:creator>Eater</dc:creator>
	</item><item>
		<title>By: Loto</title>
		<link>http://ask.metafilter.com/46962/Need-help-squashing-foreign-spam#715416</link>	
		<description>SpamAssassin has an option like this.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.46962-715416</guid>
		<pubDate>Thu, 21 Sep 2006 08:51:29 -0800</pubDate>
		<dc:creator>Loto</dc:creator>
	</item><item>
		<title>By: krautland</title>
		<link>http://ask.metafilter.com/46962/Need-help-squashing-foreign-spam#715418</link>	
		<description>Eater, it looks here as if you are blocking a bunch of German and Danish characters as well...</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.46962-715418</guid>
		<pubDate>Thu, 21 Sep 2006 08:52:27 -0800</pubDate>
		<dc:creator>krautland</dc:creator>
	</item><item>
		<title>By: jdroth</title>
		<link>http://ask.metafilter.com/46962/Need-help-squashing-foreign-spam#715420</link>	
		<description>I&apos;ve always wondered why ISPs can&apos;t handle this. My mailserver is hosted by Dreamhost. I want an option in their control panel that will allow me to block all e-mail with non-Western character sets, and it frustrates me that none exists. It seems that this simple act (along with blocking of image-only e-mails) would cut 50% of the spam I receive.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.46962-715420</guid>
		<pubDate>Thu, 21 Sep 2006 08:56:05 -0800</pubDate>
		<dc:creator>jdroth</dc:creator>
	</item><item>
		<title>By: jellicle</title>
		<link>http://ask.metafilter.com/46962/Need-help-squashing-foreign-spam#715421</link>	
		<description>&lt;a href=&quot;http://ask.metafilter.com/mefi/37418&quot;&gt;Previously asked&lt;/a&gt;.  :)&lt;br&gt;
&lt;br&gt;
I still think my answer is good: look for gb2312, koi8-r, big5, and so on in the Content-Type header of your email.  This works extremely well, is easy, and has zero false positives.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.46962-715421</guid>
		<pubDate>Thu, 21 Sep 2006 08:59:01 -0800</pubDate>
		<dc:creator>jellicle</dc:creator>
	</item><item>
		<title>By: Eater</title>
		<link>http://ask.metafilter.com/46962/Need-help-squashing-foreign-spam#715426</link>	
		<description>Yeah, the pasting of the recipe above misrendered the high-bit characters.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.46962-715426</guid>
		<pubDate>Thu, 21 Sep 2006 09:03:58 -0800</pubDate>
		<dc:creator>Eater</dc:creator>
	</item><item>
		<title>By: blindcarboncopy</title>
		<link>http://ask.metafilter.com/46962/Need-help-squashing-foreign-spam#715460</link>	
		<description>Just use Outlook 2003 or higher :) It lets you allow/block messages by language.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.46962-715460</guid>
		<pubDate>Thu, 21 Sep 2006 09:24:22 -0800</pubDate>
		<dc:creator>blindcarboncopy</dc:creator>
	</item><item>
		<title>By: mathowie</title>
		<link>http://ask.metafilter.com/46962/Need-help-squashing-foreign-spam#715513</link>	
		<description>jellicle, this is for Gmail, so I think I can still do that by searching the header for those phrases.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.46962-715513</guid>
		<pubDate>Thu, 21 Sep 2006 09:59:40 -0800</pubDate>
		<dc:creator>mathowie</dc:creator>
	</item><item>
		<title>By: stavrosthewonderchicken</title>
		<link>http://ask.metafilter.com/46962/Need-help-squashing-foreign-spam#716169</link>	
		<description>This is the nearest phonetic equivalent to &apos;a&apos; in Korean: &#12623;</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.46962-716169</guid>
		<pubDate>Thu, 21 Sep 2006 16:54:19 -0800</pubDate>
		<dc:creator>stavrosthewonderchicken</dc:creator>
	</item><item>
		<title>By: jessicapierce</title>
		<link>http://ask.metafilter.com/46962/Need-help-squashing-foreign-spam#716857</link>	
		<description>I use Gmail, have the same problem, and have been able to block most non-English spam by creating single-character filters. I did this by scanning the text of the spam and trying to find characters which appeared more than once. Sometimes this worked and sometimes it didn&apos;t, but when I couldn&apos;t find a duplicate, I chose a random character &amp;amp; created a filter for that. This isn&apos;t a perfect system and doesn&apos;t always work on the first try, but it definitely helped cut down on spam, right away.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.46962-716857</guid>
		<pubDate>Fri, 22 Sep 2006 08:54:25 -0800</pubDate>
		<dc:creator>jessicapierce</dc:creator>
	</item>
	</channel>
</rss>
