

Mystery spam?
January 12, 2010 5:59 AM

Can someone tell me the purpose of this webpage?

It appears to be a list of random place names from Canada, but it includes my obscure and very small home community. I found it through a Google Alert. I assume it was created for a reason. Is it spam? What's up with the root page? Why would anyone waste time creating this jumble of words?
posted by Brodiggitty to Computers & Internet (6 answers total)
 
Looks like a primitive wiki to me.
posted by devnull at 6:00 AM on January 12, 2010


It is a spammer's tool. It just queries a database to produce a random assortment of words culled from various online sources. All a spammer has to do is copy and paste the output into their spam message to get past the email filters.

You can change the "nll-cornwallis" to various things and get different random assortments of words.
posted by JJ86 at 6:10 AM on January 12, 2010 [1 favorite]


Thanks. I've seen other similar pages so I knew there had to be a purpose.
posted by Brodiggitty at 6:15 AM on January 12, 2010


It also doesn't work against Bayesian filters.
posted by Chocolate Pickle at 8:17 AM on January 12, 2010


Chocolate Pickle: "It also doesn't work against Bayesian filters."

Actually, it is intended precisely to fool or poison Bayesian filters. The closer spam is to pure entropy, the harder it is to identify effectively and positively. This page is a tool for generating entropy for spam usage. If you use text like this to train your ruleset, you reduce its ability to test for actual spam content. If you don't, the spam is still less likely to match your ruleset because of the increased entropy. (And no, you can't just use the entropy level as a filtering rule: you would get false positives for a source code diff, fragments of Perl code, or rot13 text, for example.)
posted by idiopath at 9:49 AM on January 12, 2010
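
To make the "entropy level as a filtering rule" point concrete, here is a minimal sketch of one possible entropy score: Shannon entropy over whitespace-separated words. The function name word_entropy and both sample strings are made up for illustration and are not taken from the page in question. A score like this only measures how evenly the words are spread, not what they are, so a run of unique place names and a run of unique code tokens can come out identical.

```python
from collections import Counter
from math import log2


def word_entropy(text):
    """Shannon entropy, in bits per word, of the whitespace-separated words in text."""
    words = text.split()
    if not words:
        return 0.0
    counts = Counter(words)
    total = len(words)
    # Sum -p * log2(p) over the relative frequency p of each distinct word.
    return -sum((n / total) * log2(n / total) for n in counts.values())


# Made-up inputs for illustration only.
place_names = "Cornwallis Kentville Wolfville Windsor"
perl_fragment = "my $x = 1;"

print(word_entropy(place_names))    # 2.0
print(word_entropy(perl_fragment))  # 2.0
```

Both inputs contain four distinct tokens, so any threshold that flags the first would also flag the second.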


Yes, that's the intention. But it doesn't work, because of the non-linear weighting that a Bayesian filter uses. It doesn't poison the filter or swamp the signal with noise.
posted by Chocolate Pickle at 11:19 AM on January 12, 2010
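
One way to read the "non-linear weighting" argument is the combining rule from Paul Graham's "A Plan for Spam", sketched below. The thread does not say which filter is meant, so this is an assumption: only the 15 tokens whose spam probability lies furthest from 0.5 are combined, and never-seen tokens get a near-neutral 0.4. The token table, the spam_score function, and the filler words are all hypothetical.

```python
from math import prod

# Hypothetical per-token spam probabilities, as a trained filter might store them.
TOKEN_SPAM_PROB = {"viagra": 0.99, "unsubscribe": 0.95, "click": 0.90}
UNKNOWN_PROB = 0.4  # assumed default for tokens the filter has never seen
TOP_N = 15          # only the most "interesting" tokens are combined


def spam_score(tokens):
    """Combine the TOP_N token probabilities furthest from 0.5 into one score."""
    distinct = {t.lower() for t in tokens}
    probs = [TOKEN_SPAM_PROB.get(t, UNKNOWN_PROB) for t in distinct]
    # Rank by distance from the neutral value 0.5, most extreme first.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    top = probs[:TOP_N]
    spam = prod(top)
    ham = prod(1 - p for p in top)
    return spam / (spam + ham)


spam_words = ["click", "viagra", "unsubscribe"]
# Made-up filler standing in for a page of random, never-seen words.
filler = [f"place{i}" for i in range(1000)]

print(spam_score(spam_words))
print(spam_score(spam_words + filler[:12]))
print(spam_score(spam_words + filler))
```

In this sketch the last two scores are equal, because padding beyond the top 15 tokens never enters the calculation, and all three stay above 0.9. It illustrates the claim under these assumptions only; it is not evidence about how any particular real filter behaves.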

