Blog comment spam puzzler
June 29, 2006 10:01 AM

I am getting some odd blog comment spam. Does anyone know what this is about?

I moderate comments on a few blogs and have recently seen a new trend in blog comment spam. There are comments left with no apparent spam links - simply short text messages that involve numbers, and if a link is included, it is simply to google.com. A Google search on either the message or the poster's name shows the exact same message on thousands of blogs with only the numeral in the message changing. This has whetted my curiosity about why someone would comment spam without trying to hype a link - could it be a code of some sort, and if so, what? This is not of any earth-shattering urgency, it's just a puzzler to me that I thought one of you smart folks might be able to figure out.
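For anyone who wants to spot this pattern in their own moderation queue, here is a minimal sketch of one way to do it: mask every run of digits and group comments that collapse to the same text. The function names and the `<NUM>` placeholder are made up for illustration, and this is not how any particular blog platform or spam filter works.

```python
import re
from collections import defaultdict

# Matches any run of digits, e.g. the number inside a templated comment.
DIGIT_RUN = re.compile(r"\d+")

def template_of(comment: str) -> str:
    """Lowercase, collapse whitespace, and replace each digit run with <NUM>."""
    normalized = " ".join(comment.lower().split())
    # The placeholder is uppercase and the text is already lowercased,
    # so it cannot collide with anything the commenter typed.
    return DIGIT_RUN.sub("<NUM>", normalized)

def group_by_template(comments):
    """Map each masked template to the original comments that share it."""
    groups = defaultdict(list)
    for comment in comments:
        groups[template_of(comment)].append(comment)
    return dict(groups)

def suspicious_templates(comments, min_count=2):
    """Keep templates that contain a masked number and occur at least min_count times."""
    return {
        template: originals
        for template, originals in group_by_template(comments).items()
        if "<NUM>" in template and len(originals) >= min_count
    }
```

Feeding it a queue where the same sentence shows up with different numbers should return that sentence as one template, with each variant listed under it.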
posted by madamjujujive to Computers & Internet (23 answers total) 1 user marked this as a favorite
 
Previous thread on a similar topic. (No, it was not resolved.)
posted by languagehat at 10:09 AM on June 29, 2006


Also, I've been getting a shitload of comment spam with no author and no link to anything useful to a spammer; why does this happen?
posted by languagehat at 10:11 AM on June 29, 2006


I monitor comments on several blogs. Never seen this.

I did some testing. If I use 55808 I see the spam with that number on several blogs. But if I try any of the numbers 55807-55800 I get zero results. This makes me think the number isn't just randomly generated for each spam attempt. That is, the numbers have significance.
posted by Binkeeboo at 10:14 AM on June 29, 2006


Sorry languagehat, somehow I missed that prior thread. Thanks for pointing me to it. I am getting a few other messages too - all seemingly innocuous, and all with a 5-digit number. Well that post was a year ago - maybe this go round, someone will have some ideas.

I can't see the point, but there must be one. Mafia numbers? CIA or terrists code?
posted by madamjujujive at 10:19 AM on June 29, 2006


http://isc.sans.org/diary.php?storyid=1391&rss

http://www.boingboing.net/2006/05/30/reader_feedback_on_b.html
posted by luriete at 10:20 AM on June 29, 2006


"Also, I've been getting a shitload of comment spam with no author and no link to anything useful to a spammer; why does this happen?"

I'd always assumed that this was to get the email address through the "first-time commenter" moderation mechanism and subsequent comments by that spammer can contain the spammy links.
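A rough sketch of the kind of "first-time commenter" rule I mean, assuming the blog keys approval on the commenter's email address. The class and method names are hypothetical and do not come from any real blog software.

```python
class FirstTimeModeration:
    """Hold a commenter's first comment for review; publish later ones automatically."""

    def __init__(self):
        self.approved_emails = set()  # emails a moderator has already approved
        self.pending = []             # (email, text) pairs waiting for review

    def submit(self, email: str, text: str) -> str:
        """Return 'published' for known commenters, 'held' for new ones."""
        key = email.strip().lower()
        if key in self.approved_emails:
            return "published"
        self.pending.append((key, text))
        return "held"

    def approve(self, email: str) -> None:
        """Moderator approves a commenter; their held comments leave the queue."""
        key = email.strip().lower()
        self.approved_emails.add(key)
        self.pending = [(e, t) for e, t in self.pending if e != key]
```

Under this rule an innocuous first comment, once approved, lets every later comment from the same address skip the queue, which is the loophole described above.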
posted by ceri richard at 10:27 AM on June 29, 2006


In the old thread several people are suggesting the number could be used by spammers to identify blogs that aren't maintaining their comments. That is, if the number shows up in Google it means the blog with that number is probably ripe for a major expenditure of resources.

That would be a pretty cool hack. My thoughts -

1) But since the numbers are used on more than one blog, the theory doesn't hold up. For that hack to work you'd need a 1 to 1 relationship.

2) Even given #1 you could get the domain from the Google entry. So the numbers might indicate a successful version number or parameter list for the spam bot.

3) But then again....... No one seems to be reporting that they get a massive spam attack after seeing these messages.
posted by Binkeeboo at 10:29 AM on June 29, 2006


More here -

http://peterkaminski.com/2005/11/fivedigit_blog_spam.html
http://en.wikipedia.org/wiki/Comment_spam_number_mystery

I also like the Bayesian filter white noise idea.

The more I think about this the more I think it might be a probe to test new spam bots. As I was thinking a few weeks ago about how to stop spammers on my home-brewed CMS, I was thinking about attacking the programmer behind the bot rather than the bot itself. In other words, if I make the CMS handle comments differently, the programmer will have to update the bot to make it work again. But I could let spam stay in place for several minutes which would make debugging very difficult. Since everything, even flagged spam, would get posted, they'd have no way to tell if the bot was working.
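To make the delayed-removal idea concrete, here is a minimal sketch. It assumes the CMS already has some spam check that sets a `flagged` field; the `Comment` type, the five-minute default, and the function name are all made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Comment:
    text: str
    posted_at: float  # time the comment was submitted, in seconds
    flagged: bool     # result of whatever spam check the CMS uses (assumed)

def visible_comments(comments, now, grace_seconds=300.0):
    """Show clean comments always, and flagged ones only during the grace window."""
    return [
        c for c in comments
        if not c.flagged or (now - c.posted_at) < grace_seconds
    ]
```

A bot that checks the page right after posting sees its comment live either way, so it learns nothing about whether the filter caught it.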

I decided not to go that route.

But extending that idea, one might wonder if a spam bot writer might get tired of reprogramming things all the time and just build a tool to try random stuff. He could then search Google for a number which would be tied to a parameter list in the database. The spambot then has a handy list of parameters that are known to work.
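Stripped of everything bot-specific, the bookkeeping I'm speculating about is just a lookup from marker number to the settings that were tried with it. This is a guess at the mechanism, not anything observed; the function name and the parameter dictionaries are placeholders.

```python
def working_parameter_sets(trials, markers_seen):
    """Return the trials whose marker number later turned up in search results.

    trials: dict mapping a marker number to the parameter set tried with it.
    markers_seen: collection of marker numbers found in a search index.
    """
    seen = set(markers_seen)
    return {marker: params for marker, params in trials.items() if marker in seen}
```

That would fit with one specific number turning up in search while its neighbors return nothing, though it is only one possible explanation.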

Presto!!!! A self-learning spambot. Pretty cool, yes?
posted by Binkeeboo at 11:01 AM on June 29, 2006 [1 favorite]






If your blog publicly logs IPs, maybe the five-digit number is a convenient way to advertise the Zombie Port of the Day™.
posted by ikkyu2 at 11:38 AM on June 29, 2006


I'm not sure if this is related or not, but some friends and I have recently been getting weird numerical Gmail spam. It's weird, because the message seemed to come from myself. There was a 6-digit number in the subject line, and the only text in the body was a 4-digit number (see below). Spam never used to get through my Gmail spam filter, but since I got this weird email, a few spam messages have slipped through.

From: [nyterrant]

To: [nyterrant]
Date: Jun 6, 2006 2:24 AM
Subject: 586876
5556
posted by nyterrant at 11:42 AM on June 29, 2006


See also:

http://www.wikia.com/wiki/Forum:String_of_numbers_type_vandalism

madamjujujive: If you tag this thread with "fivedigitnumbers" (as opposed to "five-digit-spammers"), it will be properly grouped with the older thread. Thanks.
posted by cribcage at 11:48 AM on June 29, 2006


Very interesting idea of briefly posting the flagged spam, Binkeeboo. Understandably unpalatable -- could you make it more so by changing the color of the posted text to match your background color, so that only visitors who willfully insisted on their own schemes would be able to see it?

In any case, Metafilter: lymph node of the internet's immune system.
posted by jamjam at 12:05 PM on June 29, 2006


"so that only visitors who willfully insisted on their own schemes would be able to see it?"

Would have zero impact. The deal with comment spam is that it's designed to be picked up by search engines, not people. Thus, most comment spam will be put in old threads that no one reads. The idea is to increase page rank on the search engines. Few people will read a particular blog, but everyone uses Google.

What meehawl describes above is literally coming true in a sense. One bot (the spambot) is trying to talk to another bot (Googlebot) to manipulate the reality (Google results) of humans who don't see or don't suspect the crosstalk in the background.

Said another way, if we accept the premise that Google top 10 results color our reality - what we know and learn online, which businesses we are funneled to, what is flagged as significant - then we *are* being manipulated by computer programs. And with Google giving machine readable feedback on success or failure you have a powerful tool for a self-learning program.

Perhaps one day we'll look back and talk about the computers that gained consciousness and attacked us. And they won't be the military computers typically used in Sci-fi, but rather, spambots.
posted by Binkeeboo at 12:40 PM on June 29, 2006


I'm thinking about this too hard.........

More reason to think that the numbers are related to testing spambots - The beauty of testing this way is that since the ultimate target is the Google top ten list you get exponentially better test results by looking at success in Google than you do by looking at success on the blog itself.
posted by Binkeeboo at 1:12 PM on June 29, 2006


The spammers buy Google ads with that number as a keyword. That's how they get their ads on your page, and since there are no links or flagged words in the comments themselves, they bypass spam filters.
posted by turaho at 2:45 PM on June 29, 2006 [1 favorite]


If you have a Gmail account, you can test this.

Send an email to yourself with the subject 586876 and a body of 5556. (Like the email that nyterrant received.)

When you read your email, do you get Google ads from FreeGiftWorld and AllFreeGifts for Nero 6?

If so, then these spammers have successfully sent you a link to their websites without actually sending you a link.
posted by turaho at 2:57 PM on June 29, 2006


I tried turaho's suggestion. I got the ads he mentioned, plus one for burning copy protected DVDs.
posted by donnagirl at 3:07 PM on June 29, 2006


"The spammers buy Google ads with that number as a keyword."

I don't think so. Googling "I live at 55808 Commonwealth in Seattle. Been up here before" returns about 10 blogs. Many of them don't even have ads, and for the ones that do, most use something other than AdWords.

Additionally, Googling just 55808 returns a huge number of results, many of which have Google ads. So any targeting would be drowned out completely. And since it would be easy to choose a number that wasn't a zip code, the targeted marketing thing seems unlikely. For something like this you'd choose a 6-digit number, since a 5-digit number is so likely to collide with zip codes and cause problems.

Also - Using the numbers to target ads, rather than keywords, seems like a very bad marketing strategy. The ads are more likely to get clicked on if they're on a page where they have some relevance.

I can certainly see how the targeted marketing angle could be useful - putting a specific ad on a specific page - but that isn't what we're seeing from the real-world examples given.
posted by Binkeeboo at 3:09 PM on June 29, 2006


Some Wikis have been having trouble with this, too.
posted by Steven C. Den Beste at 6:01 PM on June 29, 2006


Well for now, it's still a mystery - but thanks to all who contributed links and theories. I didn't realize that this was under such discussion on the web already. Binkeeboo, your thinking is interesting. You may be on to something with the spam bot probe theory. However, I like the more clandestine theories along the lines of radio numbers and spies.

Meanwhile, the latest one sitting in my junk comment filter today: You can't be 34668 serious?!? Mary Box

Ms. Box uses a box.com email. She's pretty prolific in her comment posting, too. When I first started looking into this, I noted that a lot of bloggers actually comment back to these bots: "No, I've never even been to Seattle, why do you ask?" or "I'm totally serious, Mary!"

There's something rather sad about these earnest bloggers thinking they have a comment and it's only a bot. Rapping with the bots. Maybe you are all bots too. Cue Rod Serling type music.
posted by madamjujujive at 8:04 AM on June 30, 2006


Turaho, I also got the ads you mentioned.

I am impressed with your insight.
posted by jamjam at 11:58 AM on July 2, 2006

