Comment spam on my blog?
December 29, 2004 11:14 AM Subscribe
Comment Spam question. Recently I've been hit with hundreds of comment spams to my weblog (MT 3.0 w/ mt-blacklist). The funny thing is the links aren't to the normal kind of sites. The domain names are random letters that don't resolve and the text is usually a person's name. For example the link might be 'Lucille' with the domain of xzxesfseor.com.
Why are the comment spammers doing this? Are they attempting to overwhelm the blacklist systems with nonsense? Any explanations?
Response by poster: Even stranger, last night, after clearing out all the day's comment spam, I modified MT to change the name of the comment script (mt-comments.cgi) to something else.
Surely this will put an end to it I thought. Amazingly, within five minutes there were new comment spams using the new script name.
posted by Argyle at 11:22 AM on December 29, 2004
Argyle, only dumb comment spammers (I know, aren't they all dumb?) just assume that the name of your script is mt-comments.cgi. Smarter ones use systems that load up an individual archive page on a weblog, automatically find the form used to submit comments, and look at what the name of the page is in the form's submit action. Then they attack this page with their spam. (See here, and the thread which preceded that comment, for more info.)
It's sorta like the email spammers' methods of scraping webpages -- only the dumb ones are fooled by all those fancy schmancy Javascript-encoding methods of hiding email addresses. The smart ones use the same web page parsers that you and I use to normalize the page, and grab the email address with the same ease that you and I do.
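To see why renaming the script doesn't help, here is a minimal sketch of the form-discovery step described above, using only Python's standard library. The sample page, the script name `renamed-script.cgi`, and the example.com URL are made up for illustration; a real bot may well do it differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FormActionFinder(HTMLParser):
    """Collect the action URL of every <form> found on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.actions = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            action = dict(attrs).get("action") or ""
            # Resolve relative actions against the URL of the page itself.
            self.actions.append(urljoin(self.base_url, action))


def find_form_actions(html, base_url):
    """Return the submit targets of all forms in the given HTML."""
    finder = FormActionFinder(base_url)
    finder.feed(html)
    return finder.actions


# Hypothetical archive page whose comment script has been renamed.
page = """
<form method="post" action="/cgi-bin/mt/renamed-script.cgi">
  <textarea name="text"></textarea>
  <input type="submit" value="Post">
</form>
"""

print(find_form_actions(page, "http://example.com/archives/000123.html"))
```

Whatever you call the script, its name is sitting right there in the form's action attribute, so anything that reads the page can find it.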
posted by delfuego at 11:31 AM on December 29, 2004
Oh, and as for the nonsense domain names, there's been some talk over at the MT-Blacklist forums that it's just what majick says, a probe of your defenses. There's likely to be some automated system that checks to see if the comments make it onto a web page; if they do, then you're added to the database of sites which might be more lucrative to spam. It's all a guess, but it makes sense.
posted by delfuego at 11:34 AM on December 29, 2004
Response by poster: Thanks for the good info.
I'm going to buy Jay Allen more beers again this year at SXSW, but I don't know if we can outlast the comment spammers. I think it may require the firm application of a lead pipe to their skulls.
My site flips between PR6 & 7 and it draws the spammers like flies...
posted by Argyle at 11:44 AM on December 29, 2004
I figured that they were trying to fill up my Wordpress spamwords list with garbage. Which, for a little while, they were doing.
It seems to just be a tantrum, since all of the attacks are the same, not some intelligent probing of my defenses. Then again, perhaps they're checking how much of a DDoS I can handle...
posted by codger at 12:08 PM on December 29, 2004
I can buy the defense-probe theory. It squares with what I've seen on my site over the past three weeks or so.
I was getting very, very heavily comment-spammed in early December (Drupal 4.5.1). I installed the Drupal spam filtering module, flagged them all as spam, and Drupal promptly stopped another several hundred comments from posting through. After that, the bot would come back every few days and try again, and I would usually tag the first one in time for all the remaining messages in that bunch to get sequestered.
Well, by about the fourth return visit, the bot was posting links to domains that were comprised solely of a woman's full name (like "dagnymwilson.com", or something like that). They were always duds, and it would post only two or three tries, then go away. I'd flag them, but the spam filter wouldn't get a chance to flag anymore, because the bot wouldn't come back. As it stands now, it hasn't been around in almost a full week; that's the longest run since the attacks started.
(What will be really cool is when Drupal's captcha module gets mod'd to control comment posts....)
posted by lodurr at 12:36 PM on December 29, 2004
I think it's quite possible that simple error is to blame in some cases.
For example, I got a bunch of comment spam recently where the hyperlink target URLs and the text were transposed (ie the text was inside the HREF attribute and the URL was visible). Clearly someone had simply filled out a form or dialogue box wrong. I also get email spam where strings like %RNDWRD% appear, and obviously those are placeholders in the spam tool template which the spammer is not using right.
Since spammers squirt out so much, they can probably make a lot of mistakes without realising.
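Those leftover placeholders are easy to check for. A rough sketch in Python, assuming the tokens always look like %RNDWRD% (an all-caps name between percent signs); the pattern and the function name are my guesses, not taken from any actual spam tool or filter:

```python
import re

# Assumed shape of an unfilled template token: %NAME% in capitals.
PLACEHOLDER = re.compile(r"%[A-Z][A-Z0-9_]*%")


def has_unfilled_placeholder(text):
    """True if the text still contains something like %RNDWRD%."""
    return PLACEHOLDER.search(text) is not None


print(has_unfilled_placeholder("Great post! %RNDWRD% cheap pills"))
print(has_unfilled_placeholder("I'm 100% sure, maybe 50% off"))
```

It only catches the sloppy ones, of course, and ordinary percent signs in a sentence shouldn't trip it.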
OTOH, I have definitely had a couple of test probes in the last week - they had a hyperlink of "#" and a text of "test".
The worst stuff (by which I mean most persistent and offensive) is all coming from Ukraine and Byelorussia, for some reason.
posted by i_am_joe's_spleen at 1:20 PM on December 29, 2004
Response by poster: How do you know that the spammers are in Russia?
When I reverse DNS the IPs, I invariably get home DSL or cable modem accounts in the US. The computers sitting at the end of those lines must be zombied.
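For anyone who wants to do the same lookup, a minimal sketch in Python with the standard `socket` module. The helper name `reverse_dns` and the `resolver` parameter are mine, added only so the function can be tried without touching the network; the address in the comment is a documentation placeholder, not one from my logs.

```python
import socket


def reverse_dns(ip, resolver=socket.gethostbyaddr):
    """Return the PTR hostname for an IP address, or None if there is none."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except (socket.herror, socket.gaierror):
        # No reverse record, or the lookup itself failed.
        return None


# Example call (needs network access):
# print(reverse_dns("192.0.2.10"))
```

The hostname only tells you whose line the request came out of, which is why a zombied home machine looks like an ordinary DSL or cable customer.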
Am I missing some tracks that these guys are leaving behind?
posted by Argyle at 1:44 PM on December 29, 2004
Hmm. Wouldn't making commenters just register on the site do away with automated spam woes? Probably make for more intelligent comments too. Or at least ones with more effort put into them.
posted by Mossy at 3:18 PM on December 29, 2004
Whois on the domains in the comments. You're right, usually it's done through zombie proxies. Not always though - a few months ago someone was comment spamming from a bot on some compromised servers from an Italian ISP. BTW, Ukraine and Byelorussia are different countries from Russia.
posted by i_am_joe's_spleen at 3:31 PM on December 29, 2004
Mossy, just another form to fill out, right?
Also, PhotoDude posts and comment threads point out two big problems with CAPTCHAs, unfortunately: problematic for visually impaired people and easily worked around by setting up free porn sites that reuse the image and have people decode them for access.
posted by billsaysthis at 5:40 PM on December 29, 2004
...which has been merged with this topic hence giving you a friendly 404 at the URL above.
posted by fooljay at 12:14 PM on December 30, 2004
I wonder why comment systems don't:
a) take comment
b) spit back graphic of numbers/letters that you have to type in (with the squiggly lines and all so as not to be parsed)
c) and then post the comment when you type in the string in the graphic.
Like when you do a whois on some registrar's web sites. Seems like a minor step to solve the whole thing once and for all.
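Roughly what I have in mind for steps a) to c), as a sketch in Python. The class name `PendingComments` and its methods are made up, and drawing the challenge string as a squiggly image is left out entirely; this only shows the hold-then-confirm bookkeeping, not how any real comment system does it.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits


class PendingComments:
    """Hold each comment until the poster types back a challenge string."""

    def __init__(self):
        self._pending = {}   # token -> (challenge, comment)
        self.published = []

    def submit(self, comment):
        """Step a: take the comment, hold it, return (token, challenge)."""
        token = secrets.token_hex(8)
        challenge = "".join(secrets.choice(ALPHABET) for _ in range(6))
        self._pending[token] = (challenge, comment)
        # Step b would render `challenge` as a distorted graphic here.
        return token, challenge

    def confirm(self, token, answer):
        """Step c: publish only if the typed answer matches the challenge."""
        entry = self._pending.pop(token, None)  # each challenge is single-use
        if entry is None:
            return False
        challenge, comment = entry
        if answer.strip().upper() == challenge:
            self.published.append(comment)
            return True
        return False
```

A wrong answer throws the held comment away, so a bot guessing at random would have to resubmit every time.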
posted by pissfactory at 4:12 PM on December 30, 2004
That's a captcha, pissfactory, and they're an accessibility nightmare. Not a big deal if you don't have any visually impaired commenters, but you don't know if you do or not, and you don't want to place a barrier if such a person should come along.
Putting comments inline rather than a pop-up, using Blacklist (or whatever exists for your CMS), changing the name of the file, forcing previews rather than allowing posting with a single click and having a robust robots.txt all seem to help significantly without creating accessibility problems.
posted by Dreama at 5:39 PM on December 30, 2004
This thread is closed to new comments.
posted by majick at 11:17 AM on December 29, 2004