Why is a Chinese reader zipping through my (friend's) blog at a rate of one post a second?
June 15, 2012 10:08 AM
Why is a "reader" zipping through a friend's blog at a rate of one post a second?
Friend of mine has a modest but, in its niche, not entirely obscure blog. Not commercial, just informational. It has visitor counting software with live updates.
Someone in China or Germany or both is currently going through the whole thing at a rate too fast to be read but presumably not to be copied.
This can't be good. The question is, exactly how can this not be good? What to expect, what, if anything, to do?
It's a bot. Most likely it's the sort of bot that leaves comments like:
"Good post, really contains a lot of information I can find useful about this topic. Another beneficial good place to check out cool info like it is (link to some sort of commercial and/or phishing site). Thanks have a good day."
Or whatever. Anyway, it's a bot, not a person. Advise him to keep an eye on the comments on his entries.
It may also be a bot looking for specific vulnerabilities in his blogging software, in which case I'd advise him to read up on whatever current known exploits exist, just to be safe. But the former case seems more likely to me.
posted by FAMOUS MONSTER at 10:15 AM on June 15, 2012
Seconding bot. Also, consider that the purpose of this could be to set up a clone somewhere for SEO purposes. The bot clones the entire site, hosts it somewhere else, and adds tons of ads and/or outgoing links to various customers.
posted by Foci for Analysis at 10:20 AM on June 15, 2012
They are probably scraping the content, and probably want hand-written content so they can spam forums or other blogs with their own ads without setting off the filters.
It's usually good practice to throttle it slow enough to not be detected or to not throw off any monitoring alarms.
posted by wongcorgi at 10:22 AM on June 15, 2012
Best answer: If he's running his own WordPress installation, he should have Akismet installed at a minimum, and I would suggest adding Bad Behavior as well, since it can be set up to block these kinds of things before they even get to drop their spammy comments.
posted by tommasz at 10:27 AM on June 15, 2012 [4 favorites]
Response by poster: Akismet installed and Bad Behavior recommended. Plenty of spam in the past, but none (yet) seems to be associated with this current nonsense.
>> It's usually good practice to throttle it slow enough to not be detected or to not throw off any monitoring alarms.
In the simplest terms, how is this achieved? Throttle what slow enough?
>> Bot clones entire site, hosts it somewhere else and adds tons of ads and/or outgoing links to various customers.
Sounds bad. Any way to stop this?
posted by IndigoJones at 10:35 AM on June 15, 2012
Best answer: >> Sounds bad. Any way to stop this?
Not really. Scrapers gonna scrape. The server sysadmins can do things that'll help, but there's not a great way to stop it no matter what. They'll find ways around most anything.
>> In the simplest terms, how is this achieved? Throttle what slow enough?
One plugin; there are others. It can also be done by the sysadmins.
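For a rough idea of what throttling means on the server side, here is a minimal sketch, not the plugin mentioned above and not any particular server's feature. It assumes a sliding-window rule: each client (say, each IP address) gets a fixed number of page views per time window, and anything beyond that is refused. The class name, the limits, and the sample address are all made up for illustration.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable, so the limiter can be tried without waiting
        self._hits = defaultdict(deque)  # client id -> timestamps of served requests

    def allow(self, client_id):
        """Return True if this request may be served, False if it is throttled."""
        now = self.clock()
        hits = self._hits[client_id]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit; a web server would typically answer 429
        hits.append(now)
        return True


if __name__ == "__main__":
    # Simulate a bot asking for one post per second, using a fake clock.
    # The limit (5 views per 60 seconds) and the address are hypothetical.
    fake_now = [0.0]
    limiter = SlidingWindowLimiter(5, 60, clock=lambda: fake_now[0])
    for second in range(8):
        fake_now[0] = float(second)
        verdict = "served" if limiter.allow("203.0.113.7") else "throttled"
        print(f"t={second}s: {verdict}")
```

A human reader rarely opens more than a handful of posts a minute, so a limit like this mostly goes unnoticed by real visitors. A patient scraper can still stay under it, which is why it slows scraping down rather than stopping it.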
Really, it happens to ALL blogs. I bet MeFi gets crazy numbers of scrapers stealing content.
See Also: Introduction to Website Parasites
posted by Blake at 10:53 AM on June 15, 2012
Best answer: Also, check out CloudFlare. It's an extremely simple-to-use content delivery network with some additional security tools that can stop these bots from even accessing your website in the first place.
posted by joinks at 10:59 AM on June 15, 2012
I get these on my server all the time. They ignore my "robots" file and go ahead and dump the place. I've found that the vast majority of them are in China, Russia, or Ukraine. Ultimately the only solution I've found is to block them in my firewall. Or grin and bear it.
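The blocking itself normally lives in the firewall, but the matching logic is simple enough to sketch. This is only an illustration of checking an address against a list of blocked network ranges, assuming the offending ranges have already been picked out of the server logs. The ranges below are reserved documentation addresses standing in for real ones, and `is_blocked` is a made-up helper name.

```python
import ipaddress

# Hypothetical blocklist: reserved documentation ranges, standing in for
# whatever networks actually turn up in the logs.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(address, networks=BLOCKED_NETWORKS):
    """Return True if `address` falls inside any of the blocked networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in networks)


if __name__ == "__main__":
    for visitor in ("203.0.113.7", "192.0.2.1"):
        print(visitor, "blocked" if is_blocked(visitor) else "allowed")
```

Blocking whole ranges is blunt: it also shuts out any legitimate readers on those networks, and a determined scraper can come back from a different address.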
posted by Chocolate Pickle at 11:01 AM on June 15, 2012
A "scraper" is not necessarily nefarious. There are programs that are designed to download an entire site at a time, so that the user can refer to it offline. That may be the nature of the bot in question.
posted by yclipse at 1:40 PM on June 15, 2012 [1 favorite]
Response by poster: Okay, I will pass this on, and as always, very much obliged to you all.
posted by IndigoJones at 4:37 PM on June 15, 2012
This thread is closed to new comments.
posted by DarlingBri at 10:14 AM on June 15, 2012 [1 favorite]