Apache went boom. Diagnosis?
December 14, 2010 1:13 PM
What the heck just happened to my Apache server?
I'm running an Apache 2.2 server on an HP Fedora Core box - a LAMP setup. This runs a CMS (SilverStripe) that does a smallish amount of traffic. Nothing huge, maybe a hundred visitors per day.
This morning I got a lovely "Hey the website is down" email.
So I then:
- Attempt to load site, it fails.
- Log in, several hundred httpd processes are running.
- Restart httpd, everything's happy. Site loads now.
- Check netstat, and there's a block of about 10 IPs in the Philippines that are scraping the website.
- I block the IPs, start looking at logs.
Here's what I'm curious about - right when the server started losing its brains, the httpd access logs started listing accesses out of sync, like this (roughly from 8:09 to 8:18):
14/Dec/2010:08:09:53 -0800] "GET
14/Dec/2010:08:02:58 -0800] "GET
14/Dec/2010:08:09:57 -0800] "GET
14/Dec/2010:07:59:17 -0800] "GET
14/Dec/2010:06:58:41 -0800] "GET
14/Dec/2010:08:14:01 -0800] "GET
14/Dec/2010:08:14:02 -0800] "GET
14/Dec/2010:07:08:31 -0800] "GET
14/Dec/2010:08:16:02 -0800] "GET
14/Dec/2010:08:08:53 -0800] "GET
(... more of the same until I restarted httpd)
Reading up on this, I see that Apache only writes a log line at the end of the request, stamped with the time the request arrived - so this means that many old httpd processes (07:59:17, 06:58:41, 07:08:31) were hanging out and finishing very late. I'm guessing there were several hundred of these.
So - what gives? Is this some horribly coded site scraper somewhere just eating up all my httpd processes - or was this an actual attack, something like Slowloris?
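To put numbers on "finishing very late", here is a minimal Python sketch that flags log entries older than the newest timestamp already written. It uses only the truncated timestamps from the excerpt above; the function name `late_finishers` is made up for illustration. Because the line is written when the request completes, each lag is a lower bound on how long that request held its httpd process.

```python
from datetime import datetime

# Timestamps copied from the truncated log excerpt above, in file order.
STAMPS = [
    "14/Dec/2010:08:09:53 -0800",
    "14/Dec/2010:08:02:58 -0800",
    "14/Dec/2010:08:09:57 -0800",
    "14/Dec/2010:07:59:17 -0800",
    "14/Dec/2010:06:58:41 -0800",
    "14/Dec/2010:08:14:01 -0800",
    "14/Dec/2010:08:14:02 -0800",
    "14/Dec/2010:07:08:31 -0800",
    "14/Dec/2010:08:16:02 -0800",
    "14/Dec/2010:08:08:53 -0800",
]


def parse(stamp):
    """Parse an Apache access-log timestamp (assumes English month names)."""
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")


def late_finishers(stamps):
    """Return (stamp, lag) pairs for entries older than the newest one seen so far.

    The lag is a lower bound on the request's duration: it arrived at its
    own timestamp and finished no earlier than the newest arrival logged
    before it.
    """
    newest = None
    late = []
    for stamp in stamps:
        t = parse(stamp)
        if newest is not None and t < newest:
            late.append((stamp, newest - t))
        else:
            newest = t
    return late


for stamp, lag in late_finishers(STAMPS):
    print(stamp, "finished at least", lag, "after it arrived")
```

On this excerpt, half of the ten entries are flagged, and the 06:58:41 request was held open for over an hour.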
Response by poster: IPs were a range from what looks to be "proxy1.skybroadband.com.ph" through "proxy6.skybroadband.com.ph".
And there were real valid GETs there, asking for real content - I've just truncated the logs here to show the times and not IPs or the rest of the log lines.
posted by bhance at 1:43 PM on December 14, 2010
Best answer: Sounds like a poorly written bot or script that flooded the site with requests... maybe it was stuck in a loop requesting the same things over and over. The reason the site was not available was likely because you hit the MaxClients (default 256) limit which determines how many httpd worker processes can be active at any time. If you have MaxClients requests being handled then no more new requests can be served until one frees up.
posted by Rhomboid at 1:53 PM on December 14, 2010
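A rough way to see how a handful of clients can hit that limit: in steady state, the number of busy workers is about the arrival rate times how long each request holds a worker (Little's law). The sketch below is only an approximation, and the request durations in it are hypothetical, not measured from this server; only the 256 figure comes from the answer above.

```python
MAX_CLIENTS = 256  # default cited in the answer above


def busy_workers(requests_per_second, seconds_per_request):
    """Steady-state estimate of concurrently busy workers (Little's law)."""
    return requests_per_second * seconds_per_request


def saturating_rate(max_clients, seconds_per_request):
    """Arrival rate (requests/second) at which every worker is busy."""
    return max_clients / seconds_per_request


# Hypothetical durations: a healthy half-second page versus a request
# that hangs for ten minutes.
for seconds in (0.5, 600):
    rate = saturating_rate(MAX_CLIENTS, seconds)
    print(f"{seconds:>6} s per request -> saturated at {rate:.2f} req/s")
```

If requests hang for minutes, well under one request per second is enough to occupy every worker, which is consistent with a small block of IPs taking the site down.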
What Rhomboid said, I see this ALL the time on my servers. It's PROBABLY just a mistake. You can do rate limiting with iptables that could help, also check out mod_evasive or whatever that's called.
posted by Blake at 2:17 PM on December 14, 2010
Best answer: Oh - I was thinking they were bizarre truncated GETs that were just holding the socket open. Now it sounds more like a crappy crawler. It's some residential ISP in the Philippines' outbound customer proxy servers. Unless you have people in the Philippines regularly browsing your sites, I'd simply IP block them.
posted by GuyZero at 3:03 PM on December 14, 2010
Response by poster: This is a followup:
After upping the MaxClients and putting some more monitoring on the box, there was a 2nd incident.
Closer examination of the web logs shows deliberate malformatting in the GETs by injecting a weird string (\xb0) in various valid URLs, like this:
114.108.192.9 - - [20/Dec/2010:02:45:26 -0800] "GET /\xb0 HTTP/1.1" 404 17106 "http://www.(redacted).com/"
114.108.192.8 - - [20/Dec/2010:07:08:11 -0800] "GET /ThingD\xb0etails/Order/197 HTTP/1.1" 404 17239 "http://www.(redacted).com/"
114.108.192.9 - - [20/Dec/2010:07:08:11 -0800] "GET /Thi\xb0ngDetails/Order/63 HTTP/1.1" 404 17234 "http://www.(redacted).com/"
114.108.192.9 - - [20/Dec/2010:07:08:11 -0800] "GET /MyCollection/Ad\xb0dRemoveThing/51 HTTP/1.1" 404 35497 "http://www.(redacted).com/"
114.108.192.12 - - [20/Dec/2010:07:08:12 -0800] "GET /Comparison/AddRemoveThing/19\xb06/Order HTTP/1.1" 200 46 "http://www.(redacted).com/"
So at this point I'm calling this malicious, and not a runaway bot.
Anyone seen this before?
posted by bhance at 4:32 PM on December 20, 2010
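For anyone wanting to pull these out of a larger log, here is a minimal Python sketch that finds requests whose path contains a `\xNN` escape (Apache writes non-printable bytes that way) and groups the client IPs by /24. The sample lines are the five shown above; the regexes and the function name `suspicious_networks` are assumptions made for illustration, not a complete log parser.

```python
import ipaddress
import re
from collections import Counter

# The five log lines quoted above, kept as raw strings so that \xb0 stays
# a literal backslash escape, exactly as it appears in the log file.
LOG_LINES = [
    r'114.108.192.9 - - [20/Dec/2010:02:45:26 -0800] "GET /\xb0 HTTP/1.1" 404 17106 "http://www.(redacted).com/"',
    r'114.108.192.8 - - [20/Dec/2010:07:08:11 -0800] "GET /ThingD\xb0etails/Order/197 HTTP/1.1" 404 17239 "http://www.(redacted).com/"',
    r'114.108.192.9 - - [20/Dec/2010:07:08:11 -0800] "GET /Thi\xb0ngDetails/Order/63 HTTP/1.1" 404 17234 "http://www.(redacted).com/"',
    r'114.108.192.9 - - [20/Dec/2010:07:08:11 -0800] "GET /MyCollection/Ad\xb0dRemoveThing/51 HTTP/1.1" 404 35497 "http://www.(redacted).com/"',
    r'114.108.192.12 - - [20/Dec/2010:07:08:12 -0800] "GET /Comparison/AddRemoveThing/19\xb06/Order HTTP/1.1" 200 46 "http://www.(redacted).com/"',
]

# Client IP, request path, and status code from a combined-format line.
LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "\S+ (\S+) [^"]*" (\d{3}) ')
# Simplification: any \xNN in the path counts, even after an escaped backslash.
ESCAPED_BYTE = re.compile(r"\\x[0-9a-fA-F]{2}")


def suspicious_networks(lines, prefix=24):
    """Count requests with escaped bytes in the path, keyed by client network."""
    hits = Counter()
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue  # skip lines that do not look like access-log entries
        ip, path, _status = match.groups()
        if ESCAPED_BYTE.search(path):
            network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
            hits[str(network)] += 1
    return hits


for network, count in suspicious_networks(LOG_LINES).items():
    print(network, count)
```

All five sample requests come from the same /24, 114.108.192.0/24.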
Best answer: That does appear to be someone doing a vulnerability scan. \xb0 in the 8859-1 (Latin-1) encoding is simply the degree symbol (°), so that's pretty harmless. However, in UTF-8 it's an invalid byte sequence (a continuation byte with no lead byte before it), which means that if your software treats the URL as UTF-8 then it will possibly encounter an error condition which was rarely tested, and that is probably what the attacker is looking for. Perhaps the software might throw an exception, which sometimes gives the attacker a nice stack trace which can reveal operating details such as version numbers and paths where files are stored. You might want to check your error log to see if anything was reported for those requests. If your software just returns a 404 without freaking out or doing anything abnormal then you probably have nothing to worry about, but I'd recommend looking at the body of the response to such a request just to make sure. And definitely add his whole /24 to the blocklist.
posted by Rhomboid at 9:10 PM on December 20, 2010
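The encoding point can be checked in a few lines of Python. This is only an illustration of the byte-level behavior, using one of the paths from the log; it says nothing about how SilverStripe or PHP actually decodes URLs, and the helper name `decode_path` is hypothetical.

```python
# One of the logged paths, with \xb0 as the actual raw byte 0xB0.
RAW = b"/ThingD\xb0etails/Order/197"


def decode_path(raw):
    """Return the path decoded as UTF-8, or None if the bytes are not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


# Latin-1 maps every byte to a character, so 0xB0 becomes the degree sign.
print(RAW.decode("latin-1"))

# As UTF-8, 0xB0 is a continuation byte with no lead byte, so decoding fails.
print(decode_path(RAW))
```

Code that rejects the request cleanly when decoding fails, as `decode_path` does by returning `None`, is the "just returns a 404" case; an uncaught decode error is the kind of rarely tested path the scan is probing for.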
This thread is closed to new comments.
posted by GuyZero at 1:40 PM on December 14, 2010 [1 favorite]