How can I know exactly what my webserver is doing?
February 7, 2008 11:00 AM

I've got a web server under a fairly heavy load of web traffic. What techniques can I use to find the bottlenecks?

I run a dedicated webserver for a website that gets about 600,000 pageviews a month. Most of the time it's fine, but there are times when it chokes on the traffic and becomes inaccessible.

What are some tools, logs, or applications I can run that can help me diagnose where the weak points are? I already know how to use "top", and tail -f on the log file to watch a stream of entries accumulating, but that's not enough.

Is it a rogue spider crawling through my directories, gobbling all my bandwidth? Is the hard drive not able to serve up the files fast enough? Is there some kind of hacking attempt going on? Have I misconfigured one little setting in my Apache httpd.conf file that's turning away users? Is there a stupid infinite redirect somewhere?

I want to be able to figure out exactly what's going on for the server when it's having a slowdown to diagnose the problem.

Any suggestions?

The server is running Linux CentOS 4.4. I'm comfortable compiling and installing software, running advanced commands, stitching together perl code, etc.
posted by fcain to Computers & Internet (11 answers total)

Use a cpu monitor. Are you at full or near full load?

Use a hard drive monitor tool to check to see if it's getting pounded. Maybe you need more memory?

Use a network monitor to determine if you have enough bandwidth, or if your server is getting clobbered. Do you need a bigger pipe? Maybe distributed servers?

If it's a rogue spider, try putting a draconian robots.txt file on your server to shut them down.

Just a few suggestions!
posted by fusinski at 11:16 AM on February 7, 2008
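
One way to check the rogue-spider theory before touching robots.txt is to tally the access log by client. The following is a minimal sketch, not a definitive tool: it assumes Apache is writing the "combined" log format (which includes the user agent), and the `tally` function name and the log path in the final comment are hypothetical.

```python
import re
from collections import Counter

# Assumption: Apache "combined" LogFormat. Adjust the pattern if yours differs.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def tally(lines):
    """Count requests and bytes sent per (IP, user agent) pair."""
    hits = Counter()
    bytes_sent = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        key = (match.group("ip"), match.group("agent"))
        hits[key] += 1
        size = match.group("size")
        if size.isdigit():  # the size field is "-" when no body was sent
            bytes_sent[key] += int(size)
    return hits, bytes_sent


def top_clients(lines, limit=10):
    """Return the busiest clients as (hits, bytes, ip, agent) rows."""
    hits, bytes_sent = tally(lines)
    return [(n, bytes_sent[key], key[0], key[1])
            for key, n in hits.most_common(limit)]


# Hypothetical usage against a real log (the path is an assumption):
#   with open("/var/log/httpd/access_log", errors="replace") as f:
#       for row in top_clients(f):
#           print(row)
```

If one client accounts for a large share of the hits or bytes during a slowdown, that is a reasonable hint that a crawler is involved.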

what version is your web server?
what is the main content it's serving?
what is your cms?
what db do you use if any?
is the db on the server?
etc etc..

need more data
posted by iamabot at 11:30 AM on February 7, 2008

A good start would be looking at top. What are your load averages?

Is the CPU tied up or is it blocking on I/O?

If the CPU is very busy, is it the web server or DB process that's using the most CPU?

Have you optimized your database configuration for your data set? Have enough memory for your indexes?
posted by kableh at 11:36 AM on February 7, 2008
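
To answer the "CPU tied up or blocking on I/O" question with numbers, the aggregate `cpu` line in `/proc/stat` can be sampled twice and compared. This is a rough sketch, assuming a Linux 2.6 kernel where that line carries at least the user, nice, system, idle, and iowait counters; the function names are hypothetical.

```python
import time

# Counter order on the aggregate "cpu" line of /proc/stat (Linux 2.6 and later).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")


def parse_cpu_line(stat_text):
    """Return cumulative ticks per state from the aggregate 'cpu' line."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            values += [0] * (len(FIELDS) - len(values))  # pad on older kernels
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_breakdown(before, after):
    """Percentage of elapsed CPU time spent in each state between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * value / total for name, value in deltas.items()}


def sample(interval=5.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and return the breakdown."""
    with open(path) as f:
        before = parse_cpu_line(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.read())
    return cpu_breakdown(before, after)
```

Calling `sample()` during a slowdown gives a quick read: a high `user` plus `system` share points at the CPU, while a high `iowait` share with mostly idle CPU suggests processes are waiting on the disk.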

Catching it when it's inaccessible is the trick, so you can look at all of these factors at once and ID the weak link. Catch it misbehaving and check top/memory usage, netstat, iostat, ps aux, mysqladmin processlist, and tail the httpd error log all at once to see if there are any standout issues.

If your host has a bandwidth monitor, you'll want to check that, too. Could be you're just maxing out your allotted bandwidth pipe.

If you have anything hitting a database, that's also a good place to start. MySQL (if that is what you use) has many well-documented places to start tuning (max_connections, the various sample my.cnf files for varying server sizes, etc.).

If you're running a DB and Apache on the same machine, you may consider splitting them onto two machines.

httpd.conf's MinSpareServers/MaxSpareServers settings are worth a little reading if your Apache is complaining about not having enough threads sitting around to serve requests.

[ on preview - what fusinski said, basically :) ]
posted by bhance at 11:37 AM on February 7, 2008
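
Since the hard part is being at the keyboard when the slowdown happens, the "check everything at once" step can be scripted. Here is a minimal sketch, assuming the listed tools are installed and on the PATH; the load threshold, the output directory, and the error log path are all assumptions, `mysqladmin` may also need credentials, and the function names are hypothetical.

```python
import os
import subprocess
import time

# Commands from the suggestion above. The error log path is an assumption.
DEFAULT_COMMANDS = [
    ["top", "-b", "-n", "1"],
    ["netstat", "-an"],
    ["iostat"],
    ["ps", "aux"],
    ["mysqladmin", "processlist"],
    ["tail", "-n", "50", "/var/log/httpd/error_log"],
]


def run_command(cmd, timeout=30):
    """Run one command and return its combined output, or a note if it failed."""
    try:
        result = subprocess.run(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, timeout=timeout,
                                universal_newlines=True, errors="replace")
        return result.stdout
    except (OSError, subprocess.TimeoutExpired) as exc:
        return "[could not run: %s]\n" % exc


def snapshot(commands, out_dir):
    """Write the output of every command to one timestamped file; return its path."""
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, time.strftime("snapshot-%Y%m%d-%H%M%S.txt"))
    with open(path, "w") as out:
        for cmd in commands:
            out.write("===== %s =====\n" % " ".join(cmd))
            out.write(run_command(cmd))
            out.write("\n")
    return path


def watch(threshold=4.0, interval=60, commands=DEFAULT_COMMANDS, out_dir="snapshots"):
    """Poll the 1-minute load average; take a snapshot whenever it reaches threshold."""
    while True:
        if os.getloadavg()[0] >= threshold:
            print("saved", snapshot(commands, out_dir))
        time.sleep(interval)
```

Leaving `watch()` running in the background means the next slowdown leaves behind a file per minute that can be read afterwards, rather than relying on catching it live.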

You should profile your scripts. I'm assuming you're using PHP? If so, just search Google for PHP profiling - the first page of results is full of great articles.

If you're using something else, it's easy to build your own profiler. Just insert code in your scripts that logs out the current system time (in ticks or milliseconds) after every major block of code. Then have your script post the results somewhere. You should find your bottlenecks pretty quickly.

Also, tell your database software to log queries that take a long time (very few queries should take more than 500ms or so). Then find out why those queries are taking so long and optimize them. It could be a poorly-written query, or a missing index, or something that should be cached instead of executed constantly, etc. Instructions for logging slow queries should also be easy to find on Google.
posted by helios at 11:38 AM on February 7, 2008
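
The build-your-own profiler idea above can be sketched in a few lines. This version is in Python rather than PHP or Perl, purely for illustration; the `BlockTimer` class and the block labels are hypothetical, and it keeps results in memory instead of posting them anywhere.

```python
import time
from contextlib import contextmanager


class BlockTimer:
    """Record how long each labelled block of code takes, in milliseconds."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock
        self.timings = []  # list of (label, milliseconds) in execution order

    @contextmanager
    def block(self, label):
        start = self.clock()
        try:
            yield
        finally:
            # Record the elapsed time even if the block raised an exception.
            self.timings.append((label, (self.clock() - start) * 1000.0))

    def report(self):
        """Return one line per block, slowest first."""
        ordered = sorted(self.timings, key=lambda item: item[1], reverse=True)
        return "\n".join("%-20s %10.1f ms" % item for item in ordered)


if __name__ == "__main__":
    timer = BlockTimer()
    with timer.block("load posts"):   # stand-in for a database query
        time.sleep(0.02)
    with timer.block("render page"):  # stand-in for template rendering
        time.sleep(0.01)
    print(timer.report())
```

Wrapping each major step of a request this way and writing `report()` to a log makes the slowest block stand out without any external tooling.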

Install munin or cacti. Acquaint yourself with vmstat and iostat. This should get you pretty far. If you're using a database, a slow query log has solved many many bottleneck and spiking issues.
posted by rhizome at 12:22 PM on February 7, 2008

You can also run all of your logs through Splunk and draw some correlating data on what is happening when and what errors are being thrown system-wide.
posted by iamabot at 1:14 PM on February 7, 2008

Debugging web app performance is kind of a PITA. But there are some general steps to try. I keep some info about this kind of stuff here. Sadly, more of that info is outdated than I would like it to be.

Any patterns to the outages? A lot of the time, things as simple as backups, log rotation, or various cron jobs can interact with web servers and cause poor performance.

The mod_perl performance tuning guide is very good. Especially if you are using Perl, but the basic ideas are good either way.

vmstat/iostat/sar are good places to start. vmstat will give you a good idea of basic I/O usage, and also some job scheduling issues, which can often creep up with Apache. Watch top or ps to see if processes are stalling (or even worse, rapidly starting lots of quickly dying subprocesses, always a pain to debug).

Hard to give much advice without knowing what the app does. Is it a DB-intensive app? Is it serving up large files? etc.
posted by alikins at 1:29 PM on February 7, 2008

A dedicated web server should have no trouble with 600,000 page views a month unless you're doing some very intense processing.

Some suggestions:

It could be a rogue spider-- I've had Google cause some pretty heavy spikes on one of my servers when it tried to index a discussion forum. Try using robots.txt to tell the spiders what to ignore.

Use one of the free web server reports to see what the most commonly requested pages are.

If you're using MySQL, enable the slow query log. Also check for "select full joins" (the Select_full_join status counter).

One technique that's worked well for me is to wrap my request handler in a function that randomly chooses certain requests and times them (using microtime() in PHP, I forget what the Perl equivalent is called). It then inserts these into a database along with an identifier that categorizes the request (e.g., what page was asked for or what function was executed). After a day, I have a large enough sample to see what functions take the longest to execute on average, as well as which ones take the longest cumulatively.
posted by justkevin at 2:14 PM on February 7, 2008
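
The sampled-timing technique described above could look roughly like this. It is a sketch in Python rather than PHP, and it swaps the database for an in-memory dictionary to stay self-contained; the `SampledTimer` class, the 5% default sample rate, and the category names are all assumptions.

```python
import functools
import random
import time
from collections import defaultdict


class SampledTimer:
    """Time a random fraction of calls and group the results by category."""

    def __init__(self, sample_rate=0.05, rng=random.random, clock=time.perf_counter):
        self.sample_rate = sample_rate
        self.rng = rng
        self.clock = clock
        self.samples = defaultdict(list)  # category -> durations in seconds

    def timed(self, category):
        """Decorator: time the wrapped handler on a random subset of calls."""
        def decorator(handler):
            @functools.wraps(handler)
            def wrapper(*args, **kwargs):
                if self.rng() >= self.sample_rate:
                    return handler(*args, **kwargs)  # not sampled this time
                start = self.clock()
                try:
                    return handler(*args, **kwargs)
                finally:
                    self.samples[category].append(self.clock() - start)
            return wrapper
        return decorator

    def summary(self):
        """Return (category, count, average, total) rows, largest total first."""
        rows = [(category, len(times), sum(times) / len(times), sum(times))
                for category, times in self.samples.items() if times]
        return sorted(rows, key=lambda row: row[3], reverse=True)
```

Decorating each request handler with `@timer.timed("front page")` (or a similar label) and dumping `summary()` once a day gives both views mentioned above: the average column shows which handlers are slow per request, and the total column shows which ones cost the most time overall among the sampled requests.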

Response by poster: The server's mainly running a website on WordPress. I'm using caching software so it minimizes the PHP+MySQL queries, serving up plain HTML when possible. Webserver is Apache 1.3.
posted by fcain at 7:44 AM on February 8, 2008

Response by poster: Sorry, and just to clarify the question. I'm not looking for ways to optimize the server... yet. I want to learn tools and techniques to figure out where the bottlenecks are. To use one of the examples above, there's no point increasing the memory on the server if my host is limiting my bandwidth.
posted by fcain at 7:49 AM on February 8, 2008

This thread is closed to new comments.