How do you calculate the burden a bit of PHP is likely to put on the server?
August 24, 2007 10:06 PM

How do you calculate the burden a bit of code is likely to put on the server? Extra points for free and easily understood.

I'm webmaster of a ~100 page site with every page containing a sidebar. The sidebar has buttons on it which either link to that page if you're not on it or put a CSS class on the text if you are.

I'm in the process of updating and streamlining the site. I'd like to move the sidebar into a new file, sidebar.php, and use an include on all the other pages so that any time the sidebar changes I can change and reupload only one file rather than 100.

I know this can be done in Javascript, but I'd rather not. I'd like an option that works for *every* visitor, not simply the ones visiting in a Javascript-capable browser with Javascript turned on.

The code I was thinking of was something like
$here = $_SERVER['PHP_SELF'];
if ($here == '/index.php')
print "Home";
else
print "Home";

It works, and I suspect it won't be a problem even though we have just a very basic hosting package and get about 5k page views a week. (The math: 5,000 page views a week * 10 if/thens = 50k if/thens a week = 7,143 a day = 298 an hour = 5 a minute = most likely not a problem.) But how do you know how much of that kind of access a host permits? At what point does it get their attention and/or ire?
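The back-of-envelope estimate above can be checked with a few lines of PHP. This is only a sketch of the arithmetic: the figures (5,000 page views a week, 10 conditionals per page) come from the question, and checks_per_minute() is a hypothetical helper name, not anything built into PHP.

```php
<?php
// Rough load estimate from the figures in the question.
// checks_per_minute() is a hypothetical helper, named here for illustration.
function checks_per_minute($viewsPerWeek, $checksPerView)
{
    $perWeek = $viewsPerWeek * $checksPerView;   // total conditionals per week
    return $perWeek / (7 * 24 * 60);             // spread evenly over the week's minutes
}

printf("%.2f conditionals per minute\n", checks_per_minute(5000, 10));
```

Real traffic is burstier than an even spread, so this is an average, not a peak.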

Also, how do you calculate the burden of proposed code? Where does one go for that information? What do you do to make sure you're writing code that can handle a lot of traffic?
posted by Tuwa to Computers & Internet (14 answers total) 2 users marked this as a favorite
 
Foiled again. Ah, the if/then I expect to repeat:

if ($here == '/index.php')
print "<span class=\"here\">Home</span>";
else
print "<a href=\"index.php\">Home</a>";
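Since that if/else repeats for every button, one way to keep sidebar.php short is to fold it into a helper. What follows is a minimal sketch, not the poster's actual file: nav_item() is a hypothetical function name, and the /about.php entry is made up for illustration. Only index.php and the "here" class come from the post.

```php
<?php
// sidebar.php (sketch). nav_item() is a hypothetical helper, not part of PHP.

// Return the markup for one sidebar entry: a highlighted span when the
// visitor is already on that page, a plain link otherwise.
function nav_item($here, $path, $label)
{
    $text = htmlspecialchars($label);
    if ($here == $path) {
        return '<span class="here">' . $text . '</span>';
    }
    return '<a href="' . htmlspecialchars($path) . '">' . $text . '</a>';
}

// PHP_SELF is the path of the page being requested, e.g. "/index.php".
$here = isset($_SERVER['PHP_SELF']) ? $_SERVER['PHP_SELF'] : '';

print nav_item($here, '/index.php', 'Home');
print nav_item($here, '/about.php', 'About'); // hypothetical second page
```

Each page would then pull the sidebar in with include 'sidebar.php'; at the spot where the sidebar markup used to be.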
posted by Tuwa at 10:36 PM on August 24, 2007


The processing power that your example code would require is so negligible it will not even register on the radar. I would benchmark your code to tell you the precise fraction of CPU time it would consume, but consumer-level PCs do not measure time accurately enough with a small enough granularity.

In other words: don't stress. Shared hosts only kick up a fuss if you're running something on the order of magnitude of vBulletin, phpBB or the like -- and even then only if you have many active users at all times of day and they're a particularly stringent host.

I hope this goes some way in quelling your fears!
posted by PuGZ at 11:17 PM on August 24, 2007


1) That kind of php include will be trivial, unless you're serving up many pages per second or doing some very intensive processing. (you're not)

2) The easiest way to test the resource usage of a piece of code is to set up php on your system, then monitor the system as you execute it. If you want detailed information, most programming languages have Profilers available that will tell you how much time is spent in each part of the code. (I don't know of one for php offhand, but that's because I use perl and ruby)
posted by chrisamiller at 11:18 PM on August 24, 2007


I realised I didn't address all your questions. There is no general rule: each host is different. That said, most hosts will not be concerned unless you're consistently over-burdening the server and inconveniencing other users. Unless they are trying to cram thousands of users onto a single machine, it is unlikely this will ever be a concern for you.

Personally, I calculate the burden of code that I write using a handy little programme named ab - apache benchmark. This tells you how many requests for your site your server can handle without breaking down in tears.

Generally, anything taking less than 20ms is perfect, and anything less than 100ms is acceptable for most surfing purposes. If you're accepting user input (forms, etc.) anything up to 500ms is fine. Frankly, the majority of time spent preparing a webpage to be sent to a user is waiting on IO -- grabbing the code from the hard drive, etc. This goes for both static and dynamic pages: the time spent on executing your code is negligible compared to this time.

Unfortunately, as a beginner there isn't too much you can do to write code that scales well - this is something that comes from experience and asking lots of questions!
posted by PuGZ at 11:31 PM on August 24, 2007


You can use the Apache bench tool to benchmark this kind of thing. You can't do it directly on your host because it's one of those things that definitely raises eyebrows, since it's a purposeful bombardment of requests to the server, but you can try it in your development environment and get an idea of the scale of change if not for the actual numbers.

I made a static page with the 10 links and some random HTML cruft, and a PHP page with the same cruft and dynamic links implemented about as you described, and tested the results on my laptop using
ab -n 5000 -c 1
(5000 total hits as fast as possible with only one connection at a time, as concurrency isn't likely for your site's traffic). Just typing 'ab' where it is installed gives you a list of the switches, and 'man ab' gets a little more in-depth, and googling gets much much more in-depth.

The static page scored 1432 requests per second, taking a mean time of 0.698ms to serve each request. The dynamic page served 507 requests per second, taking a mean time of 1.973ms to serve each request. The numbers will be (sometimes wildly) different depending on where you run the tests, but the ratios are often quite consistent. You can expect your site to be a bit less than 3 times slower with the dynamic links than it was as a collection of purely static pages. There is a large initial impact when moving from pure static to anything dynamic because you now have to deal with the per-request overhead of the PHP mechanism, which is where nearly all of that slowdown is coming from. A handful of conditionals is nothing to be concerned about for performance reasons unless you're working on a very tight inner loop in a cutting-edge game engine.

You can find out exactly how much time the PHP section of the request is taking by profiling Apache, and how much time PHP is spending in each section of code by profiling the script itself. There are several mechanisms to test both, but none really work here because you're dealing in such small processing potatoes that the overhead introduced by the respective profilers is very likely to skew or mask the impact of the part of the code you're trying to test. Plus you've only got the one section of code, so the bottleneck is obviously there for lack of anywhere else to be.

Although PHP was nearly 3 times slower, keep in mind the scale was tiny to begin with. The PHP tests finished in less than 10 seconds on an old and untuned non-server laptop running a bunch of extraneous junk, and covered your traffic for a week (not including images and other extra files, but those will be served at the same speed with and without PHP). You should definitely follow through with your plan, as changing the sidebar has got to be a big enough pain in the ass to warrant an extra millisecond and change per request that isn't absolutely necessary.
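To put those benchmark figures side by side, here is a rough sketch of the arithmetic. The two requests-per-second values are the ones reported above; mean_ms() is a hypothetical helper name, and the results are only as good as one laptop run.

```php
<?php
// Requests per second reported by ab in the post above.
$staticRps  = 1432;
$dynamicRps = 507;

// mean_ms() is a hypothetical helper: milliseconds per request at a given rate.
function mean_ms($requestsPerSecond)
{
    return 1000 / $requestsPerSecond;
}

$slowdown = $staticRps / $dynamicRps;                      // how many times slower the PHP page is
$extraMs  = mean_ms($dynamicRps) - mean_ms($staticRps);    // added time per request
$weekSecs = 5000 * mean_ms($dynamicRps) / 1000;            // a week of page views served back to back

printf("%.2fx slower, %.2f ms extra per request, %.1f s for 5000 requests\n",
    $slowdown, $extraMs, $weekSecs);
```

The point of the comparison is the scale: the slowdown factor looks large, but the absolute cost per request stays in the low milliseconds.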

In fact, I'd advise you to not worry a bit about performance until you start to notice a problem. Dedicated hosting and memcached can do wonders for a growing site, and after that optimizing database access is usually the biggest gain, and then there are a choice few things you can do at the scripting level to marginally improve your performance lot. By the time you get down to analyzing the impact of a few conditionals, though, it's just not worth development time to explore performance issues.

As for whether your host is going to get mad at you? Not a chance. Chances are they've got hundreds of nubs pounding MySQL with text-pattern searches on big tables of improperly-indexed Linkin Park slashfic and your site won't be a blip on their radar until you're ready for the IPO.
posted by moift at 11:37 PM on August 24, 2007


5000 page views per week containing a bit of PHP just isn't going to be noticeable to the hosting firm. As others have pointed out, the resource hogs are things like bulletin boards that can have literally thousands of lines of code and tens of database queries per page.
posted by malevolent at 1:20 AM on August 25, 2007


Thanks, everyone. I didn't think about bulletin boards but I can see now how they could be resource hogs.

"Performance analysis" is a good term to know and I'll look into ab since I have apache running locally already.

I just wanted to be very cautious about this since it's for a local non-profit, not my own site.
posted by Tuwa at 4:53 AM on August 25, 2007


Do you happen to know whether you're running PHP as CGI or an Apache module? That, more than any code snippet, will determine whether you're being unkind to the server or not. If it's a PHP module, you're fine. If it's CGI, the mere overhead of invoking a PHP process in the first place is the thing you should be worried about. To check, make a page whose sole content is "<?php phpinfo(); ?>" and check it out in a browser.
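If you'd rather not read through the whole phpinfo() page, PHP's built-in php_sapi_name() reports just the interface in use. A minimal sketch follows; the exact strings depend on the build, but CGI setups typically report a name starting with "cgi" and the Apache module one starting with "apache".

```php
<?php
// Print the Server API name for this request, e.g. "cgi" or "apache".
// Run from the command line it prints "cli" instead.
print php_sapi_name();
```

Save it as its own page and load it in a browser, the same way as the phpinfo() page.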
posted by migurski at 8:46 AM on August 25, 2007


migurski, I didn't know about that. phpinfo.php tells me, among other things, that the _ENV["GATEWAY_INTERFACE"] is CGI/1.1 running from /usr/local/apache/cgi-bin/php.cgi-4.3.1

All of our pages are in PHP (and have been for a few weeks).

Poking around my host's forums, they state that they're using PHP as a CGI script rather than a module for security reasons. Hm.
posted by Tuwa at 8:57 AM on August 25, 2007


I commonly take timestamps at start and end of a request to get an idea of how long things are taking. With PHP5 it's as simple as:

$start = microtime(true);
...
$end = microtime(true);
printf("Request took %.3f seconds", $end - $start);


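In PHP4, microtime() takes no argument and returns a string of the form "usec sec", so replace the microtime(true) calls with calls to:

function getmicrotime() {
  // microtime() returns e.g. "0.12345600 1188000000": fraction first, then whole seconds
  list($usec, $sec) = explode(" ", microtime());
  return ((float)$usec + (float)$sec);
}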


PEAR's Benchmark_Timer generalizes this a bit so you can get timestamps at multiple points.

But yes, your little conditional is utterly irrelevant; on a reasonable server you can run 1-2 million of them in a second, so 5/minute isn't going to bother anyone. Running PHP as a CGI takes *thousands* of times more resources, and at 5000/week even that is going to be lost in the noise.
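If you want to see the conditional's cost on your own machine, a timing loop along these lines would do it. This is a sketch only: time_conditionals() is a hypothetical helper, the loop itself adds overhead, and the number you get will vary from machine to machine.

```php
<?php
// Time a large number of string comparisons like the sidebar's if/else.
// time_conditionals() is a hypothetical helper; results vary by machine.
function time_conditionals($iterations)
{
    $here = '/index.php';
    $hits = 0;
    $start = microtime(true);   // PHP 5+; use getmicrotime() on PHP 4
    for ($i = 0; $i < $iterations; $i++) {
        if ($here == '/index.php') {
            $hits++;
        }
    }
    $elapsed = microtime(true) - $start;
    return array($hits, $elapsed);   // comparisons that matched, seconds taken
}

list($hits, $elapsed) = time_conditionals(1000000);
printf("%d comparisons in %.3f seconds\n", $hits, $elapsed);
```

Run it on a local install, not on the shared host.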

If you are really worried about it, don't use PHP for this; use Server Side Includes (SSI), which can be securely handled by the webserver directly, and should be a couple of orders of magnitude faster than CGI PHP.
posted by Freaky at 9:23 AM on August 25, 2007


The big resource suck is going to be invoking PHP at all (especially since it's run as a CGI instead of as a module loaded into apache). Once the interpreter is started up, evaluating a conditional or two is going to have an immeasurably small cost.

But I agree with everyone else: this isn't worth worrying about until you start actually seeing high loads on the webserver.
posted by hattifattener at 1:05 PM on August 25, 2007


Thanks for the function, Freaky. The server doesn't have PEAR installed and we're not allowed to install it (I asked some time back because I was interested in the templating system for a separate project).

I don't think I'll move everything from PHP back to HTML just to use SSI, simply because it would involve rewriting .htaccess again and I can't be bothered. My host claims they support it and so they should support it, especially since its simpler functions are all I'm using.

hattifattener, just out of curiosity, is each time you put PHP in a page (wrapped in <?php and ?>) a separate invocation of PHP which raises the demands on resources? I just ask because that's how this O'Reilly book shows doing it and so that's what I've been doing.
posted by Tuwa at 5:45 PM on August 25, 2007


Tuwa: No, AIUI the php interpreter will start up, process the whole page, then exit; the incremental cost of dealing with each <?php ... ?> section is small.
posted by hattifattener at 6:19 PM on August 25, 2007


CGI's not the greatest, but everything's already in PHP, so your incremental cost of that one tiny conditional is too small to measure. Don't worry about it.

More generally, the way you measure the burden of proposed code is to try it out on a staging server, hammer the hell out of it with ab or the equivalent, and see if anything keels over. Learning to predict what will or won't be a problem is a matter of experience.

Regarding PEAR, no one needs to install it for you; you can just drop the source code into a directory wherever you want and use it.
posted by migurski at 8:19 PM on August 25, 2007

