Is it a bad server? Or am I a bad coder?
July 13, 2005 3:20 PM

I build websites. One of them runs very, very slowly, but only on the client's server. I'm pretty sure it's because their server needs more memory. But I'm a mediocre sysadmin, and I want to be absolutely certain I know what I'm talking about before I blame the hardware for what could be a problem with my software. Help reassure me that I'm not insane.

This is the information I get using "top" on their server, when my code is not running:
Mem: 254048k av, 249828k used, 4220k free, 0k shrd, 32828k buff
157100k actv, 31356k in_d, 38568k in_c
Swap: 1044208k av, 114004k used, 930204k free, 124536k cached

Is ~256MB RAM unusually stingy? And is the fact that only 4MB are free even in an idle state a bad sign? Also, the fact that that much swap space is in regular use: that's a Bad Thing, right?

Or are those reasonable numbers for a (low-budget) webserver?
posted by ook to Computers & Internet (13 answers total)
Low free memory on a *NIX server isn't necessarily bad; many will cache disk reads very close to full memory.
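
To put numbers on that, here is a rough Python sketch that pulls the figures out of the "top" output quoted in the question and adds up free, buffer, and cache memory. Treating buffers and cache as memory the kernel can hand back on demand is an approximation, and it assumes the "cached" figure printed on the Swap line is the page cache, which is how older versions of top lay it out. The function names are made up for illustration.

```python
import re

# The three lines quoted in the question, copied as-is.
TOP_OUTPUT = """\
Mem: 254048k av, 249828k used, 4220k free, 0k shrd, 32828k buff
157100k actv, 31356k in_d, 38568k in_c
Swap: 1044208k av, 114004k used, 930204k free, 124536k cached
"""


def parse_top_memory(text):
    """Return the figures from top's memory lines as a dict of kB values."""
    fields = {}
    prefix = ""
    for line in text.splitlines():
        if line.startswith("Mem:"):
            prefix = "mem_"
        elif line.startswith("Swap:"):
            prefix = "swap_"
        for value, label in re.findall(r"(\d+)k (\w+)", line):
            # "buff" and "cached" only appear once, so they get no prefix.
            key = label if label in ("buff", "cached") else prefix + label
            fields[key] = int(value)
    return fields


def reclaimable_kb(fields):
    """Free memory plus buffers and cache (assumed reclaimable on demand)."""
    return fields["mem_free"] + fields["buff"] + fields["cached"]


fields = parse_top_memory(TOP_OUTPUT)
available = reclaimable_kb(fields)
share = available / fields["mem_av"]
print(f"{available}k of {fields['mem_av']}k ({share:.0%}) is free, buffers, or cache")
```

With the numbers from the question that comes to 161584k out of 254048k, so by this rough estimate well over half of RAM is free or holding cache rather than being tied up by running programs.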

What you really want to look at is the number of pageins/pageouts when you're experiencing the slowdowns. If they're rapidly increasing, you're paging, and more memory will help dramatically.
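
One way to watch that, as a minimal sketch: sample the kernel's cumulative swap counters twice and look at the difference. This assumes a Linux kernel that exposes "pswpin" and "pswpout" in /proc/vmstat (2.6 and later); an older kernel, which the box in the question may well be running, reports these elsewhere, and "vmstat 1" shows the same thing in its si/so columns. The function names are hypothetical.

```python
import time


def parse_swap_counters(vmstat_text):
    """Extract cumulative swap-in/swap-out page counts from /proc/vmstat text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters


def swap_rate(before, after, seconds):
    """Pages swapped in and out per second between two samples."""
    return {name: (after[name] - before[name]) / seconds for name in before}


def measure_swap_rate(interval=5.0, path="/proc/vmstat"):
    """Take two samples `interval` seconds apart and return the rates."""
    with open(path) as f:
        before = parse_swap_counters(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_swap_counters(f.read())
    return swap_rate(before, after, interval)
```

Call measure_swap_rate() once while the box is idle and once while the slow script is running. Rates that sit near zero in both cases point away from paging; rates that jump during the slowdown point toward it.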

256MB is low for a server, but is enough for many tasks.
posted by trevyn at 3:44 PM on July 13, 2005

It's normal for your swap space to be about 4x your physical ram, so that's not an indication of anything amiss. The fact that you only have 4MB physical ram free is, however. Unless the processor is slow or unduly burdened by other tasks, you'll most likely see a benefit from doubling (or more) the RAM.
posted by pmbuko at 3:44 PM on July 13, 2005

It's only buffering 32MB, and it's swapping.

Not enough RAM. An extra 256MB would do wonders for this box.
posted by 5MeoCMP at 3:52 PM on July 13, 2005

While it is true that you should spend the $50 to get more ram, this may not be the primary cause of the slowdown. Try doing this:

Use wget (on the server itself) to download a single object that does not require database calls or other stuff. A small image is best. Do this 3 times in rapid succession while the server has no other traffic and isn't doing anything else. If the first download was slow, but the 2nd and 3rd were fast, you don't have enough ram. If all 3 were slow, it's probably something else.

Try the same thing, but from a different box on the same lan, and just keep doing it over and over. If some downloads are slow and some are fast, you probably have a network problem.

Now move to a remote box, like one at home, and do the same. If downloads are randomly slow here, but not on the lan, the internet connection may have some problem. If the first download was slow, but others were fast, and all downloads from the box itself (in the first test) were fast, your server might be trying to do DNS lookups on the fly.

If the image you are using to test downloads fast in all tests, but the HTML itself is slow, the server is probably waiting for a backend database or some other craziness.
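
If you'd rather script the tests above than eyeball wget, something like this Python sketch would do. It times a few back-to-back fetches and names the pattern; what the pattern means still depends on where you ran it (on the server, on the lan, or remotely). The one-second cutoff for "slow" is an arbitrary assumption, and both function names are made up for illustration.

```python
import time
import urllib.request


def time_fetches(url, attempts=3):
    """Download `url` several times in a row and return each duration in seconds."""
    timings = []
    for _ in range(attempts):
        start = time.perf_counter()
        with urllib.request.urlopen(url) as response:
            response.read()
        timings.append(time.perf_counter() - start)
    return timings


def describe_pattern(timings, slow_seconds=1.0):
    """Name the slow/fast pattern in a list of download times."""
    slow = [t > slow_seconds for t in timings]
    if not any(slow):
        return "all fast"
    if all(slow):
        return "all slow"
    if slow[0] and not any(slow[1:]):
        return "first slow, rest fast"
    return "intermittently slow"
```

For example, describe_pattern(time_fetches(url)) with the address of a small static image on the server: "first slow, rest fast" on the server itself matches the not-enough-ram case, and "intermittently slow" from another box on the lan matches the network case.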
posted by darkness at 3:57 PM on July 13, 2005

this doesn't look good, but if you want to be 100% sure you need to run the code. the killer information is starting your code and seeing kswapd get a pile of cpu time, continuously - that means that your code requires more things in memory than can fit there when needed.

the information here shows that the machine is already using swap more than looks healthy, but that's not necessarily bad if it's doing the paging efficiently. the problem - thrashing - occurs when it has to page stuff out that it actually needs. paging stuff that the system doesn't need rapidly is not that bad (although these days with memory so cheap it's perhaps less common than it was).
posted by andrew cooke at 3:58 PM on July 13, 2005

sorry, bad explanation - kswapd will appear high on the top list, but it probably won't use that much cpu time itself. at the same time, you'll see that total cpu usage is nowhere near 100%.
posted by andrew cooke at 4:02 PM on July 13, 2005

Something isn't clear to me here. I keep reading "your code". Was actual code written for this website (C, Perl, PHP), or is this just HTML served by apache?
posted by darkness at 4:06 PM on July 13, 2005

I know that the odds are slim, but I recently had a similar problem on one of my boxes. It turned out to be a crappy .htaccess that I'd set up. Between nesting my webroots and domain-specific filtering (without IP lookups in Apache), my server was down to a crawl. Fixed the .htaccesses, and all was well.
posted by waldo at 4:30 PM on July 13, 2005

Response by poster: I explained this very, very poorly. Sorry about that.

I'm not the hosting provider or the official sysadmin; I'm trying to fix a problem that's occurring on someone else's (probably cheaply hosted) server. The code is actual perl code, not HTML. It's definitely not a network or apache config problem. The scripts in question are necessarily memory-intensive and are currently handling about ten times as much data as they were designed for (old code, written as a quick temporary stopgap, which they're still using five years later. Go figure) so I don't expect them to run smoothly; what I'm trying to figure out is why they run so much more poorly on this particular machine than they do elsewhere. When I saw those low RAM numbers, I got hopeful that that might be it.

Andrew is making me think I may be on the wrong track, though; when I run the scripts they take up unreasonably large chunks of CPU (as high as 98%), memory (26MB) and time (~30 seconds) -- vastly more than when the same scripts with the same data are run on other machines. But kswapd isn't showing any activity at all -- does that rule out low memory as the source of the problem?
posted by ook at 9:45 PM on July 13, 2005

Yeah... it could easily be the shared box. That's the real problem with those types of hosting packages -- you never really know what's going on with the rest of the box and your ability to diagnose/fix things is pretty limited.

Really, it could be anything.

That said, if the code runs perfectly under similar circumstances elsewhere, then it's reasonable to conclude that there are hardware issues at play.

One dumb thing that's always made a huge difference for me, though, is database indexing. A well indexed db can perform stunningly faster than an unindexed one.
posted by ph00dz at 5:02 AM on July 14, 2005

hmmm. that's surprising, but you're right, i think - if they are using lots of cpu then they are doing something, so they're not sitting there waiting for stuff to be moved to and from memory/disk.

the only exception would be if your code was written in an odd way - for example, if it uses non-blocking reads in a tight loop. normally when you read a file, you call a function and get a line or file or whatever back. but some functions take an array as an argument and return a number which is how much data was put into the array. then you keep calling the same routine again and again until it has all the data. if you have code like that then it can be going round in a little loop trying again and again, using cpu while the machine is paging. but you would see kswapd all the same.
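
to make that concrete, here's a small sketch of the two styles. it's python rather than perl, it's unix-only, and the pipe with a delayed writer is just a stand-in for slow i/o, so treat all the names as hypothetical. the first function spins on a non-blocking read and counts how often it came back empty-handed; the second sleeps until the data is actually there.

```python
import os
import select
import threading
import time


def write_later(fd, payload, delay):
    """Stand-in for slow I/O: the data only shows up after `delay` seconds."""
    time.sleep(delay)
    os.write(fd, payload)
    os.close(fd)


def start_slow_pipe(delay, blocking):
    """Return (read_fd, writer_thread) for a pipe that gets data after `delay`."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, blocking)
    writer = threading.Thread(target=write_later, args=(write_fd, b"done", delay))
    writer.start()
    return read_fd, writer


def busy_poll_read(delay=0.05):
    """Non-blocking read in a tight loop. Returns (data, failed_attempts)."""
    read_fd, writer = start_slow_pipe(delay, blocking=False)
    failed_attempts = 0
    data = b""
    while not data:
        try:
            data = os.read(read_fd, 1024)
        except BlockingIOError:
            failed_attempts += 1  # nothing there yet, so spin and retry
    writer.join()
    os.close(read_fd)
    return data, failed_attempts


def waiting_read(delay=0.05):
    """Sleep in select() until data is ready. Returns (data, failed_attempts)."""
    read_fd, writer = start_slow_pipe(delay, blocking=True)
    select.select([read_fd], [], [])  # the process is idle while it waits
    data = os.read(read_fd, 1024)
    writer.join()
    os.close(read_fd)
    return data, 0
```

both return the same data, but busy_poll_read() racks up a large count of failed attempts while it waits, and every one of those is cpu time spent doing nothing useful. that's the kind of loop that can show high cpu while the real delay is somewhere else.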

someone mentioned dns resolution earlier. that might be the problem here if it is happening in your code, even if you are using the same config as elsewhere. it could be that dns resolution is broken on that machine. try "nslookup" a few times. after the first couple of tries, it should return instantly with the answer. hmm. but if it was that, cpu would probably drop.
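
if nslookup isn't handy, a rough equivalent in python is below. note that it goes through the system resolver (the same path most programs use) rather than querying a dns server directly the way nslookup does, so the two can disagree. the function name and the `resolver` parameter are made up for illustration.

```python
import socket
import time


def time_lookups(hostname, attempts=3, resolver=socket.gethostbyname):
    """Resolve `hostname` several times and return how long each try took."""
    timings = []
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            resolver(hostname)
        except OSError:
            pass  # a failed lookup still tells us how long the attempt took
        timings.append(time.perf_counter() - start)
    return timings
```

run it with a hostname the scripts actually look up. if every attempt takes seconds rather than milliseconds, name resolution on that machine deserves a closer look.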

sorry, no more ideas. i may be wrong about the cpu memory thing, of course. anyone want to confirm my reasoning? it is suspicious that this is happening on a machine with little memory that has a fair amount paged out...
posted by andrew cooke at 6:54 AM on July 14, 2005

Best answer: you know, memory is pretty cheap. can't you get them to stick an extra 256MB in anyway?! :o)
posted by andrew cooke at 6:55 AM on July 14, 2005

Response by poster: I'm pretty much going with a combination of that, and "You guys have been using this temporary code for five years now, maybe it's time for you to, you know, stop using it..." :)

Thanks for the brainpower though. I used to work in a room full of guys who were smarter than me, so I could just holler this kind of question over my shoulder whenever I got stuck. That's no longer the case, but is filling the gap nicely...
posted by ook at 7:16 AM on July 14, 2005
