

My processes, they are
November 27, 2012 12:36 PM

I've been learning a (very) little bit about Unix and AIX while doing some maintenance on our company's ERP server. I've been finding a lot of defunct processes. I have a few very n00b questions.

I've been listing the processes by user (using ps -fu <username>) and also systemwide (using ps -aef) and I see a lot of defunct processes all day long. Sometimes I attempt to kill the processes (usually using kill -5 <pid> <ppid>, but sometimes have to use kill -9 <pid> <ppid>). But killing these defunct processes is like a game of whack-a-mole: as soon as I kill one, another pops up. So I was wondering...

1) Do these defunct processes need to be killed?
2) Are these defunct processes hurting anything, other than a minor drag on our CPU?
3) Is it common to have a lot of defunct processes on a medium-size system with ~30+ users?
4) Does having a lot of defunct processes say anything about the stability of our ERP system?
posted by slogger to Computers & Internet (8 answers total)
 
It often helps to search on the name of the process, to see how it is tied to a running service. It may be important for a running service to know when a child process is finished.
posted by Blazecock Pileon at 12:42 PM on November 27, 2012


I'm assuming it's the ERP that's the parent process of them?

Can you easily tell what the zombie process was meant to do? (e.g. by testing on the ERP when no one is using it, pulling the command line for it, etc.)

Before you started doing this, was anything bad happening? (i.e. are you killing them to try to fix a problem?)

It's bad form for a program to leave zombies around. Unknown if the ERP eventually wait()s them or not.
posted by k5.user at 12:44 PM on November 27, 2012


Just an FYI: killing processes with -9 can be a bad idea in some cases, because it bypasses any signal trapping the program might do to clean up after itself. That could leave lock files or other temporary files behind, or otherwise leave things in an intermediate state the software doesn't expect when it starts up again. Though of course sometimes you have no other choice.
posted by XMLicious at 12:46 PM on November 27, 2012


You also might get more information by looking at those defunct processes to find the parent's PID (the PPID), and then figuring out what the parent is doing.
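
One way to do that lookup in bulk, as a rough sketch in Python: group the defunct entries in `ps -ef` output by PPID. This assumes the usual `ps -ef` column order (UID, PID, PPID, ...) and that zombies show `<defunct>` in the command column; the sample listing, the user `erpusr`, and the process name `erp_main` are made up for illustration.

```python
import subprocess

# Hypothetical sample of `ps -ef` output, for illustration only.
SAMPLE = """\
     UID     PID    PPID   C    STIME    TTY  TIME CMD
  erpusr  413890  209112   0 09:14:02      -  0:00 <defunct>
  erpusr  413902  209112   0 09:14:05      -  0:00 <defunct>
  erpusr  209112       1   0 08:00:11      -  1:23 erp_main
    root  520044  301010   0 09:20:40      -  0:00 <defunct>
"""

def defunct_by_parent(ps_output):
    """Map each parent PID to the list of its defunct child PIDs."""
    parents = {}
    for line in ps_output.splitlines():
        if "<defunct>" not in line:
            continue
        fields = line.split()
        try:
            pid, ppid = int(fields[1]), int(fields[2])
        except (IndexError, ValueError):
            continue  # skip lines that do not match the assumed layout
        parents.setdefault(ppid, []).append(pid)
    return parents

if __name__ == "__main__":
    # Run against the live system instead of the sample.
    live = subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout
    for ppid, pids in sorted(defunct_by_parent(live).items()):
        print(f"parent {ppid}: {len(pids)} defunct children {pids}")
```

A parent PID that keeps turning up with many defunct children is the process to investigate.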
posted by wenestvedt at 12:56 PM on November 27, 2012


Agreed that you need to determine the parent of these processes. The defunct children are waiting to be reaped by the parent, at which point the parent can learn their exit status and they disappear. Many defunct processes from a single parent often indicate a poorly written parent (one failing to set a SIGCHLD handler). Each child holds a slot in the process table, but I believe the child's heap and stack have already been deallocated.
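
For anyone curious what a well-behaved parent looks like, here is a minimal sketch in Python (not the ERP's actual code, which nobody here has seen). It installs a SIGCHLD handler that reaps every exited child with a non-blocking `waitpid`, so no zombies accumulate. The names `reap_children`, `spawn_child`, and `reaped` are mine.

```python
import os
import signal
import time

reaped = {}  # pid -> exit code, filled in by the handler

def reap_children(signum, frame):
    """SIGCHLD handler: collect every child that has already exited."""
    while True:
        try:
            # -1 means "any child"; WNOHANG means "do not block".
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # this process has no children left
        if pid == 0:
            return  # children exist, but none has exited yet
        reaped[pid] = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None

def spawn_child(exit_code):
    """Fork a child that exits immediately with the given code."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child: exit without running parent cleanup
    return pid

if __name__ == "__main__":
    signal.signal(signal.SIGCHLD, reap_children)
    pids = [spawn_child(code) for code in (0, 3, 7)]
    deadline = time.time() + 5
    while len(reaped) < len(pids) and time.time() < deadline:
        time.sleep(0.05)
    print(reaped)
```

The loop inside the handler matters: several children can exit before the handler runs once, so a single `waitpid` call per signal could still leave zombies behind.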

Having said that, it is very possible the parent isn't using the exit status as a communication mechanism. Personally, I would document the when/where/what and kill away. (Your mileage may vary.)

Yes, you can halt the system with too many (a very large number of) unreaped processes. (I still have chills about 'that' day!)

Does the number continue to rise, or does it periodically dwindle?
posted by njk at 1:17 PM on November 27, 2012


njk: it appears relatively stable for the one account that I'm monitoring. It's usually around 3 defunct processes and the parent (of the one that I'm looking at right now, anyway) is the main module of the ERP.
posted by slogger at 1:49 PM on November 27, 2012


Defunct processes only take up one slot each in the process table; they consume no other resources. You cannot kill defunct processes, because in fact they are not real processes. That's why they're called zombies.
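
A small Python sketch that shows this, assuming a POSIX system where signalling a zombie is accepted but has no effect: the child exits, the parent deliberately does not wait, and even SIGKILL leaves the process table entry in place until the parent finally calls `waitpid`. The function name is mine, and the short sleeps are a crude way to let the child exit first.

```python
import os
import signal
import time

def zombie_survives_sigkill():
    """Return (still_in_process_table_after_kill, exit_code_seen_by_parent)."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child exits at once and becomes a zombie

    time.sleep(0.2)               # give the child time to exit
    os.kill(pid, signal.SIGKILL)  # accepted, but a zombie has nothing left to kill
    time.sleep(0.2)

    try:
        os.kill(pid, 0)           # signal 0 only checks that the PID exists
        still_there = True
    except ProcessLookupError:
        still_there = False

    # Only the parent's wait removes the entry from the process table.
    _, status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
    return still_there, exit_code

if __name__ == "__main__":
    print(zombie_survives_sigkill())
```

The exit code the parent sees is still the child's original one, because the child had already finished before the signal arrived.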

If you really want to get rid of them, find their parent and kill it. All those zombies will then be inherited by the system process init, which will collect their exit status, and they will disappear.
posted by phliar at 5:06 PM on November 27, 2012 [1 favorite]


What phliar said. Just leave them alone, they're not doing any harm.
posted by devnull at 5:55 AM on November 28, 2012

