Ubuntu 8.04 LTS Disk Power Management
February 26, 2009 9:47 AM

I'm having a problem with an Ubuntu 8.04 LTS server that seems to have stumped the stars on the Ubuntu forums. It deals with disk drives and power management.

I'm seeing a strange problem since I updated to 8.04 LTS that I never saw before. I have a Dell PowerEdge 1600SC server with a SCSI disk system. Although it's used primarily as an Apache and Samba server, I do have X11/GNOME installed so I can use NX to administer it from my desktop.

Every now and then it drops offline (pings still work, however), and when I go to the console I can't log in. At that point X usually crashes and I can see system messages indicating an I/O error with sd. I can never get a command prompt and have to hit the reset button. Once it comes back up, there are no errors in any of the logs. It looks to me as if the disk is being spun down during long idle periods and not coming back up. Once rebooted, it runs fine for weeks at a time.

Is there something I should be setting/unsetting or loading/unloading to deal with this? I realize having it configured more like a desktop is probably not ideal but it makes things a lot more handy for me. Still, I'm willing to do what it takes to prevent this issue from happening again.
posted by tommasz to Computers & Internet (9 answers total)
 
It may be helpful to post the exact error messages and surrounding messages from dmesg and syslog.

The cheesy test to prove whether your theory about the drive spinning down is right would be to set up a cronjob that cats a small bit of data to the end of a file every 10 minutes or so, so the drive doesn't spin down. It's not a substitute for fixing the problem, but it may give you results today.
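
A minimal sketch of that keepalive, assuming a script installed at the hypothetical path /usr/local/bin/disk-keepalive.sh and a scratch file at /var/tmp/disk-keepalive (both names are made up for illustration; the file must live on the disk you suspect):

```shell
#!/bin/sh
# Hypothetical keepalive: append a timestamp to a file and flush it to disk,
# so the write actually reaches the drive instead of sitting in the page cache.
keepalive() {
    file="${1:-/var/tmp/disk-keepalive}"   # assumed path, change to suit
    date >> "$file" && sync
}

# Example crontab entry (every 10 minutes), assuming the script ends by
# calling keepalive:
#   */10 * * * * /usr/local/bin/disk-keepalive.sh
```

If the hangs stop while this runs, that points toward spin-down; if they continue, the theory is probably wrong.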
posted by eschatfische at 9:53 AM on February 26, 2009


Response by poster: In order to post the messages, I would have to copy them down from the console screen since they do not get written to any of the log files (which is what makes me think the drive is spun down). If it does it anytime soon, I will post that.

I already have a couple of cron jobs running, including one that runs hourly.
posted by tommasz at 10:04 AM on February 26, 2009



>>a cronjob that cats a small bit of data to the end of a file every 10 minutes or so
>> so the drive doesn't spin down


>I already have a couple of cron jobs running, including one that runs hourly.


60 != 10
posted by snuffleupagus at 10:38 AM on February 26, 2009


If the disk is spinning down, then shut off disk power management.

I'm not sure what Ubuntu uses for this, but I'm sure it's configurable.
posted by phrakture at 11:25 AM on February 26, 2009


Best answer: If it's a server you should be able to disable ACPI. That will disable all power-saving features.
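
One common way to do this on Linux, assuming that is what is meant here, is to boot the kernel with the acpi=off parameter. A small sketch for confirming after a reboot that the running kernel really was started that way (the function name is made up for illustration):

```shell
#!/bin/sh
# Returns success if the kernel command line contains acpi=off.
# The optional argument exists only so the check can be tried on a sample file.
acpi_disabled() {
    cmdline_file="${1:-/proc/cmdline}"
    # Split the space-separated boot parameters onto separate lines and
    # look for an exact match.
    tr ' ' '\n' < "$cmdline_file" | grep -qx 'acpi=off'
}

# Usage:
#   acpi_disabled && echo "ACPI is off" || echo "ACPI is still enabled"
```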
posted by damn dirty ape at 11:59 AM on February 26, 2009


Could also be a bad disk. Have you checked the SMART failed spinup attribute yet?
posted by damn dirty ape at 12:00 PM on February 26, 2009


Response by poster: I've disabled ACPI so I'll have to see how it goes. I don't believe the drive supports SMART or at least I can't easily determine if it does, and it's been way too long since I've played with SCSI for me to even try.
posted by tommasz at 12:25 PM on February 26, 2009


Checking the SMART status is easy. First, apt-get install smartmontools, and then smartctl -a /dev/sdX.
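
A rough sketch of those steps as a shell function, to be run as root. The function name is made up, and /dev/sdX remains a placeholder for the real device. The -i and -H calls are additions to the single -a command above: -i prints the drive's identity information, which includes whether it reports SMART support, and -H prints its overall health assessment.

```shell
#!/bin/sh
# Sketch: query a disk with smartctl (from the smartmontools package).
smart_check() {
    dev="${1:?usage: smart_check /dev/sdX}"
    if ! command -v smartctl >/dev/null 2>&1; then
        echo "smartctl not found; run: apt-get install smartmontools" >&2
        return 1
    fi
    smartctl -i "$dev"   # identity info, including SMART support
    smartctl -H "$dev"   # overall health self-assessment
    smartctl -a "$dev"   # everything the drive reports
}

# Usage (as root), substituting the real device name:
#   smart_check /dev/sdX
```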

Also, to make the debug messages easier to read, once the machine starts up, drop down to a tty with Ctrl-Alt-F1 (X will still be running, so you can still do your NX bits). Then you can run setterm -powerdown 0 -blank 0 > /dev/tty0 to disable console blanking. If the machine hangs again, you can just turn the monitor on and you should see all the debugging output.

Also, a digital camera of some kind (even if it's a cell phone one) makes transcribing the errors unnecessary.
posted by paulus andronicus at 1:39 PM on February 26, 2009


You've got a serial port on this, right? Hook that shit up; we're hunting failure. If you don't have something like Raritan to do this remotely, hook it up directly and use minicom and start waiting. You should get a lot of diagnostic-worthy material out of that, possibly more than a mere digital snapshot will capture if the output scrolls past a single screen.
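
A hedged sketch of one way to set that up. The port (ttyS0), the speed (115200) and the capture file name are all assumptions, not details from this thread. The function only prints the kernel boot parameters that would mirror kernel messages to the serial port; they still have to be added to the kernel line in the boot loader configuration by hand.

```shell
#!/bin/sh
# Build kernel boot parameters that send kernel messages to both the
# local screen (tty0) and a serial port. Port and baud rate are assumptions.
serial_console_args() {
    port="${1:-ttyS0}"
    baud="${2:-115200}"
    echo "console=tty0 console=${port},${baud}n8"
}

serial_console_args

# On the machine at the other end of the null-modem cable, something like
# this should open the port and log everything received to a file:
#   minicom -D /dev/ttyS0 -b 115200 -C serial-console.log
```

With the capture file in place, whatever the kernel prints before the hang is preserved even after it scrolls off the console.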
posted by pwnguin at 4:00 PM on February 26, 2009

