Stall on cold boot, sometimes. Help.
September 19, 2005 6:07 PM

IT problem brainstorm: I have a PC that stalls during boot. Help.

The last thing that's displayed is the IDE auto-detection. (Photo of a stalled boot here.) The next thing should be it clearing the screen and displaying that BIOS system summary. This usually occurs during a cold boot, first thing in the morning, and resetting it usually results in it booting on the second try, but not always.

It has had two different network cards. I have moved RAM into different slots. I've re-seated all the cards. I've disabled unused ports (like serial/parallel). I have relatively few other options when it comes to replacing stuff (i.e., no spare parts at hand).

Since this typically only stalls first thing in the morning and not every morning, I get to try one thing each day, and then it might take a week before it stalls again, telling me I tried the wrong thing. Since I'd like to fix it this year, does anyone have any idea where I should look next?
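
To put rough numbers on why an intermittent fault like this is so slow to pin down, here is a minimal sketch. It assumes each cold boot stalls independently with a fixed probability. The 1-in-5 figure is an illustrative guess based on the "about once a week" pattern described above, not a measurement, and the function names are made up for this example.

```python
import math


def chance_of_clean_run(stall_probability, boots):
    """Probability that `boots` consecutive cold boots all succeed
    even though nothing has actually been fixed."""
    return (1.0 - stall_probability) ** boots


def boots_needed(stall_probability, max_false_hope):
    """Smallest number of consecutive clean boots for which a lucky streak
    on a still-broken machine has probability at or below `max_false_hope`."""
    return math.ceil(math.log(max_false_hope) / math.log(1.0 - stall_probability))


# Assumption: roughly one stall per five cold boots (hypothetical figure).
ASSUMED_STALL_PROBABILITY = 0.2

if __name__ == "__main__":
    print(chance_of_clean_run(ASSUMED_STALL_PROBABILITY, 3))
    print(boots_needed(ASSUMED_STALL_PROBABILITY, 0.05))
```

Under that assumption, three clean mornings in a row would happen about half the time on an unfixed machine, and it takes roughly 14 clean cold boots before the odds of a lucky streak drop under 5%.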
posted by krisjohn to Computers & Internet (21 answers total)
Check the power supply. A failing PSU can cause random, non-reproducible errors.
posted by mmcg at 6:28 PM on September 19, 2005

I had a similar boot-stalling problem that turned out to be caused by using the wrong AC adapter on my external USB drive. I could actually disconnect and reconnect at the USB port to continue and re-pause the boot sequence.

No doubt your problem has a different cause, but I am thinking it could be something funny with your power supply. Or, less likely, something weird with some USB peripherals (although you've probably already checked those).

Unfortunately it's rather difficult to confirm a bad power supply without a replacement to test. But by process of elimination I'd say there is a good chance.

Also, maybe it is not the power supply per se but your power source. I have seen cases where computers worked better when plugged into one location vs. another. Not sure how that works exactly though.

Good luck.
posted by umrain at 6:30 PM on September 19, 2005

If it is happening after POST, and there is not an immediate shutdown, the likelihood of it being a PSU problem is not very high. Start with the easiest troubleshooting of all, the last error before lock-up.

Detach your CD-ROM device first, and then boot. I doubt that it's the cause, but it's the cheapest component to replace so you might as well start there. If that doesn't work, then it's possible that your hard drive is problematic. Seagate provides a disk fault detection tool (the downloaded binary will supposedly create a bootable disc). Run the tool, and if it finds a fault, hope that the drive is still under warranty. If so, RMA it while you still have some use (but make sure you back up your important information first), or purchase a new drive if the warranty has expired.
posted by purephase at 6:55 PM on September 19, 2005

Okay, does it beep, and if so, how many times and for what duration? Those beeps are your friends.

You are running the Award Modular BIOS, so using that information and the number/length of beeps you can get some form of diagnosis.

Here is a table.

Although it does sound like a failing power system to me.
posted by stuartmm at 7:00 PM on September 19, 2005

Response by poster: No beeps.

Thanks to everyone for confirming what I was going to try: the CD-ROM, then the power supply. (Though don't let this stop you continuing to suggest things.)

BTW: umrain's comment about the quality of the power supply in one area vs another is a very good point and has been a problem for me in the past, but unfortunately doesn't appear to be the problem this time.
posted by krisjohn at 7:04 PM on September 19, 2005

Since you've been in and out of this box a few times, I'm going to assume you've verified you have good 80-conductor IDE cables, cable routings aren't obviously bad, fans aren't looped off of drive power connectors, and other really obvious stuff that would interfere with the POST process as drives spin up from a cold start. The fact that you say this always happens from a cold start, and that the machine boots OK after a reset, seems to indicate that you might have something happening as a result of current draw on the power supply as drives and other accessories spin up, as others have said. It is pretty easy to stress test a power supply, and cheap to just substitute a new one if you have any doubts. If that doesn't cure the problem, then read on.

In my experience, probably 75% of boot hangs are memory related. Your POST screen reports your memory frequency as 333MHz; could you back this down in your BIOS settings to 266 and see what happens? 845PE chipsets used with low-end Pentium 4 processors like the 2.4GHz in your machine aren't always able to run stably with the clock multipliers for a 133MHz FSB, even with good PC3200 DDR. Sometimes a BIOS upgrade can solve the problem.

An article comparing a lot of 845PE/GE mobos:

If the machine runs stably at the lower frequency, it doesn't necessarily mean that the memory is "bad," just that the 333MHz timing isn't possible with the chipset/memory/AGP/BIOS combination. It's not worth putting a lot of money into a machine of this vintage, and "better" memory isn't guaranteed to solve the problem at the 333MHz timing. There will be a slight performance hit running at the lower memory speed, but I doubt most users on business apps are going to notice it.

If the BIOS offers them, and you feel it's important to run at the highest memory rate, you can try stronger memory drive parameters, higher refresh, or higher voltage, but in a business class machine, the mobo/BIOS options aren't usually those of a full overclocker board.
posted by paulsc at 7:13 PM on September 19, 2005

Also it doesn't hurt (generally) to update the BIOS to the latest. Looks like you're running a Mar 2004 version which is likely pretty old.

I like the power supply theory. At that point in the bootup process it's probably reaching some threshold of power consumption (e.g. firing up the CD drive) that the PS isn't liking. I'm a big fan of not being cheap when buying a power supply.

Here's an informative Slashdot thread about power supplies that I bookmarked long ago.
posted by intermod at 7:16 PM on September 19, 2005

Response by poster: Thanks again.

FWIW, we only get Antec power supplies now. The last PC we ordered I got an Antec Phantom 500W put in there.

I've been reminding myself of the history of this PC and apparently it had a motherboard replacement, after which the problems started. Several suggestions were made as to settings in the BIOS. I think it's time to find a new version. I'll probably try that before I futz with the power supply.
posted by krisjohn at 7:43 PM on September 19, 2005

You've covered all of the obvious (and less common) things, but I didn't see you mention unplugging all the molex connectors (power cable-lets) and re-plugging them in.

I hear/see these problems especially during the tail end of Summer for some reason.

Alternatively, I've seen hangs at that stage before; it sounds like either bad RAM or a software problem. You said that you've moved the RAM around; have you tried swapping in just 1 stick? Do you have a boot CD that you can try to rule out a software problem? Maybe the live CD version of Ubuntu?
posted by PurplePorpoise at 8:24 PM on September 19, 2005

Check the CMOS Battery. These are better than they used to be, but one of our techs mentioned a few months ago that we had a few go bad in the last few years. The only problem I had with one was on a 386 about 10 years ago, but the problem was the same as yours - it wouldn't boot cold, but if you reset a few times the battery had enough charge to complete the process, although the settings were lost.

Very hard to troubleshoot, as most non-techies are either unfamiliar with or scared to mess with the components on the motherboard.
posted by Yorrick at 9:42 PM on September 19, 2005

Presumably it's auto-detecting the IDE devices fairly quickly (i.e. not taking 30+ seconds)? If it's taking its time getting to this point, I'd suspect a failing HDD/CD, or possibly a conflict between the drive & the BIOS autodetect.

Some older boards had similar problems with autodetecting some hard disks - particularly Western Digital drives in their odd "Single Master" mode, but I've also seen it with Seagate & Fujitsu drives. Usually it'd boot OK if you pressed the reset button, but likely lock up again if you power-reset - something to do with device initialisation? The usual solution was to manually configure the disk parameters in the BIOS.

Beyond that, I'd be agreeing with everybody else that it sounds PSU-related - but you say you're pretty sure it's not.

Somewhere around this point is where the BIOS searches for other hardware - add-in cards, etc - and sets up the interrupt vectors & i/o addresses for them. Possibly a faulty/dying sound card or NIC?
posted by Pinback at 9:47 PM on September 19, 2005

I had nearly the same problem (intermittently hanging after BIOS, no beeps) and after checking everything else, found the problem was the PSU. Rather than replacing it, test the voltage at the connectors. In my case it was 7 volts instead of the normal 5.
posted by lasm at 10:04 PM on September 19, 2005

Response by poster: Boot disks (discs) aren't useful because the PC isn't booting, it's hanging before it tries to load any OS.

(It's detecting the IDE devices at a normal speed, i.e., quickly.)

I'm not saying it's not the PSU; in fact there's a good chance that it is. I just didn't have a spare to test. I'll bring my multimeter in tomorrow and test the voltages after the next crash. Good idea, thanks lasm.
posted by krisjohn at 12:08 AM on September 20, 2005

What brand is the motherboard?

I recently had a Tyan motherboard that stopped booting just like that, and the answer was to unplug the keyboard. About 95% of the time it would only boot without the keyboard plugged in. When the BIOS said "No keyboard, press any key to continue" I'd plug in the keyboard and press the space bar. Problem solved.

I know someone else with a similar Tyan problem that was fixed doing a similar workaround with a KVM switch.
posted by aaronh at 5:45 AM on September 20, 2005

I had an MSI motherboard which was doing the same thing (or freezing at various times for no reason). I found out that my particular motherboard had a problem with burst capacitors, which in my case turned out to be the culprit. I had to switch the MB (among other things). Here is the thread that I posted about this.
posted by smcniven at 6:50 AM on September 20, 2005

I had a similar problem. I fixed it by having the floppy or CD-ROM drive, not the HDD, first in the boot order. I read somewhere that it had something to do with the time taken for older hard disks to spin up.
posted by ciaron at 7:21 AM on September 20, 2005

I have an older Dell which will not boot reliably with an IBM keyboard attached to it. Seriously, I am not hallucinating. I have the system on a KVM switch, and if I switch it out (so that the KVM will be emulating the keyboard on boot), it boots fine every time. If it is switched in then the system usually will not boot.

That's an easy one to try, and one not yet mentioned on this thread: just connect a different brand of keyboard.
posted by Invoke at 8:41 AM on September 20, 2005

Response by poster: If anyone's monitoring this thread, it's booted perfectly three out of three times (I have to leave it off for a while or the test is useless) since I unplugged the CDROM drive. Now, it might screw up next boot, but it's looking promising.

I've got my multimeter, but the thing I'm going to look for when I next open the box is to see if when the motherboard was replaced maybe they plugged the CDROM drive and the hard disk into the same power cable...

Thanks to everyone again.
posted by krisjohn at 7:14 PM on September 21, 2005

Response by poster: Well, after five or six good boots, it's hung again, so it's not the CDROM drive. I did notice that the HDD, FDD and CD had all been installed on the one power cable. I've split the drives across three separate cables. I tested the voltages. If my multimeter is to be believed, the 12V line is at 20V and the 5V line is at 8.7V. Will overpower like this cause the problems I'm seeing, or do problems only occur if the voltage is too low?

Next up, BIOS upgrade.
posted by krisjohn at 4:58 PM on September 22, 2005

Couple of things, for those that might still be reading this thread:

A quick check for low-range voltage accuracy on a multi-meter is to measure the voltage of a fresh standard (non-alkaline, non-lithium, etc.) "C" battery. A fresh regular "C" cell will be 1.56 VDC, within a few hundredths of a volt (1.54 to 1.58, typically).

Next, be sure to measure power supply voltages at the mainboard connectors, with the power supply connected to the mainboard. Most power supplies have output load regulation circuits that require the power supply to be delivering at least trickle current. You may need to insert small gauge probe pins into the connectors to test.

Another issue in measuring voltages is that some digital multi-meters tend to display peak voltages in a circuit they are measuring, which has both a DC and an AC component. Modern power supplies are generally switching designs, where a silicon controller element is switched on and off thousands of times a second, for variable length slices of time, to produce fairly regulated "DC." But, in fact, their output is never pure DC; it has some small amount of AC component, at the switching frequency, appearing in the output, "riding" on the DC. Cheap multi-meters put a blocking capacitor in series with the probe on both AC and DC ranges, to protect the meter's internal electronics and assist in auto-ranging, but the AC switching artifact in many supplies will "charge" this capacitor over a few seconds, and you may be seeing this peak voltage. It's not really an issue for the mainboard circuits of well-designed mainboards, which are low impedance to DC, and filter the AC switching artifacts from a power supply, if any, pretty effectively.
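
A toy illustration of the average-versus-peak distinction, not a model of any real meter or supply: it samples a steady DC level with a small sine-wave ripple riding on it and compares the mean with the maximum. The 5 V level matches the logic rail discussed in this thread; the 0.05 V ripple amplitude and the function names are assumptions picked purely for the example.

```python
import math


def sample_rail(dc_volts, ripple_amplitude, periods=10, samples_per_period=100):
    """Return evenly spaced samples of a DC level with sine ripple on top."""
    total = periods * samples_per_period
    return [
        dc_volts + ripple_amplitude * math.sin(2.0 * math.pi * k / samples_per_period)
        for k in range(total)
    ]


def average_and_peak(samples):
    """Return (mean, maximum) of the sampled voltages."""
    return sum(samples) / len(samples), max(samples)


if __name__ == "__main__":
    # Hypothetical 5 V rail with 0.05 V of ripple amplitude.
    average, peak = average_and_peak(sample_rail(5.0, 0.05))
    print("average:", round(average, 3), "peak:", round(peak, 3))
```

In this model the gap between the two readings equals the ripple amplitude, so how far a peak-reading meter overstates the rail depends on how much ripple the supply actually produces.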

Most mainboards of the vintage you are working with have DC-to-DC converter modules that take the 5 VDC logic supply from the power supply, and produce lower stepped voltages of around 3.3 VDC to operate the CPU and the memory logic. These modules can tolerate higher voltages than 5 VDC being delivered from the power supply, but can't work if the power supply is dropping to less than 4 VDC or so. 8.7 VDC sounds pretty hot, though, for a loaded 5 VDC leg, and some power supplies will "clamp" their outputs with a "last gasp" protection circuit that tries to protect the mainboard in the event the power supply fails internally. Power supplies are pretty cheap; if you have any doubts, the best way to troubleshoot is to replace a suspect unit with a known good one.
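
As a rough sketch of the checks described above, the following first sanity-checks the meter against the fresh "C" cell window (1.54 to 1.58 V) and then compares rail readings with their nominal values. It assumes the +/-5% tolerance that the ATX specification allows on the positive rails; the function names are invented for this example, and the sample readings are the 20 V and 8.7 V figures reported earlier in the thread.

```python
# Assumption: +/-5% tolerance on the positive rails, per the ATX specification.
RAIL_TOLERANCE = 0.05
NOMINAL_RAILS = {"+12V": 12.0, "+5V": 5.0, "+3.3V": 3.3}

# Expected window for a fresh standard "C" cell, as quoted in the post above.
C_CELL_WINDOW = (1.54, 1.58)


def meter_looks_sane(c_cell_reading):
    """True if the meter reads a fresh "C" cell inside the expected window."""
    low, high = C_CELL_WINDOW
    return low <= c_cell_reading <= high


def classify_rail(rail, measured_volts):
    """Return "low", "ok" or "high" for a reading taken under load."""
    nominal = NOMINAL_RAILS[rail]
    low = nominal * (1.0 - RAIL_TOLERANCE)
    high = nominal * (1.0 + RAIL_TOLERANCE)
    if measured_volts < low:
        return "low"
    if measured_volts > high:
        return "high"
    return "ok"


if __name__ == "__main__":
    # Readings reported earlier in the thread.
    for rail, volts in (("+12V", 20.0), ("+5V", 8.7)):
        print(rail, volts, classify_rail(rail, volts))
```

Both reported readings come out far above the tolerance band, which is why it is worth confirming the meter itself before trusting them.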
posted by paulsc at 1:50 AM on September 24, 2005

Response by poster: Hung again.

Looks like I've got the latest BIOS.

Looks like I'll have to try a different power supply. Thing is, I can't get a purchase order unless I'm sure it will fix the problem.
posted by krisjohn at 8:54 PM on September 28, 2005
