RAID 5 on the motherboard?
April 13, 2009 6:52 PM

RAID 5 on the (Gigabyte) motherboard in Ubuntu Server?

I'm putting together a home file server that will run Ubuntu Server, and I want RAID 5. It looks like this Gigabyte motherboard does 6x SATA RAID 5, and a PCI-e RAID controller would cost much more than this option. Any ideas if A) RAID 5 on the motherboard is useful/viable, and B) if this particular Gigabyte motherboard would work well in Ubuntu (or CentOS)? And if not, any recommendations?

Thanks!
posted by xmutex to Technology (18 answers total) 2 users marked this as a favorite
 
I speak in generalities -- but SATA RAID cards are $$ because they do hardware RAID, which is generally faster (there's a dedicated CPU to do the checksum calculations, extra cache, etc.).

If you're looking at a software RAID board like the Gigabyte -- why not just run the motherboard in JBOD mode and use Linux's built-in software RAID?
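Creating the array with mdadm is basically a one-liner. A rough sketch only -- the device names, filesystem, and mount point here are placeholders, so adjust for your own hardware:

    # Build a 4-disk RAID 5 out of one partition per disk (placeholder names):
    sudo mdadm --create /dev/md0 --level=5 --raid-devices=4 \
        /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1

    # Put a filesystem on the array and mount it:
    sudo mkfs.ext3 /dev/md0
    sudo mkdir -p /srv/storage
    sudo mount /dev/md0 /srv/storage

    # Watch the initial resync progress:
    cat /proc/mdstat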

I run software RAID 5 across 3 x 640GB drives and it's more than fast enough to let the girlfriend watch Blu-ray rips while I run my little VMware development environment, play MP3s, etc.

That being said -- my GA-EP45-UDR3 + E8600 has been rock solid.
posted by SirStan at 7:13 PM on April 13, 2009


Unless there is some stroke of luck, I'm guessing that the RAID on that motherboard is fakeraid. Basically, the BIOS makes it look like there's an actual RAID controller, when in reality it's just software RAID that gets a little configuration help from the BIOS.

When you're talking RAID 5, using consumer-grade disks, on a home workstation, software RAID is plenty fast. I wouldn't even bother with fakeraid (unless, and this is a big unless, the Ubuntu installer doesn't let you configure booting from RAID and that's what you want; I remember that being an issue, and if that's the case, perhaps fakeraid is the way to go). I've never had good luck with fakeraid, but I'm a brat when it comes to that kind of stuff. I prefer Linux software RAID.

Oh, and for a server, I ALWAYS prefer CentOS over Ubuntu.
posted by Geckwoistmeinauto at 7:16 PM on April 13, 2009


I don't think the built-in "fake" RAID on AMD/ATI chipsets is supported by Linux's dmraid. You're probably better off using Linux's own software RAID support. And if you do that, you can use any motherboard with enough SATA ports and not worry about RAID support on the motherboard.
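(If you want to check whether dmraid recognizes the metadata at all, it's easy enough to look. This assumes the dmraid package is installed; the commands only read metadata, they don't change anything:)

    # List any fakeraid sets described by on-disk BIOS RAID metadata:
    sudo dmraid -r

    # Show the status of any sets it found:
    sudo dmraid -s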
posted by zsazsa at 7:18 PM on April 13, 2009


Agreed with the above commenters on most counts. The Gigabyte mobo has "fake" RAID. Real RAID costs real money. "Fake" RAID is worse than pure software RAID.

The only point where I tend to disagree is that software RAID is "fast enough". This seems to be true for me on my RAID 1 system. However, my current desktop has RAID 5 (3 x 500GB), and I'm seriously disappointed in the performance. I can't wait to figure out how to switch back to a RAID 1 or similar setup. It's at a point where I've created a 2GB ramdisk (giving up half my RAM) just so I can do my C++ development work on a fast "disk". Cut my link times way down...
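(For anyone curious about the ramdisk trick: it's just a tmpfs mount, roughly like this -- the size and path are whatever suits you:)

    # A 2GB RAM-backed scratch "disk" for builds; contents vanish on reboot.
    sudo mkdir -p /mnt/ramdisk
    sudo mount -t tmpfs -o size=2G tmpfs /mnt/ramdisk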

I may be generalizing too much from my experience, but that's why you get multiple answers to your question. The software RAID I'm using in both cases is the Linux kernel's md driver (i.e. mdadm).
posted by knave at 7:32 PM on April 13, 2009


As an aside, knave, if you've only got three spindles, you're just not going to see fantastic read performance, especially with consumer-grade disks. With small writes like you'd see during a compile session, you're never going to see more than 75% of single-disk performance. What's your stripe size? What disks are you using? I've spent way too long tuning mdadm RAID groups.
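A few of the knobs I usually look at first, for reference (the values below are examples, not recommendations -- benchmark before and after):

    # Check the array's chunk size and layout:
    sudo mdadm --detail /dev/md0 | grep -i -E 'chunk|layout'

    # Enlarge the RAID 5 stripe cache (number of cache entries; memory use
    # scales with page size times the number of member disks):
    echo 4096 | sudo tee /sys/block/md0/md/stripe_cache_size

    # Raise read-ahead on the md device (units are 512-byte sectors):
    sudo blockdev --setra 4096 /dev/md0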
posted by Geckwoistmeinauto at 7:45 PM on April 13, 2009


I don't see the benefit of fakeraid. If you can't afford RAID 5 (which is super overkill for a home server), then get yourself a decent hardware RAID 1 card and call it a day.
posted by damn dirty ape at 7:49 PM on April 13, 2009


Yeah, I definitely wouldn't use software RAID 5 on a desktop. It has been plenty fast for me on a personal file server with 4x1TB disks in it, however. Some bonnie++ benchmarks show that at least for sequential performance, it's faster than a single drive.
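(For anyone who wants comparable numbers, the run is nothing fancy. Use a test size larger than RAM so the page cache doesn't flatter the results; the path and user here are placeholders:)

    # Benchmark a directory on the array with an 8GB test size, as user 'nobody'
    # (the directory needs to be writable by that user):
    sudo bonnie++ -d /mnt/raid -s 8192 -u nobody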
posted by zsazsa at 7:54 PM on April 13, 2009


1- Yes, use software raid. That way you can migrate the array to a different machine in case of hardware failures or upgrades.
2- RAID 5 will be slower on writes, because every write also has to update parity (typically a read-modify-write cycle).
3- I had a motherboard whose fake RAID 0 (striping, simply for speed, in theory) was slower than just using the onboard IDE port. The reason was that the Promise RAID chip was connected via the PCI bus, while the straight IDE ports were built into the chipset and connected in some faster internal manner. Reads and writes were instantly saturating the PCI bus.
4- So choose your SATA ports wisely.
5- raid5 is generally a better choice for uptime, not speed. In theory, it will be faster for read speed, if every other bottleneck is taken care of.
6- Agree that software raid is plenty fast enough for data storage, probably not so much for disk-intensive compiling and what-what.
posted by gjc at 7:58 PM on April 13, 2009


Geckwoistmeinauto: Hate to derail, but the drives are all this model (8MB cache bad?), and /proc/mdstat reports 64kb chunk size (the default, I assume). Filesystem is ext3, which might suck. On my previous RAID 1 setups, I used reiserfs (v3). Motherboard is a nice Abit IP35 Pro, Q6600 CPU. Feel free to MeMail me if you have some off-topic advice!
posted by knave at 8:02 PM on April 13, 2009


I might get outed for this, but I just read an article about how RAID 5 is beginning to become obsolete. I will be the first to admit that I don't know much about it, though; I am just trying to spread what I just heard.
posted by Black_Umbrella at 8:08 PM on April 13, 2009


Response by poster: Hey, everyone, thanks for the great answers. After looking a bit it seems like software RAID will be a totally suitable way to go. I appreciate it.
posted by xmutex at 8:39 PM on April 13, 2009


Software RAID is definitely the way to go. Even if you can afford the hardware for real hardware RAID, recovery is MUCH, MUCH easier if you don't have to worry about the vagaries of hardware controllers.
posted by signalnine at 11:54 PM on April 13, 2009


Black_Umbrella: I wouldn't say that's off topic or bad advice. I too saw that article and heeded its warning, although I think the idea that RAID 5 is obsolete is a bit overwrought. It just runs some numbers to show potential pitfalls. It's not as dire as it would seem: rebuilds have always taken a long time, because older technology was slower. And while the article is correct, one needs to remember that RAID isn't for data protection, it is for system uptime. It reduces (all but eliminates, really) the need to restore data from backups, but it does not make backups any less important.

I work with enterprise (midrange) RAID gear for a living. What the dedicated SCSI hot-plug style cards do is maintain statistics on error rates and whatnot. They are able to fairly reliably detect imminent drive failures -- I've never seen an array fail outright because another drive failed during a rebuild. It's absolutely possible and is something to plan for (BACKUPS!), however. What I have seen plenty of is arrays failing because people failed to change drives the moment they went bad, or because people didn't look at the error logs before embarking on online expansions and so forth.

The solutions are easy, however. First, remember that RAID was designed to make storage cheaper: instead of maintaining uptime via expensive disk systems, you use a bunch of cheaper, less reliable disks. If you are talking about guaranteeing massive uptime, RAID 5 should never have been the only solution. Second, RAID 6 adds an extra drive of redundancy; in theory, we could keep adding levels of redundancy until our needs are met. It costs a bit more, obviously, but when weighed against the costs and increased odds of having to do a restore, it's well worth it. (And I suppose it makes reads that much faster.) Further, having an online spare is another way to guard against multiple drive failure: the instant a drive starts to get wonky, the subsystem takes it offline and starts the rebuild process without user intervention.* Then there are more exotic solutions: RAIDs of RAIDs. This is actually easier with Linux software RAID. Each RAID member can be mirrored, for example, or you could mirror two RAID 5 subsystems. This would be an exercise in data modelling, throughput, and MTBF calculations, but well worth it if it prevents just one disaster.

* Tip- let your spare complete its rebuild process before replacing the original failed drive. This lets the spare exercise itself, so you know it's actually good. And if you interrupt the rebuild, you lose the redundancy you got by installing it.
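For the curious, both of those ideas (extra redundancy and nesting arrays) are straightforward with mdadm. A rough sketch only -- the device names are made up, and you'd size things for your own drives:

    # RAID 6 with a hot spare, so a wonky drive gets replaced automatically:
    sudo mdadm --create /dev/md0 --level=6 --raid-devices=6 --spare-devices=1 /dev/sd[b-h]1

    # Or the "RAID of RAIDs" idea: build two RAID 5 sets, then mirror them:
    sudo mdadm --create /dev/md1 --level=5 --raid-devices=3 /dev/sd[b-d]1
    sudo mdadm --create /dev/md2 --level=5 --raid-devices=3 /dev/sd[e-g]1
    sudo mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/md1 /dev/md2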
posted by gjc at 4:46 AM on April 14, 2009 [1 favorite]


Another point I haven't seen raised yet is that your motherboard's (fake)RAID may not be portable to another board (should you suffer a motherboard failure, or even just come upgrade time).

In my experience with my own setup, linux software raid5 moves to a new motherboard quite trivially.
posted by namewithoutwords at 5:12 AM on April 14, 2009


Portability with RAID 5 is always an issue, but you have far fewer portability issues with RAID 1. RAID 1 will just be a typical disk and a clone; the OS will be set to use the RAID driver, but that's really it. I've pulled RAID 1 drives from dead machines, put them on a plain-Jane SATA connection, and they booted just fine.
posted by damn dirty ape at 6:39 AM on April 14, 2009


Best answer: To say what's being said up above, but from a slightly different angle: there are three major types of RAID -- software, hardware, and "fakeraid".

Software RAID runs at the operating system level; the OS sees all of the disks directly, and then assembles them into a RAID. Modern CPUs, especially the multicore ones, are more than fast enough to do the checksumming and writing to disk in real time. My older Pentium D 2.8, a dual-core P4 with a fairly crummy architecture, claims to checksum about 4.2 gigs per second. This is at least an order of magnitude faster than you're going to get out of a 5-drive RAID 5 with current hardware; individual disks aren't going to exceed 75 megs a second of streaming writes, and it'll be a LOT slower if you're doing random ones. (Reads aren't choked in RAID 5 -- it's writes that are slow.)
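(That checksum figure comes straight from the kernel, which benchmarks its parity routines at boot and logs the winner. You can dig yours out of the boot messages with something like:)

    # Show the kernel's boot-time XOR/RAID checksumming benchmark results:
    dmesg | grep -i -E 'xor|raid5|raid6'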

(oh, one aside: Windows software RAID is terrible. If you want to run Windows on your server, hardware RAID is WAY better.)

The big disadvantage to software RAID is that it's not visible to the computer until the Linux kernel is actually running; the BIOS doesn't see the RAID at all. This means you can only boot off RAID 1 mirrors, because the boot manager can just access the drives directly, ignoring the mirror. It just reads the kernel directly off, say, drive 0, partition 1, and boots it. If drive 0 is the one that fails, you'll usually have to manually boot the system off drive 1. Alternately, you can use a separate boot drive. (What I do on my server is boot from a small USB flash drive, and then transfer control to the RAID 5 from there.)
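(One mitigation, for what it's worth: install the boot loader on every member of the /boot mirror, so any surviving drive is bootable. Roughly, with GRUB -- the device names are examples only:)

    # Put GRUB in the MBR of each disk that carries a copy of /boot:
    for disk in /dev/sda /dev/sdb /dev/sdc; do
        sudo grub-install "$disk"
    done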

Hardware RAID is a little different. It actually runs on the expansion card, handled in the controller firmware. These cards all have their own dedicated processors, often the rather oddball Intel i960 chip. You attach the drives to the card, create and format the RAID in the BIOS setup the card provides, and then install your operating system on top. The OS doesn't really see the individual disks; it just sees a single large SCSI drive, without knowing anything about the underlying hardware. This also means that you can install and boot off your RAID drive without having to worry about a separate /boot partition -- you can boot directly off any RAID the card supports, because it hides all the messy details.

Because of the dedicated coprocessor, this used to be a big speed advantage, but it really isn't anymore. PCI Express has a crapton of bandwidth, and the new front side busses can move a huge amount of data compared with the old machines. Running RAID used to be fairly heavy lifting for a CPU, but these days it's not a problem.

You'll probably get slightly better reliability on hardware RAID, because the OS is somewhat insulated from the RAID process. If the kernel crashes, you'll be much less likely to end up with a RAID in an inconsistent state. And you might get a little better speed. But you're tied to that exact make and model of card forever; if it goes south, you're hosed until you get another card exactly like it.

With software RAID, you trade away a little reliability (not tons), some CPU time on the server, and the extra hassle of building your boot volume. In exchange, you save a lot of money, and have better transportability. If your motherboard breaks, or if you want to upgrade, no biggie. You can just hook up your drives to any controller or motherboard that Linux supports, and your software RAID will start working again. You may have to manually reassemble it the first time from the command line on the new computer, but it will work. It's just a matter of telling the new machine which drives constitute the RAID volume.
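(The reassembly is only a couple of commands -- a sketch, assuming the drives just show up as ordinary SATA devices on the new box:)

    # See which attached disks carry md superblocks, and which array they belong to:
    sudo mdadm --examine --scan

    # Have mdadm find those superblocks and assemble the array:
    sudo mdadm --assemble --scan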

Finally, there's fakeraid. Fakeraid is sort of a hybrid between the two approaches. The motherboard has a BIOS with enough brains to know about the RAID volume, so you can boot from it. But once the kernel boots, it has to take over most of the work itself; it's still doing all the checksumming, just like in software RAID. All you really gain is bootability, and in exchange you lose all portability. This is really, in my opinion, an absolute abortion of a solution -- you get all the drawbacks of both approaches. You lose the offloaded CPU and the abstraction of true hardware RAID, you lose the portability and good administration tools of software RAID, and you get the least reliability of all, because these drivers just aren't tested the way the software RAID stack is.

Very strong advice: don't use motherboard RAID.

From there, I'd suggest going software; it's more than fast enough, and the $500+ that a decent hardware card will run will pay for a lot of hours of figuring out how to get the system booted. And you only have to figure that out once -- once you know how, it'll be easy thereafter. But if your time is worth a lot to you, hardware RAID is usually easier. Areca's stuff is supposed to be very good, although I haven't used it myself.
posted by Malor at 8:10 AM on April 14, 2009


Best answer: As far as the actual setup goes, don't use the drives directly as your RAID volume. That is, if your disks register as sda, sdb, sdc -- (scsi disk a, scsi disk b, scsi disk c) -- partition them first. If you plan to boot from the drive, I suggest two partitions on each. First, a small one at the start of the drive, a few hundred megs. Make this a RAID1, but mirrored across all your disks. That's your /boot partition. Then make a second partition on each that's RAID5-ed.

You might set it up this way:

sda1, sdb1, sdc1, sdd1, sde1 -- all 300MB, RAID1. This gives you a 300 megabyte boot partition with four extra copies. I believe both LILO and GRUB will figure this out and boot okay from it, but you might have to fiddle to get it right. If your normal boot drive fails, there is no auto-fallback to boot from another volume; you'll have to manually boot the machine yourself. This is easy with GRUB -- you just hit E to edit, change (hd0,0) to (hd1,0), hit enter, hit B, you're booting. LILO is hardcoded, so it'll probably take a rescue disk and a rewrite of the MBR to recover from the first drive failing.

sda2, sdb2, sdc2, sdd2, sde2 -- the rest of the disks minus a couple of gigs -- leave some blank space at the end. RAID5 these. If they're 1TB volumes, you'll end up with about 4TB of useful space, minus the 10 or so gigs you left empty at the end.

Why leave some space empty? Because not all drives are exactly the same size, and if you replace one of the drives with a different brand, it may not have the exact same number of sectors. If it's even one sector less than the other drives, you can't use it. So you knock a couple gigs off the end as an insurance policy.
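Putting that layout together with mdadm looks roughly like this once the drives are partitioned (a sketch only; the device names assume five disks, sda through sde, and the config path is the Ubuntu one -- CentOS keeps it at /etc/mdadm.conf):

    # Five-way RAID 1 mirror across the small first partitions, for /boot:
    sudo mdadm --create /dev/md0 --level=1 --raid-devices=5 /dev/sd[a-e]1

    # RAID 5 across the large second partitions, for everything else:
    sudo mdadm --create /dev/md1 --level=5 --raid-devices=5 /dev/sd[a-e]2

    # Record the arrays so they assemble automatically at boot:
    sudo mdadm --examine --scan | sudo tee -a /etc/mdadm/mdadm.conf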
posted by Malor at 8:19 AM on April 14, 2009


Oh, and note: that last set of instructions applies only to software RAID. Hardware RAID cards abstract this process in a number of different ways, depending on the card family. Typically, you tell the card which of the drives you've plugged in are available for RAID. Then you assemble some or all of the available drives into an actual RAID, and put logical volumes on top of that. This is all done before the OS ever boots up.

Once you're finished, the logical volumes look like disks to the OS. You might have, say, 5 actual 1TB spindles in RAID5, giving you 4TB of usable space. You can tell the OS you have any combination of logical drives you want, as long as it doesn't exceed 4TB.

The typical solution is to just export 1 big drive, and then partition that with the operating system. This is just the easiest way to handle it. Exporting several smaller volumes is primarily for multibooting operating systems that use incompatible partitioning schemes, like the new Windows dynamic disks. If you're just running Linux, you don't need to do that.

From there, things get too card-specific for me to be much help. Hopefully, that'll be enough to get you started.

If you can, use the same trick of not using all the available space on your volumes. If you have to replace a drive with a different brand that's even 1 sector too small, it won't work. Shaving off a half-percent at the end can save you grief.

One final note: RAID IS NOT A BACKUP. RAID does nothing for you if you fat-finger your data or if you get virused or whatever. RAID primarily prevents downtime, not data loss. There are many ways to lose data other than a drive failure, and RAID helps with none of them.
posted by Malor at 8:42 AM on April 14, 2009


This thread is closed to new comments.