What's the best way to ensure 99.999% service of a MySQL server? I'm looking for information on best practices for redundancy, and what the best way to acheive hot-failover would be ... or if it's better to use a cluster.
Here's the technical details: I have two database servers purchased, but not yet set up. The main one will run PERC3 hardware raid 0+1 across four 18gb drives to form two mirrored 36 gb drives. The other one will run linux software raid1 across two 36gb drives. Most of the activity will be read, there will be minimal write. All drives are in hot swap trays, I have spare drives ready to go, and the colo is less than ten minutes from my home (...since hard drives always fail at 3 am). Hopefully, this will ensure that I can not be taken down by a single drive failure.
The main machine has a dual power supply, with the backup only having a single. (I'm working with a very limited number of rack units.) Both machines are dual processor, dual NIC and will be running CentOS. Neither machine will be internet accessible, but will instead be connected to a private subnet accessible only to the web servers. Both machines will be connected to a private UPS that will be in the rack, and the main machine will also be connected to the colo power. This should keep the databases from getting knocked offline like Wikipedias and Livejournals did, even if it means that power will fail to all but the firewall, one web server, and the databases.
I'm not sure how to configure MySQL for this duty, though. I'm used to using MyISAM tables for maximum speed, and I've only configured MySQL to run on the same server as web server. I could use hints for how to build maximum 'if shit happens' into the system.
I do know that there is a clustering solution, but I haven't read too deeply into it.
My questions are:
1. Am I configuring the hardware for maximum redundancy, speed, and reliability? Ideally, I'd be able to use hardware that had more redundancy built in, but I'm limited in my hardware selection by available rack space.
2. Is it possible to hot-failover a MySQL database using binary logging or some such, so that when one fails, the slave automatically assumes its identity and starts serving queries? I did google, but all efforts turned up the Cluster ... which I'm not sure about because it seems to rely on one server to be the cluster controller. (Insert joke about cluster-!@#$ here.)
3. Either way, is that a better solution or is clustering a better solution? Would any additional hardware (i.e. a cluster controller) be required? What happens if the cluster controller goes down?
4. What table type choice is the best? What will I need to know about configuring these servers to be hardy but fast draft horses rather than the thoroughbred racers that I'm used to creating?
5. What kind of network traffic will communication between the cluster or slave/backup require? Should I have a separate network for this traffic?
My general intent is to provide maximum reliability and 'soft tiered failures' ... if something breaks, it goes to the next level and screams for help. If THAT breaks, then it goes to the next level and screams louder. If THAT breaks, it's still not the end of the world, because there's one more layer ... but if that last layer breaks, well, god save the SLA.
Additional information: All servers in the rack are CentOS, most servers have 2 10/100, I haven't purchased the switches yet, so they can be managed if it will be best, 3 webservers will be feeding off of the two database machines, I'm not too concerned about processors overloading or failing but am more concerned about hard drives, and I can't add additional or larger hardware. PHP and Perl will be the methods of access.
Thanks in advance for your answers. Horror stories are welcome.
posted by SpecialK to computers & internet (7 comments total)
1 user marked this as a favorite
posted by SpecialK at 12:36 AM on February 22, 2005