Streaming video: Blades or Clone Army?
April 4, 2006 1:29 PM
Anyone have experience deciding between a blade center and regular rack servers?
I work for an educational institution. Our department is responsible for serving media from university sporting events.
Outsourcing the serving is an option, but is not one that we're exploring at this time due to marketing and contract constraints. I'm really just looking for server advice.
The constraint is the number of Apache (or other lighter-weight httpd) processes that I can stream the Flash video through. As long as a stream is alive, its process is alive. This has driven our single existing server absolutely crazy, with the load average peaking as high as 15 at times. That single server is also responsible for processing the pages from our content management system. It's not a happy server...
I could easily saturate the 100mbps that I've got to the 'bone with a clone army of white box servers, but the high and mighties want 'enterprise class'. So I'm looking at two options:
1) A rack full of cloned 1U media servers running lighttpd. Two networks: video is uploaded to a master, which then pushes it down to the slaves by copying it over a private network. A load balancer runs in front of the media servers to keep traffic distributed.
2) A cluster of blade servers running off of a fibre channel array. The media gets copied onto a shared partition on the fibre channel, and the blades serve the streams from there.
Which do you think would be better and *why*? I've worked with racks full of load-balanced servers before, but I've never worked with a cluster or with blade servers on fibre channel. Advice? Things to consider? Other solutions besides outsourcing the whole mess?
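For illustration, the replication half of option 1 doesn't need anything exotic. Here is a minimal sketch of the master-to-slave push, assuming rsync over the private network with SSH keys already in place; the hostnames and paths are hypothetical placeholders.

import subprocess

# Minimal sketch of option 1's push replication: the master mirrors the
# upload directory to each slave over the private network. Hostnames and
# paths are hypothetical; assumes rsync and SSH keys are already set up.
SLAVES = ["media01.private", "media02.private", "media03.private"]
SRC = "/var/media/flv/"   # upload target on the master
DST = "/var/media/flv/"   # document root on each slave

def push_to_slaves():
    failures = []
    for host in SLAVES:
        # -az compresses over the wire; --delete keeps the slaves exact mirrors
        cmd = ["rsync", "-az", "--delete", SRC, host + ":" + DST]
        if subprocess.run(cmd).returncode != 0:
            failures.append(host)
    return failures

if __name__ == "__main__":
    failed = push_to_slaves()
    if failed:
        print("push failed for:", ", ".join(failed))
    else:
        print("all slaves in sync")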
By and large, the chief advantage of blade servers is density. If you are under space constraints, but have lots of extra cooling capacity, and your application isn't especially CPU bound, a can of blades is something to consider.
The cooling capacity requirements are substantial, however. Don't even think about playing with a rack of blades until you nail down how much heat pumping you can spend.
Without knowing the details it's hard to make an informed suggestion, but my gut tells me you should go with option 1B: a stack of white-box 1Us hanging off FC storage. I don't see anything about your situation that strongly points towards doing the blade thing.
posted by majick at 2:37 PM on April 4, 2006
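To put rough numbers on "how much heat pumping you can spend": every watt a rack draws becomes roughly 3.412 BTU/hr of heat that the room's cooling has to remove. A quick back-of-the-envelope, with hypothetical wattages standing in for measured or vendor figures:

# Back-of-the-envelope cooling load: 1 watt of draw is about 3.412 BTU/hr
# of heat to remove. The wattages below are hypothetical placeholders;
# use measured or vendor figures for a real estimate.
BTU_PER_WATT = 3.412

def cooling_btu_per_hr(watts):
    return watts * BTU_PER_WATT

blade_chassis_watts = 4000   # e.g. one fully loaded blade chassis
one_u_server_watts = 300     # e.g. one 1U dual-CPU box

print(f"blade chassis: {cooling_btu_per_hr(blade_chassis_watts):.0f} BTU/hr")
print(f"ten 1U boxes:  {cooling_btu_per_hr(10 * one_u_server_watts):.0f} BTU/hr")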
If I understand this properly, you are basically serving static video files using a webserver? How large are the files? How many concurrent connections do you have to service?
I've used lighttpd to serve ~40-50MB files. I was able to saturate a 100mbit connection running on a single white-box Athlon XP 2000+ with 1GB of memory and a single IDE drive running Debian Sarge. There weren't many files, so I think everything was pretty much cached in memory. If I remember right, the process never used more than ~10MB of memory and the CPU was ~50% idle.
Byteserving files is probably more resource intensive, but it seems to me that you don't have to scale this very far, so the density of blade servers seems like overkill.
posted by Good Brain at 2:46 PM on April 4, 2006
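As a sanity check on how far that 100mbit pipe actually goes, the ceiling on concurrent streams is just link bandwidth divided by per-stream bitrate. The FLV bitrates below are hypothetical; measure the actual encodes.

# Rough ceiling on concurrent streams, ignoring protocol overhead.
# Per-stream bitrates are hypothetical; check the real FLV encode rate.
LINK_KBPS = 100_000  # 100 Mbit/s uplink

for stream_kbps in (300, 500, 800):
    print(f"{stream_kbps} kbit/s streams: about {LINK_KBPS // stream_kbps} concurrent")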
Why can't you use a single-threaded web server? It sounds like you have a significant software performance problem. I know precisely naught about streaming video/audio, but it sounds like your load profile is basically an HTTP server reading various video files, with not much extra processing involved other than:
1. read chunk of file
2. send chunk of file to client
3. goto 1
So why not look at using something like mathopd, which will serve all user requests through a single thread? You could always run multiple instances on each machine if you wanted to make more efficient use of multiple CPUs, and load-balance user connections between them.
Unless I'm missing something?
posted by alexst at 2:49 PM on April 4, 2006
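To make the single-threaded idea concrete, here is a toy sketch of an event-driven file server in Python: one process, one event loop, no thread or process per connection. The port and document root are placeholders, and a real deployment would use mathopd, lighttpd, or similar rather than this.

import asyncio
import os

DOCROOT = "/var/media/flv"   # placeholder document root
CHUNK = 64 * 1024

async def handle(reader, writer):
    request = await reader.readline()          # e.g. b"GET /clip.flv HTTP/1.0"
    while await reader.readline() not in (b"\r\n", b""):
        pass                                   # skip the remaining headers
    try:
        name = request.split()[1].decode().lstrip("/")
        path = os.path.join(DOCROOT, os.path.basename(name))
        size = os.path.getsize(path)
        writer.write(b"HTTP/1.0 200 OK\r\n"
                     b"Content-Type: video/x-flv\r\n"
                     b"Content-Length: " + str(size).encode() + b"\r\n\r\n")
        with open(path, "rb") as f:
            while chunk := f.read(CHUNK):      # 1. read chunk  2. send chunk  3. goto 1
                writer.write(chunk)
                await writer.drain()           # a slow client just parks this coroutine
    except (IndexError, OSError):
        writer.write(b"HTTP/1.0 404 Not Found\r\n\r\n")
    finally:
        writer.close()

async def main():
    server = await asyncio.start_server(handle, "0.0.0.0", 8080)
    async with server:
        await server.serve_forever()

if __name__ == "__main__":
    asyncio.run(main())

The point of the sketch is the shape, not the code: a slow viewer ties up a coroutine measured in kilobytes instead of a whole Apache process measured in megabytes.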
Response by poster: The current box is going away as quickly as I can get rid of it, because I currently can't change anything about the way it runs. There's a number of significant problems with it, mostly stemming from the fact that I shouldn't say anything about the IT staff that's supporting it in a public forum. (If you catch my drift...) I can't change any of the variables and test my changes, either.
There's about 20GB of Flash video files that we're serving on a regular basis. (The average file size is 7MB; the largest is a 60-minute video whose size I'm scared to look at.) When we're getting hit, you are theoretically correct in assuming that it's usually one video file being requested over and over, and in theory it should be cached in memory. Unfortunately, that's not the case. People generally watch one, and then watch the four or five other videos that are featured.
We can keep about 250 Apache processes running at any one point in time. However, it seems that when a video file is requested, it streams off of the hard drive as opposed to being cached in memory -- and I can't change whatever setting is causing this. (Reference the earlier IT staff comment.) If Apache gets set to allow more than 250 processes, we get into swap space, and the whole server keels over and dies after the load average spirals through the roof.
I so wish I could load-test flv streams on my dev whitebox to figure out how to optimally configure a web server for it, but I can't find something that'll let me smack a server around for streaming flash.
posted by SpecialK at 3:10 PM on April 4, 2006
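On the load-testing problem: what hurts isn't raw request volume so much as hundreds of connections that stay open while players trickle the video down. One way to fake that against a dev box is to open a pile of GETs and read each response deliberately slowly. A sketch, where the host, path, client count, and pacing are all hypothetical values to tune:

import asyncio

# Crude FLV "viewer" swarm: open many connections and drain each response
# slowly, so connections (and the server processes behind them) stay alive
# the way real players with a playback buffer keep them alive.
HOST, PORT = "devbox.example.edu", 80   # a dev box you're allowed to hurt
PATH = "/media/featured.flv"
CLIENTS = 300            # concurrent fake viewers
READ_KBPS = 400          # how fast each viewer drains its stream

async def fake_viewer():
    reader, writer = await asyncio.open_connection(HOST, PORT)
    writer.write(f"GET {PATH} HTTP/1.0\r\nHost: {HOST}\r\n\r\n".encode())
    await writer.drain()
    total = 0
    while True:
        chunk = await reader.read(READ_KBPS * 128)   # roughly one second of "video"
        if not chunk:
            break
        total += len(chunk)
        await asyncio.sleep(1.0)                     # throttle like a player buffer
    writer.close()
    return total

async def main():
    results = await asyncio.gather(*(fake_viewer() for _ in range(CLIENTS)),
                                   return_exceptions=True)
    ok = [r for r in results if isinstance(r, int)]
    print(f"{len(ok)}/{CLIENTS} streams completed, "
          f"{sum(ok) / 2**20:.1f} MiB transferred")

if __name__ == "__main__":
    asyncio.run(main())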
Spending multiple thousands on infrastructure without knowing if it will help seems like a bad idea. So your first order of business is to try and come up with a simulation of your workload.
Since you are serving these videos with Apache rather than the Flash Media Server, it's probably a pretty simple HTTP transaction. The video player is probably manipulating TCP/IP flow control to keep the server from sending video faster than it's needed. The other possibility is that it is using range serving or the like. I'd probably try setting something up on my devbox and use Ethereal to capture and analyze packets to see what it's doing.
Even without that knowledge, you could probably do a decent approximation by just swamping your devbox with a few thousand requests from a couple of clients running something like siege. I compiled it under cygwin and had trouble with it barfing if I had too many simulated clients per process. I solved it by starting multiple siege processes in parallel.
You can try tuning your OS and Apache config on the server, but I'd really suggest trying lighttpd first, as it lends itself to this kind of workload better.
posted by Good Brain at 4:10 PM on April 4, 2006
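If siege falls over with too many simulated clients in one process, the "multiple siege processes in parallel" trick can be wrapped in a few lines rather than a pile of terminals. The option values and URL file below are illustrative; check your siege version's man page.

import subprocess

# Launch several siege processes in parallel and wait for them all, since a
# single siege process can choke with too many simulated clients.
PROCESSES = 4            # separate siege processes
CLIENTS_EACH = 50        # simulated clients per process
URL_FILE = "flv-urls.txt"

procs = [
    subprocess.Popen(["siege", "-c", str(CLIENTS_EACH), "-t", "2M",
                      "-f", URL_FILE])
    for _ in range(PROCESSES)
]
for p in procs:
    p.wait()
print("all siege runs finished")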
The *only* thing blades are good for is rack density -- you need to have more processors, but you have limited physical space.
They will cost you more money, both on initial buy, and on cooling. A 44U rack full of blades is damn near a smelter, and if you don't have the cooling to handle it, it will kill itself rapidly.
Plus, if you need storage, this implies off-chassis storage, such as a SAN. That's more money.
There are cases where this is the correct answer -- either you need gobsmacks of CPU and little storage, or you want one rack full of 300GB fibre drives, and several racks full of blade, and can afford the cooling and power needed.
It doesn't seem that you need CPU as much as you need I/O, in particular network I/O. Right now there's a box built to be the baddest possible 1U network server in the world: the Sun Niagara box (the UltraSPARC T1). They're built for massively threaded network applications, and the guys I've talked to who run them are just amazed at how they blow blades away in the network serving realm.
Personally, what I'd do is buy five cheap 1U dual-CPU boxes or Sun Fire T1000s and see how well that handles the load. These are boxes that will be useful for other things, and if it turns out that yes, you need 120 processors in the rack, then blades become a good idea. But if it turns out you need 20 or so, then blades will cost you more, both initially and over time, and buy you nothing that a few 1Us would.
posted by eriko at 7:33 AM on April 5, 2006
This thread is closed to new comments.