What's the most robust bandwidth throttling method for my Apache web server?
January 31, 2006 10:46 AM

What's the best bandwidth throttling method for my Apache webserver?

I run a website that serves a great deal of videos -- we host 50+ videos at around 10-40MB a pop, and receive a few thousand hits a day. We are in no danger of reaching our monthly bandwidth limit. The problem is our concurrent bandwidth limit -- if we average over 10Mbps for too long, we get a nastygram from our ISP. The site is run on a dedicated Apache / Linux server with cPanel and the like.

Before I continue, a proviso: I'm new to all of this. I know my way around a command-line, and can follow tutorials, but I have never tried configuring Apache by hand, normally leaving it up to the cPanel/WHM interface (a web app that can handle most common server config tasks through its GUI). So a) I might get some terms wrong, and b) the more information in your answers, the better -- "Just set up a Widget in your .gadgets file" isn't going to help me much -- full paths or links to tutorials will. Thanks.

Anyway, the story -- when we got the server, I enlisted an internet acquaintance (since incommunicado, and thus unavailable for further help) to set it up for us. He tried Apache's "modbandwith" (I believe is the name) first, but that established too hard of a limit on our bandwidth output -- out of ten users, the first one or two would get all the bandwidth, and the rest would get nothing, and no bursting was possible. We needed a fuzzier solution, in which the average rate would be lowered if there was too much usage, but still allowed bursting. The acquaintance whipped up a solution by which all references to .mov or .mp4 files are redirected (via an .htaccess file) to another web server, THTTPD, which resides on a different port. THTTPD is a "tiny, throttling" web server, and its throttling has served us pretty well so far, but lately it's been struggling a bit. Add to that the fact that top shows THTTPD using 70% of our memory and that it's not under active development, and I need a different solution.

So the question is: Is there a means to get Apache to do the sort of bandwidth throttling I need? That is, to have it monitor the average bitrate out and reduce it, rather than establishing a hard limit? I would prefer the solution to be Apache based, so I don't need to manage yet another process-I-have-a-hazy-understanding-of on my server, but I want the most robust solution available, and if that means not doing it through Apache, I'm open to it.

Again, I'm not a system administrator or Linux power-user -- compiling is beyond me, and I don't know beans about configuring Apache, so concrete responses like "type this at the command line" or links to packages I should install would be most helpful. Thanks so much!
posted by tweebiscuit to Computers & Internet (16 answers total)
 
Response by poster: Just to give a benchmark of my lack of experience with Apache: I found a reference to mod_throttle for Apache, but the installation instructions are Greek to me. (I learn fast, but this is at a high enough level above me that I can't even get a foothold.)
posted by tweebiscuit at 10:50 AM on January 31, 2006


do you know what operating system the webserver is running? if you don't, the internet is telling me to tell you to use the "uname -a" command. i was stepping myself through the mod_throttle instructions and we need it to figure out the semaphore to use.
posted by soma lkzx at 11:20 AM on January 31, 2006


Hrm. Can the ISP do it on the firewall end? I've set up Netscreen firewalls to do just that.
posted by drstein at 11:20 AM on January 31, 2006


Mod_throttle only delays answering requests I think. For bandwidth shaping you could use mod_bandwidth, but having the OS do bandwidth shaping is going to give you better results. Have a look at the wonderful LARTC howto for all things routing, shaping, and much more.
posted by fvw at 11:31 AM on January 31, 2006


Response by poster: uname -a output:

Linux server.oldeenglish.org 2.6.10-1.12_FC2 #1 Wed Feb 2 01:13:49 EST 2005 i686 i686 i386 GNU/Linux

mod_bandwidth is, I believe, the module that established a limit that was too hard, though I didn't configure it myself, so perhaps it can be more flexible than that. What I do remember, though, was that when it was on my bandwidth graph was a flat line at whatever the throttle limit was, whereas with thttpd it's a jaggedy, bursty line over short timespans (such as "last hour") which averages out to the defined limit over longer timespans (such as week or month.) The latter definitely seems preferable for the purposes of Quicktime progressive downloading.
posted by tweebiscuit at 11:51 AM on January 31, 2006


Response by poster: Also, I'm running Apache version 1.3.33 (Unix)
posted by tweebiscuit at 12:00 PM on January 31, 2006


What OS are you running?

Different solutions are available, but not as binaries on all versions of Linux. If you aren't able to build and install source packages, you are kind of limited to considering just binary packages that may be available for your OS (Debian/Ubuntu apt-get, RedHat packages, etc), and that's going to mean, for the most part, the standard Apache modules, or other binary package solutions.

Some outboard commercial solutions like Trafficwize are pretty simple to implement, if you don't mind the monthly cost.

In line with other suggestions up thread, one thing you might consider (if you don't do it already) is having more than one interface bound to your network port (say 192.168.1.35 and 192.168.1.36). That lets you organize separate instances of Apache running on those virtual interfaces, each of which can have its own set up. Alternatively, this can also facilitate some simple traffic management policies at the TCP/IP network stack level, which might be less resource intensive for your machine than Apache modules.
posted by paulsc at 12:24 PM on January 31, 2006


Response by poster: I believe the server runs RedHat. I could build and install source packages, but I'd need explicit instructions on how -- "put XXX in your makefile and compile" (which are the sort of instructions I usually see on source packages) isn't going to help me, unfortunately.
posted by tweebiscuit at 1:28 PM on January 31, 2006


I second fvw's recommendation of the Linux Advanced Routing and Traffic Control HOWTO. See Chapter 9. The tc and iptables utils should be included (or at least available as stock RPMs on the install CDs; if it's a really old RedHat, recompiling 'tc' may be required to use the HTB qdisc, or you can fall back to the capable but more complicated CBQ qdisc; see the HOWTO).

A *really* basic example like:

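# root HTB qdisc; traffic not matched by any filter goes to class 1:1
tc qdisc add dev eth0 root handle 1: htb default 1
# class 1:1: 10mbit average, with a small burst allowance
tc class add dev eth0 parent 1: classid 1:1 htb rate 10mbit burst 12kb
# SFQ under 1:1 so each connection gets a fair turn
tc qdisc add dev eth0 parent 1:1 handle 10: sfq perturb 10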

Would limit total output on eth0 to an average of 10Mbit/s, allow for short bursts, and, with the SFQ qdisc, make sure that every connection receives a fair share of the 10Mbit/s available.
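
To make "an average rate with short bursts" a bit more concrete, here is a toy simulation of a token bucket, which is the idea behind HTB (Hierarchical Token Bucket). It is only an illustration under simplified assumptions (whole ticks, made-up numbers), not how tc is actually implemented, and the function name token_bucket is my own invention:

```shell
#!/bin/sh
# Toy token-bucket simulation (illustration only, not how tc is implemented).
# $1 = rate:  tokens (think "megabits") added per tick
# $2 = burst: bucket size, i.e. how much unused allowance can be saved up
# remaining arguments: how much traffic wants to go out on each tick
token_bucket() {
    rate=$1; burst=$2; shift 2
    tokens=$burst          # start with a full bucket
    out=""
    for demand in "$@"; do
        # send what is asked for, but never more than the tokens saved up
        if [ "$demand" -lt "$tokens" ]; then sent=$demand; else sent=$tokens; fi
        tokens=$((tokens - sent))
        # refill for the next tick, capped at the bucket size
        tokens=$((tokens + rate))
        if [ "$tokens" -gt "$burst" ]; then tokens=$burst; fi
        out="$out $sent"
    done
    echo "${out# }"
}

# Two quiet ticks, then a sustained demand of 30 per tick (rate 10, burst 25)
token_bucket 10 25 0 0 30 30 30 30
# prints: 0 0 25 10 10 10
```

The first busy tick gets to spend the saved-up allowance (25), after which output settles at the configured rate (10). With the burst set equal to the rate you get the flat line described earlier in the thread.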

The TC stack really tries to be polite, so you can run these commands on a running system and it won't drop anything; traffic just filters off the end. If they work as they should, it will instantly start shaping traffic. If they don't, packets will keep moving, and you can fix the error. If everything explodes, hey, I'm just some guy on the internet. ;-)

For status, run:

tc qdisc show dev eth0
tc class show dev eth0

And to clear it all and pretend it never happened:

tc qdisc del dev eth0 root

And the root qdisc, and everything chained from it, disappears. And through it all, packets keep flowing.
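
If you want a rough check that the shaping is doing something, one option is to sample the interface's transmit byte counter twice and turn the difference into Mbit/s. This is a sketch, assuming the interface is eth0 and that your kernel exposes the counter under /sys/class/net (the same numbers also appear in /proc/net/dev). The function names are made up for the example, and a counter that wraps between samples will give a nonsense reading:

```shell
#!/bin/sh
# Convert two readings of a byte counter, taken $3 seconds apart, to Mbit/s.
# Integer arithmetic, so the result is rounded down.
mbit_per_sec() {
    start=$1; end=$2; secs=$3
    echo $(( (end - start) * 8 / secs / 1000000 ))
}

# Sample the transmit counter of one interface over an interval.
# Defaults (eth0, 10 seconds) are assumptions; adjust to your machine.
measure_tx() {
    dev=${1:-eth0}; secs=${2:-10}
    counter=/sys/class/net/$dev/statistics/tx_bytes
    a=$(cat "$counter")
    sleep "$secs"
    b=$(cat "$counter")
    mbit_per_sec "$a" "$b" "$secs"
}

# Example (not run automatically): measure_tx eth0 10
```

Run it a few times while videos are being downloaded. Short samples can come out above 10 because of bursting; longer intervals should average out near the configured rate.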
posted by zeypher at 2:32 PM on January 31, 2006


Here's an older, but pretty readable HOW-TO on building packages from source code. In general, though, it's probably not a great idea to try learning how to do this on a small production server, especially one for which you don't know the full setup and dependencies history. Generally, considering what you've told us of your background, and since it looks like you have a fairly recent Fedora Core 2 based installation (2.6.10-1.12_FC2 from your note above), you're probably better off sticking with the native package management tools your OS provides, since the point of these package managers is to make sure you have the necessary dependent versions of system libraries installed, and to automate some of the install procedures, including keeping some kind of version history in case you need to revert. If you have yum, learn about it.

Still, I get the sense you have some interest in learning to do this, and becoming a better system administrator, so I'd suggest setting yourself up a small Linux box if you can, as a learning environment. Maybe start with Fedora, since your production system appears to be that already, and you'll have something to play with that won't break your production server while you learn. It's easier to learn if you don't have to worry about breaking a production machine, and what you learn by evaluating and testing packages in a sandbox machine can save you a lot of headaches when you do roll things out to your production machine.

Set up your SSH client with connections to your test box and your production machine, and you can easily visually inspect and compare configuration files, directories, etc. Learn to use Unix tools and applications to manage your server, outside cPanel, and you'll feel more in control, and capable of trying things. Work on recent backups of your production server on your test box, and you can simulate most everything but stress related problems, etc.
posted by paulsc at 2:53 PM on January 31, 2006


Oh, and if you're managing a remote machine as root or equivalent from a command line (console), and learning while doing it, I'd really recommend getting a second virtual interface going pronto, bound only for SSH, and maybe on a non-standard port, as your admin interface, before making any big changes to your box. Nothing worse than trying something on a remote box, and losing connectivity on the only way into the thing you have...
posted by paulsc at 3:10 PM on January 31, 2006


Best answer: Was he using the latest mod_bandwidth (0.6)? I've tried each of these solutions over the last few years, and for Apache 2.x I consider it to be the best, primarily because it is the most configurable of the solutions mentioned so far. It allows per-vhost and per-directory limiting, the ability to limit files over a particular size, and supports limits on the number of connections from everyone or certain IP blocks. I haven't seen the inequality problem you mentioned.

mod_throttle didn't seem to work right with Apache 2.x when I tried it. Bandwidth throttling in the kernel is effective, but limited since it's a more general-purpose tool that has no visibility into the internals of Apache. (If you do go that route, consider using a script such as cbq.init.)

As far as non-Apache solutions, lighttpd also has built-in bandwidth throttling, at least in the latest version.
posted by Edge100x at 3:16 PM on January 31, 2006


Response by poster: Hi Paul -- I am indeed interested in learning more, and have taught myself plenty by doing some simple configuration and debugging manually. (I even figured out how to remove a DDoS worm all by myself!) However, as I said, everything I've read about Apache configuration at this point tells me that it's way over my head, and I'm concerned that by the time I've learned about TCP keep-alives and whathaveyou my ISP will have shut me off. Which is not to say that I'm unwilling to try, just that I'm apprehensive about my chances of getting it right.

And yes, I am managing the server remotely, as root (though not ALWAYS as root), through SSH -- which I know is bad practice, so I'm careful, but you have a good point. How do I set up a virtual interface? (Googling only got me message board questions that I can't make head nor tail of.) Alternatively, could you recommend a good "Server Administration for the Inquisitive Web Developer" book? :)

Edge, a quick look at the various mod_bandwidth.c's on my server indicates that it might have been mod_bandwidth 2.X! Yikes! I'll look into it, thanks.

Finally, since again, my problem is more immediate than my ability to teach myself the necessary skills, I'd love to give any of you folks $100 to set this up for me and do a little spring cleaning on the server. Though I am by far the most technically adept member of my sketch comedy group (whose website we've been discussing), and enjoy mucking about in *NIX shells, at this point I think my time might be better spent becoming a better sketch comedian rather than a better system administrator. Any takers?

(That said, if anyone else has any input on possible solutions, please don't hesitate to put them down.)
posted by tweebiscuit at 5:21 PM on January 31, 2006


Best answer: Adam, tempting as your offer is, I'd suggest, ever so gently, that perhaps offering money and the passwords to your server to pleasantly helpful strangers, with whom you've had a quick exchange on the Internet, may not be the best possible way to accomplish your goals...;-) I think that, on sober reflection, you may agree, and I, for one, will not hold you to it.

I think you'd be better off romancing a geek in your locale. God knows, in Brooklyn, you should have no problem, in the dead of winter, finding Linux admins who are themselves becoming convinced they may need outside help with their socialization, as badly as you feel you need help with your bandwidth control issues. What's lacking in making such a Heaven blessed match is some ingenuity and effort on your part in getting things going.

Go where the geeks are, man. Ideally, you can hook up with someone who has some long term interest in your box, and can help you make, at a minimum, security, project, and maintenance plans, and then work with you to develop the administrative and content management roles cooperatively. For example, looking quickly at your site, you might want to add a more sophisticated Web based contact function, a gig calendar, and some kind of publicity/review/community posting area, once you get your immediate concerns addressed. Or not. But whatever you do in the future, you want to have some reasonably organized way of sorting through the options, getting the projects done, documenting what was done, and maintaining it. Finding a technical resource you can work with on some basis of mutual respect, who can also help you build administrative skills is better done face to face, than on line, IMHO, but if that were the prevailing wisdom in these halcyon days of outsourcing, whole cities in India would not be enjoying recent economic miracles. Well, my editorial biases aside, I think you'll find that there are interested, capable people who would be interested in helping you long term, maybe on a volunteer or quasi-commercial basis (free tickets to your shows, attribution, acknowledgement, skill/resume building opportunities), if you treat them professionally. Think about what you can offer people for their help, be upfront and frank, and I bet you find some Linux folks who love teh funny...

You needn't and shouldn't feel like a mendicant in doing this. You don't need someone with mad skillz, but you probably want someone with a decent base of administrative experience, and you should be willing to pay fair rates for commercial work and commercial priorities, if your needs are growing and immediate. If your needs can be flexible as to time, you have a better chance of attracting quasi-volunteers or folks looking for side gigs. Beware the gypsy hacker.

As for books, the Red Hat Unleashed series has kept many a newbie profitably enthralled for days. Also, the Red Hat Bible, and various O'Reilly books are readable, and far less cryptic than man pages. None of these, unfortunately, are much on plot or character development.
posted by paulsc at 12:35 AM on February 1, 2006


As for the second admin interface idea, the mechanics of doing this depend on what you have available.

On the low end, if you have only a single physical Ethernet port, and only one public IP, and your only firewall is on the box itself, you could at least set up a second instance of something like OpenSSH, running on a non-standard port, and ssh to that for admin purposes. Better than having only one way of talking to the box, but won't get you through to it in the case of something like a munged Ethernet driver upgrade, which kills the Ethernet port entirely.

If you have an outboard firewall, and your server sitting behind that on a private IP network, but still only one physical Ethernet port, you can set up and bind a second private IP address on that interface, and port forward from your firewall box to your server to that separate private IP. That's a pretty common co-located setup for small servers. Helps if you have more than one public IP, as with a second public address, you can eliminate the port forwarding dance on the outboard firewall/router, and just route directly through your outboard firewall/router. Again, since you're still talking to only one physical interface on your server, this won't help if you munge an Ethernet driver upgrade badly, or have hardware problems.

Best option for co-located servers is to have a second Ethernet interface, to which you assign and bind a second private IP from a second public IP, thus giving you two completely independent ways of talking to the box. Big help too, if you do get DDoS attacks, or have reasons to watch the box while live traffic is hitting it on the "main" interface. Even better if that admin interface has a second small firewall/router in front of it, and the public IP is in an entirely separate block. $50 for the second small firewall/router, and a few bucks a month for the second public address, usually. Highly recommended.
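
For the single-port cases above, the mechanics might look roughly like the sketch below. Everything in capitals is a placeholder I made up for illustration (the address comes from my earlier example, the /24 netmask and port 2222 are guesses), and the script only prints the commands unless you set DRY_RUN=0 as root:

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 (as root) only after reading what would be run.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical values: replace with the address, netmask, device and port
# that your provider or firewall setup actually gives you.
ADMIN_IP=192.168.1.36
PREFIX=24
DEV=eth0
ADMIN_PORT=2222

add_admin_path() {
    # Bind a second address to the same physical interface (an "alias")
    run ip addr add "$ADMIN_IP/$PREFIX" dev "$DEV" label "${DEV}:1"
    # Start a second sshd on a non-standard port (-p overrides Port in sshd_config)
    run /usr/sbin/sshd -p "$ADMIN_PORT"
}

add_admin_path
```

An address added this way does not survive a reboot, so once it works you would want to make it permanent in your distribution's network configuration. To restrict the second sshd to the admin address only, look at the ListenAddress directive in sshd_config.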
posted by paulsc at 1:49 AM on February 1, 2006


Response by poster: While Edge provided the technical answer to my specific question, marking paulsc's responses as well for being remarkably personal and generally helpful. Thanks to both of you!
posted by tweebiscuit at 9:21 AM on February 2, 2006

