.htaccess redirects
November 15, 2010 3:51 PM

.htaccess redirects : how many is too many?

I recently moved a client's website from a standard, hand-coded site with many pages, to a document management system with many pages. I couldn't keep the old directory structure, but I wanted to make old links redirect seamlessly to the new system. I resorted to putting a couple dozen "Redirect 301" lines in the .htaccess file in the root directory of the site.
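
For readers who have not seen the directive, the kind of lines described above might look like the sketch below. The old paths and the www.example.com targets are made-up placeholders, not the actual site's URLs, and the file assumes mod_alias is loaded and that .htaccess overrides are allowed for the directory.

```apache
# .htaccess in the site's document root (hypothetical paths and hostname).
# Each line maps one old URL-path to its new location with a permanent (301) redirect.
Redirect 301 /about.html http://www.example.com/docs/about
Redirect 301 /products/widgets.html http://www.example.com/docs/products/widgets
Redirect 301 /contact.html http://www.example.com/docs/contact
```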

My server manager is telling me that this is not the most efficient way to do this, causes too much memory overhead with the server, and that I should probably find another way to do it.

First, is he right? And second, what's the alternative?
posted by crunchland to Computers & Internet (6 answers total)
 
Best answer: The nice thing about Redirect 301 is that it doesn't break SEO, so if you care about that, you're doing it right (and server overhead be damned.)

In my experience, the concern over redirects (other than not breaking SEO) is the number in a row -- that is, if you're redirecting page a to page b, which redirects to page c, which redirects to page d, you're doing it inefficiently.
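
As a rough illustration of the difference, here is a sketch with hypothetical paths (/page-a through /page-d) and a placeholder hostname. The chained version is commented out; the flattened version sends every old URL straight to the final destination.

```apache
# Chained (avoid): a request for /page-a is redirected three times before it lands.
# Redirect 301 /page-a http://www.example.com/page-b
# Redirect 301 /page-b http://www.example.com/page-c
# Redirect 301 /page-c http://www.example.com/page-d

# Flattened: every old path points directly at the final URL, one redirect each.
Redirect 301 /page-a http://www.example.com/page-d
Redirect 301 /page-b http://www.example.com/page-d
Redirect 301 /page-c http://www.example.com/page-d
```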

However, putting your redirects in an .htaccess file in your root directory is indeed not quite as efficient as putting them in a server configuration file include, as far as I remember. And putting your redirects in a server configuration file include is more efficient (as far as server load is concerned) than every other viable solution, such as having a script at the old location send a redirect header or (shudder) a meta tag redirect. That's because using the server config file (or .htaccess) lets the server issue the redirect before any page or script is loaded -- the other solutions require the server to run a script or serve a page from the old location first, and a meta tag redirect additionally makes the client download and parse that page before it follows the redirect.

So, you're almost doing it right, and you should ask the server manager to host the redirects in a config file in the server's config directory.
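
A minimal sketch of what that could look like, assuming a stock Apache layout; the file name conf/redirects.conf, the paths, and the hostname are all hypothetical. A relative Include path is resolved against ServerRoot, and the server has to be reloaded or restarted before changes to an included file take effect.

```apache
# In the main server config (e.g. httpd.conf), or inside the site's <VirtualHost> block:
Include conf/redirects.conf

# Contents of conf/redirects.conf (same directives as in .htaccess,
# but read once at startup instead of on every request):
Redirect 301 /about.html http://www.example.com/docs/about
Redirect 301 /contact.html http://www.example.com/docs/contact
```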

Oh, and just a datapoint: for many, many years, the major web company I work for had a redirector box -- a single, ancient, Java-powered nightmare -- that contained nothing except tens of thousands of redirect and rewrite rules amassed over the years and maintained (often through multiple document chains) until they were either retired or migrated to a new, faster system that managed the same redirects. That box was not packed with memory or written efficiently, and yet it always responded quickly and never went down on us (until we finally managed to migrate stuff off and retire it.) Also, every single http://domainname/dirname request gets redirected to http://domainname/dirname/ in the same way your rules will be processed. So personally I think that, beyond migrating to an included server config file, there's no way you can make it more efficient and you shouldn't stress about a couple of dozen redirects.
posted by davejay at 4:26 PM on November 15, 2010


Best answer: A few dozen HTTP redirects aren't going to take down a web server. Sounds like the server manager is prone to premature optimization: fretting over theoretical performance impacts that are in reality completely negligible.
posted by zsazsa at 4:37 PM on November 15, 2010


Best answer: I've never heard that and I'm someone else's server manager. Also, as a server manager, if I don't like the way one of my developers does something, it's my job to tell them how I'd rather have it done.
posted by advicepig at 4:45 PM on November 15, 2010


Best answer: Well, "too much memory overhead" isn't a specific number but it's safe to say that the complaint is nonsense. A couple of dozen redirects is absolutely fine, and even hundreds wouldn't be significant in terms of memory (it might get annoying for you but Apache can handle it). Apache does parse the .htaccess files on every request, but it parses them as a stream, and even a few hundred redirects would only come to tens of kilobytes, which isn't significant. You can make it slightly more efficient by setting the redirects in your global config (because that's only parsed when the server starts or restarts, not on every request), and then, once nothing depends on .htaccess any more, by setting 'AllowOverride None' to stop Apache from traversing the hierarchy looking for .htaccess files at all.
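
A sketch of the 'AllowOverride None' part, assuming the site lives under a hypothetical /var/www/example directory. With this in place .htaccess files are ignored entirely, so the redirects themselves have to live in the server config.

```apache
# In the main server config. The directory path is a placeholder.
<Directory "/var/www/example">
    # Do not look for or read .htaccess files anywhere in this tree.
    AllowOverride None
</Directory>
```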

An alternative would be a catchall that calls a script but that has memory overhead itself anyway. Just invoking PHP with <?php print "hello world"; ?> costs about 1-2MB per request. There are lighter languages such as server-side JavaScript but, really, that would be overkill.
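
For comparison, a catchall of that kind is commonly written with mod_rewrite along the lines below. This is only a sketch: /redirect.php is a hypothetical script that would look up the old URL and send the redirect itself, and mod_rewrite has to be enabled.

```apache
RewriteEngine On
# Only fall through to the script when the request is not an existing file or directory.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Hand everything else to the (hypothetical) lookup script.
RewriteRule ^ /redirect.php [L]
```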

Get some hard numbers out of your server manager because then you can profile Apache before and after and show that it's a negligible difference. I really doubt that your server manager could even tell the difference by watching memory usage.

(A separate thing is whether you're chaining redirects together, e.g. sending an HTTP redirect to a URL that is itself a redirect, which in turn redirects, and so on. I don't think that's what you're talking about, and even if it were, that's only a few more requests, so again it probably doesn't matter.)
posted by holloway at 4:46 PM on November 15, 2010


Best answer: Don't worry about it. I used to be the Senior Web Geek for a well-known cable channel, and we had to have something like 10-15 redirects hand-coded for every one of our shows and specials, since people would commonly try to use non-canonical ways of getting to the show's main site.

For example, we had a reality show about a comedienne named, let's call her, Gathy Kriffin. Her real site, for SEO reasons, was located at something like CableCompany.com/Gathy-Kriffin-My-Life-On-The-L-Dist. But obviously that's too long to advertise in the promos or marketing material for her show. And sometimes our promos would use all uppercase letters, and sometimes mixed-case, and sometimes hyphens, and sometimes no hyphens. So we had to set up redirects for CableCompany.com/Gathy, CableCompany.com/Gathy-Kriffin, CableCompany.com/GathyKriffin, CableCompany.com/GATHY, CableCompany.com/GATHYKRIFFIN, CableCompany.com/L-Dist, CableCompany.com/LDIST, and a million other variants, all for this one show, to capture every possible mis-typed or mis-advertised or commonly mangled version of the show's name.
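
As an aside for anyone facing the same problem, several of those variants could probably be folded into one pattern with mod_alias's RedirectMatch, which takes a regular expression. The sketch below is an assumption about how that might look, not what we actually ran; the (?i) prefix makes the match case-insensitive, and the pattern is anchored so it does not match the long canonical URL itself.

```apache
# Matches /Gathy, /GATHY, /Gathy-Kriffin, /GathyKriffin, /L-Dist, /LDIST, etc.,
# with or without a trailing slash, in any letter case.
RedirectMatch 301 "(?i)^/(gathy(-?kriffin)?|l-?dist)/?$" "http://CableCompany.com/Gathy-Kriffin-My-Life-On-The-L-Dist"
```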

In total, we probably had 200-300 redirects to hand-curate in our .htaccess file, and I would occasionally have to trawl our 404 error page reports to see if we'd missed anything.

It did not cause any sort of extra server overhead that we could see.

No, that was arranged by the miserable Flash+XML that our totally-non-techie marketing director insisted we use everywhere on the site, and the half-assed Ruby-on-Rails infrastructure on some, but not all, of the sites that he insisted we implement because he liked the look of those pages better and did not understand that their look was accomplished through something that was grinding our site to a halt since we lacked a proper server base which he didn't want to pay for, and a million other problems...
posted by Asparagirl at 5:00 PM on November 15, 2010


Response by poster: Thanks.
posted by crunchland at 8:50 PM on November 15, 2010

