And many shall be one
June 18, 2006 12:10 AM

How do you uniquely route traffic from a large number of domains to a single server?

I realise that I can redirect at the DNS level, and currently forward traffic for a couple of domains that I own to a single domain. No problem there (you-suck.eu -> you-suck.com, for example).

But a friend is considering a project that will require a large number of domains, as many as four hundred, to be redirected to a single server (no problem so far), with that server presenting content specific to each domain.

Can it be done? I told her about forwarding, but how the server knows which domain was forwarded is where it gets fuzzy for me.

If this is possible can "any old" ISP provide this service? What does she ask for?
posted by Mutant to Computers & Internet (17 answers total)
 
Best answer: You can host multiple domains, serving unique content to each, by making use of the Host header in the HTTP request.

Apache and IIS will allow you to set up different virtual servers based on the Host header, or you can use server-side scripts (i.e. PHP pages) to decide which content to send depending on the value set in the Host header. This is probably the way to go in your case.
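
For illustration only, here is a minimal Python sketch of the "decide in a script" approach. The SITES table, the page text, and the function names are all made up for this example; a real setup would more likely look up templates or database rows per domain.

```python
# Hypothetical per-domain content table; the domain names are placeholders.
SITES = {
    "you-suck.com": "Welcome to the .com site",
    "you-suck.eu": "Welcome to the .eu site",
}


def normalize_host(host_header):
    """Lowercase the Host value and drop any :port suffix."""
    host = host_header.strip().lower()
    # Strip a trailing port such as ":8080" (IPv6 literals are not handled here).
    if ":" in host:
        host = host.split(":", 1)[0]
    return host


def content_for(host_header):
    """Return the page body for the requested domain, or None if unknown."""
    return SITES.get(normalize_host(host_header))
```

Calling content_for("You-Suck.EU:80") returns the .eu text, since hostnames are case-insensitive and the port is not part of the name.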
posted by Good Brain at 12:17 AM on June 18, 2006


Host headers are almost universal, but there are still some cases where this technique breaks. Basically, it depends on the user's browser sending along a header indicating which hostname they are accessing.

For larger projects, I would recommend not using this technique. You could buy a range of IPs from your ISP, and have several pointing at one computer (you can have more than one network card in that computer, each bound to a different IP)

This way, you can later break the system up for failover, etc. It's risky having multiple services on one box.
posted by clord at 12:26 AM on June 18, 2006


Clarifying...

What you want to ask for is "X static IPs", where X is the number of hosts you want. You can also ask for a certain netmask, which is the more technical version of the same request.

You mention 400 hosts, so you would need a netmask of at least 255.255.254.0, possibly more if you want to grow.
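
The netmask arithmetic can be checked with a short Python sketch using the standard ipaddress module. The function name is made up, and it assumes the usual convention of reserving two addresses per block for the network and broadcast addresses.

```python
import ipaddress


def smallest_netmask(hosts_needed):
    """Return (prefix_length, netmask, usable_addresses) for the smallest IPv4 block."""
    # Reserve two addresses per block for the network and broadcast addresses.
    total = hosts_needed + 2
    # Number of host bits needed to count up to `total` addresses.
    host_bits = (total - 1).bit_length()
    prefix = 32 - host_bits
    network = ipaddress.ip_network(f"0.0.0.0/{prefix}")
    return prefix, str(network.netmask), network.num_addresses - 2


prefix, netmask, usable = smallest_netmask(400)
print(prefix, netmask, usable)  # prints: 23 255.255.254.0 510
```

A /24 (255.255.255.0) only offers 254 usable addresses, so 400 hosts push the request up to the next block size.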

Putting 400 network cards in a computer is just a bit excessive, so you'd need to use some fancy redirection. I don't think what you want is possible to do reliably with a single computer... perhaps some sort of virtualization would work.

If this is some sort of domain parking type thing, then host headers are your best bet, since quality of service won't matter as much.
posted by clord at 12:35 AM on June 18, 2006


When do host headers break, and how often is that likely to occur? Older browsers (circa 8 years ago) didn't support host headers, and half-assed scraper scripts don't support them either. I doubt either is at all common, and they probably aren't a very important audience.

As for how to deal with IP-based hosting: it's not hard to assign multiple addresses to a single NIC in Linux, FreeBSD, or Windows Server, though 400 might be a bit excessive (I don't really know).
posted by Good Brain at 12:55 AM on June 18, 2006


clord, with all due respect, I don't think you know what you're talking about here. You don't want to do this on the IP level, not with 400 or so sites, and certainly not on the NIC level. This is what host headers are designed for, and the only people with whom host headers don't work are shitty ancient browsers or people writing http automation who don't know how to pass http headers for some reason.

Host headers give you plenty of flexibility for future growth, and simplify your management of the systems. I've hosted sites where a single box might respond on ~200 different URLs with differing wireframes and localization, and when done right it's a dream to manage and maintain. Depending on the nature of the sites and their content, it'll be vastly easier to do this in code, i.e. .net or php level host header parsing than to try to manage that many separate virtual sites or god forbid virtual IPs and NICs/machines. Also, for that many domains, you might want to start thinking about hosting your own solution directly. Depending on how critical uptime is, along with bandwidth usage, site revenue, etc, it could be cheaper and more useful to have your own server(s) and just co-lo for the cost of the bandwidth, or run it from your own house.

Lastly, the big unanswered question is what is being hosted that required 400 domains- is it possible you might need to reconsider how you're building this project that it's this messy? Why not one domain, and 400 ./blah/ vdirs? Whenever I have projects that start to look very messy, that's a good clue that I need to go back to the drawing board and re-evaluate how I've designed.

Oops- one more thing. It's not clear from your terminology if you're doing CNAMEs or actual http 302 forwards- the latter isn't very good for this setup, since you want a simple DNS structure but not make the user pay the price of hopping around pages- and it will make the URL-based content more complicated to serve when you're 302'ing them all over your site(s). CNAME you-suck.eu -> you-suck.com, and the browser should properly send the .eu host header along in its GET request to the IP for you-suck.eu, with only a single DNS lookup.
posted by hincandenza at 1:09 AM on June 18, 2006


Honestly, if passing the Host header was in any way unreliable, then thousands of shared web hosts hosting millions of websites would be in a very precarious position. I don't think it's even worth worrying about.
posted by evariste at 1:31 AM on June 18, 2006


Best answer: What you are describing is called name-based virtual hosting and it is extremely common, and has been in wide use since approximately 1995. Don't worry about anything breaking. The "host:" header is a required field in HTTP 1.1 and virtually all clients that use HTTP 1.0 will still send it. Unless you're trying to support ancient browsers from the early 90s you have nothing to worry about.
posted by Rhomboid at 3:23 AM on June 18, 2006


And by the way, if you want to see just how prevalent this practice is, enter a domain name or IP address here and note on the results page how many other websites are hosted on that IP address. I assure you that for the vast majority of sites in existence virtual hosting is used. Under no circumstances should you feel the need to get a different IP address for each domain, that is pure crazy talk.

(Unless of course you're dealing with SSL, which is a whole other issue.)
posted by Rhomboid at 3:30 AM on June 18, 2006


Sounds like you need a web server configured to direct all HTTP requests to the same 'site', which then uses the host header to decide what to present.

You're going to need a developer involved to get it all working, so it'd make sense to sort that out first so they can help work out the hosting & general architecture. I can't think of many situations where 400 domains for a single project makes sense, so you might find that's the first aspect that gets questioned.
posted by malevolent at 3:48 AM on June 18, 2006


Response by poster: Thanks for the tips so far guys, and sorry if I wasn't clear in some of the terminology.

When I say DNS, I mean the registrar, Gandi.net in this case, allows domains registered by them to be forwarded. In my ignorance I was assuming this was the correct approach, but I think now it might not be - the DNS might have to be handled externally, correct?

Will this then allow the single server to see what original domain was invoked to forward this request?

Thanks for all your tips / feedback - much appreciated!

Take care
posted by Mutant at 3:57 AM on June 18, 2006


Point all the domains at the same IP address using DNS and configure the webserver (or write a PHP script) to show the appropriate content based on the Host header.

And yes, clord is out of his mind. A browser that doesn't send the host header will be unable to access a high percentage (possibly the majority) of websites out there.
posted by cillit bang at 6:12 AM on June 18, 2006


Rhomboid is correct. You want name-based virtual hosting.

What happens is this: you list all your domains with your DNS host, and have them point 'www.domain1.com', 'www.domain2.com', 'www.domain3.com', etc, all at the same IP address. When a web browser goes to 'www.domain1.com', the server, which is running name-based virtual hosting, will automatically and transparently (to you) show them the correct domain. You just set up all your sites and it works like magic.

You also need one 'root' site, which is the site when the domain doesn't match, or when they come in via IP address. You may want to redirect to one of the named hosts, or you may want to throw an error... that's up to you.
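
As a rough sketch of that fallback behaviour, here is a tiny WSGI application in Python. The SITES table and DEFAULT_REDIRECT are placeholders invented for the example, and redirecting (rather than throwing an error) is just one of the two options mentioned above.

```python
# Hypothetical site table and fallback target; none of these values are real.
SITES = {
    "www.domain1.com": b"Content for domain1",
    "www.domain2.com": b"Content for domain2",
}
DEFAULT_REDIRECT = "http://www.domain1.com/"


def app(environ, start_response):
    """WSGI app: serve per-domain content, redirect anything unrecognized."""
    # HTTP_HOST is absent when the client sent no Host header at all.
    host = environ.get("HTTP_HOST", "").split(":", 1)[0].lower()
    body = SITES.get(host)
    if body is None:
        # Unknown name or bare IP address: send the visitor to a named host.
        start_response("302 Found", [("Location", DEFAULT_REDIRECT)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

A request that arrives by IP address, or with a name missing from the table, ends up at the 'root' behaviour instead of a random site.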

This is extremely common and well-proven. There should almost certainly be enough expertise wherever you're hosting your box to get this running for you.

The vast majority of the world's low-traffic sites run in exactly this way.
posted by Malor at 6:53 AM on June 18, 2006


Mutant: When I say DNS, I mean the registrar, Gandi.net in this case, allows domains registered by them to be forwarded. In my ignorance I was assuming this was the correct approach, but I think now it might not be - the DNS might have to be handled externally, correct?

Will this then allow the single server to see what original domain was invoked to forward this request?
This is a critical question in terms of this project. Let's be clear on the difference between a "forward" and a "CNAME":

CNAME: A CNAME is a 'canonical name' record, which links one domain name to another in DNS. This is handled by the DNS server, transparently to the user. When they request a DNS record for a name like 'you-suck.eu', they get back the IP and go directly there. This also simplifies DNS management for you, as you can just create a single master entry and then link the other entries to it in DNS: you only have to worry about keeping one DNS record up to date if you move your server, etc. To the user, they are requesting 'you-suck.eu' directly when they contact your web server, which makes it easy for you to see what site they're requesting, because they are contacting your server using 'you-suck.eu' directly after a single DNS lookup.


Forwarding: A forward means an http 302, where the browser comes to a URL, and that server responds and says "Hey, you want to go to this other URL". The browser is told this in a way that it automatically understands, and it goes and does a second DNS lookup, and then contacts that site. This is specifically not good for your project, because now the browser will be contacting "you-suck.com" by name, and your only clue they meant to go to 'you-suck.eu' is if they pass the Referer header.

Now, contrary to what we were saying about host headers, the Referer header is unreliable. Not the least of which is, what happens when someone bookmarks a page? The next time they come back, they won't be doing the DNS/forward route, so you'll think they're asking for you-suck.com, and they'll be pissed the site isn't remotely the same as the 'you-suck.eu' one they thought they'd bookmarked! Forwards serve a good purpose, including a gentle way to train users to start using a new URL; they are terrible for multiple-site hosting where you want to differentiate the content depending on the original URL requested.
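
To make the difference concrete, here is a small Python sketch of what reaches the server in each case. The helper names are invented, and the request is reduced to the bare minimum (a request line plus the Host header); real browsers send many more headers.

```python
def build_request(url_host, path="/"):
    """Return the minimal HTTP/1.1 request a browser would send for this host."""
    return f"GET {path} HTTP/1.1\r\nHost: {url_host}\r\n\r\n"


def host_seen_by_server(raw_request):
    """Extract the Host header value from a raw request string."""
    # Skip the request line, then scan the header lines.
    for line in raw_request.split("\r\n")[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return value.strip()
    return None


# DNS-only setup (CNAME or A record): the browser keeps the name the user typed.
direct = build_request("you-suck.eu")
# Registrar forwarding: the 302 sends the browser to the target name instead.
forwarded = build_request("you-suck.com")

print(host_seen_by_server(direct))     # you-suck.eu
print(host_seen_by_server(forwarded))  # you-suck.com
```

In the forwarded case the original name never appears in the request your server receives, which is why the server cannot tell the domains apart.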

Conclusion:
So definitely use CNAMEs. It should actually be easier on the registrar, since CNAMEs don't require the registrar to maintain a web server whose sole job is to do http forwards for all their customers (since that http 302 forward will not be done by your server, but by theirs). You could block your ~400 domains themselves into logical groups, and do some short CNAME chains. For example, if you ran many int'l flavors of your site, you might make all European records point to a EU.you-suck.com, which itself points to www.you-suck.com. The Asian sites might point to AS.you-suck.com, etc. This is purely a management element: having 400 records CNAMEd to a single record can get somewhat confusing in terms of making sure they're all set up.

And best of all, the user experience will be tighter and simpler: they'll look up you-suck.jp, and without even knowing it the DNS server is walking the CNAME chain (in milliseconds) and handing back to them a single IP, which is your server. Bam- you can easily see what site they were requesting, and serve them the right content. There are no other intermediary layers to worry about.
posted by hincandenza at 11:33 AM on June 18, 2006


You don't even really need the CNAME; you can just put the same A record in for each of those 400 domains. Of course, this requires more maintenance if the IP address ever changes.
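
Since the maintenance cost is mostly about keeping 400 identical records in sync, generating them from a list helps. A minimal Python sketch, assuming zone-file style output; the function name, the TTL, and the 192.0.2.10 address (from the documentation-only range) are placeholders.

```python
def a_records(domains, ip_address, ttl=3600):
    """Return one zone-file style A record line per domain."""
    return [f"{domain}. {ttl} IN A {ip_address}" for domain in domains]


# Placeholder domains and a documentation-range IP address (192.0.2.0/24).
for line in a_records(["you-suck.com", "you-suck.eu", "you-suck.jp"], "192.0.2.10"):
    print(line)
```

If the server moves, rerunning this with the new address regenerates every record in one go.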

The only problem with name-based virtual hosts is if you start using SSL. The SSL setup and authentication happens *before* the client sends the Host: header, which means that the server doesn't know which hostname it's supposed to be identifying as. In that case, you might have to go with multiple IP addresses, unless there's a workaround for this problem.
posted by hattifattener at 12:26 PM on June 18, 2006


The only problem with name-based virtual hosts is if you start using SSL.

I always thought SSL required a unique IP address because of this?


Anyways, for the original question: yes, it's easy. You need two things set up for doing non-SSL sites.
  1. All of your sites must point to the same IP address
  2. Set up Virtual Servers on your HTTP daemon of choice (Apache tends to be the favorite, and its documentation covers this feature)
400 domains will be a major pain in the ass without some software to help keep things straight... If you host with a company like DreamHost, their software will keep track of stuff well enough (not pleasant, but well enough).... if not, look into some of the software out there like cPanel that will let you mess with all of that in an easy interface.
posted by hatsix at 10:34 PM on June 18, 2006


SSL doesn't work with name-based virtual hosting because the "Host:" header (or any header for that matter) is only sent after the connection has been established, which is after the point at which SSL has completed its handshake, wherein the server presents its certificate to the client. The server therefore can only present one certificate, which means that effectively only one SSL domain can be hosted per IP address, since the certificate contains the domain name in the CN field.

In other words, it's a chicken and egg situation: in order to know which certificate to present to the client, the server has to know which domain the client is requesting, but it only knows this when the client sends the "Host" header which occurs well after the SSL handshake has completed.

I think this situation is addressed in one of the newest versions of the TLS standard, but this is a draft standard and nothing supports it yet.

That is the only reason, it has nothing to do with supporting prehistoric HTTP clients that don't know about the Host header.
posted by Rhomboid at 11:49 PM on June 18, 2006


So... here's how we do this (we host thousands of sites like you describe.)

1) We host our own box, but that's not really important overall, if you have a competent network d00d or good control over your machine. We use Ubuntu server, but the actual distro isn't important. Depending on what you have in mind, though, you might be surprised at how cheap it is to build a machine at home and drop it in a colo facility which can take care of networking/power/cooling.

2) We handle our own DNS using MyDNS. It's a MySQL-based DNS server. Not as robust as BIND, but it's a lot easier to integrate into our production process. (I'm assuming that these sites will all be generated/controlled by something.) All you gotta do is add entries to your MySQL table and it'll resolve. PLUS! There's a nice PHP-based admin thing for it called mydnsconfig, so if you need examples of how to set stuff up, it's not too hard.

Depending on how you're going to register all these domains, just make sure you've got the nameserver all set up and properly registered with your domain system. (GoDaddy, for instance, is kinda weird about that.)

3) Check out mod_vhost_alias for Apache. It lets you handle all your directories like:
/var/www/websites/examplesite.com

No rewriting of virtual hosts files or anything like that. All I did was add the following line to my apache2.conf:

VirtualDocumentRoot /var/www/websites/%-2.0.%-1

That means that anything you send to examplesite.com or www.examplesite.com will resolve to the same folder.
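
To see what that pattern does, here is a rough Python imitation of the mapping. It is only a sketch of the %-2.0.%-1 interpolation (second-to-last label, a dot, last label), not Apache's actual implementation; the function name is made up, and single-label hostnames are simply rejected here rather than handled the way Apache would.

```python
def document_root(hostname, base="/var/www/websites"):
    """Mimic VirtualDocumentRoot /var/www/websites/%-2.0.%-1 for a hostname."""
    # Hostnames are case-insensitive and may carry a trailing dot.
    parts = hostname.lower().rstrip(".").split(".")
    if len(parts) < 2:
        # This sketch does not reproduce Apache's handling of missing parts.
        return None
    # %-2.0 is the whole second-to-last label, %-1 is the last label.
    return f"{base}/{parts[-2]}.{parts[-1]}"


print(document_root("www.examplesite.com"))  # /var/www/websites/examplesite.com
print(document_root("examplesite.com"))      # /var/www/websites/examplesite.com
```

Note that only the last two labels are used, so a name like www.example.co.uk would map to a directory called co.uk; domains under two-part suffixes need a different pattern.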

Again, this makes for a nice fit into whatever automated process you're going to use to generate all these sites.

I should add that this example assumes that you/your team are going to be the only people who need ftp access.
posted by ph00dz at 7:44 AM on June 19, 2006

