Why are my domains having intermittent DNS issues?
August 7, 2006 8:02 AM

I'm having crazy issues with my Debian DNS server, and I don't know how to troubleshoot them. Why would I not even see the DNS requests coming in?

The problem: I have three domains. I have people from around the globe complaining of intermittent issues resolving these domains. I am always able to resolve them.

The three domains are Smallbusiness.com, pornucopia.org, and Sadclown.com. (These are made-up examples.) The first two are registered with one of the popular major registrars. The third is registered with a minor registrar that I don't even remember the name of.

The first two are on Server A, and the third one is on Server B. Server A and Server B have identical installations of Debian on them. Server A and Server B are hosted in different CoLo facilities. All three domains on both servers are having the same problems. This made me suspect it was a problem with the Bind9 that came with my Debian distros. Also, Server A is the authoritative name server for Smallbusiness.com and pornucopia.org; Server B is the NS for Sadclown.com. (Each is also the web server hosting that domain.)

However, I recently did some troubleshooting. Denny, my customer, said that he couldn't get to pornucopia.org. Fine, I said. I got onto Server A and ran a tcpdump, looking for any traffic on port 53. Here's what I saw:

1. Denny tried pornucopia.org in his browser. I saw no domain requests incoming. Denny got a "Hostname couldn't be found" error.

2. I tried pornucopia.org from a different remote machine, and I saw a domain request come in, and the appropriate answer go out.

3. Denny tried smallbusiness.com in his browser. I saw a domain request come in. (Remember, this is the same server and same bind9 installation as pornucopia.org!)

Various Sadclown.com customers have been complaining about resolution problems for ages, so it seems that the same thing is afflicting my other server. I've never had a problem with that one either.

For a while I wondered if it could be a problem with the Apache virtual hosts, but my tcpdumps seem to indicate that it's a problem with the root name servers, or the registrars, or something crazy like that. (Though they are on different registrars.)

It seems that when a customer has a problem with resolution, it lasts for a long time. Denny hasn't been able to resolve Pornucopia.org for about 4 days now. Denny has had this problem come up once or twice before, and it lasted several days, then went away.

Any suggestions?
posted by pornucopia to Computers & Internet (2 answers total)
 
When someone makes a DNS request, it typically first goes to the domain name server of their ISP. Often, it doesn't get beyond it -- DNS answers get cached on the remote servers, and so the ISP's domain name server responds with the cached (possibly inaccurate) info. It's only when there isn't material in the cache, or the cache has expired, that the ISP's domain name server goes out to make a request.

In addition, one's own networking software typically caches DNS. So, the fact that you didn't see a request when Denny visited pornucopia.org most likely means that Denny's machine or his ISP's name server had already looked it up, and for some reason, there was bad info (possibly a cached failure) in the cache.
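
To make the caching point concrete, here's a toy sketch in Python. It isn't how any real resolver is written; CachingResolver and its upstream function are made up for illustration. It just shows that once an answer (even a failed one) is in the cache, repeat lookups never reach your server until the TTL runs out.

```python
import time


class CachingResolver:
    """Toy stand-in for an ISP resolver: answers from cache until the TTL expires."""

    def __init__(self, upstream, clock=time.monotonic):
        self.upstream = upstream    # callable: name -> (answer or None, ttl_seconds)
        self.clock = clock          # injectable so the example can fake time
        self.cache = {}             # name -> (answer, expires_at)
        self.upstream_queries = 0   # requests that actually reached the authoritative server

    def resolve(self, name):
        hit = self.cache.get(name)
        if hit is not None and hit[1] > self.clock():
            # Served from cache: a tcpdump on the authoritative server sees nothing.
            return hit[0]
        self.upstream_queries += 1
        answer, ttl = self.upstream(name)
        # A failed lookup (answer is None) is cached too, just like a good one.
        self.cache[name] = (answer, self.clock() + ttl)
        return answer
```

If the upstream once returned a failure with a long TTL, every later resolve() for that name keeps returning the failure, and upstream_queries stays at 1 until the entry expires.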

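What TTL do the records in your zone files carry (the $TTL default and any per-record values), and what are the SOA values set to? The TTL is what tells ISPs how long to cache your DNS info; the SOA refresh, retry and expiry values control how secondaries sync with the primary, and the last SOA field sets how long a failed lookup gets cached. If the TTLs are especially high (ie, long), ISPs will be caching your DNS info for long times, meaning that when you make changes, people will be getting served the wrong information for some time. Try setting them down to the order of a couple of hours if they're set higher.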
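
If it helps to read those SOA numbers, here's a rough Python sketch that splits an SOA record into named fields and lists which timers are above a limit you pick. It assumes the one-line form with plain seconds that `dig +short SOA` prints; zone-file shorthand like 3h or 1w isn't handled, and the function names and the example.com hosts in the usage note are made up.

```python
# Field order as defined for the SOA record in RFC 1035.
SOA_FIELDS = ("mname", "rname", "serial", "refresh", "retry", "expire", "minimum")


def parse_soa(rdata):
    """Split one-line SOA data into a dict; numeric fields become ints (seconds)."""
    # Tolerate the parentheses that zone files use to wrap the numeric fields.
    parts = rdata.replace("(", " ").replace(")", " ").split()
    if len(parts) != len(SOA_FIELDS):
        raise ValueError("expected %d SOA fields, got %d" % (len(SOA_FIELDS), len(parts)))
    soa = dict(zip(SOA_FIELDS, parts))
    for key in SOA_FIELDS[2:]:
        soa[key] = int(soa[key])
    return soa


def timers_over(soa, limit_seconds):
    """Return the names of the SOA timers that are longer than limit_seconds."""
    # refresh/retry/expire are used by secondaries; minimum is the
    # negative-caching TTL (RFC 2308).
    return [k for k in ("refresh", "retry", "expire", "minimum") if soa[k] > limit_seconds]
```

For example, timers_over(parse_soa("ns1.example.com. hostmaster.example.com. 2006080701 10800 3600 604800 86400"), 7200) lists every timer longer than two hours.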

Also, you didn't talk about secondary DNS. Do you have machines acting as secondary DNS for each of these domains? Most registrars require at least two DNS servers, and if you specified something for secondary that doesn't deliver data, that may be what the people who are trying to access your site are running up against. Information on these domains would be helpful, if only so that we could try our own DNS tests.
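
For the kind of test I mean, here's a minimal Python sketch that sends a query straight to one specific name server, skipping every cache in between, and reports whether it answered authoritatively. It only uses the standard library and only reads the reply header; the function names are mine, it doesn't validate label lengths, and the IP in the usage note is a placeholder, not your server.

```python
import socket
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query packet (RFC 1035); qtype 1 = A, 2 = NS."""
    # Header: id, flags (0 = standard query, no recursion requested),
    # then 1 question and 0 answer/authority/additional records.
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending in a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class 1 = IN


def parse_header(reply):
    """Pull the interesting bits out of the 12-byte DNS reply header."""
    query_id, flags, _qd, answers, _ns, _ar = struct.unpack("!HHHHHH", reply[:12])
    return {
        "id": query_id,
        "authoritative": bool(flags & 0x0400),  # AA bit
        "rcode": flags & 0x000F,                # 0 = no error, 3 = NXDOMAIN
        "answers": answers,
    }


def ask_server(server_ip, name, qtype=1, timeout=3.0):
    """Send one UDP query to server_ip and return the parsed reply header."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, qtype), (server_ip, 53))
        reply, _addr = sock.recvfrom(512)
    return parse_header(reply)
```

Something like ask_server("192.0.2.53", "pornucopia.org") run against each listed name server in turn would show which ones answer; a timeout exception means that server wasn't reachable from wherever the test ran.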

Lastly, people will always have intermittent DNS problems, because their own ISP's DNS servers go down or act flaky. It's something you can minimize, but it's not something you can eliminate. Expect issues outside of your control a couple of times a year if you have a large userbase.
posted by I EAT TAPAS at 8:18 AM on August 7, 2006


Response by poster: I don't have secondary name servers for any of the domains. I should probably set up Server A to be secondary for Server B, and the reverse. Thanks for the suggestion.

I will check the SOA values... but since the IP address hasn't changed in a couple of years, that shouldn't be an issue, right?
posted by pornucopia at 9:04 AM on August 7, 2006

