Why did AT&T lose our DNS records!?
December 2, 2008 7:08 AM

iPhones, AT&T, and the University of Michigan..?! Help me sort out what is a - best I can figure - DNS issue!

I will attempt to be brief and only cover the salient details.

I work for the EECS Department at the University of Michigan doing IT support. I am the desktop support guy. In our department we also run our own web and e-mail services (and servers). I don't have much to do with this - these are all rack mounted Linux machines and they're administered by a bunch of Unix people. I am, I guess, sort of a filter between these cantankerous admins and the general population. Admins change the mail configuration. It breaks Outlook. I figure out how and why, and fix Outlook. They claim people should just use Berkeley Mail. I politely disagree. We eventually reach compromise.

Anyway. I have an iPhone, as do several faculty, staff and graduate students in the department. I use this phone with my EECS IMAP mail server. This has never been a problem for me.

Then, suddenly, last week I could no longer send or receive e-mail from my EECS mail account while on the AT&T network. It would work on the campus WiFi. It would work on my WiFi at home. It would work at random WiFi hotspots. But it refused to work while using 3G (or EDGE) data services.

My GMail account worked without issue.

Well, I figured this was just a transient issue and it would disappear.

No such luck.

Instead, starting yesterday my trouble ticket system started to fill up with complaints from other iPhone users that their EECS mail is also not functioning on the cellular network.

At this point, I started to do some digging.

It seems like the AT&T network has, like - and here I lose the technical language a little bit - lost our DNS entries.

I installed an SSH client on my iPhone and on the 3G network I can't connect to one of our machines - marquette.eecs.umich.edu - I get a "hostname not found" error. But, if I try it via IP, it works.

I tried to browse to our website, www.eecs.umich.edu, on Safari. On the 3G network, it can't open the page. If I try it via IP, it works.

I went into my mail client and changed my incoming and outgoing mail servers to IP values instead of hostnames, and, voilà, on 3G it started to work.

What the hell?

Our admins don't have a clue. They say our MX records and our servers and our firewalls are all configured correctly. They look at the logs and, yes, as you would suspect (as the names can't be resolved), they don't even see my phone attempting a connection.
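For anyone who wants to repeat the name-versus-IP test from a laptop tethered to the same network, here is a minimal Python sketch. It is only an illustration: the function name `diagnose`, its return labels, and the injectable `resolve`/`connect` arguments are made up for this example, and any IP you pass in is whatever address you already know for the machine.

```python
import socket


def diagnose(hostname, ip, port, resolve=socket.gethostbyname,
             connect=None, timeout=5.0):
    """Return "ok", "dns", or "network" for one host.

    "dns" means the known IP answers but the name does not resolve,
    which is the symptom described above.
    """
    if connect is None:
        def connect(address):
            # Open and immediately close a TCP connection to (ip, port).
            with socket.create_connection(address, timeout=timeout):
                return True

    try:
        resolve(hostname)
        name_resolves = True
    except OSError:  # socket.gaierror is a subclass of OSError
        name_resolves = False

    try:
        ip_reachable = bool(connect((ip, port)))
    except OSError:  # refused, unreachable, or timed out
        ip_reachable = False

    if not ip_reachable:
        return "network"
    return "ok" if name_resolves else "dns"
```

Calling `diagnose("marquette.eecs.umich.edu", known_ip, 22)` on WiFi and again on 3G would show whether only the name lookup differs between the two networks; `known_ip` is a placeholder for the address you already use.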

Data points I think would be helpful - is this a geographical issue? Can people in other parts of the country with iPhones browse to, say, www.eecs.umich.edu? Is this an iPhone specific issue? I've read some criticism of the iPhone's DNS resolver - does this affect other mobile devices on the AT&T network?

That settled, what the hell could be causing this? I have no idea who in AT&T I would possibly call - it sounds like the support call from hell. I have a hard enough time just trying to check how many minutes I've used.

Any ideas, hive mind?
posted by kbanas to Technology (13 answers total)
 
Response by poster: Oh, I also realize that this sort of coincides with the release of the iPhone 2.2 software on the 21st. It doesn't exactly coincide, but then again, I couldn't say when precisely I noticed this problem for the first time.

I tried to roll back to 2.1 to test, but I got an error code from iTunes and further research indicated it would most likely require fiddling with 3rd party applications and so I decided to not even try.
posted by kbanas at 7:10 AM on December 2, 2008


I pulled up www.eecs.umich.edu fine on my iPhone via AT&T's 3G network. FWIW, I was able to pull up MX records for eecs.umich.edu from OpenDNS and a pair of nameservers that belong to AT&T. Both digs show newman and cliff (clever!) as your MXs.
posted by jquinby at 7:26 AM on December 2, 2008


Response by poster: jquinby, what software revision are you running on your iPhone?

(No one ever gets that about cliff and newman... thanks! :)
posted by kbanas at 7:29 AM on December 2, 2008


I am running 2.2, so it's the latest and greatest. I found an article about some low-level bugs in the iPhone's DNS resolver behavior, but it seems to be related to CNAMEs.

I just checked the app store on iTunes and there's a free DNS lookup tool that might be useful for troubleshooting. As the price is right, I'm going to grab it anyway.
posted by jquinby at 7:37 AM on December 2, 2008


Response by poster: You know, I could have sworn I looked for a DNS lookup tool yesterday and only found one that was $4.99, but now that I look again I see "Lookup" for free.

I'm checking it out now. Thanks for the help, jquinby - very curious that it works just fine for you.
posted by kbanas at 7:41 AM on December 2, 2008


Just to be clear, the iPhone mail client shouldn't care about MX entries when sending or receiving mail. That is the mail server's problem while sending (a receiving mail server might also care as part of validating the sending server via SPF records). I mention this because the admins telling you that everything is A-OK with the MX records for your domain is a red herring.

What are the hostnames for your incoming and outgoing mail servers? What nameservers are authoritative for those hostnames? If you try resolving those hostnames directly against the appropriate nameserver, does each of them give the right answer in a timely manner?
posted by Good Brain at 7:56 AM on December 2, 2008


Response by poster: Good Brain,

And here you get a little beyond my knowledge with these things.

I will attempt to answer as best I can -

For my particular mail configuration, my incoming mail server is marquette.eecs.umich.edu and my outgoing mail server is mail.eecs.umich.edu.

The authoritative nameservers would be - what, exactly? The DNS servers that we run in our department? We have 2 of these - csedns.eecs.umich.edu and eecsdns.eecs.umich.edu.
posted by kbanas at 8:01 AM on December 2, 2008


kbanas, it does indeed look like it's a local AT&T DNS issue, as I'm here in Ann Arbor and I just tried www.eecs.umich.edu on my original iPhone running 2.2 and got the same results you did: can't reach it by name, works fine by IP. It's pretty clearly their problem, but as you said, it's going to be damn tricky to get that call escalated to a technician. You might actually try going into the AT&T store on Liberty and showing them this problem and see if they can start a ticket for you; that way you'll have the error documented locally by an employee, which should get you a few steps closer to the AT&T geeks who can fix this. Good luck, and feel free to mefimail me if you need another local test.
posted by ulotrichous at 8:03 AM on December 2, 2008


I can open your page from here (Massachusetts) on my 1st gen, 2.2 firmware iPhone -- WiFi or EDGE -- with no problem.
posted by Rock Steady at 8:46 AM on December 2, 2008


Good luck finding someone who could actually address your problems at ATT. Your best bet is probably getting someone in the UMich purchasing department to call the highest ranked person they know. If you associate your issue with the millions in revenue from UMich, you'll likely have better results. May I also suggest Turboing?

Back in Feb 2008 ATT broke reverse DNS for all my phone based mail users (we're in NYC). If you connect using wap.cingular as your GPRS APN (iPhone, smartphones, etc.) and do a reverse lookup on your public IP, you get a name which doesn't resolve to that (or any) IP. TCP Wrappers as compiled by default (paranoid mode) STRONGLY frowns on mismatched forward/reverse DNS, killing the connection as soon as it notices the wonky DNS. Dunno if you use TCP Wrappers.

Nine months into this reverse DNS issue, it's clear to me that ATT doesn't care about DNS at all, so it wouldn't surprise me if they accidentally also broke reverse DNS for the ATT Ann Arbor DNS servers themselves, which your DNS servers might be ignoring due to the aforementioned TCP Wrappers.
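If you want to see whether a given client IP would trip that kind of forward/reverse check, here is a rough Python sketch. It approximates the idea (IP to name, then name back to addresses) and is not TCP Wrappers' actual code; the function name `forward_confirmed` and the injectable `reverse`/`forward` arguments are invented for this example.

```python
import socket


def forward_confirmed(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Return True if ip -> name -> addresses leads back to ip."""
    try:
        # Reverse lookup: (hostname, aliases, addresses) for the IP.
        name, _aliases, _addrs = reverse(ip)
        # Forward lookup of that name: (hostname, aliases, addresses).
        _name, _aliases, addresses = forward(name)
    except OSError:  # covers socket.herror and socket.gaierror
        return False
    return ip in addresses
```

A result of False for the public IP your phone gets on the cellular network would match the mismatch described above, where the reverse name does not resolve back to the same address.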
posted by notpeter at 9:29 AM on December 2, 2008


kbanas, you can see what nameservers are supposed to answer authoritatively for a given hostname by running "nslookup," telling it "set type=NS," hitting return, and then entering the hostname you are interested in.

This is what it tells me for both your incoming and outgoing mailserver:
Authoritative answers can be found from:
eecs.umich.edu nameserver = dip.eecs.umich.edu.
eecs.umich.edu nameserver = csedns.eecs.umich.edu.
eecs.umich.edu nameserver = zip.eecs.umich.edu.


I can then connect directly to each of those DNS servers by telling nslookup "server SERVERNAMEorSERVERIP." I can then see what each server tells me by setting the query type back to A (address records) with "set type=A" and then looking up the hostnames of interest. When I do that, I get timely answers from csedns.eecs.umich.edu for the names of both your incoming and outgoing mail servers. I get no answer from the other two; the requests time out after 30s or so. I can't successfully ping either of them, while I can ping csedns. I note that dip and zip appear to be on a different subnet than csedns.

My guess is that this is the right place to look for the problem. It could be that there are times when none of your DNS servers are available from the Internet at large, or from all or part of AT&T's network. Or it could be that your iPhone is timing out DNS requests before the server it is using manages to retry on the responsive server.
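If nslookup isn't handy, the per-server check above can be approximated in Python with nothing but the standard library. This is a sketch under assumptions: the function names (`build_query`, `answer_count`, `probe`), the fixed query ID, and the 5 second timeout are all made up for illustration, and it only reads the response header rather than decoding the answer records.

```python
import socket
import struct


def build_query(hostname, query_id=0x1234, qtype=1):
    """Build a minimal DNS query packet (qtype 1 = A record, class IN)."""
    # Header: id, flags (0x0100 = recursion desired), 1 question,
    # then zero answer, authority, and additional records.
    header = struct.pack(">HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Each label is a length byte followed by the label text.
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in hostname.rstrip(".").split(".")
    )
    return header + labels + b"\x00" + struct.pack(">HH", qtype, 1)


def answer_count(response):
    """Return (rcode, number of answer records) from a DNS response header."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack(">HHHHHH", response[:12])
    return flags & 0x000F, ancount


def probe(server, hostname, timeout=5.0):
    """Ask one nameserver for hostname's A record over UDP port 53."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(hostname), (server, 53))
            response, _addr = sock.recvfrom(512)
        except socket.timeout:
            return "timeout"
        except OSError as exc:  # e.g. the server name itself won't resolve
            return f"error: {exc}"
    if len(response) < 12:
        return "short response"
    rcode, ancount = answer_count(response)
    return f"rcode={rcode} answers={ancount}"
```

Running `probe(server, "marquette.eecs.umich.edu")` for each of dip, csedns, and zip would show which servers answer and which time out. An rcode of 0 with at least one answer is the healthy case.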
posted by Good Brain at 9:50 AM on December 2, 2008


This kind of thing always happens to me on cell connections. I've seen DNS nuttiness on AT&T, T-Mobile, and Sprint. I just give up and set my phone's DNS to 4.2.2.2.
posted by damn dirty ape at 11:22 AM on December 2, 2008


Your best bet is probably getting someone in the UMich purchasing department to call the highest ranked person they know. If you associate your issue with the millions in revenue from UMich, you'll likely have better results.

Agreed - nothing like getting someone to wave a big heavy club to get a vendor to sit up.

This page on the Procurement Office's site lists the administrator for the UM AT&T cell phone contract. I realize your problem may not involve this contract, but the administrator should know how to get this escalated.

Or you could just shoot yourself, which is what I always end up wanting to do when I deal with Procurement. But I suppose that's not helpful.
posted by shiny blue object at 5:55 PM on December 2, 2008

