"Oh look, the sun went out."
September 1, 2009 3:08 PM
Gmail was down this afternoon. What, exactly, is happening when gmail is "down," and what are the kind of things that can cause something like Google/gmail to not work?
Any number of things might take gmail down, or cause it to start working 'wrong' in such a way that its managers would rather block access till the problem is resolved. This is kind of like saying, "My car is in the shop, what kinds of things would do that?"
posted by Tomorrowful at 3:12 PM on September 1, 2009 [2 favorites]
>What, exactly, is happening when gmail is "down,"
Could be 1000s of things. Basically, a program that sits between your browser and your email data (something like Apache, MySQL, etc.) is no longer able to do its job. This could be because of a bad update, the database server being down....
>and what are the kind of things that can cause something like Google/gmail to not work?
Again, many things could cause this. Human errors, software errors, bad people (DoS attacks).
More important is that you did not lose any emails that were delivered during this time. And while the web interface (what you probably understand as gmail) was down, you were still able to access your email with another email client (IMAP).
Of the many email services I've had, including my own setup, gmail is doing a fantastic job.
Y
posted by yoyo_nyc at 3:12 PM on September 1, 2009
Not even Google knows right now.
Update (2:37 pm): We've fixed the issue, and Gmail should be back up and running as usual. We're still investigating the root cause of this outage, and we'll share more information soon. Thanks for bearing with us.
posted by june made him a gemini at 3:17 PM on September 1, 2009 [1 favorite]
To be generic, there are a lot of moving parts between the edge web server at Google and the server where your email data is. If any of those moving parts stops moving, things stop working. It could be a code error introduced by accident, it could be a networking misconfiguration, it could be whatever.
posted by GuyZero at 3:28 PM on September 1, 2009
I discovered it was not working and at first assumed something was wrong on my end, since I have come to assume Google and Gmail are so efficient and on top of things. Nice to know they too are sometimes off their game.
posted by Postroad at 3:46 PM on September 1, 2009
So I'll take a quick shot.
Google has a very robust architecture, so hardware issues (servers down) and even problems with network connectivity to their colos aren't very likely.
The weakest link would be if something happened to their DNS entries. DNS servers are pretty straightforward. When you type in ask.metafilter.com, the first trip is to a DNS server that says the IP of ask.metafilter.com is 174.132.172.58, and then your browser goes to that IP address to get the page. These DNS servers (and there are several layers of them) are all ultimately fed from the top-level root servers; if something strange happens anywhere along that chain, you can get an outage.
Note that IMAP and POP, which use separate domains to fetch email, were working, so this would be consistent.
posted by bitdamaged at 4:35 PM on September 1, 2009
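For anyone who wants to see the lookup step described above, here is a minimal sketch in Python. It only asks the operating system's resolver for the IPv4 addresses behind a name; the function name `resolve_ipv4` and the injectable `lookup` parameter are made up for illustration, and the address a real lookup returns today may not match the one quoted in the comment.

```python
import socket


def resolve_ipv4(hostname, lookup=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses reported for hostname.

    `lookup` defaults to the system resolver; it is a parameter only so the
    function can be exercised without network access (an assumption of this
    sketch, not part of any real tool).
    """
    try:
        results = lookup(hostname, 80, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        # The resolver could not turn the name into an address.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is an (ip, port) pair.
    return sorted({sockaddr[0] for _fam, _type, _proto, _canon, sockaddr in results})


if __name__ == "__main__":
    print(resolve_ipv4("ask.metafilter.com"))
```

An empty list here means the name did not resolve at all, which is the failure mode this comment is guessing at.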
I think IMAP and POP continuing to work (apparently; I didn't test it) would indicate more that the webserver and/or whatever underlying application architecture they use to serve the website died than a DNS issue. The domain was resolving fine, but it was dying catastrophically when it tried to do anything, even while the mail receiving, sorting, and routing bits worked fine.
I only point that out by way of illustration to answer the original question: it could have been any of a zillion things, anything from a bug in a script or behind-the-scenes application to a massive hardware failure to a feature upgrade rollout that went awry to a meteor strike, though one assumes that we would have heard more about that...
posted by socratic at 5:12 PM on September 1, 2009
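The reasoning above (name resolves, web side is dead, mail side still answers) can be sketched as a small probe. This is only an illustration: `check_port` and `interpret` are hypothetical helpers, and a plain TCP connect only shows that a port is answering, so a web application that accepts connections and then returns errors would still show up as "ok" here.

```python
import socket


def check_port(host, port, timeout=3.0):
    """Return 'dns-failure', 'unreachable', or 'ok' for one TCP connection attempt."""
    try:
        # First step: turn the name into an address.
        addr = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)[0][4]
    except socket.gaierror:
        return "dns-failure"
    try:
        # Second step: see whether anything accepts a connection there.
        with socket.create_connection(addr, timeout=timeout):
            return "ok"
    except OSError:
        return "unreachable"


def interpret(web, imap):
    """Turn two check_port results into a rough guess about where the fault is."""
    if web == "ok" and imap == "ok":
        return "both reachable; any fault is above the TCP layer"
    if web == "dns-failure" and imap == "dns-failure":
        return "names not resolving; suspect DNS"
    if web != "ok" and imap == "ok":
        return "mail backend reachable; suspect the web front end"
    return "mixed results; needs a closer look"


if __name__ == "__main__":
    web = check_port("mail.google.com", 443)
    imap = check_port("imap.gmail.com", 993)
    print(web, imap, "->", interpret(web, imap))
```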
Other times GMail has been down, it's been a botched rollout or upgrade or patch that they had to revert due to unexpected bugs.
posted by rokusan at 6:24 PM on September 1, 2009
Google has put a lot of effort into making their system very robust. By robust, I mean that they have built a ridiculous amount of redundancy into their system so that a single failure in a single subsystem doesn't bring the whole thing down. For example, they have multiple copies of all of their data spread out geographically, so the sorts of things that would bring down a typical company aren't going to cause Google to go down.
I think that massive hardware failures, bugs in scripts, and these sorts of "typical" things are unlikely to cause Google to have a catastrophic failure.
posted by kenliu at 6:34 PM on September 1, 2009
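As a toy picture of what "multiple copies of the data" buys you, here is a sketch of reading from several replicas and falling back when one is down. Nothing here reflects how Google actually does it; the replica names and the `read_with_failover` helper are invented for illustration.

```python
def read_with_failover(replicas, key):
    """Try each (name, fetch) replica in order and return the first successful read.

    `fetch` is any callable that returns the value for `key` or raises
    ConnectionError when that replica is unavailable.
    """
    errors = []
    for name, fetch in replicas:
        try:
            return name, fetch(key)
        except ConnectionError as exc:
            # This copy is down; remember why and move on to the next one.
            errors.append(f"{name}: {exc}")
    # Only when every copy fails does the read fail as a whole.
    raise RuntimeError("all replicas failed: " + "; ".join(errors))


if __name__ == "__main__":
    def down(key):
        raise ConnectionError("data center offline")

    def up(key):
        return f"contents of {key}"

    print(read_with_failover([("east", down), ("west", up)], "inbox"))
```

A single dead replica is invisible to the caller, which is the point being made above: it takes something that hits every copy at once, such as a bad software change, to cause a visible outage.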
I've read a bit about Google's architecture (the public parts). It is extremely unlikely that any hardware or network failure would interrupt their service. Gmail data is stored redundantly with no single point of failure. You can't cut a wire somewhere and take Gmail offline.
My guess would be that they tried to push an update to the 100,000 or so Gmail servers, and the new software didn't work as planned. They spotted the problem, but it took some time to roll back the new software on all the servers.
By the way, every place I've worked has had serious e-mail disruptions once or twice a year. Google does an incredible job, especially considering that the service is free.
posted by miyabo at 6:44 PM on September 1, 2009
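The guess above (push an update, notice it is broken, roll it back) can be sketched as a staged rollout. This is a simplified illustration under made-up assumptions, not a description of Google's deployment tooling: `deploy`, `is_healthy`, and `rollback` are hypothetical callbacks supplied by the caller.

```python
def staged_rollout(servers, deploy, is_healthy, rollback, batch_size=2):
    """Deploy to servers in batches; undo everything if any batch looks unhealthy.

    Returns True if every batch passed its health check, False if the
    rollout was aborted and rolled back.
    """
    updated = []
    for start in range(0, len(servers), batch_size):
        batch = servers[start:start + batch_size]
        for server in batch:
            deploy(server)
            updated.append(server)
        if not all(is_healthy(server) for server in batch):
            # Undo in reverse order, touching only servers that were updated.
            for server in reversed(updated):
                rollback(server)
            return False
    return True


if __name__ == "__main__":
    versions = {name: "old" for name in ["s1", "s2", "s3", "s4", "s5"]}
    ok = staged_rollout(
        list(versions),
        deploy=lambda s: versions.__setitem__(s, "new"),
        is_healthy=lambda s: s != "s3",  # pretend the new build breaks on s3
        rollback=lambda s: versions.__setitem__(s, "old"),
    )
    print(ok, versions)
```

The rollback loop is why an outage like this can last a while even after the problem is spotted: every server that already took the update has to be reverted.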
It's a big, complex, heavily redundant architecture. Google also is (rightly) not entirely forthcoming on how their data centers are set up. So the answer is we don't know and even if we knew, we probably wouldn't understand since we aren't Google people.
posted by chairface at 7:10 PM on September 1, 2009
posted by DanW at 3:10 PM on September 1, 2009