Gmail knocked offline due to “overloaded data centres”
Google explains why the world couldn’t access Gmail, while concerns are raised about the cloud-based emailing model.
Gmail was knocked offline for two-and-a-half hours yesterday morning, because of "routine maintenance" in one of Google's data centres.
The outage, which occurred at approximately 9.30am GMT, meant that people around the world couldn't access their Gmail accounts.
According to a blog post by Acacio Cruz, Gmail's Site Reliability Manager, routine maintenance shouldn't cause a problem because accounts are served out of another data centre.
However, in this case, unexpected side effects of new code that tried to "keep data geographically close to its owner" caused another data centre to become overloaded. This, in turn, caused problems to cascade from one data centre to another, which took Google engineers an hour to get under control.
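The kind of cascade described above can be illustrated with a toy model. This is a minimal sketch and not Google's actual architecture: the function name `simulate_cascade`, the even redistribution of load, and the capacity figures are all assumptions made for illustration.

```python
def simulate_cascade(capacity, load, offline):
    """Toy model (hypothetical): when a data centre goes down, its load is
    split evenly among the survivors; any survivor pushed past its capacity
    fails next. Returns the data centres in the order they went down."""
    load = dict(load)          # copy so the caller's data is untouched
    failed = []
    pending = [offline]
    while pending:
        dc = pending.pop(0)
        if dc in failed:
            continue
        failed.append(dc)
        survivors = [d for d in capacity if d not in failed]
        if not survivors:
            break              # nothing left to absorb the load
        share = load.pop(dc) / len(survivors)
        for d in survivors:
            load[d] += share
        for d in survivors:
            if load[d] > capacity[d] and d not in pending:
                pending.append(d)
    return failed


# Assumed figures: three equally sized data centres, unevenly loaded.
capacity = {"A": 100, "B": 100, "C": 100}
busy = {"A": 60, "B": 70, "C": 80}
quiet = {"A": 30, "B": 30, "C": 30}

print(simulate_cascade(capacity, busy, "A"))   # little headroom
print(simulate_cascade(capacity, quiet, "A"))  # plenty of headroom
```

In this sketch, taking one centre down for maintenance is harmless when the others have spare capacity, but brings down every centre in turn when they are already running close to their limits.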
"The bugs have been found and fixed, and we're in the process of pushing out changes. We know how painful an outage like this is we run Google on Gmail, so outages like this affect you," Cruz said.
"We always investigate the root causes of rare outages like this one, so we can prevent similar problems in the future."
The issue with Gmail has caused concern in some quarters about the viability of the cloud-computing model when it comes to email, as an outage for any amount of time could be extremely costly for businesses.
However, SaaS enterprise e-mail company Mimecast defended the model, saying that Gmail wasn't the only option to choose from.
James Blake, Mimecast's chief product strategist, said that if businesses were going to choose cloud-based email, they needed to choose vendors that agreed to service level agreements on uptime.
He said: "There is a big difference between guaranteeing 99.999 per cent and 99.9 per cent. One means downtime is kept to less than a minute while the other could mean close to an hour."
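Blake's figures can be checked with some simple arithmetic. The sketch below assumes the guarantee is measured over a 30-day month, which the quote does not state; the function name `allowed_downtime_minutes` is made up for illustration.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: float = 30.0) -> float:
    """Maximum downtime, in minutes, that still meets an uptime guarantee.

    period_days defaults to 30 (an assumed monthly measurement window).
    """
    period_minutes = period_days * 24 * 60
    return period_minutes * (1 - uptime_percent / 100)


for sla in (99.999, 99.9):
    monthly = allowed_downtime_minutes(sla)
    yearly = allowed_downtime_minutes(sla, period_days=365)
    print(f"{sla} per cent: {monthly:.2f} min per 30 days, {yearly:.1f} min per year")
```

Under that monthly assumption, 99.999 per cent allows roughly 26 seconds of downtime and 99.9 per cent allows about 43 minutes, which matches the "less than a minute" and "close to an hour" in the quote. Measured over a full year, the same guarantees allow about five minutes and nearly nine hours respectively.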