Cloud failures: Horror stories and how to avoid becoming one

Cloud Pro often talks about the many benefits of cloud computing, but let's be honest, some problems do crop up from time to time.

Outages send infrastructure offline, and the wrong cloud applications can all but sink the companies that have invested time and money in them.

Here are a few cloud horror stories to scare you but, as in every good horror movie, there is always a survivor who lives to tell the tale...

The recalcitrant cloud provider

One of the big fears around the cloud is that you run the risk of losing your data if the relationship with your cloud services provider turns sour.

Ian Masters, regional director at Vision Solutions, recalls an earlier case in which a service provider that his firm worked with was asked for help by a prospective customer.

“The customer's approach was due to the fact that they wanted to stop using their current cloud provider and move over to the new one,” Masters said.

“However, their current supplier did not want to provide any assistance in this. This recalcitrant approach did not endear them to the customer, so they had to think more carefully about the migration.”

The incumbent service provider refused to assist with the migration or to solve any problems. It would not allow access to the datacentre where the customer’s machine images were based, nor would it offer any other support.

The new service provider could not simply rebuild things from scratch, as that would have meant a long period of downtime and higher costs.

This locked-room mystery meant the customer had to replicate its systems to the new cloud platform without being able to physically touch the servers or storage.

With time running out, the service provider came up with the idea of using a software tool called Double-Take Move.

“This was installed in the customer’s virtual machine images to copy the VMs over to the DataStore 365 cloud platform in the background, but without having to get physical access,” Masters added.

The images were then switched over without any downtime or lost data. DataStore 365 was also able to prove it could help in adverse circumstances. This helped rebuild some trust in the cloud in general, turning a potential horror story into a happy ending.

“What could have been a black mark against cloud turned into a validation of this way of delivering services,” Masters said.

When the cloud provider thinks you haven’t paid

Roger Goodwin, managing director of Digital Trading, an Oxfordshire-based web development house, warns that if you have an account with a cloud provider, you should manage it carefully.

“We recently had an issue with Microsoft Azure, where our account got cancelled due to failed payments,” he said.

“The payments failed due to security checks by the bank. No notification was received from either the bank or Microsoft of the issue. After 28 days of the failed payment the account was suspended.”

When this happened, it took all the servers and sites offline. Microsoft responded quickly and the issue was soon dealt with. However, once the account had been suspended, all the IP addresses and DNS records assigned to these servers were removed.

“Once the servers and sites came back online we then had to reset all the DNS records to point to the now new servers. In total this took about 24 hours during which time all sites were offline,” Goodwin added.

Disaster recovery? We’ve heard of it!

Never let it be said that a disaster recovery plan is a waste of time and money. That goes for both the customer and the provider. Earlier this year, source code hosting provider Code Spaces experienced the ultimate horror when hackers gained access to its Amazon EC2 control panel.

The firm had already received an extortion threat from criminals demanding a “large fee” to stop a DDoS attack from sending the service offline.

That attack was just the beginning. In addition to amassing a considerable botnet to take the website offline, the hackers also managed to gain control of Code Spaces’ EC2 control panel.

Code Spaces finally managed to regain control of the access panel, but to no avail. The attacker had removed all EBS snapshots, S3 buckets, all AMIs, some EBS instances and several machine instances. This meant most of its data, machine configurations and offsite backups had been partially or completely deleted. There was no longer a business for it to run or a service to provide to customers.

If there is a lesson to be learned, it is to make sure that access to administration dashboards is properly secured and that backups are held at a different cloud provider, with different authentication credentials.
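As a rough illustration of that lesson, the short Python sketch below flags any backup that sits with the same provider, or behind the same credentials, as the production system. Everything in it is hypothetical: the `StorageTarget` type, the provider names and the credential identifiers are made up for the example and do not describe how Code Spaces or any real provider is set up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageTarget:
    """Hypothetical description of where one copy of the data lives."""
    name: str
    provider: str       # illustrative label, e.g. "provider-a"
    credential_id: str  # the login or key that is able to delete this copy


def shared_fate_backups(primary, backups):
    """Return the backups that share a provider or a credential with primary.

    An attacker who takes over the primary account could plausibly wipe
    these copies too, so they should not count as independent backups.
    """
    return [
        b for b in backups
        if b.provider == primary.provider
        or b.credential_id == primary.credential_id
    ]


# Made-up inventory: one production system and two backup locations.
primary = StorageTarget("production", "provider-a", "admin-key-1")
backups = [
    StorageTarget("snapshots", "provider-a", "admin-key-1"),
    StorageTarget("offsite-copy", "provider-b", "backup-key-9"),
]

at_risk = shared_fate_backups(primary, backups)
for b in at_risk:
    print(f"{b.name}: could be deleted along with {primary.name}")
```

With this made-up inventory, only the "snapshots" copy is flagged, because it shares both the provider and the key with production.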

"Get your data out of our cloud quick, we’re going out of business"

Probably the biggest nightmare of them all is what happened to customers of cloud storage provider Nirvanix.

Last year, the firm shut down and gave customers just weeks to transfer data to another provider. For firms that had terabytes, if not petabytes, of data stored in the cloud, getting that data out of its infrastructure and somewhere - anywhere - else was going to prove a monumental task given the short window for action.

Add to this the fact that very few providers have pipes big enough to transport so much data, and you are looking at a near impossible situation for firms.

In the end, help came largely from Aorta Cloud, which worked with what was left of the engineering staff at Nirvanix to pull out data for every customer that contacted it. It did so using high-speed links in datacentres shared with Nirvanix, providing a lifeline to customers whose large volumes of data would otherwise have been lost.

The lesson here is to make sure to keep an emergency data backup that is fully under your control.
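An emergency copy only helps if it is complete and intact. One way to check that, sketched below in Python using only the standard library, is to record a SHA-256 checksum for every file in the original data and then compare the backup against that record. The function names and directory layout are assumptions made for this illustration, not part of any product mentioned above.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_backup(manifest: dict[str, str], backup_root: Path) -> list[str]:
    """List files that are missing from the backup or differ from the manifest."""
    problems = []
    for rel, expected in manifest.items():
        candidate = backup_root / rel
        if not candidate.is_file():
            problems.append(f"missing: {rel}")
        elif sha256_of(candidate) != expected:
            problems.append(f"changed: {rel}")
    return problems
```

In use, `build_manifest` would be run against the live data and `verify_backup` against the copy you control; an empty list means every file in the manifest was found with a matching checksum.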

So there you have it. Nothing is perfect, including the cloud, but you can certainly enjoy the successes if you bear in mind what can go wrong and do your best to avoid it.

Rene Millman

Rene Millman is a freelance writer and broadcaster who covers cybersecurity, AI, IoT, and the cloud. He also works as a contributing analyst at GigaOm and has previously worked as an analyst for Gartner covering the infrastructure market. He has made numerous television appearances to give his views and expertise on technology trends and companies that affect and shape our lives.
