How businesses can ensure cloud uptime over the holidays
Preparation is a critical consideration and could be the difference between success and disaster
During the holiday season, businesses can face an array of operational challenges, ranging from traffic surges and reduced staffing to an escalating threat of cyber attacks.
This has led to a growing reliance on cloud computing because of its ability to rapidly scale resources, handle significant traffic spikes and ensure uninterrupted service. That matters, since research commissioned by Hyve Managed Hosting highlights that a third (33%) of UK consumers will abandon an unresponsive website within 30 seconds.
This dependence has intensified in recent years, says Adam Gogarty, director and cloud strategy lead at Deloitte UK, particularly among video on-demand services and telecoms providers.
"Especially when the nation sits down to watch its favorite family Christmas movies," Gogarty adds.
Key challenges
Maintaining uptime during the holiday season is imperative for businesses, and there are certain challenges they may have to overcome. Reduced staffing due to holiday schedules is one of the most obvious, as are resource mismanagement and inadequate capacity planning.
Other issues can arise with legacy applications that have been converted to run on the cloud without adopting a cloud-native architecture, as they often lack the same resilience. It’s similarly vital to understand the complete supply chain during peak holiday periods.
“If a retailer wants to guarantee delivery before Christmas, systems will need to consider physical, not just digital aspects of business operations such as warehouse capacity and the ability to pack and ship goods,” explains Maynard Williams, managing director of consultancy firm Accenture.
“Businesses need to understand how different solutions interact and how to handle failure, particularly at their boundaries. Failure often isn’t in the digital cloud-native solutions, but the interface between them and, say, a warehousing solution,” Williams adds.
With businesses focused on high demand or downtime, security is often overlooked, which can put an organization at higher risk. It’s commonly understood that threat actors step up their activity over the holidays, as they assume organizations will be slightly off guard.
Be prepared
To ensure uptime during the holidays, best practice should include conducting pre-holiday stress tests to identify system vulnerabilities and configuring autoscaling to handle demand surges. Experts also recommend simulating failures through chaos engineering to expose weaknesses.
Redundancy across regions or availability zones is essential, as is a well-documented incident response plan – with clear escalation paths – “as this allows a team to address problems quickly even with reduced staffing,” says VimalRaj Sampathkumar, technical head – UKI at software company ManageEngine.
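As a rough illustration of what configuring autoscaling for demand surges involves, the sketch below applies a simple proportional rule, similar in spirit to the one documented for the Kubernetes Horizontal Pod Autoscaler. The function name, the load figures and the minimum and maximum limits are assumptions made for this example, not settings recommended by anyone quoted here.

```python
import math


def desired_replicas(current_replicas, current_load, target_load,
                     min_replicas=2, max_replicas=20):
    """Return how many instances to run so each one sits near target_load.

    current_load and target_load are per-instance figures in the same unit
    (for example, average CPU percent or requests per second).
    The min/max limits are hypothetical values chosen for illustration.
    """
    if current_replicas < 1 or target_load <= 0:
        raise ValueError("need at least one replica and a positive target")
    # Scale the fleet in proportion to how far load is from the target.
    raw = math.ceil(current_replicas * current_load / target_load)
    # Clamp so a spike cannot scale without limit and a lull keeps redundancy.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Four instances running at 90% CPU against a 60% target.
    print(desired_replicas(4, 90, 60))
```

The upper limit acts as a cost guard and the lower limit preserves some redundancy when traffic is quiet; real autoscalers usually add cooldown periods as well, which this sketch leaves out.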
It’s all about understanding the business requirements and what your demand is going to look like, says Luan Hughes, chief information officer (CIO) at tech provider Telent, as this will vary from industry to industry.
“When we talk about preparedness, we talk a lot about critical incident management and what happens when big things occur, but I think you need to have an appreciation of what your triggers are,” she says.
“That means that, as a technology function, you’re driving action. Be clear on where your tolerances are and aware of when you need to mobilize people to take action and get ahead of the situation before it becomes a critical incident. This way you’re not creating a headache for yourself when you return from the holidays.”
It’s also important to focus on your people as much as your systems, she adds, noting that it’s imperative to understand your management processes, out-of-hours and on-call rota and how you action support if problems do arise.
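One way to picture the triggers and tolerances Hughes describes is a two-level check: an early threshold at which people are mobilized, and a higher one that marks a critical incident. This is a minimal sketch; the metric name, the threshold values and the status labels are hypothetical and would differ from business to business.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tolerance:
    metric: str
    mobilize_at: float  # early trigger: get people involved
    critical_at: float  # point at which this is a critical incident


def assess(tolerance, value):
    """Classify a reading against the agreed tolerance."""
    if value >= tolerance.critical_at:
        return "critical"
    if value >= tolerance.mobilize_at:
        return "mobilize"
    return "ok"


if __name__ == "__main__":
    # Hypothetical tolerance for checkout error rate, in percent.
    errors = Tolerance("checkout_error_rate", mobilize_at=2.0, critical_at=5.0)
    for reading in (0.5, 3.0, 7.5):
        print(reading, assess(errors, reading))
```

Keeping the early trigger well below the critical level is what buys the on-call team time to act with reduced staffing.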
Take advantage of the available tech
As well as robust strategies, a number of different technical solutions also support cloud reliability during high-stress periods, including multi-cloud and serverless computing.
AI and automation are two key areas that can support uptime management by enabling dynamic resource scaling in response to demand and predicting hardware and software failures.
Self-healing systems, for example, powered by automation, can detect and fix common issues without manual intervention, reducing the impact of staffing shortages and accelerating problem resolution.
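A self-healing loop of this kind can be sketched in a few lines: check health, attempt an automated fix a limited number of times, and hand over to a person if that fails. The `FakeService` class and the restart limit are assumptions invented for this example; a real system would call its own health checks and orchestration tooling.

```python
from typing import Callable


def self_heal(is_healthy: Callable[[], bool],
              restart: Callable[[], None],
              max_restarts: int = 3) -> str:
    """Try to recover a service automatically, escalating if that fails."""
    if is_healthy():
        return "healthy"
    for attempt in range(1, max_restarts + 1):
        restart()
        if is_healthy():
            return f"recovered after {attempt} restart(s)"
    # Automation has run out of options, so a human needs to look.
    return "escalate to on-call"


class FakeService:
    """Hypothetical stand-in that becomes healthy after N restarts."""

    def __init__(self, restarts_needed: int):
        self.restarts_needed = restarts_needed
        self.restarts = 0

    def is_healthy(self) -> bool:
        return self.restarts >= self.restarts_needed

    def restart(self) -> None:
        self.restarts += 1


if __name__ == "__main__":
    service = FakeService(restarts_needed=2)
    print(self_heal(service.is_healthy, service.restart))
```

Capping the number of restarts keeps the automation from looping forever on a fault it cannot fix, and the escalation path is where staff come back into the picture.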
While technology has a multitude of benefits in terms of ensuring cloud uptime, Hughes advises that having such solutions doesn’t mean teams can rest on their laurels.
“You have to make sure you’ve got some manual intervention – it’s all about understanding what the risk is, allowing the automation to do the work, but still making sure your staff are checking in the background.”
Also, in case the worst does happen, Sid Nag, vice president, cloud, edge and AI infrastructure technologies at analyst firm Gartner, recommends implementing fully or near-fully automated disaster recovery (DR). This could be through your own tools or third-party cloud-native DR tools.
“This provides the foundation needed to meet aggressive recovery time objectives (RTOs),” he notes.
Does this approach work for businesses of all sizes?
The core strategy to maintain cloud uptime during the holidays should be the same whatever the organization’s size, but SMEs may benefit from taking a slightly different approach.
Uptime may be easier to manage as platforms are likely to be more contained, but on the flipside there’s a smaller team to manage this work. Therefore, smaller organizations that don’t have the scale to build and maintain all capabilities in-house may benefit from outsourcing critical workloads to managed service providers (MSPs) and/or collaborating with cloud providers for technical support during peak times.
Additionally, choosing a cloud solution that’s customizable based on demand, or offers auto-scaling features, can help smaller organizations efficiently handle traffic fluctuations and ensure consistent performance during peak times, advises Hyve’s operations director, Charlotte Webb.
Ensuring cloud uptime over the holidays requires a continuous cycle of planning, execution and improvement, always with a focus on financial sustainability and operational resilience, notes Gogarty.
However, with a focus on preparedness and by taking advantage of the tools available to support this, businesses of any size should be able to navigate peak periods problem-free.
Keumars Afifi-Sabet is a writer and editor who specialises in public sector, cyber security, and cloud computing. He first joined ITPro as a staff writer in April 2018 and eventually became its Features Editor. Although a regular contributor to other tech sites in the past, these days you will find Keumars on LiveScience, where he runs its Technology section.