What is server redundancy?

Businesses are increasingly dependent on the reliability of the servers within their IT infrastructure. Unfortunately, these components fail at random and in unpredictable ways, which in turn drives the need for server redundancy.

Server redundancy means having more than one server with the same structure and data available when needed. The redundant server has the same configurations, storage, applications, and compute capacity as the production server. 

This fail-safe option forms a layer of protection that mitigates disruptions like hardware failure or natural disasters such as fire and flood, as well as associated issues like power outages. Server redundancy can also form part of an organization's cyber resiliency strategy: if the primary server falls victim to a ransomware attack, the business can restore its data from the backup server and continue operating, provided the backup copy has not been affected by the same attack.

Redundant servers also form a key part of backup, maintenance, and load-balancing strategies. These give a business more control when IT infrastructure problems occur.

Designing a system that keeps working when components fail can be an expensive exercise. Running a redundant server alongside a production server adds cost and takes up extra space within a data center. This has to be weighed against the costs of downtime, since a drop in reputation and a breakdown in trust can be catastrophic for an enterprise.

Why is server redundancy important?

If an organisation hopes to survive and thrive, it needs to be prepared for the unexpected. That means having a thorough disaster recovery strategy in place, along with the capability to carry it out. This can be the difference between a company that comes through a disaster intact and one that fails.

In a world where most organisations are built around digital infrastructure, redundant servers must form an essential element of a successful disaster recovery plan. Not only is it critical for your organisation to be able to access data in the aftermath of a disaster, but its loss could mean severe, potentially long-lasting disruption.

Hardware failure, application faults, network problems and other such issues can prevent the proper functioning of primary servers, leaving users unable to access services and vital data. At best, this will pose a problem for productivity.

Your business can avoid these outcomes by employing server redundancy. With critical data duplicated at a second location, it can be retrieved quickly if a live server develops a fault. If data integrity and access are vital for applications and the functioning of your organisation – as they are for many – redundant servers are essential.

What are the business benefits of server redundancy?

Server redundancy gives businesses assurance that critical data remains accessible if disaster strikes and a live server goes offline. If a server goes down, a backup server can take up the slack, keeping services running until the failed server is fixed.

Redundant server setups also typically include real-time system monitoring that scans for signs of failure, so businesses stay informed about the health of their servers.

However, these benefits need to be balanced against the level of risk the business faces and the substantial cost of running duplicate infrastructure.
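To put a rough number on the uptime benefit, the short calculation below compares one server with a redundant pair. It is only a sketch: the 99% availability figure is a made-up input, and the formula assumes the two servers fail independently, which is optimistic if they share power, network, or location.

```python
def combined_availability(single: float, servers: int) -> float:
    """Availability of `servers` independent servers where any one suffices."""
    return 1 - (1 - single) ** servers


HOURS_PER_YEAR = 24 * 365
ASSUMED_AVAILABILITY = 0.99  # hypothetical figure for a single server

for count in (1, 2):
    availability = combined_availability(ASSUMED_AVAILABILITY, count)
    downtime_hours = (1 - availability) * HOURS_PER_YEAR
    print(f"{count} server(s): {availability:.4%} available, "
          f"about {downtime_hours:.1f} hours of downtime per year")
```

Under these assumptions the second server cuts expected downtime by a factor of 100, which is the figure to set against the cost of buying and housing it.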

Types of redundant server

Redundant servers can take many forms.

Redundant domain, front end, and validation servers

These types of server redundancy are used for load-balancing to ensure users can always access a service. For example, a secondary Windows Active Directory server validates user access to the domain if the primary AD server goes down or is busy.

Replicated servers

A replicated backup server can be paired with a production server. Any change to the production server is replicated to the backup server using software-based or hardware-based tools. In the event of a production server failure, the replicated server can be brought into service.
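The idea behind replication can be illustrated with a toy in-memory model. This is a minimal sketch, not how any particular replication product works: the `Server` and `ReplicatedPair` classes are hypothetical, and the write is copied synchronously for simplicity, whereas real tools may replicate asynchronously.

```python
class Server:
    """A toy in-memory server holding key/value data (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.healthy = True


class ReplicatedPair:
    """Applies every write to the production server and to its replica."""

    def __init__(self, production, replica):
        self.production = production
        self.replica = replica

    def write(self, key, value):
        # Synchronous replication: the change reaches both servers
        # before the write is considered complete.
        self.production.data[key] = value
        self.replica.data[key] = value

    def read(self, key):
        # Serve from production while it is healthy, otherwise from the replica.
        active = self.production if self.production.healthy else self.replica
        return active.data[key]


pair = ReplicatedPair(Server("production"), Server("replica"))
pair.write("order-1001", "paid")
pair.production.healthy = False  # simulate a production server failure
print(pair.read("order-1001"))   # still readable, now from the replica
```

Because every change is applied to both copies, the replica can answer the read after the simulated failure without any restore step.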

Disaster recovery servers

These are semi-hot spares: standby servers to which backup files can be quickly restored so that processing can restart, should disaster strike.

How to create server redundancy

To create server redundancy in your infrastructure, you need two servers housing identical data - a primary server and a secondary server.

A failover monitoring server checks the primary server for any problems. Should a problem be detected, it will automatically update DNS records so that network traffic is diverted to the secondary server.

Once the primary server is working properly again, traffic will be rerouted back to the primary server. If the handover and hand back are successful, users should not notice any difference to the service.
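The failover and hand-back logic described above can be sketched as follows. This is a simplified illustration under stated assumptions: the DNS zone is a plain dictionary standing in for a DNS provider's API, the IP addresses and hostname are placeholders, and the health check is a stub rather than a real network probe.

```python
from typing import Callable, Dict

PRIMARY_IP = "192.0.2.10"    # placeholder addresses from the documentation range
SECONDARY_IP = "192.0.2.20"
HOSTNAME = "app.example.com"  # hypothetical service hostname


def run_check(dns: Dict[str, str], hostname: str,
              is_healthy: Callable[[str], bool]) -> str:
    """Point hostname at the primary if it is healthy, else at the secondary."""
    target = PRIMARY_IP if is_healthy(PRIMARY_IP) else SECONDARY_IP
    if dns.get(hostname) != target:
        dns[hostname] = target  # stand-in for a call to a DNS provider's API
    return dns[hostname]


dns_records = {HOSTNAME: PRIMARY_IP}
primary_state = {"up": True}


def stub_health_check(ip: str) -> bool:
    # Stub: only the primary's state changes in this simulation.
    return primary_state["up"] if ip == PRIMARY_IP else True


primary_state["up"] = False  # the primary develops a fault
after_failure = run_check(dns_records, HOSTNAME, stub_health_check)

primary_state["up"] = True   # the primary is repaired
after_recovery = run_check(dns_records, HOSTNAME, stub_health_check)

print(after_failure, after_recovery)
```

In a real deployment the check would run on a schedule, and how quickly users follow the change depends on the TTL of the DNS record, since resolvers cache the old address until it expires.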

What is IP failover?

IP failover is a popular technique for server redundancy. Each server runs a so-called heartbeat process that regularly signals it is alive. If one server stops seeing the heartbeat of the other, it takes over the IP address of the failed server.

IP failover is typically implemented between two servers that are connected to the same switch and running on the same subnet.
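The heartbeat decision can be sketched in a few lines. This is an illustrative model only: the `HeartbeatMonitor` class, the server names, and the three-second timeout are assumptions, timestamps are passed in by hand instead of being read from a clock, and the actual reassignment of the IP address on the network is not shown.

```python
from typing import Optional


class HeartbeatMonitor:
    """Tracks the last heartbeat seen from a peer server."""

    def __init__(self, timeout_seconds: float):
        self.timeout_seconds = timeout_seconds
        self.last_seen: Optional[float] = None

    def record_heartbeat(self, now: float) -> None:
        self.last_seen = now

    def peer_is_alive(self, now: float) -> bool:
        # The peer counts as alive only if a heartbeat arrived recently enough.
        if self.last_seen is None:
            return False
        return (now - self.last_seen) <= self.timeout_seconds


def owner_of_shared_ip(monitor: HeartbeatMonitor, now: float,
                       peer: str = "server-a", me: str = "server-b") -> str:
    """Return which server should hold the shared IP address at time `now`."""
    return peer if monitor.peer_is_alive(now) else me


monitor = HeartbeatMonitor(timeout_seconds=3.0)
monitor.record_heartbeat(now=10.0)

print(owner_of_shared_ip(monitor, now=12.0))  # heartbeat is 2s old: peer keeps the IP
print(owner_of_shared_ip(monitor, now=14.5))  # heartbeat is 4.5s old: take over the IP
```

The timeout is the main design choice: too short and a brief network hiccup triggers an unnecessary takeover, too long and users wait before the standby steps in.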

What else should be redundant?

In addition to a redundant server, other parts of your infrastructure should be duplicated to guard against emergencies and ensure maximum uptime.

Backups

Backups can be deployed to ensure data held locally is also stored elsewhere (in the cloud, or in another data centre in a distant location). This allows you to quickly restore data in the event of a disaster.

Disk drives

Hot spares should be available so that if a disk drive in a primary server fails, another drive can immediately replace it. Using a RAID array with a redundant level (such as RAID 1, 5 or 6) should ensure that a server can keep running when there is a single disk failure. RAID 0, by contrast, offers no redundancy.
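One way a RAID array survives a single disk failure is parity, the approach used by levels such as RAID 5. The sketch below shows the underlying XOR arithmetic only. It is not a RAID implementation, and the block contents and three-data-disk layout are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data_disks = [b"ACCT", b"DATA", b"2024"]  # three hypothetical data blocks
parity = xor_blocks(data_disks)           # would be stored on a fourth disk

# Simulate losing the second disk, then rebuild it from the survivors plus parity.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data_disks[1])
```

XOR-ing the surviving blocks with the parity block cancels out everything except the missing block, which is why one failed drive can be rebuilt onto a hot spare while the server keeps running.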

Power supplies

Redundant power supplies should be deployed on critical servers so that if the main power supply fails, the server can continue to run.

Internet connectivity

If your server needs to have a connection to the internet at all times, having a second line from a different telecoms company is important. If one line fails (e.g. if a workman severs a cable), traffic can shift onto the undamaged line.

Rene Millman

Rene Millman is a freelance writer and broadcaster who covers cybersecurity, AI, IoT, and the cloud. He also works as a contributing analyst at GigaOm and has previously worked as an analyst for Gartner covering the infrastructure market. He has made numerous television appearances to give his views and expertise on technology trends and companies that affect and shape our lives. You can follow Rene Millman on Twitter.