The art of digitising war

The front entrance of the Imperial War Museum in London with anti-aircraft guns outside
(Image credit: Shutterstock)

"Those who cannot remember the past are condemned to repeat it."

Thus wrote philosopher George Santayana, and while his famous aphorism has earned a place in the general cultural lexicon, it's most commonly applied to humanity's ignominious history of war and conflict. Unfortunately, societies have a tendency to romanticise their past, and that's part of the reason that historical archives like those held by the Imperial War Museums (IWM) are so valuable.

IWM was founded in 1917 amidst the chaos of the First World War, and part of its remit is cataloguing the vast amounts of documentary film footage produced during the conflict. It was one of the first major international wars substantially captured on film, and because much of this footage was shot on highly unstable nitrate-based celluloid film, digitising and preserving it has become one of the museum's major activities.

"Nitrate is a terrible format to use," explains IWM CIO Ian Crawford, "because it's got a nasty habit of unexpectedly bursting into flames now and again, because of its chemical composition. But a lot of our World War One film is the old nitrate; they didn't move over to more stable film stock until the 30s and 40s."

If you've seen Peter Jackson's award-winning 2018 documentary They Shall Not Grow Old, you've seen some of this footage in action; Jackson used previously unseen material from IWM's archives to create the film, colourising it, upscaling the resolution and interpolating the frames to create a smoother and more lifelike end result. Without the museum's digital archiving efforts, projects like this simply would not be possible.

The medium it uses to store this footage, however, may be surprising to some. Its primary digital archive uses tape drives – a technology that, many would argue, belongs in a museum itself. But tape-based storage still exists for a reason: there are some things that it just does better than other formats, and one of those is mass-scale cold storage. Tape is extremely stable, offers high capacity and is highly economical for long-term storage compared to spinning-disk or solid-state drives.

This makes it ideally suited to the museum's specific needs in ways that disk-based or cloud storage can't match, Crawford explains. Unlike other organisations, which can streamline their data intakes and offload older data once it stops being relevant, the vast bulk of IWM's digital assets have to be retained in perpetuity.

IWM's digital archive is housed on two tape libraries – one using IBM's TS1150 tape drives, the other based on Spectra Logic's Spectra T950 tape appliances. These are supported by Spectra BlackPearl converged storage, object storage and NAS appliances, and controlled via a digital asset management system (DAMS). The organisation has been working with Spectra Logic as its tape storage provider for more than a decade, and Crawford says it's proven to be excellent value as a long-term strategy.

"That initial investment we made with Spectra Logic has really done us well, because we're using the tape libraries we bought 10 years ago; they're still very, very viable for us," he says. "We've just upgraded units, basically."

Crawford has also recently deployed Spectra's StorCycle storage lifecycle management software to move older operational data from the SAN and NAS infrastructure that IWM uses to run its day-to-day business into its tape archive for cold storage. This has allowed him to replace his existing SAN with a smaller, all-flash model.

"We think we only need to move over about 30 or 40% of what's stored on there onto the new SAN. That's a huge cost saving for us. We've also got quite a few Synology NAS boxes we're using – we're hoping to consolidate a lot of the data stored on there and put that onto StorCycle as well."

Consolidation and cost savings are both important concerns for the museum, given the scale of its storage needs. The organisation is currently expanding its film scanning capacity and, once this is complete, Crawford expects to be generating between 30 and 40 terabytes of data every month. Between archival content and digital-native material produced to support the museum's activity, IWM's storage needs are now well into the multi-petabyte range.
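To give a sense of what that monthly figure adds up to, here is a minimal sketch in Python. The 30 to 40 terabyte range is Crawford's; the function names and the one-petabyte target are hypothetical, chosen purely for illustration, and the sums assume a steady monthly rate and decimal units (1PB = 1,000TB).

```python
def annual_growth_tb(monthly_tb: float) -> float:
    """Data generated in a year at a steady monthly rate (terabytes)."""
    return monthly_tb * 12


def months_to_add(target_tb: float, monthly_tb: float) -> float:
    """Months needed to generate target_tb of new data at monthly_tb per month."""
    return target_tb / monthly_tb


# Range quoted in the article: 30 to 40 TB per month.
LOW_TB, HIGH_TB = 30, 40

print(annual_growth_tb(LOW_TB), annual_growth_tb(HIGH_TB))  # 360 480

# Illustrative target (an assumption): one extra petabyte, taken as 1,000 TB.
print(months_to_add(1000, HIGH_TB))            # 25.0
print(round(months_to_add(1000, LOW_TB), 1))   # 33.3
```

On those assumptions, scanning alone would add somewhere between 360TB and 480TB a year, or roughly another petabyte every two to three years.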

"We've got 23,000 hours of film and video in the collection; it's one of the oldest film archives in the UK, because it was established during the First World War," Crawford explains. "We've got 11 million photographs, we've got thousands of hours of audio recordings. You name it, we've got it. So it's quite a large collection. There's about 33 million objects in the collection altogether."

However, IWM's digital estate represents only a fraction of this total. To put it into perspective, Crawford estimates that less than 10% of the museum's collection has been digitised so far, which means that more than 90% is still to come. That is at least nine times as much material as has been digitised to date, and the museum's storage capacity will need to grow accordingly.
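The proportions here are easy to misread, so a short worked example may help. The sketch below is only illustrative: the function name is hypothetical, and it assumes the undigitised items would need about as much storage each as those already digitised, which is unlikely to hold exactly for a collection that mixes film, photographs and audio.

```python
def remaining_multiple(digitised_percent: int) -> float:
    """How many times larger the undigitised share is than the digitised share."""
    remaining_percent = 100 - digitised_percent
    return remaining_percent / digitised_percent


# Crawford's estimate is "less than 10%", so 10 gives the most favourable case.
print(remaining_multiple(10))  # 9.0

# If the true figure were lower, say 5% (a purely illustrative value),
# the outstanding work would be larger still.
print(remaining_multiple(5))   # 19.0
```

In other words, going from 10% digitised to 100% means ending up with ten times the current digital archive, rather than adding 90% to it.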

This is one of the main reasons that tape is still Crawford's medium of choice: it's cheap and easy to scale. LTO tape stores much less data per square inch of recording surface than other storage media such as hard disks – around 90% less, according to industry estimates. While that may sound like a drawback, it means that manufacturers have plenty of room to keep increasing the density of each successive generation of cartridge without having to change its physical dimensions.

In practice, the result is that when an organisation like IWM wants to add capacity, it can simply swap its tape cartridges or drives for the next generation, without having to bolt on additional appliances.

With this in mind, Crawford is confident that the museum's tape infrastructure will be able to cope with growing storage and archival needs. He reports that the museum has just completed a migration from LTO-5 tape cartridges to LTO-7 cartridges and that, despite involving 1.5PB of data, the process was "fairly painless". The resulting capacity increase has freed up a significant number of slots in the museum's tape library, and if slot space starts to become an issue again, there's still more headroom to upgrade.
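A rough calculation shows why a generation jump frees up so many slots. The sketch below is an estimate under stated assumptions, not IWM's actual figures: it uses the published native (uncompressed) capacities for each LTO generation, treats 1.5PB as 1,500TB, assumes every cartridge is filled completely, and ignores any duplicate copies the museum may keep. The function and variable names are hypothetical.

```python
import math

# Published native (uncompressed) capacity per cartridge, in terabytes.
NATIVE_TB = {"LTO-5": 1.5, "LTO-7": 6.0, "LTO-8": 12.0, "LTO-9": 18.0}


def cartridges_needed(data_tb: float, generation: str) -> int:
    """Minimum number of full cartridges needed to hold data_tb of data."""
    return math.ceil(data_tb / NATIVE_TB[generation])


ARCHIVE_TB = 1500  # the 1.5PB migration mentioned in the article

counts = {gen: cartridges_needed(ARCHIVE_TB, gen) for gen in NATIVE_TB}
print(counts)
# {'LTO-5': 1000, 'LTO-7': 250, 'LTO-8': 125, 'LTO-9': 84}

# Slots freed by moving the same data from LTO-5 to LTO-7.
print(counts["LTO-5"] - counts["LTO-7"])  # 750
```

On these assumptions, the same data occupies a quarter of the slots after the move to LTO-7, and later generations would shrink the footprint further.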

"The LTO-7 [tape cartridges] we're using, we know the roadmap goes up to LTO-8, LTO-9," Crawford explains. "The tape library we've got is an LTO-8 drive. So we know there's a good, decent roadmap, and it is keeping pace in terms of the size of data you can store on a single cartridge."

"The investment we made a few years ago, just by replacing the tape drives and upgrading the cartridges, it's given us a lot more storage density in the same chassis. We're not having to keep on adding loads of different bits of hardware onto it. It suits our purposes, really, and I can control the costs as well. I know what the costs are going to be, because we know exactly what the cost of the tape is."

The predictable nature of tape costs is of particular benefit, given the museum's funding model. Aside from visitor revenue from its five sites, most of the museum's funding comes from grants. If a big influx of cash comes in and the organisation decides to invest in more film scanning equipment, the storage infrastructure needs to be able to accommodate that unexpected surge without also involving a long and costly data centre upgrade.

"Knowing that we've got a platform that can scale and that we know the roadmap for has been a real benefit," Crawford says. "If we did get awarded a grant in the future that allowed us to digitise more of the collection, I know we can handle it."

Even more than other organisations, IWM can't afford to spend years upgrading its storage capacity. The material in its collection is irreplaceable, and the museum's staff are in a race against time to digitise it while they still can.

"Our physical film stocks are kept in pretty good conditions. I'm sure there has, over time, been the odd film that we've taken out, and it's degraded a bit – but that's why we've been trying to be proactive and digitise. Across the country, though, I'm sure there are archives that have suffered and are suffering because of lack of investment. So we're in a pretty good space, really; it's a big job for us to do, but we are trying to tackle it."

Adam Shepherd
