Why the CrowdStrike outage was a wakeup call for developer teams
The CrowdStrike outage caused widespread chaos last year, but have developer teams learned from the crisis?
Software development leaders say their organization did not have a robust incident response plan in place to contend with disruption before the 2024 CrowdStrike outage.
A report from IT services provider Adaptavist surveyed development leaders on how their organization reacted to the CrowdStrike outage in July 2024, which affected an estimated 8.5 million devices.
The respondents came from organizations with over $10 million in revenue based in the UK, US, and Germany.
The incident is reported to have cost US Fortune 500 companies $5.4 billion and the UK economy between £1.7 billion and £2.3 billion, and it affected 98% of companies, according to Adaptavist’s research.
Looking back on the event, 84% of respondents admitted that their organization did not have an adequate incident response (IR) plan in place before the outage.
Of those who said they did have a plan in place, only 16% reported that it was actually effective in helping them recover from the fallout caused by the event.
Adaptavist noted that most organizations had never experienced an outage on that scale, or the consequences such an outage can have on IT operations, and so struggled to understand what would be required to minimize its impact.
But the report also found that 80% of organizations claim to have drawn positive outcomes from the crisis, which brought about what it described as a “remarkable change among businesses”.
How the CrowdStrike outage forced changes in IT resilience
Adaptavist's report claimed the CrowdStrike outage “opened IT leaders’ eyes to the wider robustness of their entire software development practices and processes”.
For example, 81% of respondents said they adopted more robust software development methodologies as a result, with one-third stating they totally overhauled their engineering processes to prevent similar disruptions moving forward.
Among the changes leaders made to their development practices, 54% committed to IR planning, while just under half identified monitoring and observability, unit testing, integration testing, CI/CD, and agile methodologies as other key areas of change.
Other frequent answers included manual code reviews (47%), grey box testing (47%), white box testing (46%), and system testing (45%).
In addition to operational changes, Adaptavist found the CrowdStrike outage also prompted changes in how organizations invest in software development. It reported 86% of businesses had boosted investment in software development practices and training as a direct response to the incident.
The key investment areas listed by the respondents were agile and DevOps practices and training (89%), software testing training (89%), cybersecurity training (88%), and IR training (86%).
The incident also sparked a hiring spree, Adaptavist noted: 99.5% of organizations said they plan to expand their technical teams, with quality assurance roles (36%), IT operations (34%), software developers (32%), and DevOps engineers (31%) topping the list for recruitment.
Businesses still plagued by outdated development culture
Yet cultural challenges remain, the report cautioned, with 44% of IT leaders stating that their organization still prioritizes speed over quality in software development.
Furthermore, just under two-fifths of leaders said they worry that their team’s excessive workloads will lead to another major incident.
A quarter of respondents warned that their organization still favors a culture of fear over a culture of learning as a way to reduce risk, with 40% stating they still fear acknowledging their mistakes.
A lack of psychological safety is also hindering innovation at their organization, according to 44% of IT leaders, with 42% claiming their firm’s security is compromised by a fear of admitting one’s mistakes.
Adaptavist suggested that skills shortages are only making this problem worse, advising businesses to invest in building a culture that addresses burnout and stress among engineers in order to attract talent.
“The ongoing war for IT talent is likely exacerbating these issues, but building a culture which invites open and honest collaboration, and addresses the issue of over-worked and over-blamed teams is the key to creating a welcoming environment for more IT professionals and delivering a far more resilient, efficient and secure IT environment,” the report advised.
Solomon Klappholz is a Staff Writer at ITPro. He has experience writing about the technologies that facilitate industrial manufacturing, which led him to develop a particular interest in IT regulation, industrial infrastructure applications, and machine learning.