How a glitched CrowdStrike update caused the Blue Screen Friday
A faulty update from cybersecurity giant CrowdStrike brought much of the digital world to a halt on July 19, 2024. In a matter of hours, an estimated 8.5 million Windows devices worldwide succumbed to the dreaded “blue screen of death” (BSOD), paralyzing critical industries and sending shockwaves through the global economy.
From government agencies and stock exchanges to hospitals and airlines, the CrowdStrike Falcon platform's widespread adoption turned a routine update into a catastrophic event, leaving IT professionals scrambling to orchestrate what will largely be a hands-on recovery process.
As the dust settles, questions loom about the controls (or lack thereof) at CrowdStrike that led to this slip-up, and what’s next for the company as it emerges from corporate losses that could exceed USD 5 billion. This report aims to answer those questions by delving into what exactly happened, a timeline of key events, CrowdStrike’s response, and potential opportunities for its rivals.
What exactly happened?
At the core of the “largest IT outage in history” was Falcon, CrowdStrike’s cybersecurity platform. A faulty update issued by the vendor in the early hours of Friday triggered a domino effect of crashing PCs and servers, eventually reaching an estimated 8.5 million Microsoft Windows systems worldwide. While this was less than 1% of all Windows machines in operation, the broader economic and societal impacts were significant, disrupting the operations of major organizations, including those in critical industries such as government agencies, stock exchanges, hospitals, and airlines.
The disruption followed a content configuration update for the Falcon sensor, a lightweight agent that protects and monitors devices using the Falcon platform. This sensor works at the kernel level (the core part of an operating system that manages basic computer functions), allowing it to watch over and safeguard important parts of the system and its processes.
CrowdStrike updates its security systems using two methods: built-in Sensor Content and adaptable Rapid Response Content (RRC). The company explained that the crash was caused by a bug in its content validator, which failed to catch an error in one of two deployed Template Instances. These instances guide the Falcon sensor on threat detection and response. The error led to an out-of-bounds memory read, in which the sensor tried to read memory outside the region it was entitled to access. Because the sensor runs at the kernel level, the resulting exception crashed the whole operating system rather than a single application.
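The failure mode described above can be illustrated with a minimal Python sketch. It is not CrowdStrike's code: the names `TemplateInstance`, `validate_instance`, and `evaluate`, and the idea that an instance refers to input fields by index, are assumptions made purely for illustration. The sketch shows how a validator that compares every referenced index against the number of fields actually supplied would reject an instance that would otherwise read past the end of its input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemplateInstance:
    """Hypothetical detection rule: maps an input field index to an expected value."""
    name: str
    criteria: dict[int, str]


def validate_instance(instance: TemplateInstance, supplied_fields: int) -> list[str]:
    """Return a list of problems; an empty list means every index is in range."""
    problems = []
    for index in sorted(instance.criteria):
        if not 0 <= index < supplied_fields:
            problems.append(
                f"{instance.name}: field {index} is out of range "
                f"(only {supplied_fields} fields are supplied)"
            )
    return problems


def evaluate(instance: TemplateInstance, inputs: list[str]) -> bool:
    """Match inputs against the instance without ever reading out of bounds."""
    if validate_instance(instance, len(inputs)):
        return False  # skip the invalid rule instead of crashing
    return all(inputs[i] == expected for i, expected in instance.criteria.items())


if __name__ == "__main__":
    inputs = ["pipe", "a", "named"]
    in_range = TemplateInstance("in_range", {0: "pipe", 2: "named"})
    out_of_range = TemplateInstance("out_of_range", {3: "x"})
    print(validate_instance(in_range, len(inputs)))
    print(validate_instance(out_of_range, len(inputs)))
```

In Python an unchecked `inputs[3]` would raise an `IndexError` that the caller could catch. In kernel-mode native code there is no such safety net, which is why the check has to happen before the content reaches the sensor.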
Businesses estimated to lose billions due to disruption
According to a report from risk management software provider Interos, the outage directly impacted nearly 675,000 enterprise customers and indirectly affected more than 49 million customer relationships. For example, UK broadcaster Sky News went off-air briefly, major retailers had to resort to cash transactions only, and customers at major US banks, like Bank of America and Wells Fargo, reported issues with login and online transfers, with other banking customers reporting declined card transactions.
Interos highlighted that the US was the hardest hit, accounting for 41% of the affected entities, while European countries such as Britain, France, and Spain collectively accounted for nearly a third. It also noted that the Falcon cybersecurity platform is employed by nearly half of the largest US cities and 82% of US state governments, as well as the Department of Defense and intelligence agencies. Analysis by Parametrix found that 25% of Fortune 500 companies experienced disruptions from the CrowdStrike outage, with estimated direct losses of approximately USD 5.4 billion, excluding Microsoft.
Impact on Fortune 500 companies by industry
Estimated financial loss on Fortune 500 companies by industry
The airline industry was hit the hardest, with over 5,100 flight cancellations and 32,000 delays reported on Friday. The repercussions continued over the following days, with further cancellations and delays. Although airlines have restored their systems, the recovery process has been slow. According to FlightAware data, Delta Air Lines was the most severely impacted, canceling more than 6,500 flights between Friday (July 19) and Wednesday (July 24). In addition to the logistical challenges, airlines are facing regulatory scrutiny regarding their compliance with federal refund requirements for significant flight cancellations or delays. Specifically, airlines are required to issue refunds to the original payment method, typically a credit card, rather than as vouchers. Parametrix has estimated that Fortune 500 companies in this industry have incurred direct losses of USD 860 million due to business interruptions.
Worldwide flight cancellations
Insurers are also expected to face negative consequences from this event due to an increase in claims from businesses and individuals. According to Fitch Ratings, business interruption, contingent business interruption, and cyber insurance are expected to see the most significant claims. Smaller lines such as travel insurance, event cancellation, and technology errors and omissions (E&O) will also be affected. Preliminary market estimates by Fitch put global insured losses in the mid-to-high single-digit billion USD range, a level that may not significantly impact (re)insurers, although Fitch warns of ongoing claims and litigation. Meanwhile, Parametrix expects insured losses to cover only 10%–20% of the total financial loss faced by Fortune 500 companies, or between USD 0.5 billion and USD 1.1 billion.
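The insured-loss range follows directly from the Parametrix figures quoted in this report. The short Python check below is only a worked calculation: the constant and function names are illustrative, and the USD 5.4 billion total is the direct-loss estimate for Fortune 500 companies cited earlier.

```python
TOTAL_DIRECT_LOSS_BN = 5.4          # Parametrix estimate, USD billion, excluding Microsoft
INSURED_SHARE_RANGE = (0.10, 0.20)  # share of losses expected to be insured


def insured_loss_range(total_bn: float, share_range: tuple[float, float]) -> tuple[float, float]:
    """Return the (low, high) insured loss in USD billion."""
    low_share, high_share = share_range
    return round(total_bn * low_share, 2), round(total_bn * high_share, 2)


low_bn, high_bn = insured_loss_range(TOTAL_DIRECT_LOSS_BN, INSURED_SHARE_RANGE)
print(f"Insured losses: USD {low_bn} billion to USD {high_bn} billion")
```

The result, USD 0.54 billion to USD 1.08 billion, rounds to the USD 0.5 billion and USD 1.1 billion bounds stated above.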
Recovery involves more than just an over-the-air update
CrowdStrike quickly identified the issue and released a corrected version of the update within approximately 78 minutes. However, applying the fix to already affected devices was not straightforward, as the affected Windows machines were stuck in a BSOD loop and could not complete a normal boot. This required IT administrators and users to manually boot their devices into Windows Safe Mode or the Windows Recovery Environment, navigate to the system directory, and delete the problematic file, known as Channel File 291.
Given the large number of affected devices, this process is labor-intensive and time-consuming, with total recovery time estimated to span months for certain organizations. Additionally, requiring physical access to the devices, combined with any drive encryption technologies such as BitLocker, further complicates and prolongs recovery efforts.
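The manual step described above amounts to finding and deleting one family of files. The Python sketch below illustrates that logic only. The directory and the `C-00000291*.sys` filename pattern are taken from CrowdStrike's public remediation guidance rather than from this report, the function name is hypothetical, and in practice administrators performed the deletion by hand or from a recovery command prompt, where Python is typically not available.

```python
import tempfile
from pathlib import Path

# Assumption: location and pattern follow CrowdStrike's public guidance, not this report.
DEFAULT_DRIVER_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
CHANNEL_FILE_PATTERN = "C-00000291*.sys"


def remove_channel_file_291(driver_dir: Path = DEFAULT_DRIVER_DIR,
                            dry_run: bool = True) -> list[Path]:
    """Find Channel File 291 variants and delete them unless dry_run is set.

    Returns the list of matching paths either way.
    """
    if not driver_dir.is_dir():
        return []
    matches = sorted(driver_dir.glob(CHANNEL_FILE_PATTERN))
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


if __name__ == "__main__":
    # Demonstration against a throwaway directory, never a real system folder.
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        (folder / "C-00000291-00000000-00000001.sys").write_bytes(b"")
        (folder / "C-00000290-00000000-00000001.sys").write_bytes(b"")
        print(remove_channel_file_291(folder, dry_run=False))
        print(sorted(p.name for p in folder.iterdir()))
```

The `dry_run` default is a deliberate design choice: listing the matches before deleting anything makes it harder to remove the wrong channel file on a machine that is already in a fragile state.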
What are the remedial actions CrowdStrike is taking?
The July 19 update raised a common question: How could a faulty update on critical software go undetected during internal testing? And should updates be deployed instantly to all devices globally?
The issue was not limited to Windows; Linux users had also reported kernel panics and crashes related to the Falcon platform since at least April 2024.
In an incident response posted on July 24, the company blamed a bug in its testing software for failing to detect the problematic update. Measures to prevent a similar occurrence include improving RRC testing with additional testing types, adding more validation checks, implementing a staggered deployment strategy for RRC, enhancing sensor and system performance monitoring, and providing customers with greater control over RRC updates. In addition, CrowdStrike aims to release a full Root Cause Analysis report of the event following an investigation.
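A staggered deployment strategy can be sketched in a few lines of Python. This is a generic illustration of ring-based rollout with a health gate, not CrowdStrike's implementation: the ring names, the `deploy` and `healthy` callbacks, and the 1% failure threshold are all assumptions.

```python
from typing import Callable, Sequence


def staged_rollout(rings: Sequence[tuple[str, Sequence[str]]],
                   deploy: Callable[[str], None],
                   healthy: Callable[[str], bool],
                   max_failure_rate: float = 0.01) -> list[str]:
    """Deploy ring by ring and halt when a ring's failure rate exceeds the threshold.

    Returns the names of the rings that were deployed and passed the health check.
    """
    completed = []
    for name, hosts in rings:
        for host in hosts:
            deploy(host)
        failures = sum(1 for host in hosts if not healthy(host))
        if hosts and failures / len(hosts) > max_failure_rate:
            break  # stop before the update reaches the next, larger ring
        completed.append(name)
    return completed


if __name__ == "__main__":
    rings = [
        ("canary", ["c1", "c2"]),
        ("early", ["e1", "e2", "e3"]),
        ("broad", [f"b{i}" for i in range(10)]),
    ]
    deployed = []
    # Simulate a faulty update: every host that receives it becomes unhealthy.
    result = staged_rollout(rings, deployed.append, lambda host: host not in deployed)
    print(result, deployed)
```

With a faulty update, the rollout stops after the canary ring, so only two hypothetical hosts are affected instead of all fifteen.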
How will this impact the company?
CrowdStrike is likely to see considerable financial and reputational fallout due to the content update error. The company’s stock has declined notably since the incident. On Friday, the stock fell by just over 11% to USD 304.96 per share, and by Wednesday, July 24, it had lost nearly 25% of its value, falling to USD 258.14 per share from its closing price of USD 343.05 on July 18, just before the incident.
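The percentage declines can be checked against the quoted share prices. The snippet below is a simple worked calculation; the function and constant names are illustrative.

```python
def pct_decline(before: float, after: float) -> float:
    """Percentage fall from `before` to `after`."""
    return (before - after) / before * 100


CLOSE_JULY_18 = 343.05  # last close before the incident, USD
CLOSE_JULY_19 = 304.96  # Friday close, USD
CLOSE_JULY_24 = 258.14  # Wednesday close, USD

friday_drop = round(pct_decline(CLOSE_JULY_18, CLOSE_JULY_19), 1)
wednesday_drop = round(pct_decline(CLOSE_JULY_18, CLOSE_JULY_24), 1)
print(f"Friday: {friday_drop}%  Through Wednesday: {wednesday_drop}%")
```

The two results, 11.1% and 24.8%, match the "just over 11%" and "nearly 25%" figures in the text.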
Share price movement of CrowdStrike
Software companies like CrowdStrike usually include protective clauses in their contracts to limit their liability for software issues, often capping potential payouts at the cost of the service (i.e., a simple refund). However, they may still face civil charges for intentional or negligent actions that cause harm. The faulty update situation could be covered by E&O insurance, which protects businesses against claims of negligence or mistakes. As a result, the full impact of potential lawsuits on CrowdStrike is uncertain at this time.
The company is also facing congressional scrutiny, with the House Committee on Homeland Security's Subcommittee on Cybersecurity and Infrastructure Protection requesting CEO George Kurtz to testify on the global IT outage incident.
Could opportunities lie for CrowdStrike’s rivals?
CrowdStrike's rivals may find both opportunities and challenges after the recent event. While it’s too early to identify “winners” in the ongoing CrowdStrike situation, its competitors are fortunate to have avoided a similar fate, giving them time to evaluate their own systems, including assessing the depth of their integration with operating systems, improving methods of air-gapping their updates, and refining deployment processes.
As of 2023, CrowdStrike was the second-largest security software company by global revenue, with a 14.7% market share, just behind Microsoft, according to Gartner. Other major competitors include Trellix, SentinelOne, and Palo Alto Networks. The recent IT outage has damaged CrowdStrike's reputation, which may lead organizations to rethink their IT and cybersecurity needs. This is especially true for companies with zero-tolerance policies, which might now look for other solutions to reduce their reliance on a single provider and prevent future business disruptions.
However, changing security solutions isn't easy or cheap. New systems often face challenges integrating with existing infrastructure and migrating data, and they can take a long time to set up. As of April 2024, 65% of CrowdStrike's customers were using more than five of its modules. This deep integration with CrowdStrike's ecosystem can make it difficult for customers to switch to another provider.
While comparisons are being drawn to Okta's 2023 data breach, in which hackers accessed data on all customers, CrowdStrike's situation may prove more challenging. Although Okta experienced longer sales cycles due to increased scrutiny of its security protocols, it largely retained its customer base. CrowdStrike's outage, by contrast, was not a breach, but it caused significant operational disruption and required substantial recovery efforts from its clients. This difference in impact suggests that CrowdStrike might face more severe consequences than Okta did in its relatively stable aftermath, though the full extent remains uncertain.