Date: 2024-07-19

The blue screen of death, as it’s known on Windows PCs, is a failure mode. It basically means that something has gone so wrong with your computer that you can’t use it until it’s fixed. It’s not the kind of thing you want to see if you’re running, say, an airline or a bank. It’s definitely not the kind of thing you want to see on every computer at a hospital.

That is, however, what happened early Friday when CrowdStrike, which makes software that protects companies from cyberattacks, rolled out a security update that had a devastating effect on host computers. Effectively, the update made the computers unusable, which — again — is not a good state for computers that make critical services happen.

The whole thing is pretty unfortunate. As of 7 a.m., there were already more than 19,000 flight delays worldwide, and more than 1,000 cancellations reported for flights in the U.S. Delta Air Lines, for example, posted that its entire global flight schedule was paused while it worked to recover from the outage. It later shared an update saying it had resumed some flights, but it warned customers that “Additional delays and cancelations are expected Friday.” Like I said, it’s pretty bad.

There are, I’m sure, a lot of smart people working really hard to bring those computers back online, and this is going to be a very rough day for them. It’s also an especially rough day for CrowdStrike. I mean, that makes sense. If you sell software meant to protect computers from cyberattacks, it’s pretty bad if your customers’ computers go down not because they got attacked by hackers but because they installed a security update you provided.

As you might expect, CrowdStrike’s CEO, George Kurtz, posted a statement on social media:

“CrowdStrike is actively working with customers impacted by a defect found in a single content update for Windows hosts. Mac and Linux hosts are not impacted. This is not a security incident or cyberattack. The issue has been identified, isolated and a fix has been deployed. We refer customers to the support portal for the latest updates and will continue to provide complete and continuous updates on our website. We further recommend organizations ensure they’re communicating with CrowdStrike representatives through official channels. Our team is fully mobilized to ensure the security and stability of CrowdStrike customers.”

That statement, by the way, sounds like a confident CEO who is in charge of making things right. The problem is, it completely ignores that the thing that went wrong is his company’s fault.

To be fair, it wasn’t an intentional mistake. No one had malicious intent. Writing code is hard, and bugs happen. Also, it’s helpful that he clarified that this wasn’t a security incident or cyberattack. But I think there are a few problems with this tweet: First, there’s no apology. I mean, at the most basic level, CrowdStrike messed up and took down the computers that run banks, airlines, hospitals, and television stations. Thankfully, most of the internet runs on Linux-based machines, or you wouldn’t even be able to read this right now.

When a crisis happens, most companies go into “save our reputation” mode, which often means communicating in very scripted ways that are meant to say very little about what they might have done wrong. I get it, but if you’re the CEO of a company that makes a mistake, the first thing you should do is apologize. (Update: In an appearance on the Today show this morning, Kurtz did in fact apologize.)

Not only that, but there’s barely an explanation in the tweet of what happened beyond “a defect found in a single content update for Windows hosts.” That sounds pretty passive, as though it’s a thing that just sort of happened but isn’t anyone’s fault. Here’s the thing — it may not be your fault, but it’s definitely your problem.

I get that this is a bad day to be the CEO of CrowdStrike, but that’s nothing compared to how tough it is being the person trying to figure out how to manage thousands of flights or hundreds of patients or millions of bank accounts. When this kind of crisis happens, the right response is to be open about what went wrong and what you’re doing to fix it. Most importantly, the right response involves having a little humility and transparency instead of worrying about your own reputation. There will be plenty of time for that once everyone stranded in an airport gets back home.

