The Day After: Crowdstrike
The world’s IT admins woke up Friday morning expecting a light day of work before heading off for the weekend. Little did they know that, in the middle of their morning routine, they would find multitudes of servers offline. The first thing many of them probably wondered was whether they were under attack. Then came the official announcement: lo and behold, they had been taken down by none other than their trusted EDR vendor, Crowdstrike.
Many IT admins woke up to this scenario, including us here at Smart Tech Networx. Luckily, our experience was pretty simple, as we only had Crowdstrike securing a single test server. Nonetheless, receiving that morning alert that a server had gone down was quite alarming and prompted an immediate investigation.
What Happened?
The endpoint detection and response (EDR) company Crowdstrike released an update for its agent that triggered a logic error on Windows systems around the globe, causing many of them to crash with a blue screen error. The blue screen specifically named csagent.sys, the Crowdstrike agent's driver, as the cause. The error was caused by “a defect in a content update to its ‘Falcon’ cybersecurity defense software for Windows hosts”. Crowdstrike has released more technical details on their blog describing the outage.
This caused widespread outages, reaching as far as the Starbucks app, in case you were wondering why you could not pre-order your coffee yesterday morning. As this outage shows, a very large number of organizations depend on Crowdstrike’s cybersecurity platform, ranging from airports, coffee shops, Internet service providers, and banks to hospitals and the public transit sector.
Should there be government intervention?
This outage has prompted discussion among lawmakers about whether Congress and the Biden administration should step in with tighter restrictions and more regulatory oversight to make sure an outage of this magnitude doesn’t happen again.
Requiring companies to maintain completely redundant systems as a backup for a situation like this would be prohibitively expensive for most businesses. Not to mention, building completely redundant systems would also be very time-consuming. That being said, an outage like yesterday's could well happen again, and the next one could be even more devastating.
Ultimately, this situation boiled down to quality control, plain and simple. The Crowdstrike update has more than likely shaken the trust of many of their customers, and it will probably drive some of them to seek out other vendors if Crowdstrike cannot win that trust back. At the very least, the outage was not caused by a malicious act. It does, however, bring up another important question.
Should you rely on a single vendor for cybersecurity?
This is an important question, because relying on a sole vendor for cybersecurity, regardless of how good their reputation may be, creates a serious single point of failure. This type of incident can happen to any company at any time. It is always good to add multiple layers of protection and to have other vendors in place for business continuity. This redundancy usually comes with an upfront cost that has to be justified by “what if” scenarios. However, as we have seen, these “what if” scenarios sometimes do happen, and the repercussions of not having redundancy in place can be costly.
Remediation
Hopefully, most companies have recovered from the outage at this point. If yours has not, Crowdstrike has posted remediation steps on their blog that can be used to recover systems: