
In the wake of the CrowdStrike outage, Microsoft has adopted the industry partnership stance I suggested last weekend: It’s calling on the security industry to work with it to make the ecosystem safer. There aren’t many details yet, but given the software giant’s security push this year—generally and with Windows 11 specifically—and the ongoing fallout from the CrowdStrike outage, the timing couldn’t be better.
“The recent CrowdStrike incident underscores the need for mission-critical resiliency within every organization, and our unique ability to support the change required,” Microsoft’s John “the fixer” Cable writes in a new post to the Windows IT Blog. “This incident shows clearly that Windows must prioritize change and innovation in the area of end-to-end resilience. These improvements must go hand in hand with ongoing improvements in security and be in close cooperation with our many partners, who also care deeply about the security of the Windows ecosystem.”
Reading that, I thought to myself, that sounds familiar.
And that’s because it’s almost exactly what I wrote this past Sunday in CrowdStrike Outage Has Roots in Microsoft’s Antitrust Problems. There’s been a lot of finger pointing in the wake of the CrowdStrike outage, which is understandable. And Microsoft is likewise understandably sensitive to the assumptions that it was the cause of this outage. Maybe overly sensitive: In blaming EU antitrust regulators, though, Microsoft was out of line and, worse, it was factually incorrect. But that was the point of Sunday’s missive: Let’s put aside regulators and antitrust concerns in this case because the industry should just agree, collectively, that fixing this type of problem and securing our infrastructure is a bigger and more important concern.
“[Solving this problem] is something Microsoft can’t do by itself,” I wrote. “Perhaps this is a good opportunity: In a year in which Microsoft has pledged to take security seriously again by making it a top priority, and has more specifically announced a major new Windows security push, it’s time for the industry, with or without regulators, to agree to new levels of security in Windows that will benefit everyone. This platform is too widespread and too obvious a target to allow this to ever happen again.”
So, Microsoft agrees with this. Great. But what’s next?
Cable points to two recent security innovations that speak to the resiliency Microsoft wants for Windows: VBS enclaves and Microsoft Azure attestation. But he offers little in the way of detail.
Here’s my overly simplistic take on this. I’m not a security expert.
If you’re familiar with the security work that Microsoft did for the Copilot+ PC platform, then that first one might feel familiar: VBS, or virtualization-based security, is the core technology behind Windows Hello Enhanced Sign-in Security (ESS), and the reason for its stringent hardware requirements.
VBS, in Microsoft’s words, uses the security hardware in the PC (Pluton/TPM and other related components) and the Windows Hypervisor to create an isolated virtual environment that’s used as the root of trust for the operating system. This environment is “Zero Trust,” so it assumes that the Windows kernel can be compromised. Contrast that with the situation today, in which Windows crashes when the kernel is compromised and reboots into a recovery environment, the effect we saw with CrowdStrike’s botched update. And the VBS environment hosts the security tools needed to fix it, separate and isolated from the OS.
Cable references VBS enclaves specifically: This VBS-based solution was originally created for apps, and it’s implemented as a software-based trusted execution environment in Windows, a specially made DLL file. He doesn’t explain how Microsoft might implement a VBS enclave for the operating system (or for the kernel specifically, I guess), but I can imagine a potential architecture in which this system adapts the Windows Hypervisor for security more broadly.
That is, instead of a parent partition (Hyper-V) and a child partition (Windows) that are both operating systems, you might have two child partitions, one that is Windows and one that is the secure environment with higher privileges than the Windows kernel. And instead of crashing and rebooting Windows into recovery mode when its kernel is compromised, this system could heal the kernel at runtime without failing.
Cable doesn’t describe it that way, of course. He doesn’t describe it at all. But he does note that the isolated compute environment created by the VBS enclave, however it’s implemented, is tamper resistant and doesn’t require kernel mode drivers, the capability that Microsoft added because of antitrust concerns about a security feature in the x64 versions of Windows Vista in 2006. It’s time to rid our world of kernel mode drivers, at least as used for security patches.
Microsoft Azure attestation also uses the TPM and whatever other security components, this time in cloud-hosted servers, to remotely attest to (“verify the truth of”) the trustworthiness of a platform (which I take to mean “an Azure server or VM”) and the integrity of the binaries (executables, meaning software applications and services) running inside it. This is a common practice with PC security, where the PC and its firmware are locked down while offline (Secure Boot, for example) and not just when you’re using Windows. If the security subsystem discovers that anything has been compromised, it can refuse to boot, or more commonly, go back to the previous known-safe configuration.
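To make that idea a bit more concrete, here is a toy sketch of the general "measure, compare, then boot or roll back" pattern described above. To be clear, this is my own illustration and not how Azure attestation or Secure Boot is actually implemented: the function names (`measure`, `attest`, `choose_config`), the component names, and the use of plain SHA-256 hashes in place of TPM-backed measurements are all assumptions made for the example.

```python
import hashlib


def measure(blob: bytes) -> str:
    """Hypothetical 'measurement': a SHA-256 digest of a component's bytes."""
    return hashlib.sha256(blob).hexdigest()


def attest(components: dict[str, bytes], trusted: dict[str, set[str]]) -> list[str]:
    """Return the names of components whose measurement is not trusted."""
    failures = []
    for name, blob in components.items():
        # An unknown component has no trusted measurements, so it fails.
        if measure(blob) not in trusted.get(name, set()):
            failures.append(name)
    return sorted(failures)


def choose_config(current: dict[str, bytes],
                  previous: dict[str, bytes],
                  trusted: dict[str, set[str]]) -> str:
    """Pick the current config, fall back to the previous one, or refuse."""
    if not attest(current, trusted):
        return "current"
    if not attest(previous, trusted):
        return "previous"  # roll back to the last known-safe configuration
    return "refuse"        # nothing verifies, so do not start at all


# Usage with made-up component contents.
good = {"bootloader": b"bootloader-v1", "agent": b"agent-v1"}
trusted = {name: {measure(blob)} for name, blob in good.items()}
tampered = dict(good, agent=b"agent-modified")

print(attest(tampered, trusted))
print(choose_config(tampered, good, trusted))
```

The point of the sketch is only the decision flow: nothing runs unless its measurement matches a list that the system being checked cannot change. In real systems, as described above, that guarantee comes from the TPM and related security hardware.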
“These examples use modern Zero Trust approaches and show what can be done to encourage development practices that do not rely on kernel access,” Cable notes. “We will continue to develop these capabilities, harden our platform, and do even more to improve the resiliency of the Windows ecosystem, working openly and collaboratively with the broad security community.”
There you go.
Cable also shares some best practices that he says helped companies recover more quickly from the CrowdStrike outage than some others (cough, Delta). Most of that is useful for corporations and those managing corporate infrastructure, so you can check out the original post for the full list. But I see a few things in there that might be useful for individuals, too:
Back up data securely and often. Microsoft always mixes up the terms “back up” and “sync,” but most individuals are best served by cloud-based data sync, and not by old-school backup solutions. I’ve written about this topic a lot, but Don’t Be a Statistic (Premium) and Roll Your Own Windows Time Machine (Premium) are good places to start.
Ensure that you can restore your Windows devices quickly. This is also discussed in Roll Your Own Windows Time Machine (Premium) and in the Windows 11 Field Guide, of course. The Help and Recovery section includes relevant Recovery Drive and Reset This PC chapters. In a weird coincidence, Cable mentions system restore points, too: This feature is disabled by default in Windows 11, and I was literally just exploring this topic to see whether it should be added to the book. (I think it should.) VM snapshots work similarly (but even better) for virtualized environments.
I get it. Security is a big, hairy, complex topic that freaks people out for good reason. I’ve been writing more and more about this since late 2023 because of some recent advances like passkeys, and I’ll be writing more about password managers specifically soon. But yes, to Cable’s point, this is all about resiliency. And as Microsoft and the industry race to make our infrastructure more resilient, it’s on us as individuals to do the same for our personal data and identities. We might view this, collectively, as a security and privacy baseline for … well, people.
More soon.