The thing the President’s advisors underestimated

Regarding the recent Executive Order for cyber security.

Eddie Knight
May 17, 2021

On May 12th, the President of the United States issued the “Executive Order on Improving the Nation’s Cybersecurity.”

The order references the recent ransomware attack on Colonial Pipeline, but it is clear that the legwork was started weeks or months earlier.

The document is extremely thorough, clear, and well-informed. It outlines explicit deliverables, deadlines, parties responsible for overseeing each step, and parties responsible for executing each step of the plan.

In every way, it is starkly different from any similar resolutions or orders issued by the US government in the past several years.

The internet exploded with discussion about the potential this has for changing the cybersecurity landscape, with some more optimistic than others, but all in agreement that this is a big deal.

The EO was 16 pages when I printed it to take detailed notes. In the process, something in the fine print stood out to me, because I’ve been working on it for the past year. The writers of the EO are asking for more than they realize.

TL;DR

(Or, “16 pages of text is too much for me to process right now”)

According to the official summary, there are 7 key points to the EO:

  1. Remove Barriers to Threat Information Sharing Between Government and the Private Sector
  2. Modernize and Implement Stronger Cybersecurity Standards in the Federal Government
  3. Improve Software Supply Chain Security
  4. Establish a Cybersecurity Safety Review Board
  5. Create a Standard Playbook for Responding to Cyber Incidents
  6. Improve Detection of Cybersecurity Incidents on Federal Government Networks
  7. Improve Investigative and Remediation Capabilities

I’m currently collaborating with a colleague to create a robust analysis of each section of the EO, including commentary informed by our experience creating governance documentation and compliant cloud solutions for financial services.

Pause. What does the EO actually get right?

Fair enough. Before I nitpick one tiny point, let me highlight a few brilliant elements of the EO.

  1. A focus on Zero Trust Architecture demonstrates a firm understanding of the complexities presented by modern cloud infrastructure. Perimeter-based security is impossible, and this document goes as far as to define and explain the importance of Zero Trust.
  2. It puts “first things first” by ordering a comprehensive revision of contracts with vendors, to ensure that any work done as part of the EO includes open communication and the clear expectations that are required for a shared-trust model with vendors.
  3. It puts power in the hands of technical professionals.
    [Sidebar… When the DOD sought recommendations from the Texas Military Department on improving responses to ransomware, based on the TMD’s astounding success in 2019, the response from TMD officials was roughly: “We don’t have an exact system. We put the right people in the right place, and we empower them to do their jobs.”]
  4. Increased transparency among federal organizations and their vendors is mandated and given a clear roadmap. Inter-departmental communication is either a boon or a bane for any organization, so this is a huge win.
  5. The EO mandates a “Software Bill of Materials” (SBOM) across the board. An SBOM is essentially an ingredients list for software, and maintaining one should be standard practice for anyone managing production resources.
  6. Multi-factor Authentication is given a huge role as well, with virtually no room being left for any organization to avoid implementing some form of MFA.
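To make the SBOM point concrete: an SBOM is just structured data. Here is a minimal sketch of a CycloneDX-style SBOM in Python; the field names follow the public CycloneDX format, but the component list is invented purely for illustration.

```python
import json

def make_sbom(components):
    """Build a minimal CycloneDX-style SBOM dictionary.

    An SBOM is an ingredients list: when a vulnerability is announced in,
    say, a logging library, you can search your SBOMs to find every
    product that contains it.
    """
    return {
        "bomFormat": "CycloneDX",  # CycloneDX is one common SBOM format; SPDX is another
        "specVersion": "1.4",
        "components": [
            {"type": "library", "name": name, "version": version}
            for name, version in components
        ],
    }

# Invented component list, purely for illustration
sbom = make_sbom([("requests", "2.25.1"), ("urllib3", "1.26.4")])
print(json.dumps(sbom, indent=2))
```

A real SBOM would be generated automatically by build tooling rather than written by hand, but the shape is the same: every build ships with a machine-readable record of what went into it.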

There’s plenty more great stuff, but I’ll stop there for now.

So where are they asking for more than they realize?

This isn’t a knock on the writers of the EO; it’s a brilliant document. However, I was able to spot one of their blind spots simply due to the perspective I’ve gained through my own work.

The guidelines shall include criteria that can be used to evaluate software security, include criteria to evaluate the security practices of the developers and suppliers themselves, and identify innovative tools or methods to demonstrate conformance with secure practices

- Section 4(b)

And connecting to this, a few paragraphs later…

Such guidance shall include standards, procedures, or criteria regarding:
(i) secure software development environments, including such actions as:
(A) using administratively separate build environments;
(B) auditing trust relationships;
(C) establishing multi-factor, risk-based authentication and conditional access across the enterprise;
(D) documenting and minimizing dependencies on enterprise products that are part of the environments used to develop, build, and edit software;
(E) employing encryption for data; and
(F) monitoring operations and alerts and responding to attempted and actual cyber incidents;
(ii) generating and, when requested by a purchaser, providing artifacts that demonstrate conformance to the processes set forth in subsection (e)(i) of this section;

(iv) employing automated tools, or comparable processes, that check for known and potential vulnerabilities and remediate them, which shall operate regularly, or at a minimum prior to product, version, or update release;

This seems fairly innocuous, right?

A misconfigured resource is both easy to create by mistake and easy to exploit. And an exploited resource quickly becomes an exploited organization.

Misconfigured IT resources are the single greatest threat to any organization’s security. And in this case, that means national security.

Scanning a resource’s configuration to ensure that no improper settings have been applied is a fair request. Though it can be a bit tedious to implement, there are a variety of tools to make the process a bit less painful.

Scanning configuration prior to deployment is known as “verification,” and scanning a configuration after the resource has been deployed is known as “drift management.”

Both verification and drift management are important for everything from employees’ virtual machines to app databases.
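As a rough sketch of the two activities (with an invented policy and made-up resource shapes, not any real cloud API): verification checks a planned configuration before deployment, while drift management re-checks the live resource against what was approved.

```python
# Hypothetical governance policy: resources must block public access
# and encrypt data at rest. (Invented for illustration.)
POLICY = {"public_access": False, "encryption_at_rest": True}

def verify(planned_config):
    """Verification: check a configuration BEFORE deployment.

    Returns the list of settings that violate the policy.
    """
    return [
        key for key, required in POLICY.items()
        if planned_config.get(key) != required
    ]

def detect_drift(live_config, approved_config):
    """Drift management: compare the LIVE resource to the approved config.

    Returns the list of settings that have drifted since approval.
    """
    return [
        key for key in approved_config
        if live_config.get(key) != approved_config[key]
    ]

planned = {"public_access": False, "encryption_at_rest": True}
assert verify(planned) == []  # passes pre-deployment checks

# Later, someone flips a toggle on the live resource...
live = {"public_access": True, "encryption_at_rest": True}
assert detect_drift(live, planned) == ["public_access"]
```

In practice both checks run continuously and automatically; the point is that they answer two different questions, asked at two different times.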

However, there is a third aspect to this that is critical, but often overlooked. If you want to truly demonstrate compliance, you need validation.

The writers ask for demonstration, but they aren’t privileged with the experience that would tell them… that’s a really big request.

What’s the difference between verification and validation?

…and what’s the big deal?

Verification says “You are trying to do the right thing,” while Validation says “You’ve done the right thing.”

Any parent should see the difference here by now, but I’ll break it down further.

In my role as a cloud infrastructure engineer for multiple international financial institutions, I don’t care what you meant to do; I care what you did.

This is a bit inverted from how most parents think about a kid’s mistakes. If the child intended good while doing harm, it’s a lot easier to forgive and coach them forward. With infrastructure, it’s the opposite.

Two points on the importance of validation before we move on:

  1. Constricting deployments to an allowed configuration is a highly specific activity. A setting that is recommended for one user may be unnecessary in an organization that resolves the issue at a higher level. Also, some resources simply behave differently due to how they cooperate with other resources on the network, requiring a different verification process for different situations.
  2. Trusted configurations are not always impervious to error. For example, my team once found a place where an Azure Policy said it was preventing an insecure action… but we were able to perform that action during validation. (The Azure team was quick to fix this bug when it was reported!)
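That Azure Policy story is exactly the gap between the two concepts: verification reads what the control plane declares, while validation attempts the forbidden behavior and checks what actually happens. A toy sketch (everything here is invented; this is not a real Azure API):

```python
class FakeStorageResource:
    """Stand-in for a deployed resource whose enforcement is buggy."""

    # What the control plane *reports* about the resource
    declared = {"public_access": False}

    def fetch_without_credentials(self):
        # Bug: the setting is declared but not actually enforced,
        # mirroring the Azure Policy anecdote above.
        return "secret-data"

def verify(resource):
    """Verification trusts the declared configuration."""
    return resource.declared["public_access"] is False

def validate(resource):
    """Validation attempts the forbidden action and expects it to fail."""
    try:
        resource.fetch_without_credentials()
        return False  # the action succeeded: non-compliant in practice
    except PermissionError:
        return True   # the action was blocked: compliant in practice

r = FakeStorageResource()
assert verify(r) is True      # looks compliant on paper...
assert validate(r) is False   # ...but the behavior proves otherwise
```

Verification passes because the paperwork is in order; validation fails because the resource misbehaves anyway. Only the second result tells you what an attacker would find.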

What can we take away from this?

At this point, I’m assuming that the parties responsible for executing and enforcing these mandates will cut corners in the area of demonstration.

I assume that instead of interpreting “demonstrate” to mean “prove,” they will settle for letting it mean “indicate” or “evidence.”

If it is sufficient to indicate compliance, verification will do the trick. But if we want a true demonstration that proves compliance, validation will be required.

A separate automated test will need to be written for every governance standard on every type of infrastructure resource that is in production.

Automated pipelines will need to execute validation prior to a resource being approved for production deployment.

Tests will need to provide cohesive audit trails, and present human-readable descriptions of what each test does.

Tests will need to be audited themselves, to ensure they are doing what they say they’re doing, so the code will need to be well written and maintained.
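Taken together, those requirements look something like the sketch below: one small test per governance standard, each carrying a human-readable description for auditors, run by a harness that emits an audit record for every result. The standard IDs and checks here are invented for illustration.

```python
import json
from datetime import datetime, timezone

# Invented governance standards; a real suite would need one test per
# standard per type of production resource.
def check_mfa_enforced(resource):
    return resource.get("mfa_required", False)

def check_encryption_at_rest(resource):
    return resource.get("encryption_at_rest", False)

TESTS = [
    ("GOV-001", "MFA must be required for interactive logins", check_mfa_enforced),
    ("GOV-002", "Data must be encrypted at rest", check_encryption_at_rest),
]

def run_validation(resource_id, resource):
    """Run every test and emit one audit-trail record per result."""
    return [
        {
            "resource": resource_id,
            "test": test_id,
            "description": description,  # human-readable, for auditors
            "passed": bool(check(resource)),
            "checked_at": datetime.now(timezone.utc).isoformat(),
        }
        for test_id, description, check in TESTS
    ]

trail = run_validation("vm-042", {"mfa_required": True, "encryption_at_rest": False})
assert [r["passed"] for r in trail] == [True, False]
print(json.dumps(trail, indent=2))
```

Wired into a deployment pipeline, a harness like this would block the release when any record comes back failed, and the emitted records become the audit trail the mandate calls for.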

Fully demonstrating compliance in a scalable manner is a daunting task.

I’ve done it.

Although… perhaps that’s a good thing.
