This is the second of two posts outlining some basic principles and practices for securing software supply chains.
In part 1 of this series, I looked at good password and software hygiene, both at the point of first commit and along the software supply chain. Then I explored the importance of deterministic build processes and the properties that we should look for in our build systems.
Using deterministic processes also helps when it comes to another frequently recommended practice for securing software supply chains: understanding and maintaining the dependencies we ingest into our releases.
Know what you ingest
It’s generally understood that we all ingest numerous open source projects into our software. It is also well known, but less frequently discussed, that we take on risk when we do that.
Typically, when we characterize that risk, it’s in terms of “what if the software stops being maintained?” A major worry is that bugs, including security issues, will stop being fixed. Equally unnerving is the idea that a malicious maintainer could take over a project and continue to make new releases with some kind of additional damaging functionality that we may not learn about until it is too late. We have seen this in practice, most notably in the highly targeted event-stream npm module case.
This is indeed a serious problem and is often a primary concern for those tasked with threat modeling for software supply chains. Most dependencies have their own dependencies, so if we explicitly add X dependencies to our product, we typically end up ingesting a much larger number, Y, of transitive dependencies. Each dependency, in effect, has its own supply chain. Even worse, we may ingest multiple incompatible versions of the same dependency, some of which may be unmaintained or have known vulnerabilities.
What can we do about this? It is hard to make specific recommendations around tools for managing dependencies, because what constitutes a best practice will vary considerably between development ecosystems. In the abstract, though, we can state that our overarching goal should be to make conscious and deeply informed decisions about what we ingest into our release pipelines.
This starts with fully understanding our dependencies: their names, versions, known vulnerabilities, modifications from upstream, upstream location and more.
Then, to prevent surprises, we should pin, or lock, our dependencies so that we always retrieve the same known version of each one when building our software.
Once we understand our dependencies and have them locked down during build, we must actively monitor and maintain that locking so that we are aware of any new releases, as well as vulnerabilities in the versions we are ingesting.
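The mechanics of pinning vary by ecosystem (lock files, hash-pinned requirements, vendored sources), but the underlying idea can be sketched in a few lines of Python. The dependency name, version and digest below are hypothetical placeholders.

```python
import hashlib
from pathlib import Path

# Hypothetical lock data: the exact version we expect and the SHA-256 digest
# of the artifact that was recorded when the dependency was originally vetted.
PINNED_DEPENDENCIES = {
    "examplelib": {
        "version": "1.4.2",
        "sha256": "<digest recorded when the dependency was vetted>",
    },
}

def verify_pinned(name: str, artifact: Path) -> None:
    """Refuse to use a downloaded dependency unless it matches its pinned digest."""
    expected = PINNED_DEPENDENCIES[name]["sha256"]
    actual = hashlib.sha256(artifact.read_bytes()).hexdigest()
    if actual != expected:
        raise RuntimeError(f"{name}: digest mismatch (expected {expected}, got {actual})")
```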
Minimizing the number of dependencies we ingest has an obvious benefit in reducing the attack surface of our projects. We can also minimize the impact of the dependency on our project by layering an abstraction over the dependency, making it easier to replace in future if required. Additionally, we can reduce the likelihood that dependency vulnerabilities will impact our wider project by deploying sandboxing and other isolation techniques.
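As a rough illustration of that abstraction layer, the module below is the only place in a hypothetical codebase that imports the parsing dependency directly; the standard library’s json module stands in for a real third-party package so the example stays runnable.

```python
# config_loader.py: the single module that imports the serialization dependency.
# Here the standard library's json module stands in for a third-party package.
import json
from typing import Any

def load_config(text: str) -> dict[str, Any]:
    """Parse configuration text into a dictionary.

    Callers depend on this function rather than on the underlying library,
    so replacing the parser later only requires changing this one module.
    """
    return json.loads(text)
```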
Finally, by building the source ourselves wherever possible (when verifiably reproducible builds are not available), we can have increased confidence that the components in our releases correspond with the source code we expect.
Once we have processes in place to help us understand, lock, monitor, and maintain our dependencies, it is a best practice to start reporting on those dependencies by producing a software bill of materials (SBOM). IT customers, including the US federal government, are increasingly requiring SBOMs as a form of assurance that security best practices are being followed. SBOMs can also be a powerful aid in understanding and monitoring a project’s dependencies.
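SBOM formats differ (SPDX and CycloneDX are the most widely used), but at their core they record what a component is, which exact version shipped, and how to identify it. A minimal, CycloneDX-flavoured sketch of a single component entry, with placeholder values, might look like this:

```python
# A minimal, CycloneDX-flavoured component record with placeholder values.
# Real SBOMs carry many more fields (licenses, suppliers, relationships, ...).
component = {
    "type": "library",
    "name": "examplelib",                  # hypothetical dependency
    "version": "1.4.2",
    "purl": "pkg:pypi/examplelib@1.4.2",   # package URL identifying the exact artifact
    "hashes": [
        {"alg": "SHA-256", "content": "<sha256 of the released artifact>"},
    ],
}
```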
If you are building containers, the excellent Tern project supports understanding, locking and reporting on dependencies in container images. It can inspect the dependencies in both the underlying Linux distribution and in many of the language ecosystems used to develop container contents, including Go, Python, JavaScript and more.
Don’t forget to sign and verify
The final principle I want to discuss is that of signing and verifying artifacts, which is where I’ve been making most of my own contributions to open source supply chain security.
Signed artifacts for software delivery are well understood at this point, but not as pervasive as you might hope in 2021. Docker Content Trust is still not enabled by default, despite having launched in 2015, and many language ecosystems still utilize package managers that do not support package signing.
Even for ecosystems that do support optional signing, the signing itself is often the easy part; key management remains an additional burden for both the systems implementing signing and the developers using that feature.
The open source project I co-maintain, called The Update Framework (or TUF), provides a blueprint for signing the content in a content repository. Despite the name, this works with almost any content repository, regardless of content type. TUF provides an industry-adopted and security-tested solution to build on if you care about:
- the integrity and authenticity of the content: that it was signed by a known, or at least identifiable, entity;
- a consistent view of the content, to prevent things like installing an inconsistent mixture of content that is easier to compromise; and
- getting the latest content, or at least knowing when you are not.
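To give a sense of what building on TUF looks like from the client side, here is a minimal sketch using python-tuf’s ngclient API. The URLs, paths and target name are placeholders, and the exact API may differ between library versions.

```python
# Minimal TUF client sketch using python-tuf's ngclient API (placeholder URLs/paths).
# The metadata directory is expected to already contain a trusted root.json.
from tuf.ngclient import Updater

updater = Updater(
    metadata_dir="/var/lib/myapp/tuf-metadata",
    metadata_base_url="https://example.com/metadata/",
    target_dir="/var/lib/myapp/downloads",
    target_base_url="https://example.com/targets/",
)

updater.refresh()  # fetch and verify the repository's signed metadata
info = updater.get_targetinfo("myapp-1.2.3.tar.gz")
if info is None:
    raise RuntimeError("target is not signed into the repository")
path = updater.find_cached_target(info) or updater.download_target(info)
print(f"verified artifact available at {path}")
```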
One of the tenets of TUF that I really appreciate, and try to advocate for in designing secure systems more generally, is the idea of compromise resilience. This, effectively, amounts to acknowledging that compromise will happen and taking steps to minimize the impact and make recovery easier.
TUF achieves compromise resilience through a few core principles:
- Different roles with their own keys are used to sign different metadata in the system, with each type of metadata contributing to different aspects of the system’s security.
- Key revocation can be handled explicitly, through updating repository metadata, or implicitly, through expiration times. Both approaches are fast and secure, and implicit revocation encourages implementers to be conservative about how long they choose to trust any given key.
- This reduced trust in keys is further emphasized with recommendations to keep high-risk keys offline – that is, not continuously available to the system. High-risk keys are typically kept in a hardware security module that is only connected to the system when a signing operation needs to happen. TUF is also designed so that frequently used keys, which often need to be kept online to enable smooth operation of the repository, are associated with lower-risk roles, where a key compromise is unlikely to result in complete repository compromise.
- The final aspect of compromise resilience in TUF is the notion of a “threshold,” which allows a repository to require a quorum of signatures before a change becomes trusted. That way, you can build in checks such as requiring multiple administrators of a system to all sign metadata before a new configuration is introduced to the repository. This prevents a single lost or compromised administrator key from being used to compromise the entire repository (a simplified sketch of the threshold and expiry checks follows this list).
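To make the threshold and expiration ideas concrete, here is a deliberately simplified sketch in Python (not the python-tuf implementation) of how a client might decide whether a piece of role metadata is currently trustworthy. All parameter names are illustrative.

```python
from datetime import datetime, timezone

def metadata_is_trusted(
    valid_signature_keyids: set[str],  # keyids whose signatures verified correctly
    authorized_keyids: set[str],       # keyids delegated to this role
    threshold: int,                    # quorum the role requires
    expires: datetime,                 # expiry recorded in the signed metadata
) -> bool:
    """Simplified trust decision: enough authorized signatures, and not expired."""
    now = datetime.now(timezone.utc)
    if now >= expires:
        # Implicit revocation: stale metadata is rejected outright.
        return False
    quorum = len(valid_signature_keyids & authorized_keyids)
    return quorum >= threshold
```

A real TUF client does considerably more (role delegation, rollback protection, consistent snapshots), but the quorum-plus-expiry check is the heart of why a single compromised key, or a long-forgotten old one, is not enough to subvert the repository.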
Signing along the pipeline and authenticated metadata
Signing the outputs of our secure release pipelines is a recognized best practice. But what if we take this idea of signing content a step further, and instead of just signing the output of a software development pipeline we sign some intermediate artifacts too?
That’s the idea behind the open source in-toto project, which produces signed metadata at each step in the release pipeline to provide several security properties, including the following (a short sketch of generating this metadata appears after the list):
- Artifact flow integrity. With signed metadata about each step in a release pipeline, we can ensure that artifacts flowing through that pipeline have not been tampered with between steps.
- Step authentication. By using public-key cryptography to sign this metadata, we can have confidence that each step was performed by the expected entity, the owner of the private key.
- Supply chain layout integrity. Possessing authenticated metadata about each executed step in the release process enables us to verify that all steps were performed, in the expected order, by the expected entities.
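As a brief sketch of what generating this metadata can look like, the snippet below uses in-toto’s Python API to run a build step, record the hashes of its inputs and outputs, and sign the resulting link metadata. The key path, step name and command are placeholders, and argument names may differ between in-toto releases.

```python
# Sketch of producing signed step ("link") metadata with in-toto's Python API.
# The signing key, step name and build command below are placeholders.
from in_toto.runlib import in_toto_run
from securesystemslib.interface import import_rsa_privatekey_from_file

signing_key = import_rsa_privatekey_from_file("build-key", password=None)

link = in_toto_run(
    name="build",                       # step name declared in the supply chain layout
    material_list=["src/"],             # inputs whose hashes are recorded before the step
    product_list=["dist/"],             # outputs whose hashes are recorded after the step
    link_cmd_args=["python", "-m", "build"],
    signing_key=signing_key,
)
link.dump("build.link")                 # later verified against the signed layout
```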
These properties would likely have helped prevent a SolarWinds-like attack, so it’s not surprising that we’re seeing many software producers starting to engage with in-toto.
The same general idea of generating authenticated metadata about the steps in a release pipeline and feeding that into some kind of policy engine powers Google’s Binary Authorization technology. The overlap is significant enough that Google engineers are collaborating with the in-toto project to define a standard format for authenticated metadata about software artifacts, called software attestations.
These attestations not only give us data with which to make policy decisions, they also enable us to perform forensics in case of a compromise and give us evidence to inform any remediation.
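The attestation format separates what is being described (the subject, pinned by digest) from what is being claimed about it (the predicate). Shown here as a Python dict with placeholder values, the rough shape of a statement is:

```python
# Rough shape of an in-toto attestation statement (placeholder values).
# The subject identifies the artifact by digest; the predicate carries the
# claim being made about it, such as build provenance.
statement = {
    "_type": "https://in-toto.io/Statement/v0.1",
    "subject": [
        {"name": "myapp-1.2.3.tar.gz", "digest": {"sha256": "<artifact digest>"}},
    ],
    "predicateType": "https://slsa.dev/provenance/v0.2",
    "predicate": {
        "builder": {"id": "https://example.com/build-system"},
        "buildType": "https://example.com/build-types/container@v1",
        # details of materials, parameters and the build environment go here
    },
}
```

In practice the statement is then signed and wrapped in an envelope before being stored alongside the artifact it describes.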
in-toto attestations are a great example of both the trend towards industry-standard best practices in securing software supply chains and the power of open source to define industry standards that leverage the diverse experiences of contributors, regardless of their employer.
In summary, a number of excellent open source projects are securing specific areas of the software supply chain while the broader security community is working on building out holistic solutions that pull these discrete efforts together. As that work progresses, however, there is plenty that each of us can – and should – be doing today to make our software supply chains more secure.
For further reading, check out the CNCF TAG Security team’s Catalogue of Supply Chain Compromises and its Software Supply Chain Best Practices whitepaper.
If you are interested in getting involved, the CNCF Security TAG’s Software Supply Chain Security working group and the Open Source Security Foundation’s Digital Identity Attestation working group are good places to start.