Composing the Ultimate SBOM

Have you seen the dependency graphs of modern real-life applications? They look a lot like star clusters or complex subway systems, like the one in NYC. Go check for yourself: jump on Open Source Insights and generate the dependency graph of a project of your choice. Here’s betting you can easily get to something as beautiful as Kubernetes’ graph:

Kubernetes’ dependency graph

The dependency graphs of modern applications show how we build software today: we focus on our unique innovation and deal with common challenges by leveraging existing solutions. That’s a sound approach to software development. We don’t need to reinvent the wheel over and over again, and by focusing on what is unique to us, we bring the solutions that meet our users’ needs to life more efficiently. And that’s awesome!

But every third-party component we use drags along dependencies that drag along their own dependencies. We end up with tons of known and unknown dependencies, and the resulting lack of visibility and transparency can get us into trouble.

We’ve all seen enough of that meme with the small project, poorly maintained or not maintained at all, bending under the weight and complexity of modern digital infrastructure. Hopefully, we’ve finally realized that we need to pay closer attention to the security of the software we use.

Meme of small project
Source: xkcd

The SBOM, the Community Standards and the Landscape of Tools

We’ve already discussed the potential of SBOMs in Let’s Get SBOM Ready. By explicitly declaring software components, SBOMs enable us to identify and mitigate risks before they become a crisis. 

We know what data we want to convey across organizational boundaries in an SBOM, and we have standardized formats such as Software Package Data Exchange (SPDX) and OWASP CycloneDX. That means SBOMs can be machine-generated and machine-read, the exchange can be automated, and we can make it scale. Luckily, we have many tools to generate SBOMs across today’s huge variety of technologies, programming languages, build systems, etc.
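
To make “machine-read” concrete, here is a minimal sketch in Python that loads a tiny SPDX document in its JSON serialization and lists the packages it declares. The field names follow the SPDX 2.x JSON format; the document itself (example-app, libfoo, libbar, the example.com namespace, example-generator) and the list_components helper are made up for illustration.

```python
import json

# A tiny, hand-written SPDX 2.3 document in JSON form.
# All names and values below are hypothetical sample data.
SPDX_JSON = """
{
  "spdxVersion": "SPDX-2.3",
  "dataLicense": "CC0-1.0",
  "SPDXID": "SPDXRef-DOCUMENT",
  "name": "example-app",
  "documentNamespace": "https://example.com/spdx/example-app-1.0",
  "creationInfo": {
    "created": "2022-01-01T00:00:00Z",
    "creators": ["Tool: example-generator"]
  },
  "packages": [
    {
      "name": "libfoo",
      "SPDXID": "SPDXRef-Package-libfoo",
      "versionInfo": "1.2.3",
      "downloadLocation": "NOASSERTION",
      "licenseConcluded": "MIT"
    },
    {
      "name": "libbar",
      "SPDXID": "SPDXRef-Package-libbar",
      "versionInfo": "0.9.0",
      "downloadLocation": "NOASSERTION",
      "licenseConcluded": "Apache-2.0"
    }
  ]
}
"""


def list_components(document):
    """Return (name, version, license) for every package in the document."""
    return [
        (
            package["name"],
            package.get("versionInfo", "NOASSERTION"),
            package.get("licenseConcluded", "NOASSERTION"),
        )
        for package in document.get("packages", [])
    ]


document = json.loads(SPDX_JSON)
components = list_components(document)

for name, version, license_id in components:
    print(f"{name} {version} ({license_id})")
```

Because the format is standardized, the same few lines work on any SPDX JSON document, regardless of which tool produced it.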

A lot of communities are focused on the development of open source tooling for efficient and effective exchange of SBOMs to enable license compliance, security, export control, pedigree and provenance workflows. Here are a few of them to get an idea of the variety of use cases:

  • Tern is a great post-build solution for generating SBOMs for container images and Dockerfiles.
  • Salus is a good solution for generating SBOMs at build time.
  • Pkgconf introduced bomtool, a great new tool that supports build-time generation and provides accurate data for C/C++ projects.
  • The SPDX SBOM Generator generates SBOM data from source code.
  • The K8s BOM Tool appears to be an emerging solution that generates SBOMs from directories, container images, single files, etc. and processes already created documents.
  • Docker introduced the docker sbom functionality, which is based on Syft.

The Ultimate SBOM

Having that variety of tools to generate SBOMs at different stages of the software lifecycle is awesome! Each stage, however, has unique characteristics, and an SBOM may differ depending on when and where its data was collected. Missing build and dependency metadata limits the compliance and security benefits of an SBOM. One vulnerable piece can jeopardize the whole system, and that’s the risk we are trying to mitigate. How do we make sure that we have all the pieces?

In the Supply chain Levels for Software Artifacts (SLSA) framework, the red triangles in the picture below mark the supply chain threats that SLSA addresses.

Supply chain Levels for Software Artifacts (SLSA)
Source: slsa.dev

SBOMs are usually generated at the Source stage of the supply chain or post-build, relying on heuristics. Generating SBOMs at build time instead captures high-fidelity information about what went into the artifact, including better dependency and compiler change information. In some cases, only the object code is available for SBOM generation, and in many cases the tooling to generate SBOMs at build time does not yet exist. In those cases, binary analysis tools can help us better understand the components and their dependencies.

We need that diversity of tools! We can’t expect build-time/from-source SBOM for everything, but that shouldn’t mean we settle for post-build scan SBOMs exclusively – we need the best of all worlds.

Software is modular in nature – each component has its own purpose, dependencies, and lifecycle. If we incrementally left-shift our SBOM generation on a per-component basis, we end up with ‘wider-in-quantity and smaller-in-scope’ SBOMs, which we define as micro-SBOMs. Producing more accurate build and dependency information relies on having a micro-SBOM for each component that makes up a larger piece of software.

And yes – we’ll definitely end up piled high with hundreds of SBOMs. But we can stitch them together into a single SBOM! We refer to this as “composing.” 

The composing functionality that we’ve implemented in sbom-composer parses the content of micro-SBOM input files in SPDX format, merges them while removing duplicates, and constructs an SBOM output file that complies with the highest of the input SPDX versions. The result is validated to make sure that it is a fully functional SBOM that helps improve software transparency and supports use cases such as vulnerability management, software inventory and license compliance.
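
The composing idea can be sketched in a few lines of Python. This is not the sbom-composer implementation, just an illustration under simplifying assumptions: the micro-SBOMs are already parsed SPDX JSON documents, two packages count as duplicates when their name and versionInfo match, and relationships, files and validation are left out. The function names, the sample packages and the example.com namespace are hypothetical.

```python
from datetime import datetime, timezone


def spdx_version_key(version):
    """Turn 'SPDX-2.3' into (2, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("-", 1)[1].split("."))


def compose(micro_sboms, name, namespace):
    """Merge parsed SPDX JSON documents into one document (sketch only)."""
    if not micro_sboms:
        raise ValueError("at least one micro-SBOM is required")

    # The output declares the highest SPDX version found in the inputs.
    highest = max(
        (doc["spdxVersion"] for doc in micro_sboms), key=spdx_version_key
    )

    # Assumption: (name, versionInfo) identifies a package. The first
    # occurrence wins; later duplicates are dropped.
    seen = set()
    packages = []
    for doc in micro_sboms:
        for package in doc.get("packages", []):
            key = (package["name"], package.get("versionInfo"))
            if key in seen:
                continue
            seen.add(key)
            packages.append(package)

    return {
        "spdxVersion": highest,
        "dataLicense": "CC0-1.0",
        "SPDXID": "SPDXRef-DOCUMENT",
        "name": name,
        "documentNamespace": namespace,
        "creationInfo": {
            "created": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "creators": ["Tool: compose-sketch"],
        },
        "packages": packages,
    }


# Two hypothetical micro-SBOMs that share the libfoo 1.2.3 package.
micro_a = {
    "spdxVersion": "SPDX-2.2",
    "packages": [
        {"name": "libfoo", "SPDXID": "SPDXRef-Package-libfoo", "versionInfo": "1.2.3"},
    ],
}
micro_b = {
    "spdxVersion": "SPDX-2.3",
    "packages": [
        {"name": "libfoo", "SPDXID": "SPDXRef-Package-libfoo", "versionInfo": "1.2.3"},
        {"name": "libbar", "SPDXID": "SPDXRef-Package-libbar", "versionInfo": "0.9.0"},
    ],
}

composed = compose(
    [micro_a, micro_b],
    name="example-app",
    namespace="https://example.com/spdx/example-app-composed",
)
print(composed["spdxVersion"], [p["name"] for p in composed["packages"]])
```

A real composer has more to handle than this sketch does, for example SPDXID collisions between input documents and the relationships that link packages to one another.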

Takeaways

Although generating micro-SBOMs and composing the ultimate SBOM come with technical challenges, innovative solutions are proving that these challenges are not insurmountable. It’s not rocket science after all (although it might feel a lot like it when generating SBOMs for legacy systems).

There’s one more challenge lurking: organizations and communities need to introduce the necessary changes to their processes so that they adopt SBOMs and start getting involved in SBOM production. 

“The Ultimate SBOM is a get-together challenge.”

Providing SBOMs brings many benefits, but there are also concerns that need to be properly addressed. All actors in the supply chain need to provide the necessary transparency into how their software is created, distributed and consumed. We need to come together to be able to take full advantage of SBOM capabilities and have more secure supply chains!

Stay tuned to the Open Source Blog and follow us on Twitter for more deep dives into the world of open source contributing.