Building a proprietary software product on top of open source projects can come with many benefits, but not without complexity. The benefits are often tempting enough to outweigh that complexity, and the complexity itself can be mitigated with careful planning and strategy.
VMware has a number of contributors in the Kubernetes SIG Cluster Lifecycle, where projects like Cluster API are hosted. Cluster API is used in the VMware Tanzu product line. Work on this project brings together people from a number of different companies such as VMware, Amazon, Microsoft, Red Hat and others. These companies collaborate to create better software together, even though they compete with each other in the market for Kubernetes-based products.
The Cluster API project originated as an example of great collaborative innovation between a number of companies that were, at the time, investigating a solution for declarative Kubernetes cluster management that follows the API conventions already established by the wider Kubernetes project. The blog post “The Kubernetes Cluster API” includes more details about the project's inception and initial work.
VMware’s Open Source Technology Center has the mission to educate our company on how to better engage and align on common open source goals. This blog post covers some of the common questions around upstream engagement in open source projects, with the aim of introducing changes and innovating through collaboration. The information here can help you understand which processes developers may have to follow to work well in open source project communities, and why open source projects matter for innovation.
Why open source?
When software that is being developed starts depending on an external project or product, that project or product is considered an “upstream dependency,” where upstream is the direction towards work by some other original authors. Accordingly, downstream refers to the project that is being developed locally and depends on that work. If an upstream dependency is open source and has a community of contributors, this makes it a great target for collaborative innovation. If the upstream is, for example, a proprietary SDK without an open source community, innovating in it might still be possible in cooperation with the SDK authors.
Innovating downstream first can be well-scoped and targeted at the product of a certain company, granting it full control and reducing concerns around making changes. But this confines the innovation to a collective brain whose field of view may be limited by the focus and timeline of getting the “next” product to market. The same collective, despite having experts, can still miss a critical point that later becomes technical debt, which is difficult to deal with. At that point the technical debt can carry a maintenance cost that the owning company no longer wants to bear. Open sourcing a solution with such a set of problems may not improve the overall state, as additional implementers would need a substantial investment to adapt it.
Innovating upstream first in open source projects has the potential to enable collaboration that goes beyond the limitations of the company team that you work with. The eyes of an expert from another company with a different use case than yours can discover gaps in the solution. Closing those gaps improves the overall product for all involved parties, and even for those that are not yet involved. Collaboration with other companies that can share the maintenance load and bring more expertise to the table can result in a more polished solution that is well abstracted and works for everyone, even your company’s future customers.
The Power of Open Source Innovation
Innovation in open source software is broad. It ranges from reducing the resource overhead of a small module in a project with limited adoption, to adding a very high-demand feature to a popular module that has wide adoption and can then benefit, for example, computational biology. Your changes can impact everyone and everything up the stack.
As an example, Cluster API is a relatively new project, and every year it sees more and more contributions that add to the project's overall innovation. With collaboration between different SIGs and companies in Kubernetes, the project can achieve improved security and improved declarative cluster management, as well as better support for ARM devices on the Edge, Windows workloads, GPUs and hyperscale computing.
Cluster API includes a mechanism that allows operators to check the overall cluster condition with a single command, which is an example of improving declarative cluster management. This is something that stock Kubernetes does not support out of the box.
The potential for technology improvement is limitless and the impact unpredictable.
The biggest challenges usually revolve around communication and maintaining a tolerance for the interests and demands of others. Communication is key. We all come from different companies, with different styles of communication and different vocabularies. When stepping outside your company’s environment, it is important to adapt your communication style, be clear and distinct, and always check for common understanding. Simple words can have dramatically different meanings and thus impact. Always check in with your collaborators. Tolerating and understanding the use cases of your users and independent contributors is important too.
When a conflict of interest arises, try to find a middle ground and gather as much feedback as possible to make the right decision. If you are a long-standing collaborator, try not to overrule others based on a personal preference. If you are a newcomer, try not to force your opinion on peers from other companies, even if your company has more senior contributors in the list of active maintainers.
If needed, create a survey and let the project collaborators and users decide on a change. Survey results are often an excellent metric for the direction a project should take, and surveys have been a well-established practice in SIG Cluster Lifecycle.
For example, based on the latest SIG survey results from 2020, the group established that containerd is rapidly gaining traction as a container runtime in the Kubernetes user base, compared to the integrated support in the kubelet for Docker (also known as “dockershim,” which is deprecated). The group then started planning how to address the user experience concerns about migrating users. This resulted in some collaboration with other groups like SIG Node, which maintains the kubelet component.
A certain change might see pushback only from a set of maintainers from a certain company. Although this blocks your innovation, it is an indicator that the change might not fit the collaborative direction of the project. Accepting the feedback and going back to the drawing board can result in a better solution.
But if the timeline is critical and something is not a good fit for upstream, developing it downstream-only may be the only option for getting it into a product. This, however, comes with some of the risks mentioned above. The “upstream first” model is still better with respect to early feedback from more reviewers and ensuring better maintenance potential.
You can follow these recommendations to help align interests:
- Follow the best practices that your company has already established.
- Find maintainers from your company that are already engaged in a project and ask them to mentor you.
- Ask the maintainers questions around engagement from other companies and what points they care about. Company X might be sensitive to performance, while company Y might be sensitive to the addition of certain new features down the line.
If you are the first collaborator from your company, listen and observe the behavior of existing maintainers. Try to learn more about the existing collaboration process from them. Learn what products they are building on top of this software. Be careful with the first impressions you make and try to go from small changes to bigger changes once you have gained trust and expertise.
Read these related blog posts by other members of VMware’s OSTC:
- How to Get Your Code Upstream by Steven Rostedt
- How to Create Good Good-First-Issues by Nisha Kumar
- Building a Community: Company Resources by Dawn Foster
Establishing your downstream priority first is fine, but try to estimate the reception of an upstream review and how to accommodate this new planned innovation to suit everyone. Note that you cannot do that effectively without being familiar with the existing community. Anticipate comments on the problem space and do not get discouraged if you see a lot of them.
Try to justify a change with a use case that innovates in a beneficial way beyond your company and product. Such use cases are hard to ignore and oftentimes the users have the final word.
Despite concerns around alignment of interests between the parties involved, collaboration in open source is the most powerful instrument for software innovation. Open source has the potential for creating and improving software by collective introspection with unlimited feedback and a wide field of view.
Following a set of strategic approaches can put your company at the front line of software innovation through open source. Feedback from third party contributors, such as other companies and individuals, comes as a benefit to the product that you are building. The ongoing innovation in open source can then be backed and supported not only by your company’s developer resources and engagement, but also by the same third parties, who contribute with their own perspectives and interests.