The Challenges of Maintaining a Large Open Source Project

Today, Kubernetes is one of the fastest growing open source projects next to the Linux kernel, with over 130 commits each week, and that’s just to the main repository. It was recently announced that the CNCF Technical Oversight Committee (TOC) voted for Kubernetes to become CNCF’s first project to graduate: https://www.cncf.io/blog/2018/03/06/kubernetes-first-cncf-project-graduate/

One can compare Kubernetes to the Linux kernel based on the codebase size, the large number of contributors and the similar challenges they are facing. Yet these two projects use completely different development models.

Both projects are present on GitHub. For Kubernetes, this is the primary development location for code, issues and most planning. In addition to GitHub’s features, Kubernetes also uses Google Docs and Zoom. While the Linux kernel source is also available on GitHub, the Linux community decided against using GitHub features such as issue tracking and pull requests (PRs). Linux uses the Kernel Bugzilla for tracking bugs, and email for patch reviews, PRs and discussions.

A deeper comparison between Linux and Kubernetes

The Linux kernel project uses the “network of trust” (aka “chain of command”) model where you send your PRs and patches to people with more experience (higher ranks) than you via email. Patches are not sent directly to the lead maintainers (who would be overwhelmed) or to a centralized GitHub repository where labels have to be used for delegation. Once the reviewers have approved the changes, the code can reach the higher ranks. It’s a distributed and elegant model that, although seemingly chaotic, just works. All the noise from bad patches is filtered through the “network of trust,” and only good code gets merged upstream. A key reason for Kubernetes not using this model is that GitHub is more modern and approachable, catering to a new generation of programmers who are not heavy email users. However, easier and centralized is not always good in the long run.

Key members of the Linux community have repeatedly expressed concerns that GitHub is not suited for large projects. In a 2016 talk, Greg Kroah-Hartman compares Linux to other projects and explains some of the reasoning behind not using such platforms.

A memorable quote from this talk is, “GitHub does not scale,” and Kubernetes was given as a prime example of how it was getting difficult to manage. Today, the number of open issues and PRs is even larger.

Scalability and making changes to development models

Scaling is difficult for Kubernetes at this point, even though GitHub is continually evolving and rolling out features that help larger projects. Control is transferred into the hands of the maintainers and the Special Interest Groups (SIGs), who can decide whether a certain module should be moved to a separate repository or a git submodule. Splitting code hierarchically resembles a “network of trust,” and makes it possible to separate areas that innovate and stabilize at different cadences. One problem with this delayed approach is that GitHub does not provide a native way to transfer PRs and issues into the new repository. One workaround is a third-party tool such as https://github.com/google/github-issue-mover

This is also true for labels, which may not be consistent across repositories. For consistency, there is a growing set of automation behind-the-scenes to enable Kubernetes subteams to “split the monolith.” An efficiency balance must be struck between tight coupling for consistency and loose coupling for independence. This is a work in progress for Kubernetes.

Transferring data between repositories in this way is error-prone, fragments the community workflow and can create chaos, such as losing track of tickets and PRs. Currently, the main Kubernetes repository contains many outdated or duplicate issues and many forgotten PRs. While an effort is underway to prune older, stale issues, this backlog makes any issue tracking system difficult to navigate and even more difficult to maintain. Labels, bots and mailing list notification commands can help, but they add a maintenance burden of their own that should be kept to a minimum. The Linux Kernel Bugzilla has similar issues; again, the centralized model does not work well there.
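
As a rough illustration of what pruning a stale backlog can look like, the sketch below sorts issues into buckets by how long they have been idle. It is not the tooling Kubernetes actually runs: the thresholds, the bucket names, the function names and the issue numbers in the usage note are all assumptions made for this example.

```python
from datetime import datetime, timedelta

# Assumed thresholds for illustration only; a real project would tune these.
STALE_AFTER = timedelta(days=90)
CLOSE_AFTER = timedelta(days=150)


def triage(last_activity: datetime, now: datetime) -> str:
    """Classify one issue by the time elapsed since its last activity."""
    idle = now - last_activity
    if idle >= CLOSE_AFTER:
        return "close"
    if idle >= STALE_AFTER:
        return "stale"
    return "active"


def triage_backlog(issues: dict[int, datetime], now: datetime) -> dict[str, list[int]]:
    """Group issue numbers into 'active', 'stale' and 'close' buckets."""
    buckets: dict[str, list[int]] = {"active": [], "stale": [], "close": []}
    for number, last_activity in sorted(issues.items()):
        buckets[triage(last_activity, now)].append(number)
    return buckets
```

Calling `triage_backlog({1: datetime(2018, 2, 20), 2: datetime(2017, 11, 1)}, datetime(2018, 3, 1))` would place the hypothetical issue 1 in the active bucket and issue 2 in the stale bucket. Even a rule this simple has to be owned and tuned by someone, which is the maintenance cost mentioned above.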

On one hand, Linux as a monolithic kernel uses a distributed development model, while Kubernetes as a distributed platform uses a centralized development model. On the other hand, the Linux kernel has a centralized MAINTAINERS file, while Kubernetes has distributed OWNERS files, which are machine-consumed in order to request PR reviews automatically from the best reviewers.
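
To make the idea of machine-consumed OWNERS files more concrete, here is a minimal sketch of one plausible lookup: for each file changed in a PR, walk up the directory tree and take the reviewers from the nearest directory that declares owners. This is a simplification and not the actual Kubernetes automation. The in-memory `OWNERS` mapping, the reviewer names and both function names are hypothetical. Real OWNERS files live in the repository tree and carry more information than a flat reviewer list.

```python
from pathlib import PurePosixPath

# Hypothetical stand-in for OWNERS files: directory -> reviewers.
# The empty string represents the repository root.
OWNERS = {
    "": ["root-approver"],
    "cmd/kubeadm": ["kubeadm-reviewer-a", "kubeadm-reviewer-b"],
    "pkg/kubectl": ["kubectl-reviewer"],
}


def nearest_owners(changed_file: str, owners: dict[str, list[str]]) -> list[str]:
    """Return reviewers from the closest ancestor directory with an entry."""
    directory = PurePosixPath(changed_file).parent
    while True:
        key = "" if str(directory) == "." else str(directory)
        if key in owners:
            return owners[key]
        if directory.parent == directory:  # reached the top without a match
            return []
        directory = directory.parent


def suggest_reviewers(changed_files: list[str], owners: dict[str, list[str]]) -> list[str]:
    """Collect de-duplicated reviewers for every file touched by a PR."""
    suggested: list[str] = []
    for path in changed_files:
        for reviewer in nearest_owners(path, owners):
            if reviewer not in suggested:
                suggested.append(reviewer)
    return suggested
```

With this mapping, a PR touching `cmd/kubeadm/app/init.go` and `README.md` would be routed to the two kubeadm reviewers and then to the root approver. The knowledge of who reviews what sits next to the code it describes, in contrast to a single central file.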

For both projects, issues with scalability and leadership are drawing attention, both of which are themes in many large open source projects. If you are a new contributor and you submit your first PR to either project, don’t be discouraged if you don’t get a quick response from the maintainers and reviewers. It can often take weeks before someone takes a look at your PR because maintainers are very busy and the signal-to-noise ratio in their mailboxes is very low.

Thinking ahead of time is important

In the land of GitHub, one way to prevent this difficult-to-control accumulation of tickets, and to actually improve the interaction with the community, is to think ahead and create sub-project repositories at the very early stages of a particular module or feature, long before it starts to grow out of control. It is often hard to predict whether a project will suddenly gain a large user base, but it’s very important that time and planning are reserved for this. If the user base does start growing suddenly, it would no longer be advisable to use the issue tracker and PR mechanism of the main repository, but only those of the sub-projects. The Docker community tackled this to a certain extent. For Kubernetes, this introduces a complication, as modules like “kubeadm” and “kubectl” rely on the CI in the main repository. The effort of keeping the tests up to date outside of the main repository would fall to the module maintainers, who prefer to avoid it due to lack of resources.

Arguably, GitHub remains the easiest and most popular platform for maintaining a large open source project, with decent documentation, easy logging of issues, PRs, mentions, markdown, inline code reviews and all the nice features that most maintainers expect. However, there is a trade-off between ease of use and the eventual complexity of the project. One has to spend time planning the structure and organization of a project ahead of open-sourcing it; otherwise, some of the scale problems the Kubernetes community is facing will manifest sooner or later.

The SIG-Release and SIG-Architecture groups of Kubernetes recognize these issues and are looking at incremental improvements. There is also an effort to mentor more people into senior review and maintainer positions, as currently there is a limited set of senior, trusted individuals.

Conclusion

The takeaway here is that organizing and maintaining large open source projects is complicated. The Linux kernel’s process is quite mature, but it has also evolved quite a bit across its more than 25 years of existence. The Kubernetes process is actively evolving today. The more you give deliberate thought upfront to topics such as tools, processes and automation, the more you may avoid painful transitions later. At the same time, it is also important to be pragmatic and recognize that evolution will be necessary.

Lubomir started programming in the late 80s as a hobby and moved into commercial closed source work in the late 90s. He has been doing open source for over a decade, contributing to open source projects in different fields, including audio, signal processing and UI. He joined the OSTC in August 2017.
