Community

Project, Process, and Product(ion)

By Dirk Hohndel, VP, Chief Open Source Officer at VMware

This is a familiar topic; I’m reasonably certain I have talked about this for about twenty years in one form or another. I’m sure we all have, at some point, heard that catchy line “I use open source for this because it’s free” – which of course goes back to one of the rare instances where the English language (which has more words than any other living language) is actually lacking specificity. “Free” can mean many things, including “free as in speech” or “free as in beer” – usually Richard Stallman is given credit for first homing in on this distinction. I like the Wikipedia article on Gratis vs Libre – except that I would have used the terms “umsonst” and “frei”…

 

But as I discussed in a previous post, in general it’s not a reasonable idea to run upstream open source software in production. And even if a productized version of open source software may still be (mostly) free (libre), it usually stops being free (gratis). Now mind you, that doesn’t mean that you have to go to an open source vendor and buy a commercial product built on top of an upstream project. You are certainly able to do the work that it takes to bring such software into production yourself. But please understand – all you are doing is trading capital expenditure (the cost of buying a product) for operational expenditure (the engineering effort to take that project to the point where you can run it in production).

 

But what is the process to get from a project to a product – or more specifically, software you can reasonably run in production? It starts with some basics. You need to work based on an open source project that is alive and well and provides you with enough to work with. That means public source control, a bug tracker that is actively managed, and ideally contribution guidelines that allow you to engage and participate. For many projects these days the answer to all three is GitHub, but of course many projects live outside of that ecosystem as well.

 

Another item that I consider an important hallmark of a viable project is the habit of providing meaningful Changelogs. That allows you to get a much better idea of what changed between versions. And then of course there is the question about the release strategy, release cadence, stable branches and all these other aspects that allow you to understand what you are dealing with.

 

Once you have these basics covered, start with a solid security review. What’s the attack surface? Are there well-thought-out and documented configuration recommendations? And what about all the dependencies: do you understand their security considerations, attack surface, release cadence, etc.? Really do this. Regularly. Reliably. And reproducibly. You’ll thank me later.
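
One small, automatable piece of reviewing dependencies reproducibly is knowing exactly which versions you reviewed. The following is only a sketch under assumptions that are mine, not part of the original argument: it assumes a Python-style requirements list with `name==version` pins, and the names `PINNED` and `unpinned` are hypothetical. It flags every entry that is not pinned to one exact version, since those are the ones whose content can change between two reviews.

```python
import re
from typing import List

# Assumption: entries look like "name==version", optionally with extras,
# e.g. "requests[socks]==2.31.0". Wildcards such as "==1.*" do not count as pinned.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9,._-]+\])?==[A-Za-z0-9.+!_-]+$")


def unpinned(requirements: List[str]) -> List[str]:
    """Return the entries that are not pinned to one exact version."""
    result = []
    for line in requirements:
        # Drop trailing comments and surrounding whitespace.
        entry = line.split("#", 1)[0].strip()
        if entry and not PINNED.match(entry):
            result.append(entry)
    return result
```

Run against every release candidate, a check like this could fail the build whenever the list it returns is not empty. It says nothing about whether a pinned version is safe; that part remains the actual review.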

 

Create automated tests. The project should have both unit and integration tests, but of course those need to be extended to cover the actual use in production: the full stack, the infrastructure, and the scale at which you expect to run this solution. And then think about the challenges that you face in bringing this into your production environment and data center(s), and maintaining it there. How is your solution deployed (again, this requires the whole stack, not just an individual project)? How are you installing upgrades, and what about roll-backs if something goes wrong? How do you maintain configuration across those changes? How do you back up your data (ideally in ways that let you restore it if needed)? And all of that of course has to be done in an automated fashion. Anything that requires manual intervention or undocumented “knowledge” by the admin staff is by definition not production ready.
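
To make the point about upgrades and roll-backs concrete, here is a minimal sketch of the control flow, not a definitive implementation. Everything in it is hypothetical: `deploy` and `is_healthy` stand in for whatever your deployment tooling and monitoring actually provide, and `upgrade_with_rollback` is a name I made up for illustration.

```python
from typing import Callable


def upgrade_with_rollback(
    deploy: Callable[[str], None],
    is_healthy: Callable[[], bool],
    current_version: str,
    new_version: str,
) -> str:
    """Deploy new_version; redeploy current_version if the health check fails.

    Returns the version that is running when the function returns.
    """
    deploy(new_version)
    if is_healthy():
        return new_version

    # The health check failed: roll back without waiting for a human.
    deploy(current_version)
    if not is_healthy():
        # Even the roll-back did not help; this must page someone.
        raise RuntimeError(
            f"rollback to {current_version} did not restore a healthy state"
        )
    return current_version
```

The point of passing `deploy` and `is_healthy` in as parameters is that the same logic can be exercised in a test with fake implementations long before it touches a real data center. A real version would also have to handle configuration and data migrations, which this sketch ignores.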

 

And then of course comes the interesting question of scaling. Within your data center, across geographies. And questions like workload migration or load balancing.

 

By now it should be clear that this process is entirely non-trivial, and I’m sure I forgot at least twenty more things that you’ll need to work through.

 

So, what’s the advice?

 

Think Big.

Think Context.

Think Security.

Think Automation.

And do that for every release. Regularly. Reliably. Reproducibly.

 

And this will put you on the path to implementing a process that will get you from the rapid innovation in an open source project into production.