In my last post, I reviewed the “fork and pull” method of open source contribution that has become the gold standard for those looking to build an iterative, standardized and secure open source project that is simultaneously inviting to new contributors. This time, I want to look examine what typically comes after a pull request is initiated: the code review.
When I first started regular code reviews, I adopted what I now call a “ low-quality code review approach.” Basically, I followed the notifications I received from GitHub in a purely linear fashion.
I’d receive a new pull request and read the changes line-by-line from the top. As I was doing that, I’d also try to keep in mind what else this affected in the code base. Then, as I finished with each file, I’d collapse these changes so that I had a placeholder for where I was and what I had done. Finally, when I got to the bottom, I’d comment and either block or approve the request.
Before long, I realized that there were several problems with this approach to code review.
First, I was missing the big picture. I was down in the weeds, trying to figure out what was going on without a plan or perspective for how each line related to the rest of the code base.
The approach was also shallow. I found issues, such as naming or formatting problems, but I wasn’t digging deep into the code to look at corner cases. So, I wasn’t finding, for example, the bugs that can exist in logic loops. Instead, I just found isolated issues.
Lastly, it was inefficient. The only way I could track what I’d reviewed was the GitHub web page. But if I reloaded that page, all the project files opened, and I had to figure out which I’d already read and where I was in the review. It also meant that when I wanted to really dig deeper, I had to find the related code. But I lacked a good map or perspective for looking at it from different angles.
Clearly, a better method existed. I began to think about what would constitute “high-quality code review,” and I came up with some ideas that apply beyond open source to any project you may work on.
On a meta level, code review is essentially quality assurance for the changes going into your code base. It’s your last chance to refine them into something you’re comfortable maintaining for what might be years to come.
Within that context, I realized I could focus on three actions to help ensure I did a high-quality code review.
Schedule review time in blocks – The first thing I did when I changed my approach was break my code review time into blocks of 45 to 60 minutes – enough time to deal with a small request or do a decent amount of a larger one before taking a break. Those breaks gave me the opportunity to put what I’d read in a wider context, allowing me to respond to individual ideas from a project-wide perspective. It also gave me time to formulate more constructive responses to particularly tough issues.
Make a checklist – I also started using a formal checklist to keep track of what I’d done and to guide me in making sure I’d covered everything I needed to, especially when working over multiple sessions.
Focus on quality feedback – Lastly, I started paying more attention to how I offered feedback. Feedback is probably the most important part of the code review process because more often than not, we only interact with contributors via comments. Good feedback is encouraging and positive. It requires empathy for the person you are responding to and centering the discussion on the solution proposed, not the person proposing it. Comments work best when they are educational and inquisitive – asking questions to understand what the contributor was trying to accomplish and then responding to the substance of the answer you receive.
Each of these actions had a material and positive impact on my review process, but they also helped keep my contributors – who are mostly working on their own time – engaged in the project.
They also inspired what I call my Humble Checklist approach to reviewing pull requests, which is a baseline I now apply to any pull request I work on.
The checklist forces me to take four different perspectives on changes coming in, regardless of scale.
First, I do an App Review. This is the 10,000-foot view from which I review the contribution in the context of the project as a whole. Here, I check the demo code, asking whether its use conforms to our guidelines and makes sense to use as written. If the work is on an existing component, I also check it for backward compatibility because people already rely on the code, and it can’t break on them.
Then, I move down to the services level, asking how the pieces connect and communicate with each other. Here, I’ll look at syntax, which means checking the project coding guidelines and making sure things are named the right way and have the proper location. I also pay attention to whether the code is intended to be public or private and whether that’s appropriate. I ask whether the API for the service makes sense – who will use it and how? And I review the specifications, looking at anything public to make sure it’s well tested. That becomes the surface of the project API, which is a contract we need to keep.
Next, I go into the child components, where I look at low-level communications issues between child and parent components. Again, I’m reviewing the new code for how it is used and checking syntax. I’m also noting when an input is added, whether outputs go where they should and whether either is public or private. I also review the specs for these as well to make sure they are properly tested.
Finally, I go back up to the parent components and apply the same set of checks as I have on the child components. This is also where I double-check the API for the component in the demos and make sure it conforms to what we want the software to look like for our users.
I also have a separate section in my checklist for specifications. On Clarity, the open source project I deal with most often, we break this down into four areas. We test:
- The high-level TypeScript API
- Template APIs
- View basics
- Component/service behaviors
I’m not suggesting you copy my exact checklist. Depending on what your project does, you might have other things to add. But the point is to have a checklist of some kind and to use it every time you do a review.
Yes, checklists can seem laborious, and you might catch most issues with a more linear, ad hoc code review system. But the Humble Checklist removes your ego from the equation and is more likely to result in real quality assurance – and if you have to break off the review at any point, you’ll know exactly where you left off and can be confident you haven’t missed anything when you pick it back up.
Stay tuned to the Open Source Blog and follow us on Twitter (@vmwopensource) for all the latest open source projects updates from VMware.