Creating Optimal Reviews

Published July 19, 2011 in Archives

Geoff Crain

Everyone should know by now what the benefits of code review are – even if you think they are just theoretical. Finding defects sooner, sharing knowledge, “pairing” across timezones. But are you actually getting these out of your process? Taking just the worst of the reviews that I see, I would say no. And probably most of the time. Particularly if you are a Code Review Denier. If you were actually seeing these benefits, code review would be a good thing, right? Please correct me if you think I’m wrong.

Using data from Atlassian’s internal Crucible instance, and my own knowledge from what I see from our customers, this post will hopefully explain why code review might seem hard, and how with a small amount of effort, it can not only be easy, but you’ll also get a lot more benefit from the whole process.

You are doing it wrong

Or at least, a lot of people are. I know this because I’ve been an Atlassian Crucible developer for about 4 years, and I’ve seen code review at its worst. Its really painful to me, because I also see code review at its best. And when I hear people blaming their tools, or worse – the code review process, it makes me want to start throwing things.

Two common behaviours

Overloading (Too many files) – Someone has been working on a branch for a week or a month, and finally they think its ready for someone to look at. So they shove it all into one big review.
Overcrowding (Too many reviewers) – Maybe there’s a complicated change. So just to be sure, you’ll add an extra reviewer. Or two, or three.

These keep coming up, all the time, because people don’t realize the negative impact these behaviors can have. You’re going to see just how big that impact can be.

Reviews of unusual size

Problem

Why is it a bad thing to have lots of files in a review? Well, as you’d expect, most files in a review aren’t isolated. They interact with multiple other files. To review one, you have to figure out what other files it touches, find them, read and understand them, before finally coming back to the original file. These overheads grow exponentially with the number of files. And as the number increases to 10, 20, 100 files, their costs become huge.

Analysis*

All files being equal

This is lines of code changed per file against the number of files. Each file in a review is roughly the same size, regardless of file count. So you’d expect that each file would take about the same amount of time to review. A graph of this, then, would be a flat, horizontal line.

Time spent reviewing

But it’s not flat. It’s not even close to flat. It plummets in just the first 10 files, and continues to drop after that. People are spending less time per file – they are less thorough – when there are more files in a review.

This code isn’t even being reviewed. And that’s with just 90 files. How big are your largest reviews?

Time spent per line

This is time spent against lines of code, and again, you can see that people are spending less time as the review grows. People are spending seconds per line, compared with… pretty much nothing. In large reviews, they’re looking at hundreds of lines per second.

Solution

When you look closer, you find that most files in these larger reviews aren’t actually directly related. Reviewers are left to figure out the mess on their own, and its basically because the author was lazy.

This pain is unnecessary. Don’t be lazy. Spend some time removing the cruft, stripping the review down to its bare essentials. If its all absolutely essential, break it up into smaller focused reviews.

For any non-trivial review, you should also just tell the reviewers what to do. In comments or objectives, the author can point to the more important stuff, so reviewers will know where to focus their efforts.

Overcrowding

Problem

A problem I find more interesting, and is perhaps less intuitive is the number of reviewers. Sometimes, for certain changes, people tend to think its a good idea to add more reviewers. You might think there are some valid cases, e.g notifying people of a change, but each extra reviewer adds more complications.

When there are more reviewers, they can grow complacent. They’re only human after all, and intentionally or not, they can and will assume that others will do the work.

Analysis

Reviewers spend less time

You can see how this complacency grows as there are more reviewers. Each additional reviewer adds a little bit less value.

Whats really interesting here though is this drop between 3 and 6 reviewers. When there are 6 reviewers, they are spending the same amount of total time as when there are 3.

Each additional reviewer has reduced the amount of effort given by the other reviewers. In our case, by 50 percent!

Reviewers find less defects

And its not just time that’s affected here. According to this graph, as there are more reviewers, there are actually less defects raised – less defects in total!

So again, each addition reviewer has reduced, this time the quality of effort given by other reviewers.

Solution

In hindsight, there are often at least a couple of people that don’t need to be there. They just don’t add value. Don’t have a go at the individuals – its often an accumulation other factors, rather than their competence.

There’s a feature in Crucible that suggests reviewers for you. It bases this on how much they’ve had to do with the files in the review, as well as their work load.

Being one of my successful ShipIt projects, I totally recommend using at least a similar approach. Choose a focused group (or pair, or person even) to review your code.

For each reviewer, consider what value they’re actually adding. Then consider how much they’re going to cost.

If you really have to add more people, tell them individually what to do, what they’re there for.

You can also let people opt out of reviews. It will save everybody’s time and it will give you more accurate confidence in whats actually been reviewed.

In summary

Two steps in the right direction

Avoid overloading

Reviews should be

concise – don’t let cruft get in the way
contained – if there are two parts, create two reviews
guided – tell people how to review it

Avoid overcrowding

Reviewers should be

selected with care – don’t blindly choose reviewers
focused – if its a specialist change, choose a specialist dev
guided – tell people individually what they should look out for

Why do we create reviews?

As I wrote in the beginning. Finding defects sooner, sharing knowledge, “pairing” across timezones. Often people see the code review creation as a final step of development. Its not! I’m going to compare Crucible / code review to Jira / issue tracking. What is the goal of issue tracking. Its not to raise issues. Its to get them implemented. Creating the issue is the first step (thought processes / finding bugs are really a pre-step). The team then works on scheduling and implementing the fix. But most importantly (most relevant to my argument, at least) is that you often have to go back to the Reporter and ask for more details. If the issue is created well, with version numbers, steps to reproduce, maybe some screenshots, it makes the whole process easier. This has become best practice.

Now, how does this translate to code review? Creating the code review is also the first step. Again, there is a pre-step: developing an actual change. But to achieve the benefits of code review (finding defects…) it can’t be the final step. The team (reviewers) has to read it, understand it, and raise any improvements. Often details are missing, and as you can see in the graphs, often developers just go “meh”. If the review is created well, with only a small change, relevant reviewers, a detailed description and maybe some some screenshots, it makes the whole process easier.

This is best practice.

What does “created well” mean?It means that you generally have to consider your reviewers. Think about how they are going to achieve the goals of code review. Creating reviews in certain ways can have a huge impact on their efficiency and effectiveness, the dramatic trends in the graphs show this.

As an author, you want to make it as easy for them as possible, and there are no excuses. 5 minutes for an author costs your team, your company a lot less than 10 minutes per reviewer.

*How does your team rate?

What I’ve shown you is how these problems affect Atlassian, across thousands of reviews from Jira and Confluence, and other teams. They may affect you differently. However, all the graphs were generated in a crucible plugin. Download Crucible Review Statistics and install it on your own Crucible instance. You can understand how your reviewers think and optimise code review for your team.

You can also contribute any changes you want, raise issues for any features you want, or even fork the plugin. Its all open source at bitbucket.

Note: this post has been adapted from my Atlassian Summit 2011 lightning talk

You are doing it wrong

Two common behaviours

Reviews of unusual size

Problem

Analysis*

All files being equal

Time spent reviewing

Time spent per line

Solution

Overcrowding

Problem

Analysis

Reviewers spend less time

Reviewers find less defects

Solution

In summary

Two steps in the right direction

Avoid overloading

Avoid overcrowding

Why do we create reviews?

*How does your team rate?

More in Archives

5 reasons to try Team Calendars in Confluence Cloud Premium

Three ways to get the most from your Confluence personal space

5 tips for building a powerful knowledge base with Confluence

Introducing Atlassian Cloud for Gmail