Effective Code Reviews and File Ownerships

SlideShare (a place to share and discover professional Knowledge on LinkedIn) started out on Ruby on Rails (RoR). With a focussed and agile team, RoR was a great solution because of the productivity gains it brought us, and it helped the team deliver a quality product at a fast pace over the years. I am going to refer to this Ruby on Rails code base as ‘The WebApp’, mainly because that is what it is called internally.

In the early days, Slideshare was a small team. It had a small and elegant code base and code commits were guarded by engineers who had been working with it since its inception. I remember that I accessed a model finder method from a view (an evil evil thing) a week or so after I joined and got chewed out by my onboarding ‘buddy’ (technical mentor). When I joined this scrappy startup in 2010, there were about 12 developers. There were about 20 controllers and a slightly higher number of models in the WebApp. During the first few weeks, if I had a question, I used to stand and ask out loud if anyone knew about a particular file. The answer was usually a chair or two away. We had one Scrum meeting each morning, which all of us attended and a few dialed into. In simpler terms, you knew who knew what.

Over time, we added more features and more colleagues. Our traffic continued to grow. The content hosted on SlideShare surged from a few million slideshows to ten times that. Soon our team didn’t fit in the conference room anymore so we split into two teams. A few months later, we had five teams. We were building exciting stuff, adding more colleagues and accumulating users like never before.

It has been a few years since then and ‘The WebApp’ now has a large number of controllers, even more models, Javascript files, ES6 files, CSS files, SCSS files, and views. We also have a large number of external integrations and third party services that ‘The WebApp’ uses. We also have a complex set of performance optimizations and other custom patches and configurations. Over 40 people are now contributing to this code base. Due to the more complex code, developers are constantly asking and answering more questions of increasing complexity

Finding the ‘guy’

The problem with more teams was that each team was now working in isolation. An engineer on one team was soon out of sync with what a colleague on another team was working on. You regularly ran into code or features that you or your team knew nothing about. To find an expert on a piece of code that you just touched, you had to either post on the internal group chat system or walk around the office asking people to point you to the right person. While previously adding a new engineer and getting him or her up to speed on the code used to take a day, now we had four different wiki pages documenting different parts of the application that were written from the perspective of different engineers on different teams. This wasn’t a problem in and of itself, it was a symptom of an underlying problem, and as we added more people to the team, the problem just got worse. Or it did, till about a month ago.

Finding who could tell you about a file or code wasn’t the only problem. SlideShare has a culture of effective code reviews to ensure that code shipped would be quality code. The culture and regular practice of Code Reviews was one of the foundations of ensuring quality and craftsmanship in development at SlideShare and was now showing the strains of not being able to handle the scale. Code reviews are only effective if the person reviewing a piece of code is familiar with the code he or she is reviewing. As demonstrated in 'Don’t Touch My Code! Examining the Effects of Ownership on Software Quality', a developer working on a code or a module he or she is not familiar with has a greater chance of introducing a defect or a bug. Since developers like to move around and work on new features, the next best thing is to get code reviewed by people who know the code well. However, it wasn’t simple anymore to identify the right reviewers and developers ended up circulating reviews within their immediate sub-teams, which did not have enough expertise to review the code.

Another outcome was that it made it harder and riskier for developers to identify dead code and remove it or refactor code and eliminate technical debt.

Available tools

As I mentioned before, code reviewing is baked into LinkedIn’s culture. Since most repositories at LinkedIn are modular and built as services, it was possible to have a set of owners who understood the entire code base. This allowed automated tools to verify every incoming commit for reviews from owners of the repository. This is enforced by using Git’s pre-commit hooks, and LinkedIn’s tools team provided the necessary systems to check for reviews from owners.

That approach would not work as effectively for a Ruby on Rails app with a feature set as diverse or a team as large as SlideShare. We needed something that allowed for finer control over the process of expressing and enforcing ownership. One of the nicer things about the systems I mentioned before is the flexibility they provide in validating commits. Given the diverse set of files in the WebApp, with no clear pattern, this flexibility allowed us to get a bit creative. It was just what I needed.

Files and Owners

Let me take a step back and explain what we did and why. What I have laid out until now is that being unable to identify who has an expertise in a particular area is the same problem as not having experts at all. When they have no one to get answers from, developers make their own judgement calls which have a big chance of being wrong. This is not good when you have a site outage that needs an immediate fix. You need someone who knows the implementation inside out. When no one owns code, nobody cares about its quality. A great read about code ownership I ran into recently is from one of the Chromium contributors and it explains this in more depth and clarity.

The best way to solve this problem is:

  1. Identify who owns a piece of code. It could be a file or a directory but each part of the project needs an owner.
  2. Make it easy for developers to find who the owner is.
  3. Enforce ownership by requiring sign-off from owners when a file they own is changed.

In a project this big with a history of over 260K commits, Git was the most effective way to identify the owners. It already had every developer’s activity. With a few minutes spent on writing a script in Ruby with some Shell commands thrown in, I had a an initial mapping of files to committers/potential owners. After a bit of filtering and manual cleanup, we had a nice list of approximately 1000 files with the names of people and thus the subteams that possibly owned particular files. A few hours of manual vetting and huddling with the developers themselves to explain the new process, all that remained was creating the files that LinkedIn’s ACL system would handle.

The flexibility the system provided meant that it would cross reference all files and their owners in a commit with its ReviewBoard URL to check if it had ‘ship-its’ from owners before allowing the commit.

In order to keep the list of owners manageable, we followed the LinkedIn’s preferred five owners per file restriction (not too many owners, but at the same time handling the developer out of office situation.)

Each ACL file (ACL files in this context means a file containing information for the Git pre-commit hook to validate commits for sign-off from code owners) had the following details:

// team_name.acl
owners: [user1, user2, user3, user4, user5]
paths: [
	app/controllers/login_controller.rb
	app/models/login.rb,
	app/views/login/*
]

Some files needed special handling. Common files are inevitable in shared code bases. This may include configuration files, version files etc. In our Ruby on Rails project, this meant that Gemfile, Gemfile.lock and routes.rb needed access by all teams and were added these to all the ACL files just to make it easier to develop. Every team could sign-off on changes to these files.

While analyzing files and their ownership, I also came upon a lot of files that either didn’t have clear ownership or were symptoms of technical debt. Some files were the application_controller.rb and the main_controller.rb file. These particular files (which are just examples) had methods which were put into the file simply to be available to all controllers (Including modules instead might have been a better choice). Files like these were added to a block.acl file and the owners of the files are responsible for steering developers into refactoring changes out of the file or taking ownership of files that did not have any clear owners.

Shipping Ownership ACLs

The goal of this entire process is to make it easy for those of us working on the code. A small sign of that was to ensure that developers who already had code ready to be shipped with sign-offs prior to enforcing ownership were not affected. It would have been an unpleasant surprise for developers to run into the new ACLs just when they were trying to commit. So we scheduled this change at a time when no commits or releases were planned. A minor detail, but an important one as the entire purpose of this move was to make developers’ lives easier.

Conclusion

Within a few weeks of these changes, a healthy trend has emerged. Reviews are more spread out than ever before and developers are actively submitting reviews to external teams. Teams that own pieces of the app now know if other teams have made changes to those files. Code is being refactored out of large files. Developers have a clear way to identify experts on the piece of code they are working on, and it anecdotally saves a lot of time during this process.