On Writing a Great Pull Request

Getting into the mindset of efficient pull requests

Nowadays, pull requests are essential tools to enable seamless collaboration within software teams. They’ve changed our workflows for the better and became a standard development practice.

Although there are many useful guides on the internet about increasing the quality of pull requests, this post aims to provide my own take on the matter: what I think are the essential characteristics of a great pull request. The focus will be on the pull request itself, not on the accompanying code. I shall note that this is my personal opinion, and some points may not apply to your case.

Don’t make me think

Bob is opening a pull request for something he’s been working on over the past few weeks. It’s a complex feature that touches legacy parts of an old codebase, and the changeset is comprised of 50 files. A huge mix of red and green can be seen in the diff. He writes the following title:

Implement IDK integration

And along with it, the following description:

Reference: <a href="http://github.com/cool_project/issues/666">http://github.com/cool_project/issues/666</a>

Lastly, he clicks on “Create Pull Request”. That’s great, now he can relax and move on to the next thing!

Joe has been assigned to review Bob’s pull request. Of course, the title is the first element to stand out: “Implement IDK integration”. But then he asks, “what’s IDK?” The description doesn’t mention what IDK is. “OK, let me Google up”. He clicks on the first result, but it does not seem to be useful. He then heads to the next result. After a few random tries, he figures out what IDK is and what it’s used for.

Joe flips back to the pull request and a link to the issue calls his attention. “Let’s see what we’ve got in there”. Even though the issue’s description seems utterly confusing, he tries to fish something interesting out of it. He scrolls through a long thread of comments. The tenth comment is very lengthy and introduces new details, also changing the state of what’s been written in the main description. As if that was not enough, most comments force Joe to keep track of newfound details, making up a not-so-funny jigsaw puzzle.

After wasting yet more time, Joe acknowledges the different parts of the feature and the business goals. However, it’s a considerable amount of context to keep in his head at once, so he may even refer to the issue again in case he forgets something.

With that out of the way, Joe can finally start reviewing the code. He clicks on the “Files changed” tab, and unsurprisingly, he can barely understand what’s been changed in the first file, and how that change relates to the next file, or any other file for the matter. Joe spends 1 hour, perplexed, looking at the code, bouncing back and forth, but he still doesn’t know what to do. Bob isn’t online or even available for Joe to ask questions.

Joe finally figures out what the entry point of the code is, and what files he should look up next! He checkouts Bob’s branch on his computer and starts playing with the code. He spots a code change that seems absurd, and leaves a comment with questions. Actually, one among many!

Hours go by, and Joe stumbles across something that invalidates his first comment, but at this point it’s just a supposition. Time passes, and Joe’s supposition is suddenly confirmed. He finally gets it: Bob was working around a pressing issue of the legacy codebase. Now it makes total sense!

Joe has time boxed his review because he’s a busy fella and has other things to do. He finishes his review and submits it. On the next day, Bob replies to Joe’s comments, and some of his answers explain why the code looks and behaves a certain way. After a few rounds of code review, the reasons behind key parts of Bob’s design suddenly become clear. But at this point it might be too late to suggest design changes.

What’s the problem?

It’s inevitable. If a pull request is not descriptive, people will be lazy or not willing to spend much time on it. They’ll probably comment on some specific technicality or syntax detail, and pretend to have submitted a meaningful review. Or they will complain and ask for more details. Or they will simply think : ”I’m the worst code reviewer in the world. How can I not understand this? Others get it so much faster than me!” — and that’s not what’s actually going on.

This is not good at all. Neither for the reviewer, nor the company, nor the project, not even you! Code may be merged with bugs and the design may slowly hinder future development around it. And worse yet, the code may be solving the wrong problem, but there’s a chance nobody noticed this fact.

Everyone commits such mistakes, and that’s exactly what a code review is for: gathering people with diverse backgrounds and prying eyes to improve the overall quality of the work.

The most important ability of a developer: putting oneself in the shoes of others

Bob is the person working on the IDK feature. He’s gone through all the trouble of understanding what IDK is, what the issue is about, what the business goals are, the perils and pitfalls of the legacy codebase, what parts of the code are relevant, the approach to implementing it, and so forth. The devil is precisely in these details.

But each reviewer has to go through the same path as Bob, over and over again — especially if they are not familiar with the feature or important aspects around the feature. Which means the reviewer’s time is not being respected. What could have Bob done instead?

Clear out the path

The perfect pull request clears out the path for reviewers as much as possible. Imagine you are the reviewer. How would you go about understanding your own pull request?

Explain the issue

Sometimes, just providing a link to the issue is not enough. The problem is that issue threads are chronological and often full of noise, therefore important information may be scattered or just plain inaccessible.

A great attitude is to explain the issue in the context of your changes. Whoever wrote the issue has a certain point of view about the matter, and you are in charge of materializing that view into code. More than anyone, you hold a lot of hands-on knowledge that could be transmitted along with your code.

Explain in a few sentences or paragraphs what you are dealing with, and provide relevant links in case the reviewer wants to go deeper. In the IDK example, it could be explained what IDK is, what the goals of integrating IDK into the application are, the touchpoints, and so on.

Explain what the feature is and why it’s being done

And also, provide a link to the issue along with the text. Best case scenario is when your explanation avoids a click to that link, but keep in mind that’s not always possible or feasible.

Provide context

Now we’re on the technical side of the force. What prerequisites does the reviewer need in order to delve into the problem you are solving? What can you show to the reviewer to make things easier to understand? Can you write a text? Can you provide a directory listing with the tree command? Would that help? What would help?

Provide a starting point and a path

Is there an entry point from where logic branches off? In web applications that’s usually the controller or a spot where a top-level component is instantiated — or something else entirely different. Just don’t assume it is obvious! You may even have a multitude of controllers, but in that case you should elect the most important ones.

After writing down these entry points, you may suggest a script for the reviewer to follow. In GitHub pull requests, you can’t reorder changed files manually, but you can write that information down in the description.

Provide visual resources

If you are doing frontend work, your PR can likely benefit from an annotated screenshot, a gif, or a video that represents the outcome of your work.

In other cases, you can resort to resources such as diagrams. For example, for showing how modules, components, or services interact between each other. Whether you choose to provide these or not, it’s up to your judgement!

Explain the code

Code is made of expressions, functions, classes, modules, files, namespaces, and so forth. Understanding a function, depending on how it’s written, may be easy enough. The real challenge is understanding how things are put together and how they provide value. And it’s likely that your reviewer is struggling exactly with that.

No matter how “clean” your code may appear to be, it’s a sea of information that people will try to make sense of. Therefore, take an extra leap and explain in a few sentences or paragraphs what your code does. People will thank you for that!

Take an extra leap and explain in a few sentences what your code does

Explain the design

Coding is a complex activity, and there are many ways to put a feature together — each with different tradeoff characteristics. Some decisions are not easy to make, but we have to make them all the time, right? That’s what we’re good at. And it boils down to design: finding out the best structure — whether existing or not — to accommodate code changes, taking into account the current state of affairs.

Never assume your design is obvious. There may be subtly interconnected parts making it up, and if you don’t explain it bluntly, people will take longer than necessary to understand your intents. So write about how you approached the problem. It will be much easier for anyone to chime in and suggest improvements.

Write about how you approached the problem

Simpler is better, but if you resort to a well-known approach or design pattern, make sure to spell it out in detail. That’s ubiquitous language familiar to developers, so use that to your advantage!

Comment along the PR

What if you have something specific to comment that relates to a certain part of the code but does not fit into the main description? For example, it would not help saying “In the function X of module Y, I did Z and W”. After all, the reviewer is still reading the description, so he may not have looked at the code yet.

That’s why you should review your own PR — usually more than once — and comment on any puzzling parts that would be better understood with the help of human language. Empty your cup before looking at your code and you will have a clue on what to explain and how to explain whatever it is you need to explain.

Always assume your description can be understood without looking at any code

Refer to historical information

Code has an implicit history that’s written throughout commits, issues, and other artifacts. And it’s likely that you had to look up these resources while seeking the best approach to deal with key parts of the code. That’s even more likely if your changes are broad or deep.

In that case, you may provide links to past issues, commits, pull requests, and briefly explain what they have to do with your changes.

Even if you haven’t done any research, it might be useful to explain in general terms how the pre-existing code plays out in your changes. You may also have to explain about code that’s not part of the diff, but has an implicit influence on your work.

Should it always be like that?

Absolutely not! The best possible situation is when you write the bare minimum that does not omit essential information. That’s exactly the challenge. If your description is very lengthy or elongated, it will discourage people from reading it.

When your pull request is simple — say a superficial change or bug fix — you can probably get by with just the issue’s link. And remember: you can’t explain everything. You shouldn’t. The goal is to enable easy discovery and allow your reviewer to focus on what’s important.

So, try to be concise and objective. Make sure your text has a nice progression and rhythm. Make sure the key elements to understanding your work are explicitly stated. Cut down on unnecessary words and sentences. Divide up sections logically. Think of the other person and be a sensible developer.

By doing that, you — the expert — is saving everyone’s time. Let’s say you can compose your pull request in 1 hour and save 4 hours per reviewer. That’s a big win for everyone!

Now make me think

Given that you’ve cleared out the path for reviewers, you can now focus on asking the questions that you consider to be important — in other words, what you are craving for feedback. List your requests explicitly in the description or via comments.

Keep in mind you are just facilitating the reviewer’s jobs as much you can. That does not mean the reviewer should not think outside of the box!

In any case, good reviews take time. But if you make it easier and smoother, they will take far less time, and the overall quality of the work will be better in the end. Thus, your pull request will reach peak efficiency — just as it ought to be!

I would love to hear what you think. If you have any comments or experiences, please leave it up below. Thanks for reading!

Thanks to Tiárli Oliveira.

We are hiring new talents. Do you want to work with us? become@codeminer42.com