Changes containing more than one fix are a bad development practice

Putting multiple fixes in a single change request is a harmful practice in software development. This seemingly innocuous behavior has a number of downsides and can significantly impact project efficiency, code quality, and overall productivity.

While reviewing a change for an open source project I immediately spotted an issue, but spent a while trying to figure out if all of the rest of the changes somehow worked around the new issue. It took another fifteen minutes before I realized that the change contained two separate unrelated changes. The fixes touched the same code area and the commit message made it sound like both changes were affected by the same bug, but that was not the case. I was pretty sure that fix number one is incorrect, but because it is combined with fix number two I couldn't prove it without investing significantly more time. The commit could have been two small changes, each easy to talk about in a bug report, but now it requires several iterations before it can be merged.

I've observed many times where bundled changes significantly wastes time. Each one of these I encountered multiple times and now have a rule that every change request must only contain one change.

It makes the reviewers job more difficult

When reviewing a change that contains two separate bug fixes the reviewer has to mentally figure out which changes go to which fix. This makes reviewing harder and can take longer. The reviewer is more likely to make mistakes. The worst mistake is if the reviewer only verifies one fix and not the other.

It makes the authors job more difficult

The first reviewer of any change is yourself. The more complicated it is, the higher the possibility you will miss something. This can cause required iteration on the change or worse potentially fixes after it is merged and problems are found there.

It hurts future readers

In the future when someone looks at that change they will have the same difficulty as the reviewer and have to take the extra time and effort and can make the same mistakes. Except this time they don't have you to ask questions and probably won't look up the bug in the bug database (if it still exists) further increasing the likelihood that they are interpreting the change incorrectly.

It reduces the velocity of the project

It is rare for a change to be perfect the first time around and the bigger they are the higher the likelihood that they will have a few rounds of tweaking before they can go in. If the change has two separate fixes, one fix will usually be done before the other, but by tying them together in one commit you are denying the project the finished fix until the second one is ready. This might be an hour or a day, but it could be months.

It delays small fixes

Often when coding one will spot small fixes; maybe a documentation typo, a missing unit test edge case, etc. These changes by themselves are easy and quick to review and merge by anyone. But when mixed in with larger changes they only can be merged once the larger change is ready. These are often the type of changes that are added into other more complex changes.

It makes reverting harder

Changes are not always perfect and sometimes need to be reverted. Not infrequently when reverting you want to do it asap. When there are two fixes in a single change you can not just do something simple like "git revert [sha]", you might have to dissect the change and revert only half of it. Once again you have to mentally figure out what part of the change goes to which fix. And because we are not perfect (hey we had to revert something remember) there is a chance that you will revert it incorrectly. I have never seen someone that split a revert test and verified that the other half they left behind still works. Worst case scenario the person reverting the commit doesn't realize the commit is two fixes and reverts the whole change assuming it only was a single fix and no one spots it until the regression bug report from the user comes in.

Modern tools make splitting easy

This practice is typically found in cases where both fixes are in the same file. With older revision control systems like SVN or Perforce splitting multiple fixes in a single file was a non-trivial task. But when using modern revision control systems they all support easy splitting of changes.

When writing a commit message if you find yourself writing multiple paragraphs about different topics, listing multiple different bug id's, using words such as "while I was in this file ..." or even just the word "also" it is a pretty good hint that you probably want to step back and split your commit into two.

Creating patches with multiple fixes is a bad practice that wastes time and can cause errors. And if you are asked to review a commit that should be two, reject it and have them split it.

Git

Here are some Git specific tips:

If you are working in a file and notice something else there that needs to be changed you have options:

git branch - Branching in Git is cheap, make a branch for your existing code or commit and start a new branch for the new fix.

git stash - Git stash will store your changes in a hidden branch while you make the other fix in its own commit and then use Git stash apply your changes and continue your work.

git add -p or git add -i - When it comes time to make the commits rather than adding the whole file to be committed, Git add -p will prompt you one by one through each chunk in the file to find out if it should be added to the staging. Only add the relevant parts to a fix, commit and then do the same to the other fix. A more detailed explanation with examples can be found on this blog entry on git add -p or in the git add man page. (bonus tip: git checkout -p exists and does exactly what you think it does)

Splitting commits

If you already have committed a patch and only realize after the fact that it should have been two Git is still there for you. If it was the last commit you can git reset HEAD^ and then use git commit -p to add them as two commits (You can even use git commit -c [sha] so you don't have to re-type the half of the commit message that matters). And if it isn't the top commit that is okay too, you can perform an interactive rebase and mark the commit to be edited. When the rebase reaches that commit, use git reset HEAD^, add the multiple new commits, and finish the rebase. The git rebase man page has detailed instructions on how to split commits.

Using Git Hooks

Because Git is a distributed revision control system the infrastructure to support commit hooks exists on your local machine too. To use hooks and automate the process of spotting commits that should be split of you can create a one liner commit-msg hook that greps for the word "also" and outputs a warning.

Previous
Previous

Code analysis reporting tools don't work

Next
Next

Lessons from open source software