Changes containing more than one fix are a bad development practice

Committing multiple unrelated fixes in a single code change is a bad practice in software development. This seemingly innocuous habit has a number of downsides and can significantly impact project efficiency, code quality, and overall team productivity.

I was reviewing a change for an open source project and was pretty sure part of it was wrong, but it was doing a bunch of other stuff I have seen all of the above happen multiple times. These are not made up scenarios, they really happened, wasted everyone’s time, decreased our productivity, and the project was worse off for it. which might have solved the issue I spotted. It took another fifteen minutes before I realized that the change contained two changes and the other stuff was actually a separate fix. The fixes touched the same code area and the commit message made it sound like both changes were effected by the same bug, but that was not the case. I am pretty sure that fix number one is incorrect, but because it is combined with fix number two I couldn't prove it without investing significantly more time. The commit could have been small and easy to talk about in a bug report, but now it will require several iterations taking hours to days to weeks before it will be resolved.

What is so wrong with a commit that contains multiple separate changes?

It makes the reviewers job more difficult

When reviewing a change that contains two separate bug fixes the reviewer has to mentally figure out which changes go to which fix. This makes reviewing harder and can take longer. The reviewer is more likely to make mistakes. The worst mistake is if the reviewer only verifies one fix and not the other.

It makes the authors job more difficult

The first reviewer of any change is yourself. The more complicated it is, the higher the possibility you will miss something. This can cause required iteration on the change or worse potentially fixes after it is merged and problems are found there.

It hurts future readers

In the future when someone looks at that change they will be in the same boat as the reviewer and have to take the extra time and effort and can make the same mistakes. Except this time they don't have you to ask questions and probably won't look up the bug in the bug database (if it still exists) further increasing the likelihood that they are interpreting the change incorrectly.

It reduces the velocity of the project

It is rare for a change to be perfect the first time around and the bigger they are the higher the likelihood that they will have a few rounds of tweaking before they can go in. If the change has two separate fixes, one fix will usually be done before the other, but by tying them together in one commit you are denying the project the finished fix until the second one is ready. This might be an hour or a day, but it could be months.

It delays small fixes

Often when coding one will spot small fixes; maybe a documentation typo, a missing unit test edge case, etc. These changes by themselves are easy and quick to review and merge by anyone. But when mixed in with larger changes they only can be merged once the larger change is ready. These are often the type of changes that are added into other more complex changes.

It makes reverting harder

Changes are not always perfect and sometimes need to be reverted. Not infrequently when reverting you want to do it asap. When there are two fixes in a single change you can not just do something simple like "git revert [sha]", you might have to dissect the change and revert only half of it. Once again you have to mentally figure out what part of the change goes to which fix. And because we are not perfect (hey we had to revert something remember) there is a chance that you will revert it incorrectly. I have never seen someone that split a revert test and verified that the other half they left behind still works. Worst case scenario the person reverting the commit doesn't realize the commit is two fixes and reverts the whole change assuming it only was a single fix and no one spots it until the regression bug report from the user rolls in.

Modern tools make splitting easy

This practice is typically found in cases where both fixes are in the same file. With older revision control systems like SVN or Perforce splitting multiple fixes in a single file was a non-trivial task. But when using modern revision control systems they all support easy splitting of changes.

When writing a commit message if you find yourself writing multiple paragraphs about different topics, listing multiple different bug id's, using words such as "while I was in this file ..." or even just the word "also" it is a pretty good hint that you probably want to step back and split your commit into two.

Creating patches with multiple fixes is a bad practice that wastes time and can cause errors. And if you are asked to review a commit that should be two, reject it and have them split it.

Git

Here are some Git specific tips:

If you are working in a file and notice something else there that needs to be changed you have options:

git branch - Branching in Git is cheap, make a branch for your existing code or commit and start a new branch for the new fix.

git stash - Git stash will store your changes in a hidden branch while you make the other fix in its own commit and then use Git stash apply your changes and continue your work.

git add -p or git add -i - When it comes time to make the commits rather than adding the whole file to be committed, Git add -p will prompt you one by one through each chunk in the file to find out if it should be added to the staging. Only add the relevant parts to a fix, commit and then do the same to the other fix. A more detailed explanation with examples can be found on this blog entry on git add -p or in the git add man page. (bonus tip: git checkout -p exists and does exactly what you think it does)

Splitting commits

If you already have committed a patch and only realize after the fact that it should have been two Git is still there for you. If it was the last commit you can git reset HEAD^ and then use git commit -p to add them as two commits (You can even use git commit -c [sha] so you don't have to re-type the half of the commit message that matters). And if it isn't the top commit that is okay too, you can perform an interactive rebase and mark the commit to be edited. When the rebase reaches that commit, use git reset HEAD^, add the multiple new commits, and finish the rebase. The git rebase man page has detailed instructions on how to split commits.

Using Git Hooks

Because Git is a distributed revision control system the infrastructure to support commit hooks exists on your local machine too. To use hooks and automate the process of spotting commits that should be split of you can create a one liner commit-msg hook that greps for the word "also" and outputs a warning.

Previous
Previous

Code analysis reporting tools don't work

Next
Next

Lessons from open source software