The pinnacle of refactoring is the removing of code by introducing a new feature. It may seem like a ridiculous notion, but I’m often surprised myself at how often it is a viable option.
The False Abstraction
Over the past weekend I was refactoring Leaf to get rid of a false abstraction. Two classes in the code, scope
and managed_variable_context
, derived from an interface called variable_context
. At the time it seemed like a good idea to make this interface, since most users of scope
needed only a tiny fraction of its ability.
It was the wrong thing to do. As I added features increasingly more functions of the scope
class crept into the interface. The other implementation, managed_variable_context
was also becoming less and less correct, dealing with only a few special cases.
Removing the abstraction would remove several hundred lines of code. Interestingly, it would require implementing a few minor features at the same time. Or rather, its removal would be what implements those features. In my case it solves a namespace export issue in Leaf.
In the general case removing an abstraction will introduce new features, perhaps not at the user level, but certainly at the code level. It generally involves the merging of a minor implementation’s behaviours into the dominant implementation. Often new tests are not needed; these new features aren’t genuinely new so their code is primarily covered by an existing unit test.
Some might cry foul here. That abstraction provides flexibility and removing it will make our code harder to change later. Here I invoke the YAGNI principle: if you don’t need it now it has no business being in the code. We can’t be retaining abstractions just because they might be useful sometime. Not being useful now is the dominating concern.
Nearly identical code
Most complex systems duplicate behaviours. This is a natural and normal scenario. When the requirements are first written, it would be foolish to define the most abstract system. You correctly start with the concrete use-cases and go from there. In time common behaviours and types start to emerge.
Clearly if two functions do the same thing one should be eliminated. But what if they just almost do the same thing? This is where adding a feature can sometimes help. Look for the little bits of code that are missing from each function. What would it take to make them the same function?
I often find the addition or change of something quite small is all it takes. Perhaps I remove an artificial limitation or superficial validation check on one object. Maybe I change the length of a field, or extend the set of operations available. In Leaf I simply exposed a new keyword, multi
, that exposed a previously internal only ability. In any case, I add a test case for the new feature and then drop one of the functions.
Nothing says the newly exposed feature must actually be visible to the user. Don’t fall into the trap of thinking all the internal features must actually be usable in the user interface.
Fishy tight integration
When I implemented shared closure support in Leaf, I used a cheap approach to get it done quickly. This involved creating a suspicious contract between the typing code and the IR generator. It introduced a requirement that one module relies on a specific behaviour of another. That behaviour was absolutely not part of the proper interface between the modules.
These tight integrations create scar tissue in the code. As time goes on both modules attempt to retain and workaround the odd structure. Instead of being able to refactor at will they are beholden to an artificial constraint. It’s the type of code that starts to spawn several if
statements and asserts
.
Often the only way to remove such problems is by introducing a new feature at the interface level. Module A needs some way to communicate to Module B what it is actually trying to achieve. An interface change could be on an actual language interface, a high-level protocol change between two services, a schema change, or even a file format change. Once the change is made both sides can cleanup all their silly patchwork and proceed with unhindered refactoring.
Tight integration belongs to a class of essential-to-fix issues. The longer they linger the worse they get. They often stick around since some people are unwilling to modify an interface. Sometimes the term “feature freeze” is used to block the change, since technically it is a new feature, even though the result will be less code, and cleaner code.
The goal is less code
My primary goal in refactoring isn’t just cleaner code, but less code. Clean code is obviously nice to read, but simply not having code is even easier.
It may seem counterintuitive that the path to less code is often through new functionality. The removal of a class, merging of functions, or cleaning up sometimes hinges on minor feature additions. Mostly these are small internal features, but sometimes they ripple to higher levels. This is a win-win situation, less code, more functionality.
I should note that sometimes I do manage to remove several hundred lines of code, but more often I only remove a handful of lines. Even if I remove no lines total, I still consider it successful. Having a new feature, and cleaner code, without increasing code size is always a good thing.
The next time you need to refactor, first look to what added functionality would make it easier. And the next time you need to add a feature, look to what code can be removed to make it possible.