Programming

Removing code by adding features

The pinnacle of refactoring is the removing of code by introducing a new feature. It may seem like a ridiculous notion, but I’m often surprised myself at how often it is a viable option.

The False Abstraction

Over the past weekend I was refactoring Leaf to get rid of a false abstraction. Two classes in the code, scope and managed_variable_context, derived from an interface called variable_context. At the time it seemed like a good idea to make this interface, since most users of scope needed only a tiny fraction of its ability.

It was the wrong thing to do. As I added features increasingly more functions of the scope class crept into the interface. The other implementation, managed_variable_context was also becoming less and less correct, dealing with only a few special cases.

Removing the abstraction would remove several hundred lines of code. Interestingly, it would require implementing a few minor features at the same time. Or rather, its removal would be what implements those features. In my case it solves a namespace export issue in Leaf.

In the general case removing an abstraction will introduce new features, perhaps not at the user level, but certainly at the code level. It generally involves the merging of a minor implementation’s behaviours into the dominant implementation. Often new tests are not needed; these new features aren’t genuinely new so their code is primarily covered by an existing unit test.

Some might cry foul here. That abstraction provides flexibility and removing it will make our code harder to change later. Here I invoke the YAGNI principle: if you don’t need it now it has no business being in the code. We can’t be retaining abstractions just because they might be useful sometime. Not being useful now is the dominating concern.

Nearly identical code

Most complex systems duplicate behaviours. This is a natural and normal scenario. When the requirements are first written, it would be foolish to define the most abstract system. You correctly start with the concrete use-cases and go from there. In time common behaviours and types start to emerge.

Clearly if two functions do the same thing one should be eliminated. But what if they just almost do the same thing? This is where adding a feature can sometimes help. Look for the little bits of code that are missing from each function. What would it take to make them the same function?

I often find the addition or change of something quite small is all it takes. Perhaps I remove an artificial limitation or superficial validation check on one object. Maybe I change the length of a field, or extend the set of operations available. In Leaf I simply exposed a new keyword, multi, that exposed a previously internal only ability. In any case, I add a test case for the new feature and then drop one of the functions.

Nothing says the newly exposed feature must actually be visible to the user. Don’t fall into the trap of thinking all the internal features must actually be usable in the user interface.

Fishy tight integration

When I implemented shared closure support in Leaf, I used a cheap approach to get it done quickly. This involved creating a suspicious contract between the typing code and the IR generator. It introduced a requirement that one module relies on a specific behaviour of another. That behaviour was absolutely not part of the proper interface between the modules.

These tight integrations create scar tissue in the code. As time goes on both modules attempt to retain and workaround the odd structure. Instead of being able to refactor at will they are beholden to an artificial constraint. It’s the type of code that starts to spawn several if statements and asserts.

Often the only way to remove such problems is by introducing a new feature at the interface level. Module A needs some way to communicate to Module B what it is actually trying to achieve. An interface change could be on an actual language interface, a high-level protocol change between two services, a schema change, or even a file format change. Once the change is made both sides can cleanup all their silly patchwork and proceed with unhindered refactoring.

Tight integration belongs to a class of essential-to-fix issues. The longer they linger the worse they get. They often stick around since some people are unwilling to modify an interface. Sometimes the term “feature freeze” is used to block the change, since technically it is a new feature, even though the result will be less code, and cleaner code.

The goal is less code

My primary goal in refactoring isn’t just cleaner code, but less code. Clean code is obviously nice to read, but simply not having code is even easier.

It may seem counterintuitive that the path to less code is often through new functionality. The removal of a class, merging of functions, or cleaning up sometimes hinges on minor feature additions. Mostly these are small internal features, but sometimes they ripple to higher levels. This is a win-win situation, less code, more functionality.

I should note that sometimes I do manage to remove several hundred lines of code, but more often I only remove a handful of lines. Even if I remove no lines total, I still consider it successful. Having a new feature, and cleaner code, without increasing code size is always a good thing.

The next time you need to refactor, first look to what added functionality would make it easier. And the next time you need to add a feature, look to what code can be removed to make it possible.

2 replies »

  1. I’ve noticed many times that, if you were to graph the size and complexity of a project you’re working on, at first it grows with time. As you add features, the project naturally becomes “bigger.”

    But at some point you begin to really understand the abstraction of the problem, plus you’ve had a chance to throw code at it and see what works and what’s clumsy. At that point, the graph of size and complexity actually begins to trend downwards even while adding new features.

    When you work on a problem for a while, you get good at solving it. For programmers, this seems to imply writing code that’s flexible and easy to refactor. One more place where programming defies cliche: we do sweat the small stuff and we sometimes do change horses in midstream. :)

    • I’ve got a long established piece of code (6 years maybe) which has had a lifetime like the one you describe. It definitely seems to have gone past the point of realising the fundamental structure which best describes/organises the domain it provides for.

      We often get support developers tasked with adding functionality and so they take what the user wants literally and put a chunk of new code. I can often explain/show them how removal/simplification of the code can add the feature with very little new code. I guess that’s all about them seeing the fundamentals I’ve discovered as it got refactored into what it is now. Some of them take more time to look and I just need to confirm with them that the code explained itself!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s