What is good code? Even if I say nobody cares about your code, its quality nonetheless influences a project. Functional correctness reigns supreme, but if we put that aside, what should we look at next? Cohesion and coupling are the two most significant aspects of code impacting software quality.
Cohesion is how well the bits of a module fit together. A set of functions, an interface, is considered cohesive when each function is closely related to the others. For example, log_error and log_warning are closely related and belong nicely together in a single log module. As functions diverge in their behaviour a module becomes less cohesive. For example, log_error and parse_json don’t seem to be related at all.
We can roughly measure cohesion by how succinctly the module can be described. Short and precise sentences are preferred to longer, verbose, and generic ones. Contrast “The log module records diagnostic messages” to “This module records diagnostic messages and parses json”. In the latter we couldn’t even give the module a name. If we can’t give a module a name there’s a good chance it lacks cohesion.
Words like “assorted” and “variety” are a potential indication of low cohesion. A module that provides “an assortment of calculations” doesn’t sound as cohesive as a module that provides “bond amortization and valuation”.
A highly cohesive module is easier to understand than a less cohesive one. While looking at the code, or interface, we can keep a single unifying theme in our head. As all functions relate to that theme it is easier to understand the code, and thus easier to extend the code, debug errors, or simply use the interface.
Cohesive modules tend to reuse the same types throughout their interface. The parameters to log_error should be the same as those to log_warning. Once we’ve learned the basic types used in a module it is far easier to use the rest of the module.
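To make the point about shared types concrete, here is a minimal sketch of a cohesive log module. The Log class, its records list, and the (source, message) parameters are assumptions chosen for illustration; only the names log_error and log_warning come from the text above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Log:
    """Hypothetical log module: every function records a diagnostic message."""

    records: List[str] = field(default_factory=list)

    def _record(self, level: str, source: str, message: str) -> None:
        # Private helper shared by all public functions.
        self.records.append(f"[{level}] {source}: {message}")

    # Both public functions take the same parameter types, so learning
    # one of them is enough to use the other.
    def log_error(self, source: str, message: str) -> None:
        self._record("ERROR", source, message)

    def log_warning(self, source: str, message: str) -> None:
        self._record("WARNING", source, message)


log = Log()
log.log_warning("config", "missing optional key")
log.log_error("network", "connection refused")
```

A parse_json function would have nothing in common with these signatures or with the private helper, which is one practical hint that it belongs in a different module.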
Clean and coherent interfaces also make a module much easier to replace and test. Since all the functions are related, it makes sense that we’ll have to replace all of them to swap out the module. Similarly, writing a mock object for testing becomes more straightforward. If there are too many unrelated functions in the interface we find our unit tests also lose focus.
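As a rough illustration of how a small, cohesive interface keeps a mock simple, consider the sketch below. The Logger protocol, MockLogger, and the load_port function under test are all hypothetical names invented for this example.

```python
from typing import List, Protocol, Tuple


class Logger(Protocol):
    """The whole interface a replacement or mock has to provide."""

    def log_error(self, source: str, message: str) -> None: ...
    def log_warning(self, source: str, message: str) -> None: ...


class MockLogger:
    """Test double: stores calls instead of writing them anywhere."""

    def __init__(self) -> None:
        self.errors: List[Tuple[str, str]] = []
        self.warnings: List[Tuple[str, str]] = []

    def log_error(self, source: str, message: str) -> None:
        self.errors.append((source, message))

    def log_warning(self, source: str, message: str) -> None:
        self.warnings.append((source, message))


def load_port(raw: str, logger: Logger) -> int:
    """Hypothetical code under test: parse a port, falling back to 8080."""
    try:
        return int(raw)
    except ValueError:
        logger.log_warning("config", f"invalid port {raw!r}, using 8080")
        return 8080


mock = MockLogger()
port = load_port("abc", mock)
```

Because the interface has only two closely related functions, the mock is a handful of lines and the test can focus on a single behaviour: the fallback and the warning it produces.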
Coupling is how dependent modules are on the inner workings of each other. Tightly coupled modules rely extensively on the specific state of each other, sharing variables and many types. Loosely coupled modules are fairly independent: they have a few well-defined APIs and share little or no data.
The extent of coupling between modules is determined by the complexity of the exposed interface, where every public function and type is part of the interface. A module that has a high ratio of public information compared to private information will result in tight coupling. A module that exposes a minimal interface will result in loose coupling.
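A minimal sketch of what a small exposed interface might look like in practice follows. The Inventory class and the restock_report function are hypothetical; the point is only that the caller depends on one public function and not on how the data is stored.

```python
from typing import Dict, List


class Inventory:
    """Hypothetical module with a minimal public interface."""

    def __init__(self) -> None:
        # Private detail: callers should not depend on this being a dict.
        self._counts: Dict[str, int] = {}

    def add(self, item: str, quantity: int = 1) -> None:
        self._counts[item] = self._counts.get(item, 0) + quantity

    def count(self, item: str) -> int:
        return self._counts.get(item, 0)


def restock_report(inventory: Inventory, items: List[str]) -> List[str]:
    """Loosely coupled caller: relies only on count(), never on _counts."""
    return [item for item in items if inventory.count(item) == 0]


inventory = Inventory()
inventory.add("bolt", 3)
missing = restock_report(inventory, ["bolt", "nut"])
```

If restock_report read inventory._counts directly, any change to the storage layout would ripple into it. Going through count() keeps that change local to Inventory.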
Transitivity is also an indicator of coupling. Say we have a series of modules, where each depends on the next in turn: A -> B -> C. Ideally module A knows nothing about module C. The more it does know, the more it is coupled to module C.
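The chain can be sketched with three tiny classes. The method names fetch, lookup, and describe are assumptions made up for this example; what matters is that A only ever talks to B.

```python
class C:
    """Lowest layer: hypothetical data source."""

    def fetch(self, key: str) -> str:
        return f"value-for-{key}"


class B:
    """Middle layer: the only module that knows C exists."""

    def __init__(self, c: C) -> None:
        self._c = c

    def lookup(self, key: str) -> str:
        return self._c.fetch(key).upper()


class A:
    """Top layer: depends on B's interface and nothing else."""

    def __init__(self, b: B) -> None:
        self._b = b

    def describe(self, key: str) -> str:
        # No reference to C here. Writing self._b._c.fetch(key) instead
        # would couple A to C through B's internals.
        return f"{key} = {self._b.lookup(key)}"


result = A(B(C())).describe("x")
```

With this shape, C can be swapped for another data source and only B needs to change.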
The higher the coupling between modules, the harder it is to understand what the code is doing. The more details two modules share, the more difficult it is to understand either of them. Independent code is great for readability since we can make sense of it in isolation.
The more coupled modules are the harder it is to replace them. This also means it is harder to write tests for them since it is unclear where one module ends and another begins. As any changes ripple through connected code, refactoring of highly coupled code is difficult.
Module isolation is very important for team structure. Programmers can be productive on individual modules without needing to understand the entire code base. They are also less likely to interfere with the work of others.
Be wary of the false abstraction while approaching coupling. Simply putting a new face on an API doesn’t decouple it.
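One way to read this warning is sketched below; it is an interpretation, not necessarily the exact case the text has in mind. RawStore, StoreFacade, and Catalog are hypothetical names. The facade renames the access path but still hands out the raw rows, so callers remain coupled to the row layout, while the catalog answers a question and hides that layout.

```python
from typing import List, Optional, Tuple


class RawStore:
    """Hypothetical low-level module: rows are (name, price_in_cents)."""

    def __init__(self) -> None:
        self.rows: List[Tuple[str, int]] = []


class StoreFacade:
    """False abstraction: a new face, but the row layout still leaks."""

    def __init__(self, store: RawStore) -> None:
        self._store = store

    def get_rows(self) -> List[Tuple[str, int]]:
        return self._store.rows


class Catalog:
    """Actual decoupling: callers never see rows or cents."""

    def __init__(self, store: RawStore) -> None:
        self._store = store

    def price_of(self, name: str) -> Optional[float]:
        for row_name, cents in self._store.rows:
            if row_name == name:
                return cents / 100
        return None


store = RawStore()
store.rows.append(("tea", 450))
leaky_price = StoreFacade(store).get_rows()[0][1]  # caller must know index 1 is cents
clean_price = Catalog(store).price_of("tea")
```

A caller of StoreFacade breaks as soon as the tuple layout changes, exactly as a caller of RawStore would. A caller of Catalog does not.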
Cohesion and coupling are two significant areas of code quality. Unfortunately, that doesn’t mean they can be easily measured. It takes a programmer, and some practice, to read code and decide how cohesive a module is, or how coupled two modules are.
I don’t doubt there are static analysis tools that can give magic numbers they call cohesion and coupling. Never measure your quality by such tools! They can sometimes be helpful to locate trouble spots, but quality measurement is still quite subjective in this area: don’t trust magic numbers.
Fortunately, most people don’t need to do this. While programming all we have to do is keep these metrics in our head and design for them. For any new function or type I always think, “does this really belong here?”, and whenever I access data from another module I think, “should I really be accessing this directly?”
The biggest refactoring I do tends to be decoupling code and improving cohesion. I often do this prior to adding new features if I can’t fully understand the code I need to modify. I like thinking in terms of this high-level refactoring to keep me focused and not chasing irrelevant details.