Cohesion and coupling: good measures of quality

What is good code? Even if I say nobody cares about your code, its quality nonetheless influences a project. Functional correctness reigns supreme, but if we put that aside, what should we look at next? Cohesion and coupling are the two most significant aspects of code impacting software quality.

Cohesion

Cohesion is how well the bits of a module fit together. A set of functions, an interface, is considered cohesive when each function is closely related to another. For example, log_error and log_warning are closely related and belong nicely together in a single log module. As functions diverge in their behaviour a module becomes less cohesive. For example, log_error and parse_json don’t seem to be related at all.

We can roughly measure cohesion by how succinctly the module can be described. Short and precise sentences are preferred to longer, verbose, and generic ones. Contrast “The log module records diagnostic messages” with “This module records diagnostic messages and parses JSON”. In the latter we couldn’t even give the module a name. If we can’t give a module a name, there’s a good chance it lacks cohesion.

Words like “assorted” and “variety” are a potential indication of low cohesion. A module that provides “an assortment of calculations” doesn’t sound as cohesive as a module that provides “bond amortization and valuation”.

Why

A highly cohesive module is easier to understand than a less cohesive one. While looking at the code, or interface, we can keep a single unifying theme in our head. As all functions relate to that theme, it is easier to understand the code, and thus easier to extend it, debug errors, or simply use the interface.

Cohesive modules tend to reuse the same types throughout their interface. The parameters to log_error should be the same as those to log_warning. Once we’ve learned the basic types used in a module, it is far easier to use the rest of the module.

Clean and coherent interfaces also make a module much easier to replace and test. Since all the functions are related, it makes sense that we’ll have to replace all of them to swap out the module. Similarly, writing a mock object for testing becomes more straightforward. If there are too many unrelated functions in the interface, we find our unit tests also lose focus.
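As a rough sketch of why a small, cohesive interface is easy to stand in for, consider the Python below. The Log protocol, the FakeLog test double, and the parse_port caller are all hypothetical names invented for this example.

```python
from typing import Protocol


class Log(Protocol):
    """The whole interface a caller needs to know about."""
    def log_warning(self, message: str) -> None: ...
    def log_error(self, message: str) -> None: ...


class FakeLog:
    """Test double: remembers messages instead of writing them anywhere."""
    def __init__(self) -> None:
        self.warnings = []
        self.errors = []

    def log_warning(self, message: str) -> None:
        self.warnings.append(message)

    def log_error(self, message: str) -> None:
        self.errors.append(message)


def parse_port(text: str, log: Log) -> int:
    """Hypothetical caller: returns a port number, or 0 if text is not one."""
    if not text.isdecimal():
        log.log_error(f"not a port number: {text!r}")
        return 0
    port = int(text)
    if port < 1024:
        log.log_warning(f"privileged port: {port}")
    return port
```

The fake needs only two methods because the interface has only two closely related functions. Every unrelated function added to Log would also have to be added to every fake.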

Coupling

Coupling is how dependent modules are on the inner workings of each other. Tightly coupled modules rely extensively on the specific state of each other, sharing variables and many types. Loosely coupled modules are fairly independent: they have a few well-defined APIs and share little or no data.

The extent of coupling between modules is determined by the complexity of the exposed interface, where every public function and type is part of the interface. A module that has a high ratio of public information compared to private information will result in tight coupling. A module that exposes a minimal interface will result in loose coupling.
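The difference between a wide and a minimal interface can be sketched in a few lines of Python. TightStock and Stock are hypothetical classes made up for this illustration, and the leading underscore is only Python's convention for "private", not something the language enforces.

```python
class TightStock:
    """Everything is public: callers reach into the dict directly."""
    def __init__(self):
        # Callers now depend on this being a dict of name -> count.
        self.items = {}


class Stock:
    """Narrow interface: two functions, storage kept private."""
    def __init__(self):
        self._items = {}

    def add(self, name, count=1):
        if count < 1:
            raise ValueError("count must be positive")
        self._items[name] = self._items.get(name, 0) + count

    def count(self, name):
        return self._items.get(name, 0)


# A tightly coupled caller must know the storage layout,
# and nothing stops it from leaving a negative count behind.
tight = TightStock()
tight.items["bolt"] = tight.items.get("bolt", 0) - 5

# A loosely coupled caller only knows add() and count().
stock = Stock()
stock.add("bolt", 3)
```

With Stock, the dict could later become a database table without any caller noticing. With TightStock, every caller that touches items would have to change too.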

Transitivity is also an indicator of coupling. Say we have a series of modules, where each depends on the next in turn: A -> B -> C. Ideally module A knows nothing about module C. The more A knows about C, the more tightly it is coupled to module B.
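Here is one way the A -> B -> C chain might look in Python. Storage, Users, and the two report functions are hypothetical stand-ins for modules C, B, and A.

```python
class Storage:
    """Module C: holds raw rows."""
    def __init__(self):
        self.rows = [("alice", 30), ("bob", 25)]


class Users:
    """Module B: built on top of C."""
    def __init__(self, storage):
        self.storage = storage

    def names(self):
        return [name for name, _age in self.storage.rows]


def report_leaky(users):
    """Module A reaching through B into C.

    It knows that Users holds a Storage and that rows are (name, age) tuples.
    """
    return ", ".join(row[0] for row in users.storage.rows)


def report(users):
    """Module A talking only to B."""
    return ", ".join(users.names())
```

Both functions produce the same string today. If Storage later changes its row layout, report keeps working once Users.names is updated, while report_leaky has to be found and fixed as well.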

Why

The higher the coupling is between modules, the harder it is to understand what the code is doing. The more details two modules share, the more difficult it is to understand them. Independent code is great for readability since we can make sense of it in isolation.

The more coupled modules are the harder it is to replace them. This also means it is harder to write tests for them since it is unclear where one module ends and another begins. As any changes ripple through connected code, refactoring of highly coupled code is difficult.

Module isolation is very important for team structure. Programmers can be productive on individual modules without needing to understand the entire code base. They are also less likely to interfere with the work of others.

Be wary of the false abstraction while approaching coupling. Simply putting a new face on an API doesn’t decouple it.
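To illustrate what a false abstraction might look like, here is a small Python sketch using the standard sqlite3 module. ThinWrapper and Settings are hypothetical classes invented for the example, as is the settings table.

```python
import sqlite3


class ThinWrapper:
    """A new face on the same API: callers still write SQL and get raw rows."""
    def __init__(self, conn):
        self._conn = conn

    def run(self, sql, params=()):
        return self._conn.execute(sql, params).fetchall()


class Settings:
    """Callers see keys and values, and nothing of SQL or tables."""
    def __init__(self, conn):
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS settings (key TEXT PRIMARY KEY, value TEXT)"
        )

    def set(self, key, value):
        self._conn.execute(
            "INSERT OR REPLACE INTO settings (key, value) VALUES (?, ?)",
            (key, value),
        )

    def get(self, key, default=None):
        row = self._conn.execute(
            "SELECT value FROM settings WHERE key = ?", (key,)
        ).fetchone()
        return default if row is None else row[0]
```

Code that calls ThinWrapper.run is just as tied to the table layout and SQL dialect as code that calls sqlite3 directly. Code that calls Settings.get would survive a move to a plain file or a dict.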

Quality metrics

Cohesion and coupling are two significant areas of code quality. Unfortunately, that doesn’t mean they can be easily measured. It takes a programmer, and some practice, to read code and decide how cohesive a module is or how coupled two modules are.

I don’t doubt there are static analysis tools that can give magic numbers they call cohesion and coupling. Never measure your quality by such tools! They can sometimes be helpful to locate trouble spots, but quality measurement is still quite subjective in this area: don’t trust magic numbers.

Fortunately, most people don’t need to do this. While programming all we have to do is keep these metrics in our head and design for them. For any new function or type I always think, “does this really belong here?”, and whenever I access data from another module I think, “should I really be accessing this directly?”

The biggest refactoring I do tends to be decoupling code and improving cohesion. I often do this prior to adding new features if I can’t fully understand the code I need to modify. I like thinking in terms of this high-level refactoring to keep me focused and not chasing irrelevant details.

I’ve been fortunate to work on a wide variety of projects, from engineering to media, from finance to games.

4 replies »

  1. You might find the answers on this Programmers’ Stack Exchange question worth reading too: http://programmers.stackexchange.com/questions/151004/are-there-metrics-for-cohesion-and-coupling

    • I don’t think that measurement of cohesion is actually a strong indicator of cohesion. The problem is that it is measuring code cohesion rather than interface/feature cohesion.

      Consider my `log_error` and `log_warning` examples again. These can both belong to a single log class, but perhaps it’s just a wrapper around some lower-level functions. Thus the two functions won’t appear related at all from a code viewpoint.

      Similarly, a `notices` class may have a `user_alert` and `user_critical_alert` function. One may just file a passive notice somewhere on the desktop, the other may make a popup. Thus the code could be entirely different, yet still form a cohesive interface.

      The opposite is perhaps more troubling, if you rely on the linked metric. It’s very easy for functions in a class to share properties and state, even if they aren’t really related. Perhaps they’re using a shared cache, or just plain using a temporary property for who knows what reason. The code-based cohesion measurement would say this class is cohesive, regardless of what the API looks like. It could actually encourage an invalid sharing of properties!

  2. If you’re changing the level of cohesion, you might be just making changes for a new version, and not actually doing the thing called “refactoring.”

    http://en.wikipedia.org/wiki/Code_refactoring

    If your changes aren’t externally visible, you didn’t change module cohesion at all. It seems built into the definitions that when you improve cohesion, those improvements are visible from outside, and are not examples of refactoring.

    Like in your examples; any change to your examples of cohesion would change the API in some way, even if it is just to improve the way you invoke it.

    • I don’t think we should be too strict on what exactly is refactoring and what is simply modifying an API. Proper code is done in layers, so while I may be technically modifying an API it may be an internal module and thus the API change isn’t too relevant.

      It’s hard to imagine any refactoring that doesn’t change an API somehow. Even modifying a private internal function is still changing an API; it’s just that we have good control of all users of that API and can modify them at the same time.

      I prefer to consider refactoring as something where the primary intent is to do some kind of cleanup, or prep work, and API changes are just incidental, not intended.
