Programming

Lingering Spaces and the Problems of Overformatting

Recently I started a new project and was surprised to hear my colleague express his desire to use spaces for indentation. I assumed this problem was long behind us. Though I realize several projects still use spaces to stay consistent with existing code, this was an entirely new project. It just didn’t make sense, but I suspect he is not the only holdout. Using tabs for indentation has real practical advantages over spaces.

Lack of Choice

If people choose space indents they must immediately answer the question, “how many spaces?” If they simply chose tabs they wouldn’t need to make this decision. The trouble with choosing is that everybody comes from a different background and is used to a different visual level of indentation. With spaces it is not possible to satisfy all of them. Though not a tragic loss, such small changes do affect productivity. Nobody dictates the colors a programmer should use. Nor do we say which fonts, which editor, or even which keyboard. Why should we dictate something as purely visual as the indentation width?

Perhaps a bit of a revelation before going further: I don’t use monospaced fonts. I have opted to switch to easy-to-read, nicely rendered proportional fonts. It took a while to get used to, but now I honestly believe it improves my productivity. Since my editor of choice, Kate, supports it so well, I have to assume I’m not alone in this decision. Though it does influence my dislike of space indents, I had abandoned spaces long before I made the switch.

Suppose you’re one of us uncommon people using a non-fixed-width font. The choice of how many spaces becomes quite problematic. Many people have decided that two spaces are more than enough for an indent. Perhaps with a monospaced font this is visually sufficient, but with the narrower spaces of a proportional font, two spaces are simply not large enough to be visually distinctive. With tab indents this really isn’t an issue: if you like monospaced two-space indents, then you can have them, and I can have my proportional font with larger indents.
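As a rough illustration of that flexibility, the sketch below uses Python’s `str.expandtabs` to stand in for an editor’s tab-width setting. The snippet text and the widths 2 and 8 are made up for the example; in a real editor the width is a display preference, and with a proportional font it would be a measured distance rather than a count of columns.

```python
# A tab-indented snippet stored as plain text; each "\t" is one indent level.
source = "while running:\n\tif ready:\n\t\tprocess()\n"

# Each reader picks a display width; the stored text never changes.
narrow = source.expandtabs(2)
wide = source.expandtabs(8)

print(narrow)
print(wide)
```

The file on disk is identical for both readers; only the rendering differs.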

The Danger of Spaces

That aside, a key problem with spaces, and perhaps fixed width fonts, is their use in additional, and unnecessary, code alignment. Beyond indentation level I don’t think code should have any further sequences of columnar whitespace. It interrupts the natural flow of the code, makes maintenance harder, and can cause conflicts during merging.

A common problem is the alignment of extra parameters for a function to the same column as wherever the first parameter happens to be. This pushes all the parameters far off to the right of the screen, quite often beyond the edge of a typical window. Perhaps more importantly, should somebody refactor the code, they have to take additional time to ensure all these columns are still aligned. More than likely, however, things will just end up misaligned.

Now the parameters to a function are important; they are something you want to see clearly. So I won’t disagree with the desire to align them, but this can be done just fine with normal indentation: each parameter resides one tab in from the function name. This keeps the parameters of every function in a consistent location and alleviates the problem of refactoring.
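A small sketch of the two layouts, written in Python; `describe_rectangle` and its arguments are hypothetical names invented for the illustration. Both calls do the same thing, but only the second keeps its arguments at a position that does not depend on the length of the function name.

```python
def describe_rectangle(width, height, label):
	return f"{label}: {width}x{height}"

# Arguments aligned to wherever the opening parenthesis happens to fall.
# Renaming the function means re-aligning every continuation line.
aligned = describe_rectangle(3,
                             4,
                             "small")

# Arguments one tab in from the start of the statement.
# Renaming the function touches only the first line.
indented = describe_rectangle(
	3,
	4,
	"small",
)

print(aligned)
print(indented)
```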

Related to this is the alignment of the ‘=’ operator when declaring variables. Perhaps this lends some kind of visual aesthetic to the code, but I find it rather distracting. It artificially separates values from their types and becomes a pain if you need to add another, longer, variable to the list.
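To make that cost concrete, here is a small Python sketch that counts how many lines have to be written when a longer declaration joins the list. The declarations are made-up C-style lines and the helper names (`plain`, `aligned`, `lines_written`) are my own for this illustration; `difflib.ndiff` comes from the Python standard library.

```python
import difflib

def plain(decls):
	return [f"{left} = {right};" for left, right in decls]

def aligned(decls):
	# Pad every left-hand side to the longest one so the '=' signs line up.
	width = max(len(left) for left, _ in decls)
	return [f"{left.ljust(width)} = {right};" for left, right in decls]

def lines_written(before, after):
	# ndiff prefixes every new or rewritten line with "+ ".
	return sum(1 for line in difflib.ndiff(before, after) if line.startswith("+ "))

before = [("int count", "0"), ("double ratio", "1.5"), ("char flag", "'n'")]
after = before + [("unsigned long total_bytes", "0")]

print(lines_written(plain(before), plain(after)))      # prints 1
print(lines_written(aligned(before), aligned(after)))  # prints 4
```

With the aligned layout, one new declaration forces all three existing lines to be re-padded as well.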

Merge Conflicts

I mentioned the additional work required to maintain the space alignment. Assume that all team members do keep the code aligned; this actually introduces a new problem. In order to do simple refactoring you have to touch more lines of code than you have semantically altered (you have added/removed spaces on several additional lines). This increases the size of the code change.

This larger change causes problems for reviewing and merging code. A reviewer of your code will be distracted by these additional changes and not clearly see the things you have actually changed (though with a good visual diff tool this can be mitigated). Worse, however, is the possibility of conflict on those additional lines. If somebody else has modified the same line (also while refactoring) you will get a conflict. Avoiding this extra alignment avoids these problems.
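A simplified model of that conflict, as a Python sketch: two branches start from the same declarations, one renames a variable and the other changes an unrelated value. The line contents and the helper `changed_indexes` are assumptions of mine, and treating “both branches changed the same line” as a conflict is a simplification of what real merge tools do.

```python
def changed_indexes(base, branch):
	return {i for i, (old, new) in enumerate(zip(base, branch)) if old != new}

# Aligned style: renaming `count` re-pads every line in the group.
aligned_base = [
	"int count    = 0;",
	"double ratio = 1.5;",
	"char flag    = 'n';",
]
aligned_rename = [
	"int processed_count = 0;",
	"double ratio        = 1.5;",
	"char flag           = 'n';",
]
aligned_new_value = [
	"int count    = 0;",
	"double ratio = 1.5;",
	"char flag    = 'y';",
]

# Plain style: each branch touches only the line it actually means to change.
plain_base = ["int count = 0;", "double ratio = 1.5;", "char flag = 'n';"]
plain_rename = ["int processed_count = 0;", "double ratio = 1.5;", "char flag = 'n';"]
plain_new_value = ["int count = 0;", "double ratio = 1.5;", "char flag = 'y';"]

aligned_overlap = (
	changed_indexes(aligned_base, aligned_rename)
	& changed_indexes(aligned_base, aligned_new_value)
)
plain_overlap = (
	changed_indexes(plain_base, plain_rename)
	& changed_indexes(plain_base, plain_new_value)
)

print(aligned_overlap)  # prints {2}
print(plain_overlap)    # prints set()
```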

You can of course imagine how horrible all this alignment looks with a proportional font.

Just Use Tabs

At some point in the past, the long past I should add, getting tabs to work properly was a bit of an issue. A lot of editors had problems and confused spaces and indents. Today of course all but the crappiest tools can easily handle tabs for indentation. Lack of support was the only valid criticism I’ve ever heard against tabs, and that is simply no longer applicable.

Choosing spaces introduces unnecessary problems into a project. First off, it forces all programmers to use the same visual indentation, which may be enough of a distraction to hurt productivity. Secondly, it opens the door to all sorts of counter-productive visual alignment, which in turn can cause very real merge conflicts. And of course it pretty much eliminates the chance of any team member using a proportional font.

Tabs also have this nice notion of being a single character for a single purpose. One indent is simply one tab. I see no value in using spaces for indentation. Use tabs.

5 replies »

  1. Monospace vs. proportional arguments aside, most modern VCSs provide options to ignore whitespace in diffs when reviewing patches, so additional whitespace from spaces used for alignment is hardly a problem in that regard.

    Adding the alignment itself can be done in the background while you’re reading the code, if someone forgot to update the alignment when they modified it. A lack of proper indentation can even serve as an indicator of how much of a rush the person who modified the code was in. Some people may even find it therapeutic to grumble about the indentation while tracking down a bug ;)

    Tabs are problematic, in my opinion, because indentation beyond the function/conditional level is often unclear to automated tools because it is more closely tied to the semantics of the code. For example, maps and lists that are broken across multiple lines should not be left justified with the rest of the containing block because they are enclosed within the list. Further it would be nice for the keys and values in the map to be aligned with one another so that it is easier to scan the map visually.

    Using tabs, an editor might put the same amount of whitespace between the key -> value pairs as it puts before indented lines. This is problematic because two or four spaces may be excessive when only a single additional space would suffice. So the tool would need to be smart enough to compress the additional space when only simple alignment is required.

    Lists and maps are not the only place where indentation is not explicit. Nested lambdas can often throw automated tools into a tailspin. Multiline conditionals, while ugly by their very nature, are still easier to parse if they are vertically aligned. There are many cases where it makes things clearer to align argument lists if the API requires repeated invocation of the same function. It is simpler if a human makes the decision as to how the code should be formatted.

    Beyond this, personal preference of the number of spaces a tab should represent must be configured in multiple places (diff tool, editor, vcs, compiler debug messages, etc.) leading to a sprawling amount of configuration and a constant irritation at the formatting being “wrong”. With spaces the indentation is explicit and not subject to the interpretation of the tool used to view the code.

    • Tabs have no problem doing vertical alignment or indenting the items in a map. Just start them all at one level more of indentation on the next line.

      I’ve never seen a reason to align the “values” of entries in a map. I don’t see that it adds clarity to the code.

      Your arguments seem to indicate you also use automated formatting tools. Perhaps I need to extend, or write, about how I disagree with the use of such tools as well (in the general case).

  2. I must say thanks for your sane article. Although I haven’t yet tried to use proportional fonts for programming, I can’t figure out why on earth someone would prefer to use plain spaces over tabs.

    To me you have nailed it precisely: with tabs you just preserve each personal choice in a globally concise way.

    Please people, stop requiring spaces for your projects!

  3. Yes! Absolutely nailed every point. I use variable width fonts for some languages and not others, and vary indentation widths (and syntax colouring) similarly. No alignment whitespace beyond left indent – copious usage of lambdas, multiline conditionals, large statically initialised data structures, etc. are all beautifully readable following the “one level more of indentation on the next line” rule (cf. well known complaints about readability of fully justified text because of irregular word separation distances). This formatting approach, lacking arbitrary mid-line alignment whitespace, could also be readily applied algorithmically.

    Thanks.

  4. Talking about version control, it is not just a matter of merge conflicts actually.
    It also makes the code review a little worse, and you mess with the blame of that line, putting it in a commit that has absolutely nothing to do with it besides fixing the alignment.
