An unpleasant visit from the refactoring demon

In the middle of a major change to the Leaf type system I encountered the refactoring demon. It reared its ugly little head chanting, “you’re doing it wrong, all wrong!” It was probably right, but there was no way to know until I was done. Seeing my determination did not sway the demon. Instead it started pointing out all the other changes I should be making instead.

Avoiding the big refactor

First off, problems like these are why I avoid refactoring that requires a lot to change at once. I much prefer coming up with small steps, where the test suite can be run after each change.
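
To make the idea of a small step concrete, here is a minimal sketch in Python. It is not Leaf code: the function names and the type-description format are hypothetical, invented only for illustration. It shows one small, behaviour-preserving change and a check that can be run immediately afterwards.

```python
# Hypothetical example; none of these names come from the Leaf code base.

def describe_type_before(name, params):
    # Original version: separate branches that mostly repeat each other.
    if len(params) == 0:
        return name
    if len(params) == 1:
        return name + "<" + params[0] + ">"
    return name + "<" + ", ".join(params) + ">"


def describe_type_after(name, params):
    # One small step: collapse the redundant branches into a single rule.
    if not params:
        return name
    return f"{name}<{', '.join(params)}>"


def check_same_behaviour():
    # Run after the step; both versions must agree on every case.
    cases = [("int", []), ("array", ["int"]), ("map", ["string", "int"])]
    for name, params in cases:
        assert describe_type_before(name, params) == describe_type_after(name, params)
    return len(cases)
```

If the check fails, only one small change needs to be undone, which is exactly what a big refactor cannot offer.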

Alas, sometimes a big change is required. My change in Leaf was about how parametric types are handled. Trying to implement a feature I hit a block. There was just no way to reasonably introduce the feature that was neither very messy nor involved a large change in the type system. I deliberated until I convinced myself the change was the right thing to do.

Just a few hours into these changes I noticed other problems with the code. A lot of things started to crop up, things that could be done better. Things that would make everything easier. I even started doubting whether my approach was correct at all. Already demotivated, I came to another dissatisfying realization: my type identifier code was too hard to read and highly redundant as well.

Stop, think

I had a few basic options. I could carry on and come back to the issues later. Or I could roll back my current refactoring and do the other reductions first.

Carrying on with the parametric refactoring had these considerations:

  • I had already invested a lot of time in this refactoring. It’s something that has to be done anyway.
  • It was a clear goal to work towards and would implement a concrete feature.
  • A lot of the work would however be discarded by the next refactoring.
  • I was putting in lots of strange hacks to work around issues the other refactoring would fix.
  • Some of the tests had to be commented out since a proper solution could not be found.
  • There’s technically nothing wrong with this refactoring.

Stopping this effort and doing the reductions first had these considerations:

  • It would significantly reduce the work I needed to implement the parametric feature.
  • The type identifier code would be a lot easier to understand, and could produce cleaner rules for the type system.
  • The code would be significantly reduced.
  • Though smaller, it was still a rather large change.
  • It didn’t include a semantic change to the system, and thus would involve fewer tricky decisions.

At this point it wasn’t a matter of whether either refactoring should be done; I was convinced that both of them were needed. It was just very problematic that the work involved in one would significantly alter the other. Indeed, the parametric refactoring would be a whole lot easier if the reduction was done first.

Punching through

Neither solution was great. These frustrating realizations can be depressing and lead to a code paralysis where nothing gets done. I took a step away from the machine for a while. I built a small shelving unit for my desk; wood presents a much simpler set of challenges than compiler code.

After more reflection I decided to continue with the parametric refactoring. It offered a real new feature in the system, as opposed to a pure refactoring. This was my primary consideration. I did not wish to abandon the work done so far to go traipsing about in refactor land.

To avoid losing too much time on code that would become redundant, I decided to simplify my effort. I’d take the questionable approach of hacking it to completion. I just turned a blind eye to any other issues I saw in the code. I used a few too many if statements. I commented out a few unit tests. I left the type identification code in a worse state than when I started.

I managed to complete the change. The new feature even functioned as desired without further work. That made me quite happy. Sure, I left a bunch of holes in the code, but the high-level language tests all worked.

Now I just need to do the reduction refactoring. I presume the little demon is waiting for me.


14 replies »

  1. Been there, done that (many times), and I know well that nagging voice that demands you change major building blocks.

    Fred Brooks, in his famous The Mythical Man-Month, wrote, “Be prepared to throw the first one away. You will anyway.” That isn’t as true today with incremental development, but it still sometimes applies, especially when designing something large and new. You get so far and realize fundamental design choices could have been better.

    Sometimes it’s easier to tear it all down and start fresh knowing what you now know.

    FWIW: I’ve come to believe that the ability to seriously refactor your code without much stress and breakage indicates good coding practice — robust, well-designed code. After thirty years of practice, I began to realize I could make sweeping changes almost painlessly — that it was almost fun.

    • Brooks also introduces the “Second System Effect”, which is in contrast to his thesis “Be prepared to throw the first one away”. And I think exactly this is the hard thing to manage. One easy solution is when the “first thing” is a prototype: it’s okay to throw it away. But when talking about prototypes, I’m not sure whether the need for refactoring applies. When it is an established system, I would never recommend throwing it away and building it anew.

      In addition to this I can second Wyrd’s statement that refactoring can be painless and fun – but the noted many years of practice are a strong prerequisite for that. ;-)

    • Yeah, “Second System” is a totally different thing, and the thing that really strikes me about Brooks is how he nailed certain software development principles “way back when” and we’re still operating in ignorance of them today.

      For example, I’ve never forgiven Adobe for turning their simple PDF reader into a bloated pile of features I have no use for. In fact, that tends to be true of most of the software I’ve used through multiple generations.

      “Throw the first one away” is more about how you don’t always understand the problem as well as you thought you did, and often creating a solution only shows you how far you missed the mark.

      That was pretty important back when the rule was 90/10 for design versus coding. (In fact, back then, coding was grunt work done by interns — one of the first jobs I applied for was writing COBOL from flow charts designed by the “real” programmers.)

    • Coincidentally, I finished TMM-M just a few days ago. And I feel the same way – much of what he says still applies today or at least inspires for today.
      My impression is that, for the past forty years, each individual project manager, developer or shop has had to learn the same lessons over and over again. Our educational system seems to not teach the important things, or not enough.

      I think that “Throw the first one away” applies to design *and* coding.

      On an unrelated sidenote (and as a shameless plug), I’d be happy if whoever reads this takes my survey about Refactoring Tools and their quality and acceptance. I’m thinking of doing open source stuff on that topic.

    • I would never rewrite such a large code base, but as my article points out I’m not afraid of making major changes. The type system involved in the refactoring is essentially the backbone of the code. It was written initially to be really easy to extend and adapt, which has proven out a few times.

      However, changing something fundamental is another story. I don’t think one should ever design for such changes either. Well, there should certainly be tests which would allow such a refactoring, but the code itself should not be prestructured to allow fundamental changes. Certainly good coding practices help a lot; indeed, without them the refactoring might not have been possible.

      I see a misconception sometimes though when I speak of refactoring, not necessarily from these comments, but elsewhere. The word has a range of meaning and I think it is too often taken to mean just simple things, like renaming, moving properties around, splitting a class, etc. That is, I see many people equate refactoring with the range of things an automated tool can do. The type of change I was doing here does not fall in this category. I still call this refactoring, though perhaps I shouldn’t.

      I don’t like throwing away prototypes. There’s no reason you should have to throw away a system if you’re following good coding practices. Sure, you can throw away modules, or parts of modules, but this should rarely amount to a lot of code. Certainly it’s possible to completely replace most code over time, but this should never be done as a single concerted effort.

      There are several good topics here I should probably cover in further articles.

    • @Daniel Albuschat: I agree, refactoring includes design and coding. I brought up the difference as a reference to the “waterfall” model of design versus the “spiral” (or “incremental”) model.

      In the waterfall model, you spend a lot of time trying to design a correct specification, and you often don’t realize you’ve made poor choices until the code set is large enough for system testing. It’s easier to invest lots of time you end up deciding to throw away.

      In the spiral model, prototypes help you discover wrong paths early in the process, and you are less likely to waste large amounts of design time on non-ideal solutions.

      You’d think no one would use waterfall these days, but the last project I worked on before I retired last year involved just that.

      @mortoray: I agree, refactoring applies to any systematic changes, whether done by humans or software. I’m pretty sure the term originally applied to coders. It’s just that there are tools that make the job easier.

      I’ve never designed with changes or refactoring in mind. It’s more that you develop good coding skills that result in well designed modules with strong coherence and very little coupling. Good coding design structure is also robust for refactoring.

      So while I’ve never intended my code for refactoring, I’ve found that over time the code I wrote was easier to refactor — almost fun. Like trying on a new wardrobe.

      As to throwing away prototypes, today it’s common to prototype system behavior in a “rapid” language intended for prototyping more than for production code. This is especially the case when a system revolves around a user interface.

      For example, I might throw together a “working” (ha!) user interface using VB and writing just enough code to simulate behaviors. This allows users to bang on it and provide feedback.

      Once I’m sure the design is headed in the right direction, I’d create the production version.

      Python is also great for quick prototyping to test ideas and algorithms, but I’d never write production code in it.

  2. I think refactoring is one of those terms that has taken on a huge amount of baggage. Managers tend not to want to hear the word “redesign,” or worse “rework,” so often one finds people just substituting “refactoring” when they do mean those things.

    In principle a refactoring really is a small reversible step that does not change functionality. In that sense it’s reasonable to suggest it can be automated by an IDE. Many such refactorings can be chained together. In Fowler’s Refactoring book, though, he allows for a refactoring that changes the algorithm, which I think can easily leak into what I’d prefer to call redesign.

    • I agree the specifics of the term are not universal. I pretty much consider any change that doesn’t add a new feature to be a refactoring. This includes changes that modify behaviour, in particular those that are not completely backwards compatible.

      It’s always a matter of level though, are we talking about functionality at the low-level units, or the high-level use-cases? I frequently engage in refactoring that essentially breaks everything except the highest level use-cases.

      We can push the scope even further, for Leaf I consider the ultimate use-case to be its ability to write algorithms concisely and cleanly. In this sense I can even change the semantics of the language itself and not modify the highest level use-cases.

      I’m also no longer in favour of API design, I believe strongly in refactoring only. Just start with something and keep modifying it until it works the way you want. Every time I’ve designed an API ahead of time it’s inevitably been the wrong API. Of course, I must add my initial guesses as to design now are reasonably good.

  3. The spirit of the refactoring concept is that overall redesign can be accomplished via a series of small reversible changes (see, e.g., the Wikipedia entry).

    What you’re doing sounds like traditional hacking to me. I don’t really see the reason for using the term refactoring here.

    I’m not saying refactoring is always the right thing to do, just that your usage of the term doesn’t seem to be distinct from just redesigning the code the old fashioned way.

    • What I’m doing here fits that definition on Wikipedia. The external behaviour of my product is not being significantly altered. In both steps I did recently, fewer than 1% of the high-level tests had to be altered due to the changes. The alterations that had to be done were minor as well.

      I see no reason why not to use the term refactoring for any activity on the code that has no external motivator. That is, all changes which are strictly for the purpose of improving the code itself, as opposed to changes that add new features or fix defects.

  4. You can redesign or even rewrite a whole subsystem of an application without changing its external functionality. That doesn’t constitute a refactoring.

    Refactoring was developed to make large changes in the code possible via a very incremental, iterative process. It’s the process of refactoring that’s key. The mere objective of cleaning up the code in some way is not in itself refactoring, at least in my opinion.

    Again, I’m not objecting to the idea that refactoring may not always be the most expedient way to go. But doing traditional hacking (albeit backed by automated tests) and calling it refactoring, I think, causes unnecessary confusion.

    • I think we’re going to have to disagree then on this point. I don’t think that refactoring was even “developed”; it was a term applied to an activity that was already being done on code. To somehow limit its scope to those things that are trivially done, such as automatically in an IDE, seems pointless. I’ll just be forced to invent a new word and would never use the word refactoring.

  5. I don’t want to be obtuse but I’ll try one more time to clarify my thinking.

    My whole point is that the idea(l) of refactoring is that you can achieve any desired overall change in a piece of code using a set of small reversible code transformations. You are clearly saying you disagree, or at least that it’s not always the right thing.

    As far as I know, refactoring came out of the early XP/Smalltalk community as a tool to substitute in place of hacking (in addition to other tools such as unit testing etc). It is not the same thing as just cleaning up or reorganizing code.
