Defective Language

The very real mess of virtual functions

Virtual functions, though generally a blessing, have a defect-prone dark side. No language, that I know of, provides a way to encode when the base class should be called. This leads to a lot of defects when the functions are overridden.

The four contracts

A virtual function is declared in base class like this (obviously the syntax varies per language):

1
virtual process() {...}

The process function is virtual meaning a derived class can override it. It says nothing about when the base function should be called. There are four basic possibilities, all of which are in common use.

Call first

The base function is expected to be called at the start. This is typically used when the overridden method wishes to extend the logic of the function.

1
2
3
4
override void init() {
    base.init()
    //stuff that might depend on base call
}

Failing to call base.init, or calling it at the wrong time, leads to an incomplete calculation or initialization. Quite often the program can still run, but with subtle and hard-to-spot problems. This can make it very difficult to debug.

Call last

The base function is expected to be called last. This has two common uses: during setup to allow a derived class to set configuration values before continuing; or during destruction to allow reversing the order of construction.

1
2
3
4
override void destroy() {
    //destroy stuff setup in the `construct` function
    base.destroy()
}

The typical problem of not calling base.init, or calling it at the wrong time, is a resource leak. Something that should have been freed up is either not freed, or freed incorrectly. This can be hard to debug as the program often keeps working without any noticeable issues.

Call whenever

The base function can be called at any point. These are commonly used when the virtual is performing a calculation and the derived class extends the calculation.

1
2
3
override bounds calc_bounds() {
    return merge( local_bounds(), base.calc_bounds() )
}

It is also used, though somewhat rarely, to wrap the base behaviour, perhaps for serialization or to setup and teardown a new execution context.

Call never

The base function doesn’t need to be called at all. This setup is used when a derived class is allowed to fully override the behaviour of the base. It’s often used by derive classes that implement an optimized form of the function.

Abstract functions could perhaps be viewed as this form. I consider them distinct since if the base class has no code in the function it’s really irrelevant when, or if, it’s called.

The problem contracts

I find “call whenever” to be relatively resistant to errors. The derived and base class behavior is tightly coupled, thus the processing tends to noticeably fail if the base class is not called, or called at the wrong time. Possibly it’s less error prone since using this contract requires a fairly good understanding of what the base class is doing and we’re simply more attentive while doing it.

The “call never” case is the least error prone. The base doesn’t care if it’s called thus not calling it usually doesn’t lead to problems, though sometimes calling it redundantly can lead to problems. A problem can arise however if the contract changes through the type hierarchy; often a “call never” becomes a “call whenever” for descendent classes.

The errors I’m referring to here are about using the “virtual” aspect of the function correctly: when it is called. Saying it’s less error prone refers only to that. The actual code in the derived class is subject to all the generic programming problems on its own and nothing in the virtual contract could prevent those.

It’s the “call first” and “call last” virtuals that cause the most problems. “Call first” tends to be setup and it’s amazing how often code can generally work without being initialized correctly. Sometimes default values just happen to work, or something else initializes it. It’s often something subtle that is broken, something only a complex test case is able to detect.

“Call last” virtuals tends to be cleanup code, forgetting to call them often cause no noticeable side-effects. Over time this may lead to leaks: the program slowly runs out of memory or can no longer allocate resource handles.

Both of these cases can introduce a lot of subtle errors if they are called at some other time. It often results in partial setup, or partial destruction. Calling at the wrong time could be worse then forgetting to call them at all. Because most errors resulting from these scenarios are subtle, it makes it hard to debug.

Extending the syntax

I would like to see the syntax for virtuals extended to cover the common contracts. It’d help eliminate some frequent errors in object oriented programs, and save a lot of debugging time.

Perhaps a trait on the virtual function:

1
virtual(extend_after) void init() { ... }

Where extend_after would cover the “call first” case, with a matching extend_before for the “call last” case. For these two attributes I’d say the call to the base is implied; it isn’t written explicitly in the derived function. Though we might want an extends keyword for clarity, since override doesn’t feel right for this case.

1
2
3
4
extends void init() {
    //no call to base.init
    //local init
}

Just covering those two cases would probably be enough. The “call whenever” and “call never” case are less error prone and covered by the basic virtual and override syntax.

There is also one other inverted contract that commonly appears: when the base class code calls the virtual. I’ll cover this in a followup article, as it requires special treatment.

33 replies »

    • Sort of, this is actually the idiom I’m referring to in my final sentece about what I’ll cover next time. It has problems of it’s own.

  1. “No language, that I know of, provides a way to encode when the base class should be called.”

    CLOS (Common Lisp Object System) has done this for over 20 years. From the Wikipedia entry:

    “Apart from normal (“primary”) methods, there also are :before, :after, and :around “auxiliary” methods. The former two are invoked prior to, or after the primary method, in a particular order based on the class hierarchy. An :around method can control whether the primary method is executed at all. Additionally, the programmer can specify whether all possible primary methods along the class hierarchy should be called or just the one providing the closest match.”

  2. What do you do if one derived class needs to call the base function, but other derived classed don’t need to call that base function? or if one derived class needs to call the base function at the beginning of its implementation and other derived classes at the end? Such contracts would limit the flexibility you currently have. I can see the use for these contracts, but limited, and it’s likely that in many more cases one would have to do a lot more work just to circumvent the contracts. I don’t see this as helping in writing more maintainable code.

    • this is the point of the whole problem. A derived class never has to know about the implementation of the base function and because of that it lacks the knowledge when the base function must be called. This is why the constrain is needed in the original definition(or may be added as separate constrain in the overridden version in the derived class to allow further reuse down the class hierarchy).

      Even putting the actual uses aside such constrain will spare a superficial line in the overridden method and/or add the compiler some more power in regards of logic/code-flow checking- something that will be welcome.
      The more errors the compiler flags before publishing the less errors will we have to track while debugging, or(in the nightmare case) by doing night shifts since the client system is down.

    • I don’t want to remove the generic ability of virtuals. I want to add specific contracts for common cases. This would eliminate several defects I’ve seen appear in many projects.

      The more complex the relationship is between derived and base class the less likely it seems that a defect hapepns. It’s primarily in these simple cases that people tend to make simple errors since they aren’t thinking as deeply about the problem.

  3. Great to see an article that makes me realise I am not alone. I’ve worried about this for years. I think telepathy is the accepted solution. Somehow you are simply supposed to know what the author of the base class intended.

    • Only if the author of the base class failed to document the semantics and usage of the functions.

    • I actually don’t think knowing the base classes author intent is a problem. Chances are if you are thinking about it you’ll end up doing it right. It’s the trivial cases where you simply don’t think enough about it, that are the problem. There’s a function to override, I override it, the program works, so it must be okay.

      If it really is a matter of not understanding how the base class works, which does happen, it is a whole other matter enitrely I think. But at least with come common extensions the base class author might at least express it correctly a few more times.

    • I agree with both comments above; however, I have had many circumstances where the usual time constraints have meant that I simply haven’t had time to delve into the source of the base class so I have just taken a guess at the best time to call the base class function. Also, don’t forget that you don’t always have the source. And, let’s be honest, just how good is most documentation?

      This is nothing to do with the specific problem of virtual functions, but I’m using SQLite at the moment, which has been around years and is extremely well-respected, and I am finding myself looking on Amazon for books because the docs just don’t give me all the information I need.

  4. Very good points. I am not sure however if extending the virtual keyword will do any good. Given the problems I would suggest two approaches:
    1. use attributes that do not change the function itself but add possibility to control its behavior:
    2. use dedicated tests that are automatically generated/required during compile/debug time. Combined with 1. this will force the programmer to check the implementation again before publishing.

    As an example

    a)
    [base_first]
    protected virtual void v_testFunc(){/*code here*/}

    compiler to call base at the beginning. It is not needed in the function code. The parameters passed to the overridden function will be passed to the base function.

    b)
    [base_last: {parameter map if applicable} ]
    protected virtual void v_testFunc(){/*code here*/}

    compiler calls the base function last. It is not needed in the function code. If it has parameters those must be explicitly mapped to parameters the overridden function wants to be used. Local variables from the function must be allowed in the parameter map.

    c)
    [base_never]
    protected virtual void v_testFunc(){/*code here*/}

    this one allows optional implementation in derived classes so it can not be replaced from abstract functions which are obligatory to implement.

    calling the base function by hand in all of the cases above causes compile time error

    d)
    protected virtual void v_testFunc(){/*code here*/}

    this is the current state with a twist: base can be called anywhere but since no explicit behavior is defined, the compiler will throw if the base function is not called in the body of the overridden function.

    • I’m not really attached to my syntax at all, so anything that would work would be fine. I think languages need ways to express all sorts of more information the programmer might know about the function. I feel this is a huge source of errors. The original author has all this information in their head about what should be happening but has no way to encode that into the source code.

  5. This issue may be one reason why deriving from concrete classes is considered bad practice. The better option is to always derive from an abstract interface and call a shared implementation where necessary.

    • I know some people consider this a bad practice, but by in part deriving classes from base classes is widely accepted as a useful programming paradigm.

  6. “Sometimes I’m a bit depressed to learn that a solution to a problem has actually existed so long yet so many mainstream languages just ignore the problem.”

    I feel like it’s important to ask why. Is it possible this is *not* a problem worth solving? It’s unlikely that nobody among the designers of C++ or Java or C# was familiar with CLOS. It’s more likely that they have evaluated the feature and the pros/cons were not in favor.

  7. SIMULA-67 provides the INNER keyword to call the base-class method. I know it works on constructors. I can’t remember if it works on virtual functions in general. But that pushes back the date of this solution to late 1960’s(!!)

    • If we think more generally the current languages actually use the solution, but only for constructors and destructors. C++’s constructors are basically a `extend-after` construct and destructors are the `extend-before` that I am talking about. So the need for this is recognized, but somehow limited only to these very special instructions.

  8. This is silly. Any incorrect call order might result in a bug, not just a call to a base class function from the overriding function. Adding special syntax just for this case is dumb. If there is a reason that the base class function should always be called at a certain point, DOCUMENT IT.

    • I’ve seen it documented in many projects yet it hasn’t stopped people from making a mistake. The attitude that anything could be a bug therefore we shouldn’t try to prevent them seems misguided. Why wouldn’t we attempt to fix common bugs that a compiler, with a few adjustments, could easily recognize to be bugs?

    • This is why we generally consider C a more challenging language. Too many things are left ‘up to programmers’ to do right. C++ has improved significantly in that respect, so did Java, and other modern languages. If a base-class author wants you to do something before/after, ideally you shouldn’t have to dig through documentation to learn about it. Nothing nicer than having a compiler point the error out when you compile the code: “overridden base-method not invoked”.

    • Gee, it’s almost as if some people aren’t capable of reading, processing information, or responding to the points of an argument.

      “I’ve seen it documented in many projects yet it hasn’t stopped people from making a mistake.”

      OMG, people can’t be stopped from making mistakes! What serious problem!

      ” The attitude that anything could be a bug therefore we shouldn’t try to prevent them seems misguided.”

      Yes; good thing I didn’t say anything like that.

      “Why wouldn’t we attempt to fix common bugs that a compiler, with a few adjustments, could easily recognize to be bugs?”

      Another ridiculous strawman. This has nothing to do with adjusting compilers to recognize bugs.

  9. Hi
    Interesting point of discussion.
    I’d take the engineering approach and try to:
    – Create a clear design following SOLID principles: Specifically “Open/close” and “Liskov substitution”
    – Cover the code with unit tests
    I’m not a big fan of relying on tools (code languages are) to ensure the validity and correctness of a solution.
    Regards

    • TDD is good but the whole point of the scope control(public/private/protected..) and the inheritance is to ensure compile level logic compliance that requires us to write tests for the implementations and not for the structure and the transported data.
      A self-testing sequence consists of a class hierarchy where each level effectively tests the previous level by implementing it. The latter allows reduction of the overall tests that have to be performed/designed/written.

      A perfect system would be validated by smoke tests only, since its functionality is built in such way that ambiguous definitions/implementations are not tolerated in the design. Such systems are not really possible, but at the same time adding a lot of tests does not compensate ambiguous, fragmented and self-contradicting system design.
      The latter is cheap(effectively no design) since with the time passing functionality is piled rather than integrated. At the end, however cheap means a lot of work and the whole project ends being outsourced to third parties that employ cheap but yet qualified enough labor, effectively opening the flood gates and rejecting the efforts and costs that come with know-how management. A company that outsources a project has effectively given up on it at its current state. In that context Agile, TDD..etc are nothing but excuses to bad management and design decisions.

    • I think I’ve almost always considered this a bug as I’ve not seen a use-case where you’d intentionally call it twice.

      It might be hard for a compiler to determine it isn’t called twice though if you call the base in multiple locations but only one will be invoked.

  10. Do we really need this? How about instead:

    virtual onChildInit();

    final init(){
    // … parent code …
    onChildInit();
    }

    Contract enforced, no extra keywords needed?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s