The very real mess of virtual functions

Virtual functions, though generally a blessing, have a defect-prone dark side. No language that I know of provides a way to encode when, or whether, an override should call the base class implementation. This leads to a lot of defects when the functions are overridden.

The four contracts

A virtual function is declared in a base class like this (the syntax obviously varies by language):

virtual process() {...}

The process function is virtual, meaning a derived class can override it. The declaration says nothing about when the base function should be called. There are four basic possibilities, all of which are in common use.

Call first

The base function is expected to be called at the start. This is typically used when the overriding method extends the logic of the base function.

override void init() {
    base.init()
    //stuff that might depend on base call
}

Failing to call base.init, or calling it at the wrong time, leads to an incomplete calculation or initialization. Quite often the program can still run, but with subtle and hard-to-spot problems. This can make it very difficult to debug.
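To make this concrete, here is a minimal sketch in Python. The class names (Service, CachedService, ForgetfulService) and the retries field are invented for illustration; they don't come from any real library. It shows a correct "call first" override next to one that forgets the base call and still runs without complaint.

```python
class Service:
    retries = 1  # class-level default, used if init() never runs

    def init(self):
        # Base setup that derived classes are expected to build on.
        self.retries = 3
        self.log = []


class CachedService(Service):
    def init(self):
        super().init()                   # "call first": base state exists from here on
        self.log.append("cache ready")   # depends on the base call


class ForgetfulService(Service):
    def init(self):
        # Missing super().init(): nothing fails here, but retries silently
        # stays at the class default instead of the configured value.
        self.cache = {}


good = CachedService()
good.init()

bad = ForgetfulService()
bad.init()
```

Nothing crashes in the second case. The object simply carries a default value that happens to be usable, which is the kind of defect that tends to surface much later.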

Call last

The base function is expected to be called last. This has two common uses: during setup to allow a derived class to set configuration values before continuing; or during destruction to allow reversing the order of construction.

override void destroy() {
    //destroy stuff set up in the `construct` function
    base.destroy()
}

The typical result of not calling base.destroy, or calling it at the wrong time, is a resource leak. Something that should have been freed is either not freed, or freed incorrectly. This can be hard to debug as the program often keeps working without any noticeable issues.
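A small Python sketch of the "call last" ordering follows. Resource, BufferedResource and the released list are hypothetical names used only to record the order of teardown; treat it as an illustration rather than a recommended resource-management pattern.

```python
class Resource:
    def construct(self):
        self.released = []           # records teardown order for inspection
        self.handle = "base-handle"

    def destroy(self):
        self.released.append(self.handle)
        self.handle = None


class BufferedResource(Resource):
    def construct(self):
        super().construct()          # setup runs base first...
        self.buffer = "derived-buffer"

    def destroy(self):
        # ...so teardown releases the derived state first,
        self.released.append(self.buffer)
        self.buffer = None
        super().destroy()            # "call last": base teardown runs after ours


r = BufferedResource()
r.construct()
r.destroy()
```

The released list ends up in the reverse of construction order: the derived buffer first, then the base handle.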

Call whenever

The base function can be called at any point. This is commonly used when the virtual performs a calculation and the derived class extends that calculation.

override bounds calc_bounds() {
    return merge( local_bounds(), base.calc_bounds() )
}

It is also used, though somewhat rarely, to wrap the base behaviour, perhaps for serialization or to set up and tear down a new execution context.
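Both uses can be sketched in Python. Here I assume bounds are simple one-dimensional (low, high) tuples, and the Node, LabelledNode and TracedNode classes are made up for the example. The first override merges its own bounds with the base result; the second wraps the base call.

```python
def merge(a, b):
    # Union of two 1-D intervals given as (low, high) tuples.
    return (min(a[0], b[0]), max(a[1], b[1]))


class Node:
    def calc_bounds(self):
        return (0, 10)


class LabelledNode(Node):
    def local_bounds(self):
        return (-2, 4)

    def calc_bounds(self):
        # merge() is symmetric, so it does not matter whether the base
        # result is computed before or after the local one.
        return merge(self.local_bounds(), super().calc_bounds())


class TracedNode(Node):
    def __init__(self):
        self.trace = []

    def calc_bounds(self):
        # Wrapping use: do something before and after the base call.
        self.trace.append("enter")
        try:
            return super().calc_bounds()
        finally:
            self.trace.append("exit")


labelled = LabelledNode()
traced = TracedNode()
traced_result = traced.calc_bounds()
```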

Call never

The base function doesn’t need to be called at all. This setup is used when a derived class is allowed to fully override the behaviour of the base. It’s often used by derived classes that implement an optimized form of the function.

Abstract functions could perhaps be viewed as this form. I consider them distinct since if the base class has no code in the function it’s really irrelevant when, or if, it’s called.
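As an illustration of the optimized-override case, here is a Python sketch. Polygon and Rectangle are hypothetical classes: the base computes area with the general shoelace formula, and the derived class replaces it outright with a cheaper calculation that gives the same answer.

```python
class Polygon:
    def __init__(self, points):
        self.points = points

    def area(self):
        # General shoelace formula over all vertices.
        total = 0
        n = len(self.points)
        for i in range(n):
            x1, y1 = self.points[i]
            x2, y2 = self.points[(i + 1) % n]
            total += x1 * y2 - x2 * y1
        return abs(total) / 2


class Rectangle(Polygon):
    def __init__(self, width, height):
        super().__init__([(0, 0), (width, 0), (width, height), (0, height)])
        self.width = width
        self.height = height

    def area(self):
        # "Call never": a full replacement, the base version is not needed.
        return self.width * self.height


rect = Rectangle(3, 4)
```

Calling the base version here would be harmless but pointless, which is why this contract rarely causes trouble.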

The problem contracts

I find “call whenever” to be relatively resistant to errors. The derived and base class behavior is tightly coupled, thus the processing tends to noticeably fail if the base class is not called, or called at the wrong time. Possibly it’s less error prone since using this contract requires a fairly good understanding of what the base class is doing and we’re simply more attentive while doing it.

The “call never” case is the least error prone. The base doesn’t care whether it’s called, so not calling it usually doesn’t lead to problems, though calling it redundantly sometimes can. A problem can arise, however, if the contract changes through the type hierarchy; often a “call never” becomes a “call whenever” for descendant classes.

The errors I’m referring to here are about using the “virtual” aspect of the function correctly: when the base is called. Saying a contract is less error prone refers only to that. The actual code in the derived class is subject to all the usual programming problems on its own, and nothing in the virtual contract could prevent those.

It’s the “call first” and “call last” virtuals that cause the most problems. “Call first” tends to be setup and it’s amazing how often code can generally work without being initialized correctly. Sometimes default values just happen to work, or something else initializes it. It’s often something subtle that is broken, something only a complex test case is able to detect.

“Call last” virtuals tend to be cleanup code, and forgetting to call the base often causes no noticeable side-effects. Over time this may lead to leaks: the program slowly runs out of memory or can no longer allocate resource handles.

Both of these cases can introduce a lot of subtle errors if the base is called at some other time. This often results in partial setup or partial destruction. Calling the base at the wrong time can be worse than forgetting to call it at all. Because most errors resulting from these scenarios are subtle, they are hard to debug.
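The wrong-time case is easy to reproduce. In this Python sketch (all class names are invented, and the lenient send that silently drops messages on a closed connection is an assumption made for the example), the only difference between the two derived classes is the position of the base call in destroy.

```python
class Connection:
    def open(self):
        self.sent = []
        self.is_open = True

    def send(self, msg):
        # Lenient by design: sending on a closed connection is ignored.
        if self.is_open:
            self.sent.append(msg)

    def destroy(self):
        self.is_open = False


class Buffered(Connection):
    def open(self):
        super().open()
        self.pending = ["bye"]

    def flush(self):
        for msg in self.pending:
            self.send(msg)
        self.pending = []


class CorrectClose(Buffered):
    def destroy(self):
        self.flush()
        super().destroy()    # base called last, as the contract expects


class WrongOrderClose(Buffered):
    def destroy(self):
        super().destroy()    # base called too early
        self.flush()         # messages are silently dropped


good = CorrectClose()
good.open()
good.destroy()

bad = WrongOrderClose()
bad.open()
bad.destroy()
```

Both objects end up closed with an empty pending list, so a casual check sees nothing wrong. Only the missing message in bad.sent reveals the partial destruction.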

Extending the syntax

I would like to see the syntax for virtuals extended to cover the common contracts. It’d help eliminate some frequent errors in object-oriented programs, and save a lot of debugging time.

Perhaps a trait on the virtual function:

virtual(extend_after) void init() { ... }

Where extend_after would cover the “call first” case, with a matching extend_before for the “call last” case. For these two attributes I’d say the call to the base is implied; it isn’t written explicitly in the derived function. Though we might want an extends keyword for clarity, since override doesn’t feel right for this case.

extends void init() {
    //no call to base.init
    //local init
}

Just covering those two cases would probably be enough. The “call whenever” and “call never” cases are less error prone and are covered by the basic virtual and override syntax.
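No language feature like this exists that I know of, but the implied-call behaviour can be approximated today. The following Python sketch is only one possible emulation, not the proposed syntax: Extendable, run_init and run_destroy are hypothetical names, and the driver methods walk the class hierarchy so that each class's own init or destroy runs in the right order without any explicit base call.

```python
class Extendable:
    def run_init(self):
        # Emulates extend_after ("call first"): base-most class runs first.
        for cls in reversed(type(self).__mro__):
            hook = cls.__dict__.get("init")
            if hook is not None:
                hook(self)

    def run_destroy(self):
        # Emulates extend_before ("call last"): most-derived class runs first.
        for cls in type(self).__mro__:
            hook = cls.__dict__.get("destroy")
            if hook is not None:
                hook(self)


class Base(Extendable):
    def init(self):
        self.events = ["base init"]

    def destroy(self):
        self.events.append("base destroy")


class Derived(Base):
    def init(self):
        # No call to the base init; the driver has already run it.
        self.events.append("derived init")

    def destroy(self):
        # No call to the base destroy; the driver runs it afterwards.
        self.events.append("derived destroy")


d = Derived()
d.run_init()
d.run_destroy()
```

The obvious weakness is that callers must remember to use run_init rather than init, which is part of why I'd rather have the contract expressed in the language itself.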

There is also one other, inverted, contract that commonly appears: when the base class code calls the virtual. I’ll cover this in a follow-up article, as it requires special treatment.
