Ideal Language

Why don’t we have runtime virtual dispatching?

Virtual functions are a great feature, but they have a rather severe limitation. They can only be implemented by extending the definition of all classes involved. A lot of operations on classes don’t need to be members, and sticking them in the class, just to use virtual dispatching, leads to interface pollution. I’m proposing here an out-of-class virtual dispatching syntax.

The basic problem

In my Leaf compiler code I have an object of type expression, which could one of several sub-types. I wish to do some processing on this object. Some of the sub-types, not necessarily all, require special processing. Essentially I want to pass my object to one of these functions, depending on its type:

void process( expression )
void process( expression_binary )
void process( expression_function )

There are two basic approaches to this dispatch: visitors and switches.

A visitor pattern

If we’re willing to retrofit the types involved we can create a visitor pattern. Add a accept function to each type in the class hierarchy and then build a visitor object.

class process_visitor : expression_visitor {
    void visit( expression )
    void visit( expression_binary )
    void visit( expression_function )

object.accept( process_visitor )

The problem here is that I haven’t avoided modifying the origin type. I need to add an accept function to the base and every derived class. If I forget to add the function to a newly derived class then it breaks.

It also requires some kind of expression_visitor interface to be defined. What exactly should that be? Usually it will contain a definition for all the subtypes of expression. But I don’t always want to override the processing for all expressions.

Due to its limited functionality and a lot of boilerplate code I tend not to use the pattern often.

A big old switch

The other common approach is a switch statement:

switch( type_of(object) ) {
    case t_binary: ...
    case t_function: ...
    default: ...

This requires some way to identify the types, either natively in the language, or with a custom type id. Additionally, the switch statements I’ve seen don’t deal well with a value hierarchy. We can’t easily select a sub-tree of the type, instead we’re forced to itemize every leaf type. For this reason a series of if statements is often used instead.

if( object instanceof expression_binary ) ) {
} else if( object instanceof expression_function ) ) {
} else {

This is a lot of repetitive coding, and it’s possible to get the order of the if statements wrong (like checking for a base type prior to a derived type).

Standalone virtual functions

Both approaches are just workarounds for what I really want: standalone virtual functions. I just want to express a series of functions with a common name that accept different types:

void process( virtual expression & ) = 0;
void process( virtual expression_binary & b );
void process( virtual expression_funccall & f );

The virtual bit expresses that dispatch to this overload is done based on the runtime type of the first parameter. I call this just as any other overloaded function:

process( object );

At run-time the dispatch will be done to the function best matching the type: following the overload rules as though it were a static dispatch with the most derived type.

This has several advantages over the other approaches:

  • it is very compact without repetitive boilerplate code
  • the original types are left unmodified
  • it can dispatch over a subset of all the derived types
  • it doesn’t require a user-defined type identifier
  • it can deal with dispatching to mid-level types, not just leaf types

This feature would be a great addition to any OOP language with virtual functions. It’s a simple way to extend classes without touching the classes themselves.

10 replies »

  1. Nice! :-) In C# there is an awful pattern with Equals method, so I think it is useful to forbid cross-speciaization of the methods (foo(Int,Object) vs foo(Object,Int)) and provide multidispatch. Now I see it could be even more useful to use your approach as well, move “virtual” in front of the function to provide such multidispatch for functions as well.

    • I’m also thinking of adding a commutative attribute to functions in Leaf. If a function were marked commutative then calling it `foo(a,b)` would be the same as `foo(b,a)`. The compiler can do this transformation automatically saving you from creating both versions. It also gives the optimizer another opportunity to improve certain code paths.

    • It does not seem so –, quote on MD “of more than one of its arguments” vs Leaf “based on the runtime type of the first parameter.”

    • @macias – fair enough, but is it not basically a special case thereof? With only one argument, one may consider moving the ‘process’ method into expression and override that method in subtypes as needed. If the ‘process’ function belongs elsewhere because of orthogonality, I agree I am not a big fan of visitor for the reasons edaqa outlined. In that case the functionality he describes sounds reaonable. I wonder if a lookup table would work too or if that would be to hacky. i.e. something like table.lookup(expression).process();

    • I think multiple dispatch more focuses on the idea that multiple arguments determine the function to use. My idea is about using the dynamic type — though several of the examples in the article do use dynamic type. The concepts may be orthogonal and it may be sensical to have both in a language.

    • On reflection, I agree it’s somewhat orthogonal if the language does not support function overloading. For example in a dynamic language like javascript, python, or ruby, you can’t have multiple functions with the same name in the same namespace. But if you support function overloading, this would effectively be a proper subset of multiple dispatch aka multimethods, or so it seems to me…

    • (oops – I made a silly mistake with respect to overloading since it does not use the runtime type of the underlying object for matching).

  2. And that would be implemented like this:

    // on global init:
    SomeMapType _virtuals_for_process;

    // on module init:
    _virtuals_for_process.insert(expression_binary.class, process__R17expression_binary);
    _virtuals_for_process.insert(expression_funccall.class, process__R19expression_funccall);

    // on use:

    Right? That’s actually a rather nice codegen to have in one’s compiler.

Leave a Reply to macias Cancel reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s