Using macros to simplify type visitors and enums

I like using enums and type hierarchies. What I don’t like is writing verbose switch statements to use them. In C++ I use macros to significantly reduce the burden of creating visitors. In this article I’ll show how I applied this technique to my Leaf source code.

The direct method

In Leaf I have an expression class hierarchy. Each of the types is identified with an enum value. Originally I had an enum like this (stripped for brevity):

class expression {
    enum form_t {
        f_binary,
        f_funccall,
        f_transform,
    };

    ...
};

I have several visitors in my code. They follow a pattern like this:

switch( expr.get_form() ) {
    // each case downcasts to the concrete type and calls its own function
    case expression::f_binary:
        return call_binary( static_cast<expression_binary&>(expr) );
    case expression::f_funccall:
        return call_funccall( static_cast<expression_funccall&>(expr) );
    case expression::f_transform:
        return call_transform( static_cast<expression_transform&>(expr) );
}

This approach has two problems with it:

  1. I could accidentally omit a case in that switch statement. GCC can warn about this with -Wswitch (enabled by -Wall), but warnings aren’t guarantees, and this option isn’t available in all compilers or languages.
  2. For enums with many more values, like I have, this extra code becomes highly redundant. The meaning is obscured and the chance of making a typo increases.

The macro approach

What I want is a way to add a new enum value and automatically have all my switches updated and compiler errors produced if I’m missing an appropriate function.

Instead of coding the enum values directly, I create a macro:

#define EXPRESSION_FORMS \
    EXPRESSION_FORM(binary) \
    EXPRESSION_FORM(funccall) \
    EXPRESSION_FORM(transform)

This is essentially just a list of the enum value names. At this point I don’t define the single EXPRESSION_FORM; it becomes the callback that each user of the list supplies.

My enum statement gets converted to this:

    enum form_t {
        #define EXPRESSION_FORM(form) f_##form,
        EXPRESSION_FORMS
        #undef EXPRESSION_FORM
    };

And my switch statement becomes:

switch( expr.get_form() ) {
    #define EXPRESSION_FORM(form) \
        case expression::f_##form: \
            return call_##form( static_cast<expression_##form&>(expr) );
    EXPRESSION_FORMS
    #undef EXPRESSION_FORM
}

Deferring the definition of EXPRESSION_FORM lets me create different types of code based on the same listing of values.

This code has some clear advantages over the previous approach:

  1. Adding a new enum value is easy. I just update the EXPRESSION_FORMS list, and my enum and all my switch statements automatically include the new value.
  2. If I forget a visitor function the compiler produces an error, refusing to compile.
  3. It forces a standard naming convention onto my enum, types, and functions. The association between the various components is very clear.
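
To see the pieces working together, here is a minimal self-contained sketch rather than the actual Leaf code. The empty node types, the describe function, and the bodies of the call_ functions are assumptions made up for illustration; only the list macro, the enum, and the switch follow the fragments above.

```cpp
#include <cassert>
#include <string>

// The single list of forms; each user of the list supplies EXPRESSION_FORM.
#define EXPRESSION_FORMS \
    EXPRESSION_FORM(binary) \
    EXPRESSION_FORM(funccall) \
    EXPRESSION_FORM(transform)

class expression {
public:
    enum form_t {
        #define EXPRESSION_FORM(form) f_##form,
        EXPRESSION_FORMS
        #undef EXPRESSION_FORM
    };

    explicit expression( form_t form ) : form_(form) {}
    virtual ~expression() = default;
    form_t get_form() const { return form_; }

private:
    form_t form_;
};

// Hypothetical node types: one empty subclass per form, generated from the list.
#define EXPRESSION_FORM(form) \
    struct expression_##form : expression { \
        expression_##form() : expression(f_##form) {} \
    };
EXPRESSION_FORMS
#undef EXPRESSION_FORM

// Hypothetical visitor functions, written by hand. Removing any one of them
// makes describe() below fail to compile.
std::string call_binary( expression_binary const & ) { return "binary node"; }
std::string call_funccall( expression_funccall const & ) { return "function call node"; }
std::string call_transform( expression_transform const & ) { return "transform node"; }

// Dispatches on the form; the cases are generated from the list.
std::string describe( expression const & expr ) {
    switch( expr.get_form() ) {
        #define EXPRESSION_FORM(form) \
            case expression::f_##form: \
                return call_##form( static_cast<expression_##form const &>(expr) );
        EXPRESSION_FORMS
        #undef EXPRESSION_FORM
    }
    assert( false && "form_t value outside EXPRESSION_FORMS" );
    return std::string();
}
```

Adding EXPRESSION_FORM(literal) to the list in this sketch would extend the enum, generate expression_literal, and add a case to describe, which then refuses to compile until a call_literal function exists.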

Not just visitors

I use the pattern for more than just visitor function dispatching. For example, now I can easily program a missing feature in C++: converting enums into names.

char const * name_of( expression::form_t f ) {
    switch( f ) {
        #define EXPRESSION_FORM(tag) case expression::f_##tag: return #tag;
        EXPRESSION_FORMS
        #undef EXPRESSION_FORM
    }
    return "invalid";
}
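
The same list can drive other generated code too. As a hedged sketch, assuming the enum is declared as shown earlier, the block below counts the entries and does the reverse lookup from a name to an enum value. The names form_count and form_from_name are hypothetical and not part of Leaf.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

#define EXPRESSION_FORMS \
    EXPRESSION_FORM(binary) \
    EXPRESSION_FORM(funccall) \
    EXPRESSION_FORM(transform)

class expression {
public:
    enum form_t {
        #define EXPRESSION_FORM(form) f_##form,
        EXPRESSION_FORMS
        #undef EXPRESSION_FORM
    };
};

// Number of forms: each list entry expands to "+ 1", so this is 0 + 1 + 1 + 1.
#define EXPRESSION_FORM(form) + 1
constexpr std::size_t form_count = 0 EXPRESSION_FORMS;
#undef EXPRESSION_FORM

// Reverse of name_of: look up a form by its name. Returns false when the
// name matches no list entry, leaving `out` untouched.
bool form_from_name( char const * name, expression::form_t & out ) {
    assert( name != nullptr );
    #define EXPRESSION_FORM(form) \
        if( std::strcmp( name, #form ) == 0 ) { out = expression::f_##form; return true; }
    EXPRESSION_FORMS
    #undef EXPRESSION_FORM
    return false;
}
```

The lookup is a linear chain of string comparisons, which is fine for a short list; a longer list might generate a table instead.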

For strict object-oriented programming it also becomes beneficial. If you don’t like using the macros directly it’s easy to convert to a virtual member based visitor class. This way I can create a visitor base and simply override the functions for the types of interest. I do this in my Leaf code as well, creating an expression walker. It looks something like this:

struct expression_transform {
    shared_ptr<expression> apply_expression( shared_ptr<expression> in ) {
        switch( in->form ) {
            #define EXPRESSION_FORM(form) \
                case expression::f_##form: \
                    return apply_##form( std::static_pointer_cast<expression_##form>(in) );
            EXPRESSION_FORMS
            #undef EXPRESSION_FORM
        }
        return in; // not reached for valid forms; avoids falling off the end
    }

protected:
    #define EXPRESSION_FORM(form) \
        virtual shared_ptr<expression> apply_##form( shared_ptr<expression_##form> in ) { return in; }
    EXPRESSION_FORMS
    #undef EXPRESSION_FORM
};

A derived class overrides just the functions of interest, for example:

struct sample_transform : public expression_transform {
protected:
    shared_ptr<expression> apply_binary( shared_ptr<expression_binary> in ) override;
    shared_ptr<expression> apply_funccall( shared_ptr<expression_funccall> in ) override;
};
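
For a version that compiles on its own, here is a rough sketch under a few assumptions. The base class is called expression_walker here so that it does not clash with the expression_transform node type, the node types are empty placeholders, and binary_to_funccall is a made-up derived walker. None of these are the actual Leaf definitions.

```cpp
#include <cassert>
#include <memory>

#define EXPRESSION_FORMS \
    EXPRESSION_FORM(binary) \
    EXPRESSION_FORM(funccall) \
    EXPRESSION_FORM(transform)

struct expression {
    enum form_t {
        #define EXPRESSION_FORM(form) f_##form,
        EXPRESSION_FORMS
        #undef EXPRESSION_FORM
    };

    explicit expression( form_t f ) : form(f) {}
    virtual ~expression() = default;
    form_t form;
};

// Placeholder node types, one per form.
#define EXPRESSION_FORM(form) \
    struct expression_##form : expression { \
        expression_##form() : expression(f_##form) {} \
    };
EXPRESSION_FORMS
#undef EXPRESSION_FORM

struct expression_walker {
    virtual ~expression_walker() = default;

    // Dispatches to the apply_ function matching the node's form.
    std::shared_ptr<expression> apply_expression( std::shared_ptr<expression> in ) {
        switch( in->form ) {
            #define EXPRESSION_FORM(form) \
                case expression::f_##form: \
                    return apply_##form( std::static_pointer_cast<expression_##form>(in) );
            EXPRESSION_FORMS
            #undef EXPRESSION_FORM
        }
        assert( false && "form value outside EXPRESSION_FORMS" );
        return in;
    }

protected:
    // Default behaviour for every form: return the node unchanged.
    #define EXPRESSION_FORM(form) \
        virtual std::shared_ptr<expression> apply_##form( std::shared_ptr<expression_##form> in ) { return in; }
    EXPRESSION_FORMS
    #undef EXPRESSION_FORM
};

// Hypothetical derived walker: replaces each binary node with a funccall node
// and leaves every other form alone.
struct binary_to_funccall : expression_walker {
protected:
    std::shared_ptr<expression> apply_binary( std::shared_ptr<expression_binary> ) override {
        return std::make_shared<expression_funccall>();
    }
};
```

Note that the overrides return shared_ptr<expression>, matching the base class. C++ allows covariant return types only for raw pointers and references, so an override returning shared_ptr<expression_binary> would not compile.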

Once you get used to using list macros you’ll find all sorts of nice ways to use them.

No Macros?

The major problem with the macro approach is that it requires macros, and not all languages have them. I have the same situation in C# on my game Radial Blitz, but have been unable to find a good solution. I find this very unfortunate. All that redundant code just buries the semantics and does not add value to the project.

In some dynamic languages there is another option: convert the enum value to a string, append it to a common function prefix, and call the resulting function by name. This approach provides no guarantee that I haven’t missed a function; I’ll only find out at runtime when an exception is thrown. That is a property of dynamic languages as a whole, rather than of this isolated scenario.

I’m still a firm believer in macros. Sure, they can be abused, but often, like here, they make the code safer, more readable, and easier to maintain. I have yet to see a language with strong enough syntax and semantic constructs to eliminate the need for macros. At the least they could provide a built-in visitor syntax; perhaps I’ll expand on that idea later.

6 replies »

    • I was also going to mention x-macros.

      I encountered them only once in production code and I hated them thoroughly. I needed to find a bug in code hidden behind some layers of x-macros based on a 10+ column table defining event names, callback functions, etc. and all I had was an RS-232 based terminal and source code. I did not even have a debugger (it was an embedded system).

      I do not mean to say that x-macros are bad. They can be quite elegant tools for dealing with lacking language support. They can also be abused easily (but so can templates).

    • Like all language features they can be abused.

      One of my purposes is to reduce defects: it’s far easier to get the switch/visitor/names wrong without the use of this pattern. I’m always open to other options, but I can see nothing more elegant for this scenario.

  1. This is very interesting albeit I find the syntax to be rather opaque for me – I’m not terribly familiar with macros. It would be nice if a feature like this existed baked into Leaf such that you could do this sort of thing with a more readable syntax… Possible?

    • I’m hoping that in Leaf a lot of the need for macros goes away. This visitor pattern is something that could be done with a more natural syntax. There are lots of options of course. Meta-programming in general is a very useful pattern, so I think a language should have lots of such features.
