The Life of a Programmer

Preprocessor Case Study: Message Dispatching

Dispatching is one of those areas where I almost always rely on the preprocessor. I have done this in a variety of languages using a variety of techniques and have found nothing simpler than using C or C++’s macro system.

By dispatching I mean you have a series of messages and you have to handle them. Each message is associated with some identifier and has a generic header. Messages will be posted to your code, either directly, from another thread, or over the network. This isn’t limited to messages: it applies equally to commands, queries, or any other identifier-to-handler mapping.

This article is a generic case study of this situation.

The Functions

If you are trying to handle a lot of messages you will at some point need to dispatch those messages to an appropriate handler function. I always end up with a bunch of functions with signatures like those below (usually in a class, but sometimes standalone).

[sourcecode language=”cpp”]
bool process_add( msg_add const & msg, context & ctx );
bool process_delete( msg_delete const & msg, context & ctx );
bool process_modify( msg_modify const & msg, context & ctx );

[/sourcecode]

These process functions are repeated for however many messages I need to process: one per message. I don’t like this. What is really just a listing of messages is obscured in the syntax. It involves a lot of redundancy, which is always bad. So I use the preprocessor to make it look like this:

[sourcecode language=”cpp”]
#define MESSAGE( type ) bool process_##type( msg_##type const & msg, context & ctx );
MESSAGE(add)
MESSAGE(delete)
MESSAGE(modify)

#undef MESSAGE
[/sourcecode]

This code is a lot simpler. You can immediately see that all these functions are related and serve the same purpose. It is also easier to maintain (in case I need to change the function signature).

Since the body of these functions is not defined I need to repeat the signature again later. When I implement the function I use a similar macro.

[sourcecode language=”cpp”]
#define MESSAGE( type ) bool class_name::process_##type( msg_##type const & msg, context & ctx )

MESSAGE(add)
{
//body of add handling
}
[/sourcecode]

This matches the original declaration exactly. It also lets me add the class name (if these are part of a class). A further benefit is that it enforces a naming requirement on the parameters to the function. I find it horrible when dealing with dispatch code where each function has a different name for each of the common parameters!
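As a minimal sketch of the declaration and definition macros working together: the `msg_add`, `msg_delete` and `context` types below are invented purely for illustration, and the handlers are standalone functions rather than class members. Note that `##` is the token-pasting operator, so `MESSAGE(add)` produces `process_add` and `msg_add`.

```cpp
#include <cassert>

// Hypothetical message and context types, invented for this sketch.
struct msg_add    { int value; };
struct msg_delete { int value; };
struct context    { int total = 0; };

// Declarations: one line per message.
#define MESSAGE( type ) bool process_##type( msg_##type const & msg, context & ctx );
MESSAGE(add)
MESSAGE(delete)
#undef MESSAGE

// Definitions: the macro supplies the signature, the braces supply the body.
#define MESSAGE( type ) bool process_##type( msg_##type const & msg, context & ctx )

MESSAGE(add)
{
    ctx.total += msg.value;
    return true;
}

MESSAGE(delete)
{
    // Refuse to remove more than has been added.
    if( msg.value > ctx.total )
        return false;
    ctx.total -= msg.value;
    return true;
}

#undef MESSAGE

// Usage sketch: once expanded, the handlers are ordinary functions.
void example()
{
    context ctx;
    bool ok = process_add( msg_add{ 5 }, ctx );
    assert( ok && ctx.total == 5 );
}
```

Because every handler is generated from the same macro, the parameters are always called `msg` and `ctx`, which is what lets the bodies above use those names without declaring them.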

The Dispatch

Somewhere in the code I will have a “post” function which receives messages to be processed. In this function I must dispatch to the appropriate handler. I tend to use a switch statement here. A single case statement looks something like this:

[sourcecode language=”cpp”]
switch( msg.id )
{
    case msgid_add:
        result = process_add( static_cast<msg_add const &>(msg), ctx );
        break;

    // ...one case per message
}
[/sourcecode]

In such a form the intent of the code is entirely lost in the syntax. It is also very tedious to have to repeat this code for all the messages. And this is a simple example; often I need a few more lines, such as to deserialize the message. Again I use the preprocessor to remove the repetition.

[sourcecode language=”cpp”]
switch( msg.id )
{
    #define MESSAGE(type) \
        case msgid_##type: \
            result = process_##type( static_cast<msg_##type const &>(msg), ctx ); \
            break;

    MESSAGE(add)
    MESSAGE(delete)
    MESSAGE(modify)

    #undef MESSAGE
}
[/sourcecode]

You may note that the MESSAGE list here is exactly the same as the initial function declarations. This means even this code could be shared by further use of the preprocessor. I don’t always do this, but in a few cases I do — where it becomes really inconvenient to manage each list, or when the list is in additional places. It is done something like this:

[sourcecode language=”cpp”]
#define ALL_MESSAGES \
    MESSAGE(add) \
    MESSAGE(delete) \
    MESSAGE(modify)
[/sourcecode]

Then use ALL_MESSAGES in place of a list of messages. This works because the MESSAGE macro named inside it is not expanded when ALL_MESSAGES is defined, only when ALL_MESSAGES is actually used. Each use can therefore be preceded by a different definition of MESSAGE.
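To make that concrete, here is a rough sketch in which one ALL_MESSAGES list is expanded four times, each time with a different MESSAGE definition. The `message` header, the `context` fields, the `post` function body and the trivial handler bodies are all assumptions made up for the example; real handlers would of course differ per message.

```cpp
#include <cassert>

// The single list of messages. MESSAGE is deliberately undefined at this point.
#define ALL_MESSAGES \
    MESSAGE(add) \
    MESSAGE(delete) \
    MESSAGE(modify)

// Hypothetical generic header and context, invented for this sketch.
struct message { int id; };
struct context { int handled = 0; int last_id = -1; };

// Expansion 1: the identifiers.
#define MESSAGE( type ) msgid_##type,
enum msgid { ALL_MESSAGES };
#undef MESSAGE

// Expansion 2: one message type per entry, each setting its own id.
#define MESSAGE( type ) struct msg_##type : message { msg_##type() { id = msgid_##type; } };
ALL_MESSAGES
#undef MESSAGE

// Expansion 3: placeholder handlers that only record what they saw.
#define MESSAGE( type ) \
    bool process_##type( msg_##type const & msg, context & ctx ) \
    { ctx.last_id = msg.id; ++ctx.handled; return true; }
ALL_MESSAGES
#undef MESSAGE

// Expansion 4: the dispatch switch.
bool post( message const & msg, context & ctx )
{
    switch( msg.id )
    {
#define MESSAGE( type ) \
        case msgid_##type: \
            return process_##type( static_cast<msg_##type const &>(msg), ctx );
        ALL_MESSAGES
#undef MESSAGE
    }
    return false; // unknown id
}

// Usage sketch: a posted msg_modify reaches process_modify.
void example()
{
    context ctx;
    msg_modify m;
    bool ok = post( m, ctx );
    assert( ok && ctx.last_id == msgid_modify );
}
```

Adding a fourth message in this sketch means adding one line to ALL_MESSAGES; the id, the type, the handler declaration and the switch case all follow from it.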

Inferior Alternatives

Here I’ll look at some of the alternatives I’ve tried before, and briefly explain why they are not as good.

Function Map

Creating a map of functions and message ids is probably the standard approach. Where possible I tend to take this approach, but it has a few issues. Either you have a dispatcher with a “register” function or you directly create a map of the functions. Dispatch is then simply looking up the id in the map and calling the function. This still ends up with a lot of redundant syntax, and it does absolutely nothing to remove the repetition from the function signatures. So I end up using macros anyway.

[sourcecode language=”cpp”]
map[msgid_add] = &process_add;
map[msgid_delete] = &process_delete;
map[msgid_modify] = &process_modify;

[/sourcecode]

The redundancy is only one problem, and there is a more significant issue. This type of map requires that all functions take the same parameter types. None of my functions do, however. Each takes a message parameter whose type is the specific message it processes, and you can’t store such functions directly in one map. In C++ you could create a template wrapper function that does the casting, but now you’re talking about further ballooning the syntax, leading you right back to the preprocessor.
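For reference, the template wrapper idea might look roughly like the following. This is only a sketch under assumptions: the `message` header, the `value` fields, and the names `handler`, `cast_and_call`, `make_dispatch_map` and `post` are all invented here, and `std::map` stands in for whatever container the dispatcher would really use.

```cpp
#include <cassert>
#include <map>

// Hypothetical types, invented for this sketch.
struct message { int id; };
struct context { int total = 0; };
enum msgid { msgid_add, msgid_delete };
struct msg_add    : message { int value; };
struct msg_delete : message { int value; };

bool process_add( msg_add const & msg, context & ctx )       { ctx.total += msg.value; return true; }
bool process_delete( msg_delete const & msg, context & ctx ) { ctx.total -= msg.value; return true; }

// Every map entry must share this one signature.
using handler = bool (*)( message const &, context & );

// Wrapper that restores the concrete message type before calling the real handler.
template< typename Msg, bool (*Func)( Msg const &, context & ) >
bool cast_and_call( message const & msg, context & ctx )
{
    return Func( static_cast<Msg const &>(msg), ctx );
}

std::map<int, handler> make_dispatch_map()
{
    std::map<int, handler> table;
    // Each entry still names the message twice, plus the id.
    table[msgid_add]    = &cast_and_call<msg_add, &process_add>;
    table[msgid_delete] = &cast_and_call<msg_delete, &process_delete>;
    return table;
}

bool post( std::map<int, handler> const & table, message const & msg, context & ctx )
{
    auto it = table.find( msg.id );
    return it != table.end() && it->second( msg, ctx );
}

// Usage sketch: look up the id, then call through the wrapper.
void example()
{
    context ctx;
    msg_add a;
    a.id = msgid_add;
    a.value = 7;
    bool ok = post( make_dispatch_map(), a, ctx );
    assert( ok && ctx.total == 7 );
}
```

It works, but each registration line repeats the message name three times, which is exactly the kind of repetition a MESSAGE macro would be brought in to hide.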

My previous “switch” approach also has another advantage in being a bit more flexible. While usually not the case, sometimes you deal with messages which are not homogeneous. For example, of 50 messages 45 of them share a common format, 4 share another format, and 1 is just odd. When using a “switch” statement it is really no problem to add custom handling for these other formats.
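A small sketch of that flexibility, assuming two messages in the common format and one hypothetical `msgid_legacy` message that has no typed struct and is handled from the raw header. All names here are invented for the example.

```cpp
#include <cassert>

// Hypothetical types, invented for this sketch.
struct message { int id; };
struct context { int common = 0; int odd = 0; };
enum msgid { msgid_add, msgid_modify, msgid_legacy };

struct msg_add    : message {};
struct msg_modify : message {};

bool process_add( msg_add const &, context & ctx )       { ++ctx.common; return true; }
bool process_modify( msg_modify const &, context & ctx ) { ++ctx.common; return true; }

// The odd one out: takes the generic header rather than a typed message.
bool process_legacy( message const &, context & ctx )    { ++ctx.odd; return true; }

bool post( message const & msg, context & ctx )
{
    switch( msg.id )
    {
        // Common format: generated cases.
#define MESSAGE( type ) \
        case msgid_##type: \
            return process_##type( static_cast<msg_##type const &>(msg), ctx );
        MESSAGE(add)
        MESSAGE(modify)
#undef MESSAGE

        // Odd format: a hand-written case sits beside the generated ones.
        case msgid_legacy:
            return process_legacy( msg, ctx );
    }
    return false; // unknown id
}

// Usage sketch: both kinds of message go through the same post function.
void example()
{
    context ctx;
    msg_add a;
    a.id = msgid_add;
    message l;
    l.id = msgid_legacy;
    bool ok = post( a, ctx ) && post( l, ctx );
    assert( ok && ctx.common == 1 && ctx.odd == 1 );
}
```

A second group of messages sharing another format could get its own macro in the same switch, defined and undefined around its own list.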

Reflection

My examples rely on messages, their ids, and the handler functions to follow a specific naming convention. Since this is already the case we could make use of reflection to find the appropriate function. We could even cast the message to the correct type if we want. In pseudo-code this might look like:

[sourcecode language=”cpp”]
Function func = lookup_func( "process_" ++ Name(msgid) )
Class msg_type = lookup_class( "msg_" ++ Name(msgid) )
bool result = func( msg_type.cast( msg ), ctx );
[/sourcecode]

A significant issue here is figuring out how you can call “func” with a variety of different message types. That is, is there even a language where the above would work? Usually I’ve had to resort to not casting the message type, or finding an alternate solution.

Even if you get it to work, this solution still has the drawback of doing nothing to reduce the function signature syntax. It also can’t easily deal with messages of different formats. You could use a series of if statements, but then the syntax starts to become bulky again.

Conclusion

The reason I always end up using the preprocessor for this situation is that I don’t see a better alternative. A lot of languages don’t offer a preprocessor, which I find depressing, mainly since they don’t offer a better way of writing this type of code. I should be clear: I don’t really want to use the preprocessor, but it offers the clearest and least redundant solution. If you know a language with a more elegant solution, or equally clear without macros, please let me know.

But for now I must conclude that for this type of dispatching, a preprocessor offers the best solution. It is type safe. It has a low level of redundancy. It is compact. It is flexible. It is simple. It is clear.

Please join me on Discord to discuss, or ping me on Mastodon.
