The Life of a Programmer

The curse of varargs

Unexpectedly, I needed to add support for variable argument functions to Leaf long before I intended to. I just wanted to print out a value, but it turns out that the standard C library has no function to format a floating point number as a string, other than the printf family of functions. printf of course is a “varargs” function: an antiquated, and unsafe, mechanism for passing variable arguments to a function.
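As a rough sketch of what that looks like from the C side, the following wraps snprintf to turn a double into a string. The helper name format_double and the choice of %g are my own assumptions for illustration, not anything Leaf actually uses.

```c
#include <stdio.h>

/* Hypothetical helper: format a double into a caller-supplied buffer.
   Returns what snprintf returns: the number of characters the full
   result needs (excluding the terminator), or a negative value on error. */
int format_double(char *buf, size_t size, double value)
{
    /* %g chooses between fixed and exponent notation, whichever is shorter. */
    return snprintf(buf, size, "%g", value);
}
```

Calling format_double(buf, sizeof buf, 0.5) on a 32 byte buffer leaves the text 0.5 in buf. Even this tiny wrapper has to go through a variable argument function to do its work.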

I also considered migrating or writing my own floating point formatter. It didn’t take much research to realize that, contrary to my expectations, this is not a trivial thing to do! Besides, wherever possible in Leaf I prefer to use existing implementations, such as the standard C library, to cut down on implementation time.

What is varargs?

A C function can have a trailing ... marker indicating it accepts a variable list of arguments. We gain access to these parameters using the va_ macros from stdarg.h (historically called varargs.h, thus the likely source of the common short name “varargs”).
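A minimal sketch of such a function, using the va_ macros from stdarg.h. The name sum_ints and the convention of passing the count as the first parameter are assumptions made for this example.

```c
#include <stdarg.h>

/* Hypothetical example: add up `count` int arguments passed through `...`.
   The function trusts the caller that `count` matches what was passed. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);              /* start after the last named parameter */
    for (int i = 0; i < count; i++)
        total += va_arg(args, int);     /* we must state the type ourselves */
    va_end(args);

    return total;
}
```

A call like sum_ints(3, 1, 2, 3) returns 6. Nothing in the language checks that three integers really follow the count.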

Being an old C-level feature, these variable arguments are part of the system’s ABI. Libraries can expose functions with variable arguments and there is a way to call them from other languages using any compiler. For “varargs” this means a very specific format for arguments on the stack.

A safety issue

Unfortunately in C there are no variant types, nor any kind of runtime type information. There is no intrinsic way for a function receiving ... arguments to know any of:

  • the number of arguments it has been passed;
  • the types of those arguments; or
  • the location of those arguments on the stack.

The function must have some other way to get this information. For printf this comes from the format string. If we pass %d %f it knows to pull an integer and a float off the stack.

Well, not quite. The C compiler doesn’t care about the format string. It needs to produce the ... arguments without knowing what type they are supposed to be. It has standard rules for how to put the various types on the stack, such as small integer types being promoted to int and float becoming double.
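To make those promotion rules concrete, here is a small sketch. The function name read_promoted and its unused leading parameter are invented for the example. The caller passes a char and a float, but the receiver has to ask for an int and a double.

```c
#include <stdarg.h>

/* Hypothetical example: the caller passes a char followed by a float.
   By the time they arrive through `...` they are an int and a double,
   so those are the types that must be named in va_arg. */
double read_promoted(int unused, ...)
{
    va_list args;
    (void)unused;                        /* only needed as the anchor for va_start */

    va_start(args, unused);
    int small = va_arg(args, int);       /* char was promoted to int */
    double real = va_arg(args, double);  /* float was promoted to double */
    va_end(args);

    return small + real;
}
```

With char c = 2 and float f = 0.5f, the call read_promoted(0, c, f) returns 2.5. Asking for va_arg(args, char) or va_arg(args, float) instead would be undefined behaviour.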

This leads to a significant safety issue. If I put %d in the format string, but pass a floating point value, the result will be wrong. The compiler sees the floating point argument and pushes that on the stack, yet the receiving function attempts to get an integer since that is what the format string says is there.

Mismatches result in undefined behaviour. This may be a completely wrong value. It could be a bounds error if there are too few arguments. But the worst is perhaps when small values are formatted correctly but large ones are wrong, thus the mismatch escapes casual testing. Given that there are 20 cryptic single character type specifiers along with 8 cryptic length modifiers it’s almost a guarantee that mismatches will occur.

Some compilers, GCC at least, have special printf support. They emit warnings when the arguments do not match the types expected by the format string. This helps for printf, but does nothing for the general use of variable argument functions with ....

Any function that uses ... has this safety issue to contend with. It needs some way to know what is on the stack, and has to just hope that the caller has properly matched their arguments.
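One common way to carry that information, other than a format string, is to pass a type tag before each value and a sentinel at the end. The sketch below assumes made-up tag names (ARG_END, ARG_INT, ARG_DOUBLE) and a made-up function sum_tagged. It is only an illustration of the idea, and it still relies entirely on the caller getting the tags right.

```c
#include <stdarg.h>

/* Hypothetical tags: each value is preceded by a tag naming its type. */
enum { ARG_END = 0, ARG_INT = 1, ARG_DOUBLE = 2 };

/* Sum a list of (tag, value) pairs terminated by ARG_END. */
double sum_tagged(int tag, ...)
{
    va_list args;
    double total = 0.0;

    va_start(args, tag);
    while (tag != ARG_END) {
        if (tag == ARG_INT)
            total += va_arg(args, int);
        else if (tag == ARG_DOUBLE)
            total += va_arg(args, double);
        else
            break;                      /* unknown tag: stop rather than guess */
        tag = va_arg(args, int);        /* the next tag, or ARG_END */
    }
    va_end(args);

    return total;
}
```

A call such as sum_tagged(ARG_INT, 12, ARG_DOUBLE, 0.5, ARG_END) returns 12.5. If the caller writes ARG_INT but passes 0.5, or forgets the ARG_END, we are back to undefined behaviour.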

Nonetheless

Given the type safety issues, and potential for random crashes and other undefined behaviour, it’s not something I really want in a language like Leaf. I much prefer the type safe formatting options, like those from C++ Boost, or with a dynamic language like Python, where the number and types of the arguments are known intrinsically.

Nonetheless, I do want to format a floating point value in Leaf. Calling “varargs” functions is also a use-case that one would expect from a system programming language. This method need not be “the correct way” of doing variable argument functions in Leaf; I can leave that until later.

I decided that “varargs” is a special way of passing a tuple to a function. A tuple is converted to this special “varargs” type at the high level, and the lower level IR will unwrap it into the actual variable argument call.

For example, at a high level one might end up calling printf like this:

printf( abi_string("%d %f"), [ 12, 0.5 ] )

Of course, this is just a detail used within Leaf’s standard library. No user of Leaf would ever actually call such functions unless they themselves are linking directly to a C library using varargs.

