Embedding a text file in C++ code (with Scons)

I needed a way to embed a text file in my C++ code. I recently introduced Leaf code into the Leaf compiler and didn’t want a file dependency at runtime. I found a way to embed the file as a string, though it took me longer than expected.

Raw Strings

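I first looked at the tool xxd. Its -i option produces a C array constant containing the bytes of a file. It’d work, but I then recalled that C++11 added raw string literals. A std::string can hold any data, so it should be okay for a text file. One caveat: initializing a std::string directly from a literal stops at the first null byte, so this approach doesn’t suit data that contains them.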
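For comparison, here is a rough Python approximation of the kind of byte-array output that xxd -i produces. The function name bytes_to_c_array and the exact layout (12 bytes per line, a companion _len variable) are my assumptions for illustration; the real tool’s formatting may differ in detail.

```python
def bytes_to_c_array(var_name, data, per_line=12):
    """Render `data` (a bytes object) as a C unsigned char array declaration."""
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        # Each byte becomes a two-digit hex constant such as 0x68.
        lines.append('  ' + ', '.join('0x{:02x}'.format(b) for b in chunk))
    body = ',\n'.join(lines)
    return 'unsigned char {0}[] = {{\n{1}\n}};\nunsigned int {0}_len = {2};\n'.format(
        var_name, body, len(data))
```

The result is robust for any bytes but unreadable, which is part of why a raw string is more pleasant for text.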

Raw strings gave me a way to include the raw data. I have a header file that looks like this:

std::string dataBaseLeaf = R"~~~~(
    ... the file data
)~~~~";

For those unfamiliar with raw strings, the R"~~~~( opens the raw string. All characters up to the matching )~~~~" are taken as-is (no escape sequences). The ~~~~ is a delimiter I chose. We can choose almost any delimiter of up to 16 characters (no spaces, parentheses, or backslashes), but must choose one that doesn’t appear in the included data.
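The delimiter rules can be checked mechanically. Below is a minimal Python sketch, assuming a hypothetical helper named delimiter_is_safe. It is simplified: it checks the length limit and the forbidden characters I’m sure of, and it tests for the exact closing sequence rather than the bare delimiter, since only )delimiter" can actually end the literal early.

```python
# Characters that may not appear in a C++ raw string delimiter:
# space, parentheses, backslash, and whitespace control characters.
FORBIDDEN = set(' ()\\\t\v\f\n')
MAX_DELIMITER_LENGTH = 16


def delimiter_is_safe(delimiter, data):
    """Return True if `delimiter` is legal and cannot end the literal early."""
    if len(delimiter) > MAX_DELIMITER_LENGTH:
        return False
    if any(ch in FORBIDDEN for ch in delimiter):
        return False
    # The literal ends at the first `)delimiter"`, so that exact sequence
    # must not occur anywhere in the embedded data.
    return ')' + delimiter + '"' not in data
```

Requiring that the bare delimiter never appears in the data is stricter than necessary, but it is simpler to reason about and just as safe.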

I include this file in another C++ file and get the original file contents from the dataBaseLeaf string.

Scons Build

The tricky part was figuring out how to make this part of my build process. I’d like my header file to be updated whenever I modify the source file. I use Scons as a build tool, and despite having a lot of documentation, it is short on examples for common use cases.

I needed to create a generator; at least I think that’s the name for it. I use the Action function combined with a custom function to produce the C++ raw string from the input file. Let’s just look at the code:

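def RawStringIt(varName):
    # Scons calls Impl with lists of target and source nodes plus the environment.
    def Impl(target, source, env):
        # Read the whole source file as text.
        content = source[0].get_text_contents()
        # Write one C++ declaration that wraps the text in a raw string literal.
        with open(target[0].get_path(), 'w') as target_file:
            target_file.write('std::string {} = R"~~~~({})~~~~";'.format(varName, content))
        # Returning 0 tells Scons the action succeeded.
        return 0

    # The second argument is the message printed when the action runs.
    return Action(Impl, "creating C++ Raw String $TARGET from $SOURCE")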

The Impl closure does the actual transform. There isn’t much to it, but it took me a while to piece this together from various references. I’m still uncertain if I’m doing it correctly — it feels odd that there is a get_text_contents() for the source file, but no counterpart to write to the destination file. Action wraps the implementation for use in Scons as a production rule.

A cleverer implementation would ensure the ~~~~ delimiter doesn’t exist in the source file. If it does, it should pick a new one. The delimiter can be anything (up to 16 characters), and we don’t have to look at it, so a sequence of random digits is suitable. We could just keep trying until we find one that works (likely the first time).
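Here is a sketch of that retry idea in plain Python, independent of Scons. The names pick_delimiter and to_raw_string are hypothetical, and I use random letters and digits rather than digits only; this is one way it could be done, not what my build currently does.

```python
import random
import string


def pick_delimiter(content, length=8, rng=random):
    """Try random delimiters until one does not occur in `content`."""
    # Letters and digits are all legal delimiter characters; length must be <= 16.
    alphabet = string.ascii_letters + string.digits
    while True:
        delimiter = ''.join(rng.choice(alphabet) for _ in range(length))
        if delimiter not in content:
            return delimiter


def to_raw_string(var_name, content):
    """Build the C++ declaration using a delimiter that is safe for `content`."""
    delimiter = pick_delimiter(content)
    return 'std::string {0} = R"{1}({2}){1}";'.format(var_name, delimiter, content)
```

Inside Impl, the write call could then emit to_raw_string(varName, content) instead of the hard-coded ~~~~ version. One trade-off: a random delimiter changes the generated header on every rebuild even when the source text is the same.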

I use the RawStringIt function as follows:

base_leaf = env.Command( 'include/runner/base.leaf.hpp', 'runner/base.leaf', RawStringIt("dataBaseLeaf") )

This Command creates the base.leaf.hpp file, with a variable named dataBaseLeaf that contains the contents of base.leaf. Anytime I change the source file it rebuilds the raw string and the program that depends on it.

We can see the base_leaf variable as part of the dependency set here:

leaf = env.Program( 'leaf', [
    lib_lang, lib_parser, lib_util, lib_ir, lib_ir_llvm, lib_runner,
] )
env.Requires( leaf, [base_leaf, base_unit_test] )

