The Life of a Programmer

What is a closure?

You’ve heard the term, and you’ve probably even used them, but what exactly is a closure? It’s a combination of data and code that has become a staple of modern programming. Closures are a natural part of functional-style code, and they’re quite useful even if you don’t fully understand them. Let’s take a closer look and demystify this curious construct.

A simple example

We can start with a simple local form:

defn pine = -> {
    var x = 1

    defn incr = -> {
        print(x)
        x = x + 1
    }

    incr() // prints 1
    incr() // prints 2
}

Here we define a function pine. Inside that function we define another function called incr. The x variable, used inside incr, is not defined there: it’s one of the local variables of the pine function. The function incr is a closure of its code and the variables in the surrounding scope.

By calling incr twice we see that x persists between calls. Moreover, x is shared between the scopes:

    incr() // 1
    print(x) // 2
    x = 4
    incr() // 4

Both incr and the pine scope refer to the same x. This behaviour depends on the language. In Leaf, shown here, sharing is the default. In other languages, such as C++ when a lambda captures by value, the variable is cloned instead: the value is not shared, and a copy is taken when the function is created. For example:

defn pine = -> {
    var x = 5

    //a copy of `x` is made now
    defn incr = -> clone {
        print(x)
        x = x + 1
    }

    incr() // 5
    print(x) // still 5 (would be 6 if shared)
    x = 1
    incr() // 6 (would be 1 if shared)
    print(x) // 1
}

Subsequent calls to incr use the same x, but the one in pine is independent of it. This creates two different x variables. These are both types of closures as they combine code and a data scope. The results are quite different, so it’s important to know what type you’re dealing with.
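
For comparison, here is a minimal Python sketch of the same two behaviours. Python shares enclosing variables by default (assignment needs `nonlocal`), so the cloned variant is only imitated here: `x` is passed through a hypothetical helper, `make_incr`, whose parameter becomes a fresh variable. The function names and the `log` list, which stands in for the printed output, are assumptions made for illustration.

```python
def pine_shared():
    log = []          # records what the Leaf version would print
    x = 1

    def incr():
        nonlocal x    # refer to pine_shared's x rather than a new local
        log.append(x)
        x = x + 1

    incr()            # logs 1
    log.append(x)     # logs 2: the outer scope sees the increment
    x = 4
    incr()            # logs 4: the closure sees the assignment
    return log


def pine_cloned():
    log = []
    x = 5

    def make_incr(x):     # the parameter is a fresh x holding a copy
        def incr():
            nonlocal x    # refers to make_incr's x, not pine_cloned's
            log.append(x)
            x = x + 1
        return incr

    incr = make_incr(x)
    incr()            # logs 5
    log.append(x)     # logs 5: the outer x was not touched
    x = 1
    incr()            # logs 6: the closure kept its own copy
    log.append(x)     # logs 1
    return log


print(pine_shared())  # [1, 2, 4]
print(pine_cloned())  # [5, 5, 6, 1]
```

The two logs match the comments in the Leaf examples above, which is a quick way to check which kind of closure a language gives you.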

Higher-order functions

When we combine closures with higher-order functions we get interesting new possibilities. For example, the data scope for a closure can persist outside of the function that creates it.

defn addr = (x) -> {
    var f = (y) -> {
        return x + y
    }

    return f
}

var a5 = addr(5)
print( a5(1) ) // 6
print( a5(3) ) // 8

var a7 = addr(7)
print( a7(1) ) // 8
print( a7(3) ) // 10

addr(5) creates a function that adds 5 to a number. We aren’t calling that function from within addr, though; instead we assign it to the variable a5 and then call it. When addr(7) is called it creates a new environment: its x value is not the same as the one from a5.
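
The same idea can be sketched in Python, which also makes the separate environments visible. This is an illustration rather than a description of Leaf: `__closure__` and `cell_contents` are how CPython exposes a function's captured variables.

```python
def addr(x):
    def f(y):
        return x + y   # x comes from the enclosing call to addr
    return f


a5 = addr(5)
a7 = addr(7)

print(a5(1), a5(3))  # 6 8
print(a7(1), a7(3))  # 8 10

# Each returned function carries its own captured x in a "cell".
print(a5.__closure__[0].cell_contents)  # 5
print(a7.__closure__[0].cell_contents)  # 7
```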

A closure can be passed to a function that knows nothing about the surrounding scope.

defn key_sort( data : list｢any｣, key : string ) {
    sort( data, (a,b) -> {
        return a#key < b#key
    })
}

The sort function expects a comparator for two objects. It does not know how that comparison is done, or that it might be accessing data from the enclosing scope. This code demonstrates that the closure we pass to sort truly encloses both the code and the data; it’s not just some syntax trickery.
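
A rough Python equivalent is sketched below. It differs from the Leaf version in two ways that are assumptions of this sketch: Python's `sorted` returns a new list instead of sorting in place, and it takes a three-way comparator (wrapped with `functools.cmp_to_key`) rather than a less-than function. The sample `people` data is made up for the example.

```python
from functools import cmp_to_key


def key_sort(data, key):
    # The comparator closes over `key`; sorted() never sees that name.
    def compare(a, b):
        if a[key] < b[key]:
            return -1
        if a[key] > b[key]:
            return 1
        return 0

    return sorted(data, key=cmp_to_key(compare))


people = [
    {"name": "Cy", "age": 29},
    {"name": "Al", "age": 41},
    {"name": "Bo", "age": 35},
]

print([p["name"] for p in key_sort(people, "age")])   # ['Cy', 'Bo', 'Al']
print([p["name"] for p in key_sort(people, "name")])  # ['Al', 'Bo', 'Cy']
```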

A compiler may well opt to use trickery in many cases where closures are used. The most generic approach, one that works both when a closure is passed to a function like sort and when it is returned from the function that created it, has a bit of overhead. The simpler case, shown with incr earlier, can often be compiled without any runtime closure support.

A technical nit for the highly interested

The above explains basically what a closure is, and how it can be used. If you’re into language theory, you might not be satisfied with some details. If you’re not into language theory, feel free to stop reading now and go about having fun with closures.

Let’s go back to this code:

defn pine = -> {
    var x = 1

    defn incr = -> {
        print(x)
        x = x + 1
    }

    incr() // 1
    incr() // 2
}

I called incr the closure, which may not be technically true; this depends on the language. In Leaf it’s not yet a closure, it’s a function that takes a “context”. It’s not much different from this function in C:

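#include <stdio.h>

/* The local variables of pine, gathered into one context. */
typedef struct pine_context {
    int x;
} pine_context;

void incr( pine_context * pctx ) {
    printf( "%d\n", pctx->x );
    pctx->x++;
}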

It’s when the function is called, incr(), that a closure is made. The compiler sees that the function is expecting a pine_context environment; it finds one in the immediately enclosing scope, that of pine, which seems logical. It binds the context to the function and then makes the call.

This is one place where a compiler could instead use trickery. It avoids creating any closure object and instead calls the incr function with the pine_context. This optimization allows using local closures without overhead cost.

This binding also needs to happen if incr escapes the scope where it’s declared.

defn pine = -> {
    var x = 1

    defn incr = -> {
        print(x)
        x = x + 1
    }

    return incr
}

var q = pine()
q() // 1
q() // 2

When the compiler encounters the return statement, it realizes it needs to bind the incr function to the pine_context before it returns. After the return statement, there will be no further opportunity to find the context.
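
The two cases can be imitated by hand in Python. This is only a sketch of the idea, not of what the Leaf compiler emits: `PineContext`, `pine_local` and `pine_escaping` are hypothetical names, and `incr` returns the value instead of printing it so the result is easy to inspect.

```python
from functools import partial


class PineContext:
    """Stand-in for the local variables of pine."""

    def __init__(self):
        self.x = 1


def incr(ctx):
    # A plain function: it reaches x only through the context it is given.
    value = ctx.x
    ctx.x += 1
    return value


def pine_local():
    ctx = PineContext()
    # Local use: pass the context directly, no closure object is needed.
    return [incr(ctx), incr(ctx)]


def pine_escaping():
    ctx = PineContext()
    # incr leaves this scope, so the context is bound before returning.
    return partial(incr, ctx)


print(pine_local())  # [1, 2]

q = pine_escaping()
print(q(), q())      # 1 2
```

Here `partial(incr, ctx)` plays the role of the closure object: a function paired with the context it needs.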

Isn’t a closure just a class and instance in disguise? Good observation. Yes: there are usually syntactic differences, and some behavioural ones, but they both model the same relationship between data and code. In the Leaf compiler, they mostly share the same processing code.
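
To make the comparison concrete, here is the same counter sketched in Python as a class and as a closure. The names `Counter` and `make_counter` are assumptions for illustration.

```python
class Counter:
    def __init__(self, start):
        self.x = start        # data lives on the instance

    def incr(self):
        value = self.x
        self.x += 1
        return value


def make_counter(start):
    x = start                 # data lives in the enclosing scope

    def incr():
        nonlocal x
        value = x
        x += 1
        return value

    return incr


c = Counter(1)
q = make_counter(1)

print(c.incr(), c.incr())  # 1 2
print(q(), q())            # 1 2
```

In both versions a piece of code is paired with the state it operates on; only the syntax for reaching that state differs.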

Just the surfaces

Languages can behave quite differently with closures. Fundamentally, though, a closure is just a combination of a function and some data. Knowing whether the scope is shared or cloned is an important detail. So is knowing, where the language allows it (as C++ does), which parts of the context are shared and which are cloned.

I’ve strayed a bit into Leaf-specific behaviour, just to give an idea of the complexities involved. If you’d like to dive even deeper into how closures are implemented, then let me know.