Debugging a defect with a shared argument

While writing some error handling helpers, during my algorithm stream, I encountered a defect. It’s a tricky one dealing with the passing of shared arguments to functions in Leaf.

After a lot of searching I managed to reduce it to this minimal unit test.

/* EXPECT(3) */
var a : integer shared := 2

defn pine = (b : integer shared) -> {
    var c : integer shared := 3
    b = c


The function is silly, but it is sufficient to demonstrate an error. It manifests as severe sounding abort: duplicate-release-shared error.

For the curious, the example was reduced from the below error reporting function.

@export defn print = ( ofi : fail_info optional shared ) -> {
     while has(ofi) {
        var q = getopt(ofi)
         ofi = q.next

That iterates over the tags in an error and prints them to the console. It mostly works, except for one issue on the ofi = q.next line — it also causes that nasty duplicate-release-shared error.

It’s a problem of ownership

The error is one of ownership: two code paths are freeing the same value. Consider this basic code:

var b : shared integer := 5
var c : shared integer := 6

b = c

b and c are both shared values. These are roughly equivalent to a shared_ptr<integer> in C++, or the Integer type in Java. When we assign b = c we’re not modifying values, rather changing references. You may refer to my article on divorcing a name from its value for more details on that concept.

Leaf uses reference counting; this is roughly what happens when we assign to a shared variable:

// b = c
assign(b, c)

We release the old object in b, acquire the new one c, and then assign the reference (a memory pointer).

Take a look back at the test case again, the b = c comes from that example:

defn pine = (b : integer shared) -> {
    var c : integer shared := 3
    b = c

The trouble here is that b is an argument to the function, so we’re only allowed to release it if we own the variable. It turns out we don’t!

Consider the calling side:


Reduced to pseudo-code, again based on reference counting:

var a_tmp = acquire(a)
pine( a_tmp )
release( a_tmp )

This bit of code ensures that the reference to a is valid during the function call. Since a is a globally modifiable, we need to play it safe here. The problem is the above code calls release(a_tmp). But our pine function is also doing release(b), which happens to be the same shared object as a_tmp. Thus we’ve released the object twice!

Logically avoiding the issue

This issue can be avoided by shadowing all arguments locally in the function.

defn pine = ( _b : integer shared) -> {
    var b = _b
    var c : integer shared := 3
    b = c

The shadowing works since the first thing the function does is acquire(_b), and the eventual release(b) doesn’t touch the source argument.

It feels a bit like overkill to do this for all arguments on the off-chance they might be modified. I guess the compiler could scan the code and figure it out. Or, I could just disallow modifying function arguments. That’s how I arrived at my question Should function arguments be reassignable or mutable? .

For sanity in the compiler, I’ll likely add a fix (local shadowing), and forbid assigned to arguments by default. Then use some magic to try and optimize away all situations where it isn’t needed — though popular opinion seems to lean towards forbidding the modification of arguments.

Categories: Leaf

Tagged as: , , ,

9 replies »

  1. Maybe my thinking today is way off, but I have hard time understanding the problem. Is it reference semantic surprise (because something looks like value semantics)? Or is it releasing the memory — “we’re only allowed to release it if we own the variable”. This rule looks odd (I assume release is done automatically, not by user). IMHO it should be released when the counter is zero, no matter who owns the reference. Btw. sometimes in code you use “b” and “c”, but in explanation you refer to “a” and “b”, take a look at first piece of code in section “It’s a problem of ownership”.

    • Ooops, I fixed the code to match the test but forgot to change the text.

      I don’t `release` is clear in my text. `release` means to decrement the reference counter and check if it is zero. It isn’t a `free` function. The issue is that the function and caller are both calling `release` on an object which has only been acquired once. And yes, all this acquire/release is done automatically — this is a compiler defect.

      `shared` objects in Leaf have kind of “reference” semantics. When shared values are assigned all the refernece refer to the same value.

  2. I’m slightly confused.

    b is of type “integer shared”. b is created when pine is called. b has reference semantics. The object or value that is referenced by b uses reference counting for memory management.

    Is all the above correct?

    At the point where b is created (i.e. during the entry to pine), does leaf call “acquire” on the value b references in order to increase the reference counter by one (to reflect the fact that b now refers to the value)?

    If yes: Then it seems that you should be able to release(b) prior to reassignment. In fact, it seems like you should be required to call release(b).

    If no: Why is acquire(b) not called? (Performance?)

  3. (In my previous post, I forgot to subscribe to notifications, so I am posting again.)

    • Yes, `b` starts in an acquired state (the reference count is 1). Look closely at how the call to the function is made, the caller also does an acquire/release pair to ensure it’s safe for the duration of the function call. The initial release inside the function is problematic though. It’s assigning to a different name, `b`, that points to the same object as `a`. It’s doing a `release` that nobody asked for.

      The function doens’t “own” it’s own arguments in this sense. Thus it shouldn’t be freeing them up.

      In my last article I talk about restricting the ability to modify the arguments, and this is one of those reasons. I could make the function own the arguments, in which case calling rleease is correcct (atually, it has to call it then, it’s no longer optional). Or I say it can’ tmodify the args in which case it doesn’t ever release them.

    • IMO, whether or not arguments are mutable or immutable by default is a language design choice. Mutable arguments should be possible. Mutability should be orthogonal to memory management. If mutability of arguments and memory management are colliding, this is due to flaws in the implementation and/or flaws the language specification itself.

      The primary reason to make choices about argument mutability is the cognitive burden such choices will place upon the programmer. The difficulty of the implementation is, comparatively, a very minor concern.

      A reference either exists in the calling function, or in the called function, or both. (“Both” meaning there are two separate references, one in each function.) Reference counts should be incremented and decremented accordingly, by and in the corresponding function. Obviously, the called function should operate independently of any references that exist in the calling function. Similarly, the calling function should not be responsible for releasing references that exist in the called function.

      Do you disagree?

      If I were designing a reference counted language, I suspect I would have pine release(b) as part of “b=c”. At the end of pine, b goes out of scope (and is released). Alternatively, because b is an argument, responsibility for releasing b could devolve to the calling function. In this case, the calling function would need access to the value of b at the end of pine so as to properly release the correct value at the appropriate time.

      B is an argument, and is probably stored at a known location on the stack, so it is technically possible for the calling function to release the value of b at the appropriate time. However, I suspect it is simpler to have pine release(b) upon return from pine when b goes out of scope. (In either case, I suspect these are implementation details, and different implementations could all comply with the language specification.)

      IMO, it is incorrect for the calling function to assume that the value of variables (including arguments) inside a called function will not change. In other words, it is incorrect for a calling function to blindly release the values it passed into a called function, upon return from the called function.

    • I don’t disagree with that. I’m admitting this is a mistake in the compiler. If I allow arguments to be reassigned then I must account for this.

      It carries a cost with it, that is my point about thinking about not allowing reassignment. I’d prefer to avoid an extra acquire/release pair to accomodate a very limited scenario.

      The caller in this case wasn’t making the mistake of thinking the argument changed. It didn’t actually care about that — it was correctly releasing the original value (the function actually gets a copy of the underlying pointer). The error was the missing acquire/release pair, by default, in the function itself — and that’s what I’m trying to avoid.

    • “It carries a cost with it, that is my point about thinking about not allowing reassignment. I’d prefer to avoid an extra acquire/release pair to accomodate a very limited scenario.”

      There are costs either way. If you don’t acquire/release in pine, and if pine holds the only reference to b, then b will be released later than necessary. In the case of a shared int, very little storage is used.

      However, some references could be to the root node of a large data structure. If (a) you have a recursive function that generates and passes a shared unique large data structure to each successive call, and if (b) in each invocation of the function, the function may quickly decide to release the large, unique, passed in data, then (c) under your strategy of releasing in the calling function (rather than the called function) you are going to be keeping all those unreachable large data structures in memory until the recursion terminates.

      Some reference counting implementations have the benefit that memory is freed as soon as possible. I believe your implementation may lack this benefit.

    • There is definitely that potential. I don’t think it’s a common occurence but I can see it happeneing when using a functional coding style with end-call recursion. Though in that case I thik the optimizer might take care of that anyway.

      At the moment the lifetime of variables is preserved based on statements anyway, so they’d still be preserved even if the callee was an owner.

      I think now the decision to make arguments immutable makes sense. It ensures that we can play with the calling convention later — since the most restrictive argument semantics allows the most flexible conventions.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s