Philosophy

The exceptional myth of non-local flow

“Exceptions are bad because they introduce non-local flow into a program.” It’s an argument I’ve seen often and it even came up in the comments in my previous article. It’s nonsense though. Not only can I not pin down precisely what it means, there’s simply no way it applies to exceptions.

Exotic flow patterns

What is non-local flow, or a non-local branch? I’ve been unable to find a concise definition so I’ll have to infer based on other meanings of “non-local” and what I’ve read in the argument.

Non-local implies something that is not happening, or controlled locally, in the current context. This is usually seen as the current function. Let’s look specifically at the point we call another function.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
defn pine = -> {
    if( !prep() ) {
        return
    }

    //what happens here
    cone()

    post()
}

In the first case, with prep, the function controls what happens when prep is done. It inspects the return value and decides to continue, or return to its caller in turn. It’s a local decision, thus a local branch. The branch is influenced by prep, but the decision is made locally.

What happens at cone? We don’t have a visible branch here, so flow should presumably continue to post. If the execution suddenly decides to go somewhere else we could have a non-local flow.

Can this happen? Yes, it certainly can. There are functions like longjmp in the C library that make this possible. You can also use a non-local goto in some compilers, which as the name implies jumps to a location outside of the current function.

If pine, this function, is powerless to stop that from happening then we have a non-local flow.

I’m not including context or thread switches here as the logical flow will return to the the function in the same location. Plus, since all programs, regardless of source language are generally subject to these switches. There might be a special case with signal interruption, as it’s handled differently. But I don’t think it’s particularly relevant to this discussion.

Exception flow

Let’s put aside for a moment how exceptions are implemented, whether it’s with longjmp or as zero cost, and consider how they can be semantically understood. In a language with exceptions any function can “throw an exception”.

The terminology here is perhaps misleading, instead just assume that any function can report an error. In a non-exception language it is programmer’s responsibility to check the error status. For example, in the C library you might need to check the errno value:

1
2
3
4
5
6
FILE * f = fopen( "my_file", "r" );
if( errno != EOK ) {
    //handle error
}

load_data(f);

Some languages encode the error combined with the result of the function, such as Haskell’s Maybe or Rust’s std::result. In pseudo-code this looks roughly like below:

1
2
3
4
5
6
var result = open_file( "my_file" )
if( is_error(result) ) {
    //handle _error
}

load_data(result.value)

The key difference in exception based languages is that an implied error handling. If the error is not specifically handled the function raises the same error in turn. In our first example, the cone() call is compiled roughly to a structure like this:

1
2
3
4
result = cone()
if( is_error(result) ) {
    return result
}

The implied error handling simply returns the error to the caller. Let’s add some explicit error handling to our original code and see what happens:

1
2
3
4
5
try {
    cone()
} catch( FileNotFoundException e ) {
    log "Skipping"
}

This can become this structure:

1
2
3
4
5
6
7
8
result = cone()
if( is_error(result) ) {
    if( type_of(result) == FileNotFoundException ) {
        log "Skipping";
    } else {
        return result;
    }
}

Completely local

There is nothing “non-local” happening here. The caller is in complete control of what happens when an error is raised. It doesn’t matter that most languages don’t implement exceptions this way, it only matters that we can reason about them this way. The “as-if” rule is golden in compilers: so long as the high-level semantics don’t change the compiler can modify low-level details at will.

I should note that I implement errors this way in Leaf. I had previously used the zero-cost model but was unhappy with how it worked. I gain more flexibility and less maintenance effort by using a return value approach instead. The high-level language is unaffected by this choice of implementation.

All local

Exception handling doesn’t add any complexity to a function’s branching. It can be modelled as a simple conditional and return. In contrast, languages with exceptions also tend to offer features like destructors, early return, and defer statements. These things certainly do complicate block branching.

There are definitely things wrong with exceptions, but this idea of “non-local flow” is just not one of them. It’s a nonsensical argument that should be abandoned.

Categories: Philosophy, Programming

Tagged as: ,

9 replies »

  1. Count me as unconvinced. Non-local phenomenons can be turned into local phenomenons by program translation; this does not suffice to argue that they are local.

    If you want to give your “local explanation” of the control flow between a root caller that handles a particular exception and the function deep down a call chain that raises this exception, you will need to perform this source-to-source translation of the whole call-chain: this is a global transformation. You can also assume that the transformation is done *everywhere*, but then this amounts to reasoning on the output of a compilation pass that lowered a non-local concept into a local concept.

    The same trick can be played with implicit mutable state: do a state-passing translation and poof, all state passing becomes local. That does not mean that the original program has no global ambient state, merely that it can be understood through a state-passing translation. (Syntactic sugar, eg. the do-notation, that hides the post-translation cruft by making it look like the pre-translation thing allow you to choose to reason at one level or the other, but do not make the fundamental choice go away.)

    Note that we could go even further and claim that an even deeper issue here is no local vs. non-local, but the fact that program understanding now involves control flow. Reasoning about just data flow is easier, when it can be done: when I read `plop()`, I want to think in terms of “this denotes the value returned by the `plop()` function”, and this abstraction (assuming my code is sufficiently pure to make it correct) lets me *not* think about control-flow at all. Under this view, using a `return` in non-tail position is control flow (as break or continue; return in tail position is just data flow), which should be avoided whenever it is reasonable to do so. Your translation of exceptions into returns transforms one form of control into the other. To understand it back in term of data flow, you would need to add one extra translation layer (for example into Continuation Passing Style, but that is not the only choice and a bit overkill).

    • I am not lowering a non-local concept into a local concept. I’m saying that exceptions do not require any non-locality in semantic understanding. The key is that the caller, and solely the caller, decides how the errors are propagated. It isn’t left in the hands of the compiler, the programmer has full control over how exceptions are propagated at each call-point. The only difference to normal flow is that functions have an implied default of returning the error to the caller if nothing is done.

      If you can model state by passing a state parameter around then it’s a valid transformation. Everything becomes local though, that is, all the data, which makes the transfromation kind of useless in terms of optimization or reasoning since the data set is too large. Additionally, most programs cannot mimic state this way due to threading and actual non-local memory alteration (like the kernel altering thread local values).

      Your final argument is essentially the functional one. Anytime you can model an algorithm without using flow at all is great. There’s a limited number of scenarios where this can be done however. In terms of “data flow” exceptions are also nothing special, since they are no more complex than non-terminal returns, continue statements, or any kind of deferred statement.

    • I don’t know what you mean by “non-locality in semantic understanding” — but I suspect that “semantic understanding” is always local, by nature, as giving a precise semantics of any programming feature (local or not) requires making it fully explicit (and, often, local, whatever you mean by it).

      My claim would be as follows: a programming feature exhibits non-locality if *removing it* requires a non-local code transformation. For example, pattern-matching does not exhibit non-locality, because translating a pattern-matching into a tree if if-tests is purely local. On the contrary, imagine a language without exceptions, in which a programmer (mistakenly assuming that exception exists) adds *one* “raise” somewhere and *one* “catch” up in the call chain. The translation you presented lets us remove this raise-catch pair, but it requires a non-local transformation (the whole call chain between those two points).

      Again, I could use the exact same argument for global mutable state. “The key is that the caller, and solely the caller, decides how the mutable state is propagated. It isn’t left in the hands of the compiler, the programmer has full control over how state is propagated across each call-point. The only difference to stateless code is that functions have an implied default of updating their local view of the state with the state returned callee if nothing is done.”

    • Adding or removing a `raise` somewhere in the code doesn’t require any translation be done elsewhere in the code. In an exception based language every code-path must always assume an error can be returned from the functions it calls. It’s not a thing that “might” happen, it’s something it assumes will happen. The caller must always reason this way, thus it is irrelevant whether an error is actually raised in the called code.

      Similarily, the `raise` behaves exactly the same with or without a `catch` higher up in the stack*. It simply returns the error state to its caller. No intervening function may make any assumption about whether an error may or may not occur, they must always assume the worse and deal with it.

      No, you can’t use the same argument for global mutable state. First off, we’re talking about data now, not flow. So the control the caller has would have to relate to the data, not the flow. Here I’d say the local control would imply nothing can modify the State object unless the caller specfically provides that object to a function. In other words, if the thread was paused the state data could not change. This is simply not true of global memory, since there are several things that can modify it while a thread is not running.

      (*C++ has one oddity here that actually prevents the transformation I’ve described: if there is no `catch` handler at all the exception is required to call a unique function prior to unwinding. Not all exception languages have this requirement, and I don’t believe it’s a required language feature.)

    • > Adding or removing a `raise` somewhere in the code doesn’t require any translation be done elsewhere in the code. In an exception based language every code-path must always assume an error can be returned from the functions it calls.

      You are saying that if you already understand the semantics of your program as defined by elaboration into the result of a global, whole-program translation, then no change needs to be performed. This is true, but the point is that the global transformation is global. When someone says that exceptions have a non-local effect, well one way to observe this is that *adding* exceptions to a language that does *not* have them must be explained as a *non-local* code transformation.

      You may find it interesting to consider a language with a distinction between two kind of functions, those that are allowed to raise exceptions and those that do not. Your translation allows to translate function calls of the first kind (may implicitly raise) to ones of the second kind.

      > Here I’d say the local control would imply nothing can modify the State object unless the caller specfically provides that object to a function. In other words, if the thread was paused the state data could not change. This is simply not true of global memory, since there are several things that can modify it while a thread is not running.

      I’m not talking of OS-global memory or concurrent data races, but of program-local, non-shared mutable state (eg. reference cells, in a purely sequential setting). OS-global memory can only be explained away by threading a reference to the OS environment throughout the control flow of the whole program, and concurrency requires its own complex discussion/explanation.

      In the setting of, say, ML reference cells (and sequential program), then yes it is the case that the state-passing translation gives the guarantee that nothing can access the state unless the caller specifically provides the state to a function. (Modification is performed by returning a new state, so the caller decides how callee-state-changes affect them, just as pre-translation the caller can reset the cell value after the call, ignoring the value the callee put there).

      I think my claim above was clear and precise, and you don’t seem to disagree with it, only to have a different notion of what “local” and “non-local” mean than the one I proposed. I think the definition I proposed is useful, as there are example of non-local (implicit state) and local (pattern-matching) features for which its verdict corresponds to consensual usage. I don’t think you have given a definition in enough details to know what would count as local or non-local in your book; I seem to misunderstand what you say, but to me your argument would also imply that implicit (sequential) state is local, one reason to dislike your inferred definition.

    • Also: what are the implications of the translation from a typing point of view? A function that returns an integer in the pre-translation program returns either an integer or an error value in the post-translation program.

      You could say, I suppose, that functions that may raise an exception always return either something or an error value; but both programmers and language designers understand “return type” at face value, as the type of the output that the function returns *if it returns normally* instead of failing one way or another.

  2. If I understand you correctly, you seem to be saying that:

    Every throw could immediately caught by the caller. Therefore, all exception flow could be handled locally.

    But in practice, the primary purpose of exceptions is: to centralize error handling at a high level; and similarly: to avoid having to wrap every call with lengthy error handling notations.

    Therefore, exceptions are primarily used (at least in some programming styles) to intentionally introduce non-local flow.

    I am not aware of any programming language that requires every statement to be annotated with a list of all the exceptions that statement could throw, let alone a language that requires every statement to immediately catch any exception that could be thrown inside it. For example, Java only requires that possible exceptions be declared at the method level.

    If I have summarized your argument correctly, then I consider it to be silly.

    Exceptions introduce gotos. Some of those gotos can be non-local (i.e. hidden).

    For the record, I think exceptions can be highly productive when used wisely.

    • I’m not saying something “could” happen, I’m stating that exceptions are semantically equivalent to having a series of if-then branches with the exception encoded in the return value. The logical flow of the program and it’s behaviour remain unchanged. So long as if/return are local flow this means exceptions are also local flow.

      Exceptions are not goto’s, they are incapable of bypassing code that the caller did not intend to bypass. They are not like actual `goto` or `longjmp` that can actually bypass destructors and deferred statements. An exception is equivalent to an early return statement in a function.

      Non-local != hidden. The if/branch behaviour of exceptions is not visible in the user code, it is implied. I am not arguing it isn’t. It’s still local flow, just hidden. This is very important when considering constructs like `try!` in Rust. Nobody seems to consider `std::result` in Rust as non-local, but by using `try!` it certainly is hidden (no visual if/return).

    • “I’m stating that exceptions are semantically equivalent to having a series of if-then branches with the exception encoded in the return value.”

      I agree.

      “So long as if/return are local flow this means exceptions are also local flow.”

      I do not consider “return” to be local flow. Return handles flow between a local function and a non-local (direct-parent) caller.

      A throw is therefore a hidden, non-local jump across (potentially) multiple callers. I agree with you that the compiler does not see it this way, and can do deferred/clean up work along the way. But the (errant?) programmer may find the end result to be just as confusing to debug as a non-local goto.

      “Exceptions are not goto’s, they are incapable of bypassing code that the caller did not intend to bypass.”

      An exception is capable of bypassing code that a programmer did not intend to bypass. This may well be due to programmer error. But it does cause confusion. And I therefore respect the comparison between goto and exceptions in terms of the complexity they introduce.

      Actually, I think exceptions are more like the mythical “come-from” statement, than they are like “goto”.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s