Defective Language

Everything can and will go wrong: all functions fail

In a dreamy programming utopia, functions behave correctly and nothing unexpected ever happens. In technical reality, however, even the most mundane of actions can suffer failures and errors. It’s a misguided notion that errors can be eliminated or that some things cannot fail. The only sane way to program is by assuming things will go wrong.

A polymorphic world

Operating systems have become very complex and flexible creatures. Between the lowest levels of the computer and our high level application code are layers upon layers of abstractions. Let’s consider a very common operation: opening a file.

var q = open( "url_or_device_path?" )

What is a file? This can be basically anything nowadays. Some languages let us open URLs, or remote resources, directly as files. The OS lets us open named sockets, or even create anonymous sockets while opening a file. Linux exposes a whole host of kernel properties via /proc files, or communication with hardware via /dev files. Protocols like WebDAV and SFTP also ensure that even things that look like normal files may not be.

As a file can be so many different things it follows that there is a virtually unlimited number of ways that opening it can fail. Even if we do successfully open it, the first read or write, or the second or third, could easily go wrong. It’s not possible, in a general way, to know how file operations might fail on us, only that they can at any time.
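As a rough illustration, here is a minimal Python sketch of treating every open and read as fallible. Python and the helper name `read_prefix` are our own choices for the example, not part of any particular library. The same call fails in different ways depending on what the path turns out to be.

```python
import os
import tempfile


def read_prefix(path, size=16):
    """Return (data, None) on success or (None, error) on any OS-level failure.

    Hypothetical helper: both the open and the read sit inside the try,
    because either one can fail.
    """
    try:
        with open(path, "rb") as handle:
            return handle.read(size), None
    except OSError as error:
        return None, error


with tempfile.TemporaryDirectory() as folder:
    good = os.path.join(folder, "data.bin")
    with open(good, "wb") as handle:
        handle.write(b"hello")

    missing = os.path.join(folder, "missing.bin")
    results = {
        "regular file": read_prefix(good),
        "missing file": read_prefix(missing),
        "directory": read_prefix(folder),  # a path that is not a file at all
    }

for label, (data, error) in results.items():
    print(label, data, type(error).__name__)
```

The caller only learns that something went wrong and gets an error object to inspect. Which concrete error appears for the directory case depends on the platform, which is exactly the mapping problem in miniature.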

There’s another discussion leading from this about the classification of errors. This type of abstraction is what causes checked exceptions, or controlled error codes, to ultimately fail. They tend to create an unsolvable mapping problem.

All libraries fail

The purpose of libraries is to abstract a lower concept, to create a friendlier way to use it. We never really know what is happening in the library; we don’t know how many options it offers or concepts it combines. If it calls even a single function that can fail, then the high-level call can also fail. The chances of this are quite high: consider that most functions in the standard C library can fail, as can a great portion of OS calls.

Some libraries go out of their way to front-load failures in the form of parameter checking. This helps prevent “unexpected” errors happening during operations. But reported parameter errors are still errors.
It may be tempting to distinguish between genuine failures and programmer error, but unless we become infallible at work, what difference does it really make? The function fails in either case.
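Front-loaded parameter checking can be sketched as follows. The `transfer` function and its account dictionary are hypothetical, made up for this example. The point is only that every check runs before any state is touched, and that a rejected call is still a failed call.

```python
def transfer(balances, source, target, amount):
    """Hypothetical example: move `amount` between two accounts.

    All validation happens first, so a rejected call leaves `balances`
    untouched. The caller still has a failure to deal with.
    """
    if amount <= 0:
        raise ValueError("amount must be positive")
    if source not in balances or target not in balances:
        raise KeyError("unknown account")
    if balances[source] < amount:
        raise ValueError("insufficient funds")

    # Only now do we modify anything.
    balances[source] -= amount
    balances[target] += amount


accounts = {"alice": 100, "bob": 20}

try:
    transfer(accounts, "alice", "bob", 500)
    failed = False
except ValueError:
    failed = True

print(failed, accounts)
```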

“But wait!” we say, “I’ve seen functions that don’t report errors.” They’re usually just lying, hiding the fact that errors happen and making it difficult to debug or diagnose the problem. Several bad APIs decide to just log the error and continue, making it impossible for us to actually know an error happened. Either a library reports errors or it’s lying.

Don’t confuse libraries that offer side-channels for error reporting with libraries that don’t report errors. OpenGL records error codes internally and exposes them through glGetError, which you’re expected to call after other functions.
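A side-channel in this spirit might look like the following Python sketch. The `Canvas` class, its methods, and the error constants are hypothetical, invented for illustration; this is not the OpenGL API. The calls themselves return nothing, and the error is only visible to a caller who asks for it.

```python
NO_ERROR = 0
INVALID_VALUE = 1


class Canvas:
    """Hypothetical drawing API that reports errors through a side-channel."""

    def __init__(self):
        self._error = NO_ERROR
        self.line_width = 1.0

    def set_line_width(self, width):
        if width <= 0:
            # Record the failure (keeping the first unread one) and ignore the call.
            if self._error == NO_ERROR:
                self._error = INVALID_VALUE
            return
        self.line_width = width

    def get_error(self):
        """Return the recorded error code and reset it."""
        code, self._error = self._error, NO_ERROR
        return code


canvas = Canvas()
canvas.set_line_width(-3)          # fails, but nothing is returned or raised
first_check = canvas.get_error()   # the failure shows up only here
second_check = canvas.get_error()  # reading the code cleared it
print(first_check, second_check, canvas.line_width)
```

If we forget to call `get_error`, the program carries on with the old line width and nothing tells us why the output looks wrong.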

Even the basics

We can’t assume even the fundamentals of programming are error-free. Take even the simplest of operations, the addition of two integers. In the world of math this is a well-behaved operation that can’t fail. On a computer, however, we don’t have real integers; we have fixed-length integers. This means any addition can result in either an overflow or an underflow.

var q = a + b
if (a > 0 and b > 0)
    assert( q > a and q > b )  //this can fail due to overflow

What’s very painful about integer underflow and overflow is that most languages fail to report it in any fashion whatsoever. It’s also the hidden fabric of so many more operations, from array indexing to stream seeking. High-level errors resulting from low-level overflow are quite common, from long-running programs with counters that get too high to users that feed massive documents into programs. It’s really not hard to reach the limits of integers.
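Python’s own integers grow as needed, so the sketch below simulates a 32-bit signed integer by hand. The helper names `wrapping_add_i32` and `checked_add_i32` are assumptions made for this example. The first mimics the silent wraparound typical of fixed-length two’s-complement arithmetic; the second shows what reporting the overflow could look like.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def wrapping_add_i32(a, b):
    """Add like a 32-bit two's-complement machine: silently wrap around."""
    return (a + b - INT32_MIN) % 2**32 + INT32_MIN


def checked_add_i32(a, b):
    """Add, but report a result that does not fit in 32 bits."""
    result = a + b
    if not INT32_MIN <= result <= INT32_MAX:
        raise OverflowError(f"{a} + {b} does not fit in 32 bits")
    return result


a, b = INT32_MAX, 1
q = wrapping_add_i32(a, b)
holds = q > a and q > b  # the assertion from the example above

try:
    checked_add_i32(a, b)
    reported = False
except OverflowError:
    reported = True

print(q, holds, reported)
```

With the wrapping version the sum of two positive numbers comes out negative and nothing complains; with the checked version the caller at least finds out.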

Division also has a problem if we accidentally divide by zero. For integers this may result in a low-level trap on some systems, while floating point may produce infinity or NaN. Floating point also has its own assortment of range and precision problems. The chance of us writing error-free floating point code is unfortunately kind of low.
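A few of these problems can be shown in a short Python sketch. One caveat: Python raises an exception for division by zero, for floats as well as integers, so it does not return infinity there. The sketch reaches infinity through overflow instead. The `safe_divide` helper is hypothetical.

```python
import math


def safe_divide(a, b):
    """Hypothetical helper: return (quotient, None) or (None, error)."""
    try:
        return a / b, None
    except ZeroDivisionError as error:
        return None, error


quotient, error = safe_divide(1, 0)

huge = 1e308 * 10                   # exceeds the double range: becomes infinity
not_a_number = huge - huge          # infinity minus infinity: NaN
nan_equals_itself = not_a_number == not_a_number

exact = (0.1 + 0.2 == 0.3)          # precision: the sum is not exactly 0.3
close = math.isclose(0.1 + 0.2, 0.3)

print(type(error).__name__, huge, not_a_number, nan_equals_itself, exact, close)
```

None of the floating point lines raise anything. The bad values simply flow onward until something downstream misbehaves.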

Given that the underpinnings of basic programming are prone to failures and errors, it’s hard to imagine anything built on top of that can ever be error-free.

It will fail

From the lowest levels through to the highest levels, everything can fail. Crashes, corruption, and glitches usually happen because somebody assumed something wouldn’t fail. Even theoretically perfect algorithms can fail due to programmer error.

Assuming that something can fail is the only realistic option. If a library, or language, pretends otherwise, then chances are it is broken.

14 replies »

  1. Very true indeed!

    I might not fully agree with this part though:
    > It may be tempting to distinguish between genuine failures and programmer error, but unless we become infallible at work what difference does it really make? The function fails in either case.

    My apologies if this is a repeat of the same argument I made before, but I would argue that there is in fact a difference. Sure any error is an error, but you are *far* more likely to handle the ‘programming errors’ than the ‘internal failures’ (assuming we’re talking about the same concept).
    Example: An e-commerce web application with the ability to add a product with a user chosen quantity to a shopping cart. It’s an HTML form that posts the product id and the quantity to the server. The products have a limited supply.
    The server handles it like so (pseudocode):
    ————————
    int productId = (int)request.getParam("productId");
    int quantity = (int)request.getParam("quantity");
    try {
        cart.addProduct(productId, quantity);
    }
    catch (error) {
        if (error.code == INVALID_PRODUCT) response.output("The product you requested doesn't exist");
        else if (error.code == STOCK_TOO_LOW) response.output("Not enough in stock, please lower the quantity");
        else throw error;  // anything else is handled somewhere else
    }
    ————————

    The addProduct function can obviously throw a lot more errors, even if the product id is valid and the quantity is within limits. The cart total needs to be updated (as you said, integer addition can overflow), it probably stores the cart products somewhere in a file or a database (as you said, file write and db query can fail for many reasons). But all those failures are not something I or the user can do anything about. So the only thing I can do with those failures is ‘handle’ them somewhere else and output to the user something like “Sorry, something went wrong”.

    The only logical errors addProduct() produces are the 2 that are mentioned. Other business rules could be implemented that produce other logical errors, like:
    – You can only buy our cheap batteries if you at least also purchase 1 smartphone
    – You can only buy certain products if your order total is higher than $200
    – You need to have an active user account before you can add certain products to your cart
    That would result in 3 more logical errors coming from addProduct() that can (should) be handled nicely in the same catch() block.

    Those are the errors we care about. I don’t care if some integer addition overflows somewhere deep in some library. It should definitely produce an error, but I don’t want to be bothered handling it here. However, in some kind of math GUI app, I probably *would* like to handle those ‘low-level’ overflow errors. So whether the error can/should be handled or not is not tied to the error type in any way.

    You could argue we can just check the error type to see if we want to handle it, but that’s not 100% foolproof (I could elaborate on that, but this comment is already getting too long).

    • I think we’re in agreement here. Handling a very specific subset of errors and just letting the rest propagate is a good way to handle errors.

      By programming errors I specifically mean the programmer implementing something incorrectly. I’m not sure if either of your examples qualifies; they seem like errors that originate with the user. But this is my main point: regardless of the source of the errors, we have to deal with them somehow. Your pseudo-code appears to do just that, handle the few it knows, and just let the rest propagate. It’s the natural flow in an exception-based language.

      Checking error “types” is inherently very dangerous (I want to get into this in another article some more). It’s very difficult for the error-site to know what “type” of error it is in the caller’s context. In Leaf I’ve settled on there only being 3 “types” of errors, and they are based on things the error-site can actually know.

      – critical errors: these are the easy ones, something horrible has gone wrong and recovery should not be attempted
      – side-effect errors: side-effects have been generated on the objects involved and they are now in an unknown state. Recovery involves cleanup of those objects
      – no-side-effect errors: no side-effects so far so the caller doesn’t need to take any action to recover (these are typical param check errors)

      The goal is to get a language where it’s easy to program with the no-/side-effect error system.

    • I look forward to your new article then :)

      In the meantime, could you give an example of an error for “the programmer implementing something incorrectly”?

      If you’re interested, I’d like to discuss the internals of error handling some more..? Also in regard to Leaf and the 3 types. For example, what would be considered a critical error?

    • A programmer error is something like a sorting algorithm that doesn’t actually sort the input, or a point-of-sale machine that charges tax twice because the programmer added it twice.

      I’ll be sure to post the Leaf articles, and the others, on reddit to allow a bit of discussion (the reply system on WordPress is woefully lacking).

      I’m actually very undecided about what a critical error is, or rather how one can be detected. Something like `malloc` failing to allocate a small object (<1K) is critical since it means standard object management is no longer possible, including creating the error objects. Or if the abstract machine detects memory corruption, meaning the whole system is unstable now.

  2. @Maarten, I don’t want to drag the discussion in the wrong direction, but the code you pasted is actually wrong (IMHO) for two reasons. First, those are not errors. “Not enough supply”, “only 2 items per customer”, “Done”, etc. are just outcomes of the operation (think in terms of “tryAddProduct”). Secondly, using exceptions for this… It is technically possible, but for me it is the wrong tool for the job. I am not saying you can never throw an exception, but not in such a scenario. Exceptions here would be “connection to db is lost” or “authorization is invalid”, those which are not logically connected to the operation performed.

    • That is in fact a different discussion, and an argument I’ve seen many times before. I myself believe a function should do only 1 thing (could be a complex thing, but whenever possible, just 1 thing). That thing could have a certain outcome for success (the return value). But whenever the function cannot do as requested (i.e. add the product), that’s an error situation. You wanted the function to do something, but it couldn’t, it’s as simple as that. And that’s the role exceptions fulfill nicely, they have a way to signal and handle a not-succeeded path.

      You could complicate things and have the function return something like a Maybe, or enum (see Rust’s Result type and mortoray’s previous article about it: https://mortoray.com/2015/10/21/messy-error-handling-in-rust-with-try/ ). But this really is what exceptions are for.
      I don’t want to self-promote or anything, but to keep it short, I also explained this in one of my own articles: http://blog.bitethecode.com/post/130259363012/exceptions-dont-excist

    • I agree with Maarten on this: a function has a given success criterion, and if that criterion cannot be fulfilled then there is an error. I think all errors should be raised and handled using a single mechanism as well; error codes in the return path would be a distinct mechanism.

      The idea of a “logical” error is extremely tricky, that was my purpose in exploring file open. Abstract APIs have the problem of having an endless number of logical errors.

  3. @Maarten, “You could complicate things and have the function return something like a Maybe”: I didn’t say a word about this and I am against such a practice. But in this case you have a void function, which is a (perfect?) example of something that should be converted to a status function instead. If you put an if/switch after it, it blends right in; with status-exceptions you now have two flows (or actually three: job done, status-exceptions, error-exceptions). Ok, that’s all from me :-) I don’t want to redirect the discussion into other areas, but obviously we are in slightly different camps here.

    • Don’t worry about drawing the discussion elsewhere. It’s important for me, for my Leaf language, to understand the various approaches to errors. From all the discussions and comments I’ve seen there is one core point we all tend to agree on: that we’re currently not handling errors correctly and our languages don’t seem to offer the correct mechanisms for doing so.

  4. @mortoray, ah, in that case, a little explanation from me. Consider `indexOf` on the String type. You can get (or not) an OutOfMemory exception as the result of this or that implementation, caching, building some tree, etc. So this is a real error, and returning it as a status (wrapped with Maybe) would be wrong. On the other hand, throwing an exception because the pattern was not found is wrong (IMHO), because that is the core semantics of this function. Similarly here with `addProduct`: “TooManyItems” is its core logic, while “DatabaseMaintenance” is not. Of course, I agree with you that drawing the line for which side a given case falls on is the tricky part :-) I am very impressed by Icon and I try to steal as many ideas as I can from it :-D Currently I have gone back to basics: could I return failure (null) from an operation such as division by zero, and where would it lead me…

    • There’s even a problem on basic functions, like `indexOf`. The actual function cannot know the significance of not finding the pattern, all it can know is whether it found it or not. Is the caller just optionally looking for a substring, or is the caller parsing a config file where the pattern is expected to occur?

      The reason why we use a distinct channel for reporting such problems I believe is because current languages make it very difficult (bulky syntax) to deal with such errors. In Leaf I want to minimize that syntax, so you can avoid the tricky situations and just raise errors in all situations where something is even partially wrong. (Plus design the optimizer/compiler so that basic things like not finding a string don’t involve costly error object creation).

  5. @mortoray, “Plus design the optimizer/compiler so that basic things like not finding a string don’t involve costly error object creation”. I am all for this goal; however, in Skila I don’t have value types so far, so I solved this (only conceptually so far) with union types: https://aboutskila.wordpress.com/2015/10/24/leaving-the-trenches-of-the-option-type/ I would like to have clean syntax for operations such as array get, indexOf, and intParse, while still keeping control over whether there was a failure.
