
New languages try to improve the lives of programmers by simplifying some aspect of programming. Many make bold claims about eliminating certain types of errors, yet in the end none of them can actually stop a program from crashing. C-style pointers are harped on as a major source of these crashes, even though programs written in newer languages, which have no such pointers, crash as well. An errant programmer ultimately has final say over the destiny of their application.

To understand why, we need to look at the various ways a program can crash, in particular those which a language simply cannot prevent. First let’s define crash to mean any serious loss of operation or utility in the running program. Clearly a program aborting or never responding is a crash. A complete loss of graphics, or total loss of input control in a video game, is also a crash, as the program is bereft of nearly all utility. Similarly, should all the buttons in an interface become duds, that program’s utility also goes to zero. We’ll ignore cases where there is only a minor or partial loss of function and focus strictly on those with a complete, or nearly complete, loss.

Infinite Loop

Short of calling “abort”, the quickest way to crash a program is the infinite loop. Simply get one of your loop variables wrong and, instead of doing something useful, your program will keep doing the same thing forever. This usually manifests as the program not responding. Worse, you’re now wasting a lot of processor time, thereby hampering the utility of other processes on the machine.

A related error is infinite recursion. It is logically an infinite loop, but your program will likely abort with a memory error once the stack is exhausted rather than stop responding. Certain functional languages, or optimizers for imperative ones, may eliminate the recursive call and thereby convert your stack exhaustion into a true infinite loop.
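
As a small illustration, here is a sketch in Python (my choice of language for the example, nothing in the argument depends on it). The function name `count_down` is made up. Python does not hang on runaway recursion; the interpreter gives up with a `RecursionError` once its recursion limit is reached, which is the “abort” flavour of this crash.

```python
def count_down(n):
    # Bug: the base case tests for exactly zero, so a negative
    # argument walks away from it and the recursion never ends.
    if n == 0:
        return 0
    return count_down(n - 1)


def crashes(n):
    """Return True if count_down(n) dies from runaway recursion."""
    try:
        count_down(n)
    except RecursionError:
        return True
    return False
```

Calling `crashes(5)` returns normally, while `crashes(-1)` runs into the limit. Nothing in the language stopped me from writing the faulty base case.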

Such logical errors simply can’t be prevented. A language can however provide constructs like range or set-based loops, rather than explicit counting, to help limit the severity. Any time the programmer can avoid an explicit loop variable is an opportunity to avoid an infinite loop. This does nothing for the recursion case, and most programs will need at least a handful of loops with custom counting. Looping is a fundamental aspect of programming and ultimately there is no way a language can actually prevent an infinite loop.
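
To make the difference concrete, here is a minimal Python sketch of the same sum written both ways. The function names are invented for the example. The first version depends on me remembering to advance the counter; the second has no counter to get wrong.

```python
def total_with_counter(values):
    total = 0
    i = 0
    while i < len(values):
        total += values[i]
        i += 1  # delete this line and the loop never terminates
    return total


def total_with_iteration(values):
    total = 0
    for value in values:  # no counter to get wrong
        total += value
    return total
```

Both return the same result for a finite list. The language still lets me write the first form, and sometimes I will have to.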

Memory Exhaustion

Even with garbage collection and a lack of pointers, a program can still easily exhaust enough memory to bring itself to a halt. There is a point at which your remaining memory is so low that no operation can work anymore, even though the program appears to be running. Even the most trivial of execution paths tends to require a little bit of memory.

Noteworthy is that your program will likely stop being useful long before you’ve fully exhausted the available memory. Most operating systems provide swap space, a destination for memory which isn’t currently used. If your program operates on too much memory it’ll continually swap in and out, something called thrashing. Since the swap space, usually a disk, is quite slow compared to main memory, your program, and others, will face a significant speed drop — often to the point where it becomes unusable.

Caching too many objects is one quick way to use up too much memory. Since a compiler cannot deduce what you are attempting to do with your cache, it has no way to prevent it. Perhaps pure functional programming could avoid this. If you simply can’t create a cache, there really isn’t any way to slowly exhaust your memory. Though even in that case you can still keep working with an ever increasing data set.
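
The following Python sketch contrasts a cache that grows without limit with one that I have bounded by hand. `BoundedCache` and its `max_entries` parameter are hypothetical names for this example, not part of any library. The point is that the limit is my decision as the programmer; the language has no idea which of the two I intended.

```python
from collections import OrderedDict


class BoundedCache:
    """A tiny least-recently-used cache with a hard size limit."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._items = OrderedDict()

    def get(self, key, compute):
        if key in self._items:
            self._items.move_to_end(key)  # mark as recently used
            return self._items[key]
        value = compute(key)
        self._items[key] = value
        if len(self._items) > self.max_entries:
            self._items.popitem(last=False)  # evict the oldest entry
        return value

    def __len__(self):
        return len(self._items)


unbounded = {}
bounded = BoundedCache(max_entries=100)
for n in range(10_000):
    unbounded[n] = n * n             # grows with every new key
    bounded.get(n, lambda k: k * k)  # never holds more than 100 entries
```

After the loop the plain dictionary holds every key it has ever seen, while the bounded cache holds only the most recent hundred.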

Using huge blocks of memory is another way to exhaust that resource. Since computers now have a relatively large amount of RAM, it is tempting to load resources which would otherwise remain on the disk. A language can’t prevent you from doing this, nor should it, as such caching is often a huge performance boon.

Data Race

Any program with threads runs the risk of a data race. That is when one thread attempts to write to a value while another thread is also accessing it. The immediate result, if something goes wrong, is variable corruption. This may then lead to a deadlock, infinite loop, or a memory fault. Worse than a crash, it may silently corrupt the data being processed. While languages provide many synchronization primitives, it really isn’t possible for them to guarantee that no data race occurs.
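
Here is a hedged Python sketch of the classic lost-update race on a shared counter. The class and function names are invented for the example. Whether the unsafe version actually loses updates on a given run depends on the interpreter and on timing, so I make no claim about its exact result, only that it can never exceed the expected total.

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def unsafe_increment(self):
        # Read, add, write: another thread can slip in between the
        # steps, and one of the two updates is then lost.
        current = self.value
        self.value = current + 1

    def safe_increment(self):
        with self._lock:  # only one thread at a time in here
            self.value += 1


def run_in_threads(increment, threads=4, rounds=10_000):
    def work():
        for _ in range(rounds):
            increment()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()


unsafe = Counter()
run_in_threads(unsafe.unsafe_increment)  # may come up short of 40000

safe = Counter()
run_in_threads(safe.safe_increment)      # always reaches 40000
```

The lock fixes this particular counter, but nothing forces me to use it. Both methods are perfectly legal code.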

One exception is purely functional programming with threads and a lack of shared data. If your parallel processing has no ability to work with the same data there is no possibility of having a data race. In programs where pure data processing is required this may actually be an option. For the vast majority of programs however this is not likely a sufficient memory model.
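
A rough Python sketch of that style follows. The `word_count` function and the sample lines are made up. Each worker receives its own input and returns a fresh value, so there is nothing for two threads to fight over. Note the assumption: Python does not enforce this purity, it is only a discipline I am following here.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(line):
    # Pure: depends only on its argument and touches nothing shared.
    return len(line.split())


lines = ["a crash is a crash", "no shared data", "no data race"]
with ThreadPoolExecutor(max_workers=3) as pool:
    counts = list(pool.map(word_count, lines))

# Results are combined only after the workers have finished.
total = sum(counts)
```

This works nicely for pure data processing. As soon as the workers need to update something in common, the discipline breaks down and the race is back.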

Deadlock

The fight against data races requires the use of synchronization; any program using locks must then contend with deadlocks. This happens whenever a thread is waiting on a synchronization object that will never be released. One common case is when two threads both attempt to lock two mutexes in different orders: they each hold the lock now being requested by the other. A more subtle case is waiting on a condition variable that will never be notified by any other thread.
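
The two-mutex case can be sketched in Python as follows. All names are hypothetical. The barriers are there only to force the unlucky interleaving on every run, and the timeout on the second lock exists only so that the demonstration finishes; with a plain `acquire()` both threads would wait forever.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
both_hold_first = threading.Barrier(2)
both_tried_second = threading.Barrier(2)
got_second = {}


def worker(name, first, second):
    with first:
        both_hold_first.wait()  # each thread now owns one lock
        # A plain second.acquire() would block forever here; the
        # timeout exists only so the demonstration can finish.
        acquired = second.acquire(timeout=0.2)
        got_second[name] = acquired
        if acquired:
            second.release()
        both_tried_second.wait()  # hold `first` until both have tried


one = threading.Thread(target=worker, args=("one", lock_a, lock_b))
two = threading.Thread(target=worker, args=("two", lock_b, lock_a))
one.start()
two.start()
one.join()
two.join()
```

Neither thread obtains its second lock. Having every thread take the locks in one agreed order removes this particular cycle, but that is a convention the programmer has to keep, not something the locks themselves check.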

How much functionality is lost depends on where the deadlock occurs. It is reasonable to assume that most deadlocks result in a significant loss of utility. A language can only prevent this problem by simply not offering threads, or any way to communicate with any other process. For even a single-threaded application may be waiting on an external resource that is never freed. While you may not wish to call it a deadlock, the effect is exactly the same.
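
As a sketch of the single-threaded case, the Python snippet below uses a `queue.Queue` as a stand-in for some external resource; `inbox` and `wait_for_reply` are invented names. A bare `inbox.get()` would block forever if nothing ever arrives. The most I can do is bound the wait and decide what to do when it expires.

```python
import queue

inbox = queue.Queue()  # stands in for some external resource


def wait_for_reply(timeout_seconds):
    """Return the next message, or None if none arrives in time."""
    try:
        return inbox.get(timeout=timeout_seconds)
    except queue.Empty:
        return None
```

The timeout turns a permanent hang into a recoverable failure, but choosing to use it, and choosing a sensible value, is still up to me.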

Conclusion

Just because a language can’t prevent a crash doesn’t mean it shouldn’t try. There are several features that can make writing programs easier. I’m very skeptical however when a language claims to have *solved* some problem. Fundamentally the above problems cannot be solved and a lot of the *solved* problems are nothing more than variations of the above. Yet ignoring the problems is easily as bad as claiming to have solved them. If the above problems are not somehow addressed there is little point in even writing a new language.

Most distressing are the threading issues. Data races are actually one of the most difficult problems to solve. Certain styles of functional programming can come extremely close, or even achieve a solution, but essentially at the cost of concurrent processing. That does give us hope however. Perhaps this problem is actually addressable. Can it be completely solved? Unlikely, but I’d certainly hope we can do a lot better than what we have now.
