Conditionals are a key construct in programming: from the simple if statement, to loops and switches, and even to dynamic mapping. The essence of program flow is defined by its conditions, and thus invalid conditions are at the root of many defects. Despite their critical importance I have not seen a clean and well-structured syntax for this construct.
In this article I will explore the generalities of conditionals and consider options for their syntax. I’m not trying to fully explore the topic or provide a decisive view. It’s like an excerpt from a conversation on language design.
The Basics
For any kind of programming I basically expect the following three conditionals to be available.
[sourcecode]
if a == 5:
    //do something
switch a:
    case 1: //something
    case 2: //something
while a < 10:
    //do something
[/sourcecode]
These three constructs cover the most common and basic scenarios for programming. Without a clean way to express these conditions the language would be severely handicapped. I don’t mean, however, that they should be expressed precisely as written above; that’s just pseudo-code. I mean the constructs must be available: essentially we need a single-branch construct (the if-statement), a multi-branch construct (switch), and a conditional looping construct.
Single or Multiple Branch
What bothers me is the artificial separation between single and multiple constructs. An “if” statement is really nothing but a “switch” statement with only one or two conditions. Traditionally C has separated them because its switch accepts only constant integer values as cases, not arbitrary conditions. Intuitively they persist because you often view them quite differently. An “if” statement is a toggle controlling execution of a code block, whereas a “switch” statement is a selection mapping from a variable to a block of code.
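To make the two views concrete, here is a minimal sketch in Python (used only as a stand-in, not as a proposal for the syntax). The function names `describe_switch` and `describe_if` are made up for this illustration. The first treats selection as a mapping from a value to a block of code; the second treats each condition as a toggle guarding a block.

```python
def describe_switch(a):
    # "Switch" view: the value selects one block of code from a mapping.
    handlers = {
        1: lambda: "one",
        2: lambda: "two",
    }
    # The fallback plays the role of a `default:` case.
    handler = handlers.get(a, lambda: "something else")
    return handler()


def describe_if(a):
    # "If" view: each condition is a toggle guarding a block of code.
    if a == 1:
        return "one"
    elif a == 2:
        return "two"
    else:
        return "something else"
```

Both functions return the same result for every input, which is the point: the difference is in how the reader perceives the selection, not in what gets executed.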
Functional languages, like Haskell, have another selection statement known as pattern matching. This goes beyond a simple condition and allows extraction of values from the pattern. I think this is a great construct, as it merges the logic of matching with the use of the matched fragments. I’m not so sure I like the syntax I’ve seen for it, but the construct itself should definitely exist in any new language, such as Cloverleaf.
I should point out that there are actually two case-like constructs. One is a true switch, where you provide the value to match and a series of patterns. The other is really just a set of conditions, without a matching variable. In C the latter is fulfilled by if-else constructs. Here’s an example of a few situations.
[sourcecode]
//single value switch
switch a:
    case 1:
    case 2:
    default:
//a switch with complex conditions
switch:
    a < 5:
    a > 10:
    true:
//arbitrary conditions, like a chain of C if-else statements
switch:
    is_mine():
    is_theirs():
[/sourcecode]
Could these be merged into a single construct without loss of clarity? Perhaps the selection variable is optional, and when specified can be referred to within the case statements. The case statements could also always be conditionals where a single value is simply interpreted as equivalence. We can rewrite the above switches with this logic:
[sourcecode]
switch a:
    1:
    2:
    true:
switch a:
    ? < 5:
    ? > 10:
    true:
switch:
    is_mine():
    is_theirs():
[/sourcecode]
This might work, but the choice of syntax will have to be made very carefully. While the above situations seem clear and unambiguous, you have to imagine this with a lot of nested function calls and more complex expressions. Also, there would be actual code in each case.
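The semantics of such a merged construct can be sketched as a plain Python function. Everything here is hypothetical: `switch`, `otherwise` and the `(test, action)` pair encoding are names chosen for this illustration, not an existing API. A callable test is treated as a condition; any other test is compared to the subject for equality.

```python
_NO_SUBJECT = object()  # sentinel: distinguishes "no subject" from subject=None


def otherwise(*_):
    # Hypothetical catch-all test; accepts zero or one argument.
    return True


def switch(cases, subject=_NO_SUBJECT):
    """Run the action of the first matching (test, action) pair.

    Returns the action's result, or None if no case matches.
    """
    for test, action in cases:
        if callable(test):
            # A condition: receives the subject only when there is one.
            matched = test() if subject is _NO_SUBJECT else test(subject)
        elif subject is _NO_SUBJECT:
            # No subject: a plain value is used as a truth value.
            matched = bool(test)
        else:
            # With a subject: a plain value means equivalence.
            matched = (test == subject)
        if matched:
            return action()
    return None


a = 7
size = switch(
    [
        (lambda x: x < 5, lambda: "small"),
        (lambda x: x > 10, lambda: "large"),
        (otherwise, lambda: "medium"),
    ],
    subject=a,
)
```

One detail surfaces immediately: once a bare value means equivalence with the subject, a bare `true` can no longer double as the default case, since it would be compared to `a`. The sketch sidesteps this with an explicit `otherwise`, which is the kind of decision the syntax would have to make carefully.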
Remove the “if”
Could a traditional “if” statement be removed and the language rely entirely on a case-like syntax? Semantically this is what would happen anyway, but syntactically I think there is a problem. If-statements require a boolean expression; they introduce a logical condition. Switch statements, however, do not force a type, and thus a true/false comparison must always be made explicit. Consider this pseudo-code:
[sourcecode]
if a == 5:
    /*do something*/
switch a == 5:
    true: /*do something*/
[/sourcecode]
Conditionals imply a true/false comparison. Thus it seems illogical and redundant to have to write “true”. Looking back at our selection of switch statements we do see that they could accept comparisons as well. We could write this as follows:
[sourcecode]
switch:
    a == 5: /*do something*/
[/sourcecode]
This is better, but it still doesn’t feel right. We aren’t switching or selecting in this single case. If we had a series of if-else statements this syntax would probably be okay. The case-like setup below immediately makes clear what the conditions are and removes overhead (though this is just pseudo-code).
[sourcecode]
if a == 5:
    //do something
else if a == 10:
    //do something
else if is_mine():
    //do something
else:
    //otherwise
switch:
    a == 5:
        //do something
    a == 10:
        //do something
    is_mine():
        //do something
    true:
        //otherwise
[/sourcecode]
But the single if-statement, and perhaps its sibling the single if-else statement, are extremely common in programming. The syntax for the basic if should be compact enough to reflect the frequency of its use. Needing to use a switch for that case is clearly inadequate.
A different approach
What if the core syntax exposes only the single if-statement instead? Could we construct a language where advanced constructs, like an else, or a switch statement, could in theory be written in the code itself? Obviously a standard definition of these advanced concepts should be available out of the box; I’m thinking more conceptually now.
A conditional is nothing more than a condition that evaluates to a boolean value and a code block, which is executed if the condition is true. Let’s create a syntax which expresses exactly this notion. That is, let’s treat a conditional statement almost as a variable type. I’ll use the :: operator to join a condition to its code block.
[sourcecode]
a == 5 :: some code
[/sourcecode]
If we call this variable type “conditional” then we could define the signature of the conditional functions. For example, in C syntax you have the following:
[sourcecode]
if( conditional * a );
switch( conditional * a, int len );
while( conditional * a );
[/sourcecode]
“if” is thus a function which takes a single “conditional”. “switch” takes an array of conditionals and executes at most one of them: the first whose condition is true. “while” takes one conditional and repeatedly executes it so long as the condition is true. Now obviously these particular functions would need implicit definitions. Trying to implement these at user-code level would introduce a recursive definition problem. However, and this is a notable point, once you have “if” and “while” you could actually write the “switch” statement.
[sourcecode]
switch( conditional * a, int len ):
    int i = 0
    boolean done = false
    //stop at the first conditional whose condition is true
    while i < len and not done :: {
        done = a[i].value()
        if done :: a[i].code
        i = i + 1
    }
[/sourcecode]
Promoting code blocks and conditionals to proper types allows a lot of freedom, though it perhaps makes the life of the compiler/optimizer a bit more difficult. The pseudo-code above is enough to show that simple built-in “if” and “while” functions are enough to implement more complex behaviour.
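The same idea can be tried out in an existing language. The following is a minimal sketch in Python, assuming a hypothetical `Conditional` type whose `value` and `code` fields mirror the pseudo-code; `if_` and `while_` stand in for the built-ins and simply lean on Python’s own statements, while `switch` is written purely in terms of those two.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Conditional:
    # Both parts are thunks, so nothing is evaluated until asked for.
    value: Callable[[], bool]  # the condition
    code: Callable[[], Any]    # the block to run when the condition holds


def if_(c):
    # Stand-in for the built-in "if".
    if c.value():
        c.code()


def while_(c):
    # Stand-in for the built-in "while".
    while c.value():
        c.code()


def switch(conditionals):
    # Uses only if_ and while_, following the pseudo-code above.
    state = {"i": 0, "done": False}

    def step():
        current = conditionals[state["i"]]
        state["done"] = current.value()
        if_(Conditional(lambda: state["done"], current.code))
        state["i"] += 1

    while_(Conditional(
        lambda: state["i"] < len(conditionals) and not state["done"],
        step,
    ))


log = []
a = 10
switch([
    Conditional(lambda: a == 5, lambda: log.append("five")),
    Conditional(lambda: a == 10, lambda: log.append("ten")),
    Conditional(lambda: True, lambda: log.append("otherwise")),
])
# log is now ["ten"]: only the first matching conditional ran.
```

The lambdas and the mutable `state` dictionary are the Python equivalent of the syntactic cruft: the semantics work out, but without dedicated syntax the call sites get noisy.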
Would we really want to do this however? Let’s go back to an earlier switch statement and implement it with this method.
[sourcecode]
switch( {
    a == 5 ::
        expr;
    a == 10 ::
        { block; };
    is_mine() ::
        expr;
    true ::
        { block; };
} )
[/sourcecode]
This doesn’t actually look too bad, except it has a pesky double-bracketing. This is just pseudo-code, but I believe most languages would require some kind of syntactic cruft here. Adding a switch variable (like a C switch) would make this problem worse. A language is not required to have this issue, but it is difficult to avoid.
What to do?
I haven’t really discussed loops here much. I think that if a satisfactory syntax for the “if” and “switch” statements is found then a clean notation for loops will drop out naturally. In terms of dynamic mapping, I’ve considered the pattern matching approach, but only at a limited non-function level.
I’m not sure I’ve really found the answer yet. The final approach, treating conditionals as proper variable types, seems to offer the most flexibility, but it has some syntactic problems. I might try to press ahead with this approach in Cloverleaf for now. If anybody has seen a nicer syntax for conditionals, or has some ideas, please let me know.