“goto”: the demonized programming construct. This little statement allows you to jump to somewhere else in the code while skipping the statements in between. Opponents say it leads to spaghetti code and has no business in modern programming. Most new languages buy into this argument and don’t include a goto statement, yet fail to include a solution for the places where goto is still useful.
There are several situations where goto produces the cleanest possible code. Trying to cram the same logic into a series of if statements or loops can in fact lead to more obfuscated code. Within state machines, including parsers, it often has a role. For highly optimised code it is a good way to keep the code lean. All in all there are enough valid reasons for goto that not including it in a language is a mistake.

Alternatives

Wherever there is a clean alternative to goto it should be used. For example, breaking out of multiple levels of nesting can be done with a “break” or “continue” keyword which accepts a label for a loop construct. Common exit code from a function can be handled via a “finally” block. Often a goto can be avoided simply by using a closure or local function with a return statement.
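As a rough sketch of the closure alternative in C++ (which has no labeled “break”), an immediately invoked lambda lets “return” leave both loops at once. The function name describe_location, the use of nested std::vector as the matrix, and the result strings are assumptions made for this illustration only.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical example: search a matrix and describe where `wanted` was found.
std::string describe_location(const std::vector<std::vector<int>>& matrix, int wanted)
{
    int row = -1;
    int col = -1;

    // Immediately invoked lambda: `return` exits both loops in one step,
    // which is the job a goto or a labeled break would otherwise do.
    const bool found = [&]() -> bool {
        for (std::size_t i = 0; i < matrix.size(); ++i) {
            for (std::size_t j = 0; j < matrix[i].size(); ++j) {
                if (matrix[i][j] == wanted) {
                    row = static_cast<int>(i);
                    col = static_cast<int>(j);
                    return true;
                }
            }
        }
        return false;
    }();

    if (!found)
        return "not-found";
    return "found at " + std::to_string(row) + "," + std::to_string(col);
}
```

Whether this reads better than a plain goto is a matter of taste; the lambda adds a scope and a capture list that the goto version does not need.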

Restrictions

Nobody will claim that unlimited goto is ever required. We aren’t trying to recreate setjmp/longjmp, which are truly hard to reconcile with a modern language. To preserve stack integrity, goto can only ever allow one to go up the stack. Preserving caller context may also require that goto can only be used within a function. These limitations are reasonable, and most uses of goto live within them.

Implications

Obviously, if the execution path is going to be jumping about, very clear rules about variable lifetime are necessary. This is the clearest reason why goto can only ever go up the stack. Going down the stack means entering scopes of variables which haven’t yet been initialized. So using a goto will unwind the stack up to the level of the destination. In terms of resources this essentially requires the RAII pattern for resource management: we can’t allow the goto to leak resources.

In the destination scope it also follows that the construction of any variable must not be skipped. (C++ has this rule.) That is, every variable that is in scope at the label must already have been constructed and initialized by the time the goto statement executes.
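The unwinding behaviour can be observed in C++ today. The following is a minimal sketch, not a recommendation: the Tracked type, its log messages, and the function name jump_out_of_scopes are all invented for this illustration. A goto that leaves two nested blocks destroys the objects of those blocks, innermost first, before execution continues at the label.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical resource type: records acquisition and release in a log.
struct Tracked {
    std::vector<std::string>& log;
    std::string name;

    Tracked(std::vector<std::string>& l, std::string n)
        : log(l), name(std::move(n))
    {
        log.push_back("acquire " + name);
    }

    ~Tracked() { log.push_back("release " + name); }
};

std::vector<std::string> jump_out_of_scopes()
{
    std::vector<std::string> log;   // constructed before the goto, in scope at the label
    {
        Tracked outer(log, "outer");
        {
            Tracked inner(log, "inner");
            goto done;              // leaves both blocks: inner, then outer, is destroyed
        }
    }
done:
    log.push_back("done");
    return log;
}
```

The returned log lists both acquisitions, then the release of inner, then the release of outer, and only then the entry written at the label, so nothing leaks across the jump.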

Questions

Jumping outside of the current function probably doesn’t make sense. The most significant reason is that there is no guarantee that the calling function has the destination label defined. We need to make some allowances, however. If the language makes use of anonymous functions, closures and lambdas, it may make sense to allow goto out of those scopes. If those components are instead defined elsewhere, for convenience or reuse, would goto still be allowed?

Examples

Here is a random collection of places where goto may be a suitable option.

Breaking from inner loops

This is a common example of where goto is helpful. You have an inner loop and wish to break out of the outer loop. For example, say you wish to find an element in a matrix.

	location_t where;
	for( int i=0; i < num_rows; ++i )
	{
		for( int j=0; j < num_cols; ++j )
		{
			if( matrix(i,j).is_what_we_want() )
			{
				where.set(i,j);
				goto found;
			}
		}
	}
	throw error( "not-found" );

	found:
	//do something with it

There are obviously other ways of doing this that avoid the “goto”, but this code is perfectly clear and easy to follow. It does not make sense to code this another way if the only reason is to avoid using “goto”.

Redoing a block of code

You have a bit of code which you might need to execute multiple times within the same scope. A “goto” can sometimes be the clearest option to express this logic.

	redo:
		...
		if( must_redo_expr )
			goto redo;
		...
		if( another_redo_expr )
			goto redo;

The popular alternative is to use a while loop.

	while( true )
	{
		...
		if( must_redo_expr )
			continue;
		...
		if( another_redo_expr )
			continue;

		break;
	}

The loop approach slightly obscures the logic of the code. It also has a severe limitation that if you have an inner loop you simply can’t use “continue” to get to the outer loop. Furthermore, it introduces a new scope which can be a problem for certain variables. Now, you can virtually always use a loop instead, but why would you do that if it requires a lot of code juggling and ends up making the code less readable?

Error handling

Error handling within a function can be more complicated than simply returning “false” or throwing a simple exception. Said error handling may depend on numerous local variables.

	int function( int a, int b, int c )
	{
		int d = 0;
		...
		if( detect_problem )
			goto broken;
		...
		if( other_problem )
			goto broken;
		...
		return d;

	broken:
		report( a, d );
		resolve( c );
		return -1;
	}

Repeating the error handling code multiple times would be bad (code duplication is always bad). You might be able to create an external function, but then you must take care to pass all the variables correctly in each case.

This type of error handling is a miniature form of exception handling. You can basically create local “try-catch” conditions without the overhead (both runtime and syntactic) of using exceptions. It keeps your primary logic easier to read by pushing error handling off somewhere else.

State machine

Sometimes the logic of an algorithm is just not amenable to representation as a tree. This quite often appears when doing parsing or encoding.

	void handle_stream()
	{
	idle_state:
		...
		if( detect_high )
			goto high_state;
		if( detect_low )
			goto low_state;
		if( detect_end )
			goto end_state;
		goto idle_state;

	high_state:
		...
		if( continue_high )
			goto high_state;
		goto low_state;

	low_state:
		...
		goto idle_state;

	end_state:
		...
		return;
	}

Though state machines are pretty common, we don’t see this type of code very much. Most such code tends to be created by code generators, such as a parser generator or a protocol generator. In cases where performance is not an issue you’ll instead see a higher-level functional or object-based approach: using a “switch” statement or function pointers you can avoid using “goto”, though the logic is the same.
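For comparison, here is a rough sketch of the “switch” form of the same four states. The original conditions are elided, so everything concrete here is an assumption made for illustration: the input is a string, 'H' and 'L' stand in for detect_high and detect_low, 'E' or running out of input stands in for detect_end, and the function trace_states simply records one letter per state visited.

```cpp
#include <cstddef>
#include <string>

enum class State { Idle, High, Low, End };

// Hypothetical driver: walks the input and returns a trace of visited states
// ('i' = idle, 'h' = high, 'l' = low, 'e' = end).
std::string trace_states(const std::string& input)
{
    std::string trace;
    std::size_t pos = 0;
    State state = State::Idle;

    while (state != State::End) {
        // Current symbol; an exhausted input is treated like an explicit 'E'.
        const char c = pos < input.size() ? input[pos] : 'E';

        switch (state) {
        case State::Idle:
            trace += 'i';
            ++pos;                                  // idle always consumes the symbol
            if (c == 'H')      state = State::High; // was: goto high_state
            else if (c == 'L') state = State::Low;  // was: goto low_state
            else if (c == 'E') state = State::End;  // was: goto end_state
            break;                                  // otherwise stay idle
        case State::High:
            trace += 'h';
            if (c == 'H') ++pos;                    // continue_high: consume and stay
            else          state = State::Low;
            break;
        case State::Low:
            trace += 'l';
            state = State::Idle;
            break;
        case State::End:
            break;                                  // not reached inside the loop
        }
    }

    trace += 'e';
    return trace;
}
```

Each “goto some_state” becomes an assignment to state followed by a trip back through the loop and the switch, which is the extra dispatch the goto version avoids.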
