Tags

, ,

Collection iteration is perhaps the most insidious language construct. Simple to create and easy to understand. Yet lurking within lies the ability to create random non-local defects. Monstrous bugs that you’ll spend hours, if not days, trying to find, if you even notice them.

Here is a prime example of a defective loop, found in the source of my Leaf compiler. It’s in C++, but rest assured the same problem code can be written just as easily in any language.

1
2
3
for( auto & mb : all_blocks ) {
    complete_block( mb );
}

In the general form this code is not safe. It doesn’t matter if you are using the above form, explicit iterators, or manual indexing. Which items the loop itself iterates depends on the behaviour of complete_block. What happens if all_blocks is modified within the loop? The interplay between the language specification, the type of collection, and exact manner in which it is changed, is complex enough that we might as well consider it undefined.

Why it’s undefined

Let’s first give an expanded view of the loop to better see what is happening.

1
2
3
for( auto iter = all_blocks.begin(); iter != all_blocks.end(); ++iter ) {
    complete_block( *iter );
}

This is the form that ultimately all loops take. There is an iteration variable that starts at the beginning of the collection. Each iteration steps to the next location. Once the end is reached the iteration is stopped.

It breaks if complete_block modifies the collection. But not always. Many C++ container operations have the property of “invalidating iterators”: after such an operation any iterator to the container is invalid, and using it results in undefined behaviour.

For example, inserting a value into a vector may invalidate iterators. This is required since the iterators may actually be pointers to a location in memory. If inserting a value into the collection causes reallocation, those pointers are left pointing to the old location. Accessing such a pointer is undefined.

Traditional syntax can skip or duplicate elements

The traditional loop syntax doesn’t have exactly the same problem. It won’t have invalid iterators but it may skip or duplicate elements.

1
2
3
for( int i=0; i < all_blocks.size(); ++i ) {
    complete_block( all_blocks[i] );
}

When a new item is inserted into the collection the loop does one of two things. If the value is inserted prior to the current index, the current value will be pushed to position i+1, and then used again in the next loop operation. The newly inserted value will not be iterated. If the value is inserted after the current position it will simply be iterated and no issue arise.

When the current item is used after complete_block a strange mismatch can occur. If a value was inserted or deleted, it’s not longer certain to which element i refers. It could even be past the end of the collection and result in undefined behaviour.

1
2
3
4
for( int i=0; i < all_blocks.size(); ++i ) {
    complete_block( all_blocks[i] );
    something_else( all_blocks[i] );
}

Always undefined, even if safe

Now let’s use an example from Python.

1
2
for mb in all_blocks:
    complete_block( mb )

I really have no idea what happens if all_blocks is modified by complete_block. I know from a few situations that my code doesn’t crash, but that’s it. I choose Python for this reason only: that I don’t know. Other than for C++ it’s about the level understanding I have for all languages. To be fair, in C++ the rules are so complex that I generally have to assume I don’t understand it either.

Even if the above is entirely safe, it doesn’t crash nor corrupt memory, it’s uncertain what should happen. If the all_blocks container is modified, should new elements also be iterated? If an element is removed should the iteration just cover the remainder of the collection? Or maybe the iteration should just raise an exception of its modified status. I really can’t think of a sane behaviour.

Just don’t do it…

The only safe solution is to never modify a container will iterating it. If modification is unavoidable then a copy of the container must be created prior to iteration. This is a common pattern I have in my code.

1
2
3
4
5
std::vector<block> dup;
dup.swap( all_blocks );
for( auto & mb : dup ) {
    complete_block( mb );
}

This only works though because I have a queue-like collection here. It won’t work in all cases. Indeed, there are several ways to handle modification within the loop. Each depending on the circumstances. This is kind of disconcerting since every loop in your program may end up looking different.

Finding a solution assume we know there is a problem. For tight loops performing a minimum of operations it can be clear. But if complete_block is a complex operation then it’s simply too difficult to know. So long as all_blocks is not a local variable then it can modified somewhere else.

It’s also possible that when you first wrote the loop it was fine. Later a colleague changes the behaviour of complete_block to modify all_blocks. They don’t even know about your loop and thus are unaware they are introducing a defect.

The solution

The overhead of having to duplicate the collection to iterate is often undesired. But I repeat my last point, unless you can guarantee it isn’t modified in the loop, it isn’t safe to loop directly. One language that has potential here is Haskell. By using the monad system you can actually track the side-effects done by a function. However, so many functions will be marked IO (meaning they modify global state) that we end up in the same situation quite quickly.

Immutable containers are another approach. If the container can’t be modified there is never a problem with a loop. Attempting to work with immutable containers is however a challenge all on its own.

I once tried a locking container solution. To iterate requires a lock on the container. No modifications can be done so long as a lock is held. It worked. Errors were unfortunately not detectable at compile-time, but at least at run-time I got nice clear exceptions about the problem. The biggest problem is that suddenly I lost access to all the standard containers and iteration syntax. It was too much of a burden to use this approach. I think it is a good approach, but it really requires full cooperation from the language.

At the moment we’re left with one of the fundamental programming constructs being terribly unsafe. I have tracked a lot of defects to such loops. I bet that most applications have this issue lurking in at a least a couple of their loops. It’s hard to detect and hard to avoid. If you have ideas on how to address this problem, then please let me know.

About these ads