The Life of a Programmer

Search

Finding connected cities in a road network

Connecting locations with roads is the key mechanic in my game, The Way to Hexet. The player places individual road segments, but in order to evaluate the score, and bonuses, I need to track a high-level connection network. This network answers the core question of which groups of key locations are connected. In this article I’ll go through my data structures, and show how, step-for-step, I build a specialized path-finding, or flood-fill, algorithm.

Map Structure

Below is an image from the game, showing various locations and roads between them. In the initial prototype I called these locations “cities”, and the name has stuck in the code. Now, these cities represent locations of interest, thematically they could be a person, a statue, a town or the central tower. I’ll keep using the term “city” here, since I didn’t find a better term that didn’t overlap with other positional concepts.

The key goal of the game is to connect similar cities with each other. For example, all the blue cities in the map, which also share a distinct shape for easier identification. A secondary goal is to connect those cities to the central city, the tower, in the middle of the map. Thematically, you are building rescue paths to get these lost band members to the battle of the bands in time, or to visit photo worthy locations for the ultimate band picture.

Each turn, a player will place a single road on the map connecting two tiles. I need to know which cities those roads connect in order to calculate the score.

Data Structure

There are three key pieces of data to track while finding the connections:

  • Map: Information about each hex tile stored in a matrix.
  • Cities: A list of cities, including their location and type.
  • Roads: A list of all the roads the player has placed on the map.

The map matrix stores each location as a cell that I can address with an “x,y” location pair. It stores the tile type, such as a forest or grassland, and the river information: each of the cell sides may be a river. For more information, refer to my article about hex neighbours. The initial map is static information loaded at the start of the game.

Cities are also part of this static information. In addition to a simple list of cities, I also store a reference to a city in the corresponding map cell — this is a simple nullable pointer. This lets the game code quickly know which city is at a location, as well as having the central list of all cities. There is also a further dataset which provides lists of all cities of a given type.

public class CellData {
	public CellType type = CellType.None;
	// bit-map of sides with rivers
	public int river = 0;
	public CityData city = null;
	
	public bool HasRiverOnSide(int side) {
		return (river & (1 << side)) != 0;
	}
}

Players create roads at runtime as they take their turns, making them the first piece of dynamic information. This is the primary piece of data used to determine which cities are connected. The game stores each placed road in a central list. This piece of data contains the origin and destination, as well as the road type, whether it’s a normal road or a bridge.

public class RoadData {
	public Position from;
	public Position to;
	public RoadType type;

	...
}

Algorithm: Connecting cities

I want to go over the basic algorithm now, even though a critical piece of data is missing. I often create algorithms like this first, since it helps me hone in on the correct data storage.

The algorithm is based on identifying networks: cells which are connected by one set of roads. In graph theory terms, I’m looking for the connected components of the graph where the map cells are vertices and the roads are edges. Cities are connected if they are in the same network: they reside in cells in the same connected component.

There are many ways to solve this problem. The algorithm I’ve chosen is specific to my game’s needs.

Initialize the grid

I’m looking for networks, so let me start with a state that has no networks. Each cell has a variable that stores its network, which I initialize to 0 — a placeholder for no network. As I identify networks, I’ll update this variable.

Knowing the network a cell is in also tells us the network a city is in, as each city has a location and each location has a cell associated with it. Thus, for any city, I can lookup the network it belongs to. This assumes that cells are strictly part of one network; roads in and out of a single map cell are always connected. For example, two parallel roads cannot run through the same cell.

Walk the ~roads~ cells

This part of the algorithm is fairly common to path-finding. I mentioned in the intro that it’s also like flood-filling, which is semantically speaking, closer to what I’m doing here. These families of algorithms share the same basic form, only differing in their goal and optimization.

I’ll start with an algorithm that walks over all the map cells and shows the problem with this later. For each cell, in arbitrary order, I do:

for cell in map_cells:
	// Is this cell part of a network already?
	if cell.network != 0:
		continue
		
	// Create a new network number for this cell
	cell.network = next_network_number()

	// initialize a stack of cells
	open_cells = [cell]
	process_open_cells()

Whenever the stack of open_cells isn't empty, I process it until it is. It's important to see how the below code is executed inside the loop above.

function process_open_cells():
	while !open_cells.empty():
		cell = open_cells.pop()

		// check connections to each neighbour, on each side of the cell
		foreach side in cell_sides(cell):
			neighbour_cell = neighbour_on_side(cell, side)
			
			if !has_road_connection( cell, neighbour_cell ):
				continue

			// already part of the network*			
			if neighbour_cell.network != 0:
				assert neighbour_cell.network == cell.network
				continue
				
			// assign network and add to processing
			neighbour_cell.network = cell.network
			open_cells.push( neighbour_cell )

The * in this second part is the key piece that stops the algorithm from processing indefinitely. Otherwise, since our graph is undirected, processing a neighbour would end up pushing the source cell onto the stack as well. This would also happen in an unidirectional graph with loops.

When I check neighbour_cell.network != 0, the network of the neighbour will either be 0, or the value of the current cell’s network. If I encounter a different network then something has gone wrong, thus the assert in the code. This would happen if has_road_connection isn’t working correctly, either missing some connections, or treating them as directed edges.

This is what I mean about this algorithm being specialized. I don’t mean that’s it special, novel, or unique, solely that it’s been adapted to work specifically with the constraints I have in my game.

I repeat this for all the cells on the map. This gives me a network number for every cell, and thus every city.

Problems

As currently written, there are two problems with this algorithm, at least within the context of my game.

  • Every cell without a road gets labelled with a unique network
  • If I only have a list of roads, then has_road_connections is linear, and makes this algorithm terribly inefficient
Every cell has a network

The first problem is only a partial problem. All cells having a network was creating nuances with scoring and rendering. I’d prefer that if the cell has no roads connected to it then it remains with network zero.

The fix is to iterate the roads instead of the cells. I mentioned I can iterate the cells in arbitrary order, so it makes sense that I can instead start from all the cells linked by roads. Roads connect two cells, so this means processing the start and end cell.

For any connected cell, this approach yields the same result. However, the approach does not process any cell that is not mentioned by a road, thus leaving it in the initial network == 0 state. This gives me what I want: I know which network a cell belongs to, or if it doesn’t belong to one.

It has an added bonus: fewer iterations. Since I have strictly fewer roads than cells, I don’t need to check as many cells, thus it’s faster. In my game, I don’t have many cells, and my road-to-cell density gets quite high, so it’s not likely a noticeable performance difference. You could imagine though, that if the maps get larger, a lot larger, how I’d want to avoid iterating every cell.

has_road_connections is slow

The computational complexity of this algorithm depends on the outer loop and the complexity of has_road_connections.

The outer loop, at a glance, looks like it is a nested loop, which we know would be bad. But before processing anything, we always check if a cell is already part of a network. This means that I’ll process each cell only one time. And since I’ve modified it to work from the roads, the outer loop is a linear function of the number of roads.

The list of roads, however, is a plain array. To implement has_road_connections I’d need to scan that list — a linear time operation. Having a linear time complexity loop within a linear time complexity loop, results in a \(O(N^2)\) operation — which would not scale with the real-time needs of an interactive game.

To fix this, I prerecord all the connections in each cell. When the player places a road between cells, I mark the connected edges of a cell. Since a hex only has six edges, I can directly track whether each edge has a connection — like a fixed vector of booleans of length six. This is the same approach I used for the rivers. To determine if a cell is connected to its neighbour, all I need to do is check if there is a connection on that edge. This is a constant time operation, thus avoiding embedded loops.

Instead of a fixed-length vector, I track the connections in a single bit-masked value. This avoids any allocation overhead, and reduces the total memory used. I kind of miss using C++ in places like this, since it offers a bitset which does this for me in a nice abstract way. In C# I’m writing my own, admittedly simple, bit twiddling.

Notice that I’m not considering the loop over the sides of a cell, the foreach side in cell_sides(cell) part. This is because it isn’t a variable number: each cell has six sides, always. The actual speed is impacted by this, but the computational bounds are not. Also, I’d expect that a good optimizer will see this and unroll the loop at least partially.

Fast Result

I’ve left out one complication, that my game is multiplayer. The map has two aspects, the static information, like the cell types and where the cities are, as well as the dynamic information like where the roads are. That static information is global, but the dynamic information is per-player. I need to calculate road connections per-player. One global matrix stores static cell information, while a second matrix stores all the dynamic cell information. Otherwise, everything is basically the same.

public class PlayerCellData {
	// bit-map of side connections
	public int connections = 0;
	
	public bool hasSideConnection(int side) {
		return (connections & (1<<side)) != 0;
	}
	
	public int GetNumberConnections() {
		// Alas, BitOperations are not part of the dotNet version in Unity
		// ?? return System.Numerics.BitOperations.PopCount(connections);
		
		int count = 0;
		for( int i=0; i < 6; ++i ) {
			if( hasSideConnection(i) ) {
				count++;
			}
		}
		return count;
	}
	
	// to which network it belongs (mutable), 0 is uninitialized
	public int network = 0;
}

The result is a fast algorithm I can call every time any player places a road. There are so few roads that I suspect even a slow algorithm would have worked. But one should never underestimate the potential lag, and battery drain, a \(O(N^2)\) algorithm could introduce. This would be especially true if the roads were more dynamic, and networks needed to be calculated per-frame, instead of per-road-placement — think of a drawbridge that does up and down, or a log floating by in a river creating a temporary connection.

Please join me on Discord to discuss, or ping me on Mastadon.

Finding connected cities in a road network

How I check if locations are connected in my hex-based puzzle game

A Harmony of People. Code That Runs the World. And the Individual Behind the Keyboard.

Mailing List

Signup to my mailing list to get notified of each article I publish.

Recent Posts

Search