Making mazes using Minimum Spanning Trees

Making mazes has never been so fun! This post is a high level look at an interesting way of generating random mazes.

Background

Oftentimes, students of Computer Science will discover that a lot of the interesting data structures and algorithms they learn throughout the years never seem to really have useful applications in the actual reality of their software engineering or development jobs.

Thus, when I came across this technique for generating simple mazes that gave me an excuse to dust off a few of those things I'd learned as an undergraduate but never looked at again, I had to give it a try.

This article is going to make use of the following, but will not go into too much detail about them. Check the footnotes for good Wikipedia links for you to explore further!

The disjoint-set or union-find data structure¹
Undirected graphs²
Minimum spanning trees (MSTs)³
Kruskal's Algorithm for finding the MST of a graph⁴

Warning

Be advised, this article is going to serve as an outline for how to perform the described technique, but many details will be left out. If you'd like to see an implementation of a maze generator written in Python using this strategy, you can find one here.

Step 1: Establishing the Field of Play

First things first -- we need a way to represent our maze in code. Without this, this maze train won't even leave the station!

To do this, I first have to quickly cover a very special kind of graph² called a lattice graph⁵. A lattice graph is a graph where the nodes are arranged in a specific tiling pattern -- in this case, we'll use square lattice graph so that our nodes are arranged in a chessboard-like grid pattern.

To make our eventual maze easier to view, I'm going to use the matplotlib library in Python and its pcolormesh function (with some special configuration) to draw our nodes as white squares on a black background, as you can see in the picture below. Note that our graph currently has no edges!

Lattice graph nodes in a grid — Each white square is a node

Next, we'll complete our lattice graph by drawing in all the edges between adjacent nodes. The image below (which could double as an optical illusion) fills in the edges by filling in the squares between the nodes. The start and end nodes of our maze are also filled in as green and red, respectively.

Lattice graph nodes and edges in a grid — Each lilac square is an edge

Step 2: Make it...a Maze?

As you can imagine, we can transform our grid into a maze by simply choosing whether to leave our edges as empty space, or fill them with walls...let's look first at a naive approach.

A Naive Approach

We're making a maze, so surely starting with some randomness makes sense...right?

Here's some Python code. We'll keep each node filled (not shown here), and for each edge in our list of edges, we'll get a random floating point number between 0 and 1.0 and leave that edge empty if that number is greater than or equal to 0.5. If not, we'll fill the edge in with a wall.

# edges are stored as tuples (u, v) where u is the start
#   vertex or node, and v is the vertex or node
# note: our graph is undirected, so the order of the
#   vertices/nodes in our edge list doesn't matter
for (u, v) in edges:
    # the coordinates of our edge is just the average
    #   of the x and y positions of it's two nodes
    edge_x = (u.x + v.x) // 2
    edge_y = (u.y + v.y) // 2

    # maze is a 2D array representing our maze output
    #   setting a square to 0 means a wall (black)
    #   setting a square to 1 means no wall (white)
    maze[edge_x][edge_y] = 1 if randomfloat() >= 0.5 else 0

Alright, let's generate our maze! We'll also go ahead and fill in edges with the same color as our nodes to make this look a lot better.

A maze with randomly generated walls — A...maze?

Right away, we have some issues:

Critically, there is no path from the start node (green) to the finish node (red). This maze isn't even solvable!
Many of our nodes got left out, and can't ever be visited by someone wandering through our maze. This isn't as bad as the first problem, but it does mean large portions of our maze may end up completely unused.

Obviously, there must be a better way to decide whether an edge between two nodes should be filled in as a wall or not...

Minimum Spanning Trees to the Rescue

We'll take a step away from our little maze for a moment to discuss a much more sophisticated, but remarkably simple approach using a Minimum Spanning Tree².

But what is a minimum spanning tree (MST)? In short, an MST is a tree structure that ensures each node of our graph is connected to every other node by exactly one route through the edges -- hence "spanning" and "tree". It also guarantees that the total weight (the sum of all edge weights) of our tree is minimized -- hence "minimum". The MST is created by taking some subset of our list of edges (we'll talk about finding this subset in a minute).

But wait...the edges between our nodes don't have a weight (or a distance, which is the same thing). Absolutely right! But don't worry -- we'll solve that shortly.

Info

This exact same technique could be done without using a Minimum Spanning Tree. The important part is that you generate some sort of spanning tree to guarantee all nodes can be reached.

The reason a minimum spanning tree is so interesting for our lattice graph is that it solves both the issues our random method had!

Since an MST must connect every node to every other node (hence "spanning" in the name), we're guaranteed to have a path (exactly 1, as it turns out) between our start node and end node
Similarly, the above guarantee means that every other node in our maze is also reachable -- no wasted space!

Assigning Weights

To allow us to generate our minimum spanning tree, we of course need some weights for our edges. No spoilers, but this is actually where the real magic happens.

Instead of using our random numbers to decide when to fill in an edge, we'll just assign that random number as the weight of that particular edge. Once we've done this for every edge, we have a lattice graph with a set of randomly weighted edges connecting each adjacent node to its neighbors.

for edge in edges:
    edge.weight = randomfloat()

Finding the MST

As much as I'd love to dig into the intricacies of Kruskal's Algorithm⁴ and the Disjoint-set data structure¹ here, for the sake of brevity I'll leave that exploration for another time.

Instead, here is a short summary of our approach given our lattice graph with randomly weighted edges as a starting point. As noted at the top of the post, you can see code implementing all this here:

Use Kruskal's Algorithm to calculate a minimum spanning tree of our graph
Fill in all nodes with white (empty) and all edges as black (walls)
For each edge in the MST, fill in those edges as white (empty).

Once we've done that...we have a maze!

A maze with walls generated from our MST — A maze!

Kruskal's algorithm is quite good when we have a list of edges. It makes use of a Disjoint-set structure to keep adding edges until all our nodes are part of the same set (in this case, that means connected). It also doesn't add edges that would not join new nodes together, so we don't end up with multiple paths that get to the same place.

Info

Because we're using an MST to make our maze, we won't have any cycles -- there will only ever be one path from any node to any other node. If you'd like a maze with loops, this technique won't work for your purposes!

Conclusion

Obviously, a lot of the finer details have been left out here, but the approach is really that straightforward...

Create a lattice-grid with random weights
Use an algorithm like Kruskal's the find the MST
Fill in only those edges in the MST

Scaling our maze is easy too -- by simply expanding our initial lattice-grid size, we can generate mazes of any size we desire!

A bigger maze — A much bigger (and solved) maze!

From here, you can imagine plenty of interesting things to do:

Allow a seed to be specified for random number generation so you can re-generate the same maze again
Use path-finding or maze solving algorithms to solve the maze (I found that by writing out the maze array, which was filled in with 1's and 0's for empty spaces and walls, I could input it into another program which could solve it!)

Explore the code I used for this post here.