
15-200 Lecture 26 (November 10, 2006)

Finding a Minimum Spanning Tree of an Undirected, Connected Graph

A minimum spanning tree of an undirected graph is a tree formed from that graph's edges that connects all the vertices of that graph at the lowest total cost. You can make a spanning tree of a graph only if the graph is connected. There may be more than one spanning tree of a particular graph.

A minimum spanning tree of a graph with N vertices has exactly N-1 edges. It is a tree because it's connected and acyclic. It's spanning because it reaches every vertex in the graph, and it's minimum because no other spanning tree has a lower total edge weight. If we need to wire a house with a minimum of cable, then we need to find a minimum spanning tree of a graph of the electrical layout of the house.

Minimum Spanning Trees: Why Do We Care?

Many real situations can be modeled with graphs, and many real problems can be solved by finding the minimum spanning tree of such a graph.

My favorite example involves an electrician and a house. Imagine that a collection of electrical outlets has been installed in the walls of a house, and that the electricity enters the house at a single electrical box. We can model this situation as a graph, where each of the outlets and the electrical box is a node, and the walls are the edges. The weight of each edge is the distance along the wall, or segment thereof, between two of the electrical connections.

The electrician may have many different routes available to wire the outlets -- they may be reachable by different paths along the walls. So, the electrician wants to find the set of routes that requires the least amount of wire. This saves money, because less wire is needed. And, it saves time, because the runs along the walls are shorter and, as a consequence, take less time to install.

To solve this problem, the electrician can model the outlets and wall segments connecting them as a graph rooted at the electrical box. Then, the electrician can find the minimum spanning tree of the graph. The edges in this tree give the paths that the electrician should use to run the wires -- they will reach each node, while requiring the least amount of wire.

Greedy Algorithms

Today we are going to study one algorithm for finding the minimum spanning tree of a graph. It is known as Kruskal's Algorithm. Kruskal's algorithm is an example of a Greedy Algorithm. Greedy Algorithms operate by breaking a decision making process down into small steps, and making the best decision at each step. For many problems, this approach will lead to the best possible overall solution -- for others, it will not.

For example, we "make change", by using a greedy algorithm. We hand back $10 bills, until handing back another $10 would be giving back too much money. Then we hand out $5 bills, then $1 bills, then quarters, then dimes, then nickels, then pennies. In the end, we are guaranteed that we have returned the correct amount of change -- with the fewest possible bills or coins.

But, this algorithm would not work, for example, if we had 12-cent coins. Normally, if we have to return 21 cents of change, we return dime-dime-penny: three coins. But, with a 12-cent coin, let's call it the "dozen", the greedy approach would return dozen-nickel-penny-penny-penny-penny: six coins. From this example, we can see that greedy algorithms are appropriate for some problems -- but not all problems.
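The change-making example can be sketched in a few lines of Python. This is only an illustration of the greedy idea: the function name `greedy_change` and the coin lists are assumptions made for this sketch, and amounts are given in cents.

```python
def greedy_change(amount, denominations):
    """Make change greedily: always hand back the largest coin that still fits."""
    coins = []
    for coin in sorted(denominations, reverse=True):
        # Keep handing back this coin until one more would be too much.
        while amount >= coin:
            coins.append(coin)
            amount -= coin
    return coins


us_coins = [25, 10, 5, 1]
with_dozen = [25, 12, 10, 5, 1]   # adds the imaginary 12-cent "dozen"

print(greedy_change(21, us_coins))     # [10, 10, 1]
print(greedy_change(21, with_dozen))   # [12, 5, 1, 1, 1, 1]
```

With the ordinary coins the greedy answer uses three coins. Once the "dozen" exists, the greedy answer uses six, even though dime-dime-penny is still available.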

But Kruskal's Algorithm is a greedy algorithm that does actually work. Let's take a look at a different example, the sidewalk example, and see how.


Kruskal's Algorithm

Although the implementation is a bit complex, the basic algorithm is very straightforward. We simply attempt to add each edge from the original graph to the minimum spanning tree, beginning with the lowest-weight edge and finishing with the greatest-weight edge. We add the edge if it doesn't cause a cycle and pass it up if it does. We continue to add edges until we've added N-1 edges, where N is the number of vertices. Remember, spanning trees have exactly N-1 edges -- never more, never fewer.

In order to make it easy to select the candidates in the right order, from the lowest weight to the highest weight, we store the edges in a priority queue (heap). Then, selecting an edge is simply the deleteMin() operation.
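As a rough sketch of the priority-queue idea, Python's standard `heapq` module can play the role of the heap. The `delete_min` helper and the sample `(weight, u, v)` tuples are assumptions made for this illustration, not the course's own heap class. Putting the weight first in each tuple makes the heap order the edges by weight.

```python
import heapq

# A handful of sample edges as (weight, u, v); the weight comes first so that
# the heap compares edges by weight.
edges = [(4, 4, 7), (1, 1, 4), (3, 2, 4), (2, 1, 2)]

heap = list(edges)
heapq.heapify(heap)          # build the heap in place


def delete_min(heap):
    """Remove and return the lowest-weight edge."""
    return heapq.heappop(heap)


order = []
while heap:
    order.append(delete_min(heap))

print(order)   # [(1, 1, 4), (2, 1, 2), (3, 2, 4), (4, 4, 7)]
```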

Another way of viewing the algorithm is to view the initial configuration as a forest of trees, with each vertex in its own, independent tree. If we take this view, then adding an edge merges two trees into one. When Kruskal's terminates, there is only one tree - the minimum spanning tree.

Let's take a look at the algorithm in operation, using the same graph that we used last class:

The following table shows the edges sorted by weight and whether or not each edge was accepted. Remember, we evaluate each edge, one at a time, from the top of this list down. In a real implementation, we would have added them to a heap, and would be using deleteMin() to get the top one for each iteration.

Edge    Weight  Action
(1,4)   1       Accepted
(6,7)   1       Accepted
(1,2)   2       Accepted
(3,4)   2       Accepted
(2,4)   3       Rejected
(1,3)   4       Rejected
(4,7)   4       Accepted
(3,6)   5       Rejected
(5,7)   6       Accepted
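The walk-through in the table can be reproduced with a short Python sketch. Several things here are assumptions: the graph itself is not reproduced in these notes, so the edge list below contains only the nine edges that appear in the table (the original graph may well have heavier edges that are never examined); the function name `kruskal` and the `tree_of` dictionary are made up for this illustration; and the simple relabeling used to merge trees stands in for whatever Set class the course uses.

```python
import heapq


def kruskal(num_vertices, edges):
    """Run Kruskal's algorithm on edges given as (weight, u, v).

    Vertices are numbered 1..num_vertices. Returns the accepted edges and a
    log of (edge, weight, action) in the order the edges were examined.
    """
    heap = list(edges)
    heapq.heapify(heap)
    # Forest view: tree_of[v] names the tree that currently contains v.
    tree_of = {v: v for v in range(1, num_vertices + 1)}
    accepted, log = [], []
    while heap and len(accepted) < num_vertices - 1:
        weight, u, v = heapq.heappop(heap)        # the deleteMin() step
        if tree_of[u] == tree_of[v]:
            # Both ends are already in the same tree: this edge makes a cycle.
            log.append(((u, v), weight, "Rejected"))
            continue
        # Merge the two trees by relabeling everything in v's tree.
        old, new = tree_of[v], tree_of[u]
        for x in tree_of:
            if tree_of[x] == old:
                tree_of[x] = new
        accepted.append((u, v))
        log.append(((u, v), weight, "Accepted"))
    return accepted, log


# Only the edges listed in the table above, as (weight, u, v).
table_edges = [
    (1, 1, 4), (1, 6, 7), (2, 1, 2), (2, 3, 4), (3, 2, 4),
    (4, 1, 3), (4, 4, 7), (5, 3, 6), (6, 5, 7),
]

accepted, log = kruskal(7, table_edges)
for edge, weight, action in log:
    print(edge, weight, action)
print("edges accepted:", len(accepted))   # 6, which is N-1 for N = 7
```

The loop stops as soon as N-1 edges have been accepted, which is why (5,7) is the last edge examined.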

Using Sets to Detect Cycles

So, how can we figure out if adding a particular edge to a tree will create a cycle? This is very important to Kruskal's Algorithm, because we only add an edge if doing so won't create a cycle.

Imagine that each vertex in a graph is its own set (remember the Set class you created for the lab?).

If you connect two vertices in the same set together, you'll create a cycle. As long as you connect vertices from two different sets, you won't create a cycle. Suppose we have three vertices: A, B, and C. A and B are in different sets, so I'll connect them.

Now AB is a set. Can I connect C to AB? C and AB are in different sets, I'll connect them. Now I have a set called ABC.

Can I connect C to B? C and B are in the same set, so connecting them will create a cycle. That edge gets rejected.
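The A, B, C example can be sketched with Python's built-in sets. This is a minimal illustration, not the Set class from the lab: the names `make_sets` and `connect` are assumptions made for this sketch. Each vertex maps to the set that currently contains it, and connecting two vertices merges their sets.

```python
def make_sets(vertices):
    """Start with every vertex in its own one-element set."""
    return {v: {v} for v in vertices}


def connect(set_of, u, v):
    """Try to connect u and v.

    Returns False, and changes nothing, if u and v are already in the same
    set, since that edge would create a cycle. Otherwise merges the two sets
    and returns True.
    """
    if v in set_of[u]:
        return False
    merged = set_of[u] | set_of[v]
    for x in merged:
        set_of[x] = merged    # every member now shares the merged set
    return True


set_of = make_sets("ABC")
print(connect(set_of, "A", "B"))   # True:  A and B were in different sets
print(connect(set_of, "C", "A"))   # True:  C joins AB, giving ABC
print(connect(set_of, "C", "B"))   # False: C and B are already together
```

Merging this way touches every member of the combined set, which is fine for a small example. Faster set structures exist, but the cycle test itself stays the same: two vertices in the same set means the edge is rejected.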