Tip:
Highlight text to annotate it
X
Okay, so it's time to discuss our first minimum spanning tree algorithm namely
Prim's algorithm. Definitely a candidate for the greatest
hits compilation. And again remember even though it's
called Prim's algorithm, it was actually discovered earlier by Jarnik.
So how's it work? Well before showing you any pseudo code,
let's first illustrate it on an example. As we go through the example, I hope that
the similarities to Dijkstra's shortest path algorithm will be evident.
I'm going to work with the same example graph from the previous video with four
vertices and five edges. The plan is to grow a tree one edge at a
time. And we're going to keep growing this tree
like a mold. We're going to start from just a seed
vertex. And then we're going to suck up one new
vertex with each iteration of the algorithm.
So, this is similar to Dijkstra's Algorithm.
In Dijkstra's Algorithm, it was clear where we should grow the initial mold
from, because we were given a source vertex, that they're trying to compute
the shortest paths out of. We have no source vertex in the minimum
spanning tree problem, but it turns out that we can just pick an arbitrary vertex
to start. Doesn't matter which one, which is cool.
So the plan is in E generation we're going to add one edge and span one new
vertex adjacent to the ones we're already spanning.
Now as a greedy algorithm Prim is simply going to select the cheapest edge that
allows it to span one additional new vertex.
Now the start of the algorithm here we're not really spamming anything.
We are sort of thinking of ourselves as growing from and currently spanning the
vertex in the upper right. So what are the edges in which we can
span an adjacent vertex? Well, there is two inches.
There is the top inch that costs one then we'll span an addition in the upper left
vertex or the is the edge with cost two on the right.
If we include that, we'll be able to span the vertex in the bottom right.
So we're not going to be greedy, we're just going to choose the cheaper edge,
the edge of cost one. Now, the vertices that our tree thus far
spans are the top two vertices. So, in the next iteration, we want to add
one more edge [COUGH] to span one additional new vertex.
And now we see that there are three edges sticking out of what we're spanning thus
far that will allow us to span a new edge.
There's the edges that have cost two, three, and four.
The two and the three will allow us to span the vertex in the bottom right.
If we pick the four, that will allow us to span the vertex in the bottom left.
Yeah, and we're going to be greedy, so of these three candid edges, we're going to
pick the cheapest one which is the edge of cost two.
So now the mold that we've been growing is in effect, covers all of the verticies
except for the one in the bottom left. So now in the final iteration we want to
include one more edge so that we span that final remaining vertex.
The one in the bottom left. Note that there's there was this edge of
cause three that we never added. But it got sucked up into the tree that
we grew anyways. So we're going to go ahead and ignore
that. Adding the three wouldn't allow us to
span any more vertices. In fact, it would create a loop which we
don't want. So we're going to say, okay.
We'll have the two edges that would allow us to span an extra vertex.
There's the four and there's the five. We're going to be greedy,
we're going to pick the four. And once we have the edges of the cost
one and two and three and four we have a spanning tree there's no loops there's a
path from any vertex to any other vertex along the pink edges, the cost is seven
you might recall from the previous video this is indeed the minimum cost spanning
tree of this graph. Of course, the fact that we have this
simple procedure that works correctly in this toy example, which is four vertices
and five edges, really means nothing. I mean you shouldn't draw any immediate
conclusions that this is a good algorithm in general even though that is going to
be the case. So let's next go and actually define the
algorithm generally. So if you have a general graph, what does
it mean to start somewhere and grow a mold, span one more vertex each
iteration, always proceeding greedily until you are done.
So lets spell out the pseudo code on the next slide.
So here is Prim's minimum spanning tree algorithm.
We're going to start with just two lines of initialization.
We're going to maintain a set of vertices, capital X.
This is meant to the be the vertices that we span so far.
Again, we need some seed vertex from which to start the process.
It doesn't matter where, which one we pick.
We're going to get the same tree no matter what, so just call it little s.
That's an arbitrary vertex from which we start growth.
The other thing we're maintaining is, of course, the tree.
So that's initially going to be empty. We're going to add one edge to it in each
iteration. An invarient that we are going to
maintain throughout the algorithm is that the edges that currently reside in the
set capital T span the verticies that currently reside in the set capital X.
Then we're going to have our main while loop.
this is the workhorse of the algorithm. And it's very similar to the one in
Dijkstra's algorithm. Namely, each iteration is responsible for
picking one edge crossing the current frontier.
advancing to include one new vertex. And again, it's going to be greed.
The criterion's going to be different, in fact, simpler, than with Dijkstra's
Algorithm instead of looking at links. We're just going to say, what's the
cheapest edge that allows us to span a new vertex?
So the loop's going to keep going, as long as there are vertices that we don't
yet span. And then what we do is we search to the
edges that allow us to span a new vertex. So which edges are those?
Well we want there to be one endpoint in the set X of vertices we already have our
tree spanning and we want the other end point to be non-redundant, so we want it
to be outside of X. So if we have an edge that crosses the
frontier in this sense, one endpoint in X, in endpoint outside that's how we
increase the number of spanned vertices by one in an iteration.
So if E is the cheapest edge amongst all of those that cross the front here with
one end point on either side, that's the one we're going to add to our tree so far
capital T in this iteration, it's end point that's not already in capital X,
that's going to be the very text that we add to X in this iteration.
And again the semantics of an iteration is that we're trying to increase the
number of spanned vertices while paying as little as possible, that's the sense
in which a prim's algorithm is a greedy algorithm.
So as usual with a greedy algorithm, this seems natural enough, but it's not at all
clear that it's correct, that it always computes in minimum spanning tree.
In fact, if you think about it's not even obvious, it actually computes a spannin
tree at all, minimum or otherwise, but it is correct.
Let's make that statement precise on the next slide.
So the key claim is that Prim's Algorithm is correct.
Given any connected input graph, it is guaranteed to output a spanning tree with
minimum possible cost. So before we delve into any details, let
me just finish this video by telling you about the proof plan.
We're going to prove this theorem in two parts.
First, we're going to establish that it outputs some spanning tree.
Maybe, maybe not minimum. Even that's non trivial.
Then we'll worry about arguing that the spanning tree output actually is one of
minimum cost. Both parts of the proof are interesting.
For part one to argue that we output some spanning tree, we're going to review some
preliminaries about graphs and about cuts and about spanning trees and graphs.
For part two to argue optimality, we're going to rely on a very neat property of
spanning trees, minimum spanning trees called the cut property.
I'm happy to report so that the work that we do here and in both parts will bear
further fruit later we're going to reuse these ingredients when we prove the
correctness of another MST algorithm named McCrustgrals algorithm.
For those of you who would much rather talk about running time than correctness
don't worry your time will come after we wrap up this correctness proof I'll
address how do you implement prim's algorithm quickly in particular using
heaps we'll get the running time down to the near linear bound of O of M log n.