So now that we've done some work with our thought experiment, understanding exactly what the optimal solution, the maximum-weight independent set, had to look like in a path graph, let's turn that work into a linear-time algorithm.
Let me quickly remind you of the bottom line from the previous video.
So we argued two things. First of all, if the max-weight independent set of a path graph happens to exclude the rightmost vertex, then it has to in fact be a max-weight independent set of the smaller graph G prime, obtained from G by plucking off the rightmost vertex.
If, on the other hand, a max-weight independent set does include the rightmost vertex v sub n, then if you take out v sub n and look at the remaining part of the independent set, that has to be a max-weight independent set of the smaller graph G double prime, defined by plucking the two rightmost vertices off of the original graph G.
Ergo, if we happened to know, if a little birdie told us, which of these two cases we were in, we could recursively compute the optimal solution of either G prime or G double prime, as appropriate, and be done. If we recurse on G prime, we just return the result. If we recurse on G double prime, we augment it by v sub n and return the result.
Now, there is no little birdie; we don't know which of the two cases we're in. So we concluded the previous video by proposing to just try both cases.
So let's write down what that proposed algorithm would look like before we take
a step back and critique it.
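To make this concrete, here is a minimal sketch of that proposed recursive algorithm in Python. The function name and the representation of the path graph as a list of vertex weights are illustrative choices, not from the lecture itself.

```python
def mwis_recursive(weights):
    """Value of a max-weight independent set of a path graph, by naive recursion.

    weights[0..n-1] are the non-negative vertex weights, left to right.
    """
    n = len(weights)
    if n == 0:
        return 0
    if n == 1:
        return weights[0]
    # Case 1: the optimal solution excludes the rightmost vertex; recurse on G'.
    case1 = mwis_recursive(weights[:n - 1])
    # Case 2: it includes the rightmost vertex, whose neighbor must then be
    # excluded; recurse on G'' and augment by the rightmost vertex's weight.
    case2 = mwis_recursive(weights[:n - 2]) + weights[n - 1]
    return max(case1, case2)
```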
So, the good news about this algorithm is that it is correct; it's guaranteed to return the maximum-weight independent set. So, how would you prove this formally?
Well, it would be by induction, in exactly the same way we proceeded with
divide and conquer algorithms. And for the same reason, I'm not going to
talk about the details here. If you're interested, I'll leave it as an
optional exercise to write this out formally.
But intuitively, the inductive hypothesis guarantees the correctness of our recursive calls, that is, of the computed maximum-weight solutions for G prime and G double prime. And the whole point of the previous video, in effect, was to argue the correctness of the inductive step: given correct solutions to the sub-problems, we argued what has to be the optimal way to extend them to a solution for the original graph G.
The bad news, on the other hand, is that this algorithm takes exponential time.
It's essentially no better than brute force search.
So while the correctness of this kind of recursive algorithm follows the divide-and-conquer template pretty much exactly, the running time blows up to exponential.
And the reason for that difference is that in our divide-and-conquer algorithms, think of MergeSort as the canonical example, we made a ton of progress before we recursed. Right? We threw out half of the array, 50% of the stuff, before we bothered with any recursion.
How much progress are we making in this algorithm? Well, very little. It's positive progress, but very small: we throw out either one or two vertices out of a graph with, say, a million vertices before recursing. So we're branching by a factor of two while making very little progress before each branch; the running time satisfies a recurrence of the form T(n) = T(n-1) + T(n-2) + O(1), which grows like the Fibonacci numbers. That's why we get an exponential running time rather than something more in the neighborhood of n log n.
So this brings us to the following question.
This is an important question. I want you to think about it carefully
before you respond. So we have this exponential-time algorithm; it makes an exponential number of recursive calls.
Each recursive call is handed some graph for which it's responsible for computing the maximum-weight independent set. The question is this: over all of these exponentially many recursive calls, each of which is passed some graph as a sub-problem, how many distinct, fundamentally different sub-problems are ever considered?
So the answer to this question, and the key to unlocking the crazy efficiency of our dynamic programming implementation, is B.
So despite the fact that there's an exponential number of recursive calls, we
only ever solve a linear number of distinct sub-problems.
In fact, we can say explicitly what the different sub-problems solved throughout the recursion are. What happens before you recurse? Well, you pluck vertices off of the right of the graph you were given. Maybe you pluck off one vertex, in the first recursive call, or two vertices, in the second recursive call, but both from the right. So an invariant maintained throughout the recursion is that the sub-problem you're handed was obtained by successively plucking vertices off of the right end of the graph.
So, however you got to where you are, whatever sequence of recursive calls led you there, you are guaranteed to be handed a prefix of the graph: the graph induced by the first I vertices, where I is some number between zero and n.
Therefore, there's only a linear number of sub-problems you ever have to worry about: the prefixes of the original input graph.
From this, we conclude that the exponential running time of the previous
algorithm arises solely from the spectacular redundancy of solving exactly
the same sub-problem from scratch, over and over and over and over again.
So this observation offers up the promise of a linear time implementation of this
algorithm. After all, there's no need to solve a
sub-problem more than once; once you've solved it, you know the answer. So an obvious way to speed up this algorithm, to speed it up dramatically, is to simply cache the result of a sub-problem the first time you see it. Then you can look it up in some array, in constant time, at any point later on in the algorithm.
There is a word for this. I won't really use it in this class, but just so that you know what it is: it's called memoization.
So in case this is a little vague, what I mean is you would have some array, some global array, indexed one to N, or maybe zero to N, with the semantics that the Ith entry of this array is the value of the solution of the Ith sub-problem.
Now, when you do a recursive call and you're handed the first I vertices of the graph, and again, remember, we know the sub-problem has to look like the first I vertices of the graph for some I, you check the array. If it's already been filled in, if you already know the answer, great: you just return it in constant time; you don't bother to re-solve it. If this is the first time you've ever seen this sub-problem, then you recurse and solve it, as we suggested on the previous slide.
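As a sketch of how that might look in code, under the same illustrative conventions as before, the global array becomes a cache keyed by I, the number of vertices in the prefix:

```python
def mwis_memoized(weights):
    """Same recursion as before, but each prefix sub-problem is solved at most once."""
    cache = {}  # cache[i] = value of a max-weight IS of the first i vertices

    def solve(i):
        if i == 0:
            return 0
        if i == 1:
            return weights[0]
        if i not in cache:
            # First time seeing this sub-problem: recurse and store the answer.
            cache[i] = max(solve(i - 1), solve(i - 2) + weights[i - 1])
        # Any later request becomes a constant-time lookup.
        return cache[i]

    return solve(len(weights))
```

(In Python, a very long path would hit the default recursion limit, which is one more reason to prefer the bottom-up version described next.)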
So with this simple memoization fix, this algorithm is indeed linear time. The easiest way to see that, and in fact a better implementation, is to move away from this recursive, top-down paradigm and instead implement the solution in a bottom-up way, solving sub-problems in a principled order from the smallest one up to the biggest, the original one.
So, a little bit more precisely,
let me use the notation G sub I to denote the sub-graph of the original graph consisting of the first I vertices. So we're again going to have an array with the same semantics as in the recursive version: the Ith entry denotes the value of the solution to the Ith sub-problem, that is, the max-weight independent set of G sub I. And the plan is to populate that array bottom-up, that is, from left to right.
We'll handle the edge cases of the first two entries of this array specially. G sub zero is the empty graph, so there's no independent set; let's define the max-weight independent set of the empty graph to have value zero. And in the graph G sub one, where the only vertex is v sub one, the max-weight independent set is obviously that one vertex. Remember, weights are non-negative.
So our main for loop just builds up solutions to all of the sub-problems in a systematic way, going from the smallest graph, G sub two, up to the biggest graph, the original one, G sub n.
And when we consider the graph G sub I, how do we figure out what the max-weight independent set of that graph is? Well, now we use, completely directly, the work that we put in in the previous video. The previous video told us what an optimal solution has to look like; it has to have one of two forms. We know, we proved, that either the max-weight independent set of G sub I excludes the last vertex v sub I, and then it is merely an optimal solution of the graph G sub I minus one; or, if it's not that, there is nothing else it can be other than including the last vertex v sub I and, with v sub I removed, being a max-weight independent set of the graph G sub I minus two.
We know it's one of those two things; we don't know which. So we do brute-force search among the two possibilities, and that gives us the optimal solution for the Ith sub-problem. Crucially, when we need to do this brute-force search for the Ith sub-problem, we've already computed the optimal solutions to the smaller sub-problems that are relevant. Those can be looked up in constant time, and that's what makes each iteration of this for loop run in constant time.
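Assembling the pieces, a sketch of the full bottom-up algorithm might look like this:

```python
def mwis_bottom_up(weights):
    """Value of a max-weight independent set, via bottom-up dynamic programming.

    A[i] holds the optimal value for G_i, the prefix on the first i vertices.
    """
    n = len(weights)
    A = [0] * (n + 1)        # A[0] = 0: the empty graph G_0
    if n >= 1:
        A[1] = weights[0]    # G_1: take its single vertex (weights are non-negative)
    for i in range(2, n + 1):
        # Case 1: v_i excluded -> optimal value of G_{i-1}.
        # Case 2: v_i included -> optimal value of G_{i-2}, plus v_i's weight.
        A[i] = max(A[i - 1], A[i - 2] + weights[i - 1])
    return A[n]
```

For example, on the four-vertex path with weights [1, 4, 5, 4], this returns 8, corresponding to taking the second and fourth vertices.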
So we've done a fair amount of work to get to this point. But after seeing that the greedy algorithm design paradigm failed us, that the divide-and-conquer algorithm design paradigm was inadequate, and that brute-force search is too slow, with this, as we'll see, dynamic programming algorithm, we now have, effectively, a one-line solution to the max-weight independent set problem in path graphs.
Pretty cool. What's the run time?
Well, this is probably the easiest running-time analysis we've ever had to do: obviously, it's linear time, with constant time per iteration of the for loop.
Why is the algorithm correct? Well, it's the same as our recursive algorithm; it makes exactly the same decisions. The only difference is that it doesn't bother with the spectacular redundancy of re-solving sub-problems it's already solved.
Again, if you wanted to prove it from scratch, it would just be a straightforward induction, as in our divide-and-conquer algorithms: the recursive calls are correct by the inductive hypothesis, and the inductive step is justified by the case analysis of the previous video.