Let's complete the proof of the master method. Let me remind you of the story so far. The first thing we did was analyze the work done by a recursive algorithm using a recursion tree: we zoomed in on a given level j, we identified the total amount of work done at level j, and then we summed over all of the levels, resulting in this rather intimidating expression, star: c times n to the d, times a sum over the levels j from zero to log base b of n, of the quantity a over b to the d, raised to the j. Having derived this expression star, we then spent some time interpreting it, attaching some semantics to it.
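In symbols, here is star as I reconstruct it from the verbal description above:

```latex
\text{total work at all levels}
\;\le\;
c\, n^{d} \sum_{j=0}^{\log_b n} \left(\frac{a}{b^{d}}\right)^{j}
\qquad (\star)
```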
And we realized that the role of this ratio, a over b to the d, is to distinguish between three fundamentally different types of recursion trees: those in which a equals b to the d, and the amount of work is the same at every level; those in which a is less than b to the d, and therefore the amount of work is going down with the level; and those where a is bigger than b to the d, in which case the amount of work is growing with the level. This gave us intuition about the three cases of the master method, and even gave us predictions for the running times we might see. So what remains is to turn this hopeful intuition into a rigorous proof. We need to verify that the simplest possible scenarios outlined in the previous video actually occur. In addition, we need to demystify the third case and understand what the case-three expression has to do with the number of leaves of the recursion tree. Let's begin with the simplest case, case one, where we assume that a equals b to the d. This is the case where we have a perfect equilibrium between the forces of good and evil:
the rate at which subproblems proliferate exactly cancels out the rate at which we do less work per subproblem. And now, examining the expression star, we can see how easy our lives get when a equals b to the d. In that case, the ratio a over b to the d is equal to one, so naturally this ratio raised to the j is also equal to one, for all j. And then of course the sum evaluates to something very simple, namely one summed with itself log base b of n plus one times. So the sum simply equals log base b of n plus one, and that gets multiplied by the c times n to the d term, which is independent of the sum. So, summarizing: when a equals b to the d, we find that star equals c times n to the d, times log base b of n plus one. Writing this in big-O notation, we would write big O of n to the d log n. And again, I'm going to suppress the base of the logarithm: since all logarithms differ only by a constant factor, we don't have to specify the base; it's just suppressed by the constant hidden in the big-O notation. So that's it for case one. Like I said, this is the easy case.
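Written out, the case-one computation is just (my reconstruction):

```latex
c\, n^{d} \sum_{j=0}^{\log_b n} 1^{j}
\;=\; c\, n^{d}\left(\log_b n + 1\right)
\;=\; O\!\left(n^{d}\log n\right)
```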
So what do we do when a is not equal to b to the d? And remember, a could be either less than or bigger than b to the d. To answer that question, let's take a short detour into geometric series. For this single-slide detour, we're going to think about a single constant number r. What you want to think about is r representing that ratio a over b to the d from the previous slide; but for this slide only, let's just call it r.
This is a constant: it's bigger than zero, and it's not equal to one. Now, suppose we sum up powers of r, stopping, let's say, at the kth power. I claim that this sum has a nice closed-form formula: it is exactly r to the k plus one, minus one, divided by r minus one. Now, whenever you see a general formula like this, it's useful to keep in mind a couple of canonical values of the parameters that you can plug in to develop intuition. For this expression, you might want to think canonically about the cases r = 2 and r = 1/2. When r = 2, we're summing the powers of two: 1 + 2 + 4 + 8 + 16, and so on. When r is a half, we have 1, plus a half, plus a quarter, plus an eighth, and so on. Now, I'm not going to prove this formula for you; I'd like you to prove it yourself, if you don't already know this fact. The way to prove it is simply by induction, and I will leave that as an exercise. What I want to focus on instead is what this fact can do for us.
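If you'd like to convince yourself numerically before attempting the induction, here is a quick check (a sketch of mine, not from the lecture):

```python
# Verify the closed form 1 + r + r^2 + ... + r^k = (r^(k+1) - 1) / (r - 1)
# on a few sample ratios r (positive, not equal to 1) and cutoffs k.
for r in [2.0, 0.5, 3.0, 0.9]:
    for k in [1, 5, 10, 20]:
        direct = sum(r**j for j in range(k + 1))   # term-by-term sum
        closed = (r**(k + 1) - 1) / (r - 1)        # closed-form formula
        assert abs(direct - closed) < 1e-9 * max(1.0, closed)
print("closed form matches the direct sum on all samples")
```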
The way we use this fact is to formalize the idea that, in recursion trees where the amount of work is increasing with the levels, the leaves dominate the overall running time; and in recursion trees where the amount of work is decreasing with the levels, the root dominates the running time, in the sense that we can ignore all of the other levels of the recursion tree. So this slide has two upshots. First of all, consider the case when r is less than one. In this case, the expression on the right-hand side, r to the k plus one, minus one, over r minus one, can be upper bounded by one over one minus r. Again, you might want to have a canonical value of r in mind here, namely one half. What we're claiming is that the right-hand side is no more than two for the case r = 1/2. And that's easy to see if you think about 1 + 1/2 + 1/4 + 1/8, and so on: that sum converges to 2 as k grows large. So in general, for a constant r less than one, the sum is at most one over one minus r. Now, we're not actually going to care about the formula one over one minus r; the point for us is just that it is a constant. And by constant, I mean independent of k, independent of how many terms we sum up. Obviously it depends on the ratio r, but it does not depend on k. So the way to think about this is: when we sum up a bunch of terms with r less than one, the very first term dominates. The first term is just 1, and no matter how many terms we sum up, we never grow bigger than some constant. A similar situation holds for the case where r is a constant bigger than one. When r is bigger than one, a tiny bit of algebra shows that we can upper bound the right-hand side by r to the k, times something which is a constant, independent of k. Again, let's interpret this second upshot in terms of a canonical value of r, namely r = 2. Then our sum is 1 + 2 + 4 + 8 + 16, and so on. What this is saying is that no matter how many terms we sum up, the overall sum is never more than twice the largest and final term. So if we sum up to, say, 128, the sum, you'll notice, will be 255, which is at most twice that largest term, 128. And the same is true for any k: the entire sum is no more than twice the largest term. In this sense, the largest term in the series dominates the whole thing. So, to summarize this slide in one sentence: when we sum up powers of a constant r, if r is bigger than one, the largest power dominates the sum; if r is smaller than one, the sum is just a constant.
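Stated formally, the two upshots read (my write-up of the slide's claims, for a constant r > 0, r not equal to 1):

```latex
\sum_{j=0}^{k} r^{j} \;=\; \frac{r^{k+1}-1}{r-1}
\;\le\;
\begin{cases}
\dfrac{1}{1-r} & \text{if } r<1 \quad (\text{a constant, independent of } k),\\[2ex]
r^{k}\cdot \dfrac{r}{r-1} & \text{if } r>1 \quad (\text{the largest term, times a constant}).
\end{cases}
```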
Let's now apply this to prove case two of the master method. In case two, we assume that a is less than b to the d; that is, the rate at which subproblems proliferate is drowned out by the rate at which we do less work per subproblem. So this is the case where the amount of work is decreasing with each level of the recursion tree, and our intuition said that, in the simplest possible scenario, we might hope that all of the work, up to a constant factor, is being done at the root. So let's make that intuition precise by using the basic sums fact from the previous slide. Since a is less than b to the d, the ratio a over b to the d is less than one; let's call this ratio r. Now r, you'll notice, does depend on the three parameters a, b, and d, but r is a constant: it does not depend on n. So what is this sum? We're just summing up powers of this constant r, where r is less than one. And what did we just learn? That any such sum is bounded above by a constant, independent of the number of terms that you sum up. So what does the expression star evaluate to? It evaluates to c, which is a constant, times n to the d, times another constant. Suppressing the product of those two constants in big-O notation, we can say that the expression star is upper bounded by big O of n to the d. And this makes precise our intuition that, indeed, the overall running time of the algorithm, in this type of recursion tree with decreasing work per level, is dominated by the root: the overall amount of work is only a constant factor larger than the work done merely at level zero of the tree.
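In symbols, the case-two calculation is (my reconstruction):

```latex
c\, n^{d} \sum_{j=0}^{\log_b n} r^{j}
\;\le\; c\, n^{d} \cdot \frac{1}{1-r}
\;=\; O\!\left(n^{d}\right),
\qquad \text{where } r = \frac{a}{b^{d}} < 1
```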
Let's move on to the final and most challenging part of the proof: case three, where we assume that a is bigger than b to the d. In conceptual terms, we're assuming that the rate at which subproblems proliferate exceeds the rate at which we do less work per subproblem. So these are recursion trees where the amount of work is increasing with each level, with the most work being done at the leaves. And once again, using the basic sums fact, we can make precise the hope that, in fact, we only have to worry about the leaves; we can throw away the rest of the work, losing only a constant factor. To see that, we'll again denote the ratio between a and b to the d as r, and in this case r is bigger than one. So the sum is a sum of a bunch of powers of r, where r is bigger than one. What did we just learn about that, two slides ago, in the basic sums fact? We learned that such sums are dominated by the largest, and last, term of the sum: we can bound the sum by a constant factor times the largest term.
Therefore, we can simplify the expression star to the following. I'm going to write it in terms of big-O notation, and, like on the last slide, I'll use it to suppress two different constants: on the one hand, the constant c, which we inherited way back when from the original recurrence, and on the other hand, the constant that comes from the basic sums fact. So, ignoring those two constants, what do we have left? We have n to the d, times the largest term of the sum. So what is the largest term of the sum? Well, it's the last one, so we plug in the biggest value of j that we're ever going to see. And what's the biggest value of j that we're ever going to see? It's just log base b of n. So we get n to the d, times the ratio a over b to the d, raised to the log base b of n power. Now, don't despair at how messy this looks; we can do some remarkable simplifications. What I want to do next is focus just on the one over b to the d, raised to the log base b of n, term. You can write that as b to the minus d log base b of n, and if I factor this exponent into two successive parts, I can write it as b raised to the log base b of n power, and only then raised to the minus d. And now, of course, taking the logarithm of n base b, and then raising b to that power, are inverse operations that cancel, leaving us just with n. So this term equals n to the minus d, and now, remarkably, this n to the minus d cancels out with the n to the d out front, leaving us with merely a raised to the log base b of n. And thus, out of this crazy sea of letters, rises a formula we can actually understand.
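Written out, the simplification chain is (my reconstruction of the blackboard algebra):

```latex
n^{d}\left(\frac{a}{b^{d}}\right)^{\log_b n}
= n^{d}\, a^{\log_b n}\, b^{-d\log_b n}
= n^{d}\, a^{\log_b n} \left(b^{\log_b n}\right)^{-d}
= n^{d}\, a^{\log_b n}\, n^{-d}
= a^{\log_b n}
```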
This quantity, a to the log base b of n, if we step back and think for a minute, is actually a very natural one, describing something about the recursion tree that we already knew was supposed to pop up in the analysis. I'll let you think through exactly what that is in the following quiz.
So the correct answer to this quiz is the fourth response: a raised to the log base b of n is precisely the number of leaves of the recursion tree. And remember, in our intuition for case three, recursion trees where the amount of work is increasing per level, we thought that perhaps the work would be dominated by the work done at the leaves, which is proportional to the number of leaves. So why is this the answer? Well, just remember what recursion trees look like. At level zero, we have a single node, and then at each level we have a times as many nodes as before. That is, with each level of the recursion tree, the number of nodes goes up by a factor of a. How long does this process go on? Well, it goes on until we reach the leaves. Recall that the input size starts at n up at the root, it gets divided by a factor of b each time, and the recursion terminates once we get down to one. So the leaves reside at precisely level log base b of n. Therefore, the number of leaves is just the branching factor a, raised to the number of times we actually multiply by a, which is just the number of levels, log base b of n. Each time we go down a level, we increase the number of nodes by a factor of a, and we go down a level log base b of n times, leaving us with a number of leaves equal to a raised to the log base b of n. So what we've done is we've mathematically confirmed, in a very cool way, our intuition about what case three should look like in the master method.
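Here is that leaf-counting argument as a two-line recursion you can run (a sketch of mine, with example values of n, a, and b that are not from the lecture):

```python
import math

# Count the leaves of a recursion tree in which each node has `a` children
# and the subproblem size shrinks by a factor of `b` per level, starting at n.
def count_leaves(n: int, a: int, b: int) -> int:
    if n <= 1:                       # base case: we've reached a leaf
        return 1
    return a * count_leaves(n // b, a, b)

n, a, b = 64, 3, 2                   # hypothetical example values
# For n a power of b, the count should equal a ** log_b(n) = 3 ** 6 = 729.
assert count_leaves(n, a, b) == a ** round(math.log(n, b))
print(count_leaves(n, a, b))         # prints 729
```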
We've proven that in case three, when a is bigger than b to the d, the running time is O of the number of leaves in the recursion tree, just as the intuition predicted. But this leaves us with one final mystery. If you go back to the statement of the master method, it doesn't say a to the log base b of n; in case three, it says the running time is n to the log base b of a. And not only that: we've used this case-three formula over and over again, to evaluate Gauss's recursive algorithm for integer multiplication, to evaluate Strassen's matrix multiplication algorithm, and so on. So what's the story? How come we're not getting the same thing as in the statement of the master method? Well, there's a very simple explanation, which is that, believe it or not, a to the log base b of n and n to the log base b of a are exactly the same thing. This looks like the kind of mistake you'd make in freshman algebra, but if you think about it, these are simply the same quantity. If you don't believe me, just take the logarithm base b of both sides; you'll get the same thing on both sides.
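Indeed (my one-line write-up of that check):

```latex
\log_b\!\left(a^{\log_b n}\right)
= (\log_b n)(\log_b a)
= \log_b\!\left(n^{\log_b a}\right)
```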
Now, you might well be wondering why I didn't just state in the master method that the running time in case three is this very sensible and meaningful expression, a raised to the log base b of n, i.e., the number of leaves in the recursion tree. Well, it turns out that while the expression on the left-hand side is the more meaningful one conceptually, the right-hand side, n to the log base b of a, is the easier one to apply. Recall that when we worked through a bunch of examples of the master method, this right-hand side was super convenient when we evaluated the running times of algorithms, plugging in the values of a and b. In any case, whether you want to think about the running time in case three as proportional to the number of leaves in the tree, or as proportional to n to the log base b of a, we're done. We've proved it. That's case three, and that was the last one. So we're done with the master method. QED.
So, that was a lot of hard work proving the master method, and I would never expect someone to be able to regurgitate all of the details of this proof at, you know, something like a cocktail party (well, maybe except the nerdiest of all cocktail parties). But I do think there are a few high-level conceptual points of this proof that are worth remembering in the long term. We started by just writing down a recursion tree for the recursive algorithm, in a generic way, and, going level by level, we counted up the work done by the algorithm. This part of the proof had nothing to do with how a compared to b to the d. Then we recognized that there are three fundamentally different types of recursion trees: those with the same amount of work per level, those where it increases with the level, and those where it decreases with the level. If you can remember that, you can even remember what the running times of the three cases should be. In the case where you do the same amount of work at every level, we know there's a logarithmic number of levels, and we know we do n to the d work at the root, so that gives us the running time in case one of n to the d log n. When the amount of work is decreasing with the levels, we now know that the root dominates: up to a constant, we can throw out the rest of the levels, and we know n to the d work gets done at the root, so that's the overall running time. And in the third case, where the work is increasing with the levels, the leaves dominate. The number of leaves is a raised to the log base b of n, which is the same as n to the log base b of a, and that's proportional to the running time in case three of the master method.
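Putting the three cases together, the statement we've now proven is (written in the lecture's notation):

```latex
T(n) \;=\;
\begin{cases}
O\!\left(n^{d}\log n\right) & \text{if } a = b^{d},\\[1ex]
O\!\left(n^{d}\right) & \text{if } a < b^{d},\\[1ex]
O\!\left(n^{\log_b a}\right) & \text{if } a > b^{d}.
\end{cases}
```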