Understanding why Papadimitriou's randomized local search algorithm for 2-SAT works requires understanding some basic properties of random walks on the non-negative integers. It's a very fun, classic topic. In this video I'll provide the relevant background.
In this video, we're going to focus on a simple randomized process. We're not going to discuss here how this connects to Papadimitriou's algorithm, but in the next video we will, trust me.
The setup is as follows. Perhaps you just took a super difficult exam, maybe in an algorithms class, for example. You stagger out of the exam room in a complete daze, with no idea where you are or where you're going. You just stumble along. At each time step, you take a step to the left with 50% probability, or a step to the right with 50% probability. We measure your position in terms of the
number of steps you've taken from a dead end at the end of the street, which we
call position 0. So by randomly walking left or right at each time step, your position either goes up by 1 or down by 1, each with a 50-50 chance. The one exception is if you've found your way to position 0. That's a dead end; there's no way to go left, so with 100% probability you stagger back to the right, to position 1, at the next time step. We assume that your exam room is located at position 0, at the dead end. In other words, at time 0 you're at position 0. There are a lot of interesting questions
you can ask about random walks on the non-negative integers.
Here's the one we're going to be interested in.
Imagine your dorm room is n steps away from the exam room; that is, your dorm room is at position n. Now, if you had your wits about you, you could go from your exam room to your dorm room in n steps. You'd just zip right down the line. But you don't have your wits about you; you're in this post-exam daze, staggering around. So doing your random walk, on average how many steps does it take before you make it home to position n? To illustrate a sample trajectory,
imagine that your dorm room was only three steps away.
That is, position 3. At time 1, we know for sure that you move from position 0 to position 1. Now maybe you get lucky and stumble a step to the right in the next time step, so you're at position 2, only one step away from home. But then, unfortunately, you stagger backward, back to position 1. Maybe now you lurch forward to position 2 again, and this time your momentum carries you onward home to position 3. That was a trajectory in which it took you 5 steps to get home to your dorm room at position 3.
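If you'd like to generate trajectories like this for yourself, here is a minimal Python sketch (my addition, not from the lecture; the function name is hypothetical) that simulates one random walk from position 0 until it first reaches a target position n:

```python
import random

def random_walk_trajectory(n):
    """Simulate one random walk on {0, 1, ..., n}, starting at 0,
    until position n is reached for the first time."""
    position, path = 0, [0]
    while position < n:
        if position == 0:
            position = 1                        # dead end: forced step right
        else:
            position += random.choice([-1, 1])  # fair coin: left or right
        path.append(position)
    return path

print(random_walk_trajectory(3))  # e.g. [0, 1, 2, 1, 2, 3]
```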
You can certainly imagine a faster trajectory, where you just zip straight there, and you can also imagine slower trajectories, where you stumble back to the left a couple more times before making it all the way to the right, to position 3. So let's state this statistic precisely
and check in with your intuition about how it grows as a function of n.
So, for a non-negative integer n, I'm going to use T sub n to denote the number of steps this random walk takes before it reaches position n for the first time. T sub n is a random variable in the sense that its value depends on the results of your random coin flips, that is, on exactly which steps you took. On the other hand, once you unveil the answers to all of the random coin flips, whether you go left or right at each time step, you can compute T sub n, just like we did in the sample trajectory on the last slide. So the non-trivial question I want you to think about in this quiz is: how does the expected value of T sub n grow as a function of the position n that you're waiting to reach? I'm not expecting you to devise a proof of the answer; indeed, that's what we're going to be doing for most of the rest of this video. I'm just asking you to take your best guess.
So the correct answer is B, theta(n^2).
Depending on how much facility you have with discrete probability, you may or may
not have been able to do a back-of-the-envelope calculation that
would suggest this should be the answer. So for example, if you take some
Bernoulli random variables and do some calculations with their standard
deviation, you might expect the answer to be theta(n^2).
In fact, what's even cooler, and what we're going to prove, is that it's not just theta(n^2): the expected value of the random variable T sub n is exactly n^2 for every non-negative integer n.
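Before the proof, here is a quick Python sanity check (again my own sketch, not part of the lecture) that estimates E[T_n] by simulation; for n = 10 the average comes out near 100, consistent with the claim that E[T_n] = n^2:

```python
import random

def steps_to_reach(n):
    """Number of steps a random walk starting at 0 needs to first reach n."""
    position, steps = 0, 0
    while position < n:
        # forced step right at the dead end, otherwise a fair coin flip
        position = 1 if position == 0 else position + random.choice([-1, 1])
        steps += 1
    return steps

n, trials = 10, 10000
estimate = sum(steps_to_reach(n) for _ in range(trials)) / trials
print(estimate)  # should be close to n**2 = 100
```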
For the most part in these courses, I've endeavored to give you proofs which, on the one hand, aren't too long (I know you're a busy person and I don't want to waste your time), but which, on the other hand, teach you something. So I hope you'll forgive me if it's not clear what this proof teaches you. This is just going to be the slicked-up proof from the book that the expected value of T sub n, the number of steps needed to reach position n on a random walk, is exactly n^2. The really nice idea behind this slick proof is to solve a more general problem. So we're going to calculate the expected
number of steps, not just to get from position 0 to position n on a random walk, but, more generally, from every intermediate position i to position n. The reason this is a good idea is that it gives us n+1 different statistics we're trying to calculate, and it's easy to relate the values of these different quantities to each other. From these relationships, we'll be able to solve for all of them simultaneously.
So let's introduce some notation to make this precise. For a non-negative integer i, by Z sub i I mean the number of random walk steps that you take, if I start you at position i, until you reach position n for the first time. Notice that, by the definitions, Z sub 0 is exactly the same thing as T sub n: the number of steps you need to reach n starting from 0.
So, are there any Z sub i's whose expectations are easy to compute? Well, sure: if we take i = n, what is Z sub n? That's the number of steps, on average, that you need to get from n to n for the first time. Well, that's going to be zero. For the other values of i, we're not going to solve for the expected value of Z sub i explicitly just yet. Rather, we're going to relate the expectations of the different Z sub i's to each other. To see how this works, let's start with i = 0, that is, with Z sub 0, the random variable that we really care about. Now, we don't know what the expected value of Z sub 0 is; that's what we're trying to compute. But we do know that every walk starting at 0 and ending at n begins with a step from 0 to 1 and is then followed by a walk from 1 to n. Therefore, the expected length of one of
these walks is just 1, for that first hop from 0 to 1, plus the expected length of a random walk from position 1 to n. That's a quantity also known as the expected value of Z sub 1.
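In symbols, this first relationship reads:

$$\mathbb{E}[Z_0] = 1 + \mathbb{E}[Z_1].$$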
So that's kind of exciting: we were able to avoid explicitly computing the expected value of Z sub 0 by instead relating it to the expectation of a different random variable, Z sub 1. But if you think about it, we can play exactly the same game with the intermediate values of i as well.
We can relate the expected value of Z sub i to the expected values of Z sub i-1 and Z sub i+1. To make this fully precise, I'm going to bust out the definition of conditional expectation.
If you want to review it, you can go back to Part 1, where we needed conditional probability to analyze the randomized contraction algorithm, or you can just look it up on Wikipedia or wherever. If you're not in the mood to recall what conditional expectation is, I actually think the computations I'm about to do are sufficiently intuitive that you'll find them eminently plausible even without the formal justification.
So what's the expected value of Z sub i, the average number of steps you need to hit n for the first time if I start you out at position i? Well, let's just condition on what happened in the first step. There are only 2 possibilities. With 50% probability you go left, to i-1; the rest of what happens is a random walk from i-1, and you just count the steps until you reach n for the first time. The other 50% of the time you go right, and the rest of the process is just a random walk starting from i+1; again, you count the steps until you reach n for the first time.
If you want to be totally formal about it, you would expand the expected value of Z sub i by conditioning on the first step: you get the probability that you go left times the conditional expectation of Z sub i given that you go left in the first step, plus a similar term for when you go right. As we discussed, the probability that you go left is exactly 1/2, and the probability that you go right is exactly 1/2.
The conditional expected value of your random walk, given that you went left first, is just 1, for the step you took to get to i-1, plus the expected length of a random walk to n from i-1. Similarly, if you're lucky enough to go right, then the conditional expectation is just 1, for the step to go right, plus the expected value of the rest of the random walk, also known as the expected value of Z sub i+1. Once the dust clears, we discover that the expected value of Z sub i, the average number of steps from position i, is just 1 plus the average of the expectations of Z sub i-1 and Z sub i+1.
If we multiply both sides of this equality by 2 and rearrange, we get that the difference between the expected values of Z sub i and Z sub i+1 is equal to 2 more than the difference between the expected values of Z sub i-1 and Z sub i.
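In LaTeX, the conditioning computation just described is:

$$\mathbb{E}[Z_i] = \tfrac{1}{2}\bigl(1 + \mathbb{E}[Z_{i-1}]\bigr) + \tfrac{1}{2}\bigl(1 + \mathbb{E}[Z_{i+1}]\bigr) = 1 + \frac{\mathbb{E}[Z_{i-1}] + \mathbb{E}[Z_{i+1}]}{2},$$

and multiplying by 2 and rearranging gives

$$\mathbb{E}[Z_i] - \mathbb{E}[Z_{i+1}] = \bigl(\mathbb{E}[Z_{i-1}] - \mathbb{E}[Z_i]\bigr) + 2.$$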
So what's the interpretation of this equation? Well, first, let's just notice that the expected value of Z sub i is definitely decreasing as you take i bigger and bigger: as you start closer and closer to your destination n, certainly the number of steps you need goes down. So, for that reason, both sides of this equation are going to be positive. Right: the expected value of Z sub i is going to be bigger than that of Z sub i+1; you don't need as many steps if you start further to the right, at i+1. But this equation is saying more. It's saying: consider a consecutive pair of conceivable starting points. Clearly you have an advantage by starting one position closer. This equation is saying that, as you slide this pair of consecutive starting points closer to the destination, the advantage gets amplified.
So, however much savings you got by starting one position closer, at i instead of i-1, you get that same advantage plus two more if you get to start at i+1 relative to starting at i. So why was it useful to relate all of
these various expectations to each other? Well, now that we have this big system of constraints, we can solve simultaneously for what all of these expectations have to be, in particular the one expectation we really care about, the expected value of Z sub 0. To see how we do this, let's just write down what the differences between successive expectations have to be.
So the first thing we know, and this is one of our edge cases, is that, whatever the expected value of Z sub 0 is, it has to be one more than the expected value of Z sub 1. That is, the difference between these two expectations equals 1. We concluded the previous slide by noting that if you slide a pair of consecutive positions one position further toward the destination n, then the advantage you get by starting one position closer goes up by two. So, if your advantage is one by starting at position 1 instead of position 0, then your advantage is going to be three by starting at position 2 instead of position 1. That is, the expected value of Z sub 1 minus the expected value of Z sub 2 is going to be equal to three.
So we can just apply that equation over and over again: each time we bump up the indices by one, the difference between the expectations gets bumped up by two. At the end of the day, relative to our very first equation, we've bumped up both indices by n-1, and hence the right-hand side by 2(n-1) = 2n-2, so we have that the expected value of Z sub n-1 minus the expected value of Z sub n equals 2n-1. Massive cancellation is often a sign that you're on the right track, and that's what we're going to see here.
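To see the cancellation coming, here is the system of equations written out in LaTeX:

$$\begin{aligned}
\mathbb{E}[Z_0] - \mathbb{E}[Z_1] &= 1 \\
\mathbb{E}[Z_1] - \mathbb{E}[Z_2] &= 3 \\
\mathbb{E}[Z_2] - \mathbb{E}[Z_3] &= 5 \\
&\;\;\vdots \\
\mathbb{E}[Z_{n-1}] - \mathbb{E}[Z_n] &= 2n-1.
\end{aligned}$$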
So each of the expected values of Z sub 1 through Z sub n-1 appears once positively and once negatively, so they're all going to drop out. And it gets even better than that: the expected value of Z sub n is just 0. Remember, that would be how long a random walk takes to get from n to n, also known as 0.
So if we add up all these equations, on the left-hand side we magically get exactly the quantity we care about: the expected value of Z sub 0, the expected length of a walk starting at 0 and ending at n, also known as T sub n, the expectation we're trying to analyze.
So what does the right-hand side sum to? One easy way to think about it is just to group the first row with the last row, the second row with the second-to-last row, and so on. For example, if n is equal to 100, we're going to group 1 with 199, 3 with 197, 5 with 195, and so on. Each of these groupings gives us 2n, and if n is even there are going to be n/2 of these pairs, giving us a total of n^2. If n is odd, the story is the same: you get (n-1)/2 groups summing to 2n each, plus the median term, which has the value n. Again, you get n^2. So that completes this very pretty, if somewhat inscrutable, proof that the expected number of steps you need before you reach n for the first time is exactly n^2.
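As an aside, you can also check this conclusion numerically. The sketch below (my own illustration, not from the lecture) unrolls the difference equations backward from position n to recover every E[Z_i]; it confirms E[Z_0] = n^2 and also reveals the closed form E[Z_i] = n^2 - i^2:

```python
def expected_steps(n):
    """Solve the system E[Z_i] - E[Z_{i+1}] = 2i + 1 with E[Z_n] = 0,
    working backward from position n."""
    E = [0.0] * (n + 1)                   # E[Z_n] = 0: already at the destination
    for i in range(n - 1, -1, -1):
        E[i] = E[i + 1] + (2 * i + 1)     # the i-th difference equation
    return E

E = expected_steps(10)
print(E[0])                                # 100.0, i.e. n^2
print(all(E[i] == 10**2 - i**2 for i in range(11)))  # True: E[Z_i] = n^2 - i^2
```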
In the analysis of Papadimitriou's algorithm, we're not actually going to use this claim about the expectation. Rather, we're going to use an easy corollary, which gives an upper bound on the probability that T sub n exceeds its expectation by more than a factor of 2. Specifically, we're going to use the fact that the probability that a random walk needs strictly more than 2n^2 steps to reach position n for the first time is no more than 50%. This is a special case of a simple but useful inequality known as Markov's inequality.
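Here is a quick empirical check of the corollary (again my own sketch, repeating the hypothetical steps_to_reach helper from above so it runs standalone): the fraction of walks needing more than 2n^2 steps should come out at most 1/2, and in fact typically much smaller, since Markov's bound is loose.

```python
import random

def steps_to_reach(n):
    """Steps for a random walk from 0 to first reach n, as defined earlier."""
    position, steps = 0, 0
    while position < n:
        position = 1 if position == 0 else position + random.choice([-1, 1])
        steps += 1
    return steps

n, trials = 10, 10000
exceed = sum(steps_to_reach(n) > 2 * n**2 for _ in range(trials))
print(exceed / trials)  # empirically at most 0.5, usually well below
```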
For the proof, let's denote by p the probability of interest: the probability that the random variable T sub n is strictly bigger than 2n^2. Essentially, the reason the corollary is true is that if the probability that T sub n was bigger than 2n^2 were strictly bigger than 1/2, then the expectation would have to be bigger than n^2. But it's not; it equals n^2, as we just computed. So, a little more formally, let's start
from the expected value of T sub n. Now, we know this is n^2, we just computed it, but let's also write out the definition of this expected value. That's just a sum over the values that T sub n might take on, let's go ahead and use all non-negative integers k from 0 to infinity, of k times the probability that T sub n takes on the value k.
I'm going to break this sum into two parts: the part where k takes on a value that's at most 2n^2, and the part where k takes on a value bigger than 2n^2. So now let me just do some very crude lower bounds on this right-hand side. The first sum, well, whatever it is, it's going to be non-negative, because the k's are non-negative and probabilities are non-negative. And for the second sum, again I'm feeling very generous, let's just lower bound each k in the sum by 2n^2; it's always going to be bigger than that.
After the dust clears, the first sum has disappeared; that's the one we're lower-bounding by 0. In the second sum, we've replaced each k by 2n^2. We can take the 2n^2 outside the sum, and at that point the sum is merely the total probability mass on which T sub n takes on a value strictly bigger than 2n^2. By definition, we are calling this probability p.
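Putting the whole chain together in LaTeX:

$$n^2 = \mathbb{E}[T_n] = \sum_{k=0}^{\infty} k \cdot \Pr[T_n = k] \;\ge\; \sum_{k > 2n^2} k \cdot \Pr[T_n = k] \;\ge\; 2n^2 \sum_{k > 2n^2} \Pr[T_n = k] \;=\; 2n^2 \cdot p.$$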
This is the probability of interest, and rearranging, we see that, in fact, p has to be at most one-half. So that completes the claim of the corollary. It's really just a simple consequence of the hard work, which was proving that the expected value of this random walk's length is n^2. We'll plug this corollary into the analysis of Papadimitriou's algorithm next.