Let's move on to the second sense in which the generic local search algorithm
is under-determined. The main while loop reads: if there is a superior
neighboring solution y, then reset the current solution to be y.
But there may, of course, be many legitimate choices for y.
For example, in the maximum cut problem, given a cut, there may be many vertices
whose switch to the opposite group yields a cut with strictly more crossing edges.
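For concreteness, here is a minimal sketch of that main loop, specialized to maximum cut. The representation is my own assumption, not something fixed by the lecture: the graph is an adjacency list mapping each vertex (say, an integer) to a list of its neighbors, and a cut is a dict mapping each vertex to group 0 or group 1.

```python
def gain(graph, cut, v):
    """Change in the number of crossing edges if vertex v switches groups."""
    same = sum(1 for w in graph[v] if cut[w] == cut[v])  # edges that would start crossing
    diff = sum(1 for w in graph[v] if cut[w] != cut[v])  # edges that would stop crossing
    return same - diff

def local_search_max_cut(graph, cut):
    """While some vertex switch yields strictly more crossing edges, make it."""
    improved = True
    while improved:
        improved = False
        for v in graph:
            if gain(graph, cut, v) > 0:
                cut[v] = 1 - cut[v]  # move to the superior neighboring cut
                improved = True
                break                # rescan from the new current solution
    return cut
```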
So when you have multiple superior neighboring solutions, which one should
you choose? Well, this is a tricky question, and the
existing theory does not give us a clear-cut answer.
Indeed, it seems like the right answer is highly domain dependent.
Answering the question is probably going to require some serious experimentation
on your part, with the data that you're interested in.
One general point I can make: recall that when you're using local search to
generate from scratch an approximate solution to an optimization problem, you
want to inject randomness into the algorithm to coax it to explore the
solution space and return as many different locally optimal solutions as
possible over a bunch of independent runs.
You then remember the best of those locally optimal solutions
and return it at the end of the day.
So this is a second opportunity, in addition to the starting point, where you
can inject randomness into local search. If you have many possible improving
values of y, choose one at random. Alternatively, you could try to be clever
about which y you choose. For example, if there are multiple superior
neighboring y's, you could go to the best one.
For example, in the maximum cut problem, if there are lots of different vertices
whose switch to the other group would yield a superior cut, pick the vertex
which increases the number of crossing edges by the largest amount.
In the traveling salesman problem, amongst all neighboring tours with
smaller cost, pick the one with the minimum cost.
So this is a perfectly sensible rule about how to choose y.
It is myopic, and you could imagine doing much more sophisticated things to decide
which y to go to next. And indeed, if this is a problem you
really care about, you might want to work hard to figure out what are the right
heuristics for choosing y that work well in your application.
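As a sketch of the two rules just discussed, here are a random-improvement rule and a best-improvement rule for maximum cut. Both reuse the hypothetical gain() helper from the earlier sketch; neither is prescribed by the lecture.

```python
import random

def random_improving_vertex(graph, cut):
    """Pick uniformly at random among all vertices whose switch improves the cut."""
    candidates = [v for v in graph if gain(graph, cut, v) > 0]
    return random.choice(candidates) if candidates else None

def best_improving_vertex(graph, cut):
    """Myopic greedy rule: pick the switch that adds the most crossing edges."""
    best, best_gain = None, 0
    for v in graph:
        g = gain(graph, cut, v)
        if g > best_gain:
            best, best_gain = v, g
    return best  # None means the current cut is already locally optimal
```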
A third question that you have to answer, in order to have a precisely defined
local search algorithm, is: what are the neighborhoods?
For many problems, there is significant flexibility in how to define the
neighborhoods, and theory again does not give clear-cut
answers about how they should be designed.
Once again, this seems like a highly domain dependent issue.
If you want to use local search on a problem of interest, it's probably going
to be up to you to empirically explore which neighborhood choices seem to lead
to the best performance of the local search algorithm.
One issue you're likely to have to grapple with is figuring out how big to
make your neighborhoods. To illustrate this point, let's return to
the maximum cut problem. There we defined the neighbors of a
cut to be the other cuts you can reach by taking a single vertex and moving it
into the other group. This means each cut has a linear number,
O(n), of neighboring cuts, but it's easy to make the neighborhood bigger if we
want. For example, we could define the
neighbors of a cut to be those cuts you can reach by taking two vertices or
fewer and switching them to the opposite groups.
Now each cut is going to have a quadratic number of neighbors.
More generally, of course, we could allow a single local move to do up to k swaps
of vertices between the two sides, and then the neighborhood size would be O(n^k).
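Here is a small sketch of how that neighborhood size grows with k. The cut representation (a dict from vertex to group 0 or 1) is my assumption; the point is only that a k-swap neighborhood has O(n^k) members.

```python
from itertools import combinations

def k_swap_neighbors(cut, k):
    """Yield every cut reachable by switching the groups of at most k vertices."""
    vertices = list(cut)
    for size in range(1, k + 1):
        for subset in combinations(vertices, size):
            neighbor = dict(cut)
            for v in subset:
                neighbor[v] = 1 - neighbor[v]
            yield neighbor

# With n vertices: n single-switch neighbors, about n^2/2 two-switch neighbors,
# and O(n^k) neighbors in general -- so each scan for an improving cut gets slower.
```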
So, what are the pros and cons of enlarging your neighborhood size?
Well, generally speaking, the bigger you make your neighborhoods, the more time
you're going to have to invest searching for an improving solution in each step.
For example, in the maximum cut problem, if we implement things in a
straightforward way and only allow one vertex to be switched in an iteration, then
we only have to search through a linear number of options to figure out whether
there's an improving solution or not. On the other hand, if we allow two vertices
to be switched in a given iteration, we have to search through a quadratic number
of possibilities to determine whether or not we're currently locally optimal.
So the bigger the neighborhood, generally speaking, the longer it's going to take to
check whether you're currently locally optimal or whether there's some
better solution that you're supposed to move on to.
The good news about enlarging your neighborhood size is that you're going to
have fewer local optima. And in general, some of the local
optima that you're pruning are going to be bad solutions that you're happy to be
rid of. If you look back at the example I gave
you in the previous video, which demonstrated that the simple local search
for maximum cut can be off by 50%, you'll see that if we just enlarge the
neighborhoods to allow two vertices to be swapped in the same iteration, then all
of a sudden, on that four-vertex example, local search would be guaranteed to
produce the globally maximum cut. The bad locally optimal cuts have been
pruned away. Now, even if you allow two vertices to be
swapped in a given iteration, there are going to be more elaborate examples
showing that local search might be off by 50%, but on many instances allowing this
larger neighborhood will give you better performance from local search.
Summarizing, one high-level design decision you should be clear on in your
head before you apply the local search heuristic design paradigm to a
computational problem is how much you care about solution quality versus how
much you care about the computational resources required.
If you care about solution quality a lot, and you're willing to either wait or
to throw a lot of hardware at the problem, that suggests using
bigger neighborhoods. They're slower to search, but you're
likely to get a better solution at the end of the day.
If you really want something quick and dirty, you want it to be fast and you don't
care that much about the solution quality, that suggests using simpler,
smaller neighborhoods, neighborhoods that are fast to search,
knowing that some of the local optima you might get might not be very good.
Let me reiterate: these are just guidelines, they are not gospel.
In all computational problems, but especially with local search, the way you
proceed has to be guided by the particulars of your application.
So make sure you code up a bunch of different approaches, see what works.
Go with that. For our final two questions, let's
suppose you've resolved the initial three and you have a fully specified
local search algorithm. So you've made a decision about exactly
what your neighborhoods are; you've figured out the sweet spot for you between
efficient searchability and the solution quality you're likely to get at the end
of the day. You've made a decision about exactly how
you're going to generate the initial solution.
And you've made a decision about, when you have multiple neighboring
superior solutions, which one you're going to go to next.
Now, let's talk about what kind of performance guarantees you can expect
from a local search algorithm. So let's first just talk about running
time, and let's begin with the modest question: is it at least the case that
this local search algorithm is guaranteed to converge eventually?
In many of the scenarios you are likely to come across, the answer is yes.
Here's what I mean. Suppose you're dealing with a
computational problem where the set of possible solutions is finite.
And moreover, your local search is governed by an objective function,
and the way you define a superior neighboring solution is that it's one
with a better objective function value. This is exactly what we were doing in the
maximum cut problem. There the space was finite, it was just
the set of exponentially many graph cuts, and our objective function was the
number of crossing edges. Similarly, for the traveling salesman
problem, the space is finite. It's just the roughly n-factorial
possible tours, and again, how do you decide which tour to go to next? You look
for one that decreases the objective function value of the tour, the
total cost of the tour.
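For concreteness, here is a sketch of one standard way to search the neighboring tours just mentioned: the 2-change (2-opt) neighborhood, where a neighbor is obtained by reversing one contiguous segment of the tour. The lecture only assumes some notion of neighboring tours; the distance-matrix representation and helper names here are my own.

```python
def tour_cost(tour, dist):
    """Total cost of a tour given as a list of vertices; dist[u][v] is the edge cost."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def improving_2change(tour, dist):
    """Return a neighboring tour with strictly smaller cost, or None if locally optimal."""
    base = tour_cost(tour, dist)
    n = len(tour)
    for i in range(n - 1):
        for j in range(i + 2, n):
            # Reverse the segment between positions i+1 and j.
            neighbor = tour[:i + 1] + tour[i + 1:j + 1][::-1] + tour[j + 1:]
            if tour_cost(neighbor, dist) < base:
                return neighbor
    return None
```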
Whenever you have those two properties,
finiteness and strict improvement in the objective function, local search is
guaranteed to terminate. You can't cycle, because every time you
iterate you get something with a strictly better objective function value, and you
can't go on forever, because eventually you'll run out of the finitely many possible
things to try. There is, of course, no reason to be
impressed by the finite convergence of local search.
After all, brute-force search equally well terminates in finite time,
and this class is all about having efficient algorithms that run quickly.
So the real question you want to ask is: is local search guaranteed to converge in,
say, polynomial time? Here, the answer is generally negative.
When we studied the unweighted maximum cut problem, that was the exception that
proves the rule: there, we only needed a quadratic
number of iterations before finding a locally optimal solution. But, as we
mentioned in passing, if you just pass to the weighted version
of the maximum cut problem, there already a local search might need,
in the worst case, an exponential number of iterations before halting with a locally
optimal solution. In practice, however, the situation is
rather different, with local search heuristics often finding locally
optimal solutions quite quickly. We do not at present have a very
satisfactory theory that explains, or predicts, the running time of local
search algorithms. If you want to read more about it, you
might search for the keyword smoothed analysis.
Another nice feature of local search algorithms is that even if you're in the
unlucky case where your algorithm is going to take a long
time before it finds a locally optimal solution, you can always just
stop it early. So when you start the search, you can
just say, look, if after 24 hours you haven't found me a locally optimal
solution, just tell me the best solution you've found thus far.
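Here is a sketch of that "stop early" idea: run local search under a wall-clock budget and return whatever solution you have when time runs out. The improving_neighbor and objective arguments are placeholders for whatever the problem at hand supplies.

```python
import time

def local_search_with_deadline(solution, improving_neighbor, objective, budget_seconds):
    """Local search that gives up after budget_seconds and returns the best-so-far."""
    deadline = time.monotonic() + budget_seconds
    while time.monotonic() < deadline:
        neighbor = improving_neighbor(solution)
        if neighbor is None:      # locally optimal: no superior neighbor exists
            break
        solution = neighbor       # strict improvement in the objective
    return solution, objective(solution)
```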
So in addition to running time, we want to measure the performance of a local
search algorithm in terms of its solution quality.
So, is that going to be any good? Here, in general, the answer is no,
and again the maximum cut problem was the exception that proves the rule:
that's a very unusual problem, in that you can prove at least some kind of
performance guarantee about the locally optimal solutions.
For most of the optimization problems to which you might be tempted to apply the
local search design paradigm, there will be locally optimal solutions quite far
from globally optimal ones. Moreover, this is not just a theoretical
pathology. Local search algorithms in practice will
sometimes generate extremely lousy locally optimal solutions.
So this ties back into a point we made earlier, which is that if you're using local
search not just as a post-processing improvement step, but actually to generate
from scratch a hopefully near-optimal solution to an optimization problem, you
don't just want to run it once, because you don't know what you're going to get.
Rather, you want to run it many times, making random decisions along the way,
either from a random starting point or by choosing random improving solutions to
move to, so that you explore as best you can the space of all solutions.
You want, over many executions of local search, to get your hands on as many
different locally optimal solutions as you can; then you can remember the best
one. Hopefully the best one will at least be
pretty close to a globally optimal solution.
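As a final sketch, here is that random-restart strategy for maximum cut: run local search many times from random starting cuts and remember the best locally optimal solution found. It reuses the hypothetical local_search_max_cut helper and graph/cut representation assumed in the earlier sketches.

```python
import random

def crossing_edges(graph, cut):
    """Number of edges whose endpoints lie in different groups (each edge counted once)."""
    return sum(1 for u in graph for w in graph[u] if u < w and cut[u] != cut[w])

def repeated_local_search(graph, runs=100):
    """Independent random restarts; keep the best locally optimal cut seen."""
    best_cut, best_value = None, -1
    for _ in range(runs):
        start = {v: random.randint(0, 1) for v in graph}   # random initial cut
        cut = local_search_max_cut(graph, start)           # from the earlier sketch
        value = crossing_edges(graph, cut)
        if value > best_value:
            best_cut, best_value = cut, value
    return best_cut, best_value
```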