Lec - 20 asymptotic analysis of algorithms

We shall continue our study of asymptotic analysis of algorithms. And today after having seen growth functions the order notation, the upper bound, the lower bound notations we will now try and analyze some of the algorithms or programs that we have written till now and see what this analysis means and also study some of the techniques of analyzing or trying to find out these asymptotic growth functions. So we will pick up quickly these problems, the maximum of n numbers, the exchange sort, the tournament sort, Fibonacci numbers, the merge sort and the quick sort, these are the one that we have done. We have also done towers of Hanoi, permutations, the standard problems which we shall come to later on when the time permits. The maximum of n numbers was something which we did in the pervious class and without writing the routine again we found out that f(n) is of the form A n plus b and we can say this is order n. Just before we continue with this, if f(n) is equal to k which is a constant then it is said to be order one, this constant is usually called order one. it is the same as order zero, order three, order four, order five, all of them mean the same thing but the language which we use is constant complexities order n, when we say order one. When we say order n, we say it is a linear time algorithm. Now if a function is varying with n and you have to read all the inputs, anyway it cannot be better than linear time. But is the constant minimum is something we can analyze a bit later on but presently we are only bothered about asymptotic analysis. So finding the maximum, finding the minimum, all of them are linear time. And you really cannot find the maximum of n numbers in less than linear time. So that part is over if it is un-sorted, if it is sorted then obviously you can find it in constant time. If its already sorted and you know finding maximum of n numbers is obviously constant but an unsorted arbitrary list of n numbers, finding out the maximum is the best possible order that you can get asymptotically, the order is linear. So, the next one which we will come to is the exchange sort. The exchange sort, the core part of it there are two loops, for i equal to 1 i less than n or if you wrote 0 then you would have written n minus 1. For j equal to i plus 1 to j equal to n that is less than n j plus plus. If data i is less than data j then this is what you do. So if you make a quick analysis then this part will be executed once, this will be executed n times, this will be executed n times, this will be executed n times. In a loop, this will be executed n into n minus 1 whatever this similarly. And the inner loop will be executed means, what order of complexity you can find the exact, you can put 1 1 1 and find the exact value but if you have got the hang of order of analysis, you can say it will be of the type of a n square plus b n plus c. Very quickly you can say it is going to be of this type, though the constants will be dependent on what you do and you can say that exchange sort is order n square algorithm. So this is an order n square algorithm. Is that okay? Let us come to this tournament. You remember, this diagram of a tournament in which we found out we had these n numbers and we used an array of size 2 n minus 1 and we compare 2 input in the index of the half place. So building up of this tournament we did in a loop from back, compare these two 14 and 15 and put it in 7. 6 and 9 and put it in 9, 15 and 1 and put it in 5, 8 and 9 and put it in 4 here and this is what we did. So this loop is of order n. If this is 2 n minus 1, this will be approximately proportional to 2 n minus 1. So building up the tournament is of order n time. So tournament build, if you do not understand please stop me. Tournament build is order n. find out the maximum element is constant time after tournament build. Get next max, what did we do? We went along the maximum and if we assume the, we know the minimum then you make it less than the minimum and then you go up. Now how many comparisons do you make when you go down? log n and when you go up, log n. So to get the next max, you make two log n. two log n, so next max and if you repeat this next max continuously then you will require some, this is f(n) isn’t it, which is An plus b. Find_max, Next_max this is what you will continuously do plus n into this was of the form a log n plus any terms which are less linear time terms etc etc Bn plus whatever it may be happen, it may continue. So this term will be of a, this is a one say this is a one say this is a dash, this is b dash. So this whole term very quickly you can see that the highest order term will be a n log n. No, now b dash, b dash n, b dash n is not here sorry b dash may be some constant time may be required, there is not n sorry then this would have been n order n because log n is not there. So this, the whole sorting of n numbers will be done in order n log n time which is much better. So now let us see, one we have got f(n) is equal to n square. The other we have got g(n) which is equal to n log n. these are the two orders, order function. Now clearly g(n) is upper bounded by f(n) and g(n) is not lower bounded by f(n) but these two are not of the same order. Therefore this is much better than this. there are various other issues relating to order specially when you consider complexities of the type n to the power say log n, how does it compare with n to the power 500? How will you compare these two? Take the logs on both sides and compare and you will get a comparison between both of them. If you cannot directly compare these two, take log on both sides and compare. This will be better than this. See this one, its growth is a polynomial where this growth, this log n will increase with n. So, this one will grow faster than this one. So between these two algorithms, this will be better than this. In fact for any constant n to the power k, for any fixed constant may be even 2 milli ampere, it will be better than this in order terms. So we have got this tournament sort is order n log n algorithm, therefore its more efficient in asymptotic worst case complexity than a exchange sort but if you take 100 elements may be you just don’t know because the constants may be very big here. Let us come to the Fibonacci number and here we come, in the previous two cases we had very simple loops which we could analyze. In the Fibonacci number problem, how will we find out, what is the complexity of this? Can anybody help, any suggestions. What is the complexity of this? If n is equal to 0 then it is one check constant time, it is constant time. If n is equal to 1 it is constant time, otherwise otherwise what? So we have been able to say for Fibonacci numbers f(n) is equal to let us write it out like this. If n is equal to 0 or 1 then f(n) is equal to constant k is order 1 else what? It is the time to do this plus the time to do this plus another constant which is say one. So now the complexity equation again becomes a recursion and the complexity equation, the time n for Fibonacci is equal to let us say constant we can make it one, we are doing asymptotic analysis for n less than equal to 1 and is equal to T of n minus 1 plus t of n minus 2 plus 1 for n greater than 1. We have not found the solution, we don’t know the order yet, this is what we have. No, can you tell me what order this is? This is called a recurrence relation or a recurrence equation. This one can be solved. Do you know the answer to this one? Can you solve it in terms of Fibonacci n? See it looks very similar to Fibonacci numbers itself. by the way do you know what is the value of fib (n), 1 by root 5, n 1 minus 1 by root 5 to the power n plus 1 plus 1 by root 5 to the power n the whole 1 by 2 root five or something like that. It is like this something like this, it’s called a golden number but this T(n) can be written in term of fib (n) you know. Can you make a guess? Is T(n) equal to fib (n)? Just, what is T(n)? T(n) is equal to 1, fib n plus 1, is it equal to fib n plus 1. How will you prove this yes or no, how will you prove it? Substitute here by induction, fib n plus 1, so this will be T(n) if T(n) is fib n plus 1 then this by induction must be fib of n plus fib of n minus 1 plus 1 which will be fib of n plus 1 plus 1 which is not true. Now looking at this, can you make an adjustment? Put fib n plus 1 minus 1, then you will get 1 minus 1 here, you will get 1 minus 1 here and you will get 1 minus here. T(n) is equal to fib of n plus 1 or not? This is not true right, if it were true by induction this would have happened. T(n) minus 1. So by induction this should have been fib of n plus fib of n minus 1 plus 1. Now fib of n plus fib of n minus 1 is fib of n plus 1, so you will get fib of n plus 1 plus 1 which is not true. But if you put it just 1 minus 1 here, you will get a minus 1 here, will get a minus 1 here, you will get a minus 2 here and this will give you fib of n minus 1. So even if you don’t know this, you can say T(n), this much you can say. Isn’t it? Now fib of n plus 1 is an exponential number, just see. So T(n) will be you can always write, atleast it will be upper bounded by 2 to the power n. It will also be lower bounded by 2 to the power n, you can show that. I just leave it to you to try out because these are both numbers which would be exponential, the growth of this will be exponential. So this algorithm takes is of order exponential time 2 to the power n, it grows very rapidly compared to an algorithm like n square or n cube which grows as an polynomial function of n. You have to see what we can do about this. But the one point that we get from analyzing recursive equation is that these recursive definitions, the complexities of these recursive definitions can be formed by solving certain recurrence equations. Is that idea clear? That the solution to recursive definition are the complexity, time complexity, asymptotic time complexity of recursive functions or inductive definitions can be obtained by solving certain recurrence relations. And we have already got several such problems that are in hand and we will see one of them. Let us see the merge sort problem. If you remember the merge sort problem was if we split into two parts, equal parts or unequal parts or whatever. So split into two parts and then sort individually and then merge the two sorted list. And we did not decide on where to split, right. We did not decide on where to split, we said we will decide later on. Now is the time we will see, if we split in different ways what we will get. Now first let us understand what this merge routine does. This merge routine it takes in our final implementation in the same array a set of numbers one pointer here, one pointer here and in another array it compares these two, the smaller of the two if you are sorting in that order is written. Suppose this is 5 and this is 3 then 3 will be written and this pointer will move here, this index would move here. So for every number which is written here at most one comparison is done, at most one comparison is done. So for finding out merging of m plus say m 1 plus m 2 numbers so total of n numbers if I have to merge, I have to do some… My complexity will be of the form like this no more than this because I do not have to do more than n comparison and more than n comparison and more than n such assignments here. That’s all I have to do. So merging of n elements is order n time, so we know merging of n elements is order n time. So we come to merge sort, we are splitting into two parts. Now we could split one n minus 1 2 n minus 2 3 n minus 3 or we could split up half half three fourth one fourth one tenth nine tenth. So suppose we split up into 1 and n minus 1. Then what will be our equation? T(n) is equal to constant for n equal to 1 that is known and there is one element, it is constant time. Otherwise it is T(1) plus T(n) minus 1 plus merge time, this is order n. Order n you can write as n also because it will not affect the order of the complexity which is the same as T(n) minus 1 plus n plus 1. So T(n) is T(n) minus 1 plus n plus 1 which is, which starts expanding it out. How will you solve this? Can you solve this? Try it, expand it, T(n) minus 2 plus n minus 1 plus 1 plus n plus 1. This T(n) minus 1 will break up as T(n) minus 2 plus n minus 1 plus 1 like T(n) broke up as n minus 1 plus n plus 1, this will break up as n minus 2 plus n minus 1 plus 1. So it will be T(n) minus 2 plus n plus n plus 1. The next step you can write in one short T(n) minus 3 plus n minus 1 plus n plus n plus 1. And this will go right up to T(1) plus when n, when this is n minus three this starts from n minus 1. So when this is 1 which is equivalent to n minus n minus 1, 1 is equivalent to n minus n minus 1, it will start from n minus n minus 3. If this is 3 this is 1, so if this is n minus 1 this is n minus 3 which is the same as 3, 3 plus 4 plus 5 plus plus n plus 1. So this will be T(1) plus 3 plus 4 plus 5 or something like that. Isn’t it? Now this is a series summation which will sum up to order, straight write can you write the order term. n square. So if you split up 1 n minus 1 2 n minus 2, we are going to get an order n square algorithm, alright. Let’s see the, if you split up n minus 1 1 also you are going to get the same thing but suppose we split half half then our equation is T(n) is equal to 1, n is equal to 1 and is equal to 2 of T(n) by 2 for the 2 half’s n by 2 plus n by 2 plus n. So we have to solve this one, lets start expanding, two of expand T(n) by 2, T(n) by 4 plus n by 2 plus n which is the same as 2 square T(n) by 2 square plus n by 2 to the power 1 plus n by 2 the power 0. This will come in sorry, this will n n n, 2 into n by 2 this will be n, am sorry this is not this. This part is okay. What is this part? n plus n, right. This is it. This will be again broken up 2 square 2 T(n) by 2 cube plus n by 2 square plus n plus n. This is 2 cube t, if you stop me if you cannot follow. Now I can write the general term. Can anybody tell me the general term? 2 k t n by 2 k plus kn, alright. So now what happens? When does this become one? When n by 2 k is equal to 1 which implies n is equal to 2 k which implies k is equal to log n to the base 2. At that point of time, this is 1, 2 k is n so this becomes n into 1 plus n log n which makes it order n log n. So now if we split half half, we have got n log n. If we split 1 n minus 1, we get n square. So you remember what we were talking about in the initial class when we said that you leave that split alone. These are all the possible ways of designing, we will analyze it and then we will choose. Now you can check if I split three fourth, one fourth what will happen or you can write out an equation like this. I want, what is my choice of algorithm? T(1) is this, otherwise minimize this function for whatever I minimum. Find out the minimum value of i, find out that value of i for which this is minimized and solve this equation. And if you can solve this equation, you will get the optimal choice. This I may depend on n, it may say if n is 5 then you choose two. If n is 7 you choose three, if n is 9 you choose whatever it is or you may get a very simple solution like always choose i is equal to n by 2. You may not, this may not lead to a very simple equation, simple solution but this is what we have to do if we want to optimize your choice. This is what is meant by the choice of that split. Is that clear? So we have now got an idea that in merge sort, if I split it into two equal half I get an order n log n. So I will use, you will see all books of merge sort split into half because there is another proof that this minimizes i equal to n by 2 minimizes it. But there are other proof, if you write n by 4 and 3 n by 4 you will still get n log n. If you split it up in any fraction, you will get n log n, half you will get n log n. There are more details you can prove that if you split in half you spend, you do less comparisons than you do when split in three fourth also that you can choose but in terms of asymptotic complexity, we have reached this point in merge sort saying that merge sort is best, we can prove which I am not doing here. Merge sort is best when you split into half. Let us have a look at quick sort, very similar. Quick sort equation is also very similar. T(n) is equal to 1 for n is equal to one and is equal to and it depends on where your pivot element splits you. Now the pivot element you have chosen randomly, alright. The pivot element is chosen randomly, so it can split you in one n minus one. So in the worst case quick sort can become ordered n square because choice of pivot element is not in your hands but if you could choose the pivot element in such a way that it broke up the list into half half then you would have got order n log n algorithm. But then how will you find out how to split into half half? That is if you could split in linear time, look you cannot take n square time to split into half half that will spoil your whole complexity equation. So, given a set of elements which is that element which splits it into half half, the median. So you have to find the median in linear time. If you could find the median elements of n elements in linear time then you would choose the median and do a quick sort based on that median. Then you would have got worst case order n log n time but then how will you find the median in linear time. All of us find median by sorting. It’s not a chicken and egg problem, there is a method to find the median in linear time but that algorithm is slightly more complex. I will not go into it today but you can find the median in linear time. So in the worst case quick sort is order n square, in the best case it is n log n provided you can get it. Then you can start proving in the average case what happens and other things, we will discuss it later on. but our tournament sort and merge sort in the worst case they are order n log n algorithms and therefore they are better in the worst case scenario than quick sort or exchange sort or bubble sort but the merge sort has to be done by splitting half half, not arbitrarily. So asymptotic analysis helps you a lot, not only to find out how good your algorithm is, it helps you to decide on how to do recursive decomposition and where to choose some of your choice points which you may have left for the analysis part. It will also help you to understand which data structure is better and how to analyze the efficiency of your data structure which we shall come into the, which we shall discuss later on. But this phase of understanding what is meant by complexity, what is meant by orders, how do we analyze complexity, how do we use recurrence equations etc will form the basis of the second phase, the last two parts of this course mainly data structures, complete design of algorithms as well as data structures which we shall pick in the final phase of this course.