Tip:
Highlight text to annotate it
X
In
the previous lecture we have derived the decomposition method. The method can be used for finding
the solution of a system of linear algebraic equations or we can use it for finding inverse
of a given matrix.
We have also mentioned that the decomposition method can be simplified if we can use the
property of the given matrix. Now let us take this particular case when the given matrix
A is a symmetric matrix. The
method in that case has some special names.
The method is called the Cholesky method or it is also called the square root method.
The method was actually derived in two different places at the same time in different countries,
so that's why the names of
Cholesky has come and this square root methods have come; but they have been actually been
derived almost at the same instant of time.
Now we start with the assumption that A is a symmetric matrix. Then we decompose A as,
A is equal to L into L transpose or as A is equal to U into U transpose, where L is our
lower triangular matrix and U is
an upper triangular matrix. Any one of these decompositions can be used LL transpose or
A is equal to U into U transpose. Now if A is positive definite, then this decomposition
is always guaranteed. So if A is
positive definite this decomposition is always guaranteed. As in the case of the LU decomposition method this condition is
a sufficient condition and not a necessary condition. Now suppose that we have now
decomposed A L L into L transpose. How do we use it further for this solution of problem.
So let us take the case and illustrate for A is equal to L into L transpose. So in the
first method we said that we can solve a system of equations or we can find inverse. So let
us try to use this to solve the system
of equations Ax is equal to b. We follow the same way as in the LU b decomposition, so
we would write this as L into L transpose of x is equal to b. Then we shall denote this
in a product inside; denote L
transpose of x as some z, then this will read as L into z is equal to b. Then we shall say
that we first solve the second set of equations, solve Lz is equal to b. Now L is a lower triangular
matrix, therefore this can
be solved by forward substitution. Then we solve the first set that is Lx is equal to
z; L is a lower triangular matrix and therefore its transpose is an upper triangular matrix.
So this will be an upper triangular
system. Therefore we shall use the back substitution for solving this method.
Now again as we have shown that we can solve here also by one forward substitution and
one back substitution but the advantage will be not to use this but to use a slightly different
procedure and that is we can
have an alternative between these two. Let's write down the first one, take in L inverse
of this. So I can write this as z is equal to L inverse of b and the second system I
will write x is equal to L transpose inverse
of z. We know that transpose and inverse can be interchanged, so this will be equal to
L inverse of transpose of z. Therefore I need to determine only one inverse. L inverse is
here, L inverse transpose is being
used here. So I need to determine only L inverse. Now we have already shown it earlier that
if L is lower triangular, its inverse is also lower triangular. Therefore I can use the
forward substitution to get the inverse
of L also. Therefore L inverse will also be obtained by using the forward or forward substitution
only and we can then use the same L inverse in both the places and get the solution of
the problem.
Now let us see how we are going to get a decomposition as A is equal to L into L transpose. So let
us write it down our A is equal to LL transpose. Let's write down the full system a1, 1, a12,
a1n, a21, a22, a2n,
an1, an2, ann. L is a lower triangular matrix, so let us take it as l11, l21, l22, ln1, ln2,
lnn and we are multiplying by its transpose. So I can just transpose it and write this
as l11, l21, ln1 that is a first column
becomes the first row and the second column l22, l2n, and so on. I will have lnn over
here. Now let us multiply it and then compare the coefficients. So if I take the first row
and the first column I will have l11
square, so what I have here is l11 square is equal to a11. Therefore l11 is equal to
square root of a11. Now I complete the multiplication of the first row and the other columns, so
l11 into l21 is a12; this is a
product of this and this. Here a is symmetric, we have taken therefore aij is equal to aji;
therefore a12 is equal to a21; a1n is an1 and so on. So these elements will be the identical
elements here, so that we have
here a12 is equal to a21, etcetera. Therefore l11 into l21 is equal to a12. So I can get
l21 from here as a12 by l11 from this. And so on the first row and the last column will
give me l11; ln1 is a1n, therefore ln1 is
a1n upon l11. Therefore the entire set can be simply written as li1 is equal to ai1 by
l11. This is l21, l31, ln1 is a1n by l11. So this is our first column of l that we have
obtained from here.
Now we go to the second row second column because we know don't need to multiply the
first row because it is symmetric, so it is not necessary. So I will take the second row
second column that is l21 square
plus l22 square, that will be equal to a22. So I will then have l21 square plus l22 square,
that is a product of this second row second column is a22; therefore we take two to this
side and take the under root of
that. So I will have this as square root of a22 minus l21 square. Now it is here, we mention
that if A is positive definite, our decomposition is guaranteed. If A is positive definite,
this quantity is always strictly
greater than zero. Positive definiteness would imply that your a22 l21 square is always strictly
positive. We are now doing the entire arithmetic in real arithmetic but if you are going to
do complex arithmetic, it is
okay because then it is a negative number; then a square root will be a complex number
but we are doing real arithmetic therefore I would like to have real values only, therefore
this should be a positive quantity.
Therefore in other cases this may turn out to be a negative quantity wherein the method
would then fail. Now we proceed on and then compute all the other coefficients. I can
now write down, if you see the
diagonal element first, you take any row multiplied by the corresponding column; say ith row and
the ith column then it will be li1 square li2 square..., lii square, that means I would
get immediately lii is under root
of aii minus of li1 square plus so on lii minus one square or we simply remember it,
so that we put this summation notation j is equal to 1, 2, i minus one, that is your lij
square. Therefore these are all the squares
of all the previous available elements. We are now on the diagonal element, so we are
now taking the square of all the previous elements, that is this particular component
here, which is subtracted from aii; that
gives me all the diagonal elements, i is equal to 2, 3, ...n. Now for the other elements,
I will multiply by the method of matrix multiplication of this. Therefore I can immediately write
down that lij is equal to the
same. You can see that all these half diagonal elements would have a division by its pivot.
So first row will be l11, second row l22 and so on, if we are doing for lij will give you
ljj. So we will have the division by ljj and we will have here aij minus summation. This
does not have any square roots, so it
will be simply k is equal to 1, 2, j minus one again ljk, lik; i is equal to j plus one,
j plus two, so on, k. Of course j is equal to 1, 2, 3 ...n.
In this summation when the upper suffix is smaller than the lower suffix then that summation
is ignored. For example, if I take j is equal to one here; I will get here summation 1,
2, 0, so there is nothing is there.
Therefore that particular case will land into this particular case; this will land into
the first one. So if I take j is equal to two, then I will start getting the next elements
that we are getting here. So this is the general
loop under which all the elements of lii and ljj can be obtained. Now you would notice
why the method was given a square root method. We have got square roots for each pivot, so
all the pivots have to be done
by square root. Therefore there n square roots are to be used, n square roots are to be obtained
from here.
Now let us illustrate this by an example. I will try to find the inverse of a matrix
4, 2, 1; 2, 3, 2; and 1, 2, 2 and let's also write using Cholesky method. So let us just
write down the three by three matrix again; 4,
2, 1; 2, 3, 2; and 1, 2, 2. The lower triangular matrix l11, l21, l22, l31, l32, l33. We will
have this as l11, l21, l31; 0, l22, l32; 0, 0, l33. Let us simplify right here; let's
multiply it out l11 square, l11, l21, l11, l31. So
we'll have l11, l21, l21 square, l22 square, then we have l21, l31, l22, l32. That is a
product of second row and third column. Then we'll have l11, l31 and then we have here
this l21, l31 plus l22 l32, which is the
product of third row second column and lastly we will have l31 square, l32 square, l33 square.
Now given matrix is symmetric; we have written out l as l and l transpose, therefore these
will be symmetric.
Therefore this element has to be this element; this element will be this and this element
will be this because of the structure of the symmetry that we have used here constructing
L and L transpose. Now I just
compare these elements and write down the solution of these elements.
We will have here l11 square is equal to 4; therefore we have l11 is equal to two. Then
the next element l11 l21 is equal to two.
When we are talking of a square root we'll be talking of only the positive square root.
Either you use right throughout the positive square root or you use throughout negative
square root. I can also use the
negative square root; it will also give me the same solution. So we are using either
as a positive root but it is conventional to use square root as a positive square root.
Therefore we have l11 is equal to two;
therefore l21 is equal to two; divided by two, that's equal to one. Then we have the
third element l11 l31 is equal to one. Therefore l31 is equal to
one by two.
Now the three elements have been determined. The first column has been determined. Now
I go to the second. So we will take the middle element l21 square plus l22 square that is
second pivot is given as 3.
Therefore let us find out l22 square from here, which is three minus l21 square; l21
square is 1, therefore I have 2. Therefore l22 is equal to root of two. Then we go to
this last element l21, l31 plus l22, l32 is
given to us as two. Now let us substitute the values l21 is one, l31 three one is half,
l22 is root two, l32 is to be determined, and that's equal to two. So I can take half
to this side which produces three by two,
therefore l32 is three by two root two. Then the second column is complete. So we go to
the third column that means the third element. So we will have to take the last element over
here that is l31 square, l32
square l33 square is given as two. Now let us substitute the values l31 square is one
by four, l32 square is nine by four into two, that is eight plus l33 square is equal to
two. So let us take this to the right hand
side, we will have this as eleven minus eleven by eight, which is five by eight. Therefore
we have l33 which is root five by two root two. So all the elements are now determined.
Now let us substitute the elements and then write down what is our L. Therefore from here
I can write down the matrix L as 2, 0, 0; 1, root two, 0; that is l22 is root two and
this is one upon two, three upon two
root two, root five upon two root two. Now what is it we want, we want the inverse. We
have written A is equal to L into L transpose, I invert it, so I will write down A inverse
is equal to LL transpose inverse.
Now in inverting the product it gets interchanged, L transpose inverse into L inverse. Now transpose
and inverse can be interchanged. So I will write down L inverse transpose L inverse.
Therefore our A inverse
is simply L inverse transpose L inverse. Therefore I need to get L inverse and then transpose
it here, multiply to get my A inverse. So L is lower triangular, therefore it's inverse
is also lower triangular. Therefore I
will use the idea that L and inverse must be equal to I, product should be equal to
I identity matrix.
Let's write down our L and L inverse. So we will have here 2, 0, 0; 1, root two, 0; ½,
three by three root two, root five by two root two into L inverse. Now let's call this
as L11 dash, 0, 0; l21 dash, l22 dash, 0;
l31 dash, l32 dash, l33 dash. So let's denote the elements by the dash terms. We can make
one observation; some of the elements are trivial because this is a lower triangular
matrix, same is true for upper
triangular also. The diagonal elements of this will be inverse of the diagonal elements.
In fact we don't need to compute them; l11 dash has to come as one by two; l22 dash will
be one upon root two and l33
dash will be two root two by root five. But of course in our computation we can proceed
on; but that will be a check on doing the correct solution or not.
So these elements are trivially known, therefore we can see the if I multiply the row by this
column, what I am getting is two times l11 prime is one, that is your this element. Therefore
l11 dash is equal to one by
two. So this is the element that is true, then we multiply the second row and the first
column. So that I will have l11 dash plus root two, l21 dash; I then multiply the second
row first column, which should be
equal to zero. So this should be equal to zero, therefore l21 dash. Now l11 dash goes
to the right hand side, so I will have minus one upon two root two. Second row second column
just gives you root into l22
dash that is equal to one, so l22 dash is equal to one upon root two. The observation
which you made that the diagonal elements be of this will be inverse of this that is
one upon root two. Then I can multiply this
row and this column. So I will have half l11 dash three upon two root two l21 dash plus
root five by two root two l31 dash is equal to zero.
Now let us substitute these elements here, that is equal to half into half, that is equal
to one by four; three by root two, l21 dash is minus one upon two root two and this we
retain it as this, two root two, l31 dash
is equal to zero. So let us take everything to the right hand side and get the element
l31 dash that is two root two by five and then this will be minus one by four plus three
by eight. That gives us two root two by
root five and this is equal to one by eight. So that is equal to root two by four root
five.
Now we have determined the element l31 dash. Now to get the next element I need to multiply
by this third row with the second column. So if I do that, it gives you three upon two
root two, l22 dash plus root
five by two root two, l33 dash. I will just have a look at this; here we are multiplying
this row this column, so what I would get here is three upon two root two into l22 dash
plus root five upon two root two, l32
dash and these elements should correspond to this, that is equal to zero. Therefore
we can find out l32 dash is equal to minus this factor, let's write it; first two root
two, divided by this three upon two
root two,
one upon root two. So these gives us minus three by root five or let's write it as three root
two by two root five. Lastly we have the product of this row and this column. So that will
be two root two by root
five which is the inverse of this element. Therefore we have determined our L inverse;
L inverse is equal to l11 dash that is equal to one by half. I will just write these elements;
½, 0, 0; minus one upon two root
two, one upon root two, zero; root two by four root five, minus three root two by two
root five and two root two by root five; so this is our inverse.
Now I write down the solution of the problem as A inverse is equal to L inverse transpose
into L inverse. So I will invert this and write it as first matrix that is equal to
one by two,0, 0; minus one upon two root
two, one upon root two, zero; root two by four root five, three root two by two root
five and two root two by root five. I should have written the transpose of the matrix L
inverse. Let us avoid writing one extra
step. Let us now write the transpose of L inverse; one by two, minus one upon two root
two, root two upon four root five; zero, one upon root two and three root two by two root
five; zero, zero, two root two
by root five. This is our L inverse of transpose. This matrix ½, 0, 0; minus one upon two root
two, one upon root two, zero; root two upon four root five, three root two by root two
five, two root two by root
five. So this is the matrix.
Now it is a very simple thing. Just multiply it out and simplify it, I will give the value
for this so the value that comes out is simply equal to 2/5, -2/5, 1/5; -2/5, 7/5, -6/5;
1/5, -6/5, 8/5. This is obtained by just
multiplying the two matrices that we have given there. Now the application of this would
be that you have determined the inverse in a particular problem and if I have more than
one right hand side, this will be
advantageous for us to find the inverse. We are finding only one L inverse in this particular
procedure. Now these two direct methods that is the LU decomposition method and its version
that is Cholesky
method for symmetric matrices and the Gauss elimination are the most powerful method that
are used in our software. Even though there are more variants of the decomposition procedure
than the Gauss
elimination procedure, even today these are the most power methods for solving the system
of equations, when the system of equations is small. If the system is too large of millions
of or lakhs of equations, we
just cannot load them on the computer. Therefore there is no way that we can use these direct
methods to solve such huge systems. But there are some other practical problems.
Let us try to solve a practical problem like this. You are solving a particular problem
within a given domain by a finite difference method (some set of differential
equations) and it gives you system of linear equations, lets assume it is a linear equation.
Now you find that the region that you have taken is not sufficient and you want to have
a bigger region. Now you again
solve by the same method, you produce again a system of algebraic equations. Now the idea
is can we use the computation that has already been made? That means you have solved this
system of equations by
decomposition, and you have A inverse for the system available. Now I solve the problem
by extending that problem further. Can I use this computation that has already been used
rather than waste the
computations. The answer is yes, the particular application in fluid mechanics wherein the
condition for solving a particular differential equation is given as a condition like, as
x tends to infinity it should have this
value. So a condition as extending to infinity is given to us but infinity is not defined
in a particular problem, infinity could be nine in a problem and in other problem it
could be hundred, so you want to solve it
numerically and so you fixed a finite length for that. You find that you have not achieved
U is equal to U naught in the problem, so you have not positioned your infinity properly.
So we have to move the infinity
further. So infinity could be achieved, now you solve the problem. What happens to the
resultant is that the original system of equation stays as it is. If you look the rows and column
of Ax is equal to b, what we
have really done is we have added few rows and few columns to the original system and
got a new system.
Now the idea is we want to use the inverse that has been determined earlier in the previous
problem to solve the current problem and reduce the tremendous amount of computations. The
other application that we
would have is suppose you have a system of hundred by hundred equations to be solved.
Now I prefer to solve two fifty by fifty system of equations and some more computations rather
using hundred by
hundred systems.
We are going to show later that the computation efforts for the Gauss elimination or the decomposition
are order of n cubed by three for n large; that means if I solve the problem hundred
by hundred and I solve
a fifty by fifty, the amount of computation is at least of the order of magnitude of four
times. So if one problem takes two minutes, other problem takes another six to eight minutes,
so the order of the time that is
taken up by doubling a system is much higher. Therefore what I would prefer in that case
is I would like to use if possible the smaller systems and in that direction we have what
is known as partition method,
which will partition our matrix A and obtain the inverse of this matrix using the inverse
of the smaller systems.
As I said this has got very important applications in many areas of engineering. So what we do
is given this matrix A, I would conveniently decompose it, separate them as partition as
this and then name this as
some matrices B C D E. Now let us take A as n into n matrix, so I can take B is equal
to some r into r matrix and C is same rows, therefore C is r into s, D is equal to s into
s and E is equal to s into s. So r plus s
is equal to n. So these two matrices B and E are square matrices and these are rectangular
matrices. This is r into s, this is s into r but this is r into and r s into s. These
are square matrices and these will be
rectangular matrices.
Suppose you have a system of hundred by hundred. I can decompose if it was given like this,
I can take sixty by sixty here, I can take forty by forty system here; the two square
matrices, then this will be sixty
by forty and this will be forty by sixty. Or I could as well decompose it as fifty by fifty. Now what we were trying
to say here is that if I want the inverse of this hundred by hundred matrix I would
like to utilize
the known inverse; for example, of this fifty by fifty matrix or I know the inverse of sixty
by sixty matrix and thereby the amount of computation is reduced tremendously. Now in
this case what we would do is
we would also partition A inverse in this same way. So let us write A inverse is X Y Z V; partition
also A inverse as Z and V, so that the dimension of X and B are the same. So similarly your
E and V dimensions
will be the same and your C and Y dimensions will be the same and D and Z dimensions will
be the same. We have taken this as r into r, we have taken this as E and V; as s into
s; this C is r into s; this is s into r.
AA inverse should be equal to I, and this I will also partition it as I1, I2. I1 is of dimension r into r, I2
is of dimension s into s. So I will partition it also into two identity matrices; I1 and
I2. This will be the same
dimension of our X and V and I2 will be the dimension of V. Now let us put the values
of A and A inverse, that is B, C; D, E; X, Y; Z, V and this is equal to I1 and I2. Now
let us just multiply out the two
matrices. There is matrix multiplication. So I will have B into X plus C into Z equal
to I1. So let us just number it as one. Then B into Y plus C into V is null, this is a
null matrix. Let's number it as two. Then D
into X plus E into Z is null. So i will have D into X plus E into Z is null. Then the last
equation is D into Y plus E into V is equal to I2. Now we have assumed that the inverse
of the B matrix. One of the inverses
is available for us either inverse of B is available or inverse of E is available. So
let us start because as an application oriented problem, let us assume inverse B exists and
so let B inverse be known. Now it is
simply to solve these four equations for these four unknowns, so since B inverse is known
let us start with equation two.
Let us start with the equation two. I pre multiply by B inverse, so this gives me B
plus Y plus B inverse CV is equal to null. I have pre multiplied this by B inverse, so
Y B inverse CV is equal to zero. Now I
eliminate D, eliminate Y from four using this. So what i have to do here is I multiply this
pre multiply by D and subtract from here. So what I will do therefore is I will take
fourth equation minus D into equation
two, and let's call it as five. From four I will pre multiply this by D and subtract
from there. From this I am subtracting that is equal to E. Now we are multiplying this
by D therefore this D Y is gone and I will
have E minus D B inverse C. I pre multiply this by D, so this is gone, so D B inverse
of C into V and this is equal to I2. Now this goes to the right hand side, therefore V is
equal to E minus DB inverse C
inverse. This is I2. I can throw away I2 because I2 is multiplied by identity matrix. Therefore
V is known. Now let's go to two or five. Now therefore
Y is equal to minus B inverse C V (from five, Y is equal to minus B inverse C V). Therefore
Y is known; therefore Y is equal to known. Now I have manipulated
equation two and four, I will do similarly for one and three. So let us take equation
one. Again pre multiply by B inverse, so I will have X plus B inverse of C Z is equal
to B inverse. I pre multiplied this by B
inverse. Now I want to throw away X from here, so I multiply this by D and subtract from
there, therefore three minus D into let's number it as six. So from the equation three
I multiply by D and subtracted from
here, so what I will again have here is E minus D B inverse C of Z, that is your E here,
I have multiplied this by D, D B inverse C that is equal to minus DB inverse. We multiplied
by D and subtracted minus D B
inverse, therefore Z is equal to E minus D B inverse C inverse D B inverse. Let's bring
this minus sign over here but this E minus D B inverse C inverse is V. Therefore I can use this value of E over
here and
write this simply as V D B inverse. Now the solution is complete. Now use six; six gives
me X is equal to B inverse, I can take out the common there I minus C Z I minus C Z.
So I am using this, I am taking this
side and taking the common on the left. B inverse I minus C Z. So this is your seven.
Now we have completed the solution of all the four. Now what we should see here is that
in building this we have used only two inverses; one is B inverse, the other one that we used
here is E minus D B inverse
C inverse. These are the two inverses that we have used in the construction of the entire
system over here. Now what is this order; the order of this is r into r, the order of
E was s, therefore order of this is s into
s. Therefore we are now using the inverse of only two matrices r into r, s into s.
Now if you go back to the example which we had written earlier, if you are constructing
the inverse of this particular matrix I need inverse of this and not a hundred by hundred
system. But by using this or for
example in this case I need the inverse of two fifty by fifty matrices and of course
I need some multiplication of the matrices to be done; leave alone the extra computation.
So we shall be doing enormous saving
of computation in finding the inverse of large system of equations by using this particular
concept that we can partition a given matrix into a suitable form. And then find out inverse
of that particular matrix using
the inverses of the lower order matrices.