Introduction
The forums are replete with people trying to learn how to efficiently find inverses of matrices or solve the classic \(Ax=b\) problem. Here's a decent method that is fairly easy to learn and implement. Hopefully it might also serve as a stepping stone to learning some of the more advanced matrix factorization methods, like Cholesky, QR, or SVD.
Overview
In 1948, Alan Turing introduced LU decomposition, a way to factor a matrix that makes solving \(Ax=b\) straightforward. Although there are many different schemes to factor matrices, LU decomposition is one of the more commonly-used algorithms. Interestingly enough, Gaussian elimination can be implemented as LU decomposition, and the computational effort expended is about the same.
So why would anyone want to use this method? First of all, the time-consuming elimination step can be formulated so that only the entries of \(A\) are required. As well, if there are several right-hand-side vectors \(b\) for which \(Ax=b\) needs to be solved, then we only need to do the decomposition once. This last benefit helps us compute the inverse of \(A\) very quickly.
How It Works
Let's start with an \(Ax=b\) problem, where you have an \(n \times n\) matrix \(A\) and an \(n \times 1\) column vector \(b\), and you're trying to solve for an \(n \times 1\) column vector \(x\). The idea is that we break up \(A\) into two matrices: \(L\), which is a lower triangular matrix, and \(U\), which is an upper triangular matrix. It would look something like this:
\[
\left [
\begin{matrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33} \\
\end{matrix} \right ] = \left [
\begin{matrix}
l_{11} & 0 & 0 \\
l_{21} & l_{22} & 0 \\
l_{31} & l_{32} & l_{33} \\
\end{matrix} \right ] \left [
\begin{matrix}
u_{11} & u_{12} & u_{13} \\
0 & u_{22} & u_{23} \\
0 & 0 & u_{33} \\
\end{matrix} \right ]
\]
This seems like a lot more work, but it allows us to use some cool math tricks. Our original equation, substituted with our decomposition, is now \((LU)x=b\):
\[
\left (
\left [ \begin{matrix}
l_{11} & 0 & 0 \\
l_{21} & l_{22} & 0 \\
l_{31} & l_{32} & l_{33} \\
\end{matrix} \right ]
\left [ \begin{matrix}
u_{11} & u_{12} & u_{13} \\
0 & u_{22} & u_{23} \\
0 & 0 & u_{33} \\
\end{matrix} \right ]
\right )
\left [ \begin{matrix}
x_1 \\
x_2 \\
x_3 \\
\end{matrix} \right ] =
\left [ \begin{matrix}
b_1 \\
b_2 \\
b_3 \\
\end{matrix} \right ]
\]
If we shift the parentheses, we get the following:
\[
\left [ \begin{matrix}
l_{11} & 0 & 0 \\
l_{21} & l_{22} & 0 \\
l_{31} & l_{32} & l_{33} \\
\end{matrix} \right ]
\left (
\left [ \begin{matrix}
u_{11} & u_{12} & u_{13} \\
0 & u_{22} & u_{23} \\
0 & 0 & u_{33} \\
\end{matrix} \right ]
\left [ \begin{matrix}
x_1 \\
x_2 \\
x_3 \\
\end{matrix} \right ]
\right )
=
\left [ \begin{matrix}
b_1 \\
b_2 \\
b_3 \\
\end{matrix} \right ]
\]
Looking just inside the parentheses, we can see another \(Ax=b\) type problem. If we say that \(Ux=d\), where \(d\) is a different column vector from \(b\), we have two separate \(Ax=b\) type problems:
\[
\left [ \begin{matrix}
l_{11} & 0 & 0 \\
l_{21} & l_{22} & 0 \\
l_{31} & l_{32} & l_{33} \\
\end{matrix} \right ]
\left [ \begin{matrix}
d_1 \\
d_2 \\
d_3 \\
\end{matrix} \right ]
=
\left [ \begin{matrix}
b_1 \\
b_2 \\
b_3 \\
\end{matrix} \right ]
\\
\left [ \begin{matrix}
u_{11} & u_{12} & u_{13} \\
0 & u_{22} & u_{23} \\
0 & 0 & u_{33} \\
\end{matrix} \right ]
\left [ \begin{matrix}
x_1 \\
x_2 \\
x_3 \\
\end{matrix} \right ]
=
\left [ \begin{matrix}
d_1 \\
d_2 \\
d_3 \\
\end{matrix} \right ]
\]
It looks like we just created more work for ourselves, but we actually made things easier. The matrices on the left of both problems are triangular, so, using forward and back substitution, we can easily solve for all the elements of the \(d\) and \(x\) vectors. We first solve \(Ld=b\) for \(d\) by forward substitution, then substitute into \(Ux=d\) and solve for \(x\) by back substitution. The real trick to this method is the decomposition of \(A\).
The Catch
There are a few catches to this method:
- The matrix \(A\) must be square to use LU factorization. Other factorization schemes will be necessary if \(A\) is rectangular.
- We have to be sure that \(A\) is a nonsingular (i.e. invertible) matrix. If it can't be inverted, then the decomposition will produce an \(L\) or \(U\) that is singular and the method will fail because there is no unique solution.
- To avoid division by zero or by really small numbers, we have to implement a pivoting scheme just like with Gaussian elimination. The problem then takes the form \(PA=LU\), where \(P\) is a permutation matrix: an identity matrix with its rows reordered, so that \(PA\) is \(A\) with its rows swapped in the same way. The \(Ax=b\) problem then becomes \(LUx=Pb\), since \(PA=LU\).
- Some of the entries in the \(L\) and \(U\) matrices must be known before the decomposition, or else the system has too many unknowns and not enough equations to solve for all the entries of both matrices. For what's formally known as Doolittle decomposition, the diagonal entries of the \(L\) matrix are all 1. If we use Crout decomposition, the diagonals of the \(U\) matrix are all 1.
The method presented here won't have the pivoting part implemented, but it shouldn't be a problem to implement later.
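As a quick illustration of the two-stage solve (and of the \(LUx=Pb\) form above), here's a minimal sketch using MATLAB's built-in lu function, which does include partial pivoting. Note that MATLAB returns Doolittle-style factors, with the 1's on the diagonal of \(L\) rather than \(U\); the backslash operator recognizes the triangular factors and performs the forward and back substitutions:

A = [3 -0.1 -0.2; 0.1 7 -0.3; 0.3 -0.2 10];   % the example system solved later in this article
b = [7.85; -19.3; 71.4];
[L,U,P] = lu(A);     % P*A = L*U, with P chosen by partial pivoting
d = L \ (P*b);       % forward substitution: solve L*d = P*b
x = U \ d;           % back substitution: solve U*x = d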
The Algorithm
For a square matrix \(A\) with entries \(a_{ij},\,i=1,\cdots,n,\,j=1,\cdots,n\), the Crout decomposition is as follows:
First:
\[
l_{i1} = a_{i1},\,\textrm{for}\,i=1,2,\cdots,n\\
u_{1j} = \frac{a_{1j}}{l_{11}},\,\textrm{for}\,j=2,3,\cdots,n
\]
For \(j=2,3,\cdots,n-1\):
\[
l_{ij} = a_{ij}-\sum_{k=1}^{j-1}l_{ik}u_{kj},\,\textrm{for}\,i=j,j+1,\cdots,n \\
u_{jk} = \frac{a_{jk}-\sum_{i=1}^{j-1}l_{ji}u_{ik}}{l_{jj}},\,\textrm{for}\,k=j+1,j+2,\cdots,n
\]
Finally:
\[
l_{nn}=a_{nn}-\sum_{k=1}^{n-1}l_{nk}u_{kn}
\]
That's it! Notice that \(A\) is traversed in a specific way to build \(L\) and \(U\). To start building \(L\), we begin at \(a_{11}\) and move down the first column until we hit \(a_{n1}\). To start building \(U\), we begin at \(a_{12}\) and move across the first row until we hit \(a_{1n}\). We then extend \(L\) by starting at \(a_{22}\) and moving down that column to \(a_{n2}\), extend \(U\) by starting at \(a_{23}\) and moving across that row to \(a_{2n}\), and so on.
Since the 1's on the diagonal of \(U\) are known ahead of time, as are the 0's in both the \(L\) and \(U\) matrices, there's no need to store them. In fact, there's a storage-saving scheme where all the calculated entries of both matrices are stored in a single matrix the same size as \(A\); a sketch of this idea appears below.
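Here's a minimal sketch of that packed-storage idea, written as a hypothetical helper lucrout_packed (not part of the code presented later): the entries of \(L\) live on and below the diagonal of one working matrix, the off-diagonal entries of \(U\) live above it, and the implied 1's and 0's are never stored.

function LU = lucrout_packed(A)
% Crout decomposition stored in a single matrix the size of A.
% L occupies the lower triangle (including the diagonal);
% the strictly upper triangle holds U's off-diagonal entries.
n = size(A,1);
LU = A;
LU(1,2:n) = LU(1,2:n) / LU(1,1);                 % first row of U
for j = 2:n
    for i = j:n                                  % column j of L
        LU(i,j) = LU(i,j) - LU(i,1:j-1)*LU(1:j-1,j);
    end
    for k = j+1:n                                % row j of U
        LU(j,k) = (LU(j,k) - LU(j,1:j-1)*LU(1:j-1,k)) / LU(j,j);
    end
end
end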
From here, \(d\) can be solved very easily with forward-substitution:
\[
d_1 = \frac{b_1}{l_{11}} \\
d_i = \frac{b_i - \sum_{j=1}^{i-1}l_{ij}d_j}{l_{ii}},\,\textrm{for}\,i=2,3,\cdots,n
\]
As well, \(x\) can be solved very easily with back-substitution:
\[
x_n = d_n \\
x_i = d_i - \sum_{j=i+1}^n u_{ij}x_j,\,\textrm{for}\,i=n-1,n-2,\cdots,1
\]
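Translating these formulas directly into MATLAB gives something like the following sketch. The names forward_sub and back_sub are just placeholders (in MATLAB each function would normally go in its own file), and they assume Crout's convention: a general diagonal on \(L\) and a unit diagonal on \(U\).

function d = forward_sub(L, b)
% Solve L*d = b for a lower triangular L.
n = length(b);
d = zeros(n,1);
d(1) = b(1) / L(1,1);
for i = 2:n
    d(i) = (b(i) - L(i,1:i-1)*d(1:i-1)) / L(i,i);
end
end

function x = back_sub(U, d)
% Solve U*x = d for an upper triangular U with 1's on the diagonal.
n = length(d);
x = zeros(n,1);
x(n) = d(n);
for i = n-1:-1:1
    x(i) = d(i) - U(i,i+1:n)*x(i+1:n);
end
end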
A Numerical Example
Let's try to solve the following problem using LU decomposition:
\[
\left [ \begin{matrix}
3 & -0.1 & -0.2 \\
0.1 & 7 & -0.3 \\
0.3 & -0.2 & 10 \\
\end{matrix} \right ]
\left [ \begin{matrix}
x_1 \\ x_2 \\ x_3 \\
\end{matrix} \right ]
=
\left [ \begin{matrix}
7.85 \\ -19.3 \\ 71.4 \\
\end{matrix} \right ]
\]
First, we copy the first column of \(A\) into the \(L\) matrix, and the first row of \(A\) (except the first element), divided by \(l_{11}=3\), into the \(U\) matrix:
\[
L =
\left [ \begin{matrix}
3 & 0 & 0 \\
0.1 & - & 0 \\
0.3 & - & - \\
\end{matrix} \right ],\,\,\,
U =
\left [ \begin{matrix}
1 & -0.0333 & -0.0667 \\
0 & 1 & - \\
0 & 0 & 1 \\
\end{matrix} \right ]
\]
Then we compute the following entries:
\[
l_{22} = a_{22} - l_{21}u_{12} = 7-(0.1)(-0.0333) = 7.00333 \\
l_{32} = a_{32} - l_{31}u_{12} = -0.2-(0.3)(-0.0333) = -0.19 \\
u_{23} = \frac{a_{23}-l_{21}u_{13}}{l_{22}} = \frac{-0.3-(0.1)(-0.0667)}{7.00333} = -0.0419 \\
l_{33} = a_{33}-l_{31}u_{13}-l_{32}u_{23} = 10-(0.3)(-0.0667)-(-0.19)(-0.0419) = 10.012 \\
\]
Inserting them into our matrices:
\[
L =
\left [ \begin{matrix}
3 & 0 & 0 \\
0.1 & 7.00333 & 0 \\
0.3 & -0.19 & 10.012 \\
\end{matrix} \right ],\,\,\,
U =
\left [ \begin{matrix}
1 & -0.0333 & -0.0667 \\
0 & 1 & -0.0419 \\
0 & 0 & 1 \\
\end{matrix} \right ]
\]
This is the LU factorization of the matrix. You can check that \(LU=A\). Now we use forward- and back-substitution to solve the problem:
\[
d_1 = \frac{b_1}{l_{11}} = 7.85/3 = 2.6167 \\
d_2 = \frac{b_2-l_{21}d_1}{l_{22}} = (-19.3-(0.1)(2.6167))/7.00333 = -2.7932 \\
d_3 = \frac{b_3-l_{31}d_1-l_{32}d_2}{l_{33}} = (71.4-(0.3)(2.6167)-(-0.19)(-2.7932))/10.012 = 7 \\
\]
\[
x_3 = d_3 = 7 \\
x_2 = d_2 - u_{23}x_3 = -2.7932-(-0.0419)(7) = -2.5 \\
x_1 = d_1 - u_{12}x_2 - u_{13}x_3 = 2.6167-(-0.0333)(-2.5)-(-0.0667)(7) = 3 \\
\]
So the solution to the problem is:
\[
\left [ \begin{matrix}
x_1 \\ x_2 \\ x_3 \\
\end{matrix} \right ]
=
\left [ \begin{matrix}
3 \\ -2.5 \\ 7 \\
\end{matrix} \right ]
\]
This solution can easily be verified by plugging it back into \(Ax=b\).
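For example, plugging into the first row of \(A\) gives
\[
3(3) + (-0.1)(-2.5) + (-0.2)(7) = 9 + 0.25 - 1.4 = 7.85 = b_1,
\]
and the other two rows check out the same way.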
MATLAB Code
Here's some quick MATLAB code for LU decomposition:
function [L,U] = lucrout(A)
% Crout LU decomposition without pivoting: A = L*U, where L is lower
% triangular and U is upper triangular with 1's on its diagonal.
[~,n] = size(A);
L = zeros(n,n);
U = eye(n,n);
% First column of L and (scaled) first row of U
L(1,1) = A(1,1);
for j=2:n
    L(j,1) = A(j,1);
    U(1,j) = A(1,j) / L(1,1);
end
% Alternate between column j of L and row j of U
for j=2:n-1
    for i=j:n
        L(i,j) = A(i,j);
        for k=1:j-1
            L(i,j) = L(i,j) - L(i,k)*U(k,j);
        end
    end
    for k=j+1:n
        U(j,k) = A(j,k);
        for i=1:j-1
            U(j,k) = U(j,k) - L(j,i)*U(i,k);
        end
        U(j,k) = U(j,k) / L(j,j);
    end
end
% Last diagonal entry of L
L(n,n) = A(n,n);
for k=1:n-1
    L(n,n) = L(n,n) - L(n,k)*U(k,n);
end
end
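Putting the pieces together on the numerical example from earlier, and assuming the hypothetical forward_sub and back_sub sketches from above are available, the whole solve looks like this:

A = [3 -0.1 -0.2; 0.1 7 -0.3; 0.3 -0.2 10];
b = [7.85; -19.3; 71.4];
[L,U] = lucrout(A);      % decompose once
d = forward_sub(L, b);   % solve L*d = b
x = back_sub(U, d);      % solve U*x = d; x is approximately [3; -2.5; 7]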
Matrix Inverse with LU Decomposition
LU decomposition is nice for solving a series of \(Ax=b\) problems with the same \(A\) matrix and different \(b\) vectors. This is advantageous for computing the inverse of \(A\) because only one decomposition is required. The inverse of a matrix \(A\), written \(A^{-1}\), is the matrix such that \(AA^{-1}=I\), where \(I\) is the identity matrix. For a general \(3 \times 3\) matrix, this looks like:
\[
\left [ \begin{matrix}
A_{11} & A_{12} & A_{13} \\
A_{21} & A_{22} & A_{23} \\
A_{31} & A_{32} & A_{33} \\
\end{matrix} \right ]
\left [ \begin{matrix}
a_{11}^\prime & a_{12}^\prime & a_{13}^\prime \\
a_{21}^\prime & a_{22}^\prime & a_{23}^\prime \\
a_{31}^\prime & a_{32}^\prime & a_{33}^\prime \\
\end{matrix} \right ]
=
\left [ \begin{matrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
\end{matrix} \right ]
\]
This can be set up as \(n\) separate \(Ax=b\) problems with different \(b\) vectors, where each solution \(x\) becomes the corresponding column of the inverse matrix. For example, the first problem is:
\[
\left [ \begin{matrix}
A_{11} & A_{12} & A_{13} \\
A_{21} & A_{22} & A_{23} \\
A_{31} & A_{32} & A_{33} \\
\end{matrix} \right ]
\left [ \begin{matrix}
a_{11}^\prime \\
a_{21}^\prime \\
a_{31}^\prime \\
\end{matrix} \right ]
=
\left [ \begin{matrix}
1 \\
0 \\
0 \\
\end{matrix} \right ]
\]
The second column of the inverse can be computed by changing \(b\) to \([0,1,0]^T\), the third column with \([0,0,1]^T\), and so on. This method is quick because only forward- and back-substitution are required to solve for the column vectors after the initial LU decomposition.
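As a sketch, the whole inverse can be built column by column from a single decomposition, again using lucrout together with the hypothetical forward_sub and back_sub helpers from earlier:

A = [3 -0.1 -0.2; 0.1 7 -0.3; 0.3 -0.2 10];   % example matrix from earlier
n = size(A,1);
[L,U] = lucrout(A);           % one decomposition serves every column
I = eye(n);
Ainv = zeros(n);
for k = 1:n
    d = forward_sub(L, I(:,k));    % solve L*d = e_k
    Ainv(:,k) = back_sub(U, d);    % column k of the inverse
end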
WARNING: As you can see, getting the inverse of this matrix requires solving \(n\) \(Ax=b\) problems, one for each column of the inverse. If you're computing the inverse just to solve an \(Ax=b\) problem, you're introducing more numerical error into your solution and slowing down your computation. In short, make sure you really need the matrix inverse, and never use the matrix inverse to solve a system of equations!
Beyond LU Decomposition
There are a lot of other matrix factorization schemes besides LU, like Cholesky or QR factorization, but the general idea of decomposing a matrix into other matrices is roughly the same. The real key to computational savings comes from knowing beforehand what kind of matrix you're factoring and choosing the appropriate algorithm. For example, in structural finite element analysis, the matrix being decomposed is typically symmetric positive definite. Cholesky decomposition takes roughly half the operations of LU for those kinds of matrices, so it's preferred.
Conclusion
LU decomposition is a great tool for anyone working with matrices. Hopefully you can make use of this simple, yet powerful method.
Article Update Log
28 Apr 2014: Added warning to matrix inverse
16 Apr 2014: Initial release