One of the most important and common applications of numerical linear algebra is the
solution of linear systems that can be expressed in the form A*x = b
. When
A
is a large sparse matrix, you can solve the linear system using
iterative methods, which enable you to tradeoff between the run time of the calculation and
the precision of the solution. This topic describes the iterative methods available in
MATLAB^{®} to solve the equation A*x = b
.
There are two types of methods for solving linear equations A*x =
b
:
Direct methods are variants of Gaussian elimination. These methods involve the individual matrix elements directly, through matrix operations such as LU, QR, or Cholesky factorization. You can use direct methods to solve linear equations with a high level of precision, but these methods can be slow when operating on large sparse matrices. The speed of solving a linear system with a direct method strongly depends on the size of the coefficient matrix.
For example, this code solves a small linear system.
A = magic(5); b = sum(A,2); x = A\b; norm(A*xb)
ans = 1.4211e14
MATLAB implements direct methods through the matrix division operators /
and \
, as well as functions such as lsqminnorm
,
decomposition
, and linsolve
.
Iterative methods produce an approximate solution to the linear system after a finite number of steps. These methods are useful for large systems of equations where it is reasonable to tradeoff precision for a shorter run time. These methods involve the coefficient matrix only indirectly, through a matrixvector product or an abstract linear operator. Iterative methods are usually applied only to sparse matrices, because smaller systems can be easily solved with direct methods. The speed of solving a linear system with an indirect method does not depend as strongly on the size of the coefficient matrix as a direct method. However, using an iterative method typically requires tuning parameters for each specific problem.
For example, this code solves a large sparse linear system that has a symmetric positive definite coefficient matrix.
A = delsq(numgrid('L',400));
b = ones(size(A,1),1);
x = pcg(A,b,[],1000);
norm(A*xb)
pcg converged at iteration 796 to a solution with relative residual 9.9e07. ans = 3.4285e04
MATLAB implements a variety of iterative methods that have different strengths
and weaknesses depending on the properties of the coefficient matrix
A
.
Direct methods are usually faster and more generally applicable than indirect methods,
if there is enough storage available to carry them out. Generally, you should attempt to use
x = A\b
first. If the direct solve is too slow, then you can try using
iterative methods.
Most iterative algorithms that solve linear equations follow a similar process:
Start with an initial guess for the solution vector x0
. (This
is usually a vector of zeros unless you specify a better guess.)
Compute the residual norm res = norm(bA*x0)
.
Compare the residual against the specified tolerance. If res <=
tol
, end the computation and return the computed answer for
x0
.
Update the magnitude and direction of the vector x0
based on
the value of the residual and other calculated quantities.
Repeat Steps 2 through 4 until the value of x0
is good enough
to satisfy the tolerance.
The iterative methods differ in how they update the magnitude and direction of
x0
in Step 4, and some have slightly different convergence criteria in
Steps 2 and 3, but this captures the basic process that all iterative solvers follow.
MATLAB has several functions that implement iterative methods for sparse systems of linear equations. These methods are designed to solve Ax = b or minimize the norm b – Ax. Several of these methods have similarities and are based on the same underlying algorithms, but each algorithm has benefits in certain situations [1], [2].
Description  Notes 























This flowchart of iterative solvers in MATLAB gives a rough idea of the situations where each solver is useful. You can
generally use gmres
for almost all square, nonsymmetric problems. There
are some cases where the biconjugate gradients algorithms (bicg
,
bicgstab
, cgs
, and so on) are more efficient
than gmres
, but their unpredictable convergence behavior often makes
gmres
a better initial choice.
The convergence rate of iterative methods is dependent on the spectrum (eigenvalues) of the coefficient matrix. Therefore, you can improve the convergence and stability of most iterative methods by transforming the coefficient matrix to have a more favorable spectrum (clustered eigenvalues, or a condition number near 1). This transformation is performed by applying a second matrix, called a preconditioner, to the system. This process transforms the linear system
$$Ax=b$$
into an equivalent system
$$\tilde{A\text{\hspace{0.17em}}}\tilde{x}=\tilde{b}\text{\hspace{0.17em}}.$$
The ideal preconditioner transforms the coefficient matrix A into an identity matrix, since any iterative method will converge in one iteration with such a preconditioner. In practice, finding a good preconditioner requires tradeoffs. The transformation is performed in one of three ways: left preconditioning, right preconditioning, or split preconditioning.
The first case is called left preconditioning since the preconditioner matrix M appears on the left of A:
$$\left({M}^{1}A\text{\hspace{0.17em}}\right)x=\left({M}^{1}\text{\hspace{0.17em}}b\right)\text{\hspace{0.17em}}.$$
These iterative solvers use left preconditioning:
In right preconditioning, M appears on the right of A:
$$\left(A\text{\hspace{0.17em}}{M}^{1}\right)\left(M\text{\hspace{0.17em}}x\right)=b\text{\hspace{0.17em}}.$$
These iterative solvers use right preconditioning:
Finally, for symmetric coefficient matrices, split preconditioning ensures that the new system still has a symmetric matrix. The preconditioner $$M=H{H}^{T}$$ appears on both sides of A:
$$\left({H}^{1}A\text{\hspace{0.17em}}{H}^{T}\right){H}^{T}x=\left({H}^{1}b\right)$$
The solver algorithm for split preconditioned systems is based on the above equation,
but in practice there is no need to compute H. The solver algorithm only
ever applies M
or inv(M)
directly.
These iterative solvers use split preconditioning:
In all cases, the preconditioner M is chosen to accelerate convergence of the iterative method. When the residual error of an iterative solution stagnates or makes little progress between iterations, it often means you need to generate a preconditioner matrix to incorporate into the problem.
The iterative solvers in MATLAB allow you to specify a single preconditioner matrix M, or
two preconditioner matrix factors such that M =
M_{1}M_{2}. This makes it easy to specify a preconditioner that is calculated from a
factorization, such as M = LU. Note that in the split preconditioned case, where M =
HH^{T} also holds, there is not a direct connection between the
M1
and M2
inputs with the H
factors.
In many cases, preconditioners occur naturally in the mathematical model of a given problem. A partial differential equation with variable coefficients can be approximated by one with constant coefficients, for example. In the absence of natural preconditioners, you can use one of the incomplete factorizations in this table.
Function  Factorization  Description 

ilu 
$$\text{A}\approx \text{LU}$$
 Incomplete LU factorization for square or rectangular matrices. 
ichol 
$$\text{A}\approx \text{L}\text{\hspace{0.17em}}{\text{L}}^{*}$$  Incomplete Cholesky factorization for symmetric positive definite matrices. 
See Incomplete Factorizations for more information about ilu
and ichol
.
Consider the fivepoint finite difference approximation to Laplace's equation on a
square, twodimensional domain. The following commands use the preconditioned conjugate
gradient (PCG) method preconditioner M = L*L'
, where
L
is the zerofill incomplete Cholesky factor of
A
.
A = delsq(numgrid('S',250));
b = ones(size(A,1),1);
tol = 1e3;
maxit = 100;
L = ichol(A);
x = pcg(A,b,tol,maxit,L,L');
pcg converged at iteration 92 to a solution with relative residual 0.00076.
pcg
requires 92 iterations to achieve the specified tolerance.
However, using a different preconditioner can yield better results. For example, using
ichol
to construct a modified incomplete Cholesky allows
pcg
to meet the specified tolerance after only 39
iterations.
L = ichol(A,struct('type','nofill','michol','on')); x = pcg(A,b,tol,maxit,L,L');
pcg converged at iteration 39 to a solution with relative residual 0.00098.
For computationally tough problems, you might need a better preconditioner than the one
generated by ilu
or ichol
directly. For example,
you might want to generate a better quality preconditioner or minimize the amount of
computation being done. In these cases, you can use equilibration to
make the coefficient matrix more diagonally dominant (which can lead to a better quality
preconditioner), and reordering to minimize the number of nonzeros in
matrix factors (which can reduce memory requirements and may improve the efficiency of
subsequent calculations).
If you use both equilibration and reordering to generate a preconditioner, the process is:
Use equilibrate
on the coefficient matrix.
Reorder the equilibrated matrix using a sparse matrix reordering function, such as
dissect
or symrcm
.
Here is an example that uses equilibration and reordering to generate a preconditioner for a sparse coefficient matrix.
Create the coefficient matrix A
and a vector of ones
b
for the righthand side of the linear equation. Calculate an
estimation of the condition number for A
.
load west0479
A = west0479;
n = size(A,1);
b = ones(n,1);
condest(A)
ans = 1.4244e+12
Use equilibrate
to improve the condition number of the
coefficient matrix.
[P,R,C] = equilibrate(A); Anew = R*P*A*C; bnew = R*P*b; condest(Anew)
ans = 5.1042e+04
Reorder the equilibrated matrix using dissect
.
q = dissect(Anew); Anew = Anew(q,q); bnew = bnew(q);
Generate a preconditioner using an incomplete LU factorization.
[L,U] = ilu(Anew);
Solve the linear system with gmres
using the preconditioner
matrices, a tolerance of 1e10
, 50 maximum outer iterations, and 30
inner iterations.
tol = 1e10; maxit = 50; restart = 30; [xnew, flag, relres] = gmres(Anew,bnew,restart,tol,maxit,L,U); x(q) = xnew; x = C*x(:);
Now, compare the relres
relative residual returned by gmres
(which includes the
preconditioners) to the relative residual without the preconditioners
resnew
and the relative residual without equilibration
res
. The results show that even though the linear systems are all
equivalent, the different methods apply different weights to each element, and this can
significantly affect the value of the residual.
relres resnew = norm(Anew*xnew  bnew) / norm(bnew) res = norm(A*x  b) / norm(b)
relres = 8.7537e11 resnew = 3.6805e08 res = 5.1415e04
The iterative solvers in MATLAB do not require that you provide a numeric matrix for
A
. Since the calculations performed by the solvers primarily use the
result of the matrixvector multiplication A*x
or
A'*x
, you can instead provide a function that calculates the result of
those linear operations. A function that calculates these quantities is often called a
linear operator.
In addition to using a linear operator instead of a coefficient matrix
A
, you can also use a linear operator instead of a matrix for the
preconditioner M
. In that case, the function needs to calculate
M\x
or M'\x
.
Using linear operators enables you to exploit patterns in A
or
M
to calculate the value of the linear operations more efficiently than
if the solver carried out the full matrixvector multiplications. It also means you do not
need the memory to store the coefficient or preconditioner matrices, since the linear
operator typically calculates the result of the matrixvector multiplication without forming
the matrix at all.
For example, consider the coefficient matrix
A = [2 1 0 0 0 0; 1 2 1 0 0 0; 0 1 2 1 0 0; 0 0 1 2 1 0; 0 0 0 1 2 1; 0 0 0 0 1 2];
When this coefficient matrix multiplies a vector, the only nonzero elements are the ones
that multiply the tridiagonal elements of A
. So, for a given vector
x
, the linear operator function simply needs to add together three
vectors to calculate the value of
A*x
:
function y = linearOperatorA(x) y = 1*[0; x(1:end1)] ... + 2*x ... + 1*[x(2:end); 0]; end
Most iterative solvers require the linear operator function for A
to
return the value of A*x
. Likewise, for the preconditioner matrix
M
, the function generally must calculate M\x
. For
the solvers lsqr
, qmr
, and
bicg
, the linear operator function needs to also return the value for
A'*x
or M'\x
when requested. See the iterative
solver reference pages for examples and descriptions of linear operator functions.
[1] Barrett, R., M. Berry, T. F. Chan, et al., Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, SIAM, Philadelphia, 1994.
[2] Saad, Yousef, Iterative Methods for Sparse Linear Equations. PWS Publishing Company, 1996.