bicg
Solve system of linear equations — biconjugate gradients method
Syntax
Description
attempts to solve the system of linear equations x
= bicg(A
,b
)A*x = b
for
x
using the Biconjugate Gradients Method. When the attempt is
successful, bicg
displays a message to confirm convergence. If
bicg
fails to converge after the maximum number of iterations or
halts for any reason, it displays a diagnostic message that includes the relative residual
norm(bA*x)/norm(b)
and the iteration number at which the method
stopped.
[
returns a flag that specifies whether the algorithm successfully converged. When
x
,flag
] = bicg(___)flag = 0
, convergence was successful. You can use this output syntax
with any of the previous input argument combinations. When you specify the
flag
output, bicg
does not display any diagnostic
messages.
Examples
Iterative Solution to Linear System
Solve a square linear system using bicg
with default settings, and then adjust the tolerance and number of iterations used in the solution process.
Create a random sparse matrix A
with 50% density. Also create a random vector b
for the righthand side of $\mathrm{Ax}=\mathit{b}$.
rng default
A = sprand(400,400,.5);
A = A'*A;
b = rand(400,1);
Solve $\mathrm{Ax}=\mathit{b}$ using bicg
. The output display includes the value of the relative residual error $\frac{\Vert \mathit{b}\mathrm{Ax}\Vert}{\Vert \mathit{b}\Vert}$.
x = bicg(A,b);
bicg stopped at iteration 20 without converging to the desired tolerance 1e06 because the maximum number of iterations was reached. The iterate returned (number 7) has relative residual 0.45.
By default bicg
uses 20 iterations and a tolerance of 1e6
, and the algorithm is unable to converge in those 20 iterations for this matrix. Since the residual is still large, it is a good indicator that more iterations (or a preconditioner matrix) are needed. You also can use a larger tolerance to make it easier for the algorithm to converge.
Solve the system again using a tolerance of 1e4
and 100 iterations.
x = bicg(A,b,1e4,100);
bicg stopped at iteration 100 without converging to the desired tolerance 0.0001 because the maximum number of iterations was reached. The iterate returned (number 7) has relative residual 0.45.
Even with a looser tolerance and more iterations, the residual error does not improve much. When an iterative algorithm stalls in this manner, it is a good indication that a preconditioner matrix is needed.
Calculate the incomplete Cholesky factorization of A
, and use the L'
factor as a preconditioner input to bicg
.
L = ichol(A); x = bicg(A,b,1e4,100,L');
bicg converged at iteration 65 to a solution with relative residual 6.7e05.
Using a preconditioner improves the numerical properties of the problem enough that bicg
is able to converge.
Using bicg
with Preconditioner
Examine the effect of using a preconditioner matrix with bicg
to solve a linear system.
Load west0479, a real 479by479 nonsymmetric sparse matrix.
load west0479
A = west0479;
Define b
so that the true solution to $\mathrm{Ax}=\mathit{b}$ is a vector of all ones.
b = sum(A,2);
Set the tolerance and maximum number of iterations.
tol = 1e12; maxit = 20;
Use bicg
to find a solution at the requested tolerance and number of iterations. Specify five outputs to return information about the solution process:
x
is the computed solution toA*x = b
.fl0
is a flag indicating whether the algorithm converged.rr0
is the relative residual of the computed answerx
.it0
is the iteration number whenx
was computed.rv0
is a vector of the residual history for $\Vert \mathit{b}\mathrm{Ax}\Vert $.
[x,fl0,rr0,it0,rv0] = bicg(A,b,tol,maxit); fl0
fl0 = 1
rr0
rr0 = 1
it0
it0 = 0
fl0
is 1 because bicg
does not converge to the requested tolerance 1e12
within the requested 20 iterations. In fact, the behavior of bicg
is so poor that the initial guess x0 = zeros(size(A,2),1)
is the best solution and is returned, as indicated by it0 = 0
.
To aid with the slow convergence, you can specify a preconditioner matrix. Since A
is nonsymmetric, use ilu
to generate the preconditioner $\mathit{M}=\mathit{L}\text{\hspace{0.17em}}\mathit{U}$. Specify a drop tolerance to ignore nondiagonal entries with values smaller than 1e6
. Solve the preconditioned system ${\mathit{M}}^{1}\text{\hspace{0.17em}}\mathit{A}\text{\hspace{0.17em}}\mathit{x}={\mathit{M}}^{1}\text{\hspace{0.17em}}\mathit{b}$ by specifying L
and U
as inputs to bicg
.
setup = struct('type','ilutp','droptol',1e6); [L,U] = ilu(A,setup); [x1,fl1,rr1,it1,rv1] = bicg(A,b,tol,maxit,L,U); fl1
fl1 = 0
rr1
rr1 = 4.1410e14
it1
it1 = 6
The use of an ilu
preconditioner produces a relative residual less than the prescribed tolerance of 1e12
at the sixth iteration. The output rv1(1)
is norm(b)
, and the output rv1(end)
is norm(bA*x1)
.
You can follow the progress of bicg
by plotting the relative residuals at each iteration. Plot the residual history of each solution with a line for the specified tolerance.
semilogy(0:length(rv0)1,rv0/norm(b),'o') hold on semilogy(0:length(rv1)1,rv1/norm(b),'o') yline(tol,'r'); legend('No preconditioner','ILU preconditioner','Tolerance','Location','East') xlabel('Iteration number') ylabel('Relative residual')
Supplying Initial Guess
Examine the effect of supplying bicg
with an initial guess of the solution.
Create a tridiagonal sparse matrix. Use the sum of each row as the vector for the righthand side of $\mathrm{Ax}=\mathit{b}$ so that the expected solution for $\mathit{x}$ is a vector of ones.
n = 900; e = ones(n,1); A = spdiags([e 2*e e],1:1,n,n); b = sum(A,2);
Use bicg
to solve $\mathrm{Ax}=\mathit{b}$ twice: one time with the default initial guess, and one time with a good initial guess of the solution. Use 200 iterations and the default tolerance for both solutions. Specify the initial guess in the second solution as a vector with all elements equal to 0.99
.
maxit = 200; x1 = bicg(A,b,[],maxit);
bicg converged at iteration 35 to a solution with relative residual 9.5e07.
x0 = 0.99*e; x2 = bicg(A,b,[],maxit,[],[],x0);
bicg converged at iteration 7 to a solution with relative residual 8.7e07.
In this case supplying an initial guess enables bicg
to converge more quickly.
Returning Intermediate Results
You also can use the initial guess to get intermediate results by calling bicg
in a forloop. Each call to the solver performs a few iterations and stores the calculated solution. Then you use that solution as the initial vector for the next batch of iterations.
For example, this code performs 100 iterations four times and stores the solution vector after each pass in the forloop:
x0 = zeros(size(A,2),1); tol = 1e8; maxit = 100; for k = 1:4 [x,flag,relres] = bicg(A,b,tol,maxit,[],[],x0); X(:,k) = x; R(k) = relres; x0 = x; end
X(:,k)
is the solution vector computed at iteration k
of the forloop, and R(k)
is the relative residual of that solution.
Using Function Handle Instead of Numeric Matrix
Solve a linear system by providing bicg
with a function handle that computes A*x
and A'*x
in place of the coefficient matrix A
.
Create an nonsymmetric tridiagonal matrix. Preview the matrix.
A = gallery('wilk',21) + diag(ones(20,1),1)
A = 21×21
10 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 9 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 8 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 7 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 6 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 5 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 4 2 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 3 2 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 2 2 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1 1 2 0 0 0 0 0 0 0 0 0 0
⋮
Since this tridiagonal matrix has a special structure, you can represent the operation A*x
with a function handle. When A
multiplies a vector, most of the elements in the resulting vector are zeros. The nonzero elements in the result correspond with the nonzero tridiagonal elements of A
.
The expression $\mathit{A}\text{\hspace{0.17em}}\mathit{x}$ becomes:
$\mathit{A}\text{\hspace{0.17em}}\mathit{x}=\left[\begin{array}{ccccccc}10& 2& 0& \cdots & & \cdots & 0\\ 1& 9& 2& 0& & & \vdots \\ 0& 1& \ddots & 2& 0& & \\ \vdots & 0& 1& 0& \ddots & \ddots & \vdots \\ & & 0& \ddots & 1& \ddots & 0\\ \vdots & & & \ddots & \ddots & \ddots & 2\\ 0& \cdots & & \cdots & 0& 1& 10\end{array}\right]\left[\begin{array}{c}{\mathit{x}}_{1}\\ {\mathit{x}}_{2}\\ {\mathit{x}}_{3}\\ \vdots \\ \vdots \\ {\mathit{x}}_{21}\end{array}\right]=\left[\begin{array}{c}{10\mathit{x}}_{1}+2{\mathit{x}}_{2}\\ {\mathit{x}}_{1}+9{\mathit{x}}_{2}+2{\mathit{x}}_{3}\\ \vdots \\ \vdots \\ {\mathit{x}}_{19}+9{\mathit{x}}_{20}+2{\mathit{x}}_{21}\\ {\mathit{x}}_{20}+10{\mathit{x}}_{21}\end{array}\right]$.
The resulting vector can be written as the sum of three vectors:
$\mathit{A}\text{\hspace{0.17em}}\mathit{x}=\left[\begin{array}{c}{10\mathit{x}}_{1}+2{\mathit{x}}_{2}\\ {\mathit{x}}_{1}+9{\mathit{x}}_{2}+2{\mathit{x}}_{3}\\ \vdots \\ \vdots \\ {\mathit{x}}_{19}+9{\mathit{x}}_{20}+2{\mathit{x}}_{21}\\ {\mathit{x}}_{20}+10{\mathit{x}}_{21}\end{array}\right]$=$\left[\begin{array}{c}0\\ {\mathit{x}}_{1}\\ {\mathit{x}}_{2}\\ \vdots \\ {\mathit{x}}_{20}\end{array}\right]+\left[\begin{array}{c}{10\mathit{x}}_{1}\\ {9\mathit{x}}_{2}\\ \vdots \\ 9{\mathit{x}}_{20}\\ 10{\mathit{x}}_{21}\end{array}\right]+2\cdot \left[\begin{array}{c}{\mathit{x}}_{2}\\ {\mathit{x}}_{3}\\ \vdots \\ {\mathit{x}}_{21}\\ 0\end{array}\right]$.
Likewise, the expression for ${\mathit{A}}^{\mathit{T}}\text{\hspace{0.17em}}\mathit{x}$ becomes:
${\mathit{A}}^{\mathit{T}}\text{\hspace{0.17em}}\mathit{x}=\left[\begin{array}{ccccccc}10& 1& 0& \cdots & & \cdots & 0\\ 2& 9& 1& 0& & & \vdots \\ 0& 2& \ddots & 1& 0& & \\ \vdots & 0& 2& 0& \ddots & \ddots & \vdots \\ & & 0& \ddots & 1& \ddots & 0\\ \vdots & & & \ddots & \ddots & \ddots & 1\\ 0& \cdots & & \cdots & 0& 2& 10\end{array}\right]\left[\begin{array}{c}{\mathit{x}}_{1}\\ {\mathit{x}}_{2}\\ {\mathit{x}}_{3}\\ \vdots \\ \vdots \\ {\mathit{x}}_{21}\end{array}\right]=\left[\begin{array}{c}{10\mathit{x}}_{1}+{\mathit{x}}_{2}\\ {2\mathit{x}}_{1}+9{\mathit{x}}_{2}+{\mathit{x}}_{3}\\ \vdots \\ \vdots \\ {2\mathit{x}}_{19}+9{\mathit{x}}_{20}+{\mathit{x}}_{21}\\ {2\mathit{x}}_{20}+10{\mathit{x}}_{21}\end{array}\right]$.
${\mathit{A}}^{\mathit{T}}\text{\hspace{0.17em}}\mathit{x}=\left[\begin{array}{c}{10\mathit{x}}_{1}+{\mathit{x}}_{2}\\ {2\mathit{x}}_{1}+9{\mathit{x}}_{2}+{\mathit{x}}_{3}\\ \vdots \\ \vdots \\ {2\mathit{x}}_{19}+9{\mathit{x}}_{20}+{\mathit{x}}_{21}\\ {2\mathit{x}}_{20}+10{\mathit{x}}_{21}\end{array}\right]=2\cdot \left[\begin{array}{c}0\\ {\mathit{x}}_{1}\\ {\mathit{x}}_{2}\\ \vdots \\ {\mathit{x}}_{20}\end{array}\right]+\left[\begin{array}{c}{10\mathit{x}}_{1}\\ {9\mathit{x}}_{2}\\ \vdots \\ 9{\mathit{x}}_{20}\\ 10{\mathit{x}}_{21}\end{array}\right]+\left[\begin{array}{c}{\mathit{x}}_{2}\\ {\mathit{x}}_{3}\\ \vdots \\ {\mathit{x}}_{21}\\ 0\end{array}\right]$.
In MATLAB®, write a function that creates these vectors and adds them together, giving the value of A*x
or A'*x
, depending on the flag input:
function y = afun(x,flag) if strcmp(flag,'notransp') % Compute A*x y = [0; x(1:20)] ... + [(10:1:0)'; (1:10)'].*x ... + 2*[x(2:end); 0]; elseif strcmp(flag,'transp') % Compute A'*x y = 2*[0; x(1:20)] ... + [(10:1:0)'; (1:10)'].*x ... + [x(2:end); 0]; end end
(This function is saved as a local function at the end of the example.)
Now, solve the linear system $\mathrm{Ax}=\mathit{b}$ by providing bicg
with the function handle that calculates A*x
and A'*x
. Use a tolerance of 1e6
and 25 iterations. Specify $\mathit{b}$ as the row sums of $\mathit{A}$ so that the true solution for $\mathit{x}$ is a vector of ones.
b = full(sum(A,2)); tol = 1e6; maxit = 25; x1 = bicg(@afun,b,tol,maxit)
bicg converged at iteration 19 to a solution with relative residual 4.8e07.
x1 = 21×1
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
⋮
Local Functions
function y = afun(x,flag) if strcmp(flag,'notransp') % Compute A*x y = [0; x(1:20)] ... + [(10:1:0)'; (1:10)'].*x ... + 2*[x(2:end); 0]; elseif strcmp(flag,'transp') % Compute A'*x y = 2*[0; x(1:20)] ... + [(10:1:0)'; (1:10)'].*x ... + [x(2:end); 0]; end end
Input Arguments
A
— Coefficient matrix
matrix  function handle
Coefficient matrix, specified as a square matrix or function handle. This matrix is
the coefficient matrix in the linear system A*x = b
. Generally,
A
is a large sparse matrix or a function handle that returns the
product of a large sparse matrix and column vector.
Specifying A
as a Function Handle
You can optionally specify the coefficient matrix as a function handle instead of a matrix. The function handle returns matrixvector products instead of forming the entire coefficient matrix, making the calculation more efficient.
To use a function handle, use the function signature function y =
afun(x,opt)
. Parameterizing Functions explains how to
provide additional parameters to the function afun
, if necessary. The
function afun
must satisfy these conditions:
afun(x,'notransp')
returns the productA*x
.afun(x,'transp')
returns the productA'*x
.
An example of an acceptable function is:
function y = afun(x,opt,B,C,n) if strcmp(opt,'notransp') y = [B*x(n+1:end); C*x(1:n)]; else y = [C'*x(n+1:end); B'*x(1:n)]; end
afun
uses the values in B
and
C
to compute either A*x
or A'*x
(depending on the specified flag) without actually forming the entire matrix.
Data Types: double
 function_handle
Complex Number Support: Yes
b
— Righthand side of linear equation
column vector
Righthand side of linear equation, specified as a column vector. The length of
b
must be equal to
size(A,1)
.
Data Types: double
Complex Number Support: Yes
tol
— Method tolerance
[]
or
1e6
(default)  positive scalar
Method tolerance, specified as a positive scalar. Use this input to tradeoff accuracy and
runtime in the calculation. bicg
must meet the tolerance within the number of allowed iterations
to be successful. A smaller value of tol
means the answer must be more precise for the calculation to be
successful.
Data Types: double
maxit
— Maximum number of iterations
[]
or min(size(A,1),20)
(default)  positive scalar integer
Maximum number of iterations, specified as a positive scalar integer. Increase the value of
maxit
to allow more iterations for
bicg
to meet the tolerance tol
.
Generally, a smaller value of tol
means more iterations are
required to successfully complete the calculation.
M
, M1
, M2
— Preconditioner matrices (as separate arguments)
eye(size(A))
(default)  matrices  function handles
Preconditioner matrices, specified as separate arguments of matrices or function
handles. You can specify a preconditioner matrix M
or its matrix
factors M = M1*M2
to improve the numerical aspects of the linear
system and make it easier for bicg
to converge quickly. For
square coefficient matrices, you can use the incomplete matrix factorization functions
ilu
and ichol
to generate preconditioner matrices. You also can use equilibrate
prior to factorization to improve the condition number of
the coefficient matrix. For more information on preconditioners, see Iterative Methods for Linear Systems.
bicg
treats unspecified preconditioners as identity
matrices.
Specifying M
as a Function Handle
You can optionally specify any of M
, M1
, or
M2
as function handles instead of matrices. The function
handle performs matrixvector operations instead of forming the entire
preconditioner matrix, making the calculation more efficient.
To use a function handle, first create a function with the signature
function y = mfun(x,opt)
. Parameterizing Functions explains
how to provide additional parameters to the function mfun
, if
necessary. The function mfun
must satisfy these conditions:
mfun(x,'notransp')
returns the value ofM\x
orM2\(M1\x)
.mfun(x,'transp')
returns the value ofM'\x
orM1'\(M2'\x)
.
An example of an acceptable function is:
function y = mfun(x,opt,a,b) if strcmp(opt,'notransp') y = x.*a; else y = x.*b; end end
mfun
uses a
and
b
to compute either M\x = x*a
or
M'\x = x*b
(depending on the specified flag) without actually
forming the entire matrix M
.
Data Types: double
 function_handle
Complex Number Support: Yes
x0
— Initial guess
[]
or a column vector of zeros (default)  column vector
Initial guess, specified as a column vector with length equal to size(A,2)
.
If you can provide bicg
with a more reasonable initial guess
x0
than the default vector of zeros, then it can save computation
time and help the algorithm converge faster.
Data Types: double
Complex Number Support: Yes
Output Arguments
x
— Linear system solution
column vector
Linear system solution, returned as a column vector. This output gives the approximate
solution to the linear system A*x = b
. If the calculation is
successful (flag = 0
), then relres
is less than
or equal to tol
.
Whenever the calculation is not successful (flag ~= 0
), the solution
x
returned by bicg
is the one with
minimal residual norm computed over all the iterations.
flag
— Convergence flag
scalar
Convergence flag, returned as one of the scalar values in this table. The convergence flag indicates whether the calculation was successful and differentiates between several different forms of failure.
Flag Value  Convergence 

 Success — 
 Failure — 
 Failure — The preconditioner matrix 
 Failure — 
 Failure — One of the scalar quantities calculated by the

relres
— Relative residual error
scalar
iter
— Iteration number
scalar
Iteration number, returned as a scalar. This output indicates the iteration number at which
the computed answer for x
was
calculated.
Data Types: double
resvec
— Residual error
vector
Residual error, returned as a vector. The residual error norm(bA*x)
reveals how close the algorithm is to converging for a given value of
x
. The number of elements in resvec
is equal
to the number of iterations. You can examine the contents of resvec
to help decide whether to change the values of tol
or
maxit
.
Data Types: double
More About
Biconjugate Gradients Method
The biconjugate gradients (BiCG) algorithm was developed to generalize the conjugate gradient (CG) method to nonsymmetric systems. BiCG solves not only the original linear system $$Ax=b$$ but also the conjugate system $${\text{A}}^{T}{x}^{*}={b}^{*}$$. This leads to two sets of conjugate residuals defined in terms of the transpose of the coefficient matrix.
For symmetric positive definite systems, which the CG algorithm is designed for, the BiCG algorithm delivers the same results but with twice the cost per iteration. The accuracy of BiCG can be comparable to GMRES, but between the two only GMRES truly minimizes the residual. Several variants of the BiCG algorithm were developed to address the irregular convergence behavior it displays (see BiCGSTAB, BiCGSTABL, and CGS) [1].
Tips
Convergence of most iterative methods depends on the condition number of the coefficient matrix,
cond(A)
. WhenA
is square, you can useequilibrate
to improve its condition number, and on its own this makes it easier for most iterative solvers to converge. However, usingequilibrate
also leads to better quality preconditioner matrices when you subsequently factor the equilibrated matrixB = R*P*A*C
.You can use matrix reordering functions such as
dissect
andsymrcm
to permute the rows and columns of the coefficient matrix and minimize the number of nonzeros when the coefficient matrix is factored to generate a preconditioner. This can reduce the memory and time required to subsequently solve the preconditioned linear system.
References
[1] Barrett, R., M. Berry, T.F. Chan, et al., Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, SIAM, Philadelphia, 1994.
Extended Capabilities
ThreadBased Environment
Run code in the background using MATLAB® backgroundPool
or accelerate code with Parallel Computing Toolbox™ ThreadPool
.
This function fully supports threadbased environments. For more information, see Run MATLAB Functions in ThreadBased Environment.
GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.
Usage notes and limitations:
When input
A
is a sparse matrix:Only one sparse matrix preconditioner
M
is supported.If you use two preconditioners,
M1
andM2
, then both of them must be functions.For GPU arrays,
bicg
does not detect stagnation (Flag 3). Instead, it reports failure to converge (Flag 1).
For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
Distributed Arrays
Partition large arrays across the combined memory of your cluster using Parallel Computing Toolbox™.
Usage notes and limitations:
If
M1
is a function, then it is applied independently to each row.
For more information, see Run MATLAB Functions with Distributed Arrays (Parallel Computing Toolbox).
Version History
Introduced before R2006a
Open Example
You have a modified version of this example. Do you want to open this example with your edits?
MATLAB Command
You clicked a link that corresponds to this MATLAB command:
Run the command by entering it in the MATLAB Command Window. Web browsers do not support MATLAB commands.
Select a Web Site
Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .
You can also select a web site from the following list:
How to Get Best Site Performance
Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.
Americas
 América Latina (Español)
 Canada (English)
 United States (English)
Europe
 Belgium (English)
 Denmark (English)
 Deutschland (Deutsch)
 España (Español)
 Finland (English)
 France (Français)
 Ireland (English)
 Italia (Italiano)
 Luxembourg (English)
 Netherlands (English)
 Norway (English)
 Österreich (Deutsch)
 Portugal (English)
 Sweden (English)
 Switzerland
 United Kingdom (English)