The residual equation of a nonlinear PDE is as follows:

To obtain a discretized residual equation, apply the finite element method (FEM)
to a partial differential equation as described in Finite Element Method Basics:

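The nonlinear solver uses a Gauss-Newton iteration scheme applied to the finite
element matrices. Use a Taylor series expansion about the current iterate $${U}^{n}$$ to obtain the
linearized system for the residual:

$$\rho \left({U}^{n+1}\right)=\rho \left({U}^{n}\right)+\frac{\partial \rho \left({U}^{n}\right)}{\partial U}\left({U}^{n+1}-{U}^{n}\right)+\dots $$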

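Neglecting the higher-order terms, write the linearized system of equations
as

$$\frac{\partial \rho \left({U}^{n}\right)}{\partial U}\left({U}^{n+1}-{U}^{n}\right)=-\rho \left({U}^{n}\right)$$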

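The descent direction for the residual is

$${p}_{n}=-{\left(\frac{\partial \rho \left({U}^{n}\right)}{\partial U}\right)}^{-1}\rho \left({U}^{n}\right)$$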

The Gauss-Newton iteration minimizes the residual, that is, it seeks the solution of $${\mathrm{min}}_{U}\Vert \rho \left(U\right)\Vert $$, using the update

$${U}^{n+1}={U}^{n}+\alpha {p}_{n}$$

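Here, $$\alpha \le 1$$ is a positive number that must be set as large as possible so that
the step has a reasonable descent. For a sufficiently small $$\alpha $$,

$$\Vert \rho \left({U}^{n}+\alpha {p}_{n}\right)\Vert <\Vert \rho \left({U}^{n}\right)\Vert $$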

For the Gauss-Newton algorithm to converge, $${U}^{0}$$ must be close enough to the solution. The first guess is often
outside the region of convergence. The Armijo-Goldstein line search (a damping
strategy for choosing $$\alpha $$) helps to improve convergence from bad initial guesses. This
method chooses the largest damping coefficient $$\alpha $$ out of the sequence 1, 1/2, 1/4, ...
such that the following inequality holds:

$$\Vert \rho \left({U}^{n}\right)\Vert -\Vert \rho \left({U}^{n}+\alpha {p}_{n}\right)\Vert \ge \frac{\alpha }{2}\Vert \rho \left({U}^{n}\right)\Vert $$

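Using the Armijo-Goldstein line search guarantees that each accepted step reduces the residual norm
to at most a factor of $$1-\alpha /2$$ of its previous value, that is, $$\Vert \rho \left({U}^{n}+\alpha {p}_{n}\right)\Vert \le \left(1-\alpha /2\right)\Vert \rho \left({U}^{n}\right)\Vert $$. Each step of the line-search algorithm must evaluate the residual norm $$\Vert \rho \left({U}^{n}+\alpha {p}_{n}\right)\Vert $$.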
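The damped iteration with a halving line search can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the solver's actual implementation: the function names `solve_linear` and `damped_newton`, the tolerance, the cutoff `min_alpha`, and the scalar test residual arctan(U) are all hypothetical choices made for this example.

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.
    Intended only for the small dense systems of this sketch."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in v))

def damped_newton(residual, jacobian, U0, tol=1e-10, max_iter=50,
                  min_alpha=2.0 ** -20):
    """Damped iteration U <- U + alpha * p with a halving line search.
    tol, max_iter, and min_alpha are assumed values for this sketch."""
    U = list(U0)
    alphas = []  # damping coefficient accepted at each iteration
    for _ in range(max_iter):
        rho = residual(U)
        rho_norm = norm(rho)
        if rho_norm < tol:
            break
        # Descent direction: solve J p = -rho for p.
        p = solve_linear(jacobian(U), [-r for r in rho])
        alpha = 1.0
        while alpha >= min_alpha:
            trial = [u + alpha * d for u, d in zip(U, p)]
            # Sufficient decrease: ||rho(U)|| - ||rho(trial)|| >= (alpha/2) ||rho(U)||
            if rho_norm - norm(residual(trial)) >= 0.5 * alpha * rho_norm:
                break
            alpha *= 0.5  # try the next value in 1, 1/2, 1/4, ...
        else:
            raise RuntimeError("line search failed to find a descent step")
        U = trial
        alphas.append(alpha)
    return U, alphas

# Hypothetical scalar test problem: rho(U) = arctan(U), with root U = 0.
# From U0 = 2 the undamped step overshoots and increases |rho|.
U_sol, alphas = damped_newton(
    lambda U: [math.atan(U[0])],
    lambda U: [[1.0 / (1.0 + U[0] ** 2)]],
    [2.0],
)
print(U_sol, alphas)
```

On this test problem the first step is damped (the full step is rejected and the coefficient 1/2 is accepted), while the later steps, taken closer to the root, are accepted with the full coefficient 1.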

With this strategy, when $${U}^{n}$$ approaches the solution, $$\alpha \to 1$$, and thus the convergence rate increases.
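As a small worked illustration of an undamped step near the solution, consider the scalar residual $$\rho \left(U\right)={U}^{2}-2$$ with root $$\sqrt{2}$$. This residual and the starting value are assumptions chosen so that the arithmetic is exact; they are not taken from the solver. Starting from $${U}^{n}=3/2$$:

$$\rho \left({U}^{n}\right)=\frac{1}{4},\qquad \frac{\partial \rho \left({U}^{n}\right)}{\partial U}=2{U}^{n}=3,\qquad {p}_{n}=-\frac{1}{3}\cdot \frac{1}{4}=-\frac{1}{12}$$

With $$\alpha =1$$, the trial point and its residual are

$${U}^{n}+{p}_{n}=\frac{17}{12},\qquad \rho \left({U}^{n}+{p}_{n}\right)=\frac{289}{144}-2=\frac{1}{144}$$

The sufficient-decrease test with $$\alpha =1$$ then reads

$$\Vert \rho \left({U}^{n}\right)\Vert -\Vert \rho \left({U}^{n}+{p}_{n}\right)\Vert =\frac{35}{144}\ge \frac{1}{2}\cdot \frac{1}{4}=\frac{18}{144}$$

so the full step is accepted, and the residual norm drops from 1/4 to 1/144 in a single iteration.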