It often happens that minimization algorithms (including zero finders, such as Newton's method or Broyden's method) fail to converge if the initial approximation is chosen far away from the actual solution. Nevertheless, it is possible to improve the current approximation in the direction of minimization even if the method suggests too large a step that overshoots the solution. Consider the trivial example of solving a scalar equation $f(x) = 0$ with solution $x^*$ using Newton's method:

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$$

If the initial approximation $x_0$ is chosen far enough from $x^*$, the full Newton step overshoots and the sequence of iterates diverges instead of converging.

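To make the overshoot concrete, here is a minimal sketch in plain Pharo. The function $\arctan x$ (whose only root is $x = 0$) and the starting point $x_0 = 2$ are assumptions chosen purely for illustration: for this function the full Newton step overshoots the root and the iterates grow in magnitude.

	| x |
	x := 2.0.
	6 timesRepeat: [
		"Full Newton step for f(x) = arctan(x): x - f(x)/f'(x), with f'(x) = 1/(1 + x^2)"
		x := x - ((1.0 + (x * x)) * x arcTan).
		Transcript show: x printString; cr ]
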
However, it is possible to recover from this divergence, since Newton's method takes a step towards the minimization of $|f(x)|$ but overshoots. Thus, it is reasonable to find a better approximation somewhere between $x_n$ and $x_n + \delta x$, where $\delta x$ is the suggested step. One way to do this is to use the bisection method: indeed, we know the minimum lies between these two points, so the first point to try would be the midpoint $x_n + \delta x / 2$ (see R.W. Hamming, Numerical Methods for Scientists and Engineers). A different approach was suggested in Numerical Recipes by W.H. Press et al.: we usually know $f(x_n)$, $f'(x_n)$ (the derivative of $f$ at $x_n$) and $f(x_n + \delta x)$, and these values can be used to approximate $f$ on the line between $x_n$ and $x_n + \delta x$ by a quadratic function. Since this method is also applicable to the vector-valued problem $\mathbf{F}(\mathbf{x}) = \mathbf{0}$, the following derivation will be conducted for the more general vector-valued case.

Derivation

Consider an iteration of a non-linear solver that suggests taking the step $\delta\mathbf{x}$ from the current approximation $\mathbf{x}_n$. The line search algorithm aims to find $\lambda \in (0, 1]$ such that $\mathbf{x}_n + \lambda\,\delta\mathbf{x}$ will minimize the norm of $\mathbf{F}$. This can be stated as a minimization of $f(\mathbf{x}) = \tfrac{1}{2}\,\mathbf{F}(\mathbf{x}) \cdot \mathbf{F}(\mathbf{x})$. Since we only look for the minimum on the line $\mathbf{x}_n + \lambda\,\delta\mathbf{x}$, we can simplify this problem further by introducing $g(\lambda) = f(\mathbf{x}_n + \lambda\,\delta\mathbf{x})$ and minimizing it for $\lambda \in (0, 1]$. The derivative of $g$ is:

$$g'(\lambda) = \nabla f(\mathbf{x}_n + \lambda\,\delta\mathbf{x}) \cdot \delta\mathbf{x} = \mathbf{F}(\mathbf{x}_n + \lambda\,\delta\mathbf{x})^{T}\, J(\mathbf{x}_n + \lambda\,\delta\mathbf{x})\,\delta\mathbf{x}$$

where $J$ is the Jacobian matrix, $J_{ij} = \partial F_i / \partial x_j$. When $\lambda = 0$ this simplifies to $g'(0) = \mathbf{F}(\mathbf{x}_n)^{T} J(\mathbf{x}_n)\,\delta\mathbf{x}$. Notice that if $\delta\mathbf{x}$ is a Newtonian step, $\delta\mathbf{x} = -J(\mathbf{x}_n)^{-1}\mathbf{F}(\mathbf{x}_n)$, it simplifies further to $g'(0) = -\mathbf{F}(\mathbf{x}_n) \cdot \mathbf{F}(\mathbf{x}_n) = -2\,g(0)$.
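
For completeness, the gradient of $f$ used above follows directly from the chain rule; written component-wise:

$$\frac{\partial f}{\partial x_j} = \frac{\partial}{\partial x_j}\left(\frac{1}{2}\sum_i F_i(\mathbf{x})^2\right) = \sum_i F_i(\mathbf{x})\,\frac{\partial F_i}{\partial x_j} = \left(\mathbf{F}(\mathbf{x})^{T} J(\mathbf{x})\right)_j$$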

Before introducing the iteration steps, it should be noted that the line search only needs to be conducted if $g(1) \ge g(0)$, i.e. if the full step does not already decrease $f$. In case the line search is required, the stopping criterion is chosen as:

$$g(\lambda) < g(0) + \alpha\, g'(0)$$

where $\alpha$ is a small empirical parameter (the class provides a default value).

To find the minimum of $g(\lambda)$, it can first be approximated as a quadratic function using $g(0)$, $g'(0)$ and $g(1)$, with the minimum found at

$$\lambda_1 = \frac{-g'(0)}{2\left(g(1) - g(0) - g'(0)\right)}$$

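As a quick sanity check of this formula (the numbers below are illustrative assumptions, not values produced by the library), take $g(0) = 0.5$, $g'(0) = -1$ (a Newtonian step, so $g'(0) = -2g(0)$) and $g(1) = 2$; the quadratic model is then minimized at $\lambda_1 = 1/(2 \cdot 2.5) = 0.2$:

	| g0 dg0 g1 lambda1 |
	g0 := 0.5.
	dg0 := -1.0.
	g1 := 2.0.
	"λ1 = -g'(0) / (2 (g(1) - g(0) - g'(0)))"
	lambda1 := dg0 negated * 0.5 / (g1 - g0 - dg0).
	Transcript show: lambda1 printString; cr	"prints 0.2"
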
Since $g(1) \ge g(0)$ and $g'(0) < 0$, this gives $\lambda_1 \le \tfrac{1}{2}$. If $\lambda_1$ satisfies the minimization criterion, the method stops. Otherwise, $g(\lambda)$ can be further approximated by a cubic, using $g(0)$, $g'(0)$ and the values of $g$ at the last two approximations $\lambda_2$ (second to last) and $\lambda_1$ (last):

$$g(\lambda) \approx a\lambda^3 + b\lambda^2 + g'(0)\,\lambda + g(0)$$

The coefficients $a$ and $b$ are found as:

$$\begin{pmatrix} a \\ b \end{pmatrix} = \frac{1}{\lambda_1 - \lambda_2} \begin{pmatrix} 1/\lambda_1^2 & -1/\lambda_2^2 \\ -\lambda_2/\lambda_1^2 & \lambda_1/\lambda_2^2 \end{pmatrix} \begin{pmatrix} g(\lambda_1) - g'(0)\,\lambda_1 - g(0) \\ g(\lambda_2) - g'(0)\,\lambda_2 - g(0) \end{pmatrix}$$

And the minimum of the cubic function is at:

$$\lambda = \frac{-b + \sqrt{b^2 - 3a\,g'(0)}}{3a}$$

Finally, to prevent $\lambda$ from becoming too large or too small, at each step it is forced to fall within $[0.1\,\lambda_1,\ 0.5\,\lambda_1]$, where $\lambda_1$ is the previous approximation. If $\lambda$ becomes too small for the method to converge (below the threshold held in the instance variable failingMin), the method fails.
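
Continuing the same illustrative numbers (again, assumptions chosen only to exercise the formulas): suppose the quadratic step produced $\lambda_1 = 0.2$ but $g(0.2) = 0.6$ still fails the criterion, while the previous approximation was $\lambda_2 = 1$ with $g(1) = 2$. Transcribing the cubic formulas directly gives the next trial value, which is then clamped to $[0.1\,\lambda_1,\ 0.5\,\lambda_1]$:

	| g0 dg0 lambda1 lambda2 gL1 gL2 deltaL gamma1 gamma2 tmp1 tmp2 a b next |
	g0 := 0.5. dg0 := -1.0.
	lambda1 := 0.2. gL1 := 0.6.	"last approximation"
	lambda2 := 1.0. gL2 := 2.0.	"second to last approximation"
	deltaL := lambda1 - lambda2.
	gamma1 := (gL1 - (dg0 * lambda1) - g0) / deltaL.
	gamma2 := (gL2 - (dg0 * lambda2) - g0) / deltaL.
	tmp1 := gamma1 / lambda1 squared.
	tmp2 := gamma2 / lambda2 squared.
	a := tmp1 - tmp2.	"-6.25"
	b := (lambda1 * tmp2) - (lambda2 * tmp1).	"8.75"
	next := (b negated + (b squared - (3.0 * a * dg0)) sqrt) / (3.0 * a).
	next := (next min: 0.5 * lambda1) max: 0.1 * lambda1.
	Transcript show: next printString; cr	"about 0.061, already inside [0.02, 0.1]"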

Code (with annotations)

In PolyMath-Extensions, the line search is implemented as the class PMLineSearchFunctionMinimizer in the package Math-Scalar-Iterative. This class is a descendant of DhbIterativeProcess (PMIterativeProcess); thus, the main functionality is implemented in the method evaluateIteration:

PMLineSearchFunctionMinimizer>>evaluateIteration
	| a b tmp1 tmp2 gamma1 gamma2 nextX deltaX |
	deltaX := result - previousResult. "λ1 - λ2"
	useCubicApproximation
		ifFalse: [
			"First step - quadratic approximation"
			"λ* = D[g](0)/2(g(1) - g(0) - D[g](0))"
			nextX := derivativeAtZero negated * 0.5 / (valueAtOne - valueAtZero - derivativeAtZero) max: 0.1.
			"After first step - use cubic approximation"
			useCubicApproximation := true ]
		ifTrue: [
			gamma1 := valueAtResult - (derivativeAtZero * result) - valueAtZero / deltaX.
			gamma2 := valueAtPreviousResult - (derivativeAtZero * previousResult) - valueAtZero / deltaX.
			tmp1 := gamma1 / result squared.
			tmp2 := gamma2 / previousResult squared.
			"a and b coefficients"
			a := tmp1 - tmp2.
			b := result * tmp2 - (previousResult * tmp1).
			nextX := (b negated + (b squared - (3.0 * a * derivativeAtZero)) sqrt) / (3.0 * a) min: 0.5 * result max: 0.1 * result.
			"Finish (fail) iterations if next approximation is too small"
			nextX < failingMin ifTrue: [ PMStopIterations new signal ] ].
	"Update λ2 ← λ1, λ1 ← nextX and update function values"
	self updateResult: nextX.
	"precision in this case: just need to be negative"
	^ valueAtResult - (alpha * derivativeAtZero + valueAtZero)

This class also overrides computeInitialValues and stops the evaluation immediately if $g(1) < g(0)$ (or if $g(1)$ is already not greater than the minimal value), since no line search is needed in that case:

PMLineSearchFunctionMinimizer>>computeInitialValues
	"Computes initial values as (1, g(1), 0, g(0))"
	result := 1.0.
	valueAtResult := valueAtOne.
	previousResult := 0.0.
	valueAtPreviousResult := valueAtZero.
	useCubicApproximation := false.
	(valueAtOne < valueAtZero or: [ valueAtOne <= minValue ])
			ifTrue: [ precision := 0.0. PMStopIterations new signal ].

Example of use

The line search is not intended to be used directly by the user, but rather by implementers of other methods. Direct-use examples can be found in the test-case class PMScalarMethodsTestCase. Here is an example of its use in the vector-valued Newton's method:

PMVectorNewtonZeroFinder>>evaluateIteration
	"Compute one step of Newton's zero finding method. Answers the estimated precision."
	| g0 dg0 coefficient |
	"Calculate new Newton step"
	lastFunctionValue := newFunctionValue.
	linearSolver linearOperator: (jacobianOperator value: result);
		rightHandSideVector: lastFunctionValue; initialValue: searchStep negated.
	searchStep := linearSolver evaluate negated.
	linearSolver hasConverged ifFalse: [ PMStopIterations new signal ].
	"Initialize variables for line search"
	g0 := 0.5 * (lastFunctionValue * lastFunctionValue).
	dg0 := -2 * g0.	"Newtonian step, so D[g](0) = -F·F = -2 g(0)"
	"Get coefficient of the step from line search: 0 < coefficient <= 1; newFunctionValue is updated as well"
	coefficient := lineSearch setValueAtZero: g0 derivativeAtZero: dg0; evaluate.
	lineSearch hasConverged ifFalse: [ PMStopIterations new signal ].
	result := coefficient * searchStep + result.
	^ searchStep rootMeanSquareNormWith: errorWeightVector

Quite importantly, this implementation uses the fact that the line search guarantees that the nonlinear function will have been called last on the accepted point $\mathbf{x}_n + \lambda\,\delta\mathbf{x}$, so newFunctionValue is already up to date, thus avoiding unnecessary function evaluations.