The vector $-H^{-1}\mathbf{g}$ is the Newton direction and, when evaluated at any $\mathbf{w}$ on a quadratic error surface, it will point to the minimum of the error function $\mathbf{w}^{*}$. The algorithm is iterated to minimize the error surface since, in reality, it will only be approximately quadratic near a minimum:

$$\mathbf{w}_{j+1} = \mathbf{w}_{j} - H^{-1}\mathbf{g}_{j} \qquad (4.27)$$
If we consider the relationship between the weight and gradient vectors generated at two successive steps, then we derive the quasi-Newton condition:

$$\mathbf{w}_{j+1} - \mathbf{w}_{j} = H^{-1}\left(\mathbf{g}_{j+1} - \mathbf{g}_{j}\right) \qquad (4.28)$$
The computational cost of generating the inverse Hessian is prohibitive, so quasi-Newton algorithms operate by iteratively generating increasingly accurate approximations G to the inverse Hessian matrix. The approximation must be constructed to satisfy the quasi-Newton condition in (4.28).
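As an illustration, the sketch below applies the classic BFGS update, one standard way of constructing an approximation G that satisfies (4.28). This is a minimal example of the technique, not code from the source; the names (bfgs_update, s, y) are our own.

```python
import numpy as np

def bfgs_update(G, s, y):
    """One BFGS update of the inverse-Hessian approximation G.

    s = w_{j+1} - w_j : step taken in weight space
    y = g_{j+1} - g_j : corresponding change in the gradient
    The updated G satisfies the quasi-Newton condition G @ y = s.
    """
    rho = 1.0 / (y @ s)
    I = np.eye(len(s))
    V = I - rho * np.outer(s, y)
    return V @ G @ V.T + rho * np.outer(s, s)
```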
4.4.3 Levenberg-Marquardt Algorithms
The LM algorithm provides a numerical solution to minimization problems that are generally nonlinear in type. LM optimization is considered a virtual standard in nonlinear optimization and significantly outperforms similar algorithms such as gradient descent, CG, and quasi-Newton BFGS methods, particularly for medium-sized problems such as those encountered in hydrology. The LM algorithm can be defined as a pseudo-second-order method that works with only function evaluations and gradient information, estimating the Hessian matrix using the sum of outer products of the gradients.
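For a sum-of-squares error, the per-residual gradients are the rows of the residual Jacobian J, so this estimate amounts to the Gauss-Newton approximation $H \approx J^{T}J$. A minimal sketch of that construction follows; the function name is our own, and the Jacobian is assumed to be supplied by the caller.

```python
import numpy as np

def hessian_from_gradients(J):
    """Estimate the Hessian of a sum-of-squares error as the sum of
    outer products of the per-residual gradients (the rows of the
    Jacobian J); this equals the Gauss-Newton approximation J.T @ J."""
    J = np.asarray(J)                      # shape: (n_residuals, n_weights)
    return sum(np.outer(g, g) for g in J)  # identical to J.T @ J
```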
As defined in the previous sections, consider our rule-based quadratic approximation in (4.27). This quadratic rule is not universally better, since it assumes a linear approximation of f's dependence on w, which is only valid near a minimum. Levenberg's technique mixes these two extremes. The advantage is that one can use a steepest-descent-type method until approaching a minimum, and then gradually switch to the quadratic rule. Levenberg's algorithm is formalized as shown in (4.29), which assumes a blending factor $\lambda$ that handles the mix between steepest descent and the quadratic approximation. The new updated form of the equation is
$$\mathbf{w}_{j+1} = \mathbf{w}_{j} - \left(H + \lambda I\right)^{-1}\mathbf{g}_{j} \qquad (4.29)$$
where the term I is the identity matrix. As $\lambda$ becomes small, the rule approaches the quadratic approximation update rule as mentioned above. If $\lambda$ is large, the rule converges to steepest descent with a small step size, since $(H + \lambda I)^{-1} \approx \lambda^{-1} I$.
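To make the blending concrete, here is a minimal sketch of one damped update step per (4.29), using the common heuristic of shrinking $\lambda$ after a successful step and growing it otherwise. This is our own illustration under those assumptions, not the source's implementation; error, grad, and hess are assumed to be callables supplied by the caller.

```python
import numpy as np

def lm_step(w, error, grad, hess, lam, grow=10.0, shrink=0.1):
    """One Levenberg update per (4.29): w_new = w - (H + lam*I)^{-1} g."""
    g = grad(w)
    H = hess(w)
    I = np.eye(len(w))
    w_new = w - np.linalg.solve(H + lam * I, g)
    if error(w_new) < error(w):
        return w_new, lam * shrink  # success: trust the quadratic rule more
    return w, lam * grow            # failure: lean toward steepest descent
```

Marquardt's later refinement replaces I with the diagonal of H, so that the damping is scaled by the curvature along each weight direction.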