The necessary conditions for \( \theta^* \) to minimize Equation (A.16) are

\[
\frac{\partial F}{\partial\theta}(\theta^*) =
\begin{bmatrix}
\dfrac{\partial F}{\partial\theta_1} \\
\vdots \\
\dfrac{\partial F}{\partial\theta_M}
\end{bmatrix}(\theta^*) = 0
\qquad (A.17)
\]

\[
\frac{\partial^2 F}{\partial\theta^2}(\theta^*) =
\begin{bmatrix}
\dfrac{\partial^2 F}{\partial\theta_1^2} & \cdots & \dfrac{\partial^2 F}{\partial\theta_1\,\partial\theta_M} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial^2 F}{\partial\theta_M\,\partial\theta_1} & \cdots & \dfrac{\partial^2 F}{\partial\theta_M^2}
\end{bmatrix}(\theta^*) > 0
\qquad (A.18)
\]
The notation in Equation (A.18) is shorthand for the positive definiteness of the matrix \( \frac{\partial^2 F}{\partial\theta^2}(\theta^*) \), which is called the Hessian. The condition in Equation (A.17) can be written:
\[
\frac{\partial F}{\partial\theta}(\theta^*)
= -2\left(\frac{\partial f}{\partial\theta}(\theta^*)\right)^{\!\mathsf{T}}\bigl(x - f(\theta^*)\bigr) = 0
\qquad (A.19)
\]
or equivalently,

\[
\left(\frac{\partial f}{\partial\theta}(\theta^*)\right)^{\!\mathsf{T}}\bigl(x - f(\theta^*)\bigr) = 0
\qquad (A.20)
\]
Note that \( \frac{\partial f}{\partial\theta} \) is an \( N \times M \) matrix. When \( f \) is a linear function of the parameters \( \theta \in \mathbb{R}^M \), given by \( f(\theta) = A\theta \) where \( A \) is an \( N \times M \) matrix that doesn't depend on \( \theta \), then Equation (A.20) is a linear equation in \( \theta^* \), the solution of which is:

\[
\theta^* = \left(A^{\mathsf{T}} A\right)^{-1} A^{\mathsf{T}} x
\qquad (A.21)
\]
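Equation (A.21) can be checked numerically. The sketch below uses a small synthetic problem; the matrix A, the data x, and the sizes N = 5, M = 2 are invented for illustration:

```python
import numpy as np

# Synthetic linear model f(theta) = A @ theta (A and x are invented here).
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 2))           # N x M, independent of theta
theta_true = np.array([1.5, -0.5])
x = A @ theta_true                    # noise-free data for a clean check

# Closed-form minimizer from Equation (A.21): theta* = (A^T A)^{-1} A^T x.
theta_star = np.linalg.solve(A.T @ A, A.T @ x)

# np.linalg.lstsq solves the same problem without forming A^T A explicitly,
# which is numerically preferable when A is ill-conditioned.
theta_lstsq, *_ = np.linalg.lstsq(A, x, rcond=None)
```

Both solves recover `theta_true` up to floating-point error, since the data were generated without noise.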
However, in general, Equation (A.20) is a nonlinear system of equations in \( \theta \) that must be solved by numerical means.
Let's expand the cost function \( F(\theta) \) in a Taylor series approximation about some point \( \theta_t \):

\[
F(\theta) \approx F(\theta_t)
+ \frac{\partial F}{\partial\theta}(\theta_t)^{\mathsf{T}}(\theta - \theta_t)
+ \frac{1}{2}\,(\theta - \theta_t)^{\mathsf{T}}\,\frac{\partial^2 F}{\partial\theta^2}(\theta_t)\,(\theta - \theta_t)
\qquad (A.22)
\]
We can compute the Hessian matrix of second derivatives in Equation (A.22) as:

\[
\begin{aligned}
\frac{\partial^2 F}{\partial\theta^2}(\theta_t)
&= -2\sum_{k=1}^{N}\bigl(x_k - f(x_k;\theta_t)\bigr)\,\frac{\partial^2 f(x_k;\theta_t)}{\partial\theta^2}
 + 2\left(\frac{\partial f}{\partial\theta}(\theta_t)\right)^{\!\mathsf{T}}\frac{\partial f}{\partial\theta}(\theta_t) \\
&= -2\sum_{k=1}^{N}\bigl(x_k - f(x_k;\theta_t)\bigr)\,\frac{\partial^2 f(x_k;\theta_t)}{\partial\theta^2}
 + 2\,J(\theta_t)^{\mathsf{T}} J(\theta_t)
\end{aligned}
\qquad (A.23)
\]

where \( J \) is the Jacobian matrix defined by

\[
J(\theta_t) = \frac{\partial f}{\partial\theta}(\theta_t)
\qquad (A.24)
\]
That is, the \( (j,k) \)th element of \( J \) is the partial derivative of the \( j \)th model prediction \( \hat{x}_j \) with respect to the \( k \)th parameter \( \theta_k \).
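When Equation (A.20) must be solved numerically, a common approach is a Gauss–Newton style iteration: approximate the Hessian of Equation (A.23) by its second term, \( 2\,J^{\mathsf{T}}J \) (assuming the residual-weighted first term is negligible), and repeatedly solve the resulting linear system. A minimal sketch; the exponential model, the data, and the starting point are all invented for illustration:

```python
import numpy as np

# Gauss-Newton sketch for F(theta) = sum_k (x_k - f(t_k; theta))^2
# with a hypothetical model f(t; theta) = theta_0 * exp(theta_1 * t).
# The Hessian of Equation (A.23) is approximated by its 2 J^T J term.

def f(t, theta):
    return theta[0] * np.exp(theta[1] * t)

def jacobian(t, theta):
    # Row k holds the derivatives of the k-th prediction with respect to
    # (theta_0, theta_1), matching the (j, k) layout of J described above.
    e = np.exp(theta[1] * t)
    return np.column_stack([e, theta[0] * t * e])

t = np.linspace(0.0, 1.0, 20)
theta_true = np.array([2.0, -1.0])
x = f(t, theta_true)            # noise-free data, so the residual can reach zero

theta = np.array([1.5, -0.5])   # start near the solution; plain Gauss-Newton
for _ in range(50):             # may need damping when started far away
    r = x - f(t, theta)         # residual, as in Equations (A.19)-(A.20)
    J = jacobian(t, theta)
    delta = np.linalg.solve(J.T @ J, J.T @ r)   # (J^T J) delta = J^T r
    theta = theta + delta
    if np.linalg.norm(delta) < 1e-12:
        break
```

At convergence the update vanishes, which is precisely the stationarity condition of Equation (A.20): the residual becomes orthogonal to the columns of \( J \).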