We denote by w_LS the vector of parameters for which the least squares cost
function J is minimum. The resulting model is thus of the form g^Q = ζ^T w_LS,
and one can define the vector g^Q = Ξ w_LS, where
• g^Q is a vector whose N components are the predictions of the model for
the N measurements;
• Ξ is the observation matrix: column i (i = 1 to Q + 1) is the vector Ξ_i,
whose components are the N measurements of the i-th input; matrix
Ξ has N rows and Q + 1 columns:
\[
\Xi =
\begin{pmatrix}
\zeta_1^T \\
\vdots \\
\zeta_N^T
\end{pmatrix}
=
\begin{pmatrix}
\zeta_1^1 & \cdots & \zeta_1^{Q+1} \\
\vdots & & \vdots \\
\zeta_N^1 & \cdots & \zeta_N^{Q+1}
\end{pmatrix}.
\]
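As a concrete illustration, the observation matrix and the least-squares solution can be formed with NumPy. This is a minimal sketch with made-up data; the variable names (`Xi`, `w_ls`) and the synthetic inputs are illustrative assumptions, not taken from the text.

```python
# Sketch: build the observation matrix Xi (N rows, Q + 1 columns) and
# compute w_LS, the minimizer of J(w) = ||y_p - Xi w||^2.
import numpy as np

rng = np.random.default_rng(0)
N, Q = 50, 3                              # N measurements, Q candidate inputs

X = rng.normal(size=(N, Q))               # measurements of the Q candidate inputs
Xi = np.column_stack([np.ones(N), X])     # observation matrix; the constant
                                          # input gives the (Q + 1)-th column
w_true = np.array([1.0, 2.0, -1.0, 0.0])  # last parameter zero: irrelevant input
y_p = Xi @ w_true + 0.1 * rng.normal(size=N)

# Least-squares parameters and the vector of the N model predictions
w_ls, *_ = np.linalg.lstsq(Xi, y_p, rcond=None)
g_Q = Xi @ w_ls
print(Xi.shape)                           # (N, Q + 1)
```

With low noise, `w_ls` recovers the generating parameters closely, and the estimate of the irrelevant input's parameter is near zero, which is what the selection test below formalizes.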
The input selection problem is the following: are all Q candidate variables
relevant? If a variable is irrelevant, the corresponding parameter in the com-
plete model should be equal to zero. A submodel is a model that is obtained
by setting to zero one or several parameters of the complete model. Thus,
in order to solve the problem, the complete model must be compared to all
its submodels. We consider a submodel whose last q components (numbered
from Q
,where w Q−q
mc
is the vector of parameters that is obtained by minimizing the least squares
cost function J =
= Ξw Q−q
mc
q +2 to Q + 1) are equal to zero: g Q−q
2 under the constraints that the last q
components of the vector of the parameters be equal to zero. We want to test
the null hypothesis H 0 : the last q parameters of the random vector W are
equal to zero. If that hypothesis is true, then the random variable
y p
g Q−q ( ζ , w )
2
2
Z = N
Q
1
Y p
G Q−q
Y p
G Q
q
2
Y p
G Q
2
= N
Q
1
G Q
G Q−q
2
q
Y p
G Q
is a Fisher variable with q and N −Q -1 degrees of freedom.
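The test can be carried out numerically by fitting the complete model and the submodel and comparing Z to the Fisher distribution. The sketch below assumes SciPy is available; the data, seed, and the choice of which q columns to drop are illustrative assumptions, not from the text.

```python
# Sketch of the Fisher test: fit the complete model and the submodel
# (last q parameters forced to zero), form Z, and get a p-value.
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(1)
N, Q, q = 100, 4, 2                    # test whether the last q inputs matter

Xi = np.column_stack([np.ones(N), rng.normal(size=(N, Q))])   # N x (Q + 1)
w_true = np.array([0.5, 1.0, -2.0, 0.0, 0.0])  # H0 holds: last q params zero
y_p = Xi @ w_true + rng.normal(size=N)

def fitted(A, y):
    """Least-squares predictions of the model with observation matrix A."""
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ w

G_Q = fitted(Xi, y_p)                       # complete model
G_Qq = fitted(Xi[:, : Q + 1 - q], y_p)      # submodel: last q columns removed

Z = ((N - Q - 1) / q) \
    * (np.sum((y_p - G_Qq) ** 2) - np.sum((y_p - G_Q) ** 2)) \
    / np.sum((y_p - G_Q) ** 2)
p_value = f.sf(Z, q, N - Q - 1)             # under H0, Z ~ Fisher(q, N - Q - 1)
print(Z, p_value)
```

A large p-value is consistent with H_0, i.e. with the last q inputs being irrelevant; a small one leads to rejecting the submodel.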
Proof. The quantity ‖Y_p − G^Q‖² is the sum of the squares of the components
of the vector Y_p − G^Q, which is orthogonal to the subspace spanned by the Q + 1
columns of the observation matrix Ξ. Thus, it is the sum of N − (Q + 1)
squared independent Gaussian variables: it has a Pearson distribution with
N − Q − 1 degrees of freedom. Similarly, the vector G^Q − G^{Q−q} lies in a
q-dimensional space, hence the square of its norm is the sum of q squared
independent Gaussian variables: therefore, ‖G^Q − G^{Q−q}‖² is a Pearson
variable with q degrees of freedom. The ratio Z of those Pearson variables is
a Fisher variable, as mentioned above.
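The distributional claim in the proof can be checked by simulation. This Monte Carlo sketch is an addition for illustration, not part of the text: it generates data under H_0 many times, computes Z through the two orthogonal projectors, and compares the empirical mean to the mean of the Fisher(q, N − Q − 1) distribution.

```python
# Monte Carlo check that, under H0, Z behaves like a Fisher(q, N - Q - 1)
# variable. Projectors give the two squared norms from the proof directly.
import numpy as np

rng = np.random.default_rng(2)
N, Q, q = 40, 3, 2
Xi = np.column_stack([np.ones(N), rng.normal(size=(N, Q))])
Xi_sub = Xi[:, : Q + 1 - q]

P_full = Xi @ np.linalg.pinv(Xi)            # projector on span of all columns
P_sub = Xi_sub @ np.linalg.pinv(Xi_sub)     # projector for the submodel

zs = []
for _ in range(2000):
    y = Xi_sub @ np.ones(Q + 1 - q) + rng.normal(size=N)   # H0 holds
    num = y @ (P_full - P_sub) @ y          # ||G^Q - G^{Q-q}||^2, q dof
    den = y @ (np.eye(N) - P_full) @ y      # ||Y_p - G^Q||^2, N - Q - 1 dof
    zs.append(((N - Q - 1) / q) * num / den)

d2 = N - Q - 1
print(np.mean(zs))   # should be close to the F mean d2 / (d2 - 2)
```

Since the Fisher(q, d2) distribution has mean d2/(d2 − 2) for d2 > 2, the empirical mean of `zs` lands near 36/34 ≈ 1.06 here, supporting the degrees-of-freedom count in the proof.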