$$
f_{\mathrm{fat}}(e) = \mu_n \ast G_{h_{\mathrm{fat}}}(e), \quad \text{with } h_{\mathrm{fat}} > h_{\mathrm{IMSE}}. \tag{3.26}
$$

By Proposition 3.1, there is a Gaussian kernel $G_h(e)$ such that $G_{h_{\mathrm{fat}}}(e) = G_{h_{\mathrm{IMSE}}}(e) \ast G_h(e)$. Hence,

$$
f_{\mathrm{fat}}(e) = \mu_n \ast G_{h_{\mathrm{IMSE}}}(e) \ast G_h(e) = f_n(e) \ast G_h(e) \xrightarrow{\; n \to \infty \;} f(e) \ast G_h(e), \tag{3.27}
$$
where the convergence is in the IMSE sense.
The estimate $f_{\mathrm{fat}}(e)$ is oversmoothed compared to the one converging to $f(e)$. This is unimportant, since we are not really interested in $f_n(e)$ itself (in particular, we do not use it to compute error rates). Our sole interest is in obtaining the correct classifier parameter values ($d$ and $\sigma$ in Examples 3.1 and 3.2) corresponding to $\min P_e$.
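Proposition 3.1 rests on the fact that the convolution of two Gaussian kernels is again a Gaussian kernel whose squared bandwidths add. A minimal numerical sketch of this identity is given below (an illustration only, assuming NumPy/SciPy and an arbitrary synthetic error sample; the bandwidth values are not taken from the text): a KDE built directly with the fat bandwidth coincides with the IMSE-bandwidth KDE smoothed once more by $G_h$ with $h = \sqrt{h_{\mathrm{fat}}^2 - h_{\mathrm{IMSE}}^2}$.

```python
import numpy as np
from scipy.stats import norm
from scipy.signal import fftconvolve

# Arbitrary 1-D "error" sample (illustrative assumption, not from the text).
rng = np.random.default_rng(0)
e_sample = rng.normal(0.0, 1.0, size=500)

h_imse = 0.25                       # IMSE-optimal bandwidth (illustrative value)
h_fat = 0.40                        # deliberately oversmoothed ("fat") bandwidth
h = np.sqrt(h_fat**2 - h_imse**2)   # Proposition 3.1: h_fat^2 = h_imse^2 + h^2

grid = np.linspace(-5, 5, 2001)     # common evaluation grid
de = grid[1] - grid[0]

def kde(sample, bandwidth):
    """Gaussian kernel density estimate f_n(e) = mu_n * G_h(e)."""
    return norm.pdf(grid[:, None], loc=sample[None, :], scale=bandwidth).mean(axis=1)

f_fat = kde(e_sample, h_fat)         # direct fat-kernel estimate
f_imse = kde(e_sample, h_imse)       # IMSE-bandwidth estimate
g_h = norm.pdf(grid, 0.0, h)         # extra smoothing kernel G_h
f_smoothed = fftconvolve(f_imse, g_h, mode="same") * de   # f_n(e) * G_h(e)

# The two estimates agree up to grid-discretization error.
print(np.max(np.abs(f_fat - f_smoothed)))
```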
3.2 The Linear Discriminant
Linear discriminants are basic building blocks in data classification. The linear discriminant implements the following classifier family:

$$
Z_W = \theta(w^T x + w_0); \quad w \in \mathbb{R}^d,\; w_0 \in \mathbb{R}, \tag{3.28}
$$

where $w$ and $w_0$ are the classifier parameters usually known as weight vector and bias term, respectively, and $\theta(\cdot)$ is the usual classifier thresholding function yielding class codes. We restrict here our analysis of the linear discriminant to the case where the inputs are Gaussian distributed; this will be enough to demonstrate the MEE sub-optimal behavior for this type of classifier.
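For concreteness, a minimal sketch of the classifier family (3.28) in NumPy follows; the class codes $\{-1, +1\}$ returned by the thresholding function $\theta$ and the example weight values are illustrative assumptions, not prescribed by the text.

```python
import numpy as np

def linear_discriminant(x, w, w0):
    """Classify inputs with z = theta(w^T x + w0).

    x  : array of shape (n, d), one input vector per row
    w  : weight vector of shape (d,)
    w0 : bias term (scalar)
    Returns class codes in {-1, +1} (choice of codes is illustrative).
    """
    return np.where(x @ w + w0 >= 0.0, 1, -1)

# Example: a discriminant in R^2 with w = [1, -1], w0 = 0.5 (arbitrary values).
x = np.array([[0.0, 0.0], [2.0, 0.5], [-1.0, 1.0]])
print(linear_discriminant(x, np.array([1.0, -1.0]), 0.5))   # -> [ 1  1 -1]
```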
3.2.1 Gaussian Inputs
To derive the error PDF for Gaussian inputs $x_i$ we take into account that Gaussianity is preserved under linear transformations: if $X$, with realizations $x = [x_1 \ldots x_d]^T$, has a multivariate Gaussian distribution with mean $\mu$ and covariance $\Sigma$, $X \sim g(x; \mu, \Sigma)$, then

$$
Y = w^T X + w_0 \sim g(y;\; w^T \mu + w_0,\; w^T \Sigma w). \tag{3.29}
$$
Therefore, the class-conditional error PDFs, $f_{E|t}(e)$, are also Gaussian and we deal with an error PDF setting similar to the one of Examples 3.1 and 3.2:

$$
f_{Y|t}(y) = g(y;\; w^T \mu_{X|t} + w_0,\; w^T \Sigma_{X|t} w). \tag{3.30}
$$
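The propagation rule (3.29) behind this setting can be verified with a short Monte Carlo sketch, shown below; the particular $\mu$, $\Sigma$, $w$, and $w_0$ are arbitrary illustrative values, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative (arbitrary) class-conditional input distribution and discriminant.
mu = np.array([1.0, -0.5])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 0.5]])
w = np.array([0.8, -1.2])
w0 = 0.1

# Analytical parameters of Y = w^T X + w0, per (3.29).
mean_y = w @ mu + w0          # w^T mu + w0
var_y = w @ Sigma @ w         # w^T Sigma w

# Monte Carlo check: sample X ~ g(x; mu, Sigma) and transform.
X = rng.multivariate_normal(mu, Sigma, size=200_000)
Y = X @ w + w0

print(mean_y, Y.mean())       # analytical vs. empirical mean (should agree closely)
print(var_y, Y.var())         # analytical vs. empirical variance
```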
 