Image Processing Reference
4.2 Kernel Feature Space versus PCA Feature Space
In PCA, the covariance matrix of the input data is computed after some preprocessing steps, such as calculating the mean and centring the data. The feature space is then obtained by extracting the eigenvectors of this matrix. Each eigenvector has a corresponding eigenvalue whose magnitude determines the importance of its principal axis. Projecting the data onto a subspace spanned by some of these axes is the actual data transformation, and the number of principal axes chosen for the projection determines the dimension of the transformed data. This means the dimension of the input data can be reduced even to one by projecting onto the first principal axis alone. Conversely, the maximum dimension the data can have after transformation equals its original dimension, which results from projecting onto all principal axes. Note that in PCA the dimension of the feature space, from which the spanned subspace is selected, equals the dimension of the input data. For example, let X = [x_1, …, x_N], where x_t ∈ R^d, meaning that we have N input samples of dimension d. In this case, no matter how large N is, the feature space has the fixed dimension d. This can be considered an advantage of PCA: the dimension of the feature space is limited to that of the input data and does not grow too high when analyzing a potentially large number of samples.
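The PCA steps above can be sketched as follows. This is a minimal illustration, not a production implementation; the data are random and the shapes (d = 3, N = 100) are chosen only to show that the feature space stays d-dimensional however large N is.

```python
import numpy as np

# Assumed layout, mirroring X = [x_1, ..., x_N] in the text:
# one sample per column, so X has shape (d, N).
rng = np.random.default_rng(0)
d, N = 3, 100
X = rng.normal(size=(d, N))

X_centered = X - X.mean(axis=1, keepdims=True)  # centre the data
C = X_centered @ X_centered.T / N               # d x d covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)            # eigenvectors span the feature space

# The feature space has dimension d regardless of how large N is:
assert eigvecs.shape == (d, d)

# Projecting onto the first principal axis alone reduces each sample
# to a single value, i.e. the data become one-dimensional:
order = np.argsort(eigvals)[::-1]               # axes sorted by eigenvalue, descending
first_axis = eigvecs[:, order[0]]
projected = first_axis @ X_centered             # shape (N,): one value per sample
```

Projecting onto all d columns of `eigvecs` instead would keep the transformed data at its original dimension d, matching the maximum case described above.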
In KPCA, however, the situation is different: the kernel feature space has a higher dimension than the input data. In KPCA, the input data are mapped into the kernel feature space by a nonlinear feature map, and a kernel function evaluates inner products in that space. KPCA is then obtained by treating the kernel matrix as the input to PCA. Consequently, unlike PCA, the dimension of the kernel feature space does not depend on the dimension of the input data; it equals the number of input samples. Let X = [x_1, …, x_N], for instance, with x_t ∈ R^d, meaning that we have N input samples of dimension d. In KPCA, no matter how high the input dimension d is, the kernel feature space is N-dimensional, because the kernel matrix (Gram matrix) is [N × N]. The kernel matrix and its feature space are defined in Shawe-Taylor and Cristianini [22] as follows:
Given a set of vectors X = [x_1, …, x_N], the Gram matrix is defined as the [N × N] matrix G whose entries are G_ij = ⟨x_i, x_j⟩. If we use a kernel function k to evaluate the inner products in a feature space with feature map φ, the associated Gram matrix has entries

G_ij = k(x_i, x_j) = ⟨φ(x_i), φ(x_j)⟩.

In this case, the matrix is often referred to as the kernel matrix, displayed as follows:
K = [ k(x_1, x_1)  k(x_1, x_2)  …  k(x_1, x_N)
      k(x_2, x_1)  k(x_2, x_2)  …  k(x_2, x_N)
           ⋮            ⋮       ⋱       ⋮
      k(x_N, x_1)  k(x_N, x_2)  …  k(x_N, x_N) ]
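Building this kernel matrix can be sketched as below. The Gaussian (RBF) kernel is only an illustrative choice of k, and the centring of K in feature space that a full KPCA would perform is omitted for brevity; the point is that K is [N × N], so the space PCA now operates in is N-dimensional regardless of d.

```python
import numpy as np

# Assumed layout: one sample per column, X has shape (d, N).
rng = np.random.default_rng(1)
d, N = 3, 20
X = rng.normal(size=(d, N))

def rbf_kernel(xi, xj, gamma=0.5):
    # Illustrative kernel: k(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)
    return np.exp(-gamma * np.sum((xi - xj) ** 2))

# Entries K_ij = k(x_i, x_j) = <phi(x_i), phi(x_j)> in the kernel feature space
K = np.array([[rbf_kernel(X[:, i], X[:, j]) for j in range(N)]
              for i in range(N)])

# The kernel matrix is N x N, so the KPCA feature space is N-dimensional
# no matter how high the input dimension d is:
assert K.shape == (N, N)

# PCA is now performed on K instead of the d x d covariance matrix:
eigvals, eigvecs = np.linalg.eigh(K)
```

Note the contrast with the PCA case: there the eigendecomposition was of a d × d covariance matrix, here it is of an N × N kernel matrix.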