Example 6 In this example we use the “Pima diabetes” data set from the UCI Machine Learning data archives. The reduced kernel method is adopted by randomly sampling a percentage of the column vectors from the full kernel matrix. For reduced kernel
PCA, we assume K is the underlying reduced kernel data matrix of size n × m, where n is the data size and m is the reduced set size (i.e., column set size). KPCA using the reduced kernel is a singular value decomposition problem that involves extracting the leading right and left singular vectors β and α:
K β = λ α,   normalized so that βᵀβ = 1 and αᵀα = 1,

where β (of length m) is a right singular vector, α (of length n) is the corresponding left singular vector, and λ is the associated singular value.
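As a concrete illustration of this SVD formulation, the sketch below builds a reduced kernel matrix by sampling columns and extracts the leading singular vectors with NumPy. The Gaussian kernel, the 10% sampling fraction, the centering step, and all function names are illustrative assumptions rather than details taken from the text.

```python
import numpy as np

def gaussian_kernel(A, B, gamma=0.1):
    # K[i, j] = exp(-gamma * ||A_i - B_j||^2); the kernel choice is an assumption.
    sq_dist = (np.sum(A ** 2, axis=1)[:, None]
               + np.sum(B ** 2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def reduced_kernel_pca(X, reduced_fraction=0.1, n_components=2, seed=0):
    """Reduced kernel PCA: SVD of the n x m reduced kernel matrix K."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    m = max(1, int(round(reduced_fraction * n)))   # reduced set (column) size
    cols = rng.choice(n, size=m, replace=False)    # randomly sampled columns
    K = gaussian_kernel(X, X[cols])                # n x m reduced kernel matrix
    K = K - K.mean(axis=0, keepdims=True)          # column centering (one common choice)
    # K = U diag(s) V^T: columns of U are the left singular vectors (alpha),
    # columns of V are the right singular vectors (beta), each of unit norm.
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]  # projections on leading components
    return scores, Vt[:n_components].T, cols
```

Applied to the eight Pima measurements, the returned scores would give low-dimensional coordinates of the kind displayed in the KPCA figure referenced below.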
In this data set, there are nine variables, including number of times pregnant, plasma glucose concentration (glucose tolerance test), diastolic blood pressure (mmHg), triceps skin fold thickness (mm), two-hour serum insulin (mu U/ml), body mass index (weight in kg/(height in m)²), diabetes pedigree function, age (years), and the class variable (test for diabetes), which can be positive or negative. For demonstration purposes, we use the first eight variables as input measurements and the last variable as the class label.
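For completeness, here is a minimal sketch of the data-handling step just described; the file name and the 0/1 coding of the class label are assumptions based on the common UCI distribution of this data set, not details given in the text.

```python
import numpy as np

# Assumed local copy of the UCI Pima diabetes data: eight measurements
# followed by the class variable in the last column (file name is hypothetical).
data = np.loadtxt("pima-indians-diabetes.csv", delimiter=",")
X = data[:, :8]   # pregnancies, glucose, blood pressure, skin fold, insulin, BMI, pedigree, age
y = data[:, 8]    # diabetes test result: positive (1) or negative (0)
```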
Figure: Results from PCA based on original input variables.
Figure: Results from KPCA with the polynomial kernel of degree and scale.