based on the fact that blood vessels contract when
subjects are stressed and thus skin temperature
drops. Studies have shown that the skin tempera-
ture is at its lowest when stress is at its maximum
(Kistler, et al., 2003). As for skin conductance,
studies have shown that when stress increases, the moisture of the body also increases, which leads to a decrease in the resistance of the skin.
In order to validate the importance of the three features, Principal Component Analysis (PCA) was conducted (Pearson, 1901). PCA is a well-known method that reduces a number of possibly correlated features to a smaller number using their covariance matrix. A feature with high variance is likely to carry more information, and a feature that is independent of the other features is also of high importance because it carries new information. To perform PCA, first center the data by subtracting from each feature its mean, then compute the covariance matrix, and finally perform an eigenvalue/eigenvector decomposition of the covariance matrix. The component with the highest eigenvalue is the one with the most variability and hence contains most of the information. For the features used, the three eigenvalues obtained are λ1 = 0.66, λ2 = 0.28 and λ3 = 0.11. All three eigenvalues are significant, and hence removing any one feature would cause a significant loss of information.
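As a brief illustration of these steps (a sketch only, not the implementation used in the study), the decomposition can be computed with NumPy; the matrix X below is a hypothetical placeholder with one row per recording and one column per physiological feature.

import numpy as np

# Hypothetical feature matrix: one row per sample, one column per feature
# (e.g., skin temperature, skin conductance, and the third indicator).
X = np.random.rand(100, 3)

# 1. Center the data by subtracting each feature's mean.
X_centered = X - X.mean(axis=0)

# 2. Compute the covariance matrix of the features.
cov = np.cov(X_centered, rowvar=False)

# 3. Eigenvalue/eigenvector decomposition of the symmetric covariance matrix.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Sort components from largest to smallest eigenvalue; the leading one
# captures the largest share of the variance.
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

print("proportion of variance per component:", eigenvalues / eigenvalues.sum())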
Once features are selected, the challenge for a
prediction system is to combine the different indi-
cators in order to make a decision. The prediction
has to be accurate, consistent and computationally
efficient. What we present here is an algorithm
based on Support Vector Machines.

Support Vector Machines

When data is linearly separable, SVM uses a linear hyperplane to create a classifier with a maximal margin (Kecman, 2001). When the data is not linearly separable, SVM maps the data into a higher dimensional space called the feature space. This mapping can be achieved using various nonlinear mappings: polynomial, sigmoid, and RBF such as the Gaussian RBF. In the higher dimensional feature space the SVM algorithm separates the data using a linear hyperplane. Unlike other techniques, the probability model and probability density functions do not need to be known a priori. This is very important for generalization purposes, as in practical cases there is not enough information about the underlying probability laws and distributions between the inputs and the outputs. Since SVM has been recording state-of-the-art accuracies in many fields, and since it has excellent generalization ability, it is used in the course of this chapter.
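By way of illustration (the chapter does not prescribe a particular implementation), a kernelized SVM of this kind can be trained with scikit-learn; the arrays X and y below are hypothetical stand-ins for the physiological features and the stressed/not-stressed labels.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical data: rows are samples, columns are the three features.
X = np.random.rand(200, 3)
y = np.random.randint(0, 2, size=200)  # placeholder stress labels

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# The Gaussian RBF kernel implicitly maps the data into a higher dimensional
# feature space, where a maximal-margin linear hyperplane is found.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))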
What follows is an introduction to the theory of
SVM and the general equation of the hyperplane
that will separate the two classes. In the case of linearly separable data, the approach is to find, among all the separating hyperplanes, the one that maximizes the margin. Clearly, any other
hyperplane will have a greater expected risk than
this hyperplane.
During the learning stage the machine uses the training data to find the parameters w = [w_1, w_2, ..., w_n] and b of the decision function d(x, w, b) given by:

d(x, w, b) = w^T x + b = Σ_{i=1}^{n} w_i x_i + b
The separating hyperplane follows the equation d(x, w, b) = 0. In the testing phase, an unseen vector x will produce an output y according to the following indicator function:

y = sign(d(x, w, b))
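In code, this indicator function is a single dot product followed by a sign check; the values of w and b below are hypothetical stand-ins for the learned parameters.

import numpy as np

w = np.array([0.8, -0.5, 0.3])  # hypothetical learned weight vector
b = -0.1                        # hypothetical learned bias

def d(x, w, b):
    # d(x, w, b) = w^T x + b
    return np.dot(w, x) + b

x_new = np.array([0.2, 0.7, 0.4])  # unseen feature vector
y_pred = np.sign(d(x_new, w, b))   # +1 -> class 1, -1 -> class 2
print(y_pred)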
The decision rule is: if d(x, w, b) > 0 then x belongs to class 1, and if d(x, w, b) < 0 then x belongs to class 2. The weight vector and the bias are obtained by minimizing, with respect to the dual variables a, the following objective:
L_d(a) = 0.5 a^T H a - f^T a
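As a sketch of how this minimization might be carried out (the chapter does not specify a solver; the data, the use of SciPy's SLSQP routine, and the hard-margin constraints a_i >= 0 and sum_i y_i a_i = 0 are assumptions here), H is built as H_ij = y_i y_j x_i^T x_j and f is a vector of ones:

import numpy as np
from scipy.optimize import minimize

# Hypothetical, linearly separable 2-D data; labels must be +1 / -1.
X = np.array([[2.0, 2.0], [2.5, 3.0], [3.0, 2.5],
              [0.0, 0.5], [0.5, 0.0], [1.0, 0.5]])
y = np.array([1, 1, 1, -1, -1, -1], dtype=float)
n = len(y)

# H_ij = y_i * y_j * (x_i . x_j), f = vector of ones.
Yx = y[:, None] * X
H = Yx @ Yx.T
f = np.ones(n)

def dual_objective(a):
    # L_d(a) = 0.5 a^T H a - f^T a
    return 0.5 * a @ H @ a - f @ a

# Hard-margin constraints: a_i >= 0 and sum_i y_i a_i = 0.
res = minimize(dual_objective, np.zeros(n), method="SLSQP",
               bounds=[(0, None)] * n,
               constraints={"type": "eq", "fun": lambda a: a @ y})
a = res.x

# Recover w and b from the support vectors (a_i > 0).
w = (a * y) @ X
support = a > 1e-6
b = np.mean(y[support] - X[support] @ w)

print("w =", w, "b =", b)
print("decision rule output:", np.sign(X @ w + b))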