To exploit the complementary information among all classifiers, we investigated
three decision rules (mean rule, product rule, and median rule). A detailed derivation
of these decision rules from Eqn. 5 and Bayes' theorem can be found, e.g., in [19].
Assuming that all classifiers are statistically independent and that the prior
probabilities of occurrence of the i-th class model are equal, the multi-classifier
fusion rule simplifies to
$$\text{Assign } X \rightarrow \{\omega_t = j\} \quad \text{if} \quad \operatorname*{DecisionRule}_{k \in \{1,\dots,R\}} P(\omega_t = j \mid X, D_k) = \max_{i \in \{1,\dots,C\}} \operatorname*{DecisionRule}_{k \in \{1,\dots,R\}} P(\omega_t = i \mid X, D_k) \tag{6}$$
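For illustration, below is a minimal NumPy sketch of Eqn. 6 for the three rules; the function and array names are ours, not from the paper:

```python
import numpy as np

def fuse_and_decide(posteriors, rule="mean"):
    """Fuse per-classifier posteriors and assign the class of Eqn. 6.

    posteriors: array of shape (R, C), where posteriors[k, i] is
    P(omega_t = i | X, D_k) from the k-th classifier.
    """
    P = np.asarray(posteriors, dtype=float)
    if rule == "mean":
        fused = P.mean(axis=0)
    elif rule == "product":
        fused = P.prod(axis=0)
    elif rule == "median":
        fused = np.median(P, axis=0)
    else:
        raise ValueError(f"unknown rule: {rule!r}")
    return int(np.argmax(fused))  # index j of the winning class
```

With R classifiers over C = 6 expressions, for example, fuse_and_decide(P, "product") realizes the product rule.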
As shown in Fig. 3, many popular classifiers, such as SVM, can output a voting
vector that represents the number of votes for each class. We denote by V_i the
voting number of the i-th class from the k-th classifier D_k.
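A voting vector of this kind can be accumulated from the pairwise decisions of a one-vs-one scheme; the sketch below assumes a hypothetical pairwise_decide interface and is only illustrative:

```python
from itertools import combinations

import numpy as np

def voting_vector(x, pairwise_decide, n_classes=6):
    """Accumulate pairwise wins into the voting vector V for one sample x.

    pairwise_decide(a, b, x) is a hypothetical callable returning the
    winning label (a or b) of the sub-classifier trained on classes a, b.
    With six classes there are C(6,2) = 15 expression-pair sub-classifiers.
    """
    V = np.zeros(n_classes, dtype=int)
    for a, b in combinations(range(n_classes), 2):
        V[pairwise_decide(a, b, x)] += 1
    return V
```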
These voting numbers are then converted to probabilities by applying the softmax
function
$$P_i = P(\omega_t = i \mid X, D_k) = \frac{\exp(V_i)}{\sum_{i=1}^{C} \exp(V_i)} \tag{7}$$
This transformation does not change the classification decision of an individual
classifier; moreover, it allows us to treat the classifier within a Bayesian
probabilistic framework.
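A minimal sketch of this conversion in NumPy; because exp is monotone, the argmax over the votes and the argmax over the resulting probabilities coincide:

```python
import numpy as np

def votes_to_posteriors(V):
    """Softmax of Eqn. 7: P_i = exp(V_i) / sum_j exp(V_j).

    Subtracting max(V) before exponentiating is a standard numerical
    stability trick and does not change the result.
    """
    V = np.asarray(V, dtype=float)
    e = np.exp(V - V.max())
    return e / e.sum()

# The decision is preserved: argmax over votes == argmax over posteriors.
V = np.array([5, 3, 1, 2, 4, 0])  # e.g. votes over six expressions
assert np.argmax(V) == np.argmax(votes_to_posteriors(V))
```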
4 Experiments
The proposed approach was evaluated on the Cohn-Kanade facial expression database.
In our experiments, 374 sequences were selected from the database for basic
expression recognition. The sequences came from 97 subjects, with one to six
expressions per subject.
The coordinates of the facial fiducial points in the first frame are determined by
ASM; the CSF features extracted from 38 facial components of fixed block size
around those points are then concatenated into one histogram. Ten-fold
cross-validation was used throughout.
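For concreteness, here is a sketch of the concatenation step; the per-component layout of 3 slices with 59 bins each is our assumption, chosen only to match the 38*59*3 = 6726 dimensionality reported below:

```python
import numpy as np

# Assumed layout matching the dimensionality reported below: 38*3*59 = 6726.
N_COMPONENTS, N_SLICES, N_BINS = 38, 3, 59

def concatenate_csf_histograms(component_histograms):
    """Flatten per-component, per-slice CSF histograms into one descriptor.

    component_histograms: array of shape (N_COMPONENTS, N_SLICES, N_BINS),
    one histogram per slice of each ASM-located facial component.
    """
    h = np.asarray(component_histograms, dtype=float)
    assert h.shape == (N_COMPONENTS, N_SLICES, N_BINS)
    return h.reshape(-1)  # 6726-dimensional feature vector
```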
It was anticipated that the component size would influence performance. Fig. 4
presents results for four block sizes with CSF. From this figure we can observe
that the highest mean performance (94.92%) is reached when the component size is
16×16, which was therefore selected for the following experiments.
AdaBoost is used to select the most important slices, as described in Sec. 2.3. In
our experiments, the number of slices was varied over 15, 30, 45, 60, 75, and 90.
The corresponding average recognition accuracies are 90.37%, 91.98%, 94.12%,
93.32%, 93.05%, and 92.25%, respectively. The best accuracy, 94.12%, is obtained
with 45 slices. Compared with the result in Fig. 4 at the optimal block size, the
accuracy decreases by 0.8%, but the dimensionality of the feature space is reduced
from 38*59*3 (6726) to 45*59 (2655).
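The paper's selection procedure is the one in Sec. 2.3; as a rough stand-in, the sketch below scores whole slices by aggregating scikit-learn AdaBoost feature importances and keeps the top 45. The estimator settings and the per-slice aggregation are our assumptions, not the authors' method:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

N_BINS = 59            # bins per slice
TOTAL_SLICES = 38 * 3  # = 114 candidate slices

def select_top_slices(X, y, n_slices=45):
    """Keep the n_slices slices with the largest boosted importance.

    X: (n_samples, TOTAL_SLICES * N_BINS) concatenated histograms.
    Returns the kept slice indices and the reduced feature matrix
    (45 slices -> 45 * 59 = 2655 columns).
    """
    booster = AdaBoostClassifier(n_estimators=200, random_state=0)
    booster.fit(X, y)
    # Aggregate per-bin importances into one score per slice.
    scores = booster.feature_importances_.reshape(TOTAL_SLICES, N_BINS).sum(axis=1)
    keep = np.sort(np.argsort(scores)[::-1][:n_slices])
    cols = (keep[:, None] * N_BINS + np.arange(N_BINS)).ravel()
    return keep, X[:, cols]
```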
The six-expression classification problem was decomposed into 15 two-class
problems; each test sample is therefore classified by 15 expression-pair
sub-classifiers. In multi-classifier fusion, the 15 sub-classifiers as a whole were
regarded as one individual classifier D_k, as shown in Fig. 3. After selecting the
optimal component size, five different