where the trade-off parameter $C > 0$, and $\alpha_i$, $i = 1, 2, \ldots, m$, are the Lagrange multipliers. The resulting decision function can be shown to take the form:

$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{m} \alpha_i y_i \cdot k(x, x_i) + b \right) \tag{10.40}$$

where the weight parameters $\nu_i$ in Fig. 10.5 are replaced by $\alpha_i y_i$.
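As an illustration of Eq. (10.40), the following minimal sketch evaluates the kernel decision function for a new point. The Gaussian (RBF) kernel, its `gamma` value, the toy support vectors, and the convention of mapping a zero score to +1 are all assumptions made for the example; the text does not specify them.

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel k(x, z); kernel type and gamma are assumed here."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def decision(x, support_vectors, labels, alphas, b, kernel=rbf_kernel):
    """Evaluate f(x) = sgn(sum_i alpha_i * y_i * k(x, x_i) + b)."""
    score = sum(
        alpha * y * kernel(x, sv)
        for alpha, y, sv in zip(alphas, labels, support_vectors)
    ) + b
    # Assumption: a score of exactly zero is assigned to the positive class.
    return 1 if score >= 0 else -1

# Hypothetical toy model: one support vector per class.
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [-1, 1]          # y_i
alphas = [1.0, 1.0]       # Lagrange multipliers alpha_i
b = 0.0                   # bias term

near_positive = decision((1.9, 2.1), support_vectors, labels, alphas, b)
near_negative = decision((0.1, -0.1), support_vectors, labels, alphas, b)
```

Each training sample contributes to the score only through the product of its multiplier and label, which is the role played by the weights in Fig. 10.5.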
10.4.4 Results of Movie Clip Classification
The experiments were conducted on a database of 6,000 movie
clips previously used in Sect. 10.2.5. All videos were indexed by visual and audio
features. The visual feature was obtained by TFM with T_c = ,000. Each video
clip was described by its associated weight vector [cf. Eq. (10.14)]. The audio
feature was obtained by LMM. A wavelet transform with a nine-level decomposition
was applied to the audio signal from each video clip. The coefficients in each
high-frequency subband were then characterized by the LMM. The resulting model
parameters, together with the mean and standard deviation of the wavelet coefficients in the
low-frequency subband, were used to obtain feature vectors according to Eq. (10.12).
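The decomposition step above can be sketched as follows. This is only an illustrative sketch: the Haar wavelet, the synthetic test signal, and the use of the population standard deviation are assumptions, since the text does not name the wavelet family or the estimator. The LMM fitting of the high-frequency subbands and the feature construction of Eq. (10.12) are not shown.

```python
import math
import statistics

def haar_step(signal):
    """One Haar analysis step: returns (approximation, detail), each half length.

    If the input length is odd, the last sample is dropped.
    """
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail

def wavelet_decompose(signal, levels):
    """Multi-level decomposition: returns (low-frequency band, list of detail bands)."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) < 2:
            raise ValueError("signal too short for the requested number of levels")
        approx, detail = haar_step(approx)
        details.append(detail)  # details[0] is the finest (highest-frequency) band
    return approx, details

def low_band_stats(approx):
    """Mean and (population) standard deviation of the low-frequency coefficients."""
    return statistics.fmean(approx), statistics.pstdev(approx)

# Hypothetical audio signal: 1,024 samples of a sine wave.
signal = [math.sin(0.05 * n) for n in range(1024)]
approx, details = wavelet_decompose(signal, levels=9)
low_mean, low_std = low_band_stats(approx)
```

With nine levels there are nine high-frequency subbands, each of which would be passed to the LMM, plus one low-frequency subband summarized by its mean and standard deviation.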
The SVM-based fusion model was applied to classify the videos in the
video database. Five semantic concepts were used to obtain the results:
Fighting, Ship Crashing, Love Scene, Music Video, and Dance
Party. For each of the five concepts, the ground truth classes were obtained by
manually classifying all video clips in the database. Table 10.6 gives details
of the data set used in the experiment. The ground truth class was
used for measuring classification performance. For each concept, the system was
trained using a training set of 100 to 250 samples randomly selected from the database,
with the number of samples depending on the concept. The size of the training set was less
than about 2 % of all video clips used for testing.
To measure the performance of the system, the following three criteria
were used: classification accuracy, false positive rate, and false negative rate.
Classification accuracy measured the percentage of correct
classifications [305]. The false positive rate was the proportion of negative instances
that were erroneously reported as being positive, and the false negative rate was the
proportion of positive instances that were erroneously reported as negative [304].
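The three criteria defined above can be computed from binary labels as in the following sketch. The function name and the example labels are hypothetical and chosen only to illustrate the definitions; they are not taken from the experiment.

```python
def classification_rates(y_true, y_pred):
    """Return (accuracy, false_positive_rate, false_negative_rate) for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # positives detected
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # negatives rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # negatives reported positive
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # positives reported negative
    accuracy = (tp + tn) / len(pairs)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return accuracy, fpr, fnr

# Made-up imbalanced example: 10 positive clips and 90 negative clips.
y_true = [1] * 10 + [0] * 90
y_pred = [1] * 7 + [0] * 3 + [1] * 2 + [0] * 88
accuracy, fpr, fnr = classification_rates(y_true, y_pred)
```

In this made-up example the accuracy is 0.95 even though 3 of the 10 positive clips are missed (a false negative rate of 0.30), which shows why accuracy alone can look favorable when negative samples dominate.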
Table 10.7 shows the experimental results obtained by the SVM-based fusion
method. The method achieved very high accuracy, more than 91 % on average.
It should be noted that such a result is not unusual: within a given class,
the negative samples far outnumbered the positive samples, and the
models correctly classified most of the negative samples, so the average
accuracy was high. A more informative measure was the false negative rate, since it indicated
the percentage of positive samples that were missed. The system had
the highest false negative rate at 26.87 % for classification of the 'Ship Crashing'