respectively. This study is also consistent with existing research [258, 260, 261]. In the following experiments on genre categorization with a total of 23 sports types, the codebook size is expected to need to be larger than in the 14-sports case tested above. Therefore, a codebook size of 1,600 is chosen, and a codebook size of 800 is also applied for comparative analysis. For view classification involving 14 sports, a codebook size of 800 is selected.
9.4.1 Genre Categorization Using K-Nearest Neighbor Classifier
In genre categorization, a K-nearest neighbor (k-NN) classifier is applied. Three different dissimilarity measurements are compared: Euclidean distance (ED), earth mover's distance (EMD), and Kullback-Leibler divergence (KL-div). ED measures the spatial distance between two histograms in Euclidean space. EMD is a distance function that achieves the minimal cost of transforming one histogram into the other [300]. The KL-div is a non-symmetric measurement between two probability distributions Q and P, defined as $D_{KL}(Q \| P) = \sum_i q_i \cdot \ln(q_i / p_i)$ [301]. In this work, $q_i$ and $p_i$ are individual codewords for the query video Q and the trained genre model P, respectively.
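The sketch below illustrates the three dissimilarity measurements over codeword histograms, together with a nearest-neighbor genre decision (1-NN for simplicity). All names, histogram sizes, and the epsilon guard are illustrative assumptions, not details from the original experiments.

```python
# Sketch: three dissimilarity measures over codeword histograms,
# plus a 1-NN genre assignment. Assumed setup, not the original code.
import numpy as np
from scipy.stats import wasserstein_distance  # 1-D earth mover's distance

def ed(q, p):
    """ED: Euclidean distance between two histograms."""
    return float(np.linalg.norm(q - p))

def kl_div(q, p, eps=1e-12):
    """Non-symmetric KL-div: D_KL(Q || P) = sum_i q_i * ln(q_i / p_i).
    eps (an assumption here) guards against empty histogram bins."""
    q, p = q + eps, p + eps
    return float(np.sum(q * np.log(q / p)))

def emd(q, p):
    """EMD: minimal cost of transforming one histogram into the
    other, treating bin indices as 1-D positions."""
    bins = np.arange(len(q))
    return wasserstein_distance(bins, bins, u_weights=q, v_weights=p)

def nearest_genre(query, genre_models, dist):
    """Assign the genre whose trained model histogram is closest."""
    return min(genre_models, key=lambda g: dist(query, genre_models[g]))

rng = np.random.default_rng(0)
def rand_hist(n=800):                 # codebook size 800, toy data
    h = rng.random(n)
    return h / h.sum()

models = {"soccer": rand_hist(), "tennis": rand_hist()}  # hypothetical genres
query = rand_hist()
for d in (ed, kl_div, emd):
    print(d.__name__, nearest_genre(query, models, d))
```

A practical note on the design: ED and EMD are symmetric, whereas KL-div is not, so for KL-div the query histogram must consistently take the role of Q and the genre model the role of P, as in the definition above.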
Before the accuracy analysis of genre categorization, the codebook generation schemes are examined by comparing the proposed two-level bottom-up (BU) structure with the baseline single K-means (SK) clustering method [301]. As pointed out by Jain et al. [302], K-means clustering is a partitional algorithm that uses the squared error to reach the optimum solution. The sum of squared errors (SSE) is a widely used criterion function for clustering analysis, which quantitatively measures the total difference between all individual points and their cluster centers [301]. An SSE deviation percentage $\delta_{dev}$ is defined in Eq. (9.16). Let $\xi_{BU}$ and $\xi_{SK}$ represent the SSEs of the bottom-up clustering and the single K-means clustering at the end of each algorithm, respectively. The numerator is the absolute value of the difference between $\xi_{BU}$ and $\xi_{SK}$, and the denominator is $\xi_{SK}$. As Table 9.4 shows, the SSE deviation percentages at codebook sizes of 800 and 1,600 are 1.4% and 3.7%, respectively. Thus, we can conclude that using the bottom-up structure instead of the single K-means clustering for codebook generation causes only a trivial deviation in SSE.
$$\delta_{dev} = \frac{|\xi_{BU} - \xi_{SK}|}{\xi_{SK}} \cdot 100\% \qquad (9.16)$$
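The following sketch reproduces this comparison under assumed settings: a simple two-level bottom-up scheme (K-means within each data partition, then K-means over the pooled sub-centers), scikit-learn's KMeans, toy data, and an illustrative partition count. The chapter does not specify the bottom-up implementation at this level of detail.

```python
# Sketch: SSE deviation of Eq. (9.16), bottom-up vs. single K-means.
# Partitioning scheme, data, and parameters are assumptions.
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans

def sse(X, centers):
    """SSE: each point's squared distance to its nearest cluster
    center, summed over all points."""
    return cdist(X, centers, "sqeuclidean").min(axis=1).sum()

def bottom_up_codebook(X, k, n_parts=4, seed=0):
    """Level 1: K-means within each partition (parallelizable);
    level 2: K-means over the pooled sub-centers -> final codebook."""
    sub_centers = np.vstack([
        KMeans(n_clusters=k, n_init=3, random_state=seed).fit(part).cluster_centers_
        for part in np.array_split(X, n_parts)
    ])
    return KMeans(n_clusters=k, n_init=3, random_state=seed).fit(sub_centers).cluster_centers_

X = np.random.default_rng(0).random((5000, 32))   # toy feature vectors
k = 64                                            # toy codebook size
xi_bu = sse(X, bottom_up_codebook(X, k))
xi_sk = sse(X, KMeans(n_clusters=k, n_init=3, random_state=0).fit(X).cluster_centers_)
delta_dev = abs(xi_bu - xi_sk) / xi_sk * 100.0    # Eq. (9.16)
print(f"SSE deviation: {delta_dev:.2f}%")
```

The level-1 runs are independent per partition, which is what allows the bottom-up structure to be distributed across processors; running them one after another, as here, corresponds to the serial simulation of parallel computing described below.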
The codebook computation effort of the bottom-up structure is also compared with that of single K-means clustering in Table 9.4. Both methods are run on a single machine with a Quad-core CPU at 2.40 GHz and 4.0 GB of RAM, on which the bottom-up structure's parallel computing is only simulated as a serial sequence. To generate a codebook of size 800, the single K-means clustering uses