respectively. This study is also consistent with existing research [258, 260, 261].
In the following experiments on genre categorization with a total of 23 sports types, the codebook size is expected to need to be larger than in the 14-sport case tested earlier. Therefore, a codebook size of 1,600 is chosen, and a codebook size of 800 is also applied for comparison. For view classification involving 14 sports, a codebook size of 800 is selected.
9.4.1 Genre Categorization Using K-Nearest Neighbor Classifier
In genre categorization, a K-nearest neighbor (k-NN) classifier is applied. Three different dissimilarity measurements are compared: Euclidean distance (ED), earth mover's distance (EMD), and Kullback-Leibler divergence (KL-div). ED measures the spatial distance between two histograms in Euclidean space. EMD is a distance function that achieves the minimal cost of transforming one histogram into the other [300]. The KL-div is a non-symmetric measurement between two probability distributions Q and P, defined as D_KL(Q||P) = Σ_i q_i · ln(q_i / p_i) [301]. In this work, q_i and p_i are individual codewords for the query video Q and the trained genre model P, respectively.
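The three dissimilarity measurements can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy histograms are invented, the EMD shown is the one-dimensional special case (the L1 distance between cumulative distributions), and a small epsilon is assumed to guard the KL-div against empty bins.

```python
import numpy as np

def euclidean(q, p):
    """Euclidean distance (ED) between two histograms."""
    return float(np.sqrt(np.sum((q - p) ** 2)))

def emd_1d(q, p):
    """Earth mover's distance for 1-D normalized histograms:
    equals the L1 distance between their cumulative sums."""
    return float(np.sum(np.abs(np.cumsum(q) - np.cumsum(p))))

def kl_div(q, p, eps=1e-12):
    """Kullback-Leibler divergence D_KL(Q||P) = sum_i q_i * ln(q_i / p_i).
    eps guards against zero bins; note the measure is non-symmetric."""
    q = np.asarray(q) + eps
    p = np.asarray(p) + eps
    return float(np.sum(q * np.log(q / p)))

# hypothetical codeword histograms for a query video Q and a genre model P
q = np.array([0.5, 0.3, 0.2])
p = np.array([0.4, 0.4, 0.2])
```

Note that because KL-div is non-symmetric, `kl_div(q, p)` and `kl_div(p, q)` generally differ, which is why the query histogram and the genre model must be kept in a fixed order.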
Before the accuracy performance analysis on genre categorization, codebook generation schemes are examined by comparing the proposed two-level bottom-up (BU) structure against the baseline single K-means (SK) clustering method [301]. As pointed out by Jain et al. [302], K-means clustering is a partitional algorithm that uses the squared error to reach the optimum solution. The sum of squared errors (SSE) is a widely used criterion function for clustering analysis, which quantitatively measures the total difference between all individual points and their cluster centers [301]. An SSE deviation percentage δ_dev is defined in Eq. (9.16). Let ε_BU and ε_SK represent the SSEs of the bottom-up clustering and the single K-means clustering at the end of each algorithm, respectively. The numerator is the absolute value of the difference between ε_BU and ε_SK, and the denominator is ε_SK. As Table 9.4 shows, the SSE deviation percentages at codebook sizes of 800 and 1,600 are 1.4 % and 3.7 %, respectively. Thus, we can conclude that in using the bottom-up structure instead of the single K-means clustering for codebook generation, the deviation of SSE is trivial.

δ_dev = (|ε_BU − ε_SK| / ε_SK) · 100 %          (9.16)
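The SSE criterion and the deviation percentage of Eq. (9.16) can be sketched as below. This is a minimal illustration under assumed inputs: the SSE values would come from the respective clustering runs, and the sample points, labels, and centers here are invented.

```python
import numpy as np

def sse(points, labels, centers):
    """Sum of squared errors: total squared distance from each
    point to its assigned cluster center."""
    return float(sum(np.sum((points[labels == k] - c) ** 2)
                     for k, c in enumerate(centers)))

def sse_deviation_percent(sse_bu, sse_sk):
    """Eq. (9.16): delta_dev = |eps_BU - eps_SK| / eps_SK * 100 %."""
    return abs(sse_bu - sse_sk) / sse_sk * 100.0

# toy example: two points assigned to one center at their midpoint
pts = np.array([[0.0, 0.0], [2.0, 0.0]])
lab = np.array([0, 0])
ctr = np.array([[1.0, 0.0]])
```

For instance, SSE values of 1014.0 (bottom-up) against 1000.0 (single K-means) would give a deviation of 1.4 %, matching the order of magnitude reported for the 800-codeword case.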
The codebook computation effort of the bottom-up structure is also compared with single K-means clustering in Table 9.4. Both bottom-up and single K-means clustering are run on a machine with a single quad-core CPU at 2.40 GHz and 4.0 GB of RAM, in which the bottom-up structure is only simulated as parallel computing in a serial sequence. To generate a codebook with size 800, the single K-means clustering uses