Scalable Video Genre Classification and Event Detection - Multimedia Database Retrieval: Technology and Applications


Table 9.4 ʴ_dev and computation time in codebook generation using bottom-up (BU) and single K-means (SK) structures

Codebook size                       cb_BU = 800   cb_SK = 800   cb_BU = 1,600   cb_SK = 1,600
Computation                         4 h           350 h         9 h             648 h
SSE deviation percentage (ʴ_dev)    1.4 %                       3.7 %

350 h, while the bottom-up clustering takes only 4 h. When the codebook size is
doubled to 1,600, the computation times for single K-means and bottom-up clustering
are 648 and 9 h, respectively. With a truly distributed processing environment
using multiple computers, the bottom-up processing time would be reduced further. This
comparison of computation time demonstrates that our generic framework,
which uses robust bottom-up clustering for codebook generation, can replace single
K-means when dealing with large-scale and diverse datasets.
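To make the size of the gap concrete, the reported hours can be turned into speedup ratios. The snippet below is only an arithmetic illustration using the four computation times from Table 9.4; the variable and function names are hypothetical and not part of the original work.

```python
# Computation times in hours as reported in Table 9.4 (codebook size -> hours).
bottom_up_hours = {800: 4, 1600: 9}
single_kmeans_hours = {800: 350, 1600: 648}


def speedup(size):
    """Ratio of single K-means time to bottom-up time for one codebook size."""
    return single_kmeans_hours[size] / bottom_up_hours[size]


for size in (800, 1600):
    print(f"codebook size {size}: bottom-up is {speedup(size):.1f}x faster")
```

On these figures the bottom-up structure is about 87.5 times faster at a codebook size of 800 and 72 times faster at 1,600.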

For accuracy using k-NN and the various dissimilarity measures, Table 9.5
shows the average genre categorization results for 23 different sports. The proposed
bottom-up codebook generation performs better and more robustly
than single K-means codebook generation under both the EMD and KL-div measures.
Comparing the dissimilarities row-wise, the bottom-up structure is more
consistent across codebook sizes of 800 and 1,600. In contrast, single
K-means codebook generation is unstable for both histogram and mLDA-based
distributions. For instance, at a codebook size of 800, EMD gives
about a 7 % increase over ED dissimilarity (75.33 % vs. 68.31 %), while
at a codebook size of 1,600, EMD drops 1.1 % from ED
dissimilarity (64.28 % vs. 65.39 %). One reason is that single K-means clustering
on over three million input SIFT points hardly reaches the optimal value. In
summary, KL-div performs the best among the three dissimilarity measures. With
the bottom-up structure, the results for a codebook size of 1,600 consistently
outperform those for a size of 800 in all measurements. In contrast, the single K-means
clustering results are not consistent.
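As a rough illustration of how k-NN classification with interchangeable dissimilarity measures can be set up, the sketch below implements ED and KL-div on codeword histograms and a majority-vote k-NN. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the smoothing constant `eps`, the choice `k=3`, the toy histograms, and the labels `genre_a` and `genre_b` are all hypothetical, and EMD is omitted because it needs a ground distance between codewords that is not specified in this section.

```python
from collections import Counter

import numpy as np


def euclidean(p, q):
    """Euclidean distance (ED) between two histograms."""
    return float(np.linalg.norm(p - q))


def kl_divergence(p, q, eps=1e-10):
    """KL divergence KL(p || q); eps smoothing (an assumption) avoids log(0)."""
    p = (p + eps) / np.sum(p + eps)
    q = (q + eps) / np.sum(q + eps)
    return float(np.sum(p * np.log(p / q)))


def knn_predict(query, train_hists, train_labels, dissimilarity, k=3):
    """Label the query by majority vote among its k least-dissimilar neighbours."""
    dists = [dissimilarity(query, h) for h in train_hists]
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy 4-bin codeword histograms for two hypothetical genres.
train_hists = np.array([
    [0.7, 0.2, 0.1, 0.0],
    [0.6, 0.3, 0.1, 0.0],
    [0.0, 0.1, 0.3, 0.6],
    [0.1, 0.0, 0.2, 0.7],
])
train_labels = ["genre_a", "genre_a", "genre_b", "genre_b"]
query = np.array([0.65, 0.25, 0.05, 0.05])

for name, measure in (("ED", euclidean), ("KL-div", kl_divergence)):
    print(name, knn_predict(query, train_hists, train_labels, measure))
```

Passing the measure as a function keeps the classifier unchanged while the dissimilarity is swapped, which mirrors the kind of row-wise comparison made in Table 9.5.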

Another merit of the bottom-up structure is that it preserves individual
genre characteristics from the 1st-level K-means. In contrast, single K-means
codebook generation covers all the data at once; thus, a weakly distinguishable genre is
easily overruled by a strong one. This explains why, as the
codebook size increases from 800 to 1,600, the bottom-up process improves by about 4 %
for KL-div, while the single K-means process gains only 2 % for
KL-div.
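The contrast between the two structures can be sketched in code. The example below is a minimal sketch, assuming the 1st level runs K-means separately on each genre's features and a 2nd level runs K-means on the pooled 1st-level centroids; the exact merging procedure is not given in this section, so that 2nd level, the helper names, and the toy sizes are assumptions. Real SIFT descriptors are 128-dimensional; 8 dimensions are used here only to keep the example fast.

```python
import numpy as np


def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's K-means; returns a (k, dim) array of centroids."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its members (unchanged if empty).
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return centroids


def bottom_up_codebook(features_by_genre, words_per_genre, codebook_size):
    """Two-level clustering: per-genre K-means, then K-means on pooled centroids."""
    level1 = [kmeans(f, words_per_genre) for f in features_by_genre.values()]
    pooled = np.vstack(level1)  # every genre contributes the same number of words
    return kmeans(pooled, codebook_size)


def single_kmeans_codebook(features_by_genre, codebook_size):
    """Baseline: one K-means over all features from all genres at once."""
    return kmeans(np.vstack(list(features_by_genre.values())), codebook_size)


# Toy data: three hypothetical genres with 200 random 8-D descriptors each.
rng = np.random.default_rng(42)
features_by_genre = {
    name: rng.normal(loc=shift, scale=1.0, size=(200, 8))
    for name, shift in (("genre_a", 0.0), ("genre_b", 3.0), ("genre_c", 6.0))
}

codebook = bottom_up_codebook(features_by_genre, words_per_genre=10, codebook_size=12)
baseline = single_kmeans_codebook(features_by_genre, codebook_size=12)
print(codebook.shape, baseline.shape)
```

In this sketch every genre hands the same number of centroids to the 2nd level, so a genre with weak or sparse features is not drowned out by a dominant one before merging. The per-genre runs are also independent of one another, which is what would allow them to be distributed across multiple computers.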

The individual sport genre classification results are illustrated in Fig. 9.7. On
average, a codebook size of 1,600 gives results 3.6 % higher than a
codebook size of 800, which agrees with the empirical studies from other
research groups [258, 261].

To evaluate the generic and extensive properties of our proposed method,

experimental results on the 23-sports dataset are compared with results in Li et al.'s

work [ 265 ], where a top-down process was adopted using single K-means as its

.
