Table 9.8 Performance comparison on score event detection in basketball

                    Accuracy (%)
                    Dataset A (NBA/Olympics)     Dataset B (NBA/NBA)
Model               SVM + A      PLSA + A        SVM + B      PLSA + B
HMM,  ω = 0         78.28        75.29           87.50        85.94
CRF,  ω = 0         78.16        74.57           87.43        86.52
CRF,  ω = 1         79.52        76.82           88.52        87.89
HCRF, ω = 0         80.93        75.53           90.00        90.77
HCRF, ω = 1         83.26        80.24           93.08        92.31
HCRF, ω = 2         82.09        77.88           91.46        91.77

Dataset A: NBA matches as training, Olympic matches as testing. Dataset B: NBA
matches for both training and testing
of structured prediction models in accommodating poorly labeled video sequences
from PLSA, yet achieving comparable performance with those labeled sequences
from SVM. Therefore, the event detection presented in this work achieves similar
results by both unsupervised and supervised learning. However, due to PLSA's
reduced human involvement, the unsupervised classifier is preferred in large-scale
video analysis.
The results obtained on Dataset A and Dataset B are also compared. Although
both datasets contain basketball footage, Dataset B (NBA matches for both
training and testing) outperformed Dataset A (NBA matches for training, Olympic
matches for testing) by 10.9 % on average. This suggests that, although
Datasets A and B share the same genre and the same event detection task, a
significant difference exists between them. The difference can be explained by
treating NBA and international (FIBA) basketball as two styles of the same
genre. In terms of computer vision and structured prediction, NBA and FIBA
footage exhibit related but different temporal patterns even for the same
semantic event. Training and testing on the same style is therefore expected to
give a better detection rate than training on one style and testing on another.
This is also an example of the semantic gap: semantic event recognition under
discrepant conditions remains imperfect.
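The average gap between the two datasets can be checked against the figures in Table 9.8. The sketch below assumes the average is an unweighted mean over the six model configurations and over both labeling methods (SVM and PLSA); the text does not state how the average was formed, so this is one plausible reading. The names `TABLE_9_8` and `dataset_gap` are illustrative only.

```python
# Accuracy values (%) transcribed from Table 9.8.
# Each row: (model, parameter value, SVM+A, PLSA+A, SVM+B, PLSA+B)
TABLE_9_8 = [
    ("HMM",  0, 78.28, 75.29, 87.50, 85.94),
    ("CRF",  0, 78.16, 74.57, 87.43, 86.52),
    ("CRF",  1, 79.52, 76.82, 88.52, 87.89),
    ("HCRF", 0, 80.93, 75.53, 90.00, 90.77),
    ("HCRF", 1, 83.26, 80.24, 93.08, 92.31),
    ("HCRF", 2, 82.09, 77.88, 91.46, 91.77),
]


def mean(values):
    """Unweighted arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def dataset_gap(rows):
    """Return (SVM gap, PLSA gap, overall gap) of Dataset B over Dataset A.

    Assumption: every model configuration and both labeling methods
    carry equal weight in the average.
    """
    svm_gap = mean(row[4] - row[2] for row in rows)   # SVM+B minus SVM+A
    plsa_gap = mean(row[5] - row[3] for row in rows)  # PLSA+B minus PLSA+A
    return svm_gap, plsa_gap, (svm_gap + plsa_gap) / 2


svm_gap, plsa_gap, overall_gap = dataset_gap(TABLE_9_8)
print(f"SVM labels:  {svm_gap:.2f} points")
print(f"PLSA labels: {plsa_gap:.2f} points")
print(f"Overall:     {overall_gap:.2f} points")
```

Under this assumption the overall gap is about 10.9 percentage points, made up of roughly 9.3 points for the SVM-labeled sequences and 12.5 points for the PLSA-labeled ones. The change of style between training and testing therefore costs the PLSA-labeled sequences somewhat more.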
Although only one event detection example is discussed, the method is believed
to extend and generalize to a larger pool of event scenarios. The reason is
fourfold. First, the experimental data for the basketball score event are
multi-source and non-simplex: videos are collected from both Internet and TV
recordings, and NBA and Olympic basketball follow different production rules.
Second, the video representation module, which uses local features and the BoW
model, is free of domain knowledge and involves no production rules. Such a
generic approach has proven effective in genre categorization of 23 sports,
view classification of 14 sports, and basketball score event detection. Third,
the event detection algorithm using HCRFs, as well as the baseline HMMs and
CRFs, are structured prediction models that belong to the category of state
event models. Comparing the number of events analyzed by the different event
models in Table 9.2 shows that the state event model, a recently popular
approach in the literature, is capable of handling more events.
 