in gray, and H-segments in black. The other rows contain the relabeling according
to the various detectors. For the three rows {N, O}_x (x ∈ {1, 2, 5}), a segment is
labeled O when it contains part of an overlap and N otherwise. For the three rows
{N, L, H}_x (x ∈ {1, 2, 5}), overlap segments are labeled according to the conflict
level of the clip: L for LLC-Ov and H for HLC-Ov.
Figure 18.5 gives an instance of metadata relabeling for the LLC clip
#Train_0001. For this clip, an LLC-Ov occurs between 13.01 and 14.4 s. The relabeling
is O for segments 14 and 15 of {N, O}_1, segments 7 and 8 of {N, O}_2, and
segment 3 of {N, O}_5. The relabeling is L for segments 14 and 15 of {N, L, H}_1,
segments 7 and 8 of {N, L, H}_2, and segment 3 of {N, L, H}_5.
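The relabeling rule can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: the names `relabel` and `overlap_segments`, the 1-based segment indexing, and the half-open segment boundaries [(k-1)d, kd) are assumptions inferred from the segment numbers quoted above, and the 30 s clip duration is assumed for illustration only.

```python
import math

def relabel(overlaps, seg_dur, clip_dur, ov_label="O"):
    """Label each fixed-length segment of a clip.

    overlaps: list of (start, end) overlap intervals in seconds.
    seg_dur:  segment duration in seconds (1, 2, or 5 in the text).
    clip_dur: clip duration in seconds (assumed known).
    ov_label: label given to overlap segments ("O", or "L"/"H"
              depending on the conflict level of the clip).
    Returns a dict mapping 1-based segment index -> label.
    """
    n_segments = math.ceil(clip_dur / seg_dur)
    labels = {}
    for k in range(1, n_segments + 1):
        seg_start, seg_end = (k - 1) * seg_dur, k * seg_dur
        # A segment is an overlap segment if any overlap interval intersects it.
        hit = any(start < seg_end and end > seg_start for start, end in overlaps)
        labels[k] = ov_label if hit else "N"
    return labels

def overlap_segments(labels):
    """Return the indices of all segments not labeled N."""
    return [k for k, label in labels.items() if label != "N"]

# LLC clip with one overlap between 13.01 and 14.4 s (assumed 30 s long).
llc_overlaps = [(13.01, 14.4)]
for seg_dur in (1, 2, 5):
    print(seg_dur, overlap_segments(relabel(llc_overlaps, seg_dur, 30, "L")))
# 1 [14, 15]
# 2 [7, 8]
# 5 [3]
```

Under these assumptions the sketch reproduces the segment numbers given for #Train_0001; passing "O" instead of "L" yields the {N, O}_x rows.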
Figure 18.6 gives an instance of metadata relabeling for the HLC clip
#Train_0006. For this clip, HLC-Ovs occur between 14.9 and 18.9 s and between 28.3
and 30 s. The relabeling is O for segments 15, 16, 17, 18, 19, 29, and 30 of
{N, O}_1, for segments 8, 9, 10, and 15 of {N, O}_2, and for segments 3, 4, and
6 of {N, O}_5. The relabeling is H for segments 15, 16, 17, 18, 19, 29, and 30
of {N, L, H}_1, for segments 8, 9, 10, and 15 of {N, L, H}_2, and for
segments 3, 4, and 6 of {N, L, H}_5.
18.4.2 Two-Class {N, O} Classifiers
Using relabeling, three two-class SVMs ({N, O}_1, {N, O}_2, {N, O}_5) were
estimated on the Train set. Each SVM classifies a segment of a given duration
(1, 2, or 5 s) as overlap (O) or Non-Ov (N). To account for the imbalanced
class distribution, the over-represented category (N) was down-sampled by a given
factor: 4 for the {N, O}_1 detector, 3 for the {N, O}_2 detector, and 2 for the
{N, O}_5 detector. We investigated the effects of different feature sets on the
accuracy of overlapping-speech detection.
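Down-sampling the majority class by a fixed factor can be illustrated as follows. The text does not say how the N segments were selected, so this sketch assumes uniform random selection of one N segment in `factor`; the function name `downsample_majority`, the fixed seed, and the toy data are illustrative assumptions, not part of the original work.

```python
import random

# Down-sampling factors reported in the text, keyed by segment duration (s).
FACTORS = {1: 4, 2: 3, 5: 2}

def downsample_majority(samples, factor, majority="N", seed=0):
    """Keep 1/factor of the majority-class samples and all other samples.

    samples: list of (features, label) pairs.
    Selection is uniform at random (an assumption, not from the text).
    """
    rng = random.Random(seed)
    major = [s for s in samples if s[1] == majority]
    minor = [s for s in samples if s[1] != majority]
    kept = rng.sample(major, len(major) // factor)
    result = minor + kept
    rng.shuffle(result)  # avoid a class-sorted training list
    return result

# Toy data: 80 N segments and 20 O segments with dummy feature vectors.
toy = [([0.0], "N")] * 80 + [([1.0], "O")] * 20
balanced = downsample_majority(toy, FACTORS[1])
print(sum(1 for _, label in balanced if label == "N"))  # 20
print(sum(1 for _, label in balanced if label == "O"))  # 20
```

The minority class (O) is left untouched, so only the class ratio seen by the classifier changes, not the set of overlap examples.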
Table 18.4 gives the accuracy rates (N-Acc. and O-Acc. in %) of the two-class
Table 18.4 Accuracy rates of the detectors {N, O} on the Development
set according to the feature sets. In bold, the best feature set

Detectors {N, O}   Feature set   N-Acc. (%)   O-Acc. (%)   UAR (%)
{N, O}_1           IS-2010       86.7         73.9         80.3
{N, O}_1           IS-2011       87.7         72.3         80.0
{N, O}_1           IS-2012       87.8         71.6         79.7
{N, O}_2           IS-2010       85.1         75.1         80.1
{N, O}_2           IS-2011       87.3         71.6         79.5
{N, O}_2           IS-2012       87.4         71.7         79.5
{N, O}_5           IS-2010       82.7         78.7         80.7
{N, O}_5           IS-2011       84.9         75.3         80.1
{N, O}_5           IS-2012       84.0         75.7         79.8
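The UAR column (unweighted average recall) is the plain mean of the per-class rates, which can be checked against the table. The sketch below is illustrative only: the names `ROWS`, `uar`, and `best_by_detector` are assumptions, and the 0.06 tolerance allows for the table's per-class rates having been rounded to one decimal before publication.

```python
# Rows of Table 18.4: (detector, feature set, N-Acc, O-Acc, reported UAR), in %.
ROWS = [
    ("{N, O}_1", "IS-2010", 86.7, 73.9, 80.3),
    ("{N, O}_1", "IS-2011", 87.7, 72.3, 80.0),
    ("{N, O}_1", "IS-2012", 87.8, 71.6, 79.7),
    ("{N, O}_2", "IS-2010", 85.1, 75.1, 80.1),
    ("{N, O}_2", "IS-2011", 87.3, 71.6, 79.5),
    ("{N, O}_2", "IS-2012", 87.4, 71.7, 79.5),
    ("{N, O}_5", "IS-2010", 82.7, 78.7, 80.7),
    ("{N, O}_5", "IS-2011", 84.9, 75.3, 80.1),
    ("{N, O}_5", "IS-2012", 84.0, 75.7, 79.8),
]

def uar(recalls):
    """Unweighted average recall: the mean of the per-class recalls."""
    return sum(recalls) / len(recalls)

def best_by_detector(rows):
    """Return {detector: feature set with the highest reported UAR}."""
    best = {}
    for detector, feature_set, _n_acc, _o_acc, reported in rows:
        if detector not in best or reported > best[detector][1]:
            best[detector] = (feature_set, reported)
    return {detector: feature_set for detector, (feature_set, _) in best.items()}

for detector, feature_set, n_acc, o_acc, reported in ROWS:
    computed = uar([n_acc, o_acc])
    # Agreement within rounding of the published one-decimal values.
    assert abs(computed - reported) < 0.06
    print(f"{detector} {feature_set}: computed {computed:.2f}, reported {reported}")

print(best_by_detector(ROWS))
```

Because UAR weights both classes equally, it is not inflated by the larger number of N segments, unlike overall accuracy. On the reported values, IS-2010 gives the highest UAR for all three detectors.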