in gray, and H-segments in black. The other rows contain the relabeling according
to the various detectors. For the three rows {N, O}_x (x ∈ {1, 2, 5}), a segment is
labeled O when it contains a part of an overlap and N otherwise. For the three rows
{N, L, H}_x (x ∈ {1, 2, 5}), overlap segments are labeled according to the conflict
level of the clip: L for LLC-Ov and H for HLC-Ov.
Figure 18.5 gives an instance of metadata relabeling for the LLC clip
#Train_0001. For this clip, an LLC-Ov occurs between 13.01 and 14.4 s. The relabeling
is O for segments 14 and 15 of {N, O}_1, segments 7 and 8 of {N, O}_2, and
segment 3 of {N, O}_5. The relabeling is L for segments 14 and 15 of {N, L, H}_1,
segments 7 and 8 of {N, L, H}_2, and segment 3 of {N, L, H}_5.
Figure 18.6 gives an instance of metadata relabeling for the HLC clip
#Train_0006. For this clip, HLC-Ovs occur between 14.9 and 18.9 s and between 28.3
and 30 s. The relabeling is O for segments 15, 16, 17, 18, 19, 29, and 30 of
{N, O}_1, for segments 8, 9, 10, and 15 of {N, O}_2, and for segments 3, 4, and
6 of {N, O}_5. The relabeling is H for segments 15, 16, 17, 18, 19, 29, and 30
of {N, L, H}_1, for segments 7, 8, 9, 10, and 15 of {N, L, H}_2, and for
segments 3, 4, and 6 of {N, L, H}_5.
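The relabeling rule in these two examples can be sketched in a few lines of Python. This is only an illustration, not the authors' tooling: the function names, the 1-based segment numbering with segment k covering [(k-1)d, kd), and the 30 s clip length are assumptions inferred from the segment numbers quoted above.

```python
import math

def overlapping_segments(start_s, end_s, seg_len_s):
    """1-based indices of fixed-length segments touched by [start_s, end_s).

    Assumes segment k covers [(k - 1) * seg_len_s, k * seg_len_s)
    and that end_s > start_s.
    """
    first = math.floor(start_s / seg_len_s) + 1
    last = math.ceil(end_s / seg_len_s)
    return list(range(first, last + 1))

def relabel(clip_len_s, overlaps, seg_len_s, overlap_label="O"):
    """Label every segment of a clip: 'N' by default, overlap_label if it
    contains any part of an overlap. Use 'O' for the {N, O} detectors and
    'L' or 'H' (the conflict level of the clip) for the {N, L, H} detectors.
    """
    n_segments = math.ceil(clip_len_s / seg_len_s)
    labels = ["N"] * n_segments
    for start_s, end_s in overlaps:
        for k in overlapping_segments(start_s, end_s, seg_len_s):
            if 1 <= k <= n_segments:
                labels[k - 1] = overlap_label
    return labels

# LLC clip #Train_0001: one overlap between 13.01 and 14.4 s.
# The 30 s clip length is an assumption made for this sketch.
for seg_len in (1, 2, 5):
    labels = relabel(30, [(13.01, 14.4)], seg_len, overlap_label="L")
    hits = [k for k, y in enumerate(labels, start=1) if y == "L"]
    print(seg_len, hits)
```

For the LLC clip, the loop reports segments 14 and 15 at 1 s, 7 and 8 at 2 s, and 3 at 5 s, the same segments as in the text.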
18.4.2 Two-Class {N, O} Classifiers
Using relabeling, three two-class SVMs ({N, O}_1, {N, O}_2, {N, O}_5) were
estimated on the Train set. Each SVM classifies a segment of a given duration
(1, 2, or 5 s) as overlap (O) or Non-Ov (N). To account for the imbalanced
class distribution, the over-represented category (N) was down-sampled by a given
factor: a factor of 4 for the {N, O}_1 detector, a factor of 3 for the
{N, O}_2 detector, and a factor of 2 for the {N, O}_5 detector. We investigated the
effects of different feature sets on the accuracy of the overlap speech detection.
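As a rough illustration of the down-sampling step, the sketch below keeps one N segment out of every `factor` and leaves the O segments untouched. The text does not say how the retained N segments were chosen, so the random selection, the function name `downsample_majority`, and the fixed seed are assumptions made for this example.

```python
import random

def downsample_majority(samples, labels, majority="N", factor=4, seed=0):
    """Keep about 1/factor of the majority-class items and all other items.

    Random selection is an assumption; the source only gives the factor.
    """
    rng = random.Random(seed)
    majority_idx = [i for i, y in enumerate(labels) if y == majority]
    kept = set(rng.sample(majority_idx, len(majority_idx) // factor))
    return [
        (x, y)
        for i, (x, y) in enumerate(zip(samples, labels))
        if y != majority or i in kept
    ]

# Toy data: 40 non-overlap segments and 10 overlap segments.
toy_labels = ["N"] * 40 + ["O"] * 10
toy_samples = list(range(len(toy_labels)))
balanced = downsample_majority(toy_samples, toy_labels, factor=4)
print(sum(1 for _, y in balanced if y == "N"),
      sum(1 for _, y in balanced if y == "O"))
```

With a factor of 4, the toy set goes from 40 N and 10 O segments to 10 of each.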
Table 18.4 gives the accuracy rates (N-Acc. and O-Acc. in %) of the two-class
Table 18.4 Accuracy rates of the {N, O} detectors on the Development
set according to the feature sets. In bold, the best feature set

Detector    Feature set   N-Acc. (%)   O-Acc. (%)   UAR (%)
{N, O}_1    IS-2010       86.7         73.9         80.3
{N, O}_1    IS-2011       87.7         72.3         80.0
{N, O}_1    IS-2012       87.8         71.6         79.7
{N, O}_2    IS-2010       85.1         75.1         80.1
{N, O}_2    IS-2011       87.3         71.6         79.5
{N, O}_2    IS-2012       87.4         71.7         79.5
{N, O}_5    IS-2010       82.7         78.7         80.7
{N, O}_5    IS-2011       84.9         75.3         80.1
{N, O}_5    IS-2012       84.0         75.7         79.8
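
The UAR column can be cross-checked against the two per-class accuracies, since the unweighted average recall is the plain mean of the class-wise rates. The sketch below recomputes it from the rounded table values; the variable names are illustrative, and differences of a few hundredths from the printed UAR are expected because the published figures were presumably rounded from unrounded rates.

```python
def uar(per_class_rates):
    """Unweighted average recall: the plain mean of the class-wise rates."""
    return sum(per_class_rates) / len(per_class_rates)

# (detector, feature set, N-Acc. %, O-Acc. %) copied from Table 18.4.
rows = [
    ("{N,O}_1", "IS-2010", 86.7, 73.9),
    ("{N,O}_1", "IS-2011", 87.7, 72.3),
    ("{N,O}_1", "IS-2012", 87.8, 71.6),
    ("{N,O}_2", "IS-2010", 85.1, 75.1),
    ("{N,O}_2", "IS-2011", 87.3, 71.6),
    ("{N,O}_2", "IS-2012", 87.4, 71.7),
    ("{N,O}_5", "IS-2010", 82.7, 78.7),
    ("{N,O}_5", "IS-2011", 84.9, 75.3),
    ("{N,O}_5", "IS-2012", 84.0, 75.7),
]

# For each detector, keep the feature set with the highest recomputed UAR.
best = {}
for detector, feature_set, n_acc, o_acc in rows:
    score = uar([n_acc, o_acc])
    if detector not in best or score > best[detector][1]:
        best[detector] = (feature_set, score)

for detector, (feature_set, score) in best.items():
    print(detector, feature_set, f"{score:.2f}")
```

Ranking by UAR rather than by overall accuracy keeps the minority class (O) from being swamped by the much more frequent N segments.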
 