the time from 1 to 30 s (clip duration); the row Segmentation represents
the diarization metadata of the clip: N-segments are colored in white, L-segments
in gray, and H-segments in black. The other rows contain the posterior probabilities,
presented as percentages. For the three rows {N, O}_x (x ∈ {1, 2, 5}), a segment whose
posterior probability was higher than 50 was detected as O; otherwise, it was
detected as N. The posterior probabilities associated with the {N, L, H}_5
detector are presented in the three other rows {N, L, H}_5 (N), {N, L, H}_5 (L), and
{N, L, H}_5 (H) for, respectively, nonoverlap (N), low-level-conflict (L), and
high-level-conflict (H). For a given segment, the highest probability (in bold)
corresponds to the class that was detected.
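The two decision rules described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the sample posterior values are hypothetical, and ties in the three-class case are resolved in favor of the first class listed, which the text does not specify.

```python
def detect_two_class(posteriors_o, threshold=50.0):
    """Label a segment 'O' when its class-O posterior (in percent)
    is strictly higher than the threshold, and 'N' otherwise."""
    return ["O" if p > threshold else "N" for p in posteriors_o]


def detect_three_class(posteriors_n, posteriors_l, posteriors_h):
    """Label each segment with the class (N, L, or H) that has the
    highest posterior probability."""
    labels = []
    for n, l, h in zip(posteriors_n, posteriors_l, posteriors_h):
        scores = {"N": n, "L": l, "H": h}
        # Assumption: on a tie, max() keeps the first class in N, L, H order.
        labels.append(max(scores, key=scores.get))
    return labels


# Hypothetical posteriors (percent) for six 5-s segments of a 30-s clip.
two_class = detect_two_class([12.0, 35.5, 81.2, 64.0, 20.1, 5.3])

# Hypothetical posteriors (percent) for three segments of the 3-class detector.
three_class = detect_three_class(
    [70.0, 30.0, 10.0],  # class N
    [20.0, 45.0, 30.0],  # class L
    [10.0, 25.0, 60.0],  # class H
)
```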
In Fig. 18.8, the class O was detected for segments 14 and 15 of {N, O}_1,
segments 8 and 9 of {N, O}_2, and segments 3 and 4 of {N, O}_5. Class H
was detected for segment 3 of {N, L, H}_5. There are three wrong detections:
the class O instead of N for segment 9 of {N, O}_2 and segment 4 of
{N, O}_5, and the class H instead of L for segment 3 of {N, L, H}_5.
In Fig. 18.9, the class O was detected for segments 10, 11, 16, 17, 18, 19,
21, 22, 23, 28, and 30 of {N, O}_1, for segments 5 and 8 through 15 of
{N, O}_2, and for segments 2, 4, 5, and 6 of {N, O}_5. Class H was detected for
segments 2, 4, 5, and 6 of {N, L, H}_5. There are ten wrong detections: the class O
instead of N for segments 10, 11, and 28 of {N, O}_1, segments 5, 13, and 14
of {N, O}_2, and segment 2 of {N, O}_5; the class N instead of O for segments
15 and 29 of {N, O}_1; and the class H instead of N for segment 2 of
{N, L, H}_5. We note that the decision was not wrong for segments 21, 22, and 23 of
{N, O}_1, for segments 11 and 12 of {N, O}_2, and for segment 5 of {N, O}_5 and
{N, L, H}_5: on listening, an overlap does occur from 20.3 to 22.2 s, but it
was not labeled in the metadata.
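Wrong detections of the kind enumerated above can be listed by comparing the detected labels with the reference labels segment by segment. The sketch below is illustrative only: the helper name is hypothetical, and the two label sequences are reconstructed from the description of {N, O}_5 in Fig. 18.8 (class O detected for segments 3 and 4, with segment 4 being N in the reference), not read from the figure itself.

```python
def wrong_detections(detected, reference):
    """Return (segment number, detected class, reference class) for each
    segment where the two label sequences disagree. Segments are numbered
    from 1, as in the figures."""
    return [
        (index, det, ref)
        for index, (det, ref) in enumerate(zip(detected, reference), start=1)
        if det != ref
    ]


# Six 5-s segments of a 30-s clip, reconstructed from the text for {N, O}_5.
detected = ["N", "N", "O", "O", "N", "N"]
reference = ["N", "N", "O", "N", "N", "N"]

errors = wrong_detections(detected, reference)
for segment, det, ref in errors:
    print(f"segment {segment}: class {det} instead of {ref}")
```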
18.5.2 Overlap Feature Sets
One hundred and twenty posterior probabilities were computed for each clip. These
values depend on the time and represent the temporal shape of a conflict in terms
of the overlap. There are specific temporal shapes for conflict escalation (Kim et al.
2012c), but the 797 clips of the Train set are insufficient to model these temporal
shapes. We have chosen to apply statistical functionals to the posterior probabilities;
the purpose was to obtain an overlap feature set that is related to the percentage
of overlap duration. Three functionals have been chosen: mean, correlation, and
covariance. The mean functional was applied to the posterior probabilities of
{N, O}_1, {N, O}_2, and {N, O}_5 for the class O and to the posteriors of {N, L, H}_5
for the classes N, L, and H. The correlation functional was applied between the
posterior probabilities of the class O for all combinations of {N, O}_1, {N, O}_2,
and {N, O}_5. Table 18.8 gives a list of the ten features that were computed by the
mean and correlation functionals.
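As a rough sketch of how the mean and correlation functionals could be computed for one clip, the code below derives the means and the pairwise correlations named in the paragraph above. Several points are assumptions rather than facts from the source: the correlation is taken to be the Pearson coefficient, the detectors with 2-s and 5-s segments are expanded to a common 1-s grid by repeating each value, a constant sequence is given a correlation of 0, and all function names, feature names, and sample values are hypothetical. The sketch yields only the features described in the text; the exact composition of Table 18.8 is not reproduced here.

```python
from itertools import combinations
from math import sqrt


def mean(values):
    """Arithmetic mean of a sequence of posterior probabilities."""
    return sum(values) / len(values)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # Assumption: undefined correlation is reported as 0.
    return cov / sqrt(var_x * var_y)


def expand(posteriors, seg_len):
    """Assumption: repeat each segment value seg_len times so that all
    detectors share a 1-s time grid before being correlated."""
    return [p for p in posteriors for _ in range(seg_len)]


def overlap_features(no_1, no_2, no_5, nlh_5):
    """Mean and correlation features for one clip.

    no_1, no_2, no_5: class-O posteriors of the 1-s, 2-s, and 5-s detectors.
    nlh_5: dict with the N, L, and H posteriors of the 3-class detector.
    """
    feats = {}
    for name, seq in (("NO_1", no_1), ("NO_2", no_2), ("NO_5", no_5)):
        feats[f"mean_{name}_O"] = mean(seq)
    for cls in ("N", "L", "H"):
        feats[f"mean_NLH_5_{cls}"] = mean(nlh_5[cls])
    grid = {
        "NO_1": expand(no_1, 1),
        "NO_2": expand(no_2, 2),
        "NO_5": expand(no_5, 5),
    }
    for a, b in combinations(grid, 2):
        feats[f"corr_{a}_{b}"] = pearson(grid[a], grid[b])
    return feats


# Hypothetical posteriors (percent) for a shortened 10-s clip.
feats = overlap_features(
    no_1=[10, 20, 80, 90, 70, 10, 5, 5, 60, 70],
    no_2=[15, 85, 40, 5, 65],
    no_5=[55, 30],
    nlh_5={"N": [40, 70], "L": [35, 20], "H": [25, 10]},
)
```

Because the posteriors are percentages, the mean of the class-O posteriors gives a value that tracks the share of the clip detected as overlap, which is the stated purpose of the feature set.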
The covariance functional is a functional of a functional: it was applied to the mean
and correlation functionals. The purpose of this functional is to reveal the cofactors.