Fig. 18.1 Clip occurrence on the Train set as a function of the CSR
(a) the overlap segment duration, (b) the clip overlap duration as the summation of
each overlap segment duration of the clip, (c) the mean overlap duration of a clip as
the ratio of the clip overlap duration to the number of overlaps occurring in the clip,
and (d) the percentage of overlap duration of the clip as the ratio of the clip overlap
duration to the clip duration.
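The four quantities (a) to (d) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `overlap_stats` and the representation of a clip as a duration plus a list of `(start, end)` overlap segments in seconds are hypothetical, not the SSPNet annotation format.

```python
def overlap_stats(clip_duration, overlaps):
    """Compute the overlap measures (a)-(d) for one clip.

    clip_duration: clip length in seconds (assumed > 0).
    overlaps: list of (start, end) tuples in seconds, one per overlap
              segment (hypothetical representation).
    """
    # (a) duration of each overlap segment
    segment_durations = [end - start for start, end in overlaps]
    # (b) clip overlap duration: sum of the segment durations
    clip_overlap = sum(segment_durations)
    # (c) mean overlap duration: clip overlap / number of overlaps
    #     (defined here as 0.0 when the clip has no overlap)
    n = len(segment_durations)
    mean_overlap = clip_overlap / n if n else 0.0
    # (d) percentage of the clip duration covered by overlap
    percent_overlap = 100.0 * clip_overlap / clip_duration
    return {
        "segment_durations": segment_durations,
        "clip_overlap": clip_overlap,
        "mean_overlap": mean_overlap,
        "percent_overlap": percent_overlap,
    }


# Toy example: a 30 s clip with two overlap segments.
stats = overlap_stats(30.0, [(1.0, 3.0), (10.0, 11.0)])
```

Returning 0.0 for the mean when a clip has no overlap is a convention chosen for this sketch; the text does not say how such clips are handled.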
18.2.2 SSPNet Train Set Statistics
We analyzed the statistics of the SSPNet database Train set, focusing on the
main characteristics of overlap segments; some statistics of the moderator were also
investigated. The Train set includes 793 clips and has a total duration of 23,774 s
(two clips are shorter than 30 s), with 82 speakers (one moderator and 81
participants).
We analyzed the 4,143 segments, totalling 23,774 s, that were obtained from the
clip diarization given in the SSPNet database. These segments were split according
to the number of speakers occurring in the segment: (1) 34 segments with a total
duration of 89.9 s, which correspond to gaps in which nobody is speaking, (2) 2,638
segments with a total duration of 20,083.5 s, in which a single subject is speaking,
and (3) 1,471 segments with a total duration of 3,600.6 s, in which two subjects are
speaking. No segment was identified that had three or more speakers.
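As a quick consistency check on the breakdown above, the three categories can be summed and compared with the reported totals. The counts and durations are taken from the text; the variable names are illustrative only.

```python
# Segment statistics from the text, keyed by number of simultaneous speakers:
# value = (number of segments, total duration in seconds)
segments = {
    0: (34, 89.9),        # gaps, nobody speaking
    1: (2638, 20083.5),   # a single subject speaking
    2: (1471, 3600.6),    # two subjects speaking (overlap)
}

# Totals across the three categories
total_count = sum(count for count, _ in segments.values())
total_duration = sum(duration for _, duration in segments.values())

# Share of the total duration taken by each category, in percent
duration_share = {
    n_speakers: 100.0 * duration / total_duration
    for n_speakers, (_, duration) in segments.items()
}
```

The counts add up to the 4,143 segments and the durations to the 23,774 s reported for the Train set; overlapping speech accounts for roughly 15% of that duration.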
Figure 18.2 shows the histogram for each CSR of the average of the number
of interruptions (i.e., the segments of overlapping speech) of the CSR clips. The
horizontal dashed line represents the average of the number of interruptions of the
Train set clips. Except for the CSR [−1, 0[, all of the CSRs of LLC have a mean
number of interruptions that is below the average value (1.85 = 1,471/793). The
HLC clips have more interruptions than the LLC clips.
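The comparison described here amounts to averaging the number of interruptions per CSR and checking each average against the Train set mean. A minimal sketch follows; the Train set mean uses the figures from the text, whereas the helper name `mean_interruptions_by_csr`, the CSR labels, and the toy clip data are hypothetical and do not reproduce the values plotted in Figure 18.2.

```python
from collections import defaultdict


def mean_interruptions_by_csr(clips):
    """Average number of interruptions per CSR.

    clips: iterable of (csr_label, n_interruptions) pairs, one per clip
           (hypothetical representation).
    """
    totals = defaultdict(lambda: [0, 0])  # csr -> [sum of interruptions, clip count]
    for csr, n_interruptions in clips:
        totals[csr][0] += n_interruptions
        totals[csr][1] += 1
    return {csr: total / count for csr, (total, count) in totals.items()}


# Train set average from the text: 1,471 overlap segments over 793 clips.
train_average = 1471 / 793

# Hypothetical toy data, NOT values from the SSPNet corpus.
toy_clips = [("[-2, -1[", 1), ("[-2, -1[", 0), ("[1, 2[", 3), ("[1, 2[", 4)]
per_csr = mean_interruptions_by_csr(toy_clips)

# True where a CSR's mean lies below the Train set average (the dashed line).
below_average = {csr: mean < train_average for csr, mean in per_csr.items()}
```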