Table 18.6 Information gain of the five best-ranked LLDs of the IS-2010 audio
feature set in discriminating LLC-Ovs and HLC-Ovs

Low-level descriptors (LLD)           Inf. gain   Rank (/38)
Log power [3,934-5,649 Hz]            0.130       1
Log power [2,682-3,934 Hz]            0.119       2
Log power [1,768-2,682 Hz]            0.107       3
Normalized loudness                   0.102       4
Log power [5,649-8,000 Hz]            0.102       5
Table 18.7 Information gain of the five best-ranked LLDs of the IS-2010 audio
feature set in discriminating Ovs and Non-Ovs

Low-level descriptors (LLD)           Inf. gain   Rank (/38)
Fundamental frequency (F0)            0.141       1
Log power [614-1,101 Hz]              0.129       2
Log power [0-259 Hz]                  0.127       3
Jitter (DDP)                          0.124       4
First mel-frequency cepstral coef.    0.121       5
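
As an illustration of how an information gain score of this kind can be
obtained, the following minimal sketch computes IG = H(class) - H(class | feature)
for one continuous descriptor. The equal-width discretization, the number of
bins, and the function names are assumptions made for the example; the source
does not state how the LLD values were discretized or which tool was used.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(values, labels, n_bins=10):
    """Information gain of a continuous feature with respect to class labels.

    Assumption: the feature is discretized into n_bins equal-width bins.
    Other discretization schemes would give different scores.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid division by zero for constant features
    bins = {}
    for v, y in zip(values, labels):
        idx = min(int((v - lo) / width), n_bins - 1)  # clamp the maximum into the last bin
        bins.setdefault(idx, []).append(y)
    n = len(labels)
    # Conditional entropy: per-bin entropies weighted by bin size.
    conditional = sum(len(ys) / n * entropy(ys) for ys in bins.values())
    return entropy(labels) - conditional
```

Ranking the 38 LLDs then amounts to computing this score for each descriptor on
the Train set and sorting in decreasing order. A feature that separates the two
classes perfectly reaches the class entropy (1 bit for balanced classes), and a
feature that carries no class information scores 0.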
Table 18.7 gives the information gain, computed on the Train set, of the five
best-ranked LLDs (out of 38) in discriminating Ovs from Non-Ovs. According to
the information gain ranking, the most relevant LLDs are the fundamental
frequency, the logarithmic powers (especially in low-frequency bands), the
jitter, and the first mel-frequency cepstral coefficient. The usual
representation techniques and algorithms are designed and interpreted for speech
signals in which a single subject is speaking. In overlapping speech, in which
two or more subjects are speaking, the usual algorithms (e.g., the pitch
algorithm) are not suited to the signal: computing a single fundamental
frequency is not meaningful in that case, and the output of that computation was
shown to be the most discriminant cue for Ov/Non-Ov detection. For a speech
representation such as the logarithmic power in the mel-frequency bands, the
low-frequency bands in which the first two formants of the speaker occur were
also shown to be discriminant. Last, the jitter DDP (difference of differences
of periods), which is related to the pitch, and the first mel-frequency cepstral
coefficient, which is related to the energy of the segment, were also shown to
be relevant for Ov/Non-Ov discrimination.
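
To make the "difference of differences of periods" concrete, here is a minimal
sketch of a DDP jitter measure. It assumes the common definition in which the
mean absolute second-order difference of consecutive pitch periods is divided
by the mean period; the exact frame-level computation used for the IS-2010
feature set may differ, and the function name is hypothetical.

```python
def jitter_ddp(periods):
    """DDP jitter of a sequence of consecutive pitch periods (e.g., in seconds).

    Assumed definition: mean of |(T[i+1] - T[i]) - (T[i] - T[i-1])|
    divided by the mean period. Requires at least three periods.
    """
    if len(periods) < 3:
        raise ValueError("at least three pitch periods are required")
    second_diffs = [
        abs((periods[i + 1] - periods[i]) - (periods[i] - periods[i - 1]))
        for i in range(1, len(periods) - 1)
    ]
    mean_period = sum(periods) / len(periods)
    return (sum(second_diffs) / len(second_diffs)) / mean_period
```

A perfectly periodic signal, or one whose period drifts linearly, gives a value
of 0. Irregular period sequences, such as those a single-pitch tracker tends to
produce when two voices overlap, give larger values.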
18.5 Conflict Detector
Overlap detectors were developed and assessed so that the knowledge they provide
can be incorporated into an improved conflict detector (conflict/nonconflict).
Incorporating prior knowledge (Krupka and Tishby 2007; Li et al. 2008) into
classification systems has increased performance in many pattern recognition
applications (e.g., biomedical images, pathological voices). Various methods
have been developed for neural network systems (Chen et al. 2000) and SVM
classifiers (Decoste and Schölkopf 2002; Lauer and Bloch 2008). As defined by
Schölkopf and Smola (2001), the methods developed for including prior knowledge
in an
 