Digital Signal Processing Reference
[Figure: trellis over time t. Vertical axis: Audio Event 1 (ASE 1, ASE 2), Audio Event 2 (ASE 1, ASE 2, ASE 3), Audio Event 3 (ASE 1, ASE 2).]
Fig. 7.13 Viterbi search for the optimal audio event sequence: trellis diagram for the hierarchical
recognition of audio events that consist of audio sub-events (ASE). The backtracking path is shown
over time, and squares represent feature vector observations. HMMs (one per ASE) are shown
schematically in Bakis topology. After backtracking, the sequence of audio events 2, 1, 3 is recognised.
current step in time. In this way, the 'beam width' is broadened or narrowed according
to whether the scores of the competing paths rise or fall. This width determines
the trade-off between higher accuracy (broader beam) and higher speed (narrower
beam).
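The accuracy versus speed trade-off can be illustrated with a minimal pruning sketch. It assumes log-domain path scores and a fixed beam width measured relative to the best active path; the adaptive widening and narrowing rule described above is not reproduced here. The function name `prune_to_beam` and the state labels are hypothetical.

```python
def prune_to_beam(path_scores, beam_width):
    """Keep only hypotheses whose log score lies within beam_width of the best.

    path_scores: dict mapping a (hypothetical) state id to its log score
    beam_width:  non-negative log-score margin; larger values keep more paths
    """
    if not path_scores:
        return {}
    best = max(path_scores.values())
    return {s: v for s, v in path_scores.items() if v >= best - beam_width}


# Illustrative scores for four active paths at the current time step.
scores = {"e1_s0": -10.0, "e1_s1": -12.5, "e2_s0": -19.0, "e3_s0": -31.0}
narrow = prune_to_beam(scores, beam_width=5.0)    # fewer paths: faster, riskier
wide = prune_to_beam(scores, beam_width=25.0)     # more paths: slower, safer
```

With the narrow beam only the two best-scoring paths survive, whereas the wide beam keeps all four, so a path that is currently weak can still win later.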
Subsequently, at audio event transitions, the value of the LM is added to the
path score and the search jumps to the first state of the first model of the new audio
event. In addition, the required backtracking information is stored.
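The transition step can be sketched as follows. The sketch assumes log-domain scores and an LM given as a matrix of event-to-event log probabilities (a bigram-style LM, which is an assumption of this example). The names `enter_new_events`, `exit_scores` and `lm_log_probs` are hypothetical.

```python
import numpy as np

def enter_new_events(exit_scores, lm_log_probs):
    """Score the jump into the first state of each new audio event.

    exit_scores:  shape (E,), log score of the best path leaving event i
                  at the current time step
    lm_log_probs: shape (E, E), entry [i, j] is the LM log score of
                  event j following event i (assumed bigram-style LM)
    Returns (entry_scores, best_prev), both of shape (E,).
    """
    candidates = exit_scores[:, None] + lm_log_probs    # rows: previous, cols: new
    best_prev = np.argmax(candidates, axis=0)            # stored for backtracking
    entry_scores = candidates[best_prev, np.arange(candidates.shape[1])]
    return entry_scores, best_prev


exit_scores = np.array([-5.0, -6.0, -20.0])
lm = np.log(np.array([[0.1, 0.6, 0.3],
                      [0.5, 0.1, 0.4],
                      [0.3, 0.3, 0.4]]))
entry_scores, best_prev = enter_new_events(exit_scores, lm)
```

Here `best_prev` plays the role of the stored backtracking information: for each newly entered event it records which preceding event gave the best score.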
Finally, at the end of the observation sequence, the best audio event sequence is
obtained by the usual backtracking, and the recognised audio events and their boundaries are output.
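A minimal sketch of the search with backtracking, followed by the output of events and boundaries, is given below. It is a simplification under stated assumptions: the state space is flat, scores are in the log domain, no beam pruning or LM is applied, and each state maps to exactly one event, so the ASE hierarchy is not modelled. The toy scores and all function and variable names are hypothetical.

```python
import numpy as np

def viterbi_backtrack(log_init, log_trans, log_obs):
    """Log-domain Viterbi search followed by backtracking.

    log_init:  (S,)   log initial state scores
    log_trans: (S, S) log transition scores, entry [i, j] for i -> j
    log_obs:   (T, S) log observation scores per frame and state
    Returns the best state index per frame as a list of length T.
    """
    T, S = log_obs.shape
    delta = log_init + log_obs[0]
    backptr = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        cand = delta[:, None] + log_trans            # rows: from, cols: to
        backptr[t] = np.argmax(cand, axis=0)         # best predecessor per state
        delta = cand[backptr[t], np.arange(S)] + log_obs[t]
    path = [int(np.argmax(delta))]                   # best final state
    for t in range(T - 1, 0, -1):                    # follow pointers backwards
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]


def events_with_boundaries(path, state_to_event):
    """Collapse a frame-level state path into (event, start, end) segments."""
    segments = []
    for t, s in enumerate(path):
        ev = state_to_event[s]
        if segments and segments[-1][0] == ev:
            segments[-1] = (ev, segments[-1][1], t)  # extend current segment
        else:
            segments.append((ev, t, t))              # open a new segment
    return segments


# Toy data: 6 frames, 3 states; each frame strongly favours one state.
favoured = [1, 1, 0, 0, 2, 2]
probs = np.full((6, 3), 0.1)
probs[np.arange(6), favoured] = 0.8
log_obs = np.log(probs)
log_init = np.log(np.full(3, 1.0 / 3.0))
log_trans = np.log(np.full((3, 3), 1.0 / 3.0))

path = viterbi_backtrack(log_init, log_trans, log_obs)
segments = events_with_boundaries(path, {0: 1, 1: 2, 2: 3})
```

In this toy setting the recovered event order is 2, 1, 3, with frame indices as boundaries. Because adjacent frames of the same event are merged, the sketch cannot distinguish an event that occurs twice in direct succession; a hierarchical search would use the stored transition information for that.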
In practical applications, this particularly efficient search algorithm can reduce
the number of states to be computed by a ratio of 1:1 000 [36]. The overall
approach integrates knowledge from different levels of the hierarchy to
avoid premature wrong decisions.
7.4 Ensemble Learning
Up to now, a number of learning algorithms have been presented. To benefit from
their diverse advantages, one can aim at a synergistic heterogeneous combination
of them. Alternatively, or in addition, a homogeneous combination of the same
 