Digital Signal Processing Reference
fusion A • F • ALms, the most reliable and the least reliable streams are the
audio-only and the multi-stream audio-lip, respectively. The reliability
orderings are assigned according to single-modality performance. The
most promising decision fusion is (A+F+ALms)•A•F, where the
weighted summation A+F+ALms is chosen as the most reliable source
of information because it performs better under high SNR conditions, and
audio-only and face-only are chosen as the other two modalities for multilevel
Bayesian decision fusion. The benefit of multilevel Bayesian decision fusion
is clear from the performances of A • F • ALms and (A+F+ALms)•A•F.
Even though weighted summation achieves high performance for
multimodal systems, we observe further performance improvements using
a Bayesian decision tree over different likelihood streams, given prior
knowledge of the associated stream reliabilities.
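The weighted summation A+F+ALms discussed above can be illustrated with a minimal sketch. It assumes each stream yields one log-likelihood score per enrolled speaker, that scores are z-normalized per stream before fusion, and that the stream weights are supplied by hand. The function names, the normalization step, and all numbers are hypothetical and are not taken from the chapter. The sketch covers only the weighted-summation stage, not the multilevel Bayesian decision fusion.

```python
import numpy as np

def normalize(scores):
    """Z-normalize one stream's per-speaker scores (an assumed preprocessing step)."""
    scores = np.asarray(scores, dtype=float)
    std = scores.std()
    if std == 0.0:
        # All speakers scored the same: the stream carries no information.
        return np.zeros_like(scores)
    return (scores - scores.mean()) / std

def weighted_sum_fusion(stream_scores, weights):
    """Fuse per-speaker scores from several streams by weighted summation.

    stream_scores: dict mapping stream name -> per-speaker log-likelihoods
    weights:       dict mapping stream name -> nonnegative weight
    Returns the fused per-speaker score vector.
    """
    total = sum(weights[name] for name in stream_scores)
    fused = None
    for name, scores in stream_scores.items():
        # Weights are rescaled to sum to one across the streams in use.
        term = (weights[name] / total) * normalize(scores)
        fused = term if fused is None else fused + term
    return fused

def identify(stream_scores, weights):
    """Return the index of the speaker with the highest fused score."""
    return int(np.argmax(weighted_sum_fusion(stream_scores, weights)))

# Hypothetical scores for three enrolled speakers in each stream.
example_scores = {
    "A": [-10.0, -12.0, -15.0],     # audio-only
    "F": [-3.0, -2.5, -4.0],        # face-only
    "ALms": [-20.0, -19.0, -25.0],  # multi-stream audio-lip
}

# Weights favoring audio, as might suit high SNR conditions (assumed values).
clean_weights = {"A": 0.5, "F": 0.3, "ALms": 0.2}
# Weights favoring face, as might suit noisy audio (assumed values).
noisy_weights = {"A": 0.1, "F": 0.6, "ALms": 0.3}

decision_clean = identify(example_scores, clean_weights)
decision_noisy = identify(example_scores, noisy_weights)
```

With these made-up scores, shifting weight away from the audio stream changes the identified speaker, which shows how strongly the fused decision depends on the assumed stream reliabilities.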
5. CONCLUSIONS
We have presented a multimodal (audio-lip-face) speaker identification
system that improves the identification performance over unimodal schemes.
These three independent sources of information, with different reliabilities, are
combined in a multilevel decision fusion based on reliability
ordering. We observed significant improvement with WTAll decision fusion,
and a further improvement is achieved using the multilevel Bayesian
decision fusion. The reliability ordering is fixed with respect to the EER