Digital Signal Processing Reference
In-Depth Information
A number of further measures and search functions exist. One can also add
combinations or alterations of features during the search, usually by random
injection or genetic algorithms, to limit the search space [93, 105-107].
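As a rough illustration of a genetic search, the following minimal sketch evolves feature-inclusion masks. It is an assumption-laden toy, not the method of the cited works: the function name genetic_feature_search, its parameters, and the scoring function toy_score are all hypothetical, and the sketch only selects among existing features rather than generating new combinations of them.

```python
import random


def genetic_feature_search(n_features, score, pop_size=20, generations=30,
                           mutation_rate=0.05, seed=0):
    """Search for a 0/1 feature-inclusion mask that maximizes score(mask).

    Hypothetical helper for illustration; requires n_features >= 2
    and pop_size >= 4.
    """
    rng = random.Random(seed)
    # Random initial population of inclusion masks.
    population = [tuple(rng.randint(0, 1) for _ in range(n_features))
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        parents = ranked[:pop_size // 2]        # truncation selection
        children = [ranked[0]]                  # elitism: keep the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: flip each flag with a small probability.
            child = tuple(bit ^ (rng.random() < mutation_rate)
                          for bit in child)
            children.append(child)
        population = children
    return max(population, key=score)


# Toy objective (an assumption): features 0, 2 and 5 are useful,
# and every selected feature costs a small penalty.
USEFUL = {0, 2, 5}


def toy_score(mask):
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in USEFUL)
    return hits - 0.1 * sum(mask)


best = genetic_feature_search(8, toy_score)
print(best, toy_score(best))
```

In practice the score would be a classifier-based or filter-based relevance measure evaluated on held-out data, which is far more expensive than this toy objective; the population size and number of generations then bound how much of the search space is visited.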
If the aim is merely to compress the feature space by reduction rather than
selection, meaning that the full original feature set still has to be extracted,
principal component analysis (PCA), linear discriminant analysis (LDA), or
similar transforms can be employed (cf. [108]).
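A minimal sketch of such a reduction with PCA is given below, assuming NumPy is available. The function name pca_reduce and the synthetic data are illustrative assumptions only; the principal directions are obtained from a singular value decomposition of the mean-centered feature matrix.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X (instances x features) onto the top components.

    Hypothetical helper for illustration. Returns the projected data,
    the principal directions, and the variance along each direction.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                    # center each feature
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    explained = (S ** 2) / (len(X) - 1)        # variance per direction
    return Xc @ components.T, components, explained[:n_components]


# Synthetic example (an assumption): six observed features that are
# noisy linear mixtures of two latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 6))

Z, components, explained = pca_reduce(X, n_components=2)
print(Z.shape)
```

Each column of Z is a weighted sum of all six input features, which is why the complete original feature set must still be extracted before the projection can be applied.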
References
1. Parsons, T.: Voice and Speech Processing. McGraw-Hill (1987)
2. Ruske, G.: Automatische Spracherkennung, 2nd edn. Methoden der Klassifikation und
Merkmalsextraktion. Oldenbourg, Munich (1993)
3. Oppenheim, A.V., Willsky, A.S., Nawab, S.H.: Signals and Systems, 2nd edn. Prentice Hall
(1996)
4. Wendemuth, A.: Grundlagen der digitalen Signalverarbeitung: Ein Mathematischer Zugang.
Springer, Berlin (2005)
5. Wendemuth, A.: Grundlagen der stochastischen Sprachverarbeitung. Oldenbourg, München,
Wien (2004)
6. Deller, J., Proakis, J., Hansen, J.: Discrete-Time Processing of Speech Signals. Macmillan
Publishing Company, New York (1993)
7. O'Shaughnessy, D.: Speech Communication, 2nd edn. Addison-Wesley (1990)
8. Schuller, B., Rigoll, G.: Timing levels in segment-based speech emotion recognition. In:
Proceedings of the 9th International Conference on Spoken Language Processing,
INTERSPEECH 2006, ICSLP, ISCA, pp. 1818-1821, Pittsburgh, Sep 2006
9. Schuller, B., Wimmer, M., Mösenlechner, L., Kern, C., Arsic, D., Rigoll, G.: Brute-forcing
hierarchical functionals for paralinguistics: a waste of feature space? In: Proceedings of the
33rd IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP
2008, IEEE, pp. 4501-4504, Las Vegas, NV, April 2008
10. Sohn, J., Kim, N.: A statistical model-based voice activity detection. IEEE Signal Process.
Lett. 6 (1), 1-3 (1999)
11. Ramirez, J., Segura, J., Benitez, M., De La Torre, A., Rubio, A.: Efficient voice activity
detection algorithms using long-term speech information. Speech Commun. 42 (3), 271-287
(2004)
12. Ramirez, J., Segura, J., Benitez, C., Garcia, L., Rubio, A.: Statistical voice activity detection
using a multiple observation likelihood ratio test. IEEE Signal Process. Lett. 12 (10), 689-692
(2005)
13. Gemello, R., Mana, F., De Mori, R.: Non-linear estimation of voice activity to improve
automatic recognition of noisy speech. In: Proceedings of INTERSPEECH 2005, ISCA, pp.
2617-2620, Lisbon, Sept 2005
14. Mousazadeh, S., Cohen, I.: AR-GARCH in presence of noise: parameter estimation and its
application to voice activity detection. IEEE Trans. Audio Speech Lang. Process. 19 (4),
916-926 (2011)
15. Zwicker, E., Fastl, H.: Psychoacoustics—Facts and Models, 2nd edn. Springer, Berlin (1999)
16. Kießling, A.: Extraktion und Klassifikation prosodischer Merkmale in der automatischen
Sprachverarbeitung. Berichte aus der Informatik. Shaker, Aachen (1997)
17. Furui, S.: Digital Speech Processing, Synthesis, and Recognition. Signal Processing and
Communications, 2nd edn. Marcel Dekker Inc, New York (1996)
18. Schuller, B.: Automatische Emotionserkennung aus sprachlicher und manueller Interaktion.
Doctoral thesis, Technische Universität München, Munich, Germany, June (2006)
19. Fant, G.: Speech Sounds and Features. MIT Press, Cambridge (1973)
 