Cross-lingual experiments with phone recognition
- 1 January 1993
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 2, pp. 507-510
- https://doi.org/10.1109/icassp.1993.319353
Abstract
Research on speaker-independent continuous phone recognition for both French and English is presented. The phone accuracy is assessed on the BREF corpus for French, and on the Wall Street Journal (WSJ) and TIMIT corpora for English. Cross-language differences concerning language properties are presented. It is found that French is easier to recognize at the phone level (the phone error for BREF is 23.6% vs. 30.1% for WSJ), but harder to recognize at the lexical level due to the larger number of homophones. Experiments with signal analysis indicate that a 4 kHz signal bandwidth is sufficient for French, whereas 8 kHz is needed for English. Phone recognition is a powerful technique for language, sex, and speaker identification. With 2 s of speech, the language can be identified with better than 99% accuracy. Sex identification for BREF and WSJ is error-free. Speaker identification accuracies of 98.2% on TIMIT (462 speakers) and 99.1% on BREF (57 speakers) were obtained with one utterance per speaker. Accuracies of 100% were obtained with two utterances per speaker.
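The abstract reports phone error figures without stating how they were scored. As a minimal sketch, assuming the conventional definition (substitutions plus deletions plus insertions, divided by the number of reference phones, found by edit-distance alignment), the measure could be computed as follows. The function name `phone_error_rate` and the example phone labels are hypothetical and are not taken from the paper.

```python
def phone_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / len(reference).

    Assumes the conventional edit-distance scoring; the paper's exact
    scoring procedure is not given in the abstract.
    """
    n, m = len(reference), len(hypothesis)
    if n == 0:
        raise ValueError("reference must contain at least one phone")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and hypothesis[:j].
    prev = list(range(m + 1))
    for i in range(1, n + 1):
        curr = [i] + [0] * m
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference phone
                curr[j - 1] + 1,     # insertion of a hypothesis phone
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[m] / n


# Hypothetical phone strings: one substitution (p -> b) and one deletion (ch).
ref = ["s", "p", "iy", "ch"]
hyp = ["s", "b", "iy"]
print(phone_error_rate(ref, hyp))  # 0.5
```

Under this definition, phone accuracy is one minus the error rate, so the reported 23.6% phone error for BREF would correspond to 76.4% phone accuracy.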