The effects of handset variability on speaker recognition performance: experiments on the Switchboard corpus
- 23 December 2002
- proceedings article
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 1, 113-116
- https://doi.org/10.1109/icassp.1996.540303
Abstract
This paper presents an empirical study of the effects of handset variability on text-independent speaker recognition performance using the Switchboard corpus. Handset variability occurs when training speech is collected using one type of handset, but a different handset is used for collecting test speech. For the Switchboard corpus, the calling telephone number associated with a file is used to infer the handset used. Analysis of experiments designed to focus on handset variability on the SPIDRE database and the May95 NIST speaker recognition evaluation database shows that a performance gap between matched and mismatched handset tests persists even after applying several standard channel compensation techniques. Error rates for the mismatched tests are more than four times those for the matched tests. Finally, a new energy-dependent cepstral mean subtraction technique is proposed to compensate for nonlinear distortions, but it is not found to improve performance on the databases used.
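As a rough illustration of the compensation idea named in the abstract, the sketch below shows standard cepstral mean subtraction (CMS), which removes a constant per-coefficient offset such as the one a fixed linear channel adds in the cepstral domain. The abstract does not describe how the energy-dependent variant is constructed, so `energy_dependent_cms` is only one hypothetical reading: frames are grouped into equal-width energy bins and each bin is normalized by its own mean. The function names, the `n_bins` parameter, and the binning rule are assumptions made for this example, not the paper's method.

```python
from statistics import fmean


def cepstral_mean_subtraction(frames):
    """Standard CMS: subtract each coefficient's mean taken over all frames."""
    if not frames:
        return []
    dim = len(frames[0])
    means = [fmean(f[d] for f in frames) for d in range(dim)]
    return [[f[d] - means[d] for d in range(dim)] for f in frames]


def energy_dependent_cms(frames, energies, n_bins=2):
    """Hypothetical variant (assumption, not from the paper):
    split frames into equal-width energy bins and apply CMS per bin."""
    if not frames:
        return []
    if len(frames) != len(energies):
        raise ValueError("need exactly one energy value per frame")
    lo, hi = min(energies), max(energies)
    # Fall back to a width of 1.0 when every frame has the same energy.
    width = (hi - lo) / n_bins or 1.0
    bins = [min(int((e - lo) / width), n_bins - 1) for e in energies]
    out = [None] * len(frames)
    for b in set(bins):
        idx = [i for i, bi in enumerate(bins) if bi == b]
        normalized = cepstral_mean_subtraction([frames[i] for i in idx])
        for i, row in zip(idx, normalized):
            out[i] = row
    return out
```

With a single bin the variant reduces to standard CMS; with more bins, low-energy and high-energy frames receive different offsets, which is one way a level-dependent (nonlinear) distortion could be approximated by piecewise-constant corrections.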