Handset-dependent background models for robust text-independent speaker recognition

22 November 2002

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. 2 (15206149), 1071-1074
https://doi.org/10.1109/icassp.1997.596126

Abstract

This paper studies the effects of handset distortion on telephone-based speaker recognition performance, resulting in the following observations: (1) the major factor in speaker recognition errors is whether the handset type (e.g., electret, carbon) is different across training and testing, not whether the telephone lines are mismatched, (2) the distribution of speaker recognition scores for true speakers is bimodal, with one mode dominated by matched handset tests and the other by mismatched handsets, (3) cohort-based normalization methods derive much of their performance gains from implicitly selecting cohorts trained with the same handset type as the claimant, and (4) utilizing a handset-dependent background model which is matched to the handset type of the claimant's training data sharpens and separates the true and false speaker score distributions. Results on the 1996 NIST Speaker Recognition Evaluation corpus show that using handset-matched background models reduces false acceptances (at a 10% miss rate) by more than 60% over previously reported (handset-independent) approaches.
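Observation (4) can be illustrated with a minimal sketch of handset-matched score normalization: the claimant's log-likelihood is normalized by a background model chosen according to the handset type of the claimant's training data. The abstract does not specify the model family or any implementation detail, so everything below is an assumption made for illustration. That includes the single diagonal-covariance Gaussian standing in for each model, the two-dimensional toy features, and the names `diag_gauss_loglik`, `handset_normalized_score`, and `train_handset`.

```python
import math


def diag_gauss_loglik(frames, mean, var):
    """Average per-frame log-likelihood under a diagonal-covariance Gaussian.

    Assumption: one Gaussian per model keeps the sketch short; it is a
    stand-in, not the model used in the paper.
    """
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)


def handset_normalized_score(frames, claimant, backgrounds):
    """Log-likelihood ratio of the claimant model against the background
    model that matches the handset type of the claimant's training data."""
    background = backgrounds[claimant["train_handset"]]  # hypothetical field name
    claimant_ll = diag_gauss_loglik(frames, claimant["mean"], claimant["var"])
    background_ll = diag_gauss_loglik(frames, background["mean"], background["var"])
    return claimant_ll - background_ll


# Toy data (hypothetical values): one background model per handset type.
backgrounds = {
    "electret": {"mean": [0.0, 0.0], "var": [1.0, 1.0]},
    "carbon": {"mean": [2.0, 2.0], "var": [1.0, 1.0]},
}
claimant = {"train_handset": "carbon", "mean": [2.5, 2.0], "var": [1.0, 1.0]}
test_frames = [[2.6, 1.9], [2.4, 2.1]]

# Score normalized by the handset-matched (carbon) background model.
matched_score = handset_normalized_score(test_frames, claimant, backgrounds)

# For contrast: the same utterance normalized by the mismatched background.
mismatched = dict(claimant, train_handset="electret")
mismatched_score = handset_normalized_score(test_frames, mismatched, backgrounds)
```

In this toy setup the mismatched background yields a much larger ratio for the same utterance, because part of the score then reflects the handset difference and not the speaker. That is the kind of effect a handset-matched background model is meant to remove.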
