Robustness study of free-text speaker identification and verification

Abstract
Usable free-text speaker identification and voice verification systems must be robust under varying operational conditions. The authors study the degree of robustness provided by various signal processing techniques through experiments on a widely used long-distance telephone database. This database consists of data recorded at two different sites, with data from one site much poorer in quality than data from the other. Further, the recording equipment was inadvertently changed for the latter half of the sessions, resulting in a significantly changed environment. The combination of techniques that provides consistent and significant improvements is identified. The present results surpass other published results on the same task. Specifically, in the most challenging condition, identifying 16 speakers with training data recorded before the equipment change and test data recorded after it, a correct identification rate of 87.5% with an average rank of 1.12 was obtained.
