Voice identification using nearest-neighbor distance measure

Abstract
An algorithm for attributing a sample of unconstrained speech to one of several known speakers is described. The algorithm is based on measurement of the similarity of distributions of features extracted from reference speech samples and from the sample to be attributed. The measure of feature distribution similarity employed is not based on any assumed form of the distributions involved. The theoretical basis of the algorithm is examined, and a plausible connection is shown to the divergence statistic of Kullback (1972). Experimental results are presented for the King telephone database and the Switchboard database. The performance of the algorithm is better than that reported for algorithms based on Gaussian modeling and robust discrimination.
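
The idea of comparing feature distributions without assuming their form can be illustrated with a minimal sketch. This is not the paper's exact measure, which the abstract does not specify. It assumes a simple symmetric average of squared nearest-neighbor distances between two sets of feature vectors, and it uses synthetic two-dimensional "frames" in place of real speech features. The names `nn_dissimilarity`, `identify`, `synthetic_frames`, `speaker_a`, and `speaker_b` are hypothetical and chosen for illustration only.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_nn_dist(src, dst):
    """Average, over frames in src, of the squared distance to the nearest frame in dst."""
    return sum(min(sq_dist(s, d) for d in dst) for s in src) / len(src)


def nn_dissimilarity(test, ref):
    """Assumed symmetric nearest-neighbor measure (an illustration, not the paper's formula)."""
    return mean_nn_dist(test, ref) + mean_nn_dist(ref, test)


def identify(test, references):
    """Attribute the test sample to the reference speaker with the smallest dissimilarity."""
    return min(references, key=lambda name: nn_dissimilarity(test, references[name]))


def synthetic_frames(center, n, rng):
    """Hypothetical stand-in for extracted speech features: Gaussian points around a center."""
    return [[c + rng.gauss(0.0, 1.0) for c in center] for _ in range(n)]


rng = random.Random(0)
references = {
    "speaker_a": synthetic_frames([0.0, 0.0], 200, rng),
    "speaker_b": synthetic_frames([5.0, 5.0], 200, rng),
}
test_sample = synthetic_frames([5.0, 5.0], 50, rng)
best = identify(test_sample, references)
print(best)
```

No distribution is fitted anywhere in this sketch: the decision depends only on distances between individual feature vectors, which is the sense in which such a measure makes no assumption about the form of the distributions.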
