Talker Recognition on Large Populations
- 1 January 1970
- journal article
- Published by Acoustical Society of America (ASA) in The Journal of the Acoustical Society of America
- Vol. 47 (1A_Supplem) , 66
- https://doi.org/10.1121/1.1974661
Abstract
Automatic talker recognition based on single-word utterances has been investigated for two populations of 172 talkers each. Five replicates of the words “one,” “two,” “three,” “four,” and “nine” were fed through a 40-channel frequency analysis covering a 20- to 2900-Hz range. The output was digitized and summed over utterance duration. Thus one vector in 40-space represents an utterance. Statistical discriminant analysis determines a set of linear combinations of these 40 coordinates that maximally separate talker centroids relative to the variation of the replicate utterances around the centroids. The first few of these new “CRIM-coordinates” may be used to restrict the set of talker centroids with which an unknown utterance need be compared. Centroids not thus excluded are ranked in order of their likelihood to match the unknown, on the basis of weighted Euclidean distances in CRIM-coordinate space. Efficient recognition strategies utilizing these methods have reduced running time of the algorithm per unknown to 0.18 sec (costing $0.02) on a GE 635 computer. On word “one,” for 82% of the unknowns the first most likely match was correct; for 92% either the first or second; for 95%, one of the first five. Combining information from different words can improve results still further.
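The pipeline described in the abstract (one summed 40-channel vector per utterance, discriminant coordinates, screening on the first few coordinates, then ranking by weighted Euclidean distance) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the CRIM-coordinates were computed, so the sketch substitutes standard Fisher (canonical) discriminant analysis. The function names, the small ridge term, the box-shaped screening rule, and the `screen_radius` and `weights` parameters are all hypothetical.

```python
import numpy as np


def utterance_vector(frames):
    """Sum channel outputs over the utterance: (n_frames, n_channels) -> (n_channels,)."""
    return np.asarray(frames).sum(axis=0)


def fit_discriminant(X, labels, n_coords):
    """Fisher discriminant analysis (assumed stand-in for the CRIM-coordinates).

    X: (n_utterances, n_channels) array, labels: talker id per row.
    Returns the projection matrix, the talker ids, and the projected centroids.
    """
    talkers = np.unique(labels)
    d = X.shape[1]
    grand_mean = X.mean(axis=0)
    W = np.zeros((d, d))  # pooled within-talker scatter
    B = np.zeros((d, d))  # between-talker scatter
    for t in talkers:
        Xt = X[labels == t]
        mu = Xt.mean(axis=0)
        W += (Xt - mu).T @ (Xt - mu)
        diff = (mu - grand_mean)[:, None]
        B += len(Xt) * (diff @ diff.T)
    W /= max(len(X) - len(talkers), 1)
    # Small ridge term keeps W invertible (an assumption, not from the abstract).
    W += 1e-6 * np.trace(W) / d * np.eye(d)

    # Whiten the within-talker scatter, then find the directions of largest
    # between-talker scatter in the whitened space.
    evals_w, evecs_w = np.linalg.eigh(W)
    whiten = evecs_w / np.sqrt(evals_w)
    evals_b, evecs_b = np.linalg.eigh(whiten.T @ B @ whiten)
    order = np.argsort(evals_b)[::-1][:n_coords]
    A = whiten @ evecs_b[:, order]

    centroids = np.stack([X[labels == t].mean(axis=0) @ A for t in talkers])
    return A, talkers, centroids


def rank_talkers(x, A, talkers, centroids, n_screen=2, screen_radius=None, weights=None):
    """Rank candidate talkers for one unknown utterance vector x.

    If screen_radius is given, centroids farther than that from the unknown
    along any of the first n_screen coordinates are excluded before ranking
    (hypothetical screening rule).
    """
    z = x @ A
    keep = np.ones(len(talkers), dtype=bool)
    if screen_radius is not None:
        gap = np.abs(centroids[:, :n_screen] - z[:n_screen])
        keep = np.all(gap <= screen_radius, axis=1)
        if not keep.any():  # fall back to the full set if everything was excluded
            keep[:] = True
    w = np.ones(A.shape[1]) if weights is None else np.asarray(weights)
    d2 = (((centroids[keep] - z) ** 2) * w).sum(axis=1)  # weighted squared distance
    return talkers[keep][np.argsort(d2)]
```

Because the within-talker scatter is whitened first, unit weights already make the distance a variance-weighted one in the original 40 channels; the `weights` argument is left in for experimenting with other weightings. The first element of the returned array is the most likely talker, and counting how often the true talker falls in the first one, two, or five entries gives figures comparable in kind to the percentages quoted above.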