Abstract
This paper describes a method for text-independent speaker identification. To exploit phoneme-dependent personal information in addition to the personal information common to all phonemes, the method constructs multiple personal factor spaces by applying canonical discriminant analysis to predetermined subspaces of the observation space. The decision is based on a likelihood measure derived from the a posteriori probabilities in all the factor spaces. Using 21-dimensional observation vectors obtained from each 40 ms voiced segment, several design choices, including how the subspaces are constructed, were examined. The identification accuracy achieved was comparable to that of human listeners.
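
As a rough illustration of the idea, the following is a minimal sketch in Python with NumPy, not the authors' implementation. Several parts are assumptions made for the example: the subspaces are taken to be fixed blocks of coordinates of the 21-dimensional vector, each speaker is modeled in a factor space by a Gaussian with a shared identity covariance and equal priors, the likelihood measure is taken to be the sum of log a posteriori probabilities over frames and factor spaces, and the data are synthetic. The function names (fit_factor_space, log_posteriors, identify) are hypothetical.

```python
import numpy as np

def fit_factor_space(X, y, dims, n_components):
    """Canonical discriminant analysis on the coordinates `dims` of X.

    Returns (dims, W, means): W projects the subspace into a factor space
    in which the pooled within-speaker covariance is the identity.
    Speaker labels in y are assumed to be 0..K-1.
    """
    Xs = X[:, dims]
    classes = np.unique(y)
    d = Xs.shape[1]
    mean_all = Xs.mean(axis=0)
    Sw = np.zeros((d, d))  # within-speaker scatter
    Sb = np.zeros((d, d))  # between-speaker scatter
    for c in classes:
        Xc = Xs[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    Sw = Sw / (len(Xs) - len(classes)) + 1e-6 * np.eye(d)  # pooled covariance + small ridge
    # Solve Sb w = lambda Sw w by whitening with the Cholesky factor of Sw.
    L = np.linalg.cholesky(Sw)
    Linv = np.linalg.inv(L)
    evals, V = np.linalg.eigh(Linv @ Sb @ Linv.T)
    order = np.argsort(evals)[::-1][:n_components]
    W = Linv.T @ V[:, order]
    means = np.stack([(Xs[y == c] @ W).mean(axis=0) for c in classes])
    return dims, W, means

def log_posteriors(model, X):
    """Per-frame log a posteriori probability of each speaker in one factor space."""
    dims, W, means = model
    Z = X[:, dims] @ W
    d2 = ((Z[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    logp = -0.5 * d2  # Gaussian log-likelihood up to a constant (assumed model)
    m = logp.max(axis=1, keepdims=True)
    return logp - (m + np.log(np.exp(logp - m).sum(axis=1, keepdims=True)))

def identify(models, X):
    """Sum log posteriors over all frames and factor spaces; pick the best speaker."""
    score = sum(log_posteriors(model, X).sum(axis=0) for model in models)
    return int(np.argmax(score))

# Synthetic stand-in for 21-dimensional vectors from voiced segments.
rng = np.random.default_rng(0)
n_speakers, n_dims = 3, 21
speaker_means = rng.normal(scale=2.0, size=(n_speakers, n_dims))
X_train = np.concatenate(
    [speaker_means[k] + rng.normal(size=(200, n_dims)) for k in range(n_speakers)]
)
y_train = np.repeat(np.arange(n_speakers), 200)

# Hypothetical subspaces: the full space plus three 7-dimensional blocks.
subspaces = [np.arange(21), np.arange(0, 7), np.arange(7, 14), np.arange(14, 21)]
models = [fit_factor_space(X_train, y_train, dims, n_components=2) for dims in subspaces]

# Identify each speaker from 50 unseen frames.
predictions = [
    identify(models, speaker_means[k] + rng.normal(size=(50, n_dims)))
    for k in range(n_speakers)
]
```

With K speakers, canonical discriminant analysis yields at most K - 1 useful discriminant axes, which is why the sketch uses two components for three speakers. Summing log posteriors treats frames and factor spaces as independent, which is a simplification chosen here for brevity.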
