Standard and target driven AR-vector models for speech analysis and speaker recognition

1 January 1992

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. 2 (ISSN 1520-6149), pp. 5-8
https://doi.org/10.1109/icassp.1992.226134

Abstract

This paper reports theoretical aspects and practical applications of two variants of the AR-vector modeling technique: the standard AR-vector model and the target driven (or multistep excited) AR-vector model. The standard version assumes a white excitation, while the target driven model assumes a piecewise constant input. The standard AR-vector model proves highly effective for speaker recognition: on a set of 420 different speakers, the recognition score ranges from 93% to 100%, depending on the duration of the test speech sample. The target driven AR-vector model has useful properties for speech analysis and segmentation: the steps in the input function correspond closely to the underlying phonetic content of the speech, and, under some normalization, the values of the steps can be interpreted as acoustic targets.
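
To make the distinction between the two excitations concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation. It assumes a model of the form x[t] = A_1 x[t-1] + ... + A_p x[t-p] + u[t], where u[t] is either white noise (standard model) or a piecewise constant sequence (target driven model). The function names `simulate_ar_vector` and `fit_ar_vector`, the 2-dimensional example matrix, and the use of ordinary least squares for estimation are all assumptions made for illustration; the abstract does not state which estimation procedure the paper uses.

```python
import numpy as np


def simulate_ar_vector(mats, excitation):
    """Run x[t] = sum_k mats[k-1] @ x[t-k] + excitation[t] from zero initial state.

    Hypothetical helper: `mats` is a list of (dim, dim) matrices A_1..A_p,
    `excitation` is an (n_frames, dim) array.
    """
    excitation = np.asarray(excitation, dtype=float)
    n_frames, dim = excitation.shape
    x = np.zeros((n_frames, dim))
    for t in range(n_frames):
        x[t] = excitation[t]
        for k, a in enumerate(mats, start=1):
            if t - k >= 0:
                x[t] += a @ x[t - k]
    return x


def fit_ar_vector(x, order):
    """Least-squares estimate of A_1..A_p (an assumed estimator, not the paper's)."""
    x = np.asarray(x, dtype=float)
    n_frames, dim = x.shape
    # Each row of `past` holds the `order` previous frames of one predicted frame.
    past = np.hstack([x[order - k:n_frames - k] for k in range(1, order + 1)])
    target = x[order:]
    coef, *_ = np.linalg.lstsq(past, target, rcond=None)
    # coef stacks the transposed matrices; undo the transpose for each lag.
    mats = [coef[k * dim:(k + 1) * dim].T for k in range(order)]
    residual = target - past @ coef
    return mats, residual


# Illustrative order-1 model (values chosen arbitrarily, stable by construction).
A_true = np.array([[0.5, 0.1], [-0.2, 0.3]])

# Standard model: white excitation, then re-estimate the matrix from the output.
rng = np.random.default_rng(0)
x_white = simulate_ar_vector([A_true], rng.standard_normal((5000, 2)))
A_est, _ = fit_ar_vector(x_white, order=1)
print("estimated A_1:\n", A_est[0])

# Target driven model: the input is held constant over each segment.
steps = np.repeat(np.array([[1.0, 0.0], [0.0, 2.0]]), 50, axis=0)
x_steps = simulate_ar_vector([A_true], steps)
print("output at end of first segment:", x_steps[49])
```

In this sketch, a stable order-1 model held at a constant input u settles at the solution of (I - A_1) x = u, so each step in the input pulls the output toward a fixed point. Whether this corresponds to the normalization under which the paper interprets step values as acoustic targets is not stated in the abstract.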
