Abstract
Different speaking rates and stress patterns are known to affect the timing of speech utterances in a nonlinear and complicated way. Time normalization is, therefore, an important prerequisite in automatic speech and speaker recognition. A new approach to speech and speaker recognition based on a time-invariant measure of "contour similarity" is described. The important relative timing of the different parameters representing a given utterance is preserved by the contour similarity measure defined here. The concept of contour similarity is applicable to other recognition tasks in which the available information can be represented by contours in a multidimensional space.
