Abstract
A concept of waveform similarity for tackling the problem of time-scale modification of speech is proposed. It is worked out in the context of short-time Fourier transform representations. The resulting WSOLA (waveform-similarity-based synchronized overlap-add) algorithm produces high-quality speech output, is algorithmically and computationally efficient and robust, and allows for online processing with arbitrary time-scaling factors that may be specified in a time-varying fashion and can be chosen over a wide continuous range of values.
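The overlap-add idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not specify: the function name `wsola`, the parameters `frame` (an even frame length) and `tol` (the search tolerance in samples), a periodic Hann window with 50% overlap, a single constant factor `stretch` (output length divided by input length), and a sum of squared differences as the similarity measure. For each output frame, the sketch searches around the nominal input position for the segment that best matches the natural continuation of the previously copied segment, then overlap-adds it.

```python
import math


def wsola(x, stretch, frame=256, tol=64):
    """Time-scale the samples in x; stretch = output length / input length."""
    hop = frame // 2  # synthesis hop: 50% overlap between output frames
    # Periodic Hann window: two copies offset by frame/2 sum to exactly 1.
    win = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / frame) for n in range(frame)]
    out_len = int(len(x) * stretch)
    out = [0.0] * (out_len + frame)
    # Zero-pad so every candidate segment and template lies inside the buffer.
    pad = [0.0] * tol + list(x) + [0.0] * (frame + tol + hop)

    def segment(start):
        # Return `frame` samples beginning at input index `start`.
        s = start + tol
        return pad[s:s + frame]

    # Try small offsets first so that ties favour the nominal position.
    offsets = sorted(range(-tol, tol + 1), key=abs)
    prev = 0  # input index of the segment copied for the previous frame
    k = 0
    while k * hop < out_len:
        nominal = int(round(k * hop / stretch))  # ideal input position
        best = nominal
        if k > 0:
            # Template: what would naturally follow the previous segment.
            template = segment(prev + hop)
            best_dist = float("inf")
            for d in offsets:
                cand = segment(nominal + d)
                # Assumed similarity measure: sum of squared differences.
                dist = sum((a - b) ** 2 for a, b in zip(template, cand))
                if dist < best_dist:
                    best, best_dist = nominal + d, dist
        chosen = segment(best)
        base = k * hop
        for n in range(frame):
            out[base + n] += win[n] * chosen[n]  # windowed overlap-add
        prev = best
        k += 1
    return out[:out_len]


# Usage: stretch a test tone to 1.5 times its duration without changing pitch.
tone = [math.sin(2.0 * math.pi * n / 20.0) for n in range(2000)]
slow = wsola(tone, 1.5, frame=128, tol=32)
```

The first half frame of the output is faded in by the window, and the brute-force search is written for readability rather than speed. A time-varying factor could be handled by accumulating the nominal input position frame by frame instead of dividing by a constant.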
