High quality time-scale modification for speech

23 March 2005

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. 10, 493-496
https://doi.org/10.1109/icassp.1985.1168381

Abstract

We present a new and simple method for speech rate modification that yields high-quality rate-modified speech. Earlier algorithms either required significant computation to produce good-quality output or yielded poor-quality rate-modified speech. The algorithm we describe allows arbitrary linear or nonlinear scaling of the time axis. It operates in the time domain, applying a modified overlap-and-add (OLA) procedure to the waveform. It requires moderate computation and could easily be implemented in real time on currently available hardware. The algorithm works equally well on single-voice speech, multiple-voice speech, and speech in noise. In this paper, we discuss an earlier algorithm for time-scale modification (TSM) and present both objective and informal subjective results for the new and previous TSM methods.
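
The abstract does not say how the overlap-and-add procedure is modified, so the following is only an illustrative sketch, not the authors' implementation. It assumes one common variant of time-domain OLA in which each incoming frame is shifted by a few samples, chosen to maximize its normalized cross-correlation with the output already synthesized, and is then cross-faded into place. The function name `sola_time_scale` and the parameters `frame_len`, `synth_hop`, and `search` are hypothetical, and the default values are arbitrary.

```python
import math


def sola_time_scale(x, alpha, frame_len=400, synth_hop=200, search=80):
    """Time-scale the sample list x by alpha (alpha > 1 lengthens, alpha < 1 shortens).

    Assumed scheme (not taken from the paper): frames are read from the input
    every synth_hop / alpha samples and written to the output roughly every
    synth_hop samples, with each write position adjusted by up to +/- search
    samples so that the frame lines up with the existing output.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if not 0 <= search < synth_hop < frame_len:
        raise ValueError("need 0 <= search < synth_hop < frame_len")

    analysis_hop = synth_hop / alpha           # input advance per frame
    y = [float(v) for v in x[:frame_len]]      # first frame is copied unchanged
    m = 1
    while True:
        start = round(m * analysis_hop)
        frame = x[start:start + frame_len]
        if len(frame) < frame_len:             # stop at the last full frame
            break

        # Search for the shift k that best aligns the frame with the output.
        nominal = m * synth_hop
        best_k, best_score = -search, -math.inf
        for k in range(-search, search + 1):
            pos = nominal + k
            overlap = min(len(y) - pos, frame_len)
            if overlap <= 0:
                continue
            num = energy_y = energy_f = 0.0
            for i in range(overlap):
                a, b = y[pos + i], frame[i]
                num += a * b
                energy_y += a * a
                energy_f += b * b
            den = math.sqrt(energy_y * energy_f)
            score = num / den if den > 0.0 else 0.0
            if score > best_score:
                best_k, best_score = k, score

        # Linear cross-fade over the overlap, then append the rest of the frame.
        pos = nominal + best_k
        overlap = min(len(y) - pos, frame_len)
        for i in range(overlap):
            w = (i + 1) / (overlap + 1)
            y[pos + i] = (1.0 - w) * y[pos + i] + w * frame[i]
        y.extend(frame[overlap:])
        m += 1
    return y
```

For example, `sola_time_scale(samples, 2.0)` would return a signal roughly twice as long as `samples`, and `sola_time_scale(samples, 0.5)` one roughly half as long. This sketch uses a single constant `alpha`, which corresponds to linear scaling of the time axis; letting the factor vary from frame to frame is one plausible way to obtain nonlinear scaling, though that is likewise an assumption here.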
