On the role of amplitude and phase in the synthesis of male and female voices
- 1 May 1990
- journal article
- Published by Acoustical Society of America (ASA) in The Journal of the Acoustical Society of America
- Vol. 87 (S1) , S109
- https://doi.org/10.1121/1.2027822
Abstract
A pitch-synchronous segmentation, which was shown to be perceptually close to a deconvolution [C. Hamon et al., Proc. IEEE ICASSP'89, 238–241 (1989)], was used to obtain a short-time Fourier representation of the LPC residual. After selected amplitude and phase manipulations of voiced segments, a residue was reconstructed, which was used to drive the LPC synthesis filter. Twenty utterances (ten male, ten female) were investigated under two amplitude (original/flat) and two phase conditions (original/zero), yielding four versions for each utterance. The quality of these versions was judged by 12 subjects in a paired-comparison experiment. Original amplitude information was consistently preferred over original phase information. For female voices, there were significant quality differences between any of the four versions. However, for male voices the original amplitude information alone proved to be sufficient to make the synthetic speech almost indistinguishable from natural speech. [Work was supported in part by the Dutch SPIN-ASSP program.]
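The two-by-two design described above (original/flat amplitude crossed with original/zero phase) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a single residual segment is already available (the pitch-synchronous segmentation and the LPC analysis and synthesis stages are omitted), and it assumes that "flat" means replacing every magnitude bin with the mean magnitude, which the abstract does not specify. The function name `manipulate_segment` and the random test segment are hypothetical.

```python
import numpy as np


def manipulate_segment(segment, flat_amplitude=False, zero_phase=False):
    """Rebuild one residual segment after editing its Fourier amplitude/phase.

    Hypothetical helper: "flat" is assumed to mean a constant magnitude
    equal to the mean of the original magnitudes.
    """
    segment = np.asarray(segment, dtype=float)
    n = segment.size

    # Short-time Fourier representation of this one segment.
    spectrum = np.fft.rfft(segment)
    magnitude = np.abs(spectrum)
    phase = np.angle(spectrum)

    if flat_amplitude:
        # Discard the spectral envelope, keep only an overall level.
        magnitude = np.full_like(magnitude, magnitude.mean())
    if zero_phase:
        # Discard all phase information.
        phase = np.zeros_like(phase)

    # Back to the time domain; n keeps the original segment length.
    return np.fft.irfft(magnitude * np.exp(1j * phase), n=n)


# Stand-in for one voiced residual segment (random data, not real speech).
rng = np.random.default_rng(0)
residual_segment = rng.standard_normal(64)

# The four versions: two amplitude conditions times two phase conditions.
versions = {
    (amp, ph): manipulate_segment(
        residual_segment,
        flat_amplitude=(amp == "flat"),
        zero_phase=(ph == "zero"),
    )
    for amp in ("original", "flat")
    for ph in ("original", "zero")
}
```

In a full pipeline, each manipulated segment would be reassembled into a residual signal and passed through the LPC synthesis filter. Under the assumptions of this sketch, the flat-amplitude, zero-phase version reduces each segment to a single impulse.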