Phonetic to acoustic mapping using recurrent neural networks

Abstract
The application of artificial neural networks to phonetic-to-acoustic mapping is described. The specific task considered is mapping consonant-vowel-consonant (CVC) syllables to the corresponding formant values at different speech tempos. The performance of two networks, the Elman recurrent network and a single-hidden-layer feedforward network, is compared. The results indicate that the recurrent network is able to generalize from the training set and produce valid formant contours for new CVC syllables that are not part of the training set. It is shown that, by choosing the proper input representation, the feedforward network is also capable of learning this mapping.
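To make the architectural comparison concrete, the following is a minimal sketch (not the authors' implementation) of the two model families the abstract contrasts: an Elman recurrent network, whose hidden state carries temporal context across frames, and a single-hidden-layer feedforward network, which must receive temporal context through its input representation (here assumed to be a fixed window of frames). All dimensions, names, and the tanh nonlinearity are illustrative assumptions; PyTorch's nn.RNN is the classic Elman architecture.

```python
import torch
import torch.nn as nn

N_PHONE_FEATURES = 12   # assumed size of the phonetic input encoding per frame
N_FORMANTS = 3          # assumed targets per frame, e.g. F1, F2, F3
HIDDEN = 32             # assumed hidden-layer size


class ElmanFormantNet(nn.Module):
    """Elman recurrent network: the hidden state feeds back each time step."""

    def __init__(self):
        super().__init__()
        # nn.RNN with the default tanh nonlinearity is the Elman architecture
        self.rnn = nn.RNN(N_PHONE_FEATURES, HIDDEN, batch_first=True)
        self.out = nn.Linear(HIDDEN, N_FORMANTS)

    def forward(self, x):            # x: (batch, time, N_PHONE_FEATURES)
        h, _ = self.rnn(x)           # h: (batch, time, HIDDEN)
        return self.out(h)           # formant estimate for every frame


class FeedforwardFormantNet(nn.Module):
    """Single-hidden-layer feedforward network; temporal context must be
    baked into the input, here as a window of context_frames frames."""

    def __init__(self, context_frames=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_PHONE_FEATURES * context_frames, HIDDEN),
            nn.Tanh(),
            nn.Linear(HIDDEN, N_FORMANTS),
        )

    def forward(self, x):            # x: (batch, N_PHONE_FEATURES * context)
        return self.net(x)


if __name__ == "__main__":
    # Training-loop sketch: regress formant contours with mean-squared error
    # on synthetic placeholder data (8 CVC syllables, 20 frames each).
    model = ElmanFormantNet()
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.MSELoss()
    x = torch.randn(8, 20, N_PHONE_FEATURES)
    y = torch.randn(8, 20, N_FORMANTS)
    for _ in range(5):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
```

The sketch reflects the trade-off the abstract describes: the recurrent model handles variable speech tempo naturally because its state evolves frame by frame, while the feedforward model can only match this if the chosen input representation supplies the equivalent context.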