Intonation in text-to-speech synthesis: Evaluation of algorithms

Abstract
Two algorithms, termed schematic and naturalistic, for generating intonation contours in an English text-to-speech system are compared by eliciting preference judgments from 21 subjects. The major problem for both algorithms, but especially for the schematic algorithm, lies in accent assignment and in the determination of the intonation phrase rather than in the phonetic realization of accent through manipulation of F0. Because of parser errors, phrase boundaries are incorrectly identified in 30% of the sentences used in the three experiments. Moreover, the naturalistic algorithm uses a grammatical part-of-speech hierarchy that ranks nouns higher than verbs, so incorrect classification of verbs as nouns (the major classification error) results in an unintended accent. The results indicate that accent assignment and phrase determination are the primary areas requiring improvement if the naturalness of synthetic speech intonation is to increase further.
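
The effect of a verb-as-noun tagging error on a rank-based accent rule can be illustrated with a minimal sketch. It is not the paper's algorithm: the only detail taken from the abstract is that nouns rank above verbs. The numeric ranks, the threshold rule, the name accented_words, and the example phrase are all hypothetical.

```python
# Hypothetical ranks: only "noun above verb" comes from the abstract.
POS_RANK = {"noun": 3, "verb": 2, "function": 1}

# Assumed rule: a word is accented when its rank reaches this threshold.
ACCENT_THRESHOLD = 3


def accented_words(tagged_phrase, threshold=ACCENT_THRESHOLD):
    """Return the words whose part-of-speech rank meets the threshold."""
    return [word for word, pos in tagged_phrase if POS_RANK[pos] >= threshold]


# Correctly tagged phrase: "run" is a verb and stays unaccented.
correct = [
    ("the", "function"),
    ("children", "noun"),
    ("will", "function"),
    ("run", "verb"),
]

# Same phrase with the verb misclassified as a noun.
mistagged = correct[:-1] + [("run", "noun")]

print(accented_words(correct))
print(accented_words(mistagged))
```

Under these assumptions the correctly tagged phrase accents only "children", while the mistagged phrase also accents "run". The F0 realization is unaffected; the error arises entirely at the classification stage.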
