Abstract
Telesensory Systems, Inc. (TSI) has developed an English text-to-speech system that is based on MITalk [1] but runs in real time on microprocessor hardware and in main memory. Important structural parts of the MITalk-79 algorithm that are omitted in the TSI system include: (1) MITalk's parser and its sophisticated phrase-structure-dependent fundamental frequency generator, and (2) the morpheme decomposition module with its 12,000-morpheme lexicon. These omitted modules, which use about 50% of MITalk's memory, have been replaced in the TSI system by an 1100-word dictionary and a simple "hat and declination" F0 routine. Results from intelligibility and comprehension tests identical to those used in evaluating MITalk [10] indicate that, for common paragraphic text, listener performance on TSI's speech is comparable to performance on MITalk speech. Analysis of the results suggests that accurate acoustic realization of segmental information is the crucial factor in intelligibility, with simple F0 patterns and conventional letter-to-phoneme conversion adequate for many purposes.
