Effect of Synthetic Voice Intelligibility on Speech Comprehension

Abstract
This research investigated the differential impact of synthetic voice quality and text difficulty on comprehension of extended prose. Sixty participants listened to five easy and five difficult passages in one of three speech modes: natural speech, VOTRAX (low intelligibility), or DECtalk (high intelligibility). Comprehension of DECtalk was equal to that of natural speech, whereas comprehension of VOTRAX was significantly poorer than that of either natural speech or DECtalk. Participants were also asked to shadow passages of each speech type as a measure of processing resource demands. Shadowing accuracy was significantly better for natural speech than for DECtalk, and shadowing of DECtalk was markedly superior to that of VOTRAX. These results suggest that resource-demand measures alone may not be appropriate for predicting performance in practical applications. Specifically, overall comprehension may not suffer despite on-line losses in processing. The findings also point to a differential allocation of cognitive resources by listeners to speech synthesizers of differing intelligibility.
