Speaker sex identification from voiced, whispered, and filtered isolated vowels

Abstract
The purpose of this investigation was to determine the relative importance of the speaker’s laryngeal fundamental frequency and vocal tract resonance characteristics in speaker sex identification tasks. Six sustained isolated vowels (/i, -, -, a, o, u/) were recorded from 20 speakers, 10 males and 10 females, in both a normal (voiced) and a whispered manner. Three master tapes (voiced, whispered, and filtered) were constructed from these recordings. The filtered tape was produced by low-pass filtering the voiced tape at 255 Hz. The tapes were played to 15 listeners, who made speaker sex identification judgments and gave confidence ratings for those judgments. Of the 1800 identifications made for each tape (20 speakers × 6 vowels × 15 listeners), 96% were correct for the voiced tape, 91% for the filtered tape, and 75% for the whispered tape. Moreover, the listeners were most confident in their judgments on the voiced tape, followed by the filtered tape, and least confident on the whispered tape. These findings indicate that the laryngeal fundamental frequency appears to be a more important acoustic cue in speaker sex identification tasks than the resonance characteristics of the speaker. Subject Classification: [43]70.30.
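
The judgment counts behind the reported percentages can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the correct-response counts are back-calculated from the rounded percentages in the abstract, so they approximate the raw tallies and are not figures reported by the authors.

```python
# Design figures taken from the abstract.
SPEAKERS = 20
VOWELS = 6
LISTENERS = 15

# Percent correct per tape, as reported (already rounded) in the abstract.
PERCENT_CORRECT = {"voiced": 96, "filtered": 91, "whispered": 75}


def judgments_per_tape():
    """Number of identifications per tape: speakers x vowels x listeners."""
    return SPEAKERS * VOWELS * LISTENERS


def approx_correct(percent, total):
    """Approximate count of correct judgments implied by a rounded percentage.

    Integer arithmetic with round-half-up; this is an assumption-based
    back-calculation, not a reported result.
    """
    return (percent * total + 50) // 100


if __name__ == "__main__":
    total = judgments_per_tape()
    print("identifications per tape:", total)
    for tape, pct in PERCENT_CORRECT.items():
        print(tape, "approx. correct:", approx_correct(pct, total))
```

Because each percentage is rounded to a whole number, the true count for any tape could differ from these estimates by up to about nine judgments (0.5% of 1800).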
