A segment model based approach to speech recognition

6 January 2003

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

No. 15206149, p. 501-541
https://doi.org/10.1109/icassp.1988.196629

Abstract

Proposes a global acoustic segment model that characterizes fundamental speech sound units and their interactions within the general framework of hidden Markov models (HMMs). Each segment model represents a class of acoustically similar sounds. The intra-segment variability of each sound class is modeled by an HMM, and the sound-to-sound transition rules are characterized by a probabilistic inter-segment transition matrix. An acoustically derived lexicon is used to construct word models from subword segment models. The proposed segment model was tested on a speaker-trained, isolated-word speech recognition task with a vocabulary of 1109 basic English words. In this study, only 128 segment models were used, and recognition was performed by optimally aligning the test utterance with every entry in the acoustic lexicon using a maximum-likelihood Viterbi decoding algorithm. On a database of three male speakers, the average word recognition accuracy was 85% for the top candidate, rising to 96% and 98% when the top 3 and top 5 candidates, respectively, were considered.
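The recognition step described in the abstract (align the utterance against every lexicon entry with Viterbi decoding, then rank the words) can be illustrated with a toy sketch. This is not the paper's implementation. It assumes, for brevity, that each segment model is reduced to a single state with a discrete emission distribution over vector-quantizer codewords, that each lexicon entry is a left-to-right chain of segment indices, and that one matrix supplies both self-transition and segment-to-segment probabilities. The names `viterbi_score` and `recognize`, and all toy values, are hypothetical.

```python
import math

NEG_INF = float("-inf")

def viterbi_score(obs, segments, log_emit, log_trans):
    """Best-path log-likelihood of aligning obs to a left-to-right segment chain.

    obs:       list of codeword indices (the test utterance)
    segments:  list of segment indices forming one lexicon entry
    log_emit:  log_emit[s][k] = log P(codeword k | segment s)  (assumed 1-state model)
    log_trans: log_trans[a][b] = log P(next segment b | current segment a)
    """
    n = len(segments)
    # score[j] holds the best log-likelihood of any path ending at chain position j.
    score = [NEG_INF] * n
    score[0] = log_emit[segments[0]][obs[0]]  # the path must start in the first segment
    for o in obs[1:]:
        new = [NEG_INF] * n
        for j, seg in enumerate(segments):
            stay = score[j] + log_trans[seg][seg]
            advance = NEG_INF
            if j > 0:
                advance = score[j - 1] + log_trans[segments[j - 1]][seg]
            best = max(stay, advance)
            if best > NEG_INF:
                new[j] = best + log_emit[seg][o]
        score = new
    return score[-1]  # the path must end in the last segment

def recognize(obs, lexicon, log_emit, log_trans, top_n=5):
    """Score every lexicon entry and return the top_n words, best first."""
    scored = [(viterbi_score(obs, segs, log_emit, log_trans), word)
              for word, segs in lexicon.items()]
    scored.sort(reverse=True)
    return [word for _, word in scored[:top_n]]

# Toy setup: 2 segment models, 2 codewords, 3 "words" (all values are made up).
log_emit = [[math.log(0.9), math.log(0.1)],
            [math.log(0.1), math.log(0.9)]]
log_trans = [[math.log(0.5), math.log(0.5)],
             [math.log(0.5), math.log(0.5)]]
lexicon = {"ab": [0, 1], "ba": [1, 0], "a": [0]}
utterance = [0, 0, 1, 1]

print(recognize(utterance, lexicon, log_emit, log_trans, top_n=2))
```

Returning a ranked list rather than a single word mirrors how the abstract reports accuracy for the top 1, top 3, and top 5 candidates. A fuller system would replace the single-state emission table with a multi-state HMM per segment.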
