Recognition of Spoken Words and Phrases in Multitalker Environment Using Syntactic Methods

1 May 1978

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Computers

Vol. C-27 (5), 442-452
https://doi.org/10.1109/TC.1978.1675124

Abstract

We describe a method for recognizing isolated words and phrases from a given vocabulary spoken by any member of a given group of speakers, where the identity of the speaker is unknown to the system. The word utterance is divided into 20-30 nearly equal frames, with frame boundaries aligned with glottal pulses for voiced speech. Each frame includes a constant number of pitch periods. Statistical decision rules determine the phoneme in each frame. From the string of phonemes for all frames of the utterance, a word decision is obtained using phonological-syntactic rules. These rules are of two types: 1) those derived from the theory of word construction from phonemes in English, as applied to our vocabulary, and 2) those that correct possible errors in the earlier phonemic decisions based on the decisions for neighboring segments. In our experiment, the vocabulary had 40 words, including many pairs of words that are phonemically close to each other, and there were 6 speakers. In testing 400 word utterances, the recognition rate was about 80 percent for phonemes (for 11 phonemes), whereas word recognition was 98.1 percent correct. Phonological-syntactic rules played an important role in raising the word recognition rate above the phoneme recognition rate.
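
The two-stage idea in the abstract (per-frame phoneme decisions, then rule-based correction and a word decision) can be illustrated with a minimal sketch. This is not the authors' rule set, which the abstract does not specify. As stand-ins, the sketch assumes a simple neighbor-agreement rule for correcting isolated frame errors and a nearest-lexicon-entry match by edit distance for the word decision. The function names, the phoneme symbols, and the three-word `LEXICON` are all hypothetical.

```python
from itertools import groupby

# Hypothetical toy lexicon: word -> phoneme sequence (not the paper's vocabulary).
LEXICON = {
    "bit": ["b", "ih", "t"],
    "bat": ["b", "ae", "t"],
    "pit": ["p", "ih", "t"],
}

def smooth(frames):
    """Assumed neighbor rule: relabel a frame whose two neighbors agree
    with each other but disagree with it."""
    out = list(frames)
    for i in range(1, len(frames) - 1):
        if frames[i - 1] == frames[i + 1] != frames[i]:
            out[i] = frames[i - 1]
    return out

def collapse(frames):
    """Merge runs of identical frame labels into one phoneme each."""
    return [label for label, _ in groupby(frames)]

def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]

def recognize(frames, lexicon=LEXICON):
    """Word decision: the lexicon entry closest to the corrected phoneme string."""
    phonemes = collapse(smooth(frames))
    return min(lexicon, key=lambda word: edit_distance(phonemes, lexicon[word]))

# One frame is mislabeled "ae"; the neighbor rule repairs it before matching.
frames = ["b", "b", "ih", "ae", "ih", "ih", "t", "t"]
print(recognize(frames))
```

Even in this toy version, a word can be recovered when some frame labels are wrong, which is consistent with the abstract's report that word accuracy exceeded frame-level phoneme accuracy.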
