A coarse phonetic knowledge source for template independent large vocabulary word recognition

Abstract

In this paper we present a template independent knowledge source (KS) that uses coarse phonetic information to substantially constrain the candidate vocabulary for word hypothesization with very large vocabularies. It consists of three parts: a segmenter that breaks a test utterance into a sequence of coarse phonetic classes, a knowledge compiler that generates a reference dictionary containing the appropriate coarse phonetic representation for each word candidate, and a matching engine. Coarse phonetic classification is performed using linear discriminant analysis, more specifically perceptron classification. The knowledge compiler first generates a phonemic representation and segmental durations by rule from a list of word candidates (i.e., from text), and then derives coarse phonetic class segments. Matching is performed by a nonlinear time alignment algorithm based on dissimilarity scores between detected and lexical coarse class segments. The coarse phonetic KS was tested by compiling a word list of approximately 1500 words. Using only the coarse classes Silence, Plosive, Fricative, Vocalic, Front Vowel, Back Vowel, Nasal, and R, the vocabulary is reduced to 5% of its original size at an error rate below 5% for three different speakers.
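The compile, match, and prune steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The phoneme-to-class table, the dissimilarity values, the length normalization, and the names compile_entry, align_cost, and shortlist are all assumptions made for illustration. The sketch also ignores the rule-generated segmental durations and substitutes a textbook dynamic time warping recurrence for the unspecified nonlinear time alignment algorithm.

```python
import math
from typing import Dict, List, Sequence

# Hypothetical phoneme-to-coarse-class table (illustrative, not from the paper).
COARSE: Dict[str, str] = {
    "sil": "Silence",
    "p": "Plosive", "t": "Plosive", "k": "Plosive", "d": "Plosive",
    "s": "Fricative", "f": "Fricative", "z": "Fricative",
    "iy": "FrontVowel", "ih": "FrontVowel", "ae": "FrontVowel",
    "aa": "BackVowel", "ao": "BackVowel", "uw": "BackVowel",
    "m": "Nasal", "n": "Nasal",
    "r": "R",
    "l": "Vocalic", "w": "Vocalic",
}

# Assumed grouping: vowel-like classes are treated as close to one another.
VOWEL_LIKE = {"Vocalic", "FrontVowel", "BackVowel"}


def compile_entry(phonemes: Sequence[str]) -> List[str]:
    """Map phonemes to coarse classes, merging adjacent identical classes."""
    classes: List[str] = []
    for p in phonemes:
        c = COARSE[p]
        if not classes or classes[-1] != c:
            classes.append(c)
    return classes


def dissimilarity(a: str, b: str) -> float:
    """Assumed scores: 0 for a match, 0.5 within vowel-like classes, else 1."""
    if a == b:
        return 0.0
    if a in VOWEL_LIKE and b in VOWEL_LIKE:
        return 0.5
    return 1.0


def align_cost(detected: Sequence[str], lexical: Sequence[str]) -> float:
    """Dynamic time warping cost between two class sequences, length-normalized."""
    n, m = len(detected), len(lexical)
    if n == 0 or m == 0:
        return math.inf
    d = [[math.inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = dissimilarity(detected[i - 1], lexical[j - 1])
            d[i][j] = local + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m] / (n + m)


def shortlist(detected: Sequence[str],
              lexicon: Dict[str, Sequence[str]],
              keep_fraction: float = 0.05) -> List[str]:
    """Return the best-matching fraction of the lexicon (at least one word)."""
    ranked = sorted(lexicon,
                    key=lambda w: align_cost(detected, compile_entry(lexicon[w])))
    keep = max(1, math.ceil(keep_fraction * len(lexicon)))
    return ranked[:keep]


if __name__ == "__main__":
    toy_lexicon = {
        "seat": ["s", "iy", "t"],
        "seed": ["s", "iy", "d"],
        "moon": ["m", "uw", "n"],
        "rat": ["r", "ae", "t"],
    }
    detected = ["Fricative", "FrontVowel", "Plosive"]
    print(shortlist(detected, toy_lexicon, keep_fraction=0.5))
```

In this toy lexicon, "seat" and "seed" share the same coarse representation, so both survive pruning. This is the intended behavior of such a knowledge source: it narrows the candidate set without identifying the word. With keep_fraction=0.05, a list of about 1500 words would be cut to about 75 candidates.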
