Speech recognition using temporal decomposition and multi-layer feed-forward automata

13 January 2003

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 409-412
https://doi.org/10.1109/icassp.1989.266452

Abstract

Intraspeaker and interspeaker variability is identified as a major source of error in automatic speech recognition. The authors report on two series of experiments using multilayer feed-forward automata (MLFFA) to control some aspects of this variability. The first series concerns the classification of spectral targets obtained from a robust implementation of temporal decomposition. An MLFFA accepts three successive targets and outputs an allophonic label. So far, no improvement has been found over traditional classification techniques (i.e. k-nearest neighbors). In the second series of experiments, spectral transformations using MLFFA are introduced for adaptation to new speakers. Compared to linear techniques (multivariate regression and canonical correlation analysis), the MLFFA approach offers some improvement.

Author(s)

Montacie, C. (Dept. Signal, ENST, Paris, France); Choukri, K.; Chollet, G.
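The first experiment in the abstract feeds three successive spectral targets to a classifier and compares the result against a k-nearest-neighbors baseline. The sketch below illustrates only that input arrangement and the baseline. It is not the authors' implementation: the two-dimensional toy vectors, the labels, and the function names `stack_contexts` and `knn_classify` are all assumptions made for illustration.

```python
from collections import Counter


def stack_contexts(targets, width=3):
    """Concatenate each run of `width` successive target vectors into one input vector."""
    return [
        [x for vec in targets[i:i + width] for x in vec]
        for i in range(len(targets) - width + 1)
    ]


def squared_distance(p, q):
    """Squared Euclidean distance; sufficient for ranking neighbours."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def knn_classify(query, examples, labels, k=3):
    """Return the majority label among the k training examples closest to `query`."""
    order = sorted(range(len(examples)),
                   key=lambda i: squared_distance(query, examples[i]))
    return Counter(labels[i] for i in order[:k]).most_common(1)[0][0]


# Toy data (assumption): 2-dimensional stand-ins for spectral targets.
train_targets = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
                 [1.0, 0.9], [0.9, 1.0], [1.0, 1.0]]
train_contexts = stack_contexts(train_targets)  # 4 windows of 6 values each
train_labels = ["a", "a", "b", "b"]             # hypothetical label per window

# A new run of three successive targets, stacked the same way.
query = stack_contexts([[0.05, 0.05], [0.0, 0.1], [0.1, 0.1]])[0]
predicted = knn_classify(query, train_contexts, train_labels, k=3)
print(predicted)
```

An MLFFA classifier would take the same stacked six-value vector as its input layer and produce the label at its output layer; only the decision rule differs from this baseline.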

This publication has 3 references indexed in Scilit:

Temporal decomposition and acoustic-phonetic decoding of speech
Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
An introduction to computing with neural nets
IEEE ASSP Magazine, 1987
Adaptation of automatic speech recognizers to new speakers using canonical correlation analysis techniques
Computer Speech & Language, 1986