Speech recognition using temporal decomposition and multi-layer feed-forward automata
- 13 January 2003
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Author(s): Montacie, C. (Dept. Signal, ENST, Paris, France); Choukri, K.; Chollet, G.
Abstract
Intraspeaker and interspeaker variability is reported to be a major source of error in automatic speech recognition. The authors report on two series of experiments that use multilayer feed-forward automata (MLFFA) to control some aspects of this variability. The first series concerns the classification of spectral targets obtained from a robust implementation of temporal decomposition: an MLFFA accepts three successive targets and outputs an allophonic label. So far, no improvement has been found over traditional classification techniques (i.e., k-nearest neighbors). In the second series of experiments, spectral transformations computed by an MLFFA are introduced for adaptation to new speakers. Compared to linear techniques (multivariate regression and canonical correlation analysis), the MLFFA approach offers some improvement.
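The classification setup in the first series of experiments (three successive spectral targets in, one allophonic label out) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the number of coefficients per target, the hidden-layer size, the label names, the sigmoid and softmax activations, and the random (untrained) weights. It shows only how a context window of three targets might be fed through a small feed-forward network.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    # weights holds one row per output unit.
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(rng, n_in, n_out):
    # Small random weights; a real system would train these.
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def context_windows(targets):
    # Concatenate each target with its left and right neighbours,
    # giving one input vector per interior target.
    return [targets[i - 1] + targets[i] + targets[i + 1]
            for i in range(1, len(targets) - 1)]


def classify(window, hidden, output):
    # One hidden layer, then a probability for each label.
    h = [sigmoid(a) for a in dense(window, *hidden)]
    return softmax(dense(h, *output))


# Hypothetical sizes and labels, chosen only for the example.
N_COEFFS, N_HIDDEN = 4, 6
LABELS = ["a", "i", "u"]

rng = random.Random(0)
hidden = init_layer(rng, 3 * N_COEFFS, N_HIDDEN)
output = init_layer(rng, N_HIDDEN, len(LABELS))

# Five synthetic "spectral targets" standing in for temporal-decomposition output.
targets = [[rng.gauss(0.0, 1.0) for _ in range(N_COEFFS)] for _ in range(5)]
windows = context_windows(targets)
probs = [classify(w, hidden, output) for w in windows]
predicted = [LABELS[p.index(max(p))] for p in probs]
```

With five targets, the sketch produces three context windows, each scored against the hypothetical label set. A k-nearest-neighbor baseline of the kind mentioned in the abstract would operate on the same concatenated windows.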