The Meta-Pi network: connectionist rapid adaptation for high-performance multi-speaker phoneme recognition
- 4 December 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 165-168 vol.1
- https://doi.org/10.1109/icassp.1990.115564
Abstract
A multinetwork time-delay neural network (TDNN)-based connectionist architecture is presented that allows multispeaker phoneme discrimination (/b,d,g/) to be performed at the speaker-dependent recognition rate of 98.4%. The overall network gates the phonemic decisions of modules trained on individual speakers to form its overall classification decision. By dynamically adapting to the input speech and focusing on a combination of speaker-specific modules, the network outperforms a single TDNN trained on the speech of all six speakers (95.9%). To train this network, a form of multiplicative connection called the Meta-Pi connection is developed. It is illustrated how the Meta-Pi paradigm implements a dynamically adaptive Bayesian MAP classifier. It learns, without supervision, to recognize the speech of one particular speaker (99.8%) using a dynamic combination of internal models of other speakers exclusively. The Meta-Pi model is a viable basis for a connectionist speech recognition system that can rapidly adapt to new speakers and varying speaker dialects.
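The gating idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the softmax normalization of the gate, the function names (`softmax`, `meta_pi_combine`), and all numeric values are assumptions chosen for illustration only. Each speaker-specific module emits scores for /b,d,g/, and a gating network weights those scores to form the overall decision, which can be read as a mixture of the form P(c|x) = Σ_k P(c|x, s_k) P(s_k|x).

```python
import math

def softmax(scores):
    """Turn raw gate scores into non-negative weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def meta_pi_combine(module_outputs, gate_scores):
    """Weight each speaker module's phoneme scores by its gate weight.

    module_outputs: one list of per-phoneme scores per speaker module.
    gate_scores: one raw score per module (hypothetical gating network output).
    """
    weights = softmax(gate_scores)
    n_classes = len(module_outputs[0])
    return [
        sum(w * out[c] for w, out in zip(weights, module_outputs))
        for c in range(n_classes)
    ]

PHONEMES = ["b", "d", "g"]

# Made-up outputs of three speaker-specific modules for one input token.
module_outputs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.6, 0.3, 0.1],
]
# Made-up gate scores: the gate favors modules 0 and 2 for this input.
gate_scores = [2.0, -1.0, 1.0]

combined = meta_pi_combine(module_outputs, gate_scores)
predicted = PHONEMES[max(range(len(combined)), key=combined.__getitem__)]
print(combined, predicted)
```

Because the gate weights multiply the module outputs, a gradient with respect to the gate scores can flow through the combined output during training, which is the role the abstract assigns to the multiplicative Meta-Pi connection. The sketch covers only the forward combination, not training.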