Rapid speaker adaptation in eigenvoice space
- 1 November 2000
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Speech and Audio Processing
- Vol. 8 (6) , 695-707
- https://doi.org/10.1109/89.876308
Abstract
This paper describes a new model-based speaker adaptation algorithm called the eigenvoice approach. The approach constrains the adapted model to be a linear combination of a small number of basis vectors obtained offline from a set of reference speakers, and thus greatly reduces the number of free parameters to be estimated from adaptation data. These "eigenvoice" basis vectors are orthogonal to each other and guaranteed to represent the most important components of variation between the reference speakers. Experimental results for a small-vocabulary task (letter recognition) given in the paper show that the approach yields major improvements in performance for tiny amounts of adaptation data. For instance, we obtained 16% relative improvement in error rate with one letter of supervised adaptation data, and 26% relative improvement with four letters of supervised adaptation data. After a comparison of the eigenvoice approach with other speaker adaptation algorithms, the paper concludes with a discussion of future work.
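The core idea in the abstract, an adapted model constrained to a linear combination of a few orthogonal basis vectors learned from reference speakers, can be illustrated with a small NumPy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names `train_eigenvoices` and `adapt`, the synthetic noise-free "supervectors", the choice to keep the mean separate from the basis, and the use of ordinary least squares on a handful of observed parameters in place of whatever estimator the paper actually uses to fit the weights.

```python
import numpy as np

def train_eigenvoices(supervectors, k):
    """Offline step: PCA over reference-speaker parameter vectors.

    supervectors: (T, D) array, one row of model parameters per speaker.
    Returns the mean vector (D,) and k orthonormal basis vectors (k, D).
    """
    mean = supervectors.mean(axis=0)
    centered = supervectors - mean
    # Rows of vt are orthonormal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]

def adapt(mean, eigenvoices, observed, mask):
    """Adaptation step: fit k weights from the few parameters we could observe.

    observed: (D,) vector, only entries where mask is True are trusted.
    Least squares is a simplified stand-in for a likelihood-based estimator.
    """
    design = eigenvoices[:, mask].T          # (n_observed, k)
    target = (observed - mean)[mask]         # (n_observed,)
    weights, *_ = np.linalg.lstsq(design, target, rcond=None)
    return mean + weights @ eigenvoices, weights

# Synthetic demo: 20 reference speakers, 60 parameters, 3 true factors.
rng = np.random.default_rng(0)
T, D, K = 20, 60, 3
true_basis = rng.normal(size=(K, D))
base = rng.normal(size=D)
reference = base + rng.normal(size=(T, K)) @ true_basis

mean, eigenvoices = train_eigenvoices(reference, K)

# A new speaker for whom only 10 of the 60 parameters were observed.
new_speaker = base + rng.normal(size=K) @ true_basis
mask = np.zeros(D, dtype=bool)
mask[:10] = True
adapted, weights = adapt(mean, eigenvoices, new_speaker, mask)

err_mean = np.linalg.norm(new_speaker - mean)
err_adapted = np.linalg.norm(new_speaker - adapted)
print(f"free parameters: {K} instead of {D}")
print(f"error using the mean model: {err_mean:.4f}")
print(f"error after adaptation:     {err_adapted:.2e}")
```

In this toy setting only `K` weights are estimated rather than all `D` parameters, which is why so few observations suffice. Real speaker data is noisy and not exactly low-rank, so the recovery would be approximate rather than near-exact as it is here.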