Vector-field-smoothed Bayesian learning for incremental speaker adaptation

Abstract
This paper presents a fast, incremental speaker adaptation method called MAP/VFS, which combines maximum a posteriori (MAP) estimation, that is, Bayesian learning, with vector field smoothing (VFS). The two are complementary: MAP is an intra-class training scheme, while VFS is an inter-class smoothing technique. The combination provides a basic technique for on-line adaptation, which is important for constructing practical speech recognition systems. Experiments on word-by-word adaptation show that VFS significantly accelerates the adaptation speed of incremental MAP and consistently improves and stabilizes its recognition performance. Incremental adaptation on a few words of sample data yields a word error reduction rate of about 22%.
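
The abstract gives no formulas, so the following is only an illustrative sketch of how an intra-class MAP step and an inter-class VFS step might fit together, not the paper's exact formulation. It assumes each class is represented by a single Gaussian mean, that the MAP step uses the standard conjugate-prior mean update with a hypothetical prior weight `tau`, and that VFS interpolates "transfer vectors" (adapted mean minus prior mean) from the `k` nearest adapted classes with an exponential distance weighting controlled by a hypothetical `fuzz` parameter. All function and parameter names are invented for illustration.

```python
import math

def map_update_mean(prior_mean, frames, tau=10.0):
    """Intra-class step: MAP estimate of one Gaussian mean.

    Assumes a conjugate prior centred on prior_mean with weight tau
    (tau is a hypothetical tuning parameter, not taken from the paper).
    """
    n = len(frames)
    dim = len(prior_mean)
    sums = [sum(f[d] for f in frames) for d in range(dim)]
    return [(tau * prior_mean[d] + sums[d]) / (tau + n) for d in range(dim)]

def vfs_smooth(prior_means, map_means, adapted, k=2, fuzz=1.0):
    """Inter-class step: smooth transfer vectors across classes.

    adapted lists the indices of classes that received adaptation data.
    Every class (adapted or not) is shifted by a weighted average of the
    transfer vectors of its k nearest adapted classes, where nearness is
    measured between prior means. The weighting scheme is an assumption.
    """
    dim = len(prior_means[0])
    transfer = {
        i: [map_means[i][d] - prior_means[i][d] for d in range(dim)]
        for i in adapted
    }
    result = []
    for mu in prior_means:
        # k nearest adapted classes in the prior (speaker-independent) space
        neigh = sorted(adapted, key=lambda j: math.dist(mu, prior_means[j]))[:k]
        weights = [math.exp(-math.dist(mu, prior_means[j]) / fuzz) for j in neigh]
        total = sum(weights)
        shift = [
            sum(w * transfer[j][d] for w, j in zip(weights, neigh)) / total
            for d in range(dim)
        ]
        result.append([mu[d] + shift[d] for d in range(dim)])
    return result
```

In a word-by-word setting, one would presumably rerun both steps after each new word: the MAP step refines only the classes observed so far, and the smoothing step propagates their shifts to classes that have not yet been seen, which is one plausible reading of why VFS speeds up adaptation when little data is available.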
