Abstract
Hidden Markov modeling (HMM) techniques have been used successfully for connected speech recognition in the last several years. In traditional HMM algorithms, the probability of remaining in a state for a given duration decreases exponentially with time, which is not appropriate for representing the temporal structure of speech. Non-parametric modeling of duration using semi-Markov chains accomplishes the task, but at the cost of a large increase in computational complexity. Applying a postprocessing state duration penalty after Viterbi decoding adds very little computation but does not affect the forward recognition path. The authors present a way of modeling state durations in HMMs using time-dependent state transitions. This inhomogeneous HMM (IHMM) increases the computation by a small amount but reduces recognition error rates by 14-25%. A suboptimal implementation of this scheme that requires no more computation than the traditional HMM is also presented; it reduces errors by 14-22% on a variety of databases.
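
The contrast between a constant and a time-dependent self-transition can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `duration_pmf`, the `self_loop` callable, and all numerical values are hypothetical and chosen only for illustration. If a state is left with probability 1 - a(d) after d frames, the probability of staying exactly d frames is (1 - a(d)) times the product of a(1), ..., a(d-1). A constant a gives the exponentially decaying (geometric) duration of a traditional HMM, while a duration-dependent a(d) can place the peak away from d = 1.

```python
def duration_pmf(self_loop, max_d):
    """Return [P(duration = 1), ..., P(duration = max_d)].

    self_loop(d) is the assumed probability of staying in the state
    after having already spent d frames in it (hypothetical interface).
    """
    pmf = []
    survive = 1.0  # probability of having taken d - 1 self-transitions so far
    for d in range(1, max_d + 1):
        a = self_loop(d)
        pmf.append(survive * (1.0 - a))  # leave the state after exactly d frames
        survive *= a                     # or stay for at least one more frame
    return pmf


# Traditional (homogeneous) HMM: constant self-transition probability.
geometric = duration_pmf(lambda d: 0.8, 10)

# Time-dependent self-transition (illustrative schedule, not from the paper):
# leaving is unlikely for the first three frames, then becomes likely.
peaked = duration_pmf(lambda d: 0.95 if d < 4 else 0.5, 10)

# Most probable duration under each model (1-based frame count).
geometric_mode = geometric.index(max(geometric)) + 1
peaked_mode = peaked.index(max(peaked)) + 1
```

With the constant schedule the most probable duration is always a single frame, whereas the illustrative time-dependent schedule moves the mode to four frames, which is the kind of temporal structure a geometric duration cannot express.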
