Abstract
Ergodic, continuous-observation hidden Markov models (HMMs) were used to perform automatic language classification and detection of speech messages. State observation probability densities were modeled as tied Gaussian mixtures. The algorithm was evaluated on four multilanguage speech databases: a three-language subset of the Spoken Language Library, a three-language subset of a five-language Rome Laboratory database, the 20-language CCITT database, and the ten-language OGI (Oregon Graduate Institute) telephone speech database. In general, the performance of a single-state HMM (i.e., a static Gaussian mixture classifier) was comparable to that of the multistate HMMs, indicating that the sequential modeling capabilities of HMMs were not exploited.
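
To make the single-state case concrete, the following is a minimal sketch of a static tied Gaussian mixture classifier. It is not the paper's implementation. It assumes diagonal covariances, a Gaussian codebook shared by all languages (the "tied" part), and one mixture-weight vector per language. The function names `log_gaussians` and `classify` and all parameter names are hypothetical.

```python
import numpy as np

def log_gaussians(frames, means, variances):
    """Log density of every frame under every diagonal-covariance Gaussian.

    frames: (T, D) feature vectors; means, variances: (K, D).
    Returns an array of shape (T, K).
    """
    diff = frames[:, None, :] - means[None, :, :]            # (T, K, D)
    mahalanobis = np.sum(diff ** 2 / variances, axis=2)      # (T, K)
    log_norm = np.sum(np.log(2 * np.pi * variances), axis=1) # (K,)
    return -0.5 * (mahalanobis + log_norm)

def classify(frames, means, variances, weights_by_language):
    """Return the best-scoring language and all per-language log-likelihoods.

    Assumption: the Gaussians are shared (tied) across languages and only
    the strictly positive mixture weights differ per language.
    """
    log_b = log_gaussians(frames, means, variances)  # computed once, shared
    scores = {}
    for language, weights in weights_by_language.items():
        a = log_b + np.log(weights)                  # (T, K)
        m = a.max(axis=1, keepdims=True)             # for a stable log-sum-exp
        frame_ll = m[:, 0] + np.log(np.sum(np.exp(a - m), axis=1))
        scores[language] = float(np.sum(frame_ll))   # frames treated as independent
    return max(scores, key=scores.get), scores
```

Because frames are scored independently, this classifier ignores frame order entirely. A multistate HMM would add transition probabilities between states on top of the same tied densities.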
