Abstract
We propose a method that combines a probabilistic phonetic feature hierarchy with support vector machines for segmentation of continuous speech into five classes - vowel, sonorant consonant, fricative, stop and silence. We show that by using the hierarchy, only four binary classifiers are required to recognize the five classes. Due to the probabilistic nature of the hierarchy, the method overcomes the disadvantage of the traditional acoustic-phonetic methods where the error is carried down the hierarchy. In addition, the hierarchical approach allows the use of comparable amount of training data of two classes that each binary classifier is designed to discriminate. The segmentation method with 13 knowledge based parameters performs considerably better than a context-dependent hidden Markov model (HMM) based approach that uses 39 mel-cepstrum based parameters.

This publication has 5 references indexed in Scilit: