Segmentation of speech using speaker identification
- 17 December 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. i (15206149), I/161-I/164
- https://doi.org/10.1109/icassp.1994.389330
Abstract
This paper describes techniques for segmentation of conversational speech based on speaker identity. Speaker segmentation is performed using Viterbi decoding on a hidden Markov model network consisting of interconnected speaker sub-networks. Speaker sub-networks are initialized using Baum-Welch training on data labeled by speaker, and are iteratively retrained based on the previous segmentation. If data labeled by speaker is not available, agglomerative clustering is used to approximately segment the conversational speech according to speaker prior to Baum-Welch training. The distance measure for the clustering is a likelihood ratio in which speakers are modeled by Gaussian distributions. The distance between merged segments is recomputed at each stage of the clustering, and a duration model is used to bias the likelihood ratio. Segmentation accuracy using agglomerative clustering initialization matches accuracy using initialization with speaker-labeled data.
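The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract only says that speakers are modeled by Gaussian distributions, so the diagonal covariance, the variance floor, and the names `log_det_diag`, `glr_distance`, and `agglomerative_cluster` are assumptions made for illustration. The duration-model bias on the likelihood ratio is omitted because the abstract does not specify its form. The distance is the negative log of a generalized likelihood ratio: one Gaussian fitted to the pooled frames versus a separate Gaussian per segment.

```python
import math


def log_det_diag(frames, floor=1e-6):
    """Log-determinant of the ML diagonal covariance of a list of feature frames.

    The diagonal covariance and the variance floor are assumptions of this sketch.
    """
    n = len(frames)
    dim = len(frames[0])
    total = 0.0
    for d in range(dim):
        mean = sum(f[d] for f in frames) / n
        var = sum((f[d] - mean) ** 2 for f in frames) / n
        total += math.log(max(var, floor))
    return total


def glr_distance(x, y):
    """Negative log likelihood ratio: pooled Gaussian vs. one Gaussian per segment.

    With ML estimates the constant terms cancel, leaving only the
    frame-count-weighted log-determinants.
    """
    z = list(x) + list(y)
    return 0.5 * (
        len(z) * log_det_diag(z)
        - len(x) * log_det_diag(x)
        - len(y) * log_det_diag(y)
    )


def agglomerative_cluster(segments, num_clusters):
    """Merge the closest pair of clusters until num_clusters remain.

    Distances are recomputed from the pooled frames after every merge.
    Returns, for each final cluster, the indices of the original segments in it.
    """
    clusters = [list(s) for s in segments]
    members = [[i] for i in range(len(segments))]
    while len(clusters) > num_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = glr_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        members[i] = members[i] + members[j]
        del clusters[j]
        del members[j]
    return members
```

In a pipeline like the one the abstract outlines, the resulting cluster memberships would serve as approximate speaker labels for initializing the speaker sub-networks before Baum-Welch training; the number of clusters (speakers) is assumed known in this sketch.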