Segmental prototype interpolation coding
- 1 January 1999
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 4 (ISSN 1520-6149), pp. 2311-2314
- https://doi.org/10.1109/icassp.1999.758400
Abstract
Current parametric speech coding schemes can achieve high communications quality speech at bit rates in the range of 2.4 to 1.5 kbits/sec. Most schemes sample and quantise, at regular intervals, the "tracks in time" generated by the parameters of the speech production model. As a result, reconstructed "parameter tracks" do not evolve "smoothly" with time. Furthermore, no advantage is taken of the "linguistic event" nature of speech. In this paper, model parameter "time tracks" are split into non-overlapping, speech "event" related segments. These segment-based evolutions of model parameters are then vector quantised to provide, at the receiver, a smooth and subjectively meaningful reconstruction. The paper presents an application of this generic segmental speech model quantisation approach to a 1.5 kbits/sec prototype interpolation coding (PIC) system. Results indicate that the proposed methodology can almost halve the bit rate of this PIC system while preserving overall recovered speech quality.
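The segment-based quantisation idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how segment boundaries are found, how segments of different length are compared, or how the codebook is trained. The sketch assumes that boundaries are already given, that each segment is linearly resampled to a fixed number of points before a nearest-neighbour codebook search, and that the decoder stretches the chosen codebook entry back to the transmitted duration. All names (`split_segments`, `resample`, `encode`, `decode`) and the toy codebook are hypothetical.

```python
from typing import List, Sequence, Tuple

Frame = List[float]   # one model-parameter vector
Track = List[Frame]   # one frame per analysis instant


def split_segments(track: Track, boundaries: Sequence[int]) -> List[Track]:
    """Split a parameter track into non-overlapping segments at the given frame indices."""
    edges = [0, *boundaries, len(track)]
    return [track[a:b] for a, b in zip(edges, edges[1:]) if b > a]


def resample(segment: Track, n_points: int) -> Track:
    """Linearly resample a segment to n_points frames (assumed time normalisation)."""
    out = []
    for i in range(n_points):
        pos = i * (len(segment) - 1) / max(n_points - 1, 1)
        lo = int(pos)
        hi = min(lo + 1, len(segment) - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(segment[lo], segment[hi])])
    return out


def distance(a: Track, b: Track) -> float:
    """Squared Euclidean distance between two equal-length segments."""
    return sum((x - y) ** 2 for fa, fb in zip(a, b) for x, y in zip(fa, fb))


def encode(track: Track, boundaries: Sequence[int], codebook: List[Track]) -> List[Tuple[int, int]]:
    """Return one (codebook index, duration in frames) pair per segment."""
    n_points = len(codebook[0])
    codes = []
    for seg in split_segments(track, boundaries):
        shape = resample(seg, n_points)
        index = min(range(len(codebook)), key=lambda k: distance(shape, codebook[k]))
        codes.append((index, len(seg)))
    return codes


def decode(codes: Sequence[Tuple[int, int]], codebook: List[Track]) -> Track:
    """Rebuild a track by stretching each codebook entry to its segment duration."""
    track: Track = []
    for index, duration in codes:
        track.extend(resample(codebook[index], duration))
    return track


# Toy one-dimensional track: a rising "event" followed by a steady one.
track = [[0.0], [1.0], [2.0], [3.0], [4.0], [4.0], [4.0], [4.0]]
boundaries = [5]  # assumed to come from some event detector
codebook = [
    [[0.0], [2.0], [4.0]],  # rising shape
    [[4.0], [4.0], [4.0]],  # flat shape
]
codes = encode(track, boundaries, codebook)
reconstructed = decode(codes, codebook)
```

In this toy case the eight frames are represented by two (index, duration) pairs, and the decoder interpolates within each segment instead of holding independently quantised frame values. That is the sense in which a segmental scheme can give smoother reconstructed tracks at a lower rate than frame-by-frame quantisation.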
This publication has 6 references indexed in Scilit:
- Source driven variable bit rate prototype interpolation coding. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- A very low bit rate speech coder using HMM-based speech recognition/synthesis techniques. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- Segmental vocoder-going beyond the phonetic approach. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- Efficient coding of LSP parameters using split matrix quantisation. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- Multiband excitation vocoder. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1988
- Speech analysis/synthesis based on a sinusoidal representation. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1986