An Improved Word-Detection Algorithm for Telephone-Quality Speech Incorporating Both Syntactic and Semantic Constraints
- 1 March 1984
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in AT&T Bell Laboratories Technical Journal
- Vol. 63 (3) , 479-498
- https://doi.org/10.1002/j.1538-7305.1984.tb00016.x
Abstract
Accurate location of the endpoints of spoken words and phrases is important for reliable and robust speech recognition. The endpoint detection problem is fairly straightforward for high-level speech signals in low-level stationary noise environments (e.g., signal-to-noise ratios greater than 30-dB rms). However, this problem becomes considerably more difficult when either the speech signals are too low in level (relative to the background noise), or when the background noise becomes highly nonstationary. Such conditions are often encountered in the switched telephone network when the limitation on using local dialed-up lines is removed. In such cases the background noise is often highly variable in both level and spectral content because of transmission line characteristics, transients and tones from the line and/or from signal generators, etc. Conventional speech endpoint detectors have been shown to perform very poorly (on the order of 50-percent word detection) under these conditions. In this paper we present an improved word-detection algorithm, which can incorporate both vocabulary (syntactic) and task (semantic) information, leading to word-detection accuracies close to 100 percent for isolated digit detection over a wide range of telephone transmission conditions.
This publication has 8 references indexed in Scilit:
- Speaker-independent isolated word recognition using a 129-word airline vocabulary. The Journal of the Acoustical Society of America, 1982
- An improved endpoint detector for isolated word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1981
- Isolated and Connected Word Recognition--Theory and Selected Applications. IEEE Transactions on Communications, 1981
- A level building dynamic time warping algorithm for connected word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1981
- Speaker-independent isolated word recognition for a moderate size (54 word) vocabulary. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1979
- Considerations in applying clustering techniques to speaker-independent word recognition. The Journal of the Acoustical Society of America, 1979
- Speaker-independent recognition of isolated words using clustering techniques. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1979
- Minimum prediction residual principle applied to speech recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1975