Structural methods in automatic speech recognition
- 1 January 1985
- journal article
- review article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in Proceedings of the IEEE
- Vol. 73 (11), 1625-1650
- https://doi.org/10.1109/proc.1985.13344
Abstract
The past decade has witnessed substantial progress toward the goal of constructing a machine capable of understanding colloquial discourse. Central to this progress has been the development and application of mathematical methods that permit modeling the speech signal as a complex code with several coexisting levels of structure. The most successful of these are "template matching," stochastic modeling, and probabilistic parsing. The manifestation of common themes such as dynamic programming and finite-state descriptions accentuates a superficial likeness amongst the methods which is often mistaken for the deeper similarity arising from their shared Bayesian foundation. In this paper, we outline the mathematical bases of these methods, invariant metrics, hidden Markov chains, and formal grammars, respectively. We then recount and briefly interpret the results of experiments in speech recognition to which the various methods were applied. Since these mathematical principles seem to bear little resemblance to traditional linguistic characterizations of speech, the success of the experiments is occasionally attributed, even by their authors, merely to excellent engineering. We conclude by speculating that, quite to the contrary, these methods actually constitute a powerful theory of speech that can be reconciled with and elucidate conventional linguistic theories while being used to build truly competent mechanical speech recognizers.