A Study of the Representation of English Vowel Phonemes in the Orthography

Abstract
A study was made of the correlation between the written and spoken forms of English, as part of a project for the construction of a machine to derive from printed text the corresponding forms of the spoken language. The units selected for correlation were phonemes and graphemes. A tentative list of correspondences was drawn up for each phoneme, on an intuitive basis. This list was checked by counting the correspondences occurring in a sample of text, refinements and changes in the statement of graphemic sequences being introduced as the need arose. The correspondences were then re-formulated as rules for the prediction of phonemes from graphemic sequences. It was found that the correlation could be simplified by taking account of the reflection in the orthographic system of certain grammatical patterns, and by fusion of some sequences, in particular those containing the grapheme r, with other sequences having different phonemic correspondents. This involved modification of the initial phonemicization to the extent that the units with which graphemic sequences were correlated could no longer be considered as phonemes in any acceptable sense of the term; for these units the term ‘graphophoneme’ has been introduced. The resulting set of correlations is not completely accurate. Inaccuracies are due either to the rare occurrence of a correspondence (in which case it is not accounted for in the conversion rules), or to the lack of a consistent graphemic environment by reference to which different correspondents of a single sequence might be specified (in which case correspondents are assigned randomly, in the ratio in which they occurred in the sample). It proved possible, however, to specify the more frequent correspondences with a high degree of accuracy.
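The treatment of ambiguous sequences described above (random assignment of correspondents in the ratio observed in the sample) can be illustrated by a minimal sketch. It is not the procedure of the study itself: the rule table, the counts, the ASCII phoneme symbols and the function name `predict` are all hypothetical, chosen only to show the idea of weighted random assignment.

```python
import random

# Hypothetical rule table: graphemic sequence -> {correspondent: count}.
# The counts are invented placeholders, NOT figures from the study.
RULES = {
    "ea": {"i:": 7, "e": 3},  # e.g. "bead" vs. "head"
    "oo": {"u:": 1},          # a sequence with a single correspondent
}

def predict(grapheme_seq, rules=RULES, rng=random):
    """Return one correspondent for grapheme_seq, or None if no rule covers it."""
    counts = rules.get(grapheme_seq)
    if counts is None:
        # Sequences too rare to have a rule are simply not converted.
        return None
    symbols = list(counts)
    weights = [counts[s] for s in symbols]
    # Draw one correspondent with probability proportional to its count.
    return rng.choices(symbols, weights=weights, k=1)[0]
```

Under these assumed counts, `predict("ea")` returns "i:" in roughly seven cases out of ten, while `predict("oo")` always returns "u:". Passing a seeded `random.Random` instance as `rng` makes the assignment reproducible.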
