The limitations of speech recognition procedures that depend solely on acoustic data are discussed. One such 'primary recognition' scheme, based on phoneme classification by tracking the acoustic correlates of a set of distinctive features, is presented. Programmed on a digital computer, these logical operations on digitized spectra of 17-msec samples of speech were tested on some 300 nonsense utterances from two talkers. A priori information about individual talker characteristics is incorporated into the logic (single-speaker approach). Machine performance was compared both with the intent of the speaker and with the judgments of listeners, who were presented with the same acoustic stimuli that the machine processed. Some perceptual tests were run on short vowel segments excised from nonsense syllables. Detailed quantitative results are presented only for vowels; they show that man and machine agree about 90% of the time on vowel judgments under these conditions of minimal contextual information. Clear feature boundaries are shown on the F1-F2 plane for the (stressed) vowel utterances. Although these boundaries are not generally valid for more than one voice, simple translations of them may suffice to obtain usable vowel separation for many talkers.
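To make the final point concrete, the sketch below shows one minimal way a vowel decision in the F1-F2 plane could be adapted to a new talker by a simple translation. It is a hypothetical illustration, not the paper's logic: the centroid values and the nearest-centroid rule (standing in for the paper's feature boundaries) are assumptions introduced here for clarity.

```python
import math

# Hypothetical F1-F2 centroids (Hz) for a few stressed vowels; the
# paper's actual boundaries are talker-specific and not reproduced here.
VOWEL_CENTROIDS = {
    "i": (270, 2290),
    "ae": (660, 1720),
    "a": (730, 1090),
    "u": (300, 870),
}

def classify_vowel(f1, f2, talker_offset=(0.0, 0.0)):
    """Nearest-centroid vowel decision in the F1-F2 plane.

    talker_offset is a (dF1, dF2) translation applied to the measured
    formants, standing in for the paper's per-talker shift of the
    feature boundaries.
    """
    df1, df2 = talker_offset
    f1, f2 = f1 - df1, f2 - df2
    return min(
        VOWEL_CENTROIDS,
        key=lambda v: math.hypot(f1 - VOWEL_CENTROIDS[v][0],
                                 f2 - VOWEL_CENTROIDS[v][1]),
    )

# Example: a talker whose formants run roughly 50 Hz (F1) and
# 150 Hz (F2) above the reference voice.
print(classify_vowel(780, 1240, talker_offset=(50, 150)))  # -> "a"
```

The translation step is the only talker-dependent part of this sketch, mirroring the abstract's suggestion that shifting fixed boundaries may yield usable vowel separation across many talkers.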