A Comparison of Four Methods for Analog Speech Privacy

Abstract
Four well-known procedures for analog speech privacy have been compared in terms of residual intelligibility, bandwidth expansion, and encoding delay. Intelligibility scores were determined from a perceptual experiment in which about 70 untrained listeners were asked to recognize each of 200 spoken digits occurring in a balanced set of 50 encrypted four-digit utterances; the resulting probabilities of correct digit recognition were then averaged. Bandwidth expansion is expressed in terms of a new segmental measure that is more sensitive to short-time bandwidth manipulations than a conventional, long-time-averaged power spectrum measurement. Encoding delay is a straightforward function of the analog scrambler parameters. The scrambling procedures compared are sample permutation (S), block permutation (B), frequency inversion (F), and a combination of methods B and F, denoted by [BF]. Sample permutations involved a contiguous set of L_S (2 to 128) 8 kHz samples, while block permutations operated on a contiguous set of N_B (4 to 128) speech segments, each of which was L_B (8 to 256) samples long. Frequency inversion is obtained by simply inverting the sign of every other Nyquist (8 kHz) sample. The parameters L_S, N_B, and L_B determine residual intelligibility as well as transmission properties such as encoding delay and bandwidth. The comparisons in our study provide a quantitative justification for the popular approach [BF]. For example, with N_B = 8 and L_B = 128, although the encoding delay is as much as 128 ms, the bandwidth expansion is only about 100 Hz (using the new segmental measure), and the digit intelligibility I is 20 percent. Note that in the specific problem of recognizing ten digits, purely random (input-independent) listener responses correspond to I = 10 percent.
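The three basic operations described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of a fixed permutation per frame, the handling of a trailing partial frame, and the assumption that the encoding delay equals one frame duration (N_B × L_B samples at 8 kHz) are all illustrative choices. The last assumption does reproduce the 128 ms quoted for N_B = 8 and L_B = 128.

```python
import random

FS_HZ = 8000  # sampling rate stated in the abstract


def frequency_inversion(x):
    """Method F: invert the sign of every other sample."""
    return [-v if n % 2 else v for n, v in enumerate(x)]


def block_permutation(x, n_b, l_b, perm):
    """Method B: reorder n_b contiguous segments of l_b samples in each frame.

    `perm` is a hypothetical fixed key: output segment i is input segment perm[i].
    """
    frame = n_b * l_b
    out = []
    for start in range(0, len(x) - frame + 1, frame):
        blocks = [x[start + i * l_b:start + (i + 1) * l_b] for i in range(n_b)]
        for j in perm:
            out.extend(blocks[j])
    # Assumption: a trailing partial frame is passed through unscrambled.
    out.extend(x[len(out):])
    return out


def sample_permutation(x, l_s, perm):
    """Method S: block permutation with one-sample segments."""
    return block_permutation(x, l_s, 1, perm)


def encoding_delay_ms(n_b, l_b, fs_hz=FS_HZ):
    """Assumed delay model: the scrambler buffers one full frame."""
    return 1000.0 * n_b * l_b / fs_hz


# Usage: the [BF] combination with N_B = 8, L_B = 128.
rng = random.Random(0)
key = list(range(8))
rng.shuffle(key)
speech = [float(n) for n in range(2048)]  # stand-in for 8 kHz speech samples
scrambled = frequency_inversion(block_permutation(speech, 8, 128, key))
print(encoding_delay_ms(8, 128))  # 128.0
```

In this sketch the order of B and F does not matter when L_B is even, because moving a segment by a multiple of L_B preserves the parity of each sample index.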
