Spectral and temporal cues to pitch in noise-excited vocoder simulations of continuous-interleaved-sampling cochlear implants

Abstract
Four-band and single-band noise-excited vocoders were used in acoustic simulations to investigate spectral and temporal cues to melodic pitch in the output of a cochlear implant speech processor. Noise carriers were modulated by amplitude envelopes extracted by half-wave rectification and low-pass filtering at 32 or 400 Hz. The four-band processors, but not the single-band processors, may preserve spectral correlates of fundamental frequency (F0). Envelope smoothing at 400 Hz preserves temporal correlates of F0, which are eliminated with 32-Hz smoothing. Inputs to the processors were sawtooth frequency glides, in which spectral variation is completely determined by F0, or synthetic diphthongal vowel glides, whose spectral shape is dominated by varying formant resonances. Normal-hearing listeners labeled the direction of pitch movement of the processed stimuli. For processed sawtooth waves, purely temporal cues led to decreasing performance with increasing F0. With purely spectral cues, performance was above chance despite the limited spectral resolution of the processors. For processed diphthongs, performance with purely spectral cues was at chance, showing that spectral envelope changes due to formant movement obscured spectral cues to F0. Performance with temporal cues was poorer for diphthongs than for sawtooth waves, with very limited discrimination at higher F0. These data suggest that, for speech signals through a typical cochlear implant processor, spectral cues to pitch are likely to have limited utility, while temporal envelope cues may be useful only at low F0.
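The envelope-extraction step described above can be illustrated with a minimal single-band sketch. It is not the processing used in the study. The second-order Butterworth smoothing filter, the 16-kHz sampling rate, the 150-Hz steady sawtooth input, the uniform white-noise carrier, and all function names are assumptions made for illustration. The band-pass analysis and output filters of a real vocoder are omitted.

```python
import math
import random

def sawtooth(f0, dur, fs):
    """Sawtooth wave in [-1, 1) with fundamental f0 (Hz)."""
    return [2.0 * ((f0 * i / fs) % 1.0) - 1.0 for i in range(int(dur * fs))]

def lowpass(x, cutoff, fs):
    """Second-order Butterworth low-pass (biquad). The order is an assumption."""
    w0 = 2.0 * math.pi * cutoff / fs
    alpha = math.sin(w0) / math.sqrt(2.0)  # sin(w0) / (2Q) with Q = 1/sqrt(2)
    cosw = math.cos(w0)
    a0 = 1.0 + alpha
    b0 = (1.0 - cosw) / 2.0 / a0
    b1 = (1.0 - cosw) / a0
    b2 = b0
    a1 = -2.0 * cosw / a0
    a2 = (1.0 - alpha) / a0
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for xn in x:
        yn = b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

def envelope(signal, cutoff, fs):
    """Half-wave rectify, then smooth with the low-pass filter."""
    return lowpass([max(v, 0.0) for v in signal], cutoff, fs)

def ripple(env, fs, settle=0.2):
    """Peak-to-peak envelope fluctuation relative to twice the mean,
    measured after the filter has settled (first `settle` seconds skipped)."""
    tail = env[int(settle * fs):]
    mean = sum(tail) / len(tail)
    return (max(tail) - min(tail)) / (2.0 * mean)

def vocode(signal, cutoff, fs, seed=0):
    """Single-band noise vocoder: the envelope modulates a white-noise carrier."""
    rng = random.Random(seed)
    return [e * rng.uniform(-1.0, 1.0) for e in envelope(signal, cutoff, fs)]

if __name__ == "__main__":
    fs = 16000
    saw = sawtooth(150.0, 0.5, fs)
    # The 400-Hz envelope should still fluctuate at the F0 rate;
    # the 32-Hz envelope should be nearly flat.
    print("ripple, 32-Hz smoothing: ", ripple(envelope(saw, 32.0, fs), fs))
    print("ripple, 400-Hz smoothing:", ripple(envelope(saw, 400.0, fs), fs))
```

Under these assumptions, the 400-Hz envelope retains strong periodicity at F0 while the 32-Hz envelope is almost constant, which is the contrast between the two smoothing conditions. A four-band version would apply the same steps separately to each band-pass-filtered channel.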
Keywords