Subband based classification of speech under stress

Abstract
This study proposes a new set of feature parameters, based on subband analysis of the speech signal, for classifying speech under stress. The new features are scale energy (SE), autocorrelation-scale-energy (ACSE), subband-based cepstral parameters (SC), and autocorrelation-SC (ACSC). Their ability to capture different stress types is compared with that of widely used mel-scale cepstrum representations: mel-frequency cepstral coefficients (MFCC) and autocorrelation-mel-scale (AC-mel). A feedforward neural network is then formulated for speaker-dependent classification of 10 stress conditions: angry, clear, cond50/70, fast, loud, Lombard, neutral, question, slow, and soft. The classification algorithm is evaluated on a previously established stressed speech database, SUSAS (Hansen and Bou-Ghazale 1997). Subband-based features increase classification rates over MFCC-based parameters by 7.3% and 9.1% for the ungrouped and grouped stress closed-vocabulary test scenarios, respectively. Moreover, the average scores of the new features across the simulations are 8.6% and 13.6% higher than those of MFCC-based features for the ungrouped and grouped stress test scenarios, respectively.
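
To make the idea of subband energy and subband cepstral features concrete, the following is a minimal illustrative sketch, not the paper's method. The abstract does not specify the subband decomposition, the frame length, or the number of bands, so this sketch assumes equal-width FFT bands as a stand-in, and every name in it (subband_log_energies, subband_cepstrum, n_bands, n_coeffs) is hypothetical. It computes log energies per band for one frame, then applies a DCT-II to obtain cepstrum-like coefficients.

```python
import numpy as np


def subband_log_energies(frame, n_bands=8):
    """Log energy in n_bands equal-width FFT bands of one frame.

    Assumption: equal-width FFT bands stand in for the paper's
    subband analysis, which the abstract does not describe.
    """
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    bands = np.array_split(power, n_bands)
    energies = np.array([band.sum() for band in bands])
    # Small offset avoids log(0) for empty or silent bands.
    return np.log(energies + 1e-12)


def subband_cepstrum(log_energies, n_coeffs=6):
    """Unnormalized DCT-II of the log energies, first n_coeffs terms."""
    n = len(log_energies)
    k = np.arange(n_coeffs)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (m + 0.5) / n)
    return basis @ log_energies


# Synthetic test frame (assumed values): 256 samples of a 1 kHz tone
# sampled at 8 kHz.
fs = 8000
t = np.arange(256) / fs
frame = np.sin(2 * np.pi * 1000 * t)

log_e = subband_log_energies(frame)
ceps = subband_cepstrum(log_e)
```

In a classifier of the kind the abstract describes, vectors such as log_e or ceps, computed per frame, would serve as inputs to the feedforward network; the autocorrelation variants would need a further step that this sketch does not attempt.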
