An investigation of PLP and IMELDA acoustic representations and of their potential for combination

Abstract
Two acoustic representations, integrated Mel-scale representation with LDA (IMELDA) and perceptual linear prediction-root power sums (PLP-RPS), both of which have given good results in speech recognition tests, are explored. IMELDA is examined in the context of some related representations. Results of speaker-dependent and speaker-independent tests with digits and the alphabet suggest that the optimum PLP order is high, and that the effectiveness of PLP-RPS stems not from its modeling of perceptual properties but from its approximation to a desirable statistical property that IMELDA attains exactly. A combined PLP-IMELDA representation is found to be generally more effective than PLP-RPS, but an IMELDA representation derived directly from a filter bank gives results similar to those of PLP-IMELDA at a lower computational cost.
