Feature Optimization of Speech Emotion Recognition

PP. 37-43
DOI: 10.4236/jbise.2016.910B005

ABSTRACT

This paper divides speech emotion into four categories: Fear, Happy, Neutral and Surprise. Traditional features and their statistics are generally applied to recognize speech emotion. To quantify each feature’s contribution to emotion recognition, a method based on the Back Propagation (BP) neural network is adopted, from which the optimal subset of features is obtained. In addition, two new characteristics of speech emotion are added to the selected features: an MFCC feature extracted from the fundamental frequency curve (MFCCF0) and amplitude perturbation parameters extracted from the short-time average magnitude curve (APSAM). With the Gaussian Mixture Model (GMM), the highest average recognition rate over the four emotions is 82.25%, and the recognition rate for Neutral is 90%.
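
To make the idea of amplitude perturbation on a short-time average magnitude curve more concrete, the following is a minimal Python sketch. It is not the authors' APSAM definition, which the abstract does not give. It assumes a rectangular window for the magnitude curve and a shimmer-style relative perturbation (mean absolute difference between neighbouring values divided by the mean value). The function names, `frame_len` and `hop` are hypothetical and chosen only for illustration.

```python
def short_time_average_magnitude(signal, frame_len, hop):
    """Mean absolute amplitude of each frame (rectangular window, assumed)."""
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    curve = []
    # Slide a fixed-length frame over the signal; drop the incomplete tail.
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        curve.append(sum(abs(x) for x in frame) / frame_len)
    return curve


def relative_perturbation(curve):
    """Shimmer-style measure (an assumption, not the paper's formula):
    mean absolute difference between neighbouring values of the curve,
    divided by the mean value of the curve."""
    if len(curve) < 2:
        raise ValueError("need at least two values")
    mean_value = sum(curve) / len(curve)
    if mean_value == 0:
        return 0.0
    steps = [abs(b - a) for a, b in zip(curve, curve[1:])]
    return (sum(steps) / len(steps)) / mean_value


# Toy signal whose amplitude jumps from 1 to 3 halfway through.
toy_signal = [1, -1, 1, -1, 3, -3, 3, -3]
toy_curve = short_time_average_magnitude(toy_signal, frame_len=4, hop=4)
toy_perturbation = relative_perturbation(toy_curve)
```

A curve with constant magnitude gives a perturbation of zero under this measure, while larger frame-to-frame swings in loudness give larger values. Statistics of this kind could then be appended to a feature vector before classification.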

Share and Cite:

Yu, C. , Xie, L. and Hu, W. (2016) Feature Optimization of Speech Emotion Recognition. Journal of Biomedical Science and Engineering, 9, 37-43. doi: 10.4236/jbise.2016.910B005.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.