Speaker Recognition System Based on the Baseband Correlation Score Reliability Fusion

DOI: 10.4236/cn.2013.53B2107


In emotional speaker recognition, a mismatch in emotion between training and test speech causes system performance to decline sharply. One important approach to this problem is emotion normalization of the test speech. The method presented here starts from an analysis of the differences between each kind of emotional speech and neutral speech, and focuses on the baseband mismatch caused by emotional changes. It gives corresponding algorithms for four technical points: emotional expansion, emotional shield, emotional normalization and score compensation. Compared with the traditional GMM-UBM method, the recognition rate on the MASC corpus and the EPST corpus increased by 3.80% and 8.81%, respectively.

Share and Cite:

He, Q. , Huang, T. and Zhang, H. (2013) Speaker Recognition System Based on the Baseband Correlation Score Reliability Fusion. Communications and Network, 5, 596-600. doi: 10.4236/cn.2013.53B2107.

Conflicts of Interest

The authors declare no conflicts of interest.




Copyright © 2020 by authors and Scientific Research Publishing Inc.


This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.