Speech Signal Recovery Based on Source Separation and Noise Suppression

Zhe Wang; Haijian Zhang; Guoan Bi

doi:10.4236/jcc.2014.29015

Journal of Computer and Communications > Vol.2 No.9, July 2014

School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore.

Abstract

This paper presents a speech signal recovery algorithm for a personalized voice-command automatic recognition system operating in vehicle and restaurant environments. The algorithm separates mixed speech from multiple speakers, detects the presence or absence of speakers by tracking the higher-magnitude portion of the speech power spectrum, and adaptively suppresses noise. An automatic speech recognition (ASR) process that handles the multi-speaker task is designed and implemented. Evaluation tests on the NOIZEUS speech database show that the proposed algorithm achieves impressive performance improvements.
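The abstract does not give the details of the detection and suppression steps, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a spectral-subtraction style gain, a speech-presence test that compares the mean of the largest power-spectrum bins against the same statistic of a running noise estimate, and a noise estimate updated only on frames judged speech-free. The function name `suppress_noise` and every parameter (`top_frac`, `presence_ratio`, `alpha`, `floor`) are hypothetical choices made for illustration.

```python
import numpy as np

def suppress_noise(x, frame_len=256, hop=128, top_frac=0.1,
                   presence_ratio=5.0, alpha=0.9, floor=0.05):
    """Illustrative sketch; all names and defaults are assumptions."""
    # Periodic Hann window: shifted copies sum to 1 at 50% overlap.
    window = np.hanning(frame_len + 1)[:-1]
    n_frames = 1 + (len(x) - frame_len) // hop
    out = np.zeros(len(x))
    present = np.zeros(n_frames, dtype=bool)
    noise_psd = None
    # Number of bins treated as the "higher magnitude portion".
    k = max(1, int(top_frac * (frame_len // 2 + 1)))
    for i in range(n_frames):
        start = i * hop
        spec = np.fft.rfft(window * x[start:start + frame_len])
        psd = np.abs(spec) ** 2
        if noise_psd is None:
            # Assumption: the first frame contains noise only.
            noise_psd = psd.copy()
        # Presence test on the mean of the k largest power bins.
        top_power = np.sort(psd)[-k:].mean()
        top_noise = np.sort(noise_psd)[-k:].mean()
        present[i] = top_power > presence_ratio * top_noise
        if not present[i]:
            # Adapt the noise estimate only when no speech is detected.
            noise_psd = alpha * noise_psd + (1 - alpha) * psd
        # Power spectral subtraction expressed as a gain with a spectral floor.
        ratio = noise_psd / np.maximum(psd, 1e-12)
        gain = np.sqrt(np.maximum(1.0 - ratio, floor ** 2))
        # Overlap-add the filtered frame back into the output.
        out[start:start + frame_len] += np.fft.irfft(gain * spec, n=frame_len)
    return out, present
```

Calling `y, present = suppress_noise(x)` on a one-dimensional signal returns the processed signal and one presence flag per frame. A real system would likely replace the fixed `presence_ratio` threshold and the first-frame noise assumption with something more robust.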

Keywords

Speech Recovery, Time-Frequency Source Separation, Adaptive Noise Suppression, Automatic Speech Recognition

Share and Cite:

Wang, Z., Zhang, H. and Bi, G. (2014) Speech Signal Recovery Based on Source Separation and Noise Suppression. Journal of Computer and Communications, 2, 112-120. doi: 10.4236/jcc.2014.29015.

Conflicts of Interest

The authors declare no conflicts of interest.

