Single-Channel Speech Enhancement Using Critical-Band Rate Scale Based Improved Multi-Band Spectral Subtraction

Abstract

This paper addresses the problem of single-channel speech enhancement in adverse environments. An improved multi-band spectral subtraction method based on the critical-band rate scale is investigated for this purpose. The speech spectrum is divided into non-uniformly spaced frequency bands in accordance with the critical-band rate scale of the psycho-acoustic model, and spectral over-subtraction is carried out separately in each band. The noise in each band is estimated with an adaptive noise estimation approach that does not require explicit speech silence detection: the noise estimate is updated by adaptively smoothing the noisy signal power in each band, with the smoothing parameter controlled by the a-posteriori signal-to-noise ratio (SNR). The performance of the proposed algorithm is analyzed using objective measures, namely SNR, segmental SNR, and perceptual evaluation of speech quality, for a variety of noises at different SNR levels. The speech spectrograms and objective evaluations of the proposed algorithm are compared with those of other standard speech enhancement algorithms, and the comparison shows that the proposed algorithm better suppresses both the background noise and the musical structure of the remnant noise.
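
The processing chain summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The Hz-to-Bark mapping is the Zwicker-Terhardt analytical approximation; everything else is an assumption made for illustration: the function names, the sigmoid that maps a-posteriori SNR to the smoothing parameter (including `slope` and `threshold`), the Berouti-style linear rule for the over-subtraction factor, and the spectral floor value.

```python
import numpy as np

def hz_to_bark(f_hz):
    """Zwicker-Terhardt approximation of the critical-band rate in Bark."""
    f = np.asarray(f_hz, dtype=float)
    return 13.0 * np.arctan(0.00076 * f) + 3.5 * np.arctan((f / 7500.0) ** 2)

def critical_band_index(n_fft, fs):
    """Assign every rFFT bin to an integer critical band (0, 1, 2, ...)."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    return np.floor(hz_to_bark(freqs)).astype(int)

def update_noise(noise_pow, noisy_pow, band_idx,
                 slope=1.0, threshold=1.5, eps=1e-12):
    """One recursive noise update for a single frame, band by band.

    The smoothing parameter alpha is driven by the a-posteriori SNR of the
    band. The sigmoid and its constants are assumptions of this sketch.
    """
    new_noise = noise_pow.copy()
    for b in range(band_idx.max() + 1):
        bins = band_idx == b
        post_snr = noisy_pow[bins].sum() / (noise_pow[bins].sum() + eps)
        # High SNR (speech likely): alpha near 1, estimate is almost frozen.
        # Low SNR (noise only): alpha smaller, estimate tracks the input.
        alpha = 1.0 / (1.0 + np.exp(-slope * (post_snr - threshold)))
        new_noise[bins] = alpha * noise_pow[bins] + (1.0 - alpha) * noisy_pow[bins]
    return new_noise

def over_subtract(noisy_pow, noise_pow, band_idx, floor=0.002, eps=1e-12):
    """Band-wise spectral over-subtraction with a spectral floor."""
    clean_pow = np.empty_like(noisy_pow)
    for b in range(band_idx.max() + 1):
        bins = band_idx == b
        ratio = noisy_pow[bins].sum() / (noise_pow[bins].sum() + eps)
        snr_db = 10.0 * np.log10(ratio + eps)
        # Assumed Berouti-style rule: subtract more when the band SNR is low.
        over = np.clip(4.0 - 0.15 * snr_db, 1.0, 4.75)
        diff = noisy_pow[bins] - over * noise_pow[bins]
        # Flooring avoids negative power and masks isolated residual peaks.
        clean_pow[bins] = np.maximum(diff, floor * noisy_pow[bins])
    return clean_pow

# Usage on synthetic power spectra (hypothetical frame size and sample rate).
band_idx = critical_band_index(n_fft=256, fs=8000)
noise_est = np.ones(band_idx.size)
noisy_frame = 4.0 * np.ones(band_idx.size)
noise_est = update_noise(noise_est, noisy_frame, band_idx)
enhanced = over_subtract(noisy_frame, noise_est, band_idx)
```

Because the noise estimate is refreshed in every frame, no separate voice-activity decision is needed; the a-posteriori SNR alone decides how strongly each band follows the incoming power.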

Share and Cite:

N. Upadhyay and A. Karmakar, "Single-Channel Speech Enhancement Using Critical-Band Rate Scale Based Improved Multi-Band Spectral Subtraction," Journal of Signal and Information Processing, Vol. 4 No. 3, 2013, pp. 314-326. doi: 10.4236/jsip.2013.43040.

Conflicts of Interest

The authors declare no conflicts of interest.

Copyright © 2023 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.