Splitting of Gaussian Models via Adapted BML Method Pertaining to Cry-Based Diagnostic System

Hesam Farsaie Alaie; Chakib Tadj

doi:10.4236/eng.2013.510B058

Engineering > Vol.5 No.10B, October 2013

Department of Electrical Engineering, École de Technologie Supérieure, Montréal, Canada.

Abstract

In this paper, we use boosting to introduce a new learning algorithm for Gaussian Mixture Models (GMMs), called adapted Boosted Mixture Learning (BML). The method addresses problems found in conventional techniques for estimating GMM parameters, owing in part to a new mixing-up strategy for increasing the number of Gaussian components. The discriminative splitting idea is applied to the Gaussian mixture densities, which are then learned with the introduced method. The resulting GMM classifier is applied to distinguish healthy infants from those presenting a selected set of medical conditions. Each group includes both full-term and premature infants. The cry pattern for each pathological condition is modeled using the adapted BML method and 13-dimensional Mel-Frequency Cepstral Coefficient (MFCC) feature vectors. The test results show that, in a multi-pathological classification task, the introduced training method outperforms the reference system, a traditional method based on random splitting and EM-based re-estimation.
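The classification step mentioned in the abstract can be illustrated with a minimal sketch. It assumes one diagonal-covariance GMM per class and treats feature frames as independent, so a recording is assigned to the class with the highest summed log-likelihood. It does not implement the adapted BML training or the splitting strategy; the function names, the 2-D toy data, and all parameter values are hypothetical and chosen only for illustration (the paper uses 13-dimensional MFCC vectors).

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log p(x | GMM), using log-sum-exp over the weighted components.

    gmm is a list of (weight, mean, variance) tuples (hypothetical layout).
    """
    terms = [math.log(w) + log_gauss_diag(x, mean, var) for w, mean, var in gmm]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def classify(frames, class_models):
    """Pick the class whose GMM gives the highest total log-likelihood.

    Assumes frames are independent, so per-frame log-likelihoods are summed.
    """
    scores = {
        label: sum(gmm_log_likelihood(x, gmm) for x in frames)
        for label, gmm in class_models.items()
    }
    return max(scores, key=scores.get), scores


# Toy 2-D models with made-up parameters; not taken from the paper.
models = {
    "healthy": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "pathological": [(0.7, [4.0, 4.0], [1.0, 1.0]), (0.3, [5.0, 3.0], [0.5, 0.5])],
}
frames = [[0.2, -0.1], [0.9, 1.2], [0.4, 0.6]]
label, scores = classify(frames, models)
print(label)
```

In this sketch the way the per-class models are trained is left open; any estimator that yields component weights, means, and variances could supply the `models` dictionary.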

Keywords

Adapted Boosted Mixture Learning; Gaussian Mixture Model; Splitting of Gaussians; Expectation-Maximization Algorithm; Cry Signals

Share and Cite:

Alaie, H. and Tadj, C. (2013) Splitting of Gaussian Models via Adapted BML Method Pertaining to Cry-Based Diagnostic System. Engineering, 5, 277-283. doi: 10.4236/eng.2013.510B058.

Conflicts of Interest

The authors declare no conflicts of interest.

