Using Non-Additive Measure for Optimization-Based Nonlinear Classification

Abstract

Over the past few decades, numerous optimization-based methods have been proposed for solving the classification problem in data mining. Classic optimization-based methods do not consider interactions among attributes in classification. A new learning machine is therefore needed to give a better understanding of the nature of classification when the interaction among the contributions of various attributes cannot be ignored. These interactions can be described by a non-additive measure, and the Choquet integral can serve as the mathematical tool for aggregating the attribute values with the corresponding values of the non-additive measure. As the main part of this research, a new nonlinear classification method with non-additive measures is proposed. Experimental results show that applying non-additive measures to the classic optimization-based models improves classification robustness and accuracy compared with some popular classification methods. In addition, motivated by the well-known Support Vector Machine approach, we transform the primal optimization-based nonlinear classification model with the signed non-additive measure into its dual form by applying Lagrangian optimization theory and Wolfe's dual programming theory. As a result, the 2^n - 1 parameters of the signed non-additive measure (n being the number of attributes) can be approximated with m Lagrangian multipliers (m being the number of records) by applying the necessary conditions for optimality of the primal classification problem. This method of parameter approximation is a breakthrough for solving a non-additive measure in practice when only a relatively small number of training cases is available (m << 2^n - 1). Furthermore, the kernel-based learning method enables the nonlinear classifiers to achieve better classification accuracy. The research produces practically deliverable nonlinear models with the non-additive measure for classification problems in data mining when interactions among attributes are considered.
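To make the aggregation step in the abstract concrete, the following is a minimal sketch of the discrete Choquet integral with respect to a set function, together with the 2^n - 1 parameter count. It is an illustration only, not the authors' implementation: the function names `choquet_integral` and `num_measure_parameters`, the dictionary representation of the measure, and the example values are all assumptions introduced here.

```python
def num_measure_parameters(n):
    """Number of values needed to define a set function on n attributes,
    one per non-empty subset (the empty set is excluded)."""
    return 2 ** n - 1


def choquet_integral(x, mu):
    """Discrete Choquet integral of attribute values x with respect to mu.

    x  : list of attribute values, one per attribute index 0..n-1
    mu : dict mapping frozenset of attribute indices -> measure value
         (hypothetical representation chosen for this sketch)
    """
    # Sort attribute indices by ascending attribute value.
    order = sorted(range(len(x)), key=lambda i: x[i])
    total, prev = 0.0, 0.0
    for k, i in enumerate(order):
        # Set of attributes whose value is at least x[i].
        upper_set = frozenset(order[k:])
        # Each increment in value is weighted by the measure of that set.
        total += (x[i] - prev) * mu[upper_set]
        prev = x[i]
    return total


# Two attributes whose joint contribution exceeds the sum of the parts.
mu_interacting = {frozenset({0}): 0.3, frozenset({1}): 0.4, frozenset({0, 1}): 1.0}
# An additive measure: the joint value is exactly the sum of the singletons.
mu_additive = {frozenset({0}): 0.3, frozenset({1}): 0.4, frozenset({0, 1}): 0.7}

x = [0.5, 0.2]
print(choquet_integral(x, mu_interacting))
print(choquet_integral(x, mu_additive))
print(num_measure_parameters(10))
```

With the additive measure the result is about 0.23, which equals the ordinary weighted sum 0.3 * 0.5 + 0.4 * 0.2; with the interacting measure it is about 0.29, so the interaction between the two attributes changes the aggregated value. For 10 attributes the measure already has 1023 parameters, which illustrates why an approximation with m multipliers is attractive when m is much smaller than 2^n - 1.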

Share and Cite:

N. Yan, Z. Chen, Y. Shi, Z. Wang and G. Huang, "Using Non-Additive Measure for Optimization-Based Nonlinear Classification," American Journal of Operations Research, Vol. 2 No. 3, 2012, pp. 364-373. doi: 10.4236/ajor.2012.23044.

Conflicts of Interest

The authors declare no conflicts of interest.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.