TITLE:
A Novel Operational Partition between Neural Network Classifiers on Vulnerability to Data Mining Bias
AUTHORS:
Charles Wong
KEYWORDS:
Machine Learning, Neural Networks, Data Mining, Data Dredging, Non-Stationary Time Series Analysis, Permanent Data Learning, Reversible Data Learning
JOURNAL NAME:
Journal of Software Engineering and Applications, Vol.7 No.4, April 17, 2014
ABSTRACT:
It is difficult, if not impossible, to select appropriately and effectively from
the vast pool of existing neural network machine learning predictive models,
whether for industrial deployment or for academic research and enhancement.
When every model outperforms all the others under some disparate set of
circumstances, no model does so consistently. Selecting the ideal model then
becomes a matter of ill-supported opinion, ungrounded in the real-world
environment at hand. This paper proposes a novel partition of the model pool,
grounded in non-stationary real-world data, into two groups: Permanent Data
Learning and Reversible Data Learning. It further proposes a novel approach to
demonstrating, both qualitatively and quantitatively, the significant
differences between the two groups, based on how each handles dynamic, raw
real-world data versus static, prescient, data-mining-biased laboratory data.
The results across 2040 separate simulation runs using 15,600 data points in
realistically and operationally controlled data environments show that the
two-group division is effective and significant, with clear qualitative,
quantitative, and theoretical support. Results across the empirical and
theoretical spectrum are internally and externally consistent, yet demonstrate
why and how this result is non-obvious.