Article citations


Wickham, H., Caragea, D., & Cook, D. (2006). Exploring High-Dimensional Classification Boundaries. Proceedings of the 38th Symposium on the Interface of Statistics, Computing Science, and Applications—Interface 2006: Massive Data Sets and Streams, Pasadena, May 24-27 2006.

has been cited by the following article:

  • TITLE: Visualizing Random Forest’s Prediction Results

    AUTHORS: Hudson F. Golino, Cristiano Mauro Assis Gomes

    KEYWORDS: Machine Learning, Assessment, Prediction, Visualization, Networks, Cluster

    JOURNAL NAME: Psychology, Vol.5 No.19, December 5, 2014

    ABSTRACT: This paper proposes a new visualization tool that helps check the quality of random forest predictions by plotting the proximity matrix as a weighted network. The technique is compared with the traditional multidimensional scaling plot. The paper also introduces a new accuracy index (the proportion of misplaced cases) and compares it with total accuracy, sensitivity, and specificity. In addition, it applies clustering coefficients to weighted graphs to assess how well the random forest algorithm separates two classes. Two datasets were analyzed, one from medical research (breast cancer) and the other from psychology research (medical students’ academic achievement), with varying sample sizes and predictive accuracy. Using different numbers of observations and different prediction accuracies made it possible to compare how each visualization technique behaves in each situation. The results indicated that random forest’s predictive performance was easier and more intuitive to interpret using the weighted network of the proximity matrix than using the multidimensional scaling plot. The proportion of misplaced cases was highly related to total accuracy, sensitivity, and specificity. This strategy, together with the computation of Zhang and Horvath’s (2005) clustering coefficient for weighted graphs, can be very helpful in understanding how well a random forest prediction performs in terms of classification.
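
    The idea of treating a proximity matrix as a weighted network can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the usual definition of random forest proximity (the fraction of trees in which two cases fall in the same terminal node), and the function names, the `leaf_ids` input layout, and the toy data are all hypothetical. The abstract does not define the proportion of misplaced cases, so only the standard indices it is compared with (total accuracy, sensitivity, specificity) are shown.

```python
from itertools import combinations


def proximity_matrix(leaf_ids):
    """Build a proximity matrix from terminal-node assignments.

    leaf_ids[i][t] is the terminal node that tree t assigns to case i
    (hypothetical input layout). Proximity of two cases is the fraction
    of trees in which they share a terminal node.
    """
    n = len(leaf_ids)
    n_trees = len(leaf_ids[0])
    prox = [[0.0] * n for _ in range(n)]
    for i in range(n):
        prox[i][i] = 1.0  # a case always shares a node with itself
    for i, j in combinations(range(n), 2):
        shared = sum(a == b for a, b in zip(leaf_ids[i], leaf_ids[j]))
        prox[i][j] = prox[j][i] = shared / n_trees
    return prox


def weighted_edges(prox, threshold=0.0):
    """Return (i, j, weight) edges whose proximity exceeds the threshold."""
    n = len(prox)
    return [
        (i, j, prox[i][j])
        for i, j in combinations(range(n), 2)
        if prox[i][j] > threshold
    ]


def classification_summary(y_true, y_pred, positive=1):
    """Total accuracy, sensitivity, and specificity for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy data: 4 cases, 4 trees (made up for illustration only).
leaf_ids = [
    [1, 1, 2, 3],
    [1, 1, 2, 4],
    [2, 3, 5, 6],
    [2, 3, 5, 6],
]
prox = proximity_matrix(leaf_ids)
edges = weighted_edges(prox)
summary = classification_summary([1, 1, 0, 0], [1, 0, 0, 0])
```

    The edge list can be passed to any graph library to draw the network, with edge weight mapped to line thickness or to the attraction between nodes in a force-directed layout, so that well-separated classes appear as distinct clusters.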