Using Boosted Regression Trees and Remotely Sensed Data to Drive Decision-Making - Open Journal of Statistics


Open Journal of Statistics

Volume 7, Issue 5 (October 2017)

ISSN Print: 2161-718X ISSN Online: 2161-7198



PP. 859-875

DOI: 10.4236/ojs.2017.75061

Author(s)

Brigitte Colin*, Samuel Clifford, Paul Wu, Samuel Rathmanner, Kerrie Mengersen

Affiliation(s)

School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia.

ABSTRACT

Challenges in Big Data analysis arise from the way the data are recorded, maintained, processed and stored. We demonstrate that a hierarchical, multivariate, statistical machine learning algorithm, namely Boosted Regression Trees (BRT), can address Big Data challenges to drive decision-making. The challenge in this study is a lack of interoperability: the data, a collection of GIS shapefiles, remotely sensed imagery, and aggregated and interpolated spatio-temporal information, are stored in monolithic hardware components. The modelling process required a single common input file. Merging the data sources produced a structured but noisy input file containing inconsistencies and redundancies. We show that BRT can process different data granularities, heterogeneous data and missingness. In particular, BRT handles missing data by default, by allowing a split on whether or not a value is missing as well as on what the value is. Most importantly, BRT offers a wide range of possibilities for interpreting results, and variable selection is performed automatically by considering how frequently a variable is used to define a split in the trees. A comparison with two similar regression models (Random Forests and the Least Absolute Shrinkage and Selection Operator, LASSO) shows that BRT outperforms both in this instance. BRT can also be a starting point for sophisticated hierarchical modelling in real-world scenarios. For example, a single-model or ensemble BRT approach could be tested alongside existing models in order to improve results for a wide range of data-driven decisions and applications.
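The two mechanisms named in the abstract, splitting on whether a value is missing and ranking variables by how often they define a split, can be illustrated with a small sketch. This is not the authors' implementation: it is a minimal pure-Python gradient boosting loop over one-split trees (stumps) with squared-error loss. The data are synthetic, and the helper names (`goes_left`, `fit_stump`, `boost`) and the settings (`n_trees`, `rate`) are assumptions made for illustration only.

```python
import math
import random

def goes_left(x, t, miss_left):
    # A missing value (None) follows the split's learned default direction.
    return miss_left if x is None else x <= t

def fit_stump(X, r):
    """Return the one-split tree (j, t, miss_left, left_mean, right_mean)
    that minimises squared error on the residuals r."""
    best = None
    for j in range(len(X[0])):
        observed = sorted({row[j] for row in X if row[j] is not None})
        # t = -inf sends every observed value right, so together with
        # miss_left=True it is a pure "is this value missing?" split.
        for t in [-math.inf] + observed[:-1]:
            for miss_left in (True, False):
                mask = [goes_left(row[j], t, miss_left) for row in X]
                left = [ri for ri, m in zip(r, mask) if m]
                right = [ri for ri, m in zip(r, mask) if not m]
                if not left or not right:
                    continue
                ml, mr = sum(left) / len(left), sum(right) / len(right)
                sse = (sum((v - ml) ** 2 for v in left)
                       + sum((v - mr) ** 2 for v in right))
                if best is None or sse < best[0]:
                    best = (sse, j, t, miss_left, ml, mr)
    return best[1:]

def boost(X, y, n_trees=25, rate=0.3):
    """Fit stumps sequentially to the residuals of the current prediction."""
    pred = [sum(y) / len(y)] * len(y)
    trees = []
    for _ in range(n_trees):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        tree = fit_stump(X, resid)
        j, t, miss_left, ml, mr = tree
        pred = [p + rate * (ml if goes_left(row[j], t, miss_left) else mr)
                for row, p in zip(X, pred)]
        trees.append(tree)
    return trees, pred

def mse(y, pred):
    return sum((a - b) ** 2 for a, b in zip(y, pred)) / len(y)

# Synthetic data (hypothetical): x0 is informative through its value,
# x1 is informative only through its missingness, x2 is pure noise.
random.seed(0)
n = 60
X, y = [], []
for i in range(n):
    x0, x2 = random.random(), random.random()
    x1 = None if i % 3 == 0 else random.random()
    X.append([x0, x1, x2])
    y.append(2 * x0 + (3.0 if x1 is None else 0.0))

trees, pred = boost(X, y)
mse_start = mse(y, [sum(y) / n] * n)
mse_final = mse(y, pred)

# Variable selection by split frequency: count how often each column is used.
counts = [0, 0, 0]
for j, *_ in trees:
    counts[j] += 1

print("first split (variable, threshold, missing goes left):", trees[0][:3])
print("split counts per variable:", counts)
print("training MSE:", mse_start, "->", mse_final)
```

Because missingness in `x1` explains most of the variance in this toy response, the first stump splits purely on whether `x1` is missing. The split counts give a crude version of the frequency-based variable ranking described in the abstract. Real BRT software typically uses deeper trees and weights each split by the improvement it yields.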

KEYWORDS

Boosted Regression Trees, Remotely Sensed Data, Big Data Modelling Approach, Missing Data

Share and Cite:

Colin, B., Clifford, S., Wu, P., Rathmanner, S. and Mengersen, K. (2017) Using Boosted Regression Trees and Remotely Sensed Data to Drive Decision-Making. Open Journal of Statistics, 7, 859-875. doi: 10.4236/ojs.2017.75061.

