TITLE:
Application of Regularized Logistic Regression and Artificial Neural Network Model for Ozone Classification across El Paso County, Texas, United States
AUTHORS:
Callistus Obunadike, Adekunle Adefabi, Somtobe Olisah, David Abimbola, Kunle Oloyede
KEYWORDS:
Machine Learning, Ozone Prediction, Pollutants Forecasting, Atmospheric Monitoring, Air Quality, Logistic Regression, Artificial Neural Network
JOURNAL NAME:
Journal of Data Analysis and Information Processing, Vol.11 No.3, July 11, 2023
ABSTRACT: This paper focuses on ozone prediction in the atmosphere
using a machine learning approach. We utilize air pollutant and meteorological
variable datasets from the El Paso area to classify ozone levels as high or
low. Regularized logistic regression (LR) and artificial neural network (ANN)
models are trained on these datasets. The ANN model attains a classification
accuracy of 89.3% in predicting whether ozone is high or low on a given day,
compared with 88.4% for the LR model. Additionally, the
AUC values for both models are comparable, with the ANN achieving 95.4% and the
LR obtaining 95.2%. A lower cross-entropy loss (log loss) indicates that a
model's predictions agree more closely with the observed classes, and hence
better performance. Our ANN model yields a log loss of 3.74, while
the LR model shows a log loss of 6.03. The prediction time for the ANN model is
approximately 0.00 seconds, whereas the LR model takes 0.02 seconds. Our odds
ratio analysis indicates that features such as “Solar radiation”, “Std. Dev.
Wind Direction”, “outdoor temperature”, “dew point temperature”, and “PM10”
contribute to high ozone levels in El Paso, Texas. Based on accuracy, error
rate, log loss, and prediction time, the ANN model is both faster and better
suited to ozone classification in the El Paso, Texas area.
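
The two quantities the abstract relies on beyond accuracy, log loss and the odds ratio, can be illustrated with a minimal sketch. This is not the authors' code: the function names `binary_log_loss` and `odds_ratio`, and the labels and probabilities below, are hypothetical and chosen only to show how the quantities behave. The sketch assumes the standard definitions, namely mean binary cross-entropy for log loss and the exponential of a logistic regression coefficient for the odds ratio.

```python
import math


def binary_log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary cross-entropy between 0/1 labels and predicted P(y = 1)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def odds_ratio(coefficient):
    """Odds ratio for a one-unit increase in a feature, from its LR coefficient."""
    return math.exp(coefficient)


# Hypothetical data (not from the paper): 1 = high ozone, 0 = low ozone.
y_true = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]  # every case correct at a 0.5 threshold
hesitant = [0.6, 0.4, 0.6, 0.4]   # also every case correct, but less certain

loss_confident = binary_log_loss(y_true, confident)
loss_hesitant = binary_log_loss(y_true, hesitant)

print(f"log loss, confident predictions: {loss_confident:.3f}")
print(f"log loss, hesitant predictions:  {loss_hesitant:.3f}")
print(f"odds ratio for coefficient 0.5:  {odds_ratio(0.5):.3f}")
```

Both hypothetical prediction sets classify every case correctly, yet the hesitant one has the higher log loss, which is why log loss is reported alongside accuracy rather than in place of it. An odds ratio above 1 (a positive coefficient) means that an increase in the feature raises the odds of the high-ozone class, which is the sense in which the listed features are tied to high ozone levels.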