Hybrid PINN-LSTM Model for River Temperature Prediction: A Physics-Informed Deep Learning Approach

Abstract

This study proposes a hybrid modeling approach that integrates a Physics-Informed Neural Network (PINN) and a Long Short-Term Memory (LSTM) network to predict river water temperature in a defined section of the Catu River. The PINN is formulated on the basis of the advection-dispersion-reaction equation to incorporate physical constraints into the learning process, while the LSTM captures temporal patterns from meteorological inputs. The model was trained using normalized historical data and evaluated with standard quantitative metrics and visual comparisons. To examine the influence of each input feature, sensitivity and ablation analyses were performed. The results indicate that the model learns the relevant dependencies, with humidity and dew point exerting the greatest influence on the predictions. In contrast, precipitation showed negligible impact on model performance, consistent with the low and seasonally concentrated rainfall of the region. These results suggest that the model captures both the temporal and physical patterns associated with the thermal dynamics of the river. The hybrid model reproduced observed seasonal variations and extreme temperature peaks with consistency, demonstrating predictive robustness under variable environmental conditions. The combination of data-driven learning and physically constrained modeling offers a viable solution for temperature forecasting in river systems, particularly in regions with limited observational data.

Share and Cite:

Figueredo, M. , Ferreira, M. , Monteiro, R. , Silva, A. , Murari, T. and Neri, T. (2025) Hybrid PINN-LSTM Model for River Temperature Prediction: A Physics-Informed Deep Learning Approach. Journal of Computer and Communications, 13, 115-134. doi: 10.4236/jcc.2025.136008.

1. Introduction

The temperature of the water is a key factor in the dynamics of aquatic ecosystems, directly influencing the physical, chemical, and biological processes of rivers [1]-[4]. This variable affects gas solubility, dissolved oxygen availability, metabolism of aquatic organisms, and even the evaporation rate of the water surface [5]-[7]. Changes in water temperature can alter the composition of aquatic communities, modify biogeochemical cycles, and impact food chains, making its analysis essential for the conservation and sustainable management of water resources [8] [9]. Furthermore, water temperature plays a crucial role in the thermal regime of rivers, reflecting climatic variations and environmental changes caused by anthropogenic activities, such as dam construction and rapid urbanization [10]. In this context, continuous and accurate monitoring of water temperature not only provides relevant data for environmental quality assessment [11], but also supports the development of efficient strategies for managing water bodies in the face of increasing human impact and global climate change.

Given the complexity of the interactions that determine river thermal variations, mathematical modeling is an essential tool for predicting future scenarios and understanding the factors that regulate water temperature. Computational models allow the simulation of different environmental conditions by incorporating variables such as air temperature, solar radiation, thermal pollution, and river flow [12] [13]. In recent years, artificial intelligence and neural network-based approaches have demonstrated great potential in capturing nonlinear patterns in these dynamic systems, allowing more robust predictions that adapt to various hydroclimatic conditions [14] [15]. The integration of physics-informed models, such as Physics-Informed Neural Networks (PINNs) [16]-[18], with recurrent neural networks such as Long Short-Term Memory (LSTM) [19]-[22] is a promising approach for forecasting water temperature. This method combines physical principles with deep learning to improve prediction accuracy, providing more precise information on the thermal response of rivers to environmental changes [23].

Several methodologies have been explored to improve river water temperature forecasting, taking advantage of well-established theories and modern computational approaches [24]-[26]. Although physics-based models provide a structured framework for representing thermal dynamics, they often require extensive parameterization and struggle with data limitations in complex environments. In contrast, statistical and machine learning models have gained prominence for their ability to extract patterns from observational data, enabling efficient and adaptable predictions under diverse hydroclimatic conditions [27] [28].

Despite these advances, there are challenges to accurately model temperature variations in regions with intricate environmental dynamics, such as urban areas with heterogeneous land cover and localized heat fluxes. Many empirical models, though effective within known data distributions, lack robustness when extrapolating to new conditions [29] [30]. Furthermore, reliance on purely data-driven approaches can obscure the underlying physical mechanisms that govern river temperature fluctuations [31] [32]. Addressing these gaps requires an integrative strategy that balances data-driven insights with physically consistent representations, ensuring both predictive reliability and interpretability in dynamic aquatic systems.

The Catu River region has a tropical climate, with average annual temperatures ranging between 24˚C and 26˚C. During summer, maximum temperatures can reach 33˚C, while in winter, minimum temperatures rarely drop below 18˚C [33]. This seasonal variation significantly influences the thermal balance of the river, leading to changes that occur both on a local scale and in response to global climatic trends. In addition to directly affecting aquatic ecosystems, these thermal fluctuations highlight the need for continuous and precise monitoring, which is essential for sustainable management of the watershed and conservation of water resources.

In this study, we propose a hybrid approach that integrates Physics-Informed Neural Networks (PINNs) and a Long Short-Term Memory (LSTM) network to predict the temperature of the Catu River in a specific section. This approach is motivated by the need to incorporate physical knowledge directly into the LSTM model, which on its own does not explicitly represent the hydrological and climatic variability of the region.

Based on the governing differential equations, we developed a new model (PINN-LSTM) with limited computational complexity, comparable to that of statistical models. The final structure combines an Advection-Diffusion-Reaction Equation dependent on variables such as precipitation, air temperature, humidity, evaporation, and wind speed with the LSTM’s capability to learn temporal patterns from historical river temperature data. This combination allows the model to learn the river’s thermal evolution while incorporating both underlying physical principles and patterns extracted from observed data.

Thus, the PINN-LSTM can be seen as a hybrid tool that preserves physical foundations while leveraging machine learning to enhance predictive capacity. However, our study focuses exclusively on this section of the river, without considering external influences such as tributaries, due to the unavailability of comprehensive data on these variables.

Additionally, the network was trained and validated to identify temporal patterns in river temperature variations, analyzing the delays between precipitation events and thermal changes over time. This enables a detailed evaluation of the Catu River's thermal behavior, accounting for both seasonal impacts and extreme hydrological events.

This publication presents an investigation into a model, input data, and training characteristics for the daily prediction of water temperature in a section of the Catu River. It consists of the application of two types of machine learning models to input datasets.

The main contributions of this work can be described as follows: 1) the development of a hybrid approach integrating Physics-Informed Neural Networks (PINNs) and Long Short-Term Memory (LSTM) to improve temperature prediction by combining physical knowledge with deep learning techniques; 2) the incorporation of an advection-diffusion-reaction equation that accounts for the influence of precipitation, air temperature, humidity, evaporation, and wind speed, allowing for a physically consistent representation of river thermal dynamics; 3) the evaluation of the model's performance with a limited dataset, analyzing its robustness in handling noise and its capacity to capture nonlinearities present in the data; 4) an investigation of the role of seasonal variations and extreme meteorological events in river temperature fluctuations, assessing the predictive capability of the PINN-LSTM framework under different hydroclimatic conditions; and 5) a demonstration that traditional machine learning models struggle when applied to large environmental datasets, reinforcing the need for hybrid approaches that integrate physical constraints to enhance prediction reliability.

2. Method

2.1. Study Area and Data

In this study, we analyzed a 1 km stretch of the Catu River, located in a watershed covering 208.5 km² and encompassing the municipalities of Alagoinhas, Catu, and Pojuca (Figure 1). Measurements were conducted between the coordinates 12˚10'05.1"S, 38˚24'23.8"W and 12˚11'04.6"S, 38˚24'11.9"W, with altitudes ranging from 40 to 255 meters. The average slope of the basin is 30%.

Figure 1. Catu River section.

2.2. Dataset

Meteorological data were obtained from the National Institute of Meteorology (INMET), using measurements from the station located in Alagoinhas, BA, with latitude −12˚14'86" and longitude −38˚50'57". The station is situated at an altitude of 47.56 meters and provides records from January 1, 2009, to January 1, 2017, with a sampling resolution of 10 minutes.

The meteorological variables included in the study are air temperature (Air Temp), Wind Speed, relative humidity (Humidity), evaporation and precipitation. Table 1 summarizes the training data used for predictive model calibration.

Table 1. Summary of meteorological data used in the study.

Year   Air Temp   Wind Speed   Humidity   Evaporation   Precipitation
2009   31.05      2.10         77.25      3.80          2.66
2010   30.52      2.18         77.05      3.39          3.33
2011   29.85      2.16         74.92      3.32          3.26
2012   30.54      2.17         74.32      4.48          1.82
2013   30.56      2.20         77.38      3.84          3.06
2014   29.98      2.04         77.74      3.65          2.74
2015   31.08      1.28         73.76      4.15          2.82
2016   31.27      2.03         75.85      6.31          2.32

To validate the model, a temperature measurement campaign of the Catu River was conducted between January and December 2023. Data were collected using DS18B20 sensors coupled to an Arduino microcontroller, positioned 5 cm from the water surface. Measurements followed the same sampling frequency as that adopted by the National Institute of Meteorology (INMET, Brazil), ensuring compatibility with the available meteorological data.

The collected data were compared with the predictions generated by the trained PINN-LSTM networks, allowing for an assessment of the model’s accuracy in the real context of the watershed. The data used and historical records are available at https://www.kaggle.com/datasets/corcova/caturiverbahia/data.

Although this study focuses on the middle section of the Catu River and uses data limited to the period 2016-2017, this choice was made based on the availability and reliability of observed river temperature records. Future work will expand the dataset to include other stretches and longer time series to improve model generalization.

2.3. Model Framework

As shown in Figure 2, the PINN-LSTM forecasting model follows a structured pipeline:

Figure 2. Steps for building the PINN-LSTM predicting model.

First, meteorological and hydrological data, including air temperature, wind speed, humidity, evaporation, and precipitation, are preprocessed for consistency, normalization, and handling missing values.
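The preprocessing step can be illustrated with a minimal sketch. The source does not state which imputation or normalization scheme was used, so linear interpolation of gaps and min-max scaling are assumptions here, and the function names and sample values are hypothetical.

```python
def fill_missing(values):
    """Linearly interpolate None entries; edge gaps copy the nearest valid value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no valid observations")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = values[left] + w * (values[right] - values[left])
    return filled


def min_max_scale(values):
    """Scale a series to [0, 1]; also return (min, max) so the scaling can be inverted."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values], (lo, hi)
    return [(v - lo) / (hi - lo) for v in values], (lo, hi)


# Hypothetical air temperature readings with two gaps (None)
air_temp = [30.0, None, 32.0, 31.0, None]
filled = fill_missing(air_temp)
scaled, (t_min, t_max) = min_max_scale(filled)
```

In practice the scaling bounds would be computed on the training period only and reused for the test period, so that no information from the test data leaks into training.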

Next, the Physics-Informed Neural Network (PINN) is trained to learn the governing thermal dynamics of the river, providing physically consistent estimates $y_{\text{pinn\_pred}}$. This output is incorporated into the final hybrid model training.

The Long Short-Term Memory (LSTM) network captures temporal dependencies and nonlinear patterns, integrating both observational data and PINN constraints. The final model is evaluated using RMSE, NSE, and $R^2$, ensuring accuracy across different climatic and hydrological conditions.

Once validated, the model is deployed for inference, enabling real-time river temperature predictions. Figure 2 summarizes the complete PINN-LSTM model pipeline.

2.4. Governing Equation and PINN Formulation

The Physics-Informed Neural Network (PINN) used in this study is built upon the governing advection-dispersion-reaction equation that models river water temperature dynamics. The equation integrates advective heat transport, longitudinal thermal dispersion, and source/sink terms representing atmospheric and hydrological exchanges:

$$\frac{\partial T_w}{\partial t} = -U \frac{\partial T_w}{\partial x} + D_L \frac{\partial^2 T_w}{\partial x^2} + R_h + R_i, \tag{1}$$

where:

  • $T_w(x,t)$ is the water temperature at position $x$ and time $t$;

  • $U$ is the average river velocity (m/s);

  • $D_L$ is the longitudinal dispersion coefficient (m²/s);

  • $R_h$ represents the heat exchange with the atmosphere (dependent on air temperature, wind, and humidity);

  • $R_i$ represents the heat input from precipitation and losses due to evaporation.

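The PINN architecture approximates the solution $T_w(x,t)$ and minimizes the residual of the governing PDE, which is defined as:

$$\mathcal{L}_{\text{PDE}} = \frac{\partial T_w}{\partial t} + U \frac{\partial T_w}{\partial x} - D_L \frac{\partial^2 T_w}{\partial x^2} - R_h - R_i. \tag{2}$$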

This residual is evaluated at a set of collocation points using automatic differentiation. The neural network is trained to minimize this residual along with prediction errors based on available data.
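The residual evaluation at collocation points can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: central finite differences stand in for automatic differentiation, an analytic solution of Equation (1) with zero source terms stands in for the network output, and the values of `U` and `D_L` are arbitrary.

```python
import math

U = 0.5    # assumed mean velocity (m/s), illustrative value only
D_L = 0.1  # assumed longitudinal dispersion coefficient (m^2/s), illustrative


def temperature(x, t, k=1.0):
    """Analytic solution of Eq. (1) with R_h = R_i = 0 (stand-in for the network)."""
    return math.exp(-D_L * k**2 * t) * math.sin(k * (x - U * t))


def pde_residual(T, x, t, r_h=0.0, r_i=0.0, h=1e-3):
    """Residual of Eq. (2); derivatives approximated by central finite differences."""
    dT_dt = (T(x, t + h) - T(x, t - h)) / (2 * h)
    dT_dx = (T(x + h, t) - T(x - h, t)) / (2 * h)
    d2T_dx2 = (T(x + h, t) - 2 * T(x, t) + T(x - h, t)) / h**2
    return dT_dt + U * dT_dx - D_L * d2T_dx2 - r_h - r_i


# Hypothetical grid of collocation points (x, t)
collocation = [(0.1 * i, 0.2 * j) for i in range(5) for j in range(5)]
residuals = [pde_residual(temperature, x, t) for x, t in collocation]
loss_pde = sum(r**2 for r in residuals) / len(residuals)
```

Because the test field satisfies the equation exactly, `loss_pde` is close to zero up to discretization error; a field that violates the equation, such as $T = x^2$, yields a clearly nonzero residual.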

Boundary and Initial Conditions:

  • Initial condition: The temperature distribution at time $t=0$ is defined by observed data: $T_w(x,0) = T_w^0(x)$.

  • Dirichlet boundary condition: Specifies the temperature at the river boundary, e.g. $T_w(x_0,t) = T_0$.

  • Neumann boundary condition: Specifies the heat flux at the boundary:

$$\left.\frac{\partial T_w}{\partial n}\right|_{\partial\Omega} = q_0(x,t), \tag{3}$$

where $\partial\Omega$ is the boundary of the domain and $q_0$ the known flux.

Loss Function:

The complete loss function for the hybrid PINN-LSTM model combines the physical constraint, prediction accuracy, and LSTM forecast loss:

$$\mathcal{L}_{\text{total}} = \lambda_1 \mathcal{L}_{\text{PDE}} + \lambda_2 \mathcal{L}_{\text{LSTM}} + \lambda_3 \mathcal{L}_{\text{data}}, \tag{4}$$

where $\lambda_i$ are weighting coefficients, $\mathcal{L}_{\text{PDE}}$ corresponds to the residual of the PDE, $\mathcal{L}_{\text{LSTM}}$ represents the sequential prediction error from the LSTM, and $\mathcal{L}_{\text{data}}$ is the supervised loss from observed data. This formulation ensures that the predictions are consistent with both the observed data and the physical dynamics of the river system.
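A minimal sketch of the weighted sum in Equation (4) follows. The source does not specify the form of each term or the values of the weights, so the mean squared error for every term, the function names, and the sample numbers are all assumptions.

```python
def mse(pred, target):
    """Mean squared error between two equally long sequences."""
    return sum((p - y) ** 2 for p, y in zip(pred, target)) / len(target)


def total_loss(pde_residuals, lstm_pred, pinn_pred, observed, lambdas=(1.0, 1.0, 1.0)):
    """Weighted sum of Eq. (4); each term is taken as a mean squared quantity (assumption)."""
    lam1, lam2, lam3 = lambdas
    loss_pde = sum(r**2 for r in pde_residuals) / len(pde_residuals)
    loss_lstm = mse(lstm_pred, observed)   # sequential prediction error
    loss_data = mse(pinn_pred, observed)   # supervised mismatch of the PINN output
    return lam1 * loss_pde + lam2 * loss_lstm + lam3 * loss_data


# Hypothetical values for two time steps
loss = total_loss(
    pde_residuals=[0.1, -0.1],
    lstm_pred=[20.5, 21.0],
    pinn_pred=[20.0, 22.0],
    observed=[20.0, 21.0],
    lambdas=(1.0, 1.0, 0.5),
)
```

With these numbers the three terms are 0.01, 0.125, and 0.5, so the weighted total is 0.385. Raising $\lambda_1$ pushes the optimizer toward physical consistency at the cost of a looser fit to the observations.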

2.5. PINN Architecture and Training Process

The Physics-Informed Neural Network (PINN) employed in this study was designed to estimate the thermal dynamics of the Catu River by embedding the underlying advection-dispersion-reaction equation into its architecture. The network consists of:

  • An input layer with five variables: precipitation ($P_t$), river temperature ($T_r$), air temperature ($T_a$), time ($t$), and evaporation ($E$).

  • Three hidden layers, each with six neurons, using the swish activation function, batch normalization, and dropout regularization ($p=0.1$).

  • One output neuron corresponding to the predicted river temperature ($T_w$).
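The layer sizes listed above can be sketched as a plain forward pass. This is an inference-time illustration only: batch normalization and dropout are omitted, the weights are random rather than trained, and the helper names and input values are hypothetical.

```python
import math
import random


def swish(z):
    """Swish activation: z * sigmoid(z)."""
    return z / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one dense layer (untrained placeholder)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(layer, x, activation=None):
    """Apply one dense layer to input vector x."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


rng = random.Random(0)
sizes = [5, 6, 6, 6, 1]  # five inputs, three hidden layers of six neurons, one output
layers = [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]


def forward(x):
    """Swish on the hidden layers, linear output for the temperature estimate."""
    for layer in layers[:-1]:
        x = dense(layer, x, swish)
    return dense(layers[-1], x)[0]


n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)
inputs = [0.2, 0.5, 0.6, 0.1, 0.3]  # hypothetical normalized P_t, T_r, T_a, t, E
t_w = forward(inputs)
```

Excluding batch normalization, this configuration has 127 trainable parameters, which illustrates how compact the PINN component is.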

Figure 3 shows the architecture of the network, including a symbolic representation of the differential constraints.

The training process uses automatic differentiation to compute the residuals of the governing PDE:

$$\mathcal{L}_{\text{PDE}} = \frac{\partial T_w}{\partial t} + U \frac{\partial T_w}{\partial x} - D_L \frac{\partial^2 T_w}{\partial x^2} - R_h - R_i,$$

which are incorporated into the loss function. The model is optimized with AdamW, a stochastic gradient-based optimizer with decoupled weight decay, minimizing a composite loss consisting of the PDE residual, the temporal prediction error from the LSTM, and the observational data mismatch. This design promotes both physical consistency and predictive capability.

Figure 3. Architecture of the Physics-Informed Neural Network (PINN) used for estimating river water temperature.

2.6. LSTM Architecture

The proposed hybrid model integrates a Physics-Informed Neural Network (PINN) with a Long Short-Term Memory (LSTM) network. The physically informed prediction $y_{\text{pinn\_pred}}$, generated by the PINN module, is used as input to the LSTM. This integration allows the model to extract temporal features and nonlinear dependencies from time-series data that already respect physical constraints.

The LSTM architecture consists of three bidirectional layers (two with 64 neurons and one with 32 neurons), followed by a dense output module with layers of 64, 32, and 1 neuron, respectively. A temporal window of seven days was adopted (time_steps = 7), enabling weekly pattern analysis and sensitivity to abrupt thermal variations. Several topologies and hyperparameters were evaluated, and only the best configuration is presented in this work.

The internal structure of the LSTM cell is defined by the following set of equations:

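$$\begin{aligned}
i_t &= \sigma(U_i h_{t-1} + W_i x_t + b_i) \\
f_t &= \sigma(U_f h_{t-1} + W_f x_t + b_f) \\
\tilde{c}_t &= \tanh(U_c h_{t-1} + W_c x_t + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
o_t &= \sigma(U_o h_{t-1} + W_o x_t + b_o) \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned} \tag{5}$$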

where:

  • $x_t$ is the input vector at time $t$ (including $y_{\text{pinn\_pred}}$);

  • $i_t$ is the input gate that controls memory update;

  • $f_t$ is the forget gate that filters previous cell information;

  • $o_t$ is the output gate;

  • $c_t$ is the cell state;

  • $h_t$ is the hidden state;

  • $\odot$ denotes element-wise multiplication;

  • $\sigma$ is the sigmoid function, and $\tanh$ is the hyperbolic tangent.
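One step of the cell update in Equation (5) can be written out directly. The sketch below uses plain Python lists for clarity; the parameter dictionary layout, the helper names, and the all-zero weights in the example are illustrative assumptions, not the trained model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(U, h_prev, W, x, b):
    """Compute U h_{t-1} + W x_t + b for list-based matrices and vectors."""
    return [
        sum(u * h for u, h in zip(U_row, h_prev))
        + sum(w * xi for w, xi in zip(W_row, x))
        + b_k
        for U_row, W_row, b_k in zip(U, W, b)
    ]


def lstm_step(x, h_prev, c_prev, p):
    """Single LSTM cell update following Eq. (5); returns (h_t, c_t)."""
    def gate(name, fn):
        z = affine(p["U_" + name], h_prev, p["W_" + name], x, p["b_" + name])
        return [fn(v) for v in z]

    i = gate("i", sigmoid)          # input gate
    f = gate("f", sigmoid)          # forget gate
    c_tilde = gate("c", math.tanh)  # candidate cell state
    o = gate("o", sigmoid)          # output gate
    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, c_tilde)]
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c


# Hypothetical example: hidden size 2, one input feature, all-zero parameters
hidden, n_in = 2, 1
params = {}
for name in ("i", "f", "c", "o"):
    params["U_" + name] = [[0.0] * hidden for _ in range(hidden)]
    params["W_" + name] = [[0.0] * n_in for _ in range(hidden)]
    params["b_" + name] = [0.0] * hidden

h, c = lstm_step([0.8], [0.0, 0.0], [1.0, -2.0], params)
```

With zero parameters every gate evaluates to 0.5 and the candidate state to 0, so the cell state is simply halved. This makes the role of the forget gate easy to verify by hand.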

Figure 4 illustrates the overall LSTM pipeline, showing the PINN prediction being processed through sequential LSTM and dense layers. The internal cell computation matches the mathematical formulation above.

Figure 4. LSTM architecture used in the model. The output of the PINN ($y_{\text{pinn\_pred}}$) is used as input to the LSTM.

2.7. Integration of PINN and LSTM for Temperature Prediction

The Physics-Informed Neural Network (PINN) employed in this study approximates the solution to the advection-dispersion-reaction equation, Equation (1).

During training, the PINN learns a continuous temperature function $\hat{T}_w(x,t)$ satisfying:

  • the differential equation across the domain (minimizing residual error);

  • boundary and initial conditions;

  • and observational data (where available).

The trained PINN generates a continuous function:

$$\hat{T}_w(x,t) = \text{NeuralNetwork}_\theta(x,t) \tag{6}$$

where $\theta$ represents the network parameters. In practice, this continuous function is sampled at discrete points corresponding to observational and meteorological data intervals, resulting in a discrete time series:

$$\hat{T}_{\text{PINN}}(t_1), \hat{T}_{\text{PINN}}(t_2), \ldots, \hat{T}_{\text{PINN}}(t_n) \tag{7}$$

This discretized series serves as input for the LSTM model. Specifically, the LSTM takes sequences of the simulated temperature data (e.g. a time window of 7 days):

$$\left[\hat{T}_{\text{PINN}}(t_i), \hat{T}_{\text{PINN}}(t_{i+1}), \ldots, \hat{T}_{\text{PINN}}(t_{i+6})\right] \rightarrow T_w(t_{i+7}) \tag{8}$$
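The windowing in Equation (8) can be sketched as below. The function name and the synthetic series are hypothetical; only the window length of seven steps comes from the text.

```python
def make_windows(series, targets, time_steps=7):
    """Pair each run of `time_steps` PINN estimates with the next observed value."""
    windows, labels = [], []
    for i in range(len(series) - time_steps):
        windows.append(series[i:i + time_steps])  # T_PINN(t_i) ... T_PINN(t_{i+6})
        labels.append(targets[i + time_steps])    # T_w(t_{i+7})
    return windows, labels


# Hypothetical ten-day series of PINN estimates and observed temperatures
pinn_series = [20.0 + 0.1 * d for d in range(10)]
observed = [20.5 + 0.1 * d for d in range(10)]
X, y = make_windows(pinn_series, observed)
```

A series of length $n$ yields $n - 7$ training pairs, so the ten-day example produces three windows.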

This integration is conceptually justified because the PINN provides physically-informed, realistic temperature estimations based on governing physical laws, while the LSTM model refines predictions by capturing temporal patterns observed historically.

The boundary conditions implicitly applied within the PINN framework are Dirichlet conditions at $x=0$:

$$T_w(0,t) = T_r(t) \tag{9}$$

where $T_r(t)$ represents observed temperature data at the initial spatial point. This condition is enforced by minimizing the discrepancy between predicted and observed temperatures at $x=0$.

Thus, the PINN-LSTM hybrid approach effectively combines theoretical physical modeling (PINN) and empirical temporal pattern recognition (LSTM), leveraging strengths of both methods to enhance prediction accuracy for river temperature forecasting.

2.8. Performance Metrics

To evaluate the performance of the PINN-LSTM hybrid model in predicting the thermal behavior of the Catu River, three commonly used metrics were adopted: the coefficient of determination ($R^2$), the Mean Absolute Error (MAE), and the Root Mean Squared Error (RMSE). These metrics quantify how well the model predictions $\tilde{u}_i$ approximate the observed values $u_i$ across all $n$ data points.

The formulas are defined as follows:

$$R^2 = \frac{\sum_{i=1}^{n} (\tilde{u}_i - \bar{u})^2}{\sum_{i=1}^{n} (u_i - \bar{u})^2} \tag{10}$$

$$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| u_i - \tilde{u}_i \right| \tag{11}$$

$$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (u_i - \tilde{u}_i)^2} \tag{12}$$

where:

  • $u_i$ are the observed river temperature values;

  • $\tilde{u}_i$ are the corresponding predicted values from the model;

  • $\bar{u}$ is the mean of the observed values;

  • $n$ is the total number of observations.

The $R^2$ metric assesses the proportion of variance in the observed data explained by the model, with values closer to 1 indicating better fit. MAE represents the average magnitude of the errors without considering their direction, while RMSE penalizes larger errors more heavily due to squaring, making it particularly sensitive to outliers.
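The three metrics can be computed with a few lines of code. The sketch follows Equations (10) to (12) as printed; note that Equation (10) is the ratio of explained to total sum of squares, which in general differs from the more common form $1 - \mathrm{SS}_{\text{res}}/\mathrm{SS}_{\text{tot}}$. The function names and sample values are hypothetical.

```python
import math


def r2_eq10(obs, pred):
    """Coefficient of determination as written in Eq. (10): explained / total sum of squares."""
    mean_obs = sum(obs) / len(obs)
    explained = sum((p - mean_obs) ** 2 for p in pred)
    total = sum((o - mean_obs) ** 2 for o in obs)
    return explained / total


def mae(obs, pred):
    """Mean absolute error, Eq. (11)."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean squared error, Eq. (12)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


# Hypothetical observed and predicted temperatures
observed = [20.0, 21.0, 22.0, 23.0]
predicted = [20.5, 21.0, 21.5, 23.0]
scores = (r2_eq10(observed, predicted), mae(observed, predicted), rmse(observed, predicted))
```

For this example the explained and total sums of squares are 3.5 and 5.0, giving $R^2 = 0.7$, with MAE = 0.25 and RMSE ≈ 0.354.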

3. Main Results

The dataset was chronologically split into two subsets: approximately 80% for training and 20% for testing. Specifically, the first 80% (2009-2015) of the data was used for model training, while the remaining 20% (2016-2017) was reserved exclusively for model validation and testing.

The model was trained using the combined Physics-Informed Neural Network (PINN) and Long Short-Term Memory (LSTM) approach for a maximum of 500 epochs, with a batch size of 64. To prevent overfitting and enhance generalization capability, early stopping was implemented: training was terminated automatically if the validation loss did not improve for 40 consecutive epochs.
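The chronological split and the early-stopping rule can be sketched as follows. The 80/20 split, the 500-epoch limit, and the patience of 40 epochs come from the text; the class, the function names, and the synthetic validation-loss curve are hypothetical.

```python
def chronological_split(samples, train_fraction=0.8):
    """Split time-ordered samples without shuffling, so the test set follows training in time."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


class EarlyStopping:
    """Signal a stop once the validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=40):
        self.patience = patience
        self.best = float("inf")
        self.wait = 0

    def should_stop(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
        return self.wait >= self.patience


train, test = chronological_split(list(range(100)))

# Synthetic validation curve: improves for 10 epochs, then plateaus at a worse value
stopper = EarlyStopping(patience=40)
val_losses = [1.0 / (epoch + 1) if epoch < 10 else 0.2 for epoch in range(500)]
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.should_stop(loss):
        stopped_at = epoch
        break
```

In this synthetic run the best loss occurs at epoch 9, so training halts at epoch 49 after 40 epochs without improvement, well before the 500-epoch limit.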

Table 2. Results of quantitative metrics for model evaluation.

Model              MAE      MSE      RMSE     R²
LSTM + PINN        0.1154   0.0208   0.1443   0.5394
CNN + PINN         0.1349   0.0312   0.1767   0.3098
LSTM               0.1664   0.0502   0.2240   0.1084
Other topologies   0.1495   0.0392   0.1980   0.1334

The comparative analysis of the evaluated models reveals clear differences in predictive capability, as indicated by the quantitative metrics in Table 2. The hybrid architecture, which integrates Long Short-Term Memory (LSTM) networks with Physics-Informed Neural Networks (PINNs), achieved the lowest Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE), together with the highest coefficient of determination ($R^2$). These outcomes suggest that incorporating physics-based constraints and domain-specific knowledge enhances the robustness of predictions and improves model generalization and reliability.
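A short arithmetic check on Table 2 is sketched below: it confirms that each reported RMSE matches the square root of the reported MSE to within rounding, and derives the relative RMSE reduction of the hybrid model over the plain LSTM. The numbers are copied from Table 2; the variable names and the tolerance are assumptions.

```python
import math

# (MSE, RMSE) pairs copied from Table 2
table2 = {
    "LSTM + PINN": (0.0208, 0.1443),
    "CNN + PINN": (0.0312, 0.1767),
    "LSTM": (0.0502, 0.2240),
    "Other topologies": (0.0392, 0.1980),
}

# Largest gap between sqrt(MSE) and the reported RMSE across all rows
max_gap = max(abs(math.sqrt(mse) - rmse) for mse, rmse in table2.values())

# Relative RMSE reduction of the hybrid model with respect to the plain LSTM
rmse_hybrid = table2["LSTM + PINN"][1]
rmse_lstm = table2["LSTM"][1]
relative_reduction = 1.0 - rmse_hybrid / rmse_lstm
```

The reported values are internally consistent, and the hybrid model's RMSE is roughly 36% lower than that of the LSTM alone.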

Conversely, models relying exclusively on LSTM architectures exhibited notably inferior performance, particularly when assessed through metrics sensitive to squared errors (MSE and RMSE). This reduced performance indicates heightened vulnerability to data variability and potential susceptibility to anomalous observations. Such results underscore the critical role of embedding physical regularization techniques within neural network models, as these constraints significantly mitigate undesirable model behaviors by leveraging prior knowledge intrinsic to the physical system under study.

Furthermore, the alternative evaluated topologies, including Convolutional Neural Networks (CNNs) combined with PINNs, presented intermediate predictive capabilities. While these models surpassed purely traditional approaches, they still fell short compared to the hybrid LSTM + PINN architecture, underscoring that while innovative, their performance remains limited by factors potentially related to their architectural complexity or inadequate integration of domain-specific physical insights.

Despite the hybrid model's superiority, the moderate values obtained for the coefficient of determination ($R^2$) across all models, including the best-performing one, reflect the inherent complexity and multifaceted nature of the problem investigated. This suggests the presence of underlying phenomena or intrinsic noise in the data not fully captured by the current modeling frameworks. Such findings align with prevalent observations in the specialized literature, reinforcing the hybrid model's efficacy as an advanced methodological approach for enhancing predictive quality in systems characterized by substantial complexity and inherent uncertainties.

The analysis of predicted versus actual temperature values throughout the testing period reveals distinct differences in predictive behavior. The hybrid LSTM + PINN model (Figure 5) aligns closely with the observed temperature trends, following both gradual and abrupt fluctuations in the data. This model captures the peaks, declines, and rapid changes, demonstrating robust adaptability to temporal variations.

Figure 5. The LSTM + PINN model accurately capturing temperature fluctuations and peak events.

In contrast, the CNN + LSTM (Figure 6) model exhibits noticeable challenges in tracking temperature variations accurately. Predictions from this model frequently deviate from actual measurements, particularly during periods of sudden or extreme temperature shifts. This results in delayed responses and underestimation of peak temperatures, indicating limited responsiveness and decreased accuracy in capturing dynamic and volatile temperature behaviors.

Figure 6. The CNN + LSTM model exhibiting delayed responses and inaccuracies during rapid temperature changes.

Overall, the hybrid LSTM + PINN model provides a more reliable representation of temperature trends over the entire testing period compared to the CNN + LSTM model.


3.1. Analysis of Monthly Peak

Extreme temperature events in the Catu River are typically characterized by monthly peaks exceeding 21˚C, often associated with seasonal heating and hydrological stress conditions. These events are critical for ecological balance, water quality, and system-level modeling. Therefore, evaluating the model’s capability to predict these upper-bound temperature scenarios is essential.

The comparison between observed and predicted peak temperatures, as illustrated in Figure 7, demonstrates that the PINN + LSTM model is capable of approximating the seasonal dynamics of monthly maxima. The predicted values follow the overall trend of real peaks, particularly during the warmer months when temperature variability is higher. However, systematic deviations are observed in several periods.

Notably, there is a consistent offset between observed and predicted peaks, with an average deviation of approximately 2˚C. This level of divergence is within the acceptable range for natural measurement uncertainty in environmental monitoring, especially considering the complexity of river temperature dynamics influenced by meteorological, hydrological, and topographic factors.

Figure 7. Monthly peak river temperatures (2009-2016): comparison between observed and predicted values using the PINN + LSTM model.

During the summer and late spring months, where extreme peaks are more pronounced, the model reproduces the temperature elevation pattern but tends to slightly underestimate the peak amplitude. In contrast, during cooler months, the difference between prediction and observation is reduced, indicating better alignment in low-variability scenarios.

The results suggest that while the model captures the general structure of monthly extremes, high-frequency and short-term peak magnitudes remain partially attenuated. This behavior is consistent with temporal smoothing effects induced by the LSTM architecture and the regularization from physics-based constraints.
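The monthly peak comparison can be sketched as below. The record layout `(year, month, temperature)`, the function names, and the sample values are hypothetical and chosen only to illustrate how a mean peak offset is obtained.

```python
from collections import defaultdict


def monthly_peaks(records):
    """records: iterable of (year, month, temperature); returns {(year, month): maximum}."""
    peaks = defaultdict(lambda: float("-inf"))
    for year, month, temp in records:
        key = (year, month)
        peaks[key] = max(peaks[key], temp)
    return dict(peaks)


def mean_peak_offset(observed, predicted):
    """Mean of (observed peak - predicted peak) over months present in both series."""
    obs, pred = monthly_peaks(observed), monthly_peaks(predicted)
    shared = sorted(set(obs) & set(pred))
    return sum(obs[k] - pred[k] for k in shared) / len(shared)


# Hypothetical daily values for two months
observed = [(2016, 1, 23.0), (2016, 1, 25.0), (2016, 2, 24.0), (2016, 2, 26.0)]
predicted = [(2016, 1, 22.5), (2016, 1, 23.0), (2016, 2, 23.5), (2016, 2, 24.0)]
offset = mean_peak_offset(observed, predicted)
```

A positive offset indicates that the model underestimates the observed peaks on average.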

3.2. Sensitivity Analysis of Meteorological Variables

A sensitivity analysis was performed to investigate how variations in individual meteorological inputs affect the LSTM model’s output for river temperature. This analysis was conducted by computing the partial derivatives of the model’s predictions with respect to each input variable using automatic differentiation.
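The gradient-based sensitivity measure can be illustrated with a small sketch. Two substitutions are made and should be read as assumptions: central finite differences replace automatic differentiation, and `toy_model` with arbitrary coefficients replaces the trained network. The feature order and sample values are also hypothetical.

```python
def toy_model(features):
    """Hypothetical stand-in for the trained network (coefficients are arbitrary)."""
    humidity, dew_point, wind, rain = features
    return 0.6 * humidity + 0.3 * dew_point ** 2 - 0.1 * wind + 0.0 * rain


def sensitivity(model, x, h=1e-5):
    """Approximate the partial derivative of the output with respect to each input."""
    grads = []
    for j in range(len(x)):
        up, down = list(x), list(x)
        up[j] += h
        down[j] -= h
        grads.append((model(up) - model(down)) / (2 * h))
    return grads


def mean_abs_sensitivity(model, samples):
    """Average absolute gradient per input feature over a set of samples."""
    per_sample = [sensitivity(model, x) for x in samples]
    n = len(samples)
    return [sum(abs(g[j]) for g in per_sample) / n for j in range(len(samples[0]))]


# Hypothetical normalized samples: humidity, dew point, wind speed, rainfall
samples = [[0.7, 0.5, 0.2, 0.0], [0.8, 1.0, 0.3, 0.1]]
scores = mean_abs_sensitivity(toy_model, samples)
```

Averaging the absolute gradients over the samples yields one score per variable, which is the kind of quantity a mean-sensitivity bar chart summarizes.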

The results, shown in Figure 8, reveal that different variables exhibit distinct levels of influence on the model output. Variables related to atmospheric moisture, such as dew point and humidity, display higher sensitivity, suggesting that the model registers stronger gradients in temperature predictions when these inputs are perturbed. This may be associated with the model’s internal representation of near-surface heat exchange mechanisms.

Evaporation also demonstrates a measurable impact, indicating a potential role in modulating temperature through latent heat processes. Meanwhile, rainfall and wind speed contribute lower gradient responses, reflecting reduced direct sensitivity under the current temporal and spatial resolution.

Figure 8. Mean sensitivity of meteorological input variables to river temperature prediction, based on gradient analysis of the LSTM model.

These sensitivity patterns vary across variables and likely depend on the seasonal dynamics, temporal scale of the input data, and the autoregressive [34] nature of the LSTM architecture, which may amplify or dampen certain features based on their historical behavior in the input sequences.

3.3. Ablation Study of Meteorological Inputs

To complement the gradient-based sensitivity analysis, an ablation study was conducted to examine the dependency of the model on individual meteorological variables. In this approach, each input feature was systematically removed by zeroing its values in the test set while keeping the remaining variables unchanged. The resulting change in Mean Squared Error (MSE) was computed to quantify the relative importance of each variable to the predictive performance of the model.
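The ablation procedure can be sketched as follows. The zeroing of one feature at a time and the use of the change in MSE follow the text; `toy_model`, its coefficients, and the data are hypothetical stand-ins for the trained network and the test set.

```python
def mse(pred, target):
    """Mean squared error between two equally long sequences."""
    return sum((p - y) ** 2 for p, y in zip(pred, target)) / len(target)


def ablation_delta_mse(model, X, y, n_features):
    """Zero one feature at a time and report the change in MSE against the baseline."""
    baseline = mse([model(x) for x in X], y)
    deltas = []
    for j in range(n_features):
        X_zeroed = [[0.0 if k == j else v for k, v in enumerate(x)] for x in X]
        deltas.append(mse([model(x) for x in X_zeroed], y) - baseline)
    return deltas


def toy_model(x):
    """Hypothetical linear stand-in; the third feature carries no information."""
    return 2.0 * x[0] + 1.0 * x[1] + 0.0 * x[2]


X = [[1.0, 1.0, 1.0], [2.0, 0.0, 3.0]]
y = [toy_model(x) for x in X]
deltas = ablation_delta_mse(toy_model, X, y, n_features=3)
```

Here removing the first feature raises the MSE by 10.0, the second by 0.5, and the third by 0.0. A variable with a zero or negative change plays the role precipitation plays in this study.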

The outcomes of this procedure are illustrated in Figure 9. Variables associated with atmospheric moisture, particularly humidity and dew point, produced the largest increases in error when removed, indicating that these features provide essential information for capturing the thermal behavior of the river system. Removing wind speed or evaporation also led to noticeable error increases, suggesting that they contribute relevant dynamic cues to the model's internal representation of energy exchange processes.

Precipitation, on the other hand, showed minimal influence on model performance. Its removal did not result in a meaningful increase in prediction error, and even led to a slight improvement in some cases. This behavior may be explained by the regional climatic characteristics: the study area experiences relatively low rainfall throughout the year, with precipitation concentrated in brief seasonal periods. As a result, the temporal resolution and dynamics of precipitation events may not align strongly with short-term river temperature fluctuations, reducing its predictive value in this specific configuration.

Figure 9. Change in prediction error (MSE) after removal of each meteorological input variable. Positive values indicate performance degradation.

The ablation study reveals that meteorological variables such as humidity and dew point play significant roles in river temperature prediction. However, other important drivers, including river discharge, riparian vegetation cover, and riverbed morphology, were not incorporated in this study. Future work should aim to integrate these factors to enhance the model’s predictive capabilities.

4. Conclusions and Suggestions

This study aimed to develop a hybrid modeling approach that combines Physics-Informed Neural Networks (PINNs) with Long Short-Term Memory (LSTM) networks to predict river water temperature with improved accuracy and physical consistency. Based on the results, the proposed objective was effectively achieved. The model demonstrated the capability to capture both the temporal variability and physical constraints governing thermal behavior in river systems, even under data limitations.

Regarding the main contributions, the integration of the advection-dispersion-reaction equation within the PINN module allowed for the physical representation of heat exchange processes influenced by meteorological inputs. Combined with the LSTM’s temporal learning ability, the hybrid architecture enhanced the predictive robustness across diverse hydroclimatic conditions. Sensitivity and ablation analyses confirmed that the model effectively learned variable dependencies, highlighting humidity and dew point as primary influencers and confirming the minimal impact of precipitation due to the region’s low rainfall frequency.

The results support the conclusion that hybrid data-physics-informed models present a viable alternative to purely data-driven or physical models, particularly in scenarios where complex dynamics and limited data coexist. The accurate reproduction of temperature peaks, the alignment with observed seasonal patterns, and the model’s adaptability to extreme events emphasize its suitability for real-world environmental applications.

While the PINN + LSTM model achieves satisfactory average performance, it systematically underestimates peak river temperatures, as evidenced by the comparison of monthly maximum temperatures. This limitation is particularly relevant for environmental management applications. Future research may address it by applying extreme-focused loss functions or data augmentation techniques to better capture critical thermal events.
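One simple form an extreme-focused loss could take is a weighted mean squared error that penalizes errors on high-temperature observations more heavily. The sketch below is illustrative only and was not evaluated in this study: the function name `peak_weighted_mse`, the threshold, the weight value, and the example temperatures are all assumptions.

```python
import numpy as np

def peak_weighted_mse(y_true, y_pred, threshold, peak_weight=5.0):
    """Weighted MSE: samples with y_true >= threshold count `peak_weight` times.

    `threshold` and `peak_weight` are hypothetical tuning parameters.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    weights = np.where(y_true >= threshold, peak_weight, 1.0)
    return float(np.sum(weights * (y_true - y_pred) ** 2) / np.sum(weights))

# Hypothetical temperatures in degrees Celsius; the last value is a peak.
y_true = np.array([24.0, 25.0, 31.0])
pred_missing_peak = np.array([24.0, 25.0, 29.0])  # 2 degree error at the peak
pred_missing_base = np.array([22.0, 25.0, 31.0])  # 2 degree error off-peak

loss_peak = peak_weighted_mse(y_true, pred_missing_peak, threshold=30.0)
loss_base = peak_weighted_mse(y_true, pred_missing_base, threshold=30.0)
print(loss_peak, loss_base)
```

With this weighting, an error of the same magnitude contributes more to the loss when it occurs at a peak than when it occurs at a typical value, which is the behavior an extreme-focused objective is meant to encourage.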

Future work may involve expanding the model to incorporate spatial dynamics via distributed sensors, integrating additional hydrological variables such as streamflow and turbidity, and improving the PINN formulation with adaptive weighting between physical and observational losses. Additionally, extending the model to include uncertainty quantification methods may provide more interpretable predictions for decision-making in environmental monitoring and management.

Acknowledgements

This work was supported by CAPES (Coordination for the Improvement of Higher Education Personnel), the State University of Bahia (UNEB), and the SENAI CIMATEC University. The authors thank these institutions for their academic, technical, and financial support throughout the development of this research.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Zhu, S. and Piotrowski, A.P. (2020) River/Stream Water Temperature Forecasting Using Artificial Intelligence Models: A Systematic Review. Acta Geophysica, 68, 1433-1442.
https://doi.org/10.1007/s11600-020-00480-7
[2] Qiu, R., Wang, Y., Wang, D., Qiu, W., Wu, J. and Tao, Y. (2020) Water Temperature Forecasting Based on Modified Artificial Neural Network Methods: Two Cases of the Yangtze River. Science of The Total Environment, 737, Article ID: 139729.
https://doi.org/10.1016/j.scitotenv.2020.139729
[3] Qiu, R., Wang, Y., Rhoads, B., Wang, D., Qiu, W., Tao, Y., et al. (2021) River Water Temperature Forecasting Using a Deep Learning Method. Journal of Hydrology, 595, Article ID: 126016.
https://doi.org/10.1016/j.jhydrol.2021.126016
[4] Tao, H., Abba, S.I., Al-Areeq, A.M., Tangang, F., Samantaray, S., Sahoo, A., et al. (2024) Hybridized Artificial Intelligence Models with Nature-Inspired Algorithms for River Flow Modeling: A Comprehensive Review, Assessment, and Possible Future Research Directions. Engineering Applications of Artificial Intelligence, 129, Article ID: 107559.
https://doi.org/10.1016/j.engappai.2023.107559
[5] Gresselin, F., Dardaillon, B., Bordier, C., Parais, F. and Kauffmann, F. (2021) Use of Statistical Methods to Characterize the Influence of Groundwater on the Thermal Regime of Rivers in Normandy, France: Comparison between the Highly Permeable, Chalk Catchment of the Touques River and the Low Permeability, Crystalline Rock Catchment of the Orne River. Geological Society, London, Special Publications, 517, 351-378.
https://doi.org/10.1144/sp517-2020-117
[6] Georges, B. (2022) Characterization of the River Thermal Regime in Relation to Its Environment: A Regional Approach Using in Situ Sensors in a Temperate Region (Wallonia, Belgium).
https://hdl.handle.net/2268/266855
[7] Drainas, K., Kaule, L., Mohr, S., Uniyal, B., Wild, R. and Geist, J. (2023) Predicting Stream Water Temperature with Artificial Neural Networks Based on Open‐Access Data. Hydrological Processes, 37, e14991.
https://doi.org/10.1002/hyp.14991
[8] Almeida, M.C. and Coelho, P.S. (2023) Modeling River Water Temperature with Limiting Forcing Data: Air2stream V1.0.0, Machine Learning and Multiple Regression. Geoscientific Model Development, 16, 4083-4112.
https://doi.org/10.5194/gmd-16-4083-2023
[9] Zhang, W., Zhou, H., Bao, X. and Cui, H. (2023) Outlet Water Temperature Prediction of Energy Pile Based on Spatial-Temporal Feature Extraction through CNN-LSTM Hybrid Model. Energy, 264, Article ID: 126190.
https://doi.org/10.1016/j.energy.2022.126190
[10] Huang, X., Abolt, C.J. and Bennett, K.E. (2023) Brief Communication: Effects of Different Saturation Vapor Pressure Calculations on Simulated Surface-Subsurface Hydrothermal Regimes at a Permafrost Field Site. The Cryosphere Discussions, 1-15. (Preprints)
https://doi.org/10.5194/tc-2023-8
[11] Bazionis, I.K. and Georgilakis, P.S. (2021) Review of Deterministic and Probabilistic Wind Power Forecasting: Models, Methods, and Future Research. Electricity, 2, 13-47.
https://doi.org/10.3390/electricity2010002
[12] Padrón, R.S., Zappa, M., Bernhard, L. and Bogner, K. (2024) Extended Range Forecasting of Stream Water Temperature with Deep Learning Models. EGUsphere, 1-27.
https://doi.org/10.5194/egusphere-2024-2591-supplement
[13] Oad, S., Imteaz, M.A. and Mekanik, F. (2023) Artificial Neural Network (ANN)-Based Long-Term Streamflow Forecasting Models Using Climate Indices for Three Tributaries of Goulburn River, Australia. Climate, 11, Article 152.
https://doi.org/10.3390/cli11070152
[14] Tran, T.T.K., Bateni, S.M., Ki, S.J. and Vosoughifar, H. (2021) A Review of Neural Networks for Air Temperature Forecasting. Water, 13, Article 1294.
https://doi.org/10.3390/w13091294
[15] Georgescu, P., Moldovanu, S., Iticescu, C., Calmuc, M., Calmuc, V., Topa, C., et al. (2023) Assessing and Forecasting Water Quality in the Danube River by Using Neural Network Approaches. Science of The Total Environment, 879, Article ID: 162998.
https://doi.org/10.1016/j.scitotenv.2023.162998
[16] Cai, S., Mao, Z., Wang, Z., Yin, M. and Karniadakis, G.E. (2021) Physics-Informed Neural Networks (PINNs) for Fluid Mechanics: A Review. Acta Mechanica Sinica, 37, 1727-1738.
https://doi.org/10.1007/s10409-021-01148-1
[17] Nilpueng, K., Kaseethong, P., Mesgarpour, M., Shadloo, M.S. and Wongwises, S. (2022) A Novel Temperature Prediction Method without Using Energy Equation Based on Physics-Informed Neural Network (PINN): A Case Study on Plate-Circular/Square Pin-Fin Heat Sinks. Engineering Analysis with Boundary Elements, 145, 404-417.
https://doi.org/10.1016/j.enganabound.2022.09.032
[18] Liao, S., Xue, T., Jeong, J., Webster, S., Ehmann, K. and Cao, J. (2023) Hybrid Thermal Modeling of Additive Manufacturing Processes Using Physics-Informed Neural Networks for Temperature Prediction and Parameter Identification. Computational Mechanics, 72, 499-512.
https://doi.org/10.1007/s00466-022-02257-9
[19] Gao, W., Gao, J., Yang, L., Wang, M. and Yao, W. (2021) A Novel Modeling Strategy of Weighted Mean Temperature in China Using RNN and LSTM. Remote Sensing, 13, Article 3004.
https://doi.org/10.3390/rs13153004
[20] Hayder, G., Iwan Solihin, M. and Najwa, M.R.N. (2022) Multi-Step-Ahead Prediction of River Flow Using NARX Neural Networks and Deep Learning LSTM. H2Open Journal, 5, 43-60.
https://doi.org/10.2166/h2oj.2022.134
[21] Dehghani, A., Moazam, H.M.Z.H., Mortazavizadeh, F., Ranjbar, V., Mirzaei, M., Mortezavi, S., et al. (2023) Comparative Evaluation of LSTM, CNN, and ConvLSTM for Hourly Short-Term Streamflow Forecasting Using Deep Learning Approaches. Ecological Informatics, 75, Article ID: 102119.
https://doi.org/10.1016/j.ecoinf.2023.102119
[22] Li, L., Wang, Q., Zhao, Z., Li, C. and Hu, Y. (2024) A Novel Optimal Temporal Lag Air2stream Model Using Dynamic Time Series Matching. Hydrological Sciences Journal, 69, 2278-2292.
https://doi.org/10.1080/02626667.2024.2406311
[23] Feng, D., Tan, Z. and He, Q. (2023) Physics‐Informed Neural Networks of the Saint‐Venant Equations for Downscaling a Large‐Scale River Model. Water Resources Research, 59, e2022WR033168.
https://doi.org/10.1029/2022wr033168
[24] Elbeltagi, A., Pande, C.B., Kumar, M., Tolche, A.D., Singh, S.K., Kumar, A., et al. (2023) Prediction of Meteorological Drought and Standardized Precipitation Index Based on the Random Forest (RF), Random Tree (RT), and Gaussian Process Regression (GPR) Models. Environmental Science and Pollution Research, 30, 43183-43202.
https://doi.org/10.1007/s11356-023-25221-3
[25] Feigl, M., Lebiedzinski, K., Herrnegger, M. and Schulz, K. (2021) Machine-Learning Methods for Stream Water Temperature Prediction. Hydrology and Earth System Sciences, 25, 2951-2977.
https://doi.org/10.5194/hess-25-2951-2021
[26] Heddam, S., Ptak, M. and Zhu, S. (2020) Modelling of Daily Lake Surface Water Temperature from Air Temperature: Extremely Randomized Trees (ERT) versus Air2Water, MARS, M5Tree, RF and MLPNN. Journal of Hydrology, 588, Article ID: 125130.
https://doi.org/10.1016/j.jhydrol.2020.125130
[27] Someetheram, V., Marsani, M.F., Mohd Kasihmuddin, M.S., Mohd Jamaludin, S.Z. and Mansor, M.A. (2024) Double Decomposition with Enhanced Least-Squares Support Vector Machine to Predict Water Level. Journal of Water and Climate Change, 15, 2582-2594.
https://doi.org/10.2166/wcc.2024.558
[28] Usharani, B. (2023) ILF-LSTM: Enhanced Loss Function in LSTM to Predict the Sea Surface Temperature. Soft Computing, 27, 13129-13141.
https://doi.org/10.1007/s00500-022-06899-y
[29] Venkateswarlu, T. and Anmala, J. (2024) Importance of Land Use Factors in the Prediction of Water Quality of the Upper Green River Watershed, Kentucky, USA, Using Random Forest. Environment, Development and Sustainability, 26, 23961-23984.
https://doi.org/10.1007/s10668-023-03630-1
[30] O’Sullivan, A.M., Devito, K.J., Ogilvie, J., Linnansaari, T., Pronk, T., Allard, S., et al. (2020) Effects of Topographic Resolution and Geologic Setting on Spatial Statistical River Temperature Models. Water Resources Research, 56, e2020WR028122.
https://doi.org/10.1029/2020wr028122
[31] Pero, E.J.I., Georgieff, S.M., Gultemirian, M.d.L., Romero, F., Hankel, G.E. and Domínguez, E. (2020) Ecoregions, Climate, Topography, Physicochemical, or a Combination of All: Which Criteria Are the Best to Define River Types Based on Abiotic Variables and Macroinvertebrates in Neotropical Rivers? Science of The Total Environment, 738, Article ID: 140303.
https://doi.org/10.1016/j.scitotenv.2020.140303
[32] Wan Mohd Jaafar, W.S., Abdul Maulud, K.N., Muhmad Kamarulzaman, A.M., Raihan, A., Md Sah, S., Ahmad, A., et al. (2020) The Influence of Deforestation on Land Surface Temperature—A Case Study of Perak and Kedah, Malaysia. Forests, 11, Article 670.
https://doi.org/10.3390/f11060670
[33] Fuso, F., Stucchi, L., Bonacina, L., Fornaroli, R. and Bocchiola, D. (2023) Evaluation of Water Temperature under Changing Climate and Its Effect on River Habitat in a Regulated Alpine Catchment. Journal of Hydrology, 616, Article ID: 128816.
https://doi.org/10.1016/j.jhydrol.2022.128816
[34] Alnahit, A.O., Mishra, A.K. and Khan, A.A. (2022) Stream Water Quality Prediction Using Boosted Regression Tree and Random Forest Models. Stochastic Environmental Research and Risk Assessment, 36, 2661-2680.
https://doi.org/10.1007/s00477-021-02152-4

Copyright © 2025 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.