Prediction of Strength Properties of Soft Soil Considering Simple Soil Parameters
1. Introduction
Assessing soil parameters is essential in geotechnical engineering and construction practice. These parameters provide critical information for bearing capacity and stability evaluation, slope stability analysis, foundation design, etc. These properties of soil must be understood properly to establish critical failure criteria for the construction of superstructures as well as foundations [1]. Important soil parameters for assessing geotechnical properties are the SPT-N value, dry density, moisture content, particle size distribution (Cu, Cc), liquid limit, plastic limit, etc. [2]. The penetration resistance of soil is measured by its SPT-N value during the standard penetration test [3]. Dry density refers to the mass of soil per unit volume when it is in a completely dry state. Moisture content, expressed as a percentage of the dry weight of soil, indicates the amount of water present in the soil. Particle size distribution provides information about the distribution and composition of soil particles across different size ranges. The Atterberg limits include the liquid limit, which is the moisture content at which fine-grained soil transitions from a liquid-like to a plastic state, and the plastic limit, which is the moisture content at which fine-grained soil transitions from a plastic state to a semi-solid state.
Cohesion, like shear strength, is an important component of the physical forces acting within soil, and it can be determined experimentally with the help of other soil properties [4]. Soil consistency can be predicted using the value of cohesion, which is important in both soil physics and soil mechanics. In soil physics, it is the cohesive force that acts between soil particles. In soil mechanics, cohesion is the shear strength of soil at the point where the compressive stress is equal to zero. The stability of a soil and its capability to adjust when facing overburden loads and loading from structures are greatly affected by its shear strength [5]. This shear strength parameter is important for soil stability, as it denotes how much shear stress a soil can take before sliding [6]. Thus, it is important to use soil with a higher cohesion value to construct retaining walls on inaccessible terrain [7]. The shear strength parameter, especially the cohesion value of soil, is of prime importance in different foundation designs. For a rough rigid strip footing, the bearing capacity varies with the spatial variation of cohesion, and for shallow foundations, the bearing capacity is mainly based on the cohesion characteristics of soil [8] [9]. Bearing capacity is an important factor in designing mechanically stabilized earth walls. The detachment rate of soil particles is also related to soil cohesion: an inverse correlation has been observed between soil detachment and soil cohesion [10]. The detachment rate of soil can be determined using the depth of surface runoff of an area. The subsoil reaction to heavy live loads on roads also changes with the cohesion of soil. Soil modeling is important for the sensitivity analysis of soil, and cohesion is an important parameter for soil modeling [11].
The cohesion value of a soil specimen taken from different depths of a borehole can be determined primarily by performing the UCS test, direct shear test, and triaxial tests. These tests have limitations, as they are very time-consuming. Undisturbed soil samples are preferred for these tests, and because soil properties differ from place to place in Bangladesh, different shear strength tests have to be performed [12]. However, for large-scale construction, it is preferable to obtain the cohesion value more easily and rapidly for a large number of datasets. Much research has been carried out to reduce these constraints and to obtain the shear strength parameters, i.e., the cohesion value. Analytical approaches for predicting shear strength values were first studied considering the bearing capacity mechanism of failure at the cone tip and direct shear failure along the penetrometer sleeve [13]. Multiple linear regression was used in Christchurch, New Zealand, to predict soil shear wave velocities (Vs) from a cone penetration test [14]. Some constitutive models were also developed to further research the creep behavior of soil, which gives information about the shear strength of soil [15]. Among these studies, establishing a correlation between soil parameters and the cohesion of soil is the most efficient way to predict the shear strength without performing laboratory-based shear strength tests. Correlation analysis between the shear strength parameter and unconfined compressive strength, bulk unit weight, and dry unit weight has been reported [16]. Multiple regression models for assessing correlation and predicting shear strength have also been studied [17]. The correlation between effective cohesion and the plasticity index of clay has been studied earlier as well [18]. Earlier attempts to predict software-based cohesion metrics with machine learning models were not quite satisfactory [19].
Therefore, obtaining the cohesion value of soil samples through correlation with different soil parameters is more rapid and can provide cohesion values for a large number of datasets with good accuracy.
Previous studies correlated only some of the soil parameters, such as plasticity index, cone resistance values [20], CPT-SPT values [21], etc., with other soil properties, especially the shear strength parameter of soil. If a correlation is established between undrained cohesion (Cu) and several soil parameters such as the SPT-N value, %sand and %fines, dry density, and moisture content, which are important soil parameters that need to be tested before any construction on the soil, the prediction is expected to be more accurate. The coefficient of determination (R²) is expected to be higher with a combined Multiple Linear Regression (MLR) of those soil parameters against cohesion.
Soil sample collection and analysis is the foremost step in examining soil characteristics and establishing correlations between soil properties. Most studies of any type of soil prefer undisturbed soil samples, as they represent the realistic behavior of soil under loading conditions. It is therefore preferable to establish a correlation of cohesion with soil parameters using undisturbed soil samples, so that further research using these correlations rests on a more accurate assessment. The focus of our study is to correlate different soil parameters with shear strength (cohesion) using silty clay soil samples in undisturbed condition taken from boreholes of a specific region. The proposed correlation is intended to describe the soil more accurately from important soil parameters than the correlation studies performed before. Previously, a correlation of soil parameters was developed with existing CPT and SPT values for Bangladesh [22]. Akan et al. (2015) [23], Sharma et al. (2018) [24] and Ahmad et al. (2018) [25] attempted to develop correlations using multiple regression models to predict the unconfined compressive strength of soil. Zaman et al. (2016) [26] developed a correlation between the consolidation properties of soil and liquid limit, in situ water content, void ratio, and plasticity index. Statistical models were also developed to predict liquefaction and seismic hazard from SPT data [27] [28]. Moreover, several statistical approaches were used to develop models for the slope stability assessment of clayey sand hill tracts [29] [30] [31] and for the deformation of soil due to tunneling [32] [33] [34]. Later on, more advanced support vector machine (SVM) and artificial neural network (ANN) models were used by Tabarsa et al. (2021) [35] to predict the unconfined compressive strength.
ANN models have also been applied in other fields of civil engineering, especially in transportation modeling, quality assessment and safety-related predictions [36] [37] [38] [39] [40]. Kabir et al. (2019) [41] also developed an ANN model to predict the bearing capacity of shallow foundations using relevant soil parameters that can be acquired from in-situ testing. These previous studies demonstrate the successful implementation of both statistical and advanced machine learning (ML) models in developing prediction models and establishing correlations using data obtained from in-situ testing and relevant soil parameters.
Considering all these factors, and in order to establish a more accurate correlation between shear strength (undrained cohesion, Cu) and relevant soil parameters, the SPT-N value, %sand and %fines, dry density, and moisture content were taken into consideration from undisturbed soil samples of 100 boreholes, which consisted of soft soil or silty clay type soil. The samples contained mostly fine particles and a smaller amount of sand particles. Using these soil parameters, Linear Regression (LR) was first proposed with different variables according to their impact on the cohesion value, and then Multiple Linear Regression (MLR) and Random Forest algorithms were applied to develop the correlation. Their performances were compared with the results obtained from the Machine Learning (ML) model. In order to assess the accuracy and performance of all the models, the mean squared error (MSE), root mean square error (RMSE), and mean absolute error (MAE) were also calculated. In this way, this research aims to facilitate the prediction of the strength properties of soil in order to understand soil behavior.
2. Methodology
2.1. Study Area
Soil samples for our study were collected from various locations around Dhaka city where we identified the existence of silty clay type soil. The soil samples were primarily obtained from areas where the percentage of fine particles exceeded that of the other soil particles by 8 to 9 times. We specifically focused on gathering undisturbed soil samples. Samples were collected from different depths of the soil strata, i.e., 2.5 m, 5 m, 10 m, and 15 m below the surface. The gathered samples were placed in strong, labeled, sealed polythene bags before being taken to the laboratory for examination.
2.2. Material Collection
To drill the boreholes in soils containing fine particles, the rotary method with hydraulic rotation was used. The borings were advanced by rotating the drilling bit, which was pushed by hydraulic rotary pressure. A thin-walled Shelby tube was used in the field to collect undisturbed soil samples at an interval of 1.5 m. The borehole water level was measured once all the sediments had settled and the water in the borehole had attained the condition of natural groundwater. After boring, the area was backfilled properly.
2.3. Soil Characteristics
The sub-soil investigation showed that the percentage of fines was much higher than that of sand or any other particle size. Sand was found in a very low percentage among the boreholes, whereas the average percentage of clay and silt (fines) was 87% - 97%, which suggests a clay-type soil (Table 1).
2.4. Laboratory Analysis
2.4.1. Particle Size Analysis
Sieve analysis tests were carried out according to ASTM D422 to determine the particle size distribution. Only the particles retained on a 0.075 mm sieve were used, as oven-dry material, for the sieve analysis. The mass of soil retained on each sieve was calculated and expressed as a percentage of the sample mass. Fine materials underwent hydrometer analysis.
2.4.2. Liquid Limit and Plastic Limit
Atterberg limits (mainly the liquid limit and plastic limit) of representative cohesive soil samples were determined according to ASTM D4318. To determine these limits, we performed standard tests on the soil samples, such as the Casagrande test for obtaining the LL. These tests helped us understand the soil’s consistency and its ability to undergo deformation under different moisture conditions.
2.4.3. Water Content
Water content was determined on an oven-drying basis (ASTM D2216). The moist or wet soil was dried thoroughly for 18 to 24 hours at a constant temperature of 105 °C. The moisture or water content of a soil sample is calculated by dividing the mass of water by the mass of solid particles.
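The oven-drying calculation described above can be sketched as follows. This is only an illustration: the function name and the specimen masses are assumptions made for the example, not values from this study, and the masses are taken to exclude the container.

```python
def water_content_percent(wet_mass_g: float, dry_mass_g: float) -> float:
    """Water content (%) = mass of water / mass of dry solids * 100."""
    if dry_mass_g <= 0:
        raise ValueError("dry mass must be positive")
    if wet_mass_g < dry_mass_g:
        raise ValueError("wet mass cannot be smaller than dry mass")
    water_mass_g = wet_mass_g - dry_mass_g  # water driven off in the oven
    return 100.0 * water_mass_g / dry_mass_g


# Hypothetical specimen: 125.0 g moist, 100.0 g after oven drying.
example_w = water_content_percent(125.0, 100.0)
print(f"water content = {example_w:.1f} %")  # prints "water content = 25.0 %"
```

Because the result is expressed relative to the dry mass, values above 100% are possible for very wet fine-grained soils.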
Table 1. Data collection of soil samples.
2.4.4. Dry and Bulk Unit Weight
We determined the dry unit weight of the soil by measuring the weight of a given volume of oven-dried soil according to ASTM D4531. This provided insights into the soil’s density and compaction characteristics.
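For reference, the standard definitions behind this measurement can be written as below. The symbols are notation assumed here rather than taken from the original text: W_s is the oven-dry weight of the solids, W the total (moist) weight, V the total specimen volume, γ the bulk unit weight, γ_d the dry unit weight, and w the water content expressed as a decimal.

```latex
\gamma_d = \frac{W_s}{V}, \qquad
\gamma = \frac{W}{V}, \qquad
\gamma_d = \frac{\gamma}{1 + w}
```

The last relation follows from W = W_s (1 + w), so the dry unit weight can also be obtained from a bulk unit weight measurement once the water content is known.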
2.5. Research Methodology
In the data analysis phase of the methodology, we employed linear regression to explore the relationship between liquid limit, plastic limit, moisture content, %fines, and the shear strength parameter. Specifically, we calculated the R-squared (R²) value for each individual linear regression model. Next, we selected the linear regression model with the highest R² value as a basis for conducting multiple linear regression. This approach allowed us to incorporate multiple independent variables and assess their combined impact on predicting the shear strength parameter. In addition, for more accurate prediction, a Random Forest regression model was used alongside the Multiple Linear Regression model. Multiple linear regression (MLR) is a statistical regression method used to analyze the relationship between a single response variable (dependent variable) and two or more controlled variables (independent variables) [42]. This method was selected because several independent variables were present. The Random Forest method integrates more than one decision tree and combines the results from every decision tree to produce the final result. Because the samples are processed many times, once by each decision tree, the result is reported to have excellent accuracy [43]. To further evaluate the predictive performance of our models, we employed machine learning techniques. We used metrics such as Mean Absolute Error (MAE) and Mean Squared Error (MSE) to quantify the accuracy and precision of our predictions. By combining traditional statistical analysis through linear regression and multiple linear regression with machine learning methods, we aimed to gain a comprehensive understanding of the factors influencing the shear strength parameter and to improve our predictive capabilities.
Here, the MLP (Multi-Layer Perceptron) model was used. MLP is an artificial neural network model. It is a basic deep learning model and is widely used for classification, regression, and pattern recognition. An MLP is formed of multiple layers of neurons. Here, a feedforward neural network was applied. An MLP neuron receives inputs from the previous layer, applies an activation function to their weighted sum, and passes the output to the next layer. The hidden layers between the input and output layers allow the model to learn complicated representations and extract features from the input data. During MLP training, backpropagation computes the gradient of the loss function with respect to the model’s parameters, and stochastic gradient descent uses this gradient to update the neuron weights so as to minimize the discrepancy between predicted and actual results. The number of hidden layers and the number of neurons in each layer are hyperparameters that can be modified for better model performance.
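A minimal sketch of multiple linear regression is given below, using ordinary least squares solved through the normal equations in plain Python. The paper does not state which library or solver was used, so this is a textbook stand-in rather than the study's implementation; the function names and the two-predictor toy data are assumptions made for the example and are not soil measurements.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_mlr(rows, targets):
    """Ordinary least squares; returns [intercept, b1, b2, ...]."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 models the intercept
    k = len(design[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_mlr(beta, row):
    """Predicted response for one row of predictor values."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


# Synthetic toy data generated from y = 5 + 2*x1 - 1*x2 (not soil data).
toy_rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (5.0, 8.0), (6.0, 4.0)]
toy_targets = [5.0 + 2.0 * a - 1.0 * b for a, b in toy_rows]
beta = fit_mlr(toy_rows, toy_targets)
print([round(v, 6) for v in beta])
```

On noise-free data the fit recovers the generating coefficients; with real measurements the same routine returns the least-squares compromise, and nearly collinear predictors (for example liquid limit and plastic limit) can make the normal equations ill-conditioned.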
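The feedforward pass and the weight update described above can be illustrated with a very small network written in plain Python. This is a sketch under stated assumptions, not the model used in the study: it has one hidden layer with tanh activation, a single linear output, updates after every sample (batch size 1), and is trained on synthetic one-dimensional data. All names, the learning rate, and the layer size are illustrative.

```python
import math
import random


def init_mlp(n_in, n_hidden, rng):
    """One hidden layer (tanh) and a single linear output neuron."""
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """Feedforward pass: weighted sum -> activation -> next layer."""
    hidden = [math.tanh(sum(w * v for w, v in zip(ws, x)) + b)
              for ws, b in zip(net["w1"], net["b1"])]
    output = sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]
    return hidden, output


def sgd_step(net, x, y, lr):
    """Backpropagate the squared error of one sample and update all weights."""
    hidden, output = forward(net, x)
    err = output - y  # derivative of 0.5 * err^2 with respect to the output
    for j, h in enumerate(hidden):
        # Use the old output weight before it is updated below.
        grad_hidden = err * net["w2"][j] * (1.0 - h * h)  # tanh' = 1 - tanh^2
        net["w2"][j] -= lr * err * h
        net["b1"][j] -= lr * grad_hidden
        for i, v in enumerate(x):
            net["w1"][j][i] -= lr * grad_hidden * v
    net["b2"] -= lr * err


def mean_squared_error(net, xs, ys):
    return sum((forward(net, x)[1] - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


rng = random.Random(0)
xs = [[i / 10.0] for i in range(-10, 11)]  # synthetic inputs in [-1, 1]
ys = [0.5 * x[0] + 0.2 for x in xs]        # synthetic targets, not soil data
net = init_mlp(n_in=1, n_hidden=4, rng=rng)
loss_before = mean_squared_error(net, xs, ys)
for _ in range(200):                       # 200 passes over the toy data
    order = list(range(len(xs)))
    rng.shuffle(order)                     # "stochastic": random sample order
    for i in order:
        sgd_step(net, xs[i], ys[i], lr=0.05)
loss_after = mean_squared_error(net, xs, ys)
print(loss_before, loss_after)
```

Changing `n_hidden`, adding layers, or averaging gradients over several samples before each update corresponds to the hyperparameters (neurons per layer, number of hidden layers, batch size) discussed in the text.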
3. Performance Evaluation
The model’s performance is assessed by comparing simulated values to actual outputs. This study assessed goodness-of-fit or correlation statistics. Prior to assessing the models’ prediction efficiency, the important soil parameters were identified for use as input parameters in the prediction of cohesion. For that, simple linear regression was first performed for each soil parameter, and the input parameters were decided by comparing their R² values. Only four soil parameters (moisture content, LL, PL and %fines) had R² values comparatively better than those of the other parameters.
Moreover, to compare accuracy across all models, we summarized the R², MAE, MSE and RMSE values of all the models used in our research. Lower values of MSE, MAE and RMSE represent better model performance, as they reflect smaller errors between actual and predicted values.
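The screening step, fitting one simple linear regression per candidate parameter and comparing R² values, can be sketched as below. The column names and all numbers are synthetic placeholders chosen for the example (they are not the study's data), and the helper names are hypothetical.

```python
def fit_simple_linear(x, y):
    """Least-squares slope m and intercept c for y = m*x + c."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has no variation")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def rank_predictors(table, target):
    """Return (name, R^2) pairs sorted from best to worst single predictor."""
    scores = {}
    for name, values in table.items():
        m, c = fit_simple_linear(values, target)
        scores[name] = r_squared(values, target, m, c)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic placeholder values, not measurements from the boreholes.
cohesion = [10.0, 12.0, 14.0, 16.0, 18.0]
candidates = {
    "moisture_content": [20.0, 22.0, 24.0, 26.0, 28.0],  # exactly linear in the target
    "percent_fines": [90.0, 95.0, 88.0, 97.0, 91.0],     # only weakly related
}
ranking = rank_predictors(candidates, cohesion)
for name, score in ranking:
    print(f"{name}: R^2 = {score:.3f}")
```

A predictor with a low individual R² can still contribute in a combined model, which is why the ranking is used here only to shortlist inputs.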
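$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right| \tag{1}$$
$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2 \tag{2}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2} \tag{3}$$
where,
N = number of data points,
$y_i$ = Actual observations and
$\hat{y}_i$ = Predicted observations.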
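The three error measures (MAE, MSE and RMSE) can be computed directly from paired actual and predicted values, as in the short sketch below. The function names and the three sample values are assumptions for illustration only.

```python
import math


def _check(actual, predicted):
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")


def mae(actual, predicted):
    """Mean absolute error."""
    _check(actual, predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error."""
    _check(actual, predicted)
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the observations."""
    return math.sqrt(mse(actual, predicted))


# Hypothetical actual and predicted values (errors of -2, 3 and 0).
actual_values = [40.0, 50.0, 60.0]
predicted_values = [42.0, 47.0, 60.0]
print(mae(actual_values, predicted_values))   # 5/3
print(mse(actual_values, predicted_values))   # 13/3
print(rmse(actual_values, predicted_values))  # sqrt(13/3)
```

MSE and RMSE penalize large individual errors more strongly than MAE, so the three measures are usually read together.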
4. Results and Discussions
The correlations between the considered soil parameters and undrained shear strength are shown in Figures 1-4 with their respective R² values. Although their coefficients of determination (R²) are very low, these parameters are important soil characteristics, so they were adopted as the base properties for establishing a correlation from which further soil cohesion values can be observed and predicted easily, without time-consuming laboratory tests.
For every variable, a different slope (m) value was obtained, while the intercept (c) was the same for every variable, as shown in Table 2. Equation (4) can be used to form four different equations, from which cohesion values can be predicted and compared with the experimentally obtained cohesion values for further evaluation.
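Simple linear equation,
$$y = mx + c \tag{4}$$
where y is the predicted cohesion value, x is the soil parameter, m is the slope and c is the intercept.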
In our study of the correlation with the shear strength parameter of soil, we applied a Python-based Multiple Linear Regression model to correlate multiple variables, namely moisture content (wet basis), liquid limit, plastic limit and %fines, and to derive simple linear equations for further assessing the cohesion value of the soil sample. The correlation is shown in Figure 5. Applying the MLR model to our dataset showed that the result (R² value) was not satisfactory for establishing a correlation, which is why we applied the RFR (Random Forest Regression) model to our dataset to address the limitation of the low R² value. RFR allows a more comprehensive analysis of the data, shown in Figure 6, resulting in a more accurate predictive model. Unlike MLR, which fits a single linear relationship, RFR is an ensemble learning technique that makes predictions by combining many decision trees.
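The idea of combining many decision trees can be illustrated with a heavily simplified stand-in for Random Forest regression: bootstrap-aggregated regression stumps (trees of depth one), each trained on a resampled copy of the data and one randomly chosen feature. A real Random Forest grows much deeper trees, and the paper does not state the library or hyperparameters used, so the function names, tree count, and synthetic data below are all assumptions.

```python
import random


def fit_stump(rows, targets, feature):
    """Best single split on one feature.

    Returns (feature, threshold, left_mean, right_mean).
    """
    overall = sum(targets) / len(targets)
    # Fallback when the feature is constant: everything goes left.
    best = (float("inf"), feature, float("inf"), overall, overall)
    values = sorted(set(r[feature] for r in rows))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [t for r, t in zip(rows, targets) if r[feature] <= threshold]
        right = [t for r, t in zip(rows, targets) if r[feature] > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if sse < best[0]:
            best = (sse, feature, threshold, left_mean, right_mean)
    return best[1:]


def fit_forest(rows, targets, n_trees=25, seed=0):
    """Bagging: each stump sees a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feature = rng.randrange(n_features)
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx], feature))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all stumps."""
    preds = [left if row[f] <= thr else right for f, thr, left, right in forest]
    return sum(preds) / len(preds)


# Synthetic data: the target steps from 10 to 30 with the first feature only.
toy_rows = [[float(i), float((i * 7) % 5)] for i in range(20)]
toy_targets = [10.0 if i < 10 else 30.0 for i in range(20)]
forest = fit_forest(toy_rows, toy_targets)
print(predict_forest(forest, [2.0, 1.0]), predict_forest(forest, [17.0, 1.0]))
```

Averaging over resampled trees reduces the variance of any single tree, which is the property the ensemble relies on; it also lets the model follow non-linear patterns such as the step in this toy target, which a single linear equation cannot represent.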
Figure 2. Correlation with Liquid Limit.
Figure 3. Correlation with Plastic Limit.
Figure 5. Actual Cu vs Predicted Cu (MLR).
Figure 6. Actual Cu vs Predicted Cu (RFR).
Table 2. Summary of different equations evaluated from MLR model.
Using MLP as the Machine Learning (ML) model to predict cohesion values, the dataset was first split into training and testing sets. 80% of the data was used to train the model. The remaining 20% was used for testing and validation purposes. Batch size, number of hidden layers, and number of neurons in each layer are hyperparameters that were adjusted to obtain the best model performance. Table 3 presents the results of the ML model with varying hyperparameters. The R² value of the best-fitted model was found to be similar to that of the Multiple Linear Regression model. From the results of the ML model, it was clear that the model with batch size 8 and 10 hidden layers forecast the unconfined compressive strength more accurately than the other configurations. Table 4 presents the overall comparison of the performance of all models.
These models show how precisely the soil parameters can be correlated with the cohesion of soil and how accurately further cohesion values can be predicted using those models. The R² values obtained from the different models were quite similar. However, the R² value obtained from the RFR model indicates a higher credibility in predicting the unconfined compressive strength from the considered soil parameters. The soil samples collected for our study were in isotropic condition under normal climatic conditions. If extreme weather conditions occur, e.g., heavy rainfall, the properties of soil will change according to the intensity of the weather condition, and so will the correlation stated in this study. In developing the model framework, the laboratory results and statistical measures were based on samples from a similar region with minimal variation. Factors that transform soil into an anisotropic condition, e.g., seismic activities, can also alter local soil characteristics and therefore affect the observed correlation.
The result obtained from the RFR model is satisfactory for the prediction of the soil cohesion parameter. Thus, the cohesion values of clayey soil can be predicted using these correlations, which are quite accurate according to the result summary.
In our study, only silty clay-type soil sample data were used in the established model framework and correlated with other soil parameters. This framework would help in studying soil sample data of other regions with different soil characteristics, leading to the establishment of correlations across a broad range of soil types.
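The 80/20 split can be sketched in a few lines of plain Python. The function name, the fixed seed, and the use of integer sample identifiers are assumptions for the example; the paper does not describe how the split was randomized.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle the row indices and hold out a fraction for testing."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split repeatable
    n_test = round(len(rows) * test_fraction)
    test = [rows[i] for i in indices[:n_test]]
    train = [rows[i] for i in indices[n_test:]]
    return train, test


# Hypothetical identifiers standing in for 100 samples.
samples = list(range(100))
train_rows, test_rows = train_test_split(samples)
print(len(train_rows), len(test_rows))  # 80 20
```

Keeping the held-out rows untouched during hyperparameter tuning matters: if batch size and layer counts are chosen by looking at the same 20%, the reported test error tends to be optimistic, so a separate validation subset is commonly carved out of the training portion.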
Table 3. Results Obtained from ML model.
Table 4. Result summary of all models.
5. Conclusions
The outcome of this research offers several perspectives for understanding soil behavior. From this study of Dhaka soil, we formulated different prediction models and established correlations between soil parameters and the shear strength of soil. The approaches adopted were regression models (Multiple Linear Regression model, RFR model) and a machine learning model. Soil index properties, fines, and moisture content were also individually correlated with undrained cohesion, where they showed very insignificant or almost no correlation. When they were combined, however, the RFR model showed higher accuracy in predicting the undrained cohesion (Cu). Hence, using this best-fitted prediction model, RFR, it is possible to predict the unconfined compressive strength of soil with higher accuracy from the plasticity properties and moisture content of soil. In the prediction of the unconfined compressive strength, the MLR model, although incorporated as a machine learning model, yielded R² values that indicate a lower capability to correlate the strength parameter of clayey soil. On the other hand, the RFR model showed better accuracy with a higher correlation value. The model performance indicators MAE, MSE, and RMSE were also lower for the RFR model, which supports the higher accuracy of this model. This outcome can therefore facilitate the prediction of unconfined compressive strength without performing rigorous strength tests. As this study is in its preliminary state, with available data from only 100 boreholes of a specific region, it could be extended to different soil types, including sandy soil, from various field areas. Variation in the data due to the collection of soil samples from a larger region and from deeper bore logs was not present in our study, as it was performed for a specific soil type (silty clay) with data varying within a narrow band.
In future, this study could incorporate different machine learning methods such as LASSO Regression, Recursive Feature Elimination (RFE) and other Artificial Neural Network (ANN) models to correlate a large amount of soil sample data and build a stronger correlation. Long-term monitoring of different areas would enable the study of soil samples in numerous conditions and states with the help of the correlation that this study has established. It can help predict soil cohesion with less deviation from the real field data obtained from laboratory tests. These steps can be taken in future to apply our study to civil engineering construction and speed up future projects.