Prediction of Chemical Composition of Ancient Glass Relics before Weathering

Abstract

Ancient glass relics weather easily in their burial environment: internal elements exchange extensively with elements in the environment, which changes the composition ratios. Archaeological analysis can usually measure the composition of glass relics after weathering, but the corresponding composition before weathering is difficult to obtain. Predicting the chemical composition before weathering is therefore necessary to identify the type of a glass relic accurately and to restore it. To solve this problem, this paper proposes a distribution matching strategy and studies the influence of weathering on glass composition through compositional correlation analysis and linear regression, in order to build a prediction model for the composition of glass relics before weathering. The results show that the model built with the distribution matching strategy predicts well and agrees with the trends in composition ratio found by linear regression analysis. The model is also simple and easy to operate, which makes it convenient to popularize and apply, and it provides a theoretical basis and reference for further research on the composition and accurate classification of glass cultural relics.

Share and Cite:

Sun, J. , Chen, H. , Liu, Y. , Lin, H. , Zheng, H. and Qiu, Y. (2023) Prediction of Chemical Composition of Ancient Glass Relics before Weathering. Open Journal of Applied Sciences, 13, 1565-1580. doi: 10.4236/ojapps.2023.139124.

1. Introduction

Glass, as a precious ware, is valuable material evidence of the early trade between China and the West in ancient times [1] . The main raw material of glass is quartz sand, whose main chemical component is SiO2. Differences in the flux added during glassmaking lead to differences in the main chemical composition of the finished glass. In ancient China there were two main types of glass: lead-barium glass and high-potassium glass. Lead-barium glass is fired with lead ore as the flux, which raises its PbO and BaO content. High-potassium glass is made with plant ash or other potassium-rich substances as the flux [2] .

Ancient glass products have been buried for a long time and are highly susceptible to the influence of the surrounding environment. When elements inside the glass exchange extensively with chemical elements in the surrounding environment, weathering occurs and changes the chemical composition [3] . In archaeological research, unearthed glass relics are all in a weathered state, so it is difficult to judge their chemical composition before weathering accurately, which seriously affects type identification. In order to identify the type of a glass relic accurately and then carry out restoration work, it is necessary to predict the chemical composition of the relic before weathering.

It is a hot topic to identify the type of glass cultural relics by quantitative analysis of chemical composition [4] [5] . In the early period, scholars did some research on the method of determining the chemical composition of glass relics. Gan Fuxi et al. [6] analyzed the chemical composition of glass beads by proton excited X-ray fluorescence and inductively coupled plasma emission spectroscopy; Li Qinghui et al. [7] [8] proposed a method of proton induced X-ray emission (PIXE) for the determination of the chemical composition of ancient glass. In recent years, with the development of information technology, neural networks, multi-layer perceptrons, decision trees, random forests, feature selection and other machine learning methods have also been applied to the research on the type identification of glass relics. Cao Yuxuan et al. [9] used GA-BP neural network to study the relationship between the internal chemical composition of different types of ancient glass and their own weathering degree, and established models for predicting the types of ancient glass and measuring the weathering degree; Shi Baoming et al. [10] established a multi-layer perceptron network model to predict the categories of ancient glass products; Lv Fei et al. [11] established decision tree model and random forest model to identify the types of ancient glass products. In addition, there are few quantitative studies on the prediction of various chemical components of glass cultural relics before weathering. Zou Ying [12] established a multivariate time series model to predict the contents of various chemical components of glass cultural relics before weathering.

Different from the methods of other scholars, this paper combined statistical correlation analysis and linear regression method to propose a strategy of data distribution matching. Assuming that there are two populations with normal distribution before and after glass weathering, the mean value and standard deviation of these two populations are used to build a prediction model for the contents of various chemical components of glass relics before weathering.

In order to better demonstrate the research idea of this paper, Figure 1 is used to show the general framework of this paper.

At the same time, the symbolic descriptions are shown in Table 1.

2. Data Source and Processing

Using chemical composition analysis and other detection methods, archaeologists collected data on a number of ancient Chinese glass relics (Problem C of the 2022 National College Students Mathematical Contest in Modeling), including the classification of each relic and the proportions of its main components. However, because of the detection methods and other factors, some data are missing or zero, and for most samples the component proportions do not sum exactly to 100%. Therefore, we first carry out the necessary pre-processing of the data.

Figure 1. Overall thinking process.

Table 1. Symbols.

2.1. Missing Value and Zero Value Data Correction

Missing values arise mainly because the content of one or several elements in a sample is very low and does not reach the detection limit; omissions during manual data entry cannot be excluded either. There are three commonly used ways of handling missing values: 1) directly delete the sample or variable containing the missing value; 2) assign an arbitrary value below the detection limit to the missing value; 3) estimate the missing value from a component analysis of the associated samples (e.g., likelihood estimation). Because the original data are component data and the missing values are too numerous and too scattered, neither direct deletion nor assigning arbitrary values below the detection limit is suitable. Therefore, we estimate the missing values using the component analysis based on the associated samples. Specifically, the number of missing components is counted for each sample, and the missing data are filled in by the expected maximum likelihood estimation method combined with the ideal condition that the component proportions sum to 100%.

The absence of an element in the sample or the amount of an element that does not reach the detection limit may result in a zero value in the recorded data. Considering the small number of zero values of the original data in the sample and the scattered distribution, we choose to modify the zero values recorded in the original data to a small positive number (0.001), which is convenient for subsequent data processing and modeling calculation.
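The zero-value correction described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `replace_zeros` and the toy array are assumptions, and only the replacement value 0.001 comes from the text.

```python
import numpy as np

def replace_zeros(data, eps=0.001):
    """Return a copy of `data` in which exact zeros are replaced by `eps`.

    `eps` = 0.001 is the small positive number stated in the text; the
    function name and interface are hypothetical.
    """
    arr = np.asarray(data, dtype=float).copy()
    arr[arr == 0.0] = eps
    return arr

# Made-up percentages for two samples and three components.
sample = np.array([[65.2, 0.0, 9.9],
                   [0.0, 20.1, 0.5]])
cleaned = replace_zeros(sample)
```

Replacing zeros before any log-ratio step matters because the logarithm of zero is undefined.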

2.2. Correction of Composition Data

The component data are non-negative, and the element contents of each sample sum to 1 [13] , that is, $W_{N \times D} = \{ W_{ij} \ge 0 \mid i = 1, 2, \ldots, N;\ j = 1, 2, \ldots, D \}$ and $W_{i1} + W_{i2} + \cdots + W_{iD} = 1\ (i = 1, 2, \ldots, N)$. Therefore, if the cumulative sum of the component data is biased, the correlation analysis between elements will be biased as well, which in turn affects the covariance and correlation matrices between components [14] .

The data studied in this paper are the content proportions of each chemical component, that is, component data, which should in theory satisfy the constraint that they sum to 1. However, the detection methods cause the accumulated proportions to deviate from 100%, so the deviation must be corrected. If the cumulative component content of a sample in the original data is between 85% and 105%, the deviation from the fixed sum is considered small, and the sample is valid and can be corrected. If the cumulative component content falls outside the range of 85% to 105%, the sample is considered invalid and cannot be corrected. For valid samples, the sum is fixed according to formula (1):

$$x'_{ij} = \frac{x_{ij}}{\sum_{j=1}^{m} x_{ij}} \qquad (1)$$

where $x_{ij}\ (i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m)$ is the original value of indicator j for sample i, and $x'_{ij}$ is the converted value. After conversion, the component proportions of each sample sum to 100%.
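A short sketch of the validity check and the fixed-sum correction of formula (1) is given below. The function name `close_composition` and the toy data are assumptions, and the 85% and 105% bounds are treated as inclusive, which the text does not state explicitly.

```python
import numpy as np

def close_composition(data, low=85.0, high=105.0):
    """Keep rows whose sum lies in [low, high] and rescale them to sum to 1.

    Returns the closed rows and a boolean mask marking the valid samples.
    Inclusive bounds are an assumption.
    """
    arr = np.asarray(data, dtype=float)
    totals = arr.sum(axis=1)
    valid = (totals >= low) & (totals <= high)
    # Formula (1): divide each component by the row total.
    closed = arr[valid] / totals[valid, None]
    return closed, valid

# Made-up percentages for three samples.
raw = np.array([
    [60.0, 20.0, 10.0],   # sums to 90  -> valid
    [50.0, 20.0, 10.0],   # sums to 80  -> invalid
    [70.0, 25.0, 10.0],   # sums to 105 -> valid
])
closed, valid = close_composition(raw)
```

Multiplying `closed` by 100 gives the corrected proportions in percent.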

The descriptive statistics of the corrected data are shown in Table 2. It can be seen from Table 2 that the skewness coefficients of the component indicators are all greater than 0, indicating right-skewed distributions with a longer tail to the right of the mean. The skewness coefficient of SiO2 is close to 0, indicating that the SiO2 data are close to symmetrically distributed. The kurtosis coefficients of two component indicators are less than 0, indicating distributions that are flatter at the top or thinner in the tails than the normal distribution. The kurtosis coefficients of the remaining components are greater than 0, indicating distributions that are sharper at the top or heavier in the tails than the normal distribution. In general, the contents of most components do not conform to the normal distribution.

2.3. Ratio Transformation of Component Data

Since ratios of component variables are not subject to the “constant sum” constraint, and the logarithm of a ratio usually follows an approximately normal distribution, this paper considers the additive log-ratio transformation (ALR) and the central log-ratio transformation (CLR) [15] , so that the component data are approximately multivariate normal after transformation. Considering that the components of the ALR-transformed vector cannot correspond one-to-one with those of the original vector, the central log-ratio transform [16] is adopted:

Let the component vector be $x = (x_1, x_2, \ldots, x_D)$ with $x_i > 0$ and $\sum_{i=1}^{D} x_i = 1$, and let

$$y_i = \ln \frac{x_i}{\sqrt[D]{\prod_{j=1}^{D} x_j}}, \quad i = 1, 2, \ldots, D. \qquad (2)$$

This transformation is called the central log-ratio transformation (CLR).

The corresponding inverse transformation is:

$$x_i = \frac{\exp(y_i)}{\sum_{j=1}^{D} \exp(y_j)}, \quad i = 1, 2, \ldots, D. \qquad (3)$$

The transformed component corresponds to the original component one by one, and further enhances the interpretability of the variable. In addition, the central log-ratio transformation can make the data more stable, reduce the influence of outliers, and reflect the real situation of the data more accurately.
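The transformation (2) and its inverse (3) can be sketched in a few lines of NumPy. The function names `clr` and `clr_inverse` and the example vector are illustrative assumptions; the logarithm of the geometric mean is computed as the mean of the logarithms.

```python
import numpy as np

def clr(x):
    """Central log-ratio transform, formula (2): ln(x_i / geometric mean of x)."""
    log_x = np.log(np.asarray(x, dtype=float))
    # ln of the geometric mean equals the arithmetic mean of the logs.
    return log_x - log_x.mean(axis=-1, keepdims=True)

def clr_inverse(y):
    """Inverse transform, formula (3): exp(y_i) / sum_j exp(y_j)."""
    y = np.asarray(y, dtype=float)
    # Subtracting the row maximum avoids overflow and does not change the ratio.
    e = np.exp(y - y.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

x = np.array([0.70, 0.20, 0.10])   # made-up composition summing to 1
y = clr(x)
x_back = clr_inverse(y)
```

The CLR coordinates of a sample always sum to zero, and applying the inverse recovers the original composition.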

Let $x_1, x_2, \ldots, x_{14}$ denote the chemical components SiO2, Na2O, …, SO2 of the glass relics. According to formula (2), for the component vector $x = (x_1, x_2, \ldots, x_{14}) \in S^{14}\ (x_i > 0)$, the data transformed by CLR are:

$$y = (y_1, y_2, \ldots, y_{14}) = \left( \ln \frac{x_1}{\sqrt[14]{\prod_{j=1}^{14} x_j}}, \ldots, \ln \frac{x_{14}}{\sqrt[14]{\prod_{j=1}^{14} x_j}} \right) \qquad (4)$$

Table 2. Descriptive statistics of data after modification.

According to formula (4), the CLR-transformed content data of high potassium and lead barium glass can be calculated. To check that the assumption of normally distributed data is reasonable, it is necessary to test whether the data conform to a normal distribution. There are many normality tests, including the Shapiro-Wilk test, the Kolmogorov-Smirnov test, and the skewness and kurtosis tests [17] . In this paper, normality is tested by the skewness and kurtosis test. In statistics, skewness describes the symmetry of a distribution [18] , and kurtosis characterizes how peaked the probability density curve is near the mean. The skewness and kurtosis of the data can be calculated with statistical software. Limited by space, only the skewness and kurtosis of the lead-barium glass data are shown (Table 3). The closer the skewness and kurtosis values are to 0, the closer the data are to a normal distribution. In order to intuitively understand the influence of the central log-ratio transformation on the normality of the data, frequency histograms of the original data and of the CLR-transformed data were drawn (Figure 2 and Figure 3). Limited by space, only some frequency histograms for lead-barium glass are shown.

As can be seen from Table 3, Figure 2 and Figure 3, the skewness and kurtosis of the composition content data of lead-barium glass are high, so most component contents in the original data do not conform to the normal distribution. After the central log-ratio transformation, the skewness and kurtosis of most component content data are reduced, and the processed data are closer to a normal distribution.
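As an illustration of the quantities in Table 3, the sketch below computes sample skewness and excess kurtosis from central moments. It uses the simple (biased) moment estimators, which is an assumption: statistical packages often apply bias corrections, so their values can differ slightly. The function names and toy data are hypothetical.

```python
import numpy as np

def skewness(v):
    """Third standardized moment (biased estimator); 0 for symmetric data."""
    d = np.asarray(v, dtype=float)
    d = d - d.mean()
    return (d ** 3).mean() / (d ** 2).mean() ** 1.5

def excess_kurtosis(v):
    """Fourth standardized moment minus 3, so a normal distribution gives 0."""
    d = np.asarray(v, dtype=float)
    d = d - d.mean()
    return (d ** 4).mean() / (d ** 2).mean() ** 2 - 3.0

symmetric = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
right_skewed = np.array([1.0, 1.0, 1.0, 2.0, 10.0])

skew_sym = skewness(symmetric)
kurt_sym = excess_kurtosis(symmetric)
skew_right = skewness(right_skewed)
```

Applying these two functions to each component column before and after the CLR transform gives the kind of comparison summarized in Table 3.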

Table 3. Test table of skewness kurtosis of lead-barium glass.

Figure 2. Histogram of the frequency of Al2O3, CuO and BaO of the unweathered lead barium population.

Figure 3. Histogram of the frequency of CaO, Al2O3 and CuO of the weathered lead barium population.

3. Model Design

It is assumed that the glass relics weathered under natural conditions and that the chemical composition is measured accurately. First, according to the data characteristics of the chemical components of glass relics, the Pearson correlation coefficient is used to study the correlation between chemical components, and the degree of correlation and its differences between glass types are discussed. Secondly, linear regression is used to examine the influence of weathering state on the content of each chemical component. Finally, the prediction model for the chemical composition of glass relics before weathering is constructed using the idea of distribution matching.

3.1. Correlation Analysis of Chemical Components of Glass Relics

The “closure effect” caused by the constant sum biases the analysis of correlations between variables. Therefore, the Pearson correlation coefficients between the variables of high potassium and lead barium glass are calculated using the CLR-transformed data. The calculation formula is as follows:

$$\rho_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\, \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} \qquad (5)$$

where $x_i$ and $y_i$ are the observed values of two chemical composition variables of the glass relics, $\bar{x}$ and $\bar{y}$ are the means of the two variables, and n is the sample size. The closer the result is to 1, the stronger the positive correlation between the two variables; the closer it is to −1, the stronger the negative correlation. From the results of equation (5) we obtained heat maps of the correlation coefficients for high-potassium and lead-barium glasses, shown in Figure 4 and Figure 5. The results show that there is a high correlation between the glass type and the proportion of some chemical components, and that the correlation between chemical components differs between glass types. For example, for high potassium glass, SiO2 is positively correlated with K2O, CaO, MgO, Al2O3, CuO, P2O5 and SnO2, and negatively correlated with Na2O and SrO. SiO2 has a very significant positive correlation with Al2O3 and a very significant negative correlation with Na2O. For lead-barium glass, SiO2 has a very significant positive correlation with Na2O and Al2O3, and a very significant negative correlation with P2O5 and SrO. It can also be concluded that for high potassium glass SiO2 is positively correlated with P2O5 and negatively correlated with Na2O, while for lead barium glass the opposite holds. Therefore, the correlations between variables differ considerably between glass types.
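Equation (5) can be written out directly, as in the sketch below, and checked against NumPy's built-in `np.corrcoef`. The function name `pearson` and the toy vectors are assumptions; in the paper's setting the inputs would be two columns of CLR-transformed data.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient, written out as in equation (5)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# Made-up values standing in for two CLR-transformed components.
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([2.1, 3.9, 6.2, 7.8])   # increases with a
c = -a                                # exact negative linear relation

r_ab = pearson(a, b)
r_ac = pearson(a, c)
```

For a full correlation matrix of all components, `np.corrcoef(data, rowvar=False)` applied to the CLR-transformed sample matrix gives the same coefficients pairwise.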

Figure 4. Heat map of correlation coefficient of each chemical composition of high potassium glass. Note: In Figure 4, red represents positive correlation, blue represents negative correlation, the size of the circle represents the absolute value of the correlation coefficient, and an asterisk on the circle indicates significance: one asterisk means significant, and two asterisks mean very significant.

Figure 5. Heat map of correlation coefficient of chemical composition of lead barium glass. Note: In Figure 5, red represents positive correlation, blue represents negative correlation, the size of the circle represents the absolute value of the correlation coefficient, and an asterisk on the circle indicates significance: one asterisk means significant, and two asterisks mean very significant.

3.2. Analysis of the Relationship between Weathering State and the Content of Chemical Components of Glass Relics

Since the weathering state takes two values, weathered and unweathered, it is a categorical variable, whereas the chemical composition content of glass relics is a continuous variable; linear regression analysis can therefore be carried out by introducing a dummy variable. With the unweathered state as the reference level and the weathering state as the dummy variable, linear regression models of each chemical component content on weathering state were established for high-potassium glass and lead-barium glass from the CLR-transformed data. The results are shown in Table 4.
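The dummy-variable regression can be sketched with ordinary least squares as follows. This is an illustration under assumptions, not the authors' procedure: the function name, variable names and numbers are made up, and no significance test is computed. With a single 0/1 dummy, the intercept equals the mean of the unweathered group and the slope equals the difference between the weathered and unweathered means.

```python
import numpy as np

def dummy_regression(values, weathered):
    """Fit values = b0 + b1 * d by least squares, where d = 1 for weathered
    samples and d = 0 for the unweathered reference level."""
    values = np.asarray(values, dtype=float)
    d = np.asarray(weathered, dtype=float)
    design = np.column_stack([np.ones_like(d), d])   # intercept and dummy
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    return coef[0], coef[1]

# Made-up CLR values of one component for six samples.
clr_values = np.array([2.0, 2.2, 2.1, 3.0, 3.2, 3.1])
is_weathered = np.array([0, 0, 0, 1, 1, 1])

intercept, slope = dummy_regression(clr_values, is_weathered)
```

A positive slope means the CLR value of the component is higher in weathered samples than in unweathered ones, and a negative slope means it is lower.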

The results show that weathering affects the contents of the chemical components differently for the two glass types. 1) For high potassium glass, weathering significantly changes the proportions of SiO2 and Na2O (p values reported as 0.000, below 0.05) and changes the proportion of CuO to a certain extent. According to the first-order coefficient of the linear model, surface weathering increases the proportions of SiO2 and CuO and significantly decreases the proportion of Na2O. 2) For lead-barium glass, weathering significantly changes the proportions of SiO2, Na2O, K2O, CaO, Al2O3, PbO and P2O5 (p values less than 0.05). According to the first-order coefficient of the linear model, surface weathering decreases the proportions of SiO2, Na2O, K2O and Al2O3 and increases the proportions of CaO, PbO and P2O5.

Table 4. Linear regression results of high potassium and lead barium glasses.

3.3. Construction of a Prediction Model for the Contents of Each Chemical Component of Glass Relics before Weathering

Because paired measurements of the same relic before and after weathering are lacking, predicting the chemical composition of glass before weathering from the measurements of weathered points and unweathered relics requires analyzing the statistical regularities of the weathered and unweathered populations and predicting through the relationship between the two. The idea of distribution matching can therefore be used. Since ratios of component variables are not subject to the “fixed sum” constraint and the logarithm of a ratio usually follows an approximately normal distribution, it is assumed that the CLR-transformed data are normally distributed. Consider two populations, the weathered glass population $X \sim N(\mu_x, \sigma_x^2)$ and the unweathered glass population $Y \sim N(\mu_y, \sigma_y^2)$. The CLR-transformed data were divided into four classes: high potassium weathered (6), high potassium unweathered (14), lead barium weathered (26) and lead barium unweathered (23). Here, a measurement taken at an unweathered point of a weathered sample is treated as unweathered; for example, the unweathered sampling point of lead-barium sample 49 is assigned to the unweathered class. The mean and standard deviation of each chemical component content were then computed separately for the weathered and unweathered glass relics.

According to the weathered data $CLR(X)$, the pre-weathering data of high potassium and lead barium glass were calculated by the formula $CLR(Y) = \mu_y + \frac{\sigma_y}{\sigma_x} \left( CLR(X) - \mu_x \right)$. The inverse transformation formula (3) of the central log-ratio transformation is then used to restore the unweathered component data (the result is multiplied by 100). The predicted contents of each chemical component of high-potassium glass and lead-barium glass before weathering were calculated, as shown in Table 5 and Table 6.
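The whole prediction step can be sketched as below: standardize each CLR component with the weathered mean and standard deviation, rescale with the unweathered ones, and map back with formula (3). Several details are assumptions made for illustration: the function names, the synthetic Dirichlet data, and the use of the sample standard deviation (`ddof=1`), which the text does not specify.

```python
import numpy as np

def clr(x):
    """Central log-ratio transform, formula (2)."""
    log_x = np.log(np.asarray(x, dtype=float))
    return log_x - log_x.mean(axis=-1, keepdims=True)

def clr_inverse(y):
    """Inverse transform, formula (3)."""
    y = np.asarray(y, dtype=float)
    e = np.exp(y - y.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def predict_unweathered(x_weathered, weathered_ref, unweathered_ref):
    """Distribution matching: CLR(Y) = mu_y + sigma_y / sigma_x * (CLR(X) - mu_x).

    Means and standard deviations are taken per component over the two
    reference groups. Returns predicted compositions in percent.
    """
    cx = clr(weathered_ref)
    cy = clr(unweathered_ref)
    mu_x, sd_x = cx.mean(axis=0), cx.std(axis=0, ddof=1)
    mu_y, sd_y = cy.mean(axis=0), cy.std(axis=0, ddof=1)
    matched = mu_y + sd_y / sd_x * (clr(x_weathered) - mu_x)
    return 100.0 * clr_inverse(matched)

# Synthetic three-component compositions, used only to exercise the code.
rng = np.random.default_rng(0)
weathered = rng.dirichlet([8.0, 1.0, 1.0], size=6)
unweathered = rng.dirichlet([6.0, 2.0, 2.0], size=14)

predicted = predict_unweathered(weathered, weathered, unweathered)
# Sanity check: matching a group to itself should leave it unchanged.
unchanged = predict_unweathered(weathered, weathered, weathered)
```

Because formula (3) renormalizes, each predicted row sums to 100 even though the matched CLR vector need not sum exactly to zero.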

In order to intuitively understand the change trend of the predicted data, the calculation results are visualized in Figure 6 and Figure 7. As can be seen from Figure 6, weathering causes the proportions of SiO2, CuO and P2O5 in high-potassium glass to increase and the proportions of Na2O, CaO and K2O to decrease. As can be seen from Figure 7, weathering causes the proportions of SiO2 and Al2O3 in lead-barium glass to decrease and the proportions of CuO, CaO and PbO to increase, while BaO does not change significantly before and after weathering. These trends are highly consistent with the linear regression results above for weathering state and chemical composition, indicating that the constructed prediction model predicts well.

Figure 6. Visualization of composition prediction of high-potassium glass before weathering (part).

Table 5. Prediction data of high-potassium glass before weathering.

Table 6. Prediction data of lead-barium glass before weathering.

Figure 7. Visualization of composition prediction of lead-barium glass before weathering (part).

4. Conclusion

In this paper, a distribution matching prediction model is proposed. Firstly, the correlation between the chemical components of the glass relics is analyzed using the Pearson correlation coefficient, and the correlations are compared between glass types. The correlations between variables differ considerably between glass types. For high potassium glass, SiO2 is positively correlated with K2O, CaO, MgO, Al2O3, CuO, P2O5 and SnO2, and negatively correlated with Na2O and SrO. For lead-barium glass, SiO2 is positively correlated with Na2O and Al2O3, and negatively correlated with P2O5 and SrO. It can also be concluded that for high potassium glass SiO2 is positively correlated with P2O5 and negatively correlated with Na2O, while for lead barium glass the opposite holds. Secondly, the influence of weathering state on the content of each chemical component of the different glass types was investigated by linear regression analysis. Weathering affects the component contents of the two glass types to different degrees. For high-potassium glass, weathering increases the proportions of SiO2, CuO and P2O5 and decreases the proportions of Na2O, CaO and K2O. For lead-barium glass, weathering decreases the proportions of SiO2 and Al2O3 and increases the proportions of CuO, CaO and PbO. Finally, the prediction model for the content of each chemical component of glass relics before weathering is constructed. The model predictions are highly consistent with the results of the linear regression analysis, which indicates that the prediction model based on distribution matching proposed in this paper has good validity.

Acknowledgements

This paper is the research result of the project “Multi-field coupling numerical simulation of the formation process of hidden ore body”. Project number: 2021AC19224.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Gan, F.X. (2006) Ancient Silk Road and Ancient Chinese Glass. Chinese Journal of Nature, 28, 253-260.
[2] Wang, Y.J. and Zheng, X.D. (2022) Research on the Conservation and Management Strategies of Cultural Relics in Archaeological Excavations. Chinese National Expo, 6, 183-185.
[3] Gu, L.L. and Zhou, J. (2021) Some Questions about the Study of Ancient Chinese Glass Art. Art and Design, 2, 112-114.
[4] Cho, T., Mantani, H., Ohta, T. and Li, G. (2019) Evaluation of Cretaceous Hinterland Weathering and Climate in the Sichuan Basin, SW China. Open Journal of Geology, 9, 696-699.
https://doi.org/10.4236/ojg.2019.910078
[5] Tognonvi, M., Tagnit-Hamou, A., Konan, L., Zidol, A. and N’Cho, W. (2020) Reactivity of Recycled Glass Powder in a Cementitious Medium. New Journal of Glass and Ceramics, 10, 29-44.
https://doi.org/10.4236/njgc.2020.103003
[6] Gan, F.X., Li, Q.H., Gu, D.H., et al. (2003) Study on Early Glass Beads Unearthed from Baicheng and Tacheng, Xinjiang. Journal of the Chinese Ceramic Society, 31, 663-668.
[7] Li, Q.H., Zhang, B., Cheng, H.S., et al. (2003) Application of Proton Excited X-Ray Fluorescence Technique to the Composition Analysis of Ancient Chinese Glass. Journal of the Chinese Ceramic Society, 31, 950-954.
[8] Zhao, H.X., Li, Q.H., Gan, F.X., et al. (2007) Proton-Stimulated Fluorescence Analysis of Han Dynasty Ancient Glass Unearthed in Hepu Area of Guangxi. Nuclear Techniques, 30, 27-33.
[9] Cao, Y.X. and Sui, G.R. (2023) Study on the Prediction and Weathering Degree Measurement Model of Ancient Glass Based on GA-BP. Software Engineering, 26, 12-16.
[10] Shi, B.M., Zhao, X., Wang, H.T., et al. (2023) Category Prediction of Ancient Glassware Based on Multi-Layer Perceptron Networks. Journal of Lanzhou University of Arts and Sciences (Natural Science Edition), 37, 57-62.
[11] Lv, F., Fu, H.W. and Liu, C.L. (2023) Composition Analysis and Identification of Ancient Glass Products Based on Machine Learning. Information & Computer, 4, 98-102.
[12] Zou, Y. (2023) Molecular-Composition Analysis of Glass Chemical Composition Based on Time-Series and Clustering Methods. Molecules, 28, Article 853.
https://doi.org/10.3390/molecules28020853
[13] Ferrers, N.M. (1866) An Elementary Treatise on Trilinear Coordinates. Macmillan, London, 31-35.
[14] Zhou, D. (1998) Statistical Analysis of Geological Composition Data—Difficulty and Exploration. Earth Science, 23, 147-152.
[15] Aitchison, J. (1986) The Statistical Analysis of Compositional Data. Chapman and Hall, London, 66-72.
[16] Fu, Y.L. (2019) Study on Composition Data Processing Methods: A Case Study of 1:200,000 Lithic Geochemical Data in Jianshan-Pingkouxia Area, Gansu Province. Ph.D. Thesis, Chang’an University, Xi’an.
[17] Yang, B. (2015) Comparison of Several Methods of Normality Test. Statistics and Decision, 14, 72-74.
[18] Lu, Y.L., Huang, C.M., Liu, C.W., et al. (2018) In-Door Positioning Algorithm of Wireless Local Area Networks Position Fingerprint Based on Skewness-Kurtosis Test. Science Technology and Engineering, 18, 1-6.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.