Clothing Sales Forecast Considering Weather Information: An Empirical Study in Brick-and-Mortar Stores by Machine-Learning

Jieni Lv; Shuguang Han; Jueliang Hu

doi:10.4236/jtst.2023.91001

Journal of Textile Science and Technology > Vol.9 No.1, February 2023

Jieni Lv¹, Shuguang Han²*, Jueliang Hu²
¹School of Fashion Design and Engineering, Zhejiang Sci-Tech University, Hangzhou, China.
²School of Science, Zhejiang Sci-Tech University, Hangzhou, China.

Abstract

Reliable sales forecasts are important to the garment industry. In recent years, the global climate has been warming, the weather has changed frequently, and clothing sales have been affected by weather fluctuations. The purpose of this study is to investigate whether weather data can improve the accuracy of clothing sales forecasts and to establish a corresponding clothing sales forecasting model. The model uses the basic attributes of clothing products, historical sales data, and weather data. It combines random forest, XGB, and GBDT learners through a stacking strategy. We found that weather information is not useful for basic clothing sales forecasts, but it did improve the accuracy of seasonal clothing sales forecasts. The MSE for dresses, down jackets, and shirts was reduced by 86.03%, 80.14%, and 41.49% on average, respectively. In addition, we found that the stacking strategy model outperformed the voting strategy model, with an average MSE reduction of 49.28%. Clothing managers can use this model to forecast sales when they make sales plans based on weather information.

Keywords

Clothing Retail, Sales Forecasting, Weather, Machine-Learning, Stacking

Cite as:

Lv, J., Han, S. and Hu, J. (2023) Clothing Sales Forecast Considering Weather Information: An Empirical Study in Brick-and-Mortar Stores by Machine-Learning. Journal of Textile Science and Technology, 9, 1-19. doi: 10.4236/jtst.2023.91001.

1. Introduction

With fierce competition in the retail market, reasonable and reliable sales forecasts are of great significance to the clothing industry [1]. They are the premise for enterprises to control production scientifically and can effectively prevent inventory backlogs. This not only reduces the waste of resources but also improves the profit rate and overall benefit of enterprises, and it is a positive response to national energy conservation and emission reduction.

Unlike other products, clothing is strongly driven by fashion and seasonality, with a short product life cycle and a long production lead time. These factors make it more difficult to forecast the sales of clothing products. In fashion clothing, ZARA and other fast fashion companies constantly adjust the sales quantity according to actual demand to reduce inventory risk [2]. Although inventory losses have improved somewhat, some risks remain difficult to predict and mitigate, and one of the main uncertain factors is weather. Shoppers may take advantage of good weather to participate in outdoor activities and postpone or abandon shopping in physical stores; they may also stay at home in bad weather and shop through online channels instead of physical stores. The Swedish fashion chain H&M attributed its profit decline in 2015 to the unusually warm winter; Gap's poor sales in July 2016 were attributed to unfavorable weather conditions, and reduced spring demand at the end of the following year was attributed to cold and humid weather [3]. Our research focuses on the apparel industry, especially the impact of weather information on apparel sales. While reusing products, recycling, and closed-loop supply chain management can help companies mitigate inventory overflows caused by weather changes, it is expected that this problem will be solved at its source in the future. A clothing sales forecast that considers weather information can control the inventory quantity before production and manufacturing, alleviate overstock at the source, and prevent the overproduction that results from weather-driven demand fluctuations.

Generally, weather has a significant impact on clothing sales [4]. In addition to referring to historical sales transaction data, clothing retailers should consider the influence of weather on sales. At present, most scholars focus on the influence of weather on consumer behavior and psychology or the influence of weather on enterprises’ mid- and long-term strategic decisions. Few scholars have focused on the effects of weather on short-term and retail sales.

Given the above analysis, this study establishes an ensemble (integrated) learning model based on the stacking strategy. The model takes weather information as an external variable and combines it with product attributes and historical sales data as input features, to improve the accuracy and generalization of the clothing sales forecasting model.

The remainder of this paper is organized as follows. Section 2 reviews related research on weather in sales forecasting and the research progress of machine learning algorithms in forecasting models. Section 3 describes the proposed model. Section 4 introduces and explores the data used in our study, and analyzes and quantifies the impact of weather information on clothing sales. Section 5 introduces the evaluation criteria of the model application, the research results obtained, and a discussion of these results. A summary and prospects for this work are presented in the final section.

2. Literature Review

Our study extends research on the influence of weather information on clothing sales forecasting, mainly by discussing the influence of weather on different clothing categories. To better understand this research problem, we provide a literature review of two aspects. First, we discuss the existing literature on the impact of weather on sales and present factors that might link weather to physical store sales. Then we review some popular algorithms and models in clothing sales forecasting and analyze the advantages and disadvantages of each.

2.1. Research Progress on the Influence of Weather on Sales Forecast

In recent years, with frequent weather changes and global warming, many impacts of climate change have become visible; for instance, financial losses caused by increasingly frequent and severe extreme weather events are rising continuously. Some scholars have begun to study the impact of weather on retail sales, with the aim of proposing actions and solutions that increase organizations' adaptive and absorptive capacities in the face of the adverse effects and risks of climate change.

Parsons [5] found that temperature and rainfall affect people's offline shopping behavior, and thus retail sales. In a similar study, Steinker et al. [6] found that temperature, rain, and sunshine also significantly influence the daily sales of online retailers. They found that including weather data in the sales forecasting model could reduce sales forecasting errors by 8.6% to 12.2% on average and by 50.6% during summer weekends. When studying the impact of weather on sales, Martinez-de-Albeniz and Belkaid [3] took into account additional factors such as location, season, and product category, and showed with case data that retailers could increase revenue by 2% through weather-based pricing. Oh et al. [7] used Google search data for winter jackets (WGT) to build a generalized linear mixed model (GLMM) of seasonal clothing demand, and decomposed the linear relationship between time-lagged wind chill data and the monthly WGT into random effects over many years. GLMM is an extension of the ordinary linear model in which the dependent variable need only follow a distribution from the exponential family, which widens the range of weather data that can be used. Babongo et al. [8] developed and trained a generalized additive model (GAM) to predict the demand for the next season using a ten-year data set of winter sports equipment sales in Switzerland and Finland combined with meteorological data; the results showed that the prediction error was reduced by 45% when the weather data of the previous season were included. Rose and Dolega [9] adopted a random forest (RF) model when quantifying the impact of weather variables on retailers' daily sales. The model predicts the amount of goods sold and quantifies the impact of weather variables. Many weather variables affect sales, and the relationships between them are complicated, so the nonlinear relationships are difficult to parameterize. Compared with the GAM, the random forest model copes better with this difficulty. As a classical machine learning method, the random forest model can process high-dimensional data and rank the importance of variables; it is fast and not prone to overfitting. Choi et al. [10] studied consumers' different purchasing behaviors in response to weather changes, introduced a simple demand prediction model, and applied it to the newsvendor (newsboy) model to calculate the optimal demand level that minimizes cost, including the cost of weather risks. Boada-Collado and Martinez-de-Albeniz [11] developed a stochastic coefficient model that accounts for non-linear effects and seasonality using different weather parameters, and found that the size of the impact of weather on sales depends on store location and sales theme.

From the literature, we find that most research on sales forecasting uses aggregated data (monthly or quarterly). The weather measure is usually the result of averaging multiple weather data points, which greatly underestimates the impact of individual weather types. Some studies include weather only as a covariate in the demand model, but do not further quantify the impact of weather on the sales of different clothing categories. Most scholars study the linear relationship between weather and sales, and most use traditional statistical models.

2.2. Research Progress on the Sales Forecasting Model

As researchers have deepened and improved the basic theory of forecasting, forecasting models have become increasingly abundant. They fall mainly into two categories: traditional statistical methods that rely on the characteristics of time series [12], and artificial intelligence heuristic algorithms that rely on large-scale historical data [13]. Zhou et al. [14] proposed an improved Bass model for fast-fashion clothing demand forecasting based on the influence of consumer preference and seasonality on demand. The Bass model presupposes that the market potential and performance of the product remain unchanged over its life cycle and that the product is not innovated. Fashion retail products do not live long enough for both innovators and imitators to emerge; therefore, this approach is not suitable for fashion retail products with a short life cycle [15]. The autoregressive integrated moving average (ARIMA) [16] and seasonal autoregressive integrated moving average (SARIMA) [17] approaches are simple and intuitive, and can be used for quick clothing demand forecasting. However, these time-series-based methods are insufficient on their own because the demand for fashion products depends on other factors, such as price and the demand for other related products [18].

Based on this argument, many scholars use combined models to overcome the inability of a single algorithm to capture the composite features of a time series, such as Prophet-LSTM [19], ARIMA-LSTM [20], and ARIMA-NARNN [21]. For the case of stock shortage, Huang and Liu [22] adopted an adaptive neuro-fuzzy inference method to establish a two-stage intelligent retail prediction system for new clothing products. Loureiro et al. [23] forecast the future sales volume of a single SKU, evaluated the performance of a DNN, and compared it with the performance of four shallow data mining regression techniques. Fallah Tehrani and Ahrens [24] put forward a probabilistic method to identify fashion product categories in sales. They combined a kernel machine with a probabilistic approach to enhance the performance of the kernel machine, and finally used it to predict the sales volume. Tichy et al. [25] proposed a model based on natural language that uses linguistic fuzzy IF-THEN rules to establish a causal relationship between average temperature and quarterly sales, allowing complex linguistic analysis of sales based on possible weather. Such rules are easy for company management to understand, the model's predictions are highly interpretable, and the approach depends relatively little on the data set.

Faced with highly complex problems and large data sets, an increasing number of technologies, such as deep learning and machine learning, have been used for prediction in recent years. We build on this work. Our goal is to introduce weather information on top of historical sales data, quantify the impact of weather information on clothing sales forecasts for a single product category in multiple stores, and establish an ensemble learning model based on a stacking strategy to improve the accuracy and generalization of forecasts.

3. Model Establishment

3.1. Introduction of Stacking Model

There are two commonly used model fusion strategies: voting fusion and stacking fusion. Model fusion is similar to ensemble modeling: the aim is to exploit the advantages of different models as far as possible to produce more reliable results and improve the overall robustness of the model. Voting fusion includes mean fusion, which averages multiple groups of prediction results, and weighted fusion, which assigns different weights to different prediction results and then sums them. Stacking, on the other hand, is relatively complex. It first uses the base learners to fit the data in the training set and generate the first-layer models, and then uses the predictions generated by the first-layer base learners as the input of the second layer [26]. Unlike voting fusion, it can be regarded as a heterogeneous ensemble that combines the strengths of different algorithms. The specific process is illustrated in Figure 1.
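The two voting variants described above can be sketched in a few lines. This is only an illustration: the function names, the three prediction lists, and the weights are hypothetical and are not taken from the study.

```python
def mean_fusion(predictions):
    """Mean fusion: average the predictions of several models, sample by sample."""
    n_models = len(predictions)
    return [sum(values) / n_models for values in zip(*predictions)]


def weighted_fusion(predictions, weights):
    """Weighted fusion: weighted sum of model predictions.

    The weights are normalised so that they sum to 1.
    """
    total = sum(weights)
    norm = [w / total for w in weights]
    return [sum(w * v for w, v in zip(norm, values)) for values in zip(*predictions)]


# Hypothetical daily sales predictions from three models (four days each).
preds = [
    [10.0, 12.0, 9.0, 14.0],
    [11.0, 13.0, 8.0, 15.0],
    [12.0, 11.0, 10.0, 16.0],
]
mean_pred = mean_fusion(preds)
weighted_pred = weighted_fusion(preds, weights=[2, 1, 1])
```

In both cases the combination rule is fixed in advance, which is the main difference from stacking, where the combination is itself learned from data.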

Figure 1. Specific flow chart of Stacking algorithm.

Stacking algorithm:

Assume that the original data set $A$ has $n$ features, one label, and $m$ rows of data. We divide the data into three parts: a training set of $a$ rows, a hold-out (validation) set of $b$ rows, and a test set of $c$ rows, where $a + b + c = m$. The three sets satisfy $U_{\mathrm{train}} \cup U_{\mathrm{hold\_out}} \cup U_{\mathrm{test}} = A$ and are pairwise disjoint: $U_{\mathrm{train}} \cap U_{\mathrm{hold\_out}} = U_{\mathrm{train}} \cap U_{\mathrm{test}} = U_{\mathrm{hold\_out}} \cap U_{\mathrm{test}} = \emptyset$ .

Original data set $A = \begin{pmatrix} x_{11} & x_{12} & \dots & x_{1n} & y_{1} \\ x_{21} & x_{22} & \dots & x_{2n} & y_{2} \\ \vdots & \vdots & & \vdots & \vdots \\ x_{m1} & x_{m2} & \dots & x_{mn} & y_{m} \end{pmatrix}$ ,

$A = \{(x_{ij}, y_{i})\}; i = 1, 2, \dots, m; j = 1, 2, \dots, n$

Train Set: $U_{\mathrm{train}} = \{(x_{ij}, y_{i})\}; i = 1, 2, \dots, a; j = 1, 2, \dots, n$

Hold out Set: $U_{\mathrm{hold\_out}} = \{(x_{ij}, y_{i})\}; i = a + 1, a + 2, \dots, a + b; j = 1, 2, \dots, n$

Test Set: $U_{\mathrm{test}} = \{(x_{ij}, y_{i})\}; i = a + b + 1, a + b + 2, \dots, a + b + c; j = 1, 2, \dots, n$

Base Learner: $\xi_{1}, \xi_{2}, \dots, \xi_{T}$

Meta Learner: $\xi_{\mathrm{meta}}$

Step 1 for $t = 1, 2, \dots, T$

do $\xi_{t} \cdot fit(U_{\mathrm{train}})$

end

Step 2 for $t = 1, 2, \dots, T$

do $P_{t} = \xi_{t} \cdot predict(U_{\mathrm{hold\_out}})$

do $H_{t} = \xi_{t} \cdot predict(U_{\mathrm{test}})$

end

Step 3 $P = \cup_{t = 1}^{T} P_{t}$

$H = \cup_{t = 1}^{T} H_{t}$

do $\xi_{\mathrm{meta}} \cdot fit(P)$ (with the hold-out labels as targets)

$\hat{H} = \xi_{\mathrm{meta}} \cdot predict(H)$
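Steps 1 to 3 can be traced in code as follows. This is a minimal sketch using only the Python standard library: the two toy base learners and the two-input linear meta learner are hypothetical stand-ins for the random forest, XGB, GBDT, and Bayesian regression models used in the study, and the data are synthetic.

```python
class MeanLearner:
    """Toy base learner (assumption): always predicts the training-label mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class SlopeLearner:
    """Toy base learner (assumption): least-squares line through the origin."""

    def fit(self, X, y):
        num = sum(row[0] * t for row, t in zip(X, y))
        den = sum(row[0] ** 2 for row in X)
        self.w_ = num / den
        return self

    def predict(self, X):
        return [self.w_ * row[0] for row in X]


class TwoInputLinearMeta:
    """Toy meta learner (assumption): least squares on two inputs, no intercept."""

    def fit(self, P, y):
        s11 = sum(p[0] * p[0] for p in P)
        s12 = sum(p[0] * p[1] for p in P)
        s22 = sum(p[1] * p[1] for p in P)
        b1 = sum(p[0] * t for p, t in zip(P, y))
        b2 = sum(p[1] * t for p, t in zip(P, y))
        det = s11 * s22 - s12 * s12  # 2x2 normal equations, solved by Cramer's rule
        self.w_ = ((b1 * s22 - b2 * s12) / det, (s11 * b2 - s12 * b1) / det)
        return self

    def predict(self, P):
        return [self.w_[0] * p[0] + self.w_[1] * p[1] for p in P]


def stack(base_learners, meta, train, hold_out, X_test):
    X_tr, y_tr = train
    X_ho, y_ho = hold_out
    # Step 1: fit every base learner on the training set.
    for learner in base_learners:
        learner.fit(X_tr, y_tr)
    # Step 2: base-learner predictions on the hold-out set (P_t) and test set (H_t).
    P_cols = [learner.predict(X_ho) for learner in base_learners]
    H_cols = [learner.predict(X_test) for learner in base_learners]
    # Step 3: one row per sample, one column per base learner.
    P = [list(row) for row in zip(*P_cols)]
    H = [list(row) for row in zip(*H_cols)]
    meta.fit(P, y_ho)      # meta learner trained on hold-out predictions and labels
    return meta.predict(H)  # final prediction for the test set


# Synthetic data: y = 2x + 3 for x = 1..12, split 6 / 3 / 3.
X = [[float(x)] for x in range(1, 13)]
y = [2.0 * row[0] + 3.0 for row in X]
train, hold_out = (X[:6], y[:6]), (X[6:9], y[6:9])
X_test, y_test = X[9:], y[9:]
y_hat = stack([MeanLearner(), SlopeLearner()], TwoInputLinearMeta(), train, hold_out, X_test)
```

Neither toy base learner can represent the line with an intercept on its own, but the meta learner recovers it from their hold-out predictions, which illustrates why the base learners should complement one another.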

3.2. Selection of Learners

The two-layer stacking algorithm takes the predictions generated by the first-layer base learners as the input of the second-layer meta-learner; therefore, the choice of learners has a great influence on the stacking algorithm. Only by choosing appropriate base learners and an appropriate meta-learner can the learners complement one another as much as possible [27]. Two points usually require attention. First, the base learners should differ from one another while each being reasonably accurate, with similar levels of predictive performance. Second, the meta-learner should be a simple model with stable and good performance. In this study, random forest, Extreme Gradient Boosting (XGB), and Gradient Boosting Decision Tree (GBDT) were selected as the base learners of the first layer, and Bayesian regression as the meta-learner of the second layer.

Random Forest (RF) is a bagging algorithm; in essence, it combines many weak decision-tree models into a strong model [28]. The model contains many decision trees, each of which is trained on a random sample of the data set, and it finally integrates the outputs of all decision trees. GBDT and XGB both belong to the boosting family, but GBDT can only use CART trees, whereas XGB supports both CART trees and linear classifiers. GBDT uses only the first derivative in optimization, whereas XGB applies a second-order Taylor expansion to the cost function and uses both the first and second derivatives. To determine the best split point, XGB stores the sorted features as a block structure before training and then reuses this structure to reduce the amount of calculation.
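The difference in the use of derivatives can be made concrete with the usual textbook form of the XGB objective. The symbols below ($l$, $f_{k}$, $g_{i}$, $h_{i}$, $\Omega$) follow the standard XGBoost formulation and are not defined elsewhere in this paper. At boosting round $k$, with loss $l$, previous prediction ${\hat{y}}_{i}^{(k - 1)}$, and new tree $f_{k}$, the objective is approximated as

$L^{(k)} \approx \sum_{i = 1}^{n} \left[ l(y_{i}, {\hat{y}}_{i}^{(k - 1)}) + g_{i} f_{k}(x_{i}) + \frac{1}{2} h_{i} f_{k}^{2}(x_{i}) \right] + \Omega(f_{k})$

$g_{i} = \frac{\partial l(y_{i}, {\hat{y}}_{i}^{(k - 1)})}{\partial {\hat{y}}_{i}^{(k - 1)}}, \quad h_{i} = \frac{\partial^{2} l(y_{i}, {\hat{y}}_{i}^{(k - 1)})}{\partial ({\hat{y}}_{i}^{(k - 1)})^{2}}$

where $\Omega(f_{k})$ is a regularization term on the new tree. For the squared error loss $l = \frac{1}{2} {(y_{i} - {\hat{y}}_{i})}^{2}$, this gives $g_{i} = {\hat{y}}_{i}^{(k - 1)} - y_{i}$ and $h_{i} = 1$. GBDT, by contrast, fits each new tree to the negative gradient $- g_{i}$ alone.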

4. Experimental Cases

4.1. Data Sources

4.1.1. Retail Data

The experimental data come from a women's fashion retail brand in Hangzhou and include daily sales transaction data from 2018 to 2019 from the POS systems of all its physical stores in Hangzhou, with more than 100,000 records in total; an example is shown in Table 1. All data were desensitized for commercial reasons. The brand's stores are located on pedestrian streets and in shopping malls, and are divided into regular-price stores and Ole stores. All stores adopt the same sales theme, and the categories of products sold include T-shirts, shirts, dresses, trousers, waistcoats, windbreakers and so on, totaling 19 kinds.

Table 1. Example of daily sales transaction data of a women’s clothing brand in Hangzhou.

4.1.2. Weather Data

The weather data used in this study were obtained from the Hui-Ju Meteorological Data Website (https://hz.hjhj-e.com/home) for the Hangzhou area at daily granularity. Twelve weather variables were collected, including average temperature, maximum temperature, minimum temperature, humidity, wind speed, wind level, daily rainfall, visibility, average total cloud cover, air pressure, and weather type. These weather variables were chosen on the basis of previous studies on the influence of weather on sales.

4.1.3. Data Summary

Because the data in the POS system are highly intermittent, we cannot directly use SKU-level data from a single store, so we first aggregate the original sales data. On the regional scale, we aggregate all stores in Hangzhou; in the time dimension, we aggregate by day; and at the product level, we aggregate by clothing category. Specifically, we first combined the sales records of the same SKU on the same date across stores into one table and then counted the quantities sold in each product category. Second, we matched the weather data by date and merged them into the sales data table. Finally, we obtained the summary data of sales records of different categories in Hangzhou with weather information, and the specific variables included are shown in Figure 2.
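The aggregation and matching steps can be sketched as follows. The field names (`date`, `store`, `category`, `quantity`, `avg_temp`, `rainfall`) and the sample records are hypothetical, since the actual POS schema is not given here.

```python
from collections import defaultdict


def aggregate_daily(sales_rows, weather_by_date):
    """Sum quantities over all stores per (date, category), then attach weather by date."""
    totals = defaultdict(int)
    for row in sales_rows:
        totals[(row["date"], row["category"])] += row["quantity"]

    table = []
    for (day, category), quantity in sorted(totals.items()):
        record = {"date": day, "category": category, "quantity": quantity}
        # Days without a weather record simply get no weather fields.
        record.update(weather_by_date.get(day, {}))
        table.append(record)
    return table


# Hypothetical transaction records from two stores.
sales_rows = [
    {"date": "2019-01-05", "store": "S1", "category": "down jacket", "quantity": 3},
    {"date": "2019-01-05", "store": "S2", "category": "down jacket", "quantity": 2},
    {"date": "2019-01-05", "store": "S1", "category": "shirt", "quantity": 1},
    {"date": "2019-01-06", "store": "S2", "category": "shirt", "quantity": 4},
]
# Hypothetical daily weather records for the city.
weather_by_date = {
    "2019-01-05": {"avg_temp": 4.0, "rainfall": 0.0},
    "2019-01-06": {"avg_temp": 6.5, "rainfall": 2.1},
}
table = aggregate_daily(sales_rows, weather_by_date)
```

Each row of `table` then corresponds to one category on one day for the whole city, which is the level at which the forecasting models operate.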

4.2. Data Analysis

4.2.1. Exploration of Weather Characteristics

Figure 2. Summary of the characteristic variables contained in the data.

The sales of different clothing categories show different characteristics under different weather conditions. For example, a cold winter with low temperature and low humidity is beneficial to the sales of down jackets, and a hot summer with high temperature and dryness is beneficial to the sales of short skirts. Therefore, analyzing the sales characteristics of different clothing categories under different weather conditions helps improve the accuracy of the sales forecast model.

First, we selected six representative clothing categories, namely T-shirts, shirts, dresses, pants, down jackets, and sweaters, and analyzed their Pearson correlation coefficients with the weather variables using the following formula:

$p = \frac{\sum_{i = 1}^{n} (X_{i} - \bar{X}) (Y_{i} - \bar{Y})}{\sqrt{\sum_{i = 1}^{n} {(X_{i} - \bar{X})}^{2} \sum_{i = 1}^{n} {(Y_{i} - \bar{Y})}^{2}}}$

where n is the number of samples, $\bar{X}$ and $\bar{Y}$ are the sample means of X and Y, respectively, and p is the correlation coefficient. The correlation between each clothing category and the weather elements was clear. Table 2 shows the correlation coefficients of the five weather variables most strongly correlated with clothing sales.
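The formula translates directly into code. The sketch below uses only the standard library; the temperature and sales figures are made up for illustration and are not values from Table 2.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical example: daily average temperature vs. down jacket sales.
temperature = [0.0, 5.0, 10.0, 15.0]
down_jacket_sales = [40.0, 30.0, 20.0, 10.0]
p = pearson(temperature, down_jacket_sales)
```

Because the made-up sales fall linearly as temperature rises, `p` is (up to rounding) -1 here; real data such as those in Table 2 give values of smaller magnitude.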

Temperature is significantly correlated with the sales of almost all types of clothing: down jackets and sweaters are negatively correlated with temperature, whereas T-shirts, shirts, and pants are positively correlated with it. The negative correlation between down jacket sales and temperature was the strongest, with a magnitude of 0.520, and the positive correlation between T-shirt sales and temperature was 0.317. Knitted sweaters are negatively correlated with wind speed, daily rainfall, and average total cloud cover to different degrees. The sales changes of the different clothing categories across temperature ranges are shown in Figure 3.

Table 2. Correlation coefficient between sales volume of different clothing categories and weather variables.

Note: ** indicates significance < 0.01, and * indicates significance < 0.05.

Figure 3. Trends of sales volume of different clothing categories with temperature.

For dresses, pants, T-shirts, and shirts, the sales volume increases with temperature when the temperature is below 20°C, reaches its maximum at 20°C - 25°C, and decreases as the temperature rises above 25°C. For knitwear, the sales volume increased with temperature below 15°C, reached a maximum at 15°C - 20°C, and decreased as the temperature rose above 20°C. The temperature effect for down jackets was different from that of the other five garments. Their peak sales volume lies in the temperature range 5°C - 10°C; beyond this range, the sales volume decreased as the temperature increased.

4.2.2. Exploration of Other Characteristics

We divided the calendar into working days and non-working days and into holidays and non-holidays, and drew box plots based on these two variables. Figure 4 compares sales volume on working days and non-working days, and Figure 5 compares holidays and non-holidays. We found that on weekends and holidays, the overall level of clothing sales was much higher than on weekdays and non-holidays. Therefore, we included these two variables when establishing the sales prediction model.

4.3. Feature Engineering

Figure 4. Comparison of sales distribution between working days and non-working days. Note: In Figure 4, 0 indicates working days and 1 indicates non-working days.

Figure 5. Comparison of sales distribution between non-holiday and holiday. Note: In Figure 5, 0 indicates non-holiday and 1 indicates holiday.

We collected three types of data: basic product attributes, historical sales, and weather data. As our research focuses on the influence of weather information on sales forecasts, to obtain an unbiased estimate we must ensure that the brand's prior discount level is unrelated to the weather type. We interviewed enterprise managers, whose sales plans are usually prepared one or two ago, and they do not use weather as an input variable. This allows us to include weather as an external variable in our forecast model and to examine whether introducing weather information can improve the accuracy of the clothing sales forecast.

We sorted and summarized the collected data and preprocessed them by filling in missing values and deleting abnormal values. There were 13 features in the original data of this experiment. Because of the high intermittence of data in the POS system, we use the summary data aggregated by category, which causes us to lose information recorded in the original record sheet, such as style number, color, and size. This loss does not affect our research. In addition, the above data exploration showed that holidays also have a significant impact on clothing sales, so we decomposed the listing date, added feature variables such as month, day, week, whether the day is a holiday, and holiday type, and applied one-hot encoding to categorical features. Finally, 23 feature variables were obtained. Table 3 lists the feature variables.
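The date decomposition and one-hot encoding can be illustrated as follows. The feature names, the holiday set, and the weather-type categories are hypothetical; the weekend flag is a simplification that ignores make-up working days.

```python
from datetime import date


def date_features(d, holidays):
    """Decompose a date into calendar features (names are illustrative)."""
    return {
        "month": d.month,
        "day": d.day,
        "weekday": d.weekday(),               # Monday = 0, Sunday = 6
        "is_weekend": int(d.weekday() >= 5),  # simplification: Saturday/Sunday only
        "is_holiday": int(d in holidays),
    }


def one_hot(name, value, categories):
    """One-hot encode a categorical value into 0/1 indicator features."""
    return {f"{name}_{c}": int(value == c) for c in categories}


# Hypothetical inputs: one public holiday and three weather types.
holidays = {date(2019, 10, 1)}
weather_types = ["sunny", "cloudy", "rainy"]

features = date_features(date(2019, 10, 1), holidays)
features.update(one_hot("weather", "rainy", weather_types))
```

A category with k possible values becomes k indicator columns, which is how a modest number of raw attributes expands into a larger feature set.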

Table 3. Data characteristics.

4.4. Model Evaluation Indicators

When training our model, many hyperparameters still need to be tuned manually, and they have a significant impact on the prediction accuracy of the model. To select them, we use the grid search method: the parameter space is divided into a grid, and the model is trained and evaluated for every parameter combination at the grid points to determine the best combination [29].
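A grid search reduces to an exhaustive loop over all parameter combinations. In the sketch below, the parameter names, the grid values, and the scoring function are hypothetical placeholders; in practice the score would be the validation MSE of a model trained with those parameters.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every parameter combination and keep the one with the lowest score."""
    names = sorted(param_grid)
    best_params, best_score = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_mse(params):
    """Stand-in for 'train a model with params and return its validation MSE'."""
    return (params["max_depth"] - 5) ** 2 + (params["learning_rate"] - 0.1) ** 2


grid = {"max_depth": [3, 5, 7], "learning_rate": [0.01, 0.1, 0.3]}
best_params, best_score = grid_search(grid, toy_validation_mse)
```

The number of evaluations is the product of the grid sizes (nine here), so the cost grows quickly as parameters are added.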

To better test the accuracy of the model, we used the mean square error to evaluate the model, which is expressed as follows:

$MSE = \frac{1}{n} \sum_{i = 1}^{n} {({\hat{y}}_{i} - y_{i})}^{2}$

where $y_{i}$ is the real data after logarithmic smoothing, ${\hat{y}}_{i}$ is the predicted value, and n is the number of samples. The experiment was carried out in Jupyter Notebook with Python 3.6. The prediction process is illustrated in Figure 6.
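The evaluation metric can be written out as follows. The exact smoothing transform is not specified above, so `log(1 + y)` is used here as an assumption; the sample values are made up.

```python
import math


def log_smooth(values):
    """Logarithmic smoothing, assumed here to be log(1 + y) so that zero sales are allowed."""
    return [math.log1p(v) for v in values]


def mse(y_true, y_pred):
    """Mean squared error between true and predicted values."""
    n = len(y_true)
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n


# Hypothetical daily sales and predictions made on the log-smoothed scale.
raw_sales = [0.0, 9.0, 99.0]
y_true = log_smooth(raw_sales)
y_pred = [0.1, 2.2, 4.7]
error = mse(y_true, y_pred)
```

Because the error is computed on the smoothed scale, it penalizes relative rather than absolute deviations in the original sales counts.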

5. Experimental Results and Analysis

5.1. Prediction Results of the Single Model

Figure 6. Specific flow chart of sales forecast.

Table 4. Comparison of the influence of weather information on sales forecast results of different types of clothing.

First, we forecast the total sales volume of clothing, including all clothing categories, using the random forest, XGB, and GBDT models. The prediction errors of the models are shown under the ALL series in Table 4. In theory, we expect adding more external information to improve the accuracy of the model predictions. However, in our experiment, the prediction error (MSE) of all three models increased after weather information was added. This shows that, for total clothing sales, directly introducing weather information reduces the accuracy of the model predictions; the weather information here acts as noise for the prediction model. A possible reason is that the same weather has different effects on the sales of different kinds of clothing. For example, under high temperature and high humidity, the sales of T-shirts increase and the sales of trousers decrease, while the overall sales of clothing remain unchanged. Therefore, we forecast sales by category and selected six representative garments. The forecast errors for the six garment categories are listed in Table 4.

We found that for T-shirts, trousers, and knitwear, the prediction error (MSE) of the three models increased after weather information was added, which shows that weather information is also noise for predicting the sales of these three garments. For the other three garments, dresses, down jackets, and shirts, adding weather information significantly improved the prediction accuracy of the three models and greatly reduced the MSE. The MSE for dresses, down jackets, and shirts was reduced by 86.03%, 80.14%, and 41.49% on average, respectively.
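The "reduced on average" figures are relative MSE reductions averaged over the three models. The arithmetic can be sketched as below; the MSE pairs are invented for illustration and are not the values in Table 4.

```python
def mse_reduction_pct(mse_without, mse_with):
    """Relative MSE reduction (in percent) after adding weather information."""
    return (mse_without - mse_with) / mse_without * 100.0


def average_reduction(pairs):
    """Average the per-model reductions for one clothing category."""
    reductions = [mse_reduction_pct(before, after) for before, after in pairs]
    return sum(reductions) / len(reductions)


# Hypothetical (MSE without weather, MSE with weather) for one category.
example = {"RF": (0.50, 0.10), "XGB": (0.40, 0.10), "GBDT": (0.80, 0.20)}
avg = average_reduction(list(example.values()))
```

A negative result would indicate that the MSE increased, as reported above for the basic categories.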

Combined with our data exploration of weather information, we attribute this phenomenon to the type of clothing: T-shirts, pants, and sweaters are basic models, whereas dresses, down jackets, and shirts are seasonal models. Basic clothing is less affected by weather, and its demand is relatively stable; it sells steadily throughout all seasons. Seasonal clothing is greatly affected by weather. With the alternation of seasons and the change in temperature, the sales volume in each season is also quite different [30]. Therefore, introducing weather information into the forecasting model interferes with the sales forecast of basic clothing but helps improve the accuracy of seasonal clothing sales forecasts. When making short-term sales plans based on weather information, the sales staff of offline physical stores need to focus on the influence of weather on seasonal clothing sales, so that they can adjust the order quantity appropriately, avoid the risk of out-of-stock or over-stock, and improve their daily decisions [3].

5.2. Prediction Results of Stacking Strategy Fusion Integration Model

We obtained the prediction results of the random forest, XGB, and GBDT as single models. Based on these results, we fused the models using the stacking strategy. Table 5 shows the performance comparison between the stacking strategy model and the voting strategy model on the test set. From the table, we can see that the prediction error (MSE) of the stacking fusion strategy is lower than that of the voting strategy, which averages the prediction results of multiple single models. The MSE for dresses, shirts, and down jackets decreased by 52.17%, 25.22%, and 25.29%, respectively. This also shows that, compared with a numerical linear combination, the stacking strategy works better: it lets the strengths of different learners complement one another and improves the overall generalization performance of the model.

Table 5. Comparison of prediction errors of stacking fusion model with weather information of different time lengths.

In addition, we find that when weather information is included in the forecast model as an external variable, the forecast error tends to increase as the forecast horizon lengthens. The sales forecast error for dresses and shirts was smallest when 7 days of weather information were added, whereas the sales forecast error for down jackets was smallest when 14 days of weather information were added.

6. Summary

In this study, we used sales transaction data of a women's fashion retail brand in Hangzhou to investigate the impact of weather on the sales of physical clothing stores. We found that including weather information does not improve the accuracy of sales forecasts for all clothing categories. For total clothing sales, including weather information does not improve the prediction accuracy. The reason is that total clothing sales include different types of clothing, and different types of clothing have different weather sensitivity. Therefore, we selected six representative clothing categories and forecast them separately.

The results depend on the product category. Weather information did not improve the accuracy of sales prediction for basic clothing, such as T-shirts, pants, and knitwear. However, introducing weather information improved the accuracy of model predictions for dresses, down jackets, and shirts, which have a seasonal trend: on average, the MSE decreased by 86.03% for dresses, 80.14% for down jackets, and 41.49% for shirts. In addition, we found that the stacking strategy model performed better than the voting strategy model, with an average MSE reduction of 49.28%. The prediction accuracy of the model decreased as the forecast horizon increased, and the weather information containing 7 days was the best. The results showed that weather information with daily granularity is more suitable for short-term prediction.

These findings have theoretical and practical significance for the garment industry. Academically, when building prediction models for clothing sales, researchers should select variables according to the characteristics of the clothing products; adding more variables does not necessarily improve model accuracy. For example, weather information can be added to sales forecasting models for seasonal clothing, whereas models for basic clothing can omit it.

In practice, clothing managers should focus on recent weather and plan product categories accordingly when making sales plans based on weather information. They should develop different plans for different clothing categories to balance each category’s response to the weather. For weather-sensitive categories, such as seasonal clothing, managers should consult recent weather information and adjust order quantities appropriately. For categories that are not sensitive to weather, such as basic clothing, weather information need not be considered, which reduces the daily workload.

Although our study extends the understanding of how weather information affects sales forecasting for different clothing categories, and of how the time length of the weather information affects forecasting accuracy, several factors limit our findings and suggest directions for future research.

In our study, we used meteorological and sales data from Hangzhou, so our results are specific to Hangzhou; other regions may show different weather effects on different clothing categories. In addition, ours is an after-the-fact analysis: the weather data we used are actual observations obtained after the event, whereas a real forecasting scenario would have to take weather forecast data as input. The error of weather forecast data increases by approximately 80% every 2.5 days [31]. Future research can therefore improve on these two aspects and can also consider different demand patterns. In other words, weather information should be considered at each demand stage and product life cycle stage to improve the accuracy of sales forecasts for different clothing categories.
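The practical weight of this limitation can be illustrated with a short calculation. The sketch below assumes that the approximately 80% growth per 2.5 days compounds multiplicatively over the lead time; this compounding reading is an assumption made here for illustration and is not stated in the text above, and the function name and parameters are hypothetical.

```python
def forecast_error_factor(lead_days, growth=0.80, period_days=2.5):
    """Multiplicative growth of weather-forecast error after `lead_days`.

    Assumes (illustratively) that the error grows by `growth` every
    `period_days` and that this growth compounds over the lead time.
    """
    if lead_days < 0:
        raise ValueError("lead_days must be non-negative")
    return (1.0 + growth) ** (lead_days / period_days)


# Relative error growth at the weather windows discussed in this study.
for lead in (7, 14):
    factor = forecast_error_factor(lead)
    print(f"{lead}-day lead: error grows by a factor of about {factor:.1f}")
```

Under this assumption the error of a forecast-based weather input grows several-fold within a week, which is consistent with the finding that daily weather information is better suited to short-term sales prediction.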

It is hoped that this study will help researchers and industry practitioners better understand the close relationship between weather change and the garment industry, and will raise awareness of weather information as a management tool. Although better retail planning will not completely eliminate inventory overhang, reducing it is still valuable.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1]	Aversa, J., Hernandez, T. and Doherty, S. (2021) Incorporating Big Data within Retail Organizations: A Case Study Approach. Journal of Retailing and Consumer Services, 60, Article ID: 102447. https://doi.org/10.1016/j.jretconser.2021.102447
[2]	Caro, F. and Martínez-de-Albéniz, V. (2015) Fast Fashion: Business Model Overview and Research Opportunities. In: Agrawal, N. and Smith, S., Eds., Retail Supply Chain Management. International Series in Operations Research & Management Science, Vol. 223, Springer, Boston, 237-264. https://doi.org/10.1007/978-1-4899-7562-1_9
[3]	Martinez-de-Albeniz, V. and Belkaid, A. (2021) Here Comes the Sun: Fashion Goods Retailing under Weather Fluctuations. European Journal of Operational Research, 294, 820-830. https://doi.org/10.1016/j.ejor.2020.01.064
[4]	Tian, X., Cao, S. and Song, Y. (2021) The Impact of Weather on Consumer Behavior and Retail Performance: Evidence from a Convenience Store Chain in China. Journal of Retailing and Consumer Services, 62, Article ID: 102583. https://doi.org/10.1016/j.jretconser.2021.102583
[5]	Parsons, A.G. (2001) The Association between Daily Weather and Daily Shopping Patterns. Australasian Marketing Journal, 9, 78-84. https://doi.org/10.1016/S1441-3582(01)70177-2
[6]	Steinker, S., Hoberg, K. and Thonemann, U.W. (2017) The Value of Weather Information for E-Commerce Operations. Production and Operations Management, 26, 1854-1874. https://doi.org/10.1111/poms.12721
[7]	Oh, J., Ha, K.-J. and Jo, Y.-H. (2022) A Predictive Model of Seasonal Clothing Demand with Weather Factors. Asia-Pacific Journal of Atmospheric Sciences, 58, 667-678. https://doi.org/10.1007/s13143-022-00284-3
[8]	Babongo, F., Appelqvist, P., Chavez-Demoulin, V., Hameri, A.-P. and Niemi, T. (2018) Using Weather Data to Improve Demand Forecasting for Seasonal Products. International Journal of Services and Operations Management, 31, 53-76. https://doi.org/10.1504/IJSOM.2018.094183
[9]	Rose, N. and Dolega, L. (2022) It’s the Weather: Quantifying the Impact of Weather on Retail Sales. Applied Spatial Analysis and Policy, 15, 189-214. https://doi.org/10.1007/s12061-021-09397-0
[10]	Choi, C., Kim, E. and Kim, C. (2011) A Way of Managing Weather Risks Considering Apparel Consumer Behaviors. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.1909185
[11]	Boada-Collado, P. and Martinez-de-Albeniz, V. (2020) Estimating and Optimizing the Impact of Inventory on Consumer Choices in a Fashion Retail Setting. Manufacturing & Service Operations Management, 22, 582-597. https://doi.org/10.1287/msom.2018.0764
[12]	Giri, C., Jain, S., Zeng, X. and Bruniaux, P. (2019) A Detailed Review of Artificial Intelligence Applied in the Fashion and Apparel Industry. IEEE Access, 7, 95376-95396. https://doi.org/10.1109/ACCESS.2019.2928979
[13]	Wang, C., Sen, M.R., Yao, B., Certik, M. and Randrianarivony, K.A. (2021) Harnessing Machine Learning Emerging Technology in Financial Investment Industry: Machine Learning Credit Rating Model Implementation. Journal of Financial Risk Management, 10, 317-341. https://doi.org/10.4236/jfrm.2021.103019
[14]	Zhou, X., Meng, J., Wang, G. and Qin, X. (2021) A Demand Forecasting Model Based on the Improved Bass Model for Fast Fashion Clothing. International Journal of Clothing Science and Technology, 33, 106-121. https://doi.org/10.1108/IJCST-08-2019-0114
[15]	Falatouri, T., Darbanian, F., Brandtner, P. and Udokwu, C. (2022) Predictive Analytics for Demand Forecasting—A Comparison of SARIMA and LSTM in Retail SCM. Procedia Computer Science, 200, 993-1003. https://doi.org/10.1016/j.procs.2022.01.298
[16]	Hu, W. and Zhang, X. (2020) Commodity Sales Forecast Based on ARIMA Model Residual Optimization. 2020 5th International Conference on Communication, Image and Signal Processing (CCISP), Chengdu, 13-15 November 2020, 229-233. https://doi.org/10.1109/CCISP51026.2020.9273506
[17]	Van Calster, T., Baesens, B. and Lemahieu, W. (2017) A Profit-Driven Order Identification Algorithm for ARIMA Models in Sales Forecasting. Applied Soft Computing, 60, 775-785. https://doi.org/10.1016/j.asoc.2017.02.011
[18]	Ren, S., Chan, H.-L. and Ram, P. (2017) A Comparative Study on Fashion Demand Forecasting Models with Multiple Sources of Uncertainty. Annals of Operations Research, 257, 335-355. https://doi.org/10.1007/s10479-016-2204-6
[19]	Ge, N., Sun, L., Shi, X. and Zhao, P. (2019) Research on Sales Forecast of Prophet-LSTM Combination Model. Computer Science, 46, 446-451. (In Chinese)
[20]	Han, Y. (2020) A Forecasting Method of Pharmaceutical Sales Based on ARIMA-LSTM Model. 2020 5th International Conference on Information Science, Computer Technology and Transportation (ISCTT), Shenyang, 13-15 November 2020, 336-339. https://doi.org/10.1109/ISCTT51595.2020.00064
[21]	Li, M., Ji, S. and Liu, G. (2018) Forecasting of Chinese E-Commerce Sales: An Empirical Comparison of ARIMA, Nonlinear Autoregressive Neural Network, and a Combined ARIMA-NARNN Model. Mathematical Problems in Engineering, 2018, Article ID: 6924960. https://doi.org/10.1155/2018/6924960
[22]	Huang, H. and Liu, Q. (2017) Intelligent Retail Forecasting System for New Clothing Products Considering Stock-out. Fibres & Textiles in Eastern Europe, 25, 10-16. https://doi.org/10.5604/01.3001.0010.1704
[23]	Loureiro, A.L.D., Migueis, V.L. and da Silva, L.F.M. (2018) Exploring the Use of Deep Neural Networks for Sales Forecasting in Fashion Retail. Decision Support Systems, 114, 81-93. https://doi.org/10.1016/j.dss.2018.08.010
[24]	Fallah Tehrani, A. and Ahrens, D. (2016) Enhanced Predictive Models for Purchasing in the Fashion Field by Using Kernel Machine Regression Equipped with Ordinal Logistic Regression. Journal of Retailing and Consumer Services, 32, 131-138. https://doi.org/10.1016/j.jretconser.2016.05.008
[25]	Tichy, T., Nguyen, L., Holcapek, M., Kresta, A. and Dvorácková, H. (2022) Quarterly Sales Analysis Using Linguistic Fuzzy Logic with Weather Data. Expert Systems with Applications, 203, Article ID: 117345. https://doi.org/10.1016/j.eswa.2022.117345
[26]	Arora, A., Srivastava, A. and Bansal, S. (2020) Business Competitive Analysis Using Promoted Post Detection on Social Media. Journal of Retailing and Consumer Services, 54, Article ID: 101941. https://doi.org/10.1016/j.jretconser.2019.101941
[27]	Pavlyshenko, B. (2018) Using Stacking Approaches for Machine Learning Models. 2018 IEEE Second International Conference on Data Stream Mining & Processing (DSMP), Lviv, 21-25 August 2018, 255-258. https://doi.org/10.1109/DSMP.2018.8478522
[28]	van Steenbergen, R.M. and Mes, M.R.K. (2020) Forecasting Demand Profiles of New Products. Decision Support Systems, 139, Article ID: 113401. https://doi.org/10.1016/j.dss.2020.113401
[29]	Stuke, A., Rinke, P. and Todorovic, M. (2021) Efficient Hyperparameter Tuning for Kernel Ridge Regression with Bayesian Optimization. Machine Learning-Science and Technology, 2, Article ID: 035022. https://doi.org/10.1088/2632-2153/abee59
[30]	Verstraete, G., Aghezzaf, E.-H. and Desmet, B. (2019) A Data-Driven Framework for Predicting Weather Impact on High-Volume Low-Margin Retail Products. Journal of Retailing and Consumer Services, 48, 169-177. https://doi.org/10.1016/j.jretconser.2019.02.019
[31]	Orrell, D., Smith, L., Barkmeijer, J. and Palmer, T.N. (2001) Model Error in Weather Forecasting. Nonlinear Processes in Geophysics, 8, 357-371. https://doi.org/10.5194/npg-8-357-2001
