> that maximizes the BC likelihood function.

4) Choose ${\alpha }_{i}$ in the neighborhood of $\hat{\alpha }$ with an interval of 0.0001, and repeat steps (2) and (3).

5) Determine the final estimator.
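Steps (4) and (5) amount to a fine grid search around the preliminary estimate. The following is a minimal sketch of that refinement, not the authors' implementation. The callable `profile_loglik` is a hypothetical placeholder for the BC likelihood evaluated at a given $\alpha$ in steps (2) and (3), and the neighborhood half-width of 0.01 is an assumption; only the 0.0001 spacing comes from the text.

```python
import numpy as np


def refine_alpha(profile_loglik, alpha_hat, half_width=0.01, step=0.0001):
    """Re-evaluate the objective on a fine grid around alpha_hat and return the maximizer.

    profile_loglik: hypothetical callable giving the likelihood value for one alpha.
    half_width: assumed size of the neighborhood (not specified in the text).
    step: grid spacing of 0.0001, as in step (4).
    """
    n_side = int(round(half_width / step))
    # Build the grid from integer offsets to avoid accumulating floating-point error.
    grid = alpha_hat + step * np.arange(-n_side, n_side + 1)
    values = np.array([profile_loglik(a) for a in grid])
    return float(grid[np.argmax(values)])


# Toy concave objective with its maximum at 0.4088 (illustration only).
def toy_loglik(a):
    return -(a - 0.4088) ** 2


alpha_final = refine_alpha(toy_loglik, alpha_hat=0.41)
print(round(alpha_final, 4))
```

In an actual estimation, the toy objective would be replaced by the likelihood computed from the data for each candidate $\alpha$.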

3. Distributions of Medical Expenditures and Blood Pressures

3.1. Medical Expenditures

Figure 1 shows the distribution of medical expenditures. The distribution is skewed and has a very heavy tail on the right side. The basic statistics (points) are as follows: mean: 13,356, median: 4061, standard deviation (SD): 39,241, skewness: 11.0, kurtosis: 174.0, and maximum: 1,212,291. A total of 20.2% of all observations of medical expenditures are zero. On the other hand, 1.9, 0.4, and 0.16% used more than 100,000, 300,000 and 500,000 points, and their medical expenditures accounted for 30.3, 14.3 and 7.8%, respectively, of total medical expenditures.
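The concentration figures above (the share of observations above a threshold and their share of total expenditures) can be computed as in the following sketch. The function name `concentration` and the sample values are assumptions made for illustration and are not the study data.

```python
import numpy as np


def concentration(expenditures, threshold):
    """Return (share of observations above threshold, their share of total expenditure)."""
    m = np.asarray(expenditures, dtype=float)
    high = m > threshold
    return float(high.mean()), float(m[high].sum() / m.sum())


# Made-up points for illustration only (not the study data).
sample = [0, 0, 1200, 4000, 5300, 9800, 20000, 150000]
obs_share, spend_share = concentration(sample, threshold=100_000)
zero_share = float(np.mean(np.asarray(sample) == 0))  # share of zero expenditures
print(obs_share, round(spend_share, 3), zero_share)
```

A small share of observations accounting for a large share of the total is what a heavy right tail looks like in these terms.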

3.2. SBP and DBP

Figures 2 and 3 present the distributions of SBP and DBP, respectively, and Figure 4 shows the relationship between them. Excluding observations of BP that are too high (SBP > 300 or DBP > 200) or too low (SBP, DBP < 30), the basic statistics of SBP and DBP for 175,083 observations are given in Table 1. Under the 140/90 criterion, 22.8% are diagnosed with hypertension. Under the new guideline of 130/80, this value jumps to 51.1%, more than half of the observations, suggesting that the effect of changing the criterion is quite large.
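The effect of changing the criterion can be illustrated with a short sketch. It assumes that an observation is classified as hypertensive when either SBP or DBP reaches its cutoff, which is the usual reading of criteria such as 140/90 but is not spelled out in the text. The function names and readings are hypothetical.

```python
def is_hypertensive(sbp, dbp, sbp_cut, dbp_cut):
    """Assumed rule: hypertensive if either reading reaches its cutoff."""
    return sbp >= sbp_cut or dbp >= dbp_cut


def prevalence(readings, sbp_cut, dbp_cut):
    """Fraction of (sbp, dbp) pairs classified as hypertensive."""
    flags = [is_hypertensive(s, d, sbp_cut, dbp_cut) for s, d in readings]
    return sum(flags) / len(flags)


# Made-up readings for illustration only (not the study data).
readings = [(118, 76), (128, 82), (134, 78), (142, 88), (150, 95)]
old_rate = prevalence(readings, 140, 90)  # 140/90 criterion
new_rate = prevalence(readings, 130, 80)  # 130/80 criterion
print(old_rate, new_rate)
```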

3.3. Relation between SBP and Medical Expenditures

Figure 1. Distribution of medical expenditures.

Figure 2. Distribution of SBP.

Figure 3. Distribution of DBP.

Table 1. Summaries of SBP and DBP.

Figure 4. Relationship of SBP and DBP, and 130/80 and 140/90 criteria.

Figure 5. Relationship of average medical expenditure and SBP.

Figure 5 shows the relation of SBP to average medical expenditures. Average medical expenditures are computed over SBP intervals of 5 mmHg (i.e., for an SBP value of 130 mmHg, the average medical expenditure of observations between 127.5 and 132.5 mmHg). The figure shows an upward trend, and the correlation coefficient between SBP and average medical expenditure over the range 80 - 180 mmHg is 0.843. This result seems to support the new guideline. The question, however, is whether this relation is true or spurious.
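The binning behind Figure 5 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the half-open interval convention (a bin centered at 130 covers 127.5 up to but not including 132.5), the function name `binned_means`, and the data are all assumed.

```python
import numpy as np


def binned_means(sbp, expenditure, width=5.0):
    """Average expenditure per SBP bin; a bin centered at c covers [c - width/2, c + width/2)."""
    sbp = np.asarray(sbp, dtype=float)
    expenditure = np.asarray(expenditure, dtype=float)
    centers = np.floor((sbp + width / 2) / width) * width
    unique_centers = np.unique(centers)
    means = np.array([expenditure[centers == c].mean() for c in unique_centers])
    return unique_centers, means


# Made-up data for illustration only (not the study data).
sbp = [127.5, 129, 132, 133, 136, 141, 143]
cost = [10, 20, 30, 40, 60, 50, 70]
centers, means = binned_means(sbp, cost)
r = np.corrcoef(centers, means)[0, 1]  # correlation between bin centers and bin averages
print(centers, means, round(r, 3))
```

Note that a correlation computed on bin averages is not the same quantity as a correlation computed on individual observations; averaging within bins removes individual-level variation, so the binned correlation can be much larger.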

As already mentioned, various factors affect medical expenditures and BP. Figure 6 shows average medical expenditures by gender at age intervals of 5 years. As an individual ages, medical expenditure increases, and there is a difference between males and females. Nawata et al. pointed out that BP is strongly affected by age and gender: BP becomes higher as an individual grows older, with SBP increasing by 5 mmHg over a 10-year aging period, and the SBP of males is about 4 mmHg higher than that of females. In the next section, we conduct the analysis using the power transformation tobit model.

Figure 6. Medical expenditures by age and gender.

4. Results of Analysis by the Power Transformation Model

Models and Explanatory Variables

Regression models are used to remove the effects of various factors. We first consider the following specification of the power transformation tobit model in Equation (1).

Model A:

$\begin{array}{l}{y}_{i}^{*}={\beta }_{1}+{\beta }_{2}Age+{\beta }_{3}Female+{\beta }_{4}Height+{\beta }_{5}BMI+{\beta }_{6}SBP+{\beta }_{7}DBP\\ \quad +{\beta }_{8}HDL+{\beta }_{9}LDL+{\beta }_{10}Triglyceride+{\beta }_{11}GGP+{\beta }_{12}AST+{\beta }_{13}ALT\\ \quad +{\beta }_{14}Blood\_sugar+{\beta }_{15}Urine\_sugar+{\beta }_{16}Urine\_protein\\ \quad +{\beta }_{17}F\_year14+{\beta }_{18}F\_year15+{\beta }_{19}Society2+{\beta }_{20}Society3+u\end{array}$ (5)

Besides SBP and DBP (mmHg), the following explanatory variables are used: Age, Female (1: if female, 0: otherwise), Height (cm), BMI ( = weight (kg)/height (m)²), HDL (high-density lipoprotein cholesterol, mg/dL), LDL (low-density lipoprotein cholesterol, mg/dL), Triglyceride (mg/dL), GGP (γ-glutamyl transferase, U/L), AST (aspartate aminotransferase, U/L), ALT (alanine aminotransferase, U/L), Blood_sugar (mg/dL), Urine_sugar (integers of 1-5, with sugar in urine increasing with the number; 1 is normal, 5 is worst), Urine_protein (same scale as Urine_sugar), F_year14 (1: fiscal year 2014, 0: otherwise), F_year15 (1: fiscal year 2015, 0: otherwise), Society2 (1: Society 2, 0: otherwise) and Society3 (1: Society 3, 0: otherwise), where U/L is units per liter.

For all explanatory variables, objectively measured values could be obtained from medical checkup data. This model did not include variables related to anamnesis, currently treated diseases, or individual lifestyles. For example, hypertension is an important risk factor for diabetes. Suppose that the causal chain is “hypertension => diabetes => medical expenditure”. In this case, if a variable representing diabetes were included, the relation “hypertension => medical expenditure” could not be observed. In econometric terms, we used the reduced form so as not to miss any possible effects of BP.

Age, Female and Height represent basic individual characteristics; BMI represents obesity; while HDL, LDL and Triglyceride represent lipid concentration in the blood. If lipid concentration is abnormal (too high or too low), an individual is diagnosed with dyslipidemia. Lipoproteins are proteins that carry cholesterol through the blood. LDL cholesterol makes up most of the body’s cholesterol, and HDL cholesterol absorbs cholesterol and carries it back to the liver. Triglyceride is the most common type of fat in the body, and stores excess energy. Although our bodies need lipids to build cells, too much could be a problem.

Currently, dyslipidemia is mainly hyperlipidemia, where the lipid concentration is too high. WHO  warned: “Raised cholesterol increases the risks of heart disease and stroke. Globally, a third of ischemic heart disease is attributable to high cholesterol. Overall, raised cholesterol is estimated to cause 2.6 million deaths (4.5% of total) and 29.7 million disability adjusted life years (DALYS), or 2.0% of total DALYS.” LDL and HDL cholesterols are classified as “bad” and “good”. LDL (bad) cholesterol contributes to fatty buildups in arteries, and raises the risk factor for chronic coronary heart disease, heart attack and stroke. On the other hand, HDL (good) cholesterol removes LDL cholesterol from the arteries   . GGP, AST and ALT are mainly related to liver functions; Blood_sugar and Urine_sugar are important indicators of diabetes; and Urine_protein represents the condition of the kidneys  .

We first excluded observations with missing values in explanatory variables. We then excluded the following observations: BMI too high (over 100); SBP too high (over 300) or too low (under 30); DBP too high (over 200) or too low (under 30); SBP − DBP zero or negative; HDL too high (over 500); LDL too high (over 500); Triglyceride too high (over 1000); GGP too high (over 1000); ALT too high (over 500); AST too high (over 500); and Blood_sugar too high (over 500). After these exclusions, we used 173,498 observations (M > 0: 138,407, and M = 0: 35,091) for the estimation of the model. Among these observations, 20.2% of medical expenditures were zero, and 79.8% were positive.
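The exclusion rules above can be written as a simple record filter. The sketch below is illustrative only: the dictionary layout, the function name `keep_observation`, and the example record are assumptions, and the missing-value check covers only the variables that have range limits rather than every explanatory variable.

```python
# Upper limits from the exclusion rules in the text; values above them are dropped.
UPPER_LIMITS = {
    "BMI": 100, "SBP": 300, "DBP": 200, "HDL": 500, "LDL": 500,
    "Triglyceride": 1000, "GGP": 1000, "ALT": 500, "AST": 500, "Blood_sugar": 500,
}
LOWER_LIMITS = {"SBP": 30, "DBP": 30}  # values below these are dropped


def keep_observation(rec):
    """Return True if a record (dict of variable -> value) passes all exclusion rules."""
    needed = set(UPPER_LIMITS) | set(LOWER_LIMITS)
    if any(rec.get(name) is None for name in needed):
        return False  # missing value in a checked variable
    if any(rec[name] > limit for name, limit in UPPER_LIMITS.items()):
        return False
    if any(rec[name] < limit for name, limit in LOWER_LIMITS.items()):
        return False
    return rec["SBP"] - rec["DBP"] > 0  # SBP - DBP must be positive


# Made-up record for illustration only (not the study data).
ok = {"BMI": 22.5, "SBP": 124, "DBP": 78, "HDL": 60, "LDL": 110,
      "Triglyceride": 95, "GGP": 30, "ALT": 20, "AST": 22, "Blood_sugar": 92}
bad = dict(ok, DBP=130)  # SBP - DBP is negative, so the record is excluded
print(keep_observation(ok), keep_observation(bad))
```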

Model A assumes that the effects of BP are continuous. However, it is possible that BP affects health conditions only if it becomes higher than certain threshold values (hereafter, threshold value hypothesis; criteria such as 140/90 and 130/80 are obviously based on this hypothesis). Therefore, we consider a model using dummy variables for SBP. Note that, as in SPRINT, we analyzed thresholds only for SBP.

Model B:

$\begin{array}{l}{y}_{i}^{*}={\beta }_{1}+{\beta }_{2}Age+{\beta }_{3}Female+{\beta }_{4}Height+{\beta }_{5}BMI+{\beta }_{6}SBP130\\ \quad +{\beta }_{7}SBP140+{\beta }_{8}SBP160+{\beta }_{9}SBP180+{\beta }_{10}DBP+{\beta }_{11}HDL\\ \quad +{\beta }_{12}LDL+{\beta }_{13}Triglyceride+{\beta }_{14}GGP+{\beta }_{15}AST+{\beta }_{16}ALT\\ \quad +{\beta }_{17}Blood\_sugar+{\beta }_{18}Urine\_sugar+{\beta }_{19}Urine\_protein\\ \quad +{\beta }_{20}F\_year14+{\beta }_{21}F\_year15+{\beta }_{22}Society2+{\beta }_{23}Society3+u\end{array}$ (6)

SBP130 (1: if SBP ≥ 130, 0: otherwise), SBP140 (1: if SBP ≥ 140, 0: otherwise), SBP160 (1: if SBP ≥ 160, 0: otherwise) and SBP180 (1: if SBP ≥ 180, 0: otherwise) are dummy variables representing threshold values. Table 2 presents a summary of the explanatory variables.
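As a small illustration of how these dummies are coded, the sketch below builds them for a single SBP reading. The function name `sbp_dummies` is hypothetical.

```python
THRESHOLDS = (130, 140, 160, 180)


def sbp_dummies(sbp):
    """Return the threshold dummies of Model B for one SBP reading."""
    return {f"SBP{t}": int(sbp >= t) for t in THRESHOLDS}


example = sbp_dummies(165)
print(example)
```

Because each dummy is defined by SBP ≥ threshold, the dummies are cumulative: a reading of 165 mmHg switches on SBP130, SBP140 and SBP160 together. Each coefficient in Model B therefore measures the additional shift from crossing that threshold, and the total shift relative to SBP below 130 is the sum of the coefficients of all dummies that equal one.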

Figure 7. Distribution of medical expenditures after the transformation $\left(y={M}^{0.4088}\right)$.

Table 2. Explanatory variables.

SD: Standard deviation.

Table 3. Result of estimation (Model A).

SE: standard error, **: significant at 1% level.

Table 3 lists the results of the estimation for Model A. Figure 7 shows the distribution of the medical expenditures after the power transformation ( $y={M}^{0.4088}$ ). The distribution is much closer to the normal distribution, suggesting the usefulness of the model for analyzing this dataset. Since the sample size was quite large, all variables except F_year14 were significant at the 1% level. The estimates of Age, Female, Height, BMI, Triglyceride, GGP, AST, ALT, Blood_sugar, Urine_sugar, Urine_protein, and F_year15 were positive, with these variables making medical expenditures higher. The effects of most of these variables were as expected. On the other hand, the estimates of LDL, HDL, Society2 and Society3 were negative. Although LDL cholesterol is called “bad” and HDL “good”, higher levels of both LDL and HDL cholesterols reduced medical expenditures in our study. Hence further studies are necessary to determine the roles and functions of cholesterols, especially LDL cholesterol. This is one important finding of this study.

The medical expenditures of Societies 2 and 3 were lower than those of Society 1. Society 1 was formed by one large corporation, while Societies 2 and 3 were formed by groups of smaller corporations. Although the reason for this difference cannot be determined here, it might be necessary to check and revise the healthcare system in Society 1. Although the estimate of DBP was positive and significant, the estimate of SBP was −0.0566, and its t-value was −10.42, which is significant at any reasonable level. This means that higher SBP reduced medical expenditures.

Table 4 presents the estimation results of Model B, which contains the threshold value dummies for SBP. The estimates for variables other than BP were very similar to those of Model A. Among the SBP dummy variables, SBP130, SBP140 and SBP180 were not significant even at the 5% level, despite the fact that the sample size was quite large. Only the SBP160 dummy was significant at the 1% level, but its estimate was negative. The estimate of DBP became negative in this model, although it was not significant at the 5% level. These findings do not support the threshold value hypothesis, at least for SBP.

5. Discussion

The effects of BP on medical expenditures are mixed. Higher DBP makes them higher, but higher SBP makes them lower. We therefore examine the relation between medical expenditures and high SBP more closely. As shown in Figure 5, there is an upward trend between SBP and average medical expenditures.

We consider a simple regression model of Equation (1) that is given by:

Model C:

${y}_{i}^{*}={\beta }_{1}+{\beta }_{2}SBP+u$ (7)

Then we get (standard errors are in parentheses),

$\begin{array}{l}{y}_{i}^{*}=\underset{(0.466)}{13.264}+\underset{(0.00367)}{0.1308}\,SBP,\quad \hat{\alpha }=0.4094\ (0.0007)\end{array}$ (8)

The estimate of SBP is positive, and its t-value is 35.70 and significant at any reasonable significance level. As shown in Figure 6, age and gender might affect medical expenditures. We add Age and Female, and consider the model:

Model D:

${y}_{i}^{*}={\beta }_{1}+{\beta }_{2}Age+{\beta }_{3}Female+{\beta }_{4}SBP+u$ (9)

The estimation results of this model are given by:

$\begin{array}{l}\hat{\alpha }=0.4088\ (0.0007)\\ {y}_{i}^{*}=\underset{(0.5665)}{-24.016}+\underset{(0.0100)}{0.8508}\,Age+\underset{(0.15468)}{5.8150}\,Female+\underset{(0.00359)}{0.0747}\,SBP\end{array}$ (10)

Although its magnitude is almost half that of the previous case, the estimate of SBP is still positive, and the t-value is 20.86, which is significant at any reasonable level.
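The t-values quoted for Models C and D are ratios of each estimate to its standard error. The sketch below recomputes them from the rounded figures printed in Equations (8) and (10), so the results differ slightly from the reported 35.70 and 20.86, which were presumably computed from unrounded values. The function name `t_value` is hypothetical.

```python
def t_value(estimate, std_error):
    """t-ratio for testing that a coefficient equals zero."""
    return estimate / std_error


# (estimate, standard error) of the SBP coefficient, as printed in the text.
reported = {"Model C": (0.1308, 0.00367), "Model D": (0.0747, 0.00359)}
t_values = {name: t_value(b, se) for name, (b, se) in reported.items()}
ratio = reported["Model D"][0] / reported["Model C"][0]  # relative size of the two estimates

for name, t in t_values.items():
    print(name, round(t, 2))
print("Model D / Model C estimate:", round(ratio, 2))
```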

We then add BMI, representing obesity, and consider the model,

Model E:

${y}_{i}^{*}={\beta }_{1}+{\beta }_{2}Age+{\beta }_{3}Female+{\beta }_{4}SBP+{\beta }_{5}BMI+u$

Table 4. Result of estimation (Model B).

SE: standard error, **: significant at 1% level.

The estimation result is given by:

$\begin{array}{l}\hat{\alpha }=0.4085\ (0.0007)\\ {y}_{i}^{*}=\underset{(0.6973)}{-58.963}+\underset{(0.0101)}{1.0416}\,Age+\underset{(0.1554)}{9.0088}\,Female-\underset{(0.0035)}{0.0250}\,SBP+\underset{(0.0186)}{1.7028}\,BMI\end{array}$ (11)

In Model E, the coefficient of SBP becomes negative and significant at the 1% level. Muntner et al. analyzed data from the US National Health and Nutrition Examination Survey (NHANES). They pooled data from the 2011-2012 and 2013-2014 NHANES cycles of adult participants, 20 years of age and older (n = 10,907). They declared that, “Implementation of the 2017 ACC/AHA hypertension guideline has the potential to increase the prevalence of hypertension and use of antihypertensive medication among U.S. adults. This should translate into a reduction in CVD events.” Although age, gender, race, smoking, total and HDL-cholesterol, and diabetes were included, “obesity” was not considered in their analysis. The correlation coefficient of SBP and BMI is 0.307 in this study. The relation between obesity and hypertension has been recognized for more than half a century, and many studies have been conducted. For details, see the review works of Kotchen, Jiang et al. and Leggio et al. Jiang et al. declared that: “The mechanisms underlying obesity-associated hypertension or other associated metabolic diseases remains to be adequately investigated.” They furthermore contended that, “There is no single cause to explain all the cases of obesity worldwide.” The relation between BP and obesity should be carefully studied.
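The sign change between Models D and E is the pattern expected when an omitted variable is correlated with an included one. The following sketch illustrates the mechanism on simulated data. It uses ordinary least squares rather than the power transformation tobit model, and every parameter value is an assumption chosen for illustration (the SBP-BMI correlation is set near the 0.307 reported above); it does not reproduce the study's estimates.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Simulated data (not the study data); all parameter values are assumptions.
bmi = rng.normal(23.0, 3.5, n)
sbp = 120.0 + 1.3 * (bmi - 23.0) + rng.normal(0.0, 14.0, n)  # SBP rises with BMI
y = 1.7 * bmi - 0.025 * sbp + rng.normal(0.0, 5.0, n)         # true SBP effect is negative


def ols(y, *regressors):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


b_short = ols(y, sbp)       # BMI omitted
b_long = ols(y, sbp, bmi)   # BMI included
corr = np.corrcoef(sbp, bmi)[0, 1]

print("corr(SBP, BMI):", round(corr, 3))
print("SBP coefficient without BMI:", round(b_short[1], 4))
print("SBP coefficient with BMI:", round(b_long[1], 4))
```

When BMI is omitted, the SBP coefficient absorbs part of the BMI effect through their correlation and turns positive; once BMI is included, the coefficient returns to the negative value used to generate the data.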

The results of this study suggest that the risks of hypertension might be spurious, and other factors such as obesity might be affecting health condition. Moreover, BP has been found to affect not only heart diseases but also various other health conditions such as kidney diseases. The influences of the new guideline of 130/80 are so large that careful reviews of various studies including analyses of various factors and diseases (not only heart diseases) affected by BP levels are absolutely necessary to determine whether or not the new guideline is appropriate.

6. Conclusions

In this study, we mainly evaluated the effects of BP on medical expenditures by the power transformation tobit models using a dataset containing 175,123 medical checkups and 6,312,125 receipts for 88,211 individuals, obtained from three health insurance societies. Medical expenditure is a very good indicator of an individual’s health condition, because under the current Japanese national health insurance system, most medical institutes receive the same amount for the same treatments and medicines, independent of region. We first considered a model that included various health information factors for individuals obtained in yearly medical examinations. Although the estimate of DBP had a positive value, that of SBP became negative and the absolute t-value was larger than 10, suggesting that the new guideline for SBP was not supported.

We then considered the possibility that BP affects health condition only if it exceeds certain threshold values (threshold value hypothesis). We used SBP dummies to check the threshold value hypothesis, but the results did not support the hypothesis for SBP. While the estimates of most other variables had the expected signs, LDL cholesterol, considered “bad”, showed the opposite result. Additional studies are likely to be needed for the evaluation of cholesterols.

We then evaluated the relation between medical expenditures and SBP. Medical expenditures and SBP were positively correlated, and if the simple model only contained SBP, the estimate became a positive value. Although the size of the coefficient was almost cut in half, the sign did not change if age and gender variables were considered in the model. However, when BMI, representing obesity, was added, the estimate of SBP became negative and significant at the 1% level.

It is possible that the relation between SBP and medical expenditures might be spurious, and the correlation of SBP and BMI might affect the result. The relation between BP and obesity should be carefully studied. Moreover, the effect of the new 2017 ACC/AHA guideline, the first comprehensive hypertension clinical practice guideline since 2003, is so large that a careful and wide range of reviews of various studies, not only of heart diseases but for other types of diseases as well, are absolutely necessary. New studies verifying the guideline should also be conducted.

In this paper, we evaluated medical expenditures, not the risks of BP on heart diseases. Evaluation of the effects of BP on heart diseases and other important diseases is needed. It will also be necessary to analyze a larger dataset covering a longer time range from various insurance societies to make the analysis more precise. These are subjects to be studied in the future.

Acknowledgements

This study was supported by a Grant-in-Aid for Scientific Research, “Analyses of Medical Checkup Data and Possibility of Controlling Medical Expenses (Grant Number: 17H22509),” from the Japan Society of Science, and by a research grant, “Exploring Inhibition of Medical Expenditure Expansion and Health-oriented Business Management Based on Evidence-based Medicine” from the Research Institute of Economics, Trade and Industry (RIETI). The dataset was anonymized at the health insurance societies. This study was approved by the Institutional Review Boards of the University of Tokyo (number: KE17-30). The authors would like to thank the health insurance societies for their sincere cooperation in providing us the data. We would also like to thank an anonymous referee for his/her helpful comments and suggestions.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.
