Alcohol and Type 2 Diabetes: Results from Canadian Cross-Sectional Data


Cross-section data from Canadian Community Health Surveys are used to examine the relationship between moderate alcohol use and type 2 diabetes. Results from these data are compared with those obtained from prospective longitudinal studies. The major result is that both types of data yield similar conclusions with respect to this relationship. This occurs because Canadian drinking behavior is established by early adulthood and remains relatively stable thereafter. The only difference between the two types of survey is the time at which information on drinking behavior is obtained. Since this does not matter if drinking behavior is stable over large age ranges, results from the two types of survey will be similar. Neither type of data can be used to support the proposition that the relationship between drinking behavior and the risk of diabetes is causal. Some advantages that sample survey data have over longitudinal data are also noted.

McIntosh, J. (2014) Alcohol and Type 2 Diabetes: Results from Canadian Cross-Sectional Data. Journal of Diabetes Mellitus, 4, 316-323. doi: 10.4236/jdm.2014.44044.

1. Introduction

Much has been written on the effects of moderate alcohol consumption as a prophylactic against type 2 diabetes. The studies regarded as the most influential, and referred to most often, are prospective or longitudinal. They collect baseline information on a sample at a fixed point in time and then follow the respondents for a number of years until each respondent is diagnosed with diabetes, dies, or the study is terminated for administrative reasons. The measure used to represent diabetes is the duration of diabetes-free life. One of the distinguishing features of the methodology is that only those respondents who do not have diabetes at baseline are retained for follow-up coverage. Some studies collect information at regular intervals as the study progresses but most do not.

There is a sufficiently large number of these studies to have generated three meta-studies reviewing their results [1]-[3]. The conclusion of these meta-studies is that moderate alcohol use is associated with a lower risk of getting type 2 diabetes relative to non-drinkers, occasional drinkers, or heavy drinkers. New studies have also emerged [4]-[7], which add to this already extensive literature and confirm the earlier results.

The appeal of prospective studies is that the observed statistical relationship between alcohol use and the risk of type 2 diabetes is based on the respondent’s self-reported drinking behavior at baseline, which by construction is prior to the onset of diabetes. Because of this, the hypothesis that this relationship is causal has some appeal. On the other hand, results based on sample surveys involving cross-section data are seen as much less convincing since drinking behavior at the time of the survey refers to a time after the onset of diabetes for those who suffer from the disease. This would have current behavior explaining events that happened in the past, or so it would seem.

It is argued here that because Canadian drinking habits are relatively stable over time and over a large range of ages, the data requirements for both cross-sectional and prospective longitudinal surveys to be informative about this issue are similar. This is good news for diabetes research for two reasons. First, countries like Canada, which have not produced many longitudinal medical or health surveys, can use cross-section data to investigate what effect drinking behavior has on the probability of having diabetes, using the sample surveys that are regularly carried out by Statistics Canada [8] and [9]. Secondly, these sample surveys can be used to answer an important question that arises when longitudinal data is being used. Deleting respondents with diabetes at baseline generates a selection problem. For example, [10] examined subjects aged 65 or older using the American Cardiovascular Health Study. In Canada in 2010, 20.7% of males over 65 had diabetes, so the sample of diabetes-free respondents may not be representative of the 65+ sample as a whole (footnote 1). Presumably a similar result holds for their data as well. Does this matter? Sample data from the 2010 Canadian Community Health Survey suggests there are no sample selection problems associated with looking at older populations. What, in principle, appeared to be a problem does not arise.

Finally, unlike longitudinal studies, there are no attrition problems when a cross-section survey is the source of the data. Respondents are contacted only once; there is no need to keep track of them, and inference problems due to non-response or death do not arise even if mortality is related to alcohol use (footnote 2).

The paper has the following format. The argument that cross-sectional and longitudinal surveys are similar with respect to when drinking behavior is determined is developed in detail in the next section. The statistical model is outlined in Section 3 and the results are contained in Section 4. These are discussed in Section 5 and the paper ends with a summary of the results and some conclusions.

2. Longitudinal vs. Cross-Sectional Data

When researchers use baseline data to examine outcomes that occur later in a project this data has to represent the respondent’s characteristics not just at the time the data was collected but for a considerable period prior to the collection date as well as for the follow-up period. In the case of alcohol consumption and diabetes risk in longitudinal surveys the authors of [6] note that this requires “alcohol intake be fairly stable over time”. However, it is not clear exactly what the appropriate time frame is. Different studies report different results but there is not much information about the age at first moderate alcohol use which minimizes the probability of getting diabetes.

Stability of behavior is also required for cross-section data to be informative about this relationship but the requirements are somewhat more stringent. Alcohol intake has to be stable for a period of unknown length prior to the onset of diabetes. For the samples used here many of the respondents became diabetic 10 or 15 years before the data was collected, so that reported drinking behavior obtained in the survey, if it is to be informative about the risks of having diabetes, has to be the same as it was long before the information on the respondent’s current alcohol consumption behavior was obtained.

In Canada, drinking behavior is formed when respondents are in their twenties and remains remarkably stable up to ages 50 - 59 for men. There is considerably more variation for women. Table 1 and Table 2 display the proportions of the population who claim to be regular drinkers (footnote 3).

These tables contain data from the 2000-01 and 2011-12 Canadian Community Health Surveys and give proportions by ten-year age groups for both male and female respondents. In 2000-01 the proportion of regular drinkers among males was around 50% for age groups 20 - 59 with little or no significant variation across age groups. For the 2011-12 sample there is hardly any variation at all over the first four age categories, but the proportions of male regular drinkers in the age groups 20 - 59 were about 10% higher than they were eleven years earlier. For women changes in behavior across the two surveys are much larger. In 2011-12 women in all age categories drank considerably more than they did eleven years earlier. The survey design is the same for both years, so although the respondents are not the same for the two surveys they come from the same distribution. For men, the conclusion from this is that alcohol consumption behavior does not change very much as respondents or cohorts get older, and thus the age at which this information is collected will not have a major impact on the results concerning the importance of alcohol intake on the risk of diabetes. Of course, the upward trend in regular drinking behavior could have an impact on the results. This issue will be examined later by looking at some simulations.

Table 1. Proportion of regular drinkers in Canadian Community Health Surveys, 2000-01 and 2011-12, for males.

Table 2. Proportion of regular drinkers in Canadian Community Health Surveys, 2000-01 and 2011-12, for females.

3. Statistical Models

In the Canadian Community Health Surveys respondents are asked whether they have type 2 diabetes. They are also asked what type of medication has been prescribed for them. Most respondents (86% in 2011-12) were taking some form of oral medication, insulin, or both. Thus, although the information is self-reported, the fact that most respondents had some involvement with a medical practitioner suggests that it is quite reliable. The measure of diabetes used here is the answer to this question for type 2 diabetes. Call this $d_i$ for respondent $i$. This is a binary variable which takes the value 1 if respondent $i$ has type 2 diabetes at the time of the survey and 0 if not.

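The type 2 diabetes indicator variable is explained by a normal probability model. Define

$$d_i^* = X_i \beta + u_i \qquad (1)$$

where $X_i$ is a vector of personal characteristics of respondent $i$, including whether he or she is a regular or occasional drinker, $\beta$ is the corresponding vector of regression coefficients, and $u_i$ is a normally distributed error term. $d_i^*$ can be interpreted as a latent variable measuring the propensity for respondent $i$ to become diabetic. The outcome probabilities can then be defined as

$$\Pr(d_i = 1) = \Pr(d_i^* > 0) = \Phi(X_i \beta)$$

$$\Pr(d_i = 0) = 1 - \Phi(X_i \beta)$$

where $\Phi$ is the unit normal cumulative distribution function. The parameters in the model can be estimated by maximizing the sample likelihood function whose natural logarithm is

$$\ln L(\beta) = \sum_i \left[ d_i \ln \Phi(X_i \beta) + (1 - d_i) \ln\left(1 - \Phi(X_i \beta)\right) \right]$$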

The respondent characteristics include six smoking categorical variables going from never smoked to being a daily smoker. There are four educational categories going from less than a high school diploma to a university degree. There is also information on the respondent’s age, body mass index, income decile and level of physical activity. But the last two were not used as regressors. The non-drinker category includes former drinkers. This is seen as problematic by some authors, [12] for example. There are some problems associated with combining never and former drinkers but these are shown to be small in [13] . Table 3 and Table 4 show parameter estimates for the probability model for four age groups for both males and females for the two alcohol use dummies, the natural logarithm of the respondent’s body mass index, and age.
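The log-likelihood of the normal probability model can be made concrete with a small sketch. The code below is illustrative only and is not the estimation code used for Tables 3 and 4: it assumes the convention Pr(d_i = 1) = Phi(X_i beta), uses a single "regular drinker" dummy plus a constant, and runs on hypothetical toy counts rather than survey data. With only a constant and one dummy, the maximum likelihood estimates have a closed form, which is what the helper `fit_intercept_plus_dummy` (a name introduced here) exploits.

```python
from math import log
from statistics import NormalDist

STD_NORMAL = NormalDist()  # unit normal: supplies cdf (Phi) and inv_cdf


def probit_loglik(beta, X, d):
    """Sum over i of d_i*ln(Phi(x_i'b)) + (1 - d_i)*ln(1 - Phi(x_i'b))."""
    total = 0.0
    for x_i, d_i in zip(X, d):
        index = sum(b * x for b, x in zip(beta, x_i))
        p = STD_NORMAL.cdf(index)
        total += log(p) if d_i == 1 else log(1.0 - p)
    return total


def fit_intercept_plus_dummy(dummy, d):
    """Closed-form probit MLE when the regressors are a constant and one dummy."""
    ones = [d_i for d_i, r in zip(d, dummy) if r == 1]
    zeros = [d_i for d_i, r in zip(d, dummy) if r == 0]
    p1 = sum(ones) / len(ones)    # diabetes share among dummy == 1
    p0 = sum(zeros) / len(zeros)  # diabetes share among dummy == 0
    b0 = STD_NORMAL.inv_cdf(p0)
    b1 = STD_NORMAL.inv_cdf(p1) - b0
    return b0, b1


# Hypothetical toy sample (NOT survey data):
# 200 non-regular drinkers of whom 20 are diabetic,
# 200 regular drinkers of whom 8 are diabetic.
regular = [0] * 200 + [1] * 200
diabetic = [1] * 20 + [0] * 180 + [1] * 8 + [0] * 192
X = [(1.0, float(r)) for r in regular]

b0, b1 = fit_intercept_plus_dummy(regular, diabetic)
print(f"intercept = {b0:.3f}, regular-drinker coefficient = {b1:.3f}")
print(f"log-likelihood at the MLE = {probit_loglik((b0, b1), X, diabetic):.3f}")
```

With more regressors (age, log BMI, smoking and education categories) no closed form exists and the log-likelihood would be maximized numerically, for instance with a standard probit routine from a statistics package.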

Table 3. Parameter estimates (standard error) by age group for males, 2011-12.

Table 4. Parameter estimates (standard error) by age group for females, 2011-12.

4. Results

Discussion of the results begins with the analysis of the age group 40 - 49 for males. Diabetes prevalence for the age groups 20 - 39 is quite low at 1.2%. It rises to 4.8%, four times as high, for the age group 40 - 49. It would therefore appear that most of the respondents who suffer from diabetes in this age group became diabetic in their late thirties or forties. As argued earlier the drinking behavior for this age group is similar to what it was at younger ages and before the onset of diabetes. Thus, the large and significant regression coefficient for the categorical variable “regular drinker” for males of −0.468 (0.098) in the first row of Table 3 leads to the same conclusions that are claimed in the studies based on longitudinal data. Current drinking behavior as measured at the age when the respondent was surveyed is a good representation of lifetime drinking behavior for this age group, so that having been a regular drinker is associated with a lower probability of having type 2 diabetes. Lifetime behavior is exogenous or predetermined, so diabetes cannot cause someone to have been a regular drinker. Being diabetic could induce lower current alcohol consumption but there is no evidence that it does, and medical advice concerning alcohol use for diabetics does not usually recommend lower alcohol consumption. Whether moderate alcohol use is a causal factor in reducing the risk of diabetes is another matter; being exogenous or predetermined is not sufficient for causality. Causality is a complex issue and it will be examined in more detail in the next section.

Other variables were included as regressors in the normal probability models. The coefficient of the natural logarithm of the respondent’s body mass index, BMI, was always the largest and most significant, followed by age and then some of the higher educational categories. Being a lifetime non-smoker also reduced the risk of diabetes. Income and physical activity were not included as explanatory variables because of the possibility of reverse causation. Being a heavy drinker was included but it was never significant, a result similar to that found by [7] and others. In this cohort, male respondents who have a history of regular moderate drinking, are not overweight, don’t smoke, are well educated and are younger are much more likely not to have diabetes.

The parameter estimates for the older age groups are very similar to those for the age group 40 - 49. This result is somewhat surprising since the distribution of drinking behavior begins to change towards more occasional and nondrinkers as the cohorts get older. Apparently these changes are not large enough to alter the conclusions based on the youngest cohort.

The parameter estimates for women are similar to those for men except that being a regular drinker is more important for women and these coefficients increase with age. This is an unusual result since most studies find that the prophylactic effects of regular moderate alcohol use are less pronounced for women.

Table 5 and Table 6 show predicted probabilities of having diabetes by age groups and type of drinker. Although the regression coefficients associated with the regular drinking category do not change very much across age groups, predicted probabilities of having diabetes increase dramatically by age and by drinking behavior category. For the age group 70 - 79, for example, respondents in the occasional or non-drinker categories are more than two and a half times as likely to have diabetes as regular drinkers.

Table 5. Predicted probabilities of diabetes by age group and drinking category for males, 2011-12.

Table 6. Predicted probabilities of diabetes by age group and drinking category for females, 2011-12.
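Why a roughly constant probit coefficient can still produce widening differences in predicted probabilities can be illustrated with a short sketch. The numbers below are hypothetical and chosen only for illustration; they are not the estimates behind Tables 3 to 6. The baseline index stands for the contribution of all other regressors (age, log BMI and so on) for an occasional or non-drinker, and `BETA_REGULAR` is an assumed regular-drinker coefficient.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

# Hypothetical values for illustration only; not the estimates in Table 3.
BETA_REGULAR = -0.45
baseline_index = {"40-49": -1.7, "50-59": -1.4, "60-69": -1.1, "70-79": -0.9}


def predicted_probability(index):
    """Probit prediction: Pr(d_i = 1) = Phi(index)."""
    return STD_NORMAL.cdf(index)


gaps = []
for age_group, index in baseline_index.items():
    p_other = predicted_probability(index)                   # occasional or non-drinker
    p_regular = predicted_probability(index + BETA_REGULAR)  # regular drinker
    gaps.append(p_other - p_regular)
    print(f"{age_group}: other = {p_other:.3f}, "
          f"regular = {p_regular:.3f}, gap = {p_other - p_regular:.3f}")
```

Because the normal density is larger nearer the centre of the distribution, the same shift in the index translates into a larger change in probability as the baseline index rises with age, so the absolute gap between drinking categories widens even though the coefficient is held fixed.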

5. Discussion

The slight upward trend in the proportion of regular drinkers means that the measured proportions for the age group 40 - 49 actually overstate the proportions of regular drinkers ten or twenty years earlier. If the measured proportions overstate the true proportions then this will also lead to an inflated estimate of the true effect of being a regular drinker on the probability of having diabetes. How large is the error associated with the use of the inflated data? To answer this question a simulation exercise was carried out in which 10% of the regular male drinkers and 37% of the regular female drinkers in the age group 40 - 49 were randomly reallocated equally to the two other categories. This leaves a set of regular drinkers whose proportion matches that of regular drinkers in the age group 20 - 29 in the 2000-01 sample for both genders. For these simulated samples the regression coefficient associated with the regular drinker dummy moves towards zero, from −0.401 (0.076) to −0.346 (0.077) for men and from −0.372 (0.081) to −0.318 (0.090) for women. Neither change is significant, and the new coefficients are still many times their standard errors. Although the size of the response changes, the effect associated with being a regular drinker is still present and highly significant. Thus even if current drinking behavior does not represent exactly what respondents did twenty years earlier it is still a good enough measure of their behavior.
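The reallocation step of such a simulation can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used for the reported coefficients: the category labels, the helper name `reallocate_regular`, the seed and the category counts are all hypothetical.

```python
import random
from collections import Counter


def reallocate_regular(categories, share, rng):
    """Move a random `share` of 'regular' respondents equally into the other two categories."""
    regular_idx = [i for i, c in enumerate(categories) if c == "regular"]
    n_move = round(share * len(regular_idx))
    moved = rng.sample(regular_idx, n_move)  # draw without replacement
    targets = ("occasional", "non-drinker")
    result = list(categories)
    for j, i in enumerate(moved):
        result[i] = targets[j % 2]  # alternate so the split is as equal as possible
    return result


# Hypothetical category counts, for illustration only.
sample = ["regular"] * 1000 + ["occasional"] * 500 + ["non-drinker"] * 500
simulated = reallocate_regular(sample, share=0.10, rng=random.Random(2014))
print(Counter(simulated))
```

The probability model would then be re-estimated on the simulated drinking categories, with each respondent's diabetes status and other characteristics left unchanged, and the new regular-drinker coefficient compared with the original one.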

One of the reasons why researchers pay so much more attention to longitudinal data sources is the belief that it will bring them closer to discovering a causal relation between drinking behavior and the risk of diabetes. The issue of causality is an important one and not being able to claim that results are causal is often seen as detracting from their credibility. The important question here is whether results based on either longitudinal or sample survey data can be used to support the hypothesis that there is a causal relation between drinking behavior and diabetes.

From a purely statistical point of view the answer to this question is most probably not, and this result does not depend on which type of data is being used in the analysis. In the linear regression framework, [14] (p. 31) showed that in a relation like Equation (1) the $X_i$ vector of variables causes $d_i$ if $X_i$ and $u_i$ are independent and the regression coefficients are significant. Here causality fails because the drinking dummies are not an adequate description of respondent drinking behavior. There is no information on the type of beverage consumed or whether it is consumed with meals. In addition, the number of drinks consumed per day is an important characteristic of drinking behavior and should also be included as a regressor. This information is available in the surveys but it is not regarded as being reliable and for that reason was not included here. Respondents in the survey were asked to recall what they drank on each day, and what they said they drank does not agree with drinks per day based on per capita alcohol sales data. The result is that there is unobservable variation within the category “regular drinker”; this generates a measurement error problem and leads to correlation between $X_i$ and $u_i$. This does not mean that causality is rejected; it just cannot be confirmed using this type of data.

However, what is of slightly more concern is that even if an accurate picture of drinking behavior could be observed it still might not be possible to confirm that the relation is causal. Suppose for example that instead of Equation (1) the true model is

$$d_i^* = Z_i \delta + \varepsilon_i$$

where $d_i^*$ is the latent propensity of Section 3, $\varepsilon_i$ is an error term, and the $Z_i$ are highly correlated with the $X_i$ variables but cannot be observed by the researcher. Ideally the model to be estimated should be based on the equation

$$d_i^* = X_i \beta + Z_i \delta + \varepsilon_i$$

The estimated $\beta$ coefficients will not be significant but the $\delta$ coefficients will be. But when Equation (1) is the basis of the model then

$$u_i = Z_i \delta + \varepsilon_i$$

and $u_i$ and $X_i$ are not independent. The model based on Equation (1) is not causal, not because of measurement error but because of omitted regressors which are correlated with the observable regressors.
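The omitted-regressor argument can be illustrated with a small simulation. Everything in the sketch below is assumed for illustration: the sample size, the coefficient values, the way the observed dummy is made to correlate with the unobserved variable, and the seed. The outcome is generated from the unobserved variable alone, so the observed dummy has no effect by construction, yet a probit on the dummy alone (computed here in closed form, with a delta-method standard error) still yields a large and apparently significant coefficient.

```python
import random
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()
rng = random.Random(42)

N = 20000  # hypothetical sample size
x, d = [], []
for _ in range(N):
    z = rng.gauss(0.0, 1.0)                              # unobserved Z_i
    x_i = 1 if z + 0.5 * rng.gauss(0.0, 1.0) > 0 else 0  # observed dummy, correlated with Z_i
    d_star = -1.0 - 0.8 * z + rng.gauss(0.0, 1.0)        # true model: X_i plays no role
    x.append(x_i)
    d.append(1 if d_star > 0 else 0)


def cell(value):
    """Share with d_i == 1, and cell size, among respondents with X_i == value."""
    outcomes = [d_i for d_i, x_i in zip(d, x) if x_i == value]
    return sum(outcomes) / len(outcomes), len(outcomes)


def index_and_se(p, n):
    """Probit index Phi^{-1}(p) and its delta-method standard error."""
    q = STD_NORMAL.inv_cdf(p)
    return q, sqrt(p * (1.0 - p) / n) / STD_NORMAL.pdf(q)


(p0, n0), (p1, n1) = cell(0), cell(1)
q0, se0 = index_and_se(p0, n0)
q1, se1 = index_and_se(p1, n1)
beta_hat = q1 - q0                # coefficient on X_i when Z_i is omitted
se_beta = sqrt(se0**2 + se1**2)   # the two cells are independent samples
print(f"beta_hat = {beta_hat:.3f} (se {se_beta:.3f}), z = {beta_hat / se_beta:.1f}")
```

In this artificial setting the estimated coefficient on the observed dummy is negative and many times its standard error even though the dummy does not enter the data-generating equation, which is the sense in which significance alone cannot establish causality.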

This is not just a hypothetical situation. Information on the respondent’s history of diet and physical activity as well as detailed information on the timing and degree of being overweight or obese is very important in the analysis of diabetes. This information is uniformly absent in almost all surveys whatever their type. In this respect, both types of survey are similar and neither will be very informative about the issue of causality.

However, there is other evidence suggesting a causal relationship. [6] found that in addition to moderate levels of alcohol consumption being associated with lower risks of diabetes, a small increase in alcohol consumption to moderate drinking behavior from being a light drinker also reduced the risk of diabetes. This is extremely compelling evidence in favor of the relation being causal. The duration of the experiment in question in this study was four years, suggesting that the periods under consideration here are sufficiently long to capture the effect of drinking behavior on the risk of type 2 diabetes. Additionally, there is some medical evidence which suggests that alcohol use increases insulin sensitivity, which lessens the probability of getting diabetes, [15] and [16]. Moreover, Table 5 and Table 6 show very large and significant differentials across drinking categories that increase dramatically with age. This is consistent with the hypothesis that moderate drinking behavior actually reduces the probability of getting type 2 diabetes. In the absence of plausible competing hypotheses one might be inclined to believe that the relationship is causal.

The sample survey results presented here should be of considerable interest to diabetes researchers because they confirm what others have found using prospective data, namely that there is a “U” shaped relation between alcohol consumption and the risk of diabetes. Evidence of this result was obtained here when age groups were aggregated into the age group 40 - 79. Sample sizes were too small to use the individual age groups. Within the drinker category there are seven sub-categories going from less than once a month to drinking every day. The category with the largest regression coefficient was three to four days per week for men and five to six days per week for women. Optimal drinking behavior was never characterized by drinking every day. This confirms the “U” shaped relation between alcohol use and the risk of type 2 diabetes mentioned above.

In the introduction the study involving older cohorts by [10] was mentioned. The high average age of the respondents raised the possibility that sample selection could contaminate the results. However, when this sample was examined here similar results were obtained for the regression coefficients, −0.345 (0.032) for males and −0.528 (0.034) for females, respectively. These coefficients are similar to those for the younger age groups; both are highly significant and the gender differential is preserved. There is no apparent effect of selecting respondents who are older than 65 on the prophylactic effects of moderate drinking.

6. Summary and Conclusions

The results in this paper show that moderate alcohol use acts as a prophylactic in reducing the risk of type 2 diabetes. The data used here is cross-sectional and represents behavior at one point in time. However, Canadian drinking habits are fairly stable over time and across cohorts so that information about them will be similar for both longitudinal and cross-section surveys. Thus it should not be surprising that the protective effects of moderate alcohol use that so many longitudinal studies have found should also apply to respondents in the 2011-12 Canadian Community Health Survey. This is useful information since this issue has not been examined using Canadian longitudinal data. There are advantages from using cross-section surveys in terms of cost and the avoidance of both attrition and selection problems that arise in longitudinal surveys. It was also shown that neither type of survey could be used to justify a causal relation between alcohol use and type 2 diabetes. For longitudinal surveys the fact that the information on alcohol use was collected prior to the onset of the disease is not sufficient to support the claim that the relation is causal.

Sample surveys like the Canadian Community Health Survey are a new source of data that can and should be used to explore how health issues are related to respondent behavior. However, some of the problems noted about these surveys could be circumvented by including more retrospective content like the history of the respondent’s weight and exercise habits as well as more accurate information on how much and how often they drink alcohol.


1. This problem can easily be dealt with in longitudinal studies by using the conditional duration density function $f(t)/(1 - F(a))$, where $a$ is the age of the respondent at baseline, instead of $f(t)$, the unconditional duration density function. This, however, is not done in any of the work in this area.

2. This issue is discussed in [11].

3. In this study regular drinkers are defined as those who use alcohol at least once a week. Heavy drinkers consume at least five drinks per day at least once a week.

Conflicts of Interest

The authors declare no conflicts of interest.


[1] Koppes, L.L.J., Dekker, J.M., Hendriks, H.F.J., et al. (2005) Moderate Alcohol Consumption Lowers the Risk of Type 2 Diabetes. Diabetes Care, 28, 719-725.
[2] Baliunas, D.O., Taylor, B.J., Hyacinth, I., et al. (2009) Alcohol as a Risk Factor for Type 2 Diabetes. Diabetes Care, 32, 2123-2132.
[3] Pietraszek, A., Gregersen, S. and Hermansen, K. (2010) Alcohol and Type 2 Diabetes: A Review. Nutrition, Metabolism, and Cardiovascular Diseases, 5, 366-375.
[4] Beulens, J.W.J., van der Schouw, Y.T., Bergman, M.M., et al. (2012) Alcohol Consumption and the Risk of Type 2 Diabetes in European Men and Women; Influence of Beverage Type and Body Size. The EPIC-InterAct Study. Journal of Internal Medicine, 272, 358-370.
[5] Joosten, M.M., Grobbee, D.E., Daphne, L. van der A., Verschuren, W.M.M., Hendriks, H.F.J. and Beulens, J.W.J. (2010) Combined Effect of Alcohol Consumption and Lifestyle Behaviors on Risk of Type 2 Diabetes. American Journal of Clinical Nutrition, 91, 1777-1783.
[6] Joosten, M.M., Chiuve, S.E., Mukamal, K.J., et al. (2011) Changes in Alcohol Consumption and the Subsequent Risk of Type 2 Diabetes in Men. Diabetes, 60, 74-79.
[7] Rasouli, B., Ahlbom, A., Andersson, T., Grill, V., et al. (2012) Alcohol Consumption Is Associated with Reduced Risk of Type 2 Diabetes and Autoimmune Diabetes in Adults: Results from the Nord-Trondelag Health Study. Diabetic Medicine, 30, 56-64.
[8] Statistics Canada (2013) Canadian Community Health Survey, 2011-2012. Statistics Canada, Ottawa.
[9] Statistics Canada (2002) Canadian Community Health Survey, 2000, 2001. Statistics Canada, Ottawa.
[10] Djoussé, L., Biggs, M., Mukamal, K.J., et al. (2007) Alcohol Consumption and Type 2 Diabetes among Older Adults: The Cardiovascular Health Study. Obesity, 15, 1758-1764.
[11] McIntosh, J. (2014) Inference Problems in the Analysis of the Relationship between Alcohol Consumption and Coronary Heart Disease. Communications in Statistics-Simulation and Computation.
[12] Fillmore, K.M., Kerr, W.C., Stockwell, T., Chikritzhs, T. and Bostrom, A. (2006) Moderate Alcohol Use and Reduced Mortality Risk: Systematic Error in Prospective Studies. Addiction Research and Theory, Online Version, 1-16.
[13] McIntosh, J. (2009) Is Alcohol Consumption Good for You? Results from the 2005 Canadian Community Health Survey. Addiction Research and Theory, 16, 553-563.
[14] Pratt, J.W. and Schlaifer, R. (1988) On the Interpretation and Observation of Laws. Journal of Econometrics, 39, 23-52.
[15] Sierksma, A., Patel, H., Ouchi, N., et al. (2004) Effect of Moderate Alcohol Consumption on Adiponectin, Tumor Necrosis Factor Alpha, and Insulin Sensitivity. Diabetes Care, 27, 184-189.
[16] Imhof, A., Pampler, I. and Maier, S. (2009) Effect of Drinking on Adiponectin in Healthy Men and Women. Diabetes Care, 32, 1101-1103.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.