1. Introduction
Covariance structures considered in linear mixed modeling (LMM) of correlated continuous outcomes can be directly specified using either unstructured covariance or covariance pattern matrices [1] . In the unstructured case, correlations are treated as all different from each other. In the covariance pattern case, correlations have well known patterns, for example, exchangeable (i.e., constant) correlations as used in repeated measures analyses and autoregressive correlations as used in time series analyses. Variances are usually treated as all the same or all different, but can be modeled more generally using generalized linear modeling. An alternative for modeling covariances is based on what is called both random effects [2] and random coefficients [3] . In this case, covariances are modeled using polynomials in random effects/coefficients combined with directly specified covariance matrices. Since these models also have means based on polynomials in fixed effects/coefficients, they are called mixed effects models. In what follows, “effects/coefficients” will be referred to as “coefficients” for brevity.
Mixed effects models for correlated continuous outcomes can be generalized to address correlated categorical outcomes. The generalized estimating equations (GEE) approach [4] combines generalized linear models for the means [5] with directly specified covariance structures. The generalized linear mixed modeling (GLMM) approach [1] is based on random coefficients. In the GEE case, variances are treated as a specific function of the means possibly multiplied by a constant dispersion parameter. The correlations are estimated using residuals. In the GLMM case, parameters can be estimated using a pseudo-likelihood approach [6] . An alternative approach used here is extended linear mixed modeling (ELMM) [7] , extending both GEE and LMM, with parameter estimation based on maximizing the multivariate normal density evaluated using residuals for general correlated outcomes, not just normally distributed correlated outcomes. A thorough assessment of the usefulness of ELMM for modeling correlated outcomes is provided in [7] including bootstrapping assessments of its effectiveness. However, ELMM analyses have so far only considered directly specified covariance structures.
Models for the means of general correlated outcomes are often theoretically based. Models for the variances are usually based on specific functions of the means. For example, count outcomes assumed to be Poisson distributed have variances equal to the means. These specific variance functions can be multiplied by dispersions, which are usually treated as constant. However, more general adaptive modeling [7] [8] is possible both for the means and the dispersions in terms of polynomials in power transforms of primary (untransformed) predictors. These are called fractional polynomials [9] . For example, the dependence of the means on time could be modeled using a polynomial based on one or more power transforms of time (e.g., time−0.15 and time2.4) with or without an intercept. The power transforms are chosen using a heuristic search (an overview is given in Section 2.8). Since random coefficients models are based on polynomials, these polynomials can also be fractional. For example, covariances for a longitudinal outcome might be a nonlinear function of time. The heuristic search needs only to be extended to handle an extra set of power transforms. The use of fractional polynomials is especially important to consider since it provides for nonlinearity in continuous predictors of means, variances/dispersions, and random coefficients. The consideration of nonlinearity for random coefficients is especially novel with the potential for unique insights into dependence of covariance on predictors that might also influence dependence of means on predictors.
The purpose of this paper is to extend ELMM to account for random coefficients in terms of fractional polynomials and to compare the impact of such models to directly specified covariance models using analyses of a variety of correlated outcome data sets. Cases considered include linear, Poisson, logistic, exponential, and discrete regression models for correlated continuous, count/rate, dichotomous, positive continuous, and discrete numeric outcomes, respectively. Means and variances/dispersions are based on generalized linear models using appropriate link functions. Random coefficients are treated as independent.
2. Methods
Let
denote observed outcome measurements for matched sets i of conditions j (e.g., quality of life measurements for study participants observed at different times) for
and
, subsets of size
of the full set of possible conditions
. Let
denote the expected values or means of
satisfying
for errors
. Combine
,
, and
for
into the
vectors
,
, and
satisfying
Also let
denote the covariance matrices for
and
be the number of outcome measurements.
2.1. Generalized Linear Models for the Means
The means
can be modeled in terms of predictors
for
. Specifically,
where
denotes the transpose of the M × 1 vector
with entries
,
is a M × 1 vector of fixed coefficient parameters
for
, and g is called the link function. Let
be the M × 1 vector with entries
. When
for all i and j,
is an intercept parameter. This is a generalized linear model for the means, but it allows for nonlinearity of the link transformed means in terms of primary predictors
in the sense that model predictors can equal power transforms of
, that is,
for powers p and for some m. A description of alternative regression models for the means follows.
2.1.1. Linear Regression
The outcome measurements
are continuous, real-valued, and treated as normally distributed with identity link function
so that
The identity link function is the canonical choice for this case.
2.1.2. Poisson Regression
The outcome measurements
are discrete, count-valued, and treated as Poisson distributed with natural log link function
so that
The natural log link function is the canonical choice for this case.
The model for the means of the count outcome measurements
are sometimes converted from a model for counts to a model for rates
relative to associated totals
using offsets
. Formally, let
so that
and then the means
of
satisfy
2.1.3. Logistic Regression
The outcome measurements
are dichotomous with values 0 and 1 and treated as Bernoulli distributed with logit link function
so that
The means satisfy
The logit link function is the canonical choice for this case.
2.1.4. Exponential Regression
The outcome measurements
are continuous, positive real-valued, and treated as exponentially distributed with natural log link function
so that
The canonical choice for the link function for this case is the inverse function
so that
with the shortcoming of requiring
. The use of the natural log link function resolves this problem.
2.2. Discrete Regression Using Censored Poisson Probabilities
The outcome measurements
are discrete and numeric-valued with increasing values
for
so that the means satisfy
where
denote the probabilities of
having values
. Note that this is not a generalized linear model for the means; the probabilities are modeled instead of the means. The probabilities can be modeled in several ways [10] . The approach considered here is based on censored Poisson probabilities satisfying
2.3. Directly Specified Correlation Structures
The covariance matrix can be modeled in terms of variances combined with directly specified correlation structures. Specifically, with
denoting the entries of
for
, let
where
are the
diagonal matrices with diagonal entries the standard deviations
and the correlation matrices
are
submatrices of a directly specified
working correlation matrix
with entries
for
and diagonal entries
.
The most commonly considered directly specified correlation structures are the independent (IND), exchangeable (EXCH), autoregressive of order 1 (AR1), and unstructured (UN) correlation structures. The IND correlation structure has zero correlations, that is,
for
. The EXCH correlation structure has constant correlations, that is,
for
. The AR1 correlation structure has correlations weakening for outcome measurements further apart. Specifically, for the more general spatial AR1 case,
for
where
is a strictly increasing function of the index j and
is the absolute value of the difference
. The more commonly used non-spatial AR1 case sets
, and so treats outcome measurements as equally-spaced even when they are not. The UN correlation structure has the
distinct correlations
for
.
2.4. Variance Modeling
For the generalized linear models of Section 2.1, the variances
can be modeled in terms of dispersions
combined with specific functions
of the means
, that is
For the linear regression case of Section 2.1.1.,
so that the dispersions
are the same as the variances. For the Poisson regression case of Section 2.1.2,
. For the logistic regression case of Section 2.1.3,
. For the exponential regression case of Section 2.1.4,
.
In the discrete regression case of Section 2.2, the variances formally satisfy
These can be combined with dispersions to model the variances
as
The dispersions
of
can be modeled in terms of a second set of predictors
for
. Combine these into the
vectors
. Also, let
be a
vector of fixed coefficient parameters
for
and set
When
for all i and j,
is an intercept parameter. This is a generalized linear model for the dispersions using the natural log link function, but it allows for nonlinearity of the log of the dispersions in terms of primary predictors
in the sense that model predictors can equal power transforms of
, that is,
for powers
and for some
.
For the Poisson regression case with offsets
included to convert the model for the counts
to one for rates
, the offsets can be added to the dispersions as well as to the means. Specifically, let the dispersions satisfy
so that
and then the variances
of
satisfy
2.5. Random Coefficients Modeling of Covariance
Let
for
be a third set of predictors. Combine these into the
vectors
and let
be the
matrices with rows equal to
for
. As for the means and the variances, nonlinearity in terms of primary predictors
can be addressed using model predictors equal to power transforms of
, that is,
for powers
and for some
. Let
denote a
vector of random coefficients
. Assume that
so that
When
for all i and j,
is a random intercept.
Assume that
and
are independent of each other so that the covariance matrices
for the outcome vectors
satisfy
where
is the
covariance matrix for the error vectors
and
is the
covariance matrix for the random coefficients vector
. Assume as well that
is a diagonal matrix with diagonal entries
so that the entries
of
are independent of each other and that
is a diagonal matrix with diagonal entries
so that the entries
of
are independent of each other. Note that the matrices
and
can have more complex, non-diagonal structures, but such more general cases are not considered to limit model complexity.
As for dispersions in Section 2.4, the variances
can be modeled as
To model the random coefficients, let
be a
vector of parameters
for
and define the variances
of
as satisfying
As in Section 2.3, let
denote the entries of
for
. The variances
satisfy
while the covariances
for
satisfy
Also as in Section 2.3, the covariance matrix
can be expressed as
where
are the
diagonal matrices with diagonal entries the standard deviations
for
and the correlation matrices
are
matrices with entries
for
so that the diagonal values satisfy
.
The special case of a constant random coefficients model with
for all i and j as well as
combined with a constant variances model with
for all i and j as well as
has constant variances
for
and constant covariances
for
,
. Consequently, it also has constant correlations with value
for
,
. This is the same model as the model based on the directly specified EXCH correlation structure restricted to have constant variances.
For the Poisson regression case with offsets
included to convert the model for the counts
to one for rates
, offsets can be added to the covariances as well as to the means. Specifically, let
denote the
diagonal matrices with diagonal entries
and redefine the covariance matrices for the counts
as
so that the covariance matrices for the
rate vectors
with entries
satisfy
2.6. Extended Linear Mixed Modeling (ELMM)
Let
denote the set of indexes for the full set of outcome vectors
. Also, let
denote the
vector of model parameters with entries denoted as
for
. Define the likelihood function
as the product over
of the terms
satisfying
where
denotes the determinant of the matrix
and
is the usual constant, that is, using the multivariate normal density. Note that in general this is an extended likelihood and only a true likelihood when the outcomes are normally distributed. However, it is maximized like a likelihood to estimate parameters.
Denote the
gradient vector as
and the
Hessian matrix as
The gradient vector and Hessian matrix can be computed using standard formulas. Derivatives of the residuals
are straightforward. Derivatives of the quantities
and
satisfy
for
[11] [12] . Let
denote the solution to the estimating equations
computed from the measurements with indexes in the set S using an adjustment of Newton’s method (e.g., as provided in [7] ).
Inverses
for covariance matrices based on random coefficients can be ill conditioned. This can be addressed with two adjustments to the estimation process. First, compute inverses
from inverses
and
using
with the diagonal entries of
bounded if necessary to avoid overflow. Second, restrict the amount of change
in the estimated parameter vector from
at iteration w to
at iteration
. For example, let
denote the entries of
and
the entries of
and define
that is, the maximum ratio of absolute values of changes in parameter estimates
relative to absolute values of those parameter estimates
(with small absolute parameter estimates increased to the value 1). If
, use
. Otherwise, replace it by
, that is, adjust the change in the parameter estimates proportionally.
ELMM provides for the incorporation of covariance in parameter estimation for general correlated outcomes. Models generated using ELMM can be used to compute LCV scores as addressed in Section 2.7, which can then be used to control the adaptive modeling process as addressed in Section 2.8 to identify nonlinear relationships in predictors for means, variances/dispersions, and random coefficients.
2.7. Likelihood Cross-Validation
Randomly partition the full set S of matched set indexes into K disjoint subsets
, called folds [13] , for
. Define the K-fold likelihood-cross validation (LCV) score as
Informally, the LCV score equals the geometric average of likelihoods for the folds evaluated with parameter estimates computed using outcome measurements with indexes in the complement of those folds. Larger scores indicate improved models. However, models with smaller LCV scores can be preferable if the decrease in the LCV score is not substantial, especially if they are also simpler. This issue is addressed using LCV ratio tests, computed using the
distribution and expressed in terms of a cutoff for a substantial percent decrease (PD) in the LCV score. The cutoff changes with the sample size n. Specifically, if the PD in the LCV score is larger than the cutoff, the model with the larger LCV score provides a substantial improvement over the model with the smaller LCV score or, equivalently, the model with the smaller LCV score is substantially inferior. Otherwise, the model with the smaller LCV score is a competitive alternative to the model with the larger LCV score. A more complete assessment of LCV is provided in [7] .
2.8. Adaptive Regression Modeling
The adaptive regression modeling process for means and variances in the context of directly specified correlation structures of Section 2.3 is covered in detail in [7] and [8] . A brief overview of that process and its extension to handle random coefficients models of Section 2.5 is addressed here. A base model is first expanded by systematically adding in power transforms of primary predictors to either the mean, variance/dispersion, or random coefficient component of the model, whichever generates an expanded model with the best LCV score. Then the final expanded model is contracted by systematically removing extraneous transforms from either the mean, variance/dispersion, or random coefficient component, whichever generates a contracted model with the best LCV score after adjusting the powers of the remaining transforms to improve the LCV score. The expansion process can also consider the addition of geometric combinations, that is, products of power transforms of primary predictors, and power transforms of these products. Geometric combinations generalize standard interactions and so provide for the nonlinear generalization of the concept of moderation [14] . More details on the adaptive modeling process are provided in [7] and [8] . While those references only consider adaptive models for means and variances/dispersions, the modeling process is readily adjusted to also handle random coefficients. Adaptive modeling also applies to hazard regression analyses [15] [16] and to interrupted time series analyses [17] .
3. Example Analyses
Analyses reported in this section are computed using SAS version 9.4 (SAS Institute, Inc., Cary, NC) with a SAS macro available upon request from the author. LCV scores are computed using 10 folds.
The purpose of these analyses is to compare results for adaptive modeling of correlated outcomes using random coefficients to adaptive modeling using directly specified covariance structures. Generated models for these two cases are evaluated using LCV scores and tested for substantial differences using LCV ratio tests.
3.1. Analyses of the Dental Measurement Data
The analyses of this section use data provided by Potthoff and Roy [18] on 108 dental measurements in mm for 11 girls and 17 boys at ages 8, 10, 12, and 14 years old. The cutoff for a substantial percent decrease (PD) for these data is 1.76%. The more general spatial AR1 correlation structure is considered, but this has no effect since the dental measurements are equally spaced. Analyses of these data using directly specified correlation structures are also provided in [7] and [8] .
3.1.1. Adaptive Models Based on Child Age
Table 1 contains a comparison of adaptive models in child age for the IND, EXCH, AR1, and UN directly specified correlation structures and the associated random coefficients model. The random coefficients model generates the best overall 10-fold LCV score but the EXCH correlation structure generates a competitive LCV score with insubstantial PD 0.84%. The estimated constant EXCH correlation is 0.67. In contrast, estimated correlations for the random coefficients model given in Table 2 vary from 0.64 and 0.72. Interestingly, these correlations are stronger the further apart outcome measurements are in time rather than weaker as for AR1 correlations. These analyses require a total of 19.8 minutes.
3.1.2. Adaptive Additive Models Based on Child Age and Child Gender
Table 3 contains a comparison of adaptive additive models in child age and child gender (i.e., without interactions or geometric combinations in child age and child gender) for the EXCH directly specified correlation structure and the associated random coefficients model. The IND, AR1, and UN directly specified correlation structures have not been considered since they generate worse LCV scores than EXCH in Table 1. The random coefficients model generates the better 10-fold LCV score while the EXCH correlation structure generates an LCV score with substantial PD 5.28%. The estimated constant EXCH correlation is 0.77. In contrast, estimated correlations for the random coefficients model given
Table 1. Adaptive models for dental measurements in terms of child age controlling for outcome covariance.
AR1 - autoregressive of order 1; EXCH - exchangeable; IND - independent; LCV - likelihood cross-validation; UN - unstructured. a. The transform 1 corresponds to an intercept.
Table 2. Estimated means, standard deviations, and correlations for the adaptive random coefficients model of dental measurements in terms of child age.
Table 3. Adaptive additive models for dental measurements in terms of child age and child gender controlling for outcome covariance.
EXCH - exchangeable; LCV - likelihood cross-validation. a. The transform 1 corresponds to an intercept; the transform male is the indicator for the child being a boy.
Table 4. Estimated means, standard deviations, and correlations for the adaptive additive random coefficients model of dental measurements in terms of child age and child gender.
in Table 4 are different for girls and boys and vary from 0.84 and 0.92 for girls and from 0.56 and 0.66 for boys. These analyses require a total of 18.0 minutes.
3.1.3. Adaptive Moderation Models Based on Child Age and Child Gender
Table 5 contains a comparison of adaptive moderation models in child age and child gender for the EXCH directly specified correlation structure and the associated random coefficients model. The random coefficients model generates the better 10-fold LCV score while the EXCH correlation structure generates an LCV score with substantial PD 5.01%. For both the EXCH and random coefficients cases, the additive models of Table 3 generate substantial PDs 4.11% and 3.84%, respectively, compared to associated moderation models of Table 5, and so both approaches support the conclusion of a substantial moderation effect. The estimated constant EXCH correlation is 0.72. In contrast, estimated correlations for the random coefficients model given in Table 6 are different for girls and boys with constant value 0.76 for girls and values varying from 0.48 and 0.59 for boys. Means for girls increase from 22.1 mm to 24.0 mm while means are larger for boys and increasing faster from 22.5 mm to 27.4 mm. These analyses require a total of 101.6 minutes. All the analyses of Sections 3.1.1-3.1.3 require a
Table 5. Adaptive moderation models for dental measurements in terms of child age and child gender controlling for outcome covariance correlation.
EXCH - exchangeable; LCV - likelihood cross-validation. a. The transform 1 corresponds to an intercept; the transform male is the indicator for the child being a boy.
Table 6. Estimated means, standard deviations, and correlations for the adaptive moderation random coefficients model of dental measurements in terms of child age and child gender.
total of 139.4 minutes or about 2.3 hours.
3.2. Analyses of the Epilepsy Seizure Data
The analyses of this section use data provided by Thall and Vail [19] on 295 seizure counts for 59 patients, 31 in an intervention group receiving the drug progabide and 28 in a control group receiving a placebo at baseline visit 0 and subsequent visits 1 - 4. The epilepsy counts are converted to epilepsy seizure rates per week using lengths of time in weeks for periods prior to visits (8 weeks prior to visit 0 and 2 weeks prior to visits 1 - 4). The cutoff for a substantial percent decrease (PD) for these data is 0.65%. The more general spatial AR1 correlation structure is considered, but this has no effect since the seizure counts measurements are equally spaced. Estimated variances for models based on directly specified correlation structures are computed using the combination of adaptively generated dispersions and the variance function
. Analyses of these data using directly specified correlation structures are also provided in [7] and [8] . Analyses of other correlated count/rate outcomes are provided in [20] .
3.2.1. Adaptive Models Based on Clinic Visit
Table 7 contains a comparison of adaptive models in clinic visit for the IND, EXCH, AR1, and UN directly specified correlation structures and the associated random coefficients model. The EXCH correlation structure generates the best overall 10-fold LCV score. The random coefficients model is substantially inferior with PD 7.11% in the LCV score. The estimated constant EXCH correlation is 0.87. These analyses require a total of 83.2 minutes.
3.2.2. Adaptive Additive Models Based on Clinic Visit and Treatment Group
Table 8 contains a comparison of adaptive additive models in clinic visit and treatment group for the EXCH directly specified correlation structure and the associated random coefficients model. The IND, AR1, and UN directly specified correlation structures have not been considered since they generate worse LCV scores than EXCH in Table 7. The same models are generated as in Section 3.2.1 indicating that treatment group does not have an additive effect on seizure rates. These analyses require a total of 25.5 minutes.
3.2.3. Adaptive Moderation Models Based on Clinic Visit and Treatment Group
Table 9 contains a comparison of adaptive moderation models in clinic visit and
Table 7. Adaptive models for epilepsy seizure rates in terms of clinic visit controlling for outcome covariance.
AR1 - autoregressive of order 1; EXCH - exchangeable; IND - independent; LCV - likelihood cross-validation; UN - unstructured. a. The transform 1 corresponds to an intercept.
Table 8. Adaptive additive models for epilepsy seizure rates in terms of clinic visit and treatment group controlling for outcome covariance.
EXCH - exchangeable; LCV - likelihood cross-validation. a. The transform 1 corresponds to an intercept; the transform int is the indicator for being in the intervention group (but it is not included in the models).
Table 9. Adaptive moderation models for epilepsy seizure rates in terms of clinic visit and treatment group controlling for outcome covariance.
EXCH - exchangeable; LCV - likelihood cross-validation. a. The transform 1 corresponds to an intercept; the transform int is the indicator for being in the intervention group.
treatment group for the EXCH directly specified correlation structure and the associated random coefficients model. The same EXCH model is generated as in Sections 3.2.1-3.2.2, but an improved random coefficients model is generated. However, this revised model is substantially inferior to the EXCH model with PD 2.91%. These results indicate that treatment group does not have a substantial moderation effect on seizure rates. These analyses require a total of 56.4 minutes. Under the selected EXCH model, the mean seizure rates are constant with estimated value 2.6 while the standard deviations have estimated values 4.6, 7.6, 8.1, 8.3, and 5.5 at visits 0 - 4, respectively. All the analyses of Sections 3.2.1-3.2.3 require a total of 165.1 minutes or about 2.8 hours.
3.3. Analyses of the Dichotomous Respiratory Status Data
The analyses of this section use data provided by Koch et al. [21] on 555 dichotomous respiratory status levels classified as 0 for poor and 1 for good for 54 patients on active treatment and 57 on a placebo at baseline visit 0 and subsequent visits 1 - 4. The cutoff for a substantial percent decrease (PD) for these data is 0.35%. The more general spatial AR1 correlation structure is considered, but this has no effect since the dichotomous respiratory levels are equally spaced. Estimated variances for models based on directly specified correlation structures are computed using the combination of adaptively generated dispersions and the variance function
. Analyses of these data using directly specified correlation structures are also provided in [7] and [8] .
3.3.1. Adaptive Models Based on Clinic Visit
Table 10 contains a comparison of adaptive models in clinic visit for the IND, EXCH, AR1, and UN directly specified correlation structures and the associated random coefficients model. The EXCH correlation structure generates the best overall 10-fold LCV score. The random coefficients model is about the same model. The estimated constant EXCH correlation is 0.48. These analyses require a total of 35.2 minutes.
Table 10. Adaptive models for dichotomous respiratory status levels in terms of clinic visit controlling for outcome covariance.
AR1 - autoregressive of order 1; EXCH - exchangeable; IND - independent; LCV - likelihood cross-validation; UN - unstructured. a. The transform 1 corresponds to an intercept.
Table 11. Adaptive additive models for dichotomous respiratory status levels in terms of clinic visit and treatment group controlling for outcome covariance.
EXCH - exchangeable; LCV - likelihood cross-validation. a. The transform 1 corresponds to an intercept; the transform int is the indicator for being in the intervention group.
3.3.2. Adaptive Additive Models Based on Clinic Visit and Treatment Group
Table 11 contains a comparison of adaptive additive models in clinic visit and treatment group for the EXCH directly specified correlation structure and the associated random coefficients model. The IND, AR1, and UN directly specified correlation structures have not been considered since they generate worse LCV scores than EXCH in Table 10. The random coefficients model has a better LCV score while the EXCH model generates a substantial PD 0.36, just a little over the cutoff of 0.35%. The estimated constant EXCH correlation is 0.47 while the estimated random coefficients model is also constant with almost the same value 0.46. This raises the question of whether there actually is a substantial additive effect to treatment group as indicated by the random coefficients model. Note that the dependence of the means on visit is the same for both models. Adding the indicator for treatment group to the means for the EXCH model generates a better LCV score 0.56748. An adaptive contraction of this EXCH model first removes the intercept from the means generating an LCV score 0.56570, and then removes the indicator for treatment group generating a LCV score 0.56524 as well as means depending of only visit−0.2, dispersions depending on only an intercept, estimated correlation 0.48, and requiring 0.4 minutes. These results indicate that the random coefficients model of Table 11 can be reduced to a model based only on visit since it is a special case of an EXCH model. These analyses require a total of 113.3 minutes with almost all of it accounted for by the random coefficients model.
3.3.3. Adaptive Moderation Models Based on Clinic Visit and Treatment Group
Table 12 contains a comparison of adaptive moderation models in clinic visit and treatment group for the EXCH directly specified correlation structure and the associated random coefficients model. Essentially equivalent models are generated and both models provide substantial improvements over corresponding additive models of Table 11 with PDs 1.43% and 1.01%, respectively. Consequently, active treatment moderates the effect of visit on dichotomous respiratory status. These analyses require 220.8 minutes. The EXCH model generates the better LCV score in much less time, and so is preferable. Under this EXCH model, the estimated correlation is 0.47. For participants in the control group on a placebo, the probability is constant with estimated value 0.5 over all clinic visits. In contrast, for participants in the intervention group on active treatment, the probability has estimated value starting at 0.50 at baseline visit 0, increases to 0.71 by visit 1, and decreases a little to 0.69 by visit 4. All the analyses of Sections 3.3.1-3.3.3 require a total of 369.3 minutes or about 6.2 hours.
3.4. Analyses of the Blood Lead Level Data
The analyses of this section use data from the Treatment of Lead-exposed Children (TLC) study [22] on 400 blood lead levels in mg/dL for 50 children on treatment with the chelating agent succimer and 50 on a placebo at baseline week 0 and subsequent weeks 1, 4, and 6. The cutoff for a substantial percent decrease (PD) for these data is 0.48%. The more general spatial AR1 correlation structure is important to use since the blood lead levels are not equally spaced. Estimated variances for models based on directly specified correlation structures are computed using the combination of adaptively generated dispersions and the variance function
. Analyses of these data using directly specified
Table 12. Adaptive moderation models for dichotomous respiratory status levels in terms of clinic visit and treatment group controlling for outcome covariance.
EXCH - exchangeable; LCV - likelihood cross-validation. a. The transform 1 corresponds to an intercept; the transform int is the indicator for being in the intervention group.
correlation structures are also provided in [7] .
3.4.1. Adaptive Models Based on Time in Weeks
Table 13 contains a comparison of adaptive models in week for the IND, EXCH, AR1, and UN directly specified correlation structures and the associated random coefficients model. The random coefficients model generates the best overall 10-fold LCV score. The UN correlation structure generates the next best LCV score but with substantial PD 1.55%. Estimated correlations for the random coefficients model given in Table 14 vary from 0.52 and 0.84. These analyses require a total of 262.0 minutes.
3.4.2. Adaptive Additive Models Based on Time in Weeks and Treatment Group
Table 15 contains a comparison of adaptive additive models in week and treatment group for the UN directly specified correlation structure and the associated random coefficients model. The IND, EC, and AR1 directly specified correlation structures have not been considered since they generate worse LCV scores than UN in Table 13. The random coefficients model generates the better 10-fold LCV score while the UN model is substantially inferior with PD 2.35% in the
Table 13. Adaptive models for blood lead levels in terms of time in week controlling for outcome covariance.
AR1 - autoregressive of order 1; EXCH - exchangeable; IND - independent; LCV - likelihood cross-validation; UN - unstructured. a. The transform 1 corresponds to an intercept.
Table 14. Estimated means, standard deviations, and correlations for the adaptive random coefficients model of blood lead levels in terms of time in weeks.
Table 15. Adaptive additive models for blood lead levels in terms of time in week and treatment group controlling for outcome covariance.
LCV - likelihood cross-validation; UN - unstructured. a. The transform 1 corresponds to an intercept; the transform int corresponds to the indicator for being in the intervention group.
Table 16. Estimated means, standard deviations, and correlations for the adaptive additive random coefficients model of blood lead levels in terms of time in weeks and treatment group.
LCV score. Estimated correlations for the random coefficients model given in Table 16 differ by treatment group with values varying from 0.76 to 0.84 for participants on a placebo and values varying from 0.28 and 0.32 for participants under succimer treatment. These analyses require a total of 315.7 minutes.
3.4.3. Adaptive Moderation Models Based on Time in Weeks and Treatment Group
Table 17 contains a comparison of adaptive moderation models in week and treatment group for the UN directly specified correlation structure and the associated random coefficients model. The random coefficients model generates the better 10-fold LCV score while the UN model is substantially inferior with PD 2.29% in the LCV score. For both the UN and random coefficients cases, the additive models of Table 15 generate substantial PDs 11.29% and 11.23%, respectively, compared to associated moderation models of Table 17, and so both approaches support the conclusion of a substantial moderation effect. Estimated correlations for the random coefficients model given in Table 18 differ by treatment group with values varying from 0.75 to 0.83 for participants on a placebo and values varying from 0.33 and 0.55 for participants under succimer treatment. For participants in the control group on a placebo, the estimated mean starts at 26.1 mg/dL at baseline week 0 and decreases a little to 24.0 mg/dL by week 6. In contrast, for participants in the intervention group on succimer treatment, the estimated mean also starts at 26.1 mg/dL at baseline week 0, decreases to 13.5 mg/dL by week 1, and increases to 20.7 mg/dL by week 6. These analyses require a total of 522.6 minutes. All the analyses of Sections 3.4.1-3.4.3 require a total of 1100.3 minutes or about 18.3 hours.
3.5. Analyses of the Trichotomous Respiratory Status Data
The analyses of this section use data provided by Koch et al. [21] on 555
Table 17. Adaptive moderation models for blood lead levels in terms of time in week and treatment group controlling for outcome covariance.
LCV - likelihood cross-validation; UN - unstructured. a. The transform 1 corresponds to an intercept; the transform int corresponds to the indicator for being in the intervention group.
Table 18. Estimated means, standard deviations, and correlations for the adaptive moderation random coefficients model of blood lead levels in terms of time in weeks and treatment group.
trichotomous respiratory status levels classified as 0 for poor, 1 for good, and 2 for excellent for 54 patients on active treatment and 57 on a placebo at baseline visit 0 and subsequent visits 1-4. The dichotomous respiratory status outcome of Section 3.3 is related; it has the good and excellent categories combined into a good category. The cutoff for a substantial percent decrease (PD) for these data is 0.35%. The more general spatial AR1 correlation structure is considered, but this has no effect since the trichotomous respiratory levels are equally spaced. Estimated variances for models based on directly specified correlation structures are computed using the combination of adaptively generated dispersions and the standard variance terms
. Analyses of these data using directly specified correlation structures are also provided in [7] . Analyses of other correlated discrete outcomes are provided in [10] .
3.5.1. Adaptive Models Based on Clinic Visit
Table 19 contains a comparison of adaptive models in clinic visit for the IND, EXCH, AR1, and UN directly specified correlation structures and the associated random coefficients model. The random coefficients model generates the best overall 10-fold LCV score. The UN directly specified correlation structure generates the next best LCV score, but with a substantial PD 0.59%. Estimated correlations for the random coefficients model given in Table 20 vary from 0.35 and 0.67. These analyses require a total of 43.9 minutes.
3.5.2. Adaptive Additive Models Based on Clinic Visit and Treatment Group
Table 21 contains a comparison of adaptive additive models in clinic visit and treatment group for the UN directly specified correlation structure and the associated random coefficients model. The IND, EC, and AR1 directly specified correlation structures have not been considered since they generate worse LCV scores than UN in Table 19. The same models are generated as in Table 19. These analyses require a total of 110.3 minutes.
Table 19. Adaptive models for trichotomous respiratory status levels in terms of clinic visit controlling for outcome covariance.
AR1 - autoregressive of order 1; EXCH - exchangeable; IND - independent; LCV - likelihood cross-validation; UN - unstructured. a. The transform 1 corresponds to an intercept.
3.5.3. Adaptive Moderation Models Based on Clinic Visit and Treatment Group
Table 22 contains a comparison of adaptive moderation models in clinic visit and treatment group for the UN directly specified correlation structure and the associated random coefficients model. The random coefficients model generates the better 10-fold LCV score. The UN model is substantially inferior with PD 0.66% in the LCV score. For both the UN and random coefficients cases, the
Table 20. Estimated means, standard deviations, and correlations for the adaptive random coefficients model of trichotomous respiratory status in terms of clinic visit.
Table 21. Adaptive additive models for trichotomous respiratory status levels in terms of clinic visit and treatment group controlling for outcome covariance.
LCV - likelihood cross-validation; UN - unstructured. a. The transform 1 corresponds to an intercept; the transform int corresponds to the indicator for being in the intervention group (but it is not included in the models)
Table 22. Adaptive moderation models for trichotomous respiratory status levels in terms of clinic visit and treatment group controlling for outcome covariance.
LCV - likelihood cross-validation; UN - unstructured. a. The transform 1 corresponds to an intercept; the transform int corresponds to the indicator for being in the intervention group.
additive models of Table 21 generate substantial PDs 1.21% and 1.28%, respectively, compared to associated moderation models of Table 22, and so both approaches support the conclusion of a substantial moderation effect. Estimated correlations for the random coefficients model given in Table 23 are the same for both treatment groups with values varying from 0.36 to 0.66. For participants in the control group on a placebo, the estimated mean starts at 0.90 at baseline visit 0, increases to 1.15 by visit 1 and then decreases a little to 1.12 by visit 4. In contrast, for participants in the intervention group on active treatment, the estimated mean also starts at 0.90 at baseline visit 0, increases more to 1.39 by week 2, and decreases somewhat to 1.27 by visit 4, and so participants on active treatment are more likely to have post-baseline excellent status with value 2. These analyses require a total of 1658.8 minutes. All analyses of Sections 3.5.1-3.5.3 require a total of 1813.0 minutes or about 30.2 hours.
4. Discussion
Methods are formulated for conducting adaptive analyses of general correlated outcomes using heuristic search controlled by likelihood cross-validation (LCV) scores using fractional polynomial models for means, variances/dispersions, and correlations for those outcomes. Means and variances/dispersions are modeled using generalized linear models based on fixed coefficients (or effects). Correlations are modeled using either directly specified correlations or random coefficients (or effects). Parameters are estimated using extended linear mixed modeling based on maximizing the multivariate normal density evaluated using residuals and covariances for general correlated outcomes, not just normally
Table 23. Estimated means, standard deviations, and correlations for the adaptive moderation random coefficients model of trichotomous respiratory status in terms of clinic visit and treatment group.
distributed outcomes. Associated estimating equations are solved using an adjusted version of Newton’s method. Cases considered include linear, Poisson, logistic, exponential, and discrete regression for modeling continuous, count/rate, dichotomous, positive continuous, and discrete numeric outcomes treated as normally, Poisson, Bernoulli, exponentially, and discrete numerically distributed, respectively. These methods extend to any type of correlated outcomes for which variances can be specified, for example, the inverse Gaussian case [5] . However, the case of multiple event times needs to account for censoring, and so correlation in this case is more readily addressed using conditional modeling [16] .
Example analyses are provided for each of these five outcome cases comparing results for the directly specified independent (IND), exchangeable (EXCH), autoregressive of order 1 (AR1), and unstructured (UN) correlation structures to results for random coefficients. LCV ratio tests are used to identify substantial differences in LCV scores for models. Each set of example analyses considers a correlated outcome observed over a continuous time variable for two groups of individuals. Analyses are first conducted using the continuous time variable for all four directly specified correlation structures and for random coefficients. Only the directly specified correlation structure generating the best LCV score for this first analysis is considered in subsequent analyses. Results for this directly specified correlation structure are compared to those for random coefficients in terms of both the continuous time variable and an indicator for being in one of the two groups. These latter analyses allow for additive effects of time and group as well as for moderation of time effects by group considering geometric combinations generalizing standard interactions.
Random coefficients models considered so far assume independent random coefficients combined with independent errors. Further work is needed to investigate more complex situations with more general covariance structures for both random coefficients and errors.
4.1. Summary of Example Analysis Results
In analyses of dental measurements for girls and boys at ages 8, 10, 12, and 14 years old, the EXCH correlation structure is the best directly specified correlation structure. Random coefficients outperform this directly specified correlation structure in all three analyses. The improvement is not substantial for the model in age only, but is substantial for both the additive and moderation models. Child gender substantially moderates the effect of age on dental measurements.
In analyses of epilepsy seizure rates per week for individuals taking a placebo and the drug progabide at baseline visit 0 and subsequent visits 1 - 4, the EXCH correlation structure is the best directly specified correlation structure. It substantially outperforms random coefficients in all three analyses. Epilepsy seizure rates depend on visit, but treatment group does not have either an additive or moderation effect on epilepsy seizure rates.
In analyses of dichotomous respiratory status coded as 0 for poor and 1 for good for individuals taking a placebo and on active treatment at baseline visit 0 and subsequent visits 1 - 4, the EXCH correlation structure is the best directly specified correlation structure. It generates comparable results to random coefficients in all three analyses although the additive analysis results are not straightforward. Being on active treatment substantially moderates the effect of visit on dichotomous respiratory status.
In analyses of blood lead levels for children taking a placebo and the chelating agent succimer at baseline week 0 and subsequent weeks 1, 4, and 6, the UN correlation structure is the best directly specified correlation structure. Random coefficients substantially outperform this directly specified correlation structure in all three analyses. Being on succimer substantially moderates the effect of week on blood lead levels.
In analyses of trichotomous respiratory status coded as 0 for poor, 1 for good, and 2 for excellent for individuals taking a placebo and on active treatment at baseline visit 0 and subsequent visits 1 - 4, the UN correlation structure is the best directly specified correlation structure. Random coefficients substantially outperform this directly specified correlation structure in all three analyses. Being on active treatment substantially moderates the effect of visit on trichotomous respiratory status.
Clock times are reported for all these example analyses. Not surprisingly, analyses considering random coefficients can require more time and sometime much more time than analyses considering directly specified correlation since they involve heuristic search through three sets of fractional polynomials rather than just two.
4.2. Conclusions
Random coefficients models for correlation are important to consider since they can provide substantial improvements for modeling of correlated outcomes. However, directly specified correlation structures are also important to consider since they can provide competitive or even substantially better results than random coefficients and usually require less time and sometimes much less time.
The results of the example analyses indicate that adaptive random coefficients modeling can generate effective models and so can provide substantive benefits for statistical practice in the context of correlated outcomes. Researchers often have to analyze correlated outcomes. This is typically addressed by arbitrarily choosing one of exchangeable correlations as used in standard repeated measures, autoregressive correlations as used in time series modeling, or unstructured correlations as used in multivariate analysis of variance and multivariate regression. ELMM combined with LCV can be used to identify which of these directly specified correlation structures is most appropriate for the data under analysis. However, it is highly important to consider random coefficients as well. Such evaluations can be conducted using specific models for the means based on theoretical considerations as well as more general adaptive models.
Acknowledgements
Prior development of the SAS macro used to conduct reported example analyses was supported in part by grants R01 AI57043 from the National Institute of Allergy and Infectious Diseases and R03 MH086132 from the National Institute of Mental Health. Work reported on in this article was not supported by external funding.