Meta-Analysis of Multi-Arm Trials Using Binomial Approach

Abstract

Most meta-analysis has concentrated on combining of treatment effect measures based on comparisons of two treatments. Meta-analysis of multi-arm trials is a key component of submission to summarize evidence from all possible studies. In this paper, an exact binomial model is proposed by using logistic regression model to compare different treatment in multi-arm trials. Two approaches such as unconditional maximum likelihood and conditional maximum likelihood have been determined and compared for the logistic regression model. The proposed models are performed using the data from 27 randomized clinical trials (RCTs) which determine the efficacy of antiplatelet therapy in reduction venous thrombosis and pulmonary embolism.

Share and Cite:

Chootrakool, H. and Treewai, P. (2022) Meta-Analysis of Multi-Arm Trials Using Binomial Approach. Open Journal of Statistics, 12, 15-32. doi: 10.4236/ojs.2022.121002.

1. Introduction

There has been a massive growth in the number of randomized clinical trials (RCTs) since the first RCT was introduced in the well-known streptomycin trial in 1946 [1] . In making some of this information more readily available, an attempt is made to pull together the existing evidence in a form that can be used by researchers or statisticians; this is called systematic review. The aim of systematic reviews is to identify and summarize the findings of all possible studies addressing the clinical question of the review. Systematic review reduces the large quantities of information to a manageable size. An important aspect of most reviews is the quantitative synthesis of results; thus, meta-analysis is the statistical part of systematic review. The main purpose of meta-analysis is to increase the precision of the conclusions of a review. With statistical perspective, it is able to detect treatment effects with greater power and estimate these effects with greater precision than any single study. A meta-analysis from systematic reviews of Antiplatelet Trialists’ Collaboration [2] was applied in the paper.

Most meta-analysis has focused on summarizing treatment effect measures based on the comparison of two treatments or called arms, sometimes also called interventions or exposures. In this comparison, two groups of individual studies are exposed to two different treatments. Standard two-arm RCTs are frequently used in clinical research due in part to its relative simplicity of design and interpretation. At its most basic, one power, one significance level and one magnitude of difference are analyzed for two-arm comparisons. Conclusions are straightforward: either the two arms are shown to be different or they are not. The implementation for the model is not complicated. When more than two arms are included in meta-analysis, complexity ensues. These types of dataset are called multi-arm trials [3] [4] [5] although some authors call it mixed treatment comparison (MTC) [6] [7] , or multiple treatment comparison [4] [8] . Additionally, some authors call network meta-analysis for indirect head-to-head evidence for multiple evidences from RCTs [9] - [14] .

For multi-arm trials data, Chootrakool and Shi [3] introduced the normal approximation model using an empirical logistic transform requiring a large number of individual observations and the probability of being the case to be not too near zero or one. If the number of individual observations is small, the normal approximation model is not suitable. The number of samples in a single study should usually be larger than 20 [4] . In this paper, an exact binomial model is introduced to fit the binary multi-arm trials data. There are two alternative maximum likelihood approaches that can be used to make inferences for the unknown parameters in the logistic regression model. These are unconditional method and conditional method. The logistic regression model has become increasingly popular with the easy availability of appropriate computer routines. Many authors have described maximum likelihood estimation procedures which turn out to be iterative [15] . There have been a large number of studies about unconditional and conditional methods, for example in Tritchler [16] , Sartori [17] , Lee et al. [18] and Caterina et al. [19] .

Most existing methods for meta-analysis of multi arm trials use the logistic regression model with the unconditional approach. Lu and Ades [20] introduced the Bayesian hierarchical model for multi-arm trials using the unconditional method to estimate unknown parameters. Weber et al. [21] used random effect model using the unconditional maximum likelihood method for the meta-analysis. They conducted a simulation study comparing two zero-cell corrections under the ordinary random-effects model. Logistic regression model is introduced using the conditional approach in the paper.

There has the direct and indirect comparison for RCTs in a meta-analysis [9] [20] , this has led to network meta-analysis. Seide et al. [12] perform simulation study for sparse network of trial including multi-arm trials. They compared Bayesian and Frequentist methods in random effects network meta-analysis. Jenkins et al. [13] introduced methods for inclusion of evidence in network meta-analysis. More applications of network meta-analysis can be found in Greco et al. [22] , Wang [23] and Zhao et al. [14] .

The structure of the paper is arranged as follows. The data structure of multi-arm trials is firstly introduced in Section 2. Fitting the logistic regression model is proposed in the next section. Unconditional maximum likelihood approach for the model including the standard error of maximum likelihood estimators (MLEs) are described in Section 4. Similarly, conditional maximum likelihood approach for the model is presented in Section 5. In Section 6, the logistic regression model is illustrated with the unconditional and conditional approaches with the data. The final section is discussion and conclusion including the advantages and the limitations of the two approaches.

2. The Data Structure of Multi-Arm Trials

Let the indices i = 1 , , M and j = 0 , 1 , , K stand for the studies and the treatments respectively, where the index j = 0 stands for the control group. To make multi-arm comparisons, suppose that there is M RCTs comparing K + 1 treatments. For the ith study, let r i j represent the number of cases on treatment j and let n i j be the total number in the corresponding group. Let π i j be the probability of being the case for a patient received the treatment j in the ith study. The r i j has a binomial distribution

r i j ~ B i n ( π i j , n i j ) ; i = 1 , , M ; and j = 0 , 1 , , K .

A data structure of multi-arm trials shall be defined by introducing an index set J i comprising the treatments involved in the ith study. The data structure is shown as

D = { ( r i j , n i j ) : i = 1 , , M ; j J i } .

Some studies of meta-analysis might not have all the treatments available. For example, some studies might compare fewer than K + 1 treatments, or some baseline treatments may be different, or both cases could occur simultaneously. The data structure is analogous to an incomplete-blocks design. For example, from Antiplatelet data, the 8th - 17th studies are composed of two treatment comparison unlike the 1st - 7th studies comparing three treatments.

Let b(i) denote baseline treatment or called study-specific reference treatment [4] corresponding to the ith study, which can be the control group or any other treatments. As mentioned earlier about indirect comparison, in a situation that the treatments in some studies cannot be compared directly to the control group, we need to use evidence from the external studies. To make it clear, if b(i) = 0 then the direct comparison is involved in this study. Conversely, if b ( i ) 0 then the study makes indirect comparison. Let J ( i ) = J i \ { b ( i ) } represent the set of treatments involved in the ith study but excluding the baseline treatment b(i). Let k i and k i + 1 denote the number of treatments in the sets J i and J i respectively. The r i b ( i ) and r i j are binomially distributed, respectively as B i n ( n i b ( i ) , π i b ( i ) ) and B i n ( n i j , π i j ) for i = 1 , , M and j J ( i ) . Let D be set of studies that make the direct comparison of treatment thus let I be set of indirect comparison of treatment.

The multi-arm trials data is consisted of 27 RCTs from systematic reviews of Antiplatelet Trialists’ Collaboration [2] in total as shown in Table 1. The studies compare three treatments: aspirin plus dipyridamole (A), aspirin alone (B) and control group (C), where seven trials compare aspirin plus dipyridamole, aspirin alone and control group (i.e. comparing all A, B and C), ten trials compare aspirin plus dipyridamole and control group (i.e. comparing A and C) and ten trials compare aspirin alone and control group (i.e. comparing B and C) The “event” in Table 1 represents the number of patients in whom deep venous thrombosis was detected by systematic fibrinogen scans or venography, or both, after general and orthopedic surgery and in high risk medical patients. The “total” represents the number of patients controlled in each group.

3. Fitting the Logistic Regression Model

This section illustrates how to fit the logistic regression model to the binary data related to multi-arm trials including the direct and indirect comparisons. Logistic regression is a regression model for a binomially distributed response/dependent variable. It is useful for modelling the probability of an event occurring as a function of other factors. Logistic regression is part of a category of statistical models called generalized linear models and uses the logit as its link function. Logistic regression can be used only with two types of dependent variables: one is a categorical dependent variable that has exactly two categories (i.e. a binary or dichotomous variable). The other is a continuous dependent variable that has values in the range 0 to 1 representing the probability values or the proportions. The names for logistic regression used in various other application areas are logistic model or logit model. Logistic regression is similar to linear regression. In linear regression, ultimate objective for the study may be either estimation of the coefficient values, or prediction of the response value. One significant difference between logistic and linear models is that the linear model has a continuous response variable and the logistic model uses a binary or dichotomous response. Logistic regression models for the ith study can be defined by

log ( π i b ( i ) 1 π i b ( i ) ) = α i (1)

log ( π i j 1 π i j ) = α i + δ i , b ( i ) j , j J ( i ) (2)

where α i and δ i , b ( i ) j are trial effect and treatment effect, they will be detailed next.

Table 1. RCTs of Antiplatelet data.

3.1. Trial Effect

Two assumptions are usually made about the trial effect α i . The first one is that the trial effects are assumed to be study-level effects, which means that the α i ’s are different parameters and are treated as nuisance parameters in the model. The M different unknown parameters are needed to include in the model. The second one is that the model may be assumed for the α i ’s. A special case is to assume that the trial effect is a fixed effect, defined by α i = α 0 . Conversely, it may be assumed to be a random effect, given by α i ~ N ( μ α 0 , τ α 0 2 ) , where μ α 0 is the overall mean of the trial effect and τ α 0 2 measures the magnitude of the variation between the studies. Most of the existing methods therefore used the first assumption. However, the number of unknown parameters is the same as the number of studies if the first assumption of the trial effect is made. It will result in some theoretical and computational problems. The accuracy of the estimation depends on the sample size of each study not the overall sample size of the pool in the meta-analysis. Therefore, the α i ’s are assumed to be different in the paper.

3.2. Treatment Effect

The treatment effect δ i , b ( i ) j can be direct treatment effect if i D or indirect treatment effect if i I : they are defined as follows.

δ i , b ( i ) j = { δ i , 0 j N ( μ 0 j , τ 0 j 2 ) if i D , δ i , 0 j δ i , 0 b ( i ) ~ N ( μ 0 j μ 0 b ( i ) , τ 0 j 2 + τ 0 b ( i ) 2 2 ρ j b ( i ) τ 0 j τ 0 b ( i ) ) if i I .

where ρ j b ( i ) is the correlation coefficient between δ i , 0 j and δ i , 0 b ( i ) . For example, if the data, includes 3 treatments such as treatment A, B, C and C stands for control group. The baseline treatment for some studies exclude the control group C (such as A and B) thus the indirect treatment effect can be written as

δ i , A B = δ i , A C δ i , B C N ( μ A C μ B C , τ A C 2 + τ B C 2 2 ρ A B τ A C τ B C ) , i = I .

Next, we shall consider the treatment effect in a matrix form. Let δ i , 0 and μ 0 represent the vectors of ( δ i , 0 j , j = 1 , , K ) t and ( μ 0 j , j = 1 , , K ) t respectively where the superscript t stands for matrix transposition and let Ω 0 represent the K × K covariance matrix. The model δ i , 0 can be written as

δ i , 0 ~ M V N ( μ 0 , Ω 0 ) .

This is called the basic model of random treatment effect. Let F i j be the index vector of length K consisting of elements 0 and 1 corresponding to δ i , b ( i ) j , given by

F i j = { ( 0 , , 0 b ( i ) th , 1 j th , , 0 ) if i D , ( 0 , , 1 b ( i ) th , , 1 j th , , 0 ) if i I .

Now, the random effect δ i , b ( i ) j can be written in the form of δ i , 0 and F i j :

δ i , b ( i ) j = F i j δ i , 0 ~ N ( F i j μ 0 , F i j Ω 0 F i j t ) .

As before, the covariance between the treatment effects δ i , b ( i ) j and δ i , b ( i ) k for j k and j , k J ( i ) may be dependent. For the ith study, let F i be the following k i × K matrix

F i = ( F i j ) k i × K for j J ( i ) ,

Let δ i denote the vector ( δ i , b ( i ) j , j J ( i ) ) t then we have

δ i = F i δ i , 0 ~ M V N ( μ i , Ω i ) ,

where μ i = F i μ 0 and Ω i = F i Ω 0 F i t .

The α i are assumed to be different and the δ i , b ( i ) j are assumed to be a random effect as δ i . The models in the Equation (1) and Equation (2) can be used for both treatment comparisons. From model in Equation (2), log π i j / 1 π i j is called the logistic transform of probability π i j or alternatively log odds π i j or logit π i j . Having considered the properties of logit π i j , the term π i j / 1 π i j is the odds of an unsuccessful outcome from a patient treated with treatment j and so logit π i j is the log odds of being the case. It is easily seen that a value of π i j in the range (0, 1) corresponds to a value of logit π i j in (−∞, ∞). As π i j approaches to 0, logit π i j approaches to −∞; as π i j approaches to 1, logit π i j approaches to ∞ and for π i j = 0.5 , logit π i j = 0 . After some rearrangement, the logistic regression models in Equation (1) and Equation (2) have respectively equivalent formulations as

π i b ( i ) = ( e α i 1 + e α i ) and π i j = ( e α i + δ i , b ( i ) j 1 + e α i + δ i , b ( i ) j ) . (3)

There are two alternative maximum likelihood (ML) approaches, the unconditional and conditional approaches that can be used to estimate the unknown parameters in a logistic regression model.

4. Unconditional Maximum Likelihood Approach

Generally, unconditional ML estimation is preferred if the number of parameters in the model is small relative to the number of studies in a meta-analysis [25] .

4.1. Probability Functions

To demonstrate the unconditional ML estimation, let p ( r i b ( i ) | α i ) and p ( r i j | α i , δ i , b ( i ) j ) denote the probability functions associated with the distributions of r i b ( i ) | α i and r i j | α i , δ i , b ( i ) j respectively for i = 1 , , M and j J ( i ) , defined as follows.

For the baseline treatment,

p ( r i b ( i ) | α i ) = ( n i b ( i ) r i b ( i ) ) π i b ( i ) r i b ( i ) ( 1 π i b ( i ) ) n i b ( i ) r i b ( i ) = ( n i b ( i ) r i b ( i ) ) e α i r i b ( i ) ( 1 + e α i ) n i b ( i ) (4)

For the treatments j, j J ( i )

p ( r i j | α i , δ i , b ( i ) j ) = ( n i j r i j ) π i j r i j ( 1 π i j ) n i j r i j = ( n i j r i j ) e ( α i + δ i , b ( i ) j ) r i j ( 1 + e α i + δ i , b ( i ) j ) n i j . (5)

The combination in Equation (4) represents the number of possible combinations of n i b ( i ) taken r i b ( i ) at a time. The π i b ( i ) in the middle term of Equation (4) is substituted from Equation (3) and ( 1 π i b ( i ) ) becomes 1 / 1 + e α i . The combination in Equation (5) can be analyzed in the same way.

4.2. The Unconditional Likelihood

From the probability functions p ( r i b ( i ) | α i ) and p ( r i j | α i , δ i , b ( i ) j ) , the trial effects α i ’s are study-level effects. They are assumed to be different and also included in both probability functions. While the δ i , b ( i ) j is a random effect, thus the p ( r i j | α i , δ i , b ( i ) j ) involves the vector of random effects, δ i . The standard method of handling a probability function which involves random variables that have a fully specified probability is to integrate the probability function with respect to the distribution of those variables. To deal with the random effects δ i , let r ( i ) be the vector ( r i j , j J ( i ) ) t . The probability function p ( r i j | δ i ) will be integrated with respect to δ i . The p ( r ( i ) ) contains k i integrals, which is the number of treatments in the set J ( i ) , and is given by

p ( r ( i ) ) = δ i j J ( i ) p ( r i j | δ i ) ϕ ( δ i ; μ i , Ω i ) d δ i , (6)

where ϕ ( δ i ; μ i , Ω i ) is the probability density function of the normal distribution with mean μ i and covariance Ω i given by

ϕ ( δ i ; μ i , Ω i ) = 1 ( 2 π ) k i / 2 | Ω i | 1 / 2 e ( δ i μ i ) Ω i 1 ( δ i μ i ) / 2 (7)

The integral in Equation (6) can be calculated numerically; one way to do it is to use the Gauss-Hermite method. To apply Gauss-Hermite approximation, the probability function p ( r ( i ) ) for the ith study can be estimated by

p ( r ( i ) ) π k i / 2 n 1 = 1 l 1 w n 1 ( 1 ) n k i = 1 l k i w n k i ( k i ) { j J ( i ) ( n i j r i j ) e ( α i + ( μ i + 2 Ω i 1 / 2 d i , n ) ) r i j ( 1 + e α i + ( μ i + 2 Ω i 1 / 2 d i , n ) ) n i j } , (8)

where the sampling nodes are at μ i + 2 Ω i 1 / 2 d i , n and d i , n = ( x n 1 ( 1 ) , , x n k i ( k i ) ) . The vector d i , n depends on the number k i , which is the number of treatments comprising in the ith study. The resulting function p ( r ( i ) ) does not depend on the δ i . For most practical purposes, l k i need not be greater than 20, although some authors suggest using even smaller values [26] .

As before, let θ be the collection of all unknown parameters for the meta-analysis including all trial effects ( α i , , α M ) , μ and Ω and let r i be the vector ( r i j , j J i ) . The likelihood function for the ith study can be written as

L ( θ | r i ) = j J i p ( r i j ) = p ( r i b ( i ) | α i ) p ( r ( i ) ) . (9)

where p ( r i b ( i ) | α i ) and p ( r ( i ) ) are given in Equation (4) and Equation (8) respectively. Let l u , i = log L ( θ | r i ) standing for the unconditional log-likelihood function of the logistic regression model for the ith study. The log-likelihood function of θ for the logistic regression models is given by

l u ( θ ) = i = 1 M l u , i . (10)

The number of α i ’s is the same as the number of studies. The computation of MLEs may be quite unstable if the number of studies is large while the sample size of each study is small. As discussed earlier, this may also result in a biased or misleading estimate. A conditional approach is suggested to eliminate all nuisance parameters.

4.3. Asymptotic Variance-Covariance Matrix

This section shows how to calculate the standard errors for the MLEs of the logistic regression model using the unconditional approach. Since there are random effects in the model, some integrals are involved in the likelihood function. The unconditional log-likelihood function l u ( θ ) in Equation (10) can be written as

l u ( θ ) = i = 1 M log p ( r i b ( i ) ) + i = 1 M log p ( r ( i ) ) = i = 1 M log p ( r i b ( i ) ) + i = 1 M log δ i j J ( i ) p ( r i j | δ i ) ϕ ( δ i ; μ i , Ω i ) d δ i . (11)

Let l 1 and l 2 stand for the first and second terms of the above log-likelihood function, given by l u ( θ ) = l 1 + l 2 . Three types of unknown parameters are involved in θ ; the trial effects, α i ’s, the overall mean effects μ’s (for μ ), and the variances τ’s and the correlation coefficients ρ’s in the covariance matrix Ω . Let τ represent a parameter (either τ or ρ) involved in Ω . There is no random effect involved in l 1 .

First, the second-order partial derivative 2 l 1 / α i 2 can be calculated in the usual way; while the other terms are

2 l 1 α i α j = 2 l 1 α i μ = 2 l 1 α i τ = 0 , i j and i , j { 1 , , M } .

Next, consider the second term of Equation (11), for notational convenience, let P i ( δ i ) represent the function j J ( i ) p ( r i j | δ i ) in l 2 . Now the term l 2 takes the form

l 2 ( α 1 , , α M , μ , Ω ) = i = 1 M log δ i p ( δ i ) ϕ ( δ i ) d δ i = i = 1 M l 2 i ,

where ϕ ( δ ) is the density of the multivariate normal distribution with mean μ i and variance matrix Ω i , and l 2 i is a summand of the log-likelihood involving the integrals ( log δ i P i ( δ i ) ϕ ( δ i ) d δ i ). The first-order partial derivatives relating to l 2 are shown as follows

l 2 α i = i = 1 M e l 2 i δ i P i ( δ i ) α i ϕ ( δ i ) d δ i ,

l 2 μ = i = 1 M e l 2 i δ i P i ( δ i ) ϕ ( δ i ) μ d δ i ,

l 2 τ = i = 1 M e l 2 i δ i P i ( δ i ) ϕ ( δ i ) τ d δ i .

Similarly, the second-order partial derivatives are

2 l 2 α i 2 = i = 1 M ( e l 2 i δ i 2 P i ( δ i ) α i 2 ϕ ( δ i ) d δ i ( e l 2 i δ i P ( δ i ) α i ϕ ( δ i ) d δ i ) 2 ) , (12)

2 l 2 μ 2 = i = 1 M ( e l 2 i δ i P i ( δ i ) 2 ϕ ( δ i ) μ 2 d δ i ( e l 2 i δ i P i ( δ i ) ϕ ( δ i ) μ d δ i ) 2 ) , (13)

2 l 2 τ 2 = i = 1 M ( e l 2 i δ i P i ( δ i ) 2 ϕ ( δ i ) τ 2 d δ i ( e l 2 i δ i P i ( δ i ) ϕ ( δ i ) τ d δ i ) 2 ) , (14)

2 l 2 μ τ = i = 1 M ( e l 2 i δ i P i ( δ i ) 2 ϕ ( δ i ) μ τ d δ i ( e l 2 i δ i P i ( δ i ) ϕ ( δ i ) μ d δ i ) 2 ) . (15)

Note that the second-order partial derivative 2 l 2 / α i α j is equal to zero. The second-order partial derivatives of 2 l 2 / μ i μ j and 2 l 2 / τ i τ j can be expressed in similar equations to Equation (13) and Equation (14) respectively. The integrals in the first-order and second-order partial derivatives can be approximated by Gaussian quadrature.

From the log-likelihood l u ( θ ) in Equation (11), the second-order partial derivatives for the observed Fisher information matrix can be calculated as

2 l u α i 2 = 2 l 1 α i 2 + 2 l 2 α i 2 , 2 l u μ 2 = 2 l 2 μ 2 , 2 l u μ i μ j = 2 l 2 μ i μ j ,

2 l u τ 2 = 2 l 2 τ 2 , 2 l u τ i τ j = 2 l 2 τ i τ j , 2 l u μ τ = 2 l 2 μ τ .

As set earlier, the second partial derivatives of 2 l u / ρ 2 and 2 l u / μ ρ (and 2 l u / τ ρ ) can be calculated in similar equations to 2 l u / τ 2 and 2 l u / μ τ respectively. Notice that the second-order derivative of l 1 is only related in 2 l u / α i 2 . The matrix of second partial derivatives can be partitioned into a block matrix with null matrices in the off diagonals:

H ( θ ) = ( H α ( θ ) 0 0 H μ τ ρ ( θ ) ) ,

where H α ( θ ) and H μ τ ρ ( θ ) are the second-order partial derivatives about α j , and μ, τ and ρ respectively. By multiplying H ( θ ) by −1, the observed Fisher information matrix J ( θ ) is obtained. The inverse of J ( θ ) is the asymptotic variance-covariance matrix of MLEs and their standard errors are the square roots of the diagonal of J ( θ ) 1 .

5. Conditional Maximum Likelihood Approach

Conditional likelihood is widely used in logistic regression models with binary data. In particular, this leads to accurate inferences for the parameters of interest and eliminates all nuisance parameters [25] . The conditional likelihood will be defined and the maximum likelihood estimation will be described in this section.

5.1. Conditional Likelihood

From the logistic regression models in Equation (1) and Equation (2), the conditional likelihood r i given that C i = j J i r i j = c i for the ith study, is given by

f ( r i | C i = c i ; δ i ) = f ( r i | j J i r i j = c i ; δ i ) = f ( r i | δ i ) f ( j J i r i j = c i | δ i ) . (16)

The conditional likelihood reflects the probability of the observed data configuration relative to the probability of all possible configurations of the given data. The numerator f ( r i | δ i ) is exactly the same as the unconditional likelihood obtained from Equation (4) and Equation (5). The denominator is what makes the conditional likelihood different from the unconditional likelihood; it sums the joint probability for all possible configurations. To derive the Equation (16), the conditional likelihood r i given C i can be simplified as

f ( r i | C i = c i ; δ i ) = j J i ( n i j r i j ) e δ i r i j u i U i ( n i 0 c i j J ( i ) u i j ) j J ( i ) ( n i j u i j ) e δ i u i j , (17)

where u i = ( u i j , j J ( i ) ) t and

U i = { u i : 0 u i j n i j , j J ( i ) and c i n i 0 j J ( i ) u i j c i } .

Notice that this likelihood function does not involve any nuisance parameters α i ’s and is a function of δ i alone. The removal of the trial effects from the conditional likelihood is important because when the conditional likelihood is used, estimates are obtained only for the parameters of interest in the model and not for the α i ’s.

5.2. Estimation

The conditional likelihood in Equation (17) has k i random effects so the likelihood f ( r i | j J i r i j = c i ) involves k i integrations:

f ( r i | j J i r i j = c i ) = δ i f ( r i | j J i r i j = c i ; δ i ) ϕ ( δ i ; μ i , Ω i ) d δ i , (18)

where ϕ ( δ i ; μ i , Ω i ) is the probability density function of multivariate normal distribution with mean μ i and covariance Ω i given in Equation (7). Gauss-Hermite approximation is applied to Equation (18) and obtain:

f ( r i | j J i r i j = c i ) π k i / 2 n 1 = 1 l 1 w n 1 ( 1 ) n k i = 1 l k i w n k i ( k i ) f ( r i | j J i r i j = c i ; δ i , n ) , (19)

where f ( r i | j J i r i j = c i ; δ i , n ) is obtained from Equation (17) where the

sampling nodes is δ i , n = μ i + 2 Ω i 1 / 2 d i , n and d i , n = ( d n 1 ( 1 ) , , d n k i ( k i ) ) . The likelihood for the ith study L ( θ | r i ) can be written as

L ( θ | r i ) = f ( r i | j J i r i j = c i ; δ i ) .

The log-likelihood function of the logistic regression models using the conditional approach is

l c ( θ ) = log L ( θ | r ) = i = 1 M log L ( θ | r i ) (20)

By maximizing the conditional likelihood function over θ , an exact parameter estimate is obtained for θ , called the conditional maximum likelihood estimate. To calculate the standard error of their MLEs, the log-likelihood function in Equation (20) can be written as

l c ( θ ) = i = 1 M log f ( r i | j J i r i j = c i ) = i = 1 M log δ i f ( r i | j J i r i j = c i , δ i ) ϕ ( δ i ; μ i , Ω i ) d δ i . (21)

Let P i ( δ i ) represent f ( r i | j J i r i j = c i , δ i ) in the above equation. The second-order partial derivatives of 2 l 2 / μ 2 , 2 l 2 / τ 2 and 2 l 2 / μ τ are similar to Equation (13) - Equation (15) respectively. In a similar way to the previous section, the standard errors for the MLEs are obtained.

6. Application to Antiplatelet Therapy Data

From the data given in Table 1, the total number of individual observations is small thus the empirical log-odds model is not appropriate. The logistic regression model will be applied using the unconditional and conditional approaches with the data.

6.1. Unconditional Inference

From the data, there are 27 studies investigating the use of aspirin plus dipyridamole or aspirin alone in comparison with the control group. The studies compare three treatments: aspirin plus dipyridamole (A), aspirin alone (B) and the control treatment (C). Seven studies compare A, B and C, ten studies compare A and C and ten studies compare B and C. There is no indirect comparison for this dataset, so the set D is { 1 , , 27 } . The baseline treatment for all studies is the control group (b(i) = 0).

The indices i = 1 , , 27 and j = 0 , 1 , 2 stand for the studies and the treatments C, A and B, respectively. The data is partitioned into three groups: G 1 = { 1 , , 7 } , G 2 = { 8 , , 17 } and G 3 = { 18 , , 27 } . The sets J i and J ( i ) are given by

{ J i = { 0 , 1 , 2 } , J ( i ) = { 1 , 2 } for i G 1 , J i = { 0 , 1 } , J ( i ) = { 1 } for i G 2 , J i = { 0 , 2 } , J ( i ) = { 2 } for i G 3 . (22)

Let r i 0 , r i 1 and r i 2 be the numbers of patients who suffered reocclusions on treatments C, A and B respectively, where the ith study is in G 1 G 2 G 3 , G 1 G 2 and G 1 G 3 , respectively. The total numbers of patients are n i 0 , n i 1 and n i 2 Let π i 0 , π i 1 and π i 2 be the probabilities that patients have reocclusions on treatments C, A and B respectively in the ith study. The r i 0 , r i 1 and r i 2 are binomially distributed as

r i 0 ~ B i n ( π i 0 , n i 0 ) , i G 1 G 2 G 3 ,

r i 1 ~ B i n ( π i 1 , n i 1 ) , i G 1 G 2 ,

r i 2 ~ B i n ( π i 2 , n i 2 ) , i G 1 G 3 .

The treatment effect δ i for G 1 is defined as

( δ i , 01 δ i , 02 ) ~ M V N ( ( μ 01 μ 02 ) , ( τ 01 2 ρ τ 01 τ 02 ρ τ 01 τ 02 τ 02 2 ) ) (23)

Logistic regression models for the data can be fitted using the Equation (1) and Equation (2) where b(i) = 0 and J ( i ) is given in Equation (22). Note that the trial effects are assumed to be different in each study. To define the unconditional likelihood function, let r ( i ) represent the vector ( r i 1 , r i 2 ) . The probability functions p ( r i 0 ) and p ( r ( i ) ) are formulated from Equation (4) and Equation (8) respectively.

From δ i for G 1 , the correlation coefficient ρ between δ i , 01 and δ i , 02 is in the form τ 0 2 / τ 01 τ 02 , where is obtained from δ i 0 ~ N ( μ 0 , τ 0 2 ) . Note that μ 0 and τ 0 2 are not estimable unless some other information is used. The assumption of homogeneity variance will be considered here. Suppose that all heterogeneity parameters are the same: τ 01 = τ 02 = τ and the correlation coefficient takes the value 1/2. The unknown parameter θ for the models is { α 1 , α 2 , , α 27 , μ 01 , μ 02 , τ 2 } . The log-likelihood function l u ( θ ) is obtained from Equation (10). By maximizing the log-likelihood function, the MLEs can be estimated. Also their standard errors are given by the observed Fisher information matrix.

The results for the treatment effects δ 01 and δ 02 are given in Table 2. SD and CI stand for standard deviation and confidence interval respectively. The trial effects are presented in Table 3. The overall means on the log odds ratio

Table 2. The results of the treatment effects for the model using the unconditional method.

Table 3. The trial effect of the model using the unconditional method.

(LOR) scale for δ 01 and δ 02 are −1.17849 (SD 0.08499) and −0.63700 (SD 0.03728), and the heterogeneity parameter is 0.0372 (SD 0.04752). On the odds ration (OR) scale, the means are 0.30774 and 0.52800 respectively. Their confidence interval (CI) can be calculated from the related CI on the LOR scale. To be concluded, treatments aspirin plus dipyridamole and aspirin only in antiplatelet therapy reduce deep venous thrombosis by over 70% and 45% respectively. The average of both treatments reduces deep venous thrombosis by over 55%.

6.2. Conditional Inference

The models and other parameters are similar to those defined in the unconditional method. The function C i for the data can be defined by

{ C i = r i 0 + r i 1 + r i 2 for i G 1 , C i = r i 0 + r i 1 for i G 2 , C i = r i 0 + r i 2 for i G 3 . (24)

Let r i denote the vector ( r i 0 , r i 1 , r i 2 ) . The conditional likelihood f ( r i | C i ) for the ith study is given in Equation (18). To handle the random treatment effect δ i , the likelihood function is approximated by Gaussian-Hermite approximation as defined in Equation (19). The unknown parameter θ for the models is { μ 01 , μ 02 , τ } . By using the log-likelihood function in Equation (20), the results of the models are given in Table 4. On the LOR scale, the overall mean effects for both treatment effects are −0.87516 (SD 0.04340) and −0.39000 (SD 0.31160) while their variation between studies is 0.37000 (SD 0.03900). Those means on the OR scale are 0.41679 and 0.67434. As before their confidence intervals are obtained from the related CI on the LOR scale. The results indicate that treatments aspirin plus dipyridamole and aspirin only produce a reduction in deep venous thrombosis by over 55% and 30% respectively. The average of both treatments in antiplatelet therapy reduces deep venous thrombosis by over 40%.

As seen from Table 2 and Table 4, the results from using the unconditional likelihood (on the LOR scale) are smaller than from using conditional likelihood. Note that those results are negative. That is to say that estimation with unconditional likelihood may cause underestimation or bias. Collaboration [2] summarized that antiplatelet therapy produced a highly significant (2p ≤ 0.00001) reduction in deep venous thrombosis of about 40%. The results from the model using the conditional likelihood support this.

7. Conclusions and Discussion

The logistic regression model was introduced for the exact binomial distribution. Two alternative approaches for making inferences were presented. The unconditional likelihood involves nuisance parameters (from the trial effects). If the number of studies (M) is large, it may lead to inconsistent estimators. Cox and Snell [15] concluded for the unconditional likelihood that if the number of studies (M) is large and the number of individual observations ( n i j ) is small then it makes estimation inaccurate and inconsistent. The conditional maximum likelihood approach was introduced for the model to eliminate all nuisance parameters. In making a choice between the two approaches, it is needed to consider the number of studies and the number of individual observations. However, the use

Table 4. The results of the treatment effects for the model using the conditional method.

of this method can be expensive in term of the cost of computer running time, especially if the number of individual observations is large. Using the unconditional maximum likelihood approach, note that if the number of studies is large and the number of individual observations is small then the estimate may be biased or misleading.

Some other methods can be used in the logistic regression model, for example, using a pseudo-loglikelihood, (see Severini [27] ); or the modified profile likelihood (see Bellio and Sartori [28] , Hamza, Houwelingen and Stijnen [29] and Caterina et al. [19] ). Gaussian-Hermite quadrature was used to calculate the integral forms of the probabilities including random effects in the likelihood functions for both approaches [30] . The approximation is reasonably effective for low-order integrations [31] . Implementing Gaussian-Hermite approximation, we used the function “gauss.quad” in the software R to estimate MLEs for the model. The number of integrands depends on the number of treatments involved in those studies. If this number is large then it makes the dimensionality of the integral large and so it cannot be approximated accurately. Other approximations such as Laplace approximation or Monte Carlo method can be used, see Shi and Copas [32] and Caterina [19] . Laplace approximation could make the calculation of second-order derivatives for the observed Fisher information matrix easier than using Gaussian approximation since there is no weight term in the approximation [33] [34] . Additionally, simulation studies could be conducted in further work to compare these two approaches in different scenarios including improving the approximation if multi-arm trials are more than three treatment comparisons.

The established binomial tree method could be applied in the further work [24] .

8. Conclusion

Simulation studies could be conducted in further work to compare these two approaches in different scenarios including improving the approximation if multi-arm trials are more than three treatment comparisons.

Acknowledgements

The paper was partially supported for publication by Faculty of Public Health, Mahidol University, Thailand.

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1] Hill, S.A.B. (1990) Memories of the British Streptomycin Trial in Tuberculosis: The First Randomized Clinical Trial. Controlled Clinical Trials, 11, 77-79.
https://doi.org/10.1016/0197-2456(90)90001-I
[2] Antiplatelet Trialists’ Collaboration (1994) Collaborative Overview of Randomised Trials of Antiplatelet Therapy—III: Reduction in Venous Thrombosis and Pulmonary Embolism by Antiplatelet Prophylaxis among Surgical and Medical Patients. British Medical Journal, 308, 235-246.
https://doi.org/10.1136/bmj.308.6923.235
[3] Chootrakool, H. and Shi, J.Q. (2008) Meta-Analysis of Multi-Arm Trials Using Empirical Logistic Transform. The Open Medical Informatics Journal, 2, 112-116.
https://doi.org/10.2174/1874431100802010112
[4] Chootrakool, H., Shi, J.Q. and Yue, R. (2011) Meta-Analysis and Sensitivity Analysis for Multi-Arm Trials with Selection Bias. Statistics in Medicine, 30, 1183-1198.
https://doi.org/10.1002/sim.4143
[5] Rücker, G. (2017) Methods for Including Information from Multi-Arm Trials in Pairwise Meta-Analysis Multi-Arm Trials in Pairwise Meta-Analysis. Research Synthesis Methods, 8, 392-403.
https://doi.org/10.1002/jrsm.1259
[6] Lu, G. and Ades, A.E. (2004) Combination of Direct and Indirect Evidence in Mixed Treatment Comparisons. Statistics in Medicine, 23, 3105-3124.
https://doi.org/10.1002/sim.1875
[7] Lee, A.W. (2014) Review of Mixed Treatment Comparisons in Published Systematic Reviews Shows Marked Increase since 2009. Journal of Clinical Epidemiology, 67, 138-143.
[8] Salanti, G. (2012) Indirect and Mixed-Treatment Comparison, Network, or Multiple-Treatments Meta-Analysis: Many Names, Many Benefits, Many Concerns for the Next Generation Evidence Synthesis Tool. Research Synthesis Methods, 3, 80-97.
https://doi.org/10.1002/jrsm.1037
[9] Lumley, T. (2002) Network Meta-Analysis for Indirect Treatment Comparisons. Statistics in Medicine, 21, 2313-2324.
https://doi.org/10.1002/sim.1201
[10] Mbuagbaw, L., et al. (2017) Approaches to Interpreting and Choosing the Best Treatments in Network Meta-Analyses. Systematic Reviews, 6, Article No. 79.
https://doi.org/10.1186/s13643-017-0473-z
[11] Veroniki, A.A., et al. (2018) Is Providing Uncertainty Intervals in Treatment Ranking Helpful in a Network Meta-Analysis? Journal of Clinical Epidemiology, 100, 122-129.
https://doi.org/10.1016/j.jclinepi.2018.02.009
[12] Seide, S.E., Jensen, K. and Kieser, M. (2019) Simulation and Data-Generation for Random-Effects Network Meta-Analysis of Binary Outcome. Statistics in Medicine, 38, 3288-3303.
https://doi.org/10.1002/sim.8193
[13] Jenkins, D.A., et al. (2021) Methods for the Inclusion of Real-World Evidence in Network Meta-Analysis. BMC Medical Research Methodology, 21, Article No. 207.
https://doi.org/10.1186/s12874-021-01399-3
[14] Zhao, Y.P., et al. (2021) A Network Meta-Analysis for Neoadjuvant and Adjuvant Treatments for Resectable Squamous Cell Carcinoma of Esophagus. Scientific Reports, 11, Article No. 6800.
https://doi.org/10.1038/s41598-021-86102-8
[15] Cox, D.R. and Snell, E.J. (1989) Analysis of Binary Data. 2nd Edition, Chapman & Hall/CRC, Boca Raton.
[16] Tritchler, D. (1984) An Algorithm for Exact Logistic Regression. Journal of the American Statistical Association, 79, 709-711.
https://doi.org/10.2307/2288420
[17] Sartori, N. (2003) Modified Profile Likelihoods in Models with Stratum Nuisance Parameters. Biometrika, 90, 533-549.
http://www.jstor.org/stable/30042064
https://doi.org/10.1093/biomet/90.3.533
[18] Lee, W.J., Shi, J.Q. and Lee, Y.J. (2010) Approximate Conditional Inference in Mixed-Effects Models with Binary Data. Computational Statistics & Data Analysis, 54, 173-184.
https://doi.org/10.1016/j.csda.2009.07.027
[19] Di Caterina, C., Cortese, G. and Sartori, N. (2018) Monte Carlo Modified Profile Likelihood for Panel Data Models. arXiv:1801.02597v1.
[20] Lu, G.B. and Ades, A.E. (2006) Assessing Evidence Inconsistency in Mixed Treatment Comparisons. Journal of the American Statistical Association, 101, 447-459.
https://doi.org/10.1198/016214505000001302
[21] Weber, F., et al. (2020) Zero-Cell Corrections in Random-Effects Meta-Analyses. Research Synthesis Methods, 11, 913-919.
https://doi.org/10.1002/jrsm.1460
[22] Greco, T., et al. (2015) The Attractiveness of Network Meta-Analysis: A Comprehensive Systematic and Narrative Review. Heart Lung and Vessels, 7, 133-142.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4476767/
[23] Wang, F.F., et al. (2018) Pharmacologic Therapy to Induce Weight Loss in Women Who Have Obesity/Overweight with Polycystic Ovary Syndrome: A Systematic Review and Net-Work Meta-Analysis. Obesity Reviews, 19, 1424-1445.
https://doi.org/10.1111/obr.12720
[24] Zhu, L.K., et al. (2019) A New Binomial Tree Method for European Options under the Jump Diffusion Model. Journal of Applied Mathematics and Physics, 7, 3012-3021.
https://doi.org/10.4236/jamp.2019.712211
[25] Kleinbaum, D.G. (1994) Logistic Regression. Springer, New York.
https://doi.org/10.1007/978-1-4757-4108-7
[26] Collett, D. (1991) Modelling Binary Data (Hardback). Taylor & Francis, Oxfordshire.
[27] Severini, T.A. (1998) Likelihood Functions for Inference in the Presence of a Nuisance Parameter. Biometrika, 85, 507-522.
https://doi.org/10.1093/biomet/85.3.507
[28] Bellio, R. and Sartori, N. (2003) Extending Conditional Likelihood in Models for Stratified Binary Data. Statistical Methods and Applications, 12, 121-132.
https://doi.org/10.1007/s10260-003-0055-1
[29] Hamza, T.H., van Houwelingen, H.C. and Stijnen, T. (2008) The Binomial Distribution of Meta-Analysis Was Preferred to Model Within-Study Variability. Journal of Clinical Epidemiology, 61, 41-51.
https://doi.org/10.1016/j.jclinepi.2007.03.016
[30] Adeniran, A.T., et al. (2020) Derivation of Gaussian Probability Distribution: A New Approach. Applied Mathematics, 11, 436-446.
https://doi.org/10.4236/am.2020.116031
[31] Crouch, E.A.C. and Spiegelman, D. (1990) The Evaluation of Integrals of the Form ∫+∞ -∞ f(t)exp(-t2) dt: Application to Logistic-Normal Models. Journal of the American Statistical Association, 85, 464-469.
https://doi.org/10.1080/01621459.1990.10476222
[32] Shi, J.Q. and Copas, J. (2002) Publication Bias and Meta-Analysis for 2 × 2 Tables: An Average Markov Chain Monte Carlo EM Algorithm. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64, 221-236.
https://doi.org/10.1111/1467-9868.00334
[33] Liu, Q. and Pierce, D.A. (1994) A Note on Gauss-Hermite Quadrature. Biometrika, 81, 624-629.
https://doi.org/10.2307/2337136
[34] Shaobo, J. and Björn, A. (2020) A Note on the Accuracy of Adaptive Gauss-Hermite Quadrature. Biometrika, 107, 737-744.
https://doi.org/10.1093/biomet/asz080

Copyright © 2024 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.