The Modi Exponentiated Exponential Distribution

Antoine Dieudonné Ndayisaba; Leo Odiwuor Odongo; Anthony Ngunyi

doi:10.4236/jdaip.2023.114017

Journal of Data Analysis and Information Processing > Vol.11 No.4, November 2023

Antoine Dieudonné Ndayisaba¹*, Leo Odiwuor Odongo², Anthony Ngunyi³
¹Department of Mathematics, Pan African University Institute for Basic Sciences, Technology and Innovation (PAUSTI), Nairobi, Kenya.
²Department of Mathematics and Actuarial Science, Kenyatta University (KU), Nairobi, Kenya.
³Department of Mathematics, Statistics and Actuarial Sciences, Dedan Kimathi University of Technology (De KUT), Nairobi, Kenya.

Abstract

In this study, a new four-parameter distribution called the Modi Exponentiated Exponential distribution was proposed and studied. The new distribution has three shape parameters and one scale parameter. Its mathematical and statistical properties were investigated. The parameters of the new model were estimated using the method of Maximum Likelihood Estimation. Monte Carlo simulation was used to evaluate the performance of the MLEs through average bias and RMSE. The flexibility and goodness-of-fit of the proposed distribution were demonstrated by applying it to two real data sets and comparing it with some existing distributions.

Keywords

Modi Family, Exponentiated Exponential Distribution, Maximum Likelihood Estimation, Order Statistics, Moments, Quantile Function

Share and Cite:

Ndayisaba, A., Odongo, L. and Ngunyi, A. (2023) The Modi Exponentiated Exponential Distribution. Journal of Data Analysis and Information Processing, 11, 341-359. doi: 10.4236/jdaip.2023.114017.

1. Introduction

Probability distributions are essential tools in a variety of disciplines as they provide statistical interpretations that help make sense of data. However, the existing distributions such as the exponential distribution have limitations making them insufficient in modeling a variety of data.

The exponential distribution is a continuous probability distribution commonly used in statistics and probability theory. It is frequently used to describe the time between events in a process where events occur independently at a fixed average rate [1].

To overcome these limitations, many authors have proposed modifications of existing distributions. Gupta and Kundu [2] modified the exponential distribution by adding a shape parameter. However, their model is not flexible enough to control both skewness and kurtosis and also accommodate non-monotonic hazard rate shapes.

In the theory of probability distributions, the choice of a particular distribution to represent a real-life phenomenon may be motivated by two factors: tractability and flexibility [3]. A tractable distribution is advantageous in theory because it is simple to use, particularly when simulating random samples, but practitioners and other stakeholders may be more interested in a distribution's flexibility. In practice, it is preferable to use a probability distribution that fits the given data set well rather than to transform the data, because a transformation may compromise the data set's original features. For this reason, numerous attempts have been made in recent years to improve and extend the standard theoretical distributions [4] [5] [6]. Marshall and Olkin [7] proposed and studied a new method for adding a parameter to a family of distributions. Margaretha et al. [8] discussed Bayesian estimation of the parameters of the Exponentiated Exponential distribution for left-censored data. Mahdavi and Kundu [9] introduced a new method for generating distributions and applied it to the exponential distribution; this generator was reported to control skewness in distributions that are not normal. Many researchers have made significant advances in developing and extending various distributions.

Singh et al. [10] and Niyoyunguruza et al. [11] used the Marshall-Olkin generator to extend distributions and estimated the parameters by Maximum Likelihood Estimation (MLE). By applying the extended distributions to a range of data sets, they showed that the new distributions provided a better fit than the baseline distributions. Yahaya and Ieren [12] proposed the odd generalized exponential Gumbel distribution (OGEGD) for modeling lifetime data. Salem and Selim [13] proposed the Generalized Weibull-Exponential Distribution (GWED). Modi [14] proposed the power Exponentiated family. Uwadi et al. [15] introduced the Exponentiated Gumbel Exponential (EGuE) distribution and studied its mathematical and statistical properties. Modi et al. [16] proposed and studied a new family of distributions, called the Modi family, and applied it to two real data sets. In this study, we introduce the Modi Exponentiated Exponential distribution, which has four parameters. This distribution can be used for fitting and analyzing data in various fields.

This paper is structured as follows: Section 2 defines the Modi generator, and Section 3 introduces the Exponentiated Exponential distribution. In Section 4, we present the Modi Exponentiated Exponential distribution and develop its cumulative distribution function (CDF), probability density function (PDF), hazard rate function, and survival function. We also derive some statistical and mathematical properties of the proposed distribution. The estimation of the model parameters is given in Section 5. The simulation study, the application to real data sets, and the conclusion are given in Sections 6, 7, and 8, respectively.

2. Modi Family

Modi et al. [16] proposed and studied the Modi family of distributions, which is flexible and can be used to model a wide range of phenomena in various fields, including engineering, economics, and finance. The CDF, F(x), and PDF, f(x), of the Modi family are given respectively by:

$F (x) = \frac{(1 + α^{β}) S (x)}{α^{β} + S (x)}$ (2.1)

$f (x) = \frac{(1 + α^{β}) (α^{β} s (x))}{{[α^{β} + S (x)]}^{2}}$ (2.2)

for all $x > 0$ , where $α > 0$ , $β > 0$ and S(x) is the CDF of the existing distribution, and s(x) is the PDF of the existing distribution.
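Equations (2.1) and (2.2) can be read as a generic wrapper around any baseline distribution. The following is a minimal Python sketch of that reading, not code from the paper: the function names `modi_cdf` and `modi_pdf`, the unit-rate exponential baseline, and the parameter values are assumptions chosen for illustration. It checks numerically that the PDF of Equation (2.2) is the derivative of the CDF of Equation (2.1).

```python
import math

def modi_cdf(S, alpha, beta):
    """Return the Modi-family CDF built from a baseline CDF S (Equation 2.1)."""
    a = alpha ** beta
    return lambda x: (1.0 + a) * S(x) / (a + S(x))

def modi_pdf(S, s, alpha, beta):
    """Return the Modi-family PDF built from baseline CDF S and PDF s (Equation 2.2)."""
    a = alpha ** beta
    return lambda x: (1.0 + a) * a * s(x) / (a + S(x)) ** 2

# Illustrative baseline (an assumption of this sketch): exponential with rate 1.
def exp_cdf(x):
    return 1.0 - math.exp(-x)

def exp_pdf(x):
    return math.exp(-x)

F = modi_cdf(exp_cdf, alpha=2.0, beta=1.5)
f = modi_pdf(exp_cdf, exp_pdf, alpha=2.0, beta=1.5)

# A central finite difference of F should agree with f at any x > 0.
x0, h = 0.8, 1e-6
fd = (F(x0 + h) - F(x0 - h)) / (2 * h)
print(fd, f(x0))
```

Note that $α$ and $β$ enter both formulas only through the product term $α^{β}$, which the sketch computes once as `a`.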

3. Exponentiated Exponential Distribution

The practice of adding a new parameter to an existing family of distribution functions is a common one in statistical distribution theory.

Gupta and Kundu [2] proposed the Exponentiated Exponential (EE) distribution. The CDF and PDF of the EE distribution are given respectively by:

$F (x, δ, λ) = {(1 - e^{- λ x})}^{δ}$ (3.1)

where $δ, λ, x > 0$ ;

$f (x, δ, λ) = δ λ {(1 - e^{- λ x})}^{δ - 1} e^{- λ x}$ (3.2)

where $λ$ is a scale parameter, and $δ$ is a shape parameter.

4. Modi Exponentiated Exponential (MEE) Distribution

In the field of statistical distribution theory, it is common to enhance the flexibility of a class of distribution functions by introducing an additional parameter. This practice can be very useful for data analysis, as it allows for greater versatility in modeling various types of data.

4.1. Cumulative Distribution Function and Survival Function

From Equations (2.1) and (3.1), the cumulative distribution function $F (x)$ of the MEE distribution is given as:

$F (x, δ, λ, β, α) = \frac{(1 + α^{β}) {(1 - e^{- λ x})}^{δ}}{α^{β} + {(1 - e^{- λ x})}^{δ}}$ (4.1)

for all $x > 0$ and $δ, λ, β, α > 0$ ,

and the survival function is derived from Equation (4.1) as follows:

$\begin{matrix} S (x, δ, λ, β, α) = 1 - F (x, δ, λ, β, α) \\ = 1 - \frac{(1 + α^{β}) {(1 - e^{- λ x})}^{δ}}{α^{β} + {(1 - e^{- λ x})}^{δ}} \\ = \frac{α^{β} [1 - {(1 - e^{- λ x})}^{δ}]}{α^{β} + {(1 - e^{- λ x})}^{δ}} \end{matrix}$ (4.2)

for all $x > 0$ and $δ, λ, β, α > 0$ .

4.2. Probability Density Function and Hazard Rate Function

The PDF plays a crucial role in the modeling and analysis of continuous random variables.

The PDF of MEE is obtained from Equation 4.1 as follows:

$\begin{matrix} f (x, δ, λ, β, α) = \frac{d}{d x} F (x, δ, λ, β, α) \\ = \frac{d}{d x} \frac{(1 + α^{β}) {(1 - e^{- λ x})}^{δ}}{α^{β} + {(1 - e^{- λ x})}^{δ}} \\ = \frac{λ δ α^{β} (α^{β} + 1) {(1 - e^{- λ x})}^{δ}}{(e^{λ x} - 1) {[α^{β} + {(1 - e^{- λ x})}^{δ}]}^{2}} \end{matrix}$ (4.3)

for all $x > 0$ and $δ, λ, β, α > 0$ .
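As a numerical sanity check on Equations (4.1) to (4.3), the sketch below codes the three functions and integrates the PDF with a simple midpoint rule. The parameter values are arbitrary illustrations (with $δ > 1$ so the density is bounded near zero), and the helper `midpoint_integral` is a hypothetical name introduced here, not part of the paper.

```python
import math

def mee_cdf(x, delta, lam, beta, alpha):
    """MEE CDF, Equation (4.1)."""
    a = alpha ** beta
    g = (-math.expm1(-lam * x)) ** delta  # (1 - e^{-lam x})^delta
    return (1.0 + a) * g / (a + g)

def mee_sf(x, delta, lam, beta, alpha):
    """MEE survival function, Equation (4.2)."""
    a = alpha ** beta
    g = (-math.expm1(-lam * x)) ** delta
    return a * (1.0 - g) / (a + g)

def mee_pdf(x, delta, lam, beta, alpha):
    """MEE PDF, Equation (4.3), for x > 0."""
    a = alpha ** beta
    g = (-math.expm1(-lam * x)) ** delta
    return lam * delta * a * (a + 1.0) * g / (math.expm1(lam * x) * (a + g) ** 2)

def midpoint_integral(func, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))

params = dict(delta=2.0, lam=0.5, beta=2.0, alpha=1.5)  # illustrative values only
area = midpoint_integral(lambda x: mee_pdf(x, **params), 0.0, 10.0)

# The area under the PDF on [0, 10] should match F(10), and F + S should equal 1.
print(area, mee_cdf(10.0, **params))
print(mee_cdf(2.0, **params) + mee_sf(2.0, **params))
```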

Figure 1 depicts several possible forms of the PDF of the MEE distribution for different parameter values. The MEE PDF can be approximately symmetric, unimodal, or reversed-J shaped.

The parameter values used in the plots were chosen through sensitivity analysis, which examines how changes in the input parameters affect the model output.

The hazard rate function is obtained by using Equations (4.2) and (4.3).

$\begin{matrix} h (x, δ, λ, β, α) = \frac{f (x, δ, λ, β, α)}{S (x, δ, λ, β, α)} \\ = \frac{λ δ (α^{β} + 1) {(1 - e^{- λ x})}^{δ}}{(e^{λ x} - 1) [α^{β} + {(1 - e^{- λ x})}^{δ}] [1 - {(1 - e^{- λ x})}^{δ}]} \end{matrix}$ (4.4)

where $λ$ is a scale parameter and $α, β, δ$ are shape parameters, for all $x > 0$ , and $α, β, δ, λ > 0$ .

The shapes of hazard rate function for the MEE are depicted in Figure 2. It demonstrates that the hazard rate function can take various shapes such as increasing, decreasing, constant, inverted bathtub, etc.
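The hazard shapes can also be explored numerically. The sketch below evaluates Equation (4.4) on a grid and labels the sampled curve; the helper `describe_shape`, the grid, and the parameter values are assumptions made for illustration and are not the settings behind Figure 2.

```python
import math

def mee_hazard(x, delta, lam, beta, alpha):
    """MEE hazard rate, Equation (4.4), for x > 0."""
    a = alpha ** beta
    g = (-math.expm1(-lam * x)) ** delta  # (1 - e^{-lam x})^delta
    return lam * delta * (a + 1.0) * g / (math.expm1(lam * x) * (a + g) * (1.0 - g))

def describe_shape(values, tol=1e-12):
    """Classify a sampled curve as increasing, decreasing, or non-monotone."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    if all(d > -tol for d in diffs):
        return "increasing"
    if all(d < tol for d in diffs):
        return "decreasing"
    return "non-monotone"

grid = [0.05 * k for k in range(1, 101)]  # x from 0.05 to 5.0
for delta in (0.5, 1.0, 3.0):             # illustrative shape values
    h = [mee_hazard(x, delta, 1.0, 1.0, 1.0) for x in grid]
    print(delta, describe_shape(h))
```

For $δ = 1$, Equation (4.4) simplifies to $λ (α^{β} + 1) / (α^{β} + 1 - e^{- λ x})$, which decreases from $λ (α^{β} + 1) / α^{β}$ towards $λ$; this gives a quick way to check an implementation.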

Figure 1. Plot of PDF of the MEE for various values of $δ, λ, β$ and $α$ .

Figure 2. Plot of hazard rate function of the MEE for various values of $δ, λ, β$ and $α$ .

4.3. Statistical and Mathematical Properties of the MEE Distribution

4.3.1. Quantile Function

The quantile function is crucial in generating random samples from a specified distribution.

Suppose that X is a random variable with $X ~ MEE (λ, δ, α, β)$ . The quantile function is obtained by solving the equation $F (x) = u$ for x, where $F$ is the CDF of the MEE distribution defined in Equation (4.1):

Let $F (x) = u$

Then,

$\begin{array}{l} u = \frac{(1 + α^{β}) {(1 - e^{- λ x})}^{δ}}{α^{β} + {(1 - e^{- λ x})}^{δ}}, \\ \Rightarrow \frac{u}{1 + α^{β}} = \frac{{(1 - e^{- λ x})}^{δ}}{α^{β} + {(1 - e^{- λ x})}^{δ}}, \\ \Rightarrow Q (u) = x_{u} = F^{- 1} (u; λ; δ; α; β) = \frac{- \log [1 - {(\frac{u α^{β}}{1 + α^{β} - u})}^{\frac{1}{δ}}]}{λ} \end{array}$ (4.5)

where $F^{- 1} (.)$ is the inverse of the distribution function of the MEE and $0 < u < 1$ . The lower quartile, median, and upper quartile are obtained from the quantile function by replacing u with 1/4, 1/2, and 3/4, respectively, giving:

The lower quartile is:

$Q (1 / 4) = \frac{- \log [1 - {(\frac{α^{β}}{4 α^{β} + 3})}^{\frac{1}{δ}}]}{λ}$ (4.6)

The median is:

$Q (1 / 2) = \frac{- \log [1 - {(\frac{α^{β}}{2 α^{β} + 1})}^{\frac{1}{δ}}]}{λ}$ (4.7)

The upper quartile is:

$Q (3 / 4) = \frac{- \log [1 - {(\frac{3 α^{β}}{4 α^{β} + 1})}^{\frac{1}{δ}}]}{λ}$ (4.8)
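The quantile function of Equation (4.5) is straightforward to code and to verify by a round trip through the CDF of Equation (4.1). The following Python sketch is illustrative only; the function names are assumptions, and the parameter values are those of Set I from the simulation study in Section 6.

```python
import math

def mee_cdf(x, delta, lam, beta, alpha):
    """MEE CDF, Equation (4.1)."""
    a = alpha ** beta
    g = (-math.expm1(-lam * x)) ** delta
    return (1.0 + a) * g / (a + g)

def mee_quantile(u, delta, lam, beta, alpha):
    """MEE quantile function, Equation (4.5), for 0 < u < 1."""
    a = alpha ** beta
    g = u * a / (1.0 + a - u)  # value of (1 - e^{-lam x})^delta at the quantile
    return -math.log1p(-g ** (1.0 / delta)) / lam

params = dict(delta=0.5, lam=0.3, beta=0.07, alpha=1.0)  # Set I of Section 6
for u in (0.25, 0.5, 0.75):
    q = mee_quantile(u, **params)
    print(u, q, mee_cdf(q, **params))  # the last column should reproduce u
```

Using `math.log1p` instead of `math.log(1 - ...)` is a numerical design choice of this sketch: it keeps precision when the argument is close to zero, that is, for small quantiles.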

4.3.2. Skewness and Kurtosis

Mathematically, the Moors Kurtosis and Galton skewness of the MEE distribution are stated as:

$\begin{array}{l} K_{M} = \frac{Q (7 / 8) + Q (3 / 8) - Q (5 / 8) - Q (1 / 8)}{Q (6 / 8) - Q (2 / 8)} \\ S_{K} = \frac{Q (3 / 4) + Q (1 / 4) - 2 Q (2 / 4)}{Q (3 / 4) - Q (1 / 4)} \end{array}$ (4.9)

where Q(.) denotes the quantile function of Equation (4.5), evaluated at the quartiles and octiles.
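Both measures in Equation (4.9) depend only on the quantile function, so they can be computed directly from Equation (4.5). The sketch below is a minimal illustration with hypothetical function names and arbitrary parameter values; it is not the code used to produce Table 1.

```python
import math

def mee_quantile(u, delta, lam, beta, alpha):
    """MEE quantile function, Equation (4.5), for 0 < u < 1."""
    a = alpha ** beta
    g = u * a / (1.0 + a - u)
    return -math.log1p(-g ** (1.0 / delta)) / lam

def galton_skewness(q):
    """Galton skewness of Equation (4.9) for a quantile function q."""
    return (q(0.75) + q(0.25) - 2.0 * q(0.5)) / (q(0.75) - q(0.25))

def moors_kurtosis(q):
    """Moors kurtosis of Equation (4.9) for a quantile function q."""
    return (q(7 / 8) - q(5 / 8) + q(3 / 8) - q(1 / 8)) / (q(6 / 8) - q(2 / 8))

for delta in (0.5, 1.0, 3.0):  # illustrative shape values
    q = lambda u, d=delta: mee_quantile(u, d, 1.0, 1.0, 1.0)
    print(delta, galton_skewness(q), moors_kurtosis(q))
```

Because both ratios divide one quantile difference by another, the scale parameter $λ$ cancels, so only $δ$ and $α^{β}$ affect the results.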

Table 1 provides information about how quantiles of the MEE distribution vary for different combinations of parameter values. It can be used to look up specific quantile values based on the desired parameters, which can be useful for statistical analysis and modeling.

4.3.3. The r^th Moments of MEE Distribution

Calculation of the moments of a distribution is crucial for statistical analysis, especially in practical applications. Moments are used to determine various statistical measures such as measures of central tendency, dispersion, and shape.

The mathematical expression for the r^th moment is given by

${μ^{'}}_{r} = E (X^{r}) = \int_{0}^{\infty} x^{r} f (x) d x$ (4.10)

where $f (x)$ is the PDF of the distribution.

By substituting (4.3) in (4.10), we get

${μ^{'}}_{r} = E (X^{r}) = \int_{0}^{\infty} x^{r} \frac{λ δ α^{β} (α^{β} + 1) {(1 - e^{- λ x})}^{δ}}{(e^{λ x} - 1) {[α^{β} + {(1 - e^{- λ x})}^{δ}]}^{2}} d x$

By using binomial expansion, we get

$\begin{matrix} {μ^{'}}_{r} = - δ α^{β} (α^{β} + 1) \sum_{n = 0}^{\infty} (\begin{matrix} δ \\ n \end{matrix}) {(- 1)}^{n} \sum_{k = 0}^{\infty} (\begin{matrix} - 1 \\ k \end{matrix}) {(- 1)}^{k} \frac{1}{n - k} \sum_{p = 0}^{\infty} (\begin{matrix} - 2 \\ p \end{matrix}) {(α^{β})}^{- 2 - p} \\ \times \sum_{j = 0}^{\infty} {(- 1)}^{j} (\begin{matrix} δ p \\ j \end{matrix}) \int_{0}^{\infty} x^{r} {(e^{- λ x})}^{j} d x \end{matrix}$

By using the gamma function, the integral becomes:

$\int_{0}^{\infty} x^{r} {(e^{- λ x})}^{j} d x = \frac{Γ (r + 1)}{{(λ j)}^{r + 1}},$

since

$\int_{0}^{\infty} t^{p - 1} e^{- q t} d t = \frac{Γ (p)}{q^{p}}$

Comparing with this, we can identify t as x, p as (r + 1), and q as $λ j$ . Finally,

$\begin{array}{l} {μ^{'}}_{r} = E (X^{r}) = - δ α^{β} (α^{β} + 1) \sum_{n = 0}^{\infty} (\begin{matrix} δ \\ n \end{matrix}) {(- 1)}^{n} \sum_{k = 0}^{\infty} (\begin{matrix} - 1 \\ k \end{matrix}) {(- 1)}^{k} \frac{1}{n - k} \sum_{p = 0}^{\infty} (\begin{matrix} - 2 \\ p \end{matrix}) {(α^{β})}^{- 2 - p} \\ \times \sum_{j = 0}^{\infty} {(- 1)}^{j} (\begin{matrix} δ p \\ j \end{matrix}) \frac{Γ (r + 1)}{{(λ j)}^{r + 1}} \end{array}$ (4.11)
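The series in Equation (4.11) is awkward to evaluate by hand, so a numerical cross-check is useful. The sketch below does not implement the series; it swaps in the standard identity $E (X^{r}) = \int_{0}^{1} {Q (u)}^{r} d u$ with the quantile function of Equation (4.5) and a midpoint rule. The function names, step count, and parameter values are assumptions of this sketch.

```python
import math

def mee_quantile(u, delta, lam, beta, alpha):
    """MEE quantile function, Equation (4.5), for 0 < u < 1."""
    a = alpha ** beta
    g = u * a / (1.0 + a - u)
    return -math.log1p(-g ** (1.0 / delta)) / lam

def mee_raw_moment(r, delta, lam, beta, alpha, steps=200000):
    """Approximate E[X^r] as the midpoint-rule integral of Q(u)^r over (0, 1)."""
    h = 1.0 / steps
    return h * sum(
        mee_quantile((k + 0.5) * h, delta, lam, beta, alpha) ** r
        for k in range(steps)
    )

params = dict(delta=1.0, lam=1.0, beta=1.0, alpha=1.0)  # illustrative values only
m1 = mee_raw_moment(1, **params)
m2 = mee_raw_moment(2, **params)
variance = m2 - m1 ** 2
print(m1, variance)
```

For $δ = 1$ the mean can be worked out directly by integrating the survival function (4.2), giving $(α^{β} / λ) \log ((α^{β} + 1) / α^{β})$ ; this special case was derived for the sketch and is not stated in the paper. The midpoint rule converges slowly near $u = 1$ for higher moments, so a finer grid or an adaptive integrator would be a reasonable refinement.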

Table 2 illustrates the adaptability of the MEE distribution with respect to its mean and variance. The Coefficient of Skewness (CS) values indicate that the distribution can be right-skewed or nearly symmetric. Similarly, the Coefficient of Kurtosis (CK) values suggest that the MEE distribution can be mesokurtic, leptokurtic, or platykurtic. These attributes highlight the versatility of the MEE distribution for modeling.

Table 1. Quantiles of the MEE distribution for some parameter values.

Table 2. The first five moments, skewness, and kurtosis of the MEE distribution for different parameter values.

4.3.4. Order Statistics

Consider a random sample $X_{1}, X_{2}, \dots, X_{n}$ drawn from a distribution with PDF $f (x)$ and CDF $F (x)$ . From this sample, we define the j^th order statistic, $X_{(j)}$ , where $X_{(1)}$ is the smallest value in the sample, $X_{(2)}$ is the second smallest value, and so on, up to $X_{(n)}$ , the largest value. The Probability Density Function (PDF) of the j^th order statistic, with $1 \leq j \leq n$ , can be expressed as follows:

$g_{j : n} (x) = \frac{n!}{(j - 1)! (n - j)!} f (x) F^{j - 1} (x) {(1 - F (x))}^{n - j}$ (4.12)

By substituting Equations (4.1) and (4.3) in (4.12), the PDF of the j^th order statistic for the MEE can be expressed as follows:

$\begin{matrix} g_{j : n} (x) = \frac{n! λ δ α^{β} (α^{β} + 1) {(1 - e^{- λ x})}^{δ}}{(j - 1)! (n - j)! (e^{λ x} - 1) {[α^{β} + {(1 - e^{- λ x})}^{δ}]}^{2}} \\ \times {[\frac{(1 + α^{β}) {(1 - e^{- λ x})}^{δ}}{α^{β} + {(1 - e^{- λ x})}^{δ}}]}^{j - 1} {(1 - \frac{(1 + α^{β}) {(1 - e^{- λ x})}^{δ}}{α^{β} + {(1 - e^{- λ x})}^{δ}})}^{n - j} \\ = \frac{n! λ δ α^{β}}{(j - 1)! (n - j)! (e^{λ x} - 1) [α^{β} + {(1 - e^{- λ x})}^{δ}]} \\ \times {(\frac{(1 + α^{β}) {(1 - e^{- λ x})}^{δ}}{α^{β} (1 - {(1 - e^{- λ x})}^{δ})})}^{j} {(\frac{α^{β} (1 - {(1 - e^{- λ x})}^{δ})}{α^{β} + {(1 - e^{- λ x})}^{δ}})}^{n} \end{matrix}$ (4.13)

The PDF of the smallest order statistic for the MEE is obtained when j = 1, and is given by

$g_{1 : n} (x) = \frac{n λ δ (1 + α^{β}) {(1 - e^{- λ x})}^{δ}}{(e^{λ x} - 1) [α^{β} + {(1 - e^{- λ x})}^{δ}] (1 - {(1 - e^{- λ x})}^{δ})} {[\frac{α^{β} (1 - {(1 - e^{- λ x})}^{δ})}{α^{β} + {(1 - e^{- λ x})}^{δ}}]}^{n}$ (4.14)

The PDF of the largest order statistic for the MEE is obtained when j = n, and is given by

$g_{n : n} (x) = \frac{n λ δ α^{β}}{(e^{λ x} - 1) [α^{β} + {(1 - e^{- λ x})}^{δ}]} {[\frac{(α^{β} + 1) {(1 - e^{- λ x})}^{δ}}{α^{β} + {(1 - e^{- λ x})}^{δ}}]}^{n}$ (4.15)
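The general formula (4.12) can be coded once and specialised to any j. The sketch below, with hypothetical function names and arbitrary parameter values, checks the two extreme cases against the standard facts that the maximum of n independent draws has CDF ${F (x)}^{n}$ and the minimum has CDF $1 - {(1 - F (x))}^{n}$ .

```python
import math

def mee_cdf(x, delta, lam, beta, alpha):
    """MEE CDF, Equation (4.1)."""
    a = alpha ** beta
    g = (-math.expm1(-lam * x)) ** delta
    return (1.0 + a) * g / (a + g)

def mee_pdf(x, delta, lam, beta, alpha):
    """MEE PDF, Equation (4.3), for x > 0."""
    a = alpha ** beta
    g = (-math.expm1(-lam * x)) ** delta
    return lam * delta * a * (a + 1.0) * g / (math.expm1(lam * x) * (a + g) ** 2)

def order_stat_pdf(x, j, n, **params):
    """PDF of the j-th order statistic in a sample of size n, Equation (4.12)."""
    c = math.factorial(n) // (math.factorial(j - 1) * math.factorial(n - j))
    F = mee_cdf(x, **params)
    return c * mee_pdf(x, **params) * F ** (j - 1) * (1.0 - F) ** (n - j)

params = dict(delta=2.0, lam=0.5, beta=2.0, alpha=1.5)  # illustrative values only
n, x0, h = 5, 1.7, 1e-6

# Finite-difference derivatives of the CDFs of the maximum and the minimum.
max_fd = (mee_cdf(x0 + h, **params) ** n - mee_cdf(x0 - h, **params) ** n) / (2 * h)
min_fd = ((1 - mee_cdf(x0 - h, **params)) ** n - (1 - mee_cdf(x0 + h, **params)) ** n) / (2 * h)

print(max_fd, order_stat_pdf(x0, n, n, **params))
print(min_fd, order_stat_pdf(x0, 1, n, **params))
```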

5. Maximum Likelihood Estimation (MLE)

Maximum Likelihood Estimation (MLE) is a method of finding the values of the parameters that maximize the likelihood function. If we have n values $x_{1}, x_{2}, \dots, x_{n}$ that are randomly selected from the MEE distribution, the log-likelihood function denoted by $l (φ)$ is given by:

$l (φ) = \log L (φ) = \log \prod_{i = 1}^{n} f (x_{i}; φ) = \sum_{i = 1}^{n} \log f (x_{i}; φ)$ (5.1)

where $φ = (α, β, λ, δ)$ .

Substituting Equation (4.3) into (5.1), the log-likelihood function associated with these values is

$\sum_{i = 1}^{n} \log f (x_{i}; λ, δ, α, β) = \sum_{i = 1}^{n} \log \frac{λ δ α^{β} (α^{β} + 1) {(1 - e^{- λ x_{i}})}^{δ}}{(e^{λ x_{i}} - 1) {[α^{β} + {(1 - e^{- λ x_{i}})}^{δ}]}^{2}}$ (5.2)

$\begin{array}{l} \Rightarrow \log L (φ) = n \log λ + n \log δ + n β \log α + n \log (α^{β} + 1) + δ \sum_{i = 1}^{n} \log (1 - e^{- λ x_{i}}) \\ - \sum_{i = 1}^{n} \log (e^{λ x_{i}} - 1) - 2 \sum_{i = 1}^{n} \log [α^{β} + {(1 - e^{- λ x_{i}})}^{δ}] \end{array}$ (5.3)

Differentiating 5.3 partially with respect to each parameter and equating to zero gives

$\frac{\partial}{\partial α} \log L (φ) = \frac{n β}{α} + \frac{n β α^{β - 1}}{α^{β} + 1} - 2 \sum_{i = 1}^{n} \frac{β α^{β - 1}}{α^{β} + {(1 - e^{- λ x_{i}})}^{δ}} = 0$ (5.4)

$\frac{\partial}{\partial β} \log L (φ) = n \log α + \frac{n α^{β} \log α}{α^{β} + 1} - 2 \sum_{i = 1}^{n} \frac{α^{β} \log α}{α^{β} + {(1 - e^{- λ x_{i}})}^{δ}} = 0$ (5.5)

$\begin{array}{l} \frac{\partial}{\partial λ} \log L (φ) = \frac{n}{λ} + δ \sum_{i = 1}^{n} \frac{x_{i} e^{- λ x_{i}}}{1 - e^{- λ x_{i}}} - \sum_{i = 1}^{n} \frac{x_{i} e^{λ x_{i}}}{e^{λ x_{i}} - 1} \\ - 2 \sum_{i = 1}^{n} \frac{δ {(1 - e^{- λ x_{i}})}^{δ - 1} x_{i} e^{- λ x_{i}}}{α^{β} + {(1 - e^{- λ x_{i}})}^{δ}} = 0 \end{array}$ (5.6)

$\frac{\partial}{\partial δ} \log L (φ) = \frac{n}{δ} + \sum_{i = 1}^{n} \log (1 - e^{- λ x_{i}}) - 2 \sum_{i = 1}^{n} \frac{{(1 - e^{- λ x_{i}})}^{δ} \log (1 - e^{- λ x_{i}})}{α^{β} + {(1 - e^{- λ x_{i}})}^{δ}} = 0$ (5.7)

Equations (5.4) to (5.7) cannot be solved analytically; they have no closed-form solution. Consequently, a numerical optimization technique is needed to solve them. In this study, the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm was employed to estimate the parameters of the MEE distribution.
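A quasi-Newton optimizer such as BFGS needs only the log-likelihood and, optionally, its gradient. The sketch below codes the log-likelihood of Equation (5.3) and a central-difference approximation to its gradient; it does not implement BFGS itself. The function names, the trial parameter vector, and the choice of the first ten waiting times of data set II as input are assumptions made for illustration.

```python
import math

def mee_pdf(x, delta, lam, beta, alpha):
    """MEE PDF, Equation (4.3), for x > 0."""
    a = alpha ** beta
    g = (-math.expm1(-lam * x)) ** delta
    return lam * delta * a * (a + 1.0) * g / (math.expm1(lam * x) * (a + g) ** 2)

def mee_loglik(phi, data):
    """Log-likelihood of Equation (5.3); phi = (alpha, beta, lam, delta)."""
    alpha, beta, lam, delta = phi
    n = len(data)
    a = alpha ** beta
    total = n * (math.log(lam) + math.log(delta) + beta * math.log(alpha) + math.log(a + 1.0))
    for x in data:
        one_minus = -math.expm1(-lam * x)          # 1 - e^{-lam x}
        total += delta * math.log(one_minus)
        total -= math.log(math.expm1(lam * x))     # log(e^{lam x} - 1)
        total -= 2.0 * math.log(a + one_minus ** delta)
    return total

def numerical_score(phi, data, h=1e-6):
    """Central-difference approximation to the gradient of the log-likelihood."""
    grad = []
    for k in range(len(phi)):
        up, dn = list(phi), list(phi)
        up[k] += h
        dn[k] -= h
        grad.append((mee_loglik(up, data) - mee_loglik(dn, data)) / (2 * h))
    return grad

data = [0.8, 0.8, 1.3, 1.5, 1.8, 1.9, 1.9, 2.1, 2.6, 2.7]  # first ten values of data set II
phi = (1.5, 2.0, 0.5, 2.0)                                  # arbitrary trial value, not an estimate

direct = sum(math.log(mee_pdf(x, phi[3], phi[2], phi[1], phi[0])) for x in data)
print(mee_loglik(phi, data), direct)   # the two values should agree
print(numerical_score(phi, data))
```

A numerical gradient of this kind is also a convenient way to check hand-derived score equations before passing them to an optimizer.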

6. Simulation Study

This section describes a Monte Carlo simulation study of the accuracy of the Maximum Likelihood Estimators (MLEs) of the four model parameters ( $α$ , $β$ , $δ$ , and $λ$ ), measured by their average biases (ABs) and root mean squared errors (RMSEs). Random samples from the MEE distribution were generated with the quantile function in Equation (4.5). For each combination of parameter values ( $α$ , $β$ , $λ$ , $δ$ ), we generated 1000 replications at each of the sample sizes $n = 50, 75, 100, 125, 200, 275, 600, 625, 650$ , and $700$ .

The parameter values were provided in two different sets.

Set I: $λ = 0.3$ , $δ = 0.5$ , $α = 1.0$ , $β = 0.07$

Set II: $λ = 0.52$ , $δ = 0.47$ , $α = 0.97$ , $β = 0.065$

The average biases were computed by:

${AB}_{(η)} = \frac{1}{R} \sum_{i = 1}^{R} ({\hat{η}}_{i} - η)$ (6.1)

and the root mean square errors by:

${RMSE}_{(η)} = \sqrt{\frac{1}{R} \sum_{i = 1}^{R} {({\hat{η}}_{i} - η)}^{2}}$ (6.2)

where $η$ is the parameter in question, ${\hat{η}}_{i}$ is its estimated value at the i^th replication for a given sample size, and R is the total number of replications.
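The sampling step and the two summary measures can be sketched in a few lines of Python. To keep the example self-contained, the sample median is used as a stand-in estimator of the true median from Equation (4.7); this replaces the MLE step of the study and is purely an assumption of the sketch, as are the function names, the seed, and the reduced number of replications.

```python
import math
import random
import statistics

def mee_quantile(u, delta, lam, beta, alpha):
    """MEE quantile function, Equation (4.5)."""
    a = alpha ** beta
    g = u * a / (1.0 + a - u)
    return -math.log1p(-g ** (1.0 / delta)) / lam

def mee_sample(n, rng, **params):
    """Draw n values by inverse-transform sampling through Equation (4.5)."""
    return [mee_quantile(rng.random(), **params) for _ in range(n)]

def average_bias(estimates, truth):
    """Average bias, Equation (6.1)."""
    return sum(e - truth for e in estimates) / len(estimates)

def rmse(estimates, truth):
    """Root mean squared error, Equation (6.2)."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))

params = dict(delta=0.5, lam=0.3, beta=0.07, alpha=1.0)  # Set I
true_median = mee_quantile(0.5, **params)
rng = random.Random(2023)  # fixed seed so the run is repeatable

results = {}
for n in (50, 200, 800):
    # Stand-in estimator: the sample median, over 200 replications.
    estimates = [statistics.median(mee_sample(n, rng, **params)) for _ in range(200)]
    results[n] = (average_bias(estimates, true_median), rmse(estimates, true_median))
    print(n, results[n])
```

Replacing the sample median with a routine that maximises the log-likelihood would give the structure of the study described above.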

Table 3 presents the MLEs, ABs, and RMSEs of the parameters $λ$ , $δ$ , $α$ , and $β$ at each sample size, for the parameter values in Set I and Set II.

For Set I, the MLEs approach the true parameter values of the MEE distribution as the sample size increases. In general, the ABs and RMSEs of the parameter estimators decrease as the sample size increases.

The results for Set II show a similar pattern. As the sample size increases, the MLEs move closer to the true parameter values, and the ABs and RMSEs of the estimators tend to decrease, indicating improved accuracy and precision.

The parameter values were chosen through sensitivity analysis, and the Monte Carlo simulation was carried out in R.

7. Application to Real Data Set

In this section, we fit the Modi Exponentiated Exponential distribution to two real data sets and compare its flexibility with that of other well-known distributions. The analysis was conducted using R software. We calculated several information criteria: the Akaike Information Criterion (AIC), Hannan-Quinn Information Criterion (HQIC), Bayesian Information Criterion (BIC), and Consistent Akaike Information Criterion (CAIC). Additionally, we computed the Kolmogorov-Smirnov (K-S), Cramér-von Mises (W*), and Anderson-Darling (A*) statistics to assess the goodness of fit of the considered distributions. The best model was taken to be the one with the highest log-likelihood, the highest p-value for the K-S test, and the lowest values of AIC, BIC, HQIC, CAIC, W*, A*, and the K-S statistic.
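The information criteria depend only on the maximised log-likelihood, the number of parameters k, and the sample size n. The paper does not print the formulas, so the sketch below uses common textbook definitions; in particular, the CAIC line follows Bozdogan's form $- 2 \log L + k (\log n + 1)$ , which is an assumption here and may differ from the variant used by the authors. The function names are hypothetical.

```python
import math

def information_criteria(loglik, k, n):
    """Common definitions of AIC, BIC, HQIC and CAIC (assumed, not taken from the paper)."""
    return {
        "AIC": -2.0 * loglik + 2.0 * k,
        "BIC": -2.0 * loglik + k * math.log(n),
        "HQIC": -2.0 * loglik + 2.0 * k * math.log(math.log(n)),
        "CAIC": -2.0 * loglik + k * (math.log(n) + 1.0),  # Bozdogan's form, an assumption
    }

def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov distance between the empirical CDF and a fitted CDF."""
    xs = sorted(data)
    n = len(xs)
    return max(max((i + 1) / n - cdf(x), cdf(x) - i / n) for i, x in enumerate(xs))

# Illustrative inputs only: a made-up log-likelihood for a 4-parameter model.
print(information_criteria(loglik=-100.0, k=4, n=100))
print(ks_statistic([0.25, 0.75], lambda x: x))  # uniform CDF on [0, 1]
```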

The PDFs of the existing distributions compared with the Modi Exponentiated Exponential distribution are presented in Table 4.

Data set I: This data set presented in [17] contains information on the remission times, measured in months, for a group of 128 individuals diagnosed with bladder cancer. 3.88, 5.32, 7.39, 10.34, 14.83, 34.26, 0.90, 2.69, 4.18, 5.34, 7.59, 10.66, 15.96, 36.66, 1.05, 2.69, 4.23, 5.41, 7.62, 10.75, 16.62, 43.01, 1.19, 2.75, 4.26, 5.41, 7.63, 17.12, 46.12, 1.26, 2.83, 4.33, 5.49, 7.66, 11.25, 17.14, 79.05, 1.35, 2.87, 5.62, 7.87, 11.64, 17.36, 1.40, 3.02, 4.34, 5.71, 7.93, 0.08, 2.09, 3.48, 4.87, 6.94, 8.66, 13.11, 23.63, 0.20, 2.23, 3.5, 4.98, 6.97, 9.02, 13.29, 0.40, 2.26, 3.57, 5.06, 7.09, 9.22, 13.80, 25.74, 0.50, 2.46, 3.64, 5.09, 7.26, 9.47, 14.24, 25.82, 0.51, 2.54, 3.70, 5.17, 7.28, 9.74, 14.76, 26.31, 0.81, 2.62, 3.82, 5.32, 7.32, 10.06, 14.77, 32.15, 2.64, 11.79, 18.10, 1.46, 4.40, 5.85, 8.26, 11.98, 19.13, 1.76, 3.25, 4.50, 6.25, 8.37, 12.02, 2.02, 3.31, 4.51, 6.54, 8.53, 12.03, 20.28, 2.02, 3.36, 6.76, 12.07, 21.73, 2.00, 3.36, 6.93, 8.65, 12.63, and 22.69.

Table 3. The outcomes of a Monte Carlo simulation, investigation for the parameters in set I and set II.

Table 4. The existing distributions compared with Modi Exponentiated Exponential distribution.

Table 5 presents the summary characteristics of data set I. The data is skewed to the right, as indicated by a skewness coefficient of 3.325, and it exhibits significant tailing in its distribution, with a kurtosis coefficient of 16.15.

The kurtosis coefficient was computed from the data using descriptive statistics in R.

In Figure 3, the histogram shows that the data are right-skewed, and the box plot reveals the presence of outliers.

In Figure 4, the TTT plot of the data set indicates that the hazard rate function has an inverted bathtub shape, while the violin plot shows that the majority of values are concentrated around the median.
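The points of a scaled total time on test (TTT) plot are easy to compute from the ordered data. The sketch below uses the usual definition, in which the i^th point is $(i / n, [\sum_{k = 1}^{i} x_{(k)} + (n - i) x_{(i)}] / \sum_{k = 1}^{n} x_{(k)})$ ; the function name and the short input list are assumptions for illustration, and plotting is left out.

```python
def scaled_ttt(data):
    """Return the points (i/n, scaled TTT value) for i = 1..n."""
    xs = sorted(data)
    n = len(xs)
    total = sum(xs)
    points = []
    running = 0.0
    for i, x in enumerate(xs, start=1):
        running += x  # sum of the i smallest observations
        points.append((i / n, (running + (n - i) * x) / total))
    return points

# A few of the waiting times from data set II, used only as sample input.
for u, t in scaled_ttt([0.8, 1.3, 2.6, 4.0, 7.1, 11.2, 18.4, 38.5]):
    print(round(u, 3), round(t, 3))
```

As a rule of thumb, a concave curve above the diagonal suggests an increasing hazard rate, a convex curve below it suggests a decreasing one, and a curve that is first concave and then convex suggests an inverted bathtub shape.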

Table 6 and Table 7 provide the AIC, HQIC, BIC, and CAIC values, along with the K-S, W*, and A* tests. Based on these results, the MEE model emerges as the most favorable choice because it has the lowest values for AIC, HQIC, BIC, CAIC, W*, K-S and A*, indicating better goodness-of-fit. Additionally, it exhibits the highest p-value for the K-S statistic and log-likelihood function value, further supporting its superiority.

Figure 5 illustrates a plot of fitted densities, comparing the MEE distribution to its sub-models using the bladder cancer data set. The plot reveals that the MEE distribution demonstrates a favorable and encouraging fit when compared to the existing distributions.

Data set II: This data set consists of the waiting times (in minutes) of one hundred bank customers before they receive service. This data set has been previously analyzed by Ghitany, et al. [18] . They have fitted both the Lindley distribution and the exponential distribution to this data. The data set is provided below:

0.8, 0.8, 1.3, 1.5, 1.8, 1.9, 1.9, 2.1, 2.6, 2.7, 2.9, 3.1, 3.2, 3.3, 3.5, 3.6, 4.0, 4.1, 4.2, 4.2, 4.3, 4.3, 4.4, 4.4, 4.6, 4.7, 4.7, 4.8, 4.9, 4.9, 5, 5.3, 5.5, 5.7, 5.7, 6.1, 6.2, 6.2, 6.2, 6.3, 6.7, 6.9, 7.1, 7.1, 7.1, 7.1, 7.4, 7.6, 7.7, 8, 8.2, 8.6, 8.6, 8.6, 8.8, 8.8, 8.9, 8.9, 9.5, 9.6, 9.7, 9.8, 10.7, 10.9, 11, 11, 11.1, 11.2, 11.2, 11.5, 11.9, 12.4, 12.5, 12.9, 13, 13.1, 13.3, 13.6, 13.7, 13.9, 14.1, 15.4, 15.4, 17.3, 17.3, 18.1, 18.2, 18.4, 18.9, 19, 19.9, 20.6, 21.3, 21.4, 21.9, 23.0, 27, 31.6, 33.1, 38.5.

Table 8 presents the summary characteristics of data set II. The data is skewed to the right, as indicated by a skewness coefficient of 1.47277, and it is platykurtic, with a kurtosis coefficient of 2.54029. The kurtosis coefficient was computed from the data using descriptive statistics in R.

Table 5. Comprehensive overview of the bladder cancer data set: Descriptive analysis.

(a) Histogram of cancer data set (b) Box plot of cancer data set

Figure 3. Histogram and box plots of bladder cancer data set.

(a) TTT plot of cancer data set (b) Violin plot of cancer data set

Figure 4. TTT and violin plots of bladder cancer data set.

In Figure 6, the TTT plot of the data set indicates an increasing hazard rate function, and the histogram shows that the data are right-skewed.

Table 6. Maximum Likelihood Estimates and goodness-of-fit tests for data set I.

Table 7. A summary of the results from the information criteria analysis conducted on data set I.

Figure 5. The densities of the bladder cancer data set estimated using different distribution models.

Table 8. Investigation of customer waiting duration for bank services: A descriptive analysis.

(a) TTT plot of waiting time data set (b) Histogram of waiting time data set

Figure 6. TTT and histogram plots of waiting time data set.

In Figure 7, the violin plot shows that most values are centered around the median, and the box plot identifies the presence of outliers in the data.

The MEE model is considered the best model based on the information provided in Table 9 and Table 10. This is because it has the lowest values for AIC, HQIC, BIC, W*, A* and CAIC, indicating better model fit. Additionally, it has the highest p-value for the K-S statistic and log-likelihood function value, further supporting its superiority compared to other models.

Figure 8 illustrates a plot of fitted densities, comparing the MEE distribution to its sub-models using the waiting time data set. The plot reveals that the MEE distribution demonstrates a favorable and encouraging fit when compared to the existing distributions.

8. Conclusion

In this paper, we introduced a new four-parameter model called the Modi Exponentiated Exponential (MEE) distribution and applied it to two real data sets. We examined the mathematical and statistical properties of the proposed distribution and derived expressions for its r^th moment, survival function, hazard rate function, cumulative distribution function, and quantile function. Plots of the PDF and hazard rate function show that the MEE distribution can take different shapes, indicating its versatility in fitting data sets with diverse distributions. We also obtained the Probability Density Function (PDF) of its minimum and maximum order statistics.

(a) Violin plot of waiting data set (b) Box plot of waiting data set

Figure 7. Violin and box plots of waiting data set.

Table 9. Maximum likelihood estimation and goodness-of-fit analysis for data set II.

Table 10. A summary of the results from the information criteria analysis conducted on data set II.

Figure 8. Histogram and fitted densities of the waiting time data set for different distributions.

To estimate the parameters of the MEE distribution, we employed the method of maximum likelihood estimation. Monte Carlo simulation was used to assess the performance of the MLEs. The study found that the MLEs estimate the model parameters accurately and consistently. As the sample size increases, the MLEs tend to approach the true values of the parameters, as indicated by the decreasing ABs, and the RMSEs also decrease. Our analysis shows that the MEE distribution outperforms the existing distributions considered in modeling the two data sets in this study.

Acknowledgements

The authors would like to acknowledge and thank the Pan African University, Institute for Basic Sciences, Technology, and Innovation (PAUSTI) for their support in conducting this study.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1]	Mansoor, M., Tahir, M.H., Cordeiro, G.M., Provost, S.B. and Alzaatreh, A. (2019) The Marshall-Olkin Logistic-Exponential Distribution. Communications in Statistics-Theory and Methods, 48, 220-234. https://doi.org/10.1080/03610926.2017.1414254
[2]	Gupta, R.D. and Kundu, D. (2001) Exponentiated Exponential Family: An Alternative to Gamma and Weibull Distributions. Biometrical Journal: Journal of Mathematical Methods in Biosciences, 43, 117-130. https://doi.org/10.1002/1521-4036(200102)43:1<117::AID-BIMJ117>3.0.CO;2-R
[3]	Oguntunde, P., Adejumo, A., Okagbue, H. and Rastogi, M. (2016) Statistical Properties and Applications of a New Lindley Exponential Distribution. Gazi University Journal of Science, 29, 831-838.
[4]	Merovci, F. (2013) Transmuted Exponentiated Exponential Distribution. Mathematical Sciences and Applications E-Notes, 1, 112-122.
[5]	Gupta, R.D. and Kundu, D. (2007) Generalized Exponential Distribution: Existing Results and Some Recent Developments. Journal of Statistical Planning and Inference, 137, 3537-3547. https://doi.org/10.1016/j.jspi.2007.03.030
[6]	Algamal, Z.Y. (2008) Exponentiated Exponential Distribution as a Failure Time Distribution. IRAQI Journal of Statistical Science, 14, 63-75. https://doi.org/10.33899/iqjoss.2008.31434
[7]	Marshall, A.W. and Olkin, I. (1997) A New Method for Adding a Parameter to a Family of Distributions with Application to the Exponential and Weibull Families. Biometrika, 84, 641-652. https://doi.org/10.1093/biomet/84.3.641
[8]	Margaretha, M., Fithriani, I. and Abdullah, S. (2020) Parameter Estimation of Exponentiated Exponential Distribution for Left Censored Data Using Bayesian Method. AIP Conference Proceedings, 2242, Article ID: 030029. https://doi.org/10.1063/5.0007892
[9]	Mahdavi, A. and Kundu, D. (2017) A New Method for Generating Distributions with an Application to Exponential Distribution. Communications in Statistics-Theory and Methods, 46, 6543-6557. https://doi.org/10.1080/03610926.2015.1130839
[10]	Kr Singh, R., Yadav, A.S., Singh, S.K. and Singh, U. (2016) Marshall-Olkin Extended Exponential Distribution: Different Method of Estimations. Journal of Advanced Computing, 5, 12-28. https://doi.org/10.7726/jac.2016.1002
[11]	Niyoyunguruza, A., Odongo, L.O., Nyarige, E., Habineza, A. and Muse, A.H. (2023) Marshall-Olkin Exponentiated Frechet Distribution. Journal of Data Analysis and Information Processing, 11, 262-292. https://doi.org/10.4236/jdaip.2023.113014
[12]	Yahaya, A. and Ieren, T.G. (2017) On Odd Generalized Exponential Gumbel Distribution with Its Applications to Survival Data. Journal of the Nigerian Association of Mathematical Physics, 39, 149-158.
[13]	Salem, H.M. and Selim, M.A. (2014) The Generalized Weibull-Exponential Distribution: Properties and Applications. International Journal of Statistics and Applications, 4, 102-112.
[14]	Modi, K. (2021) Power Exponentiated Family of Distributions with Application on Two Real-Life Datasets. Thailand Statistician, 19, 536-546.
[15]	Uwadi, U.U., Okereke, E.W. and Omekara, C.O. (2019) Exponentiated Gumbel Exponential Distribution: Properties and Applications. American Journal of Applied Mathematics and Statistics, 7, 178-186.
[16]	Modi, K., Kumar, D. and Singh, Y. (2020) A New Family of Distribution with Application on Two Real Datasets on Survival Problem. Science & Technology Asia, 25, 1-10.
[17]	Lee, E.T. and Wang, J. (2003) Statistical Methods for Survival Data Analysis. Volume 476, John Wiley & Sons, Hoboken.
[18]	Ghitany, M.E., Atieh, B. and Nadarajah, S. (2008) Lindley Distribution and Its Application. Mathematics and Computers in Simulation, 78, 493-506. https://doi.org/10.1016/j.matcom.2007.06.007
