Mean Difference and Mean Deviation of Tukey Lambda Distribution

Abstract

The purpose of this paper is to broaden the knowledge of the mean difference and, in particular, of an important distribution model known as the Tukey lambda distribution, which is generally used to choose a model to fit data. We obtain compact formulas, not previously reported in the literature, for the mean deviation and the mean difference of this distribution. These results make it possible to analyze the relationships among the variability indexes (standard deviation, mean deviation and mean difference) for the Tukey lambda model.

Girone, G. , Massari, A. , Manca, F. and D’Uggento, A. (2020) Mean Difference and Mean Deviation of Tukey Lambda Distribution. Applied Mathematics, 11, 771-778. doi: 10.4236/am.2020.118051.

1. Introduction

The purpose of this work is to add to the methodological contributions on the mean difference and on its relationships with other variability indexes [1] [2]. The mean difference, introduced by Corrado Gini in 1912 as a measure of the variability of characters from the standpoint of inequality, has attracted the interest of many scholars over the years, including recently [3] [4]. Its importance is also due to the fact that the sample mean difference is an unbiased estimator of the mean difference of the population distribution model and is therefore useful for inferential purposes [5]. Theoretical contributions on the mean difference cover the main continuous distribution models (normal, rectangular, exponential, ...) [6]; for other distribution models, such as Tukey’s, however, no contributions are known in the literature.

2. Tukey Lambda Distribution

The Tukey lambda distribution is typically used to select a distribution model to fit data; its direct use is less common. Its distinguishing feature is that, in general, neither its density function $f\left(x\right)$ nor its cumulative function $F\left(x\right)$ is known in closed form, but only the inverse of the latter, ${F}^{-1}\left(p\right)$, that is, the quantile function $Q\left(p\right)$ [7] [8].

The complete Tukey lambda distribution has three parameters: one of location, one of scale and one of shape [9] [10].

To calculate the mean difference and the mean deviation, it is convenient to refer to the reduced distribution, in which the location parameter is set to zero and the scale parameter to one. The mean difference and mean deviation of the complete distribution equal those of the reduced distribution multiplied by the scale parameter. The Tukey lambda distribution is defined by the quantile function

$x=Q\left(p\right)=\frac{{p}^{\lambda }-{\left(1-p\right)}^{\lambda }}{\lambda },\text{\hspace{0.17em}}\text{\hspace{0.17em}}0<p<1.$ (1)
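As an illustration of how definition (1) is used in practice, the following minimal Python sketch evaluates the quantile function and draws values by inverse-transform sampling. The function names are hypothetical, and the treatment of $\lambda =0$ as the limit $\mathrm{ln}\left(p/\left(1-p\right)\right)$ is an assumption of the sketch, chosen to agree with the logistic case listed later in this section.

```python
import math
import random

def tukey_lambda_quantile(p, lam):
    """Quantile function Q(p) of the reduced Tukey lambda distribution, eq. (1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if lam == 0.0:
        # Assumed limit of eq. (1) as lambda -> 0: the logistic quantile function.
        return math.log(p / (1.0 - p))
    return (p**lam - (1.0 - p)**lam) / lam

def sample_tukey_lambda(n, lam, seed=0):
    """Draw n values by inverse-transform sampling: X = Q(U), U uniform on (0, 1)."""
    rng = random.Random(seed)
    values = []
    while len(values) < n:
        u = rng.random()
        if u > 0.0:  # random() may return exactly 0.0, which Q does not accept
            values.append(tukey_lambda_quantile(u, lam))
    return values
```

For $\lambda =1$ the sketch returns $Q\left(p\right)=2p-1$, so sampled values fall between −1 and 1.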

This function cannot always be inverted analytically; the cumulative function and the density function can therefore be obtained only for some values of λ [11], namely $\lambda =-1,0,1/4,1/3,1/2,1,3/2,2,3,4$. The cumulative functions of the Tukey lambda distribution for these values are listed below:

$\lambda =-1,\text{\hspace{0.17em}}F\left(x\right)=\frac{-2+x+\sqrt{4+{x}^{2}}}{2x},\text{\hspace{0.17em}}-\infty <x<\infty$ (2)

$\lambda =0,\text{\hspace{0.17em}}F\left(x\right)=\frac{1}{1+{\text{e}}^{-x}},\text{\hspace{0.17em}}-\infty <x<\infty$ (3)

$\lambda =\frac{1}{4},\text{\hspace{0.17em}}F\left(x\right)=\frac{1}{2}+\frac{x}{512}\sqrt{-3584{x}^{2}-17{x}^{6}+\left(1024+12{x}^{4}\right)\sqrt{512+2{x}^{4}}},\text{\hspace{0.17em}}-4<x<4$ (4)

$\begin{array}{l}\lambda =\frac{1}{3},F\left(x\right)=\frac{1}{2}-\frac{5{x}^{3}}{216}+\frac{{x}^{5}}{72{\left(5832+{x}^{6}+108\sqrt{2916+{x}^{6}}\right)}^{1/3}}\\ \text{ }+\frac{x}{72}{\left(5832+{x}^{6}+108\sqrt{2916+{x}^{6}}\right)}^{1/3},\text{\hspace{0.17em}}-3<x<3\end{array}$ (5)

$\lambda =\frac{1}{2},\text{\hspace{0.17em}}F\left(x\right)=\frac{1}{8}\left(4+x\sqrt{8-{x}^{2}}\right),\text{\hspace{0.17em}}-2<x<2$ (6)

$\lambda =1,\text{\hspace{0.17em}}F\left(x\right)=\frac{1+x}{2},\text{\hspace{0.17em}}-1<x<1$ (7)

$\begin{array}{l}\lambda =\frac{3}{2},\\ F\left(x\right)=\frac{1}{2}\left[1-\sqrt{-2+\frac{1+18{x}^{2}}{{\left(1-45{x}^{2}-\frac{81{x}^{4}}{2}+\frac{3}{2}x\sqrt{{\left(-4+9{x}^{2}\right)}^{3}}\right)}^{1/3}}+{\left(1-45{x}^{2}-\frac{81{x}^{4}}{2}+\frac{3}{2}x\sqrt{{\left(-4+9{x}^{2}\right)}^{3}}\right)}^{1/3}}\right],-\frac{2}{3}<x<\frac{2}{3}\end{array}$ (8)

$\lambda =2,\text{\hspace{0.17em}}F\left(x\right)=\frac{1+2x}{2},\text{\hspace{0.17em}}-\frac{1}{2}<x<\frac{1}{2}$ (9)

$\lambda =3,F\left(x\right)=\frac{1}{2}\left(1-\frac{1}{{\left(6x+\sqrt{1+36{x}^{2}}\right)}^{1/3}}+{\left(6x+\sqrt{1+36{x}^{2}}\right)}^{1/3}\right),-\frac{1}{3}<x<\frac{1}{3}$ (10)

$\lambda =4,F\left(x\right)=\frac{1}{2}\left(1-\frac{1}{{3}^{1/3}{\left(36x+\sqrt{3}\sqrt{1+432{x}^{2}}\right)}^{1/3}}+\frac{{\left(36x+\sqrt{3}\sqrt{1+432{x}^{2}}\right)}^{1/3}}{{3}^{2/3}}\right),-\frac{1}{4}<x<\frac{1}{4}$ (11)

For other values of λ, the cumulative function must be obtained by numerical inversion of $Q\left(p\right)$.
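The paper does not say which inversion method was used. One simple possibility, shown here only as a sketch, is bisection: since $Q\left(p\right)$ is strictly increasing in $p$, the equation $Q\left(p\right)=x$ has at most one solution in (0, 1). The function names and the tolerance are assumptions of this example.

```python
import math

def quantile(p, lam):
    """Quantile function of eq. (1), with the logistic limit assumed at lam = 0."""
    if lam == 0.0:
        return math.log(p / (1.0 - p))
    return (p**lam - (1.0 - p)**lam) / lam

def cdf_by_bisection(x, lam, tol=1e-12):
    """Approximate F(x) by solving Q(p) = x for p with bisection on (0, 1)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if quantile(mid, lam) < x:
            lo = mid   # the solution lies to the right of mid
        else:
            hi = mid   # the solution lies at or to the left of mid
    return 0.5 * (lo + hi)
```

For values of λ with a closed form, the result can be compared with the formulas listed above; for example, `cdf_by_bisection(1.0, 0.0)` should agree with the logistic value $1/\left(1+{\text{e}}^{-1}\right)$ up to the tolerance.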

For the Tukey lambda distribution, several characteristic values are known as functions of λ: mean, mode, median, standard deviation, skewness index, excess kurtosis index, entropy and characteristic function. Expressions for the mean difference and the mean deviation are not known.

3. Variability Indexes of Tukey Lambda Distribution

The variance of Tukey lambda distribution as a function of λ parameter [12] is

${\sigma }^{2}=\frac{2}{{\lambda }^{2}}\left[\frac{1}{1+2\lambda }-\frac{\Gamma {\left(\lambda +1\right)}^{2}}{\Gamma \left(2\lambda +2\right)}\right],\lambda >-\frac{1}{2}.$ (12)
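Formula (12) can be evaluated directly. The sketch below is one possible implementation, with a hypothetical function name; it uses log-gamma values to form the gamma ratio and, as an assumption of the sketch, replaces the formula by the logistic variance ${\pi }^{2}/3$ very close to $\lambda =0$, where (12) becomes a 0/0 expression in floating point.

```python
import math

def tukey_lambda_variance(lam):
    """Variance of the reduced Tukey lambda distribution, eq. (12)."""
    if lam <= -0.5:
        raise ValueError("the variance is finite only for lam > -1/2")
    if abs(lam) < 1e-6:
        # Assumed limiting value at lam = 0 (logistic distribution).
        return math.pi**2 / 3.0
    # Gamma(lam + 1)^2 / Gamma(2 lam + 2), computed through log-gamma.
    ratio = math.exp(2.0 * math.lgamma(lam + 1.0) - math.lgamma(2.0 * lam + 2.0))
    return (2.0 / lam**2) * (1.0 / (1.0 + 2.0 * lam) - ratio)
```

For $\lambda =1$ and $\lambda =2$ the distribution is uniform on (−1, 1) and (−1/2, 1/2) respectively, so the function should return 1/3 and 1/12.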

By using the cumulative functions derived by the inversion of quantile functions of Tukey lambda distribution, mean difference and mean deviation values are obtained and shown in Table 1.

Mean difference values for integer values of λ from 1 to 10 lie exactly on the curve

$\Delta \left(\lambda \right)=\frac{4}{2+3\lambda +{\lambda }^{2}},\text{\hspace{0.17em}}\lambda >-1.$ (13)

Values of Δ calculated numerically for other values of the λ parameter also lie on this function, which can therefore be considered a general expression of the mean difference of the Tukey lambda distribution. The function takes non-negative finite values for $\lambda >-1$, as shown in Figure 1.

Therefore, the mean difference in Tukey lambda distribution has a domain $\lambda >-1$ which is wider than the one of standard deviation $\lambda >-1/2$.
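Expression (13) can also be cross-checked analytically. The following sketch assumes the standard quantile representation of the mean difference, $\Delta =2{\int }_{0}^{1}\left(2p-1\right)Q\left(p\right)\text{d}p$, which is not stated in the paper and is used here only as a verification:

$\Delta =2{\int }_{0}^{1}\left(2p-1\right)Q\left(p\right)\text{d}p=\frac{2}{\lambda }\left[{\int }_{0}^{1}\left(2p-1\right){p}^{\lambda }\text{d}p-{\int }_{0}^{1}\left(2p-1\right){\left(1-p\right)}^{\lambda }\text{d}p\right]$

${\int }_{0}^{1}\left(2p-1\right){p}^{\lambda }\text{d}p=\frac{2}{\lambda +2}-\frac{1}{\lambda +1}=\frac{\lambda }{\left(\lambda +1\right)\left(\lambda +2\right)},\text{\hspace{0.17em}}{\int }_{0}^{1}\left(2p-1\right){\left(1-p\right)}^{\lambda }\text{d}p=-\frac{\lambda }{\left(\lambda +1\right)\left(\lambda +2\right)}$

$\Delta =\frac{2}{\lambda }\cdot \frac{2\lambda }{\left(\lambda +1\right)\left(\lambda +2\right)}=\frac{4}{\left(\lambda +1\right)\left(\lambda +2\right)}=\frac{4}{2+3\lambda +{\lambda }^{2}}$

Both integrals converge exactly when $\lambda >-1$, which agrees with the domain stated above.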

Let us now consider the mean deviation. First, the mean of the distribution exists only for $\lambda >-1$, so this domain also applies to the mean deviation. Mean deviation values for integer values of λ from 1 to 10 lie exactly on the function

$\delta \left(\lambda \right)=\frac{{2}^{1-\lambda }\left({2}^{\lambda }-1\right)}{\lambda \left(\lambda +1\right)},\text{\hspace{0.17em}}\lambda >-1.$ (14)

Values of δ calculated numerically for other values of the λ parameter also lie exactly on this function, which can therefore be considered the expression of the mean deviation of the Tukey lambda distribution. The function takes non-negative finite values for $\lambda >-1$, as shown in Figure 2.

Figure 1. Mean difference of Tukey lambda distribution as a function of λ parameter.

Figure 2. Mean deviation of Tukey lambda distribution as a function of λ parameter.

Table 1. Values of mean difference and mean deviation for some values of λ parameter in Tukey lambda distribution.

The mean deviation of Tukey lambda distribution has, therefore, a domain wider than the one of standard deviation.
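A numerical cross-check of (14) can be sketched as follows. Because the reduced distribution is symmetric about zero, the mean deviation equals ${\int }_{0}^{1}\left|Q\left(p\right)\right|\text{d}p$; the sketch approximates this integral with a midpoint rule. The function names, the number of subintervals and the limiting value $2\mathrm{ln}2$ used at $\lambda =0$ are assumptions of the example, not part of the paper's procedure.

```python
import math

def quantile(p, lam):
    """Quantile function of eq. (1), with the logistic limit assumed at lam = 0."""
    if lam == 0.0:
        return math.log(p / (1.0 - p))
    return (p**lam - (1.0 - p)**lam) / lam

def mean_deviation(lam):
    """Closed form (14) for the mean deviation; requires lam > -1."""
    if lam <= -1.0:
        raise ValueError("the mean deviation is finite only for lam > -1")
    if lam == 0.0:
        return 2.0 * math.log(2.0)  # assumed limit of (14) as lam -> 0
    return 2.0**(1.0 - lam) * (2.0**lam - 1.0) / (lam * (lam + 1.0))

def mean_deviation_numeric(lam, n=20000):
    """Midpoint-rule approximation of the integral of |Q(p)| over (0, 1)."""
    h = 1.0 / n
    return h * sum(abs(quantile((i + 0.5) * h, lam)) for i in range(n))
```

For $\lambda =1$ (uniform on (−1, 1)) both functions should give 1/2. For negative λ the integrand is unbounded near the endpoints, so a plain midpoint rule converges slowly and a finer or adaptive rule would be preferable.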

4. Relations between Variability Indexes of Tukey Lambda Distribution

Inverting the expression (13) of the mean difference of the Tukey lambda distribution as a function of the λ parameter yields the two roots

${\lambda }_{1}=\frac{-3\Delta +\sqrt{\Delta }\sqrt{16+\Delta }}{2\Delta }$ (15)

and

${\lambda }_{2}=\frac{-3\Delta -\sqrt{\Delta }\sqrt{16+\Delta }}{2\Delta }.$ (16)

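The second solution is always negative (indeed ${\lambda }_{2}<-2$ for every $\Delta >0$, which lies outside the domain $\lambda >-1$) and therefore cannot be used to obtain the relationship between ∆ and σ [13].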
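The two roots can be checked with a short round trip: compute Δ from a chosen λ through (13), then recover λ through (15). The sketch below uses hypothetical function names.

```python
import math

def mean_difference(lam):
    """Closed form (13) for the mean difference; requires lam > -1."""
    return 4.0 / (2.0 + 3.0 * lam + lam**2)

def lambda_roots(delta):
    """Roots (15) and (16) of Delta * (lam^2 + 3 lam + 2) = 4 for Delta > 0."""
    if delta <= 0.0:
        raise ValueError("delta must be positive")
    root = math.sqrt(delta) * math.sqrt(16.0 + delta)
    lam1 = (-3.0 * delta + root) / (2.0 * delta)
    lam2 = (-3.0 * delta - root) / (2.0 * delta)
    return lam1, lam2
```

For example, $\lambda =1$ gives $\Delta =2/3$, and `lambda_roots(2.0 / 3.0)` should return values close to 1 and −4; only the first lies in the admissible domain.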

Substituting the first solution ${\lambda }_{1}$ (15) into the expression of the standard deviation gives an analytical relationship between the standard deviation and the mean difference of the Tukey lambda distribution:

$\sigma =\frac{2\sqrt{\frac{2}{\sqrt{16/\Delta +1}-2}-\frac{2\text{Γ}{\left[\frac{1}{2}\left(\sqrt{16/\Delta +1}-1\right)\right]}^{2}}{\text{Γ}\left[\sqrt{16/\Delta +1}-1\right]}}}{\sqrt{16/\Delta +1}-3},\Delta >0.$ (17)

Said relationship is represented in Figure 3.

As can be seen, the standard deviation increases quickly as the mean difference increases.
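The relationship (17) can be reproduced numerically in two ways: by composing root (15) with the variance formula (12), or by transcribing (17) directly. The sketch below does both, under stated assumptions: the function names are hypothetical; the composed version is restricted to $0<\Delta <16/3$, the range for which ${\lambda }_{1}>-1/2$ and (12) applies; the direct transcription is exercised only for $0<\Delta <2$ (that is, $\lambda >0$); and the logistic value $\pi /\sqrt{3}$ is used close to $\lambda =0$.

```python
import math

def std_dev_from_lambda(lam):
    """Standard deviation from eq. (12); requires lam > -1/2."""
    if abs(lam) < 1e-6:
        return math.pi / math.sqrt(3.0)  # assumed limiting value at lam = 0
    ratio = math.exp(2.0 * math.lgamma(lam + 1.0) - math.lgamma(2.0 * lam + 2.0))
    return math.sqrt((2.0 / lam**2) * (1.0 / (1.0 + 2.0 * lam) - ratio))

def std_dev_from_mean_difference(delta):
    """Compose root (15) with eq. (12)."""
    if not 0.0 < delta < 16.0 / 3.0:
        raise ValueError("delta must satisfy 0 < delta < 16/3")
    lam1 = (-3.0 * delta + math.sqrt(delta) * math.sqrt(16.0 + delta)) / (2.0 * delta)
    return std_dev_from_lambda(lam1)

def std_dev_eq17(delta):
    """Direct transcription of eq. (17), used here only for 0 < delta < 2."""
    s = math.sqrt(16.0 / delta + 1.0)
    inner = 2.0 / (s - 2.0) - 2.0 * math.gamma(0.5 * (s - 1.0))**2 / math.gamma(s - 1.0)
    return 2.0 * math.sqrt(inner) / (s - 3.0)
```

For $\Delta =2/3$ (that is, $\lambda =1$) both routes should give $\sigma =1/\sqrt{3}$.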

Let us now consider the relationship between the mean difference and the mean deviation.

Substituting root ${\lambda }_{1}$ into the formula of the mean deviation (14) gives the following analytical relationship

$\delta \left(\Delta \right)=\frac{2\left({2}^{\frac{3}{2}-\frac{\sqrt{\frac{16}{\Delta }+1}}{2}}-1\right)}{\sqrt{\frac{16}{\Delta }+1}-\frac{4}{\Delta }-1},\text{\hspace{0.17em}}\Delta >0.$ (18)

As shown in Figure 4, it is evident that the relationship between the two indexes is almost linear.
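Relationship (18) is straightforward to evaluate. In the sketch below (hypothetical function name), the case $\Delta =2$, which corresponds to $\lambda =0$ and makes (18) a 0/0 expression, is handled by returning the limiting value $2\mathrm{ln}2$; this special case is an assumption of the sketch.

```python
import math

def mean_deviation_from_mean_difference(delta):
    """Direct transcription of eq. (18) for delta > 0."""
    if delta <= 0.0:
        raise ValueError("delta must be positive")
    s = math.sqrt(16.0 / delta + 1.0)
    denom = s - 4.0 / delta - 1.0
    if denom == 0.0:
        # delta = 2 (lam = 0): assumed limiting value of eq. (18).
        return 2.0 * math.log(2.0)
    return 2.0 * (2.0**(1.5 - 0.5 * s) - 1.0) / denom
```

As a check, $\Delta =2/3$ ($\lambda =1$) should give $\delta =1/2$, and $\Delta =0.2$ ($\lambda =3$) should give $\delta =7/48$, in agreement with (14). Values of Δ very close to, but not exactly equal to, 2 are subject to cancellation error in this simple form.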

Finally, let us consider the relationship between mean deviation and standard deviation of Tukey lambda distribution.

Since the λ parameter cannot be expressed as a function of the mean deviation, a numerical procedure is needed: the two variability indexes are computed for a sufficiently dense set of λ values, and the resulting pairs of values are plotted on a Cartesian plane.

Choosing the values λ = −0.49, −0.48, ..., 5.00 yields the numerical relationship shown in Figure 5.
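The grid procedure just described can be sketched as follows. The sketch assumes formulas (12) and (14) for the two indexes, uses hypothetical function names, and substitutes the logistic values $2\mathrm{ln}2$ and $\pi /\sqrt{3}$ at $\lambda =0$, where both formulas are 0/0 expressions.

```python
import math

def mean_deviation(lam):
    """Closed form (14); the value at lam = 0 is the assumed limit 2 ln 2."""
    if lam == 0.0:
        return 2.0 * math.log(2.0)
    return 2.0**(1.0 - lam) * (2.0**lam - 1.0) / (lam * (lam + 1.0))

def std_dev(lam):
    """Square root of eq. (12); the value at lam = 0 is the assumed limit pi / sqrt(3)."""
    if lam == 0.0:
        return math.pi / math.sqrt(3.0)
    ratio = math.exp(2.0 * math.lgamma(lam + 1.0) - math.lgamma(2.0 * lam + 2.0))
    return math.sqrt((2.0 / lam**2) * (1.0 / (1.0 + 2.0 * lam) - ratio))

def deviation_pairs(start=-0.49, stop=5.00, step=0.01):
    """Return (lam, mean deviation, standard deviation) for each lam on the grid."""
    n = round((stop - start) / step)
    pairs = []
    for i in range(n + 1):
        lam = round(start + i * step, 2)  # avoid accumulated floating-point drift
        pairs.append((lam, mean_deviation(lam), std_dev(lam)))
    return pairs
```

Plotting the second component against the third for the 550 grid points gives a curve of the kind shown in Figure 5.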

Figure 3. Analytical relationship between mean difference and standard deviation of Tukey lambda distribution.

Figure 4. Analytical relationship between mean difference and mean deviation of Tukey lambda distribution.

Figure 5. Numerical relationship between mean deviation and standard deviation of Tukey lambda distribution.

As can be seen, the standard deviation of the Tukey lambda distribution increases with the mean deviation at a slowly accelerating rate.

5. Concluding Remarks

In this work, formulas for the mean difference and the mean deviation of the Tukey lambda distribution have been obtained. This is an original contribution aimed at increasing the knowledge of this distribution model. These results allowed us to investigate the relationships among the three main variability indexes (standard deviation, mean deviation and mean difference) for the Tukey lambda model.

Girone Section 1; Massari Section 3; Manca Section 4; D’Uggento Sections 2 and 5.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

[1] Girone, G., Massari, A. and Manca, F. (2016) The Relation between the Mean Difference and the Standard Deviation in Continuous Distribution Models. Quality and Quantity, 51, 481-507. https://doi.org/10.1007/s11135-016-0398-y

[2] D’Uggento, A.M., Girone, G. and Marin, C. (2016) The Relation between the Mean Difference and the Mean Deviation in 11 Continuous Distribution Models. Quality and Quantity, 51, 595-615. https://doi.org/10.1007/s11135-016-0427-x

[3] Davydov, Y. and Greselin, F. (2019) Inferential Results for a New Measure of Inequality. The Econometrics Journal, 22, 153-172. https://doi.org/10.1093/ectj/utz004

[4] Greselin, F. and Zitikis, R. (2018) From the Classical Gini Index of Income Inequality to a New Zenga-Type Relative Measure of Risk: A Modeller’s Perspective. Econometrics, 6, 1-20. https://doi.org/10.3390/econometrics6010004

[5] Girone, G. and Mazzitelli, D. (2007) La differenza media nei principali modelli distributivi continui. Annali del Dipartimento di Scienze Statistiche “Carlo Cecchi”, VI, 43-62.

[6] Girone, G., Massari, A., Campobasso, F., Manca, F., D’Uggento, A.M., Marin, C. and Nannavecchia, A. (2017) Rassegna sulla differenza media di distribuzioni teoriche continue. Rivista di Economia e Commercio, V, 13-28.

[7] Ramberg, J. and Schmeiser, B. (1972) An Approximate Method for Generating Symmetric Random Variables. Communications of the ACM, 15, 987-990. https://doi.org/10.1145/355606.361888

[8] Ramberg, J., et al. (1979) A Probability Distribution and Its Uses in Fitting Data. Technometrics, 21, 201-214. https://doi.org/10.1080/00401706.1979.10489750

[9] Tukey, J. (1960) The Practical Relationship between the Common Transformations of Percentages of Counts and Amounts. Technical Report 36, Statistical Techniques Research Group, Princeton University.

[10] Johnson, N.L. and Kotz, S. (1973) Extended and Multivariate Tukey Lambda Distributions. Biometrika, 60, 655-661. https://doi.org/10.1093/biomet/60.3.655

[11] Sarabia, J.M. (1997) A Hierarchy of Lorenz Curves Based on the Generalized Tukey’s Lambda Distribution. Econometric Reviews, 16, 305-320. https://doi.org/10.1080/07474939708800389

[12] Johnson, N., Kotz, S. and Balakrishnan, N. (1994) Continuous Univariate Distributions. Vol. 1, Wiley, New York.

[13] Hastings, C., Mosteller, F., Tukey, J.W. and Winsor, C.P. (1947) Low Moments for Small Samples: A Comparative Study of Order Statistics. Annals of Mathematical Statistics, 18, 413-426. https://doi.org/10.1214/aoms/1177730388