A Normal Weighted Inverse Gaussian Distribution for Skewed and Heavy-Tailed Data

Abstract

High-frequency financial data are characterized by non-normality: asymmetric, leptokurtic and fat-tailed behaviour. The normal distribution is therefore inadequate for capturing these characteristics, and various flexible distributions have been proposed. It is well known that mixture distributions produce flexible models with good statistical and probabilistic properties. In this work, a finite mixture of two special cases of the Generalized Inverse Gaussian distribution is constructed. Using this finite mixture as the mixing distribution of a Normal Variance-Mean Mixture, we obtain a Normal Weighted Inverse Gaussian (NWIG) distribution, and the second objective is to construct this distribution and obtain its properties. The maximum likelihood parameter estimates of the proposed model are obtained via the EM algorithm, and three data sets are used for application. The results show that the proposed model is flexible and fits the data well.

Share and Cite:

Maina, C., Weke, P., Ogutu, C. and Ottieno, J. (2022) A Normal Weighted Inverse Gaussian Distribution for Skewed and Heavy-Tailed Data. Applied Mathematics, 13, 163-177. doi: 10.4236/am.2022.132013.

1. Introduction

It is well known that mixture distributions produce flexible models with good statistical and probabilistic properties. Our first objective, therefore, is to construct and obtain properties of a finite mixture of two special cases of the Generalized Inverse Gaussian (GIG) distribution. These two special cases are related to the Inverse Gaussian distribution, which is itself a special case of the GIG distribution.

The Generalized Hyperbolic Distribution (GHD), introduced by Barndorff-Nielsen [1] as a Normal Variance-Mean Mixture, is obtained when the Generalized Inverse Gaussian (GIG) distribution is the mixing distribution. Barndorff-Nielsen [2] introduced the Normal Inverse Gaussian (NIG) distribution, obtained when the mixing distribution is the Inverse Gaussian (IG). The IG is the special case of the GIG with index parameter $\lambda = -\frac{1}{2}$.

The two special cases and their finite mixture are weighted Inverse Gaussian distributions. Using this finite mixture as a mixing distribution to the Normal Variance Mean Mixture we get a Normal Weighted Inverse Gaussian (NWIG) distribution. The second objective, therefore, is to construct and obtain properties of the NWIG distribution.

The maximum likelihood parameter estimates of the proposed model are estimated via EM algorithm and three data sets are used for application.

In the literature, the Normal Inverse Gaussian (NIG) distribution has been used repeatedly for financial data, which are skewed, leptokurtic and heavy-tailed because they are collected over short time intervals, such as daily or weekly. Our third objective is to compare the log-likelihood functions of the NWIG and NIG distributions.

The Generalized Inverse Gaussian distribution has three parameters $\lambda, \delta, \gamma$ and is denoted by $GIG(\lambda, \delta, \gamma)$. When $\lambda = -\frac{1}{2}$, we have $GIG(-\frac{1}{2}, \delta, \gamma)$, which is the Inverse Gaussian (IG) distribution. When $\lambda = \frac{1}{2}$, we have $GIG(\frac{1}{2}, \delta, \gamma)$, which is the Reciprocal Inverse Gaussian (RIG) distribution. The third special case is $GIG(-\frac{3}{2}, \delta, \gamma)$.

$GIG(\frac{1}{2}, \delta, \gamma)$ and $GIG(-\frac{3}{2}, \delta, \gamma)$ can be expressed in terms of $GIG(-\frac{1}{2}, \delta, \gamma)$ and are weighted IG distributions. Their finite mixture, i.e., $p \, GIG(\frac{1}{2}, \delta, \gamma) + (1-p) \, GIG(-\frac{3}{2}, \delta, \gamma)$, is also a WIG distribution.

The concept of weighted distribution was introduced by Fisher [3] and elaborated by Patil and Rao [4]. Gupta and Kundu [5] considered the finite mixture of the IG and the length-biased IG distributions. The Generalized Hyperbolic Distribution (GHD) is a normal variance-mean mixture with a GIG mixing distribution. It is a five-parameter distribution denoted by $GH(\lambda, \alpha, \beta, \delta, \mu)$. For $\lambda = -\frac{1}{2}$ we have the Normal Inverse Gaussian (NIG) distribution. For $\lambda = \frac{1}{2}$ and $\lambda = -\frac{3}{2}$ we have Normal Weighted Inverse Gaussian (NWIG) distributions.

The rest of the paper is organised as follows: Section 2 deals with the proposed mixing distribution. Section 3 is on the proposed mixed model, the posterior distribution and posterior expectations. Section 4 is on the EM algorithm estimation procedure. Application and conclusions are in Sections 5 and 6, respectively.

2. Proposed Mixing Distribution

We show that two special cases of the Generalized Inverse Gaussian (GIG) distribution can be expressed as Weighted Inverse Gaussian (WIG) distributions. A finite mixture of these cases can also be expressed as a WIG distribution. The Generalized Inverse Gaussian (GIG) distribution is given by

$$g(z) = \frac{(\gamma/\delta)^{\lambda} z^{\lambda-1}}{2 K_{\lambda}(\delta\gamma)} \exp\left\{-\frac{1}{2}\left(\frac{\delta^{2}}{z} + \gamma^{2} z\right)\right\} \qquad (1)$$

where

$z > 0; \quad -\infty < \lambda < \infty, \; \delta > 0, \; \gamma > 0$

and $K_{\lambda}(\omega)$ is the modified Bessel function of the third kind of order $\lambda$ evaluated at the point $\omega$.

In short form, it is stated as

$Z \sim GIG(\lambda, \delta, \gamma).$

The moments about the origin of the $GIG(\lambda, \delta, \gamma)$ distribution are given by

$$E(Z^{r}) = \left(\frac{\delta}{\gamma}\right)^{r} \frac{K_{\lambda+r}(\delta\gamma)}{K_{\lambda}(\delta\gamma)} \qquad (2)$$

Remark: This expectation formula also holds when $r$ is a negative integer.
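As an illustration, the moment formula (2) can be checked numerically against the density (1). The following is a minimal sketch, assuming SciPy is available; the parameter values are arbitrary and purely illustrative.

```python
# Numerical check of the GIG moment formula (2) against the density (1).
import numpy as np
from scipy.special import kv          # modified Bessel function of the third kind, K_lambda
from scipy.integrate import quad

def gig_pdf(z, lam, delta, gamma):
    """GIG(lambda, delta, gamma) density, formula (1)."""
    const = (gamma / delta) ** lam / (2.0 * kv(lam, delta * gamma))
    return const * z ** (lam - 1.0) * np.exp(-0.5 * (delta ** 2 / z + gamma ** 2 * z))

def gig_moment(r, lam, delta, gamma):
    """E(Z^r) from formula (2); r may be negative."""
    return (delta / gamma) ** r * kv(lam + r, delta * gamma) / kv(lam, delta * gamma)

lam, delta, gamma = -0.5, 1.2, 0.8    # arbitrary illustrative values
for r in (1, 2, -1):
    closed_form = gig_moment(r, lam, delta, gamma)
    numeric, _ = quad(lambda z: z ** r * gig_pdf(z, lam, delta, gamma), 0, np.inf)
    print(r, closed_form, numeric)     # the two columns should agree
```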

Special Cases

When $\lambda = -\frac{1}{2}$,

$$g_{1}(z) = \frac{\delta e^{\delta\gamma}}{\sqrt{2\pi}}\, z^{-\frac{3}{2}} \exp\left\{-\frac{1}{2}\left(\frac{\delta^{2}}{z} + \gamma^{2} z\right)\right\} \qquad (3)$$

This is an Inverse Gaussian (IG) distribution.

When $\lambda = \frac{1}{2}$,

$$g_{2}(z) = \frac{\gamma e^{\delta\gamma}}{\sqrt{2\pi}}\, z^{-\frac{1}{2}} \exp\left\{-\frac{1}{2}\left(\frac{\delta^{2}}{z} + \gamma^{2} z\right)\right\} \qquad (4)$$

This is a Reciprocal Inverse Gaussian (RIG) distribution.

When $\lambda = -\frac{3}{2}$,

$$g_{3}(z) = \frac{\delta^{3} e^{\delta\gamma}}{\sqrt{2\pi}\,(1+\delta\gamma)}\, z^{-\frac{5}{2}} \exp\left\{-\frac{1}{2}\left(\frac{\delta^{2}}{z} + \gamma^{2} z\right)\right\} \qquad (5)$$

which is the $GIG(-\frac{3}{2}, \delta, \gamma)$ distribution.

Using the concept of weighted distributions introduced by Fisher [3], it can be shown that the two special cases are weighted Inverse Gaussian distributions. More specifically, we express $g_{2}$ and $g_{3}$ in terms of $g_{1}$ as follows:

$$g_{2}(z) = \frac{\gamma}{\delta}\, z \left[\frac{\delta e^{\delta\gamma}}{\sqrt{2\pi}}\, z^{-\frac{3}{2}} \exp\left\{-\frac{1}{2}\left(\frac{\delta^{2}}{z} + \gamma^{2} z\right)\right\}\right] = \frac{\gamma}{\delta}\, z\, g_{1}(z) \qquad (6)$$

and

$$g_{3}(z) = \frac{\delta^{2}}{1+\delta\gamma}\, z^{-1} \left[\frac{\delta e^{\delta\gamma}}{\sqrt{2\pi}}\, z^{-\frac{3}{2}} \exp\left\{-\frac{1}{2}\left(\frac{\delta^{2}}{z} + \gamma^{2} z\right)\right\}\right] = \frac{\delta^{2}}{1+\delta\gamma}\, z^{-1} g_{1}(z) \qquad (7)$$

A finite mixture of the two cases is given by

$$g_{4}(z) = p\, g_{2}(z) + (1-p)\, g_{3}(z) = \left[p\,\frac{\gamma}{\delta}\, z + (1-p)\,\frac{\delta^{2}}{1+\delta\gamma}\,\frac{1}{z}\right] g_{1}(z)$$

Put

$$p = \frac{\delta^{3}}{\delta^{3}+\gamma} \qquad (8)$$

$$g_{4}(z) = \left[\frac{\gamma\delta^{2}}{\delta^{3}+\gamma}\, z + \frac{\gamma\delta^{2}}{(\delta^{3}+\gamma)(1+\delta\gamma)}\,\frac{1}{z}\right] g_{1}(z) = \frac{\gamma\delta^{2}}{\delta^{3}+\gamma}\left(z + \frac{1}{1+\delta\gamma}\,\frac{1}{z}\right) g_{1}(z) \qquad (9)$$
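As a quick sanity check that (9) defines a proper density, the following sketch integrates it numerically; it assumes SciPy, and the parameter values are arbitrary.

```python
# Sketch of the mixing density (9) and a numerical check that it integrates to 1.
import numpy as np
from scipy.integrate import quad

def ig_pdf(z, delta, gamma):
    """g1: Inverse Gaussian density, formula (3)."""
    return (delta * np.exp(delta * gamma) / np.sqrt(2 * np.pi)) * z ** (-1.5) \
        * np.exp(-0.5 * (delta ** 2 / z + gamma ** 2 * z))

def g4_pdf(z, delta, gamma):
    """Finite-mixture (weighted IG) density, formula (9)."""
    weight = gamma * delta ** 2 / (delta ** 3 + gamma)
    return weight * (z + 1.0 / ((1.0 + delta * gamma) * z)) * ig_pdf(z, delta, gamma)

delta, gamma = 1.5, 0.7                 # arbitrary illustrative values
total, _ = quad(lambda z: g4_pdf(z, delta, gamma), 0, np.inf)
print(total)                            # should be close to 1
```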

3. Proposed Model

Construction of the Mixed Model

Suppose the conditional distribution of $X$ given $Z = z$ is $N(\mu + \beta z, z)$, and that $Z$ follows the mixing distribution defined by formula (9). The mixed model is constructed as follows:

$$\begin{aligned}
f(x) &= \int_{0}^{\infty} \frac{1}{\sqrt{2\pi z}} \, e^{-\frac{[(x-\mu)-\beta z]^{2}}{2z}} \, g_{4}(z)\, dz \\
&= \frac{\gamma \delta^{3} e^{\delta\gamma} e^{\beta(x-\mu)}}{2\pi(\delta^{3}+\gamma)} \int_{0}^{\infty} \left(z + \frac{1}{1+\delta\gamma}\, z^{-1}\right) z^{-2} \, e^{-\frac{\alpha^{2}}{2}\left(z + \frac{\delta^{2}\phi(x)}{\alpha^{2} z}\right)}\, dz \\
&= \frac{\gamma \delta^{3} e^{\delta\gamma} e^{\beta(x-\mu)}}{2\pi(\delta^{3}+\gamma)} \int_{0}^{\infty} \left(z^{0-1} + \frac{z^{-2-1}}{1+\delta\gamma}\right) e^{-\frac{\alpha^{2}}{2}\left(z + \frac{\delta^{2}\phi(x)}{\alpha^{2} z}\right)}\, dz \\
&= \frac{\gamma \delta^{3} e^{\delta\gamma} e^{\beta(x-\mu)}}{\pi(\delta^{3}+\gamma)} \left\{ K_{0}\!\left(\alpha\delta\sqrt{\phi(x)}\right) + \frac{\alpha^{2}}{\delta^{2}\phi(x)} \, \frac{K_{2}\!\left(\alpha\delta\sqrt{\phi(x)}\right)}{1+\delta\gamma} \right\}
\end{aligned} \qquad (10)$$

$$\begin{aligned}
f(x) &= \frac{\gamma \delta e^{\delta\gamma} e^{\beta(x-\mu)}}{\pi\, \phi(x)\, (\delta^{3}+\gamma)} \left\{ \delta^{2}\phi(x)\, K_{0}\!\left(\alpha\delta\sqrt{\phi(x)}\right) + \frac{\alpha^{2}\, K_{2}\!\left(\alpha\delta\sqrt{\phi(x)}\right)}{1+\delta\gamma} \right\} \\
&= \frac{\gamma \delta e^{\delta\gamma} e^{\beta(x-\mu)}}{\pi\, \phi(x)\, (\delta^{3}+\gamma)(1+\delta\gamma)} \left\{ (1+\delta\gamma)\, \delta^{2}\phi(x)\, K_{0}\!\left(\alpha\delta\sqrt{\phi(x)}\right) + \alpha^{2} K_{2}\!\left(\alpha\delta\sqrt{\phi(x)}\right) \right\}
\end{aligned} \qquad (11)$$

where

$$\phi(x) = 1 + \frac{(x-\mu)^{2}}{\delta^{2}}$$

and

$$\alpha^{2} = \beta^{2} + \gamma^{2}$$

The log-likelihood function is

$$l = \log L = \sum_{i=1}^{n}\log f(x_{i}) = \sum_{i=1}^{n}\log\left\{\frac{\gamma \delta e^{\delta\gamma} e^{\beta(x_{i}-\mu)}}{\pi\, \phi(x_{i})\, (\delta^{3}+\gamma)(1+\delta\gamma)} \left[ (1+\delta\gamma)\, \delta^{2}\phi(x_{i})\, K_{0}\!\left(\alpha\delta\sqrt{\phi(x_{i})}\right) + \alpha^{2} K_{2}\!\left(\alpha\delta\sqrt{\phi(x_{i})}\right) \right]\right\}$$

$$\begin{aligned}
l &= \sum_{i=1}^{n}\Big\{\log(\delta\gamma) + \delta\gamma + \beta x_{i} - \beta\mu - \log\!\big((1+\delta\gamma)\pi(\delta^{3}+\gamma)\big) - \log\phi(x_{i}) \\
&\qquad\quad + \log\!\left[(1+\delta\gamma)\, \delta^{2}\phi(x_{i})\, K_{0}\!\left(\alpha\delta\sqrt{\phi(x_{i})}\right) + \alpha^{2} K_{2}\!\left(\alpha\delta\sqrt{\phi(x_{i})}\right)\right]\Big\} \\
&= n\log(\delta\gamma) + n\delta\gamma + \beta\sum_{i=1}^{n} x_{i} - n\beta\mu - n\log\!\big((1+\delta\gamma)\pi(\delta^{3}+\gamma)\big) - \sum_{i=1}^{n}\log\phi(x_{i}) \\
&\qquad + \sum_{i=1}^{n}\log\!\left[(1+\delta\gamma)\, \delta^{2}\phi(x_{i})\, K_{0}\!\left(\alpha\delta\sqrt{\phi(x_{i})}\right) + \alpha^{2} K_{2}\!\left(\alpha\delta\sqrt{\phi(x_{i})}\right)\right]
\end{aligned} \qquad (12)$$
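As a numerical companion to formulas (11) and (12), the following is a minimal sketch assuming SciPy; the data and parameter values at the bottom are illustrative only, not estimates from the paper.

```python
# Sketch of the NWIG density (11) and the log-likelihood (12).
import numpy as np
from scipy.special import kv

def nwig_pdf(x, mu, beta, delta, gamma):
    """Normal Weighted Inverse Gaussian density, formula (11)."""
    alpha2 = beta ** 2 + gamma ** 2
    phi = 1.0 + (x - mu) ** 2 / delta ** 2
    arg = np.sqrt(alpha2) * delta * np.sqrt(phi)
    const = gamma * delta * np.exp(delta * gamma) * np.exp(beta * (x - mu)) \
        / (np.pi * phi * (delta ** 3 + gamma) * (1.0 + delta * gamma))
    bracket = (1.0 + delta * gamma) * delta ** 2 * phi * kv(0, arg) + alpha2 * kv(2, arg)
    return const * bracket

def nwig_loglik(x, mu, beta, delta, gamma):
    """Log-likelihood (12), computed by summing the log of (11)."""
    return np.sum(np.log(nwig_pdf(np.asarray(x), mu, beta, delta, gamma)))

x = np.array([-0.02, 0.01, 0.03])       # hypothetical log-returns
print(nwig_loglik(x, mu=0.0, beta=-0.1, delta=0.05, gamma=10.0))
```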

Posterior Expectations

$$\begin{aligned}
E(Z \mid X = x) &= \frac{\int_{0}^{\infty} z\left(z + \frac{z^{-1}}{1+\delta\gamma}\right) z^{-2}\, e^{-\frac{\alpha^{2}}{2}\left(z + \frac{\delta^{2}\phi(x)}{\alpha^{2} z}\right)}\, dz}{\int_{0}^{\infty}\left(z + \frac{z^{-1}}{1+\delta\gamma}\right) z^{-2}\, e^{-\frac{\alpha^{2}}{2}\left(z + \frac{\delta^{2}\phi(x)}{\alpha^{2} z}\right)}\, dz}
= \frac{\frac{1}{2}\int_{0}^{\infty}\left(z^{1-1} + \frac{z^{-1-1}}{1+\delta\gamma}\right) e^{-\frac{\alpha^{2}}{2}\left(z + \frac{\delta^{2}\phi(x)}{\alpha^{2} z}\right)}\, dz}{\frac{1}{2}\int_{0}^{\infty}\left(z^{0-1} + \frac{z^{-2-1}}{1+\delta\gamma}\right) e^{-\frac{\alpha^{2}}{2}\left(z + \frac{\delta^{2}\phi(x)}{\alpha^{2} z}\right)}\, dz} \\
&= \frac{\left[\frac{\delta\sqrt{\phi(x)}}{\alpha} + \frac{1}{1+\delta\gamma}\left(\frac{\delta\sqrt{\phi(x)}}{\alpha}\right)^{-1}\right] K_{1}\!\left(\alpha\delta\sqrt{\phi(x)}\right)}{K_{0}\!\left(\alpha\delta\sqrt{\phi(x)}\right) + \frac{1}{1+\delta\gamma}\left(\frac{\alpha}{\delta\sqrt{\phi(x)}}\right)^{2} K_{2}\!\left(\alpha\delta\sqrt{\phi(x)}\right)} \\
&= \frac{\left[(1+\delta\gamma)\,\delta^{3}\big(\phi(x)\big)^{3/2} + \alpha^{2}\delta\sqrt{\phi(x)}\right] K_{1}\!\left(\alpha\delta\sqrt{\phi(x)}\right)}{\alpha(1+\delta\gamma)\,\delta^{2}\phi(x)\, K_{0}\!\left(\alpha\delta\sqrt{\phi(x)}\right) + \alpha^{3} K_{2}\!\left(\alpha\delta\sqrt{\phi(x)}\right)}
\end{aligned} \qquad (13)$$

Similarly,

$$\begin{aligned}
E\!\left(\tfrac{1}{Z} \,\Big|\, X = x\right) &= \frac{\int_{0}^{\infty} z^{-1}\left(z + \frac{z^{-1}}{1+\delta\gamma}\right) z^{-2}\, e^{-\frac{\alpha^{2}}{2}\left(z + \frac{\delta^{2}\phi(x)}{\alpha^{2} z}\right)}\, dz}{\int_{0}^{\infty}\left(z + \frac{z^{-1}}{1+\delta\gamma}\right) z^{-2}\, e^{-\frac{\alpha^{2}}{2}\left(z + \frac{\delta^{2}\phi(x)}{\alpha^{2} z}\right)}\, dz}
= \frac{\frac{1}{2}\int_{0}^{\infty}\left(z^{-1-1} + \frac{z^{-3-1}}{1+\delta\gamma}\right) e^{-\frac{\alpha^{2}}{2}\left(z + \frac{\delta^{2}\phi(x)}{\alpha^{2} z}\right)}\, dz}{\frac{1}{2}\int_{0}^{\infty}\left(z^{0-1} + \frac{z^{-2-1}}{1+\delta\gamma}\right) e^{-\frac{\alpha^{2}}{2}\left(z + \frac{\delta^{2}\phi(x)}{\alpha^{2} z}\right)}\, dz} \\
&= \frac{\frac{\alpha}{\delta\sqrt{\phi(x)}}\, K_{1}\!\left(\alpha\delta\sqrt{\phi(x)}\right) + \frac{1}{1+\delta\gamma}\left(\frac{\alpha}{\delta\sqrt{\phi(x)}}\right)^{3} K_{3}\!\left(\alpha\delta\sqrt{\phi(x)}\right)}{K_{0}\!\left(\alpha\delta\sqrt{\phi(x)}\right) + \frac{1}{1+\delta\gamma}\left(\frac{\alpha}{\delta\sqrt{\phi(x)}}\right)^{2} K_{2}\!\left(\alpha\delta\sqrt{\phi(x)}\right)} \\
&= \frac{\alpha\delta^{2}(1+\delta\gamma)\,\phi(x)\, K_{1}\!\left(\alpha\delta\sqrt{\phi(x)}\right) + \alpha^{3} K_{3}\!\left(\alpha\delta\sqrt{\phi(x)}\right)}{(1+\delta\gamma)\left(\delta\sqrt{\phi(x)}\right)^{3} K_{0}\!\left(\alpha\delta\sqrt{\phi(x)}\right) + \alpha^{2}\delta\sqrt{\phi(x)}\, K_{2}\!\left(\alpha\delta\sqrt{\phi(x)}\right)}
\end{aligned} \qquad (14)$$

$$\begin{aligned}
E(Z^{2} \mid X = x) &= \frac{\int_{0}^{\infty} z^{2}\left(z + \frac{z^{-1}}{1+\delta\gamma}\right) z^{-2}\, e^{-\frac{\alpha^{2}}{2}\left(z + \frac{\delta^{2}\phi(x)}{\alpha^{2} z}\right)}\, dz}{\int_{0}^{\infty}\left(z + \frac{z^{-1}}{1+\delta\gamma}\right) z^{-2}\, e^{-\frac{\alpha^{2}}{2}\left(z + \frac{\delta^{2}\phi(x)}{\alpha^{2} z}\right)}\, dz}
= \frac{\frac{1}{2}\int_{0}^{\infty}\left((1+\delta\gamma) z^{2-1} + z^{0-1}\right) e^{-\frac{\alpha^{2}}{2}\left(z + \frac{\delta^{2}\phi(x)}{\alpha^{2} z}\right)}\, dz}{\frac{1}{2}\int_{0}^{\infty}\left((1+\delta\gamma) z^{0-1} + z^{-2-1}\right) e^{-\frac{\alpha^{2}}{2}\left(z + \frac{\delta^{2}\phi(x)}{\alpha^{2} z}\right)}\, dz} \\
&= \frac{(1+\delta\gamma)\,\frac{\delta^{2}\phi(x)}{\alpha^{2}}\, K_{2}\!\left(\alpha\delta\sqrt{\phi(x)}\right) + K_{0}\!\left(\alpha\delta\sqrt{\phi(x)}\right)}{(1+\delta\gamma)\, K_{0}\!\left(\alpha\delta\sqrt{\phi(x)}\right) + \frac{\alpha^{2}}{\delta^{2}\phi(x)}\, K_{2}\!\left(\alpha\delta\sqrt{\phi(x)}\right)} \\
&= \frac{(1+\delta\gamma)\,\delta^{2}\phi(x)\, K_{2}\!\left(\alpha\delta\sqrt{\phi(x)}\right) + \alpha^{2} K_{0}\!\left(\alpha\delta\sqrt{\phi(x)}\right)}{\alpha^{2}(1+\delta\gamma)\, K_{0}\!\left(\alpha\delta\sqrt{\phi(x)}\right) + \frac{\alpha^{4}}{\delta^{2}\phi(x)}\, K_{2}\!\left(\alpha\delta\sqrt{\phi(x)}\right)}
\end{aligned} \qquad (15)$$
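The following sketch evaluates the three posterior expectations numerically from the unsimplified Bessel-ratio forms above; it assumes SciPy, and the parameter values are illustrative.

```python
# Sketch of the posterior expectations (13)-(15).
import numpy as np
from scipy.special import kv

def posterior_expectations(x, mu, beta, delta, gamma):
    """Return s = E(Z|x), w = E(1/Z|x), v = E(Z^2|x), elementwise."""
    x = np.asarray(x, dtype=float)
    alpha = np.sqrt(beta ** 2 + gamma ** 2)
    phi = 1.0 + (x - mu) ** 2 / delta ** 2
    c = delta * np.sqrt(phi)              # delta * sqrt(phi(x))
    omega = alpha * c                     # argument of the Bessel functions
    m = 1.0 + delta * gamma
    denom = kv(0, omega) + (alpha / c) ** 2 * kv(2, omega) / m
    s = ((c / alpha) + (alpha / c) / m) * kv(1, omega) / denom
    w = ((alpha / c) * kv(1, omega) + (alpha / c) ** 3 * kv(3, omega) / m) / denom
    v = ((c / alpha) ** 2 * kv(2, omega) + kv(0, omega) / m) / denom
    return s, w, v

s, w, v = posterior_expectations([-0.02, 0.01, 0.03], 0.0, -0.1, 0.05, 10.0)
print(s, w, v)
```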

4. EM Algorithm

4.1. Introduction

The EM algorithm is a powerful technique for maximum likelihood estimation from data containing missing values, or data that can be treated as containing missing values. It was introduced by Dempster et al. [6].

Karlis [7] treats the mixing operation as the mechanism responsible for producing the missing data.

Assume that the complete data consist of an observed part $X$ and an unobserved part $Z$. Kostas [8] observes that the log-likelihood of the complete data $(x_{i}, z_{i})$, $i = 1, 2, \ldots, n$, factorizes into two parts. This implies that the joint density of $X$ and $Z$ is given by

$$f(x, z) = f(x \mid z)\, g(z).$$

The likelihood function is

$$L = \prod_{i=1}^{n} f(x_{i} \mid z_{i})\, g(z_{i}) = \prod_{i=1}^{n} f(x_{i} \mid z_{i}) \prod_{i=1}^{n} g(z_{i})$$

$$\log L = \log\prod_{i=1}^{n} f(x_{i} \mid z_{i}) + \log\prod_{i=1}^{n} g(z_{i}) = \sum_{i=1}^{n}\log f(x_{i} \mid z_{i}) + \sum_{i=1}^{n}\log g(z_{i}) = l_{1} + l_{2}$$

where

$$l_{1} = \sum_{i=1}^{n}\log f(x_{i} \mid z_{i})$$

and

$$l_{2} = \sum_{i=1}^{n}\log g(z_{i}).$$

4.2. M-Step for the Conditional Probability

Since

$$f(x \mid z) = \frac{1}{\sqrt{2\pi z}}\, e^{-\frac{(x-\mu-\beta z)^{2}}{2z}}$$

then

$$l_{1}(\mu, \beta) = \sum_{i=1}^{n}\log\left[\frac{1}{\sqrt{2\pi z_{i}}}\, e^{-\frac{(x_{i}-\mu-\beta z_{i})^{2}}{2 z_{i}}}\right] = \sum_{i=1}^{n}\left\{-\frac{1}{2}\log(2\pi) - \frac{1}{2}\log z_{i} - \frac{(x_{i}-\mu-\beta z_{i})^{2}}{2 z_{i}}\right\}$$

$$l_{1}(\mu, \beta) = -\frac{n}{2}\log(2\pi) - \frac{1}{2}\sum_{i=1}^{n}\log z_{i} - \sum_{i=1}^{n}\frac{(x_{i}-\mu-\beta z_{i})^{2}}{2 z_{i}}$$

$$\frac{\partial l_{1}}{\partial\beta} = \sum_{i=1}^{n}(x_{i}-\mu-\beta z_{i})$$

$$\frac{\partial l_{1}}{\partial\beta} = 0 \;\Longrightarrow\; \sum_{i=1}^{n}(x_{i}-\hat{\mu}-\hat{\beta} z_{i}) = 0$$

i.e., $\sum_{i=1}^{n} x_{i} - n\hat{\mu} - \hat{\beta}\sum_{i=1}^{n} z_{i} = 0$

$$\bar{x} - \hat{\mu} - \hat{\beta}\bar{z} = 0$$

$$\hat{\mu} = \bar{x} - \hat{\beta}\bar{z} \qquad (16)$$

$$\frac{\partial l_{1}}{\partial\mu} = \sum_{i=1}^{n}\frac{x_{i}-\mu-\beta z_{i}}{z_{i}}$$

$$\frac{\partial l_{1}}{\partial\mu} = 0 \;\Longrightarrow\; \sum_{i=1}^{n}\frac{x_{i}}{z_{i}} - \hat{\mu}\sum_{i=1}^{n}\frac{1}{z_{i}} - n\hat{\beta} = 0$$

$$\sum_{i=1}^{n}\frac{x_{i}}{z_{i}} - (\bar{x} - \hat{\beta}\bar{z})\sum_{i=1}^{n}\frac{1}{z_{i}} - n\hat{\beta} = 0$$

$$\sum_{i=1}^{n}\frac{x_{i}}{z_{i}} - \bar{x}\sum_{i=1}^{n}\frac{1}{z_{i}} + \hat{\beta}\bar{z}\sum_{i=1}^{n}\frac{1}{z_{i}} - n\hat{\beta} = 0$$

$$\left[n - \bar{z}\sum_{i=1}^{n}\frac{1}{z_{i}}\right]\hat{\beta} = \sum_{i=1}^{n}\frac{x_{i}}{z_{i}} - \bar{x}\sum_{i=1}^{n}\frac{1}{z_{i}}$$

$$\hat{\beta} = \frac{\sum_{i=1}^{n}\frac{x_{i}}{z_{i}} - \bar{x}\sum_{i=1}^{n}\frac{1}{z_{i}}}{n - \bar{z}\sum_{i=1}^{n}\frac{1}{z_{i}}} \qquad (17)$$
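A minimal sketch of the updates (16) and (17), assuming NumPy; the sums over the unobserved $z_{i}$ and $1/z_{i}$ are supplied through pseudo-values $s_{i}$ and $w_{i}$, which in the EM algorithm are the posterior expectations introduced in Section 4.4.

```python
# Sketch of the M-step for mu and beta, formulas (16)-(17).
import numpy as np

def m_step_mu_beta(x, s, w):
    """Return (mu_hat, beta_hat); s stands in for z_i, w for 1/z_i."""
    x, s, w = map(np.asarray, (x, s, w))
    n = len(x)
    xbar, sbar = x.mean(), s.mean()
    beta_hat = (np.sum(x * w) - xbar * np.sum(w)) / (n - sbar * np.sum(w))
    mu_hat = xbar - beta_hat * sbar                 # formula (16)
    return mu_hat, beta_hat
```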

4.3. M-Step for the Mixing Distribution

From formula (9)

$$g(z) = \frac{\gamma\delta^{2}}{\delta^{3}+\gamma}\left(z + \frac{1}{1+\delta\gamma}\, z^{-1}\right) g_{1}(z) = \frac{\gamma\delta^{2}}{\delta^{3}+\gamma}\left(z + \frac{1}{1+\delta\gamma}\, z^{-1}\right)\frac{\delta e^{\delta\gamma}}{\sqrt{2\pi}}\, z^{-\frac{3}{2}}\, e^{-\frac{1}{2}\left(\frac{\delta^{2}}{z}+\gamma^{2} z\right)} = \frac{\gamma\delta^{3} e^{\delta\gamma}}{\sqrt{2\pi}\,(\delta^{3}+\gamma)}\left(z + \frac{1}{1+\delta\gamma}\, z^{-1}\right) z^{-\frac{3}{2}}\, e^{-\frac{1}{2}\left(\frac{\delta^{2}}{z}+\gamma^{2} z\right)} \qquad (18)$$

Therefore

$$\begin{aligned}
l_{2} &= \sum_{i=1}^{n}\log g(z_{i}) \\
&= \sum_{i=1}^{n}\Big\{\log\gamma + 3\log\delta + \delta\gamma - \tfrac{1}{2}\log(2\pi) - \log(\delta^{3}+\gamma) - \log(1+\delta\gamma) - \tfrac{3}{2}\log z_{i} - \frac{\delta^{2}}{2}\frac{1}{z_{i}} - \frac{\gamma^{2}}{2}\, z_{i} + \log\!\left((1+\delta\gamma) z_{i} + \frac{1}{z_{i}}\right)\Big\} \\
&= n\log\gamma + 3n\log\delta + n\delta\gamma - \frac{n}{2}\log(2\pi) - n\log(\delta^{3}+\gamma) - n\log(1+\delta\gamma) - \frac{3}{2}\sum_{i=1}^{n}\log z_{i} - \frac{\delta^{2}}{2}\sum_{i=1}^{n}\frac{1}{z_{i}} - \frac{\gamma^{2}}{2}\sum_{i=1}^{n} z_{i} + \sum_{i=1}^{n}\log\!\left((1+\delta\gamma) z_{i} + \frac{1}{z_{i}}\right)
\end{aligned} \qquad (19)$$

Differentiating w.r.t γ we obtain

$$\begin{aligned}
\frac{\partial l_{2}}{\partial\gamma} &= \frac{n}{\gamma} + n\delta - \frac{n}{\delta^{3}+\gamma} - \frac{n\delta}{1+\delta\gamma} - \gamma\sum_{i=1}^{n} z_{i} + \sum_{i=1}^{n}\frac{\delta z_{i}}{(1+\delta\gamma) z_{i} + \frac{1}{z_{i}}} \\
&= \left(\frac{n}{\gamma} - \frac{n}{\delta^{3}+\gamma}\right) + n\delta\left(1 - \frac{1}{1+\delta\gamma}\right) - \gamma\sum_{i=1}^{n} z_{i} + \sum_{i=1}^{n}\frac{\delta z_{i}}{(1+\delta\gamma) z_{i} + \frac{1}{z_{i}}} \\
&= \frac{n\delta^{3}}{\gamma(\delta^{3}+\gamma)} + \frac{n\gamma\delta^{2}}{1+\delta\gamma} - \gamma\sum_{i=1}^{n} z_{i} + \sum_{i=1}^{n}\frac{\delta z_{i}^{2}}{1 + (1+\delta\gamma) z_{i}^{2}}
\end{aligned}$$

Setting $\frac{\partial l_{2}}{\partial\gamma} = 0$ implies that

$$\frac{n\delta^{3}}{\gamma(\delta^{3}+\gamma)} + \frac{n\gamma\delta^{2}}{1+\delta\gamma} - \gamma\sum_{i=1}^{n} z_{i} + \sum_{i=1}^{n}\frac{\delta z_{i}^{2}}{1 + (1+\delta\gamma) z_{i}^{2}} = 0 \qquad (20)$$

Similarly

$$\begin{aligned}
\frac{\partial l_{2}}{\partial\delta} &= \frac{3n}{\delta} + n\gamma - \frac{3n\delta^{2}}{\delta^{3}+\gamma} - \frac{n\gamma}{1+\delta\gamma} - \delta\sum_{i=1}^{n}\frac{1}{z_{i}} + \sum_{i=1}^{n}\frac{\gamma z_{i}^{2}}{1 + (1+\delta\gamma) z_{i}^{2}} \\
&= \frac{3n\gamma}{\delta(\delta^{3}+\gamma)} + \frac{n\delta\gamma^{2}}{1+\delta\gamma} - \delta\sum_{i=1}^{n}\frac{1}{z_{i}} + \sum_{i=1}^{n}\frac{\gamma z_{i}^{2}}{1 + (1+\delta\gamma) z_{i}^{2}}
\end{aligned}$$

Setting $\frac{\partial l_{2}}{\partial\delta} = 0$ implies that

$$\frac{3n\gamma}{\delta(\delta^{3}+\gamma)} + \frac{n\delta\gamma^{2}}{1+\delta\gamma} - \delta\sum_{i=1}^{n}\frac{1}{z_{i}} + \sum_{i=1}^{n}\frac{\gamma z_{i}^{2}}{1 + (1+\delta\gamma) z_{i}^{2}} = 0 \qquad (21)$$

4.4. E-Step

The values of the random variables $Z_{i}$, $\frac{1}{Z_{i}}$ and $Z_{i}^{2}$ are not known, so we estimate them by the posterior expectations

$E(Z_{i} \mid X_{i})$, $E\!\left(\frac{1}{Z_{i}} \,\Big|\, X_{i}\right)$ and $E(Z_{i}^{2} \mid X_{i})$,

as given in formulae (13), (14) and (15), respectively. Let

$s_{i} = E(Z_{i} \mid X_{i})$, $w_{i} = E\!\left(\frac{1}{Z_{i}} \,\Big|\, X_{i}\right)$ and $v_{i} = E(Z_{i}^{2} \mid X_{i})$.

The $k$-th iteration values are as follows:

$$s_{i}^{(k)} = \frac{\left[\big(1+\delta^{(k)}\gamma^{(k)}\big)\big(\delta^{(k)}\big)^{3}\big(\phi^{(k)}(x_{i})\big)^{3/2} + \big(\alpha^{(k)}\big)^{2}\delta^{(k)}\sqrt{\phi^{(k)}(x_{i})}\right] K_{1}\!\left(\alpha^{(k)}\delta^{(k)}\sqrt{\phi^{(k)}(x_{i})}\right)}{\alpha^{(k)}\big(1+\delta^{(k)}\gamma^{(k)}\big)\big(\delta^{(k)}\big)^{2}\phi^{(k)}(x_{i})\, K_{0}\!\left(\alpha^{(k)}\delta^{(k)}\sqrt{\phi^{(k)}(x_{i})}\right) + \big(\alpha^{(k)}\big)^{3} K_{2}\!\left(\alpha^{(k)}\delta^{(k)}\sqrt{\phi^{(k)}(x_{i})}\right)} \qquad (22)$$

$$w_{i}^{(k)} = \frac{\alpha^{(k)}\big(\delta^{(k)}\big)^{2}\big(1+\delta^{(k)}\gamma^{(k)}\big)\phi^{(k)}(x_{i})\, K_{1}\!\left(\alpha^{(k)}\delta^{(k)}\sqrt{\phi^{(k)}(x_{i})}\right) + \big(\alpha^{(k)}\big)^{3} K_{3}\!\left(\alpha^{(k)}\delta^{(k)}\sqrt{\phi^{(k)}(x_{i})}\right)}{\big(1+\delta^{(k)}\gamma^{(k)}\big)\left(\delta^{(k)}\sqrt{\phi^{(k)}(x_{i})}\right)^{3} K_{0}\!\left(\alpha^{(k)}\delta^{(k)}\sqrt{\phi^{(k)}(x_{i})}\right) + \big(\alpha^{(k)}\big)^{2}\delta^{(k)}\sqrt{\phi^{(k)}(x_{i})}\, K_{2}\!\left(\alpha^{(k)}\delta^{(k)}\sqrt{\phi^{(k)}(x_{i})}\right)} \qquad (23)$$

$$v_{i}^{(k)} = \frac{\big(1+\delta^{(k)}\gamma^{(k)}\big)\big(\delta^{(k)}\big)^{2}\phi^{(k)}(x_{i})\, K_{2}\!\left(\alpha^{(k)}\delta^{(k)}\sqrt{\phi^{(k)}(x_{i})}\right) + \big(\alpha^{(k)}\big)^{2} K_{0}\!\left(\alpha^{(k)}\delta^{(k)}\sqrt{\phi^{(k)}(x_{i})}\right)}{\big(\alpha^{(k)}\big)^{2}\big(1+\delta^{(k)}\gamma^{(k)}\big) K_{0}\!\left(\alpha^{(k)}\delta^{(k)}\sqrt{\phi^{(k)}(x_{i})}\right) + \dfrac{\big(\alpha^{(k)}\big)^{4}}{\big(\delta^{(k)}\big)^{2}\phi^{(k)}(x_{i})}\, K_{2}\!\left(\alpha^{(k)}\delta^{(k)}\sqrt{\phi^{(k)}(x_{i})}\right)} \qquad (24)$$

For the log-likelihood, the k-th iteration is given as

$$\begin{aligned}
l^{(k)} &= n\log\big(\delta^{(k)}\gamma^{(k)}\big) + n\delta^{(k)}\gamma^{(k)} + \beta^{(k)}\sum_{i=1}^{n} x_{i} - n\beta^{(k)}\mu^{(k)} - n\log\!\Big(\big(1+\delta^{(k)}\gamma^{(k)}\big)\pi\big((\delta^{(k)})^{3}+\gamma^{(k)}\big)\Big) - \sum_{i=1}^{n}\log\phi^{(k)}(x_{i}) \\
&\quad + \sum_{i=1}^{n}\log\!\left\{\big(1+\delta^{(k)}\gamma^{(k)}\big)\big(\delta^{(k)}\big)^{2}\phi^{(k)}(x_{i})\, K_{0}\!\left(\alpha^{(k)}\delta^{(k)}\sqrt{\phi^{(k)}(x_{i})}\right) + \big(\alpha^{(k)}\big)^{2} K_{2}\!\left(\alpha^{(k)}\delta^{(k)}\sqrt{\phi^{(k)}(x_{i})}\right)\right\}
\end{aligned} \qquad (25)$$

4.5. Iterative Scheme

From Equations (20) and (21), we obtain the following iterative scheme:

$$\gamma^{(k+1)} = \frac{\dfrac{n\big(\delta^{(k)}\big)^{3}}{\gamma^{(k)}\big((\delta^{(k)})^{3}+\gamma^{(k)}\big)} + \dfrac{n\gamma^{(k)}\big(\delta^{(k)}\big)^{2}}{1+\delta^{(k)}\gamma^{(k)}} + \displaystyle\sum_{i=1}^{n}\frac{\delta^{(k)} z_{i}^{2}}{1+\big(1+\delta^{(k)}\gamma^{(k)}\big) z_{i}^{2}}}{\displaystyle\sum_{i=1}^{n} s_{i}^{(k)}} \qquad (26)$$

$$\delta^{(k+1)} = \frac{\dfrac{3n\gamma^{(k+1)}}{\delta^{(k)}\big((\delta^{(k)})^{3}+\gamma^{(k+1)}\big)} + \dfrac{n\delta^{(k)}\big(\gamma^{(k+1)}\big)^{2}}{1+\delta^{(k)}\gamma^{(k+1)}} + \displaystyle\sum_{i=1}^{n}\frac{\gamma^{(k+1)} z_{i}^{2}}{1+\big(1+\delta^{(k)}\gamma^{(k+1)}\big) z_{i}^{2}}}{\displaystyle\sum_{i=1}^{n} w_{i}^{(k)}} \qquad (27)$$

From Equations (16) and (17) we also obtain

$$\hat{\beta}^{(k+1)} = \frac{\sum_{i=1}^{n} x_{i} w_{i}^{(k)} - \bar{x}\sum_{i=1}^{n} w_{i}^{(k)}}{n - \bar{s}^{(k)}\sum_{i=1}^{n} w_{i}^{(k)}} \qquad (28)$$

$$\hat{\mu}^{(k+1)} = \bar{x} - \hat{\beta}^{(k+1)}\bar{s}^{(k)} \qquad (29)$$

$$\hat{\alpha}^{(k+1)} = \left[\big(\hat{\gamma}^{(k+1)}\big)^{2} + \big(\hat{\beta}^{(k+1)}\big)^{2}\right]^{\frac{1}{2}} \qquad (30)$$
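Putting the pieces together, the following is one possible reading of the iterative scheme, sketched in Python under the assumption that the helper functions from the earlier sketches (posterior_expectations, m_step_mu_beta, nwig_loglik) are in scope. The $z_{i}^{2}$ terms in (26) and (27) are replaced here by their posterior expectations $v_{i}^{(k)}$, which is an interpretation rather than something the scheme states explicitly; initial values would come from the NIG method-of-moments estimates (Table 2).

```python
# Sketch of the EM iterations (22)-(30) with the stopping rule (31).
import numpy as np

def fit_nwig_em(x, mu, beta, delta, gamma, tol=1e-6, max_iter=500):
    x = np.asarray(x, dtype=float)
    n = len(x)
    loglik_old = -np.inf
    for _ in range(max_iter):
        # E-step: pseudo-values for z_i, 1/z_i and z_i^2
        s, w, v = posterior_expectations(x, mu, beta, delta, gamma)
        # M-step for gamma, formula (26); z_i^2 replaced by v_i (assumption)
        m = 1.0 + delta * gamma
        gamma_new = (n * delta ** 3 / (gamma * (delta ** 3 + gamma))
                     + n * gamma * delta ** 2 / m
                     + np.sum(delta * v / (1.0 + m * v))) / np.sum(s)
        # M-step for delta, formula (27)
        m = 1.0 + delta * gamma_new
        delta_new = (3 * n * gamma_new / (delta * (delta ** 3 + gamma_new))
                     + n * delta * gamma_new ** 2 / m
                     + np.sum(gamma_new * v / (1.0 + m * v))) / np.sum(w)
        # M-step for mu and beta, formulas (28)-(29)
        mu, beta = m_step_mu_beta(x, s, w)
        delta, gamma = delta_new, gamma_new
        # stopping rule (31): relative change of the log-likelihood
        loglik = nwig_loglik(x, mu, beta, delta, gamma)
        if abs(loglik - loglik_old) / abs(loglik) < tol:
            break
        loglik_old = loglik
    return mu, beta, delta, gamma, loglik
```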

5. Application

Let $(P_{t})$ denote the price process of a security at time $t$, in particular of a stock. In order to allow comparison of investments in different securities, we investigate the rates of return defined by

$$X_{t} = \log P_{t} - \log P_{t-1}.$$
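For concreteness, the log-returns can be computed as follows; a small sketch assuming NumPy, with hypothetical price values.

```python
# Weekly log-returns X_t = log P_t - log P_{t-1} from a price series.
import numpy as np

prices = np.array([105.2, 104.8, 107.1, 106.3])   # hypothetical weekly closing prices
log_returns = np.diff(np.log(prices))
```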

In this section, we consider three data sets for data analysis: Range Resource Corporation (RRC), shares of Chevron Corporation (CVX) and the S&P 500 index. The histogram of the weekly log-returns for RRC in Figure 1 illustrates that the data are negatively skewed and exhibit heavy tails. The Q-Q plot shows that the normal distribution is not a good fit for the data, especially in the tails. Similar behaviour is observed for the other data sets.

Table 1 provides descriptive statistics for the return series under consideration. We observe that the data sets exhibit excess kurtosis, indicating the leptokurtic behaviour of the returns. The log-returns have distributions with relatively heavier tails than the normal distribution. The skewness indicates that the two tails of the returns behave differently.

Table 2 below gives the method of moments estimates of the NIG distribution for the three data sets. These estimates are used as initial values for the EM algorithm.

The stopping criterion is when

$$\frac{l^{(k)} - l^{(k-1)}}{l^{(k)}} < \text{tol} \qquad (31)$$

where tol is the chosen tolerance level, e.g., $10^{-6}$, and $l^{(k)}$ is as given in Equation (25). We now obtain the maximum likelihood parameter estimates of the proposed model for each data set via the EM algorithm. Tables 3-5 illustrate monotonic convergence at different tolerance levels. The log-likelihood and AIC for each data set are also provided.

Figure 1. Histogram and Q-Q plot for RRC weekly log-returns.

Table 1. Summary statistics for the data sets.

Table 2. NIG method of moment estimates for the data sets.

Table 3. Maximum likelihood estimates of the proposed model for RRC.

Table 4. Maximum likelihood estimates of the proposed model for CVX.

Table 5. Maximum likelihood estimates of the proposed model for the S&P 500 index.

Figures 2-4 show that the proposed model is a good fit to the data sets.

Remark:

Expressing the proposed model in terms of its components, we have

$$f(x) = \frac{\delta^{3}}{\delta^{3}+\gamma}\times \text{NRIG} + \frac{\gamma}{\delta^{3}+\gamma}\times GHD\!\left(-\tfrac{3}{2}, \alpha, \beta, \delta, \mu\right) \qquad (32)$$

Using the estimates we obtain the estimates of p for the data sets as shown in Table 6 below:

For these data sets, the finite mixture is more heavily weighted towards the NRIG component than towards the other special case of the GHD with $\lambda = -\frac{3}{2}$.
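For illustration, the component weights in (32) follow directly from the parameter estimates; the numbers below are placeholders, not the Table 6 values.

```python
# Sketch of the mixing weight p = delta^3 / (delta^3 + gamma) from formula (8).
delta_hat, gamma_hat = 1.0, 1.0                        # hypothetical estimates
p_hat = delta_hat ** 3 / (delta_hat ** 3 + gamma_hat)  # weight on the NRIG component
print(p_hat, 1.0 - p_hat)                              # NRIG and GHD(-3/2) weights
```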

Figure 2. Fitting the proposed model to RRC log weekly returns.

Figure 3. Fitting the proposed model to CVX log weekly returns.

Figure 4. Fitting the proposed model to S&P 500 index weekly log-returns.

Table 6. Estimates of p for the data sets.

6. Conclusions

Two special cases of the Generalized Inverse Gaussian distribution have been shown to be Weighted Inverse Gaussian distributions. Their mixture has been used as the mixing distribution in a Normal Variance-Mean mixture to obtain a Normal Weighted Inverse Gaussian model. The mean and variance of the proposed model have been obtained.

Three data sets, Range Resource Corporation (RRC), shares of Chevron Corporation (CVX) and the S&P 500 index, covering the period 3/01/2000 to 1/07/2013 with 702 observations, have been used for data analysis. An iterative scheme has been presented for parameter estimation via the EM algorithm, and it demonstrates monotonic convergence. The NIG method of moments estimates worked well as initial values for the three data sets. The model fits the data sets well.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Barndorff-Nielsen, O.E. (1977) Exponentially Decreasing Distributions for the Logarithm of Particle Size. Proceedings of the Royal Society A, 353, 409-419.
https://doi.org/10.1098/rspa.1977.0041
[2] Barndorff-Nielsen, O.E. (1997) Normal Inverse Gaussian Distribution and Stochastic Volatility Modelling. Scandinavian Journal of Statistics, 24, 1-13.
https://doi.org/10.1111/1467-9469.00045
[3] Fisher, R.A. (1934) The Effect of Methods of Ascertainment upon the Estimation of Frequencies. Annals of Eugenics, 6, 13-25.
https://doi.org/10.1111/j.1469-1809.1934.tb02105.x
[4] Patil, G.P. and Rao, C.R. (1978) Weighted Distributions and Size-Biased Sampling with Applications to Wildlife Populations and Human Families. Biometrics, 34, 179-189.
https://doi.org/10.2307/2530008
[5] Gupta, R.C. and Kundu, D. (2011) Weighted Inverse Gaussian—A Versatile Lifetime Model. Journal of Applied Statistics, 38, 2695-2708.
https://doi.org/10.1080/02664763.2011.567251
[6] Dempster, A.P., Laird, N.M. and Rubin, D.B. (1977) Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39, 1-38.
https://doi.org/10.1111/j.2517-6161.1977.tb01600.x
[7] Karlis, D. (2002) An EM Type Algorithm for Maximum Likelihood Estimation of the Normal-Inverse Gaussian Distribution. Statistics and Probability Letters, 57, 43-52.
https://doi.org/10.1016/S0167-7152(02)00040-8
[8] Kostas, F. (2007) Tests of Fit for Symmetric Variance Gamma Distributions. UADPhilEcon, National and Kapodistrian University of Athens, Greece.
