One-Sample Bayesian Predictive Analyses for a Nonhomogeneous Poisson Process with Delayed S-Shaped Intensity Function Using Non-Informative Priors

Otieno Collins; Orawo Luke Akong’o; Matiri George Munene

doi:10.4236/ojs.2023.135034

Open Journal of Statistics > Vol.13 No.5, October 2023


Otieno Collins, Orawo Luke Akong’o, Matiri George Munene
Department of Mathematics, Egerton University, Egerton, Kenya.

Abstract

The delayed S-shaped software reliability growth model (SRGM) is one of the non-homogeneous Poisson process (NHPP) models which have been proposed for software reliability assessment. The model is distinctive because it has a mean value function that reflects the delay in failure reporting: there is a delay between failure detection and reporting time. The model captures error detection, isolation, and removal processes, thus is appropriate for software reliability analysis. Predictive analysis in software testing is useful in modifying, debugging, and determining when to terminate software development testing processes. However, Bayesian predictive analyses on the delayed S-shaped model have not been extensively explored. This paper uses the delayed S-shaped SRGM to address four issues in one-sample prediction associated with the software development testing process. Bayesian approach based on non-informative priors was used to derive explicit solutions for the four issues, and the developed methodologies were illustrated using real data.

Keywords

Failure Intensity, Non-Informative Priors, Software Reliability Model, Bayesian Approach, Non-Homogeneous Poisson Process

Share and Cite:

Collins, O. , Akong’o, O. and Munene, M. (2023) One-Sample Bayesian Predictive Analyses for a Nonhomogeneous Poisson Process with Delayed S-Shaped Intensity Function Using Non-Informative Priors. Open Journal of Statistics, 13, 717-733. doi: 10.4236/ojs.2023.135034.

1. Introduction

Software reliability assessment has become indispensable for all software developers as it involves reliability testing and debugging processes. The main objective of reliability testing is to identify potential defects that could lead to system failures, crashes, or malfunctions during real-world usage. Debugging is the process of detecting and correcting errors in software [1] . Software end-users are often concerned with the reliability of software products they acquire from the market. A single software error can cause a failure, which can be avoided by producing reliable software [2] . Defective software may not only damage the reputation of the producer but also attract legal procedures in case of a lawsuit and may expose the developer to preventable costs. As such, software developers are concerned with the reliability of their software products before releasing them into the market. Many researchers have developed methods to test software to detect, isolate, and remove faults during development. The reliability of software products is ensured by running tests that emulate the end-user environment.

Although the approaches to testing software reliability have been proposed, a number of issues still arise with testing. Such may include determining the optimal release time of software, the optimal cost of producing software, and the reliability of the software. Moreover, it is difficult and time-consuming to emulate the end-user environment [2] . These issues are well addressed through software reliability modeling by performing predictive analyses using historical data of software failures. The analyses often involve determining the optimal release time of software and expected costs at the time of release and constructing a prediction interval for a future observable failure.

Over the past decades, several nonhomogeneous Poisson process (NHPP) software reliability growth models (SRGMs) have been developed and used in reliability assessment. Such models are classified as perfect or imperfect debugging NHPP SRGMs [1] . Perfect debugging SRGMs assume that errors are immediately removed with certainty when detected, without introducing new faults. On the other hand, imperfect debugging NHPP SRGMs assume that when faults are detected, they may not be removed with certainty, and new errors may also be introduced into the software during the correction process [1] . The delayed S-shaped software reliability model developed by [3] is one of the perfect NHPP SRGMs. The model is distinctive because of its mean value function, which is S-shaped, reflecting the delay in failure reporting [1] . The mean value function is given by;

$m (t) = α (1 - (1 + β t) e^{- β t})$ (1)

and the model has an intensity function given by;

$λ (t) = α β^{2} t e^{- β t}$ (2)
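As a quick illustration, the following Python sketch evaluates the mean value function and the intensity function and checks numerically that the intensity is the derivative of the mean value function. It assumes the form $m (t) = α (1 - (1 + β t) e^{- β t})$ , which is the form whose derivative is Equation (2). The function names and the parameter values are hypothetical and serve only as an example.

```python
import math

def mean_value(t, alpha, beta):
    """Expected cumulative failures m(t) = alpha * (1 - (1 + beta*t) * exp(-beta*t))."""
    return alpha * (1.0 - (1.0 + beta * t) * math.exp(-beta * t))

def intensity(t, alpha, beta):
    """Failure intensity lambda(t) = alpha * beta^2 * t * exp(-beta*t), Equation (2)."""
    return alpha * beta ** 2 * t * math.exp(-beta * t)

# Hypothetical parameter values, chosen only for illustration.
alpha, beta = 30.0, 0.01

# A central finite difference of m(t) should agree with lambda(t).
t, h = 150.0, 1e-4
slope = (mean_value(t + h, alpha, beta) - mean_value(t - h, alpha, beta)) / (2 * h)
print(slope, intensity(t, alpha, beta))

# The intensity is largest at t = 1/beta, where m(t) has its inflection point.
peak_time = 1.0 / beta
print(peak_time, intensity(peak_time, alpha, beta))
```

The inflection of $m (t)$ at $t = 1 / β$ is what gives the curve its S shape: the intensity rises before that time and decays afterwards.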

The delayed S-shaped NHPP SRGM is based on the following assumptions [4] :

1) Errors are removed immediately when detected, without introducing new errors into the software.

2) The current number of faults in a software and the probability of failure detection are proportional.

3) All the faults in a software are mutually independent from the failure detection point of view.

4) Errors in a software lead to failures at random times.

5) The time between the ${(j - 1)}^{th}$ and $j^{th}$ failures depends on the time to the ${(j - 1)}^{th}$ failure.

6) The initial error content of the software is a random variable.

Many researchers have considered the use of the delayed S-shaped software reliability model in software reliability testing. [5] performed Bayesian predictive analysis on the model using a gamma-distributed informative prior, determining the optimal release time, expected costs, and the estimated reliability of the software at the time of release. [6] performed Bayesian interval estimation and compared Bayesian credible intervals with Wald confidence intervals and found that the Bayesian approach yielded shorter intervals with higher coverage probability. The results imply that the Bayesian method yields better parameter estimates, which can then be used to enhance accuracy in prediction. Although the estimation of the parameters of the model is essential in software reliability assessment, the main goal is to obtain accurate predictions to provide adequate information to help software developers in planning tests and inform them about when to terminate the testing process. [7] performed interval estimation on the delayed S-shaped model and obtained 90% prediction intervals for the reliability prediction at any future time, t. However, none of these studies used the $\frac{1}{α β}$ non-informative prior, and the predictive issues addressed in this paper have not been explored using the delayed S-shaped model.

This paper focuses on single-sample predictive analyses on the delayed S-shaped software reliability model using Bayesian approach. We first outline four issues in software reliability testing. The issues have been addressed by [8] and [9] using the Power law Process (PLP), [2] using the Goel-Okumoto (1979) software reliability model, and [10] using Musa-Okumoto SRGM. The study used Bayesian approach with non-informative priors and the Yamada delayed S-shaped SRGM to develop and derive predictive distributions presented in section 2.2, and applied the developed methodologies to secondary software data to address the four issues, as discussed in section 4.

2. Predictive Issues and Bayesian Method

In this analysis, we assume that a reliability growth testing is performed on a software and the cumulative number of failures, denoted by $N (t)$ , is observed in the time interval $(0, t]$ . Another assumption is that the cumulative number of failures and failure times ( $0 < t_{1} < t_{2} < \dots$ ) follow the NHPP with the intensity function given in Equation (2). When testing stops after a predetermined number of failures, n, the failure data is said to be failure-truncated, and the n failure times are denoted by $Y_{o b s}^{f} = {[t_{i}]}_{i = 1}^{n}$ . However, if testing stops at a predetermined time, t, the failure data is said to be time-truncated, and the corresponding observed failure data is denoted by $Y_{o b s}^{t} = [n, t_{1}, \dots, t_{n}; t]$ .

Let $\underline{t}$ be the vector of observed failure times and $π (θ)$ be the prior density of $θ = (α, β)$ . Then according to Bayes’ rule, the posterior density $π (θ | \underline{t})$ is obtained using the formula:

$π (θ | \underline{t}) = \frac{g (θ, \underline{t})}{f (\underline{t})} = \frac{f (\underline{t} | θ) π (θ)}{f (\underline{t})}$ (3)

where $g (θ, \underline{t})$ is the joint density of $θ$ and $\underline{T}$ , and $f (\underline{t})$ is the marginal density of $\underline{T}$ . The posterior predictive distribution of t⁺ is given as;

$\begin{matrix} f (t^{+} | \underline{t}) = \int f (t^{+}, θ | \underline{t}) d θ \\ = \int f (t^{+} | θ, \underline{t}) π (θ | \underline{t}) d θ \\ = \int f (t^{+} | θ) π (θ | \underline{t}) d θ \end{matrix}$ (4)

2.1. Issues in Single-Sample Software Reliability Prediction

The study addressed the following four issues in single-sample software reliability testing:

1) Suppose that the predetermined target value, $λ_{t v}$ , for the software failure rate is not achieved at time T, what is the probability that the target value will be achieved at time $τ$ , $τ > T$ ?

2) Suppose that the target value, $λ_{t v}$ , for the software failure rate is not achieved at time T, how long will it take so that the software failure rate will be attained at $λ_{t v}$ ?

3) What is the probability that at most k software failures will occur in the failure time period $(T, τ]$ , $τ > T$ ?

4) What is the upper prediction limit (UPL) of $λ (t) = α β^{2} t e^{- β t}$ with level $γ$ , $τ$ being a predetermined value greater than T?

2.2. Prior, Posterior, and Predictive Distributions

Let $Y_{o b s}$ represent $Y_{o b s}^{t}$ or $Y_{o b s}^{f}$ . The joint density of $Y_{o b s}$ is obtained as:

$f (Y_{o b s} | α, β) = α^{n} β^{2 n} (\prod_{i = 1}^{n} t_{i}) e^{- β \sum_{i = 1}^{n} t_{i}} e^{- α [1 - (1 + β T) e^{- β T}]}$ (5)

and the log-likelihood function is given by:

$\begin{matrix} l (α, β | \underline{t}) = n \log α + 2 n \log β + \log (\prod_{i = 1}^{n} t_{i}) \\ - β \sum t_{i} - α (1 - (1 + β T) \exp (- β T)) \end{matrix}$ (6)
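The log-likelihood in Equation (6) is straightforward to code. The sketch below is a minimal Python version; it also uses the fact that, for fixed β, setting the derivative of Equation (6) with respect to α to zero gives $α = n / [1 - (1 + β T) e^{- β T}]$ . The failure times, the truncation time, and the value of β are hypothetical and are not the data of Table 1.

```python
import math

def log_likelihood(alpha, beta, times, T):
    """Equation (6): log-likelihood for failure times observed on (0, T]."""
    n = len(times)
    c_T = 1.0 - (1.0 + beta * T) * math.exp(-beta * T)
    return (n * math.log(alpha) + 2 * n * math.log(beta)
            + sum(math.log(t) for t in times)
            - beta * sum(times) - alpha * c_T)

def alpha_hat(beta, n, T):
    """Maximizer of Equation (6) in alpha for a fixed beta."""
    return n / (1.0 - (1.0 + beta * T) * math.exp(-beta * T))

# Hypothetical failure times (hours) and truncation time, for illustration only.
times = [12.0, 25.0, 41.0, 58.0, 77.0, 96.0]
T = 100.0
beta = 0.02  # hypothetical fixed value of beta

a = alpha_hat(beta, len(times), T)
print(a, log_likelihood(a, beta, times, T))
```

A full maximum likelihood fit would also maximize over β, for example by a one-dimensional search over β with α replaced by `alpha_hat(beta, n, T)`.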

Case 1: When the shape parameter, β, is known, we adopt the following non-informative prior distribution for α;

$π (α) \propto \frac{1}{α}, α > 0$ (7)

The posterior distribution of α can be obtained from Equation (3) as:

$π (α | Y_{o b s}) = \frac{f (Y_{o b s} | α, β) π (α)}{\int_{0}^{\infty} f (Y_{o b s} | α, β) π (α) d α}$

$π (α | Y_{o b s}) = {[Γ (n)]}^{- 1} α^{n - 1} e^{- α [1 - (1 + β T) e^{- β T}]} {[1 - (1 + β T) e^{- β T}]}^{n}$ (8)
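The step leading to Equation (8) can be spelled out as follows. The shorthand $c_{T} = 1 - (1 + β T) e^{- β T}$ is introduced here only for brevity and is not part of the original notation. All factors of Equation (5) that do not involve α cancel in the normalization, leaving a gamma kernel:

```latex
% Shorthand (an assumption of this sketch): c_T = 1 - (1 + \beta T) e^{-\beta T}
\pi(\alpha \mid Y_{obs})
  \propto f(Y_{obs} \mid \alpha, \beta)\, \pi(\alpha)
  \propto \alpha^{n} e^{-\alpha c_T} \cdot \frac{1}{\alpha}
  = \alpha^{n-1} e^{-\alpha c_T},
\qquad
\int_{0}^{\infty} \alpha^{n-1} e^{-\alpha c_T}\, d\alpha = \frac{\Gamma(n)}{c_T^{\,n}}
\;\Longrightarrow\;
\pi(\alpha \mid Y_{obs}) = \frac{c_T^{\,n}}{\Gamma(n)}\, \alpha^{n-1} e^{-\alpha c_T}.
```

That is, given known β, the posterior of α is a gamma distribution with shape n and rate $c_{T}$ .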

Let t⁺ be the random variable being predicted. The posterior predictive density of t⁺ is obtained using Equation (4) as:

$f (t^{+} | Y_{o b s}) = \int_{0}^{\infty} f (t^{+} | Y_{o b s}, α) π (α | Y_{o b s}) d α$ (9)

The Bayesian UPL for t⁺ with level $γ$ satisfies:

$γ = \int_{- \infty}^{y_{U}^{(β)}} p (t^{+} | Y_{o b s}) d t^{+}$ (10)

Case 2: When the shape parameter, β, is unknown, we consider the following non-informative density of α and β, assuming they are mutually independent.

$π (α, β) \propto \frac{1}{α β}, α, β > 0$ (11)

The corresponding posterior joint density is obtained using Equation (3) as:

$π (α, β | Y_{o b s}) = {[k Γ (n)]}^{- 1} α^{n - 1} β^{2 n - 1} e^{- β \sum_{i = 1}^{n} t_{i}} e^{- α [1 - (1 + β T) e^{- β T}]}$ (12)

where;

$k = \int_{0}^{\infty} \frac{β^{2 n - 1} e^{- β \sum_{i = 1}^{n} t_{i}}}{{(1 - (1 + β T) e^{- β T})}^{n}} d β$ (13)

Similar to Equation (9) and Equation (10), if t⁺ is the random variable being predicted, the posterior predictive distribution becomes:

$f (t^{+} | Y_{o b s}) = \int_{0}^{\infty} \int_{0}^{\infty} f (t^{+} | Y_{o b s}, α, β) π (α, β | Y_{o b s}) d α d β$ (14)

and the Bayesian UPL is:

$γ = \int_{- \infty}^{y_{U}} p (t^{+} | Y_{o b s}) d t^{+}$ (15)

3. Main Results for Prediction

In this section, we present the main results of the four issues stated in section 2.1 as propositions, and their proofs are given in the Appendix.

Proposition I:

The probability that the target value $λ_{t v}$ will be achieved at time $τ$ ( $τ > T$ ) is:

$γ = \begin{cases} 1 - \sum_{h = 0}^{n - 1} \frac{{(\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}} λ_{t v})}^{h}}{h!} e^{- \frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}} λ_{t v}}, & \text{if } β \text{ is known} \\ 1 - \frac{1}{k} \sum_{h = 0}^{n - 1} \int_{0}^{\infty} \frac{{(\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}} λ_{t v})}^{h}}{h!} e^{- \frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}} λ_{t v}} \frac{β^{2 n - 1} e^{- β \sum_{i = 1}^{n} t_{i}}}{{[1 - (1 + β T) e^{- β T}]}^{n}} d β, & \text{if } β \text{ is unknown} \end{cases}$
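For the case of known β, the first formula of Proposition I is a finite Poisson sum and can be evaluated directly. The Python sketch below is one possible implementation; the function names are hypothetical, and the inputs combine values quoted in Section 4 (n = 22, T = 100, β = 0.007609807, target 0.02) with arbitrarily chosen times τ. It is meant to show the mechanics of the formula rather than to reproduce the reported results.

```python
import math

def detect_fraction(t, beta):
    """The term 1 - (1 + beta*t) * exp(-beta*t) that appears in Equation (8)."""
    return 1.0 - (1.0 + beta * t) * math.exp(-beta * t)

def prob_target_met(lam_tv, tau, T, n, beta):
    """First formula of Proposition I (beta known): Pr(lambda(tau) <= lam_tv | data)."""
    rate = detect_fraction(T, beta) / (beta ** 2 * tau * math.exp(-beta * tau))
    x = rate * lam_tv
    # Sum of Poisson(x) probabilities for h = 0, ..., n-1, built term by term.
    term, total = math.exp(-x), 0.0
    for h in range(n):
        total += term
        term *= x / (h + 1)
    return 1.0 - total

n, T, beta, lam_tv = 22, 100.0, 0.007609807, 0.02
for tau in (200.0, 500.0, 800.0):  # arbitrary illustrative times beyond 1/beta
    print(tau, prob_target_met(lam_tv, tau, T, n, beta))
```

For τ beyond 1/β the intensity factor $β^{2} τ e^{- β τ}$ decreases in τ, so the probability returned by this function increases with τ.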

Proposition II:

Let $τ^{*}$ denote the time required to attain $λ_{t v}$ . For a specified level $γ$ ;

$τ^{*} = \begin{cases} [- \frac{1}{β} W_{n} (\frac{- 2 λ_{t v} [1 - (1 + β T) e^{- β T}]}{β χ^{2} (2 n; γ)})] - T, & \text{if } β \text{ is known} \\ τ - T, & \text{if } β \text{ is unknown} \end{cases}$

Remark 1: When β is unknown, τ is the solution to:

$γ = 1 - \frac{1}{k} \sum_{h = 0}^{n - 1} \int_{0}^{\infty} \frac{{(\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}} λ_{t v})}^{h}}{h!} e^{- \frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}} λ_{t v}} \frac{β^{2 n - 1} e^{- β \sum_{i = 1}^{n} t_{i}}}{{[1 - (1 + β T) e^{- β T}]}^{n}} d β$ (16)
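For known β, the time in Proposition II can also be found without a Lambert W routine. The sketch below swaps in a plain bisection on the probability of Proposition I (first formula), searching over τ beyond 1/β, where the intensity is decreasing; this is the branch on which the Lambert W argument yields τ ≥ 1/β. The function names and all input values are hypothetical.

```python
import math

def prob_target_met(lam_tv, tau, T, n, beta):
    """Proposition I, beta known: Pr(lambda(tau) <= lam_tv | data)."""
    c_T = 1.0 - (1.0 + beta * T) * math.exp(-beta * T)
    # x = c_T * lam_tv / (beta^2 * tau * exp(-beta*tau)), built in logs so that
    # very large tau does not cause a division by an underflowed zero.
    log_x = math.log(c_T * lam_tv) - math.log(beta ** 2 * tau) + beta * tau
    x = math.exp(min(log_x, 700.0))
    term, total = math.exp(-x), 0.0
    for h in range(n):
        total += term
        term *= x / (h + 1)
    return 1.0 - total

def time_to_target(lam_tv, gamma, T, n, beta, tol=1e-9):
    """Additional time tau* = tau - T after which the target is met with probability gamma."""
    lo = max(T, 1.0 / beta)  # search only where the intensity is decreasing
    if prob_target_met(lam_tv, lo, T, n, beta) >= gamma:
        return 0.0           # probability is lowest at the peak, so the target is already met
    hi = 2.0 * lo
    while prob_target_met(lam_tv, hi, T, n, beta) < gamma:
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if prob_target_met(lam_tv, mid, T, n, beta) < gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) - T

# Hypothetical inputs, for illustration only.
n, T, beta, lam_tv, gamma = 10, 100.0, 0.01, 0.02, 0.90
tau_star = time_to_target(lam_tv, gamma, T, n, beta)
print(tau_star, T + tau_star)
```

The same bisection idea applies when β is unknown, with the probability function replaced by the second formula of Proposition I.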

Proposition III:

The probability that at most k failures will occur in the time interval $(T, τ]$ , $τ > T$ is given by:

$γ_{k} = \begin{cases} \frac{{[1 - (1 + β T) e^{- β T}]}^{n}}{{[(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]}^{n}} \sum_{j = n}^{n + k} (\begin{matrix} j - 1 \\ n - 1 \end{matrix}) \frac{{[(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]}^{j}}{{[1 - (1 + β τ) e^{- β τ}]}^{j}}, & \text{if } β \text{ is known} \\ \sum_{j = n}^{n + k} \frac{Γ (j)}{c (j - n)! Γ (n)} \int_{0}^{\infty} β^{2 n - 1} e^{- β \sum_{i = 1}^{n} t_{i}} \frac{{[(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]}^{j - n}}{{[1 - (1 + β τ) e^{- β τ}]}^{j}} d β, & \text{if } β \text{ is unknown} \end{cases}$

Here c denotes the normalizing constant of Equation (13), where it is written k; a different symbol is used because k in this proposition counts failures.
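For known β, the sum in Proposition III can be evaluated term by term. The Python sketch below uses the equivalent form of Equation (A19), with the binomial coefficient $(\begin{matrix} j - 1 \\ n - 1 \end{matrix})$ as in Equation (A20), and writes each term through the ratio $p = [1 - (1 + β T) e^{- β T}] / [1 - (1 + β τ) e^{- β τ}]$ . The function names and the shorthand p are assumptions of this sketch; the inputs are the values quoted in Section 4.

```python
import math

def detect_fraction(t, beta):
    """The term 1 - (1 + beta*t) * exp(-beta*t)."""
    return 1.0 - (1.0 + beta * t) * math.exp(-beta * t)

def prob_at_most_k(k, n, T, tau, beta):
    """Pr(at most k failures in (T, tau] | data) for known beta, as in Equation (A19)."""
    p = detect_fraction(T, beta) / detect_fraction(tau, beta)
    return sum(math.comb(j - 1, n - 1) * p ** n * (1.0 - p) ** (j - n)
               for j in range(n, n + k + 1))

# Inputs quoted in Section 4 for the known-beta case.
n, T, tau, beta = 22, 100.0, 130.0, 0.007609807
for k in (0, 1, 5, 10, 25):
    print(k, prob_at_most_k(k, n, T, tau, beta))
```

With these inputs the k = 0 and k = 1 terms come out at roughly $2.13 \times 10^{- 4}$ and $1.71 \times 10^{- 3}$ , in line with the values of $γ_{0}$ and $γ_{1}$ reported for known β in Section 4. Written this way, the number of further failures in $(T, τ]$ follows a negative binomial distribution.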

Proposition IV:

The Bayesian UPL of $λ (t) = α β^{2} t e^{- β t}$ with level $γ$ is obtained as:

$λ_{U}^{(β)} = \begin{cases} \frac{(β^{2} t e^{- β t}) χ^{2} (2 n; γ)}{2 [1 - (1 + β T) e^{- β T}]}, & \text{if } β \text{ is known} \\ λ_{t v}, & \text{if } β \text{ is unknown} \end{cases}$

Remark 2: When β is unknown, $λ_{t v}$ is the solution to:

$γ = 1 - \frac{1}{k} \sum_{h = 0}^{n - 1} \int_{0}^{\infty} \frac{{(\frac{1 - (1 + β T) e^{- β T}}{β^{2} t e^{- β t}} λ_{t v})}^{h}}{h!} e^{- \frac{1 - (1 + β T) e^{- β T}}{β^{2} t e^{- β t}} λ_{t v}} \frac{β^{2 n - 1} e^{- β \sum_{i = 1}^{n} t_{i}}}{{[1 - (1 + β T) e^{- β T}]}^{n}} d β$ (17)
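For known β, the first formula of Proposition IV only needs a chi-square quantile. The sketch below avoids external libraries by using the fact that $χ^{2} (2 n; γ) / 2$ is the γ quantile of a gamma distribution with integer shape n and unit rate, found here by bisection on the Poisson-sum form of the distribution function. It assumes $χ^{2} (2 n; γ)$ denotes the lower γ quantile; the function names and the inputs are hypothetical.

```python
import math

def gamma_cdf_int(x, n):
    """CDF at x of a Gamma(shape=n, rate=1) variable, integer n, via a Poisson sum."""
    term, total = math.exp(-x), 0.0
    for h in range(n):
        total += term
        term *= x / (h + 1)
    return 1.0 - total

def gamma_quantile_int(gamma, n):
    """Value q with gamma_cdf_int(q, n) = gamma, found by bisection."""
    lo, hi = 0.0, float(n)
    while gamma_cdf_int(hi, n) < gamma:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_cdf_int(mid, n) < gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def upl_intensity(t, gamma, T, n, beta):
    """First formula of Proposition IV, using chi2(2n; gamma) / 2 = Gamma(n, 1) quantile."""
    c_T = 1.0 - (1.0 + beta * T) * math.exp(-beta * T)
    return beta ** 2 * t * math.exp(-beta * t) * gamma_quantile_int(gamma, n) / c_T

# Hypothetical inputs, for illustration only.
n, T, beta, t, gamma = 10, 100.0, 0.01, 300.0, 0.90
upl = upl_intensity(t, gamma, T, n, beta)
print(upl)
```

Substituting the resulting limit for $λ_{t v}$ in the first formula of Proposition I returns the level γ, which gives a simple consistency check.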

4. Real Data Application

In this section, the real data in Table 1 were used for single-sample Bayesian prediction, and the results are presented and discussed. The study used failure times (cumulative times between failures) $0 < t_{1} < t_{2} < \dots < t_{22}$ , where n = 22. The data are given by [11] , and [12] noted that they have been widely used in assessing software reliability models. For the case where β is assumed known, the study performed maximum likelihood estimation (MLE), with the initial guesses of the parameters fixed at $α = 18$ and $β = 0.00342$ ; these values were chosen to be close to the ML estimates that [11] obtained using their S-shaped model. Alternatively, other parameter estimation methods can be used to obtain the initial values. The initial parameter values were used with the log-likelihood function of the delayed S-shaped model given by Equation (6) to obtain ML estimates of α and β. The MLE for β was obtained as $β = 0.007609807$ . The study used this value for the cases where β is assumed known.

Table 1. Time between failures data.

Proposition I: Suppose the target value is $λ_{t v} = 0.02$ . At time $T = 100$ , the MLE of the achieved failure rate for this software is $λ = α β e^{- 100 β} = 0.0640$ , which is greater than 0.02. The target value is therefore not achieved at the initial time, $T = 100$ . We take a later time, $τ = 500$ , and determine the probability of achieving $λ_{t v}$ at this time: 1) when β is known ( $β = 0.007609807$ ), the first formula in Proposition I gives $γ = 9.0 \times 10^{- 9}$ , so it is very unlikely that the target value will be achieved. 2) When β is unknown, the second formula in Proposition I gives $γ = 3.0 \times 10^{- 7}$ . Thus, it is again very unlikely that the target software failure intensity of 0.02 will be achieved at time $τ = 500 h$ .

Since the target value was not achieved at $τ = 500$ , we assess the relationship between the probability and time by varying τ while holding $λ_{t v} = 0.02$ constant. This is illustrated for the case when β is unknown. Table 2 shows the probabilities at different times, indicating that it is very unlikely that the target value is achieved at any time between 130 h and 630 h. However, it is almost certain that the target software failure intensity will be achieved at 830 h and above. Note that these values do not indicate the exact time at which the target failure intensity is achieved, but the time by which it will have been achieved with the stated probability. Figure 1 displays the relationship, indicating that the probability of achieving a software failure intensity of 0.02 increases with time. The probability increases rapidly between 630 h and 780 h, suggesting that the target software failure intensity becomes achievable within or beyond this interval.

Since the software failure intensity of 0.02 ( $λ_{t v} = 0.02$ ) was not achieved at $τ = 500 h$ , we next determine the failure intensity that is most likely to be achieved at this time. For the case when β is unknown, the probability was computed over a range of target failure intensities while holding the time, τ, at 500 h; the results are displayed in Figure 2. It can be observed that the software is unlikely to have a failure intensity of less than 0.05 at 500 h, and that the most probable failure intensity is between 0.06 and 0.13. Since the delayed S-shaped model assumes that errors are removed immediately after they are detected, and no further errors are introduced into the software [4] , the failure intensity corresponding to a probability that tends to one (1) was selected as the highest, while the failure intensities at which the probability is exactly one were ignored.

Table 2. The probability that the software failure intensity of 0.02 is achieved at different time periods, τ.

Figure 1. The probability of achieving $λ_{t v} = 0.02$ at different time periods.

Figure 2. The probability of achieving $λ_{t v} = λ$ at $τ = 500 h$ .

Proposition II: The next step was to determine how long it will take to achieve the target value $λ_{t v}$ , since it was not attained at time $T = 100$ . 1) When β is known ( $β = 0.0076098073$ ) and γ is 0.90, we obtain $τ^{*} = 583.365 h$ using the first formula in Proposition II. The result implies that it will take another 583.365 h to achieve the desired failure rate of 0.02 (i.e., the target failure intensity will be achieved at $τ = 683.365 h$ ). 2) When β is unknown, we obtain $τ^{*} = 657 h$ using the second formula in Proposition II and Remark 1, with τ the value that satisfies Equation (16). Thus, it will require 657 additional hours to achieve the target failure rate of 0.02, which implies that the target software failure intensity will be achieved at $τ = 757 h$ .

Proposition III: Since the study has established that the probable software failure intensity over the future time interval $(100, 500]$ is high, we want to obtain the probability that at most k failures will occur in a future time interval $(T, τ^{*}]$ , where $T < τ^{*} < τ$ . Taking $(T, τ^{*}] = (100, 130]$ , the following results were obtained for $k = 0, 1, \dots, 25$ :

1). When β is known ( $β = 0.007609807$ ), from the first formula in Proposition III, we obtain $γ_{0} = 0.000213$ , $γ_{1} = 0.00171$ , $γ_{2} = 0.00720$ , $γ_{3} = 0.02122$ , $γ_{4} = 0.04916$ , $γ_{5} = 0.09552$ , $γ_{6} = 0.1621$ , $γ_{7} = 0.2470$ , $γ_{8} = 0.3452$ , $γ_{9} = 0.4497$ , $γ_{10} = 0.5529$ , $γ_{11} = 0.6488$ , $γ_{12} = 0.7329$ , $γ_{13} = 0.8031$ , $γ_{14} = 0.8590$ , $γ_{15} = 0.9019$ , $γ_{16} = 0.9335$ , $γ_{17} = 0.9560$ , $γ_{18} = 0.9716$ , $γ_{19} = 0.9821$ , $γ_{20} = 0.9889$ , $γ_{21} = 0.9933$ , $γ_{22} = 0.9960$ , $γ_{23} = 0.9977$ , $γ_{24} = 0.9987$ , and $γ_{25} = 0.9992$ .

2). When β is unknown, using the second formula in Proposition III, we get $γ_{0} = 0.000024$ , $γ_{1} = 0.000230$ , $γ_{2} = 0.00113$ , $γ_{3} = 0.00391$ , $γ_{4} = 0.01054$ , $γ_{5} = 0.02375$ , $γ_{6} = 0.04651$ , $γ_{7} = 0.08137$ , $γ_{8} = 0.1298$ , $γ_{9} = 0.1916$ , $γ_{10} = 0.2649$ , $γ_{11} = 0.3466$ , $γ_{12} = 0.4327$ , $γ_{13} = 0.5189$ , $γ_{14} = 0.6014$ , $γ_{15} = 0.6772$ , $γ_{16} = 0.7444$ , $γ_{17} = 0.8018$ , $γ_{18} = 0.8495$ , $γ_{19} = 0.8880$ , $γ_{20} = 0.9182$ , $γ_{21} = 0.9413$ , $γ_{22} = 0.9586$ , $γ_{23} = 0.9713$ , $γ_{24} = 0.9804$ , and $γ_{25} = 0.9868$ .

Figure 3. The graph of the probabilities $γ_{k}$ that at most k failures (k = 25) will occur in the time interval $(100, 130]$ for the cases of known and unknown β.

Proposition IV: The Bayesian Upper Prediction Limit (UPL) of $λ (t) = α β^{2} t e^{- β t}$ given $τ = 700 h$ and the level, $γ = 0.90$ : 1) when β is known ( $β = 0.0076098073$ ), the Bayesian UPL was obtained using the first formula in (Proposition IV) as $λ_{U}^{(β)} (τ) = 0.01805$ ; 2) when β is unknown, using the second part of the formula in Proposition IV and Remark 2, the Bayesian UPL of $λ (t) = α β^{2} t e^{- β t}$ with level $γ = 0.90$ was obtained as $λ_{U}^{(β)} (τ) = 0.02883$ (Figure 3).

5. Conclusion

Software reliability remains a priority for software developers. Defective software may damage the reputation of the developers, expose them to preventable costs, and cause significant damage to end-users. To address these issues, software reliability modeling comes in handy. Predictive analyses have been performed in software testing to provide information for making decisions regarding the optimal software production costs, release time, and whether and when the desired reliability would be achieved. This study used non-informative priors to derive explicit solutions for four predictive issues in software testing using the Yamada delayed S-shaped model, and applied the derived methodologies with secondary software failure data. Bayesian approach was used because it incorporates prior information about the parameters of the model and is appropriate even when historical data is insufficient. The obtained solutions can be adequately applied in software quality assessment.

Appendices: Proof of Propositions I-IV

We begin by stating the following identity without proof:

$\int_{D (m; a, b)} d F (t_{1}) \dots d F (t_{m}) = {[F (b) - F (a)]}^{m} / m!$ (A1)

where m is any positive integer, a and b are two real numbers such that a < b, $F (t)$ is an increasing and differentiable function, and $D (m; a, b) = [(t_{1}, \dots, t_{m}) : a < t_{1} < \dots < t_{m} < b]$ .

Proof of Proposition I:

Let $π (λ_{τ} | Y_{o b s})$ denote the posterior density of $λ_{τ} = λ (τ) = α β^{2} τ e^{- β τ}$ . Thus, the probability that $λ_{t v}$ will be achieved at time $τ$ ( $τ > T$ ) is given by:

$γ = \Pr (λ_{τ} \leq λ_{t v} | Y_{o b s}) = \int_{0}^{λ_{t v}} π (λ_{τ} | Y_{o b s}) d λ_{τ}$ (A2)

When β is known, we have $α = \frac{λ_{τ}}{β^{2} τ e^{- β τ}}$ , obtained from the intensity function evaluated at $τ$ , and $\frac{d α}{d λ_{τ}} = \frac{1}{β^{2} τ e^{- β τ}}$ .

The posterior distribution of $λ_{τ}$ is $π (λ_{τ} | Y_{o b s}) = π (α | Y_{o b s}) | \frac{d α}{d λ_{τ}} |$ . Thus, the posterior density becomes:

$π (λ_{τ} | Y_{o b s}) = \frac{1}{Γ (n)} {(\frac{λ_{τ}}{β^{2} τ e^{- β τ}})}^{n - 1} e^{- (\frac{λ_{τ}}{β^{2} τ e^{- β τ}}) [1 - (1 + β T) e^{- β T}]} {[1 - (1 + β T) e^{- β T}]}^{n} \frac{1}{β^{2} τ e^{- β τ}}$

which reduces to:

$π (λ_{τ} | Y_{o b s}) = \frac{{[\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}}]}^{n}}{Γ (n)} λ_{τ}^{n - 1} e^{- λ_{τ} [\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}}]}$ (A3)

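From Equation (A3), $λ_{τ}$ follows a gamma distribution with shape n and rate $\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}}$ . For a gamma distribution with integer shape, the distribution function can be written as a Poisson sum (in the identity below, α and β denote a generic integer shape and rate, not the model parameters):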

$\frac{β^{α}}{Γ (α)} \int_{0}^{λ} x^{α - 1} e^{- β x} d x = 1 - \sum_{h = 0}^{α - 1} \frac{{(β λ)}^{h}}{h!} e^{- β λ}$ (A4)
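The identity in Equation (A4) can be checked numerically. The Python sketch below compares a Simpson-rule integration of the gamma density with the Poisson sum for one hypothetical choice of integer shape, rate, and upper limit; the function names are illustrative only.

```python
import math

def gamma_cdf_numeric(lam, shape, rate, steps=20000):
    """Left side of (A4): composite Simpson integration of the gamma density on [0, lam]."""
    def density(x):
        if x == 0.0:
            return 0.0 if shape > 1 else rate
        return math.exp(shape * math.log(rate) - math.lgamma(shape)
                        + (shape - 1) * math.log(x) - rate * x)
    h = lam / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(lam)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3.0

def gamma_cdf_poisson(lam, shape, rate):
    """Right side of (A4); valid for an integer shape."""
    x = rate * lam
    return 1.0 - sum(x ** h / math.factorial(h) for h in range(shape)) * math.exp(-x)

# Hypothetical values: shape 5, rate 2, upper limit 3.
lhs = gamma_cdf_numeric(3.0, 5, 2.0)
rhs = gamma_cdf_poisson(3.0, 5, 2.0)
print(lhs, rhs)
```

The Poisson-sum side is what makes the known-β formulas in Propositions I and IV cheap to evaluate, since no numerical integration is needed.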

From Equation (A2), ( $γ = \int_{0}^{λ_{t v}} π (λ_{τ} | Y_{o b s}) d λ_{τ}$ ), and Equation (A4), it follows that:

$γ = 1 - \sum_{h = 0}^{n - 1} \frac{{(\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}} λ_{t v})}^{h}}{h!} e^{- \frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}} λ_{t v}}$ (A5)

Equation (A5) is the first formula in Proposition I.

When β is unknown: from the intensity function evaluated at $τ$ , $α = \frac{λ_{τ}}{β^{2} τ e^{- β τ}}$ and let $β = β$ . The Jacobian is $\frac{d (α, β)}{d (λ_{τ}, β)} = \frac{1}{β^{2} τ e^{- β τ}}$ . The joint posterior density of $(λ_{τ}, β)$ is given by:

$π (λ_{τ}, β | Y_{o b s}) = π (α, β | Y_{o b s}) | \frac{d (α, β)}{d (λ_{τ}, β)} |$

From Equation (12):

$\begin{array}{l} π (λ_{τ}, β | Y_{o b s}) = {[k Γ (n)]}^{- 1} α^{n - 1} β^{2 n - 1} e^{- β \sum_{i = 1}^{n} t_{i}} e^{- α [1 - (1 + β T) e^{- β T}]} \frac{1}{β^{2} τ e^{- β τ}} \\ = {[k Γ (n)]}^{- 1} {(\frac{λ_{τ}}{β^{2} τ e^{- β τ}})}^{n - 1} β^{2 n - 1} e^{- β \sum_{i = 1}^{n} t_{i}} e^{- (\frac{λ_{τ}}{β^{2} τ e^{- β τ}}) [1 - (1 + β T) e^{- β T}]} \frac{1}{β^{2} τ e^{- β τ}} \\ = {(\frac{1}{β^{2} τ e^{- β τ}})}^{n - 1} \frac{β^{2 n - 1}}{k Γ (n)} e^{- β \sum_{i = 1}^{n} t_{i}} λ_{τ}^{n - 1} e^{- λ_{τ} [\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}}]} \frac{1}{β^{2} τ e^{- β τ}} \\ = {(\frac{1}{β^{2} τ e^{- β τ}})}^{n} \frac{β^{2 n - 1}}{k Γ (n)} e^{- β \sum_{i = 1}^{n} t_{i}} λ_{τ}^{n - 1} e^{- λ_{τ} [\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}}]} \end{array}$

$\begin{array}{l} = \frac{{(β^{2} τ e^{- β τ})}^{- n}}{k Γ (n)} β^{2 n - 1} e^{- β \sum_{i = 1}^{n} t_{i}} {\frac{Γ (n)}{{[\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}}]}^{n}}} \frac{{[\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}}]}^{n}}{Γ (n)} λ_{τ}^{n - 1} e^{- λ_{τ} [\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}}]} \\ = \frac{β^{2 n - 1} e^{- β \sum_{i = 1}^{n} t_{i}}}{k {[1 - (1 + β T) e^{- β T}]}^{n}} \frac{{[\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}}]}^{n}}{Γ (n)} λ_{τ}^{n - 1} e^{- λ_{τ} [\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}}]} \end{array}$ (A6)

Using Equations (A2), (A4), and (A6), we have;

$\begin{matrix} γ = \frac{1}{k} \int_{0}^{\infty} \frac{β^{2 n - 1} e^{- β \sum_{i = 1}^{n} t_{i}}}{{[1 - (1 + β T) e^{- β T}]}^{n}} \left\{ 1 - \sum_{h = 0}^{n - 1} \frac{{(\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}} λ_{t v})}^{h}}{h!} e^{- \frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}} λ_{t v}} \right\} d β \\ = \frac{1}{k} \int_{0}^{\infty} \frac{β^{2 n - 1} e^{- β \sum_{i = 1}^{n} t_{i}}}{{[1 - (1 + β T) e^{- β T}]}^{n}} d β \\ - \frac{1}{k} \sum_{h = 0}^{n - 1} \int_{0}^{\infty} \frac{{(\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}} λ_{t v})}^{h}}{h!} e^{- \frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}} λ_{t v}} \frac{β^{2 n - 1} e^{- β \sum_{i = 1}^{n} t_{i}}}{{[1 - (1 + β T) e^{- β T}]}^{n}} d β \end{matrix}$

But $k = \int_{0}^{\infty} \frac{β^{2 n - 1} e^{- β \sum_{i = 1}^{n} t_{i}}}{{(1 - (1 + β T) e^{- β T})}^{n}} d β$ , so the first term equals 1 and:

$γ = 1 - \frac{1}{k} \sum_{h = 0}^{n - 1} \int_{0}^{\infty} \frac{{(\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}} λ_{t v})}^{h}}{h!} e^{- \frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}} λ_{t v}} \frac{β^{2 n - 1} e^{- β \sum_{i = 1}^{n} t_{i}}}{{[1 - (1 + β T) e^{- β T}]}^{n}} d β$ (A7)

Equation (A7) is the formula in the second part of Proposition I.

Proof of Proposition II:

For a given level γ, the time required to attain the target value $λ_{t v}$ is given by $τ^{*} = τ - T$ , where τ satisfies Equation (A2). When β is known, Equation (A3) shows that $2 [\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}}] λ_{τ}$ follows a Chi-square distribution with 2n degrees of freedom. Setting $λ_{τ} = λ_{t v}$ at its γ quantile, denoted $χ^{2} (2 n; γ)$ , we have:

$2 [\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}}] λ_{t v} = χ^{2} (2 n; γ)$ (A8)

$\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}} = \frac{χ^{2} (2 n; γ)}{2 λ_{t v}}$ (A9)

$β^{2} τ e^{- β τ} = \frac{2 λ_{t v} [1 - (1 + β T) e^{- β T}]}{χ^{2} (2 n; γ)}$

$τ e^{- β τ} = \frac{2 λ_{t v} [1 - (1 + β T) e^{- β T}]}{β^{2} χ^{2} (2 n; γ)}$

$\begin{matrix} τ = - \frac{1}{β} W_{n} (\frac{- 2 β λ_{t v} [1 - (1 + β T) e^{- β T}]}{β^{2} χ^{2} (2 n; γ)}) \\ = - \frac{1}{β} W_{n} (\frac{- 2 λ_{t v} [1 - (1 + β T) e^{- β T}]}{β χ^{2} (2 n; γ)}), \text{ for } n \in \mathbb{Z} \end{matrix}$ (A10)

where W is the Lambert W function, satisfying $W (x e^{x}) = x$ , and the subscript $n \in \mathbb{Z}$ indexes the $n^{th}$ root of the equation $x e^{x} = \frac{- 2 λ_{t v} [1 - (1 + β T) e^{- β T}]}{β χ^{2} (2 n; γ)}$ . This index is distinct from the number of observed failures.

From Equation (A10), we can obtain the time required to attain $λ_{t v}$ , $τ^{*}$ , as follows:

$τ^{*} = [- \frac{1}{β} W_{n} (\frac{- 2 λ_{t v} [1 - (1 + β T) e^{- β T}]}{β χ^{2} (2 n; γ)})] - T$ (A11)

Equation (A11) is the first formula of Proposition II.

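When β is unknown, the time required to attain the target value, $λ_{t v}$ , is $τ^{*} = τ - T$ , where τ is the solution to:

$γ = 1 - \frac{1}{k} \sum_{h = 0}^{n - 1} \int_{0}^{\infty} \frac{{(\frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}} λ_{t v})}^{h}}{h!} e^{- \frac{1 - (1 + β T) e^{- β T}}{β^{2} τ e^{- β τ}} λ_{t v}} \frac{β^{2 n - 1} e^{- β \sum_{i = 1}^{n} t_{i}}}{{[1 - (1 + β T) e^{- β T}]}^{n}} d β$ (A12)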

Proof of Proposition III:

The probability is given by

$γ_{k} = \Pr [N (τ) \leq n + k | Y_{o b s}]$

When β is known:

$γ_{k} = \int_{0}^{\infty} \Pr [N (τ) \leq n + k | Y_{o b s}] π (α | Y_{o b s}) d α$ (A13)

But $π (α | Y_{o b s})$ is given by Equation (8) and

$\Pr [N (τ) \leq n + k | Y_{o b s}] = \sum_{j = n}^{n + k} f (Y_{o b s}, N (τ) = j | α) / f (Y_{o b s} | α)$ (A14)

From Equation (5), $f (Y_{o b s} | α) = α^{n} β^{2 n} (\prod_{i = 1}^{n} t_{i}) e^{- β \sum_{i = 1}^{n} t_{i}} e^{- α [1 - (1 + β T) e^{- β T}]}$ and

$\begin{array}{l} f (Y_{o b s}, N (τ) = j | α) \\ = \int_{D (j - n; T, τ)} f (Y_{o b s}, t_{n + 1}, \dots, t_{j}, N (τ) = j | α) \prod_{l = n + 1}^{j} d t_{l} \\ = \int_{D (j - n; T, τ)} α^{j} β^{2 j} (\prod_{i = 1}^{j} t_{i}) e^{- β \sum_{i = 1}^{j} t_{i}} e^{- α [1 - (1 + β τ) e^{- β τ}]} \prod_{l = n + 1}^{j} d t_{l} \\ = α^{j} β^{2 j} (\prod_{i = 1}^{n} t_{i}) e^{- β \sum_{i = 1}^{n} t_{i}} e^{- α [1 - (1 + β τ) e^{- β τ}]} \int_{D (j - n; T, τ)} (\prod_{l = n + 1}^{j} t_{l}) e^{- β \sum_{l = n + 1}^{j} t_{l}} \prod_{l = n + 1}^{j} d t_{l} \end{array}$ (A15)

We solve the integral part as follows:

$\int_{0}^{t} t e^{- β t} d t = \frac{1}{β^{2}} [1 - (1 + β t) e^{- β t}]$

Substituting the limits T and τ, we get; $\frac{1}{β^{2}} [1 - (1 + β τ) e^{- β τ}] - \frac{1}{β^{2}} [1 - (1 + β T) e^{- β T}]$ , which reduces to $\frac{1}{β^{2}} [(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]$ . Therefore, the integral part of equation (A15) is obtained as;

$\begin{array}{l} \int_{D (j - n; T, τ)} (\prod_{l = n + 1}^{j} t_{l}) e^{- β \sum_{l = n + 1}^{j} t_{l}} \prod_{l = n + 1}^{j} d t_{l} \\ = \frac{1}{β^{2 (j - n)}} \frac{{[(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]}^{j - n}}{(j - n)!} \end{array}$ (A16)

Substituting Equation (A16) into Equation (A15) we obtain;

$\begin{array}{l} f (Y_{o b s}, N (τ) = j | α) \\ = α^{j} β^{2 j} (\prod_{i = 1}^{n} t_{i}) e^{- β \sum_{i = 1}^{n} t_{i}} e^{- α [1 - (1 + β τ) e^{- β τ}]} \frac{1}{β^{2 (j - n)}} \frac{{[(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]}^{j - n}}{(j - n)!} \end{array}$

From Equation (A14);

$\begin{array}{l} f (Y_{o b s}, N (τ) = j | α) / f (Y_{o b s} | α) \\ = \frac{α^{j} β^{2 j} (\prod_{i = 1}^{n} t_{i}) e^{- β \sum_{i = 1}^{n} t_{i}} e^{- α [1 - (1 + β τ) e^{- β τ}]} \frac{1}{β^{2 (j - n)}} \frac{{[(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]}^{j - n}}{(j - n)!}}{α^{n} β^{2 n} (\prod_{i = 1}^{n} t_{i}) e^{- β \sum_{i = 1}^{n} t_{i}} e^{- α [1 - (1 + β T) e^{- β T}]}} \end{array}$

which reduces to:

$\begin{array}{l} f (Y_{o b s}, N (τ) = j | α) / f (Y_{o b s} | α) \\ = α^{j - n} e^{- α [(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]} \frac{{[(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]}^{j - n}}{(j - n)!} \end{array}$

Thus, Equation (A14) becomes:

$\begin{array}{l} \Pr [N (τ) \leq n + k | Y_{o b s}] = \sum_{j = n}^{n + k} f (Y_{o b s}, N (τ) = j | α) / f (Y_{o b s} | α) \\ = \sum_{j = n}^{n + k} α^{j - n} e^{- α [(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]} \frac{{[(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]}^{j - n}}{(j - n)!} \end{array}$ (A17)
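Each term of the sum in Equation (A17) is a Poisson probability with mean α[m(τ) − m(T)], where m(t) = 1 − (1 + βt)e^{−βt}, so for fixed α the sum is a Poisson distribution function evaluated at k. A minimal sketch of this reading follows; the helper names and the numerical values are assumptions made for illustration only.

```python
import math

def m(t, beta):
    """Delayed S-shaped mean value function divided by alpha."""
    return 1.0 - (1.0 + beta * t) * math.exp(-beta * t)

def prob_at_most_k_new(k, alpha, beta, T, tau):
    """Sum in (A17): probability of at most k failures in (T, tau] given alpha."""
    # (beta*T + 1)e^{-beta*T} - (beta*tau + 1)e^{-beta*tau} equals m(tau) - m(T).
    mu = alpha * (m(tau, beta) - m(T, beta))
    return sum(math.exp(-mu) * mu ** i / math.factorial(i) for i in range(k + 1))

# Illustrative values only (assumed, not from the paper).
alpha, beta, T, tau = 30.0, 0.1, 20.0, 30.0
probs = [prob_at_most_k_new(k, alpha, beta, T, tau) for k in range(40)]
print(probs[0], probs[5], probs[-1])
```

The probabilities increase with k and approach 1, as expected of a distribution function.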

Hence, Equation (A13) becomes:

$\begin{matrix} γ_{k} = \int_{0}^{\infty} \sum_{j = n}^{n + k} α^{j - n} e^{- α [(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]} \frac{{[(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]}^{j - n}}{(j - n)!} \\ \times {{[Γ (n)]}^{- 1} α^{n - 1} e^{- α [1 - (1 + β T) e^{- β T}]} {[1 - (1 + β T) e^{- β T}]}^{n}} d α \\ = \sum_{j = n}^{n + k} \frac{{[(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]}^{j - n} {[1 - (1 + β T) e^{- β T}]}^{n}}{(j - n)! Γ (n)} \\ \times \int_{0}^{\infty} α^{j - 1} e^{- α [1 - (1 + β τ) e^{- β τ}]} d α \\ = \sum_{j = n}^{n + k} \frac{{[(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]}^{j - n} {[1 - (1 + β T) e^{- β T}]}^{n}}{(j - n)! Γ (n)} \\ \times \frac{Γ (j)}{{[1 - (1 + β τ) e^{- β τ}]}^{j}} \int_{0}^{\infty} \frac{{[1 - (1 + β τ) e^{- β τ}]}^{j}}{Γ (j)} α^{j - 1} e^{- α [1 - (1 + β τ) e^{- β τ}]} d α \end{matrix}$ (A18)

The integrand in the last line of Equation (A18) is the density of a gamma distribution with shape parameter j and rate parameter $[1 - (1 + β τ) e^{- β τ}]$ , so the integral equals 1. Hence, Equation (A18) reduces to:

$γ_{k} = \sum_{j = n}^{n + k} \frac{{[(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]}^{j - n} {[1 - (1 + β T) e^{- β T}]}^{n} Γ (j)}{(j - n)! Γ (n) {[1 - (1 + β τ) e^{- β τ}]}^{j}}$ (A19)

Equation (A19) is rearranged to obtain;

$γ_{k} = \frac{{[1 - (1 + β T) e^{- β T}]}^{n}}{{[(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]}^{n}} \sum_{j = n}^{n + k} (\begin{matrix} j - 1 \\ n - 1 \end{matrix}) \frac{{[(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]}^{j}}{{[1 - (1 + β τ) e^{- β τ}]}^{j}}$ (A20)

Equation (A20) is the first formula of Proposition III.
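Equation (A20) can be evaluated directly. The sketch below computes γ_k from both (A19) and (A20) to confirm that the rearrangement agrees; it also uses the fact that, writing m(t) = 1 − (1 + βt)e^{−βt}, the case k = 0 reduces to [m(T)/m(τ)]^n. The function names and the values of n, β, T and τ are illustrative assumptions, not the paper's data.

```python
import math

def m(t, beta):
    """Delayed S-shaped mean value function divided by alpha."""
    return 1.0 - (1.0 + beta * t) * math.exp(-beta * t)

def gamma_k_a19(k, n, beta, T, tau):
    """gamma_k as written in Equation (A19)."""
    m_T, m_tau = m(T, beta), m(tau, beta)
    d = m_tau - m_T
    total = 0.0
    for j in range(n, n + k + 1):
        # Gamma(j) / ((j - n)! Gamma(n)), computed on the log scale to avoid overflow.
        log_coef = math.lgamma(j) - math.lgamma(j - n + 1) - math.lgamma(n)
        total += math.exp(log_coef) * d ** (j - n) * m_T ** n / m_tau ** j
    return total

def gamma_k_a20(k, n, beta, T, tau):
    """gamma_k as written in Equation (A20)."""
    m_T, m_tau = m(T, beta), m(tau, beta)
    d = m_tau - m_T
    series = sum(math.comb(j - 1, n - 1) * (d / m_tau) ** j
                 for j in range(n, n + k + 1))
    return (m_T / d) ** n * series

# Illustrative values only (assumed, not from the paper).
n, beta, T, tau = 10, 0.1, 20.0, 30.0
for k in (0, 2, 5, 10):
    print(k, gamma_k_a19(k, n, beta, T, tau), gamma_k_a20(k, n, beta, T, tau))
```

For large n or for τ very close to T, the prefactor in (A20) becomes very large while the series terms become very small, so the log-scale form used for (A19) is likely the safer choice numerically.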

When β is unknown, we obtain:

$γ_{k} = \int_{0}^{\infty} \int_{0}^{\infty} \Pr [N (τ) \leq n + k | Y_{o b s}] π (α, β | Y_{o b s}) d α d β$

where $\Pr [N (τ) \leq n + k | Y_{o b s}]$ and $π (α, β | Y_{o b s})$ are given by Equations (A17) and (12), respectively.

$\begin{array}{l} γ_{k} = \sum_{j = n}^{n + k} \frac{1}{c (j - n)! Γ (n)} \int_{0}^{\infty} \int_{0}^{\infty} α^{j - n} e^{- α [(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]} \\ \times {[(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]}^{j - n} α^{n - 1} β^{2 n - 1} e^{- β \sum_{i = 1}^{n} t_{i}} e^{- α [1 - (1 + β T) e^{- β T}]} d α d β \\ = \sum_{j = n}^{n + k} \frac{Γ (j)}{c (j - n)! Γ (n)} \int_{0}^{\infty} β^{2 n - 1} e^{- β \sum_{i = 1}^{n} t_{i}} \frac{{[(β T e^{- β T} + e^{- β T}) - (β τ e^{- β τ} + e^{- β τ})]}^{j - n}}{{[1 - (1 + β τ) e^{- β τ}]}^{j}} d β \end{array}$ (A21)

where c is the constant denoted by k in Equation (12). It is renamed c here because k already denotes the number of additional failures in the upper summation limit (n + k) of Equation (A21), and the two quantities are not the same. Equation (A21) gives the second formula in Proposition III.

Proof of Proposition IV:

When β is known, given a predetermined $τ$ ( $τ > T$ ), the Bayesian UPL for $λ (t)$ with level γ, denoted by $λ_{U}^{(β)} (τ)$ , satisfies $γ = \Pr (λ_{τ} \leq λ_{U}^{(β)} (τ) | Y_{o b s})$ . From Equations (A2) and (A8):

$γ = \int_{0}^{λ_{U}^{(β)} (τ)} {[\frac{1 - (1 + β T) e^{- β T}}{β^{2} t e^{- β t}}]}^{n} \frac{λ_{τ}^{n - 1}}{Γ (n)} e^{- λ_{τ} [\frac{1 - (1 + β T) e^{- β T}}{β^{2} t e^{- β t}}]} d λ_{τ}$ (A22)

From Equation (A14);

$2 [\frac{1 - (1 + β T) e^{- β T}}{β^{2} t e^{- β t}}] λ_{U}^{(β)} (τ) = χ^{2} (2 n; γ)$ (A23)

Making $λ_{U}^{(β)} (τ)$ the subject of Equation (A23), we get:

$λ_{U}^{(β)} (τ) = \frac{(β^{2} t e^{- β t}) χ^{2} (2 n; γ)}{2 [1 - (1 + β T) e^{- β T}]}$ (A24)

Equation (A24) gives the first part of Proposition IV.
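To show how Equation (A24) might be evaluated, the sketch below computes the chi-square quantile by bisection, using the closed-form distribution function that is available when the degrees of freedom 2n are even. It assumes that χ²(2n; γ) denotes the γ-quantile (lower-tail probability γ) of the chi-square distribution with 2n degrees of freedom; the function names and input values are illustrative and not taken from the paper.

```python
import math

def chi2_cdf_even(x, n):
    """CDF of a chi-square variable with 2n degrees of freedom (n a positive integer)."""
    if x <= 0.0:
        return 0.0
    half = x / 2.0
    term, partial = 1.0, 1.0
    for i in range(1, n):
        term *= half / i          # (x/2)^i / i!
        partial += term
    return 1.0 - math.exp(-half) * partial

def chi2_quantile_even(p, n):
    """Quantile by bisection; assumes 0 < p < 1."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, n) < p:   # expand until the quantile is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf_even(mid, n) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def upper_prediction_limit(level, n, beta, T, t):
    """Equation (A24): Bayesian UPL for the intensity at time t when beta is known."""
    m_T = 1.0 - (1.0 + beta * T) * math.exp(-beta * T)
    return beta ** 2 * t * math.exp(-beta * t) * chi2_quantile_even(level, n) / (2.0 * m_T)

# Illustrative values only (assumed, not from the paper).
print(upper_prediction_limit(0.95, n=10, beta=0.1, T=20.0, t=30.0))
```

Where SciPy is available, `scipy.stats.chi2.ppf(level, 2 * n)` could replace the bisection routine.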

When β is unknown, the Bayesian UPL for $λ (t) = α β^{2} t e^{- β t}$ with level γ is $λ_{t v}$ , which is the solution to;

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1]	Kaur, R. and Panwar, P. (2015) Study of Perfect and Imperfect Debugging NHPP SRGMs Used for Prediction of Faults in a Software. IJCSC, 6, 73-78.
[2]	Akuno, A.O., Orawo, L.A. and Islam, A.S. (2014) One-Sample Bayesian Predictive Analyses for an Exponential Nonhomogeneous Poisson Process in Software Reliability. Open Journal of Statistics, 4, 402-411. https://doi.org/10.4236/ojs.2014.45039
[3]	Yamada, S., Ohba, M. and Osaki, S. (1984) S-Shaped Software Reliability Growth Models and Their Applications. IEEE Transactions on Reliability, R-33, 289-292. https://doi.org/10.1109/TR.1984.5221826
[4]	Hanagal, D.D. and Bhalerao, N.N. (2018) Analysis of Delayed S-Shaped Software Reliability Growth Model with Time Dependent Fault Content Rate Function. Journal of Data Science, 16, 857-878. https://doi.org/10.6339/JDS.201810_16(4).00010
[5]	Lee, T.Q., Yeh, C.W. and Fang, C.C. (2014) Bayesian Software Reliability Prediction Based on Yamada Delayed S-Shaped Model. Applied Mechanics and Materials, 490, 1267-1278. https://doi.org/10.4028/www.scientific.net/AMM.490-491.1267
[6]	Collins, O., Akong’o, O.L., Munene, M.G. and Okenye, J.O. (2023) Bayesian Interval Estimation in a Non-Homogeneous Poisson Process with Delayed S-Shaped Intensity Function. American Journal of Theoretical and Applied Statistics, 12, 43-50.
[7]	Yin, L. and Trivedi, K.S. (1999) Confidence Interval Estimation of NHPP-Based Software Reliability Models. Proceedings 10th International Symposium on Software Reliability Engineering, 6-11.
[8]	Yu, J.W., Tian, G.L. and Tang, M.L. (2007) Predictive Analyses for Nonhomogeneous Poisson Processes with Power Law Using Bayesian Approach. Computational Statistics & Data Analysis, 51, 4254-4268. https://doi.org/10.1016/j.csda.2006.05.010
[9]	Yu, J.W., Tian, G.L. and Tang, M.L. (2011) Bayesian Estimation and Prediction for the Power Law Process with Left Truncated Data. Journal of Data Science, 9, 445-470. https://doi.org/10.6339/JDS.201107_09(3).0009
[10]	Cheruiyot, N., Orawo, L.A.O. and Islam, A.S. (2018) Bayesian Predictive Analyses for Logarithmic Non-Homogeneous Poisson Process in Software Reliability. Open Access Library Journal, 5, 1-13. https://doi.org/10.4236/oalib.1104767
[11]	Ehrlich, W., Prasanna, B., Stampfel, J. and Wu, J. (1993) Determining the Cost of a Stop-Test Decision (Software Reliability). IEEE Software, 10, 33-42. https://doi.org/10.1109/52.199726
[12]	Hung-Cuong, N. and Quyet-Thang, H. (2015) New NHPP SRM Based on Generalized S-Shaped Fault-Detection Rate Function. Nature of Computation and Communication: International Conference, ICTCC 2014, Ho Chi Minh City, 24-25 November 2014, 212-221. https://doi.org/10.1007/978-3-319-15392-6_21
