Share This Article:

Analytical and Numerical Investigations of Probabilistic Monochromatic Problem

Abstract Full-Text HTML XML Download Download as PDF (Size:2669KB) PP. 793-808
DOI: 10.4236/jamp.2019.74054    94 Downloads   194 Views  

ABSTRACT

A probabilistic formalism, relying on Bayes’ theorem and linear Gaussian inversion, is adapted, so that a monochromatic problem can be investigated. The formalism enables an objective test in probabilistic terms of the quantities and model concepts involved in the problem at hand. With this formalism, an amplitude (linear parameter), a frequency (non-linear parameter) and a hyperparameter of the Gaussian amplitude prior are inferred jointly given simulated data sets with Gaussian noise contributions. For the amplitude, an analytical normal posterior follows which is conditional on the frequency and the hyperparameter. The remaining posterior estimates the frequency with an uncertainty of MHz, while the convolution of a standard approach would achieve an uncertainty of some GHz. This improvement in the estimation is investigated analytically and numerically, revealing for instance the positive effect of a high signal-to-noise ratio and/or a large number of data points. As a fixed choice of the hyperparameter imposes certain results on the amplitude and frequency, this parameter is estimated and, thus, tested for plausibility as well. From abstract point of view, the model posterior is investigated as well.

1. Introduction

Fourier transformation tools are used to obtain information about spectra for a given data set. As any data has an uncertainty, Fourier transformation techniques can be supported by probabilistic theory captured by Bayes’ theorem [1] to improve scientific results and conclusions. The works [2] [3] demonstrate some advantages of the probabilistic ansatz over a conventional approach.

A common and misleading assumption in the field is that the Nyquist theorem determines the spectral band limitation. However, as the Nyquist frequency follows from finite data sampling, it can only be an upper limit but cannot be an estimator for the limit caused actually by a source and/or diagnostic throughput. A similar reasoning applies to the lower spectral limit. After a basic formalism has been derived, it could be shown in Ref. [4] that the band limitation can be well inferred from experimental data. Thereby, a linear Gaussian inversion technique was used to infer spectral amplitudes. Furthermore, a settings posterior was introduced which estimates non-linear parameters and hyperparameters of a problem. For instance, the spectral band limits and the uncertainty on the derived quasi-continuous spectrum, originating in non-probed Fourier coefficients, have been inferred jointly.

For many applied analysis schemes, certain model assumptions and their implications are not tested but assumed to be valid with infinite trust. From scientific point of view, any analysis scheme should be tested given simulated noisy data for which all model assumptions are clear. Then, analysis results and model assumptions can be investigated which is achieved objectively by a probabilistic ansatz. If this can be carried out analytically, valuable information is available when only actual measured data are given for a scientific problem.

The problem of a monochromatic source is a good example to show the powerfulness of Bayesian inference and to test model assumptions. In this work, the formalism derived in Ref. [4] is adapted to a monochromatic problem in Section 2 and applied to simulated data with different noise levels in Section 3. In addition, analytical and numerical investigations are carried out, so that the probabilistic findings can be understood in more detail. For example, a better signal-to-noise ratio improves the estimation of the frequency, while more data points compensate a worsened signal-to-noise ratio. The conclusion section can be found at the end.

2. Adapted Probabilistic Formalism

The formalism derived in Ref. [4] is adapted below to a monochromatic problem, involving one frequency parameter f 1 and one amplitude parameter S 1 . In addition, a hyperparameter is at hand, entering in the normal prior for S 1 . Abstractly, Bayes’ theorem reads then

p ( S 1 , f 1 , σ S 1 , P r | D ) = p ( D | S 1 , f 1 ) p ( D ) p ( S 1 | σ S 1 , P r ) p ( σ S 1 , P r ) p ( f 1 ) (1)

with the joint posterior on the left-hand, the likelihood p ( D | S 1 , f 1 ) devided by the evidence, the amplitude prior p ( S 1 | σ S 1 , P r ) , the prior for the hyperparameter, and the prior for the frequency p ( f 1 ) .

The amplitude S 1 maps linearly to the data domain via the vector M 1 = M ( f 1 ) dependent on the frequency in a non-linear manner. The data D with N D entries are assumed to be acquired independently, and the measurement uncertainty of each data point follows a normal distribution with standard deviation σ D . With these assumptions, the Gaussian likelihood becomes

p ( D | S 1 , f 1 ) = exp [ 1 2 ( D M 1 S 1 ) T Σ D 1 ( D M 1 S 1 ) ] ( 2 π ) N D / 2 | Σ D | 1 / 2 (2)

with the covariance matrix Σ D = σ D 2 δ i j . The amplitude prior takes the form

p ( S 1 | σ S 1 , P r ) = exp [ 1 2 S 1 2 / σ S 1 , P r 2 ] ( 2 π ) 1 / 2 σ S 1 , P r (3)

for vanishing prior mean and variance σ S 1 , P r 2 . After some analytical operations, one obtaines the Gaussian amplitude posterior

p ( S 1 | f 1 , σ S 1 , P r , D ) = exp [ 1 2 ( S 1 S 1, P o ) σ S 1 , P o 2 ( S 1 S 1, P o ) ] ( 2 π ) 1 / 2 σ S 1 , P o (4)

conditional on f 1 and σ S 1 , P r . In the above equation, the posterior variance is given by σ S 1 , P o 2 = ( M 1 T M 1 σ D 2 + σ S 1 , P r 2 ) 1 , and the posterior mean reads S 1, P o = σ S 1 , P o 2 M 1 T D / σ D 2 .

From the remaining terms, the so-called settings posterior

p ( f 1 , σ S 1 , P r | D ) = 1 K σ S 1 , P o σ S 1 , P r exp [ 1 2 S 1, P o 2 / σ S 1 , P o 2 ] p ( σ S 1 , P r ) p ( f 1 ) (5)

is derived which can be interpreted as Ockham’s razor with respect to S 1 . This settings posterior has no general analytical solution, and, thus, the normalisation constant K cannot be stated further. For the remainder of the paper, p ( σ S 1 , P r ) = 1 / Δ σ and p ( f 1 ) = 1 / Δ f are taken as uniform distributions. By this choice, K = K p ( σ S 1 , P r ) p ( f 1 ) can be used to investigate analytically and numerically the quantity

p × K = σ S 1 , P o σ S 1 , P r exp [ 1 2 S 1, P o 2 / σ S 1 , P o 2 ] . (6)

Finally, the evidence can be identified as

p ( D ) = K Δ σ Δ f exp [ 1 2 D T Σ D 1 D ] ( 2 π ) N D / 2 | Σ D | 1 / 2 . (7)

The constant K / ( Δ σ Δ f ) varies with the chosen model H, including likelihood, priors, and the prior domains for f 1 and σ S 1 , P r . Hence, this constant is linked to the model posterior p ( H | D ) which is of importance, when the model is further abstracted, or a comparison with an alternative model is carried out.

3. Monochromatic Problem

For a monochromatic even source, data sets with different noise levels are simulated, and the formalism derived in the previous section is applied. To explain the results found, analytical and numerical investigations are carried out. While this problem is treated abstractly in the following, the main results will be presented for two examples with low and high noise contribution to ease the presentation.

(A) Simulated Data: Two Examples

Data sets are modelled for the real-world interferometer found in Ref. [5] which achieves an optical path difference to obtain an interferogram. Formally, the interferogram V i = 2 S cos ( 2 π f x i / c ) is simulated with the amplitude S = 0.5 and the frequency f = 50 GHz. The optical path difference grid x i is chosen as x i = ( i 129 ) Δ x with i = 1 , , N D = 789 with constant increment Δ x = 40 μ m . Accordingly, the total length L = 31.56 mm and the length L S S = 21.28 mm of the single-sided region follow. Two noisy data sets D i = V i + N i ( 0, σ D 2 ) are generated with the noise levels σ D at 0.05 and 1.0 (see Figure 1).

(B) Application of Formalism

The linear mapping of the amplitude parameter is given by M 1, i = 2 cos ( 2 π f 1 x i / c ) , yielding M 1 . For the two simulated data sets, Figure 2 shows p ( S 1 | f 1 , σ S 1 , P r , D ) , varying in f 1 and for three values of σ S 1 , P r . Even for the high noise level case, S 1, P o peaks close to 50 GHz, and the uncertainty σ S 1 , P o remains reasonably small. The peak has the full width at half maximum (FWHM) of about 8 GHz which is roughly twice as large as the classical spectral resolution stated by c / ( 2 L ) 4.75 GHz . High and low values of σ S 1 , P r seem to have little effect on the results.

Figure 1. Simulated interferograms V (without noise, black) and D (with noise, red). Formally, V = 2 S cos ( 2 π f x / c ) was used with amplitude 2 S = 1 , frequency f = 50 GHz, and optical path difference grid x with increment Δ x = 40 μ m (number of data points N D = 789 ). The data-like quantity D = V + N ( 0, σ D 2 ) resembles a sample of V , assuming a Gaussian noise component with variance σ D 2 . Furthermore, the noise is chosen to be independent of x. (a) σ D = 0.05 (low noise); (b) σ D = 1.0 (high noise).

Figure 2. Conditional Gaussian amplitude posterior p ( S 1 | f 1 , σ S 1 , P r , D ) with mean S 1, P o and standard deviation σ S 1 , P o . This posterior depends on the frequency f 1 , prior standard deviation σ S 1 , P r for S 1 , and noisy interferogram data D . The conditional posterior is obtained for two sets D (see Figure 1) modelled by a monochromatic source with the frequency f = 50 GHz, the amplitude S = 0.5 and the noise levels σ D = 0.05 (left column) and 1 (right column). The posterior mean (cyan) peaks at values of about 0.5 close to 50 GHz. The spectral FWHM of this peak accounts for about 8 GHz. The posterior standard deviation (dashed-white for its multiples) increases with the noise level, as one would expect. Only small values of σ S 1 , P r affect S 1, P o when the noise level is high.

The settings posterior p ( f 1 , σ S 1 , P r | D ) is proportional to the quantity p × K which takes very large values even on logarithmic scale and, thus, ln p × K is used in the following. For the two cases, ln p × K is shown in Figure 3(a) and Figure 3(b) versus the spectral parameter and for three prior values of σ S 1 , P r . All three cross-sections look similar, but close to 50 GHz the peak is highest for values near σ S 1 , P r = 0.5 . Identifying this global maximum by max ( p × K ) , the ratio p × K / max ( p × K ) takes reasonable numbers and is presented for the

Figure 3. p × K , being proportional to joint posterior p ( f 1 , σ S 1 , P r | D ) , dependent on frequency f 1 and prior standard deviation σ S 1 , P r given interferogram data set D (see Figure 1) with low and high noise levels σ D = 0.05 (left column) and 1 (right column), respectively. Since p × K takes large values, ln p × K is presented in (a) and (b). Deviding by the global maximum, the ratio p × K / max ( p × K ) becomes manageable from numerical point of view as shown in (c) and (d). This ratio peaks in the vicinity of 50 GHz and σ S 1 , P r = 0.5 ( log σ S 1 , P r = 0.3 ). The spectral width of p × K / max ( p × K ) increases with the noise level from about 10 to 200 MHz but remains well below the classical spectral resolution of about 4.75 GHz. The width with respect to σ S 1 , P r remains constant and covers one order of magnitude. (a) ln p × K (low noise case) for broad spectral domain at three values of σ S 1 , P r ; (b) ln p × K (high noise case) for broad spectral domain at three values of σ S 1 , P r ; (c) p × K / max ( p × K ) (low noise case) for spectral domain 49.9 - 50.1 GHz; (d) p × K / max ( p × K ) (high noise case) for spectral domain 49 - 51 GHz.

relevant domains of f 1 and σ S 1 , P r by Figure 3(c) and Figure 3(d) for both data sets. This ratio is narrow in the vicinity of 50 GHz and in the domain 0.3 σ S 1 , P r 3 . Furthermore, the peaking has a spectral width of about 10 MHz and 200 MHz for the low and high noise case, respectively. When interpreted as spectral resolution, this width is smaller by several orders of magnitude with respect to its classical counterpart of about 4.75 GHz.

(C) Analytical and Numerical Investigations

To have further inside in the results, the conditional amplitude posterior and p × K are analytically investigated. In order to do so, the noise contribution in the data is neglected, meaning that D i is replaced with the noise-less interferogram V i = 2 S cos ( 2 π f x i / c ) . This allows the tracking of the dependencies of the posteriors on the original amplitude S and frequency f, noise level σ D , and number of data points N D . Furthermore, a second factorisation p ( f 1 , σ S 1 , P r | D ) = p ( f 1 | σ S 1 , P r , D ) p ( σ S 1 , P r | D ) can be justified. Parallel to this investigation, numerics and conditional posteriors are presented.

1) Conditional Amplitude Posterior

Starting point is the variance σ S 1 , P o 2 of the conditional posterior p ( S 1 | f 1 , σ S 1 , P r , D ) (see Equation (4)). Formally, one finds for the inverse

σ S 1 , P o 2 = M 1 T M 1 σ D 2 + 1 σ S 1 , P r 2 . (8)

As long as the spatial sampling is well enough, one can use the approximation

M 1 T M 1 = 4 i = 1 N D cos ( 2 π f 1 c x i ) cos ( 2 π f 1 c x i ) 2 N D [ 1 + cos ( 2 π f 1 c L S S ) sinc ( 2 π f 1 c L ) ] , (9)

as shown by Equation (A1), to get the posterior variance

σ S 1 , P o 2 = σ D 2 / ( 2 N D ) 1 + cos ( 2 π f 1 c L S S ) sinc ( 2 π f 1 c L ) + σ D 2 2 N D σ S 1 , P r 2 . (10)

This variance increases quadratically with the noise level and reduces with the number of data samples. Thus, more data points per spatial unit can compensate the noise contribution, at least partly. However, this assumes that the noise is independent of the data sampling.

The modulated sinc function in the denominator of Equation (10) depends on f 1 , L and L S S for which the interferogram is available. The sinc function becomes unity and vanishes for small and large frequencies, respectively. For instance, since both data examples share the same spatial domain, the sinc function is close to 0 above about 20 GHz. The amplitude prior influences σ S 1 , P o 2 , when the ratio σ D 2 / ( 2 N D σ S 1 , P r 2 ) becomes larger than 1 which is obtained for a large noise level, a small value of σ S 1 , P r and/or few data points.

For the posterior mean,

S 1, P o ( f 1 ) = σ S 1 , P o 2 σ D 2 M 1 T V = 2 i = 1 N D V i cos ( 2 π f 1 c x i ) / ( 2 N D ) 1 + cos ( 2 π f 1 c L S S ) sinc ( 2 π f 1 c L ) + σ D 2 2 N D σ S 1 , P r 2 (11)

follows, leaving aside the noise contribution but the uncertainty on the mean is still captured by σ S 1 , P o . With the approximation (A1) and the trigonometric identity (B1), one can resolve

S 1, P o ( f 1 ) = S cos ( π f f 1 c L S S ) sinc ( π f f 1 c L ) + cos ( π f + f 1 c L S S ) sinc ( π f + f 1 c L ) 1 + cos ( 2 π f 1 c L S S ) sinc ( 2 π f 1 c L ) + σ D 2 2 N D σ S 1 , P r 2 . (12)

Indeed, one finds the original amplitude S 1 , P o = S for f 1 = f and prior values with the condition σ D 2 / ( 2 N D ) σ S 1 , P r 2 . Furthermore, due to cos ( π f f 1 c L ) sinc ( π f f 1 c L S S ) there is a peaking of S 1, P o ( f 1 ) in the vicinity of f 1 f with the FWHM determined by both trigonometric arguments. Away from f, the oscillating amplitude drops like 1 / ( f f 1 ) due to the sinc function which can be seen in Figure 2.

2) Conditional Posterior for f1

Now the information is available to investigate p ( f 1 , σ S 1 , P r | D ) . Neglecting the noise, the triple product in the exponent of the exponential in p × K reads

1 2 S 1, P o σ S 1 , P o 2 S 1, P o 1 2 M 1 T V σ D 2 σ S 1 , P o 2 M 1 T V σ D 2 = 1 2 σ S 1 , P o 2 4 N D 2 S 2 σ D 4 [ cos ( π f f 1 c L S S ) sinc ( π f f 1 c L ) + cos ( π f + f 1 c L S S ) sinc ( π f + f 1 c L ) ] 2 . (13)

The global maximum is close to f 1 = f , and, assuming a sufficiently large frequency f, one gets the approximation

[ cos ( π f f 1 c L S S ) sinc ( π f f 1 c L ) ] 2 [ 1 ( π f f 1 c L S S ) 2 1 3 ( π f f 1 c L ) 2 ] (14)

when the Taylor series expansions sinc ( z ) 1 z 2 / 3 ! and cos ( z ) 1 z 2 / 2 are used. Then,

1 2 S 1, P o σ S 1 , P o 2 S 1, P o 1 2 σ S 1 , P o 2 4 N D 2 S 2 σ D 4 [ 1 ( f f 1 ) 2 ( π c ) 2 ( L S S 2 + 1 3 L 2 ) ] (15)

remains, and the term which is independent on f f 1 takes very large numerical values and is treated in more detail in the next subsection. The remaining term can be rewritten by a quadratic exponent of a Gaussian with posterior mean f 1, P o = f and square root of the variance

σ f 1 , P o = 1 σ S 1 , P o σ D 2 2 N D S 1 π c L ( 3 3 L S S 2 / L 2 + 1 ) 1 / 2 σ D S N D 1 / 2 1 π c L ( 3 2 1 3 L S S 2 / L 2 + 1 ) 1 / 2 , (16)

using Equation (10) for large frequencies and σ S 1 , P r 2 σ D 2 / ( 2 N D ) . This uncertainty increases with the noise but decreases with the square root of the number of data points per spatial unit, the signal level and the spatial domain covered. For the data set examples, one inserts L = 31.56 mm, L S S = 21.28 mm, N D = 789 , σ D / S = 0.1 and 2 to find the astonishing numbers σ f 1 , P o = 8.6 and 171.9 MHz, respectively. This explains the narrowness of p ( f 1 , σ S 1 , P r | D ) with respect to f 1 as shown in Figure 3(c) and Figure 3(d).

In fact, σ S 1 , P o changes with f 1 on a GHz, but the peaking in f 1 is of the order of MHz. Hence, the factorisation p ( f 1 , σ S 1 , P r | D ) p ( f 1 | σ S 1 , P r , D ) p ( σ S 1 , P r | D ) with the conditional Gaussian posterior

p ( f 1 | σ S 1 , P r , D ) = exp [ 1 2 ( f 1 f 1, P o ) 2 σ f 1 , P o 2 ] ( 2 π ) 1 / 2 σ f 1 , P o (17)

and

p ( σ S 1 , P r | D ) = σ S 1 , P o σ S 1 , P r exp [ 1 2 σ S 1 , P o 2 4 N D 2 S 2 σ D 4 ] K ( 2 π ) 1 / 2 σ f 1 , P o (18)

is justified. Thereby, σ S 1 , P o needs to be taken at f 1 = f 1, P o . The remaining probability p ( σ S 1 , P r | D ) is investigated in the next subsection.

Numerics

Because the variance σ S 1 , P o 2 changes in f 1 on GHz scale but

exp [ 1 2 S 1, P o σ S 1 , P o 2 S 1, P o ] on MHz scale, only the exponential needs to be taken into account for the determination of f 1, P o and σ f 1 , P o . Furthermore, one faces large numbers, and, hence, the exponent is dealt with directly.

At a given σ S 1 , P r , the frequency at which the maximum of the exponent occurs is identified as the posterior mean f 1, P o ( σ S 1 , P r ) . The posterior standard deviation σ f 1 , P o ( σ S 1 , P r ) is obtained when the ratio of the exponent to its maximum at f 1, P o reads 1 2 S 1, P o σ S 1 , P o 2 S 1, P o / max ( 1 2 S 1, P o σ S 1 , P o 2 S 1, P o ) = 1 / 2 . For both data sets, the ratios are shown in Figure 4(a) and Figure 4(b) in the vicinity of the peak. While the uncertainty of f 1, P o and, hence, the width of the Gaussian augment with the noise level (see also Figure 5(a) and Figure 5(b)) as described by Equation (16), f remains included in the 2- σ f 1 , P o band centred at f 1, P o . Furthermore, the choice of σ S 1 , P r affects f 1, P o and σ f 1 , P o only for small values (see Figure 5(a) and Figure 5(b)).

3) Posterior for σ 1 , P r

After the factorisation, one combines Equations (16) and (18) like

p ( σ S 1 , P r | D ) = P K = σ S 1 , P o σ S 1 , P r exp [ 1 2 σ S 1 , P o 2 4 N D 2 S 2 σ D 4 ] K ( 2 π ) 1 / 2 σ f 1 , P o = P K σ D 2 N D S c L 3 L 2 3 L S S 2 + L 2 (19)

with the important term

Figure 4. Ratio of exponent to its maximum dependent on frequency f 1 and keeping constant prior standard deviation σ S 1 , P r . For the data set with low noise level ( σ D = 0.05 , left), the peaking is more narrow than the one for the high noise level ( σ D = 1 , right). The maximum location and the width of the ratio are used to estimate posterior mean f 1, P o ( σ S 1 , P r ) and standard deviation σ f 1 , P o ( σ S 1 , P r ) . (a) Ratio for low noise data set at three values of σ S 1 , P r . For σ S 1 , P r = 0.5 , f 1 , P o = 50.016 GHz and σ f 1 , P o = 8.5 MHz are extracted which are consistent with 50 GHz used to simulate the data; (b) Ratio for high noise data set at three values of σ S 1 , P r . For σ S 1 , P r = 0.5 , f 1, P o = 50.167 GHz and σ f 1 , P o = 173 MHz are estimated. σ f 1 , P o and the deviation of f 1, P o from 50 GHz increase with the noise level. Furthermore, small prior values for σ S 1 , P r affect σ f 1 , P o .

Figure 5. Posterior mean f 1, P o and standard deviation σ f 1 , P o for frequency f 1 dependent on prior standard deviation σ S 1 , P r for noise levels σ D = 0.05 (left) and 1 (right). The original frequency f = 50 GHz (red line) is included usually in the band f 1, P o ± 2 σ f 1, P o . However, σ f 1, P o increases from about 10 MHz to 200 MHz given both noise levels. The posterior p ( σ S 1 , P r | D ) (bottom row) changes little for the different noise levels and has a global maximum close to S = 0.5 (dashed-black). (a) f 1, P o ± 2 σ f 1, P o for σ D = 0.05 ; (b) f 1, P o ± 2 σ f 1, P o for σ D = 1 ; (c) p ( σ S 1 , P r | D ) for σ D = 0.05 ; (d) p ( σ S 1 , P r | D ) for σ D = 1 .

P = exp [ 1 2 σ S 1 , P o 2 4 N D 2 S 2 σ D 4 ] σ S 1 , P r . (20)

The exponent of P becomes

σ S 1 , P r 2 N D S 2 σ D 2 σ S 1 , P r 2 [ 1 + cos ( 2 π f 1, P o c L S S ) sinc ( 2 π f 1, P o c L ) ] + σ D 2 2 N D (21)

which is monotonically rising in σ S 1 , P r from 0 to about N D S 2 / σ D 2 . In case σ S 1 , P r 2 > σ D 2 / ( 2 N D ) and S / σ D = 1 , the number of data points dominates. For the two examples with N D = 789 , large numbers exp ( 78900 ) and exp ( 197 ) are at hand as could be seen in Figure 3(a) and Figure 3(b).

As the exponent vanishes for σ S 1 , P r = 0 , 1 / σ S 1 , P r and, thus, P and P diverge. Hence, this global maximum should be excluded by a proper choice for the prior of σ S 1 , P r . Doing so, P has the global maximum P G M a x at

σ S 1 , P r , G M a x 2 = S 2 2 σ D 2 2 N D + S 2 2 ( 1 2 σ D 2 N D S 2 ) 1 / 2 (22)

supposed a large f 1, P o . Interestingly, N D reduces the noise impact drastically, so that even for elevated noise levels σ D S , the maximum is still given by S with minor corrections.

The uniform prior p ( σ S 1 , P r ) may be finite for the domain σ S 1 , P r , L σ S 1 , P r , U , and the maximum of P in this domain is denoted by P M a x . Then, the peaking can be made more obvious by

P P M a x = σ S 1 , P r , M a x σ S 1 , P r exp [ 1 2 σ S 1 , P r , M a x 2 σ S 1 , P r 2 σ S 1 , P r 2 + σ D 2 2 N D S 2 σ S 1 , P r , M a x 2 + σ D 2 2 N D ] (23)

after some algebra for a large f 1, P o . With the normalisation of the above equation

σ S 1 , P r , L σ S 1 , P r , U P P M a x d σ S 1 , P r = K σ S 1 , P r ( σ S 1 , P r , L , σ S 1 , P r , U ) , (24)

the posterior becomes

p ( σ S 1 , P r | D ) = 1 K σ S 1 , P r P P M a x . (25)

In case, the prior is finite for the domain σ S 1 , P r , L σ S 1 , P r , G M a x σ S 1 , P r , U , so that P M a x = P G M a x , one can set σ S 1 , P r , M a x 2 = S 2 σ D 2 / ( 2 N D ) . This reveals the simple forms

P G M a x = exp [ N D S 2 σ D 2 1 2 ] S 2 σ D 2 2 N D (26)

and

P P G M a x = S 2 σ D 2 2 N D σ S 1 , P r exp [ 1 2 ( 1 S 2 σ S 1 , P r 2 + σ D 2 2 N D ) ] (27)

of Equation (23). With the above expression and Equation (19), the constant

K = P G M a x K σ S 1 , P r = σ D 2 N D S c L 3 2 π L 2 3 L S S 2 + L 2 exp [ N D S 2 σ D 2 1 2 ] S 2 σ D 2 2 N D K σ S 1 , P r (28)

follows with proper unit Hz. K depends in a complicated way on the noise level, the signal, the number of data points, the spatial domain for which the data is acquired, and the limits σ S 1 , P r , L and σ S 1 , P r , U . For the two examples, one gets ln ( P G M a x / Hz ) = 78910 ( σ D = 0.05 ) and ln ( P G M a x / Hz ) = 214 ( σ D = 1 ).

Numerics

Starting point is Equation (19) which is rewritten like

ln p ( σ S 1 , P r | D ) + ln K = ln P = ln ( σ S 1 , P o σ S 1 , P r max ( exp [ 1 2 S 1, P o σ S 1 , P o 2 S 1, P o ] ) ( 2 π ) 1 / 2 σ f 1 , P o ) (29)

to handle large numerical values. Thereby, σ S 1 , P o is evaluated at f 1 = f 1 , P o . The far right-hand side of the above equation is numerically available, and the maximum ln P M a x at σ S 1 , P r , M a x and the normalisation K σ S 1 , P r can be determined for the domain from σ S 1 , P r , L to σ S 1 , P r , U .

Choosing for the moment σ S 1 , P r , L = 10 2 and σ S 1 , P r , U = 10 3 , so that the global maximum P G M a x is included, the above formalism yields ln ( P G M a x / Hz ) = 76861 (213) and K σ S 1 , P r = 6.32 (6.36) for the data set simulated with σ D = 0.05 (1). For both examples, the order of magnitude of ln ( P G M a x / Hz ) is well described by Equation (26).

For the data sets, both P G M a x are located at σ S 1 , P r , G M a x = 0.499 and 0.501 which is in the vicinity of S (see Figure 5(c) and Figure 5(d)) and is in agreement with expression (22). For the different noise levels, p ( σ S 1 , P r | D ) has a similar shape; the probability reduces drastically below σ S 1 , P r , G M a x but remains finite above up to σ S 1 , P r = 10 , at least.

4) Model Posterior

The posterior of the model with uniform priors for σ S 1 and f 1 is given by

p ( H | D ) = P M a x K σ S 1 , P r ( σ S 1 , P r , U σ S 1 , P r , L ) ( f 1, P r , U f 1, P r , L ) . (30)

This posterior weighs the constants obtained with respect to the spectral and amplitude domains. Hence, p ( H | D ) increases with the constants in the numerator and smaller domains in the denominator. When one investigates Figure 3(a) and Figure 3(b), it is obvious that p ( H | D ) increases, when f 1, P r , L and f 1, P r , U are chosen to include f 1, P o and are separated by few GHz or even some 100s MHz. Similarly, choosing σ S 1 , P r , L and σ S 1 , P r , U to include the global maximum P G M a x at σ S 1 , P r , G M a x (see Figure 5(c) and Figure 5(d)) elevates p ( H | D ) . Furthermore, p ( H | D ) has a global maximum when σ S 1 , P r , L and σ S 1 , P r , U approach σ S 1 , P r , G M a x S from either side. This is numerically confirmed by scanning σ S 1 , P r , L and σ S 1 , P r , U over several orders of magnitude and evaluating the model posterior for the two data sets (see Figure 6(a) and Figure 6(b)).

(D) Joint Posterior and Marginal Posteriors for S1 and f1

Because the model posterior peaks at σ S 1 , P r , G M a x , the marginal posterior for f 1 is simply given by the Gaussian p ( f 1 | σ S 1 , P r = σ S 1 , P r , G M a x , D ) with posterior mean f 1, P o and standard deviation σ f 1 , P o (see Section III C2 and Figure 5(a) and Figure 5(b)). Furthermore, the joint posterior

p ( S 1 , f 1 | σ S 1 , P r , G M a x , D ) = p ( S 1 | f 1 , σ S 1 , P r , G M a x , D ) p ( f 1 | σ S 1 , P r , G M a x , D ) (31)

can be evaluated and is shown for both data sets in Figure 7(a) and Figure 7(b). This joint posterior peaks close to S and f used to simulate the data, and its width increases with the noise level. However, S and f are usually located inside the peak, and, hence, one can trust the results within the posterior uncertainties.

The marginal posterior for S 1 is obtained by the marginalisation

p ( S 1 | σ S 1 , P r , G M a x , D ) = p ( S 1 , f 1 | σ S 1 , P r , G M a x , D ) d f 1 . (32)

The above integration is performed numerically, and the results are shown in Figure 7(c) and Figure 7(d) for both data sets. Another way to achieve the marginalisation is to draw samples for f 1 from p ( f 1 | σ S 1 , P r , G M a x , D ) , and for each of those samples draw samples for S 1 from p ( S 1 | f 1 , σ S 1 , P r , G M a x , D ) . Histograms of these posterior samples are given in Figure 7(c) and Figure 7(d).

Figure 6. Model posterior p ( H | D ) . The stated model uses a uniform prior for the prior standard deviation σ S 1 , P r in the domain from σ S 1 , P r , L to σ S 1 , P r , U . As these limits approach σ S 1 , P r , G M a x 0.499 for σ D = 0.05 and 0.501 for σ D = 1 (dashed-white), the posterior increases. Thereby, σ S 1 , P r , G M a x S 2 σ D 2 / ( 2 N D ) is closely linked with the original amplitude S = 0.5. (a) Low noise case σ D = 0.05 ; (b) High noise case σ D = 1 .

Figure 7. Joint posterior p ( S 1 , f 1 | σ S 1 , P r , G M a x , D ) and marginal posterior p ( S 1 | σ S 1 , P r , G M a x , D ) of frequency f 1 and amplitude S 1 for low noise ( σ D = 0.05 , left) and high noise data set ( σ D = 1 , right). The joint posterior peaks close to the original values S = 0.5 and f = 50 GHz used to simulate the data sets. For the higher noise level, the posterior width increases, and the location of the global maximum separates further from S and f. By numerical integration, the marginal p ( S 1 | σ S 1 , P r , G M a x , D ) is evaluated, and its shape is close to a normal distribution. This is confirmed by a second approach which uses 105 samples from the joint posterior to estimate the marginal posterior mean and standard deviation for S 1 . The histogram of the samples is given as well. (a) p ( S 1 , f 1 | σ S 1 , P r , G M a x , D ) for low noise level; (b) p ( S 1 , f 1 | σ S 1 , P r , G M a x , D ) for high noise level; (c) p ( S 1 | σ S 1 , P r , G M a x , D ) for low noise level. The marginal mean and standard deviation read 0.4998 and 0.0013, respectively; (d) p ( S 1 | σ S 1 , P r , G M a x , D ) for high noise level. One finds that the marginal mean and standard deviation account for 0.505 and 0.026, respectively.

Evaluating the mean and standard deviation from these samples approximates p ( S 1 | σ S 1 , P r , G M a x , D ) by a Gaussian. In doing so for the data set with σ D = 0.05 (1), the marginal posterior mean S 1, P o , M a = 0.4998 (0.505) and standard deviation 0.0013 (0.026) are obtained. Both normal distributions are good approximations for p ( S 1 | σ S 1 , P r , G M a x , D ) (see Figure 7(c) and Figure 7(d)).

4. Conclusions

The presented work investigates a monochromatic problem with an adapted version of Bayes’ theorem to infer an amplitude parameter, a frequency parameter and a hyperparameter which acts on the amplitude prior only. The amplitude at a given frequency is a parameter which has a linear mapping to the data domain. If the measurements have a normal noise contribution (Gaussian likelihood) and the amplitude prior is chosen to be Gaussian, an analytical linear inversion technique is enabled. The amplitude posterior, being conditional on the frequency and prior information, has similiarities with the result of a conventional Fourier transformation analysis approach. However, the estimation of the frequency reveals a well localised domain which is in accordance with the measured data. The corresponding posterior for the frequency is orders of magnitude narrower than the width of the convolution function used by conventional approaches to estimate the spectral resolution. The prior information captured by one hyperparameter is tested as well, and its posterior peaks over a one order of magnitude and is very robust for significantly different noise levels on the data.

The findings for the monochromatic problem are investigated analytically. This reveals, for instance, the influence of the signal-to-noise ratio, the amount of data points and the spatial domains covered by the measurements on the results. The presented approach can be followed to examine more complex problems which involve more than one frequency and one amplitude and other diagnostic imperfections like a variable offset. Since this offset has a linear correspondence to the data domain, it would enter in the conditional amplitude posterior. Any non-linear parameter and additional hyperparameter would be estimated by the joint settings posterior specific for a certain problem. In doing so, a profound understanding of model implications could be established in analytical terms for applications relying on Fourier transformations. This is essential for designing a diagnostic for a given problem with somewhat known signal-to-noise ratio and hardware limitations, or for comparing results of different models in an objective way when a data set is given and, for example, the number of contributing frequencies is unknown.

Acknowledgements

This work has been carried out within the framework of the EURO fusion Consortium and has received funding from the Euratom research and training programme 2014-2018 and 2019-2020 under grant agreement No. 633053. The views and opinions expressed herein do not necessarily reflect those of the European Commission.

Appendix A: Integral Approximation

The spatial grid x i with i [ 1, N ] may be given by the set x i = i Δ x + x 0 of uniformly distributed locations separated by the constant increment Δ x , so that x 1 = L D S and x N = L D S + L . Then, the sum

i = 1 N cos ( 2 π f 1 c x i ) cos ( 2 π f 2 c x i ) 1 2 Δ x L D S L D S + L [ cos ( 2 π f c x ) + cos ( 2 π f + c x ) ] d x = 1 2 Δ x [ c 2 π f sin ( 2 π f c x ) ] L D S L D S + L + 1 2 Δ x [ c 2 π f + sin ( 2 π f + c x ) ] L D S L D S + L = N 2 cos ( π f c L S S ) sinc ( π f c L ) + N 2 cos ( π f + c L S S ) sinc ( π f + c L ) (A1)

is approximated by an integral over the spatial domain using the trigonometric identities (B1) and (B2). The approximation reveals two modulated sinc functions with a spatial and a spectral dependence. The spectral dependence is due to the sum and difference frequencies f = f 1 f 2 and f + = f 1 + f 2 , respectively. Spatially, the modulation depends on the length L S S = L 2 L D S of single-sided domain which is present on side only. The spatial dependence of the sinc function is given by the length L of the total domain covered.

Appendix B: Trigonometric Identities

The trigonometric identities are

cos ( x ) cos ( y ) = 1 2 [ cos ( x y ) + cos ( x + y ) ] , (B1)

sin ( x ) sin ( y ) = 2 cos ( x + y 2 ) sin ( x y 2 ) . (B2)

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

Cite this paper

Schmuck, S. and Svensson, J. (2019) Analytical and Numerical Investigations of Probabilistic Monochromatic Problem. Journal of Applied Mathematics and Physics, 7, 793-808. doi: 10.4236/jamp.2019.74054.

References

[1] Sivia, D. and Skilling, J. (2006) Data Analysis: A Bayesian Tutorial. Oxford University Press, Oxford.
[2] Jaynes, E.T. (1987) Maximum Entropy and Bayesian Spectral Analysis and Estimation Problems. Fundamental Theories of Physics, 21, 1-37.
[3] Bretthorst, G.L. (1988). Bayesian Spectrum Analysis and Parameter Estimation. Springer-Verlag, Berlin, Heidelberg.
https://doi.org/10.1007/978-1-4684-9399-3
[4] Schmuck, S. and Svensson, J. (2017) Fourier Spectroscopy: A Bayesian Way. International Journal of Spectroscopy, 2017, Article ID: 9265084.
https://doi.org/10.1155/2017/9265084
[5] Schmuck, S., Fessey, J., Boom, J.E., Meneses, L., Abreu, P., Belonohy, E. and Lupelli, I. (2016) Electron Cyclotron Emission Spectra in X- and O-Mode Polarisation at JET: Martin-Puplett Interferometer, Absolute Calibration, Revised Uncertainties, Inboard/Outboard Temperature Profile, and Wall Properties. Review of Scientific Instruments, 87, Article ID: 093506.
https://doi.org/10.1063/1.4962809

  
comments powered by Disqus

Copyright © 2019 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.