An Analysis of Two-Dimensional Image Data Using a Grouping Estimator
1. Introduction
The analysis of two-dimensional (2D) images by machine learning methods, one class of methods used in artificial intelligence, is widely performed [1]-[20]. For details, see the reviews and surveys of this subject [21] [22] [23] [24] [25]. Brown [26] mentions the potential of machine learning and its limitations. In keeping with Brown's statement, we believe that it is important to research and identify the limitations of machine learning. In 2D image analyses, it is very important to divide the sample space into two regions (such as the target and background regions). Suppose that $S$ is a bounded subspace of 2D space and $y$ is a binary variable that takes a value of 1 in region $A$ and 0 in region $B$. The boundary is given by the deterministic function $g(z_1, z_2) = 0$, where $(z_1, z_2) \in S$. $A$ is given by $\{(z_1, z_2) : g(z_1, z_2) \ge 0\}$ and $B$ is given by $\{(z_1, z_2) : g(z_1, z_2) < 0\}$, as shown in Figure 1. In this case, we can separate the space with a non-stochastic line, for example by using support vector machines [27].
However, when the model contains stochastic factors such as random observation errors, as in Figure 2, it is necessary to consider stochastic models. Ma et al. [28] pointed out that machine learning methods may not give proper results even if non-stochastic patterns of noise are added to images. In this case, we separate the regions by the probability $p(z_1, z_2) = P(y = 1)$ such that $(z_1, z_2) \in A$ if $p(z_1, z_2) \ge 1/2$ and $(z_1, z_2) \in B$ if $p(z_1, z_2) < 1/2$. (Note that the model can be easily generalized to the $\alpha$-quantile cases.)
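For intuition, the following short sketch (an illustration only, assuming NumPy, a linear boundary $z_2 = z_1$, and normal observation errors with an arbitrary scale of 0.2) generates an image of the kind shown in Figure 2; the boundary is exactly the set of points where $P(y = 1) = 1/2$:

import numpy as np

# Hypothetical illustration: a 200 x 200 image whose two regions are
# separated by the line z2 = z1 and observed with random errors.
rng = np.random.default_rng(0)
k = 200
z1, z2 = np.meshgrid(np.linspace(0, 1, k), np.linspace(0, 1, k))
u = rng.normal(scale=0.2, size=z1.shape)  # stochastic factor (assumed scale)
y = (z2 - z1 + u >= 0).astype(int)        # y = 1 in region A, 0 in region B
# P(y = 1) crosses 1/2 exactly on the line z2 = z1, so classifying a point
# by whether P(y = 1) >= 1/2 recovers the two regions despite the noise.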
Figure 1. The case in which S is divided into two regions by a non-stochastic line.
Figure 2. The case in which the model contains stochastic factors.
The problem is that, unlike in the non-stochastic case, it is not easy to estimate $p(z_1, z_2)$ properly. If the probability function is mis-specified, we cannot get proper results. Nawata [29] proposed an estimator of the boundary by the grouping method based on Nawata [30] [31] (hereafter referred to as the grouping estimator). The grouping estimator is a semiparametric estimator and does not require the distribution of the error terms to be specified. The method has not been widely used because grouping observations is difficult in general. However, analyses of 2D high-resolution images have become very important in many fields. The sizes of 2D images are finite, and the images are overlaid with grid lines (usually forming rectangles). Therefore, grouping is very easy, and each group can have a sufficient number of observations.
A grouping estimator for binary variables in the 2D case is explained in this study, and the results of a Monte Carlo study are presented.
2. Models and Assumptions of a Grouping Estimator
Let $y_i$ be the binary variable that takes 1 if the targeted object occurs and 0 otherwise, let $S$ be a bounded subspace of the 2D space where we obtain observations, and let $x_i$ be an $m$-dimensional vector given by

$x_i = (x_{1i}, x_{2i}, \ldots, x_{mi})'$, $x_{ki} = g_k(z_{1i}, z_{2i})$, $k = 1, 2, \ldots, m$, $(z_{1i}, z_{2i}) \in S$. (1)

$x_{ki}$ is a function of the location $(z_{1i}, z_{2i})$. Let $n$ be the total number of observations. $S$ is divided into two regions such that:
Region A: $p_i = P(y_i = 1) \ge 1/2$ if $(z_{1i}, z_{2i}) \in A$, (2)

and

Region B: $p_i < 1/2$ if $(z_{1i}, z_{2i}) \in B$.

Suppose that the boundary $C$ between the two regions in $S$ is given by $x'\beta_0 = 0$, and

$p_i > 1/2$ if $x_i'\beta_0 > 0$,
$p_i = 1/2$ if $x_i'\beta_0 = 0$, (3)
$p_i < 1/2$ if $x_i'\beta_0 < 0$,

where $\beta_0$ is the $m$-dimensional vector of unknown parameters. This means that

$(z_{1i}, z_{2i}) \in A$ if $x_i'\beta_0 \ge 0$ and $(z_{1i}, z_{2i}) \in B$ if $x_i'\beta_0 < 0$. (4)
Note that we consider only a linear function of $x_i$, but the method can be easily generalized to non-linear cases. From (3) and (4), we get

$y_i = 1(x_i'\beta_0 + u_i \ge 0)$ and $p_i = 1 - F_i(-x_i'\beta_0)$, (5)

where $1(D)$ is the indicator function that takes 1 if $D$ is true and 0 otherwise, and $u_i$ is a random error term such that $F_i(0) = 1/2$, where $F_i$ is the distribution function of $u_i$. One of the biggest problems is that we do not know the distribution function. A linear probability function (and modified types of linear functions) is sometimes used because computation with such a function is easy. However, Amemiya ([32], p. 268) mentions that it "is not a proper distribution function as it is not constrained to lie between 0 and 1." The other widely used alternative distribution is the logistic distribution. Miguel-Hurtado et al. [33] considered the linear and logistic regression methods and concluded that "Our experiments have shown that the machine learning classification typically out-performs linear (logistic) regression for the prediction of these four demographic traits." However, it is to be expected that we will not obtain correct results if the model is mis-specified; that is, there is no strong reason to use linear or logistic regression in the analysis. The grouping estimator is a semiparametric estimator that does not depend on the distribution of the error terms; it is consistent not only in independent and identically distributed (i.i.d.) cases but also in heteroscedastic cases.
The following assumptions are made:
Assumption 1
$S$ is a bounded closed subspace of the 2D space. $S$ is divided into Region A: $p_i \ge 1/2$ if $(z_{1i}, z_{2i}) \in A$ and Region B: $p_i < 1/2$ if $(z_{1i}, z_{2i}) \in B$, where $p_i = P(y_i = 1)$. The boundary $C$ of the two regions is given by $x'\beta_0 = 0$, $x = (x_1, x_2, \ldots, x_m)'$, $x_k = g_k(z_1, z_2)$, $(z_1, z_2) \in S$.
Assumption 2
Each $g_k(z_1, z_2)$ is a continuous and bounded function of $(z_1, z_2)$ in a proper neighborhood of the boundary $C$. There exists $M > 0$ such that $\|x\| \le M$ if $(z_1, z_2) \in S$.
Assumption 3
$u_1, u_2, \ldots, u_n$ are independent random variables but are not necessarily identically distributed. Let $F_i(u)$ be the distribution function of $u_i$. Then $F_i(0) = 1/2$, $F_i(u) > 1/2$ if $u > 0$ and $F_i(u) < 1/2$ if $u < 0$, and $F_i(u)$ is a continuous function of $u$. Moreover, there exist $\delta_1 > 0$ and $\delta_2 > 0$ such that $F_i(u) \ge 1/2 + \delta_2$ in $u \ge \delta_1$ and $F_i(u) \le 1/2 - \delta_2$ if $u \le -\delta_1$.
Assumption 4
The subsets $S_1, S_2, \ldots, S_T$ used for grouping satisfy the following conditions.

1) Let $N(c, \varepsilon) = \{(z_1, z_2) : \|(z_1, z_2) - c\| \le \varepsilon\}$ be the neighborhood of a point $c$, and let $n_t$ be the number of observations in $S_t$. Then there exist $\varepsilon_n \to 0$ and $\delta > 0$ such that each $S_t$ is contained in some $N(c_t, \varepsilon_n)$ and $n_t \ge n^{\delta}$ for any $t$.

2) $\sum_{t=1}^{T} \bar{x}_t \bar{x}_t' / T$, where $\bar{x}_t$ is the mean of $x_i$ in $S_t$, converges to a nonsingular matrix.
3. Grouping Estimator for Binary Cases
Divide $S$ into $T$ non-overlapping subsets $S_1, S_2, \ldots, S_T$ so that the conditions of Assumption 4 are satisfied. Let $n_t$ be the number of observations in $S_t$. Define

$\bar{x}_t = \frac{1}{n_t} \sum_{i \in S_t} x_i$ (6)

and $\hat{y}_t = 1\left( \sum_{i \in S_t} y_i / n_t \ge 1/2 \right)$, where $1(\cdot)$ is the indicator function defined before. $\bar{x}_t$ represents the mean of $x_i$ and $\hat{y}_t$ is the median of $y_i$ in $S_t$. The grouping estimator for the binary model is the probit estimator using $(\hat{y}_t, \bar{x}_t)$, $t = 1, 2, \ldots, T$. The estimator maximizes

$\log L = \sum_{t=1}^{T} \left[ \hat{y}_t \log \Phi(\bar{x}_t'\beta) + (1 - \hat{y}_t) \log \{1 - \Phi(\bar{x}_t'\beta)\} \right], (7)

where $\Phi$ is the distribution function of the standard normal distribution. Let $\hat{\beta}^*$ be the estimator maximizing $\log L$. Since the boundary of the two regions does not change if we multiply $\beta$ by a non-zero constant, we need to normalize $\hat{\beta}^*$. Therefore, $\hat{\beta}^*$ is standardized by its $i$-th element $\hat{\beta}_i^*$, and the grouping estimator is defined by

$\hat{\beta} = \hat{\beta}^* / \hat{\beta}_i^*$. (8)
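As a concrete illustration of (6)-(8), the following minimal Python sketch (assuming NumPy and SciPy; the function name, variable names, and starting values are illustrative, not taken from the original paper) forms the group means and group medians and maximizes the probit log-likelihood (7) numerically:

import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def grouping_estimator(x, y, group_ids, norm_index):
    """Grouping estimator: probit on group means of x and group medians of y."""
    groups = np.unique(group_ids)
    xbar = np.array([x[group_ids == t].mean(axis=0) for t in groups])  # (6)
    # Group median of the binary y_i: 1 if at least half of the y_i are 1.
    yhat = np.array([1 if y[group_ids == t].mean() >= 0.5 else 0 for t in groups])

    def neg_loglik(b):  # negative of (7)
        p = norm.cdf(xbar @ b)
        p = np.clip(p, 1e-10, 1 - 1e-10)  # guard the logarithms
        return -np.sum(yhat * np.log(p) + (1 - yhat) * np.log(1 - p))

    res = minimize(neg_loglik, x0=np.zeros(x.shape[1]), method="BFGS")
    return res.x / res.x[norm_index]  # (8): normalize by the chosen element

Here x would contain, for example, a constant and the coordinate functions of (1), and norm_index selects the element of the unnormalized estimate used for the normalization in (8).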
From Theorem 4.3 of Nawata [29], the estimator is consistent. The idea of the grouping estimator is based on the asymptotic normality of the median, and the proof uses Bernstein's inequality, which gives precise bounds on the tail probabilities of sums of random variables (for details, see Bennett [34]); it is therefore natural to consider the normal distribution and the probit model.
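For reference, Bernstein's inequality states that if $X_1, \ldots, X_n$ are independent random variables with $E(X_i) = 0$ and $|X_i| \le M$, then

$P\left(\left|\sum_{i=1}^{n} X_i\right| \ge t\right) \le 2\exp\left(-\frac{t^2/2}{\sum_{i=1}^{n} E(X_i^2) + Mt/3}\right).$

Applied within a group, for example to the centered indicators $y_i - p_i$, this exponential tail bound makes the group medians concentrate rapidly around the boundary, which is the property the consistency proof relies on.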
4. Monte Carlo Experiments
In the Monte Carlo study, we consider the case where $S$ is the rectangle given by $0 \le z_1 \le 1$ and $0 \le z_2 \le 1$. Both the $z_1$- and $z_2$-axes are divided by 1000 equidistant grid lines. Let $z_{1i}$ be the $i$-th grid line on the $z_1$-axis and $z_{2j}$ be the $j$-th grid line on the $z_2$-axis. The intersection of $z_{1i}$ and $z_{2j}$ is denoted by $(z_{1i}, z_{2j})$, and each trial contains 1 million observations ($n = 1,000,000$).
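The observation grid of this design can be constructed as follows (a sketch; the exact placement of the 1000 equidistant grid lines within each axis is an assumption):

import numpy as np

m = 1000                                   # grid lines per axis
g = (np.arange(m) + 0.5) / m               # assumed equidistant placement in [0, 1]
z1, z2 = np.meshgrid(g, g, indexing="ij")  # 1000 x 1000 intersection points
z1, z2 = z1.ravel(), z2.ravel()            # n = 1,000,000 observations per trial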
We consider the basic but important model given by

$y_{ij} = 1(z_{2j} - \beta_1 - \beta_2 z_{1i} + u_{ij} \ge 0)$. (9)

The boundary $C$ is given by

$z_2 = \beta_1 + \beta_2 z_1$. (10)

The parameter value of $\beta_1$ is 0 for all cases, and the cases in which $\beta_2 = 1$, 2, and 4 are considered. The areas of $A$ and $B$ are the same for $\beta_2 = 1$; the area of $A$ is 1/4 of $S$ for $\beta_2 = 2$, and the area of $A$ is 1/8 of $S$ for $\beta_2 = 4$. First, the cases in which the error terms are i.i.d. random variables are analyzed. For the distributions of $u_{ij}$, the normal (normal distribution cases, Cases 1 - 3) and Cauchy (Cauchy distribution cases, Cases 4 - 6) distributions are considered. Then, two types of non-i.i.d. (heteroscedastic) cases are analyzed, in which $u_{ij} = \sigma_{ij} v_{ij}$, where $v_{ij}$ follows the standard normal distribution and the scale $\sigma_{ij}$ depends on the location of the observation point (heteroscedastic distribution cases I, Cases 7 - 9, and heteroscedastic distribution cases II, Cases 10 - 12).
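Error draws for the i.i.d. cases follow directly from standard generators; because the exact scale pattern of the heteroscedastic cases is described only qualitatively above, the sketch below uses assumed illustrative scales (0.5 and 2.0) that are not taken from the original experiments:

import numpy as np

rng = np.random.default_rng(2)
g = (np.arange(1000) + 0.5) / 1000
z1, z2 = [a.ravel() for a in np.meshgrid(g, g, indexing="ij")]
u_normal = rng.normal(size=z1.size)           # Cases 1 - 3
u_cauchy = rng.standard_cauchy(size=z1.size)  # Cases 4 - 6
v = rng.normal(size=z1.size)                  # standard normal base draw
sigma = np.where(z1 <= 0.5, 0.5, 2.0)         # assumed illustrative scale pattern
u_hetero = sigma * v                          # heteroscedastic cases (sketch only)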
For all cases, $\beta_1$ and $\beta_2$ are estimated by the probit maximum likelihood estimator (MLE) and the grouping estimator. For the grouping estimator, each group contains the 9 intersection points determined by 3 neighboring grid lines of each axis. The number of groups becomes $T = 333 \times 333 = 110,889$. (The points on the single leftover grid line of each axis are not used.) As shown in this example, the grouping estimator essentially reduces the resolution of the images. The number of repetitions is 100.
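Group labels for the 3 x 3 blocks of grid lines can be assigned as follows (a sketch consistent with the description above; points on the leftover grid line of each axis are dropped):

import numpy as np

m, k = 1000, 3                       # grid lines per axis, block size
usable = (m // k) * k                # 999 usable lines; the last line is dropped
i, j = np.meshgrid(np.arange(m), np.arange(m), indexing="ij")
keep = (i < usable) & (j < usable)
group_ids = (i[keep] // k) * (usable // k) + (j[keep] // k)
# 333 * 333 = 110,889 groups, each containing 9 intersection points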
Tables 1-4 show the results of the Monte Carlo experiments. When the error terms are i.i.d. and follow the normal distribution (Table 1), the probit MLE is an efficient estimator, and its biases and standard deviations (SDs) are quite small. The biases of the grouping estimator are also very small; however, its SDs are larger than those of the probit MLE. When the error terms follow a Cauchy distribution (Table 2), the biases of the probit MLE are very small when $\beta_2 = 1$. This is considered to occur because the distribution of the observations is symmetric with respect to the boundary $C$ in this case. In the cases in which $\beta_2 = 2$ and 4, the biases of the probit MLE become larger. In particular, in the $\beta_2 = 4$ case, the biases are quite large: −0.9485 and −1.2541 for $\beta_1$ and $\beta_2$, respectively. On the other hand, the biases of the grouping estimator are very small for the cases in which $\beta_2 = 1$ and 2. For the $\beta_2 = 4$ case, the biases are −0.0421 and −0.2264 for $\beta_1$ and $\beta_2$, respectively, much smaller than those of the probit MLE. Although the SDs of the grouping estimator are larger than those of the probit MLE in many cases, the SDs are much smaller than the biases.
Table 1. Normal distribution cases (Cases 1 - 3).
SD: Standard Deviation.
Table 2. Cauchy distribution cases (Cases 4 - 6).
Table 3. Heteroscedastic cases I (Cases 7 - 9).
Figure 3 shows the boundaries obtained from the true parameter values, the probit MLE, and the grouping estimator for the Cauchy distribution with $\beta_2 = 4$. The boundaries of the probit MLE and the grouping estimator are calculated for Case 6 in Table 2. The result obtained with the grouping estimator is much more accurate than that of the probit MLE.
Figure 3. Boundaries obtained from the true parameter values (True), probit MLE (Probit) and grouping estimator (Grouping) for Cauchy distributions: Case 6 in Table 2. The grouping estimator clearly improves the probit MLE.
Table 4. Heteroscedastic cases II (Cases 10 - 12).
The results for the heteroscedastic error term cases are given in Table 3 (heteroscedastic distribution cases I, Cases 7 - 9) and Table 4 (heteroscedastic distribution cases II, Cases 10 - 12). For heteroscedastic distribution cases I, the grouping estimator clearly reduces the biases of $\beta_2$, but the biases of $\beta_1$ become slightly larger. (Although the SDs of the grouping estimator are larger than those of the probit MLE, the effects of the SDs are much smaller than those of the biases, as noted above.) Figures 4-6 show the boundaries obtained from the true parameter values, the probit MLE, and the grouping estimator. As before, the boundaries of the probit MLE and the grouping estimator are calculated from the results in Table 3. The grouping estimator clearly improves on the probit MLE in heteroscedastic distribution cases I. For heteroscedastic distribution cases II, the grouping estimator reduces the biases of $\beta_2$, but the biases of $\beta_1$ are slightly increased in the cases in which $\beta_2 = 2$ and 4. Figures 7-9 show the boundaries obtained from the true parameter values, the probit MLE, and the grouping estimator. As before, the boundaries of the probit MLE and the grouping estimator are calculated from the results in Table 4. The grouping estimator clearly improves on the probit MLE in Cases 10 and 11 but only slightly improves on it in Case 12.
Figure 4. Boundaries obtained from the true parameter values (True), probit MLE (Probit), and grouping estimator (Grouping) for heteroscedastic distribution I: Case 7 in Table 3. The grouping estimator clearly improves the probit MLE.
Figure 5. Boundaries obtained from the true parameter values (True), probit MLE (Probit), and grouping estimator (Grouping) for heteroscedastic distribution I: Case 8 in Table 3. The grouping estimator clearly improves the probit MLE.
Figure 6. Boundaries obtained from the true parameter values (True), probit MLE (Probit), and grouping estimator (Grouping) for heteroscedastic distribution I: Case 9 in Table 3. The grouping estimator clearly improves the probit MLE.
Figure 7. Boundaries obtained from the true parameter values (True), probit MLE (Probit), and grouping estimator (Grouping) for heteroscedastic distribution II: Case 10 in Table 4. The grouping estimator clearly improves the probit MLE.
Figure 8. Boundaries obtained from the true parameter values (True), probit MLE (Probit), and grouping estimator (Grouping) for heteroscedastic distribution II: Case 11 in Table 4. The grouping estimator clearly improves the probit MLE.
Figure 9. Boundaries obtained from the true parameter values (True), probit MLE (Probit), and grouping estimator (Grouping) for heteroscedastic distribution II: Case 12 in Table 4. The grouping estimator slightly improves the probit MLE.
5. Discussion
Analyses of high- and very-high-resolution images [35]-[44] are becoming more important as such data become more widely available. When the boundary between two regions is not deterministic, a stochastic approach must be used to determine the boundary, and it is necessary to identify the proper functional form of the probability function. Although distributions such as the normal, logistic, and linear probability are frequently used, we cannot obtain consistent results unless the distribution is correctly specified. The Monte Carlo experiments confirm this conclusion: the probit MLE has very large biases in many cases.
The grouping estimator is a semiparametric estimator and does not depend on the probability functions. It is consistent under very general assumptions. The results of the Monte Carlo experiments show that the grouping estimator clearly improves on the conventional probit MLE when the distribution of the error terms is non-normal or heteroscedastic. The grouping estimator essentially reduces the resolution of the images. Previous low-resolution-image analyses [35] [45] [46] [47] [48] unintentionally used the methods of the grouping estimator. In other words, misspecification of the probability distributions might not be a critical problem for low-resolution images, but it might not be proper to apply the methods of low-resolution images to high-resolution images. When we analyze high-resolution images, the conventional methods (used in low-resolution-image analyses) might not produce satisfactory results, and special attention should be paid to the selection of the models. Shao et al. [49] used a pyramid scene parsing pooling module that combines high-resolution and low-resolution images. Xu et al. [50] also suggested a method that involves converting a high-resolution image into low-resolution images by bicubic downsampling and combining them. However, their methods lack a theoretical background; the grouping estimator may provide theoretical justification for them. The results of high-resolution image analyses performed by conventional methods such as the probit MLE should be combined and compared with the results obtained from low-resolution images. Although 2D cases are considered in this paper, the method can easily be applied to 3D cases [51] [52].
6. Conclusions
Analyses of 2D images are increasing in importance as high-resolution images become more commonly available. Dividing 2D images into two regions, A and B, is a basic but very important challenge. When the boundary of the two regions is not deterministic, a stochastic approach must be used to determine the boundary between the regions. In this case, it is necessary to identify a proper probability functional form. Although distributions such as normal, logistic, and linear probability are frequently used, accurate results cannot be obtained unless the distribution is correctly specified, as shown in the Monte Carlo experiments.
The grouping estimator does not depend on probability distributions. It is a consistent estimator not only in i.i.d. cases but also in heteroscedastic cases. The Monte Carlo experiments show that the grouping estimator improves on the probit MLE in many cases when the distribution of the error terms is either non-normal or heteroscedastic. The grouping estimator is based on grouping the data, and it essentially decreases the resolution of the images. In other words, misspecification of the distributions of the error terms might not be critical for low-resolution images, but it is critical for high-resolution images; the grouping estimator gives a theoretical justification for this observation. This implies that we might not obtain proper results by applying the conventional methods used for low-resolution images to the analysis of high-resolution images. If the probability distributions are mis-specified, we may obtain incorrect results in high-resolution image analyses. It is important to combine and compare the high- and low-resolution-image results.
Methods to determine the optimal grouping (for example, the number of observations in each group) are not yet known. Proper methods to combine and compare the high- and low-resolution-image results are also important but have not been developed yet. Research on using the grouping estimator for 3D images is important as well. These are topics for future study.
Acknowledgements
The author would like to thank an anonymous reviewer for his/her helpful comments and suggestions.