1. Introduction
The scripture of the Quran has been subjected to various intense mathematically based studies to reveal the protection mechanisms embedded in the composition of the Quran and to provide evidence of its credibility, authenticity and divinity see for instance [1] [2] .
Therefore, the development of the mathematical theory has been highly motivated and driven by the categorical recognition of the author that Allah may have embedded varying mathematical algorithms, equations and regression models for protecting the Quran, as well as to prove its divinity and to emphatically exclude any human influence on the manufacture of the Quran. Because Allah promises that the Quran will always be preserved and protected from any corruption such as addition or deletion or relocation of any of its verses from chapter to another. Therefore, unveiling any of these algorithms would help unlock many of the Quranic secrets, particularly those related to the Quran’s primary parameters like words and verbs as well as how the Quran’s design is related to the fit of linear regression.
Furthermore, this work is set to statistically and numerically validate and authenticate the first drawing of the Quran (Uthmanic manuscript) related statistics such as the total number of words and verbs of the Quran.
Regression analysis describes the relationship between a dependent variable and several independent variables.
Regression analysis describes the relationship between a dependent variable and several independent variables, for the estimation of the parameters model see for instance [3] [4] .
This paper is organized as follows: in Section 2, we give the initiation of linear regression; linear regression model in Quran with numerical results is given in Section 3.
2. Linear Regression Models
2.1. Simple Linear Regression
Regression analysis is a statistical technique for estimating the relationship among variables which have reason and result relation. Main focus of univariate regression is analyses the relationship between a dependent variables
and one independent variable Y and formulates the linear relation equation between dependent and independent variable.
The simple linear regression model is the simplest regression model in which we have only one predictor X.
This model, which is common in practice, is written as
where
-
are the values of the response and predictor variables in the trial, respectively;
- The unknown parameters: a is called the intercept, and b is the slope of the line;
-
is usually assumed to be iid (error) from
specially for inference purposes (see for instance [5] ).
Then estimates of simple linear model’s parameters should be obtained accordingly, using some method like the ordinary least squares method, which relies on minimizing the sum of square of errors
.
For the simple linear regression model the ordinary least squares estimations of a and b are
and
where
and
are the mean of the variable X and the variable Y respectively.
The goodness
of fit is defined as
.
2.2. Matrix Form of Multiple Regression
Regression models with one dependent variable and more than one independent variable are called multi-linear regression (see for instance [6] ).
Multivariate regression analysis model is formulated as in the following:
where
- Y is the dependent variable,
-
is the independent variables,
-
is the parameters,
- ε is the error.
The assumptions of multi-linear regression analysis are normal distribution, linearity, freedom extreme values and having no multiple ties between independent variables [6] .
The linear model can be written as
where
-
is the vector of observations on the dependent variable,
.
-
is the matrix consisting of a column of ones and p column vectors of the observations on the independent variables, the form of X is
-
is the vector of parameters to be estimated,
.
-
is the vector of random errors,
.
The vector
is a vector of unknown constants to be estimated from the data by
.
The normal equations [7] are written as
.
If
has an inverse, then the unique solution of normal equations given by
.
The vector
of estimated means of the dependent variable Y for the values of the independent variables
in the dataset is computed as
.
However, to express
as a linear function of Y. Thus,
.
3. Model and Numerical Results
3.1. Definition of Variables
In this section we conceder the following integer variables:
-
the frequency of plural present verbs with a form (y―un, ي---ون) or there inverse (ly―un, لي---ون);
-
the frequency of plural present verbs with a form (t―un, ت---ون) or there inverse (lt―un, لت---ون);
-
the frequency of the verbs in the above form without (t,ت) or (y,ي ).
We use the software of Quran statistics [8] let
for determined the set of triple
.
Let
is the sum of frequency of all above derivative verbs with same form.
However, the dataset are given in Table 1.
3.2. Simple Linear Regression Model of Verbs
Since
are not a frequency of plural present verbs [9] then, we let
a dependent variable and Y an independent variable for simple linear regression model. However, the simple linear regression model of verbs is:
where a and b are the parameters of the model.
For estimate the parameters a and b, calculate the coefficient of correlation R, plotting a fit and test hypothesis of simple linear regression model we used the following matlab codes:
[r, m, b] = regression (X, Y);
plotregression (X, Y);
fitlm (X, Y);
![]()
Table 1. Dataset of words in Quran.
The results of regression m function is given as the parameters
and
the coefficient of correlation
, see also Figure 1. And the results of test hypotheses by fitlm. m function are:

is frequency of verbs of enter (yadkolun دخلون ، يدخلون dakilun).
Also there exist 17 point in dataset in the line fit their equation is
, when we use the commend find (
) there indexes in dataset are:

For more information’s about the correspondence of this points in the dataset see Table 2 see for instance [10] [11] .3.3. Multi Linear Regression Model of VerbsFor multi linear regression model of verbs:
where
and
are the parameters of the model.For estimate the parameters
and
, and test hypothesis of multi linear regression model we used the following matlab code:
Number of observations: 387, Error degrees of freedom: 384Root Mean Squared Error: 6.18R-squared: 0.836, Adjusted R-Squared 0.835F-statistic vs. constant model: 981, p-value = 1.2e−151Also in the Figure 2, we show the name of Allah in Arabic الله.
4. Conclusions and Future Work
The present dataset in this paper fined in Quran gives a good simple and multi linear regression model between the frequency of verbs with a form (―un, ون---) made by the frequency of plural present verbs (t―un, ت---ون) or (y―un, ي---ون) and there inverses.
The results show that the parameters of the model are ones vector (1,1) and mean of dataset is (6, 7). It corresponds to the verb point enter (yadkolun دخلون ، يدخلون dakilun), also other 17 points exist in the line and in the dataset of 387 verbs and their derivate verbs in Quran. The name of Allah (الله) showed when we use tree variables and plot it in 3D with option “Show Text”.
For future work, the estimation parameter of this dataset will be done by using
norm and sub-gradient method, and comparing this model in other drawing of the Quran.
Acknowledgements
The author thanks Allah for this miracle dataset in Quran. Also we are indebted to the anonymous reviewers and editors of AJCM for many suggestions and stimulating comments to improve the original manuscript.