On the Regression-Tensor Analysis of the Hardening Process of Metal Coatings
1. Introduction
Researchers devote considerable attention to increasing the strength characteristics achievable by technological processes of hardening metal coatings (see, for example: Munz W.-D., Lewis D.B., Hovsepian P.E. et al. Industrial scale manufacturing of superlattice hard PVD coatings // Surface Engineering, 2001, V. 17, pp. 15-17; Mitterer C., Holler F., Ustel F. et al. Application of hard coatings in aluminum die casting // Surface & Coating Technology, 2000, V. 125, pp. 233-239). Nonlinear integrative physical and chemical (PC) processes lie at the root of the methods of hardening the working surfaces of modern power machines, which makes topical the formalization and development of their mathematical models. In this context, regression models [1] [2] [3] [4] remain in demand, and regression-tensor systems [5] [6] [7] form an important class among them. On the one hand, these systems are close in their predictive properties to polynomial models [2], admitting a detailed analytical description based on tensor calculus [7], the functional analysis of strong Fréchet differentials [8] and the theory of extremum problems. On the other hand, they play an important role in the nonlinear analysis of multifactorial tribological and anticorrosion properties of complex metal coatings, based on mathematical modeling of the physical and mechanical (PM) properties of composite media and developing a nonlinear predictive analysis of integrative characteristics of metal coatings induced by their nanostructure geometry [9] [10].
This article develops the tasks set in the conclusions of [5]. The main goal here is not so much the formal accuracy of inferences as the clarity of concepts in the development of general problems of tribology [11] related to the precision modeling of nanostructures of complex metal coatings. Within the article, the problem of forming the PM functional that evaluates the PC mode of hardening of composite metal coatings is solved. Analytical interpretations of the multi-connected conditions of PC mode optimization, under the imposed nonlinear (and essentially difficult to formalize) constraints, are constructed [12] [13]. The regression-tensor model for tribological/corrosion tests is substantiated by identifying multivariate nonlinear PM regression equations with a minimum tensor norm by the least squares method (LSM).
2. Motivations, Terminology and Problem Formulation
Let R be the field of real numbers, R^n the n-dimensional vector space over R with the Euclidean norm ||x|| := (x_1^2 + … + x_n^2)^(1/2), x = (x_1, …, x_n)^T a column vector with elements x_i ∈ R, and let M_{m,n}(R) be the space of all m×n-matrices with elements from R. Moreover, let us assume that T_k(R^n) is the space of all covariant tensors of k-th valency, i.e. real polylinear forms W: R^n × … × R^n → R, with the norm ||W|| := ||[W]||, where [W] is the "coordinate matrix" of the tensor W with respect to the canonical basis [14] in the space R^n.
Let v = (v_1, …, v_n)^T ∈ R^n be the vector of varying PC predictors [2] for a nonlinear PM regression with a fixed origin at v = 0 (the reference PM mode of hardening), and let y = (y_1, …, y_m)^T ∈ R^m be the vector of indices of PM variables. To describe a multifactorial physical and chemical process, consider a multidimensional functional nonlinear input-output type system described by a vector-tensor k-valent PM regression equation of the following form:
y_i(v) = c_i + Σ_{1≤j≤k} W_{i,j}(v, …, v) + ε_i(v), i = 1, …, m. (1)
Here W_{i,j} ∈ T_j(R^n), ε = (ε_1, …, ε_m)^T is a nonparameterizable vector-function of the class
||ε(v)|| = o(||v||^k) as ||v|| → 0, (2)
and c_i ∈ R is the 0-rank tensor representing the tribological index y_i of the PM quality of the investigated PC process in its reference mode, given by the vector v = 0.
Note 1. The precision of nonlinear simulation of the PC process in the class of regression-tensor systems (1) (and the adaptation of their parameters) is correct because of the continuous dependence ([8], p. 495) of solutions of the differential diffusion equation [15] on its initial boundary conditions. The tensor structure of Equation (1) arises in accordance with Theorem 3 ([16], p. 255) and the polylinear nature ([8], p. 490) of the higher-order Fréchet derivatives when computing the strong differentials at the point v = 0 of the vector function v ↦ y(v). This ultimately summarizes Assertion 2 from [5] (see Problem (I) below). In this case, the accuracy of the nonlinear PM modeling is represented by the function estimate (2) as a remainder term in the Peano form related to the k-valence index of Equation (1).
The problem of multidimensional nonlinear regression-tensor modeling of a multifactor physical and chemical process of hardening of metal coatings, optimal with respect to some target "tribological criterion", was posed and investigated in detail in [5] for the 2-valent model (1). There, analytical solutions were obtained for three methodological positions of this problem of optimal mathematical modeling:
(I) for a fixed vector-predictor v = 0 and its open neighborhood in R^n, analytical conditions are defined under which the vector function v ↦ y(v) of the PM property indices satisfies the multivariate regression-tensor system (1);
(II) a direct algorithm is constructed for identifying the tensor coordinates of the 2-valent regression-tensor model (1) based on a numerical solution of a two-criteria LSM problem of optimal a posteriori PM modeling written as:
(3)
where v_g, g = 1, …, q, are the vectors of experimental factor-predictors of the PC process, i.e. y(v_g) is the a posteriori response to the target variation v_g relative to the coordinates of the reference vector v = 0, under a norm condition on v_g methodologically dictated by condition (2), and q is the number of tribological experiments conducted (determined by the representativeness of model (1)) with the dynamics of PC processes [15];
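Problem (II) reduces, per response coordinate, to an ordinary least-squares fit over monomial features of the predictors. The sketch below illustrates this for a single PM index of a 2-valent model of the assumed form y(v) ≈ c + b·v + v^T W v with an upper triangular coordinate matrix W; the function name and model layout are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

def identify_2valent(V, y):
    """LSM identification sketch for one coordinate of a 2-valent model.

    V : (q, n) array of experimental factor-predictors v_g,
    y : (q,) array of observed a posteriori responses y(v_g).
    Returns the constant c, the 1-valent tensor b and the upper
    triangular coordinate matrix W of the 2-valent tensor.
    """
    q, n = V.shape
    # design matrix columns: 1, v_i, and v_i*v_j for i <= j
    pairs = [(i, j) for i in range(n) for j in range(i, n)]
    cols = [np.ones(q)]
    cols += [V[:, i] for i in range(n)]
    cols += [V[:, i] * V[:, j] for i, j in pairs]
    X = np.column_stack(cols)
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    c = theta[0]
    b = theta[1:1 + n]
    W = np.zeros((n, n))
    for (i, j), w in zip(pairs, theta[1 + n:]):
        W[i, j] = w  # upper triangular structure, as assumed in the text
    return c, b, W
```

With q experiments exceeding the number of tensor coordinates and predictors in "general position", the fit recovers the model coordinates exactly in the noiseless case.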
(III) for the 2-valent regression-tensor model (1) with the given predictor v = 0 and the nominal condition on the admissible variation norm, the analytical solution of the optimization problem was obtained as a nonlinear "v-optimization" of the varied (relative to the vector v = 0) factor-predictors of the prognostic PM characteristics of the designed composite metal coatings:
(4)
where the vector function v ↦ y(v) has a coordinate representation according to the LSM-identified model (1)-(3), and r_1, …, r_m > 0 are weight factors reflecting the priority of the PM indices; Problem (III) can also be investigated for some r_i < 0, which corresponds to the position when the corresponding index y_i should be minimized among the PM indices.
The significance of nonlinear multifactor regression-tensor analysis lies not only in the exact theorems already obtained by this method [4] [5], but also in simple and clear heuristic rules (e.g. the norm condition on the experimental predictors, or the equality in Corollary 2) involved in the construction of optimal multivariate posterior modeling. Over time, these rules may be brought to the level of strict theorems of regression analysis (like [2] [17] [18]), but even now their usefulness is beyond doubt [6].
Problem statement (according to the analytical conclusions of [5]):
(i) to determine necessary and sufficient conditions of solvability of the optimization problem (4) for a 3-valent (k = 3) functional regression-tensor system (1);
(ii) to construct an algorithm for correcting the sufficient conditions of extremum of the stationary point of Problem (i), based on the r-parametric adjustment of the PM functional
(5)
3. Optimization of Physical and Mechanical Indices of the Hardening Process of Metal Coatings
Consider Problem (i) on the optimization of the PM characteristics of metal coatings at k = 3; note that the solution of the accompanying Problem (II) of parametric identification for k = 3 is a straightforward modification of Assertion 3 of [5] (see also [17]).
In this mathematical formulation, the nonlinear multivariate prognostic equation (1) can be given in the following vector-matrix-tensor form:
(6)
Without loss of generality, we assume that each matrix of the system (6) has an upper triangular structure; this substantially simplifies the numerical implementation of the ANC-algorithm (3). Additionally, we note that the vector function ε satisfies (according to (2)) the qualitative estimate ||ε(v)|| = o(||v||^3).
According to (1), at k = 3 the PM functional of the total tribological indices (5) is twice continuously differentiable, which guarantees the equality of the mixed derivatives
(7)
Therefore, in the solution of optimization Problem (4) for the 3-valent model (6), the main result, according to Theorem 3 ([8], p. 505) and Theorem 7.2.5 [14], can be formulated as Assertion 1 below. But first let us agree on the condition
(8)
where each matrix in (8) is a matrix of the system (6) (the matrix of the corresponding tensor in the statement where it is not assumed symmetric in the system (1)). Moreover, let us consider the vector function
(9)
defined through the gradient of the functional (5).
Assertion 1. The stationary points of Problem (i) are precisely the solutions of the equation
(10)
A sufficient condition of optimality is that the stationary point of the functional (5) must be of elliptic type. In other words, at this point the Hessian of the functional (5) must satisfy the inequalities
(11)
where the determinants in (11) are taken over the leading principal submatrices of the Hessian; equivalently, the characteristic numbers of this Hessian satisfy the conditions
(12)
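The equivalence of the principal-minor conditions (11) and the spectral conditions (12) for a symmetric Hessian is Sylvester's criterion, and it is easy to check numerically. The sketch below assumes the positive-definite ("elliptic minimum") sign convention; the function names are illustrative.

```python
import numpy as np

def minors_test(H):
    """Condition (11): all leading principal minors of H are positive."""
    n = H.shape[0]
    return all(np.linalg.det(H[:k, :k]) > 0 for k in range(1, n + 1))

def spectrum_test(H):
    """Condition (12): all characteristic numbers of symmetric H are positive."""
    return bool(np.all(np.linalg.eigvalsh(H) > 0))
```

For any symmetric matrix the two tests agree: a definite Hessian marks an elliptic point, while a sign change in either test marks a saddle.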
Corollary 1. In the case k = 2, the Hessian of the functional (5) and the conditions (11), (12) are invariant with respect to the position of the stationary point, and the Hessian equals
(13)
which leads to a linear dependence of its characteristic numbers on the normalization of the vector r.
If this Hessian is nonsingular, the solution of Equation (10) is unique and has the form
(14)
which makes the position of the stationary point invariant to the normalization of the vector r.
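Both invariance claims of Corollary 1 can be illustrated on an assumed concrete 2-valent weighted functional of the form Φ(v) = Σ_i r_i (c_i + b_i·v + v^T W_i v), consistent with (13)-(14): the Hessian Σ_i r_i (W_i + W_i^T) does not depend on v, its spectrum scales linearly with r, and the unique stationary point comes from a single linear solve. The names below are illustrative.

```python
import numpy as np

def stationary_point(r, b_list, W_list):
    """Stationary point and (constant) Hessian of the assumed 2-valent
    weighted functional sum_i r_i * (c_i + b_i.v + v^T W_i v)."""
    H = sum(ri * (Wi + Wi.T) for ri, Wi in zip(r, W_list))   # Hessian (13)
    g = sum(ri * bi for ri, bi in zip(r, b_list))            # gradient at v = 0
    v_star = np.linalg.solve(H, -g)                          # solution (14)
    return v_star, H
```

Rescaling r scales H and the gradient by the same factor, so the position of the stationary point is invariant while the characteristic numbers scale linearly, exactly as stated in Corollary 1.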
In terms of the vector function (9), Equation (10) is geometrically defined by the intersection of m quadrics ([16], p. 219). Local analysis can be performed on the basis of the fixed point principle ([8], p. 75). If inequalities (11) (equivalent to (12)) are not fulfilled, i.e. at least one of them changes sign to the opposite, the stationary point is hyperbolic (a saddle point). On the other hand, changing the strict inequality < to the non-strict ≤ induces a parabolic structure of the stationary point. Thus, in the case of a saddle/parabolic point, a purposeful parametric correction of the functional (5) is required to ensure its elliptic nature (12). Clearly, such a correction can shift the position of the stationary point, i.e. a refining recalculation of this point is required after the correction (by virtue of Corollary 1, at k = 2 such a recalculation, in turn, no longer entails a change of the spectrum (12) of the Hessian).
One of the factors affecting the geometry of the stationary point of Assertion 1 is the digital adaptive parametric adjustment of the vector r, which leads to the elliptic conditions (11) or (12). This is the subject of the next section.
4. Parametric Correction of the PM Functional Using the r-Parameter Family of Its Hessians
Consider statement (ii): for a stationary point of the optimization Problem (i), construct a numerical procedure for correcting the weight factors r_1, …, r_m based on the fulfillment of the spectral conditions (12), i.e. providing the elliptic nature of the stationary point of Assertion 1. This formulation is relevant for the optimization of the parameters of the PM process when some target PM indices have to be minimized (i.e. some r_i < 0).
Note 2. Despite the algebraic equivalence of conditions (11) and (12), the use of the expansion of the determinants (11) in the construction of an adaptive correction of r is almost inevitably doomed to failure (even by means of computer algebra) because of the large number of terms expressed through the multivariate regression coefficients.
The solvability conditions for a problem similar to (ii) can be obtained only in exceptional cases. Therefore, below we discuss an approach to this problem based on the ideas of the theory of localization and perturbation of eigenvalues [14]. Another productive mathematical tool appears to be the transformation of conditions (12) into a "quadratic" stability problem by constructing a Lyapunov function ([19], p. 134) (see the Conclusions below) in the affine family of Hessians of the optimization Problem (i), on the grounds that this family clearly depends on variations of the coordinates of the vector r due to the structure of the functional (5).
Let some initial vector of weight factors from statement (ii) be given. For example, a heuristic choice of this vector can be made by equating its coordinates to the values of some functions (with a clear physical context) that depend on the values of the functionals from auxiliary problems of optimal prediction of PM quality by individual target tribological indices. In particular, for the 2-valent regression model (1), this position, according to Corollary 2 of [5], is characterized by the following simple proposition.
Assertion 2. If the maximal valency of the tensors is k = 2, then the vector of initial weight factors has an analytic representation in terms of the canonical basis e_1, …, e_m in R^m.
Let us denote by v* some stationary point of the functional (5) in the case when the r-priority of the probing points corresponds to the initial vector of weight factors, and denote by H* the Hessian of the given functional calculated for this pair, together with its spectrum.
Then, for an admissible linear variation Δr of the coordinates of the vector r, given (due to the comments to formula (4)) by its region of variation, the Δr-parametric family of linear variations of the Hessian H* is defined by a matrix multiverse written as:
(15)
By virtue of (7), the matrices of the family (15) are symmetric.
For the matrices of the manifold (15), the eigenvalues can be characterized through a series of optimization problems by means of the Courant-Fischer theorem [14]. On the other hand, within the circle of analytic applications of this theorem lies the reasoning of the Weyl theorem [11] on the relations between the characteristic numbers of the Hessian and those of any matrix from the manifold (15), which allows clarifying more transparently the geometric meaning of the constructions of the linear correction of the target functional (5) carried out below.
Taking into account the introduced constructions, the adaptive adjustment of the tribological quality functional (5) of the PC process, which ensures that inequality (12) is fulfilled when varying the vector r at the stationary point, is contained in Assertion 3 below. In essence, this assertion is a straightforward modification (in the version of the strong derivative) of Theorem 6.3.12 [14], based on Theorem 2 ([8], p. 491) and Theorem 4.1.3 [14], which takes into account the structure of the manifold (15) as a family of symmetric matrices.
Assertion 3. Let the set of eigenpairs of the Hessian of the functional (5) at its stationary point be given, and let, for a given realization of the manifold (15), the numbers of the corresponding linear variation be set. Then the eigenvalues of the varied Hessian have the form
(16)
System (16) gives an estimate of the sensitivity of the Hessian spectrum to linear variations Δr of the weight factors. For nonlinear variations one can refer to the recurrence formulas from item (b) ([16], p. 154), which can be computed symbolically using computer algebra. Of course, this analysis is approximate (valid only for small Δr). It is especially efficient for the 2-valent model when m = n (this equality is not difficult to implement due to the relative variability of the number of PM indices).
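The first-order spectral sensitivity behind Assertion 3 is classical: for a symmetric Hessian H with eigenpairs (λ_i, u_i) and a small symmetric variation ΔH from the family (15), the perturbed eigenvalues are approximately λ_i + u_i^T ΔH u_i. A minimal sketch, with illustrative names:

```python
import numpy as np

def first_order_spectrum(H, dH):
    """First-order estimate of the eigenvalues of H + dH for symmetric
    H and symmetric small perturbation dH: lam_i + u_i^T dH u_i."""
    lam, U = np.linalg.eigh(H)
    return lam + np.array([U[:, i] @ dH @ U[:, i] for i in range(len(lam))])
```

The estimate is accurate to second order in ||dH|| provided the unperturbed eigenvalues are simple, which matches the "small parameter" reservation of Note 3 below.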
Corollary 2. Let λ be the vector of characteristic numbers of the Hessian of the functional (5) at its stationary point, with the corresponding eigenvectors fixed. Moreover, let μ be a vector of characteristic numbers that are "benchmark/reference" by criterion (12), and let B be the matrix whose elements are determined, according to (16), by the realization of the manifold (15).
Then, for a nonsingular B, with the variation vector of the weight factors represented as Δr = B^{-1}(μ − λ), the eigenvalues of the varied Hessian will be close (to first order in Δr) to the benchmark vector μ.
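Corollary 2 can be illustrated numerically under the assumption that the family (15) is affine in the weight increments, H(Δr) = H0 + Σ_j Δr_j H_j with symmetric H_j. First-order theory then gives B[i, j] = u_i^T H_j u_i, and the correction Δr = B^{-1}(μ − λ) steers the spectrum toward the benchmark μ. All names and matrices below are illustrative.

```python
import numpy as np

def weight_correction(H0, H_list, mu):
    """Solve B * dr = mu - lam for the weight-factor variation dr,
    where B[i, j] = u_i^T H_j u_i as suggested by (16)."""
    lam, U = np.linalg.eigh(H0)
    B = np.array([[U[:, i] @ Hj @ U[:, i] for Hj in H_list]
                  for i in range(len(lam))])
    return np.linalg.solve(B, mu - lam)
```

In the diagonal example below the family is exactly linear, so the corrected Hessian attains the benchmark spectrum exactly and the saddle point becomes elliptic.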
Note 3. Since Corollary 2 is valid only for small variations of the weight factors, the question remains whether the iterative computational process will converge to the benchmark spectrum constructed from this correction if the initial divergence between the actual and benchmark characteristic numbers is significant enough. Moreover, according to the structure of the target functional (5), at each iteration step j it is necessary (within the physical statement of Problem (4)) to check the coordinate conditions on the current vector of weight factors.
Note 4. For adaptive systems, the evaluation of input signals (in our case, the experimental data in (3)) is essential (which is why adaptive techniques with learning are used). In this context, it is important to obtain sufficient conditions for the adaptive system to have robust bounded solutions [20], the very fact of existence of solutions satisfying these properties being more important (see (2)) than their specific form. Thus, a fixed parameter setting providing qualitative (see (12)) control of the predictive system (1), which is not very sensitive to the exact values of the parameters, can yield a number of admissible values of the weight factors, allowing us to determine the optimal values guaranteeing the target quality (4).
In the context of Note 3, let us give the result of calculating an upper bound for the perturbation of the weight-factor correction. To this end, assume that ||·|| is a matrix norm consistent with the Euclidean vector norm, and that I denotes the identity matrix. For example, the Frobenius norm or the spectral (induced) matrix norm can serve as such.
Returning to Corollary 2, suppose now that the vector of characteristic numbers turns into a perturbed vector (in particular, due to the higher-order members of system (16)), and that the matrix B turns into a perturbed matrix. Then the variation vector of Corollary 2 receives some increment, passing to a perturbed value which satisfies the correspondingly perturbed equation.
Obviously, this setting models both the perturbations of the vector of characteristic numbers and the inaccuracy of the parametric estimation of the matrix B (if the perturbations vanish, then so does the increment; see ([21], p. 197)). The resulting upper bound for the perturbation is formulated in Corollary 3. For technical details of the accompanying calculations using the construction of the matrix condition number, see the popular (among graduate students) monograph ([21], p. 197).
Corollary 3. Let cond(B) = ||B||·||B^{-1}||, the condition number of the matrix B, where ||·|| is the Frobenius norm or the spectral norm, be added to the assumptions of Corollary 2. Then the standard estimate of the relative perturbation of the correction vector through cond(B) and the relative perturbations of the data is valid.
If λ_min and λ_max are, respectively, the smallest and the largest eigenvalues of the matrix B^T B, then in the last inequality we can take cond(B) = (λ_max/λ_min)^{1/2}.
Note 5. The construction of the condition number obtained using the spectral norm is transparent due to the equality ||B|| = λ_max^{1/2}.
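The machinery of Corollary 3 is the classical perturbation bound for a linear system: if B·Δr = ρ and cond(B)·||δB||/||B|| < 1 in the spectral norm, then ||δ(Δr)||/||Δr|| ≤ cond(B)·(||δρ||/||ρ|| + ||δB||/||B||) / (1 − cond(B)·||δB||/||B||). A sketch of this bound on synthetic data (all matrices illustrative):

```python
import numpy as np

def perturbation_bound(B, rhs, dB, d_rhs):
    """Classical upper bound on the relative perturbation of the solution
    of B * x = rhs under perturbations dB, d_rhs (spectral norm)."""
    nrm = lambda M: np.linalg.norm(M, 2)
    cond = nrm(B) * nrm(np.linalg.inv(B))
    rel = nrm(d_rhs) / nrm(rhs) + nrm(dB) / nrm(B)
    return cond * rel / (1.0 - cond * nrm(dB) / nrm(B))
```

Note 5 is visible here as well: the spectral condition number equals the square root of the ratio of the extreme eigenvalues of B^T B.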
Alternative approaches [22] [23] [24], including deep insight (via computer algebra [6]) into the physical content of the subject of nonlinear PC modeling, can be used to take into account interferences other than those covered by Corollary 3.
5. Conclusions
The aim of the article was, developing the results of [5], to point out the connection between the problem of determining the Hessian of the target functional (5) at its stationary point and the vector r of weight factors in (5), which reflects the priority among the modeled predictions of the target tribological PM indicators. In this context, Assertion 1 and its Corollary 1 show that, unlike the 3-valent regression-tensor model, in the 2-valent one the Hessian is invariant to the position of the stationary point. In both cases, the r-dependence of the Hessian spectrum can be identified on the basis of the nonlinear multivariate regression PM model for the PC mode of hardening of composite metal coatings identified within the framework of the LSM Problem (II).
Assertion 3 essentially asks: what can we say about the eigenvalues of a matrix of the family (15) if each variation of the weight factors is a small parameter? Thus, we were only interested in the purely formal aspect of the mathematical modeling problem under study, and did not consider the question of how small the increment must actually be for the term "small parameter" to be relevant. The result of Assertion 3 rests on the fact that the eigenvalues (12) depend smoothly on r through the Hessian elements during the current parametric r-correction of the target functional (5). However, some information is lost when we deal only with the characteristic polynomial, because there are many different matrices with a given characteristic polynomial. It is therefore not surprising that the stronger results on modeling the Hessian spectrum, in particular Assertion 3 and Corollary 2, take into account the structure of the Hessian itself. They admit technical simplifications by means of specialized computer algebra, proceeding from the geometric fact that any Hessian matrix is orthogonally similar to a real diagonal matrix.
Numerical methods for finding eigenvalues and eigenvectors constitute one of the most important parts of matrix theory. The analysis of the spectrum vector and of the matrix B from Corollary 2 above has not touched on any aspect of this topic, but Corollary 3 gives an upper estimate for the perturbation of the correction vector via the relative perturbations of the spectra and of B and the condition number cond(B). The condition number is involved in the estimate in all cases, whether the perturbations occur only in the spectra, only in B, or in both at the same time.
Finally, we mention another, essentially cybernetic, approach to the adaptive correction of the weight factors, related to the use of sufficient robust stability conditions for the 2-valent model, which also leads to conditions (12). In this context, it is required that, with interval tolerances on the coordinates of the vector r, one can construct a Lyapunov function defined by a symmetric positive definite matrix P for which the Lyapunov equation has a solution given a symmetric positive definite matrix Q. The transition to adaptive robust quadratic stability [19] and methods of its solution are also proposed in [20] [23]. This theory, owing to the abundance of its computational problems and the opportunities it opens for applications of nonlinear multivariate regression-tensor analysis, may acquire great importance in problems of precision multifactor nonlinear optimization of PC processes of hardening of complex composite metal coatings and alloys [25].
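The quadratic-stability certificate mentioned above can be sketched as follows: for a Hurwitz matrix A (all eigenvalues in the open left half-plane) the Lyapunov equation A^T P + P A = −Q with symmetric positive definite Q has a unique symmetric positive definite solution P, giving the Lyapunov function V(x) = x^T P x. The matrix A below is a synthetic stand-in for the model matrix, and the solver uses the Kronecker (vectorized) form of the equation.

```python
import numpy as np

def solve_lyapunov(A, Q):
    """Solve A^T P + P A = -Q via the vectorized (Kronecker) form.
    Uses row-major vec: vec(A^T P) = kron(A^T, I) vec(P),
                        vec(P A)   = kron(I, A^T) vec(P)."""
    n = A.shape[0]
    K = np.kron(A.T, np.eye(n)) + np.kron(np.eye(n), A.T)
    P = np.linalg.solve(K, -Q.flatten()).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off
```

Positive definiteness of the computed P for a positive definite Q is exactly the quadratic-stability certificate sought in the interval-tolerance setting.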
Acknowledgements
The research was carried out with funding from the Ministry of Education and Science of the Russian Federation (project: 121041300056-7).