The First-Order Comprehensive Sensitivity Analysis Methodology (1st-CASAM) for Scalar-Valued Responses: I. Theory
1. Introduction
Many works have been published on using adjoint operators for computing first- and second-order sensitivities (i.e., functional derivatives) of model responses (i.e., results produced by models) to imprecisely known model parameters since the original work of Wigner [1] on the linear neutron transport equation and the introduction of the first-order adjoint sensitivity analysis methodology for nonlinear systems by Cacuci [2] [3]. Representative works in this regard are cited in the books by Cacuci [4] [5], along with the original presentations of the first- and second-order adjoint sensitivity analysis methodologies. It is well known that the adjoint method of sensitivity analysis [2] [3] [4] [5] enables the most efficient computation of the exact (to machine precision or to an a priori set precision) response sensitivities to model parameters. The efficiency of the second-order adjoint sensitivity analysis methodology developed by Cacuci [5] has recently been demonstrated by its application to an OECD/NEA reactor physics benchmark [6] to compute exactly [7] - [12] the 21,976 first-order sensitivities and the 482,944,576 second-order sensitivities of this benchmark’s response with respect to the benchmark’s model parameters, showing in particular that the effects of the 2nd-order sensitivities on the uncertainty in the model’s response are even more important than the effects of the 1st-order ones. Another step towards overcoming the curse of dimensionality in sensitivity analysis, uncertainty quantification and predictive modeling is the third-order adjoint sensitivity analysis methodology for linear systems developed recently by Cacuci [13].
However, none of the works cited above are capable of computing response sensitivities to imprecisely known domain internal and/or external boundaries. Very few works have attempted to develop mathematical/computational methodologies for computing exactly the first-order sensitivities of responses to imprecisely known boundaries. The representative works (Komata [14], Larsen and Pomraning [15], Rahnema and Pomraning [16], McKinley and Rahnema [17], Favorite and Gonzalez [18]) that have addressed this issue were limited to specific linear neutron transport or diffusion problems. Furthermore, none of the works published thus far have addressed, in a general theoretical/mathematical setting, the simultaneous computation of response sensitivities to imprecisely known model parameters, imprecisely known internal boundaries/interfaces between nonlinear systems that model coupled yet distinct physical processes, and/or imprecisely known external boundaries.
This work presents the mathematical foundations of a new method for computing efficiently, exactly and exhaustively, the first-order response sensitivities for coupled nonlinear physical systems characterized by imprecisely known parameters that describe not only processes within the system but also at the physical interfaces between systems, as well as at the systems’ imprecisely known domain boundaries. This new method will be called the first-order comprehensive adjoint sensitivity analysis methodology (1st-CASAM). Notably, the 1st-CASAM enables the quantification of the effects of manufacturing tolerances on the responses of physical and engineering systems.
This work is structured as follows: Section 2 presents the mathematical framework of two coupled generic nonlinear physical systems comprising imprecisely known parameters, internal interfaces, and external boundaries. Section 3 presents the mathematical framework of the 1st-CASAM, which enables the efficient computation of the exact sensitivities of a scalar-valued response with respect to the imprecisely known parameters, interfaces, and boundaries that characterize the generic coupled nonlinear physical systems. As is well known [19], the availability of response sensitivities to imprecisely known parameters, interfaces and boundaries is essential for a variety of subsequent uses, including uncertainty quantification, optimization, data assimilation, model calibration and validation, and reduction of uncertainties in predicted model results. Section 4 offers concluding remarks.
The sequel to this work [20] presents an illustrative application of the 1st-CASAM to a benchmark problem [21] [22] [23] that models coupled heat conduction and convection in a physical system comprising an electrically heated rod surrounded by a coolant, which simulates the geometry of an advanced (“Generation-IV”) nuclear reactor [24]. This benchmark problem [21] [22] [23] admits exact closed-form solutions for the sensitivities of the temperature distribution in the coupled rod/coolant system, which can be used to benchmark thermal-hydraulics production codes. In particular, this benchmark [21] [22] [23] was used to verify the numerical results produced by the “FLUENT Adjoint Solver” [25], showing that the current “FLUENT Adjoint Solver” cannot compute any sensitivities for the temperature distribution within the solid rod. Although the “FLUENT Adjoint Solver” is capable of computing sensitivities of fluid temperatures to boundary parameters (e.g., boundary temperature, boundary velocity, boundary pressure), it yields accurate results only for the sensitivities of the fluid outlet temperature and the maximum rod surface temperature to the inlet temperature and inlet velocity, respectively.
2. Mathematical Modeling of Generic Coupled Nonlinear Physical Systems Comprising Imprecisely Known Parameters, Interfaces and Boundaries
The physical system considered in this work comprises two nonlinear physical systems which are coupled to one another across a common internal interface (boundary) in phase-space. Each system comprises imprecisely known model parameters, including imprecisely known parameters that characterize the interface between the systems and the systems’ outer boundaries. The first physical system is represented mathematically as follows:
(1)
Bold letters will be used in this work to denote matrices and vectors. Unless explicitly stated otherwise, the vectors in this work are considered to be column vectors. The second system is represented mathematically as follows:
(2)
If differential operators appear in Equations (1) and (2), a corresponding set of boundary and/or initial/final conditions must also be given; these conditions can be represented in operator form as follows:
(3)
The quantities appearing in Equations (1)-(3) are defined as follows:
1)
denotes a
-dimensional column vector whose scalar-valued components are all of the imprecisely known internal and boundary parameters of (both of) the physical systems, including imprecisely known parameters that characterize the interface and boundary conditions. Some of these parameters will be common to both physical systems, particularly those that characterize common interfaces. These scalar parameters are considered to be imperfectly known, subject to uncertainties. The minimum information needed for these parameters is their nominal or average values, which will be denoted as
. The superscript “zero” will be used in this work to denote known nominal or average values of various quantities. The symbol “
” will be used to denote “is defined as” or “is by definition equal to” and transposition will be indicated by a dagger (
) superscript.
2)
denotes the
-dimensional phase-space position vector of independent variables for the system defined in Equation (1). The vector of independent variables
is defined on a phase-space domain denoted as
which is defined as
. The lower-valued imprecisely known boundary-point of the independent variable
is denoted as
, while the upper-valued imprecisely known boundary-point of the independent variable
is denoted as
. For physical systems modeled by diffusion theory, for example, the “vacuum boundary condition” requires that the particle flux vanish at the “extrapolated boundary” of the spatial domain facing the vacuum; the “extrapolated boundary” depends on the imprecisely known geometrical dimensions of the system’s domain in space and also on the system’s microscopic transport cross sections and atomic number densities. The boundary
of the domain
comprises all of the endpoints
and
of the intervals on which the respective components of
are defined. It may happen that some components
and/or
are infinite, in which case they would not depend on any imprecisely known parameters.
3)
denotes the
-dimensional phase-space position vector of independent variables for the physical system defined in Equation (2). The vector of independent variables
is defined on a phase-space domain denoted as
which is defined as follows:
. The lower-valued imprecisely known boundary-point of the independent variable
is denoted as
, while the upper-valued imprecisely known boundary-point of the independent variable
is denoted as
. Some or all of the points
may coincide with the points
. Also, some components of
may coincide with some components of
.
4)
denotes a
-dimensional column vector whose components represent the system’s dependent variables (also called “state functions”). The vector-valued function
is considered the unique nontrivial solution of the physical problem described by Equations (1) and (3).
5)
denotes a
-dimensional column vector whose components represent the system’s dependent variables (also called “state functions”). The vector-valued function
is considered the unique nontrivial solution of the physical problem described by Equations (2) and (3).
6)
denotes a column vector of dimensions
whose components are operators (including differential, difference, integral, distributions, and/or infinite matrices) acting nonlinearly on
and
.
7)
denotes a column vector of dimensions
whose components are operators (including differential, difference, integral, distributions, and/or infinite matrices) acting nonlinearly on
and
.
8)
denotes a
-dimensional column vector whose elements represent inhomogeneous source terms that depend either linearly or nonlinearly on
. The components of
may involve operators, rather than just finite-dimensional functions, and distributions acting on
and
.
9)
denotes a
-dimensional column vector whose elements represent inhomogeneous source terms that depend either linearly or nonlinearly on
. The components of
may involve operators, rather than just finite-dimensional functions, and distributions acting on
and
.
10) The vector-valued operator
comprises all of the boundary, interface, and initial/final conditions for the coupled physical systems. If the boundary, interface and/or initial/final conditions are inhomogeneous, which is most often the case, then
.
11) Since
and may involve operators and distributions acting on
and
, all of the equalities in this work, including Equations (1)-(3), are considered to hold in the weak (“distributional”) sense: the right-sides (“sources”) of these equations and of various other equations to be derived in this work may contain distributions (“generalized functions/functionals”), particularly Dirac-distributions and derivatives and/or integrals thereof.
The nominal solutions of Equations (1)-(3) will be denoted as
and
; they are obtained by solving these equations at the nominal parameter values
. In other words, the vectors
and
satisfy the following equations:
(4)
(5)
(6)
Equations (4)-(6) represent the “base-case” or nominal state of the physical system. Throughout this work, the superscript “0” will be used to denote “nominal” or “expected” values.
The response considered in this work is a generic scalar-valued operator (i.e., a functional) of the state functions, denoted as follows:
. (7)
The nominal value of the response, denoted as
, is determined by computing the response at the nominal values
,
and
.
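To make the notions of "nominal state" and "nominal response" concrete, the following minimal sketch replaces the generic operator equations by two coupled nonlinear algebraic equations. Everything in it is an assumption made for illustration and is not taken from this work: the residual functions, the five parameters `a1`..`a5`, the nominal values in `alpha0`, and the response `u*v + a3*u**2`. The nominal state is obtained by Newton iteration at the nominal parameter values, in the spirit of Equations (4)-(6), and the nominal response is then evaluated in the spirit of Equation (7).

```python
import numpy as np


def residuals(state, alpha):
    """Residuals of two coupled nonlinear toy equations (hypothetical stand-ins for Eqs. (1)-(2))."""
    u, v = state
    a1, a2, a3, a4, a5 = alpha
    return np.array([
        a1 * u + u**3 - a2 * v - a3,  # first system, coupled to v through a2
        a4 * v + v**3 - a2 * u - a5,  # second system, coupled to u through a2
    ])


def jacobian(state, alpha):
    """Partial derivatives of the residuals with respect to the state functions (u, v)."""
    u, v = state
    a1, a2, a3, a4, a5 = alpha
    return np.array([
        [a1 + 3.0 * u**2, -a2],
        [-a2, a4 + 3.0 * v**2],
    ])


def solve_nominal(alpha, tol=1e-13, max_iter=50):
    """Newton iteration for the state (u, v) at the given parameter values."""
    state = np.zeros(2)
    for _ in range(max_iter):
        step = np.linalg.solve(jacobian(state, alpha), -residuals(state, alpha))
        state = state + step
        if np.linalg.norm(step) < tol:
            return state
    raise RuntimeError("Newton iteration did not converge")


def response(state, alpha):
    """Hypothetical scalar-valued response depending on the state and on one parameter."""
    u, v = state
    return u * v + alpha[2] * u**2


alpha0 = np.array([2.0, 1.0, 1.0, 3.0, 2.0])  # assumed nominal parameter values
state0 = solve_nominal(alpha0)                 # nominal state (u0, v0)
R0 = response(state0, alpha0)                  # nominal response value
print("nominal state:", state0, "nominal response:", R0)
```

In this toy setting the parameter `a2` plays the role of a parameter shared by both systems, loosely mimicking an interface parameter.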
3. Mathematical Framework of the 1st-CASAM for Scalar-Valued Responses for Coupled Nonlinear Physical Systems Comprising Imprecisely Known Parameters, Interfaces and Boundaries
As has been mentioned in the foregoing, the model and boundary parameters are considered to be imprecisely known quantities. Their true values may differ from their nominal (average, or “base-case”) values by variations denoted as
, where
,
. In turn, the parameter variations
will cause variations
and
in the state functions, through Equations (1)-(3). Furthermore, the variations
,
and
will cause variations in the response
around the nominal response value
. Sensitivity analysis aims at computing the functional derivatives (called “sensitivities”) of the response to the imprecisely known parameters
. Subsequently, these sensitivities can be used for a variety of purposes, including quantifying the uncertainties induced in responses by the uncertainties in the model and boundary parameters, and combining the uncertainties in computed responses with the uncertainties in measured responses (“data assimilation”) to obtain more accurate predictions of responses and/or parameters (“model calibration”, “predictive modeling”, etc.). As has been shown by Cacuci [2] [3], the most general definition of the 1st-order total sensitivity of an operator-valued model response to parameter variations is provided by the first-order “Gateaux-variation” (G-variation) of the response under consideration. To determine the first G-variation of the response
, it is convenient to denote the functions appearing in the argument of the response as being the components of a vector
, which represents an arbitrary “point” in the combined phase-space of the state functions and (all) parameters. The point which corresponds to the nominal values of the state functions and parameters in this phase space is denoted as
. Analogously, it is convenient to consider the variations in the model’s state functions and parameters to be the components of a “vector of variations”,
, defined as follows:
. The 1st-order Gateaux- (G-) variation of the response
, which will be denoted as
, for arbitrary variations
in the model parameters and state functions in a neighborhood (
) around
, is obtained, by definition, as follows:
(8)
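The definition of the G-variation as a derivative with respect to a scalar at the nominal point can be checked numerically on a finite-dimensional stand-in. The sketch below is only an illustration under stated assumptions: the response `response`, the nominal point `e0`, and the vector of variations `h` are hypothetical, and the derivative with respect to the scalar `eps` is approximated by a central difference rather than computed exactly.

```python
import numpy as np


def response(e):
    """Hypothetical scalar response of a 'point' e = (u, v, alpha) in the combined space."""
    u, v, alpha = e
    return alpha * u**2 + np.sin(v) * u


def g_variation(R, e0, h, eps=1e-6):
    """Central-difference estimate of d/d(eps) R(e0 + eps*h) evaluated at eps = 0."""
    return (R(e0 + eps * h) - R(e0 - eps * h)) / (2.0 * eps)


e0 = np.array([1.5, 0.3, 2.0])    # assumed nominal values (u0, v0, alpha0)
h = np.array([0.1, -0.2, 0.05])   # assumed variations (du, dv, dalpha)

# Analytic partial derivatives of the toy response at e0.
grad = np.array([
    2.0 * e0[2] * e0[0] + np.sin(e0[1]),  # dR/du
    np.cos(e0[1]) * e0[0],                # dR/dv
    e0[0] ** 2,                           # dR/dalpha
])

dR_numeric = g_variation(response, e0, h)
dR_analytic = grad @ h  # valid here because this toy response is differentiable at e0
print(dR_numeric, dR_analytic)
```

For this smooth toy response the G-variation is linear in `h`, so it coincides with the gradient contracted with the variations.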
The existence of the G-variation
does not guarantee its numerical computability. Numerical methods most often require that
be linear in the variations
in a neighborhood (
) around
. The necessary and sufficient conditions for the G-differential
of a nonlinear operator
to be linear in
in a neighborhood (
) around
, and thus admit partial and total G-derivatives, are as follows:
1)
satisfies a weak Lipschitz condition at
; (9)
2) for two arbitrary vectors of variations
and
, the operator
satisfies the relation
(10)
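A G-variation can exist for every direction and still fail to be linear in the variations, which is why conditions of the kind listed above are needed. The following sketch uses a standard textbook counterexample, not an example from this work: the function `f(x, y) = x**2 * y / (x**2 + y**2)` with `f(0, 0) = 0`. At the origin its directional derivative exists in every direction but is not additive in the direction vectors.

```python
import numpy as np


def f(e):
    """Textbook function whose G-variation at the origin exists but is not linear."""
    x, y = e
    if x == 0.0 and y == 0.0:
        return 0.0
    return x**2 * y / (x**2 + y**2)


def g_variation(F, e0, h, eps=1e-6):
    """Central-difference estimate of d/d(eps) F(e0 + eps*h) evaluated at eps = 0."""
    return (F(e0 + eps * h) - F(e0 - eps * h)) / (2.0 * eps)


origin = np.zeros(2)
h1 = np.array([1.0, 0.0])
h2 = np.array([0.0, 1.0])

d1 = g_variation(f, origin, h1)         # variation along h1 alone
d2 = g_variation(f, origin, h2)         # variation along h2 alone
d12 = g_variation(f, origin, h1 + h2)   # variation along the sum h1 + h2

# Additivity fails: d12 differs from d1 + d2, so no G-differential exists at the origin.
print(d1, d2, d12)
```

Because `f` is homogeneous of degree one, its G-variation at the origin in the direction `h` equals `f(h)`, which is homogeneous but not additive.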
If the G-variation
is linear in
, then the function
is called the G-differential of
and is usually denoted as
. Furthermore, the result of the differentiations indicated on the right-side of the definition provided in Equation (8) can be written as follows:
(11)
where the so-called “direct-effect” term is defined as follows:
(12)
while the so-called “indirect-effect” term is defined as follows:
(13)
In Equations (12) and (13), the vectors
,
and
comprise, as components, the first-order partial G-derivatives computed at the phase-space point
. The G-differential
is an operator defined on the same domain as
and has the same range as
. The G-differential
satisfies the relation
, with
.
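A worked special case may help show how imprecisely known boundaries enter the direct-effect term. The notation below is an assumption made for illustration only and is not the general notation of this work: a one-dimensional domain with parameter-dependent endpoints $a(\boldsymbol{\alpha})$ and $b(\boldsymbol{\alpha})$, a single state function $u(x)$, and a response that integrates a hypothetical smooth integrand $F$ over the domain. Under these assumptions, applying Leibniz's rule to the definition of the G-differential gives:

```latex
\begin{align}
R(u;\boldsymbol{\alpha}) &= \int_{a(\boldsymbol{\alpha})}^{b(\boldsymbol{\alpha})}
      F\big(u(x);\boldsymbol{\alpha}\big)\,dx , \\
\delta R &= \underbrace{\int_{a}^{b}
      \frac{\partial F}{\partial \boldsymbol{\alpha}}\,\delta\boldsymbol{\alpha}\,dx
      + F\big(u(b);\boldsymbol{\alpha}\big)\,\delta b
      - F\big(u(a);\boldsymbol{\alpha}\big)\,\delta a}_{\text{direct effect}}
      + \underbrace{\int_{a}^{b}
      \frac{\partial F}{\partial u}\,\delta u(x)\,dx}_{\text{indirect effect}} , \\
\delta a &= \frac{\partial a}{\partial \boldsymbol{\alpha}}\,\delta\boldsymbol{\alpha},
\qquad
\delta b = \frac{\partial b}{\partial \boldsymbol{\alpha}}\,\delta\boldsymbol{\alpha} ,
\end{align}
```

where all quantities on the right-sides are evaluated at the nominal values. In this special case the endpoint contributions depend only on the nominal state and on the parameter variations, so they belong to the part of the G-differential that can be evaluated without knowing the variation $\delta u$.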
The “direct effect” term
depends only on the parameter variations
and can therefore be computed immediately, since it does not depend on the variations
and
. On the other hand, the “indirect effect” term
depends indirectly on the parameter variations
through the yet unknown variations
and
in the state functions, which are the solutions of the system of equations obtained by applying the definition of the G-differential to Equations (1)-(3), to obtain the following relations:
(14)
(15)
(16)
Performing in Equations (14)-(16) the differentiations with respect to
and setting
in the resulting expressions yields the following system of equations:
(17)
(18)
(19)
where
(20)
(21)
The system of equations comprising Equations (17)-(19) is called the “First-Level Forward Sensitivity System” (1st-LFSS) and could be solved to obtain the variations
and
in the state functions in terms of the parameter variations
which appear as sources in the 1st-LFSS equations. Subsequently, the variations
and
thus obtained could be used to compute the indirect-effect term defined in Equation (13).
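The cost structure of the forward approach can be seen on a small stand-in problem. The sketch below is a hedged illustration, not the 1st-LFSS of this work: it assumes two coupled nonlinear algebraic equations with five hypothetical parameters and a hypothetical response, so that the forward sensitivity system reduces to one linear solve per parameter, with the parameter variation appearing as the source.

```python
import numpy as np


def residuals(state, alpha):
    """Two coupled nonlinear toy equations (hypothetical stand-ins for Eqs. (1)-(2))."""
    u, v = state
    a1, a2, a3, a4, a5 = alpha
    return np.array([a1 * u + u**3 - a2 * v - a3,
                     a4 * v + v**3 - a2 * u - a5])


def jacobian(state, alpha):
    """Derivatives of the residuals with respect to the state functions (u, v)."""
    u, v = state
    a1, a2, a3, a4, a5 = alpha
    return np.array([[a1 + 3.0 * u**2, -a2],
                     [-a2, a4 + 3.0 * v**2]])


def d_residuals_d_alpha(state):
    """Derivatives of the residuals with respect to a1..a5 (rows: equations)."""
    u, v = state
    return np.array([[u, -v, -1.0, 0.0, 0.0],
                     [0.0, -u, 0.0, v, -1.0]])


def solve_nominal(alpha, tol=1e-13, max_iter=50):
    """Newton iteration for the state at the given parameter values."""
    state = np.zeros(2)
    for _ in range(max_iter):
        step = np.linalg.solve(jacobian(state, alpha), -residuals(state, alpha))
        state = state + step
        if np.linalg.norm(step) < tol:
            return state
    raise RuntimeError("Newton iteration did not converge")


def response(state, alpha):
    """Hypothetical scalar-valued response."""
    u, v = state
    return u * v + alpha[2] * u**2


def forward_sensitivities(alpha):
    """Total first-order sensitivities obtained with one linear solve per parameter."""
    state = solve_nominal(alpha)
    u, v = state
    J = jacobian(state, alpha)
    dF_da = d_residuals_d_alpha(state)
    dR_dstate = np.array([v + 2.0 * alpha[2] * u, u])    # used in the indirect effect
    dR_dalpha = np.array([0.0, 0.0, u**2, 0.0, 0.0])     # direct effect
    sens = np.empty(alpha.size)
    for j in range(alpha.size):
        dstate = np.linalg.solve(J, -dF_da[:, j])        # forward solve for parameter j
        sens[j] = dR_dalpha[j] + dR_dstate @ dstate
    return sens


alpha0 = np.array([2.0, 1.0, 1.0, 3.0, 2.0])  # assumed nominal parameter values
sensitivities = forward_sensitivities(alpha0)
print(sensitivities)
```

The loop over `j` is the point of the example: the number of linear solves grows with the number of parameters, which is inexpensive for a 2-by-2 algebraic system but not when each solve involves differential and/or integral operators.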
However, since there are at least
variations to consider, it becomes prohibitively expensive computationally to solve the 1st-LFSS, which may comprise differential and/or integral operators, for all possible parameter variations
. The need for solving repeatedly the 1st-LFSS for every possible parameter variation
can be circumvented by applying the concepts first outlined by Cacuci [2] [3] to construct a “First-Level Adjoint Sensitivity System” (1st-LASS), the solution of which will be used to eliminate the appearance of the variations
and
in the expression of the indirect-effect term defined in Equation (13). The 1st-LASS is constructed by implementing the following sequence of steps:
1) Introduce a Hilbert space pertaining to the domain
, denoted as
, comprising square-integrable vector-valued elements of the same form as the vectors
and
. The inner product underlying
, between two
elements
and
is denoted as
and defined as follows:
(22)
2) Introduce a Hilbert space pertaining to the domain
, denoted as
, comprising square-integrable vector-valued elements of the same form as the vectors
and
, i.e.,
and
The Hilbert space
is endowed with an inner product denoted as
, which is defined as follows:
(23)
3) In the Hilbert space
, form the inner product of Equation (17) with a yet undefined vector-valued function
to obtain the following relation:
(24)
4) Using the definition of the adjoint operator in the Hilbert space
, recast the left-side of Equation (24) as follows:
(25)
where
denotes the bilinear concomitant evaluated on the boundary
. In Equation (25), the operator
is the formal adjoint of
.
5) Replace the left-side of Equation (24) by the right-side of Equation (25) to obtain the following relation:
(26)
6) In the Hilbert space
, form the inner product of Equation (18) with a yet undefined vector-valued function
to obtain the following relation:
. (27)
7) Using the definition of the adjoint operator in the Hilbert space
, recast the left-side of Equation (27) as follows:
(28)
where
denotes the bilinear concomitant evaluated on the boundary
. In Equation (28), the operator
is the formal adjoint of
.
8) Replace the left-side of Equation (27) by the right-side of Equation (28) to obtain the following relation:
(29)
9) Add Equations (29) and (26) to obtain:
(30)
10) The next step is to relate the right-side of Equation (30) with the indirect-effect term
defined in Equation (13). Since the response considered is a functional of
and
, the G-differential
is also a functional of
and
. Consequently, since this G-differential is linear in the variations of the state functions, the well-known Riesz representation theorem (which states that every bounded linear functional on a Hilbert space can be expressed uniquely as an inner product with a fixed element of that space) ensures that the indirect-effect term
can be expressed uniquely as follows:
(31)
11) Identifying the right-side of Equation (31) with the left-side of Equation (30) indicates that the indirect-effect term
would be equal to the right side of Equation (30) provided that the following relations are satisfied by the yet undetermined functions
and
:
(32)
(33)
12) Using Equations (31)-(33) in Equation (30) transforms the latter into the following form:
(34)
13) The boundary, interface and initial/final conditions for the functions
and
are now determined by imposing the following requirements:
a) Implement the boundary, interface and initial/final conditions given in Equation (19) into the bilinear concomitants in Equation (34).
b) Eliminate the remaining unknown boundary, interface and initial/final conditions involving the functions
and
from the expression of the bilinear concomitants in Equation (34) by selecting boundary, interface and initial/final conditions for the functions
and
such that the selected conditions for
and
must be independent of unknown values of
,
and
while ensuring that Equations (32) and (33) are well posed. The boundary conditions thus chosen for the adjoint functions
and
can be represented in operator form as follows:
(35)
where the subscript “A” indicates “adjoint”.
14) The selection of the boundary conditions for the adjoint functions
and
represented by Equation (35) eliminates the appearance of any unknown values of the variations
and
in the bilinear concomitants in Equation (34) and reduces these concomitants to a residual quantity that contains boundary terms involving only known values of
,
,
,
,
,
. This residual quantity will be denoted as
. In general, this residual quantity does not automatically vanish, although it may do so in particular instances. In principle,
could be forced to vanish, if necessary, by considering extensions, in the operator sense, of the linear operators
and/or
, but such extensions seldom need to be used in practice.
15) Using the conditions represented by Equations (19) and (35) in Equation (34) yields the following (final) expression for the indirect-effect term
:
(36)
As the expression in Equation (36) indicates, the desired elimination of the unknown variations
and
from the original expression of
given in Equation
(13) has been accomplished by having replaced them by expressions involving the functions
and
, which do not depend on any parameter variations, a fact that has been underscored by having explicitly indicated that the indirect-effect term can now be written in the form
.
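The sequence of steps above can be traced on a deliberately simple single-system example. The problem below is an assumption chosen for illustration and is not taken from this work: a linear first-order equation with decay parameter $\alpha_1$, inlet value $\alpha_2$, an imprecisely known upper boundary $b$, and a response equal to the integral of the state over the domain. The symbols $u$, $a$, $\alpha_1$, $\alpha_2$ and $b$ are hypothetical names introduced only for this sketch.

```latex
\begin{align}
&\text{Model and response:} &&
  \frac{du}{dx} + \alpha_1 u = 0,\quad 0 < x < b,\qquad u(0) = \alpha_2,\qquad
  R = \int_0^{b} u(x)\,dx . \\
&\text{Forward sensitivity system:} &&
  \frac{d(\delta u)}{dx} + \alpha_1\,\delta u = -(\delta\alpha_1)\,u,\qquad
  \delta u(0) = \delta\alpha_2 . \\
&\text{Adjoint identity:} &&
  \int_0^{b} a\left(\frac{d(\delta u)}{dx} + \alpha_1\,\delta u\right)dx
  = \int_0^{b} \delta u\left(-\frac{da}{dx} + \alpha_1 a\right)dx
  + \big[a\,\delta u\big]_{0}^{b} . \\
&\text{Adjoint sensitivity system:} &&
  -\frac{da}{dx} + \alpha_1 a = 1,\qquad a(b) = 0
  \;\;\Longrightarrow\;\;
  a(x) = \frac{1 - e^{-\alpha_1 (b - x)}}{\alpha_1} . \\
&\text{Indirect effect:} &&
  \int_0^{b} \delta u\,dx
  = -(\delta\alpha_1)\int_0^{b} a(x)\,u(x)\,dx + a(0)\,\delta\alpha_2 . \\
&\text{Total first-order variation:} &&
  \delta R = u(b)\,\delta b
  - (\delta\alpha_1)\int_0^{b} a(x)\,u(x)\,dx + a(0)\,\delta\alpha_2 .
\end{align}
```

In this example the adjoint boundary condition $a(b) = 0$ removes the unknown value $\delta u(b)$ from the bilinear concomitant, the remaining boundary term $a(0)\,\delta\alpha_2$ involves only known quantities, and the boundary variation $\delta b$ enters through the direct-effect term $u(b)\,\delta b$. The adjoint function does not depend on any of the variations $\delta\alpha_1$, $\delta\alpha_2$, $\delta b$.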
The system of equations represented by Equations (32), (33), and (35) is called the First-Level Adjoint Sensitivity System (1st-LASS) and the functions
and
are called the “first-level adjoint sensitivity functions.” The essential feature of the 1st-LASS is that it is independent of parameter variations (in contradistinction to the 1st-LFSS), so it needs to be solved only once per response to obtain the first-level adjoint sensitivity functions
and
. Once the adjoint functions
and
are available, they can be used
in Equation (36) to compute the indirect-effect term
exactly and efficiently, using quadrature formulas, which are many orders of magnitude faster to evaluate than solving the operator (differential, integral) equations that underlie the 1st-LFSS. As is well known [2] [3] [4] [5], it is this property that makes the adjoint sensitivity analysis method “unbeatable” when needing to compute the sensitivities of functional-valued responses to many imprecisely known parameters.
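The "solve once, then use quadratures" pattern can be sketched numerically on a small example. The code below is an illustration under stated assumptions, not an implementation of the general 1st-LASS: it assumes the single-system model `u' + alpha1*u = 0` on `0 < x < b` with `u(0) = alpha2`, the response `R` equal to the integral of `u` over the domain, and the corresponding adjoint problem `-a' + alpha1*a = 1` with `a(b) = 0`. The helper names `rk4`, `trapezoid` and `adjoint_sensitivities`, and the numerical values, are hypothetical.

```python
import numpy as np


def rk4(f, y0, h, n_steps):
    """Classical RK4 for the autonomous scalar ODE y' = f(y); h may be negative."""
    y = np.empty(n_steps + 1)
    y[0] = y0
    for i in range(n_steps):
        k1 = f(y[i])
        k2 = f(y[i] + 0.5 * h * k1)
        k3 = f(y[i] + 0.5 * h * k2)
        k4 = f(y[i] + h * k3)
        y[i + 1] = y[i] + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y


def trapezoid(y, h):
    """Composite trapezoidal rule on a uniform grid with spacing h."""
    return h * (0.5 * y[0] + y[1:-1].sum() + 0.5 * y[-1])


def adjoint_sensitivities(alpha1, alpha2, b, n_steps=2000):
    """Sensitivities of R = integral of u over (0, b) to alpha1, alpha2 and b."""
    h = b / n_steps
    # One forward solve for the nominal state u(x).
    u = rk4(lambda y: -alpha1 * y, alpha2, h, n_steps)
    # One adjoint solve, integrated backward from the condition a(b) = 0.
    a_from_b = rk4(lambda y: alpha1 * y - 1.0, 0.0, -h, n_steps)
    a = a_from_b[::-1]  # reorder so that a[i] corresponds to x = i*h
    return {
        "alpha1": -trapezoid(a * u, h),  # indirect effect, evaluated by quadrature
        "alpha2": a[0],                  # residual boundary term at x = 0
        "b": u[-1],                      # direct effect of the boundary variation
    }


sens = adjoint_sensitivities(alpha1=0.8, alpha2=1.5, b=2.0)
print(sens)
```

Only one forward solve and one adjoint solve are performed regardless of how many parameters are considered; each sensitivity is then a quadrature or a boundary evaluation. For this example the results can be compared with the derivatives of the closed-form response `alpha2 * (1 - exp(-alpha1*b)) / alpha1`.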
4. Concluding Remarks
This work has presented the first-order comprehensive adjoint sensitivity analysis methodology (1st-CASAM) for computing efficiently, exhaustively and exactly, the first-order response sensitivities for coupled nonlinear physical systems characterized by imprecisely known parameters describing the systems, the interfaces between systems and the systems’ domain boundaries. The 1st-CASAM generalizes and extends the previously published theoretical works on this topic, and also enables the quantification of the effects of manufacturing tolerances on the responses of physical and engineering systems. The 1st-CASAM highlights the conclusion that response sensitivities to the imprecisely known domain boundaries and interfaces can arise both from the definition of the system’s response and from the equations, interfaces and boundary conditions defining the model and its imprecisely known domain. Ongoing research will generalize the methodology presented in this work, aiming at computing exactly and efficiently higher-order response sensitivities for coupled systems involving imprecisely known interfaces, parameters, and boundaries. The sequel [20] to this work illustrates the application of the 1st-CASAM to a benchmark problem [21] [22] [23] that models coupled heat conduction and convection in a physical system comprising an electrically heated rod surrounded by a coolant, which simulates the geometry of an advanced (“Generation-IV”) nuclear reactor [24].