1. Introduction
Recently, we proposed in [1] a rigorous model for prediction and control of a large-scale joint swarm of unmanned ground vehicles (UGVs) and unmanned aerial vehicles (UAVs) performing an autonomous land-air operation. In that paper, we also identified the need for a cognitive supervisor for the high-dimensional distributed multi-robotic system. Its primary task is to maintain a bird’s-eye view of the situation across the joint land-air operation and, based on the GPS locations of both the target and all participating robots, to provide them with suitable 2D and 3D attractor fields so that they can reach the proximity of the target in the shortest possible time. The purpose of the present paper is to develop a rigorous model for this cognitive supervisor, based on recent discoveries in brain science that show how humans navigate in 2D environments and how bats navigate in 3D environments, and to couple this with a meta-cognitive supervisor model that allows vehicles to reason about actions and construct simple plans.
The 2014 Nobel Prize in Physiology or Medicine was awarded jointly to John O’Keefe, May-Britt Moser and Edvard I. Moser “for their discoveries of cells that constitute a positioning system in the brain”, in other words, for the hippocampus path integration and navigation system.
Briefly, there is a part of the mammalian brain, called the hippocampal formation, which in humans is most developed in taxi drivers, grows in size with their experience and can also be trained (like a muscle) using fast-action video games. The hippocampal formation provides a cognitive map of a familiar environment which can be used to identify one’s current location and to navigate from one place to another. This mapping system provides two independent strategies for locating places, one based on environmental landmarks and the other on a path integration system (see [2] [3] and the references therein), which uses information about distances traveled in particular directions. This brain navigation system exists in all mammals, while in humans it additionally provides the basis for the so-called episodic memory [4] .1
Two main components of the hippocampal formation are: (i) hippocampal place cells (discovered by O’Keefe), and (ii) grid cells from the entorhinal cortex (discovered by the Mosers). In particular, according to Edvard Moser, “All network models for grid cells involve continuous attractors ...”, which is similar to our attractor Hamiltonian dynamics of UGVs and UAVs, given by Equations (1)-(2) in the next section.
As inspired by this discovery in brain science, the present paper proposes a novel probabilistic spatio-temporal model for mammalian path integration and navigation, formulated as an adaptive Hamiltonian path integral. The model combines: (i) a cognitive map
performed by hippocampal place cells, (ii) an entorhinal map
performed by grid cells, (iii) a current of sensory (extra-hippocampal) stimuli
, and (iv) Hebbian learning in hippocampal synaptic weights
. This model represents an infinite-dimensional neural network, which can be simulated (using $10^6$ to $10^7$ neurons) on IBM’s TrueNorth chip.
We also propose to couple this cognitive supervisor to a meta-cognitive supervisor, supporting dynamic mission planning, based on an established propositional multimodal logic framework. This approach gives robotic vehicles the ability to construct and execute simple plans on the fly against goals, given local sensor information, state information communicated locally between vehicles, and aspects of the state of the robot itself. The coupling to the cognitive supervisor and to the outside world is through the truth values of logical atoms and through multimodal actions.
2. Affine Hamiltonian Control for a Joint (UGV + UAV) Swarm
The affine Hamiltonian control model with many degrees-of-freedom has been presented in the form of 2n-dimensional (2ND) Langevin-type attractor matrix equations with nearest-neighbor couplings, which represent two recurrent neural networks:
UGV-swarm:
(1)
UAV-swarm:
(2)
The following terms are used in Equations (1) and (2):
and
are time-evolving matrices defining coordinates and momenta of the UGV-swarm, respectively, with initial conditions:
and
. Similarly,
and
are time-evolving tensors defining coordinates and momenta of the UAV-swarm, respectively, with initial conditions:
and
.
and
are the 2D attractors for the UGV swarm, while
and
are the 3D attractors for the UAV swarm.
and
are the attractor field strengths for the UGV and UAV swarms, respectively;
and
are corresponding adaptive weights of both swarms which can be trained by Hebbian learning,
and
are the initial formations of both swarms,
and
are Lie-derivative controllers for both swarms,
and
are affine Hamiltonians of both swarms, while
and
are zero-mean, delta-correlated, Gaussian white noises added to
and
variables in both swarms. For more technical details on affine Hamiltonian (or, similarly, port-Hamiltonian) control of large-scale dynamical systems, see [1] and the references therein.
The purpose of the cognitive supervisor is to provide the 2D and 3D inputs to the recurrent neural nets (1)-(2), or specifically, 2D attractors
for the UGV-swarm and 3D attractors
for the UAV-swarm (see, e.g. [5] ).
3. Adaptive Hamiltonian Path Integral
In his recent Nobel lecture, John O’Keefe referred to his pioneering 1976-paper [6] [7] , describing the function of the hippocampal place cells performing Tolman’s cognitive mapping: “When an animal had located itself in an environment (using environmental stimuli) the hippocampus could calculate subsequent positions in that environment on the basis of how far and in what direction the animal had moved in the interim ...” This quotation was accompanied by a commutative diagram depicting the vector addition and a suggestion that an animal moves in a sequence of vectors.
This extract from O’Keefe’s lecture is the motivation for the present mathematical model. Basically, any two-dimensional (2D) vector is equivalent to a complex number $z = x + \mathrm{i}y$, where $(x, y)$ are Cartesian coordinates and $\mathrm{i} = \sqrt{-1}$. The same complex number $z$ can also be given in the polar form as $z = r\,e^{\mathrm{i}\theta}$, where $r$ is the radius vector and $\theta$ is the heading. The sequence of $n$ vectors is the sum of complex numbers:
$$z = \sum_{k=1}^{n} z_k = \sum_{k=1}^{n} r_k\,e^{\mathrm{i}\theta_k} \qquad (3)$$
In this way, we can describe a particle-like animal motion in the complex plane
from some initial point
to the final point
performed in
steps, as an integral complex number (3). This basic idea describes an animal’s motion in purely static and deterministic terms; it can be generalized into a more realistic, probabilistic dynamics, as follows.2
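The deterministic vector summation in (3) can be illustrated with a minimal Haskell sketch. The names `Step`, `integratePath` and `toXY` are hypothetical and are not part of the paper's implementation; headings are assumed to be given in radians.

```haskell
import Data.Complex (Complex, imagPart, mkPolar, realPart)

-- One movement step: distance travelled r and heading theta (radians).
type Step = (Double, Double)

-- Dead-reckoning position: add the steps as complex numbers
-- r * exp(i * theta), starting from the initial point z0.
integratePath :: Complex Double -> [Step] -> Complex Double
integratePath z0 steps = z0 + sum [mkPolar r theta | (r, theta) <- steps]

-- Cartesian coordinates (x, y) of a point in the complex plane.
toXY :: Complex Double -> (Double, Double)
toXY z = (realPart z, imagPart z)
```

For example, one unit step along the x-axis followed by one unit step at heading pi/2 ends, up to floating-point rounding, at the point (1, 1).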
Now, instead of the complex plane
, consider a particle-like animal motion in the phase plane (p-q), where
represents the action of the hippocampal place cells and
defines the action of the entorhinal grid cells. The animal moves from some initial point
given by canonical coordinates (
) at initial time
, to the final point
given by canonical coordinate (
) at final time
, via all possible paths, each path having an equal probability (so that the sum of all path probabilities equals 1). This most general 2D motion is properly defined by the transition amplitude:
whose absolute square represents the transition probability density function:
The transition amplitude
can be calculated via the following Hamiltonian path integral3
(4)
where the integration is performed over the
and
values at every time
with the time-step
and the velocity
For technical details of the derivation of the Hamiltonian path integral (4), see e.g. [8] [9] and the references therein.
Next, the sources from various extra-hippocampal stimuli can be incorporated into the basic transition amplitude (4) by adding some form of a bio-electric current
, as:
(5)
where
is the system’s partition function dependent on the current
and obeying the normalization condition:
.
A generalization from a single-particle Hamiltonian path integral (4) to the probabilistic dynamics of an N-particle system is straightforward. The phase-space functional integral that defines the transition amplitude
, from the initial ND point
at time
to the final ND point
at time
is given by:
where we are allowing for the full Hamiltonian of the system
to depend upon all the
coordinates
and momenta
collectively.
Again, we can add various sources as incoming bio-currents
as a straightforward generalization of the single-particle partition function (5) to the system of
particles:
where
is the system’s partition function dependent on all the incoming currents
and obeying the normalization condition:
for
Our final step is to transform the N-particle partition function
into an infinite-dimensional recurrent neural network (of a generalized Hopfield type) by including the hippocampal synaptic weights
into it (compare with [12] ), as:
(6)
where the weights
are adapted (in a discrete time) by Hebbian-type learning:
(7)
where
represent local signal and noise amplitudes, respectively, while superscripts
and
denote desired and achieved system states, respectively.
The system (6-7) defines the proposed adaptive Hamiltonian path integral model for a generic mammalian path integration and navigation. Both the sequential (Ising spin) Hopfield network [13] with its Glauber dynamics and the graded-response Hopfield network [14] with its Fokker-Planck dynamics can be considered as special cases of this general Hamiltonian neural system.
Direct computer simulations of the adaptive Hamiltonian path integral system (6-7) can be performed on the IBM TrueNorth chip (see [15] and the references therein) as a Markov-chain Monte Carlo simulation over a grid of Hopfield nets (which are already implemented in the TrueNorth chip, using $10^6$ to $10^7$ artificial neurons).
In the next section we propose a more efficient approach to simulate the path integral system (6-7).
4. A Pair of Coupled Nonlinear Schrödinger Equations
In this section, instead of direct computer simulations on a supercomputer, we present an indirect approach to simulating the path integral system (6-7) on an ordinary PC, represented as a pair of coupled nonlinear Schrödinger (NLS) equations. In his first paper [11], Feynman showed that his Lagrangian q-path integral was equivalent to the standard linear Schrödinger equation from quantum mechanics, given here for the case of a free particle (in natural physical units: $\hbar = m = 1$):
$$\mathrm{i}\,\partial_t \psi = -\tfrac{1}{2}\,\partial_{xx}\psi \qquad (8)$$
which defines the complex-valued microscopic wave function $\psi = \psi(x,t)$, whose absolute square $|\psi(x,t)|^2$ defines the probability density function (PDF).
In the last decade it was shown (see [16] and the references therein) that if the linear Schrödinger Equation (8) is put into an adaptive (iterative) feedback loop, it adds a cubic nonlinearity with a potential field
and becomes the NLS equation:
(9)
which now defines the macroscopic wave function
whose absolute square
still defines the PDF [17] . A variety of analytical solutions for the NLS Equation (9) have been reported in [18] [19] .
Finally, to represent the Hamiltonian
-path integral, we can use the
-pair of NLS equations, as follows.
4.1. Special Case: Analytical Soliton
We start with a simple
-NLS pair representation for the path integral system (6-7), which admits the analytical closed-form solution, given by the so-called Manakov system (with the constant potential
):4
(10)
(11)
which was proven in [20], using the Lax pair representation [21], to be a completely integrable Hamiltonian system, owing to the existence of an infinite number of involutive integrals of motion.
The
-NLS pair (10)-(11) admits both “bright” and “dark” solitons as solutions, of which the simplest one is the so-called Manakov bright 2-soliton given by:
(12)
where
and
are real-valued parameters and
4.2. General Case: Numerical Simulation
Now that we have introduced the simple
-NLS pair, we can define our real representation for the path integral system (6-7), as the following more general
-NLS pair:
(13)
(14)
including the bell-shaped (sech) spatiotemporal potentials:
and the soft-step shaped (tanh) spatiotemporal input currents:
together with the common initial condition:
the set of parameters:
and the set of adaptive synaptic weights
The
-NLS pair (13)-(14) has been numerically simulated in Mathematica®, producing 3D plots of real and imaginary parts of the
-wave functions (see Figure 1) and density plots of attractor fields for robotic swarms (see Figure 2), using the following code:
Figure 1. Simulation of the
-NLS pair (13)-(14) in Mathematica®, showing the adaptation of the
-waves with the change of the synaptic weights
. Each 3D plot shows the following two surfaces (real and imaginary values of the wave function): (a-q) is the q-wave plot at
, (a-p) is the p-wave plot at
; (b-q) is the q-wave plot at
, (b-p) is the p-wave plot at
; (c-q) is the q-wave plot at
, (c-p) is the p-wave plot at
; (d-q) is the q-wave plot at
, (d-p) is the p-wave plot at
.
Figure 2. Density plots corresponding to 3D plots in Figure 1, representing hypothetical attractor fields for robotic swarms.
[Mathematica® listing not recovered. Its four parts were: defining potentials, defining NLS equations, numerical solution, and 3D plots.]
The bidirectional associative memory (BAM) given by the NLS-pair (13)-(14) effectively performs quantum neural computation, by giving a spatiotemporal generalization of the Hopfield, Grossberg and Kosko BAM family of recurrent neural networks (see [16] and the references therein). In addition, the shock-wave and solitary-wave nature of the coupled NLS equations may describe brain-like effects: propagation, reflection and collision of shock and solitary waves (see [24] ).
5. Meta-Cognitive Supervisor
The meta-cognitive supervisor model is concerned with equipping the elements of the robotic swarm with a limited capability for higher reasoning about potential consequences of its actions, using logical constraints that attempt to rule out actions leading to states that we wish to avoid. That is not to say that such states will never occur, so we also require considerable flexibility in our formulation, in the sense that it should maintain the potential to continue to operate under adverse conditions including partial system failure. Yet we require a formalism for capturing this that is as simple as possible, decidable, and computationally feasible in practice on small platforms; in this sense, we regard the Situation Calculus as too rich for our present purpose, since it is inherently first-order and consequently undecidable.
Our starting point is with the logic of actions and plans
, which is broadly similar to the propositional dynamic logic
[25] [26] , with the difference being that the necessity operator
in
is slightly less expressive (and hence less expensive) than the iteration operator
in
. This loss of expressivity does not currently appear to be a limiting factor in our application, while the compactness and strong completeness of
-which is not shared by
-represent a distinct advantage in the demanding highly dynamic context of our application and the limited resources of our envisaged platforms.
The logic of necessity
in
is S4 [27] , and
has a dual possibility operator
, which is used to express goals. The logic of each modal operator
, where
is a countable set of verbs designating distinct actions, is the basic modal logic K [27] . We also include a null action
. There is also a countable set
of atoms, which designate relevant state conditions in the environment (including within the robotic system itself). The set of literals
consists of all atoms
and their negations
. We symbolize the canonical tautology as
, which evaluates to true in all interpretations, and the canonical contradiction as
, which evaluates to false in all interpretations.
Formulae are then defined in the usual manner:
and
are formulae, all literals
and
for
are formulae, and conjunctions
, disjunctions
, and material implications
are formulae if
and
are formulae, and
is a formula if
is a formula and
is a verb. Nothing else is a formula.
For our application, we utilize a message passing approach for distributed communication, so actions for local broadcast of meta-cognitive supervisor messages are represented logically in meta-cognition as verbs and any messages received appear as atoms. Note also that this framework admits interaction with the cognitive supervisor, in both directions; specifically, some actions define initial conditions for the determination of the 2D
and 3D
attractors. Actions for attractor determination are relative to platform position rather than absolute. A formula that does not contain any of the modal operators
,
, or
for
is described as classical.
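The inductive definition of formulae and the notion of a classical formula can be sketched in Haskell as follows. This is a stand-alone illustration: the type `Fml` and its constructor names are assumptions made here and differ from the data type used in the implementation described later.

```haskell
-- Hypothetical formula type for illustration only.
data Fml
  = Top                    -- canonical tautology
  | Bottom                 -- canonical contradiction
  | Atom String
  | Not Fml
  | And Fml Fml
  | Or Fml Fml
  | Implies Fml Fml
  | Necessary Fml          -- necessity operator
  | Possible Fml           -- dual possibility operator
  | After String Fml       -- action modality for a verb
  deriving (Eq, Show)

-- A formula is classical when it contains no modal operator.
isClassical :: Fml -> Bool
isClassical f = case f of
  Top         -> True
  Bottom      -> True
  Atom _      -> True
  Not g       -> isClassical g
  And g h     -> isClassical g && isClassical h
  Or g h      -> isClassical g && isClassical h
  Implies g h -> isClassical g && isClassical h
  Necessary _ -> False
  Possible _  -> False
  After _ _   -> False
```

A conjunction of literals such as `And (Atom "Carrying") (Not (Atom "Delivered"))` is classical, while any formula that mentions an action modality is not.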
Possibility and necessity are related as
, and we write
to mean
. Goals are formulas having the form
, where
is classical. For every verb
and formula
, we have
, which means that the formula
is invariant; formulae of the form
constitute integrity constraints. When
in
is classical, the constraint is a static constraint, while
where
contains
for some
is a dynamic constraint that describes some action law. We are especially concerned with effect constraints,
which have the form
, and describe the consequences of performing actions. For instance,
states
that if Carrying is true and the action release is executed, then Delivered will be made true.
Frame axioms are effect constraints of the form
, which specifies that action
cannot cause the condition
. For example,
specifies that Carrying will remain true after action travel is executed if it was true before. Formulae of the form
mean that after performing action
, condition
will be true, so
says that Carrying is false after executing release, for instance. We can also relate alternate actions
and
depending on the truth or falsity of a current condition
to yield an outcome
using
, which we abbreviate as
. A formula
means that action
is executable, while
means that
is not executable.
The major weakness with using
directly lies in the large number of frame axioms required: because any action changes the truth value of relatively few formulae, nearly every combination of action and condition needs its own frame axiom, and this bulk of axioms bogs down the inference procedure. Even though our current meta-cognitive definitions are small, these frame axioms would likely still be sufficiently numerous to cause problems for the limited computational resources we typically expect on the currently available small and cheap platforms our research is targeting. Consequently, our choice of logic of action and plans is a variant of
, namely
[28] , which employs a weak ternary causal dependence relation involving atoms, actions and conditions to overcome the frame problem. This logic also means that our meta-cognitive supervisor is able to derive indirect effects of actions, which we expect will be very important in terms of being able to reason about exposure to unacceptable failure.
The logic
is a variant of the earlier
[29] , which also solves the frame problem using a notion of weak causal dependence. The attraction for our problem of
over
is that the former supports more compact domain descriptions without decreasing expressibility, increasing complexity or sacrificing decidability; using the latter would force us to still have to explicitly state conditional frame axioms, which have the form
. In
. In either system we still have to state indirect dependencies.
In contrast to the ternary dependence relation provided by our choice of
, the dependence relation in
is a binary relation
between actions and literals; it is actually the complementary independence relation
that is used here to encode frame axioms, according to
.
We use the logic
because it includes an extra parameter in the dependence relation to capture the conditions under which actions may impact on atoms, thereby avoiding the need to state conditional frame axioms in defining the problem domain. We write the ternary contextual dependence relation as
, where
is a classical formula,
is a verb and
is an atom, to mean that if
is true then action
may change the truth value of
. Note that weak contextual dependence does not mean that the action in the context causes a change in truth value of the atom, only that the change might happen. For instance,
states that executing release when Carrying is true and Delivered is false may affect the value of Delivered; the conditional frame axiom
is not needed in the domain description. In our case, the conditions
in
are always conjunctions of literals, because disjunctions can simply be split into separate dependence statements. Figure 3 contains a small example of a robot meta-cognition scenario in
.
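The way a ternary contextual dependence relation replaces explicit frame axioms can be sketched as a simple lookup. This is a minimal illustration at the level of a single valuation, not the tableau treatment used in the paper; the names `Dependence`, `mayChange` and `persists` are hypothetical, and atoms missing from a valuation are assumed to be false.

```haskell
import qualified Data.Map as Map

type Atom = String
type Verb = String

-- A literal is an atom paired with the truth value it requires.
type Literal = (Atom, Bool)

-- Truth values of atoms; atoms not listed count as False (assumption).
type Valuation = Map.Map Atom Bool

-- One instance of the ternary dependence relation: in the given context
-- (a conjunction of literals), the verb may change the target atom.
data Dependence = Dependence
  { context :: [Literal]
  , verb    :: Verb
  , target  :: Atom
  }

-- Does a literal hold in the valuation?
holds :: Valuation -> Literal -> Bool
holds v (a, b) = Map.findWithDefault False a v == b

-- Weak dependence: the action may (not must) change the atom.
mayChange :: [Dependence] -> Valuation -> Verb -> Atom -> Bool
mayChange ds v act a =
  any (\d -> verb d == act && target d == a && all (holds v) (context d)) ds

-- Atoms that keep their value under the action, with no frame axioms stated.
persists :: [Dependence] -> Valuation -> Verb -> [Atom]
persists ds v act = [a | a <- Map.keys v, not (mayChange ds v act a)]
```

With the single dependence statement from the release example and a valuation in which Carrying is true and Delivered is false, `mayChange` reports that release may change Delivered, while Carrying is listed by `persists` without any frame axiom being written down.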
The logic
is decidable, with the satisfiability problem being EXPTIME-complete, which is the same decidability and complexity as the base system
. A tableau method for
is simply a combination of the tableau rules for the logics S4 and K, while
and
require additional rules to handle their dependency relations. We use a notational variant of the definitions from [29] for tableau rules for
, with the different rules from [28] for the ternary weak dependence relation
in place of those for
to describe the tableau calculus.
A labeled formula is a pair
where
is a formula and
is from a countable set of labels for possible worlds; we just use the non-negative integers
. A Skeleton is a ternary relation
, which represents the accessibility relations between possible worlds under actions; we write
for
. A Tree, which corresponds to a
-model, is a pair
consisting of a set
of labeled formulae and a skeleton
. A tableau for
is then the limit of a sequence
for
of sets of trees, where
and each
is obtained from
by the application of a tableau rule (see Figure 4). We also use additional rules for other connectives in practice, but they amount to simple combinations of the basic rules shown here.
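The skeleton described above, a set of triples recording which world is reached from which under which action, can be sketched with an ordinary set. The names `addEdge` and `successors` are hypothetical; the implementation described below uses its own indexed tree structure instead.

```haskell
import qualified Data.Set as Set

type World = Int
type Verb = String

-- A skeleton: triples (w, a, w') meaning that world w' is accessible
-- from world w under action a.
type Skeleton = Set.Set (World, Verb, World)

-- Record a new accessibility fact.
addEdge :: (World, Verb, World) -> Skeleton -> Skeleton
addEdge = Set.insert

-- Verbs and worlds directly accessible from a given world.
successors :: World -> Skeleton -> [(Verb, World)]
successors w sk = [(a, w') | (u, a, w') <- Set.toList sk, u == w]
```

A lookup of this kind is what a rule for the necessity operator needs when it propagates a formula to every world reachable from the current one.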
Figure 3. A simple theory in
. Note the absence of frame axioms, which we would have in
, and conditional frame axioms we would still need in
. We have
.
Figure 4. Tableau rules for
, as a modification of the tableau rules for
, with the final two rules substituted for new rules for handling the ternary dependence relation. The notation
means that the literal
cannot be verified in world
from the formulae in
.
The rule
states that all literals that depend on an action in a context that does not verify have to be propagated following the execution of that action. Rule
is a back-propagation rule, which is required for completeness of the tableau method, and it states that literals that are true but whose truth value was not changed from the parent node in the tree must also have been true in the parent node. See [29] and [28] for further details.
Our test implementation is written in the pure functional language Haskell. We use a straightforward algebraic type for representing formulas, and labeled formulas are represented using a simple type synonym. The ordering of the constructors is important: formulas that cause tableau branching appear lower in the definition, so the ordering on formulas that the compiler derives for the “Ord” instance can be used to prioritize reduction rule application and push branching towards the lower part of the tableau.
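data LAPFormula = F | T | Atom String | Not LAPFormula
                | And LAPFormula LAPFormula
                | Necessary LAPFormula | Possible LAPFormula
                | Cause String LAPFormula
                -- ... (further constructors elided)
                deriving (Eq, Ord)
type Formula = (Int, LAPFormula)   -- labeled formula: (world index, formula)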
Space precludes a complete account of our test implementation, so we offer an abbreviated description focussing on the core components. In addition to the tableau evaluation module, we also have parser modules, and a number of supporting data structures, particularly for representing skeleton relations and dependency relations. The tableau is implemented in operational semantics style, using an algebraic data type to represent a set of abstract operations on tableau branches. As a sample: “Fail” represents an error condition, “Result” carries a return value, “Get” retrieves a formula for reduction, “Put” adds a labeled formula to the branch where it will be subject to further reduction, “Fresh” generates a new possible world index, “Claim” asserts accessibility between possible worlds under actions, “Close” closes the tableau branch by asserting a contradiction for a given world index, and “Split” forks the tableau into two tableaux.
data Tableau u = Fail
               | Result u
               | Get (Formula -> Tableau u)
               | Put Formula (Tableau u)
               | Fresh (Int -> Tableau u)
               | Claim (Int, String, Int) (Tableau u)
               | Close Int (Tableau u)
               | Split (Tableau u) (Tableau u)
               ...
There are also constructors for other operations such as those for adding labeled literals to a separate list where they will not be subject to further reduction steps, for searching the skeleton relation, and for searching the dependency relations (our implementation actually supports
as well as
). These operations all follow the same basic pattern as those shown, so we have omitted them for brevity.
instance Monad Tableau where
    return = Result
    Fail >>= f = Fail
    (Result u) >>= f = f u
    (Get g) >>= f = Get $ \x -> g x >>= f
    (Put x t) >>= f = Put x (t >>= f)
    (Fresh g) >>= f = Fresh $ \n -> g n >>= f
    (Claim c t) >>= f = Claim c (t >>= f)
    (Close n t) >>= f = Close n (t >>= f)
    (Split t t') >>= f = Split (t >>= f) (t' >>= f)
    ...
It is easy to verify that this obeys the left unit, right unit and associativity monad laws. For convenience, we use some simple wrapper functions around the constructors, rather than using them directly. For instance:
get :: Tableau Formula
get = Get Result
put :: Formula -> Tableau ()
put x = (Put x . Result) ()
close :: Int -> Tableau ()
close n = (Close n . Result) ()
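Current versions of GHC require every “Monad” instance to come with “Functor” and “Applicative” instances. The following self-contained sketch shows one way to supply them for a cut-down version of the operation type, together with a toy interpreter that makes the monad laws easy to check on examples. The simplified “Formula” type and the function “run” are assumptions made for this illustration and are not part of the implementation described here.

```haskell
import Control.Monad (ap, liftM)

-- Labeled formulas are kept abstract here: (world index, formula text).
type Formula = (Int, String)

-- A cut-down operation type with four of the constructors.
data Tableau u
  = Fail
  | Result u
  | Get (Formula -> Tableau u)
  | Put Formula (Tableau u)

-- Functor and Applicative are derived from the Monad instance below.
instance Functor Tableau where
  fmap = liftM

instance Applicative Tableau where
  pure = Result
  (<*>) = ap

instance Monad Tableau where
  Fail     >>= _ = Fail
  Result u >>= f = f u
  Get g    >>= f = Get (\x -> g x >>= f)
  Put x t  >>= f = Put x (t >>= f)

get :: Tableau Formula
get = Get Result

put :: Formula -> Tableau ()
put x = Put x (Result ())

-- Toy interpreter: a plain list serves as the queue of pending formulas.
-- Returns the result and the remaining queue, or Nothing on failure.
run :: Tableau u -> [Formula] -> Maybe (u, [Formula])
run Fail _           = Nothing
run (Result u) q     = Just (u, q)
run (Get _) []       = Nothing
run (Get g) (x : xs) = run (g x) xs
run (Put x t) q      = run t (q ++ [x])
```

Running `get >>= \(n, s) -> put (n + 1, s)` on a queue holding a single formula at world 0 removes it and re-inserts it at world 1.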
The instantiation of the abstract machine uses a data type that has separate structures for holding labeled formulas that will be subject to further reduction, those that will not be further used to fire rules, the skeleton relation of accessibility between possible worlds, and the ternary dependency relation. Here “BBTree” is an ordered tree type that implements the priority relation on formulas, so that non-branching formulas are preferred for reduction over those that cause branching. “Skeleton” is an indexed tree structure supporting the various kinds of searches needed on accessibility relation instances, and “Dependency” represents the ternary dependency relation and similarly allows the necessary searches on its instances.
data Branch = Branch {
      todo  :: BBTree Formula
    , lits  :: BBTree Formula
    , rho   :: Skeleton
    , index :: Int
    , deps  :: Dependency
    } deriving (Eq, Ord)
The central function is the “runTab” function, which defines how the abstract operations should be applied to branch structures. The rest of the cases (not shown) follow the same basic pattern.
runTab :: (Ord u) => Tableau u -> Branch -> BBTree (u, Branch)
runTab Fail _ = nil
runTab (Result y) b = singleton (y, b)
-- assumes delmin returns Nothing for an empty tree, else Just (minimum, rest)
runTab (Get g) b = case delmin (todo b) of
    Nothing      -> nil
    Just (x, xs) -> runTab (g x) (b { todo = xs })
runTab (Put f t) b = let b' = into b f
                     in runTab t b'
runTab (Fresh g) b = let n  = 1 + index b
                         b' = b { index = n }
                     in runTab (g n) b'
runTab (Close n t) b = let b' = b { todo = nil, lits = singleton (n, F) }
                       in runTab t b'
runTab (Split t t') b = runTab t b `mplus` runTab t' b
...
The functions “nil” and “singleton” build trees with zero and one element, respectively. The case for “Put” uses an auxiliary function “into” that first checks whether the branch already contains the negation of the formula to be inserted and, if so, inserts a formula containing the contradiction “F” instead. This approach provides us with a very compact and natural definition for the tableau reduction rules, as illustrated below.
reduce :: Formula -> Tableau ()
reduce (n, F) = close n
reduce (n, T) = return ()
reduce (n, And x y) = put (n, x) >> put (n, y)
reduce (n, Or x y) = put (n, x) `mplus` put (n, y)
reduce (n, Not (Necessary x)) = fresh >>= \n' -> put (n', neg x)
                                           >> claim (n, "[]", n')
reduce (n, Necessary x) = put (n, x) >> fromN n
                          >>= mapM_ (\(_, m) -> put (m, Necessary x))
...
The second to last case implements rule
The last case in the snippet implements rules
and
. It uses a function “fromN” that returns a list of verbs and world indexes accessible from a given world index, and the standard library “mapM_” to map each of these to actions that insert each resulting formula into the branch. A program for the abstract tableau machine for completely expanding a branch is also very simple.
expand :: Tableau ()
expand = isComplete >>= \c -> if c then return ()
                              else get >>= reduce >> expand
The tableau program can be applied to an initial branch, say b, with runTab expand b. We also have a number of convenience functions for producing an initial branch data structure from a list of formulas and dependency relation instances, and for extracting the results from the resulting tree of closed and saturated branches. Their implementation is straightforward though somewhat tedious.
6. Conclusions
In this paper, we have presented sophisticated cognitive and meta-cognitive supervisor models for joint swarms of robotic aerial and ground vehicles. Based on the research recognized by the 2014 Nobel Prize in Physiology or Medicine on path integration and navigation in the mammalian and human hippocampus (briefly reviewed in the appendix), this paper develops a Hamiltonian path integral cognitive supervisor model. This model emulates an
-dimensional recurrent neural network, yielding attractor fields for robotic swarms. While direct simulation of this Hamiltonian path integral can be done using IBM’s TrueNorth chip, for the purpose of its immediate evaluation on common hardware, we have transformed it into a coupled pair of NLS equations and simulated this pair in Mathematica.
The central point of our meta-cognitive model is that we are not utilising inference in modal logic systems to determine low-level movement, but rather to simulate a kind of high-level awareness in light of descriptive goals, general conditions and broad actions to make sense of sensor data and guide overall vehicle behaviour. Our representation of meta-cognitive communication using simple atoms and verbs reflects this choice. By effectively delegating details, the relatively simple models supported by the propositional multi-modal logic
suffice, with decidability of inference and, in practice, reasonable computation costs and fairly compact meta-cognitive behavioural definitions. We plan to supplant our current use of a tableau method for meta-cognitive inference using the multimodal logic
with a new path integral representation; that is, to eventually fully integrate meta-cognition and affine Hamiltonian control into what would amount to a single coherent generalized Hopfield type recurrent neural network.
Acknowledgements
The authors are grateful to Dr Yi Yue and Dr Martin Oxenham, Decision Sciences, Joint and Operations Analysis Division, DST Group, Australia, for their constructive comments, which have improved the quality of this paper. This work is a part of the DSTG TAS SRI (Tyche) project.
Appendix
Hippocampal Path Integration and Navigation in Mammals and Humans
We start with a brief history of hippocampal navigation, from O’Keefe’s pioneering work to the discovery of grid cells by the Mosers.
While most of neural network theory (including the concepts of associative synaptic plasticity, cell assemblies and phase sequences) is founded on Hebb’s seminal work [30], and the hippocampus as a spatial cognitive map was proposed in [7] [31], the pioneering paper on hippocampal navigation was [6], in which O’Keefe proposed a theoretical suggestion of a landmark-independent navigational system upstream of the hippocampus. A few years later, path integration in mammals was reported in [32], followed by a quantitative description of head direction-sensitive cells in the brain by [33], a report of remapping in hippocampal place cells in [34], and an early version of the head-direction path-integrator model in [35], which formed the conceptual basis of subsequent continuous attractor models for path integration.
A landmark paper [36] advanced the empirical understanding of hippocampal neurodynamics through the ability to record simultaneously from many neurons in the freely behaving animal. The phase relationship between hippocampal place units and the EEG theta rhythm was shown in [37], and the hippocampus as a path-integration system was proposed in [38].
A series of continuous attractor papers started with [39], followed by an attractor model of head direction cells based on angular velocity integration in [40], and by the concept of periodic boundaries together with an early introduction of medial entorhinal grid cells in mammals in [41].
The next two papers by O’Keefe, [4] [42], consider human hippocampal place cells, which signal both spatial and non-spatial information.
The pioneering study [43] reports that spatial position is represented accurately among ensembles of principal neurons in superficial layers of the medial entorhinal cortex (MEC), while the scale of representation increases along the MEC’s dorsoventral axis. It is followed by [44], which reports the discovery of grid cells and suggests them as a foundation for a universal path integration-based neuronal map of the spatial environment. Spatial representation and the architecture of the entorhinal cortex were presented in [45].
Next, we give a brief overview of the current understanding of the hippocampal formation: place cells and grid cells.
The review paper [2] shows that the hippocampal formation is able to encode the relative spatial location of mammals and humans (without any reference to external cues) by integrating translational and rotational self-motion, a process called path integration.
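As a minimal illustration of path integration in this sense (a toy dead-reckoning sketch, not the neural model reviewed in [2]), the following Python code accumulates a 2D position estimate from self-motion samples alone. The function name `integrate_path` and the (forward speed, angular velocity) sample format are assumptions made for this example.

```python
import math

def integrate_path(steps, dt=1.0, start=(0.0, 0.0), heading=0.0):
    """Dead-reckoning estimate of 2D position from self-motion only.

    steps: iterable of (speed, angular_velocity) samples (hypothetical format).
    Returns (x, y, heading) after integrating all samples.
    """
    x, y = start
    for speed, omega in steps:
        heading += omega * dt                # integrate rotational self-motion
        x += speed * math.cos(heading) * dt  # integrate translational self-motion
        y += speed * math.sin(heading) * dt
    return x, y, heading

# Walk a unit square: move one unit, then turn left by 90 degrees, four times.
square = [(1.0, 0.0), (0.0, math.pi / 2)] * 4
x, y, heading = integrate_path(square)

# Vector pointing back to the start, obtained without any external landmark.
home_vector = (-x, -y)
```

After the closed square walk the estimate returns to the starting point (up to floating-point error), so the home vector is approximately zero. With noisy self-motion samples the error would accumulate, which is one reason a landmark-based strategy is useful alongside path integration.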
Both theoretical and empirical studies show that the synaptic matrix of the MEC-grid cells of young mammals performs heavy self-organizing path-integration computations, similar to Turing’s symmetry-breaking operation5, while the scale at which space is represented increases systematically along the dorsoventral axis in both the hippocampus and the MEC. Spatially periodic inputs (at multiple scales) converging from the MEC-grid cells result in non-periodic spatial firing of the hippocampal place cells.
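The convergence of periodic inputs into a non-periodic response can be illustrated with a deliberately simplified 1D sketch. It is a toy picture only: real grid cells fire on 2D hexagonal lattices, and the cosine inputs, the three spatial periods, the track length and the names `grid_input` and `summed_drive` are all assumptions chosen for this example.

```python
import math

def grid_input(x, period, phase):
    """Spatially periodic 1D input, a toy stand-in for one grid-cell module."""
    return math.cos(2.0 * math.pi * (x - phase) / period)

def summed_drive(x, periods, phase):
    """Sum of periodic inputs at several spatial scales, all aligned at `phase`."""
    return sum(grid_input(x, p, phase) for p in periods)

periods = [30, 42, 59]   # hypothetical spatial scales (arbitrary units)
place_centre = 70        # hypothetical location where all inputs are in phase
track = range(0, 201)    # integer positions along a 1D track

# Each single input repeats along the track, but the summed drive does not:
# within the track, all three inputs peak together only at place_centre.
drive = [summed_drive(x, periods, place_centre) for x in track]
peak_position = max(track, key=lambda x: drive[x])
```

Each input on its own repeats with its period, whereas the summed drive reaches its maximum only where all scales coincide, so within this track the combined response singles out one location.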
The paper [3] reviews how place cells and grid cells form the entorhinal-hippocampal representations, initially observed in [46] and mathematically modeled in [47] [48], for quantitative spatio-temporal representation of places, routes, and associated experiences during behavior and in memory.
It has been observed that place cells perform both pattern completion and pattern separation, while hippocampal representations cannot always be discontinuous as in a sequential Hopfield network [13], but are rather similar to those of the graded-response Hopfield network [14].
Finally, while all the research mentioned so far dealt with 2D hippocampal path integration and navigation, which is relevant for our UGVs, in recent years this research has been generalized to 3D navigation of bats in [49] [50].
NOTES
1For technical details on the Nobel-awarded work of John O’Keefe, May-Britt Moser and Edvard I. Moser, see the Appendix and the references therein.
2For simplicity, we use the same Hamiltonian symbols for the cognitive representation of robotic coordinates and momenta, to emphasize the one-to-one correspondence between the physical robotic level and the mental supervisor level. However, while at the physical robotic level these coordinates and momenta are only temporal variables, at the cognitive level they represent spatiotemporal wave functions.
3The path integral (4) was formulated by R. Feynman in [10]. It has been widely appreciated that the phase-space (i.e., Hamiltonian) path integral is more generally applicable, or more robust, than the original, Lagrangian version of the path integral, introduced in Feynman’s first paper [11]. For example, the original Lagrangian path integral is satisfactory for Lagrangians of the form:
but it is unsuitable, e.g., for the case of a particle with the Lagrangian (in normal units):
For such a system (as well as many more general expressions) the Hamiltonian path integral is more robust; e.g., the Hamiltonian path integral for the free particle:
is readily evaluated.
4The Manakov system has been used to describe the interaction between wave packets in dispersive conservative media, and also the interaction between orthogonally polarized components in nonlinear optical fibres (see, e.g. [22] [23] and the references therein).
5Turing’s landmark paper [51] demonstrated that symmetry breaking can occur in a simple reaction-diffusion system, and that the resulting spatially periodic structures can account for pattern formation in nature.