Discrete Exterior Calculus of Proteins and Their Cohomology
Naoto Morikawa
Genocript, Zama, Japan.
DOI: 10.4236/ojdm.2022.123004

Abstract

This paper proposes a novel application of cohomology to protein structure analysis. Since proteins interact with each other by forming transient protein complexes, their shape (e.g., shape complementarity) plays an important role in their functions. In our mathematical toy models, proteins are represented as loops of triangles (2D model) or tetrahedra (3D model), and their interactions are defined as fusions of loops. The purpose of this paper is to describe the conditions for loop fusion using the language of cohomology. In particular, this paper uses cohomology to describe the conditions for “allosteric regulation”, which has attracted attention in safer drug discovery. I hope that this paper will provide a new perspective on the mechanism of allosteric regulation. Advantages of the model include its topological nature. That is, we can deform the shape of loops by deforming the shape of triangles (or tetrahedra) as long as their folded structures are preserved. Another advantage is the simplicity of the “allosteric regulation” mechanism of the model. Furthermore, the effect of “post-translational modification” can be understood as a resolution of singularities of a flow of triangles (or tetrahedra). No prior knowledge of protein science, exterior calculus, or cohomology theory is required. The author hopes that this paper will facilitate the interaction between mathematics and protein science.

Share and Cite:

Morikawa, N. (2022) Discrete Exterior Calculus of Proteins and Their Cohomology. Open Journal of Discrete Mathematics, 12, 47-63. doi: 10.4236/ojdm.2022.123004.

1. Introduction

Proteins are hardworking macromolecules that perform a variety of functions in cells. Proteins are made up of chains of 50 to 2000 amino acids, usually folded into a well-defined three-dimensional structure. Since proteins interact with each other by forming transient protein-protein complexes, their shape (such as shape complementarity at the interface) plays an important role in their functions.

In this paper, three mathematical models of proteins are presented, i.e., the loop model, the flow model, and the cohomological model in this order (Figure 1).

First, in Section 2, we explain the problem intuitively using a loop model. In particular, we give the intuitive definition of protein interactions and allosteric triplets. For example, “protein interaction is defined as fusions of loops”.

Next, in Section 3, we consider conditions for protein interaction and allosteric regulation using the flow model. The flow model is a differential geometric model [1], where proteins are represented as a closed trajectory of a flow of triangles (2D model) or tetrahedra (3D model). Then, we can define conditions for protein interaction and allosteric regulation using the language of differential geometry [2]. For example, “two proteins interact if the corresponding local flow is integrable”.

Finally, in Section 4, we rephrase the conditions obtained in Section 3 using the cohomological model. For example, “two proteins interact if the cohomology class of the corresponding vector field is zero”. Cohomology classes of vector fields are defined using exterior derivative operators.

“Allosteric regulation” and “post-translational modification” are briefly introduced in Subsections 2.3 and 3.6, respectively. In the flow/cohomological model, “allosteric regulation” corresponds to the integrability of a flow, and “post-translational modification” corresponds to a resolution of singularities of a flow (Figure 2).

As for previous works, the author is unaware of any other geometrical models of allosteric regulation or of any applications of cohomology to protein structure analysis. For an overview of the history of allostery, see [3]. For an application of cohomology to biological time series, see [4].

For the sake of clarity, we mainly consider loops of triangles and triangular meshes. In the following, we denote the set of all integers by $\mathbb{Z}$.

Figure 1. The mathematical models. (a) Protein (Polyline); (b) The loop model; (c) The flow/cohomological model.

Figure 2. Advantages of the flow/cohomological model. (a) Integrability of a flow. The arrow indicates a conflict between two distal triangles across two “not-form-a-loop” triangles; (b) Resolution of singularities of a flow. By splitting each of the “not-form-a-loop” triangles into two, a loop of length four (dark grey) is obtained.

2. Loop Model of Proteins

In the loop model (Figure 1(b)), amino acid sequences are represented as a closed chain of triangles (2D model) or tetrahedra (3D model).

2.1. Loops and Their Normal Edges

A chain u of triangles is a series of triangles in which consecutive triangles are connected by a common edge, i.e., for some interval $I \subset \mathbb{Z}$,

$$u = \{\, t[i]\ (i \in I) \mid t[i] \neq t[j] \text{ if } i \neq j, \text{ and } t[i] \text{ and } t[i+1] \text{ intersect at a common edge} \,\}. \tag{1}$$

A closed chain u is called a loop.

Let $u = \{t[i]\}$ be a chain of triangles. The underlying mesh $M(u)$ of u is a triplet defined by

$$M(u) := \langle T(u),\ E(T(u)),\ V(T(u)) \rangle, \tag{2}$$

where

$$\begin{cases} T(u) := \{\, t \mid t \in u \,\}, \\ E(T(u)) := \text{the set of all edges of the triangles in } T(u), \\ V(T(u)) := \text{the set of all vertices of the triangles in } T(u). \end{cases} \tag{3}$$

In this paper, we consider only the chains such that (1) two triangles in $T(u)$ do not intersect except at an edge, and (2) the edges in $E(T(u))$ are shared by at most two triangles in $T(u)$. In particular,

$$E(T(u)) = E_1(T(u)) \sqcup E_2(T(u)) \quad (\text{disjoint union}), \tag{4}$$

where $E_k(X)$ denotes the set of edges in $E(X)$ shared by k triangles in X. Edges in $E_1(T(u))$ are called the boundary edges of $M(u)$.

Let $u = \{t[i]\}$ be a chain of triangles and let $t \in u$. The normal edge N(t) of t is the edge that is not shared with the adjacent triangles along u. (In Figure 1(c), the normal edges are drawn with thick line segments.) If u is a loop, each triangle in u has exactly one normal edge. If u is not a loop, each endpoint triangle has two normal edges. If u consists of one triangle, say $u = \{t\}$, then t has three normal edges. We denote the set of all normal edges of triangles in u by N(u):

$$N(u) := \{\, N(t) \mid t \in u \,\}. \tag{5}$$

Remark 2.1. The normal edge to a chain is a discrete version of the “normal vector” to a curve.

The pair $\langle M(u), N(u) \rangle$ of $M(u)$ and $N(u)$ is called the local flow of u. Note that we can recover u uniquely from $\langle M(u), N(u) \rangle$ by connecting triangles in $M(u)$ along the edges in $N(u)$, i.e.,

Lemma 2.2. Let $u = \{t[i]\}$ be a chain of triangles. Then, there is a one-to-one correspondence

$$u \longleftrightarrow \langle M(u), N(u) \rangle. \tag{6}$$

In the same way, a chain u of tetrahedra is defined as a series of tetrahedra connected by a common face. The normal edge of a tetrahedron in a trajectory is defined as the edge that is not shared with the two adjacent tetrahedra along the trajectory.
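The definition of normal edges for chains of triangles can be tried out on a small example. The following is a minimal sketch, not the author's implementation: triangles are assumed to be given as vertex triples, and the helper names `edges` and `normal_edges` are hypothetical.

```python
from itertools import combinations

def edges(tri):
    """Return the three undirected edges of a triangle given as a vertex triple."""
    return {frozenset(e) for e in combinations(tri, 2)}

def normal_edges(chain, closed=False):
    """For each triangle of the chain, collect the edges that are not shared
    with its neighbours along the chain (its normal edges)."""
    n = len(chain)
    result = []
    for i, tri in enumerate(chain):
        neighbours = []
        if closed or i > 0:
            neighbours.append(chain[(i - 1) % n])
        if closed or i < n - 1:
            neighbours.append(chain[(i + 1) % n])
        shared = set()
        for other in neighbours:
            if other is not tri:
                shared |= edges(tri) & edges(other)
        result.append(edges(tri) - shared)
    return result

# A loop of six triangles around the centre vertex 0 (illustrative data).
hexagon = [(0, k, k % 6 + 1) for k in range(1, 7)]
loop_normals = normal_edges(hexagon, closed=True)

# An open chain made of the first three triangles of the same hexagon.
open_normals = normal_edges(hexagon[:3], closed=False)

# A chain consisting of a single triangle.
single_normals = normal_edges([(0, 1, 2)])
```

In this sketch every triangle of the loop carries one normal edge (its rim edge), the endpoint triangles of the open chain carry two, and the single triangle carries three, in line with the cases listed above.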

2.2. Loop Interaction

Let $u_a$ and $u_b$ be two chains of triangles. Binary set operations between $M(u_a)$ and $M(u_b)$ are defined by

$$M(u_a) \ast M(u_b) := \langle T(u_a) \ast T(u_b),\ E(T(u_a) \ast T(u_b)),\ V(T(u_a) \ast T(u_b)) \rangle, \tag{7}$$

where $\ast$ stands for a set operation such as $\cup$, $\cap$, $\setminus$, and others. Binary relations between $M(u_a)$ and $M(u_b)$ are defined by

$$M(u_a) \mathrel{R} M(u_b) \quad \text{if and only if} \quad T(u_a) \mathrel{R} T(u_b), \tag{8}$$

where $R$ stands for a binary relation such as $\subset$, $=$, and others.

Remark 2.3. In this paper, we suppose that $E_k(T(u_a) \ast T(u_b)) = \emptyset$ for k > 2. For example, we consider $M(u_a) \cup M(u_b)$ only for $u_a$ and $u_b$ with no overlap.

Definition 2.4. (Reaction Intermediate) Let $u_a, u_b, \ldots, u_c$ be loops of triangles. $M(u_a) \cup M(u_b) \cup \cdots \cup M(u_c)$ is called loop-transitive if there are loops $u, u_1, u_2, \ldots, u_m$ such that

$$\begin{cases} T(u) \subset T(u_a) \cup T(u_b) \cup \cdots \cup T(u_c), \\ E(T(u)) \supset E_1(T(u_a) \cup T(u_b) \cup \cdots \cup T(u_c)), \\ M(u_a) \cup M(u_b) \cup \cdots \cup M(u_c) = M(u) \cup M(u_1) \cup M(u_2) \cup \cdots \cup M(u_m). \end{cases} \tag{9}$$

Then, we write

$$u \simeq u_a + u_b + \cdots + u_c. \tag{10}$$

u is called a reaction intermediate generated from $u_a, u_b, \ldots, u_c$.

Remark 2.5. If u is a reaction intermediate generated from $u_a, u_b, \ldots, u_c$, then $M(u)$ has the same contour as $M(u_a) \cup M(u_b) \cup \cdots \cup M(u_c)$, but has internal holes occupied by the meshes $M(u_i)$ of the loops $u_1, u_2, \ldots, u_m$.

Remark 2.6. $M(u_a) \cup M(u_b) \cup \cdots \cup M(u_c)$ is called strictly loop-transitive if m = 0, i.e., there is a loop u such that $M(u_a) \cup M(u_b) \cup \cdots \cup M(u_c) = M(u)$.

In nature, proteins interact with each other by forming “transient protein complexes”. In the loop model, interactions between loops are defined as fusions of loops.

Definition 2.7. (Loop Interaction) Loops $u_a, u_b, \ldots, u_c$ are called interactable if there is a reaction intermediate generated from $u_a, u_b, \ldots, u_c$.

Example 2.8. In Figure 3, $u_a$ and $u_{ef}$ are interactable and their reaction intermediate is shown in the upper middle of the figure. On the other hand, $u_{reg}$ and $u_a$ are not interactable. For example, the “intermediate” shown in the lower middle of the figure encloses an “open” trajectory of length 6.

2.3. Allosteric Regulation in the Loop Model

“Allosteric regulation” is a type of interaction between two distal sites of a protein. For example, in the case of allosteric enzymes or receptors, the interaction of a protein with a molecule or protein (called the “regulator”) at an “active” site is affected by the binding of another molecule (called the “effector”) at a remote “allosteric” site. Allosteric sites have attracted attention as targets for safer drug discovery [5]. However, allosteric site prediction remains challenging [6]. Shown in Figure 3 is a schematic diagram of allosteric regulation with three loops, where $u_a$ interacts with the “regulator” $u_{reg}$ only after interaction with the “effector” $u_{ef}$.

Remark 2.9. In the loop model, molecules are also represented as a loop of triangles or tetrahedra.

Definition 2.10. (Allosteric Triplet) Let $u_a$, $u_{reg}$, and $u_{ef}$ be loops of triangles. The triplet $\langle u_a, u_{reg}, u_{ef} \rangle$ of loops is called an allosteric triplet if there are two reaction intermediates $u_x$, $u_y$ such that

$$\begin{cases} u_x \simeq u_a + u_{ef} \ \text{ and } \ u_y \simeq u_{reg} + u_a + u_{ef}, \\ u \not\simeq u_{reg} + u_a \ \text{ for any loop } u. \end{cases} \tag{11}$$

Figure 3. Allosteric triplet. $u_a$ and $u_{ef}$ generate a reaction intermediate $u_x$ (upper middle). $u_{reg}$ and $u_a$ do not generate any reaction intermediate (lower middle). $u_{reg}$ and $u_x$ form a reaction intermediate $u_y$ (right).

Example 2.11. Loops $u_a$, $u_{reg}$, $u_{ef}$ in Figure 3 form an allosteric triplet.

Remark 2.12. When effectors bind to proteins, they often change the conformation of the protein. Since the loop model is a “topological” model, the conformational changes of proteins are considered to be absorbed into deformations of triangles in the mesh.

3. Flow Model of Proteins

In the flow model (Figure 1(c)), amino acid sequences are described as closed trajectories in a flow of triangles (2D model) or tetrahedra (3D model).

3.1. Meshes

In this paper, a triangular mesh is a collection of triangles connected by common edges. We write a triangular mesh M as a triplet of sets, i.e.,

$$M := \langle T,\ E(T),\ V(T) \rangle, \tag{12}$$

where

$$\begin{cases} T := \text{a set of triangles}, \\ E(T) := \text{the set of all edges of the triangles in } T, \\ V(T) := \text{the set of all vertices of the triangles in } T. \end{cases} \tag{13}$$

In this paper, we consider only the meshes such that (1) two triangles in T do not intersect except at an edge, and (2) the edges in $E(T)$ are shared by at most two triangles in T. In particular,

$$E(T) = E_1(T) \sqcup E_2(T) \quad (\text{disjoint union}). \tag{14}$$

Edges in $E_1(T)$ are called the boundary edges of M. Binary set operations and binary relations between $M_a$ and $M_b$ are defined in the same way as for $M(u_a)$ and $M(u_b)$.

Definition 3.1. (Reaction Intermediate) Let $M = \langle T, E(T), V(T) \rangle$ be a triangular mesh. M is called loop-transitive if there are loops $u, u_1, u_2, \ldots, u_m$ such that

$$\begin{cases} T(u) \subset T, \\ E(T(u)) \supset E_1(T), \\ M = M(u) \cup M(u_1) \cup M(u_2) \cup \cdots \cup M(u_m). \end{cases} \tag{15}$$

Then, we write

$$M \simeq M(u). \tag{16}$$

u is called a reaction intermediate generated from M.

Remark 3.2. M is called strictly loop-transitive if m = 0, i.e., there is a loop u such that $M = M(u)$.

Example 3.3. Shown in the upper middle of Figure 3 is a reaction intermediate generated from $M(u_a) \cup M(u_{ef})$. Shown in the right of Figure 3 is a reaction intermediate generated from $M(u_{reg}) \cup M(u_a) \cup M(u_{ef})$.

In the same way, a tetrahedral mesh is a set of tetrahedra connected by common faces. We write a tetrahedral mesh M as a quartet of sets, i.e.,

$$M := \langle F,\ T(F),\ E(F),\ V(F) \rangle, \tag{17}$$

where F is a set of tetrahedra, T(F) is the set of all faces (i.e., triangles) of the tetrahedra in F, E(F) is the set of all edges of the tetrahedra in F, and V(F) is the set of all vertices of the tetrahedra in F.

3.2. Directed Elements of a Mesh

Let $M = \langle T, E(T), V(T) \rangle$ be a triangular mesh. To specify flows on M, we consider directed elements of M, i.e., directed vertices, directed edges, and directed triangles.

Since vertices have no direction, the set $S_0(M)$ of all directed vertices in M is defined by

$$S_0(M) := V(T). \tag{18}$$

Edges in M are denoted by their two endpoints, i.e., the edge joining vertices $v_a$ and $v_b$ is denoted by $v_a v_b$. We distinguish between the two edges $v_a v_b$ and $v_b v_a$, where $v_a v_b$ is the edge directed from vertex $v_a$ to vertex $v_b$. The corresponding edge in $E(T)$ is denoted by $|v_a v_b|$, i.e., $|v_a v_b| = |v_b v_a| \in E(T)$. Then, the set $S_1(M)$ of all directed edges in M is defined by

$$S_1(M) := \{\, v_a v_b \mid |v_a v_b| \in E(T) \,\}. \tag{19}$$

Triangles in M are denoted by their three vertices, i.e., the triangle with three vertices $v_a$, $v_b$, and $v_c$ is denoted by $v_a v_b v_c$. We distinguish between triangles with different orders of vertices. For example, $v_a v_b v_c \neq v_a v_c v_b$, where $v_a v_b v_c$ is the triangle with the ordered triplet $v_a, v_b, v_c$. The corresponding triangle in T is denoted by $|v_a v_b v_c|$, i.e., $|v_a v_b v_c| = |v_a v_c v_b| \in T$. Then, the set $S_2(M)$ of all directed triangles in M is defined by

$$S_2(M) := \{\, v_a v_b v_c \mid |v_a v_b v_c| \in T \,\}. \tag{20}$$

3.3. Local Flows on a Mesh

Let M be a triangular mesh. A local flow on M is defined as a subset N of $S_1(M)$. Elements of N are called the normal edges of the local flow. We often denote a local flow N on M by $\langle M, N \rangle$ (i.e., as a pair of M and N). Binary relations between $\langle M_a, N_a \rangle$ and $\langle M_b, N_b \rangle$ are defined by

$$\langle M_a, N_a \rangle \mathrel{R} \langle M_b, N_b \rangle \quad \text{if and only if} \quad M_a \mathrel{R} M_b \ \text{ and } \ N_a \mathrel{R} N_b, \tag{21}$$

where $R$ stands for a binary relation such as $\subset$, $=$, and others.

Example 3.4. Shown in Figure 1(c) is part of a local flow, where the normal edges are drawn with thick line segments.

Remark 3.5. Recall that a chain u of triangles can be recovered from the underlying mesh $M(u)$ by giving the set $N(u)$ of its normal edges (Lemma 2.2).

To perform “differential geometric analysis” of a local flow $\langle M, N \rangle$, we assign a “gradient” to the edges in $S_1(M)$. First, we assign a “height” to the vertices in $S_0(M)$. Then, the “gradient” of an edge in $S_1(M)$ is computed as the difference of the heights along the edge. Finally, the normal edge of a triangle in $S_2(M)$ is defined as the “steepest” edge of the triangle.

The height $\alpha(v_a)$ of a vertex $v_a \in S_0(M)$ is given by an integer-valued function defined on $S_0(M)$, i.e.,

$$\alpha : S_0(M) \to \mathbb{Z}. \tag{22}$$

Remark 3.6. Vertices are considered to be lifted vertically from the mesh M to their “height”.

The gradient $\beta_\alpha(v_a v_b)$ of a directed edge $v_a v_b \in S_1(M)$ with respect to $\alpha$ is the difference in the height function $\alpha$ along the edge, i.e.,

$$\beta_\alpha : S_1(M) \to \mathbb{Z}, \quad \beta_\alpha(v_a v_b) := \alpha(v_b) - \alpha(v_a). \tag{23}$$

Note that $\beta_\alpha(v_a v_b) = -\beta_\alpha(v_b v_a)$.
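As an illustration of heights and gradients, the following minimal sketch computes the gradient of a height function on a two-triangle mesh. The vertex names, the height values, and the helpers `directed_edges` and `gradient` are assumptions made for this example only.

```python
# Heights on the four vertices of two triangles sharing the edge |v1 v2|.
alpha = {"v0": 0, "v1": 1, "v2": -1, "v3": 0}
triangles = [("v0", "v1", "v2"), ("v1", "v3", "v2")]

def directed_edges(tris):
    """Return the set of directed edges: both directions of every triangle edge."""
    s1 = set()
    for a, b, c in tris:
        for p, q in ((a, b), (b, c), (c, a)):
            s1.add((p, q))
            s1.add((q, p))
    return s1

def gradient(alpha, s1):
    """Gradient of alpha: height at the head minus height at the tail of each edge."""
    return {(p, q): alpha[q] - alpha[p] for (p, q) in s1}

beta = gradient(alpha, directed_edges(triangles))
```

Here the shared edge directed from `v2` to `v1` receives the gradient 2, and reversing any edge flips the sign of its gradient.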

Let $\beta$ be a $\mathbb{Z}$-valued function on $S_1(M)$. The normal edge $n_\beta(v_a v_b v_c)$ of $v_a v_b v_c \in S_2(M)$ with respect to $\beta$ is defined as the steepest (positive) edge of the triangle with respect to $\beta$, i.e., $n_\beta : S_2(M) \to S_1(M)$,

$$n_\beta(v_a v_b v_c) := \begin{cases} v_i v_j & \text{if } |\beta(v_i v_j)| > |\beta(v_j v_k)|, |\beta(v_k v_i)| \text{ and } \beta(v_i v_j) > 0, \\ v_j v_i & \text{if } |\beta(v_i v_j)| > |\beta(v_j v_k)|, |\beta(v_k v_i)| \text{ and } \beta(v_i v_j) < 0, \\ \emptyset & \text{otherwise}, \end{cases} \tag{24}$$

where $\{i, j, k\} = \{a, b, c\}$.

Remark 3.7. We select edges with positive values as the normal edge of a triangle.

We denote the set of all normal edges of M with respect to $\beta$ by $N(\beta)$, i.e.,

$$N(\beta) := \{\, n_\beta(v_a v_b v_c) \mid v_a v_b v_c \in S_2(M) \,\} \subset S_1(M). \tag{25}$$

The corresponding edges in $E(T)$ are denoted by $|N(\beta)|$, i.e.,

$$|N(\beta)| := \{\, |n_\beta(v_a v_b v_c)| \mid v_a v_b v_c \in S_2(M) \,\} \subset E(T). \tag{26}$$

A triangle with exactly one normal edge is called regular. Triangles with no normal edge and triangles with two or more normal edges are called singular. A local flow N is called regular if every triangle is regular. For example, a local flow corresponding to a loop (i.e., $|N(\beta)| = N(u)$ for some loop u) is regular.

Remark 3.8. Triangles may have multiple normal edges because some edges are shared by two triangles.
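The selection of the steepest edge and the regularity test can be sketched as follows. This is an illustration under stated assumptions, not the author's code: the vector field is taken to be the gradient of a height function, and the names `gradient`, `normal_edge`, and `normal_edge_counts` are hypothetical.

```python
def gradient(alpha, tris):
    """Gradient of the heights alpha on every directed edge of the triangles."""
    beta = {}
    for a, b, c in tris:
        for p, q in ((a, b), (b, c), (c, a)):
            beta[(p, q)] = alpha[q] - alpha[p]
            beta[(q, p)] = alpha[p] - alpha[q]
    return beta

def normal_edge(tri, beta):
    """Strictly steepest edge of the triangle, directed so that beta is positive.
    Returns None when no edge is strictly steeper than the other two."""
    a, b, c = tri
    cyc = [(a, b), (b, c), (c, a)]
    mags = [abs(beta[e]) for e in cyc]
    top = max(mags)
    if mags.count(top) != 1:
        return None
    p, q = cyc[mags.index(top)]
    return (p, q) if beta[(p, q)] > 0 else (q, p)

def normal_edge_counts(tris, beta):
    """Number of normal edges lying on each triangle (1 means regular)."""
    chosen = (normal_edge(t, beta) for t in tris)
    normals = {frozenset(e) for e in chosen if e}
    counts = []
    for a, b, c in tris:
        sides = [frozenset((a, b)), frozenset((b, c)), frozenset((c, a))]
        counts.append(sum(s in normals for s in sides))
    return counts

tris = [("v0", "v1", "v2"), ("v1", "v3", "v2")]
sloped = gradient({"v0": 0, "v1": 1, "v2": -1, "v3": 0}, tris)
flat = gradient({"v0": 0, "v1": 0, "v2": 0, "v3": 0}, tris)

sloped_counts = normal_edge_counts(tris, sloped)
flat_counts = normal_edge_counts(tris, flat)
```

With the sloped heights both triangles pick the shared edge directed from `v2` to `v1`, so each carries one normal edge. With flat heights no edge is steepest and both triangles are singular.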

A local flow $\langle M, N \rangle$ is called differentiable if there is a $\mathbb{Z}$-valued function $\beta$ on $S_1(M)$ such that $N = N(\beta)$. $\beta$ is called the vector field of N and denoted by $\beta(N)$.

A differentiable local flow $\langle M, N \rangle$ is called integrable if there is a $\mathbb{Z}$-valued function $\alpha$ on $S_0(M)$ such that $N = N(\beta_\alpha)$. $\alpha$ is called a potential function of N and denoted by $\alpha(N)$.

A differentiable local flow $\langle M, N(\beta) \rangle$ is called 2-bounded if

$$0 < |\beta(v_a v_b)| \le 2 \quad \text{for } v_a v_b \in S_1(M). \tag{27}$$

Proposition 3.9. Let $\langle M, N(\beta) \rangle$ be a 2-bounded differentiable local flow. Then, $N(\beta)$ is regular if

$$\beta(v_a v_b) + \beta(v_b v_c) + \beta(v_c v_a) = 0 \quad \text{for } v_a v_b v_c \in S_2(M). \tag{28}$$

Proof. Let $v_a v_b v_c \in S_2(M)$. Since $\langle M, N(\beta) \rangle$ is 2-bounded and the sum in (28) vanishes, $\beta(v_i v_j) = 2$, $\beta(v_i v_k) = 1$, and $\beta(v_k v_j) = 1$ for some $i, j, k$ such that $\{i, j, k\} = \{a, b, c\}$. That is, $n_\beta(v_a v_b v_c) = v_i v_j$. On the other hand, $v_i v_k$ and $v_k v_j$ are not contained in $N(\beta)$ because $|\beta(v_i v_k)|, |\beta(v_k v_j)| < 2$. ∎

Remark 3.10. $\beta(v_a v_b) + \beta(v_b v_c) + \beta(v_c v_a)$ is called the circulation of $\beta$ around a triangle $v_a v_b v_c$.

Example 3.11. By piling unit cubes up diagonally in the direction of $(1, 1, 1)$, we obtain a regular local flow of triangles on the surface of the piled cubes (Figure 4). The normal edges are the vertical diagonals of the unit cubes. That is, each upper face of a unit cube is divided into two triangles by the vertical diagonal. Then, connecting triangles along the vertical diagonals, we obtain a flow on the surface of piled cubes. A triangular mesh M resides on the hyperplane $x + y + z = 0$, and the height of a point $p = (l, m, n) \in \mathbb{Z}^3$ over

$$\pi(p) := \left( \frac{2l - m - n}{3},\ \frac{-l + 2m - n}{3},\ \frac{-l - m + 2n}{3} \right) \in M \tag{29}$$

is given by

$$\alpha(\pi(p)) = l + m + n. \tag{30}$$

Figure 4. Examples of local flows. (a) Top view of five local flows obtained by piling unit cubes. The normal edges are drawn in thick lines. If one more cube is put on the surface, the local flow will change as indicated by the arrows; (b) The regular flow shown in the upper middle of (a); (c) The non-regular flow shown in the lower middle of (a); (d) The singular triangles indicated by the S-shaped arrows in (a) and (c). Note that they are not obtained by dividing faces of a unit cube.

Note that the local flow of Figure 4(c) is differentiable but not 2-bounded due to the triangles pointed to by the S-shaped arrow. On the other hand, the local flow of Figure 4(b) is differentiable and 2-bounded.
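The projection and height function of Example 3.11 can be checked numerically. The sketch below assumes that the projection is the orthogonal projection of a lattice point onto the hyperplane x + y + z = 0 along (1, 1, 1) and that the height is the coordinate sum; the function names `project` and `height` are hypothetical.

```python
from fractions import Fraction

def project(p):
    """Orthogonal projection of p = (l, m, n) onto the plane x + y + z = 0."""
    l, m, n = p
    return (Fraction(2 * l - m - n, 3),
            Fraction(-l + 2 * m - n, 3),
            Fraction(-l - m + 2 * n, 3))

def height(p):
    """Height of the lattice point p above its projection: l + m + n."""
    l, m, n = p
    return l + m + n

# Two lattice points that differ by (1, 1, 1) project to the same mesh point.
base = project((2, 1, 0))
lifted = project((3, 2, 1))

# Height change along a cube edge and along a diagonal of an upper face.
edge_step = height((1, 0, 1)) - height((0, 0, 1))
diagonal_step = height((1, 1, 1)) - height((0, 0, 1))
```

In this sketch a cube edge changes the height by 1 and the face diagonal from (0, 0, 1) to (1, 1, 1) changes it by 2, which is compatible with the bound on the gradient in the definition of 2-bounded flows.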

3.4. Interaction of Loops in a Flow

Let $\langle M, N \rangle$ be a local flow. A trajectory of $\langle M, N \rangle$ is a chain u of triangles such that

$$M(u) \subset M \quad \text{and} \quad N(u) \subset N. \tag{31}$$

A closed trajectory of $\langle M, N \rangle$ is called a loop of $\langle M, N \rangle$. A trajectory is called maximal if it cannot be extended further within M. Let $\phi = \{u_a, u_b, \ldots, u_c\}$ be a set of trajectories of $\langle M, N \rangle$. $\phi$ is called a flow of $\langle M, N \rangle$ if

$$M = M(u_a) \cup M(u_b) \cup \cdots \cup M(u_c). \tag{32}$$

Let $\langle M, N \rangle$ be a local flow, where $M = \langle T, E(T), V(T) \rangle$. $\langle M, N \rangle$ is called closed if $E_1(T) \subset N$, i.e., the boundary edges of T are normal edges. $\langle M, N \rangle$ is called finite if T consists of finitely many triangles.

Proposition 3.12. Let $\langle M, N \rangle$ be a closed finite local flow. Then,

$$M = M(u_1) \cup M(u_2) \cup \cdots \cup M(u_m) \tag{33}$$

for some loops $u_1, u_2, \ldots, u_m$ of $\langle M, N \rangle$ if N is regular.

Proof. Since $E_1(T) \subset N$, trajectories of $\langle M, N \rangle$ do not cross the boundary of M. Since N is regular, maximal trajectories of $\langle M, N \rangle$ have no endpoint. The result follows immediately. ∎
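The decomposition of a closed, finite, regular local flow into loops can be illustrated by tracing trajectories across non-normal edges. The following sketch is an assumption-laden illustration rather than the author's algorithm: normal edges are given as undirected vertex pairs, the mesh (two hexagons glued along one edge) is made up for the example, and the helpers `sides`, `adjacency`, and `trajectories` are hypothetical.

```python
from collections import defaultdict

def sides(tri):
    """Undirected edges of a triangle given as a vertex triple."""
    a, b, c = tri
    return [frozenset((a, b)), frozenset((b, c)), frozenset((c, a))]

def adjacency(tris, normals):
    """Link two triangles when they share an edge that is not a normal edge."""
    by_edge = defaultdict(list)
    for i, tri in enumerate(tris):
        for s in sides(tri):
            if s not in normals:
                by_edge[s].append(i)
    adj = {i: set() for i in range(len(tris))}
    for members in by_edge.values():
        if len(members) == 2:
            i, j = members
            adj[i].add(j)
            adj[j].add(i)
    return adj

def trajectories(adj):
    """Connected components of the adjacency graph, as sorted index lists."""
    seen, groups = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            i = stack.pop()
            group.append(i)
            for j in adj[i]:
                if j not in seen:
                    seen.add(j)
                    stack.append(j)
        groups.append(sorted(group))
    return groups

# Two hexagons (centres 0 and 7) glued along the edge {1, 2}.
hex_a = [(0, k, k % 6 + 1) for k in range(1, 7)]
ring_b = [2, 1, 8, 9, 10, 11]
hex_b = [(7, ring_b[k], ring_b[(k + 1) % 6]) for k in range(6)]
tris = hex_a + hex_b
# Assumed normal edges: every rim edge, including the glued edge {1, 2}.
normals = {frozenset(t[1:]) for t in tris}

adj = adjacency(tris, normals)
loops = trajectories(adj)
```

Every triangle in this example has one normal edge and two neighbours across its other edges, so each component is a closed trajectory; the mesh splits into two loops of length six.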

Definition 3.13. (Loop Interaction) Let $u_a, u_b, \ldots, u_c$ be loops of $\langle M, N \rangle$. $u_a, u_b, \ldots, u_c$ are called interactable if there is a reaction intermediate generated from $u_a, u_b, \ldots, u_c$.

Example 3.14. Shown in the upper middle of Figure 3 is a reaction intermediate generated from $u_a$, $u_{ef}$. Shown in the right of Figure 3 is a reaction intermediate generated from $u_{reg}$, $u_a$, $u_{ef}$.

3.5. Allosteric Regulation in the Flow Model

Definition 3.15. (Reaction Precursor) Let $M = \langle T, E(T), V(T) \rangle$ be a triangular mesh. M is called pre-loop-transitive if there is a loop u such that

$$\begin{cases} T(u) \subset T, \\ E(T(u)) \supset E_1(T). \end{cases} \tag{34}$$

Then, we write

$$M \sim M(u). \tag{35}$$

u is called a reaction precursor generated from M. By definition, reaction intermediates are reaction precursors.

Remark 3.16. If u is a reaction precursor of M, then $M(u)$ may have internal holes with singular triangles inside.

Example 3.17. Shown in the lower middle of Figure 3 is a reaction precursor generated from $M(u_{reg}) \cup M(u_a)$. It has a hole with two singular triangles, i.e., the endpoints of the “open” trajectory of length 6.

Proposition 3.18. Let M be a triangular mesh. Suppose that there is a reaction precursor u generated from M of finite length, i.e., $M \sim M(u)$. Let $\langle M, N(\beta) \rangle$ be a 2-bounded differentiable local flow such that

$$\langle M(u), N(u) \rangle \subset \langle M, N(\beta) \rangle. \tag{36}$$

Then, u is a reaction intermediate (i.e., M is loop-transitive) if $\langle M, N(\beta) \rangle$ is integrable.

Proof. Since $\langle M, N(\beta) \rangle$ is integrable, there is a potential function $\alpha$ such that $N(\beta) = N(\beta_\alpha)$. Then,

$$\beta_\alpha(v_a v_b) + \beta_\alpha(v_b v_c) + \beta_\alpha(v_c v_a) = 0 \quad \text{for } v_a v_b v_c \in S_2(M). \tag{37}$$

The result follows immediately from Proposition 3.9 and Proposition 3.12. ∎

Definition 3.19. (Pre-Allosteric Triplet) Let $u_a$, $u_{reg}$, and $u_{ef}$ be loops of $\langle M, N \rangle$. The triplet $\langle u_a, u_{reg}, u_{ef} \rangle$ of loops is called a pre-allosteric triplet if there are two reaction precursors $u_x$, $u_y$ such that

$$\begin{cases} M(u_a) \cup M(u_{ef}) \sim M(u_x) \ \text{ and } \ M(u_{reg}) \cup M(u_a) \cup M(u_{ef}) \sim M(u_y), \\ M(u_a) \cup M(u_{reg}) \not\simeq M(u) \ \text{ for any loop } u. \end{cases} \tag{38}$$

Corollary 3.20. (Conditions for loop interaction) Let $\langle M, N \rangle$ be a local flow. Let $u_a$ and $u_b$ be loops of $\langle M, N \rangle$. Suppose that there is a reaction precursor u generated from $u_a$ and $u_b$ of finite length, i.e.,

$$M(u_a) \cup M(u_b) \sim M(u). \tag{39}$$

Let $\langle M, N(\beta) \rangle$ be a 2-bounded differentiable local flow such that

$$\langle M(u), N(u) \rangle \subset \langle M, N(\beta) \rangle. \tag{40}$$

Then, $u_a$ and $u_b$ are interactable if $\langle M, N(\beta) \rangle$ is integrable.

Corollary 3.21. (Conditions for allosteric triplet) Let $\langle u_a, u_{reg}, u_{ef} \rangle$ be a pre-allosteric triplet such that

$$M(u_a) \cup M(u_{ef}) \sim M(u_x) \quad \text{and} \quad M(u_{reg}) \cup M(u_a) \cup M(u_{ef}) \sim M(u_y), \tag{41}$$

where $u_x$ and $u_y$ are reaction precursors of finite length. Let $\langle M, N(\beta_x) \rangle$ and $\langle M, N(\beta_y) \rangle$ be 2-bounded differentiable local flows such that

$$\langle M(u_x), N(u_x) \rangle \subset \langle M, N(\beta_x) \rangle \quad \text{and} \quad \langle M(u_y), N(u_y) \rangle \subset \langle M, N(\beta_y) \rangle. \tag{42}$$

Then, $\langle u_a, u_{reg}, u_{ef} \rangle$ is an allosteric triplet if $\langle M, N(\beta_x) \rangle$ and $\langle M, N(\beta_y) \rangle$ are integrable.

3.6. Post-Translational Modification in the Flow Model

“Post-translational modifications (PTMs)” are biochemical modifications of the side chains of amino acids within a protein after their biosynthesis. They have a significant impact on the structure and function of proteins. For example, they play critical roles in regulating the stability of the 3D structure of proteins and their interactions with other molecules and proteins. In particular, the analysis of PTMs is important for the study of diseases, such as heart disease, cancer, and diabetes [7].

In the flow model, the effect of PTMs can be understood as a resolution of singularities of a local flow. That is, PTMs control the stability of a loop by inducing a resolution of the nearby singularity of the local flow.

It is helpful to consider the curvature of vertices to intuitively understand the geometric effects of singularity resolutions. Let $M = \langle T, E(T), V(T) \rangle$ be a triangular mesh. Let $v \in V(T)$. The curvature $K(v)$ of v is defined by

$$K(v) := k - 6, \tag{43}$$

where k is the number of edges incident on v.

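Remark 3.22. In comparison to the continuous version of geometry, K corresponds to the “Gaussian curvature” of a surface, up to sign. In the continuous setting, a “saddle point” has negative curvature and a “point on a hemisphere” has positive curvature. With the convention $K(v) = k - 6$, a saddle-like vertex ($k > 6$) has positive K and a hemisphere-like vertex ($k < 6$) has negative K.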

Example 3.23. For triangular meshes obtained by piling unit cubes (Example 3.11), the curvature is zero at all vertices (Figure 4).

Example 3.24. In Figure 1(c), a loop encloses two singular triangles. By splitting each of the singular triangles into two, we obtain a loop of length four (Figure 2(b) dark grey). The curvatures of the vertices of the singular triangles increase by 1 to become positive or zero (i.e., saddle-point). On the other hand, the new vertex obtained in the center has negative curvature −2 (i.e., point on a hemisphere).
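The effect of splitting triangles on the vertex curvature can be reproduced on a small patch. The sketch below assumes the convention K(v) = k − 6, where k is the number of incident edges, a sign inferred from the value −2 quoted in Example 3.24. The two-triangle patch and the helper name `curvature` are hypothetical, and since the patch is isolated only the changes in curvature are meaningful.

```python
from collections import defaultdict

def curvature(tris):
    """K(v) = (number of edges incident on v) - 6 for every vertex of the mesh."""
    neighbours = defaultdict(set)
    for a, b, c in tris:
        for p, q in ((a, b), (b, c), (c, a)):
            neighbours[p].add(q)
            neighbours[q].add(p)
    return {v: len(ns) - 6 for v, ns in neighbours.items()}

# Two triangles sharing the edge |b c|.
before = curvature([("a", "b", "c"), ("b", "d", "c")])

# The same patch after inserting a new vertex m on |b c|,
# which splits each of the two triangles into two.
after = curvature([("a", "b", "m"), ("a", "m", "c"),
                   ("b", "d", "m"), ("d", "c", "m")])

# Interior vertex of a flat hexagonal patch, for comparison.
hexagon = curvature([(0, k, k % 6 + 1) for k in range(1, 7)])
```

In this sketch the new vertex `m` has four incident edges and hence curvature −2, the two vertices opposite the split edge gain one incident edge each, and the centre of the flat hexagon has curvature 0.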

4. Cohomological Model of Proteins

In the cohomological model (Figure 1(c)), amino acid sequences are also described as closed trajectories in a flow of triangles (2D model) or tetrahedra (3D model) as in the case of the flow model. Here, we define “cohomology classes” of vector fields on a triangular mesh and rephrase conditions for “interaction” and “allosteric regulation” of loops using the language of cohomology.

4.1. Functions on a Mesh

Let M be a triangular mesh. We denote the sets of all “anti-symmetric” assignments of integers to the vertices, edges, and triangles in M by $F^0(M)$, $F^1(M)$, and $F^2(M)$, respectively, i.e.,

$$F^0(M) := \{\, \alpha : S_0(M) \to \mathbb{Z} \,\}, \tag{44}$$

$$F^1(M) := \{\, \beta : S_1(M) \to \mathbb{Z} \mid \beta(v_a v_b) = -\beta(v_b v_a) \,\}, \tag{45}$$

$$F^2(M) := \{\, \gamma : S_2(M) \to \mathbb{Z} \mid \gamma(v_a v_b v_c) = \gamma(v_b v_c v_a) = \gamma(v_c v_a v_b) = -\gamma(v_a v_c v_b) = -\gamma(v_c v_b v_a) = -\gamma(v_b v_a v_c) \,\}. \tag{46}$$

Elements of $F^0(M)$ are called scalar functions (or potential functions) on M. Elements of $F^1(M)$ are called vector fields on M.

Remark 4.1. In comparison to the continuous version of geometry, $F^0(M)$ corresponds to “scalar functions (or potential functions)” on a space and $F^1(M)$ corresponds to “vector fields” on a space.

Let $X^1(M)$ be a subset of $F^1(M)$ defined by

$$X^1(M) := \{\, \beta \in F^1(M) : S_1(M) \to \{\pm 1, \pm 2, \ldots, \pm n\} \,\}, \tag{47}$$

where n = 2 (if M is a triangular mesh) or n = 3 (if M is a tetrahedral mesh). A vector field $\beta \in F^1(M)$ is called n-bounded if $\beta \in X^1(M)$.

Remark 4.2. In comparison to the continuous version of geometry, $X^1(M)$ corresponds to “differentiable vector fields” on a space.

4.2. Exterior Derivative Operator and Cohomology

Let M be a triangular mesh. Now let’s define “differentials” of functions on M.

The discrete exterior derivative $d_i$ ($i = 0, 1$) is a mapping from $F^i(M)$ to $F^{i+1}(M)$ defined by

$$d_0 : F^0(M) \to F^1(M), \quad d_0\alpha(v_a v_b) := \alpha(v_b) - \alpha(v_a), \tag{48}$$

$$d_1 : F^1(M) \to F^2(M), \quad d_1\beta(v_a v_b v_c) := \beta(v_a v_b) + \beta(v_b v_c) + \beta(v_c v_a) \tag{49}$$

(Figure 5). Note that $d_0\alpha(v_a v_b) = -d_0\alpha(v_b v_a)$ and $d_1\beta(v_a v_b v_c) = d_1\beta(v_b v_c v_a) = d_1\beta(v_c v_a v_b) = -d_1\beta(v_a v_c v_b) = -d_1\beta(v_c v_b v_a) = -d_1\beta(v_b v_a v_c)$.

Remark 4.3. $d_0$ computes the difference of $\alpha \in F^0(M)$ along an edge. On the other hand, $d_1$ computes the circulation of $\beta \in F^1(M)$ around a triangle.

Let $\beta \in F^1(M)$. $\beta$ is called integrable if $\beta = d_0\alpha$ for some $\alpha \in F^0(M)$.

Lemma 4.4. $d_1 d_0 \alpha = 0$ for any $\alpha \in F^0(M)$.

Proof. By the definitions, $d_1 d_0 \alpha(v_a v_b v_c) = (\alpha(v_b) - \alpha(v_a)) + (\alpha(v_c) - \alpha(v_b)) + (\alpha(v_a) - \alpha(v_c)) = 0$. ∎
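The two exterior derivatives and the identity of Lemma 4.4 can be checked on a small mesh. The following is a minimal sketch in which functions on the mesh are stored as dictionaries; the mesh, the heights, and the names `d0` and `d1` are assumptions for illustration.

```python
def d0(alpha, tris):
    """Derivative of a vertex function: one value per directed edge."""
    beta = {}
    for a, b, c in tris:
        for p, q in ((a, b), (b, c), (c, a)):
            beta[(p, q)] = alpha[q] - alpha[p]
            beta[(q, p)] = alpha[p] - alpha[q]
    return beta

def d1(beta, tris):
    """Derivative of an edge function: the circulation around each triangle,
    taken in the vertex order in which the triangle is listed."""
    return {(a, b, c): beta[(a, b)] + beta[(b, c)] + beta[(c, a)]
            for a, b, c in tris}

tris = [("v0", "v1", "v2"), ("v1", "v3", "v2")]
alpha = {"v0": 5, "v1": -2, "v2": 7, "v3": 0}

# The circulation of a gradient vanishes on every triangle.
gamma = d1(d0(alpha, tris), tris)

# Perturbing one edge (anti-symmetrically) breaks this on the adjacent triangle.
beta = d0(alpha, tris)
beta[("v0", "v1")] += 1
beta[("v1", "v0")] -= 1
gamma_perturbed = d1(beta, tris)
```

The perturbed field has circulation 1 around the triangle that contains the modified edge and 0 around the other, so it is no longer the gradient of any height function.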

The discrete exterior co-derivative $\delta_i$ ($i = 0, 1$) is a mapping from $F^{i+1}(M)$ to $F^i(M)$ defined by

$$\delta_0 : F^1(M) \to F^0(M), \quad \delta_0\beta(v_a) := \sum_{\{v_b \,\mid\, v_a v_b \in S_1(M)\}} \beta(v_a v_b), \tag{50}$$

$$\delta_1 : F^2(M) \to F^1(M), \quad \delta_1\gamma(v_a v_b) := \gamma(v_b v_a v_c) - \gamma(v_a v_b v_d), \tag{51}$$

where $v_b v_a v_c$ and $v_a v_b v_d$ are the triangles that share the edge $v_a v_b$ (Figure 5). Note that $\delta_1\gamma(v_a v_b) = -\delta_1\gamma(v_b v_a)$.

Figure 5. Discrete exterior derivative/co-derivative operators. In the figure, $e_{ij}$ represents edge $v_i v_j$ and $t_{ijk}$ represents triangle $v_i v_j v_k$.

Remark 4.5. $\delta_0$ computes the divergence of $\beta \in F^1(M)$ at a vertex, i.e., the sum over the outbound arrows. On the other hand, $\delta_1$ computes the difference of $\gamma \in F^2(M)$ at a common edge.
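The divergence computed by the co-derivative on vector fields can be sketched as a sum over outbound edges. The example below covers only the co-derivative acting on vector fields, not the one acting on functions on triangles; the hexagonal mesh, the heights, and the name `codiff0` are assumptions for illustration.

```python
def codiff0(beta):
    """Divergence at each vertex: the sum of beta over the edges leaving it."""
    div = {}
    for (p, q), value in beta.items():
        div[p] = div.get(p, 0) + value
    return div

# Hexagon: centre vertex 0 at height 1, rim vertices 1..6 at height 0.
tris = [(0, k, k % 6 + 1) for k in range(1, 7)]
alpha = {0: 1, **{k: 0 for k in range(1, 7)}}

# Gradient of alpha on every directed edge of the mesh.
beta = {}
for a, b, c in tris:
    for p, q in ((a, b), (b, c), (c, a)):
        beta[(p, q)] = alpha[q] - alpha[p]
        beta[(q, p)] = alpha[p] - alpha[q]

div = codiff0(beta)
```

Here the centre has divergence −6 (all six outbound edges go downhill) and each rim vertex has divergence 1. The divergences sum to zero over the mesh because every edge contributes once in each direction with opposite signs.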

Example 4.6. Figure 6(a) is a computation of $d_1 d_0 \alpha = 0$ for $\alpha$ on the left. Figure 6(b) is a computation of $d_1\beta$ and $\delta_0\beta$ for $\beta$ in the center.

Remark 4.7. To learn more about discrete exterior derivative/co-derivative operators, see [8] or [9].

To define cohomology of vector fields on M, we consider a short sequence of sets given by

$$F^0(M) \xrightarrow{\ d_0\ } F^1(M) \xrightarrow{\ d_1\ } F^2(M). \tag{52}$$

Two subsets of $F^1(M)$ are defined using the sequence, i.e.,

$$\mathrm{Ker}(d_1) := \{\, \beta \in F^1(M) \mid d_1\beta = 0 \,\}, \tag{53}$$

$$\mathrm{Im}(d_0) := \{\, d_0\alpha \in F^1(M) \mid \alpha \in F^0(M) \,\}. \tag{54}$$

Lemma 4.8. $\mathrm{Im}(d_0) \subset \mathrm{Ker}(d_1)$.

Proof. It follows immediately from Lemma 4.4. ∎

Because of Lemma 4.8, we can consider the quotient set of $\mathrm{Ker}(d_1)$ by $\mathrm{Im}(d_0)$.

Definition 4.9. (Cohomology Class) The (first) cohomology set $H^1(M)$ is defined by

$$H^1(M) := \mathrm{Ker}(d_1) / \mathrm{Im}(d_0) \subset F^1(M) / \mathrm{Im}(d_0). \tag{55}$$

Let $\beta \in F^1(M)$. The equivalence class $\beta \bmod \mathrm{Im}(d_0)$ is called the cohomology class of $\beta$; it belongs to $H^1(M)$ when $d_1\beta = 0$.

Figure 6. Computation examples of discrete exterior derivative/co-derivative operators. In the figure, $e_{ij}$ represents edge $v_i v_j$ and $t_{ijk}$ represents triangle $v_i v_j v_k$.

Lemma 4.10. If $\beta \bmod \mathrm{Im}(d_0) = 0 \in H^1(M)$, then $\beta = d_0\alpha$ for some $\alpha \in F^0(M)$.

Proof. It follows immediately from the definitions. ∎
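Whether the cohomology class of a vector field is zero can be tested by trying to integrate the field along a spanning tree and then checking every edge. The sketch below is an illustration under stated assumptions, not a method taken from the paper: the mesh is assumed connected, the field is stored on both directions of every edge, and the annulus example and the names `annulus_field`, `circulation`, and `find_potential` are hypothetical.

```python
from collections import deque

def annulus_field(n):
    """Triangulated annulus with n inner and n outer vertices, together with a
    2-bounded vector field whose total around the hole is non-zero."""
    beta, tris = {}, []

    def put(p, q, value):
        beta[(p, q)] = value
        beta[(q, p)] = -value           # anti-symmetry

    for k in range(n):
        j = (k + 1) % n
        put(("in", k), ("out", k), 1)   # radial edge
        put(("out", k), ("out", j), 1)  # outer rim
        put(("in", k), ("in", j), 1)    # inner rim
        put(("in", k), ("out", j), 2)   # diagonal
        tris.append((("in", k), ("out", k), ("out", j)))
        tris.append((("in", k), ("out", j), ("in", j)))
    return beta, tris

def circulation(beta, tri):
    """Sum of beta around the triangle, in the listed vertex order."""
    a, b, c = tri
    return beta[(a, b)] + beta[(b, c)] + beta[(c, a)]

def find_potential(beta, root):
    """Return heights alpha whose gradient equals beta, or None if none exist."""
    adj = {}
    for (p, q) in beta:
        adj.setdefault(p, []).append(q)
    alpha = {root: 0}
    queue = deque([root])
    while queue:                         # breadth-first integration from the root
        p = queue.popleft()
        for q in adj[p]:
            if q not in alpha:
                alpha[q] = alpha[p] + beta[(p, q)]
                queue.append(q)
    ok = all(alpha[q] - alpha[p] == b for (p, q), b in beta.items())
    return alpha if ok else None

beta, tris = annulus_field(4)
closed = all(circulation(beta, t) == 0 for t in tris)
potential = find_potential(beta, ("in", 0))

# A field built as a gradient is recovered up to the choice of root height.
heights = {("in", k): k for k in range(4)}
heights.update({("out", k): 2 * k + 1 for k in range(4)})
exact = {(p, q): heights[q] - heights[p] for (p, q) in beta}
recovered = find_potential(exact, ("in", 0))
```

The annulus field has zero circulation around every triangle but no potential, so its cohomology class is non-zero; the obstruction is the hole in the mesh. The gradient field, by contrast, is integrated back to the original heights.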

4.3. Cohomological Conditions for Allosteric Regulation

Let’s rephrase some of the definitions given in Subsection 3.3. Let M be a triangular mesh. Then,

1) A local flow $\langle M, N \rangle$ is differentiable if $N = N(\beta)$ for some $\beta \in F^1(M)$,

2) A differentiable local flow $\langle M, N(\beta) \rangle$ is integrable if $\beta$ is integrable (i.e., $\beta = d_0\alpha$ for some $\alpha \in F^0(M)$),

3) A differentiable local flow $\langle M, N(\beta) \rangle$ is 2-bounded if $\beta \in X^1(M)$.

Lemma 4.11. Let $\langle M, N(\beta) \rangle$ be a differentiable local flow. Then, $d_1\beta = 0$ if $\langle M, N(\beta) \rangle$ is integrable.

Proof. It follows immediately from Lemma 4.4. ∎

Then, we obtain a cohomological description of conditions for loop interaction.

Proposition 4.12. Let M be a triangular mesh. Suppose that there is a reaction precursor u generated from M of finite length, i.e., $M \sim M(u)$. Let $\langle M, N(\beta) \rangle$ be a 2-bounded differentiable local flow such that

$$\langle M(u), N(u) \rangle \subset \langle M, N(\beta) \rangle. \tag{56}$$

Then, u is a reaction intermediate (i.e., M is loop-transitive) if the cohomology class of $\beta$ is zero.

Proof. It follows immediately from Proposition 3.18. ∎

Corollary 4.13. (Conditions for loop interaction)

Let $\langle M, N \rangle$ be a local flow. Let $u_a$ and $u_b$ be loops of $\langle M, N \rangle$. Suppose that there is a reaction precursor u generated from $u_a$ and $u_b$ of finite length, i.e.,

$$M(u_a) \cup M(u_b) \sim M(u). \tag{57}$$

Let $\langle M, N(\beta) \rangle$ be a 2-bounded differentiable local flow such that

$$\langle M(u), N(u) \rangle \subset \langle M, N(\beta) \rangle. \tag{58}$$

Then, $u_a$ and $u_b$ are interactable if the cohomology class of $\beta$ is zero.

Corollary 4.14. (Conditions for allosteric triplet)

Let $u_a, u_{reg}, u_{ef}$ be a pre-allosteric triplet such that

M ( u a ) M ( u e f ) ~ M ( u x ) and M ( u r e g ) M ( u a ) M ( u e f ) ~ M ( u y ) . (59)

Let $\langle M, N(\beta_x) \rangle$ and $\langle M, N(\beta_y) \rangle$ be 2-bounded differentiable local flows such that

M ( u ) , N ( u ) M , N ( β x ) and M ( u ) , N ( u ) M , N ( β y ) . (60)

where $u_x$ and $u_y$ are reaction precursors of finite length.

Then, $u_a, u_{reg}, u_{ef}$ is an allosteric triplet if the cohomology classes of $\beta_x$ and $\beta_y$ are zero.

5. Conclusions

We have considered protein interactions from the viewpoint of cohomology theory, using two-dimensional toy models of proteins. As a specific example, cohomological conditions for allosteric regulation are presented. In this paper, proteins are represented as loops of triangles and protein interactions are represented as fusions of loops. Then, cohomology classes of vector fields on proteins (i.e., a triangular mesh) are defined using discrete exterior operators.

Cohomological conditions for loop-interaction (i.e., protein interaction) are obtained as follows. First, we define reaction intermediates and their precursors generated from a given set of loops. By definition, loops interact if there is a reaction intermediate generated from the loops. Conditions for a precursor to be a reaction intermediate are then given using the language of “differential geometry”. That is, a precursor is a reaction intermediate if the local flow of the precursor is integrable. Finally, the cohomological conditions are obtained by rephrasing the differential geometric conditions using the language of “cohomology”. That is, a precursor is a reaction intermediate if the cohomology class of the vector field on the precursor is zero.

6. Discussion

Currently, since models of allosteric regulation only provide explanations of existing allostery, computer simulations are required to detect unknown allostery. However, the flow/cohomological model, despite its simplicity, is capable of explaining not only the existence of allostery but also its non-existence [10]. In particular, the model can predict the behavior of proteins: “if we remove the obstacle of allostery, we can obtain a new allosteric protein”.

In protein science, when considering protein-protein interactions, only local properties such as local shape complementarity are considered, mainly due to insufficient computer power. However, even local convexities on the surface of proteins are not formed locally, but as a result of global folding. One of the strengths of the flow/cohomological model is that we can consider both the shape of a protein and its folding structure at once. Then, the global properties of proteins can be described using the language of cohomology.

A drawback of the model is that it predicts nothing about the actual protein because it is a 2D model. Therefore, future research directions include the study of the 3D model, where a chain of tetrahedra would be a “backbone” (i.e., streamline) of a flow, rather than a “cage” (i.e., surface of a finite region) of a flow. Then, by detecting “turbulence in a flow” using the language of cohomology, we can predict the behavior of a protein.

Another direction is the study of surface flows, i.e., triangular flows on the surface of folded chains of tetrahedra induced by a tetrahedral flow. As for the 2D model, the study of weaker conditions for loop interaction is also required.

Conflicts of Interest

The author declares no conflicts of interest regarding the publication of this paper.

References

[1] Morikawa, N. (2017) Discrete Differential Geometry and the Structural Study of Protein Complexes. Open Journal of Discrete Mathematics, 7, 148-164.
https://doi.org/10.4236/ojdm.2017.73014
[2] Morikawa, N. (2018) Global Geometrical Constraints on the Shape of Proteins and Their Influence on Allosteric Regulation. Applied Mathematics, 9, 1116-1155.
https://doi.org/10.4236/am.2018.910076
[3] Liu, J. and Nussinov, R. (2016) Allostery: An Overview of Its History, Concepts, Methods, and Applications. PLOS Computational Biology, 12, e1004966.
https://doi.org/10.1371/journal.pcbi.1004966
[4] Pinčák, R., Kanjamapornkul, K. and Bartoš, E. (2020) Cohomology Theory for Biological Time Series. Mathematical Methods in the Applied Sciences, 43, 552-579.
https://doi.org/10.1002/mma.5906
[5] Grover, A.K. (2013) Use of Allosteric Targets in the Discovery of Safer Drugs. Medical Principles and Practice, 22, 418-426.
https://doi.org/10.1159/000350417
[6] Wu, N., Strömich, L. and Yaliraki, S.N. (2022) Prediction of Allosteric Sites and Signaling: Insights from Benchmarking Datasets. Patterns, 3, 100408.
https://doi.org/10.1016/j.patter.2021.100408
[7] Xu, H., Wang, Y., Lin, S., Deng, W., Peng, D., Cui, Q. and Xue, Y. (2018) PTMD: A Database of Human Disease-Associated Post-Translational Modifications. Genomics, Proteomics & Bioinformatics, 16, 244-251.
https://doi.org/10.1016/j.gpb.2018.06.004
[8] Crane, K. (2021, Feb 24) Lecture 9: Discrete Exterior Calculus (Discrete Differential Geometry) [Video]. YouTube.
https://www.youtube.com/watch?v=-cUhuzwW-_A
[9] Crane, K. (2022, May 2) Discrete Differential Geometry: An Applied Introduction. 1-172.
https://www.cs.cmu.edu/~kmcrane/Projects/DDG/paper.pdf
[10] Morikawa, N. (2021) Two Mathematical Approaches to Inferring the Internal Structure of Proteins from Their Shape. Global Journal of Science Frontier Research: F, 21, 1-25.
https://doi.org/10.34257/GJSFRFVOL21IS3PG1

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.