Biggs Theorem for Directed Cycles and Topological Invariants of Digraphs

Abstract

We generalize Biggs Theorem to the case of directed cycles of multi-digraphs allowing to compute the dimension of the directed cycle space independently of the graph representation with linear runtime complexity. By considering two-dimensional CW complex of elementary cycles and deriving formulas for the Betti numbers of the associated cellular homology groups, we extend the list of representation independent topological inavariants measuring the graph structure. We prove the computation of the 2nd Betti number to be sharp #P hard in general and present specific representation invariant sub-fillings yielding efficiently computable homology groups. Finally, we suggest how to use the provided structural measures to shed new light on graph theoretical problems as graph embeddings, discrete Morse theory and graph clustering.

Share and Cite:

Hecht, M. and Sbalzarini, I. (2021) Biggs Theorem for Directed Cycles and Topological Invariants of Digraphs. Advances in Pure Mathematics, 11, 573-594. doi: 10.4236/apm.2021.116037.

1. Introduction

Graphs are widely used abstractions to model systems and interactions, e.g., in molecular chemistry [1], biochemistry [2], systems biology [3] [4], neuroscience [5], and social sciences [6]. Therefore, graph theoretical problems play a pivotal role across all disciplines of sciences. However, the combinatorical complexity of graph theoretical problems often hampers practical applications. Despite trees, planar graphs are a graph class for which most optimization or decision problems can efficiently be solved or closely be approximated. This is notably the case for the graph isomorphism problem [7], graph-coloring problems [8] and detection of Hamilton cycles [9], i.e., solving the traveling salesman problem [10]. These facts are reflected in Whitney’s theorem [11] that proves planar embeddings of 3-connected graphs to be unique up to isomorphism. However, planarity is a very strong condition on the graph G, which often does not match to instances occurring in practice. Extending considerations to graphs of higher genus allow to relax the conditions on the graph structure, which, by itself, is generated by the cycles the graph G contains. However, as counting all elementary cycles is already a #P-hard [12] problem finding an embedding of minimum genus is NP-hard [13]. In addition, the number of possible embeddings increases exponentially with the graph genus [13], which in contrast to Whitney’s Theorem makes it hard to efficiently represent G independently of its automorphism class.

Here, we address the question of whether efficiently computable invariants of graphs exist and how they can be used to measure the structural complexity of graphs independently of their representation. Biggs Theorem [14] is a classic result in graph theory that, similarly to the Euler characteristic of surfaces [15], provides such an invariant for simple digraphs.

Theorem (Biggs Theorem). The set of connected cycles O ¯ ( G ) of a simple digraph G generates a free R-module or vector space Λ ( G , R ) of dimension

dim R Λ ( G , R ) = | E | | V | + # G , R = , 2 , (1)

where #G denotes the connected components of G.

However, Biggs Theorem is only known to hold for the space generated by connected cycles of a simple digraph. Here we generalize the statement to the case of the directed cycle space O ( G ) R , R = , 2 spanned by all directed cycles O ( G ) of a multi-digraph G.

1.1. Statement of Contribution

We denote by O el ( G ) , O ¯ el ( G ) the set of all directed or connected elementary cycles (passing no vertex twice) of a multi-digraph G and by G ( O el ( G ) ) , G ( O ¯ el ( G ) ) the induced subgraphs, respectively. Then we prove the following generalization of Biggs Theorem.

Theorem 1 (Biggs Theorem for directed cycles).Let G = ( V , E , head , tail ) be a multi-digraph and Λ ( G , R ) = O ¯ el ( G ) R its connected cycle space with respect to R = , 2 . Then

G ( O el ( G ) ) = G ( O ¯ el ( G ) ) if and only if O el ( G ) R = O ¯ el ( G ) R . (2)

Consequently, in case of equivalence, there holds

dim R O el ( G ) R = | E | | V | + # G = dim R Λ ( G , R ) . (3)

Indeed the construction of the induced subgraph G ( O el ( G ) ) is given by the union of all strongly connected components, which therefore can be realized in O ( | E | ) [16]. Consequently, Theorem 1 allows to determine dim R O el ( G ) R efficiently in O ( | E | ) . The efficient computation of dim R O el ( G ) R potentially addresses many challenging problems in graph theory. In this article, we extend the 1-dimensional CW-complex X 1 ( G ) given by the geometrical realization of G given by attaching 2-cells accordingly to a chosen set of elementary cycles O O ¯ el ( G ) . Thereby, much more structural information of G is understood as topological structure. Theorem 1 enables us to derive formulas for the Betti numbers of the associated homology groups H k ( G , O el ( G ) , R ) , k = 0 , 1 , 2 .

Theorem 2 (Homology of the Elementary Filling). Let G = ( V , E , head , tail ) be a multi-digraph, R = , 2 , O O ¯ el ( G ) and X ( G , O ) the geometrical realization induced by the choice of O O ¯ el . The associated cellular homology groups H k ( G , O , R ) are free R-modules of ranks b k ( G , O , R ) , k = 0 , 1 , 2 .

i) b 0 ( G , O , R ) = # G and can be computed in O ( | V | 2 + | V | | E | ) .

ii) b 1 ( G , O , R ) = | E | | V | + # G dim R O R and b 1 ( G , O ¯ el ( G ) , R ) = 0 .

If O = O el ( G ) then b 1 ( G , O , R ) can be computed in O ( | V | 2 + | V | | E | ) .

iii) b 2 ( G , O , R ) = | O | dim R O R . If O = O el ( G ) , O ¯ el ( G ) then it is #P-hard (sharp P-hard) to compute b 2 ( G , O , R ) .

Using that cellular and singular homology are isomorphic [15] (Theorem 2.35), Theorem 2 provides a path for computing the singular homology of the geometric realization X ( G , O ) of G with respect to the set of cycles O O el ( G ) , O ¯ el ( G ) . To overcome the involvement of a #P-hard counting problem in iii), we suggest several representation independent sub-fillings O O el ( G ) , O ¯ el ( G ) allowing to compute H k ( G , O , R ) , k = 0,1,2 , efficiently. Consequently, the graph structure of G can be understood in terms of singular homology. In light of this fact, we revisit classic problems in graph theory as graph embeddings, discrete Morse theory and graph clustering from this new perspective.

1.2. Preleminaries

For rendering the introduced concepts of this article and the proof Theorem 1 consistent we introduce a definition of graphs that yields a notion of multi-digraphs without requiring multi-sets.

Definition 1. Let G = ( V , E , head , tail ) be a 4-tuple, where V , E are finite sets and head , tail : E V are some maps. We call the elements v V vertices and the elements e E edges of G, while head ( e ) , tail ( e ) V are called head and tail of the edge e. An edge e with head ( e ) = tail ( e ) is called a loop. In general, we call G a multi-digraph. The following cases are often relevant:

i) G is called a digraph iff the map H : E V × V , with H ( e ) = ( tail ( e ) , head ( e ) ) is injective.

ii) If there are no loops, i.e., e E with tail ( e ) = head ( e ) , then G is called simple.

ii) G is called an undirected graph if G is a digraph and for every e E there is f E \ { e } with head ( e ) = tail ( f ) , tail ( e ) = head ( f ) . In this case, we slightly simplify notation by shortly writing e for the pair e : = ( f , g ) E × E , which is then called an edge. The notion of head , tail can thereby be replaced by link : E V with link ( e ) = head ( e ) tail ( e ) .

iv) In the special case, where E V × V the maps head , tail are assumed to be canonically given by the relation of E, i.e., head ( ( x , y ) ) = y , tail ( ( x , y ) ) = x for all ( x , y ) E .

One readily observes that our definition coincides with the common understanding of graphs. As said, the notion allows to consider multiple edges e , f with head ( e ) = head ( f ) , tail ( e ) = tail ( f ) by distinguishing e , f . Thus, no considerations of multi-sets E as in [17] are needed. We denote with | E | , | V | and #G the number of edges, vertices and the number of connected components of G, respectively.

Two edges e and f are called consecutive if head ( e ) = tail ( f ) and are called connected if { head ( e ) , tail ( e ) } { head ( f ) , tail ( f ) } . The degree deg ( v ) of a vertex v V is given by the number of all edges e E that are connected with v. A directed path p = { e 1 , , e n } E of length n from a vertex u to a vertex v is a list of consecutive edges e i E , i = 1 , , n such that u = tail ( e 1 ) and v = head ( e n ) . Thereby, repetition is allowed, i.e., e i = e j , 1 i < j n is possible. A connected path p = { e 1 , , e n } E from a vertex u to a vertex v is a list of possible multiple occurring connected edges that become a directed path by exchanging a subset { e i 1 , , e i k } , j = 1 , , k , k n of edges with their converse directed versions.

A directed (connected) cycle is a directed (connected) path p from some vertex u V to itself, which can also be a loop. A cycle is elementary if every vertex it contains is passed exactly once. A cycle is called simple if each edge it contains occurs exactly once. We denote by O ¯ ( G ) , O ( G ) the set of all connected or directed cycles and with O ¯ el ( G ) , O el ( G ) the set of all directed, connected elementary cycles, respectively. In fact, every simple or even non-simple cycle c O ¯ ( G ) , O ( G ) is generated by passing through several elementary cycles c 1 , , c n O ¯ el ( G ) , O el ( G ) , n .

Given c = ( e 1 , , e n ) , d = ( f 1 , , f n ) O ¯ ( G ) , n we introduce the equivalence relation c d iff d can be derived from c by cyclic reordering, i.e., there is k with e i + k mod n + k + 1 = f i , i = 1 , , n . We denote with O ¯ ( G ) / and O ( G ) / the quotients of non-equivalent cycles and count O ¯ ( G ) , O ( G ) up to equivalence, i.e., | O ¯ ( G ) | : = | O ¯ ( G ) / | , | O ( G ) | : = | O ( G ) / | . While non-elementary cycles c , c can coincide in their edge sets but differ in their orderings an elementary cycle is uniquely determined up to cyclic reordering of its edges. This makes elementary cycles the pivotal choice of our considerations. We will slightly abuse notation by making no difference between O el ( G ) , O ¯ el ( G ) and O el ( G ) / , O ¯ el ( G ) / , respectively.

Example 1. For G in Figure 1 the cycle c 0 = { e , f , g , h , k , l , m , o , n } is a directed, simple and non-elementary cycle, while the directed cycles c 1 = { e , f , g , h , k , n } and c 2 = { o , l , m } are elementary. Certainly, c 0 is given by passing through c 1 and c 2 . Further, the cycle c 3 = { e , f , g , h , k , n , o , l , m } c 0 while c 0 { l , m , o , n , e , f , g , h , k } = c 4 . The only reorderings of c 1 , c 2 keeping the edges consecutive are cyclic reorderings.

Figure 1. Elementary and simple cycles.

With G \ e , G \ v we denote the graphs obtained by deleting the edge e or the vertex v and all its connected edges. Further, G ( ) , E ( ) , V ( ) denote the graph, the set of all edges, and the set of all vertices induced by a set of graphs, edges and vertices. By P ( A ) we denote the power set of a given set A of finite cardinality | A | .

1.3. Outline

In Section 2, we provide all ingredients to generalize Bigss Theorem and give the proof in Section 3. In Section 4, the 2-dimensional elementary CW complex of G and its homology is studied. Section 5 considers representation invariant sub-fillings while Sections 6 and 7 suggest applications and yield a conclusion of our results.

2. The Module of Cycles

To provide an algebraic notion of cycles we define the characteristic vector representing how often and in which direction an edge is passed by a cycle or path.

Definition 2 (Characteristic Vector). Let G = ( V , E , head , tail ) be a multi-digraph we introduce the free R-modules, R = , 2

V ( G ) : = v V v , V 2 ( G ) : = V ( G ) / 2 V ( G ) = v V 2 v (4)

E ( G ) : = e E e , E 2 ( G ) : = E ( G ) / 2 E ( G ) = e E 2 e . (5)

Let ε = { e 1 , , e n } , e i E , i = 1, , n be an ordered list of connected edges. Then for i = 1 , , n 1 we define

x i = { 1 , if e i , e i + 1 are consecutive 1 , if e i , e i + 1 are connected but not consecutive (6)

If ε O ¯ ( G ) is a cycle then we set x n = 1 if e n , e 1 are consecutive and x n = 1 else. If ε O ¯ ( G ) then we set x n = 0 . By considering y i = x i mod 2 , i = 1 , , n the vectors

[ ε ] = i = 1 , , n x i e i E ( G ) , [ ε ] 2 = i = 1 , , n y i e i E 2 ( G ) (7)

are called the characteristic vectors of ε with respect to -and 2 -coefficients.

Remark 1. We want the following fact to be clear. The characteristic vector depends on the ordering of ε . However, if c 1 , c 2 O ¯ el ( G ) are connected elementary cycles such that c 1 c 2 are equivalent. Then [ c 1 ] = ± [ c 2 ] holds. If c 1 , c 2 O el ( G ) are directed then even [ c 1 ] = [ c 2 ] holds. Thus, the maps

χ : O el ( G ) / E ( G ) , c [ c ] , χ 2 : O ¯ el ( G ) / E 2 ( G ) , c [ c ] 2 (8)

are well defined.

Definition 3 (Incidence operator). Let G = ( V , E , head , tail ) be a multi-digraph. We denote with I ( G , ) : E V , I ( G , 2 ) : E 2 V 2 the , 2 -linear incidence operator operators, defined on the generators by

I ( G , ) [ e ] = head ( e ) tail ( e ) , (9)

I ( G , 2 ) [ e ] 2 = head ( e ) + tail ( e ) mod 2 V ( G ) . (10)

By choosing an enumeration V { v 1 , , v | V | } , E = { e 1 , , e | E | } we can represent I ( G , ) by the classic incidence matrix I ( G , ) = ( θ i j ) 1 i | E | 1 j | V | with

θ i j = { 0, if tail ( e i ) = v j and head ( e i ) = v j , 0, if tail ( e i ) v j and head ( e i ) v j , 1, if tail ( e i ) = v j and head ( e i ) v j , 1, if tail ( e i ) v j and head ( e i ) = v j . (11)

The notion of I ( G , 2 ) = ( μ i j ) 1 i | E | 1 j | V | is given by setting μ i j = θ i j mod 2 .

By Remark 1 the characteristic vector [ c ] , c O ¯ ( G ) might depend on the ordering of c. However, the kernel of I ( G , R ) is not sensitive to the possible choices of signs. Therefore, we obtain a well defined algebraic notion of cycles as follows.

Lemma 3. Let G = ( V , E , head , tail ) be a multi-digraph and O O ¯ el ( G ) be a set of elementary cycles. Then we denote with

[ O ] = { x E ( G ) | x = [ c ] , c O } , (12)

[ O ] 2 = { x E 2 ( G ) | x = [ c ] 2 , c O } (13)

the set of all characteristic vectors induced by cycles in O. The free generated cycle modules are given by

Λ ( G , ) : = [ O ¯ ( G ) ] = [ O ¯ el ( G ) ] = ker I ( G , )

Λ ( G , 2 ) : = [ O ¯ ( G ) ] 2 2 = [ O ¯ el ( G ) ] 2 2 = ker I ( G , 2 ) ,

where , 2 denotes the span with respect to , 2 .

Proof. The identities [ O ¯ ( G ) ] = [ O ¯ el ( G ) ] , [ O ¯ ( G ) ] 2 2 = [ O ¯ el ( G ) ] 2 2 follow directly by the definition of [ O ¯ ( G ) ] , [ O ¯ ( G ) ] 2 . Further, we verify that for any list ε = { e 1 , , e n } E of connected edges, due to (6), we have

± I ( G , ) [ ε ] = tail ( e 1 ) head ( e n ) ,

I ( G , 2 ) [ ε ] 2 = tail ( e 1 ) + head ( e n ) mod 2 V ( G ) . (14)

Hence,

I ( G , ) [ ε ] = 0 ε O ¯ ( G ) , I ( G , 2 ) [ ε ] 2 = 0 ε O ¯ ( G ) (15)

proving the claim.

To characterize the cycle spaces we rephrase Biggs Theorem [14] for connected cycles of simple digraphs matching our setup and notation.

Theorem 4 (Biggs Theorem). Let G = ( V , E , head , tail ) be a simple digraph then the dimension of the connected cycle spaces are given by

dim R Λ ( G , R ) = | E | | V | + # G , R = , 2 , (16)

where #G denotes the number of connected components of G.

Indeed, for simple digraphs the first line of the incidence matrix in (11) is unnecessary and thereby our definition yields the classic notion of I ( G , R ) for any fixed enumeration of V , E . In fact, the proof of Theorem 4 is based on computing the rank of I ( G , R ) , which is independent of the chosen enumeration, yielding our reformulation to be genuine. Due to Lemma 3 we obtain the following consequence.

Corollary 1 (Biggs Theorem for connected cycles of multi-digraphs). Let G = ( V , E , head , tail ) be a multi-digraph. Then the dimension of the connected cycle spaces are given by

dim R Λ ( G , R ) = | E | | V | + # G , R = , 2 , (17)

where #G denotes the number of connected components of G.

Proof. Indeed, each loop or multiple edge increases the dimension by 1. More precisely: Every loop [ { e } ] , [ { e } ] 2 or two cycle [ { e , f } ] , [ { e , f } ] 2 , e , f E is R-linear independent, to all other elements in [ O ¯ ( G \ e ) ] , [ O ¯ ( G \ e ) ] 2 . Thus, by removing e we obtain dim R Λ ( G \ e , R ) = dim R Λ ( G , R ) 1 . Recursively applying this procedure ends up with a simple digraph G = ( V , E , hea d , tai l ) with # G = # G . Thus, due to Theorem 4 we compute

dim R Λ ( G , R ) = | E | | E | + dim R Λ ( G , R ) = | E | | E | + | E | | V | + # G = | E | | V | + # G . (18)

How to generalize the characterisation of cycle spaces in the directed case is provided in the next section.

3. Biggs Theorem for Directed Cycles

We prove a generalization, we introduce the concept of contracting edges in a multi-digraph as follows.

Definition 4 (Contracted Graph). Let G = ( V , E , head , tail ) be a multi-digraph and e E . Let v = head ( e ) , u = tail ( e ) . An equivalence relation u , v on V is defined by

x u , v y x = y or x , y { u , v } . (19)

The equivalence class of x V is denoted by [ x ] u , v and V / e : = V / u , v gives the quotient of V with respect to u , v . Further, we define E / e : = E \ { e } and set

head / e ( f ) = [ head ( f ) ] u , v , tail / e ( f ) = [ tail ( f ) ] u , v , f E / e . (20)

The multi-digraph G / e : = ( V / e , E / e , head / e , tail / e ) is called the contracted graph of G with respect to e E . If e , f E then we observe that ( G / e ) / f ( G / f ) / e . Thus, if G G is a subgraph then G / G : = G / E ( G ) can be defined by contracting E ( G ) in arbitrary order, yielding a well defined notion.

Remark 2. Let G = ( V , E , head , tail ) be a simple digraph and e , f , h E such that e , f are consecutive and tail ( h ) = tail ( e ) , head ( h ) = head ( f ) . Then G / e becomes a multi-digraph, i.e., h and f become parallel edges in G / e . Furthermore, the contraction of a two cycle c = { e , f } , tail ( e ) { tail ( f ) , head ( f ) } , head ( e ) { tail ( f ) , head ( f ) } results in a loop c / e O ¯ ( G / e ) . These phenomena are the reasons why we consider multi-digraphs with loops, rendering the notion of contracted graphs consistent within our framework. Certainly, contraction preserves connectivity, i.e., # G = # G / e .

Now we have all ingredients to state the first main theorem of this article.

Theorem 5 (Biggs Theorem for directed cycles). Let G = ( V , E , head , tail ) be a multi-digraph and Λ ( G , R ) = O ¯ el ( G ) R its cycle space with respect to R = , 2 . Then

G ( O el ( G ) ) = G ( O ¯ el ( G ) ) if and only if O el ( G ) R = O ¯ el ( G ) R . (21)

Consequently, in case of equivalence, there holds

dim R O el ( G ) R = | E | | V | + # G = dim R Λ ( G , R ) . (22)

Proof. The proof splits into several steps:

Step 1: We show O el ( G ) = O ¯ el ( G ) G ( O el ( G ) ) = G ( O ¯ el ( G ) ) . Indeed, given [ c ] = i = 1 n x i e i O ¯ el ( G ) . Then we can generate [ c ] = k = 1 m λ k [ c k ] by characteristic vectors of directed elementary cycles c k O el ( G ) , k = 1, , m . Thus, every e i hast to be contained in at least one of them proving G ( O el ( G ) ) = G ( O ¯ el ( G ) ) .

Step 2: We show G ( O el ( G ) ) = G ( O ¯ el ( G ) ) O el ( G ) = O ¯ el ( G ) . Indeed, O el ( G ) O ¯ el ( G ) implies that by Lemma 3 O el ( G ) O ¯ el ( G ) = Λ ( G , ) is a free sub-module of Λ ( G , ) . Thus, it suffices to show (22) in order to prove the converse inclusion. We argue by induction on | V | # G . If | V | # G = 0 then there are only loops. Thus,

dim O el ( G ) = | E | = | E | | V | + # G (23)

and the claim follows. Now assume that | V | # G > 1 . Let f E be not a loop; then we consider the contracted graph G / f . Let c O el ( G ) if f E ( c ) and V ( c ) V ( f ) = or V ( f ) V ( c ) = { head ( f ) } , V ( f ) V ( c ) = { tail ( f ) } then certainly c / f O el ( G ) . If f E ( c ) then c / f remains elementary. If f E \ E ( c ) and V ( f ) V ( c ) = V ( f ) then c / f consists of two elementary cycles c 1 , c 2 O el ( G / f ) with one of them being a loop. Thus, G ( O el ( G / f ) ) = G ( O ¯ el ( G / f ) ) . By induction, we therefore compute

dim O el ( G / f ) = | E / f | | V / f | + # G / f = | E | | V | + # G dim O el ( G ) . (24)

Step 2a: We show that dim O el ( G ) dim O el ( G / f ) . Let c O el ( G ) with f E ( c ) . By choosing an equivalent cycle c ¯ , c c ¯ if necessary w.l.o.g. we can assume that c is ordered such that c = ( e 1 , , e | c | 1 , f ) O el ( G ) , e i E , 1 i < | c | . Conversely, for every cycle d O el ( G / f ) we choose an ordering d = ( e 1 , , e | c | ) such that either d ˜ = ( e 1 , , e | c | , f ) or d ˜ = ( e 1 , , e | c | ) becomes a cycle in O ¯ el ( G ) . We consider the map

τ : O el ( G / f ) O ¯ el ( G ) , with τ ( d ) : = d ˜ . (25)

Given h 1 , , h l O el ( G / f ) , l such that [ h 1 ] , , [ h l ] E ( G / f ) are a basis of O el ( G / f ) . Then τ induces a -linear injective lift

τ * : O el ( G / f ) O ¯ el ( G ) , τ * ( [ h i ] ) = [ τ ( h i ) ] = [ h ˜ i ] , i = 1, , l . (26)

Step 2b: We show that im ( τ * ) O el ( G ) . Indeed, if d O el ( G / f ) is such that [ d ˜ ] has no negative entries then d ˜ = τ ( d ) O el ( G ) and [ d ˜ ] = τ * ( [ d ] ) O el ( G ) . If this is not the case then the entry [ d ˜ ] f = 1 corresponding to f is the only negative entry. Since G ( O el ( G ) ) = G ( O ¯ el ( G ) ) there is b O el ( G ) with f E ( b ) .

For any x , y V we denote by d x , y the directed subpath of d ˜ connecting x and y. Further, let x 1 , , x n V ( d ˜ ) be all crossing vertices such that for the edges e i , f i E ( b ) with head ( e i ) = x i = tail ( f i ) we have e i E ( d ˜ ) and f i E ( d ˜ ) , i = 1 , , n . Analogously, we consider all crossing vertices y 1 , , y n V ( d ˜ ) such that for the edges h i , g i E ( b ) with head ( h i ) = y i = tail ( g i ) we have h i E ( d ˜ ) and g i E ( d ˜ ) . Moreover, we assume that tail ( f ) = x 1 , head ( f ) = y 1 and that the x i , y i are ordered w.r.t the ordering of directed path d x 1 , y 1 . We consider two cases.

Case 1: If n = 1 then setting d 0 = b , d 1 = b y 1 , x 1 d x 1 , y 1 yields d x 1 , y 1 = f , see Figure 2 and thereby

[ d ˜ ] + [ d 0 ] [ d 1 ] = 0 (27)

implying [ d ˜ ] = τ * ( [ d ] ) O el ( G ) .

Case 2: If n > 1 then we construct three cycles d 0 , d 1 , d 2 O el ( G ) by

d 0 = b , d 1 = d x 1 , y n b y n , x 1 , d 2 = b y 1 , y n d y n , y 1 , (28)

see again Figure 3 for an illustration, where d 0 is indicated by the black lines. Since b , d ˜ O el ( G ) are elementary we have that the connected components of

Figure 2. Sketch of the cycle construction for Case 1.

b \ d ˜ = b y n , x 1 i = 1 , , n 1 b y i , x i + 1 , b d ˜ = i = 1 , , n d x i , y i (29)

are pairwisely disjoint paths. Moreover, d 0 , d 1 , d 2 O el ( G ) are elementary. Therefore, we verify again that

[ d ˜ ] + [ d 0 ] [ d 1 ] [ d 2 ] = 0 (30)

holds and thereby implies that [ d ˜ ] = τ * ( [ d ] ) O el ( G ) . Hence, im ( τ * ) O el ( G ) and we have proven Step 2. Thus, due to Corollary 1 we have proven the theorem for R = .

Step 3: For R = 2 the claim follows by using that G ( O el ( G ) ) = G ( O ¯ el ( G ) ) is equivalent to O ¯ el ( G ) = O el ( G ) as proven above. Then the identity

Λ ( G , 2 ) O ¯ el ( G ) / 2 E ( G ) = O el ( G ) / 2 E ( G ) O el ( G ) 2 (31)

proves the theorem with respect to 2 -coefficients.

Remark 3. For simplification, the sketch of Case 2 in Figure 3 assumes that the paths b x i , y i are not nested, which might be not true in general. However, this circumstance does not affect (29) nor the fact that d 0 , d 1 , d 2 are elementary, which were all assumptions we used for the argumentation.

An immediate consequence of Theorem 5 is the following.

Corollary 2. Let G = ( V , E , head , tail ) be a multi-digraph. Then

dim R O el ( G ) R = dim R Λ ( G ( O el ( G ) ) , R ) , R = , 2 (32)

can be determined in O ( | E | ) .

Proof. By Theorem 5 we have O el ( G ) R = Λ ( G ( O el ( G ) ) , R ) and therefore dim R O el ( G ) R = dim R Λ ( G ( O el ( G ) ) , R ) . Now G ( O el ( G ) ) is given by the union of all strongly connected components of G which can be computed in linear time [16] yielding the claim due to (22).

The opportunity of computing the dimension of the directed cycles space O el ( G ) R efficiently, potentially addresses many challenging problems in graph theory. We use the result to study the structure of G in terms of cellular homology groups.

4. The CW Complex of Elementary Cycles

An excellent introduction to algebraic topology is given by Allen Hatcher [15].

Figure 3. Sketch of the cycle construction for Case 2.

Especially, the notion of CW complexes, geometric realizations and the classic theory of simplicial and cellular homology were presented explicitly therein. The following notions and results take these concepts for granted.

Definition 5 (Coherent Orientation). Let G = ( V , E , head , tail ) be a multi-digraph and O O ¯ el ( G ) be a set of elementary connected cycles. A map o : E { 1,0,1 } is called a coherent orientation w.r.t. O iff

i) o ( e ) = 0 for all e E \ E ( O )

ii) [ c ] = i = 1 n x i e i = ± i = 1 n o ( e i ) e i for all characteristic vectors [ c ] with c O .

Lemma 6. Let G = ( V , E , head , tail ) be a multi-digraph and O O ¯ el ( G ) be a set of elementary connected cycles. Then there exists a coherent orientation o : E { 1,0,1 } .

Proof. We argue by induction on | O | . If | O | = 1 then we choose an ordering c = ( e 1 , , e n ) O and thereby a representative of all equivalent cycles in O. Thus, setting o ( e i ) = x i with [ c ] = x i e i defines a coherent orientation. Now assume that | O | > 1 . Let c = ( e 1 , , e n ) O and O O be the set of all cycles being not equivalent to c. We consider G = G ( O ) and a splitting G = G 1 G K into elementary connected components. That is

i = 1 K O O ¯ el ( G i ) = O and O O ¯ el ( G i ) O ¯ el ( G j ) = , 1 i < j K . (33)

By induction there is a coherent orientation o w.r.t. O . Let [ c ] = i = 1 n x i e i be the characteristic vector of c. Assume there are 1 k , l n with o ( e k ) = x k and o ( e l ) = x l . Then e k , e l belong to different components, i.e., e k E ( G i ) , e l E ( G j ) , i j . Observe, that we can change o to o on every component G i by keeping the requirements of a coherent orientation preserved. Thus, we can change o to o ˜ such that o ˜ ( e i ) = x i for all e i E ( c ) E ( O ) . Hence, setting o = o ˜ on E ( O ) and o ( e i ) = x i for all e i E ( O ) yields a coherent orientation.

Definition 6 (Elementary CW Complex). Let G = ( V , E , head , tail ) be a multi-digraph and X 1 ( G ) be its geometric realization. Further, let O O ¯ el ( G ) be a set of elementary connected cycles. The geometric realization X ( G , O ) given by attaching 2-cells to X 1 ( G ) accordingly to O is called the elementary CW complex of G w.r.t. O. Further, X ¯ 0 , X ¯ 1 , X ¯ 2 shall denote the set of 0,1,2 -cells of X ( G , O ) , and

L 0 : X ¯ 0 V , L 1 : X ¯ 1 E , L 2 : X ¯ 2 O (34)

the maps yielding the identification of 0,1,2 -cells with vertices, edges and cycles, respectively.

Indeed, for the complexes X ( G , O el ( G ) ) , X ( G , O ¯ el ( G ) ) the cellular homology can be derived by considering the following chain complexes.

Definition 7 (Elementary Chain Complex). Let G be a multi-digraph, O O ¯ el ( G ) , be a set of elementary connected cycles, R = , 2 and o : E { 1,0,1 } be a coherent orientation w.r.t. O. Let X ( G , O ) be the associated CW complex. We consider the free generated R-modules

C 2 = σ 2 X ¯ 2 ( G , O ) σ 2 R , C 1 = σ 1 X ¯ 1 ( G ) σ 1 R , C 0 = σ 0 X ¯ 0 ( G ) σ 0 R . (35)

For R = , we define the boundary operators k : C k C k 1 , k = 1 , 2 on the generators by 2 ( σ 2 ) = σ 1 X ¯ 1 ( G ) m ( σ 2 , σ 1 ) σ 1 ,

m ( σ 2 , σ 1 ) = { o ( L 1 ( σ 1 ) ) , if L 1 ( σ 1 ) E ( L 2 ( σ 2 ) ) 0 , else (36)

and 1 ( σ 1 ) = σ 0 X 0 n ( σ 1 , σ 0 ) σ 0 ,

n ( σ 1 , σ 0 ) = { 1 , if head ( L 1 ( σ 1 ) ) = L 0 ( σ 0 ) 1 , if tail ( L 1 ( σ 1 ) ) = L 0 ( σ 0 ) 0 , else (37)

For R = 2 we consider m ( σ 2 , σ 1 ) mod 2 and n ( σ 1 , σ 0 ) mod 2 , respectively. We enforce k to be R-linear by setting m ( λ σ 2 + μ σ ¯ 2 , σ 1 ) : = λ m ( σ 2 , σ 0 ) + μ m ( σ ¯ 2 , σ 0 ) , n ( λ σ 1 + μ σ ¯ 1 , σ 0 ) : = λ m ( σ 1 , σ 0 ) + μ n ( σ ¯ 1 , σ 0 ) for all σ 2 , σ ¯ 2 X ¯ 2 ( G , O ) , σ 1 , σ ¯ 1 X ¯ 1 ( G ) , σ 0 X ¯ 0 ( G ) , λ , μ R . Finally, we set C 3 , C 1 = 0 and k = 0 for k { 0,3 } .

Remark 1 asserts why even in case of -coefficients the maps k yield well defined boundary operators. More precisely:

Theorem 7. Let G = ( V , E , head , tail ) be a multi-digraph, O O ¯ el ( G ) be a set of connected elementary cycles and o : E { 1,0,1 } be a coherent orientation w.r.t. O. Let X ( G , O ) be the elementary CW complex, R = , 2 and ( C k , k ) be given due to Definition 7.

i) The pairs ( C k , k ) define a finite chain complex, i.e., k 1 k = 0 for all k = 0 , 1 , 2 , 3 .

ii) The cellular homology groups H k ( X ( G , O ) , R ) are free R-modules H k ( X ( G , O ) , R ) : = ker ( k ) / im ( k + 1 ) , k = 0 , 1 , 2

H 2 ( X ( G , O ) , R ) R b 2 ( G , O , R ) , (38)

H 1 ( X ( G , O ) , R ) R b 1 ( G , O , R ) , (39)

H 0 ( X ( G , O ) , R ) R b 0 ( G , O , R ) . (40)

iii) The Betti numbers are given by

b 2 ( G , O , R ) = | O | dim R O R (41)

b 1 ( G , O , R ) = | E | | V | + # G dim R O R (42)

b 0 ( G , O , R ) = # G . (43)

Proof. By considering the sequence

0 3 C 2 2 C 1 1 C 0 0 0 (44)

the identities k 1 k = 0 , follow for k { 1,3 } . For γ = k = 1 n λ k σ k 1 C 1 and δ = k = 1 m α k σ k 0 C 0 we consider the induced characteristic vectors

[ γ ] = k = 1 n λ k [ L 1 ( σ k 1 ) ] , [ γ ] 2 = k = 1 n μ k [ L 1 ( σ k 1 ) ] 2 , μ k = λ k mod 2 . (45)

[ δ ] = k = 1 m α k [ L 0 ( σ k 0 ) ] , [ δ ] 2 = k = 1 m β k [ L 0 ( σ k 0 ) ] 2 , β k = α k mod 2. (46)

Then we observe that [ 1 γ ] = I ( G , ) [ γ ] , [ 1 γ ] 2 = I ( G , 2 ) [ γ ] 2 , where I ( G , R ) denotes the incidence operator of G. Thus, 1 γ = 0 if and only if γ corresponds to a cycle, i.e., [ γ ] Λ ( G , ) , [ γ ] 2 Λ ( G , 2 ) . By definition of 2 and Lemma 3, this yields im ( 2 ) ker ( 1 ) ker I ( G , R ) and consequently 1 2 = 0 , showing (i). Furthermore, we realize that, im ( 3 ) = 0 and im ( 2 ) O R . Thus, im ( 2 ) is free generated by O, implying that im ( 2 ) R dim R O R . By using the dimension formula we realize that the quotient H 2 ( X ( G , O ) , R ) = C 2 / im ( 3 ) is a free R-module of dimension

b 2 ( G , O , R ) = | O | dim R O R . (47)

Likewise, ker ( 1 ) ker I ( G , R ) implies that H 1 ( X ( G , O ) , R ) Λ ( G , R ) / O R . Since O R and Λ ( G , R ) are free R-modules, there are isomorphisms ρ : O R R dim R O R , τ : Λ ( G , R ) R | E | | V | + # G . Let i : R dim R O R R | E | | V | + # G be the natural embedding and B O be a basis of O R then B O = i ρ ( B O ) can be extended to a basis B R of R | E | | V | + # G and τ 1 ( B R ) yields a basis of Λ ( G , R ) containing B O as a sub-basis of O R . Thus,

R | E | | V | + # G / R dim R O R = τ ( Λ ( G , R ) ) / ( i ρ ( O R ) ) Λ ( G , R ) / O R ,

showing that the 1-st homology H 1 ( X ( G , O ) , R ) is a free R-module of dimension

b 1 ( G , O , R ) = dim R ( Λ ( G , R ) / O R ) = | E | | V | + # G dim R O R . (48)

Since C 1 and ker ( 1 ) are free R-modules we analogously observe that there is a sub-basis of ker ( 0 ) generating im ( 1 ) . Hence, H 0 ( X ( G , O ) , R ) is a free R-module of dimension

b 0 ( G , O , R ) = | V | dim R im ( 1 ) = | V | ( | E | dim R ker ( 1 ) ) = | V | ( | E | dim R Λ ( G , R ) ) = # G , (49)

which proves (iii).

Remark 4. Note that the Betti numbers do not depend on the choice of the coherent orientation o. Even though, Lemma 6 provides the existence of a coherent orientation o an explicit construction of o, which might be computational expensive, is not necessary to compute the b k ( G , O , ) , k = 0 , 1 , 2 .

Remark 5. The singular homology H n sing ( X ) of any CW complex X is isomorphic to its cellular homology [15] (Theorem 2.35). Thus, Theorem 7 provides a path for computing the singular homology of the elementary CW complexes X ( G , O ) . Further, we obtain the following consequence that yields main Theorem 2.

Corollary 3. Let the assumptions of Theorem 7 be fulfilled and denote with b k the R-rank of the cellular homology group H k ( X ( G , O ) ) , k = 0 , 1 , 2 .

i) b 0 ( G , O el ( G ) , R ) = b 0 ( G , O ¯ el ( G ) , R ) = # G can be computed in O ( | V | 2 + | V | | E | ) .

ii) b 1 ( G , O ¯ el ( G ) , R ) = 0 and b 1 ( G , O el ( G ) , R ) can be computed in O ( | V | 2 + | V | | E | ) .

iii) Computing b 2 ( G , O el ( G ) , R ) , b 2 ( G , O ¯ el ( G ) , R ) is #P-hard (sharp P-hard).

Proof. The number of connected components of any multi-graph can be determined by breadth-first search within O ( | V | 2 + | V | | E | ) [18], yielding i) due to Theorem 7. To show ii), recall that by Lemma 3, we have that O ¯ el ( G ) R = Λ ( G , R ) yielding b 1 ( G , O ¯ el ( G ) , R ) = 0 . Due to Corollary 2, computing dim R O el ( G ) R can be done in O ( | E | ) . Thus, by Theorem 7, the bottleneck for computing b 1 ( G , O , R ) is again given by determing #G yielding ii). Now iii) is a consequence of Theorem 7 and the fact that, by reduction to the Hamiltonian cycle problem, counting the number of all directed (connected) elementary cycles of a digraph is #P-hard [12].

We recall that two multi-digraphs G 1 = ( V 1 , E 1 ) and G 2 = ( V 2 , E 2 ) are called homeomorphic if there are sets of directed paths P 1 , P 2 such that deg ( u ) = 2 for any inner vertex u V ( p ) of p P 1 or p P 2 , respectively, and the graphs G 1 / P 1 G 2 / P 2 given by contracting P 1 , P 2 are isomorphic.

Corollary 4. Let G 1 and G 2 be two homeomorphic multi-digraphs. We denote with O i = O el ( G i ) , O ¯ i = O ¯ el ( G i ) , consider the elementary CW complexes X ( G i , O i ) , X ( G i , O ¯ i ) , i = 1 , 2 and R = , 2 . Then

H k ( X ( G 1 , O 1 ) , R ) H k ( X ( G 2 , O 2 ) , R ) , (50)

H k ( X ( G 1 , O ¯ 1 , R ) ) H k ( X ( G 2 , O ¯ 2 , R ) ) , (51)

where k = 0 , 1 , 2 .

Proof. As one can readily verify, neither the number of cycles | O el ( G i ) | , | O ¯ el ( G i ) | nor the dimension dim R O el ( G i ) R , O ¯ el ( G i ) R nor the number of connected components of G i change under the contraction of the paths P i , i = 1 , 2 yielding the claim due to Theorem 7.

Due to Corollary 3 (3) counting | O el ( G ) | , | O ¯ el ( G ) | is #P-hard. Though there are efficient algorithms [19] counting the number of cycles approximately, we, here, suggest to consider certain sub-fillings O O el ( G ) , O ¯ el ( G ) to yield an suitable setup.

5. Representation Invariant Sub-Fillings

In this section we discuss several possibilities of filling the graph G by choosing O O ¯ el ( G ) , such that the cellular homology of the resulting CW complexes X ( G , O ) can be computed efficiently and the filling is invariant under automorphisms, i.e., is independent of the chosen representation of G. The list below is by no means complete and just provides some suggestions.

5.1. Fillings Induced by Cycle Bases

If the cycles O O ¯ el ( G ) are R-linear independent then due to Theorem 7 for the CW complex X ( G , O ) the Betti numbers are given by b 2 ( G , O , R ) = 0 , b 1 ( G , O , R ) = | E | | V | + # G | O | and b 0 ( G , O , R ) = # G . Thus, b 1 is given by the co-dimension of O R in Λ ( G , R ) . Thereby, fillings by R-linear independent cycles or basis yield only trivial topological information.

5.2. Fillings Induced by Shortest Cycles

Let G = ( V , E , head , tail ) be a multi-digraph. We consider the set of all shortest cycles

min ( G ) = { c O ¯ el ( G ) | | c | | c | c O ¯ el ( G ) } . (52)

Further, N E ( v ) : = { e E | head ( e ) = v } , N E ( v ) : = { e E | tail ( e ) = v } , N E ( v ) = N E ( v ) N E ( v ) denote the set of all incoming or outgoing edges of v V , and their unions.

Lemma 8. Let G = ( V , E ) be a multi-digraph and Δ its maximum degree.

i) | O min | < Δ | E | .

ii) O min can be determined in O ( Δ | E | 2 ) .

Proof. Note that the all-shortest path problem, which is to find a shortest connected path between all pairs of vertices u , v V is very well studied. In the case of integer edge weights, as we consider here, for instance the algorithms of [20] [21], solve the problem in much less then O ( | E | 2 ) . We consider all edges E min E with shortest path p e of minimum length l min such that p e { e } becomes a shortest cycle c e , p e . For any outgoing edge f N E ( tail ( e ) ) we denote with N f = N E ( tail ( e ) ) \ f and with c e , f , p e the possibly empty shortest cycle w.r.t. G \ N f . Then we observe that

O min = e E min , f N E ( tail ( e ) ) { c e , f , p e } . (53)

Thus, i) follows. Further, due to Edsger W. Dijkstra’s famous shortest path algorithm [22] each cycle c e , f , p e can be computed in O ( | E | ) yielding ii).

Certainly, O min ( G ) O min ( G ) for all isomorphic graphs G G . Further, the associated Betti numbers b k ( G , O min ( G ) , R ) , k = 0 , 1 , 2 , R = , 2 of X ( G , O min ( G ) ) can be computed efficiently due to Lemma 8 and Theorem 7. Indeed the bottleneck is given by computing dim R O min ( G ) R , which requires at most O ( | E | 3 ) by applying Gauss elimination or can be done even faster [23].

Recursively applying this procedure to the contraction G / G ( O min ) provides an array of Betti numbers b k n , k = 0 , 1 , 2 , 3 , n = 1, , N that decodes the structure of G independently of the chosen representation. We expect that this efficiently provided information is quite informative for structural analysis. That is: multi-digraphs G , G ¯ with b k n = b ¯ k n for all k = 0 , 1 , 2 ; n = 1 , , N are similar and G , G ¯ are not isomorphic whenever b k n , b ¯ k n differ for some k = 0 , 1 , 2 ; n = 1 , , N .

5.3. Fillings Induced by Cliques

Let K n = ( V n , E n ) , n , denote the complete undirected graph of n vertices, C n = O ¯ el ( K n ) . Then the number of cycles | C n | , up to cyclic reordering, is given by all subsets of vertices larger than 3, i.e.,

| V n | = n , | E n | = ( n 2 ) , | C n | = i = 3 n ( n i ) = 2 n n 2 , (54)

which can be computed in O ( n ) [24]. Since C n R = Λ ( K n , R ) , computing the Betti numbers of X ( K n , C n ) can be done in O ( n ) . Let K n ( G ) be the set of all cliques K n G of size n . Let α { 3, , N } , N with 3 α . Then we consider a set of cycles O α O ¯ el ( G ) such that c O ¯ el ( K k ) and K k K k ( G ) , k α . The geometric realization X ( G , O α ) is a canonical filling in the following sense:

The number of cycles | O α | can be computed efficiently due to (54). On the other hand O α contains all triangles of G α : = G ( O α ) , which generate Λ ( G α , R ) , R = , Z 2 . Thus, dim R O α R = dim R Λ ( G α , R ) . Hence, due to Theorem 7 the associated Betti numbers can be computed efficiently once the cliques K k ( G ) , k α have been determined.

Note that the general clique problem, i.e., finding a clique of given size n in a graph G, belongs to the list of classic NP-complete problems [25]. However, if the size n is fixed, finding cliques can be done efficiently in O ( | V | n n 2 ) . Even better performs the Bron-Kerbosch algorithm, which lists all maximal cliques [26]. Many other contributions in regard of clique detecting algorithms were made, allowing us to find a non-trivial filling as described above efficiently. Consequently, the homology groups H k ( X ( G , O ) , R ) yield a method of decoding how the cliques are glued with each other.

6. Applications

We expect the concept of studying a graph G by computing its cellular homology groups with respect to specified fillings O O ¯ el ( G ) to open up many possible applications. A short, non-exclusive list of ideas is presented below.

6.1. Graph Embeddings

We strengthen the classic notion of cellular graph embeddings as follows.

Figure 4. Embeddings of K 3,3 into 2 (left) and T 2 (right).

Definition 8 (Elementary embedding). Let G be a simple undirected graph, X 1 ( G ) its 1-dimensional geometric realization, and S a compact, connected C 0 -surface without boundary. Given a continuous embedding

ρ : X 1 ( G ) S (55)

we set X 1 ( S ) = ρ ( X 1 ( G ) ) and let X ( S ) = X 2 ( S ) be the CW complex given by attaching the set of disjoint 2-cells S \ X 1 ( S ) in the canonical way. We say that ρ is an elementary embedding of G into S if and only if there is a CW-complex embedding

ξ : X ( S ) X ( G , O ¯ el ( G ) ) , with ξ | X 1 ( S ) ρ = i d X 1 ( G ) . (56)

Thus, there are no edge crossings in ρ ( X 1 ( G ) ) , and every 2-cell in X ( S ) corresponds to some 2-cell in X ( G , O ¯ el ( G ) ) . By considering digraphs and replacing O ¯ el ( G ) with O el ( G ) directed notions can be derived.

Remark 6. Note that the usual notion of 2-cell embeddings of graphs is given by requiring only that

ρ : X 1 ( G ) S (57)

is an embedding, such that S \ ρ ( X 1 ( G ) ) is given by the union of 2-cells. Hence, an elementary embedding of G into S is a special case of a cellular embedding and therefore a stronger requirement on the embedding ρ . The following example makes this circumstance visible.

Example 2. Figure 4 illustrates embeddings of K 3,3 into the projective plane 2 and the torus T 2 . We recall that 2 can be understood as the quotient D / of a disc D 2 with the equivalence relation ~ defined on the boundary D S 1 2 by identifying opposite points x x , x 2 . In contrast, the torus T 2 = 2 / 2 is given by identifying opposite sides of a square. Consequently, the embedding of K 3,3 into 2 is an elementary embedding, while the boundary of F 3 in T 2 is not an elementary cycle. Therefore, the embedding of K 3,3 into T 2 is a 2-cell embedding, but not an elementary embedding.

Remark 7. Note furthermore, that in contrast to cellular embeddings an elementary embedding needs not to exist. For instance consider 2 cubes intersecting

in only one vertex. An elementary embedding could only be realized by allowing S to consist of two spheres intersecting in one point, requiring a relaxation that allows S to be a polyfold.

Given a simple undirected graph G with G = G ( O el ( G ) ) such that there is a surface S and an elementary embedding ρ : X 1 ( G ) S . Observe that the Euler characteristic of S is given by

χ ( S , R ) = b 2 ( G , O , R ) b 1 ( G , O , R ) + b 0 ( G , O , R ) , R = , 2 . (58)

Since S is a closed surface, we have b 2 ( G , O , 2 ) = 1 , while b 2 ( G , O , ) = 1 when S is orientable and b 2 ( G , O , ) = 0 if S is non-orientable [15].

Thus, if b 1 ( G , O , ) = 0 then S is given by the unit sphere S 2 or the projective plane 2 and G is required to be (projective) planar. The detection and construction of a (projective) planar cell embedding can be done in O ( | V | ) [27] [28]. In general, the problem of finding a cellular embedding of G into a orientable surface S of minimal genus is NP-complete [29]. However, finding a cellular embedding into a surface S of maximal genus is polynomial time solvable [30]. Our strengthened notion of embeddings and the relaxation of S being allowed to be non-orientable might provides new insights into embedding theory. Note that S is determined if we can find O O ¯ el ( G ) such that:

i) For every e E ( O ¯ el ( G ) ) there are exactly 2 cycles c 1 , c 2 with e E ( c 1 ) , e E ( c 2 ) .

ii) dim R O R = | O | 1 .

Thus, due to this simple description, one might finds new methods as integer linear programming techniques (ILP), which construct O and thereby S. By changing (2) into dim R O R < | O | the case of polyfold embeddings is treated, see Remark 7.

6.2. Graph Clustering

Many applications require comparing graphs or clustering a set of graphs G into subsets of “similar” graphs. The distance functions D : G × G + used to measure similarity between graphs often lead to NP-complete optimization problems and can only be approximated. For instance, the graph-edit distance D edit : G × G + [31] is a (pseudo) metric measuring the cost of deleting and inserting the minimum number of edges and vertices required to modify a graph G 1 into a graph G 2 . Determining this distance in general is an NP-hard problem, which is why additional assumptions and restrictions to special graph classes are usually made. We posit that any feasible clustering G 1 , , G n G of a given set of graphs G , i.e., G i G j = , i j , and i = 1 n G i = G , must fulfill the following properties:

a) Let G 1 , G 2 G be isomorphic or more generally homeomorphic, then G 1 , G 2 belong to the same cluster, i.e., G 1 , G 2 G i for some 1 i n .

b) Given G 1 , G 2 G , the problem of deciding whether G 1 and G 2 will belong to the same cluster is solvable in polynomial time.

b) The similarity distance D : G × G + is a (pseudo) metric.

As a consequence of (c) we have that if D ( G 0 , G ) d , d + for all G G i and some G 0 G i , then D ( G 1 , G 2 ) 2 d for all G 1 , G 2 G i , 1 i n . We propose that the following distance definition based on our work leads to a feasible clustering:

Definition 9. Let G = { G 1 , , G n } be a given set of multi-digraphs and O 1 , , O n O ¯ el ( G ) be sets of elementary cycles of G i , i = 1 , , n . Then, we define the homological distance D H : G × G by

D H ( G i 1 , G i 2 ) : = i = 0 2 | b i ( G i 1 , O i 1 , R ) b i ( G i 2 , O i 2 , R ) | , 1 i 1 , i 2 n . (59)

Certainly, D H is a pseudo metric. Further, due to Section 5, we can choose fillings O i that ensure that the Betti numbers, and thereby D H , can be computed efficiently. We denote with B i = k = 0 3 | b k ( G i , O i , R ) | and let ν = i = 1 n ( B i μ ) 2 , μ = mean ( B i ) i = 1, , n be the variance of the B i . Consider a maximum set of graphs G i j G , 1 j < n with D H ( G i j , G i j ) > 2 ν for all j j , then

G j = { G G | D H ( G i j , G ) ν } (60)

yields a clustering G = i = 1 n G i that satisfies (a), (b), and (c) above. Due to the fact that the Betti numbers measure the topological complexity of the graphs, we expect that such a clustering will be informative in structural data analysis.

6.3. Discrete Morse Theory

Let X be a CW complex, K the set of cells of X, and f : K a given function. If the (discrete) gradient flow of f possesses no degenerate critical points or singularities f is called a discrete Morse function [32]. In this regard, the strong and weak Morse inequalities become important, stating that

m N m N 1 + ± m 0 b N b N 1 + ± b 0 (61)

m n b n forall 1 n N = dim X , (62)

where m n denotes the number of critical cells of a Morse function f with Morse index n, and b n = b n ( X , K ) are the Betti numbers of X w.r.t. some field e.g. 2 . Thus, if X = X ( G , O ) is given by a filling of G then we can bound the essential complexity of the dynamics from below. Since the Betti numbers do not depend on the Morse function, we can furthermore determine whether there are homotopies that reduce the complexity of the system without loosing any of its essential dynamics [32].

7. Conclusion

We extended the classic Theorem of Biggs to the case of directed cycles of multi-digraphs. These findigs were then incorporated to extend the classic interpretation of a graph G being a 1-dimensional CW complex to the 2-dimensional case by attaching 2-cells accordingly to an a priori specified set of elementary cycles O O ¯ el ( G ) . We proved the associated cellular homology groups H k ( G , O , R ) , k = 0 , 1 , 2 to be free R-modules and derived formulas for computing their Betti numbers. Of course, one could ask for extending these considerations to even higher dimensions. However, in order to describe a 3-cell, we need to know all cycles forming its boundary. Thus, dealing with higher-dimensional cells makes computing the Betti numbers combinatorically harder, which made us restricting to the presented case.

7.1. Classic Graph-Theoretical Problems

It is natural to ask the question how classic graph-theoretical problems behave for certain classes of graphs. Here we provided some methods to classify graphs with respect to certain aspects of their topology. The fillings presented in Section 5 might be a good starting point of investigating classic graph theoretical problems on graphs with bounded topological complexity in terms of the derived Betti numbers. For instance the Feedback Arc Set Problem, which is to delete as less edges as possible to make G acyclic, could profit from certain obstructions on the cycle topology, [33] [34], as derived in Section 5.2. Further, the understanding of cliques of a graph is crucial if one wants to determine its chromatic numbers [35]. Despite for coloring problems [8], the minimal genus of a graph plays an important role within the unresolved Graph Isomorphism Problem [36] or extensions of Whitney’s Theorem [11].

7.2. Related Complexes and Homology Theories

In addition to the concepts presented here, there are multiple other possibilities of assigning complexes and (co)-homologies to a given graph G [37] [38] [39]. Translations of these theories in terms of our contribution are certainly of interest.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Mcnaught, A.D. and Wilkinson, A. (1997) IUPAC. Compendium of Chemical Terminology. 2nd Edition (the “Gold Book”), Wiley Blackwell, Hoboken.
[2] Dokholyan, N.V., Li, L., Ding, F. and Shakhnovich, E.I. (2002) Topological Determinants of Protein Folding. Proceedings of the National Academy of Sciences, 99, 8637-8641.
https://doi.org/10.1073/pnas.122076099
[3] Albert, R. (2005) Scale-Free Networks in Cell Biology. Journal of Cell Science, 118, 4947-4957.
https://doi.org/10.1242/jcs.02714
[4] Barkai, N. and Leibler, S. (1997) Robustness in Simple Biochemical Networks. Nature, 387, 913-917.
https://doi.org/10.1038/43199
[5] Anderson, J.A. (1995) An Introduction to Neural Networks. MIT Press, Cambridge.
https://doi.org/10.7551/mitpress/3905.001.0001
[6] Otte, E. and Rousseau, R. (2002) Social Network Analysis: A Powerful Strategy, Also for the Information Sciences. Journal of Information Science, 28, 441-453.
https://doi.org/10.1177/016555150202800601
[7] Hopcroft, J.E. and Wong, J.K. (1974) Linear Time Algorithm for Isomorphism of Planar Graphs (Preliminary Report). In: Proceedings of the 6th Annual ACM Symposium on Theory of Computing, STOC ‘74, ACM, New York, 172-184.
https://doi.org/10.1145/800119.803896
[8] Berge, C. (2001) The Theory of Graphs. Dover Books on Mathematics. Dover, New York.
[9] Deı, V.G., Klinz, B., Woeginger, G.J., et al. (2006) Exact Algorithms for the Hamiltonian Cycle Problem in Planar Graphs. Operations Research Letters, 34, 269-274.
https://doi.org/10.1016/j.orl.2005.04.013
[10] Berge, C. and Ghouila-Houri, A. (1965) Programming, Games and Transportation Networks. Methuen, London.
[11] Whitney, H. (1992) Congruent Graphs and the Connectivity of Graphs. In: Hassler Whitney Collected Papers, Springer, Berlin, 61-79.
https://doi.org/10.1007/978-1-4612-2972-8_4
[12] Arora, S. and Barak, B. (2009) Computational Complexity: A Modern Approach. Cambridge University Press, New York.
[13] Gross, J.L. and Tucker, T.W. (1987) Topological Graph Theory. Wiley-Interscience, New York.
[14] Biggs, N. (1993) Algebraic Graph Theory. 2nd Edition, Cambridge University Press, Cambridge.
[15] Hatcher, A. (2002) Algebraic Topology. Cambridge University Press, Cambridge.
[16] Tarjan, R. (1972) Depth-First Search and Linear Graph Algorithms. SIAM Journal on Computing, 1, 146-160.
https://doi.org/10.1137/0201010
[17] Bang-Jensen, J. and Gutin, G.Z. (2008) Digraphs: Theory, Algorithms and Applications. 2nd Edition, Springer Publishing Company, Incorporated, Berlin.
[18] Hopcroft, J. and Tarjan, R. (1973) Algorithm 447: Efficient Algorithms for Graph Manipulation. Communications of the ACM, 16, 372-378.
https://doi.org/10.1145/362248.362272
[19] Roberts, B. and Kroese, D. (2007) Estimating the Number of ST Paths in a Graph. Journal of Graph Algorithms and Applications, 11, 195-214.
https://doi.org/10.7155/jgaa.00142
[20] Johnson, D.B. (1977) Efficient Algorithms for Shortest Paths in Sparse Networks. Journal of the ACM (JACM), 24, 1-13.
https://doi.org/10.1145/321992.321993
[21] Hagerup, T. (2000) Improved Shortest Paths on the Word Ram. In: International Colloquium on Automata, Languages, and Programming, Springer, Berlin, 61-72.
https://doi.org/10.1007/3-540-45022-X_7
[22] Dijkstra, E.W. (1959) A Note on Two Problems in Connexion with Graphs. Numerische Mathematik, 1, 269-271.
https://doi.org/10.1007/BF01386390
[23] Cheung, H.Y., Kwok, T.C. and Lau, L.C. (2013) Fast Matrix Rank Algorithms and Applications. Journal of the ACM (JACM), 60, 31.
https://doi.org/10.1145/2528404
[24] Preiss, B.R. (1999) Data Structures and Algorithms with Object-Oriented Design Patterns in C++. John Wiley & Sons Inc., New York.
[25] Karp, R.M. (1972) Reducibility among Combinatorial Problems. In: Miller, R.E. and Thatcher, J.W., Eds., Complexity of Computer Computations, Plenum Press, New York, 85-103.
https://doi.org/10.1007/978-1-4684-2001-2_9
[26] Bron, C. and Kerbosch, J. (1973) Algorithm 457: Finding All Cliques of an Undirected Graph. Communications of the ACM, 16, 575-577.
https://doi.org/10.1145/362342.362367
[27] Gagarin, A. (2003) Graph Embedding Algorithms.
[28] Hopcroft, J. and Tarjan, R. (1974) Efficient Planarity Testing. Journal of the ACM (JACM), 21, 549-568.
https://doi.org/10.1145/321850.321852
[29] Thomassen, C. (1989) The Graph Genus Problem Is NP-Complete. Journal of Algorithms, 10, 568-576.
https://doi.org/10.1016/0196-6774(89)90006-0
[30] Furst, M.L., Gross, J.L. and McGeoch, L.A. (1988) Finding a Maximum-Genus Graph Imbedding. Journal of the ACM, 35, 523-534.
https://doi.org/10.1145/44483.44485
[31] Hecht, M. (2017) A Generalization of the Most Common Subgraph Distance and Its Application to Graph Editing. Pattern Recognition Letters, 87, 71-78.
https://doi.org/10.1016/j.patrec.2016.09.008
[32] Forman, R. (1998) Morse Theory for Cell Complexes. Advances in Mathematics, 134, 90-145.
https://doi.org/10.1006/aima.1997.1650
[33] Hecht, M. (2018) Exact Localisations of Feedback Sets. Theory of Computing Systems, 62, 1048-1084.
https://doi.org/10.1007/s00224-017-9777-6
[34] Hecht, M., Gonciarz, K. and Horvát, S. (2021) Tight Localizations of Feedback Sets. ACM Journal of Experimental Algorithmics, 26, 1-19.
https://doi.org/10.1145/3447652
[35] Johnson, D.S. and Trick, M.A. (1996) Cliques, Coloring, and Satisfiability: Second DIMACS Implementation Challenge, October 11-13, 1993. Volume 26, American Mathematical Society, Boston.
[36] Babai, L. (2016) Graph Isomorphism in Quasi-Polynomial Time. In: Proceedings of the Forty-Eighth Annual ACM Symposium on Theory of Computing, ACM, New York, 684-697.
https://doi.org/10.1145/2897518.2897542
[37] Grigor’yan, A., Lin, Y., Muranov, Y. and Yau, S.-T. (2012) Homologies of Path Complexes and Digraphs.
[38] Grigor’yan, A., Lin, Y., Muranov, Y. and Yau, S.-T. (2014) Cohomology of Digraphs and (Undirected) Graphs. Asian Journal of Mathematics, 19, 887-932.
[39] Jonsson, J. (2008) Simplicial Complexes of Graphs. Springer-Verlag, Berlin.
https://doi.org/10.1007/978-3-540-75859-4

Copyright © 2021 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.