<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Assembling Coherent Network Topologies Using Round-Trip Graphs (short paper)⋆</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marino Miculan</string-name>
          <email>marino.miculan@uniud.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Matteo Paier</string-name>
          <email>matteo.paier@imtlucca.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Mathematics</institution>
          ,
          <addr-line>Computer Science and Physics</addr-line>
          ,
          <institution>University of Udine</institution>
          ,
          <addr-line>Udine</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>ICTCS 2023: 24th Italian Conference on Theoretical Computer Science</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>IMT Scuola Alti Studi</institution>
          ,
          <addr-line>Lucca</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Discovering the network topology in computer networks is challenging due to limited communication and incomplete information about non-immediately connected nodes. In this paper we address the problem of assembling partial views obtained by discovery tools into a coherent representation, using round-trip graphs: labelled bipartite directed graphs representing the communications between hosts, interfaces, and networks. A merge operation is introduced, facilitating compositional and incremental assembly of partial views. This research provides a practical solution for incrementally constructing a comprehensive network topology.</p>
      </abstract>
      <kwd-group>
        <kwd>eol&gt;Graph theory</kwd>
        <kwd>Round-trip graphs</kwd>
        <kwd>Network discovery</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Discovering a network’s topology from inside the network itself is notoriously dificult [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
Tools such as traceroute allow for scanning the network from the host on which they are
executed, but these scans produce partial views of the network, because not all communications
are allowed between all nodes (e.g., due to the presence of firewalls) and some information
about non-immediately connected nodes may not be observed (e.g., MAC addresses).
      </p>
      <p>To obtain a better understanding of the network, scans can be performed from diferent
starting points, yielding diferent partial spanning trees of the same network. The problem that
arises at this point is: given many partial trees of this kind, how can they be assembled into a
coherent view of the entire network?</p>
      <p>In this paper, we address this problem in terms of graph theory. More specifically, we
introduce round-trip graphs, which are bipartite directed (multi)graphs, where nodes represent
hosts, interfaces and networks, and labelled edges represent the type of communication allowed
between nodes. The output of a scan produces such graphs, more specifically corresponding to
directed acyclic graphs (DAGs).</p>
      <p>We present a “merge” operation for these graphs, called amalgamation. This operation is
associative, allowing us to assemble the partial views in a compositional and incremental manner.
This operation can be computed in linear time with respect to the size of the input graphs.</p>
      <p>Overall, this research provides a practical solution for incrementally constructing a
comprehensive network topology; moreover, the graph models we introduce in this work can be used
for the optimal planning of network scans.</p>
      <p>
        Synopsis In Section 2 we recall the basic definitions about labelled graphs, and introduce the
notions of grounding and amalgamation. In Section 3 we introduce round-trip graphs, and
provide an example application. Conclusions and directions for future works are in Section 4.
2. Directed graphs, grounding and amalgamation
Let  = (, ) be a graph with vertex set  and edge set  ⊆  ×  .  is a bipartite graph
with labelled edges if there exists a partition of  into two disjoint sets 1 and 2, such that
 ⊆ 1 × 2 ∪ 2 × 1 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>More formally, let  be a fixed set of nodes,  be a fixed a set of labels.</p>
      <p>Definition 1. A labelled bipartite directed graph (shortly, graph) is a tuple  = (1, 2, 1, 2)
such that 1, 2 are two disjoint finite subsets of  , and 1 ⊆ 1 ×  × 2 and 2 ⊆ 2 ×  × 1.
For a graph , its components are denoted as 1, 2, etc. We use the shorthands  = 1 ∪ 2
and  = 1 ∪ 2.</p>
      <p>Intuitively, we will use nodes to represent parties of a network, while labelled edges denote
the types of communication allowed between these parties.</p>
      <p>Several operations can be defined on graphs. Let ,  be graphs.</p>
      <p>Graph union:  ∪  = (1 ∪ 1 , 2 ∪ 2 , 1 ∪ 1 , 2 ∪ 2 ).</p>
      <p>Renaming: A (node) renaming is a function  :  →  (not necessarily injective). It is
extended pointwise to sets of names. Moreover, if  ⊆  ×  ×  , then [ ] =
{( (), ,  ()) | (, , ) ∈ }.</p>
      <p>The renaming of  under  is [ ] = ( (1),  (2), 1[ ], 2[ ]).</p>
      <p>Proposition 1. 1. Composition of substitution: [ ∘  ′] = [ ′][ ]</p>
      <p>2. Renaming distributes over union: (1 ∪ 2)[ ] = 1[ ] ∪ 2[ ]
Definition 2. Let  be a graph. A labelled path (of length ) in  is a list  =
(0, 0, 1, 1, . . . , − 1, ) where for all  ∈ {0, . . . ,  − 1} : (, , +1) ∈ . If the path is
all labelled with the same label , it is called -path.</p>
      <p>We say that there is a /ℎ-round path from  to , denoted as  →/ℎ −  if there exists a -path
from  to  and a ℎ-path from  to .</p>
      <p>An unlabelled path (of length ) in  is a list  = (0, 1, . . . , ) such that for each  ∈
{0, . . . ,  − 1} there exists  such that (, , +1) ∈ .</p>
      <p>In the following by “path” we mean unlabelled path, unless diferently stated.
2.1. Grounding and Amalgamation
We will use graphs to represent (partial) knowledge about the network. To deal with incomplete
information we have to distingush between consolidated knowledge, like that directly observable
from a node, and hypothetical knowledge, i.e., guesses about parts of the network not directly
observable. This knowledge can be refined, e.g. by further observations on the network. To this
end, we fix a set  ⊆  of nodes, that we call ground. Ground nodes are those whose identity
is consolidated. Non-ground nodes are those whose existence is assumed or deduced, but still
to be verified. This verification happens when merging the knowledge obtained from diferent
points of view, that is, diferent graphs sharing some ground nodes and paths.
Definition 3 (Ground and suspended paths). A path  is ground if all nodes in it are ground.
We denote by gpath(, , ) the set of all ground paths in  between  and , and by gpath()
the set of all ground paths in .</p>
      <p>Let ,  ∈  . A path  is suspended between  and  if the first and last nodes of  are 
and  respectively, and all intermediate nodes are not ground (i.e., not in  ). Let us denote by
spath(, , ) the set of all suspended paths between  and  in .</p>
      <p>In our application, we aim to build a graph whose ground paths form a spanning tree.
Therefore a graph can contain at most one ground path between any two nodes.</p>
      <sec id="sec-1-1">
        <title>Definition 4.</title>
        <p>A graph  is sound if, for every ,  ∈  , | gpath(, , )| ≤ 1.</p>
        <p>When merging two graphs, we may resolve the uncertain knowledge of suspended paths
using ground paths between the same nodes. We call this operation grounding, defined next.
Definition 5 (Grounding). Let  = (0, . . . , ) be a ground in a sound graph , and let  be a
suspended path between 0 and , of the same length . The grounding of  on  is a partial
substitution  (,  ) :  ⇀  defined as follows:
 (,  )() =
{︃ if  = (0, . . . , ) and  = 
⊥</p>
        <p>otherwise.</p>
        <p>Let  be another graph. The grounding of  on  is the graph [ /] obtained by grounding
all suspended paths in  using the corresponding ground paths in . Formally,  / :  →  is
defined as follows:</p>
        <p>/ = ⋃︁{ (,  ) |  ∈ gpath(),  = (, . . . , ),  ∈ spath(, , ), | | = | |}
In other words:
 /() =
⎪⎩</p>
        <p>otherwise.
⎧⎪ if there exists  ∈ gpath(),  = (0, . . . , ),
⎨</p>
        <p>∈ spath(, 0, ),  = (0, ′1, . . . , ′− 1, ), ′ = 
This definition is well given because  is sound and hence  , if it exists, is unique. An example
of path grounding is in Fig. 1.</p>
        <p>We are almost ready to provide the core definition of our theory, i.e., how two diferent
(sound) views of the same network can be coherently merged, yielding a new (sound) view of
the network. We call this operation amalgamation.
a
a
b
c
e
d
e
⇒
a
b
c
d
e
Definition 6. Two graphs 1, 2 are compatible if for all  = (, . . . , ) ∈ gpath(1) and
 = (′, . . . , ′) ∈ gpath(2), if  = ′ and  = ′ then  =  .</p>
        <p>Lemma 1. For 1 and 2 two compatible sound graphs, 1 ∪ 2 is sound.</p>
        <p>Definition 7 (Amalgamation). Let 1, 2 be two compatible sound graphs. The amalgamation
of 1 and 2 is the graph obtained by grounding the union of 1 and 2 with the information of
both:  ⨿  ≜ ( ∪ )[ ∪/∪ ].</p>
        <p>Proposition 2. Let 1, 2 be two compatible sound graphs. Then:
1. Soundness: 1 ⨿ 2 is sound;
2. Neutral element: 1 ⨿ 0 = 1 (where 0 is the empty graph);
3. Symmetry: 1 ⨿ 2 = 2 ⨿ 1;
4. Associativity: (1 ⨿ 2) ⨿ 3 = 1 ⨿ (2 ⨿ 3).</p>
        <p>This result allows us to construct incrementally partial views of the network: we start with a
partial view (i.e., the result of a scan), and every time we obtain a new (partial) sound view of
the network, we can add it to the current graph.</p>
        <p>Proposition 3. The amalgamation of two compatible sound graphs 1, 2 can be computed in
(|1 ∪ 2 |).</p>
        <p>Proof. Follows from the fact that union and grounding are (|1 ∪ 2 |).</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>3. Round-trip graphs</title>
      <p>
        We now define round-trip graphs, i.e., labelled bipartite directed graphs designed for representing
the information that can be observed on computer networks using tools like traceroute [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <sec id="sec-2-1">
        <title>Definition 8.</title>
        <p>A round-trip graph is a labelled directed graph  where
ground nodes are of four kinds: CPU nodes (*_cpu); Layer nodes (*_layer3); Interface nodes
(*_eth); Network nodes (*.in-addr.arpa).
non-ground nodes are as above, but with a ? in the name (e.g., A_eth?);
edges can connect only: Layer nodes to CPU nodes and to interface nodes (and vice versa); interface
nodes to network nodes (and vice versa);
edge labels are from the set {icmp0, icmp8}.</p>
        <p>It is easy to see that round-trip graphs are bipartite: one side are CPU and interface node,
the other is layer and network nodes. Hence we can readily apply the theory developed in the
previous Section. In particular:</p>
      </sec>
      <sec id="sec-2-2">
        <title>Definition 9.</title>
        <p>A round-trip from  to  is an icmp0/icmp8-round path from  to .</p>
        <p>Tools like traceroute (normally) produce round-trips of minimal length, where only a few
nodes at the beginning of the path and the final one are ground; all the other nodes in between
are not ground. By amalgamating all the round-trips from a given starting point in the network,
we obtain a graph similar to a DAG.</p>
        <p>As an example, let us consider a network with six hosts , , , , ,  , connected via four
networks, as represented by the round-trip graph in Fig. 2d. The scans from  produce the
round-trip graph in Fig. 2a, while from  we can construct the graph in Fig. 2b. Nodes with
light color are not ground. Amalgamating these two graphs, we obtain the graph in Fig. 2c:
comparing with the actual one (Fig. 2d) we see that the only suspended paths are those which
have not been observed neither from  nor from .</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4. Conclusions</title>
      <p>In this paper we have shown how to merge network graphs, in order to create incrementally a
coherent view from partial views of the same network.</p>
      <p>
        As future work, we plan to consider that the starting views could be not accurate (e.g., due to
accessibility policies) and thus the resulting amalgamation could be incomplete. To this end, a
hierarchical model of the network could be useful; hence we plan to extend the theory of this
paper from directed graphs to directed bigraphs [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ], which have been successfully applied to
model hierarchical networks (like, e.g., containers and membrane systems [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ]). Finally, we
would like to implement this amalgamation notion in actual network discovery tools, as it will
prove useful to sys-admins and network security experts.
      </p>
      <p>A_cpu
icmp8icmp0
A_layer3
icmp8 icmp0
A_eth0
icmp8icmp0
0.1.0.10.in-addr.arpa
icmp8 icmp0
B_eth0
icmp8icmp0
B_layer3
icmp8 icmp0
A,C,6#B_eth?
icmp8 icmp0
A,C,7#?.in-addr.arpa
icmp8 icmp0
A,C,8#C_eth?
icmp8 icmp0</p>
      <p>C_layer3
icmp8 icmp0 icmp0 icmp8
C_cpu A,E,8#C_eth?</p>
      <p>icmp0 icmp8
A,E,7#?.in-addr.arpa
icmp0 icmp8
A,E,6#B_eth?
icmp8 icmp0</p>
      <p>B_eth1
icmp8icmp0
0.2.0.10.in-addr.arpa
icmp8 icmp0 icmp8 icmp0</p>
      <p>C_eth0 D_eth0
icmp8 icmp0
C_layer3
icmp8 icmp0
B,E,6#C_eth?
icmp8 icmp0
B,E,7#?.in-addr.arpa
icmp8 icmp0
B,E,8#E_eth?
icmp8 icmp0 icmp0
E_layer3
icmp0 icmp8 icmp8 icmp0
A,E,12#E_eth? E_cpu
icmp0 icmp8
A,E,11#?.in-addr.arpa
icmp0 icmp8</p>
      <p>A,E,10#C_eth?</p>
      <p>B_layer3
B_eth0</p>
      <p>B_eth1
icmp0 icmp8 icmp0 icmp8</p>
      <p>D_eth1
icmp0 icmp8icmp0 icmp8</p>
      <p>0.4.0.10.in-addr.arpa
icmp0 icmp8 icmp0 icmp8</p>
      <p>F_eth0
icmp0 icmp8icmp0 icmp8</p>
      <p>F_layer3
icmp0 icmp8 icmp0 icmp8</p>
      <p>F_cpu
(a) Graph induced by traceroutes from A.
(b) Graph induced by traceroutes from B.
(c) Amalgamation of the graphs in Figs. 2a and 2b.</p>
      <p>Figure 2: Amalgamation of round-trip graphs induced by traceroutes</p>
      <p>A_cpu
icmp0 icmp8icmp0 icmp8</p>
      <p>A_layer3
icmp0 icmp8 icmp0 icmp8</p>
      <p>A_eth0
icmp0 icmp8icmp0 icmp8</p>
      <p>0.1.0.10.in-addr.arpa
icmp0 icmp8 icmp0 icmp8</p>
      <p>B_eth0
icmp0 icmp8icmp0 icmp8</p>
      <p>B_layer3
icmp8 icmp0 icmp8
B_cpu
icmp0 icmp8</p>
      <p>C_eth0
icmp0 icmp8 icmp0 icmp8</p>
      <p>C_layer3
icmp8 icmp0 icmp8 icmp0 icmp8</p>
      <p>C_eth1
icmp0 icmp8icmp0 icmp8</p>
      <p>0.3.0.10.in-addr.arpa
icmp0 icmp8 icmp0 icmp8</p>
      <p>E_eth0
icmp0 icmp8icmp0 icmp8</p>
      <p>E_layer3
icmp0 icmp8 icmp0 icmp8</p>
      <p>E_cpu
icmp0 icmp8 icmp0 icmp8</p>
      <p>B_eth1
icmp0 icmp8icmp0 icmp8</p>
      <p>0.2.0.10.in-addr.arpa
icmp0 icmp8 icmp0 icmp8 icmp0 icmp8</p>
      <p>D_eth0
icmp0 icmp8 icmp0 icmp8</p>
      <p>D_layer3
icmp0 icmp8 icmp0 icmp8</p>
      <p>D_cpu</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Beerliova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Eberhard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Erlebach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hofmann</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          <article-title>Mihal'ak, L. S. Ram, Network discovery and verification</article-title>
          ,
          <source>J. selected areas in communications 24</source>
          (
          <year>2006</year>
          )
          <fpage>2168</fpage>
          -
          <lpage>2181</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bang-Jensen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. Z.</given-names>
            <surname>Gutin</surname>
          </string-name>
          , Digraphs - Theory, Algorithms and Applications, 2nd ed., Springer Monographs in Mathematics, Springer,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Malkin</surname>
          </string-name>
          ,
          <article-title>Traceroute Using an IP Option</article-title>
          , RFC
          <volume>1393</volume>
          ,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G.</given-names>
            <surname>Bacci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Grohmann</surname>
          </string-name>
          , M. Miculan,
          <article-title>DBtk: A toolkit for directed bigraphs</article-title>
          ,
          <source>in: Proc. CALCO</source>
          <year>2009</year>
          , volume
          <volume>5728</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2009</year>
          , pp.
          <fpage>413</fpage>
          -
          <lpage>422</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Grohmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Miculan</surname>
          </string-name>
          ,
          <article-title>Directed bigraphs</article-title>
          ,
          <source>in: Proc. MFPS</source>
          <year>2007</year>
          , volume
          <volume>173</volume>
          <source>of ENTCS</source>
          , Elsevier,
          <year>2007</year>
          , pp.
          <fpage>121</fpage>
          -
          <lpage>137</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>F.</given-names>
            <surname>Burco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Miculan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Peressotti</surname>
          </string-name>
          ,
          <article-title>Towards a formal model for composable container systems</article-title>
          ,
          <source>in: Proc. SAC</source>
          , ACM,
          <year>2020</year>
          , pp.
          <fpage>173</fpage>
          -
          <lpage>175</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>G.</given-names>
            <surname>Bacci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Grohmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Miculan</surname>
          </string-name>
          ,
          <article-title>Bigraphical models for protein and membrane interactions</article-title>
          ,
          <source>in: MeCBIC</source>
          , volume
          <volume>11</volume>
          <source>of EPTCS</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <source>icmp8 icmp0 icmp8 icmp0</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>