<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Visualizing reconciliations in co-phylogeny (Extended Abstract)?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tiziana Calamoneri</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Valentino Di Donato</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Diego Mariottini</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maurizio Patrignani</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Science Department, University of Rome \Sapienza"</institution>
          ,
          <addr-line>Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Engineering Department, Roma Tre University</institution>
          ,
          <addr-line>Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>An urgent need in the biological research eld consists of unambiguously and e ectively representing co-phylogenetic trees, that are pairs of phylogenetic trees that have a mapping among their nodes. The typical application is the investigation of the co-evolution of families of organisms such as hosts and parasites. A phylogenetic tree is a full binary rooted tree representing the inferred evolutionary relationships among relative species. In order to be meaningful from a biological point of view, a phylogenetic tree is required to be a full binary tree (i.e., each node has zero or two children). The biologists who study the co-evolution of hosts and parasites start with a host phylogenetic tree H, a parasite phylogenetic tree P , and a mapping function ' from the leaves of P to the leaves of H and represent them as a tanglegram (i.e. a pair of plane trees whose leaves are connected by straight-line edges [1,4]). Triple (H; P; ') is called co-phylogenetic tree and only represents the input of a more complicated analysis process that aims at computing a mapping , called reconciliation, that extends ' and maps all the parasite nodes onto the host nodes with constraints. Intuitively, a reconciliation is a reasonable reconstruction of the co-evolution events that produced the co-phylogenetic tree [2,6]. Given a co-phylogenetic tree, a great number of di erent reconciliations are possible. Some of them can be easily discarded, since they are not consistent with time (i.e. they induce contradictory constraints on the periods of existence of the species associated to internal nodes). The remaining reconciliations are generally ranked based on some quality measure. Nevertheless, the number of plausible reconciliations is still so high and the di erence between two equallyranked reconciliations may be so big that biologists have to perform a painstaking manual inspection to select the reconciliations that are more compatible with the two phylogenetic trees and with their understanding of the evolution phenomena. In this context, the contributions of this paper are the following:</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Introduction
? This research was partially supported by MIUR project \MODE { MOrphing graph
Drawings E ciently", prot. 20157EFM5C 001 and by Sapienza University of Rome
projects \Graph Algorithms for Phylogeny: a promising approach" and
\Combinatorial structures and algorithms for problems in co-phylogeny".
1. We propose a new and unambiguous metaphor to represent reconciliations of
co-phylogenetic trees. The main idea is that of representing H in a suitable
space- lling style and of using a traditional node-link style to represent P .
2. In order to pursue readability we study the number of crossings that are
introduced in the drawing of tree P in order to respect the represented
reconciliation; on the one hand, we characterize some \planar instances", i.e.,
triples hH; P; 'i, in which every reconciliation can be represented without
crossings; on the other hand, we show that, in the general case and for all
kinds of representations that comply with a restricted set of aesthetic criteria,
reducing the number of crossings in the representation of the reconciliations
is NP-hard.
2</p>
      <p>
        A new model for the visualization of reconciliations
While a decade-old literature is devoded to the visualization of co-phylogenetic
trees as tanglegrams (see, for example, [
        <xref ref-type="bibr" rid="ref1 ref4">1,4</xref>
        ]), little can be said about the
representation of the reconciliations of co-phylogenetic trees, except some
visualization strategies adopted by the available tools for their computation.
      </p>
      <p>In particular, few strategies can be highlighted.</p>
      <p>The simplest one, schematically represented in Fig. 1(a), is that of
representing the two trees by adopting the traditional node-link metaphor, where the
nodes of P are drawn close to the nodes of H they are associated to. The
advantage of this strategy lays in its simplicity. However, when several parasite
nodes are associated to the same host node, the drawing becomes cluttered and
the attribution of parasite nodes to host nodes becomes unclear. Further, even
if tree P was drawn without crossings (tree H always is), the overlapping of the
two trees produces a high number of crossings.</p>
      <p>(a)
(b)
(c)</p>
      <p>An alternative strategy (Fig. 1(b)) consists in representing H as a background
shape, such that its nodes are light-colored circles and its edges are thick pipes,
while P is contained into H and drawn in the traditional node-link style. This
strategy is particularly e ective, as it is unambiguous and crossings between the
two trees are strongly reduced, but it is still cluttered when a parasite subtree
has to be squeezed inside the reduced area of a host node.</p>
      <p>To solve this cluttering problem some visualization tools adopt the strategy
of keeping the containment metaphor while only drawing thick edges of H and
omitting host nodes (Fig. 1(c)). This produces a node-link drawing of the parasite
tree drawn inside the pipes representing the host tree. Also this strategy is
sometimes ambiguous, since it is unclear how to attribute parasite nodes to host
nodes.</p>
      <p>With the aim of overcoming the limitations of existing visualization
strategies, we propose a new metaphor for the representation of reconciliations. Our
strategy is complementary to the third strategy just described: we omit
altogether the arcs of H and represent its nodes as regions of the plane. Arcs are
implicitly represented by the contact of the regions corresponding to the nodes.
Tree P , instead, maintains the traditional node-link representation, where
parasites are placed into the regions associated with the hosts they are mapped
to. This representation is not ambiguous, since the containment relationship
between host and parasites unambiguously represents the mapping .</p>
      <p>
        Indeed, the representation of tree H is a variant of a representation known
in the literature with the name of icicle [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. An icicle (Fig. 2(a)) is a
spacelling representation of hierarchical information in which nodes are represented
by rectangles and arcs are represented by the contact of rectangles, such that
the bottom side of the rectangle of a node touches the top sides of its children.
      </p>
      <p>With respect to the standard icicle representation, we draw H with the
following variants (Fig. 2(b)):
{ we impose that the bottom sides of the rectangles representing the leaves of
H touch the bottom border of the drawing area. This forces all
contemporaneous hosts to share the same bottom line that represents current time,
{ for each generation of the host tree, we modify the height of the rectangles
in such a way that all the parasite subtrees are contained into their hosts
rectangles,
{ we specialize the icicle to the representation of binary tree, rounding the
corners of the rectangles in order to ease the visual recognition of the two
children of each node.
3</p>
      <p>Planar instances and reconciliations with the minimum
number of crossings
In this section we characterize the reconciliations that can be planarily drawn,
showing that there exist some \planar instances" not depending on the
represented reconciliation. Then we naturally wonder what happens when the given
instance is not planar and show an NP-hardness result. We omit the proof for
the sake of brevity.</p>
      <p>Let be given a co-phylogenetic tree (H; P; ') and a reconciliation .
De nition 1. A drawing D( ) of
any metaphor such that:
is a visual representation of
according to
{ the leaves of H are drawn along a horizontal line (i.e. they have the same
y-coordinate, w.l.o.g. let it be equal to 0),
{ tree H is embedded (i.e. drawn in a plane way) in the half-plane with
nonnegative y-coordinates,
{ each node p 2 P is drawn inside the representation of node h 2 H such that
(p) = h,
{ each edge (s; t) of P (except the ones performing a host-transfer) is drawn
inside the representation of the nodes and the edges of H belonging to path
from (s) and (t),
{ the sub-forest induced by all nodes of P mapped on the same node of H
follows a strictly downward convention.</p>
      <p>We call V (H), A(H), V (P ) and A(P ) the set of nodes and of arcs of H and
P , respectively. VL(H)) and VL(P )) are the set of leaves of the two trees.</p>
      <p>Let G(H; P; ') be the graph whose arc set is A(H) [ A(P ) and node set is
V (H) [ (V (P ) n VL(P )) and every leaf l of P is identi ed with leaf '(l) of H.
G(H; P; ') is cyclic and, in general, not planar. It can be represented on the
plane so that:
{ all the leaves of H (and hence also those of P ) lie on a horizontal line l,
{ internal nodes and edges of H are embedded (i.e. drawn in a plane way) in
the half-plane above l in a strictly downward fashion,
{ internal nodes and edges of P are represented (not necessarily without
crossings) in the half-plane under l in a strictly upward fashion.</p>
      <p>In the following, we will speak about drawing of G(H; P; ') without other
specications to mean a representation on the plane satisfying the above constraints.</p>
    </sec>
    <sec id="sec-2">
      <title>Theorem 1. The following claims are equivalent:</title>
    </sec>
    <sec id="sec-3">
      <title>1. G(H; P; ') is planar and a plane drawing of it is known,</title>
      <p>2. for each time-consistent reconciliation , there exists a plane drawing D( ).</p>
      <p>Theorem 1 states that we will have a plane drawing of a reconciliation
if and only if graph G(H; P; ') is planar. It is hence natural to wonder what
happens otherwise. We focus on the problem of the crossing minimization in
the drawing of a given reconciliation and prove that computing a drawing of a
reconciliation with the minimum number of crossings is NP-hard.</p>
      <p>Given and a constant k, we consider the problem Reconciliation Layout
(RL) as the problem asking whether there exist a permutation of the leaves of
H and a mapping of the nodes of H and of P on the plane such the drawing of
has at most k crossings.</p>
    </sec>
    <sec id="sec-4">
      <title>Theorem 2. Problem RL is NP-complete.</title>
      <p>
        We prove it by reducing to RL the NP-complete problem Two-Trees Crossing
Minimization (TTCM), whose input consists of two binary trees T1 and T2,
whose leaf sets are in one-to-one correspondence, and a constant k; the question
is whether T1 and T2 admit a tanglegram drawing with at most k crossings
among the tangles [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>We observe that, in the reduction, a primary role is played by host-transfer
edges, and this is reasonable, since they connect nodes in completely di erent
parts of the tree, possibly very far. Nevertheless, host-transfers are very special
edges and we believe that biologists have no problems in understanding the
drawing even if these edges introduce some crossings, so we wonder whether
crossings involve only host-transfers or not. We answer in a negative way to this
question by showing that some crossings can occur, even if host-transfer edges
are eliminated from the representation.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Buchin</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buchin</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Byrka</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , Nollenburg,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Okamoto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            ,
            <surname>Silveira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.I.</given-names>
            ,
            <surname>Wol</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          :
          <article-title>Drawing (complete) binary tanglegrams</article-title>
          .
          <source>Algorithmica</source>
          <volume>62</volume>
          (
          <issue>1-2</issue>
          ),
          <volume>309</volume>
          {
          <fpage>332</fpage>
          (
          <year>2012</year>
          ), http://dx.doi.org/10.1007/s00453-010-9456-3
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Charleston</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          :
          <article-title>Jungles: a new solution to the host/parasite phylogeny reconciliation problem</article-title>
          .
          <source>Math Biosci</source>
          <volume>149</volume>
          (
          <issue>2</issue>
          ),
          <volume>191</volume>
          {
          <fpage>223</fpage>
          (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Fernau</surname>
            , H., Kaufmann,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poths</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Comparing trees via crossing minimization</article-title>
          .
          <source>J. Comput. Syst. Sci</source>
          .
          <volume>76</volume>
          (
          <issue>7</issue>
          ),
          <volume>593</volume>
          {
          <fpage>608</fpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Fernau</surname>
            , H., Kaufmann,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poths</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Comparing trees via crossing minimization</article-title>
          .
          <source>J. Comput. Syst. Sci</source>
          .
          <volume>76</volume>
          (
          <issue>7</issue>
          ),
          <volume>593</volume>
          {
          <fpage>608</fpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Kruskal</surname>
            ,
            <given-names>J.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Landwehr</surname>
            ,
            <given-names>J.M.:</given-names>
          </string-name>
          <article-title>Icicle plots: Better displays for hierarchical clustering</article-title>
          .
          <source>The American Statistician</source>
          <volume>37</volume>
          (
          <issue>2</issue>
          ),
          <volume>162</volume>
          {
          <fpage>168</fpage>
          (
          <year>1983</year>
          ), http://www.jstor.org/ stable/2685881
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Page</surname>
          </string-name>
          , R.D.,
          <string-name>
            <surname>Charleston</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          :
          <article-title>Trees within trees: phylogeny and historical associations</article-title>
          .
          <source>Trends Ecol Evol</source>
          <volume>13</volume>
          (
          <issue>9</issue>
          ),
          <volume>356</volume>
          {
          <fpage>359</fpage>
          (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>