<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Prediction of New Associations between ncRNAs and Diseases Exploiting Multi-Type Hierarchical Clustering (Discussion Paper)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Emanuele Pio Barracchia</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gianvito Pio</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Domenica D'Elia</string-name>
          <email>domenica.delia@ba.itb.cnr.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michelangelo Ceci</string-name>
          <email>michelangelo.ceci@uniba.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Big Data Laboratory</institution>
          ,
          <addr-line>CINI Consortium - Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>CNR, Institute for Biomedical Technologies - Bari</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Dept. of Computer Science - University of Bari Aldo Moro</institution>
          ,
          <addr-line>Bari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Dept. of Knowledge Technologies, Jozef Stefan Institute</institution>
          ,
          <addr-line>Ljubljana</addr-line>
          ,
          <country country="SI">Slovenia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The study of functional associations between ncRNAs and human diseases is a pivotal task of modern research to develop new and more effective therapeutic approaches. Nevertheless, it is not a trivial task since it involves entities of different types, such as microRNAs, lncRNAs or target genes. This complexity can be addressed by representing the involved biological entities and their relationships as a network and by exploiting network-based computational approaches able to identify new associations. However, existing methods are limited to homogeneous networks or can exploit only a limited set of the features of biological entities. To overcome these limitations, we proposed the system LP-HCLUS, which analyzes heterogeneous networks consisting of several types of objects and relationships, each possibly described by a set of features, and extracts hierarchically organized, possibly overlapping, multi-type clusters that are subsequently exploited to predict new ncRNA-disease associations. Our experimental evaluation shows that, according to both quantitative (i.e., TPR@k, ROC and PR curves) and qualitative criteria, LP-HCLUS produces better results than the considered competitors.</p>
      </abstract>
      <kwd-group>
        <kwd>non-coding RNA (ncRNAs)</kwd>
        <kwd>diseases</kwd>
        <kwd>cancer</kwd>
        <kwd>heterogeneous network</kwd>
        <kwd>clustering</kwd>
        <kwd>link prediction</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>High-throughput sequencing technologies and recent, more efficient
computational approaches have been fundamental for the rapid advances in functional
genomics. Among the most relevant results is the discovery of thousands
of non-coding RNAs (ncRNAs) with a regulatory function on gene expression.</p>
      <p>
        In parallel, the number of studies reporting the involvement of ncRNAs in
the development of many different human diseases has grown exponentially. The
first type of ncRNAs to be discovered and extensively studied is that of
microRNAs (miRNAs), classified as small non-coding RNAs, in contrast with long
non-coding RNAs (lncRNAs), which are ncRNAs longer than 200 nt. While
miRNAs primarily act as post-transcriptional regulators, lncRNAs have a plethora of
regulatory functions [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. However, the number of lncRNAs for which the
functional and molecular mechanisms are completely elucidated is still quite small,
and experimental investigations are still too expensive to be carried
out without any computational pre-analysis. In the last few years, there have
been several attempts to computationally predict the relationships among
biological entities, such as genes, miRNAs, lncRNAs and diseases [
        <xref ref-type="bibr" rid="ref1 ref11 ref13 ref15">1,11,13,15</xref>
        ]. Such
methods are based on a network representation of the entities under study and
on the identification of new links among nodes in the network. However, most
of them are able to work only on homogeneous networks (where nodes and links
are of one single type) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], are strongly limited by the number of different node
types or are constrained by pre-defined network structures.
      </p>
      <p>Copyright © 2020 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0). This volume is published
and copyrighted by its editors. SEBD 2020, June 21-24, 2020, Villasimius, Italy.</p>
      <p>
        In this discussion paper, we describe the method LP-HCLUS [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], which is able
to overcome these limitations. In particular, it can discover new ncRNA-disease
relationships from heterogeneous attributed networks (i.e., networks consisting of
different biological entities related by different types of relationships) with arbitrary
structure. This ability allows LP-HCLUS to investigate the interactions among
different types of entities, possibly leading to increased prediction accuracy.
      </p>
      <p>LP-HCLUS exploits a combined approach based on hierarchical, multi-type
clustering and link prediction. As we will detail in the next section, a multi-type
cluster is actually a heterogeneous sub-network. Therefore, the adoption of a
clustering-based approach allows LP-HCLUS to base its predictions on relevant,
highly cohesive heterogeneous sub-networks. Moreover, the hierarchical
organization of clusters allows it to perform predictions at different levels of granularity,
taking into account either local/specific or global/general relationships.</p>
    </sec>
    <sec id="sec-2">
      <title>Method</title>
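      <p>In the following, we introduce the notation and some useful definitions.</p>
      <p><bold>Definition 1 (Heterogeneous attributed network).</bold> A heterogeneous
attributed network is a network <inline-formula><tex-math>G = (V, E)</tex-math></inline-formula>, where <inline-formula><tex-math>V</tex-math></inline-formula> is the set of nodes and <inline-formula><tex-math>E</tex-math></inline-formula> is
the set of edges, and both nodes and edges can be of different types. Moreover:</p>
      <list list-type="bullet">
        <list-item><p><inline-formula><tex-math>T = T_t \cup T_{tr}</tex-math></inline-formula> is the set of node types, where <inline-formula><tex-math>T_t</tex-math></inline-formula> is the set of target types, i.e., those
considered as the target of the clustering/prediction task, and <inline-formula><tex-math>T_{tr}</tex-math></inline-formula> is the set of
task-relevant types. Only nodes of target types are clustered and considered
in the identification of new relationships.</p></list-item>
        <list-item><p>Each node type <inline-formula><tex-math>T_v \in T</tex-math></inline-formula> defines a subset of nodes in the network, i.e., <inline-formula><tex-math>V_v \subseteq V</tex-math></inline-formula>.</p></list-item>
        <list-item><p>Each node type <inline-formula><tex-math>T_v \in T</tex-math></inline-formula> is associated with a set of attributes <inline-formula><tex-math>A_v = \{A_{v,1}, A_{v,2}, \ldots, A_{v,m_v}\}</tex-math></inline-formula>, i.e., nodes of the type <inline-formula><tex-math>T_v</tex-math></inline-formula> are described by the attributes <inline-formula><tex-math>A_v</tex-math></inline-formula>.</p></list-item>
        <list-item><p><inline-formula><tex-math>R</tex-math></inline-formula> is the set of all the possible edge types.</p></list-item>
        <list-item><p>Each edge type <inline-formula><tex-math>R_l \in R</tex-math></inline-formula> defines a subset of edges <inline-formula><tex-math>E_l \subseteq E</tex-math></inline-formula>.</p></list-item>
      </list>
      <p>[Fig. 1: panels (a) and (b), showing the clusters C1, C2, C3, C4 and C5 and their merged groups.]</p>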
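      <p><bold>Definition 2 (Hierarchical multi-type clustering).</bold> A hierarchy of
multi-type clusters is defined as a list of hierarchy levels <inline-formula><tex-math>[L_1, L_2, \ldots, L_k]</tex-math></inline-formula>, where each
<inline-formula><tex-math>L_i</tex-math></inline-formula> consists of a set of overlapping multi-type clusters. For each level <inline-formula><tex-math>L_i</tex-math></inline-formula>, <inline-formula><tex-math>i = 2, 3, \ldots, k</tex-math></inline-formula>,
<inline-formula><tex-math>\forall G' \in L_i \; \exists G'' \in L_{i-1}</tex-math></inline-formula> such that <inline-formula><tex-math>G''</tex-math></inline-formula> is a subnetwork of <inline-formula><tex-math>G'</tex-math></inline-formula> (Fig. 1).</p>
      <p>According to these definitions, we define the task considered in this work.</p>
      <p><bold>Definition 3 (Predictive hierarchical clustering for link prediction).</bold>
Given a heterogeneous attributed network <inline-formula><tex-math>G = (V, E)</tex-math></inline-formula> and the set of target types
<inline-formula><tex-math>T_t</tex-math></inline-formula>, the goal is to find:</p>
      <list list-type="bullet">
        <list-item><p>A hierarchy of overlapping multi-type clusters <inline-formula><tex-math>[L_1, L_2, \ldots, L_k]</tex-math></inline-formula>.</p></list-item>
        <list-item><p>A function <inline-formula><tex-math>\psi^{(w)} : V_{i_1} \times V_{i_2} \rightarrow [0, 1]</tex-math></inline-formula> for each hierarchical level <inline-formula><tex-math>L_w</tex-math></inline-formula> (<inline-formula><tex-math>w \in \{1, 2, \ldots, k\}</tex-math></inline-formula>),
where nodes in <inline-formula><tex-math>V_{i_1}</tex-math></inline-formula> are of type <inline-formula><tex-math>T_{i_1} \in T_t</tex-math></inline-formula> and nodes in <inline-formula><tex-math>V_{i_2}</tex-math></inline-formula> are of type <inline-formula><tex-math>T_{i_2} \in T_t</tex-math></inline-formula>.
Each function <inline-formula><tex-math>\psi^{(w)}</tex-math></inline-formula> maps each possible pair of nodes (of types <inline-formula><tex-math>T_{i_1}</tex-math></inline-formula> and <inline-formula><tex-math>T_{i_2}</tex-math></inline-formula>)
to a score representing the degree of certainty of their relationship.</p></list-item>
      </list>
      <p>In this paper, LP-HCLUS has been used to solve the task formalized in Definition
3, by considering ncRNAs and diseases as target types. Hence, we determine two
distinct sets of nodes, denoted by <inline-formula><tex-math>T_n</tex-math></inline-formula> and <inline-formula><tex-math>T_d</tex-math></inline-formula>, representing the set of ncRNAs and
the set of diseases, respectively. In the following subsections, we describe the
main steps executed by LP-HCLUS (see Fig. 2 for a general overview).</p>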
      <sec id="sec-2-1">
        <title>2.1 Estimation of the strength of the relationship</title>
        <p>
          In the first phase, we estimate the strength of the relationship among all the
possible ncRNA-disease pairs in the network <inline-formula><tex-math>G</tex-math></inline-formula>. In particular, we aim to
compute a score <inline-formula><tex-math>s(n_i, d_j)</tex-math></inline-formula> for each possible pair <inline-formula><tex-math>(n_i, d_j)</tex-math></inline-formula>, by exploiting the concept of
meta-path. According to [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], a meta-path is a set of sequences of nodes
(involving both target and task-relevant types) that follow the same sequence of
edge types, and can be used to fruitfully represent conceptual (possibly
indirect) relationships between two entities in a heterogeneous network. Given the
ncRNA <inline-formula><tex-math>n_i</tex-math></inline-formula> and the disease <inline-formula><tex-math>d_j</tex-math></inline-formula>, the relationship between them can be
considered "certain" if there is at least one meta-path that confirms its certainty.
        </p>
        <p>[Fig. 2: overview of LP-HCLUS, with three phases: estimation of the strength of relationships (extracted, scored edges), construction of the hierarchy of clusters, and prediction (predicted relationships).]</p>
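        <p>Therefore, by assimilating the score associated with an interaction to its degree
of certainty, we compute <inline-formula><tex-math>s(n_i, d_j)</tex-math></inline-formula> as the maximum value observed over all the
possible meta-paths between <inline-formula><tex-math>n_i</tex-math></inline-formula> and <inline-formula><tex-math>d_j</tex-math></inline-formula>. Formally:
          <disp-formula><label>(1)</label><tex-math><![CDATA[s(n_i, d_j) = \max_{P \in \mathit{metapaths}(n_i, d_j)} \mathit{pathscore}(P, n_i, d_j)]]></tex-math></disp-formula>
where <inline-formula><tex-math>\mathit{metapaths}(n_i, d_j)</tex-math></inline-formula> is the set of meta-paths connecting <inline-formula><tex-math>n_i</tex-math></inline-formula> and <inline-formula><tex-math>d_j</tex-math></inline-formula>, and
<inline-formula><tex-math>\mathit{pathscore}(P, n_i, d_j)</tex-math></inline-formula> is the degree of certainty of the relationship between <inline-formula><tex-math>n_i</tex-math></inline-formula> and
<inline-formula><tex-math>d_j</tex-math></inline-formula> according to the meta-path <inline-formula><tex-math>P</tex-math></inline-formula>.</p>
        <p>In order to compute <inline-formula><tex-math>\mathit{pathscore}(P, n_i, d_j)</tex-math></inline-formula>, we
represent each meta-path <inline-formula><tex-math>P</tex-math></inline-formula> as a finite set of sequences of nodes. If a sequence
in <inline-formula><tex-math>P</tex-math></inline-formula> connects <inline-formula><tex-math>n_i</tex-math></inline-formula> and <inline-formula><tex-math>d_j</tex-math></inline-formula>, then <inline-formula><tex-math>\mathit{pathscore}(P, n_i, d_j) = 1</tex-math></inline-formula>. Otherwise, following
the same strategy introduced before, it is computed as the maximum similarity
between the sequences that start with <inline-formula><tex-math>n_i</tex-math></inline-formula> and the sequences that end with <inline-formula><tex-math>d_j</tex-math></inline-formula>
(see Fig. 3). The intuition behind this formula is that if <inline-formula><tex-math>n_i</tex-math></inline-formula> and <inline-formula><tex-math>d_j</tex-math></inline-formula> are not directly
connected, their score represents the similarity of the nodes and edges they are
connected to. The similarity between two sequences <inline-formula><tex-math>seq'</tex-math></inline-formula> and <inline-formula><tex-math>seq''</tex-math></inline-formula> is computed
according to the attributes of all nodes involved in the two sequences. Following [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], if <inline-formula><tex-math>x</tex-math></inline-formula> is a numeric attribute, then
          <disp-formula><tex-math><![CDATA[s_x(seq', seq'') = 1 - \frac{|val_x(seq') - val_x(seq'')|}{max_x - min_x}]]></tex-math></disp-formula>
where <inline-formula><tex-math>min_x</tex-math></inline-formula> (resp. <inline-formula><tex-math>max_x</tex-math></inline-formula>) is the minimum (resp. maximum) value for the attribute <inline-formula><tex-math>x</tex-math></inline-formula>;
if <inline-formula><tex-math>x</tex-math></inline-formula> is not a numeric attribute, <inline-formula><tex-math>s_x(seq', seq'') = 1</tex-math></inline-formula> if <inline-formula><tex-math>val_x(seq') = val_x(seq'')</tex-math></inline-formula>, and 0
otherwise.</p>
        <p>In this solution, there could be some node types that are not involved
in any meta-path. In order to exploit the information conveyed by these nodes,
we add an aggregation of their attribute values (the arithmetic mean for
numerical attributes, the mode for non-numerical attributes) to the nodes that are
connected to them and that appear in at least one meta-path.</p>
        <p><bold>2.2 Construction of a hierarchy of overlapping multi-type clusters</bold></p>
        <p>We construct the first level of the hierarchy by identifying a set of overlapping
multi-type clusters in the form of bicliques. To this aim, we perform three steps:</p>
        <list list-type="roman-lower">
          <list-item><p>Filtering, which keeps only the ncRNA-disease pairs with a score greater
than (or equal to) a threshold <inline-formula><tex-math>\beta</tex-math></inline-formula>. The result of this step is the subset <inline-formula><tex-math>\{(n_i, d_j) \mid s(n_i, d_j) \geq \beta\}</tex-math></inline-formula>.</p></list-item>
          <list-item><p>Initialization, which builds the initial set of clusters in the form of bicliques,
each consisting of an ncRNA-disease pair in <inline-formula><tex-math>\{(n_i, d_j) \mid s(n_i, d_j) \geq \beta\}</tex-math></inline-formula>.</p></list-item>
          <list-item><p>Merging, which iteratively merges two clusters <inline-formula><tex-math>C'</tex-math></inline-formula> and <inline-formula><tex-math>C''</tex-math></inline-formula> into a new cluster
<inline-formula><tex-math>C'''</tex-math></inline-formula>. This step regards the initial set of clusters as a list sorted according to an
ordering relation <inline-formula><tex-math>&lt;_c</tex-math></inline-formula> that reflects the quality of the clusters. Each cluster <inline-formula><tex-math>C'</tex-math></inline-formula> is
then merged with the first cluster <inline-formula><tex-math>C''</tex-math></inline-formula> in the list that would lead to a cluster
<inline-formula><tex-math>C'''</tex-math></inline-formula> that still satisfies the biclique constraint. This step is repeated until no
additional clusters that satisfy the biclique constraint can be obtained.</p></list-item>
        </list>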
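        <p>As a hedged illustration of the score of Section 2.1, which the filtering step consumes, the following Python sketch implements the per-attribute similarity, the meta-path score and the maximum of Equation (1). The dictionary-based representation of a sequence, the function names and the plain averaging of per-attribute similarities are assumptions made for this example; they are not taken from the LP-HCLUS implementation.</p>
        <preformat><![CDATA[
```python
from typing import Dict, Iterable, List, Tuple, Union

Value = Union[float, int, str]
# Hypothetical representation: one node sequence flattened to {attribute name: value}.
Seq = Dict[str, Value]
# Attribute name mapped to its (minimum, maximum) observed value.
Ranges = Dict[str, Tuple[float, float]]


def attribute_similarity(x: str, a: Seq, b: Seq, ranges: Ranges) -> float:
    """Numeric attributes: 1 - |difference| / range. Other attributes: exact match."""
    va, vb = a[x], b[x]
    if isinstance(va, (int, float)) and isinstance(vb, (int, float)):
        lo, hi = ranges[x]
        # Guard for a constant attribute (zero range): all values coincide.
        return 1.0 if hi == lo else 1.0 - abs(va - vb) / (hi - lo)
    return 1.0 if va == vb else 0.0


def sequence_similarity(a: Seq, b: Seq, ranges: Ranges) -> float:
    """Assumption: average the per-attribute similarities over shared attributes."""
    shared = [x for x in a if x in b]
    if not shared:
        return 0.0
    return sum(attribute_similarity(x, a, b, ranges) for x in shared) / len(shared)


def pathscore(connected: bool, from_n: List[Seq], to_d: List[Seq], ranges: Ranges) -> float:
    """1 if a sequence of the meta-path links n_i to d_j, else the best similarity."""
    if connected:
        return 1.0
    return max(
        (sequence_similarity(a, b, ranges) for a in from_n for b in to_d),
        default=0.0,
    )


def score(pathscores: Iterable[float]) -> float:
    """Equation (1): maximum pathscore over the meta-paths connecting n_i and d_j."""
    return max(pathscores, default=0.0)
```
]]></preformat>
        <p>In this sketch, a pair that is directly connected along at least one meta-path always obtains the score 1, since the outer maximum cannot be exceeded by any similarity-based value.</p>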
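        <p>The ordering relation <inline-formula><tex-math>&lt;_c</tex-math></inline-formula> defines a greedy search strategy that guides the
order in which pairs of clusters are analyzed. It is based on the cluster
cohesiveness <inline-formula><tex-math>h(C)</tex-math></inline-formula>, which corresponds to the average score in the cluster, namely:
          <disp-formula><tex-math><![CDATA[h(C) = \frac{1}{|pairs(C)|} \sum_{(n_i, d_j) \in pairs(C)} s(n_i, d_j)]]></tex-math></disp-formula>
where <inline-formula><tex-math>pairs(C)</tex-math></inline-formula> is the set of all the
possible ncRNA-disease pairs that can be constructed from the set of ncRNAs
and diseases in the cluster. Accordingly, if <inline-formula><tex-math>C'</tex-math></inline-formula> and <inline-formula><tex-math>C''</tex-math></inline-formula> are two different clusters,
the ordering relation <inline-formula><tex-math>&lt;_c</tex-math></inline-formula> is defined as follows: <inline-formula><tex-math>C' &lt;_c C'' \iff h(C') &gt; h(C'')</tex-math></inline-formula>. That is, more cohesive clusters come first in the list.</p>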
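        <p>The filtering, initialization and merging steps, together with the cohesiveness-based ordering, can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: a cluster is modeled as a pair of frozensets (ncRNAs, diseases), the filtering threshold is passed as a parameter named <monospace>beta</monospace>, the biclique constraint is read as "every ncRNA-disease pair of the cluster survived the filtering", and the sorted list is rebuilt after every merge.</p>
        <preformat><![CDATA[
```python
from itertools import product


def cohesiveness(cluster, scores):
    """h(C): average score over all ncRNA-disease pairs that C can form."""
    ncrnas, diseases = cluster
    pairs = list(product(ncrnas, diseases))
    # Assumption: pairs without a stored score contribute 0.
    return sum(scores.get(p, 0.0) for p in pairs) / len(pairs)


def is_biclique(cluster, kept):
    """Assumed constraint: every pair of the cluster passed the filtering step."""
    ncrnas, diseases = cluster
    return all(p in kept for p in product(ncrnas, diseases))


def sort_key(cluster, scores):
    """Ordering relation: higher cohesiveness first, names as a tie-break."""
    return (-cohesiveness(cluster, scores), sorted(cluster[0]), sorted(cluster[1]))


def first_level(scores, beta):
    # i) Filtering: keep the pairs whose score reaches the threshold.
    kept = {p for p, s in scores.items() if s >= beta}
    # ii) Initialization: one single-pair biclique per kept pair.
    clusters = {(frozenset([n]), frozenset([d])) for n, d in kept}
    # iii) Merging: greedy, repeated until no merge keeps the biclique constraint.
    merged = True
    while merged:
        merged = False
        ordered = sorted(clusters, key=lambda c: sort_key(c, scores))
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                union = (a[0] | b[0], a[1] | b[1])
                if is_biclique(union, kept):
                    clusters -= {a, b}
                    clusters.add(union)
                    merged = True
                    break
            if merged:
                break
    return clusters
```
]]></preformat>
        <p>For instance, with scores 0.9, 0.8, 0.7 and 0.6 for the four pairs that two ncRNAs form with two diseases, and a threshold of 0.3, the sketch returns a single biclique containing both ncRNAs and both diseases, whose cohesiveness is 0.75.</p>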
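        <p>
          The approach adopted to build the other hierarchical levels is similar to the
merging step performed to obtain <inline-formula><tex-math>L_1</tex-math></inline-formula>. The main difference is that we do not
obtain bicliques, but generic multi-type clusters. Since the biclique constraint is
removed, we need another stopping criterion for the iterative merging procedure.
Consistently with approaches used in hierarchical co-clustering and following [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ],
we adopt a user-defined threshold <inline-formula><tex-math>\alpha</tex-math></inline-formula> on the cohesiveness of the obtained clusters.
In particular, two clusters <inline-formula><tex-math>C'</tex-math></inline-formula> and <inline-formula><tex-math>C''</tex-math></inline-formula> can be merged into a new cluster <inline-formula><tex-math>C'''</tex-math></inline-formula> if
<inline-formula><tex-math>h(C''') &gt; \alpha</tex-math></inline-formula>, where <inline-formula><tex-math>h(C''')</tex-math></inline-formula> is the cluster cohesiveness. This means that <inline-formula><tex-math>\alpha</tex-math></inline-formula> defines
the minimum cluster cohesiveness that must be satisfied by a cluster obtained
after a merging. The iterative process stops when it is no longer possible to merge
clusters while guaranteeing the minimum level of cohesiveness <inline-formula><tex-math>\alpha</tex-math></inline-formula>.
        </p>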
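        <p>The construction of the higher levels can be sketched along the same lines. The Python example below is illustrative only and rests on several assumptions that the text does not fix: a cluster is a pair of frozensets (ncRNAs, diseases), the cohesiveness threshold is a parameter named <monospace>alpha</monospace>, pairs without a stored score count as 0, and each hierarchy level is produced by one merging pass over the previous level.</p>
        <preformat><![CDATA[
```python
from itertools import product


def cohesiveness(cluster, scores):
    """h(C): average score over all ncRNA-disease pairs that C can form."""
    ncrnas, diseases = cluster
    pairs = list(product(ncrnas, diseases))
    return sum(scores.get(p, 0.0) for p in pairs) / len(pairs)


def merge_pass(level, scores, alpha):
    """One pass: merge each cluster with the first one that keeps h above alpha."""
    ordered = sorted(
        level,
        key=lambda c: (-cohesiveness(c, scores), sorted(c[0]), sorted(c[1])),
    )
    used, result = set(), set()
    for i, a in enumerate(ordered):
        if a in used:
            continue
        for b in ordered[i + 1:]:
            if b in used:
                continue
            union = (a[0] | b[0], a[1] | b[1])
            if cohesiveness(union, scores) > alpha:
                result.add(union)
                used.update((a, b))
                break
        else:
            # No admissible partner: the cluster is carried over unchanged.
            result.add(a)
    return result


def build_hierarchy(first_level, scores, alpha):
    """Stack levels until a pass performs no merge (the stopping criterion)."""
    levels = [set(first_level)]
    while True:
        nxt = merge_pass(levels[-1], scores, alpha)
        if nxt == levels[-1]:
            return levels
        levels.append(nxt)
```
]]></preformat>
        <p>Because a cluster is either merged or carried over unchanged, every cluster of one level is contained in some cluster of the next level, in line with Definition 2. Lower thresholds allow more merges and therefore tend to produce more levels.</p>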
      </sec>
      <sec id="sec-2-2">
        <title>2.3 Prediction of new ncRNA-disease relationships</title>
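        <p>In the last phase, we exploit each level of the identified hierarchy of multi-type
clusters as a prediction model. In particular, we compute, for each
ncRNA-disease pair, a score representing its degree of certainty on the basis of the
multi-type clusters containing it. Formally, let <inline-formula><tex-math>C^{w}_{ij}</tex-math></inline-formula> be a cluster identified in
the <inline-formula><tex-math>w</tex-math></inline-formula>-th hierarchical level in which the ncRNA <inline-formula><tex-math>n_i</tex-math></inline-formula> and the disease <inline-formula><tex-math>d_j</tex-math></inline-formula> appear.
We compute the degree of certainty of the relationship between <inline-formula><tex-math>n_i</tex-math></inline-formula> and <inline-formula><tex-math>d_j</tex-math></inline-formula> as:</p>
        <p>
          <disp-formula><tex-math><![CDATA[\psi^{(w)}(n_i, d_j) = h\left(C^{w}_{ij}\right)]]></tex-math></disp-formula>
that is, we compute the degree of certainty of the new
interaction as the average degree of certainty of the known relationships in the
cluster. In some cases, the same interaction may appear in multiple clusters,
since the proposed algorithm is able to identify overlapping clusters. In this
case, <inline-formula><tex-math>C^{w}_{ij}</tex-math></inline-formula> represents the list of multi-type clusters in which both <inline-formula><tex-math>n_i</tex-math></inline-formula> and <inline-formula><tex-math>d_j</tex-math></inline-formula> appear,
and we aggregate their cohesiveness values according to four different strategies:
maximum, minimum, average and evidence combination [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
        </p>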
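        <p>The four aggregation strategies can be illustrated with a short Python sketch. The function name, the strategy labels and the handling of an empty list are assumptions of this example. In particular, for evidence combination we assume the probabilistic-sum rule e1 + (1 - e1) * e2 applied cumulatively; the exact formulation should be checked against [9] and the full paper.</p>
        <preformat><![CDATA[
```python
from functools import reduce
from statistics import mean


def aggregate(cohesiveness_values, strategy="max"):
    """Combine the cohesiveness of all clusters that contain a given pair."""
    values = list(cohesiveness_values)
    if not values:
        # Assumption: a pair that appears in no cluster gets score 0.
        return 0.0
    if strategy == "max":
        return max(values)
    if strategy == "min":
        return min(values)
    if strategy == "avg":
        return mean(values)
    if strategy == "ec":
        # Assumed evidence-combination rule: each value raises the running
        # score by its share of the remaining uncertainty.
        return reduce(lambda acc, h: acc + (1.0 - acc) * h, values, 0.0)
    raise ValueError(f"unknown strategy: {strategy}")
```
]]></preformat>
        <p>Under this assumed rule, evidence combination never yields a value below the maximum, so it rewards pairs that are supported by several clusters, whereas the minimum is the most conservative choice.</p>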
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Experiments</title>
      <p>
        LP-HCLUS has been run with different values of its input parameters. In
particular, following the results obtained in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], we considered <inline-formula><tex-math>\alpha \in \{0.1, 0.2\}</tex-math></inline-formula> and
<inline-formula><tex-math>\beta \in \{0.3, 0.4\}</tex-math></inline-formula>. The considered datasets are: i) HMDD v3.0, which stores 985
miRNAs, 675 diseases and 20,859 relationships between diseases and miRNAs;
ii) Integrated Dataset (ID), built by integrating multiple datasets [
        <xref ref-type="bibr" rid="ref3 ref4 ref7 ref8">3,4,7,8</xref>
        ],
composed of 7,049 diseases, 70 lncRNA-miRNA relationships, 3,830
relationships between diseases and ncRNAs, 90,242 target genes, 26,522 disease-target
associations and 1,055 ncRNA-target relationships.
      </p>
      <p>
        We compared LP-HCLUS with the following competitors:
i) HOCCLUS2 [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], a biclustering algorithm that, similarly to LP-HCLUS,
identifies a hierarchy of (possibly overlapping) heterogeneous clusters. It is,
however, limited to working with only two types of objects. Since its parameters have a
meaning similar to that of the LP-HCLUS parameters, we evaluated its results
with the same setting, i.e., <inline-formula><tex-math>\alpha \in \{0.1, 0.2\}</tex-math></inline-formula> and <inline-formula><tex-math>\beta \in \{0.3, 0.4\}</tex-math></inline-formula>;
ii) ncPred [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], a system that was specifically designed to predict new
ncRNA-disease associations. ncPred cannot capture information coming from other entities
in the network and is not able to exploit features associated with nodes and links;
iii) LP-HCLUS-NoLP, which corresponds to a baseline version of the system
LP-HCLUS, without the clustering and the link prediction steps. In particular, we
consider the score obtained in the first phase of LP-HCLUS (see Section 2.1) as
the final score associated with each interaction.
      </p>
      <p>We adopted 10-fold cross-validation on the set of known ncRNA-disease
relationships and, due to the absence of negative samples, we evaluated the results in
terms of the True Positive Rate at k (TPR@k) curve. Moreover, we also report the results in terms
of ROC and Precision-Recall (PR) curves by considering the unknown relationships
as negative examples. We remark that ROC and PR curves can only be used for
relative comparison and not as absolute evaluation measures, because they are
biased by the assumption made on unknown relationships.</p>
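      <p>As an illustration of the first measure, the following Python sketch computes TPR@k under the common reading of the metric, which we assume here: the fraction of held-out known relationships that appear among the top-k predictions ranked by score. The function names and the dictionary of candidate scores are hypothetical; candidate pairs are assumed to exclude the relationships used for training.</p>
      <preformat><![CDATA[
```python
def tpr_at_k(ranked_pairs, held_out, k):
    """Share of held-out known pairs found among the k top-ranked predictions."""
    top = set(ranked_pairs[:k])
    return len(top & held_out) / len(held_out)


def tpr_curve(scores, held_out, ks):
    """TPR@k for each k, ranking candidate pairs by decreasing score."""
    # Ties are broken by the pair itself to keep the ranking deterministic.
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return [tpr_at_k(ranked, held_out, k) for k in ks]
```
]]></preformat>
      <p>No negative examples are needed for this computation, which is why the measure remains meaningful when only positive relationships are known.</p>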
      <p>
        In Figs. 4 and 5 we show some results obtained with the most promising
configurations. From the quantitative viewpoint, we can observe that the proposed
method LP-HCLUS, with the combination strategy based on the maximum,
obtains the best performance for all the considered measures. From a
qualitative point of view, we first compared the
results obtained by LP-HCLUS with the validated interactions reported in
the updated version of HMDD (i.e., v3.2, released on March 27th, 2019). We
found 3,055 LP-HCLUS predictions confirmed by the new release of HMDD at
hierarchy level 1, 4,119 at level 2 and 4,797 at level 3. Next, we conducted a
qualitative analysis of the top-ranked relationships predicted by LP-HCLUS
using the ID dataset, selecting only those with a score equal to 1.0. For this purpose, we
exploited MNDR v2.0, which is a comprehensive resource including more than
260,000 experimental and predicted ncRNA-disease associations for mammalian
species. Also in this case, we found some associations that appear both in MNDR and in the
list of associations predicted by LP-HCLUS. A more comprehensive analysis,
reporting several additional examples, can be found in the full paper [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
    </sec>
    <sec id="sec-4">
      <title>Conclusions</title>
      <p>In this paper, we have tackled the problem of predicting possibly unknown
ncRNA-disease relationships. The proposed approach LP-HCLUS is able to take
advantage of the possibly heterogeneous nature of the attributed biological
network analyzed. The results confirm the initial intuitions and show
competitive performance of LP-HCLUS in terms of accuracy of the predictions, also
when compared with state-of-the-art competitor systems. These results are also
supported by a comparison of LP-HCLUS predictions with data reported in
MNDR and by a qualitative analysis that revealed that several ncRNA-disease
associations predicted by LP-HCLUS have been subsequently experimentally
validated and introduced in a more recent release (v3.2) of HMDD. As future
work, we will evaluate the performance of LP-HCLUS in other domains.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>We acknowledge the support of the Ministry of Education, Universities and Research
(MIUR) through the PON project TALIsMAn - Tecnologie di Assistenza
personALizzata per il Miglioramento della quAlità della vitA (ARS01_01116). Dr.
Gianvito Pio acknowledges the support of the Ministry of Education, Universities
and Research (MIUR) through the project AIM1852414, activity 1, line 1.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Alaimo</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giugno</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pulvirenti</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>ncPred: ncRNA-Disease Association Prediction through Tripartite Network-Based Inference</article-title>
          .
          <source>Frontiers in Bioengineering and Biotechnology</source>
          <volume>2</volume>
          (
          <year>Dec 2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Barracchia</surname>
            ,
            <given-names>E.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pio</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>D'Elia</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceci</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Prediction of new associations between ncRNAs and diseases exploiting multi-type hierarchical clustering</article-title>
          .
          <source>BMC Bioinformatics</source>
          <volume>21</volume>
          (
          <issue>1</issue>
          ),
          <fpage>1</fpage>
          –
          <lpage>24</lpage>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bauer-Mehren</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rautschka</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sanz</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Furlong</surname>
            ,
            <given-names>L.I.:</given-names>
          </string-name>
          <article-title>DisGeNET: a Cytoscape plugin to visualize, integrate, search and analyze gene-disease networks</article-title>
          .
          <source>Bioinformatics</source>
          (Oxford, England)
          <volume>26</volume>
          (
          <issue>22</issue>
          ),
          <fpage>2924</fpage>
          –
          <lpage>2926</lpage>
          (
          <year>Nov 2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qiu</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yan</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cui</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          :
          <article-title>LncRNADisease: a database for long-non-coding RNA-associated diseases</article-title>
          .
          <source>Nucleic Acids Research</source>
          <volume>41</volume>
          (
          <issue>Database issue</issue>
          )
          (
          <year>Jan 2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yan</surname>
            ,
            <given-names>C.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luo</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ji</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dai</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          :
          <article-title>Constructing lncRNA functional similarity network based on lncRNA-disease associations and disease semantic similarity</article-title>
          .
          <source>Scientific Reports</source>
          <volume>5</volume>
          (
          <year>Jun 2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Han</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kamber</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Data mining: concepts and techniques</article-title>
          . Elsevier/Morgan Kaufmann, Amsterdam (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Helwak</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kudla</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , et al.:
          <article-title>Mapping the human miRNA interactome by CLASH reveals frequent noncanonical binding</article-title>
          .
          <source>Cell</source>
          <volume>153</volume>
          (
          <issue>3</issue>
          ),
          <fpage>654</fpage>
          –
          <lpage>665</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Jiang</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hao</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Juan</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Teng</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>miR2disease: a manually curated database for microRNA deregulation in human disease</article-title>
          .
          <source>Nucleic Acids Research</source>
          <volume>37</volume>
          (
          <issue>Database issue</issue>
          ),
          <fpage>D98</fpage>
          –
          <lpage>104</lpage>
          (
          <year>Jan 2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Lesmo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saitta</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Torasso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Evidence combination in expert systems</article-title>
          .
          <source>International Journal of Man-Machine Studies</source>
          <volume>22</volume>
          (
          <issue>3</issue>
          ),
          <fpage>307</fpage>
          –
          <lpage>326</lpage>
          (
          <year>Mar 1985</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Melissari</surname>
            ,
            <given-names>M.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grote</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Roles for long non-coding RNAs in physiology and disease</article-title>
          . P ugers Archiv -
          <source>European Journal of Physiology</source>
          <volume>468</volume>
          (
          <issue>6</issue>
          ),
          <fpage>945</fpage>–<lpage>958</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Mignone</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pio</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>D'Elia</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceci</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Exploiting transfer learning for the reconstruction of the human gene regulatory network</article-title>
          .
          <source>Bioinformatics</source>
          <volume>36</volume>
          (
          <issue>5</issue>
          ),
          <fpage>1553</fpage>–<lpage>1561</lpage>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Pio</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceci</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>D'Elia</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Loglisci</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malerba</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>A Novel Biclustering Algorithm for the Discovery of Meaningful Biological Correlations between microRNAs and their Target Genes</article-title>
          .
          <source>BMC Bioinformatics</source>
          <volume>14</volume>
          (
          <issue>Suppl 7</issue>
          ),
          <fpage>S8</fpage> (Apr
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Pio</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceci</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prisciandaro</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malerba</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Exploiting causality in gene network reconstruction based on graph embedding</article-title>
          .
          <source>Machine Learning</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Pio</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Serafino</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malerba</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceci</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Multi-type clustering and classification from heterogeneous networks</article-title>
          .
          <source>Information Sciences</source>
          <volume>425</volume>
          ,
          <fpage>107</fpage>–<lpage>126</lpage> (Jan
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guo</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          , et al.:
          <article-title>Improved method for prioritization of disease associated lncRNAs based on ceRNA theory and functional genomics data</article-title>
          .
          <source>Oncotarget</source>
          <volume>8</volume>
          (
          <issue>3</issue>
          ),
          <fpage>4642</fpage>–<lpage>4655</lpage> (Dec
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>