<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards the Construction of an RNA-centered Knowledge Graph</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Emanuele Cavalleri</string-name>
          <email>emanuele.cavalleri@unimi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sara Bonfitto</string-name>
          <email>sara.bonfitto@unimi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alberto Cabri</string-name>
          <email>alberto.cabri@unimi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jessica Gliozzo</string-name>
          <email>jessica.gliozzo@unimi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paolo Perlasca</string-name>
          <email>paolo.perlasca@unimi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mauricio Soto-Gomez</string-name>
          <email>mauricio.soto@unimi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gabriella Trucco</string-name>
          <email>gabriella.trucco@unimi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elena Casiraghi</string-name>
          <email>elena.casiraghi@unimi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giorgio Valentini</string-name>
          <email>giorgio.valentini@unimi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco Mesiti</string-name>
          <email>marco.mesiti@unimi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dep. of Computer Science, Università di Milano</institution>
          ,
          <addr-line>Via Celoria 18, 20133 Milano</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>The use of RNA molecules for developing new drugs and vaccines is attracting a growing number of research centers worldwide, which produce biological data banks describing the different kinds of relationships that exist among coding and non-coding molecules. Collecting and identifying relationships among the data included in these collections is of paramount importance for knowledge discovery and analysis. In this paper, we describe the initial steps in the construction of RNA-KG, an RNA-centered Knowledge Graph that will contain the different types of entities that can be extracted from different public databases and the relationships that can be inferred among them. Finally, we present a meta-graph reporting the main kinds of relationships that can be obtained by integrating the identified data sources. These activities are conducted in the context of the “National Center for Gene Therapy and Drugs based on RNA Technology” funded by the Italian PNRR and the NextGenerationEU program.</p>
      </abstract>
      <kwd-group>
        <kwd>RNA-based technologies</kwd>
        <kwd>Knowledge Graphs</kwd>
        <kwd>RNA-drug discovery</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        RNA-based drugs represent one of the most promising advances in therapeutics, as evidenced
by the recent success of mRNA-based vaccines for the COVID-19 pandemic [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. More generally,
coding and non-coding RNA molecules can potentially lead to new treatments of cancer, genetic
and neurodegenerative disorders, cardiovascular and infectious diseases [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Conventional drugs show relevant limitations in their druggable targets because they usually
consist of small molecules targeting proteins. Only about 10% of proteins have druggable binding
sites and no more than 2% of the human genome is protein-coding. On the contrary, RNA drugs
can target both proteins and mRNA, as well as other non-coding RNA (ncRNA). Moreover, they
can encode missing or defective proteins, regulate the transcriptome, and mediate DNA or RNA
editing. Thus, RNA technology significantly broadens the set of druggable targets and is also
less expensive than other technologies (e.g. drug synthesis based on recombinant proteins), due
to the relatively simple structure of RNA molecules that facilitates their biochemical synthesis
and chemical modifications [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>In the framework of the NextGenerationEU-funded “National Center for Gene Therapy
and Drugs based on RNA Technology”, we aim to support the discovery of novel RNA-based
drugs by developing an RNA-centered Knowledge Graph (RNA-KG), available at
https://github.com/AnacletoLAB/RNA-KG. RNA-KG will collect and
organize data and knowledge about RNA molecules, retrieved from public databases and/or
generated from the results of the research groups involved in the National Center. It will also
provide a comprehensive description of the relationships among the various kinds of RNAs,
diseases, drugs, phenotypes, and other bio-medical entities. RNA-KG will be the basis for the
development of novel cutting-edge AI methods specifically tailored for the analysis of biological
processes involving RNA. These methods could also open the way to RNA-drug prioritization,
RNA drug-target prediction, and other prediction tasks for discovering new RNA drugs.</p>
      <p>
        In this paper, we report our initial achievements that have been recently published in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] for
the identification of a meta-graph representing the kinds of relationships that can be identified
among the different types of RNA molecules. This result was obtained by examining more than
50 public online repositories for non-coding RNA sequences and annotations and by studying
the kinds of interactions that can exist among these molecules. The repositories were
selected through an extensive review of the literature in leading journals of the field (such as
NAR, BMC Bioinformatics, Science, RNA Journal, IEEE/ACM TCBB); we retained those that are periodically
updated by their developers and contain significant numbers of molecules and relationships.
In the identification of the repositories, we also took into account the presence of controlled
vocabularies, thesauri, and ontologies that formally describe the repository content, and the
presence of well-recognized identification schemes.
      </p>
      <p>The paper is organized as follows. Section 2 introduces related work devoted to data
integration and to the construction of knowledge graphs starting from different heterogeneous databases.
Moreover, it introduces biomedical ontologies that can be used in this context for the
characterization of RNA molecules and their interactions. Section 3 highlights the characteristics of the
identified databases. Section 4 highlights the main types of relationships that can be extracted
from the sources and introduces a meta-graph that shows the potential relationships that will be
available among the RNA molecules. Finally, Section 5 reports our concluding remarks and describes
the characteristics of an initial instantiation of the knowledge graph.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Data Integration Approaches. The data integration issue is a well-known problem in the
area of data management, and many approaches have been devised to deal with relational data
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. However, the explosion of data formats (like CSV, JSON, XML) and the variability in the
representation of the same types of information [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] have created the need to exploit ontologies
as global common models both for accessing (OBDA – Ontology-Based Data Access) and
integrating (OBDI – Ontology-Based Data Integration) data sources [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ]. In OBDA, queries are
expressed in terms of an ontology, and the mappings between the ontology and the data sources’
schemas are described in the form of declarative mapping rules. Two approaches are usually
proposed to enable access to and integration of different data sources: materialization, where data
are converted from the local schema according to the ontology concepts and relationships (i.e.
data are converted into an RDF KG and stored locally in a data warehouse of triples that can
be queried by means of SPARQL); and virtualization, where the transformation is executed on the
fly during the evaluation of queries by exploiting the mapping rules and the ontology. In the latter
case, only the data from the original sources involved in the query are accessed for generating
the query result in accordance with the adopted ontology. Materialization can provide fast
and accurate access to data because they are already organized in a centralized repository. However,
data freshness can be compromised when data sources change frequently. On the other hand,
virtualization allows access to fresh data but requires the application of transformations during
query evaluation, which can cause delays. Different approaches support the specification of mapping
rules, like R2RML [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] (a W3C standard for relational sources), and RML [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] which extends the
standard for dealing with other formats. Moreover, SPARQL-Generate [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], YARRRML [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] and
ShExML [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] were also proposed for dealing with data heterogeneity.
      </p>
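      <p>As a minimal sketch of the materialization strategy described above, the following Python fragment applies a declarative mapping rule to tabular source records and stores the resulting triples locally, so that queries are answered without accessing the source again. The CSV columns, the <monospace>EX</monospace> namespace, and the helpers <monospace>materialize</monospace> and <monospace>objects_of</monospace> are hypothetical names introduced only for illustration; the fragment does not reproduce R2RML or RML syntax, nor the pipeline used for RNA-KG.</p>
      <preformat><![CDATA[
```python
import csv
import io

# Hypothetical namespace and source data, used only for illustration.
EX = "http://example.org/"
SOURCE_CSV = """mirna_id,target_gene
miR-A,GENE1
miR-B,GENE2
"""

# A declarative mapping rule: which columns supply subject and object,
# and which property links them.
MAPPING_RULE = {
    "subject_column": "mirna_id",
    "predicate": EX + "interacts_with",
    "object_column": "target_gene",
}


def materialize(csv_text, rule):
    """Convert every source row into a (subject, predicate, object) triple."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        triples.append((
            EX + row[rule["subject_column"]],
            rule["predicate"],
            EX + row[rule["object_column"]],
        ))
    return triples


# The materialized store: all triples are computed once and kept locally,
# so later queries never touch the original source.
STORE = materialize(SOURCE_CSV, MAPPING_RULE)


def objects_of(subject, predicate, store=STORE):
    """Answer a simple triple-pattern query against the local store."""
    return [o for s, p, o in store if s == subject and p == predicate]


print(objects_of(EX + "miR-A", EX + "interacts_with"))
```
]]></preformat>
      <p>A virtualized variant of this sketch would apply the same rule at query time, reading the source inside <monospace>objects_of</monospace> instead of consulting a precomputed store, which trades query latency for data freshness.</p>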
      <p>
        KG construction from bio-medical data sources. In the biological context, many efforts
are nowadays devoted to the construction of KGs by integrating different public sources,
exploiting the materialization and virtualization approaches previously described. An approach for
integrating different biological data into a biological KG was proposed in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. The approach
designs a Connecting Ontology to integrate all the external ontologies describing the involved
data sources. By exploiting algorithms for fusing and integrating annotations, an enriched KG
is obtained that spans multiple data sources and is annotated by the integrated biological
ontology. The effectiveness of this approach is shown by integrating rice gene-phenotype and
lactobacillus data sources by gluing together the GO, Trait, Disease, and Plant Ontologies.
In [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], the Precision Medicine KG (named PrimeKG) was developed to represent holistic and
multimodal views of diseases. PrimeKG integrates more than 20 high-quality resources with
more than 4M relations that capture information like disease-associated perturbations in the
proteome, biological processes, and molecular pathways. The considered data were collected and
annotated using diverse resources and ontologies such as Disease Gene Network (DisGeNET), the Mayo Clinic
knowledgebase, the Mondo Disease Ontology, Bgee, and DrugBank. A virtualization approach
based on an ontology-based federation of three data sources (Bgee, OMA, and UniProtKB) was
presented in [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Starting from a semantic model for gene expression, the authors propose
using mapping rules for dealing with the different formats of the three sources and allowing
joint queries to be issued across the sources by exploiting SPARQL endpoints. PheKnowLator [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]
(Phenotype Knowledge Translator) is a fully automated Python 3 library for the construction of
semantically rich, large-scale biomedical KGs that are Semantic Web compliant, amenable
to automatic OWL reasoning, and conformant to contemporary property graph standards. The library
offers tools to download data, transform and/or pre-process resources into edge lists,
construct knowledge graphs, and generate a wide range of outputs. All these papers point out
the difficulties that arise when trying to integrate different data sources that exploit different
data models, formats, and ontologies. Specifically, data redundancies, data duplicates, and the lack
of common identifier mechanisms must be properly addressed.
      </p>
      <table-wrap id="tab1">
        <label>Table 1</label>
        <table>
          <thead>
            <tr><th>Name</th><th>Abbr.</th><th>Description</th></tr>
          </thead>
          <tbody>
            <tr><td>Gene Ontology</td><td>GO</td><td>GO provides the terms representing gene product properties. GO covers three domains: cellular component, molecular function, and biological process.</td></tr>
            <tr><td>Disease Ontology</td><td>DO</td><td>DO provides the terms representing human diseases.</td></tr>
            <tr><td>Chemical Entities of Biological Interest</td><td>ChEBI</td><td>ChEBI provides the terms representing molecular entities of ‘small’ chemical compounds.</td></tr>
            <tr><td>Non-Coding RNA Ontology</td><td>NCRO</td><td>NCRO provides the terms representing non-coding RNA molecules both of biological origin, and engineered.</td></tr>
            <tr><td>Ontology for Biomedical Investigations</td><td>OBI</td><td>OBI provides the terms representing biological and clinical investigations.</td></tr>
            <tr><td>Single-Nucleotide Polymorphism Ontology</td><td>SNPO</td><td>SNPO provides the terms representing formal and unambiguous representation of genomic variations.</td></tr>
            <tr><td>EMBRACE Data And Methods</td><td>EDAM</td><td>EDAM provides the terms representing concepts that are prevalent within bioscientific data analysis, data management in life sciences.</td></tr>
            <tr><td>Sequence Ontology</td><td>SO</td><td>SO provides the terms representing features used in biological sequence annotation.</td></tr>
            <tr><td>BRENDA Tissue Ontology</td><td>BTO</td><td>BTO provides the terms representing the source of an enzyme comprising tissues, cell lines, cell types and cell cultures.</td></tr>
            <tr><td>Experimental Factor Ontology</td><td>EFO</td><td>EFO provides the terms representing experimental variables. It combines parts of several biological ontologies, e.g. UBERON anatomy, ChEBI, and Cell Ontology.</td></tr>
            <tr><td>Medical Subject Headings</td><td>MeSH</td><td>MeSH provides the terms used for indexing PubMed citations.</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>Biomedical Ontologies. Several standard ontologies can be used for the characterization of
RNA molecules and their interactions with other biomedical entities (see Table 1). Moreover, data
formats specifically developed for biological pathways (like Panther, Reactome or WikiPathways)
are used for semantically annotating the RNA molecules.</p>
      <p>
        In general, a well-recognized and globally accepted ontology for the representation of any
kind of ncRNA molecule is still lacking. Often, ncRNA molecules are referred to by the name
of the gene encoding the physically closest protein. Moreover, ncRNA genes with no
known function are named pragmatically based on their genomic context: if there is a proximal
(i.e. genomically adjacent) protein-coding gene (PCG), then the ncRNA
genes are given a gene symbol beginning with the PCG symbol [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. The identification scheme
used for miRNAs (which account for the majority of the data sources) is always borrowed from miRBase. This
makes all the other data sources associated with miRNAs miRBase “compliant”. Furthermore,
the identification scheme associated with miRNAs is partially included in NCRO, which includes
miRNA transcripts from Homo sapiens cells [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. RNA-based data sources</title>
      <p>
        RNA molecules play a wide variety of roles: they are translated into proteins, regulate gene expression, hold
enzymatic activity, and modify other RNAs. Coding RNA molecules are named messenger RNA
(mRNA) molecules and are translated into proteins with the help of ribosomal RNA (rRNA), transfer RNA
(tRNA), small nuclear RNA (snRNA), and small nucleolar RNA (snoRNA) molecules. Non-coding
RNA molecules having fewer than 200 nucleotides are named small non-coding RNA (sncRNA).
This category includes a wide variety of RNA molecules, such as microRNA (miRNA), short
interfering RNA (siRNA), short hairpin RNA (shRNA), antisense oligonucleotides (ASO),
piwi-interacting RNA (piRNA), transfer RNA fragments (tRF), guide RNA (gRNA), aptamer, riboswitch,
and ribozyme molecules. Non-coding RNA molecules with more than 200 nucleotides are named
long non-coding RNA (lncRNA). Circular RNA (circRNA) are lncRNA molecules produced from
alternative splicing events. Further details on the role and meaning of these molecules can be
found in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
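      <p>The length-based distinction between small and long non-coding RNA molecules drawn above can be sketched as a short Python function. The function name and the returned labels are hypothetical, and treating a sequence of exactly 200 nucleotides as long is an assumption, since the text leaves this boundary case unspecified.</p>
      <preformat><![CDATA[
```python
LENGTH_THRESHOLD_NT = 200  # nucleotide threshold stated in the text


def classify_ncrna_by_length(sequence):
    """Label a non-coding RNA sequence as small or long by its length."""
    # Assumption: a sequence of exactly 200 nt is treated as long;
    # the text leaves this boundary case unspecified.
    if len(sequence) >= LENGTH_THRESHOLD_NT:
        return "long non-coding RNA"
    return "small non-coding RNA"


# Synthetic sequences, not taken from any database.
print(classify_ncrna_by_length("ACGU" * 5))    # 20 nucleotides
print(classify_ncrna_by_length("ACGU" * 100))  # 400 nucleotides
```
]]></preformat>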
      <p>
        Table 2 provides an overview of the identified databases organized according to the main
molecule that they make available. Specifically, for each kind of molecule, the table reports the
number of available databases, the number of molecules, the number of relationships that can
be extracted, and the list of molecules with which relationships can be established. Details on
the databases, together with their bibliographic references, can be found in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        Besides the sequences, these data sources also contain different kinds of relationships that
can be represented according to the Relation Ontology (RO) [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. Table 3 reports the main
identified relationships. For each relation, Table 3 reports the RO identifier, the corresponding
meaning, and an abbreviated form used in our paper. The general relationship “interacts with”,
available in RO with the meaning “A relationship that holds between two entities in which
the processes executed by the two entities are causally connected”, has been specialized into the
more specific relationship “molecularly interacts with” in our classification to represent the
situation in which the two partners are molecular entities that directly physically interact with
each other (e.g. via a stable binding interaction or a brief interaction during which one modifies
the other). We use this relationship when we wish to represent a specific interaction process
at the molecular level (e.g. complementary base pairing occurring in RNAi in miRNA-mRNA
interaction or a tRNA molecule charged with a specific amino acid). Figure 1 summarizes the
relationships among RNA molecules that we have identified in the different data sources. More
details can be found in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
    </sec>
    <sec id="sec-4">
      <title>4. A meta-graph for modelling RNA-centered relationships</title>
      <p>Starting from the analysis of the data sources, we built the meta-graph in Fig. 2. Colored
edges represent unidirectional relationships (e.g. tRF regulates miRNA). The graphical
representation provides a global overview of the richness of information that is currently provided.
Moreover, the meta-graph points out the presence of a central hub, named “GENE/mRNA”, that
is bound to many kinds of ncRNA. This characteristic might have a deep impact on the discovery
of new, previously unconsidered interactions among ncRNA molecules. To simplify the visualization of the
meta-graph, we omitted most of the non-RNA biomolecular and medical entities that are known
to play an important role in studying RNA biology and in supporting the discovery of novel RNA drugs.
Indeed, the meta-graph in Fig. 2 can be further extended with other nodes representing other
biological entities (e.g. diseases, epigenetic modifications, small molecules, tissues, biological
pathways, cellular compartments) and relationships relevant to the analysis of RNA-KG.</p>
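      <p>To illustrate how a hub such as “GENE/mRNA” emerges from a meta-graph, the following Python sketch stores a handful of typed meta-edges and counts how many of them touch each node type. Only the first two edges correspond to examples mentioned in the text (tRF regulates miRNA; miRNA molecularly interacts with mRNA); the remaining edges and the helper names <monospace>degree_by_node</monospace> and <monospace>hub</monospace> are hypothetical placeholders and do not reproduce the content of Fig. 2.</p>
      <preformat><![CDATA[
```python
from collections import Counter

# (source type, relationship, target type). The first two edges are examples
# mentioned in the text; the remaining ones are hypothetical placeholders.
META_EDGES = [
    ("tRF", "regulates", "miRNA"),
    ("miRNA", "molecularly interacts with", "GENE/mRNA"),
    ("lncRNA", "interacts with", "GENE/mRNA"),  # hypothetical
    ("siRNA", "interacts with", "GENE/mRNA"),   # hypothetical
    ("piRNA", "interacts with", "GENE/mRNA"),   # hypothetical
]


def degree_by_node(edges):
    """Count how many meta-edges touch each node type (in- plus out-degree)."""
    degree = Counter()
    for source, _relation, target in edges:
        degree[source] += 1
        degree[target] += 1
    return degree


def hub(edges):
    """Return the node type touched by the largest number of meta-edges."""
    return degree_by_node(edges).most_common(1)[0][0]


print(hub(META_EDGES))
```
]]></preformat>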
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Future Work</title>
      <p>This paper reports the initial results of an ongoing project for the creation of a biomedical
knowledge graph for the representation of non-coding RNA molecules and their relationships,
as made available in different public data sources.</p>
      <p>The first release of RNA-KG can be accessed through a SPARQL endpoint, for which we used
an AllegroGraph triplestore that offers a graphical user interface for performing queries. The
code used for the integration of the different sources is available on our GitHub repository. The
knowledge graph has been built by exploiting the primitives made available in PheKnowLator
because they are effective and well-documented.</p>
      <p>
        We are currently working on further integrating specific databases on RNA. Moreover,
PheKnowLator provides 12 Open Biological and Biomedical Ontology (OBO) Foundry ontologies and 31 publicly
available resources that can be integrated with our ongoing RNA-KG. The resulting RNA-KG
will be analyzed with cutting-edge AI graph representation learning algorithms [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], developed
in the context of the National Center for Gene Therapy and Drugs based on RNA Technology, to
support the discovery of novel RNA drugs. Finally, we would like to develop graphical facilities
for supporting the user in the data acquisition process, thus reducing the manual effort
required for mapping the data available in the different data sources into RNA-KG [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ].
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Barbier</surname>
          </string-name>
          , et al.
          <article-title>The clinical progress of mRNA vaccines and immunotherapies</article-title>
          ,
          <source>Nature Biotechnology</source>
          <volume>40</volume>
          (
          <year>2022</year>
          )
          <fpage>840</fpage>
          -
          <lpage>865</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>T. R.</given-names>
            <surname>Damase</surname>
          </string-name>
          , et al.
          <article-title>The limitless future of RNA therapeutics</article-title>
          ,
          <source>Frontiers in Bioengineering and Biotechnology</source>
          <volume>9</volume>
          (
          <year>2021</year>
          ). doi:
          <pub-id pub-id-type="doi">10.3389/fbioe.2021.628137</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>K.</given-names>
            <surname>Paunovska</surname>
          </string-name>
          , et al.
          <article-title>Drug delivery systems for RNA therapeutics</article-title>
          ,
          <source>Nat Rev Genet</source>
          <volume>23</volume>
          (
          <year>2022</year>
          )
          <fpage>265</fpage>
          -
          <lpage>280</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>E.</given-names>
            <surname>Cavalleri</surname>
          </string-name>
          , et al.
          <article-title>A meta-graph for the construction of RNA-KG</article-title>
          ,
          <source>in: 10th Int'l Work-Conference on Bioinformatics and Biomedical Engineering</source>
          ,
          <year>2023</year>
          . To appear.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Halevy</surname>
          </string-name>
          ,
          <article-title>Information Integration</article-title>
          , Springer,
          <year>2009</year>
          . doi:
          <pub-id pub-id-type="doi">10.1007/978-0-387-39940-9_1069</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Mesiti</surname>
          </string-name>
          , et al.
          <article-title>XML-based approaches for the integration of heterogeneous bio-molecular data</article-title>
          ,
          <source>BMC Bioinformatics</source>
          <volume>10</volume>
          (
          <year>2009</year>
          ). doi:
          <pub-id pub-id-type="doi">10.1186/1471-2105-10-S12-S7</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Poggi</surname>
          </string-name>
          , et al.
          <article-title>Linking data to ontologies</article-title>
          ,
          <source>in: J. on Data Semantics X</source>
          , Springer,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          , et al.
          <article-title>Accessing scientific data through knowledge graphs with Ontop</article-title>
          ,
          <source>in: Patterns</source>
          , Cell Press,
          <year>2021</year>
          . doi:
          <pub-id pub-id-type="doi">10.1016/j.patter.2021.100346</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Das</surname>
          </string-name>
          , et al.
          <article-title>R2RML: RDB to RDF mapping language</article-title>
          , www.w3.org/TR/r2rml/,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Dimou</surname>
          </string-name>
          , et al.
          <article-title>RML: a generic language for integrated RDF mappings of heterogeneous data</article-title>
          ,
          <source>in: Proc. of the 7th Workshop on Linked Data on the Web</source>
          , volume
          <volume>1184</volume>
          of
          <source>CEUR Workshop Proc.</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lefrançois</surname>
          </string-name>
          , et al.
          <article-title>A SPARQL extension for generating RDF from heterogeneous formats</article-title>
          ,
          <source>in: The Semantic Web</source>
          , Springer,
          <year>2017</year>
          , pp.
          <fpage>35</fpage>
          -
          <lpage>50</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>P.</given-names>
            <surname>Heyvaert</surname>
          </string-name>
          , et al.
          <article-title>Declarative rules for linked data generation at your fingertips!</article-title>
          ,
          <source>in: The Semantic Web: ESWC 2018 Satellite Events</source>
          , Springer,
          <year>2018</year>
          , pp.
          <fpage>213</fpage>
          -
          <lpage>217</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>H.</given-names>
            <surname>García-González</surname>
          </string-name>
          , et al.
          <article-title>ShExML: improving the usability of heterogeneous data mapping languages for first-time users</article-title>
          ,
          <source>PeerJ Computer Science</source>
          <volume>6</volume>
          (
          <year>2020</year>
          )
          27. doi:
          <pub-id pub-id-type="doi">10.7717/peerj-cs.318</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tang</surname>
          </string-name>
          , et al.
          <article-title>A graph-based approach for integrating biological heterogeneous data based on connecting ontology</article-title>
          ,
          <source>in: IEEE Int'l Conf. on Bioinformatics and Biomedicine</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>600</fpage>
          -
          <lpage>607</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1109/BIBM52615.2021.9669700</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>P.</given-names>
            <surname>Chandak</surname>
          </string-name>
          , et al.
          <article-title>Building a knowledge graph to enable precision medicine</article-title>
          ,
          <source>Sci Data</source>
          <volume>10</volume>
          (
          <year>2023</year>
          )
          <elocation-id>67</elocation-id>
          . doi:
          <pub-id pub-id-type="doi">10.1038/s41597-023-01960-3</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A. C.</given-names>
            <surname>Sima</surname>
          </string-name>
          , et al.
          <article-title>Enabling semantic queries across federated bioinformatics databases</article-title>
          ,
          <source>Database</source>
          <volume>2019</volume>
          (
          <year>2019</year>
          ). doi:
          <pub-id pub-id-type="doi">10.1093/database/baz106</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>T. J.</given-names>
            <surname>Callahan</surname>
          </string-name>
          , et al.
          <article-title>A framework for automated construction of heterogeneous large-scale biomedical knowledge graphs</article-title>
          ,
          <source>bioRxiv</source>
          (
          <year>2020</year>
          ). doi:
          <pub-id pub-id-type="doi">10.1101/2020.04.30.071407</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M. W.</given-names>
            <surname>Wright</surname>
          </string-name>
          ,
          <article-title>A short guide to long non-coding RNA gene nomenclature</article-title>
          ,
          <source>Human Genomics</source>
          <volume>8</volume>
          (
          <year>2014</year>
          )
          <elocation-id>7</elocation-id>
          . doi:
          <pub-id pub-id-type="doi">10.1186/1479-7364-8-7</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>J.</given-names>
            <surname>Huang</surname>
          </string-name>
          , et al.
          <article-title>The non-coding RNA ontology (NCRO): a comprehensive resource for the unification of non-coding RNA biology</article-title>
          ,
          <source>J. of Biomedical Semantics</source>
          <volume>7</volume>
          (
          <year>2016</year>
          )
          <elocation-id>24</elocation-id>
          . doi:
          <pub-id pub-id-type="doi">10.1186/s13326-016-0066-0</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>E.</given-names>
            <surname>Ong</surname>
          </string-name>
          , et al.
          <article-title>Ontobee: A linked ontology data server to support ontology term dereferencing, linkage, query and integration</article-title>
          ,
          <source>Nucleic Acids Res</source>
          .
          <volume>45</volume>
          (
          <year>2016</year>
          )
          <fpage>D347</fpage>
          -
          <lpage>D352</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1093/nar/gkw918</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>F.</given-names>
            <surname>Xia</surname>
          </string-name>
          , et al.
          <article-title>Graph Learning: A Survey</article-title>
          ,
          <source>IEEE Transactions on Artificial Intelligence</source>
          <volume>2</volume>
          (
          <issue>2</issue>
          ) (
          <year>2021</year>
          )
          <fpage>109</fpage>
          -
          <lpage>127</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>S.</given-names>
            <surname>Bonfitto</surname>
          </string-name>
          , et al.
          <article-title>Easy-to-use interfaces for supporting the semantic annotation of web tables</article-title>
          ,
          <source>in: Int'l Workshop on Data Platforms Design, Management, and Optimization</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>