<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Integrating Knowledge Graphs for Explainable Artificial Intelligence in Biomedicine</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marta Contreiras Silva</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniel Faria</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Catia Pesquita</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>LASIGE, Dep. de Informática</institution>
          ,
          <addr-line>Faculdade de Ciências da Universidade de Lisboa</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The rich panorama of publicly available data and ontologies in the biomedical domain represents a unique opportunity for developing explainable knowledge-enabled systems in biomedical Artificial Intelligence (AI) [1, 3, 4]. Building on decades of work by the semantic web and biomedical ontologies communities, a semi-automated approach for building and maintaining a Knowledge Graph (KG) to support AI-based personalized medicine is within our grasp. However, personalized medicine also poses significant challenges that require advances to the state of the art, such as the diversity and complexity of the domain and underlying data, coupled with the requirements for explainability. We propose an approach (see Figure 1) to build a KG for personalized medicine that serves as a rich input for the AI system (ante-hoc) and incorporates its outcomes to support explanations by connecting input and output (post-hoc).</p>
        <p>A preparatory step is <italic>Data and ontology collection and curation</italic>. This includes the selection and curation of relevant public datasets for the domain in question, the identification of ontologies referenced by the datasets, and the selection of other relevant ontologies to ensure adequate coverage of the domain and sufficient semantic richness to support explanations. Additionally, the privacy requirements inherent to patient data should inform the decision to keep part of the KG private to its data providers and to make the data integration process mostly automatic, reducing the need for human involvement [2].</p>
        <p>The first step in our approach is <italic>Ontology Matching</italic>. Key challenges are scalability and complex matching, since building a comprehensive KG requires matching multiple ontologies with hundreds of thousands of concepts, covering different domains, and with different modeling perspectives. Regarding scalability, our solution is to match ontologies iteratively: the two largest ontologies are matched and merged into a single one, the result is then matched and merged with the third largest ontology, and so on, using complex matching algorithms to uncover rich relations across domains [6, 7]. Before the final integration of ontologies, alignments are partially validated by experts to ensure an accurate KG that can support explanations [5].</p>
        <p>The <italic>Semantic Data Annotation</italic> process relies on the development of parsers to interpret each type of dataset, and annotation algorithms to produce an RDF version of the dataset that is semantically integrated into the KG. Finally, the <italic>Integration with the AI system</italic> ensures that the KG serves as both input to AI methods (directly or through feature generation [8]) and</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <p>
        also encodes the AI outcomes, providing a shared semantic space for data,
scientific context, and predictions that can support KG-based explanation
methods, including querying, reasoning and similarity searches [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>
        <bold>Acknowledgments</bold> This work was supported by FCT through the LASIGE
Research Unit (UIDB/00408/2020 and UIDP/00408/2020). It was also partially
supported by the KATY project, which has received funding from the European Union's
Horizon 2020 research and innovation program under grant agreement No 101017453.
      </p>
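      <p>The iterative matching strategy outlined in the abstract (match and merge the two largest ontologies, then fold in the next largest, and so on) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: ontologies are reduced to dictionaries mapping concept identifiers to labels, the hypothetical match function aligns concepts by exact case-insensitive label equality rather than by the complex matching algorithms of [6, 7], and the names match, merge and integrate are not taken from any existing tool. Expert validation of alignments is omitted.</p>
      <preformat>
```python
from functools import reduce


def match(source, target):
    """Toy matcher (assumption): align concepts whose lowercased labels are equal."""
    target_by_label = {label.lower(): cid for cid, label in target.items()}
    return {
        cid: target_by_label[label.lower()]
        for cid, label in source.items()
        if label.lower() in target_by_label
    }


def merge(source, target, alignment):
    """Fold source into target, keeping a single concept per aligned pair."""
    merged = dict(target)
    for cid, label in source.items():
        # Aligned source concepts are already represented by their target match.
        if cid not in alignment:
            merged[cid] = label
    return merged


def integrate(ontologies):
    """Match and merge ontologies iteratively, from largest to smallest."""
    ordered = sorted(ontologies, key=len, reverse=True)

    def step(merged, nxt):
        return merge(nxt, merged, match(nxt, merged))

    return reduce(step, ordered[1:], dict(ordered[0]))


# Hypothetical miniature ontologies (identifiers and labels are made up).
disease = {"D1": "Diabetes mellitus", "D2": "Hypertension", "D3": "Asthma"}
phenotype = {"P1": "hypertension", "P2": "Wheezing"}
drug = {"C1": "Metformin"}

kg = integrate([drug, phenotype, disease])
print(sorted(kg))  # ['C1', 'D1', 'D2', 'D3', 'P2']
```
      </preformat>
      <p>In this sketch, each new ontology is matched against a single merged target rather than against every ontology already integrated, which is the property the iterative strategy relies on for scalability. The sketch also assumes that concept identifiers are unique across ontologies.</p>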
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Chari</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gruen</surname>
            ,
            <given-names>D.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seneviratne</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McGuinness</surname>
            ,
            <given-names>D.L.</given-names>
          </string-name>
          :
          <article-title>Directions for explainable knowledge-enabled systems</article-title>
          . arXiv preprint arXiv:2003.07523 (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cui</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Survey and open problems in privacy preserving knowledge graph: Merging, query, representation, completion and applications</article-title>
          . arXiv preprint arXiv:2011.10180 (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Holzinger</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Langs</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Denk</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zatloukal</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , Muller, H.:
          <article-title>Causability and explainability of arti cial intelligence in medicine</article-title>
          .
          <source>WIREs Data Mining and Knowledge Discovery</source>
          <volume>9</volume>
          (
          <issue>4</issue>
          ),
          <elocation-id>e1312</elocation-id>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Lecue</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>On the role of knowledge graphs in explainable AI</article-title>
          .
          <source>Semantic Web</source>
          <volume>11</volume>
          (
          <issue>1</issue>
          ),
          <fpage>41</fpage>
          –
          <lpage>51</lpage>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dragisic</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Faria</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ivanova</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jimenez-Ruiz</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lambrix</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pesquita</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>User validation in ontology alignment: functional assessment and impact</article-title>
          .
          <source>The Knowledge Engineering Review</source>
          <volume>34</volume>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Lima</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Faria</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pesquita</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Pattern-guided association rule mining for complex ontology alignment</article-title>
          .
          In:
          <source>ISWC 2021 Poster &amp; Demo Track</source>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Oliveira</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pesquita</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Improving the interoperability of biomedical ontologies with compound alignments</article-title>
          .
          <source>Journal of Biomedical Semantics</source>
          <volume>9</volume>
          (
          <issue>1</issue>
          ),
          <fpage>1</fpage>
          –
          <lpage>13</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Paulheim</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          , Fumkranz, J.:
          <article-title>Unsupervised generation of data mining features from linked open data</article-title>
          .
          <source>In: Proceedings of the 2nd International Conference on Web Intelligence, Mining and Semantics - WIMS '12</source>
          . p.
          <fpage>1</fpage>
          . ACM Press (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Pesquita</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Towards semantic integration for explainable artificial intelligence in the biomedical domain</article-title>
          .
          In:
          <source>BIOSTEC 2021</source>
          . vol.
          <volume>5</volume>
          , pp.
          <fpage>747</fpage>
          –
          <lpage>753</lpage>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>