<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Knowledge Graphs for Impactful Data Science</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Victor de Boer</string-name>
          <email>v.de.boer@vu.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Vrije Universiteit Amsterdam</institution>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <fpage>13</fpage>
      <lpage>15</lpage>
      <abstract>
        <p />
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <p>
        In this invited talk I will argue that to build scalable, transparent and explainable AI in various domains
where heterogeneous data is available, we need to collaborate with domain experts to develop relevant
and high-quality knowledge graphs as well as appropriate data science and Machine Learning methods
to constantly enrich and analyse these graphs. I give examples in the Digital Humanities and Internet of
Things.
      </p>
      <p>
        In many modern statistical approaches to AI, raw data is the preferred input for (Machine
Learning) models. In some areas, however, we struggle to find data in this raw form.
One such area involves heterogeneous knowledge: entities, their attributes and the
relations between them. The Semantic Web community has invested decades of work in just this problem:
how to use graphs to represent knowledge, in various domains, in as raw and as usable a form as
possible, satisfying many use cases. To build scalable, transparent and explainable AI in various
domains where such heterogeneous data and knowledge are available, we need to collaborate
with domain experts to develop a) relevant and high-quality knowledge graphs as well as b)
appropriate data science and ML methods to continuously enrich and analyse these graphs [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>In the domain of Digital Humanities (DH), data and knowledge are highly
heterogeneous. Digitized datasets can derive from centuries-old sources, and multiple views
on history and heritage should be represented. The capacity of the Knowledge Graph to capture
such heterogeneity makes it an ideal model to represent, share and combine data sources
to allow for new types of analyses. In this domain, Machine Learning and other Data Science
methods are increasingly used to identify patterns in the data, establish new links or
categorize entities. Transparency and explainability are key requirements for such methods to
be used in serious scholarly analysis.</p>
      <p>
        Although the domain of Internet of Things (IoT) and Smart Homes differs in many ways
from that of DH, here too we find datasets from varied sources, combined to allow for new
types of applications and analysis [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In smart home scenarios, methods that combine data into
knowledge graphs for further analysis or applications will need to be privacy-aware, transparent
and explainable. Using ontologies such as SAREF[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], we can achieve the interoperability needed to combine these datasets. Using
reusable (Python) notebooks we can establish a Data Science pipeline.
      </p>
      <p>http://victordeboer.com/ (V. de Boer)</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>X.</given-names>
            <surname>Wilcke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bloem</surname>
          </string-name>
          , V. de Boer,
          <article-title>The knowledge graph as the default data model for learning on heterogeneous knowledge</article-title>
          ,
          <source>Data Science</source>
          <volume>1</volume>
          (
          <year>2017</year>
          )
          <fpage>39</fpage>
          -
          <lpage>57</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>van der Weerdt</surname>
          </string-name>
          , V. de Boer,
          <string-name>
            <given-names>L.</given-names>
            <surname>Daniele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Nouwt</surname>
          </string-name>
          ,
          <article-title>Validating SAREF in a smart home environment</article-title>
          ,
          <source>in: Research Conference on Metadata and Semantics Research</source>
          , Springer,
          <year>2020</year>
          , pp.
          <fpage>35</fpage>
          -
          <lpage>46</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L.</given-names>
            <surname>Daniele</surname>
          </string-name>
          , F. den Hartog,
          <string-name>
            <given-names>J.</given-names>
            <surname>Roes</surname>
          </string-name>
          ,
          <article-title>Created in close interaction with the industry: the smart appliances reference (SAREF) ontology</article-title>
          , in: International Workshop Formal Ontologies Meet Industries, Springer,
          <year>2015</year>
          , pp.
          <fpage>100</fpage>
          -
          <lpage>112</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>