<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Learning with Knowledge Graphs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Volker Tresp</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yunpu Ma</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stephan Baier</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff1">
          <institution>Siemens AG, Corporate Technology</institution>
          ,
          <addr-line>Munich</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff0">
          <label>0</label>
          <institution>Ludwig-Maximilians-Universität München</institution>
          ,
          <addr-line>Munich</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In recent years a number of large-scale triple-oriented knowledge graphs have been generated. They are used in research and in applications to support search, text understanding and question answering. Knowledge graphs pose new challenges for machine learning, and research groups have developed novel statistical models that can be used to compress knowledge graphs, to derive implicit facts, and to detect errors in the knowledge graph. In this paper we describe the concept of triple-oriented knowledge graphs and corresponding learning approaches. We also discuss episodic knowledge graphs, which can represent temporal data; learning with episodic data can be the basis for decision support systems, e.g. in a clinical context. Finally, we discuss how knowledge graphs can support perception by mapping subsymbolic sensory inputs, such as images, to semantic triples. A particular feature of our approach is that perception, episodic memory and semantic memory are highly interconnected and that, in a cognitive interpretation, all rely on the same brain structures.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Semantic Knowledge Graphs</title>
      <p>
        A technical realization of a semantic memory is a knowledge graph (KG) which
is a triple-oriented knowledge representation: A labelled link implies a (subject,
predicate, object) statement where subject and object are entities that are
represented as the nodes in the graph and where the predicate labels the link from
subject to object. Large KGs have been developed that support search, text
understanding and question answering [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. A KG can be represented as a tensor
which maps indices to true or false:
      </p>
      <p>(s, p, o) ↦ Q,
with Q ∈ {T, F}, and where s ∈ {1, …, N} and o ∈ {1, …, N} are indices for the N
entities used as subject and object, and where p ∈ {1, …, R} is the index for the
predicate.</p>
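<p>The boolean tensor above can be sketched in a few lines of numpy; the entity and predicate names here are purely illustrative toy data, not part of any real KG.</p>

```python
import numpy as np

# Hypothetical toy KG: N = 3 entities, R = 2 predicates.
entities = ["Jack", "Diabetes", "Munich"]    # subject/object indices s, o
predicates = ["hasDisease", "livesIn"]       # predicate index p
N, R = len(entities), len(predicates)

# Boolean tensor Q: Q[s, p, o] is True iff the triple (s, p, o) holds.
Q = np.zeros((N, R, N), dtype=bool)
Q[entities.index("Jack"),
  predicates.index("hasDisease"),
  entities.index("Diabetes")] = True

print(Q[0, 0, 1])  # True: (Jack, hasDisease, Diabetes)
```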
      <p>
        A statistical model for a KG can be obtained by a tensor model of the form
(s, p, o) ↦ (a<sub>e(s)</sub>, a<sub>p</sub>, a<sub>e(o)</sub>) ↦ P.
(1)
Here e(s) and e(o) are the entities associated with subject and object,
respectively. The indices are first mapped to their latent representations a<sub>e(s)</sub>, a<sub>p</sub>, a<sub>e(o)</sub>,
which are then mapped to a probability P ∈ [0, 1]. P((s, p, o) = T | a<sub>e(s)</sub>, a<sub>p</sub>, a<sub>e(o)</sub>)
represents the Bernoulli probability that the triple (s, p, o) is true, and, when
normalized across all triples, P(s, p, o | a<sub>e(s)</sub>, a<sub>p</sub>, a<sub>e(o)</sub>) stands for the categorical
probability that the triple (s, p, o) is selected as an answer in a query process. A
number of mathematical models have been developed for the mapping in
Equation 1 (see [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]). A representative example is the RESCAL model [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which is a
constrained Tucker2 tensor model.
      </p>
      <p>Copyright © 2017 for this paper by its authors. Copying permitted for private and academic purposes.</p>
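<p>The RESCAL scoring function with a logistic link can be sketched as follows; the dimensions and the random initialization are illustrative (in practice the factors would be learned from the KG).</p>

```python
import numpy as np

rng = np.random.default_rng(0)
N, R, d = 5, 2, 4                      # entities, predicates, latent dim

A = rng.normal(size=(N, d))            # latent entity representations a_e
Rp = rng.normal(size=(R, d, d))        # one interaction matrix per predicate

def rescal_prob(s, p, o):
    """Bernoulli probability that triple (s, p, o) is true (Equation 1)."""
    score = A[s] @ Rp[p] @ A[o]          # bilinear RESCAL score
    return 1.0 / (1.0 + np.exp(-score))  # logistic link maps score to [0, 1]

p_triple = rescal_prob(0, 1, 3)
print(p_triple)  # a probability strictly between 0 and 1
```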
    </sec>
    <sec id="sec-2">
      <title>Episodic Knowledge Graphs</title>
      <p>Whereas a semantic KG model reflects the state of the world, e.g., of a clinic and
its patients, observations and actions describe factual knowledge about discrete
events. Generalizing the semantic KG, an episodic KG can be represented as a
4-way tensor with time index t as the map</p>
      <p>(s, p, o, t) ↦ Q.
A statistical model for a KG can be obtained by a 4-way tensor model of the
form
(s, p, o, t) ↦ (a<sub>e(s)</sub>, a<sub>p</sub>, a<sub>e(o)</sub>, a<sub>t</sub>) ↦ P
(2)
where a<sub>t</sub> is the latent representation for time index t.</p>
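<p>One simple way to realize the 4-way model of Equation 2 is an illustrative PARAFAC-style factorization with an additional time embedding; this is a sketch of the general idea, not necessarily the exact model of the cited work.</p>

```python
import numpy as np

rng = np.random.default_rng(1)
N, R, T, d = 5, 2, 10, 4       # entities, predicates, time steps, latent dim

A  = rng.normal(size=(N, d))   # entity embeddings a_e
Pm = rng.normal(size=(R, d))   # predicate embeddings a_p
At = rng.normal(size=(T, d))   # latent time representations a_t

def episodic_prob(s, p, o, t):
    """P that (s, p, o) occurred at time t, via a 4-way elementwise score."""
    score = np.sum(A[s] * Pm[p] * A[o] * At[t])
    return 1.0 / (1.0 + np.exp(-score))

print(episodic_prob(0, 1, 3, 7))
```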
      <p>
        The basis for the tight link between different memory functions is the "unique
representation hypothesis", which states that an entity has a unique latent
representation in a technical application, and perhaps also in the human brain [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>
        As discussed in [
        <xref ref-type="bibr" rid="ref11 ref5">11, 5</xref>
        ], both the episodic KG and the semantic KG might rely
on the same representations, i.e., it was proposed that the semantic KG can be
derived from the episodic KG by a marginalization operation. Thus, while an episodic
fact might represent that "Jack, wasDiagnosed, Diabetes, on Jan 15", the derived
semantic fact might be "Jack, hasDisease, Diabetes". In [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ] medical decision
systems are described that combine semantic and episodic tensor representations
of data with recurrent neural network predictive models.
      </p>
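<p>The marginalization over time that turns episodic facts into a semantic fact can be sketched as follows; a noisy-OR aggregation over the time axis is one plausible reading, and the toy probabilities are invented for illustration.</p>

```python
import numpy as np

# Episodic Bernoulli probabilities P[s, p, o, t] for T = 3 time steps.
P_episodic = np.zeros((2, 1, 2, 3))
# Hypothetical (Jack, wasDiagnosed, Diabetes) probabilities over time:
P_episodic[0, 0, 1, :] = [0.1, 0.9, 0.2]

# Marginalize out time: the semantic fact holds if the event occurred at
# any time step (noisy-OR aggregation over the time axis).
P_semantic = 1.0 - np.prod(1.0 - P_episodic, axis=3)

print(P_semantic[0, 0, 1])  # close to 1: "Jack, hasDisease, Diabetes"
```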
    </sec>
    <sec id="sec-3">
      <title>Perception</title>
      <p>The tensor models permit generalization, i.e., the prediction of the probability of
triples which were not known to be true in the data. This is especially important
in perception, which we propose can be thought of as the mapping of
subsymbolic sensory inputs to a semantic description in the form of a set of triples,
describing and explaining the sensory inputs. These triples then become part
of episodic memory.</p>
      <p>Let u<sub>t,1</sub>, …, u<sub>t,c</sub>, …, u<sub>t,C</sub> be the contents of the sensory buffers at time t. We
propose that this sensory input can predict the latent representation for time in
the form of a map
u<sub>t,1</sub>, …, u<sub>t,c</sub>, …, u<sub>t,C</sub> ↦ a<sub>t</sub>.</p>
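<p>The map from the sensory buffers to the latent time representation might, as noted below, be a deep network; here is a minimal two-layer sketch with random weights, where all shapes and the buffer count are hypothetical.</p>

```python
import numpy as np

rng = np.random.default_rng(2)
C, buf_dim, hidden, d = 3, 8, 16, 4    # C sensory buffers, latent dim d

# Weights w of a small two-layer network (randomly initialized here;
# in practice they would be trained end to end).
W1 = rng.normal(size=(C * buf_dim, hidden))
W2 = rng.normal(size=(hidden, d))

def a_t(u_buffers):
    """Map the C sensory buffers u_{t,1}, ..., u_{t,C} to the latent a_t."""
    u = np.concatenate(u_buffers)      # flatten the buffers into one input
    h = np.tanh(u @ W1)                # hidden layer
    return h @ W2                      # latent time representation a_t

u_t = [rng.normal(size=buf_dim) for _ in range(C)]
print(a_t(u_t).shape)  # (4,)
```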
      <p>This map a<sub>t</sub>(u<sub>t</sub>, w) might be modelled by a deep neural network with weights
w. Perceptual decoding then produces likely triples from the probability
distribution (a generalized nonlinear model) using</p>
      <p>
        P(s, p, o; a<sub>e(s)</sub>, a<sub>p</sub>, a<sub>e(o)</sub>, a<sub>t</sub>(u<sub>t</sub>, w)).
An episodic memory would simply store a<sub>t</sub>, and memorizing simply means the
restoring of a past a<sub>t</sub>, which can then be decoded as described [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ]. A semantic
memory uses the marginalization approach described in Section 2.
      </p>
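<p>Perceptual decoding, i.e., reading off the most probable triples given a perceived latent representation of time, can be sketched by scoring all candidate triples against it; the elementwise 4-way score used here is an illustrative choice with random toy embeddings.</p>

```python
import numpy as np

rng = np.random.default_rng(3)
N, R, d = 4, 2, 4
A   = rng.normal(size=(N, d))   # entity embeddings
Pm  = rng.normal(size=(R, d))   # predicate embeddings
a_t = rng.normal(size=d)        # latent representation of the current moment

# Score every candidate triple (s, p, o) against a_t in one einsum,
# then map scores to Bernoulli probabilities.
scores = np.einsum('sd,pd,od,d->spo', A, Pm, A, a_t)
probs = 1.0 / (1.0 + np.exp(-scores))

s, p, o = np.unravel_index(np.argmax(probs), probs.shape)
print((s, p, o))  # indices of the most probable perceived triple
```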
      <p>
        Alternatively, P(s, p, o) or P(s, p, o, t) can be used as a
semantic prior in sensory decoding. This was the basis for approaches to extract
triples from Web sources [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and for the extraction of triples from images [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Stephan</given-names>
            <surname>Baier</surname>
          </string-name>
          , Yunpu Ma, and
          <string-name>
            <given-names>Volker</given-names>
            <surname>Tresp</surname>
          </string-name>
          .
          <article-title>Improving visual relationship detection using semantic modeling of scene descriptions</article-title>
          .
          <source>In ISWC</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Xin</given-names>
            <surname>Dong</surname>
          </string-name>
          , Evgeniy Gabrilovich, Geremy Heitz, Wilko Horn, Ni Lao, Kevin Murphy, Thomas Strohmann, Shaohua Sun, and Wei Zhang.
          <article-title>Knowledge Vault: A Web-scale Approach to Probabilistic Knowledge Fusion</article-title>
          .
          <source>In KDD</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Cristobal</given-names>
            <surname>Esteban</surname>
          </string-name>
          , Danilo Schmidt, Denis Krompaß, and
          <string-name>
            <given-names>Volker</given-names>
            <surname>Tresp</surname>
          </string-name>
          .
          <article-title>Predicting sequences of clinical events by using a personalized temporal latent embedding model</article-title>
          .
          <source>In Healthcare Informatics (ICHI)</source>
          , 2015 International Conference on,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Yinchong</given-names>
            <surname>Yang</surname>
          </string-name>
          , Volker Tresp, and
          <string-name>
            <given-names>Peter</given-names>
            <surname>Fasching</surname>
          </string-name>
          .
          <article-title>Predictive modeling of therapy decisions in metastatic breast cancer with recurrent neural network encoder and multinomial hierarchical regression decoder</article-title>
          .
          <source>In ICHI</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Yunpu</given-names>
            <surname>Ma</surname>
          </string-name>
          , Volker Tresp, and
          <string-name>
            <given-names>Erik</given-names>
            <surname>Daxberger</surname>
          </string-name>
          .
          <article-title>Embedding models for episodic memory</article-title>
          . Submitted,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Maximilian</given-names>
            <surname>Nickel</surname>
          </string-name>
          , Volker Tresp, and
          <string-name>
            <given-names>Hans-Peter</given-names>
            <surname>Kriegel</surname>
          </string-name>
          .
          <article-title>A Three-Way Model for Collective Learning on Multi-Relational Data</article-title>
          . In ICML,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Maximilian</given-names>
            <surname>Nickel</surname>
          </string-name>
          , Kevin Murphy, Volker Tresp, and
          <string-name>
            <given-names>Evgeniy</given-names>
            <surname>Gabrilovich</surname>
          </string-name>
          .
          <article-title>A review of relational machine learning for knowledge graphs: From multi-relational link prediction to automated knowledge graph construction</article-title>
          .
          <source>Proceedings of the IEEE</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Amit</given-names>
            <surname>Singhal</surname>
          </string-name>
          .
          <article-title>Introducing the Knowledge Graph: things, not strings</article-title>
          .
          <source>Official Google Blog</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Volker</given-names>
            <surname>Tresp</surname>
          </string-name>
          , Cristobal Esteban, Yinchong Yang,
          <string-name>
            <given-names>Stephan</given-names>
            <surname>Baier</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Denis</given-names>
            <surname>Krompaß</surname>
          </string-name>
          .
          <article-title>Learning with memory embeddings</article-title>
          .
          <source>NIPS 2015 Workshop (extended TR); arXiv:1511.07972</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Volker</given-names>
            <surname>Tresp</surname>
          </string-name>
          , Yunpu Ma, and
          <string-name>
            <given-names>Stephan</given-names>
            <surname>Baier</surname>
          </string-name>
          .
          <article-title>Tensor memories</article-title>
          .
          <source>In CCN</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Volker</given-names>
            <surname>Tresp</surname>
          </string-name>
          , Yunpu Ma, Stephan Baier, and
          <string-name>
            <given-names>Yinchong</given-names>
            <surname>Yang</surname>
          </string-name>
          .
          <article-title>Embedding learning for declarative memories</article-title>
          .
          <source>In ESWC</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>