<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Putting RDF2vec in Order</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Data and Web Science Group, University of Mannheim</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>One Domain Model</institution>
          ,
          <addr-line>Walldorf</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>SAP SE Business Technology Platform</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>The RDF2vec method for creating node embeddings on knowledge graphs is based on word2vec, which, in turn, is agnostic towards the position of context words. In this paper, we argue that this might be a shortcoming when training RDF2vec, and show that using a word2vec variant which respects order yields considerable performance gains, especially on tasks where entities of different classes are involved. (Poster submission.)</p>
        <p>Fig. 1. Classic word2vec vs. structured word2vec.</p>
      </abstract>
      <kwd-group>
        <kwd>RDF2vec</kwd>
        <kwd>knowledge graphs</kwd>
        <kwd>knowledge graph embeddings</kwd>
        <kwd>machine learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        RDF2vec [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] is a representation learning approach for entities in a knowledge
graph. The basic idea is to first create sequences from a knowledge graph by
starting random walks from each node. These sequences are then fed into the
word2vec algorithm [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] for creating word embeddings, with each entity or
property in the graph being treated as a "word". As a result, a fixed-size feature
vector is obtained for each entity.
      </p>
      <p>[Fig. 2: excerpt of a knowledge graph. Recoverable node labels: Angela_Merkel, Germany, Hamburg; recoverable edge labels: leader, country, birthPlace, residence.]</p>
      <p>Word2vec is a well-known neural language model for training latent
representations (i.e., fixed-size vectors) of words based on a text corpus. Given
the context k of a word w, where k is a set of preceding and succeeding words of
w, the continuous bag of words (CBOW) model is trained to predict w. The skip-gram
(SG) model is trained the other way around: given w, k has to be predicted. Within
this training process, the size of k is also known as the window or window size.</p>
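      <p>The two training views described above can be sketched in a few lines of Python. This is only an illustration, not the word2vec implementation: the function names are hypothetical, and the sketch assumes that the window counts the words taken on each side of w, as is common in word2vec implementations.</p>
      <preformat><![CDATA[
```python
def skipgram_pairs(sequence, window):
    """SG view: for each word w, emit one (w, context_word) pair per word in k."""
    pairs = []
    for i, w in enumerate(sequence):
        lo, hi = max(0, i - window), min(len(sequence), i + window + 1)
        pairs.extend((w, sequence[j]) for j in range(lo, hi) if j != i)
    return pairs


def cbow_examples(sequence, window):
    """CBOW view: for each word w, emit one (k, w) example, k being the context."""
    examples = []
    for i, w in enumerate(sequence):
        lo, hi = max(0, i - window), min(len(sequence), i + window + 1)
        k = [sequence[j] for j in range(lo, hi) if j != i]
        examples.append((k, w))
    return examples


# Assumption: window = 2 means up to two words on each side of w.
sentence = ["Tom", "ate", "bread", "yesterday", "morning"]
sg = skipgram_pairs(sentence, window=2)
cbow = cbow_examples(sentence, window=2)
print(cbow[2])  # context and target for the word "bread"
```
]]></preformat>
      <p>Note that neither view records where in the window a context word occurred; only its membership in k is used.</p>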
      <p>One shortfall of the original word2vec approach is its insensitivity
to the relative positions of words. It is, for instance, irrelevant whether a word
precedes or succeeds w, and the actual distance to w is not considered.
This property of word2vec is ideal for coping with the fact that in many languages,
the same sentence can be expressed with different word orderings (cf.
"Yesterday morning, Tom ate bread" vs. "Tom ate bread yesterday morning"). In contrast,
for walks extracted from knowledge graphs, the semantics of the underlying nodes
differ depending on the position of an entity in the walk, as the following
example illustrates.</p>
      <p>Fig. 2 depicts a small excerpt of a knowledge graph. Among others, the
following walks could be extracted from the graph:</p>
      <list list-type="simple">
        <list-item><p>Hamburg -&gt; country -&gt; Germany -&gt; leader -&gt; Angela_Merkel</p></list-item>
        <list-item><p>Germany -&gt; leader -&gt; Angela_Merkel -&gt; birthPlace -&gt; Hamburg</p></list-item>
        <list-item><p>Hamburg -&gt; leader -&gt; Peter_Tschentscher -&gt; residence -&gt; Hamburg</p></list-item>
      </list>
      <p>If an RDF2vec model is trained for the entities in the center (i.e., Germany,
Angela Merkel, and Peter Tschentscher), all of the sequences share exactly
two elements in their context (Hamburg and leader), so classic word2vec, which
ignores positions, will project the three entities roughly equally close in the
vector space. However, a model respecting positions would
differentiate the different meanings of leader (i.e., whether someone or
something has or is a leader) and the different roles of the involved entities (i.e.,
Hamburg as a place of birth or a residence of a person, or as being located in a
country). Therefore, it would map the two politicians closer to each other than
to Germany.</p>
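      <p>The argument can be checked directly on the three example walks. The following Python sketch (helper names are hypothetical) compares the unordered context of each central entity with a position-tagged context; counting shared context items is only a rough proxy for similarity in the embedding space, not the training procedure itself.</p>
      <preformat><![CDATA[
```python
walks = [
    ["Hamburg", "country", "Germany", "leader", "Angela_Merkel"],
    ["Germany", "leader", "Angela_Merkel", "birthPlace", "Hamburg"],
    ["Hamburg", "leader", "Peter_Tschentscher", "residence", "Hamburg"],
]
CENTER = 2  # index of the central entity in each walk


def unordered_context(walk, center):
    """Context as classic word2vec sees it: a bag without positions."""
    return {tok for i, tok in enumerate(walk) if i != center}


def positional_context(walk, center):
    """Context tagged with the offset relative to the central entity."""
    return {(i - center, tok) for i, tok in enumerate(walk) if i != center}


bag = {w[CENTER]: unordered_context(w, CENTER) for w in walks}
pos = {w[CENTER]: positional_context(w, CENTER) for w in walks}

# Elements shared by all three unordered contexts
print(sorted(set.intersection(*bag.values())))  # ['Hamburg', 'leader']

# Number of shared (offset, token) items per pair of central entities
for a, b in [("Angela_Merkel", "Peter_Tschentscher"),
             ("Germany", "Angela_Merkel"),
             ("Germany", "Peter_Tschentscher")]:
    print(a, b, len(pos[a] & pos[b]))
```
]]></preformat>
      <p>With positions attached, the two politicians share two context items (leader at offset -1 and Hamburg at offset +2), whereas Germany shares none with Angela_Merkel and one with Peter_Tschentscher.</p>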
      <p>
        Ling et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] present an extension to the word2vec algorithm, known as
structured word2vec, which incorporates the positional information of words.
This is achieved by using multiple encoders (CBOW) or decoders (SG), respectively,
depending on the position of the context words. An illustration for SG can be
found in Figure 1: the classic component uses only one
output matrix O, which maps the embeddings to the output, while the structured
approach uses one output matrix per position in the window (e.g., O<sub>+1</sub> for the
word directly following w<sub>0</sub>).
      </p>
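      <p>The difference between one shared output matrix and one output matrix per position can be illustrated with a toy forward pass. The sketch below is written in plain Python under stated assumptions: random, untrained weights, a full softmax over a five-token vocabulary, and hypothetical names throughout. It does not reproduce the structured word2vec implementation.</p>
      <preformat><![CDATA[
```python
import math
import random


def init_matrix(rows, cols, rng):
    """Random matrix as a list of rows (stand-in for trained weights)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(output_matrix, embedding):
    """Map an embedding to a distribution over the vocabulary."""
    scores = [sum(o * e for o, e in zip(row, embedding)) for row in output_matrix]
    return softmax(scores)


vocab = ["Hamburg", "country", "Germany", "leader", "Angela_Merkel"]
dim, window = 4, 2  # toy sizes, chosen for illustration only
rng = random.Random(0)
embeddings = init_matrix(len(vocab), dim, rng)

# Classic SG: a single output matrix O serves every context position.
O_classic = init_matrix(len(vocab), dim, rng)

# Structured SG: one output matrix per relative position (O_-2, O_-1, O_+1, O_+2).
offsets = [o for o in range(-window, window + 1) if o != 0]
O_structured = {o: init_matrix(len(vocab), dim, rng) for o in offsets}

w0 = embeddings[vocab.index("Germany")]
classic_dist = predict(O_classic, w0)  # identical for all positions
structured_dists = {o: predict(O_structured[o], w0) for o in offsets}
```
]]></preformat>
      <p>In the classic setting the same distribution is used for every context position, whereas the structured setting yields one distribution per offset, at the cost of 2 x window output matrices instead of one.</p>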
      <p>In this paper, we present RDF2vec<sub>oa</sub>, an order-aware variant of RDF2vec
obtained by changing the training component from word2vec to structured word2vec,
and show promising preliminary results.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        RDF2vec was one of the first approaches to adapt statistical language modeling
techniques to knowledge graphs. Similar approaches, such as node2vec [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and
DeepWalk [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], were proposed for unlabeled graphs, whereas knowledge graphs are
labeled by nature, i.e., they contain different types of edges.
      </p>
      <p>
        Other language modeling techniques that have been adapted for knowledge
graphs include GloVe [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], which yielded KGloVe [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], and BERT [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], which yielded
KG-BERT [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>
        Variants of RDF2vec include the use of different heuristics for biasing the
walks [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Vandewiele et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] evaluate multiple heuristics for biasing the walks as well as
alternative walk strategies. Very few authors have tried to change the training objective
of RDF2vec. Besides word2vec, the GloVe [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] algorithm has also been used [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>Experiments and Preliminary Results</title>
      <p>
        We use jRDF2vec<sup>4</sup> [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] to generate random walks and Ling et al.'s structured
word2vec implementation<sup>5</sup> to train an embedding based on the walks.
      </p>
      <p>For the embeddings, we use the DBpedia 2016-04 dataset. We generated 500
random walks for each node in the graph with a depth of 4 (node hops). Both word2vec
and structured word2vec were trained using the same set of walks and the same
training parameters: SG, window = 5, and size ∈ {100, 200}.</p>
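      <p>Walk generation of this kind can be sketched as follows. This is a minimal Python illustration, not jRDF2vec: the toy graph is assembled from the example walks in the Introduction, the function names are hypothetical, and the sketch assumes that a depth of 4 means four node hops, with each hop adding one predicate and one node to the walk. Only five walks per node are drawn here instead of 500.</p>
      <preformat><![CDATA[
```python
import random

# Toy graph (assumption): node -> list of (predicate, target node) edges
graph = {
    "Hamburg": [("country", "Germany"), ("leader", "Peter_Tschentscher")],
    "Germany": [("leader", "Angela_Merkel")],
    "Angela_Merkel": [("birthPlace", "Hamburg")],
    "Peter_Tschentscher": [("residence", "Hamburg")],
}


def random_walk(graph, start, depth, rng):
    """Follow up to `depth` randomly chosen outgoing edges from `start`."""
    walk, node = [start], start
    for _ in range(depth):
        edges = graph.get(node, [])
        if not edges:  # dead end: stop the walk early
            break
        predicate, node = rng.choice(edges)
        walk.extend([predicate, node])
    return walk


def generate_walks(graph, walks_per_node, depth, seed=42):
    """Start `walks_per_node` random walks from every node in the graph."""
    rng = random.Random(seed)
    return [random_walk(graph, node, depth, rng)
            for node in graph
            for _ in range(walks_per_node)]


walks = generate_walks(graph, walks_per_node=5, depth=4)
for walk in walks[:3]:
    print(" -> ".join(walk))
```
]]></preformat>
      <p>Each resulting walk is a token sequence alternating between entities and predicates, which is the input format expected by both word2vec and structured word2vec.</p>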
      <p>
        We evaluate both the classic and the position-aware RDF2vec approach on
a variety of different tasks and datasets. For our evaluation, we use the GEval
framework [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. We follow the setup proposed in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Those works use
data mining tasks with an external ground truth. Different feature extraction
methods (including the generation of embedding vectors) can then be
compared using a fixed set of learning methods. Overall, we evaluate our new
embedding approach on six downstream tasks using 20 datasets altogether:
classification, regression, clustering, determining semantic analogies,
computing entity relatedness, and computing document similarity, the latter based on
entities mentioned in the documents.
      </p>
      <p>The results are presented in Table 1. When comparing the classic to the
order-aware embeddings, the performances are very similar on
most tasks, such as classification. A first observation is that there are no
significant performance drops on any of the tasks when switching from classic to
order-aware RDF2vec embeddings. However, significant performance increases
can be observed on clustering tasks and on semantic analogy tasks, which are the
tasks where entities of different classes are involved (whereas the classification
and regression tasks deal with entities of the same class, e.g., cities or countries).
The order-aware RDF2vec configuration with 100 dimensions achieved the overall best
results on 7 datasets and outperforms the classic configuration with
the same dimension on 10 datasets, in part with significantly better outcomes.
On the other hand, in most cases where the classic variant performs better,
it does so by a smaller margin. Thus, in general, the order-aware variant can
be used safely without performance drops, and in some cases with significant
performance gains.</p>
      <p><sup>4</sup> <ext-link ext-link-type="uri" xlink:href="https://github.com/dwslab/jRDF2Vec">https://github.com/dwslab/jRDF2Vec</ext-link></p>
      <p><sup>5</sup> <ext-link ext-link-type="uri" xlink:href="https://github.com/wlin12/wang2vec">https://github.com/wlin12/wang2vec</ext-link></p>
    </sec>
    <sec id="sec-4">
      <title>Summary and Future Work</title>
      <p>
        In this paper, we presented a position-aware variant of RDF2vec together with
first, very promising evaluation results. In the future, we plan to conduct more
thorough analyses of which knowledge graph characteristics and
downstream tasks benefit most from the ordered variant, and which do not. For
example, we believe that graphs with a small set of predicates, or graphs which
have all symmetric, inverse, and transitive relations materialized [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], can benefit
more from using the ordered variant.
      </p>
      <p>
        Furthermore, we plan to analyze how the ordered variant can be integrated
into other RDF2vec configurations and flavours, such as different biased walks
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], or RDF2vec Light [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Cochez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ristoski</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ponzetto</surname>
            ,
            <given-names>S.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paulheim</surname>
          </string-name>
          , H.:
          <article-title>Biased graph walks for RDF graph embeddings</article-title>
          .
          <source>In: WIMS 2017</source>
          . pp.
          <fpage>21:1</fpage>
          –
          <lpage>21:12</lpage>
          .
          <string-name>
            <surname>ACM</surname>
          </string-name>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Cochez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ristoski</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ponzetto</surname>
            ,
            <given-names>S.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paulheim</surname>
          </string-name>
          , H.:
          <article-title>Global RDF vector space embeddings</article-title>
          .
          <source>In: ISWC 2017. LNCS</source>
          , vol.
          <volume>10587</volume>
          , pp.
          <fpage>190</fpage>
          –
          <lpage>207</lpage>
          . Springer (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Devlin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>M.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toutanova</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          . arXiv preprint arXiv:1810.04805
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Grover</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leskovec</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>node2vec: Scalable feature learning for networks</article-title>
          .
          <source>In: ACM SIGKDD</source>
          <year>2016</year>
          . pp.
          <fpage>855</fpage>
          –
          <lpage>864</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Iana</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paulheim</surname>
          </string-name>
          , H.:
          <article-title>More is not always better: The negative impact of A-box materialization on RDF2vec knowledge graph embeddings</article-title>
          .
          <source>In: Proceedings of the CIKM 2020 Workshops</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Ling</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dyer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Black</surname>
            ,
            <given-names>A.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trancoso</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Two/too simple adaptations of word2vec for syntax problems</article-title>
          .
          <source>In: NAACL HLT</source>
          <year>2015</year>
          . pp.
          <fpage>1299</fpage>
          –
          <lpage>1304</lpage>
          .
          <string-name>
            <surname>ACL</surname>
          </string-name>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sutskever</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corrado</surname>
            ,
            <given-names>G.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dean</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Distributed representations of words and phrases and their compositionality</article-title>
          .
          <source>In: NIPS</source>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Pellegrino</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Altabba</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garofalo</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ristoski</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cochez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>GEval: A modular and extensible evaluation framework for graph embedding techniques</article-title>
          .
          <source>In: ESWC</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Pennington</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Socher</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manning</surname>
          </string-name>
          , C.D.:
          <article-title>GloVe: Global vectors for word representation</article-title>
          .
          <source>In: EMNLP 2014</source>
          . pp.
          <fpage>1532</fpage>
          –
          <lpage>1543</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Pennington</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Socher</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manning</surname>
          </string-name>
          , C.D.:
          <article-title>GloVe: Global vectors for word representation</article-title>
          .
          <source>In: EMNLP 2014</source>
          . pp.
          <fpage>1532</fpage>
          –
          <lpage>1543</lpage>
          .
          <string-name>
            <surname>ACL</surname>
          </string-name>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Perozzi</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Al-Rfou</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Skiena</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>DeepWalk: Online learning of social representations</article-title>
          .
          <source>In: ACM SIGKDD</source>
          <year>2014</year>
          . pp.
          <fpage>701</fpage>
          –
          <lpage>710</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Portisch</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hladik</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paulheim</surname>
          </string-name>
          , H.:
          <article-title>RDF2vec Light - A lightweight approach for knowledge graph embeddings</article-title>
          .
          <source>In: ISWC Posters and Demos</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Ristoski</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosati</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Noia</surname>
          </string-name>
          , T.D.,
          <string-name>
            <surname>Leone</surname>
            ,
            <given-names>R.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paulheim</surname>
          </string-name>
          , H.:
          <article-title>RDF2vec: RDF graph embeddings and their applications</article-title>
          .
          <source>Semantic Web</source>
          <volume>10</volume>
          (
          <issue>4</issue>
          ),
          <fpage>721</fpage>
          –
          <lpage>752</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Ristoski</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de Vries</surname>
            ,
            <given-names>G.K.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paulheim</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>A collection of benchmark datasets for systematic evaluations of machine learning on the semantic web</article-title>
          .
          <source>In: ISWC</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Vandewiele</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Steenwinckel</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bonte</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weyns</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paulheim</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ristoski</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Turck</surname>
            ,
            <given-names>F.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ongenae</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Walk extraction strategies for node embeddings with RDF2vec in knowledge graphs</article-title>
          . CoRR abs/2009.04404 (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Yao</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mao</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luo</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>KG-BERT: BERT for knowledge graph completion</article-title>
          . arXiv preprint arXiv:1909.03193
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>