<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>FB15k-CVT: A Challenging Dataset for Knowledge Graph Embedding Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mouloud Iferroudjene</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Victor Charpenay</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Antoine Zimmermann</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Mines Saint-Etienne, Univ Clermont Auvergne, INP Clermont Auvergne, CNRS, UMR 6158 LIMOS</institution>
          ,
          <addr-line>Saint-Etienne</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Knowledge Graphs (KGs) are an essential component of neuro-symbolic AI. KG Embedding Models (KGEMs) are used to represent elements of a KG (its entities and relations) in a vector space, to enable efficient processing and reasoning over knowledge. Most KGEMs are evaluated against datasets derived from the Freebase KG: FB15k and FB15k-237. In this paper, we identify limitations in these datasets with respect to Compound Value Types (CVTs), which are nodes introduced in Freebase as a substitute for n-ary relations. In FB15k and FB15k-237, CVTs have been removed, thereby eliminating valuable information. To evaluate whether KGEMs can learn semantically accurate representations of entities and relations in Freebase, we introduce here a new dataset named FB15k-CVT, which reintroduces the deleted CVT nodes. In a preliminary evaluation, we assess the limitations of baseline KGEMs (TransE, DistMult) in the presence of CVTs. The evaluation suggests that KGEMs based on tensor decomposition are more promising than translational models but, most of all, it calls for further experiments with KGEMs that can answer conjunctive queries or that preserve logical entailment.</p>
      </abstract>
      <kwd-group>
        <kwd>Knowledge Graphs</kwd>
        <kwd>Neurosymbolic AI</kwd>
        <kwd>Knowledge Graph Embedding Models</kwd>
        <kwd>FB15K-237</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        (wrt the number of facts they are associated with) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. It was later noticed that FB15k was flawed
with respect to link prediction (LP), the main learning task for which KGEMs are trained. Many facts used
for testing could trivially be reconstructed from other facts exposed to models during training
(test leakage) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Among the 1,000+ relations of FB15k, 237 relations were kept in FB15k-237.
      </p>
      <p>Both FB15k and FB15k-237 were designed to be representative subsets of Freebase. Yet, they
hide significant details of Freebase that are rarely discussed in KGEM research. Most of the
relations in these datasets are actually not in Freebase: 144 relations, out of the 237 of FB15k-237,
are not atomic relations but two-hop paths that have been created by composing two Freebase
relations. In this paper, we motivate the (re)introduction of atomic relations in FB15k, to be
more representative of Freebase (and, in fact, of most large KGs), and describe the resulting
dataset.</p>
      <p>
        Our dataset, which we call FB15k-CVT, gets its name from a particular type of entity in
Freebase: Compound Value Types (CVTs). We introduce CVTs in Sec. 2 and show the limitations
that pertain to removing CVTs from Freebase. Fatemi et al. already observed that CVTs
represented most of Freebase [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. However, in this paper, we draw different conclusions as to
what KGEM research should focus on. While Fatemi et al. consider that CVTs should be hidden
behind n-ary relations, we argue that the numerous CVTs in Freebase are an ideal benchmark
for emerging KGEMs targeting conjunctive query evaluation (such as Query2Box [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]) and
logical entailment (such as ELEm [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]). We describe FB15k-CVT in Sec. 3. We give preliminary
results with baseline KGEMs for an extension of the link prediction task to two-hop paths in
Sec. 4, showing that FB15k-CVT is more challenging than FB15k-237.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Compound Value Types in Freebase</title>
      <p>
        Most facts in Freebase are to be interpreted as n-ary statements. Because a typical KG can only
capture binary statements, relations with higher arity are represented in Freebase via CVTs [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
A CVT is an entity whose existence depends upon (at least two) other entities. At a theoretical
level, it can be interpreted as the reification of a fact over n entities (n ≥ 2) and possibly literal
values. Examples of CVTs include: a position held by a person over a certain period; a monetary
amount; an award nomination relating one or more nominees, an award and a film. The first
two examples are given in Fig. 1. Two examples of award nomination CVTs are given in more
detail in Fig. 1. In Freebase, CVTs may only be defined with respect to non-CVT entities, which
are referred to as topics. In the examples of Fig. 1, ‘Dominique de Villepin’, ‘Minister of Foreign
Affairs’, ‘STM’ and ‘US dollar’ are all topics.
      </p>
      <p>Dealing with n-ary relations has been a long-standing issue on the Semantic Web. Other
KGs, including Wikidata [8] and YAGO [9], also include n-ary relations, mostly in the form
of spatial-temporal annotations. In fact, the three examples of CVTs we give here also hold
temporal annotations. The modeling choices of Freebase, YAGO and Wikidata differ slightly as to
how to represent these n-ary annotations, but all three KGs use a form of reification that follows
documented best practices [10]. CVTs are thus representative of how large KGs represent
annotated facts.</p>
      <p>[Figure 1: examples of CVTs in Freebase; the recoverable labels are “Dominique de Villepin” and “positions held”.]</p>
      <sec id="sec-2-1">
        <title>2.1. Removing Compound Value Types</title>
        <p>The original FB15k dataset was designed for link prediction experiments in a transductive
setting. That is, a fact ⟨h, r, t⟩ may only be predicted for head and tail entities that are known
beforehand. In large open KGs, it is reasonable to assume that the majority of entities have a
known identifier (for instance, a Wikipedia page title). It is however less obvious that identifiers
for CVTs should be known in advance, given the potentially large number of CVTs in a KG. To
train the first KGEMs on Freebase in a transductive setting, Bordes et al. chose to collapse CVTs
found in Freebase in order to learn embeddings for topics only. As a result, KGEMs are built
from a simplification of Freebase that considers only simple relations or paths (of length 2) that
bypass CVTs. The simplification process comes with a number of issues.</p>
        <p>First, not all paths have been preserved in FB15k. For example, consider the ternary relation
award_nomination that relates a nominee, a type of award and a creative work. Freebase
includes the statement that Fran Walsh was nominated for an Academy Award for Best Original
Song in The Return of the King, the last film in the Lord of the Rings trilogy (see Fig. 1 on the
right). There are three possible paths for this statement: (i) between Fran Walsh and the film,
(ii) between Fran Walsh and the award type and (iii) between the film and the award type. FB15k
only captures the first two paths. There is no relation between The Return of the King and the
Academy Award in the dataset. It is unclear what criteria were chosen to decide what paths are
preserved and what paths are deliberately lost in the simplification.</p>
        <p>
          To overcome this problem of information loss, Fatemi et al. [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] proposed to get rid of CVTs and
see FB15k as a hypergraph instead. They created the m-FB15k dataset<sup>2</sup>, which includes binary,
ternary and quaternary relations. m-FB15k does not include the award_nomination relation,
though, which is likely due to the incompleteness of Freebase with respect to nominations.
For instance, Freebase does not say for what film Peter Gabriel and Thomas Newman were
nominated. Turning the CVT into an award_nomination statement would generate an empty
assignment for nominated_for, whereas m-FB15k only includes fully instantiated statements. If KGEMs
learn representations of CVTs instead of n-ary statements, they may still be able to correctly
predict the missing nominated_for link pointing to the film WALL-E. A hypergraph representation of CVTs thus remains
limited.
        </p>
        <p>Some CVTs also hold information not present in m-FB15k because they relate to literal values
and not to topics. Literal values can be, e.g., temporal annotations (time instants or intervals).</p>
        <sec id="sec-2-1-1">
          <title><sup>2</sup> https://github.com/ElementAI/HypE</title>
          <p>For instance, award nominations are associated with a date in Freebase. A dedicated class of
KGEMs targets temporal KGs, seen as collections of quads. It is possible to generate a temporal
KG from CVTs found in FB15k-237. Such a dataset differs significantly from the classical
ICEWS benchmark for temporal KGEMs [11]. Quads inherit the limitations of arbitrary n-ary
statements in the presence of incomplete knowledge, though. Generic KGEMs that would learn
representations of CVTs may also capture temporal information. Temporal relations between
CVTs may be materialized in the dataset for KGEMs, in order to turn temporal reasoning tasks
into classical link prediction. For example, the nomination of Fran Walsh (in 2003) preceded that of
Peter Gabriel and Thomas Newman (2009). A temporal link prediction task would consist in
predicting events e′ such that ⟨e, precedes, e′⟩ holds, knowing that event e occurred.</p>
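          <p>As a minimal sketch of how such temporal relations could be materialized, the following Python snippet derives precedes triples from dated CVTs. The CVT identifiers, the function name materialize_precedes and the choice of comparing years only are illustrative assumptions; only the two nomination years come from the example above.</p>
          <preformat><![CDATA[
```python
from itertools import combinations


def materialize_precedes(cvt_years):
    """Return (earlier, "precedes", later) triples for CVTs with distinct years.

    cvt_years maps a (hypothetical) CVT identifier to the year of its event.
    """
    # Sort by year so that, in every pair, the first element is not the later one.
    ordered = sorted(cvt_years.items(), key=lambda item: item[1])
    triples = []
    for (cvt_a, year_a), (cvt_b, year_b) in combinations(ordered, 2):
        if year_a != year_b:  # events in the same year are left unordered
            triples.append((cvt_a, "precedes", cvt_b))
    return triples


# Hypothetical identifiers for the two nominations discussed in the text.
nominations = {
    "/CVT/gabriel_newman_nomination": 2009,
    "/CVT/walsh_nomination": 2003,
}
precedes_triples = materialize_precedes(nominations)
```
]]></preformat>
          <p>Materializing every ordered pair grows quadratically with the number of dated CVTs, so a real pipeline would probably restrict the pairs, for instance to nominations of the same award.</p>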
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Reintroducing Compound Value Types</title>
        <p>As already mentioned, the main reason why CVTs were not taken into account in FB15k is the
restriction of transductive link prediction to well-known entities (i.e. topics). If the point was
valid in 2013, it may not hold anymore, after ten years of KGEM research. Indeed, in recent
years, KGEMs have started being evaluated on other tasks beyond link prediction. In particular,
several KGEMs target conjunctive query evaluation and logical entailment, such that it is not
necessary anymore to maintain an exhaustive list of embeddings for all possible entities.</p>
        <p>The first KGEM to focus on conjunctive queries is the Graph Query Embedding (GQE)
model [12]. It was then followed by Query2Box [13], BetaE [14] and, more recently,
GammaE [15]. All these models can emulate link prediction by answering path queries that bypass
CVTs: they would learn against Freebase (without simplification) and only have embeddings
for topics and binary relations that are optimized for answering complex queries instead of
predicting single facts.</p>
        <p>
          Another class of KGEMs, designed to preserve logical consistency with respect to a logic (e.g. a description
logic), has recently emerged. They primarily target the ℰℒ++ logic, which is widely
used in Semantic Web ontologies. These include EL Embeddings (ELEm) [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], BoxEL [16] and
the more recent Box2EL [17], all based on the idea of representing classes of entities as convex
regions (balls or boxes) in the embedding space. In KGEMs that preserve the semantics of ℰℒ++, logical
axioms involving compositions of relations can be formulated a priori and guide training. As
for query evaluation, no embedding for CVTs would then be required.
        </p>
        <p>Despite their increased expressivity, these two classes of models are still evaluated on FB15k
and FB15k-237. Given that these datasets are not faithful to the original KG, the performance of
KGEMs on these datasets may not necessarily reflect their performance in real-world
applications. We thus introduce FB15k-CVT in the next section and compare it to other datasets often
used to evaluate KGEMs.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. The FB15k-CVT dataset</title>
      <p>The FB15k-CVT dataset is a KG that we created by expanding FB15k-237 with additional types
of entities and relations that were not present in it. In this section, we describe how we extracted
the triples from Freebase, filtered the relations, and split the dataset into train/validation/test
subsets. We also discuss the size of the dataset and how it compares to other datasets derived
from FB15k.</p>
      <sec id="sec-3-1">
        <title>3.1. Creation of the set of triples</title>
        <p>First, to create FB15k-CVT, we downloaded the large reified Freebase knowledge graph, which
contains millions of entities and relations. We took the latest version available, which was last
updated in August 2015.<sup>3</sup></p>
        <p>Entity Filtering. Next, we filtered the Freebase KG triples to extract only those involving
entities that appear in the FB15k-237 dataset (referred to as FB15k-237’s “topics”), whether
as subjects or objects. We further restricted the selection to triples whose entities are
identified by a Machine IDentifier (MID), consisting of /m/ followed by a base-32 unique identifier,
such as /m/026t6. This resulted in triples from the Freebase KG where the “topics” of
FB15k-237 are connected to new entities labeled as “CVT”. Consequently, the current set of triples
contains additional entities and a not yet filtered set of relations, which includes many new
triples. Among these are the one-hop relations already present in the FB15k-237 KG, such as
(/m/01lxd4, /music/genre/artists, /m/0f0y8).</p>
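        <p>The entity filtering step can be sketched as follows. This is an illustration under stated assumptions rather than the script used to build the dataset: the regular expression for MIDs, the function name filter_triples and the sample triples (apart from the /music/genre/artists triple quoted above) are hypothetical.</p>
        <preformat><![CDATA[
```python
import re

# Assumed MID pattern: "/m/" followed by lowercase alphanumerics or underscores.
MID_PATTERN = re.compile(r"^/m/[0-9a-z_]+$")


def filter_triples(triples, topics):
    """Keep triples between MIDs in which the subject or the object is a topic."""
    kept = []
    for head, relation, tail in triples:
        both_mids = bool(MID_PATTERN.match(head)) and bool(MID_PATTERN.match(tail))
        touches_topic = head in topics or tail in topics
        if both_mids and touches_topic:
            kept.append((head, relation, tail))
    return kept


topics = {"/m/01lxd4", "/m/0f0y8"}
freebase_sample = [
    ("/m/01lxd4", "/music/genre/artists", "/m/0f0y8"),   # topic to topic: kept
    ("/m/01lxd4", "/type/object/name", '"some label"'),  # literal object: dropped
    ("/m/0zzzz1", "/some/relation", "/m/0zzzz2"),        # no topic involved: dropped
]
filtered = filter_triples(freebase_sample, topics)
```
]]></preformat>
        <p>On the full Freebase dump, the same test would be applied while streaming the file line by line, since the graph does not fit comfortably in memory as a Python list.</p>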
        <p>Relation Filtering. Next, to determine which triples to retain, we compile a set of relation
filters that includes both the decomposition of two-hop relations and the one-hop relations
present in FB15k-237. This set contains a total of 302 relation types. Note that FB15k-237
consists of 237 relations, of which 93 are direct relations and the remaining 144 are composed
relations. After applying this filter, we obtain a smaller dataset, with only 14,468 topics,
which represents 99% of FB15k-237’s entities, and 292 types of relations, which
corresponds to 96% of the total types of relations found in FB15k-237.</p>
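        <p>A possible way to compile such a relation filter is sketched below. It assumes that composed relations in FB15k-237 are written as two atomic relation names joined by a dot, as in the award nomination example of this section; the function names decompose and build_relation_filter are hypothetical.</p>
        <preformat><![CDATA[
```python
def decompose(relation):
    """Split a composed FB15k-237 relation name into its atomic relations."""
    first, separator, second = relation.partition("./")
    if not separator:  # no "./" found: the relation is already atomic
        return [relation]
    return [first, "/" + second]


def build_relation_filter(fb15k237_relations):
    """Collect every atomic relation used directly or inside a composition."""
    atomic = set()
    for relation in fb15k237_relations:
        atomic.update(decompose(relation))
    return atomic


relations = [
    "/music/genre/artists",
    "/award/award_category/nominees./award/award_nomination/nominated_for",
]
relation_filter = build_relation_filter(relations)
```
]]></preformat>
        <p>Because the result is a set, atomic relations shared by several composed relations are counted once, which is consistent with a filter that is smaller than twice the number of composed relations plus the direct ones.</p>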
        <p>Introduction of CVTs. By filtering the extracted set of triples this way, we introduce
“CVT” entities back, which constitute the intermediate nodes between topics. CVTs
were mainly hidden within the composed relations in FB15k-237. For example, the entities and
relations in the following two triples: (i) (/m/04ljl_l, /award/award_category/nominees,
/CVT/1234); (ii) (/CVT/1234, /award/award_nomination/nominated_for, /m/03qcfvw)
originate from the following triple with a composed two-hop relation in the FB15k-237 KG: (/m/04ljl_l,
/award/award_category/nominees./award/award_nomination/nominated_for,
/m/03qcfvw).</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Train/validation/test split</title>
        <p>Next, we want to ensure that the train/validation/test split of FB15k-CVT gets the same
distribution of topics and relations as in FB15k-237. This will help us lay the groundwork for training and
evaluating KGEMs and carrying out experiments on the new dataset. First, for each subset, we keep
triples with direct relations. Then, for triples with composed relations, we look for intermediate
nodes of type CVT in the previously built set of triples, to split the relation into two new triples
that form a path. Each path consists of two triples that include 3 entities and 2 relations.
Only one of the entities is of type CVT. We denote them as follows: t₁, t₂ for the topics, cvt for
the intermediate node and r₁, r₂ for the relations. Depending on the direction of the relations,
we distinguish 3 types of paths in our dataset: (i) direct path: (t₁, r₁, cvt)/(cvt, r₂, t₂);
(ii) splitting path: (t₁, r₁, cvt)/(t₂, r₂, cvt); (iii) joining path: (cvt, r₁, t₁)/(cvt, r₂, t₂).
By following this filtering and splitting process, we ensure that the built dataset remains closely
aligned with the original FB15k-237 dataset. The resulting dataset has the characteristics given
in Table 1, compared with other reference datasets. It is available online<sup>4</sup>.</p>
        <sec id="sec-3-2-1">
          <title><sup>3</sup> http://commondatastorage.googleapis.com/freebase-public/rdf/freebase-rdf-latest.gz</title>
          <p>[Table 1: characteristics of FB15k-CVT compared with FB15k, FB15k-237, m-FB15k and YAGO15k.]</p>
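          <p>The three path types distinguished in this section can be told apart by looking at where the shared CVT sits in each triple. The sketch below assumes that CVT nodes carry a /CVT/ prefix, as in the example of Sec. 3.1; the helper names is_cvt and path_type are hypothetical.</p>
          <preformat><![CDATA[
```python
def is_cvt(entity):
    # Assumption: CVT nodes carry a "/CVT/" prefix, as in the example of Sec. 3.1.
    return entity.startswith("/CVT/")


def path_type(first, second):
    """Classify a pair of triples sharing a CVT as direct, splitting or joining."""
    head_1, _, tail_1 = first
    head_2, _, tail_2 = second
    if is_cvt(tail_1) and tail_1 == head_2:
        return "direct"      # (t1, r1, cvt)/(cvt, r2, t2)
    if is_cvt(tail_1) and tail_1 == tail_2:
        return "splitting"   # (t1, r1, cvt)/(t2, r2, cvt)
    if is_cvt(head_1) and head_1 == head_2:
        return "joining"     # (cvt, r1, t1)/(cvt, r2, t2)
    return None              # the triples do not share a CVT in one of these ways


nominees = ("/m/04ljl_l", "/award/award_category/nominees", "/CVT/1234")
nominated_for = ("/CVT/1234", "/award/award_nomination/nominated_for", "/m/03qcfvw")
example_type = path_type(nominees, nominated_for)
```
]]></preformat>
          <p>The award nomination example is a direct path: the first triple points to the CVT and the second one leaves it.</p>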
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments and preliminary results</title>
      <p>In this section, we present the details of the conducted experiments and the preliminary
results. We run our evaluation of KGEMs on two datasets: FB15k-237 and our FB15k-CVT
dataset. We focus on two widely used baseline models, TransE and DistMult, and analyze their
performance in the presence of CVTs in KGs. The latter is assessed on path prediction (PP)
tasks.</p>
      <sec id="sec-4-1">
        <title>4.1. From link prediction to path prediction</title>
        <p>Generally, researchers assess the performance of their KGEMs on a link prediction
(LP) task, which consists of predicting the presence or absence of a relation between two entities
in the knowledge graph. In practice, they first train their models using a subset of triples 𝒯<sub>train</sub>
from the knowledge graph dataset 𝒢, i.e., 𝒯<sub>train</sub> ⊂ 𝒢 ⊆ ℰ × ℛ × ℰ. The goal of this step
is to learn vector representations for the set of entities ℰ and relations ℛ. After training, a set
of unseen evaluation triples 𝒯<sub>eval</sub> is provided, with entities and relations known beforehand.
For each triple (h, r, t) ∈ 𝒯<sub>eval</sub>, they evaluate their models on two different tasks: (i) head
prediction (?, r, t) and (ii) tail prediction (h, r, ?). Technically, a relation r and a tail entity t (resp.
head entity h) are provided, and the model aims to correctly rank the ground-truth head h (resp. tail t),
such that (h, r, t) ∈ 𝒢. The rank is obtained after scoring each candidate (h′, r, t), ∀h′ ∈ ℰ (resp.
(h, r, t′), ∀t′ ∈ ℰ) with the same scoring function as in the training phase.</p>
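        <p>To make the ranking procedure concrete, the sketch below computes the rank of a ground-truth tail under the standard TransE dissimilarity. The embeddings are made-up toy values, the function names are hypothetical, and the rank is computed in the raw setting (no filtering of other known true triples), which is an assumption rather than the exact protocol used in the experiments.</p>
        <preformat><![CDATA[
```python
import math


def transe_distance(head, relation, tail):
    """TransE dissimilarity ||h + r - t|| (L2 norm); lower means more plausible."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def tail_rank(head_id, relation_id, true_tail_id, entities, relations):
    """Rank of the true tail among all candidate tails (1 is best)."""
    h, r = entities[head_id], relations[relation_id]
    scores = {e: transe_distance(h, r, vec) for e, vec in entities.items()}
    true_score = scores[true_tail_id]
    # Optimistic rank: count only candidates scoring strictly better than the truth.
    return 1 + sum(1 for s in scores.values() if s < true_score)


# Toy 2-dimensional embeddings (made-up values, for illustration only).
entities = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [3.0, 3.0]}
relations = {"r": [1.0, 0.1]}
rank_b = tail_rank("a", "r", "b", entities, relations)
rank_c = tail_rank("a", "r", "c", entities, relations)
```
]]></preformat>
        <p>Head prediction is symmetric: the tail and the relation are fixed and every entity is scored as a candidate head.</p>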
        <p>However, in our case, the LP task is not suited to evaluating how well KGEMs capture the complex
relations in FB15k-CVT, which involve two-hop paths through intermediate CVT entities to
represent n-ary relations. Therefore, we propose to evaluate KGEMs on a path prediction task,
where the goal is to predict a complete path between two topic entities in a knowledge graph.
Unlike in the LP task, a set of unseen evaluation quadruples, representing paths, is created: 𝒬<sub>eval</sub> ⊂
ℰ × ℛ × ℛ × ℰ. For each quadruple (t₁, r₁, r₂, t₂), we evaluate the models on four
possible PP tasks: (i) chain backward prediction, i.e., (?, r₁, cvt)/(cvt, r₂, t₂); (ii) chain forward
prediction, i.e., (t₁, r₁, cvt)/(cvt, r₂, ?); (iii) join prediction, i.e., (t₁, r₁, cvt)/(?, r₂, cvt);
and (iv) split prediction, i.e., (cvt, r₁, t₁)/(cvt, r₂, ?). In other words, we are given a topic
entity t₁ (resp. t₂) and two relations r₁ and r₂, and the model is evaluated on correctly ranking
the ground-truth t₂ (resp. t₁). This evaluation task aims at assessing the ability of KGEMs
to learn the embedding space of CVTs depending on the relations they are involved in. This
provides a more comprehensive and challenging benchmark for evaluating the effectiveness of
KGEMs on real-world KGs.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Experiments Settings</title>
        <p>
          Datasets. In the following experiments, we consider, for now, only the two datasets: FB15k-237
(for reproduction) and our newly built dataset FB15k-CVT (for evaluation). For FB15k-237, we
closely reproduced the results in the original papers of the models (TransE [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] and DistMult [18]),
assessed on LP tasks. On the other hand, for FB15k-CVT, we evaluated the same KGEMs on
path prediction (PP) tasks.
        </p>
        <p>
          Preprocessing of FB15k-CVT. Once the FB15k-CVT dataset has been split, we need
to look for flaws in it and remove them. First, we adapt the dataset to the transductive setting by
deleting from the validation and test sets the triples whose topics do not appear in
the training set. This resulted in deleting 4 triples from the validation set and 3 others from the test
set. Secondly, we follow the same strategy as in [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] to prevent inverse relation leakage. The
latter occurred between triples that contain at least one entity of type CVT, i.e., {(?e₁, ?r₁, ?e₂)<sub>train</sub> ∩
(?e₂, ?r₂, ?e₁)<sub>eval</sub>}, where r₁ ≠ r₂. Finally, we wrap up by deleting any duplicate triples
remaining between the evaluation sets (i.e., validation and test sets) and the training set. The goal of
this preprocessing step is to create a more challenging dataset that requires models to generalize
better to new relations between entities.
        </p>
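        <p>The three preprocessing steps can be sketched in a single pass over an evaluation set. This is a simplified illustration, not the script used for the dataset: the /CVT/ prefix test, the function name preprocess_eval and the toy triples are assumptions.</p>
        <preformat><![CDATA[
```python
def is_cvt(entity):
    # Assumption: CVT nodes carry a "/CVT/" prefix.
    return entity.startswith("/CVT/")


def preprocess_eval(train, evaluation):
    """Drop evaluation triples with unseen topics, inverse leaks or duplicates."""
    train_set = set(train)
    # Entities seen in training; unseen topics cannot be ranked transductively.
    seen = {entity for h, _, t in train for entity in (h, t)}
    # Ordered entity pairs linked in training, with the relations linking them.
    linked = {}
    for h, r, t in train:
        linked.setdefault((h, t), set()).add(r)

    kept = []
    for h, r, t in evaluation:
        unseen_topic = any(not is_cvt(e) and e not in seen for e in (h, t))
        duplicate = (h, r, t) in train_set
        involves_cvt = is_cvt(h) or is_cvt(t)
        # Inverse leakage: training holds (t, r2, h) with r2 different from r.
        inverse_leak = involves_cvt and any(r2 != r for r2 in linked.get((t, h), ()))
        if not (unseen_topic or duplicate or inverse_leak):
            kept.append((h, r, t))
    return kept


train = [("t1", "r1", "/CVT/1"), ("/CVT/1", "r2", "t2"), ("/CVT/2", "r_inv", "t3")]
evaluation = [
    ("t1", "r1", "/CVT/1"),   # duplicate of a training triple
    ("t3", "r3", "/CVT/2"),   # inverse of (/CVT/2, r_inv, t3)
    ("t9", "r1", "/CVT/3"),   # topic t9 never seen in training
    ("t2", "r1", "/CVT/4"),   # kept
]
cleaned = preprocess_eval(train, evaluation)
```
]]></preformat>
        <p>Only topics are required to appear in the training set here; CVTs may be unseen, in line with the evaluation setting described later in this section.</p>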
        <p>Creating paths (quadruples) for validation and test sets. As stated in Sec. 4.1, our
PP task requires a set of quadruples, denoted 𝒬<sub>eval</sub>, in order to evaluate the models.
Therefore, we generate all possible paths in our subsets and delete the intermediate CVT nodes.
As a result, we obtained |𝒬<sub>test</sub>| = 8,667 quadruples. It is also worth noting that, along this process,
we make sure that there are no paths with a reflexive relation, i.e., (t, r, cvt), (cvt, r, t), as this
would bias the results of the evaluation.</p>
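        <p>The construction of quadruples can be illustrated for direct (chain) paths as follows; splitting and joining paths could be handled analogously by pairing two incoming or two outgoing triples. The function name build_direct_quadruples, the /CVT/ prefix test and the /m/aaa triples are hypothetical; the first two triples are the example of Sec. 3.1.</p>
        <preformat><![CDATA[
```python
from collections import defaultdict


def is_cvt(entity):
    # Assumption: CVT nodes carry a "/CVT/" prefix.
    return entity.startswith("/CVT/")


def build_direct_quadruples(triples):
    """Join (t1, r1, cvt) with (cvt, r2, t2) and drop the CVT from the result."""
    incoming = defaultdict(list)   # cvt: [(topic, relation)] from (topic, relation, cvt)
    outgoing = defaultdict(list)   # cvt: [(relation, topic)] from (cvt, relation, topic)
    for head, relation, tail in triples:
        if is_cvt(tail) and not is_cvt(head):
            incoming[tail].append((head, relation))
        elif is_cvt(head) and not is_cvt(tail):
            outgoing[head].append((relation, tail))

    quadruples = []
    for cvt, sources in incoming.items():
        for topic_1, relation_1 in sources:
            for relation_2, topic_2 in outgoing.get(cvt, []):
                reflexive = topic_1 == topic_2 and relation_1 == relation_2
                if not reflexive:  # skip paths of the form (t, r, cvt), (cvt, r, t)
                    quadruples.append((topic_1, relation_1, relation_2, topic_2))
    return quadruples


triples = [
    ("/m/04ljl_l", "/award/award_category/nominees", "/CVT/1234"),
    ("/CVT/1234", "/award/award_nomination/nominated_for", "/m/03qcfvw"),
    ("/m/aaa", "/r", "/CVT/9"),    # hypothetical reflexive path, first half
    ("/CVT/9", "/r", "/m/aaa"),    # hypothetical reflexive path, second half
]
quads = build_direct_quadruples(triples)
```
]]></preformat>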
        <p>
          Model training. We choose two KGE models, TransE [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] and DistMult [18], which
belong respectively to two different sub-categories of tensor-based models, translational and tensor
decomposition models, and train them on the training split of the chosen datasets. For this, we used the best
configurations we found for FB15k-237, as provided in the PyKEEN framework<sup>5</sup>, and
reused them for FB15k-CVT.
        </p>
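        <p>Model Evaluation. In our evaluation setting for the PP task, we restrict the models to
use only the learned embeddings of topics (in ℰ) and relations (ℛ), since we consider that not all CVTs are seen by
the model in the training set. To this end, in this phase, we slightly modified the scoring functions
of the evaluated models, i.e., TransE and DistMult, to suit the evaluation task (see eq. 1 and
eq. 2). Note that the function sign(r) returns the direction of the relation r, in order to accurately
compute the projection of the pair (t, r) onto the region of the latent space that represents the
CVT embeddings.</p>
        <p>
          <disp-formula id="eq-1"><label>(1)</label><tex-math>f_{\mathrm{TransE}}(t_1, r_1, r_2, t_2) = \lVert (\vec{t}_1 + \mathrm{sign}(r_1)\cdot\vec{r}_1) - (\vec{t}_2 - \mathrm{sign}(r_2)\cdot\vec{r}_2) \rVert</tex-math></disp-formula>
          <disp-formula id="eq-2"><label>(2)</label><tex-math>f_{\mathrm{DistMult}}(t_1, r_1, r_2, t_2) = \lVert (\vec{t}_1 \odot \mathrm{sign}(r_1)\cdot\vec{r}_1) - (\vec{t}_2 \odot \mathrm{sign}(r_2)\cdot\vec{r}_2) \rVert</tex-math></disp-formula>
        </p>
        <sec id="sec-4-2-1">
          <title><sup>5</sup> PyKEEN (Python KnowlEdge EmbeddiNgs) https://pykeen.github.io/</title>
          <p>[Table 2: results of TransE and DistMult on FB15k-237 (LP, both directions) and on FB15k-CVT (PP, all subtasks).]</p>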
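          <p>The two modified scoring functions of eq. 1 and eq. 2 can be sketched in plain Python as follows. The sketch assumes an L2 norm and represents the direction of each relation by an explicit argument equal to +1 or -1; both choices, as well as the function names and the toy vectors, are assumptions made for illustration.</p>
          <preformat><![CDATA[
```python
import math


def transe_path_score(t1, r1, r2, t2, sign1, sign2):
    """Eq. 1: distance between the CVT position estimated from each topic."""
    left = [a + sign1 * b for a, b in zip(t1, r1)]    # CVT estimated from t1
    right = [a - sign2 * b for a, b in zip(t2, r2)]   # CVT estimated from t2
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(left, right)))


def distmult_path_score(t1, r1, r2, t2, sign1, sign2):
    """Eq. 2: same comparison with element-wise products instead of sums."""
    left = [a * sign1 * b for a, b in zip(t1, r1)]
    right = [a * sign2 * b for a, b in zip(t2, r2)]
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(left, right)))


# Toy direct path (t1, r1, cvt)/(cvt, r2, t2): both relations follow the path.
t1, r1, r2, t2 = [0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
aligned = transe_path_score(t1, r1, r2, t2, sign1=1, sign2=1)
flipped = transe_path_score(t1, r1, r2, t2, sign1=1, sign2=-1)
```
]]></preformat>
          <p>With both signs equal to +1, the toy vectors place the CVT at the same point from either side, so the score is zero; flipping the direction of the second relation moves the two estimates apart. Under this formulation, lower scores indicate more plausible paths for both models.</p>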
          <p>Evaluation Metrics. Since we evaluate the models on link-like prediction tasks, two
types of metrics from the literature can quantify the models’
performance: ranking and classification metrics. For our experiments, among the various available metrics, we focus on rank-based
ones to measure different aspects of ranking performance. Precisely, we choose: (1) the Mean Rank
(MR) metric, a base metric that summarizes the central tendency of ranks by computing
the arithmetic mean over all individual ranks, and (2) the Hits at K metric, denoted Hits@K, which
measures the fraction of times the correct result is among the top-K ranked candidates. We set
K ∈ {1, 5, 10} for our experiments.</p>
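          <p>Both metrics follow directly from the list of individual ranks. The short sketch below uses made-up ranks; the function names mean_rank and hits_at_k are hypothetical.</p>
          <preformat><![CDATA[
```python
def mean_rank(ranks):
    """Arithmetic mean of the individual ranks (lower is better)."""
    return sum(ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of predictions whose correct answer is ranked in the top k."""
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


ranks = [1, 3, 7, 12, 40]  # made-up ranks of five ground-truth answers
mr = mean_rank(ranks)
hits = {k: hits_at_k(ranks, k) for k in (1, 5, 10)}
```
]]></preformat>
          <p>MR is sensitive to a few very poor ranks (the rank of 40 above dominates the mean), whereas Hits@K only reflects how many answers reach the top of the list, which is why the two metrics can move in different directions.</p>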
        </sec>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Experiments Results</title>
        <p>Table 2 summarizes the performance of the TransE and DistMult models on FB15k-237 and
FB15k-CVT, respectively, for the link prediction (LP) and path prediction (PP) tasks. We report the
average results for the models’ performance on sub-prediction tasks, including head and tail
prediction for LP, as well as the four subtasks mentioned in Sec. 4.1 for PP. In the first
line, both models exhibit high Hits@K relative to state-of-the-art link prediction KGEMs.
In the second line, for instance, one can observe that TransE gets a high mean rank (MR) and a low
Hits@K. This indicates that the model is not well suited for path prediction in the presence of
CVTs. Detailed results show that it is especially the case for the “chain forward” and “chain
backward” predictions (see Table 3 in the Appendix for more details). Results for DistMult are
slightly more contrasted. We observe the same pattern but the decrease in performance is much
lower. MR even significantly improves for DistMult.</p>
        <p>However, the evaluation confirms that both models are challenged by the path prediction task, as
they could not learn the underlying vector representation of CVTs. Furthermore,
the evaluation calls for further experiments with KGEMs on our proposed FB15k-CVT dataset,
in order to explore how existing models can be improved or new ones developed.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>The FB15k-CVT dataset that we have described was generated from FB15k and its successor,
FB15k-237. Its main distinctive characteristic is that it is an exact subset of Freebase, which FB15k
and FB15k-237 aren’t. We argued in the paper that the reintroduction of CVTs in the dataset
ofers new challenges for emerging KGEMs, including Query2Box, ELEm and their competitors.
Our preliminary evaluation of TransE and DistMult shows that the overall performances of the
two models significantly decrease in the presence of CVTs but the change benefits DistMult,
suggesting that tensor decomposition is better suited for handling CVTs than translational
models.
[8] D. Vrandečić, M. Krötzsch, Wikidata: A free collaborative knowledgebase, Commun. ACM
57 (2014) 78–85. URL: https://doi.org/10.1145/2629489. doi:10.1145/2629489.
[9] J. Hofart, F. M. Suchanek, K. Berberich, G. Weikum, Yago2: A spatially and temporally
enhanced knowledge base from wikipedia, Artificial Intelligence 194 (2013) 28–61. URL:
https://www.sciencedirect.com/science/article/pii/S0004370212000719. doi:https://doi.
org/10.1016/j.artint.2012.06.001, artificial Intelligence, Wikipedia and
SemiStructured Resources.
[10] N. Fridman Noy, A. L. Rector, Defining N-ary Relations on the Semantic Web,
W3C Note, World Wide Web Consortium, 2006. URL: https://www.w3.org/TR/2006/
NOTE-swbp-n-aryRelations-20060412/.
[11] A. García-Durán, S. Dumančić, M. Niepert, Learning Sequence Encoders for
Temporal Knowledge Graph Completion, 2018. URL: http://arxiv.org/abs/1809.03202,
arXiv:1809.03202 [cs].
[12] W. L. Hamilton, P. Bajaj, M. Zitnik, D. Jurafsky, J. Leskovec, Embedding Logical Queries
on Knowledge Graphs, in: S. Bengio, H. M. Wallach, H. Larochelle, K. Grauman, N.
CesaBianchi, R. Garnett (Eds.), Advances in Neural Information Processing Systems 31: Annual
Conference on Neural Information Processing Systems 2018, NeurIPS 2018, December 3-8,
2018, Montréal, Canada, 2018, pp. 2030–2041. URL: https://proceedings.neurips.cc/paper/
2018/hash/ef50c335cca9f340bde656363ebd02fd-Abstract.html.
[13] H. Ren, W. Hu, J. Leskovec, Query2box: Reasoning over Knowledge Graphs in Vector</p>
      <p>Space using Box Embeddings, 2020. ArXiv:2002.05969 [cs, stat].
[14] H. Ren, J. Leskovec, Beta Embeddings for Multi-Hop Logical Reasoning in Knowledge
Graphs, in: Advances in Neural Information Processing Systems, volume 33, Curran
Associates, Inc., 2020, pp. 19716–19726. URL: https://proceedings.neurips.cc/paper/2020/
hash/e43739bba7cdb577e9e3e4e42447f5a5-Abstract.html.
[15] D. Yang, P. Qing, Y. Li, H. Lu, X. Lin, GammaE: Gamma Embeddings for Logical Queries
on Knowledge Graphs, in: Y. Goldberg, Z. Kozareva, Y. Zhang (Eds.), Proceedings of the
2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022,
Abu Dhabi, United Arab Emirates, December 7-11, 2022, Association for Computational
Linguistics, 2022, pp. 745–760. URL: https://aclanthology.org/2022.emnlp-main.47.
[16] B. Xiong, N. Potyka, T. Tran, M. Nayyeri, S. Staab, Faithful Embeddings for ℰℒ++
Knowledge Bases, in: U. Sattler, A. Hogan, C. M. Keet, V. Presutti, J. P. A. Almeida,
H. Takeda, P. Monnin, G. Pirrò, C. d’Amato (Eds.), The Semantic Web - ISWC 2022 - 21st
International Semantic Web Conference, Virtual Event, October 23-27, 2022,
Proceedings, volume 13489 of Lecture Notes in Computer Science, Springer, 2022, pp. 22–38. URL:
https://doi.org/10.1007/978-3-031-19433-7_2.
[17] M. Jackermeier, J. Chen, I. Horrocks, Box$^2$EL: Concept and Role Box Embeddings for
the Description Logic EL++, 2023. URL: http://arxiv.org/abs/2301.11118, arXiv:2301.11118
[cs].
[18] B. Yang, W.-t. Yih, X. He, J. Gao, L. Deng, Embedding Entities and Relations for Learning
and Inference in Knowledge Bases, in: Y. Bengio, Y. LeCun (Eds.), 3rd International
Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015,
Conference Track Proceedings, 2015. URL: http://arxiv.org/abs/1412.6575.
[19] R. Abboud, Learning and inference over relational data, PhD Thesis,
University of Oxford, Oxford, 2022. URL: https://ora.ox.ac.uk/objects/uuid:
da7744ad-efd-4fc9-b7ab-a00b03a86a53.
[20] P. Wang, J. Chen, L. Su, Z. Wang, N-ary relation prediction based on knowledge graphs
with important entity detection, Expert Systems with Applications 221 (2023) 119755. URL:
https://linkinghub.elsevier.com/retrieve/pii/S0957417423002567. doi:10.1016/j.eswa.
2023.119755.
[21] J. Leblay, M. W. Chekol, Deriving Validity Time in Knowledge Graph, in: Companion of
the The Web Conference 2018 on The Web Conference 2018 - WWW ’18, ACM Press, Lyon,
France, 2018, pp. 1771–1776. URL: http://dl.acm.org/citation.cfm?doid=3184558.3191639.
doi:10.1145/3184558.3191639.
[22] T. Lacroix, G. Obozinski, N. Usunier, Tensor Decompositions for temporal knowledge base
completion, 2020. URL: http://arxiv.org/abs/2004.04926, arXiv:2004.04926 [cs, stat].
[23] J. Messner, R. Abboud, I. I. Ceylan, Temporal Knowledge Graph Completion Using Box
Embeddings, Proceedings of the AAAI Conference on Artificial Intelligence 36 (2022)
7779–7787. URL: https://ojs.aaai.org/index.php/AAAI/article/view/20746. doi:10.1609/
aaai.v36i7.20746.
[24] M. Ali, M. Berrendorf, C. T. Hoyt, L. Vermue, M. Galkin, S. Sharifzadeh, A. Fischer, V. Tresp,
J. Lehmann, Bringing Light Into the Dark: A Large-scale Evaluation of Knowledge Graph
Embedding Models under a Unified Framework, IEEE Transactions on Pattern Analysis
and Machine Intelligence (2021) 1–1. URL: https://ieeexplore.ieee.org/document/9601281/.
doi:10.1109/TPAMI.2021.3124805.
[25] Z. Sun, Z.-H. Deng, J.-Y. Nie, J. Tang, RotatE: Knowledge Graph Embedding by Relational
Rotation in Complex Space, 2019. URL: http://arxiv.org/abs/1902.10197, arXiv:1902.10197
[cs, stat].
[26] S. Guo, Q. Wang, L. Wang, B. Wang, L. Guo, Jointly Embedding Knowledge Graphs and
Logical Rules, in: Proceedings of the 2016 Conference on Empirical Methods in Natural
Language Processing, Association for Computational Linguistics, Austin, Texas, 2016, pp.
192–202. URL: http://aclweb.org/anthology/D16-1019. doi:10.18653/v1/D16-1019.
[27] W. Zhang, B. Paudel, L. Wang, J. Chen, H. Zhu, W. Zhang, A. Bernstein, H. Chen, Iteratively
Learning Embeddings and Rules for Knowledge Graph Reasoning, in: The World Wide
Web Conference, ACM, San Francisco, CA, USA, 2019, pp. 2366–2377. URL: https://dl.acm.org/doi/10.1145/3308558.3313612. doi:10.1145/3308558.3313612.
[28] J. Lajus, L. Galárraga, F. Suchanek, Fast and Exact Rule Mining with AMIE 3, in:
A. Harth, S. Kirrane, A.-C. Ngonga Ngomo, H. Paulheim, A. Rula, A. L. Gentile, P. Haase,
M. Cochez (Eds.), The Semantic Web, volume 12123, Springer International
Publishing, Cham, 2020, pp. 36–52. URL: http://link.springer.com/10.1007/978-3-030-49461-2_3.
doi:10.1007/978-3-030-49461-2_3.
[29] C. Meilicke, M. W. Chekol, D. Ruffinelli, H. Stuckenschmidt, Anytime Bottom-Up Rule
Learning for Knowledge Graph Completion, in: Proceedings of the Twenty-Eighth
International Joint Conference on Artificial Intelligence, International Joint
Conferences on Artificial Intelligence Organization, Macao, China, 2019, pp. 3137–3143. URL:
https://www.ijcai.org/proceedings/2019/435. doi:10.24963/ijcai.2019/435.
[30] S. Ott, C. Meilicke, M. Samwald, SAFRAN: An interpretable, rule-based link prediction
method outperforming embedding models, 2021. URL: http://arxiv.org/abs/2109.08002,
arXiv:2109.08002 [cs].</p>
      <p>[Table: datasets FB15k-237 and FB15k-CVT; tasks Head, Tail, Both, Chain forward, Chain backward, Join, Split, All.]</p>
    </sec>
    <sec id="sec-6">
      <title>A. Extended results of the experiment</title>
      <p>We provide detailed results in Table 3.</p>
    </sec>
    <sec id="sec-7">
      <title>B. Related Work</title>
      <p>As argued in Sec. 2, our dataset brings new challenges to KGEMs that intend to capture either
n-ary statements or rules. We now review existing approaches for these two cases and the
datasets on which they were evaluated.</p>
      <p>
        HypE was the first model to target n-ary statements via hypergraph embeddings [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. It was
then followed by several models, including extensions of TransE [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and DistMult [18], BoxE [19,
p. 67] and more recent approaches based on graph neural networks (such as Att-ImpGCN [20]).
These models are evaluated against -FB15k, which is derived from FB15k and not FB15k-237.
The separation between training, validation and testing in FB15k is known to lead to data
leakage, hence the creation of FB15k-237 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The problem, which is due to inverse relations not
being taken into account, persists in the n-ary variant. The results of some models, including
SimplE, are artificially high. HypE is derived from SimplE and should be reevaluated against
our dataset. The same applies to similar methods based on tensor decomposition.
      </p>
      <p>As already mentioned, temporal relations are a special case of n-ary relations. Temporally
aware TransE and DistMult (referred to as TA-TransE and TA-DistMult) are among the early
KGEMs designed for temporal relations [11]. They were tested on YAGO15k<sup>6</sup>, a temporal KG
derived from FB15k by aligning entities with YAGO and retrieving YAGO facts that included
‘occurs since’ and ‘occurs until’ annotations. TA-TransE and TA-DistMult were also evaluated on
a much larger dataset derived from Wikidata. This dataset was originally constructed to evaluate
TTransE, a temporal extension of TransE [21]. Temporal annotations in the Wikidata dataset
(and in most of the YAGO15k dataset) have only a coarse granularity: statements are annotated
with years. A larger temporal KG, also derived from Wikidata, was used to evaluate temporal
extensions of ComplEx (TComplEx and TNTComplEx) [22]. The authors of these extensions
considered YAGO15k too small for temporal link prediction. More recent temporally aware
KGEMs, such as BoxTE [23], tend to favor other benchmarks over YAGO15k.</p>
      <p>At the other end of the spectrum, some KGEMs focus on rule induction. For instance, RotatE,
one of the best-performing KGEMs on FB15k-237<sup>7</sup>, was in fact motivated by the ability to
capture composition rules, as well as symmetry, antisymmetry and inversion [25]. Before that,
the Knowledge and Logic Embedding method (KALE) integrated rules into TransE using fuzzy
logic expressions [26]. For evaluation, 47 rules were defined for a subset of FB15k. These rules
have a restricted form, though: they are limited to subsumption and (basic) composition. KALE
assumes that rules are provided. In contrast, the IterE model is designed to learn rules jointly with
entity and relation representations [27]. Rules correspond to linear algebraic expressions in the
embedding space, such that the plausibility of a rule can be calculated as a distance between
matrices. IterE can learn the same rules as those targeted by RotatE, but it cannot learn, e.g.,
subsumption.</p>
      <p>KALE, IterE and other rule learning models generally focus on restricted rule forms, so that
the space of possible inductions is of reasonable size. In fact, if one specifically looks for Horn
rules, exploration may be performed directly in the syntactic space: rules may be mined without
leveraging any latent embedding space. This is the approach taken, e.g., by the latest Association
Rule Mining under Incomplete Evidence (AMIE 3) [28] and Anytime Bottom-up Rule Learning
(AnyBURL) [29] algorithms. On link prediction, AnyBURL tends to achieve results comparable
to those of standard KGEMs over FB15k-237 [29, 30]. It may in fact outperform them on our dataset,
given the high support of rules involving CVT values.</p>
      <p><sup>6</sup> Training, validation, and test splits are available at https://github.com/mniepert/mmkb/tree/master/TemporalKGs/yago15k</p>
      <p><sup>7</sup> As reported in a large-scale experiment by Ali et al. [24].</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Hogan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Blomqvist</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cochez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>d'Amato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>de Melo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gutierrez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kirrane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. E.</given-names>
            <surname>Labra Gayo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Navigli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Neumaier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-C.</given-names>
            <surname>Ngonga Ngomo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Polleres</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Rashid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rula</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Schmelzeisen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sequeda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Staab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zimmermann</surname>
          </string-name>
          ,
          <article-title>Knowledge graphs</article-title>
          ,
          <source>ACM Comput. Surv</source>
          .
          <volume>54</volume>
          (
          <year>2021</year>
          ). URL: https://doi.org/10.1145/3447772. doi:10.1145/3447772.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bordes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Usunier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>García-Durán</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Weston</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Yakhnenko</surname>
          </string-name>
          ,
          <article-title>Translating Embeddings for Modeling Multi-relational Data</article-title>
          , in:
          <string-name>
            <given-names>C. J. C.</given-names>
            <surname>Burges</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bottou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ghahramani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. Q.</given-names>
            <surname>Weinberger</surname>
          </string-name>
          (Eds.),
          <source>Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013</source>
          . Proceedings of a meeting held December 5-8, 2013, Lake Tahoe, Nevada, United States,
          <year>2013</year>
          , pp.
          <fpage>2787</fpage>
          -
          <lpage>2795</lpage>
          . URL: https://proceedings.neurips.cc/paper/2013/hash/1cecc7a77928ca8133fa24680a88d2f9-Abstract.html.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>Observed versus latent features for knowledge base and text inference</article-title>
          , in:
          <source>Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality</source>
          ,
          <publisher-name>Association for Computational Linguistics</publisher-name>
          , Beijing, China,
          <year>2015</year>
          , pp.
          <fpage>57</fpage>
          -
          <lpage>66</lpage>
          . URL: https://aclanthology.org/W15-4007. doi:10.18653/v1/W15-4007.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>B.</given-names>
            <surname>Fatemi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Taslakian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Vazquez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Poole</surname>
          </string-name>
          ,
          <article-title>Knowledge Hypergraphs: Prediction Beyond Binary Relations</article-title>
          ,
          <year>2020</year>
          . URL: http://arxiv.org/abs/1906.00137, arXiv:1906.00137 [cs, stat].
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>H.</given-names>
            <surname>Ren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Leskovec</surname>
          </string-name>
          ,
          <article-title>Query2box: Reasoning over Knowledge Graphs in Vector Space using Box Embeddings</article-title>
          ,
          <year>2020</year>
          . URL: http://arxiv.org/abs/2002.05969, arXiv:2002.05969 [cs, stat].
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kulmanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Liu-Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hoehndorf</surname>
          </string-name>
          ,
          <article-title>EL Embeddings: Geometric Construction of Models for the Description Logic EL++</article-title>
          , in: S. Kraus (Ed.),
          <source>Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019</source>
          , Macao, China, August 10-16, 2019, ijcai.org,
          <year>2019</year>
          , pp.
          <fpage>6103</fpage>
          -
          <lpage>6109</lpage>
          . URL: https://doi.org/10.24963/ijcai.2019/845.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>T.</given-names>
            <surname>Pellissier Tanon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Vrandečić</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schaffert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Steiner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Pintscher</surname>
          </string-name>
          ,
          <article-title>From Freebase to Wikidata: The great migration</article-title>
          , in:
          <source>Proceedings of the 25th International Conference on World Wide Web, WWW '16</source>
          ,
          <publisher-name>International World Wide Web Conferences Steering Committee</publisher-name>
          , Republic and Canton of Geneva, CHE,
          <year>2016</year>
          , pp.
          <fpage>1419</fpage>
          -
          <lpage>1428</lpage>
          . URL: https://doi.org/10.1145/2872427.2874809. doi:10.1145/2872427.2874809.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>