<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Simplified Benchmark for Non-ambiguous Explanations of Knowledge Graph Link Prediction using Relational Graph Convolutional Networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nicholas Halliwell</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fabien Gandon</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Freddy Lecue</string-name>
          <email>freddy.lecueg@inria.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CortAIx</institution>
          ,
          <addr-line>Thales, Montreal</addr-line>
          ,
          <country country="CA">Canada</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Inria, Université Côte d'Azur</institution>
          ,
          <addr-line>CNRS, I3S</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Relational Graph Convolutional Networks (RGCNs) identify relationships within a Knowledge Graph to learn real-valued embeddings for each node and edge. Recently, researchers have proposed explanation methods to interpret the predictions of these black-box models. However, comparing explanation methods is difficult without a common dataset and standard evaluation metrics. In this paper, we propose a method, including two datasets (Royalty-20k and Royalty-30k), to benchmark explanation methods on the task of explainable link prediction using Graph Neural Networks, and we report the results of state-of-the-art explanation methods for RGCNs.</p>
        <p>Knowledge Graphs often represent facts as triples in the form (subject, predicate, object), where the subject and object are entities linked by some predicate. Link prediction is used to discover new facts from existing ones. Graph embedding algorithms can be used for link prediction, learning a function that maps each subject, object, and predicate to a low-dimensional space. A Relational Graph Convolutional Network (RGCN) [4] combines Graph Convolutional Networks [3] with a scoring function such as DistMult [5] as an output layer, returning the probability that the input triple is a fact. Recently, methods have been developed to explain the predictions of Graph Neural Networks. GNNExplainer [6] explains the predictions of any Graph Neural Network by learning a mask over the adjacency matrix to identify the most informative subgraph. ExplaiNE [2] quantifies how the predicted probability of a link changes when weakening or removing a link with a neighboring node.</p>
      </abstract>
      <kwd-group>
        <kwd>link prediction</kwd>
        <kwd>Explainable AI</kwd>
        <kwd>knowledge graphs</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Need for Explaining Links Predicted by RGCNs</title>
      <p>
        The authors of ExplaiNE acknowledge the difficulty of measuring the
quality of the explanations generated, and the lack of available datasets with ground truth
explanations [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Their approach is evaluated on four datasets, none of which
include ground truth explanations. Additionally, computing ground truth
explanations for these datasets is non-trivial.
      </p>
      <p>On the task of link prediction on Knowledge Graphs, GNNExplainer remains
untested. Moreover, GNNExplainer and ExplaiNE do not use the same datasets in their
experiments. For explainable link prediction on Knowledge Graphs,
no standard datasets or scoring metrics exist to evaluate explanation methods.</p>
      <p>In this work, we propose a method, including two datasets (Royalty-20k
and Royalty-30k), to evaluate explanation methods for link prediction using
RGCNs. We propose the use of a scoring metric on these datasets, and perform
a benchmark comparing the explanations of ExplaiNE and GNNExplainer.</p>
    </sec>
    <sec id="sec-1b">
      <title>Building the Royalty Datasets with Explanations</title>
      <p>
        We define ground truth explanations as a single justification for an entailment.
We use an open-source semantic reasoner with rule-tracing capabilities [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] to
generate ground truth explanations for each rule, allowing each set of
explanations to be precisely controlled and selected. We rely on a set of rules
equivalent to strict Horn clauses, i.e. disjunctions of literals with exactly one
positive literal lc, all other literals li being negated: ¬l1 ∨ ... ∨ ¬ln ∨ lc. The
implication form of the clause can be read as an inference rule stating that, if all
li hold (the antecedent of the rule), then the consequent lc also holds, denoted
lc ← l1 ∧ ... ∧ ln. In our case, each literal is a binary predicate capturing a triple
pattern of the Knowledge Graph, with variables universally quantified over the
whole clause.
      </p>
      <p>For a given Knowledge Graph and a given set of rules, the semantic
reasoner performs a forward-chaining materialization of all inferences that can be
made. Each time the engine finds a mapping of triples T1, ..., Tn making the
antecedent of a rule true, it materializes the consequent triple Tc and records
the explanation in the form Tc ← (T1, ..., Tn), where Tc is a generated triple
and the triples T1, ..., Tn are its explanation.</p>
      <p>For example, under the hasGrandparent rule, if the two triples (Princess Marie
Anne, hasParent, Louis XIV) and (Louis XIV, hasParent, Anne of Austria) are
identified in a Knowledge Graph, the semantic reasoner generates a new triple
(Princess Marie Anne, hasGrandparent, Anne of Austria). We compute
explanation triples in a similar fashion for each logical rule defined in each dataset. In
this example, the explanation for why (Princess Marie Anne, hasGrandparent,
Anne of Austria) is a valid fact is that (Princess Marie Anne, hasParent,
Louis XIV) and (Louis XIV, hasParent, Anne of Austria) hold.</p>
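      <p>As an illustration, the forward-chaining materialization of the hasGrandparent rule can be sketched in a few lines of Python. This is a minimal, hypothetical sketch (the function name and the dictionary-based index are illustrative, not the reasoner of [1]): it indexes hasParent triples and records each inferred triple together with its explanation pair.</p>
      <preformat>
```python
# Minimal forward-chaining sketch for the hasGrandparent rule:
# materialize inferred triples and record each one with its
# explanation (T1, T2). Illustrative only; the paper uses a full
# semantic reasoner with rule tracing.

def materialize_grandparents(triples):
    """triples: iterable of (subject, predicate, object) tuples."""
    # Index parent links: child -> list of parents
    parents = {}
    for s, p, o in triples:
        if p == "hasParent":
            parents.setdefault(s, []).append(o)

    inferred = {}  # inferred triple Tc mapped to its explanation (T1, T2)
    for child, child_parents in parents.items():
        for parent in child_parents:
            for grandparent in parents.get(parent, []):
                tc = (child, "hasGrandparent", grandparent)
                t1 = (child, "hasParent", parent)
                t2 = (parent, "hasParent", grandparent)
                inferred[tc] = (t1, t2)
    return inferred

kg = [
    ("Princess Marie Anne", "hasParent", "Louis XIV"),
    ("Louis XIV", "hasParent", "Anne of Austria"),
]
facts = materialize_grandparents(kg)
# facts now holds (Princess Marie Anne, hasGrandparent, Anne of Austria)
# together with the two hasParent triples that explain it.
```
      </preformat>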
      <p>This approach to generating ground truth explanations can be applied to
many Knowledge Graphs and many sets of rules from different domains. It also
matches the output of the RGCN explanation methods, which provide explanations in
the form of a set of evidence triples.</p>
    </sec>
    <sec id="sec-2">
      <title>Dataset Statistics</title>
      <p>[Table 1: per-predicate statistics for the Royalty-20k dataset (hasSpouse, hasSuccessor, hasPredecessor, full data) and the Royalty-30k dataset (hasSpouse, hasGrandparent, hasParent, full data), with columns for # triples, # triples generated by rules, # unique entities, explanation cardinality, and predicate property.]</p>
      <p>There may be several ways to define these logical rules. For example,
hasSuccessor can be defined using its inverse relation hasPredecessor. In some cases,
hasPredecessor could be correlated with the hasParent relation and therefore
considered an explanation. Both explanations could be correct, thus the optimal
explanation is ambiguous. In this work, we build two datasets with non-ambiguous
explanations, where each logical rule is designed to give one and only one ground
truth explanation for each triple in the training and test sets. This prevents an
explanation method from having to arbitrarily select between alternative
explanations. Table 1 gives dataset statistics for each predicate. Each rule is detailed
below.</p>
      <p>Spouse Entity X is the spouse of Y if Y is the spouse of X; more formally,
hasSpouse(X, Y) ← hasSpouse(Y, X). This is a symmetric relationship: a
predicate p is said to be symmetric for some subject s and object o if and only if
(s, p, o) ← (o, p, s). There are 7,526 triples with the hasSpouse predicate in each
dataset, 3,763 of which are generated by rules. Note this rule is the same for
both datasets.</p>
      <p>Successor and Predecessor A successor, in the context of royalty, is one
who immediately follows the current holder of the throne. X is the
successor of Y if Y is the predecessor of X; equivalently, hasSuccessor(X, Y) ←
hasPredecessor(Y, X). Likewise, a predecessor is defined as one who held the
throne immediately before the current holder. X is the predecessor of Y if Y is
the successor of X; equivalently, hasPredecessor(X, Y) ← hasSuccessor(Y, X).
Indeed, hasSuccessor and hasPredecessor form an inverse relationship: a
predicate p1 is the inverse of p2 if and only if (s, p1, o) ← (o, p2, s). Therefore triples
with the hasSuccessor predicate are used to explain triples with the
hasPredecessor predicate and vice-versa. There are 6,277 triples with the hasSuccessor
predicate, 2,003 of which are generated by rules. Similarly, there are 6,277 triples
with the hasPredecessor predicate, 2,159 of which are generated by rules.</p>
      <p>Grandparent We define hasGrandparent using a chain property pattern: a
predicate p is a chain of predicates pi if and only if (s, p, o) ← (s, p1, s2) ∧
... ∧ (sn, pn, o). Y is the grandparent of X if Y is the parent of X's parent P;
equivalently, hasGrandparent(X, Y) ← hasParent(X, P) ∧ hasParent(P, Y).
There are 7,736 triples with the hasGrandparent predicate, all of which are generated
by rules.</p>
    </sec>
    <sec id="sec-6b">
      <title>Evaluating Explanations of RGCNs</title>
      <p>In this benchmark, ExplaiNE and GNNExplainer were asked to identify existing
triples in the graph as candidate explanations. We use the Jaccard similarity
to score the overlap between predicted and ground truth explanations. This
metric is appropriate, as it compares the triples in common between the predicted
and ground truth explanation sets without considering their order. Furthermore, this metric can be
applied to data from different domains without relying on further assumptions.
Benchmark results can be found in Table 2.</p>
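      <p>As a concrete sketch of the scoring (illustrative code, not the benchmark implementation), the Jaccard similarity between a predicted and a ground truth explanation set, each represented as a Python set of triples, can be computed as follows:</p>
      <preformat>
```python
# Jaccard similarity between a predicted explanation set and the
# ground truth explanation set, both given as sets of triples.
# The order of triples is ignored, as required by the scoring above.

def jaccard(predicted, ground_truth):
    predicted, ground_truth = set(predicted), set(ground_truth)
    if not predicted and not ground_truth:
        return 1.0  # two empty explanations agree perfectly
    inter = predicted.intersection(ground_truth)
    union = predicted.union(ground_truth)
    return len(inter) / len(union)

truth = {("X", "hasParent", "P"), ("P", "hasParent", "Y")}
pred = {("X", "hasParent", "P"), ("P", "hasSpouse", "Q")}
score = jaccard(pred, truth)  # 1 shared triple over 3 distinct triples
```
      </preformat>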
      <p>As an example of one of ExplaiNE's errors: for the triple (Albert III, Count
of Everstein, hasSpouse, Richeza of Poland) and its ground truth explanation
(Richeza of Poland, hasSpouse, Albert III, Count of Everstein), ExplaiNE
predicted a first-degree neighbor, (Richeza of Poland, hasSpouse, Alfonso VIII of
Leon and Castile), as its explanation. Note the incorrectly predicted triple
uses the hasSpouse predicate, but with the wrong spouse.</p>
      <p>Across both datasets, we observe that GNNExplainer had the highest Jaccard and
F1-Score performance on the hasSpouse predicate. GNNExplainer's best
performing predicates were hasSpouse, hasSuccessor, and hasPredecessor. On the
Royalty-30k dataset, we observe performance drops across all metrics on the
hasGrandparent predicate. We found GNNExplainer to have a recall of 1 for
each predicate, meaning the triples in the ground truth explanation are always
identified; however, many false positives are included in the predicted
explanation set. Indeed, a recall of 1 can be trivially achieved by including the entire
input graph in the predicted explanation set. This was not the case for
GNNExplainer's predicted explanations; however, the predicted explanation sets
were often too large (up to 20 triples on average) and had many false positives.</p>
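      <p>The recall observation above can be made concrete with a small, illustrative Python sketch of set-based precision, recall, and F1 (the helper name is ours): a predictor that returns every triple of the input graph always reaches recall 1, while its precision collapses.</p>
      <preformat>
```python
# Set-based precision, recall, and F1 for predicted explanation triples.
# Returning the whole input graph as the explanation trivially yields
# recall 1 but low precision, which inflates F1-based comparisons.

def prf1(predicted, ground_truth):
    predicted, ground_truth = set(predicted), set(ground_truth)
    tp = len(predicted.intersection(ground_truth))
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    f1 = 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

truth = {("A", "hasSpouse", "B")}
whole_graph = {("A", "hasSpouse", "B"), ("B", "hasSpouse", "C"),
               ("C", "hasParent", "D"), ("D", "hasParent", "E")}
p, r, f = prf1(whole_graph, truth)  # recall is 1.0, precision only 0.25
```
      </preformat>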
      <p>Table 2 also reports the results of ExplaiNE on the Royalty datasets.
Similar to GNNExplainer, we observe the best Jaccard and F1-Score performance
on the hasSpouse predicate. On the Royalty-30k dataset, we see lower relative
performance on the predicates where larger explanations need to be predicted,
such as the hasGrandparent rule.</p>
      <p>Overall, we find ExplaiNE outperformed GNNExplainer in terms of Jaccard
score for all rules across both datasets. While GNNExplainer outperforms
ExplaiNE in terms of F1 score on hasSpouse and hasGrandparent, along with the
Royalty-30k full dataset, this is likely due to GNNExplainer's high recall,
indicating that the F1 metric is not appropriate for this task.</p>
    </sec>
    <sec id="sec-7">
      <title>Benchmark Results</title>
      <p>[Table 2: Jaccard and F1 scores of GNNExplainer and ExplaiNE explaining an RGCN, per predicate (Spouse, Successor, Predecessor, Grandparent) and on the full set of each dataset.]</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>1. Corby, O., Gaignard, A., Faron Zucker, C., Montagnat, J.: KGRAM Versatile Inference and Query Engine for the Web of Linked Data. In: IEEE/WIC/ACM International Conference on Web Intelligence, pp. 1-8. Macao, China (Dec 2012)</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>2. Kang, B., Lijffijt, J., De Bie, T.: ExplaiNE: An approach for explaining network embedding-based link predictions. CoRR abs/1904.12694 (2019)</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>3. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: International Conference on Learning Representations, ICLR (2017)</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>4. Schlichtkrull, M.S., Kipf, T.N., Bloem, P., van den Berg, R., Titov, I., Welling, M.: Modeling relational data with graph convolutional networks. In: Gangemi, A., Navigli, R., Vidal, M., Hitzler, P., Troncy, R., Hollink, L., Tordai, A., Alam, M. (eds.) European Semantic Web Conference, ESWC (2018)</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>5. Yang, B., Yih, W., He, X., Gao, J., Deng, L.: Embedding entities and relations for learning and inference in knowledge bases. In: Bengio, Y., LeCun, Y. (eds.) 3rd International Conference on Learning Representations, ICLR (2015)</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>6. Ying, Z., Bourgeois, D., You, J., Zitnik, M., Leskovec, J.: GNNExplainer: Generating explanations for graph neural networks. In: Wallach, H.M., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E.B., Garnett, R. (eds.) Advances in Neural Information Processing Systems (2019)</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>