<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>DeepLENS: Deep Learning for Entity Summarization</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Qingxia Liu</string-name>
          <email>qxliu2013@smail.nju.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gong Cheng</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yuzhong Qu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>National Key Laboratory for Novel Software Technology, Nanjing University</institution>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Entity summarization has been a prominent task over knowledge graphs. While existing methods are mainly unsupervised, we present DeepLENS, a simple yet effective deep learning model where we exploit textual semantics for encoding triples and we score each candidate triple based on its interdependence on other triples. DeepLENS significantly outperformed existing methods on a public benchmark.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Entity summarization is the task of computing a compact summary for an
entity by selecting an optimal size-constrained subset of entity-property-value
triples from a knowledge graph such as an RDF graph [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. It has found a
wide variety of applications, for example, to generate a compact entity card
from Google's Knowledge Graph where an entity may be described in dozens
or hundreds of triples. Generating entity summaries for general purposes has
attracted much research attention, but existing methods are mainly
unsupervised [
        <xref ref-type="bibr" rid="ref10 ref11 ref13 ref2 ref3 ref4 ref5 ref6 ref9">2,9,3,4,13,10,6,5,11</xref>
        ]. One research question that naturally arises is whether
deep learning can solve this task much better.
      </p>
      <p>
        To the best of our knowledge, ESA [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] is the only supervised method in the
literature for this task. ESA encodes triples using graph embedding (TransE)
and employs a BiLSTM with a supervised attention mechanism. Although it
outperformed unsupervised methods, the improvement reported in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] was rather
marginal, around +7% compared with unsupervised FACES-E [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] on the ESBM
benchmark [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. This inspired us to explore more effective deep learning models for
the task of general-purpose entity summarization.
      </p>
      <p>In this short paper, we present DeepLENS,<sup>1</sup> a novel Deep Learning based
approach to ENtity Summarization. DeepLENS uses a simple yet effective model
which addresses the following two limitations of ESA, and thus achieved
significantly better results in the experiments.</p>
      <list list-type="order">
        <list-item>
          <p>Different from ESA, which encodes a triple using graph embedding, we use
word embedding because we consider textual semantics more useful than
graph structure for the entity summarization task.</p>
        </list-item>
        <list-item>
          <p>Whereas ESA encodes a set of triples as a sequence and its performance is
sensitive to the chosen order, our aggregation-based representation satisfies
permutation invariance and is hence more suitable for entity summarization.</p>
        </list-item>
      </list>
      <p><sup>1</sup> https://github.com/nju-websoft/DeepLENS</p>
      <p>Copyright © 2020 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>In the remainder of the paper, Section 2 details DeepLENS, Section 3 presents
experimental results, and Section 4 concludes the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>Approach</title>
      <p>
        Problem Statement. An RDF graph T is a set of triples. The description of
entity e in T, denoted by Desc(e) ⊆ T, comprises the triples where e is the subject
or object. Each triple t ∈ Desc(e) describes a property prop(t), which is the
predicate of t, and gives a value val(t), which is the object or subject of t other
than e. For a size constraint k, a summary of e is a subset of triples S ⊆ Desc(e)
with |S| ≤ k. We aim to generate an optimal summary for general purposes.
      </p>
      <p>
        Overview of DeepLENS. Our approach DeepLENS generates an optimal
summary by selecting the k most salient triples. As a supervised approach, it learns
salience from labeled entity summaries. However, two issues remain unsolved.
First, a knowledge graph such as an RDF graph is a mixture of graph structure
and textual content. The effectiveness of a learning-based approach to entity
summarization relies on a proper representation of entity descriptions of such
mixed nature. Second, the salience of a triple is not absolute but dependent on
the context, i.e., the set of other triples in the entity description. It is
essential to represent their interdependence. DeepLENS addresses these issues with the
scoring model presented in Fig. 1. It has three modules, which we detail
below: triple encoding, entity description encoding, and triple scoring. Finally,
the model scores each candidate triple t ∈ Desc(e) in the context of Desc(e).
      </p>
      <p>
        Triple Encoding. For entity e, a triple t ∈ Desc(e) provides a property-value
pair ⟨prop(t), val(t)⟩ of e. Previous research [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] leverages graph embedding to
encode the structural features of prop(t) and val(t). By contrast, for the task of
entity summarization we consider textual semantics more important than graph
structure, and we solely exploit textual semantics for encoding t.
      </p>
      <p>Specifically, for an RDF resource r, we obtain its textual form as follows. For an
IRI or a blank node, we retrieve its rdfs:label if it is available; otherwise we
use its local name. For a literal, we take its lexical form. We represent
each word in the textual form by a pre-trained word embedding vector, and we
average these vectors over all the words to represent r, denoted by Embedding(r).
For a triple t ∈ Desc(e), we generate and concatenate such vector representations
for prop(t) and val(t) to form <bold>t</bold>, the initial representation of t. Then <bold>t</bold> is fed
into a multi-layer perceptron (MLP) to generate <bold>h</bold>, the final representation of t:
        <disp-formula id="eq1">
          <label>(1)</label>
          <tex-math>\mathbf{t} = [\mathrm{Embedding}(\mathrm{prop}(t));\ \mathrm{Embedding}(\mathrm{val}(t))], \qquad \mathbf{h} = \mathrm{MLP}_C(\mathbf{t}).</tex-math>
        </disp-formula>
      </p>
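      <p>The textual encoding that feeds Eq. (1) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three-dimensional toy vectors stand in for pre-trained word embeddings, and the helper names (local_name, textual_form, tokenize, embedding, encode_triple), the camel-case tokenization, the zero vector for unknown words, and the example.org IRIs are all assumptions made for illustration. Blank nodes are omitted for brevity, and the MLP that maps the initial representation to the final one is left out.</p>
      <preformat><![CDATA[
```python
import re

# Toy 3-dimensional vectors standing in for pre-trained word embeddings
# (an assumption for illustration; real use would load actual vectors).
TOY_VECTORS = {
    "birth": [1.0, 0.0, 0.0],
    "place": [0.0, 1.0, 0.0],
    "nanjing": [0.0, 0.0, 1.0],
}
DIM = 3


def local_name(iri):
    """Return the part of an IRI after the last '#' or '/'."""
    return re.split(r"[#/]", iri.rstrip("#/"))[-1]


def textual_form(resource, labels):
    """Use the rdfs:label when available, else the local name of an IRI.

    `labels` maps IRIs to label strings. Anything that is not an IRI is
    treated as a literal and returned as-is (its lexical form).
    """
    if resource in labels:
        return labels[resource]
    if resource.startswith(("http://", "https://")):
        return local_name(resource)
    return resource


def tokenize(text):
    """Split on camelCase and non-alphanumeric boundaries, then lowercase.

    This tokenization is an assumption; the source does not specify one.
    """
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return [w.lower() for w in re.findall(r"[A-Za-z0-9]+", spaced)]


def embedding(resource, labels):
    """Average the word vectors of the textual form of a resource."""
    words = tokenize(textual_form(resource, labels))
    if not words:
        return [0.0] * DIM
    # Unknown words map to a zero vector (an assumption).
    vectors = [TOY_VECTORS.get(w, [0.0] * DIM) for w in words]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def encode_triple(prop, val, labels):
    """Concatenate Embedding(prop(t)) and Embedding(val(t)), as in Eq. (1)."""
    return embedding(prop, labels) + embedding(val, labels)


labels = {"http://example.org/Nanjing": "Nanjing"}
t = encode_triple(
    "http://example.org/birthPlace", "http://example.org/Nanjing", labels
)
print(t)
```
]]></preformat>
      <p>Here the property has no label, so its local name birthPlace is split into two words whose vectors are averaged, while the value is encoded from its label. The resulting vector has twice the dimension of a single word vector.</p>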
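      <p>[Fig. 1: the scoring model of DeepLENS. Labels recoverable from the figure: Word Embedding; Triple Encoding; t; t<sub>1</sub>, t<sub>2</sub>, ..., t<sub>n</sub>; MLP<sub>C</sub>; h; g<sub>1</sub>, g<sub>2</sub>, ..., g<sub>n</sub>; a; [h; d]; MLP<sub>S</sub>; score(t|Desc(e)).]</p>
      <sec id="sec-2-2">
        <p>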
Entity Description Encoding. To score a candidate triple in the context
of other triples in the entity description, previous research [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] captures the
interdependence between triples in Desc(e) using a BiLSTM to pass information.
Triples are fed into the BiLSTM as a sequence. However, Desc(e) is a set and the
triples lack a natural order. The performance of this model is unfavourably
sensitive to the order of input triples. Indeed, as we will show in the experiments,
different orders could lead to considerably different performance.
        </p>
        <p>To generate a representation for Desc(e) that is permutation invariant, we
perform aggregation. Specifically, let <bold>t</bold><sub>1</sub>, ..., <bold>t</bold><sub>n</sub> be the initial representations of
all the n triples in Desc(e) computed by Eq. (1). We feed each <bold>t</bold><sub>i</sub>
for 1 ≤ i ≤ n into an MLP and generate their final representations <bold>g</bold><sub>1</sub>, ..., <bold>g</bold><sub>n</sub>, which in turn
are weighted using an attention mechanism from <bold>h</bold> computed by Eq. (1), the final
representation of the candidate triple t to be scored. We calculate the sum of
these weighted representations of triples to represent Desc(e), denoted by <bold>d</bold>:
          <disp-formula id="eq2">
            <label>(2)</label>
            <tex-math>\mathbf{g}_i = \mathrm{MLP}_D(\mathbf{t}_i), \qquad a_i = \frac{\exp(\cos(\mathbf{h}, \mathbf{g}_i))}{\sum_{j} \exp(\cos(\mathbf{h}, \mathbf{g}_j))}, \qquad \mathbf{d} = \sum_{i=1}^{n} a_i \mathbf{g}_i ,</tex-math>
          </disp-formula>
where the cos function computes the cosine similarity between two vectors, and
a<sub>i</sub> is the i-th component of the attention vector <bold>a</bold>. The result of summation is
not sensitive to the order of triples in Desc(e).</p>
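        <p>The attention-weighted aggregation of Eq. (2) can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' code: the function names (cosine, attention_weights, describe) and the two-dimensional toy vectors are assumptions, and the vectors g are taken as given rather than produced by a trained MLP. The last lines check that reversing the input order leaves the aggregated vector unchanged up to floating-point rounding.</p>
        <preformat><![CDATA[
```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def attention_weights(h, gs):
    """Softmax over cos(h, g_i), i.e. the weights a_i of Eq. (2)."""
    exps = [math.exp(cosine(h, g)) for g in gs]
    total = sum(exps)
    return [x / total for x in exps]


def describe(h, gs):
    """Weighted sum d of the vectors g_i, using the attention weights."""
    weights = attention_weights(h, gs)
    dim = len(gs[0])
    return [sum(w * g[k] for w, g in zip(weights, gs)) for k in range(dim)]


# Toy candidate representation h and triple representations g_1..g_3.
h = [1.0, 0.0]
gs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

d_forward = describe(h, gs)
d_reversed = describe(h, list(reversed(gs)))

# Summation does not depend on the order in which triples are supplied.
same = all(math.isclose(x, y) for x, y in zip(d_forward, d_reversed))
print(d_forward, same)
```
]]></preformat>
        <p>Because each weight depends only on the pair (h, g) and the weighted vectors are summed, any permutation of the triples yields the same d, which is the property a sequence model such as a BiLSTM does not guarantee.</p>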
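        <p>Triple Scoring. For each candidate triple t ∈ Desc(e) to be scored, we
concatenate its final representation <bold>h</bold> and the representation <bold>d</bold> for Desc(e). We feed the
result into an MLP to compute the context-based salience score of t:
          <disp-formula id="eq3">
            <label>(3)</label>
            <tex-math>\mathrm{score}(t \mid \mathrm{Desc}(e)) = \mathrm{MLP}_S([\mathbf{h};\ \mathbf{d}]).</tex-math>
          </disp-formula>
        </p>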
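        <p>To make the scoring step of Eq. (3) and the final selection of the k most salient triples concrete, the following Python sketch applies a tiny MLP to the concatenation [h; d] and keeps the top-k candidates. The weights are hand-set and untrained, the triple identifiers t1 to t3 and their vectors are made up, and the function names (dense, relu, mlp, score, top_k) are hypothetical; only the overall shape (ReLU hidden layers, one linear output unit, top-k selection) follows the text.</p>
        <preformat><![CDATA[
```python
def relu(x):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


def dense(x, weights, bias):
    """One fully connected layer: weights is a list of rows, one per unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def mlp(x, layers):
    """Apply ReLU after every layer except the last, which stays linear."""
    for index, (weights, bias) in enumerate(layers):
        x = dense(x, weights, bias)
        if index != len(layers) - 1:
            x = relu(x)
    return x


def score(h, d, layers):
    """Score a candidate from the concatenation [h; d], as in Eq. (3)."""
    return mlp(h + d, layers)[0]


def top_k(scores, k):
    """Return the ids of the k highest-scoring triples."""
    ranked = sorted(scores, key=lambda tid: scores[tid], reverse=True)
    return ranked[:k]


# Hand-set toy weights (untrained, for illustration only):
# two hidden units reading h[0] and d[0], then one linear output unit.
TOY_MLP_S = [
    ([[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.0]),
]

# Made-up (h, d) pairs for three candidate triples of one entity.
candidates = {
    "t1": ([1.0, 0.0], [0.5, 0.5]),
    "t2": ([-1.0, 0.0], [0.2, 0.8]),
    "t3": ([0.3, 0.0], [0.4, 0.6]),
}

scores = {tid: score(h, d, TOY_MLP_S) for tid, (h, d) in candidates.items()}
summary = top_k(scores, k=2)
print(summary)
```
]]></preformat>
        <p>Note that d differs per candidate because the attention weights are computed from that candidate's own h, so each triple is scored against its own view of the entity description.</p>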
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Experiments</title>
      <sec id="sec-3-1">
        <title>Datasets</title>
        <p>Parameters of the entire model are jointly trained based on the mean squared
error loss, supervised by labeled entity summaries.</p>
        <p>
          We used ESBM v1.2, the largest available benchmark for evaluating
general-purpose entity summarization [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].<sup>2</sup> For each of 125 entities in DBpedia and
50 entities in LinkedMDB, this benchmark provided 6 ground-truth summaries
created by different human experts under k = 5, and another 6 ground-truth
summaries under k = 10. We used the train-valid-test split specified in the
benchmark to perform five-fold cross-validation.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>Participating Methods</title>
        <p>We compared DeepLENS with 10 baseline methods.</p>
        <p>
          Unsupervised Methods. We compared with 9 unsupervised methods that
had been tested on ESBM: RELIN [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], DIVERSUM [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], FACES [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ],
FACES-E [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], CD [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], LinkSUM [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], BAFREC [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], KAFCA [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], and MPSUM [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. We
directly presented their results reported on the ESBM website.
        </p>
        <p>
          Supervised Methods. We compared with ESA [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], the only supervised
method in the literature to our knowledge. We reused its open-source
implementation and configuration.<sup>3</sup> We fed it with triples sorted in alphabetical order.
        </p>
        <p>
          For our approach DeepLENS, we used 300-dimensional fastText [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] word
embedding vectors trained on Wikipedia to generate initial representations of
triples. The sizes of fully connected layers in MLP<sub>C</sub>, MLP<sub>D</sub>, and MLP<sub>S</sub> were
{64, 64}, {64, 64}, and {64, 64, 64, 1}, respectively. All hidden layers used
ReLU as the activation function. In particular, the output layer of the entire model,
i.e., the last layer of MLP<sub>S</sub>, consisted of one linear unit. We trained the model
using the Adam optimizer with learning rate 0.01.
        </p>
        <p>For both ESA and DeepLENS, we performed early stopping on the validation
set to choose the number of training epochs between 1 and 50.</p>
        <p>
          Oracle Method. ORACLE approximated the best possible performance
on ESBM and formed a reference point used for comparisons [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. It output the
k triples that most frequently appeared in ground-truth summaries.
        </p>
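        <p>The ORACLE reference point described above amounts to a frequency count over the ground-truth summaries of an entity. A minimal Python sketch follows; the function name oracle_summary, the toy triple identifiers, and the alphabetical tie-breaking rule are assumptions, since the text does not say how ties are resolved.</p>
        <preformat><![CDATA[
```python
from collections import Counter


def oracle_summary(ground_truths, k):
    """Return the k triples that occur in the most ground-truth summaries.

    Ties are broken alphabetically by triple id (an assumption).
    """
    counts = Counter(
        triple for summary in ground_truths for triple in set(summary)
    )
    ranked = sorted(counts, key=lambda triple: (-counts[triple], triple))
    return ranked[:k]


# Three toy ground-truth summaries for one entity, each of size 2.
ground_truths = [{"a", "b"}, {"a", "c"}, {"a", "b"}]
print(oracle_summary(ground_truths, k=2))
```
]]></preformat>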
      </sec>
      <sec id="sec-3-3">
        <title>Results</title>
        <p>Following ESBM, we compared machine-generated summaries with ground-truth
summaries by calculating F1 score, and reported the mean F1 achieved by each
method over all the test entities in a dataset.</p>
        <p><sup>2</sup> https://w3id.org/esbm</p>
        <p><sup>3</sup> https://github.com/WeiDongjunGabriel/ESA</p>
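        <p>As a rough illustration of the evaluation measure, the Python sketch below computes the F1 score between a generated summary and one ground-truth summary, both treated as sets of triples. Averaging over the several ground-truth summaries of an entity, as mean_f1 does, is an assumption about the protocol rather than something stated here; the function names and toy triple identifiers are likewise hypothetical.</p>
        <preformat><![CDATA[
```python
def f1(summary, truth):
    """F1 between a generated summary and one ground-truth summary."""
    summary, truth = set(summary), set(truth)
    overlap = len(summary & truth)
    if overlap == 0:
        return 0.0
    precision = overlap / len(summary)
    recall = overlap / len(truth)
    return 2 * precision * recall / (precision + recall)


def mean_f1(summary, truths):
    """Average F1 over several ground-truth summaries (assumed protocol)."""
    return sum(f1(summary, truth) for truth in truths) / len(truths)


generated = ["a", "b"]
truths = [["a", "b"], ["a", "c"]]
print(mean_f1(generated, truths))
```
]]></preformat>
        <p>When the generated and ground-truth summaries have the same size k, precision and recall coincide, so F1 reduces to the fraction of shared triples.</p>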
        <p>Comparison with Baselines. As shown in Table 1, supervised methods
were generally better than unsupervised methods. Our DeepLENS outperformed
all the baselines including ESA. Moreover, two-tailed t-tests showed that all the
differences were statistically significant (p &lt; 0.01) in all the settings. DeepLENS
achieved new state-of-the-art results on the ESBM benchmark. However, the
notable gaps between DeepLENS and ORACLE suggested room for improvement,
to be closed by future research.</p>
        <p>Ablation Study. Compared with ESA, we attributed the better
performance of DeepLENS to two improvements in our implementation: the
exploitation of textual semantics, and the permutation-invariant representation of the triple
set. Both were demonstrated by the following ablation study of ESA.</p>
        <p>First, we compared two variants of ESA by encoding triples in different ways.
For triple t, the original version of ESA encoded the structural features of prop(t)
and val(t) using TransE. We implemented ESA-text, a variant that encoded
both prop(t) and val(t) using fastText as in our approach. As shown in Table 2,
ESA-text slightly outperformed ESA, showing the usefulness of textual semantics
compared with graph structure used by ESA.</p>
        <p>Second, we compared two variants of ESA by feeding them triples in different
orders. The default version of ESA was fed with triples sorted in alphabetical
order for both training and testing. We implemented ESA-rnd, a variant that was
fed with triples in alphabetical order for training but in random order for testing.
We tested ESA-rnd 20 times and reported its mean F1 with standard deviation.
In Table 2, the notable falls from ESA to ESA-rnd showed the unfavourable
sensitivity of the BiLSTM used by ESA to the order of input triples.</p>
        <p>Conclusion. We presented DeepLENS, a simple yet effective deep learning model for
general-purpose entity summarization. It has achieved new state-of-the-art results on
the ESBM benchmark, significantly outperforming existing methods. Thus,
entity summarization becomes another research field where a combination of deep
learning and knowledge graphs is likely to shine. However, in DeepLENS we only
exploit textual semantics. In future work, we will incorporate ontological
semantics into our model. We will also revisit the usefulness of structural semantics.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgments</title>
      <p>This work was supported by the National Key R&amp;D Program of China under
Grant 2018YFB1005100 and by the Qing Lan Program of Jiangsu Province.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name><surname>Bojanowski</surname>, <given-names>P.</given-names></string-name>,
          <string-name><surname>Grave</surname>, <given-names>E.</given-names></string-name>,
          <string-name><surname>Joulin</surname>, <given-names>A.</given-names></string-name>,
          <string-name><surname>Mikolov</surname>, <given-names>T.</given-names></string-name>:
          <article-title>Enriching word vectors with subword information</article-title>.
          <source>TACL</source> <volume>5</volume>, <fpage>135</fpage>–<lpage>146</lpage> (<year>2017</year>)
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name><surname>Cheng</surname>, <given-names>G.</given-names></string-name>,
          <string-name><surname>Tran</surname>, <given-names>T.</given-names></string-name>,
          <string-name><surname>Qu</surname>, <given-names>Y.</given-names></string-name>:
          <article-title>RELIN: relatedness and informativeness-based centrality for entity summarization</article-title>.
          <source>In: ISWC 2011, Part I</source>. pp. <fpage>114</fpage>–<lpage>129</lpage> (<year>2011</year>)
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name><surname>Gunaratna</surname>, <given-names>K.</given-names></string-name>,
          <string-name><surname>Thirunarayan</surname>, <given-names>K.</given-names></string-name>,
          <string-name><surname>Sheth</surname>, <given-names>A.P.</given-names></string-name>:
          <article-title>FACES: diversity-aware entity summarization using incremental hierarchical conceptual clustering</article-title>.
          <source>In: AAAI 2015</source>. pp. <fpage>116</fpage>–<lpage>122</lpage> (<year>2015</year>)
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name><surname>Gunaratna</surname>, <given-names>K.</given-names></string-name>,
          <string-name><surname>Thirunarayan</surname>, <given-names>K.</given-names></string-name>,
          <string-name><surname>Sheth</surname>, <given-names>A.P.</given-names></string-name>,
          <string-name><surname>Cheng</surname>, <given-names>G.</given-names></string-name>:
          <article-title>Gleaning types for literals in RDF triples with application to entity summarization</article-title>.
          <source>In: ESWC 2016</source>. pp. <fpage>85</fpage>–<lpage>100</lpage> (<year>2016</year>)
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name><surname>Kim</surname>, <given-names>E.K.</given-names></string-name>,
          <string-name><surname>Choi</surname>, <given-names>K.S.</given-names></string-name>:
          <article-title>Entity summarization based on formal concept analysis</article-title>.
          <source>In: EYRE 2018</source> (<year>2018</year>)
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name><surname>Kroll</surname>, <given-names>H.</given-names></string-name>,
          <string-name><surname>Nagel</surname>, <given-names>D.</given-names></string-name>,
          <string-name><surname>Balke</surname>, <given-names>W.T.</given-names></string-name>:
          <article-title>BAFREC: Balancing frequency and rarity for entity characterization in linked open data</article-title>.
          <source>In: EYRE 2018</source> (<year>2018</year>)
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name><surname>Liu</surname>, <given-names>Q.</given-names></string-name>,
          <string-name><surname>Cheng</surname>, <given-names>G.</given-names></string-name>,
          <string-name><surname>Gunaratna</surname>, <given-names>K.</given-names></string-name>,
          <string-name><surname>Qu</surname>, <given-names>Y.</given-names></string-name>:
          <article-title>Entity summarization: State of the art and future challenges</article-title>.
          <source>CoRR</source> abs/1910.08252 (<year>2019</year>)
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name><surname>Liu</surname>, <given-names>Q.</given-names></string-name>,
          <string-name><surname>Cheng</surname>, <given-names>G.</given-names></string-name>,
          <string-name><surname>Gunaratna</surname>, <given-names>K.</given-names></string-name>,
          <string-name><surname>Qu</surname>, <given-names>Y.</given-names></string-name>:
          <article-title>ESBM: An entity summarization benchmark</article-title>.
          <source>In: ESWC 2020</source> (<year>2020</year>)
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name><surname>Sydow</surname>, <given-names>M.</given-names></string-name>,
          <string-name><surname>Pikula</surname>, <given-names>M.</given-names></string-name>,
          <string-name><surname>Schenkel</surname>, <given-names>R.</given-names></string-name>:
          <article-title>The notion of diversity in graphical entity summarisation on semantic knowledge graphs</article-title>.
          <source>J. Intell. Inf. Syst.</source> <volume>41</volume>(<issue>2</issue>), <fpage>109</fpage>–<lpage>149</lpage> (<year>2013</year>)
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name><surname>Thalhammer</surname>, <given-names>A.</given-names></string-name>,
          <string-name><surname>Lasierra</surname>, <given-names>N.</given-names></string-name>,
          <string-name><surname>Rettinger</surname>, <given-names>A.</given-names></string-name>:
          <article-title>LinkSUM: Using link analysis to summarize entity data</article-title>.
          <source>In: ICWE 2016</source>. pp. <fpage>244</fpage>–<lpage>261</lpage> (<year>2016</year>)
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name><surname>Wei</surname>, <given-names>D.</given-names></string-name>,
          <string-name><surname>Gao</surname>, <given-names>S.</given-names></string-name>,
          <string-name><surname>Liu</surname>, <given-names>Y.</given-names></string-name>,
          <string-name><surname>Liu</surname>, <given-names>Z.</given-names></string-name>,
          <string-name><surname>Huang</surname>, <given-names>L.</given-names></string-name>:
          <article-title>MPSUM: Entity summarization with predicate-based matching</article-title>.
          <source>In: EYRE 2018</source> (<year>2018</year>)
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name><surname>Wei</surname>, <given-names>D.</given-names></string-name>,
          <string-name><surname>Liu</surname>, <given-names>Y.</given-names></string-name>,
          <string-name><surname>Zhu</surname>, <given-names>F.</given-names></string-name>,
          <string-name><surname>Zang</surname>, <given-names>L.</given-names></string-name>,
          <string-name><surname>Zhou</surname>, <given-names>W.</given-names></string-name>,
          <string-name><surname>Han</surname>, <given-names>J.</given-names></string-name>,
          <string-name><surname>Hu</surname>, <given-names>S.</given-names></string-name>:
          <article-title>ESA: Entity summarization with attention</article-title>.
          <source>In: EYRE 2019</source>. pp. <fpage>40</fpage>–<lpage>44</lpage> (<year>2019</year>)
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name><surname>Xu</surname>, <given-names>D.</given-names></string-name>,
          <string-name><surname>Zheng</surname>, <given-names>L.</given-names></string-name>,
          <string-name><surname>Qu</surname>, <given-names>Y.</given-names></string-name>:
          <article-title>CD at ENSEC 2016: Generating characteristic and diverse entity summaries</article-title>.
          <source>In: SumPre 2016</source> (<year>2016</year>)
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>