<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>November</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>ESA: Entity Summarization with Attention</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dongjun Wei∗</string-name>
          <email>weidongjun@iie.ac.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yaxin Liu∗</string-name>
          <email>liuyaxin@iie.ac.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wei Zhou</string-name>
          <email>zhouwei@iie.ac.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fuqing Zhu†</string-name>
          <email>zhufuqing@iie.ac.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Liangjun Zang</string-name>
          <email>zangliangjung@iie.ac.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jizhong Han</string-name>
          <email>hanjizhong@iie.ac.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Songlin Hu</string-name>
          <email>husonglin@iie.ac.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Information Engineering</institution>
          ,
          <addr-line>CAS, Beijing</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Chinese Academy of Sciences</institution>
          ,
          <addr-line>Beijing</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <volume>0</volume>
      <fpage>3</fpage>
      <lpage>07</lpage>
      <abstract>
        <p>The entity summarization task aims to create brief but informative descriptions of entities from knowledge graphs. While previous work mostly focuses on traditional techniques such as clustering algorithms and graph models, we attempt to integrate deep learning methods into this task. In this paper, we propose Entity Summarization with Attention (ESA), a neural network with supervised attention mechanisms for entity summarization. Specifically, we first calculate attention weights for the facts of each entity. Then, we rank the facts to generate reliable summaries. We explore techniques to solve the complex learning problems presented by ESA. On several benchmarks, experimental results show that ESA improves the quality of entity summaries in both F-measure and MAP compared with several state-of-the-art methods, demonstrating the effectiveness of ESA. The source code and output can be accessed at https://github.com/WeiDongjunGabriel/ESA.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>CCS CONCEPTS</title>
      <p>• Computing methodologies → Semantic networks.</p>
    </sec>
    <sec id="sec-2">
      <title>1 INTRODUCTION</title>
      <p>
        Since the Knowledge Graph (KG) was first formally defined by
Google in 2012, it has been widely applied across
communities in Artificial Intelligence (AI). A KG
describes real-world entities and the relationships among them.
Entities in a KG are generally represented with the
Resource Description Framework (RDF), in
the form of &lt;subject, predicate, object&gt; triples [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. As knowledge
bases grow rapidly, the number of entities and
relations in KGs rises at an alarming rate, which
makes it more challenging to extract or focus on
the most representative triples. To comprehend lengthy
descriptions in a large-scale KG quickly, summarizing useful
information to condense the knowledge base
is an emerging problem. Entity summarization extracts
brief but informative descriptions of entities and has
attracted keen interest in recent years, since high-quality
summaries are fundamental to deriving subsequent
knowledge in many semantic tasks.
      </p>
      <p>
        Cheng et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] proposed RELIN, which adapts the random surfer
model to rank features by relatedness and informativeness
for quick identification of entities.
DIVERSUM [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] takes the diversity of properties
into consideration when summarizing entities in a KG. FACES
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] balances the centrality and diversity
of extracted triples through the Cobweb algorithm. FACES-E,
proposed by Gunaratna et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], extends FACES by
considering the effect of literals in entity summarization. CD
[
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] formulates entity summarization as a binary quadratic
knapsack problem. LinkSUM [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] ranks triples with the PageRank algorithm and focuses
more on objects than on the diversity of properties.
      </p>
      <p>In retrospect, previous work requires considerable prior
knowledge to construct complex ranking rules for entity
summarization. Moreover, deep learning methods have rarely
been applied to entity summarization. Because an attention mechanism
generates different weights according to what humans attend to, it
can assign higher weights to the triples that people focus on
more. With a BiLSTM, contextual
information can be fully used to capture more informative triples.
Therefore, we propose a model called ESA, which uses a
supervised attention mechanism with a BiLSTM. The ESA allows
us to calculate attention weights for the triples of each
entity; the final summaries are then extracted by ranking
these weights.</p>
    </sec>
    <sec id="sec-3">
      <title>2 TASK DESCRIPTION</title>
      <p>RDF is an abstract data model, and an RDF graph
consists of a collection of statements. Simple statements
generally represent real-world entities, which are usually stored as
triples. Each triple t represents a fact that is in the form of
&lt;subject, predicate, object&gt;, denoted as &lt; s, p, o &gt;. Since
RDF data is encoded by unique identifiers (URIs), an
entity in RDF graphs can be regarded as a subject with all
predicates and corresponding objects to those predicates.</p>
      <p>Definition 1 (Entity Summarization): Entity
Summarization (ES) is a technique for summarizing RDF data to
create concise summaries in a KG. The subject of each
entity provides the core for summarizing that entity. Therefore,
the task of entity summarization is defined as extracting a
subset from the lengthy feature set of each entity with the
respective subject. Given an entity e and a positive integer k,
the output ES(e, k) is the list of the top-k features of e in the
ranking.</p>
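      <p>As a minimal sketch of Definition 1, the following Python function returns ES(e, k) once every triple of an entity has been given a score. The function name <monospace>entity_summary</monospace>, the example triples, and the scores are hypothetical illustrations and are not taken from the ESA implementation.</p>
      <preformat><![CDATA[
```python
from typing import List, Sequence, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def entity_summary(triples: Sequence[Triple],
                   scores: Sequence[float],
                   k: int) -> List[Triple]:
    """Return the k highest-scoring triples of one entity, best first."""
    if len(triples) != len(scores):
        raise ValueError("one score per triple is required")
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # sorted() is stable, so triples with equal scores keep their input order.
    order = sorted(range(len(triples)), key=lambda i: scores[i], reverse=True)
    return [triples[i] for i in order[:k]]


# Hypothetical entity with three facts and made-up scores.
triples = [
    ("ex:Alice", "ex:name", "Alice"),
    ("ex:Alice", "ex:birthPlace", "ex:Paris"),
    ("ex:Alice", "ex:type", "ex:Person"),
]
scores = [0.2, 0.5, 0.3]
summary = entity_summary(triples, scores, k=2)
```
]]></preformat>
      <p>Any ranking model can supply the scores; if k exceeds the number of triples, the sketch simply returns all of them in ranked order.</p>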
    </sec>
    <sec id="sec-4">
      <title>3 THE PROPOSED ESA</title>
      <p>We model ES as a ranking task, similar to existing work
such as RELIN, FACES, and ES-LDA. Different from
traditional approaches to generating entity summaries in a KG,
the ESA is a neural network model built on a sequence model.
Figure 1 describes the architecture of the model.</p>
      <p>
        Similar to most sequence models [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], the ESA has an
encoder-decoder structure. The encoder consists of
knowledge representation and a BiLSTM, and maps an input
sequence (t1, t2, . . . , tn) of RDF triples from a given entity
to a continuous representation h = (h1, h2, . . . , hn). The
decoder is mainly composed of an attention model. Given h, the
decoder uses a supervised attention mechanism to
generate an output vector α = (α1, α2, . . . , αn), the
attention vector of the entity, which is then used as
evidence for summarizing the entity. Higher attention weights
correspond to more important triples, so we select the triples
with the top-k highest weights as the entity summary.
      </p>
    </sec>
    <sec id="sec-5">
      <title>3.1 Knowledge Representation</title>
      <p>
        Entities in a large-scale KG are usually described by RDF
triples, each of which consists of a subject, a predicate,
and an object. MPSUM, proposed by Wei et al. [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], takes
the uniqueness of predicates and the importance of objects
into consideration for entity summarization. Its experimental
results show that the characteristics of predicates and
objects are key factors in entity summarization. To make
full use of the information contained in RDF triples, we
extract the predicates and objects from the triples. Let n be
the number of triples with the same subject s; the two
lists of extracted predicates and objects
are l1 = (p1, p2, . . . , pn) and l2 = (o1, o2, . . . , on), where pi
and oi are the predicate and object of the
i-th triple. To solve the UNK problem of objects, we
employ different methods to map predicates
and objects into continuous vector spaces [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>
        Predicate Embedding Table. We use learned embeddings [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] to
convert a predicate into a vector of dimension dp. We
randomly initialize the embedding vector of each predicate and
tune it during training.
      </p>
      <p>
        Object Embedding Table. Unlike the predicate representations,
which are based on word embedding techniques, we use the TransE
model [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] to map objects into vectors of dimension do. We
first pretrain a TransE model on ESBM benchmark v1.1
and extract the vectors of objects to construct a lookup
table. We then obtain object vectors as input by
looking up the table; the object vectors are fixed
during training.
      </p>
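      <p>The two lookup tables can be sketched with NumPy as follows. This is an illustration under stated assumptions rather than the authors' code: the object table is filled with random numbers standing in for vectors exported from a pretrained TransE model, the example predicates and objects are made up, and concatenating the two vectors into one input per triple is inferred from the "concat" labels of the architecture figure. The dimensions dp = do = 100 are the values reported in the implementation details.</p>
      <preformat><![CDATA[
```python
import numpy as np

rng = np.random.default_rng(0)
d_p, d_o = 100, 100  # embedding dimensions reported in the paper

# Hypothetical predicates and objects of one entity (same subject s).
predicates = ["ex:birthPlace", "ex:type", "ex:name"]
objects = ["ex:Paris", "ex:Person", "Alice"]

# Predicate table: randomly initialised, meant to be tuned during training.
pred_index = {p: i for i, p in enumerate(sorted(set(predicates)))}
pred_table = rng.normal(scale=0.1, size=(len(pred_index), d_p))

# Object table: placeholder for vectors exported from a pretrained TransE
# model (random numbers here); marked read-only to mimic "fixed in training".
obj_index = {o: i for i, o in enumerate(sorted(set(objects)))}
obj_table = rng.normal(scale=0.1, size=(len(obj_index), d_o))
obj_table.setflags(write=False)


def encode(preds, objs):
    """Look up both tables and concatenate them into one vector per triple."""
    p = pred_table[[pred_index[x] for x in preds]]
    o = obj_table[[obj_index[x] for x in objs]]
    return np.concatenate([p, o], axis=1)  # shape (n, d_p + d_o)


x = encode(predicates, objects)
```
]]></preformat>
      <p>Keeping the object table read-only makes the asymmetry explicit: only the predicate vectors would receive gradient updates in a real training loop.</p>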
    </sec>
    <sec id="sec-6">
      <title>3.2 BiLSTM Network</title>
      <p>We first map the set of triples into a sequence in random order,
then employ a BiLSTM to extract information from the
preceding triples 1 to i − 1 and the following triples i + 1 to
n, propagating information forward and backward
respectively. We denote by LSTML and
LSTMR the forward and backward LSTM models, by xi
the input at time step i for LSTML and LSTMR, and by
hLi and hRi their corresponding outputs. We encode
the input xi using the bidirectional LSTM as follows:
<disp-formula><label>(1)</label><tex-math><![CDATA[
\begin{aligned}
h_{L_i} &= \mathrm{LSTM}_L\left(x_i, h_{L_{i-1}}\right), \\
h_{R_i} &= \mathrm{LSTM}_R\left(x_i, h_{R_{i-1}}\right).
\end{aligned}
]]></tex-math></disp-formula>
The final output of the BiLSTM is h = (h1, h2, · · · , hn), where each
component hi is obtained by
concatenating hLi and hRi.</p>
      <p>[Figure 1: architecture of the ESA model. Labels recoverable from the figure: for each triple &lt;s, pi, oi&gt;, a predicate embedding and a TransE object embedding are concatenated, passed through two LSTM layers whose outputs are concatenated, and followed by an attention layer, a softmax, and the output.]</p>
      <p>[Figure 2: construction of the gold attention vector. Labels recoverable from the figure: initialize the counts to 0, count the user selections ci, and normalize to obtain ai (example values: c1 = 1, a1 = 0.02; c2 = 0, a2 = 0; c3 = 8, a3 = 0.16).]</p>
      <p>Moreover, hs is the concatenation of hs1 and hs2, where hs1 is the
hidden state from the final cell of the upper LSTM layer and
hs2 is the hidden state from the final cell of the
lower LSTM layer. We then take hs as the input of the
subsequent attention layer.</p>
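      <p>The encoder can be sketched as a plain NumPy bidirectional LSTM. This is an illustrative sketch, not the authors' implementation: the gate layout, the random parameters, and all function names are assumptions, and the "upper" and "lower" LSTM layers are assumed to be the forward and backward directions. The backward pass reads the sequence from tn to t1, so the previous state it receives is the one computed for triple i + 1.</p>
      <preformat><![CDATA[
```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_lstm(d_in, d_h, rng):
    """Random parameters for one direction; gates stacked as i, f, g, o."""
    return {
        "W": rng.normal(scale=0.1, size=(4 * d_h, d_in)),
        "U": rng.normal(scale=0.1, size=(4 * d_h, d_h)),
        "b": np.zeros(4 * d_h),
    }


def lstm_step(params, x, h_prev, c_prev):
    """One LSTM cell update: returns the new hidden and cell state."""
    d_h = h_prev.shape[0]
    z = params["W"] @ x + params["U"] @ h_prev + params["b"]
    i = sigmoid(z[:d_h])               # input gate
    f = sigmoid(z[d_h:2 * d_h])        # forget gate
    g = np.tanh(z[2 * d_h:3 * d_h])    # candidate cell state
    o = sigmoid(z[3 * d_h:])           # output gate
    c = f * c_prev + i * g
    return o * np.tanh(c), c


def run_lstm(params, xs, d_h):
    """Run one direction over xs and return the hidden state of every step."""
    h, c = np.zeros(d_h), np.zeros(d_h)
    outputs = []
    for x in xs:
        h, c = lstm_step(params, x, h, c)
        outputs.append(h)
    return np.stack(outputs)


def bilstm(xs, fwd, bwd, d_h):
    h_left = run_lstm(fwd, xs, d_h)                 # reads t_1 ... t_n
    h_right = run_lstm(bwd, xs[::-1], d_h)[::-1]    # reads t_n ... t_1, re-aligned
    h = np.concatenate([h_left, h_right], axis=1)   # h_i = [h_Li ; h_Ri]
    h_s = np.concatenate([h_left[-1], h_right[0]])  # final cell of each direction
    return h, h_s


rng = np.random.default_rng(0)
n, d_in, d_h = 5, 8, 4                 # toy sizes, chosen only for illustration
xs = rng.normal(size=(n, d_in))        # stand-in for the embedded triples
h, h_s = bilstm(xs, init_lstm(d_in, d_h, rng), init_lstm(d_in, d_h, rng), d_h)
```
]]></preformat>
      <p>With these toy sizes, h has one row of length 2 × 4 per triple and hs is a single vector of the same length, which is what the attention layer needs in order to compare hs with every hi.</p>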
    </sec>
    <sec id="sec-7">
      <title>3.3 Supervised Attention</title>
      <p>
        Attention models are widely used in neural networks for various
tasks such as Natural Language Processing (NLP) [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
For instance, in machine translation [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], only certain
words in the input sequence may be relevant for predicting
the next word [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. An attention model incorporates this notion by
allowing the model to dynamically attend to only
the parts of the input that help in performing the task at
hand effectively. In entity summarization, when users
observe the facts of a subject, they may pay more
attention to certain facts than to the rest. This can be modeled
with an attention model by assigning an attention weight
to each fact of the subject.
      </p>
      <p>Given the above considerations, we first construct gold
attention vectors from existing datasets. Then, we employ an
attention mechanism to generate machine attention vectors.</p>
      <p>Gold Attention Vectors. In this work, we use ESBM benchmark
v1.1 as our dataset. For each subject to be summarized,
ESBM benchmark v1.1 provides not only all the RDF
triples related to the subject, but also
several sets of top-5 and top-10 triples selected by
different users according to their preferences, which we
utilize to construct gold attention vectors. We first initialize
an attention vector of dimension n to zero, where n is the
number of RDF triples of the subject.
Then, we count how often each triple is selected by users
and update the vector accordingly; the i-th value ci
is the frequency of triple ti. In ESBM benchmark v1.1,
each subject has five sets of top-5 and top-10 triples selected
by five different users, so the frequency of each triple ranges
from 0 to 5. Figure 2 illustrates the details, where ᾱ is the
final gold attention vector after normalization (written with a bar
to distinguish it from the machine attention vector α) and ᾱi, the
i-th value of ᾱ, is calculated as
<disp-formula><label>(2)</label><tex-math><![CDATA[
\bar{\alpha}_i = \frac{c_i}{\sum_{j=1}^{n} c_j}.
]]></tex-math></disp-formula></p>
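      <p>The count-and-normalize construction of Equation (2) can be sketched as follows. The function name <monospace>gold_attention</monospace> and the user selections are hypothetical; real selections would be read from the ESBM gold summaries.</p>
      <preformat><![CDATA[
```python
import numpy as np


def gold_attention(n_triples, user_selections):
    """Count how often each triple index was selected, then normalise."""
    counts = np.zeros(n_triples)
    for selection in user_selections:
        for idx in selection:
            counts[idx] += 1
    total = counts.sum()
    if total == 0:
        raise ValueError("no triple was selected by any user")
    return counts, counts / total


# Hypothetical entity with 5 triples and three users picking 2 triples each.
selections = [[0, 2], [2, 3], [2, 0]]
counts, alpha_gold = gold_attention(5, selections)
```
]]></preformat>
      <p>Here the counts are (2, 0, 3, 1, 0) and sum to 6, so the gold attention vector is (1/3, 0, 1/2, 1/6, 0); triples that no user selected receive zero weight.</p>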
      <p>Machine Attention Vectors. To generate machine attention
vectors with the attention model, we first obtain the output vectors
h = (h1, h2, . . . , hn) produced by the BiLSTM layer. Then,
the attention layer automatically learns the attention vector
α = (α1, α2, . . . , αn) from h. We use the softmax function
to generate the final attention vector α:
<disp-formula><label>(3)</label><tex-math><![CDATA[
\alpha = \mathrm{softmax}\left(h_s^{T} h\right),
]]></tex-math></disp-formula>
where hs is the concatenation of hs1 and hs2, hs1 is the
hidden state from the final cell of the upper LSTM layer, and
hs2 is the hidden state from the final cell of the lower
LSTM layer. We rank the entries of the final attention vector α and
obtain the entity summary from the triples with the
top-k values.</p>
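      <p>Equation (3) and the final ranking step can be sketched in NumPy as below. The sketch assumes that the product of hs with h is a dot product between hs and each hi; the toy values of h and hs and the helper names are made up for illustration.</p>
      <preformat><![CDATA[
```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - np.max(z))
    return e / e.sum()


def machine_attention(h, h_s):
    """alpha = softmax(h_s^T h): one dot-product score per triple."""
    return softmax(h @ h_s)


def top_k(alpha, k):
    """Indices of the k largest attention weights, largest first."""
    return np.argsort(-alpha, kind="stable")[:k]


# Toy BiLSTM outputs for 3 triples (rows) and a toy summary state h_s.
h = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
h_s = np.array([2.0, 1.0])

alpha = machine_attention(h, h_s)
selected = top_k(alpha, k=2)
```
]]></preformat>
      <p>The raw scores are 2, 1, and 3, so the third triple receives the largest weight, followed by the first; the softmax only rescales the scores into a distribution and does not change their order.</p>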
      <p>Training. Given the gold attention vector ᾱ and the machine
attention vector α produced by our model, we employ the cross-entropy loss
and define the loss function L of the proposed ESA model
as follows:
<disp-formula><label>(4)</label><tex-math><![CDATA[
L(\bar{\alpha}, \alpha) = \mathrm{CrossEntropy}(\bar{\alpha}, \alpha).
]]></tex-math></disp-formula>
Finally, we use the back-propagation algorithm to jointly train
the ESA model.</p>
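      <p>The paper does not spell out the cross-entropy term, so the sketch below assumes the standard form, the negative sum over triples of the gold weight times the logarithm of the machine weight. The function name, the clipping constant, and the example vectors are illustrative assumptions.</p>
      <preformat><![CDATA[
```python
import numpy as np


def cross_entropy(alpha_gold, alpha_machine, eps=1e-12):
    """Standard cross-entropy between two distributions over the triples."""
    # Clip to avoid log(0) when the model assigns zero weight to a triple.
    alpha_machine = np.clip(alpha_machine, eps, 1.0)
    return float(-np.sum(alpha_gold * np.log(alpha_machine)))


alpha_gold = np.array([0.5, 0.5, 0.0])

# A prediction that matches the gold weights and one that does not.
loss_close = cross_entropy(alpha_gold, np.array([0.5, 0.5, 0.0]))
loss_far = cross_entropy(alpha_gold, np.array([0.1, 0.1, 0.8]))
```
]]></preformat>
      <p>The loss is smaller when the machine attention puts its mass on the triples that users selected, which is the signal back-propagation uses to adjust the predicate embeddings and the BiLSTM.</p>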
    </sec>
    <sec id="sec-8">
      <title>4 EXPERIMENT</title>
      <p>
        In this section, we first introduce the datasets and
evaluation metrics employed in our experiments. Then we give the
implementation details to describe the overall procedure. To
demonstrate the effectiveness of our model, we finally compare ESA
with state-of-the-art approaches, including RELIN [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ],
DIVERSUM [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], CD [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], FACES [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], FACES-E [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], and
LinkSUM [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. The experimental results are presented in
Section 4.2.
      </p>
      <p>In this work, experiments are conducted with ESBM
benchmark v1.1 as ground truth. ESBM benchmark
v1.1 consists of 175 entities, of which 125 are from
DBpedia<sup>2</sup> and the remaining 50 are from LinkedMDB<sup>3</sup>. The
datasets and the ground truth for entity summarization can
be obtained from ESBM<sup>4</sup>.</p>
    </sec>
    <sec id="sec-9">
      <title>4.1 Implementation Details</title>
      <p>We apply a word embedding technique to map predicates into a
continuous space and use translation vectors pretrained with
TransE for objects. Before training, we randomly partition the ESBM
dataset into five subsets for cross-validation.
During training, the word vectors of predicates are trained
jointly with the model while the object vectors are fixed. The TransE
model is trained on the whole ESBM benchmark v1.1 using the
thunlp implementation<sup>5</sup>. We generate gold
attention vectors from ESBM benchmark v1.1 and
calculate machine attention vectors with our model.
Finally, we compare the top-5 and top-10
entity summaries of our model with the benchmark results of the entity
summarization tools RELIN, DIVERSUM, CD, FACES-E,
FACES, and LinkSUM, as shown in Table 1 and Table 2.</p>
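      <p>The random five-way partition for cross-validation can be sketched as follows. The function name, the fixed seed, and the use of integer entity identifiers are assumptions made for illustration; the paper only states that the dataset is randomly split into five subsets.</p>
      <preformat><![CDATA[
```python
import random


def five_fold_partition(entity_ids, n_folds=5, seed=0):
    """Shuffle the entity ids and deal them into n_folds disjoint subsets."""
    ids = list(entity_ids)
    random.Random(seed).shuffle(ids)
    return [ids[i::n_folds] for i in range(n_folds)]


folds = five_fold_partition(range(175))  # ESBM v1.1 contains 175 entities

# Each fold is held out once; the other four folds form the training set.
splits = []
for held_out in range(len(folds)):
    test_ids = folds[held_out]
    train_ids = [e for j, fold in enumerate(folds) if j != held_out for e in fold]
    splits.append((train_ids, test_ids))
```
]]></preformat>
      <p>Splitting by entity rather than by triple keeps all facts of an entity in the same fold, so no entity contributes to both training and evaluation within one split.</p>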
      <p>Hyper-parameters are tuned on the selected datasets. We
set the dimension of the predicate embedding to 100 and the
dimension of TransE to 100. The initial learning rate of our model
is set to 0.0001 and is kept constant during
training.</p>
    </sec>
    <sec id="sec-10">
      <title>4.2 Experimental Results</title>
      <p>In this paper, we carry out several experiments using the
F-measure and MAP metrics on two datasets:
DBpedia and LinkedMDB. The results for F-measure and MAP are shown
in Table 1 and Table 2, respectively. ESA achieves better results
than all other state-of-the-art approaches on each dataset and
performs best in each metric.</p>
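      <p>For orientation, the F-measure of a generated summary can be sketched as below. The paper does not restate the evaluation protocol, so this sketch assumes that a summary is compared with each user's gold summary separately and that the resulting F-measures are averaged; the function names and the example summaries, given as triple indices, are hypothetical.</p>
      <preformat><![CDATA[
```python
def f_measure(predicted, gold):
    """Harmonic mean of precision and recall between two sets of triples."""
    predicted, gold = set(predicted), set(gold)
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def mean_f_measure(predicted, gold_summaries):
    """Average the F-measure over the gold summaries of all users."""
    return sum(f_measure(predicted, g) for g in gold_summaries) / len(gold_summaries)


# Hypothetical top-5 summary and two users' gold top-5 summaries.
predicted = [0, 1, 2, 3, 4]
gold_summaries = [[0, 1, 2, 5, 6], [0, 1, 2, 3, 4]]
score = mean_f_measure(predicted, gold_summaries)
```
]]></preformat>
      <p>The first gold summary shares three of five triples with the prediction (F-measure 0.6) and the second matches it exactly (F-measure 1.0), so the averaged score in this toy case is 0.8.</p>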
      <p>F-measure. As shown in Table 1, the largest improvement on a
single dataset is for top-5 summaries generated from
DBpedia, where our model reaches the highest F-measure of 0.310,
exceeding the previous best result, produced by CD. On
the DBpedia dataset, the total increase over top-5 and
top-10 summaries is 3.1%. On the LinkedMDB dataset, our model
obtains the best score for both k = 5 and k = 10. When
the two datasets are combined, our model achieves
increases of 7.96% and 5.82% for top-5 and top-10 summaries,
respectively.</p>
      <p>MAP. Our model also achieves better MAP scores,
as Table 2 shows; the largest increase is 3%,
on LinkedMDB for k = 10. The improvement on LinkedMDB
is more pronounced in MAP than in F-measure, with a
total increase of up to 5.6%.</p>
      <p>ALL. Combining Table 1 and Table 2, it is evident that
ESA yields better results for both F-measure and MAP. It
is worth mentioning that our model outperforms all other
state-of-the-art approaches in both F-measure and MAP on
ESBM benchmark v1.1, which demonstrates the effectiveness of our model.</p>
      <p><sup>2</sup> https://wiki.dbpedia.org;
<sup>3</sup> http://linkedmdb.org;
<sup>4</sup> http://ws.nju.edu.cn/summarization/esbm/;
<sup>5</sup> https://github.com/thunlp/TensorFlow-TransX</p>
    </sec>
    <sec id="sec-11">
      <title>5 CONCLUSION</title>
      <p>In this work, we propose ESA, a neural network with
supervised attention mechanisms for entity summarization. Our
model incorporates human preference to improve
the reliability of the extracted entity summaries. We also explore
how to construct gold attention vectors for modelling the
supervised attention mechanism. ESA takes the extracted
predicates and objects as input; in particular, we use
different but suitable knowledge embedding methods
for predicates and objects: word embedding
for predicates and TransE for objects. The final
output of ESA is a vector of normalized attention weights, which can be
used to select representative entity summaries. Our experiments
indicate that word embedding and graph embedding
techniques such as TransE can be combined in a
single task, which better represents the facts in a
knowledge graph and provides more powerful input vectors
for neural networks and other models. Experimental results
show that our work outperforms all other compared approaches in both
F-measure and MAP.</p>
    </sec>
    <sec id="sec-12">
      <title>6 FUTURE WORK</title>
      <p>In future work, we expect to integrate various deep
learning methods and design more powerful and effective
neural networks. Specifically, we may improve our work in
the following ways: (1) extending the scale of the training set
to better train our models; (2) instead of employing the TransE
model to tackle the UNK problem, analyzing RDF
triples at a finer granularity.</p>
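      <p>[Table 1 (F-measure) and Table 2 (MAP): results on DBpedia, LinkedMDB, and ALL for k=5 and k=10. The tables were flattened during extraction, and the row and column assignment of the values cannot be recovered. Surviving values, in extraction order. After the headers "k=5, LinkedMDB, k=5 k=10": 0.203, 0.207, 0.211, 0.313, 0.169, 0.140, 0.320, 0.241, 0.266, 0.341, 0.155, 0.141, 0.369. After the headers "k=10, k=10": 0.231, 0.237, 0.252, 0.289, 0.241, 0.236, 0.312, 0.313, 0.298, 0.375, 0.227, 0.213, 0.386. After the header "ALL": 0.399, 0.464, 0.455, 0.461, 0.381, 0.421, 0.491, 0.466, 0.468, 0.527, 0.351, 0.345, 0.549.]</p>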
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Yoshua</given-names>
            <surname>Bengio</surname>
          </string-name>
          , Réjean Ducharme, Pascal Vincent, and
          <string-name>
            <given-names>Christian</given-names>
            <surname>Janvin</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>A Neural Probabilistic Language Model</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>3</volume>
          (
          <year>2000</year>
          ),
          <fpage>1137</fpage>
          -
          <lpage>1155</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Antoine</given-names>
            <surname>Bordes</surname>
          </string-name>
          , Nicolas Usunier, Alberto García-Durán,
          <string-name>
            <given-names>Jason</given-names>
            <surname>Weston</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Oksana</given-names>
            <surname>Yakhnenko</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Translating Embeddings for Modeling Multi-relational Data</article-title>
          .
          <source>In NIPS.</source>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Sneha</given-names>
            <surname>Chaudhari</surname>
          </string-name>
          , Gungor Polatkan, Rohan Ramanath, and
          <string-name>
            <given-names>Varun</given-names>
            <surname>Mithal</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>An Attentive Survey of Attention Models</article-title>
          . CoRR abs/
          <year>1904</year>
          .02874 (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Gong</surname>
            <given-names>Cheng</given-names>
          </string-name>
          , Thanh Tran, and
          <string-name>
            <given-names>Yuzhong</given-names>
            <surname>Qu</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>RELIN: Relatedness and Informativeness-Based Centrality for Entity Summarization</article-title>
          . In Semantic Web-iswc -international
          <source>Semantic Web Conference.</source>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Kyunghyun</given-names>
            <surname>Cho</surname>
          </string-name>
          , Bart van Merrienboer,
          <string-name>
            <given-names>Dzmitry</given-names>
            <surname>Bahdanau</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Yoshua</given-names>
            <surname>Bengio</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>On the Properties of Neural Machine Translation: Encoder-Decoder Approaches</article-title>
          . In SSST@EMNLP.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Maria</given-names>
            <surname>De-Arteaga</surname>
          </string-name>
          , Alexey Romanov,
          <string-name>
            <surname>Hanna M. Wallach</surname>
          </string-name>
          , Jennifer T. Chayes, Christian Borgs, Alexandra Chouldechova, Sahin Cem Geyik, Krishnaram Kenthapadi, and Adam Tauman Kalai.
          <year>2019</year>
          .
          <article-title>Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting</article-title>
          .
          <source>In FAT.</source>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Kalpa</given-names>
            <surname>Gunaratna</surname>
          </string-name>
          , Krishnaprasad Thirunarayan, Amit Sheth, and Cheng Gong.
          <year>2016</year>
          .
          <article-title>Gleaning Types for Literals in RDF Triples with Application to Entity Summarization</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Kalpa</given-names>
            <surname>Gunaratna</surname>
          </string-name>
          , Krishnaprasad Thirunarayan, and
          <string-name>
            <given-names>Amit P.</given-names>
            <surname>Sheth</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>FACES: Diversity-Aware Entity Summarization Using Incremental Hierarchical Conceptual Clustering</article-title>
          .
          <source>In AAAI.</source>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Kalpa</given-names>
            <surname>Gunaratna</surname>
          </string-name>
          , Krishnaprasad Thirunarayan,
          <string-name>
            <given-names>Amit P.</given-names>
            <surname>Sheth</surname>
          </string-name>
          , and Gong Cheng.
          <year>2016</year>
          .
          <article-title>Gleaning Types for Literals in RDF Triples with Application to Entity Summarization</article-title>
          .
          <source>In ESWC.</source>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Siwei</given-names>
            <surname>Lai</surname>
          </string-name>
          , Kang Liu, Liheng Xu, and
          <string-name>
            <given-names>Jian</given-names>
            <surname>Zhao</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>How to Generate a Good Word Embedding</article-title>
          .
          <source>IEEE Intelligent Systems</source>
          <volume>31</volume>
          (
          <year>2016</year>
          ),
          <fpage>5</fpage>
          -
          <lpage>14</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Thang</surname>
            <given-names>Luong</given-names>
          </string-name>
          , Hieu Pham, and
          <string-name>
            <given-names>Christopher D.</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Effective Approaches to Attention-based Neural Machine Translation</article-title>
          .
          <source>In EMNLP.</source>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Danyun</given-names>
            <surname>Xu</surname>
          </string-name>
          , Liang Zheng, and Yuzhong Qu
          .
          <year>2016</year>
          . CD at ENSEC 2016:
          <article-title>Generating Characteristic and Diverse Entity Summaries</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Marcin</surname>
            <given-names>Sydow</given-names>
          </string-name>
          , Mariusz Pikula, and
          <string-name>
            <given-names>Ralf</given-names>
            <surname>Schenkel</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>DIVERSUM: Towards diversified summarisation of entities in knowledge graphs</article-title>
          .
          <source>2010 IEEE 26th International Conference on Data Engineering Workshops (ICDEW 2010)</source>
          (
          <year>2010</year>
          ),
          <fpage>221</fpage>
          -
          <lpage>226</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Andreas</surname>
            <given-names>Thalhammer</given-names>
          </string-name>
          , Nelia Lasierra, and
          <string-name>
            <given-names>Achim</given-names>
            <surname>Rettinger</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>LinkSUM: Using Link Analysis to Summarize Entity Data</article-title>
          .
          <source>In ICWE.</source>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Ashish</surname>
            <given-names>Vaswani</given-names>
          </string-name>
          , Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones,
          <string-name>
            <given-names>Aidan N.</given-names>
            <surname>Gomez</surname>
          </string-name>
          , Lukasz Kaiser, and
          <string-name>
            <given-names>Illia</given-names>
            <surname>Polosukhin</surname>
          </string-name>
          .
          <year>2017</year>
          . Attention Is All You Need. (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Dongjun</surname>
            <given-names>Wei</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Shiyuan</given-names>
            <surname>Gao</surname>
          </string-name>
          , Yaxin Liu, Zhibing Liu, and
          <string-name>
            <given-names>Longtao</given-names>
            <surname>Huang</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>MPSUM: Entity Summarization with Predicate-based Matching</article-title>
          .
          <source>In EYRE.</source>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Danyun</surname>
            <given-names>Xu</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Liang</given-names>
            <surname>Zheng</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Yuzhong</given-names>
            <surname>Qu</surname>
          </string-name>
          .
          <year>2016</year>
          . CD at ENSEC 2016:
          <article-title>Generating Characteristic and Diverse Entity Summaries</article-title>
          . In SumPre@ESWC.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Tom</given-names>
            <surname>Young</surname>
          </string-name>
          , Devamanyu Hazarika, Soujanya Poria, and
          <string-name>
            <given-names>Erik</given-names>
            <surname>Cambria</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Recent Trends in Deep Learning Based Natural Language Processing [Review Article]</article-title>
          .
          <source>IEEE Computational Intelligence Magazine</source>
          <volume>13</volume>
          (
          <year>2018</year>
          ),
          <fpage>55</fpage>
          -
          <lpage>75</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>