<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Topical Sentence Embedding for Query Focused Document Summarization</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yang Gao</string-name>
          <email>gyang@bit.edu.cn</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Heyan Huang</string-name>
          <email>hhy63@bit.edu.cn</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Linjing Wei</string-name>
          <email>weilinjing@bit.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Qian Liu</string-name>
          <email>liuqian2013@bit.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>BIT; Beijing Advanced Innovation Center for Imaging Technology, Capital Normal University</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Beijing Institute of Technology (BIT); Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Distributed vector representations for sentences have been utilized in the summarization area, since they simplify semantic similarity calculation between sentences as well as between a sentence and a document. Many extensions incorporate latent topics and word embeddings; however, few of them assign sentences explicit topics. Besides, most sentence embedding frameworks follow the same spirit of a prediction task about a word in the sentence, which omits sentence-to-sentence coherence. To address these problems, we propose a novel sentence embedding framework that combines the current sentence representation, its word-based content and its topic assignment to predict the next sentence representation. Experiments on summarization tasks show that our model outperforms state-of-the-art methods.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Distributed vector representation for sentences
have been utilized in summarization area, since
it simplifies semantic cosine calculation between
sentence to sentence as well as sentence to
document. Many extension works have been done
to incorporate latent topics and word embedding,
however, few of them assign sentences with
explicit topics. Besides, much sentence embedding
framework follows the same spirit of prediction
task about a word in the sentence, which omits
the sentence-to-sentence coherence. To address
these problems, we proposed a novel sentence
embedding framework to collaborate the current
sentence representation, word-based content and
topic assignment of the sentence to predict the
next sentence representation. The experiments on
summarization tasks show our model outperforms
state-of-the-art methods.</p>
      <p>Copyright ⃝c by the paper’s authors. Copying permitted for private and
academic purposes.</p>
      <p>InI:n:AP.roEcdeietodri,ngBs. oCfoIeJdCiAtoIr W(eodrsk.)s:hoPproocneeSdeimngasntoifc tMheacXhYinZe LWeoarknsinhgop,
Location,(SCMouLnt2ry0,1D7)D,A-MuMg1M9--Y25Y2Y0Y1,7p,uMbleislbhoeduranteh,tAtpu:/s/tcreaulira-.ws.org
1</p>
    </sec>
    <sec id="sec-2">
      <title>Introduction</title>
      <p>Text summarization is an important task in natural language processing: a system is expected to understand the meaning of the documents and then produce a coherent, informative but brief summary of the original documents within a limited length. The main approaches to text summarization fall into two categories: extractive and generative. Most extractive summarization systems extract parts of the document (a few sentences or a few words) that are deemed interesting by some metric (e.g., inverse document frequency) and join them to form a summary. Conventionally, sentence selection relies on feature engineering, extracting surface statistics (e.g., TF-IDF cosine similarity) to compare sentences with the query and document representations.</p>
      <p>Recently, distributed semantic vector representations for words and sentences have achieved overwhelming success in the summarization area [KMTD14, KNY15, YP15], since they convert high-dimensional, sparse linguistic data into dense semantic vectors of manageable dimension. This makes it more straightforward for generic summarization to compute similarity (or relevance, to some extent) and facilitates semantic calculation. Inspired by the successful word2vec model [MCCD13, MSC+13], the Paragraph Vector (PV) model [LM14] (where the paragraph can be a sentence, a paragraph or a document) predicts the next word given the sequential word context and the current paragraph representation. It inherits the semantic representation and efficiency of word2vec, and further captures word order for sentence representation. Moreover, sentence vectors can benefit summaries since they directly characterise the relevance between queries and candidate sentences.</p>
      <p>However, most sentence embedding models [LM14, YP15] are trained on a prediction task about a word in the sentence. In these models, sentences are learnt independently via their local word content, which often omits the coherence relationship between sentences. A summarization system focuses more on comprehensive attributes of sentences, such as sentence coherence, sentence topic and sentence representation. Utilizing conventional sentence vectors may therefore neglect the coherence between candidate sentences as well as the sentence topics. Although models incorporating topics into word embeddings, such as TWE [LLCS15], have achieved successful results in some NLP tasks, very little work at the sentence level focuses on representing sentences with topics. For example, consider a user query that emphasises the possible plans, progress and problems of hydroelectric projects. The query contains several topics: “plans”, “progress”, “problems” and “hydroelectric projects”. Nevertheless, ordinary vector-based models can retrieve only relevant sentences that emphasise one or two aspects of the query; it is problematic to capture all the aspects of the query.</p>
      <p>In order to tackle these problems, we propose a novel sentence embedding learning framework, called the Topical Sentence Embedding (TSE) model, which enhances sentence representation by incorporating multi-topic semantics for the summarization task. Gaussian distributions are utilised to model the mixture centralities of the embedding space, which capture a prior topic preference for sentence prediction. In addition, instead of training to predict words in the document, our proposed model represents a sentence by predicting the next sentence, jointly training on the words in the current sentence and the topic of the sentence.</p>
      <p>The rest of this paper is organized as follows. Section 2 summarizes the basic embedding models and summarization systems. We then introduce the new summarization framework in Section 3; in particular, the novel TSE model is proposed in Section 3.2. Section 4 reports the experimental results and the corresponding analysis. Finally, we conclude the paper.</p>
    </sec>
    <sec id="sec-3">
      <title>Background and Related Work</title>
      <p>We first introduce the Word2Vec and PV models to review the basic framework for training embedding models for words and sentences.</p>
      <p>Word2Vec: The basic assumption behind Word2Vec [MCCD13] is that words that co-occur have similar representations in the semantic space. To this end, a sliding window is employed on the input text stream, where the central word is the target word and the others are contexts. Word2Vec comprises two models: the CBOW and the Skip-gram model. CBOW aims at predicting the target word using the context words in the sliding window. The objective of CBOW is to maximize the average log probability

L = (1/D) Σ_{i=1}^{D} log Pr(w_i | C; W)    (1)

where w_i is the target word, C is the word context, W is the word matrix, and D is the corpus size. Different from CBOW, Skip-gram aims to predict the context words given the target word; we omit the details of this approach here.</p>
      <p>Paragraph Vector (PV): PV [LM14] is an unsupervised algorithm that learns fixed-length semantic representations from variable-length texts, and it follows the same prediction task as Word2Vec. The only change is that the prediction uses a vector concatenated from W and S, where S is a sentence matrix, instead of W alone. The PV model is a strong sentence model, and it is widely applied in learning representations for sequential data.</p>
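      <p>As a concrete illustration of these two background models, the following sketch trains CBOW word vectors and PV sentence vectors with the gensim library; the toy corpus and all parameter values are illustrative assumptions, not settings from this paper.</p>
      <preformat>
# Sketch: training Word2Vec (CBOW) and Paragraph Vector (PV) with
# gensim on a toy corpus. Hypothetical data; parameters illustrative.
from gensim.models import Word2Vec
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

corpus = [
    ["hydroelectric", "projects", "face", "funding", "problems"],
    ["the", "dam", "construction", "plan", "makes", "progress"],
]

# CBOW (sg=0) maximizes the average log-probability of each target
# word given its sliding-window context, as in Eq. (1).
w2v = Word2Vec(corpus, vector_size=100, window=5, sg=0, min_count=1)

# PV additionally trains one vector per sentence that joins the word
# context in the prediction task; dm=1 is the distributed-memory form.
tagged = [TaggedDocument(words, [i]) for i, words in enumerate(corpus)]
pv = Doc2Vec(tagged, vector_size=100, window=5, dm=1, min_count=1)

word_vec = w2v.wv["projects"]   # embedding of a word
sentence_vec = pv.dv[0]         # embedding of sentence 0
      </preformat>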
      <p>Work on extractive summarization spans a large range of approaches. Most existing systems [Gal06, YGVS07] use a ranking model to select the sentences with the highest scores to form the summary. However, multi-document texts often describe one central topic along with several sub-topics, which cannot be captured by a ranking model alone. We therefore focus on how to rank sentences while also covering topics.</p>
      <p>A variety of features has been defined to measure relevance, including TF-IDF cosine similarity [NVM06, YGVS07], cue words [LH00], topic themes [HL05], and WordNet similarity [OLLL11]. However, these features usually lack a mechanism for deep semantic understanding and thus fail to meet the query need. Since Mikolov et al. [MCCD13] proposed the efficient word embedding method, there has been a surge of work [LM14, LLCS15] on embedding models that capture linguistic regularities. Embedding models for words and sentences [KMTD14, KNY15, YP15, CLW+15] have also benefited summarization tasks from the perspective of semantic relevance computation, for example DocEmb and CNNLM. However, the aforementioned methods usually reward semantic similarity without considering topic coverage, and so fail to meet the summary need.</p>
      <p>Topic-based methods have proved successful for summarization. Parveen et al. [PRS15] proposed an approach based on a weighted graphical representation of documents obtained by topic modeling. [GNJ07] measured topic concentration in a direct manner: a sentence was considered relevant to the query if it contained at least one word from the query. These works, however, assume that the documents related to a query discuss only one topic. Tang et al. [TYC09] proposed a unified probabilistic approach to uncover query-oriented topics, together with four scoring methods to calculate the importance of each sentence in the document collection. Wang et al. [WLZD08] proposed a multi-document summarization framework (SNMF) based on sentence-level semantic analysis and symmetric non-negative matrix factorization; the symmetric matrix factorization has been shown to be equivalent to normalized spectral clustering and is used to group sentences into clusters. Furthermore, several approaches that combine vector representations with topics, such as NTM [CLL+15], TWE [LLCS15] and GMNTM [YCT15], have exploited the benefits of both semantic representation and explicit topics. This motivates us to investigate such cooperative models for our summarization system.</p>
    </sec>
    <sec id="sec-3b">
      <title>The Framework for Query-focused Summarization</title>
      <p>Extracting salient sentences is the main task in this study. At the sentence level, sentence embedding and sentence ranking are employed to measure the relevance of candidate sentences to the user query and to extract salient summaries.</p>
      <sec id="sec-3b-1">
        <title>Topic Vectorization by GMM</title>
        <p>Let K be the number of topics, V the dimension of the vectors, and W the word dictionary. S denotes the sentence collection, in which s is one of the sentences, and vec(Ts) denotes the topic vector of sentence s. The vectors of sentences and words are represented as vec(s) ∈ R^V and vec(w) ∈ R^V. π_k ∈ R, µ_k ∈ R^V and Σ_k ∈ R^{V×V}, with Σ_{k=1}^{K} π_k = 1, denote the mixture weights, means and covariance matrices, respectively. The parameters of the GMM are collectively represented by λ = {π_k, µ_k, Σ_k}, k = 1, · · · , K. Given the collection of parameters, we use

P(x | λ) = Σ_{k=1}^{K} π_k N(x | µ_k, Σ_k)    (2)

to represent the probability distribution for sampling a vector x from the GMM.</p>
        <p>Subsequently, we can infer the posterior probability distribution over topics. For each sentence s, the posterior distribution of its topic is

q(z_s = k) = π_k N(vec(s) | µ_k, Σ_k) / Σ_{k′=1}^{K} π_{k′} N(vec(s) | µ_{k′}, Σ_{k′})    (3)

Based on this distribution, the topic of sentence s can be vectorized as vec(Ts) = [q(z_s = 1), q(z_s = 2), · · · , q(z_s = K)].</p>
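        <p>As an illustration, the topic vectorization above can be sketched with scikit-learn's GaussianMixture, whose predict_proba method returns exactly the posterior of Eq. (3), so each row serves as the topic vector vec(Ts); the sentence embeddings and the number of topics below are placeholder assumptions.</p>
        <preformat>
# Sketch of Eqs. (2)-(3): fit a GMM over sentence embeddings and read
# off posterior topic distributions as topic vectors. Placeholder data.
import numpy as np
from sklearn.mixture import GaussianMixture

K = 10                              # number of topics (illustrative)
S_vecs = np.random.randn(1000, 50)  # stand-in sentence embeddings

gmm = GaussianMixture(n_components=K, covariance_type="full")
gmm.fit(S_vecs)                     # estimates lambda = {pi_k, mu_k, Sigma_k}

# Row s of predict_proba is [q(z_s = 1), ..., q(z_s = K)], i.e. vec(Ts).
topic_vectors = gmm.predict_proba(S_vecs)
assert np.allclose(topic_vectors.sum(axis=1), 1.0)
        </preformat>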
      <sec id="sec-3-1">
        <title>Generative Sentence Embedding</title>
        <p>The assumption of the TSE is that sentences are
coherent and associated with their neighbours. Consequently,
we model one sentence as a prediction task based on
semantic structure of the previous sentences. The semantic is
represented by collaborating sentence topic, sentence
representation and its content. The Negative Sampling (NEG)
method is applied in [MCCD13] which is an efficient
approximation method. Therefore, we carry on the similar
estimation schema in our model.</p>
        <p>Definition 1. Label ls!: A label of sentence s is 1 or 0. The
#
label of positive sample is 1, the label of negative samples
are 0. For ∀s# ∈ S,
the document collection. Wang et al. [WLZD08] propose
a new multi-document summarization framework (SNMF)
based on sentence-level semantic analysis and symmetric
non-negative matrix factorization. The symmetric matrix
factorization has been shown to be equivalent to
normalized spectral clustering and is used to group sentences into
clusters. Futhermore, several approaches incorporate
vector representations with topics , such as NTM [CLL+15],
TWE [LLCS15] and GMNTM [YCT15], have collaborated
both benefits of semantic representation and classified
topics. This motivates us to investigate the cooperation models
for summarization system.
3</p>
        <p>The Framework for Query-focused
Summarization
Extracting salient sentences is the main task in this study.
At sentence level, the sentence embedding and sentence
ranking are utilised to enable sentence relevance to the user
queries and extract salient summaries.
3.1</p>
      </sec>
      <sec id="sec-3-2">
        <title>The Proposed TSE Model</title>
        <p>Inheriting the superiority of the PV model that constructs a
continuous semantic space, the novel architecture of
learning sentence representation, called TSE model, as shown in
the Figure 1.
(3)
(4)
(6)
ls(s) =
#
$ 1, s# = s;</p>
        <p>0, s# ̸= s;</p>
        <p>Let Xs be a concatenation of given information of
current sentence for predicting the next sentence, s, s′
be the current sentence. Xs = vec(Ts′ ) ⊕ vec(s′) ⊕
vec(w1)⊕, · · · , ⊕vec(wm). We incorporate the vectors as
the input, which includes topics, sentence embedding, and
its content of words.</p>
        <p>Given the collection S, we show how to learn
representation of sentences and topics. In this paper, we concentrate
to exploit the latent relationship between sentences.
Subsequently, the target sentence s is predicted purely by the
information from previous sentence, namely Xs. So the
objective of TSE is to maximize the probability
G =
% g(s) = %</p>
        <p>%
s∈S
s∈S u∈{s∪s−}
p(u|Xs)
(5)
Instead of using softmax function as prediction
probability, we directly use its negative sampling
approximation. The prediction objective function of sentence s is
g(s)=&amp;s∈S p(u|X s), and the probability function is
represented as follows
p(u|Xs) =
$ σ(XsT θu), ls(u) = 1</p>
        <p>#
1 − σ(XsT θu), ls(u) = 0
#
or write as a whole
p(u|Xs) = [σ(XsT θu)]ls(u!) · [1 − σ(XsT θu)]1−ls(u!) (7)
classifier
concatenate
s
1
s*
0</p>
        <p>Context
w1
w2
w3 . . . wn-1
wn
sÿ</p>
        <p>Ts
GMM</p>
      </sec>
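          <p>A minimal sketch of this negative-sampling objective (Eqs. 6-9) follows, assuming the concatenated input Xs and the parameter vectors θ_u are given as numpy arrays; the function and variable names are ours, not the authors' implementation.</p>
          <preformat>
# Sketch of L(s, u) from Eq. (9) and its gradients, summed over the
# positive sample and the drawn negatives. Names are illustrative.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neg_loss_and_grads(x_s, theta, positive, negatives):
    """Return the NEG log-likelihood and ascent directions."""
    loss, grad_x = 0.0, np.zeros_like(x_s)
    grad_theta = {}
    samples = [(positive, 1.0)] + [(u, 0.0) for u in negatives]
    for u, label in samples:
        p = sigmoid(x_s @ theta[u])          # sigma(Xs^T theta_u)
        loss += label * np.log(p) + (1.0 - label) * np.log(1.0 - p)
        grad_theta[u] = (label - p) * x_s    # dL/dtheta_u
        grad_x += (label - p) * theta[u]     # dL/dXs
    return loss, grad_theta, grad_x
          </preformat>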
      <sec id="sec-3-3">
        <title>Topic Vectorization by GMM</title>
        <p>Let K represent the number of topics, V is the size of
vector, and W represent word dictionary. S denotes the
sentence collection, in which s is one of the sentences. Let
vec(Ts) be the topic vector of sentence s. The vectors of
sentences and words are represented as vec(s) ∈ RV and
vec(w) ∈ RV . πk ∈ R, µk ∈ RV , Σk ∈ RV ×V and
"K</p>
        <p>k=1 πk = 1 are denoted as mixture weights, means and
covariance matrices, respectively. The parameters of the
GMM are collectively represented by λ = {πk, µk, Σk},
where k = 1, · · · , K. Given the collection of parameters,
we use</p>
        <p>P (x|λ) =</p>
        <p>K
! πkN (x|µk , Σk)
k=1
(2)
where σ(x) = 1/(1 + exp(−x)) and θu ∈ RV is the
parameter of Xs.</p>
        <p>The objective function is taken log-likelihood and
defined as
L =
! ls(u) log[σ(XsT θu)]+
L(s, u) = ls(u) · log[σ(XsT θu)]+
[1 − ls(u)] · log[1 − σ(XsT θu)]
(9)</p>
      </sec>
      <sec id="sec-3-4">
        <title>Parameters Estimation</title>
        <p>The parameters {λ, θ_u, Xs}, where λ = {π_k, µ_k, Σ_k}, are estimated jointly by maximizing the likelihood of the objective function. A two-phase iterative process is conducted.</p>
        <p>Given {θ_u, Xs}, stochastic gradient descent (SGD) is adopted to update the parameters of the GMM. Given λ, the gradient of θ_u is calculated using back-propagation based on the objective in Eq. 9.</p>
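        <p>The two-phase iteration can be sketched as below, reusing neg_loss_and_grads from the sketch above; note that sklearn's EM-based fit stands in here for the SGD update of the GMM, and all sizes and rates are illustrative assumptions.</p>
        <preformat>
# Sketch of the alternating estimation of {lambda} and {theta, Xs}.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
num_s, dim = 100, 50
X = rng.normal(size=(num_s, dim))                  # stand-in inputs X_s
theta = {u: rng.normal(size=dim) for u in range(num_s)}
lr = 0.05

for it in range(5):
    # Phase 1: given {theta, X}, re-estimate the GMM parameters lambda
    # (EM fit used here as a stand-in for the paper's SGD update).
    gmm = GaussianMixture(n_components=5).fit(X)
    # Phase 2: given lambda, update theta and X via the Eq. (9) gradients.
    for s in range(num_s):
        negs = [int(u) for u in rng.integers(0, num_s, size=3) if u != s]
        _, g_theta, g_x = neg_loss_and_grads(X[s], theta, s, negs)
        for u, g in g_theta.items():
            theta[u] += lr * g                     # ascent on log-likelihood
        X[s] += lr * g_x
        </preformat>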
      </sec>
      <sec id="sec-3-5">
        <title>Sentence Ranking</title>
        <p>Sentence ranking aims to measure the relevant sentences
with consideration of query information. In this paper,
relevance ranking of sentences primarily relys on
semantic vector-based cosine similarity [KMTD14] that is a
promising measure to compute relatedness for
summarization. Additionally, statistics features (i.e., TFIDF score
[NVM06]). In summary, the ranking score is formulated
as:
Score(S) = α
nw
! T F IDF (wt) + βsim(vec(s), vec(Q))
t=1
+ γsim(vec(Ts), vec(TQ))
(10)
where Q is the query, sim(·) represents the function to
compute similarity, and we use cosine similarity in this
paper. α, β and γ are parameters in the summarization
system.
4</p>
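        <p>A sketch of the ranking score of Eq. (10) follows, assuming precomputed TF-IDF weights, sentence and query embeddings, and GMM topic vectors; all names and inputs are illustrative.</p>
        <preformat>
# Sketch of Eq. (10): combine a surface TF-IDF score with embedding
# and topic-vector cosine similarities. Inputs assumed precomputed.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def score(sent_words, sent_vec, sent_topic, query_vec, query_topic,
          tfidf, alpha, beta, gamma):
    surface = sum(tfidf.get(w, 0.0) for w in sent_words)
    return (alpha * surface
            + beta * cosine(sent_vec, query_vec)
            + gamma * cosine(sent_topic, query_topic))
        </preformat>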
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experiments</title>
      <p>In this section, we present experiments that evaluate the performance of our method on the query-focused multi-document summarization task.</p>
      <sec id="sec-4-1">
        <title>Dataset and Evaluation Metrics</title>
        <p>In this study, we use the standard summarization benchmarks DUC2005 and DUC2006 (http://duc.nist.gov/data.html) for evaluation. DUC2005 contains 50 query-oriented summarization tasks; for each query, a relevant document cluster is assumed to be “retrieved”, containing 25-50 documents. DUC2006 also contains 50 query-oriented summarization tasks, and each query is paired with 25 documents. The task is to generate a summary from the document cluster to answer the query (in DUC, the query is also called a “narrative” or “topic”). The length of a result summary is limited to 250 words.</p>
        <p>We conduct evaluations with the ROUGE metrics [LH03]. ROUGE evaluates the quality of a summary by counting the number of overlapping units, such as n-grams, with reference summaries; ROUGE-N is essentially an n-gram recall measure.</p>
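        <p>For illustration, a minimal sketch of ROUGE-N recall follows; actual evaluations use the official ROUGE toolkit rather than this simplified version.</p>
        <preformat>
# Simplified ROUGE-N recall: clipped n-gram overlap between candidate
# and reference, divided by the reference n-gram count.
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(candidate, reference, n=2):
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum(min(cand[g], ref[g]) for g in ref)
    return overlap / max(sum(ref.values()), 1)
        </preformat>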
      </sec>
      <sec id="sec-4-2">
        <title>Baseline Models and Settings</title>
        <p>We compare the TSE model with several query-focused
summarization methods.</p>
        <p>• TF-IDF: this model uses TF-IDF [NVM06] to score words and sentences.
• Lead: takes the first sentences one by one from the documents in the collection, where the documents are ordered randomly. It is often used as an official baseline of DUC.
• LDA: this method uses Latent Dirichlet Allocation [BNJ03] to learn a topic model. After the topic model is learned, the maximum score is given to words sharing a topic with the query; the reader can refer to [TYC09] for the details.
• SNMF: this system [WLZD08] is for topic-biased summarization. It uses symmetric non-negative matrix factorization (SNMF) to cluster sentences, from which multi-coverage summary sentences are selected.
• Word2Vec: the vector representations of words are learned by the Word2Vec models [MCCD13, MSC+13]. The sentence representation is computed as the average of all word embeddings in the sentence (see the sketch after this list).
• PV: PV [LM14] learns sentence vectors based on the Word2Vec model. We use the same parameters as in our approach to calculate the sentence scores.
• TWE: TWE [LLCS15] employs LDA to refine the Skip-gram model; it learns topical word embeddings based on both words and their topics. The sentence representation is computed as the average of all word vectors in the sentence.</p>
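        <p>The averaged sentence representation used by the Word2Vec and TWE baselines can be sketched as follows; the embedding table is an assumed input.</p>
        <preformat>
# Sketch: baseline sentence vector as the mean of its word vectors.
# `word_vecs` maps a word to its embedding; unseen words are skipped.
import numpy as np

def sentence_vector(words, word_vecs, dim=100):
    vecs = [word_vecs[w] for w in words if w in word_vecs]
    if not vecs:
        return np.zeros(dim)
    return np.mean(vecs, axis=0)
        </preformat>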
      </sec>
      <sec id="sec-4-3">
        <title>Experimental Results and Analysis</title>
        <p>In this subsection, we report the experimental results and their analysis. Table 1 shows the overall summarization performance of the proposed model and the baseline models. It can be observed that our approach produces the best summaries among all compared methods in the ROUGE metrics over the two benchmark datasets, which strongly demonstrates the performance of the proposed summarization model. Impr denotes the relative improvement over the best of the nine baselines. The proposed TSE sentence embedding consistently outperforms the baselines by 0.03% to 6.35%.</p>
        <p>The experimental results validate that our proposed model, which exploits sentence similarity and topic information, can improve overall performance. Nevertheless, they do not isolate the impact of each component of the designed sentence-similarity measure. Hence, we keep the algorithm framework unchanged except for removing one group of features at a time from the sentence-ranking calculation, to investigate the importance of each element, as shown in Table 2. We calculate the percentage by which the full TSE is superior to the variant that neglects one feature, denoted as ratio 1 for the ROUGE-1 metric and ratio 2 for ROUGE-2. Ratio 1 is 3.86% and ratio 2 reaches 8.27%, which illustrates that sentence similarity computed by our proposed sentence embedding plays a consistently dominant role in the summary. By contrast, there is still room for improvement in the utilization of topics for the summary.</p>
        <p>3 https://catalog.ldc.upenn.edu/LDC2011T07</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>This work proposes a novel sentence embedding model that incorporates sentence coherence and topic characteristics into the learning process. It automatically generates distributed representations for sentences and assigns sentences semantically meaningful topics. We conduct extensive experiments on the DUC query-focused summarization datasets. Exploiting the strengths of the proposed TSE model, which facilitates sentence ranking, the system achieves competitive performance. A promising future direction is to strengthen topic optimization during sentence learning. With the assistance of semantic topics, we could extract sentence-based salient topic representations as direct summaries.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This work is supported by the National Basic Research Program of China (973 Program, Grant No. 2013CB329303) and the National Natural Science Foundation of China (Grant</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>[BNJ03] David</surname>
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Blei</surname>
            ,
            <given-names>Andrew Y.</given-names>
          </string-name>
          <string-name>
            <surname>Ng</surname>
            , and
            <given-names>Michael I.</given-names>
          </string-name>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Jordan</surname>
          </string-name>
          .
          <article-title>Latent dirichlet allocation</article-title>
          .
          <source>JMLR</source>
          ,
          <volume>3</volume>
          :
          <fpage>993</fpage>
          -
          <lpage>1022</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [CLL+15]
          <string-name>
            <surname>Ziqiang</surname>
            <given-names>Cao</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Sujian Li</given-names>
            ,
            <surname>Yang</surname>
          </string-name>
          <string-name>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Wenjie</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>and Heng</given-names>
            <surname>Ji</surname>
          </string-name>
          .
          <article-title>A novel neural topic model and its supervised extension</article-title>
          .
          <source>In Proceedings of AAAI'15</source>
          , pages
          <fpage>2210</fpage>
          -
          <lpage>2216</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [CLW+15]
          <string-name>
            <given-names>Kuan</given-names>
            <surname>Yu</surname>
          </string-name>
          <string-name>
            <surname>Chen</surname>
          </string-name>
          , Shih Hung Liu, Hsin Min Wang, Berlin Chen, and Hsin Hsi Chen.
          <article-title>Leveraging word embeddings for spoken document summarization</article-title>
          .
          <source>Computer Science</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [KMTD14]
          <article-title>Mikael Ka˚geba¨ck, Olof Mogren</article-title>
          , Nina Tahmasebi, and
          <string-name>
            <given-names>Devdatt</given-names>
            <surname>Dubhashi</surname>
          </string-name>
          .
          <article-title>Extractive summarization using continuous vector space models</article-title>
          .
          <source>In Proceedings of EACL'14</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>[LLCS15] Yang</surname>
            <given-names>Liu</given-names>
          </string-name>
          , Zhiyuan Liu, Tat Seng Chua, and
          <string-name>
            <given-names>Maosong</given-names>
            <surname>Sun</surname>
          </string-name>
          .
          <article-title>Topical word embeddings</article-title>
          .
          <source>In Proceedings of AAAI'15</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [Gal06] [GNJ07] [HL05] [KNY15] [LH00] [LH03]
          <string-name>
            <surname>Quoc</surname>
            <given-names>V.</given-names>
          </string-name>
          <string-name>
            <surname>Le</surname>
            and
            <given-names>Tomas</given-names>
          </string-name>
          <string-name>
            <surname>Mikolov</surname>
          </string-name>
          .
          <article-title>Distributed representations of sentences and documents</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <source>Computer Science</source>
          ,
          <volume>4</volume>
          :
          <fpage>1188</fpage>
          -
          <lpage>1196</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [MCCD13]
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          , Kai Chen, Greg Corrado, and
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Dean</surname>
          </string-name>
          .
          <article-title>Efficient estimation of word representations in vector space</article-title>
          .
          <source>Computer Science</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [MSC+13]
          <string-name>
            <surname>Tomas</surname>
            <given-names>Mikolov</given-names>
          </string-name>
          , Ilya Sutskever, Kai Chen, Greg Corrado, and
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Dean</surname>
          </string-name>
          .
          <article-title>Distributed representations of words and phrases and their compositionality</article-title>
          .
          <volume>26</volume>
          :
          <fpage>3111</fpage>
          -
          <lpage>3119</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [NVM06]
          <string-name>
            <given-names>Ani</given-names>
            <surname>Nenkova</surname>
          </string-name>
          , Lucy Vanderwende, and
          <string-name>
            <given-names>Kathleen</given-names>
            <surname>Mckeown</surname>
          </string-name>
          .
          <article-title>A compositional context sensitive multi-document summarizer: exploring the factors that influence summarization</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <source>In Proceedings of SIGIR'06</source>
          , pages
          <fpage>573</fpage>
          -
          <lpage>580</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [OLLL11]
          <string-name>
            <given-names>You</given-names>
            <surname>Ouyang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Wenjie</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Sujian</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>and Qin</given-names>
            <surname>Lu</surname>
          </string-name>
          .
          <article-title>Applying regression models to queryfocused multi-document summarization</article-title>
          .
          <source>Information Processing &amp; Management An International Journal</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [PRS15] [TYC09]
          <string-name>
            <given-names>Daraksha</given-names>
            <surname>Parveen</surname>
          </string-name>
          ,
          <string-name>
            <surname>Hans-Martin Ramsl</surname>
            , and
            <given-names>Michael</given-names>
          </string-name>
          <string-name>
            <surname>Strube</surname>
          </string-name>
          .
          <article-title>Topical coherence for graphbased extractive summarization</article-title>
          .
          <source>In Proceedings of EMNLP'15</source>
          , pages
          <fpage>1949</fpage>
          -
          <lpage>1954</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Jie</given-names>
            <surname>Tang</surname>
          </string-name>
          , Limin Yao, and
          <string-name>
            <given-names>Dewei</given-names>
            <surname>Chen</surname>
          </string-name>
          .
          <article-title>Multitopic based query-oriented summarization</article-title>
          .
          <source>In Proceedings of SDM'09</source>
          , pages
          <fpage>1147</fpage>
          -
          <lpage>1158</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>[WLZD08] Dingding</surname>
            <given-names>Wang</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Tao</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Shenghuo Zhu, and Chris Ding. Multi-document summarization via sentence-level semantic analysis and symmetric matrix factorization</article-title>
          .
          <source>In Proceedings of SIGIR'08</source>
          , pages
          <fpage>307</fpage>
          -
          <lpage>314</lpage>
          . ACM,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [YCT15]
          <string-name>
            <given-names>Min</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Tianyi</given-names>
            <surname>Cui</surname>
          </string-name>
          , and Wenting Tu.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <article-title>Ordering-sensitive and semantic-aware topic modeling</article-title>
          .
          <source>In Proceedings of AAAI'15</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [YGVS07]
          <article-title>Wen Tau Yih, Joshua Goodman, Lucy Vanderwende, and Hisami Suzuki. Multi-document summarization by maximizing informative content-words</article-title>
          .
          <source>In Proceedings of IJCAI'07</source>
          , pages
          <fpage>1776</fpage>
          -
          <lpage>1782</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [YP15]
          <string-name>
            <given-names>Wenpeng</given-names>
            <surname>Yin</surname>
          </string-name>
          and
          <string-name>
            <given-names>Yulong</given-names>
            <surname>Pei</surname>
          </string-name>
          .
          <article-title>Optimizing sentence modeling and selection for document summarization</article-title>
          .
          <source>In Proceedings of IJCAI'15</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <given-names>M.</given-names>
            <surname>Galley</surname>
          </string-name>
          .
          <article-title>A skip-chain conditional random field for ranking meeting utterances by importance</article-title>
          .
          <source>In Proceedings of EMNLP'07</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <given-names>Surabhi</given-names>
            <surname>Gupta</surname>
          </string-name>
          , Ani Nenkova, and
          <string-name>
            <given-names>Dan</given-names>
            <surname>Jurafsky</surname>
          </string-name>
          .
          <article-title>Measuring importance and query relevance in topic-focused multi-document summarization</article-title>
          .
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <source>In Proceedings of SIGIR'05</source>
          , pages
          <fpage>202</fpage>
          -
          <lpage>209</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <given-names>Hayato</given-names>
            <surname>Kobayashi</surname>
          </string-name>
          , Masaki Noguchi, and
          <string-name>
            <given-names>Taichi</given-names>
            <surname>Yatsuka</surname>
          </string-name>
          .
          <article-title>Summarization based on embedding distributions</article-title>
          .
          <source>In Proceedings of EMNLP'15</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <surname>Chin Yew Lin</surname>
            and
            <given-names>Eduard</given-names>
          </string-name>
          <string-name>
            <surname>Hovy</surname>
          </string-name>
          .
          <article-title>The automated acquisition of topic signatures for text summarization</article-title>
          .
          <source>In Proceedings of COLING'00</source>
          , pages
          <fpage>495</fpage>
          -
          <lpage>501</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <surname>Chin Yew Lin</surname>
            and
            <given-names>Eduard</given-names>
          </string-name>
          <string-name>
            <surname>Hovy</surname>
          </string-name>
          .
          <article-title>Automatic evaluation of summaries using n-gram co-occurrence statistics</article-title>
          .
          <source>In Proceedings of ACL'03</source>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>