<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Automatic Related Work Section Generation by Sentence Extraction and Reordering</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Zekun Deng</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Zixin Zeng</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Weiye Gu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jiawen Ji</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bolin Hua</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Information Management, Peking University</institution>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The related work section is essential in a scientific publication, for it elaborates on past studies relevant to the topic in comparison with the current one. The automatic generation of related work sections in scientific papers is a meaningful yet challenging task. While prior works have achieved encouraging results, they have not fully addressed the issue of informativeness or the difficulty of obtaining citation sentences due to publication delay. In this paper, we introduce SERGE, a novel and effective system for generating descriptive related work sections automatically by sentence extraction and reordering. Our system first employs a BERT-based ensemble model to select the most salient sentences in reference papers, and then uses a similar model to reorder these sentences for better readability. Automatic evaluation results show that SERGE significantly outperforms existing baselines on ROUGE metrics, gaining an improvement of 18% to 56% on recall and 4% to 33% on F-score. Human evaluation shows that SERGE gains a higher informativeness score than both the human-written gold standard and the baseline, indicating its ability to provide valuable information that matches the real interest of researchers. In contrast to existing methods, our system is free from the delayed citation problem and yields high informativeness, and thus shows great potential for various applications.</p>
      </abstract>
      <kwd-group>
        <kwd>Related work section</kwd>
        <kwd>Literature review</kwd>
        <kwd>Scientific documents</kwd>
        <kwd>Summarization</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Scientific papers usually contain a related work section, which is also known as a
literature review. It summarizes previous works relevant to the research topic in
order to establish the link between existing knowledge and new findings[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Very
often, authors of scientific papers cite existing papers in this section to show
the appropriateness of their research question, to justify their adopted methods,
and/or to present the creativeness and superiority of their ideas. However, it is
quite challenging to produce a high-quality related work section, since it involves
identifying crucial points from long papers and reorganizing them in a
neat and logical way.
      </p>
      <p>
        It is generally accepted that there are two distinct styles of literature reviews:
descriptive and integrative[
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. Descriptive literature reviews focus on
individual papers and provide more detailed descriptions of the methods, results, and
implications of each study. They present previous research with high accuracy
and are thus more objective and rigorous. In contrast, integrative literature
reviews focus more on the synthesis of ideas. Although they include fewer details of
individual studies, integrative literature reviews provide more high-level critical
summaries of topics and are thus more condensed and structurally complex[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        In this paper, we focus on the generation of descriptive related
work sections. On this matter, Cohan and Goharian[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] have proposed a sentence
ranking algorithm that takes advantage of citation context to summarize
scientific papers. Abura'ed et al.[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] have proposed a citation-based summarizer for
scientific documents based on supervised learning and achieved competitive
results in the CLScisumm-17 challenge. However, most existing studies require
the citing sentences (a.k.a. "citances") of citing publications as inputs. Thus, these
strategies are limited by publication delay: a new publication is often
not widely recognized and cited within a short period of time, and it is therefore
quite hard to obtain citing sentences that mention it.
      </p>
      <p>To this end, we propose a novel method for the automatic
generation of descriptive related work sections in scientific papers by extracting salient
sentences from scientific literature and rearranging them into a logical order. In
contrast to most existing methods, which suffer from the citation delay problem, our
method does not require any citances, making it applicable
even when no citation data is available.</p>
      <p>The main contributions of this paper are as follows:
1. We propose a novel and effective approach to automatic descriptive related
work section generation based on extractive document summarization
techniques, including sentence extraction and reordering.
2. Our method does not need any citation data, so it neither suffers from
the delay of citing publications nor requires citation data as input. This
characteristic makes the method suitable for a wider range of applications.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Automatic related work section generation is a special case of multi-document
summarization tailored for scientific articles[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Multi-document summarization
can be either extractive or abstractive, depending on whether the summary
contains sentences from source articles[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Partly due to the scarcity of training data
and computational challenges, a large proportion of previous research is in
the extractive track, which typically consists of a sentence classification
sub-task and a sentence reordering sub-task[
        <xref ref-type="bibr" rid="ref7 ref8 ref9">7–9</xref>
        ]. Common approaches for extracting
relevant sentences include graph-based ranking algorithms[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and neural
classification models[
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Subsequently, extracted sentences are reordered based on
heuristic criteria or neural architectures with sentence ordering mechanisms[
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ].
      </p>
      <p>
        Automatic related work generation differs from summarization of generic
texts in the following aspects: 1) summarization of generic texts often focuses on
the content of source papers, whereas the related work section should also
delineate contributions and limitations of reference papers (i.e., cited papers) as
well as the relationship between reference papers and the current paper; 2)
compared to generic texts, scientific articles contain more domain-specific concepts
and technical terms, which poses great challenges for language modeling; and
3) scientific articles are more structured than generic texts and reference prior
research[
        <xref ref-type="bibr" rid="ref14 ref8">8, 14</xref>
        ]. Accordingly, various dedicated approaches have been put forward.
For instance, Jaidka, Khoo and Na[
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] proposed a literature review generation
framework that imitates human writing behavior. Many other algorithms are
based on citing behavior in scientific articles. Hu and Wan[
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] used Probabilistic
Latent Semantic Analysis (pLSA) to rank sentences from a set of reference
papers. Chen and Zhuge[
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] analyzed citation sentences to detect common facts,
which were used to find relevant sentences. More recently, Saggion, Shvets and
Bravo[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] exploited a pointer-generator architecture with a copy-attention technique
and a coverage mechanism to produce descriptive related work sections.
      </p>
      <p>
        However, we believe prior research on related work generation, with ROUGE
scores as the most popular evaluation metric, has not fully addressed the issue
of informativeness: the property of conveying useful information[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. This can
be problematic because the ROUGE metric may penalize summaries
that contain relevant sentences not included in the gold standard summary[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
Also, as discussed in Section 1, most previous methods are
citation-based and thus infeasible in the case of delayed citation. Therefore, a novel method
is proposed in this paper to tackle these problems.
      </p>
    </sec>
    <sec id="sec-3">
      <title>System Design</title>
      <sec id="sec-3-1">
        <title>Overall Architecture</title>
        <p>Here, we introduce our system, SERGE, which stands for "Sentence extraction
and rEordering based Related work section GEnerator". The overall
architecture of SERGE is briefly illustrated in Fig. 1. Given a set of papers from which
a related work section will be automatically generated, the system takes the full
text of these papers as input and generates a descriptive related work section
covering all these papers as output. SERGE consists of two main parts: a
classification model and a reordering model. The classification model is used
to determine whether a sentence is sufficiently salient to be included
in the generated output. For each input sentence, the classification model
generates a probability value indicating the salience of the sentence. Then, the
sentences with the highest probability values are fed into the reordering model,
which determines their best order (sequence of sentences). Lastly, the sentences,
sorted by the reordering model into the most sensible order, are modified with
citation tags and proper pronouns, forming the final output of the system.</p>
        <p>
          The task of the classification model is formally stated as follows: given a sentence s,
the model is supposed to assign a label <inline-formula><tex-math>Y(s) \in \{0, 1\}</tex-math></inline-formula> to the sentence, or rather, to
compute a probability value <inline-formula><tex-math>y(s) \in [0, 1]</tex-math></inline-formula> indicating the salience of sentence s.
To accomplish the task, we adopt an ensemble model which consists of two
sub-models: a deep neural network model and a bag-of-words (BOW) sentence
matching model. The input sentence s is fed into both models simultaneously.
The architecture of the deep neural network model follows Google's original
BERT paper[
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], as illustrated in Fig. 2. The input sequence s is first processed
by BERT with pre-trained parameters. Then, the final hidden vector of the
BERT model <inline-formula><tex-math>C \in \mathbb{R}^{H}</tex-math></inline-formula>, corresponding to the first input token ([CLS]), is fed
into a classification layer on top. The classification layer computes a vector
<inline-formula><tex-math>Z = \mathrm{softmax}(C W^{T})</tex-math></inline-formula>, where <inline-formula><tex-math>W \in \mathbb{R}^{2 \times H}</tex-math></inline-formula>. Here, W is learnable, and <inline-formula><tex-math>Z = (z_0, z_1)</tex-math></inline-formula>
is a 2-dimensional vector, where <inline-formula><tex-math>z_i</tex-math></inline-formula> is the estimated probability that the true label
of the input sequence is i (i = 0, 1).
        </p>
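        <p>The classification layer described above can be illustrated with a minimal sketch in plain Python. The vector C and the matrix W below are toy values with H = 3, chosen only for illustration; in the system itself C would come from BERT and W would be learned. The function names are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(c, w):
    """Return Z = softmax(C W^T) for a hidden vector c of length H
    and a weight matrix w with 2 rows of length H."""
    logits = [sum(ci * wi for ci, wi in zip(c, row)) for row in w]
    return softmax(logits)


# Toy values (hypothetical, H = 3); not taken from the paper.
C = [0.5, -1.0, 2.0]
W = [[0.1, 0.2, -0.3],   # row producing the logit for label 0
     [0.4, -0.1, 0.6]]   # row producing the logit for label 1
z0, z1 = classify(C, W)  # z1 plays the role of the salience probability
```
]]></preformat>
        <p>The two outputs always sum to 1, so z1 alone is enough to rank sentences by salience.</p>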
        <p>[Fig. 2: neural classification model. Labels recoverable from the figure: input s; tokenization and embedding; final hidden vector C; linear layer; output Z.]</p>
          <sec id="sec-3-1-2">
          <title>Tokenization &amp; Add embedding</title>
          <p>
            The training of the model requires annotated data pairs. However, due to the
lack of a suitable training corpus, we opt to annotate a new dataset automatically
by leveraging the ScisummNet corpus[
            <xref ref-type="bibr" rid="ref14">14</xref>
            ], a large annotated dataset containing
1000 ACL Anthology papers with their citations. For each paper in the corpus,
the dataset includes its full text and incoming citation sentences. Based on the
generally agreed assumption that citation sentences usually underscore the most
important aspects of the cited paper and highlight its key contributions, we
use the citation sentences of a paper in the corpus to produce a gold label of
whether a sentence in the paper is salient. An algorithm similar to that of Nallapati et
al.[
            <xref ref-type="bibr" rid="ref11">11</xref>
            ] is applied to annotate the label of each sentence in a paper, as
follows: (1) Join all citation sentences of the paper together to form
a benchmark paragraph P*, and create an empty paragraph P.
(2) Select and append to P the sentence from the abstract or conclusion
of the paper that has not been appended to P before and that maximizes the
ROUGE score between the updated paragraph P and P*. (3) Repeat Step
2 until the ROUGE score no longer increases. (4) Label all the sentences
included in P as 1 (salient) and all others as 0 (not salient).
          </p>
          <p>By employing the greedy annotation algorithm described above, we obtain
an annotated dataset that can be used to train our neural classification model.
The dataset includes 11954 training samples, of which 3541 are positive.</p>
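          <p>The greedy annotation procedure can be sketched as follows. This is only an illustration under stated assumptions: the paper does not say which ROUGE variant is used, so a simplified ROUGE-1 F-score on whitespace tokens stands in for it, and the function names are hypothetical. The candidate sentences passed in are assumed to be those of the abstract or conclusion.</p>
          <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def rouge1_f(candidate, reference):
    """Simplified ROUGE-1 F-score on whitespace tokens (an assumption;
    the ROUGE variant used in the paper is not stated)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    p = overlap / sum(cand.values())
    r = overlap / sum(ref.values())
    return 2 * p * r / (p + r)


def greedy_label(sentences, citances):
    """Label each candidate sentence 1 (salient) or 0, following steps (1)-(4)."""
    benchmark = " ".join(citances)      # step 1: benchmark paragraph
    selected, best = [], 0.0            # step 1: empty paragraph
    while True:
        gains = [(rouge1_f(" ".join(sentences[j] for j in selected + [i]),
                           benchmark), i)
                 for i in range(len(sentences)) if i not in selected]
        if not gains:
            break
        score, idx = max(gains)         # step 2: best remaining sentence
        if score <= best:               # step 3: stop when ROUGE stops rising
            break
        selected.append(idx)
        best = score
    # step 4: selected sentences are salient, all others are not
    return [1 if i in selected else 0 for i in range(len(sentences))]
```
]]></preformat>
          <p>For example, with the candidates "we propose a graph model", "the weather is nice" and "results improve on rouge", and the citances "they propose a graph model" and "their results improve on rouge", the first and third candidates are labeled 1 and the second is labeled 0.</p>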
          <p>
            The bag-of-words sentence matching model simply checks whether the input
sentence s contains any of the words in a curated feature word set B. We
manually choose the words in B according to the findings of
Shin[
            <xref ref-type="bibr" rid="ref18">18</xref>
            ], who proposed a dictionary for detecting innovative points in academic
literature. Examples of words in B are "novel", "propose", and "improve".
Denote the predicted label of this BOW model as r(s); then r(s) = 1 if and only if
s contains at least one word in B, and r(s) = 0 otherwise.
          </p>
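          <p>A minimal sketch of the BOW matcher is given below. The word set contains only the three example words quoted above, since the full set B is not listed, and exact lowercase token matching is an assumption (the text does not say whether stemming or lemmatization is applied).</p>
          <preformat preformat-type="code"><![CDATA[
```python
import re

# Illustrative subset only: the three example words named in the text.
# The full curated set B is not given.
FEATURE_WORDS = {"novel", "propose", "improve"}


def bow_label(sentence, feature_words=FEATURE_WORDS):
    """Return r(s): 1 if the sentence contains at least one feature word, else 0."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    return 1 if tokens & feature_words else 0
```
]]></preformat>
          <p>Under exact matching, inflected forms such as "proposes" are not recognized, so a practical word set would need to list them or normalize tokens first.</p>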
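          <p>The combined output of the ensemble model is defined by the following
equations:</p>
          <disp-formula><label>(1)</label><tex-math><![CDATA[y(s) = \begin{cases} 0, & \dots \\ \dots \end{cases}]]></tex-math></disp-formula>
          <disp-formula><label>(2)</label><tex-math><![CDATA[Y(s) = \begin{cases} 0, & 0 \le y(s) \le 0.5 \\ 1, & 0.5 < y(s) \le 1 \end{cases}]]></tex-math></disp-formula>
          <p>
            Essentially, Eq. 1 considers a trade-off between precision and recall. By setting
L = 0.2 and H = 0.4, the ensemble model achieves the most desirable overall
performance, with a precision of 0.622 and a recall of 0.793.
          </p>
          <p>
            The task of the reordering model is formally stated as follows: given a set of
sentences <inline-formula><tex-math>S = (s_1, s_2, \dots, s_n)</tex-math></inline-formula>, the model is supposed to find their optimal
arrangement <inline-formula><tex-math>s_{i_1}, s_{i_2}, \dots, s_{i_n}</tex-math></inline-formula> (<inline-formula><tex-math>i_r \neq i_t, \forall r \neq t;\ i_k, k, r, t \in \{1, 2, \dots, n\}</tex-math></inline-formula>), for which the
probability of the sequence <inline-formula><tex-math>P(s_{i_1}, s_{i_2}, \dots, s_{i_n})</tex-math></inline-formula> is maximized. However, given the
sheer scale of the problem, it is practically impossible to solve it
directly. Therefore, we decompose the problem into much smaller ones using
a method similar to that of Chen et al.[
            <xref ref-type="bibr" rid="ref12">12</xref>
            ]. By the definition of conditional
probability,
          </p>
          <disp-formula><label>(3)</label><tex-math>P(s_{i_1}, s_{i_2}, \dots, s_{i_n}) = P(s_{i_1}) \prod_{k=2}^{n} P(s_{i_k} \mid s_{i_1}, s_{i_2}, \dots, s_{i_{k-1}})</tex-math></disp-formula>
          <p>To simplify the calculation, let
          </p>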
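          <disp-formula><label>(4)</label><tex-math>P(s_{i_k} \mid s_{i_1}, s_{i_2}, \dots, s_{i_{k-1}}) = P(s_{i_k} \mid s_{i_{k-1}})</tex-math></disp-formula>
          <p>where <inline-formula><tex-math>k \in \{2, 3, \dots, n\}</tex-math></inline-formula>. We also assume <inline-formula><tex-math>P(s_{i_1}) = 1</tex-math></inline-formula>. Thus, Eq. 3 becomes</p>
          <disp-formula><label>(5)</label><tex-math>P(s_{i_1}, s_{i_2}, \dots, s_{i_n}) = \prod_{k=2}^{n} P(s_{i_k} \mid s_{i_{k-1}})</tex-math></disp-formula>
          <p>It can be seen from Eq. 5 that computing the probability of any
arrangement reduces to computing, for each ordered pair of sentences, the conditional
probability that one sentence appears directly after the other, which reduces the complexity of the
problem significantly.</p>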
          <p>
            We use a MobileBERT[
            <xref ref-type="bibr" rid="ref19">19</xref>
            ] based model to compute the conditional
probabilities. MobileBERT is a compact task-agnostic BERT that runs more than 5 times
faster than BERT<sub>BASE</sub> while still achieving comparable results on a variety of
benchmarks. We adopt MobileBERT instead of BERT mainly for practical
reasons: the running time of the system would become intolerable if BERT were used in real-world
applications.
          </p>
          <p>To obtain the desired output, MobileBERT is fine-tuned on the next sentence
prediction (NSP) task, which generates a probability value indicating whether
the first sentence in the input is followed by the second one in the source
document. The architecture of our NSP model is identical to that of our neural classification
model described in Section 3.2, except that the BERT layers are
replaced by MobileBERT. The training data for the NSP model is also extracted from
the ScisummNet corpus, whose writing style closely matches the expected input. To
build the dataset, we pick out every pair of neighboring sentences from the 1000
papers as positive samples, combined with roughly the same number of negative
samples in which the two sentences are not adjacent, yielding a total of 360509
training samples.</p>
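          <p>The construction of NSP training pairs can be sketched as below. This is an illustration only: the paper does not say how negative pairs are sampled, so drawing one non-adjacent sentence from the same document per positive pair is an assumption, and the function name is hypothetical.</p>
          <preformat preformat-type="code"><![CDATA[
```python
import random


def build_nsp_pairs(documents, seed=0):
    """Build (first, second, label) triples from documents given as sentence lists.

    Label 1: the second sentence directly follows the first.
    Label 0: the two sentences are not adjacent (assumed to be sampled
    from the same document; the paper does not specify this).
    """
    rng = random.Random(seed)
    pairs = []
    for doc in documents:
        for i in range(len(doc) - 1):
            pairs.append((doc[i], doc[i + 1], 1))
            # Candidate negatives: positions more than one step away from i.
            candidates = [j for j in range(len(doc)) if abs(j - i) > 1]
            if candidates:
                pairs.append((doc[i], doc[rng.choice(candidates)], 0))
    return pairs
```
]]></preformat>
          <p>Very short documents may yield a positive pair without a matching negative, which is consistent with the two classes being only roughly balanced.</p>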
          <p>Finally, to find the optimal sentence sequence, the model computes the
probability of all possible arrangements of the sentences in S using Eq. 5. To
keep the final output concise, if the classification model selects more than
n<sub>max</sub> = 3 sentences, only those with the highest predicted value y(s) are
kept, and the rest are discarded.</p>
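          <p>The selection and reordering step can be sketched as an exhaustive search over permutations scored with Eq. 5. The function names are hypothetical, and cond_prob is a stand-in for the NSP model: it is assumed to return the probability that its second argument directly follows its first.</p>
          <preformat preformat-type="code"><![CDATA[
```python
from itertools import permutations


def top_n(sentences, scores, n_max=3):
    """Keep the n_max sentences with the highest salience y(s)."""
    ranked = sorted(zip(scores, sentences), reverse=True)[:n_max]
    return [s for _, s in ranked]


def best_order(sentences, cond_prob):
    """Return the arrangement that maximizes the product in Eq. 5.

    cond_prob(a, b) is assumed to return P(b | a), for example the
    output of a next sentence prediction model.
    """
    def sequence_prob(order):
        prob = 1.0                      # P(s_i1) = 1 by assumption
        for prev, nxt in zip(order, order[1:]):
            prob *= cond_prob(prev, nxt)
        return prob

    return list(max(permutations(sentences), key=sequence_prob))
```
]]></preformat>
          <p>With at most three sentences there are only 3! = 6 arrangements to score, so exhaustive search is cheap; the cost lies in the pairwise model calls.</p>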
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Evaluation and Results</title>
      <p>
        Both automatic and human evaluation approaches are employed to assess the
performance of SERGE. Ten descriptive paragraphs from related work sections are
randomly extracted from 8 computer science papers published in journals or
proceedings. The papers cited in each paragraph are collected to form a set of
reference papers. MEAD[
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] is used as the baseline. During automatic evaluation,
SERGE and the baseline system each generate a related work section for each
set of reference papers. The outputs of the two systems are then compared with
the human-written paragraph in the published paper by computing the ROUGE
score between them.
      </p>
      <p>During human evaluation, three computer science experts are instructed to grade
the paragraphs generated by the two systems and the gold standard on three
aspects: informativeness (INF), fluency (FLU), and succinctness (SUC). The
experts are not informed of the authorship of the texts. Scores range from
1 to 5.</p>
      <p>The results of automatic evaluation are presented in Table 1. Except for the
precision of ROUGE-1, SERGE outperforms the baseline on all metrics. Notably,
our system achieves a significantly higher recall on all three ROUGE metrics,
yielding relative gains of 20%, 56% and 18%, respectively. The results of human
evaluation are presented in Table 2. SERGE gains an informativeness score of
4.23, which is about 6% higher than the baseline and 3% higher than the gold
standard. SERGE also outperforms the baseline on succinctness. In short, the
results above indicate that our system performs significantly better than the
baseline in several aspects, and even yields higher informativeness than
human-written gold standards.</p>
    </sec>
    <sec id="sec-5">
      <title>Discussion and Conclusion</title>
      <p>In this paper, we propose a novel method for the automatic generation of descriptive
related work sections in scientific papers by extracting salient sentences from past
literature and reordering them into a smooth paragraph, which is made possible
by two BERT-based neural models. The performance of our method is evaluated
by both automatic metrics and human experts. The results show that our method
gains a substantial improvement over the baseline and achieves a
high degree of informativeness comparable to that of human authors.</p>
      <p>Our method addresses the problems of existing works in two ways. First, our
method improves the informativeness of automatically generated related work
sections significantly, providing more valuable information from existing literature
that matches the real interest of researchers. Second, our method is immune
to the citation delay problem, which suggests its suitability for a wider range of
applications.</p>
      <p>There are several limitations in the current study. For example, the evaluation
is not sufficiently robust due to the high cost of human assessment. Also, our
method is not necessarily optimal in fluency and a few other metrics. These
issues leave room for future exploration.</p>
      <p>The results of this study show the effectiveness of our novel method
for related work section generation. Although the corpus used in this study is
limited to the computer science field, it would take little effort to adapt our method to other
disciplines. Considering its generality and adaptability, our method shows
considerable potential for becoming an intelligent and helpful tool that can
increase the efficiency of researchers and boost scientific innovation.</p>
      <p>In the future, we may continue to explore new methods for this task via
various paths, for instance, abstractive summarization approaches or entity
extraction. We are also interested in integrating the summarization problem with
knowledge bases, which may bring more intelligence to automatic systems.</p>
      <sec id="sec-5-1">
        <title>Acknowledgements</title>
        <p>This work was supported in part by the National Social Science Foundation of
China (Number: 17BTQ066).</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Webster</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Watson</surname>
          </string-name>
          , R.T.:
          <article-title>Analyzing the past to prepare for the future: Writing a literature review</article-title>
          .
          <source>MIS Quarterly</source>
          <volume>26</volume>
          (
          <issue>2</issue>
          ), xiii–xxiii (
          <year>2002</year>
          ), http://www.jstor.org/stable/4132319
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Jaidka</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khoo</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Na</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          :
          <article-title>Literature review writing: how information is selected and transformed</article-title>
          .
          <source>In: Aslib proceedings: New Information Perspectives</source>
          . vol.
          <volume>65</volume>
          , pp.
          <volume>303</volume>
          –
          <fpage>325</fpage>
          .
          <string-name>
            <surname>Emerald</surname>
          </string-name>
          (
          <year>2013</year>
          ). https://doi.org/10.1108/00012531311330665
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Saggion</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shvets</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bravo</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , et al.:
          <article-title>Automatic related work section generation: experiments in scientific document abstracting</article-title>
          .
          <source>Scientometrics</source>
          <volume>125</volume>
          (
          <issue>3</issue>
          ),
          <volume>3159</volume>
          –
          <fpage>3185</fpage>
          (
          <year>2020</year>
          ). https://doi.org/10.1007/s11192-020-03630-2
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Khoo</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Na</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jaidka</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Analysis of the macro-level discourse structure of literature</article-title>
          .
          <source>Online Information Review</source>
          <volume>35</volume>
          (
          <issue>2</issue>
          ),
          <volume>255</volume>
          –
          <fpage>271</fpage>
          (
          <year>2011</year>
          ). https://doi.org/10.1108/14684521111128032
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Cohan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goharian</surname>
          </string-name>
          , N.:
          <article-title>Scientific article summarization using citation-context and article's discourse structure</article-title>
          .
          <source>In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing</source>
          . pp.
          <volume>390</volume>
          –
          <issue>400</issue>
          (
          <year>2015</year>
          ). https://doi.org/10.18653/v1/
          <fpage>D15</fpage>
          -1045
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Chiruzzo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saggion</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Accuosto</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bravo</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , et al.: Lastus/taln@ clscisumm17:
          <article-title>Cross-document sentence matching and scientific text summarization systems</article-title>
          .
          <source>In: BIRNDL@ SIGIR (2)</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Teslyuk</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>The concept of system for automated scientific literature reviews generation</article-title>
          . In: Krzhizhanovskaya,
          <string-name>
            <given-names>V.V.</given-names>
            ,
            <surname>Zavodszky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            ,
            <surname>Lees</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.H.</given-names>
            ,
            <surname>Dongarra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.J.</given-names>
            ,
            <surname>Sloot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.M.A.</given-names>
            ,
            <surname>Brissos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Teixeira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J</given-names>
            . (eds.)
            <surname>Computational</surname>
          </string-name>
          <string-name>
            <surname>Science – ICCS</surname>
          </string-name>
          <year>2020</year>
          . pp.
          <volume>437</volume>
          –
          <fpage>443</fpage>
          . Springer International Publishing,
          <string-name>
            <surname>Cham</surname>
          </string-name>
          (
          <year>2020</year>
          ). https://doi.org/10.1007/978-3-
          <fpage>030</fpage>
          -50420-5_32
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Ibrahim</given-names>
            <surname>Altmami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>El Bachir Menai</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          :
          <article-title>Automatic summarization of scientific articles: A survey</article-title>
          .
          <source>Journal of King</source>
          Saud University - Computer and Information Sciences (
          <year>2020</year>
          ). https://doi.org/10.1016/j.jksuci.
          <year>2020</year>
          .
          <volume>04</volume>
          .020, https://www.sciencedirect.com/science/article/pii/S1319157820303554
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lapata</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Hierarchical Transformers for Multi-Document Summarization</article-title>
          . arXiv e-prints arXiv:
          <year>1905</year>
          .
          <volume>13164</volume>
          (May
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Erkan</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Radev</surname>
            ,
            <given-names>D.R.</given-names>
          </string-name>
          :
          <article-title>Lexrank: Graph-based lexical centrality as salience in text summarization</article-title>
          .
          <source>Journal of artificial intelligence research</source>
          <volume>22</volume>
          ,
          <fpage>457</fpage>
          –
          <lpage>479</lpage>
          (
          <year>2004</year>
          ). https://doi.org/10.1613/jair.1523
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Nallapati</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhai</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>SummaRuNNer: A recurrent neural network based sequence model for extractive summarization of documents</article-title>
          .
          <source>Proceedings of the AAAI Conference on Artificial Intelligence</source>
          <volume>31</volume>
          (
          <issue>1</issue>
          ) (Feb
          <year>2017</year>
          ), https://ojs.aaai.org/index.php/AAAI/article/view/10958
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qiu</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>Neural Sentence Ordering</article-title>
          .
          <source>arXiv e-prints</source>
          arXiv:1607.06952 (Jul
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Devlin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>M.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toutanova</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding</article-title>
          .
          <source>arXiv e-prints</source>
          arXiv:1810.04805 (Oct
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Yasunaga</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kasai</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fabbri</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Friedman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Radev</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>ScisummNet: A large annotated corpus and content-impact models for scientific paper summarization with citation networks</article-title>
          .
          In:
          <source>Proceedings of AAAI 2019</source>
          (
          <year>2019</year>
          ). https://doi.org/10.1609/aaai.v33i01.33017386
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Jaidka</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khoo</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Na</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          :
          <article-title>Deconstructing human literature reviews – a framework for multi-document summarization</article-title>
          .
          In:
          <source>Proceedings of the 14th European Workshop on Natural Language Generation</source>
          . pp.
          <fpage>125</fpage>
          –
          <lpage>135</lpage>
          . Association for Computational Linguistics, Sofia, Bulgaria (Aug
          <year>2013</year>
          ), https://www.aclweb.org/anthology/W13-2116
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Hu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wan</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>Automatic generation of related work sections in scientific papers: An optimization approach</article-title>
          .
          In:
          <source>Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>
          . pp.
          <fpage>1624</fpage>
          –
          <lpage>1633</lpage>
          . Association for Computational Linguistics, Doha, Qatar (Oct
          <year>2014</year>
          ). https://doi.org/10.3115/v1/D14-1170, https://www.aclweb.org/anthology/D14-1170
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhuge</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Automatic generation of related work through summarizing citations</article-title>
          .
          <source>Concurrency and Computation: Practice and Experience</source>
          <volume>31</volume>
          (
          <issue>3</issue>
          ), e4261 (
          <year>2019</year>
          ). https://doi.org/10.1002/cpe.4261, https://onlinelibrary.wiley.com/doi/abs/10.1002/cpe.4261, e4261 CPE-16-0462.R2
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Shin</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <source>Research on Innovative Point Identification and Mining of Academic Literature</source>
          . Master's thesis, Peking University (Jun
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Sun</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Song</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>MobileBERT: a compact task-agnostic BERT for resource-limited devices</article-title>
          .
          In:
          <source>Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</source>
          . pp.
          <fpage>2158</fpage>
          –
          <lpage>2170</lpage>
          . Association for Computational Linguistics, Online (Jul
          <year>2020</year>
          ). https://doi.org/10.18653/v1/2020.acl-main.195, https://www.aclweb.org/anthology/2020.acl-main.195
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Radev</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Allison</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blair-Goldensohn</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blitzer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Celebi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dimitrov</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Drabek</surname>
            ,
            <given-names>E.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hakim</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lam</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , et al.:
          <article-title>Mead-a platform for multidocument multilingual text summarization</article-title>
          .
          In:
          <source>Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC'04)</source>
          (
          <year>2004</year>
          ). https://doi.org/10.7916/D8MG7XZT
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>