<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Improving Scientific Article Visibility by Neural Title Simplification</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Universitat Pompeu Fabra</institution>
          ,
          <addr-line>Barcelona 08018</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
<p>The rapidly growing amount of data that scientific content providers must deliver to users makes effective recommendation tools essential. The title of an article is often the only element shown to attract people's attention. We offer an approach to automatically generating titles with various levels of informativeness to benefit different categories of users. Statistics from ResearchGate, used to bias the training datasets, and a specially designed post-processing step applied to neural sequence-to-sequence models allow reaching the desired variety of simplified titles and a trade-off between the attractiveness and transparency of a recommendation.</p>
      </abstract>
      <kwd-group>
        <kwd>Scientific Text Summarization</kwd>
        <kwd>Machine Translation</kwd>
        <kwd>Recommender Systems</kwd>
        <kwd>Personalized Simplification</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The amount of information the scientific community produces on a daily basis makes it
necessary for researchers to have proper guidance in the digital space. The function of
virtual assistance is performed by various scientometric systems, research paper
recommender systems
        <xref ref-type="bibr" rid="ref1">(Haruna et al., 2017)</xref>
        and different kinds of search engines.
        <xref ref-type="bibr" rid="ref2 ref24">(Shvets et al., 2015)</xref>
        summarize the most common types of systems for scientometric
analysis. A recent trend in scientific paper delivery is purpose-specific
web resources, blogs, and e-journals, often coupled with email subscriptions. These often
provide personalized recommendations based on users’ behavior and preferences.
      </p>
      <p>The recommendation usually takes the form of an imprint, often limited to a title alone (as
in the case of email subscriptions, with their limited space and the lack of time to
attract people’s attention). Ultimately, the success of a recommendation depends on the
informativeness of the title of an article with respect to the user’s intentions and
familiarity with a given scientific field. Hence the necessity of finding a way of
varying the title of the same paper for different categories of users.</p>
      <p>
        The focus of this paper is on developing models for creating a variety of simplified
versions of the titles of scientific articles that are condensed and informative
enough while still corresponding to the original topic of a paper, so as to
maintain users’ loyalty. We aim at supporting two scenarios of personalized
simplification: the first ensures a narrow focus on specific scientific concepts for goal-oriented
experts, and the second provides a general overview for researchers working on the
edge of a topic who are willing to expand their horizons. The second case should not be
treated as the generation of clickbaits (catchy, short, misleading headings), which are to be
blocked with efficient machine learning approaches
        <xref ref-type="bibr" rid="ref3">(Biyani et al., 2016)</xref>
        .
      </p>
      <p>
        There is a variety of algorithms that could be used for title simplification, which is a
rapidly growing research area
        <xref ref-type="bibr" rid="ref4 ref5 ref7">(Bouayad-Agha et al., 2009; Saggion et al., 2015; Guo
et al., 2018)</xref>
        . Since the defined task is similar to text compression and abstractive
summarization, we opted for encoder-decoder neural architectures
        <xref ref-type="bibr" rid="ref6 ref8">(Nallapati et al., 2016; Nikolov et al., 2018)</xref>
        .
      </p>
      <p>The remainder of the paper is structured as follows. In Section 2, we propose a
method for scientific title diversification and simplification. Section 3 describes
the datasets used for training. Section 4 presents the experimental setup.
Section 5 provides the results of numerical experiments. Section 6 is devoted to human
evaluation. Finally, in Section 7, we discuss the results and outline future work.</p>
    </sec>
    <sec id="sec-2">
      <title>Method</title>
      <p>Recent advances in neural machine translation (NMT) suggest solving the task in a
supervised manner, controlling the style of a title by conditioning the training data. The
method we propose comprises the following steps: a) selecting a subset from an
abstract-to-title dataset so as to impose conditions that force a model to generate
hypotheses with desirable properties; b) training a sequence-to-sequence (seq2seq)
model; c) applying the model to title-to-title generation; d) performing a post-processing step
to remove unnecessarily repeated tokens; e) filtering out titles with an improper structure.
The remainder of this section describes each step in detail.</p>
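      <p>As an illustration, steps (d) and (e) can be sketched as follows; the helper logic is a deliberately simplified stand-in for the method described in the rest of this section, and the function names are illustrative, not from the actual implementation.</p>
      <preformat>
```python
# Toy sketch of steps (d) and (e) of the pipeline.

def remove_repetitions(tokens):
    # step (d), simplified: keep only the first occurrence of each token
    seen, out = set(), []
    for tok in tokens:
        if tok not in seen:
            seen.add(tok)
            out.append(tok)
    return out

def shares_terms(src_tokens, hyp_tokens, k=2):
    # step (e), simplified: require at least k terms in common
    return len(set(src_tokens).intersection(hyp_tokens)) >= k

hyp = remove_repetitions("graph energy of a graph".split())
```
      </preformat>
      <p>The real steps operate on terms and multi-word noun phrases rather than raw tokens, as detailed below.</p>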
      <p>To create titles of different styles for various categories of researchers, several
datasets should be used. A set of highly popular scientific titles may help to generate
attractive headings for users whose interests are peripheral to the subject of a paper. The
condition that a multi-word noun phrase NPmw must appear in a target text serves to avoid
producing overly shortened, pointless titles. If each training example contains a
reference text Rt and a target text Tt that have similar NPmw-s (at least 2 common terms), a
model may learn to preserve the most important concepts from the original titles needed
by experts. Figure 1 shows a training example with similar NPmw-s in Rt and Tt.</p>
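      <p>The pair-selection condition can be sketched as follows; the term extractor here is a toy stop-word filter standing in for the noun-phrase chunker described in Section 3, and all names are illustrative.</p>
      <preformat>
```python
STOP = frozenset({"a", "an", "the", "of", "on", "in"})

def shared_terms(reference, target):
    # content terms occurring in both Rt and Tt (toy approximation of NPmw terms)
    ref = set(w for w in reference.lower().split() if w not in STOP)
    tgt = set(w for w in target.lower().split() if w not in STOP)
    return ref.intersection(tgt)

def keep_pair(reference, target, min_common=2):
    # the conditioning rule: at least 2 common terms between Rt and Tt
    return len(shared_terms(reference, target)) >= min_common
```
      </preformat>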
      <p>Input sequence (an abstract; lower-cased, tokenized, truncated): “the main goal of
this research is to study whether or not the order of presentation of the premises in a
logical argument form , such as a conditional reasoning task , could affect”</p>
      <p>Target sequence (a title; lower-cased, tokenized): “effects of order of presentation
on conditional reasoning”</p>
      <p>
        We chose a bidirectional LSTM
        <xref ref-type="bibr" rid="ref9">(Luong et al., 2015)</xref>
        with a copy mechanism
        <xref ref-type="bibr" rid="ref10 ref11">(Gu et
al., 2016; See et al., 2017)</xref>
        as the basic model. In particular, we used the implementation in the
OpenNMT toolkit
        <xref ref-type="bibr" rid="ref12">(Klein et al., 2018)</xref>
        with the pointer enabled, which allows copying tokens
from the reference text. The trained model is then applied to new, unseen titles, which,
unlike abstracts (cut off after 50 tokens in our experiments), are not truncated.
      </p>
      <p>
        Since the task differs from the general NMT and summarization tasks in that there is no
need to track alignment, the traditional coverage mechanism
        <xref ref-type="bibr" rid="ref13">(Wu et al.,
2016)</xref>
        , which discourages repetitions, is not included, so as not to impose potentially harmful
restrictions or to overcomplicate the model. Instead, we introduce a
post-processing step PS as follows. First, each repetition of a term is removed, leaving
only the occurrence closest to the beginning of the text. Second, all auxiliary tokens
without required terms between or after them are eliminated. Finally, we
iteratively remove the last token of the text if it is an adjective or an auxiliary token and, in
addition, capitalize the title.
      </p>
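      <p>A minimal sketch of the post-processing step PS, assuming a token list and a small auxiliary-word list; the exact term handling and part-of-speech checks in the actual system may differ.</p>
      <preformat>
```python
AUX = {"a", "an", "the", "of", "on", "for", "and", "in"}

def postprocess(tokens):
    # 1) keep only the first occurrence of each content term
    seen, out = set(), []
    for tok in tokens:
        if tok in AUX or tok not in seen:
            out.append(tok)
            seen.add(tok)
    # 2) iteratively drop trailing auxiliary tokens
    while out and out[-1] in AUX:
        out.pop()
    # 3) capitalize the title
    return " ".join(out).capitalize()
```
      </preformat>
      <p>For example, the sequence “effects of order of presentation of presentation on” would be cleaned to “Effects of order of presentation”.</p>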
      <p>The last step consists in filtering out improper titles, i.e., generated sequences that have
fewer than two NPmw-s similar to the NPmw-s of the source title. In those use cases
where even potentially pointless output is acceptable, this step can be skipped.</p>
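      <p>The filtering rule can be sketched as follows; noun-phrase extraction is simplified here to bigrams of non-function words, whereas the actual system uses a chunker (cf. Section 3).</p>
      <preformat>
```python
STOP = {"a", "an", "the", "of", "on", "for", "and", "in", "to"}

def np_bigrams(title):
    # toy NPmw extractor: adjacent pairs of content words
    words = [w for w in title.lower().split() if w not in STOP]
    return {(x, y) for x, y in zip(words, words[1:])}

def keep_title(source, generated, min_shared=2):
    # discard the generated title unless it shares at least
    # min_shared multi-word phrases with the source title
    return len(np_bigrams(source).intersection(np_bigrams(generated))) >= min_shared
```
      </preformat>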
    </sec>
    <sec id="sec-3">
      <title>Datasets</title>
      <p>We chose the ResearchGate platform (https://www.researchgate.net/) as a source of data. It has a recommender system
and therefore openly counts the number of times the page of a paper has been visited, both to
provide reasonable recommendations and to motivate authors to be more visible.</p>
      <p>
        We selected 150K imprints of articles on various topics, using a wide list of general
scientific words
        <xref ref-type="bibr" rid="ref14">(Osipov et al., 2014)</xref>
        as an entry point to the articles. Figure 2 shows
the correlation between the number of paper views Nv and the title length Lt (in
characters) in the collection. The top-viewed articles along the negative correlation formed
the desired set of highly popular titles. The whole pool of imprints formed a generic
dataset. A random split into training and validation sets (93/7) was carried out. A set of
1000 imprints with Nv = 1 and Lt &gt; 100 was used for testing the models.
The texts were pre-processed on the fly, applying language detection with langid.py
(https://github.com/saffsd/langid.py) and sentence detection with tokenization from
NLTK (https://www.nltk.org). Cleaning of the training and validation data consisted in
keeping only examples with Nv &gt; 1, at least one common term in Rt and Tt, and Lt &gt; 20.
      </p>
      <p>
        To detect noun phrases, we used the spaCy chunker
        <xref ref-type="bibr" rid="ref15">(Honnibal and Montani, 2017)</xref>
        , which
we extended to detect complex phrases that map to single concepts (e.g.,
“vertex energy of a graph”, which is a lexical variation of the concept “graph energy”).
      </p>
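      <p>The mapping of a complex phrase to a single concept can be illustrated by a content-word containment check; this is a toy criterion standing in for the extended chunker, not the system's actual matching logic.</p>
      <preformat>
```python
STOP = {"a", "an", "the", "of"}

def content_words(phrase):
    # content words of a noun phrase, ignoring function words
    return frozenset(w for w in phrase.lower().split() if w not in STOP)

def same_concept(phrase_a, phrase_b):
    # toy criterion: one phrase's content words contain the other's
    a, b = content_words(phrase_a), content_words(phrase_b)
    return a.issubset(b) or b.issubset(a)
```
      </preformat>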
    </sec>
    <sec id="sec-4">
      <title>Experiment Setup</title>
      <p>
        Selecting the first 75 characters of the reference text is generally used as a baseline in
summarization tasks; cf., e.g.,
        <xref ref-type="bibr" rid="ref16">(Rush et al., 2015)</xref>
        . We added a subsequent cut-off after
the last noun in a phrase. This improved baseline is referred to as MBase henceforth.
      </p>
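      <p>The MBase baseline can be sketched as follows; noun detection is faked here with a small lookup set, whereas a real implementation would use a part-of-speech tagger.</p>
      <preformat>
```python
NOUNS = {"order", "presentation", "premises", "reasoning", "argument"}

def mbase(text, limit=75):
    # take the first 75 characters, then cut back to the last noun
    words = text[:limit].split()
    while words and words[-1].lower() not in NOUNS:
        words.pop()
    return " ".join(words)
```
      </preformat>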
      <p>
        Several seq2seq models (M1, M2, …) with the above-described architectures, differing
in the number of layers, were applied to various datasets to bias the style of the output text.
They were then extended with the post-processing step PS (M1ps, M2ps, …) and the filtering step,
which are novel to the best knowledge of the author; cf. Table 1 for details.
For the final model assessment, we used the measures BLEU
        <xref ref-type="bibr" rid="ref17">(Papineni et al., 2002)</xref>
        ,
ROUGE-1, ROUGE-2, ROUGE-L
        <xref ref-type="bibr" rid="ref18">(Lin, 2004)</xref>
        , and the specially designed NPdiff-p, i.e., an
NPmw-based precision evaluated like rouge-L-p but considering only one occurrence of
similar NPmw-s in a hypothesis.
      </p>
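      <p>A hedged reconstruction of the NPdiff-p idea: precision over multi-word noun phrases, where an NP shared with the reference is counted only once even if the hypothesis repeats it. NP extraction is again simplified to stopword-filtered bigrams for the sake of the sketch.</p>
      <preformat>
```python
STOP = {"a", "an", "the", "of", "on", "for", "and", "in"}

def np_set(text):
    # toy NPmw extractor: set of adjacent content-word pairs
    words = [w for w in text.lower().split() if w not in STOP]
    return {(x, y) for x, y in zip(words, words[1:])}

def npdiff_p(reference, hypothesis):
    # using sets means each shared NP is counted at most once
    hyp = np_set(hypothesis)
    if not hyp:
        return 0.0
    return len(np_set(reference).intersection(hyp)) / len(hyp)
```
      </preformat>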
      <p>The intermediate models created at checkpoints during training were assessed,
and the best ones by NPdiff-p were selected as the resulting models.</p>
    </sec>
    <sec id="sec-5">
      <title>Results</title>
      <p>Most of the basic models performed reasonably: the produced titles were in general
shorter than the originals, and multi-word noun phrases from the reference title covered a
significant part of the generated title (NPdiff-p = 0.68 on average). However, some
models, especially M5, introduced many repetitions (at all checkpoints): the BLEU value
reflected this, being equal to 0.18 for M5, while the average value for the rest of the models
was equal to 0.35. Since BLEU depends on the number of identical word occurrences, its
increase by 24% on average due to PS attests to the usefulness of the step (cf. Table 2).
The filtering step allowed dropping less informative titles, so that one can take advantage
even of poor models, reducing the risk of presenting misleading eye-catching headings or
generic topics to an end user.</p>
      <p>Figure 3 shows examples of texts generated by the models MBase and M1ps+F
through M6ps+F, including eye-catching titles (“Addiction and the New Black?”, “The
Romans Know?”), generic topics (“Spain: a Focus”, “Consumer Loyalty”, “Financial
Cooperation”), and final informative titles (“Active Learning for Biomedical Data
Classification”, “Access to Specialist Medical Services: a Pilot Study”).</p>
      <p>The extension of the basic models led to an increase of NPdiff-p by 9% and of
rouge-L-f by 11% on average. Table 3 gives an idea of the variation of the titles of
different models in style and in compression rate.
It is worth noting that the 1-layer models M1 and M2, trained on conditioned datasets,
reached higher values for the majority of measures in comparison to the models M3 and
M4, fed with generic data. This highlights the rationality of pre-directing the training.</p>
    </sec>
    <sec id="sec-6">
      <title>Human Evaluation</title>
      <p>For human evaluation, we selected five papers of an NLP research group (TALN,
UPF) with titles longer than 93 characters (10-18 words). Their authors, who hold
Ph.D. degrees, were asked to rank the output titles for these papers, including the original title,
by their preference for clicking if they saw the title briefly in a daily email digest. To cover
different decision criteria, the assessors worked both with papers of their own authorship (to
simulate expert behavior) and with papers of their colleagues (the expanding-horizons use
case). If some titles in a set were identical, or the assessors had no preference
between two similar titles, they were allowed to rank them equally. The top models
sorted by average rank and examples of titles from one set are listed in Table 4.
The noted final increase of NPdiff-p and rouge-L-f indicates that common subsequences
became longer relative to the length of the titles, meaning that the offered post-processing
step with filtering plays an important role in forming fluent text. At the same time,
the output should not simply be one of the original subsequences; therefore, we
did not aim at reaching excessively high precision values.</p>
      <p>Pure state-of-the-art seq2seq models without the post-processing step received low ranks in
the human evaluation. The models M1ps and M2ps have a higher average rank of 6. Their
titles are well formed and represent a combination of original multi-word expressions
(cf. Table 3 for the relatively high rouge-2-r scores); however, they correspond less to
the topic, which is partly reflected by the comparatively lower values of rouge-L-p. The
outputs of the models M3ps and M5ps were often preferred to the original titles. While producing
titles 1.3 times shorter than M5ps, the conditionally trained M6ps achieved almost the same
average score. The baseline has the highest rank, since it often better preserves the
meaning, although it does not always form a complete phrase. Its main drawback is
that it usually only generalizes a title to some extent (in the case of a well-turned
subsequence) and misses details that experts might need.</p>
      <p>The close average ranks of the models, with rouge-L-f on the same level for all of them,
indicate an opportunity to overcome the general problem of lacking variability in
neural seq2seq generation. Different title styles make it possible to reach a preferable
trade-off between the conciseness of a title and its transparency.</p>
      <p>
        For future work, we plan to benefit from methods of paraphrasing
        <xref ref-type="bibr" rid="ref19">(Cao et al.,
2017)</xref>
        , advanced simplification
        <xref ref-type="bibr" rid="ref20 ref21">(Zhang and Lapata, 2017; Štajner and Saggion, 2018)</xref>
        , and surface realization for deep input representations
        <xref ref-type="bibr" rid="ref22">(Belz et al., 2018)</xref>
        to obtain
diverse, semantically close outputs that differ from text reformulated with mostly the same
words. Fake-paper detection
        <xref ref-type="bibr" rid="ref23">(Byrne and Labbé, 2017)</xref>
        and assessment of the quality of
scientific texts
        <xref ref-type="bibr" rid="ref2 ref24">(Shvets, 2015)</xref>
        will help to avoid training the models on misleading
titles. Finally, pre-existing taxonomies (e.g., JEL codes in Economics, the ACM
taxonomy in Computer Science, the Web of Science categories attached to journals) and
meta-information of papers, such as authors’ keywords or KeywordsPlus items
inferred from the cited references
        <xref ref-type="bibr" rid="ref25">(Garfield and Sher, 1993)</xref>
        , are to be used for
preselecting the most relevant concepts to bias the training.
      </p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>The presented work was supported by the European Commission under the contract
numbers H2020-700024-RIA, H2020-700475-IA, H2020-779962-RIA,
H2020-786731-RIA, and H2020-825079-RIA, and by the Russian Foundation for Basic
Research under the contract number 18-37-00198. Many thanks to the four anonymous
reviewers for their valuable comments, and to the five postdoctoral researchers for
their high responsiveness in the evaluation and insightful feedback.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Haruna</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ismail</surname>
            ,
            <given-names>M. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Damiasih</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sutopo</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Herawan</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>A collaborative approach for research paper recommender system</article-title>
          .
          <source>PloS one</source>
          ,
          <volume>12</volume>
          (
          <issue>10</issue>
          ),
          <year>e0184516</year>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Shvets</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Devyatkin</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sochenkov</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tikhomirov</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Popov</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yarygin</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Detection of current research directions based on full-text clustering</article-title>
          .
          <source>In 2015 Science and Information Conference (SAI)</source>
          , pp.
          <fpage>483</fpage>
          -
          <lpage>488</lpage>
          , IEEE (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Biyani</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsioutsiouliklis</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blackmer</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>"8 Amazing Secrets for Getting More Clicks": Detecting Clickbaits in News Streams Using Article Informality</article-title>
          .
          <source>In Thirtieth AAAI Conference on Artificial Intelligence</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Bouayad-Agha</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Casamayor</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferraro</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wanner</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Simplification of patent claim sentences for their paraphrasing and summarization</article-title>
          .
          <source>In 22nd FLAIRS Conference</source>
          (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Saggion</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Štajner</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bott</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mille</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rello</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Drndarevic</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Making it simplext: Implementation and evaluation of a text simplification system for spanish</article-title>
          .
          <source>ACM Transactions on Accessible Computing (TACCESS)</source>
          ,
          <volume>6</volume>
          (
          <issue>4</issue>
          ),
          <volume>14</volume>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Nallapati</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gulcehre</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiang</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Abstractive text summarization using sequence-to-sequence rnns and beyond</article-title>
          .
          <source>CoNLL</source>
          <year>2016</year>
          ,
          <volume>280</volume>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Guo</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pasunuru</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bansal</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Dynamic Multi-Level Multi-Task Learning for Sentence Simplification</article-title>
          .
          <source>In Proceedings of the 27th International Conference on Computational Linguistics</source>
          , pp.
          <fpage>462</fpage>
          -
          <lpage>476</lpage>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Nikolov</surname>
            ,
            <given-names>N. I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pfeiffer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hahnloser</surname>
            ,
            <given-names>R. H.</given-names>
          </string-name>
          :
          <article-title>Data-driven Summarization of Scientific Articles</article-title>
          .
          <source>In Proc. of the 7th International Workshop on Mining Scientific Publications</source>
          ,
          <string-name>
            <surname>LREC</surname>
          </string-name>
          <year>2018</year>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Luong</surname>
            ,
            <given-names>M. T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pham</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manning</surname>
          </string-name>
          , C. D.:
          <article-title>Effective approaches to attention-based neural machine translation</article-title>
          .
          <source>In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing</source>
          , pp.
          <fpage>1412</fpage>
          -
          <lpage>1421</lpage>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Gu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lu</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>V. O.</given-names>
          </string-name>
          :
          <article-title>Incorporating copying mechanism in sequence-tosequence learning</article-title>
          .
          <source>Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics</source>
          , pp.
          <fpage>1631</fpage>
          -
          <lpage>1640</lpage>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>See</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>P. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manning</surname>
          </string-name>
          , C. D.:
          <article-title>Get to the point: Summarization with pointer-generator networks</article-title>
          .
          <source>Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics</source>
          , pp.
          <fpage>1073</fpage>
          -
          <lpage>1083</lpage>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Klein</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Deng</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Senellart</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rush</surname>
            ,
            <given-names>A. M.</given-names>
          </string-name>
          :
          <source>OpenNMT: Neural Machine Translation Toolkit. Proceedings of the 13th Conference of the Association for Machine Translation in the Americas</source>
          , pp.
          <fpage>177</fpage>
          -
          <lpage>184</lpage>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schuster</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Le</surname>
            ,
            <given-names>Q. V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Norouzi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Macherey</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          , ... &amp;
          <string-name>
            <surname>Klingner</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Google's neural machine translation system: Bridging the gap between human and machine translation</article-title>
          .
          <source>arXiv preprint arXiv:1609.08144</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Osipov</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smirnov</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tikhomirov</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sochenkov</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shelmanov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shvets</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Information retrieval for R&amp;D support</article-title>
          . In:
          <source>Professional Search in the Modern World</source>
          . Springer, LNCS,
          <volume>8830</volume>
          ,
          <fpage>45</fpage>
          -
          <lpage>69</lpage>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Honnibal</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montani</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>spaCy 2: Natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing</article-title>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Rush</surname>
            ,
            <given-names>A. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chopra</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weston</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>A Neural Attention Model for Abstractive Sentence Summarization</article-title>
          .
          <source>In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing</source>
          , pp.
          <fpage>379</fpage>
          -
          <lpage>389</lpage>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Papineni</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roukos</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ward</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>W. J.</given-names>
          </string-name>
          :
          <article-title>BLEU: a method for automatic evaluation of machine translation</article-title>
          .
          <source>In Proceedings of the 40th annual meeting on association for computational linguistics</source>
          , pp.
          <fpage>311</fpage>
          -
          <lpage>318</lpage>
          (
          <year>2002</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>C. Y.</given-names>
          </string-name>
          :
          <article-title>Rouge: A package for automatic evaluation of summaries</article-title>
          .
          <source>Text Summarization Branches Out</source>
          (
          <year>2004</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Cao</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luo</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Joint copying and restricted generation for paraphrase</article-title>
          .
          <source>In Thirty-First AAAI Conference on Artificial Intelligence</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lapata</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Sentence Simplification with Deep Reinforcement Learning</article-title>
          .
          <source>In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing</source>
          , pp.
          <fpage>584</fpage>
          -
          <lpage>594</lpage>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Štajner</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saggion</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Data-Driven Text Simplification</article-title>
          .
          <source>In Proceedings of the 27th International Conference on Computational Linguistics: Tutorial Abstracts</source>
          , pp.
          <fpage>19</fpage>
          -
          <lpage>23</lpage>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Belz</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bohnet</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pitler</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wanner</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mille</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>The First Multilingual Surface Realisation Shared Task (SR'18): Overview and Evaluation Results</article-title>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Byrne</surname>
            ,
            <given-names>J. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Labbé</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Striking similarities between publications from China describing single gene knockdown experiments in human cancer cell lines</article-title>
          .
          <source>Scientometrics</source>
          ,
          <volume>110</volume>
          (
          <issue>3</issue>
          ),
          <fpage>1471</fpage>
          -
          <lpage>1493</lpage>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Shvets</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>A Method of Automatic Detection of Pseudoscientific Publications</article-title>
          .
          <source>In Intelligent Systems' 2014</source>
          . Springer, AISC,
          <volume>323</volume>
          ,
          <fpage>533</fpage>
          -
          <lpage>539</lpage>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Garfield</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sher</surname>
            ,
            <given-names>I. H.</given-names>
          </string-name>
          :
          <article-title>KeyWords Plus [TM]: Algorithmic Derivative Indexing</article-title>
          .
          <source>Journal of the American Society for Information Science</source>
          ,
          <volume>44</volume>
          ,
          <fpage>298</fpage>
          -
          <lpage>298</lpage>
          (
          <year>1993</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>