<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Evaluation of MUSS and T5 Models in Scientific Sentence Simplification: A Comparative Study</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Running Hou</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xinyi Qin</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Zurich, Department of Informatics</institution>
          ,
          <addr-line>Binzmühlestrasse 14, 8050 Zurich</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
      </contrib-group>
      <abstract>
<p>This paper discusses a study by the QH Research Group at the University of Zurich aimed at simplifying scientific text for SimpleText@CLEF-2023's Task 3. Using the pre-trained MUSS and T5 models, we explored their effectiveness in reducing sentence complexity without loss of essential information. We compared their performance across various scientific fields, using both quantitative and qualitative measures to assess simplification quality and fluency. The results highlight the substantial potential of both models while revealing distinct strengths and weaknesses. Strategies for further enhancement are discussed.</p>
      </abstract>
      <kwd-group>
        <kwd>Scientific Sentence Simplification</kwd>
        <kwd>Multilingual Sentence Simplifier</kwd>
        <kwd>Text-to-Text Transfer Transformer</kwd>
        <kwd>Text Complexity Reduction</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        In the modern era of rapidly advancing knowledge, accessibility to complex scientific research
is an issue of increasing importance. As a significant portion of scientific knowledge remains
confined within academia, it often becomes challenging for non-specialists to comprehend due
to the inherent complexity of scientific language [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. One proposed solution to this challenge
lies in the realm of Natural Language Processing (NLP): the simplification of scientific sentences
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Scientific sentence simplification aims at reformulating scientific texts to make them more
understandable, thereby bridging the knowledge gap between expert and non-expert audiences
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This can lead to increased democratization of science, allowing a broader audience to
engage with and benefit from scientific discoveries [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        Recent advancements in NLP models have shown promise in text simplification tasks. In this
paper, we focus on two such pre-trained models, MUSS (Multilingual Sentence Simplifier) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and
T5 (Text-to-Text Transfer Transformer) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. MUSS and T5 are both well-regarded models in sentence
simplification, chosen for their established capabilities in handling multilingual and
large-scale text corpora, respectively.
      </p>
      <p>Our research aim is to evaluate the effectiveness of these models in reducing the complexity
of scientific sentences whilst ensuring that the core information remains intact. To this end, we
have conducted an in-depth evaluation, comparing the performance of the two models across
multiple scientific domains.</p>
      <p>This paper employs a combination of quantitative and qualitative metrics to assess the quality
and fluency of the simplification provided by these models. As each model exhibits unique
strengths and limitations, we also delve into these attributes, discussing potential strategies for
further improvement.</p>
      <p>We hope this research will contribute to the expanding field of scientific text simplification
and assist in the broader efforts of making science more accessible and democratic. The insights
drawn from our study could potentially direct future work in this area and lead to more effective
models for scientific text simplification.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Methodology</title>
      <p>This section elucidates our research methods, primarily focusing on the adaptation of the
Multilingual Unsupervised Sentence Simplification (MUSS) model to a HuggingFace BART
model, as well as the deployment of a T5-large model. The employed data, comprising a
comprehensive corpus of English scientific sentences from the Cross-Language Evaluation
Forum (CLEF), is also outlined. The training and fine-tuning processes of these models are
explored in depth.</p>
      <p>2.1. Data</p>
      <p>Our research utilizes a substantial dataset provided by the CLEF organizers, which comprises an
array of scientific sentences from various domains. This dataset is distinguished by its extensive
scale and diverse representation of different scientific fields, rendering it apt for our study.</p>
      <p>The dataset contains only 648 training entries, with original sentences as input and
human-generated simplified sentences as the target. Testing data is categorized into small (2,234 entries),
medium (4,797 entries), and large (152,073 entries) sets. The data is derived from
abstracts of scientific articles. These abstracts are segmented into sentences, with each sentence
treated as an individual data point. The ’query’ column in the dataset therefore records the
topic of the article from which each sentence was drawn.</p>
      <p>
        The wide-ranging topics include ’drones’, ’self-driving’, ’cryptocurrency’, ’digital marketing’,
and ’gene editing’, among others, with the most specific focus on various aspects of ’muscle
hypertrophy’ and ’exercise training’. The diversity of the corpus, spanning several scientific
domains, offers an opportunity to evaluate the versatility of the MUSS and T5 models in scientific
text simplification. The complexity of the sentences also poses an appropriate challenge for
these advanced models, effectively testing their capabilities [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>In order to keep the simplified sentence coherent with the main topic, we insert
the query words at the end of the original sentence, connected by the phrase ’related to’. For
example, as shown in Table 1, the topic of the first sentence is ’How many training per week for
hypertrophy?’. In addition, the keyword ’simplify’ is added at the beginning of each source
sentence to mark it as a simplification task.</p>
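<p>A minimal sketch of this preprocessing step (the helper name is illustrative, not taken from our actual code):</p>

```python
def build_source(sentence: str, query: str) -> str:
    """Prepend the task keyword and append the topic query.

    Mirrors the preprocessing described above: 'simplify:' marks the
    task, and ', related to <query>' keeps the output on-topic.
    """
    return f"simplify: {sentence.rstrip('.')}, related to {query}."
```

<p>Control tokens, when used, are inserted between the keyword and the sentence, as in the examples of Table 1.</p>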
      <sec id="sec-2-1">
        <title>2.2. Models</title>
        <p>
          MUSS, the Multilingual Sentence Simplifier, has demonstrated superior performance in
sentence-level simplification tasks across multiple languages, justifying its selection for this
research [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. We use control tokens similar to those defined by Martin [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] to control different aspects
of simplification, including compression ratio (characters), paraphrasing (Levenshtein similarity),
lexical complexity (word rank), and syntactic complexity (the depth of the dependency tree). In
addition, we add another compression ratio (words), as we believe that simple texts
should contain fewer words. The T5 (Text-to-Text Transfer Transformer) model, specifically the
T5-large variant, has demonstrated proficiency in a number of Natural Language Processing
(NLP) tasks, including translation, summarization, and sentence simplification, making it a
promising choice for our study [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
        </p>
        <p>For our research, both the MUSS and T5-large models were trained for 8 epochs with a
learning rate of 3e-5 and a batch size of 8, settings that optimized their performance in our specific
context of scientific sentence simplification.</p>
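<p>A hypothetical configuration sketch of such a fine-tuning setup with the HuggingFace transformers Trainer API (a fragment, not our actual training script: the output directory, checkpoint name, and the already-tokenized <code>train_dataset</code> are assumptions):</p>

```python
from transformers import (AutoModelForSeq2SeqLM, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

def finetune(train_dataset):
    # Hyperparameters from the text: 8 epochs, lr 3e-5, batch size 8.
    args = Seq2SeqTrainingArguments(
        output_dir="t5-large-simplify",   # assumed checkpoint directory
        num_train_epochs=8,
        learning_rate=3e-5,
        per_device_train_batch_size=8,
    )
    model = AutoModelForSeq2SeqLM.from_pretrained("t5-large")
    trainer = Seq2SeqTrainer(model=model, args=args,
                             train_dataset=train_dataset)
    trainer.train()
    return trainer
```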
      </sec>
      <sec id="sec-2-2">
        <title>2.3. Control Tokens</title>
        <p>Five control tokens are embedded into input sentences:
• Character Length Ratio (C): The ratio of the number of characters in the target sentence
to the number of characters in the source sentence.
• Normalized Levenshtein Similarity (L): The normalized similarity at the character level
between the source and target sentences, based on the Levenshtein distance.
• WordRank (WR): The inverse frequency order of all words in the target sentence compared
to the source sentence.
• Dependency Tree Depth Ratio (DTD): The ratio of the maximum depth of the dependency
tree in the target sentence to that of the source sentence.
• Word Ratio (W): The ratio of the number of words in the target sentence to the number
of words in the source sentence.</p>
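<p>Under these definitions, the ratio-style tokens can be computed directly from a (source, target) pair. The sketch below covers C, L, and W; WR and DTD are omitted because they additionally require a word-frequency table and a dependency parser (e.g. spaCy). All names are illustrative:</p>

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance via the classic dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def control_tokens(source: str, target: str) -> str:
    c = len(target) / len(source)                  # Character Length Ratio (C)
    w = len(target.split()) / len(source.split())  # Word Ratio (W)
    # Normalized Levenshtein Similarity (L)
    sim = 1 - levenshtein(source, target) / max(len(source), len(target))
    return f"W_{w:.2f} C_{c:.2f} L_{sim:.2f}"
```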
      </sec>
      <sec id="sec-2-3">
        <title>2.4. Lexical Complexity</title>
        <p>The lexical complexity score for a given sentence is calculated by first converting each sentence
into a list of words. This is done through tokenization, removing punctuation, and
filtering out common "stop words". The list of words is then further refined to include only those
words present in our preprocessed word-ranking dictionary, effectively filtering out
unknown words. We then convert each word in the sentence into its respective rank obtained
from this dictionary. These ranks are logged (to smooth out the distribution), and
the 75th percentile (the third quartile) of these ranks is taken as the sentence’s lexical complexity
score. This means that we mainly consider the top 25% most complex words in the sentence
when assessing the sentence’s overall complexity. In the case of batch processing, we calculate
the score for each pair of simple and complex sentences, take a safe division of the scores, and
then calculate the mean of these ratios. Our method thus provides a single numerical score
that represents the lexical complexity of a sentence, or the average complexity ratio between
two lists of sentences, which can be used to compare and assess different texts.</p>
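<p>The steps above can be sketched as follows; the stop-word list and word-rank dictionary are tiny illustrative stand-ins for the preprocessed resources described in the text:</p>

```python
import math
import string

# Stand-ins: in practice these come from preprocessed resources,
# with rank 1 being the most frequent word.
STOP_WORDS = {"the", "a", "an", "of", "is", "and", "to", "in"}
WORD_RANK = {"analysis": 1200, "studies": 800, "showed": 600, "results": 500}


def lexical_complexity(sentence: str, quantile: float = 0.75) -> float:
    """Third quartile of log word ranks: focus on the ~25% hardest words."""
    words = [w.strip(string.punctuation).lower() for w in sentence.split()]
    ranks = sorted(math.log(WORD_RANK[w]) for w in words
                   if w and w not in STOP_WORDS and w in WORD_RANK)
    if not ranks:
        return 0.0
    idx = min(int(quantile * len(ranks)), len(ranks) - 1)
    return ranks[idx]
```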
        <p>Table 1 shows example source–target pairs after preprocessing; each source sentence is prefixed with the control tokens and the ’simplify’ keyword and suffixed with its query.</p>
        <p>Source: simplify: W_0.67 C_0.64 L_0.59 WR_0.97 DTD_0.67 Meta-regression analysis of non-volume-equated studies showed a significant effect favoring higher frequencies, although the overall difference in magnitude of effect between frequencies of 1 and 3+ days per week was modest, related to How many training per week for hypertrophy?.
Target: Analysis of studies with different training volumes showed better results for higher frequencies, although the difference between frequencies of 1 and 3+ days per week was small.</p>
        <p>Source: simplify: W_0.78 C_0.76 L_0.86 WR_1.06 DTD_1.00 Four major capabilities were identified, each of which evolves as a result of using the tools, related to digital marketing.
Target: Four major capabilities were identified, each of which evolves as a result of using the tools.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Results and Discussion</title>
      <p>The evaluation metrics chosen for this study were designed to reflect our goals of sentence
simplification. We aimed to measure the level of semantic similarity, the preservation of essential
information, the reduction of extraneous details, the addition of suitable words, and the linguistic
quality and readability of the simplified sentences.</p>
      <p>Our research findings contribute to our understanding of MUSS and T5’s capabilities in
the field of scientific sentence simplification. Both models showed promise, each presenting
unique strengths and weaknesses when confronted with the nuances of scientific sentences
from chosen disciplines. Tables 2, 3, and 4 provide numerical insights into model performance
according to the applied evaluation metrics.</p>
      <sec id="sec-3-1">
        <title>3.1. Evaluation Metrics</title>
        <p>
          SARI (System output Against References and against the Input sentence) contributed to our evaluation by focusing
on three facets of text simplification: the preservation of meaning (KEEP), the addition of
appropriate words (ADD), and the deletion of unnecessary information (DELETE) [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
        </p>
        <p>
          BLEU (Bilingual Evaluation Understudy), a metric developed to measure the overlap of
n-grams between machine-generated translations and multiple reference translations [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], was
used to gauge the linguistic quality of the simplified sentences in our context.
        </p>
        <p>
          FKGL (Flesch-Kincaid Grade Level), a readability test that calculates a score based on the
average number of syllables per word and the average number of words per sentence [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ],
helped us understand the extent to which the complexity of the scientific sentences was reduced
by the models.
        </p>
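<p>For reference, FKGL combines words-per-sentence and syllables-per-word as 0.39 · (words/sentences) + 11.8 · (syllables/words) − 15.59. The sketch below uses a crude vowel-group syllable heuristic, a common approximation rather than the exact counter used in our evaluation:</p>

```python
import re


def count_syllables(word: str) -> int:
    # Crude heuristic: each maximal vowel group counts as one syllable.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def fkgl(sentences: list[str]) -> float:
    """Flesch-Kincaid Grade Level over a list of sentences."""
    words = [w for s in sentences for w in re.findall(r"[A-Za-z]+", s)]
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)
```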
        <p>The Compression Ratio evaluates the extent to which the simplified sentence is shorter
than the original sentence, providing a gauge of how much surface complexity was removed
during simplification.</p>
        <p>Levenshtein Similarity, on the other hand, measures the number of single-character edits
required to change one sentence into the other. In our context, it helps us assess how much
the simplified sentence differs from the original one, thus providing a measure of information
preservation and modification.</p>
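<p>Both surface metrics are straightforward to compute. In the sketch below, difflib's SequenceMatcher ratio stands in for a true normalized Levenshtein similarity; the two generally track each other closely on sentence-length strings:</p>

```python
import difflib


def compression_ratio(original: str, simplified: str) -> float:
    """Values below 1.0 mean the simplification is shorter than the original."""
    return len(simplified) / len(original)


def similarity(original: str, simplified: str) -> float:
    """1.0 for identical strings, lower as more edits are needed."""
    return difflib.SequenceMatcher(None, original, simplified).ratio()
```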
        <p>The Lexical Complexity Score helps us evaluate the linguistic complexity of the simplified
sentences. It provides insights into the readability of the simplified sentences.</p>
        <p>These chosen metrics are widely regarded in the field of automatic text simplification and
allowed us to evaluate different aspects of the models’ performance, from semantic accuracy
to readability. The results based on these metrics are detailed in Tables 2, 3, and 4 below.</p>
        <p>Table 2 illustrates examples of sentences before and after simplification by MUSS and T5. This
table exemplifies the different approaches each model took to simplification. For instance, in
the context of "penetration testing", MUSS retained the original sentence structure and content,
while T5 removed important details, potentially affecting the understanding of the concept.</p>
        <p>When simplifying a sentence regarding "decompression", MUSS smoothly retained the essence
of the original sentence, while T5 removed the connective term "though", subtly impacting the
sentence tone. For "classification algorithm", MUSS again preserved the original sentence, while
T5 eliminated the temporal marker "finally", subtly altering the sentence’s flow. In the
case of the term "steganographic approach", MUSS maintained the original sentence but replaced
’imperceptible’ with ’undetectable’, potentially enhancing the sentence’s comprehensibility.
However, T5’s simplification was incomplete, cutting off at the end of the sentence and leaving
important information out.</p>
        <p>Table 3 compares MUSS and T5’s handling of specific scientific terms. It reveals how each
model navigates complex terminology during the simplification process. For instance, with the
term "decompression", both MUSS and T5 successfully integrated the concept in a simplified
manner. Yet, when it came to terms like "penetration testing" or "classification algorithm", T5
omitted crucial information, potentially compromising the coherence of the scientific concepts
involved.</p>
        <p>Table 4 provides a quantitative comparison of the models based on our chosen evaluation
metrics. In terms of Compression ratio, Levenshtein similarity, and Lexical complexity score,
MUSS showed superior performance, indicating its ability to reduce sentence length, maintain
semantic similarity to the original sentence, and achieve a lower complexity level. The SARI
and BLEU scores followed a similar pattern, with MUSS scoring higher, suggesting it performed
better in preserving the original meaning while deleting unnecessary information and matching
reference translations.</p>
        <p>However, in terms of FKGL, which measures readability, T5 outperformed MUSS with a lower
score, indicating that T5 might produce simpler sentences, even though they may lose some
crucial information. This contrast underscores a tension between readability and semantic
preservation, which is a key challenge in text simplification tasks.</p>
        <p>Table 2 (reconstructed from the original layout): each row gives the dataset snt_id, the original sentence, the MUSS output, and the T5-large output.</p>
        <p>T15.1_2952002252_2. Original: Although penetration testing has traditionally focussed on technical aspects, the field has started to realise the importance of the human in the organisation, and the need to ensure that humans are resistant to cyberattacks. MUSS: unchanged from the original. T5-large: Despite the importance of the human in the organisation, and the need to ensure that humans are resistant to cyberattacks.</p>
        <p>T15.1_1576337284_7. Original: Though decompression is not required. MUSS: There is no need for decompression, though. T5-large: Decompression is not required.</p>
        <p>T13.3_2944188749_4. Original: Finally, use the 10 sets of imbalanced data in the KEEL database as test objects, and F-value and G-mean are used as evaluation indicators to verify the performance of the classification algorithm. MUSS: unchanged from the original. T5-large: Use the 10 sets of imbalanced data in the KEEL database as test objects, and F-value and G-mean are used as evaluation indicators to verify the performance of the classification algorithm.</p>
        <p>T15.1_1576337284_5. Original: To enlarge the capacity of the hidden secret information and to provide an imperceptible stego-image for human vision, a novel steganographic approach called tri-way pixel-value differencing (TPVD) is used for embedding. MUSS: To enlarge the capacity of the hidden secret information and to provide an imperceptible stego-image for human vision, a novel approach called tri-way pixel-value differencing (TPVD) is used for embedding. T5-large: To enlarge the capacity of the hidden secret information and to provide an imperceptible stego-image for human vision, a novel steganographic approach called tri-way pixel-value differencing (TP</p>
        <p>Table 3 (T5-large behaviour by example, as recovered from the original layout): omitted crucial information (penetration testing); removed ’Though’ (decompression); removed ’Finally’ (classification algorithm).</p>
        <p>In conclusion, the results highlight that while both models exhibit potential for scientific
text simplification, MUSS generally shows better performance in semantic preservation and
coherence, which are crucial in scientific contexts. Meanwhile, T5 seems to prioritize readability,
but at the potential cost of omitting key information. This points towards the importance of
finding a balance between readability and information preservation in text simplification tasks,
a topic warranting further research.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion and Future Work</title>
      <p>This research aimed to investigate the efficiency of the MUSS and T5 models in the challenging
task of scientific sentence simplification. Our findings highlight the significant potential of
both models, while also shedding light on their unique strengths and weaknesses. MUSS
showed consistent performance in maintaining the original sentence’s structure and meaning,
suggesting it is a reliable choice for preserving technical details in complex sentences. T5,
while demonstrating reasonable proficiency, did occasionally omit important details, suggesting
areas for further improvement.</p>
      <p>However, it is crucial to acknowledge the limitations of our study. The use of only two models
restricts the generalizability of our findings to all sentence simplification models. Moreover, the
performance of these models may vary across different scientific domains and levels of sentence
complexity, beyond what was covered by the CLEF dataset.</p>
      <p>Future research should extend this analysis to a broader range of models and datasets,
including different languages and scientific fields, to better understand the models’ performance.
The models themselves could also be improved, particularly in terms of preserving essential
information while simplifying text and handling very complex sentences more efficiently. These
improvements may involve fine-tuning existing models, developing novel training
methodologies, or even creating new models altogether.</p>
      <p>The implications of this work are significant. By making scientific literature more accessible
through sentence simplification, we can democratize science and promote knowledge sharing
among non-experts. The use of models like MUSS and T5 could become an integral part
of the future of scientific communication, making the world of research more inclusive and
approachable for everyone.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>We would like to express our gratitude to the Cross-Language Evaluation Forum (CLEF) for
providing the dataset used in this study. We also thank Professor Simon Clematide, Tannon
Kew and Andrianos Michail who have provided invaluable insights and feedback throughout
the course of this research.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Baker</surname>
          </string-name>
          ,
          <article-title>Is there a language of science?</article-title>
          ,
          <source>Nature</source>
          <volume>467</volume>
          (
          <year>2010</year>
          )
          <fpage>153</fpage>
          -
          <lpage>155</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Zopf</surname>
          </string-name>
          ,
          <article-title>The complexities of computational text simplification</article-title>
          ,
          <source>Language and Linguistics Compass</source>
          <volume>13</volume>
          (
          <year>2019</year>
          )
          <article-title>e12323</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Mandya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Duarte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Orasan</surname>
          </string-name>
          ,
          <article-title>Towards a better understanding of the challenge of scientific text simplification</article-title>
          ,
          <source>in: Proceedings of the Third Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR)</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>36</fpage>
          -
          <lpage>44</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>L.</given-names>
            <surname>Scharrer</surname>
          </string-name>
          , E. Rupprecht, P. Lux,
          <article-title>Science communication 2.0: The impact of online media and popular science infotainment on sciences</article-title>
          ,
          <source>PLoS ONE</source>
          <volume>15</volume>
          (
          <year>2020</year>
          )
          e0230432
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Martin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Birch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kuhn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joulin</surname>
          </string-name>
          ,
          <article-title>MUSS: Multilingual sentence simplifier</article-title>
          ,
          <source>in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>3254</fpage>
          -
          <lpage>3266</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C.</given-names>
            <surname>Raffel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Roberts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Narang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Matena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Exploring the limits of transfer learning with a unified text-to-text transformer</article-title>
          , arXiv preprint arXiv:1910.10683 (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>SanJuan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Huet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Augereau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Azarbonyad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kamps</surname>
          </string-name>
          ,
          <article-title>Overview of SimpleText - CLEF 2023 track on automatic simplification of scientific texts</article-title>
          , in: Avi Arampatzis, Evangelos Kanoulas, Theodora Tsikrika, Stefanos Vrochidis, Anastasia Giachanou,
          <string-name>
            <given-names>Dan</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Mohammad</given-names>
            <surname>Aliannejadi</surname>
          </string-name>
          , Michalis Vlachos, Guglielmo Faggioli, Nicola Ferro (Eds.), Experimental IR Meets Multilinguality, Multimodality, and Interaction.
          <source>Proceedings of the Fourteenth International Conference of the CLEF Association (CLEF</source>
          <year>2023</year>
          ),
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>W.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Pavlick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Callison-Burch</surname>
          </string-name>
          ,
          <article-title>Optimizing statistical machine translation for text simplification</article-title>
          ,
          <source>Transactions of the Association for Computational Linguistics</source>
          <volume>4</volume>
          (
          <year>2016</year>
          )
          <fpage>401</fpage>
          -
          <lpage>415</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>K.</given-names>
            <surname>Papineni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Roukos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ward</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. J.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <article-title>Bleu: a method for automatic evaluation of machine translation</article-title>
          ,
          <source>in: Proceedings of the 40th annual meeting of the Association for Computational Linguistics</source>
          ,
          <year>2002</year>
          , pp.
          <fpage>311</fpage>
          -
          <lpage>318</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Kincaid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. P.</given-names>
            <surname>Fishburne</surname>
          </string-name>
          <string-name>
            <surname>Jr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. L.</given-names>
            <surname>Rogers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. S.</given-names>
            <surname>Chissom</surname>
          </string-name>
          ,
          <article-title>Derivation of new readability formulas (Automated Readability Index, Fog Count, and Flesch Reading Ease Formula) for Navy enlisted personnel</article-title>
          ,
          <source>Technical Report, Naval Technical Training Command Millington TN Research Branch</source>
          ,
          <year>1975</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>