<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Text Adaptation for Easy Read Content Generation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jesús Calleja</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Fundación Vicomtech, Basque Research and Technology Alliance (BRTA)</institution>
          ,
          <addr-line>Donostia-San Sebastián, 20009, Gipuzkoa</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of the Basque Country, UPV/EHU</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>Adapting information for people with cognitive or reading difficulties is essential to foster inclusion and accessibility in society. This thesis focuses on automatic text adaptation to Easy Read (ER), including aspects of text simplification and information selection, with the goal of accelerating the work of professionals and improving access to information for vulnerable groups. Our research centres on three main aspects that are critical to achieving quality ER adaptations: resource creation, neural model adaptation and model evaluation. In this context, we address corpus alignment issues, information adaptation, as well as image and explanation integration.</p>
      </abstract>
      <kwd-group>
        <kwd>Easy Read Adaptation</kwd>
        <kwd>Text Simplification</kwd>
        <kwd>Resource Creation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Around 25% of the population in countries like Spain experiences difficulties with reading and
comprehension in their daily lives [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This makes information inaccessible to many, creating communication
barriers for individuals with cognitive disabilities, older adults, migrants, refugees, and people with
learning difficulties, among others. Moreover, the United Nations’ International Convention on the
Rights of Persons with Disabilities, the Charter of Fundamental Rights of the European Union, and the
Spanish Constitution all enshrine the right of everyone to access information and culture [
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2, 3, 4</xref>
        ].
      </p>
      <p>To address this issue, professionals adapt texts into Easy Read, a style designed to
make documents easier to understand. For Spanish, experts in the field rely on the UNE 153101:2018
EX standard, which provides guidelines on creating, adapting, and verifying Easy Read documents.
Additionally, the Inclusion Europe Easy Read guidelines are openly available and describe the key
features of an Easy Read text.</p>
      <p>Creating or adapting a text to Easy Read involves applying various linguistic strategies to improve
comprehension. Lexically, this includes avoiding difficult words (e.g., using “help” instead of “assist”), technical
jargon and foreign words, and using quantifiers like little or much instead of large or specific numerical
values. Syntactically, it involves using short sentences that convey a single idea, preferably in the
second person, avoiding negations, and favouring the active voice (e.g., “You have rights. You can
ask for help.”). If longer sentences are necessary, they may be broken into two lines at natural pause
points. Discursively, information should be well-organized: related ideas should be grouped, complex
terms explained, and the text structured with clear headings that reflect the content. Additionally, the text
should prioritize essential content, omitting non-critical data. The challenge here lies in determining
what is considered “essential”, as this can vary depending on the needs and characteristics of the target
audience.</p>
      <p>Manually adapting content to Easy Read (ER) style is time-consuming, and the volume of daily
information makes this process unfeasible with current resources. This PhD thesis aims to address
this challenge by developing resources, models, and tools to support ER content generation, assisting
professionals in the manual adaptation process. The research will notably explore the creation of quality
ER resources via automated alignment, ER modelling with Large Language Models (LLMs), and the
evaluation of automatically adapted ER texts.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background and Related Work</title>
      <sec id="sec-2-1">
        <title>2.1. Easy Read</title>
        <p>
          Despite the limited research in the ER field noted by [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], there is a growing interest in leveraging natural
language processing techniques for ER. This is evident in recent developments, including the creation
of new ER applications [
          <xref ref-type="bibr" rid="ref10 ref6 ref7 ref8 ref9">6, 7, 8, 9, 10</xref>
          ], evaluations of LLMs for ER tasks [
          <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
          ], and focused studies on
specific aspects of automated ER adaptation, such as the incorporation of explanatory structures [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ].
        </p>
        <p>
          In terms of resources, ER data in Spanish and Basque remains scarce. For Spanish, [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] covers sports,
literature, competitive examinations, and exhibitions as domains, with nearly 2,000 sentences aligned
by professionals. ClearSim [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] includes 2,000 texts automatically adapted to ER using ChatGPT and
validated by non-experts, along with 400 texts adapted by professionals. However, neither of these
corpora is publicly available, although the raw data from the former can be requested. The only
publicly available ER corpus in Spanish is IrekiaLFes [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], which consists of 35 manually aligned news
documents, amounting to 705 sentences, sourced from the Basque Government’s Irekia transparency
portal. While ER-specific corpora remain limited, there are several publicly available datasets for general
text simplification, such as Newsela [15], which does not follow ER guidelines (or does not explicitly state
that it does) but can still be valuable for benchmarking and system comparison purposes.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Automatic Text Simplification</title>
        <p>Automatic Text Simplification is described as a text-to-text task whose goal is to reduce the complexity
of a text without altering its meaning, typically through various linguistic transformations. Early
simplification systems relied on handcrafted rules developed by experts [16] or inferred from aligned
texts [17]. These rules often involved substituting complex structures or words with simpler alternatives.
As machine learning capabilities advanced, research in the field shifted toward statistical and neural
models based on texts, framing the task as a monolingual machine translation problem, translating
from complex to simple language, using aligned sentence pairs from parallel corpora [18].</p>
        <sec id="sec-2-2-1">
          <title>2.2.1. Lexical Simplification</title>
          <p>Recent lexical simplification approaches typically involve complex word identification, substitute
selection, filtering and ranking, though not every system includes all of these steps.
LSBert [19], for example, uses a BERT model to mask complex words and predict suitable substitutes,
ranking them based on the model’s probabilities, word frequency in corpora, and cosine similarity with
fastText static embeddings.</p>
          <p>More recent methods employ controllable generation, where predefined parameters are included in
the model’s input data to guide its behaviour during inference [20]. This approach has
been applied in models such as T5 [21], T5-Large [22], BART [20], and most recently, mT5 [23], which
achieved state-of-the-art results in Spanish, English and Portuguese using a single multilingual model.
Large Language Models (LLMs) such as GPT-3 have also been explored for simplification, achieving
results comparable to mT5 [24].</p>
        </sec>
        <sec id="sec-2-2-2">
          <title>2.2.2. Syntactic Simplification</title>
          <p>Before the emergence of LLMs, syntactic simplification was typically approached as a
sequence-to-sequence task, with systems trained on parallel data derived from knowledge graphs [25], extracted
from Wikipedia [26], or created by experts [27]. Due to challenges in generalizing to unseen data,
DisSim [28] relies on a set of handcrafted rules. DSS [29] uses a syntactically annotated corpus to create
rules, which are then applied in a translation-based system.</p>
        </sec>
        <sec id="sec-2-2-3">
          <title>2.2.3. End-to-end Simplification</title>
          <p>The ACCESS system [20] was the first to use input parameters, such as word character count, Levenshtein
similarity, inverse frequency, and dependency tree depth, to guide the output of a Transformer
model [30]. Later, [31] applied the same approach to a pre-trained BART model, using large
numbers of paraphrased sentence pairs aligned from the CCNet corpus.</p>
          <p>Other researchers have adopted this controlled generation method using the T5 model, incorporating
additional parameters such as the sentence length ratio or dependency tree depth between input and
output sentences [32, 33], with the latter achieving the best results.</p>
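          <p>As an illustration of the control-parameter idea behind ACCESS-style systems, the following minimal Python sketch computes two ratios between a source sentence and its simplification (character length and Levenshtein similarity) and prepends them to the input as special tokens. The token names NbChars and LevSim, the bucketing step of 0.05 and all function names are illustrative assumptions, not the exact configuration reported in [20].</p>
          <preformat preformat-type="code"><![CDATA[
```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def bucket(value: float, step: float = 0.05) -> float:
    """Round a ratio to the nearest multiple of `step` (assumed step size)."""
    return round(round(value / step) * step, 2)


def add_control_tokens(source: str, target: str) -> str:
    """Prefix the source with control tokens computed from a (source, target) pair."""
    if not source:
        raise ValueError("source sentence must not be empty")
    char_ratio = bucket(len(target) / len(source))
    lev_sim = bucket(levenshtein_similarity(source, target))
    # Hypothetical token format, used here only for illustration.
    return f"<NbChars_{char_ratio:.2f}> <LevSim_{lev_sim:.2f}> {source}"
```
]]></preformat>
          <p>During training, the tokens are computed from each aligned pair as above; at inference time, the values are chosen by the user to steer how much the output is shortened and rewritten.</p>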
          <p>From an unsupervised perspective, i.e. when training output data is unavailable, [34] propose using
existing machine translation corpora, treating bridge language translations as simplified versions of
the original sentences in the target language. [35] compute control parameters over noisy translations
generated by pre-trained XLM models and attempt to recover the original sentence, incorporating
penalties for length, concurrency, and complexity during output generation.</p>
        </sec>
        <sec id="sec-2-2-4">
          <title>2.2.4. End-to-end Simplification with LLMs</title>
          <p>The emergence of Large Language Models has significantly advanced the task of text simplification by
addressing one of its main challenges: the scarcity of training data. Unlike earlier models, LLMs do not
require large-scale annotated corpora and can perform the task in an unsupervised manner.</p>
          <p>The KiS model [36] uses GPT-2 in combination with a reinforcement learning algorithm that evaluates the
output based on simplicity, fluency and clarity of the response. [37] assess the performance of ChatGPT
in comparison to other existing models, supervised and unsupervised, finding that it achieves similar
results. Similarly, [38] experiment with GPT-4, testing various prompts and instructions to optimize
output quality, and report performance on par with the best supervised models.</p>
        </sec>
        <sec id="sec-2-2-5">
          <title>2.2.5. Evaluation and Metrics</title>
          <p>Given the high cost of human evaluation, various automatic metrics have been proposed to assess the
quality of simplifications. Alva-Manchego et al. [39] recommended the use of metrics such as SARI [40],
SAMSA [41], and BERTScore [42], as they show higher correlation with human judgements. Conversely,
the use of traditional metrics such as BLEU [43] is discouraged, as it does not align well with human
assessments in the context of text simplification. In addition to these metrics, readability formulas are
commonly employed to estimate the linguistic complexity of simplified texts. The Flesch–Kincaid Grade
Level (FKGL) [44] is one of the most widely used, along with its Spanish adaptation, the
Fernández-Huerta index [45]. While these scores offer interpretable indicators of surface-level complexity, they have
been criticized for their high sensitivity to small lexical or syntactic changes that may not significantly
affect overall readability or simplification quality [46]. Nonetheless, they remain useful in applications
aimed at producing accessible content.</p>
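          <p>To make the surface-level nature of such formulas concrete, the sketch below computes FKGL as 0.39 times the average number of words per sentence, plus 11.8 times the average number of syllables per word, minus 15.59. The syllable counter is a deliberately naive vowel-group heuristic for English, and sentences and words are split with simple regular expressions; both are assumptions made for illustration, and the "original" sentence is an invented example.</p>
          <preformat preformat-type="code"><![CDATA[
```python
import re


def count_syllables(word: str) -> int:
    """Naive English heuristic: count groups of consecutive vowels (minimum 1)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fkgl(text: str) -> float:
    """Flesch-Kincaid Grade Level computed from word, sentence and syllable counts."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


# Invented complex sentence versus the Easy Read example quoted in the introduction.
original = "Individuals may request assistance from the relevant authorities."
adapted = "You have rights. You can ask for help."
print(fkgl(original) > fkgl(adapted))
```
]]></preformat>
          <p>Because the score depends only on these counts, splitting a sentence in two lowers it even when the wording is unchanged, which illustrates the sensitivity discussed above.</p>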
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Description of the Proposed Research and Hypotheses</title>
      <p>This thesis aims to investigate the automatic adaptation of texts to ER, to support professional
practitioners in accelerating the adaptation process and to improve access to information for vulnerable
populations. The primary target languages for this research will be Basque and Spanish, with particular
emphasis on Basque, where resources for text adaptation are especially scarce.</p>
      <p>To achieve the objectives of this thesis, the development of parallel corpora (complex-adapted) is
foreseen, alongside the creation of text adaptation models for various domains, including narrative,
news and technical documents.</p>
      <p>The main objectives of the thesis are:</p>
      <list list-type="order">
        <list-item>
          <p>Development of methods and tools for Easy Read text adaptation. While models for general
text simplification exist, notable characteristics of the Easy Read style, such as sentence splitting
and rephrasing, or automatic sentence segmentation, are comparatively less explored in an end-to-end
fashion. This thesis will explore methods and tools specifically designed to cover these aspects.</p>
        </list-item>
        <list-item>
          <p>Creation of an automatic complex-adapted text aligner. Both Spanish and, more notably,
Basque currently lack parallel corpora for automatic text adaptation. This scarcity hinders
progress in the field. An objective of this thesis is the development of components and methods
to align ER texts at different levels of granularity (e.g., sentence and paragraph levels), to provide
quality ER-adapted alignments and enable the creation of new corpora.</p>
        </list-item>
        <list-item>
          <p>Development of ER datasets in Spanish and Basque across domains. Using the
aforementioned aligner, new training datasets will be created for a variety of domains. This is essential, as
the Easy Read adaptation style can vary between domains.</p>
        </list-item>
        <list-item>
          <p>Training and benchmarking of adapted neural models. New neural models will be trained
using both the corpora developed in this thesis and publicly available data. These models will be
benchmarked against existing approaches in the literature to assess their effectiveness.</p>
        </list-item>
        <list-item>
          <p>Development of methods and models for the automatic generation of domain-adapted
explanations of complex terms. Easy Read texts frequently include brief explanations of
complex terms, particularly when no simpler synonyms are available. This thesis explores
methods and models that can identify complex terms and generate appropriate explanations in
the Easy Read style, adapted to the context and domain of the text, as well as the target audience.</p>
        </list-item>
        <list-item>
          <p>Development of methods and models for the automatic insertion of relevant, illustrative
images. Easy Read texts are often supplemented with images that reinforce textual content and
aid comprehension. Methods and models will be developed to automatically retrieve and assign
suitable images to accompany the text, streamlining the adaptation process for professionals.</p>
        </list-item>
      </list>
      <p>Based on these objectives, the following hypotheses will be tested:</p>
      <list list-type="order">
        <list-item>
          <p>The development of resources, methods and models tailored to Easy Read adaptation will improve
the readability of texts for people with disabilities.</p>
        </list-item>
        <list-item>
          <p>The creation of an automatic aligner tailored to ER text will improve the quality of aligned data
and, consequently, the outputs of the models trained on this data.</p>
        </list-item>
        <list-item>
          <p>Increasing the amount of training data will enhance the quality and consistency of text adaptations.</p>
        </list-item>
        <list-item>
          <p>The inclusion of explanations for complex terms will improve the comprehensibility of adapted
texts.</p>
        </list-item>
        <list-item>
          <p>Automatically inserting relevant images will enhance the reading experience for individuals with
reading difficulties.</p>
        </list-item>
        <list-item>
          <p>The use of neural models, particularly large language models, will support the effective generation
of Easy Read content.</p>
        </list-item>
      </list>
    </sec>
    <sec id="sec-4">
      <title>4. Methodology and Preliminary Results</title>
      <sec id="sec-4-1">
        <title>4.1. Methodology</title>
        <p>For model training and benchmarking, various approaches will be compared, including instruction-tuned
models (few-shot and zero-shot), and fine-tuned base models. These will be evaluated against existing
state-of-the-art systems across multiple datasets in Spanish and Basque, where available. Evaluation will
be conducted using established metrics for text adaptation, such as SARI, BLEU, and BERTScore. Given
the active research in this field, additional metrics may be incorporated if they contribute meaningfully
to the interpretability and robustness of the results.</p>
        <p>To develop the automatic complex–adapted text aligner, we will contrast various methods, including
embedding and lexical similarity, as well as models tuned to the specifics of ER alignment. Subsequently,
for the training of explanation generation models, the use of task-specific architectures or
the integration of explanation generation into the main simplification and adaptation models will be
investigated. These models will be designed to identify complex terms and generate suitable explanations
either from a structured knowledge base or based on the surrounding context. As there are, to the best
of the author’s knowledge, no publicly available datasets for this task, initial experimentation will focus
on few-shot and zero-shot paradigms. Dataset creation will be considered if no suitable resource is
found. Following this, an image generation module will be developed to enhance Easy Read texts. This
module will either use a text-to-image generation model or retrieve relevant images from a curated
image bank. Both approaches will be explored to assess their feasibility and impact on comprehension.</p>
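        <p>As a point of reference for the lexical-similarity family of alignment methods mentioned above, the following sketch aligns each adapted sentence to its most similar source sentence using the Jaccard overlap of their word sets, keeping only pairs that reach a threshold. The function names, the threshold of 0.2 and the simple tokenisation are illustrative assumptions; this is a basic baseline, not the aligner developed in this thesis. An embedding-based variant would replace the similarity function with cosine similarity over sentence vectors.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import re
from typing import List, Set, Tuple


def tokens(sentence: str) -> Set[str]:
    """Lowercased set of word tokens."""
    return set(re.findall(r"\w+", sentence.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def align(source: List[str], adapted: List[str],
          threshold: float = 0.2) -> List[Tuple[int, int, float]]:
    """Return (source_index, adapted_index, score) for every adapted sentence
    whose best-matching source sentence reaches the (assumed) threshold."""
    source_sets = [tokens(s) for s in source]
    pairs = []
    for j, sentence in enumerate(adapted):
        target = tokens(sentence)
        scores = [jaccard(src, target) for src in source_sets]
        if not scores:
            continue
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] >= threshold:
            pairs.append((best, j, scores[best]))
    return pairs
```
]]></preformat>
        <p>Since several adapted sentences may point to the same source sentence, this formulation naturally covers one-to-many alignments, which arise when a complex sentence is split during adaptation. Adapted sentences with no sufficiently similar source, such as added explanations, are left unaligned.</p>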
        <p>This research will follow standard scientific procedures, including systematic reviews of
state-of-the-art literature and the dissemination of findings through high-quality journals, conferences, congresses,
and workshops, thereby ensuring validation by the scientific community.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Preliminary Results</title>
        <p>Aligned with the objectives of the thesis, three topics have already been explored:</p>
        <list list-type="order">
          <list-item>
            <p>Split and Rephrase. One of the subtasks involved in syntactic and end-to-end text simplification
and adaptation is Split and Rephrase, which takes a single complex sentence as input and breaks
it into multiple shorter sentences while preserving the original meaning and information. We
explored the capabilities of large language models for this task, demonstrating improvements
over the state of the art. The study involved comparisons across different prompting strategies,
domain adaptation settings, base model sizes, training data configurations, and few-shot/zero-shot
approaches using instruction-tuned models. Full details can be found in [47].</p>
          </list-item>
          <list-item>
            <p>Text Segmentation. Easy Read guidelines specify that each sentence should be presented on its
own line. Furthermore, sentences should be split “where people would pause when reading out
loud”, as recommended by the Inclusion Europe English guidelines. We introduced a novel task:
automatic segmentation of sentences to conform to Easy Read principles. Two approaches were
evaluated: (1) scoring-based segmentation using constituency parsers and Masked Language
Model (MLM) scoring methods, and (2) generative LLM-based segmentation using in-context
learning and fine-tuning. Further details are available in [48].</p>
          </list-item>
          <list-item>
            <p>Corpus creation. Support for ER text generation remains limited, with few resources available
for the development of automated systems. We created a novel corpus of ER news texts in Basque
and Spanish, marking the first publicly accessible resource designed to support the training and
evaluation of ER text adaptation models in both languages. The work outlines the methodology
used to build the corpus and includes both intrinsic and extrinsic evaluations. The resource is
intended to be released to the research community under a CC-BY-NC-ND 4.0 license. As of this
writing, the associated publication is under review.</p>
          </list-item>
        </list>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Research Topics to Discuss</title>
      <p>As previously presented, this thesis aims to explore automatic text adaptation in Basque and Spanish,
following Easy Read guidelines. While several experiments have been carried out, some key research
questions and unresolved challenges remain:</p>
      <list list-type="bullet">
        <list-item>
          <p>What are the optimal criteria to identify information that needs to be preserved, omitted or
transformed in ER adaptation?</p>
        </list-item>
        <list-item>
          <p>How can the quality of complex-adapted alignment be evaluated accurately?</p>
        </list-item>
        <list-item>
          <p>What is the best strategy to generate domain-adapted explanations for complex terms?</p>
        </list-item>
        <list-item>
          <p>How can the hallucination problem be detected and mitigated in the generation of adapted or
explanatory content?</p>
        </list-item>
        <list-item>
          <p>How can image generation or selection be guided in a way that supports text comprehension
without introducing ambiguity or distraction?</p>
        </list-item>
        <list-item>
          <p>To what extent do existing evaluation metrics reflect the actual accessibility improvements
brought by automated methods?</p>
        </list-item>
      </list>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This work was partially supported by the Department of Economic Development and Competitiveness
of the Basque Government (Spri Group) via funding for the IRAZ project (ZL-2024/00570).</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>The author has not employed any Generative AI tools.</p>
      <p>of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022), Association
for Computational Linguistics, Abu Dhabi, United Arab Emirates (Virtual), 2022, pp. 86–97. URL:
https://aclanthology.org/2022.tsar-1.8/. doi:10.18653/v1/2022.tsar-1.8.
[15] W. Xu, C. Callison-Burch, C. Napoles, Problems in Current Text Simplification Research:
New Data Can Help, Transactions of the Association for Computational Linguistics
3 (2015) 283–297. URL: https://doi.org/10.1162/tacl_a_00139. doi:10.1162/tacl_a_00139.
[16] R. Chandrasekar, C. Doran, B. Srinivas, Motivations and Methods for Text Simplification, in:
COLING 1996 Volume 2: The 16th International Conference on Computational Linguistics, 1996. URL: https://aclanthology.org/C96-2183/.</p>
      <p>[17] R. Chandrasekar, B. Srinivas, Automatic induction of rules for text simplification,
Knowledge-Based Systems 10 (1997) 183–190. URL: https://www.sciencedirect.com/science/article/pii/S0950705197000294.
doi:10.1016/S0950-7051(97)00029-4.
[18] L. Specia, Translating from Complex to Simplified Sentences, in: T. A. S. Pardo, A. Branco,
A. Klautau, R. Vieira, V. L. S. de Lima (Eds.), Computational Processing of the Portuguese Language,
Springer Berlin Heidelberg, Berlin, Heidelberg, 2010, pp. 30–39.
[19] J. Qiang, Y. Li, Y. Zhu, Y. Yuan, Y. Shi, X. Wu, LSBert: Lexical Simplification Based on BERT,
IEEE/ACM Transactions on Audio, Speech, and Language Processing 29 (2021) 3064–3076.
doi:10.1109/TASLP.2021.3111589.
[20] L. Martin, É. de la Clergerie, B. Sagot, A. Bordes, Controllable Sentence Simplification, in:
N. Calzolari, F. Béchet, P. Blache, K. Choukri, C. Cieri, T. Declerck, S. Goggi, H. Isahara, B. Maegaard,
J. Mariani, H. Mazo, A. Moreno, J. Odijk, S. Piperidis (Eds.), Proceedings of the Twelfth Language
Resources and Evaluation Conference, European Language Resources Association, Marseille,
France, 2020, pp. 4689–4698. URL: https://aclanthology.org/2020.lrec-1.577/.
[21] K. C. Sheang, D. Ferrés, H. Saggion, Controllable Lexical Simplification for English, in: S. Štajner,
H. Saggion, D. Ferrés, M. Shardlow, K. C. Sheang, K. North, M. Zampieri, W. Xu (Eds.), Proceedings
of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022), Association
for Computational Linguistics, Abu Dhabi, United Arab Emirates (Virtual), 2022, pp. 199–206. URL:
https://aclanthology.org/2022.tsar-1.19/. doi:10.18653/v1/2022.tsar-1.19.
[22] M. Maddela, F. Alva-Manchego, W. Xu, Controllable Text Simplification with Explicit Paraphrasing,
in: K. Toutanova, A. Rumshisky, L. Zettlemoyer, D. Hakkani-Tur, I. Beltagy, S. Bethard, R. Cotterell,
T. Chakraborty, Y. Zhou (Eds.), Proceedings of the 2021 Conference of the North American Chapter
of the Association for Computational Linguistics: Human Language Technologies, Association
for Computational Linguistics, Online, 2021, pp. 3536–3553. URL: https://aclanthology.org/2021.naacl-main.277/.
doi:10.18653/v1/2021.naacl-main.277.
[23] K. C. Sheang, H. Saggion, Multilingual Controllable Transformer-Based Lexical Simplification,
Procesamiento del Lenguaje Natural 71 (2023) 109–123.</p>
      <p>[24] D. Aumiller, M. Gertz, UniHD at TSAR-2022 Shared Task: Is Compute All We Need for Lexical
Simplification?, in: S. Štajner, H. Saggion, D. Ferrés, M. Shardlow, K. C. Sheang, K. North,
M. Zampieri, W. Xu (Eds.), Proceedings of the Workshop on Text Simplification, Accessibility,
and Readability (TSAR-2022), Association for Computational Linguistics, Abu Dhabi, United
Arab Emirates (Virtual), 2022, pp. 251–258. URL: https://aclanthology.org/2022.tsar-1.28/.
doi:10.18653/v1/2022.tsar-1.28.
[25] S. Narayan, C. Gardent, S. B. Cohen, A. Shimorina, Split and Rephrase, in: M. Palmer, R. Hwa,
S. Riedel (Eds.), Proceedings of the 2017 Conference on Empirical Methods in Natural Language
Processing, Association for Computational Linguistics, Copenhagen, Denmark, 2017, pp. 606–616.
URL: https://aclanthology.org/D17-1064/. doi:10.18653/v1/D17-1064.</p>
      <p>[26] J. A. Botha, M. Faruqui, J. Alex, J. Baldridge, D. Das, Learning to Split and Rephrase From Wikipedia
Edit History, in: E. Riloff, D. Chiang, J. Hockenmaier, J. Tsujii (Eds.), Proceedings of the 2018
Conference on Empirical Methods in Natural Language Processing, Association for Computational
Linguistics, Brussels, Belgium, 2018, pp. 732–737. URL: https://aclanthology.org/D18-1080/.
doi:10.18653/v1/D18-1080.
[27] Y. Gao, T.-H. Huang, R. J. Passonneau, ABCD: A Graph Framework to Convert Complex Sentences
to a Covering Set of Simple Sentences, in: C. Zong, F. Xia, W. Li, R. Navigli (Eds.), Proceedings of the
59th Annual Meeting of the Association for Computational Linguistics and the 11th International
Joint Conference on Natural Language Processing (Volume 1: Long Papers), Association for
Computational Linguistics, Online, 2021, pp. 3919–3931. URL: https://aclanthology.org/2021.acl-long.303/.
doi:10.18653/v1/2021.acl-long.303.
[28] C. Niklaus, M. Cetto, A. Freitas, S. Handschuh, DisSim: A Discourse-Aware Syntactic Text
Simplification Framework for English and German, in: K. van Deemter, C. Lin, H. Takamura (Eds.),
Proceedings of the 12th International Conference on Natural Language Generation, Association
for Computational Linguistics, Tokyo, Japan, 2019, pp. 504–507. URL: https://aclanthology.org/W19-8662/.
doi:10.18653/v1/W19-8662.
[29] E. Sulem, O. Abend, A. Rappoport, Simple and Effective Text Simplification Using Semantic and
Neural Methods, in: I. Gurevych, Y. Miyao (Eds.), Proceedings of the 56th Annual Meeting of the
Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational
Linguistics, Melbourne, Australia, 2018, pp. 162–173. URL: https://aclanthology.org/P18-1016/.
doi:10.18653/v1/P18-1016.
[30] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, I. Polosukhin,
Attention is All you Need, in: I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S.
Vishwanathan, R. Garnett (Eds.), Advances in Neural Information Processing Systems, volume 30,
Curran Associates, Inc., 2017.
URL: https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf.
[31] L. Martin, A. Fan, É. de la Clergerie, A. Bordes, B. Sagot, MUSS: Multilingual Unsupervised
Sentence Simplification by Mining Paraphrases, in: N. Calzolari, F. Béchet, P. Blache, K. Choukri,
C. Cieri, T. Declerck, S. Goggi, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, J. Odijk, S. Piperidis
(Eds.), Proceedings of the Thirteenth Language Resources and Evaluation Conference, European
Language Resources Association, Marseille, France, 2022, pp. 1651–1664.
URL: https://aclanthology.org/2022.lrec-1.176/.
[32] K. C. Sheang, H. Saggion, Controllable Sentence Simplification with a Unified Text-to-Text Transfer
Transformer, in: A. Belz, A. Fan, E. Reiter, Y. Sripada (Eds.), Proceedings of the 14th International
Conference on Natural Language Generation, Association for Computational Linguistics, Aberdeen,
Scotland, UK, 2021, pp. 341–352. URL: https://aclanthology.org/2021.inlg-1.38/.
doi:10.18653/v1/2021.inlg-1.38.
[33] S. Štajner, D. Ferrés, M. Shardlow, K. North, M. Zampieri, H. Saggion, Lexical simplification
benchmarks for English, Portuguese, and Spanish, Frontiers in Artificial Intelligence 5 (2022).
URL: https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2022.991242.
doi:10.3389/frai.2022.991242.
[34] X. Lu, J. Qiang, Y. Li, Y. Yuan, Y. Zhu, An Unsupervised Method for Building Sentence Simplification
Corpora in Multiple Languages, in: M.-F. Moens, X. Huang, L. Specia, S. W.-t. Yih (Eds.), Findings
of the Association for Computational Linguistics: EMNLP 2021, Association for Computational
Linguistics, Punta Cana, Dominican Republic, 2021, pp. 227–237. URL: https://aclanthology.org/2021.findings-emnlp.22/.
doi:10.18653/v1/2021.findings-emnlp.22.
[35] O. Kariuk, D. Karamshuk, CUT: Controllable Unsupervised Text Simplification, CoRR
abs/2012.01936 (2020). URL: https://arxiv.org/abs/2012.01936. arXiv:2012.01936.
[36] P. Laban, T. Schnabel, P. Bennett, M. A. Hearst, Keep it Simple: Unsupervised Simplification of
Multi-Paragraph Text, in: C. Zong, F. Xia, W. Li, R. Navigli (Eds.), Proceedings of the 59th Annual Meeting
of the Association for Computational Linguistics and the 11th International Joint Conference on
Natural Language Processing (Volume 1: Long Papers), Association for Computational Linguistics,
Online, 2021, pp. 6365–6378. URL: https://aclanthology.org/2021.acl-long.498/.
doi:10.18653/v1/2021.acl-long.498.
[37] S. Agrawal, M. Carpuat, Controlling Pre-trained Language Models for Grade-Specific Text
Simplification, in: H. Bouamor, J. Pino, K. Bali (Eds.), Proceedings of the 2023 Conference
on Empirical Methods in Natural Language Processing, Association for Computational
Linguistics, Singapore, 2023, pp. 12807–12819. URL: https://aclanthology.org/2023.emnlp-main.790/.
doi:10.18653/v1/2023.emnlp-main.790.
[38] X. Wu, Y. Arase, An In-depth Evaluation of Large Language Models in Sentence
Simplification with Error-based Human Assessment, 2025. URL: https://arxiv.org/abs/2403.04963.
arXiv:2403.04963.
[39] F. Alva-Manchego, C. Scarton, L. Specia, The (Un)Suitability of Automatic Evaluation Metrics for
Text Simplification, Computational Linguistics 47 (2021) 861–889. URL: https://aclanthology.org/2021.cl-4.28/.
doi:10.1162/coli_a_00418.
[40] W. Xu, C. Napoles, E. Pavlick, Q. Chen, C. Callison-Burch, Optimizing Statistical Machine
Translation for Text Simplification, Transactions of the Association for Computational Linguistics 4
(2016) 401–415. URL: https://aclanthology.org/Q16-1029/. doi:10.1162/tacl_a_00107.
[41] E. Sulem, O. Abend, A. Rappoport, Semantic structural evaluation for text simplification, in:
M. Walker, H. Ji, A. Stent (Eds.), Proceedings of the 2018 Conference of the North American Chapter
of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long
Papers), Association for Computational Linguistics, New Orleans, Louisiana, 2018, pp. 685–696.</p>
      <p>URL: https://aclanthology.org/N18-1063/. doi:10.18653/v1/N18-1063.
[42] T. Zhang*, V. Kishore*, F. Wu*, K. Q. Weinberger, Y. Artzi, BERTScore: Evaluating Text
Generation with BERT, in: International Conference on Learning Representations, 2020. URL:
https://openreview.net/forum?id=SkeHuCVFDr.
[43] K. Papineni, S. Roukos, T. Ward, W.-J. Zhu, Bleu: a method for automatic evaluation of machine
translation, in: P. Isabelle, E. Charniak, D. Lin (Eds.), Proceedings of the 40th Annual Meeting
of the Association for Computational Linguistics, Association for Computational Linguistics,
Philadelphia, Pennsylvania, USA, 2002, pp. 311–318. URL: https://aclanthology.org/P02-1040/.
doi:10.3115/1073083.1073135.
[44] R. Flesch, A new readability yardstick, Journal of Applied Psychology 32 (1948) 221.
[45] J. Fernández Huerta, Medidas sencillas de lecturabilidad, Consigna 214 (1959) 29–32.
[46] T. Tanprasert, D. Kauchak, Flesch-Kincaid is Not a Text Simplification Evaluation Metric, in:
A. Bosselut, E. Durmus, V. P. Gangal, S. Gehrmann, Y. Jernite, L. Perez-Beltrachini, S. Shaikh,
W. Xu (Eds.), Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and
Metrics (GEM 2021), Association for Computational Linguistics, Online, 2021, pp. 1–14. URL:
https://aclanthology.org/2021.gem-1.1/. doi:10.18653/v1/2021.gem-1.1.
[47] D. Ponce, T. Etchegoyhen, J. Calleja, H. Gete, Split and Rephrase with Large Language Models,
in: L.-W. Ku, A. Martins, V. Srikumar (Eds.), Proceedings of the 62nd Annual Meeting of the
Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational
Linguistics, Bangkok, Thailand, 2024, pp. 11588–11607.
URL: https://aclanthology.org/2024.acl-long.622/. doi:10.18653/v1/2024.acl-long.622.
[48] J. Calleja, T. Etchegoyhen, D. Ponce, Automating Easy Read Text Segmentation, in: Y. Al-Onaizan,
M. Bansal, Y.-N. Chen (Eds.), Findings of the Association for Computational Linguistics: EMNLP
2024, Association for Computational Linguistics, Miami, Florida, USA, 2024, pp. 11876–11894. URL:
https://aclanthology.org/2024.findings-emnlp.694/. doi:10.18653/v1/2024.findings-emnlp.694.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Štajner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. C.</given-names>
            <surname>Sheang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Saggion</surname>
          </string-name>
          ,
          <article-title>Sentence Simplification Capabilities of Transfer-Based Models</article-title>
          ,
          <source>Proceedings of the AAAI Conference on Artificial Intelligence</source>
          <volume>36</volume>
          (
          <year>2022</year>
          )
          <fpage>12172</fpage>
          -
          <lpage>12180</lpage>
          . URL: https://ojs.aaai.org/index.php/AAAI/article/view/21477. doi:10.1609/aaai.v36i11.21477.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          United Nations
          ,
          <source>Universal Declaration of Human Rights, (Arts. 19 and 27)</source>
          ,
          <year>1948</year>
          . URL: https://www.un.org/en/about-us/universal-declaration-of-human-rights.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          European Union
          ,
          <source>Charter of Fundamental Rights of the European Union (Arts. 11, 22 and 42)</source>
          ,
          <year>2000</year>
          . URL: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=celex%3A12012P%2FTXT.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4] Government of Spain,
          <source>Spanish Constitution (Arts. 20 and 44)</source>
          ,
          <year>1978</year>
          . URL: https://www.boe.es/biblioteca_juridica/codigos/codigo.php?id=158_Constitucion_Espanola_________________The_Spanish_Constitution_&amp;modo=2.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>González-Sordé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Matamala</surname>
          </string-name>
          ,
          <article-title>Empirical evaluation of Easy Language recommendations: a systematic literature review from journal research in Catalan, English, and Spanish</article-title>
          ,
          <source>Universal Access in the Information Society</source>
          <volume>23</volume>
          (
          <year>2024</year>
          )
          <fpage>1369</fpage>
          -
          <lpage>1387</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Suárez-Figueroa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Diab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Ruckhaus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Cano</surname>
          </string-name>
          ,
          <article-title>First steps in the development of a support application for easy-to-read adaptation</article-title>
          ,
          <source>Universal Access in the Information Society</source>
          <volume>23</volume>
          (
          <year>2024</year>
          )
          <fpage>365</fpage>
          -
          <lpage>377</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Madina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gonzalez-Dios</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Siegel</surname>
          </string-name>
          ,
          <article-title>Languagetool as a CAT tool for Easy-to-Read in Spanish</article-title>
          ,
          in:
          <source>Proceedings of the 3rd Workshop on Tools and Resources for People with REAding DIfficulties (READI) @ LREC-COLING 2024</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>93</fpage>
          -
          <lpage>101</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>I.</given-names>
            <surname>Diab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Suárez-Figueroa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Peris</surname>
          </string-name>
          ,
          <article-title>Towards an automatic easy-to-read adaptation of dialogues in narrative texts in Spanish</article-title>
          , in: International Conference on Computers Helping People with Special Needs, Springer,
          <year>2024</year>
          , pp.
          <fpage>208</fpage>
          -
          <lpage>216</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Suárez De Figueroa Baonza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Blanco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Blanco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Diab Lozano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>García-Agudo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Muñoz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Torres</surname>
          </string-name>
          ,
          <article-title>ATECA: una aplicación web para agilizar las adaptaciones a lectura fácil</article-title>
          ,
          <source>Siglo Cero</source>
          <volume>56</volume>
          (
          <year>2025</year>
          )
          <fpage>115</fpage>
          -
          <lpage>137</lpage>
          . URL: https://revistas.usal.es/tres/index.php/0210-1696/article/view/32147. doi:10.14201/scero.32147.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>I.</given-names>
            <surname>Espinosa-Zaragoza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Abreu-Salas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Moreda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Palomar</surname>
          </string-name>
          ,
          <article-title>Automatic Text Simplification for People with Cognitive Disabilities: Resource Creation within the ClearText project</article-title>
          , in: S. Štajner,
          <string-name>
            <given-names>H.</given-names>
            <surname>Saggion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Shardlow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alva-Manchego</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the Second Workshop on Text Simplification, Accessibility and Readability</source>
          , INCOMA Ltd., Shoumen, Bulgaria, Varna, Bulgaria,
          <year>2023</year>
          , pp.
          <fpage>68</fpage>
          -
          <lpage>77</lpage>
          . URL: https://aclanthology.org/2023.tsar-1.7/.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>P.</given-names>
            <surname>Martínez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ramos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Moreno</surname>
          </string-name>
          ,
          <article-title>Exploring Large Language Models to generate Easy to Read content</article-title>
          ,
          <source>Frontiers in Computer Science</source>
          <volume>6</volume>
          (
          <year>2024</year>
          )
          <fpage>1394705</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>N.</given-names>
            <surname>Freyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Kempt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Klöser</surname>
          </string-name>
          ,
          <article-title>Easy-read and large language models: on the ethical dimensions of LLM-based text simplification</article-title>
          ,
          <source>Ethics and Information Technology</source>
          <volume>26</volume>
          (
          <year>2024</year>
          )
          <fpage>50</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>I.</given-names>
            <surname>Diab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Suárez-Figueroa</surname>
          </string-name>
          ,
          <article-title>First Attempt at an Automatic Adaptation of Explanatory Structures in Spanish to Easy-to-Read</article-title>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>I.</given-names>
            <surname>Gonzalez-Dios</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gutiérrez-Fandiño</surname>
          </string-name>
          ,
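          <string-name>
            <given-names>O. M.</given-names>
            <surname>Cumbicus-Pineda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Soroa</surname>
          </string-name>
          ,
          <article-title>IrekiaLFes: a New Open Benchmark and Baseline Systems for Spanish Automatic Text Simplification</article-title>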
          , in: S. Štajner,
          <string-name>
            <given-names>H.</given-names>
            <surname>Saggion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ferrés</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Shardlow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. C.</given-names>
            <surname>Sheang</surname>
          </string-name>
          , K. North,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          , W. Xu (Eds.), Proceedings
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>