=Paper=
{{Paper
|id=Vol-3846/paper17
|storemode=property
|title=First Attempt to an Automatic Adaptation of Explanatory Structures in Spanish to Easy-to-Read
|pdfUrl=https://ceur-ws.org/Vol-3846/paper17.pdf
|volume=Vol-3846
|authors=Isam Diab,Mari Carmen Suárez-Figueroa
|dblpUrl=https://dblp.org/rec/conf/sepln/DiabS24
}}
==First Attempt to an Automatic Adaptation of Explanatory Structures in Spanish to Easy-to-Read==
First Attempt at an Automatic Adaptation of Explanatory
Structures in Spanish to Easy-to-Read
Isam Diab¹, Mari Carmen Suárez-Figueroa¹
¹ Ontology Engineering Group (OEG), Universidad Politécnica de Madrid (UPM)
Abstract
Explanatory structures in the form of incises (e.g. nominal appositions and adjective clauses) can break the argumentative line of a sentence and cause the reader to lose focus. Thus, these structures are considered complex for different groups of the population who have reading comprehension difficulties, including people with cognitive disabilities. The Easy-to-Read (E2R) Methodology was created to provide clear and easily understood content to people with reading comprehension problems. This methodology recommends avoiding explanations between commas and appositions that interrupt the natural rhythm of reading. To help people with reading comprehension difficulties, we have developed a pair of initial Artificial Intelligence (AI)-based methods for automatically adapting explanatory structures in Spanish to E2R. The evaluation of the methods involved unit tests and the calculation of the sentence similarity between the original and the adapted sentences.
Keywords
Easy-to-Read (E2R), Cognitive Accessibility, Automatic Translation, Text Adaptation
SEPLN-2024: 40th Conference of the Spanish Society for Natural Language Processing. Valladolid, Spain. 24-27 September 2024.
Email: isam.diab@upm.es (I. Diab); mcsuarez@fi.upm.es (M. C. Suárez-Figueroa)
ORCID: 0000-0002-3967-0672 (I. Diab); 0000-0003-3807-5019 (M. C. Suárez-Figueroa)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073
1. Introduction
Equal opportunities and universal access to information are fundamental rights that every person should benefit from¹. However, certain groups of society, particularly those with cognitive or intellectual disabilities, present some difficulties related to reading comprehension processes. Therefore, prioritising the so-called cognitive accessibility becomes essential for promoting active participation in diverse social domains, such as politics, education, employment, and culture. For this reason, a methodology called Easy-to-Read (E2R) [1, 2, 3, 4] was created. The main goal of this methodology is to present clear and easily understood content by providing a set of guidelines on the content and the design and layout of written materials, for instance, to use short and simple sentences, to avoid the use of long words, or to divide ideas into paragraphs. This adaptation process is iterative and involves three key activities: analysis, adaptation and validation [4]. Nevertheless, the E2R methodology is currently implemented manually, which is costly and time-consuming, so it would benefit from having technological support. In this context, our research line is focused on applying different Artificial Intelligence (AI) methods and techniques² to automatically perform the analysis and the adaptation of Spanish documents to obtain easy-to-read versions. In particular, this paper concentrates on two of the E2R guidelines that influence the composition of the text [3, 4]: (a) to avoid explanations between commas; and (b) to avoid the use of appositions that interrupt the natural rhythm of reading.
¹ Convention on the Rights of Persons with Disabilities (United Nations, 2006). Available at: https://short.upm.es/k34jw
² We are investigating both symbolic (e.g. logical rules) and subsymbolic (e.g. neural networks and machine learning) approaches.
Several studies [5, 6, 7, 8, 9, 10] have shown that this type of explanatory structure presents a difficulty in the reading comprehension process, since it breaks the argumentative line and leads to missing information in the process of understanding the text. In this way, the adaptation of explanatory structures in the form of incises (i.e. nominal appositions and adjective clauses) to easy-to-read versions has a positive impact for people with reading comprehension difficulties.
Within the scope of technological support for addressing E2R guidelines and recommendations in Spanish texts, it is worth mentioning (a) Easy-to-Read Advisor [11], FACILE [12], Comp4Text [13], E2R-Helper [14] and ATECA³ for an E2R analysis of documents; and (b) Simplext [15], LexSIS [16], DysWebxia [17], EASIER [18], FACILE [12], ATECA⁴ and Simple.Text⁵ for creating simpler versions of original documents.
³ https://ateca.linkeddata.es/
⁴ https://ateca.linkeddata.es/
⁵ https://simpletext.demos.gplsi.es/
However, none of the aforementioned works specifically targets the identification and adaptation⁶ of explanatory structures in the form of incises into simpler or easy-to-read forms.
⁶ It is worth mentioning that text adaptation always aims to transform texts to meet the needs of a specific audience, while text simplification tends to reduce the complexity of texts and does not always take the final user into account.
Motivated by the aim of bridging this gap and enhancing the reading comprehension process, our research work focuses on automatically identifying and adapting explanatory structures in the form of nominal appositions and adjective clauses, since they impact linguistic aspects such as sentence length and sentence complexity. Therefore, we propose two methods based on symbolic AI⁷ to adapt explanatory structures that are not compliant with the E2R Methodology. We also implemented two proofs-of-concept based on these methods. It is worth mentioning that we have opted to use the term ‘adaptation’ consistently throughout the paper, as it aligns with the established terminology in E2R disciplines. This adaptation, achieved through our methods, can be viewed as a type of intralinguistic automatic translation tailored specifically for rendering sentences into an E2R version.
⁷ Human knowledge is explicitly represented in a declarative form (e.g. facts and rules). This way of proceeding is part of symbolic AI.
The rest of the paper is organised as follows: Section 2 is devoted to (a) how explanatory structures affect reading comprehension and cognitive accessibility, and (b) the automatic approaches for identifying and adapting these structures into simpler ones. In Section 3 we present our first attempts at methods for adapting both structures to the E2R
Methodology as well as the versions of the proofs-of-concept for the methods. Finally, we present some conclusions and future work.
2. State of the Art
In this work we delve into developing initial methods to automatically adapt explanatory structures in Spanish into easy-to-read and more accessible versions, based on the guidelines provided by the E2R Methodology [3, 4]. Therefore, in this section we (a) highlight some notes about this type of structure and its implications for reading comprehension (Section 2.1), and (b) summarise the automatic approaches carried out on the adaptation of such explanatory structures (Section 2.2).
2.1. Explanatory Structures and Cognitive Accessibility
Explanatory structures are incises that appear between commas within a sentence, interrupting the course of the utterance to add some precision or comment on the nominal element that precedes them [19, 20]. Explanatory structures can occur in two different forms, according to their syntactic nature. On the one hand, in the form of nominal appositions, that is, nouns or noun phrases, as in Julia, our cousin, lives in Canada. This type of apposition is formally represented as “A, B”. The segment B (also called apodosis in linguistic terms) represents in this variety a parenthetical noun phrase which adds some precision or some remark to clarify the reference of A (also called protasis), which is another noun phrase. In this sense, we observe that segment B assumes that the explanation is copulative; that is, the relationship between segments A and B is formed by the verb to be, following the pattern “A, B = A is B” (e.g. Julia, our cousin, lives in Canada > Julia is our cousin. Julia lives in Canada). On the other hand, explanatory structures can be non-restrictive⁸ adjective clauses. Such clauses can be (a) relative clauses, which are introduced by relative determiners or pronouns (viz. that, which, who, whom, whose), such as The house, which is on the seafront, is very bright; or (b) participial clauses, introduced by verbs in participle form, as in The man, tired from work, fell asleep⁹.
⁸ A non-restrictive clause adds additional information to a previous noun, called antecedent. It uses commas to show that the information is additional.
⁹ Examples provided by the Spanish Royal Academy of Language (RAE): https://short.upm.es/nbmpd
Explanatory structures, as a syntactic element that breaks the discourse line [3, 4], have been studied in relation to reading comprehension [6, 8, 10]. For Dillon and colleagues [10], comprehending an explanatory structure in the form of an incise involves the additional step of identifying which of the previous phrases of like type it is coreferential to. Certainly, they mention several studies [5, 7] suggesting that appositive relative subordinate clauses are often forgotten during the reading process, reflecting the well-known phenomenon that sentence details quickly disappear from memory. Furthermore, following this line, different analyses [9, 10] claim that there is increasing evidence that the syntactic form of appositive material that has come and gone, such as appositive relative clauses in medial position, becomes rapidly unavailable in short-term memory. Moreover, in a study [8] conducted to investigate the effects of aspects of contextual meaning on reading comprehension, the author realised that in the examples provided to the study participants, appositive structures in the form of explanatory incises were the most difficult to understand.
Thus, based on the aforementioned studies, there is clear evidence of the complexity of the explanatory structures that we are dealing with in this research work.
2.2. Automatic Approaches Addressing Explanatory Structures
In the context of Natural Language Processing (NLP), Text Simplification (TS) has gained considerable attention over the last decades. Specifically, syntactic simplification, which involves reducing the complexity of embedded sentences, has emerged as a key focus for the research community. Numerous works have been devoted to the simplification of complex sentences into simpler ones applied to different languages. One of the types of complex sentences that have been automatically addressed are relative clause sentences in the form of incises. For the first of these, we have to go back to the 1990s, when Chandrasekar and Srinivas [21] proposed the implementation of an algorithm through which generalised simplification rules are automatically derived from annotated training data in English. The process used a partial parsing technique that integrates constituent structure and dependency information, in order to simplify subordinated sentences that included relative clauses. Along the same line, relative clauses in the form of incises are also addressed in the work done by Siddharthan [22], which introduces a text simplification framework that uses transformation rules applied to a typed dependency representation generated by the Stanford parser¹⁰. In addition, for English as well, Dornescu and colleagues [23] explored the extraction of relative clauses employing a tagging approach. They manually annotated a dataset encompassing three text genres, enabling the development and comparison of rule-based and machine learning methods for automatically identifying appositions and non-restrictive relative clauses. They built a supervised tagging model for automatic detection of appositions using the tagged dataset. For languages other than English, in Brazilian Portuguese, Candido and colleagues [24] developed a rule-based system using a parser which provides lexical and syntactic information for the simplification of 22 complex linguistic phenomena, including relative clauses. For the Indonesian language, Haryadi and colleagues [25] replicated the same method of simplification and dataset as in Siddharthan’s work, where some relative clauses were handled. Regarding the Basque language, Aranzabe and colleagues [26] presented an architecture for a text simplification system based on hand-written rules specific for syntactic simplification. In the case of Spanish, both in the framework of the Simplext project [15] and in the work by Bott and colleagues [27], some relative clauses are covered in the syntactic simplification approach, making use of a hand-written computational grammar and dependency trees, and focussing on reducing sentence complexity.
¹⁰ https://short.upm.es/w7gxt
However, to the best of our knowledge, explanatory structures in the form of incises, whether appositions or non-restrictive relative clauses, have not been specifically dealt with in any research work in Spanish. Thus, in our work, we delve into analysing these structures in detail with the
aim of adapting them to easy-to-read versions following the E2R methodology.
3. Initial Methods for an E2R Adaptation of Explanatory Structures
::= ","
↪ ","
IF IdentificationPattern2 AND NounRelation =
↪ "amod" AND NounPhrase.POSTag = NOUN
THEN NominalApposition
Listing 2: Identification Pattern 2 for detecting appositional structures.
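As a hedged illustration of how the rule in Listing 2 could be applied to tagged tokens, the sketch below scans for a span enclosed in commas and flags it when the token heading the span is tagged NOUN and carries the 'amod' relation to a token before the opening comma. The `Token` class, the function name `find_apposition_pattern2`, the choice of the span head as "the noun" of the rule, and the hand-written tags are all assumptions made for this example; they are not taken from the proof-of-concept. In practice the tags would come from a tagger such as spaCy, where labels like 'amod' and 'appos' are exposed as dependency relations.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Token:
    text: str
    pos: str   # coarse part-of-speech tag, e.g. "NOUN", "DET", "PUNCT"
    dep: str   # relation of the token to its head, e.g. "amod", "nsubj"
    head: int  # index of the head token within the sentence


def find_apposition_pattern2(tokens: List[Token]) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first comma-delimited span matching the
    Listing 2 rule, with `end` exclusive, or None when nothing matches."""
    commas = [i for i, tok in enumerate(tokens) if tok.text == ","]
    for left, right in zip(commas, commas[1:]):
        span = range(left + 1, right)
        for i in span:
            tok = tokens[i]
            if tok.head in span:
                continue  # depends on a token inside the span, so not its head
            # Assumed reading of the rule: the span head is a NOUN attached
            # as "amod" to a token that precedes the opening comma.
            if tok.pos == "NOUN" and tok.dep == "amod" and tok.head < left:
                return (left + 1, right)
    return None


# Hand-written tags for illustration only (not actual tagger output).
sentence = [
    Token("El", "DET", "det", 1),
    Token("búho", "NOUN", "nsubj", 6),
    Token(",", "PUNCT", "punct", 3),
    Token("ave", "NOUN", "amod", 1),
    Token("rapaz", "ADJ", "amod", 3),
    Token(",", "PUNCT", "punct", 3),
    Token("ve", "VERB", "ROOT", 6),
    Token("bien", "ADV", "advmod", 6),
    Token("de", "ADP", "case", 9),
    Token("noche", "NOUN", "obl", 6),
    Token(".", "PUNCT", "punct", 6),
]

match = find_apposition_pattern2(sentence)
if match is not None:
    start, end = match
    print(" ".join(tok.text for tok in sentence[start:end]))
```

Returning token indices rather than text keeps the span available for a later adaptation step, which needs to know where the enclosing commas are.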
The aim of the proposed methods is (a) to detect explanatory structures in the form of appositions and adjective clauses written in Spanish, and (b) to adapt such structures into easy-to-read versions as the E2R methodology suggests. These initial methods are composed of the following activities: (1) Natural Language Processing (NLP), which includes tokenization, tagging tasks, morphology and dependency detection, (2) Explanatory Structures Identification, and (3) Explanatory Structures Adaptation. Section 3.1 explains the E2R adaptation methods for nominal appositions and Section 3.2 the methods for non-restrictive adjective clauses in Spanish, as well as the proofs-of-concept implemented based on such methods.
3.1. Nominal Appositions
Nominal appositions identification relies on the Part-of-Speech (PoS) tagging information, following an IF-THEN rule-based approach. In particular, we looked for the PoS tag that identifies an apposition, that is, ‘appos’¹¹. However, we realised in the initial tests that in two cases appositions were not identified by the PoS tag. Therefore, we analysed these cases to extract new apposition identification patterns to complement the identification by the PoS tag. For the first case, illustrated in Listing 1, it was decided to create a rule that whenever a noun phrase between commas functions (i.e. has the same tag) as its antecedent (a noun), it is an apposition. This noun phrase, being equally tagged, is acting as a “symmetrical” structure and both the noun phrase and the antecedent have the same syntactic information.
¹¹ https://short.upm.es/zuoji
::= ? ?
↪
::= ","
↪ ","
IF IdentificationPattern1 AND Antecedent.POSTag
↪ = Noun.POSTag
THEN NominalApposition
Listing 1: Identification Pattern 1 for detecting appositional structures.
For the second case, since the apposition can be seen as a complement of the noun, it was decided that in those cases where a noun phrase is enclosed in commas and the noun is treated as a complement of the noun (PoS tag ‘amod’), it should be identified as an apposition. Since appositions are noun phrases, it is required that the apposition is marked with the tag ‘NOUN’. Listing 2 shows this identification pattern.
Regarding the adaptation of nominal appositions, in general, the transformation of the apposition into a more accessible and easier structure consists of splitting the appositive structure into two simple sentences according to the pattern “A, B = A is B”, mentioned in Section 2.1: (a) on the one hand, the main idea, and (b) on the other hand, the explanation: En el congreso conocí al famoso investigador, quizá la persona que más influyó en mi trabajo. > En el congreso conocí al famoso investigador (main idea). El famoso investigador es quizá la persona que más influyó en mi trabajo (explanation)¹².
¹² Translation (Tr.): At the congress I met the famous researcher, perhaps the person who most influenced my work. > At the congress I met the famous researcher. The famous researcher is perhaps the person who most influenced my work.
Considering this, we have dealt with the adaptation of four cases of apposition occurrences according to their syntactic nature:
• Case A. The apposition is not marked by a determiner. As explained in Section 2.1, the apposition is a noun phrase (or nominal syntagm). Typically a nominal syntagm is formed by a determiner (definite article el/la/los/las¹³ (‘the’) or indefinite article un/una/unos/unas (‘a’)) preceding the noun. However, occasionally the apposition lacks a determiner (as it is omitted because it is taken for granted in the discourse). For example: El búho, ave rapaz, ve bien de noche. > #El búho es ave rapaz¹⁴. El búho ve bien de noche¹⁵. The fact that the determiner is assumed by its absence means precisely that the noun it should accompany is not definite, i.e. no previous reference has been made to that noun. When we use definite determiners (el/la/los/las) with a noun, we allude to the fact that there has already been a previous reference in the text to that noun. Therefore, we can assume that the absence of a determiner is synonymous with the use of an indefinite determiner (un/una/unos/unas). Thus, in this type of apposition, the adaptation is as follows: El búho, ave rapaz, ve bien de noche > El búho es un ave rapaz. El búho ve bien de noche¹⁶.
¹³ Note that in Spanish we use the slash symbol (/) to indicate gender and number variations of the same word.
¹⁴ The hash (#) is used in linguistics to express that a structure is unusual, although it makes sense grammatically.
¹⁵ Tr.: The owl, bird of prey, sees well at night. > #The owl is bird of prey. The owl sees well at night.
¹⁶ Tr.: The owl, bird of prey, sees well at night. > The owl is a bird of prey. The owl sees well at night.
Then, in the adaptation process, illustrated in Listing 3, the steps to create the two new simple structures are the following:
1 To remove the commas.
2 To insert the verb ser (‘to be’) in its concordant form to the subject and to the originally main
verb. In the example above the inserted form is es (‘is’) because the subject is búho (‘owl’) (3rd person and singular number) and the verb ve (‘sees’) is in the present indicative tense.
3 To insert the indefinite determiner. For this step, as before, the subject information is consulted (búho is masculine and singular), and the indefinite determiner that meets the same characteristics is added (in this case, un).
4 To close the first segment or protasis with a full stop.
5 To create the second segment or apodosis, retrieving the antecedent noun (El búho (‘the owl’)) and placing it preceding the rest of the predicate (ve bien de noche (‘sees well at night’)).
::= el|la|los|las
::= un|una|unos|unas
::= Subject
↪ ConjugatedVbSer IndefiniteDeterminer
↪ NounPhrase"." Subject Predicate"."
IF Determiner NOT IN NounPhrase AND NounPhrase
↪ IS NominalApposition AND Sentence = Subject,
↪ NounPhrase, Predicate
THEN AdaptationPattern1
Listing 3: Adaptation Pattern for Case A appositions.
In the case of proper nouns, which are naturally not accompanied by a determiner, this rule does not apply. It is possible either to eliminate the commas, as in Mi primo, Juan, vive en Canarias. > Mi primo Juan vive en Canarias¹⁷; or to carry out the same process of creating two sentences, as in Mi primo, Juan, vive en Canarias. > Mi primo es Juan. Mi primo vive en Canarias¹⁸. We have opted to eliminate the commas in these specific cases, since the apposition between commas is a simple element (a proper noun), and not a construction with more elements that can cut the rhythm of the reading. In this way, the deletion of commas is the most direct form of adaptation as it does not interfere with the content.
¹⁷ Tr.: My cousin, Juan, lives in the Canary Islands. > My cousin Juan lives in the Canary Islands.
¹⁸ Tr.: My cousin, Juan, lives in the Canary Islands. > My cousin is Juan. My cousin lives in the Canary Islands.
• Case B. The apposition is marked by a determiner. Contrary to Case A, when the apposition contains a determiner, the steps mentioned in Case A are followed, except for the inclusion of a determiner, which is now explicit. Thus, for an apposition with a definite determiner, the adaptation is as follows: Ana, la amiga de Sara, vino a la fiesta > Ana es la amiga de Sara. Ana vino a la fiesta¹⁹. And, in the case of an apposition with an indefinite determiner, the adaptation is as follows: Jorge VI, uno de los reyes de Gran Bretaña, tuvo muchas hijas. > Jorge VI fue uno de los reyes de Gran Bretaña. Jorge VI tuvo muchas hijas²⁰. The pattern for this case is illustrated in Listing 4.
¹⁹ Tr.: Ana, the friend of Sara, came to the party. > Ana is the friend of Sara. Ana came to the party.
²⁰ Tr.: George VI, one of the kings of Great Britain, had many daughters. > George VI was one of the kings of Great Britain. George VI had many daughters.
::= Subject
↪ ConjugatedVbSer Determiner NounPhrase"."
↪ Subject Predicate"."
IF Determiner IN NounPhrase AND NounPhrase IS
↪ NominalApposition AND Sentence = Subject,
↪ NounPhrase, Predicate
THEN AdaptationPattern2
Listing 4: Adaptation Pattern for Case B appositions.
• Case C. The apposition is headed by a deictic. In spoken discourse, or in certain literary contexts, so-called opaque deictics (pronouns, in this case) are sometimes used, which make a non-literal spatio-temporal allusion. For example, in Alberti, ese poeta políticamente comprometido, llegó el lunes²¹ we see how ese (‘that’) is a deictic pronoun which does not really point to anything: it is opaque, and it makes an allusion to the listener’s supposed knowledge of the information in the apposition about Alberti. That is, the sender uses it to include the receiver as knowing that Alberti was a politically committed poet, but frames Alberti in a space. For its adaptation pattern (see Listing 5), the same approach is proposed as in Case B, except that the deictic pronoun (ese/esa/esos/esas (‘that’)) is replaced by an indefinite determiner (un/una/unos/unas), matching its gender and number. Thus, the example above is adapted as follows: Alberti es un poeta políticamente comprometido. Alberti llegó el lunes²².
²¹ Tr.: Alberti, that politically committed poet, arrived on Monday.
²² Tr.: Alberti is a politically committed poet. Alberti arrived on Monday.
::= ese|esa|esos|esas|
este|esta|estos|estas|
aquel|aquella|aquellos|aquellas
::= Subject
↪ ConjugatedVbSer IndefiniteDeterminer
↪ NounPhrase"." Subject Predicate"."
IF DeicticPronoun IN NounPhrase AND NounPhrase
↪ IS NominalApposition AND Sentence = Subject,
↪ NounPhrase, Predicate
THEN AdaptationPattern3
Listing 5: Adaptation Pattern for Case C appositions.
• Case D. The apposition can be a clause attached to a proper noun. It is important to mention that in Spanish there are other explanatory structures in the form of incises which are not appositions but can be treated as such. These are parenthetical structures, short and simple syntagms, which give information by the speaker within the discourse (unlike in Case B, these incises lack complements; they are simpler than the noun phrases of appositions). Their function may be explanatory, but may also be due to factors such as emphasis or reiteration of something
previously said [19]. In these cases, it is proposed that, since they do not express an explanation per se, the sentence is transformed by reorganising the clause, as shown in Listing 6. That is, in Juan, el pobre, lo perdió todo, the clause el pobre is placed before the proper noun in the adaptation: El pobre Juan lo perdió todo²³. The same applies to Arturo, mi amigo, no quiere venir > Mi amigo Arturo no quiere venir²⁴.
²³ Tr.: Poor Juan lost everything.
²⁴ Tr.: Arturo, my friend, does not want to come. > My friend Arturo does not want to come.
::= Determiner Noun
↪ ProperNoun Verb RestOfSentence"."
IF ProperNoun.POSTag = PROPN AND
↪ (Determiner.POSTag = DET AND Noun.POSTag =
↪ NOUN) AND Sentence = Subject, ProperNoun,
↪ Predicate
THEN AdaptationPattern4
Listing 6: Adaptation Pattern for Case D appositions.
Based on the proposed method for identifying and adapting appositions in Spanish, we have developed a proof-of-concept (PoC)²⁵ implemented in Python 3.9.
²⁵ The proof-of-concept is not yet available online but we are working to make it available as soon as possible.
In detail, on the one hand, regarding the identification activity, we made use of the PoS tags provided by the NLP library spaCy²⁶ to retrieve the specific tags related to the apposition structure and the different patterns posed in the aforementioned cases. As for the evaluation of this activity, we have manually built a set of unit tests. For this purpose, we collected a sample of 52 sentences²⁷ (with an average word count of 11.8) extracted from CREA Corpus²⁸, of which 34 include an apposition. We manually classified the collection of sentences in binary form (true-false classification), and then analysed the classification performance of our system by using a confusion matrix to measure the number of hits and misses the system made when applying the patterns to identify appositions. The results appear favourable, since the reported precision was 0.94 and the recall was 0.94. Analysing these results, we found that the false positives and false negatives were due to PoS tagging errors on the part of spaCy. For example, in the sentence Julia, mi perra, necesita que la paseen varias veces al día²⁹, spaCy labels the noun perra (‘dog’), which is the apposition, as a continuation of the noun Julia, thus omitting the explanatory apposition.
²⁶ https://spacy.io/ We used the trained model for Spanish es_core_news_lg for developing all the activities.
²⁷ Both the set of sentences and the analysis results are available at: https://doi.org/10.5281/zenodo.11397343
²⁸ https://short.upm.es/ydq6p
²⁹ Tr.: Julia, my dog, needs to be walked several times a day.
On the other hand, with respect to the adaptation activity, PoS tags provided by spaCy are also used. In addition, in the cases where the adaptation requires the inclusion of the verb to be or determiners agreeing with the subject and the original verb, a dictionary has been manually created including the different verb and determiner forms. For the evaluation of the adaptation activity, we opted for a language-model-based approach, since we aimed to measure whether the semantic content of the adaptation maintains that of the original. Specifically, we have used a sentence similarity model³⁰ for Spanish (available at the Hugging Face repository³¹) to compare the original sentence with apposition and the adapted sentence provided by our PoC according to the E2R methodology. The choice of this language model is based on a previous work³² we carried out in which we compared the performance of different sentence similarity models in Spanish. By using this model, vector representations of the sentences can be obtained and the semantic similarity between them can be calculated. The model receives as input the original sentence and the transformed sentence, and returns a number indicating the degree of similarity between them, where 0 means not similar at all and 1 means completely similar. We obtained an average of 0.94 similarity between all 34 sentences with apposition that were adapted. In more detail, one of the main errors encountered in the adaptation activity has to do with the conjugation of the verb ser (‘to be’), since an explanation of a subject can be presented in a different tense than the time at which the main action of the sentence occurs. For example, the sentence Nuestros vecinos, los Pérez, se fueron de vacaciones³³ is adapted as #Nuestros vecinos fueron los Pérez. Nuestros vecinos se fueron de vacaciones³⁴, because the main verb of the sentence is fueron (‘they were’) (3rd person plural of the preterite perfect simple indicative) and therefore the verb ser is in the same verb tense (fueron). The conjugation of the verb ser is arguably not the right one in this situation, since according to the reader’s logic it is assumed that if the Pérez went on holiday, they are still the speaker’s neighbours, so the verb ser should be in the 3rd person plural of the present tense (son (‘they are’)). In addition, as a qualitative analysis, we manually analysed the adapted sentences for grammatical sense. The result is positive, as all sentences generated by our method are grammatically correct.
³⁰ https://short.upm.es/w2slm
³¹ https://huggingface.co/
³² https://oa.upm.es/75516/
³³ Tr.: Our neighbours, the Pérez, went on holiday.
³⁴ Tr.: #Our neighbours were the Pérez. Our neighbours went on holiday.
3.2. Non-restrictive Adjective Clauses
Similarly to the identification of nominal appositions, in the case of the non-restrictive adjective clauses identification, we used Part-of-Speech (PoS) tagging information, by means of a rule-based approach. Nevertheless, following the method in Section 3.1, we performed initial tests to analyse those cases in which the adjective clauses were not identified by the specific PoS tag, in order to create patterns to cover these cases. We crafted specific rules for covering the following cases: (a) the relative pronoun consists of two words forming a single semantic unit (e.g. el que, la que, los que, las que), where the pattern (see Listing 7) identifies the sentence as a relative clause not only when encountering a comma followed by a relative pronoun, but also when encountering a comma followed by a definite article and then a relative pronoun. On the other hand, (b) participle sentences with verbal periphrasis in the main sentence, in which the auxiliary verb is treated as the main verb, posing a problem when performing the transformation, since we depend on the main verb to extract the morphological information. The solution by means of this rule is straightforward: whenever the “auxiliary verb + main verb” pattern is detected, the morphological information of the auxiliary
verb is consulted in the adaptation.

 ::= que
 ::= el|la|los|las
 ::= ","
↪ (|
↪ )"," Predicate
IF IdentificationPattern1
THEN AdjectiveRelativeClause

Listing 7: Identification Pattern for Case A adjective clauses.

As for the adaptation activity, the easy-to-read adaptation of the adjective clauses into a more accessible and easier structure should consist of splitting the adjective clause into two simple sentences: (a) on the one hand, the explanation and (b) on the other hand, the main idea. For example, the sentence La enfermera, que tiene 63 años, está a punto de jubilarse35 is adapted as follows: the explanation La enfermera tiene 63 años36 and the main idea La enfermera está a punto de jubilarse37.

In more detail, the process of adapting the non-restrictive adjective clauses into more easily understood structures depends on the two types of adjective clause we mentioned in Section 2.1:

Relative Clauses. This type of adjective clause can be introduced by different elements, and thus be classified as:

(a) Introduced by a relative pronoun. In this case, the procedure is straightforward. The relative pronoun and the commas are removed, and the two new simple structures are reorganised, keeping the verbs and the rest of the complements the same: Juan, que trabaja mucho, decidió tomarse un descanso38 is adapted as Juan trabaja mucho. Juan decidió tomarse un descanso39.

(b) Introduced by a possessive relative determiner. This type of relative (cuyo, cuya, cuyos, cuyas (‘whose’)) presents a relation with the antecedent as a complement to the noun, and this function in Spanish is expressed by means of the prepositional syntagm “de (‘of’) + noun”. For example, Esta chica, cuyo padre vive en Malasia, se mudó a las islas40 is adapted as: Esta chica se mudó a las islas. El padre de esta chica vive en Malasia41. In this case, the main idea is placed first and the explanation afterwards, in order to avoid problems of coreference. Thus, the adaptation process for these cases is based on the following steps:

1 To remove the commas.
2 To remove the possessive.
3 To move the main sentence to the first segment or protasis and close it with a full stop.
4 To create the second segment or apodosis by adding as subject the subject of the subordinate clause and the prepositional phrase “de + the subject of the main clause”. The added subject of the apodosis has to be marked with a definite article determiner (el, la, los, las) according to the gender and number of the noun.

Participial Clauses. Whereas relative clauses present a nexus (the relative pronoun) linking the antecedent with the subordinate clause, participial clauses follow the same pattern as nominal appositions, i.e. “A, B = A is B”. For instance, in the sentence El hombre, cansado de trabajar, se durmió42, we assume that El hombre (‘the man’) (segment A) “was” tired from working (segment B). Considering that, the adaptation of this type of adjective clause is as follows:

1 To remove the commas.
2 To insert the verb estar (‘to be’) in its concordant form with the subject and the original main verb. If the main verb appears in the present tense, the verb estar also appears in the present tense, and if it occurs in any past tense, we opted to use the imperfect past tense.
3 To close the first segment or protasis with a full stop.
4 To create the second segment or apodosis by retrieving the antecedent preceding the main sentence.

We have developed a proof-of-concept (PoC)43 of the proposed method, implemented in Python 3.9.

With respect to the identification activity, as in the method for nominal appositions, we used the PoS tags provided by spaCy44 to get the specific tag related to the relative pronoun that introduces the clause. To assess our method of identifying non-restrictive adjective clauses, we assembled a set of 96 sentences45 from various sources (educational textbooks and literary works), with an average word count of 9.7, of which 62 were adjective sentences of the aforementioned types. We started by manually assigning binary tags to the sentence collection. Next, we evaluated the classification performance of our system by analysing a confusion matrix, which enabled us to quantify both correct classifications and errors made by the system. As a result, we obtained 0.95 precision and 0.86 recall. In this case, one of the most frequent errors is the incorrect identification of exhortative (expressing command) or desiderative (expressing desire) subordinate clauses as adjective clauses: spaCy detects the conjunction that acts as a nexus of the main and the subordinate sentence as a relative pronoun, since both nexus and relative pronoun have the same form (que), e.g. Juan, que te portes bien, por favor46.

The same methodology has been used to develop the adaptation activity as in the case of nominal appositions. This task relies on the PoS tags provided by spaCy. Furthermore, for cases where adaptation involves the addition of the verb to be or determiners that match the subject and original verb, we developed a dictionary manually. To evaluate the adaptation process, we employed a language-model-based approach. We have yet again used the Spanish sentence similarity model47 to compare the vectorial similarity between the original adjective clause and the adapted sentence provided by our system. The results showed an average similarity of 0.94 across the 62 adapted sentences in relation to their original versions. As in the previous case, we have manually analysed the adapted sentences and they are all grammatically correct.

35 Tr.: The nurse, who is 63 years old, is about to retire.
36 Tr.: The nurse is 63 years old.
37 Tr.: The nurse is about to retire.
38 Tr.: Juan, who works a lot, decided to take a break.
39 Tr.: Juan works a lot. Juan decided to take a break.
40 Tr.: This girl, whose father lives in Malaysia, moved to the islands.
41 Tr.: This girl moved to the islands. The father of this girl lives in Malaysia.
42 Tr.: The man, tired from working, fell asleep.
43 The proof-of-concept is not yet available online but we are working to make it available as soon as possible.
44 https://spacy.io/
45 Both sets of sentences and analysis results are available at: https://doi.org/10.5281/zenodo.11397343
46 Tr.: Juan, please behave yourself.
47 https://short.upm.es/w2slm
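The adaptation steps described in this section for relative clauses (cases (a) and (b)) and for participial clauses can be illustrated with a minimal Python sketch. It is not the authors' proof-of-concept: it swaps the spaCy PoS tags for plain regular expressions, and the names RELATIVE, POSSESSIVE, PARTICIPIAL, ARTICLE, adapt and adapt_participial are hypothetical. It also assumes that the antecedent starts with a determiner (so that it can be lower-cased after "de") and that the caller supplies the concordant form of estar, which the paper obtains from morphological information and a manually built dictionary.

```python
import re

# Hypothetical surface patterns (assumption: the real method uses spaCy PoS tags).
# "A, que B, C": antecedent, relative clause, main predicate.
RELATIVE = re.compile(
    r"^(?P<ante>[^,]+),\s*que\s+(?P<sub>[^,]+),\s*(?P<main>.+?)\.?$")
# "A, cuyo N B, C": possessive relative determiner followed by a noun.
POSSESSIVE = re.compile(
    r"^(?P<ante>[^,]+),\s*(?P<poss>cuy[oa]s?)\s+(?P<noun>\S+)\s+"
    r"(?P<sub>[^,]+),\s*(?P<main>.+?)\.?$")
# "A, <regular participle> ..., C": only -ado/-ido participles are covered.
PARTICIPIAL = re.compile(
    r"^(?P<ante>[^,]+),\s*(?P<part>\w+[ai]d[oa]s?\b[^,]*),\s*(?P<main>.+?)\.?$")

# Definite article agreeing in gender and number with the possessive.
ARTICLE = {"cuyo": "El", "cuya": "La", "cuyos": "Los", "cuyas": "Las"}


def adapt(sentence):
    """Split a non-restrictive relative clause into two simple sentences."""
    text = sentence.strip()
    m = POSSESSIVE.match(text)
    if m:
        # Case (b): main idea first, then "article + noun + de + antecedent".
        # Assumption: the antecedent starts with a determiner, so its first
        # letter can be lower-cased (this would be wrong for a proper noun).
        ante = m["ante"][:1].lower() + m["ante"][1:]
        return (f"{m['ante']} {m['main']}. "
                f"{ARTICLE[m['poss']]} {m['noun']} de {ante} {m['sub']}.")
    m = RELATIVE.match(text)
    if m:
        # Case (a): explanation first, then the main idea.
        return f"{m['ante']} {m['sub']}. {m['ante']} {m['main']}."
    return sentence  # no pattern matched: leave the sentence untouched


def adapt_participial(sentence, estar_form):
    """Adapt "A, participle..., C"; estar_form is supplied by the caller."""
    m = PARTICIPIAL.match(sentence.strip())
    if not m:
        return sentence
    return f"{m['ante']} {estar_form} {m['part']}. {m['ante']} {m['main']}."


print(adapt("Juan, que trabaja mucho, decidió tomarse un descanso."))
print(adapt("Esta chica, cuyo padre vive en Malasia, se mudó a las islas."))
print(adapt_participial("El hombre, cansado de trabajar, se durmió.", "estaba"))
```

On the three example sentences of this section the sketch reproduces the adaptations given in the text. Being purely string-based, it would also fire on any other que that follows a comma, such as the exhortative example mentioned in the evaluation.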
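The two evaluation measures used in this section, precision and recall over manually assigned binary tags and a vector similarity between original and adapted sentences, can be sketched as follows. This is an illustration under assumptions, not the authors' evaluation code: the function names are hypothetical, the labels and vectors are toy values rather than the paper's data, and cosine similarity is assumed as the vector comparison, whereas the paper only states that a Spanish sentence similarity model was used.

```python
import math


def precision_recall(gold, predicted):
    """Precision and recall from parallel lists of binary tags (1 = adjective clause)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)       # correctly identified
    fp = sum(1 for g, p in pairs if not g and p)   # wrongly identified
    fn = sum(1 for g, p in pairs if g and not p)   # missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def cosine_similarity(u, v):
    """Cosine of the angle between two equally sized embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy values only (assumption): five tagged sentences and two 2-d "embeddings".
gold_tags = [1, 1, 1, 0, 0]
system_tags = [1, 1, 0, 1, 0]
precision, recall = precision_recall(gold_tags, system_tags)
print(round(precision, 2), round(recall, 2))
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
```

In a real run the vectors would come from the sentence similarity model, and the reported figure would be the mean of the per-sentence similarities over all adapted sentences.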
4. Conclusions and Future Work

This paper aims to enhance the cognitive accessibility of Spanish texts by proposing two methods for adapting sentences containing explanatory structures such as nominal appositions and adjective clauses, following an E2R approach. While the methods we propose for the identification and adaptation of these structures might seem simple and straightforward, we believe they represent a valuable contribution to the field of cognitive accessibility. These methods have been implemented as proofs of concept to help in the (semi-)automatic adaptation task of improving text accessibility for individuals with reading comprehension difficulties, including those with cognitive disabilities. We have evaluated the methods by unit tests and by using a language model to calculate the similarity between the original sentences and the ones adapted to E2R, obtaining generally satisfactory results. However, since the text adaptation aims to meet the needs of particular groups, a user-based evaluation is essential to complete the assessment of our methods.

As further research, several actions are planned to improve the initial attempt presented in this work: (a) we are going to analyse the possibility of proposing methods based on subsymbolic AI techniques, such as the use of language models or large language models (LLMs); the ultimate goal is to compare the results obtained using the methods proposed in this paper with the ones obtained with the subsymbolic-based methods; (b) we are going to implement a web application in the context of assistive technologies for adapting the explanatory structures which are not compliant with the E2R methodology; and (c) we have planned to involve people with cognitive disabilities to evaluate the aforementioned web application.
Acknowledgments

This work has been supported by the grant “Ayudas para la contratación de personal investigador predoctoral en formación para el año 2022” funded by Comunidad Autónoma de Madrid (Spain). We would like to thank Miguel Cerezo Durán and Clara Osorio Sanz for their help in the initial methods development.
M. A. Díez-Álamo, M. J. Sánchez, D. Wojcik, E2R-
Helper Asistente Lectura Fácil, Instituto Universitario
References de Integración en la Comunidad - Universidad de
Salamanca, 2019. URL: https://eapoyo-inico.usal.es/
[1] Inclusion Europe, Information for All. European stan-
asistente-lectura-facil/.
dards for making information easy to read and under-
[15] H. Saggion, S. Stajner, S. Bott, S. Mille, L. Rello, B. Drn-
stand, Inclusion Europe, 2009.
darevic, Making It Simplext: Implementation and
[2] M. Nomura, G. S. Nielsen, International Federation
Evaluation of a Text Simplification System for Span-
of Library Associations and Institutions, Library Ser-
ish, ACM Transactions on Accessible Computing 6
vices to People with Special Needs Section, Guidelines
(2015). doi:10.1145/2738046 .
for easy-to-read materials, IFLA Headquarters, The
[16] S. Bott, L. Rello, B. Drndarevic, H. Saggion, Can Span-
Hague, 2010.
ish Be Simpler? LexSiS: Lexical Simplification for Span-
[3] O. García Muñoz, Lectura fácil: Métodos de redacción
ish, in: Proceedings of COLING 2012, The COLING
y evaluación, Real Patronato sobre Discapacidad, 2012.
2012 Organizing Committee, 2012.
[17] L. Rello, R. Baeza-Yates, H. Saggion, DysWebxia: tex-
tos más accesibles para personas con dislexia, Proce-
samiento del Lenguaje Natural 51 (2013) 205–208.
[18] L. Moreno, R. Alarcon, R. Martínez, EASIER System.
Language Resources for Cognitive Accessibility, in:
The 22nd International ACM SIGACCESS Conference
on Computers and Accessibility, ASSETS ’20, Associ-
ation for Computing Machinery, 2020. doi:10.1145/
3373625.3418006 .
[19] RAE, ASALE, Nueva gramática de la lengua española,
Espasa Calpe, Madrid, 2009.
[20] RAE, ASALE, Ortografía de la lengua española, Espasa
Calpe, Madrid, 2010.
[21] R. Chandrasekar, B. Srinivas, Automatic
induction of rules for text simplification,
Knowledge-Based Systems 10 (1997) 183–190.
URL: https://www.sciencedirect.com/science/
article/pii/S0950705197000294. doi:https:
//doi.org/10.1016/S0950- 7051(97)00029- 4 .
[22] A. Siddharthan, Text simplification using typed depen-
dencies: A comparision of the robustness of different
generation strategies, in: C. Gardent, K. Striegnitz
(Eds.), Proceedings of the 13th European Workshop on
Natural Language Generation, Association for Com-
putational Linguistics, Nancy, France, 2011, pp. 2–11.
URL: https://aclanthology.org/W11-2802.
[23] I. Dornescu, R. Evans, C. Orăsan, Relative clause ex-
traction for syntactic simplification, in: C. Orasan,
P. Osenova, C. Vertan (Eds.), Proceedings of the Work-
shop on Automatic Text Simplification - Methods
and Applications in the Multilingual Society (ATS-
MA 2014), Association for Computational Linguis-
tics and Dublin City University, Dublin, Ireland, 2014,
pp. 1–10. URL: https://aclanthology.org/W14-5601.
doi:10.3115/v1/W14- 5601 .
[24] A. Candido, E. Maziero, L. Specia, C. Gasperin,
T. Pardo, S. Aluisio, Supporting the adaptation of texts
for poor literacy readers: a text simplification editor
for Brazilian Portuguese, in: J. Tetreault, J. Burstein,
C. Leacock (Eds.), Proceedings of the Fourth Work-
shop on Innovative Use of NLP for Building Educa-
tional Applications, Association for Computational
Linguistics, Boulder, Colorado, 2009, pp. 34–42. URL:
https://aclanthology.org/W09-2105.
[25] M. L. Khodra, A. B. Sasmita, et al., Rule-based text
simplification for extractive automatic summarization
of news articles in indonesian, in: 2023 10th Interna-
tional Conference on Advanced Informatics: Concept,
Theory and Application (ICAICTA), IEEE, 2023, pp.
1–6.
[26] M. Aranzabe, A. Ilarraza, I. Gonzalez-Dios, First ap-
proach to automatic text simplification in basque, 2012,
pp. 1–8.
[27] S. Bott, H. Saggion, S. Mille, Text simplification
tools for Spanish, in: N. Calzolari, K. Choukri,
T. Declerck, M. U. Doğan, B. Maegaard, J. Mariani,
A. Moreno, J. Odijk, S. Piperidis (Eds.), Proceedings
of the Eighth International Conference on Language
Resources and Evaluation (LREC’12), European Lan-
guage Resources Association (ELRA), Istanbul, Turkey,
2012, pp. 1665–1671.