First Attempt at an Automatic Adaptation of Explanatory Structures in Spanish to Easy-to-Read

Isam Diab¹, Mari Carmen Suárez-Figueroa¹
¹ Ontology Engineering Group (OEG), Universidad Politécnica de Madrid (UPM)

Abstract
Explanatory structures in the form of incises (e.g. nominal appositions and adjective clauses) can break the argumentative line of a sentence and cause readers to lose focus. Thus, these structures are considered complex for different groups of the population who present reading comprehension difficulties, including people with cognitive disabilities. The Easy-to-Read (E2R) Methodology was created to provide clear and easily understood contents to people with reading comprehension problems. This methodology recommends avoiding the use of explanations between commas and avoiding the use of appositions that interrupt the natural rhythm of reading. To help people with difficulties in reading comprehension, we have developed a pair of initial Artificial Intelligence (AI)-based methods for automatically adapting explanatory structures in Spanish to E2R. The evaluation of the methods involved unit tests and the calculation of the sentence similarity between the original and the adapted sentences.

Keywords
Easy-to-Read (E2R), Cognitive Accessibility, Automatic Translation, Text Adaptation

SEPLN-2024: 40th Conference of the Spanish Society for Natural Language Processing. Valladolid, Spain. 24-27 September 2024.
Email: isam.diab@upm.es (I. Diab); mcsuarez@fi.upm.es (M. C. Suárez-Figueroa)
ORCID: 0000-0002-3967-0672 (I. Diab); 0000-0003-3807-5019 (M. C. Suárez-Figueroa)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

1. Introduction

Equal opportunities and universal access to information are fundamental rights that every person should benefit from¹. However, certain groups of society, particularly those with cognitive or intellectual disabilities, present some difficulties related to reading comprehension processes. Therefore, prioritising the so-called cognitive accessibility becomes essential for promoting active participation in diverse social domains, such as politics, education, employment, and culture. For this reason, a methodology called Easy-to-Read (E2R) [1, 2, 3, 4] was created. The main goal of this methodology is to present clear and easily understood content by providing a set of guidelines on the content and on the design and layout of written materials, for instance, to use short and simple sentences, to avoid the use of long words, or to divide ideas into paragraphs. This adaptation process is iterative and involves three key activities: analysis, adaptation and validation [4]. Nevertheless, the E2R methodology is currently implemented manually, which is costly and time-consuming, so it would benefit from technological support. In this context, our research line is focused on applying different Artificial Intelligence (AI) methods and techniques² to automatically perform the analysis and the adaptation of Spanish documents to obtain easy-to-read versions. In particular, this paper concentrates on two of the E2R guidelines that influence the composition of the text [3, 4]: (a) to avoid explanations between commas; and (b) to avoid the use of appositions that interrupt the natural rhythm of reading.

Several studies [5, 6, 7, 8, 9, 10] have shown that this type of explanatory structure presents a difficulty in the reading comprehension process, since it breaks the argumentative line and leads to missing information in the process of understanding the text. In this way, the adaptation of explanatory structures in the form of incises (i.e. nominal appositions and adjective clauses) to easy-to-read versions has a positive impact for people with reading comprehension difficulties.

Within the scope of technological support for addressing E2R guidelines and recommendations in Spanish texts, it is worth mentioning (a) Easy-to-Read Advisor [11], FACILE [12], Comp4Text [13], E2R-Helper [14] and ATECA³ for an E2R analysis of documents; and (b) Simplext [15], LexSIS [16], DysWebxia [17], EASIER [18], FACILE [12], ATECA⁴ and Simple.Text⁵ for creating simpler versions of original documents.

However, none of the aforementioned works specifically targets the identification and adaptation⁶ of explanatory structures in the form of incises into simpler or easy-to-read forms.

Motivated by the aim of bridging this gap and enhancing the reading comprehension process, our research work focuses on automatically identifying and adapting explanatory structures in the form of nominal appositions and adjective clauses, since they impact linguistic aspects such as sentence length and sentence complexity. Therefore, we propose two methods based on symbolic AI⁷ to adapt explanatory structures that are not compliant with the E2R Methodology. We also implemented two proofs-of-concept based on these methods. It is worth mentioning that we have opted to use the term ‘adaptation’ consistently throughout the paper, as it aligns with the established terminology in E2R disciplines. This adaptation, achieved through our methods, can be viewed as a type of intralinguistic automatic translation tailored specifically for rendering sentences into an E2R version.

The rest of the paper is organised as follows: Section 2 is devoted to (a) how explanatory structures affect reading comprehension and cognitive accessibility, and (b) the automatic approaches for identifying and adapting these structures into simpler ones. In Section 3 we present our first attempts at methods for adapting both structures to the E2R Methodology, as well as the proofs-of-concept for the methods. Finally, we present some conclusions and future work.

¹ Convention on the Rights of Persons with Disabilities (United Nations, 2006). Available at: https://short.upm.es/k34jw
² We are investigating both symbolic (e.g. logical rules) and subsymbolic (e.g. neural networks and machine learning) approaches.
³ https://ateca.linkeddata.es/
⁴ https://ateca.linkeddata.es/
⁵ https://simpletext.demos.gplsi.es/
⁶ It is worth mentioning that text adaptation always aims to transform texts to meet the needs of a specific audience, while text simplification tends to reduce the complexity of texts and does not always take the final user into account.
⁷ Human knowledge is explicitly represented in a declarative form (e.g. facts and rules). This way of proceeding is part of symbolic AI.

2. State of the Art

In this work we delve into developing initial methods to automatically adapt explanatory structures in Spanish into easy-to-read and more accessible versions, based on the guidelines provided by the E2R Methodology [3, 4]. Therefore, in this section we (a) highlight some notes about this type of structure and its implications for reading comprehension (Section 2.1), and (b) summarise the automatic approaches carried out on the adaptation of such explanatory structures (Section 2.2).

2.1. Explanatory Structures and Cognitive Accessibility

Explanatory structures are incises that appear between commas within a sentence, interrupting the course of the utterance to add some precision or comment on the nominal element that precedes them [19, 20]. Explanatory structures can occur in two different forms, according to their syntactic nature. On the one hand, they can take the form of nominal appositions, that is, nouns or noun phrases, as in Julia, our cousin, lives in Canada. This type of apposition is formally represented as “A, B”. The segment B (also called apodosis in linguistic terms) represents in this variety a parenthetical noun phrase which adds some precision or some remark to clarify the reference of A (also called protasis), which is another noun phrase. In this sense, we observe that segment B assumes that the explanation is copulative; that is, the relationship between segments A and B is formed by the verb to be, following the pattern “A, B = A is B” (e.g. Julia, our cousin, lives in Canada > Julia is our cousin. Julia lives in Canada). On the other hand, explanatory structures can be non-restrictive⁸ adjective clauses. Such clauses can be (a) relative clauses, which are introduced by relative determiners or pronouns (viz. that, which, who, whom, whose), such as The house, which is on the seafront, is very bright; or (b) participial clauses, introduced by verbs in participle form, as in The man, tired from work, fell asleep⁹.

Explanatory structures, as a syntactic element that breaks the discourse line [3, 4], have been studied in relation to reading comprehension [6, 8, 10]. For Dillon and colleagues [10], comprehending an explanatory structure in the form of an incise involves the additional step of identifying which of the previous phrases of like type it is coreferential to. They also mention several studies [5, 7] suggesting that appositive relative subordinate clauses are often forgotten during the reading process, reflecting the well-known phenomenon that sentence details quickly disappear from memory. Furthermore, following this line, different analyses [9, 10] claim that there is increasing evidence that the syntactic form of appositive material that has come and gone, such as appositive relative clauses in medial position, becomes rapidly unavailable in short-term memory. Moreover, in a study [8] conducted to investigate the effects of aspects of contextual meaning on reading comprehension, the author realised that in the examples provided to the study participants, appositive structures in the form of explanatory incises were the most difficult to understand.

Thus, based on the aforementioned studies, there is clear evidence of the complexity of the explanatory structures that we are dealing with in this research work.

2.2. Automatic Approaches Addressing Explanatory Structures

In the context of Natural Language Processing (NLP), Text Simplification (TS) has gained considerable attention over the last decades. Specifically, syntactic simplification, which involves reducing the complexity of embedded sentences, has emerged as a key focus for the research community. Numerous works have been devoted to the simplification of complex sentences into simpler ones applied to different languages. One of the types of complex sentences that have been automatically addressed are relative clause sentences in the form of incises. For the first of these, we have to go back to the 1990s, when Chandrasekar and Srinivas [21] proposed the implementation of an algorithm through which generalised simplification rules are automatically derived from annotated training data in English. The process used a partial parsing technique that integrates constituent structure and dependency information, in order to simplify subordinated sentences that included relative clauses. Along the same line, relative clauses in the form of incises are also addressed in the work done by Siddharthan [22], which introduces a text simplification framework that uses transformation rules applied to a typed dependency representation generated by the Stanford parser¹⁰. In addition, for English as well, Dornescu and colleagues [23] explored the extraction of relative clauses employing a tagging approach. They manually annotated a dataset encompassing three text genres, enabling the development and comparison of rule-based and machine learning methods for automatically identifying appositions and non-restrictive relative clauses. They built a supervised tagging model for automatic detection of appositions using the tagged dataset.

For languages other than English, in Brazilian Portuguese, Candido and colleagues [24] developed a rule-based system using a parser which provides lexical and syntactic information for the simplification of 22 complex linguistic phenomena, including relative clauses. For the Indonesian language, Haryadi and colleagues [25] replicated the same method of simplification and dataset as in Siddharthan’s work, where some relative clauses were handled. Regarding the Basque language, Aranzabe and colleagues [26] presented an architecture for a text simplification system based on hand-written rules specific for syntactic simplification. In the case of Spanish, both in the framework of the Simplext project [15] and in the work by Bott and colleagues [27], some relative clauses are covered in the syntactic simplification approach, making use of a hand-written computational grammar and dependency trees, and focussing on reducing sentence complexity.

However, to the best of our knowledge, explanatory structures in the form of incises, whether appositions or non-restrictive relative clauses, have not been specifically dealt with in any research work in Spanish. Thus, in our work, we delve into analysing these structures in detail with the aim of adapting them to easy-to-read versions following the E2R methodology.

⁸ A non-restrictive clause adds additional information to a previous noun, called antecedent. It uses commas to show that the information is additional.
⁹ Examples provided by the Spanish Royal Academy of Language (RAE): https://short.upm.es/w7gxt
¹⁰ https://short.upm.es/nbmpd

3. Initial Methods for an E2R Adaptation of Explanatory Structures

The aim of the proposed methods is (a) to detect explanatory structures in the form of appositions and adjective clauses written in Spanish, and (b) to adapt such structures into easy-to-read versions as the E2R methodology suggests. These initial methods are composed of the following activities: (1) Natural Language Processing (NLP), which includes tokenization, tagging tasks, morphology and dependency detection, (2) Explanatory Structures Identification, and (3) Explanatory Structures Adaptation. Section 3.1 explains the E2R adaptation methods for nominal appositions and Section 3.2 the methods for non-restrictive adjective clauses in Spanish, as well as the proofs-of-concept implemented based on such methods.

3.1. Nominal Appositions

Nominal appositions identification relies on the Part-of-Speech (PoS) tagging information, following an IF-THEN rule-based approach. In particular, we looked for the PoS tag that identifies an apposition, that is, ‘appos’¹¹. However, we realised in the initial tests that in two cases appositions were not identified by the PoS tag. Therefore, we analysed these cases to extract new apposition identification patterns to complement the identification by the PoS tag.

For the first case, illustrated in Listing 1, it was decided to create a rule that whenever a noun phrase between commas functions (i.e. has the same tag) as its antecedent (a noun), it is an apposition. This noun phrase, being equally tagged, is acting as a “symmetrical” structure and both the noun phrase and the antecedent have the same syntactic information.

    ::= ? ?
    ::= "," ","
    IF IdentificationPattern1 AND Antecedent.POSTag = Noun.POSTag
    THEN NominalApposition

Listing 1: Identification Pattern 1 for detecting appositional structures.

For the second case, since the apposition can be seen as a complement of the noun, it was decided that in those cases where a noun phrase is enclosed in commas and the noun is treated as a complement of the noun (PoS tag ‘amod’), it should be identified as an apposition. Since appositions are noun phrases, it is required that the apposition is marked with the tag ‘NOUN’. Listing 2 shows this identification pattern.

    ::= "," ","
    IF IdentificationPattern2 AND NounRelation = "amod" AND NounPhrase.POSTag = NOUN
    THEN NominalApposition

Listing 2: Identification Pattern 2 for detecting appositional structures.

Regarding the adaptation of nominal appositions, in general, the transformation of the apposition into a more accessible and easier structure consists of splitting the appositive structure into two simple sentences according to the pattern “A, B = A is B”, mentioned in Section 2.1: (a) on the one hand, the main idea, and (b) on the other hand, the explanation: En el congreso conocí al famoso investigador, quizá la persona que más influyó en mi trabajo. > En el congreso conocí al famoso investigador (main idea). El famoso investigador es quizá la persona que más influyó en mi trabajo (explanation)¹².

Considering this, we have dealt with the adaptation of four cases of apposition occurrences according to their syntactic nature:

• Case A. The apposition is not marked by a determiner. As explained in Section 2.1, the apposition is a noun phrase (or nominal syntagm). Typically a nominal syntagm is formed by a determiner (definite article el/la/los/las¹³ (‘the’) or indefinite article un/una/unos/unas (‘a’)) preceding the noun. However, occasionally the apposition lacks a determiner (as it is omitted because it is taken for granted in the discourse). For example: El búho, ave rapaz, ve bien de noche. > #El búho es ave rapaz¹⁴. El búho ve bien de noche¹⁵. The fact that the determiner is assumed by its absence means precisely that the noun it should accompany is not definite, i.e. no previous reference has been made to that noun. When we use definite determiners (el/la/los/las) with a noun, we allude to the fact that there has already been a previous reference in the text to that noun. Therefore, we can assume that the absence of a determiner is synonymous with the use of an indefinite determiner (un/una/unos/unas). Thus, in this type of apposition, the adaptation is as follows: El búho, ave rapaz, ve bien de noche > El búho es un ave rapaz. El búho ve bien de noche¹⁶.

Then, in the adaptation process, illustrated in Listing 3, the steps to create the two new simple structures are the following:

1 To remove the commas.
2 To insert the verb ser (‘to be’) in the form that agrees with the subject and with the original main verb. In the example above the inserted form is es (‘is’) because the subject is búho (‘owl’) (3rd person and singular number) and the verb ve (‘sees’) is in the present indicative tense.
3 To insert the indefinite determiner. For this step, as before, the subject information is consulted (búho is masculine and singular), and the indefinite determiner that meets the same characteristics is added (in this case, un).
4 To close the first segment or protasis with a full stop.
5 To create the second segment or apodosis, retrieving the antecedent noun (El búho (‘the owl’)) and placing it before the rest of the predicate (ve bien de noche (‘sees well at night’)).

    ::= el|la|los|las
    ::= un|una|unos|unas
    ::= Subject ConjugatedVbSer IndefiniteDeterminer NounPhrase"." Subject Predicate"."
    IF Determiner NOT IN NounPhrase AND NounPhrase IS NominalApposition AND Sentence = Subject, NounPhrase, Predicate
    THEN AdaptationPattern1

Listing 3: Adaptation Pattern for Case A appositions.

In the case of proper nouns, which are naturally not accompanied by a determiner, this rule does not apply. It is possible either to eliminate the commas, as in Mi primo, Juan, vive en Canarias. > Mi primo Juan vive en Canarias¹⁷; or to carry out the same process of creating two sentences, as in Mi primo, Juan, vive en Canarias. > Mi primo es Juan. Mi primo vive en Canarias¹⁸. We have opted to eliminate the commas in these specific cases, since the apposition between commas is a simple element (a proper noun), and not a construction with more elements that can cut the rhythm of the reading. In this way, the deletion of commas is the most direct form of adaptation as it does not interfere with the content.

• Case B. The apposition is marked by a determiner. Contrary to Case A, when the apposition contains a determiner, the steps mentioned in Case A are followed, except for the inclusion of a determiner, which is now explicit. Thus, for an apposition with a definite determiner, the adaptation is as follows: Ana, la amiga de Sara, vino a la fiesta > Ana es la amiga de Sara. Ana vino a la fiesta¹⁹. And, in the case of an apposition with an indefinite determiner, the adaptation is as follows: Jorge VI, uno de los reyes de Gran Bretaña, tuvo muchas hijas. > Jorge VI fue uno de los reyes de Gran Bretaña. Jorge VI tuvo muchas hijas²⁰. The pattern for this case is illustrated in Listing 4.

    ::= Subject ConjugatedVbSer Determiner NounPhrase"." Subject Predicate"."
    IF Determiner IN NounPhrase AND NounPhrase IS NominalApposition AND Sentence = Subject, NounPhrase, Predicate
    THEN AdaptationPattern2

Listing 4: Adaptation Pattern for Case B appositions.

• Case C. The apposition is headed by a deictic. In spoken discourse, or in certain literary contexts, so-called opaque deictics (pronouns, in this case) are sometimes used, which make a non-literal spatio-temporal allusion. For example, in Alberti, ese poeta políticamente comprometido, llegó el lunes²¹ we see how ese (‘that’) is a deictic pronoun which does not really point to anything: it is opaque, and it makes an allusion to the listener’s supposed knowledge of the information in the apposition about Alberti. That is, the sender uses it to include the receiver as knowing that Alberti was a politically committed poet, but frames Alberti in a space. For its adaptation pattern (see Listing 5), the same approach is proposed as in Case B, except that the deictic pronoun (ese/esa/esos/esas (‘that’)) is replaced by an indefinite determiner (un/una/unos/unas), matching its gender and number. Thus, the example above is adapted as follows: Alberti es un poeta políticamente comprometido. Alberti llegó el lunes²².

    ::= ese|esa|esos|esas|este|esta|estos|estas|aquel|aquella|aquellos|aquellas
    ::= Subject ConjugatedVbSer IndefiniteDeterminer NounPhrase"." Subject Predicate"."
    IF DeicticPronoun IN NounPhrase AND NounPhrase IS NominalApposition AND Sentence = Subject, NounPhrase, Predicate
    THEN AdaptationPattern3

Listing 5: Adaptation Pattern for Case C appositions.

• Case D. The apposition can be a clause attached to a proper noun. It is important to mention that in Spanish there are other explanatory structures in the form of incises which are not appositions but can be treated as such. These are parenthetical structures, short and simple syntagms, which give information by the speaker within the discourse (unlike in Case B, these incises lack complements; they are simpler than the noun phrases of appositions). Their function may be explanatory, but may also be due to factors such as emphasis or reiteration of something previously said [19]. In these cases, it is proposed that, since they do not express an explanation per se, the sentence is transformed by reorganising the clause, as shown in Listing 6. That is, in Juan, el pobre, lo perdió todo, the clause el pobre is placed before the proper noun in the adaptation: El pobre Juan lo perdió todo²³. The same applies to Arturo, mi amigo, no quiere venir > Mi amigo Arturo no quiere venir²⁴.

    ::= Determiner Noun ProperNoun Verb RestOfSentence"."
    IF ProperNoun.POSTag = PROPN AND (Determiner.POSTag = DET AND Noun.POSTag = NOUN) AND Sentence = Subject, ProperNoun, Predicate
    THEN AdaptationPattern4

Listing 6: Adaptation Pattern for Case D appositions.

Based on the proposed method for identifying and adapting appositions in Spanish, we have developed a proof-of-concept (PoC)²⁵ implemented in Python 3.9.

In detail, on the one hand, regarding the identification activity, we made use of the PoS tags provided by the NLP library spaCy²⁶ to retrieve the specific tags related to the apposition structure and the different patterns posed in the aforementioned cases. As for the evaluation of this activity, we have manually built a set of unit tests. For this purpose, we collected a sample of 52 sentences²⁷ (with an average word count of 11.8) extracted from CREA Corpus²⁸, of which 34 include an apposition. We manually classified the collection of sentences in binary form (true-false classification), and then analysed the classification performance of our system by using a confusion matrix to measure the number of hits and misses the system made when applying the patterns to identify appositions. We observed that the results are apparently favourable, since the reported precision was 0.94 and the recall was 0.94. Analysing these results, we found that the false positives and false negatives were due to PoS tagging errors on the part of spaCy. For example, in the sentence Julia, mi perra, necesita que la paseen varias veces al día²⁹, spaCy labels the noun perra (‘dog’), which is the apposition, as a continuation of the noun Julia, thus omitting the explanatory apposition.

On the other hand, with respect to the adaptation activity, PoS tags provided by spaCy are also used. In addition, in the cases where the adaptation requires the inclusion of the verb to be or determiners agreeing with the subject and the original verb, a dictionary has been manually created including the different verb and determiner forms. For the evaluation of the adaptation activity, we opted for a language-model-based approach, since we aimed to measure whether the semantic content of the adaptation maintains that of the original. Specifically, we have used a sentence similarity model³⁰ for Spanish (available at the Hugging Face repository³¹) to compare the original sentence with apposition and the adapted sentence provided by our PoC according to the E2R methodology. The choice of this language model is based on a previous work³² we carried out in which we compared the performance of different sentence similarity models in Spanish. By using this model, vector representations of the sentences can be obtained and the semantic similarity between them can be calculated. The model receives as input the original sentence and the transformed sentence, and returns a number indicating the degree of similarity between them, with 0 meaning not similar at all and 1 meaning completely similar. We obtained an average similarity of 0.94 over all 34 sentences with apposition that were adapted.

In more detail, one of the main errors encountered in the adaptation activity has to do with the conjugation of the verb ser (‘to be’), since an explanation of a subject can be presented in a different tense than the time at which the main action of the sentence occurs. For example, the sentence Nuestros vecinos, los Pérez, se fueron de vacaciones³³ is adapted as #Nuestros vecinos fueron los Pérez. Nuestros vecinos se fueron de vacaciones³⁴, because the main verb of the sentence is fueron (‘they went’) (3rd person plural of the preterite perfect simple indicative) and therefore the verb ser is put in the same verb tense (fueron (‘they were’)). The conjugation of the verb ser is probably not the right one in this situation, since according to the reader’s logic it is assumed that if the Pérez went on holiday, they are still the speaker’s neighbours, so the verb ser should be in the 3rd person plural of the present tense (son (‘they are’)). In addition, as a qualitative analysis, we manually analysed the adapted sentences for grammatical sense. The result is positive, as all sentences generated by our method are grammatically correct.

¹¹ https://short.upm.es/zuoji
¹² Translation (Tr.): At the congress I met the famous researcher, perhaps the person who most influenced my work. > At the congress I met the famous researcher. The famous researcher is perhaps the person who most influenced my work.
¹³ Note that in Spanish we use the slash symbol (/) to indicate gender and number variations of the same word.
¹⁴ The hash (#) is used in linguistics to express that a structure is unusual, although it makes sense grammatically.
¹⁵ Tr.: The owl, bird of prey, sees well at night. > #The owl is bird of prey. The owl sees well at night.
¹⁶ Tr.: The owl, bird of prey, sees well at night. > The owl is a bird of prey. The owl sees well at night.
¹⁷ Tr.: My cousin, Juan, lives in the Canary Islands. > My cousin Juan lives in the Canary Islands.
¹⁸ Tr.: My cousin, Juan, lives in the Canary Islands. > My cousin is Juan. My cousin lives in the Canary Islands.
¹⁹ Tr.: Ana, the friend of Sara, came to the party. > Ana is the friend of Sara. Ana came to the party.
²⁰ Tr.: George VI, one of the kings of Great Britain, had many daughters. > George VI was one of the kings of Great Britain. George VI had many daughters.
²¹ Tr.: Alberti, that politically committed poet, arrived on Monday.
²² Tr.: Alberti is a politically committed poet. Alberti arrived on Monday.
²³ Tr.: Poor Juan lost everything.
²⁴ Tr.: Arturo, my friend, does not want to come. > My friend Arturo does not want to come.
²⁵ The proof-of-concept is not yet available online but we are working to make it available as soon as possible.
²⁶ https://spacy.io/ We used the trained model for Spanish es_core_news_lg for developing all the activities.
²⁷ Both the set of sentences and the analysis results are available at: https://doi.org/10.5281/zenodo.11397343
²⁸ https://short.upm.es/ydq6p
²⁹ Tr.: Julia, my dog, needs to be walked several times a day.
³⁰ https://short.upm.es/w2slm
³¹ https://huggingface.co/
³² https://oa.upm.es/75516/
³³ Tr.: Our neighbours, the Pérez, went on holiday.
³⁴ Tr.: #Our neighbours were the Pérez. Our neighbours went on holiday.

3.2. Non-restrictive Adjective Clauses

Similarly to the identification of nominal appositions, in the case of the non-restrictive adjective clauses identification, we used Part-of-Speech (PoS) tagging information, by means of a rule-based approach. Nevertheless, following the method in Section 3.1, we performed initial tests to analyse those cases in which the adjective clauses were not identified by the specific PoS tag, in order to create patterns to cover these cases. We crafted specific rules for covering the following cases: (a) the relative pronoun consists of two words forming a single semantic unit (e.g. el que, la que, los que, las que), where the pattern (see Listing 7) identifies the sentence as a relative clause not only when encountering a comma followed by a relative pronoun, but also when encountering a comma followed by a definite article and then a relative pronoun; and (b) participle sentences with verbal periphrasis in the main sentence, in which the auxiliary verb is treated as the main verb, posing a problem when performing the transformation, since we depend on the main verb to extract the morphological information. The solution by means of this rule is straightforward: whenever the “auxiliary verb + main verb” pattern is detected, the morphological information of the auxiliary verb is consulted in the adaptation.

    ::= que
    ::= el|la|los|las
    ::= "," (| )"," Predicate
    IF IdentificationPattern1
    THEN AdjectiveRelativeClause

Listing 7: Identification Pattern for Case A adjective clauses.

As for the adaptation activity, the easy-to-read adaptation of the adjective clauses into a more accessible and easier structure should consist of splitting the adjective clause into two simple sentences: (a) on the one hand, the explanation and (b) on the other hand, the main idea. For example, the sentence La enfermera, que tiene 63 años, está a punto de jubilarse³⁵ is adapted as follows: the explanation La enfermera tiene 63 años³⁶ and the main idea La enfermera está a punto de jubilarse³⁷.

In more detail, the process of adapting the non-restrictive adjective clauses into more easily understood structures depends on the two types of adjective clause we mentioned in Section 2.1:

Relative Clauses. This type of adjective clause can be introduced by different elements, and thus be classified as:

(a) Introduced by a relative pronoun. In this case, the procedure is straightforward. The relative pronoun and the commas are removed, and the two new simple structures are reorganised, keeping the verbs and the rest of the complements the same: Juan, que trabaja mucho, decidió tomarse un descanso³⁸ is adapted as Juan trabaja mucho. Juan decidió tomarse un descanso³⁹.

(b) Introduced by a possessive relative determiner. This type of relative (cuyo, cuya, cuyos, cuyas (‘whose’)) presents a relation with the antecedent as a complement to the noun, and this function in Spanish is expressed by means of the prepositional syntagm “de (‘of’) + noun”. For example, Esta chica, cuyo padre vive en Malasia, se mudó a las islas⁴⁰ is adapted as: Esta chica se mudó a las islas. El padre de esta chica vive en Malasia⁴¹. In this case, the main idea is placed first and the explanation afterwards, in order to avoid problems of coreference. Thus, the adaptation process for these cases is based on the following steps:

1 To remove the commas.
2 To remove the possessive.
3 To reorganise the main sentence to the first segment or protasis and close it with a full stop.
apodosis it has to be marked with a definite article determiner (el, la, los, las) according to the gender and number of the noun.

Participial Clauses. Whereas relative clauses presented a nexus (the relative pronoun) linking the antecedent with the subordinate clause, participial clauses follow the same pattern as nominal appositions, i.e. “A, B = A is B”. For instance, in the sentence El hombre, cansado de trabajar, se durmió⁴², we assume that El hombre (‘the man’) (segment A) “was” tired from working (segment B). Considering that, the adaptation of this type of adjective clause is as follows:

1 To remove the commas.
2 To insert the verb estar (‘to be’) in the form that agrees with the subject and the original main verb. If the main verb appears in the present tense, the verb estar also appears in the present tense, and if it occurs in any past tense, we opted to use the imperfect past tense.
3 To close the first segment or protasis with a full stop.
4 To create the second segment or apodosis by retrieving the antecedent preceding the main sentence.

We have developed a proof-of-concept (PoC)⁴³ based on the proposed method, implemented in Python 3.9.

With respect to the identification activity, as in the method for nominal appositions, we used the PoS tags provided by spaCy⁴⁴ to get the specific tag related to the relative pronoun that introduces the clause. To assess our method of identifying non-restrictive adjective clauses, we assembled a set of 96 sentences⁴⁵ from various sources (educational textbooks and literary works), with an average word count of 9.7, of which 62 were adjective sentences of the aforementioned types. We started by manually assigning binary tags to the sentence collection. Next, we evaluated the classification performance of our system by analysing a confusion matrix, which enabled us to quantify both correct classifications and errors made by the system. As a result, we obtained 0.95 precision and 0.86 recall. In this case, one of the most frequent errors is the incorrect identification of exhortative (expressing command) or desiderative (expressing desire) subordinate clauses as adjective clauses: spaCy detects the conjunction that acts as a nexus between the main and the subordinate sentence as a relative pronoun, since both nexus and relative pronoun have the same form (que), e.g. Juan, que te portes bien, por favor⁴⁶.

The same methodology has been used to develop the adaptation activity as in the case of nominal appositions. This task relies on the PoS tags provided by spaCy. Furthermore, for cases where adaptation involves the addition of the verb to be or determiners that match the subject and original verb, we developed a dictionary manually.
To evaluate the adaptation process, we employed a language-model-based 4 To create the second segment or apodosis by adding approach. We have yet again used the Spanish sentence as subject the subject of the subordinate clause and similarity model47 to compare the vectorial similarity be- the prepositional phrase “de + the subject of the tween the original adjective clause to the adapted sentence main clause”. For the addition of the subject of the 42 Tr.: The man, tired from working, fell asleep. 35 43 Tr.: The nurse, who is 63 years old, is about to retire. The proof-of-concept is not yet available online but we are working 36 to make it available as soon as possible. Tr.: The nurse is 63 years old. 37 44 Tr.: The nurse is about to retire. https://spacy.io/ 38 45 Tr.: Juan, who works a lot, took a break. Both set of sentences and analysis results are available at: https://doi. 39 Tr.: Juan works a lot. Juan took a break. org/10.5281/zenodo.11397343 40 46 Tr.: This girl, whose father lives in Malaysia, moved to the islands. Tr.: Juan, please behave yourself. 41 47 Tr.: This girl moved to the islands. The father of this girl lives in Malaysia. https://short.upm.es/w2slm provided by our system. The results showed an average [4] AENOR, Lectura Fácil. Pautas y recomendaciones para similarity of 0.94 across the 62 adapted sentences in relation la elaboración de documentos (UNE 153101:2018 EX), with their original versions. As in the previous case, we Inclusion Europe, 2018. have manually analysed the adapted sentences and they are [5] J. S. Sachs, Recognition memory for syntactic and all grammatically correct. semantic aspects of connected discourse, Perception & Psychophysics 2 (1967) 437–442. URL: https://doi. org/10.3758/BF03208784. doi:10.3758/BF03208784 . 4. Conclusions and Future Work [6] G. L. 
Dillon, Language Processing and the Reading of Literature: Toward a Model of Comprehension, Indi- This paper aims to enhance the cognitive accessibility of ana University Press, Bloomington, 1978. Spanish texts by proposing two methods for adapting sen- [7] M. C. Potter, L. Lombardi, Regeneration in tences containing explanatory structures such as nominal the short-term recall of sentences, Journal appositions and adjective clauses, following an E2R ap- of Memory and Language 29 (1990) 633–654. proach. While the methods we propose for the identifi- URL: https://www.sciencedirect.com/science/ cation and adaptation of these structures might seem simple article/pii/0749596X9090042X. doi:https: and straightforward, we believe they represent a valuable //doi.org/10.1016/0749- 596X(90)90042- X . contribution to the field of cognitive accessibility. These [8] A. A. Mirza, The effects of contextual meaning aspects methods have been implemented as proof of concepts to help on reading comprehension, Journal of English as a in the (semi)-automatic adaptation task of improving text Foreign Language 1 (2011). URL: https://doi.org/10. accessibility for individuals with reading comprehension 23971/jefl.v1i2.192. doi:10.23971/jefl.v1i2.192 . difficulties, including those with cognitive disabilities. We [9] M. Kroll, M. Wagers, Is working memory sensitive have evaluated the methods by unit tests and by using a lan- to discourse status? Experimental evidence from re- guage model to calculate the similarity between the original sponsive appositives, Poster presented at XPrag 2017, sentences and the ones adapted to E2R, obtaining gener- Cologne, Germany, 2017. URL: https://osf.io/view/ ally satisfactory results. However, since the text adaptation xprag2017/. aims to meet the needs of particular groups, a user-based [10] B. Dillon, L. Frazier, C. 
Clifton, No longer an or- evaluation is essential to complete the assessment of our phan: evidence for appositive attachment from sen- methods. tence comprehension, Glossa 3 (2018). URL: https: As further research, several actions are planned to im- //doi.org/10.5334/gjgl.379. doi:10.5334/gjgl.379 . prove the initial attempt presented in this work: (a) we [11] M. C. Suárez-Figueroa, E. Ruckhaus, J. López-Guerrero, are going to analyse the possibility of proposing methods I. Cano, A. Cervera, Towards the Assessment of Easy- based on subsymbolic AI techniques, such as the use of lan- to-Read Guidelines Using Artificial Intelligence Tech- guage models or large language models (LLMs); the ultimate niques, in: K. Miesenberger, R. Manduchi, M. C. Ro- goal is to compare the results obtained using the methods driguez, P. Peňáz (Eds.), Computers Helping People proposed in this paper with the ones obtained with the with Special Needs. ICCHP 2020, volume 12376 of subsymbolic-based methods; (b) we are going to implement Lecture Notes in Computer Science, Springer, 2020, pp. a web application in the context of the assistive technolo- 74–82. doi:10.1007/978- 3- 030- 58796- 3_10 . gies for adapting the explanatory structures which are not [12] M. Suárez-Figueroa, I. Diab, E. Ruckhaus, I. Cano, compliant to the E2R methodology; and (c) we have planned First steps in the development of a support appli- to involve people with cognitive disabilities to evaluate the cation for easy-to-read adaptation, Universal Ac- aforementioned web application. cess in the Information Society (2022). doi:10.1007/ s10209- 022- 00946- z . Acknowledgments [13] A. Iglesias, I. Cobián, A. Campillo, J. Morato, S. 
Sánchez-Cuadrado, Comp4text checker: An This work has been supported by the grant “Ayudas para automatic and visual evaluation tool to check the la contratación de personal investigador predoctoral en for- readability of spanish web pages, Springer-Verlag, mación para el año 2022” funded by Comunidad Autónoma Berlin, Heidelberg, 2020, pp. 258–265. URL: https: de Madrid (Spain). We would like to thank Miguel Cerezo //doi.org/10.1007/978-3-030-58796-3_31. doi:10.1007/ Durán and Clara Osorio Sanz for their help in the initial 978- 3- 030- 58796- 3_31 . methods development. [14] E. Díez, N. Rodríguez, A. Fernández, M. A. Alonso, M. A. Díez-Álamo, M. J. Sánchez, D. Wojcik, E2R- Helper Asistente Lectura Fácil, Instituto Universitario References de Integración en la Comunidad - Universidad de Salamanca, 2019. URL: https://eapoyo-inico.usal.es/ [1] Inclusion Europe, Information for All. European stan- asistente-lectura-facil/. dards for making information easy to read and under- [15] H. Saggion, S. Stajner, S. Bott, S. Mille, L. Rello, B. Drn- stand, Inclusion Europe, 2009. darevic, Making It Simplext: Implementation and [2] M. Nomura, G. S. Nielsen, International Federation Evaluation of a Text Simplification System for Span- of Library Associations and Institutions, Library Ser- ish, ACM Transactions on Accessible Computing 6 vices to People with Special Needs Section, Guidelines (2015). doi:10.1145/2738046 . for easy-to-read materials, IFLA Headquarters, The [16] S. Bott, L. Rello, B. Drndarevic, H. Saggion, Can Span- Hague, 2010. ish Be Simpler? LexSiS: Lexical Simplification for Span- [3] O. García Muñoz, Lectura fácil: Métodos de redacción ish, in: Proceedings of COLING 2012, The COLING y evaluación, Real Patronato sobre Discapacidad, 2012. 2012 Organizing Committee, 2012. [17] L. Rello, R. Baeza-Yates, H. Saggion, DysWebxia: tex- tos más accesibles para personas con dislexia, Proce- samiento del Lenguaje Natural 51 (2013) 205–208. [18] L. Moreno, R. Alarcon, R. 
Martínez, EASIER System. Language Resources for Cognitive Accessibility, in: The 22nd International ACM SIGACCESS Conference on Computers and Accessibility, ASSETS ’20, Associ- ation for Computing Machinery, 2020. doi:10.1145/ 3373625.3418006 . [19] RAE, ASALE, Nueva gramática de la lengua española, Espasa Calpe, Madrid, 2009. [20] RAE, ASALE, Ortografía de la lengua española, Espasa Calpe, Madrid, 2010. [21] R. Chandrasekar, B. Srinivas, Automatic induction of rules for text simplification, Knowledge-Based Systems 10 (1997) 183–190. URL: https://www.sciencedirect.com/science/ article/pii/S0950705197000294. doi:https: //doi.org/10.1016/S0950- 7051(97)00029- 4 . [22] A. Siddharthan, Text simplification using typed depen- dencies: A comparision of the robustness of different generation strategies, in: C. Gardent, K. Striegnitz (Eds.), Proceedings of the 13th European Workshop on Natural Language Generation, Association for Com- putational Linguistics, Nancy, France, 2011, pp. 2–11. URL: https://aclanthology.org/W11-2802. [23] I. Dornescu, R. Evans, C. Orăsan, Relative clause ex- traction for syntactic simplification, in: C. Orasan, P. Osenova, C. Vertan (Eds.), Proceedings of the Work- shop on Automatic Text Simplification - Methods and Applications in the Multilingual Society (ATS- MA 2014), Association for Computational Linguis- tics and Dublin City University, Dublin, Ireland, 2014, pp. 1–10. URL: https://aclanthology.org/W14-5601. doi:10.3115/v1/W14- 5601 . [24] A. Candido, E. Maziero, L. Specia, C. Gasperin, T. Pardo, S. Aluisio, Supporting the adaptation of texts for poor literacy readers: a text simplification editor for Brazilian Portuguese, in: J. Tetreault, J. Burstein, C. Leacock (Eds.), Proceedings of the Fourth Work- shop on Innovative Use of NLP for Building Educa- tional Applications, Association for Computational Linguistics, Boulder, Colorado, 2009, pp. 34–42. URL: https://aclanthology.org/W09-2105. [25] M. L. Khodra, A. B. 
Sasmita, et al., Rule-based text simplification for extractive automatic summarization of news articles in indonesian, in: 2023 10th Interna- tional Conference on Advanced Informatics: Concept, Theory and Application (ICAICTA), IEEE, 2023, pp. 1–6. [26] M. Aranzabe, A. Ilarraza, I. Gonzalez-Dios, First ap- proach to automatic text simplification in basque, 2012, pp. 1–8. [27] S. Bott, H. Saggion, S. Mille, Text simplification tools for Spanish, in: N. Calzolari, K. Choukri, T. Declerck, M. U. Doğan, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, S. Piperidis (Eds.), Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC’12), European Lan- guage Resources Association (ELRA), Istanbul, Turkey, 2012, pp. 1665–1671.
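Illustrative sketch. The adaptation steps described for relative clauses (introduced by que, by el/la/los/las que, or by cuyo/cuya/cuyos/cuyas) can be illustrated with a minimal Python sketch. It is not the authors' proof-of-concept: the paper's system relies on spaCy PoS tags and a manually built dictionary, whereas this sketch assumes a purely surface-level regular expression, and the names INCISE, ARTICLE_FOR, DETERMINERS and adapt are hypothetical. It only handles sentences with a single incise between two commas.

```python
import re

# Assumption: surface pattern "<antecedent>, <nexus> <clause>, <predicate>",
# where the nexus is "que", "el/la/los/las que", or a form of "cuyo".
INCISE = re.compile(
    r"^(?P<antecedent>[^,]+),\s+"
    r"(?P<nexus>(?:(?:el|la|los|las)\s+)?que|cuy[oa]s?)\s+"
    r"(?P<clause>[^,]+),\s+"
    r"(?P<predicate>.+?)\.?$",
    re.IGNORECASE,
)

# "cuyo" agrees with the possessed noun, so its form selects the definite article.
ARTICLE_FOR = {"cuyo": "El", "cuya": "La", "cuyos": "Los", "cuyas": "Las"}

# Hypothetical, deliberately small list: sentence-initial words that are
# lowercased when the antecedent is reused after "de" (proper nouns are kept).
DETERMINERS = {"el", "la", "los", "las", "este", "esta", "estos", "estas",
               "mi", "mis", "nuestro", "nuestra", "nuestros", "nuestras"}


def _after_de(antecedent):
    """Return the antecedent as it should appear inside 'de + antecedent'."""
    first, sep, rest = antecedent.partition(" ")
    if first.lower() in DETERMINERS:
        first = first.lower()
    return first + sep + rest


def adapt(sentence):
    """Split a sentence with one relative incise into two simple sentences.

    Returns None when the surface pattern does not match.
    """
    match = INCISE.match(sentence.strip())
    if match is None:
        return None
    antecedent = match["antecedent"].strip()
    clause = match["clause"].strip()
    predicate = match["predicate"].strip()
    nexus = match["nexus"].lower()

    if nexus in ARTICLE_FOR:
        # Possessive relative: main idea first, then
        # "<article> <possessed noun> de <antecedent> <rest of clause>".
        possessed, _, rest = clause.partition(" ")
        explanation = (
            f"{ARTICLE_FOR[nexus]} {possessed} de {_after_de(antecedent)} {rest}"
        ).strip()
        return f"{antecedent} {predicate}. {explanation}."

    # Relative pronoun: explanation first, then the main idea.
    return f"{antecedent} {clause}. {antecedent} {predicate}."


if __name__ == "__main__":
    print(adapt("Juan, que trabaja mucho, decidió tomarse un descanso."))
    # Juan trabaja mucho. Juan decidió tomarse un descanso.
    print(adapt("Esta chica, cuyo padre vive en Malasia, se mudó a las islas."))
    # Esta chica se mudó a las islas. El padre de esta chica vive en Malasia.
```

Because the sketch looks only at the surface form, it also matches sentences such as Juan, que te portes bien, por favor, where que is a conjunction rather than a relative pronoun. Telling these cases apart needs morphological or syntactic information, which a regular expression alone does not provide.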