<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">AI vs. Human: Effectiveness of LLMs in Simplifying Italian Administrative Documents</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Marco</forename><surname>Russodivito</surname></persName>
							<email>marco.russodivito@unimol.it</email>
							<affiliation key="aff0">
								<orgName type="institution">University of Molise</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Vittorio</forename><surname>Ganfi</surname></persName>
							<email>vittorio.ganfi@unimol.it</email>
							<affiliation key="aff0">
								<orgName type="institution">University of Molise</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Giuliana</forename><surname>Fiorentino</surname></persName>
							<email>giuliana.fiorentino@unimol.it</email>
							<affiliation key="aff0">
								<orgName type="institution">University of Molise</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Rocco</forename><surname>Oliveto</surname></persName>
							<email>rocco.oliveto@unimol.it</email>
							<affiliation key="aff0">
								<orgName type="institution">University of Molise</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">AI vs. Human: Effectiveness of LLMs in Simplifying Italian Administrative Documents</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">9C4ECAB120A9C51E175CE57DE6C46267</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:34+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Automatic Text Simplification</term>
					<term>Large Language Models</term>
					<term>Italian Administrative Language</term>
					<term>0009-0004-8860-1739 (M. Russodivito)</term>
					<term>0000-0002-0892-7287 (V. Ganfi)</term>
					<term>0000-0002-0392-9056 (G. Fiorentino)</term>
					<term>0000-0002-7995-8582 (R. Oliveto)</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This study investigates the effectiveness of Large Language Models (LLMs) in simplifying Italian administrative texts compared to human informants. We evaluate the performance of several well-known LLMs, namely GPT-3.5-Turbo, GPT-4, LLaMA 3, and Phi 3, in simplifying s-ItaIst, a representative sub-corpus of a larger corpus of Italian administrative documents (ItaIst). To accurately compare the simplification abilities of humans and LLMs, six parallel corpora of this subsection of ItaIst were collected. These parallel corpora were analyzed using both complexity and similarity metrics to assess the outcomes of the LLMs and of the human participants. Our findings indicate that while LLMs perform comparably to humans in many respects, there are notable differences in structural and semantic changes. The results of our study underscore the potential and limitations of using AI for administrative text simplification, highlighting areas where LLMs must improve to achieve human-level proficiency.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Due to the increasing popularity of generative Artificial Intelligence (AI) language tools <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>, significant attention has been devoted to the use of LLMs for text simplification <ref type="bibr" target="#b2">[3]</ref>. Several studies have addressed the application of LLMs to simplify texts, particularly focusing on administrative documents, including those in Italian <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b4">5,</ref><ref type="bibr" target="#b5">6]</ref>. Italian administrative texts are often notably complex and obscure <ref type="bibr" target="#b6">[7,</ref><ref type="bibr" target="#b7">8,</ref><ref type="bibr" target="#b8">9]</ref>, which restricts a large segment of the population from fully accessing the content produced by the Italian public administration <ref type="bibr" target="#b9">[10,</ref><ref type="bibr" target="#b10">11]</ref>.</p><p>This work aims to (a) evaluate the quality of automatic text simplification performed by several well-known LLMs, and (b) compare LLM-based simplification with human-based simplification. To address these research questions, the following procedures were undertaken:</p><p>1. From an empirical perspective, a large corpus of Italian administrative texts was collected (i.e., ItaIst). A parallel simplified counterpart of the corpus was created using different LLMs. Additionally, a shorter version of the administrative corpus was manually simplified by two annotators.</p><p>2. From an analytical perspective, several statistical analyses were conducted to measure the semantic and complexity closeness between human and LLM-generated data. 
The comparison of scores for both LLM and human datasets highlights significant differences and similarities in manual and AI-driven simplification.</p><p>The results concerning readability indexes (e.g., Gulpease) and semantic and structural similarities (e.g., edit distance) reveal that LLMs generally perform comparably to human informants. However, AI-simplified texts are slightly less similar to the original documents than those generated by human simplifiers. LLMs tend to introduce more changes in the simplified corpora than human annotators. The empirical study indicates that texts simplified by AI exhibit more structural and lexical dissimilarities from the original documents than those simplified by humans.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Replication package.</head><p>All the code and data are available on Figshare at https://figshare.com/s/4d927fe648c6f1cb4227.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>Several studies have evaluated the reliability of LLMs in text simplification and assessed the metrics used to measure the quality of their output <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b12">13,</ref><ref type="bibr" target="#b13">14,</ref><ref type="bibr" target="#b14">15,</ref><ref type="bibr" target="#b15">16]</ref>. In particular, numerous studies have focused on the use of LLMs to simplify Italian administrative texts, highlighting the potential of these models to enhance text readability. Some works have evaluated the readability of simplified administrative texts by comparing parallel corpora of simplified documents through a qualitative, interpretative approach <ref type="bibr" target="#b16">[17]</ref>. Other contributions have assessed the outputs of LLMs in simplification tasks, focusing on models partially trained on Italian <ref type="bibr" target="#b17">[18]</ref>.</p><p>Our paper analyzes the differences between LLM and human simplification of Italian administrative texts, following a quantitative approach. By examining these differences, we aim to highlight the similarities and dissimilarities that emerge when administrative documents are simplified by humans and by AI.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Study Design</head><p>Our study aims to analyze the effectiveness of modern LLMs in simplifying administrative text. To achieve this, we address the following Research Question (RQ):</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>How effective are AI systems at simplifying administrative texts compared to humans?</head><p>This question evaluates whether modern AI can achieve a level of quality comparable to that of human experts, who serve as our reference, by analyzing how well LLMs can reduce complexity while preserving the original meaning of the texts.</p><p>The study was conducted on a sub-corpus of ItaIst, utilizing several LLMs to support the text simplification process.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Corpus</head><p>The ItaIst corpus was created as part of the VerbACxSS research project. It was compiled by linguists and jurists to create a representative linguistic resource for contemporary administrative Italian <ref type="bibr" target="#b18">[19,</ref><ref type="bibr" target="#b19">20]</ref>. ItaIst was assembled by collecting recent official documents from local and regional public administration websites of eight Italian regions (Basilicata, Calabria, Campania, Lazio, Lombardy, Molise, Tuscany, and Veneto), covering topics such as waste management, healthcare, and public services. The corpus includes a variety of text types, such as tender notices, planning acts, and service charters.</p><p>The reliability of the corpus design was ensured by (a) linguists, who verified that the corpus represents administrative Italian in terms of textual and diatopic features, and (b) jurists, who selected and validated each document included in ItaIst. The resulting corpus, comprising 208 documents, consists of around 2,000,000 tokens and 45,000 types (available at https://huggingface.co/datasets/VerbACxSS/ItaIst). More information about the ItaIst corpus can be found in Appendix A.</p><p>To make a fair comparison between humans and AI, a sub-corpus of ItaIst (hereinafter, s-ItaIst) was extracted. It was composed by selecting representative documents from each region, balancing the topics and text types of the main corpus. Table <ref type="table" target="#tab_0">1</ref> provides a summary of s-ItaIst.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">LLMs</head><p>To investigate both open-source and commercial models, the s-ItaIst corpus was simplified using four distinct LLMs, namely GPT-3.5-Turbo <ref type="bibr" target="#b20">[21]</ref> and GPT-4 <ref type="bibr" target="#b21">[22]</ref> by OpenAI, LLaMA 3 <ref type="bibr" target="#b22">[23]</ref> by Meta, and Phi 3 <ref type="bibr" target="#b22">[23]</ref> by Microsoft. For the open-source models, we used the LLaMA 3 8B<ref type="foot" target="#foot_0">2</ref> and Phi 3 3.8B<ref type="foot" target="#foot_1">3</ref> variants, both fine-tuned on large Italian corpora. This selection explores models of various sizes while ensuring optimal performance on Italian tasks.</p><p>A detailed prompt was formulated to instruct each model to perform the simplification task properly, avoiding summarization and applying state-of-the-art simplification rules <ref type="bibr" target="#b8">[9]</ref>. The full prompt can be found in Appendix B.</p><p>The OpenAI models were accessed via APIs<ref type="foot" target="#foot_2">4</ref>, while the open-source models were hosted on an AWS EC2 G6<ref type="foot" target="#foot_3">5</ref> instance equipped with a single Nvidia L4 GPU with 24 GB of vRAM.</p></div>
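The API-based simplification step can be sketched as follows. This is not the authors' code: the helper names are hypothetical, the system prompt is an abbreviated stand-in for the full prompt in Appendix B, and the client usage assumes the official `openai` Python package.

```python
# Sketch of sending one corpus section to a chat-completion endpoint.
# SYSTEM_PROMPT abbreviates the full prompt reported in Appendix B.

SYSTEM_PROMPT = (
    "Sei un dipendente pubblico che deve riscrivere documenti istituzionali "
    "italiani per renderli semplici e comprensibili per i cittadini, senza "
    "modificare il significato del documento originale."
)

def build_messages(section: str) -> list:
    """Assemble the chat messages sent for a single document section."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": section},
    ]

def simplify_section(client, section: str, model: str = "gpt-4") -> str:
    """Illustrative call: `client` is an openai.OpenAI instance
    (requires an API key and network access, so it is not invoked here)."""
    reply = client.chat.completions.create(
        model=model,
        messages=build_messages(section),
    )
    return reply.choices[0].message.content
```

The open-source models hosted on EC2 would be called the same way through any OpenAI-compatible serving layer, swapping only the client's base URL and model name.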
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Experimental Procedure</head><p>To address our research question, we conducted an empirical study to compare automatic and manual simplifications. Our study, illustrated in Figure <ref type="figure" target="#fig_0">1</ref>, can be summarized in three main steps: (i) constructing a corpus of administrative documents (i.e., s-ItaIst), (ii) simplifying this corpus using four LLMs and two human annotators, and (iii) comparing the LLM-simplified corpora with the human-simplified corpora.</p><p>It is worth noting that the s-ItaIst corpus was subdivided into small sections (2-6 sentences) to avoid exceeding the context windows of the LLMs and to facilitate human informants during simplification <ref type="foot" target="#foot_4">6</ref> .</p></div>
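The sectioning step above can be sketched as a simple greedy grouping of already-segmented sentences. The function name and the fixed-size policy are our assumptions for illustration; the paper only states that sections contain 2-6 sentences.

```python
# Minimal sketch of splitting a document into small sections so that each
# section fits the models' context window and is manageable for annotators.

def split_into_sections(sentences, max_len=6):
    """Group consecutive sentences into sections of at most max_len items."""
    return [sentences[i:i + max_len] for i in range(0, len(sentences), max_len)]
```

For example, a 14-sentence document yields sections of 6, 6, and 2 sentences.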
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Human annotators with strong backgrounds in linguistics and deep knowledge of administrative text simplification simplified the corpus following common simplification rules identified in the literature <ref type="bibr" target="#b23">[24,</ref><ref type="bibr" target="#b24">25,</ref><ref type="bibr" target="#b7">8,</ref><ref type="bibr" target="#b8">9]</ref>. They used a custom web application that (i) assigned the sections of the document to simplify and (ii) tracked the time spent on this activity. Similarly, each LLM was instructed to automatically simplify every document in the corpus one section at a time.</p><p>This approach yielded a comprehensive comparison dataset of six distinct parallel corpora. We analyzed these data to compare human and automatic simplifications, extracting complexity and similarity metrics to measure the quality of the simplified texts and their relatedness to the original text. Furthermore, we applied the Wilcoxon Signed-Rank Test <ref type="bibr" target="#b25">[26]</ref> to statistically evaluate the difference between LLM and human metrics, and Cliff's Delta <ref type="bibr" target="#b26">[27,</ref><ref type="bibr" target="#b27">28]</ref> to measure the effect size.</p></div>
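The statistical comparison above can be sketched as follows. Cliff's delta is implemented directly; the magnitude thresholds follow the common Romano et al. conventions, which are an assumption on our part since the paper does not state them explicitly.

```python
# Hedged sketch of the effect-size computation used to compare LLM and
# human metric distributions.

def cliffs_delta(xs, ys):
    """Cliff's delta: P(X > Y) - P(X < Y) over all cross-pairs."""
    gt = sum(1 for x in xs for y in ys if x > y)
    lt = sum(1 for x in xs for y in ys if x < y)
    return (gt - lt) / (len(xs) * len(ys))

def magnitude(delta):
    """Map |delta| to the conventional effect-size label."""
    d = abs(delta)
    if d < 0.147:
        return "negligible"
    if d < 0.33:
        return "small"
    if d < 0.474:
        return "medium"
    return "large"

# The paired significance test would come from SciPy (illustrative):
# from scipy.stats import wilcoxon
# stat, p_value = wilcoxon(llm_scores, human_scores)
```

The delta ranges over [-1, 1]: 0 means the two samples overlap completely, while ±1 means complete separation.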
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.">Metrics</head><p>To assess the quality of the simplifications, we employed both complexity and similarity metrics from the literature. Complexity metrics compare the ease of the original and simplified texts, while similarity metrics measure the distance between them. We implemented these metrics according to the state of the art, leveraging natural language processing (NLP) techniques (e.g., tokenization and POS tagging, both performed with the spaCy NLP tool: https://spacy.io). Several simplicity measures are employed in the literature (for instance, SAMSA <ref type="bibr" target="#b28">[29]</ref> and SARI <ref type="bibr" target="#b29">[30]</ref>), although their results may vary depending on the level of analysis examined and, of course, on the design of the metric: SAMSA measures structural simplicity by monitoring sentence-splitting accuracy, while SARI was developed to measure the simplicity gain when only lexical paraphrasing is evaluated. Furthermore, some studies show that, when calculated against multi-operation manual references, both a generic metric like BLEU <ref type="bibr" target="#b30">[31]</ref> and an operation-specific one like SARI correlate poorly with assessments of overall simplicity <ref type="bibr" target="#b31">[32]</ref>. Thus, to measure the readability of the investigated corpora, we selected (1) the Flesch Vacca Index, the Gulpease Index, and READ-IT, as they are advanced instruments designed to assess the simplicity of Italian texts, and (2) the percentages of certain lexical and structural features (i.e., the share of very common lexical items and of active verb forms) that increase the readability of texts.</p><p>The computational literature likewise offers several resources for similarity metrics, aiming to measure the structural or semantic proximity of texts. 
Some of these operate at the level of n-gram overlap (e.g., BLEU <ref type="bibr" target="#b30">[31]</ref> and METEOR <ref type="bibr" target="#b32">[33]</ref>), while others consider other features. For this analysis, we selected Semantic Similarity, to quantify the degree of semantic closeness between corpora, and Edit distance, to measure structural similarities between the investigated corpora.</p><p>To support future research, we have made our implementation of the metrics publicly available <ref type="foot" target="#foot_5">8</ref>. The complexity metrics we considered are detailed below:</p><p>• Gulpease Index <ref type="bibr" target="#b33">[34]</ref>: This metric evaluates the readability of an Italian text and assesses the education level required to fully comprehend it. It is calculated using the following formula:</p><formula xml:id="formula_0">89 + (300 * 𝑠𝑒𝑛𝑡𝑒𝑛𝑐𝑒𝑠 − 10 * 𝑐ℎ𝑎𝑟𝑎𝑐𝑡𝑒𝑟𝑠) / 𝑡𝑜𝑘𝑒𝑛𝑠<label>(1)</label></formula><p>• Flesch Vacca Index <ref type="bibr" target="#b34">[35]</ref>: This is an adaptation of the original Flesch Reading Ease formula for evaluating the readability of Italian texts, computed as follows:</p><formula xml:id="formula_1">217 − 130 * 𝑠𝑦𝑙𝑙𝑎𝑏𝑙𝑒𝑠 / 𝑡𝑜𝑘𝑒𝑛𝑠 − 𝑡𝑜𝑘𝑒𝑛𝑠 / 𝑠𝑒𝑛𝑡𝑒𝑛𝑐𝑒𝑠<label>(2)</label></formula><p>• READ-IT <ref type="bibr" target="#b35">[36]</ref>: The tool is the first advanced readability evaluation instrument for Italian, combining traditional raw text features with lexical, morpho-syntactic, and syntactic information. 
Four different readability models are included in the tool: READ-IT BASE includes only raw features, calculating sentence length (average number of words per sentence) and word length (average number of characters per word); READ-IT LEXICAL combines raw (e.g., word length) and lexical (e.g., Type/Token Ratio) features; READ-IT SYNTACTIC employs raw text (e.g., sentence length) and morpho-syntactic (e.g., average number of clauses per sentence) properties; READ-IT GLOBAL combines all of the previous features, i.e., raw text, lexical, morpho-syntactic, and syntactic (e.g., the depth of the whole parse tree) features <ref type="foot" target="#foot_6">9</ref>. • NVdB (%): "Il Nuovo vocabolario di base della lingua italiana" <ref type="bibr" target="#b36">[37]</ref> consists of fundamental and commonly used words representing the essential lexicon of the Italian language. The ease of a text can be roughly estimated by the number of its words listed in this basic vocabulary <ref type="bibr" target="#b37">[38]</ref>. • Passive (%): Overuse of the passive voice can lead to ambiguity and complexity, especially for readers who may struggle with comprehension <ref type="bibr" target="#b23">[24,</ref><ref type="bibr" target="#b24">25,</ref><ref type="bibr" target="#b8">9]</ref>. It is calculated by identifying verbs bearing the aux:pass relation in the dependency parse tree.</p><p>The similarity metrics we considered are detailed below:</p><p>• Semantic Similarity (%) <ref type="bibr" target="#b38">[39]</ref>: This metric measures the distance between the semantic meanings of two documents. It can be computed using relevant methodologies from the literature, such as BERTscore <ref type="bibr" target="#b39">[40]</ref> and SBERT <ref type="bibr" target="#b40">[41]</ref>. 
We opted for the latter approach, which leverages cosine similarity between contextual embeddings (obtained through sentence-transformers and an open-source multilingual model<ref type="foot" target="#foot_7">10</ref> ) to evaluate similarity at the sentence level, encapsulating the overall contextual meaning <ref type="bibr" target="#b41">[42]</ref>. • Edit distance (%) <ref type="bibr" target="#b42">[43]</ref>: This metric measures the similarity between two strings based on the number of single-character edits (insertions, deletions, or substitutions) required to transform one text into the other. A value close to zero indicates a relatively minor difference between the two texts, while a high value indicates significant rephrasing.</p></div>
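Two of the metrics above can be sketched in plain Python: the Gulpease index following Eq. (1), and the normalized edit distance. Tokenization and sentence counting here are naively string-based for illustration; the paper uses spaCy for these steps.

```python
# Minimal stdlib sketches of the Gulpease index (Eq. 1) and the
# length-normalized Levenshtein edit distance, expressed as a percentage.

def gulpease(text):
    """Gulpease index: 89 + (300*sentences - 10*characters) / tokens."""
    sentences = max(1, sum(text.count(c) for c in ".!?"))
    tokens = text.split()
    characters = sum(len(t.strip(".,;:!?")) for t in tokens)
    return 89 + (300 * sentences - 10 * characters) / len(tokens)

def edit_distance_pct(a, b):
    """Levenshtein distance normalized by the longer string, in percent:
    near 0 means minor changes, high values mean heavy rephrasing."""
    if not a and not b:
        return 0.0
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return 100 * prev[-1] / max(len(a), len(b))
```

A higher Gulpease score indicates an easier text; Semantic Similarity would instead be obtained from cosine similarity between sentence-transformer embeddings, as described above.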
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.5.">Threats to validity</head><p>We analyze the validity of our study by examining construct, internal, and external validity. This evaluation helps us understand the strengths and limitations of our methodology and the generalizability of our findings.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Construct validity:</head><p>The two linguistic experts involved in the manual simplification of the s-ItaIst corpus may have produced divergent variants due to their subjective approaches. Despite differences in seniority, both experts have strong linguistic backgrounds (holding PhDs) and several years of experience. Nevertheless, involving two human simplifiers allowed us to explore distinct simplification approaches and to compare automatic simplification against two varied benchmarks.</p><p>Internal validity: The LLMs used for automatic text simplification, particularly those from HuggingFace, may have been trained on non-administrative texts, potentially introducing issues into the simplified text. However, we relied on state-of-the-art models tested against several benchmarks <ref type="bibr" target="#b43">[44,</ref><ref type="bibr" target="#b44">45,</ref><ref type="bibr" target="#b45">46,</ref><ref type="bibr" target="#b46">47]</ref>. Additionally, the embeddings for calculating Semantic Similarity were obtained through a multilingual model chosen for its high ranking on the MTEB leaderboard <ref type="foot" target="#foot_8">11</ref>, particularly for its performance on the Italian subset of the STS22 benchmark <ref type="bibr" target="#b47">[48]</ref>.</p><p>External validity: Our study focuses on the s-ItaIst sub-corpus, consisting of eight administrative documents. Although the number of documents is relatively small, the corpus includes over 1,000 sentences. Manual simplification of the corpus took Human1 and Human2 15 and 23 hours, respectively. Extending our study to the entire ItaIst corpus would have been infeasible. However, the documents of the s-ItaIst sub-corpus were not chosen randomly; they were selected to represent the variety of administrative texts.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 2</head><p>Metrics evaluated across the original corpus and the human and LLM simplified corpora.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Results and Discussion</head><p>A preliminary analysis of our results, summarized in Table <ref type="table">2</ref>, reveals several significant similarities and differences between the human and LLM datasets. For instance, the variation in the number of tokens is similar across both human and LLM corpora, although LLMs generally increase the number of sentences more prominently than human annotators. Regarding complexity metrics, all the parallel corpora (both human and LLM) exhibit a general increase in readability compared to the original texts. For example, the majority of the corpora improve the Gulpease Index readability metric, shifting the difficulty level from very difficult to difficult for middle school reading levels <ref type="bibr" target="#b33">[34]</ref> (except for Human1 and GPT-3.5-Turbo). Additionally, complexity metrics vary similarly across both human and LLM groups, with differences between manual and AI simplifiers not significantly greater than those between Human1 and Human2 or among GPT-3.5-Turbo, GPT-4, LLaMA 3, and Phi 3.</p><p>The analysis of semantic and structural distance metrics from the original s-ItaIst shows more pronounced differences between human and LLM datasets. In terms of semantic similarity (Semantic Similarity), the Human1 and Human2 corpora are closer to the original meaning than the LLM-simplified corpora. These differences are even more pronounced when considering edit distance (Edit distance). The percentage of edit distance is higher in the LLM group, with each LLM corpus exceeding the human ones by at least 10%.</p><p>Higher degrees of Semantic Similarity and lower degrees of Edit distance in human corpora indicate that human annotators tend to make fewer changes to the original text compared to LLMs.</p><p>As reported in Table <ref type="table">2</ref>, GPT-4 achieved the best results across the majority of metrics (except for READ-IT LEXICAL). 
To validate our outcomes, we performed the Wilcoxon Signed-Rank Test and calculated Cliff's Delta effect size to analyze the difference between GPT-4 and human metrics. By examining the results in Table <ref type="table" target="#tab_2">3</ref>, we can assert that GPT-4 simplifications are comparable to human simplifications: they are negligibly better on complexity metrics, moderately worse on similarity, and much more extensively rephrased than the human ones.</p><p>The results of the Wilcoxon Signed-Rank Test and Cliff's Delta Effect Size for the other models, though not fully significant, are listed in Appendix C.</p><p>A brief extract taken from the Original, Human1, Human2, and GPT-4 parallel corpora, showing the same phrase simplified by the two human annotators and by GPT-4, is presented below <ref type="foot" target="#foot_9">12</ref>:</p><p>Original: fatturato minimo annuo, per gli ultimi tre esercizi, pari o superiore al valore stimato del presente appalto Human1: Guadagno in un anno (fatturato minimo annuo) negli ultimi 3 anni di valore uguale o superiore al valore di questo bando Human2: l'ammontare di fatture emesse annualmente, per gli ultimi tre anni, deve essere pari o superiore al valore stimato del presente appalto GPT-4: un fatturato annuo minimo, negli ultimi tre anni, uguale o maggiore al valore stimato dell'appalto In the excerpts above, the similarities between the simplifications are evident: for example, the technical term esercizio and the more ambiguous word pari are replaced by the more common lexical equivalents anno and uguale, respectively.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion</head><p>In this study, we investigated the automatic simplification of Italian administrative documents. Our results demonstrate that LLMs can effectively simplify these texts, performing comparably to humans. Further evidence that LLM simplifications preserve the meaning of the original texts comes from an unpublished study conducted on the same data, in which experienced evaluators, i.e., jurists with administrative competence, agreed that LLM simplifications of administrative texts maintain the legal integrity of the original documents <ref type="bibr" target="#b48">[49]</ref>.</p><p>Among the models examined, GPT-4 shows superior performance in text simplification, exhibiting significant improvements in complexity metrics. Nonetheless, it is noteworthy that humans tend to achieve higher Semantic Similarity and lower Edit distance, ensuring the preservation of the original meaning and structure of the text. In other words, humans, aware of the importance of precise language in these documents, mostly preserved the original meaning and structure, whereas LLMs, while simplifying, tended to rephrase extensively. This rephrasing, although effective in reducing complexity, might inadvertently alter legal nuances, which are critical in administrative texts. Despite this limitation, LLMs can serve as valuable support tools for text simplification, significantly accelerating a process that typically requires hours of manual work. By generating initial drafts, LLMs can reduce the workload of human experts, who would then review and refine the AI-generated drafts, ensuring the preservation of the overall meaning and legal integrity of the text. The results of our study indicate that modern LLMs can simplify administrative documents almost as effectively as humans. 
However, our findings also indicate that LLMs do not fully preserve the semantic meaning of the text, tending to rephrase more extensively than humans. This could introduce legal issues into the simplified text. Further studies could evaluate the juridical equivalence of automatically simplified documents: a manual investigation of our parallel corpus, supervised by expert jurists, may reveal important implications in this sensitive context.</p><p>Another promising direction for future research is to investigate the impact of automatic simplification on text comprehension. An additional empirical study could be designed to evaluate whether automatically simplified documents are easier to understand than their original versions.</p><p>Additionally, it would be worthwhile to explore different prompting strategies to further improve simplification quality. For instance, few-shot prompting <ref type="bibr" target="#b49">[50]</ref> with a few manually simplified gold samples could better align LLMs with human style. </p></div>
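The few-shot idea sketched above would amount to prepending gold (original, simplified) pairs as prior conversation turns before the target section. The helper name and message layout below are our assumptions, not part of the paper.

```python
# Illustrative sketch of few-shot prompt construction for simplification:
# each gold pair becomes a user/assistant turn preceding the target section.

def few_shot_messages(system_prompt, gold_pairs, target_section):
    """Build a chat-message list seeded with gold (original, simplified) pairs."""
    messages = [{"role": "system", "content": system_prompt}]
    for original, simplified in gold_pairs:
        messages.append({"role": "user", "content": original})
        messages.append({"role": "assistant", "content": simplified})
    messages.append({"role": "user", "content": target_section})
    return messages
```

Passing such a list in place of the zero-shot messages would let the model imitate the annotators' conservative editing style.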
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Corpus ItaIst</head><p>The ItaIst corpus is a comprehensive collection of Italian administrative documents. Table <ref type="table" target="#tab_6">4</ref> provides an overview of the topics and regions from which these documents were collected. This corpus has been assembled to represent the diversity and complexity of contemporary administrative Italian, ensuring its relevance for linguistic and computational analysis. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Prompt engineering</head><p>In the context of LLMs, the term prompt refers to the instructions provided to a language model to generate a specific response. Prompt engineering is the process of designing a clear and detailed prompt to instruct the model to generate a desired response. The prompt we used to ask the models to simplify administrative text is:</p><p>Sei un dipendente pubblico che deve scrivere dei documenti istituzionali italiani per renderli semplici e comprensibili per i cittadini. Ti verrà fornito un documento pubblico e il tuo compito sarà quello di riscriverlo applicando regole di semplificazione senza però modificare il significato del documento originale. Ad esempio potresti rendere le frasi più brevi, eliminare le perifrasi, esplicitare sempre il soggetto, utilizzare parole più semplicii, trasformare i verbi passivi in verbi di forma attiva, spostare le frasi parentetiche alla fine del periodo.</p><p>In English: "You are a public employee who must write Italian institutional documents so that they are simple and understandable for citizens. You will be given a public document, and your task will be to rewrite it applying simplification rules without changing the meaning of the original document. For example, you could make sentences shorter, remove circumlocutions, always make the subject explicit, use simpler words, turn passive verbs into active ones, and move parenthetical clauses to the end of the sentence."</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Tests</head><p>Table <ref type="table" target="#tab_3">5</ref>, Table <ref type="table" target="#tab_4">6</ref>, and Table <ref type="table" target="#tab_5">7</ref> report the results of the statistical analyses conducted to compare the simplification performance of various LLMs against human experts.</p><p>The Wilcoxon Signed-Rank Test and Cliff's Delta effect size were employed to evaluate the metrics of GPT-3.5-Turbo, LLaMA 3, and Phi 3 models in comparison to two human simplifiers, labelled as Human1 and Human2. These analyses provide insights into the relative effectiveness of AI-driven simplifications versus human efforts.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>D. Examples</head><p>Table <ref type="table" target="#tab_7">8</ref> provides several examples of text simplification.</p><p>For each example, we present the original text alongside its simplified versions. The values of the complexity and similarity metrics are reported for each text. </p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1:</head><label>1</label><figDesc>Figure 1: Experimental design schema: The s-ItaIst corpus was simplified both automatically and manually by two humans and four LLMs. The resulting parallel corpora were analyzed using complexity and similarity metrics.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>An overview of the main metrics of the s-ItaIst corpus.</figDesc><table><row><cell>Metrics</cell><cell>Value</cell></row><row><cell># documents</cell><cell>8</cell></row><row><cell># sentences</cell><cell>1,314</cell></row><row><cell># tokens</cell><cell>33,295</cell></row><row><cell># types</cell><cell>5,622</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>Results of the Wilcoxon Signed-Rank Test and Cliff's Delta Effect Size performed on GPT-4, Human1, and Human2 metrics.</figDesc><table><row><cell></cell><cell>Metrics</cell><cell>p-value</cell><cell>Effect Size</cell><cell></cell></row><row><cell>Human1</cell><cell>Gulpease Index</cell><cell>&lt; 0.0001</cell><cell>negligible</cell><cell>↗</cell></row><row><cell></cell><cell>Flesch Vacca Index</cell><cell>&lt; 0.0001</cell><cell>negligible</cell><cell>↗</cell></row><row><cell></cell><cell>NVdB</cell><cell>0.0108</cell><cell>negligible</cell><cell>↗</cell></row><row><cell></cell><cell>Passive</cell><cell>0.0004</cell><cell>negligible</cell><cell>↘</cell></row><row><cell></cell><cell>READ-IT BASE</cell><cell>&lt; 0.0001</cell><cell>small</cell><cell>↘</cell></row><row><cell></cell><cell>READ-IT LEXICAL</cell><cell>&lt; 0.0001</cell><cell>negligible</cell><cell>↗</cell></row><row><cell></cell><cell>READ-IT SYNTACTIC</cell><cell>&lt; 0.0001</cell><cell>small</cell><cell>↘</cell></row><row><cell></cell><cell>READ-IT GLOBAL</cell><cell>&lt; 0.0001</cell><cell>small</cell><cell>↘</cell></row><row><cell></cell><cell>Semantic Similarity</cell><cell>&lt; 0.0001</cell><cell>small</cell><cell>↘</cell></row><row><cell></cell><cell>Edit distance</cell><cell>&lt; 0.0001</cell><cell>large</cell><cell>↗</cell></row><row><cell>Human2</cell><cell>Gulpease Index</cell><cell>0.0092</cell><cell>negligible</cell><cell>↗</cell></row><row><cell></cell><cell>Flesch Vacca Index</cell><cell>&lt; 0.0001</cell><cell>negligible</cell><cell>↗</cell></row><row><cell></cell><cell>NVdB</cell><cell>&lt; 0.0001</cell><cell>small</cell><cell>↗</cell></row><row><cell></cell><cell>Passive</cell><cell>&lt; 0.0001</cell><cell>negligible</cell><cell>↘</cell></row><row><cell></cell><cell>READ-IT BASE</cell><cell>0.0292</cell><cell>negligible</cell><cell>↗</cell></row><row><cell></cell><cell>READ-IT LEXICAL</cell><cell></cell><cell></cell><cell></cell></row><row><cell></cell><cell>READ-IT SYNTACTIC</cell><cell>&lt; 0.0001</cell><cell>negligible</cell><cell>↘</cell></row><row><cell></cell><cell>READ-IT GLOBAL</cell><cell>&lt; 0.0001</cell><cell>negligible</cell><cell>↘</cell></row><row><cell></cell><cell>Semantic Similarity</cell><cell>&lt; 0.0001</cell><cell>medium</cell><cell>↘</cell></row><row><cell></cell><cell>Edit distance</cell><cell>&lt; 0.0001</cell><cell>large</cell><cell>↗</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 5</head><label>5</label><figDesc>Results of the Wilcoxon Signed-Rank Test and Cliff's Delta Effect Size performed on GPT-3.5-Turbo, Human1, and Human2 metrics.</figDesc><table><row><cell></cell><cell>Metrics</cell><cell>p-value</cell><cell>Effect Size</cell><cell></cell></row><row><cell>Human1</cell><cell>Gulpease Index</cell><cell>&lt; 0.0001</cell><cell>negligible</cell><cell>↘</cell></row><row><cell></cell><cell>Flesch Vacca Index</cell><cell>&lt; 0.0001</cell><cell>negligible</cell><cell>↘</cell></row><row><cell></cell><cell>NVdB</cell><cell>&lt; 0.0001</cell><cell>negligible</cell><cell>↘</cell></row><row><cell></cell><cell>Passive</cell><cell>0.0052</cell><cell>negligible</cell><cell>↘</cell></row><row><cell></cell><cell>READ-IT BASE</cell><cell></cell><cell></cell><cell></cell></row><row><cell></cell><cell>READ-IT LEXICAL</cell><cell>&lt; 0.0001</cell><cell>negligible</cell><cell>↗</cell></row><row><cell></cell><cell>READ-IT SYNTACTIC</cell><cell>&lt; 0.0001</cell><cell>small</cell><cell>↘</cell></row><row><cell></cell><cell>READ-IT GLOBAL</cell><cell></cell><cell></cell><cell></cell></row><row><cell></cell><cell>Semantic Similarity</cell><cell>&lt; 0.0001</cell><cell>small</cell><cell>↘</cell></row><row><cell></cell><cell>Edit distance</cell><cell>&lt; 0.0001</cell><cell>medium</cell><cell>↗</cell></row><row><cell>Human2</cell><cell>Gulpease Index</cell><cell>&lt; 0.0001</cell><cell>small</cell><cell>↘</cell></row><row><cell></cell><cell>Flesch Vacca Index</cell><cell>&lt; 0.0001</cell><cell>negligible</cell><cell>↘</cell></row><row><cell></cell><cell>NVdB</cell><cell>&lt; 0.0001</cell><cell>negligible</cell><cell>↗</cell></row><row><cell></cell><cell>Passive</cell><cell>0.0072</cell><cell>negligible</cell><cell>↘</cell></row><row><cell></cell><cell>READ-IT BASE</cell><cell>&lt; 0.0001</cell><cell>small</cell><cell>↗</cell></row><row><cell></cell><cell>READ-IT LEXICAL</cell><cell>0.0091</cell><cell>negligible</cell><cell>↗</cell></row><row><cell></cell><cell>READ-IT SYNTACTIC</cell><cell></cell><cell></cell><cell></cell></row><row><cell></cell><cell>READ-IT GLOBAL</cell><cell>0.0003</cell><cell>negligible</cell><cell>↗</cell></row><row><cell></cell><cell>Semantic Similarity</cell><cell>&lt; 0.0001</cell><cell>medium</cell><cell>↘</cell></row><row><cell></cell><cell>Edit distance</cell><cell>&lt; 0.0001</cell><cell>large</cell><cell>↗</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 6</head><label>6</label><figDesc>Results of the Wilcoxon Signed-Rank Test and Cliff's Delta Effect Size performed on LLaMA 3, Human1, and Human2 metrics.</figDesc><table><row><cell></cell><cell>Metrics</cell><cell>p-value Effect Size</cell><cell></cell></row><row><cell></cell><cell>Gulpease Index</cell><cell>0.0077 negligible</cell><cell>↗</cell></row><row><cell>Human1</cell><cell>Flesch Vacca Index NVdB Passive READ-IT BASE</cell><cell>&lt; 0.0001 small</cell><cell>↘</cell></row><row><cell></cell><cell>READ-IT LEXICAL</cell><cell>&lt; 0.0001 negligible</cell><cell>↘</cell></row><row><cell></cell><cell>READ-IT SYNTACTIC</cell><cell>&lt; 0.0001 small</cell><cell>↘</cell></row><row><cell></cell><cell>READ-IT GLOBAL</cell><cell>&lt; 0.0001 small</cell><cell>↘</cell></row><row><cell></cell><cell>Semantic Similarity</cell><cell>&lt; 0.0001 medium</cell><cell>↘</cell></row><row><cell></cell><cell>Edit distance</cell><cell>&lt; 0.0001 large</cell><cell>↗</cell></row><row><cell></cell><cell>Gulpease Index</cell><cell></cell><cell></cell></row><row><cell>Human2</cell><cell>Flesch Vacca Index NVdB Passive READ-IT BASE</cell><cell>&lt; 0.0001 small &lt; 0.0001 negligible</cell><cell>↗ ↗</cell></row><row><cell></cell><cell>READ-IT LEXICAL</cell><cell>&lt; 0.0001 small</cell><cell>↘</cell></row><row><cell></cell><cell>READ-IT SYNTACTIC</cell><cell></cell><cell></cell></row><row><cell></cell><cell>READ-IT GLOBAL</cell><cell></cell><cell></cell></row><row><cell></cell><cell>Semantic Similarity</cell><cell>&lt; 0.0001 large</cell><cell>↘</cell></row><row><cell></cell><cell>Edit distance</cell><cell>&lt; 0.0001 large</cell><cell>↗</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head>Table 7</head><label>7</label><figDesc>Results of the Wilcoxon Signed-Rank Test and Cliff's Delta Effect Size performed on Phi 3, Human1, and Human2 metrics.</figDesc><table><row><cell></cell><cell>Metrics</cell><cell>p-value Effect Size</cell><cell></cell></row><row><cell></cell><cell>Gulpease Index</cell><cell>0.0134 negligible</cell><cell>↗</cell></row><row><cell>Human1</cell><cell>Flesch Vacca Index NVdB Passive READ-IT BASE</cell><cell>&lt; 0.0001 small</cell><cell>↘</cell></row><row><cell></cell><cell>READ-IT LEXICAL</cell><cell>&lt; 0.0001 negligible</cell><cell>↘</cell></row><row><cell></cell><cell>READ-IT SYNTACTIC</cell><cell>&lt; 0.0001 small</cell><cell>↘</cell></row><row><cell></cell><cell>READ-IT GLOBAL</cell><cell>&lt; 0.0001 small</cell><cell>↘</cell></row><row><cell></cell><cell>Semantic Similarity</cell><cell>&lt; 0.0001 medium</cell><cell>↘</cell></row><row><cell></cell><cell>Edit distance</cell><cell>&lt; 0.0001 large</cell><cell>↗</cell></row><row><cell></cell><cell>Gulpease Index</cell><cell></cell><cell></cell></row><row><cell>Human2</cell><cell>Flesch Vacca Index NVdB Passive READ-IT BASE</cell><cell>&lt; 0.0001 small &lt; 0.0001 negligible</cell><cell>↗ ↗</cell></row><row><cell></cell><cell>READ-IT LEXICAL</cell><cell>&lt; 0.0001 small</cell><cell>↘</cell></row><row><cell></cell><cell>READ-IT SYNTACTIC</cell><cell></cell><cell></cell></row><row><cell></cell><cell>READ-IT GLOBAL</cell><cell></cell><cell></cell></row><row><cell></cell><cell>Semantic Similarity</cell><cell>&lt; 0.0001 large</cell><cell>↘</cell></row><row><cell></cell><cell>Edit distance</cell><cell>&lt; 0.0001 large</cell><cell>↗</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_6"><head>Table 4</head><label>4</label><figDesc>Topics and regions of documents collected in ItaIst</figDesc><table><row><cell></cell><cell cols="3">Garbage Healthcare Public services</cell></row><row><cell>Basilicata</cell><cell>8</cell><cell>3</cell><cell>9</cell></row><row><cell>Calabria</cell><cell>11</cell><cell>5</cell><cell>9</cell></row><row><cell>Campania</cell><cell>14</cell><cell>7</cell><cell>9</cell></row><row><cell>Lazio</cell><cell>9</cell><cell>3</cell><cell>9</cell></row><row><cell>Lombardia</cell><cell>15</cell><cell>3</cell><cell>11</cell></row><row><cell>Molise</cell><cell>10</cell><cell>7</cell><cell>9</cell></row><row><cell>Toscana</cell><cell>19</cell><cell>4</cell><cell>12</cell></row><row><cell>Veneto</cell><cell>9</cell><cell>5</cell><cell>10</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_7"><head>Table 8</head><label>8</label><figDesc>Examples of simplifications. Original: "L'operatore di Polizia Locale, quindi, rappresenta un importante punto di riferimento per la collettività. Nell'ambito delle sue funzioni, esso svolge i propri compiti in maniera autorevole, dando prova di preparazione professionale e sensibilità nel contatto relazionale. La sua attività, inoltre, è caratterizzata dal costante sforzo teso alla migliore interpretazione delle situazioni e delle problematiche incontrate, applicando un approccio educativo e orientato alla più adeguata risposta ai bisogni della cittadinanza." Human1: "La Polizia Locale è un punto di riferimento per i cittadini. La Polizia Locale ha autorevolezza, professionalità e sensibilità nel contatto con i cittadini. La Polizia Locale cerca sempre di interpretare al meglio situazioni e problematiche incontrate. La Polizia Locale si comporta in modo da educare e rispondere adeguatamente ai bisogni dei cittadini." Human2: "L'operatore di Polizia Locale, quindi, è un importante punto di riferimento per la collettività. Quando è in servizio, esso svolge i propri compiti in maniera autorevole. Dimostra preparazione professionale e sensibilità nel contatto con gli utenti. Cerca sempre di interpretare le situazioni e i problemi nel migliore dei modi. Applica un approccio educativo e cerca di adeguarsi ai bisogni della cittadinanza." GPT-4: "L'operatore di Polizia Locale è un punto di riferimento importante per la comunità. Svolge i suoi compiti con autorità, mostrando competenza professionale e sensibilità nelle relazioni. Inoltre, lavora sempre per capire al meglio le situazioni e i problemi che incontra, usando un approccio educativo per rispondere adeguatamente ai bisogni dei cittadini." GPT-3.5-Turbo: "L'operatore di Polizia Locale è un punto di riferimento importante per la comunità. Nell'esercizio delle sue funzioni, svolge i compiti con autorevolezza, dimostrando professionalità e sensibilità nei rapporti con le persone. Si impegna costantemente a comprendere al meglio le situazioni e le problematiche affrontate, adottando un approccio educativo e orientato a rispondere in modo adeguato ai bisogni dei cittadini." LLaMA 3: "L'operatore di Polizia Locale è un punto di riferimento importante per la comunità. Esegue i suoi compiti con autorità, professionalità e sensibilità nel rapporto con la gente. La sua attività è caratterizzata dal costante impegno per comprendere meglio le situazioni e i problemi, e per rispondere in modo educativo ai bisogni dei cittadini, con un approccio orientato alla loro assistenza." Phi 3: "L'operatore di Polizia Locale è un punto di riferimento importante per la comunità. Esegue i suoi compiti con autorità, professionalità e sensibilità nel trattare le persone. La sua attività è guidata dal desiderio di capire meglio le situazioni e le problematiche, e di rispondere in modo appropriato ai bisogni dei cittadini, con un approccio educativo."</figDesc><table><row><cell></cell><cell>Gulpease Index</cell><cell>Flesch Vacca Index</cell><cell>NVdB (%)</cell><cell>Passive (%)</cell><cell>Semantic Similarity (%)</cell><cell>Edit distance (%)</cell></row><row><cell>Original</cell><cell>38</cell><cell>12</cell><cell>77 %</cell><cell>28 %</cell><cell>-</cell><cell>-</cell></row><row><cell>Human1</cell><cell>55</cell><cell>33</cell><cell>67 %</cell><cell>0 %</cell><cell>93 %</cell><cell>56 %</cell></row><row><cell>Human2</cell><cell>58</cell><cell>42</cell><cell>83 %</cell><cell>0 %</cell><cell>98 %</cell><cell>35 %</cell></row><row><cell>GPT-4</cell><cell>48</cell><cell>32</cell><cell>84 %</cell><cell>0 %</cell><cell>97 %</cell><cell>48 %</cell></row><row><cell>GPT-3.5-Turbo</cell><cell>45</cell><cell>27</cell><cell>78 %</cell><cell>0 %</cell><cell>98 %</cell><cell>45 %</cell></row><row><cell>LLaMA 3</cell><cell>50</cell><cell>37</cell><cell>85 %</cell><cell>28 %</cell><cell>96 %</cell><cell>54 %</cell></row><row><cell>Phi 3</cell><cell>52</cell><cell>38</cell><cell>82 %</cell><cell>28 %</cell><cell>96 %</cell><cell>56 %</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_0">https://huggingface.co/DeepMount00/Llama-3-8b-Ita (last seen 07- </note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_1">https://huggingface.co/e-palmisano/Phi3-ITA-mini-4K-instruct (last seen 07-21-2024)</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_2">https://openai.com/api/ (last seen 07-21-2024)</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_3">https://aws.amazon.com/it/ec2/instance-types/g6/ (last seen 07-21-2024)</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_4">The s-ItaIst corpus was segmented into a total of 619 sections of text. Each section, then, was assigned to human annotators and LLMs for simplification.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_5">https://pypi.org/project/italian-ats-evaluator (last seen 07- </note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_6">http://www.italianlp.it/demo/read-it (last seen 04-10-2024)</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_7">https://huggingface.co/intfloat/multilingual-e5-base (last seen 07- </note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="11" xml:id="foot_8">https://huggingface.co/spaces/mteb/leaderboard (last seen 07-21-2024)</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="12" xml:id="foot_9">A more extensive example of data regarding human and LLM simplifications collected in the parallel corpora designed for this study can be found in Appendix D.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This contribution is a result of the research conducted within the framework of the PRIN 2020 (Progetti di Rilevante Interesse Nazionale) project "VerbACxSS: on analytic verbs, complexity, synthetic verbs, and simplification. For accessibility" (Prot. 2020BJKB9M), funded by the Italian Ministry of Universities and Research. Giuliana Fiorentino and Rocco Oliveto are responsible for research question identification, study design, research supervision, and data analysis. For the purposes of academic attribution, Section 2, Section 3.1, Section 3.3, Section 4, and Section 5 are attributed to Vittorio Ganfi; and Section 1, Section 3, Section 3.2, Section 3.4, and Section 3.5 to Marco Russodivito.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Attention is all you need</title>
		<author>
			<persName><forename type="first">A</forename><surname>Vaswani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Shazeer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Parmar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Uszkoreit</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">N</forename><surname>Gomez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Kaiser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Polosukhin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems (NIPS)</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="volume">30</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Transformers: State-of-the-art natural language processing</title>
		<author>
			<persName><forename type="first">T</forename><surname>Wolf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Debut</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Sanh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chaumond</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Delangue</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Moi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Cistac</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Rault</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Louf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Funtowicz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Davison</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Shleifer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Von Platen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Jernite</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Plu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">Le</forename><surname>Scao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gugger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Drame</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Lhoest</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rush</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Conference on Empirical Methods in Natural Language Processing: System Demonstrations (EMNLP)</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="38" to="45" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Revisiting non-English text simplification: A unified multilingual benchmark</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Ryan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Naous</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Xu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Association for Computational Linguistics (ACL)</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Design and Annotation of the First Italian Corpus for Text Simplification</title>
		<author>
			<persName><forename type="first">D</forename><surname>Brunato</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Dell'orletta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Venturi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Montemagni</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Linguistic Annotation Workshop (LAW)</title>
				<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="31" to="41" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Neural readability pairwise ranking for sentences in Italian administrative language</title>
		<author>
			<persName><forename type="first">M</forename><surname>Miliani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Auriemma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Alva-Manchego</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lenci</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Asia-Pacific Chapter of the Association for Computational Linguistics (AACL) and International Joint Conference on Natural Language Processing (IJCNLP)</title>
				<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="849" to="866" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Understanding Italian Administrative Texts: A Reader-Oriented Study for Readability Assessment and Text Simplification</title>
		<author>
			<persName><forename type="first">M</forename><surname>Miliani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">S</forename><surname>Senaldi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Lebani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lenci</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Workshop on AI for Public Administration (AIxPA)</title>
				<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="71" to="87" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">La lingua del diritto e dell&apos;amministrazione</title>
		<author>
			<persName><forename type="first">S</forename><surname>Lubello</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2017">2017</date>
			<pubPlace>Il mulino; Bologna</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Cortelazzo</surname></persName>
		</author>
		<title level="m">Il linguaggio amministrativo. Principi e pratiche di modernizzazione</title>
				<meeting><address><addrLine>Roma</addrLine></address></meeting>
		<imprint>
			<publisher>Carocci</publisher>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Parametri per semplificare l&apos;italiano istituzionale: Revisione della letteratura</title>
		<author>
			<persName><forename type="first">G</forename><surname>Fiorentino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Ganfi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Italiano LinguaDue</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="page" from="220" to="237" />
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Il dovere costituzionale di farsi capire. A trent&apos;anni dal Codice di stile</title>
		<author>
			<persName><forename type="first">E</forename><surname>Piemontese</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2023">2023</date>
			<publisher>Carocci</publisher>
			<pubPlace>Roma</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Da dembsher al codice di stile e oltre: un bilancio sul linguaggio burocratico</title>
		<author>
			<persName><forename type="first">S</forename><surname>Lubello</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Il dovere costituzionale di farsi capire A trent&apos;anni dal Codice di stile</title>
				<editor>
			<persName><forename type="first">E</forename><surname>Piemontese</surname></persName>
		</editor>
		<meeting><address><addrLine>Roma</addrLine></address></meeting>
		<imprint>
			<publisher>Carocci</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="54" to="70" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">The Simplification of the Language of Public Administration: The Case of Ombudsman Institutions</title>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">Gonzalez</forename><surname>Delgado</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">Navarro</forename><surname>Colorado</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Workshop on DeTermIt! Evaluating Text Difficulty in a Multilingual Context</title>
				<meeting>the Workshop on DeTermIt! Evaluating Text Difficulty in a Multilingual Context</meeting>
		<imprint>
			<date type="published" when="2024">2024</date>
			<biblScope unit="page" from="125" to="133" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><surname>Doshi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Amin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Khosla</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bajaj</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Chheang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">P</forename><surname>Forman</surname></persName>
		</author>
		<idno type="DOI">10.1101/2023.06.04.23290786</idno>
		<title level="m">Utilizing large Language Models to Simplify Radiology Reports: a comparative analysis of ChatGPT3</title>
				<meeting><address><addrLine>medRxiv</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">5</biblScope>
		</imprint>
	</monogr>
	<note>ChatGPT4.0, Google Bard, and Microsoft Bing</note>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">Xai for all: Can large language models simplify explainable ai?</title>
		<author>
			<persName><forename type="first">P</forename><surname>Mavrepis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Makridis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Fatouros</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Koukos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">M</forename><surname>Separdani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Kyriazis</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2401.13110</idno>
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Improving Text Simplification with Factuality Error Detection</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Seneviratne</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Daskalaki</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Workshop on Text Simplification, Accessibility, and Readability</title>
				<imprint>
			<publisher>TSAR</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="173" to="178" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Data-Driven Sentence Simplification: Survey and Benchmark</title>
		<author>
			<persName><forename type="first">F</forename><surname>Alva-Manchego</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Scarton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Specia</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computational Linguistics</title>
		<imprint>
			<biblScope unit="volume">46</biblScope>
			<biblScope unit="page" from="135" to="187" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Simplifying Administrative Texts for Italian L2 Readers with Controllable Transformers Models: A Data-driven Approach</title>
		<author>
			<persName><forename type="first">M</forename><surname>Miliani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Alva-Manchego</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lenci</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CLiC-it</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Is it really that simple? prompting language models for automatic text simplification in italian</title>
		<author>
			<persName><forename type="first">D</forename><surname>Nozza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Attanasio</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CEUR Workshop Proceedings</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<title level="m" type="main">L&apos;italiano istituzionale per la comunicazione pubblica</title>
		<author>
			<persName><forename type="first">D</forename><surname>Vellutino</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2018">2018</date>
			<publisher>Il mulino</publisher>
			<pubPlace>Bologna</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Corpus «ItaIst»: Note per lo sviluppo di una risorsa linguistica per lo studio dell&apos;italiano istituzionale per il diritto di accesso civico</title>
		<author>
			<persName><forename type="first">D</forename><surname>Vellutino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Cirillo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Italiano LinguaDue</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="page" from="238" to="250" />
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Language models are few-shot learners</title>
		<author>
			<persName><forename type="first">T</forename><surname>Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Mann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Ryder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Subbiah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">D</forename><surname>Kaplan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Dhariwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Neelakantan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Shyam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Sastry</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Askell</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems (NIPS)</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="page" from="1877" to="1901" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Achiam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Adler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Agarwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Ahmad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Akkaya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">L</forename><surname>Aleman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Almeida</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Altenschmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Altman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Anadkat</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2303.08774</idno>
		<title level="m">GPT-4 technical report</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<ptr target="https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md" />
		<title level="m">Llama 3 model card</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note>AI@Meta</note>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Criteri e proposte di semplificazione</title>
		<author>
			<persName><forename type="first">E</forename><surname>Piemontese</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Codice di stile delle comunicazioni scritte a uso delle pubbliche amministrazioni</title>
				<meeting><address><addrLine>Roma</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1994">1994</date>
		</imprint>
		<respStmt>
			<orgName>Istituto Poligrafico e Zecca dello Stato</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Strumenti per semplificare il linguaggio delle amministrazioni pubbliche</title>
		<author>
			<persName><forename type="first">A</forename><surname>Fioritto</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Manuale di stile</title>
				<meeting><address><addrLine>Il mulino; Bologna</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Probability tables for individual comparisons by ranking methods</title>
		<author>
			<persName><forename type="first">F</forename><surname>Wilcoxon</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Biometrics</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="119" to="122" />
			<date type="published" when="1947">1947</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Dominance statistics: Ordinal analyses to answer ordinal questions</title>
		<author>
			<persName><forename type="first">N</forename><surname>Cliff</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Psychological bulletin</title>
		<imprint>
			<biblScope unit="volume">114</biblScope>
			<biblScope unit="page" from="494" to="509" />
			<date type="published" when="1993">1993</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<monogr>
		<author>
			<persName><forename type="first">N</forename><surname>Cliff</surname></persName>
		</author>
		<title level="m">Ordinal methods for behavioral data analysis</title>
				<meeting><address><addrLine>New York</addrLine></address></meeting>
		<imprint>
			<publisher>Psychology Press</publisher>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Semantic structural evaluation for text simplification</title>
		<author>
			<persName><forename type="first">E</forename><surname>Sulem</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Abend</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rappoport</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/N18-1063</idno>
		<ptr target="https://aclanthology.org/N18-1063" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
		<title level="s">Long Papers</title>
		<editor>
			<persName><forename type="first">M</forename><surname>Walker</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Ji</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Stent</surname></persName>
		</editor>
		<meeting>the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies<address><addrLine>New Orleans, Louisiana</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="685" to="696" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">Optimizing Statistical Machine Translation for Text Simplification</title>
		<author>
			<persName><forename type="first">W</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Napoles</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Pavlick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Callison-Burch</surname></persName>
		</author>
		<idno type="DOI">10.1162/tacl_a_00107</idno>
		<ptr target="https://doi.org/10.1162/tacl_a_00107" />
	</analytic>
	<monogr>
		<title level="j">Transactions of the Association for Computational Linguistics</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="page" from="401" to="415" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">Bleu: a method for automatic evaluation of machine translation</title>
		<author>
			<persName><forename type="first">K</forename><surname>Papineni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Roukos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ward</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W.-J</forename><surname>Zhu</surname></persName>
		</author>
		<idno type="DOI">10.3115/1073083.1073135</idno>
		<ptr target="https://doi.org/10.3115/1073083.1073135" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, ACL &apos;02, Association for Computational Linguistics</title>
				<meeting>the 40th Annual Meeting on Association for Computational Linguistics, ACL &apos;02, Association for Computational Linguistics<address><addrLine>USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2002">2002</date>
			<biblScope unit="page" from="311" to="318" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">The (Un)Suitability of Automatic Evaluation Metrics for Text Simplification</title>
		<author>
			<persName><forename type="first">F</forename><surname>Alva-Manchego</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Scarton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Specia</surname></persName>
		</author>
		<idno type="DOI">10.1162/coli_a_00418</idno>
		<ptr target="https://doi.org/10.1162/coli_a_00418" />
	</analytic>
	<monogr>
		<title level="j">Computational Linguistics</title>
		<imprint>
			<biblScope unit="volume">47</biblScope>
			<biblScope unit="page" from="861" to="889" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">Meteor: An automatic metric for mt evaluation with improved correlation with human judgments</title>
		<author>
			<persName><forename type="first">S</forename><surname>Banerjee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lavie</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization</title>
				<imprint>
			<date type="published" when="2005">2005</date>
			<biblScope unit="page" from="65" to="72" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">Gulpease: una formula per la predizione della leggibilità di testi in lingua italiana</title>
		<author>
			<persName><forename type="first">P</forename><surname>Lucisano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">E</forename><surname>Piemontese</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Scuola e città</title>
		<imprint>
			<biblScope unit="page" from="110" to="124" />
			<date type="published" when="1988">1988</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<analytic>
		<title level="a" type="main">Adaptation of Flesch readability index on a bilingual text written by the same author both in Italian and English languages</title>
		<author>
			<persName><forename type="first">V</forename><surname>Franchina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Vacca</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Linguaggi</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="47" to="49" />
			<date type="published" when="1986">1986</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<analytic>
		<title level="a" type="main">Read-it: Assessing readability of italian texts with a view to text simplification</title>
		<author>
			<persName><forename type="first">F</forename><surname>Dell&apos;Orletta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Montemagni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Venturi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the second workshop on speech and language processing for assistive technologies</title>
				<meeting>the second workshop on speech and language processing for assistive technologies</meeting>
		<imprint>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="73" to="83" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b36">
	<monogr>
		<author>
			<persName><forename type="first">T</forename><surname>De Mauro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Chiari</surname></persName>
		</author>
		<ptr target="https://www.internazionale.it/opinione/tullio-de-mauro/2016/12/23/il-nuovo-vocabolario-di-base-della-lingua-italiana" />
		<title level="m">Il nuovo vocabolario di base della lingua italiana</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b37">
	<analytic>
		<title level="a" type="main">Linguistically-Based Comparison of Different Approaches to Building Corpora for Text Simplification: A Case Study on Italian</title>
		<author>
			<persName><forename type="first">D</forename><surname>Brunato</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Dell&apos;Orletta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Venturi</surname></persName>
		</author>
		<idno type="DOI">10.3389/fpsyg.2022.707630</idno>
	</analytic>
	<monogr>
		<title level="j">Frontiers in Psychology</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b38">
	<analytic>
		<title level="a" type="main">Evolution of semantic similarity-A survey</title>
		<author>
			<persName><forename type="first">D</forename><surname>Chandrasekaran</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Mago</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Computing Surveys (CSUR)</title>
		<imprint>
			<biblScope unit="volume">54</biblScope>
			<biblScope unit="page" from="1" to="37" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b39">
	<analytic>
		<title level="a" type="main">Bertscore: Evaluating text generation with bert</title>
		<author>
			<persName><forename type="first">T</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Kishore</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">Q</forename><surname>Weinberger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Artzi</surname></persName>
		</author>
		<ptr target="https://openreview.net/forum?id=SkeHuCVFDr" />
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b40">
	<analytic>
		<title level="a" type="main">Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks</title>
		<author>
			<persName><forename type="first">N</forename><surname>Reimers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Gurevych</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b41">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Barayan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Camacho-Collados</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Alva-Manchego</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2409.20246</idno>
		<title level="m">Analysing zero-shot readability-controlled sentence simplification</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b42">
	<monogr>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">P</forename><surname>Miller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">F</forename><surname>Vandome</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>McBrewster</surname></persName>
		</author>
		<title level="m">Levenshtein distance: Information theory, computer science, string (computer science), string metric, Damerau-Levenshtein distance, spell checker, Hamming distance</title>
				<meeting><address><addrLine>Orlando</addrLine></address></meeting>
		<imprint>
			<publisher>Alpha Press</publisher>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b43">
	<analytic>
		<title level="a" type="main">Measuring massive multitask language understanding</title>
		<author>
			<persName><forename type="first">D</forename><surname>Hendrycks</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Burns</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Basart</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Zou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mazeika</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Steinhardt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations</title>
				<imprint>
			<publisher>ICLR</publisher>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b44">
	<analytic>
		<title level="a" type="main">Hellaswag: Can a machine really finish your sentence?</title>
		<author>
			<persName><forename type="first">R</forename><surname>Zellers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Holtzman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bisk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Farhadi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Choi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics</title>
				<meeting>the 57th Annual Meeting of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="4791" to="4800" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b45">
	<monogr>
		<author>
			<persName><forename type="first">P</forename><surname>Clark</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Cowhey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Etzioni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Khot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sabharwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Schoenick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Tafjord</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1803.05457</idno>
		<title level="m">Think you have solved question answering? try arc, the ai2 reasoning challenge</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b46">
	<analytic>
		<title level="a" type="main">Drop: A reading comprehension benchmark requiring discrete reasoning over paragraphs</title>
		<author>
			<persName><forename type="first">D</forename><surname>Dua</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Dasigi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Stanovsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Singh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gardner</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
		<title level="s">Long and Short Papers</title>
		<editor>
			<persName><forename type="first">J</forename><surname>Burstein</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Doran</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Solorio</surname></persName>
		</editor>
		<meeting>the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="2368" to="2378" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b47">
	<analytic>
		<title level="a" type="main">MTEB: Massive text embedding benchmark</title>
		<author>
			<persName><forename type="first">N</forename><surname>Muennighoff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Tazi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Magne</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Reimers</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">European Chapter of the Association for Computational Linguistics (EACL)</title>
				<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="2014" to="2037" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b48">
	<analytic>
		<title level="a" type="main">Validazione e confronto tra semplificazione automatica e semplificazione manuale di testi in italiano istituzionale ai fini dell&apos;efficacia comunicativa</title>
		<author>
			<persName><forename type="first">G</forename><surname>Fiorentino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Russodivito</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Ganfi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Oliveto</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">&quot;Automated texts in the ROMance languages and beyond&quot; (AI-ROM-II), 2nd International Conference</title>
				<imprint/>
	</monogr>
	<note>To appear</note>
</biblStruct>

<biblStruct xml:id="b49">
	<analytic>
		<title level="a" type="main">Recent advances of few-shot learning methods and applications</title>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Leng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Science China Technological Sciences</title>
		<imprint>
			<biblScope unit="volume">66</biblScope>
			<biblScope unit="page" from="920" to="944" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
