
Harnessing LLMs for Educational Content-Driven Italian Crossword Generation

Kamyar Zeinalipour

Achille Fusco

Asya Zanollo

Marco Maggini

Marco Gori

IUSS Pavia, Piazza della Vittoria 15, 27100 Pavia, PV, Italy
University of Siena, DIISM, Via Roma 56, 53100 Siena, Italy

2024


In this work, we unveil a novel tool for generating Italian crossword puzzles from text, utilizing advanced language models such as GPT-4o, Mistral-7B-Instruct-v0.3, and Llama3-8b-Instruct. Crafted specifically for educational applications, this cutting-edge generator makes use of the comprehensive Italian-Clue-Instruct dataset, which comprises over 30,000 entries including diverse text, solutions, and types of clues. This carefully assembled dataset is designed to facilitate the creation of contextually relevant clues in various styles associated with specific texts and keywords. The study delves into four distinctive styles of crossword clues: those without format constraints, those formed as definite determiner phrases, copular sentences, and bare noun phrases. Each style introduces unique linguistic structures to diversify clue presentation. Given the lack of sophisticated educational tools tailored to the Italian language, this project seeks to enhance learning experiences and cognitive development through an engaging, interactive platform. By meshing state-of-the-art AI with contemporary educational strategies, our tool can dynamically generate crossword puzzles from Italian educational materials, thereby providing an enjoyable and interactive learning environment. This technological advancement not only redefines educational paradigms but also sets a new benchmark for interactive and cognitive language learning solutions.

Keywords: Large Language Models, Italian Educational Puzzles, Interactive Learning, Italian Educational Crosswords

1. Introduction

While traditionally valued for their challenge and entertainment, crossword puzzles are increasingly recognized for their educational benefits. They provide an interactive learning environment that enhances the retention of both technical terms and general language skills, hence facilitating learning across various disciplines, improving language acquisition, and supporting cognitive development through critical thinking and memory retention [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11].

The integration of Natural Language Processing (NLP) and Large Language Models (LLMs) has further enhanced their effectiveness by providing sophisticated, contextually relevant clues for educational crosswords.

This paper presents a novel tool that uses LLMs to generate tailored Italian educational crossword puzzles from texts, offering various clue types. By integrating user-provided texts or keywords and applying fine-tuning techniques, the tool produces high-quality clues and answers, offering educators a resource to develop more interactive and effective instructional methods. Furthermore, a new dataset called Italian-Clue-Instruct has been compiled and will be released to the scientific community.

The paper is organized as follows: Section 2 surveys the relevant literature. Section 3 explains the methods used for dataset collection and curation and describes the computational techniques employed in our study. Section 4 reports the results of our experimental analysis. Finally, Section 5 closes with concluding insights and the broader implications of our research findings.

2. Related Works

[...] clues [15].

On a different front, Arora et al. developed SEEKH, a system that integrates statistical and linguistic analyses to generate crossword puzzles in multiple Indian languages. Their approach emphasizes the identification of keywords to structure the puzzles [16].

Recent progress in crossword puzzle generation has been notably advanced by the work of Zeinalipour et al. [17, 18, 19, 20], who demonstrated the use of large-scale language models to develop puzzles in languages with limited support, such as English, Italian and Arabic. Their research highlights the vast potential of computational linguistics in crafting puzzles that are both engaging and linguistically rich. Initially, they employed few-shot and zero-shot learning techniques to generate new crossword clues from text [18, 17]. Furthermore, Zugarini et al. [21] introduced a method for generating educational crossword clues from the provided text in English.

In their Italian crossword puzzle generation study [18], Zeinalipour et al. initially used few-shot learning with large language models as-is. However, our current project goes a step further by introducing a specially designed dataset for this task in Italian. Additionally, we have developed open-source models that have been fine-tuned to significantly enhance performance for this specific application.

The current research initiates a novel approach by utilizing state-of-the-art language modeling to develop Italian crossword puzzles from given texts. By doing so, it enriches the toolkit for language education, thereby pushing forward the development of Italian crossword puzzles.

3. Methodology

We have developed an automated system that generates educational Italian crossword puzzles using LLMs, with the Italian-Clue-Instruct dataset at its core. Our approach leverages the adaptability of LLMs, like GPT-4o, to create puzzles from text, with human validation for accuracy. Additionally, we fine-tuned models such as Llama3-8b-Instruct and Mistral-7B-Instruct-v0.3 to improve clue accuracy and relevance. A more detailed description of our methodology, illustrated in Figure 1, is provided in the following.

Italian-Clue-Instruct

Data Collection Methodology. Initiating the data collection process, we began by extracting the introductory portions of Italian Wikipedia articles. We used the Wikipedia API and Beautiful Soup to automatically extract the pages. The main focus was placed on the bolded keywords that highlight the primary topic and other significant terms within each article. Beyond keyword identification, we also gathered a variety of essential metadata. This included metrics such as view counts, relevance assessments, brief narrative summaries, central headlines, related terms, categorization, and URLs.² The uniform structure of the Italian Wikipedia significantly aids this process. By tapping into the introductory sections, which are particularly information-rich, we could systematically extract and outline the key concepts needed. This approach ensures a comprehensive data repository, capturing critical elements and insights from a diverse array of articles.

Data Enhancement. To ensure the reliability and effectiveness of our data, we performed some filtering based on different criteria. The first filter was designed to prioritize the most important pages and those with the highest number of views: articles were selected based on their popularity and relevance. To ensure a balanced and manageable dataset, we also discarded articles that were either too lengthy or too brief, specifically those with fewer than 50 words. Additionally, we removed keyword associations longer than two words to maintain the clarity and relevance of the crossword clues. Finally, we imposed restrictions on keywords to ensure they were between 3 and 20 characters in length and free of special characters or numerals. Multi-word expressions were also included as good keywords, as they are quite common in crossword puzzles.

Formulation of Various Prompts. Crafting specialized prompts was pivotal for producing Italian crossword clues from a given text using GPT-4o. The prompts were created to generate clues that were both informative and engaging, by incorporating crucial details and background context from the articles. Additionally, we aimed to elicit three specific types of clues varying in their syntactic structures:

• definite determiner phrases: nominal clues headed by a definite article and usually modified by adjectives, prepositional phrases (PPs) or relative clauses (RCs), like <La repubblica asiatica con capitale Tashkent, Uzbekistan> ('The Asian republic with Tashkent as capital', 'Uzbekistan'). Such clues are examples of definite descriptions, which have been traditionally analyzed as carrying a uniqueness presupposition [22] when singular and a maximality presupposition [23] when plural. In the context of crosswords, clues of this kind refer to their solution as the single entity or the maximal plural entity satisfying the given description.
• bare noun phrases [24]: the clue consists of a simple noun phrase (NP) with no determiner, typically modified by adjectives, PPs or RCs, for example <Grande centro commerciale di lusso con sede a Londra, Harrods> ('Luxury shopping mall based in London', 'Harrods'). In Italian, NPs are taken to denote a predicate that can be true of one or more individuals [22, 25].³ Given the absence of the definite determiner, bare NP clues do not specify whether the referent of the solution uniquely satisfies the description [22], thus more than one solution could in principle be possible.
• copular sentences [26]: copular clues are clausal definitions structured as <copula predicate> with an elliptical subject, as in <è una salsa piccante tipica della Tunisia, Harissa> ('(It) is a spicy sauce typical of Tunisia', 'Harissa'). Copulas, like Italian essere ('to be'), connect a subject with a non-verbal predicate, such as an adjectival phrase (AP), a PP or another nominal phrase (NP/DP). In crossword puzzles, the solution targets the precopular position of such sentences, i.e. the elliptical subject.⁴

To accomplish this, we created three distinct prompts, one for each clue structure, and one prompt that does not specify the structure. This step allows us to test the syntactic sensitivity of the models employed and, more importantly, it gives us the possibility of manipulating the structure to create variation not just with respect to the subject matter but also in the syntactic complexity of the clues. Moreover, generating clues with specific structures represents an interesting resource for the educational characterization of puzzles. Indeed, it is well known from psycholinguistic research that different structures can elicit different reactions in processing, which can be correlated with factors like age, linguistic disorders etc., and this can be exploited when creating puzzles specific to any solver's needs.

As for prompt engineering, the structure has been made explicit in one dedicated step of the prompt chain. As regards the copular structure, which is widespread and used with different formulations, we include an example in the prompt (as shown in Figure 9) to ensure that the required structure is given in output. It has been observed during the prompt trials that the validity of precise structures for clues strongly depends on the type of text given in input. The prompts used for clue generation in this study are presented in Figures 6, 7, 8 and 9, located in the Appendix.

Generation of Educational Italian Clues. Guided by the self-instruct framework [27], we devised a method to automate the generation of educational crossword clues in Italian, harnessing the power of LLMs. Central to our approach is GPT-4o,⁵ an advanced LLM known for its efficiency. A key differentiator of our strategy is the integration of contextual information with the clues produced. To achieve this, we carefully curated the content and keywords from the Wikipedia text extracted in previous sections. We used four distinct types of prompts, each designed to generate a different category of clues: clues without format constraints, bare noun phrases, definite determiner phrases, and copular sentences. These prompts were crafted to create diverse types of clues, ensuring alignment with our specific objectives for educational content in Italian.

Overview of the Italian-Clue-Instruct Dataset. Our research began with downloading 88,403 articles from the Italian Wikipedia, which we filtered down to 11,413 relevant entries. From this refined set, we selected 5,000 articles for clue generation, spanning 29 thematic categories. To enhance our dataset, we leveraged the capabilities of GPT-4o, generating a minimum of three diverse clues per Wikipedia article, depending on the text length. This effort resulted in a compilation of 15,000 unique clues.

The dataset's in-depth analysis demonstrates a variability in context length, ranging from 10 to 1512 tokens, with most texts falling between 100 and 600 tokens. Figure 2 showcases the token distribution for contexts and clues, which have been processed using the Llama3 tokenizer. Typically, the clue-generation process results in clues ranging from 4 to 55 tokens in length.

Figure 3 illustrates the spread of data across different categories. The dataset is notably dominated by the categories of "Entertainment", "Geography", and "History". In contrast, categories such as "Mathematics", "Architecture", and "Languages" are underrepresented.

² Wikipedia: Lists of popular pages by WikiProject

³ Bare NPs are known to denote also natural kinds [22]. However, given that NP clues occur in isolation, it is rather difficult to distinguish between the two senses; therefore we assume the more general reading of NPs as predicates. We leave this discussion to future analyses.

⁴ Copular sentences are known to be differentiated between canonical and inverse structures [26]. Canonical structures are usually found more frequently in crossword clues, but inverse copular clues are not excluded. We leave the question open for further, purely linguistic research.
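The filtering rules from the Data Enhancement step can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the upper bound on article length (the text only states the 50-word lower bound), and the choice to count spaces toward the 3 to 20 character limit are all assumptions.

```python
import re

# Thresholds quoted in the text: articles of at least 50 words,
# keywords of 3-20 characters and at most two words.
MIN_ARTICLE_WORDS = 50
MAX_ARTICLE_WORDS = 1500  # assumption: the text gives no upper bound
MIN_KEYWORD_CHARS = 3
MAX_KEYWORD_CHARS = 20
MAX_KEYWORD_WORDS = 2

# Unicode letters only (accented letters included), words joined by single spaces.
_LETTERS_ONLY = re.compile(r"[^\W\d_]+( [^\W\d_]+)*")


def is_valid_keyword(keyword: str) -> bool:
    """Hypothetical keyword filter: length, word count, no digits or symbols."""
    keyword = keyword.strip()
    if not MIN_KEYWORD_CHARS <= len(keyword) <= MAX_KEYWORD_CHARS:
        return False
    if len(keyword.split()) > MAX_KEYWORD_WORDS:
        return False
    return _LETTERS_ONLY.fullmatch(keyword) is not None


def is_valid_article(text: str) -> bool:
    """Hypothetical article filter based on whitespace-separated word count."""
    n_words = len(text.split())
    return MIN_ARTICLE_WORDS <= n_words <= MAX_ARTICLE_WORDS
```

Under these assumptions, "Harrods" and "San Marino" pass the keyword filter, while "Po" (too short) and "Roma 2000" (contains numerals) are rejected.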

Evaluating Quality of the Italian-Clue-Instruct Dataset. Producing accurate and engaging Italian educational crossword clues is inhibited by the absence of a reference corpus, making it difficult to draw comparisons using standard measures, such as ROUGE scores.

Our evaluation strategy adapts uniquely to the task requirements. Specifically, effective clues should represent contextually accurate paraphrases of text information. To accommodate this, we adopted an extractive method, using the ROUGE-L score to gauge the adequacy of clues in reflecting the input context that we extracted from Wikipedia. By comparing input sentences to the generated clues, the evaluation aimed to attain high scores to ensure strict adherence to the original text, minimizing irrelevant content and avoiding clues that merely replicate the input or improperly introduce the target keyword. Results indicated a substantial connection between the context and the clues, with an average ROUGE-1, ROUGE-2, and ROUGE-L score of 0.159, 0.114, and 0.146 respectively.

Considering that the ROUGE score merely compares the similarity between the n-grams of the generated clues and the reference text from Wikipedia, it is not a reliable metric and does not provide any assessment of the semantic quality of the generated clues. However, it provides a general picture of the generated clues.

In addition, the integrity of the generated clues was further examined through human evaluations. A randomly chosen subset of clues was assessed, generated from a sample of 100 articles, with a maximum of three clues per article. To avoid repetitions, duplicate clues were removed. The evaluation employed a five-level criteria system, analogous to the methodology utilized by [27]. For the present evaluation, the following parameters were used:

• RATING-A: The clue is coherent and valid, align[...]

⁵ https://openai.com/index/hello-gpt-4o/

[Figure 1: methodology overview; panels: Data Collection (a), (b), (c); Italian Clue Creation (using GPT-4 Turbo) (d), (e)]

[Figure: Content Categories]
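As a rough illustration of the ROUGE-L comparison between an input context and a generated clue described in this subsection, the score can be computed from the longest common subsequence (LCS) of the two token sequences. The sketch below is an assumption-laden simplification: it uses whitespace tokenization and lowercasing and reports the F1 variant, whereas the source does not state which ROUGE implementation, tokenization, or variant produced the reported values.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            # Extend the diagonal on a match, otherwise carry the best neighbor.
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference: str, candidate: str) -> float:
    """ROUGE-L F1 between a reference text and a candidate clue."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Because a clue is much shorter than its source context, recall is necessarily small, which is one reason low absolute scores are expected in this setting.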

The evaluation was made by a native Italian speaker, master student of linguistics, and PhD student in linguistics, who followed the criteria described above. Please refer to Table 2 for examples of clues and their respective ratings.

The distribution of the evaluation outcomes is depicted in Figure 4, which shows that the majority of the generated clues were of high quality, rated 'A', and only a small fraction were rated 'C', 'D', or 'E'. By utilizing both quantitative metrics and qualitative assessments, the study aimed to validate the educational utility and contextual accuracy of the clues created for Italian educational crosswords.

Enhancing LLMs for Italian Text-based Educational Crossword Puzzle Generation. To develop crossword puzzle clues from Italian texts using advanced LLM functionalities, we employed three models: GPT-4o (for data generation), Mistral-7B-Instruct-v0.3, and Llama3-8b-Instruct, known for their strong text generation and Italian language support [28, 29].

We began the process by fine-tuning the models with the Italian-Clue-Instruct dataset, which was rich in relevant material. This calibration was vital to enhance the models' proficiency in generating Italian clues while accurately reflecting the Italian language's intricate grammar and vocabulary within educational contexts.

To further refine the models, we optimized the parameters during the fine-tuning phase. This effort aimed to reduce errors specific to our task and better align the output of the models with Italian educational materials.
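To give a sense of why low-rank adaptation keeps fine-tuning affordable, the sketch below counts the trainable parameters LoRA adds to a single weight matrix. It assumes that the values 16 and 32 reported in the Training Setup denote the LoRA rank and scaling factor, respectively; the hidden size of 4096 and the function names are hypothetical and chosen only for illustration.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def lora_scaling(alpha: float, rank: int) -> float:
    """Factor applied to the low-rank update B @ A before adding it to the frozen weight."""
    return alpha / rank


HIDDEN = 4096  # hypothetical hidden size, not taken from the source
full_params = HIDDEN * HIDDEN  # parameters in one frozen square weight matrix
lora_params = lora_trainable_params(HIDDEN, HIDDEN, rank=16)
trainable_fraction = lora_params / full_params
update_scale = lora_scaling(alpha=32, rank=16)
```

Under these assumptions, each adapted matrix trains well under 1% of the parameters it would need with full fine-tuning, and the low-rank update is scaled by a factor of 2.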

Ultimately, the specialized tuning of these LLMs with a dedicated dataset was intended to foster their ability to generate high-quality crossword clues from Italian texts. The goal was to ensure that the resulting clues were not only linguistically sound but also relevant within an educational framework.

4. Experimental Results

This section offers a detailed overview of the experiments conducted in the study. It begins with the training setup for the Italian-Clue-Instruct LLMs, including key parameters and computational resources. The performance of the models is then evaluated using automated metrics, such as the ROUGE score, to compare configurations and identify areas for improvement. This is followed by an in-depth analysis of human evaluations, focusing on relevance, coherence, and content quality to provide insights beyond automated metrics. Additionally, an example of a generated crossword puzzle is presented to demonstrate practical usability. The goal is to highlight the robustness and versatility of the proposed approach.

Training Setup. The models Mistral-7B-Instruct-v0.3 and Llama3-8b-Instruct, which feature 7 and 8 billion parameters, respectively, were fine-tuned using LoRA [30], with parameters set to r = 16 and α = 32, across three training epochs, maintaining a total batch size of 64. The full experimental setup was performed on a server equipped with four NVIDIA A6000 GPUs, utilizing DeepSpeed [31] and FlashAttention 2 [32]. The initial learning rate was configured at 3 × 10⁻⁴. During inference, model distribution sampling was applied to generate clues for both Mistral-7B-Instruct-v0.3 and Llama3-8b-Instruct, with a temperature parameter set to 0.1. Additionally, the parameters for top-p and top-k sampling were set to 0.95 and 50, respectively. Among the three epoch checkpoints, the one with the minimum loss was selected, which, in our case, turned out to be the second checkpoint.

Evaluation Results with the Automatic Metrics. We evaluated the resemblance between various sets of clues produced by different models (details shown in Table 1) and those generated by the GPT-4o model on a test set of 200 educational contexts. This evaluation was done using ROUGE scores. Our results indicate that the fine-tuned Mistral-7B-Instruct-v0.3 and Llama3-8b-Instruct models exhibit a closer similarity to GPT-4o. On the other hand, the base Llama3-8b-Instruct model shows significantly lower similarity with minimal overlap. These outcomes highlight the efficacy of fine-tuning, demonstrating that using the Italian-Clue-Instruct dataset enhances the capability of Mistral-7B-Instruct-v0.3 and Llama3-8b-Instruct models in generating clues from Italian educational texts.

[Table 1: column "Model name" listing Mistral-7B, Llama3-8b, Mistral-7B, Llama3-8b, with a "Fine-tuned LLMs" row group; score values not recovered]

Evaluation Results with the Human Evaluator. Using a dataset of 100 Italian contexts, each containing 3 clues, a human evaluation was conducted on both the fine-tuned and base models. The results of this evaluation are depicted in Figure 5. The evaluation employed the 5-level rating system described in Section 3. The results offer a comparative evaluation of the performance of language models in generating Italian clues from a given text. Specifically, the models Mistral-7B-Instruct-v0.3 and Llama3-8b-Instruct are evaluated in both their base and fine-tuned configurations. Upon fine-tuning, Mistral-7B-Instruct-v0.3 displays a significant improvement, emerging as the top performer in category "A", and surpassing Llama3-8b-Instruct in terms of performance enhancement. These findings underscore the impact of fine-tuning on enhancing model capabilities, particularly highlighted by the performances of Mistral-7B-Instruct-v0.3 and Llama3-8b-Instruct. Furthermore, fine-tuning with the introduced dataset significantly increased the models' ability to generate Italian clues from the given text, illustrating the quality and effectiveness of the Italian-Clue-Instruct dataset.

[Figure 5: Counts of Ratings by Model; y-axis: Counts; legend (model): mistral_base, llama3_base, mistral_finetuned, llama3_finetuned]

The methodology for generating Italian crossword clues from educational texts was explored, enabling customized clues. This would allow educators to select suitable clues matching their teaching needs. The selected clues could in turn be used to automatically generate a crossword schema, as discussed by Zeinalipour et al. [17]. Figure 10 in the Appendix shows an example puzzle, demonstrating the system's application.

5. Conclusion

A novel system for generating crossword clues from Italian text is introduced, leveraging the newly developed Italian-Clue-Instruct dataset. This dataset, which includes text, keywords, categories, and related crossword clues in Italian, is pioneering in this field. By fine-tuning two large language models (LLMs), Mistral-7B-Instruct-v0.3 and Llama3-8b-Instruct, using this dataset, we have achieved significant improvements in the models' ability to generate crossword clues from given text. The results highlight a substantial enhancement in model performance after fine-tuning. Both the Italian-Clue-Instruct dataset and the fine-tuned models are now publicly available, providing valuable tools for students and teachers to create educational crossword puzzles from Italian text. Future research will aim to develop models capable of generating various types of crossword clues, including fill-in-the-blank clues.

Acknowledgments

The funding for this paper was provided by the TAILOR project and the HumanE-AI-Net projects, both supported by the EU Horizon 2020 research and innovation program under GA No 952215 and No 952026, respectively.

References

[...] thinking and problem solving skills in engineering education, J Engin Educ Trans 30 (2017) 103–13.
[12] B. Ranaivo-Malançon, T. Lim, J.-L. Minoi, A. J. R. Jupit, Automatic generation of fill-in clues and answers from raw texts for crosswords, in: 2013 8th International Conference on Information Technology in Asia (CITA), IEEE, 2013, pp. 1–5.
[13] L. Rigutini, M. Diligenti, M. Maggini, M. Gori, A fully automatic crossword generator, in: 2008 Seventh International Conference on Machine Learning and Applications, IEEE, 2008, pp. 362–367.
[14] L. Rigutini, M. Diligenti, M. Maggini, M. Gori, Automatic generation of crossword puzzles, International Journal on Artificial Intelligence Tools 21 (2012) 1250014.
[15] J. Esteche, R. Romero, L. Chiruzzo, A. Rosá, Automatic definition extraction and crossword generation from spanish news text, CLEI Electronic Journal 20 (2017).
[16] B. Arora, N. Kumar, Automatic keyword extraction and crossword generation tool for indian languages: Seekh, in: 2019 IEEE Tenth International Conference on Technology for Education (T4E), IEEE, 2019, pp. 272–273.
[17] K. Zeinalipour, T. Iaquinta, G. Angelini, L. Rigutini, M. Maggini, M. Gori, Building bridges of knowledge: Innovating education with automated crossword generation, in: 2023 International Conference on Machine Learning and Applications (ICMLA), IEEE, 2023, pp. 1228–1236.
[18] K. Zeinalipour, A. Zanollo, G. Angelini, L. Rigutini, M. Maggini, M. Gori, et al., Italian crossword generator: Enhancing education through interactive word puzzles, arXiv preprint arXiv:2311.15723 (2023).
[19] K. Zeinalipour, M. Saad, M. Maggini, M. Gori, Arabicros: Ai-powered arabic crossword puzzle generation for educational applications, in: Proceedings of ArabicNLP 2023, 2023, pp. 288–301.
[20] K. Zeinalipour, Y. G. Keptiğ, M. Maggini, L. Rigutini, M. Gori, A turkish educational crossword puzzle generator, in: International Conference on Artificial Intelligence in Education, Springer, 2024, pp. 226–233.
[21] A. Zugarini, K. Zeinalipour, S. S. Kadali, M. Maggini, M. Gori, L. Rigutini, Clue-instruct: Text-based clue generation for educational crossword puzzles, arXiv preprint arXiv:2404.06186 (2024).
[22] G. Chierchia, Reference to kinds across language, Natural language semantics 6 (1998) 339–405.
[23] G. Link, The logical analysis of plurals and mass terms: A lattice theoretical approach, Meaning, Use, and Interpretation of Language/Walter de Gruyter (1983).
[24] G. Longobardi, Reference and proper names: A theory of n-movement in syntax and logical form, Linguistic inquiry (1994) 609–665.
[25] Z. Roberto, Layers in the determiner phrase, Ph.D. thesis, University of Rochester (Published by Garland, 2000), 1995.
[26] A. Moro, Copular sentences, The Blackwell companion to syntax (2006) 1–23.
[27] Y. Wang, Y. Kordi, S. Mishra, A. Liu, N. A. Smith, D. Khashabi, H. Hajishirzi, Self-instruct: Aligning language model with self generated instructions, arXiv preprint arXiv:2212.10560 (2022).
[28] T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al., Language models are few-shot learners, Advances in neural information processing systems 33 (2020) 1877–1901.
[29] H. Touvron, T. Lavril, G. Izacard, X. Martinet, M.-A. Lachaux, T. Lacroix, B. Rozière, N. Goyal, E. Hambro, F. Azhar, et al., Llama: Open and efficient foundation language models, arXiv preprint arXiv:2302.13971 (2023).
[30] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, W. Chen, Lora: Low-rank adaptation of large language models, arXiv preprint arXiv:2106.09685 (2021).
[31] J. Rasley, S. Rajbhandari, O. Ruwase, Y. He, Deepspeed: System optimizations enable training deep learning models with over 100 billion parameters, in: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2020, pp. 3505–3506.
[32] T. Dao, Flashattention-2: Faster attention with better parallelism and work partitioning, arXiv preprint arXiv:2307.08691 (2023).

A. Appendix

You are a crossword expert.

Generate concise and clever clues in Italian for educational crossword puzzles based on a specified Keyword and its relation to an assigned Text. To execute this task properly, replicate the guidelines below:

KEYWORD: {keyword}
TEXT: {text}

Observe the following steps:
1. Substitute every pronoun in the text with full phrases expressing their referents.
2. Split the text into small independent sentences that could be understood out of context.
3. Pinpoint three concise sentences that contain the Keyword and best characterize the keyword. Try to select sentences from different parts of the Text.
4. Generate short and clever crossword clues in Italian from the selected sentences. Make sure that the keyword remains absent from the clues. If the Keyword is not the subject of the sentence, make sure that it is substituted with an appropriate clitic, possessive or demonstrative pronoun. Generate clues from all the parts of the text and use all of the information provided to generate the clues.
5. Ensure that each clue functions as a description or definition of the keyword rather than a query, focusing on details about the keyword.
6. Make sure that each clue's information can be traced back to the text. Make sure that the clues are relevant and that they are sufficient to identify the keyword. Make sure that the keyword does not appear in the clues. Make sure that any part of the keyword is not present in the clues.
7. Select only the three best clues for educational purposes.
8. Compile these clues into a list formatted as follows: [clue1, clue2, clue3] into a JSON file under the key: 'clues'. Make sure the output is in the requested format and do not include the whole process in the output, but only the clues.

Observe the following steps:
1. Substitute every pronoun in the text with full phrases expressing their referents.
2. Split the text into small independent sentences that could be understood out of context.
3. Pinpoint three concise sentences that contain the Keyword and best characterize the keyword. Try to select sentences from different parts of the Text.
4. Generate short and clever crossword clues in Italian from the selected sentences. Make sure that the keyword remains absent from the clues. Each clue must have the syntax of a determiner phrase with the definite article (followed by a noun and possibly adjectives). It can be followed by a relative clause or other complements or adjuncts. Generate clues from all the parts of the text and use all of the information provided to generate the clues.
5. Ensure that each clue functions as a description or definition of the keyword rather than a query, focusing on details about the keyword.
6. Make sure that each clue's information can be traced back to the text. Make sure that the clues are relevant and that they are sufficient to identify the keyword. Make sure that the keyword does not appear in the clues. Make sure that any part of the keyword is not present in the clues.
7. Select only the three best clues for educational purposes.
8. Compile these clues into a list formatted as follows: [clue1, clue2, clue3] into a JSON file under the key: 'clues'. Make sure the output is in the requested format and do not include the whole process in the output, but only the clues.
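A prompt of this shape could be filled in and its JSON output checked along the following lines. This is only a sketch under stated assumptions: the template is abbreviated, the helper names `build_prompt` and `parse_clues` are hypothetical, the model call itself is left out, and the sample output in the usage note is invented for illustration rather than taken from the dataset.

```python
import json

# Abbreviated stand-in for the prompt above; only the two placeholders matter here.
PROMPT_TEMPLATE = (
    "You are a crossword expert.\n"
    "KEYWORD: {keyword}\n"
    "TEXT: {text}\n"
    "Observe the following steps: (steps 1-8 as listed above)\n"
)


def build_prompt(keyword: str, text: str) -> str:
    """Fill the {keyword} and {text} placeholders of the template."""
    return PROMPT_TEMPLATE.format(keyword=keyword, text=text)


def parse_clues(raw_output: str, keyword: str) -> list[str]:
    """Read the 'clues' list from the model's JSON output and drop clues
    that contain the keyword, as the prompt's step 6 requires."""
    data = json.loads(raw_output)
    if not isinstance(data, dict) or not isinstance(data.get("clues"), list):
        raise ValueError("expected a JSON object with a list under 'clues'")
    return [
        clue.strip()
        for clue in data["clues"]
        if isinstance(clue, str) and keyword.lower() not in clue.lower()
    ]
```

For instance, given the invented output `{"clues": ["È una salsa piccante tipica della Tunisia", "La harissa tunisina"]}` and the keyword "Harissa", `parse_clues` keeps only the first clue, since the second one contains the answer.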

[Table 2: examples of clues and their ratings; columns: Answer, Rating, Explanation; row alignment not recovered]

Answers: Quadrophenia; Paramore; Pixies

Ratings: A, B, C, D, E

Explanations:
• Definite determiner is not appropriate: there are other boroughs in Lancashire.
• The clue provides accurate but incomplete information: the band was a duo for a limited period.
• The clue is too generic.
• The clue contains part of the answer.