1. Introduction

Beyond Raw Text: Knowledge-Augmented Italian Relation Extraction with Large Language Models

Gianmaria Balducci

0 1

Elisabetta Fersini

Enza Messina

1 0 P.M.I. Reboot S.r.l. , Viale Lunigiana 40, Milano, 20125 , Italia 1 Università Degli studi di Milano-Bicocca) , Viale Sarca 336, Milano, 20125 , Italia

2025

Relation extraction (RE) is a fundamental NLP task that identifies semantic relationships between entities in text, serving as the foundation for applications such as knowledge graph completion and question answering. In real-world deployments, organizations frequently encounter low-resource scenarios where labeled training data is scarce, making efective RE particularly challenging. Existing approaches often rely on external knowledge sources to augment training data, but such resources can be noisy, incomplete, or misleading for model learning. To address this limitation, we propose an approach that leverages the reasoning capabilities of Large Language Models (LLMs) to generate reliable background knowledge for RE tasks on Italian texts.

eol>Relation Extraction LLMs Reasoning Low resources Italian

1. Introduction

require substantial labeled corpora resources that are often unavailable in low-resource settings [5]. MoreRelation extraction (RE) is a fundamental task in natural over, while prompt-tuned SLMs and instruction-tuned language processing that aims to identify and classify LLMs have shown remarkable success across various NLP relationships between subject and object entities men- tasks, they exhibit a tendency to memorize rather than tioned in text [1]. Formally, given an input sentence truly understand training data [6]. This limitation be = {1, 2, . . . , , . . . , , . . . , } containing to- comes particularly problematic for semantically complex kens, where and represent head and tail entities re- tasks like RE, which require deep domain-specific knowlspectively, RE systems predict a relation label ∈ edge and robust generalization capabilities. To address from a predefined set of relationships (e.g., founded_by, these limitations and further enhance the efectiveness born_in, and Work_For). This capability underlies of RE models, we propose a pipeline based on exploitmany critical NLP applications, including knowledge ing the LLMs’ reasoning capabilities. The hypothesis is graph completion and question answering systems [2]. that extending each sample of a given dataset using the Most past approaches focus on adapting standard-scale knowledge extracted by querying the LLM with specific language models (SLMs) such as BERT[3] to downstream clarification prompts helps the models trained on these RE tasks [4]. Recent advances in RE have been driven by samples, along with clarifications, to understand the task deep neural networks, with large pre-trained language better. We train several models on an Italian dataset, models achieving state-of-the-art performance. However, CoNLL04 Italian, translated from the CoNLL04 dataset despite these advances, several fundamental challenges [7]. Experimental results demonstrate that incorporatpersist in real-world deployment scenarios. The primary ing LLM-generated background knowledge significantly limitation stems from the long-tail distribution of rela- improves RE performance, particularly in low-resource tions in natural datasets. While frequent relations benefit settings. Subsequently, we conduct an analysis on the from abundant training examples, the majority of re- contribution that diferent outlooks that compose the lations sufer from severe data scarcity. This creates a knowledge give to the model’s prediction capabilities. significant bottleneck since deep learning approaches ued relevance of RE methods, which explicitly model from Italian literary texts. Their approach involves usrelationships between entities and thereby enhance LLM ing an LLM to preprocess the text into natural language performance. Moreover, RE techniques are especially triples, thereby simplifying the RE task for the fine-tuned valuable in dynamic domains characterized by the con- model. Existing RE methods also tend to exploit addistant emergence of new entities and relation types. Their tional knowledge to assist model reasoning. For example, adaptability makes them well-suited for scalable knowl- [23] proposes a knowledge-attention encoder that incoredge extraction from unstructured textual data, fueling porates prior knowledge from external lexical resources ongoing research and development in this area. Recent like FrameNet and Thesaurus.com into deep neural netadvances in deep neural networks (DNNs) and pretrained works for the relation extraction task. [24] uses enriched language models (PLMs) have substantially boosted RE sentence-level representations by introducing both strucperformance. Several studies [8, 9] approach RE as a tured knowledge from external knowledge graphs and pipeline process: first identifying entities within text, semantic knowledge from the corpus. However, exterthen determining the relationships between identified nal knowledge can be misleading and vague; external entity pairs. Earlier RE systems [10, 11] typically relied on resources don’t consider the context and the domain of external Named Entity Recognition (NER) tools for entity entities and relations, leading models to misinterpret the detection, followed by the use of supervised classifiers meaning of the sentence. with hand-engineered features to predict relations. In Despite these advances, the potential of Italian LLMs contrast, more recent approaches assume that entity men- to support and improve downstream RE remains largely tions are pre-identified, focusing solely on relation clas- underexplored. Given their demonstrated utility, further sification [ 12, 13]. However, pipeline architectures are investigation into their integration with RE workflows is prone to error propagation—errors in entity recognition both timely and necessary. can adversely afect the accuracy of relation classification.

Relation Extraction and Classification can be tackled as a generation task: REBEL [14] uses an autoregressive 3. Dataset model that outputs each triplet present in the input text.

To this end, it employs BART-large [15] as the base model In this research the proposed approach is evaluated on an for the seq2seq approach. The Italian LLM ecosystem has Italian translated version of CoNLL04 [7]. The CoNLL04 recently seen notable expansion, with several new mod- is a benchmark dataset used for relation extraction tasks. els released or announced that are specifically tailored It contains 1,441 sentences, each of which has at least for the Italian language. Among these is LLaMAntino- one relation. The sentences are annotated with infor3-ANITA [16], a fine-tuned version of Meta’s LLaMA-3 mation about entities and their corresponding relation (8B) [17], adapted through Supervised Fine-Tuning (SFT) types [25]. It comprises news articles from The Wall and Direct Preference Optimization (DPO) to align with Street Journal and the Associated Press. It encompasses user preferences and reduce biases. Another significant annotations for both entity and relation types, making it contribution is Fauno [18], developed by Sapienza Uni- versatile for various NLP tasks. The dataset includes relaversity as the first open-source Italian conversational tions among entities like people, organizations, locations, LLM (7B, with a 13B version forthcoming), trained on and other miscellaneous entities. Relation types are five: a blend of synthetic and technical corpora. Minerva Live_In, Located_In, OrgBased_In, Kill, Work_for. Rela7B [19], created by Sapienza NLP in collaboration with tions included: Person-Location, Organization-Person, FAIR, CINECA, and Italy’s National Recovery and Re- Person-Person, etc. silience Plan (PNRR), is trained from scratch on 2.5 trillion tokens (50% Italian), and further enhanced through Table 1 instruction tuning and safety layers. Velvet [20], devel- CoNLL04 benchmark statistics. Every sample is a sentence. oped by Almawave, is a family of multilingual LLMs that sentences entities relations includes Italian and is built on a proprietary architec- train 922 3377 1283 ture. This wave of Italian LLMs—from academic research validation 231 893 343 eforts to industry-grade solutions—reflects a growing test 288 1079 422 commitment to developing robust, safe, and efective na- total 1441 5349 2048 tive Italian models. These advances also contribute to improvements in downstream tasks, including RE. For This work employ a sophisticated hybrid approach instance, [21] propose an Italian Open Information Ex- for translating the ConLL04 English relation extraction traction framework that leverages LLMs for Open Named dataset to Italian while preserving the crucial tokenEntity Recognition, Open Relation Extraction, and joint level annotations required for named entity recognitasks via prompt-based instructions. Similarly, [22] com- tion and relation extraction tasks. The translation probine LLMs with fine-tuned models to extract relations cess operates in three main phases: first, the complete English sentence is translated to Italian using XALMA [26], built upon ALMA-R by expanding sup- relation type train validation test port from 6 to 50 languages. It utilizes a plug-and- Vive_A 322 88 95 play architecture with language-specific modules, com- Situato_In 243 64 94 plemented by a carefully designed training recipe. In HOarg_Luocccaistoa_In 127586 4604 41603 particular, a 8-bit quantized version due to resource Lavora_per 254 69 75 limit constraints is used from the ofical repository on Huggingface at https://huggingface.co/mradermacher/XALMA-13B-Group2-GGUF. The translator model gener- 4. Method ates fluent Italian text but disrupts the original token alignments. Second, to address the critical challenge 4.1. Background of maintaining entity boundaries and types across languages—where direct token-to-token mapping fails due This work considers an LLM as a reliable Knowledge Base to morphological diferences, word order changes, and (KB). Large Language Models (LLMs) ofer significant advarying translation lengths, the system employs Ope- vantages over external knowledge bases like Wikidata nAI’s GPT-4o-mini model [27] to perform intelligent en- for relation extraction tasks, particularly in their supetity alignment by analyzing both the original English rior ability to interpret sentence semantics and contextokens and their Italian counterparts, then identifying tual nuances. Unlike Wikidata, which provides static, which specific Italian tokens correspond to each English predefined relations between entities in a structured forentity based on semantic understanding rather than po- mat, LLMs possess deep contextual understanding that sitional heuristics. Finally, the system reconstructs the enables them to capture implicit relationships, resolve annotated dataset by mapping the spans of the identi- ambiguities, and interpret complex linguistic phenomena ifed Italian entity back to token indices. This step has such as metaphors, negations, and conditional statements the main goal to preserve entity types and relation la- that traditional knowledge bases cannot handle. LLMs bels while handling edge cases through fallback mecha- excel at understanding how the same entity pair can nisms that include proportional mapping and fuzzy string express diferent relations depending on syntactic strucmatching when exact alignment fails. This ensures that ture, discourse context, and pragmatic implications—for the resulting Italian dataset maintains the structural in- instance, distinguishing between "CEO of Apple" and tegrity necessary for training and evaluating relation "former CEO of Apple" or interpreting temporal and extraction models. The comprehensive error handling causal relationships that emerge from sentence composiand multi-stage validation process addresses the inher- tion rather than explicit statement. Furthermore, LLMs ent complexities of cross-lingual annotation transfer in can handle novel entity combinations and emerging restructured NLP datasets. In each split of the dataset, some lationships that may not yet exist in manually curated translated sentences are removed due to the impossibility databases, while their training on vast text corpora alof maintaining relation labels. This case is represented lows them to recognize subtle linguistic cues and conby a few sentences that are not well translated, in which textual modifiers that determine relation validity and one or more entities that were in the relationship label type. This semantic depth proves particularly valuable are missing. for relation extraction in domains with complex, evolving

Table 1 and Table 3, show the small reduction of sen- terminology or when dealing with informal text where tences (from 1441 to 1407) and consequently of the num- relationships are expressed through natural language ber of relations and entities. However, in the translation patterns rather than formal declarations, making LLMs process, entity types and relation types distribution are more robust and adaptable for real-world text analysis maintained 2, 4. scenarios where meaning emerges from the intricate interplay of syntax, semantics, and context. Given a sentence = {1, 2, . . . , } consisting of tokens, and a set of entities = {1, 2, . . . , } where each entity span-marker-multilingual-cased-multinerd, is defined by its span (, ) and type ∈ , [28] a SpanMarker model fine-tuned on the MultiNthe relation extraction task aims to identify and classify ERD. (2) bert-italian-cased-ner [29], a cased semantic relationships between entity pairs. Formally, BERT model specifically trained for Italian NER on let ℛ be the set of all possible relation types, including a the WikiNER Italian dataset plus manually annospecial no-relation type ∅ ∈ ℛ. For each ordered pair of tated Wikipedia paragraphs, capable of recognizing entities (, ) where ̸= , the relation extraction task four entity classes (Per, Loc, Org, Misc); and (3) seeks to determine the relation type ∈ ℛ that holds DeepMount00/universal_ner_ita, an Italian between (head entity) and (tail entity) within the adaptation of GLiNER [30] (Generalist Model for Named context of sentence . Entity Recognition using Bidirectional Transformer) that leverages natural language descriptions to identify 4.2. NER predictions arbitrary entity types. Entity types for GLiNER are "persona", "città", "nazione", "organizzazione", "data", "luogo", "evento", "prodotto" ("person", "city", "nation", "organisation", "date", "location", "event", "product").

Each model processes the tokenized Italian sentences independently, with predictions aligned to the original token boundaries. The resulting prediction set composed of all the token-level predictions obtained from cited models provides diverse perspectives on entity recognition.

This step involves in the extension of the input space using state-of-the-art Named Entity Recognition (NER) Italian models. NER is formulated as a sequence labeling task where each token in the input sequence is assigned a label that indicates its role in entity identification and classification. Given an input sentence = {1, 2, . . . , } consisting of tokens, the NER task aims to produce a corresponding label sequence = {1, 2, . . . , } where each label ∈ ℒ encodes both the entity type and the token’s position within the 4.3. Knowledge Extraction entity span. In particular, for each of input sentences of the dataset, this work construct a set of NER predictions Given the extended input (s, ) the aim of this step is to comprising annotations from three state-of-the-art further extend the input, extracting knowledge k from multilingual and Italian-specific named entity recog- LLM. k is composed by three diferent outlooks that are nition models. The prediction ensemble includes: (1) concatenated together to compose the semantic interpreA few Italian LLM’s are fine-tuned using LoRA strategy • For the Entities outlook we ask to the LLM: "Sp- in order to learn to generate the target representation iega brevemente il significato dei soggetti prin- 1. We fine-tune also mREBEL 32 [34], a multilingual vercipali menzionati per comprendere la frase: {s}" sion of REBEL [14]. All models are fine-tuned for 10 ("Briefly explain the meaning of the main subjects epochs. At the end of each epoch, models are evaluated mentioned in order to understand the sentence: on the validation set, best model on the evaluation set is {s}"). saved. Translation process, Knowledge extraction step, • Sentence outlook is obtained by asking "Spiegami and training step are executed on the same machine with molto brevemente la frase con il contesto nec- a NVIDIA GeForce RTX 3090 with 24GB of memory and essario: {s}" ("Explain the sentence to me very AMD Ryzen 9 5900X 12-Core Processor.

briefly, providing the necessary context: {s}"). • Relation outlook is obtained asking "Basandoti 5. Results sul testo e sulle predizioni di entità: Spiega brevemente le relazioni tra le entità menzionate nel testo. Testo: {s} Predizioni NER {}" ("Based on the text and entity predictions: Briefly explain the relationships between the entities mentioned in the text. Text: {s} NER predictions {}")

In this section, we present the experimental results of our

supervised fine-tuning approach on the Italian ConLL04 dataset. We evaluate multiple Italian large language models under diferent input configurations to assess the efectiveness of our generative relation extraction framework. We conduct experiments using three configurations:

Relations triplets are composed of a head entity, a tail en

tity, and a predicate indicating the semantic relationship between a subject entity and the object entity: 5.1. Main Results

"Hideo Kojima ha acquistato una nuova casa a Tokyo." Table 5 presents the performance comparison across dif("Hideo Kojima has purchased a new home in Tokyo.") ferent Italian language models and input configurations.

The semantic relationship according to CoNLL04 an- Following standard practice in relation extraction, we notation can be (Hideo Kojima, Vive_A, Tokio). Inspired report both micro and macro F1 scores, with macro F1 by REBEL triplets linearization [14], we try to minimize serving as the primary evaluation metric for state-of-thethe number of tokens in the generation stream in order art comparisons. to decode the output tokens eficiently. A relation triplet is represented by this notation: The model used to extract the Italian knowledge is Phi-4 [31] a 14B parameter state-of-the-art open model, due to the high quality and advanced multilingual reasoning capabilities, even though the small size. In this settings we are able to concatenate the sentence with NER predictions and knowledge k in order to represent the enriched input space <s , E, k > for a given sentence ∈ . Given this input space we employ a parametereficient fine-tuning strategy using Low-Rank Adaptation (LoRA) [32] within the PEFT framework [33] for supervised fine-tuning (SFT) of several Italian LLMs. 4.4. Target representation • Enriched: Complete input including sentence, entity predictions, and background knowledge ⟨, , ⟩ • Raw: Input containing only the source sentence

⟨⟩ • Enriched-Raw: Model fine-tuned on enriched input but evaluated using only raw sentence input at inference time

The enriched-raw configuration allows us to investigate the implicit knowledge distillation efects, where reasoning capabilities from the enriched training data transfer to simpler inference scenarios. tation of a single dataset sample. In particular for a given sentence ∈ where S represent the entire corpus of a dataset, = ⊕ ⊕ where is the Entities outlook, is the Sentence outlook and is the Relations outlook. task where the aim is to learn the conditional probability distribution given the input X = <s, E, k> : ( |) = (| < , , >) (2) Head Entity -> Tail Entity (Relation type)

(1) Multiple relations are separated by the semicolon character ";".

In this work relation extraction is treated as a generation

5.2. Performance Analysis LLaMAntino-3 demonstrates superior performance when trained and evaluated on enriched input, achieving 70.6% macro F1 score. This represents a significant improvement over both Minerva-7B (59.6%) and Velvet-14B

Model Configuration mREBEL (enriched) mREBEL (raw) mREBEL (enriched-raw) Minerva-7B (enriched) Minerva-7B (raw) Minerva-7B (enriched-raw) Velvet-14B (enriched) Velvet-14B (raw) Velvet-14B (enriched-raw) LLaMAntino-3 (enriched) LLaMAntino-3 (raw) LLaMAntino-3 (enriched-raw)

5.3. Error Analysis

Error analysis reveals two primary failure modes in the LLaMAntino-3 model’s relation extraction performance: spurious relation generation (41 instances) (60.2%), despite LLaMAntino-3 being a smaller 8B pa- and missed relation detection (37 instances). The rameter model. The results indicate that model archi- model demonstrates a tendency toward over-generation, tecture and training methodology are more critical fac- particularly struggling with complex sentences containtors than pure parameter count for this task. The strong ing multiple entities where it produces semantically plauperformance of mREBEL demonstrates that sequence-to- sible but factually incorrect relations. Geographic relasequence models, which were previously state-of-the-art tions (Situato_In) show the highest error rates, followed for this task, can achieve comparable results to large by organizational afiliations ( OrgLocata_In). Two reprelanguage models (LLMs). Additionally, mREBEL bene- sentative error patterns illustrate these challenges: Overifts from enriched input. However, Velvet-14B exhibits generation example: In the sentence "Nikita Chruščëv, the opposite behavior, performing better with raw input infuriato, ordinò alle navi dell’Unione Sovietica di igno(65.2%) than with enriched input (60.2%). This suggests rare il blocco navale del Presidente Kennedy durante la the model may be overfitting to the auxiliary information crisi dei missili cubani", the model incorrectly generated provided in the enriched input. Comparing LLaMAntino- four identical Kill relations between Khrushchev and 3 configurations reveals the substantial benefit of en- Kennedy, while missing the correct Vive_A relation beriched input during training. The model trained on en- tween Khrushchev and the Soviet Union. This demonriched data (70.6% macro F1) significantly outperforms strates the model’s tendency to infer dramatic but incorthe same model trained solely on raw sentences (62.1% rect relations from contextual conflict scenarios. Undermacro F1). This demonstrates the value of incorporat- detection example: For the sentence "MILANO, Italia ing entity predictions and background knowledge in the (AP)" (Milan, Italy (AP)), the model correctly identified ortraining process. The enriched-raw configuration yields ganizational relations for the Associated Press but failed particularly interesting results, achieving 64.9% macro F1 to extract the fundamental Situato_In relation between despite using only raw sentence input at inference time. Milan and Italy, suggesting dificulty with implicit geoThis performance exceeds that of the model trained ex- graphic knowledge in simple locative constructions. Outclusively on raw input (62.1% macro F1), suggesting an in- of-domain hallucination example: In the sentence teresting implicit knowledge distillation during training. "King venne ucciso il 4 aprile del 1968 a Memphis, nel TenThe model appears to internalize reasoning patterns from nessee", the model correctly identified the Situato_In relathe enriched training data, enabling improved perfor- tion between Memphis and Tennessee, but additionally mance even when auxiliary information is unavailable at generated correct (but counted as wrong) Evento relations inference time. Table 5.2 shows label-wise performances involving the date "4 aprile del 1968" with Memphis. The where the underlying capability of LLaMAantino3-8B Evento relation type does not exist in the defined schema, to predict well the "Kill" relation, which is the least rep- demonstrating the model’s tendency to create novel reresented in the training set. These results validate our lation categories when encountering temporal-spatial approach of treating relation extraction as a conditional contexts. These patterns indicate that while the generatext generation task and demonstrate the efectiveness tive approach successfully captures complex relational of supervised fine-tuning on Italian language models semantics, it requires improved calibration mechanisms,

7. Conclusion

growing body of research on Italian NLP by providing both a translated benchmark dataset and demonstrating efective strategies for leveraging LLM reasoning in structured prediction tasks. Our findings suggest that carefully designed knowledge augmentation can significantly improve relation extraction performance, particularly in scenarios where training data is limited.

Computational Linguistics, Boston, Massachusetts,

USA, 2004, pp. 1–8. URL: https://aclanthology.org/

W04-2401. [8] Y. Yuan, X. Zhou, S. Pan, Q. Zhu, Z. Song, L. Guo,

A relation-specific attention network for joint entity and relation extraction, in: International joint conference on artificial intelligence, International

Joint Conference on Artificial Intelligence, 2021. [9] T. Zhao, Z. Yan, Y. Cao, Z. Li, Asking efective and diverse questions: A machine reading comprehension based framework for joint entity-relation [1] X. Zhao, Y. Deng, M. Yang, L. Wang, R. Zhang, extraction, in: Proceedings of the Twenty-Ninth InH. Cheng, W. Lam, Y. Shen, R. Xu, A compre- ternational Conference on International Joint Conhensive survey on relation extraction: Recent ad- ferences on Artificial Intelligence, 2021, pp. 3948– vances and new frontiers, ACM Comput. Surv. 3954. 56 (2024). URL: https://doi.org/10.1145/3674501. [10] S. Pawar, G. K. Palshikar, P. Bhattacharyya, Redoi:10.1145/3674501. lation extraction: A survey, arXiv preprint [2] X. Zhao, Y. Deng, M. Yang, L. Wang, R. Zhang, arXiv:1712.05191 (2017).

H. Cheng, W. Lam, Y. Shen, R. Xu, A comprehen- [11] Z. Huang, W. Xu, K. Yu, Bidirectional lstm-crf sive survey on relation extraction: Recent advances models for sequence tagging, arXiv preprint and new frontiers, 2024. URL: https://arxiv.org/abs/ arXiv:1508.01991 (2015).

2306.02051. arXiv:2306.02051. [12] L. Weber, S anger, m., garda, s. et al.(2021) hum[3] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, Bert: boldt@ drugprot: chemical-protein relation extracPre-training of deep bidirectional transformers for tion with pretrained transformers and entity delanguage understanding, 2019. URL: https://arxiv. scriptions, in: Proceedings of the BioCreative VII org/abs/1810.04805. arXiv:1810.04805. challenge evaluation workshop, ????, pp. 22–25. [4] I. Yamada, A. Asai, H. Shindo, H. Takeda, Y. Mat- [13] A. Bhartiya, K. Badola, et al., Dis-rex: A multilinsumoto, LUKE: Deep contextualized entity rep- gual dataset for distantly supervised relation extracresentations with entity-aware self-attention, in: tion, arXiv preprint arXiv:2104.08655 (2021). B. Webber, T. Cohn, Y. He, Y. Liu (Eds.), Proceed- [14] P.-L. Huguet Cabot, R. Navigli, REBEL: Relation ings of the 2020 Conference on Empirical Meth- extraction by end-to-end language generation, in: ods in Natural Language Processing (EMNLP), As- M.-F. Moens, X. Huang, L. Specia, S. W.-t. Yih (Eds.), sociation for Computational Linguistics, Online, Findings of the Association for Computational Lin2020, pp. 6442–6454. URL: https://aclanthology.org/ guistics: EMNLP 2021, Association for Computa2020.emnlp-main.523/. doi:10.18653/v1/2020. tional Linguistics, Punta Cana, Dominican Repubemnlp-main.523. lic, 2021, pp. 2370–2381. URL: https://aclanthology. [5] A. Layegh, A. H. Payberah, A. Soylu, D. Roman, org/2021.findings-emnlp.204/. doi: 10.18653/v1/ M. Matskin, Wiki-based prompts for enhancing relation extraction using language models, in: Pro- [15] 2M0.2L1ew.fisi,nYd.Liinug, sN-. eGmonylalp, .M2.0G4h.azvininejad, A. Moceedings of the 39th ACM/SIGAPP Symposium hamed, O. Levy, V. Stoyanov, L. Zettlemoyer, BART: on Applied Computing, SAC ’24, Association for denoising sequence-to-sequence pre-training for Computing Machinery, New York, NY, USA, 2024, natural language generation, translation, and comp. 731–740. URL: https://doi.org/10.1145/3605098. prehension, CoRR abs/1910.13461 (2019). URL: http: [6] 3S6.3P5a9n4,9L. .dLoui:o1,0Y..1W14a5n/g3,C60.5C0h9e8n.,J3.6W35an94g,9X.. Wu, [16] /M/a.rPxiovl.iogrnga/anbos,/1P9.1B0a.1s3il4e6,1G.a.rSXemive:ra1r9o1, 0A.d1v3a4n6c1e.d Unifying large language models and knowledge natural-based interaction for the italian language: graphs: A roadmap, IEEE Transactions on Knowledge and Data Engineering 36 (2024) 3580–3599. [17] LAlIa@mManettian,o-3-Lalnaimtaa,20324m. oadreXlivca:r2d40(250.2047)1.0U1R.L: URL: http://dx.doi.org/10.1109/TKDE.2024.3352100. https://github.com/meta-llama/llama3/blob/main/ doi:10.1109/tkde.2024.3352100. MODEL_CARD.md. [7] D. Roth, W.-t. Yih, A linear programming for- [18] A. S. F. S. Andrea Bacciu, Giovanni Trappolini, mulation for global inference in natural language Fauno: The italian large language model that tasks, in: Proceedings of the Eighth Confer- will leave you senza parole!, https://github.com/ ence on Computational Natural Language Learning andreabac3/Fauno-Italian-LLM, 2023. (CoNLL-2004) at HLT-NAACL 2004, Association for [19] R. Navigli, S. N. group, Minerva: Italy’s first family of large language models trained on italian texts nition using bidirectional transformer, in: K. Duh, (2024). H. Gomez, S. Bethard (Eds.), Proceedings of the [20] Almawave, Velvet ai: sustainable and high- 2024 Conference of the North American Chapter performance italian multilingual llm, Wikipedia, of the Association for Computational Linguistics: 2025. Human Language Technologies (Volume 1: Long [21] L. Piano, A. Pisu, S. G. Tiddia, S. Carta, A. Giuliani, Papers), Association for Computational LinguisL. Pompianu, Llimoniie: Large language instructed tics, Mexico City, Mexico, 2024, pp. 5364–5376. model for open named italian information extrac- URL: https://aclanthology.org/2024.naacl-long.300/. tion (2024). doi:10.18653/v1/2024.naacl-long.300. [22] C. Santini, G. Marozzi, L. Melosi, E. Frontoni, Lever- [31] M. Abdin, J. Aneja, H. Behl, S. Bubeck, R. Eldan, aging large language models to generate a knowl- S. Gunasekar, M. Harrison, R. J. Hewett, M. Javaedge graph from italian literary texts, in: DH2024 heripi, P. Kaufmann, J. R. Lee, Y. T. Lee, Y. Li, W. Liu, Book of Abstracts, 2024. C. C. T. Mendes, A. Nguyen, E. Price, G. de Rosa, [23] P. Li, K. Mao, X. Yang, Q. Li, Improving relation ex- O. Saarikivi, A. Salim, S. Shah, X. Wang, R. Ward, traction with knowledge-attention, in: Proceedings Y. Wu, D. Yu, C. Zhang, Y. Zhang, Phi-4 technical of the 2019 Conference on Empirical Methods in report, 2024. URL: https://arxiv.org/abs/2412.08905. Natural Language Processing and the 9th Interna- arXiv:2412.08905. tional Joint Conference on Natural Language Pro- [32] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, cessing (EMNLP-IJCNLP), Association for Compu- S. Wang, L. Wang, W. Chen, Lora: Low-rank adaptational Linguistics, 2019, p. 229–239. URL: http: tation of large language models, arXiv preprint //dx.doi.org/10.18653/v1/D19-1022. doi:10.18653/ arXiv:2106.09685 (2021).

v1/d19-1022. [33] S. Mangrulkar, S. Gugger, L. Debut, [24] J. Gao, H. Wan, Y. Lin, Exploiting global context Y. Belkada, S. Paul, Peft: State-of-the-art and external knowledge for distantly supervised parameter-eficient ifne-tuning methods, relation extraction, Knowledge-Based Systems https://github.com/huggingface/peft, 2022. 261 (2023) 110195. URL: https://www.sciencedirect. [34] P.-L. Huguet Cabot, S. Tedeschi, A.-C. com/science/article/pii/S0950705122012916. Ngonga Ngomo, R. Navigli, Redfm: a filtered and doi:https://doi.org/10.1016/j.knosys. multilingual relation extraction dataset, in: Proc. 2022.110195. of the 61st Annual Meeting of the Association for [25] Y. Tao, Y. Wang, L. Bai, Graphical reason- Computational Linguistics: ACL 2023, Association ing: Llm-based semi-open relation extraction, for Computational Linguistics, Toronto, Canada, 2024. URL: https://arxiv.org/abs/2405.00216. 2023. URL: https://arxiv.org/abs/2306.09802. arXiv:2405.00216. [26] H. Xu, K. Murray, P. Koehn, H. Hoang, A. Eriguchi,

H. Khayrallah, X-alma: Plug play modules and adaptive rejection for quality translation at scale, 2025. URL: https://arxiv.org/abs/2410.03115.

arXiv:2410.03115. [27] OpenAI Team, GPT-4o mini: advancing costeficient intelligence, https://openai.com/ gpt4o-mini, 2024. Read me. Accessed on 23

Aug. 2024. [28] lxyuan, span-marker-bert-base-multilingualcased-multinerd, https://huggingface.co/lxyuan/ span-marker-bert-base-multilingual-cased-multinerd, 2023. Fine-tuned SpanMarker model based on bert-base-multilingual-cased for multilingual named entity recognition on MultiNERD dataset. [29] osiria, bert-italian-cased-ner, https://huggingface.

co/osiria/bert-italian-cased-ner, 2023. BERT-based model for Italian Named Entity Recognition, finetuned on WikiNER dataset for Person, Location,

Organization and Miscellaneous entity classes. [30] U. Zaratiana, N. Tomeh, P. Holat, T. Charnois,

GLiNER: Generalist model for named entity recogDeclaration on Generative AI During the preparation of this work, the author(s) used ChatGPT (OpenAI) and Grammarly in order to: Paraphrase and reword and Grammar and spelling check. After using these tool(s)/service(s), the author(s) reviewed and edited the content as needed and take(s) full responsibility for the publication’s content.