<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">SLIMER-IT: Zero-Shot NER on Italian Language</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Andrew</forename><surname>Zamai</surname></persName>
							<email>andrew.zamai@unisi.it</email>
							<affiliation key="aff0">
								<orgName type="institution">Università degli Studi di Siena</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">expert.ai</orgName>
								<address>
									<settlement>Siena</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Leonardo</forename><surname>Rigutini</surname></persName>
							<email>lrigutini@expert.ai</email>
							<affiliation key="aff1">
								<orgName type="department">expert.ai</orgName>
								<address>
									<settlement>Siena</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Marco</forename><surname>Maggini</surname></persName>
							<email>marco.maggini@unisi.it</email>
							<affiliation key="aff0">
								<orgName type="institution">Università degli Studi di Siena</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Andrea</forename><surname>Zugarini</surname></persName>
							<email>azugarini@expert.ai</email>
							<affiliation key="aff1">
								<orgName type="department">expert.ai</orgName>
								<address>
									<settlement>Siena</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">SLIMER-IT: Zero-Shot NER on Italian Language</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">C5620ACD1BF9DEFB1DCED180A8AB05E6</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:34+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Named Entity Recognition</term>
					<term>Zero-Shot NER</term>
					<term>Large Language Models</term>
					<term>Instruction tuning</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Traditional approaches to Named Entity Recognition (NER) frame the task as a BIO sequence labeling problem. Although these systems often excel in the downstream task at hand, they require extensive annotated data and struggle to generalize to out-of-distribution input domains and unseen entity types. In contrast, Large Language Models (LLMs) have demonstrated strong zero-shot capabilities. While several works address Zero-Shot NER in English, little has been done in other languages. In this paper, we define an evaluation framework for Zero-Shot NER, applying it to the Italian language. Furthermore, we introduce SLIMER-IT, the Italian version of SLIMER, an instruction-tuning approach for zero-shot NER that leverages prompts enriched with a definition and guidelines. Comparisons with other state-of-the-art models demonstrate the superiority of SLIMER-IT on never-seen-before entity tags.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Named Entity Recognition (NER) plays a fundamental role in Natural Language Processing (NLP), often being a key component of information extraction pipelines. The task involves identifying and categorizing entities in a given text according to a predefined set of labels. While person, organization, and location are the most common, applications of NER in certain fields may require the identification of domain-specific entities.</p><p>Manually annotated data has always been critical for the training of NER systems <ref type="bibr" target="#b0">[1]</ref>. Traditional methods tackle NER as a token classification problem, where models are specialized in a narrow domain with a pre-defined label set <ref type="bibr" target="#b1">[2]</ref>. While achieving strong performance on the data distribution they were trained on, they require extensive human annotations for the downstream task at hand. Additionally, they lack generalization capabilities when it comes to addressing out-of-distribution input domains and/or unseen labels <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b2">3,</ref><ref type="bibr" target="#b3">4]</ref>.</p><p>In contrast, Large Language Models (LLMs) have recently demonstrated strong zero-shot capabilities. Models like GPT-3 can tackle NER via In-Context Learning <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b5">6]</ref>, with Instruction-Tuning further improving performance <ref type="bibr" target="#b6">[7,</ref><ref type="bibr" target="#b7">8,</ref><ref type="bibr" target="#b9">9]</ref>. To this end, several models have been proposed to tackle zero-shot NER <ref type="bibr" target="#b11">[10,</ref><ref type="bibr" target="#b3">4,</ref><ref type="bibr" target="#b2">3,</ref><ref type="bibr" target="#b12">11,</ref><ref type="bibr">12,</ref><ref type="bibr" target="#b14">13]</ref>. 
In particular, SLIMER <ref type="bibr" target="#b14">[13]</ref> proved particularly effective on unseen named entity types, leveraging definitions and guidelines to steer the model generation. However, little has been done for zero-shot NER on non-English data. More generally, as pointed out in <ref type="bibr" target="#b0">[1]</ref>, NER is understudied in languages like Italian, especially outside the traditional news domain and the person, location, organization classes.</p><p>To this end, in this paper we propose an evaluation framework for Zero-Shot NER, and we apply it to the Italian language. In addition, we fine-tune a version of SLIMER for Italian, which we call SLIMER-IT 1 . In the experiments, we explore different LLM backbones and assess the impact of Definition and Guidelines (D&amp;G). When comparing SLIMER-IT with state-of-the-art approaches, either using models pre-trained on English or adapted for Italian, results demonstrate the superiority of SLIMER-IT in labelling unseen entity tags.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>Several works tackle Zero-Shot NER in English, such as InstructUIE <ref type="bibr" target="#b11">[10]</ref>, UniNER <ref type="bibr" target="#b3">[4]</ref>, GoLLIE <ref type="bibr" target="#b2">[3]</ref>, GLiNER <ref type="bibr" target="#b12">[11]</ref>, GNER <ref type="bibr">[12]</ref> and SLIMER <ref type="bibr" target="#b14">[13]</ref>. Most of them are based on the instruction tuning of an LLM and mainly differ in the prompt and output format design. GLiNER distinguishes itself by being a smaller encoder-only model, combined with a span classifier head, that achieves competitive performance at a lower computational cost.</p><p>As highlighted in SLIMER <ref type="bibr" target="#b14">[13]</ref>, most approaches mainly focus on zero-shot NER in Out-Of-Distribution input domains (OOD), since they are typically fine-tuned on an extensive number of entity classes highly or completely overlapping between training and test sets. In view of this, we proposed a lighter instruction-tuning methodology for LLMs, training on data that overlaps to a lesser degree with the test sets, while steering the model annotation process with a definition and guidelines for the NE category to be annotated. Hence the name SLIMER: Show Less, Instruct More Entity Recognition.</p><p>Although the authors of GLiNER also propose a multilingual model and evaluate zero-shot generalizability across different languages, neither they nor any other work has addressed the task of Zero-Shot NER specifically for the Italian language.</p><p>NER for Italian. While NER has been extensively studied in English, less has been done in other languages, particularly outside the traditional general-purpose domains and entity label sets <ref type="bibr" target="#b15">[14]</ref>. 
Indeed, in Italian, most NER datasets focus on news and, more recently, social media content <ref type="bibr" target="#b16">[15,</ref><ref type="bibr" target="#b17">16,</ref><ref type="bibr" target="#b18">17]</ref>. Currently, there has been no research into zero-shot NER, only a few exploratory studies into multi-domain NER. This challenge was introduced in the NERMuD task (NER Multi-Domain) at EVALITA 2023 2 , in which one sub-task required developing a single model capable of classifying the common entities (person, organization, location) from different types of text, including news, fiction and political speeches. The ExtremITA team <ref type="bibr" target="#b19">[18]</ref> addressed the challenge by proposing a single LLM capable of tackling all the different tasks at EVALITA 2023, among which NERMuD. All the tasks were converted into text-to-text problems and two LLMs (LLaMA- and T5-based) were instruction-tuned on the union of all the datasets available for the challenge.</p><p>In the case of zero-shot NER, a model should be able to extract entities from inputs belonging to the same domain it was trained on (in-domain) and from other domains not encountered before (out-of-domain). Moreover, it should also generalize well to novel entity classes (unseen named entities). In our zero-shot evaluation framework, we aim to measure each level independently. Hence, we define an evaluation benchmark that includes a collection of NER datasets divided by degree of generalization. In the following, we describe the required properties to fit into each category.</p><p>In-domain. This evaluation helps measure how well the model generalizes from its training data to similar, but not identical, data. The model is evaluated on the same input domains and named entities as those in the training set. This data often consists of the test partitions associated with each training set used for fine-tuning the model.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Out-Of-Domain (OOD).</head><p>OOD evaluation tests the model's ability to generalize to input texts from domains that it has not encountered during training. While the named entities have been seen during training, this type of evaluation is particularly challenging because different input domains often exhibit unique linguistic patterns and domain-specific terminology.</p><p>Unseen Named Entities. This evaluation tests the model's ability to identify and classify entities that it has not encountered during its training phase. The tag set comprises fine-grained categories which are often specifically defined for the domain in which NER is deployed. Because of this, the input data may often also be Out-Of-Domain (OOD), so this evaluation subsumes the previously mentioned OOD scenario as well.</p></div>
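The three generalization levels above can be sketched as a simple dispatch over two properties of an evaluation set. The `EvalSet` helper below is hypothetical, introduced purely for illustration and not part of the released SLIMER-IT code:

```python
from dataclasses import dataclass

@dataclass
class EvalSet:
    name: str
    domain_seen: bool  # was the input domain encountered during training?
    labels_seen: bool  # were the entity tags encountered during training?

def generalization_level(s: EvalSet) -> str:
    """Map an evaluation set to one of the three zero-shot levels."""
    if not s.labels_seen:
        # unseen NEs; as noted above, the inputs may also be out-of-domain
        return "unseen NEs"
    if not s.domain_seen:
        return "OOD"
    return "in-domain"
```

Under this scheme, the WikiNews test split used later would map to in-domain, FIC and ADG to OOD, and Multinerd-IT (restricted to novel tags) to unseen NEs.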
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">SLIMER-IT</head><p>To adapt SLIMER to Italian, we translate the instruction-tuning prompt of <ref type="bibr" target="#b14">[13]</ref>, as shown in Figure <ref type="figure" target="#fig_0">1</ref>. The prompt is designed to extract the occurrences of one entity type per call. While this has the drawback of requiring |NE| inference calls on each input text, it allows the model to better focus on a single NE type at a time.</p><p>As in <ref type="bibr" target="#b14">[13]</ref>, we query gpt-3.5-turbo-1106 via OpenAI's ChatGPT APIs to automatically generate a definition and guidelines for each needed entity tag. The definition of a NE is meant to be a short sentence describing the tag. The guidelines instead provide annotation instructions to align the model's labelling with the desired annotation scheme. Guidelines can be used to prevent the model from labelling certain edge cases or to provide examples of the NE. Such an informative prompt is extremely valuable when dealing with unfamiliar entity tags, and can also be used to distinguish between polysemous categories.</p><p>Finally, the model is requested to generate the named entities in a parsable JSON format containing the list of NEs extracted for the given tag.</p></div>
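Since the model is asked to return a parsable JSON list per (text, tag) pair, a thin wrapper can recover the extracted entities defensively. This is a sketch under the assumption that the answer is a JSON array of strings; it is not the authors' released code:

```python
import json

def parse_entities(generation: str) -> list[str]:
    """Parse the model's JSON answer for one entity tag;
    tolerate malformed generations by returning no entities."""
    try:
        data = json.loads(generation)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []
    # keep only string entries, discarding any other JSON values
    return [e for e in data if isinstance(e, str)]
```

One such call is made per entity type, so annotating a text against |NE| tags costs |NE| generations, as noted above.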
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Experiments</head><p>Our experiments aim to assess the approach on Italian. We study the impact of guidelines and the use of different backbones. Then, we compare our approach against state-of-the-art alternatives.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">Datasets</head><p>We build the zero-shot NER framework (described in Section 3) for Italian upon the NERMuD shared task and the Multinerd dataset. In particular, we use NERMuD to build the in-domain and OOD evaluation sets, while Multinerd-IT is used to assess the behaviour in the unseen named entities scenario.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>NERMuD.</head><p>NERMuD <ref type="bibr" target="#b0">[1]</ref> is a shared task organized at EVALITA 2023, built on the Kessler Italian Named-entities Dataset (KIND) <ref type="bibr" target="#b20">[19]</ref>. It contains annotations for the three classic NER tags: person, organization and location. Examples are organized in three distinct domains: news, literature and political discourses. Unlike NERMuD, we restrict fine-tuning to a single domain. In this way, we can evaluate both the in-domain and out-of-domain capabilities of the model. In particular, we designate the WikiNews (WN) subset, being the most generic domain, for training and in-domain evaluation, while the Fiction (FIC) and Alcide De Gasperi (ADG) splits are kept for out-of-domain evaluation only.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Multinerd-IT.</head><p>To construct the unseen-NEs evaluation set, we exploit Multinerd<ref type="foot" target="#foot_0">3</ref>  <ref type="bibr" target="#b21">[20]</ref>, a multilingual NER dataset with 15 tags: person, organization, location, animal, biological entity, celestial body, disease, event, food, instrument, media, plant, mythological entity, time and vehicle. We keep the Italian examples only. This dataset is a perfect choice to assess models' capabilities on unseen NEs: the data belongs to the same news domain as the NERMuD split chosen for fine-tuning, but it includes a broader label set. Since we want to measure performance on never-seen-before entities, we exclude the entity types seen in training, i.e. person, organization and location. We also remove biological entity, as it is severely underrepresented, with a support of just 4 instances.</p></div>
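The filtering just described can be sketched as follows. The record layout with `lang` and `entities` fields is an assumption for illustration, not the actual Multinerd schema:

```python
SEEN_TAGS = {"person", "organization", "location"}  # seen during fine-tuning
DROPPED = {"biological entity"}  # support of only 4 instances

def build_unseen_split(examples):
    """Keep Italian examples and drop entities whose tag was seen in
    training or is underrepresented; skip examples left without entities."""
    for ex in examples:
        if ex["lang"] != "it":
            continue
        kept = [e for e in ex["entities"]
                if e["tag"] not in SEEN_TAGS | DROPPED]
        if kept:
            yield {**ex, "entities": kept}
```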
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">Backbones</head><p>We implemented several versions of SLIMER-IT based on different backbone models. We consider similarly sized LLMs, all in the 7B-parameter range. In particular, we selected five backbones: Camoscio<ref type="foot" target="#foot_1">4</ref>  <ref type="bibr" target="#b22">[21]</ref>, LLaMA-2-7b-chat <ref type="bibr" target="#b23">[22]</ref>, Mistral-7B-Instruct <ref type="bibr" target="#b24">[23]</ref>, LLaMA-3-8B-Instruct, and LLaMAntino-3-ANITA-8B-Inst-DPO-ITA<ref type="foot" target="#foot_2">5</ref>  <ref type="bibr" target="#b25">[24]</ref>.</p><p>LLaMA-2-7b-chat was originally used in SLIMER <ref type="bibr" target="#b14">[13]</ref>, and LLaMA-3-8B-Instruct is its newest, improved version. Like the LLaMA family, Mistral-7B-Instruct is a multilingual, mainly English-oriented model, but it has demonstrated greater fluency in Italian. Camoscio and LLaMAntino-3-ANITA-8B-Inst-DPO-ITA, instead, are two LLMs specifically fine-tuned on Italian instructions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3.">Compared Models</head><p>We compare the SLIMER-IT approach, implemented with different backbones, against other state-of-the-art approaches for zero-shot NER. All the methods are trained and evaluated in the defined zero-shot NER framework for a fair comparison. We evaluate against the following.</p><p>Token classification. Although certainly not suited for zero-shot NER, due to its architectural inability to cope with unseen tags, we decided to evaluate the best-known approach to NER as a baseline. As in NERMuD <ref type="bibr" target="#b0">[1]</ref>, we use the training framework dhfbk/bert-ner <ref type="foot" target="#foot_3">6</ref> . We fine-tune two different base models: bert-base-cased, pre-trained on English, and dbmdz/bert-base-italian-cased <ref type="foot" target="#foot_4">7</ref> , an Italian version.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>GNER.</head><p>It is the best-performing approach to zero-shot NER on the OOD English benchmark. GNER <ref type="bibr">[12]</ref> proposes a BIO-like generation: the model replicates the input text in its output, along with a token-by-token BIO label. Here, we consider LLaMAntino-3 as its backbone.</p></div>
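A GNER-style generation can be mapped back to entity spans with standard BIO decoding. The helper below is a generic sketch of that decoding step, not GNER's actual implementation:

```python
def bio_to_spans(tokens, labels):
    """Convert token-level BIO labels back into (start, end, tag) spans,
    with end exclusive. Stray I- labels simply close the current span."""
    spans, start, tag = [], None, None
    for i, lab in enumerate(labels):
        if lab.startswith("B-"):
            if start is not None:
                spans.append((start, i, tag))
            start, tag = i, lab[2:]
        elif lab.startswith("I-") and tag == lab[2:]:
            continue  # extend the current span
        else:
            if start is not None:
                spans.append((start, i, tag))
            start, tag = None, None
    if start is not None:
        spans.append((start, len(tokens), tag))
    return spans
```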
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 1</head><p>Comparing SLIMER-IT based on different backbones, with and without Definition and Guidelines (D&amp;G) in the prompt. LLMs with the † symbol were instruction-tuned on Italian. In parentheses, the (±Δ𝐹1) in performance given by the usage of D&amp;G.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>GLiNER.</head><p>Unlike all the other methods, GLiNER is based on a smaller encoder-only model, combined with a span classifier head, able to achieve competitive performance on the OOD English benchmark at a lower computational cost. We fine-tune it both with its original deberta-v3-large English backbone and with the Italian dbmdz/bert-base-italian-cased model.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>extremITLLaMA.</head><p>Already described in Section 2, it represents an interesting approach to compare against. Since it is based on the Camoscio LLM, we compare it with the SLIMER-IT approach implemented with the same backbone.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.4.">Experimental setup</head><p>We kept the same training configuration as SLIMER <ref type="bibr" target="#b14">[13]</ref> on English, except that we trained on all available samples. Depending on the backbone, the instruction-tuning prompt (see Figure <ref type="figure" target="#fig_0">1</ref>) was adjusted according to the structure of its template (e.g. the [INST] or &lt;|start_header_id|&gt; formats). For all the competitors, we replicated their training setup using their scripts and suggested hyper-parameters. For the evaluation, we use the micro-F1 as computed in the UniNER 8 implementation.</p></div>
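Micro-F1 pools true positives, false positives and false negatives over all predicted entities before computing precision and recall. The snippet below is a simplified sketch in the spirit of the UniNER evaluation, not its exact code; duplicate predictions are deduplicated here for simplicity:

```python
def micro_f1(pred, gold):
    """Micro-averaged F1 over hashable (text_id, entity, tag) tuples."""
    p, g = set(pred), set(gold)
    tp = len(p & g)  # exact matches of entity string and tag
    if tp == 0:
        return 0.0
    precision = tp / len(p)  # len(p) = tp + fp
    recall = tp / len(g)     # len(g) = tp + fn
    return 2 * precision * recall / (precision + recall)
```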
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.5.">Results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Impact of Definition and Guidelines (D&amp;G).</head><p>We compare SLIMER-IT with a version devoid of definition and guidelines in the prompt. To demonstrate the robustness of the approach, we train several SLIMER-IT instances based on different LLM backbones. In Table <ref type="table">1</ref>, we report the results, highlighting the absolute difference in performance between the model steered by D&amp;G and the one not using them. Generally, definition and guidelines yield improvements in F1. In particular, the gap is small when evaluating on in-domain data, whereas it becomes significant in OOD and even more substantial on unseen NEs. This is expected, since D&amp;G help the most in conditions unseen during training. Notably, LLaMA-3-based backbones benefit the most from definition and guidelines, with improvements beyond 23 absolute F1 points, surpassing all the other models by substantial margins on never-seen-before entity tags.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 2</head><p>Comparison with existing off-the-shelf models for zero-shot NER on Italian. We omit the in-domain evaluation so as not to disadvantage them against SLIMER-IT.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 3</head><p>Comparing SLIMER-IT with state-of-the-art approaches trained in the same zero-shot setting, and adopting the same backbone when possible. *Note that extremITLLaMA was fine-tuned also on the FIC and ADG train sets for the NERMuD task, so these datasets are not actually OOD for this model.</p><p>Impact of Backbones. Regarding the choice of the SLIMER-IT backbone, we better illustrate the results in Figure <ref type="figure" target="#fig_2">2</ref>. We can observe no remarkable difference in the in-domain evaluation. Globally, as one might expect, the most recent models outperform the older ones: Camoscio and LLaMA-2-chat obtain lower scores than the rest of the backbones, with the only exception of the FIC dataset, where LLaMA-3-based architectures underperform. However, LLaMAntino-3-ANITA reaches the best performance on 3 out of 4 datasets, with a strong gap especially in the unseen named entities scenario, the most challenging one. Interestingly, thanks to their better understanding capabilities, backbones specialized in Italian are particularly effective in the unseen NEs scenario. This is the case for LLaMAntino-3-ANITA and even Camoscio, which achieves a higher F1 than LLaMA-2.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Off-the-shelf Italian NER models.</head><p>Although no prior work has defined a Zero-Shot NER evaluation framework for Italian, there exist specialized, fine-tuned state-of-the-art zero-shot NER models for the Italian language. In particular, we consider GLiNER-ML <ref type="bibr" target="#b12">[11]</ref>, a multilingual instance of GLiNER, together with Universal-NER-ITA <ref type="foot" target="#foot_5">9</ref> and GLiNER-ITA-Large<ref type="foot" target="#foot_6">10</ref> , both specialized on Italian. These models were trained on synthetic data covering a vast number of different entity classes (up to 97k). Thus, it is impossible to compare them directly in a pure zero-shot framework, since virtually no entity tag is truly never-seen-before during their training. Nevertheless, we report their results against SLIMER-IT in Table <ref type="table">2</ref>. Despite this advantage, SLIMER-IT outperforms all these models by a large margin.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>State-of-the-art comparison.</head><p>Thanks to the definition of our zero-shot evaluation framework, we can compare different state-of-the-art approaches fairly. Results are outlined in Table <ref type="table">3</ref>. When evaluating in the same domain where the model was trained, encoder-only architectures obtain strong results despite being much smaller models. This result is not surprising, given the acknowledged performance of these architectures in supervised NER. More unexpected is their ability to generalize well to OOD inputs. GNER also proves to be quite competitive, achieving the best results in the in-domain evaluation and, in OOD, on the FIC dataset. However, all these approaches dramatically fail on never-seen-before tags, in contrast to SLIMER-IT, which achieves almost 55 F1 points. Compared with LLM-based approaches like GNER and extremITLLaMA, this proves once again that, without definition and guidelines, LLMs struggle to tag novel kinds of entities.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusions</head><p>In this paper, we proposed an evaluation framework for Zero-Shot NER and applied it to Italian. Thanks to such a framework, we can better investigate different zero-shot properties depending on the scenario (in-domain, OOD, unseen NEs). On top of that, we compared several state-of-the-art approaches, with a particular focus on SLIMER, which, thanks to the use of definition and guidelines, is well suited to dealing with novel entity types. Indeed, SLIMER-IT, our fine-tuned model based on LLaMAntino-3, surpasses other state-of-the-art techniques by large margins. In the future, we plan to further extend the zero-shot NER benchmark and to implement an input caching mechanism for scalability to large label sets.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. SLIMER-IT on some NE tags</head><p>In Table <ref type="table" target="#tab_4">4</ref> we report some qualitative examples of definitions and guidelines, comparing SLIMER-IT (LLaMAntino-based) with a version of it devoid of such components.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: SLIMER-IT instruction tuning prompt. Dedicated entity definition and guidelines steer the model labelling.</figDesc><graphic coords="1,322.96,291.76,162.69,210.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>2</head><label></label><figDesc>https://www.evalita.it/campaigns/evalita-2023/tasks/</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: SLIMER-IT performance for different backbones.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>3. Zero-Shot NER Framework</head><label></label><figDesc></figDesc><table><row><cell>In traditional machine-learning theory, a model 𝑓, trained for a task (e.g. NER) represented by a dataset 𝒳, 𝒴, is typically evaluated on a held-out test set sampled from the same task and distribution as the training data. In zero-shot learning, instead, a model is expected to go beyond what it experienced during training. There are different levels of generalization, indicating to what extent the model goes beyond what it directly learnt.</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head></head><label></label><figDesc>Some qualitative examples are shown in Appendix A.</figDesc><table><row><cell>Approach</cell><cell>Backbone</cell><cell>Language</cell><cell>Params</cell><cell>In-Domain (WN)</cell><cell>OOD (FIC)</cell><cell>OOD (ADG)</cell><cell>unseen NEs (MN)</cell></row><row><cell>Token classification</cell><cell>BERT-base</cell><cell>EN</cell><cell>0.11B</cell><cell>83.9</cell><cell>75.6</cell><cell>75.0</cell><cell>-</cell></row><row><cell>Token classification</cell><cell>BERT-base</cell><cell>IT</cell><cell>0.11B</cell><cell>89.8</cell><cell>87.0</cell><cell>82.3</cell><cell>-</cell></row><row><cell>GLiNER</cell><cell>deberta-v3-large</cell><cell>EN</cell><cell>0.44B</cell><cell>87.8</cell><cell>77.2</cell><cell>80.3</cell><cell>0.2</cell></row><row><cell>GLiNER</cell><cell>BERT-base</cell><cell>IT</cell><cell>0.11B</cell><cell>89.3</cell><cell>87.5</cell><cell>84.9</cell><cell>0.6</cell></row><row><cell>extremITLLaMA</cell><cell>Camoscio</cell><cell>IT</cell><cell>7B</cell><cell>89.1</cell><cell>90.3*</cell><cell>83.4*</cell><cell>0.2</cell></row><row><cell>SLIMER-IT</cell><cell>Camoscio</cell><cell>IT</cell><cell>7B</cell><cell>81.5</cell><cell>85.1</cell><cell>76.0</cell><cell>38.7</cell></row><row><cell>GNER</cell><cell>LLaMAntino-3</cell><cell>IT</cell><cell>8B</cell><cell>90.3</cell><cell>88.9</cell><cell>82.5</cell><cell>1.2</cell></row><row><cell>SLIMER-IT</cell><cell>LLaMAntino-3</cell><cell>IT</cell><cell>8B</cell><cell>85.8</cell><cell>82.5</cell><cell>81.7</cell><cell>54.7</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 4</head><label>4</label><figDesc>Some examples of definitions and guidelines. We compare SLIMER-IT (LLaMAntino-based) with a version of it devoid of Definition and Guidelines (D&amp;G), in order to gain better insight into the usefulness of such components in zero-shot NER. Absolute F1 gains between SLIMER-IT and its version without definition and guidelines are reported. In green we highlight examples on unseen named entities, in blue examples on known tags such as person, organization and location, but in Out-Of-Domain input distributions. Definizione: 'MEDIA' si riferisce a entità come nomi di giornali, riviste, libri, album musicali, film, programmi televisivi, spettacoli teatrali e altre opere creative e di comunicazione. Linee Guida: Assicurati di etichettare solo nomi specifici di opere creative e di comunicazione, evitando generici come 'musica' o 'libro'. Presta attenzione alle ambiguità, ad esempio 'Apple' potrebbe riferirsi alla società tecnologica o ad un'opera d'arte. Escludi i nomi di artisti, autori o registi, che dovrebbero essere etichettati come 'persona', e nomi generici di strumenti musicali o generi letterari che non rappresentano opere specifiche. Definizione: 'LUOGO' denota nomi propri di luoghi geografici, comprendendo città, paesi, stati, regioni, continenti, punti di interesse naturale, e indirizzi specifici. Linee Guida: Assicurati di non confondere i nomi di luoghi con nomi di persone, organizzazioni o altre entità. Ad esempio, 'Washington' potrebbe riferirsi alla città di Washington D.C. o al presidente George Washington, quindi considera attentamente il contesto. Escludi nomi di periodi storici, eventi o concetti astratti che non rappresentano luoghi fisici. Ad esempio, 'nel Rinascimento' è un periodo storico, non un luogo geografico. Definizione: 'ORGANIZZAZIONE' denota nomi propri di aziende, istituzioni, gruppi o altre entità organizzative. Questo tipo di entità include sia entità private che pubbliche, come società, organizzazioni non profit, agenzie governative, università e altri gruppi strutturati. Linee Guida: Annota solo nomi propri, evita di annotare sostantivi comuni come 'azienda' o 'istituzione' a meno che non facciano parte del nome specifico dell'organizzazione. Assicurati di non annotare nomi di persone come organizzazioni, anche se contengono termini che potrebbero sembrare riferimenti a entità organizzative. Ad esempio, 'Johnson &amp; Johnson' è un'azienda, mentre 'Johnson' da solo potrebbe essere il cognome di una persona. Definizione: 'PERSONA' denota nomi propri di individui umani. Questo tipo di entità comprende nomi di persone reali, famose o meno, personaggi storici, e può includere anche personaggi di finzione. Linee Guida: Fai attenzione a non includere titoli o ruoli professionali senza nomi propri (es. 'il presidente' non è una 'PERSONA', ma 'il presidente Barack Obama' sì).</figDesc><table><row><cell></cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_0">https://github.com/Babelscape/multinerd</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_1">https://huggingface.co/teelinsan/camoscio-7b-llama</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_2">https://huggingface.co/swap-uniba/ LLaMAntino-3-ANITA-8B-Inst-DPO-ITA</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_3">https://github.com/dhfbk/bert-ner</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_4">https://huggingface.co/dbmdz/bert-base-italian-cased</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_5">https://huggingface.co/DeepMount00/universal_ner_ita</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_6">https://huggingface.co/DeepMount00/GLiNER_ITA_LARGE</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>The work was partially funded by: • "ReSpiRA - REplicabilità, SPIegabilità e Ragionamento", a project financed by FAIR, affiliated to spoke no. 2, falling within the PNRR MUR programme, Mission 4, Component 2, Investment 1.3, D.D. No. 341 of 03/15/2022, Project PE0000013, CUP B43D22000900004; • "MAESTRO - Mitigare le Allucinazioni dei Large Language Models: ESTRazione di informazioni Ottimizzate", a project funded by Provincia Autonoma di Trento with the Lp 6/99 Art. 5 (ricerca e sviluppo), PAT/RFS067-05/06/2024-0428372, CUP C79J23001170001; • "enRichMyData - Enabling Data Enrichment Pipelines for AI-driven Business Products and Services", a Horizon Europe (HE) project, grant agreement ID: 101070284.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">NERMuD at EVALITA 2023: Overview of the named-entities recognition on multi-domain documents task</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">P</forename><surname>Aprosio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Paccosi</surname></persName>
		</author>
		<ptr target="https://api.semanticscholar.org/CorpusID:261529782" />
	</analytic>
	<monogr>
		<title level="m">International Workshop on Evaluation of Natural Language and Speech Tools for Italian</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note>short paper</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">A survey on deep learning for named entity recognition</title>
		<author>
			<persName><forename type="first">J</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Li</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Knowledge and Data Engineering</title>
		<imprint>
			<biblScope unit="volume">34</biblScope>
			<biblScope unit="page" from="50" to="70" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">O</forename><surname>Sainz</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2310.03668</idno>
		<title level="m">GoLLIE: Annotation guidelines improve zero-shot information-extraction</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><forename type="first">W</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Gu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Poon</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2308.03279</idno>
		<title level="m">UniversalNER: Targeted distillation from large language models for open named entity recognition</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Language models are unsupervised multitask learners</title>
		<author>
			<persName><forename type="first">A</forename><surname>Radford</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">OpenAI blog</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page">9</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Language models are few-shot learners</title>
		<author>
			<persName><forename type="first">T</forename><surname>Brown</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in neural information processing systems</title>
		<imprint>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="page" from="1877" to="1901" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Finetuned language models are zero-shot learners</title>
		<author>
			<persName><forename type="first">J</forename><surname>Wei</surname></persName>
		</author>
		<ptr target="https://openreview.net/forum?id=gEZrGCozdqR" />
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">W</forename><surname>Chung</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2210.11416</idno>
		<title level="m">Scaling instruction-finetuned language models</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>


<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Super-NaturalInstructions: Generalization via declarative instructions on 1600+ NLP tasks</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2022.emnlp-main.340</idno>
		<ptr target="https://aclanthology.org/2022.emnlp-main.340" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics</title>
				<editor>
			<persName><forename type="first">Y</forename><surname>Goldberg</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Z</forename><surname>Kozareva</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</editor>
		<meeting>the 2022 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics<address><addrLine>Abu Dhabi, United Arab Emirates</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="5085" to="5109" />
		</imprint>
	</monogr>
</biblStruct>


<biblStruct xml:id="b11">
	<monogr>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Zu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Xia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ye</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Gui</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2304.08085</idno>
		<title level="m">InstructUIE: Multi-task instruction tuning for unified information extraction</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<author>
			<persName><forename type="first">U</forename><surname>Zaratiana</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Tomeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Holat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Charnois</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2311.08526</idno>
		<title level="m">GLiNER: Generalist model for named entity recognition using bidirectional transformer</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Ding</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Tang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Yan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zhang</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2402.16602</idno>
		<title level="m">Rethinking negative instances for generative named entity recognition</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<title level="m" type="main">Show less, instruct more: Enriching prompts with definitions and guidelines for zero-shot NER</title>
		<author>
			<persName><forename type="first">A</forename><surname>Zamai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Zugarini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Rigutini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ernandes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Maggini</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2407.01272</idno>
		<ptr target="https://arxiv.org/abs/2407.01272" />
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Named entity recognition: Fallacies, challenges and opportunities</title>
		<author>
			<persName><forename type="first">M</forename><surname>Marrero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Urbano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Sánchez-Cuadrado</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Morato</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Gómez-Berbís</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.csi.2012.09.004</idno>
		<ptr target="https://doi.org/10.1016/j.csi.2012.09.004" />
	</analytic>
	<monogr>
		<title level="j">Computer Standards &amp; Interfaces</title>
		<imprint>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="page" from="482" to="489" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">I-CAB: the Italian content annotation bank</title>
		<author>
			<persName><forename type="first">B</forename><surname>Magnini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Pianta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Girardi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Negri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Romano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Speranza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Bartalesi Lenzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Sprugnoli</surname></persName>
		</author>
		<ptr target="http://www.lrec-conf.org/proceedings/lrec2006/pdf/518_pdf.pdf" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC&apos;06), European Language Resources Association (ELRA)</title>
				<editor>
			<persName><forename type="first">N</forename><surname>Calzolari</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Choukri</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Gangemi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">B</forename><surname>Maegaard</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Mariani</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Odijk</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Tapias</surname></persName>
		</editor>
		<meeting>the Fifth International Conference on Language Resources and Evaluation (LREC&apos;06), European Language Resources Association (ELRA)<address><addrLine>Genoa, Italy</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Named entity recognition on transcribed broadcast news at EVALITA 2011</title>
		<author>
			<persName><forename type="first">V</forename><surname>Bartalesi Lenzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Speranza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Sprugnoli</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Evaluation of Natural Language and Speech Tools for Italian</title>
				<editor>
			<persName><forename type="first">B</forename><surname>Magnini</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Cutugno</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Falcone</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Pianta</surname></persName>
		</editor>
		<meeting><address><addrLine>Berlin Heidelberg; Berlin, Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="86" to="97" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<author>
			<persName><forename type="first">P</forename><surname>Basile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Caputo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gentile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Rizzo</surname></persName>
		</author>
		<title level="m">Overview of the EVALITA 2016 named entity recognition and linking in Italian tweets (NEEL-IT) task</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">ExtremITA at EVALITA 2023: Multi-task sustainable scaling to large language models at its extreme</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">D</forename><surname>Hromei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Croce</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Basile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Basili</surname></persName>
		</author>
		<ptr target="https://ceur-ws.org/Vol-3473/paper13.pdf" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2023)</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<editor>
			<persName><forename type="first">M</forename><surname>Lai</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Menini</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Polignano</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">V</forename><surname>Russo</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Sprugnoli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Venturi</surname></persName>
		</editor>
		<meeting>the Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2023)<address><addrLine>Parma, Italy</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">September 7-8, 2023</date>
			<biblScope unit="volume">3473</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">KIND: an Italian multi-domain dataset for named entity recognition</title>
		<author>
			<persName><forename type="first">T</forename><surname>Paccosi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">Palmero</forename><surname>Aprosio</surname></persName>
		</author>
		<ptr target="https://aclanthology.org/2022.lrec-1.52" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Thirteenth Language Resources and Evaluation Conference, European Language Resources Association</title>
				<editor>
			<persName><forename type="first">N</forename><surname>Calzolari</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Béchet</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Blache</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Choukri</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Cieri</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Declerck</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Goggi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Isahara</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">B</forename><surname>Maegaard</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Mariani</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Mazo</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Odijk</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Piperidis</surname></persName>
		</editor>
		<meeting>the Thirteenth Language Resources and Evaluation Conference, European Language Resources Association<address><addrLine>Marseille, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="501" to="507" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">MultiNERD: A multilingual, multi-genre and fine-grained dataset for named entity recognition (and disambiguation)</title>
		<author>
			<persName><forename type="first">S</forename><surname>Tedeschi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Navigli</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2022.findings-naacl.60</idno>
		<ptr target="https://aclanthology.org/2022.findings-naacl.60" />
	</analytic>
	<monogr>
		<title level="m">Findings of the Association for Computational Linguistics: NAACL 2022, Association for Computational Linguistics</title>
				<meeting><address><addrLine>Seattle, United States</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="801" to="812" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<title level="m" type="main">Camoscio: An Italian instruction-tuned LLaMA</title>
		<author>
			<persName><forename type="first">A</forename><surname>Santilli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Rodolà</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2307.16456</idno>
		<ptr target="https://arxiv.org/abs/2307.16456" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<monogr>
		<author>
			<persName><forename type="first">H</forename><surname>Touvron</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2307.09288</idno>
		<title level="m">Llama 2: Open foundation and fine-tuned chat models</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<monogr>
		<title level="m" type="main">Mistral 7B</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">Q</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sablayrolles</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mensch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Bamford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">S</forename><surname>Chaplot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>De Las Casas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Bressand</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Lengyel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Lample</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Saulnier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">R</forename><surname>Lavaud</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-A</forename><surname>Lachaux</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Stock</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">L</forename><surname>Scao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Lavril</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Lacroix</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">E</forename><surname>Sayed</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2310.06825</idno>
		<ptr target="https://arxiv.org/abs/2310.06825" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<monogr>
		<title level="m" type="main">Advanced natural-based interaction for the Italian language: LLaMAntino-3-ANITA</title>
		<author>
			<persName><forename type="first">M</forename><surname>Polignano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Basile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Semeraro</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2405.07101</idno>
		<ptr target="https://arxiv.org/abs/2405.07101" />
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
