<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>La non canonica l'hai studiata? Exploring LLMs and Sentence Canonicity in Italian</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Claudiu Daniel Hromei</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Danilo Croce</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rodolfo Delmonte</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Roberto Basili</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Ca' Foscari University</institution>
          ,
          <addr-line>Venice</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Enterprise Engineering, University of Rome Tor Vergata</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
<p>This paper investigates the ability of Large Language Models (LLMs) to differentiate between canonical and non-canonical sentences in Italian, employing advanced neural architectures like LLaMA and its adaptations. Canonical sentences adhere to the standard Subject-Verb-Object (SVO) structure. We hypothesize that recent generative LLMs are heavily influenced by the English language, where non-canonical structures are very rare. Using the in-context learning technique, we probe these models and further fine-tune them for this specific task. Initial results indicate that these models continue to struggle with this task even after fine-tuning. Additionally, we introduce a test set comprising several hundred sentences from the poetry domain, which presents significant challenges for the canonical structure task.</p>
      </abstract>
      <kwd-group>
        <kwd>Large Language Models</kwd>
        <kwd>Italian Sentence Structure</kwd>
        <kwd>Non-Canonical Structures</kwd>
        <kwd>In-Context Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In Italian, non-projectivity in written texts is 7%, based on 230,629 constituents. Compared to Latin, where the non-projectivity index is 6.65% in the Latin Dependency Treebank containing about 55,000 tokens, Italian and Latin are quite similar. In contrast, English tree projectivity in the Penn Treebank (PT), where the majority of the data corresponds to articles from the Wall Street Journal (WSJ), shows much lower numbers: with 720,086 constituents, the non-projectivity index is 0.01004%.</p>
      <p>Thus, Italian speakers have high expectancies for the presence of an NCS due to processing difficulties also raised by the number of unexpressed subjects: 61% of all Inflected Propositions lack a lexically expressed subject. This does not apply to English speakers, for whom NCS are infrequent and context-specific. In this view, Italian is considered unique for its use of many of the non-canonical structures found in contemporary poetry and examined in this experiment. The richness and freedom of the language give speakers the ability to produce such a diverse typology of non-canonical structures, which stems from its Latin heritage, with the Null Subject being one of the most well-known features. Like many other languages, including Spanish, Portuguese, and Catalan, as well as Chinese, Japanese, Slavic languages, Greek, and Hebrew, Italian is a Null Subject Language. However, this parameter alone does not fully explain the richness and complexity of syntactic structures seen in Italian poetry. While other Romance languages share similar syntactic traits, the specific linguistic legacy and poetic traditions of Italian give it a unique character in this regard.</p>
      <p>In this paper, we want to analyze the ability of recently proposed Large Language Models to detect non-canonical sentences in Italian. Our hypothesis is that, given the very large percentage of English training data (usually more than 90%) and the very low percentage of Italian training data (usually less than 1%), these models have a limited capacity to process such structures and rely mostly on English word-order patterns. On the other hand, the models that have been specifically adapted or fine-tuned on Italian data should show a better understanding of canonicity in Italian.</p>
      <p>In the rest of the paper, Section 2 describes the related work, Section 3 shows our approach to recognizing canonical structures, Section 4 presents and discusses the results, and Section 5 derives the conclusions.</p>
      <p>1 Elizabethan English was more similar to Italian in its variety of syntactic structures. 2 In English: “Always dear to me was this solitary hill and this hedge which from large side of the ultimate horizon the gaze excludes”.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>Our approach has been previously adopted by other researchers, but with slightly different aims, as described below. Initial attempts at parsing Italian treebanks of constituent structures focused on two small treebanks: TUT [8, 9] and ISST [10], containing approximately 3,500 and 3,000 sentences, respectively. Illo tempore, these efforts yielded an F1 score of 82.96%, while comparable parsers (Stanford, Collins, and MaltParser) achieved about 92.10% on the WSJ treebank. The lower performance in Italian was primarily due to two factors: a higher number of non-canonical structures (i.e., word order variations) and the presence of pro-drop clauses, where the subject is lexically omitted, a challenge also documented for other similar languages [11].</p>
      <p>Significant improvements in parsing performance were noted in a paper on the EVALITA shared task on constituency parsing, where the best F1 score increased from 70% to 84% [12], attributed to the near doubling of training samples between 2007 and 2011. In [13], the authors presented a new dataset of Italian based on “marked” sentences to test the performance of the neural parser TINT. The result for LAS dependency structures was 77% accuracy, three points below the best result on the UD corpus of Italian, which was 80%. This outcome confirmed previous findings with a small dataset of strongly marked sentences, where accuracy was below 50%. The authors detailed seven types of marked structures in their treebank corpus: cleft, left-dislocated, right-dislocated, presentative “ci” (there in English), inverted subject, pseudo-clefts, and hanging topic, with cleft and left-dislocated sentences being the most common.</p>
      <p>In this context, it is interesting to explore the capabilities of state-of-the-art methods for addressing the problem of distinguishing between canonical and non-canonical sentences in Italian. This exploration is motivated by the complexity and richness of Italian syntax, which presents unique challenges for natural language processing models. Most current state-of-the-art models are based on the Transformer architecture [14]. This game-changing model comprises two main components, leading to different model families. The encoder, used in models like BERT [15], RoBERTa [16], and Sentence BERT [17], encodes input sequences using self-attention. In contrast, decoders, such as GPT [18], GPT-3 [19], and LLaMA [20], generate output sequences auto-regressively. Beyond these, encoder-decoder models like T5 [21] and BART [22] integrate both components, excelling in tasks such as translation, summarization, and question-answering.</p>
      <p>One notable Transformer-based architecture is the LLaMA foundational model [20]. LLaMA is a large model with billions of parameters that generates output sequences auto-regressively based on the input and previously generated tokens. It has recently been applied to a variety of linguistic tasks by instruction-tuning a monolithic architecture to solve them all [23]. This family of models is promising as they rely on auto-regressive generation methods and, thanks to their massive amount of training data and parameters, can solve a plethora of linguistic tasks. Additionally, [24] demonstrated the application of LLaMA-family models for syntactic parsing across multiple languages, highlighting the capability of the model to analyze and detect sentence structures. This work underscores the versatility of large language models in handling diverse syntactic frameworks, further probing their performance in cross-linguistic scenarios. Finally, architectures specifically adapted for Italian, such as Camoscio [25] and LLaMAntino [26], are tuned with instruction datasets for the Italian language, starting from the original LLaMA model and its second variant, LLaMA2-chat, respectively. They demonstrate a strong understanding of the language and an excellent ability to generate appropriate responses.</p>
      <p>In this paper, we aim to explore the ability of Large Language Models (LLMs) to distinguish between canonical and non-canonical sentences in Italian using neural architectures such as LLaMA and its various adaptations, as discussed in the next Section. It is interesting to note that, in the future, one might explore probing for syntax at the intermediate layers of the various models.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Recognizing Canonical Structures through LLMs</title>
      <p>To address the capabilities of Large Language Models in recognizing the canonical structures, they can be utilized through In-Context Learning techniques [27] or by directly fine-tuning the model for specific downstream tasks. In-context learning relies on the model’s pre-existing knowledge acquired during pre-training and on instructions provided in natural language at inference time. This method does not involve additional training and can be categorized based on the number of examples provided: i) 0-shot Learning, where no examples are given and the model generates responses based solely on its pre-existing knowledge and the provided instructions; ii) 1-shot Learning, where one example per class (positive and negative in our case) is added to provide a more precise context; these examples help the model better understand the task by offering a concrete reference point; iii) Few-shot Learning, where more than one example per class is provided to give the model additional contextual information during decision-making. This approach is particularly effective when very few examples (such as 2 or 4) are given, but it can be extended up to the maximum input context length.</p>
      <p>For both one-shot and few-shot learning approaches, a key challenge is selecting the most informative examples to provide during inference. One effective strategy is to retrieve examples that are most similar to the current sequence to be classified, focusing on those with a similar structure or meaning. A commonly used method for this is to generate vector embeddings of sentences using a model like sBERT [17]. This model produces a contextualized vector that represents the information contained in a sentence. By applying Cosine Similarity, we can rank these vectors and select the training examples most similar to the input sequence. This process ensures that the model is supplied with the most relevant solved examples for a given input. It is important to note that these examples may not always capture the same explicit syntax representation as a Tree Kernel [28] function would, in which every word of the sentence is explicitly annotated with syntactic information and linked to the others. However, the crucial aspect is that the examples provided are sufficiently similar in meaning and context, and the sBERT architecture is very effective.</p>
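      <p>As an illustration, this retrieval step can be sketched in a few lines of Python with the sentence-transformers library. This is a minimal sketch under assumed settings: the encoder name and the two toy training pairs are placeholders, not the configuration used in the experiments.</p>
      <preformat># Minimal sketch of similarity-based example selection for 1-shot/few-shot
# prompting (assumed setup, not the exact configuration of the paper).
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

# Hypothetical solved training examples: (sentence, label) pairs.
train = [
    ("Il gatto dorme sul divano.", "sì"),
    ("La mela, Maria la mangia.", "no"),
]
train_emb = encoder.encode([s for s, _ in train], convert_to_tensor=True)

def top_k_examples(sentence, k=2):
    """Rank the training examples by cosine similarity to the input."""
    query = encoder.encode(sentence, convert_to_tensor=True)
    scores = util.cos_sim(query, train_emb)[0]
    ranked = sorted(range(len(train)), key=lambda i: float(scores[i]),
                    reverse=True)
    return [train[i] for i in ranked[:k]]</preformat>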
      <p>When the model’s pre-existing knowledge is insufficient, we can fine-tune it on the downstream task. Fine-tuning involves training the model in a traditional manner using input-output pairs (training data) to adjust its parameters. This process improves the model’s performance on specific tasks, allowing it to learn from a more extensive set of examples. As a result, the model becomes more adept at handling similar queries, with a focus on the specific task at hand. By leveraging these techniques, LLMs can recognize and respond to canonical structures with varying degrees of efficiency and accuracy.</p>
      <sec id="sec-3-1">
        <title>3.1. Training LLMs against non-Canonical Structures</title>
        <p>To interact with the models, we need a sufficiently detailed prompt, which includes a natural language description of the task (i.e., the rules to determine whether an Italian sentence follows the canonical structure) and specifies the type of answer we expect the LLM to produce: Sì (Yes in English) if the sentence is canonical and follows the rules, or No otherwise. For the training and the 0-shot strategy, we used the following prompt:</p>
        <p>“Dimmi se la seguente frase ha una struttura canonica o meno. Per Canonica si intende una frase che segue una struttura standard per ogni verbo presente. Più nello specifico, le frasi canoniche seguono queste regole: contengono SOLO sequenze del tipo nome o strutture nominali SEGUITE da struttura verbale a sua volta seguita (oppure no) da complementi OPPURE contengono SOLO sequenze composte da struttura verbale seguita da complementi, dove: STRUTTURE VERBALI sono sequenze composte da ausiliare o/e modale e verbo, e tra i due ci può essere un avverbio oppure strutture preposizionali; COMPLEMENTI sono strutture nominali oppure strutture preposizionali oppure strutture frasali oppure strutture infinitivali. Tutte le altre frasi sono da considerarsi come Non Canoniche. Riguardo il prossimo input, rispondi ’sì’ se è ’canonico’, ’no’ se è ’non canonico’.”</p>
        <p>In English: “Tell me whether the following sentence has a canonical structure or not. By Canonical we mean a sentence that follows a standard structure for each verb it contains. More specifically, canonical sentences follow these rules: they contain ONLY sequences of the type noun or nominal structures FOLLOWED by a verbal structure, in turn followed (or not) by complements, OR they contain ONLY sequences composed of a verbal structure followed by complements, where: VERBAL STRUCTURES are sequences composed of an auxiliary and/or modal and a verb, and between the two there can be an adverb or prepositional structures; COMPLEMENTS are nominal structures or prepositional structures or clausal structures or infinitival structures. All other sentences are to be considered Non-Canonical. Regarding the next input, answer ’sì’ if it is ’canonical’, ’no’ if it is ’non-canonical’.”</p>
        <p>For the 1-shot scenario, immediately after the above prompt, we append the following instruction, where the two provided examples are selected as the most relevant for the input example:</p>
        <p>“Ti faccio un paio di esempi: &lt;Positive_Example&gt; e devi rispondere sì. &lt;Negative_Example&gt; e devi rispondere no.”</p>
        <p>In English: “Let me give you a couple of examples: &lt;Positive_Example&gt; and you must answer sì. &lt;Negative_Example&gt; and you must answer no.”</p>
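        <p>Putting the pieces together, the following sketch shows how the 0-shot and 1-shot prompts can be assembled. The function names are illustrative, top_k_examples is the retrieval sketch above, TASK_DESCRIPTION abbreviates the Italian instruction just quoted, and the final “Input:” line is our own assumption about how the sentence is appended.</p>
        <preformat># Illustrative prompt assembly for the 0-shot and 1-shot settings.
# TASK_DESCRIPTION abbreviates the Italian instruction quoted above.
TASK_DESCRIPTION = "Dimmi se la seguente frase ha una struttura canonica o meno. [...]"

def build_prompt(sentence, examples=None):
    """0-shot if examples is None; 1-shot (or few-shot) otherwise."""
    parts = [TASK_DESCRIPTION]
    if examples:
        parts.append("Ti faccio un paio di esempi:")
        for text, label in examples:  # (sentence, "sì"/"no") pairs
            parts.append(f"{text} e devi rispondere {label}.")
    parts.append(f"Input: {sentence}")  # assumed input format
    return "\n".join(parts)

# 1-shot prompt with the most similar solved examples (via the sBERT sketch).
sentence = "Sempre caro mi fu quest'ermo colle."
prompt = build_prompt(sentence, examples=top_k_examples(sentence))</preformat>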
        <p>When fine-tuning a model, a highly detailed prompt might seem excessive, especially since traditional training involves repeating the prompt multiple times. However, our hypothesis is that clearly explaining the task to the model aids in faster convergence of the parameters and a more rapid reduction in loss during training. This is the reason why our prompt includes a comprehensive description of the canonical sentence structure. This description details the constraints each verb must adhere to, the types of sequences a sentence can contain, the verbal structures, and the order of complements. If a verb does not adhere to these constraints, the sentence should be classified as Non-Canonical.</p>
        <p>For training, we used the VIT Treebank [30], which contains approximately 320,000 words. Among other information, each sentence is categorized as canonical or not. The dataset was divided into a Training set and a Development set with a 90/10 ratio. The class distribution is shown in Figure 1, where it is evident that the vast majority of the sentences are canonical, reflecting the natural usage patterns of Italian speakers.</p>
        <p>We employed the LoRA [31] technique and the Peft package on a single Tesla T4 GPU to train the models for 3 epochs, with a learning rate of 3e-4 and a linear scheduler with 10% warmup. The LoRA r parameter was set to 8, α to 16, and all available layers were involved (for more details, refer to the original paper [31]). For computational efficiency, the floating-point precision of the parameters was set to 8 bits, allowing the use of a single GPU.</p>
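        <p>A configuration of this kind can be sketched with the Hugging Face transformers and peft packages as follows. Only the hyper-parameters quoted above (r = 8, α = 16, learning rate 3e-4, 3 epochs, linear scheduler with 10% warmup, 8-bit loading) come from the text; the base checkpoint name and the rest of the pipeline are placeholders.</p>
        <preformat># Sketch of LoRA fine-tuning via peft; hyper-parameters follow the values
# quoted in the text, while the base checkpoint is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-7b-hf"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, load_in_8bit=True)

lora = LoraConfig(r=8, lora_alpha=16, target_modules="all-linear",
                  task_type="CAUSAL_LM")  # involve all available layers
model = get_peft_model(model, lora)

args = TrainingArguments(
    output_dir="canonicity-lora",
    num_train_epochs=3,
    learning_rate=3e-4,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,  # 10% warmup
)</preformat>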
        <p>For the Test set, we used a collection of Italian poetry comprising 51 texts with a total of 303 sentences. For the same reason that people still regard Dante as the greatest Italian poet and students are required to learn his best poems by heart, we have chosen what is regarded as the best contemporary Italian poetry: a manually curated collection of excerpts from Italian poems from the late 19th and early 20th centuries. In particular, we used poems by the 1975 Nobel laureate Eugenio Montale, with about one hundred excerpts taken from the volume “Ossi di Seppia”. The class distribution of this test set is shown in Figure 1. Notably, the distribution of Yes (the sentence is canonical) and No (the sentence is non-canonical) is reversed compared to the Training and Development sets, due to poetic license and rhyming constraints. This reversal poses a significant challenge for the models we trained, but it presents an interesting test case. More details about this and a simple Error Analysis are presented in Appendix B.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. LLM Architectures for non-Canonical Structures</title>
        <p>Today, the landscape of Large Language Models (LLMs) is vast, making it challenging to choose the most suitable model. In this paper, we focus on several well-known models from the LLaMA family: LLaMA1 [20], the first in the series; LLaMA2 [29], which introduced minor improvements in the Transformer architecture; Camoscio [25], an instruction-tuned LLaMA model fine-tuned on Italian data; ExtremITA [23], an architecture designed for a wide range of Italian tasks; and LLaMAntino [26], an adaptation of the original LLaMA2 model for the Italian language.</p>
        <p>We expect the best-performing models to be those specifically adapted or fine-tuned on Italian data, such as Camoscio, ExtremITA, or LLaMAntino. One significant issue with the English models is that non-canonicity is very rare in English, as the language predominantly follows the Subject-Verb-Object structure, which is canonical, with very few (grammatically correct) non-canonical examples.</p>
        <p>In this context, it is important to note that the consideration of structures which, in Chomskyan transformational theory, were once viewed as surface-level realizations of deep canonical structures has not been a deliberate focus of this experiment. The first reason for excluding structures like passives, interrogatives, relative clauses, cleft sentences, tough constructions, and others, is their relative scarcity in poetry, though they are more frequent in prose. A second reason, closely tied to the first, is that these common structures do not add an element of surprise, given their frequency in everyday language use. That said, some of these common non-canonical structures can still be found in Italian literary prose, but not all are represented in the examples we studied. On the other hand, focus fronting (also referred to as object preposing, complement preposing, or full argument inversion, depending on the constituent being fronted) is prevalent in the examples included in the experiment. An exemplar list of such structures can be found in Appendix C.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Empirical Investigation</title>
      <p>In this setup, the models trained and those utilized in the k-shot scenario are required to answer Yes if the given text is canonical and follows the rules, or No otherwise.</p>
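      <p>Under these assumptions, the inference step can be sketched as follows. The generation call uses the standard Hugging Face API, build_prompt is the illustrative helper from Section 3.1, the model and tokenizer are those of the previous sketches, and the normalization of the generated answer to a binary label is our own simplification.</p>
      <preformat># Sketch: query the model and map its free-text answer to a binary label.
import torch

def classify(sentence, examples=None):
    prompt = build_prompt(sentence, examples)
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=5)
    new_tokens = out[0][inputs["input_ids"].shape[1]:]  # strip the prompt
    answer = tokenizer.decode(new_tokens, skip_special_tokens=True)
    # "sì" means canonical (Yes); anything else counts as No here.
    return "yes" if answer.strip().lower().startswith("sì") else "no"</preformat>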
      <sec id="sec-4-1">
        <title>4.1. Results and Discussion</title>
        <p>The models used in this paper are those already anticipated in Section 3.2, available from Huggingface, using the prompt described in Section 3.1. The results are available in Table 1. Given the distribution of the sentences of the Training set, we report a simple but informed Yes-Baseline. This baseline cannot perform well on the inverted distribution of the Test set, as it always answers Yes. We first used the LLMs anticipated in Section 3.2 in a 0-shot manner: one can notice an overall good ability to detect the non-canonical sentences, reaching 73% Precision and 93% Recall for Camoscio, which however still struggles to identify the canonical ones. We hoped to heavily boost the performance of the models in the 1-shot scenario3, but it seemed to decrease. The same trend can be noted for all the other models. As a second comparison, we trained an Italian BERT model for 3 epochs, which starts showing some awareness of the task, reaching an overall 40% Micro-F1. Using our Development set, we selected only the best LLM to report here for space constraints, which is based on Camoscio [25]. Finally, the fine-tuned model reaches the best performance, with a very good Precision (98%) for the non-canonical sentences and a very good Recall (98%) for the canonical ones, with a final 60% for both Macro and Micro F1.</p>
        <p>3 We experimented with more than 1 example per class, increasing the number of samples up to a 16-shot scenario. Unfortunately, the performance did not increase but stalled around 60% Micro-F1. We did not report these results here for space constraints.</p>
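        <p>For reference, the Yes-Baseline and the reported metrics can be computed with scikit-learn as in this sketch; gold and pred are illustrative label lists, not the actual predictions.</p>
        <preformat># Sketch: Yes-Baseline and Precision/Recall/F1 with scikit-learn.
from sklearn.metrics import f1_score, precision_recall_fscore_support

gold = ["yes", "no", "no", "yes"]  # illustrative gold labels
pred = ["yes"] * len(gold)         # the informed Yes-Baseline always says Yes

macro_f1 = f1_score(gold, pred, average="macro")
micro_f1 = f1_score(gold, pred, average="micro")
per_class = precision_recall_fscore_support(gold, pred,
                                            labels=["yes", "no"],
                                            zero_division=0)</preformat>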
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Corpus Analysis</title>
        <p>For a better insight into the measured performance, we studied the role of the training material as representative of the adopted test dataset. We analyzed the test dataset in terms of average word frequencies, as observed on the ITWaC corpus4. This corpus provides pre-computed frequencies for each word: for comparative reasons, we normalized them in [0, 1] and measured them for each sentence in terms of the mean frequency, i.e., the average of the word frequencies over each sentence. By independently averaging the frequencies of canonical and non-canonical sentences, we obtained the following figures:</p>
        <list list-type="bullet">
          <list-item><p>Canonical Sentences, AVG frequency: 0.38</p></list-item>
          <list-item><p>Non-Canonical Sentences, AVG frequency: 0.24</p></list-item>
        </list>
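        <p>The per-sentence figure can be computed as in the following sketch; itwac_freq is a tiny placeholder for the ITWaC-derived frequency dictionary, already normalized in [0, 1], and treating out-of-vocabulary words as 0 is our own simplifying assumption.</p>
        <preformat># Sketch: mean normalized word frequency of a sentence.
# itwac_freq maps a word to its ITWaC frequency normalized in [0, 1];
# this tiny dictionary is only a placeholder.
itwac_freq = {"il": 0.98, "gatto": 0.41, "dorme": 0.27}

def mean_frequency(sentence):
    words = sentence.lower().split()
    # Out-of-vocabulary words contribute 0 (simplifying assumption).
    return sum(itwac_freq.get(w, 0.0) for w in words) / max(len(words), 1)</preformat>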
        <p>Intuitively, a value approaching 1 characterizes highly frequent words in ITWaC: this suggests that they are well represented in the original LLM. Conversely, values closer to 0 characterize less represented sentences. Notice that only canonical sentences (AVG 0.38) are represented, although in a limited manner, in standard Italian texts. This result sheds light on the specific relationship between word frequencies and training: LLMs, particularly Camoscio, are more “confident” with words they encountered during pre-training or fine-tuning. It is noticeable that almost 50% of our test set words (adjectives, verbs, nouns) do not even occur in ITWaC and, in fact, they are also absent from any canonical sentence of the training set. Another issue lies in the pre-training data of these LLMs. Since most of the data is in English (over 88%) and non-canonical sentences are extremely rare in English, models like LLaMA or Camoscio have rarely encountered such data, leading to suboptimal performance. Moreover, the length of the sentence could be a factor influencing the performance of LLMs, specifically in poetry, in the ability to detect canonical or non-canonical sentences.</p>
        <p>4 https://www.sketchengine.eu/itwac-italian-corpus/</p>
        <p>Therefore, to achieve a more balanced evaluation, we merged the Training, Development, and Testing sets into a single dataset to balance the classes and ensure that the model learns to recognize non-canonical sentences. We then performed an N-Fold Cross-Validation (N = 5). Only the trained model was re-evaluated, and the results are presented in Table 2. We maintained the simple and informed Yes-Baseline for comparison and recomputed its performance. In this setting, the class distribution aligns again with the Training set. The fine-tuned Camoscio model now shows very good performance in distinguishing canonical sentences, achieving a Macro-F1 of 88% and a Micro-F1 of 90%.</p>
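        <p>The merged-dataset protocol corresponds to a standard stratified 5-fold split, as in this scikit-learn sketch; the sentences and labels are illustrative placeholders.</p>
        <preformat># Sketch: 5-fold cross-validation over the merged, class-balanced dataset.
from sklearn.model_selection import StratifiedKFold

merged = [f"frase {i}" for i in range(10)]  # illustrative sentences
labels = ["yes", "no"] * 5                  # illustrative balanced labels

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, test_idx) in enumerate(skf.split(merged, labels)):
    # Fine-tune on the training folds and evaluate on the held-out fold.
    print(fold, len(train_idx), len(test_idx))</preformat>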
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>In this study, we have shown the potential of Large Language Models, particularly the LLaMA architecture and its Italian adaptations, in distinguishing between canonical and non-canonical sentences in Italian. Our experiments indicate that models instruction-tuned specifically for Italian, such as Camoscio and LLaMAntino, exhibit a strong grasp of Italian syntax and can effectively handle diverse sentence structures. However, performance on this task is still penalized by the large portion of English data these models ingest during pre-training.</p>
      <p>The findings underscore the importance of tailored language models for specific languages and the benefits of incorporating extensive syntactic variations into training datasets. Future work should focus on expanding the training datasets with more diverse syntactic structures and improving model architectures to better capture the nuances of non-canonical sentences.</p>
    </sec>
    <sec id="sec-ack">
      <title>Acknowledgments</title>
      <p>Claudiu Daniel Hromei is a Ph.D. student enrolled in the National Ph.D. in Artificial Intelligence, XXXVII cycle, course on Health and life sciences, organized by the Università Campus Bio-Medico di Roma. We acknowledge financial support from the PNRR MUR project PE0000013-FAIR and support from Project ECS 0000024 Rome Technopole, CUP B83C22002820006, NRP Mission 4 Component 2 Investment 1.5, funded by the European Union - NextGenerationEU.</p>
    </sec>
    <sec id="sec-app-a">
      <title>A. Data Distribution and Limitations</title>
      <p>In assessing the data distribution disparities between languages in the pre-training phase of the LLaMA family models, we provide an illustrative breakdown in Table 3, where English accounts for nearly 90% of the data, while Italian is present in less than 1%.</p>
      <table-wrap id="tab3">
        <label>Table 3</label>
        <caption><p>Data distribution of the languages in the LLaMA pre-training data.</p></caption>
        <table>
          <thead>
            <tr><th>Code</th><th>Language</th><th>Percentage</th></tr>
          </thead>
          <tbody>
            <tr><td>en</td><td>English</td><td>89.70%</td></tr>
            <tr><td>unk</td><td>unknown</td><td>8.38%</td></tr>
            <tr><td>de</td><td>German</td><td>0.17%</td></tr>
            <tr><td>fr</td><td>French</td><td>0.16%</td></tr>
            <tr><td>sv</td><td>Swedish</td><td>0.15%</td></tr>
            <tr><td>zh</td><td>Chinese</td><td>0.13%</td></tr>
            <tr><td>es</td><td>Spanish</td><td>0.13%</td></tr>
            <tr><td>ru</td><td>Russian</td><td>0.13%</td></tr>
            <tr><td>nl</td><td>Dutch</td><td>0.12%</td></tr>
            <tr><td>it</td><td>Italian</td><td>0.11%</td></tr>
            <tr><td>ja</td><td>Japanese</td><td>0.10%</td></tr>
            <tr><td>pl</td><td>Polish</td><td>0.09%</td></tr>
            <tr><td>pt</td><td>Portuguese</td><td>0.09%</td></tr>
            <tr><td>vi</td><td>Vietnamese</td><td>0.08%</td></tr>
            <tr><td>uk</td><td>Ukrainian</td><td>0.07%</td></tr>
            <tr><td>ko</td><td>Korean</td><td>0.06%</td></tr>
            <tr><td>ca</td><td>Catalan</td><td>0.04%</td></tr>
            <tr><td>sr</td><td>Serbian</td><td>0.04%</td></tr>
            <tr><td>id</td><td>Indonesian</td><td>0.03%</td></tr>
            <tr><td>cs</td><td>Czech</td><td>0.03%</td></tr>
            <tr><td>fi</td><td>Finnish</td><td>0.03%</td></tr>
            <tr><td>hu</td><td>Hungarian</td><td>0.03%</td></tr>
            <tr><td>no</td><td>Norwegian</td><td>0.03%</td></tr>
            <tr><td>ro</td><td>Romanian</td><td>0.03%</td></tr>
            <tr><td>bg</td><td>Bulgarian</td><td>0.02%</td></tr>
            <tr><td>da</td><td>Danish</td><td>0.02%</td></tr>
            <tr><td>hr</td><td>Croatian</td><td>0.01%</td></tr>
            <tr><td>sl</td><td>Slovenian</td><td>0.01%</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>Among the limitations of the proposed model, the computational costs associated with training a model like LLaMA are undoubtedly significant, requiring hundreds of hours on a GPU. We have implemented methods to streamline this process, but the computational expenditure for training on a 16GB GPU remains high. This becomes even more pronounced considering the model’s sentence processing time, which is slightly less than half a second per sentence. Given the computational power required to run the model, this duration is relatively long. Regarding the model’s application, since it heavily relies on an LLM, it might be susceptible to hallucination, generating non-existent sentences or fragments. However, during inference (few-shot or training), it seems to always answer in the requested format, very rarely (especially in 0-shot) adding some explanation for its decision after a Yes or No.</p>
    </sec>
    <sec id="sec-app-b">
      <title>B. Error Analysis</title>
      <p>In this section, we present a simple Error Analysis with two different cases: i) a sentence from the Development set, which should reflect the distribution of the training data for the models introduced in Section 3.2; ii) a sentence from the poetry domain that is radically different from the training data. We then report the answer of each model, specifying the modality (in-context learning or training) and, where applicable, the number of shots used for inference.</p>
      <p>As a first example, consider “Difficile tenersi in quel cammino”5, which is non-canonical as the main verb “è” is missing. The models answered as follows:</p>
      <list list-type="bullet">
        <list-item><p>LLaMA1 0s: canonical</p></list-item>
        <list-item><p>LLaMA1 1s: canonical</p></list-item>
        <list-item><p>LLaMA2 0s: canonical</p></list-item>
        <list-item><p>LLaMA2 1s: canonical</p></list-item>
        <list-item><p>ExtremITA 0s: canonical</p></list-item>
        <list-item><p>ExtremITA 1s: non-canonical</p></list-item>
        <list-item><p>LLaMAntino 0s: canonical</p></list-item>
        <list-item><p>LLaMAntino 1s: non-canonical</p></list-item>
        <list-item><p>Camoscio 0s: non-canonical</p></list-item>
        <list-item><p>Camoscio 1s: non-canonical</p></list-item>
        <list-item><p>BERT FT: non-canonical</p></list-item>
        <list-item><p>Camoscio FT: non-canonical</p></list-item>
      </list>
      <p>5 In English: “(It’s) Hard to keep in that path.”</p>
      <p>This example is interesting because all the Italian-adapted models answered correctly in some modality (1-shot or fine-tuned), thus recognizing that the sentence was missing the main verb, given the initial prompt. Notice that only Camoscio answered correctly in both the 0-shot and the 1-shot settings.</p>
      <p>As a second and more difficult example, consider the sentence “Zacinto mio che te specchi nell’onde del greco mar da cui vergine nacque Venere”6, taken from the poetry test set. This example is very hard to comprehend, as some words are very rare in spoken or written Italian (nell’onde), the uncommon te is used to express that the city is actively mirroring itself in the sea, and the order of the last words is reversed. In this case, all the models answered that the sentence is non-canonical, recognizing the strange structure of the sentence, except for BERT FT, which classified this sentence as canonical.</p>
      <p>6 In English: “My Zacinto that you mirror in the waves of the Greek sea where virgin was born Venus from.”</p>
      </sec>
    <sec id="sec-app-c">
      <title>C. Typical Non-Canonical Structures</title>
      <p>In this section, we report a list of typical non-canonical structures as an example of the complexity the models are dealing with.</p>
      <list list-type="order">
        <list-item><p>Inversion of the complete argument, where the complement is fronted and the subject follows the verb.</p></list-item>
        <list-item><p>Subject inversion, positioning the subject after the main verb.</p></list-item>
        <list-item><p>Fronting of the object, moving the object to the beginning of the sentence before the subject.</p></list-item>
        <list-item><p>Extraction of the object from an infinitival clause, placing it at the beginning of the sentence.</p></list-item>
        <list-item><p>Preposing of a prepositional adjunct from a participial clause, moving the prepositional complement of a past participle to a position before the verb.</p></list-item>
        <list-item><p>Leftward extraction of the lexical verb, where the untensed, non-finite main verb precedes the auxiliary or modal verb.</p></list-item>
        <list-item><p>Right dislocation of the subject, placing the subject after the complements of the sentence.</p></list-item>
        <list-item><p>Fronting of both the subject and the object, positioning them before the main verb, with the subject preceding the object.</p></list-item>
        <list-item><p>Fronting of a prepositional specification, often introduced by “of”, extracting it from the noun phrase and positioning it at the front.</p></list-item>
        <list-item><p>Right dislocation of the clitic, where a clitic pronoun attached to the main verb corefers to an object noun phrase positioned later in the sentence.</p></list-item>
        <list-item><p>Right dislocation of the object, placing the object after indirect objects, adjuncts, or an inverted subject.</p></list-item>
        <list-item><p>Insertion of parentheticals or adjuncts between the subject and the main verb.</p></list-item>
        <list-item><p>Rightward extraction of the adjective from the noun phrase, positioning it after any noun adjuncts.</p></list-item>
        <list-item><p>Right stranding of a prepositional specification, such as “of”, leaving it at the end of the sentence, separate from the noun phrase.</p></list-item>
        <list-item><p>Rightward extraction of the lexical verb, positioning the untensed, non-finite main verb after the complements of the sentence.</p></list-item>
        <list-item><p>Right stranding of the predicate’s head noun, leaving it after two adjuncts.</p></list-item>
      </list>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1"><mixed-citation>[1] R. Delmonte, Syntax and semantics of Italian poetry in the first half of the 20th century, 2018. URL: https://arxiv.org/abs/1802.03712. arXiv:1802.03712.</mixed-citation></ref>
      <ref id="ref2"><mixed-citation>[2] R. Delmonte, Cognitive Models of Poetry Reading, Springer International Publishing, Cham, 2021, pp. 1–39. URL: https://doi.org/10.1007/978-3-030-44982-7_19-4. doi:10.1007/978-3-030-44982-7_19-4.</mixed-citation></ref>
      <ref id="ref3"><mixed-citation>[3] R. Delmonte, Recursion and Ambiguity: A Linguistic and Computational Perspective, 2015, pp. 257–284. doi:10.1007/978-3-319-08043-7_15.</mixed-citation></ref>
      <ref id="ref4"><mixed-citation>[4] J. Bresnan, The Mental Representation of Grammatical Relations, The MIT Press, Cambridge, 1982.</mixed-citation></ref>
      <ref id="ref5"><mixed-citation>[5] J. Bresnan, Lexical-Functional Syntax, Blackwell Publishing, Oxford, 2001.</mixed-citation></ref>
      <ref id="ref6"><mixed-citation>[6] G. Ward, B. Birner, Information Structure and Non-canonical Syntax, 2008, pp. 152–174. doi:10.1002/9780470756959.ch7.</mixed-citation></ref>
      <ref id="ref7"><mixed-citation>[7] R. Delmonte, N. Busetto, Measuring similarity by linguistic features rather than frequency, in: H. Bunt (Ed.), Proceedings of the 18th Joint ACL - ISO Workshop on Interoperable Semantic Annotation within LREC2022, European Language Resources Association, Marseille, France, 2022, pp. 42–52. URL: https://aclanthology.org/2022.isa-1.6.</mixed-citation></ref>
      <ref id="ref8"><mixed-citation>[8] C. Bosco, V. Lombardo, D. Vassallo, L. Lesmo, Building a treebank for Italian: a data-driven annotation schema, in: M. Gavrilidou, G. Carayannis, S. Markantonatou, S. Piperidis, G. Stainhauer (Eds.), Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00), European Language Resources Association (ELRA), Athens, Greece, 2000. URL: http://www.lrec-conf.org/proceedings/lrec2000/pdf/220.pdf.</mixed-citation></ref>
      <ref id="ref9"><mixed-citation>[9] C. Bosco, A. Mazzei, V. Lombardo, G. Attardi, A. Corazza, A. Lavelli, L. Lesmo, G. Satta, M. Simi, Comparing Italian parsers on a common treebank: the EVALITA experience, in: N. Calzolari, K. Choukri, B. Maegaard, J. Mariani, J. Odijk, S. Piperidis, D. Tapias (Eds.), Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08), European Language Resources Association (ELRA), Marrakech, Morocco, 2008. URL: http://www.lrec-conf.org/proceedings/lrec2008/pdf/528_paper.pdf.</mixed-citation></ref>
      <ref id="ref10"><mixed-citation>[10] S. Montemagni, F. Barsotti, M. Battista, N. Calzolari, O. Corazzari, A. Lenci, A. Zampolli, F. Fanciulli, M. Massetani, R. Rafaelli, R. Basili, M. T. Pazienza, D. Saracino, F. Zanzotto, N. Mana, F. Pianesi, R. Delmonte, Building the Italian Syntactic-Semantic Treebank, Springer Netherlands, Dordrecht, 2003, pp. 189–210. URL: https://doi.org/10.1007/978-94-010-0201-1_11. doi:10.1007/978-94-010-0201-1_11.</mixed-citation></ref>
      <ref id="ref11"><mixed-citation>[11] T. Chung, M. Post, D. Gildea, Factors affecting the accuracy of Korean parsing, in: D. Seddah, S. Koebler, R. Tsarfaty (Eds.), Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages, Association for Computational Linguistics, Los Angeles, CA, USA, 2010, pp. 49–57. URL: https://aclanthology.org/W10-1406.</mixed-citation></ref>
      <ref id="ref12"><mixed-citation>[12] C. Bosco, A. Mazzei, A. Lavelli, Looking back to the EVALITA constituency parsing task: 2007-2011, in: B. Magnini, F. Cutugno, M. Falcone, E. Pianta (Eds.), Evaluation of Natural Language and Speech Tools for Italian, Springer Berlin Heidelberg, Berlin, Heidelberg, 2013, pp. 46–57.</mixed-citation></ref>
      <ref id="ref13"><mixed-citation>[13] T. Paccosi, A. Palmero Aprosio, S. Tonelli, It is MarkIT that is new: An Italian treebank of marked constructions, in: CLiC-it 2021 - Italian Conference on Computational Linguistics, 2022.</mixed-citation></ref>
      <ref id="ref14"><mixed-citation>[14] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin, Attention is all you need, in: I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, R. Garnett (Eds.), Advances in Neural Information Processing Systems, volume 30, Curran Associates, Inc., 2017.</mixed-citation></ref>
      <ref id="ref15"><mixed-citation>[15] J. Devlin, M. Chang, K. Lee, K. Toutanova, BERT: pre-training of deep bidirectional transformers for language understanding, in: J. Burstein, C. Doran, T. Solorio (Eds.), Proceedings of NAACL 2019, 2019, pp. 4171–4186.</mixed-citation></ref>
      <ref id="ref16"><mixed-citation>[16] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, V. Stoyanov, RoBERTa: A robustly optimized BERT pretraining approach, CoRR abs/1907.11692 (2019).</mixed-citation></ref>
      <ref id="ref17"><mixed-citation>[17] N. Reimers, I. Gurevych, Sentence-BERT: Sentence embeddings using siamese BERT-networks, in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 2019. URL: http://arxiv.org/abs/1908.10084.</mixed-citation></ref>
      <ref id="ref18"><mixed-citation>[18] A. Radford, K. Narasimhan, T. Salimans, I. Sutskever, et al., Improving language understanding by generative pre-training, 2018.</mixed-citation></ref>
      <ref id="ref19"><mixed-citation>[19] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, D. Amodei, Language models are few-shot learners, CoRR abs/2005.14165 (2020).</mixed-citation></ref>
      <ref id="ref20"><mixed-citation>[20] H. Touvron, T. Lavril, G. Izacard, X. Martinet, M.-A. Lachaux, T. Lacroix, B. Rozière, N. Goyal, E. Hambro, F. Azhar, A. Rodriguez, A. Joulin, E. Grave, G. Lample, LLaMA: Open and efficient foundation language models, 2023. URL: https://arxiv.org/abs/2302.13971. arXiv:2302.13971.</mixed-citation></ref>
      <ref id="ref21"><mixed-citation>[21] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, P. J. Liu, Exploring the limits of transfer learning with a unified text-to-text transformer, J. Mach. Learn. Res. 21 (2020) 140:1–140:67. URL: http://jmlr.org/papers/v21/20-074.html.</mixed-citation></ref>
      <ref id="ref22"><mixed-citation>[22] M. Lewis, Y. Liu, N. Goyal, M. Ghazvininejad, A. Mohamed, O. Levy, V. Stoyanov, L. Zettlemoyer, BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension, CoRR abs/1910.13461 (2019).</mixed-citation></ref>
      <ref id="ref23"><mixed-citation>[23] C. D. Hromei, D. Croce, V. Basile, R. Basili, ExtremITA at EVALITA 2023: Multi-Task Sustainable Scaling to Large Language Models at its Extreme, in: Proceedings of the Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2023), CEUR.org, Parma, Italy, 2023.</mixed-citation></ref>
      <ref id="ref24"><mixed-citation>[24] C. D. Hromei, D. Croce, R. Basili, U-DepPLLaMA: Universal Dependency Parsing via Auto-regressive Large Language Models, IJCoL 10 (2024). URL: http://journals.openedition.org/ijcol/1352.</mixed-citation></ref>
      <ref id="ref25"><mixed-citation>[25] A. Santilli, E. Rodolà, Camoscio: an Italian Instruction-tuned LLaMA, 2023. URL: https://arxiv.org/abs/2307.16456. arXiv:2307.16456.</mixed-citation></ref>
      <ref id="ref26"><mixed-citation>[26] P. Basile, E. Musacchio, M. Polignano, L. Siciliani, G. Fiameni, G. Semeraro, LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language, 2023. URL: https://arxiv.org/abs/2312.09993. arXiv:2312.09993.</mixed-citation></ref>
      <ref id="ref27"><mixed-citation>[27] Q. Dong, L. Li, D. Dai, C. Zheng, J. Ma, R. Li, H. Xia, J. Xu, Z. Wu, B. Chang, X. Sun, L. Li, Z. Sui, A survey on in-context learning, 2024. URL: https://arxiv.org/abs/2301.00234. arXiv:2301.00234.</mixed-citation></ref>
      <ref id="ref28"><mixed-citation>[28] D. Croce, A. Moschitti, R. Basili, Structured lexical similarity via convolution kernels on dependency trees, in: R. Barzilay, M. Johnson (Eds.), Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Edinburgh, Scotland, UK, 2011, pp. 1034–1046. URL: https://aclanthology.org/D11-1096.</mixed-citation></ref>
      <ref id="ref29"><mixed-citation>[29] H. Touvron, L. Martin, K. Stone, P. Albert, A. Almahairi, Y. Babaei, N. Bashlykov, S. Batra, P. Bhargava, S. Bhosale, D. Bikel, L. Blecher, C. C. Ferrer, M. Chen, G. Cucurull, D. Esiobu, J. Fernandes, J. Fu, W. Fu, B. Fuller, C. Gao, V. Goswami, N. Goyal, A. Hartshorn, S. Hosseini, R. Hou, H. Inan, M. Kardas, V. Kerkez, M. Khabsa, I. Kloumann, A. Korenev, P. S. Koura, M.-A. Lachaux, T. Lavril, J. Lee, D. Liskovich, Y. Lu, Y. Mao, X. Martinet, T. Mihaylov, P. Mishra, I. Molybog, Y. Nie, A. Poulton, J. Reizenstein, R. Rungta, K. Saladi, A. Schelten, R. Silva, E. M. Smith, R. Subramanian, X. E. Tan, B. Tang, R. Taylor, A. Williams, J. X. Kuan, P. Xu, Z. Yan, I. Zarov, Y. Zhang, A. Fan, M. Kambadur, S. Narang, A. Rodriguez, R. Stojnic, S. Edunov, T. Scialom, Llama 2: Open foundation and fine-tuned chat models, 2023. arXiv:2307.09288.</mixed-citation></ref>
      <ref id="ref30"><mixed-citation>[30] R. Delmonte, A. Bristot, S. Tonelli, VIT - Venice Italian Treebank: Syntactic and Quantitative Features, in: Proc. Sixth International Workshop on Treebanks and Linguistic Theories, volume 1, Nealt Proc. Series, 2007, pp. 43–54. URL: https://catalog.elra.info/en-us/repository/browse/ELRA-W0324/.</mixed-citation></ref>
      <ref id="ref31"><mixed-citation>[31] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, W. Chen, LoRA: Low-rank adaptation of large language models, CoRR abs/2106.09685 (2021).</mixed-citation></ref>
    </ref-list>
  </back>
</article>