=Paper=
{{Paper
|id=Vol-3878/124_calamita_long
|storemode=property
|title=AMELIA - Argument Mining Evaluation on Legal documents in ItAlian: A CALAMITA Challenge
|pdfUrl=https://ceur-ws.org/Vol-3878/124_calamita_long.pdf
|volume=Vol-3878
|authors=Giulia Grundler,Andrea Galassi,Piera Santin,Alessia Fidelangeli,Federico Galli,Elena Palmieri,Francesca Lagioia,Giovanni Sartor,Paolo Torroni
|dblpUrl=https://dblp.org/rec/conf/clic-it/GrundlerGSFGPLS24
}}
==AMELIA - Argument Mining Evaluation on Legal documents in ItAlian: A CALAMITA Challenge==
Giulia Grundler (2,∗), Andrea Galassi (1,∗), Piera Santin (3), Alessia Fidelangeli (2), Federico Galli (2), Elena Palmieri (1), Francesca Lagioia (2,3), Giovanni Sartor (2,3) and Paolo Torroni (1)
(1) DISI, Alma-AI, University of Bologna, Italy
(2) CIRSFID Alma-AI, Faculty of Law, University of Bologna, Italy
(3) European University Institute, Law Department, Italy
Abstract
This challenge consists of three classification tasks, in the context of argument mining in the legal domain. The tasks are
based on a dataset of 225 Italian decisions on Value Added Tax, annotated to identify and categorize argumentative text. The
objective of the first task is to classify each argumentative component as premise or conclusion, while the second and third
tasks aim at classifying the type of premise: legal vs factual, and its corresponding argumentation scheme. The classes are
highly unbalanced, hence evaluation is based on the macro F1 score.
Keywords
LLM, Argument Mining, Legal Analytics, VAT, CALAMITA, CLiC-it
CLiC-it 2024: Tenth Italian Conference on Computational Linguistics, Dec 04-06, 2024, Pisa, Italy
∗ Corresponding authors.
Email: giulia.grundler2@unibo.it (G. Grundler); a.galassi@unibo.it (A. Galassi)
ORCID: 0000-0002-7255-9343 (G. Grundler); 0000-0001-9711-7042 (A. Galassi); 0000-0002-0734-9657 (P. Santin); 0000-0003-3739-5387 (F. Galli); 0000-0001-5176-8843 (E. Palmieri); 0000-0001-7083-3487 (F. Lagioia); 0000-0003-2210-0398 (G. Sartor); 0000-0002-9253-8638 (P. Torroni)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

1. Challenge: Introduction and Motivation
The extent to which Large Language Models (LLMs) are capable of reasoning, as opposed to simply recognizing patterns from vast amounts of data, is an open research question and the subject of a lively ongoing debate [1]. A way to describe human reasoning is through its ability to understand, evaluate, and invent arguments composed of claims, evidence, and conclusions meaningfully connected with one another [2]. For this reason, the ability to recognize arguments could be considered a first step in a sequence of reasoning tasks of increasing complexity, which goes from the detection and classification of argumentative discourse units or argument components, through argument structure prediction, reconstruction, and evaluation, down to argument generation. Automating these tasks is the object of argument mining [3, 4, 5]. We believe that gauging the ability of LLMs to address even basic argument mining tasks would provide meaningful cues as to these models’ ability to process and understand logical relations expressed in natural language.
While several datasets for argument mining in English have been developed over the last decade [6, 7, 8, 9, 10], resources for other languages remain scarce. To the best of our knowledge, only a few works exist for Italian. In [11], the authors use the CorEA corpus of user comments on online newspaper articles to assign the correct relation (support or attack) to pairs of arguments. In [12], the authors propose a new model for stance detection, trained and evaluated on a corpus of Italian tweets in which users were discussing a highly polarized political debate.
Among the many domains of interest for argument mining, our focus is on the legal domain, where argumentation is fundamental for the decision-making process. Legal reasoning relies heavily on well-structured arguments, as legal professionals must construct and deconstruct arguments within formal documents, providing a challenging setting for assessing an LLM’s ability to engage in complex reasoning tasks. Despite its relevance, little attention has been given to argument mining in the legal domain in Italian. Most existing work in legal NLP for Italian has focused on tasks such as law article retrieval [13, 14], outcome prediction [15], analysis of contracts [16, 17], and summarization [18, 19].
Our challenge for CALAMITA [20] consists of three classification tasks over argumentative texts. We mostly follow the setting used in Demosthenes [21, 22], a corpus for argument mining on legal documents in English. Since we leverage real legal documents, not synthetic or artificially constructed case studies, our dataset reflects the real complexity and nuances of legal argumentation. It is therefore particularly relevant for a robust assessment of LLMs’ abilities in real-world applications. To the best of our knowledge, we are the first to propose a challenge of argument mining over legal documents in Italian.
The challenge requires understanding not only the
Italian language but domain-specific technical language. Such language uses complex syntactic structures and specialized terminology. Besides language, the challenge tests LLMs’ ability to recognize and interpret legal arguments by recognizing typical argumentation schemes [23], i.e., patterns of reasoning used in human discourse, offering a principled approach to argument analysis and evaluation. Identifying schemes is challenging as there are many possible schemes, and arguments are often only partially laid out in the text, leaving many important parts implicit for brevity or because they are considered common knowledge. Nonetheless, this task lends itself to generalization beyond the legal domain, making the insights transferable to other fields where structured reasoning plays a critical role.

2. Challenge: Description
We consider an argument as a set of interconnected portions of text called argument components. The connections between components form a specific pattern of relationships that represents a reasoning paradigm.
The following tasks presume that argument components have already been identified in the source documents. Argument components can therefore be classified according to their role in the connections (such as Premises or Conclusions), according to their content (such as Legal or Factual), and according to the relationship pattern they contribute to (the Argumentative Scheme).
This challenge proposes three classification tasks, in the context of argument mining in the legal domain:
• Argument Component classification: given an argumentative component, classify it as premise or conclusion.
• Premise Type classification: given a premise, classify it as factual or legal.
• Argument Scheme classification: given a predetermined set of argument schemes, classify a legal premise as belonging to one or more such schemes.
The following paragraphs contain a definition of each class, along with an example extracted from the dataset. The translated version of the examples is available in Appendix A.
Argument Component classification. Binary classification: given an argumentative component, classify it as premise or conclusion.
• Argument premise: a proposition that provides a reason or support for the argument.
Example: Si osserva poi che ritenere che la mancata possibilità di detrazione a favore di soggetti come il ricorrente comporti un aiuto di Stato in favore degli ospedali pubblici, in quanto le perdite degli stessi vengono ripianate dalle USL e dalla Regioni trascura di considerare l’accessibilità, indiscriminata, ai servizi dei nosocomi pubblici da parte dei soggetti iscritti al SSN, rispetto a quella ad un libero professionista sanitario che, in quanto tale, ben potrebbe rifiutarsi di prestare i propri servigi al pare di un normale contraente.
• Argument conclusion: the statement that follows logically from the premise(s) and represents the final point being argued for.
Example: Dunque, l’ufficio ha riconosciuto la non imponibilità IVA delle cessioni all’esportazione, così cessando sul punto la materia del contendere.
Argument components can be involved in more than one relationship; a component may therefore be the conclusion of other premises as well as a premise of other arguments. In that case, the component is to be classified as a premise.
Premise Type classification. Multi-label classification: classify an argumentative premise as factual or legal (or both).
• Factual premise: a premise that describes factual situations and events, pertaining to the substance or the procedure of the case.
Example: Indubbiamente, la contribuente ha impugnato la sentenza di prime cure, rappresentando nuovamente di non aver potuto proporre appello avverso la pronuncia di condanna di primo grado, per causa di forza maggiore.
• Legal premise: a premise that specifies the legal content (legal rules, precedents, interpretation of applicable laws and principles).
Example: La giurisprudenza citata, alla motivazione della quale si fa rinvio, ha tra l’altro preso posizione espressamente e positivamente sulla conformità della normativa italiana rispetto a quella dell’Unione Europea, risultando così confutata anche la doglianza della difesa sul punto che ha chiesto la sospensione del procedimento, con investitura della Corte di Giustizia Europea della questione.
Since a premise could be both factual and legal, this task is framed as multi-label binary classification.
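As a minimal sketch of how the labelling rules of the first two tasks might be represented in code: a component that acts as both a premise and a conclusion is labelled as a premise, and a premise type is encoded as one 0/1 indicator per class so that a premise can be factual, legal, or both. The helper names `resolve_component_role` and `encode_premise_type` are hypothetical and are not part of the challenge materials.

```python
# Hypothetical helpers illustrating the labelling rules described in the text.
PREMISE_TYPES = ("F", "L")  # F = factual, L = legal


def resolve_component_role(roles):
    """Return 'prem' if the component acts as a premise in any argument, else 'conc'."""
    roles = set(roles)
    if not roles or not roles <= {"prem", "conc"}:
        raise ValueError(f"unexpected roles: {sorted(roles)}")
    return "prem" if "prem" in roles else "conc"


def encode_premise_type(labels, classes=PREMISE_TYPES):
    """Encode a label list such as ['F', 'L'] as one 0/1 indicator per class."""
    unknown = set(labels) - set(classes)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if c in labels else 0 for c in classes]


print(resolve_component_role(["conc", "prem"]))  # prem
print(encode_premise_type(["L"]))                # [0, 1]
print(encode_premise_type(["F", "L"]))           # [1, 1]
```

The same indicator encoding would apply to the five argument schemes by passing a different `classes` tuple.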
Argument Scheme classification. Legal premises determine the nature of the legal reasoning they support, hence they are labeled with the corresponding reasoning pattern, called argument scheme. We define five schemes relevant for tax law. Each legal premise may be assigned multiple schemes, therefore we frame this task as multi-label multi-class classification.
Given a legal premise, classify it as belonging to one or more of the following schemes: (established) rule, precedent, classification, interpretative, or principle.
• Rule (or established rule) scheme: it is used whenever an explicit reference to codified law is present. This reference can be a reference to a certain article or a quotation of the text of a certain article.
Example: Infatti, è ben vero che, ai sensi del combinato disposto dagli articoli 54 e 23 D.Lgs. n. 546/1992, il convenuto in appello deve costituirsi entro 60 giorni dal giorno in cui ricorso è stato notificato.
• Precedent scheme: it is used whenever there is an explicit reference to a previous decision. In the dataset we considered only references to decisions of either the Court of Cassation or the European Court of Justice.
Example: L’ Amministrazione “ha l’onere di provare ed allegare gli elementi probatori su cui si fondi la contestazione, tra i quali possono rilevare, in via indiziaria, quali elementi sintomatici della mancata esecuzione della prestazione dal fatturante, l’assenza della minima dotazione personale e strumentale, l’immediatezza dei rapporti (cedente/prestatore fatturante interposto e cessionario/committente), una conclamata inidoneità allo svolgimento dell’attività economica e la non corrispondenza tra i cedenti e la società coinvolta nell’operazione”.
• Classification scheme: it is used whenever a legal concept is defined, its properties are listed, and a certain fact or legal deed must be qualified as having those properties.
Example: In conclusione, per quanto fin qui esposto, i “compro oro” possono essere definiti come “esercizi commerciali che acquistano, commerciano o rivendono oggetti d’oro, di metalli preziosi o recanti pietre preziose usati e li cedono nella forma di materiale, di rottami d’oro o di metalli preziosi alle fonderie o ad altre aziende specializzate nel recupero di materiali preziosi”. Trattano esclusivamente prodotti finiti e non possono, congiuntamente, acquistare oro da gioielleria usato, fonderlo (per proprio conto o con incarico a terzi) e cedere il prodotto finito ottenuto.
• Interpretative scheme: it is used whenever the Court expresses new interpretative assertions (that may depend on previous case law), thereby creating new precedents.
Example: Si vuole dire, in sostanza, che la finalità del contraddittorio anticipato è quella di mettere il contribuente nella condizione di potere fare valere le proprie osservazioni prima che la decisione sia adottata e, quindi, di far sì che l’Amministrazione possa tener conto di tutti gli elementi del caso nell’adottare (o non adottare) il provvedimento ovvero nel dare a questo un contenuto piuttosto che un altro.
• Principle scheme: it is used whenever the Court explicitly refers to a principle of law (e.g. the Principle of proportionality).
Example: Nell’ordinamento unionale, pertanto, il principio del contraddittorio in ambito tributario prescinde dalla natura del tributo e deve trovare applicazione ogni qualvolta l’amministrazione sulla base della documentazione esibita ritenga dovere dare alla stessa documentazione interpretazione diversa da quella data dal contribuente invitandolo, come detto, a fornire nel corso del contraddittorio le ragioni della propria scelta.

3. Data description
3.1. Origin of data
The data consists of argumentative portions of text extracted from 225 Italian decisions on Value Added Tax (VAT) by the Regional Tax Commissions from various judicial districts. The decisions were downloaded partially from the open Giustizia Tributaria database (Tax Justice database, accessible at https://www.giustizia-tributaria.it/) and from other judicial databases accessed through university licensing agreements. The decisions range from 2010 to 2022 and concern taxable transactions, exemptions, out-of-scope transactions, and the right to obtain a deduction. The argumentative components were extracted from the sections “Motivi della decisione”, “Diritto” or “Fatto e diritto”, depending on the format of each decision.
The collected data were anonymised by modifying any identification data of natural or legal persons involved in the proceedings. In particular, the names of the parties in the proceeding and, to provide the highest privacy standards, also the names of the companies have been replaced with initials (e.g., Mario Rossi in “MR”, Company
s.r.l in “C s.r.l.”). The names of the judges composing the judicial panel have been replaced by “giu1, giu2, [...] giuN”. Also, addresses and places were replaced with ‘XXX’, and dates were changed to show only the year in the following format: DD/MM/2015.

3.2. Annotation details
The dataset was annotated by four tax law experts. Annotation guidelines are largely based on our previous work on the Demosthenes corpus [21], a dataset with English documents from the Court of Justice of the European Union. The guidelines were adapted to the Italian decisions and refined through an iterative process of validation and discussion to resolve conflicts between annotators. In particular, the annotation is based on the same classes used in Demosthenes. However, the structure of the decisions is different: while in the English corpus the annotation is done at the sentence level, it is not always possible to meet this criterion in Italian decisions. Therefore, the constraint has been relaxed, allowing a single annotation to cover multiple sentences and a single sentence to contain multiple annotations. The tagged decisions are available in our GitHub repository (https://github.com/adele-project/AMELIA/).

3.3. Data format
Data are available as a Hugging Face Dataset (https://huggingface.co/datasets/nlp-unibo/AMELIA), divided into three splits: train, val and test. Each row represents an argumentative component, with the following columns:
• Text: the text of the component
• Document: the document it belongs to
• Component: whether it is a premise (prem) or a conclusion (conc)
• Type: a list value representing the type of a premise; the list contains F for a Factual premise and L for a Legal one.
• Scheme: a list value representing the argumentative schemes of a legal premise. The values are: Rule, Prec, Class, Itpr and Princ.
• Chain_id: unique within each document, it specifies the argumentative chain the component belongs to (e.g. A1, A2,..., B1, B2,...)
• Id: a unique numerical id

3.4. Example of prompts used for zero and few shots
For each task, we propose both a zero-shot and a few-shot prompt. For the few-shot version, we have selected some particularly representative examples from the training set, some of which are included in Section 2. Here we report the zero-shot version. The translation of the zero-shot prompts is available in Appendix B. The few-shot version is available in Appendix C.
Argument Component classification: given an argumentative text, classify it as premise or conclusion.
Prompt: “Classifica il seguente testo argomentativo come premessa ‘prem’ o conclusione ‘conc’. Per premessa (prem) si intende una proposizione che fornisce una ragione o un supporto per l’argomentazione. Per conclusione (conc) si intende l’affermazione che segue logicamente dalle premesse e rappresenta il punto finale che viene argomentato. Testo:”
Premise Type classification: given a premise, classify it as factual, legal or both.
Prompt: “Classifica la seguente premessa come di fatto ‘F’, legale ‘L’ o entrambe. Le premesse di fatto (F) descrivono situazioni ed eventi fattuali relativi al caso di specie. Le premesse legali (L) specificano il contenuto giuridico (norme giuridiche, precedenti, interpretazione delle leggi e dei principi applicabili). L’output atteso è una lista con tutte le label applicabili. Ad esempio: [‘F’, ‘L’]. Testo: ”
Argument Scheme classification: given a legal premise, classify it as one or more of the following argumentative schemes: Rule, Prec, Class, Itpr, Princ.
Prompt: “Classifica la seguente premessa legale in uno o più dei seguenti schemi argomentativi: Rule, Prec, Class, Itpr, Princ. Rule: se esiste un riferimento esplicito o implicito a un articolo di legge o la citazione del testo di una norma. Prec: se esiste un riferimento ad una precedente pronuncia della Corte di Cassazione o della Corte di Giustizia dell’Unione Europea. Class: se c’è la definizone di un concetto giuridico o degli elementi costitutivi dello stesso. Itpr: se c’è il riferimento a uno dei criteri interpretativi contenuti all’art. 12 delle preleggi (letterale, teleologica, psicologica, sistematica) al codice civile. Princ: se c’è un riferimento espresso a un prinicpio generale del diritto (es. principio di proporzionalità). L’output atteso è una lista con tutte le label applicabili. Ad esempio: [‘Prec’, ‘Princ’, ‘Rule’]. Testo: ”

3.5. Detailed data statistics
The composition of the dataset is summarized in Table 1. The splitting between train, validation, and test data was done at the document level so that components of the same document belong to the same split. It was performed manually, with a ratio of approximately 60:20:20, and with the aim of balancing the Scheme classes as much as possible. We adopt the train/val/test format to make the results comparable with as many methods as possible, such as fine-tuned transformer-based models.
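The zero-shot prompts of Section 3.4 end with “Testo:” and, for the two multi-label tasks, ask the model to answer with a list of labels. A minimal sketch of how a prompt could be assembled and a reply parsed is shown below. The helper names `build_prompt` and `parse_label_list` are hypothetical, and the parsing policy (read only the first bracketed list, silently drop unknown labels, treat a reply without a list as empty) is an assumption rather than the procedure used by the challenge organisers.

```python
import re

PREMISE_TYPES = ("F", "L")
SCHEMES = ("Rule", "Prec", "Class", "Itpr", "Princ")


def build_prompt(instruction, component_text):
    """Append the component text to a zero-shot instruction that ends in 'Testo:'."""
    return f"{instruction.rstrip()} {component_text.strip()}"


def parse_label_list(reply, allowed):
    """Return the allowed labels found in the first [...] list of a model reply."""
    match = re.search(r"\[(.*?)\]", reply, flags=re.DOTALL)
    if match is None:
        return []  # assumption: a reply without a list counts as "no label"
    # Keep only alphabetic tokens, so straight or typographic quotes are ignored.
    tokens = set(re.findall(r"[A-Za-z]+", match.group(1)))
    # Labels are returned in the canonical order given by `allowed`.
    return [label for label in allowed if label in tokens]


print(parse_label_list("[‘Prec’, ‘Princ’, ‘Rule’]", SCHEMES))   # ['Rule', 'Prec', 'Princ']
print(parse_label_list("Risposta: ['F', 'L']", PREMISE_TYPES))  # ['F', 'L']
print(parse_label_list("non saprei", PREMISE_TYPES))            # []
```

Returning labels in a fixed order makes predictions directly comparable with the gold `Type` and `Scheme` lists regardless of the order in which the model emits them.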
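{| class="wikitable"
|+ Table 1: Composition of the dataset
! rowspan="2" | Split !! rowspan="2" | N docs !! colspan="2" | Component !! colspan="2" | Premise Type !! colspan="5" | Argument Scheme
|-
! Prem !! Conc !! Factual !! Legal !! Rule !! Prec !! Itpr !! Princ !! Class
|-
| Train || 135 || 1866 || 242 || 1254 || 812 || 350 || 264 || 224 || 92 || 51
|-
| Validation || 44 || 528 || 81 || 315 || 266 || 107 || 82 || 83 || 21 || 22
|-
| Test || 46 || 516 || 78 || 323 || 260 || 118 || 73 || 67 || 31 || 27
|-
| Total || 225 || 2910 || 401 || 1892 || 1338 || 575 || 419 || 374 || 144 || 100
|}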
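The class imbalance visible in Table 1 is the reason the challenge reports macro F1 (Section 4). The sketch below illustrates the effect on the test split of the Argument Component task (516 premises, 78 conclusions): a trivial baseline that always predicts the majority class reaches a high accuracy but a low macro F1. The function names are illustrative, and setting F1 to 0 for a class that is never predicted is a common convention assumed here, not something the paper specifies.

```python
def f1(tp, fp, fn):
    """F1 from counts; defined here as 0.0 when tp, fp and fn are all zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(gold, pred, classes):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in classes:
        tp = sum(g == c and p == c for g, p in zip(gold, pred))
        fp = sum(g != c and p == c for g, p in zip(gold, pred))
        fn = sum(g == c and p != c for g, p in zip(gold, pred))
        scores.append(f1(tp, fp, fn))
    return sum(scores) / len(scores)


# Test split of Table 1: 516 premises and 78 conclusions.
gold = ["prem"] * 516 + ["conc"] * 78
pred = ["prem"] * len(gold)  # trivial baseline: always predict the majority class

accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
print(round(accuracy, 3))                                # 0.869
print(round(macro_f1(gold, pred, ["prem", "conc"]), 3))  # 0.465
```

Because each class contributes equally to the average, the missed minority class halves the score even though almost 87% of the components are labelled correctly.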

4. Metrics
Due to the heavy imbalance between the classes, we evaluate the results using the macro F1 score. Additionally, we evaluate the F1 score of each class to provide further insights.
As a reference, in Demosthenes [21] the best macro F1 results for the three tasks are 0.88 for Argument Component classification, 0.85 for Premise Type classification, and 0.75 for Argument Scheme classification. It is important to specify that these scores are not directly comparable, and we provide them only as an indication of the difficulty of the tasks.

5. Limitations
The original documents, along with the argument mining annotation, are already available as part of the Adele tool (https://adele-tool.eu/). The original documents, annotated according to the task of outcome prediction instead of argument mining, are also published in [15].
The dataset is limited in size, consisting of only 225 legal decisions on Value Added Tax (VAT). While this provides a valuable resource for testing argument mining models in the Italian tax legal domain, the relatively small dataset may not capture the full diversity of argumentative structures present in the broader Italian tax legal system or other legal domains. This could limit the scalability of models trained on this dataset. Also, given that the legal decisions are from a specific time frame (2010-2022), the dataset may not reflect more recent developments or changes in legal reasoning or tax law.
Secondly, the dataset has been anonymised to protect the privacy of individuals and legal entities. While this is necessary to comply with data protection regulations, the anonymisation process may have removed certain contextual details (e.g., names of places or entities) that could be relevant for understanding the nuances of certain legal arguments. As a result, models may not fully capture or leverage such contextual details that would otherwise aid in more accurate argument classification.
Another limitation is the manual annotation process, which, despite efforts to ensure consistency through expert annotators and conflict resolution, may still be subject to human bias or interpretation inconsistencies. These subjective elements could affect the quality and reproducibility of the tasks.

6. Ethical issues
The dataset comprises legal decisions that have been anonymised to protect the privacy of the individuals. However, it is important to acknowledge the potential risks related to re-identification, even with anonymisation efforts, especially in legal contexts where case details could be cross-referenced with external sources. Care was taken to remove any personal identifiers, such as names, addresses, and dates, but residual risks may remain.
Additionally, the use of this dataset raises questions regarding the deployment of AI systems in legal contexts. AI systems used by a judicial authority in researching and interpreting facts and the law are considered high-risk by the AI Act (https://eur-lex.europa.eu/eli/reg/2024/1689/oj). Those systems must conform to the essential requirements (e.g. data governance, user transparency, human oversight, etc.) and the conformity must be documented.
Finally, a critical aspect is the transparency and accountability of AI systems when applied in sensitive domains like law. Users of the models should understand their limitations, especially in tasks involving nuanced reasoning like legal argumentation. Furthermore, ensuring that legal professionals and stakeholders have the ability to audit and interpret the decisions made by AI models is crucial to avoid undermining trust in legal institutions.

7. Data license and copyright issues
The dataset used in this challenge consists of legal decisions on Value Added Tax (VAT) made by the Regional Tax Commissions in Italy, available and downloaded from the Giustizia Tributaria and other judicial databases accessed through university licensing agreements. These legal texts, being official public documents, are generally not subject to copyright restrictions. The dataset consists of a non-substantial part of the respective databases. Moreover, the use of data is compliant with the text and data mining exception under the EU Copyright Directive (https://eur-lex.europa.eu/eli/dir/2019/790/oj) and implementing national law.
Since the data has been processed and annotated, the annotations and derived data are subject to copyright by the authors of this challenge. To promote transparency and further research, the dataset is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. This license allows others to share, use, and adapt the data, as long as appropriate credit is given to the creators, and any modifications are explicitly indicated.

Acknowledgments
This work was partially supported by the following projects: “ADELE – Analytics for DEcision of LEgal cases” (Justice Programme, GA. No. 101007420); PRIN2022 PRIMA - PRivacy Infringements Machine-Advice (Ref. Prot. n.: 20224TPEYC - CUP J53D23005130001); “FAIR - Future Artificial Intelligence Research” – Spoke 8 “Pervasive AI”, under the European Commission’s NextGeneration EU programme, PNRR – M4C2 – Investimento 1.3, Partenariato Esteso (PE00000013).

References
[1] E. M. Bender, T. Gebru, A. McMillan-Major, S. Shmitchell, On the dangers of stochastic parrots: Can language models be too big?, in: FAccT, ACM, 2021, pp. 610–623.
[2] D. Walton, Argumentation Theory: A Very Short Introduction, Springer US, Boston, MA, 2009, pp. 1–22. doi:10.1007/978-0-387-98197-0_1.
[3] M. Lippi, P. Torroni, Argumentation mining: State of the art and emerging trends, ACM Trans. Internet Techn. 16 (2016) 10:1–10:25. doi:10.1145/2850417.
[4] E. Cabrio, S. Villata, Five years of argument mining: a data-driven analysis, in: IJCAI, ijcai.org, 2018, pp. 5427–5433.
[5] J. Lawrence, C. Reed, Argument Mining: A Survey, Computational Linguistics 45 (2020) 765–818. doi:10.1162/coli_a_00364.
[6] I. Habernal, D. Faber, N. Recchia, S. Bretthauer, I. Gurevych, I. S. genannt Döhmann, C. Burchard, Mining legal arguments in court decisions, Artif. Intell. Law 32 (2024) 1–38.
[7] V. Niculae, J. Park, C. Cardie, Argument mining with structured svms and rnns, in: ACL (1), Association for Computational Linguistics, 2017, pp. 985–995.
[8] P. Poudyal, J. Savelka, A. Ieven, M. F. Moens, T. Goncalves, P. Quaresma, ECHR: Legal corpus for argument mining, in: E. Cabrio, S. Villata (Eds.), Proceedings of the 7th Workshop on Argument Mining, Association for Computational Linguistics, Online, 2020, pp. 67–75. URL: https://aclanthology.org/2020.argmining-1.8.
[9] T. Mayer, S. Marro, E. Cabrio, S. Villata, Enhancing evidence-based medicine with natural language argumentative analysis of clinical trials, Artif. Intell. Medicine 118 (2021) 102098.
[10] P. Accuosto, H. Saggion, Mining arguments in scientific abstracts with discourse-level embeddings, Data Knowl. Eng. 129 (2020) 101840.
[11] P. Basile, V. Basile, E. Cabrio, S. Villata, Argument Mining on Italian News Blogs, volume 1749 of CEUR Workshop Proceedings, CEUR-WS.org, 2016. URL: https://ceur-ws.org/Vol-1749/paper8.pdf.
[12] M. Lai, V. Patti, G. Ruffo, P. Rosso, Stance evolution and twitter interactions in an italian political debate, in: M. Silberztein, F. Atigui, E. Kornyshova, E. Métais, F. Meziane (Eds.), Natural Language Processing and Information Systems, Springer International Publishing, Cham, 2018, pp. 15–27.
[13] A. Tagarelli, A. Simeri, Unsupervised law article mining based on deep pre-trained language representation models with application to the italian civil code, Artificial Intelligence and Law 30 (2021) 417–473. doi:10.1007/s10506-021-09301-8.
[14] V. Bellandi, S. Castano, P. Ceravolo, E. Damiani, A. Ferrara, S. Montanelli, S. Picascia, A. Polimeno, D. Riva, Knowledge-based legal document retrieval: A case study on italian civil court decisions, in: D. Symeonidou, R. Yu, D. Ceolin, M. Poveda-Villalón, D. Audrito, L. D. Caro, F. Grasso, R. Nai, E. Sulis, F. J. Ekaputra, O. Kutz, N. Troquard (Eds.), Companion Proceedings of the 23rd International Conference on Knowledge Engineering and Knowledge Management, Bozen-Bolzano, Italy, September 26-29, 2022, volume 3256 of CEUR Workshop Proceedings, CEUR-WS.org, 2022. URL: https://ceur-ws.org/Vol-3256/km4law2.pdf.
[15] F. Galli, G. Grundler, A. Fidelangeli, A. Galassi, F. Lagioia, E. Palmieri, F. Ruggeri, G. Sartor, P. Torroni, Predicting outcomes of italian VAT decisions, in: JURIX, volume 362 of Frontiers in Artificial Intelligence and Applications, IOS Press, 2022, pp. 188–193. doi:10.3233/FAIA220465.
[16] A. Galassi, F. Lagioia, A. Jabłonowska, M. Lippi, Unfair clause detection in terms of service across multiple languages, Artificial Intelligence and Law (2024) 1–49. doi:10.1007/s10506-024-09398-7.
[17] K. Drawzeski, A. Galassi, A. Jablonowska, F. Lagioia, M. Lippi, H. Micklitz, G. Sartor, G. Tagiuri, P. Torroni, A corpus for multilingual analysis of online terms of service, in: NLLP@EMNLP, Association for Computational Linguistics, 2021, pp. 1–8. doi:10.18653/v1/2021.nllp-1.1.
[18] L. Ragazzi, G. Moro, S. Guidi, G. Frisoni, Lawsuit: a large expert-written summarization dataset of italian constitutional court verdicts, Artificial Intelligence and Law (2024) 1–37. doi:10.1007/s10506-024-09414-w.
[19] D. Licari, P. Bushipaka, G. Marino, G. Comandé, T. Cucinotta, Legal holding extraction from italian case documents using italian-legal-bert text summarization, in: Proceedings of the Nineteenth International Conference on Artificial Intelligence and Law, ICAIL ’23, Association for Computing Machinery, New York, NY, USA, 2023, p. 148–156. doi:10.1145/3594536.3595177.
[20] G. Attanasio, P. Basile, F. Borazio, D. Croce, M. Francis, J. Gili, E. Musacchio, M. Nissim, V. Patti, M. Rinaldi, D. Scalena, CALAMITA: Challenge the Abilities of LAnguage Models in ITAlian, in: Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024), Pisa, Italy, December 4 - December 6, 2024, CEUR Workshop Proceedings, CEUR-WS.org, 2024.
[21] G. Grundler, P. Santin, A. Galassi, F. Galli, F. Godano, F. Lagioia, E. Palmieri, F. Ruggeri, G. Sartor, P. Torroni, Detecting arguments in CJEU decisions on fiscal state aid, in: G. Lapesa, J. Schneider, Y. Jo, S. Saha (Eds.), Proceedings of the 9th Workshop on Argument Mining, International Conference on Computational Linguistics, Online and in Gyeongju, Republic of Korea, 2022, pp. 143–157. URL: https://aclanthology.org/2022.argmining-1.14.
[22] P. Santin, G. Grundler, A. Galassi, F. Galli, F. Lagioia, E. Palmieri, F. Ruggeri, G. Sartor, P. Torroni, Argumentation structure prediction in CJEU decisions on fiscal state aid, in: ICAIL, ACM, 2023, pp. 247–256.
[23] D. Walton, C. Reed, F. Macagno, Argumentation schemes, Cambridge University Press, 2008.

A. Translated Examples
Argument Component classification.
Argument premise:
It should be noted that viewing the inability to deduct expenses for individuals such as the plaintiff as state aid to public hospitals overlooks the indiscriminate accessibility of public hospital services for individuals registered with the National Health Service (SSN). In contrast, a self-employed healthcare professional may refuse to provide services as an ordinary contractor.
Argument conclusion:
Thus, the office recognized the VAT non-taxable nature of the exportation, thus considering that there are no longer any grounds to proceed on the matter.
Premise Type classification.
Factual premise:
Undoubtedly, the taxpayer appealed the first instance ruling, again representing that she could not appeal against the first instance decision due to force majeure.
Legal premise:
The cited case law, to which reference is made for its reasoning, has explicitly and positively addressed the conformity of Italian legislation with that of the European Union. This effectively refutes the defense’s objection on this point, which requested the suspension of the proceedings and the referral of the issue to the European Court of Justice.
Argument Scheme classification.
Rule Scheme:
In fact, it is true that under Articles 54 and 23 of Legislative Decree No. 546/1992, the defendant on appeal must come up for trial within 60 days from the day on which appeal was served.
Precedent Scheme:
The Administration “has the burden of proving and attaching the evidence on which the dispute is based, among which the absence of the minimum personal and instrumental equipment, the immediacy of the relationships (transferor/interposed invoicing provider and transferee/buyer), an overt unsuitability to carry out the economic activity and the mismatch between the transferors and the company involved in the transaction may be circumstantial.”
Classification Scheme:
In conclusion, given what has been said so far, “gold shops” can be defined as “business establishments that buy, trade or resell used objects of gold, precious metals or bearing precious stones and dispose of them in the form of material, scrap gold or precious metals to foundries or other companies specialising in the recovery of precious materials”. They deal only in finished products and may not purchase used jewellery gold, melt it down (for their account or by commissioning a third party) and dispose of the resulting finished product.
Interpretative Scheme:
It means that, in essence, the purpose of the right to be heard is to put the taxpayer in the position of being able to make his or her observations before the decision is made and, therefore, to ensure that the administration can take into account all the elements of the case in adopting (or not adopting) the measure or in giving this one content rather than another.
Principle Scheme:
In the European Union system, therefore, the right to be heard in tax matters is independent of the nature of the tax and must be applied whenever the administration on the basis of the documentation exhibited deems it necessary to give the same documentation an interpretation that differs from that given by the taxpayer, inviting him, as mentioned, to provide in the exercise of the right to be heard the reasons for his choice.

B. Translated Prompts

Argument Component classification.
“Classify the following argumentative text as premise ‘prem’ or conclusion ‘conc’. A premise (prem) is a proposition that provides a reason or support for the argument. A conclusion (conc) is the statement that follows logically from the premise(s) and represents the final point being argued for. Text:”

Premise Type classification.
“Classify the following premise as factual ‘F’, legal ‘L’ or both. Factual premises (F) describe factual situations and events, pertaining to the substance or the procedure of the case. Legal premises (L) specify the legal content (legal rules, precedents, interpretation of applicable laws and principles). The expected output is a list with all applicable labels. For example: [‘F’, ‘L’]. Text:”

Argument Scheme classification.
“Classify the following legal premise as one or more of the following argumentative schemes: Rule, Prec, Class, Itpr, Princ. Rule: whether there is an explicit or implicit reference to an article of law or citation of the text of a certain article. Prec: whether there is a reference to a previous ruling of the Supreme Court or the Court of Justice of the European Union. Class: if there is a definition of a legal concept or its constituent elements. Itpr: if there is reference to one of the interpretative criteria (literal, teleological, psychological, systematic) contained in Article 12 of the preliminary provisions (preleggi) to the Civil Code. Princ: if there is a reference to a general principle of law (e.g. principle of proportionality). The expected output is a list with all applicable labels. For example: [‘Prec’, ‘Princ’, ‘Rule’]. Text:”

C. Few-shot prompts

Argument Component classification.
“Classifica il seguente testo argomentativo come premessa ‘prem’ o conclusione ‘conc’. Per premessa (prem) si intende una proposizione che fornisce una ragione o un supporto per l’argomentazione. Per conclusione (conc) si intende l’affermazione che segue logicamente dalle premesse e rappresenta il punto finale che viene argomentato.
Esempi:
Testo: Si osserva poi che ritenere che la mancata possibilità di detrazione a favore di soggetti come il ricorrente comporti un aiuto di Stato in favore degli ospedali pubblici, in quanto le perdite degli stessi vengono ripianate dalle USL e dalla Regioni trascura di considerare l’accessibilità, indiscriminata, ai servizi dei nosocomi pubblici da parte dei soggetti iscritti al SSN, rispetto a quella ad un libero professionista sanitario che, in quanto tale, ben potrebbe rifiutarsi di prestare i propri servigi al pare di un normale contraente
Risposta: prem
Testo: L’appello è infondato e va respinto
Risposta: conc
Testo: Va osservato che la motivazione dell’atto di accertamento non può esaurirsi nel rilievo dello scostamento, ma deve essere integrata con la dimostrazione dell’applicabilità in concreto dello ‘standard’ prescelto e con le ragioni per le quali sono state disattese le contestazioni sollevate dal contribuente. (cfr. Cass. S.U. 26635/2009, Cass. 12558/2010, Cass. 12428/2012, Cass. 23070/2012)
Risposta: prem
Testo: Dunque, l’ufficio ha riconosciuto la non imponibilità IVA delle cessioni all’esportazione, così cessando sul punto la materia del contendere
Risposta: conc
Testo: Risulta d’altronde dalle osservazioni scritte del governo spagnolo che quest’ultimo non riesce a discernere tale differenza ad un esame delle pertinenti norme dell’ordinamento spagnolo.
Risposta: prem
Testo: Il Collegio, esaminata l’eccezione preliminare svolta nel suo appello dall’Ufficio e relativa alla richiesta nullità della sentenza per mancata instaurazione del contraddittorio, la respinge
Risposta: conc
Testo: ”

Premise Type classification.
“Classifica la seguente premessa come di fatto ‘F’, legale ‘L’ o entrambe. Le premesse di fatto (F) descrivono situazioni ed eventi fattuali relativi al caso di specie. Le premesse legali (L) specificano il contenuto giuridico (norme giuridiche, precedenti, interpretazione delle leggi e dei principi applicabili). L’output atteso è una lista con tutte le label applicabili. Ad esempio: [‘F’, ‘L’].
Esempi:
Testo: Per i primi giudici nel caso di specie questa esenzione non poteva essere applicata perché la complessiva attività di ‘A’ srl era un’attività commerciale svolta in concorrenza con altre imprese operanti nel settore
Risposta: [‘F’]
Testo: In assenza di siffatti elementi, che in via presuntiva avrebbero potuto fare giungere questo giudice a conclusioni diverse in via logica, si deve confermare l’esito cui è giunta la commissione provinciale
Risposta: [‘F’]
Testo: Su questo si osserva che si deve condividere la circostanza dedotta dal giudice di prime cure per cui deve essere il contribuente, ove sia contestata la inerenza e verità della rappresentazione ricavabile dal documento contabile, a dare la dimostrazione della fondatezza e della correttezza del comportamento tenuto
Risposta: [‘L’]
Testo: L’Ufficio non potrà impedire ad un imprenditore, per esempio, di cedere immobili con prezzi bassi o nulli per ricavare liquidità a fronte di nuovi impegni, ma dovrà rilevare la condotta antieconomica dello stesso sulla base dell’utile di esercizio
Risposta: [‘L’]
Testo: Invero l’avviso di accertamento è fondato sul mancato rispetto, da parte del contribuente, nel calcolo del ROL, delle disposizioni dell’articolo 96, secondo comma, del TUIR, che ne definisce le modalità
Risposta: [‘F’, ‘L’]
Testo: La società ‘A’, per quanto previsto dall’art. 4, comma 18 del Regolamento CEE n. 2913/1992, riveste il ruolo di ‘dichiarante in Dogana’, soggetto passivo della obbligazione
Risposta: [‘F’, ‘L’]
Testo: ”

Argument Scheme classification.
“Classifica la seguente premessa legale in uno o più dei seguenti schemi argomentativi: Rule, Prec, Class, Itpr, Princ. Rule: se esiste un riferimento esplicito o implicito a un articolo di legge o la citazione del testo di una norma. Prec: se esiste un riferimento ad una precedente pronuncia della Corte di Cassazione o della Corte di Giustizia dell’Unione Europea. Class: se c’è la definizione di un concetto giuridico o degli elementi costitutivi dello stesso. Itpr: se c’è il riferimento a uno dei criteri interpretativi contenuti all’art. 12 delle preleggi (letterale, teleologica, psicologica, sistematica) al codice civile. Princ: se c’è un riferimento espresso a un principio generale del diritto (es. principio di proporzionalità). L’output atteso è una lista con tutte le label applicabili. Ad esempio: [‘Prec’, ‘Princ’, ‘Rule’].
Esempi:
Testo: Infatti, è ben vero che, ai sensi del combinato disposto dagli articoli 54 e 23 D.Lgs. n. 546/1992, il convenuto in appello deve costituirsi entro 60 giorni dal giorno in cui ricorso è stato notificato.
Risposta: [‘Rule’]
Testo: L’Amministrazione “ha l’onere di provare ed allegare gli elementi probatori su cui si fondi la contestazione, tra i quali possono rilevare, in via indiziaria, quali elementi sintomatici della mancata esecuzione della prestazione dal fatturante, l’assenza della minima dotazione personale e strumentale, l’immediatezza dei rapporti (cedente/prestatore fatturante interposto e cessionario/committente), una conclamata inidoneità allo svolgimento dell’attività economica e la non corrispondenza tra i cedenti e la società coinvolta nell’operazione”
Risposta: [‘Prec’]
Testo: In conclusione, per quanto fin qui esposto, i “compro oro” possono essere definiti come “esercizi commerciali che acquistano, commerciano o rivendono oggetti d’oro, di metalli preziosi o recanti pietre preziose usati e li cedono nella forma di materiale, di rottami d’oro o di metalli preziosi alle fonderie o ad altre aziende specializzate nel recupero di materiali preziosi”. Trattano esclusivamente prodotti finiti e non possono, congiuntamente, acquistare oro da gioielleria usato, fonderlo (per proprio conto o con incarico a terzi) e cedere il prodotto finito ottenuto
Risposta: [‘Class’]
Testo: Si vuole dire, in sostanza, che la finalità del contraddittorio anticipato è quella di mettere il contribuente nella condizione di potere fare valere le proprie osservazioni prima che la decisione sia adottata e, quindi, di far sì che l’Amministrazione possa tener conto di tutti gli elementi del caso nell’adottare (o non adottare) il provvedimento ovvero nel dare a questo un contenuto piuttosto che un altro.
Risposta: [‘Itpr’]
Testo: Nell’ordinamento unionale, pertanto, il principio del contraddittorio in ambito tributario prescinde dalla natura del tributo e deve trovare applicazione ogni qualvolta l’amministrazione sulla base della documentazione esibita ritenga dovere dare alla stessa documentazione interpretazione diversa da quella data dal contribuente invitandolo, come detto fornire nel corso del contraddittorio le ragioni della propria scelta
Risposta: [‘Princ’]
Testo: In sintesi per esterovestizione si intende la fittizia localizzazione della residenza fiscale di un soggetto all’estero, in particolare in un Paese con un trattamento fiscale più vantaggioso di quello nazionale, che la giurisprudenza configura in termini di abuso del diritto riconosciuto, in via tendenziale, come principio generale anche nel diritto dei singoli Stati membri (v. Cass., Sez. Un., n. 30055 del 2008, secondo la quale il divieto di abuso del diritto si traduce in un principio generale antielusivo che trova fondamento, in tema di tributi non armonizzati, nei principi costituzionali di capacità contributiva e di progressività dell’imposizione).
Risposta: [‘Prec’, ‘Class’, ‘Princ’]
Testo: La denuncia, infatti, non codificata nel codice di procedura penale (a differenza della notizia di reato di cui all’articolo 347 c.p.p.), può definirsi come qualunque atto con il quale chiunque abbia notizia di un reato perseguibile d’ufficio ne informa il pubblico ministero o un ufficiale di polizia giudiziaria.
Risposta: [‘Rule’, ‘Itpr’, ‘Class’]
Testo: ”
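
The few-shot prompts in this appendix share one layout: an instruction, an "Esempi:" block of "Testo:"/"Risposta:" pairs, and a final "Testo:" slot for the text to classify. For the two multi-label tasks the expected answer is a Python-style list of labels. The following is a minimal sketch of how such a prompt could be assembled and how a list-formatted answer could be read back. The function names `build_few_shot_prompt` and `parse_label_list` are hypothetical, and the paper does not state how model outputs were actually parsed, so the parsing rules below (first bracketed list wins, unknown labels are dropped) are assumptions made for illustration only.

```python
import ast
import re

# Label inventories taken from the prompts in this appendix.
PREMISE_TYPE_LABELS = {"F", "L"}
SCHEME_LABELS = {"Rule", "Prec", "Class", "Itpr", "Princ"}


def build_few_shot_prompt(instruction, examples, text):
    """Assemble a prompt in the 'Esempi: / Testo: / Risposta:' layout.

    `examples` is a sequence of (text, answer) pairs, where `answer` is the
    answer string exactly as it should appear after 'Risposta:'.
    """
    lines = [instruction, "Esempi:"]
    for example_text, answer in examples:
        lines.append(f"Testo: {example_text}")
        lines.append(f"Risposta: {answer}")
    lines.append(f"Testo: {text}")
    return "\n".join(lines)


def parse_label_list(raw_output, allowed_labels):
    """Extract the first bracketed label list from a model answer.

    Typographic quotes are normalised to straight quotes first. Returns the
    labels that belong to `allowed_labels`, in the order they were given, or
    an empty list when no well-formed list is found (an assumed fallback).
    """
    normalized = raw_output.replace("‘", "'").replace("’", "'")
    match = re.search(r"\[[^\[\]]*\]", normalized)
    if match is None:
        return []
    try:
        parsed = ast.literal_eval(match.group(0))
    except (ValueError, SyntaxError):
        return []
    if not isinstance(parsed, list):
        return []
    return [
        label
        for label in parsed
        if isinstance(label, str) and label in allowed_labels
    ]


if __name__ == "__main__":
    prompt = build_few_shot_prompt(
        "Classifica il seguente testo argomentativo come premessa 'prem' "
        "o conclusione 'conc'.",
        [("L'appello è infondato e va respinto", "conc")],
        "Il ricorso va accolto",  # made-up input, not from the dataset
    )
    print(prompt)
    print(parse_label_list("Risposta: [‘Prec’, ‘Princ’, ‘Rule’]", SCHEME_LABELS))
```

Dropping labels outside the known inventory keeps a malformed answer from being counted as a valid prediction. Whether such an answer should instead be scored as an error is an evaluation choice that this sketch leaves open.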