Knowledge-Based Legal Document Retrieval: A Case Study on Italian Civil Court Decisions

Valerio Bellandi¹, Silvana Castano¹, Paolo Ceravolo¹, Ernesto Damiani¹, Alfio Ferrara¹, Stefano Montanelli¹, Sergio Picascia¹, Antongiacomo Polimeno¹ and Davide Riva¹

¹ Università degli Studi di Milano, Department of Computer Science, Via Celoria, 18 - 20133 Milano, Italy

EKAW'22: Companion Proceedings of the 23rd International Conference on Knowledge Engineering and Knowledge Management, September 26–29, 2022, Bozen-Bolzano, IT
* Corresponding author.
valerio.bellandi@unimi.it (V. Bellandi); silvana.castano@unimi.it (S. Castano); paolo.ceravolo@unimi.it (P. Ceravolo); ernesto.damiani@unimi.it (E. Damiani); alfio.ferrara@unimi.it (A. Ferrara); stefano.montanelli@unimi.it (S. Montanelli); sergio.picascia@unimi.it (S. Picascia); antongiacomo.polimeno@unimi.it (A. Polimeno); davide.riva1@unimi.it (D. Riva)

Abstract
In this paper, we present a knowledge-based approach for legal document retrieval based on the organization of a textual data repository and on document embedding models. Pre-processed and embedded documents are iteratively classified at sentence level through a terminology extraction and concept formation cycle, using a zero-knowledge approach that offers a high degree of flexibility with regard to the integration of external knowledge and the variability of inputs, and is thus suitable to face the scarcity of annotated data and the specificity of terminology that characterize Italian legal document corpora.

Keywords
legal knowledge extraction, legal document retrieval, semantic search, zero-shot learning

1. Introduction

Document retrieval is a daily activity in a wide variety of domains. In the legal domain, retrieval of legal documents, like law articles and court decisions, is important for several categories of legal actors: practitioners (attorneys, lawyers), to support their professional activities; administrators (legislators, judges), to enforce law procedures; and users (citizens, organizations), for information exploration and exploitation [1]. To provide effective retrieval functionalities, semantic approaches and knowledge-based systems are being proposed for the legal domain, combining Natural Language Processing (NLP) and context-aware embedding models to extract and conceptualize relevant terminology from documents. In particular, to cope with the variety of terminology adopted by judges in the production of legal documents such as court decisions, it is essential to develop approaches that provide word sense disambiguation and can retrieve documents referring to the same concept regardless of the adopted terminology, in order to deal with synonyms, circumlocutions, polysemic terms, and similar situations. In addition, applying general-purpose supervised information retrieval techniques to legal documents would likely be ineffective, since they would suffer from the lack of sufficiently large corpora of annotated documents.

In this paper, we present a knowledge-based approach for legal document retrieval based on the organization of a textual data repository and on document embedding models. Pre-processed and embedded documents are iteratively classified at sentence level through a terminology extraction and concept formation cycle.
The approach relies on the ASKE (Automated System for Knowledge Extraction) engine, which iteratively solves a multilabel classification problem with zero initial knowledge, that is, without any annotation of the documents. A presentation of the ASKE functionalities is given in [2]. In this paper, we describe an application of ASKE to a real case study of legal document retrieval from a repository of Italian court decisions, as part of the Next Generation UPP (NGUPP) project, funded by the Italian Ministry of Justice, which aims at providing artificial intelligence and advanced information management techniques for the digital transformation of Italian legal processes and digital justice in general. In particular, we present a case study addressing a practical task faced by law practitioners and administrators, namely the retrieval of past court decisions (so-called "precedents") based on one or more text fragments given in input, such as sentences, definitions, or excerpts of articles. The objective is to retrieve the documents (e.g., court decisions or the sentences therein contained), i.e., the precedents, that are most pertinent to the input query.

The paper is organized as follows: Section 2 discusses related work on legal information retrieval. Section 3 describes the proposed approach for legal document retrieval. Section 4 describes the real case study. Finally, Section 5 is devoted to ongoing and future work.

2. Related work

Legal information retrieval (LIR) is the discipline that aims at extracting information from a corpus of legal documents, including case law decisions and legal codes. Since the digital transformation of these documents began, LIR has been of interest to both legal actors and information scientists. Boolean searches [3] were the first method applied to accomplish this task, followed by rule-based [4] and NLP-based [5] approaches. Other works focus on the exploitation of external resources, such as ontologies [6] or thesauri [7], combined with natural language processing techniques. With the advent of language models, more elaborate systems have been developed [8], which account for the contextual meaning of words and sentences rather than simply detecting their occurrences.

The main obstacle with legal documents is the lack of sufficient data, especially for languages other than English; this is a crucial aspect given the requirements of neural network-based language models. For this reason, one of the topics of interest for our work has been the zero-shot learning (ZSL) approach. ZSL is a problem setup in the field of machine learning where a classifier is required to predict labels of examples extracted from classes that were never observed in the training phase. It was first referred to as dataless classification in 2008 [9] and has quickly become a subject of interest, particularly in the field of natural language processing. The great advantage of this approach is that the resulting classifier is able to operate efficiently in a partially or totally unlabeled environment. ZSL techniques can be classified according to three different criteria, as explained in [10]: the learning setting, the semantic space, and the method. Firstly, ZSL can be applied to a completely unlabeled dataset, as in the original paper [9], or to a partially labeled one, as in [11]; with the latter approach, called generalized ZSL, the goal of the classifier shifts to distinguishing between observations from already seen classes and examples from unseen ones.
Secondly, one may distinguish an engineered semantic space from a learned semantic space: the former is designed by humans and can be constructed upon a set of attributes [12] or a collection of keywords [13], while the latter is built on top of the results of a machine learning model, as in the case of a text-embedding space [14]. Finally, ZSL methods can be divided into instance-based methods [15], whose focus is on obtaining examples for unseen classes, and classifier-based methods [16], which instead focus on directly building a classifier for unlabeled instances.

Our approach relies on the ASKE engine to classify document chunks and to extract knowledge from them. This process is carried out in a completely unsupervised environment, therefore eliminating the need for annotated data, by operating in a text-embedding space. The instance-based method we employ falls under the category of projection methods, which label instances by placing them in the same semantic space as class prototypes. A key component in ASKE is Sentence-BERT [17], a modification of the BERT language model [18] that is specifically aimed at representing sentence meaning in a vector space. In the legal domain, a specific pre-trained BERT model, known as LEGAL-BERT, has been built by pre-training the original BERT model on several legal corpora [19]. Sentence-BERT, instead, has been used to retrieve legal documents by their embedding similarity [20]. By comparison, few models have been developed for the Italian language. The LamBERTa model [21] is built upon an Italian pre-trained BERT model, fine-tuned for article retrieval using the Italian Civil Code as a corpus. By outperforming several other neural models, it proved to be a promising adaptation of BERT to Italian legal language. In our work, however, we preferred to focus on sentence representation by adopting a pre-trained multilingual Sentence-BERT model.

3. The proposed knowledge-based approach for legal document retrieval

As shown in Figure 1, the proposed approach is organized in three phases. Phase 1 is devoted to data acquisition, organizing a repository in which source documents and associated metadata are stored. Phase 2 deals with textual data preprocessing, where documents are split into chunks, queries are submitted by the user, and the two are mapped into a vector space by a pretrained document embedding model. Finally, in phase 3, the ASKE cycle is performed, consisting in zero-shot classification of documents, terminology enrichment, and concept formation. Zero-shot classification is exploited for document retrieval by vector similarity, while the other two steps are used for knowledge extraction and query expansion. Retrieved documents are stored, together with the terminological and conceptual knowledge extracted from text, in a graph data structure called the ASKE Conceptual Graph (ACG).

[Figure 1: The proposed approach. Phase 1: data acquisition (source data, Elasticsearch); phase 2: textual data preprocessing (document chunks, queries, embedding model); phase 3: knowledge-based document retrieval (zero-shot classification, terminology enrichment, concept formation, ACG).]

3.1. Data acquisition

In this work, we consider a set of 1,059,358 public source legal documents with textual content and associated metadata, provided by the Italian Ministry of Justice. Approximately 92% consists of court decisions; the remaining documents are ancillary acts of the courts. The source documents were originally organized in folders, subdivided by district of the Italian administration of justice, containing CSV files in which each row holds the metadata and text of a single document. For each source document, 41 metadata fields are provided. Some of them, such as the path where the text is stored or the id of the user who created the document, contain information used by the document management system of the Courts but of little relevance for subsequent document retrieval. Other metadata fields are more content-related and provide useful information for the subsequent phases of document retrieval, as listed below, where the numbers in parentheses are the percentages of documents in which the field is filled in:

• the year when the trial started (100%) and the code assigned to it (100%);
• the year when the decision was enacted (100% of court decisions) and the code assigned to it (100% of court decisions);
• a code for the trial subject, called the "object code" (100%);
• the names of the claimant (99.2%) and defendant (98.9%);
• the name of the judge (100%);
• first instance or appeal (100%);
• court and office codes (100%);
• 8 pieces of data about the first instance trial in case of appeal.

The district, too, can be considered part of the metadata: even if not coded in the records, it can be deduced from the folder structure. The object code is a hierarchical classification of trial subjects, used by the Italian IT infrastructure of the courts, that considers both the general type (e.g., family) and the specificity of the case (e.g., divorce by mutual agreement). Starting from the source documents and associated metadata, we created an Elasticsearch indexed repository, so that document sets can be extracted for any combination of the metadata in the subsequent phases of the approach. The architecture of the repository is based on microservices, exposing specific APIs for document filtering, as sketched below.
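To illustrate how such a repository can be queried, the following is a minimal sketch of a metadata-based corpus extraction with the official Python Elasticsearch client. The index name, the field names, and the object codes are hypothetical placeholders, not the actual project schema; the filter mirrors the two-step extraction later described in Section 4, combining object-code matching with a full-text phrase match.

```python
# Minimal sketch of a metadata-based corpus extraction. Index name, field
# names, and object codes below are hypothetical placeholders, not the
# actual repository schema used in the project.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Hypothetical object codes for the trial subject of interest.
OBJECT_CODES = ["140010", "140020"]  # illustrative values only

query = {
    "bool": {
        "should": [
            # Match on the hierarchical trial-subject ("object") code ...
            {"terms": {"object_code": OBJECT_CODES}},
            # ... or on a phrase occurring in the decision text
            # ("concorrenza sleale" is assumed here as the Italian phrase
            # for unfair competition).
            {"match_phrase": {"text": "concorrenza sleale"}},
        ],
        "minimum_should_match": 1,
    }
}

response = es.search(index="court_decisions", query=query, size=100)
documents = [hit["_source"]["text"] for hit in response["hits"]["hits"]]
```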
3.2. Textual data preprocessing

The inputs of the textual data preprocessing phase are a corpus of documents 𝐷 = {𝑑1, ..., 𝑑𝑛}, extracted from the repository organized in the previous phase, plus a set of queries 𝑄 = {𝑞1, ..., 𝑞𝑚}. Queries can be: (i) lemmas; (ii) text fragments, for instance definitions or excerpts of other documents; (iii) concepts from an existing knowledge base. While in case (ii) the query corresponds to the submitted fragment and in case (iii) to the concept definition, case (i) requires the system to retrieve all possible definitions of the input lemma from an external knowledge base, not specified by the user.

Documents are first tokenized and split into chunks 𝑘1, ..., 𝑘𝑁. In our case, chunks correspond to sentences, but the analysis may be carried out at a different granularity level. Then stopwords are removed and the remaining terms are lemmatized. The preprocessing phase ends with sentence embedding, in which the document chunks 𝑘1, ..., 𝑘𝑁 and the queries 𝑞1, ..., 𝑞𝑚 are mapped into the same vector space using a pre-trained Sentence-BERT [22] model, either language-specific or multilingual, as sketched below.
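A minimal sketch of this phase is given below, assuming spaCy's Italian pipeline for sentence splitting and lemmatization and the multilingual Sentence-BERT model later used in the case study (Section 4); the helper names are illustrative, not part of ASKE's published API.

```python
# Sketch of textual data preprocessing: sentence-level chunking, stopword
# removal with lemmatization, and mapping of chunks and queries into the
# same vector space. Helper names are illustrative.
import spacy
from sentence_transformers import SentenceTransformer

nlp = spacy.load("it_core_news_sm")  # assumed Italian spaCy pipeline
model = SentenceTransformer("distiluse-base-multilingual-cased-v2")

def split_into_chunks(document: str) -> list[str]:
    """Split a document into sentence-level chunks k_1, ..., k_N."""
    return [sent.text.strip() for sent in nlp(document).sents]

def lemmatize(chunk: str) -> list[str]:
    """Remove stopwords and punctuation, and lemmatize the remaining terms."""
    return [tok.lemma_ for tok in nlp(chunk) if tok.is_alpha and not tok.is_stop]

documents = ["..."]  # corpus D extracted from the repository
queries = ["..."]    # user-provided lemmas, fragments, or concept definitions

chunks = [k for d in documents for k in split_into_chunks(d)]

# Chunks and queries are embedded into the same vector space; normalized
# vectors make the dot product equal to the cosine similarity.
chunk_embeddings = model.encode(chunks, normalize_embeddings=True)
query_embeddings = model.encode(queries, normalize_embeddings=True)
```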
3.3. Knowledge-based document retrieval

Document retrieval is carried out by enforcing the ASKE cycle, which iteratively solves a multilabel classification problem with zero initial knowledge, that is, without any annotation of the documents. The goal is to classify document chunks with the query labels 𝑞1, ..., 𝑞𝑚, which can correspond to words (the input lemmas in case (i) of the previous paragraph and the concept labels in case (iii)) or to alphanumeric IDs (case (ii), in which summarizing the input fragments with an unambiguous label may not be possible).

The first step in the ASKE cycle is the zero-shot classification step, in which query labels are assigned to document chunks according to a similarity 𝜎 between the chunk embedding k and the query embedding q:

𝑓𝑄(k) = {𝑞 ∈ 𝑄 : 𝜎(k, q) ≥ 𝛼}

where the threshold 𝛼 is treated as a hyperparameter: the closer it is to 1, the fewer query labels will be associated with each chunk. Such a setting may imply that not all chunks that are relevant for a query will be retrieved, while 𝛼 ≪ 1 may result in the association of chunks that are not relevant for the query. A sketch of this step is given below.
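The following sketch implements 𝑓𝑄(k) on top of the embeddings computed in the previous sketch, assuming cosine similarity as 𝜎 (a natural choice for Sentence-BERT embeddings, though not the only possible one).

```python
# Sketch of the zero-shot classification step: each chunk k receives every
# query label q with sigma(k, q) >= alpha. Cosine similarity is assumed as
# sigma; with normalized embeddings it reduces to a dot product.
import numpy as np

ALPHA = 0.2  # threshold value used in the case study of Section 4

def classify(chunk_embeddings: np.ndarray,
             query_embeddings: np.ndarray,
             alpha: float = ALPHA) -> list[list[int]]:
    """Return, for every chunk, the indices of the query labels assigned to it."""
    similarities = chunk_embeddings @ query_embeddings.T  # (chunks x queries)
    return [list(np.flatnonzero(row >= alpha)) for row in similarities]

labels = classify(chunk_embeddings, query_embeddings)
```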
While the classification step already produces a set of document chunks that are pertinent to the input queries, the retrieval is further refined by a knowledge extraction and query expansion mechanism consisting of two additional steps, performed at each cycle iteration: terminology extraction and concept formation. Given the set of chunks 𝐾𝑞 classified with label 𝑞, the terminology extraction step considers the term lemmas 𝑡 ∈ 𝑘, ∀𝑘 ∈ 𝐾𝑞; it retrieves and embeds all their possible definitions 𝑠𝑡,1, ..., 𝑠𝑡,𝐿 from a predefined knowledge base (e.g., WordNet), and performs term sense disambiguation by retaining only the definition whose embedding s*𝑡 is the closest to the query embedding q. Finally, for each query 𝑞, only the terms that satisfy the following condition are extracted:

𝑇(𝑞) = {𝑡 ∈ 𝑘, ∀𝑘 ∈ 𝐾𝑞 : 𝜎(s*𝑡, q) + 𝜎(s*𝑡, K𝑞) ≥ 𝛽}

where 𝛽 is again a hyperparameter and K𝑞 = (1/|𝐾𝑞|) ∑𝑘∈𝐾𝑞 k is the centroid of the embeddings of the chunks that have been labelled with 𝑞. All terms are finally clustered in what we call the "concept formation" step; the resulting clusters of terms, referred to as "concepts" and represented by their centroid in the embedding space, are added to the set of queries in the following cycle iteration. For concept formation, any clustering algorithm is suitable, producing different but comparable results, as long as the number of clusters is not fixed a priori and the clusters corresponding to the user-defined queries are always retained. The newly formed concepts are considered as "derived" from the initial queries or from concepts formed at previous iterations.

Terminology extraction and concept formation contribute to query expansion in two ways. First, they add derived concepts to the set of initial queries, enabling the extraction of additional information from the retrieved documents. Second, they modify the initial queries themselves. Indeed, at the first iteration a query 𝑞 is represented solely by its embedding, while at later iterations it is represented by the centroid of its embedding and the embeddings of the terms clustered with 𝑞.

All concepts, document chunks, and extracted terms are stored in a graph data structure called the ACG (ASKE Conceptual Graph). The ACG includes term-to-chunk belonging, term-to-concept (or term-to-query) relatedness, concept-to-chunk (or query-to-chunk) labelling, and concept-to-concept (or query-to-concept) derivation relationships. The data contained in the ACG at the end of every iteration constitute the input to the subsequent iteration: in particular, concepts are used as new queries. This mechanism enables iterative refinement of the retrieval operation by exploiting knowledge contained in the documents themselves. In the end, the ACG will contain a set of final concepts, a vocabulary of extracted terms, and the collection of document chunks classified against the initial query labels 𝑞1, ..., 𝑞𝑚 and against the new concepts obtained as the result of the ASKE cycle. A sketch of the terminology extraction and concept formation steps is given below.
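The sketch below illustrates terminology extraction and concept formation for a single query 𝑞, reusing the model and embeddings of the previous sketches and again assuming cosine similarity. The definitions() lookup is a hypothetical helper (one possible WordNet-based implementation is sketched in Section 4.1), and affinity propagation stands in for any clustering algorithm that does not fix the number of clusters a priori.

```python
# Sketch of terminology extraction and concept formation for one query q,
# reusing `model`, `query_embeddings` and `chunk_embeddings` from the
# previous sketches. `definitions` is a hypothetical knowledge-base lookup.
import numpy as np
from sklearn.cluster import AffinityPropagation

BETA = 0.2  # threshold value used in the case study of Section 4

def disambiguate(lemma: str, q_emb: np.ndarray) -> np.ndarray:
    """Embed all candidate definitions of a lemma and keep s*_t, the sense
    whose embedding is closest to the query embedding q."""
    sense_embs = model.encode(definitions(lemma), normalize_embeddings=True)
    return sense_embs[np.argmax(sense_embs @ q_emb)]

def extract_terms(lemmas: set[str], q_emb: np.ndarray,
                  labelled_chunk_embs: np.ndarray) -> dict[str, np.ndarray]:
    """Retain the terms t with sigma(s*_t, q) + sigma(s*_t, K_q) >= beta."""
    centroid = labelled_chunk_embs.mean(axis=0)  # centroid K_q of the chunks labelled with q
    kept = {}
    for t in lemmas:
        s_t = disambiguate(t, q_emb)
        if s_t @ q_emb + s_t @ centroid >= BETA:
            kept[t] = s_t
    return kept

# `lemmas_in_K_q` (the lemmas occurring in the chunks labelled with q, obtained
# with the lemmatize helper above) and `labelled_idx` (the indices of those
# chunks) are assumed to be available from the classification step.
terms = extract_terms(lemmas_in_K_q, query_embeddings[0], chunk_embeddings[labelled_idx])

# Concept formation: cluster the retained senses; the centroid of each cluster
# becomes a derived query for the next iteration of the ASKE cycle.
clusters = AffinityPropagation().fit_predict(np.stack(list(terms.values())))
```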
4. Application to a real case study of "precedents" retrieval

In this section, we describe a real case study of using ASKE in the legal domain to retrieve the precedents for a given case. This case study has been chosen because it corresponds to a significant and frequent task that judges commonly face in their decision process. We built a web prototype in order to demonstrate the application of the ASKE approach to legal document retrieval. The prototype works as a search engine, which takes an initial query (i.e., the input text expressing the precedent search) as input and returns a list of document chunks ordered by pertinence as output.

To define the corpus of source documents for the case study, the project team decided to focus on the matter of unfair competition. The underlying repository is accessed to extract all the documents related to unfair competition. A first extraction exploits metadata only, by querying the Elasticsearch repository using the 8 object codes related to unfair competition. In this way, an initial corpus of 779 court decisions is built. To handle situations where court decisions referring to unfair competition might have been associated with a different object code (this occurs, for instance, for documents where the decision deals with several different matters), a second repository extraction has been performed to also retrieve documents containing the term "unfair competition" in the document text. As a result, a final corpus of 3171 documents is formed. The prototype is designed to support several datasets, such as court decisions related to different fields (e.g., civil law, criminal law), to provide a comprehensive and flexible tool.

In the case study, we ran the prototype on the unfair competition dataset to search for related precedents over past court decisions, on the basis of one or more text fragments provided as input by the user. Three main "retrieval-by-input" patterns are envisaged, namely:

1. retrieval by law provision;
2. retrieval by decision chunk;
3. retrieval by keyword.

For the purposes of the case study, pattern (1) is enforced by using provisions from the Italian Civil Code, pattern (2) by using chunks from rulings of the Italian Supreme Court, and pattern (3) by using sets of keywords suggested by the legal experts participating in the NGUPP project team. ASKE was applied choosing a multilingual Sentence-BERT model¹ and thresholds 𝛼 = 0.2 and 𝛽 = 0.2, which are only partly relevant if we focus on the highest-similarity results. Figure 2 shows the results ordered by pertinence. Each row in the table represents a pertinent document chunk for which, from left to right, the following information is given to the user:

• query id: the query to which the chunk has been assigned; since the user can define multiple queries, each query is enumerated by order of definition;
• text: the plain text of the document chunk;
• metadata: the district, date and id of the court decision document the chunk belongs to;
• pertinence: a percentage value of similarity between the chunk and the input query.

The user can click on a chunk row to open an external page showing the full text of the original document from which the chunk has been extracted.

[Figure 2: Example of the results retrieved over the unfair competition dataset]

¹ distiluse-base-multilingual-cased-v2, from www.sbert.net/docs/pretrained_models.html

4.1. Preliminary analysis of results

With the help of the legal and linguistics experts in the project team, we tried to simulate the typical process that a judge is likely to undertake when looking for precedents. We were advised to use parts of a law or of a court decision as search queries, and we evaluated the resulting output, corresponding to pattern (1) of the retrieval-by-input patterns previously envisaged.

Before proceeding further, a couple of remarks are needed. Each of the following retrieval tests has been performed on the original documents in Italian; for the sake of clarity and understandability, some translations, made by the authors, are provided to the reader. Furthermore, for brevity, we report below only the retrieved text chunks with the highest pertinence score.

Test (1): example of retrieval by law provision, that is, retrieval performed using an entire law or part of it as the input query. In this test, we give as input Article 2598, paragraph 2, of the Italian Civil Code, instituting the second type of unfair competition:

"[...] any person who disseminates information and appreciations on the products and activities of a competitor, capable of discrediting them, or takes possession of the merits of a competitor's products or business."

As can be seen from the two retrieved chunks below, ASKE is able to retrieve as most relevant results those containing a reference to the input law text, regardless of the way in which this reference is made, whether explicitly or implicitly.

"Therefore, neither would be the additional case referred to in article 2598 comma 2 of the Civil Code, which refers to the unfair nature of competition in the act or behaviour of those who disseminate news and appreciations of the products and activities of a competitor, capable of discrediting them, or take possession of the merits of a competitor's products or business."

"In order to qualify unfair competition for denigration, it is not necessary that the news and appreciations spread among the public relate specifically to the competitor's products, since they may also have as their object circumstances or opinions more generally inherent to the activity of the latter, and therefore also to its organization or to the way of acting of the entrepreneur in the professional field (with the exclusion, therefore, of its strictly personal and private sphere), whose knowledge by third parties is in any case likely to adversely affect the consideration that the company enjoys among consumers."

Test (2): example of retrieval by decision chunk, that is, retrieval performed using a part of a court decision as the input query.
In particular, this test used a court decision chunk referring to the first of the three types of unfair competition identified by the Italian Civil Code (Article 2598, paragraph 1):

"The reproduction of what has been or could have been the subject of patent for model of utility or ornament cannot, by itself, normally supplement the extremes of unfair competition for slavish imitation, being necessary, for the survival of the latter, that the presentation of the goods - on the basis of a comparative examination of similar products in relation to the degree of diligence and capacity of the average consumer, to whom the goods are intended to be allocated - is carried out in such a way as to mislead the consumer, so that he, wishing to buy the goods of a certain producer, may confuse it with that of a competitor."

Although the evaluation of the results retrieved by ASKE for this kind of query may not be immediate, since one needs to understand the meaning and the context of these chunks, the law experts of the project team judged the relevance/pertinence of the retrieved results positively, demonstrating the effectiveness of the proposed approach at this preliminary stage of evaluation, i.e., manual validation by legal experts. In particular, the two most pertinent chunks retrieved are reported below:

"In order to be considered integrated in the case of unfair competition for slavish imitation, it is also necessary to verify the existence of the imitation and the distinctive character of the model, namely its ability to link the product to a certain company, and the ability of imitation behaviour to create confusion in the average consumer regarding the origin of the product."

"Indeed, anti-competitive protection against slavish imitation may be more effective than the one provided by the legislation on the protection of models, since the exclusion of counterfeiting of an ornamental model justified by the fact that what is claimed to be unlawful presents individual character, does not preclude verification, with regard to the same products, of unfair competition for slavish imitation given that the latter is subject to the diligence of the average consumer and not to the higher one required for the informed user in the context of the assessment of individual character."

Finally, a third test has been conducted to experiment with the retrieval-by-keyword pattern defined in Section 4. In this case, keywords like "denigration" (related to Article 2598, paragraph 2) and "slavish imitation" (Article 2598, paragraph 1) have been provided as input queries, as suggested by the legal experts. The results were less satisfactory than in the previous cases: our approach aims at capturing the contextual meaning provided by a portion of text, which cannot be equally captured through the ASKE cycle from simple keywords or from a combination of a few keywords. As mentioned in Section 3.2, the system may also be allowed to retrieve all definitions of a lemma from an external knowledge base, as sketched below.
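As an illustration, the following sketch retrieves such definitions through NLTK's WordNet interface, which supports Italian lemmas via the Open Multilingual Wordnet; it is also one possible implementation of the definitions() helper assumed in the sketch of Section 3.3.

```python
# Sketch of lemma-definition retrieval from WordNet via NLTK; requires the
# "wordnet" and "omw-1.4" data packages (installable with nltk.download).
# Glosses are returned in English, as in the examples reported below.
from nltk.corpus import wordnet as wn

def definitions(lemma: str, lang: str = "ita") -> list[str]:
    """Return the glosses of all synsets matching a (here Italian) lemma."""
    return [synset.definition() for synset in wn.synsets(lemma, lang=lang)]

print(definitions("imitazione"))
# e.g. ['copying (or trying to copy) the actions of someone else', ...]
```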
The following are the most similar chunks for the lemmas "imitation" and "denigration" (in Italian), using WordNet as the knowledge base from which the displayed definitions are retrieved:

Imitation (1): copying (or trying to copy) the actions of someone else
Chunk: "to distinguish the original from the copy."

Imitation (2): something copied or derived from an original
Chunk: "to distinguish the original from the copy."

Imitation (3): the doctrine that representations of nature or human behavior should be accurate imitations
Chunk: "The nature, unfavourable or not, of the exemption must be evaluated at the moment when the parties foresee it."

Denigration (1): the act of speaking contemptuously of
Chunk: "seizure minutes on record"

Denigration (2): a communication that belittles somebody or something
Chunk: "Communications by certified e-mail"

Denigration (3): a false accusation of an offense or a malicious misrepresentation of someone's words or actions
Chunk: "on the non-existence of any illicit fact or malicious behavior ascribable to __"

As in the case of the keywords alone, the results are less satisfying than those obtained by retrieval by law provision or by decision chunk. Not only are the similarity scores far lower, reaching at most 64.2%, but the system also appears more likely to retrieve short chunks and to misunderstand the context (as in the cases for "denigration"). Moreover, here the input queries are not necessarily pertinent to the legal domain, due to the usage of a non-legal knowledge base. All in all, we found that query types (1) (law provisions) and (2) (decision chunks) are generally preferable.

5. Ongoing and future work

In this paper, we addressed the problem of document search in the legal domain by introducing a zero-knowledge approach that offers a high degree of flexibility with regard to the integration of external knowledge and the variability of inputs, and is thus suitable to face the scarcity of annotated data and the specificity of terminology typical of the legal domain in general, and of the Italian legal domain in particular. The work is being developed in the context of an ongoing project and will be further refined and extended. The integration of the approach with external knowledge bases and dictionaries that specifically relate to the legal domain is one practical line for future work, while fine-tuning the underlying language model for the search task will depend on the availability of an annotated corpus. Such data would not only make fine-tuning of the model possible, but would also enable a deeper, mathematically rigorous evaluation, which is often affected by the characteristics of the field of application. The lack of annotated data suitable for our task prevented us from evaluating the performance of the proposed approach in quantitative terms. Nevertheless, we are currently working on tests on datasets devoted to the evaluation of document classification tasks, such as the BBC News dataset [23], a collection of 2225 articles from five topical areas: business, entertainment, politics, sport, and tech. The metric used for the evaluation is the weighted F1 score, which takes into consideration the different sizes of the classes; we reach a value of 0.89, which can be considered a promising result given the unsupervised nature of our approach. The metric can be computed as sketched below.
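For reference, the weighted F1 score can be computed with scikit-learn as follows; the labels shown are illustrative placeholders, not actual BBC News predictions.

```python
# Sketch of the weighted-F1 evaluation: per-class F1 scores are averaged,
# each weighted by the class support, so that differently sized classes are
# accounted for. Labels are illustrative placeholders.
from sklearn.metrics import f1_score

y_true = ["business", "sport", "tech", "sport", "politics"]
y_pred = ["business", "sport", "tech", "politics", "politics"]

print(f1_score(y_true, y_pred, average="weighted"))
```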
Continuation of the collaboration with experts and practitioners of related disciplines, such as law and linguistics, will be crucial to improve our approach in the directions outlined above, and it may prove helpful in discovering further improvements to better capture the specificity of legal documents and the rules commonly adopted by legal actors for their construction.

Acknowledgements

This paper is partially funded by the Next Generation UPP project within the PON programme of the Italian Ministry of Justice.

References

[1] H. Surden, Artificial intelligence and law: An overview, Georgia State University Law Review 35 (2019) 19–22.
[2] A. Ferrara, S. Picascia, D. Riva, Context-aware knowledge extraction from legal documents through zero-shot classification, in: Proc. of the 1st ER Int. Workshop on Digital Justice, Digital Law, and Conceptual Modeling (JUSMOD22), Hyderabad, India, 2022.
[3] D. C. Blair, M. E. Maron, An evaluation of retrieval effectiveness for a full-text document-retrieval system, Commun. ACM 28 (1985) 289–299. doi:10.1145/3166.3197.
[4] W. Y. Mok, J. R. Mok, Legal machine-learning analysis: First steps towards A.I. assisted legal research, in: Proceedings of the Seventeenth International Conference on Artificial Intelligence and Law, ICAIL '19, Association for Computing Machinery, New York, NY, USA, 2019, pp. 266–267. doi:10.1145/3322640.3326737.
[5] N. Zeni, N. Kiyavitskaya, L. Mich, J. Cordy, J. Mylopoulos, GaiusT: supporting the extraction of rights and obligations for regulatory compliance, Requirements Engineering 20 (2015) 1–22. doi:10.1007/s00766-013-0181-8.
[6] S. Castano, A. Ferrara, M. Falduti, S. Montanelli, Crime knowledge extraction: An ontology-driven approach for detecting abstract terms in case law decisions, in: Proceedings of the Seventeenth International Conference on Artificial Intelligence and Law, ICAIL '19, Association for Computing Machinery, New York, NY, USA, 2019, pp. 179–183. doi:10.1145/3322640.3326730.
[7] M. Klein, W. Steenbergen, E. Uijttenbroek, A. Lodder, F. van Harmelen, Thesaurus-based retrieval of case law, 2006, pp. 61–70.
[8] W. Hu, S. Zhao, Q. Zhao, H. Sun, X. Hu, R. Guo, Y. Li, Y. Cui, L. Ma, BERT_LF: A similar case retrieval method based on legal facts, Wireless Communications and Mobile Computing 2022 (2022) 1–9. doi:10.1155/2022/2511147.
[9] M.-W. Chang, L.-A. Ratinov, D. Roth, V. Srikumar, Importance of semantic representation: Dataless classification, in: AAAI, volume 2, 2008, pp. 830–835.
[10] W. Wang, V. W. Zheng, H. Yu, C. Miao, A survey of zero-shot learning: Settings, methods, and applications, ACM Transactions on Intelligent Systems and Technology (TIST) 10 (2019) 1–37.
[11] Y. Xian, C. H. Lampert, B. Schiele, Z. Akata, Zero-shot learning—a comprehensive evaluation of the good, the bad and the ugly, IEEE Transactions on Pattern Analysis and Machine Intelligence 41 (2018) 2251–2265.
[12] C. H. Lampert, H. Nickisch, S. Harmeling, Learning to detect unseen object classes by between-class attribute transfer, in: 2009 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2009, pp. 951–958.
[13] R. Qiao, L. Liu, C. Shen, A. van den Hengel, Less is more: zero-shot learning from online textual documents with noise suppression, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2249–2257.
[14] Y. Xian, Z. Akata, G. Sharma, Q. Nguyen, M. Hein, B. Schiele, Latent embeddings for zero-shot classification, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 69–77.
[15] X. Xu, T. Hospedales, S. Gong, Transductive zero-shot action recognition by word-vector embedding, International Journal of Computer Vision 123 (2017) 309–333.
[16] A. Frome, G. S. Corrado, J. Shlens, S. Bengio, J. Dean, M. Ranzato, T. Mikolov, DeViSE: A deep visual-semantic embedding model, Advances in Neural Information Processing Systems 26 (2013).
[17] N. Reimers, I. Gurevych, Sentence-BERT: Sentence embeddings using Siamese BERT-networks, arXiv preprint arXiv:1908.10084 (2019).
[18] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, arXiv preprint arXiv:1810.04805 (2018).
[19] I. Chalkidis, M. Fergadiotis, P. Malakasiotis, N. Aletras, I. Androutsopoulos, LEGAL-BERT: The muppets straight out of law school, arXiv preprint arXiv:2010.02559 (2020).
[20] S. Villata, et al., Sentence embeddings and high-speed similarity search for fast computer assisted annotation of legal documents, in: Legal Knowledge and Information Systems: JURIX 2020: The Thirty-third Annual Conference, Brno, Czech Republic, December 9-11, 2020, volume 334, IOS Press, 2020, p. 164.
[21] A. Tagarelli, A. Simeri, Unsupervised law article mining based on deep pre-trained language representation models with application to the Italian Civil Code, Artificial Intelligence and Law (2021) 1–57.
[22] N. Reimers, I. Gurevych, Sentence-BERT: Sentence embeddings using Siamese BERT-networks, arXiv preprint arXiv:1908.10084 (2019).
[23] D. Greene, P. Cunningham, Practical solutions to the problem of diagonal dominance in kernel document clustering, in: Proc. 23rd International Conference on Machine Learning (ICML'06), ACM Press, 2006, pp. 377–384.