<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Improving the accessibility of EU laws: the Chat-EUR-Lex project</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Manola Cherubini</string-name>
          <email>manola.cherubini@igsg.cnr.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesco Romano</string-name>
          <email>francesco.romano@igsg.cnr.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Bolioli</string-name>
          <email>andrea.bolioli@aptus.ai</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lorenzo De Mattei</string-name>
          <email>lorenzo@aptus.ai</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mattia Sangermano</string-name>
          <email>mattia.sangermano@aptus.ai</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Aptus.AI</institution>
          ,
          <addr-line>Largo Padre Renzo Spadoni 1, 56126 Pisa</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Legal Informatics and Judicial Systems (IGSG-CNR)</institution>
          ,
          <addr-line>via dei Barucci 20, Florence, 50127</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Ital-IA 2024: 4th National Conference on Artificial Intelligence</institution>
          ,
          <addr-line>organized by CINI</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this article we describe the results of an ongoing research project on the use of Chat-Based Large Language Models (Chat LLMs) and Retrieval Augmented Generation (RAG) for access to legal repositories. We are integrating Chat LLMs and RAG to access a dataset of legal acts in English and Italian (a subset of the EUR-Lex collection) and to let users interact with it through a chatbot. We present the state of the art, the objectives, the use cases, and the methodology used in the project, and then we discuss the preliminary results.</p>
      </abstract>
      <kwd-group>
        <kwd>Legal Informatics</kwd>
        <kwd>Large Language Models (LLMs)</kwd>
        <kwd>Retrieval Augmented Generation (RAG)</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In this article we describe the partial results of an
ongoing research project on the use of Chat-Based
Large Language Models (Chat LLMs) and Retrieval
Augmented Generation (RAG) for access to
normative repositories.</p>
      <p>In the project, we are integrating Chat LLMs and
RAG to access a dataset of legal documents (European
legal acts taken from EUR-Lex repository) and to
allow the user to interact through a chatbot.</p>
      <p>In the next sections, we will present the state of
the art (2. Related works), the objectives and the
methodology used (3. Chat-EUR-Lex methodology),
the results of a research survey (4. Research survey),
the system architecture (5. System architecture), and
then we discuss the results presented in the previous
sections (6. Discussion and conclusions).</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related works</title>
      <p>
        As stated in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and many other sources, “Legal
professionals rely on accurate and up-to-date
information to make informed decisions, interpret
laws, and provide legal counsel”. The phenomenon of
hallucination and nonsensical outputs of systems
based on LLMs is obviously not acceptable in the legal
context. To the best of our knowledge, the first survey
on the challenges faced by LLMs in the legal domain
was presented in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], though mainly for the Chinese language. In
other domains, such as the financial one, a few
LLMs have already been developed [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Large
language models are also used in healthcare where
LLMs are useful for processing and understanding
medical text data, providing valuable insights, and
supporting clinical decision-making [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        LLMs are posing interesting challenges to those
who are experimenting with these technologies in the
legal field, where the “complexities of legal language,
nuanced interpretations, and the ever-evolving
nature of legislation present unique challenges that
require tailored solutions” [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. There are many
questions and fears about the actual use of these
artificial intelligence tools, e.g., their opacity [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and
the possibility of hallucinations, but also “legal
problems concerning intellectual property, data
privacy, and bias and discrimination” [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. For this
reason, in the European Union it has been decided to
regulate the use of artificial intelligence in specific
sectors, but also to adopt a regulation that provides
for a regulatory framework of reference only for
high-risk AI systems [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Some experiments conducted on
legal datasets show that LLMs can improve the
performance of document page classification [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ][
        <xref ref-type="bibr" rid="ref9">9</xref>
        ],
the annotation of legal texts [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], the summarization
of legal texts [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], the legal rule classification
[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], the legal statute identification from facts [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]
and the mining of legal arguments [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Other trials
explore the ability of LLMs “to explain legal concepts
from statutory provisions to legal professionals” [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]
and to create “a register of obligations from various
types of legislative and regulatory material” [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
      <p>
        Other uses can also be mentioned, such as LLMs
as legal tutors in the context of legal training [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] and
in one of the most basic tasks required of lawyers, the
so-called “statutory reasoning” [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. Recently,
legislative drafting experiments have been carried out
with ChatGPT, particularly for “the comparison of
legislation among jurisdictions and the synthesis of
the best possible policy for the country based on this
comparison” [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].
      </p>
      <p>
        As is well known, generative AI models have been
found to hallucinate, i.e., they can generate false or
nonsensical statements [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. Two strategies for
reducing this problem are fine-tuning and
Retrieval-Augmented Generation (RAG) [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. In both cases, we
try to provide the LLM with the relevant information
(according to a domain or a specific query).
Fine-tuning involves additional training on a specific
dataset, tailoring the model to certain tasks or
domains [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. This improves accuracy but limits
the model to knowledge up to the last fine-tuning. RAG
merges a pre-trained model with a retrieval system,
accessing current data for accurate responses on
recent or specific topics. Its success hinges on the
quality of retrieved information and requires
maintaining a large, updated database. Both methods
enhance model performance in specific areas,
balancing current information and resource needs.
      </p>
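The interplay between retrieval and generation described above can be sketched in a few lines. Note that `search` and `generate` below are hypothetical stand-ins for a retrieval engine and an LLM, not components of any cited system; this is only a schematic of the RAG control flow.

```python
# Minimal sketch of the two RAG stages: retrieve relevant text chunks,
# then ground the model's answer in them via a prompt template.
# `search` and `generate` are hypothetical stand-ins, not real APIs.

def build_prompt(query: str, chunks: list[str]) -> str:
    """Insert the retrieved context and the user query into a template."""
    context = "\n---\n".join(chunks)
    return (
        "Answer using only the sources below and cite them.\n"
        f"Sources:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

def rag_answer(query, search, generate, k=3):
    chunks = search(query, k)                       # retrieval stage
    answer = generate(build_prompt(query, chunks))  # generation stage
    return answer, chunks                           # answer plus its sources
```

Returning the retrieved chunks alongside the answer is what lets a system show users the "relevant sources" underlying each response.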
    </sec>
    <sec id="sec-3">
      <title>3. Chat-EUR-Lex methodology</title>
      <p>In this section, we present the main problems faced
and the methodologies used in the ongoing
Chat-EUR-Lex project. Our objective is to create an AI-powered
conversational interface that deals with complex legal
texts (the regulations in English and Italian published
in the EUR-Lex repository), can provide simplified
explanations, and allows the user to conduct
context-specific interactions. To present the methodologies
used, we describe the activities performed:
• Legal and ethics risk assessment. We
performed a legal and ethics assessment.
The prototype will be compliant with the
GDPR and the EU AI Act, i.e., it will follow
the rules set by these EU regulations.
• UX Research and Survey. We collected data
and information from a sample of potential
users through a questionnaire, both in
Italian and in English. The objectives of the
survey were: to understand the needs of people
using digital legal resources, and their level
of satisfaction; to identify users' needs and
desires regarding chatbot interaction; and to
learn the fears related to the use of generative AI.
User experience (UX) research involves
studying how users interact with the current
EUR-Lex system and identifying pain points
and challenges.
• Data collection. Chat-EUR-Lex dataset
comprises a selection of in force legal acts in
English and Italian sourced from EUR-Lex,
covering the period from January 1, 2014, to
December 31, 2023. Specifically, it includes
all historical texts preserved in Celex 3
sector that remain unaltered over time,
along with the most recent consolidated
versions in Celex 0 sector for acts that have
undergone amendments. Corrigenda are
omitted from this dataset. Additionally, the
EUR-Lex documents that are not provided
with XML or HTML data are excluded from
the selection. Number of documents in
English: 19062; documents in Italian:
18164.
• Semantic search engine setup. Semantic
search must allow users to find relevant
legal information even if they don't use
precise legal terminology. This involves
using Natural Language Processing (NLP)
techniques, particularly neural embedding,
such as the one introduced by (Lai et al.
2023).
• RAG-based Chat system development. RAG
combines retrieval-based methods with
generative language models to provide
accurate and contextually relevant
responses to user queries. The user can read
both the generated answer and the relevant
sources, i.e., the portions of regulations used
to generate the answer.</p>
      <p>• First version release (June 2024). The first
version of the prototype is released to a
selected group of users. This version should
provide basic functionality and serve as a
starting point for further improvements.</p>
      <p>• Feedback collection and tuning. User
feedback is actively collected and analysed.
This feedback is used to identify areas for
improvement and fine-tune both the chat
system and the user interface. This iterative
process continues to enhance the system's
effectiveness and user satisfaction.</p>
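As a rough illustration of the semantic search step described above, the sketch below ranks documents by cosine similarity between vectors. Real systems use neural embedding models; the bag-of-words vectors here are only a deterministic placeholder for how similarity-based ranking works.

```python
# Toy semantic search: map texts to vectors, rank documents by cosine
# similarity to the query. The bag-of-words "embedding" is a placeholder
# for a neural embedding model.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def semantic_search(query: str, docs: list[str], k: int = 3) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]
```

With neural embeddings in place of `embed`, a query can match a document even when they share no exact terms, which is what allows users without precise legal terminology to find relevant acts.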
    </sec>
    <sec id="sec-4">
      <title>4. Research survey</title>
      <p>In this section we present the results of the
questionnaire distributed from December 28, 2023, to
March 31, 2024, aimed at legal professionals, law
researchers, public officials in the legal sector,
compliance specialists, and other people interested in
the use of Generative AI in the legal domain, in Italy
and other European countries. The objectives of the
questionnaire were to understand the needs of people
using digital legal resources (EUR-Lex in particular)
and their level of satisfaction; identify users' needs
and desires regarding chatbot interaction; know the
fears related to the use of generative AI. The
questionnaire was anonymous; the languages used
were Italian and English. We distributed it online on
websites and with targeted e-mail activity.</p>
      <p>The questionnaire contained 22 questions: 4
questions for demographic information (age, gender,
education, profession); 9 multiple choice questions; 6
open-ended questions; 2 yes/no questions, and 1
rating question. Regarding the topic of the use of LLMs
for accessing European laws, the most important
questions are: “7) To search for legal documents,
regulations and rulings, do you mainly use the
EURLex search engine, or do you use Google Search or
something else?”. “17) In the legal domain, could a
generative AI chatbot help search and interaction?”.
“18) What kind of requests would you make to the
chatbot? Write one or more example requests.”. “19)
Do you know what generative AI is and/or are you a
user of generative AI tools?”. “21) Do you have any
concerns about the use of generative AI in the legal
field?”.</p>
      <p>The following table (Table 1) presents the
number of submissions (Submissions), the number of
people who started but did not complete the
questionnaire (Starts), and the number of people who
viewed the questionnaire (Views), in Italian and
English, as of March 30, 2024.</p>
      <p>We report here some statistics on the Italian
responses: 54% of the respondents are legal experts
(law researchers, jurists, lawyers, compliance
specialists, etc.), while 46% are not legal experts. 66%
say that they consulted the EUR-Lex repository at
least once. When asked which tool they mainly use to
search for legal documents and regulations, 48%
answer mainly Google search, 37% mainly EUR-Lex
search engine, 15% mainly other tools (we do not
report here the answers on the other legal sources).
60.4% say that a generative AI chatbot could help
search and interaction, 33.3% don’t know, 6.2% say
No. To the question “Do you know what generative AI is
and/or are you a user of generative AI tools?”,
51% answer “Little”, 27% “Yes, I use them
regularly”, and 22% “Not at all” (note that these
percentages concern responses in Italy). Finally, 87%
think generative AI must be regulated, 8% don’t know,
5% answer No. For reasons of space, we do not report
here the answers to the question “What kind of
requests would you make to the chatbot?”.</p>
      <p>In summary, these responses allow us to assess
the level of knowledge of legal experts in generative
artificial intelligence, to see if there are differences
between legal experts and other people, to know their
fears on these issues, and, above all, to collect the
needs and requirements of potential users of the
chatbot.</p>
      <p>A detailed report containing the complete
questionnaire, the aggregate results and a detailed
analysis will be published in May 2024 on the GitHub
project repository.</p>
    </sec>
    <sec id="sec-5">
      <title>5. System architecture</title>
      <p>The pipeline of the Chat-EUR-Lex prototype is divided
into two main parts:
• An asynchronous batched pipeline which
collects and indexes the documents from
EUR-Lex into a search engine.
• A synchronous pipeline that gets the users'
queries, retrieves relevant contextual
information and provides a response to the
users.</p>
      <p>The asynchronous batched pipeline comprises
three main components:
1. A crawler that collects the data from
EUR-Lex.
2. A chunker that splits the documents into
smaller segments.
3. An embedding model that transforms the
segments into dense vectors to be indexed in
the vector DB.</p>
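The chunking step can be illustrated with a minimal word-window splitter. The window size and overlap below are illustrative assumptions, not the project's actual settings, which are among the parameters under evaluation.

```python
# Sketch of the chunking component: split a document into overlapping
# fixed-size word windows before embedding. Sizes are illustrative only.

def chunk(text: str, size: int = 100, overlap: int = 20) -> list[str]:
    """Split text into windows of `size` words, each sharing
    `overlap` words with the previous window."""
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # last window reached the end of the document
    return chunks
```

The overlap ensures that a passage falling on a window boundary still appears whole in at least one chunk, at the cost of some duplication in the index.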
      <p>The synchronous pipeline comprises two main
components:
• A retriever that transforms the query into a
vector using the same embedding models
used by the asynchronous pipeline and looks
into the vector DB for similar contents.
• An LLM that gets both the query and the
context inserted in a prompt template and
produces a response to be provided to the
users.</p>
      <p>Each time the user does a new query, the whole
chat history is passed to the LLM until the maximum
prompt length is reached; in that case, older chat parts
are truncated.</p>
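The history-handling behaviour just described can be sketched as follows, using a word-count budget as a stand-in for the model's real token limit; the function name and budget scheme are illustrative assumptions.

```python
# Sketch of chat-history truncation: the full history is sent with each
# query, and the oldest turns are dropped once the budget is exceeded.
# A word count stands in for the model's actual token limit.

def fit_history(history: list[str], query: str, max_words: int) -> list[str]:
    """Keep the newest turns (plus the new query) within the budget;
    older turns are truncated first."""
    kept = [query]
    budget = max_words - len(query.split())
    for turn in reversed(history):  # walk from newest to oldest
        cost = len(turn.split())
        if cost > budget:
            break  # this turn and everything older is dropped
        kept.insert(0, turn)
        budget -= cost
    return kept
```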
      <p>This process involves several parameters to be
selected, such as:
• Chunking techniques and size.
• Embedding models.
• K-nearest neighbor search techniques.
• Prompt templates.
• LLM and its parameters.</p>
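These parameters can be organized as a grid whose combinations are enumerated for systematic evaluation. The values below are illustrative assumptions (in particular, "legal-embedder" is a hypothetical model name), not the project's actual search space.

```python
# Illustrative parameter grid for the tunable components listed above.
# All values, including the "legal-embedder" name, are hypothetical.
from itertools import product

grid = {
    "chunk_size": [256, 512],
    "embedding_model": ["text-embedding-ada-002", "legal-embedder"],
    "k_neighbors": [3, 5],
}

# Enumerate every combination of parameter values as a config dict.
combinations = [dict(zip(grid, values)) for values in product(*grid.values())]
```

Each resulting configuration can then be scored with the evaluation procedure described in Section 6.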
      <p>In summary, the RAG approach is a blend of two
key components: a retrieval system and a generator.
The retrieval system scans through a database of
documents to fetch the most relevant ones in
response to a user query. The most recent solutions
for retrieval systems employed in RAGs rely on
semantic search utilizing embeddings.</p>
      <p>The generator, on the other hand, uses these
retrieved documents to generate a well-informed
answer. This process ensures that the system
provides responses that are both informative and
contextually accurate. In the current project setup
(April 2024), the gpt-4 model powers the generation
of responses. For the creation of embeddings, we
utilize text-embedding-ada-002
(https://platform.openai.com/docs/models/embeddings).</p>
      <p>While our dataset contains about 37000 legal
acts, the need for partitioning these laws for a
granular retrieval process amplifies the total count of
retrievable documents into about 371000 texts
(“chunks”). This extensive partitioning provides a
more detailed context for the RAG system, allowing
for more accurate answer generation. On the other
hand, the increased number of documents naturally
presents a challenge for our retrieval process.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Discussion and conclusions</title>
      <p>In this project, we are trying different combinations of
the mentioned parameters using both open-source
and closed-source models to investigate the readiness
of LLMs to build a system for legislative research.</p>
      <p>We are performing two evaluation steps to
compare different models and parameters:
1. Search engine evaluation: we are comparing
different Embedding models, chunking
strategies and k-nearest neighbors search
techniques to select the best combinations
to retrieve good-quality results.
2. Response generator evaluation: having fixed
the best combination for contextual
information retrieval thanks to step 1
evaluation, we will compare the quality of
the generated response using different
prompt templates, LLMs and LLMs
parameters.</p>
      <p>For step 1 evaluation, we are creating a gold
dataset with expert annotators and using standard
search engine evaluation metrics such as the
Mean Reciprocal Rank. For step 2, evaluation of
different settings will be proposed to experts
who will ask the same questions and attribute
scores to each response. Preliminary results have
shown that when the context provided to gpt-4
by the retrieval system is consistent with the
question asked, the generated answer is concise,
comprehensible, and accurate. This consistency
significantly minimizes the problem of
hallucination, wherein the model might generate
false or nonsensical information.</p>
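Mean Reciprocal Rank, the step-1 metric mentioned above, can be computed as follows: for each query, take the reciprocal of the rank at which the first relevant document appears, then average over queries.

```python
# Mean Reciprocal Rank over a set of queries: 1/rank of the first
# relevant result per query, averaged; misses contribute 0.

def mrr(results: list[list[str]], relevant: list[set[str]]) -> float:
    total = 0.0
    for ranked, gold in zip(results, relevant):
        for rank, doc in enumerate(ranked, start=1):
            if doc in gold:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(results)
```

For example, a query whose first relevant chunk appears at rank 2 contributes 0.5, and a query with no relevant chunk retrieved contributes 0.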
      <p>The large number of chunks that can
contribute to the generation of the answer
naturally presents a challenge for our retrieval
process. In this context, we are actively exploring
strategies to improve the efficiency of this crucial
component. One of the promising directions we
are considering involves leveraging not only the
semantic content of the normative sources but
also the boundary information, such as metadata.</p>
      <p>The inclusion of metadata in our retrieval process
could potentially imbue our system with the ability to
home in on the most relevant documents, thereby
optimizing the retrieval process and improving the
overall performance of the chat system. On the other
hand, the utilization of a specific embedding model
built on legal data could be beneficial, as opposed to a
generic embedder. Indeed, this model could provide a
more nuanced understanding of the legal texts, thus
enhancing the retrieval process.</p>
      <p>
        The evaluation of the preliminary results is
promising: on simple tasks, i.e., simple queries, the
system performs on par with human experts, as
attested in similar research [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ].
      </p>
      <p>
        We must highlight the need for legal experts to
make qualitative assessments, as well as quantitative
evaluations, both due to the complexity of the legal
domain and because the evaluation can be done in
different ways depending on the use case, aim and
user target. It does not seem possible to create a
unique dataset to evaluate the quality of the results in
an automatic way for QA tasks in the legal domain.
Multiple points of view may be equally valid, and a
unique ‘ground truth’ may not exist, as discussed for
other tasks in Perspectivist Approaches to NLP [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ].
      </p>
      <p>A general problem for experiments with LLMs is
“non-repeatability”: the experiments are not exactly
reproducible because the systems are not
deterministic; moreover, in proprietary commercial
systems, we do not know the LLM’s parameters and
the dataset used for training; new versions and new
LLMs are released quickly.</p>
      <p>Here are some additional critical issues that we
are addressing:
• text length limitations: currently, there's a
limit on the length of text the LLM can handle
effectively. This necessitates breaking down
longer texts, which can be cumbersome;
• impact of short queries: the chatbot's
accuracy and precision suffer when
responding to very short or poorly defined
queries. More detailed user queries lead to
better results;
• single vs. multiple documents: the chatbot
performs best when responding to queries
that target information from a single
document, rather than synthesizing
information from multiple sources.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>The Chat-EUR-Lex project is funded within the framework
of the NGI Search project under grant agreement No
101069364. Views and opinions expressed are
however those of the author(s) only and do not
necessarily reflect those of the European Union or
European Commission. Neither the European Union
nor the granting authority can be held responsible for
them.
The GitHub project repository can be consulted at
https://github.com/Aptus-AI/chat-eur-lex. The
dataset has been published on
https://huggingface.co/datasets/AptusAI/chateur-lex.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Yuan</surname>
          </string-name>
          , Chatlaw:
          <article-title>Open-source legal large language model with integrated external knowledge bases</article-title>
          ,
          <source>arXiv preprint arXiv:2306.16092</source>
          , (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Gan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.S.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <source>Large Language Models in Law: A Survey. arXiv preprint arXiv:2312</source>
          .
          <fpage>03718</fpage>
          . (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Irsoy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dabravolski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dredze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gehrmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kambadur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Rosenberg</surname>
          </string-name>
          , G. Mann,
          <article-title>BloombergGPT: A Large Language Model for Finance</article-title>
          ,
          <source>arXiv preprint arXiv:2303.17564v2</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Reddy</surname>
          </string-name>
          ,
          <article-title>Evaluating large language models for use in healthcare: A framework for translational value assessment</article-title>
          ,
          <source>in volume 41 of Informatics in Medicine Unlocked</source>
          , (
          <year>2023</year>
          ). https://doi.org/10.1016/j.imu.
          <year>2023</year>
          .101304
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Contaldo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Campara</surname>
          </string-name>
          , Intelligenza artificiale e Diritto.
          <article-title>Dai sistemi esperti “classici” ai sistemi esperti “evoluti”: tecnologia e implementazione giuridica</article-title>
          , in: G.
          <string-name>
            <surname>Taddei Elmi</surname>
            ,
            <given-names>A</given-names>
          </string-name>
          . Contaldo (Eds.),
          <article-title>Intelligenza artificiale</article-title>
          .
          <source>Algoritmi giuridici. Ius condendum o “fantadiritto”</source>
          ,
          <source>Pacini editore, Pisa</source>
          ,
          <year>2020</year>
          , p.
          <fpage>24</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <article-title>A short survey of viewing large language models in legal aspect</article-title>
          ,
          <source>arXiv preprint arXiv:2303.09136</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>G.</given-names>
            <surname>Finocchiaro</surname>
          </string-name>
          ,
          <article-title>Artificial intelligence</article-title>
          .
          <source>What are the rules? Il Mulino</source>
          , Bologna,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>P.</given-names>
            <surname>Fragkogiannis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Forster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.E.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Context-Aware Classification of Legal Document Pages</article-title>
          .
          <source>In: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '23)</source>
          ,
          <article-title>Association for Computing Machinery</article-title>
          , NY, pp
          <fpage>3285</fpage>
          -
          <lpage>3289</lpage>
          ,
          <year>2023</year>
          , doi: 10.1145/3539618.3591839
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D.</given-names>
            <surname>Trautmann</surname>
          </string-name>
          ,
          <article-title>Large Language Model Prompt Chaining for Long Legal Document Classification</article-title>
          , arXiv preprint arXiv:
          <volume>2308</volume>
          .
          <fpage>04138</fpage>
          .(
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Savelka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.D.</given-names>
            <surname>Ashley</surname>
          </string-name>
          ,
          <article-title>The unreasonable effectiveness of large language models in zero-shot semantic annotation of legal texts</article-title>
          ,
          <source>Frontiers in Artificial Intelligence</source>
          ,
          <volume>6</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>14</lpage>
          ,
          <year>2023</year>
          , doi: 10.3389/frai.2023.1279794.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Cherubini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Romano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bolioli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>De Francesco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Benedetto</surname>
          </string-name>
          ,
          <article-title>The summarization of legal texts: an experiment with GPT-3</article-title>
          ,
          <source>Rivista italiana di informatica e diritto</source>
          ,
          <volume>5</volume>
          (
          <issue>1</issue>
          ):
          <fpage>191</fpage>
          -
          <lpage>204</lpage>
          (
          <year>2023</year>
          ), https://doi.org/10.32091/RIID0103.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>D.</given-names>
            <surname>Datta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Soni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <article-title>MILDSum: A Novel Benchmark Dataset for Multilingual Summarization of Indian Legal Case Judgments</article-title>
          , arXiv preprint arXiv:2310.18600v1 (
          <year>2023</year>
          ), https://doi.org/10.48550/arXiv.2310.18600.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>D.</given-names>
            <surname>Liga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Robaldo</surname>
          </string-name>
          ,
          <article-title>Fine-tuning GPT-3 for legal rule classification</article-title>
          ,
          <source>Computer Law &amp; Security Review</source>
          , volume
          <volume>51</volume>
          , (
          <year>2023</year>
          ), doi: 10.1016/j.clsr.2023.105864.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S.</given-names>
            <surname>Paul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mandal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <article-title>Pre-trained language models for the legal domain: a case study on Indian law</article-title>
          .
          <source>In: Proceedings of the Nineteenth International Conference on Artificial Intelligence and Law</source>
          , (
          <year>2023</year>
          ), pp
          <fpage>187</fpage>
          -
          <lpage>196</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>A.</given-names>
            <surname>Al Zubaer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Granitzer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mitrović</surname>
          </string-name>
          ,
          <article-title>Performance analysis of large language models in the domain of legal argument mining</article-title>
          ,
          <source>Frontiers in Artificial Intelligence</source>
          ,
          <volume>6</volume>
          , (
          <year>2023</year>
          ), doi: 10.3389/frai.2023.1278796.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>J.</given-names>
            <surname>Savelka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.D.</given-names>
            <surname>Ashley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.A.</given-names>
            <surname>Gray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Westermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <article-title>Explaining Legal Concepts with Augmented Large Language Models (GPT-4)</article-title>
          , arXiv preprint arXiv:2306.09525v2 (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>J.</given-names>
            <surname>Ioannidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Harper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.S.</given-names>
            <surname>Quah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hunter</surname>
          </string-name>
          ,
          <article-title>Gracenote.ai: Legal Generative AI for Regulatory Compliance</article-title>
          .
          <source>In: Proceedings of the Third International Workshop on Artificial Intelligence and Intelligent Assistance for Legal Professionals in the Digital Workplace (LegalAIIA 2023), co-located with ICAIL 2023</source>
          , CEUR-WS.org, pp.
          <fpage>20</fpage>
          -
          <lpage>31</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>D.</given-names>
            <surname>Charlotin</surname>
          </string-name>
          ,
          <source>Large Language Models and the Future of Law</source>
          , SSRN, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4548258,
          <year>2023</year>
          . Accessed 4 April
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>A.</given-names>
            <surname>Blair-Stanek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Holzenberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Van Durme</surname>
          </string-name>
          ,
          <article-title>Can GPT-3 Perform Statutory Reasoning?</article-title>
          <source>In: The Nineteenth International Conference on Artificial Intelligence and Law (ICAIL 2023)</source>
          , ACM, NY, USA, pp.
          <fpage>22</fpage>
          -
          <lpage>31</lpage>
          ,
          <year>2023</year>
          , doi: 10.1145/3594536.3595163.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>G.</given-names>
            <surname>Hill</surname>
          </string-name>
          ,
          <article-title>The emerging artificial intelligence (AI) and national uniform legislation</article-title>
          ,
          <source>Australian Law Journal</source>
          , volume
          <volume>97</volume>
          (
          <issue>5</issue>
          ), (
          <year>2023</year>
          ), pp.
          <fpage>303</fpage>
          -
          <lpage>306</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>L.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zhong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions</article-title>
          .
          <source>arXiv preprint arXiv:2311.05232</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Tonmoy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.M.</given-names>
            <surname>Zaman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Jain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Rawte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chadha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <article-title>A comprehensive survey of hallucination mitigation techniques in large language models</article-title>
          .
          <source>arXiv preprint arXiv:2401.01313</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>P.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Perez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Piktus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Petroni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Karpukhin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kiela</surname>
          </string-name>
          ,
          <article-title>Retrieval-augmented generation for knowledge-intensive NLP tasks</article-title>
          .
          <source>Advances in Neural Information Processing Systems</source>
          , volume
          <volume>33</volume>
          , (
          <year>2020</year>
          ), pp.
          <fpage>9459</fpage>
          -
          <lpage>9474</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>L.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Xie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.Z.J.</given-names>
            <surname>Qin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Tao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.L.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Parameter-efficient fine-tuning methods for pretrained language models: A critical review and assessment</article-title>
          .
          <source>arXiv preprint arXiv:2312.12148</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>E. J.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Wallis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Allen-Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>LoRA: Low-rank adaptation of large language models</article-title>
          .
          <source>arXiv preprint arXiv:2106.09685</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ladhak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Durmus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>McKeown</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.B.</given-names>
            <surname>Hashimoto</surname>
          </string-name>
          ,
          <article-title>Benchmarking large language models for news summarization</article-title>
          ,
          <source>Transactions of the Association for Computational Linguistics</source>
          , volume
          <volume>12</volume>
          , (
          <year>2024</year>
          ), pp.
          <fpage>39</fpage>
          -
          <lpage>57</lpage>
          .
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>F.</given-names>
            <surname>Cabitza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Campagner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Basile</surname>
          </string-name>
          ,
          <article-title>Toward a perspectivist turn in ground truthing for predictive computing</article-title>
          ,
          <source>In: Proceedings of the AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>37</volume>
          (
          <issue>6</issue>
          ), (
          <year>2023</year>
          ), pp.
          <fpage>6860</fpage>
          -
          <lpage>6868</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>