=Paper=
{{Paper
|id=Vol-3762/590
|storemode=property
|title=Legal Drafting supported by AI: enhancing LEOS
|pdfUrl=https://ceur-ws.org/Vol-3762/590.pdf
|volume=Vol-3762
|authors=Monica Palmirani,Fabio Vitali,Generoso Longo,Emanuele Di Sante,Aurora Brega,Andrea D'Arpa,Michele Corazza
|dblpUrl=https://dblp.org/rec/conf/ital-ia/PalmiraniVLSBDC24
}}
==Legal Drafting supported by AI: enhancing LEOS==
Monica Palmirani¹∗, Fabio Vitali², Generoso Longo¹, Emanuele Di Sante¹, Aurora Brega¹, Andrea D’Arpa¹, Michele Corazza¹
¹ University of Bologna, ALMA-AI, via Galliera 3, Bologna, 40121, Italy
² University of Bologna, DISI, via Mura Anteo Zamboni 7, Bologna, 40126, Italy
Abstract
Legal drafting is a complex activity that involves different actors and end-users, usually belonging to the administration staff. AI tools could support this activity by providing useful aid for various tasks. This paper presents scenarios in which an AI add-on supports legal drafting conducted with the LEOS web editor, developed by the EU Commission for EU legislation. The scenarios are the following: i) retrieving the relevant existing normative definitions connected with the ongoing bill, by using algorithms based on semantic similarity; ii) suggesting the most pertinent normative references when some information is missing (e.g., the year); iii) aiding the drafter in following templates that improve the clarity and regularity of the norms (e.g., modifications). Additionally, it is crucial to model a user interface capable of guaranteeing some foundational principles: accessibility, transparency, usability, user experience, and explicability. This paper presents the output of this project, conducted in collaboration with the DG Informatics of the EU Commission.
Keywords
Akoma Ntoso, LEOS, similarity, AI.
Ital-IA 2024: 4th National Conference on Artificial Intelligence, organized by CINI, May 29-30, 2024, Naples, Italy
∗ Corresponding author. All authors contributed equally.
monica.palmirani@unibo.it (M. Palmirani); fabio.vitali@unibo.it (F. Vitali); generoso.longo@studio.unibo.it (G. Longo); emanuele.disante@studio.unibo.it (E. Di Sante); aurora.brega@stuido.unibo.it (A. Brega); andrea.darpa@studio.unibo.it (A. D’Arpa); michele.corazza@unibo.it (M. Corazza).
ORCID: 0000-0002-8557-8084 (M. Palmirani); 0000-0002-7562-5203 (F. Vitali); 0000-0002-7288-6635 (M. Corazza)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073
1. Introduction
Legal drafting is a crucial task in the legislative procedure of any deliberative assembly. The goals of this task are many: i) to support the political decision-makers; ii) to standardize the language in line with the legal tradition, adopting multilingual translations when necessary; iii) to apply drafting rules to improve quality and clarity; iv) to guarantee the Rule of Law and the principles of the theory of law; v) to track the modifications happening over time due to the legislative process. In the last 15 years many specialized editors have been developed [13], [5], [3], [1] to support these important goals using Natural Language Processing technology [6]. Among the proposed solutions, some use the Semantic Web approach [2], while others apply Symbolic AI based on rules [12]. LEOS [5], [10] is one of the most promising web editors for legal drafting. It has been developed by the EU Commission to support its internal legal drafting activities, but also with the aim of serving the Member States.
LEOS is an open-source web editor specific to legal drafting; it is written in Angular and is designed to manage the whole law-making process [15].
The aim of this work is to develop a framework architecture capable of enhancing LEOS with add-ons, developed with AI technologies, that improve the quality of the legal content, help the legal drafters, and manage the law-making process. The two add-ons provide the following features [7], [4], [14]:
(i) Suggest the pertinent normative definitions using similarity with the bill topic;
(ii) Suggest the pertinent normative reference using the thematic similarity with the bill;
(iii) Take into consideration the temporal information and the nested normative references;
(iv) Use the metadata of ELI² and EUROVOC³ to improve the similarity.
² ELI: https://eur-lex.europa.eu/eli-register/about.html
³ EUROVOC: https://eur-lex.europa.eu/browse/eurovoc.html?locale=it
The aim is also to create a user interface capable of:
(i) Reducing manual, error-prone work in typing the normative references, also avoiding repetitions in legislative citations;
(ii) Maximising the reuse of similar legal concepts (e.g., definitions);
(iii) Increasing the transparency and searchability of the existing legal knowledge included in the corpora.
2. Methodology
The adopted methodology is based on hybrid AI [11] and uses multiple techniques to achieve its goals. First, we do not generate new text (e.g., using LLMs or generative AI); instead, we suggest pertinent, contextual, and significant existing legal knowledge extracted from the legal corpora, using a similarity index computed from the parameters of the bill that the legal drafter is writing. We also use the EUROVOC classification and other contextual information provided by the experts during the drafting process (e.g., type of provision).
Secondly, the approach takes into consideration the temporal validity of the normative provisions, excluding those that are repealed, or suggesting the appropriate version of the consolidated text according to the view date typed by the end-user. If the author seeks the normative definition of “privacy” before the GDPR, they can set the view date before 24 May 2016 (the date of entry into force of the act) and the system will respect this setting.
Thirdly, we resolve the normative references so that the indexing model also includes the cited text, recursively but only to the first level. This allows us to grasp more information, especially when the definition is limited in the text and consists only of a normative citation to another provision (e.g., “For the purposes of this Directive, the definitions laid down in Article 2 of Directive 2000/60/EC shall apply”).
Fourthly, the context is important for providing relevant suggestions. A definition depends on the topic of the bill: for example, there are many definitions of ‘accuracy’, and the appropriate one depends on the topic of the document.
Fifthly, the user interface is a fundamental pillar for guaranteeing good usability, transparency, and explicability of the AI behaviors and output [8].
Finally, we use the Akoma Ntoso [9] serialization to capture the structure of the legal documents, the normative references, the metadata of the lifecycle of the document, the dates of entry into force and into operation, and the date of repeal.
3. Dataset
The dataset is composed of 10 years of European legislation (2010-2021), about 15,000 regulations and directives. It was provided by the European Publication Office in Formex 3.0 XML format. We have converted all the documents into Akoma Ntoso and, using a natural language processing approach, we have annotated the definitions and the normative references.
The dataset includes about 899 documents with definitions. For definitions, we have considered only the explicit provisions, usually titled “Definitions”, or those where a regular pattern can reliably identify the relationship between a term (definiens) and its description (definiendum), following the pattern ‘definiens’ means definiendum (e.g., “‘domain’ means one or several data sets that cover specific topics;”). The definitions that include normative references are managed by navigating the link to include the complete information (e.g., ‘personal data’ means personal data as defined in point (1) of Article 4 of Regulation (EU) 2016/679).
4. Use Cases
4.1. Normative References
Normative references are qualified citations used for mentioning other documents or provisions relevant to the normative discourse. Errors made while typing normative references produce incorrect links and additional effort in the control phases.
The system allows the end-user to type incomplete normative references and to retrieve and rank the existing and
in-force references that are similar to the information requested by the end-user. For example, given a citation of the form “Regulation 406”, the system returns all the Regulations that are valid, in force, numbered 406, and pertinent to the EUROVOC of the bill. The system completes the reference (e.g., Regulation 406/2010) and also returns the title of the document and other information for identifying the act.
Due to the evolution of the European institutions, the references have changed syntax and patterns over time. For this reason, the end-user can easily make a mistake in the citation format. Our tool helps the end-user to compose the reference according to the historical period of the document cited. For example, a Regulation before 1968 is cited using number/yy/EEC (e.g., Regulation No 1009/67/EEC); after 1968 we have number/yy (e.g., Regulation (EEC) No 2195/91); and after 2009 we have yyyy/number (e.g., Regulation (EU) 2016/679).
4.2. Legal Definitions
Legal definitions are a sensitive part of the law because they define new legal concepts, new terminologies, equivalences between other definitions, and exceptions for specific cases. In EU legislation, we usually have a clear article called “Definitions”, but sometimes we can also find technical definitions in the last part of the act or in the annexes.
Additionally, definitions may be organized in a long list of points, which might be connected to each other. Definitions are composed of three main parts: definiens (term); definiendum (description); legal concept (abstract class of concept). The use of the same term for multiple definitions is not infrequent, and the term might have completely different meanings in different domains (e.g., pollution has different definitions according to the domain, such as water, energy, or industry).
For this reason, the tool calculates the similarity of a given term (which can also be composed of multiple words) with the existing, valid, and updated definitions in the legal corpus (i.e., those present in the consolidated versions of documents), using the similarity index as a criterion.
5. Architecture
The overall architecture (see Figure 1) is composed of an XML database that includes the Akoma Ntoso XML documents and an SQL database containing the correspondence between each document and its EUROVOC categorization. Each EUROVOC descriptor is associated with the average of the Word2Vec [16] embeddings of the words composing it. The eXist database, which includes all the AKN-XML documents⁴, can also use the Lucene Java library to index the document text and, in particular, the definitions (defBody elements). When a new document enters the eXist database, it is also indexed in the SQL database and the Word2Vec representation of its definitions is stored. If the document does not have EUROVOC tags, we extract them from CELLAR and serialize the information in the metadata of the Akoma Ntoso documents.
⁴ eXist is an XML database that is indexed using Lucene and queryable with XQuery.
During legal drafting, if the end-user wants to get a suggestion (e.g., a normative reference or a definition), they need to provide some parameters as input, such as the title and the EUROVOC keywords of the bill (proposal of law), in order to calculate the corresponding indexes. The dynamic input typed by the end-user (e.g., an incomplete normative reference or definition keywords) is parsed and compared with the existing document collection in eXist. After a first filter that uses traditional Information Retrieval techniques to select the relevant documents, the similarity score is calculated on the retrieved text, which is compared with the embeddings of the input parameters stored in the SQL database (for EUROVOC values), while the Lucene similarity algorithm is used for the definitions. The ranking is based on the index score and the temporal parameters, also considering the normative citations included in the retrieved normative provision.
The Lucene Similarity class implements the scoring model. The library offers several ready-made implementations of the Similarity class, which reflect different scoring models developed in the field of Information Retrieval. Our implementation adopts the DefaultSimilarity class, which combines the Boolean model, adopted to filter documents matching the query, and a refinement of the Vector Space Model (VSM), based on TF-IDF weights, for scoring results. In particular, Lucene refines the VSM by taking into account the corpus statistics contained in the inverted index, the number of terms that match the query, and the boost factors expressed in the search. This class is also used by the indexing process chain, since it deals with the calculation of the normalization factors, which depend on the length of the fields and the boost factors specified in the configuration (see the Similarity class documentation, Lucene 3.6.1 API, apache.org).
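The retrieval and ranking steps described in this section can be illustrated with a minimal Python sketch. It is not the add-on's implementation, which relies on eXist, Lucene, and an SQL database: here a plain TF-IDF cosine stands in for Lucene's scoring, two-dimensional toy vectors stand in for the averaged Word2Vec embeddings, and the names Definition, rank, and text_weight, the sample acts, and the equal weighting of the two scores are all assumptions made for illustration.

```python
import math
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass
class Definition:
    act: str                     # hypothetical identifier of the act
    text: str                    # text of the definition body
    in_force_from: date          # date of entry into force
    repealed_on: Optional[date]  # None while the act is still in force
    topic: List[float]           # toy stand-in for an averaged Word2Vec vector


def mean_vector(vectors: List[List[float]]) -> List[float]:
    """Average equal-length word vectors component by component."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two dense vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def tfidf_vectors(texts: List[str]) -> List[Dict[str, float]]:
    """One sparse TF-IDF vector per text, using a smoothed IDF."""
    tokenized = [text.lower().split() for text in texts]
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    n = len(texts)
    return [
        {tok: tf * (math.log((1 + n) / (1 + doc_freq[tok])) + 1.0)
         for tok, tf in Counter(toks).items()}
        for toks in tokenized
    ]


def sparse_cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(tok, 0.0) for tok, weight in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0


def rank(query: str, bill_topic: List[float], view_date: date,
         corpus: List[Definition], text_weight: float = 0.5) -> List[Tuple[float, str]]:
    """Filter by temporal validity, then rank by a weighted sum of two scores."""
    # Temporal filter: keep only definitions in force at the view date.
    valid = [d for d in corpus
             if d.in_force_from <= view_date
             and (d.repealed_on is None or view_date < d.repealed_on)]
    if not valid:
        return []
    # Text score: TF-IDF cosine between the typed query and each definition.
    vectors = tfidf_vectors([query] + [d.text for d in valid])
    scored = [
        (text_weight * sparse_cosine(vectors[0], vec)
         + (1.0 - text_weight) * cosine(bill_topic, d.topic), d.act)
        for d, vec in zip(valid, vectors[1:])
    ]
    return sorted(scored, reverse=True)


# Invented sample data: three definitions of the same term in different domains.
corpus = [
    Definition("water-act", "pollution means the introduction of substances into water",
               date(2001, 1, 1), None, [1.0, 0.0]),
    Definition("energy-act", "pollution means emissions from energy production",
               date(2012, 1, 1), None, [0.0, 1.0]),
    Definition("old-act", "pollution means the introduction of substances into water",
               date(1990, 1, 1), date(2005, 1, 1), [1.0, 0.0]),
]
# Topic vector of the bill: the average of the vectors of its keywords.
bill_topic = mean_vector([[1.0, 0.0], [0.8, 0.2]])
results = rank("pollution water", bill_topic, date(2020, 1, 1), corpus)
for score, act in results:
    print(f"{act}: {score:.3f}")
```

With a view date in 2020 the repealed act is excluded before any scoring takes place, and the definition whose topic vector is closest to the bill is ranked first. Moving the view date back to 2003 brings the repealed act into the candidate set and drops the act that had not yet entered into force.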
6. User Interface
The user interface (see Figure 2) is a fundamental part of this application. LEOS is enriched with an add-on that enables these functionalities in a selective way. The suggestions are offered in a portion of the window that allows the end-user to confirm or discard the output, or to integrate the results into the drafting text.
Our custom components are organised in a dedicated application folder, comprising new components (stored in .component.ts, .component.html, and .component.scss files), new classes (.ts files), and a service (in a .service.ts file). This service manages the essential methods and global variables used by our approach.
To maintain consistency, we adopted a style for our extension that closely imitates the original application's design. Many of the components used were taken from the eUI library, and we followed the guidelines and suggestions provided by the eUI framework. The version of the eUI library used is 14, the same one adopted by LEOS and used in its native components. Therefore, both the shape and color of the interface elements are consistent with those indicated by the framework.
The components we added always provide feedback to the user: they display results when generated, an error message if the service response raises an issue, and an alert if the user's request is not executed correctly, in accordance with the functionality we aim to provide. We designed them so that the user knows the reasons for an incomplete or incorrect request and is given the opportunity to make any necessary corrections. We also strive to maintain consistency in the terms used in the labels, ensuring that each element is identified by a unique name and avoiding multiple elements with the same name.
The end-user of the service is an expert in legislative matters, so we prioritised making the interface simple and intuitive but also very specific to professional drafting tasks, considering that the user has clear knowledge of the subject matter being addressed. We created mockups of the interface to evaluate it before implementation, ensuring that it is indeed usable and effective. The end-user is constantly involved in the evaluation, with regular meetings where the usability is tested and feedback is incorporated into the software.
7. Conclusions
The current paper presents two add-ons integrated into the LEOS web editor to enhance legal drafting tasks using AI applications. The user interface is a fundamental component of this work and is designed to incorporate the principles of transparency, accessibility, user experience, and explicability. The methodology deliberately does not generate new text (as LLMs do), in order to avoid hallucinations, which could affect the democratic rules of the law-making process.
We aim to extract and offer to the legal drafters the legal knowledge stored in the corpus, which is sometimes difficult to find due to the large volume of documents, and to return the relevant information accompanied by an index score based on temporal parameters and on the similarity of text, using qualified legal provisions like definitions and normative references. The first results were evaluated by legal experts, and they are promising and pertinent to the drafting text. Moreover, the end-users appreciated the provided suggestions, which retrieve pertinent information using topic similarity, cutting repetitive work and letting the drafters focus on higher-level tasks.
Acknowledgements
This project is co-funded by DG Informatics of the European Commission within the larger LEOS project, and with the support of European Commission funds within ERC HyperModeLex, Grant agreement ID: 101055185.
References
[1] Agnoloni Tommaso, E Francesconi, P Spinosa, xmLegesEditor: an opensource visual XML editor for supporting legal national standards, Proceedings of the V legislative XML workshop, 239-251, 2007
[2] Casanovas, Pompeu; Palmirani, Monica; Peroni, Silvio; Van Engers, Tom; Vitali, Fabio, Semantic Web for the Legal Domain: The next step, «SEMANTIC WEB», 2016, 7, pp. 213 - 227
[3] Grant Vergottini, https://xcential.com/legispro/standards/
[4] Griglio Elena - Marchetti Carlo, La "specialità" delle sfide tecnologiche applicate al drafting parlamentare : dal quadro comparato all'esperienza del Senato italiano / Elena Griglio, Carlo Marchetti, Osservatorio sulle fonti. - 2022, n. 3, p. 361-386
[5] LEOS, https://joinup.ec.europa.eu/collection/justice-law-and-security/solution/leos-open-source-software-editing-legislation
[6] Lesmo Leonardo, Alessandro Mazzei, Monica Palmirani, Daniele Paolo Radicioni: TULSI: an NLP system for extracting legal modificatory provisions. Artif. Intell. Law 21(2): 139-172 (2013)
[7] Lorello Laura, La qualità della legislazione diventa un’esigenza bicamerale : considerazioni sul nuovo Comitato per la legislazione del Senato della Repubblica nel primo semestre di attività, Federalismi.it : rivista di diritto pubblico italiano, comunitario e comparato. - 2023, n. 27, p. 17-45
[8] Palmirani M., Vitali F, Legislative Drafting Systems, in: Usability in Government Systems, NEW YORK, Morgan Kaufmann, 2012, pp. 133 - 151
[9] Palmirani Monica (2011). Legislative change management with Akoma Ntoso. In Legislative XML for the semantic Web, 101–130. Springer
[10] Palmirani Monica et. al. Drafting legislation in the era of AI and digitisation, https://joinup.ec.europa.eu/sites/default/files/document/2022-06/Drafting%20legislation%20in%20the%20era%20of%20AI%20and%20digitisation%20%E2%80%93%20study.pdf
[11] Palmirani Monica, F Sovrano, D Liga, S Sapienza, F Vitali, Hybrid AI Framework for Legal Analysis of the EU Legislation Corrigenda, Legal Knowledge and Information Systems, 68-75
[12] Palmirani Monica, Luca Cervone, Octavian Bujor, Marco Chiappetta: RAWE: An Editor for Rule Markup of Legal Texts. RuleML (2) 2013
[13] Palmirani Monica, Raffaella Brighi: An XML Editor for Legal Information Management. EGOV 2003: 421-429
[14] Tafani Laura, Federico Ponte, Le tecniche legislative statali, regionali e dell'Unione europea a confronto : per un auspicabile ravvicinamento, Osservatorio sulle fonti. - 2022, n. 1, p. 447-497
[15] Leos Manual, https://joinup.ec.europa.eu/collection/justice-law-and-security/solution/leos-open-source-software-editing-legislation/releases
[16] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. Distributed Representations of Words and Phrases and their Compositionality. In Proceedings of NIPS, 2013.
Figure 1 – Architecture of the system.
Figure 2 – Interface of LEOS with the add-on.
Figure 3 – Interface of LEOS with the add-on results.