=Paper=
{{Paper
|id=Vol-3762/590
|storemode=property
|title=Legal Drafting supported by AI: enhancing LEOS
|pdfUrl=https://ceur-ws.org/Vol-3762/590.pdf
|volume=Vol-3762
|authors=Monica Palmirani,Fabio Vitali,Generoso Longo,Emanuele Di Sante,Aurora Brega,Andrea D'Arpa,Michele Corazza
|dblpUrl=https://dblp.org/rec/conf/ital-ia/PalmiraniVLSBDC24
}}
==Legal Drafting supported by AI: enhancing LEOS==
<pdf width="1500px">https://ceur-ws.org/Vol-3762/590.pdf</pdf>
<pre>
                                Legal Drafting supported by AI: enhancing LEOS
                                Monica Palmirani1∗, Fabio Vitali2, Generoso Longo1, Emanuele Di Sante1, Aurora
                                Brega1, Andrea D’Arpa1, Michele Corazza1

                                1University of Bologna, ALMA-AI, via Galliera 3, Bologna, 40121, Italy
                                2 University of Bologna, DISI, via Mura Anteo Zamboni 7, Bologna, 40126, Italy


                                                   Abstract
                                                   Legal drafting is a complex activity that involves different actors and end-users, usually belonging
                                                   to the administrative staff. AI tools could support this activity by providing useful aid for various
                                                   tasks. This paper presents scenarios where the AI add-on supports the legal drafting activity
                                                   conducted using the LEOS web editor, developed by the EU Commission for EU legislation. The
                                                   scenarios are the following: i) retrieving the relevant existing normative definitions
                                                   connected with the ongoing bill, by using algorithms based on semantic similarity; ii) suggesting
                                                   the most pertinent normative references when some information is missing (e.g., the year); iii)
                                                   aiding the drafter in following templates for improving clarity and regularity in the norms
                                                   (e.g., modifications). Additionally, it is crucial to model a user interface that is capable of
                                                   guaranteeing some foundational principles: accessibility, transparency, usability, user
                                                   experience, and explicability. This paper presents the output of this project conducted in
                                                   collaboration with the DG Informatics of the EU Commission.

                                                   Keywords
                                                   Akoma Ntoso, LEOS, similarity, AI.1


Ital-IA 2024: 4th National Conference on Artificial Intelligence, organized by CINI, May 29-30, 2024, Naples, Italy
∗ Corresponding author. All authors contributed equally.
monica.palmirani@unibo.it (M. Palmirani); fabio.vitali@unibo.it (F. Vitali);
generoso.longo@studio.unibo.it (G. Longo); emanuele.disante@studio.unibo.it (E. Di Sante);
aurora.brega@studio.unibo.it (A. Brega); andrea.darpa@studio.unibo.it (A. D’Arpa);
michele.corazza@unibo.it (M. Corazza).
ORCID: 0000-0002-8557-8084 (M. Palmirani); 0000-0002-7562-5203 (F. Vitali); 0000-0002-7288-6635 (M. Corazza)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License
Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

1. Introduction
    Legal drafting is a crucial task in the legislative procedure of any deliberative assembly.
The goals of this task are many: i) to support the political decision-makers; ii) to standardize
the language with the legal tradition, adopting multilingual translations when necessary; iii) to
apply drafting rules to improve quality and clarity; iv) to guarantee the Rule of Law and the
principles of the theory of law; v) to track the modifications happening over time due to the
legislative process. In the last 15 years many specialized editors have been developed
[13],[5],[3],[1] in order to support these important goals using natural language processing
technology [6]. Among the proposed solutions some use the Semantic Web approach [2], while others
apply Symbolic AI based on rules [12]. LEOS [5], [10] is one of the most promising web editors for
legal drafting; it has been developed by the EU Commission to support its internal legal drafting
activities, and also with the aim of serving the Member States.
    LEOS is an open-source web editor specific for legal drafting; it is written in Angular and it
is oriented to managing the whole law-making process [15].
    The aim of this work is to develop a framework architecture that is capable of enhancing LEOS
with add-ons, developed with AI technologies, that improve the quality of the legal content, help
the legal drafters, and manage the law-making process. The two add-ons provide the following
features [7],[4],[14]:
       (i)   Suggest the pertinent normative definitions using similarity with the bill topic;
       (ii)  Suggest the pertinent normative reference using the thematic similarity with the bill;
       (iii) Take into consideration the temporal information and the nested normative references;
       (iv)  Use the metadata of ELI (https://eur-lex.europa.eu/eli-register/about.html) and
             EUROVOC (https://eur-lex.europa.eu/browse/eurovoc.html?locale=it) to improve the
             similarity.

    The aim is also to create a user interface capable of:
       (i)   Reducing manual, error-prone work in typing the normative references, also avoiding
             repetitions in legislative citations;
       (ii)  Maximising reuse of similar legal concepts (e.g., definitions);
       (iii) Increasing transparency and searchability of the existing legal knowledge included in
             the corpora.

2. Methodology
The adopted methodology is based on hybrid AI [11], and it uses multiple techniques for achieving
its goals. We do not generate new text (e.g., using LLMs or generative AI), but we intend to
suggest pertinent, contextual, and significant existing legal knowledge extracted from the legal
corpora, using a similarity index computed from the parameters of the bill that the legal drafter
is writing. We also use the EUROVOC classification and other contextual information provided by
the experts during the drafting process (e.g., type of provision).
    Secondly, the approach takes into consideration the temporal validity of the normative
provisions, excluding those that are repealed, or suggesting the appropriate versions of the
consolidated text according to the view date typed by the end-user. If the author seeks the
normative definition of “privacy” before the GDPR, they can set the view date before 24 May 2016
(the date of entry into force of the act) and the system will respect this setting.
    Thirdly, we resolve the normative references in order to include the cited text in the
indexing model as well, recursively but only to the first level. This allows us to grasp more
information, especially when the definition is limited in the text and consists only of normative
citations to another provision (e.g., “For the purposes of this Directive, the definitions laid
down in Article 2 of Directive 2000/60/EC shall apply”).
    Fourthly, the context is important for providing the relevant output of the suggestion. A
definition depends on the topic of the bill. For example, there are many definitions of
‘accuracy’, and the appropriate one depends on the topic of the document.
    Fifthly, the user interface is a fundamental pillar for guaranteeing good usability,
transparency, and explicability of the AI behaviors and output [8].
    Finally, we use the Akoma Ntoso [9] serialization for capturing the structure of the legal
documents, the normative references, the metadata of the lifecycle of the document, and the dates
of entry into force, entry into operation, and repeal.

3. Dataset
The dataset used is composed of 10 years of European legislation (2010-2021), about 15,000
regulations and directives. It was provided by the European Publication Office in Formex 3.0 XML
format. We have converted all the documents into Akoma Ntoso and, using a natural language
processing approach, we have annotated the definitions and the normative references.
    The dataset includes about 899 documents with definitions. For definitions, we have considered
only the explicit provisions usually titled “Definitions” or those where a regular pattern can
reliably identify the relationship between a term (definiendum) and its description (definiens)
(e.g., ‘definiendum’ means definiens, as in “‘domain’ means one or several data sets that cover
specific topics;”). The definitions that include normative references are managed by navigating
the link to include the complete information (e.g. ‘personal data’ means personal data as defined
in point (1) of Article 4 of Regulation (EU) 2016/679).

4. Use Cases

4.1. Normative References
Normative references are qualified citations used for mentioning other documents or provisions
relevant for the normative discourse. Errors made while typing normative references produce
incorrect links and additional effort in the control phases.
    The system lets the drafter type incomplete normative references, and it retrieves and ranks
the existing and
in-force references that are similar to the information requested by the end-user. Given a
citation of the form “Regulation 406”, for example, the system returns all the Regulations that
are valid, in force, numbered 406, and pertinent to the EUROVOC of the bill. The system completes
the reference (e.g., Regulation 406/2010) and also returns the title of the document and other
information for identifying the act.
    Due to the evolution of the European institutions, the references have changed syntax and
patterns over time. For this reason, the end-user can easily make a mistake in the citation
format. Our tool helps the end-user to compose the reference according to the historical period
of the document cited. For example, a Regulation before 1968 is cited using number/yy/EEC (e.g.,
Regulation No 1009/67/EEC); after 1968 we have number/yy preceded by the domain (e.g., Regulation
(EEC) No 2195/91); after 2009 the domain becomes EU (e.g., Regulation (EU) No 406/2010); and
since 2015 we have yyyy/number (e.g., Regulation (EU) 2016/679).

4.2. Legal Definitions
Legal definitions are a sensitive part of the law because they define new legal concepts, new
terminology, equivalences between other definitions, and exceptions for specific cases. In EU
legislation, we usually have a clear article called “Definitions”, but sometimes we can also find
technical definitions in the last part of the act or in the annexes.
    Additionally, definitions can be organized in a long list of points, which might be connected
to each other. Definitions are composed of three main parts: definiendum (term); definiens
(description); legal concept (abstract class of concept). The use of the same term for multiple
definitions is not infrequent, and the term might have completely different meanings in different
domains (e.g., pollution has different definitions according to the domain, such as water,
energy, industry, etc.).
    For this reason, the tool calculates the similarity of a given term (which can also be
composed of multiple words) with the existing, valid, and updated (present in consolidated
versions of documents) definitions in the legal corpus, using the similarity index as a criterion.

5. Architecture
The overall architecture (see Figure 1) is composed of an XML database that includes the Akoma
Ntoso XML documents and an SQL database containing the correspondence between each document and
its EUROVOC categorization. Each EUROVOC descriptor is associated with an average of the Word2Vec
[16] embeddings of the words composing it. The eXist database, which includes all the AKN-XML
documents, can also use the Lucene Java library to index the document text and in particular the
definitions (defBody elements); eXist is an XML database that is indexed using Lucene and
queryable with XQuery. When a new document enters the eXist database it is also indexed in the
SQL DB and the Word2Vec representation of its definitions is stored. If the document does not
have EUROVOC tags, we extract them from CELLAR and we serialize the information in the metadata
of the Akoma Ntoso documents.
    During legal drafting, if the end-user wants to get a suggestion (e.g., normative reference
or definition), they need to provide some parameters as inputs, like the title and the EUROVOC
keywords of the bill (proposal of law), in order to calculate the corresponding indexes. The
dynamic input typed by the end user (e.g., incomplete normative reference or definition keywords)
is parsed to compare the content with the existing document collection in eXist. After a first
filter using traditional Information Retrieval techniques for grasping the relevant documents,
the similarity score is calculated on the text retrieved, comparing it with the embeddings of the
input parameters stored in the SQL DB (for EUROVOC values) and using the similarity algorithm of
Lucene for the definitions. The ranking is based on the index score and the temporal parameters,
also considering the normative citations included in the normative provision retrieved.
    The Lucene Similarity class implements the scoring model. The library offers several
ready-made implementations of the Similarity class, which reflect different scoring models
developed in the field of Information Retrieval. Our implementation adopts the DefaultSimilarity
class, which combines the Boolean model, adopted to filter documents matching the query, and a
refinement of the Vector Space Model (VSM), based on TF-IDF weights, for scoring results. In
particular, VSM is refined by Lucene taking into account the corpus statistics contained in the
inverted index, the number of terms that correspond to the query, and the boost factors expressed
in the search query. This class is also exploited by the indexing chain, since it deals with the
calculation of the normalization factors, which depend on the length of the fields and the boost
factors specified in the configuration (see Similarity, Lucene 3.6.1 API, apache.org).

6. User-interface
The user interface (see Figure 2) is a fundamental part of this application. LEOS is enriched
with an add-on that enables these functionalities in a selective way. The suggestions are offered
in a portion of the window that allows the end-user to confirm or discard the output, or to
integrate the results in the drafting text.
    Our custom components are organised in a dedicated application folder, comprising new
components (stored in .component.ts, .component.html, and .component.scss files), new classes
(.ts files), and a service (in a .service.ts file). This service manages the essential methods
and global variables used by our approach.
    To maintain consistency, we adopted a style for our extension that closely imitates the
original application's design. Many of the components used were taken from the eUI library, and
we followed the guidelines and suggestions provided by the eUI framework. The version of the eUI
library used is 14, the same one adopted by LEOS and used in its native components. Therefore,
both the shape and color of the interface elements are consistent with those indicated by the
framework.
    The components we added always provide feedback to the user, displaying results when
generated, an error message if the service response raises an issue, and an alert if the user's
request is not executed correctly, in accordance with the functionality we aim to provide. We
designed it so that the user knows the reasons for an incomplete or incorrect request and is
given the opportunity to make any necessary corrections. We also strive to maintain consistency
in the terms used in the labels, ensuring that each element is identified by a unique name and
avoiding multiple elements with the same name.
    The end user of the service is an expert in legislative matters, so we prioritised making the
interface simple and intuitive but also very specific for professional tasks in drafting,
considering that the user has clear knowledge of the subject matter being addressed. We created
mockups of the interface to evaluate it before implementation, ensuring that it is indeed usable
and effective. The end-user is constantly involved in the evaluation with regular meetings where
the usability is tested and feedback is incorporated in the software.

7. Conclusions
    The current paper presents two add-ons integrated into the LEOS web editor to enhance legal
drafting tasks using AI applications. The user interface is a fundamental component of this work
and is designed to incorporate the principles of transparency, accessibility, user experience,
and explicability. The methodology deliberately does not generate new text (as LLMs do), in order
to avoid hallucinations, which could affect the democratic rules of the law-making process.
    We aim to extract and offer to the legal drafters the legal knowledge stored in the corpus,
which is sometimes difficult to find due to the large volume of documents, and to return the
relevant information accompanied by an index score based on temporal parameters and on text
similarity, using qualified legal provisions like definitions and normative references. The first
results were evaluated by legal experts, who found them promising and pertinent to the text being
drafted. Moreover, the end-users appreciated the provided suggestions, which could retrieve
pertinent information using topic similarity, cutting repetitive work and letting them focus on
higher-level tasks.

Acknowledgements
This project is co-funded by DG Informatics of the European Commission inside of the larger
project LEOS and with the support of the European Commission funds within ERC HyperModeLex. Grant
agreement ID: 101055185.

References
[1] Agnoloni Tommaso, E Francesconi, P Spinosa,
     xmLegesEditor: an opensource visual XML
     editor for supporting legal national standards,
     Proceedings of the V legislative XML workshop,
     239-251, 2007
[2] Casanovas, Pompeu; Palmirani, Monica; Peroni,
     Silvio; Van Engers, Tom; Vitali, Fabio, Semantic
     Web for the Legal Domain: The next step,
     «SEMANTIC WEB», 2016, 7, pp. 213 - 227
[3] Grant Vergottini,
     https://xcential.com/legispro/standards/
[4] Griglio Elena - Marchetti Carlo, La "specialità"
     delle sfide tecnologiche applicate al drafting
     parlamentare : dal quadro comparato
     all'esperienza del Senato italiano / Elena Griglio,
     Carlo Marchetti, Osservatorio sulle fonti. - 2022,
     n. 3, p. 361-386
[5] LEOS
     https://joinup.ec.europa.eu/collection/justice-law-and-security/solution/leos-open-source-software-editing-legislation
[6] Lesmo Leonardo, Alessandro Mazzei, Monica
     Palmirani, Daniele Paolo Radicioni: TULSI: an
     NLP system for extracting legal modificatory
     provisions. Artif. Intell. Law 21(2): 139-172
     (2013)
[7] Lorello Laura, La qualità della legislazione
     diventa un’esigenza bicamerale : considerazioni
     sul nuovo Comitato per la legislazione del Senato
     della Repubblica nel primo semestre di attività,
     Federalismi.it : rivista di diritto pubblico
     italiano, comunitario e comparato. - 2023, n. 27,
     p. 17-45
[8] Palmirani M., Vitali F, Legislative Drafting
     Systems, in: Usability in Government Systems,
     NEW YORK, Morgan Kaufmann, 2012, pp. 133 -
     151
[9] Palmirani Monica (2011). Legislative change
     management with Akoma Ntoso. In Legislative
     XML for the semantic Web, 101–130. Springer
[10] Palmirani Monica et al., Drafting legislation in
     the era of AI and digitisation,
     https://joinup.ec.europa.eu/sites/default/files/document/2022-06/Drafting%20legislation%20in%20the%20era%20of%20AI%20and%20digitisation%20%E2%80%93%20study.pdf
[11] Palmirani Monica, F Sovrano, D Liga, S Sapienza,
     F Vitali, Hybrid AI Framework for Legal Analysis
     of the EU Legislation Corrigenda, Legal
     Knowledge and Information Systems, 68-75
[12] Palmirani Monica, Luca Cervone, Octavian Bujor,
     Marco Chiappetta: RAWE: An Editor for Rule
     Markup of Legal Texts. RuleML (2) 2013
[13] Palmirani Monica, Raffaella Brighi: An XML
     Editor for Legal Information Management. EGOV
     2003: 421-429
[14] Tafani Laura, Federico Ponte, Le tecniche
     legislative statali, regionali e dell'Unione
     europea a confronto : per un auspicabile
     ravvicinamento, Osservatorio sulle fonti. - 2022,
     n. 1, p. 447-497
[15] LEOS Manual
     https://joinup.ec.europa.eu/collection/justice-law-and-security/solution/leos-open-source-software-editing-legislation/releases
[16] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg
     Corrado, and Jeffrey Dean. Distributed
     Representations of Words and Phrases and their
     Compositionality. In Proceedings of NIPS, 2013.
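
Illustrative sketch (reference completion). The completion of an incomplete normative
reference described in Section 4.1, combined with the view-date filter of Section 2, can be
sketched in Python as follows. This is not the code of the LEOS add-on: the Act record, the
complete_reference function, the in-memory catalogue, and all sample titles, dates and EUROVOC
labels are assumptions made only for the example, and the ranking by shared EUROVOC labels is a
toy stand-in for the similarity index.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Act:
    """Hypothetical catalogue record; field names are illustrative only."""
    doc_type: str                 # e.g. "Regulation"
    number: int
    year: int
    title: str
    eurovoc: frozenset            # EUROVOC labels attached to the act
    in_force_from: date
    repealed_on: Optional[date] = None

    def in_force(self, view: date) -> bool:
        # An act counts as in force from its entry into force until its repeal.
        if view < self.in_force_from:
            return False
        return self.repealed_on is None or view < self.repealed_on


def complete_reference(doc_type, number, bill_eurovoc, view, catalogue):
    """Return in-force acts matching an incomplete citation, best topic overlap first."""
    hits = [a for a in catalogue
            if a.doc_type == doc_type and a.number == number and a.in_force(view)]
    # Toy criterion: rank by how many EUROVOC labels the act shares with the bill.
    return sorted(hits, key=lambda a: len(a.eurovoc & bill_eurovoc), reverse=True)


# Invented sample data: titles, dates and labels are placeholders, not real metadata.
CATALOGUE = [
    Act("Regulation", 406, 2010, "Sample act on vehicles",
        frozenset({"motor vehicle", "hydrogen"}), date(2010, 6, 1)),
    Act("Regulation", 406, 2013, "Sample act on fisheries",
        frozenset({"fishery"}), date(2013, 5, 1)),
    Act("Regulation", 406, 2005, "Sample repealed act",
        frozenset({"motor vehicle"}), date(2005, 4, 1), date(2009, 1, 1)),
]

# The drafter typed "Regulation 406" while working on a bill about motor vehicles.
results = complete_reference("Regulation", 406, frozenset({"motor vehicle"}),
                             date(2020, 1, 1), CATALOGUE)
for act in results:
    print(f"{act.doc_type} {act.number}/{act.year}: {act.title}")
```

Moving the view date back in time changes the candidates: with a view date in 2008 only the
act that was later repealed is returned, which mirrors the behaviour described for definitions
sought "before the GDPR".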
Figure 1 – Architecture of the system.
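
Illustrative sketch (EUROVOC embeddings). Section 5 states that each EUROVOC descriptor is
associated with the average of the Word2Vec embeddings of its words. A minimal plain-Python
sketch of that idea follows. The three-dimensional vectors in TOY_VECTORS and the helper names
are invented for the example, and the use of cosine similarity is an assumption, since the
paper does not name the measure applied to the averaged vectors.

```python
import math

# Toy 3-dimensional "embeddings"; real Word2Vec vectors have hundreds of dimensions.
TOY_VECTORS = {
    "water":      [0.9, 0.1, 0.0],
    "pollution":  [0.7, 0.3, 0.1],
    "energy":     [0.1, 0.9, 0.2],
    "market":     [0.0, 0.6, 0.8],
    "data":       [0.1, 0.1, 0.9],
    "protection": [0.3, 0.2, 0.7],
}


def average_vector(words):
    """Average the vectors of the known words; unknown words are skipped."""
    known = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not known:
        raise ValueError("no known words to embed")
    dims = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dims)]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_descriptors(bill_keywords, descriptors):
    """Score each descriptor against the bill keywords, most similar first."""
    bill_vec = average_vector(bill_keywords)
    scored = [(d, cosine(bill_vec, average_vector(d.split()))) for d in descriptors]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_descriptors(["water", "pollution"],
                           ["water pollution", "energy market", "data protection"])
for descriptor, score in ranking:
    print(f"{descriptor}: {score:.3f}")
```

Averaging keeps one fixed-size vector per descriptor, so the vectors can be computed once at
indexing time and stored, as the paper describes for the SQL database.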


Figure 2 – Interface of LEOS with the add-on.


Figure 3 – Interface of LEOS with the add-on results.
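
Illustrative sketch (definition pattern). Section 3 mentions that definitions are annotated
where a regular pattern of the form ‘term’ means description can be identified. A minimal
Python sketch of such a pattern follows. The regular expression and the parse_definition
helper are assumptions made for illustration; the annotation pipeline used for the dataset is
not described at this level of detail in the paper.

```python
import re

# Regular pattern: a quoted term (curly or straight quotes), the verb "means",
# then the description up to an optional closing ";" or ".".
DEFINITION = re.compile(r"[‘']([^’']+)[’']\s+means\s+(.+?)\s*[;.]?\s*$")


def parse_definition(text):
    """Return (term, description) if the text follows the pattern, else None."""
    match = DEFINITION.search(text.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


example = "‘domain’ means one or several data sets that cover specific topics;"
print(parse_definition(example))
print(parse_definition("This Regulation shall apply from the date of its publication."))
```

A pattern of this kind only covers the explicit, regular cases; definitions expressed through
a normative reference still require following the link, as described in Section 3.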

</pre>