<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Obligations Extraction System for Heterogeneous Legal Documents: Building and Evaluating Data and Model</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Maria Iacono</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Laura Rossi</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paolo Dangelo</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Tesei</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lorenzo De Mattei</string-name>
        </contrib>
        <aff>Aptus.AI / Pisa, Italy; {maria, laura, paolo, andrea, lorenzo}@aptus.ai</aff>
      </contrib-group>
      <abstract>
        <p>A system that automatically extracts obligations from heterogeneous regulations could be of great help to a variety of stakeholders, including financial institutions. To reach this goal, we propose a methodology for building a training set of regulations written in Italian and drawn from different legal sources, together with a system based on a Transformer language model to solve the task. More importantly, we examine in depth the process of human and machine-learned annotation by carrying out both quantitative and manual evaluations of each.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>Compliance practitioners in financial institutions are overburdened by the high volume of incoming regulations from different legal sources, such as the European Union, national legislation, central banks and independent administrative authorities, to name a few. Part of the work of compliance offices consists of extracting obligations from this vast amount of regulation in order to trigger compliance processes. Extracting obligations from so many regulations is tedious and repetitive work, so systems that automate the process could be very useful for cutting costs. Machine Learning (ML) and Natural Language Processing (NLP) can help. However, given the variety of legal sources, training this kind of system is a complex activity because it requires a sufficient amount of annotated data, which is expensive to produce, especially if the annotations require legal domain experts.</p>
      <p>Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>The obligations extraction topic has already been studied with different approaches. Bartolini et al. (2004) used a shallow syntactic parser and hand-crafted rules to automatically classify law paragraphs according to their regulatory content and to extract relevant text fragments corresponding to specific semantic roles. Similarly, Sleimi et al. (2018) automatically represent the semantics of legal texts using an RDF schema, with a system based on a dependency parser and hand-crafted rules. Sleimi et al. (2019) used the same representation to build a question-answering system with a focus on obligations. Biagioli et al. (2005) represent law paragraphs as bags of words with either TF or TF-IDF weighting <xref ref-type="bibr" rid="ref11">(Salton and Buckley, 1988)</xref> and use Support Vector Machines (SVM) to classify each paragraph as a type of provision, including obligations. A similar approach is adopted by Francesconi and Passerini (2007), who classify the paragraphs of legislative texts according to the proposed provision model. They represent the paragraphs in a similar way to <xref ref-type="bibr" rid="ref2">Biagioli et al. (2005)</xref> and use two learning algorithms: Naive Bayes and SVM. Sleimi et al. (2020) propose to address the complexity of regulatory texts by writing them according to a set of standard templates that can be easily parsed.</p>
      <p><bold>Contributions.</bold> In this work we offer four main contributions. (i) We propose a methodology for building training corpora relying on non-expert annotators, and we apply it to a set of heterogeneous regulations written in Italian and drawn from different legal sources. (ii) We assess the quality of this methodology using an inter-annotator agreement score, and we carry out an error analysis to highlight if and when expert annotators are required. (iii) We use the resulting dataset to train and test an obligations classification system based on neural networks, as this approach has been shown to provide state-of-the-art results for several Italian classification tasks <xref ref-type="bibr" rid="ref4 ref3 ref9">(De Mattei et al., 2018; Cimino et al., 2018; Occhipinti et al., 2020)</xref>. (iv) We conduct a manual error analysis to investigate the strengths and limitations of this system.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Task Description</title>
      <p>The task we tackle consists of classifying regulation clauses as either obligations or not. By obligation we mean, from a juridical point of view, a legal constraint imposed by law and addressed to a juridical person.</p>
      <p>Since we are interested in developing a system that supports financial institutions, we distinguish two categories of obligations, classifying them as relevant or irrelevant for financial institutions. Each clause can therefore be classified into one of the following three categories: (i) not obligation, (ii) relevant obligation and (iii) not relevant obligation. This classification schema allows practitioners to retrieve in one click either all the obligations or only the relevant ones, so that they can decide whether to have a complete overview of the laws they are consulting or to focus only on the obligations that actually affect their institutions.</p>
      <p>To distinguish the two categories, we look at the subject to whom the obligation is addressed: if it is a public institution, we classify the obligation as irrelevant; in all other cases, as relevant. This simplified criterion may seem extreme, since it implies that any obligation not addressed to a public institution must be considered relevant for a financial institution. However, we believe it is a good strategy because the documents we analyze are already filtered, i.e., they belong to a category of laws that impact financial institutions. Consequently, an obligation in these documents that is not directed at a public institution will almost certainly be directed, in some way, at financial institutions.</p>
      <p><bold>2.1 Special Cases</bold></p>
      <p>Legal jargon is not merely a tool used for argumentation or narrative, but a constitutive element of the law. Consequently, the structure of legal texts has particular characteristics that must follow precise and predictable patterns. Despite this, there are cases in which the language can be ambiguous. Since our goal is to build a dataset in line with compliance practitioners' expectations, we analyzed some special cases with a group of experts in order to provide clear guidelines to annotators.</p>
      <p>One such case is when an obligation is expressed indirectly, for example through the formulation of a right. If an article talks about rights of any kind, it assumes that those rights must be respected. So, for example, a client's right to obtain a loan (client's point of view) corresponds to a duty of the bank, which is obliged to grant it if the client meets the requirements (bank's point of view). Similarly, an employee's right to go on vacation means that the employer must guarantee vacation days. For this reason, in deciding how to classify a part of a law, the concept of "priority" comes into play in addition to the annotator's interpretation. Since our application is designed to support financial institutions, our priority is to highlight the obligations that they must take into account in order not to risk penalties. Consequently, if a sentence represents both a right for one subject and a duty for another, we prioritize the obligation when classifying it.</p>
      <p>Another case where the priority factor comes
into play is that of clauses that contain both
relevant and irrelevant obligations. In these cases,
since we cannot break the clause down into several
parts, we give priority to the relevant obligation.
In terms of risk, it is better to classify an irrelevant
obligation as relevant, rather than the other way
around.</p>
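      <p>The addressee rule and the priority rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the names <monospace>Label</monospace>, <monospace>label_obligation</monospace> and <monospace>label_clause</monospace> are hypothetical, and the sketch assumes that the obligations in a clause, including those implied by a right, have already been identified together with the kind of addressee.</p>
      <preformat>
```python
from enum import Enum


class Label(Enum):
    NOT_OBLIGATION = "not obligation"
    RELEVANT = "relevant obligation"
    NOT_RELEVANT = "not relevant obligation"


def label_obligation(addressee_is_public):
    """Addressee rule: a public addressee means not relevant, anything else relevant."""
    return Label.NOT_RELEVANT if addressee_is_public else Label.RELEVANT


def label_clause(obligation_addressees):
    """Resolve a single label for a clause (hypothetical helper).

    obligation_addressees holds one boolean per obligation found in the
    clause, True when that obligation is addressed to a public institution.
    """
    if not obligation_addressees:
        return Label.NOT_OBLIGATION
    labels = [label_obligation(is_public) for is_public in obligation_addressees]
    # Priority rule: the clause cannot be split, so one relevant obligation wins.
    if Label.RELEVANT in labels:
        return Label.RELEVANT
    return Label.NOT_RELEVANT
```
      </preformat>
      <p>For instance, <monospace>label_clause([True, False])</monospace> returns the relevant label, reflecting the choice to accept an irrelevant obligation marked as relevant rather than the opposite error.</p>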
      <p>In addition, we have to consider that obligations may be expressed implicitly. For example, if a person can perform an action only under certain conditions, those conditions can be interpreted as obligations. According to this principle, we do not classify a sentence such as "Spectators may enter the theatre" as an obligation. We do, however, when a condition is added, as in the sentence "Spectators may enter the theatre only if they have the ticket."</p>
      <p>Even if we, as readers, do not pay attention to it, normative texts often contain implicit information that readers are naturally able to recover while reading, such as an implied subject, or a reference to another part of the document or to an external document. Unlike a human reader, an automatic classifier that is not provided with enough context may have difficulty handling this kind of case.</p>
    </sec>
    <sec id="sec-3">
      <title>3 Data Annotation</title>
      <p>We extracted the dataset from Daitomic (<ext-link ext-link-type="uri" xlink:href="https://www.daitomic.com/">https://www.daitomic.com/</ext-link>), a product that automatically collects legal documents from a wide variety of legal sources, automatically represents them according to the Akoma Ntoso standard <xref ref-type="bibr" rid="ref10">(Palmirani and Vitali, 2011)</xref> and makes them available through a dedicated user interface. The adoption of Akoma Ntoso lets us represent the structure of heterogeneous legal texts in a unified format, which allows us to apply the same operations to very different kinds of poorly encoded documents, such as PDF, HTML and DOCX files.</p>
      <p>The corpus was manually labelled by three trained annotators with no previous background in the legal domain, and contains 71 regulations for a total of 10,628 clauses. We selected regulations that touch on heterogeneous topics, such as data privacy, financial risk, tax compliance and many more, but all of them are known to be relevant for financial institutions. To deal with the heterogeneity of normative sources, we took texts from different sources so that we could train the model in a balanced way. In particular, we extracted the texts from thirty of the most important regulatory sources for Italian financial institutions, including Gazzetta Ufficiale Italiana, EUR-Lex, Consob, Banca d'Italia and many more. From these sources, we selected texts of different types: acts, regulations, decisions, directives, communications, statutes, and more. In this way, we created a very heterogeneous dataset that can be considered representative of the wide variety of existing regulations.</p>
      <p>The annotations were carried out directly in the graphical user interface of the Daitomic application, which allows annotators, within the consultation section, to mark the requirements present in a law and to classify them as relevant or not relevant. The texts in the application are already structured: they have a tree structure divided into chapters, articles, paragraphs, clauses, etc., and we annotated the smallest parts, i.e. clauses. Each clause is flanked by a sidebar; clicking on it opens the pop-up shown in Figure 1, which allows the annotators to choose the label that they consider most appropriate. As a result of this choice, the sidebar turns light blue if the obligation is classified as relevant to financial institutions, and dark blue if it is not relevant.</p>
      <p>We picked four of the annotated laws, containing 2,189 clauses in total, to be annotated by all three annotators.</p>
    </sec>
    <sec id="sec-4">
      <title>4 Annotations Evaluation</title>
      <p>We used the part of the dataset annotated by all three annotators to calculate the inter-annotator agreement (IAA). Using Krippendorff's Alpha, we computed the IAA in two ways: first considering only whether the annotators had classified each clause as an obligation or a non-obligation, then also taking into account their choice between relevant and non-relevant obligations. The resulting IAA is 0.58 when the distinction between relevant and not relevant is considered, and rises to 0.70 when it is not.</p>
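      <p>As a rough illustration of the two agreement settings, the sketch below computes Krippendorff's Alpha for nominal labels, once on the three-way labels and once after collapsing the relevance distinction. It is a minimal sketch, not the code used for the reported scores: the function names and the toy annotations are hypothetical, and it assumes that every clause has labels from at least two annotators and that at least two distinct labels occur overall.</p>
      <preformat>
```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's Alpha for nominal data.

    units is a list of lists; each inner list holds the labels that the
    annotators assigned to one clause.
    """
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m in (0, 1):
            continue  # no pairs can be formed for this clause
        # Every ordered pair of labels from different annotators counts 1/(m-1).
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(totals[a] * totals[b]
                   for a in totals for b in totals if a != b) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one label occurs")
    return 1.0 - observed / expected


def collapse_relevance(units):
    """Merge relevant and not relevant obligations into one class."""
    return [["not obligation" if label == "not obligation" else "obligation"
             for label in labels] for labels in units]


# Hypothetical toy annotations: three annotators, three clauses.
toy = [
    ["relevant obligation", "not relevant obligation", "relevant obligation"],
    ["not obligation", "not obligation", "not obligation"],
    ["relevant obligation", "relevant obligation", "relevant obligation"],
]
alpha_three_way = krippendorff_alpha_nominal(toy)
alpha_binary = krippendorff_alpha_nominal(collapse_relevance(toy))
```
      </preformat>
      <p>On the toy data the only disagreement concerns relevance, so the binary score is higher than the three-way score, which mirrors the direction of the difference reported above.</p>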
      <p>To better understand these results, we carried out a manual analysis, which showed that most cases of disagreement are of two kinds (two examples are reported in Table 1). The lack of agreement can be attributed primarily to the fact that a clause often has no explicitly expressed subject, either because the subject is expressed in the preceding clauses or because it can be inferred from the context, as in the first example. Another frequent reason for disagreement is that our annotators, not being experts in the legal field, are not always able to identify the kind of subject to which the obligation is addressed, as in the second example. In such cases, expert annotators might be more reliable.</p>
    </sec>
    <sec id="sec-5">
      <title>5 Automatic Classifier</title>
      <p>We also used the dataset we built to train an automatic classifier. We split the dataset into training (90%) and test (10%) sets. As a learning model, we used UmBERTo (<ext-link ext-link-type="uri" xlink:href="https://github.com/musixmatchresearch/umberto">https://github.com/musixmatchresearch/umberto</ext-link>), an Italian pretrained language model trained by Musixmatch and based on the RoBERTa architecture <xref ref-type="bibr" rid="ref7">(Liu et al., 2019)</xref>, which has recently been shown to provide state-of-the-art performance on other Italian tasks <xref ref-type="bibr" rid="ref12 ref6 ref9">(Occhipinti et al., 2020; Sarti, 2020; Giorgioni et al., 2020)</xref>. This language model has 12 layers, a hidden size of 768, 12 attention heads and 110M parameters. On top of the language model, we added a ReLU classifier <xref ref-type="bibr" rid="ref8">(Nair and Hinton, 2010)</xref>. All the model's weights were updated during fine-tuning. We applied dropout <xref ref-type="bibr" rid="ref16">(Srivastava et al., 2014)</xref> with probability 0.1 to both the attention and the hidden layers. We used cross-entropy as the loss function and trained the system until early stopping at epoch 6. The performance obtained on the test set is reported in Table 2. The system's performance is fairly good compared to the IAA, but not reliable enough for use in real-world scenarios. However, if we evaluate the system without considering the difference between not relevant and relevant obligations (Table 3), we observe much more accurate results, suggesting that the system, similarly to the annotators, performs well in identifying obligations but struggles to distinguish between relevant and not relevant obligations.</p>
      <p>[Table 1: examples of annotator disagreement. Columns: Annotator 1, Annotator 2, Annotator 3, text. Label values present: not relevant.]</p>
      <p>Example 1. I contratti di assicurazione di cui al comma 1, lettera b), sono corredati da un regolamento, redatto in base alle direttive impartite dalla COVIP [...] en: [The insurance contracts referred to in paragraph 1, letter b), are accompanied by a regulation, drawn up on the basis of the directives issued by COVIP [...]]</p>
      <p>Example 2. Il soggetto incaricato del collocamento nel territorio dello Stato provvede altresì agli adempimenti stabiliti [...] en: [The person in charge of placement in the territory of the State also provides for the established obligations [...]]</p>
      <p>[Tables 2 and 3: classifier performance on the test set. Columns: Precision, Recall, F-Score. Rows present: Not Obligations, Obligations.]</p>
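      <p>The two evaluation settings, with and without the relevance distinction, can be illustrated with a short sketch that computes per-class precision, recall and F-score and then repeats the computation on collapsed labels. The function names and the toy labels are hypothetical and serve only as an example; they are not the paper's evaluation code or data.</p>
      <preformat>
```python
def precision_recall_f1(gold, predicted, target):
    """Precision, recall and F-score of one class from parallel label lists."""
    pairs = list(zip(gold, predicted))
    true_pos = sum(1 for g, p in pairs if g == target and p == target)
    pred_pos = sum(1 for _g, p in pairs if p == target)
    gold_pos = sum(1 for g, _p in pairs if g == target)
    precision = true_pos / pred_pos if pred_pos else 0.0
    recall = true_pos / gold_pos if gold_pos else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def collapse(labels):
    """Ignore the relevance distinction: every obligation becomes one class."""
    return ["not obligation" if label == "not obligation" else "obligation"
            for label in labels]


# Hypothetical toy labels: one relevant obligation is predicted as not relevant.
gold = ["relevant obligation", "not relevant obligation",
        "not obligation", "relevant obligation"]
pred = ["not relevant obligation", "not relevant obligation",
        "not obligation", "relevant obligation"]

three_way = precision_recall_f1(gold, pred, "relevant obligation")
binary = precision_recall_f1(collapse(gold), collapse(pred), "obligation")
```
      </preformat>
      <p>In this toy case the single confusion between relevant and not relevant lowers the three-way recall but disappears entirely once the labels are collapsed.</p>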
    </sec>
    <sec id="sec-6">
      <title>6 Human vs Automatic Classification</title>
      <p>To better understand the model's capabilities, we ran a manual error analysis, comparing human annotations against automatic classifications on the test set. We identified some categories of typical errors and report some examples in Table 4. In some cases, the errors of the model are attributable to a non-explicit subject, which the human annotator can derive from the context, as can be seen in the first example, where it is not explicitly specified who should enter the data in the communication. The second example is a sentence whose main message is the expression of a right, in this case the right to access a certain file. However, access to the file is allowed only under certain temporal conditions (at the conclusion of the appeal procedure), so a relevant obligation is hidden behind that right. Unfortunately, in these cases the model is often wrong.</p>
      <p>[Table 4: examples of classification errors.]</p>
      <p>Example 1. Nella comunicazione di avvio di cui al comma 2 sono indicati l'oggetto del procedimento, gli elementi acquisiti d'ufficio [...] en: [In the communication of initiation referred to in paragraph 2 are indicated the subject of the procedure, the elements acquired ex officio [...]]</p>
      <p>Example 2. L'accesso al fascicolo è consentito a conclusione della procedura di interpello ai fini della tutela in sede giurisdizionale. en: [Access to the file is granted at the conclusion of the appeal procedure for judicial protection purposes.]</p>
      <p>Example 3. È considerata ingannevole la pubblicità, che, in quanto suscettibile di raggiungere bambini ed adolescenti, può, anche indirettamente, minacciare la loro sicurezza. en: [Advertising that is likely to reach children and adolescents and that may even indirectly threaten their safety is considered misleading.]</p>
      <p>Example 4 (label: not relevant). Le amministrazioni interessate provvedono agli adempimenti previsti dal presente decreto con le risorse umane, finanziarie e strumentali disponibili [...]. en: [The administrations involved shall carry out the obligations provided for in this decree with the human, financial and instrumental resources available. [...]]</p>
      <p>Example 5. Il presente decreto reca le disposizioni di attuazione dell'articolo 1 del decreto legge 6 dicembre 2011, n. 201, convertito, con modificazioni, dalla legge 22 dicembre 2011, n. 214 [...]. en: [This decree contains the provisions for the implementation of article 1 of Law Decree no. 201 of December 6, 2011, converted, with amendments, by Law no. 214 of December 22, 2011 [...]]</p>
      <p>Another difficult case to handle is the one shown in the third example in Table 4. This sentence apparently contains simple information: advertising is considered deceptive if it can threaten the safety of children. But behind this message lies an obligation on advertisers to avoid such a situation. Again, the obligation is not explicit, so it is quite understandable that the model could be wrong. Finally, the last two examples show human errors. Interestingly, where annotators make errors due to distraction or misunderstanding, the model often classifies correctly.</p>
    </sec>
    <sec id="sec-7">
      <title>7 Conclusions</title>
      <p>In this work we propose a methodology for building training corpora for obligations classification, based on annotations performed by non-experts. We apply this methodology to a set of heterogeneous regulations from a collection of different legal sources. The IAA and a manual error analysis highlight that human annotation is in general prone to errors and that non-expert annotators struggle to distinguish between relevant and not relevant obligations. The resulting dataset has been used to train and test an obligations classification system based on a state-of-the-art pretrained language model. We conduct both an automatic evaluation and a manual error analysis, which show that the system, similarly to human annotators, performs well in recognizing obligations but struggles to distinguish between relevant and not relevant ones. As future work, we plan to involve domain-expert annotators to evaluate whether their contribution can improve the quality of the data and of the model. We will also explore techniques to provide more context to the classifier in order to improve performance on clauses in which the subject is implied.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Roberto</given-names>
            <surname>Bartolini</surname>
          </string-name>
          , Alessandro Lenci, Simonetta Montemagni, Vito Pirrelli, and
          <string-name>
            <given-names>Claudia</given-names>
            <surname>Soria</surname>
          </string-name>
          .
          <year>2004</year>
          .
          <article-title>Automatic classification and analysis of provisions in italian legal texts: a case study</article-title>
          .
          <source>In OTM Confederated International Conferences "On the Move to Meaningful Internet Systems"</source>
          , pages
          <fpage>593</fpage>
          -
          <lpage>604</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Carlo</given-names>
            <surname>Biagioli</surname>
          </string-name>
          , Enrico Francesconi, Andrea Passerini, Simonetta Montemagni, and
          <string-name>
            <given-names>Claudia</given-names>
            <surname>Soria</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>Automatic semantics extraction in law documents</article-title>
          .
          <source>In Proceedings of the 10th international conference on Artificial intelligence and law</source>
          , pages
          <fpage>133</fpage>
          -
          <lpage>140</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Andrea</given-names>
            <surname>Cimino</surname>
          </string-name>
          , Lorenzo De Mattei, and Felice Dell'Orletta.
          <year>2018</year>
          .
          <article-title>Multi-task learning in deep neural networks at evalita 2018</article-title>
          .
          <source>Proceedings of the Evaluation Campaign of Natural Language Processing and Speech tools for Italian</source>
          , pages
          <fpage>86</fpage>
          -
          <lpage>95</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Lorenzo</given-names>
            <surname>De Mattei</surname>
          </string-name>
          , Andrea Cimino, and Felice Dell'Orletta.
          <year>2018</year>
          .
          <article-title>Multi-task learning in deep neural network for sentiment polarity and irony classification</article-title>
          .
          <source>In NL4AI@ AI* IA</source>
          , pages
          <fpage>76</fpage>
          -
          <lpage>82</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Enrico</given-names>
            <surname>Francesconi</surname>
          </string-name>
          and
          <string-name>
            <given-names>Andrea</given-names>
            <surname>Passerini</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Automatic classification of provisions in legislative texts</article-title>
          .
          <source>Artificial Intelligence and Law</source>
          ,
          <volume>15</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>17</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Simone</given-names>
            <surname>Giorgioni</surname>
          </string-name>
          , Marcello Politi, Samir Salman, Roberto Basili, and
          <string-name>
            <given-names>Danilo</given-names>
            <surname>Croce</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Unitor@ sardistance2020: Combining transformer-based architectures and transfer learning for robust stance detection</article-title>
          .
          <source>In EVALITA.</source>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Yinhan</given-names>
            <surname>Liu</surname>
          </string-name>
          , Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen,
          <string-name>
            <given-names>Omer</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Mike</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Luke</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Veselin</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Roberta: A robustly optimized bert pretraining approach</article-title>
          . arXiv preprint arXiv:1907.11692.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Vinod</given-names>
            <surname>Nair</surname>
          </string-name>
          and
          <string-name>
            <given-names>Geoffrey E</given-names>
            <surname>Hinton</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Rectified linear units improve restricted boltzmann machines</article-title>
          .
          <source>In ICML.</source>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Daniela</given-names>
            <surname>Occhipinti</surname>
          </string-name>
          , Andrea Tesei, Maria Iacono, Carlo Aliprandi, Lorenzo De Mattei, and Aptus AI
          .
          <year>2020</year>
          .
          <article-title>Italianlp@ tag-it: Umberto for author profiling at tag-it 2020</article-title>
          .
          <source>In Proceedings of Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2020)</source>
          , Online. CEUR.org.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Monica</given-names>
            <surname>Palmirani</surname>
          </string-name>
          and
          <string-name>
            <given-names>Fabio</given-names>
            <surname>Vitali</surname>
          </string-name>
          ,
          <year>2011</year>
          .
          <article-title>AkomaNtoso for Legal Documents</article-title>
          , pages
          <fpage>75</fpage>
          -
          <lpage>100</lpage>
          . Springer Netherlands, Dordrecht.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Gerard</given-names>
            <surname>Salton</surname>
          </string-name>
          and
          <string-name>
            <given-names>Christopher</given-names>
            <surname>Buckley</surname>
          </string-name>
          .
          <year>1988</year>
          .
          <article-title>Termweighting approaches in automatic text retrieval</article-title>
          .
          <source>Information Processing &amp; Management</source>
          ,
          <volume>24</volume>
          (
          <issue>5</issue>
          ):
          <fpage>513</fpage>
          -
          <lpage>523</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Gabriele</given-names>
            <surname>Sarti</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Umberto-mtsa@ accompl-it: Improving complexity and acceptability prediction with multi-task learning on self-supervised annotations</article-title>
          . arXiv preprint arXiv:2011.05197.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Amin</given-names>
            <surname>Sleimi</surname>
          </string-name>
          , Nicolas Sannier, Mehrdad Sabetzadeh, Lionel Briand, and
          <string-name>
            <given-names>John</given-names>
            <surname>Dann</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Automated extraction of semantic legal metadata using natural language processing</article-title>
          .
          <source>In 2018 IEEE 26th International Requirements Engineering Conference (RE)</source>
          , pages
          <fpage>124</fpage>
          -
          <lpage>135</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Amin</given-names>
            <surname>Sleimi</surname>
          </string-name>
          , Marcello Ceci, Nicolas Sannier, Mehrdad Sabetzadeh, Lionel Briand, and
          <string-name>
            <given-names>John</given-names>
            <surname>Dann</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>A query system for extracting requirements-related information from legal texts</article-title>
          .
          <source>In 2019 IEEE 27th International Requirements Engineering Conference (RE)</source>
          , pages
          <fpage>319</fpage>
          -
          <lpage>329</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Amin</given-names>
            <surname>Sleimi</surname>
          </string-name>
          , Marcello Ceci, Mehrdad Sabetzadeh, Lionel C Briand, and
          <string-name>
            <given-names>John</given-names>
            <surname>Dann</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Automated recommendation of templates for legal requirements</article-title>
          .
          <source>In 2020 IEEE 28th International Requirements Engineering Conference (RE)</source>
          , pages
          <fpage>158</fpage>
          -
          <lpage>168</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Nitish</given-names>
            <surname>Srivastava</surname>
          </string-name>
          , Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and
          <string-name>
            <given-names>Ruslan</given-names>
            <surname>Salakhutdinov</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Dropout: a simple way to prevent neural networks from overfitting</article-title>
          .
          <source>The journal of machine learning research</source>
          ,
          <volume>15</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1929</fpage>
          -
          <lpage>1958</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>