<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A First Step Towards Automatic Consolidation of Legal Acts: Reliable Classification of Textual Modifications</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Samuel Fabrizi</string-name>
          <email>samuel@aptus.ai</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maria Iacono</string-name>
          <email>maria@aptus.ai</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Tesei</string-name>
          <email>andrea@aptus.ai</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lorenzo De Mattei</string-name>
          <email>lorenzo@aptus.ai</email>
        </contrib>
        <aff>Aptus.AI, Pisa, Italy</aff>
      </contrib-group>
      <abstract>
        <p>The automatic consolidation of legal texts, i.e. the integration of their successive amendments and corrigenda, might have an important practical impact on public institutions, citizens and organizations. This process involves two steps: a) the classification of the textual modifications in amendment acts and b) the integration of such modifications within a single document. In this work we propose a methodology to solve step a) by exploiting Machine Learning and Natural Language Processing techniques on the Italian versions of European Regulations: our results suggest that the proposed methodology is a reliable first milestone towards the automatic consolidation of legal texts.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>Consolidation consists of the integration in a
legal act of its successive amendments and
corrigenda. Consolidated texts are very important for
legal practitioners. However, their maintenance is
a tedious task. Some regulatory publishers, such as
Normattiva, provide continuously updated
consolidated texts; others, such as Eur-Lex, do so from
time to time; some others do not at all. The automation of this
process could allow institutions to save resources and
practitioners to access continuously updated
consolidated documents. It would also let
organizations stay compliant with regulations
more easily. The consolidation process involves
two main steps: a) the identification and
classification of the textual modifications in amendment
acts; b) the integration within a single document of
the textual modifications identified in the previous
step. The first step can be expressed as the
automatic classification of textual modifications inside
a legal document. In this work, we focus on step
a).</p>
      <p>
        Several authors have tried to solve this task using
standard Natural Language Processing (NLP)
techniques. Ogawa et al. (2008) showed that
the amendment clauses described in Japanese statutes
can be formalized in terms of sixteen regular
expressions. Lesmo et al. (2009) tried to identify
and classify integrations, substitutions and
deletions using a three-step approach: 1) prune the text
fragments that do not convey relevant
information; 2) perform the syntactic analysis of the
retrieved sentences; 3) semantically annotate the
provision using a rule-based approach operating on the
parse trees. In this last step, they also used a
knowledge base that describes the provisions taxonomy
        <xref ref-type="bibr" rid="ref1">(Arnold-Moore, 1997)</xref>
        . (A legislative provision represents the meaning of a
part of a law from a legal point of view; obligations, definitions and
modifications are specific types of provision.) Brighi et al. (2008) and
Spinosa et al. (2009) followed a similar approach.
In both cases, semantic analysis is carried out on
the syntactically pre-processed text using a
rule-based approach. The difference lies in the
starting point of the semantic analysis: the
former’s system relied on a deep semantic analysis of
the textual modifications, while the latter started from
the shallow syntactically parsed text. Garofalakis
et al. (2016) presented a semi-automatic system
for the consolidation of Greek legislative texts
based on regular expressions. Francesconi and
Passerini (2007) defined a module that
automatically classifies paragraphs into provision types.
Each paragraph is represented using bag-of-words,
either with TF-IDF weighting
        <xref ref-type="bibr" rid="ref21">(Salton and Buckley, 1988)</xref>
        or binary weight. The authors showed
an experimental comparison of the different
representation methods using the Naive Bayes and
Multiclass Support Vector Machine (MSVM) models.
This paper describes our approach to the
classification of textual modifications, namely
substitution, addition, repeal and abolition. The proposed
approach is based on standard statistical NLP
techniques
        <xref ref-type="bibr" rid="ref17">(Manning and Schutze, 1999)</xref>
        . Our method
involves i) the use of XML-based standards for the
annotation of legislative documents, ii) the
construction of the dataset assigning a label to each
word according to the tagging format used, and
iii) the implementation of NLP models to
identify and classify textual modifications. We carried
out a systematic comparison among several
feature extraction techniques and models. The main
contribution of this paper is the application of
machine learning models to classify textual
modifications. In contrast to rule-based or regular
expression techniques, our models do not need expert
knowledge about the application domain’s
properties. They try to extract formulas used to introduce
a textual modification without the need for an
explicit definition of all the formulas. Our approach
leads to lower maintenance costs and hopefully
increased robustness of the system.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2 Data</title>
      <p>
        We extracted the data from Daitomic (https://www.daitomic.com/), a product
that contains all the regulations from a set of legal
sources, encoded automatically in the Akoma Ntoso
standard format
        <xref ref-type="bibr" rid="ref20">(Palmirani and Vitali, 2011)</xref>
        . We
collected from this product all the Italian versions
of the amendment documents originally extracted
from Eur-Lex and we randomly sampled 260 legal
documents for manual labelling.
      </p>
      <p>According to the Eur-Lex web service
specifications (“How to use the webservice?”, https://bit.ly/393qt9Z),
we identified seven different types of textual modifications:
• replacement annotates a substitution, which
may concern a part of a sentence (expression,
word, date, amount) or a whole subdivision
of the document (article, paragraph, indent).
Usually, this type of textual modification
also includes the following subcategories:
– from annotates the replaced words
(“novellando”);
– to annotates the words that replace the
previous ones (“novella”).
• replacement ref is a type of replacement. We
use it to handle textual modifications that
include attachments.
• addition annotates textual modifications that
add or complete a part of a legal document.
• repeal indicates the removal or reversal of a
law. It is used to invalidate its provisions
altogether.
• abolition indicates the removal of a law part.</p>
      <p>It is used to replace the law with an updated,
amended or related law. This textual
modification may involve just single words or
whole subdivisions, as in replacements.</p>
      <p>Table 2 reports an example for each of the
mentioned categories. Table 1 shows the total number
of textual modifications per category. The number
of replacement examples is greater than that of the
other types of modifications because substitutions
can be introduced by different formulas that
determine their specific meaning. Indeed, a
preliminary experiment showed that the number of
examples needed to train the models grows in
proportion to the number of formulas used to
introduce a textual modification. For this reason, we
needed a different number of examples for each
category to train our models.</p>
      <p>Given the differences in the nature of each
modification type, we preferred to split the
original problem into five subtasks, namely:
1. replacement classification, which also contains
the replacement ref category;</p>
      <sec id="sec-2-1">
        <title>2. addition classification;</title>
      </sec>
      <sec id="sec-2-2">
        <title>3. repeal classification;</title>
      </sec>
      <sec id="sec-2-3">
        <title>4. abolition classification;</title>
      </sec>
      <sec id="sec-2-4">
        <title>5. from to classification.</title>
        <p>The manual annotation consisted of assigning, for
each subtask, one label to each token of the selected
document, indicating whether or not it represents
a textual modification. We defined three different
tagging formats: Inside-Outside-Beginning (IOB),
Inside-Outside (IO), and Limit-Limit (LL). The first
two tagging formats are standard (Breckbaldwin,
“Coding Chunkers as Taggers: IO, BIO, BMEWO, and
BMEWO+”, https://bit.ly/3DzuqBc). The last one,
instead, uses the prefix “L-” to indicate that the
token is either the beginning or the end of a textual
modification. We adopted a specific tagging format for
each model based on our preliminary results. The
tagging format was one of the most critical choices
to improve model performance.</p>
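        <p>As a minimal illustration of the LL scheme above (our sketch, not the original code; the tag assigned to tokens strictly inside a span is an assumption, since only the “L-” prefix is defined in the text), a hypothetical ll_tags helper could look like this:</p>

```python
def ll_tags(tokens, span, label):
    """Limit-Limit (LL) tagging sketch: the first and last token of an
    annotated span get the "L-" prefix; tokens strictly inside the span
    get "I-" (our assumption), and tokens outside the span get "O"."""
    start, end = span  # inclusive token indices of the modification
    tags = []
    for i, _ in enumerate(tokens):
        if i == start or i == end:
            tags.append("L-" + label)
        elif i in range(start + 1, end):
            tags.append("I-" + label)
        else:
            tags.append("O")
    return tags

# A repeal sentence tagged with LL: only the span limits carry "L-".
print(ll_tags(["Il", "regolamento", "è", "abrogato", "."], (0, 3), "repeal"))
```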
        <p>
          The dataset used for the last subtask is different.
Indeed, the from and to tags are always enclosed
within the replacement tags. We could not use any
of our tagging formats because their syntax does
not permit any nesting
          <xref ref-type="bibr" rid="ref7">(Dai, 2018)</xref>
          . Therefore,
we decided to change the dataset itself to train the
models. We considered only the tokens inside the
sentences representing a replacement and tagged
them using the aforementioned tagging formats.
In this way, we avoided the nesting issue.
        </p>
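        <p>A minimal sketch of this workaround (our illustration, with a hypothetical helper name): keep only the tokens whose replacement-level tag is not “O”, together with their from/to tags, so the inner annotation becomes flat.</p>

```python
def restrict_to_replacements(tokens, repl_tags, from_to_tags):
    """Drop every token outside a replacement span, so that the nested
    from/to annotation can reuse a flat (non-nesting) tagging scheme."""
    kept = [(tok, ft) for tok, rt, ft in zip(tokens, repl_tags, from_to_tags)
            if rt != "O"]
    inner_tokens = [tok for tok, _ in kept]
    inner_tags = [ft for _, ft in kept]
    return inner_tokens, inner_tags
```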
        <sec id="sec-2-4-1">
          <title>2.1 Preprocessing</title>
          <p>Each model needs a different preprocessing
method to process the raw text of the legal documents,
depending on the feature extractor used. Only a few
preprocessing operations are common to all models:
1. substitution of the special characters ≪ and
≫ with quote marks;
2. substitution of the words between quote marks
with the special token QUOTES TEXT. This
step allowed us to limit the number of
tokens in each paragraph. The words between
quote marks often represent a whole article
(for example, the one to substitute or to add). We
decided to substitute these words with a special
token because they are redundant for our task.
This choice improved the performance of all
models. In the from and to subtask, we avoided
substituting the text between quotes because
leaving it intact led to a performance improvement.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3 Experiments</title>
      <p>
        For each task, we gathered the documents that
contain one or more occurrences of that specific
modification. Then, we split the dataset into a
training and a test set. More precisely, we used the
80/20 ratio adopting a stratified technique
        <xref ref-type="bibr" rid="ref24">(Trost,
1986)</xref>
        . We used the training set to validate the
hyperparameters of each model. Once the final
models were computed, we used the test set to
measure their generalization ability. It is important to
emphasise that we never used the internal test set
before the definition of the final models.
      </p>
      <p>The general pipeline is composed of the following
steps:
1. The annotated documents are tokenized.
2. Each token is associated with one label for
each category, following the tagging formats
previously defined.
3. From each token, we extract its
representation using hand-crafted features,
character-level n-grams, or word embeddings.
Depending on the model used, both the tagging
format and the feature extraction change.
4. We execute the model selection phase
exploiting K-fold cross-validation. In our
experiments, we set K to 3 so that the
validation sets have a reasonable size. The
purpose of this step is to find the best
hyperparameters of each model.
5. For each subtask, we choose the model with
the best performance in the previous step.
6. After choosing the best configuration of each
model, we compute and compare their
performances over the test set.</p>
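      <p>The stratified 80/20 split of step one can be sketched in a few lines (an illustration of the stratified idea, not the exact procedure used):</p>

```python
import random
from collections import defaultdict

def stratified_split(docs, labels, test_ratio=0.2, seed=0):
    """Stratified train/test split sketch: sample the test portion
    within each label group, so class proportions are preserved."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for doc, lab in zip(docs, labels):
        by_label[lab].append(doc)
    train, test = [], []
    for lab, group in by_label.items():
        rng.shuffle(group)
        n_test = round(len(group) * test_ratio)
        test.extend((doc, lab) for doc in group[:n_test])
        train.extend((doc, lab) for doc in group[n_test:])
    return train, test
```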
      <sec id="sec-3-1">
        <title>3.1 Feature Extraction</title>
        <p>We applied several feature extraction techniques to
figure out which one was the most effective. In this
section, we describe these techniques in depth.
Considering the nature of the
task, all the features are extracted at the word level.</p>
        <p>We define different sets of features according to
the models’ needs. We logically divided our
features into hand-crafted features, n-gram features
and word embeddings.
</p>
        <p>The following are the Table 2 examples, one per category
(annotation tags shown escaped):
• replacement (with from and to): All’articolo 7 della decisione 2005/692/CE, la data del
&lt;replacement&gt; ≪ &lt;from&gt; 31 dicembre 2010 &lt;/from&gt; ≫
è sostituita da ≪ &lt;to&gt; 30 giugno 2012 &lt;/to&gt; ≫ &lt;/replacement&gt;.
• replacement ref: L’allegato II al regolamento (CE) n. 998/2003 è sostituito dal testo dell’
&lt;replacement ref&gt; allegato &lt;/replacement ref&gt; al presente regolamento.
• addition: È aggiunto il seguente allegato:
&lt;addition&gt; “ALLEGATO III [...]” &lt;/addition&gt;
• repeal: Il regolamento (CEE) n. 160/88 è abrogato. &lt;repeal&gt;&lt;/repeal&gt;
• abolition: nel titolo i termini &lt;abolition&gt; “raccolti nel 1980” &lt;/abolition&gt; sono soppressi</p>
        <p>In the following we list the hand-crafted features
extracted and their meaning:
• is upper: boolean value indicating whether the token is in uppercase;
• is lower: boolean value indicating whether the token is in lowercase;
• is title: boolean value indicating whether the token is in titlecase;
• is alpha: boolean value indicating whether the token consists of alphabetic characters;
• is digit: boolean value indicating whether the token consists of digits;
• is punct: boolean value indicating whether the token is a punctuation mark;
• pos val cg: coarse-grained part-of-speech from the Universal POS tag set
        <xref ref-type="bibr" rid="ref13 ref14">(Kumawat and Jain, 2015)</xref>
        ; the text has been POS tagged with the SpaCy Italian model (https://spacy.io/models/it);
• is alnum: boolean value indicating whether all characters in the token are alphanumeric (either letters or digits);
• word lower: the token in lowercase;
• word[-3:]: the last three characters of the token;
• word[-2:]: the last two characters of the token.</p>
        <p>Then, we decided to use a more complex
representation: a Count Vectorizer
        <xref ref-type="bibr" rid="ref22">(Sarlis and Maglogiannis, 2020)</xref>
        computed over all the Italian legal documents
contained in EUR-Lex at the date we created it.
It converts a collection of text documents into a
matrix of n-gram counts. From each set of words,
it produces a sparse vector representation that
captures a large number (376,037) of character
n-gram features.</p>
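        <p>The hand-crafted, word-level features listed above can be sketched as a feature dictionary (our illustration; the punctuation check is an approximation, and the coarse-grained POS tag is assumed to come from an external tagger such as the SpaCy Italian model):</p>

```python
def handcrafted_features(token, pos_tag):
    """Build the hand-crafted feature dictionary for one token.
    pos_tag is the coarse-grained POS label from an external tagger."""
    return {
        "is_upper": token.isupper(),
        "is_lower": token.islower(),
        "is_title": token.istitle(),
        "is_alpha": token.isalpha(),
        "is_digit": token.isdigit(),
        # approximation: punctuation = no alphanumeric character at all
        "is_punct": all(not ch.isalnum() for ch in token),
        "pos_val_cg": pos_tag,
        "is_alnum": token.isalnum(),
        "word_lower": token.lower(),
        "word[-3:]": token[-3:],
        "word[-2:]": token[-2:],
    }
```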
        <p>
          Finally, we decided to use a word embedding
lexicon, as word embeddings have been shown to
provide good performance in other Italian tasks
          <xref ref-type="bibr" rid="ref6 ref6 ref8 ref8">(De Mattei
et al., 2018; Cimino et al., 2018)</xref>
          . We tested a
few different in-domain and general-purpose
embedding lexicons, trained using both fastText
        <xref ref-type="bibr" rid="ref4">(Bojanowski et al., 2017)</xref>
        and word2vec
        <xref ref-type="bibr" rid="ref18">(Mikolov et
al., 2013)</xref>
        ; we obtained the best results with the
pretrained Italian fastText model
          <xref ref-type="bibr" rid="ref12">(Grave et al., 2018)</xref>
          .
        </p>
        <p>
          The features extracted from each token do not
contain enough information to discriminate the true
amendment class. For this reason, we decided
to introduce the sliding window concept
          <xref ref-type="bibr" rid="ref9">(Dietterich, 2002)</xref>
          . It represents a set of tokens that
precede and/or follow each token, like a “window”
with a fixed size that moves forward through the
text. For each feature extraction technique, we
introduced two parameters, window size and
is bilateral window. The former indicates
the dimension of the window. The latter is a
boolean value indicating whether the window
considers only the preceding tokens (False) or both
preceding and following tokens (True). For
example, the sentence “E` aggiunto il seguente allegato”
with a bilateral sliding window of size 1, becomes
〈(PAD, E` , aggiunto), ( E`, aggiunto, il), (aggiunto,
il, seguente), (il, seguente, allegato), (seguente,
allegato, PAD)〉 where PAD indicates the padding
value. The introduction of the sliding window has
made it possible to improve the evaluation metric
of all models.
        </p>
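        <p>The window construction described above can be sketched as follows (our illustration; it reproduces the bilateral size-1 example given in the text):</p>

```python
def sliding_windows(tokens, window_size=1, is_bilateral=True, pad="PAD"):
    """For each token, collect window_size preceding tokens and, when
    the window is bilateral, window_size following tokens, padding at
    the sentence edges with a special PAD value."""
    left_pad = [pad] * window_size
    right_pad = [pad] * (window_size if is_bilateral else 0)
    padded = left_pad + list(tokens) + right_pad
    windows = []
    for i in range(len(tokens)):
        center = i + window_size
        left = padded[i:center]
        right = padded[center + 1:center + 1 + window_size] if is_bilateral else []
        windows.append(tuple(left + [padded[center]] + right))
    return windows

# reproduces the five windows of the example in the text
print(sliding_windows(["E`", "aggiunto", "il", "seguente", "allegato"]))
```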
      </sec>
      <sec id="sec-3-2">
        <title>3.2 Models</title>
        <p>
          We want to find a fully automatic approach based
on the extraction of interesting features. For this
reason, we developed a systematic comparison
among three models: Support Vector Machine
(SVM) with n-gram features, Conditional
Random Field (CRF) with hand-crafted features and
a Neural Network (NN) that uses word
embeddings. This latter model is a rather general
convolutional network architecture. The inputs of our
NLP tasks are the words that compose the
sliding window represented as a matrix. Each row
of the matrix corresponds to the word embedding
representation of one token. We decided to use a
convolutional layer given its efficiency in terms of
both representation and speed; it permits us to
capture local and position-invariant features
          <xref ref-type="bibr" rid="ref25">(Yin et
al., 2017)</xref>
          useful for our purpose. Then, we added
a Layer Normalization layer. It significantly
reduces the training time in feedforward neural
networks
          <xref ref-type="bibr" rid="ref2">(Ba et al., 2016)</xref>
          . During the experiment
phase, we observed that layer normalization
offers a speedup over the baseline model without
normalization and it stabilizes the training of the
model. We have also tried to use a Bidirectional
Long Short-Term Memory based model with an
additional CRF layer (Bi-LSTM-CRF) to solve
our task
          <xref ref-type="bibr" rid="ref13">(Huang et al., 2015)</xref>
          . Its application led
to poor performance in terms of both scores and speed.
The results obtained show the need to solve our
task using simple models that are able to discover
local patterns.
        </p>
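        <p>The convolutional input described above can be pictured with a toy example (plain Python, not the actual network): each window is a matrix whose rows are word embeddings, and a filter spanning k consecutive rows produces one local feature per position, independently of where the pattern occurs.</p>

```python
def conv1d_over_window(window_matrix, kernel, k=2):
    """Toy 1-D convolution over a window matrix: slide a kernel across
    k consecutive embedding rows and emit one dot product per position."""
    outputs = []
    for start in range(len(window_matrix) - k + 1):
        rows = window_matrix[start:start + k]
        flat = [x for row in rows for x in row]  # concatenate the k rows
        outputs.append(sum(a * b for a, b in zip(flat, kernel)))
    return outputs

# 3 tokens with embedding dimension 2; the kernel spans 2 rows (4 weights)
window = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
kernel = [1.0, 1.0, 1.0, 1.0]
print(conv1d_over_window(window, kernel))  # one value per window position
```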
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4 Results</title>
      <p>The objective of the evaluation was to define a
systematic comparison among the models’
performance with respect to F1 macro, precision and
recall. In the model selection step, we used the F1
macro score as the evaluation metric since the
frequency distribution of the labels turned out to be
strongly unbalanced in all the subtasks.</p>
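      <p>For reference, the metric can be computed as follows (a self-contained sketch; scikit-learn’s f1_score with average set to “macro” behaves equivalently): F1 is computed per label and then averaged with equal weight, so rare labels count as much as frequent ones.</p>

```python
def f1_macro(y_true, y_pred, labels):
    """Macro-averaged F1: per-label precision/recall/F1, then the
    unweighted mean of the per-label F1 scores."""
    f1s = []
    for lab in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == lab and p == lab)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != lab and p == lab)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == lab and p != lab)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)
```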
      <p>
        After some preliminary experiments, we fixed the
sliding window size and the tagging format for
each model. We found that, from a performance
perspective, both the CRF and NN models benefit from
a bigger sliding window size (5) than the SVM
models (1). We think this difference comes from the
Curse of Dimensionality problem that can be
encountered in SVM
models
        <xref ref-type="bibr" rid="ref3">(Bengio et al., 2005)</xref>
        . Concerning the
tagging format, we adopted the LL tagging for all the
models. Our experiments show that it increases
the F1 score by about 20 percentage points.
Table 3 reports the mean results among the 3 folds
obtained by the best configuration of each model.
The CRF outperforms the other models in almost all
the subtasks. We think that this is due to the
nature of this model. Indeed, CRFs naturally
consider state-to-state dependencies and
feature-to-state dependencies
        <xref ref-type="bibr" rid="ref15">(Lafferty et al., 2001)</xref>
        .
      </p>
      <sec id="sec-4-1">
        <title>Model selection results</title>
        <p>Table 3 (SVM F1 scores per subtask): Replacement 0.868;
Addition 0.825; Repeal 0.915; Abolition 0.823; From To 0.748.</p>
        <p>Once the model selection phase was completed, we chose the
best model and its configuration for each subtask.
We considered both the mean and the standard
deviation of the F1 metric among the folds. Then, we
re-trained the best model on the whole training set.
Table 4 reports the results and the average scores of
the precision, recall and F1 metrics over the
internal test set. The precision score is higher than the
recall in all subtasks except one, which may be good
from an application perspective.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Best model per subtask</title>
        <p>Table 4 (best model per subtask): Replacement: CRF;
Addition: CRF; Repeal: CRF; Abolition: NN; From To: CRF.</p>
      </sec>
      <sec id="sec-4-3">
        <title>Final models</title>
        <p>The models’ performances are improved
compared to the results achieved in the model
selection phase, probably thanks to the larger training
set provided.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5 Conclusion</title>
      <p>We presented and analysed a machine-learning
approach to the problem of the classification of
textual modifications. We compared different
tagging formats, feature extraction techniques and
machine learning models. Our experiments show that
the sliding window approach, combined with char
count vectorizer or word embeddings, allows the
models to capture most of the formulas that
introduce textual modifications. Following Occam’s
razor principle, we defined simple models that
obtained good performances in all the subtasks. Our
approach does not need any expertise in the legal
field, since it tries to learn, by itself, the rules
that identify textual modifications instead of
requiring their explicit formalization. We use
different NLP techniques to extract hidden features
from the words inside a window.</p>
      <p>Results validate our approach in terms of both
correctness and stability. They represent the first step
towards building a fully automatic model capable of
identifying and integrating textual modifications.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Timothy</given-names>
            <surname>Arnold-Moore</surname>
          </string-name>
          .
          <year>1997</year>
          .
          <article-title>Automatic generation of amendment legislation</article-title>
          .
          <source>In ICAIL '97</source>
          , pages
          <fpage>56</fpage>
          -
          <lpage>62</lpage>
          ,
          <fpage>01</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Jimmy Lei</given-names>
            <surname>Ba</surname>
          </string-name>
          , Jamie Ryan Kiros, and
          <string-name>
            <given-names>Geoffrey E.</given-names>
            <surname>Hinton</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Layer normalization</article-title>
          .
          <source>arXiv preprint arXiv:1607.06450</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Yoshua</given-names>
            <surname>Bengio</surname>
          </string-name>
          , Olivier Delalleau, and Nicolas Le Roux.
          <year>2005</year>
          .
          <article-title>The curse of dimensionality for local kernel machines</article-title>
          .
          <source>Techn. Rep</source>
          ,
          <volume>1258</volume>
          :
          <fpage>12</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Piotr</given-names>
            <surname>Bojanowski</surname>
          </string-name>
          , Edouard Grave, Armand Joulin, and
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Enriching word vectors with subword information</article-title>
          .
          <source>Transactions of the Association for Computational Linguistics</source>
          ,
          <volume>5</volume>
          :
          <fpage>135</fpage>
          -
          <lpage>146</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Raffaella</given-names>
            <surname>Brighi</surname>
          </string-name>
          , Leonardo Lesmo, Alessandro Mazzei, Monica Palmirani, and
          <string-name>
            <given-names>Daniele</given-names>
            <surname>Radicioni</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Towards semantic interpretation of legal modifications through deep syntactic analysis</article-title>
          . volume
          <volume>189</volume>
          , pages
          <fpage>202</fpage>
          -
          <lpage>206</lpage>
          ,
          <fpage>01</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Andrea</given-names>
            <surname>Cimino</surname>
          </string-name>
          , Lorenzo De Mattei, and Felice Dell'Orletta.
          <year>2018</year>
          .
          <article-title>Multi-task learning in deep neural networks at evalita 2018</article-title>
          .
          <source>Proceedings of the Evaluation Campaign of Natural Language Processing and Speech tools for Italian</source>
          , pages
          <fpage>86</fpage>
          -
          <lpage>95</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Xiang</given-names>
            <surname>Dai</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Recognizing complex entity mentions: A review and future directions</article-title>
          .
          <source>In Proceedings of ACL</source>
          <year>2018</year>
          , Student Research Workshop, pages
          <fpage>37</fpage>
          -
          <lpage>44</lpage>
          , Melbourne, Australia, July. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Lorenzo</given-names>
            <surname>De Mattei</surname>
          </string-name>
          , Andrea Cimino, and Felice Dell'Orletta.
          <year>2018</year>
          .
          <article-title>Multi-task learning in deep neural network for sentiment polarity and irony classification</article-title>
          .
          <source>In NL4AI@ AI* IA</source>
          , pages
          <fpage>76</fpage>
          -
          <lpage>82</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Thomas G.</given-names>
            <surname>Dietterich</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>Machine learning for sequential data: A review</article-title>
          .
          <source>In Terry Caelli</source>
          , Adnan Amin,
          <string-name>
            <given-names>Robert P. W.</given-names>
            <surname>Duin</surname>
          </string-name>
          , Dick de Ridder, and Mohamed Kamel, editors,
          <source>Structural, Syntactic, and Statistical Pattern Recognition</source>
          , pages
          <fpage>15</fpage>
          -
          <lpage>30</lpage>
          , Berlin, Heidelberg. Springer Berlin Heidelberg.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Enrico</given-names>
            <surname>Francesconi</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Passerini</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Automatic classification of provisions in legislative texts</article-title>
          .
          <source>Artificial Intelligence and Law</source>
          ,
          <volume>15</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>17</lpage>
          ,
          <fpage>01</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>John</given-names>
            <surname>Garofalakis</surname>
          </string-name>
          , Konstantinos Plessas, and
          <string-name>
            <given-names>Athanasios</given-names>
            <surname>Plessas</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>A semi-automatic system for the consolidation of greek legislative texts</article-title>
          .
          <source>In Proceedings of the 20th Pan-Hellenic Conference on Informatics, PCI '16</source>
          , New York, NY, USA. Association for Computing Machinery.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Edouard</given-names>
            <surname>Grave</surname>
          </string-name>
          , Piotr Bojanowski, Prakhar Gupta, Armand Joulin, and
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Learning word vectors for 157 languages</article-title>
          .
          <source>In Proceedings of the International Conference on Language Resources and Evaluation (LREC</source>
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Zhiheng</given-names>
            <surname>Huang</surname>
          </string-name>
          , Wei Xu, and
          <string-name>
            <given-names>Kai</given-names>
            <surname>Yu</surname>
          </string-name>
          .
          <year>2015</year>
          .
<article-title>Bidirectional LSTM-CRF models for sequence tagging</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Deepika</given-names>
            <surname>Kumawat</surname>
          </string-name>
          and
          <string-name>
            <given-names>Vinesh</given-names>
            <surname>Jain</surname>
          </string-name>
          .
          <year>2015</year>
          .
<article-title>POS tagging approaches: a comparison</article-title>
          .
          <source>International Journal of Computer Applications</source>
          ,
          <volume>118</volume>
          (
          <issue>6</issue>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>J.</given-names>
            <surname>Lafferty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>McCallum</surname>
          </string-name>
, and
<string-name>
  <given-names>Fernando</given-names>
  <surname>Pereira</surname>
</string-name>
          .
          <year>2001</year>
          .
          <article-title>Conditional random fields: Probabilistic models for segmenting and labeling sequence data</article-title>
          .
<source>In ICML</source>
.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Leonardo</given-names>
            <surname>Lesmo</surname>
          </string-name>
          , Alessandro Mazzei, and
          <string-name>
            <given-names>Daniele</given-names>
            <surname>Radicioni</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Extracting semantic annotations from legal texts</article-title>
          .
          <source>In HT '09</source>
          , pages
          <fpage>167</fpage>
          -
          <lpage>172</lpage>
.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>Christopher</given-names>
            <surname>Manning</surname>
          </string-name>
          and
          <string-name>
            <given-names>Hinrich</given-names>
<surname>Schütze</surname>
          </string-name>
          .
          <year>1999</year>
          .
          <article-title>Foundations of statistical natural language processing</article-title>
          . MIT press.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          , Ilya Sutskever, Kai Chen, Greg S Corrado, and
          <string-name>
            <given-names>Jeff</given-names>
            <surname>Dean</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Distributed representations of words and phrases and their compositionality</article-title>
. In
<string-name>
  <given-names>C. J. C.</given-names>
  <surname>Burges</surname>
</string-name>
,
<string-name>
  <given-names>L.</given-names>
  <surname>Bottou</surname>
</string-name>
,
<string-name>
  <given-names>M.</given-names>
  <surname>Welling</surname>
</string-name>
,
<string-name>
  <given-names>Z.</given-names>
  <surname>Ghahramani</surname>
</string-name>
, and
<string-name>
  <given-names>K. Q.</given-names>
  <surname>Weinberger</surname>
</string-name>
, editors,
          <source>Advances in Neural Information Processing Systems</source>
          , volume
          <volume>26</volume>
          . Curran Associates, Inc.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <given-names>Yasuhiro</given-names>
            <surname>Ogawa</surname>
          </string-name>
          , Shintaro Inagaki, and
          <string-name>
            <given-names>Katsuhiko</given-names>
            <surname>Toyama</surname>
          </string-name>
          .
          <year>2008</year>
          .
<article-title>Automatic consolidation of Japanese statutes based on formalization of amendment sentences</article-title>
          . In Ken Satoh, Akihiro Inokuchi, Katashi Nagao, and Takahiro Kawamura, editors,
          <source>New Frontiers in Artificial Intelligence</source>
          , pages
          <fpage>363</fpage>
          -
          <lpage>376</lpage>
          , Berlin, Heidelberg. Springer Berlin Heidelberg.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
<string-name>
  <given-names>Monica</given-names>
  <surname>Palmirani</surname>
</string-name>
and
<string-name>
  <given-names>Fabio</given-names>
  <surname>Vitali</surname>
</string-name>
.
          <year>2011</year>
          .
<article-title>Akoma Ntoso for Legal Documents</article-title>
          , pages
          <fpage>75</fpage>
          -
          <lpage>100</lpage>
          . Springer Netherlands, Dordrecht.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <given-names>Gerard</given-names>
            <surname>Salton</surname>
          </string-name>
          and
          <string-name>
            <given-names>Christopher</given-names>
            <surname>Buckley</surname>
          </string-name>
          .
          <year>1988</year>
          .
<article-title>Term-weighting approaches in automatic text retrieval</article-title>
          .
          <source>Information Processing &amp; Management</source>
          ,
          <volume>24</volume>
          (
          <issue>5</issue>
          ):
          <fpage>513</fpage>
          -
          <lpage>523</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <given-names>S.</given-names>
            <surname>Sarlis</surname>
          </string-name>
          and
          <string-name>
            <given-names>I.</given-names>
            <surname>Maglogiannis</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>On the reusability of sentiment analysis datasets in applications with dissimilar contexts</article-title>
          . In Ilias Maglogiannis, Lazaros Iliadis, and Elias Pimenidis, editors,
          <source>Artificial Intelligence Applications and Innovations</source>
          , pages
          <fpage>409</fpage>
          -
          <lpage>418</lpage>
          , Cham. Springer International Publishing.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <given-names>Pierluigi</given-names>
            <surname>Spinosa</surname>
          </string-name>
          , Gerardo Giardiello, Manola Cherubini, Simone Marchi, Giulia Venturi, and
          <string-name>
            <given-names>Simonetta</given-names>
            <surname>Montemagni</surname>
          </string-name>
          .
          <year>2009</year>
          .
<article-title>NLP-based metadata extraction for legal text consolidation</article-title>
          .
          <source>In ICAIL</source>
          , pages
          <fpage>40</fpage>
          -
          <lpage>49</lpage>
.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <given-names>Jan E</given-names>
            <surname>Trost</surname>
          </string-name>
          .
          <year>1986</year>
          .
          <article-title>Statistically nonrepresentative stratified sampling: A sampling technique for qualitative studies</article-title>
          .
          <source>Qualitative sociology</source>
          ,
          <volume>9</volume>
          (
          <issue>1</issue>
          ):
          <fpage>54</fpage>
          -
          <lpage>57</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <given-names>Wenpeng</given-names>
            <surname>Yin</surname>
          </string-name>
          , Katharina Kann,
          <string-name>
            <given-names>Mo</given-names>
            <surname>Yu</surname>
          </string-name>
, and Hinrich Schütze.
          <year>2017</year>
          .
<article-title>Comparative study of CNN and RNN for natural language processing</article-title>
          .
<source>arXiv preprint arXiv:1702.01923</source>
.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>