<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Classification of German Court Rulings: Detecting the Area of Law</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ingo GLASER</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Florian MATTHES</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Software Engineering for Business Information Systems, Department of Informatics, Technical University of Munich</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <abstract>
        <p>This paper investigates the feasibility of automatically detecting the legal area of court rulings. We hypothesize that the allocation to a field of law is often ambiguous and that errors occur in that process as a result. A dataset of over 9,000 labelled court rulings was used to train different machine learning (ML) classifiers. Additionally, we applied rule-based approaches utilizing the domain knowledge of legal experts. Our models outperformed the rule-based approaches significantly. Hence, we could show that ML models are less prone to errors than the manual assignment by legal experts.</p>
      </abstract>
      <kwd-group>
        <kwd>legal document classification</kwd>
        <kwd>area of law detection</kwd>
        <kwd>semantic analysis of court rulings</kwd>
        <kwd>natural language processing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Legal rules are only partially defined in legislation. Besides, in the context of jurisdiction,
there is a permanent local, case-related clarification and concretization of the terms used
in legislative texts. Court rulings, therefore, represent, next to legislative texts, a second
important source of conceptual knowledge for the practice of law.</p>
      <p>Hence, court rulings play not only an important role in legal research but are also part
of the daily work of legal practitioners. Various online databases exist to make judgments
accessible in the digital age. These databases aim at offering state-of-the-art information
retrieval features to provide their users with useful search functionality. To this end, the court
rulings are enriched with semantic information. This is where the work of legal authors
begins.</p>
      <p>One crucial piece of information is the area of law on which the decision is based.
At first glance, it seems easy for a legal author to decide to which area of law a given
verdict belongs. However, as this is only one of the tasks of a legal author when enriching
court rulings with semantic information, every automation allows one to focus on other
tedious and time-consuming tasks (e.g. building norm chains, writing guiding principles,
etc.). Furthermore, we establish the hypothesis that the allocation to a field of law is often
ambiguous, and errors occur in that process as a result. Hence, in this paper, we want to
investigate the feasibility of automatically detecting the legal area of court rulings.</p>
      <p>The remainder of the paper is structured as follows: Section 2 provides a short
overview of the related work, and Section 3 describes the areas of law used. The
experimental setup and the dataset are discussed in Section 4. The approaches
and their performance are evaluated in Section 5, before Section 6 closes with a conclusion
and outlook.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        The computer-assisted semantic analysis of court rulings is highly relevant and has
attracted researchers for quite some time. However, hardly any attempt has been made in
the German legal domain. Waltl et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] attempted to predict the outcome of appeal
decisions in Germany’s tax law. They trained different machine learning classifiers based
on the previous instance to determine likelihood ratios and thus predict the outcome of
the appeal. In another paper from 2017, Waltl et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] demonstrated the rule-based
extraction of semantic information, such as the year of dispute, from court rulings in the
area of tax law.
      </p>
      <p>
        However, approaches to utilize court rulings for various analyses exist in different
jurisdictions. An important contribution concerning pre-processing court rulings was made
by Savelka et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. They showed that legal decisions are more challenging for
existing sentence boundary detection (SBD) systems than non-legal texts, and trained
conditional random field (CRF) models that outperformed state-of-the-art SBD systems when
applied to adjudicatory decisions. Westermann et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] have built classifiers in the form of
boolean search rules on four different legal datasets, including statutory data, to provide
an explainable legal classification setup. As it is a core activity in legal decision making,
the identification of relevant or similar court decisions was investigated by Moodley et
al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Hereby, the authors compared the results of state-of-the-art text similarity
algorithms with the citation behavior in the case citation network for the Court of Justice of
the European Union. The fact that labeled datasets in the legal domain are most often
small, scarce, and expensive was taken up by Condevaux et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] as they utilized weakly
supervised one-shot classification for recurrent neural networks (RNN) to overcome the
issue of data scarcity. In their work, they focused on predicting the outcome of decisions
given highly ambiguous judge arguments. Slingerland et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] classify Dutch civil law
judgments based on whether they involve the Brussels I Regulation, including the Recast,
or not.
      </p>
      <p>
        As Brüninghaus and Ashley [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] explained, the desire of attorneys to find the
most relevant cases has driven broad interest in text classification in the legal domain,
and much research has applied legal text classification [
        <xref ref-type="bibr" rid="ref10 ref11 ref12 ref13 ref14 ref9">9,10,11,12,13,14</xref>
        ]. While
there exists related work with regard to the classification of legal texts, to the best of our
knowledge, there is no work concerning the detection of the area of law of German court
rulings.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Areas of Law</title>
      <p>The classification of legal court rulings into different areas seems obvious at first glance.
However, such a classification highly depends on the specific domain.</p>
      <p>For this work, we utilized common areas of law that were provided by a German
legal publisher who uses them in practice. Their classification system consists of 92
different classes. Since this classification is very fine-grained and therefore few examples
per class are available, we needed a more abstract classification. Table 1 reveals the
taxonomy used in this work.</p>
      <p>The first column depicts the 16 fields of law that are utilized in our experiments.
There is a null class to intercept decisions that do not fall under any of the specific areas
of law. While there may be room for arguing whether some of the classes are mapped
properly, the underlying system has been proven in practice as it is employed by many
legal magazines.</p>
      <p>Furthermore, we came up with a scenario comprising only four different fields of
law. The mapping between the 16-class and the four-class setup is shown in Table 2, and its
reasoning is discussed in Section 4.1.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Setup</title>
      <p>Legal research is a crucial task in the field of law. Due to the tedious and labor-intensive
work, we are investigating the automatic classification of court rulings according to
different areas of law. We not only utilize machine learning (ML) to classify decisions into
fields of law but also mimic human behavior on that task by capturing it in a rule-based
manner.</p>
      <sec>
        <title>4.1. Data</title>
        <p>For this research, we utilized a dataset provided by a German legal publisher. The dataset
consists of 9,563 civil law decisions from 2010 to 2020 in XML format, covering
various German court instances (e.g. Supreme Court, Regional Courts, Higher Regional
Courts, Local Courts) in the ordinary jurisdiction. In a first preprocessing step, the raw
text of these decisions was extracted from the XML files. Since the XML already
contained a segmentation into the components of a judgment, it was possible to retrieve these
separately:</p>
        <p>1. Heading (Rubrum): The rubrum is the introduction of the
judgment and consists of the file number, header, names of the parties involved, in
particular the plaintiff and defendant and any legal representatives, the court, the date
of the last trial, and the designation of the judgment.
2. Guiding Principle (Leitsatz): A guiding principle is not directly part of the court
ruling, but is prepared upon publication and represents a summary of the main
reasons for the decision by the court.
3. Tenor (Tenor): The most important part of the judgment, as this is where the
legal dispute is decided, i.e. whether the defendant is ordered to pay the plaintiff
the amount claimed or whether the claim is dismissed. The tenor is composed of
three things: the actual tenor, the decision on costs, and provisional enforceability
including the power to avert the judgment.
4. Facts (Tatbestand): The facts on which the judgment is based, presented
in the same way as they were presented to the court after the last oral hearing.
5. Reasoning (Entscheidungsgründe): The court states the reasons for its decision
in the reasoning part.</p>
        <p>Table 1 (first column, Area of Law): Building Law; Professional Law; Labor and Social Law;
Banking and Credit Security Law; Family and Inheritance Law; Toll Law; Commercial and Corporate Law;
Liability and Insurance Law; Tenancy and Real Estate Law; Enforcement and Insolvency;
Motor Vehicle and Traffic Law; Neighbourhood Law; Procedural Law; Contract Law;
Competition Law and Industrial Property Rights; Other.</p>
        <p>The separation into these components allowed us to investigate different
classification inputs. Even though the tenor is mandatory for every German court ruling (§ 117 II
Nr. 3 VwGO), 69 decisions did not include a tenor. The same applies to the reasoning
part (§ 117 II Nr. 5 VwGO): the 11 court rulings that do not contain any reasoning can be
explained by the fact that they are guiding principle decisions. Often the facts of a given
case are summarized under the reasoning part. Therefore, only 3,929 decisions explicitly
state the facts. Finally, four decisions lacked a label and were therefore
removed from the dataset.</p>
        <p>As the data has been published in various legal magazines, the decisions were labeled
over the course of many years according to the company guidelines of the legal publisher.
The distribution of the different fields of law for the remaining 9,550 verdicts is shown
in Table 2.</p>
      </sec>
      <sec id="sec-4-1">
        <title>4.2. Experiment</title>
        <p>To be able to assess our two hypotheses, our experimental setting constituted three steps:</p>
      </sec>
      <sec id="sec-4-2">
        <p>1. Rule-based Classification with Four Classes: We implemented a rule-based approach
and evaluated it using different parts of the dataset.
2. Model Training with Four Classes: We trained various classifiers on the dataset
with four classes and evaluated them using 10-fold cross-validation on 20% of
the data.
3. Model Training with 16 Classes: We trained various classifiers on the dataset
with 16 classes and evaluated them through 10-fold cross-validation on 20% of
the data.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.2.1. Rule-based classification</title>
        <p>Three documents describing how to manually identify the area of law for a given
court ruling formed the foundation of the rule-based approach. These documents were
created by three legal experts who perform this task daily. First, we transformed the
descriptions into rules implemented in the programming language Python. These rules consist
of four distinct criteria:
1. Specific Laws (SL): We extracted all legal references and compared them against
a list of laws that are typical for the underlying area.
2. Specific Norms (SN): One level deeper, we checked for specific norms within these laws.
3. Typical Terms (TT): A simple lookup to check for the occurrence of specific terms.
4. Cited Literature (CL): Based on the extracted legal references, the occurrence of
certain literature such as commentaries was counted.</p>
        <p>Furthermore, the file number of the decision provides valuable information. Higher
courts such as the German Supreme Court (BGH) utilize a unique system for the file
numbers. Such a file number indicates the underlying field of law. This information was
taken into account as well. The rule-based algorithm favors the file number, i.e. if the
decision originates from a court with such a file number system, the classification is done
purely based on that number.</p>
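        <p>A minimal sketch of how a file number could short-circuit the classification is given below. The regular expression, the senate-to-area mapping, and the function name area_from_file_number are illustrative assumptions and not the publisher's actual rules; only file numbers that start with a Roman-numeral senate followed by a register sign such as ZR are handled.</p>
        <preformat>
```python
import re
from typing import Optional

# Hypothetical mapping from a BGH civil senate (Roman numeral) to an area of law.
SENATE_TO_AREA = {
    "VIII": "Tenancy and Real Estate Law",
    "XI": "Banking and Credit Security Law",
    "XII": "Family and Inheritance Law",
}

# Matches file numbers such as "XII ZR 12/19": senate, register sign, number/year.
FILE_NUMBER = re.compile(r"^\s*([IVX]+)\s+(Z[RB])\s+\d+/\d{2}\s*$")


def area_from_file_number(file_number: str) -> Optional[str]:
    """Return the mapped area of law, or None if the weighted rules must decide."""
    match = FILE_NUMBER.match(file_number)
    if match is None:
        return None
    return SENATE_TO_AREA.get(match.group(1))


print(area_from_file_number("XII ZR 12/19"))  # Family and Inheritance Law
print(area_from_file_number("3 O 145/18"))    # None
```
</preformat>
        <p>Returning None signals that the decision comes from a court without such a file number system, so the weighted criteria have to be evaluated instead.</p>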
        <p>In all other cases, the algorithm counts the occurrences of the specific laws, norms,
terms, and literature. The weights used in the formula were provided by the legal experts.
Figure 1 depicts that approach.</p>
        <p>
          <disp-formula>
            <tex-math>\mathrm{Score} = SL + 2 \cdot SN + 0.5 \cdot TT + 3 \cdot CL</tex-math>
          </disp-formula>
        </p>
        <p>We then applied different thresholds to the score to classify each decision into one
field of law. Section 5.1 elaborates on these thresholds in greater detail.</p>
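        <p>The scoring step can be sketched as follows. This is a minimal illustration and not the implementation used in the experiments: it assumes that the weights 1, 2, 0.5, and 3 for SL, SN, TT, and CL are multiplicative, as read from the formula, and the names RuleCounts and rule_score are hypothetical.</p>
        <preformat>
```python
from dataclasses import dataclass


@dataclass
class RuleCounts:
    """Occurrence counts of the four criteria for one candidate area of law."""
    specific_laws: int     # SL
    specific_norms: int    # SN
    typical_terms: int     # TT
    cited_literature: int  # CL


# Assumed weights, read from the scoring formula (provided by the legal experts).
WEIGHTS = {"SL": 1.0, "SN": 2.0, "TT": 0.5, "CL": 3.0}


def rule_score(c: RuleCounts) -> float:
    """Weighted sum of the criteria counts."""
    return (WEIGHTS["SL"] * c.specific_laws
            + WEIGHTS["SN"] * c.specific_norms
            + WEIGHTS["TT"] * c.typical_terms
            + WEIGHTS["CL"] * c.cited_literature)


counts = RuleCounts(specific_laws=4, specific_norms=3, typical_terms=10, cited_literature=2)
print(rule_score(counts))  # 4 + 6 + 5 + 6 = 21.0
```
</preformat>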
      </sec>
      <sec id="sec-4-4">
        <title>4.2.2. ML-based classification</title>
        <p>As discussed already, two different settings were applied. However, since the only
difference is the underlying variation of the taxonomy, the same approaches were used for
both setups. Therefore, the following steps were implemented:</p>
        <p>Pre-Processing: We used three different pre-processing procedures: (1) The
normalization (PRE) consisted of removing line breaks and duplicated
whitespace, replacing German umlauts, spelling out numbers, and removing
punctuation. (2) Stop word removal (SWR) was performed according to the spaCy<sup>1</sup> stop
word list. (3) Lemmatization (Lemma) was conducted using spaCy. These
three procedures were incorporated into pipelines in different combinations.</p>
        <p>
          Feature Extraction: Four different feature representations were used: (1) A
bag-of-words approach: we used simple word count
vectors and, where indicated, additionally a term frequency-inverse
document frequency (TFIDF) transformer on these vectors. Where indicated,
part-of-speech (POS) tags were created and used as well. To keep the bag-of-words
approach in this case, each token was combined with its POS
tag using a dash. (2) Word embeddings: we trained word2vec models on different
legal corpora and also used
pretrained models. These models were used to calculate the mean embedding of a
decision component (e.g. reasoning). (3) We also incorporated topic modeling to
create features. (4) Finally, we utilized state-of-the-art deep neural representations
such as BERT [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ].
        </p>
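        <p>The normalization step (PRE) could look roughly like the sketch below. It covers line breaks, duplicated whitespace, umlauts, and punctuation; spelling out numbers is omitted. The replacement table, the set of punctuation characters, and the function name normalize are assumptions made for illustration.</p>
        <preformat>
```python
import re
import string

# Assumed transliteration of German umlauts and the sharp s.
UMLAUTS = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue", "ß": "ss",
})
# Characters to drop: ASCII punctuation plus the section sign and German quotes.
PUNCTUATION = str.maketrans("", "", string.punctuation + "§„“”")


def normalize(text: str) -> str:
    """Apply the PRE steps except for spelling out numbers."""
    text = text.translate(UMLAUTS)
    text = text.translate(PUNCTUATION)
    # Collapse line breaks and runs of whitespace into single spaces.
    return re.sub(r"\s+", " ", text).strip()


print(normalize("Das Arbeitsverhältnis\nendet  gemäß § 622 BGB."))
# Das Arbeitsverhaeltnis endet gemaess 622 BGB
```
</preformat>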
        <p>Training of Machine Learning Model: Six different traditional classifiers were
applied to the task of predicting the legal area of court rulings. We used
multinomial naive Bayes (MNB), logistic regression (LR), support vector machines
(SVM), multilayer perceptrons (P), random forests (RF), and an extra tree
classifier (ETC). The models were trained using 10-fold cross-validation on 80% of
the dataset in each iteration. Furthermore, we trained different deep neural
architectures on 60% of the data. Here, 20% of the data was used for validation during
training, while the other 20% acted as a hold-out set for the final testing.</p>
        <p>Evaluation and Error Analysis: Weighted variants of precision, recall, and F1 were
used to evaluate the performance of the trained models.</p>
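        <p>Weighted precision, recall, and F1 average the per-class scores using each class's support as the weight. The following sketch computes them from scratch for illustration; the function name weighted_prf and the toy labels are assumptions, and library implementations may treat edge cases differently.</p>
        <preformat>
```python
from collections import Counter
from typing import Sequence, Tuple


def weighted_prf(y_true: Sequence[str], y_pred: Sequence[str]) -> Tuple[float, float, float]:
    """Support-weighted precision, recall, and F1 over all classes in y_true."""
    support = Counter(y_true)
    predicted = Counter(y_pred)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label, n in support.items():
        # Per-class scores; a class that is never predicted gets precision 0.
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / n
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = n / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return precision, recall, f1


y_true = ["A", "A", "A", "B"]
y_pred = ["A", "A", "B", "B"]
print([round(v, 3) for v in weighted_prf(y_true, y_pred)])  # [0.875, 0.75, 0.767]
```
</preformat>
        <p>Because the weights follow the class distribution, frequent classes dominate the weighted scores, which is worth keeping in mind when classes are as imbalanced as in this dataset.</p>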
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Evaluation &amp; Error Analysis</title>
      <sec id="sec-5-1">
        <title>5.1. Evaluating the Performance</title>
        <p>The objective of this work was to evaluate the possibility of automatically detecting the
field of law for a given legal verdict as well as to compare ML-based with rule-based
approaches.</p>
        <p>
          To achieve this, different classifiers were incorporated into various pipeline settings
as described in Section 4.2.2 and applied to the dataset constituting four classes. While
this resulted in over 50 different models, we also utilized state-of-the-art deep neural
networks utilizing contextual embeddings such as BERT [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. In Table 3 we only report
on the best performing classifier for space reasons.
        </p>
        <p>It is an SVM applied in a pipeline that performs our pre-processing procedure and
uses TFIDF on the lemmatized input. As input, only the reasoning part of the court rulings
was used, as the other components resulted in inferior performance. The resulting model
achieved an F1 of 0.87. In contrast, the initial rule-based approach only achieved
an F1 of 0.15. The poor result can be attributed to the low recall, particularly for the null
class. As a result, we introduced a threshold for the score, i.e. the score needs to meet
a defined value to be eligible for classification. If this condition is not met, the decision
is classified into the class Other. As Table 3 shows, this threshold increased the
recall while keeping the precision almost constant. We achieved the best
result with a threshold value of 30, as a larger threshold worsened the performance.</p>
        <p><sup>1</sup> https://spacy.io/usage/linguistic-features</p>
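        <p>The threshold rule can be illustrated with a short sketch. It assumes that a score has already been computed for each candidate area and that the highest-scoring area is chosen, which the text does not state explicitly; whether a score equal to the threshold is still eligible is likewise an assumption. The default of 30 is the best-performing threshold reported above, while classify_with_threshold and the example scores are hypothetical.</p>
        <preformat>
```python
from typing import Dict

NULL_CLASS = "Other"


def classify_with_threshold(scores: Dict[str, float], threshold: float = 30.0) -> str:
    """Pick the best-scoring area, falling back to the null class below the threshold."""
    if not scores:
        return NULL_CLASS
    best_area = max(scores, key=scores.get)
    # Assumption: a score equal to the threshold is still eligible.
    return best_area if scores[best_area] >= threshold else NULL_CLASS


print(classify_with_threshold({"Contract Law": 42.5, "Procedural Law": 12.0}))  # Contract Law
print(classify_with_threshold({"Contract Law": 18.0, "Procedural Law": 12.0}))  # Other
```
</preformat>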
        <p>The results already suggest that both of our hypotheses are supported: (1) It is possible
to automatically detect the field of law for a given court ruling with good performance
(88%), and (2) ML-based approaches outperform rules mimicking human behavior.</p>
        <p>Nonetheless, a setup consisting of only four classes is not usable in practice. For that
reason, we selected the best performing setup and trained it on the dataset with 16 classes
as well. Table 4 reveals the resulting performance.</p>
        <p>As the overall performance decreases significantly (F1 0.75), the approach does not
seem to generalize well at first glance. However, it can be seen that the performance
only drops in underrepresented classes such as Professional Law or Neighbourhood Law.</p>
      </sec>
      <sec id="sec-5-2">
        <p>Classes with high support, e.g. Family and Inheritance Law or Enforcement and
Insolvency, perform almost on the same level as in the four-class setting. This suggests that
adding sufficient data leads to results of around 90% as well.</p>
      </sec>
      <sec id="sec-5-3">
        <title>5.2. Error Analysis</title>
        <p>The comparison of the results in Table 3 provides evidence for our initial hypothesis
that ML models are more capable of extracting the area of law
from court rulings than our rules. To better understand the differences between our rules and
the ML models, the best configuration (SVM utilizing TFIDF with our pre-processing
and lemmatization) was examined in greater detail.</p>
        <p>Table 4 (support column; class labels not recoverable). Method: SVM (PRE + Lemma). Support: 45, 21, 24, 17, 119, 47, 53, 100, 92, 88, 31, 5, 181, 59, 61, 12; total 955.</p>
        <p>We looked into the existing model and inspected the coefficients of each feature.
The most important features for the class Labor Law are (1) landesarbeitsgericht, (2)
arbeitnehmer, (3) arbeitgeber, and (4) arbeitsverhältnis. When looking at the rules for
the classification of a decision into Labor Law, these terms occur in our TT. However,
these typical terms only play a small role when assigning the classes through the rules. In
general, it could be observed that the rules focus heavily on references to specific laws and
norms, while all of our ML models mostly pay attention to specific terms. Even stronger
evidence is provided by the most important features for the classes Tenancy
Law and Business Law. While the former uses words representing typical terms, the latter
focuses on laws as well (e.g. hgb, aktg or inso), resulting in worse performance.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion &amp; Outlook</title>
      <p>In this paper, we investigated the possibility of automatically detecting the area of law for
a given legal judgment. To this end, various classifiers were trained on a dataset
of 9,559 German court rulings. Besides simple linear models,
advanced techniques such as topic modeling were also incorporated.</p>
      <p>We make two contributions: (1) we highlight the ability to automatically detect
the area of law for German court rulings with high precision, and (2) we show that
ML-based approaches outperform a human-inspired rule-based baseline.</p>
      <p>Nonetheless, this research includes some limitations. The verdict components were
of different lengths. Furthermore, the class distribution varied quite a lot. As a
consequence, future research needs to define an even more suitable setting in terms of data
distribution and size to provide more evidence on the capability of ML models. Yet, this
work builds a solid base for future research in this area.</p>
      <p>Although approaches based on state-of-the-art contextual embeddings are usually
superior, we could not achieve strong results with them. It may therefore be worth another attempt
with different deep neural architectures relying on contextual embeddings.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>B.</given-names>
            <surname>Waltl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Bonczek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Scepankova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Landthaler</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Matthes</surname>
          </string-name>
          , “
          <article-title>Predicting the outcome of appeal decisions in germany's tax law</article-title>
          ,” in Electronic Participation,
          <string-name>
            <given-names>P.</given-names>
            <surname>Parycek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Charalabidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. V.</given-names>
            <surname>Chugunov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Panagiotopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. A.</given-names>
            <surname>Pardo</surname>
          </string-name>
          , Ø. Saebø, and E. Tambouris, Eds. Cham: Springer International Publishing,
          <year>2017</year>
          , pp.
          <fpage>89</fpage>
          -
          <lpage>99</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>B.</given-names>
            <surname>Waltl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Landthaler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Scepankova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Matthes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Geiger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Stocker</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Schneider</surname>
          </string-name>
          , “
          <article-title>Automated extraction of semantic information from german legal documents,”</article-title>
          <source>in IRIS: Internationales Rechtsinformatik Symposium</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Savelka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. R.</given-names>
            <surname>Walker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Grabmair</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K. D.</given-names>
            <surname>Ashley</surname>
          </string-name>
          , “
          <article-title>Sentence boundary detection in adjudicatory decisions in the United States</article-title>
          ,”
          <source>Traitement Automatique des Langues</source>
          , vol.
          <volume>58</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>21</fpage>
          -
          <lpage>45</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>H.</given-names>
            <surname>Westermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Savelka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. R.</given-names>
            <surname>Walker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. D.</given-names>
            <surname>Ashley</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Benyekhlef</surname>
          </string-name>
          , “
          <article-title>Computer-assisted creation of boolean search rules for text classification in the legal domain</article-title>
          ,” in JURIX
          ,
          <year>2019</year>
          , pp.
          <fpage>123</fpage>
          -
          <lpage>132</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>K.</given-names>
            <surname>Moodley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. V. H.</given-names>
            <surname>Serrano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>van Dijck</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumontier</surname>
          </string-name>
          , “
          <article-title>Similarity and relevance of court decisions: A computational study on CJEU cases</article-title>
          ,” in JURIX,
          <year>2019</year>
          , pp.
          <fpage>63</fpage>
          -
          <lpage>72</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C.</given-names>
            <surname>Condevaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Harispe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mussard</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Zambrano</surname>
          </string-name>
          , “
          <article-title>Weakly supervised one-shot classification using recurrent neural networks with attention: Application to claim acceptance detection</article-title>
          ,” in JURIX
          ,
          <year>2019</year>
          , pp.
          <fpage>23</fpage>
          -
          <lpage>32</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R.</given-names>
            <surname>Slingerland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Boer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Winkels</surname>
          </string-name>
          , “
          <article-title>Analysing the impact of legal change through case classification</article-title>
          ,” in JURIX
          ,
          <year>2018</year>
          , pp.
          <fpage>121</fpage>
          -
          <lpage>130</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Brüninghaus</surname>
          </string-name>
          and
          <string-name>
            <given-names>K. D.</given-names>
            <surname>Ashley</surname>
          </string-name>
          , “
          <article-title>Toward adding knowledge to learning algorithms for indexing legal cases</article-title>
          ,”
          <source>in Proceedings of the 7th international conference on Artificial intelligence and law</source>
          ,
          <year>1999</year>
          , pp.
          <fpage>9</fpage>
          -
          <lpage>17</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>E.</given-names>
            <surname>de Maat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Krabben</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Winkels</surname>
          </string-name>
          , “
          <article-title>Machine learning versus knowledge based classification of legal texts</article-title>
          ,”
          <source>in JURIX</source>
          ,
          <year>2010</year>
          , pp.
          <fpage>87</fpage>
          -
          <lpage>96</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>C.</given-names>
            <surname>Biagioli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Francesconi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Passerini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Montemagni</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Soria</surname>
          </string-name>
          , “
          <article-title>Automatic semantics extraction in law documents</article-title>
          ,”
          <source>in Proceedings of the 10th international conference on Artificial intelligence and law</source>
          ,
          <year>2005</year>
          , pp.
          <fpage>133</fpage>
          -
          <lpage>140</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>O'Neill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Buitelaar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Robin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>O'Brien</surname>
          </string-name>
          , “
          <article-title>Classifying sentential modality in legal language: a use case in financial regulations, acts and directives</article-title>
          ,”
          <source>in Proceedings of the 16th edition of the International Conference on Artificial Intelligence and Law</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>159</fpage>
          -
          <lpage>168</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>I.</given-names>
            <surname>Glaser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Scepankova</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Matthes</surname>
          </string-name>
          , “
          <article-title>Classifying semantic types of legal sentences: Portability of machine learning models</article-title>
          ,”
          <source>in JURIX</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>61</fpage>
          -
          <lpage>70</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Falakmasir</surname>
          </string-name>
          and
          <string-name>
            <given-names>K. D.</given-names>
            <surname>Ashley</surname>
          </string-name>
          , “
          <article-title>Utilizing vector space models for identifying legal factors from text</article-title>
          ,”
          <source>in JURIX</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>183</fpage>
          -
          <lpage>192</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>B.</given-names>
            <surname>Waltl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Muhr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Glaser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Bonczek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Scepankova</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Matthes</surname>
          </string-name>
          , “
          <article-title>Classifying legal norms with active machine learning</article-title>
          ,”
          <source>in JURIX</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>11</fpage>
          -
          <lpage>20</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          , “
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,”
          <source>in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)</source>
          . Minneapolis, Minnesota: Association for Computational Linguistics, Jun.
          <year>2019</year>
          , pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>