<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Semantic Extraction of Key Figures and Their Properties From Tax Legal Texts Using Neural Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Daniel Steinigen</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marcin Namysl</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Markus Hepperle</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jan Krekeler</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Susanne Landgraf</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Bucerius Law School</institution>
          ,
          <addr-line>Jungiusstraße 6, Hamburg, 20355</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Federal Ministry of Finance</institution>
          ,
          <addr-line>Wilhelmstraße 97, Berlin, 10117</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS</institution>
          ,
          <addr-line>Schloss Birlinghoven 1, Sankt Augustin, 53757</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Applying information extraction to legislative texts is a challenging task that requires a specification to distinguish the relevant parts from the less relevant parts of the text. Moreover, there is still a lack of appropriate language- and domain-specific data in the field of information extraction. This work investigates the extraction and modeling of key figures from legal texts. We introduce a universally applicable annotation scheme together with a semantic model for key figures and their logically connected properties in legal texts. Moreover, we release KeyFiTax, a dataset with key figures based on paragraphs of German tax acts manually annotated by tax experts, together with a knowledge graph populated from these paragraphs based on our semantic model. Using our dataset, we also evaluate and compare state-of-the-art entity extraction models in terms of long entity spans and low-resource data. Furthermore, we present a transformer-based approach for relation extraction using entity markers to obtain a logical formulation of the key figures. Finally, we introduce task triggers for training a combined resource-efficient entity and relation extraction model. We make our dataset, together with the semantic model and the knowledge graph, as well as the implementation of the entity and relation extraction approaches investigated in this work, publicly available.</p>
      </abstract>
      <kwd-group>
        <kwd>information extraction</kwd>
        <kwd>entity extraction</kwd>
        <kwd>relation extraction</kwd>
        <kwd>ontologies</kwd>
        <kwd>knowledge graphs</kwd>
        <kwd>transformers</kwd>
        <kwd>language models</kwd>
        <kwd>German datasets</kwd>
        <kwd>legal texts</kwd>
        <kwd>tax key figures</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Key figures represent a central component in legal texts of tax laws. They are crucial for applying laws and are an important criterion in the amendment of laws. Such key figures are, e.g., Entfernungspauschale ‘distance allowance’, Kinderfreibetrag ‘child tax-free allowance’, or Werbungskostenpauschale ‘flat-rate income-related expenses allowance’.</p>
      <p>Figure 1: Ontology for semantic modeling of key figures and their logically connected properties in legal texts.</p>
      <p>Changing key figures in the tax laws directly affects the resulting tax revenue. An example would be increasing the commuter allowance to 50 cents per km. To estimate the impact of a change in the law, a model can be used to simulate what effect an adjustment of the key figures will have on the specific tax forecast. To facilitate this, in this paper we propose an approach based on information extraction and semantic technologies. For this, it is first necessary to recognize and extract the key figures with their logically connected properties and rules from legal texts. This task requires an automatic understanding of the legal texts and recognizing the relevant information within the text. Then it is necessary to semantically model the extracted information using a specific ontology and populate a Knowledge Graph (KG) out of this information. This then allows comparing the KGs of existing and new law texts to identify legislative changes. In this paper we focus on the information extraction part and the semantic modeling part. We leave the differential analysis and the prediction of the impact
[...] machine learning (ML) approaches to extract key figures from legal texts. Specifically, we consider this problem a token-level classification task, known as sequence labeling. With this approach, each token of a text is classified according to the predefined categories, whereby tokens not assigned to any class are labeled with zeros [1]. More precisely, this can also be interpreted as an entity extraction task in which individual entities can span over many words or tokens. Entity extraction is widely used in the research area of information extraction (IE) and has also been applied in the legal domain [2].</p>
      <p>We face several challenges in applying standard entity extraction approaches in our work. Since we focus on German tax legal texts, we have both language- and domain-specific data. This means we are in a low-resource domain and have to deal with limited training data. Moreover, the entities can span over many tokens, making it harder for the models to recognize the complete entities. Furthermore, not all numeric currency values are directly relevant to key figures. Therefore, the model must learn the text semantics and what specific tokens refer to in order to extract the relevant values.</p>
      <p>For obtaining a logical formulation of the key figures, it is necessary to extract the key figures represented by their entities and the relations between them. To address this, we also consider relation extraction approaches in our work. To facilitate resource-efficient training and to get more benefit from the limited amount of available training data, training a combined model for both entity and relation extraction is reasonable.</p>
      <p>As a prerequisite to training models for the automatic extraction of key figures, we also introduce an annotation scheme together with a semantic model for key figures in legal texts. A variety of approaches, ontologies, and knowledge graphs already exist for semantic modeling of legal texts. LegalRuleML by Palmirani et al. [3] is intended to model legal rules and to connect legal sources with the metadata of the rules. They also introduce a metamodel with defined nodes (classes) and edges (properties) to expose the LegalRuleML metadata as linked data. Moreno Schneider et al. [4] propose a Legal Knowledge Graph that integrates and links heterogeneous compliance data sources including legislation, case law, regulations, standards, and private contracts. Holzenberger and Van Durme [5] investigated the performance of natural language understanding approaches on statutory reasoning by introducing the SARA dataset, which consists, among other things, of extracted arguments and a graph-based representation of those arguments. Nevertheless, these approaches are either too general or too specifically modeled for a particular problem to fit our use case of modeling key figures. Therefore, we propose a new semantic model tailored to our use case that models the key figures with their properties in detail. The authors of this paper are a diverse team of NLP and ML experts and tax experts. In interdisciplinary cooperation, we have developed an annotation scheme and a semantic model in an iterative process, which contains the classes and properties required for the complete specification of the key figures.</p>
      <p>There are various challenges when annotating the key figures. Since legal texts can be structured in a complex way, the goal is to find a universally applicable annotation scheme. Furthermore, most key figures contain not just a single value but different values that apply under different conditions. Using the created annotation scheme, we generated a manually annotated gold standard dataset based on paragraphs of German tax laws. This dataset is the basis for training and evaluating different state-of-the-art information extraction models. Figure 2 shows two examples of annotated paragraphs with distinct categories or entity types.</p>
      <p>By applying our information extraction models and our semantic model, the adjusted key figures will be extracted and semantically modeled when the legal texts have changed so that they can be taken into account in the tax forecast. In summary, the contributions of this paper are as follows:</p>
      <list list-type="bullet">
        <list-item><p>An annotation scheme together with a semantic model for key figures in legal texts</p></list-item>
        <list-item><p>A dataset consisting of paragraphs of German tax laws with annotated key figures and a knowledge graph populated with these key figures</p></list-item>
        <list-item><p>Evaluation and comparison of state-of-the-art entity extraction models in terms of long entity spans and low-resource data utilizing the proposed dataset</p></list-item>
        <list-item><p>A transformer-based approach for a combined resource-efficient extraction of entities and relations from legal data.</p></list-item>
      </list>
    </sec>
    <sec id="sec-2">
      <title>2. Semantic Model and Dataset</title>
      <sec id="sec-2-1">
        <title>2.1. Data Sources and Data Selection</title>
        <sec id="sec-2-1-1">
          <p>The initial data basis for generating the annotated dataset consists of legal texts in the German language. For this purpose, we took advantage of the publicly accessible website of the German Federal Ministry of Justice and the Federal Office of Justice<sup>1</sup>, which contains the current German laws and legal regulations. These legal texts are available in various data formats, such as XML, PDF, or HTML. For our purpose, we use the XML files and automatically extract the contained legal paragraphs.</p>
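          <p>The automatic extraction of paragraphs from the XML files could look roughly like the following minimal sketch. The element names (norm, metadaten, enbez, P), the section label, and the sample text are assumptions made for illustration; the files actually used by the authors may be structured differently.</p>
          <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for one law file. The element
# names and the section label "§ 1" are placeholders for illustration only.
SAMPLE_XML = """
<dokumente>
  <norm>
    <metadaten><jurabk>EStG</jurabk><enbez>§ 1</enbez></metadaten>
    <textdaten><text><Content>
      <P>Das Kindergeld beträgt monatlich für das erste
         und zweite Kind 219 Euro.</P>
    </Content></text></textdaten>
  </norm>
  <norm>
    <metadaten><jurabk>EStG</jurabk></metadaten>
    <textdaten><text><Content/></text></textdaten>
  </norm>
</dokumente>
"""


def extract_paragraphs(xml_text: str) -> list[dict[str, str]]:
    """Return one record per norm that has both a section label and text."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for norm in root.iter("norm"):
        label = norm.findtext("metadaten/enbez")
        # Join the text of all P elements and collapse internal whitespace.
        body = " ".join(
            " ".join("".join(p.itertext()).split()) for p in norm.iter("P")
        )
        if label and body:
            records.append({"label": label, "text": body})
    return records


paragraphs = extract_paragraphs(SAMPLE_XML)
print(paragraphs)
```
]]></preformat>
          <p>Norms without a section label or without body text (such as the second norm in the sample) are skipped, so that only candidate legal paragraphs remain for the later selection step.</p>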
          <p>In accordance with the overall aim of providing a model for determining the impact of legislative change on tax revenues, we first select the relevant German tax laws, notably the Fiscal Code (Abgabenordnung), the Income Tax Act (Einkommensteuergesetz), the Corporate Tax Act (Körperschaftsteuergesetz), the Inheritance Tax Act (Erbschaft- und Schenkungsteuergesetz), and further tax acts regulating German direct and indirect taxes. To generate a larger dataset, we also considered tax acts from other German-speaking jurisdictions, such as Austria or Switzerland. We abandoned this idea because these acts use the same key figures with a differing meaning, or different key figures with the same meaning, compared with the German jurisdiction, and such inconsistencies would harm the dataset.</p>
          <p>In the second step, we determine the relevant sections and paragraphs of the selected acts. To this end, we ask which rules directly impact the tax revenues rather than merely serving a supporting or systematizing function. Accordingly, we select those sections and paragraphs that contain a key figure together with a corresponding value and unit. These three are the essential and mandatory components of the relevant key figures, whereas the other categories are optional. The categories are described in detail in the next section.</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Semantic Model</title>
        <p>We introduce our annotation scheme and our semantic model for creating the dataset with different semantic categories for the key figures. The goal is to provide a comprehensive specification of the key figures so that they can be used independently of the legal text for downstream applications, such as tax forecasts. The annotation scheme and the semantic model should be universally applicable to legal texts, which can be structured in various complex ways. We identified the semantic categories in an iterative process by analyzing different paragraphs of tax acts and continuously revising our annotation scheme.</p>
        <p>First, we introduce the category for the key figure itself as the central category. A key figure is characterized by containing one or more values that have an impact on tax revenue. The annotation can be considered the name or label of the key figure. It corresponds to a text phrase or word that describes this key figure. Figure 2 shows, for example, the annotation of the key figures distance allowance and child allowance. Since every key figure we consider here should have at least one value, there is a category for these values, which we call the expression of the key figure. These are numerical values or terms to which the key figures refer, such as the values 0.30 or 4 500 in Figure 2. The expressions can be specified in certain units, so there is a category for units. In the case of monetary amounts, which often appear in tax acts, the unit is in most cases a currency, such as Euro.</p>
        <p><sup>1</sup> https://www.gesetze-im-internet.de</p>
        <p>With these three categories, simple key figures can already be specified. However, while analyzing the legal texts, we found that the key figures can also be structured in a much more complex way. Most key figures contain not only a single expression but different expressions that apply under different conditions, and there are also preconditions for specific key figures. For this purpose, we introduce the category condition. It includes spans of text over several words with conditions that apply to a key figure or for which a key figure has specific expressions. An example is the commuter allowance, which amounts to 0.30 euros up to 20 kilometers driven and increases to 0.35 euros from kilometer 21. Another example is given in Figure 2, where the child allowance can have different expressions (values) depending on the number of children, which is the condition there.</p>
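        <p>The categories introduced so far (key figure, expression, unit, and condition) can be pictured as a small data structure. The following sketch uses the commuter allowance example from the text; the class and field names are hypothetical and only illustrate how the categories relate to each other.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Optional


@dataclass
class Expression:
    """A value of a key figure, given in a unit, optionally under a condition."""
    value: Decimal
    unit: str
    condition: Optional[str] = None  # text span; None if unconditional


@dataclass
class KeyFigure:
    """A named key figure with one or more expressions."""
    name: str
    expressions: list[Expression] = field(default_factory=list)

    def is_complete(self) -> bool:
        # Value and unit are treated as mandatory; conditions stay optional.
        return bool(self.expressions) and all(e.unit for e in self.expressions)


commuter_allowance = KeyFigure(
    name="commuter allowance",
    expressions=[
        Expression(Decimal("0.30"), "Euro", condition="up to 20 kilometers driven"),
        Expression(Decimal("0.35"), "Euro", condition="from kilometer 21"),
    ],
)
print(commuter_allowance.is_complete())
```
]]></preformat>
        <p>Attaching the condition to the expression rather than to the key figure is one possible design; the semantic model described in this section also allows conditions that apply to a key figure as a whole.</p>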
        <p>We also found that there can be different types of conditions, namely negative, alternative, or cumulative conditions. An example of a negative condition can be found in section 24 sentence 2 of the Corporate Tax Act. The provision stipulates that the allowance for corporate tax subjects, as regulated in sentence 1, does not apply to the types of subjects specified in numbers 1 to 3 of the provision. Alternative conditions are, for instance, used in section 10b para. 1 sentence 8 of the Income Tax Act. The sentence stipulates that certain membership fees cannot be deducted if they are paid to corporations serving certain purposes specified in numbers 1 to 5. The deduction prohibition already applies if only one of these numbers is fulfilled, as indicated by the word "or" between the penultimate and the last number. Section 10b para. 1a sentence 1 contains one of many examples of cumulative conditions: donations into the assets of a foundation are only declared deductible if they meet the requirements of a donation into the assets of a foundation, the provisions of para. 1 sentences 2 to 6 are fulfilled, and an application has been filed.</p>
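        <p>The three condition types can be read as Boolean connectives: cumulative conditions behave like a logical "and", alternative conditions like a logical "or", and negative conditions like a negation. The sketch below illustrates this reading with the two Income Tax Act examples from the text; the fact names and helper functions are hypothetical and not part of the semantic model itself.</p>
        <preformat><![CDATA[
```python
from typing import Callable, Mapping

Facts = Mapping[str, bool]
Predicate = Callable[[Facts], bool]


def atom(name: str) -> Predicate:
    """A single condition that holds if the named fact is true."""
    return lambda facts: bool(facts.get(name, False))


def cumulative(*parts: Predicate) -> Predicate:
    """All partial conditions must hold."""
    return lambda facts: all(part(facts) for part in parts)


def alternative(*parts: Predicate) -> Predicate:
    """At least one partial condition must hold."""
    return lambda facts: any(part(facts) for part in parts)


def negative(part: Predicate) -> Predicate:
    """The key figure applies only if the condition does not hold."""
    return lambda facts: not part(facts)


# Cumulative: donation into the assets of a foundation (hypothetical fact names).
donation_deductible = cumulative(
    atom("is_donation_into_foundation_assets"),
    atom("para_1_sentences_2_to_6_fulfilled"),
    atom("application_filed"),
)

# Alternative inside a prohibition: one of the purposes no. 1 to 5 suffices.
fee_deductible = negative(
    alternative(*(atom(f"purpose_no_{i}") for i in range(1, 6)))
)

print(donation_deductible({"is_donation_into_foundation_assets": True}))
print(fee_deductible({"purpose_no_3": True}))
```
]]></preformat>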
        <p>Another point to consider when describing key figures is that the expressions are not always just fixed values but can also define a range in which a key figure applies. This is covered by the range category. The range is an indicator for the area in which an expression is valid. This area can be defined by an upper limit, a lower limit, or a limit range. Figure 2 shows an example of an upper limit "at most" and a lower limit "an amount greater than". In addition, there is also a weighting of the expressions, which we call factors. This category characterizes the factor that must be considered for an expression and indicates what the expression refers to. These factors can be further divided into temporal factors, which refer to periods of time, such as months or years, and quantitative factors, which refer to some absolute amount. For example, the paragraph in Figure 2 includes a temporal factor "per calendar year" and a quantitative factor "for each full kilometer".</p>
        <p>Furthermore, we found that not all key figures have their expressions explicitly mentioned as such in the legal texts. This means that the key figures sometimes cannot be recognized as distinct mentions of a short sequence of words, and expressions do not always occur as easily recognizable numerical values. Instead, the key figures and expressions can also be implicitly described in the legal texts using long phrases in a declarative manner. To tackle this, we have two additional categories for the declarative phrases of the key figures and expressions, called declarative key figures and declarative expressions. For the cases where the key figures and expressions are explicitly mentioned, we use the categories stated key figure and stated expression. Table 1 shows all introduced semantic categories with some sample formulations and their English translations.</p>
        <p>To assign the annotations created according to the semantic categories to each other in order to obtain a logical formulation of the key figures, we also introduce relation types between the categories. This is particularly important, since a single paragraph may contain multiple key figures with the associated other categories. Based on the defined semantic categories and relation types, we build a semantic model in the form of an ontology as shown in Figure 1 using the RDF Schema<sup>2</sup> vocabulary. The semantic categories become the classes and the relations become the properties of this ontology, which also define the permissible properties between these classes. For the class expression, we have also defined data properties for storing the numeric values if they are explicitly specified or the phrases for the declarative expressions. This model allows the assignment of the key figures to the associated conditions and expressions during annotation. Noteworthy are the properties hasCondition and hasExpression since they can be applied to two different classes as a head. Conditions can apply directly to certain key figures or define the validity of different expressions. Expressions, on the other hand, can be derived directly from a key figure or can also be part of a condition.</p>
        <p>Furthermore, we introduce the relation join to link related annotations from the same semantic category since there are cases where a single entity is spread across multiple annotations. Beyond the key figures, we also model the paragraphs that contain the key figures and the legal sources, in our case the tax acts that consist of the paragraphs. In addition, since conditions can not only be expressed by natural text but can also depend on other paragraphs, we also introduce a property refersTo between condition and paragraph.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Annotation Rules and Dataset Acquisition</title>
        <p>Given the developed annotation scheme and the collected data sources, the next step is to annotate the legal texts and build up the dataset. For the further procedure of annotating the dataset and applying the information extraction models, we refer to the semantic categories or classes as entities and the properties as relations. We first used the selected paragraphs from Section 2.1 and performed a simple pre-annotation task. Using rule-based approaches and pattern matching, we automatically enriched the paragraphs with annotations for the expression and unit categories. The annotators reviewed these pre-annotations and corrected, removed, or complemented them as necessary. For storing the pre-annotated data, we have chosen the CAS format serialized as an XMI file. It allows us to import the data directly into the annotation tool. For manual annotation of the texts, we use the INCEpTION tool<sup>3</sup> [6] as it has an intuitive graphical user interface and can be configured well for specific annotation tasks.</p>
        <p>Furthermore, we defined a set of annotation rules. We only allow complete words to be annotated and not parts of words. We do not allow multi-label annotation, which means that each token can only be labeled with one of the defined semantic categories. Conditions are the only exception to this rule. Each token already labeled as a condition can also have a label of another category because conditions can also represent a key figure concurrently, and conditions themselves can contain expressions. For example, section 10 para. 1a sentence 1 number 1 of the Income Tax Act contains the key figure of maintenance payments to the divorced or permanently separated spouse who is subject to unlimited income tax liability, which is a condition of this key figure. This is because the key figure and its expression only apply if the maintenance payment, as defined elsewhere (in the German Civil Code) but referenced here, is paid.</p>
        <p>We also found that besides different types, the conditions can also have different formats. Regarding length, some conditions span only a few words, and others might span entire sentences. Here we do not limit the length of the conditions and allow arbitrarily long phrases. The same applies to the categories declarative key figure and declarative expression.</p>
        <p>For our annotation task, we simplify for now the issue that there are different condition types and do not distinguish these types during annotation. We define that cumulative conditions are labeled contiguously and that alternative conditions are labeled separately as long as they do not have a common beginning or end of sentence. In addition, the relations between the entities are also annotated. However, the relations are only allowed between certain entity types, in a defined direction. This annotation was done in accordance with the classes and properties defined in the ontology in Figure 1.</p>
        <p>The data was annotated in an iterative process by tax experts who coauthored this paper. In this process, we also continuously developed the annotation scheme together with the semantic model. The first semantic model was more restrictive, and as the work progressed we allowed more relations when necessary. In addition to the specifications already mentioned, there were other aspects and challenges to be considered during the annotation. The general challenge is the complexity of the German tax regulations, which are often long, convoluted, and contain references to other provisions. Hence, compromises were often necessary between annotating as accurately as possible and managing the complexity of annotations that would otherwise result in specifying rules that affect only a small number of tokens. Because the dataset is of a manageable size, the annotation agreement was that the annotation is done piecewise by both annotators simultaneously. Anomalies and deviations were then discussed together with the NLP engineers, and the annotation scheme was readjusted if necessary.</p>
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Dataset Statistics</title>
        <p>The generated dataset includes 106 annotated paragraphs from 14 different German tax acts. Table 2 shows the statistics of the generated dataset with the number of annotated instances and the token sequence length for each category. It shows that the dataset contains 157 annotations of key figures, with the corresponding additional categories. The statistics also illustrate that the annotations for the categories condition, declarative key figure, and declarative expression contain very long token sequences. We further populated a KG out of this annotated dataset using the defined semantic model from Section 2.2. The annotated dataset, as well as the KG and the list of tax acts of which paragraphs are included in the dataset, have been made publicly available and can be found in the project repository.</p>
        <table-wrap>
          <label>Table 2</label>
          <caption>
            <p>Statistics of the entities and relations in our dataset. No. is the number of annotated instances and Tok. the mean number of tokens for each category.</p>
          </caption>
          <table>
            <thead>
              <tr><th>Entity Type</th><th>No.</th><th>Tok.</th><th>Relation Type</th><th>No.</th></tr>
            </thead>
            <tbody>
              <tr><td>Key figure (stated)</td><td>129</td><td>4</td><td>hasKeyFigure</td><td>157</td></tr>
              <tr><td>Expression (stated)</td><td>295</td><td>2</td><td>hasExpression</td><td>319</td></tr>
              <tr><td>Condition / Unit</td><td>429814</td><td>114</td><td>hasCondition / hasUnit</td><td>237999</td></tr>
              <tr><td>Range</td><td>75</td><td>2</td><td>hasRange</td><td>75</td></tr>
              <tr><td>Factor</td><td>97</td><td>11</td><td>hasFactor</td><td>137</td></tr>
              <tr><td>Key figure (declarative)</td><td>28</td><td>14</td><td>hasParagraph</td><td>106</td></tr>
              <tr><td>Expression (declarative)</td><td>32</td><td>6</td><td>join</td><td>139</td></tr>
            </tbody>
          </table>
          <table-wrap-foot>
            <p>The Condition and Unit rows overlap in the source, so the digits of their values appear interleaved.</p>
          </table-wrap-foot>
        </table-wrap>
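        <p>How annotated entities and relations might be turned into KG statements can be sketched with plain subject-predicate-object triples. The class and property names follow the ontology in Figure 1; the namespace, the node naming, the direction assumed for each property, and the use of rdfs:label are assumptions for illustration, not the authors' implementation.</p>
        <preformat><![CDATA[
```python
from decimal import Decimal

# Hypothetical namespace; the published knowledge graph may use another one.
KF = "http://example.org/keyfitax#"
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
RDFS_LABEL = "<http://www.w3.org/2000/01/rdf-schema#label>"
XSD_DECIMAL = "<http://www.w3.org/2001/XMLSchema#decimal>"


def iri(local_name: str) -> str:
    """Wrap a local name into an IRI of the hypothetical namespace."""
    return f"<{KF}{local_name}>"


def build_triples(paragraph_id: str, name: str, value: Decimal, unit: str):
    """Create the triples for one stated key figure with a single expression."""
    paragraph = iri(paragraph_id)
    key_figure = iri(f"{paragraph_id}_keyfigure")
    expression = iri(f"{paragraph_id}_expression")
    unit_node = iri(f"unit_{unit}")
    return [
        (paragraph, RDF_TYPE, iri("Paragraph")),
        (key_figure, RDF_TYPE, iri("KeyFigure")),
        (key_figure, RDFS_LABEL, f'"{name}"'),
        (expression, RDF_TYPE, iri("Expression")),
        (unit_node, RDF_TYPE, iri("Unit")),
        (paragraph, iri("hasKeyFigure"), key_figure),
        (key_figure, iri("hasExpression"), expression),
        (expression, iri("hasValue"), f'"{value}"^^{XSD_DECIMAL}'),
        (expression, iri("hasUnit"), unit_node),
    ]


def to_ntriples(triples) -> str:
    """Serialize triples as one N-Triples statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


triples = build_triples("p1", "Kindergeld", Decimal("219"), "Euro")
print(to_ntriples(triples))
```
]]></preformat>
        <p>Conditions, ranges, and factors would add further nodes in the same way. Comparing two such triple sets for an old and a new version of a paragraph is one conceivable basis for detecting changed key figures.</p>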
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Approaches for Key Figure Extraction</title>
      <p>Given the dataset described in Section 2, the goal is to automatically extract the key figures specified by their semantic types from the legal texts. We address this problem by employing entity extraction approaches. In the entity extraction task, each token of a text is assigned a label according to some predefined categories, whereby tokens not assigned to any category are labeled with zeros. The individual entities can then span over a large number of tokens. Based on this, ML-based classification models can be trained to classify each token. Ideally, the model learns from the examples seen during training and generalizes to unseen examples.</p>
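      <p>The token-level labeling described above can be illustrated with a short sketch that turns annotated entity spans into one label per token, using "0" for tokens outside any entity. The whitespace tokenization, the span format, and the category assigned to each span are simplifying assumptions for illustration.</p>
      <preformat><![CDATA[
```python
def label_tokens(tokens, spans, outside="0"):
    """Assign one label per token from (start, end, category) spans.

    The end index is exclusive. Overlapping spans are rejected because this
    simple scheme allows only one label per token.
    """
    labels = [outside] * len(tokens)
    for start, end, category in spans:
        for i in range(start, end):
            if labels[i] != outside:
                raise ValueError(f"token {i} is already labeled {labels[i]}")
            labels[i] = category
    return labels


tokens = "Das Kindergeld beträgt monatlich für das erste und zweite Kind 219 Euro".split()
spans = [
    (1, 2, "KeyFigure"),    # Kindergeld
    (3, 4, "Factor"),       # monatlich
    (4, 10, "Condition"),   # für das erste und zweite Kind
    (10, 11, "Expression"), # 219
    (11, 12, "Unit"),       # Euro
]
labels = label_tokens(tokens, spans)
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```
]]></preformat>
      <p>The six-token condition span shows why long entities are demanding: a model has to label every token of the span consistently for the entity to count as fully recognized.</p>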
      <sec id="sec-4-1">
        <title>3.1. Approaches from NLP libraries</title>
        <p>model and the GBERT and GElectra models by Chan et al.
[8], which, in addition to Wikipedia- and news articles,
is also pre-trained on 2.4GB of German legal texts from
Open Legal Data7 [9]. We also consider a multilingual
language model XLM-RoBERTa [10], which is pre-trained
on 2.5 TB of data from 100 diferent languages, including
about 100 GB of German texts.</p>
        <p>In order to face the challenge of long input sequences
due to the long paragraphs legal texts can have, we also
consider the Longformer model by Beltagy et al. [11]. In
contrast to the other models, which only allow a
maximum length of 512 tokens as input, this model allows
up to 4096 tokens. Specifically, we use the XLM-R
Longformer model by Sagen [12]8. This is an XLM-RoBERTa
model that has been extended to allow sequence lengths
up to 4096 tokens using the Longformer pre-training
scheme.</p>
        <p>We also consider transformer-based approaches as we
investigate the low-resource scenario and have to cope
with long entity spans. Transformer architecture aims
to solve sequence-to-sequence tasks while being able
to consider long-distance dependencies across several
words in a sentence by employing the attention
mechanism [7]. Transformer-based language models can be
pre-trained on large text corpora, allowing them to
understand the contextual relationships between individual
words and sentences. Considering the entity extraction
task, we choose models that utilize the encoder part of
the transformer architecture. These models provide an
encoded representation of the input sentences. We use a
ifnal classification layer to classify the sentence tokens
according to our annotation scheme.</p>
        <p>For our work, we select relevant models pre-trained on
German text data. First, we consider the German BERT
In our work, we consider and compare diferent ap- 3.3. Relation Extraction
proaches for entity extraction. First, we investigate the
approaches of two well-known NLP libraries spaCy and As described in Section 2, our goal is to automatically
RASA. For spaCy, we take advantage of the provided pre- extract key figures represented by their entities and the
defined pipelines for training named entity recognition relations between them to obtain the logical formulation
(NER) models4. We used the recommended settings and of key figures. We employ a relation extraction approach
adjusted the hyperparameters for our use case, as shown to classify the relationship between the entities. Table 2
in Table 7. From RASA, we use an entity extraction ap- lists the relations in our dataset. Note that a simple
ruleproach based on a conditional random field (CRF) model 5. based assignment of the relation type based on the
enThis model utilizes the sklearn-crfsuite6 and uses features tity types according to the ontology in Figure 1 is not
of the words (e.g., capitalization, part-of-speech tagging) straightforward as the relationship may or may not exist
and their context to assign probabilities to certain entity depending on many other factors. Therefore, we apply
classes. ML-based approaches to this task.
We adopt a transformer-based approach inspired by
3.2. Transformer Models for Entity Zhou and Chen [13] and introduce typed entity
markers to the input text before feeding it into the model.</p>
        <p>Extraction First, we add special tokens into the vocabulary of the
model and use them to enclose subject and object entities
within the input paragraph: [SUB], [/SUB], [OBJ],
[/OBJ]. In addition to the subject and object, we also
mark the type of entities in the input text by using
additional special tokens for each entity type, which provides
the neural network with prior knowledge that facilitates
the learning process.</p>
        <p>Multiple training samples are generated for each input
paragraph depending on the number of entities contained
in that paragraph. For each sample, we mark one entity
as a subject and all other entities as objects. Similar to
the sequence labeling approach (Section 3.2), we feed
the text with marked entities to the encoder to obtain a
token-level representation of the input. Then, we apply
a classification layer to classify the relations between the
subject and objects. We label each [OBJ] token with the</p>
        <sec id="sec-4-1-1">
          <title>4https://spacy.io/usage/training/</title>
          <p>5https://rasa.com/docs/rasa/components/#crfentityextractor
6https://sklearn-crfsuite.readthedocs.io/en/latest/
[CLS] [RE] Das [OBJ] Kindergeld [/OBJ] beträgt [OBJ] monatlich [/OBJ] [OBJ] für das erste und zweite Kind [/OBJ] [SUB] 219 [/SUB] [OBJ] Euro [/OBJ]
Key Figure</p>
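          <p>The generation of relation extraction samples with entity markers can be sketched as follows. For every entity in a paragraph, one sample is built in which that entity is the subject and all others are objects. The function names and the span format are hypothetical, the additional type-specific marker tokens are omitted because their exact form is not given here, and the [CLS] token is assumed to be added later by the tokenizer.</p>
          <preformat><![CDATA[
```python
def mark_entities(tokens, entities, subject_index, task_trigger="[RE]"):
    """Enclose one entity in [SUB] markers and all others in [OBJ] markers.

    entities is a list of non-overlapping (start, end) token spans with an
    exclusive end index.
    """
    opening, closing = {}, {}
    for index, (start, end) in enumerate(entities):
        tag = "SUB" if index == subject_index else "OBJ"
        opening[start] = f"[{tag}]"
        closing[end - 1] = f"[/{tag}]"

    marked = [task_trigger]
    for i, token in enumerate(tokens):
        if i in opening:
            marked.append(opening[i])
        marked.append(token)
        if i in closing:
            marked.append(closing[i])
    return " ".join(marked)


def relation_samples(tokens, entities):
    """Build one marked sample per entity, each time with a new subject."""
    return [mark_entities(tokens, entities, s) for s in range(len(entities))]


tokens = "Das Kindergeld beträgt monatlich für das erste und zweite Kind 219 Euro".split()
entities = [(1, 2), (3, 4), (4, 10), (10, 11), (11, 12)]
samples = relation_samples(tokens, entities)
print(samples[3])
```
]]></preformat>
          <p>With the value 219 as subject, the sketch reproduces the marked example shown above, apart from the leading [CLS] token. A paragraph with five entities yields five samples, which is how a small number of paragraphs can still provide many training instances.</p>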
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Experimental Evaluation</title>
      <sec id="sec-5-1">
        <title>4.1. Comparison of Approaches for Entity Extraction From Legal Data</title>
        <p>In this experiment, we evaluate the entity extraction
approaches described in Section 3 on the dataset
introduced in Section 2. For this purpose, we only use the
superclasses condition, unit, range and factor of our
semantic model and do not distinguish between their subclasses.
However, for the key figure and expression classes we
retain the distinction between the stated and declarative
subclasses.</p>
      </sec>
      <sec id="sec-5-3">
        <title>3.4. Joint Entity and Relation Extraction</title>
        <sec id="sec-5-3-1">
          <p>Extending the approach from Section 3.3 further, it is
possible to use the same network architecture to train a
combined entity and relation extraction model. To this
end, we introduce new special tokens called task
triggers to distinguish between the entity and relation extraction tasks:
[EE] and [RE], respectively. We insert these tokens at
the beginning of each paragraph, right after the [CLS]
token.</p>
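          <p>A minimal sketch of how a task trigger could be placed directly
after the [CLS] token is given below. The helper name add_task_trigger and
the plain string handling are assumptions made for illustration. In a real
setup, the trigger tokens would also have to be added to the tokenizer
vocabulary, and the relation extraction input would additionally carry the
entity markers.</p>
          <preformat><![CDATA[
```python
# Task triggers as named in the text: [EE] for entity extraction,
# [RE] for relation extraction.
TASK_TRIGGERS = {"entity": "[EE]", "relation": "[RE]"}


def add_task_trigger(text: str, task: str) -> str:
    """Prefix the paragraph with [CLS] followed by the trigger for `task`."""
    trigger = TASK_TRIGGERS[task]  # raises KeyError for an unknown task
    return f"[CLS] {trigger} {text}"


paragraph = "Das Kindergeld beträgt monatlich 219 Euro"
entity_input = add_task_trigger(paragraph, "entity")
relation_input = add_task_trigger(paragraph, "relation")
print(entity_input)
print(relation_input)
```
]]></preformat>
          <p>The same paragraph thus yields differently prefixed inputs for the
two tasks, so a single model can be told which task to solve in a given
forward pass.</p>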
          <p>Moreover, since the condition class may overlap with
other classes in our dataset, we employ task triggers to
distinguish between groups of entities by defining additional
triggers for each group. This allows us to separate
entities into groups of types with non-overlapping annotations.
Specifically, we have one entity group for
conditions, marked with [GRP-1], and one group for
the remaining entity types, marked with [GRP-2]. Since
we have two entity groups, this gives us
two training samples for entity extraction and multiple
samples (depending on the number of entities) for relation
extraction for each paragraph. By executing multiple
forward passes on a single token classification model, we
can recognize entities with overlapping annotations as
well as the relations between these entities.</p>
          <p>One advantage of this approach is that we do not need
to train separate models for the different entity groups
and for relation extraction, which saves computational
resources for training and memory resources for inference.
Another advantage is that we get a larger number
and variety of samples for training the model and thus
benefit more from the limited training data available.
Figure 3 shows an example excerpt of a paragraph with
the marked entities and the labeled relations. A detailed
overview of all generated training samples for this excerpt
can be found in the project repository.</p>
          <p>4.1.1. Experimental Setup</p>
          <p>Data Split and Data Partition. We use different strategies
for splitting the data. To evaluate the different
types of transformer models described in Section 3.2 and
find the best-performing model on our dataset, we
randomly split the data into fixed training (80%) and
evaluation (20%) subsets. This results in 85 paragraphs for
training and 21 for evaluation. For the condition class,
which is trained separately, there are 73 paragraphs for
training and 18 for evaluation. This allows us to identify
the most suitable models in less time and with less
computational effort compared to the more complex evaluation
approach we used afterward.</p>
          <p>In the next step, we select the best-performing
transformer model and compare it with the other approaches
described in Section 3.1 using k-fold cross-validation.
This validation technique is particularly suitable for the
low-resource scenario considered here, as it reduces the
influence of the distribution of data across the training
and test splits on the evaluation results of the models.
We choose k = 5 and randomly divide the dataset into
five equal-sized subsets. In each iteration, one subset is
retained as the data used for testing the model, and the
remaining four subsets are used as training data. Thus,
each subset is used once for evaluation and four times
for training the model. The results are then averaged to
produce the final scores.</p>
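          <p>The five-fold procedure can be sketched in plain Python as follows.
This is an illustrative sketch and not the evaluation code used for the
experiments: the function name k_fold_indices, the fixed random seed and the
round-robin assignment of shuffled indices to folds are assumptions. The
value n = 106 is derived from the 85 + 21 paragraphs of the fixed split
mentioned above. Since 106 is not divisible by 5, the fold sizes differ by at
most one.</p>
          <preformat><![CDATA[
```python
import random
from typing import List, Tuple


def k_fold_indices(n: int, k: int = 5, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index splits; every index is tested exactly once."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most 1
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


splits = k_fold_indices(n=106, k=5)
for train, test in splits:
    print(len(train), len(test))
```
]]></preformat>
          <p>A model would be trained and scored once per split, and the five
resulting scores would then be averaged.</p>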
          <p>Training Setup. As the annotations for the condition
class may overlap with other annotations, we train two
separate models: one for the recognition of the
condition type and the other for the recognition of the
remaining entity types. We train the transformer model
for 200 epochs with a batch size of 8 and a learning rate
of 1 × 10<sup>−5</sup>. All other relevant hyperparameters and
the configuration files used for the other approaches are
documented in the project repository.</p>
          <p>Evaluation Metric. For each entity type individually,
we report the token-level micro-averaged F1 score on the
test set as the evaluation metric in the charts. We also
provide the macro-averaged F1 score over all classes as a
tabular overview. For k-fold cross-validation, we report
the average F1 score achieved over all five training runs.</p>
        </sec>
        <sec id="sec-5-3-2">
          <title>4.1.2. Results and Discussion</title>
          <p>We believe that the poor results on the declarative key figure
class are due to the complexity of this class
and the low number of instances in the data (see the max. length
and num. samples plots in Figure 4, respectively).</p>
          <p>Despite a large number of available samples, the score
on the condition class is also low for spaCy-NER and
RASA-CRF, but acceptable for XLM-RoBERTa<sub>LARGE</sub>. We
believe that the length and the complexity of this class
could cause this. Note that the longest instances of this
class have over 100 tokens. Moreover, the concept of
a condition is not as strictly defined as, e.g., expression,
unit, or factor.</p>
          <p>Looking at the overall performance across all classes,
XLM-RoBERTa<sub>LARGE</sub> clearly scores best with a
macro-averaged F1 score of 60.9%. SpaCy-NER and RASA-CRF
perform comparably in terms of overall performance but
are still about 15 percentage points behind XLM-RoBERTa<sub>LARGE</sub>.</p>
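          <p>For reference, the macro-averaged F1 score is the unweighted mean
of the per-class F1 scores. The sketch below computes token-level per-class
F1 and its macro average for toy label sequences. The labels, the values and
the helper names are illustrative assumptions and are not taken from the
dataset or from the evaluation code.</p>
          <preformat><![CDATA[
```python
from typing import Dict, List


def per_class_f1(gold: List[str], pred: List[str], ignore: str = "O") -> Dict[str, float]:
    """Token-level F1 per class, skipping the `ignore` (outside) label."""
    classes = sorted((set(gold) | set(pred)) - {ignore})
    scores = {}
    for c in classes:
        tp = sum(g == c and p == c for g, p in zip(gold, pred))
        fp = sum(g != c and p == c for g, p in zip(gold, pred))
        fn = sum(g == c and p != c for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(scores: Dict[str, float]) -> float:
    """Unweighted mean of the per-class scores."""
    return sum(scores.values()) / len(scores)


# Toy token labels, one label per token.
gold = ["unit", "O", "factor", "factor", "O", "unit"]
pred = ["unit", "O", "factor", "O", "unit", "unit"]
scores = per_class_f1(gold, pred)
print(scores, macro_f1(scores))
```
]]></preformat>
          <p>Because every class contributes equally to the mean, a class with
few instances influences the macro average as much as a frequent one.</p>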
        </sec>
        <sec id="sec-5-3-3">
          <title>Transformer Models</title>
          <p>The evaluation results for comparing different pre-trained
transformer models (GBERT<sub>BASE</sub>, GBERT<sub>LARGE</sub>,
GElectra<sub>BASE</sub>, GElectra<sub>LARGE</sub>, Longformer,
XLM-RoBERTa<sub>BASE</sub> and XLM-RoBERTa<sub>LARGE</sub>) are
presented in Table 3 as a summary overview. The detailed
performance of the evaluated models per class is visualized
in the project repository. The results show that the
GBERT and XLM-RoBERTa models outperform the other
models for the declarative expression class. The
best-performing transformer model is XLM-RoBERTa<sub>LARGE</sub>
with an F1 score of 56.8%.</p>
          <table-wrap>
            <label>Table 3 (cross-validated rows)</label>
            <table>
              <thead>
                <tr><th>Model</th><th>Macro-averaged F1 (%)</th></tr>
              </thead>
              <tbody>
                <tr><td>spaCy-NER (cross-validated)</td><td>45.78</td></tr>
                <tr><td>RASA-CRF (cross-validated)</td><td>44.10</td></tr>
                <tr><td>XLM-RoBERTa<sub>LARGE</sub> (cross-validated)</td><td>60.91</td></tr>
                <tr><td>XLM-RoBERTa<sub>LARGE</sub>-Triggers (cross-validated)</td><td>58.78</td></tr>
              </tbody>
            </table>
          </table-wrap>
          <p>Model comparison. Having chosen XLM-RoBERTa<sub>LARGE</sub>,
we cross-validate this model together with the
spaCy-NER and RASA-CRF approaches. Figure 4 presents
the results of this experiment.</p>
          <p>In the case of the unit class, all models achieved high
F1 scores. This is unsurprising, as the instances of this class are
single-token entities (e.g., Euro, EUR) that pose
few challenges to the examined models. Similarly, the
scores for the stated expression class were also high.</p>
          <p>The range and factor classes were recognized relatively
well, especially by XLM-RoBERTa<sub>LARGE</sub> and, in the
case of the factor class, also by spaCy-NER. Note that
these two classes have three times fewer samples than
the expression and unit types. Despite a lower
number of examples, XLM-RoBERTa<sub>LARGE</sub> achieves
similar scores on the declarative expression class.</p>
          <p>All models except XLM-RoBERTa<sub>LARGE</sub> perform
relatively poorly on the key figure class. Interestingly, the
variance of the results for this class is relatively large:
RASA-CRF achieves an F1 score of only 0.16, whereas
XLM-RoBERTa<sub>LARGE</sub> scores three times higher.</p>
          <p>On the declarative key figure class, every model
examined in our experiment performs worst.</p>
          <p>4.2. Combined Extraction of Entities and Relations From Legal Data</p>
          <p>In this experiment, we evaluate the approach described
in Section 3.4 for combined entity and relation extraction
on the dataset introduced in Section 2. We use the same
classes as in Section 4.1.</p>
          <p>4.2.1. Experimental Setup</p>
          <p>Training Setup. We select the XLM-RoBERTa<sub>LARGE</sub>
model for this experiment as its results in Section 4.1
were the most consistent among the examined models.
Using the approach described in Section 3.3, we train one
model for extracting the two groups of entities and the
relations.</p>
          <p>Dataset. We expand our training data according to
Section 3.3. For each record, we create one training sample
for each entity group and one training sample for each
possible subject entity containing the entity markers for
relation extraction. Then, analogous to Section 4.1, we
apply cross-validation to evaluate the model’s
performance.</p>
        </sec>
        <sec id="sec-5-3-4">
          <title>4.2.2. Results and Discussion</title>
          <p>Entity Extraction. The results of this experiment are
presented in Figure 4 and Table 3 under the name
XLM-RoBERTa<sub>LARGE</sub>-Triggers, for comparison with the other
models. The evaluation shows that the jointly
trained model achieves entity extraction performance
comparable to that of the XLM-RoBERTa<sub>LARGE</sub> models
trained separately for conditions and other entities.
Even though the jointly trained model slightly underperforms
on the classes factor, range and declarative expression
compared to the separately trained models,
XLM-RoBERTa<sub>LARGE</sub>-Triggers achieves better performance on
the most relevant class, key figure. Moreover, it performs
better on the most complex class, condition.</p>
        </sec>
        <sec id="sec-5-3-5">
          <title>Relation Extraction</title>
          <p>The performance of this model
in the relation extraction task is presented in Table 4.
The results show that the F1 scores for all relation types
are above 0.6. The F1 scores are especially high for the
relations hasUnit, hasRange, hasFactor and hasExpression. The
model recognized the relationship between expressions
and units almost perfectly.</p>
          <p>Table 4: Macro-averaged 77.34</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Related Work</title>
      <sec id="sec-6-1">
        <title>5.1. NLP datasets in Legal Domain</title>
        <sec id="sec-6-1-1">
          <p>Chalkidis et al. [14] provide a dataset for entity
recognition consisting of 3,500 English contracts manually
annotated with 11 entity types (party name, termination
date, jurisdiction, etc.). Chalkidis et al. [15] release a
multi-label text classification dataset based on the EUR-LEX
portal<sup>9</sup>. Leitner et al. [16] develop a dataset consisting of
German court decisions annotated with 19 entity types
(person, judge, lawyer, ordinance, court decision, etc.)
and examine, among others, CRFs for entity
extraction. Glaser et al. [17] introduce a dataset of 100k
German court rulings with short summaries to study the
performance of text summarization systems. Wrzalik
and Krechel [18] release a dataset for legal information
retrieval (IR), which is based on case documents from
the Open Legal Data platform [9]. Chalkidis et al. [19]
present FairLex, a multilingual fairness benchmark of
four legal datasets that covers five languages and five
sensitive attributes. They employ FairLex to evaluate
the fairness of pre-trained language models (PLMs) and
the techniques used to fine-tune them. Holzenberger
and Durme [5] introduce the SARA dataset to investigate
the performance of natural language understanding
approaches on statutory reasoning. Waltl et al. [20] present
an automated classification of legal norms with regard to
their semantic type and propose a semantic type
taxonomy for norms in the German civil law domain.</p>
          <p>9 https://eur-lex.europa.eu/</p>
        </sec>
      </sec>
      <sec id="sec-6-2">
        <title>5.2. NLP Approaches in Legal Domain</title>
        <p>Dozier et al. [21] discuss NER and named entity
disambiguation (NED) in legal documents such as US case law,
depositions, pleadings, etc. Glaser et al. [22] evaluate
NER and NED approaches on a manually annotated
German court decisions dataset. Chalkidis et al. [23] apply
sequence labeling techniques to extract core information
from contracts. Large PLMs are usually trained on
generic corpora and tend to underperform in specialized
domains [24, 25]. Chalkidis et al. [2] apply BERT models
[26] to English downstream legal tasks, namely text classification
and sequence labeling, exploring different pretraining
and fine-tuning strategies.</p>
        <p>Andrew [27] uses statistical and rule-based techniques
to extract entities such as names, organizations and roles
and their relations in legal documents. Chen et al. [28]
propose a legal triplet extraction system for drug-related
criminal judgment documents. Hong et al. [29]
perform IE of case factors from a dataset of parole hearings.
Cardellino et al. [30] employ IE in legal texts to recognize
mentions of entities and link them to a structured
knowledge representation<sup>10</sup>. Lüdemann et al. [31] use KGs to
model business entities of multinational companies and
employ them for tax planning strategies.</p>
        <p>10 LKIF ontology: http://www.estrellaproject.org/lkif-core/</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>6. Conclusion and Future Work</title>
      <p>In this work, we investigated extracting relevant key
figures from legislative texts. To this end, we provided a
universally applicable annotation schema together with
a semantic model for key figures and their properties in
legal texts. We successfully applied the schema and the
model to legal texts. Moreover, we presented a dataset
manually annotated by tax experts, which includes 85
annotated paragraphs from 14 different German tax acts
with 157 annotated tax key figures as well as a knowledge
graph populated from these annotated paragraphs based
on our semantic model.</p>
      <p>We evaluated state-of-the-art entity extraction
models on the proposed dataset, facing the challenges of the
low-resource scenario and long entity spans. The
results showed that all models perform well for classes
with low complexity and sufficient training data available.
Nonetheless, for more complex entities the
transformer-based language models significantly outperform the other
models. However, as a limitation, such models also
require a certain amount of training data to achieve
acceptable performance. We further provided a
transformer-based relation extraction approach using typed entity
markers, which has performed very well in our
experiments. Moreover, we introduced task triggers for training
a combined model for entity and relation extraction and
for different groups of entities with overlapping
annotations. We have shown that comparable performance can
be achieved with this combined model as with separately
trained models. Using a combined model saves
computational resources for training and memory resources for
inference.</p>
      <p>We make our dataset together with the semantic model
and the KG, as well as the implementation of the entity
and relation extraction approaches investigated in this
work, publicly available<sup>11</sup>. To showcase our work, we also
provide a simple demonstrator application<sup>12</sup>.</p>
      <p>Future Work. In the future, we also plan to consider
alternative modeling approaches for the entity and
relation extraction task, e.g., span-based classification,
machine reading comprehension, or unsupervised
approaches utilizing large PLMs. Even with the
relation extraction approach used in this work, a more
comprehensive evaluation can be performed by considering
different entity markers and providing more or less
information about the entities, such as the entity types.</p>
      <p>The KGs populated from the extracted key figures
allow, as a next step, comparing the KGs of existing and
new law texts in terms of their key figures. In this future
work, we also plan to evaluate other approaches for
differential analysis and then compare them to the semantic
approach described in this work. The detected changes
then provide the input for an application to predict the
impact of the law change on the expected tax revenue.</p>
      <p>The ontology developed in this work on the basis of
German tax acts can also be applied universally to
other legal fields and languages.</p>
      <p>Acknowledgments</p>
      <p>The authors acknowledge the financial support by the
German Federal Ministry of Finance in the project "KISS
- KI-gestütztes System zur Steueranalyse".</p>
      <p>11 https://github.com/danielsteinigen/nlp-legal-texts</p>
      <p>12 https://huggingface.co/spaces/danielsteinigen/NLP-Legal-Texts</p>
      <p>
sources and Evaluation Conference, European Language Resources Association, Marseille, France, 2020, pp. 4478–4485. URL: https://aclanthology.org/2020.lrec-1.551.</p>
      <p>[17] I. Glaser, S. Moser, F. Matthes, Summarization of German court rulings, in: Proceedings of the Natural Legal Language Processing Workshop 2021, Association for Computational Linguistics, Punta Cana, Dominican Republic, 2021, pp. 180–189. URL: https://aclanthology.org/2021.nllp-1.19. doi:10.18653/v1/2021.nllp-1.19.</p>
      <p>[18] M. Wrzalik, D. Krechel, GerDaLIR: A German dataset for legal information retrieval, in: Proceedings of the Natural Legal Language Processing Workshop 2021, Association for Computational Linguistics, Punta Cana, Dominican Republic, 2021, pp. 123–128. URL: https://aclanthology.org/2021.nllp-1.13. doi:10.18653/v1/2021.nllp-1.13.</p>
      <p>[19] I. Chalkidis, T. Pasini, S. Zhang, L. Tomada, S. Schwemer, A. Søgaard, FairLex: A multilingual benchmark for evaluating fairness in legal text processing, in: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Dublin, Ireland, 2022, pp. 4389–4406. URL: https://aclanthology.org/2022.acl-long.301. doi:10.18653/v1/2022.acl-long.301.</p>
      <p>[20] B. Waltl, G. Bonczek, E. Scepankova, F. Matthes, Semantic types of legal norms in german laws: classification and analysis using local linear explanations, Artificial Intelligence and Law 27 (2019) 43–71. doi:10.1007/s10506-018-9228-y.</p>
      <p>[21] C. Dozier, R. Kondadadi, M. Light, A. Vachher, S. Veeramachaneni, R. Wudali, Named Entity Recognition and Resolution in Legal Text, Springer Berlin Heidelberg, Berlin, Heidelberg, 2010, pp. 27–43. URL: https://doi.org/10.1007/978-3-642-12837-0_2. doi:10.1007/978-3-642-12837-0_2.</p>
      <p>[22] I. Glaser, B. Waltl, F. Matthes, Named entity recognition, extraction, and linking in german legal contracts, in: IRIS: Internationales Rechtsinformatik Symposium, 2018, pp. 325–334.</p>
      <p>[23] I. Chalkidis, M. Fergadiotis, P. Malakasiotis, I. Androutsopoulos, Neural contract element extraction revisited, in: Workshop on Document Intelligence at NeurIPS 2019, 2019. URL: https://openreview.net/forum?id=B1x6fa95UH.</p>
      <p>[24] J. Lee, W. Yoon, S. Kim, D. Kim, S. Kim, C. H. So, J. Kang, BioBERT: a pre-trained biomedical language representation model for biomedical text mining, Bioinformatics 36 (2019) 1234–1240. URL: https://doi.org/10.1093/bioinformatics/btz682. doi:10.1093/bioinformatics/btz682.</p>
      <p>[25] I. Beltagy, K. Lo, A. Cohan, SciBERT: A pretrained language model for scientific text, in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Association for Computational Linguistics, Hong Kong, China, 2019, pp. 3615–3620. URL: https://aclanthology.org/D19-1371. doi:10.18653/v1/D19-1371.</p>
      <p>[26] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Association for Computational Linguistics, Minneapolis, Minnesota, 2019, pp. 4171–4186. URL: https://aclanthology.org/N19-1423. doi:10.18653/v1/N19-1423.</p>
      <p>[27] J. J. Andrew, Automatic extraction of entities and relation from legal documents, in: Proceedings of the Seventh Named Entities Workshop, Association for Computational Linguistics, Melbourne, Australia, 2018, pp. 1–8. URL: https://aclanthology.org/W18-2401. doi:10.18653/v1/W18-2401.</p>
      <p>[28] Y. Chen, Y. Sun, Z. Yang, H. Lin, Joint entity and relation extraction for legal documents with legal feature enhancement, in: Proceedings of the 28th International Conference on Computational Linguistics, International Committee on Computational Linguistics, Barcelona, Spain (Online), 2020, pp. 1561–1571. URL: https://aclanthology.org/2020.coling-main.137. doi:10.18653/v1/2020.coling-main.137.</p>
      <p>[29] J. Hong, D. Chong, C. Manning, Learning from limited labels for long legal dialogue, in: Proceedings of the Natural Legal Language Processing Workshop 2021, Association for Computational Linguistics, Punta Cana, Dominican Republic, 2021, pp. 190–204. URL: https://aclanthology.org/2021.nllp-1.20. doi:10.18653/v1/2021.nllp-1.20.</p>
      <p>[30] C. Cardellino, M. Teruel, L. A. Alemany, S. Villata, A low-cost, high-coverage legal named entity recognizer, classifier and linker, in: Proceedings of the 16th Edition of the International Conference on Artificial Intelligence and Law, ICAIL ’17, Association for Computing Machinery, New York, NY, USA, 2017, pp. 9–18. URL: https://doi.org/10.1145/3086512.3086514. doi:10.1145/3086512.3086514.</p>
      <p>[31] N. Lüdemann, A. Shiba, N. Thymianis, N. Heist, C. Ludwig, H. Paulheim, A knowledge graph for assessing agressive tax planning strategies, in: J. Z. Pan, V. Tamma, C. d’Amato, K. Janowicz, B. Fu, A. Polleres, O. Seneviratne, L. Kagal (Eds.), The Semantic Web – ISWC 2020, Springer International Publishing, Cham, 2020, pp. 395–410.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Spain</surname>
          </string-name>
          (Online),
          <year>2020</year>
          , pp.
          <fpage>6788</fpage>
          -
          <lpage>6796</lpage>
          . URL: https:
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          //aclanthology.org/
          <year>2020</year>
          .coling-main.
          <volume>598</volume>
          . doi:10. [1]
          <string-name>
            <surname>Namysł</surname>
          </string-name>
          , Marcin, Robust Information Extrac-
          <volume>18653</volume>
          /v1/
          <year>2020</year>
          .coling-main.
          <volume>598</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>tion From Unstructured Documents</surname>
          </string-name>
          ,
          <string-name>
            <surname>Ph.D.</surname>
            the- [9]
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Ostendorf</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Blume</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Ostendorf</surname>
          </string-name>
          , Towards
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Bonn</surname>
          </string-name>
          ,
          <year>2023</year>
          . URL: https://hdl.handle.
          <source>net/20</source>
          .500. ceedings of the ACM/IEEE Joint Conference on
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <volume>11811</volume>
          /10560. Digital Libraries in
          <year>2020</year>
          , JCDL '20,
          <string-name>
            <surname>Association</surname>
            <given-names>for</given-names>
          </string-name>
          [2]
          <string-name>
            <given-names>I.</given-names>
            <surname>Chalkidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fergadiotis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Malakasiotis</surname>
          </string-name>
          , N. Ale- Computing
          <string-name>
            <surname>Machinery</surname>
          </string-name>
          , New York, NY, USA,
          <year>2020</year>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>tras</surname>
          </string-name>
          , I. Androutsopoulos, LEGAL-BERT:
          <article-title>The mup</article-title>
          - p.
          <fpage>385</fpage>
          -
          <lpage>388</lpage>
          . URL: https://doi.org/10.1145/3383583.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <article-title>pets straight out of law school</article-title>
          , in: Findings 3398616. doi:
          <volume>10</volume>
          .1145/3383583.3398616.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <article-title>of the Association for Computational Linguistics</article-title>
          : [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Conneau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Khandelwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          , V. Chaud-
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <source>EMNLP</source>
          <year>2020</year>
          ,
          <article-title>Association for Computational Lin- hary</article-title>
          , G. Wenzek,
          <string-name>
            <given-names>F.</given-names>
            <surname>Guzmán</surname>
          </string-name>
          , E. Grave, M. Ott,
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>guistics</surname>
          </string-name>
          , Online,
          <year>2020</year>
          , pp.
          <fpage>2898</fpage>
          -
          <lpage>2904</lpage>
          . URL: https:// L. Zettlemoyer,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          , Unsupervised cross-
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          aclanthology.org/
          <year>2020</year>
          .findings-emnlp.
          <volume>261</volume>
          . doi: 10.
          <article-title>lingual representation learning at scale</article-title>
          , in: Pro-
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <volume>18653</volume>
          /v1/
          <year>2020</year>
          .findings-emnlp.
          <source>261. ceedings of the 58th Annual Meeting of the Associa</source>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Palmirani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Governatori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rotolo</surname>
          </string-name>
          , S. Tabet, tion for Computational Linguistics, Association for
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>H.</given-names>
            <surname>Boley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Paschke</surname>
          </string-name>
          , Legalruleml:
          <article-title>Xml-based rules Computational Linguistics</article-title>
          , Online,
          <year>2020</year>
          , pp.
          <fpage>8440</fpage>
          -
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <article-title>and norms</article-title>
          .,
          <source>RuleML America</source>
          <volume>7018</volume>
          (
          <year>2011</year>
          )
          <fpage>298</fpage>
          -
          <lpage>312</lpage>
          . 8451. URL: https://aclanthology.org/
          <year>2020</year>
          .acl-main.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <source>doi:10</source>
          .1007/978-3-
          <fpage>642</fpage>
          -24908-2_
          <fpage>30</fpage>
          . 747. doi:
          <volume>10</volume>
          .18653/v1/
          <year>2020</year>
          .acl-main.
          <volume>747</volume>
          . [4]
          <string-name>
            <given-names>J. Moreno</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Rehm</surname>
          </string-name>
          , E. Montiel-Ponsoda, [11]
          <string-name>
            <given-names>I.</given-names>
            <surname>Beltagy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. E.</given-names>
            <surname>Peters</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cohan</surname>
          </string-name>
          ,
          <article-title>Longformer: The long-document transformer</article-title>
          ,
          <year>2020</year>
          . URL: https://arxiv.org/abs/2004.05150. doi:10.48550/ARXIV.2004.05150.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>V.</given-names>
            <surname>Rodríguez-Doncel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Martín-Chozas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Navas-Loro</surname>
          </string-name>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>M.</given-names>
            <surname>Kaltenböck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Revenko</surname>
          </string-name>
          , S. Karampatakis,
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>C.</given-names>
            <surname>Sageder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gracia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Maganza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Kernerman</surname>
          </string-name>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <given-names>D.</given-names>
            <surname>Lonke</surname>
          </string-name>
          ,
          <article-title>Lynx: A knowledge-based ai service plat-</article-title>
          [12]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sagen</surname>
          </string-name>
          ,
          <article-title>Large-Context Question Answering with</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <article-title>ysis for the legal domain</article-title>
          ,
          <source>Information Systems</source>
          <volume>106</volume>
          (
          <year>2022</year>
          ) 101966. URL: https://www.sciencedirect.com/science/article/pii/S0306437921001563. doi:https://doi.org/10.1016/j.is.2021.101966.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          University, Department of Information Technology,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>W.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>An improved baseline for sentence-level relation extraction</article-title>
          , in: Proceedings of the 2nd Conference of the Asia-Pacific Chap-
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>N.</given-names>
            <surname>Holzenberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. V.</given-names>
            <surname>Durme</surname>
          </string-name>
          ,
          <article-title>Factoring statutory</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          in:
          <string-name>
            <given-names>C.</given-names>
            <surname>Zong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Navigli</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing</source>
          , ACL/IJCNLP 2021, (Volume
          <volume>1</volume>
          : Long Papers), Virtual Event, August 1-6,
          <year>2021</year>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          Association for Computational Linguistics,
          <year>2021</year>
          , pp.
          <fpage>2742</fpage>
          -
          <lpage>2758</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          URL: https://doi.org/10.18653/v1/2021.acl-long.213. doi:10.18653/v1/2021.acl-long.213.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          <source>tics and the 12th International Joint Conference on Natural Language Processing</source>
          (Volume
          <volume>2</volume>
          : Short Papers),
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          Association for Computational Linguistics, Online only,
          <year>2022</year>
          , pp.
          <fpage>161</fpage>
          -
          <lpage>168</lpage>
          . URL: https://aclanthology.org/2022.aacl-short.21.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>I.</given-names>
            <surname>Chalkidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Androutsopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Michos</surname>
          </string-name>
          ,
          <article-title>Extracting contract elements</article-title>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          in:
          <source>Proceedings of the 16th Edition of the International Conference on Artificial Intelligence and Law</source>
          , ICAIL '17,
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          Association for Computing Machinery, New York, NY, USA,
          <year>2017</year>
          , p.
          <fpage>19</fpage>
          -
          <lpage>28</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          URL: https://doi.org/10.1145/3086512.3086515. doi:10.1145/3086512.3086515.
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.-C.</given-names>
            <surname>Klie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bugert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Boullosa</surname>
          </string-name>
          , R. E. de Castilho,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          <article-title>The inception platform: Machine-assisted and knowledge-oriented interactive annotation</article-title>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          <source>in: Proceedings of the 27th International</source>
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          <source>Conference on Computational Linguistics: System Demonstrations</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>5</fpage>
          -
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Gomez</surname>
          </string-name>
          , Ł. Kaiser,
          <string-name>
            <given-names>I.</given-names>
            <surname>Polosukhin</surname>
          </string-name>
          ,
          <article-title>At-</article-title>
          <source>mation processing systems</source>
          <volume>30</volume>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>B.</given-names>
            <surname>Chan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schweter</surname>
          </string-name>
          , T. Möller,
          <article-title>German's next language model</article-title>
          , in: Proceedings of the 28th International Conference on Com-
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          <source>tee on Computational Linguistics</source>
          , Barcelona,
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>I.</given-names>
            <surname>Chalkidis</surname>
          </string-name>
          , E. Fergadiotis,
          <string-name>
            <given-names>P.</given-names>
            <surname>Malakasiotis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Androutsopoulos</surname>
          </string-name>
          ,
          <article-title>Large-scale multi-label text classification on EU legislation</article-title>
          , in: Proceedings of the 57th Annual Meeting of the Association for
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          putational Linguistics, Florence, Italy,
          <year>2019</year>
          , pp.
          <fpage>6314</fpage>
          -
          <lpage>6322</lpage>
          . URL: https://aclanthology.org/P19-1636. doi:10.18653/v1/P19-1636.
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>E.</given-names>
            <surname>Leitner</surname>
          </string-name>
          , G. Rehm,
          <string-name>
            <given-names>J.</given-names>
            <surname>Moreno-Schneider</surname>
          </string-name>
          , A dataset nition,
          <source>in: Proceedings of the 12th Language</source>
          Re-
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>