=Paper= {{Paper |id=Vol-3256/km4law1 |storemode=property |title=Extracting Structured Knowledge from Dutch Legal Texts: A Rule-based Approach |pdfUrl=https://ceur-ws.org/Vol-3256/km4law1.pdf |volume=Vol-3256 |authors=Roos M. Bakker,Maaike H.T. de Boer,Romy A.N. van Drie,Daan Vos |dblpUrl=https://dblp.org/rec/conf/ekaw/BakkerBDV22 }} ==Extracting Structured Knowledge from Dutch Legal Texts: A Rule-based Approach== https://ceur-ws.org/Vol-3256/km4law1.pdf
Extracting Structured Knowledge from Dutch Legal
Texts: A Rule-based Approach
Roos M. Bakker1,2,* , Maaike H.T. de Boer1 , Romy A.N. van Drie1 and Daan Vos1
1
  Nederlandse Organisatie voor Toegepast Natuurwetenschappelijk Onderzoek (TNO), Anna van Buerenplein 1, Den
Haag, 2595DA, the Netherlands
2
  Universiteit Leiden, Reuvensplaats 3, Leiden, 2311BE, the Netherlands


                                         Abstract
                                         Legal texts are difficult to interpret, and its interpretation depends on the knowledge and experience of
                                         the legal expert. Formalising interpretations can improve transparency. However, creating formalisations
                                         of legal texts is labour-intensive, and automatically creating them is still a challenge. Previous work
                                         showed that rule-based systems have mixed success on Dutch legal texts. They use complex rule systems
                                         for specific cases, making them hard to compare. Because of the lack of analysis, the success of these
                                         methods is also unclear. In this paper, we propose a new rule-based architecture for detecting the
                                         different roles of Flint frames, a knowledge representation language which aims to be a generic and less
                                         task-dependent language. The rules in this architecture are based on Part-of-Speech tags and universal
                                         dependency tags. Our analysis shows that this combination yields more precise extraction of the roles of
                                         Flint frames than previous methods, and the use of universal dependency tags allows this method to also
                                         be applied to other languages. For further improvement we suggest extending the rules for extracting
                                         the recipient role, add rules for recognising complex relative clauses, and testing this framework on
                                         English legal texts.

                                         Keywords
                                         Information Extraction, Knowledge Modelling, Legal Interpretation Support




1. Introduction
Automatically translating sources of norms, a name that covers a wide variety of legal text,
including legislation, policy guidelines, contracts and doctrine, into formal and computer exe-
cutable interpretations is considered to be challenging. Experts in Natural Language Processing
(NLP) have been working on this topic since the early 2000s, using a wide variety of NLP
techniques to ‘translate’ these sources of norms into (semi) formal knowledge representation
languages (KR) [1, 2, 3]. A recent language is Flint [4]. Flint is aimed to be a more generic
and less task-dependent modeling language that can be used to express the interpretation of
any source of norms. One of the important considerations in Flint is that the perspectives

EKAW’22: Companion Proceedings of the 23rd International Conference on Knowledge Engineering and Knowledge
Management, September 26–29, 2022, Bozen-Bolzano, IT
*
  Corresponding author.
$ roos.bakker@tno.nl (R. M. Bakker); maaike.deboer@tno.nl (M. H.T. d. Boer); romy.vandrie@tno.nl
(R. A.N. v. Drie); daan.vos@tno.nl (D. Vos)
€ https://gitlab.com/calculemus-flint/flintfillers (R. M. Bakker)
 0000-0002-1760-2740 (R. M. Bakker); 0000-0002-2775-8351 (M. H.T. d. Boer)
                                       © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
    CEUR
    Workshop
    Proceedings
                  http://ceur-ws.org
                  ISSN 1613-0073
                                       CEUR Workshop Proceedings (CEUR-WS.org)
of all agent roles are included, as well as that the focus is on the action part of norms. As a
result, the KR can be used in different task contexts using multiple forms of reasoning. We may,
for instance, use the KR as a knowledge base for multidisciplinary teams, as specification for
decision-support systems, to detect anomalies such as conflicts in duties, or to build eServices
assisting citizens.
   In this paper, we propose a new knowledge driven approach, inspired by previous work from
De Maat et al. [3] and Bakker et al. [5]. We use part-of-speech (POS) and dependency tagging,
and then allocate parts of the sentence to the slots in a Flint frame using a new set of rules. We
use universal dependency tags such that our method will be transferable to other languages.
Our approach differs from earlier work in that 1) it can deal effectively with passive sentences;
2) it uses the latest dependency parsing techniques; and 3) that it is applicable on flint frames.
We hypothesise that our approach will improve extraction of the frames and will allow for a
combination of the methods used in Bakker et al. [5].


2. Related Work
Relevant work on extracting information from legal texts using NLP has been done for different
languages, such as Italian [1, 6] and English [2, 7]. In this section, we focus on previous work
on Dutch legal texts. In 2.1 we discuss previous work that improves modeling of Dutch legal
texts using NLP techniques. We use Flint frames as a basis for the extracted information, they
will be discussed in 2.2.

2.1. Rule-based systems for Dutch
In the previous decade, NLP research within the Dutch legal text domain has mainly focused
on syntactic and semantic analysis. One of the first papers is that of Van Gog et al. [8] from
2001. In that paper a tool named OPAL (Object-oriented Parsing and Analysis of Legislation) is
proposed. The goal of this tool is to support the modeling of legislation using noun phrases and
specific patterns in legal sentences. Unified Modeling Language (UML) and Object Constraint
Language (OCL) are used as languages.
   Another paper in that decade uses Web Ontology Language (OWL) as the knowledge rep-
resentation language [3]. The goal of this paper by de Maat et al. is to extract norms, which
have specific subtypes: obligations, rights, application provisions, penalisations, calculations,
delegations and publication provisions. Each sentence of the Dutch law texts is first classified
to one of the subtypes using a pattern matcher and an Machine Learning (ML) classifier. After
classification, frames are extracted using rules and transformed to OWL. These rules are based
on extracted dependencies using Alpino [9]. An example of a rule is that the subject is the agent
of the action.
   Recently, Bakker et al. [5] propose to view the formalisation of a part of the law as a Semantic
Role Labelling task. They compare performance between a transformer-based model based on
the Dutch BERTje [10] and a rule-based method. The results show that the transformer-based
method outperforms the rule-based method on both a small annotated data set with Dutch law
text and on the Dutch Aliens Act. In the discussion it is mentioned that the rule-based method
Table 1
Action role definitions with simplified example of a manually created act frame from the GDPR
 Role                   Definition                                  example
 Act                    Name of the act frame                       collect personal data
 Action                 Action that causes the transition of an     collect
                        object
 Actor                  Agent role that is allowed to perform ac-   processor
                        tion
 Object                 The object acted upon                       personal data
 Recipient              Agent role having a normative relation      data subject
                        with the actor concerning his action
 Precondition           Set of conditions that must be met to al-   personal data are processed lawfully,
                        low the action of the actor                 fairly and in a transparent manner in re-
                                                                    lation to the data subject
 Creating postcondi-    Facts or normative relations created by     controller shall be able to demonstrate
 tion                   action of the actor                         compliance with Art. 5(1) GDPR
 Terminating post-      Facts or normative relations terminated     -
 condition              by action of the actor
 Source                 Reference to the source of the act type,    Art. 5 (1) GDPR
                        including information on version


is not optimised yet, as only a limited number and analysis of the rules is available. The authors
also mention Flint as the formalism in which the semantic roles could be used.

2.2. Flint Frames
Flint [4] is a framework to represent sources of norms. Flint is aimed to be a generic and less
task-dependent modeling language that can be used to express the interpretation of any source
of norms. The novelty of this framework is that the focus is on the action part of norms and
that the perspectives of all agent roles affected by the norms are included. As a result, the KR
can be used in multiple task contexts using multiple forms of reasoning.
   Flint is based on three types of frames to express the interpretation of sources of norms: acts,
facts and duties, as described in [4]. In this paper we focus on act frames, as they are the core of
Flint. An overview of the roles with a definition and example are shown in table 1. The action
is the core of the act frame, and is accompanied by an actor that performs the action, the object
that is acted upon and the recipient that has a relation with the actor concerning the action.
Before the action can be performed there are preconditions, and after the action is performed
there are post-conditions that can be created or terminated. Metadata such as the name of the
act and the original text source are also included in the act frame.


3. Problem Statement
Dutch legal texts are not representative for normal written Dutch. The sentences are longer,
they contain complex adverbial clauses, and are mostly written in passive voice. The following
sentence is a typical example: A return visa can be refused by Our Minister if the foreign national
has not demonstrated by submitting documents that there is an urgent reason that does not allow
postponement of departure.1 If we want to formalise such a sentence to a norm, in our case a
flint frame, we notice that the sentence is in passive voice, it has a large relative clause, and it
contains information of which it is unclear how it fits into the frame.
   Previous work on extracting information from Dutch legal texts shows different approaches
on tackling these challenges using language patterns and rules. A downside of many of these
approaches such as van Gog and van Engers [8], De Maat et al. [3], Bakker et al. [5] is that
they use different KR languages. [8] use UML and OCL, [3] use types of norm frames that are
translated into OWL, and [5] use Flint. This makes the methods harder to compare and apply
in a broader context.
   What previous approaches do have in common is that they make use of tagging, [8] and
[5] use POS-tags, [3] use dependency tags. The sentences are tagged manually in [8], and
automatically in [3, 5] by using Alpino [9] and Pattern [11] respectively. The rules in these
methods are based on the tags. For instance, [5] create the actor of an act frame from the
first Noun Phrase in the sentence, whereas [3] create the actor from the dependency tag subj
(subject). [8] use a different KR language, but take a similar approach where noun phrases are
stored as individuals in an ontology-like model. At the time of their work, in 2001, treebanks
for automatic tagging such as [3] and [5] use were not available yet. As [5] mention, POS-tags
do not contain enough information to capture the complex structure of sentences in legal texts.
Their results show that the performance of their rules was very low, especially on the recipient
role in the flint frame. The cause was too simplistic rules and wrong tags. [3] showed in 2009 that
they can correctly tag complex active sentences using dependency tags. For passive sentences
however, more complex operations are needed and the performance remains unclear. It has to
be noted here that since then, the tagging packages and databases for Dutch have improved.
For [8], analysis of their rules and the performance is not present. Analysis of the rules and
the performance on legal text is sparse in all cases. [5] show an accuracy performance, but an
analysis of the rules and why they don’t work properly is missing. [3] do a short analysis, but
only on one example sentence which is not typical for legal texts.


4. Proposed Solution
In this work, we choose the Flint language for formalisation because of it’s aim to be a generic
and less task-dependent modeling language. This way, our work can also be translated to other
types of texts. To improve and solve the issues stated above, we propose a new architecture and
set of rules.2 The architecture is shown in figure 1 and consists of three steps. First, the law texts
are preprocessed. To make them suitable for tagging, they are split into the smallest element
of the law, often a sentence. For Dutch law source texts, they also need to be extracted from
the xml form in which they are documented. Second, the sentences are POS- and dependency
1
  translated, orginal sentence from de Vreemdelingenwet, artikel 2x-1a: Een terugkeervisum kan worden geweigerd
  door Onze Minister indien de vreemdeling niet door overlegging van documenten aannemelijk heeft gemaakt dat
  sprake is van een dringende reden die geen uitstel van vertrek mogelijk maakt.
2
  The implemented pipeline and rules of this architecture can be found here: https://gitlab.com/calculemus-
  flint/flintfillers/flintfiller-rlb
Figure 1: The architecture to extract flint frames based on three different steps: preprocessing, tagging,
and rules.


tagged. Third, rules are applied to extract the relevant roles to build flint frames. This pipeline
is similar to the work in [5], and we also focus on the main type of Flint frame, the act frame.
The difference is that we combine universal dependency tags with POS-tags, and propose more
detailed rules to recognise the roles of an act frame in the Flint language. We hypothesise that
the combination of POS-tags and dependency tags will allow more detailed rules, which in turn
will produce better quality flint frames.

4.1. Part-of-speech Tags and Dependency Tags
Part-of-speech (POS) and dependency tagging was one of the first modern NLP techniques.
POS-tagging tags grammatical parts of the sentence, not unlike the parsing of a sentence that
children learn in elementary school. An example can be seen in table 23 . We tag the sentences
automatically using spaCy [12], which uses both Alpino and Lassysmall [9, 13], the main
treebanks for Dutch. These libraries are up-to-date and offer improved tagging compared to
taggers used in previous work such as [3, 5]. Below we describe the dependency tags that we
use in our rules, an overview of the complete set and of the POS-tags can be found on the
universal dependencies website4 .

Table 2
Example tagged sentence
    Example sentence        Our Minister may grant an exemption or dispensation from the first and second
                            paragraphs to the alien.
    POS tags                [’NP’, (’Our’, ’PRP$’), (’Minister’, ’NNP’)][’VP’, (’may’, ’MD’), (’grant’, ’VB’)][’NP’,
                            (’an’, ’DT’), (’exemption’, ’NN’), (’or’, ’CC’), (’dispensation’, ’NN’)][’PP’, (’from’,
                            ’IN’)][’NP’, (’the’, ’DT’), (’first’, ’JJ’), (’and’, ’CC’), (’second’, ’JJ’), (’paragraphs’,
                            ’NNS’)][’NP’, (’the’, ’DT’), (’alien’, ’NN’)]
    Dep tags                [(’Our’, ’nmod:poss’), (’Minister’, ’nsubj’), (’may’, ’aux’), (’grant’, ’root’), (’an’, ’det’),
                            (’exemption’, ’obj’), (’or’, ’cc’), (’dispensation’, ’conj’), (’from’, ’case’), (’the’, ’det’),
                            (’first’, ’amod’), (’and’, ’cc’), (’second’, ’conj’), (’paragraphs’, ’obl’), (’to’, ’case’), (’the’,
                            ’det’), (’alien’, ’nmod’), (’.’, ’punct’)]

  We compared the definitions of the roles of flint frames to the definitions of the different
dependency tags and part of speech tags. As [3] described, the dependency tags offer more

3
  All examples in this paper are translated from the Dutch Aliens Act, because of this small differences may occur in
  the tagging and the frames compared to the original Dutch ones.
4
  https://universaldependencies.org/nl/index.html
semantic insight in the words. Another advantage is that by using universal dependency tags,
the method will be transferable to other languages. The POS tags are useful for chunking the
information and for elimination of adverbial clauses. The dependency tags that comply with
the actor are nsubj for active sentences, which is the nominal subject of finite sentences. For
actors in the passive sentences the dependency tag is obl:agent, which is used for prepositional
arguments and adjuncts of a verbal head and for the door-phrase that can be present in passives.
The tags for the object are obj for active sentences: the direct object of verbal heads, and
nsubj:pass for passive sentences, the subject of passive sentences. The tag for a recipient is iobj:
indirect objects that are not introduced by a preposition. Finally, actions can be recognised
by root, the root or main verb, ccomp which are complement clauses that are dependents of a
verb, and xcomp: head of non-finite verbal complements of verb and redicative complements of
non-copula verbs.

4.2. Rules
After the sentences are tagged with POS-tags and dependency tags, we extract the roles of the
flint frames by formulating rules. An overview of the rules is given in pseudocode in algorithm
1. The rules are applied per sentence in the legal text. We first form phrases of the sentence by
using the POS-tags for creating chunks. For instance, in the example shown in table 2 the first
phrase, a noun phrase, is ‘Our Minister’. This way, a role can be multiple words. The first rule
states that if a token in a phrase (a word in for instance a noun phrase) is labeled with nsubj or
obl:agent, the complete phrase will be the actor role of a flint frame. In the example sentence
’Our Minister may grant an exemption or dispensation from the first and second paragraphs
to the alien’ in 2, ’Minister’ is labeled nsubj, and part of the noun phrase ‘Our Minister’. The
second rule determines that a phrase is a object when a token in that phrase is labeled obj of
nsubj:pass. In our example, ’exemption or dispensation’ is labeled obj and together forms a noun
phrase, so this part of the sentence will be the object. The third rule determines the recipient, if
a token is labeled iobj, the phrase will be the recipient. In our example this tag does not occur.
Finally, to get the action phrase, which often consists of multiple verbs, we take all the phrases
with tokens labeled as root, ccomp, or xcomp, and they will form the action of the act frame.
In the final step of the architecture, the roles resulting from these rules are assembled and put
together in a flint frame format, of which we will see more examples in the next section.


5. Analysis and Insights
To get insights in the performance of the architecture and the rule set, we created act frames
using the tags and the rules, and compared them to manually created act frames. We chose
the Alien Act (Vreemdelingenwet) as a use case. We saw that actors, objects, and actions are
recognised correctly in most cases. However, recipients were only recognised correctly in a
few cases. This is best illustrated with some examples. For instance, the following active voice
sentence contains all roles: ‘Our minister may grant an exemption or dispensation from the first
and second paragraphs to the alien5 .’ A domain expert created a manual act from this sentence,

5
    Translated from the Dutch Aliens Act, article 2y-3
Algorithm 1 Rules expressed in pseudocode
  for phrase in sentence do                                                 ◁ e.g. ’Our Minister’
      for token in phrase do                                                     ◁ e.g. ’Minister’
          if token = 𝑛𝑠𝑢𝑏𝑗 or 𝑜𝑏𝑙 : 𝑎𝑔𝑒𝑛𝑡 then
              phrase ← 𝑎𝑐𝑡𝑜𝑟                                      ◁ e.g. ’Our Minister’ -> actor
          else if token = 𝑜𝑏𝑗 or 𝑛𝑠𝑢𝑏𝑗 : 𝑝𝑎𝑠𝑠 then
              phrase ← 𝑜𝑏𝑗𝑒𝑐𝑡                      ◁ e.g. ’exemption or dispensation’ -> object
          else if token = 𝑖𝑜𝑏𝑗 then
              phrase ← 𝑟𝑒𝑐𝑖𝑝𝑖𝑒𝑛𝑡         else if 𝑡𝑜𝑘𝑒𝑛 =root𝑜𝑟ccomp𝑜𝑟xcomp then
               phrase ← 𝑎𝑐𝑡𝑖𝑜𝑛                                      ◁ e.g. ’may grant’ -> action
           end if
       end for
   end for


and we compared this to the automatic act. The resulting frames can be seen in table 5, as well
as two other examples. All roles are correctly detected except the recipient. The minor changes
in the actor and the object are caused by the domain expert adding their contextual knowledge,
this information is not in the sentence. The recipient is not recognised correctly because the
dependency tag for alien was nmod. We also tagged the English version of this sentence to
check if the tagger algorithm was at fault. In English the token was tagged as obl, which is
also not a recipient according to our rules. In the second example in 5 the frames are almost
identical, and a recipient is found. The only mistake here is that ‘him’ refers back to the foreign
national in the first part of the sentence, this cannot be derived with our rules.
   Finally, the third example shows an error in the action. This is due to multiple verb phrases
in the sentence, one in the first part, and one in the conditional clause. This conditional clause
causes also problems with the actor and the object. For this sentence, the domain expert created
the frame only from the first part of the sentence, whereas the rules extract roles from the
complete sentence.
   In our analysis we found few recipients, so we tested with very simple sentences when the
iobj tag occurs at all in Dutch. For simple sentences as ‘I give her a gift’, her was correctly
tagged with the iobj tag, but when we changed it to ‘I grant her a wish’, her is tagged as expl:pv.
This indicates that the recipient role requires more analysis and possibly more complex rules to
be extracted correctly.
   From our first analysis, we can conclude that actors, actions, and objects can be recognised in
active and passive sentences with our rule set. Improvements need to be made for recipients, and
for dealing with long sentences with relative clauses. Finally, implicit roles might be added by a
domain expert, but to add these automatically would be a very hard task for which you would
have to look at the complete law text. Further insights may be gathered from more extensive
experimentation and quantitative evaluation which we did not do due to time constraints. An
interesting experimental setup would be to manually label a set of legal texts with the roles of
the flint frames, and to compare these to the roles assigned automatically with our method.
Table 3
Example act frames created manually and automatically
 frame            Manual act                        Automatic act
 action           may grant                         may grant
 actor            Our Minister of Justice and Se-   Our Minister
                  curity
 object           exemption for exceeding the pe-   exemption or dispensation
                  riod of validity of the return
                  visa
 recipient        alien
 source text                                        Our Minister may grant an exemption or dispensation
                                                    from the first and second paragraphs to the alien.
 frame            Manual act                        Automatic act
 action           assigned                          assigned
 actor            Our Minister                      Our Minister
 object           a counselor                       a counselor
 recipient        the foreign national              him
 source text                                        At the request of the foreign national, a counselor is as-
                                                    signed to him by Our Minister.
 frame            Manual act                        Automatic act
 action           refused                           refused allow
 actor            Our Minister                      Our Minister urgent reason
 object           A return visa                     a return visa postponement
 recipient
 source text                                        A return visa can be refused by Our Minister if the foreign
                                                    national has not demonstrated that there is an urgent
                                                    reason that does not allow postponement of departure.


6. Conclusion and Discussion
Legal texts are difficult to interpret, formalising them can improve transparency. However,
creating formalisations of legal texts is labour-intensive, and automatically creating them is
still a challenge. Previous work showed that rule-based systems have mixed success on Dutch
legal texts [8, 3, 5]. This has multiple reasons. First of all, they focus on different types of
formalisation languages, making them hard to apply in a broader context and to compare the
performance. Second, they use outdated POS- or dependency tags [8, 3], or use only POS tags
which do not express enough information such as [5]. Third, proper analysis of the rules is
missing. Therefore, we propose a new rule-based architecture for detecting the different roles
of Flint frames. We chose the Flint language for formalisation because of it’s aim to be a generic
and less task-dependent modeling language. This way, our work can also be translated to other
types of texts. Our architecture has three steps: preprocessing the text, tagging the sentences
with POS- and universal dependency tags, and finally applying rules. The rules are all of the
form if a word (token) has a certain dependency tag then the complete phrase will be marked
as one of the roles in the flint frame. The phrase is recognisable by the POS-tags. Finally, for
each sentence that contains an action, a flint frame is build with the roles that were found
in this sentence. In the analysis we saw that the roles actor, object, and action are correctly
recognised in most cases. However, the recipient often was missing, even though it occurred in
a frame created manually by a domain expert of the same sentence. This had several causes:
The legal sentences were too complex, the recipient is often tagged with the broad dependency
tag obl, and sometimes the domain expert would add roles that were not explicitly present in
the sentence based on their contextual knowledge. We also found that complex relative clauses
cause multiple roles to appear in one frame with the current rule set. We can conclude that our
proposed architecture improves the extraction of all but one roles in the flint frame. Because of
the combination of POS- and dependency tags, it is more precise than previous methods. It is
also applicable to other languages because of the use of universal dependency tags. For a more
extensive insight in our method it would be interesting to quantitatively evaluate the results
of this solution against manually created frames in future work. Furthermore, we focused our
analysis on laws, our solution might also be useful for other types of legal texts. For further
improvement we suggest extending the rules for extracting the recipient role, add rules for
recognising relative clauses, and testing this framework on English legal texts.


Acknowledgments
The authors would like to thank the Governmental Advisory Board on Digital Government
(OBDO) and Dutch Ministry of the Interior and Kingdom Relations for the financial support of
our research. Furthermore we would like to thank Tom van Engers and the Norm Engineering
project team, particularly Robert van Doesburg, for their insight and feedback.


References
 [1] E. Francesconi, The “norme in rete” project: Standards and tools for italian legis-
     lation, International Journal of Legal Information 34 (2006) 358–376. doi:10.1017/
     S0731126500001517.
 [2] I. Chalkidis, M. Fergadiotis, P. Malakasiotis, N. Aletras, I. Androutsopoulos, Legal-bert:
     The muppets straight out of law school, arXiv preprint arXiv:2010.02559 (2020).
 [3] E. De Maat, R. Winkels, T. van Engers, Making sense of legal texts, volume 212, Walter de
     Gruyter, 2009.
 [4] R. van Doesburg, T. M. van Engers, Explicit interpretation of the dutch aliens act, in:
     Proceedings of the Workshop on Artificial Intelligence and the Administrative State co-
     located with 17th International Conference on AI and Law, 2019, pp. 27–37.
 [5] R. M. Bakker, R. A. van Drie, M. H. de Boer, R. van Doesburg, T. van Engers, Semantic
     role labelling for dutch law texts, in: Proceedings of the 13th Conference on Language
     Resources and Evaluation (LREC 2022), Marseille, 2022, pp. 448–457.
 [6] R. Brighi, L. Lesmo, A. Mazzei, M. Palmirani, D. P. Radicioni, Towards semantic in-
     terpretation of legal modifications through deep syntactic analysis, in: Proceedings of
     The Twentieth First Annual Conference on Legal Knowledge and Information Systems,
     volume 21, 2008, p. 202.
 [7] S. Shaghaghian, L. Y. Feng, B. Jafarpour, N. Pogrebnyakov, Customizing contextualized
     language models for legal document reviews, in: 2020 IEEE International Conference on
     Big Data (Big Data), IEEE, 2020, pp. 2139–2148.
 [8] R. van Gog, T. M. van Engers, Modeling legislation using natural language processing,
     in: 2001 IEEE International Conference on Systems, Man and Cybernetics. e-Systems and
     e-Man for Cybernetics in Cyberspace, volume 1, IEEE, 2001, pp. 561–566.
 [9] L. Van der Beek, G. Bouma, R. Malouf, G. Van Noord, The alpino dependency treebank, in:
     Computational linguistics in the Netherlands 2001, Brill, 2002, pp. 8–22.
[10] W. de Vries, A. van Cranenburgh, A. Bisazza, T. Caselli, G. van Noord, M. Nissim, Bertje:
     A dutch bert model, arXiv preprint arXiv:1912.09582 (2019).
[11] T. De Smedt, W. Daelemans, Pattern for python, The Journal of Machine Learning Research
     13 (2012) 2063–2067.
[12] M. Honnibal, M. Johnson, An improved non-monotonic transition system for dependency
     parsing, in: Proceedings of the 2015 Conference on Empirical Methods in Natural Language
     Processing, Association for Computational Linguistics, Lisbon, Portugal, 2015, pp. 1373–
     1378. URL: https://aclweb.org/anthology/D/D15/D15-1162.
[13] G. Bouma, G. Van Noord, Increasing return on annotation investment: the automatic
     construction of a universal dependency treebank for dutch, in: Proceedings of the nodalida
     2017 workshop on universal dependencies (udw 2017), 2017, pp. 19–26.