Concept Hierarchy Extraction from Legal Literature

                           Sabine Wehnert                                  David Broneske
                     Otto von Guericke University                    Otto von Guericke University
                        Magdeburg, Germany                              Magdeburg, Germany
                       sabine.wehnert@ovgu.de                          david.broneske@ovgu.de

                             Stefan Langer                                  Gunter Saake
                           Legal Horizon AG                          Otto von Guericke University
                         Magdeburg, Germany                             Magdeburg, Germany
                     stefan.langer@legalhorizon.ag                      gunter.saake@ovgu.de


Abstract

Due to the ever-increasing amount of legal regulations, scholars have become interested in finding ways to capture domain-relevant knowledge and to facilitate navigation in legal text corpora. Furthermore, the contextual nature of legislation requires enhanced semantic capabilities to identify relevant regulations for specific user needs. This work aims at collecting concept hierarchies from German literature in the legal domain, which are then integrated into a knowledge base with multiple clusters, allowing for different perspectives and efficient lookups. Having references to regulations in the leaves of the concept tree and higher levels with an increasingly abstract context, the resulting hierarchies provide the basis for creating legal domain knowledge in German law. Starting with rule-based annotation, we cluster extracted references, given their context features derived from tables of contents and reasons for citing from various textbook formats. We study the expressiveness of the obtained reference context features. Since different authors have their own notion of hierarchy given by the table of contents, we propose a heterogeneous lightweight ontology allowing for the coexistence of similar, yet diverse concept hierarchies to dynamically determine the best fit for a user in a semi-supervised setting. This approach is novel, since state-of-the-art ontologies are conventionally modeled under full integration and in a top-down manner, often not accounting for perspectives in knowledge representation.

Copyright © CIKM 2018 for the individual papers by the papers' authors. Copyright © CIKM 2018 for the volume as a collection by its editors. This volume and its papers are published under the Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 Introduction

Nowadays, enterprises as well as lawyers are facing the challenge of keeping track of an overwhelming number of legal texts from different jurisdictions. Yet, it is their obligation to ensure compliance, so that often manual efforts are made to monitor changes in law. This also means that new developments need to be integrated into already existing knowledge, e.g., if a law is amended and impacts other regulations which are used in a specific scenario, the knowledge needs to be adapted accordingly. There is a need for context-sensitive search and a grouping method which ensures that all relevant documents are retrieved for a specific situation. The natural language processing (NLP) community has made many advances, such as building citation networks [ZK07, WLM16]. Surprisingly, there are few works addressing the extraction of legal concept hierarchies based on implicit semantic relations between legal texts. We define implicit semantic relations as relationships among legal texts which only apply in specific contexts, so that they are not coded as explicit citations within generally applicable regulations. For example, depending on the expertise of a lawyer (i.e., knowledge about implicit semantic
relations), he can use his background to identify connected laws which are important for a specific case.

In this paper, we propose a method to extract information from a large number of textbooks. It can be used to identify contextually relevant texts based on their mentions within literature, providing evidence of a semantic relationship between legal texts depending on their closeness within the resulting concept hierarchy. This form of domain knowledge is modeled in a bottom-up manner, using the references to legal texts in the literature as instances in the bottom levels of the concept hierarchy. Above these instances, descriptive context representations are desired for each respective regulation, which we refer to as reasons for citing. These representations and relationships can be modeled according to the desired expressiveness of the resulting ontology. Winkels et al. show that reasons for citing can be extracted from the sentence referring to the respective regulation, and narrow them down to four relationship categories: selection, application, concluding (denying) and a category for "in relation to" [WBVvS14]. Zhang and Koppaka link relevant legal texts based on reasons for citing and let experts assess their contextual quality [ZK07]. There are works addressing legal text linking based on the information given therein [FMPT10, BDCG+15]. These approaches use explicit citations from within the document itself or its metadata. We choose to use external knowledge from literature to find relationships which cannot be directly detected within these documents. For this, we model relationships among legal texts in a concept hierarchy, founded upon the spatial co-occurrence of their mentions in legal literature.

Our approach is therefore a step in a new direction of legal informatics, because we consider legal literature as a source of concept hierarchies to build domain knowledge. We base our method on the assumption that a (sub-)chapter headline corresponds approximately to the concept described in the section. Furthermore, the cited legal texts in each passage are seen as semantically related to the discussed concept of the respective section. While this assumption does not always hold, especially in cases where authors use creative titles, our studied literature contains descriptive concepts in most section headings.

For the scope of this paper, we establish a connection between legal documents which co-occur in the same chapter, part, section or lower-level subsection. By means of a concept hierarchy, we are able to identify closely related legal texts in the lower parts, as well as those which have a higher distance given only one common concept on a high abstraction level. A limitation of this approach is that we extract and maintain explicit keywords forming a concept. Hence, we do not integrate it into a common understanding of standardized concepts, as it can be encountered in standard ontologies. Having legal textbooks of many different formats and authors as data sources, we expect many contradictions to occur during an attempt to establish mapping rules for a standard axiomatic ontology. Therefore, we follow a different notion of knowledge representation.

Similar to the process of studying law, we aim for a diversity of perspectives within our system, which are chosen depending on the context. Specifically, we are interested in the effects of letting a concept hierarchy remain in its original structure, derived from the table of contents (TOC), and coexist among other similar concept hierarchies belonging to the same cluster. In this work, we show how such an approach can model the contextual application of regulations and how it is able to adapt to user-given feedback. Thus, the contribution of this work is a combination of the following techniques:

• We apply rules to annotate elements in a textbook.

• We access DBpedia knowledge for named entity resolution.

• We form concept hierarchies and evaluate their components.

• We group concept hierarchies with nominal clustering.

• We discuss the use of heterogeneous lightweight ontology clusters for legal texts.

The remainder is structured as follows: Section 2 contains related work regarding concept hierarchy extraction, lightweight ontologies and the formation of clusters. Since our approach is derived from observations of research gaps for our specific use case, we provide a justification of our methods alongside. In Section 3, we describe our method of extracting concept hierarchies from legal literature and the subsequent steps of constructing the domain knowledge. We discuss experimental results in Section 4. Finally, we summarize our findings and outline future research potential.

2 Related Work

We introduce three main aspects regarding our aim of capturing and applying knowledge from textbooks. The concept hierarchy is derived from the inherent structure of a piece of literature. In this section, we first name some alternative approaches to extract concept hierarchies. Second, we provide the background for the formation of our knowledge base, being derived from a heterogeneous ontology. Third, we briefly
outline a clustering method because it provides some further optimization options to control the cluster formation of a heterogeneous ontology.

2.1 Concept Hierarchy Extraction

Concept hierarchies are a means for representing knowledge in a hierarchical manner, having nodes of increasing abstraction per level and things as instances in the leaves of the tree. We intend to represent links between legal texts by shared concepts: The higher a linking node between two instances is located in the concept hierarchy, the more distant the two documents are. There are several approaches for extracting concept hierarchies from unstructured text. Among them, we find rules to detect hyponymy relations based on Hearst Patterns [Hea92], for example to represent legal vocabularies. Eigenvector decomposition is another method for identifying term taxonomies [BDMP06]. Those patterns, however, are not applicable for the use case of linking legal texts. Lexical hyponymies are not suitable for references modeled as instances of the concept hierarchy tree, since the subsumption relation is not based on the vocabulary, but on semantic relatedness gained from textbooks. Kuo et al. [KTH06] propose hierarchical clustering to build concept hierarchies, while the extraction of noun groups is also a valid approach [ROB17].

We examine methods of noun group extraction combined with hierarchical clustering further, and propose a combination of them for concept hierarchy extraction from literature. This approach is based on the assumption that an author captures the topic of a section within its title. In the highest levels of abstraction within our concept hierarchy, we gather elements from the tables of contents (TOC) within literature. Finally, we obtain a coarse- to fine-grained clustering of regulations based on the understanding of the corresponding author, while we assume that the reasons for citing in particular are relevant features justifying the cluster membership of a regulation.

Similar to this work, Günel and Aşlıyan [GA10] describe how to extract concepts from tutoring material in TeX format using domain relevance, entropy and lexical cohesion as inclusion criteria. Wang et al. extract concept hierarchies from textbooks by the TOC and Wikipedia [WLW+15]. We also use the TOC to find local relatedness of regulations given the section title and Wikipedia for Named Entity Resolution. Robin et al. compare two approaches for legal concept hierarchy extraction: hierarchical clustering and the extraction of topical expressions composed of noun groups [ROB17]. Bruckschen et al. populate a legal ontology based on Named Entity Recognition [BNS+10]. In a related field, an approach using syntactic positions, called Formal Concept Analysis, is suggested by Cimiano et al. to extract concept hierarchies [CHS04]. Based on topic modeling, part-of-speech tags and tf-idf weighting, Anoop et al. [AAD16] suggest an unsupervised method for concept hierarchy extraction. A possible drawback of statistical topic modeling methods is the instability of retrieved topics and their keywords if the process is repeated on the same data. Belford et al. propose a method relying on matrix factorization to increase the stability and accuracy of topic models [BMNG18].

In contrast to these implementations, we use a rule-based approach to extract information. Legal applications can benefit from the control over data quality that a system designer has while using rule-based approaches, without compromising on the amount of data. Although some authors deviate from the pattern by using creative headings for didactic purposes, we find very few such cases in our collection of legal literature. We show the results of our approach in Section 4.

2.2 Heterogeneous Legal Ontology

Despite some variation in the style format among the pieces of literature, another major challenge arises from the obtained concept hierarchies themselves: Initially, we obtain standalone hierarchies from each book, and the difference among them is unknown. However, topical overlaps are possible for diversified literature, thus posing a challenge in integrating all concept hierarchies in a non-contradicting manner.

Instead, we capture the contextual character of legal texts. Following the notion of hierarchical ontology clusters proposed in [VC98], we develop the idea of allowing multiple concept hierarchies to coexist without integrating them. Conventionally, one common language and understanding is desired for system architectures whose components access the same domain knowledge. Despite these advantages, for our application such an ontology requires high maintenance efforts resulting from frequent insertions of further knowledge, either by automatically determining valid mappings or checking for logically matching candidates.

In the legal domain, a common requirement is to ensure that all relevant documents are retrieved, thus we optimize for a high recall. This is however challenging when working with natural language, for example when encountering ambiguity, near-synonyms and polysemy. We therefore argue that concepts in legal literature may differ even for equal topics, which is due to different perspectives of the authors and their own interpretation. However, any human regularly overcomes these inconsistencies and ambiguity by either choosing one concept for a narrow but consistent understanding, or by broadening the scope and encompassing multiple sources to avoid omissions of important items, while accessing the most appropriate fit based on a contextual decision criterion. This criterion can be derived from user-provided feedback, for example by marking a document as irrelevant. Then, the concept hierarchy which most likely captures the user need is selected, based on the recomputation of relevance.

Since our intended knowledge base is built in a bottom-up manner, this work is different from axiomatic ontologies. There are legal ontologies available such as ALLOT [BDIPV13] or LKIF [HBDB+07], which are able to encompass multiple legal data sources, but also require alignment of the respective classes. These ontologies are built upon a document standard called Akoma Ntoso [VZ07] and offer many ways of standardized information modeling on the document level and beyond. For our specific use case, we identify two possibilities to achieve our goal: Either an expert maintains contextual information regarding specific applications of laws together in such a standardized ontology (for instance, by using the contextual ontology language C-OWL [BGvH+03]), or there is a system for legal literature covering different scenarios, user categories and jurisdictions, ideally resulting in a complete collection of all regulations needed for a case. Several bottom-up lightweight ontologies for legislative terms and entities exist [BGBI16, ABC+16]. Our knowledge representation differs from these works substantially in terms of the application scenario and extraction method. To the best of our knowledge, there is no approach for the same use case within the legal domain allowing for a fair comparison with our work.

2.3 Concept Hierarchy Clusters

Given a large collection of textbooks, we apply clustering to increase contextuality and to reduce the search space for finding the most applicable concept hierarchy for a context. As a result, many references from different concept hierarchies are merged together. In order to structure the cluster, the distance information given by a hierarchical clustering algorithm can be exploited. For user-centered applications, a semi-supervised clustering method has been proposed by Bade and Nürnberger [BN14]. They introduce must-link-before constraints for clustering algorithms which can be applied to hierarchical agglomerative clustering. Those constraints identify instances to be linked and those which shall remain separate. Different from other works, this method also provides the means to model the hierarchical order of instances without having to define the exact level difference. As a use case for an enforced hierarchy, consider a scenario where a distance between European and national law is desired. After including must-link-before constraints, instances from the specified category are located closer to the reference instance than those which are forced to link on a higher node of the concept tree. The algorithm we use in the scope of this work allows for must-link and cannot-link constraints by defining a relationship between two features [MHAK16]. Due to space limitations, we leave the examination of constraint effects for future work and implement the clustering algorithm without constraints.

3 Concept Extraction for Heterogeneous Ontologies

Following relevant literature and the justification of our method, we outline our approach for building a heterogeneous ontology. In particular, we describe the process of annotating features in textbooks to obtain a contextual representation of the reference by means of concept hierarchy clusters. Figure 1 depicts the workflow.

1. An electronic literature resource is converted into a txt file.

2. The text is preprocessed by performing tokenization, sentence chunking, orthographic coreference resolution, part-of-speech tagging, Roman numeral identification and named entity resolution using web knowledge from DBpedia.

3. Rule-based annotation is applied to match TOC components (Chapter, Part, Subchapter, Subsubchapter), CS components (regulation name REG, DBpedia concept DBp, relationship REL) and references REF.

4. All annotations are extracted into a csv file, resulting in a table of tokens T with their respective annotation features.

5. The file is treated as a lookup table and for each TOC component, boundaries are determined.

6. All references are matched in document order to each TOC component with respect to the different section boundaries. Also, the CS information is retrieved from an extracted annotation file and assigned to the REF.

7. After the feature information has been detected, a flat representation of the concept hierarchy is stored, with one REF instance per line and its TOC and CS feature information.
[Figure 1 (diagram): (1) Book; (2) Preprocessing; (3) Concept Annotation of <TOC>, <CS> and REF; (4) Annotation Extraction into the token table T; (5) Grouping REF by <TOC> Component; (6) Lookup and Matching (REF1 in 1; REF2 in 1, 1.1; REF3 in 1, 1.2); (7) Compose Flat Concept Hierarchy Instances (REF, <CS>, <TOC>); (8) Cluster Concept Hierarchy Instances; (9) Query Knowledge Base, returning the clusters in which REF is found together with the cluster label as context descriptor; (10) Show Semantically Related References, with Feedback.]

Figure 1: Workflow towards a lightweight heterogeneous ontology, used in a query expansion setting.
8. The instances are clustered, using their nominal features.

9. We include a possible use case, where a user searches for context information of a regulation REF1. Here, for REF1, cluster context descriptors C1 and C5 are retrieved. The user decides for C1 and receives references linked to REF1 depending on the data contained in the respective concept hierarchy cluster.

10. A feedback mechanism can be implemented to narrow down relevant references. Different from our idea, Boonchom and Soonthornphisaj use term frequency-based ontology seeds for a legal ontology search task [sBS12]. A similar approach for query expansion using a hierarchical legal knowledge base is by Schweighofer et al. [SG+07]. Yet, their relevance feedback is based on the preferences of other users, unlike our approach focusing only on content.

Selected process steps to obtain the knowledge base are described in more detail in the following. We share more implementation details and program code on GitHub (footnote 1).

3.1 Annotation

Since digital literature is conventionally available in PDF format, making use of formatting information on corresponding TeX files, similar to the approach of Günel and Aşlıyan, can be cumbersome [GA10]. Alternatively, we convert the PDFs into txt files to speed up subsequent preprocessing steps. We use GATE, a widely adopted framework for text processing, to preprocess the text and JAPE grammar rules (footnote 2) to annotate the concept hierarchy elements. For example, based on the pattern of a book publisher for a TOC, we specify matching criteria including orthographic information, Roman numerals and part-of-speech tags (footnote 3). The patterns for reasons for citing are described in Equation (1) and for the respective relationship in Equation (2). There is a trade-off between statistical and rule-based approaches: the former is faster to implement but less accurate, the latter is slower to implement but more accurate. Waltl et al. emphasize the effectiveness of rule-based information extraction due to explicitly applied domain knowledge and suggest this approach as an alternative to machine learning algorithms, since the latter often require a sufficient quality of training data [WBM18]. Regarding the annotation of several elements within a textbook, we define rules suited for the respective elements which we consider as expressive features. We proceed with a description of these rules for TOCs, reasons for citing and regulations.

1 https://github.com/anybass/HONto
2 https://gate.ac.uk/sale/tao/splitch8.html
3 We use the German german-hgc.tagger from the Stanford parser, https://nlp.stanford.edu/software/tagger.shtml
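
Steps 5 to 7 of the workflow (determining TOC boundaries, matching references to TOC components in document order, and composing the flat representation) can be illustrated with a minimal Python sketch. It is not the GATE/JAPE implementation used in this work: the heading pattern (purely numeric headings such as "1.1"), the reference pattern (citations of the form "§ 433 BGB"), the function name flat_hierarchy and the sample text are assumptions made for illustration only, and the CS features are omitted.

```python
import re
from typing import Dict, List, Tuple

# Assumption: headings look like "1", "1.1" or "1.2.3" followed by a title.
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\s+\S")
# Assumption: references look like "§ 433 BGB" (section number plus statute name).
REFERENCE = re.compile(r"§\s*\d+[a-z]?\s+[A-ZÄÖÜ][A-Za-zÄÖÜäöü]*")


def flat_hierarchy(lines: List[str]) -> List[Tuple[str, List[str]]]:
    """Return one (reference, TOC path) row per reference, in document order."""
    open_headings: Dict[int, str] = {}  # level -> heading currently in scope
    rows: List[Tuple[str, List[str]]] = []
    for line in lines:
        heading = HEADING.match(line.strip())
        if heading:
            number = heading.group(1)
            level = number.count(".") + 1
            # A new heading is the boundary that closes every open heading
            # on the same or a deeper level.
            for closed in [lv for lv in open_headings if lv >= level]:
                del open_headings[closed]
            open_headings[level] = number
            continue
        # Body line: attach each reference to all headings still in scope.
        path = [open_headings[lv] for lv in sorted(open_headings)]
        for ref in REFERENCE.findall(line):
            rows.append((ref, path))
    return rows


# Hypothetical sample text, not taken from the evaluated textbooks.
SAMPLE = [
    "1 Kaufrecht",
    "Der Kaufvertrag ist in § 433 BGB geregelt.",
    "1.1 Mängel",
    "Siehe § 434 BGB.",
    "1.2 Verjährung",
    "Siehe § 438 BGB.",
]
```

Calling flat_hierarchy(SAMPLE) yields one row per reference: the first reference is attached to component 1 only, the second to 1 and 1.1, and the third to 1 and 1.2. Storing the full path per reference keeps the hierarchy recoverable from a flat table, so that two references can later be compared by the length of their shared path prefix.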
3.1.1 Table of Contents (TOC)

Depending on the publisher, a table of contents manifests itself in various styles. From numeral-only versions to mixed alphabetic, Roman numeral and numeric variations, we define separate rules to capture each distinct heading element including its level in the context of the table of contents. Despite the efforts in rule definition, there are not many substantial variations within each publishing style, so that minor inconsistencies may be captured by generalization from seen examples. Waltl et al. combine the advantages of rule-based approaches with those of machine learning techniques because domain knowledge can be directly incorporated into the training phase to obtain more control over results [WBM18]. However, training an annotation classifier is out of the scope of this work and remains a potential future optimization task. After annotation, we export the TOC features. Based on the detected elements, we determine the boundaries for each level of the TOC hierarchy to store the respective references contained per part, subchapter and subsubchapter.

3.1.2 Reasons for Citing (RFC) and Relationships (REL)

Each sentence with a reference to a legal text potentially contains information about the rationale of this citation, which serves as a contextual summary. We divide the citation summary CS into the regulation name REG, the reason for citing RFC (following the notion of an entity) and its relationship REL with the regulation, captured by verb forms. Extracting the CS serves as feature information for a clustering algorithm. Another application is in connection with a reasoner based on the abstract relationships. Similar to the approach of Winkels et al. [WBVvS14], a model of relationships among legal texts can be derived from textbooks and then be incorporated into the concept hierarchy. In addition, reasons for citing RFC can be considered for the user of a (content-based) legal recommender system as an explanatory component, to be displayed alongside the reference as a context descriptor. We find several pattern varieties proposed for keyphrase extraction and consider them for the RFC [WZH16, Hul03]. While the respective authors analyze English language and capture adjective groups in

    RFC = (NN | NNS | NNP | NNPS | NE | (NN (ADJA | NN)* NN))+    (1)

Due to space limitations, this pattern is a simplified version of the actual one, here only listing candidate part-of-speech (POS) tags using the STTS tagset [STT95]. Our rules account for a variety of possible sentence structures in German natural language. Those patterns which are formulated by using the more expressive JAPE rule syntax are defined with priorities, so that the most restrictive rule is applied first. Likewise, there are patterns for relationship extraction examined by multiple authors [FSE11]. We adapted them to the German language and added negation tags with

    REL = (PTKNEG | V-INF | V-PP | V-FIN)+    (2)

as the simplified relationship pattern REL. In the verb categories we subsume the tags using a hyphen, for example V-INF is a placeholder for VAINF, VVINF and VMINF, which are originally output by the Stanford parser. The relationship feature of the annotation in this case is formed as a concatenation of REL matches within a sentence containing RFC. We adjust the matching rule regarding specific word patterns for important indicators, namely strings indicating contradictions (e.g., in German “Widerspruch”) or selections (e.g., in German “Beispiel”), which cannot be generalized with part-of-speech information. Also, if there is a syntactic indication of a legal term definition (e.g., in German “nach” or “gemäß”) within a law, we fill undetected REL fields with an is-relationship (in German: “ist”). Furthermore, we clean the matches by parsing out non-descriptive strings for a relationship between a reference and its reason for citing (e.g., in German “denke”). This consequently results in sparse relationship features, since the above rules are both specified within sentence boundaries. While we assume that a sentence citing a regulation contains RFC and REL patterns, this is not always the case. For the subsequent steps, we only consider those regulations containing RFC, and optionally REL. Any annotated regulation contained in the document where RFC is missing may not hold enough context information to determine its applicability for the context. Despite this limitation, it shall not have severe consequences in case of a sufficiently large heterogeneous
addition to noun groups as well, there are more dis-         ontology, since other extracted concept hierarchies for
tinctions available for part-of-speech-tags in German        the same context shall cover possible gaps due to the
language. Since including all adjective groups results       highly regularized nature of legislation.
in a larger number of distinct nominal features, we
                                                             3.1.3    Regulations (REG, REF)
limit the pattern to minor sequence variations allow-
ing for attributive adjectives. In our use case, we define   Many scholars have examined methods to extract regu-
the following expression to capture the RFC :                lations from unstructured text [WLM16], often to cre-
ate a citation network based on the references within      node, we summarize the features REG, DBp, RFC,
the original regulation text [WBVvS14]. While cur-         REL for space reasons, however, they are all stand-
rently machine learning approaches remain popular,         alone features. Each element has mandatory values
rule-based methods achieve high precision and recall,      for the Chapter, RFC and Reference. The other fields
as well, which is due to the highly regularized pattern    are optional because we do not assert that the rules
of regulation citation. In German law, there are fixed     return values for each feature.
citation guidelines. Therefore, a sufficiently high pro-       Given the illustrated concept hierarchy in Figure
portion of citations can be detected with rules, with      2, we evaluate the results by setting the Chapter as
precision and recall in the range from 80% to 90%          a class label - thus expecting a reproduction of the
[WLM16]. In addition, legal language contains term         structure of a chapter - and by not including it in the
definitions, which are implicitly referenced by other      features to be processed. As indicated by the arrows,
laws [WLM16]. Those term definitions can be ex-            the test data can match the learned examples by com-
tracted with rules and stored in a Lookup dictionary.      parison of the subfeatures and early merges are an in-
Although it is out of scope of this work, we plan to       dicator for higher similarity between two instances. A
analyze and enrich regulations with legal term defini-     possible limitation of this approach comes from the re-
tions - to be found in other regulations - to gain more    liance on explicitly stated information. For instance,
context information from the knowledge provided in         if the RFC are not indicated within the reference sen-
the data source itself. We considered corner cases in      tence or if they are faulty extracted, this can decrease
reference citations, thus aiming for an improvement        the expressiveness of the features for the desired struc-
of the already high regulation coverage. These cor-        ture. Since the resulting concept hierarchy depends
ner cases include references containing more than two      on the author of the book, his perspective may not
regulations from different sources, and occurrences of     be suitable for any user. Therefore, we see a possi-
connection indicators, in German abbreviated as “i.        ble remedy in the notion of concept hierarchy clusters,
V. m.”. These annotations shall contribute to a rich       forming a heterogeneous lightweight ontology.
knowledge base.
                                                           3.2.1   Concept Hierarchy Clusters.
3.1.4    Access Web Knowledge (DBp)
                                                           Extracting a narrow concept hierarchy with only nom-
Wang et al. suggest in their approach to apply
                                                           inal features leads to a lower probability of getting
web knowledge for identifying concept candidates
                                                           all relevant references for a specific information need.
[WLW+ 15]. We access Wikipedia-based linked open
                                                           Consider the following example: While one book may
data through the DBpedia Spotlight 4 plugin for
                                                           focus on the aspects of national law, another depicts
GATE5 . Unlike their method, we intend the knowl-
                                                           European legislation. In reality, this information needs
edge base to perform named entity resolution directly
                                                           to be considered as a whole, since European legislation
on the citation summary. If a DBpedia entry exists in
                                                           supersedes national law.
the sentence containing a reference, we split the URI to
obtain the concept name as a nominal feature. We ob-          Recalling the discussion from Section 2.2, we show
serve that most matches occur for the regulation or the    how exactly a heterogeneous ontology can serve a user
RFC tokens. There is one frequent misclassification        who is interested in complete, reliable and founded in-
regarding the German Civil Code (BGB), where the           formation. Aside from our experiment of matching
DBpedia lookup yields a swiss political party instead      extracted instances with Chapter labels, an actual ap-
of the civil code, which we manually corrected before      plication of this method is to classify for Relevance
composing the concept hierarchy. After having anno-        instead. Figure 3 illustrates how a heterogeneous on-
tated the nine feature types (Chapter, Part, Subchap-      tology in legal contexts may emerge. In the setting of
ter, Subsubchapter, REG, DBp, RFC, REL, REF ), we          a recommender system, suppose there is a cluster con-
export them from GATE and build the concept hier-          taining two concept hierarchies with sets of instances
archy.                                                     (1, 5, 8) and (1, 2, 4, 8) respectively. In the first sce-
                                                           nario depicted on the left hand side, the recommender
3.2     Compose Concept Hierarchies                        system receives positive user feedback regarding in-
                                                           stance 1. Since this instance is present in the cur-
Figure 2 shows how we compose and evaluate the con-        rent context which is more narrow than other concept
cept hierarchy. In this example, there are two sim-        hierarchy, the context is not altered. In contrast, a
plified concept hierarchies, which are obtained from       similarity function ( A) receives negative feedback for
the JAPE rule-based annotations. In the fictive CS         instance 5 in the second scenario, thus resulting in a
  4 https://www.dbpedia-spotlight.org/                     context switch to the other concept hierarchy without
  5 http://www.semanticsoftware.info/lodtagger             instance 5. There are several approaches for similarity
[Figure 2: tree diagram of two simplified concept hierarchies. Levels from top to bottom: B (Book), C (Chapter), P (Part), S (Subchapter), SS (Subsubchapter), CS (Citation Summary: REG, DBp, RFC, REL), § (reference). The legend distinguishes training data from test data and marks unknown connections and memberships inferred from features.]

                             Figure 2: Structure and Evaluation of a Concept Hierarchy
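
The instance layout behind Figure 2 can be sketched as a small record type. This is an illustrative sketch only, not the authors' export format: the class name `Instance`, the helper `to_features` and the example values are assumptions. It encodes two statements from Section 3.2: Chapter, RFC and Reference are mandatory while the other fields are optional, and the Chapter is withheld from the features because it serves as the class label.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Instance:
    # Mandatory fields according to Section 3.2.
    chapter: str                      # used as class label, not as a feature
    rfc: str                          # reason for citing
    ref: str                          # the reference itself (REF)
    # Optional fields: the rules may not return a value for each of them.
    part: Optional[str] = None
    subchapter: Optional[str] = None
    subsubchapter: Optional[str] = None
    reg: Optional[str] = None
    dbp: Optional[str] = None
    rel: Optional[str] = None


def to_features(instance):
    """Return (features, label): drop missing fields and withhold the Chapter."""
    row = {key: value for key, value in asdict(instance).items() if value is not None}
    label = row.pop("chapter")
    return row, label


# Hypothetical example values, chosen only for illustration.
example = Instance(chapter="(1)", rfc="Widerruf", ref="§ 676 a Abs. 1 Satz 1 BGB", reg="BGB")
features, label = to_features(example)
```

Keeping the features as a flat dictionary of nominal values matches the way the features are described in the text; any concrete clustering library may expect a different input shape.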
adaptation, as investigated by Stober and Nürnberger in [SN11]. In addition, the heterogeneous ontology can also be used for query expansion, as previously pointed out regarding Figure 1.

We find that for a legal recommender system, heterogeneous ontologies - as defined in this work as clusters of concept hierarchies acquired from suitable literature - can indeed fulfill the following desirable functions:

  1. They group semantically related concept hierarchies.
  2. Their clusters allow for efficient lookups, instead of querying the whole ontology.
  3. They are sensitive towards user feedback.
  4. They are as relevant as possible by applying the narrowest context given user feedback constraints.

We conducted some experiments with subsets from the 78 documents (subchapters from three fixed chapters); the results are shown in Section 4.

4     Results

To show the effect of adding knowledge to the heterogeneous lightweight ontology, we evaluate the annotation and perform two experiments. The first experiment applies COBWEB clustering on the features, without knowing the Chapter class label. The second approach is a classifier for the same features, this time using the COBWEB tree. Before we present their results, we describe the experiment setting and evaluation measures.

4.1   Evaluation Setup

The aim of this evaluation is to determine the expressiveness of our selected features to distinguish between abstract concepts. In this work, we intend to show the feasibility of our proposed knowledge extraction and representation method. Therefore, we create clusters of semantically similar concept hierarchies by using the COBWEB algorithm [Fis87]. It is a recursive hierarchical tree algorithm, which learns incrementally from new instances, given four options of incorporating them (creating a new child node, adding to an existing child node, merging two similar child nodes and incorporating the newest instance therein, and splitting a node, so that it becomes a child of the current node) [MHAK16]. We visualize our results by using the Python library concept_formation⁶ by MacLellan et al. [MHAK16]. Instead of incorporating several books, we evaluate this method with respect to the highest-level concepts (i.e., chapter titles) of one comprehensive book. In particular, we used chapters (1), (4) and (8) from Derleder et al. because they were perceived as topically related, while still treating different concepts [DKB08]. For a rich heterogeneous ontology, multiple books need to be taken into account, among which several topical overlaps shall occur to compensate for losses from the extraction process or a different focus of an author. In case of significant overlaps, two concept hierarchies shall be merged.

  ⁶ https://github.com/cmaclell/concept_formation

4.2 Evaluation Measures

Regarding the annotation success, we determine the effectiveness of context feature extraction by computing the average coverage of references REF by RFC annotations. Basically, if a sentence contains a pattern which can be detected by our JAPE rules, there will be an RFC annotation. Since we only considered those regulations whose context features (especially RFC) could be retrieved, this evaluation is important to understand how many data points were the basis for the subsequent steps of clustering and classification.

Our evaluation measure for the supervised clustering experiment is the Adjusted Rand Index (ARI), originally proposed by Hubert and Arabie [HA85]. It quantifies the overlap between two partitioning approaches; in our case, we compare the COBWEB clustering and the class labels (i.e., textbook chapters). Its expected value 0 indicates a random clustering, while a value close to 1 corresponds to a high agreement
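[Figure 3: two panels, each showing an adaptation function (A) above two concept hierarchies with instances (1, 5, 8) and (1, 2, 4, 8). Left panel, "1 = relevant": the current context is the hierarchy (1, 5, 8). Right panel, "5 = irrelevant": the current context is the hierarchy (1, 2, 4, 8).]

    Figure 3: Incorporating user feedback in a cluster of concept hierarchies with an adaptation function (A)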
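
The two scenarios of Figure 3 can be written down as a small decision rule. This is a sketch under assumptions, not the adaptation function of the paper or of [SN11]: the function name `adapt_context` and the tie-breaking rule (prefer the narrowest admissible hierarchy, in line with function 4 of the list in Section 3.2.1) are illustrative choices.

```python
def adapt_context(cluster, current, instance, relevant):
    """Return the index of the concept hierarchy to use after user feedback.

    cluster  -- list of sets, one set of instance ids per concept hierarchy
    current  -- index of the hierarchy that forms the current context
    instance -- id of the instance the user gave feedback on
    relevant -- True for positive feedback, False for negative feedback
    """
    if relevant:
        # Positive feedback: admissible hierarchies contain the instance.
        candidates = [i for i, h in enumerate(cluster) if instance in h]
    else:
        # Negative feedback: admissible hierarchies omit the instance.
        candidates = [i for i, h in enumerate(cluster) if instance not in h]

    # Keep the context if it is still admissible or if nothing else fits.
    if current in candidates or not candidates:
        return current
    # Otherwise switch to the narrowest admissible hierarchy (assumed rule).
    return min(candidates, key=lambda i: len(cluster[i]))


cluster = [{1, 5, 8}, {1, 2, 4, 8}]
after_positive = adapt_context(cluster, current=0, instance=1, relevant=True)
after_negative = adapt_context(cluster, current=0, instance=5, relevant=False)
```

With the sets from the text, positive feedback on instance 1 leaves the context on the first hierarchy, while negative feedback on instance 5 switches to the second one.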
between the resulting clustering and class label partitions. Santos and Embrechts suggest using the ARI for supervised multilabel classification evaluation due to its ability to measure the relationship of two elements instead of the correct class label assignment [SE09]. While we only use one book, we expect an ARI above 0.5 because each chapter contains unique themes and possible overlaps in cited regulations REG. Having heterogeneous ontology clusters, an automatic merging criterion can be applied to achieve clusters of topically related concept hierarchies. Based on the ARI, this merging criterion has been implemented by Pavan et al. to extend k-means clustering [PARR11].

For the classification task, we use average values of precision and recall. Calculating average recall is rather unconventional [GF14]; however, optimizing for a high recall is crucial in the legal domain. Those two measures quantify how well the COBWEB tree is able to infer the correct class membership given the instance features, as shown in Figure 2. In particular, our average precision measures the percentage of correctly identified class members compared to all instances labeled as class members by the algorithm, averaged over the number of runs and all classes. The average recall in our case is defined as the fraction of correctly identified instances of a class compared to all that belong to the respective class, averaged over all runs and classes. Intuitively, a false positive recommendation of a regulation is not as severe as a false negative for the legal domain.

4.3 Evaluation of Annotation

We evaluate our annotation results regarding the number of detected references REF compared to the number of extracted RFC in the chapter, since we require the latter for concept formation. Spiegel-Rosing found descriptive RFC context in 80% of the sentences of scientific texts. We assume that in a German legal textbook, slightly fewer RFC will be detected, due to a different writing style (e.g., more complex syntax and longer sentences). Consequently, our aim for RFC annotation is set to 70% of REF occurrences. Therefore, we define a JAPE rule and annotate the text based on a pattern that is able to detect several citation formats:

      German law: § 676 a Abs. 1 Satz 1 BGB
      German law: Art. 1 und 2 Abs. 1 GG
      European law: 2000 / 46 / EG

In Table 1, we list the number of reference annotations corresponding to the book chapters: (1) Bankvertragliche Grundlagen (English: Foundations of Banking Contracts), (4) Kapitalmarkt- und Auslandsgeschäfte (English: Capital Market and Foreign Transactions), (8) Europäisches Bankenrecht mit Länderabschnitten (English: European Banking Law by Country). Additionally, we indicate the number of RFC and the average percentage of detected RFC from all REF annotations per chapter. The numbers in the column header depict the document number, corresponding to the subchapters of the textbook. We find that almost 75% of the references have an annotation value for RFC. The restrictions we included in our pattern prevent us from extracting the chapter name as a REF. Some references and RFC are missing due to long-range dependencies within the sentence or unwanted headline text insertions at page breaks, but the noise in the text data (e.g., citations of other books in a reference-like format) did not affect the extraction substantially. Nevertheless, all subsequent steps depend on the annotation, so that a loss in this step propagates forward to the clustering and classification task.

4.4 Evaluation of Heterogeneous Legal Ontology

We evaluate our results for the COBWEB clustering algorithm using the extracted Chapter feature as the ground truth class. With the remaining context information starting with the Part feature until the REF feature, the instances are supposed to be grouped by the COBWEB clustering algorithm. In order to show the effect of a successful extraction method, we restricted the instances only to those cases where a value
      Table 1: Evaluation of REF and RFC detection. From three chapters, we analyzed all subchapters.
   (1)  1     2     3     4     5     6     7     8     9                                   Avg. %
   REF  197   40    196   47    41    107   568   131   250
   RFC  170   30    168   37    31    83    385   74    160                                 72

   (4)  50    51    52    53    54    55    56    57    58    59    60    61    62    63    Avg. %
   REF  211   82    1091  283   119   41    82    283   270   483   112   115   164   237
   RFC  158   60    643   232   85    33    70    215   227   400   85    93    111   221   74

   (8)  72    73    74    75    76    77    78                                              Avg. %
   REF  47    90    188   40    67    28    370
   RFC  36    61    147   30    43    16    275                                             73
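
The "Avg. %" column of Table 1 can be recomputed from the counts. The sketch below assumes that the column is the pooled ratio per chapter, i.e. the sum of RFC counts divided by the sum of REF counts; the paper does not state the formula, but this reading reproduces the reported 72, 74 and 73. The names `TABLE_1` and `pooled_coverage` are illustrative.

```python
# REF and RFC counts per subchapter, copied from Table 1.
TABLE_1 = {
    "(1)": {
        "REF": [197, 40, 196, 47, 41, 107, 568, 131, 250],
        "RFC": [170, 30, 168, 37, 31, 83, 385, 74, 160],
    },
    "(4)": {
        "REF": [211, 82, 1091, 283, 119, 41, 82, 283, 270, 483, 112, 115, 164, 237],
        "RFC": [158, 60, 643, 232, 85, 33, 70, 215, 227, 400, 85, 93, 111, 221],
    },
    "(8)": {
        "REF": [47, 90, 188, 40, 67, 28, 370],
        "RFC": [36, 61, 147, 30, 43, 16, 275],
    },
}


def pooled_coverage(ref_counts, rfc_counts):
    """Percentage of references that carry an RFC annotation (pooled over subchapters)."""
    return 100.0 * sum(rfc_counts) / sum(ref_counts)


coverage = {
    chapter: round(pooled_coverage(counts["REF"], counts["RFC"]))
    for chapter, counts in TABLE_1.items()
}
```

Averaging the per-subchapter ratios instead would weight small subchapters more heavily and gives a different value for chapter (1), which is why the pooled reading is assumed here.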


Figure 4: COBWEB clustering with p=2, i=1020 and the ARI evaluation [HA85]

Figure 5: COBWEB clustering with p=2, i=1149 and the ARI evaluation [HA85]

could be retrieved for the Part feature, since this is the most abstract class. To have an equal class distribution, we downsampled the instances of other chapters to match the class with the fewest instances left. This has not been achieved with a random selection; instead, we selected a group of instances which were previously spatially close in the textbook. This has the advantage of not missing important context, as well as limiting the variance in nominal features. For a fair comparison, running the evaluation with different instance groups yielded mostly similar results; however, we observe that more variability leads to less similar examples and thus a lower ARI score.

For the first evaluation shown in Figure 4 with 2 principal components p, 3 Chapters and 1020 instances i of balanced classes, we obtain an Adjusted Rand Index (ARI) of 0.28. Each axis holds one principal component analysis (PCA) dimension to visualize a projection of the cluster shape. In accordance with our expectation, there are three clusters, while each cluster consists of two to three elliptical shapes. The chapter labels in Figure 4 indicate that the algorithm does not have enough information to distinguish between chapter (1) (labeled as B), chapter (4) (labeled as K) and chapter (8) (labeled as E). Many instances, particularly of chapters (4) and (8), are placed in the wrong cluster. From this, we conclude that despite having balanced classes, there may be topical overlaps among the concept hierarchies which shall either result in a merge or are lacking evidence for separate groups. If we allow for a slight class imbalance of the instances by increasing the number of chapter (1) and (4) instances by a comparable amount to a total of 1149, the ARI increases to 0.64, as shown in Figure 5. This also led to a different cluster shape and a better discrimination between the three chapter classes. The improvement can be seen in the classes, where more labels correspond to the cluster membership. It indicates that the clustering approach found more agreement between clusters and the ground truth classes. That observation lets us conclude that additional examples can lead to a higher ARI if they only broaden the feature value space moderately. In previous experiments, we applied the algorithm to all extracted instances, leading to an ARI of 0.05, presumably because of the high variance of instances within a chapter and the differing chapter lengths.
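
For readers who want to reproduce ARI values such as those reported above, the following is a minimal plain-Python sketch of the Adjusted Rand Index [HA85], computed from the contingency table of two labelings. It is not the authors' evaluation code; the function name and the toy labels (B, K, E as in Figures 4 and 5) are illustrative assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same instances."""
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("both labelings must cover the same instances")
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Pair counts within contingency cells, class rows and cluster columns.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / comb(n, 2)   # agreement expected by chance
    maximum = (sum_rows + sum_cols) / 2           # agreement of identical partitions
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single class and a single cluster
    return (sum_cells - expected) / (maximum - expected)


# Toy example: chapter labels versus cluster ids for six instances.
chapters = ["B", "B", "K", "K", "E", "E"]
clusters = [0, 0, 1, 1, 2, 2]
toy_ari = adjusted_rand_index(chapters, clusters)
```

The index depends only on which instances share a group, not on the label names, so cluster ids do not need to be mapped to chapter labels beforehand.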
Since this class imbalance will naturally occur in a heterogeneous ontology, we need to investigate further how the approach scales and what the limitations are regarding the feature diversity.

We perform a second experiment on the same data, but in the classification setting with a COBWEB tree with 10 runs r and 300 training instances num. The result of the classification algorithm is shown in Figures 6 and 7, including 95% confidence intervals for the average precision and recall values. In Figure 6, the confidence intervals span a range of 40 percentage points (pp), indicating an unstable classification result of 80% precision and 87% recall on average after 200 training examples. The effect of adding further examples, illustrated in Figure 7, is similar to that in the previous experiment: it manifests in a gain in precision of about 10pp and a slight increase of 5pp in the average recall score. Please note that the range of the confidence interval is reduced to 20pp for recall and to 10pp for precision, which is a significant improvement of the classifier performance. In summary, the results for the COBWEB algorithm vary depending on the number of examples for each concept hierarchy. A recall of more than 90% is desirable, so that the results from the second setup of each experiment are regarded as sufficient evidence for descriptive features to distinguish between different contexts. We discuss the general applicability of the results.

Figure 6: COBWEB tree with r=10, num=100, i=1020

Figure 7: COBWEB tree with r=10, num=100, i=1149

4.5   Discussion

There is more research potential in the question whether this approach also works for other domain literature, or what happens if other clustering algorithms with advanced capabilities of constraint formulation are chosen. Considering that we used concept hierarchies mostly about general banking law, financial markets and European banking law, the overlap of REG and RFC is considerable. After other books about different subjects are added, those three concept hierarchies may form a cluster. During the concept hierarchy extraction, we found that there are four major limitations of our approach: First, literature resources are needed which cover the information need. Otherwise, users may not find their case represented. Second, for each textbook, there can be a different format of citations or the TOC components. This results in a higher manual effort for rule formulation. Third, since we only had the PDF files of literature available, there were challenges in segmenting the file and assigning references to each section, leading to missing feature values. Fourth, despite having gained much domain information from the textbook, we need to investigate more methods of leveraging it. Since we plan to implement a lightweight heterogeneous ontology, we uncover future research fields in Section 5.

5     Conclusion and Future Work

To conclude, our lightweight heterogeneous ontology is composed of concept hierarchies which are derived from literature. It is a promising area for further work. We pointed out the reasons for accepting coexisting perspectives in the legal domain and gave indications of how to take advantage of many sources, while still controlling the results with constraints and user feedback. The rule-based annotation method provided features for context-aware classification and clustering of the concept hierarchies. Overall, the results indicate that the chosen features, the extraction method and the concept_formation library are suitable for detecting semantic similarity in the book we selected. Regarding future work, we are curious about how this method performs if additional features of the content of referenced regulations and term definitions are taken
into account. Another field to study is the impact of abstract relationship categories on clustering. We see possible applications of the learned ontology in the field of law clustering, legal context search, topic detection and legal recommender systems and intend to explore more about these use cases.

6   Acknowledgements

The authors would like to thank Andreas Nürnberger and the anonymous referees for their valuable comments. The work is supported by Legal Horizon AG, Grant No.: 1704/00082.

References

[AAD16] VS Anoop, S Asharaf, and P Deepak. Unsupervised concept hierarchy learning: a topic modeling guided approach. Procedia Computer Science, 89:386–394, 2016.

     Vojtěch Svátek, and Maarten van Someren, editors, Semantics, Web and Mining, pages 103–120. Springer Berlin Heidelberg, Berlin, Heidelberg, 2006.

[BGBI16] María G. Buey, Angel Luis Garrido, Carlos Bobed, and Sergio Ilarri. The AIS project: Boosting information extraction from legal documents by using ontologies. In Proceedings of the 8th International Conference on Agents and Artificial Intelligence, pages 438–445, 2016.

[BGvH+ 03] Paolo Bouquet, Fausto Giunchiglia, Frank van Harmelen, Luciano Serafini, and Heiner Stuckenschmidt. C-OWL: Contextualizing ontologies. In Dieter Fensel, Katia Sycara, and John Mylopoulos, editors, The Semantic Web - ISWC 2003, pages 164–179, Berlin, Heidelberg, 2003. Springer Berlin Heidelberg.
[ABC+ 16] Gianmaria Ajani, Guido Boella, Luigi Di
     Caro, Livio Robaldo, Llio Humphreys, Sabrina          [BMNG18] Mark Belford, Brian Mac Namee, and
     Praduroux, Piercarlo Rossi, and Andrea Vi-                Derek Greene. Stability of topic modeling via
     olato. The european taxonomy syllabus: A                  matrix factorization. Expert Systems with Ap-
     multi-lingual, multi-level ontology framework             plications, 91:159–169, 2018.
     to untangle the web of european legal termi-          [BN14] Korinna Bade and Andreas Nürnberger. Hier-
     nology. Applied Ontology, 11(4):325–375, 2016.              archical constraints - providing structural bias
[BDCG+ 15] Guido Boella, Luigi Di Caro, Michele                  for hierarchical clustering. Machine Learning,
     Graziadei, Loredana Cupi, Carlo Emilio                      94(3):371–399, 2014.
     Salaroglio, Llio Humphreys, Hristo Konstanti-         [BNS+ 10] Mı́rian Bruckschen, Caio Northfleet,
     nov, Kornel Marko, Livio Robaldo, Claudio                   DM Silva, Paulo Bridi, Roger Granada, Re-
     Ruffini, Kiril Simov, Andrea Violato, and Veli              nata Vieira, Prasad Rao, and Tomas Sander.
     Stroetmann. Linking legal open data: Breaking               Named entity recognition in the legal domain
     the accessibility and language barrier in euro-             for ontology population. In In: 3rd Workshop
     pean legislation and case law. In Proceedings               on Semantic Processing of Legal Texts (SPLeT
     of the 15th International Conference on Arti-               2010), page 16, 2010.
     ficial Intelligence and Law, ICAIL ’15, pages
     171–175, New York, NY, USA, 2015. ACM.                [CHS04] Philipp Cimiano, Andreas Hotho, and Stef-
                                                                fen Staab. Clustering concept hierarchies from
[BDIPV13] Gioele Barabucci, Angelo Di Iorio,                    text. In Proceedings of the Conference on Lex-
     Francesco Poggi, and Fabio Vitali. Integra-                ical Resources and Evaluation (LREC), pages
     tion of legal datasets: From meta-model to                 1721–1724, 2004.
     implementation. In Proceedings of Interna-
                                                           [DKB08] Peter Derleder, Kai-Oliver Knops, and
     tional Conference on Information Integration
                                                                Heinz Georg Bamberger.       Handbuch zum
     and Web-based Applications &#38; Services,
                                                                deutschen und europäischen Bankrecht.
     IIWAS ’13, pages 585:585–585:594, New York,
                                                                Springer Science & Business Media, 2008.
     NY, USA, 2013. ACM.
                                                           [Fis87] Douglas H Fisher. Knowledge acquisition via
[BDMP06] Holger Bast, Georges Dupret, Debapriyo
                                                                  incremental conceptual clustering. Machine
    Majumdar, and Benjamin Piwowarski. Discov-
                                                                  learning, 2(2):139–172, 1987.
    ering a term taxonomy from term similarities
    using principal component analysis. In Markus          [FMPT10] Enrico Francesconi, Simonetta Monte-
    Ackermann, Bettina Berendt, Marko Grobel-                   magni, Wim Peters, and Daniela Tiscornia. In-
    nik, Andreas Hotho, Dunja Mladenič, Giovanni               tegrating a bottom–up and top–down method-
    Semeraro, Myra Spiliopoulou, Gerd Stumme,                   ology for building semantic resources for the
      multilingual legal domain. In Semantic Pro-        [PARR11] Karteeka Pavan, Allam Appa Rao, and A V
      cessing of Legal Texts, pages 95–121. Springer,         Rao. An automatic clustering technique for op-
      2010.                                                   timal clusters. abs/1109.1068:133–144, 09 2011.

[FSE11] Anthony Fader, Stephen Soderland, and            [ROB17] Cécile Robin, James O’Neill, and Paul
      Oren Etzioni. Identifying relations for open            Buitelaar. Automatic taxonomy generation
      information extraction. In Proceedings of the           - A use-case in the legal domain. CoRR,
      conference on empirical methods in natural lan-         abs/1710.01823, 2017.
      guage processing, pages 1535–1545. Association
      for Computational Linguistics, 2011.               [sBS12] Vi sit Boonchom and Nuanwan Soonthorn-
                                                               phisaj. Atob algorithm: an automatic ontology
[GA10] Korhan Günel and Rıfat Aşlıyan. Extracting            construction for thai legal sentences retrieval.
      learning concepts from educational texts in in-          Journal of Information Science, 38(1):37–51,
      telligent tutoring systems automatically. Expert         2012.
      Systems with Applications: An International
      Journal, 37(7):5017–5022, 2010.                    [SE09] Jorge M Santos and Mark Embrechts. On the
                                                               use of the adjusted rand index as a metric for
[GF14] Marian George and Christian Floerkemeier.               evaluating supervised classification. In Inter-
      Recognizing products: A per-exemplar multi-              national Conference on Artificial Neural Net-
      label image classification approach. In Euro-            works, pages 175–184. Springer, 2009.
      pean Conference on Computer Vision, pages
      440–455. Springer, 2014.                           [SG+ 07] Erich Schweighofer, Anton Geist, et al. Legal
                                                               query expansion using ontologies and relevance
[HA85] Lawrence Hubert and Phipps Arabie. Com-                 feedback. In LOAIT, pages 149–160, 2007.
      paring partitions. Journal of classification,
      2(1):193–218, 1985.                                [SN11] Sebastian Stober and Andreas Nürnberger. An
                                                               experimental comparison of similarity adapta-
[HBDB+ 07] Rinke Hoekstra, Joost Breuker, Marcello             tion approaches. In International Workshop on
     Di Bello, Alexander Boer, et al. The lkif                 Adaptive Multimedia Retrieval, pages 96–113.
     core ontology of basic legal concepts. LOAIT,             Springer, 2011.
     321:43–63, 2007.
                                                         [STT95] Anne Schiller, Simone Teufel, and Christine
[Hea92] Marti A Hearst. Automatic acquisition of hy-          Thielen. Guidelines für das tagging deutscher
      ponyms from large text corpora. In Proceed-             textcorpora mit stts. Technical report, Univer-
      ings of the 14th conference on Computational            sitäten Stuttgart und Tübingen, 1995.
      linguistics-Volume 2, pages 539–545. Associa-
      tion for Computational Linguistics, 1992.          [VC98] Pepijn R.S. Visser and Zhan Cui. Heteroge-
                                                               neous ontology structures for distributed archi-
[Hul03] Anette Hulth. Improved automatic keyword               tectures, 1998.
      extraction given more linguistic knowledge. In
      Proceedings of the 2003 conference on Empirical    [VZ07] Fabio Vitali and Flavio Zeni. Towards a
      methods in natural language processing, pages            country-independent data format: the akoma
      216–223. Association for Computational Lin-              ntoso experience. In Proceedings of the V leg-
      guistics, 2003.                                          islative XML workshop, pages 67–86. Florence,
                                                               Italy: European Press Academic Publishing,
[KTH06] Huang-Cheng Kuo, Tsung-Han Tsai, and                   2007.
     Jen-Peng Huang. Building a concept hierar-
     chy by hierarchical clustering with join/merge      [WBM18] Bernhard Waltl, Georg Bonczek, and Flo-
     decision. In Proceedings of the 9th Joint Confer-       rian Matthes. Rule-based information extrac-
     ence on Information Sciences, JCIS 2006, vol-           tion - advantages, limitations, and perspectives.
     ume 2006, 01 2006.                                      Jusletter IT, 02 2018.

[MHAK16] C.J. MacLellan, E. Harpstead, V. Aleven,        [WBVvS14] Radboud Winkels, Alexander Boer, Bart
    and K.R. Koedinger. Trestle: A model of con-             Vredebregt, and Alexander van Someren. To-
    cept formation in structured domains. Advances           wards a legal recommender system. In JURIX,
    in Cognitive Systems, 4:131–150, 2016.                   volume 271, pages 169–178, 2014.
[WLM16] Bernhard Waltl, Jörg Landthaler, and Flo-
    rian Matthes. Differentiation and empirical
    analysis of reference types in legal documents.
    In JURIX, pages 211–214, 2016.
[WLW+ 15] Shuting Wang, Chen Liang, Zhaohui
    Wu, Kyle Williams, Bart Pursel, Benjamin
    Brautigam, Sherwyn Saul, Hannah Williams,
    Kyle Bowen, and C Lee Giles. Concept hierar-
    chy extraction from textbooks. In Proceedings
    of the 2015 ACM Symposium on Document En-
    gineering, pages 147–156. ACM, 2015.
[WZH16] Minmei Wang, Bo Zhao, and Yihua Huang.
     Ptr: Phrase-based topical ranking for auto-
     matic keyphrase extraction in scientific publi-
     cations. In International Conference on Neu-
     ral Information Processing, pages 120–128.
     Springer, 2016.
[ZK07] Paul Zhang and Lavanya Koppaka. Semantics-
      based legal citation network. In Proceedings
      of the 11th international conference on Artifi-
      cial intelligence and law, pages 123–130. ACM,
      2007.