Constructing a Knowledge Graph from Indian Legal
Domain Corpus
Sarika Jain1, Pooja Harde1, Nandana Mihindukulasooriya2, Sudipto Ghosh3,
Abhinav Dubey3 and Ankush Bisht3
1 National Institute of Technology Kurukshetra, India
2 IBM Research, Dublin, Ireland
3 University of Delhi, India


                                         Abstract
                                         While being an important pillar of human society, the legal domain consists of large corpora of complex
                                         documents covering different aspects, such as laws and court judgements. In recent years, knowledge graphs
                                         have become a prominent solution for representing such complex information in a semantically rich,
                                         machine-readable manner that downstream AI-powered applications can access. In this work, we aim to
                                         construct a reliable knowledge graph from a legal domain corpus that may be utilized by researchers and
                                         application developers working in the legal domain. The source dataset is a corpus of Indian court
                                         judgements, and NyOn1 (Nyaya Ontology) has been utilized for conceptualization. A framework
                                         consisting of entity extraction, relation extraction, and triple construction is used to convert the legal
                                         text into RDF triples. The knowledge graph thus built has been quantitatively evaluated over a small
                                         random sample of five documents, where entity extraction shows high precision but limited recall.

                                         Keywords
                                         Knowledge Graph Construction, Entity Extraction, Relation Extraction, Legal Domain


1. Introduction
There is a boom in digitizing the legal domain for use cases such as classifying judgments [1, 2, 3],
predicting outcomes [4, 5], question answering [6, 7], finding similarities between judgments, and many
more. Although much data is available in the legal domain, from court judgments to acts and
deeds, its unstructured nature makes it inefficient and costly to process into useful results.
Accessibility and transparency of the key information have always been an issue. There is a
need for a central place to keep this vast knowledge in an interoperable format.
  With recent technological advances, AI applications can now process, understand, and
interpret human language. Knowledge graphs [8] can represent large volumes of knowledge
together with their semantics, providing easy access and structured querying abilities. Furthermore,
when modeled following the Semantic Web standards, they can be used to reason, infer new

1 https://w3id.org/def/NyOnLegalOntology#
Text2KG 2022: International Workshop on Knowledge Graph Generation from Text, co-located with ESWC 2022,
May 30, 2022, Hersonissos, Crete, Greece
Email: jasarika@nitkkr.ac.in (S. Jain); pmharde29@gmail.com (P. Harde); nandana@ibm.com (N. Mihindukulasooriya)
URL: https://sites.google.com/view/nitkkrsarikajain/ (S. Jain)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (http://ceur-ws.org), ISSN 1613-0073
information, or find any inconsistencies in data. Moreover, such knowledge graphs enable
downstream applications such as question answering, dialogue, prediction, and classification
systems.
   In the literature, we find different models for creating a knowledge graph from unstructured
documents. In recent years, neural networks such as convolutional neural networks (CNNs),
recurrent neural networks (RNNs), and sequence-to-sequence (Seq2Seq) models have shown
promising results in natural language processing (NLP) tasks such as sentiment analysis,
information retrieval, and document classification. State-of-the-art Information Extraction (IE)
models use supervised machine learning, which requires a huge amount of labeled training
data. For general open-domain IE, labeled data is publicly available, but domain-specific IE,
such as in the legal domain, requires time-consuming data labeling to create a large amount of
training data. Although much work has been done on IE in the legal domain, the available
datasets are not useful for every region: as the judicial system changes from country to country,
so do the legal terms, facts, artifacts, and laws. Legal datasets are available, for example, in
Chinese [9] and German [10, 1], but they are specific to the law of their region and cannot be
used beyond it. For countries like India, where digitization of the judicial system is an ongoing
process, it is challenging to find a suitable dataset for information extraction. According to [11],
creating annotated data for a large corpus is time-consuming and expensive. Moreover, because
court judgments are typed during hearings, they are noisy, with misspellings, misplaced
punctuation, and non-uniform document structure. On the linguistic side, every judge has
their own vocabulary and interpretation; it is therefore difficult for non-domain experts to
comprehend legal text corpora and to develop a uniformly annotated dataset that satisfies all
the requirements of a system for processing the documents.
   This work aims to explore a viable rule-based approach to systematically extract, annotate, and
store the key entities of the legal domain as a knowledge base. The first step is to extract entities
from the legal corpora using Information Extraction (IE), involving Named Entity Recognition
(NER) and Relation Extraction (RE). The second step is to store, maintain, and update the
retrieved knowledge in a graph database as a set of triples. To provide the schema, we have
developed NyOn1 (Nyaya Ontology) [12], a multilingual modular ontology for legal court judgments.
NyOn has been developed using entities from the Indian Supreme Court judgments available
on indiankanoon.org2 and is currently available in five languages (English, German, French,
Spanish, and Hindi). NyOn serves as the schema for information extraction: it defines the types
of entities and the relations that we need to identify in the legal court judgment documents.
The knowledge graph thus constructed will be used for question answering, with questions
such as:

    • What are the different courts that are referred to in a specific case?
    • Which documents fall under a specific type of jurisdiction?
    • Who were the parties involved in Union Of India vs Ex No. 3192684 W Sep. Virendra
      judgment on 7 January 2020?

    2 https://indiankanoon.org/
Figure 1: Architecture for Knowledge Graph Construction


    • How many judgments were passed in 2002?
    • List all judgments ordered by CJI N V Ramana.
    • What are the names of the witnesses in a specific case?

   The main contributions of this paper are (1) an analysis of existing approaches for information
extraction and knowledge graph construction, and (2) the construction of a knowledge graph for
the Indian Supreme Court judgments and a quantitative evaluation of the proposed framework.
The dataset and the source code are publicly available on GitHub3. The rest of the paper is
organized as follows. Section 2 describes the approach followed for constructing the knowledge
graph from Indian Supreme Court judgments, Section 3 provides the evaluation results, and
Section 4 discusses the related work, focusing on legal knowledge graph construction. Section 5
presents the conclusions derived from the work and ideas for future work.


2. Construction of Knowledge Graph of Indian Supreme Court
   Judgements
In this paper, we use the Indian Supreme Court judgements to create a knowledge graph that
will facilitate downstream applications such as question answering and judgment prediction.
Figure 1 shows the general architecture of the knowledge graph construction approach we
followed. It consists of a four-step pipeline: data preprocessing, entity extraction, relation
extraction, and triple construction. Each of these steps is described in the following
subsections.

2.1. Dataset
The input corpus to our Knowledge Graph construction pipeline contains 44,366 reportable
and non-reportable court judgments published by the Supreme Court of India from 1947 to

   3 https://github.com/semintelligence/Text2KG
2020 in plain text. The documents have been scraped using a Python-based scraper from
IndianKanoon.org, a free-to-use case search engine for Indian legal documents sourced from
various court and tribunal repositories. The dataset comprises court judgments for cases
from various legal subdomains (civil law, criminal law, property law, etc.). All text documents
in the dataset are assumed to be in English.
   Fig. 2 shows a snapshot of one Indian court decision document. We call this part of the
document the caption data, as it contains no sentences. The text highlighted in orange
represents the data, and the text highlighted in cyan represents the entity type. Even though
this data is not in sentence format, it carries most of the information in a legal court decision
document that is useful for knowledge graph creation. In the relation extraction part of
the paper (Section 2.4), we discuss how we identify the relationships between the entities
available in the caption format.

2.2. Data Preprocessing
After selecting the documents, we preprocess the corpus. The GATE4 software (General
Architecture for Text Engineering) already provides language processing tools in ANNIE5
(A Nearly-New Information Extraction System), such as the POS Tagger, the ANNIE Sentence
Splitter, and the ANNIE English Tokenizer. All of these processing tools are preloaded into
the software for data preprocessing.

2.3. Entity Extraction
Entity extraction is performed after data preprocessing. In this step, all the entities that carry
relevant, useful information about the context of the document are identified in the
unstructured data. After the data is scraped from indiankanoon.org2,
rules are written in JAPE (Java Annotation Patterns Engine) for the GATE software, guided by
the NyOn [12] ontology, to recognize the entities in the documents. These rules are supplied
to ANNIE as a JAPE Transducer processing resource, alongside the processing resources
described in the previous subsection, such as the POS tagger, sentence splitter, and tokenizer.
While annotating the entities using the GATE API, we observed that it takes approximately
one hour to annotate 100 documents, as the documents are quite long. One of the rules used
for entity recognition is listed in Table 1.
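
To give a rough feel for what such a rule matches, the following is a minimal Python sketch that approximates the BENCH pattern with a regular expression. It is not the JAPE rule itself and is not part of the authors' pipeline: the caption line, the comma-separated layout of the judge names, and the function name `extract_bench` are all assumptions made for illustration.

```python
import re

# Hypothetical caption line; the exact layout in the real corpus may differ.
CAPTION = "BENCH: B.P. JEEVAN REDDY, SUJATA V. MANOHAR"

# The literal label "BENCH", a colon, then everything up to the end of the line.
BENCH_PATTERN = re.compile(r"^(?P<type>BENCH)\s*:\s*(?P<names>.+)$", re.MULTILINE)


def extract_bench(text):
    """Return (entity_type, span) pairs for BENCH lines in caption text."""
    annotations = []
    for match in BENCH_PATTERN.finditer(text):
        # The label itself is annotated, mirroring the :Type binding in the rule.
        annotations.append(("COURT_OFFICIALS", match.group("type")))
        # Assumption: judge names are separated by commas.
        for name in match.group("names").split(","):
            name = name.strip()
            if name:
                annotations.append(("JUDGE", name))
    return annotations


if __name__ == "__main__":
    for entity_type, span in extract_bench(CAPTION):
        print(entity_type, "|", span)
```

Unlike the JAPE rule, this sketch works on raw characters and has no access to token kinds or POS categories, so it is only a loose analogue.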
   After the data is annotated using the JAPE rules, the corpus is saved with its annotations
preserved in inline XML format. We use a small Python script to convert all the inline XML
documents into standard text documents. The converted text documents are then saved
separately for the next phase of the system, which is relation extraction. The entities
identified in the court decision documents are listed in
Table 2.
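
The conversion from inline XML to plain text can be sketched as follows. This is a minimal illustration using Python's standard `xml.etree.ElementTree` module, not the authors' script; the sample document, its `<doc>` root element, and the function name `strip_annotations` are hypothetical, and real GATE exports typically carry additional attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline-annotated snippet (element names follow the entity labels).
INLINE_XML = (
    '<doc>BENCH: <JUDGE rule="BENCH">B.P. JEEVAN REDDY</JUDGE>, '
    '<JUDGE rule="BENCH">SUJATA V. MANOHAR</JUDGE> of the '
    '<COURT>Gujarat High Court</COURT></doc>'
)


def strip_annotations(xml_text):
    """Return the plain text and the list of (entity_type, span) annotations."""
    root = ET.fromstring(xml_text)
    # itertext() yields all text in document order, ignoring the tags.
    plain_text = "".join(root.itertext())
    # Every element below the root is treated as one entity annotation.
    entities = [
        (element.tag, "".join(element.itertext()))
        for element in root.iter()
        if element is not root
    ]
    return plain_text, entities


if __name__ == "__main__":
    text, entities = strip_annotations(INLINE_XML)
    print(text)
    print(entities)
```

Keeping the entity list next to the plain text means the relation extraction phase does not need to re-parse the XML.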


    4 https://gate.ac.uk/
    5 https://gate.ac.uk/sale/tao/splitch6.html
Figure 2: Court Case Document Snapshot with text in orange highlighting the data and text in cyan
highlighting the entity.


2.4. Relation Extraction
The entities extracted in the previous phase are passed as input to this phase, in which the
relations between the extracted entities are identified. The relations identified between
different entity types (object properties) are listed in Table 3.
  As discussed in Section 2.1, the caption data portion of a document contains no sentences
between the entities, so we use a small script to map the relations between such entities and
the data that has no surrounding text. To determine the type of relation, we refer to the NyOn
Table 1
JAPE Rule for the entity BENCH
                  Rule: BENCH
                  Priority: 20
                  // LHS: the label "BENCH", a colon, then up to two judge names
                  (
                    (
                      {Token.string == "BENCH"}
                    ):Type
                    {Token.string == ":"}
                    (
                      {Token.kind == "word", Token.category != "CC"} |
                      {Token.string == ".", Token.category != "CC"}
                    )+ :Name
                    ({Token.kind == "punctuation"})?
                    ((
                      {Token.kind == "word", Token.string != "judgment"} |
                      {Token.string == ".", Token.string != "judgment"}
                    )+) :Name2
                  )
                  -->
                  // RHS: annotate the label as COURT_OFFICIALS and each name as JUDGE
                  :Type.COURT_OFFICIALS = {kind = "BENCH", rule = "BENCH"},
                  :Name.JUDGE = {rule = "BENCH"},
                  :Name2.JUDGE = {rule = "BENCH"}


ontology, where the relations between the entities are described. NyOn thus serves as the basis
for both entity extraction and relation extraction from the data. The relations between entities
and literal values (datatype properties) are listed in Table 4.
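
A mapping of this kind for caption data can be sketched as a simple lookup from entity type to relation. The sketch below is an illustration under assumptions, not the authors' code: the relation names come from Table 3, the entity labels are normalised to those of Table 2, and the case identifier `case_001`, the sample values, and the function name `map_caption_relations` are hypothetical.

```python
# Relation that links a CASE to each entity type found in its caption,
# following Table 3 (range labels normalised to the Table 2 entity labels).
CASE_RELATIONS = {
    "COURT_OFFICIALS": "hasCourtOfficials",
    "PARTIES": "hasParties",
    "BENCH": "hasBench",
    "COURT": "hasCourt",
    "CASE_TYPE": "caseBelongsTo",
}


def map_caption_relations(case_id, caption_entities):
    """Turn (entity_type, value) pairs from the caption into (s, p, o) tuples."""
    triples = []
    for entity_type, value in caption_entities:
        relation = CASE_RELATIONS.get(entity_type)
        # Entity types without a CASE-level relation in the lookup are skipped.
        if relation is not None:
            triples.append((case_id, relation, value))
    return triples


if __name__ == "__main__":
    # Hypothetical caption annotations for one document.
    caption = [
        ("COURT", "Gujarat High Court"),
        ("DATE", "09/05/1995"),
        ("BENCH", "B.P. JEEVAN REDDY"),
    ]
    for triple in map_caption_relations("case_001", caption):
        print(triple)
```

Because caption entities have no surrounding sentence, the entity type alone decides the relation here; sentence-level relations would need the syntactic context instead.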

2.5. Triple Construction
A knowledge graph consists of triples, where every triple has the form
subject-predicate-object. After extracting the entities and relations from the data, the next
step is to create the triples that form the knowledge graph. The extracted entities and relations
are stored in corresponding lists: $E = \{e_1, e_2, e_3, \ldots, e_n\}$ represents the entity set
and $R = \{r_1, r_2, r_3, \ldots, r_m\}$ represents the relation set. A triple has the form
$T = (e_a, r_k, e_b)$, where $e_a, e_b \in E$ and $r_k \in R$.
   Once the RDF model is constructed, it can be materialized in any RDF serialization, such
as RDF/XML, Turtle, or N-Triples. The triples can then be loaded into an industry-standard
triple store such as Blazegraph, Apache Jena, or Virtuoso and queried through a SPARQL
endpoint. We are also currently working on exposing these triples as Linked Open Data with
dereferenceable HTTP URIs.
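
As an illustration of the serialization step, the following sketch writes (subject, predicate, object) tuples as N-Triples lines using only the Python standard library. It rests on several assumptions: the ontology namespace is the NyOn URL given in the paper's footnote, the instance namespace `http://example.org/resource/` is a placeholder, the property `hasCourt` is assumed to be defined under that ontology namespace, and every object is treated as a resource rather than a literal.

```python
from urllib.parse import quote

NYON = "https://w3id.org/def/NyOnLegalOntology#"  # ontology namespace (footnote 1)
BASE = "http://example.org/resource/"             # hypothetical instance namespace


def to_uri(namespace, local_name):
    """Build an N-Triples IRI term, replacing spaces and escaping other characters."""
    return "<" + namespace + quote(local_name.replace(" ", "_"), safe="") + ">"


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one N-Triples line each."""
    lines = []
    for subject, predicate, obj in triples:
        lines.append(
            f"{to_uri(BASE, subject)} {to_uri(NYON, predicate)} {to_uri(BASE, obj)} ."
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_ntriples([("case_001", "hasCourt", "Gujarat High Court")]))
```

In practice an RDF library would handle literals, datatypes, and the other serializations; this sketch only shows how a triple $(e_a, r_k, e_b)$ becomes one line of output.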
Table 2
Entities used to Annotate Dataset
         Entity Labels                                   Description
           BENCH                        Bench of Judges delivering the judgment
            CASE                               Identifier of the form A v. B
         CASE_TYPE                                    Civil or Criminal
           COURT                                        Judicial Entity
     COURT_DECISION                               Orders in the judgment
    CRIME_VIOLATION                         Instances of crimes and violations
          CUSTODY                         Instances of judicial or police custody
        DOCUMENTS                               Appeal, Petition, FIRs, etc.
          EVIDENCE                                 Weapons, Documents
       JURISDICTION                       Original, Advisory, Appellate, Review
             LAW                          Instances of Acts, IPC, CrPC sections
           PARTIES                        Plaintiffs, Judges, Parties in judgment
        PARTY_TYPE                 Individual, Organization, State, Government, etc
    COURT_OFFICIALS               Legal people involved in CASE (Judge, Solicitor, etc)
          LOCATION         Geographical location of State, District, Village, Place (for Evidences)
            DATE                             Documented and relevant dates

Table 3
Object properties used to annotate relations between entities
                  Relation Labels     Entity Type 1 (domain)    Entity Type 2 (range)
                 hasCourtOfficials             CASE             COURT_OFFICIALS
                     hasParties                CASE                   PARTIES
                   hasPartyType              PARTIES                PARTY_TYPE
                     hasBench                  CASE                    BENCH
                     hasAuthor                 CASE                   AUTHOR
                     hasCourt                  CASE                   COURTS
                      hasActs                  CASE                     ACTS
                   hasEvidences                CASE                 EVIDENCES
                  documentType                 CASE                DOCUMENTS
                  hasJurisdiction             COURT               JURISDICTION
                    hasLocation               COURT                  LOCATION
                                            EVIDENCE                 LOCATION
                        isA            PRECEDENT_CASE                   CASE
                                             AUTHOR                    JUDGE
                  caseBelongsTo                CASE                  CASE_TYPE


3. Evaluation
This section presents a preliminary evaluation of the proposed approach. The evaluation
focuses on two aspects of the knowledge base construction process: named entity extraction
and relation extraction from Indian court judgement documents.
  We have evaluated both the rule-based Named Entity Recognition and Relation Extraction
components of the proposed approach. As there was no academic benchmark for the type
Table 4
Datatype properties used to annotate relations between entities and values
                        Relation Labels           Entity Label         Values
                        hasCaseNumber                CASE              string
                         hasCaseName                 CASE              string
                        hasPartiesName             PARTIES             string
                        hasCourtName               COURTS              string
                     hasCourtOfficialsName      COURT_OFFICIALS        string


of documents in our use case, we have manually created a gold standard with a sample of
documents with the help of human annotators who are domain experts.

3.1. Gold Standard Creation
There were no existing academic benchmarks for testing entity and relation extraction on
Indian court judgment documents. We therefore created a gold standard for evaluating
the performance of our proposed approach to entity extraction and relation extraction. We
randomly selected five documents from the corpus and annotated them for gold entities and
relations with the help of domain experts. The NyOn ontology [12] guided the gold entity types
and gold relations that were annotated. Each document was annotated by two annotators to
allow inter-annotator agreement to be checked, and any conflicts were resolved by a third
domain expert. The annotated documents contained 363 named entities of 10 different class
types and 154 triples of 6 relation types. We plan to increase the size of the gold standard in
future work.

Metrics: As evaluation metrics, we use the standard precision, recall, and F1 measures. We
have a set of gold annotations and the machine-generated output from our rule-based pipeline
for both entities and relations. Table 5 shows some example output of the rule-based NER,
including correctly and incorrectly identified entities. In this context, precision is
calculated as the fraction of the machine-generated entities/relations that are correct
according to the gold annotations, and recall as the fraction of the gold entities/relations
that are identified in the machine output.
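
The metrics can be computed from three counts per document or type: the number of gold items, the number of items the system identified, and the number of identified items that are correct. The sketch below is a minimal illustration (the function name `precision_recall_f1` is ours); F1 is taken as the usual harmonic mean of precision and recall, and the example counts are those reported for document 1592674 in Table 6.

```python
def precision_recall_f1(gold_total, identified, correct):
    """Compute precision, recall, and F1 from raw counts."""
    # Guard against division by zero when nothing was identified or annotated.
    precision = correct / identified if identified else 0.0
    recall = correct / gold_total if gold_total else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Counts for document 1592674 in Table 6: 124 gold, 45 identified, 39 correct.
    p, r, f1 = precision_recall_f1(gold_total=124, identified=45, correct=39)
    print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

For these counts, precision is 39/45, recall is 39/124, and F1 is 78/169; small differences from the tabulated values can arise from rounding versus truncation.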

3.2. Named Entity Recognition
Table 6 shows the NER results at the level of individual documents, and Table 7 shows the
results for each entity type. We notice that our rules identify entities with high precision but
suffer from low recall. Furthermore, recall varies widely across documents as well as across
entity types. Some types, such as dates, have higher recall, while others, such as court
decisions or laws, have lower recall. Based on these results, we are performing error analysis
and working on improving the recall of our rules to capture most of the relevant entities.
Table 5
An example snippet of rule-based NER output
     Entity text span                                                  Entity Type    Correct
     ASSTT. GEN. MANAGER CENTRAL BANK OF INDIA ETC.                    PARTY_TYPE     Yes
     PETITIONER                                                        PARTY          Yes
     COMMISSIONER MUNICIPAL CORPORATION AHMEDABAD ETC. ETC.            PARTY_TYPE     Yes
     RESPONDENT                                                        PARTY          Yes
     09/05/1995                                                        DATE           Yes
     BENCH                                                             BENCH          Yes
     B.P. JEEVAN REDDY                                                 PARTY_TYPE     Yes
     SUJATA V. MANOHAR                                                 PARTY_TYPE     Yes
     Bombay Rents, Hotel and Lodging House Rates Control Act, 1944     LAW            Yes
     Gujarat High Court                                                COURT          Yes
     K. BENCH                                                          PARTY_TYPE     No
     M.P. Act and that the said non-obstante clause makes all the      LAW            No
       difference. Dewan Daulat Rai Kapoor arose under the Punjab
       Municipal Act, 1911


Table 6
NER Evaluation by Document
       Document No.   Total Gold Entities   Identified Entities   Correct Entities   Precision   Recall   F1
       1592579        84                    36                    36                 1           0.428    0.599
       1592674        124                   45                    39                 0.866       0.314    0.461
       1592725        80                    15                    14                 0.937       0.175    0.295
       1592769        56                    26                    26                 1           0.464    0.634
       1592785        19                    15                    15                 1           0.78     0.876


3.3. Relation Extraction
Table 8 shows the relation extraction results at the level of individual documents, and Table 9
shows the results for each relation type. In contrast to entity extraction, here both precision
and recall are lower. These results are expected because relation extraction is a more complex
task than entity extraction, and the recall issues of entity extraction propagate to relation
extraction. Furthermore, there are several relation types, such as hasDecision and hasJudge,
for which we have not yet implemented rules. In future work, we plan to perform error analysis
to improve the current set of rules and to implement new rules for the relation types that are
not currently covered.
Table 7
NER Evaluation by Entity Type
    Entity Type        Total Gold Entities   Identified Entities   Correct Entities   Precision   Recall   F1
    LAW                85                    22                    16                 0.7         0.188    0.296
    COURT              69                    26                    26                 1           0.3768   0.547
    PARTICIPANT_TYPE   69                    16                    16                 1           0.23     0.374
    PARTICIPANT        58                    26                    26                 1           0.44     0.611
    DATE               50                    50                    50                 1           1        1
    BENCH              10                    2                     2                  1           0.2      0.333
    COURT_DECISION     10                    1                     1                  1           0.1      0.182
    DOCUMENTS          4                     0                     0                  0           0        0
    JURISDICTION       2                     1                     1                  1           0.5      0.667

Table 8
Relation Extraction Evaluation by Document
       Document No.   Total Gold Triples   Identified Triples   Correct Triples   Precision   Recall   F1
       1592579        16                   9                    7                 0.78        0.44     0.56
       1592674        63                   35                   19                0.54        0.30     0.39
       1592725        36                   8                    6                 0.75        0.17     0.28
       1592769        26                   9                    6                 0.67        0.23     0.34
       1592785        13                   7                    6                 0.86        0.46     0.60

Table 9
Relation Extraction Evaluation by Relation Type
       Relation Type      Total Gold Triples   Identified Triples   Correct Triples   Precision   Recall   F1
       hasPartyName       30                   25                   23                0.92        0.76     0.83
       hasCourtName       15                   15                   10                0.66        0.66     0.66
       hasLaw             44                   20                   12                0.60        0.28     0.38
       hasPartyType       29                   15                   13                0.87        0.45     0.59
       hasDecision        10                   0                    0                 0           0        0
       hasCourtOfficial   10                   8                    5                 0.62        0.5      0.55


4. Related Work
Information extraction approaches are emerging and establishing themselves in various domains,
such as biomedicine, social media, and law. These approaches can play an important role in
building an enriched knowledge base, specifically in the legal domain, where a huge volume
of legal information is scattered across various portals. Information extraction can also pave
the way for applications such as question-answering systems for the legal domain, judgment
prediction, and dispute resolution. The most common approach to information extraction is
to first find the named entities present in the unstructured data and then find the relations
between the identified entities. Named Entity Recognition (NER) is challenging in the legal
domain because of the many legal terms, abbreviations, and references to Acts and Laws. Once
these domain-specific entities are identified, relation extraction (RE) identifies the relations
between any two entities, which leads to triple construction and, later, knowledge graph creation.
   Various review and survey articles on NER and RE focus on standard datasets, and although
much research has been done in the legal domain, the available datasets are not useful for the
Indian context. Most articles on information extraction from unstructured legal text do not
cover the Indian context. Fernàndez-Cañellas et al. [13] also point out that Natural Language
Processing (NLP) alone cannot guarantee the validity of the facts populated in the knowledge
graph, and that data validation methods also need to be taken into account. In the legal
context, this concerns the validity of the legal facts about judgments, cases, and stakeholders
in the Knowledge Graph (KG).
   In a rule-based approach, relation extraction depends heavily on the syntactic and semantic
analysis of sentences. Dragoni et al. [14] discuss how to combine NLP approaches for rule
extraction in the legal domain. The authors use a deontic lightweight ontology called
normonto, which represents and models legal concepts and specifies the lexicons used for
legal expressions such as permission, prohibition, and obligation. Thomas and Sangeetha [15]
discuss rule-based entity extraction, with rules represented as regular expressions that detect
entity mentions in Indian judicial texts using specific patterns or trigger words. The GATE
tool and Java Annotation Patterns Engine (JAPE) grammar rules can support rule-based
extraction; however, basic entities must be identified first. Andrew and Tannier [16] take a
hybrid approach, using a statistical Conditional Random Field (CRF) model together with
legal domain-specific JAPE rules and the GATE gazetteer to annotate their dataset. Eftimov
et al. [17] propose a rule-based approach to NER for evidence-based dietary recommendations.
Poudyal et al. [18] also propose a rule-based approach for NER and relation extraction; the
authors used the C programming language for rule extraction.
   To sum up, as this paper focuses on a rule-based approach to named entity recognition
and relation extraction for legal court judgment documents, we have discussed research
work done in this area and how NER and relation extraction for the legal domain can be
enhanced by writing JAPE rules in the context of the Indian judicial system. The existing
rule-based approaches differ in coverage: [15] does not address the jurisdiction entity and
focuses only on criminal cases; [14] considers only entities that capture legal expressions,
not all the information related to the case document; and [18] considers only entities related
to the plaintiff, court, court staff, and decision, leaving out other entities such as petitioner,
case jurisdiction, documents, and evidence. In this paper, we focus on extracting many such
entities that are not considered in the works presented so far, making the information
extraction more comprehensive for knowledge graph creation. Currently, we have focused
only on the rule-based approach, but as the next step, we will use machine learning and deep
learning models for NER and relation extraction on legal court decision documents.


5. Conclusion
In this paper, we have presented a pipeline for constructing Knowledge Graphs from the Indian
Supreme Court Decisions corpus. The pipeline consists of data preprocessing, entity extraction,
relation extraction, and triple construction. The generated triples are represented in RDF using
the NyOn ontology and are stored in a triple store. The Knowledge Graph can be queried
using SPARQL and can be used to build downstream applications such as Knowledge Base
Question Answering with complex questions such as the ones that require aggregations or
multi-hop reasoning (which are not possible with simple document retrieval or keyword search).
In addition, the generated Knowledge Graph can become a useful resource for other AI tasks
such as judgment predictions, case clustering, and classification.
   The results are reasonably good but still leave room for improvement. In future work, we
plan to refine our rules to increase recall in entity recognition and both precision and
recall in relation extraction. We also plan to explore neural approaches based on
semi-supervised learning, using distant supervision data or the output of the rule-based
system as training data. Furthermore, we plan to publish the generated data as Linked Data
with public access and to develop a semantic web portal that exposes the knowledge base to
various use cases.
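
   One way the output of a rule-based system could serve as training data is to convert
rule matches into token-level BIO labels for a sequence tagger. The sketch below
illustrates that idea only; the two regular expressions, the entity type names, and the
example sentence are assumptions made for illustration and are not the rules used in this
work.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Hypothetical extraction rules, keyed by an illustrative entity type.
RULES = {
    "COURT": re.compile(r"\b(?:Supreme|High) Court(?: of [A-Z][a-z]+)?"),
    "DATE": re.compile(r"\b\d{1,2} (?:" + MONTHS + r") \d{4}\b"),
}

# Words and single punctuation marks become separate tokens.
TOKEN = re.compile(r"\w+|[^\w\s]")


def weak_bio_labels(sentence):
    """Turn rule matches into (token, BIO label) pairs for one sentence."""
    tokens = [(m.group(), m.start(), m.end()) for m in TOKEN.finditer(sentence)]
    labels = ["O"] * len(tokens)
    for entity_type, pattern in RULES.items():
        for match in pattern.finditer(sentence):
            # Tokens that lie entirely inside the matched span.
            inside = [
                i for i, (_, start, end) in enumerate(tokens)
                if start >= match.start() and end <= match.end()
            ]
            for rank, i in enumerate(inside):
                prefix = "B-" if rank == 0 else "I-"
                labels[i] = prefix + entity_type
    return [(text, label) for (text, _, _), label in zip(tokens, labels)]


pairs = weak_bio_labels(
    "The Supreme Court of India dismissed the appeal on 12 March 2015."
)
for token, label in pairs:
    print(token, label)
```

   Labels produced this way inherit the errors of the rules, so they are better treated as
noisy supervision than as gold annotations.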


Acknowledgments
This work is supported by the IHUB-ANUBHUTI-IIITD FOUNDATION, set up under the NM-ICPS
scheme of the Department of Science and Technology, India.


References
 [1] A. Elnaggar, C. Gebendorfer, I. Glaser, F. Matthes, Multi-task deep learning for legal
     document translation, summarization and multi-label classification, in: Proceedings of the
     2018 Artificial Intelligence and Cloud Computing Conference, 2018, pp. 9–15.
 [2] O.-M. Sulea, M. Zampieri, S. Malmasi, M. Vela, L. P. Dinu, J. Van Genabith, Exploring the
     use of text classification in the legal domain, arXiv preprint arXiv:1710.09306 (2017).
 [3] N. Ramrakhiyani, S. Pawar, G. K. Palshikar, A system for classification of propositions of
     the Indian Supreme Court judgements, in: Post-Proceedings of the 4th and 5th Workshops
     of the Forum for Information Retrieval Evaluation, 2013, pp. 1–4.
 [4] K. D. Ashley, Artificial intelligence and legal analytics: new tools for law practice in the
     digital age, Cambridge University Press, 2017.
 [5] M. Medvedeva, M. Vols, M. Wieling, Using machine learning to predict decisions of the
     European Court of Human Rights, Artificial Intelligence and Law 28 (2020) 237–266.
 [6] G. McElvain, G. Sanchez, S. Matthews, D. Teo, F. Pompili, T. Custis, Westsearch plus:
     A non-factoid question-answering system for the legal domain, in: Proceedings of the
     42nd International ACM SIGIR Conference on Research and Development in Information
     Retrieval, 2019, pp. 1361–1364.
 [7] C. Hoppe, D. Pelkmann, N. Migenda, D. Hötte, W. Schenck, Towards intelligent legal
     advisors for document retrieval and question-answering in German legal documents, in:
     2021 IEEE Fourth International Conference on Artificial Intelligence and Knowledge
     Engineering (AIKE), IEEE, 2021, pp. 29–32.
 [8] A. Hogan, E. Blomqvist, M. Cochez, C. d’Amato, G. d. Melo, C. Gutierrez, S. Kirrane, J. E. L.
     Gayo, R. Navigli, S. Neumaier, et al., Knowledge Graphs, Synthesis Lectures on Data,
     Semantics, and Knowledge 12 (2021) 1–257.
 [9] W. Huang, D. Hu, Z. Deng, J. Nie, Named entity recognition for Chinese judgment documents
     based on BiLSTM and CRF, EURASIP Journal on Image and Video Processing 2020 (2020).
     doi:10.1186/s13640-020-00539-x.
[10] E. Leitner, G. Rehm, J. Moreno-Schneider, Fine-grained Named Entity Recognition in
     Legal Documents, in: M. Acosta, P. Cudré-Mauroux, M. Maleshkova, T. Pellegrini, H. Sack,
     Y. Sure-Vetter (Eds.), Semantic Systems. The Power of AI and Knowledge Graphs.
     Proceedings of the 15th International Conference (SEMANTiCS 2019), number 11702 in
     Lecture Notes in Computer Science, Springer, Karlsruhe, Germany, 2019, pp. 272–287.
     10/11 September 2019.
[11] V. Malik, R. Sanjay, S. K. Nigam, K. Ghosh, S. Guha, A. Bhattacharya, A. Modi, ILDC
     for CJPE: Indian legal documents corpus for court judgment prediction and explanation,
     in: Proceedings of the 2021 Conference on The Joint Conference of the 59th Annual
     Meeting of the Association for Computational Linguistics and the 11th International
     Joint Conference on Natural Language Processing (ACL-IJCNLP 2021), Association for
     Computational Linguistics, Bangkok, Thailand (Online), 2021.
[12] S. Jain, P. Harde, N. Mihindukulasooriya, NyOn, 2022. URL:
     https://github.com/semintelligence/NyOn.
[13] D. Fernàndez-Cañellas, J. Marco Rimmek, J. Espadaler, B. Garolera, A. Barja, M. Codina,
     M. Sastre, X. Giro-i Nieto, J. C. Riveiro, E. Bou-Balust, Enhancing online knowledge
     graph population with semantic knowledge, in: International Semantic Web Conference,
     Springer, 2020, pp. 183–200.
[14] M. Dragoni, S. Villata, W. Rizzi, G. Governatori, Combining Natural Language Processing
     Approaches for Rule Extraction from Legal Documents: AICOL International Workshops
     2015-2017: AICOL-VI@JURIX 2015, AICOL-VII@EKAW 2016, AICOL-VIII@JURIX 2016,
     AICOL-IX@ICAIL 2017, and AICOL-X@JURIX 2017, Revised Selected Papers, volume
     10791, 2018, pp. 287–300. doi:10.1007/978-3-030-00178-0_19.
[15] A. Thomas, S. Sangeetha, An innovative hybrid approach for extracting named entities
     from unstructured text data, Computational Intelligence 35 (2019) 799–826.
[16] J. J. Andrew, Automatic extraction of entities and relation from legal documents, in:
     Proceedings of the Seventh Named Entities Workshop, Association for Computational
     Linguistics, Melbourne, Australia, 2018, pp. 1–8. URL: https://aclanthology.org/W18-2401.
     doi:10.18653/v1/W18-2401.
[17] T. Eftimov, B. Seljak, P. Korošec, A rule-based named-entity recognition method for
     knowledge extraction of evidence-based dietary recommendations, PLoS ONE 12 (2017).
     doi:10.1371/journal.pone.0179488.
[18] P. Poudyal, P. Quaresma, An hybrid approach for legal information extraction,
     Frontiers in Artificial Intelligence and Applications 250 (2012) 115–118.
     doi:10.3233/978-1-61499-167-0-115.