<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Interlinking Legal Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Erwin Filtz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sabrina Kirrane</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Axel Polleres</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Vienna University of Economics and Business</institution>
          ,
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In recent years, the European Union has been working towards harmonizing legislation, thus allowing for easier cross-border access to, exchange of, and reuse of legal information. This initiative is supported via standardization activities such as the European Law Identifier (ELI) and the European Case Law Identifier (ECLI), which provide technical specifications for web identifiers and vocabularies that can be used to describe metadata pertaining to legal documents. Unfortunately, to date this initiative has only been partially adopted by EU member states, possibly due to the manual effort involved in curating the metadata. As a first step towards streamlining this process, we propose a cross-jurisdictional legal framework that demonstrates how legal information stored in national databases can be linked at a European level, using Natural Language Processing together with external knowledge bases to automatically populate the knowledge base.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>[Fig. 1: proposed linked legal data framework. Recoverable labels: national sources (unstructured and structured) and European sources (including the EuroVoc thesaurus) are mapped to an ontology and linked into the Linked Legal Data graph, which has RDF links to general knowledge bases, thesauri and vocabularies. Example nodes and properties: ECLI:AT:OGH0002:2013:0070OB00065.13D.0703.000, eli/reg/2004/261/oj, ev:566, db:Austria, db:Oberster_Gerichtshof_(Österreich), "2013-07-03"; dcterms:subject, dcterms:references, dcterms:coverage, dcterms:creator, dcterms:date. Legend: existing link, new link.]</p>
      <sec id="sec-1-6">
        <p>
          extraction and mapping rules, which exploit knowledge about the legal document
production process and document structures. Moreover, we argue that the ELI and ECLI
standards themselves and their associated metadata guidelines serve as an excellent
basis for a trans-national European legal knowledge graph (KG). Possible use cases
for such a knowledge graph are widespread (cf. <xref ref-type="bibr" rid="ref3">[3]</xref>), including: (1) supporting
the comparative analysis of court decisions and different legal interpretations of
legislation; (2) enabling the analysis of the evolution of legislation and jurisdiction; and (3)
interlinking legal knowledge with other data (such as online discussions and news).
        </p>
        <p>Previous work either presents an idea <xref ref-type="bibr" rid="ref3">[3]</xref>, focuses on representing legal
information in the Akoma Ntoso XML format <xref ref-type="bibr" rid="ref2">[2]</xref> and hence lacks the linkability required
for a legal KG, or solves very specific problems, such as an ECLI parser that automatically
extracts legal links and makes them available in a machine-readable format <xref ref-type="bibr" rid="ref1">[1]</xref>.
This work can be seen as a starting point, but it is not yet sufficient for the creation of a
legal KG.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>A Linked Legal Data Framework</title>
      <p>
        Fig. 1 illustrates our proposed framework for overcoming existing problems with
the accessibility of legal data across Europe. It includes the primary data source
components: (1) the European EUR-Lex database, which contains legal documents issued by the
EU and uses a classification scheme from the EuroVoc thesaurus; (2) national databases
containing information about national laws and court decisions; and (3) general
knowledge bases, such as DBpedia and Wikidata, which also contain information about legal
concepts and can help us both to create links to external data and to enrich, for
instance, links to EuroVoc keywords, which are commonly used to annotate EU legal
documents. The basic concepts of the linked legal KG are legislation identifiers (ELIs),
case identifiers (ECLIs) and their properties. The source components in Fig. 1 are
connected in three different ways: (i) information is imported into the KG from source
components (dotted arrows); (ii) the KG contains backlinks to these sources (dotted arrows);
and (iii) links already exist between some sources (solid arrows).
      </p>
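      <p>To illustrate how ECLIs, ELIs and their properties could become nodes and edges in such a KG, the following Python sketch builds the triples for one court decision from the example values shown in Fig. 1. The pairing of each dcterms property with its value, the prefix URIs, the node identifier scheme and the helper names are assumptions made for illustration; they are not taken from the authors' implementation.</p>
      <preformat>
```python
# Illustrative prefix table; the base URIs are assumptions for this sketch.
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "db": "http://dbpedia.org/resource/",
    "ev": "http://eurovoc.europa.eu/",
    "eli": "http://data.europa.eu/eli/",
}


def expand(curie):
    """Expand a prefixed name such as 'db:Austria' into a full URI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


def decision_triples(ecli, coverage, creator, date, subject, references):
    """Return (subject, predicate, object) triples describing one decision."""
    node = "urn:ecli:" + ecli  # hypothetical node identifier scheme
    return [
        (node, expand("dcterms:coverage"), expand(coverage)),
        (node, expand("dcterms:creator"), expand(creator)),
        (node, expand("dcterms:date"), date),  # kept as a plain literal
        (node, expand("dcterms:subject"), expand(subject)),
        (node, expand("dcterms:references"), expand(references)),
    ]


triples = decision_triples(
    ecli="ECLI:AT:OGH0002:2013:0070OB00065.13D.0703.000",
    coverage="db:Austria",
    creator="db:Oberster_Gerichtshof_(Österreich)",
    date="2013-07-03",
    subject="ev:566",
    references="eli:reg/2004/261/oj",
)
for triple in triples:
    print(triple)
```
      </preformat>
      <p>In this sketch the links to DBpedia and EuroVoc are outgoing links from the decision node, while the dcterms:references triple connects case law to legislation.</p>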
      <p>The EU institutions (e.g., the Council of the European Union and the European Commission)
routinely publish and update freely accessible legal documents in the EUR-Lex<sup>3</sup>
database, maintained by the European Publications Office (OP)<sup>4</sup>. This database contains
metadata-enhanced legal documents in each of the official languages of the EU
member states, such as the authentic Official Journal of the European Union, EU treaties,
regulations, directives and EU case law, dating back to 1951<sup>5</sup>.</p>
      <p>Each EU member state has its own national legal database, which is used to store
legal information, usually in the national language(s). Information is often displayed in
HTML and/or available for download as PDF; however, a few countries also provide
access to their national legal databases via an API. For instance, Austria provides an API to
access legal documents and associated ECLI-compliant metadata in a JSON
serialization. While Germany<sup>6</sup> offers documents and metadata in XML, Finland<sup>7</sup> goes as
far as offering legal information as linked data in JSON-LD and via a SPARQL endpoint.</p>
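      <p>ECLI-compliant metadata of the kind described above can be decomposed into its components before being linked. The Python sketch below assumes the standard five-part ECLI layout (the literal "ECLI", a country code, a court code, the year of the decision and an ordinal number, separated by colons); the class, function and field names are illustrative only.</p>
      <preformat>
```python
from typing import NamedTuple


class Ecli(NamedTuple):
    country: str
    court: str
    year: int
    ordinal: str


def parse_ecli(identifier):
    """Split an ECLI into its colon-separated components."""
    parts = identifier.split(":")
    if len(parts) != 5 or parts[0].upper() != "ECLI":
        raise ValueError("not a well-formed ECLI: " + identifier)
    _, country, court, year, ordinal = parts
    if not (len(year) == 4 and year.isdigit()):
        raise ValueError("year must have four digits: " + year)
    return Ecli(country.upper(), court, int(year), ordinal)


decision = parse_ecli("ECLI:AT:OGH0002:2013:0070OB00065.13D.0703.000")
print(decision.country, decision.court, decision.year)
```
      </preformat>
      <p>The country and court components are natural candidates for links to general knowledge bases, for example a country resource or a court resource.</p>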
      <p>Legal documents present in both the European and national databases often contain
concepts for which supplementary information is available in external databases, such
as Wikidata<sup>8</sup> and DBpedia<sup>9</sup>, as well as in thesauri like STW<sup>10</sup>. This external information
could be used to enhance legal documents with additional information and to increase the
interlinking with other datasets.</p>
    </sec>
    <sec id="sec-3">
      <title>A Linked Legal Data Knowledge Graph Population</title>
      <p>We use the proposed ECLI and ELI ontologies as the foundation for our legal KG.
The information included in the KG may be contained in the
metadata or the legal document text, or it can be inferred from the data source. Due to space
limitations, we focus on a high-level description of the mappings; more information about
the methodology and the NLP pipeline will be included in the actual poster.
Direct and Configuration Mappings. Certain information contained in the metadata
provided by the national databases could be linked directly to the corresponding properties
in the ECLI / ELI ontologies without additional data extraction steps. Configuration
files could be used for properties that are not contained in the metadata but remain the same
for an entire corpus. For example, since legal documents in a country are typically issued in the
official language, the language property can be set globally for that country's corpus.
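A minimal Python sketch of these two mapping types could look as follows. The metadata field names, the choice of target properties and the corpus-level configuration values are hypothetical and stand in for whatever a national database actually delivers.
<preformat>
```python
# Hypothetical field-to-property table for one national database.
DIRECT_MAPPING = {
    "ecli": "dcterms:identifier",
    "decision_date": "dcterms:date",
    "court": "dcterms:creator",
}

# Corpus-wide constants that the metadata itself does not contain.
CORPUS_CONFIG = {
    "AT": {"dcterms:language": "de", "dcterms:coverage": "db:Austria"},
}


def map_record(record, country):
    """Combine direct mappings with the country's configuration values."""
    properties = dict(CORPUS_CONFIG.get(country, {}))
    for field, prop in DIRECT_MAPPING.items():
        if field in record:
            properties[prop] = record[field]
    return properties


record = {
    "ecli": "ECLI:AT:OGH0002:2013:0070OB00065.13D.0703.000",
    "decision_date": "2013-07-03",
    "unmapped_field": "ignored",
}
mapped = map_record(record, "AT")
print(mapped)
```
</preformat>
Under these assumptions, fields without a direct mapping are simply skipped, and covering another country with the same metadata layout would only need a further configuration entry.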
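<fn id="fn3"><label>3</label><p><uri>http://eur-lex.europa.eu</uri></p></fn>
<fn id="fn4"><label>4</label><p><uri>http://publications.europa.eu/</uri></p></fn>
<fn id="fn5"><label>5</label><p><uri>http://eur-lex.europa.eu/content/welcome/about.html</uri></p></fn>
<fn id="fn6"><label>6</label><p><uri>http://www.rechtsprechung-im-internet.de</uri></p></fn>
<fn id="fn7"><label>7</label><p><uri>http://data.finlex.fi/en/main</uri></p></fn>
<fn id="fn8"><label>8</label><p><uri>http://www.wikidata.org</uri></p></fn>
<fn id="fn9"><label>9</label><p><uri>http://wiki.dbpedia.org</uri></p></fn>
<fn id="fn10"><label>10</label><p><uri>http://zbw.eu/stw/version/9.0/about.en.html</uri></p></fn>
[Table 1: keyword mappings to EuroVoc for OGH and VfGH decisions; only the row labels OGH and VfGH are recoverable.]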
Indirect Mappings. Missing information requires preprocessing steps, such as natural
language processing (NLP) techniques or information from external knowledge bases, before
it can be mapped to the appropriate ECLI property. For example, the ECLI properties dcterms:subject
and dcterms:description allow the user to map information about the field of law and
descriptive elements. Keywords provided in a national database as natural-language text must be
mapped to the corresponding EuroVoc descriptor to enable multilingual search of legal
information. Preliminary results for 500 supreme court (OGH, 40 distinct
keywords) and constitutional court (VfGH, 411 distinct keywords) decisions, shown in Table 1, give the
share of keywords that can be mapped to EuroVoc directly or by using (combinations of)
external knowledge bases and thesauri for translations, as well as the increase in mappings when
external sources are used. Domain-specific thesauri, document classification systems based
on the EuroVoc scheme, and NLP techniques could be used to improve the low numbers
and increase the share of keywords that can be mapped to a EuroVoc descriptor.</p>
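      <p>The mapping of national keywords to EuroVoc descriptors can be sketched in Python as a direct label lookup with a fallback through an external translation source. The small label tables and the descriptor identifiers below are invented placeholders, not EuroVoc data, and the two-step strategy is an assumption about how such a mapping could be organised rather than the pipeline used to produce Table 1.</p>
      <preformat>
```python
# Placeholder data: labels and identifiers are invented for illustration.
EUROVOC_LABELS = {
    "luftverkehr": "ev:EXAMPLE-1",
    "consumer protection": "ev:EXAMPLE-2",
}
# Stand-in for an external knowledge base or thesaurus used for translation.
EXTERNAL_TRANSLATIONS = {"verbraucherschutz": "consumer protection"}


def normalise(keyword):
    """Lower-case a keyword and collapse surrounding or repeated whitespace."""
    return " ".join(keyword.lower().split())


def map_keyword(keyword):
    """Return (descriptor, strategy); descriptor is None if unmapped."""
    key = normalise(keyword)
    if key in EUROVOC_LABELS:
        return EUROVOC_LABELS[key], "direct"
    translated = EXTERNAL_TRANSLATIONS.get(key)
    if translated in EUROVOC_LABELS:
        return EUROVOC_LABELS[translated], "external"
    return None, "unmapped"


def mapping_shares(keywords):
    """Share of distinct keywords handled by each strategy."""
    distinct = {normalise(k) for k in keywords}
    counts = {"direct": 0, "external": 0, "unmapped": 0}
    if not distinct:
        return counts
    for key in distinct:
        counts[map_keyword(key)[1]] += 1
    return {name: count / len(distinct) for name, count in counts.items()}


shares = mapping_shares(["Luftverkehr", "Verbraucherschutz", "Mietrecht", "luftverkehr"])
print(shares)
```
      </preformat>
      <p>Reporting the shares per strategy mirrors the distinction made above between keywords that map directly and those that only map once external sources are consulted.</p>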
    </sec>
    <sec id="sec-4">
      <title>Summary and Future Work</title>
      <p>We proposed a cross-jurisdictional legal framework demonstrating how legal
information stored in national databases can be linked at a European level. The proposed legal
KG uses a lightweight ontology based upon the ELI and ECLI specifications and their
metadata guidelines as a starting point. For future work we plan to improve the precision
and recall by applying different mapping strategies.</p>
      <p>Acknowledgments. Funded by the Austrian Federal Ministry of Transport, Innovation
and Technology (BMVIT) through the DALICC project (https://www.dalicc.net).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Agnoloni</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bacci</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
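          <string-name>
            <surname>Peruginelli</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Opijnen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van den Oever</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,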
          <string-name>
            <surname>Palmirani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cervone</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bujor</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lecuona</surname>
            ,
            <given-names>A.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>García</surname>
            ,
            <given-names>A.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Caro</surname>
            ,
            <given-names>L.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Siragusa</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Linking European case law: BO-ECLI parser, an open framework for the automatic extraction of legal links</article-title>
          . In:
          <source>Legal Knowledge and Information Systems - JURIX 2017: The Thirtieth Annual Conference</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Boella</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Caro</surname>
            ,
            <given-names>L.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Graziadei</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cupi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Salaroglio</surname>
            ,
            <given-names>C.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Humphreys</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Konstantinov</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marko</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Robaldo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruffini</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simov</surname>
            ,
            <given-names>K.I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Violato</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stroetmann</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Linking legal open data: breaking the accessibility and language barrier in European legislation and case law</article-title>
          . In:
          <source>15th International Conference on Artificial Intelligence and Law, ICAIL 2015</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Montiel-Ponsoda</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rodríguez-Doncel</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gracia</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Building the legal knowledge graph for smart compliance services in multilingual Europe</article-title>
          . In:
          <source>1st Workshop on Technologies for Regulatory Compliance (co-located with JURIX 2017)</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>