<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Interlinking Legal Data</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Erwin</forename><surname>Filtz</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Vienna University of Economics and Business</orgName>
								<address>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Sabrina</forename><surname>Kirrane</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Vienna University of Economics and Business</orgName>
								<address>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Axel</forename><surname>Polleres</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Vienna University of Economics and Business</orgName>
								<address>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Interlinking Legal Data</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">771F434D7C9A81C1967B5D9C6D24F918</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T08:04+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In recent years, the European Union has been working towards harmonizing legislation, thus allowing for easier cross-border access to, and exchange and reuse of, legal information. This initiative is supported via standardization activities such as the European Law Identifier (ELI) and the European Case Law Identifier (ECLI), which provide technical specifications for web identifiers and vocabularies that can be used to describe metadata pertaining to legal documents. Unfortunately, to date said initiative has only been partially adopted by EU member states, possibly due to the manual effort involved in curating the metadata. As a first step towards streamlining this process, we propose a cross-jurisdictional legal framework that demonstrates how legal information stored in national databases can be linked at a European level, using Natural Language Processing together with external knowledge bases to automatically populate the knowledge base.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Globalization has increased the number of cross-border, cross-jurisdictional activities, bringing with it the need for alignment of, and improved accessibility to, multilingual legal data across Europe. Therefore, one of the key goals of the European Union (EU) is to harmonize laws and to improve access to legal information. From an information access perspective, standardization activities such as the European Law Identifier (ELI) <ref type="bibr" target="#b0">1</ref> and the European Case Law Identifier (ECLI) <ref type="bibr" target="#b1">2</ref> aim to improve the accessibility and integration of legal data by proposing standard Web identifiers and metadata schemas for legislation and court proceedings, respectively. From a legislation perspective, we also see a push towards this harmonization in terms of EU-wide regulations (laws to be enforced by all member states) and directives (which are transformed into national laws). However, there are still a large number of national laws that are not governed centrally by the EU.
Most member states record legal information (i.e., legislation and court proceedings in their respective national language) in heterogeneous national legal databases that are usually accessible via Web search interfaces or application programming interfaces.</p><p>Although ELI and ECLI are also relevant for national legislation and court cases, these standards have not yet received widespread adoption, possibly due to the costs incurred, either in terms of the manual effort involved or the changes required to existing systems in order to sustainably implement and maintain these relatively new standards.</p><p>We strongly believe that ELI and ECLI do not necessarily need to be built into the existing legal production processes, but instead can be re-engineered through semantic [...]</p><p>Previous work either presents an idea [3], focuses on representing legal information based on the Akoma Ntoso XML format [2] and hence misses the linkability required for a legal knowledge graph (KG), or solves very specific problems, such as an ECLI parser for the automatic extraction of legal links that makes them available in a machine-readable format [1]. Previous work can thus be seen as a starting point, but is not yet sufficient for the creation of a legal KG.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">A Linked Legal Data Framework</head><p>Fig. <ref type="figure" target="#fig_1">1</ref> illustrates our proposed framework for overcoming existing problems in relation to the accessibility of legal data across Europe. It includes the primary data source components: (1) the European EUR-Lex database, which contains legal documents issued by the EU and uses a classification scheme from the EuroVoc thesaurus; (2) national databases containing information about national laws and court decisions; and (3) general knowledge bases, such as DBpedia and Wikidata, which also contain information about legal concepts and aspects. The latter can help us to create links to external sources and, for instance, to enrich links to EuroVoc keywords, which are commonly used to annotate EU legal documents. The basic concepts of the linked legal KG are legislation identifiers (ELIs), case identifiers (ECLIs) and their properties. The source components in Fig. <ref type="figure" target="#fig_1">1</ref> are connected in three different ways: (i) information is imported into the KG from source components (dotted arrows); (ii) the KG contains backlinks to these sources (dotted arrows); and (iii) links already exist between some sources (solid arrows).</p><p>The EU institutions (e.g., the Council of the European Union and the European Commission) routinely publish and update freely accessible legal documents in the EUR-Lex<ref type="foot" target="#foot_1">3</ref> database, maintained by the European Publications Office (OP)<ref type="foot" target="#foot_2">4</ref>.
This database contains metadata-enhanced legal documents in each of the official languages of the EU member states, such as the authentic Official Journal of the European Union, EU treaties, regulations, directives and EU case law, dating back to 1951<ref type="foot" target="#foot_3">5</ref>.</p><p>Each EU member state has its own national legal database, which is used to store legal information, usually in the national language(s). Information is often displayed in HTML and/or available for download as PDF; however, a few countries also provide access to their national legal databases via an API. For instance, Austria provides an API to access legal documents and associated metadata that comply with ECLI in a JSON serialization. While Germany<ref type="foot" target="#foot_4">6</ref> offers documents and metadata in XML, Finland<ref type="foot" target="#foot_5">7</ref> goes as far as offering legal information as linked data in JSON-LD and via a SPARQL endpoint.</p><p>Legal documents present in both the European and national databases often contain concepts for which supplementary information is available in external databases, such as Wikidata<ref type="foot" target="#foot_6">8</ref> and DBpedia<ref type="foot" target="#foot_7">9</ref>, as well as in thesauri like STW<ref type="foot" target="#foot_8">10</ref>. This external information could be used to enhance legal documents with additional information and to increase the interlinking with other datasets.</p></div>
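<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the case identifiers mentioned above more concrete, the following minimal Python sketch splits an ECLI into its five colon-separated components (the prefix ECLI, a country code, a court code, the year of the decision and an ordinal number). The type and function names and the example identifier are hypothetical and serve illustration only; they are not part of the proposed framework, and only a basic structural check is performed.</p>
<p><code lang="python">
```python
from typing import NamedTuple


class Ecli(NamedTuple):
    """Components of an ECLI (hypothetical helper type)."""
    country: str
    court: str
    year: int
    ordinal: str


def parse_ecli(identifier: str) -> Ecli:
    """Split an ECLI string into its components, with basic checks only."""
    parts = identifier.strip().split(":")
    # An ECLI has exactly five parts and starts with the literal prefix "ECLI".
    if len(parts) != 5 or parts[0].upper() != "ECLI":
        raise ValueError(f"not an ECLI: {identifier!r}")
    _, country, court, year, ordinal = parts
    # The year component must consist of four digits.
    if len(year) != 4 or not year.isdigit():
        raise ValueError(f"invalid year in ECLI: {identifier!r}")
    return Ecli(country.upper(), court, int(year), ordinal)


# Made-up identifier for illustration; it does not refer to a real decision.
print(parse_ecli("ECLI:AT:EXAMPLE:2018:123"))
```
</code></p>
<p>A production parser would additionally validate the court code and the ordinal number against the ECLI specification; this sketch deliberately omits those checks.</p></div>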
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">A Linked Legal Data Knowledge Graph Population</head><p>We use the proposed ECLI and ELI ontologies as the foundation for our legal KG. The information to be included in the KG may be contained in the metadata or the legal document text, or can be inferred from the data source. Due to space limitations, we focus on a high-level description of the mappings; more information about the methodology and the natural language processing (NLP) pipeline will be included in the actual poster.</p><p>Direct and Configuration Mappings. Certain information contained in the metadata provided by the national databases can be linked directly to the corresponding properties in the ECLI / ELI ontologies without additional data extraction steps. Configuration files can be used for properties that are not contained in the metadata but remain the same for an entire corpus. For example, given that legal documents in a country are typically issued in the official language, the language property can be set globally for the corpus of that country.</p><p>Indirect Mappings. Missing information requires preprocessing steps, such as NLP techniques, or information from external knowledge bases before it can be mapped to the appropriate ECLI property. For example, the ECLI properties dcterms:subject and dcterms:description allow the user to map information about the field of law and descriptive elements. Keywords provided in a national database as natural language text must be mapped to the corresponding EuroVoc descriptor to enable multilingual search of legal information. Preliminary results for 500 Supreme Court (OGH, 40 distinct keywords) and Constitutional Court (VfGH, 411 distinct keywords) decisions are shown in Table <ref type="table" target="#tab_0">1</ref>. They show the share of keywords that can be mapped to EuroVoc directly or using (combinations of) external knowledge bases and thesauri for translations, as well as the increase in mappings when external sources are used.
Domain-specific thesauri, document classification systems based on the EuroVoc scheme and NLP techniques could be used to improve these low numbers and increase the share of keywords that can be mapped to a EuroVoc descriptor.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 1.</head><label>1</label><figDesc>Fig. 1. Framework components to be integrated into a Linked Legal Data Knowledge Graph</figDesc></figure>
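<div xmlns="http://www.tei-c.org/ns/1.0"><p>The direct and configuration mappings described in Section 3 can be sketched roughly as follows. This is a minimal illustration under stated assumptions and not the actual implementation: the national metadata field names, the choice of target properties and the example values are hypothetical. Corpus-wide defaults come from a configuration dictionary, while per-document metadata fields are copied to their assumed ontology properties.</p>
<p><code lang="python">
```python
from typing import Dict

# Hypothetical corpus-level configuration: values that are identical for
# every document of one national database (here: the document language).
CORPUS_CONFIG: Dict[str, str] = {
    "dcterms:language": "de",
}

# Hypothetical direct mapping from national metadata field names to
# properties; both sides are assumptions made for this illustration.
DIRECT_MAPPING: Dict[str, str] = {
    "ecli": "dcterms:isVersionOf",
    "court": "dcterms:creator",
    "decision_date": "dcterms:date",
}


def map_document(metadata: Dict[str, str]) -> Dict[str, str]:
    """Return property/value pairs for a single document."""
    # Start with the corpus-wide defaults from the configuration.
    properties = dict(CORPUS_CONFIG)
    # Copy every metadata field that has a direct counterpart.
    for field, prop in DIRECT_MAPPING.items():
        if field in metadata:
            properties[prop] = metadata[field]
    return properties


# Made-up metadata record; unmapped fields are simply ignored.
record = {
    "ecli": "ECLI:AT:EXAMPLE:2018:123",
    "court": "Example Court",
    "internal_id": "42",
}
print(map_document(record))
```
</code></p>
<p>Fields without a direct counterpart, such as free-text keywords, are left to the indirect mappings.</p></div>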
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1.</head><label>1</label><figDesc>Results of mapping keywords to EuroVoc descriptors</figDesc><table><row><cell></cell><cell></cell><cell cols="3">Individual</cell><cell cols="2">Combined</cell></row><row><cell>Court</cell><cell>EuroVoc</cell><cell>STW</cell><cell>Wikidata</cell><cell>DBpedia</cell><cell>Wikidata DBpedia</cell><cell>DBpedia Wikidata</cell></row><row><cell>OGH</cell><cell>0.30</cell><cell>0.25</cell><cell>0.38</cell><cell>0.43</cell><cell>0.45</cell><cell>0.48</cell></row><row><cell>VfGH</cell><cell>0.25</cell><cell>0.16</cell><cell>0.34</cell><cell>0.33</cell><cell>0.39</cell><cell>0.40</cell></row></table></figure>
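<div xmlns="http://www.tei-c.org/ns/1.0"><p>A share of mapped keywords such as those reported in Table 1 could be computed along the lines of the following sketch. It is an illustration under assumptions, not the pipeline used to produce the table: all lookup tables, labels and descriptor identifiers are made up, and trying the external sources one after the other is only one possible way of combining them.</p>
<p><code lang="python">
```python
from typing import Dict, List, Optional

# Toy data, made up for illustration: EuroVoc labels and placeholder IDs.
EUROVOC: Dict[str, str] = {
    "contract": "eurovoc:EXAMPLE-1",
    "tax law": "eurovoc:EXAMPLE-2",
}
# Toy external sources translating national keywords into English labels.
WIKIDATA: Dict[str, str] = {"Vertrag": "contract"}
DBPEDIA: Dict[str, str] = {"Steuerrecht": "tax law"}


def map_keyword(keyword: str, sources: List[Dict[str, str]]) -> Optional[str]:
    """Map a keyword to a descriptor, directly or via external sources."""
    # Direct mapping: the keyword itself is a known EuroVoc label.
    if keyword in EUROVOC:
        return EUROVOC[keyword]
    # Indirect mapping: try each external source in the given order.
    for source in sources:
        label = source.get(keyword)
        if label is not None and label in EUROVOC:
            return EUROVOC[label]
    return None


def mapped_share(keywords: List[str], sources: List[Dict[str, str]]) -> float:
    """Return the fraction of keywords that received a descriptor."""
    if not keywords:
        return 0.0
    mapped = sum(1 for k in keywords if map_keyword(k, sources) is not None)
    return mapped / len(keywords)


keywords = ["contract", "Vertrag", "Steuerrecht", "Erbrecht"]
print(mapped_share(keywords, []))                   # direct mapping only
print(mapped_share(keywords, [WIKIDATA]))           # one external source
print(mapped_share(keywords, [WIKIDATA, DBPEDIA]))  # combined sources
```
</code></p>
<p>With this toy data the share grows as more sources are added, which mirrors the trend in Table 1 without reproducing its values.</p></div>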
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:52011XG0429(01)&amp;from=EN</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_1">http://eur-lex.europa.eu</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_2">http://publications.europa.eu/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_3">http://eur-lex.europa.eu/content/welcome/about.html</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_4">http://www.rechtsprechung-im-internet.de</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_5">http://data.finlex.fi/en/main</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_6">http://www.wikidata.org</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_7">http://wiki.dbpedia.org</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_8">http://zbw.eu/stw/version/9.0/about.en.html</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Summary and Future Work</head><p>We proposed a cross-jurisdictional legal framework demonstrating how legal information stored in national databases can be linked at a European level. The proposed legal KG uses a lightweight ontology based upon the ELI and ECLI specifications and their metadata guidelines as a starting point. In future work, we plan to improve precision and recall by applying different mapping strategies.</p><p>Acknowledgments. Funded by the Austrian Federal Ministry of Transport, Innovation and Technology (BMVIT) DALICC project, https://www.dalicc.net.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Linking European case law: BO-ECLI parser, an open framework for the automatic extraction of legal links</title>
		<author>
			<persName><forename type="first">T</forename><surname>Agnoloni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bacci</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Peruginelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Van Opijnen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Van Den Oever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Palmirani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Cervone</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Bujor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A</forename><surname>Lecuona</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">B</forename><surname>García</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Di Caro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Siragusa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Legal Knowledge and Information Systems -JURIX 2017: The Thirtieth Annual Conference</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Linking legal open data: breaking the accessibility and language barrier in European legislation and case law</title>
		<author>
			<persName><forename type="first">G</forename><surname>Boella</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Di Caro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Graziadei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Cupi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">E</forename><surname>Salaroglio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Humphreys</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Konstantinov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Marko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Robaldo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Ruffini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">I</forename><surname>Simov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Violato</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Stroetmann</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">15th International Conference on Artificial Intelligence and Law, ICAIL 2015</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Building the legal knowledge graph for smart compliance services in multilingual Europe</title>
		<author>
			<persName><forename type="first">E</forename><surname>Montiel-Ponsoda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Rodríguez-Doncel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gracia</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">1st Workshop on Technologies for Regulatory Compliance (co-located with JURIX 2017)</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
