<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Knowledge Graph Construction and Refinement for Cultural Heritage Digital Libraries</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Mary</forename><forename type="middle">Ann</forename><surname>Tan</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">FIZ Karlsruhe - Leibniz Institute for Information Infrastructure</orgName>
								<address>
									<addrLine>Hermann-von-Helmholtz-Platz 1</addrLine>
									<postCode>76344</postCode>
									<settlement>Eggenstein-Leopoldshafen</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">Applied Informatics and Formal Description Methods (AIFB)</orgName>
								<orgName type="institution">Karlsruhe Institute of Technology (KIT)</orgName>
								<address>
									<addrLine>Kaiserstraße 89</addrLine>
									<postCode>76133</postCode>
									<settlement>Karlsruhe</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
							<affiliation key="aff3">
								<orgName type="department">Deutsche Digitale Bibliothek</orgName>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff2">
								<address>
									<settlement>Baltimore</settlement>
									<region>Maryland</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Knowledge Graph Construction and Refinement for Cultural Heritage Digital Libraries</title>
					</analytic>
					<monogr>
						<title level="j">CEUR Workshop Proceedings</title>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">4F3BBE518DAD5A7FAE70F276734076E4</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T18:57+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Semantic Web</term>
					<term>NLP</term>
					<term>Information Extraction</term>
					<term>Knowledge Graphs</term>
					<term>Digital Libraries</term>
					<term>Cultural Heritage</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Digital libraries containing metadata of diverse cultural heritage objects are meant to be accessible not only to domain experts but also to the general population. This calls for information services that make search, retrieval, and exploration easy and efficient. Knowledge graphs (KGs) are essential for the representation, organization, integration, and analysis of hierarchical and heterogeneous information. However, most KGs suffer from incompleteness and inaccuracies. This work intends to address various challenges arising from the construction and refinement of a KG populated with historical objects, by defining domain- and application-appropriate ontologies and leveraging approaches in information extraction (IE) for improving metadata quality.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>The German Digital Library (DDB) collects, aggregates, transforms, and publishes metadata representing tens of millions of digitized cultural heritage objects (CHOs; e.g., books, paintings, archival documents, photographs, audio recordings). These objects span several millennia and belong to the holdings of various memory institutions all across Germany. Due to its historical significance, this collection is meant to be accessed and explored by users from diverse backgrounds.</p><p>However, the sheer volume, granularity, and heterogeneity of this collection hamper search, retrieval, and exploration. These hurdles call for the construction of a knowledge graph (KG) to represent and organize the objects and their contextual descriptions, while enabling data integration and analytics.</p><p>As the DDB is the national aggregator for Europeana <ref type="bibr" target="#b0">[1]</ref>, its metadata collection is represented using an extension of the Europeana Data Model (EDM). EDM favors simplicity and offers flexibility in the choice of metadata element sets, as well as in the range of possible values for properties describing the objects. These design considerations lead to the modeling challenges described by Tan et al. <ref type="bibr" target="#b1">[2]</ref>. In addition, the metadata collection suffers from incompleteness and inaccuracies, as described in Tan et al. <ref type="bibr" target="#b2">[3]</ref>. This prevents the underlying retrieval engine from properly indexing the objects.</p><p>Addressing these challenges requires a combination of solutions in knowledge representation, knowledge refinement, and information extraction. 
Therefore, this thesis proposes i) an ontology that enables interoperability across different types of CHOs while maintaining domain-specific semantics, as discussed in Section 5.1; ii) a KG refinement approach leveraging NLP techniques to improve the metadata quality of historical objects; and iii) an Entity Linking approach for entities in historical objects.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Importance</head><p>This work will benefit not only the general population, but also domain experts such as librarians, curators, and archivists. The proposed solutions will empower users from diverse backgrounds to seamlessly and efficiently search, retrieve, and explore Germany's rich and voluminous collection.</p><p>Recent developments in AI can be leveraged to address the technical challenges facing the DDB. This work is relevant to researchers working at the intersection of Semantic Web (SW), Digital Humanities (DH), and Natural Language Processing (NLP).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Related Work</head><p>Several notable data models and ontologies have been proposed for cultural heritage representation. Liu et al. <ref type="bibr" target="#b3">[4]</ref> reviewed CIDOC-CRM, the Sampo Model, and EDM, restricted to the museum use case. Cultural heritage data models are delineated along two modeling paradigms: object-centric and event-centric. CIDOC-CRM follows the latter, while EDM follows a mixture of both paradigms. Object-centric modeling defines attributes directly by describing the object, while event-centric modeling defines these attributes through a series of events associated with the object. Object-centric modeling favors conciseness, while event-centric modeling emphasizes completeness.</p><p>A pioneer in the application of SW technologies, the Sampo series of semantic portals showcases the national heritage of Finland. These systems make use of the modular FinnONTO ontology infrastructure <ref type="bibr" target="#b4">[5]</ref>. However, FinnONTO is not a full-featured ontology, but a taxonomy of CHOs encoded as Simple Knowledge Organization System (SKOS) concepts. Italy's ArCo<ref type="foot" target="#foot_0">3</ref> <ref type="bibr" target="#b5">[6]</ref> also follows the modular modeling approach: each module is intended to describe a CHO <ref type="foot" target="#foot_1">4</ref> in the context of cataloging activities and events.</p><p>The core design principles of EDM, and by extension DDB-EDM, lead to definitions of general classes that require the bare minimum of metadata properties and controlled vocabularies. Thus, all CHOs, regardless of their sources, media types and object types, are instances of the class edm:ProvidedCHO, while their digitized representations on the Web are instances of the class edm:WebResource. 
This flexibility, however, results in imprecise representations and a loss of the semantics inherent in the original objects <ref type="bibr" target="#b6">[7]</ref>. In particular, it is not possible to model the concepts and levels of abstraction widely accepted in the bibliographic domain.</p><p>The International Federation of Library Associations and Institutions (IFLA) developed the Functional Requirements for Bibliographic Records (FRBR) <ref type="bibr" target="#b7">[8]</ref>, where a book can be represented as several entities and the relationships that exist among these entities. A copy of a book (frbr:Item) is a specimen or exemplification of a specific publication (frbr:Manifestation), which is an embodiment of an expression (frbr:Expression) that realizes the ideas of a creative work (frbr:Work).</p><p>Europeana users are less likely to search for specific items (11.3%) and are more inclined to search by category (47.1%) and by subject (24.6%) <ref type="bibr" target="#b8">[9]</ref>. This supports the need to align bibliographic objects from the Item level to their respective higher-level abstractions (Work, Expression, Manifestation). Consequently, alignment requires that objects possess identifiable properties and attributes, such as title, agents, dates, and subject headings. However, due to the age of the objects, there is a high level of uncertainty with respect to proper author or date attributions.</p><p>The challenges of filling in missing information and identifying erroneous information in a knowledge graph fall under the umbrella of Knowledge Graph Refinement. In particular, Knowledge Graph Completion (KGC) deals with the former challenge, while Error Detection deals with the latter.</p><p>By definition, internal methods for KGC use the content of the current KG either to determine class membership or to predict relations between entities. 
These methods require the current KG to be of at least reasonable quality in order for large-scale evaluation to be feasible <ref type="bibr" target="#b9">[10]</ref>. On the other hand, external methods leverage other sources of knowledge for refinement, such as other knowledge graphs or text corpora.</p><p>With the rapid development in the area of Natural Language Processing (NLP), text corpora have become an excellent source of external knowledge. The subfield of Information Extraction (IE), an intermediate step in knowledge graph construction, can be defined as the process of gleaning structured information from unstructured text <ref type="bibr" target="#b10">[11]</ref>. A concrete example of this task would be to extract distinct properties and attributes identifying a literary work from the title.</p><p>An IE pipeline starts with Named Entity Recognition (NER), or the detection and classification of named entities mentioned in the text. Types of entities can be coarse-grained, such as PERSON, WORK_OF_ART, DATE, et cetera, or fine-grained, such as AUTHOR, PUBLISHER, ARTWORK, LITERARY_WORK, PUBLICATION_DATE, et cetera.</p><p>Specific (fine-grained) entity types are often found in domain-specific texts, or even time-specific texts where concept drift is quite common. In the field of digital humanities, there are a number of studies on NER with historical text <ref type="bibr" target="#b11">[12]</ref>; however, coverage goes back to the 17th century at best (DROC <ref type="bibr" target="#b12">[13]</ref>), and none belongs to the domain of bibliography.</p><p>Once the entities have been detected and classified, they are linked to specific entries in reference knowledge bases or KGs. Entity Linking (EL) is particularly challenging due to surface form variations. In particular, names in historical texts can be multilingual, refer to aliases, contain initials, or include honorifics and designations. 
The names of geopolitical entities are also known to change through time. Pontes et al. <ref type="bibr" target="#b13">[14]</ref> proposed an end-to-end multilingual NER and EL (NERL) approach to address some of these challenges using some of the datasets mentioned in Ehrmann et al. <ref type="bibr" target="#b11">[12]</ref>.</p></div>
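<div xmlns="http://www.tei-c.org/ns/1.0"><p>The FRBR chain described in this section (Item, Manifestation, Expression, Work) can be pictured with a small Python sketch. It is only an illustration of why Item-level records become easier to browse once they are aligned to a Work: the class fields, the helper names work_of and group_by_work, and all sample values are hypothetical and do not come from DDB data or from any FRBR implementation.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:                      # frbr:Work, the abstract creative work
    title: str


@dataclass(frozen=True)
class Expression:                # frbr:Expression, realizes a Work
    realizes: Work
    language: str


@dataclass(frozen=True)
class Manifestation:             # frbr:Manifestation, embodies an Expression
    embodies: Expression
    year: int


@dataclass(frozen=True)
class Item:                      # frbr:Item, exemplifies a Manifestation
    exemplifies: Manifestation
    holder: str


def work_of(item: Item) -> Work:
    """Walk Item -> Manifestation -> Expression -> Work."""
    return item.exemplifies.embodies.realizes


def group_by_work(items):
    """Group physical copies under the creative work they derive from."""
    groups = {}
    for item in items:
        groups.setdefault(work_of(item), []).append(item)
    return groups


# Hypothetical sample: two editions of one work, held by two institutions.
work = Work("Example Work")
german = Expression(work, "de")
first_edition = Manifestation(german, 1800)
later_edition = Manifestation(german, 1850)
copies = [Item(first_edition, "Library A"), Item(later_edition, "Library B")]

grouped = group_by_work(copies)
print(len(grouped), len(grouped[work]))
```

<p>In this sketch, both copies end up under the same Work even though they belong to different Manifestations, which is the kind of grouping that category- or subject-level search relies on.</p></div>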
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Research Questions</head><p>This section formulates the research questions (RQs) to address the challenges and limitations of existing approaches described in Section 3.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>RQ1: How can existing ontologies be adapted and extended to suit the domain and application profile of digital libraries, such as the DDB?</head><p>Cultural heritage practitioners have been developing ontologies for specific domains and applications. As of this writing, only EDM is used to represent metadata from several cultural institutions. In order to prevent data model silos <ref type="bibr" target="#b14">[15]</ref> and to promote reusability, it is beneficial to consider existing ontologies that are applicable and appropriate for the use case of the DDB. Preliminary results are discussed in Section 5.1.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>RQ2: How can we leverage state-of-the-art NLP models to improve the metadata quality of historical objects?</head><p>Non-contemporary titles in the DDB (&lt;dc:title&gt;) encode details that can be used to fill in missing properties, such as the title itself, author, publisher, editor, subject headings, and dates. Hence, this calls for extractive NLP approaches. Section 5.2 presents some preliminary results.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>RQ2.1: How can we automatically construct an evaluation dataset from the DDB?</head><p>In order to address the succeeding RQs, an evaluation dataset for IE is required. Section 5.2 briefly describes what has been done so far.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>RQ2.2: How can we effectively extract fine-grained bibliographic entities from historical texts?</head><p>The goal here is to address open challenges in the area of historical NER, such as how to properly handle the dynamics of an evolving language, where spelling and naming conventions change through time, and the noise introduced by OCR engines. This will involve dataset construction, experiment design, and model development.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>RQ3: How can we link entities to records in the reference KG?</head><p>The goal here is to accurately disambiguate named entities and link them to entries in external KGs, while addressing the challenges associated with historical texts. Moreover, entities that do not exist in the reference KG can be used to increase the coverage of authority files.</p><p>The entirety of this work is envisioned to guide the construction and refinement of a knowledge graph representing DDB's cultural heritage objects. In addition, some open questions have yet to be addressed concerning NERL in historical texts.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Preliminary Results</head><p>This section describes preliminary work conducted to address the open questions presented in Section 4.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">The DDB Ontology (DDB-O)</head><p>Extensive quantitative and qualitative analyses of the entire DDB metadata collection have been conducted in order to ascertain the applicability of existing CH ontologies. Initially, objects were logically classified according to their originating institution, whether from libraries, archives, museums, media libraries, or historical preservation. In addition, the media type of an object was taken into account. Textual bibliographic resources take up a large proportion of the entire collection; their alignment to an extension of FRBR<ref type="foot" target="#foot_2">5</ref> has been presented <ref type="bibr" target="#b1">[2]</ref> and implemented as a SPARQL endpoint <ref type="bibr" target="#b15">[16]</ref>. Domain-specific ontologies have been adapted to provide more precise semantic representations of objects (e.g., components of bibliographic objects, the hierarchy of archival objects, the levels of representation of an image, etc.). Existing audio ontologies intended for other domains have been extended to represent intangible audio heritage <ref type="bibr" target="#b16">[17]</ref>. The DDB-O namespace<ref type="foot" target="#foot_3">6</ref> is available online. A formal and complete specification is under review and has yet to be published.</p><p>FRBR, as the upper ontology, requires that each object be looked up against a list of creative works, such as the German Authority File or Gemeinsame Normdatei (GND<ref type="foot" target="#foot_4">7</ref>). This ensures that the relationship between different objects resulting from the same creative work is represented in the KG <ref type="bibr" target="#b17">[18]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">Information Extraction</head><p>The alignment of bibliographic items to their corresponding literary works proved to be a challenging task due to incomplete object descriptions <ref type="bibr" target="#b17">[18]</ref>. To take advantage of the richer textual content encoded in the titles, several NLP tasks were reformulated in order to extract the contextual details present in the title. Several state-of-the-art, off-the-shelf NER and extractive QA models, as well as LLMs, were used in the experiments.</p><p>As described in <ref type="bibr" target="#b2">[3]</ref>, the objects in the evaluation dataset were selected according to language, hierarchy type, existence of agent and date properties, format, and title length (&gt;30 tokens).</p><p>A more forgiving evaluation measure (Precision@n), described in Section 6.2, was defined to take into account the various naming conventions found in the text. An NER model (FLERT) <ref type="bibr" target="#b18">[19]</ref> that can detect literary works and dates was initially used to test the hypothesis and to refine the evaluation dataset for the succeeding tasks. The results shown in Table <ref type="table" target="#tab_0">1</ref> illustrate that these models can be leveraged, but only to a limited extent. The results were poor since the models were not adapted to the age and domain of the texts. In addition, the results are not indicative of the actual model performance due to inaccuracies in the evaluation dataset <ref type="bibr" target="#b2">[3]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Evaluation</head><p>The research questions enumerated in Section 4 require different evaluation procedures, datasets, and metrics. These are described in the succeeding subsections.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1.">Ontology Evaluation</head><p>There are several ways to evaluate ontologies, one of which is to use competency questions (CQs). A collection of CQs published on GitHub<ref type="foot" target="#foot_5">8</ref> is included in the partial ontological definitions and alignment activities. In addition, the SPARQL query processing time for CQs that can be answered with DDB-EDM will be compared with that of queries using the proposed ontology.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2.">Information Extraction</head><p>Name matching for historical documents is non-trivial due to various naming conventions and spelling variations. In a QA task, the most forgiving measure is Accuracy@1, which returns 1 if the ground truth and the answer overlap in at least one token. The Precision@n measure combines two matching criteria: an exact match of the DDB object ID and an approximate match for names using the Levenshtein edit distance <ref type="bibr" target="#b2">[3]</ref>. The evaluation measures for RQ3 will be the same as those commonly used for EL. A large proportion of the agents in the DDB are already linked to GND Persons, and links already exist between GND and Wikidata entities. This makes it straightforward to combine naming variations and multilingual names for the evaluation dataset. Evaluating geopolitical entities will require prior knowledge of the age of the object in question.</p></div>
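<div xmlns="http://www.tei-c.org/ns/1.0"><p>The matching criteria described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the evaluation code of [3]: the function names, the casefolding step, and the relative threshold max_ratio are introduced here for illustration only, and the exact definition of Precision@n is left to the cited work.</p>

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via dynamic programming (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def names_match(predicted: str, reference: str, max_ratio: float = 0.2) -> bool:
    """Approximate name match: edit distance relative to the longer name.

    max_ratio is a hypothetical threshold, not a value taken from the paper.
    """
    p, r = predicted.casefold().strip(), reference.casefold().strip()
    longest = max(len(p), len(r))
    if longest == 0:
        return True
    return max_ratio * longest >= levenshtein(p, r)


def token_overlap(answer: str, ground_truth: str) -> int:
    """Return 1 if answer and ground truth share at least one token, else 0."""
    shared = set(answer.casefold().split()).intersection(ground_truth.casefold().split())
    return 1 if shared else 0


def is_hit(pred_id: str, pred_name: str, gold_id: str, gold_name: str) -> bool:
    """Both criteria together: exact object ID match and approximate name match."""
    return pred_id == gold_id and names_match(pred_name, gold_name)


print(levenshtein("kitten", "sitting"))
print(is_hit("obj-1", "Johann Wolfgang von Göthe", "obj-1", "Johann Wolfgang von Goethe"))
```

<p>A relative threshold is used in this sketch so that long names tolerate more character edits than short ones; an absolute edit-distance cutoff would be an equally plausible choice.</p></div>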
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Limitations and Future Work</head><p>As discussed in Section 5.2, the lack of a gold standard evaluation dataset brings a level of uncertainty to the experimental results. This will be addressed with the creation of a manually annotated dataset with fine-grained entities, which will then be used to address RQ2.2. In addition, the work conducted to address RQ1 needs to be finalized. Finally, entities that already exist in GND will be linked, while non-existing ones can be used to further increase the coverage of GND and Wikidata.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>LLM vs Extractive QA</figDesc><table><row><cell>Question: "Who is the ...?"</cell><cell>Ground Truth</cell><cell>LLM (mistral-7b-instruct-v0.2)</cell><cell>QA (gelectra-large-germanquad)</cell></row><row><cell>Author</cell><cell>all agents</cell><cell>51.60%</cell><cell>66.23%</cell></row><row><cell>Author</cell><cell>&lt;dc:creator&gt;</cell><cell>37.60%</cell><cell>32.19%</cell></row><row><cell>Publisher</cell><cell>&lt;dc:publisher&gt;</cell><cell>2.70%</cell><cell>0.85%</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_0">ArCo, https://w3id.org/arco</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_1">CHO is referred to as "Cultural Property".</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_2">Functional Requirements for Bibliographic Records <ref type="bibr" target="#b7">[8]</ref></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_3">DDB-O, https://ise-fizkarlsruhe.github.io/ddbkg/ddbo</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_4">GND, https://www.dnb.de/DE/Professionell/Standardisierung/GND/gnd_node.html</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_5">CQs for DDB-O, https://ise-fizkarlsruhe.github.io/ddbkg/docs/examples/</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>I would like to thank my supervisors Prof. Dr. Harald Sack and Dr. Shufan Jiang for their invaluable mentoring and support.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Think culture: Europeana.eu from concept to construction</title>
		<author>
			<persName><forename type="first">J</forename><surname>Purday</surname></persName>
		</author>
		<idno type="DOI">10.1515/bfup.2009.018</idno>
	</analytic>
	<monogr>
		<title level="j">Bibliothek Forschung und Praxis</title>
		<imprint>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="page" from="170" to="180" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">DDB-EDM to FaBiO: The Case of the German Digital Library</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Tan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Tietz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Bruns</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Oppenlaender</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dessì</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Sack</surname></persName>
		</author>
		<ptr target="CEUR-WS.org" />
	</analytic>
	<monogr>
		<title level="m">Proc. of the 20th Int. Semantic Web Conference - Posters and Demos - ISWC 2021</title>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">2980</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Great Article</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Tan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Sack</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Workshop on Deep Learning and Linguistic Linked Data</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">A review of the cultural heritage linked open data ontologies and models</title>
		<author>
			<persName><forename type="first">F</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hindmarch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hess</surname></persName>
		</author>
		<idno type="DOI">10.5194/isprs-archives-XLVIII-M-2-2023-943-2023</idno>
		<idno>XLVIII-M-2-2023</idno>
		<ptr target="https://isprs-archives.copernicus.org/articles/XLVIII-M-2-2023/943/2023/" />
	</analytic>
	<monogr>
		<title level="m">The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences</title>
				<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="943" to="950" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Building a national semantic web ontology and ontology service infrastructure -the finnonto approach</title>
		<author>
			<persName><forename type="first">E</forename><surname>Hyvönen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Viljanen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tuominen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Seppälä</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The Semantic Web: Research and Applications</title>
				<editor>
			<persName><forename type="first">S</forename><surname>Bechhofer</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Hauswirth</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Hoffmann</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Koubarakis</surname></persName>
		</editor>
		<meeting><address><addrLine>Berlin, Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="95" to="109" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">ArCo: The Italian Cultural Heritage Knowledge Graph</title>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">A</forename><surname>Carriero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gangemi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">L</forename><surname>Mancinelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Marinucci</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">G</forename><surname>Nuzzolese</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Presutti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Veninata</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-030-30796-7_3</idno>
	</analytic>
	<monogr>
		<title level="m">The Semantic Web - ISWC 2019</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="36" to="52" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Reflecting on the Europeana Data Model</title>
		<author>
			<persName><forename type="first">S</forename><surname>Peroni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Tomasi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Vitali</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IRCDL</title>
		<imprint>
			<biblScope unit="page" from="228" to="240" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<author>
			<persName><forename type="first">B</forename><surname>Tillet</surname></persName>
		</author>
		<title level="m">What is FRBR?: A Conceptual Model for the Bibliographic Universe</title>
				<imprint>
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Europeana: What users search for and why</title>
		<author>
			<persName><forename type="first">P</forename><surname>Clough</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Hill</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">L</forename><surname>Paramita</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Goodale</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Research and Advanced Technology for Digital Libraries</title>
				<editor>
			<persName><forename type="first">J</forename><surname>Kamps</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Tsakonas</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Manolopoulos</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Iliadis</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">I</forename><surname>Karydis</surname></persName>
		</editor>
		<imprint>
			<pubPlace>Cham</pubPlace>
			<publisher>Springer International Publishing</publisher>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="207" to="219" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Knowledge graph refinement: A survey of approaches and evaluation methods</title>
		<author>
			<persName><forename type="first">H</forename><surname>Paulheim</surname></persName>
		</author>
		<idno type="DOI">10.3233/SW-160218</idno>
		<ptr target="https://doi.org/10.3233/SW-160218" />
	</analytic>
	<monogr>
		<title level="j">Semant. Web</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page" from="489" to="508" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Information extraction overview</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">E</forename><surname>Okurowski</surname></persName>
		</author>
		<idno type="DOI">10.3115/1119149.1119164</idno>
		<ptr target="https://aclanthology.org/X93-1012" />
	</analytic>
	<monogr>
		<title level="m">TIPSTER TEXT PROGRAM: PHASE I: Proceedings of a Workshop held at Fredricksburg, Virginia, September 19-23, 1993</title>
		<meeting><address><addrLine>Fredericksburg, Virginia, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="1993">September 19-23, 1993</date>
			<biblScope unit="page" from="117" to="121" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Named entity recognition and classification in historical documents: A survey</title>
		<author>
			<persName><forename type="first">M</forename><surname>Ehrmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Hamdi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">L</forename><surname>Pontes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Romanello</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Doucet</surname></persName>
		</author>
		<idno type="DOI">10.1145/3604931</idno>
		<ptr target="https://doi.org/10.1145/3604931" />
	</analytic>
	<monogr>
		<title level="j">ACM Comput. Surv.</title>
		<imprint>
			<biblScope unit="volume">56</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Description of a Corpus of Character References in German Novels - DROC [Deutsches ROman Corpus]</title>
		<author>
			<persName><forename type="first">M</forename><surname>Krug</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Weimer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Reger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Macharowsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Feldhaus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Puppe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Jannidis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">DARIAH-DE Working Papers</title>
		<imprint>
			<biblScope unit="volume">27</biblScope>
			<biblScope unit="page" from="1" to="16" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Entity Linking for Historical Documents: Challenges and Solutions</title>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">L</forename><surname>Pontes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">A</forename><surname>Cabrera-Diego</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">G</forename><surname>Moreno</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Boros</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Hamdi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Sidère</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Coustaty</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Doucet</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-030-64452-9_19</idno>
		<ptr target="https://hal.science/hal-03034492" />
	</analytic>
	<monogr>
		<title level="m">22nd International Conference on Asia-Pacific Digital Libraries</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<meeting><address><addrLine>ICADL</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="volume">12504</biblScope>
			<biblScope unit="page" from="215" to="231" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<author>
			<persName><forename type="first">O</forename><surname>Suominen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Hyvönen</surname></persName>
		</author>
		<ptr target="https://swib.org/swib16/slides/suominen_silos.pdf" />
		<title level="m">From MARC Silos to Linked Data Silos</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">DDB-KG: The German Bibliographic Heritage in a Knowledge Graph</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Tan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Tietz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Bruns</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Oppenlaender</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dessì</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Sack</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">6th Int. Workshop on Computational History at JCDL - HistoInformatics</title>
		<imprint>
			<publisher>CEUR-WS.org</publisher>
			<biblScope unit="volume">2981</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Audio Ontologies for Intangible Cultural Heritage</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Tan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Posthumus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Sack</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of the 19th European Semantic Web Conference - Posters and Demos - ESWC 2022</title>
		<meeting>19th European Semantic Web Conference - Posters and Demos - ESWC 2022</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Tan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Sack</surname></persName>
		</author>
		<ptr target="https://swib.org/swib23/slides/06_Mary%20Ann%20Tan_SWIB2023%20Final.pdf" />
		<title level="m">The DDB Collection and the Limits of Artificial Intelligence</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Schweter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Akbik</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2011.06993</idno>
		<title level="m">FLERT: Document-level features for named entity recognition</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
