<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Linked Stage Graph</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Tabea</forename><surname>Tietz</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">Karlsruhe Institute of Technology</orgName>
								<orgName type="institution" key="instit2">Institute AIFB</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">FIZ Karlsruhe - Leibniz Institute for Information Infrastructure</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jörg</forename><surname>Waitelonis</surname></persName>
							<affiliation key="aff2">
								<orgName type="institution">yovisto GmbH</orgName>
								<address>
									<settlement>Potsdam</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Kanran</forename><surname>Zhou</surname></persName>
							<email>kanran.zhou@student.kit.edu</email>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">Karlsruhe Institute of Technology</orgName>
								<orgName type="institution" key="instit2">Institute AIFB</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Paul</forename><surname>Felgentreff</surname></persName>
							<email>paul.felgentreff@gmail.com</email>
						</author>
						<author>
							<persName><forename type="first">Nils</forename><surname>Meyer</surname></persName>
							<affiliation key="aff3">
								<orgName type="institution">Baden-Württemberg State Archives</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Andreas</forename><surname>Weber</surname></persName>
							<affiliation key="aff3">
								<orgName type="institution">Baden-Württemberg State Archives</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Harald</forename><surname>Sack</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">Karlsruhe Institute of Technology</orgName>
								<orgName type="institution" key="instit2">Institute AIFB</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">FIZ Karlsruhe - Leibniz Institute for Information Infrastructure</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
							<affiliation key="aff2">
								<orgName type="institution">yovisto GmbH</orgName>
								<address>
									<settlement>Potsdam</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Linked Stage Graph</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">2D7480EFDDCB539C53E8257E0EEA9BE2</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-19T15:52+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>cultural heritage</term>
					<term>linked data</term>
					<term>knowledge graph</term>
					<term>UI</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Archives today are publishing their cultural heritage data on the Web for exploration. However, for archive novices, traditional archival structures are often unintuitive and difficult to understand, which hampers data access and consumption. To tackle this problem, Linked Stage Graph was developed, a knowledge graph (KG) built on historical data about the Stuttgart State Theater. The data was made available by the Baden-Württemberg State Archives for the Coding da Vinci hackathon. This demo paper contributes the KG, a SPARQL endpoint, named entity extraction and linking to existing authoritative KGs, as well as a dedicated user interface for exploration.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Digitizing cultural heritage has been a major task for galleries, libraries, archives and museums (GLAM) for many years now. As a result, a number of Web-based platforms have been developed with the goal of enabling researchers to access and analyze the data scientifically and of allowing the general public to explore it. However, many archival web platforms organize their content in a way that is familiar only to archive experts. Users who are unfamiliar with the archival practice of structuring information often find it challenging to access and explore the provided content <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>.</p><p>The Baden-Württemberg State Archives in Germany recognized this issue and opened up their data to the Coding da Vinci initiative, the first German open cultural heritage hackathon. The initiative organizes several hack events a year, bringing together GLAM institutions as data providers with computer scientists, designers and digital humanists to develop creative and interesting applications on the foundation of these cultural heritage data. The Baden-Württemberg State Archives published a dataset about the Stuttgart State Theaters containing 7,000 historical black and white photographs along with EAD-XML metadata to be 'hacked' by the creatives. The photographs and metadata cover the period from the 1890s to the 1940s. Especially in Germany, this period was marked by social and political upheavals in which democracy, freedom of speech and creativity in particular were challenged. For this reason, the data provided are still of enormous relevance today. For instance, the data reveal which theater performances were allowed to be staged in these difficult times and how certain characters were portrayed <ref type="bibr" target="#b3">[4]</ref>. In order to consume and optimally share these data, specific requirements must be met. 
These include in particular interoperability, availability and comprehensibility on different levels. Linked Data based knowledge graphs (KGs) have become established as a practical means to formally encode, integrate and share data. In this demo paper, Linked Stage Graph<ref type="foot">6</ref> is presented, a KG developed during the Coding da Vinci Süd 2019 hackathon<ref type="foot">7</ref> on the foundation of the aforementioned archival data about the Stuttgart State Theaters. The goal of Linked Stage Graph is to enable researchers as well as the general public to access, analyze and explore the data in intuitive, interesting and useful ways. Along with the KG, the presented prototype demo contributes a publicly available SPARQL endpoint<ref type="foot">8</ref> to enable sophisticated queries for expert users, the extraction and linking of named entities mentioned in the metadata to the Wikidata KG and the German Integrated Authority File (GND)<ref type="foot">9</ref>, a timeline interface for data exploration, and lessons learned. As an additional feature, all 7,000 black and white photographs were colorized using open source tools based on machine learning (ML). All code of this demo is published and freely available on GitHub<ref type="foot">10</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Linked Stage Graph</head><p>This section presents the main contributions of the demo, Linked Stage Graph.</p>
<note place="foot" n="6">http://slod.fiz-karlsruhe.de/</note>
<note place="foot" n="7">https://codingdavinci.de/events/sued/</note>
<note place="foot" n="8">http://slod.fiz-karlsruhe.de/sparql</note>
<note place="foot" n="9">https://www.dnb.de/EN/Standardisierung/GND/gnd_node.html</note>
<note place="foot" n="10">https://github.com/ISE-FIZKarlsruhe/LinkedStageGraph</note></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Dataset</head><p>The project is based on an archival fonds of around 7,000 historical photographs from the Stuttgart State Theaters. The photographs depict scenes and characters from a wide range of productions, from opera to children's theater, dating from the 1890s to the 1940s. In 2009, the Ludwigsburg State Archives took the fonds into their custody. It consisted of 572 sleeves containing prints, photographic plates, nitrate films, photographic negatives and positive images. The archival description captures (where possible) the title and author of the play as well as the directors, choreographers and designers of each production. The provided dataset consists of JPEGs and an EAD-XML file (cf. Fig. <ref type="figure" target="#fig_0">1 left</ref>). The Encoded Archival Description<ref type="foot" target="#foot_0">11</ref> (EAD) is a documentary XML standard for the description of archival finding aids, maintained by the Technical Subcommittee for Encoded Archival Description of the Society of American Archivists together with the Library of Congress. For users unfamiliar with an archive's content structure, it is difficult to interact with EAD-encoded finding aids, because the archive's hierarchy often has to be navigated to extract meaningful information <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>. Also, it has been widely recognized that EAD complicates user interaction with the data because operations like accessing a specific item on-the-fly are either impossible or inefficient. Automated processing is another issue since the degree of freedom for expressing information within EAD is too high <ref type="bibr" target="#b5">[6]</ref>. This work attempts to overcome these shortcomings by transforming the EAD-XML into an RDF representation.</p></div>
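<div xmlns="http://www.tei-c.org/ns/1.0"><p>The idea of turning EAD-XML into RDF can be illustrated with a minimal Python sketch. It is not the conversion pipeline used for Linked Stage Graph. The EAD fragment is a simplified, hypothetical example (real finding aids are namespaced and much deeper), the base IRI http://example.org/unit/ is a placeholder, and the use of the Dublin Core terms identifier and title as predicates is an assumption made only for illustration.</p>
<p><![CDATA[
```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified EAD fragment; not taken from the actual dataset.
EAD_SAMPLE = """
<ead>
  <archdesc>
    <dsc>
      <c id="unit-1">
        <did>
          <unitid>E 18 III Nr 6</unitid>
          <unittitle>Example play title</unittitle>
        </did>
      </c>
      <c id="unit-2">
        <did>
          <unitid>E 18 III Nr 7</unitid>
          <unittitle>Another "quoted" title</unittitle>
        </did>
      </c>
    </dsc>
  </archdesc>
</ead>
"""

BASE = "http://example.org/unit/"  # assumed placeholder namespace
DCT_IDENTIFIER = "http://purl.org/dc/terms/identifier"
DCT_TITLE = "http://purl.org/dc/terms/title"


def escape_literal(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def ead_to_ntriples(xml_text):
    """Emit one N-Triples line per unitid and unittitle of each <c> unit."""
    root = ET.fromstring(xml_text.strip())
    triples = []
    for unit in root.iter("c"):
        unit_id = unit.get("id")
        did = unit.find("did")
        if unit_id is None or did is None:
            continue  # skip units that cannot be identified or described
        subject = BASE + unit_id
        for tag, predicate in (("unitid", DCT_IDENTIFIER), ("unittitle", DCT_TITLE)):
            value = did.findtext(tag)
            if value:
                literal = escape_literal(value.strip())
                triples.append('<%s> <%s> "%s" .' % (subject, predicate, literal))
    return triples


if __name__ == "__main__":
    for line in ead_to_ntriples(EAD_SAMPLE):
        print(line)
```
]]></p>
<p>Each archival unit becomes an addressable subject with flat properties, so a single item can be retrieved directly instead of by walking the finding aid hierarchy.</p></div>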
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Knowledge Graph</head><p>Fig. <ref type="figure" target="#fig_0">1</ref> presents the workflow. Before the input XML file could be processed further, some adaptations were necessary, such as adapting the XML namespaces, adding repository code and language code attributes to the EAD unitid XML tags and replacing the lb tag with the XML character reference &amp;#10;. The actual XML to RDF conversion was performed with two different approaches. First, the generic ReDeFer XML2RDF<ref type="foot" target="#foot_1">12</ref> converter was used. The second approach applied the EAD2RDF<ref type="foot" target="#foot_2">13</ref> XSLT stylesheet to transform the provided XML to RDF. As expected, the two methods produced different results. While the EAD2RDF stylesheet results were incomplete (e.g. the information about the image files was not transformed), the XML2RDF converter produced more complete results but also a vast number of blank nodes, which are difficult to query and to navigate. The two methods also created different IRIs to identify the actual subject of interest: the archival unit. While XML2RDF preferred the archival identifier<ref type="foot" target="#foot_3">14</ref>, the EAD2RDF transformation created the IRIs from the EAD unitid<ref type="foot" target="#foot_4">15</ref>. From a computer science perspective, the first type of IRI was considered more stable and was therefore chosen as the identifier. The results were merged by mapping the archival unit titles and the archival identifiers. Finally, the unwanted IRIs were removed. Many literals in the dataset contained unstructured information, e.g. the titles also included a play's author name and the abstracts contained further information about the persons involved and their roles. To extract this information, a script was created. This also involved defining the vocabulary to model the persons' types of contribution and roles. The aim was to reuse existing vocabularies as far as possible. Due to the clear definition of the domain, the ambiguity in the data was rather low. 
This made it possible to map play, person and location names to Wikidata and GND very quickly. To this end, a dictionary of potentially relevant resources from the vocabularies was extracted and exact string matching was performed. Finally, all information was deployed to an instance of the Virtuoso triple store, which also provides a SPARQL endpoint.</p></div>
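<div xmlns="http://www.tei-c.org/ns/1.0"><p>Dictionary-based linking by exact string matching can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the script used in this work: the dictionary entries and their example.org IRIs are hypothetical placeholders rather than real Wikidata or GND identifiers, and the whitespace and Unicode normalization applied before the comparison is an added assumption.</p>
<p><![CDATA[
```python
import unicodedata

# Hypothetical label dictionary; the IRIs are placeholders, not real identifiers.
LABEL_TO_IRI = {
    "Faust": "http://example.org/entity/play-faust",
    "Richard Wagner": "http://example.org/entity/person-wagner",
    "Stuttgart": "http://example.org/entity/place-stuttgart",
}


def normalize(label):
    """Apply Unicode NFC normalization and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFC", label).split())


def link_exact(mentions, dictionary):
    """Map each mention to an IRI if its normalized form equals a dictionary label.

    The comparison is case-sensitive and performs no fuzzy matching.
    """
    index = {normalize(label): iri for label, iri in dictionary.items()}
    links = {}
    for mention in mentions:
        iri = index.get(normalize(mention))
        if iri is not None:
            links[mention] = iri
    return links


if __name__ == "__main__":
    found = link_exact(["Faust", "  Richard   Wagner ", "faust", "Unknown"], LABEL_TO_IRI)
    for mention, iri in found.items():
        print(repr(mention), iri)
```
]]></p>
<p>Exact matching favors precision over recall, which is a reasonable trade-off when, as described above, the domain is narrow and the ambiguity of names is low.</p></div>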
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Exploration</head><p>A variety of visual means were implemented and utilized to explore the archive data. To bring the historical black and white images to life, they were automatically colorized using an open source tool based on ML <ref type="bibr" target="#b6">[7]</ref>. To inspect the RDF data in a table view, an instance of the open source LodView RDF viewer was adapted and deployed <ref type="bibr" target="#b4">[5]</ref>. The Linked Stage Graph Viewer<ref type="foot" target="#foot_5">16</ref> was implemented for this use case. The viewer is shown in Fig. <ref type="figure" target="#fig_1">2</ref>. It presents a timeline visualization that lets the user explore the rich and detailed images more prominently and keeps the means of interaction simple in order to reduce the technical barriers to engaging with the content. The user can scroll through the images with an overview of the timeline on the right (1). One large representative image for each performance is shown (2), with further thumbnails at the bottom (3). Swiping left or right reveals further plays which took place during the same year (4). Clicking on an image directs the user to the LodView interface, which reveals all data available for the play. In addition to the implemented Linked Stage Graph Viewer, the Vikus Viewer<ref type="foot" target="#foot_6">17</ref> was utilized and connected to the dataset as well. The viewer was previously developed by <ref type="bibr" target="#b2">[3]</ref> and enables intuitive content exploration in a timeline view as well as search and content clustering. During the demo session, all described interfaces can be used to explore the dataset.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Conclusion</head><p>In this paper, Linked Stage Graph is presented, a KG based on historical data about the Stuttgart State Theater. In addition to the KG, a SPARQL endpoint was released, named entities mentioned in the metadata were extracted and linked to existing KGs, and a user interface was developed. The demo was created during the Coding da Vinci Süd 2019 hackathon.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 :</head><label>1</label><figDesc>Fig. 1: Workflow and architecture.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 2 :</head><label>2</label><figDesc>Fig. 2: Linked Stage Graph Viewer</figDesc><graphic coords="5,156.92,115.93,296.91,170.75" type="bitmap" /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="11" xml:id="foot_0">https://www.loc.gov/ead/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="12" xml:id="foot_1">http://rhizomik.net/html/redefer/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="13" xml:id="foot_2">http://data.archiveshub.ac.uk/ead2rdf/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="14" xml:id="foot_3">E.g. http://slod.fiz-karlsruhe.de/labw-2-2599382</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="15" xml:id="foot_4">E.g. from 'Abt. Staatsarchiv Ludwigsburg, E 18 III Nr 6': http://slod.fiz-karlsruhe.de/id/archivalresource/abt.staatsarchivludwigsburg,e18iiibu161</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="16" xml:id="foot_5">http://slod.fiz-karlsruhe.de/#Viewer</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="17" xml:id="foot_6">http://slod.fiz-karlsruhe.de/vikus/</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Acknowledgement. We would like to thank Coding da Vinci for connecting cultural institutions with creatives to develop innovative applications.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">From users to systems: Identifying and overcoming barriers to efficiently access archival data</title>
		<author>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Silvello</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ACHS@JCDL</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Interacting with archival finding aids</title>
		<author>
			<persName><forename type="first">L</forename><surname>Freund</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">G</forename><surname>Toms</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the Association for Information Science and Technology</title>
		<imprint>
			<biblScope unit="volume">67</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="994" to="1008" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<author>
			<persName><forename type="first">K</forename><surname>Glinka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dörk</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Museum im display. visualisierung kultureller sammlungen (vikus)</title>
		<title level="s">Berliner Veranstaltung der internationalen EVA-Serie</title>
		<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
	<note>Konferenzband zur 22</note>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><forename type="first">F</forename><surname>Halbach</surname></persName>
		</author>
		<title level="m">Judenrollen: Darstellungsformen im europäischen Theater von der Restauration bis zur Zwischenkriegszeit</title>
				<imprint>
			<publisher>Walter de Gruyter</publisher>
			<date type="published" when="2008">2008</date>
			<biblScope unit="volume">70</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Lodview: a computer program for the graphical evaluation of lod score results in exclusion mapping of human disease genes</title>
		<author>
			<persName><forename type="first">F</forename><surname>Hildebrandt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Pohlmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Omran</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computers and biomedical research</title>
		<imprint>
			<biblScope unit="volume">26</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="592" to="599" />
			<date type="published" when="1993">1993</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">A unified platform for archival description and access</title>
		<author>
			<persName><forename type="first">C</forename><surname>Prom</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Rishel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Schwartz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Fox</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 7th ACM/IEEE Joint Conference on Digital Libraries</title>
				<meeting>the 7th ACM/IEEE Joint Conference on Digital Libraries<address><addrLine>JCDL</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2007">2007</date>
			<biblScope unit="page" from="157" to="166" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Realtime user-guided image colorization with learned deep priors</title>
		<author>
			<persName><forename type="first">R</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">Y</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Isola</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Geng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">S</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A</forename><surname>Efros</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Graphics (TOG)</title>
		<imprint>
			<biblScope unit="volume">36</biblScope>
			<biblScope unit="issue">4</biblScope>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
