<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">LED: curated and crowdsourced Linked Data on Music Listening Experiences</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Alessandro</forename><surname>Adamou</surname></persName>
							<email>alessandro.adamou@open.ac.uk</email>
							<affiliation key="aff0">
								<orgName type="institution">The Open University</orgName>
								<address>
									<country key="GB">United Kingdom</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Mathieu</forename><surname>d'Aquin</surname></persName>
							<email>mathieu.daquin@open.ac.uk</email>
							<affiliation key="aff0">
								<orgName type="institution">The Open University</orgName>
								<address>
									<country key="GB">United Kingdom</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Helen</forename><surname>Barlow</surname></persName>
							<email>helen.barlow@open.ac.uk</email>
							<affiliation key="aff0">
								<orgName type="institution">The Open University</orgName>
								<address>
									<country key="GB">United Kingdom</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Simon</forename><surname>Brown</surname></persName>
							<email>simon.brown@rcm.ac.uk</email>
							<affiliation key="aff1">
								<orgName type="institution">Royal College of Music</orgName>
								<address>
									<country key="GB">United Kingdom</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">LED: curated and crowdsourced Linked Data on Music Listening Experiences</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">89F93D2CDCD862A20DCD0663DEBD7658</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T20:08+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Linked Data</term>
					<term>Crowdsourcing</term>
					<term>Digital Humanities</term>
					<term>Data Workflow</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>We present the Listening Experience Database (LED), a structured knowledge base of accounts of listening to music in documented sources. LED aggregates scholarly and crowdsourced contributions and is heavily focused on data reuse. To that end, both the storage system and the governance model are natively implemented as Linked Data. Reuse of data from datasets such as the BNB and DBpedia is integrated with the data lifecycle since the entry phase, and several content management functionalities are implemented using semantic technologies. Imported data are enhanced through curation and specialisation with degrees of granularity not provided by the original datasets.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Most research on listening to music either investigates the associated cognitive processes or analyses the reception of music by critics or through commercial indicators such as sales. There is only sporadic research on the cultural and aesthetic position of music among individuals and societies over the course of history. One obstacle to this kind of research is the scarcity of primary source evidence of listening to music. Should such evidence be compiled, we argue that the adoption of explicit structured semantics would help highlight the interactions of listeners with a range of musical repertoires, as well as the settings where music is performed.</p><p>With the Listening Experience Database (LED)<ref type="foot" target="#foot_0">1</ref>, we aim to cover this ground. LED is the product of a Digital Humanities project focused on gathering documented evidence of listening to music across history and musical genres. It accepts contributions from research groups in the humanities as well as from the crowdsourcing community; however, the data management workflow is supervised to guarantee that minimum scholarly conventions are met. Conceived with data reuse in mind, LED is natively implemented as Linked Data. All the operations in the data governance model manipulate triples within, and across, named RDF graphs that encode provenance schemes for users of the system.</p><p>Several content management functionalities available in LED, such as content authoring, review, reconciliation and faceted search, incorporate Linked Data reuse. Reused datasets include DBpedia<ref type="foot" target="#foot_1">2</ref> and the British National Bibliography (BNB)<ref type="foot" target="#foot_2">3</ref>, with music-specific datasets currently under investigation. 
Reused data are also enhanced, as the LED data model is fine-grained and allows for describing portions of documents and excerpts, which are not modelled in the datasets at hand. LED therefore also aims to be a node in its own right in the Linked Data Cloud, providing unique content and contributing to existing data as well. At the time of writing, the LED dataset stores about 1,000 listening experience records contributed by 25 users, half of whom are volunteers from the crowd.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Related work</head><p>A similar effort to aggregate structured data from primary evidence was already carried out for reading experiences <ref type="bibr" target="#b0">[1]</ref>, though the process was not data-driven and the resulting Linked Data were only marginally aligned. We also acknowledge an ongoing project that gathers direct personal experiences from young users of its system, albeit with a minimal data structure<ref type="foot" target="#foot_3">4</ref>. We also drew inspiration from earlier accounts of using DBpedia for music, such as the dbrec recommender <ref type="bibr" target="#b2">[3]</ref>. Crowdsourcing is also gaining the attention of the Semantic Web community, with very recent attempts at tackling data quality aspects <ref type="bibr" target="#b3">[4]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">The Listening Experience Database</head><p>We define a listening experience (LE) as a documented (i.e. with a quotable and citable source) engagement of an individual in an event where some piece of music is played. In terms of conceptual modelling, an LE is a subjective event, and one document describing it is the quoted evidence reported in the database.</p><p>The lifecycle of data in LED involves the roles of contributor, consumer and gatekeeper, and states called draft, submitted, public and blacklisted. Every artifact stored in the system exists in one or more of these states (except that the blacklisted state excludes all others), and a state determines whether a user with a certain role can "see" an artifact or not. What these artifacts are depends on the specific phases in the workflow, which are transitions between these states.</p><p>Authoring. Contributors populate the knowledge base by entering data on an LE and its associated entities. The entry forms are dynamic and provide suggestions and autocompletion data from LED and external datasets in real time (cf. Figure <ref type="figure" target="#fig_1">1</ref>). Artifacts declared during this phase remain in a draft state, and enter a submitted state only once the contributor submits the LE to gatekeepers.</p><p>Review. Privileged users with the gatekeeper role review a submitted artifact and either promote it to the public state, reject it for blacklisting, or demote it to draft again, either by taking over the artifact and amending its data themselves or by sending it back to the original contributor.</p><p>Reconciliation. Gatekeepers can align and merge duplicate artifacts that are found to match. They can compare candidate duplicates with other artifacts in LED and third-party data. This operation does not modify their state.</p><p>Faceted search. 
Consumers can navigate LEs by filtering keyword search results on bespoke criteria, which are not necessarily stored in LED but may also be reused from third-party datasets. Only public artifacts contribute to searches.</p><p>With a native Linked Data implementation, we can immediately integrate reuse with every stage of the data lifecycle starting with data entry, and eliminate a posteriori revision and extraction phases from the workflow, thereby reducing the time-to-publish of our data and having them linked right from the beginning. Also, the named graph model of quad-stores can encode provenance information with the granularity of atomic statements <ref type="bibr" target="#b1">[2]</ref>, thus lending itself to fine-grained and complex trust management models.</p><p>To encode the above workflow entirely in RDF, we used the named graph paradigm to represent states and artifacts. Deciding on the scale of the latter was an issue: while we intended to give gatekeepers control over single RDF triples (or quads, from the named graph perspective), and to give contributors a way to support the truth or falsehood of a triple, this can be complex and time-consuming. Therefore, artifacts are encapsulated into LEs, musical works, literary sources, agents (e.g. people, groups or organisations) and places: these are, for instance, the classes of artifacts that gatekeepers may want to review or reconcile. However, LEs remain the core artifacts of the system: only by creating or editing them can their associated artifacts be generated.</p><p>The LED knowledge base is partitioned into data spaces, each belonging to a user or role. Every contributor owns two RDF graphs, one for draft artifacts and one for submitted ones. Thus, we can keep track of which contributors support a fact by reusing it (e.g. &lt;Messiah (oratorio) composer Georg Frideric Handel&gt;). There is a single graph for public artifacts, and one for blacklisted ones. 
Contributors have access to the graphs they own plus the public graph; gatekeepers can access every user's submitted graph and the public and blacklist graphs. State transitions are realised by parametric SPARQL queries that selectively move RDF triples across these graphs. Along with these data spaces there are rules that determine the visibility of triples to each user, depending on the content of their private graphs. In general, these rules assume that contributors have greater confidence in the facts in their own possession and that, when such facts are missing, they should trust those provided by the community or other datasets.</p></div>
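<div xmlns="http://www.tei-c.org/ns/1.0"><p>The state transitions and visibility rules described in this section can be illustrated with a minimal sketch. It is not the LED implementation: the graph names, the example identifiers and the helper names (graph_for, move_query, QuadStore) are assumptions made for illustration, the quad store is replaced by an in-memory dictionary, and the visibility rule is reduced to "own graphs first, public graph as fallback". The move_query function shows one possible shape for a parametric SPARQL 1.1 Update that moves the triples of one artifact between two named graphs.</p>
<p><code lang="python"><![CDATA[
```python
from collections import defaultdict

# Hypothetical graph names: the paper does not give LED's actual graph IRIs.
PUBLIC = "urn:example:led:public"
BLACKLIST = "urn:example:led:blacklisted"

# Transitions named in the workflow: authoring (draft -> submitted) and
# review (submitted -> public, blacklisted, or back to draft).
ALLOWED = {
    ("draft", "submitted"),
    ("submitted", "public"),
    ("submitted", "blacklisted"),
    ("submitted", "draft"),
}


def graph_for(state, user):
    """Resolve a state to a graph name; draft and submitted are per contributor."""
    if state == "public":
        return PUBLIC
    if state == "blacklisted":
        return BLACKLIST
    return f"urn:example:led:user:{user}:{state}"


def move_query(artifact, src, dst):
    """Build a SPARQL 1.1 Update that moves the triples having `artifact`
    as subject from graph `src` to graph `dst`."""
    pattern = f"<{artifact}> ?p ?o"
    return (
        f"DELETE {{ GRAPH <{src}> {{ {pattern} }} }}\n"
        f"INSERT {{ GRAPH <{dst}> {{ {pattern} }} }}\n"
        f"WHERE  {{ GRAPH <{src}> {{ {pattern} }} }}"
    )


class QuadStore:
    """Toy in-memory stand-in for a quad store: graph name -> set of triples."""

    def __init__(self):
        self.graphs = defaultdict(set)

    def add(self, graph, s, p, o):
        self.graphs[graph].add((s, p, o))

    def transition(self, artifact, user, old, new):
        """Move the artifact's triples between the graphs for two states,
        mirroring the effect of move_query; returns the number moved."""
        if (old, new) not in ALLOWED:
            raise ValueError(f"transition {old} -> {new} is not allowed")
        src, dst = graph_for(old, user), graph_for(new, user)
        moved = {t for t in self.graphs[src] if t[0] == artifact}
        self.graphs[src] -= moved
        self.graphs[dst] |= moved
        return len(moved)

    def visible(self, user, s, p):
        """Objects a contributor sees for (s, p): values in their own graphs
        win; the public graph is consulted only when they hold none."""
        own = set()
        for state in ("draft", "submitted"):
            own |= {t[2] for t in self.graphs[graph_for(state, user)]
                    if t[:2] == (s, p)}
        if own:
            return own
        return {t[2] for t in self.graphs[PUBLIC] if t[:2] == (s, p)}


# Usage: a contributor drafts an LE, submits it, and a gatekeeper publishes it.
store = QuadStore()
store.add(graph_for("draft", "alice"), "ex:le1", "ex:heardWork", "ex:messiah")
store.transition("ex:le1", "alice", "draft", "submitted")
store.transition("ex:le1", "alice", "submitted", "public")
```
]]></code></p>
<p>Keeping one graph per contributor and state means a transition only needs the artifact, the source graph and the target graph as parameters, which is why a single query template can serve every phase of the workflow. A real deployment would send the string returned by move_query to the update endpoint of its quad store instead of calling the in-memory transition method.</p></div>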
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Demonstration</head><p>The audience will be given a live demonstration of the LED system from the point of view of users with the privileged roles of contributor and gatekeeper. We will show the benefits of reusing data from indexed datasets during the entry phase, as well as the implementation of our governance model in Linked Data and its effects on the representation of a resource as seen by the general public or by a specific user. Data reuse and enhancement will be demonstrated through an LE entry form that is auto-populated in real time and open to input by audience members. To demonstrate the governance model, we will run two distinct entries with shared data through the whole draft-submission-gatekeeping lifecycle. We will then show how differently the shared data and their RDF representations appear to each user, based on the trust and provenance policies in place.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head></head><label></label><figDesc>(a) Listening experience submission. (b) Autocompletion from the BNB dataset.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 1 :</head><label>1</label><figDesc>Fig. 1: Example of data entry for "Pictures from Italy" by Charles Dickens.</figDesc><graphic coords="3,136.69,233.68,155.62,214.65" type="bitmap" /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">LED, online at http://www.open.ac.uk/Arts/LED</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">DBpedia, http://dbpedia.org</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">British National Bibliography, http://bnb.data.bl.uk</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">Experiencing Music, http://experiencingmusic.com</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">The Reading Experience Database</title>
		<author>
			<persName><forename type="first">Matthew</forename><surname>Bradley</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Victorian Culture</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="151" to="153" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Named graphs, provenance and trust</title>
		<author>
			<persName><forename type="first">Jeremy</forename><forename type="middle">J</forename><surname>Carroll</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Christian</forename><surname>Bizer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Patrick</forename><forename type="middle">J</forename><surname>Hayes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Patrick</forename><surname>Stickler</surname></persName>
		</author>
		<editor>Allan Ellis and Tatsuya Hagino</editor>
		<imprint>
			<date type="published" when="2005">2005</date>
			<publisher>ACM</publisher>
			<biblScope unit="page" from="613" to="622" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">dbrec - music recommendations using DBpedia</title>
		<author>
			<persName><forename type="first">Alexandre</forename><surname>Passant</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Semantic Web Conference (2)</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">Peter</forename><forename type="middle">F</forename><surname>Patel-Schneider</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Yue</forename><surname>Pan</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Pascal</forename><surname>Hitzler</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Peter</forename><surname>Mika</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Lei</forename><surname>Zhang</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Jeff</forename><forename type="middle">Z</forename><surname>Pan</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Ian</forename><surname>Horrocks</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Birte</forename><surname>Glimm</surname></persName>
		</editor>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="volume">6497</biblScope>
			<biblScope unit="page" from="209" to="224" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">A semantically enabled architecture for crowdsourced linked data management</title>
		<author>
			<persName><forename type="first">Elena</forename><surname>Simperl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Maribel</forename><surname>Acosta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Barry</forename><surname>Norton</surname></persName>
		</author>
		<ptr target="CEUR-WS.org" />
	</analytic>
	<monogr>
		<title level="m">CrowdSearch</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<editor>
			<persName><forename type="first">Ricardo</forename><forename type="middle">A</forename><surname>Baeza-Yates</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Stefano</forename><surname>Ceri</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Piero</forename><surname>Fraternali</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Fausto</forename><surname>Giunchiglia</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="volume">842</biblScope>
			<biblScope unit="page" from="9" to="14" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
