<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Extracting core knowledge from Linked Data</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Valentina</forename><surname>Presutti</surname></persName>
						</author>
						<author>
							<persName><forename type="first">Lora</forename><surname>Aroyo</surname></persName>
							<affiliation key="aff2">
								<orgName type="institution">Vrije Universiteit Amsterdam</orgName>
								<address>
									<country key="NL">The Netherlands</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Alessandro</forename><surname>Adamou</surname></persName>
						</author>
						<author>
							<persName><forename type="first">Balthasar</forename><surname>Schopman</surname></persName>
							<affiliation key="aff2">
								<orgName type="institution">Vrije Universiteit Amsterdam</orgName>
								<address>
									<country key="NL">The Netherlands</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Aldo</forename><surname>Gangemi</surname></persName>
							<affiliation key="aff1">
								<orgName type="department">ISTC</orgName>
								<orgName type="institution">National Research Council</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Guus</forename><surname>Schreiber</surname></persName>
							<affiliation key="aff2">
								<orgName type="institution">Vrije Universiteit Amsterdam</orgName>
								<address>
									<country key="NL">The Netherlands</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff0">
								<orgName type="institution">Alma Mater Studiorum Università di Bologna</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Extracting core knowledge from Linked Data</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">41945335ED287B546842290277E8F06C</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T11:48+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Recent research has shown the Linked Data cloud to be a potentially ideal basis for improving user experience when interacting with Web content across different applications and domains. Using the explicit knowledge of datasets, however, is neither sufficient nor straightforward. Dataset knowledge is often not uniformly organized, thus it is generally unknown how to query for it. To deal with these issues, we propose a dataset analysis approach based on knowledge patterns, and show how the recognition of patterns can support querying datasets even if their vocabularies are previously unknown. Finally, we discuss results from experimenting on three multimedia-related datasets.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The constant expansion of Linked Data (LD) is broadening the range of ways in which its datasets can be exploited to improve search through related data. Current research <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b0">1]</ref> and established Web search firms like Google and Powerset show the benefits of using explicit semantics and LD to refine search results. However, efficiently using the explicit knowledge of each dataset can be awkward and ineffective. Datasets typically cover diverse domains, do not follow a unified way of organizing the knowledge, and differ in size, granularity and descriptiveness. To avoid burdensome, dataset-specific querying schemes, the following are required: (1) measures and indicators that provide a landscape view on a dataset; (2) a way to query a dataset even with no prior knowledge of its vocabulary.</p><p>We propose an approach to examine LD with these problems in mind. It employs a strategy for inspecting datasets and identifying emerging knowledge patterns (KPs). A key step of this method is the construction of a formal logical architecture, or dataset knowledge architecture, which summarizes the key features and figures of one or more datasets, thus addressing requirement (1). This, in turn, relies on the notions of KPs and type-property paths. We identify the central properties and types, i.e. those able to capture most of the knowledge in a dataset, and extract KPs based on the central types. In other words, we extract the dataset vocabulary and analyse the way the data are used in terms of patterns. We also associate general-purpose measures, such as betweenness and centrality, with the knowledge architecture components of a dataset for performing empirical analysis. These notions and measures will be defined throughout the paper. 
Using KPs and paths, we can provide prototypical ready-to-use queries that let core and concealed knowledge emerge, thus addressing requirement (2). Although the method applies to datasets whose logical structure is not known a priori, it is meant to analyse LD for serendipitous knowledge. Unlike a mere reverse-engineering exercise, our method discovers new knowledge about datasets, such as their central types and properties and emerging patterns.</p><p>The method was partly applied manually and our claims on it are observed empirically, yet it can be generalised and fully automated, as the construction of a dataset architecture and the computation of its measures are all derived by directly querying the data using metalevel constructs from RDF and OWL.</p><p>The paper is organized as follows. In Section 2 we discuss the general approach for data retrieval and analysis, together with our leading hypotheses and basic definitions of recurring terms in our methodology. In Section 3 we describe the dataset knowledge architecture, the evaluation measures and an overview of the datasets specifically considered in this analysis. In Section 4 we present and discuss the results of our empirical study, including an example query, knowledge pattern and dataset figures. After an overview of related work in Section 5, we present our conclusions and future work in Section 6.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Approach</head><p>Linked Data typically combine types and properties defined either in in-house ontologies, or in widespread existing ones, e.g. FOAF<ref type="foot" target="#foot_0">4</ref>, DC<ref type="foot" target="#foot_1">5</ref> or GeoNames<ref type="foot" target="#foot_2">6</ref>. It may occur, though, that in-house ontologies have not been formalized, or are not disclosed. Even when the ontologies are available, they do not tell which relevant part of them is actually used in a dataset, and what links are drawn (through the data) between entities across various ontologies. Knowing these details about a dataset is a pre-condition for evaluating its adequacy for being reused in a certain context, for inspecting its content, for integrating it with other (possibly legacy) knowledge; in other words, for using it. We hypothesize that employing KPs for analysing and possibly authoring LD addresses this problem. In this paper, we focus on querying a dataset when its vocabulary is previously unknown, by proceeding as follows:</p><p>(i) we define a method and an ontology for analyzing a dataset and producing a synthesis of it, i.e. a modular abstraction named dataset knowledge architecture, that highlights how a dataset's knowledge is organized, and what its core knowledge components (e.g. central types and associated KPs) are; (ii) we show how this general method and ontology can be exploited for identifying principal KPs extracted from a dataset for producing prototypical queries, through which we are able to retrieve a dataset's core knowledge.</p><p>As another premise to our approach to be described in Section 2, we define a few terms that are used throughout the remainder of this paper.</p><p>A knowledge pattern (KP) for a type in an RDF graph includes: (i) the properties by which instances of this type relate to other individuals; (ii) the types of such individuals for each property. 
A KP is an invariance across observed data or objects that allows a formal or cognitive interpretation <ref type="bibr" target="#b3">[4]</ref>. A KP embeds the key relations that describe a relevant piece of knowledge in a certain domain of interest, similar to linguistic frames and cognitive schemata.</p><p>A path is an ordered type-property sequence that can be traversed in an RDF graph. Note the use of types in lieu of their instances, which instead denote multiple occurrences of the same path. The length of a path is the number of properties involved (possibly even with repetitions).</p><p>Our approach uses a strategy aimed at modelling, inspecting, and summarizing Linked Data sets, thereby drawing what we call their knowledge architecture, which relies on the notions of paths and KPs defined above. The application of this approach is sketched in Figure <ref type="figure" target="#fig_0">1</ref>, and can be synthesized as follows:  The following section illustrates how the components of a dataset knowledge architecture are constructed for performing the steps of our method.</p></div>
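<div xmlns="http://www.tei-c.org/ns/1.0"><p>The definition of a KP given above can be illustrated with a minimal Python sketch. It assumes that the type-level triples (subject type, property, object type), i.e. paths of length 1, have already been obtained from a dataset; the function name knowledge_patterns and the three sample triples are illustrative assumptions, not part of the published tooling.</p><p>
```python
from collections import defaultdict


def knowledge_patterns(path_elements):
    """Group (subject type, property, object type) triples by subject type.

    The result maps each type to its properties, and each property to the
    object types it reaches, mirroring points (i) and (ii) of the KP definition.
    """
    grouped = defaultdict(lambda: defaultdict(set))
    for subject_type, prop, object_type in path_elements:
        grouped[subject_type][prop].add(object_type)
    # Plain dicts and sorted lists make the output stable and easy to inspect.
    return {
        subject_type: {prop: sorted(objs) for prop, objs in props.items()}
        for subject_type, props in grouped.items()
    }


# Hypothetical sample of length-1 paths, using names that occur in Jamendo.
sample = [
    ("mo:MusicArtist", "foaf:made", "mo:Record"),
    ("mo:Record", "mo:available_as", "mo:Torrent"),
    ("mo:Record", "mo:track", "mo:Track"),
]
kps = knowledge_patterns(sample)
```
</p><p>In this sketch, kps["mo:Record"] plays the role of KP(mo:Record): it lists the two properties observed for that type together with their object types.</p></div>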
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Method and datasets</head><p>Our method described in Section 2 focuses on two main empirical results:</p><p>(1) building a knowledge architecture of a dataset able to summarize its key features; and (2) extracting the central KPs of a dataset. In this Section we focus on what a dataset knowledge architecture is and how to construct it.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">The knowledge architecture</head><p>A dataset knowledge architecture is an ontology that expresses a dataset vocabulary in a modular way. Its components are selected based on measures that indicate their importance in capturing the core knowledge in a dataset. In other words, it is an abstraction over an RDF graph, which offers a modular view as opposed to the usual "class-property" view provided by vocabularies and ontologies, since it is open to queries that are agnostic to specific types and properties used in the dataset. To populate it, we inspect a dataset for (i) the types and properties it uses, (ii) its typical paths, i.e. type-property sequences, and (iii) quantitative statistics about their usage. The knowledge architecture schema is available online<ref type="foot" target="#foot_4">8</ref>. This formalism allows us to empirically analyse a dataset architecture through SPARQL queries. Figure <ref type="figure" target="#fig_2">2</ref> depicts the main entities defined by the dataset knowledge architecture ontology. With the help of this ontology, we aim at deriving, in a bottom-up way, the ontology actually employed for representing the data in an LD dataset and at producing additional useful knowledge about a dataset, e.g. its central types. We identify (i) the properties used in a dataset's triples, and model them through the class Property; (ii) the types (classes or literals) of the subject and object resources of such triples, and model them through the class Type; and (iii) the typical paths that connect triples in a dataset, and model them through the class Path.</p><p>A Path is an ordered set $\{T_1, p_1, \ldots, p_l, T_{l+1}\}$, where $T_i$ is a Type, $p_i$ is a property, and $l$ is the path length. Each ordered subset $\{T_i, p_i, T_{i+1}\}$ of a Path of length $l$ is called PathElement, and is associated with its position $i = 1, \ldots, l$ in the path. 
For example, one DBTune Jamendo 9 instance of Path is given in the separate unnumbered example, where mo: and foaf: are prefixes for the Music Ontology (i.e., http://purl.org/ontology/mo/) and FOAF (i.e., http://xmlns.com/foaf/0.1/) namespaces. We then define four properties describing PathElement: hasProperty, hasPosition, hasPathElementObjectType and hasPathElementSubjectType. Each Path is associated with an instance of PathOccurrencesInDataset, which indicates the observed number of occurrences of that path in a Dataset.</p><p>We also define the concepts CentralType and CentralProperty, which identify the entities capturing most of the knowledge in a dataset. The type KnowledgePattern is used for storing the emerging knowledge patterns.</p></div>
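<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a minimal sketch of the Path and PathElement notions, the following Python fragment splits a path into its positioned path elements. The flat-list encoding of a path and the function name path_elements are assumptions made here for illustration; in the knowledge architecture itself these entities are ontology individuals described by hasPosition, hasProperty and the two type properties.</p><p>
```python
def path_elements(path):
    """Split a path [T1, p1, T2, ..., pl, Tl+1] into positioned path elements.

    Each result is (position, subject type, property, object type),
    with positions running from 1 to the path length l.
    """
    # A well-formed path alternates types and properties and has odd size >= 3.
    if len(path) % 2 == 0 or len(path) == 1:
        raise ValueError("expected T1, p1, T2, ..., pl, Tl+1")
    length = len(path) // 2  # number of properties, i.e. the path length
    return [
        (i + 1, path[2 * i], path[2 * i + 1], path[2 * i + 2])
        for i in range(length)
    ]


# The Jamendo path of length 2 used as an example in this paper.
jamendo_path = [
    "mo:MusicArtist", "foaf:made", "mo:Record", "mo:available_as", "mo:Torrent",
]
elements = path_elements(jamendo_path)
```
</p><p>For the Jamendo example this yields two path elements, at positions 1 and 2, sharing mo:Record as the object type of the first and the subject type of the second.</p></div>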
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Measures</head><p>Table <ref type="table" target="#tab_1">1</ref> illustrates the measures that we associate to the knowledge architecture of a dataset to empirically analyse and interpret them to support our conclusive statements in Section 4. Measures from #Triples to #PathOcc hold for a dataset as a whole, while the others are related to a type or property and can be computed either for one dataset or across multiple datasets. Most of the measures have been computed by combining SPARQL queries and software scripting 10 .</p><p>The measures for identifying the central types and properties of a dataset are also shown. Type betweenness and Property betweenness are simplifications of centrality measures used in graph theory. Although we do not model the knowledge architecture of a dataset as a graph, its structure approximates it through the notion of directed paths and based on the empirical observation that all paths longer than 3 are composed of the observed shorter paths.</p><p>We can then compute betweenness of types by counting the participation of types as subjects in paths of length 2 at position 2, and betweenness of properties by counting the participation of properties in paths of length 3 in position 2.</p><p>The combination of betweenness values and number of instances indicates the value of centrality of a type or property for a dataset. Central types and properties are able to capture most of the knowledge expressed in a dataset.</p></div>
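<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two betweenness counts described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: paths are encoded as flat tuples (T1, p1, T2, ...), type betweenness follows the reading in which the counted type is the subject of the path element at position 2 of a path of length 2, and the sample paths are hypothetical. How betweenness is combined with the number of instances to obtain centrality is not modelled here.</p><p>
```python
from collections import Counter


def path_length(path):
    """Number of properties in a flat path (T1, p1, T2, ..., pl, Tl+1)."""
    return len(path) // 2


def type_betweenness(paths):
    """Count, per type, the paths of length 2 whose middle type it is."""
    return Counter(p[2] for p in paths if path_length(p) == 2)


def property_betweenness(paths):
    """Count, per property, the paths of length 3 having it at position 2."""
    return Counter(p[3] for p in paths if path_length(p) == 3)


# Hypothetical sample of distinct paths; real input would come from the dataset.
paths = [
    ("mo:MusicArtist", "foaf:made", "mo:Record", "mo:available_as", "mo:Torrent"),
    ("mo:MusicArtist", "foaf:made", "mo:Record", "mo:track", "mo:Track"),
    ("mo:Signal", "mo:published_as", "mo:Track", "mo:available_as",
     "mo:Torrent", "dc:format", "rdfs:Literal"),
    ("mo:Record", "mo:track", "mo:Track", "mo:available_as",
     "mo:Torrent", "dc:format", "rdfs:Literal"),
]
type_scores = type_betweenness(paths)
property_scores = property_betweenness(paths)
```
</p><p>On this sample, mo:Record is the only type with non-zero type betweenness and mo:available_as the only property with non-zero property betweenness, since each sits in the middle of both paths of the relevant length.</p></div>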
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Datasets</head><p>We initially examined several datasets including general-purpose, cross-domain and multimedia domain-specific datasets. To ensure a statistically relevant sample for our experiments, we selected datasets:</p><p>1. addressing a specific knowledge domain; 2. addressing the same specific domain, or a conceptually related one, hence leading to possible cross-relations among them; 3. with different sizes; 4. that use third-party as well as in-house ontologies.</p><p>We eventually chose three datasets related to the multimedia domain.</p><p>Jamendo<ref type="foot" target="#foot_5">11</ref> is an online distributor of independent musical artists. The represented data focus on record authorship, release and distribution over internet channels. Being part of the DBTune service, its data representation relies on the Music Ontology<ref type="foot" target="#foot_6">12</ref> and parts of the FOAF, Dublin Core, Event<ref type="foot" target="#foot_7">13</ref>, Timeline<ref type="foot" target="#foot_8">14</ref> and Tags<ref type="foot" target="#foot_9">15</ref> ontologies. The indie nature of its hosted artists, who are scarcely represented in other datasets, makes Jamendo a primary source for its content.</p><p>John Peel Sessions (JPeel) 16 includes data related to live musical performances for the John Peel Show aired on BBC Radio One, and the resulting record releases. It is also a DBTune dataset, but being more event-focused, it mostly reuses a different portion of the Music Ontology vocabulary than Jamendo does.</p><p>LinkedMDB (LMDB)<ref type="foot" target="#foot_10">17</ref> is a triplified database of the film industry domain. It encompasses the entities involved with film production and release, plus additional metadata concerning ratings and events such as film festivals. 
The LinkedMDB ontology is unpublished<ref type="foot" target="#foot_11">18</ref> and almost entirely in-house, with a few exceptions such as FOAF and Dublin Core terms.</p><p>These datasets address domain-specific knowledge, thus satisfying our criterion 1. They also address criterion 2, as Jamendo and JPeel share the music production domain (albeit with different data and perspectives), while LMDB addresses the movie production domain, which is related to music, e.g. through soundtrack authorship. Table <ref type="table" target="#tab_2">2</ref> shows how they differ in dimensions, thus satisfying criterion 3. Finally, as for criterion 4, Jamendo and JPeel heavily reuse external ontologies, while LMDB mainly uses a proprietary one. Additionally, their vocabulary usage has little to no overlap.</p><p>We excluded general-purpose and cross-domain datasets, e.g. DBpedia and GeoNames, because research experience on applying patterns to them already exists. Examples are <ref type="bibr" target="#b9">[10]</ref>, which addresses the application of patterns to general-purpose datasets such as WordNet<ref type="foot" target="#foot_12">19</ref>, and <ref type="bibr" target="#b16">[17]</ref>, which applies patterns to the Thesaurus of Geographic Names<ref type="foot" target="#foot_13">20</ref>, addressing geographical as well as art-related knowledge. In other words, we opted to experiment on a different type of resource in order to lay the basis, in our future work, for comparing our method and results with existing approaches.</p><p>For each dataset we computed the measures from Table <ref type="table" target="#tab_1">1</ref>. They provided us with a means to objectively describe datasets according to our selection criteria.</p><p>By the figures in Table <ref type="table" target="#tab_3">3</ref>, the three datasets are not very sparse, due to their high property usage values. This favours the identification of central types and properties. 
Additionally, datasets differ by several orders of magnitude in number of paths and their occurrences, thus confirming their variety in size. The full list of types and properties used in the three datasets is available online<ref type="foot" target="#foot_14">21</ref> .</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Results and examples</head><p>Based on the described entities and the associated measures (cf. Sections 3.2 and 3.3), we can compute each dataset's central types and properties and extract their most representative Paths, with the aim of building prototypical queries. These can be constructed by identifying a set of path elements that allow us to retrieve the relevant knowledge about central types.</p><p>We extract all emerging KPs by querying all distinct paths of length 1 and grouping them by their subject types. The KP of a type C (i.e., KP(C)) includes the type C, all the properties used for describing its instances, and the object types connected to them. The list of KPs extracted from the three datasets is published online<ref type="foot" target="#foot_15">22</ref>.</p><p>A prototypical query has to provide an effective summarization of a dataset's knowledge about a central type. Such a summarization includes more than the properties used for describing that type's instances; that is, we need a way to identify an interesting neighborhood of the type's instances. To this aim, we exploit the notion of central properties and their clusters of paths of length 3 (where the property is at position 2). Examples of clusters, the queries used for retrieving them, and the full list of extracted KPs are available online<ref type="foot" target="#foot_16">23</ref>.</p><p>We identify the types and properties through which most of the dataset knowledge transits (i.e., central types and properties, respectively), and show that they have a primary role in selecting KPs and paths for building prototypical queries. We report here the statistics computed for these types and properties in Jamendo<ref type="foot" target="#foot_17">24</ref>. 
Figures <ref type="figure">4(a)</ref> and 4(b) show the metrics used for identifying central types; that is, as our task is to build prototypical queries, we look at types with both high betweenness and many individuals, thereby addressing centrality and recall. In the case of Jamendo, we select mo:Playlist, mo:Track, and mo:Signal. Figures <ref type="figure">4(c)</ref> and 4(d) show the metric values for identifying central properties, i.e. the number of triples that instantiate a property and the betweenness of properties. By adopting the same policy as for central types, we select mo:available_as, mo:published_as, and foaf:made.</p><p>A prototypical query, which provides a meaningful summarization of a dataset's knowledge about a central type, is built by combining KPs and central properties' clusters of paths of length 3, by implementing the following algorithm:</p><p>1. Create an empty list PE to store path elements; 2. Let C be a central type; 3. Take the knowledge pattern of C, KP(C), and add its path elements to PE; 4. Identify the central properties p1, ..., pn involved in KP(C); 5. For all pi, i = 1, ..., n, take paths of length 3 with clustering factor = pi (select paths with property pi in position 2); 6. Identify path elements in the paths that include C and that are not in KP(C), and add them to PE; 7. Build a CONSTRUCT query from the path elements in PE by using the OPTIONAL construct.</p><p>Let us exemplify the generation of a prototypical query, following the algorithm above, for C = mo:Track, which is a central type in Jamendo. Based on KP(mo:Track), PE will be initialized with 4 paths, characterized by the properties {dc:title, mo:available_as, mo:license, mo:track_number} 25 . Among them, only mo:available_as is a central property in Jamendo, hence we pick up its cluster of paths (of length 3) in order to enrich the set PE that will be used for building the prototypical query. 
From this cluster, we collect 3 additional paths as they are connected to mo:Track. Two of them identify incoming links to mo:Track (i.e., mo:track and mo:published_as), and one identifies one link in the neighborhood of mo:Track (i.e., dc:format), which reaches a 2-degree distance from it in its knowledge graph. This additional link in the neighborhood shows how this approach allows building queries that capture more meaningful knowledge than a simple SPARQL DESCRIBE. An interesting investigation that we have planned in our future work is to study the cognitive soundness of these queries with respect to user interaction tasks.</p><p>The result of the described procedure is the following query. Figure <ref type="figure">3</ref> shows the resulting graph when such a query is applied to a specific instance of mo:Track, http://dbtune.org/jamendo/track/7593 in this specific case.</p><p>construct {
  ?t a mo:Track .
  ?t dc:title ?t1 .
  ?t mo:available_as ?t2 .
  ?t2 dc:format ?f .
  ?t mo:license ?t3 .
  ?t mo:track_number ?t4 .
  ?s a mo:Signal .
  ?s mo:published_as ?t .
  ?r a mo:Record .
  ?r mo:track ?t .
}
from jamendo_dataset
where {
  ?t a mo:Track .
  ?t dc:title ?t1 .
  {
    {OPTIONAL { ?t mo:available_as ?t2 . ?t2 dc:format ?f }}
    UNION {OPTIONAL { ?t mo:license ?t3 }}
    UNION {OPTIONAL { ?t mo:track_number ?t4 }}
    UNION {OPTIONAL { ?s a mo:Signal . ?s mo:published_as ?t }}
    UNION {OPTIONAL { ?r a mo:Record . ?r mo:track ?t }}
  }
}</p><p>These steps allow us to build a summary of a dataset, which supports the retrieval of the most representative knowledge for its domain as (i) paths of length 3 are enough for capturing all knowledge structures, (ii) central types catch the most representative knowledge of the dataset, (iii) KPs convey the description of types, and (iv) central properties link the most representative KPs of a dataset. 
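This analytic approach showed that we can summarize a dataset through a relatively small knowledge architecture, thus limiting the impact of empirical analysis on computational complexity. In our three experiments of Section 3.3, we built a knowledge architecture dataset of 130,373 triples representing three datasets whose combined size sums up to over 7 · 10^6 triples. The architecture is available online.</p></div>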
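<div xmlns="http://www.tei-c.org/ns/1.0"><p>Steps 1 to 6 of the algorithm in this section can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in the experiments: the function name, the flat-tuple encoding of paths, and the object types in the sample KP are assumptions, and step 6 is read as "take every path element of a clustered path that includes C", which is the reading suggested by the dc:format link in the mo:Track example. Step 7, the generation of the CONSTRUCT query text, is left out.</p><p>
```python
def prototypical_path_elements(central_type, kp, central_properties, length3_paths):
    """Collect the path elements PE for a central type C (steps 1 to 6)."""
    # Steps 1-3: start PE from the path elements of KP(C).
    pe = [(central_type, prop, obj) for prop, objs in kp.items() for obj in objs]
    # Step 4: central properties that are involved in KP(C).
    involved = [prop for prop in kp if prop in central_properties]
    for prop in involved:
        for path in length3_paths:
            # Step 5: the clustering factor must be the property at position 2.
            if len(path) != 7 or path[3] != prop:
                continue
            # Step 6: only paths that include C contribute, and only the
            # path elements that are not yet in PE are added.
            if central_type not in path[0::2]:
                continue
            for i in range(3):
                element = (path[2 * i], path[2 * i + 1], path[2 * i + 2])
                if element not in pe:
                    pe.append(element)
    return pe


# Hypothetical inputs modelled on the mo:Track example; object types are assumed.
kp_track = {
    "dc:title": ["rdfs:Literal"],
    "mo:available_as": ["mo:Torrent"],
    "mo:license": ["foaf:Document"],
    "mo:track_number": ["rdfs:Literal"],
}
cluster = [
    ("mo:Signal", "mo:published_as", "mo:Track", "mo:available_as",
     "mo:Torrent", "dc:format", "rdfs:Literal"),
    ("mo:Record", "mo:track", "mo:Track", "mo:available_as",
     "mo:Torrent", "dc:format", "rdfs:Literal"),
]
pe = prototypical_path_elements(
    "mo:Track", kp_track, {"mo:available_as", "mo:published_as", "foaf:made"}, cluster
)
```
</p><p>With these inputs, PE holds the four path elements of KP(mo:Track) followed by three additional ones, for mo:published_as, dc:format and mo:track, which matches the count reported for the Jamendo example.</p></div>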
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Related work</head><p>There has been valuable research work on understanding the LD cloud recently, which highlights different approaches, perspectives, and specific goals. Some works focus on providing macroscopic analysis of LD as a whole, such as <ref type="bibr" target="#b4">[5]</ref>, which analyzes the typical usage of the owl:sameAs standard property; <ref type="bibr" target="#b10">[11]</ref>, which identifies and generates relations between ontologies used in LD; <ref type="bibr" target="#b6">[7]</ref>, which discusses how LD would benefit from vocabulary alignments with foundational ontologies; and <ref type="bibr" target="#b12">[13]</ref>, which identifies the most important vocabularies and classes over large-scale distributed datasets.</p><p>Works such as <ref type="bibr" target="#b8">[9]</ref>, <ref type="bibr" target="#b2">[3]</ref>, <ref type="bibr" target="#b14">[15]</ref> focus mainly on query optimization.</p><p>Other research efforts exploit LD for supporting user interaction with content-intensive applications, such as <ref type="bibr" target="#b11">[12]</ref>, <ref type="bibr" target="#b15">[16]</ref>, and <ref type="bibr" target="#b13">[14]</ref>, but do not try to provide a summarization of the used RDF datasets.</p><p>Finally, <ref type="bibr" target="#b1">[2]</ref> and <ref type="bibr" target="#b7">[8]</ref> focus on providing a compact representation of RDF datasets. In both cases, the main difference with our approach is the lack of design perspective and conceptual analysis. In particular, the taxonomy of classes and properties is not considered: they treat all classes and properties in the same way, while we use the taxonomy for eliminating redundancies. 
Furthermore, <ref type="bibr" target="#b1">[2]</ref> does not consider the notion of central types and properties (or an analogous one), while <ref type="bibr" target="#b7">[8]</ref> does not exploit semantic web technologies for storing the RDF dataset summaries, which is instead a characteristic of our approach. Finally, our approach focuses on identifying prototypical queries that convey meaningful conceptual organization around a certain type based on the notions of centrality.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Conclusion and future work</head><p>We have shown how to summarize Linked Data sets by treating them as sets of connected knowledge patterns, in order to identify their core knowledge components. We have experimented on three datasets from the LD cloud, and showed how to build prototypical queries for them even when the ontologies that model them are unknown. We plan, in our future work, to compare ontologies explicitly published and used for a dataset with the knowledge architecture that arises from our analysis.</p><p>Our ongoing and future work focuses on extending our strategy, in order to (i) demonstrate how aligning the emerging KPs of a dataset to general KPs improves interoperability across datasets and detects incompatibility issues, (ii) compare analysis data about different datasets, and (iii) improve user interaction in searches for relevant content. We plan to improve the method by performing additional analysis on an extensive coverage of the multimedia domain, and to subsequently evaluate the cross-domain portability of our approach.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. Linked Data analysis methodology. Side arrows denote whether the dataset or the knowledge architecture is being accessed for reading or writing on each step.</figDesc><graphic coords="3,134.77,244.01,242.08,142.48" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>1 .</head><label>1</label><figDesc>Gather property usage statistics for a chosen dataset and store them as ABox of the knowledge architecture ontology; 2. Query the dataset for extracting paths. Store all paths with length up to 4, with their usage statistics, in the knowledge architecture dataset 7 ; 3. Identify central types and central properties based on their frequencies in a key position in paths, i.e. betweenness, and number of instantiations; 4. Extract emerging KPs based on the dataset's central types and properties; 5. Select clustering factors among central properties, i.e. those properties occupying the same position in a set of paths, and construct path clusters;</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 2 .</head><label>2</label><figDesc>Fig. 2. Diagram of the knowledge architecture ontology. Property arcs denote either class restrictions or domain/range pairs across their nodes.</figDesc><graphic coords="4,146.58,302.33,207.50,139.25" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head></head><label></label><figDesc>{mo:MusicArtist foaf:made mo:Record mo:available_as mo:Torrent} has length 2, and is composed of the following PathElements: {mo:MusicArtist, foaf:made, mo:Record} (position 1); {mo:Record, mo:available_as, mo:Torrent} (position 2).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head></head><label></label><figDesc>of properties that participates in paths of length n by the total number of used properties, with n = 2...4. Type betweenness: the capability of a type to catch meaningful knowledge. Count the number of paths of length 2 in which a type participates at position 1 with subject role. Property betweenness: the capability of a property to catch meaningful knowledge. Count the number of paths of length 3 in which a property participates at position 2.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Fig. 3 .Fig. 4 .</head><label>34</label><figDesc>Fig. 3. Result of the prototypical query for mo:Track filtered on the instance http://dbtune.org/jamendo/track/7593. knowledge architecture dataset of 130,373 triples representing three datasets whose combined size sums up to over 7 · 10^6 triples. The architecture is available online 26 .</figDesc><graphic coords="10,134.77,101.90,276.67,58.77" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>9</head><label></label><figDesc>DBTune Jamendo, http://dbtune.org/jamendo/ 10 Queries available at http://stlab.istc.cnr.it/stlab/LOD-Analysis-Statistics</figDesc><table><row><cell>Measure</cell><cell>What it indicates</cell><cell>How it is computed</cell></row><row><cell># Triples</cell><cell>Number of triples that con-</cell><cell>Sum all PropertyUsageInDataset triples.</cell></row><row><cell></cell><cell>stitute a dataset.</cell><cell></cell></row><row><cell># Props</cell><cell cols="2">Number of used properties. Count all PropertyUsageInDataset.</cell></row><row><cell># Types</cell><cell>Number of used types.</cell><cell>Cardinality of the union set of types related to</cell></row><row><cell></cell><cell></cell><cell>PathElement with either hasPathElementSubjectType or</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1 .</head><label>1</label><figDesc>Measures of dataset dimensions and characteristics in terms of number of triples, number of types and properties used, number of paths of length 2 to 4, number of occurrences of paths of length 2 to 4, and property usage in paths of length 2 to 4.</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2 .</head><label>2</label><figDesc>Dimension indicators for the three datasets we analysed.</figDesc><table><row><cell>Dataset</cell><cell>Jamendo</cell><cell>JPeel</cell><cell>LMDB</cell></row><row><cell>nTriples</cell><cell>1,047,950</cell><cell>271,369</cell><cell>6,147,978</cell></row><row><cell>nProps</cell><cell>24</cell><cell>24</cell><cell>221</cell></row><row><cell>nTypes</cell><cell>11</cell><cell>9</cell><cell>53</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 3 .</head><label>3</label><figDesc>Path and property-related figures for the three datasets we analysed.</figDesc><table><row><cell>Measure</cell><cell>Dataset</cell><cell>L=2</cell><cell>L=3</cell><cell>L=4</cell></row><row><cell>nPath</cell><cell>Jamendo</cell><cell>33</cell><cell>31</cell><cell>26</cell></row><row><cell>nPath</cell><cell>JPeel</cell><cell>56</cell><cell>65</cell><cell>73</cell></row><row><cell>nPath</cell><cell>LMDB</cell><cell>546</cell><cell>1,663</cell><cell>3,757</cell></row><row><cell>nPathOcc</cell><cell>Jamendo</cell><cell>999,052</cell><cell>1,452,645</cell><cell>2,259,097</cell></row><row><cell>nPathOcc</cell><cell>JPeel</cell><cell>1,948,999</cell><cell>14,447,400</cell><cell>1,240,815,607</cell></row><row><cell>nPathOcc</cell><cell>LMDB</cell><cell>25,765,513</cell><cell>184,950,315</cell><cell>1,402,705,472</cell></row><row><cell>Property usage in paths</cell><cell>Jamendo</cell><cell>1</cell><cell>0.917</cell><cell>0.834</cell></row><row><cell>Property usage in paths</cell><cell>JPeel</cell><cell>0.917</cell><cell>0.834</cell><cell>0.792</cell></row><row><cell>Property usage in paths</cell><cell>LMDB</cell><cell>0.847</cell><cell>0.747</cell><cell>0.747</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_0">Friend-Of-A-Friend, http://xmlns.com/foaf/0.1/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_1">Dublin Core, http://purl.org/dc/elements/1.1/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_2">GeoNames, http://www.geonames.org/ontology/ontology_v2.2.1.rdf</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_3">The choice of maximum path length 4 was dictated by computational boundaries and the extreme redundancy empirically observed in longer paths (cf. Section 3).</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_4">http://www.ontologydesignpatterns.org/ont/lod-analysis-properties.owl, which imports the paths module as well.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="11" xml:id="foot_5">Jamendo DBTune home, http://dbtune.org/jamendo</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="12" xml:id="foot_6">The Music ontology, http://purl.org/ontology/mo/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="13" xml:id="foot_7">Event Ontology, http://purl.org/NET/c4dm/event.owl#</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="14" xml:id="foot_8">Timeline ontology, http://purl.org/NET/c4dm/timeline.owl#</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="15" xml:id="foot_9">Tag vocabulary, http://www.holygoat.co.uk/owl/redwood/0.1/tags/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="17" xml:id="foot_10">LinkedMDB home, http://www.linkedmdb.org/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="18" xml:id="foot_11">The base namespace http://data.linkedmdb.org/resource/movie/ would not resolve as of August 2011, thus forcing us to assume an implicit schema for the dataset.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="19" xml:id="foot_12">WordNet, http://wordnet.princeton.edu/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="20" xml:id="foot_13">Geographic Names, http://www.getty.edu/research/tools/vocabularies/tgn/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="21" xml:id="foot_14">http://stlab.istc.cnr.it/stlab/LOD-Analysis-TypesAndProperties</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="22" xml:id="foot_15">List of KPs extracted from Jamendo (10 KPs), JPeel (8 KPs), and LMDB (51 KPs), http://stlab.istc.cnr.it/stlab/LOD-Analysis-EmergingKP</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="23" xml:id="foot_16">Example clusters at http://stlab.istc.cnr.it/stlab/LOD-Analysis-Clusters</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="24" xml:id="foot_17">Webpage http://stlab.istc.cnr.it/stlab/LOD-Analysis-Graphs contains the complete charts; http://stlab.istc.cnr.it/stlab/LOD-Analysis-Statistics shows the same data as tables, along with the queries for computing them.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgements</head><p>This work has been part-funded by the European Commission under grant agreement FP7-ICT-2007-3/ No. 231527 (IKS, Interactive Knowledge Stack).</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">CHIP demonstrator: Semantics-driven recommendations and museum tour generation</title>
		<author>
			<persName><forename type="first">L</forename><surname>Aroyo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Stash</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Gorgels</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Rutledge</surname></persName>
		</author>
		<ptr target="CEUR-WS.org" />
	</analytic>
	<monogr>
		<title level="m">Semantic Web Challenge. CEUR Workshop Proceedings</title>
				<imprint>
			<date type="published" when="2007">2007</date>
			<biblScope unit="volume">295</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">DFS-based frequent graph pattern extraction to characterize the content of RDF triple stores</title>
		<author>
			<persName><forename type="first">A</forename><surname>Basse</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Gandon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Mirbel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the WebSci10: Extending the Frontiers of Society On-Line</title>
				<meeting>the WebSci10: Extending the Frontiers of Society On-Line<address><addrLine>Raleigh, NC, US</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">The summary abox: Cutting ontologies down to size</title>
		<author>
			<persName><forename type="first">A</forename><surname>Fokoue</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kershenbaum</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Schonberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Srinivas</surname></persName>
		</author>
		<ptr target="http://dblp.uni-trier.de/db/conf/semweb/iswc2006.html#FokoueKMSS06" />
	</analytic>
	<monogr>
		<title level="m">International Semantic Web Conference</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">I</forename><forename type="middle">F</forename><surname>Cruz</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Decker</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Allemang</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Preist</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Schwabe</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Mika</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Uschold</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Aroyo</surname></persName>
		</editor>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2006">2006</date>
			<biblScope unit="volume">4273</biblScope>
			<biblScope unit="page" from="343" to="356" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Towards a pattern science for the semantic web</title>
		<author>
			<persName><forename type="first">A</forename><surname>Gangemi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Presutti</surname></persName>
		</author>
		<ptr target="http://dblp.uni-trier.de/db/journals/semweb/semweb1.html#GangemiP10" />
	</analytic>
	<monogr>
		<title level="j">Semantic Web</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="issue">1-2</biblScope>
			<biblScope unit="page" from="61" to="68" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">When owl:sameAs isn&apos;t the same: An analysis of identity in Linked Data</title>
		<author>
			<persName><forename type="first">H</forename><surname>Halpin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Hayes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">P</forename><surname>McCusker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>McGuinness</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">S</forename><surname>Thompson</surname></persName>
		</author>
		<ptr target="http://data.semanticweb.org/conference/iswc/2010/paper/261" />
	</analytic>
	<monogr>
		<title level="m">9th International Semantic Web Conference (ISWC2010)</title>
				<imprint>
			<date type="published" when="2010-11">November 2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">RelFinder: Revealing relationships in RDF knowledge bases</title>
		<author>
			<persName><forename type="first">P</forename><surname>Heim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Hellmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lehmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Lohmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Stegemann</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 3rd International Conference on Semantic and Media Technologies (SAMT)</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<meeting>the 3rd International Conference on Semantic and Media Technologies (SAMT)</meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2009">2009</date>
			<biblScope unit="volume">5887</biblScope>
			<biblScope unit="page" from="182" to="187" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Linked Data is merely more data</title>
		<author>
			<persName><forename type="first">P</forename><surname>Jain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Hitzler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">Z</forename><surname>Yeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Verma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">P</forename><surname>Sheth</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">AAAI Spring Symposium: Linked Data Meets Artificial Intelligence</title>
				<imprint>
			<publisher>AAAI Press</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="82" to="86" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">ExpLOD: Summary-based exploration of interlinking and RDF usage in the Linked Open Data cloud</title>
		<author>
			<persName><forename type="first">S</forename><surname>Khatchadourian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">P</forename><surname>Consens</surname></persName>
		</author>
		<ptr target="http://dblp.uni-trier.de/db/conf/esws/eswc2010-2.html#KhatchadourianC10" />
	</analytic>
	<monogr>
		<title level="m">ESWC (2)</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">L</forename><surname>Aroyo</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Antoniou</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Hyvönen</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Ten Teije</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Stuckenschmidt</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Cabral</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Tudorache</surname></persName>
		</editor>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="volume">6089</biblScope>
			<biblScope unit="page" from="272" to="287" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Graph summaries for subgraph frequency estimation</title>
		<author>
			<persName><forename type="first">A</forename><surname>Maduko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Anyanwu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sheth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Schliekelman</surname></persName>
		</author>
		<ptr target="http://data.semanticweb.org/conference/eswc/2008/papers/330" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 5th European Semantic Web Conference</title>
				<editor>
			<persName><forename type="first">M</forename><surname>Hauswirth</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Koubarakis</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Bechhofer</surname></persName>
		</editor>
		<meeting>the 5th European Semantic Web Conference<address><addrLine>Berlin, Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer Verlag</publisher>
			<date type="published" when="2008-06">June 2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">The interaction between automatic annotation and query expansion: a retrieval experiment on a large cultural heritage archive</title>
		<author>
			<persName><forename type="first">V</forename><surname>Malaisé</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Hollink</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Gazendam</surname></persName>
		</author>
		<ptr target="CEUR-WS.org" />
	</analytic>
	<monogr>
		<title level="m">SemSearch. CEUR Workshop Proceedings</title>
				<editor>
			<persName><forename type="first">S</forename><surname>Bloehdorn</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Grobelnik</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Mika</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><forename type="middle">T</forename><surname>Tran</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2008">2008</date>
			<biblScope unit="volume">334</biblScope>
			<biblScope unit="page" from="44" to="58" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Capturing emerging relations between schema ontologies on the Web of Data</title>
		<author>
			<persName><forename type="first">A</forename><surname>Nikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Motta</surname></persName>
		</author>
		<ptr target="http://ceur-ws.org/Vol-665/NikolovEtAl_COLD2010.pdf" />
	</analytic>
	<monogr>
		<title level="m">First International Workshop on Consuming Linked Data (COLD2010)</title>
				<imprint>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Combining social music and Semantic Web for music-related recommender systems</title>
		<author>
			<persName><forename type="first">A</forename><surname>Passant</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Raimond</surname></persName>
		</author>
		<ptr target="http://data.semanticweb.org/workshop/sdow/2008/paper/3" />
	</analytic>
	<monogr>
		<title level="m">Social Data on the Web (SDoW2008)</title>
				<imprint>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Class association structure derived from linked objects</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Qu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Ge</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Cheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Gao</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of WebSci&apos;09: Society On-Line</title>
				<meeting>WebSci&apos;09: Society On-Line<address><addrLine>Athens, Greece</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Open Innovation and Semantic Web: Problem solver search on Linked Data</title>
		<author>
			<persName><forename type="first">M</forename><surname>Stankovic</surname></persName>
		</author>
		<ptr target="http://data.semanticweb.org/conference/iswc/2010/paper/439" />
	</analytic>
	<monogr>
		<title level="m">9th International Semantic Web Conference (ISWC2010)</title>
				<imprint>
			<date type="published" when="2010-11">November 2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Index structures and algorithms for querying distributed RDF repositories</title>
		<author>
			<persName><forename type="first">H</forename><surname>Stuckenschmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Vdovjak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">J</forename><surname>Houben</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Broekstra</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">WWW</title>
		<imprint>
			<biblScope unit="page" from="631" to="639" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Recommendations based on semantically enriched museum collections</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Stash</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Aroyo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Gorgels</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Rutledge</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Schreiber</surname></persName>
		</author>
		<ptr target="http://www.sciencedirect.com/science/article/B758F-4TT7153-1/2/f1bd28cd4d79a0ff70d74439e3f5e3fc" />
	</analytic>
	<monogr>
		<title level="m">Web Semantics</title>
				<imprint>
			<date type="published" when="2008">2008</date>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="page" from="283" to="290" />
		</imprint>
	</monogr>
	<note>Semantic Web Challenge 2006/2007</note>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Semantic relations for content-based recommendations</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Stash</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Aroyo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Hollink</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Schreiber</surname></persName>
		</author>
		<idno type="DOI">10.1145/1597735.1597786</idno>
		<ptr target="http://doi.acm.org/10.1145/1597735.1597786" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the fifth international conference on Knowledge capture</title>
				<meeting>the fifth international conference on Knowledge capture<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="209" to="210" />
		</imprint>
	</monogr>
	<note>K-CAP &apos;09</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
