<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">RDFPath: Path Query Processing on Large RDF Graphs with MapReduce</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Martin</forename><surname>Przyjaciel-Zablocki</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Lehrstuhl für Datenbanken und Informationssysteme</orgName>
								<orgName type="institution">Albert-Ludwigs-Universität Freiburg</orgName>
								<address>
									<addrLine>Georges-Köhler-Allee, Geb. 51</addrLine>
									<postCode>79110</postCode>
									<settlement>Freiburg, Breisgau</settlement>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Alexander</forename><surname>Schätzle</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Lehrstuhl für Datenbanken und Informationssysteme</orgName>
								<orgName type="institution">Albert-Ludwigs-Universität Freiburg</orgName>
								<address>
									<addrLine>Georges-Köhler-Allee, Geb. 51</addrLine>
									<postCode>79110</postCode>
									<settlement>Freiburg, Breisgau</settlement>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Thomas</forename><surname>Hornung</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Lehrstuhl für Datenbanken und Informationssysteme</orgName>
								<orgName type="institution">Albert-Ludwigs-Universität Freiburg</orgName>
								<address>
									<addrLine>Georges-Köhler-Allee, Geb. 51</addrLine>
									<postCode>79110</postCode>
									<settlement>Freiburg, Breisgau</settlement>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Georg</forename><surname>Lausen</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Lehrstuhl für Datenbanken und Informationssysteme</orgName>
								<orgName type="institution">Albert-Ludwigs-Universität Freiburg</orgName>
								<address>
									<addrLine>Georges-Köhler-Allee, Geb. 51</addrLine>
									<postCode>79110</postCode>
									<settlement>Freiburg, Breisgau</settlement>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">RDFPath: Path Query Processing on Large RDF Graphs with MapReduce</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">E71F6EDBEA69812E9EC672C14EEB8CD3</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-19T16:10+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>MapReduce</term>
					<term>RDFPath</term>
					<term>RDF Query Languages</term>
					<term>Social Network Analysis</term>
					<term>Semantic Web</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The MapReduce programming model has gained traction in different application areas in recent years, ranging from the analysis of log files to the computation of the RDFS closure. Yet, for most users the MapReduce abstraction is too low-level, since even simple computations have to be expressed as Map and Reduce phases. In this paper we propose RDFPath, an expressive RDF path query language geared towards casual users that benefits from the scaling properties of the MapReduce framework by automatically transforming declarative path queries into MapReduce jobs. Our evaluation on a real-world data set shows the applicability of RDFPath for investigating typical graph properties like shortest paths.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The amount of data on the Web has grown tremendously in recent years. According to Eric Schmidt, CEO of Google, more than five exabytes of data are generated collectively every two days, which corresponds to the total amount of data generated up to the year 2003<ref type="foot" target="#foot_0">1</ref>. Another example is Facebook with currently more than 500 million active users interacting with more than 900 million different objects like pages, groups or events.</p><p>In a Semantic Web environment this data is typically represented as an RDF graph <ref type="bibr" target="#b18">[19]</ref>, which is a natural choice for social network scenarios <ref type="bibr" target="#b19">[20]</ref>, thus facilitating exchange, interoperability, transformation and querying of data. However, management of large RDF graphs is a non-trivial task and single machine approaches are often challenged with processing queries on such graphs <ref type="bibr" target="#b27">[28]</ref>. One solution is to use high-performance clusters or to develop custom distributed systems, which are commonly not very cost-efficient and also do not scale with respect to additional hardware <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b7">8,</ref><ref type="bibr" target="#b8">9]</ref>.</p><p>The MapReduce programming model introduced by Google in <ref type="bibr" target="#b7">[8]</ref> runs on regular off-the-shelf hardware and shows desirable scaling properties, e.g. new computing nodes can easily be added to the cluster. Additionally, the distribution of data and the parallelization of calculations are handled automatically, relieving the developer from having to deal with classical problems of distributed applications such as the synchronization of data, network protocols or fault tolerance strategies. 
These benefits have led to the application of this programming model to a number of problems in different areas where large data sets have to be processed <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2,</ref><ref type="bibr" target="#b6">7]</ref>. One line of research is centered around the transformation of existing algorithms into the MapReduce paradigm <ref type="bibr" target="#b17">[18]</ref>, which is a time-consuming process that requires substantial technical knowledge about the framework. More in line with the approach presented in this paper is the idea to use a declarative high-level language and to provide an automatic translation into a series of Map and Reduce phases, as proposed in <ref type="bibr" target="#b13">[14,</ref><ref type="bibr" target="#b26">27]</ref> for SPARQL and in <ref type="bibr" target="#b21">[22]</ref> for Pig Latin, a data processing language for arbitrary data.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Contributions.</head><p>In this paper we present RDFPath, a declarative path query language for RDF that by design has a natural mapping to the MapReduce programming model while remaining extensible. We also give details about our system design and implementation. With its intuitive syntax, RDFPath supports the exploration of graph properties such as shortest connections between two nodes in an RDF graph. We are convinced that RDFPath is a valuable tool for the analysis of social graphs, which is highlighted by our evaluation on a real-world data set based on user profiles crawled from Last.fm. The implementation of RDFPath is available for download from our project homepage<ref type="foot" target="#foot_1">2</ref>.</p><p>Related Work. There is a large body of work dealing with query languages for (RDF) graphs considering various aspects and application fields <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b9">10,</ref><ref type="bibr" target="#b11">12,</ref><ref type="bibr" target="#b15">16,</ref><ref type="bibr" target="#b24">25,</ref><ref type="bibr" target="#b31">32]</ref>. Besides classical proposals for graphs as introduced in <ref type="bibr" target="#b24">[25]</ref> and in <ref type="bibr" target="#b15">[16]</ref> with RQL, there are also many proposals for specific RDF graph languages (cf. <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b9">10,</ref><ref type="bibr" target="#b11">12]</ref> for detailed surveys). Taking this into account, we extended the proposed comparison matrix for RDF query languages from <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b4">5]</ref> by two additional properties, namely the support for shortest path queries and aggregate functions, as well as the additional RDF query languages SPARQL <ref type="bibr" target="#b23">[24]</ref>, RPL <ref type="bibr" target="#b31">[32]</ref>, and RDFPath, as depicted in Table <ref type="table">1</ref>. 
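For a more detailed description of the properties occurring in Table <ref type="table">1</ref> the interested reader is referred to <ref type="bibr" target="#b4">[5]</ref>.</p><p>According to Table <ref type="table">1</ref>, the expressiveness of RDFPath is competitive with that of other RDF query languages. For the missing diameter property, which is not supported by any of the listed languages, a MapReduce solution has been proposed in <ref type="bibr" target="#b14">[15]</ref>, although it is not integrated syntactically into any path query language. There are also further approaches to extend SPARQL with expressive navigational capabilities such as nSPARQL <ref type="bibr" target="#b22">[23]</ref>, (C)PSPARQL <ref type="bibr" target="#b2">[3]</ref> as well as property paths, which are part of the proposal for SPARQL 1.1<ref type="foot" target="#foot_3">4</ref>. In contrast, we focus on path queries and study their implementation based on MapReduce.</p><figure type="table" xml:id="tab_0"><head>Table 1.</head><label>1</label><figDesc>Comparison of RDF Query Languages (adapted from <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b4">5]</ref>). ×: not supported, ±: partially supported, √: fully supported.</figDesc><table>
<row><cell>Property</cell><cell>RQL</cell><cell>SeRQL, RDQL<ref type="foot" target="#foot_2">3</ref>, Triple</cell><cell>N3</cell><cell>Versa</cell><cell>RxPath</cell><cell>RPL</cell><cell>SPARQL (1.0)</cell><cell>RDFPath</cell></row>
<row><cell>Adjacent nodes</cell><cell>±</cell><cell>±</cell><cell>±</cell><cell>±</cell><cell>×</cell><cell>√</cell><cell>√</cell><cell>√</cell></row>
<row><cell>Adjacent edges</cell><cell>±</cell><cell>±</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>√</cell><cell>√</cell></row>
<row><cell>Degree of a node</cell><cell>±</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>√</cell></row>
<row><cell>Path</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>±</cell><cell>±</cell><cell>×</cell><cell>±</cell></row>
<row><cell>Fixed-length Path</cell><cell>±</cell><cell>±</cell><cell>±</cell><cell>×</cell><cell>±</cell><cell>±</cell><cell>√</cell><cell>√</cell></row>
<row><cell>Distance between 2 nodes</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>±</cell></row>
<row><cell>Diameter</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>×</cell></row>
<row><cell>Shortest Paths</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>±</cell></row>
<row><cell>Aggregate functions</cell><cell>±</cell><cell>×</cell><cell>×</cell><cell>×</cell><cell>±</cell><cell>×</cell><cell>×</cell><cell>±</cell></row>
</table></figure><p>Another area, which is related to our research, is the distributed processing of large data sets with MapReduce. Pig is a system for analyzing large data sets, consisting of the high-level language Pig Latin <ref type="bibr" target="#b21">[22]</ref> that is automatically translated into MapReduce jobs. 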
Furthermore, there are several recent approaches for evaluating SPARQL queries with MapReduce <ref type="bibr" target="#b13">[14,</ref><ref type="bibr" target="#b20">21,</ref><ref type="bibr" target="#b26">27]</ref>. However, because of the limited navigational capabilities of SPARQL <ref type="bibr" target="#b22">[23]</ref>, as opposed to RDFPath, these approaches do not offer sufficient functionality to support a broad range of analysis tasks for RDF graphs.</p><p>Besides the usage of a general purpose MapReduce cluster, some systems rely on a specialized computer cluster. Virtuoso Cluster Edition <ref type="bibr" target="#b8">[9]</ref> is a cluster extension of the RDF Store Virtuoso and BigOWLIM<ref type="foot" target="#foot_4">5</ref> is an RDF database engine with extensive reasoning capabilities; both can store and process billions of triples. In <ref type="bibr" target="#b29">[30]</ref> the authors propose an extension of Sesame for querying distributed RDF repositories. However, such specialized clusters have the disadvantage that they require individual infrastructures, whereas our approach is based on a general framework that can be used for different purposes.</p><p>Paper Structure. Section 2 provides a brief introduction to the MapReduce framework. Section 3 introduces the RDFPath language, while Section 4 discusses the components of the implemented system and the evaluation of RDFPath queries. Section 5 presents our system evaluation based on a real-world data set and Section 6 concludes this paper with an outlook on future work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">MapReduce</head><p>The MapReduce programming model was originally introduced by Google in 2004 <ref type="bibr" target="#b7">[8]</ref> and enables scalable, fault tolerant and massively parallel calculations using a computer cluster. The basis of Google's MapReduce is the distributed file system GFS <ref type="bibr" target="#b10">[11]</ref>, where large files are split into equal sized blocks that are spread across the cluster, and fault tolerance is achieved by replication. The workflow of a MapReduce program is a sequence of MapReduce jobs, each consisting of a Map and a Reduce phase separated by a so-called Shuffle &amp; Sort phase. A user has to implement the map and reduce functions, which are automatically executed in parallel on a portion of the data. The Mappers invoke the map function for every record of their input data set, represented as a key-value pair. The map function outputs a list of new intermediate key-value pairs, which are then sorted according to their key and distributed to the Reducers such that all values with the same key are sent to the same Reducer. The reduce function is invoked for every distinct key together with a list of all corresponding values and outputs a list of values, which can be used as input for the next MapReduce job. The signatures of the map and reduce functions are therefore as follows:</p><p>map: (inKey, inValue) -&gt; list(outKey, intermediateValue)</p><p>reduce: (outKey, list(intermediateValue)) -&gt; list(outValue)</p></div>
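<div xmlns="http://www.tei-c.org/ns/1.0"><p>The job workflow described above can be illustrated with a minimal single-process sketch. It is not Hadoop code: the helper run_job, the example functions map_fn and reduce_fn, and the three sample triples are hypothetical names and data introduced only for illustration. The sketch follows the two signatures given above and counts the outgoing edges per subject.</p><p><![CDATA[
```python
from collections import defaultdict


def map_fn(in_key, in_value):
    """map: (inKey, inValue) -> list(outKey, intermediateValue).

    in_value is one RDF triple; in_key (here a record index) is not used.
    """
    subject, _predicate, _obj = in_value
    return [(subject, 1)]


def reduce_fn(out_key, intermediate_values):
    """reduce: (outKey, list(intermediateValue)) -> list(outValue)."""
    return [(out_key, sum(intermediate_values))]


def run_job(records, map_fn, reduce_fn):
    """Simulate one MapReduce job in a single process (illustrative helper)."""
    # Map phase: call map_fn once per input record.
    groups = defaultdict(list)
    for in_key, in_value in records:
        for out_key, value in map_fn(in_key, in_value):
            # Shuffle: all values with the same key end up in the same group.
            groups[out_key].append(value)
    # Sort + Reduce phase: call reduce_fn once per distinct key, in key order.
    output = []
    for out_key in sorted(groups):
        output.extend(reduce_fn(out_key, groups[out_key]))
    return output


# Hypothetical sample data, not taken from the paper's data sets.
triples = [
    ("Chris", "knows", "Peter"),
    ("Chris", "knows", "Sarah"),
    ("Peter", "knows", "Sarah"),
]
records = list(enumerate(triples))
result = run_job(records, map_fn, reduce_fn)
print(result)
```
]]></p><p>In a real cluster the groups would be partitioned across several Reducers and the output written to the distributed file system, where it can serve as input for the next job in the sequence.</p></div>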
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">RDFPath</head><p>An RDF data set consists of a set of RDF triples in the form &lt;subject, predicate, object&gt; that can be interpreted as "subject has the property predicate with value object". It is possible to represent an RDF data set as a directed, labeled graph where every triple corresponds to an edge (predicate) from one node (subject) to another node (object). For clarity of presentation, we use a simplified RDF notation without URI prefixes in the following. Strings and numbers are mapped to their corresponding datatypes in RDF.</p><p>Executing path queries on very large RDF data sets like social network graphs with billions of entries is a non-trivial task that typically requires many resources and computational power <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b7">8,</ref><ref type="bibr" target="#b8">9,</ref><ref type="bibr" target="#b19">20,</ref><ref type="bibr" target="#b27">28]</ref>. RDFPath is a declarative RDF path query language, inspired by XPath and designed especially with regard to the MapReduce model. A query in RDFPath is composed of a sequence of location steps where the output of the i-th location step is used as input for the (i + 1)-th location step. Conceptually, a location step adds one or more additional edges and nodes to an intermediate path that can be restricted by filters. The result of a query is a set of paths, consisting of edges and nodes of the given RDF graph. In the following we give an example-driven introduction to RDFPath.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">RDFPath By Example</head><p>Start Node. The start node is the first part of an RDFPath query, separated by "::" from the rest of the query, and specifies the starting point for the evaluation of a path query as shown in Query 1. Using the symbol "*" indicates an arbitrary start node where every subject with the denoted predicate of the first location step is considered (see Query 2).</p><formula>Chris :: knows<label>(1)</label></formula><formula>* :: knows<label>(2)</label></formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Location Step.</head><p>Location steps are the basic navigational component in RDFPath, specifying the next edge to follow in the query evaluation process. The usage of multiple location steps, separated by "&gt;", defines the order as well as the number of edges followed by the query (Query 3). If the same edge is used in several consecutive location steps, one can use an abbreviation by specifying the number of iterations within parentheses as shown in Query 4. Instead of specifying a fixed edge, the symbol "*" can be used to follow an arbitrary edge as illustrated in Query 5, which determines all adjacent edges and nodes of Chris.</p><formula>Chris :: knows &gt; knows &gt; age<label>(3)</label></formula><formula>Chris :: knows (2) &gt; age<label>(4)</label></formula><formula>Chris :: *<label>(5)</label></formula><p>Filter. Filters can be specified within any location step using square brackets. There are two types of filters: those that constrain the value (Queries 6, 7) and those that constrain the properties (Query 8) of a node reached by the location step. Multiple filters are specified in a sequence and a path has to satisfy all filters. If a node does not have the desired property, the filter evaluates to false. Currently, the following filter expressions are supported: equals(), prefix(), suffix(), min(), max().</p><formula>Chris :: knows &gt; age [min(18)] [max(67)]<label>(6)</label></formula><formula>Chris :: * &gt; * [equals('Peter')]<label>(7)</label></formula><formula>Chris :: knows [age = min(30)] [country = prefix('D')] &gt; name<label>(8)</label></formula><p>Bounded Search. This type of query starts with a fixed node and computes the shortest paths between the start node and all reachable nodes within a user-defined bound. For this purpose we extend the notation of the previously introduced abbreviations with an optional symbol "*". While the abbreviations indicate a fixed length, the "*" symbol indicates that the number is used as an upper bound for the maximum search depth. 
As an example, in Query 9 we search for all German people with a maximum distance of three to Chris.</p><formula xml:id="formula_8">Chris :: knows [country = equals('DE')] (*3)<label>(9)</label></formula><p>Bounded Shortest Path. This type of query computes the shortest path between two nodes in the graph with a user-defined maximum distance. As we are often interested in the length of the path, the query outputs the shortest distance and the corresponding path between two given nodes. To do this, one has to extend a bounded search query with a final distance() function specifying the target node as shown in Query 10.</p><formula>Chris :: knows (*3).distance('Peter')<label>(10)</label></formula><p>Aggregation Functions. It is possible to count the number of resulting paths for a query (Query 11 calculates the degree of Chris) or to apply aggregation functions to the last nodes of the paths. The following functions are available: count(), sum(), avg(), min() and max(). It should be noted that aggregation functions can only be applied to nodes of numeric type (e.g. integer or double) as shown in Query 12.</p><formula>Chris :: *.count()<label>(11)</label></formula><formula>Chris :: knows &gt; age.avg()<label>(12)</label></formula><p>Example. Figure <ref type="figure" target="#fig_0">1</ref> shows the evaluation of the last location step of Query 13 on the corresponding RDF graph. The second path is rejected as the age of Sarah does not satisfy the filter condition.</p></div>
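<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the semantics of start nodes, location steps, value filters and count() more concrete, the following in-memory sketch evaluates two of the example queries on a small hand-made graph. It is only an illustration of the query semantics described above and not the MapReduce-based implementation; the functions start and location_step as well as the sample triples (including the ages) are assumptions introduced for this sketch.</p><p><![CDATA[
```python
def start(node):
    """Start node: the initial set of paths contains only the start node."""
    return [(node,)]


def location_step(paths, triples, predicate, value_filter=None):
    """Extend every path by one edge labeled `predicate`.

    The predicate "*" follows an arbitrary edge. `value_filter` is an
    optional function on the reached node and models a value filter such
    as [min(18)] [max(67)].
    """
    extended = []
    for path in paths:
        for s, p, o in triples:
            if s != path[-1]:
                continue  # the edge does not start at the end of the path
            if predicate != "*" and p != predicate:
                continue  # wrong edge label
            if value_filter is not None and not value_filter(o):
                continue  # reached node rejected by the filter
            extended.append(path + (p, o))
    return extended


# Hypothetical sample graph, not taken from the paper.
triples = [
    ("Chris", "knows", "Peter"),
    ("Chris", "knows", "Sarah"),
    ("Peter", "age", 25),
    ("Sarah", "age", 17),
]

# Query 6:  Chris :: knows > age [min(18)] [max(67)]
paths = start("Chris")
paths = location_step(paths, triples, "knows")
paths = location_step(paths, triples, "age", lambda v: 18 <= v <= 67)
print(paths)

# Query 11:  Chris :: *.count()   (degree of Chris)
degree = len(location_step(start("Chris"), triples, "*"))
print(degree)
```
]]></p><p>In this sketch the path via Sarah is dropped in the last location step because her (assumed) age of 17 violates the min(18) filter, so only the path via Peter remains.</p></div>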
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Expressiveness</head><p>In this section we evaluate the expressiveness of RDFPath w.r.t. the properties listed in Table <ref type="table">1</ref>. A detailed discussion can be found on our project homepage<ref type="foot" target="#foot_5">6</ref> and in <ref type="bibr" target="#b25">[26]</ref>. Query 5 shows an example for the calculation of all adjacent edges and nodes of a node by using the symbol "*" instead of specifying a fixed edge. Query 11 calculates the degree of a node by applying the aggregation function count() on the resulting paths, and Query 7 gives all paths with a fixed length of two from Chris to Peter by specifying two location steps with arbitrary edges. The properties path, distance between 2 nodes and shortest paths are only partially supported by RDFPath because answering them in general requires calculating paths of arbitrary length, whereas RDFPath only supports paths of a maximum fixed length. Furthermore, aggregation functions are only partially supported, as they can only be applied in the last location step of a query.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Query Evaluation</head><p>The Query Processor parses the query and generates a general execution plan that consists of a sequence of instructions, where each instruction describes e.g. the application of a filter, join or aggregation function. In the next step, the general execution plan is mapped to a specific MapReduce plan that consists of a sequence of MapReduce assignments. An assignment encapsulates the specific MapReduce job together with a job configuration. The Query Engine runs the MapReduce jobs in sequence, collects information about the computation process such as time and storage utilization, and cleans up temporary files. A schematic representation of this procedure is shown in Figure <ref type="figure" target="#fig_1">2</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Mapping of Location Steps to MapReduce jobs</head><p>A query in RDFPath is composed of a sequence of location steps that is automatically translated into a sequence of MapReduce jobs. As illustrated in Figure <ref type="figure" target="#fig_2">3</ref>, a location step corresponds to a join in MapReduce between an intermediate set of paths and the corresponding RDF graph partition. Joins are implemented as so-called Reduce-Side-Joins, since the precondition of the more efficient Map-Side-Joins, namely that both inputs are sorted, is not fulfilled in general. The principles of Reduce-Side-Joins are described in <ref type="bibr" target="#b17">[18,</ref><ref type="bibr" target="#b30">31]</ref>. Filters are applied in the Map phase by rejecting all triples that do not satisfy the filter conditions, and aggregation functions are computed in parallel in the Reduce phase of the last location step. We also implemented a mechanism to detect cycles when extending an intermediate path, where the user can decide at runtime whether cycles are (1) allowed, (2) only allowed if the cycle contains two or more distinct edges or (3) not allowed at all. Considering Figure <ref type="figure" target="#fig_2">3</ref>, the given query requires two joins and is therefore mapped into a MapReduce plan that consists of two MapReduce jobs. While the first job computes all friends of "Chris" that can be reached by following the edge knows at most two times, the second MapReduce job follows the edge country and restricts the value to "DE".</p></div>
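<div xmlns="http://www.tei-c.org/ns/1.0"><p>The mapping of a location step to a Reduce-Side-Join can be sketched as follows. This is a simplified single-process simulation under stated assumptions, not the actual Hadoop implementation: the function names map_paths, map_triples, reduce_join and join_job, the tags "P" and "T", and the sample triples are hypothetical. Intermediate paths are keyed by their last node, triples by their subject, filters are applied on the Map side, and the Reducer combines every path with every matching edge that shares its key.</p><p><![CDATA[
```python
from collections import defaultdict


def map_paths(path):
    """Map side for intermediate paths: key = last node, tagged with "P"."""
    return (path[-1], ("P", path))


def map_triples(triple, predicate, value_filter=None):
    """Map side for the RDF graph: key = subject, tagged with "T".

    Triples with another predicate or a rejected object value are dropped
    here, i.e. filters are applied in the Map phase.
    """
    s, p, o = triple
    if p != predicate:
        return None
    if value_filter is not None and not value_filter(o):
        return None
    return (s, ("T", (p, o)))


def reduce_join(key, tagged_values):
    """Reduce side: extend every path ending in `key` by every matching edge."""
    paths = [v for tag, v in tagged_values if tag == "P"]
    edges = [v for tag, v in tagged_values if tag == "T"]
    return [path + edge for path in paths for edge in edges]


def join_job(paths, triples, predicate, value_filter=None):
    """One simulated MapReduce job, corresponding to one location step."""
    groups = defaultdict(list)  # stands in for the Shuffle & Sort phase
    for path in paths:
        key, value = map_paths(path)
        groups[key].append(value)
    for triple in triples:
        emitted = map_triples(triple, predicate, value_filter)
        if emitted is not None:
            key, value = emitted
            groups[key].append(value)
    result = []
    for key in sorted(groups, key=str):
        result.extend(reduce_join(key, groups[key]))
    return result


# Hypothetical sample graph, not taken from the paper.
triples = [
    ("Chris", "knows", "Peter"),
    ("Chris", "knows", "Sarah"),
    ("Peter", "country", "DE"),
    ("Sarah", "country", "US"),
]

# Two location steps, hence two joins and two simulated jobs.
job1 = join_job([("Chris",)], triples, "knows")
job2 = join_job(job1, triples, "country", lambda v: v == "DE")
print(job2)
```
]]></p><p>Tagging the values by their origin is what allows a single Reducer call to tell paths and edges apart; neither input needs to be sorted beforehand, which is the property that motivates the use of Reduce-Side-Joins above.</p></div>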
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Evaluation</head><p>We evaluated our implementation on two different data sources to investigate the scalability behavior. First, we used artificial data produced by the SP2Bench generator <ref type="bibr" target="#b28">[29]</ref>, which can generate arbitrarily large RDF documents that contain bibliographic information about synthetic publications. The generated RDF documents contained up to 1.6 billion RDF triples. Second, we collected 225 million RDF triples of real-world data from the online music service Last.fm that are accessible via a public API. Due to space limitations we only discuss some results for the Last.fm dataset, which is a more appropriate choice for path queries and can also be interpreted in a more intuitive way. Figure <ref type="figure" target="#fig_3">4</ref> illustrates the dataset. We used a cluster of ten Dell PowerEdge R200 servers connected via a gigabit network and Cloudera's Distribution for Hadoop 3 Beta (CDH3). Each server had a Dual Core 3.16 GHz processor, 4 GB RAM and a 1 TB hard disk. One of the servers was exclusively used to distribute the MapReduce jobs (Jobtracker) and store the metadata of the file system (Namenode). Apart from the default Hadoop configuration, we used 9 reducers (one per hard disk).</p><p>Query 2. Starting from all tracks of Michael Jackson that are on the album "Thriller", the query determines all similar tracks that have a minimum duration of 50 seconds. The last location step then looks for the top fans of these tracks who live in Germany. The idea behind this query was to examine the impact of using filters to reduce the amount of intermediate results. 
The number of results of the query, and therefore the used HDFS storage, does not increase significantly with the size of the graph, as the tracks of the album "Thriller" are fixed. This also explains the execution times of the query as illustrated in the left diagram of Figure <ref type="figure" target="#fig_7">6</ref> and confirms that the execution time is mainly determined by the amount of intermediate results.</p><p>Query 3. These queries determine the friends of Chris reached by following an increasing number of edges. The first query starts by following the edge knows once and the last query ends by following the edge knows at most ten times. This corresponds to the computation of the Friend of a Friend<ref type="foot" target="#foot_6">7</ref> paths starting from Chris with an increasing maximum distance. The left chart of Figure <ref type="figure">7</ref> illustrates the percentage of reached people as a function of the maximum Friend of a Friend distance, where the total percentage represents all reachable people. Starting with a fixed person, we can reach over 98% of all reachable persons by following the edge knows seven times, which corresponds to the well-known six degrees of separation paradigm <ref type="bibr" target="#b16">[17]</ref>. The right chart of Figure <ref type="figure">7</ref> shows the execution times depending on the maximum Friend of a Friend distance. We can observe a linear scaling behavior that is mainly determined by the number of joins rather than computation and data transfer time.</p><p>Results. Our evaluation shows that RDFPath makes it possible to express and compute interesting graph properties such as Friend of a Friend queries, small world properties like six degrees of separation or the Erdös number<ref type="foot" target="#foot_7">8</ref> on large RDF graphs. 
The execution times for the surveyed queries on real-world data from Last.fm scale linearly with the size of the graph, where the number of joins as well as the amount of intermediate results that must be transferred over the network determine the complexity of a query.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Conclusion</head><p>The amount of available Semantic Web data is growing constantly, calling for solutions that are able to scale accordingly. The RDF query language RDFPath, which is presented in this paper, was designed with this constraint in mind and combines an intuitive syntax for path queries with an effective execution strategy using MapReduce. Our evaluation confirms both that large RDF graphs can be handled with execution times that scale linearly with the size of the graph and that RDFPath can be used to investigate graph properties such as a variant of the famous six degrees of separation paradigm typically encountered in social graphs.</p><p>As future work we plan to extend RDFPath with more powerful language constructs geared towards the analysis of social graphs, e.g. to express the full list of desiderata stated in <ref type="bibr" target="#b19">[20]</ref>. In parallel, we are optimizing our implementation on the system level by incorporating current results for the efficient computation of joins with MapReduce <ref type="bibr" target="#b6">[7]</ref>.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. RDFPath Example</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 2 .</head><label>2</label><figDesc>Fig. 2. Query Processing</figDesc><graphic coords="7,165.33,396.56,285.48,60.96" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 3 .</head><label>3</label><figDesc>Fig. 3. Joins and Location Steps</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Fig. 4 .</head><label>4</label><figDesc>Fig. 4. (a) Histogram of Last.fm data (b) Simplified RDF graph of the Last.fm data set. The missing edge labels are named like the target nodes. In the case of ambiguity, the edge label is extended by the type of source (e.g. trackPlaycount and albumPlaycount)</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Query 1 .</head><label>1</label><figDesc>Starting from a given track this query determines the album name for all similar tracks that can be reached by following the edge trackSimilar at most four times. The overall execution times of this query are shown in the left diagram of Figure 5 and exhibit a linear scaling behavior in the size of the graph. Furthermore it turns out that this is also the case for the amount of transferred data (SHUFFLE), intermediate data (LOCAL) and data stored in HDFS. These values are shown in the right diagram of Figure 5. We conclude that the execution time is mainly influenced by the number of intermediate results stored locally as well as the transferred data between the machines.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Fig. 5 .</head><label>5</label><figDesc>Fig. 5. Query 1</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_7"><head>Fig. 6 .</head><label>6</label><figDesc>Fig. 6. Query 2</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_8"><head>Fig. 7 .</head><label>7</label><figDesc>Fig. 7. Query 3 (1 ≤ X ≤ 10)</figDesc></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">http://techonomy.com</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">http://dbis.informatik.uni-freiburg.de/?project=DiPoS/RDFPath.html</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">In <ref type="bibr" target="#b12">[13]</ref> the authors describe how to extend RDQL to support aggregates.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">http://www.w3.org/TR/sparql11-query</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">http://www.ontotext.com/owlim</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_5">http://dbis.informatik.uni-freiburg.de/?project=DiPoS/RDFPath.html</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_6">http://www.foaf-project.org</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_7">http://www.oakland.edu/enp</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Tradeoffs between Parallel Database Systems, Hadoop, and HadoopDB as Platforms for Petabyte-Scale Analysis</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">J</forename><surname>Abadi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">SSDBM</title>
				<imprint>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="1" to="3" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Scalable Semantic Web Data Management Using Vertical Partitioning</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">J</forename><surname>Abadi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Marcus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Madden</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">J</forename><surname>Hollenbach</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">VLDB</title>
				<imprint>
			<date type="published" when="2007">2007</date>
			<biblScope unit="page" from="411" to="422" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Extending SPARQL with regular expression patterns (for querying RDF)</title>
		<author>
			<persName><forename type="first">F</forename><surname>Alkhateeb</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">F</forename><surname>Baget</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Euzenat</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">J. Web Sem</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="57" to="73" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Querying RDF Data from a Graph Database Perspective</title>
		<author>
			<persName><forename type="first">R</forename><surname>Angles</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Gutiérrez</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ESWC</title>
		<imprint>
			<biblScope unit="page" from="346" to="360" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">RDF Query Languages Need Support for Graph Properties</title>
		<author>
			<persName><forename type="first">R</forename><surname>Angles</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Gutierrez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hayes</surname></persName>
		</author>
		<idno>TR/DCC-2004-3</idno>
		<imprint>
			<date type="published" when="2004-06">June 2004</date>
		</imprint>
		<respStmt>
			<orgName>University of Chile</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Tech. Rep.</note>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Web and Semantic Web Query Languages: A Survey</title>
		<author>
			<persName><forename type="first">J</forename><surname>Bailey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Bry</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Furche</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Schaffert</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Reasoning Web</title>
		<imprint>
			<biblScope unit="page" from="35" to="133" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">A Comparison of Join Algorithms for Log Processing in MapReduce</title>
		<author>
			<persName><forename type="first">S</forename><surname>Blanas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Patel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Ercegovac</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Rao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">J</forename><surname>Shekita</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tian</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">SIGMOD Conference</title>
				<imprint>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="975" to="986" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">MapReduce: Simplified Data Processing on Large Clusters</title>
		<author>
			<persName><forename type="first">J</forename><surname>Dean</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ghemawat</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">OSDI</title>
				<imprint>
			<date type="published" when="2004">2004</date>
			<biblScope unit="page" from="137" to="150" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Towards Web Scale RDF</title>
		<author>
			<persName><forename type="first">O</forename><surname>Erling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Mikhailov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. SSWS</title>
				<meeting>SSWS</meeting>
		<imprint>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">RDF Querying: Language Constructs and Evaluation Methods Compared</title>
		<author>
			<persName><forename type="first">T</forename><surname>Furche</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Linse</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Bry</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Plexousakis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Gottlob</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Reasoning Web</title>
		<imprint>
			<biblScope unit="page" from="1" to="52" />
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">The Google File System</title>
		<author>
			<persName><forename type="first">S</forename><surname>Ghemawat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Gobioff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">T</forename><surname>Leung</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. SOSP</title>
				<meeting>SOSP</meeting>
		<imprint>
			<date type="published" when="2003">2003</date>
			<biblScope unit="page" from="29" to="43" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">A Comparison of RDF Query Languages</title>
		<author>
			<persName><forename type="first">P</forename><surname>Haase</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Broekstra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Eberhart</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Volz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ISWC</title>
				<imprint>
			<date type="published" when="2004">2004</date>
			<biblScope unit="page" from="502" to="517" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">RDF Aggregate Queries and Views</title>
		<author>
			<persName><forename type="first">E</forename><surname>Hung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">S</forename><surname>Subrahmanian</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ICDE</title>
				<imprint>
			<date type="published" when="2005">2005</date>
			<biblScope unit="page" from="717" to="728" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Data Intensive Query Processing for Large RDF Graphs Using Cloud Computing Tools</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">F</forename><surname>Husain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Khan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kantarcioglu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Thuraisingham</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. CLOUD</title>
				<meeting>CLOUD</meeting>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="1" to="10" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">PEGASUS: A Peta-Scale Graph Mining System</title>
		<author>
			<persName><forename type="first">U</forename><surname>Kang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">E</forename><surname>Tsourakakis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Faloutsos</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ICDM</title>
				<imprint>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="229" to="238" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">RQL: A Declarative Query Language for RDF</title>
		<author>
			<persName><forename type="first">G</forename><surname>Karvounarakis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Alexaki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Christophides</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Plexousakis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Scholl</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">WWW</title>
		<imprint>
			<biblScope unit="page" from="592" to="603" />
			<date type="published" when="2002">2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Planetary-Scale Views on a Large Instant-Messaging Network</title>
		<author>
			<persName><forename type="first">J</forename><surname>Leskovec</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Horvitz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. WWW &apos;08</title>
				<meeting>WWW &apos;08</meeting>
		<imprint>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="915" to="924" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Data-intensive text processing with MapReduce</title>
		<author>
			<persName><forename type="first">J</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Dyer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Synthesis Lectures on Human Language Technologies</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="1" to="177" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<title level="m" type="main">RDF Primer</title>
		<author>
			<persName><forename type="first">F</forename><surname>Manola</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Miller</surname></persName>
		</author>
		<ptr target="http://www.w3.org/TR/rdf-primer/" />
		<imprint>
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Representing, Querying and Transforming Social Networks with RDF/SPARQL</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">S</forename><surname>Martín</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Gutierrez</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ESWC</title>
				<imprint>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="293" to="307" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">SPARQL Basic Graph Pattern Processing with Iterative MapReduce</title>
		<author>
			<persName><forename type="first">J</forename><surname>Myung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Yeon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. MDAC2010</title>
				<meeting>MDAC2010</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="1" to="6" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Pig Latin: A Not-So-Foreign Language for Data Processing</title>
		<author>
			<persName><forename type="first">C</forename><surname>Olston</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Reed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Srivastava</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Kumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tomkins</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">SIGMOD</title>
				<imprint>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="1099" to="1110" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">nSPARQL: A Navigational Language for RDF</title>
		<author>
			<persName><forename type="first">J</forename><surname>Pérez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Arenas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Gutierrez</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Semantic Web Conference</title>
				<imprint>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="66" to="81" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Semantics and Complexity of SPARQL</title>
		<author>
			<persName><forename type="first">J</forename><surname>Pérez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Arenas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Gutierrez</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Trans. Database Syst</title>
		<imprint>
			<biblScope unit="volume">34</biblScope>
			<biblScope unit="issue">3</biblScope>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">A Language Extension for Graph Processing and Its Formal Semantics</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">W</forename><surname>Pratt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">P</forename><surname>Friedman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Commun. ACM</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="issue">7</biblScope>
			<biblScope unit="page" from="460" to="467" />
			<date type="published" when="1971">1971</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Przyjaciel-Zablocki</surname></persName>
		</author>
		<title level="m">RDFPath: Verteilte Analyse von RDF-Graphen</title>
				<imprint>
			<date type="published" when="2010">2010</date>
		</imprint>
		<respStmt>
			<orgName>Albert-Ludwigs-Universität Freiburg</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Master&apos;s thesis</note>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">PigSPARQL: Übersetzung von SPARQL nach Pig Latin</title>
		<author>
			<persName><forename type="first">A</forename><surname>Schätzle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Przyjaciel-Zablocki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Hornung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Lausen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">BTW</title>
		<imprint>
			<biblScope unit="page" from="65" to="84" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">An Experimental Comparison of RDF Data Management Approaches in a SPARQL Benchmark Scenario</title>
		<author>
			<persName><forename type="first">M</forename><surname>Schmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Hornung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Küchlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Lausen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Pinkel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Semantic Web Conference</title>
				<imprint>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="82" to="97" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">SP2Bench: A SPARQL Performance Benchmark</title>
		<author>
			<persName><forename type="first">M</forename><surname>Schmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Hornung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Lausen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Pinkel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ICDE</title>
				<imprint>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="222" to="233" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">Towards distributed processing of RDF path queries</title>
		<author>
			<persName><forename type="first">H</forename><surname>Stuckenschmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Vdovjak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Broekstra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">J</forename><surname>Houben</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Int. J. Web Eng. Technol</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="issue">2/</biblScope>
			<biblScope unit="page" from="207" to="230" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<monogr>
		<title level="m" type="main">Hadoop: The Definitive Guide</title>
		<author>
			<persName><forename type="first">T</forename><surname>White</surname></persName>
		</author>
		<imprint>
			<publisher>O&apos;Reilly</publisher>
			<date type="published" when="2009-06">June 2009</date>
		</imprint>
	</monogr>
	<note>1st edn</note>
</biblStruct>

<biblStruct xml:id="b31">
	<monogr>
		<title level="m" type="main">A RPL Through RDF: Expressive Navigation in RDF Graphs</title>
		<author>
			<persName><forename type="first">H</forename><surname>Zauner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Linse</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Furche</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Bry</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="251" to="257" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
