<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Decentralizing the Semantic Web: Who will pay to realize it?</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Tobias</forename><surname>Grubenmann</surname></persName>
							<email>grubenmann@ifi.uzh.ch</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Informatics</orgName>
								<orgName type="institution">University of Zurich</orgName>
								<address>
									<country key="CH">Switzerland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Daniele</forename><surname>Dell'Aglio</surname></persName>
							<email>dellaglio@ifi.uzh.ch</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Informatics</orgName>
								<orgName type="institution">University of Zurich</orgName>
								<address>
									<country key="CH">Switzerland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Abraham</forename><surname>Bernstein</surname></persName>
							<email>bernstein@ifi.uzh.ch</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Informatics</orgName>
								<orgName type="institution">University of Zurich</orgName>
								<address>
									<country key="CH">Switzerland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Dmitry</forename><surname>Moor</surname></persName>
							<email>dmoor@ifi.uzh.ch</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Informatics</orgName>
								<orgName type="institution">University of Zurich</orgName>
								<address>
									<country key="CH">Switzerland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Sven</forename><surname>Seuken</surname></persName>
							<email>seuken@ifi.uzh.ch</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Informatics</orgName>
								<orgName type="institution">University of Zurich</orgName>
								<address>
									<country key="CH">Switzerland</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Decentralizing the Semantic Web: Who will pay to realize it?</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">A20B9CB493F263893E68D5EA5CC24503</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T00:34+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Fueled by the enthusiasm of volunteers, government subsidies, and open data legislation, the Web of Data (WoD) has enjoyed phenomenal growth. Commercial data, however, has been stuck in proprietary silos, as the monetization strategy for sharing data in the WoD is unclear. This contrasts with the traditional Web, where advertisement fueled much of the growth. This raises the question of how the WoD can (i) maintain its success when government subsidies disappear and (ii) convince commercial entities to share their wealth of data. In this paper, we propose a marketplace for decentralized data following basic WoD principles. Our approach allows a customer to buy data from different, decentralized providers in a transparent way. As such, our marketplace presents a first step towards an economically viable WoD beyond subsidies.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The Web of Data (WoD) is a machine-readable alternative to the traditional World Wide Web. In the WoD, data is exposed in a semantically annotated format that allows machines to easily access the information they need for the task they are performing. Thanks to the ease of integration offered by the underlying Semantic Web technologies, data sources can be queried in a federated fashion without agreeing on a common schema beforehand. Hence, the WoD can be seen as one big, decentralized database that can be queried over the Web.</p><p>Without financial incentives, many promising datasets will be poorly maintained or unavailable, as relying on volunteers is not enough to keep the data up to date and the endpoints running. Indeed, as <ref type="bibr" target="#b1">[2]</ref> points out, only a third of all known public endpoints have an uptime of 99% or above.</p><p>In our opinion, one main reason is a lack of financial incentives for people to provide data in a semantic format. Unlike in the traditional Web, semantic data is accessed primarily by automated agents rather than by humans. Therefore, advertisements are completely ignored while accessing the data. An alternative to advertisement is to charge a fee for accessing the data. Such strategies are already pursued in the traditional Web by companies like Bloomberg, LexisNexis, and Thomson Reuters. Also, marketplaces like the Azure DataMarketplace allow different publishers to sell data under different subscriptions. So far, none of these markets allows users to buy data in an integrated way from decentralized data providers. Specifically, a user cannot buy data that constitutes a join between datasets from different sources. Hence, applying the aforementioned subscription-based monetization strategies to the WoD is not compatible with the idea of a decentralized Semantic Web. How can we wean the WoD from government subsidies or federation-averse centralization? Finding an answer to this question is crucial to fulfill the promise of the data economy <ref type="bibr" target="#b0">[1]</ref>.</p><p>Our vision is to create a marketplace where decentralized data providers can offer their data and customers can buy answers to SPARQL queries. Such a marketplace is one possible way to make the Semantic Web independent of subsidies and financially sustainable. Our marketplace differs from existing marketplaces in two distinctive features:</p><p>1. Customers can buy commercial data in an integrated way, meaning that a customer can buy data from multiple commercial datasets as if they were one big, commercial database.</p><p>2. Customers buy only the data that is needed to form the respective query answer instead of buying all data that is involved in query execution.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">A marketplace for Semantic Data</head><p>We propose to build a marketplace which allows combining data from different sources in an integrated way. Depending on the query and the providers involved, there might be different combinations of providers' data that yield non-empty query answers. The market hence needs to (1) decide which datasets to include, a process akin to source selection that must also consider the prices of the different results, and (2) determine optimal payments to each of the providers, ensuring their participation in the marketplace.</p><p>Most data in the Semantic Web is not located in a single endpoint but distributed over several endpoints. Each endpoint can, potentially, contribute to a given query answer. Searching for endpoints offering the required data becomes cumbersome as the number of endpoints increases. A customer may need to access a lot of data, of which only a small part (if any) ends up in the result. Given these problems, we argue for a marketplace that can assess the usefulness of individual endpoints for a given query and help the customer decide which data should be bought. As we have shown in <ref type="bibr" target="#b3">[4]</ref>, it is challenging to decide whether accessing a certain combination of endpoints would yield a result large enough to be worth the costs involved. As the WoD gets more decentralized, it becomes unlikely that the contribution of a single endpoint towards a query answer can be evaluated accurately without actually executing the query. Join estimation techniques for SPARQL queries might help to sort out endpoints that can hardly contribute to a query answer. However, for the remaining endpoints, only a query execution can reveal the true contribution and value of an endpoint's data. Hence, we argue that a market for Semantic Data in a decentralized setting has to execute a given query on all promising endpoints before it can be decided which part of the data should be bought by the customer. The sellers trust the marketplace not to forward the data to the customer without payment.</p><p>Once a query is executed on the promising endpoints, the result can be rated by the marketplace. Then either the market makes a buying decision on behalf of the customer, or the customer receives a summary of the findings and makes the buying decision. The customer receives the actual data only after the buying decision has been made and the payments involved have been completed.</p><p>Besides the buying decision, the market has to determine (1) how much a customer has to pay for the query answer and (2) how much payment each provider's contribution to a query answer warrants.</p><p>Figure <ref type="figure" target="#fig_0">1</ref> shows all the steps in our marketplace:</p><p>1. The market receives a query from the customer and executes it on the available sources.</p><p>2. Only a certain number of solutions from the original query answer are selected to compose the final query answer the customer will receive. Either the market makes this buying decision on behalf of the customer, or the customer decides, based on some statistics, which solutions to include in the final query answer.</p><p>3. The customer pays the marketplace the indicated price and receives the query answer.</p></div>
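<div xmlns="http://www.tei-c.org/ns/1.0"><p>The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the mechanism of the proposed marketplace: the names Solution, execute_query, buying_decision, and deliver are hypothetical, the query is reduced to a join of two toy datasets on a shared key, each solution is priced as the sum of a flat per-provider price, and the buying decision is a simple cheapest-first selection under a customer budget. The paper itself leaves pricing and selection open.</p>
<p><code lang="python">
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Solution:
    """One row of a query answer plus the providers whose data it uses."""
    bindings: tuple       # pairs such as (("hotel", "H1"), ("stars", 4))
    providers: frozenset  # names of the providers that contributed
    price: float          # assumed asking price for this solution


def execute_query(hotels, ratings, price_per_provider):
    """Step 1: join two toy datasets on the shared 'hotel' key."""
    solutions = []
    for h in hotels:
        for r in ratings:
            if h["hotel"] == r["hotel"]:
                providers = frozenset({h["provider"], r["provider"]})
                # Assumption: a solution costs the sum of its providers' prices.
                price = sum(price_per_provider[p] for p in providers)
                bindings = (("hotel", h["hotel"]),
                            ("city", h["city"]),
                            ("stars", r["stars"]))
                solutions.append(Solution(bindings, providers, price))
    return solutions


def buying_decision(solutions, budget):
    """Step 2: pick solutions cheapest-first while the budget allows it."""
    chosen, spent = [], 0.0
    for s in sorted(solutions, key=lambda s: s.price):
        if budget >= spent + s.price:
            chosen.append(s)
            spent += s.price
    return chosen, spent


def deliver(chosen, amount_due, payment):
    """Step 3: release the data only once the full price has been paid."""
    if amount_due > payment:
        raise ValueError("payment incomplete, data withheld")
    return [dict(s.bindings) for s in chosen]
```
</code></p>
<p>In this sketch, execute_query touches every provider's data, while the customer pays only for the solutions selected in step 2 and sees no data before deliver succeeds. This mirrors the requirement that the query be executed on all promising endpoints before any buying decision is made.</p></div>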
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Costs of a Query Answer</head><p>To discuss the costs involved in producing a query answer, we distinguish between two different roles on the sellers' side: Provider and Host.</p><p>A provider is the originator of data that is used in the production of a query answer. Providers are responsible for the quality of the data, including recentness, consistency, and accuracy <ref type="bibr" target="#b2">[3]</ref>. Providers do not serve their data; this is done by separate entities, the hosts. Hosts operate computers that run SPARQL endpoints for querying data products. They provide the computational and network resources needed to query the providers' data products. Hence, they ensure reliability, availability, security, and performance, which are usually specified as Quality of Service <ref type="bibr" target="#b2">[3]</ref>.</p><p>The separation between host and provider enables more flexible business models for data provision, as some providers might have an initial budget to create data (e.g., government subsidies) but lack the funds to cover the operating costs of running a SPARQL endpoint, or may have other reasons to outsource the actual data provision. A provider can also act as a host for its own data and/or other providers' data. Nevertheless, we distinguish between these two roles and treat them as separate entities.</p><p>Data providers might have large fixed costs, which typically accrue whilst creating the data. The marginal cost of offering data, however, is (effectively) zero for the provider. This is because, as discussed above, the data is not served by providers but by hosts. Any cost that might occur while offering data is borne by the host. It is important to note that even if a provider acts as its own host, the marginal costs are borne by the entity in its role as a host, not as a provider. Like cloud service providers, hosts incur the fixed cost of operating the infrastructure, possibly some variable cost relative to the size of the data they store, and some marginal cost in the form of the computational resources spent on each executed query. The host's marginal costs occur whenever the providers' data are queried, independently of whether any data will eventually be bought by a customer.</p><p>Data providers rely on the hosts to make their data available to the marketplace and thus enable customers to buy their data. Similar to a Web host for traditional Web content, hosts in our market concept are paid by the provider, based on some service agreement. Hence, the providers have to include the hosting costs in their pricing decision. The hosts' costs are already compensated by the providers prior to query execution by the market. Thus, the hosts' costs become transparent to the market and, as a result, the market and the customer do not have to take them into account. This facilitates the buying decision.</p><p>Figure <ref type="figure" target="#fig_1">2</ref> shows how the payments are distributed from the marketplace to the providers. The providers pay the hosts according to some service agreement. Note that the payment from a provider to its host can be independent of the payment from the marketplace to the provider. Also note that only providers that actually contribute to the final query answer get paid for their services.</p></div>
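<div xmlns="http://www.tei-c.org/ns/1.0"><p>The payment flow of Figure 2 can be illustrated with a small Python sketch. It rests on assumptions that are not part of the paper: the function names provider_payouts and host_fees are hypothetical, each provider is credited a flat price for every purchased solution it contributed to, and the service agreement between provider and host is modeled as a flat fee plus a per-query charge. The sketch only shows that the two payments are computed independently of each other.</p>
<p><code lang="python">
```python
from collections import defaultdict


def provider_payouts(chosen_solutions, price_per_provider):
    """Market to providers: credit each provider once per purchased
    solution it contributed to. Providers whose data was queried but
    did not end up in the final answer receive nothing."""
    payouts = defaultdict(float)
    for providers in chosen_solutions:
        for p in providers:
            payouts[p] += price_per_provider[p]
    return dict(payouts)


def host_fees(agreements, queries_executed):
    """Providers to hosts: fee owed under an assumed service agreement
    (flat fee plus per-query charge). The fee depends on how often the
    data was queried, not on whether any data was sold."""
    return {
        provider: a["flat_fee"] + a["per_query"] * queries_executed.get(provider, 0)
        for provider, a in agreements.items()
    }
```
</code></p>
<p>A provider whose data is queried often but rarely bought still owes its host the per-query charge, which is why providers have to include the hosting costs in their pricing decision.</p></div>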
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Outlook</head><p>To continue growing and to serve as a high-quality, decentralized data source, the WoD has to find the means to fund the creation, serving, and maintenance of data sources. In this paper, we proposed a new vision for funding these activities in the form of a marketplace for Semantic Data.</p><p>As a precursor to our research, we conducted a pilot study simulating a market platform for the WoD <ref type="bibr" target="#b6">[7]</ref>. In <ref type="bibr" target="#b4">[5]</ref>, we introduced the idea of using a double auction for the WoD and showed the deficiency of the threshold rule in this setting, together with three ways to correct it. In <ref type="bibr" target="#b5">[6]</ref>, we studied payment rules that are contingent on future realizations of join estimates and can be used in combinatorial data auctions. However, these approaches assumed access to accurate join estimates to produce satisfying results, an assumption that might be hard to guarantee in the WoD.</p><p>Based on our research, we foresee the following challenges in building a marketplace for Semantic Data:</p><p>- Different market mechanisms have to be explored to understand their tradeoffs under various market settings.</p><p>- Given a market mechanism, providers of Semantic Data have to decide how they will bundle and price the data they are selling. The challenge is to find prices that satisfy the customers and allow the providers to cover their costs.</p><p>- Our market idea introduces a new metric for source selection, query optimization, and query execution: financial profitability. Revisiting known techniques and developing new techniques with respect to this new metric will open interesting opportunities for research.</p><p>- A customer might not know enough about the structure of the offered data to compose a query in the first place. Data providers might be required to offer some representative sample data for free or to allow explorative queries (with possible limitations) to be executed for free on their datasets.</p><p>This paper is a first step in the direction of finding stable financing for the WoD. We plan to address the aforementioned challenges in future work and believe that our vision of a marketplace for Semantic Data is a promising way to ensure the financial sustainability of decentralized providers of Semantic Data.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1.</head><label>1</label><figDesc>Fig. 1. The three steps from a query to the final query answer.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 2.</head><label>2</label><figDesc>Fig. 2. The customer pays the market which redirects the payment to the providers. The providers pay the hosts for their services.</figDesc></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This work was partially supported by the Swiss National Science Foundation under grant #153598.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Fuel of the future: Data is giving rise to a new economy</title>
	</analytic>
	<monogr>
		<title level="j">The Economist</title>
		<imprint>
			<date type="published" when="2017-05-06">May 6th, 2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">SPARQL web-querying infrastructure: Ready for action?</title>
		<author>
			<persName><forename type="first">C</forename><surname>Buil-Aranda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Hogan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Umbrich</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P.-Y</forename><surname>Vandenbussche</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The Semantic Web - ISWC 2013</title>
				<editor>
			<persName><forename type="first">A</forename><forename type="middle">H</forename></persName>
		</editor>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="volume">8219</biblScope>
			<biblScope unit="page" from="227" to="293" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Quality-aware service-oriented data integration: Requirements, state of the art and open challenges</title>
		<author>
			<persName><forename type="first">S</forename><surname>Dustdar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Pichler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Savenkov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H.-L</forename><surname>Truong</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM SIGMOD Record</title>
		<imprint>
			<biblScope unit="volume">41</biblScope>
			<biblScope unit="page" from="11" to="19" />
			<date type="published" when="2012">2012</date>
			<publisher>ACM</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Challenges of source selection in the WoD</title>
		<author>
			<persName><forename type="first">T</forename><surname>Grubenmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bernstein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Moor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Seuken</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International Semantic Web Conference ISWC &apos;17</title>
				<meeting>the International Semantic Web Conference ISWC &apos;17</meeting>
		<imprint>
			<publisher>Forthcoming</publisher>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">A double auction for querying the web of data</title>
		<author>
			<persName><forename type="first">D</forename><surname>Moor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Grubenmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Seuken</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bernstein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The Third Conference on Auctions, Market Mechanisms and Their Applications</title>
				<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Core-selecting payment rules for combinatorial auctions with uncertain availability of goods</title>
		<author>
			<persName><forename type="first">D</forename><surname>Moor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Seuken</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Grubenmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bernstein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Twenty-Fifth International Joint Conference on Artificial Intelligence</title>
				<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="424" to="432" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">Market-based SPARQL brokerage with matrix: Towards a mechanism for economic welfare growth and incentives for free data provision in the web of data</title>
		<author>
			<persName><forename type="first">M</forename><surname>Zollinger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Basca</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bernstein</surname></persName>
		</author>
		<idno>IFI-2013.4</idno>
		<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
	<note type="report_type">Technical Report</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
