<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Linked Data Spaces &amp; Data Portability</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kingsley Idehen</string-name>
          <email>kidehen@openlinksw.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Orri Erling</string-name>
          <email>oerling@openlinksw.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff1">
          <label>1</label>
          <institution>OpenLink Software</institution>
          ,
          <addr-line>10 Mall Road, Burlington, MA 01803</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In 2007, the size of the Linked Data injected into the Web grew to several billion RDF triples, served by a network of interlinked data sources that cover domains such as general knowledge, geographic information, people, companies, online communities, films, music, books and scientific publications. Unfortunately, the growth rate of user-generated content from a variety of Web-based unstructured and semi-structured data silos continues to exceed that of structured Linked Data. Thus, we have a pressing need for technology capable of bridging this broadening divide via transparent generation of Linked Data from existing data silos on the Web. Our Linked Data technology demonstration explores the use of the OpenLink Data Spaces platform as a solution to this problem.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Categories and Subject Descriptors</title>
      <p>H.3.2 [Information Storage]</p>
      <p>H.3.3 [Information Search &amp; Retrieval]</p>
    </sec>
    <sec id="sec-2">
      <title>General Terms</title>
      <p>Management, Performance, Design, Standardization, Languages, Theory</p>
    </sec>
    <sec id="sec-3">
      <title>1. INTRODUCTION</title>
      <p>User-generated content is growing at an exponential rate behind corporate firewalls and across the Internet in general. Web technologies have been the prime accelerator of this growth, owing to the pervasiveness of Web-based distributed collaborative applications. Examples include Social Networking, Weblogs, Wikis, Shared Bookmark Managers, Photo Sharing, Polls Management, Calendars, Discussion Forums, File Sharing, and Feed Aggregation, to name a few.</p>
      <p>The exponential growth of user-generated content has resulted in the growth of silos composed of unstructured and/or semi-structured content. Unfortunately, these silos have accelerated, rather than decelerated, the imminence of an “information overload” quagmire.</p>
      <list list-type="bullet">
        <list-item><p>RDF based structured data</p></list-item>
        <list-item><p>Standardized data serialization formats</p></list-item>
        <list-item><p>HTTP based Unique Identifiers for all Data Items (web resources and abstract &amp; concrete things)</p></list-item>
        <list-item><p>HTTP based Data Set containers (Data Spaces)</p></list-item>
        <list-item><p>Data Servers that provide data management and data access services for one or more Data Spaces</p></list-item>
        <list-item><p>Key infrastructure oriented shared ontologies</p></list-item>
        <list-item><p>Query Language for interacting with structured data</p></list-item>
      </list>
      <p>We identify the items above, collectively, as critical components of Linked Data Spaces: points of presence on the Web that expose structured data via HTTP based URIs.</p>
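      <p>As a rough illustration of how these components fit together, the sketch below models a tiny Data Space as a list of RDF-style triples whose data items are named by HTTP URIs, and answers a question with a simple pattern match. It is a toy stand-in for a real RDF store and query language, not the OpenLink Data Spaces implementation; the example.org URIs, the Triple type and the match function are assumptions made for illustration, and literals are not distinguished from URIs.</p>
      <preformat><![CDATA[
```python
from typing import Iterator, List, NamedTuple, Optional

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical Data Space; the example.org URIs are placeholders, not real services.
ALICE = "http://example.org/dataspace/alice#this"
BOB = "http://example.org/dataspace/bob#this"


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


GRAPH: List[Triple] = [
    Triple(ALICE, RDF_TYPE, FOAF + "Person"),
    Triple(ALICE, FOAF + "name", "Alice"),
    Triple(ALICE, FOAF + "knows", BOB),
    Triple(BOB, FOAF + "name", "Bob"),
]


def match(graph: List[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples that fit the pattern; None is a wildcard, like a query variable."""
    for triple in graph:
        if subject is not None and triple.subject != subject:
            continue
        if predicate is not None and triple.predicate != predicate:
            continue
        if obj is not None and triple.obj != obj:
            continue
        yield triple


# "Whom does Alice know, and what are their names?"
friends = [t.obj for t in match(GRAPH, subject=ALICE, predicate=FOAF + "knows")]
friend_names = [t.obj
                for friend in friends
                for t in match(GRAPH, subject=friend, predicate=FOAF + "name")]
```
]]></preformat>
      <p>Because every subject and link target is an HTTP URI, the second lookup could in principle be served by a different Data Space than the first; here both live in one in-memory list purely to keep the sketch self-contained.</p>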
    </sec>
    <sec id="sec-4">
      <title>2. Issues &amp; Challenges</title>
    </sec>
    <sec id="sec-5">
      <title>2.1 Data Portability</title>
      <p>It’s no secret that data wants to be free of the tyranny of application logic confinement. In recent times, the realization that meshing Identity and Data ownership on the Web is a critical requirement of this pursuit of freedom has resulted in the emergence of a movement for Data Portability as yet another enclave within the broader Open Data movement.</p>
      <p>Data portability addresses two key issues: data mobility and data referencing. Today, data mobility through the use of standard data formats for moving data across silos (import and export style) has emerged as the focal point of attention with regard to addressing the proliferation of data silos on the Web. Examples include RSS 1.0, RSS 2.0, Atom, OPML, FOAF, SIOC, and others. Unfortunately, the ability to reference and de-reference data across data silos is yet to catch the attention of those pursuing data portability.</p>
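      <p>To make the distinction concrete, data referencing means that a data item in one silo can be named by a URI and looked up over HTTP from another, rather than copied through an export file. The following minimal sketch prepares, without sending, the HTTP request a client might use to de-reference such a URI. The example.org URI, the build_dereference_request helper and the particular Accept header value are assumptions for illustration only.</p>
      <preformat><![CDATA[
```python
from urllib.parse import urldefrag
from urllib.request import Request

# Assumed preference order for RDF serializations; a real client may differ.
RDF_ACCEPT = "application/rdf+xml, text/turtle;q=0.9"


def build_dereference_request(item_uri: str) -> Request:
    """Prepare (but do not send) an HTTP GET for the document describing item_uri.

    The fragment ("#this") names the data item itself; HTTP clients never send
    fragments, so the request targets the document part of the URI.
    """
    document_url, _fragment = urldefrag(item_uri)
    return Request(document_url, headers={"Accept": RDF_ACCEPT})


# Hypothetical data item living in someone else's Data Space.
request = build_dereference_request("http://example.org/dataspace/alice#this")
```
]]></preformat>
      <p>Passing the prepared request to urllib.request.urlopen would perform the actual lookup; that step is left out so the sketch runs without network access.</p>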
    </sec>
    <sec id="sec-6">
      <title>2.2 RDFization Middleware</title>
      <p>The traditional resistance to RDF adoption, which is critical to Linked Data comprehension and production, comes from the grounding of the RDF Data Model in Graph Theory and the unwillingness of most Web Application developers to interact with data formally. This reality has led to a genre of middleware tools, collectively known as RDFizers, that generate RDF on the fly.</p>
      <p>With regard to Linked Data, generating RDF on the fly is only part of the equation; the generated RDF must retain the core principles of Linked Data by providing URIs for physical Web-accessible resources, concrete entities, and abstract things. Of course, this process must include intelligent production of instance data associated with relevant shared schemas or ontologies.</p>
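      <p>A minimal sketch of what an RDFizer does, under stated assumptions, is given below: it takes a record as it might come out of a Weblog database and emits N-Triples lines that give the post and its author their own URIs and describe them with terms from shared vocabularies (SIOC, FOAF and Dublin Core). The record layout, the URI scheme under example.org and the rdfize_post function are hypothetical, the literal escaping is deliberately simplified, and this is not the middleware used in the demonstration.</p>
      <preformat><![CDATA[
```python
from typing import Dict, List

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SIOC = "http://rdfs.org/sioc/ns#"
FOAF = "http://xmlns.com/foaf/0.1/"
DCTERMS = "http://purl.org/dc/terms/"


def iri(value: str) -> str:
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value: str) -> str:
    """Quote a plain string literal (simplified escaping: backslash, quote, newline)."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def rdfize_post(post: Dict[str, str], base: str) -> List[str]:
    """Turn one blog-post record into N-Triples lines.

    The post and its author each get their own URI, so the author is an
    entity that other data can link to rather than just a string.
    """
    post_uri = f"{base}/post/{post['id']}#this"
    author_uri = f"{base}/person/{post['author_id']}#this"
    triples = [
        (iri(post_uri), iri(RDF_TYPE), iri(SIOC + "Post")),
        (iri(post_uri), iri(DCTERMS + "title"), literal(post["title"])),
        (iri(post_uri), iri(FOAF + "maker"), iri(author_uri)),
        (iri(author_uri), iri(RDF_TYPE), iri(FOAF + "Person")),
        (iri(author_uri), iri(FOAF + "name"), literal(post["author_name"])),
    ]
    return [f"{s} {p} {o} ." for s, p, o in triples]


# Hypothetical record, as it might be read from an existing Weblog silo.
example_lines = rdfize_post(
    {"id": "42", "title": 'Hello "Web"', "author_id": "alice", "author_name": "Alice"},
    base="http://example.org/blog",
)
```
]]></preformat>
      <p>The design choice worth noting is the third triple: the author is referenced by URI, not embedded as text, which is what lets the generated data link to descriptions held elsewhere.</p>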
    </sec>
    <sec id="sec-7">
      <title>2.3 Data Junction Boxes in the Clouds</title>
      <p>We believe that the Linked Data Web will be more distributed than centralized in architecture. We envisage a Linked Data Web composed of hubs that range in size from large (e.g. DBpedia, Geonames, Zitgist) through medium-sized group hubs (e.g. RDFized Weblogs, Wikis, Bulletin Boards) to smaller personal hubs enabled by operating system virtualization technologies like Amazon EC2. The medium and smaller hubs are best described as data junction boxes because they act as conduits between existing systems and Linked Data aware User Agents.</p>
    </sec>
    <sec id="sec-8">
      <title>3. Identity &amp; Data Meshing via Linked Data Spaces</title>
      <p>This demonstration walks through a Data Space initialization process for end-users that covers:</p>
      <list list-type="bullet">
        <list-item><p>Domain Name Registration (e.g. .Name acquisition)</p></list-item>
        <list-item><p>DNS configuration</p></list-item>
        <list-item><p>Bonding with existing Web 2.0 platforms: Facebook, phpBB3, MediaWiki, Wordpress, Drupal, Del.icio.us, Flickr, and Bugzilla</p></list-item>
        <list-item><p>Production of dereferenceable URIs that expose the resulting Data Graph</p></list-item>
        <list-item><p>Interaction with the resulting data graph via a number of Linked Data aware User Agents</p></list-item>
      </list>
    </sec>
    <sec id="sec-9">
      <title>4. Links</title>
      <list list-type="bullet">
        <list-item><p>http://en.wikipedia.org/wiki/OpenLink_Data_Spaces - OpenLink Data Spaces</p></list-item>
        <list-item><p>http://en.wikipedia.org/wiki/Virtuoso_Universal_Server - Virtuoso</p></list-item>
        <list-item><p>http://myopenlink.net/ods/index.html - Live OpenLink Data Spaces Demonstration</p></list-item>
      </list>
    </sec>
  </body>
  <back />
</article>