=Paper= {{Paper |id=Vol-369/paper-21 |storemode=property |title=Linked Data Spaces & Data Portability |pdfUrl=https://ceur-ws.org/Vol-369/paper20.pdf |volume=Vol-369 |dblpUrl=https://dblp.org/rec/conf/www/IdehenE08 }} ==Linked Data Spaces & Data Portability== https://ceur-ws.org/Vol-369/paper20.pdf
                        Linked Data Spaces & Data Portability

           Kingsley Idehen                                                                                Orri Erling
          OpenLink Software,                                                                         OpenLink Software,
             10 Mall Road,                                                                              10 Mall Road,
      Burlington, MA 01803, USA                                                                  Burlington, MA 01803, USA
    kidehen@openlinksw.com                                                                     oerling@openlinksw.com



ABSTRACT
In the year 2007, the size of the Linked Data injected into the Web
grew to several billion RDF triples, served by a network of
interlinked data sources that cover domains such as general
knowledge, geographic information, people, companies, online
communities, films, music, books and scientific publications.
Unfortunately, the growth rate of user-generated content from a
variety of Web-based unstructured and semi-structured data silos
continues to exceed that of structured Linked Data. Thus, we have
a pressing need for technology capable of bridging this
broadening divide via transparent generation of Linked Data from
existing data silos on the Web. Our Linked Data technology
demonstration explores the use of the OpenLink Data Spaces
platform as a solution to this problem.

Categories and Subject Descriptors
H.3.2 [Information Storage]
H.3.3 [Information Search & Retrieval]

General Terms
Management, Performance, Design, Standardization, Languages,
Theory

Keywords
Linked Data, Semantic Web, SPARQL, Data Integration, Data
Spaces

1. INTRODUCTION
User-generated content is growing at an exponential rate behind
corporate firewalls and across the Internet in general. The use of
Web technologies has been the prime accelerator of this growth,
due to the pervasiveness of Web-based distributed collaborative
applications. Examples include: Social Networking, Weblogs, Wikis,
Shared Bookmark Managers, Photo Sharing, Polls Management,
Calendars, Discussion Forums, File Sharing, and Feed Aggregation,
to name a few.

The exponential growth of user-generated content has resulted in
the growth of silos comprised of unstructured and/or
semi-structured content. Unfortunately, these silos have
accelerated, rather than decelerated, the imminence of an
“information overload” quagmire.

To alleviate the imminent challenges of global information
overload, we need to unobtrusively construct, from today’s data
silos, a Web of interlinked structured data comprised of the
following:

     •    RDF based structured data
     •    Standardized data serialization formats
     •    HTTP based Unique Identifiers for all Data Items (web
          resources and abstract & concrete things)
     •    HTTP based Data Set containers (Data Spaces)
     •    Data Servers that provide data management and data
          access services for one or more Data Spaces
     •    Key infrastructure oriented shared ontologies
     •    Query Language for interacting with structured data

We identify the items above, collectively, as critical components
of Linked Data Spaces: points of presence on the Web that expose
structured data via HTTP based URIs.

During this demonstration / presentation session we are going to
explore the creation of “Data Junction Boxes in the Clouds” via
OpenLink Data Spaces, which exploits built-in RDFization
Middleware plus the ability to mesh User Identity and User Data,
en route to surmounting the issues and challenges associated with
attaining Data Portability.

2. Issues & Challenges

2.1 Data Portability
It’s no secret that data wants to be free of the tyranny of
application logic confinement. In recent times, the realization that
meshing Identity and Data ownership on the Web is a critical
requirement of this pursuit of freedom has resulted in the
emergence of a movement for Data Portability as yet another
enclave within the broader Open Data movement.

Data portability addresses two key issues: data mobility and data
referencing. Today, data mobility through the use of standard data
formats for moving data across silos (import and export style)
has emerged as the focal point of attention with regard to
addressing the proliferation of data silos on the Web. Examples
include: RSS 1.0, RSS 2.0, Atom, OPML, FOAF, SIOC, and
others. Unfortunately, the ability to reference and de-reference
data across data silos is yet to catch the attention of those
pursuing data portability.
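
To make the data referencing half of data portability (Section 2.1) more concrete, the following is a minimal sketch of how a client might de-reference an HTTP based identifier and ask for structured data instead of HTML. It is an illustration under stated assumptions, not the OpenLink Data Spaces implementation: the example.org URI, the helper names, and the choice of media types in the Accept header are all hypothetical.

```python
import urllib.request
from typing import Tuple

# Media types that ask the server for RDF rather than HTML. Which formats a
# given Data Space can actually serve is an assumption of this sketch.
RDF_ACCEPT = "application/rdf+xml, text/turtle;q=0.9"


def build_rdf_request(uri: str) -> urllib.request.Request:
    """Prepare an HTTP GET that asks for an RDF description of `uri`."""
    return urllib.request.Request(uri, headers={"Accept": RDF_ACCEPT})


def dereference(uri: str, timeout: float = 10.0) -> Tuple[str, bytes]:
    """Fetch `uri` and return (content type, body); redirects are followed."""
    with urllib.request.urlopen(build_rdf_request(uri), timeout=timeout) as resp:
        return resp.headers.get("Content-Type", ""), resp.read()


if __name__ == "__main__":
    # Hypothetical identifier for a person held in a personal Data Space.
    req = build_rdf_request("http://example.org/dataspace/person/alice")
    print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Only `build_rdf_request` is exercised in the usage lines, so the sketch can be read and run without network access; `dereference` performs the actual HTTP round trip when pointed at a live Linked Data source.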
2.2 RDFization Middleware
The traditional resistance to RDF adoption, which is critical to
Linked Data comprehension and production, comes from the
grounding of the RDF Data Model in Graph Theory and the
unwillingness of most Web Application developers to interact
with data formally. This reality has led to a genre of middleware
tools, collectively known as RDFizers, that generate RDF on the
fly.

With regard to Linked Data, generating RDF on the fly is only
part of the equation; the generated RDF must retain the core
principles of Linked Data by providing URIs for physical web
accessible resources, concrete entities, and abstract things. Of
course, this process must include intelligent production of
instance data associated with relevant shared schemas or
ontologies.

2.3 Data Junction Boxes in the Clouds
It is our belief that the Linked Data Web will be more distributed
than centralized in architecture. We envisage a Linked Data Web
comprised of hubs that range in size from large (e.g. DBpedia,
Geonames, Zitgist etc.), through medium sized group hubs (e.g.
RDFized Weblogs, Wikis, Bulletin Boards etc.), to smaller personal
hubs enabled by operating system virtualization technologies like
Amazon EC2. The medium and smaller hubs are best described as
data junction boxes because they act as conduits between existing
systems and Linked Data aware User Agents.

This demonstration will show a Data Space initialization
process for end-users that covers:
    •    Domain Name Registration (e.g. .Name acquisition)
    •    DNS configuration
    •    Bonding with existing Web 2.0 platforms: Facebook,
         phpBB3, MediaWiki, Wordpress, Drupal, Del.icio.us,
         Flickr, and Bugzilla
    •    Production of dereferenceable URIs that expose the
         resulting Data Graph
    •    Interaction with the resulting data graph via a number of
         Linked Data aware User Agents

3. Identity & Data Meshing via Linked Data Spaces

4. Links
    •    http://en.wikipedia.org/wiki/OpenLink_Data_Spaces -
         OpenLink Data Spaces
    •    http://en.wikipedia.org/wiki/Virtuoso_Universal_Server -
         Virtuoso
    •    http://myopenlink.net/ods/index.html - Live OpenLink
         Data Spaces Demonstration