=Paper= {{Paper |id=None |storemode=property |title=Using read/write Linked Data for Application Integration -- Towards a Linked Data Basic Profile |pdfUrl=https://ceur-ws.org/Vol-937/ldow2012-paper-04.pdf |volume=Vol-937 |dblpUrl=https://dblp.org/rec/conf/www/HorsNS12 }} ==Using read/write Linked Data for Application Integration -- Towards a Linked Data Basic Profile== https://ceur-ws.org/Vol-937/ldow2012-paper-04.pdf
 Using read/write Linked Data for Application Integration –
           Towards a Linked Data Basic Profile
           Arnaud J Le Hors                                  Martin Nally                             Steve K Speicher
                  IBM                                           IBM                                           IBM
           Software Standards                          CTO, Rational Software                                STSM
                Architect                                  Fellow and VP                              OSLC Lead Architect
           +1 (720) 396-5228                             +1 (714) 472-2690                             +1 (919) 254-0645
         lehors@us.ibm.com                              nally@us.ibm.com                          sspeiche@us.ibm.com



ABSTRACT
Linked Data, as defined by Tim Berners-Lee’s 4 rules [1], has enjoyed considerable well-publicized success as a technology for publishing data in the World Wide Web [2]. The Rational group in IBM has for several years been employing a read/write usage of Linked Data as an architectural style for integrating a suite of applications, and we have shipped commercial products using this technology. We have found that this read/write usage of Linked Data has helped us solve several perennial problems that we had been unable to successfully solve with other application integration architectural styles that we have explored in the past. The applications we have integrated in IBM are primarily in the domains of Application Lifecycle Management (ALM) and Integration System Management (ISM), but we believe that our experiences using read/write Linked Data to solve application integration problems could be broadly relevant and applicable within the IT industry.

This paper explains why Linked Data, which builds on the existing World Wide Web infrastructure, presents some unique characteristics, such as being distributed and scalable, that may allow the industry to succeed where other application integration approaches have failed. It discusses lessons we have learned along the way and some of the challenges we have been facing in using Linked Data to integrate enterprise applications.

Finally, we discuss several areas that could benefit from additional standard work and discuss several commonly applicable usage patterns along with proposals on how to address them using the existing W3C standards in the form of a Linked Data Basic Profile. This includes techniques applicable to clients and servers that read and write linked data, a type of container that allows new resources to be created using HTTP POST and existing resources to be found using HTTP GET (analogous to things like Atom Publishing Protocol (APP) [3]).

General Terms
Management, Design, Standardization

Keywords
Linked Data, Usage Patterns, Application Integration, Enterprise Application, Standards, ALM, ISM, Profile

 Copyright is held by the author/owner(s).
 LDOW2012, April 16, 2012, Lyon, France.

1.        INTRODUCTION
There is interest in Linked Data technologies for more than one purpose. We have seen interest for the purpose of exposing information – for example public records – on the Internet in a machine-readable format. We have also seen interest in the use of Linked Data for inferring new information from existing information, for example in pharmaceutical applications or IBM Watson [4]. The IBM Rational team has been using Linked Data as an architectural model and implementation technology for application integration in the Product and Application Lifecycle Management domain. This approach has been largely successful and we are pleased – even passionate – about the results but getting there has not been easy. Although related work exists [5][6][7][8][9], as far as we can tell, there is only a very limited number of people trying to use Linked Data technologies the way we are, and the little information that is available on best practices and pitfalls remains widely dispersed. We believe that Linked Data has the potential to solve some important problems that have frustrated the IT industry for many years, or at least make significant advances in that direction, but this potential will only be realized if we can establish and communicate a much richer body of knowledge on how to exploit these technologies. In some cases, there also are gaps in the Linked Data standards that need to be addressed. To help with this process, we discuss in this paper several best practices and anti-patterns we have identified as applicable to more domains than ALM. These include accessing, updating and creating resources from servers that expose their resources as Linked Data.

2.        THE INTEGRATION CHALLENGE
IBM Rational is a vendor of industry leading system and software development tools, particularly those that support the general software development process such as bug tracking, requirements management and test management tools. Like many vendors who sell multiple applications, we have seen strong customer demand for better support of more complete business processes - in our case system and software development processes - that span the roles, tasks and data addressed by multiple tools. While answering this demand within the realm of a single vendor offering made of many different products can be challenging it quickly becomes unmanageable when customers want to mix in products from other vendors as well as their own homegrown components.

We describe our problem domain here to explain that we were led to explore these technologies by our need to solve long-standing problems in commercial application development and to emphasize that our conclusions are supported by experience in
shipping and deploying real applications, but we do not believe that our experiences or these technologies are limited to our application domain. These problems are encountered in many application domains, have existed for many years, and our industry has tried several different architectural approaches to address the problem of integrating the various products these complex scenarios require. Here are a few:
     1.   Implement some sort of Application Programming Interface (API) for each application, and then, in each application, implement “glue code” that exploits the APIs of other applications to link them together.
     2.   Design a single database to store the data of multiple applications, and implement each of the applications against this database. In the software development tools business, these databases are often called “repositories”.
     3.   Implement a central “hub” or “bus” that orchestrates the broader business process by exploiting the APIs described in option 1 above.

While a discussion of the failings of each of these approaches is outside the scope of this document, it is fair to say that although each one of them has its adherents and can point to some successes, none of them is wholly satisfactory. So, we decided to look for an alternative.

3.   WHAT WOULD SUCCESS LOOK LIKE?
Unsatisfied with the state of the art regarding product integration in the ALM domain, we decided around 2004 to have another look at how we might approach this integration problem.
Stepping back from what had already been attempted to date, we started by identifying what characteristics an ideal solution would have. We came up with the following list:

Distributed – because of outsourcing, acquisitions, and the Internet, systems and work forces are increasingly distributed.

Scalable – we need to scale to an unlimited number of products and users.

Reliable – as we move from local area networks to wide area networks, as we move to remote areas of the world without the best infrastructures, and as users increasingly use mobile technology, we have to be reliable across a wide range of connectivity profiles.

Extensible – we need to be extensible in the sense that we can work with a wide variety of resources, both in the application delivery domain and in adjacent domains.

Simple – avoid the fragility we saw with tight coupling and keep the barrier to entry low so that it will be easy for people to interoperate with our products.

Equitable – an equitable architecture that is open to everyone with no barriers to participation.

4.        THE SOLUTION
When looking for a solution that had these characteristics – distributed, scalable, reliable, extensible, simple, and equitable – we realized that one such solution already existed: the World Wide Web.

The Internet is all over the world, it supports billions of users, it’s never gone down, it supports every kind of capability from web pages to video, from education to business, and anyone with an internet connection and an input device can participate in it.

One reason the Web enjoys all these characteristics is that it works in terms of protocols and resource formats rather than application-specific interfaces. As an example, the web allows anyone to access any web page using whatever device and browser they like, independently of the type of hardware and system the server is running on. This is possible because the web relies on a resource format for web pages – HTML – and a protocol for accessing these resources – HTTP.

Applying the same principle to the ALM domain integration problem meant thinking in terms of domain-specific resources, such as requirements, change requests, and defects, and access to these resources rather than in terms of tools. We stopped thinking of the applications as being the central concept of the architecture and instead started to focus on the resources.

In this architecture the focus is on a web of resources from the various application domains (in our case, change management, quality management, etc.); the applications are viewed simply as handlers of HTTP requests for those resources, and are not a central focus. Because each resource is identified by a URI, we can easily express arbitrary linkage between resources from different domains or the same domain.

When we started in this direction, we were not fully aware of the linked data work – we reasoned by analogy with the HTML web, and we had understood the value of HTTP and URLs for solving our problems. For data representations, we continued to look to XML for solutions. Over time it became clear to us that to realize the full potential of the architecture we needed a simpler and more prescriptive data model than the one offered by XML, and so we started transitioning to RDF [10]. At this point we realized that what we were really doing was applying Linked Data principles to application integration.

5.        LINKED DATA
We wanted an architecture that was minimalist and loosely coupled, had a standard data representation, kept the barriers to entry low and could be supported by existing applications implemented with many implementation technologies. Linked Data was just what we needed.

Linked Data was defined by Tim Berners-Lee as the following four rules [1]:

     1)   Use URIs as names for things
     2)   Use HTTP URIs so that people can look up those names.
     3)   When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
     4)   Include links to other URIs, so that they can discover more things.

RDF provides a data model that is very flexible and enables interoperability and extensibility.

With RDF we were able to model the different types of resources we needed and the relationships between them, such that for ALM a change request becomes a resource exposed as RDF that can be linked to the defect it is to address, and a test to use to validate the change to be made. With Linked Data the change management,
defect management, and test management tools no longer connect to each other via specific interfaces but simply access the resources directly, following the Linked Data principles.

6.       CONVENTIONS
As we embarked on the process of defining the various resource types we needed, their relationships, and their lifecycle, it became apparent that we also needed to define a set of conventions above what is currently defined by W3C and the Linked Data standards. Some of these are simple rules that could be thought of as clarification of the basic Linked Data principles. Others are necessary because, unlike many uses of Linked Data, which are essentially read-only, our use of Linked Data is fundamentally read-write, which raises its own set of challenges.
The following lists some of the categories these conventions fall in:

     •   Resources – a set of HTTP and RDF standard techniques and best practices that you should use, and anti-patterns you should avoid, when constructing clients and servers that read and write linked data. This includes a set of common properties leveraging existing RDF vocabularies such as Dublin Core [11]. It also includes what HTTP verb to use for creating, updating, getting, and deleting a resource as well as how to use them. In particular, in a system where tools may expand resources with additional properties beyond the core properties required to be supported by everyone, it is crucial that any application that updates a resource preserves the properties it doesn’t understand.

     •   Containers – a type of resource that allows new resources to be created using HTTP POST and existing resources to be found using HTTP GET. These containers are to RDF what APP is to XML. They answer the following two basic questions:
         1)   To which URLs can I POST to create new resources?
         2)   Where can I GET a list of existing resources?

     •   Paging – a mechanism for splitting the information in large containers into pages that can be fetched incrementally. For example, an individual defect usually is sufficiently small that it makes sense to send it all at once, but the list of all the defects ever created is typically too big. The paging mechanism provides a way to communicate the list in chunks with a simple set of conventions on how to query the first page and how pages are linked from one to the next.

     •   Ordering – a mechanism for specifying which predicates were used for page ordering.

The following sections provide further details regarding a proposal for addressing these in the form of a “Basic Profile for Linked Data” inspired by our work on Open Services for Lifecycle Collaboration (OSLC) [12].

7.       TERMINOLOGY
The terminology used in this paper is based on W3C's Architecture of the World Wide Web [13] and Hypertext Transfer Protocol (HTTP/1.1) [14].

Link: A relationship between two resources when one resource (representation) refers to the other resource by means of a URI.

Basic Profile: A specification that defines the needed specification components from other specifications as well as providing clarifications and patterns. The "Basic Profile for Linked Data" is sometimes referred to in shortened form as the "Basic Profile".

Client: A program that establishes connections for the purpose of sending requests.

Basic Profile Client: A client that adheres to the rules defined in the Basic Profile.

Server: An application program that accepts connections in order to service requests by sending back responses. Any given program may be capable of being both a client and a server; our use of these terms refers only to the role being performed by the program for a particular connection, rather than to the program's capabilities in general. Likewise, any server may act as an origin server, proxy, gateway, or tunnel, switching behavior based on the nature of each request.

Basic Profile Server: A server that adheres to the rules defined in the Basic Profile.

8.        BASIC PROFILE RESOURCES
Basic Profile Resources are HTTP linked data resources that conform to some simple patterns and conventions. Most Basic Profile Resources are domain-specific resources that contain data for an entity in some domain, which could be commercial, governmental, scientific, religious or other. A few Basic Profile Resources are defined by the Basic Profile specifications and are cross-domain. All Basic Profile Resources follow the four basic rules of Linked Data, previously laid out in section 5, to which Basic Profile adds a few rules of its own. Some of these rules could be thought of as clarification of the basic linked data rules.

     1.   Basic Profile Resources are HTTP resources that can be created, modified, deleted and read using standard HTTP methods.
          (Clarification or extension of Linked Data rule #2.) Basic Profile Resources are created by HTTP POST (or PUT) to an existing resource, deleted by HTTP DELETE, updated by HTTP PUT or PATCH [15], and "fetched" using HTTP GET. Additionally, Basic Profile Resources can be created, updated and deleted using SPARQL Update [16].

     2.   Basic Profile Resources use RDF to define their state.
          (Clarification of Linked Data rule #3.) The state (in the sense of state used in the REST architecture) of a Basic Profile Resource is defined by a set of RDF triples. Basic Profile Resources can be mixed in the same application with other resources that do not have useful RDF representations such as binary and text resources.

     3.   You can request an RDF/XML representation of any Basic Profile Resource.
          (Clarification of Linked Data rule #3.) The resource may have other representations as well. These could be
     other RDF formats, like Turtle, N3 or NTriples, but non-RDF formats like HTML and JSON would also be popular additions, and Basic Profile sets no limits.

4.   Basic Profile clients use Optimistic Collision Detection on Update.
     (Clarification of Linked Data rule #2.) Because the update process involves first getting a resource, modifying it and then later putting it back to the server, there is the possibility of a conflict, e.g. some other client may have updated the resource since the GET. To mitigate this problem, Basic Profile implementations should use the HTTP If-Match header and HTTP ETags to detect collisions.

     o DateTime: a Date and Time type as specified by XSD dateTime.
     o Decimal: a decimal number type as specified by XSD Decimal.
     o Double: a double floating-point number type as specified by XSD Double.
     o Float: a floating-point number type as specified by XSD Float.
     o Integer: an integer number type as specified by XSD Integer.
     o String: a string type as specified by XSD String.
     o XMLLiteral: a Literal XML value.
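
The optimistic collision detection of rule 4 can be illustrated with a small, purely in-memory sketch. The `InMemoryServer` class, its hash-based ETag and the sample resource text are assumptions made for illustration only; they are not part of Basic Profile, and a real client would send the same headers over HTTP. The sketch ignores the `*` and multi-value forms of If-Match.

```python
import hashlib


class InMemoryServer:
    """Toy stand-in for a server holding one resource (hypothetical)."""

    def __init__(self, state: str) -> None:
        self.state = state

    def _etag(self) -> str:
        # Derive the entity tag from the current state; real servers may
        # use version counters or timestamps instead.
        digest = hashlib.sha256(self.state.encode("utf-8")).hexdigest()[:16]
        return f'"{digest}"'

    def get(self):
        """Return (status, headers, body) as an HTTP GET would."""
        return 200, {"ETag": self._etag()}, self.state

    def put(self, new_state: str, headers: dict):
        """Replace the state unless the If-Match precondition fails."""
        if_match = headers.get("If-Match")
        if if_match is not None and if_match != self._etag():
            return 412, {}, ""  # 412 Precondition Failed: collision detected
        self.state = new_state
        return 200, {"ETag": self._etag()}, self.state


def update(server: InMemoryServer, etag: str, new_state: str) -> bool:
    """Attempt a conditional PUT; return True if the update was accepted."""
    status, _headers, _body = server.put(new_state, {"If-Match": etag})
    return status == 200


server = InMemoryServer('<#defect1> dcterms:title "Crash on save" .')

# Two clients fetch the same version of the resource.
_, headers_a, _ = server.get()
_, headers_b, _ = server.get()

# Client A updates first and succeeds; client B still holds the old ETag.
a_ok = update(server, headers_a["ETag"], '<#defect1> dcterms:title "Crash on save (fixed)" .')
b_ok = update(server, headers_b["ETag"], '<#defect1> dcterms:title "Crash when saving" .')
print(a_ok, b_ok)  # True False
```

After a rejected update, a client would typically GET the resource again, reapply its change to the fresh state, and retry with the new ETag.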
5.   Basic Profile Resources use standard vocabularies.
     Basic Profile Resources use common vocabularies (classes, properties, etc.) for common concepts. Many web sites define their own vocabularies for common concepts like resource types, label, description, creator, last-modification-time, priority, enumeration of priority values and so on. This is usually viewed as a good feature by users who want their data to match their local terminology and processes, but it makes it much harder for organizations to subsequently integrate information in a larger view. Basic Profile requires all resources to expose common concepts using a common vocabulary for properties. Sites may choose to additionally expose the same values under their own private property names in the same resources. In general, Basic Profile avoids inventing its own property names where possible – it uses ones from popular RDF-based standards like the RDF standards themselves, Dublin Core, and so on. Basic Profile invents property URLs where no match is found in popular standard vocabularies. A number of recommended standard properties for use in Basic Profile Resources are listed below, in section 8.1.

8.   Basic Profile clients expect to encounter unknown properties and content.
     Basic Profile provides mechanisms for clients to discover lists of expected properties for resources for particular purposes, but also assumes that any given resource may have many more properties than are listed. Some servers will only support a fixed set of properties for a particular type of resource. Clients should always assume that the set of properties for a resource of a particular type at an arbitrary server may be open in the sense that different resources of the same type may not all have the same properties, and the set of properties that are used in the state of a resource is not limited to any pre-defined set. However, when dealing with Basic Profile Resources, clients should assume that a Basic Profile server may discard triples for properties of which it does not have prior knowledge. In other words, servers may restrict themselves to a known set of properties, but clients may not. When doing an update using HTTP PUT, a Basic Profile client must preserve all property-values retrieved using GET that it doesn’t change, whether it understands them or not. (Use of HTTP PATCH or SPARQL Update instead of PUT for update avoids this burden for clients.)
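
The obligation in rule 8 to preserve property-values the client does not understand can be sketched as follows. This is a minimal illustration under stated assumptions: the state is modeled as a plain Python set of (subject, predicate, object) string tuples rather than a real RDF graph, and the resource URI, the `vendor#severity` property and the helper `set_property` are hypothetical names.

```python
DCTERMS_TITLE = "http://purl.org/dc/terms/title"


def set_property(triples, subject, predicate, new_value):
    """Return a copy of `triples` with one property replaced and all others kept."""
    kept = {t for t in triples if (t[0], t[1]) != (subject, predicate)}
    kept.add((subject, predicate, new_value))
    return kept


defect = "http://example.org/defects/42"  # hypothetical resource URI

# State as retrieved with GET; this client only understands dcterms:title.
fetched = {
    (defect, DCTERMS_TITLE, "Crash on save"),
    (defect, "http://example.org/vendor#severity", "high"),  # unknown to this client
}

# State to send back with PUT: the title changes, the unknown triple survives.
to_put = set_property(fetched, defect, DCTERMS_TITLE, "Crash when saving")
```

Building the PUT body from the fetched state, rather than from only the properties the client knows about, is what keeps the unknown triple from being lost.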
6.   Basic Profile Resources set rdf:type explicitly.
     A resource’s membership in a class extent can be indicated explicitly – by a triple in the resource representation that uses the rdf:type predicate and the URL of the class – or derived implicitly. In RDF there is no requirement to place an rdf:type triple in each resource, but this is a good practice, since it makes queries more useful in cases where inferencing is not supported. Remember also that a single resource can have multiple values for rdf:type. For example, the DBpedia entry for Barack Obama [17] has dozens of rdf:types. Basic Profile sets no limits to the number of types a resource can have.

9.   Basic Profile clients do not assume the type of a resource at the end of a link.
     Many specifications and most traditional applications have a “closed model”, by which we mean that any reference from a resource in the specification or application necessarily identifies a resource in the same specification (or a referenced specification) or application. By contrast, the HTML anchor tag can point to any resource addressable by an HTTP URI, not just other HTML resources. Basic Profile works like HTML in this sense. An HTTP URI reference in one Basic Profile resource may in general point to any resource, not just a Basic Profile resource.
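
Rules 6 and 9 together suggest that a client should inspect the explicit rdf:type triples of whatever it finds at the end of a link instead of assuming a type. The sketch below is only an illustration: triples are modeled as plain string tuples, and the class URIs, resource URIs and the helper names `types_of` and `describe` are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DEFECT = "http://example.org/ns#Defect"  # hypothetical class URI


def types_of(triples, subject):
    """Collect every explicit rdf:type value of `subject` (several or none)."""
    return {o for (s, p, o) in triples if s == subject and p == RDF_TYPE}


def describe(triples, subject):
    """Handle a linked resource without assuming its type in advance."""
    types = types_of(triples, subject)
    if DEFECT in types:
        return "defect"
    if not types:
        return "untyped resource"
    return "resource of other type(s)"


link_target = "http://example.org/things/7"
graph = {
    (link_target, RDF_TYPE, DEFECT),
    (link_target, RDF_TYPE, "http://example.org/ns#ChangeRequest"),
    ("http://example.org/things/8", "http://purl.org/dc/terms/title", "Untyped"),
}
print(describe(graph, link_target))                    # defect
print(describe(graph, "http://example.org/things/8"))  # untyped resource
```

Because a resource may carry several types, the check is a set membership test rather than an equality test on a single type value.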
7.   Basic Profile Resources use a restricted number of standard datatypes.
     RDF does not by itself define datatypes to be used for property values, so Basic Profile lists a set of standard datatypes to be used in Basic Profile to increase interoperability. Here is the list:
     o Boolean: a boolean type as specified by XSD [18] Boolean.
     o Date: a Date type as specified by XSD date.

     There are numerous reasons to maintain an open model like HTML’s. One is that it allows data that has not yet been defined to be incorporated in the web in the future. Another reason is that it allows individual applications and sites to evolve over time – if clients assume that they know what will be at the other end of a link, then the data formats of all resources across the transitive closure of all links have to be kept stable for version upgrade.
     A consequence of this independence is that client implementations that traverse HTTP URI links from one resource to another should always code defensively and be prepared for any resource at the end of the link. Defensive coding by clients is necessary to allow sets of applications that communicate via Basic Profile to be independently upgraded and flexibly extended.

8.1        Common Properties
The following are some properties from well-known RDF vocabularies that are recommended for use in Basic Profile Resources. Basic Profile requires none of them, but a specification based on Basic Profile may require one or more of these properties for a particular resource type.

Commonly used namespace prefixes:
@prefix dcterms: .
@prefix rdf: .
@prefix rdfs: .
@prefix bp: .

Property              Range             Comment
                                        resource. This is the predicate to use when you don't know what else to use. If you know more specifically what sort of relationship it is, use a more specific predicate.
dcterms:subject       rdfs:Resource     Should be a URI (see dbpedia.org) "Typically, the subject will be represented using keywords, key phrases, or classification codes. Recommended best practice is to use a controlled vocabulary. To describe the spatial or temporal topic of the resource, use the Coverage element." (from Dublin Core)
dcterms:title         rdf:XMLLiteral    A name given to the resource. Represented as rich text in XHTML format. SHOULD include only content that is valid inside an XHTML
@prefix xsd:
                                                                                                          element.
  .
                                                                    8.1.2      From RDF
8.1.1      From Dublin Core                                         URI: http://www.w3.org/1999/02/22-rdf-syntax-ns#
URI: http://purl.org/dc/terms/
                                                                    Property Range            Comment
Property            Range          Comment                                                    The type or types of the resource. Basic
                                   The identifier of a resource                               Profile recommends that the rdf:type(s) of
                                   (or blank node) that is a        rdf:type rdfs:Class       a resource be set explicitly in resource
                                   contributor of information.                                representations to facilitate query with non-
dcterms:contributor dcterms:Agent This resource may be a                                      inferencing query engines
                                   person or group of people, or
                                                                    8.1.3      From RDF Schema
                                   possibly an automated
                                                                    URI: http://www.w3.org/2000/01/rdf-schema#
                                   system.
                                   The identifier of a resource     Property        Range    Comment
                                   (or blank node) that is the                               The URI (or blank node identifier) of a
                                                                    rdfs:member rdf:Resource
                                   original creator of the                                   member of a container.
dcterms:creator     dcterms:Agent resource. This resource may                                "Provides a human-readable version of a
                                                                    rdfs:label  rdf:Resource
                                   be a person or group of                                   resource name." (From RDFS)
                                   people, or possibly an
                                   automated system.
dcterms:created     xsd:dateTime The creation timestamp
                                                                    9.         BASIC PROFILE CONTAINER
                                   Descriptive text about the       Many HTTP applications and sites have organizing concepts that
                                   resource represented as rich     partition the overall space of resources into smaller containers.
                                   text in XHTML format.            Blog posts are grouped into blogs, wiki pages are grouped into
dcterms:description rdf:XMLLiteral SHOULD include only              wikis, and products are grouped into catalogs. Each resource
                                   content that is valid and        created in the application or site is created within an instance of
                                   suitable inside an XHTML         one of these container-like entities, and users can list the existing
                                   
element. artifacts within one. There is no agreement across applications or A unique identifier for the sites, even within a particular domain, on what these grouping resource. Typically read-only concepts should be called, but they commonly exist and are and assigned by the service important. Containers answer two basic questions, which are: dcterms:identifier rdfs:Literal provider when a resource is 1. To which URLs can I POST to create new resources? created. Not typically 2. Where can I GET a list of existing resources? intended for end-user display. Date on which the resource In the XML world, APP has become popular as a standard for dcterms:modified xsd:dateTime answering these questions. APP is not a good match for Linked was changed. Data - this specification shows how the same problems that are dcterms:relation rdfs:Resource The URI of a related solved by APP for XML-centric designs can be solved by a simple Linked Data usage pattern with some simple conventions . on posting to RDF containers. We call these RDF containers that @prefix o: . you can POST to Basic Profile Containers. Here are some of their characteristics: a bp:Container; 1. A Basic Profile Container is a resource that is a Basic bp:membershipSubject Profile Resource of type bp:Container. ; 2. Clients can retrieve the list of existing resources in a Basic Profile Container. bp:membershipPredicate o:asset. 3. New resources are created in a Basic Profile Container by POSTing to it. 4. Any resource can be POSTed to a Basic Profile a o:netW; Container - a resource does not have to be a Basic Profile Resource with an RDF representation to be o:asset POSTed to a Basic Profile Container. , 5. After POSTing a new resource to a container, the new . resource will appear as a member of the container until The essential structure of the container is the same, but in this it is deleted. 
A container may also contain resources that example, the membership subject is not the container itself – it is were added through other means - for example through a separate net worth resource. The membership predicate is the user interface of the site that implements the o:asset – a predicate from the domain model. A POST to this Container. container will create a new asset and add it to the list of members 6. The same resource may appear in multiple containers. by adding a new membership triple to the container. You might This happens commonly if one container is a "view" wonder why we didn’t just make http://example.org/netW/nw1 a onto a larger container. container and POST the new asset directly there. That would be a 7. Clients can get partial information about a Basic Profile fine design if http://example.org/netW/nw1 had only assets, but if Container without retrieving a full representation it has separate predicates for assets and liabilities, that design will including all of its contents. not work because it is unspecified to which predicate the POST The representation of a Basic Profile Container is a standard RDF should add a membership triple. Having separate container representation using the rdfs:member predicate or http://example.org/netW/nw1/assetCont and another predicate specified by bp:membershipPredicate. For http://example.org/netW/nw1/liabilityCont container resources example, if you have a container with the URL allows both assets and liabilities to be created. http://example.org/container1, it might have the following In this example, clients cannot simply guess which resource is the representation: membership subject and which predicate is the membership @prefix dcterms: . predicate, so the example includes this information in triples @prefix rdfs: whose subject is the Basic Profile Container resource itself. . @prefix bp: 9.1 rdfs:Container Properties Because a Basic Profile Container is a Basic Profile Resource the . 
same set of common properties described in section 8.1 applies. In addition, Basic Profile Containers have the following specific a bp:Container ; properties: dcterms:title "A very simple container"; Property Occurs Range Comment rdfs:member Indicates which , predicate of the bp:membershi container should be used , zero or one rdfs:Property pPredicate to determine the . membership when it is Basic Profile does not recognize or recommend the use of other not rdfs:member. forms of RDF container such as Bag and Seq because they are not Indicates which resource friendly to query. This follows standard linked data guidance for is the subject for the RDF usage (see RDF Features Best Avoided in the Linked Data bp:membershi zero or one rdfs:Property members of the Context [5]). pSubject container when it is not Sometimes it is useful to use a subject other than the container the container itself. itself as the membership subject and to use a predicate other than rdfs:member as the membership predicate, as illustrated below. 9.2 Retrieving non-member properties # The following is the representation of The representation of a container that has many members may be # http://example.org/netW/nw1/assetCont large. When we looked at our use cases, we saw that there were @prefix rdfs: several important cases where clients needed to access only the non-member properties of the Container. [The dcterms properties . listed in this page may not seem important enough to warrant @prefix bp: addressing this problem, but we have use cases that add other predicates to containers - for providing validation information and resources, and it requires the definition of a custom HTTP header, associating SPARQL endpoints for example.] Since retrieving the which to some people at least seems comparatively heavyweight. whole container representation to get this information may be onerous, we were motivated to define a way to retrieve only the 9.4 Paging non-member property values. 
We do this by defining for each Basic Profile Containers may support a technique called Paging Basic Profile Container a corresponding resource, called the "non- which allows the representation of large containers to be member resource", whose state is a subset of the state of the transmitted in chunks. container. The non-member resource's HTTP URI can be derived Paging can be achieved with a simple RDF pattern. For each in the following way. container resource, , we define a new resource If the HTTP URI of the container is {url}, then the HTTP URI of ?firstPage. The triples in the representation of the related non-member resource is {url}?non-member-properties. ?firstPage are a subset of the triples in The representation of {url}?non-member-properties is identical to - same subject, predicate and object. the representation of {url}, except that the membership triples are Basic Profile Container servers may respond to requests for a missing. The subjects of the triples will still be {url} (or whatever container by redirecting the client to the first page resource – they were in the representation of {url}), not {url}?non-member- using a HTTP-303 “See Other” HTTP redirect to the actual URL properties. Any server that does not support non-member- for the page resource. resources should return an HTTP 404-NotFound error when a non-member-resource is requested. Continuing on from the member information from the JohnZSmith net worth example, we’ll split the response across This approach can be thought of as being analogous to using two pages. The client requests the first page as HTTP HEAD compared to HTTP GET. HTTP HEAD is used to http://example.org/netW/nw1/assetCont?firstPage: fetch the response headers for a resource as opposed to requesting the entire representation of a resource using HTTP GET. # The following is the representation of Here is an example: # http://example.org/netW/nw1/assetCont?firstPage Request: @prefix rdf: GET /container1?non-member-properties . 
HOST: example.org Accept: text/turtle @prefix dcterms: . @prefix bp: Response: . @prefix rdfs: @prefix o: . . @prefix dcterms: <. a bp:Container; a bp:Container; dcterms:title dcterms:title "The assets of JohnZSmith"; "A Basic Profile Container of Acme Resources"; bp:membershipPredicate rdfs:member; bp:membershipSubject dcterms:publisher . ; bp:membershipPredicate o:asset. 9.3 Design motivation and background The concept of non-member-resources has not been especially controversial, but using the URL pattern {url}?non-member- properties to identify them has been controversial. Some people a bp:Page; feel it's an unacceptable intrusion into the URL space that is bp:pageOf owned and controlled by the server that defines {url}. A more ; practical objection is that servers respond unpredictably to URLs they do not understand, especially those that have a "?" character bp:nextPage in them. For example, some servers will return the resource . identified by the portion of the URL that precedes the “?” and simply ignore the rest. This problem could perhaps be mitigated by using a character other than "?" in the URL pattern. An alternative design that was discussed uses a header field in the a o:netW; response header of {url} to allow the server to control and o:asset communicate the URL of the corresponding non-member- , resource - presence or absence of the header field would let , clients know whether the non-member-resource is supported by , the server. The advantages of this approach are that it does not impinge on the server's URL space, and it works predictably for . servers that do not understand the concept of a non-member- resource. The disadvantages are that it requires two server round- trips - a HEAD and a GET - to retrieve the non-member- a o:Stock; o:value 100.00. it chooses based on the value of any available property of the members. 
In the example below, the value of the o:value predicate a o:Cash; is present for each member, so the client can easily order the members according to the value of that property. In this way, o:value 50.00. Basic Profile Container avoids the use of RDF constructs like Seq # server initially supplied no data for a3 and a4 and List for expressing order. in this response Order only becomes important for Basic Profile Container servers when containers are paginated. If the server does not respect The following example is the result of retrieving the ordering when constructing pages, the client is forced to retrieve representation for the next page: all pages before sorting the members, which would defeat the purpose of pagination. In cases where ordering is important, a Basic Profile Container server exposes all the members on a page # The following is the representation of with a higher sort order than all members on the previous page # http://example.org/netW/nw1/assetCont?p=2 and lower sort order than all the members on the next page. The @prefix rdf: Basic Profile Container specification provides a predicate - . bp:containerSortPredicates - that the server may use to @prefix dcterms: . communicate to the client which predicates were used for page ordering. Multiple predicate values may have been used for @prefix bp: sorting, so the value of this predicate is an ordered list. . Here is an example container described previously, with @prefix o: . representation for ordering of the assets: # The following is the ordered representation of # http://example.org/netW/nw1/assetCont a bp:Container; @prefix rdf: dcterms:title "The assets of JohnZSmith"; . bp:membershipSubject @prefix dcterms: . ; @prefix bp: bp:membershipPredicate o:asset. . @prefix o: . a bp:Page; bp:pageOf a bp:Container; ; dcterms:title "The assets of JohnZSmith"; bp:nextPage rdf:nil. bp:membershipSubject ; bp:membershipPredicate o:asset. a o:netW; o:asset . 
a bp:Page; bp:pageOf ; a o:Stock; bp:containerSortPredicates (o:value). dcterms:title "Big Co."; o:value 200.02. In this example, there is only one member in the container in the a o:netW; final page. To indicate this is the last page, a value of rdf:nil is o:asset used for the bp:nextPage predicate of the page resource. , Basic Profile Container guarantees that any and all the triples , about the members will be on the same page as the membership . triple for the member. 9.5 Ordering There are many cases where an ordering of the members of the a o:Stock; container is important. Basic Profile Container does not provide any particular support for server ordering of members in o:value 100.00. containers, because any client can order the members in any way a o:Cash; Thanks to Arthur Ryman and John Arwe (as well as others) for o:value 50.00. review, feedback, and some content. a o:RealEstateHolding; 12. REFERENCES o:value 300000. [1] Tim Berners-Lee. Linked Data Design Issues, 2006 http://www.w3.org/DesignIssues/LinkedData.html As you can see by the addition of the bp:containerSortPredicates predicate, the o:value predicate is used to define the ordering of [2] Linked Data – Connect Distributed Data across the Web the results. It is up to the domain model and server to determine http://linkeddata.org/ the appropriate predicate to indicate the resource’s order within a [3] J Gregorio, B. de hOra. Atom Publishing Protocol (APP), page, and up to the client receiving this representation to use that IETF RFC5023, 2007 order in whatever way is appropriate, for example to sort the data http://www.ietf.org/rfc/rfc5023.txt prior to presentation on a user interface. [4] IBM Watson http://www.ibm.com/innovation/us/watson 10. CONCLUSION [5] Tom Heath, Christian Bizer. Linked Data: Evolving the Web We have shipped a number of products using the Linked Data into a Global Data Space, 2011. 
technology as a way to integrate ALM products and are generally http://linkeddatabook.com/editions/1.0/ pleased with the result. We now have more products in development that use these technologies and are seeing a strong [6] Leigh Dodds. Ian Davis. Linked Data Patterns, 2011. interest in this approach in other parts of our company. http://patterns.dataincubator.org As more data gets exposed using Linked Data we believe we will [7] Tetlow, Phil, Jeff Z Pan, Daniel Oberle, Evan Wallace, be able to do even more for our customers, with a set of Michael Uschold, and Elisa Kendall. Ontology Driven integration services with richer capabilities such as traceability Architectures and Potential Uses of the Semantic Web in across relationships, impact analysis and deep querying Systems and Software Engineering, W3C, 2006. capabilities. Additionally, we will be able to develop higher level http://www.w3.org/2001/sw/BestPractices/SE/ODA/. analytics, reports, and dashboards providing data from multiple [8] De Cesare, Sergio, Guido L Geerts, Grant Holland, Mark products across different domains. We will be able to answer Lycett, and Chris Partridge. Ontology-driven software questions such as: what enhancements in today's build address engineering. Ed. Regina Bernhaupt, Peter Forbrig, Jan requirements that need to be tested with certain test cases? Gulliksen, and Marta Lrusdttir. 2010. October 6409: 279- We believe that Linked Data has the potential to solve some 280. important problems that have frustrated the IT industry for many http://portal.acm.org/citation.cfm?doid=1639950.1639983. years, or at least make significant advances in that direction, but [9] Hesse, Wolfgang. Engineers Discovering the Real World this potential will only be realized if we can establish and From Model-Driven to Ontology-Based Software communicate a much richer body of knowledge on how to exploit Engineering. 2008. In Information Systems and eBusiness these technologies. Technologies, ed. 
Will Aalst, John Mylopoulos, Norman M It has taken us a number of years of experimentation to achieve Sadeh, Michael J Shaw, Clemens Szyperski, Roland the level of understanding that we have today, we have made Kaschek, Christian Kop, Claudia Steinberger, and Gnther some costly mistakes along the way, and we see no immediate Fliedl, 5:136-147. Springer Berlin Heidelberg. end to the challenges and learning that lie before us. As far as we http://dx.doi.org/10.1007/978-3-540-78942-0_16. can tell, there is only a very limited number of people trying to [10] Graham Klyne, Jeremy J. Carroll. Resource Description use Linked Data technologies in the ways we are using them, and Framework (RDF), W3C, 2004 the little information that is available on best practices and pitfalls http://www.w3.org/TR/rdf-concepts/ is widely dispersed. In some cases, there also are gaps in the [11] Dublin Core Metadata Initiative Linked Data standards that need to be addressed. http://dublincore.org We believe that defining a simple basic profile will enable [12] Open Services for Lifecycle Collaboration (OSLC) broader adoption of Linked Data principles for application http://open-services.net integration. Additional development of some of the concepts will be needed to complete such a basic profile. We are encouraged by [13] Ian Jacobs, Norman Walsh. Architecture of the World Wide the work started at the W3C Linked Enterprise Data Pattenrs Web, W3C. 2004. workshop [19] and look forward to participating in subsequent http://www.w3.org/TR/webarch/ activities. [20] [14] L Dusseault, J. Snell, PATCH Method for HTTP. IETF By sharing information on how we use these technologies we RFC5789, 2010 hope to help the industry move forward on these issues. http://tools.ietf.org/html/rfc5789 [15] Paul Gearon, Alexandre Passant, Axel Polleres. SPARQL 1.1 11. 
ACKNOWLEDGMENTS Update, W3C 2012 This paper contains material provided by Bill Higgins from IBM, http://www.w3.org/TR/sparql11-update/ and several of the concepts discussed here come from our work in [16] R. Fielding and al. Hyper-text Transfer Protocol (HTTP/1.1), the OSLC. IETF RFC2616, 1999. http://tools.ietf.org/html/rfc2616 [17] Dbpedia entry for Barack Obama [19] W3C Linked Enterprise Data Patterns Workshop http://dbpedia.org/page/Barack_Obama http://www.w3.org/2011/09/LinkedData/ [18] Paul Biron, Ashok Malhotra. XML Schema Part 2: [20] Linked Data at W3C Datatypes, Second Edition, W3C, 2004 http://www.w3.org/standards/semanticweb/data http://www.w3.org/TR/xmlschema-2/
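
As a supplementary illustration of the paging pattern in section 9.4, the following is a minimal client-side sketch, not part of Basic Profile itself. It assumes a hypothetical fetch function that returns the triples of a page as plain (subject, predicate, object) string tuples; the bp: and o: namespace IRIs and the member URIs (a1, a2, a5) are placeholders invented for the example. The sketch starts at {url}?firstPage, reads bp:membershipSubject and bp:membershipPredicate from each page (falling back to the container itself and rdfs:member), and follows bp:nextPage until it reaches rdf:nil. Treating a page without bp:nextPage as the last page is an additional defensive assumption.

```python
from typing import Callable, Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) as plain strings

# Hypothetical placeholder namespaces; substitute the real bp: and o: IRIs.
BP = "http://example.org/ns/bp#"
O = "http://example.org/ns/o#"
RDFS_MEMBER = "http://www.w3.org/2000/01/rdf-schema#member"
RDF_NIL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"


def first_object(triples: List[Triple], subject: str, predicate: str,
                 default: str) -> str:
    """Return the object of the first matching triple, or default if none."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return default


def list_members(fetch: Callable[[str], List[Triple]], container: str) -> List[str]:
    """Collect member URIs page by page, following bp:nextPage until rdf:nil."""
    members: List[str] = []
    visited: Set[str] = set()
    page = container + "?firstPage"
    while page != RDF_NIL:
        if page in visited:  # code defensively: a server could link in a loop
            raise ValueError("paging loop at " + page)
        visited.add(page)
        triples = fetch(page)
        subject = first_object(triples, container, BP + "membershipSubject", container)
        predicate = first_object(triples, container, BP + "membershipPredicate", RDFS_MEMBER)
        members.extend(o for s, p, o in triples if s == subject and p == predicate)
        # Assumption: a page that carries no bp:nextPage triple is the last page.
        page = first_object(triples, page, BP + "nextPage", RDF_NIL)
    return members


# In-memory stand-in for a server; member URIs are invented for illustration.
CONTAINER = "http://example.org/netW/nw1/assetCont"
NET_WORTH = "http://example.org/netW/nw1"
HEADER: List[Triple] = [
    (CONTAINER, BP + "membershipSubject", NET_WORTH),
    (CONTAINER, BP + "membershipPredicate", O + "asset"),
]
PAGES: Dict[str, List[Triple]] = {
    CONTAINER + "?firstPage": HEADER + [
        (CONTAINER + "?firstPage", BP + "nextPage", CONTAINER + "?p=2"),
        (NET_WORTH, O + "asset", CONTAINER + "/a1"),
        (NET_WORTH, O + "asset", CONTAINER + "/a2"),
    ],
    CONTAINER + "?p=2": HEADER + [
        (CONTAINER + "?p=2", BP + "nextPage", RDF_NIL),
        (NET_WORTH, O + "asset", CONTAINER + "/a5"),
    ],
}

if __name__ == "__main__":
    for member in list_members(PAGES.__getitem__, CONTAINER):
        print(member)
```

A real client would replace the dictionary lookup with an HTTP GET plus an RDF parser, and would also need to handle a 303 redirect from the container URL to the first page. The loop check reflects the defensive-coding advice given for clients that traverse links.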