=Paper= {{Paper |id=None |storemode=property |title=OWL: Yet to arrive on the Web of Data? |pdfUrl=https://ceur-ws.org/Vol-937/ldow2012-paper-16.pdf |volume=Vol-937 |dblpUrl=https://dblp.org/rec/conf/www/GlimmHKP12 }} ==OWL: Yet to arrive on the Web of Data?== https://ceur-ws.org/Vol-937/ldow2012-paper-16.pdf
                        OWL: Yet to arrive on the Web of Data?

Birte Glimm, Ulm University, Institute of Artificial Intelligence, 89069 Ulm, Germany
Aidan Hogan, Digital Enterprise Research Institute, National University of Ireland Galway, Ireland
Markus Krötzsch, University of Oxford, Department of Computer Science, OX1 3QD Oxford, United Kingdom
Axel Polleres, Siemens AG Österreich, Siemensstrasse 90, 1210 Vienna, Austria



ABSTRACT
Seven years on from OWL becoming a W3C recommendation, and two years on from the more recent OWL 2 W3C recommendation, OWL has still experienced only patchy uptake on the Web. Although certain OWL features (like owl:sameAs) are very popular, other features of OWL are largely neglected by publishers in the Linked Data world. This may suggest that despite the promise of easy implementations and the proposal of tractable profiles suggested in OWL’s second version, there is still no “right” standard fragment for the Linked Data community. In this paper, we (1) analyse uptake of OWL on the Web of Data, (2) gain insights into the OWL fragment that is actually used/usable on the Web, where we arrive at the conclusion that this fragment is likely to be a simplified profile based on OWL RL, (3) propose and discuss such a new fragment, which we call OWL LD (for Linked Data).

1.     INTRODUCTION
   Under the initial impetus of the Linking Open Data project – and guided by the Linked Data principles [3] and associated best-practices – a rich vein of openly-available structured data has been published on the Web using Semantic Web standards. Publishing RDF on the Web is no longer confined to academia and hobbyists: the current “Web of Data” now features exports from various corporate and commercial bodies (e.g., BBC, New York Times, BestBuy), online communities (e.g., Freebase, identi.ca), life-science corpora (e.g., DrugBank, Linked Clinical Trials) and governmental bodies (e.g., data.gov, data.gov.uk). The “Linked Open Data cloud” now depicts 295 interlinked datasets, which together consist of an estimated 31.6 billion RDF triples.¹
   Although RDF provides standard syntaxes and a common data-model for disseminating structured information, it offers very little when it comes to giving semantics to the published data. RDF Schema (RDFS) and OWL were developed to address this by providing a vocabulary for describing schema data. The special vocabulary terms of RDFS and OWL – such as rdfs:subClassOf or owl:FunctionalProperty – have a well-defined semantics, which can be used to derive implicit consequences from the data.
   In terms of publishing, parts of the RDFS and OWL standards have been adopted on the Web of Data. Linked Data literature recommends use of owl:sameAs relations between two URIs that refer to the same resource [18, § 2.5.2]. Further, Linked Data guidelines recommend use of RDFS [18, § 4.4.2] for defining and interlinking vocabularies. Regarding OWL, guidelines explicitly recommend use of owl:equivalentClass, owl:equivalentProperty, owl:InverseFunctionalProperty & owl:inverseOf [18, § 4.4.2]. However, other OWL features are not concretely mentioned.
   In terms of standards, RDFS and OWL 1 pre-date the Linked Data movement and are not directly tailored towards Linked Data requirements. Although the informative entailment rules for supporting RDFS inferences are relatively straightforward, things like the infinitely many entailed axiomatic triples reduce their practicality [27]. In OWL 1 the situation is more complex: OWL 1 Full further extends the RDFS semantics to the extent that reasoning becomes undecidable. In OWL 1 DL and OWL 1 Lite, where the semantics are based on Description Logics, typical reasoning tasks remain decidable, but are of exponential or harder worst-case complexity. OWL 2 addresses the complexity issue by defining profiles [6]: fragments for which at least some reasoning tasks are tractable. Reasoning with inconsistent data is, however, still problematic in any OWL fragment. Further, each profile is a syntactic subset of OWL DL such that RDF data must adhere to certain non-trivial conditions which are commonly not followed in Web ontologies [2, 37, 7]. However, OWL RL includes a ruleset called OWL RL/RDF, which is applicable over arbitrary RDF data.
   Although the OWL RL profile is implementable using straightforward rule-based technologies, (as we show) the profile still includes many features with sparse uptake in Linked Data publishing. Which features are prominently used is, however, unclear. Taking this cue, we herein survey a broad spectrum of RDF Web data, looking at the uptake of individual RDFS and OWL features used therein, including datatypes. We further analyse to what extent OWL features are supported by tools that provide the technical infrastructure for building complex Semantic Web applications.
   Our analysis suggests that a much simpler profile of OWL might be better targeted towards the current needs of the Linked Data community. We thus propose OWL LD (for Linked Data) as a subset of the OWL RL profile, using the insights of our survey to make an informed decision as to which features of the RDFS and OWL standards should be included in the profile.
   The remainder of the paper is structured as follows: In the next section, we introduce some preliminaries. In Section 3, we present our survey of the use of RDFS and OWL features on the Web, including a survey of datatypes. In Section 4, we analyse the tool support for RDFS and OWL. Drawing upon our observations, we propose and define the OWL LD profile in Section 5, and discuss formal aspects of reasoning over the profile in Section 6. Next, in Section 7, we give a synopsis of related work for empirical analyses of RDFS and OWL data on the Web. We conclude in Section 8.

¹ http://www4.wiwiss.fu-berlin.de/lodcloud/state/
Acknowledgements. This work has been funded in part by Science Foundation Ireland under Grant No. SFI/08/CE/I1380 (Líon-2) and by an IRCSET postgraduate grant.
LDOW2012 April 16, 2012 Lyon, France
Copyright held by the author(s)/owner(s).
2.     BACKGROUND
   We first recall some relevant features of RDF, RDFS, and OWL semantics and give a summary of the existing OWL profiles.

2.1     RDF Graphs and Their Semantics
   Given the set of URI references U, the set of blank nodes B, and the set of literals L, the set of RDF constants is denoted by C := U ∪ B ∪ L. We use CURIEs to denote URIs (e.g., owl:sameAs), where the prefixes used in this paper can be looked up, e.g., at http://prefix.cc/. We often use Turtle syntax; e.g., we may use a as a shortcut for rdf:type. Finally, V denotes the set of RDF variables ranging over C and we prefix variables with ‘?’.
   An RDF triple (s, p, o) is a triple from the set of all RDF triples 𝒢 := (U ∪ B) × U × C, where s is called subject, p predicate, and o object. We call a finite set of triples G ⊂ 𝒢 an RDF graph.
   Semantically, RDF graphs can be interpreted in a number of ways based on various W3C recommendations. The simple semantics [17] considers only the graph structure of RDF, whereas more elaborate semantics such as RDFS entailment [17] or the OWL 2 Direct and RDF-Based Semantics (see below) provide special meanings for certain terms.
   The common basis for all such semantics is that they are specified in terms of model theory: one defines interpretations together with necessary and sufficient conditions that specify when an interpretation satisfies a graph. When defining a semantics E (such as RDF, RDFS, etc.) one often speaks of E-interpretations and E-satisfaction. The E-interpretations that E-satisfy a graph G are called the E-models of G. Semantic entailment follows from this notion: a graph G E-entails a graph G′, written G ⊨_E G′, if and only if every E-model of G is also an E-model of G′.

2.2     OWL and its Fragments
   OWL 2 is an ontology language that provides advanced schema modelling capabilities that can be used together with RDF data. OWL 2 supersedes the earlier specification “OWL 1” by introducing new modelling features, additional serialisations, updated conformance conditions and various corrections. When omitting the version number, we thus mean the current OWL 2 standard.
   Every RDF graph can be considered as an OWL ontology and the language of all RDF documents is called OWL Full to emphasise that all such graphs should be viewed as ontologies. In applications, however, OWL ontologies are usually viewed as being composed of axioms that can be more complex than single triples. For example, the triple ex:a owl:sameAs ex:b . corresponds to the OWL axiom SameIndividual(ex:a ex:b) whereas the axiom

        ObjectPropertyRange(skos:member
            ObjectUnionOf(skos:Concept skos:Container))                  (1)

expands to the six RDF triples

        skos:member rdfs:range _:x .       _:x owl:unionOf _:x1 .
        _:x1 rdf:first skos:Concept .      _:x1 rdf:rest _:x2 .          (2)
        _:x2 rdf:first skos:Container .    _:x2 rdf:rest rdf:nil .

Various conditions must be imposed on RDF graphs to ensure that they are in one-to-one correspondence to a collection of OWL axioms. A syntactic subset of OWL Full for which this is possible is OWL DL, which also imposes further restrictions that are useful for computing semantic conclusions from the ontology [26]. Drawing such conclusions can still be computationally expensive. Hence, OWL further defines three syntactically restricted sub-languages (profiles) of OWL DL called OWL EL, OWL RL and OWL QL [6] (see also Table 2 later for a brief feature comparison). OWL Full, OWL DL and the OWL profiles together constitute the five language fragments of OWL. The essential features of RDF Schema (sub-classes and -properties, domain, range) are covered by all fragments, but only OWL Full supports arbitrary RDF documents.
   Various sub-languages of OWL have also been proposed outside of the official standard. The current profiles have themselves been inspired by existing approaches: EL++ for OWL EL [23], DL-Lite [5] for OWL QL, and Description Logic Programs (DLP) [13] and pD* [34] for OWL RL. Generally, these approaches aim to maximise expressivity under some design principles. DLP is defined as a syntactic fragment of OWL. Other languages – including pD* – came about by extending RDFS with additional features. Allemang and Hendler proposed RDFS-Plus based on an informal survey of practitioners and three criteria felt important for adoption: pedagogism (intuitive and easy to learn), practicality (real use-cases in modelling), and computational feasibility (not too hard to implement) [1]. This language was later extended to RDFS 3.0 along similar principles [19]. Fisher et al. propose a similar profile called L2, where the feature selection is made on an ad-hoc basis [11]. (Table 2 later summarises the main features of these languages.)

2.3    OWL Semantics and Reasoning
   OWL ontologies can be interpreted under two different semantics that agree in important cases: the RDF-Based Semantics (RS) [17] and the Direct Semantics (DS) [25]. Like in RDF(S), the semantics are defined by specifying a model theory, i.e., by defining valid interpretations for ontologies based on semantic conditions. In RS, these models are based on the representation of OWL axioms as RDF graphs and thus can be viewed as a refined form of RDF interpretation. In DS, models are directly defined based on the structure of OWL axioms in the conceptual framework of Description Logics (which in turn is based on first-order logic). Due to this, DS is only defined for ontologies that belong to the OWL DL language (or to any of its profiles) while RS can also be used on OWL Full. Besides this restriction, OWL language fragments are not tied to either semantics, leaving nine valid combinations of syntactic fragments and formal semantics [33].
   RS is arguably more robust since it is defined for any RDF graph while DS only works for ontologies in OWL DL. However, RS entailment (of derived facts) is undecidable: implementations can only compute a subset of the conclusions that the semantics specifies. In contrast, there are complete implementations for computing entailments under DS, albeit with a high (super-exponential) worst-case complexity if all of OWL DL is to be covered. When further restricting to the OWL profiles, entailment checking under DS can be done in polynomial time. For RS, it is not known in general if the entailment problem becomes simpler in these cases. It is known, however, that RS and DS yield the same entailments on OWL RL under certain additional conditions, leading to a partial tractability result for RS for this case [6]. Similar results could be obtained in other cases since DS reasoning algorithms can often be modified to obtain correct (though often incomplete) RS reasoners.
   DS reasoning in all of the OWL profiles and significant parts of OWL DL can be implemented using rules in a forward-chaining manner. For OWL RL, an algorithm is suggested in the specification [6], while other works have covered OWL EL [23] and parts of OWL DL that also cover OWL QL [32]. For OWL QL, query rewriting is a more common reasoning technique [5, 30]. There are many different reasoning techniques for OWL DL under DS, though not all of them lead to polynomial algorithms when applied to the OWL profiles. Two (necessarily incomplete) reasoning methods are known for RS: algorithms based on sets of derivation rules like the ones for OWL RL and an approach based on using first-order theorem provers [31].
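
The expansion of axiom (1) into the six triples of (2) in Section 2.2 can be illustrated with a small sketch. It is only an illustration under stated assumptions: the function names `rdf_list` and `range_union_to_triples` and the blank-node labels `_:b1`, `_:b2`, ... are hypothetical and not part of the paper or of any library, and the code reproduces exactly the triples listed in (2) rather than a complete OWL-to-RDF mapping.

```python
from itertools import count

RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"


def rdf_list(items, fresh):
    """Encode `items` as an RDF collection; return (head node, triples)."""
    if not items:
        return RDF_NIL, []
    nodes = [fresh() for _ in items]  # one blank node per list cell
    triples = []
    for i, (node, item) in enumerate(zip(nodes, items)):
        rest = nodes[i + 1] if i + 1 < len(nodes) else RDF_NIL
        triples.append((node, RDF_FIRST, item))
        triples.append((node, RDF_REST, rest))
    return nodes[0], triples


def range_union_to_triples(prop, classes):
    """Triples for ObjectPropertyRange(prop ObjectUnionOf(classes...)),
    following the listing in (2). Blank-node labels are illustrative."""
    counter = count(1)

    def fresh():
        return f"_:b{next(counter)}"

    union = fresh()  # blank node standing for the union class
    head, list_triples = rdf_list(classes, fresh)
    return [(prop, "rdfs:range", union),
            (union, "owl:unionOf", head)] + list_triples


if __name__ == "__main__":
    for s, p, o in range_union_to_triples(
            "skos:member", ["skos:Concept", "skos:Container"]):
        print(s, p, o, ".")
```

Each additional class in the union adds one list cell, i.e., two further triples, which is why such axioms cannot be written as a single triple.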
3.    SURVEY OF RDFS & OWL ADOPTION ON THE WEB OF DATA
   We now present an empirical survey of RDFS & OWL adoption on the Web of Data. Our survey is conducted over the Billion Triple Challenge 2011 corpus, which consists of 2.145 billion quadruples crawled from 7.411 million RDF/XML documents through an open crawl run in May/June 2011 spanning 791 pay-level domains. (A pay-level domain is a direct sub-domain of a top-level domain (TLD) or a second-level country domain (ccSLD), e.g., dbpedia.org, bbc.co.uk. This gives us our notion of “domain”.) This corpus represents a broad sample of the Web of Data.

3.1    Measures Used
   In order to adequately characterise the uptake of various RDF(S) and OWL features used in this corpus, we present different measures to quantify their prevalence and prominence.
   First, we look at the prevalence of use of different features, i.e., how often they are used. Here, we must take into account the diversity of the data under analysis, where few domains account for a great many triples and many domains account for few triples, where certain domains tend to publish many small documents and others publish few large documents, and so forth [20]. We thus present three statistics: (1) number of axioms using the feature [Ax], (2) number of documents [Doc] and (3) number of domains [Dom].
   However, raw counts do not reflect that the use of an OWL feature in one important ontology may often have greater practical impact than use in a thousand obscure documents. Thus, we also look at the prominence of use of different features. We use PageRank to quantify our notion of prominence: PageRank calculates a variant of the Eigenvector centrality of nodes (e.g., documents) in a graph, where taking the intuition of directed links as “positive votes”, the resulting scores help characterise the relative prominence (i.e., centrality) of particular documents on the Web [28, 15].
   In particular, we first rank documents in the corpus. To construct the graph, we consider RDF documents as nodes, where a directed edge (d1, d2) is extended from document d1 to d2 iff d1 hosts RDF data that contains (in any triple position) a URI that dereferences to document d2. This notion of dereferenceable links is core to Linked Data principles [3]. Note also that we follow redirects when checking dereferenceability. We then apply a standard PageRank analysis over the resulting directed graph, using the power iteration method with ten iterations. For reasons of space, we refer the interested reader to [28] for more detail on PageRank, and [20] for more detail on the particular algorithms used for this paper.
   Given these rank scores, for each of the different RDF(S) and OWL features we then present (1) the sum of PageRank scores for documents in which the feature is used [Σ Rank]; (2) the PageRank score of the highest-ranked document in which it appears [max Rank]; (3) the position of that document in the PageRank ordering of the 7.411 million documents [max Pos].
   In terms of intuition under the random surfer model of PageRank [28], given an agent starting from a random location and traversing documents on (our sample of) the Web of Data through randomly selected dereferenceable URIs, the Σ Rank value for a feature approximately indicates the probability with which that agent will be at a document using that feature after traversing ten links. In other words, the score indicates the likelihood of an agent, operating over the Web of Data based on dereferenceable principles, to encounter a given feature.
   The graph extracted from the corpus consists of 7.411 million nodes and 198.6 million edges. Table 1 presents the top-10 ranked documents in our corpus, which are dominated by core meta-vocabularies, documents linked therefrom, and other popular vocabularies; we also present the ranks of other notable documents mentioned in the following section.²

  №   Document URI                                                 Rank
   1  http://www.w3.org/1999/02/22-rdf-syntax-ns                   0.121
   2  http://www.w3.org/2000/01/rdf-schema                         0.110
   3  http://dublincore.org/2010/10/11/dcelements.rdf              0.096
   4  http://www.w3.org/2002/07/owl                                0.078
   5  http://www.w3.org/2000/01/rdf-schema-more                    0.049
   6  http://dublincore.org/2010/10/11/dcterms.rdf                 0.036
   7  http://www.w3.org/2009/08/skos-reference/skos.rdf            0.026
   8  http://xmlns.com/foaf/spec/                                  0.023
   9  http://dublincore.org/DCMI.rdf                               0.021
  10  http://www.w3.org/2003/g/data-view                           0.017
  14  http://id.loc.gov/authorities/sh98002267                     4.01E-3
  30  http://motools.sourceforge.net/doc/musicontology.rdfs        2.38E-3
  38  http://www.w3.org/.../wn20/schemas/wnfull.rdfs               7.79E-4
  43  http://vivoweb.org/files/vivo-core-public-1.2.owl            6.11E-4
  87  http://www.w3.org/2006/time                                  2.07E-4
 116  http://rdf.geospecies.org/ont/geospecies                     1.22E-4
 129  http://motools.sourceforge.net/timeline/timeline.rdf         1.06E-4
 159  http://vocab.org/bio/0.1/termgroup2.rdf                      8.11E-5
 259  http://www.ordnancesurvey.co.uk/.../geometry.owl             4.39E-5
 289  http://www.ordnancesurvey.co.uk/.../admingeo.owl             4.01E-5
 990  http://www.ordnancesurvey.co.uk/.../spatialrelations.owl     1.24E-5

Table 1: Top ten ranked documents and notable ranks (position < 1,000) mentioned later in Table 2

3.2     Survey of RDF(S)/OWL Features
   Table 2 presents the results of the survey of RDF(S) and OWL usage in our corpus, where for features with non-trivial semantics, we present the measures mentioned in the previous section, as well as support for the features in the different reasoning profiles discussed in Section 2.2. We exclude rdf:type, which appeared in 90.3% of documents. We present the table ordered by the sum of PageRank measure [Σ Rank]; recall that Table 1 provides a legend for notable documents (Pos < 1,000).
   In column ‘ST’, we indicate which features have expressions that can be represented as a single triple in RDF, i.e., which features do not require auxiliary blank nodes of the form _:x or the SEQ production in Table 1 of the OWL 2 Mapping to RDF document [29]. This distinction is motivated by our initial observations that such features are typically the most widely used in Web data.
   Figure 1 gives a visual overview of the Σ Rank measure for the listed features (log scale), where different shades of grey are used to indicate to which vocabulary a term belongs (e.g., distinguishing the terms new in OWL 2 from the ones already in OWL 1).
   Regarding prevalence, we see from Table 2 that owl:sameAs is the most widely used axiom in terms of documents (1.778 million; 24%) and domains (117; 14.8%). Surprisingly (to us), RDF container membership properties (rdf:_*) are also heavily used (likely attributable to RSS 1.0 documents). Regarding prominence, we make the following observations:
    (1) The top six features are those that form the core of RDFS [27].
    (2) The RDF(S) declaration classes rdfs:Class and rdf:Property are used in fewer, but more prominent documents than OWL’s versions owl:Class, owl:DatatypeProperty and owl:ObjectProperty.
    (3) The top eighteen features are expressible with a single RDF triple. The highest ranked primitive for which this is not the case is owl:unionOf in nineteenth position, which requires use of RDF collections (i.e., lists). Union classes are often specified as the do-

² We limit the results to those presented for space reasons. We ran another similar analysis with links to and from core RDF(S) and OWL vocabularies disabled. The results for the feature analysis remained similar; the main change was that owl:sameAs dropped several positions in terms of the sum of PageRank.
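
The document ranking and the three prominence measures of Section 3.1 can be sketched roughly as follows. This is a minimal sketch, not the implementation used for the survey (for which the text refers to [20]): the function names `pagerank` and `feature_measures` are hypothetical, and the damping factor of 0.85 and the uniform redistribution of rank held by documents without outgoing links are assumptions that the text does not state. Only the use of power iteration with ten iterations is taken from the text.

```python
def pagerank(links, iterations=10, damping=0.85):
    """Power-iteration PageRank over a document graph.

    links: dict mapping a document to the set of documents it links to
           (an edge d1 -> d2 means d1 mentions a URI dereferencing to d2).
    damping=0.85 is an assumed value, not taken from the paper.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {d: 1.0 / n for d in nodes}  # uniform start vector
    for _ in range(iterations):
        # Assumption: rank of documents without outlinks is spread uniformly.
        dangling = sum(rank[d] for d in nodes if not links.get(d))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {d: base for d in nodes}
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        rank = new_rank
    return rank


def feature_measures(rank, docs_using_feature):
    """Return the sum-of-rank, max-rank and max-position measures for one
    feature, given the (non-empty) set of documents that use it."""
    ordering = sorted(rank, key=rank.get, reverse=True)
    position = {d: i + 1 for i, d in enumerate(ordering)}
    best = max(docs_using_feature, key=rank.get)
    return {
        "sum_rank": sum(rank[d] for d in docs_using_feature),
        "max_rank": rank[best],
        "max_pos": position[best],
    }


if __name__ == "__main__":
    # Toy graph with made-up document names.
    links = {"a": {"b"}, "b": {"a"}, "c": {"a"}}
    rank = pagerank(links)
    print(feature_measures(rank, {"a", "c"}))
```

Because the scores form a probability distribution over documents, the sum for a feature can be read as the chance of landing on a document that uses it, which matches the random-surfer intuition given above.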
                                                                                                                        RDFS+
                                                                                                                 RDFS



                                                                                                                                DLP
                                                                                                                                      pD*


                                                                                                                                                 QL
                                                                                                                                                      RL
                                                                                                                                            EL
                                                                                                                        L2
              №
                                                    P
                             Primitive                 Rank max Rank max Pos           Ax           Doc     Dom                                    ST
               1 rdf:Property                       5.74E-1       1.21E-1         1    17,509        8,049 48 X - - - X -               -    -     X
               2 rdfs:range                         4.67E-1       1.21E-1         1    51,540       44,492 89 X X X X X X X X X
               3 rdfs:domain                        4.62E-1       1.21E-1         1    97,288       43,247 89 X X X X X X X X X
               4 rdfs:subClassOf                    4.60E-1       1.21E-1         1 1,164,620 115,608 109 X X X X X X X X X
               5 rdfs:Class                         4.45E-1       1.21E-1         1    39,606       19,904 43 X - - X X -               -    -     X
               6 rdfs:subPropertyOf                 2.35E-1       1.10E-1         2    11,490        6,080 80 X X X X X X X X X
               7 owl:Class                          1.74E-1       7.80E-2         4   255,002 302,701 111 - - X X X X X X X
               8 owl:ObjectProperty                 1.68E-1       7.80E-2         4    35,065 285,412 92 - - X X - X X X X
               9 rdfs:Datatype                      1.68E-1       1.21E-1         1        31           23    9 X∗ - - X∗ X∗ X∗ X∗ X∗ X∗
              10 owl:DatatypeProperty               1.65E-1       7.80E-2         4    23,888 234,483 82 - - X X - X X X X
              11 owl:AnnotationProperty             1.60E-1       7.80E-2         4       216 172,290 55 - - - X - X X X X
              12 owl:FunctionalProperty             9.18E-2       2.63E-2         7     3,222          298 34 - - X X X -               - X X
              13 owl:equivalentProperty             8.54E-2       3.57E-2         6       168          141 23 - X X X X X X X X
              14 owl:inverseOf                      7.91E-2       2.63E-2         7     1,160          366 43 - X X X X - X X X
              15 owl:disjointWith                   7.65E-2       2.63E-2         7     3,266          230 27 - - - X - X X X X
              16 owl:sameAs                         7.29E-2       4.01E-3        14 3,450,554 1,778,208 117 - X X X X X - X X
              17 owl:equivalentClass                5.24E-2       2.32E-2         8    25,827       22,291 39 - X X X X X X X X
              18 owl:InverseFunctionalProperty 4.79E-2            2.32E-2         8        75          111 24 - - X X∗ X -              - X X
              19 owl:unionOf                        3.15E-2       2.63E-2         7    46,721       15,162 30 - - - X∗ -            -   - X∗ -
              20 owl:SymmetricProperty              3.13E-2       2.63E-2         7       175          120 23 - X X X X - X X X
              21 owl:TransitiveProperty             2.98E-2       2.63E-2         7       223          150 30 - X X X X X - X X
              22 owl:someValuesFrom                 2.13E-2       1.65E-2        10     3,854        1,753 15 - - - X∗ X∗ X X∗ X∗ -
              23 rdf:_*                             1.42E-2       8.11E-5       159 7,791,545 293,022 62 X - - - X -                    -    -      -
              24 owl:allValuesFrom                  2.98E-3       7.79E-4        38   108,989       29,084 20 - - - X∗ X∗ -             - X∗ -
              25 owl:minCardinality                 2.43E-3       6.11E-4        43   395,841       33,309 19 - - - X∗ -            -   -    -      -
              26 owl:maxCardinality                 2.14E-3       6.11E-4        43   223,994       10,413 24 - - - X∗ -            -   - X∗ -
              27 owl:cardinality                    1.75E-3       7.79E-4        38    20,781        3,170 24 - - - X∗ -            -   -    -      -
              28 owl:oneOf                          4.13E-4       2.07E-4        87       736           74 11 - - - X∗ - X∗ - X∗ -
              29 owl:hasValue                       3.91E-4       2.07E-4        87     1,624           55 14 - - - X∗ X X - X                      -
              30 owl:intersectionOf                 3.37E-4       1.06E-4       129     2,324          186 13 - - - X - X X∗ X                      -
              31 owl:NamedIndividual (2)            1.63E-4       1.22E-4       116       205            3    2 - - - -        - X X X X
              32 owl:AllDifferent                   1.55E-4       1.22E-4       116        87           21    8 - - - -        - X - X              -
              33 owl:propertyChainAxiom (2)         1.23E-4       4.01E-5       289        52           14    6 - - - -        - X - X              -
              34 owl:onDataRange                    8.41E-5       4.39E-5       259        89            3    1 - - - -        -    -   -    -      -
              35 owl:minQualifiedCardinality (2) 8.40E-5          4.39E-5       259         7            2    1 - - - -        -    -   -    -      -
              36 owl:qualifiedCardinality (2)       4.02E-5       4.01E-5       289        95            2    1 - - - -        -    -   -    -      -
              37 owl:AllDisjointClasses (2)         4.01E-5       4.01E-5       289         9            2    2 - - - -        - X X X              -
              38 owl:maxQualifiedCardinality (2) 4.01E-5          4.01E-5       289         1            1    1 - - - -        -    -   - X∗ -
              39 owl:ReflexiveProperty (2)          1.30E-5       1.24E-5       990         1            2    1 - - - -        - X X -             X
              40 owl:complementOf                   1.96E-6       6.28E-8 549,258         759           75    4 - - - X∗ -          - X∗ X∗ -
              41 owl:differentFrom                  7.18E-7       6.81E-8 486,354         691           25    7 - - - X - X - X X
              42 owl:onDatatype                     2.72E-7       2.72E-7    70,414         2            1    1 - - - -        -    -   -    -      -
              43 owl:disjointUnionOf                6.31E-8       4.28E-8 1,005,307         2            2    2 - - - -        -    -   -    -      -
              44 owl:hasKey (2)                     3.67E-8       3.67E-8 1,336,720         1            1    1 - - - -        - X - X              -
              45 owl:propertyDisjointWith (2)       2.43E-8       2.43E-8 3,911,874         4            1    1 - - - -        -    - X X X
                Not Used: rdfs:ContainerMembershipProperty, owl:AllDisjointProperties (2), owl:Annotation (2), owl:AsymmetricProperty (2),
                 owl:Axiom (2), owl:IrreflexiveProperty (2), owl:NegativePropertyAssertion (2), owl:datatypeComplementOf (2), owl:hasSelf (2)



Table 2: Survey of RDFS/OWL primitives used on the Web of Data and support in different tractable profiles where ∗ denotes that
the semantics is not fully axiomatised by the OWL RL/RDF rules or that usage of the term is restricted under OWL Direct Semantics


main or range of a given property: the most prominent such example is the SKOS vocabulary
(the seventh highest ranked document) which specifies the range of the skos:member property
as the union of skos:Concept and skos:Container as in (1) above.
    4 Of the features new to OWL 2, the most prominently used is owl:NamedIndividual in
thirty-first position. Our crawl was conducted nineteen months after OWL 2 became a W3C
Recommendation (Oct. 2009); by means of a quick scan of the max Pos column of Table 2, we
note that new OWL 2 features have had little penetration in prominent Web vocabularies
during that interim. Further, several OWL 2 features were not used at all in our corpus.
    5 owl:complementOf and owl:differentFrom are the least prominently used original OWL
features.
   In terms of profile support, we observe that RDFS has good catchment for a few of the
most prominent features, but otherwise has poor coverage. Aside from syntactic/declaration
features, from the top-20 features (which cover 98% of the total cumulative rank), L2 misses
functional properties (pos=12), disjoint classes (15), inverse-functional properties (18)
and union classes (19). RDFS-Plus omits support for disjoint (15) and union classes (19).
DLP, as defined by Volz [36, §A], has coverage of all such features, but does not support
inverse-functional (18) datatype properties. pD* does not support disjoint (15) or union
classes (19).
   Regarding the OWL profiles, OWL EL and OWL QL both omit support for important top-20
features. Neither includes functional (12) or inverse-functional properties (18), or union
classes (19). OWL EL further omits support for inverse (14) and symmetric properties (20).
OWL QL does not support the prevalent same-as (16) feature. Conversely, OWL RL has much
better coverage, albeit having only partial support for union classes (19).
   Summing up, we acknowledge that such a survey cannot give a universal or definitive
indication of the most important OWL features for Linked Data. First, we only survey a
limited sample of the Web of Data. Second, the future may (or may not) see radical changes
in how OWL is used on the Web; e.g., OWL 2 terms may soon enjoy more adoption. Still,
Table 2 offers insights into the extant trends of adoption and later informs the design of
a new OWL profile tailored for the current Web of Data.
      Figure 1: The sum of PageRank for each of the listed features from Table 2 shown in logarithmic scale on the vertical axis

3.3    Survey of Datatype Use
   We now look at the use of datatypes on the Web of Data.
   Aside from plain literals, the RDF semantics defines a single datatype supported under
RDF-entailment: rdf:XMLLiteral [17]. However, the RDF semantics also defines D-entailment,
which provides interpretations over a datatype map that gives a mapping from lexical
datatype strings into a value space. The datatype map may also impose disjointness
constraints within its value space. These interpretations allow for determining which
lexical strings are valid for a datatype, which different lexical strings refer to the same
value and which to different values, and which sets of datatype values are disjoint from
each other. An XSD-datatype map is then defined that extends the set of supported datatypes
into those defined for XML Schema (1.0), including types for boolean, numeric, temporal,
string and other forms of literals. Datatypes that are deemed to be ambiguously defined
(viz. xsd:duration) or specific to XML (e.g., xsd:QName), etc. are omitted.
   The original OWL specification recommends use of a similar set of datatypes to that for
D-entailment, where compliant reasoners are required to support xsd:string and xsd:integer.
Furthermore, OWL allows for defining enumerated datatypes.
   With the standardisation of OWL 2 came two new datatypes, namely owl:real and
owl:rational, along with novel support for xsd:dateTimeStamp. However, XSD datatypes
relating to date, time and Gregorian calendar values are not supported. OWL 2 also
introduced mechanisms for defining new datatypes by restricting facets of legacy datatypes;
however, from Table 2 we note that owl:onDatatype (used for facet restrictions) has only
very few occurrences in our corpus.
   Implementing the entire range of RDF, XSD and OWL datatypes can be costly [10], with
custom code (or an external library) required to support each one. Thus, it is interesting
to see which datatypes are most commonly used on the Web of Data.
   In our corpus, we found 278 different datatype URIs assigned to literals. Of these, 158
came from the DBpedia exporter which models SI units, currencies, etc., as datatypes. Using
analogous measures as before, Table 3 lists the top standard RDF(S), OWL and XSD datatypes
as used to type literals in our corpus. We omit max-rank statistics for brevity, and omit
plain literals which were used in 6.609 million documents (89%). D indicates the datatypes
supported by D-entailment with the recommended XSD datatype map. O2 indicates the datatypes
supported by OWL 2.

     №  Primitive                  Σ Rank        Lit        Doc  Dom  D  O2
      1 xsd:dateTime              4.18E-2  2,919,518  1,092,048   68  X  X
      2 xsd:boolean               2.37E-2     75,215     41,680   22  X  X
      3 xsd:integer               1.97E-2  1,015,235    716,904   41  X  X
      4 xsd:string                1.90E-2  1,629,224    475,397   76  X  X
      5 xsd:date                  1.82E-2    965,647    550,257   39  X  -
      6 xsd:long                  1.63E-2  1,143,351    357,723    6  X  X
      7 xsd:anyURI                1.61E-2  1,407,283    339,731   16  X  X
      8 xsd:int                   1.52E-2  2,061,837    400,448   31  X  X
      9 xsd:float                 9.09E-3    671,613    341,156   21  X  X
     10 xsd:gYear                 4.63E-3    212,887    159,510   12  X  -
     11 xsd:nonNegativeInteger    3.35E-3      9,230     10,926   26  X  X
     12 xsd:double                2.00E-3    137,908     68,682   31  X  X
     13 xsd:decimal               1.11E-3     43,747     13,179    9  X  X
     14 xsd:duration              6.99E-4     28,541     28,299    4  -  -
     15 xsd:gMonthDay             5.98E-4     34,492     20,886    3  X  -
     16 xsd:short                 5.71E-4     18,064     11,643    2  X  X
     17 rdf:XMLLiteral            4.97E-4      1,580        791   11  X  X
     18 xsd:gMonth                2.50E-4      2,250      1,132    3  X  -
     19 rdf:PlainLiteral          1.34E-4        109         19    2  -  X
     20 xsd:gYearMonth            8.49E-5      6,763      3,080    5  X  -
     21 xsd:positiveInteger       5.11E-5      1,423      1,890    2  X  X
     22 xsd:gDay                  4.26E-5      2,234      1,117    1  X  -
     23 xsd:token                 3.56E-5      2,900      1,450    1  X  X
     24 xsd:unsignedByte          2.62E-7         66         11    1  X  X
     25 xsd:byte                  2.60E-7         58         11    1  X  X
     26 xsd:time                  8.88E-8         23          4    3  X  -
     27 xsd:unsignedLong          6.71E-8          6          1    1  X  X
      - other xsd/owl dts. not used   n/a        n/a        n/a  n/a n/a n/a

Table 3: Survey of (std.) datatypes used on the Web of Data

   We observe from the table that the top four standard datatypes are supported by both the
traditional XSD datatype map and in OWL 2. However, OWL 2 does not support xsd:date (5)
which is prominently featured in our corpus, and does not support Gregorian datatypes
(10, 15, 18, 20, 22) nor xsd:time (26). Despite not being supported by any standard
entailment regime, xsd:duration (14) was used in 28 thousand documents across four
different domains.
   Conversely, various standard datatypes are not used at all in the data; e.g.,
xsd:dateTimeStamp, the “new” OWL datatypes, binary datatypes and various
normalised-string/token datatypes.
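   The role of a datatype map described in Section 3.3 (deciding which lexical strings are
valid, and which denote the same value) can be illustrated with a minimal sketch. It covers
only three XSD datatypes and ignores whitespace normalisation; the names DATATYPE_MAP,
value_of and same_value are our own illustrative choices and are not taken from any
specification or tool.

```python
import re
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"

# Simplified lexical spaces (assumption: no surrounding whitespace is allowed).
_INTEGER = re.compile(r"[+-]?[0-9]+")
_DECIMAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")
_BOOLEAN = {"true": True, "1": True, "false": False, "0": False}


def _boolean(lexical):
    if lexical not in _BOOLEAN:
        raise ValueError(lexical)
    # Tag the value with its value space: booleans are disjoint from numbers.
    return ("boolean", _BOOLEAN[lexical])


def _integer(lexical):
    if not _INTEGER.fullmatch(lexical):
        raise ValueError(lexical)
    # xsd:integer is derived from xsd:decimal, so both share one value space.
    return ("decimal", Decimal(int(lexical)))


def _decimal(lexical):
    if not _DECIMAL.fullmatch(lexical):
        raise ValueError(lexical)
    return ("decimal", Decimal(lexical))


# Hypothetical, deliberately tiny datatype map: datatype IRI -> lexical-to-value.
DATATYPE_MAP = {
    XSD + "boolean": _boolean,
    XSD + "integer": _integer,
    XSD + "decimal": _decimal,
}


def value_of(lexical, datatype):
    """Return the tagged value of a typed literal, or None if it is ill-typed.

    Raises KeyError for datatypes that this sketch does not support.
    """
    try:
        return DATATYPE_MAP[datatype](lexical)
    except ValueError:
        return None


def same_value(literal1, literal2):
    """True if two well-typed (lexical, datatype) literals denote the same value."""
    v1, v2 = value_of(*literal1), value_of(*literal2)
    return v1 is not None and v1 == v2
```

   Under these assumptions, ("01", xsd:integer) and ("1.0", xsd:decimal) denote the same
value, whereas ("1", xsd:boolean) and ("1", xsd:integer) do not, since their value spaces
are disjoint. Each further datatype needs its own parsing function, which reflects the
implementation cost noted above.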
4.    AVAILABLE TOOL SUPPORT
   Apart from understanding which OWL features are used in documents on the Web, it is also
crucial to understand what tool support is available. We therefore now survey the
availability of software that provides the necessary technical infrastructure for building
complex applications, i.e., databases, reasoners and libraries.
   As a baseline requirement, tools need to be able to read OWL documents and parse out
axioms. Conformance with the OWL standard actually requires support for the RDF/XML
serialisation as an input format [33]. Parsing OWL axioms from RDF triples is not an easy
task, and requires processing joins since axioms can be composed of several RDF triples
[29]. In addition, OWL axioms, such as owl:unionOf or owl:intersectionOf, may use
arbitrary-length RDF lists, which require particular attention to validate and parse. Other
features, such as type declarations and ontology imports, further complicate matters.
Consequently, there are few compliant, stand-alone libraries for parsing OWL (relative to
libraries for RDF). Aside from parsing, querying OWL axioms using the SPARQL standard is
also non-trivial, especially considering axioms using arbitrary-length lists.³
   Thus even before any actual reasoning takes place, multi-triple OWL axioms are
inconvenient to serialise, publish, parse and query using standard RDF tools. Conversely,
OWL axioms that are represented in a single RDF triple do not require the detection of
complex triple patterns and can easily be processed in a triple-at-a-time manner with the
RDF libraries and parsers that are available for many programming languages. The question
of whether a feature can be expressed in a single triple or not may thus already have
significant consequences for the practical cost of supporting it.
   Databases are another important class of tools for building RDF applications and
numerous commercial and non-commercial systems are available today. Many of these systems
evaluate OWL features to improve query answering services. Table 4 provides an overview of
such systems. We only include tools that have native support for at least rdfs:subClassOf
and rdfs:subPropertyOf reasoning (excluding, e.g., 5store), are developed for production
use (excluding prototypes such as YARS2 [16] and QueryPie [35]) and that are meant to be
used with large amounts of instance data (excluding OWL EL tools such as ELK [21]). The
table lists the most frequently implemented features explicitly and describes profile
support in a separate column. We additionally mention the main inference strategy and the
source of our information.⁴
   A number of tools support the (near-)complete OWL RL profile. Jena with the “OWL mini”
ruleset has an incomplete implementation of OWL (1) DL features that can be viewed as an
approximation of OWL RL. PelletDb and QuOnto are reasoning layers on top of a database with
support for OWL DL and OWL QL, respectively. DLEJena uses Pellet to perform TBox (schema)
reasoning, where the resulting entailments and the OWL RL/RDF rules are used to generate a
set of ABox (instance) rules, which are then executed using Jena’s RETE engine.
   Contrasting with these fairly powerful implementations, we find a number of tools that
support only a few selected semantic features, including some that only support a fragment
of RDFS.
   The reasoning algorithms that have been used are also important in practice. Forward
chaining (materialisation) often incurs significant penalties for data updates, although
there are approaches to alleviate this, e.g., AllegroGraph advertises “dynamic
materialisation” as a compromise. Backward chaining, in contrast, affects query answering
performance but allows for easier updates. In the case of OWL QL (and RDFS), backward
chaining can be performed using a form of query rewriting that depends only on schema
information, and thus is likely to scale well. The tableau approach of PelletDb, on the
other hand, is more demanding when used at query time but can support all features of
OWL DL.
   Summarising, among the listed systems, three systems work with the Direct Semantics of
OWL (PelletDb, DLEJena and QuOnto), whereas the other systems are rule-based and work
directly with RDF triples, usually via forward chaining. Thus, we conclude that an
implementation via rules and compatibility also with the RDF-Based Semantics is an
important criterion for comprehensive tool support. Surprisingly, only two thirds of the
tools support owl:sameAs, which is one of the most popular features according to our
survey. A possible explanation is that owl:sameAs blows up the size of the materialisation
when using forward chaining, so efficient support requires special optimisations, as, e.g.,
implemented in OWLIM or Oracle 11g [22]. Although four systems (nearly) support OWL RL, the
complexity of a fully compliant and efficient implementation is still considered high [22].
   Regarding datatypes, many triple stores use internal canonicalisation of typed literals,
but full datatype reasoning is only sparsely supported or documented; some tools such as
OWLIM explicitly do not support datatype rules of OWL RL. Datatype support in several tools
is, for example, surveyed by Emmons et al. [10].

³ Property paths in SPARQL 1.1 make the task somewhat easier, but checking that lists are
well-formed is still challenging.
⁴ We note that it is difficult to verify whether the tools indeed hold what they claim,
e.g., in practice one might find that the support is not as complete as advertised.
Nevertheless, we take each system’s description as an indication of available support.

5.     DEFINING THE OWL LD PROFILE
   In this section, we build upon our previous observations to suggest a simple OWL profile
that is adequate for the current needs of the Web. In the previous sections, we have
identified a number of key issues for OWL adoption on the Web:
     1. Adequacy: features that are widely used on the Web should be included.
     2. Implementability: features that are more challenging to process and reason with
        should be avoided.
     3. Robustness: noisy and unreliable data should not prevent the use of ontological
        data in reasoning.

   Comparing this to the design guidelines of RDFS-Plus [1], we can see that adequacy
relates to “practicality” while implementability subsumes “computational feasibility.” We
do not consider “pedagogism” as a design goal since we did not assess how intuitive
features are. In contrast, the work presented in Sections 3 and 4 provides us with a much
better understanding for assessing implementability and adequacy. Robustness has not been
considered as a design goal for RDFS-Plus while we find it to be of great importance for
making sense of Web data.
   Each of the above requirements leads to a number of concrete aspects. Adequacy has been
discussed in Section 3 based on a sample of published ontologies. Looking at Table 2, we
can see that many of the most frequently used features are of a simple structure. In fact,
owl:unionOf is the highest ranked feature that is not expressed by a single triple in RDF
serialisations of OWL.
   Implementability was discussed in Section 4. We observed that parsing, processing and
querying OWL axioms in the RDF-based syntaxes (RDF/XML, N-Triples or Turtle) using widely
available RDF-based tools is easier when all axioms can be mapped to a single triple in the
RDF data-model. Moreover, inferencing is more difficult for some features than for others,
even in rule-based approaches used commonly for OWL RL, e.g., support for list-based
(multi-triple) expressions that can be of arbitrary length [4].
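   To illustrate why list-based (multi-triple) expressions are harder to process than
single-triple axioms, the following minimal sketch walks an RDF list and checks that it is
well-formed. The graph is assumed to be a plain Python set of (subject, predicate, object)
string triples; the helper names objects and read_list are illustrative and do not
correspond to the API of any RDF library.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FIRST, REST, NIL = RDF + "first", RDF + "rest", RDF + "nil"


def objects(graph, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return [o for (s, p, o) in graph if s == subject and p == predicate]


def read_list(graph, head):
    """Return the members of the RDF list starting at head.

    Raises ValueError if a list node lacks exactly one rdf:first and one
    rdf:rest value, or if the list is cyclic.
    """
    members, seen, node = [], set(), head
    while node != NIL:
        if node in seen:
            raise ValueError("cyclic list at %s" % node)
        seen.add(node)
        firsts = objects(graph, node, FIRST)
        rests = objects(graph, node, REST)
        if len(firsts) != 1 or len(rests) != 1:
            raise ValueError("malformed list node %s" % node)
        members.append(firsts[0])
        node = rests[0]
    return members
```

   Reading the operands of an owl:unionOf axiom thus needs one join per list element plus
the well-formedness checks, whereas a single-triple axiom such as rdfs:subClassOf is
available from one triple lookup.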
                   sC sP ran dom sA tra sym inv iFP               Profile           Algorithm                                   Source
PelletDb            X X   X   X   X X    X   X   X                OWL DL                       tableau   http://clarkparsia.com/pelletdb/
DLEJena             X X   X   X   X X    X   X   X                OWL RL    tableau, forward chaining    [24], http://lpis.csd.auth.gr/systems/DLEJena/
OWLIM               X X   X   X   X X    X   X   X              ∼ OWL RL             forward chaining    [4], http://www.ontotext.com/owlim
Oracle 11g          X X   X   X   X X    X   X   X                OWL RL             forward chaining    [22], http://tinyurl.com/oracle-sw
Jena OWL mini       X X   X   X   X X    X   X   X              ∼ OWL RL             forward chaining    http://openjena.org/inference/
Virtuoso            X X   -   -   X X    X   X   X                      —          backward chaining     http://virtuoso.openlinksw.com/rdf-quad-store/
AllegroGraph        X X   X   X   X X     -  X    -                     —            forward chaining    http://tinyurl.com/agraph-doc
QuOnto              X X   X   X   -  -   X   X   X                OWL QL               query rewriting   http://www.dis.uniroma1.it/quonto/
Jena RDFS           X X   X   X   -  -    -  -    -                     —            forward chaining    http://openjena.org/inference/
Sesame RDFS Sail    X X   X   X   -  -    -  -    -                     —            forward chaining    http://www.openrdf.org/
4store with 4rs     X X   X   X   -  -    -  -    -                     —              query rewriting   http://4sreasoner.ecs.soton.ac.uk/



Table 4: RDF database systems with reasoning support (sC: rdfs:subClassOf; sP: rdfs:subPropertyOf; ran: rdfs:range;
dom: rdfs:domain; sA: owl:sameAs; tra: owl:TransitiveProperty; sym: owl:SymmetricProperty; inv: owl:inverseOf; iFP:
owl:InverseFunctionalProperty)
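
   The query-rewriting strategy listed in Table 4 can be sketched for the simplest case,
rdfs:subClassOf. The sketch below is only an illustration under our own assumptions: the
schema is given as a set of (subclass, superclass) IRI pairs, and the function names are
hypothetical rather than taken from QuOnto or 4store. The rewriting inspects the schema
only, never the instance data.

```python
def subclasses_of(schema, cls):
    """Reflexive-transitive closure of rdfs:subClassOf below cls.

    schema is a set of (subclass, superclass) pairs.
    """
    result, stack = {cls}, [cls]
    while stack:
        current = stack.pop()
        for sub, sup in schema:
            if sup == current and sub not in result:
                result.add(sub)
                stack.append(sub)
    return result


def rewrite_type_query(schema, cls):
    """Rewrite 'find all instances of cls' into a SPARQL union over subclasses."""
    branches = ["{ ?x a <%s> }" % c for c in sorted(subclasses_of(schema, cls))]
    return "SELECT DISTINCT ?x WHERE { " + " UNION ".join(branches) + " }"
```

   The rewritten query can be evaluated by a store without any reasoning support, so no
materialised triples need to be maintained when the instance data change.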


   Robustness requires a high tolerance against syntactic errors. The RDF-Based Semantics
has this feature and can always be applied, hence no special language design is needed.
However, it is also desirable to be able to apply the Direct Semantics to a fragment as it
yields stronger completeness guarantees for reasoning. Even if RDF-Based entailments are
desired, the completeness of DS reasoning methods can be used to obtain similar guarantees
for RS [6, Theorem PR1]. This kind of robustness can be accomplished by reducing the use of
features for which OWL DL imposes additional requirements, in particular cardinalities and
property chains.
   Another aspect of robustness is tolerance to inconsistencies. This feature is generally
available in OWL profiles that are not able to express truly disjunctive information. Due
to this, all inconsistencies are directly related to an individual or literal upon which
conflicting requirements are imposed (including the special case of ill-typed literal
values). Hence, it is easy to ignore (all elements involved in) inconsistencies and to
continue reasoning on the remaining consistent ontology to derive meaningful conclusions.
Any OWL profile (or subset thereof) has this feature.
   From these observations, we derive that it is a reasonable design guideline for an OWL
profile to restrict to OWL axioms that are in OWL RL and at the same time are expressed as
single RDF triples. This directly addresses implementability based on the above
observations together with the fact that OWL RL is now widely implemented (cf. Table 4).
Adequacy is addressed since the most important features identified above are both in RL and
expressed in single triples. Note that the coverage of additional, rarely used features
like reflexive properties is not a concern from the viewpoint of adequacy (which asks for
coverage, not for exclusivity) and is not difficult to implement in the restricted fragment
either.
   Robustness for interpretation in DS (i.e., as a subset of OWL DL) is eased by the
omission of property chains and (most) cardinalities (note that functionality remains).
Single-triple axioms are also less prone to syntactic errors when represented in RDF.
However, other restrictions of OWL DL regarding the need for declarations, the
non-existence of inverse functional data properties, and the restrictions on blank nodes
are still relevant. We suggest developing canonical (and thus predictable) repair
strategies for addressing these issues; specifying this is left to future work.
Importantly, robustness suggests that, similarly to OWL RL, arbitrary RDF graphs should be
allowed when using RS for reasoning. To reconcile these issues, we first define a syntactic
OWL LD profile as a subset of OWL RL (which in turn imposes the syntactic restrictions of
OWL DL) and we then suggest an RS-based extension of this profile for reasoning with
arbitrary OWL Full ontologies.
   Crucially, if an ontology uses features (such as owl:unionOf) that do not fall under the
remit of OWL LD, an RS-based reasoner can still apply entailment over the remainder of
supported features. OWL LD is not intended to restrict vocabulary publishers in what
features they use (unless, of course, they are interested in the benefits of DS-based
reasoning). Instead, the terse OWL LD profile enables developers and researchers to focus
directly on the intersection of features that are (i) the most prominently used in Linked
Data, (ii) the most robust, and (iii) the easiest to implement.
   Formally, we define OWL LD by restricting the OWL RL grammar [6]. Roughly speaking, we
remove all definitions and mentions of productions listed as follows:

Datatype entailment:
  DataRange, DataIntersectionOf, DatatypeDefinition
Boolean connectives & enums.:
  *OneOf, *IntersectionOf, *UnionOf, *ComplementOf
Restriction classes:
  *ValuesFrom, *HasValue, zeroOrOne, *Cardinality
Chains & keys:
  propertyExpressionChain, HasKey
Negative property assertions:
  sourceIndividual, target*, Negative*PropertyAssertion

   We further restrict the productions for DifferentIndividuals and Disjoint* to not use
the list-based syntaxes. The full grammar can be found online [12]. All additional
structural restrictions of OWL DL are inherited from OWL RL. Note that all RL datatypes are
supported as well, though implementers may use our study in Section 3 to select the most
relevant datatypes to support (the OWL specification generally allows conforming tools to
answer entailment questions with Unknown if a used feature is not supported).
   Comparing OWL LD with earlier approaches, it is interesting to note that it can be
viewed as a natural extension of languages like L2, RDFS-Plus, RDFS 3.0 as discussed in
Sections 2 and 3. In particular, RDFS 3.0 is already close to OWL LD which mainly adds
further OWL 2 constructs from OWL RL while only omitting owl:AllDifferent as the list-based
variant of owl:differentFrom. This adds to our confidence that OWL LD is a natural OWL
profile that can be motivated from a number of perspectives.

6.      REASONING IN OWL LD
   OWL LD falls into a syntactic subset of OWL DL and can be processed by tools that
implement DS entailment checking. On the other hand, we can also restrict the OWL RL/RDF
rules to obtain a terse set of inference rules that yields sound but possibly incomplete
entailment under RS; the full set is found in Table 5 at the end of the paper. These rules
are applicable to any RDF graph allowing us to robustly draw sound conclusions from Web
data.
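   How such a rule set can be applied to an arbitrary RDF graph is sketched below as a
naive forward-chaining loop. The five rules are an illustrative selection of OWL RL/RDF
rules whose bodies contain only single-triple axioms; Table 5 remains the authoritative
list for OWL LD. Terms are written as prefixed-name strings for brevity, variables start
with "?", and the function names are our own assumptions.

```python
# Each rule is (body, head): lists of triple patterns; "?x" marks a variable.
RULES = [
    # cax-sco
    ([("?c1", "rdfs:subClassOf", "?c2"), ("?x", "rdf:type", "?c1")],
     [("?x", "rdf:type", "?c2")]),
    # prp-dom
    ([("?p", "rdfs:domain", "?c"), ("?x", "?p", "?y")],
     [("?x", "rdf:type", "?c")]),
    # prp-symp
    ([("?p", "rdf:type", "owl:SymmetricProperty"), ("?x", "?p", "?y")],
     [("?y", "?p", "?x")]),
    # prp-trp
    ([("?p", "rdf:type", "owl:TransitiveProperty"),
      ("?x", "?p", "?y"), ("?y", "?p", "?z")],
     [("?x", "?p", "?z")]),
    # prp-inv1
    ([("?p1", "owl:inverseOf", "?p2"), ("?x", "?p1", "?y")],
     [("?y", "?p2", "?x")]),
]


def match(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None."""
    binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return binding


def apply_rule(rule, graph):
    """All head triples derivable from graph by one application of rule."""
    body, head = rule
    bindings = [{}]
    for pattern in body:
        bindings = [b2 for b in bindings for t in graph
                    for b2 in [match(pattern, t, b)] if b2 is not None]
    return {tuple(b.get(term, term) for term in atom)
            for b in bindings for atom in head}


def closure(graph, rules=RULES):
    """Apply the rules until no new triple is derived (naive fixpoint)."""
    graph = set(graph)
    while True:
        new = set().union(*(apply_rule(r, graph) for r in rules)) - graph
        if not new:
            return graph
        graph |= new
```

   Triples that use unsupported features, e.g., an owl:unionOf list, are simply left in the
graph without triggering any rule, so the remaining conclusions are still sound. The naive
loop re-evaluates every rule in each round; production systems would use semi-naive
evaluation or similar optimisations.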
     The OWL LD ruleset comprises rules of the form:                         One of the earliest comprehensive empirical studies of RDF Web
                     B1 ∧ … ∧ Bn → H   (0 ≤ n ≤ 3)                           data was presented by Ding et al. in 2005 [8]. They report on the
                                                                             prevalence of vocabulary terms in over 1.5 million RDF/XML Web
where H is called the head and B1 ∧. . .∧ Bn is the body. A rule with        documents, where the bulk of data was described using the Friend
an empty body (e.g., the rule cls-thing) is simply a fact. Multiple          of a Friend (FOAF) and Dublin Core (DC) ontologies. The work
atoms in rule heads (e.g., eq-ref) denote conjunctions that could also       focuses on characterising the structure and distributions of the raw
be expressed using multiple rules with the same body. The datatype           data rather than issues relating to semantics or to RDFS and OWL.
rules are somewhat exceptional, however, and require custom logic               Various works look at the syntactic profiles of OWL ontologies
outside of a standard rule-engine. Moreover, some rules use false            on the Web [2, 37, 7]. Bechhofer and Volz identify and categorise
in the head to express that an inconsistency is to be derived. An            OWL DL restrictions violated by a sample group of 201 OWL on-
inconsistency-tolerant system could already be realised by simply            tologies (all of which were found to be in OWL Full); these include
not taking these conclusions into account for query answering.               incorrect or missing typing of classes and properties, complex ob-
   Unlike OWL RL/RDF, which encodes arbitrary-length lists in the            ject properties (e.g., functional properties) declared to be transi-
bodies of some of its rules, the bodies of OWL LD rules consist              tive, inverse-functional datatype properties, and so forth [2]. In a
solely of a fixed set of (a maximum of three) ternary RDF atoms of           later survey, Wang et al. study over 1,276 ontologies, where 924
the form T (s, p, o) where s, p, o ∈ C ∪ V. These restrictions sim-          (72.4%) were identified as being in OWL Full, although they pro-
plify the use of the OWL LD rules in a variety of tools. Excluding           posed that 863 could be patched (93.4%) [37]. In a similar study,
datatype support, since the rules can only derive triples that are built     d’Aquin et al. found that while 81% of 22,200 RDF Web docu-
from the set C of RDF constants that originally occur in the ontol-          ments surveyed fell into OWL Full, from the features used, 95%
ogy and ruleset, the number of entailments is bounded by |C|^3. This         would fall under the expressivity of the lightweight AL(D) De-
bound is tight, e.g., the rules entail all possible triples from the RDF     scription Logic [7]. To summarise, these studies show that restric-
graph owl:sameAs owl:sameAs a ; rdfs:domain owl:Thing .                      tions laid out in the OWL standard (specifically for the OWL Lite
Optimisations for rule-based systems as explored in many works               and OWL DL dialects) are not well-followed by Web ontologies,
can be applied to implement the OWL LD inferencing efficiently.              but that such ontologies are typically relatively inexpressive. These
Systems can efficiently support datatypes by, e.g., only checking            works re-enforce the need for our RS-based extension of OWL LD.
entailments as needed, or using canonicalisation techniques, etc.               More recent papers focus on analysing owl:sameAs adoption
   We are now left to describe the relationship between DS and RS            on the Web of Data [9, 14]. Ding et al. provide a quantitative
for the OWL LD profile.                                                      analysis of the owl:sameAs graph extracted from the BTC-2010
                                                                             dataset (the ancestor of our corpus) [9], summarising the use of
   Theorem 1. Let R contain the OWL LD entailment rules (Ta-                 owl:sameAs to link between different publishers of Linked Data.
ble 5) and let O1 and O2 be OWL 2 ontologies that satisfy the                In a similar vein, Halpin et al. [14] focus on the incorrect use
OWL LD grammar and the following properties:                                 of owl:sameAs; they employ four human judges to manually in-
     1. neither O1 nor O2 contains an IRI that is used for more than         spect 500 such links sampled from Web data, where their results
        one type of entity (i.e., no IRI is used both as, say, a class and   suggest that owl:sameAs is often used imprecisely, although dis-
        an individual);                                                      agreement between the judges indicates that the quality of specific
     2. O1 does not contain SubAnnotationPropertyOf, Anno-                   owl:sameAs links can be subjective. Such surveys indicate that
        tationPropertyDomain or AnnotationPropertyRange;                     reasoners must proceed cautiously when operating over Web data.
     3. each axiom in O2 is an assertion of the form as specified
        below, for a, a1 , and a2 named individuals:                         8.    CONCLUSION
         (a) ClassAssertion(C a) where C is a class,                            We have presented a comprehensive analysis of the current use
         (b) ObjectPropertyAssertion(OP a1 a2 ) where OP is                  of OWL on the Web based on a large sample of RDF/XML docu-
             an object property,                                             ments. We confirmed that OWL has indeed “arrived” on the Web
         (c) DataPropertyAssertion(DP a1 a2 ) where DP is a                  of Data, albeit to varying degrees for different features.
             data property, or                                                  Following Linked Data principles, we used a PageRank algo-
         (d) SameIndividual(a1 a2 ).                                         rithm to assess the importance of individual documents, OWL fea-
                                                                             tures, and datatypes. Our results show that single-triple expressible
Furthermore, let RDF(O1 ) and RDF(O2 ) be translations of O1 and             OWL RL axioms are most prominent on the Web. A survey of tools
O2 , respectively, into RDF graphs [29]; and let FO(RDF(O1 )) and            confirms that these features tend to receive better support.
FO(RDF(O2 )) be the translation of these graphs into first-order                Based on these observations, we defined the OWL LD profile as
theories in which triples are represented using the T predicate.             a sub-language of OWL RL and provided a rule-based reasoning
Then, O1 entails O2 under the OWL 2 Direct Semantics [25] iff                calculus for it. Though motivated by a new analysis of the current
FO(RDF(O1 )) ∪ R entails FO(RDF(O2 )) under the standard first-              Web of Data, OWL LD also aligns closely with the earlier propos-
order semantics.                                                             als of RDFS-Plus and L2, indicating that it is a natural profile that
                                                                             can be motivated from various perspectives. We argue that this is
   The proof of the Correspondence Theorem below follows imme-
                                                                             due to the syntactic restriction of OWL features to those that can
diately from the according theorem for OWL RL [6, Theorem PR1]
                                                                             be expressed using single RDF triples, which reveals exactly the
together with the fact that OWL LD is a restriction of OWL RL.
                                                                             cases where OWL expressions are fully aligned with, and most in-
Like in the case of OWL RL, this result applies only to checking
                                                                             tuitively expressed in, the RDF data model. We argue that this bears
the entailment of basic facts, not of OWL axioms in general.
                                                                             crucial advantages regarding not only tool support, but also usabil-
                                                                             ity. We therefore believe that, even if OWL as a whole might never
7.      RELATED WORK                                                         arrive on the Web of Data, the OWL LD profile is a natural fit for
  Here we discuss related studies on the use of the RDFS and OWL             modelling Linked Data vocabularies. In fact, as we have shown,
on the Web (related OWL profiles have been covered in Section 2).            OWL LD is already widely used.
9. REFERENCES
 [1] D. Allemang and J. A. Hendler. Semantic Web for the Working Ontologist: Effective
     Modeling in RDFS and OWL. Morgan Kaufmann/Elsevier, 2008.
 [2] S. Bechhofer and R. Volz. Patching syntax in OWL ontologies. In Proc. 3rd Int. Semantic
     Web Conf. (ISWC’04), pages 668–682. Springer, 2004.
 [3] T. Berners-Lee. Linked Data. W3C Design Issues, July 2006.
 [4] B. Bishop and S. Bojanov. Implementing OWL 2 RL and OWL 2 QL rule-sets for OWLIM. In
     Proc. OWLED 2011 Workshop on OWL: Experiences and Directions, 2011.
 [5] D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, and R. Rosati. Tractable reasoning
     and efficient query answering in description logics: The DL-Lite family. J. of
     Automated Reasoning, 39(3):385–429, 2007.
 [6] B. Cuenca Grau, B. Motik, Z. Wu, A. Fokoue, and C. Lutz. OWL 2 Web Ontology Language:
     Profiles. W3C Recommendation, Oct. 2009.
 [7] M. d’Aquin, C. Baldassarre, L. Gridinoc, S. Angeletou, M. Sabou, and E. Motta.
     Characterizing knowledge on the Semantic Web with Watson. In Proc. 5th Int. Workshop on
     Evaluation of Ontologies and Ontology-based Tools, pages 1–10, 2007.
 [8] L. Ding and T. Finin. Characterizing the semantic web on the web. In Proc. 5th Int.
     Semantic Web Conf. (ISWC’06), pages 242–257. Springer, 2006.
 [9] L. Ding, J. Shinavier, Z. Shangguan, and D. L. McGuinness. SameAs networks and beyond:
     Analyzing deployment status and implications of owl:sameAs in linked data. In Proc. 9th
     Int. Semantic Web Conf. (ISWC’10), pages 145–160. Springer, 2010.
[10] I. Emmons, S. Collier, M. Garlapati, and M. Dean. RDF literal data types in practice.
     In SSWS 2011, 2011.
[11] F. Fischer, G. Ünel, B. Bishop, and D. Fensel. Towards a scalable, pragmatic knowledge
     representation language for the web. In Ershov Memorial Conf., pages 124–134, 2009.
[12] B. Glimm, A. Hogan, M. Krötzsch, and A. Polleres. OWL LD Entailment Ruleset and
     Implementational Notes, Nov. 2011. http://www.semanticweb.org/OWLLD/.
[13] B. Grosof, I. Horrocks, R. Volz, and S. Decker. Description logic programs: Combining
     logic programs with description logic. In World Wide Web, 2004.
[14] H. Halpin, P. J. Hayes, J. P. McCusker, D. L. McGuinness, and H. S. Thompson. When
     owl:sameAs isn’t the same: An analysis of identity in linked data. In Proc. 9th Int.
     Semantic Web Conf. (ISWC’10), pages 305–320. Springer, 2010.
[15] A. Harth, S. Kinsella, and S. Decker. Using naming authority to rank data and
     ontologies for web search. In Proc. 8th Int. Semantic Web Conf. (ISWC’09), pages
     277–292, 2009.
[16] A. Harth, J. Umbrich, A. Hogan, and S. Decker. YARS2: A federated repository for
     querying graph structured data from the Web. In Proc. 6th Int. Semantic Web Conf.
     (ISWC’07), pages 211–224. Springer, 2007.
[17] P. Hayes. RDF Semantics. W3C Recommendation, Feb. 2004.
[18] T. Heath and C. Bizer. Linked Data: Evolving the Web into a Global Data Space (1st
     Edition). Morgan & Claypool, 2011.
[19] J. A. Hendler. RDFS 3.0. In W3C Workshop on RDF Next Steps, June 2010.
[20] A. Hogan. Exploiting RDFS and OWL for Integrating Heterogeneous, Large-Scale, Linked
     Data Corpora. PhD thesis, DERI, NUIG, 2011.
[21] Y. Kazakov, M. Krötzsch, and F. Simančík. Concurrent classification of EL ontologies.
     In Proc. 10th Int. Semantic Web Conf. (ISWC’11). Springer, 2011.
[22] V. Kolovski, Z. Wu, and G. Eadon. Optimizing enterprise-scale OWL 2 RL reasoning in a
     relational database system. In Proc. 9th Int. Semantic Web Conf. (ISWC’10), pages
     436–452. Springer, 2010.
[23] M. Krötzsch. Efficient rule-based inferencing for OWL EL. In Proc. 22nd Int. Conf. on
     Artificial Intelligence (IJCAI’11), pages 2668–2673, 2011.
[24] G. Meditskos and N. Bassiliades. DLEJena: A practical forward-chaining OWL 2 RL
     reasoner combining Jena and Pellet. J. of Web Semantics, 8(1):89–94, 2010.
[25] B. Motik, P. F. Patel-Schneider, and B. Cuenca Grau. OWL 2 Web Ontology Language:
     Direct Semantics. W3C Recommendation, Oct. 2009.
[26] B. Motik, P. F. Patel-Schneider, and B. Parsia. OWL 2 Web Ontology Language: Structural
     Specification and Functional-Style Syntax. W3C Recommendation, Oct. 2009.
[27] S. Muñoz, J. Pérez, and C. Gutierrez. Simple and efficient minimal RDFS. J. of Web
     Semantics, 7(3):220–234, 2009.
[28] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank Citation Ranking: Bringing
     Order to the Web. Technical report, Stanford, 1998.
[29] P. F. Patel-Schneider, B. Motik, B. Cuenca Grau, I. Horrocks, B. Parsia, A. Ruttenberg,
     and M. Schneider. OWL 2 Web Ontology Language: Mapping to RDF Graphs. W3C
     Recommendation, Oct. 2009.
[30] H. Pérez-Urbina, B. Motik, and I. Horrocks. Tractable query answering and rewriting
     under description logic constraints. J. of Applied Logic, 8(2):151–232, 2009.
[31] M. Schneider and G. Sutcliffe. Reasoning in the OWL 2 Full ontology language using
     first-order automated theorem proving. In Proc. 23rd Int. Conf. on Automated Deduction
     (CADE-23), pages 461–475. Springer, 2011.
[32] F. Simančík, Y. Kazakov, and I. Horrocks. Consequence-based reasoning beyond Horn
     ontologies. In Proc. 22nd Int. Conf. on Artificial Intelligence (IJCAI’11), pages
     1093–1098, 2011.
[33] M. Smith, I. Horrocks, M. Krötzsch, and B. Glimm. OWL 2 Web Ontology Language:
     Conformance. W3C Recommendation, Oct. 2009.
[34] H. J. ter Horst. Completeness, decidability and complexity of entailment for RDF Schema
     and a semantic extension involving the OWL vocabulary. J. of Web Semantics, 3, 2005.
[35] J. Urbani, F. van Harmelen, S. Schlobach, and H. Bal. QueryPIE: Backward reasoning for
     OWL Horst over very large knowledge bases. In Proc. 10th Int. Semantic Web Conf.
     (ISWC’11). Springer, 2011.
[36] R. Volz. Web Ontology Reasoning with Logic Databases. PhD thesis, Universität
     Karlsruhe, 2004.
[37] T. D. Wang, B. Parsia, and J. A. Hendler. A survey of the web ontology landscape. In
     Proc. 5th Int. Semantic Web Conf. (ISWC’06), pages 682–694. Springer, 2006.

Group              ID            Body                                                            Head
Equality           eq-ref        ?s ?p ?o .                                                      ?s owl:sameAs ?s . ?p owl:sameAs ?p . ?o owl:sameAs ?o .
                   eq-sym        ?x owl:sameAs ?y .                                              ?y owl:sameAs ?x .
                   eq-trans      ?x owl:sameAs ?y . ?y owl:sameAs ?z .                           ?x owl:sameAs ?z .
                   eq-rep-s      ?s owl:sameAs ?s' . ?s ?p ?o .                                  ?s' ?p ?o .
                   eq-rep-p      ?p owl:sameAs ?p' . ?s ?p ?o .                                  ?s ?p' ?o .
                   eq-rep-o      ?o owl:sameAs ?o' . ?s ?p ?o .                                  ?s ?p ?o' .
                   eq-diff1      ?x owl:sameAs ?y . ?x owl:differentFrom ?y .                    false
Property Axioms    prp-ap        (for each core annotation property ?p)                          ?p a owl:AnnotationProperty .
                   prp-dom       ?p rdfs:domain ?c . ?x ?p ?y .                                  ?x a ?c .
                   prp-rng       ?p rdfs:range ?c . ?x ?p ?y .                                   ?y a ?c .
                   prp-fp        ?p a owl:FunctionalProperty . ?x ?p ?y1 . ?x ?p ?y2 .           ?y1 owl:sameAs ?y2 .
                   prp-ifp       ?p a owl:InverseFunctionalProperty . ?x1 ?p ?y . ?x2 ?p ?y .    ?x1 owl:sameAs ?x2 .
                   prp-irp       ?p a owl:IrreflexiveProperty . ?x ?p ?x .                       false
                   prp-symp      ?p a owl:SymmetricProperty . ?x ?p ?y .                         ?y ?p ?x .
                   prp-asyp      ?p a owl:AsymmetricProperty . ?x ?p ?y . ?y ?p ?x .             false
                   prp-trp       ?p a owl:TransitiveProperty . ?x ?p ?y . ?y ?p ?z .             ?x ?p ?z .
                   prp-spo1      ?p1 rdfs:subPropertyOf ?p2 . ?x ?p1 ?y .                        ?x ?p2 ?y .
                   prp-eqp1      ?p1 owl:equivalentProperty ?p2 . ?x ?p1 ?y .                    ?x ?p2 ?y .
                   prp-eqp2      ?p1 owl:equivalentProperty ?p2 . ?x ?p2 ?y .                    ?x ?p1 ?y .
                   prp-pdw       ?p1 owl:propertyDisjointWith ?p2 . ?x ?p1 ?y . ?x ?p2 ?y .      false
                   prp-inv1      ?p1 owl:inverseOf ?p2 . ?x ?p1 ?y .                             ?y ?p2 ?x .
                   prp-inv2      ?p1 owl:inverseOf ?p2 . ?x ?p2 ?y .                             ?y ?p1 ?x .
Classes            cls-thing     -                                                               owl:Thing a owl:Class .
                   cls-nothing   -                                                               owl:Nothing a owl:Class .
                   cls-nothing2  ?x a owl:Nothing .                                              false
Class Axioms       cax-sco       ?c1 rdfs:subClassOf ?c2 . ?x a ?c1 .                            ?x a ?c2 .
                   cax-eqc1      ?c1 owl:equivalentClass ?c2 . ?x a ?c1 .                        ?x a ?c2 .
                   cax-eqc2      ?c1 owl:equivalentClass ?c2 . ?x a ?c2 .                        ?x a ?c1 .
                   cax-dw        ?c1 owl:disjointWith ?c2 . ?x a ?c1 , ?c2 .                     false
Datatypes          dt-type1      (for each supported datatype ?dt)                               ?dt a rdfs:Datatype .
                   dt-type2      (for each literal ?lt in the value space of datatype ?dt)       ?lt a ?dt .
                   dt-eq         (for all ?lt1 and ?lt2 with the same data value)                ?lt1 owl:sameAs ?lt2 .
                   dt-diff       (for all ?lt1 and ?lt2 with different data values)              ?lt1 owl:differentFrom ?lt2 .
                   dt-not-type   ?lt a ?dt . (where ?lt is not in the value space of ?dt)        false
Schema Vocabulary  scm-cls       ?c a owl:Class .                                                ?c rdfs:subClassOf ?c . ?c rdfs:subClassOf owl:Thing . ?c owl:equivalentClass ?c . owl:Nothing rdfs:subClassOf ?c .
                   scm-sco       ?c1 rdfs:subClassOf ?c2 . ?c2 rdfs:subClassOf ?c3 .             ?c1 rdfs:subClassOf ?c3 .
                   scm-eqc1      ?c1 owl:equivalentClass ?c2 .                                   ?c1 rdfs:subClassOf ?c2 . ?c2 rdfs:subClassOf ?c1 .
                   scm-eqc2      ?c1 rdfs:subClassOf ?c2 . ?c2 rdfs:subClassOf ?c1 .             ?c1 owl:equivalentClass ?c2 .
                   scm-op        ?p a owl:ObjectProperty .                                       ?p rdfs:subPropertyOf ?p . ?p owl:equivalentProperty ?p .
                   scm-dp        ?p a owl:DatatypeProperty .                                     ?p rdfs:subPropertyOf ?p . ?p owl:equivalentProperty ?p .
                   scm-spo       ?p1 rdfs:subPropertyOf ?p2 . ?p2 rdfs:subPropertyOf ?p3 .       ?p1 rdfs:subPropertyOf ?p3 .
                   scm-eqp1      ?p1 owl:equivalentProperty ?p2 .                                ?p1 rdfs:subPropertyOf ?p2 . ?p2 rdfs:subPropertyOf ?p1 .
                   scm-eqp2      ?p1 rdfs:subPropertyOf ?p2 . ?p2 rdfs:subPropertyOf ?p1 .       ?p1 owl:equivalentProperty ?p2 .
                   scm-dom1      ?p rdfs:domain ?c1 . ?c1 rdfs:subClassOf ?c2 .                  ?p rdfs:domain ?c2 .
                   scm-dom2      ?p2 rdfs:domain ?c . ?p1 rdfs:subPropertyOf ?p2 .               ?p1 rdfs:domain ?c .
                   scm-rng1      ?p rdfs:range ?c1 . ?c1 rdfs:subClassOf ?c2 .                   ?p rdfs:range ?c2 .
                   scm-rng2      ?p2 rdfs:range ?c . ?p1 rdfs:subPropertyOf ?p2 .                ?p1 rdfs:range ?c .



                               Table 5: The OWL LD ruleset in Turtle/N3 style syntax where false in the head denotes inconsistency
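
As an illustration of how rules of this shape can be evaluated, the following is a minimal sketch of naive forward chaining in Python. It is not the authors' implementation: it covers only seven of the rules in Table 5 (eq-ref, eq-sym, eq-trans, eq-rep-s, eq-rep-p, eq-rep-o and prp-dom), represents prefixed names as plain strings, writes Turtle's `a` as "rdf:type", and uses hypothetical helper names (match, solutions, closure). It applies these rules to the two-triple graph that Section 6 gives as the witness for the tightness of the |C|³ bound.

```python
from itertools import product

# Prefixed names are plain strings; "rdf:type" stands in for Turtle's `a`.
SAME, TYPE, DOM, THING = "owl:sameAs", "rdf:type", "rdfs:domain", "owl:Thing"

# Each rule is (body, head): lists of triple patterns. Terms starting with "?"
# are variables. Only a subset of Table 5 is encoded here (an assumption of
# this sketch, chosen to keep the example short).
RULES = {
    "eq-ref": ([("?s", "?p", "?o")],
               [("?s", SAME, "?s"), ("?p", SAME, "?p"), ("?o", SAME, "?o")]),
    "eq-sym": ([("?x", SAME, "?y")], [("?y", SAME, "?x")]),
    "eq-trans": ([("?x", SAME, "?y"), ("?y", SAME, "?z")], [("?x", SAME, "?z")]),
    "eq-rep-s": ([("?s", SAME, "?s2"), ("?s", "?p", "?o")], [("?s2", "?p", "?o")]),
    "eq-rep-p": ([("?p", SAME, "?p2"), ("?s", "?p", "?o")], [("?s", "?p2", "?o")]),
    "eq-rep-o": ([("?o", SAME, "?o2"), ("?s", "?p", "?o")], [("?s", "?p", "?o2")]),
    "prp-dom": ([("?p", DOM, "?c"), ("?x", "?p", "?y")], [("?x", TYPE, "?c")]),
}


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return binding


def solutions(body, graph, binding=None):
    """Yield every variable binding that satisfies all patterns in `body`."""
    binding = binding or {}
    if not body:
        yield binding
        return
    for triple in graph:
        extended = match(body[0], triple, binding)
        if extended is not None:
            yield from solutions(body[1:], graph, extended)


def closure(graph, rules=RULES):
    """Naive forward chaining: apply all rules until no new triple appears."""
    graph = set(graph)
    while True:
        new = set()
        for body, head in rules.values():
            for b in solutions(body, graph):
                new.update(tuple(b.get(t, t) for t in h) for h in head)
        if new <= graph:
            return graph
        graph |= new


# The graph from Section 6: owl:sameAs owl:sameAs a ; rdfs:domain owl:Thing .
seed = {(SAME, SAME, TYPE), (SAME, DOM, THING)}
entailed = closure(seed)
constants = {SAME, TYPE, DOM, THING}
print(len(entailed), entailed == set(product(constants, repeat=3)))
```

Under these assumptions the closure contains all 4³ = 64 triples over the four constants of the input graph. With the full ruleset, C would additionally include the vocabulary terms mentioned in the remaining rules. The sketch re-evaluates every rule in each round; a practical system would more likely use semi-naive evaluation or the optimisations referred to in Section 6.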