=Paper=
{{Paper
|id=None
|storemode=property
|title=OWL: Yet to arrive on the Web of Data?
|pdfUrl=https://ceur-ws.org/Vol-937/ldow2012-paper-16.pdf
|volume=Vol-937
|dblpUrl=https://dblp.org/rec/conf/www/GlimmHKP12
}}
==OWL: Yet to arrive on the Web of Data?==
* Birte Glimm, Ulm University, Institute of Artificial Intelligence, 89069 Ulm, Germany
* Aidan Hogan, Digital Enterprise Research Institute, National University of Ireland Galway, Ireland
* Markus Krötzsch, University of Oxford, Department of Computer Science, OX1 3QD Oxford, United Kingdom
* Axel Polleres, Siemens AG Österreich, Siemensstrasse 90, 1210 Vienna, Austria

===ABSTRACT===
Seven years on from OWL becoming a W3C recommendation, and two years on from the more recent OWL 2 W3C recommendation, OWL has still experienced only patchy uptake on the Web. Although certain OWL features (like owl:sameAs) are very popular, other features of OWL are largely neglected by publishers in the Linked Data world. This may suggest that despite the promise of easy implementations and the proposal of tractable profiles suggested in OWL’s second version, there is still no “right” standard fragment for the Linked Data community. In this paper, we (1) analyse uptake of OWL on the Web of Data, (2) gain insights into the OWL fragment that is actually used/usable on the Web, where we arrive at the conclusion that this fragment is likely to be a simplified profile based on OWL RL, (3) propose and discuss such a new fragment, which we call OWL LD (for Linked Data).

===1. INTRODUCTION===
Under the initial impetus of the Linking Open Data project – and guided by the Linked Data principles [3] and associated best-practices – a rich vein of openly-available structured data has been published on the Web using Semantic Web standards. Publishing RDF on the Web is no longer confined to academia and hobbyists: the current “Web of Data” now features exports from various corporate and commercial bodies (e.g., BBC, New York Times, BestBuy), online communities (e.g., Freebase, identi.ca), life-science corpora (e.g., DrugBank, Linked Clinical Trials) and governmental bodies (e.g., data.gov, data.gov.uk). The “Linked Open Data cloud” now depicts 295 interlinked datasets, which together consist of an estimated 31.6 billion RDF triples.<sup>1</sup>

Although RDF provides standard syntaxes and a common data model for disseminating structured information, it offers very little when it comes to giving semantics to the published data. RDF Schema (RDFS) and OWL were developed to address this by providing a vocabulary for describing schema data. The special vocabulary terms of RDFS and OWL – such as rdfs:subClassOf or owl:FunctionalProperty – have a well-defined semantics, which can be used to derive implicit consequences from the data.

In terms of publishing, parts of the RDFS and OWL standards have been adopted on the Web of Data. Linked Data literature recommends use of owl:sameAs relations between two URIs that refer to the same resource [18, § 2.5.2]. Further, Linked Data guidelines recommend use of RDFS [18, § 4.4.2] for defining and interlinking vocabularies. Regarding OWL, guidelines explicitly recommend use of owl:equivalentClass, owl:equivalentProperty, owl:InverseFunctionalProperty & owl:inverseOf [18, § 4.4.2]. However, other OWL features are not concretely mentioned.

In terms of standards, RDFS and OWL 1 pre-date the Linked Data movement and are not directly tailored towards Linked Data requirements. Although the informative entailment rules for supporting RDFS inferences are relatively straightforward, things like the infinitely many entailed axiomatic triples reduce their practicality [27]. In OWL 1 the situation is more complex: OWL 1 Full further extends the RDFS semantics to the extent that reasoning becomes undecidable. In OWL 1 DL and OWL 1 Lite, where the semantics are based on Description Logics, typical reasoning tasks remain decidable, but are of exponential or harder worst-case complexity. OWL 2 addresses the complexity issue by defining profiles [6]: fragments for which at least some reasoning tasks are tractable. Reasoning with inconsistent data is, however, still problematic in any OWL fragment. Further, each profile is a syntactic subset of OWL DL such that RDF data must adhere to certain non-trivial conditions which are commonly not followed in Web ontologies [2, 37, 7]. However, OWL RL includes a ruleset called OWL RL/RDF, which is applicable over arbitrary RDF data.

Although the OWL RL profile is implementable using straightforward rule-based technologies, (as we show) the profile still includes many features with sparse uptake in Linked Data publishing. Which features are prominently used is, however, unclear. Taking this cue, we herein survey a broad spectrum of RDF Web data, looking at the uptake of individual RDFS and OWL features used therein, including datatypes. We further analyse to what extent OWL features are supported by tools that provide the technical infrastructure for building complex Semantic Web applications.

Our analysis suggests that a much simpler profile of OWL might be better targeted towards the current needs of the Linked Data community. We thus propose OWL LD (for Linked Data) as a subset of the OWL RL profile, using the insights of our survey to make an informed decision as to which features of the RDFS and OWL standards should be included in the profile.

The remainder of the paper is structured as follows: In the next section, we introduce some preliminaries. In Section 3, we present our survey of the use of RDFS and OWL features on the Web, including a survey of datatypes. In Section 4, we analyse the tool support for RDFS and OWL. Drawing upon our observations, we propose and define the OWL LD profile in Section 5, and discuss formal aspects of reasoning over the profile in Section 6. Next, in Section 7, we give a synopsis of related work for empirical analyses of RDFS and OWL data on the Web. We conclude in Section 8.

<sup>1</sup> http://www4.wiwiss.fu-berlin.de/lodcloud/state/

Acknowledgements. This work has been funded in part by Science Foundation Ireland under Grant No. SFI/08/CE/I1380 (Líon-2) and by an IRCSET postgraduate grant.

LDOW2012, April 16, 2012, Lyon, France. Copyright held by the author(s)/owner(s).

===2. BACKGROUND===
We first recall some relevant features of RDF, RDFS, and OWL semantics and give a summary of the existing OWL profiles.

====2.1 RDF Graphs and Their Semantics====
Given the set of URI references U, the set of blank nodes B, and the set of literals L, the set of RDF constants is denoted by C := U ∪ B ∪ L. We use CURIEs to denote URIs (e.g., owl:sameAs), where the prefixes used in this paper can be looked up, e.g., at http://prefix.cc/. We often use Turtle syntax; e.g., we may use a as a shortcut for rdf:type. Finally, V denotes the set of RDF variables ranging over C and we prefix variables with ‘?’.

An RDF triple (s, p, o) is a triple from the set of all RDF triples 𝒢 := (U ∪ B) × U × C, where s is called subject, p predicate, and o object. We call a finite set of triples G ⊂ 𝒢 an RDF graph.

Semantically, RDF graphs can be interpreted in a number of ways based on various W3C recommendations. The simple semantics [17] considers only the graph structure of RDF, whereas more elaborate semantics such as RDFS entailment [17] or the OWL 2 Direct and RDF-Based Semantics (see below) provide special meanings for certain terms.

The common basis for all such semantics is that they are specified in terms of model theory: one defines interpretations together with necessary and sufficient conditions that specify when an interpretation satisfies a graph. When defining a semantics E (such as RDF, RDFS, etc.) one often speaks of E-interpretations and E-satisfaction. The set of all E-interpretations that E-satisfy a graph G are called the E-models of G. Semantic entailment follows from this notion: a graph G E-entails a graph G′, written G ⊨<sub>E</sub> G′, if and only if every E-model of G is also an E-model of G′.

====2.2 OWL and its Fragments====
OWL 2 is an ontology language that provides advanced schema modelling capabilities that can be used together with RDF data. OWL 2 supersedes the earlier specification “OWL 1” by introducing new modelling features, additional serialisations, updated conformance conditions and various corrections. When omitting the version number, we thus mean the current OWL 2 standard.

Every RDF graph can be considered as an OWL ontology and the language of all RDF documents is called OWL Full to emphasise that all such graphs should be viewed as ontologies. In applications, however, OWL ontologies are usually viewed as being composed of axioms that can be more complex than single triples. For example, the triple ex:a owl:sameAs ex:b . corresponds to the OWL axiom SameIndividual(ex:a ex:b) whereas the axiom

 ObjectPropertyRange(skos:member
     ObjectUnionOf(skos:Concept skos:Container))          (1)

expands to the six RDF triples

 skos:member rdfs:range _:x .        _:x owl:unionOf _:x1 .
 _:x1 rdf:first skos:Concept .       _:x1 rdf:rest _:x2 .          (2)
 _:x2 rdf:first skos:Container .     _:x2 rdf:rest rdf:nil .

Various conditions must be imposed on RDF graphs to ensure that they are in one-to-one correspondence to a collection of OWL axioms. A syntactic subset of OWL Full for which this is possible is OWL DL, which also imposes further restrictions that are useful for computing semantic conclusions from the ontology [26]. Drawing such conclusions can still be computationally expensive. Hence, OWL further defines three syntactically restricted sub-languages (profiles) of OWL DL called OWL EL, OWL RL and OWL QL [6] (see also Table 2 later for a brief feature comparison). OWL Full, OWL DL and the OWL profiles together constitute the five language fragments of OWL. The essential features of RDF Schema (sub-classes and -properties, domain, range) are covered by all fragments, but only OWL Full supports arbitrary RDF documents.

Various sub-languages of OWL have also been proposed outside of the official standard. The current profiles have themselves been inspired by existing approaches: EL++ for OWL EL [23], DL-Lite [5] for OWL QL, and Description Logic Programs (DLP) [13] and pD* [34] for OWL RL. Generally, these approaches aim to maximise expressivity under some design principles. DLP is defined as a syntactic fragment of OWL. Other languages – including pD* – came about by extending RDFS with additional features. Allemang and Hendler proposed RDFS-Plus based on an informal survey of practitioners and three criteria felt important for adoption: pedagogism (intuitive and easy to learn), practicality (real use-cases in modelling), and computational feasibility (not too hard to implement) [1]. This language was later extended to RDFS 3.0 along similar principles [19]. Fisher et al. propose a similar profile called L2, where the feature selection is made on an ad-hoc basis [11]. (Table 2 later summarises the main features of these languages.)

====2.3 OWL Semantics and Reasoning====
OWL ontologies can be interpreted under two different semantics that agree in important cases: the RDF-Based Semantics (RS) [17] and the Direct Semantics (DS) [25]. As in RDF(S), the semantics are defined by specifying a model theory, i.e., by defining valid interpretations for ontologies based on semantic conditions. In RS, these models are based on the representation of OWL axioms as RDF graphs and thus can be viewed as a refined form of RDF interpretation. In DS, models are directly defined based on the structure of OWL axioms in the conceptual framework of Description Logics (which in turn is based on first-order logic). Due to this, DS is only defined for ontologies that belong to the OWL DL language (or to any of its profiles) while RS can also be used on OWL Full. Besides this restriction, OWL language fragments are not tied to either semantics, leaving nine valid combinations of syntactic fragments and formal semantics [33].

RS is arguably more robust since it is defined for any RDF graph while DS only works for ontologies in OWL DL. However, RS entailment (of derived facts) is undecidable: implementations can only compute a subset of the conclusions that the semantics specifies. In contrast, there are complete implementations for computing entailments under DS, albeit with a high (super-exponential) worst-case complexity if all of OWL DL is to be covered. When further restricting to the OWL profiles, entailment checking under DS can be done in polynomial time. For RS, it is not known in general if the entailment problem becomes simpler in these cases. It is known, however, that RS and DS yield the same entailments on OWL RL under certain additional conditions, leading to a partial tractability result for RS for this case [6]. Similar results could be obtained in other cases since DS reasoning algorithms can often be modified to obtain correct (though often incomplete) RS reasoners.

DS reasoning in all of the OWL profiles and significant parts of OWL DL can be implemented using rules in a forward-chaining manner. For OWL RL, an algorithm is suggested in the specification [6], while other works have covered OWL EL [23] and parts of OWL DL that also cover OWL QL [32]. For OWL QL, query rewriting is a more common reasoning technique [5, 30]. There are many different reasoning techniques for OWL DL under DS, though not all of them lead to polynomial algorithms when applied to the OWL profiles. Two (necessarily incomplete) reasoning methods are known for RS: algorithms based on sets of derivation rules like the ones for OWL RL and an approach based on using first-order theorem provers [31].
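To make the forward-chaining, rule-based style of reasoning mentioned above more concrete, the following is a minimal sketch in Python. It is an illustration under stated assumptions rather than the algorithm of any cited system: triples are plain tuples of CURIE strings, only a handful of rules in the spirit of OWL RL/RDF are included (sub-class transitivity and membership, domain, range, and symmetry and transitivity of owl:sameAs), the fixpoint loop is deliberately naive, and all ex: names and the function names are hypothetical.

```python
# Illustrative sketch only: naive forward chaining over a few
# OWL RL/RDF-style rules. Triples are (subject, predicate, object)
# tuples of CURIE strings; no real RDF library is used.

RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"
SAME_AS = "owl:sameAs"


def apply_rules(graph):
    """Return the triples derivable from `graph` in one round of rule application."""
    derived = set()
    for (s, p, o) in graph:
        if p == SAME_AS:
            derived.add((o, SAME_AS, s))  # owl:sameAs is symmetric
        for (s2, p2, o2) in graph:
            if p == SUBCLASS and p2 == SUBCLASS and o == s2:
                derived.add((s, SUBCLASS, o2))  # sub-class transitivity
            if p == SUBCLASS and p2 == RDF_TYPE and o2 == s:
                derived.add((s2, RDF_TYPE, o))  # membership follows sub-class
            if p == DOMAIN and p2 == s:
                derived.add((s2, RDF_TYPE, o))  # subject is typed by the domain
            if p == RANGE and p2 == s:
                # Object is typed by the range (literal objects are not
                # filtered out in this simplified sketch).
                derived.add((o2, RDF_TYPE, o))
            if p == SAME_AS and p2 == SAME_AS and o == s2:
                derived.add((s, SAME_AS, o2))  # owl:sameAs is transitive
    return derived


def materialise(graph):
    """Apply the rules until no new triples appear (a fixpoint)."""
    closure = set(graph)
    while True:
        new = apply_rules(closure) - closure
        if not new:
            return closure
        closure |= new


# Hypothetical example data.
example = {
    ("ex:Cat", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
    ("ex:felix", RDF_TYPE, "ex:Cat"),
    ("ex:owns", DOMAIN, "ex:Person"),
    ("ex:owns", RANGE, "ex:Pet"),
    ("ex:alice", "ex:owns", "ex:felix"),
    ("ex:alice", SAME_AS, "ex:a1"),
}

closure = materialise(example)
for triple in sorted(closure - example):
    print(triple)
```

The loop terminates because the rules never introduce new terms, so only finitely many triples can be derived. Each rule here matches at most two triples at a time; a fuller ruleset would also have to walk RDF lists for features such as owl:unionOf, and would need the replacement rules for owl:sameAs that this sketch omits.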
===3. SURVEY OF RDFS & OWL ADOPTION ON THE WEB OF DATA===
We now present an empirical survey of RDFS & OWL adoption on the Web of Data. Our survey is conducted over the Billion Triple Challenge 2011 corpus, which consists of 2.145 billion quadruples crawled from 7.411 million RDF/XML documents through an open crawl run in May/June 2011 spanning 791 pay-level domains. (A pay-level domain is a direct sub-domain of a top-level domain (TLD) or a second-level country domain (ccSLD), e.g., dbpedia.org, bbc.co.uk. This gives us our notion of “domain”.) This corpus represents a broad sample of the Web of Data.

====3.1 Measures Used====
In order to adequately characterise the uptake of various RDF(S) and OWL features used in this corpus, we present different measures to quantify their prevalence and prominence.

First, we look at the prevalence of use of different features, i.e., how often they are used. Here, we must take into account the diversity of the data under analysis, where few domains account for a great many triples and many domains account for few triples, where certain domains tend to publish many small documents and others publish few large documents, and so forth [20]. We thus present three statistics: (1) number of axioms using the feature [Ax], (2) number of documents [Doc] and (3) number of domains [Dom].

However, raw counts do not reflect that the use of an OWL feature in one important ontology may often have greater practical impact than use in a thousand obscure documents. Thus, we also look at the prominence of use of different features. We use PageRank to quantify our notion of prominence: PageRank calculates a variant of the Eigenvector centrality of nodes (e.g., documents) in a graph, where taking the intuition of directed links as “positive votes”, the resulting scores help characterise the relative prominence (i.e., centrality) of particular documents on the Web [28, 15].

In particular, we first rank documents in the corpus. To construct the graph, we consider RDF documents as nodes, where a directed edge (d<sub>1</sub>, d<sub>2</sub>) is extended from document d<sub>1</sub> to d<sub>2</sub> iff d<sub>1</sub> hosts RDF data that contains (in any triple position) a URI that dereferences to document d<sub>2</sub>. This notion of dereferenceable links is core to Linked Data principles [3]. Note also that we follow redirects when checking dereferenceability. We then apply a standard PageRank analysis over the resulting directed graph, using the power iteration method with ten iterations. For reasons of space, we refer the interested reader to [28] for more detail on PageRank, and [20] for more detail on the particular algorithms used for this paper.

Given these rank scores, for the different RDF(S) and OWL features we then present (1) the sum of PageRank scores for documents in which they are used [Σ Rank]; (2) the PageRank score of the highest-ranked document in which the feature appears [max Rank]; (3) the position of that document in the ordering of the 7.411 million documents [max Pos].

In terms of intuition under the random surfer model of PageRank [28], given an agent starting from a random location and traversing documents on (our sample of) the Web of Data through randomly selected dereferenceable URIs, the Σ Rank value for a feature approximately indicates the probability with which that agent will be at a document using that feature after traversing ten links. In other words, the score indicates the likelihood of an agent, operating over the Web of Data based on dereferenceable principles, to encounter a given feature.

The graph extracted from the corpus consists of 7.411 million nodes and 198.6 million edges. Table 1 presents the top-10 ranked documents in our corpus, which are dominated by core meta-vocabularies, documents linked therefrom, and other popular vocabularies; we also present the ranks of other notable documents mentioned in the following section.<sup>2</sup>

{| class="wikitable"
|+ Table 1: Top ten ranked documents and notable ranks (position < 1,000) mentioned later in Table 2
! № !! Document URI !! Rank
|-
| 1 || http://www.w3.org/1999/02/22-rdf-syntax-ns || 0.121
|-
| 2 || http://www.w3.org/2000/01/rdf-schema || 0.110
|-
| 3 || http://dublincore.org/2010/10/11/dcelements.rdf || 0.096
|-
| 4 || http://www.w3.org/2002/07/owl || 0.078
|-
| 5 || http://www.w3.org/2000/01/rdf-schema-more || 0.049
|-
| 6 || http://dublincore.org/2010/10/11/dcterms.rdf || 0.036
|-
| 7 || http://www.w3.org/2009/08/skos-reference/skos.rdf || 0.026
|-
| 8 || http://xmlns.com/foaf/spec/ || 0.023
|-
| 9 || http://dublincore.org/DCMI.rdf || 0.021
|-
| 10 || http://www.w3.org/2003/g/data-view || 0.017
|-
| 14 || http://id.loc.gov/authorities/sh98002267 || 4.01E-3
|-
| 30 || http://motools.sourceforge.net/doc/musicontology.rdfs || 2.38E-3
|-
| 38 || http://www.w3.org/.../wn20/schemas/wnfull.rdfs || 7.79E-4
|-
| 43 || http://vivoweb.org/files/vivo-core-public-1.2.owl || 6.11E-4
|-
| 87 || http://www.w3.org/2006/time || 2.07E-4
|-
| 116 || http://rdf.geospecies.org/ont/geospecies || 1.22E-4
|-
| 129 || http://motools.sourceforge.net/timeline/timeline.rdf || 1.06E-4
|-
| 159 || http://vocab.org/bio/0.1/termgroup2.rdf || 8.11E-5
|-
| 259 || http://www.ordnancesurvey.co.uk/.../geometry.owl || 4.39E-5
|-
| 289 || http://www.ordnancesurvey.co.uk/.../admingeo.owl || 4.01E-5
|-
| 990 || http://www.ordnancesurvey.co.uk/.../spatialrelations.owl || 1.24E-5
|}

<sup>2</sup> We limit the results to those presented for space reasons. We ran another similar analysis with links to and from core RDF(S) and OWL vocabularies disabled. The results for the feature analysis remained similar. Mainly owl:sameAs dropped several positions in terms of the sum of PageRank.

====3.2 Survey of RDF(S)/OWL Features====
Table 2 presents the results of the survey of RDF(S) and OWL usage in our corpus, where for features with non-trivial semantics, we present the measures mentioned in the previous section, as well as support for the features in the different reasoning profiles discussed in Section 2.2. We exclude rdf:type, which appeared in 90.3% of documents. We present the table ordered by the sum of PageRank measure [Σ Rank]; recall that Table 1 provides a legend for notable documents (Pos < 1,000).

In column ‘ST’, we indicate which features have expressions that can be represented as a single triple in RDF, i.e., which features do not require auxiliary blank nodes of the form _:x or the SEQ production in Table 1 of the OWL 2 Mapping to RDF document [29]. This distinction is motivated by our initial observations that such features are typically the most widely used in Web data.

Figure 1 gives a visual overview of the Σ Rank measure for the listed features (log scale), where different shades of grey are used to indicate to which vocabulary a term belongs (e.g., distinguishing the terms new in OWL 2 from the ones already in OWL 1).

{| class="wikitable"
|+ Table 2: Survey of RDFS/OWL primitives used on the Web of Data and support in different tractable profiles, where ∗ denotes that the semantics is not fully axiomatised by the OWL RL/RDF rules or that usage of the term is restricted under OWL Direct Semantics; X marks support, - marks no support, and (2) marks terms new in OWL 2
! № !! Primitive !! Σ Rank !! max Rank !! max Pos !! Ax !! Doc !! Dom !! RDFS !! L2 !! RDFS+ !! DLP !! pD* !! EL !! QL !! RL !! ST
|-
| 1 || rdf:Property || 5.74E-1 || 1.21E-1 || 1 || 17,509 || 8,049 || 48 || X || - || - || - || X || - || - || - || X
|-
| 2 || rdfs:range || 4.67E-1 || 1.21E-1 || 1 || 51,540 || 44,492 || 89 || X || X || X || X || X || X || X || X || X
|-
| 3 || rdfs:domain || 4.62E-1 || 1.21E-1 || 1 || 97,288 || 43,247 || 89 || X || X || X || X || X || X || X || X || X
|-
| 4 || rdfs:subClassOf || 4.60E-1 || 1.21E-1 || 1 || 1,164,620 || 115,608 || 109 || X || X || X || X || X || X || X || X || X
|-
| 5 || rdfs:Class || 4.45E-1 || 1.21E-1 || 1 || 39,606 || 19,904 || 43 || X || - || - || X || X || - || - || - || X
|-
| 6 || rdfs:subPropertyOf || 2.35E-1 || 1.10E-1 || 2 || 11,490 || 6,080 || 80 || X || X || X || X || X || X || X || X || X
|-
| 7 || owl:Class || 1.74E-1 || 7.80E-2 || 4 || 255,002 || 302,701 || 111 || - || - || X || X || X || X || X || X || X
|-
| 8 || owl:ObjectProperty || 1.68E-1 || 7.80E-2 || 4 || 35,065 || 285,412 || 92 || - || - || X || X || - || X || X || X || X
|-
| 9 || rdfs:Datatype || 1.68E-1 || 1.21E-1 || 1 || 31 || 23 || 9 || X∗ || - || - || X∗ || X∗ || X∗ || X∗ || X∗ || X∗
|-
| 10 || owl:DatatypeProperty || 1.65E-1 || 7.80E-2 || 4 || 23,888 || 234,483 || 82 || - || - || X || X || - || X || X || X || X
|-
| 11 || owl:AnnotationProperty || 1.60E-1 || 7.80E-2 || 4 || 216 || 172,290 || 55 || - || - || - || X || - || X || X || X || X
|-
| 12 || owl:FunctionalProperty || 9.18E-2 || 2.63E-2 || 7 || 3,222 || 298 || 34 || - || - || X || X || X || - || - || X || X
|-
| 13 || owl:equivalentProperty || 8.54E-2 || 3.57E-2 || 6 || 168 || 141 || 23 || - || X || X || X || X || X || X || X || X
|-
| 14 || owl:inverseOf || 7.91E-2 || 2.63E-2 || 7 || 1,160 || 366 || 43 || - || X || X || X || X || - || X || X || X
|-
| 15 || owl:disjointWith || 7.65E-2 || 2.63E-2 || 7 || 3,266 || 230 || 27 || - || - || - || X || - || X || X || X || X
|-
| 16 || owl:sameAs || 7.29E-2 || 4.01E-3 || 14 || 3,450,554 || 1,778,208 || 117 || - || X || X || X || X || X || - || X || X
|-
| 17 || owl:equivalentClass || 5.24E-2 || 2.32E-2 || 8 || 25,827 || 22,291 || 39 || - || X || X || X || X || X || X || X || X
|-
| 18 || owl:InverseFunctionalProperty || 4.79E-2 || 2.32E-2 || 8 || 75 || 111 || 24 || - || - || X || X∗ || X || - || - || X || X
|-
| 19 || owl:unionOf || 3.15E-2 || 2.63E-2 || 7 || 46,721 || 15,162 || 30 || - || - || - || X∗ || - || - || - || X∗ || -
|-
| 20 || owl:SymmetricProperty || 3.13E-2 || 2.63E-2 || 7 || 175 || 120 || 23 || - || X || X || X || X || - || X || X || X
|-
| 21 || owl:TransitiveProperty || 2.98E-2 || 2.63E-2 || 7 || 223 || 150 || 30 || - || X || X || X || X || X || - || X || X
|-
| 22 || owl:someValuesFrom || 2.13E-2 || 1.65E-2 || 10 || 3,854 || 1,753 || 15 || - || - || - || X∗ || X∗ || X || X∗ || X∗ || -
|-
| 23 || rdf:_* || 1.42E-2 || 8.11E-5 || 159 || 7,791,545 || 293,022 || 62 || X || - || - || - || X || - || - || - || -
|-
| 24 || owl:allValuesFrom || 2.98E-3 || 7.79E-4 || 38 || 108,989 || 29,084 || 20 || - || - || - || X∗ || X∗ || - || - || X∗ || -
|-
| 25 || owl:minCardinality || 2.43E-3 || 6.11E-4 || 43 || 395,841 || 33,309 || 19 || - || - || - || X∗ || - || - || - || - || -
|-
| 26 || owl:maxCardinality || 2.14E-3 || 6.11E-4 || 43 || 223,994 || 10,413 || 24 || - || - || - || X∗ || - || - || - || X∗ || -
|-
| 27 || owl:cardinality || 1.75E-3 || 7.79E-4 || 38 || 20,781 || 3,170 || 24 || - || - || - || X∗ || - || - || - || - || -
|-
| 28 || owl:oneOf || 4.13E-4 || 2.07E-4 || 87 || 736 || 74 || 11 || - || - || - || X∗ || - || X∗ || - || X∗ || -
|-
| 29 || owl:hasValue || 3.91E-4 || 2.07E-4 || 87 || 1,624 || 55 || 14 || - || - || - || X∗ || X || X || - || X || -
|-
| 30 || owl:intersectionOf || 3.37E-4 || 1.06E-4 || 129 || 2,324 || 186 || 13 || - || - || - || X || - || X || X∗ || X || -
|-
| 31 || owl:NamedIndividual (2) || 1.63E-4 || 1.22E-4 || 116 || 205 || 3 || 2 || - || - || - || - || - || X || X || X || X
|-
| 32 || owl:AllDifferent || 1.55E-4 || 1.22E-4 || 116 || 87 || 21 || 8 || - || - || - || - || - || X || - || X || -
|-
| 33 || owl:propertyChainAxiom (2) || 1.23E-4 || 4.01E-5 || 289 || 52 || 14 || 6 || - || - || - || - || - || X || - || X || -
|-
| 34 || owl:onDataRange || 8.41E-5 || 4.39E-5 || 259 || 89 || 3 || 1 || - || - || - || - || - || - || - || - || -
|-
| 35 || owl:minQualifiedCardinality (2) || 8.40E-5 || 4.39E-5 || 259 || 7 || 2 || 1 || - || - || - || - || - || - || - || - || -
|-
| 36 || owl:qualifiedCardinality (2) || 4.02E-5 || 4.01E-5 || 289 || 95 || 2 || 1 || - || - || - || - || - || - || - || - || -
|-
| 37 || owl:AllDisjointClasses (2) || 4.01E-5 || 4.01E-5 || 289 || 9 || 2 || 2 || - || - || - || - || - || X || X || X || -
|-
| 38 || owl:maxQualifiedCardinality (2) || 4.01E-5 || 4.01E-5 || 289 || 1 || 1 || 1 || - || - || - || - || - || - || - || X∗ || -
|-
| 39 || owl:ReflexiveProperty (2) || 1.30E-5 || 1.24E-5 || 990 || 1 || 2 || 1 || - || - || - || - || - || X || X || - || X
|-
| 40 || owl:complementOf || 1.96E-6 || 6.28E-8 || 549,258 || 759 || 75 || 4 || - || - || - || X∗ || - || - || X∗ || X∗ || -
|-
| 41 || owl:differentFrom || 7.18E-7 || 6.81E-8 || 486,354 || 691 || 25 || 7 || - || - || - || X || - || X || - || X || X
|-
| 42 || owl:onDatatype || 2.72E-7 || 2.72E-7 || 70,414 || 2 || 1 || 1 || - || - || - || - || - || - || - || - || -
|-
| 43 || owl:disjointUnionOf (2) || 6.31E-8 || 4.28E-8 || 1,005,307 || 2 || 2 || 2 || - || - || - || - || - || - || - || - || -
|-
| 44 || owl:hasKey (2) || 3.67E-8 || 3.67E-8 || 1,336,720 || 1 || 1 || 1 || - || - || - || - || - || X || - || X || -
|-
| 45 || owl:propertyDisjointWith (2) || 2.43E-8 || 2.43E-8 || 3,911,874 || 4 || 1 || 1 || - || - || - || - || - || - || X || X || X
|-
| colspan="17" | Not Used: rdfs:ContainerMembershipProperty, owl:AllDisjointProperties (2), owl:Annotation (2), owl:AsymmetricProperty (2), owl:Axiom (2), owl:IrreflexiveProperty (2), owl:NegativePropertyAssertion (2), owl:datatypeComplementOf (2), owl:hasSelf (2)
|}

Regarding prevalence, we see from Table 2 that owl:sameAs is the most widely used axiom in terms of documents (1.778 million; 24%) and domains (117; 14.8%). Surprisingly (to us), RDF container membership properties (rdf:_*) are also heavily used (likely attributable to RSS 1.0 documents). Regarding prominence, we make the following observations:

# The top six features are those that form the core of RDFS [27].
# The RDF(S) declaration classes rdfs:Class, rdf:Property are used in fewer, but more prominent documents than OWL’s versions owl:Class, owl:DatatypeProperty, owl:ObjectProperty.
# The top eighteen features are expressible with a single RDF triple. The highest ranked primitive for which this is not the case is owl:unionOf in nineteenth position, which requires use of RDF collections (i.e., lists). Union classes are often specified as the domain or range of a given property: the most prominent such example is the SKOS vocabulary (the seventh highest ranked document) which specifies the range of the skos:member property as the union of skos:Concept and skos:Container as in (1) above.
# Of the features new to OWL 2, the most prominently used is owl:NamedIndividual in thirty-first position. Our crawl was conducted nineteen months after OWL 2 became a W3C Recommendation (Oct. 2009); by means of a quick scan of the max Pos column of Table 2, we note that new OWL 2 features have had little penetration in prominent Web vocabularies during that interim. Further, several OWL 2 features were not used at all in our corpus.
# owl:complementOf and owl:differentFrom are the least prominently used original OWL features.

In terms of profile support, we observe that RDFS has good catchment for a few of the most prominent features, but otherwise has poor coverage. Aside from syntactic/declaration features, from the top-20 features (which cover 98% of the total cumulative rank), L2 misses functional properties<sup>(pos=12)</sup>, disjoint classes<sup>(15)</sup>, inverse-functional properties<sup>(18)</sup> and union classes<sup>(19)</sup>. RDFS-Plus omits support for disjoint<sup>(15)</sup> and union classes<sup>(19)</sup>. DLP – as defined by Volz [36, §A] – has coverage of all such features, but does not support inverse-functional<sup>(18)</sup> datatype properties. pD* does not support disjoint<sup>(15)</sup> or union classes<sup>(19)</sup>.

Regarding the OWL profiles, OWL EL and OWL QL both omit support for important top-20 features. Neither includes functional<sup>(12)</sup> or inverse-functional properties<sup>(18)</sup>, or union classes<sup>(19)</sup>. OWL EL further omits support for inverse<sup>(14)</sup> and symmetric properties<sup>(20)</sup>. OWL QL does not support the prevalent same-as<sup>(16)</sup> feature. Conversely, OWL RL has much better coverage, albeit having only partial support for union classes<sup>(19)</sup>.

Summing up, we acknowledge that such a survey cannot give a universal or definitive indication of the most important OWL features for Linked Data. First, we only survey a limited sample of the Web of Data. Second, the future may (or may not) see radical changes in how OWL is used on the Web; e.g., OWL 2 terms may soon enjoy more adoption. Still, Table 2 offers insights into the extant trends of adoption and later informs the design of a new OWL profile tailored for the current Web of Data.

Figure 1: The sum of PageRank for each of the listed features from Table 2 shown in logarithmic scale on the vertical axis

====3.3 Survey of Datatype Use====
We now look at the use of datatypes on the Web of Data.

Aside from plain literals, the RDF semantics defines a single datatype supported under RDF-entailment: rdf:XMLLiteral [17]. However, the RDF semantics also defines D-entailment, which provides interpretations over a datatype map that gives a mapping from lexical datatype strings into a value space. The datatype map may also impose disjointness constraints within its value space. These interpretations allow for determining which lexical strings are valid for a datatype, which different lexical strings refer to the same value and which to different values, and which sets of datatype values are disjoint from each other. An XSD-datatype map is then defined that extends the set of supported datatypes into those defined for XML Schema (1.0), including types for boolean, numeric, temporal, string and other forms of literals. Datatypes that are deemed to be ambiguously defined (viz. xsd:duration) or specific to XML (e.g., xsd:QName), etc. are omitted.

The original OWL specification recommends use of a similar set of datatypes to that for D-entailment, where compliant reasoners are required to support xsd:string and xsd:integer. Furthermore, OWL allows for defining enumerated datatypes.

With the standardisation of OWL 2 came two new datatypes, namely owl:real and owl:rational, along with novel support for xsd:dateTimeStamp. However, XSD datatypes relating to date, time and Gregorian calendar values are not supported. OWL 2 also introduced mechanisms for defining new datatypes by restricting facets of legacy datatypes; however, from Table 2 we note that owl:onDatatype (used for facet restrictions) has only very few occurrences in our corpus.

Implementing the entire range of RDF, XSD and OWL datatypes can be costly [10], with custom code (or an external library) required to support each one. Thus, it is interesting to see which datatypes are most commonly used on the Web of Data.

In our corpus, we found 278 different datatype URIs assigned to literals. Of these, 158 came from the DBpedia exporter which models SI units, currencies, etc., as datatypes. Using analogous measures as before, Table 3 lists the top standard RDF(S), OWL and XSD datatypes as used to type literals in our corpus. We omit max-rank statistics for brevity, and omit plain literals which were used in 6.609 million documents (89%). D indicates the datatypes supported by D-entailment with the recommended XSD datatype map. O2 indicates the datatypes supported by OWL 2.

{| class="wikitable"
|+ Table 3: Survey of (std.) datatypes used on the Web of Data
! № !! Primitive !! Σ Rank !! Lit !! Doc !! Dom !! D !! O2
|-
| 1 || xsd:dateTime || 4.18E-2 || 2,919,518 || 1,092,048 || 68 || X || X
|-
| 2 || xsd:boolean || 2.37E-2 || 75,215 || 41,680 || 22 || X || X
|-
| 3 || xsd:integer || 1.97E-2 || 1,015,235 || 716,904 || 41 || X || X
|-
| 4 || xsd:string || 1.90E-2 || 1,629,224 || 475,397 || 76 || X || X
|-
| 5 || xsd:date || 1.82E-2 || 965,647 || 550,257 || 39 || X || -
|-
| 6 || xsd:long || 1.63E-2 || 1,143,351 || 357,723 || 6 || X || X
|-
| 7 || xsd:anyURI || 1.61E-2 || 1,407,283 || 339,731 || 16 || X || X
|-
| 8 || xsd:int || 1.52E-2 || 2,061,837 || 400,448 || 31 || X || X
|-
| 9 || xsd:float || 9.09E-3 || 671,613 || 341,156 || 21 || X || X
|-
| 10 || xsd:gYear || 4.63E-3 || 212,887 || 159,510 || 12 || X || -
|-
| 11 || xsd:nonNegativeInteger || 3.35E-3 || 9,230 || 10,926 || 26 || X || X
|-
| 12 || xsd:double || 2.00E-3 || 137,908 || 68,682 || 31 || X || X
|-
| 13 || xsd:decimal || 1.11E-3 || 43,747 || 13,179 || 9 || X || X
|-
| 14 || xsd:duration || 6.99E-4 || 28,541 || 28,299 || 4 || - || -
|-
| 15 || xsd:gMonthDay || 5.98E-4 || 34,492 || 20,886 || 3 || X || -
|-
| 16 || xsd:short || 5.71E-4 || 18,064 || 11,643 || 2 || X || X
|-
| 17 || rdf:XMLLiteral || 4.97E-4 || 1,580 || 791 || 11 || X || X
|-
| 18 || xsd:gMonth || 2.50E-4 || 2,250 || 1,132 || 3 || X || -
|-
| 19 || rdf:PlainLiteral || 1.34E-4 || 109 || 19 || 2 || - || X
|-
| 20 || xsd:gYearMonth || 8.49E-5 || 6,763 || 3,080 || 5 || X || -
|-
| 21 || xsd:positiveInteger || 5.11E-5 || 1,423 || 1,890 || 2 || X || X
|-
| 22 || xsd:gDay || 4.26E-5 || 2,234 || 1,117 || 1 || X || -
|-
| 23 || xsd:token || 3.56E-5 || 2,900 || 1,450 || 1 || X || X
|-
| 24 || xsd:unsignedByte || 2.62E-7 || 66 || 11 || 1 || X || X
|-
| 25 || xsd:byte || 2.60E-7 || 58 || 11 || 1 || X || X
|-
| 26 || xsd:time || 8.88E-8 || 23 || 4 || 3 || X || -
|-
| 27 || xsd:unsignedLong || 6.71E-8 || 6 || 1 || 1 || X || X
|-
| – || colspan="7" | other xsd/owl dts. not used
|}

We observe from the table that the top four standard datatypes are supported by both the traditional XSD datatype map and in OWL 2. However, OWL 2 does not support xsd:date<sup>(5)</sup> which is prominently featured in our corpus, and does not support Gregorian datatypes<sup>(10,15,18,20,22)</sup> nor xsd:time<sup>(26)</sup>. Despite not being supported by any standard entailment regime, xsd:duration<sup>(14)</sup> was used in 28 thousand documents across four different domains.

Conversely, various standard datatypes are not used at all in the data; e.g., xsd:dateTimeStamp, the “new” OWL datatypes, binary datatypes and various normalised-string/token datatypes.

===4. AVAILABLE TOOL SUPPORT===
Apart from understanding which OWL features are used in documents on the Web, it is also crucial to understand what tool support is available. We therefore now survey the availability of software that provides the necessary technical infrastructure for building complex applications, i.e., databases, reasoners and libraries.

As a baseline requirement, tools need to be able to read OWL documents and parse out axioms. Conformance with the OWL standard actually requires support for the RDF/XML serialisation as an input format [33]. Parsing OWL axioms from RDF triples is not an easy task, and requires processing joins since axioms can be composed of several RDF triples [29]. In addition, OWL axioms – such as owl:unionOf or owl:intersectionOf – may use arbitrary-length RDF lists, which require particular attention to validate and parse. Other features, such as type declarations and ontology imports, further complicate matters. Consequently, there are few compliant, stand-alone libraries for parsing OWL (relative to libraries for RDF). Aside from parsing, querying OWL axioms using the SPARQL standard is also non-trivial, especially considering axioms using arbitrary-length lists.<sup>3</sup>

Thus even before any actual reasoning takes place, multi-triple OWL axioms are inconvenient to serialise, publish, parse and query using standard RDF tools. Conversely, OWL axioms that are represented in a single RDF triple do not require the detection of complex triple patterns and can easily be processed in a triple-at-a-time manner with the RDF libraries and parsers that are available for many programming languages. The question of whether a feature can be expressed in a single triple or not may thus already have significant consequences for the practical cost of supporting it.

Databases are another important class of tools for building RDF applications and numerous commercial and non-commercial systems are available today. Many of these systems evaluate OWL features to improve query answering services. Table 4 provides an overview of such systems. We only include tools that have native support for at least rdfs:subClassOf and rdfs:subPropertyOf reasoning (excluding, e.g., 5store), are developed for production [...]

[...] to alleviate this, e.g., AllegroGraph advertises “dynamic materialisation” as a compromise. Backward chaining, in contrast, affects query answering performance but allows for easier updates. In the case of OWL QL (and RDFS), backward chaining can be performed using a form of query rewriting that depends only on schema information, and thus is likely to scale well. The tableau approach of PelletDb, on the other hand, is more demanding when used at query time but can support all features of OWL DL.

Summarising, among the listed systems, three systems work with the Direct Semantics of OWL (PelletDb, DLEJena and QuOnto), whereas the other systems are rule-based and work directly with RDF triples, usually via forward chaining. Thus, we conclude that an implementation via rules and compatibility also with the RDF-Based Semantics is an important criterion for comprehensive tool support. Surprisingly, only two thirds of the tools support owl:sameAs, which is one of the most popular features according to our survey. A possible explanation is that owl:sameAs blows up the size of the materialisation when using forward chaining, so efficient support requires special optimisations, as implemented, e.g., in OWLIM or Oracle 11g [22]. Although four systems (nearly) support OWL RL, the complexity of a fully compliant and efficient implementation is still considered high [22].

Regarding datatypes, many triple stores use internal canonicalisation of typed literals, but full datatype reasoning is only sparsely supported or documented; some tools such as OWLIM explicitly do not support datatype rules of OWL RL. Datatype support in several tools is, for example, surveyed by Emmons et al. [10].

===5. DEFINING THE OWL LD PROFILE===
In this section, we build upon our previous observations to suggest a simple OWL profile that is adequate for the current needs of the Web. In the previous sections, we have identified a number of key issues for OWL adoption on the Web:

1. Adequacy: features that are widely used on the Web should be included.

2. 
Implementability: features that are more challenging to pro- use (excluding prototypes such as YARS2 [16] and QueryPie [35]) cess and reason with should be avoided. and that are meant to be used with large amounts of instance data 3. Robustness: noisy and unreliable data should not prevent the (excluding OWL EL tools such as ELK [21]). The table lists the use of ontological data in reasoning. most frequently implemented features explicitly and describes pro- file support in a separate column. We additionally mention the main Comparing this to the design guidelines of RDFS-Plus [1], we inference strategy and the source of our information.4 can see that adequacy relates to “practicality” while implementabil- A number of tools support the (near-)complete OWL RL profile. ity subsumes to “computational feasibility.” We do not consider Jena with the “OWL mini” ruleset has an incomplete implementa- “pedagogism” as a design goal since we did not assess how intu- tion of OWL (1) DL features that can be viewed as an approxima- itive features are. In contrast, the work presented in Section 3 and 4 tion of OWL RL. PelletDb and QuOnto are reasoning layers on top provides us with a much better understanding for assessing imple- of a database with support for OWL DL and OWL QL, respectively. mentability and adequacy. Robustness has not been considered as DLEJena uses Pellet to perform TBox (schema) reasoning, where a design goal for RDFS-Plus while we find it to be of great impor- the resulting entailments and the OWL RL/RDF rules are used to tance for making sense of Web data. generate a set of ABox (instance) rules, which are then executed Each of the above requirements leads to a number of concrete as- using Jena’s RETE engine. pects. Adequacy has been discussed in Section 3 based on a sam- Contrasting with these fairly powerful implementations, we find ple of published ontologies. 
Looking at Table 2, we can see that a number of tools that support only a few selected semantic fea- many of the most frequently used features are of a simple struc- tures, including some that only support a fragment of RDFS. ture. In fact, owl:unionOf is the highest ranked feature that is not The reasoning algorithms that have been used are also important expressed by a single triple in RDF serialisations of OWL. in practice. Forward chaining (materialisation) often incurs sig- Implementability was discussed in Section 4. We observed that nificant penalties for data updates, although there are approaches parsing, processing and querying OWL axioms in the RDF-based 3 syntaxes (RDF/XML, N-Triples or Turtle) using widely available Property paths in SPARQL 1.1 make the task somewhat easier, RDF-based tools is easier when all axioms can be mapped to a sin- but checking that lists are well-formed is still challenging. 4 We note that it is difficult to verify whether the tools indeed hold gle triple in the RDF data-model. Moreover, inferencing is more what they claim, e.g., in practice one might find that the support is difficult for some features than for others, even in rule-based ap- not as complete as advertised. Nevertheless, we take each system’s proaches used commonly for OWL RL, e.g., support for list-based description as an indication of available support. (multi-triple) expressions that can be of arbitrary length [4]. 
sC sP ran dom sA tra sym inv iFP Profile Algorithm Source PelletDb X X X X X X X X X OWL DL tableau http://clarkparsia.com/pelletdb/ DLEJena X X X X X X X X X OWL RL tableau, forward chaining [24], http://lpis.csd.auth.gr/systems/DLEJena/ OWLIM X X X X X X X X X ∼ OWL RL forward chaining [4], http://www.ontotext.com/owlim Oracle 11g X X X X X X X X X OWL RL forward chaining [22], http://tinyurl.com/oracle-sw Jena OWL mini X X X X X X X X X ∼ OWL RL forward chaining http://openjena.org/inference/ Virtuoso X X - - X X X X X — backward chaining http://virtuoso.openlinksw.com/rdf-quad-store/ AllegroGraph X X X X X X - X - — forward chaining http://tinyurl.com/agraph-doc QuOnto X X X X - - X X X OWL QL query rewriting http://www.dis.uniroma1.it/quonto/ Jena RDFS X X X X - - - - - — forward chaining http://openjena.org/inference/ Sesame RDFS Sail X X X X - - - - - — forward chaining http://www.openrdf.org/ 4store with 4rs X X X X - - - - - — query rewriting http://4sreasoner.ecs.soton.ac.uk/ Table 4: RDF database systems with reasoning support (sC: rdfs:subClassOf; sP: rdfs:subPropertyOf; ran: rdfs:range; dom: rdfs:domain; sA: owl:sameAs; tra: owl:TransitiveProperty; sym: owl:SymmetricProperty; inv: owl:inverseOf; iFP: owl:InverseFunctionalProperty) Robustness requires a high tolerance against syntactic errors. can still apply entailment over the remainder of supported features. The RDF-Based Semantics has this feature and can always be ap- OWL LD is not intended to restrict vocabulary publishers in what plied, hence no special language design is needed. However, it is features they use (unless, of course, they are interested in the ben- also desirable to be able to apply the Direct Semantics to a fragment efits of DS-based reasoning). Instead, the terse OWL LD profile as it yields stronger completeness guarantees for reasoning. 
Even enables developers and researchers to focus directly on the inter- if RDF-Based entailments are desired, the completeness of DS rea- section of features that are (i) the most prominently used in Linked soning methods can be used to obtain similar guarantees for RS [6, Data, (ii) the most robust, and (iii) the easiest to implement. Theorem PR1]. This kind of robustness can be accomplished by re- Formally, we define OWL LD by restricting the OWL RL gram- ducing the use of features for which OWL DL imposes additional mar [6]. Roughly speaking, we remove all definitions and mentions requirements, in particular cardinalities and property chains. of productions listed as follows: Another aspect of robustness is tolerance to inconsistencies. This feature is generally available in OWL profiles that are not able to Datatype entailment: express truly disjunctive information. Due to this, all inconsis- DataRange, DataIntersectionOf, DatatypeDefinition tencies are directly related to an individual or literal upon which Boolean connectives & enums.: conflicting requirements are imposed (including the special case *OneOf, *IntersectionOf, *UnionOf, *ComplementOf of ill-typed literal values). Hence, it is easy to ignore (all ele- Restriction classes: ments involved in) inconsistencies and to continue reasoning on the *ValuesFrom, *HasValue, zeroOrOne, *Cardinality remaining consistent ontology to derive meaningful conclusions. Chains & keys: Any OWL profile (or subset thereof) has this feature. propertyExpressionChain, HasKey From these observations, we derive that it is a reasonable design Negative property assertions: guideline for an OWL profile to restrict to OWL axioms that are in sourceIndividual, target*, Negative*PropertyAssertion OWL RL and at the same time are expressed as single RDF triples. 
We further restrict the productions for DifferentIndividuals and This directly addresses implementability based on the above obser- Disjoint* to not use the list-based syntaxes. The full grammar vations together with the fact that OWL RL is now widely imple- can be found online [12]. All additional structural restrictions of mented (cf. Table 4). Adequacy is addressed since the most im- OWL DL are inherited from OWL RL. Note that all RL datatypes portant features identified above are both in RL and expressed in are supported as well, though implementers may use our study in single triples. Note that the coverage of additional, rarely used fea- Section 3 to select most relevant datatypes to support (the OWL tures like reflexive properties is not a concern from the viewpoint specification generally allows conforming tools to answer entail- of adequacy (which asks for coverage, not for exclusivity) and is ment questions with Unknown if a used feature is not supported). not difficult to implement in the restricted fragment either. Comparing OWL LD with earlier approaches, it is interesting Robustness for interpretation in DS (i.e., as a subset of OWL DL) to note that it can be viewed as a natural extension of languages is eased by the omission of property chains and (most) cardinal- like L2, RDFS-Plus, RDFS 3.0 as discussed in Section 2 and 3. In ities (note that functionality remains). Single-triple axioms are particular, RDFS 3.0 is already close to OWL LD which mainly also less prone to syntactic errors when represented in RDF. How- adds further OWL 2 constructs from OWL RL while only omitting ever, other restrictions of OWL DL regarding the need for declara- owl:AllDifferent as the list-based variant of owl:differentFrom. tions, the non-existence of inverse functional data properties, and This adds to our confidence that OWL LD is a natural OWL profile the restrictions on blank nodes are still relevant. 
We suggest to that can be motivated from a number of perspectives. develop canonical (and thus predictable) repair strategies for ad- dressing these issues – specifying this is left to future work. Im- portantly, robustness suggests that, similarly to OWL RL, arbitrary 6. REASONING IN OWL LD RDF graphs should be allowed when using RS for reasoning. To OWL LD falls into a syntactic subset of OWL DL and can be reconcile these issues, we first define a syntactic OWL LD profile processed by tools that implement DS entailment checking. On the as a subset of OWL RL (which in turn imposes the syntactic re- other hand, we can also restrict the OWL RL/RDF rules to obtain strictions of OWL DL) and we then suggest an RS-based extension a terse set of inference rules that yields sound but possibly incom- of this profile for reasoning with arbitrary OWL Full ontologies. plete entailment under RS; the full set is found in Table 5 at the end Crucially, if an ontology uses features (such as owl:unionOf) of the paper. These rules are applicable to any RDF graph allowing that do not fall under the remit of OWL LD, an RS-based reasoner us to robustly draw sound conclusions from Web data. The OWL LD ruleset comprises of rules of the form: One of the earliest comprehensive empirical studies of RDF Web B1 ∧ . . . ∧ Bn → H (0 ≤ n ≤ 3) data was presented by Ding et al. in 2005 [8]. They report about the prevalence of vocabulary terms in over 1.5 million RDF/XML Web where H is called the head and B1 ∧. . .∧ Bn is the body. A rule with documents, where the bulk of data was described using the Friend an empty body (e.g., the rule cls-thing) is simply a fact. Multiple of a Friend (FOAF) and Dublin Core (DC) ontologies. The work atoms in rule heads (e.g., eq-ref) denote conjunctions that could also focuses on characterising the structure and distributions of the raw be expressed using multiple rules with the same body. 
The datatype data rather than issues relating to semantics or to RDFS and OWL. rules are somewhat exceptional, however, and require custom logic Various works look at the syntactic profiles of OWL ontologies outside of a standard rule-engine. Moreover, some rules use false on the Web [2, 37, 7]. Bechhofer and Volz identify and categorise in the head to express that an inconsistency is to be derived. An OWL DL restrictions violated by a sample group of 201 OWL on- inconsistency-tolerant system could already be realised by simply tologies (all of which were found to be in OWL Full); these include not taking these conclusions into account for query answering. incorrect or missing typing of classes and properties, complex ob- Unlike OWL RL/RDF which encodes arbitrary-length lists in the ject properties (e.g., functional properties) declared to be transi- bodies of some of its rules, the bodies of OWL LD rules comprise tive, inverse-functional datatype properties, and so forth [2]. In a solely of a fixed set of (a maximum of three) ternary RDF atoms of later survey, Wang et al. study over 1,276 ontologies, where 924 the form T (s, p, o) where s, p, o ∈ C ∪ V. These restrictions sim- (72.4%) were identified as being in OWL Full, although they pro- plify the use of the OWL LD rules in a variety of tools. Excluding posed that 863 could be patched (93.4%) [37]. In a similar study, datatype support, since the rules can only derive triples that are built d’Aquin et al. found that while 81% of 22,200 RDF Web docu- from the set C of RDF constants that originally occur in the ontol- ments surveyed fell into OWL Full, from the features used, 95% ogy and ruleset, the number of entailments is bounded by |C|3 . This would fall under the expressivity of the lightweight AL(D) De- bound is tight, e.g., the rules entail all possible triples from the RDF scription Logic [7]. To summarise, these studies show that restric- graph owl:sameAs owl:sameAs a ; rdfs:domain owl:Thing . 
tions laid out in the OWL standard (specifically for the OWL Lite Optimisations for rule-based systems as explored in many works and OWL DL dialects) are not well-followed by Web ontologies, can be applied to implement the OWL LD inferencing efficiently. but that such ontologies are typically relatively inexpressive. These Systems can efficiently support datatypes by, e.g., only checking works re-enforce the need for our RS-based extension of OWL LD. entailments as needed, or using canonicalisation techniques, etc. More recent papers focus on analysing owl:sameAs adoption We are now left to describe the relationship between DS and RS on the Web of Data [9, 14]. Ding et al. provide a quantitative for the OWL LD profile. analysis of the owl:sameAs graph extracted from the BTC-2010 dataset (the ancestor of our corpus) [9], summarising the use of Theorem 1. Let R contain the OWL LD entailment rules (Ta- owl:sameAs to link between different publishers of Linked Data. ble 5) and let O1 and O2 be OWL 2 ontologies that satisfy the In a similar vein, Halpin et al. [14] focus on the incorrect use OWL LD grammar and the following properties: of owl:sameAs; they employ four human judges to manually in- 1. neither O1 nor O2 contains an IRI that is used for more than spect 500 such links sampled from Web data, where their results one type of entity (i.e., no IRI is used both as, say, a class and suggest that owl:sameAs is often used imprecisely, although dis- an individual); agreement between the judges indicates that the quality of specific 2. O1 does not contain SubAnnotationPropertyOf, Anno- owl:sameAs links can be subjective. Such surveys indicate that tationPropertyDomain or AnnotationPropertyRange; reasoners must proceed cautiously when operating over Web data. 3. each axiom in O2 is an assertion of the form as specified below, for a, a1 , and a2 named individuals: 8. 
CONCLUSION (a) ClassAssertion(C a) where C is a class, We have presented a comprehensive analysis of the current use (b) ObjectPropertyAssertion(OP a1 a2 ) where OP is of OWL on the Web based on a large sample of RDF/XML docu- an object property, ments. We confirmed that OWL has indeed “arrived” on the Web (c) DataPropertyAssertion(DP a1 a2 ) where DP is a of Data, albeit to varying degrees for different features. data property, or Following Linked Data principles, we used a PageRank algo- (d) SameIndividual(a1 a2 ). rithm to assess the importance of individual documents, OWL fea- tures, and datatypes. Our results show that single-triple expressible Furthermore, let RDF(O1 ) and RDF(O2 ) be translations of O1 and OWL RL axioms are most prominent on the Web. A survey of tools O2 , respectively, into RDF graphs [29]; and let FO(RDF(O1 )) and confirms that these features tend to receive better support. FO(RDF(O2 )) be the translation of these graphs into first-order Based on these observations, we defined the OWL LD profile as theories in which triples are represented using the T predicate. a sub-language of OWL RL and provided a rule-based reasoning Then, O1 entails O2 under the OWL 2 Direct Semantics [25] iff calculus for it. Though motivated by a new analysis of the current FO(RDF(O1 )) ∪ R entails FO(RDF(O2 )) under the standard first- Web of Data, OWL LD also aligns closely with the earlier propos- order semantics. als of RDFS-Plus and L2, indicating that it is a natural profile that can be motivated from various perspectives. We argue that this is The proof of the Correspondence Theorem below follows imme- due to the syntactic restriction of OWL features to those that can diately from the according theorem for OWL RL [6, Theorem PR1] be expressed using single RDF triples, which reveals exactly the together with the fact that OWL LD is a restriction of OWL RL. 
cases where OWL expressions are fully aligned with, and most in- Like in the case of OWL RL, this result applies only to checking tuitively expressed in, the RDF data model. We argue that this bears the entailment of basic facts, not of OWL axioms in general. crucial advantages regarding not only tool support, but also usabil- ity. We therefore believe that, even if OWL as a whole might never 7. RELATED WORK arrive on the Web of Data, the OWL LD profile is a natural fit for Here we discuss related studies on the use of the RDFS and OWL modelling Linked Data vocabularies. In fact, as we have shown, on the Web (related OWL profiles have been covered in Section 2). OWL LD is already widely used. 9. REFERENCES thesis, DERI, NUIG, 2011. [21] Y. Kazakov, M. Krötzsch, and F. Simančík. Concurrent [1] D. Allemang and J. A. Hendler. Semantic Web for the classification of EL ontologies. In Proc. 10th Int. Semantic Working Ontologist: Effective Modeling in RDFS and OWL. Web Conf. (ISWC’11). Springer, 2011. Morgan Kaufmann/Elsevier, 2008. [22] V. Kolovski, Z. Wu, and G. Eadon. Optimizing [2] S. Bechhofer and R. Volz. Patching syntax in OWL enterprise-scale OWL 2 RL reasoning in a relational ontologies. In Proc. 3rd Int. Semantic Web Conf. (ISWC’04), database system. In Proc. 9th Int. Semantic Web Conf. pages 668–682. Springer, 2004. (ISWC’10), pages 436–452. Springer, 2010. [3] T. Berners-Lee. Linked Data. W3C Design Issues, July 2006. [23] M. Krötzsch. Efficient rule-based inferencing for OWL EL. [4] B. Bishop and S. Bojanov. Implementing OWL 2 RL and In Proc. 22nd Int. Conf. on Artificial Intelligence (IJCAI’11), OWL 2 QL rule-sets for OWLIM. In Proc. OWLED 2011 pages 2668–2673, 2011. Workshop on OWL: Experiences and Directions, 2011. [24] G. Meditskos and N. Bassiliades. DLEJena: A practical [5] D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, and forward-chaining OWL 2 RL reasoner combining Jena and R. Rosati. Tractable reasoning and efficient query answering Pellet. J. 
of Web Semantics, 8(1):89–94, 2010. in description logics: The DL-Lite family. J. of Automated [25] B. Motik, P. F. Patel-Schneider, and B. Cuenca Grau. OWL 2 Reasoning, 39(3):385–429, 2007. Web Ontology Language: Direct Semantics. W3C [6] B. Cuenca Grau, B. Motik, Z. Wu, A. Fokoue, and C. Lutz. Recommendation, Oct. 2009. OWL 2 Web Ontology Language: Profiles. W3C [26] B. Motik, P. F. Patel-Schneider, and B. Parsia. OWL 2 Web Recommendation, Oct. 2009. Ontology Language: Structural Specification and [7] M. d’Aquin, C. Baldassarre, L. Gridinoc, S. Angeletou, Functional-Style Syntax. W3C Recommendation, Oct. 2009. M. Sabou, and E. Motta. Characterizing knowledge on the [27] S. Muñoz, J. Pérez, and C. Gutierrez. Simple and efficient Semantic Web with Watson. In Proc. 5th Int. Workshop on minimal RDFS. J. of Web Semantics, 7(3):220–234, 2009. Evaluation of Ontologies and Ontology-based Tools, pages [28] L. Page, S. Brin, R. Motwani, and T. Winograd. The 1–10, 2007. PageRank Citation Ranking: Bringing Order to the Web. [8] L. Ding and T. Finin. Characterizing the semantic web on the Technical report, Stanford, 1998. web. In Proc. 5th Int. Semantic Web Conf. (ISWC’06), pages [29] P. F. Patel-Schneider, B. Motik, B. Cuenca Grau, I. Horrocks, 242–257. Springer, 2006. B. Parsia, A. Ruttenberg, and M. Schneider. OWL 2 Web [9] L. Ding, J. Shinavier, Z. Shangguan, and D. L. McGuinness. Ontology Language: Mapping to RDF Graphs. W3C SameAs networks and beyond: Analyzing deployment status Recommendation, Oct. 2009. and implications of owl:sameAs in linked data. In Proc. 9th [30] H. Pérez-Urbina, B. Motik, and I. Horrocks. Tractable query Int. Semantic Web Conf. (ISWC’10), pages 145–160. answering and rewriting under description logic constraints. Springer, 2010. J. of Applied Logic, 8(2):151–232, 2009. [10] I. Emmons, S. Collier, M. Garlapati, and M. Dean. RDF [31] M. Schneider and G. Sutcliffe. Reasoning in the OWL 2 Full literal data types in practice. In SSWS 2011, 2011. 
ontology language using first-order automated theorem [11] F. Fischer, G. Ünel, B. Bishop, and D. Fensel. Towards a proving. In Proc. 23rd Int. Conf. on Automated Deduction scalable, pragmatic knowledge representation language for (CADE-23), pages 461–475. Springer, 2011. the web. In Ershov Memorial Conf., pages 124–134, 2009. [32] F. Simančík, Y. Kazakov, and I. Horrocks. [12] B. Glimm, A. Hogan, M. Krötzsch, and A. Polleres. OWL Consequence-based reasoning beyond Horn ontologies. In LD Entailment Ruleset and Implementational Notes, Nov. Proc. 22nd Int. Conf. on Artificial Intelligence (IJCAI’11), 2011. http://www.semanticweb.org/OWLLD/. pages 1093–1098, 2011. [13] B. Grosof, I. Horrocks, R. Volz, and S. Decker. Description [33] M. Smith, I. Horrocks, M. Krötzsch, and B. Glimm. OWL 2 logic programs: Combining logic programs with description Web Ontology Language: Conformance. W3C logic. In World Wide Web, 2004. Recommendation, Oct. 2009. [14] H. Halpin, P. J. Hayes, J. P. McCusker, D. L. McGuinness, [34] H. J. ter Horst. Completeness, decidability and complexity of and H. S. Thompson. When owl:sameAs isn’t the same: An entailment for RDF Schema and a semantic extension analysis of identity in linked data. In Proc. 9th Int. Semantic involving the OWL vocabulary. J. of Web Semantics, 3, 2005. Web Conf. (ISWC’10), pages 305–320. Springer, 2010. [35] J. Urbani, F. van Harmelen, S. Schlobach, and H. Bal. [15] A. Harth, S. Kinsella, and S. Decker. Using naming authority QueryPIE: Backward reasoning for OWL Horst over very to rank data and ontologies for web search. In Proc. 8th Int. large knowledge bases. In Proc. 10th Int. Semantic Web Semantic Web Conf. (ISWC’09), pages 277–292, 2009. Conf. (ISWC’11). Springer, 2011. [16] A. Harth, J. Umbrich, A. Hogan, and S. Decker. YARS2: A [36] R. Volz. Web Ontology Reasoning with Logic Databases. federated repository for querying graph structured data from PhD thesis, Universität Karlsruhe, 2004. the Web. In Proc. 6th Int. 
Semantic Web Conf. (ISWC’07), [37] T. D. Wang, B. Parsia, and J. A. Hendler. A survey of the pages 211–224. Springer, 2007. web ontology landscape. In Proc. 5th Int. Semantic Web [17] P. Hayes. RDF Semantics. W3C Recommendation, Feb. Conf. (ISWC’06), pages 682–694. Springer, 2006. 2004. [18] T. Heath and C. Bizer. Linked Data: Evolving the Web into a Global Data Space (1st Edition). Morgan & Claypool, 2011. [19] J. A. Hendler. RDFS 3.0. In W3C Workshop on RDF Next Steps, June 2010. [20] A. Hogan. Exploiting RDFS and OWL for Integrating Heterogeneous, Large-Scale, Linked Data Corpora. PhD ID Body Head eq-ref ?s ?p ?o . ?s owl:sameAs ?s . ?p owl:sameAs ?p . ?o owl:sameAs ?o . eq-sym ?x owl:sameAs ?y . ?y owl:sameAs ?x . eq-trans ?x owl:sameAs ?y . ?y owl:sameAs ?z . ?x owl:sameAs ?z . Equality eq-rep-s ?s owl:sameAs ?s 0 . ?s ?p ?o . ?s 0 ?p ?o . eq-rep-p ?p owl:sameAs ?p0 . ?s ?p ?o . ?s ?p0 ?o . eq-rep-o ?o owl:sameAs ?o0 . ?s ?p ?o . ?s ?p ?o0 . eq-diff1 ?x owl:sameAs ?y . ?x owl:differentFrom ?y . false prp-ap (for each core annotation property ?p) ?p a owl:AnnotationProperty . prp-dom ?p rdfs:domain ?c . ?x ?p ?y . ?x a ?c . prp-rng ?p rdfs:range ?c . ?x ?p ?y . ?y a ?c . prp-fp ?p a owl:FunctionalProperty . ?x ?p ?y 1 . ?x ?p ?y 2 . ?y 1 owl:sameAs ?y 2 . prp-ifp ?p a owl:InverseFunctionalProperty . ?x 1 ?p ?y . ?x 2 ?p ?y . ?x 1 owl:sameAs ?x 2 . Property Axioms prp-irp ?p a owl:IrreflexiveProperty . ?x ?p ?x . false prp-symp ?p a owl:SymmetricProperty . ?x ?p ?y . ?y ?p ?x . prp-asyp ?p a owl:AsymmetricProperty . ?x ?p ?y . ?y ?p ?x . false prp-trp ?p a owl:TransitiveProperty . ?x ?p ?y . ?y ?p ?z . ?x ?p ?z . prp-spo1 ?p1 rdfs:subPropertyOf ?p2 . ?x ?p1 ?y . ?x ?p2 ?y . prp-eqp1 ?p1 owl:equivalentProperty ?p2 . ?x ?p1 ?y . ?x ?p2 ?y . prp-eqp2 ?p1 owl:equivalentProperty ?p2 . ?x ?p2 ?y . ?x ?p1 ?y . prp-pdw ?p1 owl:propertyDisjointWith ?p2 . ?x ?p1 ?y . ?x ?p2 ?y . false prp-inv1 ?p1 owl:inverseOf ?p2 . ?x ?p1 ?y . ?y ?p2 ?x . 
prp-inv2 ?p1 owl:inverseOf ?p2 . ?x ?p2 ?y . ?y ?p1 ?x . cls-thing — owl:Thing a owl:Class . Classes cls-nothing — owl:Nothing a owl:Class . cls-nothing2 ?x a owl:Nothing . false cax-sco ?c 1 rdfs:subClassOf ?c 2 . ?x a ?c 1 . ?x a ?c 2 . Class Ax. cax-eqc1 ?c 1 owl:equivalentClass ?c 2 . ?x a ?c 1 . ?x a ?c 2 . cax-eqc2 ?c 1 owl:equivalentClass ?c 2 . ?x a ?c 2 . ?x a ?c 1 . cax-dw ?c 1 owl:disjointWith ?c 2 . ?x a ?c 1 , ?c 2 . false dt-type1 (for each supported datatype ?dt) ?dt a rdfs:Datatype . Datatypes dt-type2 (for each literal ?lt in the value space of datatype ?dt) ?lt a ?dt . dt-eq (for all ?lt1 and ?lt2 with the same data value) ?lt1 owl:sameAs ?lt2 . dt-diff (for all ?lt1 and ?lt2 with different data values) ?lt1 owl:differentFrom ?lt2 . dt-not-type ?lt a ?dt . (where ?lt is not in the value space of ?dt) false ?c rdfs:subClassOf ?c . ?c rdfs:subClassOf owl:Thing . scm-cls ?c a owl:Class . ?c owl:equivalentClass ?c . owl:Nothing rdfs:subClassOf ?c . scm-sco ?c 1 rdfs:subClassOf ?c 2 . ?c 2 rdfs:subClassOf ?c 3 . ?c 1 rdfs:subClassOf ?c 3 . scm-eqc1 ?c 1 owl:equivalentClass ?c 2 . ?c 1 rdfs:subClassOf ?c 2 . ?c 2 rdfs:subClassOf ?c 1 . Schema Vocabulary scm-eqc2 ?c 1 rdfs:subClassOf ?c 2 . ?c 2 rdfs:subClassOf ?c 1 . ?c 1 owl:equivalentClass ?c 2 . scm-op ?p a owl:ObjectProperty . ?p rdfs:subPropertyOf ?p . ?p owl:equivalentProperty ?p . scm-dp ?p a owl:DatatypeProperty . ?p rdfs:subPropertyOf ?p . ?p owl:equivalentProperty ?p . scm-spo ?p1 rdfs:subPropertyOf ?p2 . ?p2 rdfs:subPropertyOf ?p3 . ?p1 rdfs:subPropertyOf ?p3 . scm-eqp1 ?p1 owl:equivalentProperty ?p2 . ?p1 rdfs:subPropertyOf ?p2 . ?p2 rdfs:subPropertyOf ?p1 . scm-eqp2 ?p1 rdfs:subPropertyOf ?p2 . ?p2 rdfs:subPropertyOf ?p1 . ?p1 owl:equivalentProperty ?p2 . scm-dom1 ?p rdfs:domain ?c 1 . ?c 1 rdfs:subClassOf ?c 2 . ?p rdfs:domain ?c 2 . scm-dom2 ?p2 rdfs:domain ?c . ?p1 rdfs:subPropertyOf ?p2 . ?p1 rdfs:domain ?c . scm-rng1 ?p rdfs:range ?c 1 . ?c 1 rdfs:subClassOf ?c 2 . ?p rdfs:range ?c 2 . 
scm-rng2 ?p2 rdfs:range ?c . ?p1 rdfs:subPropertyOf ?p2 . ?p1 rdfs:range ?c . Table 5: The OWL LD ruleset in Turtle/N3 style syntax where false in the head denotes inconsistency
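To illustrate how single-triple rule bodies of the kind listed in Table 5 can be evaluated with very little machinery, the following is a minimal sketch of naive forward chaining in Python. It is not the authors' implementation: the names `RULES`, `match`, `bindings`, `substitute` and `materialise`, the `ex:` identifiers, and the choice of rules are assumptions made for illustration. Only a handful of rules are encoded, the datatype rules are omitted, and no optimisation is attempted. Rules with false in the head are recorded as violations rather than stopping the computation, in the spirit of the inconsistency-tolerant behaviour discussed in Section 6.

```python
# Naive forward chaining over a small, hand-picked subset of the Table 5 rules.
# Triples are 3-tuples of strings; terms starting with "?" are variables and
# "a" stands for rdf:type. A head of None encodes "false" (an inconsistency).
RULES = {
    "prp-dom": ([("?p", "rdfs:domain", "?c"), ("?x", "?p", "?y")],
                [("?x", "a", "?c")]),
    "prp-rng": ([("?p", "rdfs:range", "?c"), ("?x", "?p", "?y")],
                [("?y", "a", "?c")]),
    "prp-symp": ([("?p", "a", "owl:SymmetricProperty"), ("?x", "?p", "?y")],
                 [("?y", "?p", "?x")]),
    "prp-trp": ([("?p", "a", "owl:TransitiveProperty"),
                 ("?x", "?p", "?y"), ("?y", "?p", "?z")],
                [("?x", "?p", "?z")]),
    "cax-sco": ([("?c1", "rdfs:subClassOf", "?c2"), ("?x", "a", "?c1")],
                [("?x", "a", "?c2")]),
    "cax-dw": ([("?c1", "owl:disjointWith", "?c2"),
                ("?x", "a", "?c1"), ("?x", "a", "?c2")],
               None),
}


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None  # variable already bound to a different constant
        elif term != value:
            return None  # constant in the pattern does not match
    return new


def bindings(body, graph):
    """Return all variable bindings that satisfy every atom in `body`."""
    results = [{}]
    for pattern in body:
        results = [b2 for b in results for t in graph
                   if (b2 := match(pattern, t, b)) is not None]
    return results


def substitute(pattern, binding):
    """Instantiate a head atom under a binding."""
    return tuple(binding.get(term, term) for term in pattern)


def materialise(triples):
    """Apply RULES to a fixpoint; return (closure, violations)."""
    graph = set(triples)
    violations = set()
    changed = True
    while changed:
        changed = False
        for name, (body, head) in RULES.items():
            for b in bindings(body, graph):
                if head is None:
                    # Record the inconsistency but keep reasoning.
                    violations.add((name, tuple(sorted(b.items()))))
                    continue
                for atom in head:
                    derived = substitute(atom, b)
                    if derived not in graph:
                        graph.add(derived)
                        changed = True
    return graph, violations


# Hypothetical example data.
data = {
    ("ex:knows", "a", "owl:SymmetricProperty"),
    ("ex:knows", "rdfs:domain", "ex:Person"),
    ("ex:Person", "rdfs:subClassOf", "ex:Agent"),
    ("ex:alice", "ex:knows", "ex:bob"),
}
closure, violations = materialise(data)
print(("ex:bob", "a", "ex:Agent") in closure, len(violations))
```

Because every body atom is a plain triple pattern, the matcher needs no list handling at all; the loop terminates since only triples over constants already present in the input and ruleset can be derived.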