Introducing Customised Datatypes and
Datatype Predicates into OWL(∗)
Jeff Z. Pan and Ian Horrocks
School of Computer Science, University of Manchester, UK
Abstract. Although OWL is rather expressive, it has a very serious limitation on
datatypes; i.e., it does not support customised datatypes. It has been pointed out
that many potential users will not adopt OWL unless this limitation is overcome,
and the W3C Semantic Web Best Practices and Deployment Working Group
has set up a task force to address this issue. This paper provides a solution for
this issue by presenting two decidable datatype extensions of OWL DL, namely
OWL-Eu and OWL-E. OWL-Eu provides a minimal extension of OWL DL to
support customised datatypes, while OWL-E extends OWL DL with both cus-
tomised datatypes and customised datatype predicates.
1 Introduction
The OWL Web Ontology Language [1] is a W3C recommendation for expressing on-
tologies in the Semantic Web. Datatype support [7, 8] is one of the key features that
OWL is expected to provide, and has prompted extensive discussions in the RDF-Logic
mailing list [10] and in the Semantic Web Best Practices mailing list [12]. Although
OWL adds considerable expressive power to the Semantic Web, the OWL datatype for-
malism (or simply OWL datatyping) is much too weak for many applications; in partic-
ular, OWL datatyping does not provide a general framework for customised datatypes,
such as XML Schema derived datatypes.
It has been pointed out that many potential users will not adopt OWL unless this
limitation is overcome [11], as it is often necessary to enable users to define their own
datatypes and datatype predicates for their ontologies and applications. One of the most
well known type systems is W3C XML Schema Part 2 [2], which defines facilities to
allow users to define customised datatypes, such as those defined by imposing some
restrictions in the value spaces of existing datatypes.
Example 1. Customised datatypes are useful in capturing the intended meaning of some
vocabulary in ontologies. For example, users might want to use the customised datatype
‘atLeast18’ in the following definition of the class ‘Adult’:
Class(Adult complete Person
restriction(age allValuesFrom(atLeast18))),
which says that an Adult is a Person whose age is at least 18. The datatype constraint
‘at least 18’ can be defined as an XML Schema user-defined datatype
in which the facet ‘minInclusive’ is used to restrict the value space of atLeast18 (a
customised datatype) to be a subset of the value space of integer (an XML Schema
built-in datatype).
(∗) This work is partially supported by the FP6 Network of Excellence EU project Knowledge
Web (IST-2004-507842).
User-defined datatypes such as the one above cannot, however, be used in OWL
datatyping, which supports only some built-in XML Schema datatypes and
enumerated datatypes, the latter being defined by explicitly specifying their instances.
OWL datatyping does not support XML Schema customised datatypes for the following
two reasons: (i) XML Schema does not provide a standard way to access a user-defined
datatype; (ii) OWL DL does not provide a mechanism to guarantee the computability
of the kinds of customised datatypes it supports.
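The restriction used for atLeast18 in Example 1 can be illustrated with a minimal Python sketch. It is not part of OWL or XML Schema tooling: the names `Datatype`, `INTEGER` and `restrict_min_inclusive` are hypothetical, and a value space is modelled simply as a membership test.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Datatype:
    """A datatype modelled only by its name and a value-space membership test."""
    name: str
    contains: Callable[[object], bool]


# Stand-in for the XML Schema built-in datatype integer (bool is excluded on purpose).
INTEGER = Datatype("integer", lambda v: isinstance(v, int) and not isinstance(v, bool))


def restrict_min_inclusive(base: Datatype, name: str, bound: int) -> Datatype:
    """Derive a datatype whose value space is the part of base's value space >= bound."""
    return Datatype(name, lambda v: base.contains(v) and v >= bound)


# Hypothetical counterpart of the customised datatype atLeast18.
AT_LEAST_18 = restrict_min_inclusive(INTEGER, "atLeast18", 18)

print(AT_LEAST_18.contains(18))  # True
print(AT_LEAST_18.contains(17))  # False
```

By construction, every value accepted by `AT_LEAST_18` is also accepted by `INTEGER`, which mirrors the subset relation between the two value spaces.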
This paper provides a solution for this issue by presenting two decidable datatype
extensions of OWL DL, namely OWL-Eu and OWL-E. OWL-Eu provides a minimal
extension of OWL DL to support customised datatypes, while OWL-E extends OWL
DL with both customised datatypes and customised datatype predicates. The rest of the
paper is organised as follows: Section 2 further discusses the motivations of introducing
customised datatypes and datatype predicates. Section 3 extends the OWL datatyping
to unary datatype groups, which enables the use of customised datatypes. Sections 4
and 5 present the OWL-Eu and the OWL-E languages, respectively; the latter is
based on datatype groups, which are general forms of unary datatype groups. Section 6
concludes the paper and suggests some future work.
2 Motivations
Allowing users to define their own vocabulary is one of the most useful features that
ontologies offer over other approaches to providing semantics in the Semantic Web,
such as the Dublin Core. In the Dublin Core standard, the meaning of the set
of 15 information properties is described in English text. The main drawback of the
Dublin Core is its inflexibility; it is impossible to ‘predefine’ information properties for
all sorts of applications.
Ontologies, however, are more flexible in that users can define their own vocabulary
based on existing vocabularies. In ontology languages, a set of class constructors
is usually provided so that users can build class expressions based on, for example,
existing class names. The intended meaning of the vocabulary, therefore, can be captured
by the axioms in the ontologies. Let us revisit Example 1 and consider the intended
meaning of the Adult class. According to its definition, an Adult is a Person who is at
least 18 years old. As a result, programs can also understand the meaning of customised
vocabulary, with the help of ontologies.
Although OWL DL provides a set of expressive class constructors to build cus-
tomised classes, it does not provide enough expressive power to support, for example,
XML Schema customised datatypes. In order to capture the intended meaning of Adult,
Example 1 has already shown the necessity of customised datatypes. In what follows,
we give some more examples to illustrate the usefulness of customised datatypes and
datatype predicates in various SW and ontology applications.
Example 2. Semantic Web Service: Matchmaking
Matchmaking is a process that takes a service requirement and a group of service
advertisements as input, and returns all the advertisements that may potentially satisfy
the requirement. In a computer sales ontology, a service requirement may ask for a PC
with memory size greater than 512Mb, unit price less than 700 pounds and delivery
date earlier than 15/03/2004.
Here ‘greater than 512’, ‘less than 700’ and ‘earlier than 15/03/2004’ are customised
datatypes of base datatypes integer, integer and date, respectively.
Example 3. Electronic Commerce: A ‘No Shipping Fee’ Rule
Electronic shops may need to classify items according to their sizes, and to reason
that an item for which the sum of height, length and width is no greater than 15cm
belongs to a class in their ontology, called ‘small-items’. Then they can have a rule
saying that for ‘small-items’ no shipping costs are charged. Accordingly, the billing
system will charge no shipping fees for all the instances of the ‘small-items’ class.
Here ‘greater than 15’ is a customised datatype, ‘sum’ is a datatype predicate, while
‘sum no greater than 15’ is a customised datatype predicate.
3 Unary Datatype Groups
The OWL datatyping is defined based on the notion of datatype maps [9]. A datatype
map is a partial mapping from supported datatype URIrefs to datatypes. In this section,
we introduce unary datatype groups, which extend the OWL datatyping with a hierarchy
of supported datatypes.
Definition 1 A unary datatype group G is a triple (Md, B, dom), where Md is the
datatype map of G, B is the set of primitive base datatype URI references in G and
dom is the declared domain function. We call S the set of supported datatype URI
references of G, i.e., for each u ∈ S, Md(u) is defined; we require B ⊆ S. We assume that
there exist unary datatype URI references rdfs:Literal, owlx:DatatypeBottom ∉ S.
The declared domain function dom has the following properties: for each u ∈ S, if
u ∈ B, dom(u) = u; otherwise, dom(u) = v, where v ∈ B.
Definition 1 ensures that all the primitive base datatype URIrefs of G are supported
(B ⊆ S) and that each supported datatype URIref relates to a primitive base datatype
URIref through the declared domain function dom.
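The structural conditions of Definition 1 can be checked mechanically. The following Python sketch assumes a dictionary-based encoding of (Md, B, dom); the function name `check_unary_datatype_group` and the sample values of `base1` and `dom1` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Set


def check_unary_datatype_group(md: Dict[str, str], base: Set[str], dom: Dict[str, str]) -> bool:
    """Check the structural conditions of Definition 1 for a candidate (Md, B, dom)."""
    supported = set(md)                       # S: the URIrefs on which Md is defined
    if not base <= supported:                 # B must be a subset of S
        return False
    if {"rdfs:Literal", "owlx:DatatypeBottom"} & supported:
        return False                          # these two URIrefs must lie outside S
    for u in supported:
        if u not in dom:
            return False                      # dom must be defined on all of S
        if u in base and dom[u] != u:
            return False                      # a primitive base datatype is its own domain
        if u not in base and dom[u] not in base:
            return False                      # other datatypes map to a primitive base datatype
    return True


# Hypothetical sample group; datatypes are represented by plain labels.
md1 = {"xsd:integer": "integer", "xsd:string": "string", "xsd:nonNegativeInteger": ">=0"}
base1 = {"xsd:integer", "xsd:string"}
dom1 = {
    "xsd:integer": "xsd:integer",
    "xsd:string": "xsd:string",
    "xsd:nonNegativeInteger": "xsd:integer",
}
print(check_unary_datatype_group(md1, base1, dom1))  # True
```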
Example 4. G1 = (Md1, B1, dom1) is a unary datatype group, where
– Md1 = {xsd:integer ↦ integer, xsd:string ↦ string, xsd:nonNegativeInteger
↦ ≥0, xsdx:integerLessThanN ↦
can be represented by the following disjunctive expression
or(
and(xsd:nonNegativeInteger, xsdx:integerLessThan100000)
oneOf(“low”^^xsd:string, “medium”^^xsd:string, “expensive”^^xsd:string)
).
Note that “low”^^xsd:string is a typed literal, which represents a value of the
xsd:string datatype. “low”, instead, is a plain literal, where no datatype
information is provided. ♦
We now define the interpretation of a unary datatype group.
Abstract Syntax        DL Syntax        Semantics
a datatype URIref u    u                u^D
oneOf(l1, …, ln)       {l1, …, ln}      {l1^D} ∪ … ∪ {ln^D}
not(u)                 ū                (dom(u))^D \ u^D   if u ∈ S \ B
                                        ∆^D \ u^D          otherwise
and(E1, …, En)         E1 ∧ … ∧ En      E1^D ∩ … ∩ En^D
or(E1, …, En)          E1 ∨ … ∨ En      E1^D ∪ … ∪ En^D
Table 1. Syntax and semantics of datatype expressions (OWL-Eu data ranges)
Definition 3 A datatype interpretation I_D of a unary datatype group G =
(Md, B, dom) is a pair (∆^D, ·^D), where ∆^D (the datatype domain) is a non-empty
set and ·^D is a datatype interpretation function, which has to satisfy the following
conditions:
1. (rdfs:Literal)^D = ∆^D and (owlx:DatatypeBottom)^D = ∅;
2. for each plain literal l, l^D = l ∈ PL and PL ⊆ ∆^D (PL is the value space for plain
literals);
3. for any two distinct primitive base datatype URIrefs u1, u2 ∈ B: u1^D ∩ u2^D = ∅;
4. for each supported datatype URIref u ∈ S, where d = Md(u):
(a) u^D = V(d) ⊆ ∆^D, L(u) ⊆ L(dom(u)) and L2V(u) ⊆ L2V(dom(u));
(b) if s ∈ L(d), then (“s”^^u)^D = L2V(d)(s); otherwise, (“s”^^u)^D is not defined;
5. ∀u ∉ S, u^D ⊆ ∆^D, and “v”^^u ∈ u^D.
Moreover, we extend ·^D to G unary datatype expressions as shown in Table 1.
Let E be a G unary datatype expression; the negation of E is of the form
¬E, which is interpreted as ∆^D \ E^D.
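To make the semantics of Table 1 concrete, the following Python sketch evaluates datatype expressions over a small finite datatype domain. The tuple encoding of expressions, the function `evaluate`, the URIref xsdx:integerLessThan3 and all sample extensions are assumptions made for illustration; real value spaces such as that of integer are infinite, and the dictionary `ext` is assumed to contain supported URIrefs only.

```python
from typing import Dict, FrozenSet, Set


def evaluate(expr, ext: Dict[str, FrozenSet], dom: Dict[str, str],
             base: Set[str], domain: FrozenSet) -> FrozenSet:
    """Interpret a datatype expression following the rows of Table 1."""
    op = expr[0]
    if op == "uri":                        # a datatype URIref u is interpreted as u^D
        return ext[expr[1]]
    if op == "oneOf":                      # enumerated datatype
        return frozenset(expr[1])
    if op == "not":                        # relativised negation of a URIref
        u = expr[1]
        if u not in base:                  # u in S \ B: complement within dom(u)
            return ext[dom[u]] - ext[u]
        return domain - ext[u]             # u in B: complement within the whole domain
    if op == "and":
        result = domain
        for sub in expr[1]:
            result &= evaluate(sub, ext, dom, base, domain)
        return result
    if op == "or":
        result = frozenset()
        for sub in expr[1]:
            result |= evaluate(sub, ext, dom, base, domain)
        return result
    raise ValueError(f"unknown constructor: {op}")


# Toy interpretation: a finite slice of the integers plus three strings.
INTS = frozenset(range(-3, 6))
STRINGS = frozenset({"low", "medium", "expensive"})
DOMAIN = INTS | STRINGS
EXT = {
    "xsd:integer": INTS,
    "xsd:string": STRINGS,
    "xsd:nonNegativeInteger": frozenset(range(0, 6)),
    "xsdx:integerLessThan3": frozenset(range(-3, 3)),
}
DOM = {"xsd:integer": "xsd:integer", "xsd:string": "xsd:string",
       "xsd:nonNegativeInteger": "xsd:integer", "xsdx:integerLessThan3": "xsd:integer"}
BASE = {"xsd:integer", "xsd:string"}

price = ("or", [("and", [("uri", "xsd:nonNegativeInteger"), ("uri", "xsdx:integerLessThan3")]),
                ("oneOf", ["low", "medium", "expensive"])])
price_values = evaluate(price, EXT, DOM, BASE, DOMAIN)
negated = evaluate(("not", "xsd:nonNegativeInteger"), EXT, DOM, BASE, DOMAIN)
```

In this toy interpretation, not(xsd:nonNegativeInteger) contains only the negative integers and no strings, because the complement is taken relative to the declared domain xsd:integer rather than to the whole datatype domain.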
Next, we introduce the kind of basic reasoning mechanisms required for a unary
datatype group.
Definition 4 Let V be a set of variables, G = (Md, B, dom) a unary datatype group
and u ∈ B a primitive base datatype URIref. A datatype conjunction of u is of the form
\[
C = \bigwedge_{j=1}^{k} u_j(v_j) \;\wedge\; \bigwedge_{i=1}^{l} \neq_i\bigl(v_1^{(i)}, v_2^{(i)}\bigr), \tag{1}
\]
where the $v_j$ are variables from V, $v_1^{(i)}, v_2^{(i)}$ are variables in $\bigwedge_{j=1}^{k} u_j(v_j)$, $u_j$ are
datatype URI references from S such that $dom(u_j) = u$, and $\neq_i$ are the inequality
predicates for primitive base datatypes $M_d(dom(u_i))$ where $u_i$ appear in $\bigwedge_{j=1}^{k} u_j(v_j)$.
A datatype conjunction C is called satisfiable iff there exists an interpretation
(∆^D, ·^D) of G and a function δ mapping the variables in C to data values in ∆^D s.t.
$\delta(v_j) \in u_j^D$ (for all 1 ≤ j ≤ k) and $\{\delta(v_1^{(i)}), \delta(v_2^{(i)})\} \subseteq u_i^D$ and $\delta(v_1^{(i)}) \neq \delta(v_2^{(i)})$ (for
all 1 ≤ i ≤ l). Such a function δ is called a solution for C w.r.t. (∆^D, ·^D).
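As an illustration of Definition 4, the following Python sketch decides satisfiability of datatype conjunctions for one simple case: every datatype is an integer interval. This is an assumption of the sketch, not a description of any existing datatype checker; the function `solve_conjunction` and the interval encoding are hypothetical. The search relies on the observation that, with n variables, each variable has to avoid at most n − 1 values, so n candidate values per variable are enough.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[Optional[int], Optional[int]]   # inclusive bounds; None means unbounded


def solve_conjunction(
    intervals: Dict[str, Interval],
    atoms: List[Tuple[str, str]],          # (datatype URIref, variable), i.e. u_j(v_j)
    inequalities: List[Tuple[str, str]],   # (variable, variable), i.e. v1 != v2
) -> Optional[Dict[str, int]]:
    """Return a solution delta for the conjunction, or None if it is unsatisfiable."""
    if any(a == b for a, b in inequalities):
        return None                        # v != v can never hold
    bounds: Dict[str, Interval] = {}
    for uri, var in atoms:                 # intersect all intervals imposed on a variable
        lo, hi = intervals[uri]
        old_lo, old_hi = bounds.get(var, (None, None))
        if old_lo is not None and (lo is None or old_lo > lo):
            lo = old_lo
        if old_hi is not None and (hi is None or old_hi < hi):
            hi = old_hi
        bounds[var] = (lo, hi)

    variables = list(bounds)
    n = len(variables)
    candidates: Dict[str, List[int]] = {}
    for var, (lo, hi) in bounds.items():
        # n candidates per variable suffice: at most n - 1 values must be avoided.
        if lo is not None:
            values = [lo + i for i in range(n) if hi is None or lo + i <= hi]
        elif hi is not None:
            values = [hi - i for i in range(n)]
        else:
            values = list(range(n))
        candidates[var] = values

    def extend(index: int, delta: Dict[str, int]) -> Optional[Dict[str, int]]:
        """Backtracking search over the candidate values."""
        if index == n:
            return dict(delta)
        var = variables[index]
        for value in candidates[var]:
            clash = any(
                (a == var and delta.get(b) == value) or (b == var and delta.get(a) == value)
                for a, b in inequalities
            )
            if not clash:
                delta[var] = value
                found = extend(index + 1, delta)
                if found is not None:
                    return found
                del delta[var]
        return None

    return extend(0, {})


# The conjunction >1(v) ∧ ≤0(v) has no solution.
print(solve_conjunction({">1": (2, None), "<=0": (None, 0)},
                        [(">1", "v"), ("<=0", "v")], []))  # None
```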
We end this section by elaborating the conditions that computable unary datatype
groups require.
Definition 5 A unary datatype group G is conforming iff
1. for any u ∈ S \ B: there exists u′ ∈ S \ B such that u′^D = ū^D, and
2. for each primitive base datatype in G, the satisfiability problem for finite datatype
conjunctions of the form (1) is decidable.
4 OWL-Eu
In this section, we present a small extension of OWL DL, i.e., OWL-Eu. The underpinning
DL of OWL-Eu is SHOIN(G_1), i.e., the SHOIN DL combined with a unary
datatype group G (1 for unary). Specifically, OWL-Eu (only) extends OWL data ranges
(i.e., enumerated datatypes as well as some built-in XML Schema datatypes) to
OWL-Eu data ranges defined as follows.
Definition 6 An OWL-Eu data range is a G unary datatype expression. Abstract (as
well as DL) syntax and model-theoretic semantics of OWL-Eu data ranges are presented
in Table 1.
The consequence of the extension is that customised datatypes, represented by
OWL-Eu data ranges, can be used in datatype exists restrictions (∃T.u) and datatype
value restrictions (∀T.u), where T is a datatype property and u is an OWL-Eu data
range. Hence, this extension of OWL DL is as large as is necessary to support cus-
tomised datatypes.
Example 6. PCs with memory size greater than or equal to 512 Mb and with a price
lower than 700 pounds can be represented by the following OWL-Eu concept description
in DL syntax (cf. Table 1):
\[
\mathit{PC} \sqcap \exists \mathit{memorySizeInMb}.\overline{<_{512}} \sqcap \exists \mathit{priceInPound}.{<_{700}},
\]
where $\overline{<_{512}}$ is a relativised negated expression and $<_{700}$ is a supported datatype in G1.
♦
It turns out that OWL-Eu (i.e., the SHOIN(G_1) DL) is decidable.
Theorem 1. The SHOIN(G_1)-concept satisfiability problem w.r.t. a knowledge base
is decidable if the combined unary datatype group is conforming.
Proof: (Sketch) We will show the decidability of SHOIN(G_1)-concept satisfiability
w.r.t. TBoxes and RBoxes by reducing it to the SHOIN-concept satisfiability w.r.t.
TBoxes and RBoxes. The basic idea behind the reduction is that we can replace each
datatype group-based concept C in T with a new atomic primitive concept A_C in T′.
We then compute the satisfiability problem for all possible conjunctions of datatype
group-based concepts (and their negations) in T (of which there are only a finite
number), and in case a conjunction C1 ⊓ … ⊓ Cn is unsatisfiable, we add an axiom
A_C1 ⊓ … ⊓ A_Cn ⊑ ⊥ to T′. For example, unary datatype group-based concepts
∃T.>1 and ∀T.≤0 occurring in T would be replaced with A_∃T.>1 and A_∀T.≤0 in
T′, and A_∃T.>1 ⊓ A_∀T.≤0 ⊑ ⊥ would be added to T′ because ∃T.>1 ⊓ ∀T.≤0 is
unsatisfiable (i.e., there is no solution for the predicate conjunction >1(v) ∧ ≤0(v)).
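The axiom-generation step of the proof sketch can be illustrated for one very restricted case. The Python sketch below considers only pairs consisting of one existential and one value restriction over the same datatype property, with datatypes modelled as integer intervals; it ignores negations, larger conjunctions and number restrictions, all of which the actual reduction has to handle. The names `disjoint` and `disjointness_axioms` and the tuple encoding are hypothetical.

```python
from itertools import product
from typing import List, Optional, Tuple

Interval = Tuple[Optional[int], Optional[int]]   # inclusive integer bounds; None = unbounded
Concept = Tuple[str, str, str, Interval]         # (quantifier, property, label, interval)


def disjoint(a: Interval, b: Interval) -> bool:
    """True iff no integer lies in both intervals."""
    lo = max((x for x in (a[0], b[0]) if x is not None), default=None)
    hi = min((x for x in (a[1], b[1]) if x is not None), default=None)
    return lo is not None and hi is not None and lo > hi


def disjointness_axioms(concepts: List[Concept]) -> List[str]:
    """Emit an axiom A_C1 ⊓ A_C2 ⊑ ⊥ for each unsatisfiable pair ∃T.d ⊓ ∀T.e."""
    somes = [c for c in concepts if c[0] == "some"]
    alls = [c for c in concepts if c[0] == "all"]
    axioms = []
    for (_, t1, l1, d), (_, t2, l2, e) in product(somes, alls):
        # An existential filler must also satisfy the value restriction on the same property.
        if t1 == t2 and disjoint(d, e):
            axioms.append(f"A[∃{t1}.{l1}] ⊓ A[∀{t2}.{l2}] ⊑ ⊥")
    return axioms


concepts = [
    ("some", "T", ">1", (2, None)),
    ("all", "T", "≤0", (None, 0)),
    ("all", "T", "≥0", (0, None)),
]
axioms = disjointness_axioms(concepts)
print(axioms)
```

For the three sample concepts, only the pair ∃T.>1 and ∀T.≤0 yields an axiom, which corresponds to the example given in the proof sketch.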
5 OWL-E: A Step Further
In this section, we present a further extension of OWL-Eu, called OWL-E, which sup-
ports not only customised datatypes, but also customised datatype predicates.
A datatype predicate (or simply predicate) p is characterised by an arity a(p), or
a minimum arity amin(p) if p can have multiple arities, and a predicate extension (or
simply extension) E(p). The notion of predicate maps can be defined in an obvious
way. For example, =int is a (binary) predicate with arity a(=int) = 2 and extension
E(=int) = {⟨i1, i2⟩ ∈ V(integer)² | i1 = i2}, where V(integer) is the value space
for the datatype integer.
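The characterisation of a predicate by its arity and extension can be sketched in Python as follows. The class `Predicate` and the helper `is_integer` are hypothetical names, and the (infinite) extension is represented by a membership test rather than by an explicit set of tuples.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


def is_integer(value: object) -> bool:
    """Membership test standing in for the value space V(integer)."""
    return isinstance(value, int) and not isinstance(value, bool)


@dataclass(frozen=True)
class Predicate:
    """A datatype predicate given by a fixed arity a(p) and an extension test for E(p)."""
    name: str
    arity: int
    holds: Callable[[Tuple[object, ...]], bool]

    def __call__(self, *args: object) -> bool:
        # A tuple of the wrong length is never in the extension.
        return len(args) == self.arity and self.holds(args)


# The binary predicate =int: pairs of equal integers.
EQ_INT = Predicate("=int", 2, lambda t: all(is_integer(v) for v in t) and t[0] == t[1])

print(EQ_INT(3, 3))  # True
print(EQ_INT(3, 4))  # False
```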
Now we can generalise unary datatype groups by the definition of datatype groups.
In fact, datatypes and datatype predicates can be unified in datatype groups. Roughly
speaking, a datatype group is a group of built-in predicate URIrefs ‘wrapped’ around a
set of primitive datatype URIrefs. A datatype group G is a tuple (Mp ,B,dom), where
Mp is the predicate map of G, B is the set of primitive datatype URI references in G
and dom is the declared domain function. We call S the set of built-in predicate URI
references of G, i.e., for each u ∈ S, Mp (u) is defined; we require B ⊆ S. The declared
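domain function dom has the following properties: for each u ∈ S,
\[
dom(u) =
\begin{cases}
u & \text{if } u \in B,\\
(v_1, \ldots, v_n), \text{ where } v_1, \ldots, v_n \in B & \text{if } u \in S \setminus B \text{ and } a(M_p(u)) = n,\\
\{(\underbrace{v, \ldots, v}_{i \text{ times}}) \mid i \geq n\}, \text{ where } v \in B & \text{if } u \in S \setminus B \text{ and } a_{min}(M_p(u)) = n.
\end{cases}
\]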
Example 7. G2 = (Mp2, B2, dom2) is a datatype group, where
– Mp2 = {xsd:integer ↦ integer, xsd:string ↦ string, xsd:integerGreaterThanOrEqualToN
↦ ≥N, xsdx:integerLessThanN ↦

expressive predicate    ≥m T1, …, Tn.E    {x ∈ ∆^I | ♯{⟨t1, …, tn⟩ | ⟨x, ti⟩ ∈ Ti^I (for all
atleast restriction                       1 ≤ i ≤ n) ∧ ⟨t1, …, tn⟩ ∈ E^D} ≥ m}
expressive predicate    ≤m T1, …, Tn.E    {x ∈ ∆^I | ♯{⟨t1, …, tn⟩ | ⟨x, ti⟩ ∈ Ti^I (for all
atmost restriction                        1 ≤ i ≤ n) ∧ ⟨t1, …, tn⟩ ∈ E^D} ≤ m}
Table 3. New class constructors in OWL-E
where Ts, Th, Tl, Tw are concrete roles representing “sum in cm”, “height
in cm”, “length in cm” and “width in cm”, respectively, and (+int ∧
[≥15, integer, integer, integer]) is a conjunctive datatype expression representing the
customised predicate “sum no larger than or equal to 15”.1 ♦
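The way a conjunctive datatype expression combines a predicate with a domain constraint can be sketched in Python. Several assumptions are made here: +int is taken to relate a sum (first argument) to its summands and to have minimum arity 3, the constraint on the first argument is read as “not greater than or equal to 15”, and the helper names `plus_int`, `domain_constraint` and `conjunction` are hypothetical.

```python
from typing import Callable, Sequence


def is_integer(value: object) -> bool:
    """Membership test standing in for the value space of integer."""
    return isinstance(value, int) and not isinstance(value, bool)


def plus_int(args: Sequence[object]) -> bool:
    """+int holds when the first argument equals the sum of the remaining ones."""
    return len(args) >= 3 and all(is_integer(a) for a in args) and args[0] == sum(args[1:])


def domain_constraint(checks: Sequence[Callable[[object], bool]]) -> Callable[[Sequence[object]], bool]:
    """Model [d1, ..., dn]: the i-th argument must belong to the i-th datatype."""
    return lambda args: len(args) == len(checks) and all(c(a) for c, a in zip(checks, args))


def conjunction(*predicates: Callable[[Sequence[object]], bool]) -> Callable[[Sequence[object]], bool]:
    """A tuple satisfies the conjunction iff it satisfies every conjunct."""
    return lambda args: all(p(args) for p in predicates)


def below_15(value: object) -> bool:
    """Assumed reading of the constraint on the sum: an integer that is not >= 15."""
    return is_integer(value) and not value >= 15


# Tuples are ordered as (sum, height, length, width), all in cm.
small_item_sizes = conjunction(
    plus_int,
    domain_constraint([below_15, is_integer, is_integer, is_integer]),
)

print(small_item_sizes((12, 5, 4, 3)))  # True
print(small_item_sizes((15, 5, 5, 5)))  # False
```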
Like OWL-Eu, OWL-E (i.e., the SHOIQ(G) DL) is also a decidable extension of
OWL DL.
Theorem 2. The SHOIN (G)- and SHOIQ(G)-concept satisfiability and subsump-
tion problems w.r.t. TBoxes and RBoxes are decidable.
According to Tobies [13, Lemma 5.3], if L is a DL that provides the nominal con-
structor, knowledge base satisfiability can be polynomially reduced to satisfiability of
TBoxes and RBoxes. Hence, we obtain the following theorem.
Theorem 3. The knowledge base satisfiability problems of SHOIN (G) and
SHOIQ(G) are decidable.
6 Conclusion
In this paper, we propose OWL-Eu and OWL-E, two decidable extensions of OWL
DL that support customised datatypes and customised datatype predicates. OWL-Eu
provides a general framework for integrating OWL DL with customised datatypes, such
as XML Schema non-list simple types. OWL-E further extends OWL-Eu to support
customised datatype predicates.
We have implemented a prototype extension of the FaCT [5] DL system, called
FaCT-DG, to support TBox reasoning in both OWL-Eu and OWL-E (without nomi-
nals). As for future work, we are planning to extend the DIG1.1 interface [3] to sup-
port OWL-Eu, and to implement a Protégé [6] plug-in to support XML Schema non-list
simple types, i.e. users should be able to define and/or import customised XML Schema
non-list simple types based on a set of supported datatypes, and to exploit our prototype
through the extended DIG interface. Furthermore, we plan to extend the FaCT++ DL
reasoner [4] to support the full OWL-Eu and OWL-E ontology languages.
1
To save space, we use predicates instead of predicate URIrefs here.
Bibliography
[1] Sean Bechhofer, Frank van Harmelen, James Hendler, Ian Horrocks, Deborah L.
McGuinness, Peter F. Patel-Schneider, and Lynn Andrea Stein eds. OWL Web
Ontology Language Reference. http://www.w3.org/TR/owl-ref/, Feb 2004.
[2] Paul V. Biron and Ashok Malhotra. Extensible Markup Language (XML) Schema
Part 2: Datatypes – W3C Recommendation 02 May 2001. Technical report, World
Wide Web Consortium, 2001. http://www.w3.org/TR/xmlschema-2/.
[3] DIG. SourceForge DIG Interface Project. http://sourceforge.net/projects/dig/,
2004.
[4] FaCT++. http://owl.man.ac.uk/factplusplus/, 2003.
[5] Ian Horrocks. Using an Expressive Description Logic: FaCT or Fiction? In Proc.
of KR’98, pages 636–647, 1998.
[6] Holger Knublauch, Ray W. Fergerson, Natalya Fridman Noy, and Mark A. Musen.
The Protégé OWL Plugin: An Open Development Environment for Semantic Web
Applications. In International Semantic Web Conference, pages 229–243, 2004.
[7] Jeff Z. Pan and Ian Horrocks. Extending Datatype Support in Web Ontology
Reasoning. In Proc. of the 2002 Int. Conference on Ontologies, Databases and
Applications of SEmantics (ODBASE 2002), Oct 2002.
[8] Jeff Z. Pan and Ian Horrocks. Web Ontology Reasoning with Datatype Groups.
In Proc. of the 2003 International Semantic Web Conference (ISWC2003), pages
47–63, 2003.
[9] Peter F. Patel-Schneider, Patrick Hayes, and Ian Horrocks. OWL Web On-
tology Language Semantics and Abstract Syntax. Technical report, W3C,
Feb. 2004. W3C Recommendation, URL http://www.w3.org/TR/2004/REC-owl-semantics-20040210/.
[10] RDF-Logic Mailing List. http://lists.w3.org/archives/public/www-rdf-logic/.
W3C Mailing List, starts from 2001.
[11] Alan Rector. Re: [UNITS, OEP] FAQ : Constraints on data values range.
Discussion in [12], Apr. 2004. http://lists.w3.org/Archives/Public/public-swbp-wg/2004Apr/0216.html.
[12] Semantic Web Best Practices and Deployment Working Group Mailing List.
http://lists.w3.org/archives/public/public-swbp-wg/. W3C Mailing List, starts
from 2004.
[13] Stephan Tobies. Complexity Results and Practical Algorithms for Logics in
Knowledge Representation. PhD thesis, Rheinisch-Westfälischen Technischen
Hochschule Aachen, 2001. URL http://lat.inf.tu-dresden.de/research/phd/Tobies-PhD-2001.pdf.