<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Original Reduced Ontologies
Unoptimized Optimized
Class Prop. TBox Subcl. Disj. Subpr. Func. Dom. Ran.</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Reasoning with Large A-Boxes in Fuzzy Description Logics using DL reasoners: An experimental evaluation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>P. Cimiano</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>P. Haase</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Q. Ji</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>T. Mailis</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>G. Stamou</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>G. Stoilos</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>T. Tran</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>V. Tzouvaras</string-name>
          <email>tzouvarasg@image.ntua.gr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute AIFB, Universitat Karlsruhe</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>National and Technical University of Athens</institution>
        </aff>
      </contrib-group>
      <volume>3</volume>
      <issue>6</issue>
      <abstract>
        <p>While knowledge representation languages developed in the context of the Semantic Web build mainly on crisp knowledge, fuzzy reasoning seems to be needed for many applications, e.g. for retrieval of images or other resources. However, techniques for fuzzy reasoning need to scale to large A-Boxes for practical applications. As a rst step towards clarifying whether fuzzy reasoning techniques can indeed scale, we examine a speci c point in space. Earlier research has shown that fuzzy description logics can be transformed to crisp description logics in a satis ability-preserving fashion. This thus opens the possibility of using standard description logic reasoners for reasoning with fuzzy OWL ontologies. As the transformation produces a quadratic blow-up of the T-Box, a crucial question is if such an approach is feasible from a performance point of view. We provide an empirical evaluation on four di erent ontologies of varying complexity and with generated A-Boxes of up to a million individuals. To our knowledge, we thus provide the rst systematic and empirical evaluation of a fuzzy reasoning approach based on a reduction to standard description logics.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Current knowledge representation formalisms for the Semantic Web such as
OWL or RDF(S) rely on crisp logics. However, it has become clear that the
assumption that all knowledge is crisp is unsustainable. There are various
reasons for this. First of all, the large amount of knowledge needed to bootstrap the
semantic web can neither be acquired by hand nor by simply importing legacy
data. Thus, knowledge acquisition techniques are needed. However, knowledge
acquisition techniques building on machine learning or natural language
processing techniques are inherently error-prone. Consequently, the need arises to
represent the certainty or con dence of the extracted information. Further, for
the purpose of retrieval, e.g. of images or other data, we need ways to assess
the degree to which a certain answer matches the information needs of a query
(see [1]). Restricting ourselves to the crisp case would yield very brittle systems
which only return answers in case information is perfectly modeled and perfectly
matches the user query. Finally, there are concepts which are inherently fuzzy
and which can not be represented in a crisp way. These are concepts
representing linguistic variables in the sense of Zadeh [2], e.g. such as hot, young, cheap
etc. Thus, various extensions to Semantic Web languages have been presented to
deal with imperfect information, i.e. probabilistic approaches such as [3], [4] for
OWL, [5] for RDF as well as approaches based on Fuzzy Logics such as [6] and
[7]. For an overview of non-crisp ontology formalisms the reader is referred to [8].
Whatever the formalism used is, one requirement is of crucial importance: the
availability of scalable reasoning procedures to deal with imperfect knowledge.</p>
      <p>In this paper, we thus ask ourselves a straightforward question with a not so
straightforward answer: are current reasoning procedures mature and scalable
enough to reason with a large amount of imperfect (e.g. fuzzy) information?
Giving a principled and empirically grounded answer to this problem is certainly
out of the scope of this and many articles to be published. In fact, we are
convinced that it is only through systematic experimental evaluation that we
will nd out which techniques can scale towards real scenarios with millions of
facts to be stored and retrieved.</p>
      <p>As a rst step towards answering this question, the contribution of this paper
is to examine a very speci c setting, i.e. we assume that knowledge is represented
using fuzzy description logics and analyze the scalability of the KAON2 reasoner
on the knowledge bases resulting from reducing Fuzzy Description Logics (see [6,
7]) to standard description logics based on the idea described in [9]. KAON2 is a
resolution-based SHIQ(D)-reasoner which transforms a SHIQ(D) knowledge
base into a Disjunctive Datalog Program. This allows to reuse optimization
techniques from database technology to reason with large A-Boxes. The aim of
our paper is to explore if one can indeed reason with large Fuzzy A-Boxes, a
necessary requirement for large-scale knowledge management applications. In
this sense this paper can be seen as a contribution towards a larger goal: the one
of clarifying whether fuzzy knowledge representation can indeed scale towards
meeting the size of large-scale applications.</p>
      <p>The structure of this paper is as follows: in Section 2, we introduce the logic
fKD-SHIN as well as the reduction to crisp OWL presented already in [7]
in order to make the paper self-contained. In Section 3 we brie y describe the
OWL-DL reasoner used in our experiments, i.e. the KAON2 reasoner. In Section
4 we present our experiments and results. Finally, in Section 5 we discuss some
related work and conclude.
2</p>
      <p>Fuzzy Description Logics and their reduction to
Classical DLs
In this section we provide a brief introduction to the fuzzy Description Logic
(DL) fKD-SHIN [10]. fKD-SHIN is a fuzzy extension of the SHIN DL [11]
which uses the fuzzy operators 1 x, max and min for performing the fuzzy set
theoretic operations. For a general fuzzy extension of SHIN please refer to [6].</p>
      <p>Fuzzy Description Logics [12] have been proposed as powerful knowledge
representation languages capable of capturing vague (fuzzy) knowledge that exists
in many applications. Fuzzy DLs and ontologies usually keep the same syntax
as their crisp (classical) counterpart, without adding any additional degrees to
them; notable exceptions are [6] and [13]. Thus, using concepts and roles we
can build concept descriptions in the usual way by using conjunctions (C u D),
disjunctions (C t D), negation (:C), existential (9R:C) and universal
quanti cation (8R:C) as well as cardinality restrictions ( nR, nR). Similarly,
concept and role axioms, like concept equivalence (C D), concept subsumption
(C v D), role inclusions (R v S) and transitivity (Trans(R)) can be de ned in
the usual way. Thus, the de nitions of a TBox and an RBox are as usual.</p>
      <p>Constructor
top
bottom
general negation
conjunction
disjunction
exists restriction
value restriction
at-most
at-least
inverse role
equivalence
sub-concept
transitive role
sub-role
concept assertions
role assertions</p>
      <p>Fuzzy DLs extend the syntax of assertions with membership degrees, thus
allowing to create fuzzy assertions (see [12, 10, 14, 15]) as well as to annotate
individual axioms (assertions) with degrees of membership. More formally, a
fuzzy assertion [12] is of the form (a : C)./n or ((a; b) : R) B n, where ./ 2 f
; &gt;; ; &lt;g, B 2 f ; &gt;g and a; b are individuals, and n 2 (0; 1]. In many cases we
write (a : C) = n instead of writing two fuzzy assertions of the form (a : C) n
and (a : C) n. For example, one is able to state that grass is Green to a degree
of at least 0.7, writing (grass : Green) 0:7, or that athens is nearTo rome to
a degree of at least 0:8, by the axiom ((athens; rome) : nearTo) 0:8. A set of
fuzzy assertions de nes a (fuzzy) ABox.</p>
      <p>The semantics of fuzzy DLs are provided by fuzzy interpretations [12], which
extend classical interpretations to the unit interval [0; 1]. A fuzzy interpretation
is a pair I = ( I ; I ) where the domain I is a non-empty set of objects and
I is a fuzzy interpretation function, which maps
{ an individual name a 2 I to an element aI 2 I ,
{ a concept name A 2 C to a membership function AI : I ! [0; 1],
{ a role name R 2 R to a membership function RI : I I ! [0; 1].
Intuitively, an object (pair of objects) can now belong to a fuzzy concept (role)
to any degree between 0 and 1. For example, HotPlaceI (RomeI ) = 0:7, means
that RomeI is a hot place to a degree equal to 0.7. Fuzzy interpretations can be
extended to interpret fKD-SHIN -concepts and roles with the aid of the fuzzy
operators 1 x, min and max. The complete semantics for concept descriptions,
concept and role axioms are depicted in Table 1.</p>
      <p>We can now de ne the inference problems for fuzzy DLs. A fuzzy knowledge
base is satis able (unsatis able) i there exists (does not exist) a fuzzy
interpretation I which satis es all axioms in . An f-SHIN -concept C is n-satis able
w.r.t. i there exists a model I of for which there is some a 2 I such that
CI (a) = n, and n 2 (0; 1]; C subsumes D w.r.t. i for every model I of
we have 8d 2 I ; CI (d) DI (d); a fuzzy ABox A is consistent (inconsistent )
w.r.t. a fuzzy TBox T and RBox R if there exists (does not exist) a model I
of T and R that satis es each assertion in A. Given a fuzzy concept axiom, a
fuzzy role axiom or a fuzzy assertion , entails , written j= , i for all
models I of , I satis es .</p>
      <p>Currently, there are many proposals for reasoning in various di erent
fragments of fuzzy Description Logics ranging from tableaux-based [12, 10] to
optimization-based [16, 17]. Straccia proposed in [9] a method for reducing an
fKD-ALCH knowledge base to a classical ALCH one. The purpose of the
reduction is to reduce the reasoning problem from f-DLs to crisp DLs in order to
allow to use existing optimized reasoners, like FaCT [18], Pellet [19] or KAON2
(see Section 3) to perform the inference tasks of fuzzy DLs. Later, Bobillo et
al. extended the method [13] to cover the full fuzzy OWL language, while quite
recently a number of optimisations of the method have been also proposed [20].
In the following we sketch the main idea and refer the interested reader to [9,
13, 20] for a detailed presentation.</p>
      <p>The main idea behind the reduction is that a fuzzy assertion of the form
(a : C) n, where a is an individual and n 2 [0; 1] can be represented by a
crisp assertion of the form a : C n, where C n is a new concept. Intuitively,
C n stands for the set of objects that belong to C to a degree greater or equal
than n. Then there are two important points for the reduction:
1. The number of di erent membership degrees we have to consider in the
reduction is nite (this is a property of fKD-DLs) and more precisely de ned
by the set: N = f0; 0:5; 1g [ fn; 1 n j (a : C)./n 2 A or ((a; b) : R)./n 2
Ag.
2. In order for the reduction to be satis ability-preserving, additional concept
and role axioms need to be added. For example, if n1; n2 are two degrees from
N and n1 n2, then it obviously holds that A n2 v A n1 . In summary
for each atomic concept in the fuzzy KB the following axioms are added [9,
13]:
where 1 i jN j 1 and 2 j jN j. Note that axioms with concepts
A n and A&lt;n are necessary since the fuzzy ABox might contain fuzzy
assertions of the form (a : A) n or (a : A) &lt; n. Similarly for each atomic
roles we have: R ni+1 v R&gt;ni ; R&gt;ni v R ni .</p>
      <p>
        Finally, concept and role axioms in the fuzzy KB should also be reduced [9,
13]. Thus, if jCj and jRj is the number of the di erent atomic concepts and
roles in the original fuzzy KB, respectively, and jN j the number of di erent
membership degrees that appear in the fuzzy ABox, then the reduced crisp KB
would contain as much as 8jCj(jN j 1) concept axioms and 2jRj(jN j 1)
role inclusion axioms of this kind of satis ability-preserving axioms. On the other
hand, as shown in [9], there are additionally 4jT jjN j concept and 2jRjjN j role
inclusion axioms, where jT j and jRj is the number of concept and role axioms
in the original fuzzy KB. Due to the fact that the number of axioms increases
quadratically, the performance of DL reasoners can heavily deteriorate. For this
reason, Bobillo et. al. [20] have proposed a number of optimisations. In particular,
they note the concept equivalences A n :A&gt;n and A&lt;n :A n, thus they
map fuzzy assertions of the form (a : A) n and (a : A) &lt; n in the ABox to
a : :A&gt;n and a : :A n respectively. Then, using the laws of contradiction and
excluded middle, it can be proved that all axioms of the forms (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) to (
        <xref ref-type="bibr" rid="ref4">4</xref>
        ) above are
unnecessary. Roughly, speaking this optimisation reduces the number of concept
axioms in the reduced TBox to about a quarter compared to the unoptimised
one. In our experiments we will in particular also compare the performance of
the optimised vs. the unoptimised reduction.
3
      </p>
      <p>OWL DL Reasoning with KAON2
Reasoning with KAON2 is based on special-purpose algorithms which have been
designed for dealing with large ABoxes. They are described in more detail in [21],
such that we only present a birds' eyes perspective here. The underlying
rationale of the reasoning procedures is that algorithms for deductive databases have
proven to be e cient in dealing with large numbers of facts. The KAON2
approach utilises this by transforming OWL DL ontologies to disjunctive datalog,
and by the subsequent application of the mentioned and established algorithms
for dealing with disjunctive datalog [22].</p>
      <p>A birds' eyes perspective on the KAON2 approach is depicted in Figure 1.
KAON2 can handle SHIQ(D)1 description logic ontologies, which corresponds
1 SHIQ(D) provides datatypes and quali ed number restrictions in addition to the
SHIN description logic used for our fuzzy reasoning.
suffices for
some queries
e.g. instance
retrieval for
named classes</p>
    </sec>
    <sec id="sec-2">
      <title>OWL DL TBox (no nominals)</title>
    </sec>
    <sec id="sec-3">
      <title>Transformation to Disjunctive Datalog [ExpTime]</title>
    </sec>
    <sec id="sec-4">
      <title>OWL DL ABox</title>
      <p>Disjunctive Datalog Reasoning Engine [coNP]</p>
    </sec>
    <sec id="sec-5">
      <title>Answer</title>
      <p>roughly to OWL DL without nominals. The TBox is processed together with a
query by the transformation algorithm, which is described in more detail below
and which returns a disjunctive datalog program. This, together with an ABox,
is then fed into a disjunctive datalog reasoner which eventually returns an answer
to the query. In some cases, e.g. when querying for instances of named classes,
the query does not need to be fed into the transformation algorithm but instead
needs to be taken into account only by the datalog reasoner.</p>
      <p>The transformation algorithm accepts a SHIQ (or SHIQ(D)) TBox and
returns a disjunctive datalog program. Note that the returned program is in
general not logically equivalent to the input TBox; the exact relationship is
given below in Theorem 1.</p>
      <p>
        The steps of the algorithm can roughly be described as follows. (
        <xref ref-type="bibr" rid="ref1">1</xref>
        )
Transitivity axioms for roles S are replaced by adding axioms of the form 8R:C v 8S:8S:C
whenever S v R. This is a standard method for eliminating transitivity axioms,
such that the resulting knowledge base is satis able if and only if the original
knowledge base is. This ensures that the translation can be used to solve
typical SHIQ reasoning problems by reducing them to unsatis ability of a SHIQ
knowledge base.
      </p>
      <p>
        Employing the fact that SHIQ can be regarded as a subset of rst-order
logic, step (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) uses standard algorithms to transform the knowledge base into
conjunctive normal form. This involves eliminating existential quanti ers by
Skolemization, such that function symbols must be introduced into the
knowledge base.
      </p>
      <p>
        Next, in step (
        <xref ref-type="bibr" rid="ref3">3</xref>
        ), the obtained set of clauses is partially saturated by adding
logical consequences. This is the crucial step of the algorithm where one has to
compute enough consequences to allow for a reduction to function-free Datalog.
Since the computational complexity is ExpTime for SHIQ but only NP for
disjunctive Datalog, it should not come as a surprise that this transformation
step can be exponential in the size of the input. The details of this step are
rather sophisticated such that we refer to [21] for details and proofs.
      </p>
      <p>
        Now function symbols can safely be eliminated in step (
        <xref ref-type="bibr" rid="ref4">4</xref>
        ). To ensure that
this process still preserves satis ability of the knowledge base, one has to add a
linear number of auxiliary axioms. Finally, it is easy to syntactically transform
the resulting set of clauses into a disjunctive Datalog program in step (
        <xref ref-type="bibr" rid="ref5">5</xref>
        ).
      </p>
      <p>
        Due to the transformations in steps (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) and (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ), the output of the algorithm
is in general not logically equivalent to the input. Since all major reasoning
tasks for SHIQ can be reduced to satis ability checking, it is su cient for the
transformation to preserve satis ability, as shown in the following theorem from
[21].
      </p>
      <p>Theorem 1. Let K be a SHIQ(D) knowledge base and D(K) be the datalog
output of the KAON2 transformation algorithm on input K. Then the following
claims hold.</p>
      <p>{ K is unsatis able if and only if D(K) is unsatis able.
{ K j= if and only if D(K) j= , where is of the form A(a) or R(a; b), for</p>
      <p>A a named concept and R a role.
{ K j= C(a) for a nonatomic concept C if and only if, for Q a new atomic
concept, D(K [ fC v Qg) j= Q(a).</p>
      <p>A performance evaluation of KAON2 is reported in [23]. It shows that
KAON2 is indeed superior to other reasoners in most cases where size of the
ABox dominates compared to the size of the TBox. This is exactly the reason
why we have decided to use KAON2 in the context of our experiments with large
A-Boxes.
4</p>
      <sec id="sec-5-1">
        <title>Experiments</title>
        <p>In this section, we present the results of our experimental evaluation. First of
all, we describe the datasets used, i.e. the ontologies, in more detail (see
Subsection 4.1). These belong to di erent complexity classes in order to be able to
analyse performance with respect to di erent classes of ontologies. Most
standard benchmark datasets do not focus on average performance (over a larger
amount of queries) nor on performance with respect to data size. Thus, we made
two important decisions in the context of our experimental evaluation. First, we
decided to automatically generate larger A-Boxes ranging from 100 to 1 Mio.
individuals. In order not to bias our results by some generation strategy, we apply
three di erent ways of generating A-Boxes of varying sizes for a given ontology.
These strategies are described in more detail in Subsection 4.2. Second, instead
of relying on a limited number of prede ned queries as done in most
benchmarks, we decided to also randomly generate a number of n queries of varying
complexity. This will then allow us to analyze the performance of our approach
with respect to an average conjunctive query.
4.1</p>
        <sec id="sec-5-1-1">
          <title>Datasets</title>
          <p>We have chosen some popular ontologies which have been used in previous
benchmarks [23]. These are described in what follows:
{ VICODI: The VICODI ontology2 was manually created in the EU-funded
VICODI project and is a representative of the RDFS(DL) fragment. This
ontology is relatively small and simple since it does not contain any
disjunctions, existential quanti cation or number restrictions.
{ LUBM: The Lehigh University Benchmark (LUBM)3 was explicitly
designed for OWL benchmarks. It describes the organizational structure of
universities. Due to existential restriction on the right side of class
expressions, the LUBM ontology is in OWL Lite.
{ Semintec: The Semintec ontology4 is about nancial services and was
created at the University of Poznan. It is also relatively simple like the VICODI
ontology without existential quanti ers or disjunctions. But it is a
representative of the OWL DL fragment as it contains functional properties and
disjointness constraints.
{ Wine: The Wine ontology5 is a prominent example of an OWL DL ontology.</p>
          <p>More precisely, we used a version for the Wine ontology without nominals,
as it has been used in previous benchmarks (cf. [23]).
4.2</p>
        </sec>
        <sec id="sec-5-1-2">
          <title>Generating A-Boxes</title>
          <p>While the ontologies mentioned above already contain some A-Box assertions,
for most of them the A-Boxes are too small for our intended performance
evaluations. Therefore, for each of the ontologies, we generated di erent extended
datasets with increasing A-Box size to be able to quantify the relationship
between query time and the size of the A-Box. In particular, in order not to bias our
quantitative analysis, we experimented with three di erent ways of generating
A-Boxes: random, structural and proportional. We brie y describe these di erent
generation strategies, omitting details due to space limitations. For simplicity,
from now on, the individuals always include concept individuals and property
individuals if we do not specify which kind of individuals.</p>
          <p>{ random generation: according to this strategy, a number n of individuals
are generated by rst randomly selecting m properties, generating new
individuals of these and randomly deciding whether to create a new domain
(range) individual or use an existing one. This yields at most m property
and 2m concept individuals. Then, the rest to n is lled up with random
concept individuals.
2 http://www.vicodi.org
3 http://swat.cse.lehigh.edu/projects/lubm/index.htm
4 http://www.cs.put.poznan.pl/alawrynowicz/semintec.htm
5 http://www.schemaweb.info/schema/SchemaDetails.aspx?id=62
{ structural generation: according to this strategy, the existing A-Box is
copied m times until the nal number of individuals is reached, whereby
concept and property individuals are renamed. This strategy thus aims at
preserving the structure of the original ontology.
{ proportional: according to this strategy, the conditional probability
P (cj jic), i.e. the probability that given that ic is a concept individual it
belongs to concept cj is calculated in the original A-Box. Further, also P (pj jip)
is calculated for properties. The relative frequency of property versus concept
individuals is also calculated. Then, new individuals are randomly generated
according to the above probabilities. This strategy aims at preserving the
distribution in the original ontology.</p>
          <p>For each of the ontologies mentioned above, we have thus generated ontologies
with 100, 1.000, 10.000, 100.000 and 1 Mio. individuals. Further, in order to
generate fuzzy ontologies out of these, a random fuzzy degree has been generated
for each of the individuals in the di erent datasets. The ontologies with the fuzzy
degree values have then been reduced using the optimised and the unoptimised
reduction.
4.3</p>
        </sec>
        <sec id="sec-5-1-3">
          <title>Queries</title>
          <p>To generate various sorts of queries for tests, we use 5 query patterns which vary
from the simplest one to retrieve all the individuals of a concept to the complex
ones such as:
SELECT ?V1 ?V2 ?V3 ?V4 ?V5 ?V6
WHERE f?V1 rdf:type C1 . ?V2 rdf:type C2 . ?V3 R1 ?V4 . ?V5 R2 ?V6g
where, V i (i = 1:::6) indicates variables. C1 and C2 means concepts and R1 and
R2 are relations.</p>
          <p>When generating a query, we randomly select one query pattern. For each
chosen pattern, we replace each variable in the pattern by randomly choosing one
from several distinct variables given beforehand. As for each relation (concept)
in the pattern, we randomly choose one relation (concept) from the ontology to
be queried.</p>
          <p>This way of creating queries is obviously exible enough to generate an
arbitrary number of queries of varying complexity. It is important to mention that,
as we only use named classes in our queries, the reduction to Datalog does not
need to be performed for each query (see Section 3).
4.4</p>
        </sec>
        <sec id="sec-5-1-4">
          <title>First observations: number of axioms</title>
          <p>In order to generate ontologies with Fuzzy ABoxes, we randomly generated fuzzy
degrees for the A-Boxes generated as described above. After applying the
reduction, we obtained ontologies with a number of axioms (TBox and RBox) as
reported in Table 2. It shows the number of axioms for the original ontologies as
well as those resulting from the reduction to crisp OWL-DL for di erent degrees,
i.e. 3 (0, 0.5 1), 6 (0, 0.2, 0.4, 0.6, 0.8, 1) and 11 (0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6,
0.7, 0.8, 0.9, 1). We can certainly see that the reduction increases the number
of axioms dramatically. Interestingly, the optimized reduction produces about a
third of the axioms produced by the unoptimized method.
We have carried out performance evaluation tests for the above mentioned four
ontologies comparing the query performance for standard query answering with
respect to the original ontology as well as fuzzy query answering with respect
to the fuzzy ontologies after the reduction using the optimised and unoptimised
methods. The fuzzy queries are also randomly generated using the patterns
described above, inserting the concepts and relations produced by the reduction,
thus yielding "fuzzy" conjunctive queries which are actually crisp queries with
respect to the transformed ontologies.</p>
          <p>Figure 2 shows the average time for query answering with respect to the
automatically generated datasets for the crisp ontologies containing 100 to 1.
Mio individuals for the di erent generation strategies. A rst interesting result
is that the di erent generation strategies do not have a major e ect on the query
time. For this reason, we decided to carry out the remaining experiments using
only the proportional generation strategy.</p>
          <p>In particular, we tested ontologies with 3, 6 and 11 fuzzy degrees. For all
queries, we imposed a time limit of 20 minutes. After that, query evaluation was
stopped and counted as 20 minutes. All results are reported averaged over 100
queries. Our rst conclusions were the following:
{ The unoptimised reduction leads to ontologies which are completely
intractable, such that almost none of the queries can be answered within 20
minutes.
{ The Wine ontology showed to be completely intractable both with the
optimised and unoptimised reduction even using only 3 degrees! This shows that
highly expressive (fuzzy) ontologies lead to a high number of axioms when
they are reduced to crisp ontologies, thus becoming intractable for
state-ofthe-art reasoners. Thus, we do not show the detailed results for the Wine
ontology from now on.
{ Using the optimised reduction, the query time with respect to the original
crisp ontologies increases a factor of between 2 and 100 depending on the
)c 1000
e
s
(e 100
m
ite 10
s
no 1
p
s
eR 0,1
0,01
number of degrees used. We discuss this increase for the ontologies
transformed using the optimised reduction in more detail below with respect to
the di erent ontologies.</p>
          <p>Overall, we can conclude that the increase in the time for answering a query
is considerable when considering the transformed ontologies, i.e. roughly a factor
)c 1000
e
s
(e 100
m
ite 10
s
no 1
p
s
eR 0,1
0,01
0
0
1
_
i
d
o
c
i
v
0
0
0
1
_
i
d
o
c
i
v
0
0
0
0
1
_
i
d
o
c
i
v
0
0
0
0
0
1
_
i
d
o
c
i
v
0
0
0
0
0
0
1
_
i
d
o
c
i
v
0
0
1
_
m
b
u
l
0
0
0
1
_
m
b
u
l
0
0
0
0
1
_
m
b
u
l
0
0
0
0
0
1
_
m
b
u
l
0
0
0
0
0
0
1
_
m
b
u
l
0
0
1
_
c
e
it
n
m
e
s
0
0
0
1
_
c
e
it
n
m
e
s
0
0
0
0
1
_
c
e
it
n
m
e
s
0
0
0
0
0
1
_
c
e
it
n
m
e
s
0
0
0
0
0
0
1
_
c
e
it
n
m
e
s
Data sets with different individual size</p>
          <p>
            crisp 3 degrees 6 degrees 11 degrees
between 2 and 100 compared to query answering with respect to the original crisp
ontologies. The pattern which emerges is quite clear. For the reduced ontologies
with 3 degrees, the query time seems to double compared to standard query
answering with respect to the original ontologies. When using 6-11 degrees, the
increase is in most cases a factor of at least 10 compared to the time for answering
queries with respect to the original crisp ontology. For 11 degrees, the increase
can even reach a factor of 100 (e.g. for VICODI with 100.000 individuals), but in
most cases there is an increase with a factor of 10. The good news is certainly that
for smaller ontologies from 100 to 10.000 individuals, the query time is at most a
few (&lt;4) seconds regardless of the number of degrees used. This shows that for
smaller ontologies the approach considered is de nitely feasible regardless of the
degrees considered. For bigger ontologies, using more degrees (
            <xref ref-type="bibr" rid="ref6 ref7">6-11</xref>
            ) comes at a
high price. Thus, applications will clearly need to balance the amount of degrees
they need to distinguish with the maximum querying time they can a ord. In
any case, it seems that fuzzy query answering is feasible in real time (where real
time is de ned as under a few seconds) in exactly those cases in which query
answering with respect to the original crisp ontology is also. In all those cases
where fuzzy query answering is not feasible in real time (let's say &gt; 100 seconds),
neither standard query answering with respect to the crisp ontologies is. In such
a case, one could argue, it is not anymore important whether the queries can
be answered in 100 seconds (1.6 minutes) or 1000 seconds (16 minutes) as these
times are both clearly beyond a real-time behaviour.
          </p>
        </sec>
      </sec>
      <sec id="sec-5-2">
        <title>Related Work and Conclusion</title>
        <p>In the last couple of years it has become evident that fuzzy Description Logics
could play an important role in several Semantic Web applications that face a
signi cant amount of vague knowledge, like multimedia analysis and retrieval.
In order to support inference services for fuzzy-DLs, several di erent reasoning
algorithms have been proposed such as tableaux-based [12, 10],
optimizationbased [16, 17] as well as techniques that reduce fuzzy-DL reasoning to crisp DL
reasoning [9, 13, 20].</p>
        <p>Although there has been a signi cant amount of work on the reduction based
DL reasoning methods, there is currently no systematic and complete
evaluation of the method6. In particular, Straccia was the rst to propose the method,
showing that reasoning in fKD-ALCH can be reduced to reasoning in ALCH,
while later Bobillo et. al. [13] extended the method to reduce fKD-SHOIN
(together with fuzzy subsumption axioms [6]) to SHOIN . Then Bobillo [20]
extended the technique to SROIQ also proposing a number of optimisation
methods that reduced the number of created axioms. Additionally, in [20] there
is a rst report on an implementation accompanied with a preliminary
evaluation. More precisely, the authors have implemented their proposed optimised
reduction technique which then they evaluated over a fuzzi ed version of the
Koala ontology7. The Koala ontology contains 20 named classes, 15 anonymous
classes, 4 object properties, 1 datatype property and 6 individuals. The axioms of
the ontology where randomly extended with membership degrees, using 3, 5, 7,
9 and 11 di erent degrees, thus creating a set of ve fuzzy ontologies. Then, the
reduction method was evaluated using the Pellet reasoner providing the times
for knowledge base satis ability checking. In contrast, we have mainly focused
on the task of query answering. Instead of evaluating with respect to only one
small ontology, we have tested four di erent ontologies corresponding to di erent
complexity classes (i.e. the RDFS DL fragment, OWL Lite and OWL-DL) with
respect to di erent A-Box sizes of up to a million of individuals. Our focus has
in fact been to test the scalability of the reduction-based approach depending
on A-Box size. This is also the reason why we have used KAON2 as reasoner,
which was particularly developed to handle large A-Boxes. Our experimental
evaluation has shown that for quite expressive ontologies, especially the Wine
ontology, the approach based on the reduction to crisp DL is not feasible for
fuzzy query answering. Arguably, the wine ontology was never conceived as a
realistic ontology for A-Box reasoning but rather for T-Box reasoning. In any
case, in our experiments the ontologies resulting from the reduction to crisp DL
where only tractable when using the optimised reduction, which reduced the
number of axioms by around 66% compared to the unoptimised reduction.</p>
        <p>Overall, the increase in query time is moderate for small numbers of degrees
(e.g. 3-6). Distinguishing 3-6 degrees leads to increased query times with a factor
6 Note that the landscape is not di erent with the other reasoning methods for
fuzzy</p>
        <p>
          DLs
7 http://http://protege.cim3.net/file/pub/ontologies/koala/koala.owl
between 2-10 for smaller A-Boxes (up to 10.000 individuals) and to a factor of
100 for larger A-Boxes with around 100.000 individuals and more. The
interesting conclusion is nevertheless that for rather inexpressive ontologies such as
VICODI and LUBM, query answering with the approach analyzed here is de
nitely feasible for a few degrees (
          <xref ref-type="bibr" rid="ref3 ref4 ref5 ref6">3-6</xref>
          ) even with very large A-Boxes as the time
increase is below a factor of 10. For applications which really need fuzzy query
answering, this increase might be acceptable. Finally, the encouraging result is
that for smaller A-Boxes with a number of individuals between 100 and 10.000
query answering takes below a few seconds irrespectively of the degrees used.
The interesting conclusion that real time query answering seems to be feasible
for the fuzzy ontologies exactly for those A-Box sizes for which it is also feasible
for the crisp ontologies is an encouraging result from our point of view.
        </p>
        <p>There are obvious avenues for future work. First, it would be appropriate to
experiment with other reasoners such as FaCT or Pellet for reasoning with the
reduced ontologies. Further, a necessary next step is to compare with respect to
other techniques such as tableaux-based reasoners such as Fire (see [24]). While
our envisioned application is retrieval and thus querying, the investigation of
several issues from the application point of view seem interesting. First of all,
we expect that, in contrast to what we have assumed in this paper, not all
concepts in an ontology should be considered as inherently fuzzy. In fact, we expect
applications to pose clear requirements on what concepts should be understood
as fuzzy and which not. By this, the large amount of axioms produced by the
reduction could be dramatically constrained, thus leading to a better performance.
In this context it will be necessary to clarify the interactions between fuzzy and
non-fuzzy concepts in query answering. Overall, we think that this paper o ers
a clear step forward towards clarifying whether techniques for reasoning with
fuzzy knowledge can be expected to scale up.
8. Lukasiewicz, T., Straccia, U.: An overview of uncertainty and vagueness in
description logics for the semantic web. Technical Report INFSYS Research Report
1843-06-07, Technische Universitat Wien (2006)
9. Straccia, U.: Transforming fuzzy description logics into classical description logics.</p>
        <p>In: Proceedings of the 9th European Conference on Logics in Arti cial Intelligence
(JELIA-04). Number 3229 in Lecture Notes in Computer Science, Lisbon, Portugal,
Springer Verlag (2004) 385{399
10. Stoilos, G., Stamou, G., Tzouvaras, V., Pan, J.Z., Horrocks, I.: Reasoning with
very expressive fuzzy description logics. Journal of Arti cial Intelligence Research
30(8) (2007) 273{320
11. Baader, F., McGuinness, D., Nardi, D., Patel-Schneider, P.: The Description Logic
Handbook: Theory, implementation and applications. Cambridge University Press
(2002)
12. Straccia, U.: Reasoning within fuzzy description logics. Journal of Arti cial
Intelligence Research 14 (2001) 137{166
13. Bobillo, F., Delgado, M., Gomez-Romero, J.: A crisp representation for fuzzy
SHOIN with fuzzy nominals and general concept inclusions. In: Proc. of the 2nd
International Workshop on Uncertainty Reasoning for the Semantic Web (URSW
06), Athens, Georgia. (2006)
14. Straccia, U.: Answering vague queries in fuzzy DL-Lite. In: Proceedings of the
11th International Conference on Information Processing and Management of
Uncertainty in Knowledge-Based Systems, (IPMU-06). (2006) 2238{2245
15. Pan, J., Stamou, G., Stoilos, G., Thomas, E.: Expressive querying over fuzzy
DL-Lite ontologies. In: Proceedings of the International Workshop on Description
Logics (DL 2007). (2007)
16. Straccia, U.: Description logics with fuzzy concrete domains. In: 21st Conf. on</p>
        <p>Uncertainty in Arti cial Intelligence (UAI-05), Edinburgh (2005)
17. Bobillo, F., Straccia, U.: A fuzzy description logic with product t-norm. In:
Proceedings of the IEEE International Conference on Fuzzy Systems (Fuzz-IEEE-07),
London. (2007)
18. Horrocks, I.: Using an Expressive Description Logic: FaCT or Fiction? In:
Proceedings of the Sixth International Conference on Principles of Knowledge
Representation and Reasoning (KR'98), Trento, Italy, June 2-5, 1998, Morgan Kaufmann
(1998) 636{649
19. Sirin, E., Parsia, B., Grau, B.C., Kalyanpur, A., Katz, Y.: Pellet: A practical</p>
        <p>OWL-DL reasoner. Journal of Web Semantics 5 (2007) 51{53
20. Bobillo, F., Delgado, M., Gomez-Romero, J.: Optimising the crisp representation of
the fuzzy dl SROIQ. In: Proc. of the 3nd International Workshop on Uncertainty
Reasoning for the Semantic Web (URSW 07), Busan, Korea. (2007)
21. Motik, B.: Reasoning in Description Logics using Resolution and Deductive</p>
        <p>Databases. PhD thesis, Universitat Karlsruhe (2006)
22. Cumbo, C., Faber, W., Greco, G., Leone, N.: Enhancing the magic-set method for
disjunctive datalog programs. In Demoen, B., Lifschitz, V., eds.: ICLP. Volume
3132 of Lecture Notes in Computer Science., Springer (2004) 371{385
23. Motik, B., Sattler, U.: A comparison of reasoning techniques for querying large
description logic ABoxes. In: Proceedings of the 13th International Conference on
Logic for Programming Arti cial Intelligence and Reasoning (LPAR 2006), Phnom
Penh, Cambodia, November, 2006. (2006)
24. Simou, N., Kollias, S.: Fire : A fuzzy reasoning engine for impecise knowledge,
K-Space PhD Students Workshop, Berlin, Germany, 14 September 2007 (2007)</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Meghini</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sebastiani</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Straccia</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>A model of multimedia information retrieval</article-title>
          .
          <source>Journal of the ACM</source>
          <volume>48</volume>
          (
          <issue>5</issue>
          ) (
          <year>2001</year>
          )
          <volume>909</volume>
          {
          <fpage>970</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Zadeh</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>The concept of a linguistic variable and its application to approximate reasoning</article-title>
          .
          <source>Information Sciences 8</source>
          <volume>-9</volume>
          (
          <year>1975</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Ding</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peng</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pan</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <article-title>Studies in Fuzziness and Soft Computing</article-title>
          . In: BayesOWL: Uncertainty Modeling in Semantic Web Ontologies. Springer-Verlag (
          <year>2005</year>
          )
          <fpage>27</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4. da Costa,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Laskey</surname>
          </string-name>
          ,
          <string-name>
            <surname>K.</surname>
          </string-name>
          :
          <article-title>PR-OWL: A framework for probabilistic ontologies</article-title>
          .
          <source>In: Proceedings of the International Conference on Formal Ontology in Information Systems</source>
          . (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Udrea</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Subrahmanian</surname>
            ,
            <given-names>V.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Majkic</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <article-title>Probabilistic rdf</article-title>
          .
          <source>In: Proc. of IRI'06</source>
          . (
          <year>2006</year>
          )
          <volume>172</volume>
          {
          <fpage>177</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Straccia</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>Towards a fuzzy description logic for the semantic web</article-title>
          .
          <source>In: Proceedings of the 2nd European Semantic Web Conference</source>
          . (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Stoilos</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stamou</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tzouvaras</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <string-name>
            <surname>Fuzzy</surname>
            <given-names>OWL</given-names>
          </string-name>
          :
          <article-title>Uncertainty and the semantic web</article-title>
          .
          <source>In: Proc. of the International Workshop on OWL: Experiences and Directions</source>
          . (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>