1 Introduction

Ontological Stream Reasoning via Syntactic Approximation

Yuan Ren

Jeff Z. Pan

Yuting Zhao

0 0 Dept. of Computing Science University of Aberdeen Aberdeen , UK

Traditionally Description Logic (DL)-based ontologies are only used to capture static knowledge. Recently the fast development of the Semantic Web and its evolving data raises the requirements of reasoning services on dynamic knowledge streams. In this paper, we present an efficient ontology stream reasoning approach which combines the delete and re-derive algorithm and the syntactic approximation to guarantee soundness and tractability. Compared with existing works, it allows stream reasoning for ontologies in very expressive language profiles and requires no prior knowledge about the streams.

1 Introduction

Traditionally Description Logic (DL)-based ontologies are only used to capture static knowledge. That is to say, the knowledge base is assumed to be permanent. However the dynamics of knowledge is becoming more and more frequent in various applications, including real time reasoning in ontology editors, sensor data streams, semantic web content updates and mobile semantic web [ 13, 12 ]. In these cases, in reasoning we should consider the erasure of old knowledge and adding of new knowledge. This raises the requirements of stream reasoning, in which people want to do reasoning with changing knowledge [ 17, 18 ].

In [ 19 ] a delete and re-derive (DRed) approach is adopted from the stream data processing in relational database. When the updates of knowledge base occur, the stream reasoner first erases all the entailments that rely on the removed axioms, then re-do the reasoning over the remaining axioms and added axioms to entail new results. Later, [ 6 ] realized that if the time window of stream is fixed, we can use a time tag to annotate each entailed axiom, so that when the time of expiration comes, the stream reasoner immediately knows whether an entailment can be erased. Similarly, when a new entailment is derived, the stream reasoner immediately knows when it will expire. However, the applicability of the later approach relies on the prior knowledge of fixed stream time window. Further more, both of these works still focus on rather simple languages such as RDF and RDFs and relatively small knowledge base.

In this paper, we propose a novel approach which applies syntactic approximation to reduce the reasoning complexity. Instead of trivially using syntactic approximate reasoning to compute entailments, we extend it with the traceability information among entailments, so that when erasing axioms, we can efficiently detect the entailments impacted by such erasure. When adding axioms, we augment the syntactic approximation with the incremental reasoning facility of E L+/E L++ [ 16 ].

The remainder of the paper is organized as follows: In Sec.2 we introduce background knowledge of DL, ontology and the syntactic approximation. In Sec.3 we review existing works on stream reasoning and propose the big picture of our approach. The next two sections present our approach. First in Sec.4 we extend the syntactic approximation so that we maintain the traceability relations among them when we derive entailments. In Sec.5 we utilize the traceability information to efficiently analyze the impact of erasure of axioms. And further apply incremental reasoning technique to deal with adding of axioms. Our proposal is implemented and evaluated in Sec.6. In Sec.7 we discuss two possible extensions of our approach. Sec.8 summaries the paper. 2 2.1

Background DL and Ontology

DL is a family of logic-based knowledge representation formalisms. They vary in language expressive power and computational complexities. In this paper we focus on the DL SROIQ and E L++. For the sake of conciseness, we highlight the parts of syntax and semantics that are of special interest in our approximate reasoning approach. Complete specifications of these two languages can be found in [ 10 ] and [ 1, 2 ], respectively.

In DL, concept and role expressions are composed with language constructs. Let > be the top concept, ? the bottom concept, A an atomic concept, n an integer number, a an individual, r an atomic role, in SROIQ, concept expressions C, D can be inductively composed with the following constructs: > j ? j A j C uD j 9R:C j fag j :C j nR:C j 9R:Self where R is a role expression which is either r or an inverse role R . And 9R:Self is a self-restriction.

Conventionally, CtD; 8R:C and nR:C are used to abbreviate :(:Cu:D); :9R::C and : (n + 1)R:C, respectively. fa1; a2; : : : ; ang can be regarded as abbreviation of fa1g t fa2g t : : : t fang. Without loss of generality, in what follows, we assume all the concepts to be in their unique negated normal forms (NNF)1 and use ~C to denote the NNF of :C. We also call >; ?; A; fag basic concepts because they are not composed by other concepts or roles. Given a knowledge base (an ontology or a TBox, or an ABox), we use CN (RN ) to denote the set of basic concepts (atomic roles) in .

E L++ supports > j ? j A j C u D j 9r:C j fag, where r is an atomic role.

A DL ontology O =< T ; A > is composed of a TBox T and an ABox A. A TBox is a set of concept and role axioms. Both SROIQ and E L++ support concept inclusion axioms (CIs, e.g. C v D) and role inclusion axioms (RIs, e.g. r v s, r1 : : : rn v s). SROIQ supports also other axioms such as functionality of roles (e.g. f unc(r)), inverse roles (e.g. inverseRole(r; s)), etc. If C v D and D v C, we write C D. If C is non-atomic, C v D is a general concept inclusion axiom (GCI). An ABox is a set of assertion axioms. Both SROIQ and E L++ support the concept assertion axioms, 1 An SROIQ concept is in NNF iff negation is applied only to atomic concepts, nominals or

Self-restriction. NNF of a given concept can be computed in linear time [ 9 ]. e.g. a : C, role assertion axioms, e.g. (a; b) : R, individual equality, e.g. a = b and inequality, e.g. a 6= b.

If an axiom can be derived from an ontology O, we say that is entailed by O, denoted by O j= . is an entailment of O. 2.2

Syntactic Approximation

Syntactic approximation [ 14, 15 ] is an approximate reasoning technique that reduces DL reasoning in SROIQ to E L++. The reasoning complexity is thus reduced from 2NEXPTIME-Complete to PTIME-Complete while the results are guaranteed correct. In this subsection, we briefly recall the syntactic approximation technique and its features. For more details about proofs, readers are referred to [ 14, 15 ]. In later sections, we will extend the current approach to support required closed world reasoning services.

The idea of syntactic approximation from SROIQ and E L++ is to first encode non-E L++ expressions with fresh names, then maintain their semantics with additional axioms and separate data structures. For example, complement relations between an named concept A and the new name, e.g. nA, assigned to its complement :A are maintained in the Complement Table (CT). In reasoning phase, additional completion rules will be used to partially recover the semantics.

In approximation, we only consider concepts corresponding to the particular TBox in question. We use the notion term to refer to these “interesting” concept expressions. More precisely, a term is: (i) a concept expression in any axiom, or (ii) a singleton of any individual, or (iii) the complement of a term, or (iv) the syntactic sub-expression of a term. In order to represent all these terms and role expressions that will be used in E L++ reasoning, we first assign names to them.

Definition 1. (Name Assignment) Given S a set of concept expressions, E a set of (negative) role expressions, a name assignment f n is a function as for each C 2 S (R 2 E), f n(C) = C (f n(R) = R) if C is a basic concept (R is named), otherwise f n(C) (f n(R)) is a fresh name.

Now we can transform ontologies into E L++ with additional data structures by the following definition.

++ Transformation) Given an Ontology O and a name assignment Definition 2. (E LCQ

++ transformation Afn;ELC+Q+ (O) is a triple (T ; CT; QT ) constructed as f n, its E LCQ follows: 1. T ; CT and QT are all initialized to ;. 2. for each C v D (C D) in O, T = T [ff n(C) v f n(D)g (T = T [ff n(C) f n(D)g). 3. for each E L++ role axiom 2 O, add [R=fn(R)] into T . 4. for each a : C 2 O, T = T [ ffag v f n(C)g. 5. for each (a;:b) : R 2 O, T = T [ffag v 9f n(R):fbg; fbg v 9f n(Inv(R)):fagg. 6. for each a =: b 2 O, T = T [ ffag fbgg. 7. for each a 6= b 2 O, T = T [ ffag v :fbgg 8. for each term C in T , CT = CT [ f(f n(C); f n(~C))g, and (a) if C is the form C1u: : :uCn, then T = T [ff n(C) f n(C1)u: : :uf n(Cn)g, (b) if C is the form 9r:D, then T = T [ ff n(C) 9r:f n(D)g, (c) if C is the form 9R:Self , then T = T [ ff n(C) 9f n(R):Self g (d) if C is the form nR:D, then i. if n = 0, T = T [ f> v f n(C)g ii. if n = 1, T = T [ ff n(C) 9f n(R):f n(D)g iii. otherwise, T = T [ ff n(C) f n(D)fn(R);ng, and QT = QT [ f(f n(C); f n(R); n)g.

(e) otherwise T = T [ ff n(C) v >g. 9. for each pair of names A and r, if there exist (A; r; i1); (A; r; i2); : : : ; (A; r; in) 2 QT with i1 < i2 < : : : < in, T = T [fAr;in v Ar;in 1 ; : : : ; Ar;i2 v Ar;i1 ; Ar;i1 v Step 2 rewrites all the concept axioms; Step 3 preserves all the E L++ role axioms; Step 4 to 7 rewrite all the ABox axioms and internalize them into the approximated TBox; Step 8 defines terms and constructs the complement table CT and cardinality table QT ; Particularly, in step 8.(d), f n(D)fn(R);n is a fresh name. Obviously, this is unique for a given tuple of D, R and n. We call them cardinality names. Similarly, nR:D will be approximated via the approximation of its complement (n+1)R:D. In step 9, for each pair of name assignment A; r in T , a subsumption chain is added into T because inr:A v : : : v i2r:A v i1r:A v 9r:A.

++ approximation.The E LCQ ++ approximation approx

We call this procedure an E+L+CQontology ([14, Proposition 1]) with a table maintainimates an ontology into an E L ing the complementary relations and another table maintaining the cardinality relations. This approximation can be computed in linear time ([14, Proposition 2]).

++ transformation (T ; CT; QT ), we normalize axioms of form C v Given an E LCQ D1 u : : : u Dn into C v D1; : : : ; C v Dn, and recursively normalize role chain r1 : : : rn v s with n > 2 into r1 : : : rn 1 v u and u rn v s. Because C, Di are basic concepts, this procedure can be done in linear time. In the following, we assume T to be always normalized. For convenience, we use a complement function f c : CNT 7! CNT as: for each A 2 CNT , f c(A) = B if (A; B) 2 CT . Note that if A is a cardinality name, then it does not have a complement. In what follows, when applying f c(A) we always assume that A is not a cardinality name but a assigned name.

With the normalized approximation, the reasoning can be realized by extending E L++ completion rules with support for the CT and QT . Given an ontology E LCQ ++ transformation (T ; CT; QT ), the completion rules will compute, for each basic concept A, a subsumer set S(A) CNT such that if B 2 S(A) then A v B, and for each named role r, a relation set R(r) CNT CNT . For each basic concept A, S(A) is initialized to fA; >g and for each named role r, R(r) is initialized to ;. An example of such rules can is illustrated as follows:

If A 2 S(B) and f c(B) 2= S(f c(A)), then S(f c(A)) := S(f c(A)) [ ff c(B)g This rule asserts the reverse subsumption between concepts to supplement the absence of negation, i.e. A v B !~A v~B.

The whole set of rules (R1-R16) can be found in [ 14 ]. Reasoning with them is tractable and sound ([14, Theorem 1 and 2]).

As in classical reasoning, unsatisfiability checking of a concept C can be reduced to entailment checking of C v ?; ontology inconsistency checking can be reduced to entailment checking of > v ? or fag v ?. 3

Technical Motivation

In this section, we present a technical definition of stream reasoning, then show its significance and difficulty by analyzing exisiting works. 3.1

Stream Reasoning

As we introduced in early sections, a stream can be realized by updates on an ontology. Such updates consist of erasure of existing axioms and adding of new axioms. Definition 3. An ontology stream is a sequence of ontologies O(0); O(1); : : : constructed as follows: 1. O(0) is an initial ontology; 2. O(i + 1) = O(i) n Er(i) [ Ad(i). where Er(i) O(i) makes up a sequence of erased axioms. Ad(i) makes up a sequence of added axioms.

The change from O(i) to O(i + 1) is an update. It’s important to note that the sequences of erased axioms and added axioms are not necessarily known to reasoners in advance, i.e. when computing Ans(O(i); q), a stream reasoner does not necessarily know Er(j) and Ad(j) if j i.

For a stream O(0); O(1); : : : and a reasoning problem q, applications ask for reasoning answers of q on O(i), denoted by Ans(O(i); q). Such answers can be directly computed on O(i), i.e. Ans(O(i); q) = f (O(i); q), which means the answer is a function of the O(i) and q. We call this the naive approach.

More interestingly, people would like to compute Ans(O(i+1); q) based on Ans(O(i); q) so that partial results can be reused, i.e. Ans(O(i+1); q) = g(Ans(O(i); 1); q; Er(i); Ad(i)), which means the answer is a function of Ans(O(i + 1)); q; Er(i) and Ad(i). Note that, when Er(i) = ; for all i in the stream, stream reasoning is reduced to incremental reasoning because no axiom is erased. When Ad(i) = ; for all i in the stream, stream reasoning is reduced to incremental revision because no axiom is added. 3.2

Existing Works

There have been many works regarding querying streaming data on the semantic web. [ 7 ] proposed an extension of SPARQL to process data streams. Similarly, C-SPARQL [ 5, 4 ], another extension of SPARQL can continuously query from a RDF knowledge base.

More related works focus on continuously and incrementally updating and materializing ontological knowledge bases. [ 19 ] adopted the Delete and Re-derive (DRed) algorithm [ 8 ] from traditional data stream management systems and proposed a declarative variant of it. When change occurs, the “stream reasoner” first overestimates the consequences of the deletion, then “cash-back” the over-deleted consequences that can be derived by other facts, and finally adds new entailments that are derived from the new facts. This approach maintains ontologies in logical databases and update them with logical programs thus it applied to ontology languages that can be translated to Datalog programs, such as RDF(S) and DLP. Also, the relations among asserted axioms and their consequences have to be maintained by additional rules so that when the asserted axioms are erased it is easy to final their consequences. This makes the maintenance program substantially large than the original program.

Similar idea was used in [ 6 ] with an additional assumption that the time window of stream is fixed and known to the stream reasoner. So that it is no longer needed to maintain the relations among asserted axioms and their consequences. Instead, the relation between a consequence and a time point should be maintained so that when the time comes, the stream reasoner expires corresponding consequences. However, this still requires updating expiration time of existing entailments when new facts are asserted into the ontology, which ends up with computing all possible justifications for each entailment, making it difficult to apply this approach to more expressive languages than RDF(S). Furthermore, if the time window is not known to the stream reasoner, this approach can not work. 3.3

Our Approach

Compared with existing works, our approach will have the following differences: 1. We apply syntactic approximation to stream reasoning. This enables stream reasoning for more expressive ontology languages. 2. Both asserted axioms and their consequences are entailed by the current ontology.

When doing reasoning, we can keep traceability relations between the deriving facts and the derived facts. Such relations can be easily used to find impacted entailments in deletion. 3. When dealing with newly added axioms, we incrementally compute new entailments. Since we approximate ontologies to E L++, we can use its tractable incremental reasoning facility [ 16 ].

In the next sections, we first extend syntactic approximation to maintain traceability information on the fly, then use such information in updating ontologies. 4

Augmenting Syntactic Approximation with Traceability Information

In this section, we investigate how to further extend our proposed syntactic approximation to maintain the traceability among entailments.

The basic idea is to construct a directed traceability graph, whose nodes are entailments, whose edges are dependency between entailments such that if there exists an edge from entailment e1 to entailment e2 then e2 is derived from e1 (together with other entailments). It is obvious that a node can have multiple inbound edges if the entailment is derived from multiple other entailments. Asserted axioms and tautology axioms do not have any inbound edges or only have inbound edges from themselves. In case of syntactic approximation, the traceability is established in two phases: the first phase is on the approximation and normalization (Sec 4.1), while the second phase is on reasoning (Sec 4.2).

4.1 Traced Approximation and Normalization

In the approximation and normalization phase, the input is a given ontology O, while the output is an extended syntactic transformation (T ; CT; QT ). As introduced in Sec 2, in (T ; CT; QT ), elements of CT and QT are used to represent the semantics of some new names in T . Therefore we only need to get trace of axioms in T . Such traceability are maintained in a Traceability Graph T G = (N; E). An element e 2 N is an entailment of O and (T ; CT; QT ) and E N N represents the derivation among entailments. This is realized by the following definition: (T ; CT; QT; T G), T G = (N; E) constructed as follows: Definition 4. (Traced Syntactic Transformation) Given an ontology O with Afn;ELC+Q+ (O) = (T 0; CT 0; QT 0), then its traced syntactic transformation tAfn;ELC+Q+ (O) is a four-tuple

The above definition creates traceability information for T when approximation and normalization are performed. It is also important to make Step-3 before 4, make 4 before 5 and 6, and make Step-9 before the other role axioms , because there might be duplications in approximating axioms. In this case we only maintain one of them to maintain tractability of the solution. For example, in ffag A; fag v A; a : Ag the same approximated axiom fag v A has three different asserted axioms. According to Def.4 only (fag A; fag v A) will be maintained in T G. (T ; CT; QT; T G) can be computed in linear time.

Proposition 1. For an ontology O, its traced syntactic transformation tAfn;ELC+Q+ (O) =

According to the above proposition and Def.4, given a traced syntactic transformation (T ; CT; QT; T G), all the axioms in T generated by Step-2 and 3 in Def.2 have been normalized traced to their asserted axioms in O. The other axioms, i.e. those generated by Step-8 and 9 of Def.2 do not have traceability information because they are actually tautologies and do not rely on axioms in O. These axioms will still need to be normalized and the normalized axioms should be added into T G as nodes. In what follows, we always assume T to be completely normalized. 4.2

Traced Reasoning

In the reasoning, the input is a traced syntactic transformation (T ; CT; QT; T G), and the output are the subsumer sets of basic concepts and relation sets of atomic roles. We need to maintain the traceability information regarding elements of these sets. For example, B 2 S(A) only if A v B can be inferred, then we should add A v B as a node in the T G and connect it to other entailments.

In addition, the computation of subsumption closure of roles are implicitly treated in the original reasoning paradigm. Here we make them explicit and maintain the traceability information as well. We use r v s to denote that r v s can be derived. In initialization of the reasoning, we have r v r for every r 2 RNT . Obviously r v r should be added into T G without inbound edges.

To this end, we can maintain the traceability of inferences results when applying the completion rules. For each rule, the descendent axiom should be added into the T G as a new node, and an edge should be created from each of the antecedent axioms to the descendent axiom. For example, suppose the traceability graph T G = (N; E), the traced completion rule for role subsumption is as follows:

If r1 v r2, r2 v r3 and r1 v r3 not inferred then infer r1 v r3; N = N [ fr1 v r3g and E = E [ f(r1 v r2; r1 v r3); (r2 v r3; r1 v r3)g Another example is the rule we showed at the end of Sec.2.2:

If A 2 S(B) and f c(B) 2= S(f c(A)) then S(f c(A)) := S(f c(A)) [ ff c(B)g, N = N [ ff c(A) v f c(B)g and T T = T T [ f(B v A; f c(A) v f c(B))g The other rules can be extended in a similar way. A useful optimization when generating traceability edges is that, if a node is not reachable from any asserted axiom, we do not need to generate any outbound edges from it. This is because when erasing asserted axioms from the ontology, such a node will not be impacted. Thus, an entailment derived from it will be impacted only if it is also connected to another impacted node.

Obviously, the traced completion rules have the same time complexity and reasoning quality as the original rules: Theorem 1. For any ontology O and its traced syntactic transformation (T ; CT; QT; T G), TBox reasoning by traced completion rules will terminate in polynomial time w.r.t. jCNOj + jRNOj and is soundness guaranteed.

Because the syntactic approximation computes the subsumption between all named concepts in “one go”, thus the tAfn;ELC+Q+ (O), especially the traceability graph T G = (N; E), can be regarded as a by-product of Ans(O; q) for any ontology O and any reasoning problem q. If the reasoning problem is to classify the ontology, then Ans(O; q) N because N contains all the entailed concept subsumptions.

In previous works [ 11 ], similar techniques are used to compute justifications in a glass-box manner. The major differences are: 1. In justification computation, the traceability information is used in a backward manner, i.e. given an entailment, finding all asserted axioms that derive it. In stream reasoning, as we will show, the traceability information is used in a forward manner, i.e. given an asserted axiom, finding all entailments impacted by it. 2. In justification computation, traceability information is recursively collapsed, i.e. the intermediate entailments are not part of the justification, only the original axioms are. In stream reasoning, the intermediate entailments also count because they are also part of the reasoning results. 5

Stream Reasoning with Traced Syntactic Approximation

We follow the idea of DRed to split the updates of ontologies into two steps: erasing and adding. 5.1

Erasing Axioms from Ontology

In the erasing step, the traceability graph provides a convenient way to find all entailments impacted by the erased axioms. More precisely, suppose tAfn;ELC+Q+ (O(i)) = (T E (i); CT E (i); QT E (i); T GE (i)), where T GE (i) = (N E (i); EE (i)), as follows: (T (i); CT (i); QT (i); T G(i)), where T G(i) = (N (i); E(i)), and we want to erase Er(i) O(i) from the O(i), then we can construct an erased approximation eAfn;ELC+Q+ (O(i); Er(i)) =

As we can see, all the entailments (including the erased axioms themselves) reachable from the erased axioms are removed from the approximation and the traceability graph. The corresponding traceability edges are removed as well.

Consequently, the reasoning results should be erased: 1. for any role r in T (i) and not in T E (i), erasing all role subsumption involving R.

And removing R(r). 2. earsing all role subsumptions r v s 2 N (i) and r v s 2= N E (i). 3. for any basic concept A in T (i) and not in T E (i), removing S(A). 4. for any basic concepts A, B such that B 2 S(A), if A v B 2= N E (i), then

S(A) = S(A) n fBg. 5. for any role name r, basic concepts A and B such that (A; B) 2 R(r), if A v 9r:B 2= N E (i), then R(r) = R(r) n f(A; B)g.

Similar as the DRed approach, such erasure may lose some information. Using our old example O(i) = ffag A; fag v A; a : Ag, we only have (fag A; fag v A) 2 E(i). If fag A 2 Er(i) then we will have fag v A 2= N E (i). Compared to tAf n;ELC+Q+ (O(i) n Er(i)) this loses information, because fag v A still remains in O(i) n Er(i) or can still be approximated from a : A. As a consequence, any further entailments reachable from fag v A are all missing. We will need to re-approximate and re-derive these entailments with the added axioms. 5.2

Adding Axioms into Ontology

In the adding step, we need to re-perform the approximation and reasoning. Such reperforming can be optimized with the erased approximation. More precisely, suppose the erased approximation eAf n;ELC+Q+ (O(i); Er(i)) = (T E (i); CT E (i); QT E (i); T GE (i)), where T GE (i) = (N E (i); EE (i)), and we want to add Ad(i) into the ontology (Ad(i) canAb(ei);;)C,tTheAn(wi)e; QcaTnAco(in)s;tTruGctAa(ni)a)d,dwedhearpepTroGxiAm(ai)tio=n (aNAAfn(;iE)L;EC+Q+A((Oi)()i,)a;sEfro(lilo);wAs:d(i)) = (T 1. Generating T A(i); CT A(i) and QT A(i) from O(i) n Er(i) [ Ad(i) with respect to Def.2. But in Step 1, instead of initializing T A(i); CT A(i) and QT A(i) with ;, initializing them with T E (i); CT E (i) and QT E (i), respectively. 2. Generating T GA(i) from (T A(i); CT A(i); QT A(i)) with respect to Def.4. But in Step 2, instead of initializing N A(i) and EA(i) with ;, initializing them with N E (i) and EE (i), respectively.

In the first step, we re-approximate the updated ontology by making use of the remaining approximation after erasure. In the second, we re-compute the traceability graph by making use of the remaining traceability graph after erasure. Compared with a reapproximation from scratch tA(O(i+1)) = (T (i+1); CT (i+1); QT (i+1); T G(i+1)) we have the following properties: T A(i) = T (i + 1); CT A(i) = CT (i + 1); QT A(i + 1) = QT (i + 1) and T GA(i) = T G(i + 1).

Thus, the reasoning results can be updated accordingly. Simply applying the traced completion rules again then we will get the results on O(i + 1). Because we have already obtained partial results on the S sets, R sets and role subsumptions, such a completion procedure will be more efficient than computing all results from scratch. And the reasoning is still tractable and sound: Theorem 2. For any ontology O(i), erased axioms Er(i) O(i) and added axioms Ad(i), performing erasing as introduced in Sec.5.1 and adding as introduced in Sec.5.2 will terminate in polynomial time w.r.t. jCNO(i)[Ad(i)j+jRNO(i)[Ad(i)j and the results are the same as reasoning on O(i) n Er(i) [ Ad(i). 6

Evaluation

We implemented the proposed approach in our REL reasoner. In order to evaluate its performance, we tested it with large ABox stream reasoning. Due to the lack of stream reasoning benchmark, we generated our test data from existing benchmark ontologies and simulated streams. The ontology we used is the well-known Lehigh University Benchmark (LUBM) ontology 2. It has a simple TBox but arbitrarily large ABox. In our evaluation, we generate one university with 15 departments. All experiments were conducted in an environment of 64-bit Windows 7 Enterprise with 1.60 GHz CPU and 3G RAM allocated to JVM 1.6.0.07.

We do the following pre-processing to simulate an ontology stream: 1. We randomly partition the ABox into 15 sub-ABoxes of same size. We call them

A0; A1; : : : ; A14. 2. We re-merged the TBox T and sub-ABoxes A0 to A4 into O(0), as the initial ontology for stream reasoning. 3. Hence, we generate streams as follows: O(i + 1) = O(i) n Ai [ Ai+5. In other words, in each update, we erase the Ai from O(i) and add Ai+5 into it to create O(i + 1). Thus, O(1) = T [ A1 [ A2 : : : A5; : : : ; O(10) = T [ A10 [ : : : A14.

Now we create a stream with 10 updates. In each update, 20% of the ABox is changed.

For each stream ontology O(i), we perform ABox reasoning and query for the types of all individuals. This is because, in LUBM a certain type of an individual can have multiple sources, e.g. direct assertion, or through some domain/range axioms, or through subsumption closure of its other types. While the relation between two individuals are usually either directly asserted, or derived through role subsumption, i.e. without multiple sources.

We accomplish such a reasoning task with both the naive approach (re-do reasoning on O(i) from scratch) and the stream reasoning approach. We record the time of both 2 http://swat.cse.lehigh.edu/projects/lubm/ approaches to evaluate the effectiveness of the stream reasoning approach. The results are summarized in the following table (Table 1). 1. We merged n (n=4,5,6,7) sub-ABoxes with the TBox to make the initial ontology, i.e. O(0) = T [ A0 [ : : : An 1. 2. For each n, the updated part varies from 1 sub-ABox to (n 1) sub-ABoxes.

For example, when n = 5, we update the ontology with 1; 2; 3; 4 sub-ABoxes, respectively. Thus, the Update/Ontology ratio (update ratio for short, ABox only) is 20%; 40%; 60% and 80%, respectively.

For each n and each update ratio, we do same reasoning as before with both the naive approach and the stream reasoning approach, and then compute the TStream Reasoning=TNaive ratio (time ratio for short), where TStream Reasoning and TNaive are time for the two approaches, respectively. The relation between update ratio and time ratio is illustrated in Figure 1.

From this figure, we can see that stream reasoning approach always pays off no matter how large the ABox is, and how large the updated part is. This justifies the usability of our approach. 7

Further Extensions

Although not implemented, we introduce two extensions of our approach in this section. Both of them try to trade time consumption with space consumption: 1. Maintaining Erased Information: there is a possibility that erased axioms may be added back into the ontology again in the future. With the current solution, we still need to re-compute all information related to them. A potential extension is that, instead of immediately removing all corresponding information in erasure, we label them as “erased” and do not use them in subsequent approximation and reasoning. When they are added back into the ontology, we remove the “erased” label and certain results can be enabled immediately without reasoning. Consequently the size of the T G will increase with the stream, i.e. jT Gj is polynomial w.r.t. jO(0) [ Ad(1) [ Ad(2) [ : : : j. 2. Maintaining Complete Traceability Information: as our early example ffag A; fag v A; a : Ag shows, an entailment fag v A can have multiple sources. In the current solution we only maintain one of them in the T G. So that when such a source is erased, we need to re-perform approximation and reasoning to obtain the entailment. A potential extension is to maintain traceability information from all the sources in the T G and erase an entailment only if all the asserted axioms it is reachable from are erased. For example, when erasing fag A from the above ontology, A v fag should be erased but fag v A should not. Therefore entailments have multiple sources do not need to be re-computed. However, the complexity of this extension is comparable to computing all justifications, i.e. the computation is NP-Complete w.r.t. jO(i) [ Ad(i)j. 8

Conclusion

In this paper, we presented an approach to ontological stream reasoning via syntactic approximation. We took advantage of the complexity reduction brought by approximation technology and enabled stream reasoning for relatively complex languages and large knowledge bases by extending syntactic approximation with a traceability graph. And then we use such a graph to erase entailments from the ontology, and use incremental reasoning facility to handle the added axioms. Evaluation on benchmark ontology shows that our approach works nicely in practice. In the future, we would like to further improve the scalability of our approach. Combining with the time tag approach is also a potential improvement.

Baader ,

Brandt , and

Lutz. Pushing the E L Envelope . In Proceedings IJCAI-05 , 2005 .

Franz

Baader ,

Sebastian

Brandt , and

Carsten

Lutz . Pushing the EL Envelope Further . In Kendall Clark and Peter F. Patel-Schneider, editors, In OWLED-2008 , 2008 .

Franz

Baader , Diego Calvanese, Deborah L. McGuinness , Daniele Nardi , and Peter F. PatelSchneider, editors. The Description Logic Handbook: Theory , Implementation, and Applications . Cambridge University Press, 2003 .

Davide

Francesco Barbieri , Daniele Braga, Stefano Ceri, and

Michael

Grossniklaus . An execution environment for c-sparql queries . In EDBT '10 , 2010 .

Davide

Francesco Barbieri , Daniele Braga, Stefano Ceri, Emanuele Della Valle, and

Michael

Grossniklaus . C-sparql: Sparql for continuous querying . In WWW2009 , 2009 .

Davide

Francesco Barbieri , Daniele Braga, Stefano Ceri, Emanuele Della Valle, and

Michael

Grossniklaus . Incremental reasoning on streams and rich background knowledge . In ESWC2010 , 2010 .

Andre

Bolles , Marco Grawunder, and

Jonas

Jacobi . Streaming sparql extending sparql to process data streams . In ESWC08 , 2008 .

Ashish

Gupta , Inderpal Singh Mumick, and

V. S.

Subrahmanian . Maintaining views incrementally . In SIGMOD '93 , 1993 .

Bernhard

Hollunder , Werner Nutt, and Manfred Schmidt-Schauß. Subsumption Algorithms for Concept Description Languages . In ECAI-90 , pages 348 - 353 . Pitman Publishing, 1990 .

10. Ian

Horrocks

, Oliver Kutz, and

Ulrike

Sattler . The Even More Irresistible SROIQ . In KR 2006 , 2006 .

11. Sik Chun (joey) Lam, Jeff Z . Pan, Derek Sleeman, and

Wamberto

Vasconcelos . A fine-grained approach to resolving unsatisfiable ontologies . In In Proc. of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI-2006) , 2006 .

12.

Marko

Luther and

Sebastian

Bohm . Situation-Aware Mobility : An Application for Stream Reasoning . In in Proc. of 1st International Workshop on Stream Reasoning (SR2009) , 2009 .

13. Bijan

Parsia

, Christian Halaschek-wiener, and

Evren

Sirin . E.s.: Towards incremental reasoning through updates . In in OWL DL. In: Proc. WWW-2006 . ( 2006 ), 2006 .

14. Yuan

Ren

, Jeff

Pan , and

Yuting

Zhao . Soundness Preserving Approximation for TBox Reasoning . In the Proc. of the 25th AAAI Conference Conference (AAAI2010) , 2010 .

15. Yuan

Ren

, Jeff

Pan , and

Yuting

Zhao . Towards Soundness Preserving Approximation for ABox Reasoning of OWL2 . In the Proc. of the International Description Logic Workshop (DL2010) , 2010 .

16.

Boontawee

Suntisrivaraporn . Module Extraction and Incremental Classification: A Pragmatic Approach for E L+ Ontologies . In ESWC' 08 , 2008 .

17. Emanuele Della Valle, Stefano Ceri, Daniele Braga, Irene Celino, Dieter Frensel, Frank van Harmelen, and Gulay Unel. Research Chapters in the Area of Stream Reasoning . In in Proc. of 1st International Workshop on Stream Reasoning (SR2009) , 2009 .

18. Emanuele Della Valle, Stefano Ceri, Frank van Harmelen,

and Dieter

Fensel . It's a streaming world! reasoning upon rapidly changing information . IEEE Intelligent Systems , 24 : 83 - 89 , 2009 .

19. Raphael

Volz

, Steffen Staab , and Boris Motik . Incementally maintaining materializations of ontologies stored in logic databases . In Journal of Data Semantics II, LNCS , Vol 3360 , 2 : 1 - 34 , 2005 .