=Paper= {{Paper |id=None |storemode=property |title=Parallelizing Algebraic Reasoning for the Description Logic SHOQ |pdfUrl=https://ceur-ws.org/Vol-1054/paper-05.pdf |volume=Vol-1054 |dblpUrl=https://dblp.org/rec/conf/csws/FaddoulM13 }} ==Parallelizing Algebraic Reasoning for the Description Logic SHOQ== https://ceur-ws.org/Vol-1054/paper-05.pdf
              Parallelizing Algebraic Reasoning for the
                      Description Logic SHOQ
                             Jocelyne Faddoul                                             Wendy MacCaull
                     Centre for Logic and Information                           Centre for Logic and Information
                       St. Francis Xavier University                              St. Francis Xavier University
                           Nova Scotia, Canada                                        Nova Scotia, Canada
                          Email: jfaddoul@stfx.ca                                  Email: wmaccaul@stfx.ca



   Abstract—Reaching the full potential of the semantic web           of lightweight ontologies, i.e., expressed using the extensions
awaits the availability of highly scalable reasoners. Despite nu-     of the tractable DL EL. Sacrificing the expressivity of the
merous efforts to optimize existing Description Logics reasoners,     knowledge modelled is a limiting (and often unacceptable)
there is always the need to compromise the expressivity or the size
of the used ontologies in time sensitive applications. Hybrid alge-   compromise. For example, given the Foundational Model of
braic reasoning has been investigated in the context of optimizing    Anatomy (FMA) ontology1 , one might add axiom (1) to
reasoning with ontologies where the expressivity is rich enough       express the fact that the adult human has 206 bones, and axiom
to include qualified cardinality restrictions and nominals. On the    (2) to express the fact that the knee joint is involved in more
other hand parallel models have been considered to allow scalable     than 100 rheumatic diseases2 . These axioms use the qualified
reasoning with ontologies, however, only poor Description Logic
expressivity has been considered. In this work, we investigate        cardinality restrictions (QCRs) DL operator, which is known to
parallelizing hybrid algebraic reasoning as a means to seek           lead to severe performance degradation of existing state of the
scalable solutions without the need to sacrifice expressivity.        art DL reasoners (e.g., Fact++3 , Hermit4 , Pellet5 ). RacerPro6
                                                                      remains the only DL reasoner that can efficiently handle QCRs
                        I. M OTIVATION                                using algebraic reasoning, however, it does not fully support
   Applications of the semantic web are numerous, wide                nominals.
ranging and have tremendous potential for adding value in
a vast array of situations which can take advantage of in-
                                                                                                Adult v P ersonu ≥ 206 hasBone      (1)
telligence, i.e., the capacity to reason over knowledge stored
in a knowledge base such as an ontology. However, if the               KneeJoint v≥ 100 involvedInDisease.RhDisease                 (2)
application is time sensitive, the time required for reasoning           To the best of our knowledge, algebraic reasoning remains
can be prohibitive.                                                   the most promising approach for DL reasoning with ontologies
   Description logics (DL) have gained a lot of attention in          relying on the use of QCRs. This has been shown in fragments
the research community as they provide a logical formalism            of DL using Qualified Cardinality Restrictions (QCRs)[2],
for the codification of medical knowledge, ontologies, and            [3], inverse roles [4], [5], and nominals [6], [7]. Practical
the semantic web. There has been a great deal of research             implementations of such algebraic tableau algorithms requires
into optimizing DL reasoning strategies and in carving out            a carefully chosen set of optimizations in order to outperform
fragments over which reasoning can proceed at a reasonable            the highly optimized existing state of the art reasoners. Most
pace — but reasoning using these strategies or over these             algebraic tableau-based algorithms proposed so far are double
fragments often does not scale to allow the use of large              exponential in the worst case; their optimized implementations
ontologies. Reasoning for time sensitive tasks still requires         have been tested on a suite of artificial or often adapted subsets
severe restrictions on the expressivity, the complexity and/or        of ontologies. The scalability of the algebraic approach with
the size of the ontology which, of course, limits the knowledge       real world and often large ontologies remains open.
that can be used.                                                        The high performance computing (HPC) paradigm would
   Standard DL inference services, e.g., TBox classification,         seem to offer a solution to these problems, but progress
concept satisfiability checking, instance checking, etc., have        using high performance computing methodologies has been
been extended with query answering in order to extract in-            challenging and slow [8], [9], [10]. The techniques that have
formation and drive applications such as web services and             offered speedy solutions in other domains (e.g., for “number
workflow management systems [1]. For many applications
                                                                        1 http://sig.biostr.washington.edu/projects/fm/index.html
(e.g., associated with health services delivery) these services
                                                                        2 http://www.medicinenet.com/knee pain/article.htm
are time sensitive, but require time consuming reasoning                3 http://owl.man.ac.uk/factplusplus/
over complex and often large ontologies. The expressivity of            4 http://www.hermit-reasoner.com/
the domain knowledge is often sacrificed in order to meet               5 http://clarkparsia.com/pellet/

practical reasoning performance, hence the recent popularity            6 http://www.racer-systems.com/
crunching” in the physical sciences) do not suffice to crack        DL reasoning and hybrid algebraic tableau DL reasoning for
the time bottleneck of reasoning tasks required for effective       the DL SHOQ.
use of ontologies. Work is needed to find techniques for                  a) Standard Tableau: In the case of the standard tableau
this kind of computing. The increasing availability of cloud        algorithm for the DL SHOQ [17], non-determinism is due to:
computing facilities means that we can all have access to              • handling disjunctions (The t-Rule): if there exists in the
powerful computing resources; indeed, our own laptops have                completion graph a node x such that C1 t C2 is in the
multiple cores. New methods are needed if we are to take                  label, L(x), of x then there can be two possible and
advantage of their potential.                                             distinct ways to extend the completion model: one in
   Recently, there has been encouraging results [11], [12], [10],         which C1 is added to the label of x, and one in which
[13], [14]. The work considered so far, considers parallelizing           C2 is added to the label of x.
the TBox classification task [12], the Abox querying task              • handling qualified cardinality restrictions (choose-rule,
[15], or the concept satisfiability checking task [8], [11]               ≤-rule): if there exists in the completion graph a node
using ontologies relying on the least expressive fragments of             x with ≤ nR.C its label and there exists m >= 1 nodes
DLs. Parallelizing algebraic reasoning to allow the handling              y, y1 . . . ym , related to x via the role R, then:
of large ontologies using number restrictions needs further                 – for each ym there can be two possible and distinct
investigation.                                                                 ways to extend the completion model: one in which
   Our research is focused on finding ways to combine high                     C is added to the label of ym , and one in which ¬C
performance computing and algebraic tableau reasoning [6]                      is added to the label of ym (choose-rule).
to enable scalable reasoning with ontologies handling the                   – if m > n there can be m!    n! possible and distinct ways
expressivity of the DL SHOQ.                                                   to extend the completion model such that in each
 II. H IGH P ERFORMANCE C OMPUTING AND A LGEBRAIC                              case excess role fillers (yi and yj , i 6= j) are merged
                     R EASONING                                                until the at-most restriction is satisfied (≤-rule).
                                                                          b) Algebraic Tableau: In the case of the hybrid algebraic
   In this work, we investigate combining HPC and algebraic
                                                                    tableau algorithm for the DL SHOQ [6], disjunctions are
tableau reasoning for deciding DL concept satisfiability. Every
                                                                    handled similar to the case of standard tableau. However,
standard DL reasoning task can be reduced to a concept satis-
                                                                    handling qualified cardinality restrictions relies on the use of
fiability check. Our goal is to parallelize the algebraic tableau
                                                                    the atomic decomposition technique [18], which computes dis-
reasoning algorithm presented in [6] for the DL SHOQ,
                                                                    joint partitions by considering all possible interactions between
which is basic DL ALC extended with transitive roles, role
                                                                    domain elements. This handling of domain elements results in
hierarchies, nominals and qualified cardinality restrictions, and
                                                                    only one additional source of non-determinism rather than the
for which the satisfiability problem is ExpTime-complete.
                                                                    two sources for handling qualified cardinality restrictions with
   The algebraic algorithm presented in [6] decides the satis-
                                                                    the standard tableau:
fiability of a concept C by constructing a compressed com-
                                                                       • handling partitions (the ch-Rule): for each partition com-
pletion graph representing a model. The algorithm is hybrid;
it relies on tableau expansion rules working together with an             puted by the atomic decomposition technique there can be
integer programming solver (e.g., simplex solver) and comes               two possible and distinct ways to extend the completion
with a double exponential worst case complexity. However,                 model: one in which the partition must be empty, and one
in practice and when equipped with suited optimizations,                  in which the partition must have at-least one element.
algebraic reasoning performs better than existing state of the         We argue that hybrid algebraic reasoning appears to have
art reasoners in handling qualified cardinality restrictions and    a better potential for parallelization than standard tableau for
nominals [16], [7]. In this paper, we argue that algebraic          the following reasons:
reasoning is well suited for parallel programming models               • having less sources of non-determinism (2 instead of 3)
offering a potential improvement over standard tableau-based              means less overhead in managing concurrent execution
DL reasoning.                                                             of non-deterministic rules. This also means that adopting
                                                                          optimizations such as dependency directed backtracking
A. Parallel Reasoning                                                     becomes more fine grained and less complicated.
  Constructing completion models for DL concepts often                 • the ch-rule always fires for two choices. This means that
requires non-deterministic choices, which result in separately            the search trees resulting from the distinct branches have
exploring more than one completion graph expansion. In the                similar structure which facilitates load balancing between
case of the DL SHOQ, non-deterministic tableau-rules lead to              parallel expansions of the search tree. Load balancing is a
an independent construction of tableau branches since nodes               common goal in parallel computing where unequal thread
belonging to different branches do not exchange information.              workloads can easily diminish the performance gain of
This feature suggests that we extend the search strategy                  parallelization.
adopted to construct tableau models using parallel processing.         • satisfying qualified cardinality restriction is delegated to
In the following, we list and compare the main sources of                 an inequation solver and can be done in isolation from
non-deterministic expansions in the cases of standard tableau             tableau expansion. This means that the task of satisfying
    the inequations can be delegated to the use of separate
                                                                     Rule completed                                  No leader
    threads, or even FPGAs [14] and GPUs (Graphical Pro-                                            Executing
    cessing Units).
  • the use of “compressed completion graph” consisting of
    proxy nodes representing sets of domain elements instead                Leader available                        New rule applicable
    of completion graphs consisting of a node representing
    each domain element allows the use of a smaller data
    structure representing the completion model. This means
    that a smaller amount of data needs to shared among and                                        New leader
                                                                                   Following                          Leading
    communicated between parallel tasks thus reducing the
    communication overhead between threads.
                                                                        Fig. 1. Thread transitions in the Leader/Followers design pattern.
B. The Parallel Execution Framework
   We consider parallelization of the hybrid algebraic rea-
soning algorithm using an object-oriented framework sup-            soner for Description Logics) prototype reasoner [16], imple-
porting thread level parallelism (TLP). In this framework, a        mented in java, is being redesigned to adopt the parallel exe-
compressed completion graph data structure is shared among          cution framework described in the previous section. Given that
threads which concurrently apply tableau rules until termina-       HARD has been equipped with a suite of crucial optimization
tion of the satisfiability check. In this approach we choose to     techniques such as lazy partitioning and dependency directed
investigate the or-parallelism with a shared memory strategy,       backtracking, one has to consider the possible effects of the
where non-deterministic branches of the ch-Rule and the t-          TLP. One of the features that renders non-deterministic rules
Rule are explored using parallel threads.                           appealing for parallelization, is that they result in completion
   In order to minimize the overhead of creating and destroying     graph expansions which can be explored in isolation. How-
threads every time a non-deterministic rule is applied, we          ever, dependency directed backtracking relies on information
implement the Thread Pool design pattern. This means that           exchange between branches such that dependencies must be
a fixed number of threads is created and organized into a           recorded and consulted before pruning the search space. Such
queue until associated with an applicable completion rule. The      required communication between branches complicates the use
number of threads can be assigned depending on the number           of dependency directed backtracking in our parallel frame-
of available processors and resource thrashing can be avoided.      work. In this context, the use of the Thread-Specific Storage
In this framework, threads coordinate themselves using the          or the Monitor Object seems worth investigating.
Leader/Followers design pattern where a single thread from
the thread pool acts as a leader and manages thread-rule                         IV. R ELATED W ORK AND O UTLOOK
assignment. Figure 1 illustrates the state transitions between         Our work here is motivated by problems arising in the
threads when adopting the Leader/Followers design pattern.          area of health services delivery. The healthcare system is
When a thread is in the leader state, it can immediately change     composed of many different professionals operating at many
state to become in executing state if a non-deterministic com-      sites of care offering a wide variety of services and requiring
pletion rule becomes applicable. A thread in the executing state    a vast amount of information both in the form of data and
can run concurrently with the leader thread and other executing     also in the form of clinical and other protocols. We are
threads. Once a thread finishes expanding a completion graph,       currently involved in a multi-year project in collaboration with
it either changes state to become leader, if no leader thread is    our local health authority and industry partner to develop an
available, or to become follower. In the latter case, a thread is   ontology-driven Careflow Management System. Our lab has
waiting, in the thread pool, to be promoted to the leader state     developed an ontology-driven workflow system and we have
by the current leader. Since the threads expand a shared model,     done some work in the scalability problem for querying over
the compressed completion graph can be implemented as a             the OWL 2RL fragment [19], [20], [21] over large ABoxes.
Monitor Object to ensure synchronization between threads.           We are currently expanding our system with an ontology-
   Even though the order in which expansion rules are applied       driven service discovery engine. We believe that the high
does not affect soundness and completeness of the satisfiability    performance computing paradigm offers a great deal of hope
test, in practice, results have shown that performance speedup      for the scalability problem for knowledge crunching, that is,
can be achieved using certain ordering. We plan to investigate      for ontological reasoning tasks, in time sensitive applications.
our parallel model while considering different ways of enforc-         A parallel algorithm for description logics reasoning has
ing an ordering in which node labels are chosen as premise          been considered in 1995 [22], with limited scalability results
for tableau rules as was done in [11] for the basic DL ALC.         due to hardware limitations. Further results and research ac-
                                                                    tivity have been reported in this area since the work presented
                       III. D ISCUSSION                             in [8], where non-deterministic choices in core satisfiability
  The implementation and evaluation of this parallel frame-         test were explored concurrently. Parallelizing rule-based OWL
work is ongoing work. The HARD (Hybrid Algebraic Rea-               inferencing has been considered in [13] by examining a data
partitioning approach and a rule partitioning approach. Parallel                  [8] T. Liebig and F. Müller, “Parallelizing tableaux-based description logic
reasoning has also been investigated in the context of dis-                           reasoning,” in Proceedings of the 2007 workshop on On the Move to
                                                                                      Meaningful Internet Systems 2007 - OTM 2007, 2007, pp. 1135–1144.
tributed resolution reasoning [23] about interlinked ontologies                   [9] M. Aslani and V. Haarslev, “Concurrent classification of owl ontologies
as an alternative to centralized tableau-based reasoning (DL                          - an empirical evaluation,” in Proceedings of the 2012 International
ALCHIQ). Techniques using the MapReduce algorithm to                                  Workshop on Description Logics, DL-2012, Y. Kazakov, D. Lembo, and
                                                                                      F. Wolter, Eds., vol. 846, Rome, Italy, June 7-10, 2012.
classify EL+ ontologies [24] and fuzzy EL+ ontologies [25]                       [10] K. Wu and V. Haarslev, “A parallel reasoner for the description logic
have been proposed with no empirical evaluation. Concurrent                           ALC,” in Proceedings of the 2012 International Workshop on Descrip-
classification of lightweight ontologies has also been con-                           tion Logics, DL-2012, ser. CEUR Workshop Proceedings, Y. Kazakov,
                                                                                      D. Lembo, and F. Wolter, Eds., vol. 846. Rome, Italy, June 7-10:
sidered in the context of consequence-based reasoning [26].                           CEUR-WS.org, 2012.
Tableau-based concurrent classification of more expressive                       [11] A. Meissner, “Experimental analysis of some computation rules in
ontologies has been recently reported in [9], where lock-free                         a simple parallel reasoning system for the ALC description logic,”
                                                                                      International Journal of Applied Mathematics and Computer Science,
algorithms with limited synchronization have been used in a                           vol. 21, no. 1, pp. 83–95, March 2011.
multi-core environment, and in [10] where specialized data                       [12] M. Aslani and V. Haarslev, “Parallel tbox classification in description
structures have been proposed to optimize the use of a shared                         logics - first experimental results,” in ECAI 2010 - 19th European Con-
                                                                                      ference on Artificial Intelligence, ser. Frontiers in Artificial Intelligence
memory environment.                                                                   and Applications, H. Coelho, R. Studer, and M. Wooldridge, Eds., vol.
   We plan to investigate parallel reasoning in the context of                        215. Lisbon, Portugal, August 16-20: IOS Press, 2010, pp. 485–490.
enhancing core satisfiability tests for expressive ontologies.                   [13] R. Soma and V. Prasanna, “Parallel inferencing for owl knowledge
                                                                                      bases,” in 37th International Conference on Parallel Processing -
Little work has been reported in this context. In [11], a parallel                    ICPP’08, 2008, pp. 75–82.
search engine (Mozart system) was used to parallelize Descrip-                   [14] P. Subramanian, “A field programmable gate array based finite-domain
tion Logics satisfiability check, however, the algorithm only                         constraint solver,” Master’s thesis, School of Graduate Studies, Utah
                                                                                      State University, 2008.
considers basic DL ALC. We plan to handle the expressivity                       [15] J. E. M. Alvarez, “Query engine for massive distributed ontologies using
of the DL SHOQ by designing a parallel architecture for                               mapreduce,” Master’s thesis, Technische Universitat Hamburg-Harburg,
the algebraic tableau calculus presented in [6], and which                            2010.
                                                                                 [16] J. Faddoul, “Reasoning algebraı̈cally with description logics,” Ph.D.
was shown to be the only one able to decide the satisfia-                             dissertation, Concordia University, Montreal, Canada, 2011.
bility of complex ontologies relying on the use of nominals                      [17] I. Horrocks and U. Sattler, “Ontology reasoning in the SHOQ(D)
and qualified cardinality restrictions [16], [7]. We plan to                          description logic,” in Proceedings of the 17th International Joint
                                                                                      Conference on Artificial Intelligence (IJCAI 2001). Morgan Kaufmann,
implement and evaluate our approach in a multi-core and                               Los Altos, 2001, pp. 199–204. [Online]. Available: download/2001/
multi-processor environment using the Atlantic Computational                          ijcai01.pdf
Excellence Network (ACEnet) resources.                                           [18] H. J. Ohlbach and J. Koehler, “Modal logics description logics and
                                                                                      arithmetic reasoning,” Artificial Intelligence, vol. 109, no. 1-2, pp. 1–31,
                                                                                      1999.
                         ACKNOWLEDGMENT                                          [19] F. Rabbi and W. MacCaull, “T-square: A domain specific language for
  The first author is an ACEnet Postdoctoral Fellow, and                              rapid workflow development,” in ACM/IEEE 15th International Confer-
                                                                                      ence on Model Driven Engineering Languages & Systems (MODELS
gratefully acknowledges support from Atlantic Canada’s HPC                            2012), ser. Lecture Notes in Computer Science, vol. 7590, Innsbruck,
Consortium; the second author acknowledges support from                               Austria, September 2012, pp. 36–52.
NSERC and ACOA.                                                                  [20] M. R. U. Faruqui and W. MacCaull, “Owlontdb: A scalable reasoning
                                                                                      system for OWL 2 RL ontologie,” in FHIES 2012: International Sym-
                                                                                      posium on Foundations of Health Information Engineering and Systems,
                              R EFERENCES                                             ser. Lecture Notes in Computer Science, vol. 7789. Springer, 2013.
 [1] W. MacCaull and F. Rabbi, “NOVA Workflow: A Workflow Management             [21] F. Rabbi, W. MacCaull, and M. R. U. Faruqui, “A scalable ontology
     Tool Targeting Health Service Delivery,” in International Smposium on            reasoner via incremental materialization,” in Submitted to CBMS 2013.
     Foundations of Health Information Engineering and Systems (FHIES -          [22] F. W. Bergmann and J. J. Quantz, “Parallelizing description logics,”
     2011), ser. Lecture Notes in Computer Science, vol. 7151. Springer,              in 19th Ann. German Conference on Artificial Intelligence, ser. LNCS.
     2012, pp. 75–92.                                                                 Springer-Verlag, 1995, pp. 137–148.
 [2] V. Haarslev, M. Timmann, and R. Möller, “Combining tableaux and            [23] A. Schlicht and H. Stuckenschmidt, “Distributed resolution for expres-
     algebraic methods for reasoning with qualified number restrictions,”             sive ontology networks,” in Web Reasoning and Rule Systems, 3rd
     in Proceedings of the International Workshop on Description                      International Conference (RR-2009), Chantilly, VA, USA, October 2009,
     Logics (DL’2001), Aug. 1-3, Stanford, USA, ser. CEUR Workshop                    pp. 87–101.
     Proceedings, vol. 49, 2001, pp. 152–161. [Online]. Available:               [24] R. Mutharaju, F. Maier, and P. Hitzler, “A MapReduce algorithm for
     citeseer.ist.psu.edu/article/haarslev01combining.html                            el+,” in 23rd International Workshop on Description Logics, 2010, pp.
 [3] N. Farsiniamarj and V. Haarslev, “Practical reasoning with qualified             464–474.
     number restrictions: A hybrid abox calculus for the description logic,”     [25] Z. Zhou, G. Qi, C. Liu, P. Hitzler, and R. Mutharaju, “Reasoning with
     AI Communications, vol. 23, no. 2-3, pp. 205–240, 2010.                          fuzzy-EL+ ontologies using mapreduce,” in ECAI 2012 - 21st European
 [4] Y. Ding, “Tableau-based reasoning for description logics with inverse            Conference on Artificial Intelligence, L. D. R. et al., Ed. IOS Press,
     roles and number restrictions,” Ph.D. dissertation, Concordia University,        2012, pp. 933–934.
     2008.                                                                       [26] Y. Kazakov, M. Krötzsch, and F. Simancı́k, “Concurrent classification
 [5] L. R. Pour, “Algebraı̈c reasoning with the description logic SHIQ,”              of el ontologies,” in Proceedings of the 10th international conference on
     Master’s thesis, Concordia University.                                           The semantic web, ser. ISWC’ 11. Bonn, Germany: Springer-Verlag,
 [6] J. Faddoul and V. Haarslev, “Algebraic tableau reasoning for the de-             2011, pp. 305–320.
     scription logic SHOQ,” Journal of Applied Logic, vol. 8, no. 4, pp.
     334–355, 2010.
 [7] ——, “Optimized algebraic tableau reasoning for the description logic
     SHOQ,” Journal of Artificial Intelligence Research (JAIR) - In prepa-
     ration, 2013.