=Paper=
{{Paper
|id=None
|storemode=property
|title=Parallelizing Algebraic Reasoning for the Description Logic SHOQ
|pdfUrl=https://ceur-ws.org/Vol-1054/paper-05.pdf
|volume=Vol-1054
|dblpUrl=https://dblp.org/rec/conf/csws/FaddoulM13
}}
==Parallelizing Algebraic Reasoning for the Description Logic SHOQ==
Parallelizing Algebraic Reasoning for the Description Logic SHOQ Jocelyne Faddoul Wendy MacCaull Centre for Logic and Information Centre for Logic and Information St. Francis Xavier University St. Francis Xavier University Nova Scotia, Canada Nova Scotia, Canada Email: jfaddoul@stfx.ca Email: wmaccaul@stfx.ca Abstract—Reaching the full potential of the semantic web of lightweight ontologies, i.e., expressed using the extensions awaits the availability of highly scalable reasoners. Despite nu- of the tractable DL EL. Sacrificing the expressivity of the merous efforts to optimize existing Description Logics reasoners, knowledge modelled is a limiting (and often unacceptable) there is always the need to compromise the expressivity or the size of the used ontologies in time sensitive applications. Hybrid alge- compromise. For example, given the Foundational Model of braic reasoning has been investigated in the context of optimizing Anatomy (FMA) ontology1 , one might add axiom (1) to reasoning with ontologies where the expressivity is rich enough express the fact that the adult human has 206 bones, and axiom to include qualified cardinality restrictions and nominals. On the (2) to express the fact that the knee joint is involved in more other hand parallel models have been considered to allow scalable than 100 rheumatic diseases2 . These axioms use the qualified reasoning with ontologies, however, only poor Description Logic expressivity has been considered. In this work, we investigate cardinality restrictions (QCRs) DL operator, which is known to parallelizing hybrid algebraic reasoning as a means to seek lead to severe performance degradation of existing state of the scalable solutions without the need to sacrifice expressivity. art DL reasoners (e.g., Fact++3 , Hermit4 , Pellet5 ). RacerPro6 remains the only DL reasoner that can efficiently handle QCRs I. M OTIVATION using algebraic reasoning, however, it does not fully support Applications of the semantic web are numerous, wide nominals. ranging and have tremendous potential for adding value in a vast array of situations which can take advantage of in- Adult v P ersonu ≥ 206 hasBone (1) telligence, i.e., the capacity to reason over knowledge stored in a knowledge base such as an ontology. However, if the KneeJoint v≥ 100 involvedInDisease.RhDisease (2) application is time sensitive, the time required for reasoning To the best of our knowledge, algebraic reasoning remains can be prohibitive. the most promising approach for DL reasoning with ontologies Description logics (DL) have gained a lot of attention in relying on the use of QCRs. This has been shown in fragments the research community as they provide a logical formalism of DL using Qualified Cardinality Restrictions (QCRs)[2], for the codification of medical knowledge, ontologies, and [3], inverse roles [4], [5], and nominals [6], [7]. Practical the semantic web. There has been a great deal of research implementations of such algebraic tableau algorithms requires into optimizing DL reasoning strategies and in carving out a carefully chosen set of optimizations in order to outperform fragments over which reasoning can proceed at a reasonable the highly optimized existing state of the art reasoners. Most pace — but reasoning using these strategies or over these algebraic tableau-based algorithms proposed so far are double fragments often does not scale to allow the use of large exponential in the worst case; their optimized implementations ontologies. Reasoning for time sensitive tasks still requires have been tested on a suite of artificial or often adapted subsets severe restrictions on the expressivity, the complexity and/or of ontologies. The scalability of the algebraic approach with the size of the ontology which, of course, limits the knowledge real world and often large ontologies remains open. that can be used. The high performance computing (HPC) paradigm would Standard DL inference services, e.g., TBox classification, seem to offer a solution to these problems, but progress concept satisfiability checking, instance checking, etc., have using high performance computing methodologies has been been extended with query answering in order to extract in- challenging and slow [8], [9], [10]. The techniques that have formation and drive applications such as web services and offered speedy solutions in other domains (e.g., for “number workflow management systems [1]. For many applications 1 http://sig.biostr.washington.edu/projects/fm/index.html (e.g., associated with health services delivery) these services 2 http://www.medicinenet.com/knee pain/article.htm are time sensitive, but require time consuming reasoning 3 http://owl.man.ac.uk/factplusplus/ over complex and often large ontologies. The expressivity of 4 http://www.hermit-reasoner.com/ the domain knowledge is often sacrificed in order to meet 5 http://clarkparsia.com/pellet/ practical reasoning performance, hence the recent popularity 6 http://www.racer-systems.com/ crunching” in the physical sciences) do not suffice to crack DL reasoning and hybrid algebraic tableau DL reasoning for the time bottleneck of reasoning tasks required for effective the DL SHOQ. use of ontologies. Work is needed to find techniques for a) Standard Tableau: In the case of the standard tableau this kind of computing. The increasing availability of cloud algorithm for the DL SHOQ [17], non-determinism is due to: computing facilities means that we can all have access to • handling disjunctions (The t-Rule): if there exists in the powerful computing resources; indeed, our own laptops have completion graph a node x such that C1 t C2 is in the multiple cores. New methods are needed if we are to take label, L(x), of x then there can be two possible and advantage of their potential. distinct ways to extend the completion model: one in Recently, there has been encouraging results [11], [12], [10], which C1 is added to the label of x, and one in which [13], [14]. The work considered so far, considers parallelizing C2 is added to the label of x. the TBox classification task [12], the Abox querying task • handling qualified cardinality restrictions (choose-rule, [15], or the concept satisfiability checking task [8], [11] ≤-rule): if there exists in the completion graph a node using ontologies relying on the least expressive fragments of x with ≤ nR.C its label and there exists m >= 1 nodes DLs. Parallelizing algebraic reasoning to allow the handling y, y1 . . . ym , related to x via the role R, then: of large ontologies using number restrictions needs further – for each ym there can be two possible and distinct investigation. ways to extend the completion model: one in which Our research is focused on finding ways to combine high C is added to the label of ym , and one in which ¬C performance computing and algebraic tableau reasoning [6] is added to the label of ym (choose-rule). to enable scalable reasoning with ontologies handling the – if m > n there can be m! n! possible and distinct ways expressivity of the DL SHOQ. to extend the completion model such that in each II. H IGH P ERFORMANCE C OMPUTING AND A LGEBRAIC case excess role fillers (yi and yj , i 6= j) are merged R EASONING until the at-most restriction is satisfied (≤-rule). b) Algebraic Tableau: In the case of the hybrid algebraic In this work, we investigate combining HPC and algebraic tableau algorithm for the DL SHOQ [6], disjunctions are tableau reasoning for deciding DL concept satisfiability. Every handled similar to the case of standard tableau. However, standard DL reasoning task can be reduced to a concept satis- handling qualified cardinality restrictions relies on the use of fiability check. Our goal is to parallelize the algebraic tableau the atomic decomposition technique [18], which computes dis- reasoning algorithm presented in [6] for the DL SHOQ, joint partitions by considering all possible interactions between which is basic DL ALC extended with transitive roles, role domain elements. This handling of domain elements results in hierarchies, nominals and qualified cardinality restrictions, and only one additional source of non-determinism rather than the for which the satisfiability problem is ExpTime-complete. two sources for handling qualified cardinality restrictions with The algebraic algorithm presented in [6] decides the satis- the standard tableau: fiability of a concept C by constructing a compressed com- • handling partitions (the ch-Rule): for each partition com- pletion graph representing a model. The algorithm is hybrid; it relies on tableau expansion rules working together with an puted by the atomic decomposition technique there can be integer programming solver (e.g., simplex solver) and comes two possible and distinct ways to extend the completion with a double exponential worst case complexity. However, model: one in which the partition must be empty, and one in practice and when equipped with suited optimizations, in which the partition must have at-least one element. algebraic reasoning performs better than existing state of the We argue that hybrid algebraic reasoning appears to have art reasoners in handling qualified cardinality restrictions and a better potential for parallelization than standard tableau for nominals [16], [7]. In this paper, we argue that algebraic the following reasons: reasoning is well suited for parallel programming models • having less sources of non-determinism (2 instead of 3) offering a potential improvement over standard tableau-based means less overhead in managing concurrent execution DL reasoning. of non-deterministic rules. This also means that adopting optimizations such as dependency directed backtracking A. Parallel Reasoning becomes more fine grained and less complicated. Constructing completion models for DL concepts often • the ch-rule always fires for two choices. This means that requires non-deterministic choices, which result in separately the search trees resulting from the distinct branches have exploring more than one completion graph expansion. In the similar structure which facilitates load balancing between case of the DL SHOQ, non-deterministic tableau-rules lead to parallel expansions of the search tree. Load balancing is a an independent construction of tableau branches since nodes common goal in parallel computing where unequal thread belonging to different branches do not exchange information. workloads can easily diminish the performance gain of This feature suggests that we extend the search strategy parallelization. adopted to construct tableau models using parallel processing. • satisfying qualified cardinality restriction is delegated to In the following, we list and compare the main sources of an inequation solver and can be done in isolation from non-deterministic expansions in the cases of standard tableau tableau expansion. This means that the task of satisfying the inequations can be delegated to the use of separate Rule completed No leader threads, or even FPGAs [14] and GPUs (Graphical Pro- Executing cessing Units). • the use of “compressed completion graph” consisting of proxy nodes representing sets of domain elements instead Leader available New rule applicable of completion graphs consisting of a node representing each domain element allows the use of a smaller data structure representing the completion model. This means that a smaller amount of data needs to shared among and New leader Following Leading communicated between parallel tasks thus reducing the communication overhead between threads. Fig. 1. Thread transitions in the Leader/Followers design pattern. B. The Parallel Execution Framework We consider parallelization of the hybrid algebraic rea- soning algorithm using an object-oriented framework sup- soner for Description Logics) prototype reasoner [16], imple- porting thread level parallelism (TLP). In this framework, a mented in java, is being redesigned to adopt the parallel exe- compressed completion graph data structure is shared among cution framework described in the previous section. Given that threads which concurrently apply tableau rules until termina- HARD has been equipped with a suite of crucial optimization tion of the satisfiability check. In this approach we choose to techniques such as lazy partitioning and dependency directed investigate the or-parallelism with a shared memory strategy, backtracking, one has to consider the possible effects of the where non-deterministic branches of the ch-Rule and the t- TLP. One of the features that renders non-deterministic rules Rule are explored using parallel threads. appealing for parallelization, is that they result in completion In order to minimize the overhead of creating and destroying graph expansions which can be explored in isolation. How- threads every time a non-deterministic rule is applied, we ever, dependency directed backtracking relies on information implement the Thread Pool design pattern. This means that exchange between branches such that dependencies must be a fixed number of threads is created and organized into a recorded and consulted before pruning the search space. Such queue until associated with an applicable completion rule. The required communication between branches complicates the use number of threads can be assigned depending on the number of dependency directed backtracking in our parallel frame- of available processors and resource thrashing can be avoided. work. In this context, the use of the Thread-Specific Storage In this framework, threads coordinate themselves using the or the Monitor Object seems worth investigating. Leader/Followers design pattern where a single thread from the thread pool acts as a leader and manages thread-rule IV. R ELATED W ORK AND O UTLOOK assignment. Figure 1 illustrates the state transitions between Our work here is motivated by problems arising in the threads when adopting the Leader/Followers design pattern. area of health services delivery. The healthcare system is When a thread is in the leader state, it can immediately change composed of many different professionals operating at many state to become in executing state if a non-deterministic com- sites of care offering a wide variety of services and requiring pletion rule becomes applicable. A thread in the executing state a vast amount of information both in the form of data and can run concurrently with the leader thread and other executing also in the form of clinical and other protocols. We are threads. Once a thread finishes expanding a completion graph, currently involved in a multi-year project in collaboration with it either changes state to become leader, if no leader thread is our local health authority and industry partner to develop an available, or to become follower. In the latter case, a thread is ontology-driven Careflow Management System. Our lab has waiting, in the thread pool, to be promoted to the leader state developed an ontology-driven workflow system and we have by the current leader. Since the threads expand a shared model, done some work in the scalability problem for querying over the compressed completion graph can be implemented as a the OWL 2RL fragment [19], [20], [21] over large ABoxes. Monitor Object to ensure synchronization between threads. We are currently expanding our system with an ontology- Even though the order in which expansion rules are applied driven service discovery engine. We believe that the high does not affect soundness and completeness of the satisfiability performance computing paradigm offers a great deal of hope test, in practice, results have shown that performance speedup for the scalability problem for knowledge crunching, that is, can be achieved using certain ordering. We plan to investigate for ontological reasoning tasks, in time sensitive applications. our parallel model while considering different ways of enforc- A parallel algorithm for description logics reasoning has ing an ordering in which node labels are chosen as premise been considered in 1995 [22], with limited scalability results for tableau rules as was done in [11] for the basic DL ALC. due to hardware limitations. Further results and research ac- tivity have been reported in this area since the work presented III. D ISCUSSION in [8], where non-deterministic choices in core satisfiability The implementation and evaluation of this parallel frame- test were explored concurrently. Parallelizing rule-based OWL work is ongoing work. The HARD (Hybrid Algebraic Rea- inferencing has been considered in [13] by examining a data partitioning approach and a rule partitioning approach. Parallel [8] T. Liebig and F. Müller, “Parallelizing tableaux-based description logic reasoning has also been investigated in the context of dis- reasoning,” in Proceedings of the 2007 workshop on On the Move to Meaningful Internet Systems 2007 - OTM 2007, 2007, pp. 1135–1144. tributed resolution reasoning [23] about interlinked ontologies [9] M. Aslani and V. Haarslev, “Concurrent classification of owl ontologies as an alternative to centralized tableau-based reasoning (DL - an empirical evaluation,” in Proceedings of the 2012 International ALCHIQ). Techniques using the MapReduce algorithm to Workshop on Description Logics, DL-2012, Y. Kazakov, D. Lembo, and F. Wolter, Eds., vol. 846, Rome, Italy, June 7-10, 2012. classify EL+ ontologies [24] and fuzzy EL+ ontologies [25] [10] K. Wu and V. Haarslev, “A parallel reasoner for the description logic have been proposed with no empirical evaluation. Concurrent ALC,” in Proceedings of the 2012 International Workshop on Descrip- classification of lightweight ontologies has also been con- tion Logics, DL-2012, ser. CEUR Workshop Proceedings, Y. Kazakov, D. Lembo, and F. Wolter, Eds., vol. 846. Rome, Italy, June 7-10: sidered in the context of consequence-based reasoning [26]. CEUR-WS.org, 2012. Tableau-based concurrent classification of more expressive [11] A. Meissner, “Experimental analysis of some computation rules in ontologies has been recently reported in [9], where lock-free a simple parallel reasoning system for the ALC description logic,” International Journal of Applied Mathematics and Computer Science, algorithms with limited synchronization have been used in a vol. 21, no. 1, pp. 83–95, March 2011. multi-core environment, and in [10] where specialized data [12] M. Aslani and V. Haarslev, “Parallel tbox classification in description structures have been proposed to optimize the use of a shared logics - first experimental results,” in ECAI 2010 - 19th European Con- ference on Artificial Intelligence, ser. Frontiers in Artificial Intelligence memory environment. and Applications, H. Coelho, R. Studer, and M. Wooldridge, Eds., vol. We plan to investigate parallel reasoning in the context of 215. Lisbon, Portugal, August 16-20: IOS Press, 2010, pp. 485–490. enhancing core satisfiability tests for expressive ontologies. [13] R. Soma and V. Prasanna, “Parallel inferencing for owl knowledge bases,” in 37th International Conference on Parallel Processing - Little work has been reported in this context. In [11], a parallel ICPP’08, 2008, pp. 75–82. search engine (Mozart system) was used to parallelize Descrip- [14] P. Subramanian, “A field programmable gate array based finite-domain tion Logics satisfiability check, however, the algorithm only constraint solver,” Master’s thesis, School of Graduate Studies, Utah State University, 2008. considers basic DL ALC. We plan to handle the expressivity [15] J. E. M. Alvarez, “Query engine for massive distributed ontologies using of the DL SHOQ by designing a parallel architecture for mapreduce,” Master’s thesis, Technische Universitat Hamburg-Harburg, the algebraic tableau calculus presented in [6], and which 2010. [16] J. Faddoul, “Reasoning algebraı̈cally with description logics,” Ph.D. was shown to be the only one able to decide the satisfia- dissertation, Concordia University, Montreal, Canada, 2011. bility of complex ontologies relying on the use of nominals [17] I. Horrocks and U. Sattler, “Ontology reasoning in the SHOQ(D) and qualified cardinality restrictions [16], [7]. We plan to description logic,” in Proceedings of the 17th International Joint Conference on Artificial Intelligence (IJCAI 2001). Morgan Kaufmann, implement and evaluate our approach in a multi-core and Los Altos, 2001, pp. 199–204. [Online]. Available: download/2001/ multi-processor environment using the Atlantic Computational ijcai01.pdf Excellence Network (ACEnet) resources. [18] H. J. Ohlbach and J. Koehler, “Modal logics description logics and arithmetic reasoning,” Artificial Intelligence, vol. 109, no. 1-2, pp. 1–31, 1999. ACKNOWLEDGMENT [19] F. Rabbi and W. MacCaull, “T-square: A domain specific language for The first author is an ACEnet Postdoctoral Fellow, and rapid workflow development,” in ACM/IEEE 15th International Confer- ence on Model Driven Engineering Languages & Systems (MODELS gratefully acknowledges support from Atlantic Canada’s HPC 2012), ser. Lecture Notes in Computer Science, vol. 7590, Innsbruck, Consortium; the second author acknowledges support from Austria, September 2012, pp. 36–52. NSERC and ACOA. [20] M. R. U. Faruqui and W. MacCaull, “Owlontdb: A scalable reasoning system for OWL 2 RL ontologie,” in FHIES 2012: International Sym- posium on Foundations of Health Information Engineering and Systems, R EFERENCES ser. Lecture Notes in Computer Science, vol. 7789. Springer, 2013. [1] W. MacCaull and F. Rabbi, “NOVA Workflow: A Workflow Management [21] F. Rabbi, W. MacCaull, and M. R. U. Faruqui, “A scalable ontology Tool Targeting Health Service Delivery,” in International Smposium on reasoner via incremental materialization,” in Submitted to CBMS 2013. Foundations of Health Information Engineering and Systems (FHIES - [22] F. W. Bergmann and J. J. Quantz, “Parallelizing description logics,” 2011), ser. Lecture Notes in Computer Science, vol. 7151. Springer, in 19th Ann. German Conference on Artificial Intelligence, ser. LNCS. 2012, pp. 75–92. Springer-Verlag, 1995, pp. 137–148. [2] V. Haarslev, M. Timmann, and R. Möller, “Combining tableaux and [23] A. Schlicht and H. Stuckenschmidt, “Distributed resolution for expres- algebraic methods for reasoning with qualified number restrictions,” sive ontology networks,” in Web Reasoning and Rule Systems, 3rd in Proceedings of the International Workshop on Description International Conference (RR-2009), Chantilly, VA, USA, October 2009, Logics (DL’2001), Aug. 1-3, Stanford, USA, ser. CEUR Workshop pp. 87–101. Proceedings, vol. 49, 2001, pp. 152–161. [Online]. Available: [24] R. Mutharaju, F. Maier, and P. Hitzler, “A MapReduce algorithm for citeseer.ist.psu.edu/article/haarslev01combining.html el+,” in 23rd International Workshop on Description Logics, 2010, pp. [3] N. Farsiniamarj and V. Haarslev, “Practical reasoning with qualified 464–474. number restrictions: A hybrid abox calculus for the description logic,” [25] Z. Zhou, G. Qi, C. Liu, P. Hitzler, and R. Mutharaju, “Reasoning with AI Communications, vol. 23, no. 2-3, pp. 205–240, 2010. fuzzy-EL+ ontologies using mapreduce,” in ECAI 2012 - 21st European [4] Y. Ding, “Tableau-based reasoning for description logics with inverse Conference on Artificial Intelligence, L. D. R. et al., Ed. IOS Press, roles and number restrictions,” Ph.D. dissertation, Concordia University, 2012, pp. 933–934. 2008. [26] Y. Kazakov, M. Krötzsch, and F. Simancı́k, “Concurrent classification [5] L. R. Pour, “Algebraı̈c reasoning with the description logic SHIQ,” of el ontologies,” in Proceedings of the 10th international conference on Master’s thesis, Concordia University. The semantic web, ser. ISWC’ 11. Bonn, Germany: Springer-Verlag, [6] J. Faddoul and V. Haarslev, “Algebraic tableau reasoning for the de- 2011, pp. 305–320. scription logic SHOQ,” Journal of Applied Logic, vol. 8, no. 4, pp. 334–355, 2010. [7] ——, “Optimized algebraic tableau reasoning for the description logic SHOQ,” Journal of Artificial Intelligence Research (JAIR) - In prepa- ration, 2013.