=Paper=
{{Paper
|id=Vol-1387/paper6
|storemode=property
|title=Evaluation and Optimized Usage of OWL 2 Reasoners in an Event-based eHealth Context
|pdfUrl=https://ceur-ws.org/Vol-1387/paper_6.pdf
|volume=Vol-1387
|dblpUrl=https://dblp.org/rec/conf/ore/BonteOSMADBVTM15
}}
==Evaluation and Optimized Usage of OWL 2 Reasoners in an Event-based eHealth Context==
Pieter Bonte<sup>1</sup>, Femke Ongenae<sup>1</sup>, Jeroen Schaballie<sup>1</sup>, Ben De Meester<sup>1</sup>, Dörthe Arndt<sup>1</sup>, Wim Dereuddre<sup>2</sup>, Jabran Bhatti<sup>2</sup>, Stijn Verstichel<sup>1</sup>, Ruben Verborgh<sup>1</sup>, Rik Van de Walle<sup>1</sup>, Erik Mannens<sup>1</sup>, and Filip De Turck<sup>1</sup>

<sup>1</sup> Ghent University - iMinds, Gaston Crommenlaan 8, 9000 Ghent, Belgium, Pieter.Bonte@intec.ugent.be

<sup>2</sup> Televic HealthCare, Leo Bekaertlaan 1, 8870 Izegem

===Abstract===
This paper evaluates the performance of the OWL 2 reasoners Pellet and HermiT in an eHealth context where most of the ABox is considered static and discrete transient events describing the environment are incrementally added and processed. The considered use case is the assignment of tasks and calls to nurses. To provide personalized and optimized care, the selection process utilizes reasoning to make intelligent assignment decisions based on the available information. This has been implemented using multiple SPARQL queries to enable easy adaptation of the assignment algorithm. Since limited time is available to perform the assignments, the decision should be made in at most five seconds. An analysis of the performance and scalability of the reasoners is presented. To deal with the limited time frame, several optimizations are suggested, which exploit the fact that most of the ABox is considered static.

Keywords: eHealth, Evaluation, Event-based, OWL 2 Reasoners

===1 Introduction===
To provide personalized care for patients, task and call assignment systems benefit from considering as much information as possible. Most of this information can be considered relatively static, e.g., the patient's profile and pathology and the competences of the staff. Discrete events representing tasks, calls and changes to the environment, e.g., person location updates, are incrementally added and processed.
Scalable reasoning on this static data, which is incrementally updated with event data, is thus required to assign the most appropriate staff to tasks and calls. The selection procedure, represented as a decision tree, has been implemented using multiple SPARQL queries, allowing easy adaptation of the algorithm. Since the selection procedure should happen as fast as possible, the allowed decision time has been limited to five seconds by domain experts.

In this paper, the developed task and call assignment reasoning platform is evaluated in terms of scalability using the OWL 2 reasoners Pellet [10] and HermiT [8], based on hospital data with an increasing number of wards. Moreover, several optimizations to speed up the reasoning are proposed and evaluated. These optimizations are able to deal with the event-based and time-constrained scenarios and are able to execute the numerous SPARQL queries efficiently.

===2 The Scenario===
The considered scenario consists of a hospital with wards, containing patients and care staff. The patients launch calls to receive medical aid. The locations of the medical staff are automatically tracked. When a patient is in need of aid, the decision tree is checked to determine the most suited staff member, based on the patient's pathology and profile, the location of the care staff, etc.
The scenario consists of the following steps, with the performed reasoning actions in parentheses:
* Call Launched: A patient launches a call (select nurse).
* Call Redirect: The nurse indicates that she is busy (select new nurse).
* Call Temporary Accept: The new nurse accepts the call (update call status).
* Corridor: The nurse moves towards the room of the patient (update location).
* Patient loc: The nurse arrives in the room (update location & turn on lights).
* Presence On: The nurse logs into the terminal (update call status & turn on lights).
* Presence Off: The nurse logs out (update call status & turn off lights).
* Corridor loc: The nurse leaves the room (update location & turn off lights).

In the considered eHealth use case, the static data consists of the hospital configuration and the profile information of the patients and care staff. The dynamic data are typically calls by the patients to receive aid, updates of the status of a call, or location updates by the staff.

===3 The ACCIO Ontology===
To represent the eHealth knowledge, the ACCIO ontology (http://users.intec.ugent.be/pieter.bonte/ontology/accio.html) is used. An elaborate description can be found in Ongenae, et al. [7]. We evaluate the scalability of the OWL 2 reasoners by executing the scenario over an increasing number of wards. The details of the ontology loaded with the preliminary data for each number of wards are summarized in Table 1. The loaded data consists of the configuration of the different wards, including the personnel and patient information. Note that these are the numbers of the static part of the ABox. During the scenario, the TBox is considered static.

{| class="wikitable"
|+ Table 1. Summary of the ACCIO ontology for different numbers of wards
! #Wards !! 1 !! 10 !! 25 !! 50 !! 75 !! 100
|-
| Axioms || 3412 || 11113 || 20349 || 41098 || 63869 || 85564
|-
| Logical Axioms || 2109 || 8167 || 15426 || 32389 || 49670 || 66741
|-
| Individuals || 270 || 1913 || 3890 || 8464 || 13166 || 17790
|-
| Classes || colspan="6" | 332
|-
| Object Properties || colspan="6" | 182
|-
| Data Properties || colspan="6" | 51
|-
| DL Expressivity || colspan="6" | SHOIQ(D)
|}

===4 Implementation===
To implement the scenario, we utilized the ModulAr, Service, Semantic & Flexible Platform (MASSIF), a data-driven platform that allows the easy development and collaboration of (ontology-based) services. Each service performs a distinct reasoning task. Services exchange their knowledge over a Semantic Communication Bus (SCB) [2]. A detailed description of the MASSIF platform can be found in De Backere, et al. [1].

Four services, performing different reasoning tasks, were implemented. They are listed below, followed by the number of queries needed to implement their logic:
* Presence Service: tracks the location of the staff (2 queries).
* Status Call Service: tracks the status of the calls (4 queries).
* Light Service: handles the lights in the patient rooms (9 queries).
* Help Selection Service: handles the staff assignment (200 queries).

Even though the Help Selection Service implements 200 queries, the real number of executed queries depends on the number of branches that need to be checked in the decision tree, which represents the nurse assignment algorithm.

The OWL API [4] is used to internally represent the ontology in the MASSIF platform. It provides an OWLReasoner interface, offering a uniform access point to various reasoners. In this paper we evaluate the most popular reasoners implementing the interface. Unfortunately, the OWL API does not provide any SPARQL support. To resolve this, a Jena model [5] was used that allows SPARQL queries through Jena ARQ. This model requires synchronization with the OWL API. Only Pellet is able to convert its internal knowledge base to Jena. Utilizing HermiT, or a more optimized use of Pellet, requires a direct conversion from the OWL API to Jena. This is achieved by writing the ontology to an output stream in RDF/XML format and reading this stream in Jena.
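The remark that the Help Selection Service implements 200 queries but executes only those on the checked branches can be illustrated with a small sketch. It is not the MASSIF implementation: the node names, the facts dictionary and the resulting decisions are hypothetical, and each query is a plain Python callable standing in for a SPARQL query.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple, Union


@dataclass
class Leaf:
    decision: str  # outcome reached at the end of a branch


@dataclass
class Node:
    name: str
    query: Callable[[dict], bool]  # stand-in for one SPARQL query (assumption)
    if_true: Union["Node", Leaf]
    if_false: Union["Node", Leaf]


def decide(tree: Union[Node, Leaf], facts: dict) -> Tuple[str, List[str]]:
    """Walk the tree and run only the queries on the visited path."""
    executed = []
    node = tree
    while isinstance(node, Node):
        executed.append(node.name)
        node = node.if_true if node.query(facts) else node.if_false
    return node.decision, executed


# Hypothetical three-outcome tree; the real assignment tree is much larger.
tree = Node(
    "call_is_urgent",
    lambda f: f["urgent"],
    if_true=Node(
        "nurse_in_corridor",
        lambda f: f["nurse_in_corridor"],
        if_true=Leaf("assign nearest nurse"),
        if_false=Leaf("assign any free nurse"),
    ),
    if_false=Leaf("assign responsible nurse"),
)

decision, executed = decide(tree, {"urgent": False, "nurse_in_corridor": True})
print(decision, executed)
```

In this sketch the tree defines two queries, but a non-urgent call triggers only the first one, which mirrors why the number of executed queries varies per scenario step.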
Different approaches were evaluated for using the reasoners; their distinctive steps are visualized in Figure 1. Two flows can be discerned: one at the top describing the precomputation steps at start-up, and one at the bottom describing the reasoning steps during the execution of the scenario. Before discussing them in detail, we elaborate on the two optimizations implemented for executing numerous queries in an event-based context.

Figure 1. Workflow depicting the various approaches.

====4.1 Materialization====
Materialization is the process of precomputing key sets of implicit assertions in the knowledge base and is frequently employed by semantic query and reasoning engines to improve query performance [6]. Doing so requires more storage, but it allows easy look-up at run-time [9], because reasoning can be bypassed when executing queries. This step is depicted in Figure 1 as (g). Adding additional event data (ABox) to a materialized ontology requires realization, i.e., computing the direct types of the added individuals. Based on a previous classification, the class hierarchy can be used to retrieve all types.

====4.2 Subset Reasoning====
When new individuals are added to a materialized ontology, subset reasoning retrieves a minimal subset of the ABox data that contains everything needed to calculate the types of the new individuals. The size of the subset is ontology dependent and is determined through a precomputation step, which finds the TBox axiom with the largest depth. This is depicted in Figure 1 as step (h). The depth of an axiom is similar to the modal depth and defines the deepest nesting of the operators. A formal definition is given in Equation (1), where d denotes the axiom depth, R the roles and C the concepts.
<math>\theta = C \mid \forall R.\theta \mid \exists R.\theta</math>

<math>\begin{align} d(C) &= 0 \\ d(\theta_1 \wedge \theta_2) &= \max(d(\theta_1), d(\theta_2)) \\ d(\theta_1 \vee \theta_2) &= \max(d(\theta_1), d(\theta_2)) \\ d(\forall R.\theta) &= 1 + d(\theta) \\ d(\exists R.\theta) &= 1 + d(\theta) \end{align}</math> (1)

The calculated depth defines the size of the dataset necessary to calculate the types of new individuals in the materialized ontology. If we view the ontology as a graph with the individuals as vertices and the relations as edges, the subset of data needed to calculate the types is a subtree of the graph with the individual as root and the calculated axiom depth as depth. If the types of multiple individuals need to be calculated, the union of their subtrees can be considered.

A subset is sufficient to determine the types of an individual. Only the individuals and literals to which the given individual is related have influence, given the calculated depth and the fact that all needed inferred data has been calculated in a previous materialization step. For this approach to preserve completeness, the TBox describing the new individuals should be part of an ontology definition T that can be seen as an extension of the static ontology definition T′; if T is local, it does not yield new consequences in T′. A definition of locality can be found in Grau, et al. [3]. Transitive relations to leaves in the subtree could also cause incompleteness, since the transitivity might not be fulfilled because of the limited data in the subset. However, this never occurs in our scenario. The maximum axiom depth for the ACCIO ontology is three.

Compared to existing modularization techniques [11], our technique extracts only ABox data; preserving the TBox allows dynamic and performant extraction at runtime. Since the ontology has been materialized, the extracted data can be limited to the data that directly influences the calculation of the new types. All other data has been inferred in a previous step.

====4.3 The Approaches====
The different approaches used to perform the reasoning are discussed below.
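Looking back at Section 4.2, the axiom depth of Equation (1) and the depth-bounded extraction can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: class expressions are encoded as nested tuples, the ABox is a list of (subject, property, object) triples, only outgoing relations are followed, and all individual and property names are hypothetical.

```python
from collections import deque

# Assumed encoding of class expressions:
#   "C"                                 atomic concept
#   ("and", e1, e2), ("or", e1, e2)     conjunction / disjunction
#   ("all", "R", e), ("some", "R", e)   universal / existential restriction


def depth(expr):
    """Axiom depth d as defined in Equation (1)."""
    if isinstance(expr, str):
        return 0
    op = expr[0]
    if op in ("and", "or"):
        return max(depth(expr[1]), depth(expr[2]))
    if op in ("all", "some"):
        return 1 + depth(expr[2])
    raise ValueError(f"unknown operator: {op!r}")


def extract_subset(triples, roots, max_depth):
    """Collect the triples reachable from the roots within max_depth hops.

    Passing several roots yields the union of their subtrees.
    """
    outgoing = {}
    for s, p, o in triples:
        outgoing.setdefault(s, []).append((p, o))
    level = {r: 0 for r in roots}
    queue = deque(roots)
    subset = set()
    while queue:
        node = queue.popleft()
        if level[node] == max_depth:
            continue  # depth bound reached, do not expand further
        for p, o in outgoing.get(node, []):
            subset.add((node, p, o))
            if o not in level:
                level[o] = level[node] + 1
                queue.append(o)
    return subset


# Hypothetical axiom and ABox fragment, for illustration only.
example_axiom = ("some", "hasPathology",
                 ("and", "Pathology", ("all", "treatedBy", "Nurse")))
triples = [
    ("call1", "madeBy", "patient1"),
    ("patient1", "locatedIn", "room1"),
    ("room1", "partOf", "ward1"),
    ("ward1", "partOf", "hospital"),
    ("nurse9", "locatedIn", "room7"),
]
print(depth(example_axiom))
print(sorted(extract_subset(triples, ["call1"], depth(example_axiom))))
```

The breadth-first traversal keeps the extracted data independent of the total ABox size: only triples within the depth bound of the new individuals are handed to the reasoner.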
Each approach is depicted in Figure 1 as a sequence of steps.
# Pellet is used to perform the necessary reasoning each time a query is executed. Pellet performs the conversion of its internal knowledge base to a Jena model each time an event arrives, but is able to cache intermediate results. This is depicted in Figure 1 as step (a).
# To eliminate the reasoning at query time, a full materialization of the ontology is calculated at the arrival of an event and translated from the OWL API to the Jena model. Jena ARQ then allows querying without any reasoning. This is depicted as step (b) and has been evaluated with the HermiT reasoner.
# Instead of calculating the whole inferred model each time an event arrives, the static data is materialized as a precomputation step. For each individual in the arriving data, its types are calculated based on the materialized ontology, and the OWL API and the Jena model are extended with this inferred knowledge. This is shown as steps (d)-(e)-(c) and has been evaluated with Pellet and HermiT.
# To determine the types of the arriving events, it is not necessary to analyze the whole ontology. Starting from a materialized ontology, we compute the subset of data that influences the calculation of the types of the newly arrived individuals, by utilizing the subset reasoning explained in Section 4.2. This is depicted as steps (f)-(d)-(e)-(c) and has been evaluated with Pellet and HermiT.

It is important to note that in approaches 3 and 4, the completeness of the materialized ontology might be partly lost when the new events yield consequences in the static data. This is because the calculation of the types of the event data does not lead to a recalculation of the materialized static data. However, this is never the case in the eHealth scenario discussed in this paper.

===5 Evaluation Set-up and Results===
The scalability of the reasoners is evaluated by increasing the number of wards, resulting in a growing ABox.
Each ward consists of 10 rooms, two patients asking for aid and three nurses. Each approach and each number of wards was evaluated 35 times. The first three and last two results were dropped to eliminate the influence of the warm-up and cool-down periods. The evaluation was done on a Debian server with an Intel Xeon E5620 CPU at 2.40 GHz and 12 GB of memory.

Figure 2 summarizes the performance of the reasoners. The corresponding approach is indicated in parentheses. The Y-axis shows the time to complete the whole scenario, which contains multiple reasoning steps. Evaluating the sum over the various scenario steps gives a clear view of the trends for the various approaches. Since the results for Pellet do not meet the time constraint of five seconds for one ward, more than 10 wards were not evaluated. As for HermiT, materializing the whole ontology every time an event arrives is not feasible. Since the execution time for ten wards did not come close to the time constraint, the evaluation was not continued. Approach 3 is more performant than the Pellet approaches but still does not scale well, because it takes the whole dataset into consideration when reasoning.

Figure 2. Evaluation of the various approaches: (a) Pellet Reasoner, (b) HermiT Reasoner, (c) HermiT + Subset. X-axis: number of wards; Y-axis: time (s).

The subsetting approach is the fastest and meets the time constraint. It is therefore shown in detail in Figure 2 (c). The size of the dataset has limited influence on the reasoning times, due to the extraction of the minimal dataset through subsetting. Partly losing the completeness of the ontology has huge performance benefits.
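The measurement protocol described in this section (35 runs per configuration, with the first three and last two results dropped) can be sketched as below. The function name and the timing values are hypothetical and serve only to show the arithmetic; they are not measurements from the paper.

```python
from statistics import mean


def steady_state_mean(times, warmup=3, cooldown=2):
    """Average the run times after dropping warm-up and cool-down runs."""
    if len(times) <= warmup + cooldown:
        raise ValueError("not enough runs to drop warm-up and cool-down")
    kept = times[warmup:len(times) - cooldown]
    return mean(kept), len(kept)


# Made-up timings in seconds: 3 slow warm-up runs, 30 steady runs, 2 slow final runs.
runs = [10.0] * 3 + [2.0] * 30 + [9.0] * 2
average, used = steady_state_mean(runs)
print(average, used)
```

With 35 runs this leaves 30 measurements per configuration, so the reported averages are not skewed by start-up effects such as class loading or cache filling.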
These results are analyzed in more depth in Figure 3, which visualizes the average times for each service in each scenario step for a fixed number of wards (here 75). The presented times are the total service times. For the HelpSelectionService, 55% of the time is spent on reasoning, 42% on querying and less than 3% on conversion. The execution time differs between the scenario steps, because not all steps require the same amount of reasoning or the same number of queries. The HelpSelectionService takes the longest in the first two steps, as this service checks the biggest decision tree.

Figure 3. Evaluation of the HermiT reasoner for the subset approach. X-axis: scenario steps (a: CallLaunched, b: CallRedirected, c: CallTempAccept, d: Corridor, e: Patient loc, f: Presence On, g: Presence Off, h: Corridor); Y-axis: time (ms); series: HelpSelectionService, PresenceService, LightService, StatusCallService, SCB, MASSIF Overhead.

===6 Conclusion and Future Work===
In this paper the performance and scalability of the HermiT and Pellet reasoners in an event-based eHealth scenario were evaluated. Scalable reasoning over relatively static data that is incrementally updated with event data is achieved by limiting the data the reasoners need to calculate the types of the event data. Partly losing the completeness of the ontology has huge performance benefits. The technique can be exploited in cases where performance is critical and where the dynamic part of the ontology, which describes the events, has no or limited consequences for the rest of the ontology. Furthermore, it was shown that HermiT is more performant in calculating the inferred types than the Pellet reasoner. In future work, we will focus on adapting the subset algorithm to reclaim completeness of the whole ontology by incrementally increasing the subset when possible changes to the static data have been detected.

===References===
# De Backere, F., Ongenae, F., Van den Abeele, F., Nelis, J., Bonte, P., Clement, E., Philpott, M., Hoebeke, J., Verstichel, S., Ackaert, A., et al.: Towards a social and context-aware multi-sensor fall detection and risk assessment platform. Computers in Biology and Medicine (2014)
# Famaey, J., et al.: An ontology-driven semantic bus for autonomic communication elements. In: Brennan, R., Fleck, J., van der Meer, S. (eds.) Lecture Notes in Comput. Sci. vol. 6473, pp. 37–50. Springer Verlag Berlin (2010)
# Grau, B.C., Horrocks, I., Kazakov, Y., Sattler, U.: A logical framework for modularity of ontologies. In: IJCAI. vol. 2007, pp. 298–303 (2007)
# Horridge, M., Bechhofer, S.: The OWL API: A Java API for OWL ontologies. Semantic Web 2(1), 11–21 (2011)
# McBride, B.: Jena: Implementing the RDF model and syntax specification. In: SemWeb (2001)
# Narayanan S, Catalyurek U, K.T.S.J.: Parallel materialization of large ABoxes. In: Symposium on Applied Computing. vol. 2009, pp. 1257–1261
# Ongenae, F., Bleumers, L., Sulmon, N., Verstraete, M., Van Gils, M., Jacobs, A., De Zutter, S., Verhoeve, P., Ackaert, A., De Turck, F.: Participatory design of a continuous care ontology (2011)
# OXFORD, U.O.: HermiT reasoner (2014), http://hermit-reasoner.com
# Rabbi, F., MacCaull, W., Faruqui, R.U.: A scalable ontology reasoner via incremental materialization. Proceedings of CBMS 2013 - 26th IEEE International Symposium on Computer-Based Medical Systems pp. 221–226 (2013)
# Sirin, E., Parsia, B., Grau, B.C., Kalyanpur, A., Katz, Y.: Pellet: A practical OWL-DL reasoner. Web Semantics: Science, Services and Agents on the World Wide Web 5(2), 51–53 (2007), http://www.sciencedirect.com/science/article/pii/S1570826807000169
# Tsarkov, D.: Improved algorithms for module extraction and atomic decomposition. In: 25th International Workshop on Description Logics. p. 345. Citeseer (2012)