Knowledge Change Management and Analysis for Multi-Disciplinary Engineering Environments

Fajar J. Ekaputra 1), Estefanía Serral 2), Marta Sabou 1), Stefan Biffl 1)

1) Vienna University of Technology, Christian Doppler Laboratory CDL-Flex, Favoritenstrasse 9-11/188, AT 1040 Vienna, Austria. {firstname.lastname}@tuwien.ac.at
2) KU Leuven, Dept. of Decision Sciences and Information Management, Naamsestraat 69, 3000 Leuven, Belgium. Estefania.Serralasensio@kuleuven.be

ABSTRACT
Multi-Disciplinary Engineering (MDEng) environments involve a wide range of models, processes and tools that were not designed to work together. The Ontology-Based Information Integration (OBII) approach has been proposed to address the integration issue within such environments. However, the OBII approach does not cover the knowledge change management and analysis (KCMA) process within these environments. While ontology change management has been investigated as a general problem, it remains unclear how to use the available solutions in the MDEng context. In this paper, we extend the OBII approach to enable the KCMA process. We have identified the main KCMA requirements within MDEng projects and studied the related work on ontology change management in order to propose a suitable solution and to suggest future work.

Categories and Subject Descriptors
I.2.4 [Artificial Intelligence]: Knowledge Representation Formalisms and Methods - Semantic network; H.5.3 [Group & Organization Interfaces]: Collaborative computing

Keywords
Knowledge Change Management, Change Analysis, Ontology-Based Information Integration, Multi-Disciplinary Engineering

1. INTRODUCTION
The process of designing complex mechatronic objects such as power plants or steel mills often requires teams of engineers from diverse engineering domains (e.g., mechanical, electrical and software engineering) to work together. As a result, this design process takes place in a multi-disciplinary engineering (MDEng) environment in which experts from various engineering domains and organizations work together towards creating a complex engineering artifact [13]. This environment is highly heterogeneous, as it involves a wide range of data models, processes, and tools that were originally not designed to cooperate seamlessly.

An illustrative MDEng setting is the engineering of a power plant. As in any large-scale project, the development of the power plant requires the coordinated work of engineers from multiple disciplines, which needs to converge to a high-quality product. This heterogeneous team of experts should be coordinated in such a way that important project-level technical and management constraints are fulfilled (e.g., the mass and dimension constraints of the base plate are not exceeded by individual equipment). Such coordination requires aggregating relevant data across teams from various disciplines, but it is hampered by the semantic heterogeneity of the data, with different disciplines using diverse terms to refer to the same entities.

The Ontology-Based Information Integration (OBII) approach has been previously proposed (e.g., in [1, 16]) to integrate data from heterogeneous sources using Semantic Web technologies. OBII consists of three components: local ontologies (which represent data specific to one engineering discipline, i.e., local data), a common ontology (which represents the aggregation of relevant and related concepts at the organizational level, e.g., a power plant) and the mappings between the common and local ontologies, which enable linking and integration of these heterogeneous data. However, to the best of our knowledge, OBII does not provide support for knowledge change management and analysis (KCMA), which is an essential and challenging requirement for MDEng environments.

In MDEng environments, the models and data in use change very often over time due to (1) changes in the represented domains, such as the introduction or removal of domain concepts; (2) changes in the underlying data sources, such as when new data elements become available and old data elements become obsolete; or (3) changes in the intended use of the semantic models and data, such as changing requirements of the currently supported tools or the design of new tools.

Ontology change management has been investigated as a generic problem to address the dynamics of ontology data and its derivative challenges (e.g., [7, 8, 14, 17]). However, it is not clear how applicable the current solutions are for improving the KCMA support of the OBII approach in the context of MDEng. Therefore, the approaches proposed for ontology change management in general settings must be adapted to fulfill the requirements of MDEng, such as change propagation to the overlapping data in other disciplines.

In this paper, we extend the OBII approach to address the specific requirements of KCMA in the MDEng environment. We have identified a set of requirements from the environment and studied the related work on ontology change management to derive our proposed approach for addressing the problem.

The rest of the paper is structured as follows: Section 2 identifies key requirements for KCMA within an OBII-based MDEng solution and summarizes the relevant related work. Section 3 explains our proposed solution, and Section 4 concludes the paper and identifies potential future work.
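As a rough illustration of the three OBII components named in this section (local ontologies, a common ontology, and the mappings between them), the following minimal sketch uses plain Python dictionaries in place of real ontologies. All record fields, term names, and the base-plate limit are hypothetical values chosen for the example; they are not taken from the OBII literature or from any project data.

```python
# Hypothetical local data: each discipline uses its own terms for the same kind of entity.
MECHANICAL_DATA = [
    {"part_id": "M-1", "weight_kg": 120.0},
    {"part_id": "M-2", "weight_kg": 80.5},
]
ELECTRICAL_DATA = [
    {"device": "E-7", "mass": 40.0},
]

# Hypothetical mappings from local terms to common-ontology terms.
MAPPINGS = {
    "mechanical": {"part_id": "equipmentId", "weight_kg": "massKg"},
    "electrical": {"device": "equipmentId", "mass": "massKg"},
}

# Assumed project-level constraint on the total equipment mass.
BASE_PLATE_LIMIT_KG = 250.0


def to_common(discipline, records):
    """Rename local keys to common-ontology terms using the discipline's mapping."""
    mapping = MAPPINGS[discipline]
    return [
        {mapping[key]: value for key, value in record.items() if key in mapping}
        for record in records
    ]


def total_mass(common_records):
    """Sum the mass over all records expressed in common-ontology terms."""
    return sum(record["massKg"] for record in common_records)


# Integrate both disciplines, then check the project-level constraint.
integrated = to_common("mechanical", MECHANICAL_DATA) + to_common(
    "electrical", ELECTRICAL_DATA
)
within_limit = total_mass(integrated) <= BASE_PLATE_LIMIT_KG
```

Once both disciplines are expressed in the common terms, a project-level check such as the base-plate mass constraint becomes a single aggregation. In an actual OBII setting the dictionaries would be replaced by ontologies and the renaming step by mapping rules.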
2. REQUIREMENTS & RELATED WORK
At the level of actual MDEng environment data, industrial partners need to keep data versions, move backwards to previous versions, and query different versions of large data sets coming from the heterogeneous local data sources. Furthermore, in MDEng environments the effective and considerate propagation of changes is essential to ensure a consistent view of the project, to minimize defects and risks, and to gain acceptance of new solutions among domain experts. To achieve this, the changes coming from one discipline need to be communicated to and coordinated with the participants of other disciplines where those changes are relevant (closely linked ontologies). At the same time, changes should be provided as high-level change definitions (e.g., defined in terms of domain concepts, such as “Motor X updated to new version”) instead of low-level changes (i.e., at the level of individual change operations on the versioned files) to ease the analysis of the data.

Next, we identify the key requirements to support KCMA within an OBII-based MDEng solution, based on our interviews with domain experts and our experience in handling knowledge in such environments. Furthermore, we summarize the relevant related work on knowledge and ontology change management from the Semantic Web community (see the summary in Table 1; the numbers of the requirements below correspond to the numbers in the table).

Table 1 Solution Alternatives for Knowledge Change within MDEng Environment
Related approaches (table columns): Kle04 [7], Sto04 [14], Noy06 [8], Pap09 [10], Grö10 [4], Zab11 [17], Van13 [12], Hor13 [6], Gra14 [3]
(Cell marks are listed per row in source order; their column alignment is not recoverable.)

  MDEng KCMA requirement           Details                               Marks
  (1) Multiple Linked Ontologies   Single ontology                       x x x x x
                                   Several loosely coupled ontologies    x x x x
                                   Several closely coupled ontologies    x (x)
  (2) Scale                        > 1M triples                          x, <200K, x, 100K
  (3) Knowledge Change Focus       A-Box                                 x x x x x x
                                   T-Box                                 x x x x x x x x
  (4) Change Validation            Manual (user)                         x x x
                                   Automatic                             x (x) x
  (5) Change Detection             Low-level                             x x x x x x x
                                   High-level                            x x x
  (6) Ontology Change              Ontology evolution                    x x x x x x x x
                                   Ontology versioning                   x (x) x (x) x

(1) Closely Coupled Ontologies. In OBII-based MDEng, we are dealing with KCMA in an environment of closely linked ontologies, where local ontology changes (both of axioms and of facts) might affect and change other ontologies via change propagation. This is not the typical setting for KCMA in the Semantic Web community, which deals with open Web data. This difference is reflected in most of the traditional KCMA approaches that address multiple ontologies [6, 7, 11, 12]. Stojanovic [14] is an exception to this trend, as she attempted to propagate changes to relevant ontologies. However, this work has not been continued or developed further.

(2) Large Amount of Data. The engineering design data of an average power plant ranges between several hundred thousand and tens of millions of signals. Those numbers, combined with hundreds of process iterations, lead to a large amount of data to process. Horridge et al. [6] have addressed the large-scale challenge of changed data by introducing binary formats for storing ontology data and version differences, which are claimed to work with more than one million triples. A different approach is adopted by Graube et al. [3], who use named graphs to store changes and ontology versions. Their approach did not scale well for change data analysis, since the query performance on the change data dropped significantly after several thousand triples. Papavassiliou et al. [10], on the other hand, successfully applied their approach to almost 200k triples.

(3) Axiom and Fact Changes. The heterogeneity of data sources within the MDEng environment also means that additional tools could be added at any time, which may imply changes in the common ontology and in other local ontologies. The goal of KCMA within the MDEng environment is to address such changes in the data structure (axioms) as well as in the data instances (facts), in order to support the stakeholders in analyzing the changed data for their use. Several KCMA approaches are already able to address this requirement [6, 10, 14], and these approaches could be used as the basis for KCMA in the MDEng environment.

(4) Automatic-to-Manual Validation Shift. Given the mission-critical nature of MDEng projects, the domain experts and engineers do not want to rely entirely on an automatic change validation mechanism based on constraint definitions (e.g., to decide whether changes initiated by a local ontology will break the global data consistency). They want to involve the domain experts in the validation workflow, in order to make critical decisions about changes and how to proceed with them. Stojanovic and Maedche [15] have provided a mechanism to involve domain experts in checking the semantic validity of ontology changes over multiple ontologies. A recent and interesting line of work that could be applied to change validation uses crowdsourcing to better structure models coming from automatic ontology engineering [5].

(5) High-Level Change Definition and Detection. The typical tools used in power plant design are able to produce report data consisting of signal lists, which represent the atomic parts of a factory handled by specific tools. The difference between two versions of a signal list represents the changes between them. However, it is challenging for a project manager to grasp the meaning of such low-level change data. Project managers need the data to be presented in a more meaningful manner as high-level changes, in terms of domain-level common concepts. Papavassiliou et al. [10] have shown an example of how to derive such high-level changes from low-level changes without compromising performance. Alternatively, Gröner et al. [4] have shown the use of a subset of OWL-DL reasoning to recognize high-level change patterns. The goal of this requirement is to provide stakeholders with better decision support with regard to KCMA in the OBII-based MDEng approach.

(6) Ontology Change. Flouris et al. [2] have provided an excellent definition of ontology evolution, defined as “a process of modifying an ontology in response to a certain change in the domain or its conceptualization”, and of ontology versioning, defined as “an ability to handle an evolving ontology by creating and managing different variants/versions of this ontology”. Most ontology change management approaches focus on one of them, e.g., ontology evolution [4, 7, 8, 17] or ontology versioning [3, 12], while others try to address both [6, 14]. In the context of our work, these approaches form a good basis for our solution approach.

3. PROPOSED SOLUTION
In order to address the challenge of providing support for Knowledge Change Management and Analysis (KCMA) in the Ontology-Based Information Integration (OBII) approach within the Multi-Disciplinary Engineering (MDEng) environment, we extend the OBII approach [1, 16] as shown in Figure 1. We have added four additional phases (shown as white boxes in the figure), which are derived from the related work and available standards in ontology change management and related fields from the Semantic Web community. We use an IDEF-0¹ diagram to structure the proposed approach, in which processes are shown as boxes and resources are shown as directed arrows.

Three roles are involved in the framework: Knowledge Engineer (KE), Project Manager (PM) and Domain Expert (DE). The framework draws on several standards and technologies (e.g., SPARQL for querying, PROV-O), which will be used for structuring and implementing our approach. The input and output of the system are shown on the left and right side of the diagram, respectively.

[Figure 1: IDEF-0 diagram of seven phases: (1) Local Ontology Definition, (2) Common Ontology & Mapping Definition, (3) Local Ontologies ETL, (4) Change Detection, (5) Change Validation, (6) Changes Propagation, (7) Data Store & Analysis. Legend: (S) SPARQL, (C) SW Constraints, (R) SW Rules, (W) SW Workflow, (P) PROV-O. Roles: Knowledge Engineer (KE), Project Manager (PM), Domain Experts (DE).]
Figure 1 Extended OBII approach to address KCMA in the MDEng environment

The main stages of the proposed approach are:

(1) Local Ontologies Definition. This phase requires the Knowledge Engineer and Domain Experts to work together to translate the data structure of the local tools (e.g., the MCAD model for the mechanical engineer) into the axiom definitions of the local ontology.

(2) Common Ontology & Mapping Definition. KE and DE work together in this phase to define the common ontology and its mappings to the local ontologies. To support this goal, Semantic Web vocabularies and standards are required to formalize the ontology and the mappings. There are several approaches, e.g., SPARQL or SPIN², which could be used to define the mapping and transformation rules within our context.

(3) Local Ontologies ETL. Given the heterogeneous domain tools and their data formats within the MDEng environment, we need to provide suitable extract, transform, and load (ETL) functions to produce the data in the required ontology formats. Several solutions could be re-used to address this problem, e.g., Apache Jena³ and R2RML⁴.

(4) Change Definition and Detection. This phase focuses on the definition and detection of low-level (i.e., triples) and high-level (e.g., semantic and domain-specific) changes between two versions of engineering data. An important point to consider within this phase is the balance between the expressivity of the high-level change definitions and the computational complexity of the detection algorithm, as mentioned by Papavassiliou et al. [10]. Generic open source ontology APIs (e.g., Apache Jena, Sesame API) typically provide mechanisms for the detection of low-level (triple) changes between two ontology versions. Additionally, research results, e.g., PROMPTDIFF [9] and the high-level change definition approach of Papavassiliou et al. [10], could be used to further enhance the detection algorithm. These approaches will be used as a basis for our work to address change definition and detection in the MDEng environment.

(5) Change Validation. The change validation phase requires the definition of constraints for preserving the validity of data in the local scope (e.g., mechanical engineering) and the global scope (e.g., power plant). Workflow definition is another important element, in order to configure the involvement of validation components (e.g., constraint validation engine and domain experts) in the validation process. To formulate the constraints, a recent initiative, the W3C working group on the Shapes Constraint Language (SHACL⁵), aims to provide a standard constraint vocabulary for RDF graph data.

(6) Change Propagation. Changes in the MDEng environment need to be propagated to the relevant components (i.e., the common ontology and other relevant local ontologies). This phase requires the common ontology and mapping definitions, as well as the validated changes. The Knowledge Engineer will need to configure the propagation based on the mapping definitions (e.g., based on SPIN or SWRL⁶ rules), to make sure that no corrupted or irrelevant data is included in the propagation process.

(7) Data Store and Analysis. The goal of this phase is to enable relevant stakeholders (e.g., the project manager) to access and analyze the data and its changes within the projects. The change data will be stored within RDF triple stores, e.g., Sesame⁷. We are planning to use the W3C standard PROV-O⁸ vocabulary for storing the change provenance information. Examples of queries that would be made on this change data are: (1) provenance information of the changes (e.g., committer, date, reasons of change), (2) change overview on specific objects, and (3) analysis of completeness and inconsistencies over changes.

¹ http://www.idef.com/idef0.htm
² http://goo.gl/TcTB8R
³ http://jena.apache.org/
⁴ http://www.w3.org/TR/r2rml/
⁵ https://w3c.github.io/data-shapes/shacl/
⁶ http://www.w3.org/Submission/SWRL/
⁷ http://rdf4j.org/
⁸ http://www.w3.org/TR/prov-o/

4. CONCLUSION & FUTURE WORK
We have extended the OBII approach, which was mainly created for the purpose of data integration, to address the challenge of KCMA within the MDEng environment. We have identified key requirements and studied the related state of the art from the ontology change management area. This work is meant to lay the foundation for a fully functional KCMA solution for the OBII-based MDEng domain.

In the process of investigating a suitable extension, we found that there is a gap in the standardization of several aspects of the Semantic Web, e.g., constraint and rule vocabularies, which could hinder further adoption of the Semantic Web in MDEng domains, e.g., Industrial Automation Systems, and make the proposed extension difficult to use. Fortunately, there are already initiatives towards the standardization of these vocabularies, e.g., the SHACL working group for RDF graph constraints and the RDF Mapping Language (RML) for semantic mapping.

As future work, we will develop a prototype implementation based on our proposed OBII extension framework and conduct evaluations of the approach. We are also planning to generalize the approach to address similar problem settings, such as scholarly data management.

5. ACKNOWLEDGMENTS
This work was supported by the Christian Doppler Forschungsgesellschaft, the Federal Ministry of Economy, Family and Youth and Österreichischer Austauschdienst (ÖAD).

6. REFERENCES
[1] Calvanese, D., De Giacomo, G. and Lenzerini, M. 2001. Ontology of Integration and Integration of Ontologies. International Description Logics Workshop (2001).
[2] Flouris, G., Manakanatas, D., Kondylakis, H., Plexousakis, D. and Antoniou, G. 2008. Ontology change: Classification and survey. The Knowledge Engineering Review. 23, 2 (2008).
[3] Graube, M., Hensel, S. and Urbas, L. 2014. R43ples: Revisions for Triples. Proceedings of the 1st Workshop on Linked Data Quality (2014).
[4] Gröner, G., Parreiras, F. and Staab, S. 2010. Semantic recognition of ontology refactoring. The Semantic Web - ISWC 2010. (2010).
[5] Hanika, F., Wohlgenannt, G. and Sabou, M. 2014. The uComp Protégé Plugin: Crowdsourcing Enabled Ontology Engineering. Knowledge Engineering and Knowledge Management (2014).
[6] Horridge, M., Redmond, T., Tudorache, T. and Musen, M. 2013. Binary OWL. OWL Experiences and Directions Workshop (OWLED) (2013).
[7] Klein, M. 2004. Change Management for Distributed Ontologies. PhD Thesis. Vrije University Amsterdam, Netherlands.
[8] Noy, N., Chugh, A., Liu, W. and Musen, M. 2006. A framework for ontology evolution in collaborative environments. The Semantic Web - ISWC 2006. (2006).
[9] Noy, N. and Musen, M. 2002. Promptdiff: A fixed-point algorithm for comparing ontology versions. AAAI/IAAI. (2002).
[10] Papavassiliou, V., Flouris, G. and Fundulaki, I. 2009. On Detecting High-Level Changes in RDF/S KBs. 8th International Semantic Web Conference, Chantilly, VA, USA. (2009).
[11] Redmond, T., Smith, M., Drummond, N. and Tudorache, T. 2008. Managing Change: An Ontology Version Control System. OWLED. (2008).
[12] Vander Sande, M., Colpaert, P., Verborgh, R., Coppens, S., Mannens, E. and Van de Walle, R. 2013. R&Wbase: git for triples. Linked Data on the Web Workshop (2013).
[13] Serral, E., Mordinyi, R., Kovalenko, O., Winkler, D. and Biffl, S. 2013. Evaluation of Semantic Data Storages for Integrating Heterogenous Disciplines in Automation Systems Engineering. IECON 2013 (Vienna, Austria, 2013).
[14] Stojanovic, L. 2004. Methods and tools for ontology evolution. PhD Thesis. University of Karlsruhe, Germany.
[15] Stojanovic, L. and Maedche, A. 2002. User-driven ontology evolution management. Knowledge engineering and knowledge management: ontologies and the semantic web. (2002).
[16] Wache, H., Voegele, T., Visser, U., Stuckenschmidt, H., Schuster, G., Neumann, H. and Hübner, S. 2001. Ontology-based integration of information - a survey of existing approaches. IJCAI-01 workshop: ontologies and information sharing (2001), 108-117.
[17] Zablith, F. 2011. Harvesting Online Ontologies for Ontology Evolution. PhD Thesis. The Open University, UK.