Mapping Refinement based on Ontology Evolution: A Context-Matching Approach

Julio Cesar dos Reis

Institute of Computing, University of Campinas (UNICAMP)
13083-852, Campinas, SP, Brazil

jreis@ic.unicamp.br

Abstract. Ontologies and their associated mappings play a central role in several semantic-enabled tasks. However, the continuous evolution of ontologies requires updating existing concept alignments. Whereas mapping maintenance techniques have mostly handled the revision and removal types of ontology change, the addition of concepts and attributes demands further study. This article proposes techniques to refine a set of established mappings based on the evolution of ontologies. We investigate ways of suggesting correspondences with the new version of the ontology without applying a matching operation to the whole set of ontology entities. The resulting techniques explore the neighbourhood of concepts in the alignment process to update mapping sets.

1. Introduction

Over the last decade, several domains have exploited ontologies and their capabilities for various purposes, ranging from information retrieval to data management and sharing. However, the size of these domains often requires the use of several ontologies whose elements are linked through mappings. Mappings are the materialization of semantic relations between elements of interrelated ontologies [Shvaiko and Euzenat 2013].

Creating mappings between ontologies is a complex task, especially due to the increasing size of ontologies. To this end, several automatic ontology alignment techniques have been proposed [Mascardi et al. 2010]. Nevertheless, significant manual validation effort is still required if a certain level of quality is to be reached [Shvaiko and Euzenat 2013]. This problem prevents software applications that rely on mappings from taking full advantage of them.

Ontologies evolve over time to remain up to date with the domain knowledge.
Ontology changes may affect mappings already established, or can be a source for mapping refinement. In this context, in order to avoid the costly ontology re-alignment process, it is crucial to have adequate mapping adaptation strategies that keep mappings semantically valid over time [Reis et al. 2015]. Manual mapping maintenance is possible only if modifications are applied to a restricted number of mappings. Otherwise, automatic methods are required for large and highly dynamic ontologies. For example, biomedical ontologies usually contain hundreds of thousands of concepts interconnected via mappings.

Coping with the mapping reconciliation problem in a semi-automatic way entails many research challenges. First, it is difficult to evaluate the real impact of ontology evolution on mappings. For instance, changing an attribute value may invalidate a mapping in some cases. In these situations, the challenge is to identify and classify the different cases. Second, several types of ontology changes can be applied to an ontology, but it is unknown how these different types of operations should be taken into consideration for mapping reconciliation [Groß et al. 2013]. Thus, generally only the removal of concepts is addressed. Third, the reconciliation of mappings with several types of semantic relations needs deeper study [Reis et al. 2015].

The design of techniques for mapping adaptation according to the different types of ontology changes has been addressed by existing approaches. Previous work in the literature [Reis et al. 2015] presented a mapping adaptation strategy for two out of three categories of ontology evolution: removal of knowledge and revision of knowledge. For example, when concepts are removed, heuristics were designed to automatically apply adaptation actions over mappings. The addition of knowledge (third category) is the most frequent type of change in ontology evolution.
New concepts, or new attributes of existing concepts, are added to comply with new domain knowledge. Such new knowledge needs to be aligned with the interrelated ontologies, but we need to avoid applying a matching operation to the whole target ontology.

The literature presents several interpretations of ontology mapping refinement. The first interpretation is in the sense of helping to expand the types of semantic relations identified during the matching process. Refinement can modify or enrich semantic relations; for instance, during the refinement process, an equivalence (≡) relation (i.e., a relationship defining that two interrelated concepts are equivalent) can be modified to an is-a (⊑) relationship (i.e., a relationship where one concept is a specialization of the other) [Arnold and Rahm 2014]. The second interpretation treats refinement operations as a way of adding (deriving) or removing mappings, for instance, based on defined pattern rules [Hamdi et al. 2010]. These rules can be created manually or automatically, and are used to validate the mapped relations. The second category aims to enrich the relations in previously calculated mappings and to extend the mapping set with newly derived mappings. In this paper, we explore the second option.

In this paper, we propose a mapping refinement methodology to update mapping sets taking ontology changes into account (based on new concepts added during ontology evolution). We study the use of conceptual information related to neighbour concepts for enhancing the quality of mapping adaptation. For this purpose, we investigate techniques to reuse already established mappings and to explore the role of neighbour concepts in deriving new mappings. Our proposal allows suggesting new correspondences without applying a matching operation to the whole set of ontology entities.

The remainder of this article is organized as follows: Section 2 presents the formal definitions and problem statement.
Section 3 reports on our approach to refining ontology mappings under ontology evolution. Section 4 draws conclusions and outlines future work.

2. Preliminaries

Ontology. An ontology $O$ specifies a conceptualization of a domain in terms of concepts, attributes and relationships [Gruber 1993]. Formally, an ontology $O = (C_O, R_O, A_O)$ consists of a set of concepts $C_O$ interrelated by directed relationships $R_O$. Each concept $c \in C_O$ has a unique identifier and is associated with a set of attributes $A_O(c) = \{a_1, a_2, ..., a_p\}$. Each relationship $r(c_1, c_2) \in R_O$ is typically a triple $(c_1, c_2, t)$ where $t$ is the relationship type (e.g., "is a", "part of", "advised by") interrelating $c_1$ and $c_2$.

Context of a concept. We define the context of a particular concept $c_i \in C_O$ as the set of super concepts, sub concepts and sibling concepts of $c_i$, as follows:

\[ CT(c_i, \lambda) = sup(c_i, \lambda) \cup sub(c_i, \lambda) \cup sib(c_i, \lambda) \tag{1} \]

where

\[
\begin{aligned}
sup(c_i, \lambda) &= \{ c_j \mid c_j \in C_O,\ r(c_i, c_j) = \text{``}\sqsubset\text{''} \wedge length(c_i, c_j) \leq \lambda \wedge c_i \neq c_j \} \\
sub(c_i, \lambda) &= \{ c_j \mid c_j \in C_O,\ r(c_j, c_i) = \text{``}\sqsubset\text{''} \wedge length(c_i, c_j) \leq \lambda \wedge c_i \neq c_j \} \\
sib(c_i, \lambda) &= \{ c_j \mid c_j \in C_O,\ sup(c_j) = sup(c_i) \wedge length(c_i, c_j) \leq \lambda \wedge c_i \neq c_j \}
\end{aligned}
\tag{2}
\]

where $\lambda$ bounds the length between two concepts (in terms of their relationship distance in the hierarchy of concepts) and the "$\sqsubset$" symbol indicates that "$c_i$ is a sub concept of $c_j$". This definition of $CT(c_i, \lambda)$ selects the concepts considered relevant in the setting of this investigation on mapping refinement.

Similarity between concepts. Given two particular concepts $c_i$ and $c_j$, the similarity between them can be defined as the maximum similarity over all pairs of attributes from $c_i$ and $c_j$. Formally:

\[ sim(c_i, c_j) = \max_{x, y} \ sim(a_{ix}, a_{jy}) \tag{3} \]

where $sim(a_{ix}, a_{jy})$ is the similarity between two attributes $a_{ix}$ and $a_{jy}$ denoting concepts $c_i$ and $c_j$, respectively. We can compute this similarity at different linguistic levels: character, string and semantic level [Dinh et al.
2014].

Mapping. Given two concepts $c_s$ and $c_t$ from two different ontologies, a mapping $m_{st}$ can be defined as:

\[ m_{st} = (c_s, c_t, semType, conf) \tag{4} \]

where $semType$ is the semantic relation connecting $c_s$ and $c_t$. In this article, we differentiate relation from relationship, where the former belongs to a mapping and the latter to an ontology. The following types of semantic relation are considered: unmappable [⊥], equivalent [≡], narrow-to-broad [≤], broad-to-narrow [≥] and overlapped [≈]. For example, concepts can be equivalent (e.g., "head" ≡ "head"), one concept can be less or more general than the other (e.g., "thumb" ≤ "finger"), or concepts can be somehow semantically related (≈). The $conf$ value is the similarity between $c_s$ and $c_t$, indicating the confidence of their relation [Euzenat and Shvaiko 2007]. We define $M_{ST}^j$ as the set of mappings $m_{st}$ between ontologies $O_S$ and $O_T$ at a given time $j$. We assume that $j \in \mathbb{N}$ identifies the version of the ontology release $O_S^j$. Therefore, ontology $O_S^0$ is version 0 whereas $O_S^1$ is version 1 of the same ontology.

Ontology change operations (OCO). An ontology change operation (OCO) represents a change in an attribute, in a set of one or more concepts, or in a relationship between concepts. OCOs are classified into two main categories: atomic and complex changes. Each OCO in the former category cannot be divided into smaller operations, while each one in the latter is composed of more than one atomic operation. For instance, the operation $chgA(c, a, v)$ is composed of the two atomic operations $delA(a, c)$ and $addA(a, c)$. In this paper, we pay particular attention to the operations of concept addition.

Problem statement. Consider two versions of the same source ontology, $O_S^j$ at time $j$ and $O_S^{j+1}$ at time $j + 1$, a target ontology $O_T^j$, and an initial set of mappings $M_{ST}^j$ between $O_S^j$ and $O_T^j$ at time $j$. Suppose that the frequency of new releases of $O_S$ and $O_T$ is different and that at time $j + 1$ only $O_S$ evolves.
Since this evolution is likely to impact the mappings $M_{ST}^j$, it is necessary to refine $M_{ST}^j$ to guarantee the quality and completeness of $M_{ST}^{j+1}$. Quality is related to the consistency of mappings. For instance, mappings cannot be established between removed concepts. Completeness refers to the recall of aligned concepts in $M_{ST}^{j+1}$. In this investigation, we study how $M_{ST}^j$ can be refined (e.g., new mappings derived) based on ontology changes related to the addition of knowledge. In particular, we address the following research questions:

• How can existing mappings be exploited for mapping refinement?
• Is it possible to achieve mapping refinement for the alignment of new concepts without applying a matching operation to the whole target ontology?
• What is the impact of using the context of concepts $CT(c_i, \lambda)$ in the target ontology on the effectiveness of mapping refinement?

We consider that $O_T$ has not evolved (thus $O_T^j$ and $O_T^{j+1}$ are the same version of the ontology $O_T$). $O_S^j$ and $O_S^{j+1}$ are two distinct versions of the same ontology $O_S$. At time $j + 1$, newly added concepts appear in $O_S^{j+1}$ and we attempt to refine the original mapping set $M_{ST}^j$ to provide a set of valid mappings $M_{ST}^{j+1}$.

3. Mapping Refinement under Concept Addition Changes

As a first step, our approach identifies all newly added concepts using the COnto-Diff tool [Hartung et al. 2013]. This tool identifies atomic and complex ontology changes. Next, we extract the contextual information, i.e., the super, sub and sibling concepts of those newly added concepts (cf. Formula 1). At this stage, we consider $\lambda = 1$ (i.e., direct parent and child concepts). We then examine the existing mappings between the source concepts in the context of the newly added concept and the corresponding target concepts. We propose adequate correspondences for each newly added concept.
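The context extraction step with λ = 1 can be sketched as follows. This is a minimal Python illustration, not the implementation used in this work: the function names `build_hierarchy` and `context`, the representation of the hierarchy as (child, parent) pairs, and the reading of sibling concepts as concepts that share a direct parent are all assumptions made for the example.

```python
from collections import defaultdict


def build_hierarchy(is_a_edges):
    """Index (child, parent) is-a pairs for direct parent/child lookups."""
    parents = defaultdict(set)
    children = defaultdict(set)
    for child, parent in is_a_edges:
        parents[child].add(parent)
        children[parent].add(child)
    return parents, children


def context(concept, parents, children):
    """Return CT(concept, 1) split into direct super, sub and sibling concepts."""
    sup = set(parents.get(concept, ()))
    sub = set(children.get(concept, ()))
    # Assumed reading of sib: other children of any direct parent.
    sib = set()
    for parent in sup:
        sib |= children.get(parent, set())
    sib.discard(concept)
    return {"sup": sup, "sub": sub, "sib": sib}


# Hypothetical toy hierarchy; "thumb" plays the role of a newly added concept.
edges = [
    ("thumb", "finger"),
    ("index_finger", "finger"),
    ("finger", "hand"),
    ("thumbnail", "thumb"),
]
parents, children = build_hierarchy(edges)
print(context("thumb", parents, children))
```

Keeping the three neighbour kinds separate, rather than returning only their union, preserves the information needed later to choose the semantic relation of a derived mapping.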
The idea behind the context-oriented technique is that a candidate mapping is established between a newly added concept and the target concept of a mapping that already exists at time $j$. Figure 1 illustrates the general scenario for the local approach to mapping refinement, where we consider that $O_T^j$ and $O_T^{j+1}$ are the same ontology while the source ontology $O_S^j$ has evolved to $O_S^{j+1}$ and presents newly added concepts at time $j + 1$.

Algorithm 1 computes the diff between two given versions of the source ontology (line 1). For each newly added concept $c_i^1$, the algorithm considers a candidate concept $c_t$ in the target ontology by exploiting already existing mappings related to $CT(c_i^1, 1)$ (lines 4-8). We determine a new refined mapping by calculating the similarity between a new concept $c_i^1$ of $O_S^{j+1}$ and a candidate $c_t$. If the maximum similarity (among the concept attributes) is greater than or equal to a threshold $\tau$, the algorithm establishes a mapping between the newly added concept and the candidate target concept. Algorithm 1 searches for the candidate $c_t$ that yields the maximum similarity value. The semantic type of the new mapping is computed according to the relationship between the newly added concept and the optimal concept $c_{ij}^1 \in CT(c_i^1)$. For instance, if $c_{ij}^1 \in sup(c_i^1)$, then $m_{it}.semType \leftarrow \leq$. Determining the type of semantic relation of the derived mapping requires further research.

Figure 1. Context approach for mapping refinement.
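The refinement procedure described above can be sketched in Python as follows. This is a hedged illustration under stated assumptions rather than the author's implementation: all function and variable names are hypothetical, `difflib.SequenceMatcher` is only a character-level stand-in for the attribute similarity, the candidate set is rebuilt for each added concept, and a per-concept running best is tracked instead of overwriting τ. Only the narrow-to-broad relation for super concepts comes from the text; the relations chosen for sub and sibling concepts are assumptions.

```python
from difflib import SequenceMatcher

# Only the "sup" case (narrow-to-broad) is stated in the text;
# the "sub" and "sib" choices are assumptions for illustration.
RELATION_BY_KIND = {"sup": "≤", "sub": "≥", "sib": "≈"}


def attribute_similarity(a, b):
    """Character-level similarity in [0, 1] (stand-in for sim on attributes)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def concept_similarity(attrs_a, attrs_b):
    """Maximum similarity over all attribute pairs, as in Formula 3."""
    return max(
        (attribute_similarity(x, y) for x in attrs_a for y in attrs_b),
        default=0.0,
    )


def refine_mappings(added, context, mapped_targets, src_attrs, tgt_attrs, tau):
    """Suggest at most one mapping per newly added source concept.

    added          -- ids of concepts added in the new source version
    context        -- concept id -> {"sup": set, "sub": set, "sib": set}
    mapped_targets -- source concept id -> set of target ids mapped at time j
    src_attrs, tgt_attrs -- concept id -> list of attribute strings
    tau            -- minimum similarity for accepting a mapping
    """
    suggestions = []
    for c in added:
        best = None  # (source, target, relation, confidence)
        for kind in ("sup", "sub", "sib"):
            for neighbour in sorted(context[c][kind]):
                # Only targets already mapped to a neighbour are candidates.
                for t in sorted(mapped_targets.get(neighbour, ())):
                    score = concept_similarity(src_attrs[c], tgt_attrs[t])
                    if score >= tau and (best is None or score > best[3]):
                        best = (c, t, RELATION_BY_KIND[kind], score)
        if best is not None:
            suggestions.append(best)
    return suggestions


# Hypothetical toy data: "thumb" is newly added under "finger".
src_attrs = {"thumb": ["thumb", "first digit"]}
tgt_attrs = {"T_digit": ["digit", "finger"], "T_index": ["index finger"]}
context = {"thumb": {"sup": {"finger"}, "sub": set(), "sib": {"index_finger"}}}
mapped_targets = {"finger": {"T_digit"}, "index_finger": {"T_index"}}

result = refine_mappings(["thumb"], context, mapped_targets,
                         src_attrs, tgt_attrs, tau=0.6)
print(result)
```

In this toy run the new concept is compared only with the two target concepts reachable through its neighbours' mappings, never with the whole target ontology, which is the point of the context-based approach.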
Algorithm 1: Contextual approach to mapping refinement
Input: $O_S^j$, $O_S^{j+1}$, $O_T^j$, $O_T^{j+1}$, $M_{ST}^j$, $\tau \in \mathbb{R}$
Output: $M_A = \{m_1, m_2, ..., m_N\}$
 1: $C_{add} \leftarrow diff_{add}(O_S^j, O_S^{j+1})$ {newly added concepts}
 2: $C_t \leftarrow \emptyset$ {initialize target concepts of candidate mappings}
 3: for all $c_i^1 \in C_{add}$ do
 4:     for all $c_{ij}^1 \in CT(c_i^1, 1)$ do
 5:         if $\exists c_t^0 \in C_{O_T^0}, \exists m(c_{ij}^0, c_t^0) \in M_{ST}^0$ then
 6:             $C_t \leftarrow C_t \cup \{c_t^0\}$
 7:         end if
 8:     end for
 9:     $m_{it} \leftarrow \emptyset$
10:     for all $c_t \in C_t$ do
11:         $m_{cand} \leftarrow \arg\max sim(c_i^1, c_t)$ {create a mapping between concepts $c_i^1$ and $c_t$}
12:         if $\max(sim(c_i^1, c_t)) \geq \tau$ then
13:             $m_{it} \leftarrow m_{cand}$
14:             $\tau \leftarrow \max(sim(c_i^1, c_t))$
15:         end if
16:     end for
17:     $M_A \leftarrow M_A \cup \{m_{it}\}$
18: end for
19: return $M_A$

4. Conclusion

Ontology mappings play a central role in semantic data integration. However, updates to domain knowledge lead to new concepts in ontology versions. This requires keeping mapping sets up to date with the knowledge dynamics. In this article, we proposed techniques to refine ontology alignments based on evolving ontologies. We defined an algorithm that creates contextual alignments to increase the chances of obtaining matches between concepts. Our approach relies on the similarity computed between concept attributes to maximize the similarity value (semantic confidence) of the proposed mappings. Future work involves further investigating heuristics to determine the type of semantic relation in the investigated refinement procedures. We plan to conduct exhaustive experiments in different domains to evaluate the quality of the refined mappings.

Acknowledgements

This work is financially supported by the São Paulo Research Foundation (FAPESP)¹ (Grant #2017/02325-5).

References

[Arnold and Rahm 2014] Arnold, P. and Rahm, E. (2014). Enriching ontology mappings with semantic relations. Data & Knowledge Engineering, 93:1–18.

[Dinh et al. 2014] Dinh, D., Reis, J. C. D., Pruski, C., Silveira, M. D., and Reynaud-Delaitre, C. (2014). Identifying relevant concept attributes to support mapping maintenance under ontology evolution. Web Semantics, 29:53–66.

[Euzenat and Shvaiko 2007] Euzenat, J. and Shvaiko, P. (2007). Ontology matching. Springer.

[Groß et al. 2013] Groß, A., Reis, J. C. D., Hartung, M., Pruski, C., and Rahm, E. (2013). Semi-Automatic Adaptation of Mappings between Life Science Ontologies. In Proceedings of the 9th International Conference on Data Integration in the Life Sciences, pages 90–104.

[Gruber 1993] Gruber, T. R. (1993). A translation approach to portable ontology specifications. Knowl. Acquis., 5(2):199–220.

[Hamdi et al. 2010] Hamdi, F., Reynaud, C., and Safar, B. (2010). Pattern-based mapping refinement. Knowledge Engineering and Management by the Masses, pages 1–15.

[Hartung et al. 2013] Hartung, M., Gross, A., and Rahm, E. (2013). COnto-Diff: Generation of Complex Evolution Mappings for Life Science Ontologies. Journal of Biomedical Informatics, 46:15–32.

[Mascardi et al. 2010] Mascardi, V., Locoro, A., and Rosso, P. (2010). Automatic ontology matching via upper ontologies: A systematic evaluation. IEEE Transactions on Knowledge and Data Engineering, 22(5):609–623.

[Reis et al. 2015] Reis, J. C. D., Pruski, C., Silveira, M. D., and Reynaud-Delaitre, C. (2015). DyKOSMap: A framework for mapping adaptation between biomedical knowledge organization systems. Journal of Biomedical Informatics, 55:153–173.

[Shvaiko and Euzenat 2013] Shvaiko, P. and Euzenat, J. (2013). Ontology Matching: State of the Art and Future Challenges. IEEE Trans. Knowl. Data Eng., 25(1):158–176.

¹ The opinions, hypotheses and conclusions or recommendations expressed in this material are the responsibility of the authors and do not necessarily reflect the views of FAPESP.