GMRs: Reconciliation of Generic Merge
                       Requirements in Ontology Integration

              Samira Babalou[0000−0002−4203−1329] , Birgitta König-Ries[0000−0002−2382−9722]

                            Heinz-Nixdorf Chair for Distributed Information Systems
                  Institute for Computer Science, Friedrich Schiller University Jena, Germany
                        {samira.babalou,birgitta.koenig-ries}@uni-jena.de


                     Abstract. With growing popularity of ontologies as semantics-aware
                     integration solutions, various ontology merging methods have been
                     proposed. Each of them uses their own set of criteria to determine a
                     desirable merge result. In this work, we first categorize these criteria into
                     generic merge requirements (GMRs). Not all of these requirements can
                     be met simultaneously. We argue that users should be able to select those
                     requirements that are most important to them, but that system support
                     is required to determine a compatible set of requirements based on user
                     inputs. We propose a graph-theory based approach to determining such
                     sets.

                     Keywords: Ontology merging . Merge requirements . Graph theory


            1      Introduction
            An ontology is a formal explicit description of a domain. It contains a set
            of entities including classes, properties, and instances. Given a set of input
            ontologies and a set of correspondences between them, an ontology merging
            process creates a new merged ontology. One aspect that various approaches to
            ontology merging differ in is the set of criteria they aim to fulfill, i.e., what are
            the requirements they expect the merged ontology to meet. We have conducted
            an analysis of the literature and determined which criteria, here called generic
            merge requirements (GMR), are used by different approaches. Our ultimate aim
            is to provide a flexible merging approach, where users can actively choose which
            requirements are important to them, instead of allowing only a very indirect
            choice by picking the right merging method, something that is not transparent
            to the user. Unfortunately, not all GMRs are compatible with each other. For
            instance, one may want to preserve all classes contained in the original ontology
            in the merged ontology. On the other hand, one could wish to achieve class
            acyclicity. Likely, these goals conflict. Thus, once a user has chosen which GMRs
            they consider important, a system is needed that is able to check whether these
            are compatible and which other requirements can be met simultaneously.
                In this paper, we take the first step towards this goal: We analyse the
            literature to compile a list of GMRs, we provide first insights into their
            compatibility and we describe a graph-theory based method for determining
            maximal sets of compatible GMRs.


Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
            Completeness                                                               Deduction                                 Acyclicity
                                                Logic Properties
          𝑅𝑅𝑅. Class preservation                                               𝑅𝑅12. Entailment deduction        𝑅𝑅𝑅𝑅. Acyclicity in the class hierarchy
                                                    Integrity                           satisfaction
                                                                                                                     𝑅𝑅𝑅𝑅. Acyclicity in the property
        𝑅𝑅𝑅. Property preservation
                                                                                                                                hierarchy
                                                   Minimality                          Constraint
        𝑅𝑅𝑅. Instance preservation                                                                                   𝑅𝑅18. Prohibition of properties
                                                                                 𝑅𝑅13. One type restriction           being inverses of themselves
                                          𝑅𝑅8. Class redundancy prohibition
  𝑅𝑅4. Correspondence preservation
                                             𝑅𝑅9. Property redundancy         𝑅𝑅14. Property value’s constraint             Connectivity
     𝑅𝑅5. Correspondences’ Property                  prohibition
               preservation                                                     𝑅𝑅𝑅𝑅. Property’s domain and       𝑅𝑅19. Unconnected class prohibition
                                             𝑅𝑅𝑅𝑅. Instance redundancy                 range oneness
     𝑅𝑅6. Property’s value preservation              prohibition                                                      𝑅𝑅20. Unconnected property
                                                                                                                              prohibition
                                               𝑅𝑅𝑅𝑅. Extraneous entity            Model Properties
        𝑅𝑅7. Structure preservation                  prohibition                                                        Legend    Dimension   Aspect   GMR


                                                     Fig. 1: GMR classification


2          GMRs Overview and Classification
To discover the most commonly used GMRs, we have studied and analyzed three
different research areas including (i) ontology merging methods [1,2,3,6,7,10], (ii)
ontology merging benchmarks [4,9], and (iii) ontology engineering [5,8]. Overall,
we classify the GMRs according to three dimensions subdivided into different
aspects (see Fig. 1). The three dimensions we identified are:

  – Integrity: This dimension refers to the degree of knowledge coverage in
    the merge process via (i) the completeness aspect, and to the amount of
    knowledge redundancy by (ii) the minimality aspect.
  – Model Properties: Within this dimension, the principles of creating a new
    ontology model are investigated. In this dimension, (i) acyclicity, and (ii)
    connectivity satisfaction aspects are taken into account.
  – Logic Properties: The inference of the expected knowledge with involved
    constraints is analyzed in this dimension. This includes (i) deduction, and
    (ii) constraint satisfaction aspects.

We determined six aspects that GMRs can be grouped into:
Completeness refers to knowledge preservation and coverage:

R1. Class preservation: Each class in (all/target) 1 input ontologies should have
    a mapped class in the merged ontology [1,3,6,7,10].
R2. Property preservation: Each property from the (all/target) input ontologies
    is explicitly in or implied by the merged ontology [7,10].
R3. Instance preservation: All instances of (all/target) input ontologies should
    be preserved in the merged ontology [7,10].
R4. Correspondence preservation: If two entities of the input ontologies are
    corresponding 2 , both should map to the same merged entity in the merged
    ontology [6,7,10].
 1
      Preserving all entities from all input ontologies or a preferred one.
 2
      This can be equality, similarity or is-a correspondences. In each case, the same type
      of corresponding should be preserved.
 R5. Correspondences’ property preservation: If any of the corresponding entities
     from the input ontologies has a certain property, the merged entity should
     also have this property [6,7].
 R6. Property’s value preservation: Properties’ values from the (all/target) input
     ontologies should be preserved in the merged ontology [6,7]. In case of
     conflicts a resolution strategy is required.
 R7. Structure preservation: If two entities are connected via a certain property
     in an input ontology, their mapped entities in the merged ontology should be
     connected via the respective mapped property [1], thus preserving the input
     ontologies’ structures in the merged ontology.
  Minimality refers to knowledge redundancy and controlling of semantic overlap:
 R8. Class redundancy prohibition: A class from the (all/target) input ontologies
     should have at most one mapping in the merged ontology [1,3,4,7,10].
 R9. Property redundancy prohibition: A property from the (all/target) input
     ontologies should have at most one mapping in the merged ontology [4].
R10. Instance redundancy prohibition: An instance from the (all/target) input
     ontologies should have at most one mapping in the merged ontology.
R11. Extraneous entity prohibition: No additional entities other than input
     ontologies’ entities should be added in the merged result [7].
  Deduction refers to the deduction satisfaction with R12 :
R12. Entailment deduction satisfaction: The merged ontology is desirable to be
     able to entail all entailments of the (all/target) input ontologies [2]. As the
     semantic consequences of the integration, it can include more entailments
     but it should at least not miss knowledge from the input ontologies.
  Constraint reflects the satisfaction of the ontology constraints:
R13. One type restriction: Two corresponding entities should follow the same data
     type [7]; e.g., if the range of author Id in one of the input ontology is
     String and in the other one is Integer, then the range of the merged entity
     author Id in the merged ontology cannot have both types.
R14. Property value’s constraint: If the (all/target) input ontologies place some
     restriction on a property’s values (e.g., in terms of cardinality or by
     enumerating possible values) this should be preserved without conflict in
     the merged ontology [7].
R15. Property’s domain and range oneness: The merge process should not result
     in multiple domains or ranges defined for a single property. This rule is recast
     from the ontology modelling issues in [8].
  Acyclicity refers to controlling the chain problem in the merged ontology:
R16. Acyclicity in the class hierarchy: A cycle of is-a relationships implies equality
     of all of the classes in the cycle, since is-a is transitive. Therefore, the merge
     process should not produce a cycle in the class hierarchy [1,3,5,7,8,10].
R17. Acyclicity in the property hierarchy: The merge process should not
     produce a cycle between properties with respect to the is-subproperty-of
     relationship [8].
R18. Prohibition of properties being inverses of themselves: The merged process
     should not cause an inverse recursive definition on the properties [8].
 Connectivity refers to the hierarchy connectivity satisfaction:

R19. Unconnected class prohibition: The merge process should not make the
     classes unconnected [3,7]. Every class that had some connections in the input
     ontologies before the merge process, should not be unconnected after merge
     process in the merged ontology.
R20. Unconnected property prohibition: The merge process should not make the
     properties unconnected [6,8]. Every property that had some connections in
     the input ontologies before the merge process, should not be unconnected
     after merge process in the merged ontology.


 3    GMR Compatibility
 Intuitively, not all GMRs are compatible, e.g., completeness and minimality are
 difficult to achieve simultaneously. In our ongoing work, we are determining the
 precise relationships. Typically, users will, however, not be interested in all of
 them anyhow. Depending on the application they want to create an ontology for,
 some GMRs might be important to them while others are only nice to have. We
 therefore propose an approach that takes as input a set of user-selected GMRs,
 namely U and an undirected graph G reflecting the relationship among GMRs.
 G = (V, E), where V is the set of vertices representing the GMRs, and E is the
 set of edges. Two GMRs are connected via an edge iff they are compatible.
     Given the set U containing the GMRs the user is interested in, we find the
 maximum subset of V containing all vertices out of U and no incompatible nodes.
 This may not always be possible, since the user might have chosen incompatible
 GMRs already. In this case, we search for a maximum subset of V that preserves
 as many nodes out of U as possible and contains compatible nodes only.
     Precisely, we recast the problem at hand as maximum clique extraction on G.
 A clique is a set of fully connected vertices. We thus extract a compatible clique
 KC -Clique, where K indicates the number of vertices in the clique, and C denotes
 that the clique is compatible. KC -Clique includes compatible GMRs from U and
 some additional compatible GMRs related to U’s elements. KC -max-Clique, is a
 clique containing at least K vertices that is not a subset of any other cliques. To
 compute the KC -max-Clique, we use the CLIQUES algorithm in [11].
     To illustrate our approach with an example, Fig. 2 left shows the graph G,
 for six GMRs where U = {R2 , R8 , R11 }. Based on this graph, three different
 cliques have been extracted, in which the user-selected compatible GMRs, the
 incompatible U’s elements, and the additional compatible GMRs related to U
 are indicated with green, red, and yellow circles, respectively. Two possible
 represented 3C -Clique are compatible cliques however they are not a maximal
 clique. The 4C -max-Clique is the best match to the U and indicates the maximal
 compatible subset on the sketched G.
          𝑹𝑹𝟒𝟒   𝑹𝑹𝟐𝟐                     𝑹𝑹𝟒𝟒   𝑹𝑹𝟐𝟐                     𝑹𝑹𝟒𝟒   𝑹𝑹𝟐𝟐                     𝑹𝑹𝟒𝟒   𝑹𝑹𝟐𝟐


 𝑹𝑹𝟏𝟏𝟏𝟏                 𝑹𝑹𝟏𝟏𝟏𝟏   𝑹𝑹𝟏𝟏𝟏𝟏                 𝑹𝑹𝟏𝟏𝟏𝟏   𝑹𝑹𝟏𝟏𝟏𝟏                 𝑹𝑹𝟏𝟏𝟏𝟏   𝑹𝑹𝟏𝟏𝟏𝟏                 𝑹𝑹𝟏𝟏𝟏𝟏


          𝑹𝑹𝟖𝟖   𝑹𝑹𝟏𝟏𝟏𝟏                   𝑹𝑹𝟖𝟖   𝑹𝑹𝟏𝟏𝟏𝟏                   𝑹𝑹𝟖𝟖   𝑹𝑹𝟏𝟏𝟏𝟏                   𝑹𝑹𝟖𝟖   𝑹𝑹𝟏𝟏𝟏𝟏


Fig. 2: From left to right: a graph G for six GMRs; two different compatible
3C -Clique; a compatible 4C -max-Clique.


4         Conclusion
There is a variety of generic merge requirements (GMR) in the ontology
engineering domain. We gathered the most common GMRs 3 , and classified
them into three dimensions. Since not all GMRs can be fulfilled at the same
time, we propose an approach that allows users to specify the most important
GMRs for their specific task, then determines the maximal compatible subset
of GMRs. This result can then be used to select a proper merge method or to
parameterize a generic merge method. We are currently working on two aspects
of this problem: First, a systematic determination of GMR compatibility and
second the development of a generic, parameterizable merge method.
   Acknowledgments. S. Babalou is supported by a scholarship from German
Academic Exchange Service (DAAD).

References
 1. F. Duchateau and Z. Bellahsene. Measuring the quality of an integrated schema.
    In ER, pages 261–273, 2010.
 2. E. Jiménez-Ruiz, B. C. Grau, I. Horrocks, and R. Berlanga. Ontology integration
    using mappings: Towards getting the right logical consequences. In ESWC, 2009.
 3. S. P. Ju, H. E. Esquivel, A. M. Rebollar, and M. C. Su. Creado–a methodology to
    create domain ontologies using parameter-based ontology merging techniques. In
    MICAI, 2011.
 4. M. Mahfoudh, G. Forestier, and M. Hassenforder. A benchmark for ontologies
    merging assessment. In KSEM, pages 555–566, 2016.
 5. N. F. Noy, D. L. McGuinness, et al. Ontology development 101: A guide to creating
    your first ontology, 2001.
 6. N. F. Noy and M. A. Musen. The prompt suite: interactive tools for ontology
    merging and mapping. IJHCS, 59(6):983–1024, 2003.
 7. R. A. Pottinger and P. A. Bernstein.          Merging models based on given
    correspondences. In VLDB Endowment, 2003.
 8. M. Poveda-Villalón, M. C. Suárez-Figueroa, and A. Gómez-Pérez. Validating
    ontologies with oops! pages 267–281, 2012.
 9. S. Raunich and E. Rahm. Towards a benchmark for ontology merging. In OTM,
    volume 7567, pages 124–133, 2012.
10. S. Raunich and E. Rahm. Target-driven merging of taxonomies with atom. Inf.
    Syst., 42:1–14, 2014.
11. E. Tomita, A. Tanaka, and H. Takahashi. The worst-case time complexity for
    generating all maximal cliques and computational experiments. TCS, 2006.
3
    see also: http://comerger.uni-jena.de/requirement.jsp