
Is my ontology matching system similar to yours?⋆

Ernesto Jiménez-Ruiz

Bernardo Cuenca Grau

Ian Horrocks

Department of Computer Science, University of Oxford, Oxford, UK

In this paper we extend the evaluation of the OAEI 2012 Large BioMed track, which involves the matching of the semantically rich ontologies FMA, NCI and SNOMED CT. Concretely, we report on the differences and similarities among the mappings computed by the participating ontology matching systems.

Introduction

The quality of the mappings computed by an ontology matching system in the Ontology Alignment Evaluation Initiative (OAEI) [2, 1] is typically measured in terms of precision and recall with respect to a reference set of mappings. Additionally, the OAEI also evaluates the coherence of the computed mappings [1]. However, the differences and similarities among the mappings computed by different systems have often been neglected in the OAEI.¹

In this paper we provide a more fine-grained comparison among the matching systems participating in the OAEI 2012 Large BioMed track;² concretely, (i) we have harmonized the computed mapping sets by voting, and (ii) we provide a graphical representation of the similarity of these sets.

Mapping harmonization

We have considered the mappings voted for (i.e. included in the output) by at least one ontology matching system. Figure 1 shows the harmonization (i.e. voting) results for the FMA-NCI and FMA-SNOMED matching problems. Mappings have received at most 11 and 8 votes (i.e. the number of participating systems³), respectively. For example, in the FMA-NCI matching problem, 3,719 mappings have received votes from at least 2 systems.

Figure 1 also shows the evolution of F-score, precision and recall for the different harmonized mapping sets. As expected, the maximum recall (respectively precision) is reached with the minimum (respectively maximum) number of votes. For example, the maximum recall in the FMA-SNOMED problem is 0.81, which shows the difficulty of identifying correct mappings in this matching problem.

The harmonized mapping sets with the best trade-off between precision and recall have been selected as the representative mapping sets of the participating ontology matching systems. For the FMA-NCI matching problem we have selected the mapping sets with (at least) 3, 4 and 5 votes, while in the FMA-SNOMED matching problem we have selected the sets with (at least) 2 and 3 votes (see dark-grey bars in Figure 1).

⋆ This research was financed by the Optique project with the grant agreement FP7-318338.
¹ As far as we know, only in the 2007 Anatomy track was some effort made in this direction: http://oaei.ontologymatching.org/2007/results/anatomy/
² Results available at: http://www.cs.ox.ac.uk/isg/projects/SEALS/oaei/
³ Systems with several variants have only been considered once in the voting process.

[Figure 1: Harmonization (voting) results for the FMA-NCI and FMA-SNOMED matching problems. x-axis: number of votes; y-axis: number of mappings.]

[Figure 2: Two-dimensional scatterplots of the similarity among the system, harmonized (Vote) and UMLS reference mapping sets (scale: Jaccard distance).]
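The voting scheme described in the mapping harmonization section can be illustrated with a minimal Python sketch. It is not the code used for the OAEI evaluation: the function names `harmonize` and `precision_recall_f`, the representation of a mapping as a pair of entity identifiers, and the toy data are all assumptions made for illustration. The official evaluation may also treat some reference mappings specially, which this sketch ignores.

```python
from collections import Counter


def harmonize(system_mappings, min_votes):
    """Return the mappings output by at least `min_votes` systems.

    `system_mappings` maps a (hypothetical) system name to the set of
    mappings it computed; each mapping is a hashable pair of entities.
    """
    votes = Counter()
    for mappings in system_mappings.values():
        # A system votes at most once for each mapping.
        votes.update(set(mappings))
    return {m for m, n in votes.items() if n >= min_votes}


def precision_recall_f(computed, reference):
    """Precision, recall and F-score of `computed` w.r.t. `reference`."""
    correct = len(computed & reference)
    precision = correct / len(computed) if computed else 0.0
    recall = correct / len(reference) if reference else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Toy data (illustrative only, not taken from the track results).
systems = {
    "sysA": {("a1", "b1"), ("a2", "b2"), ("a3", "b3")},
    "sysB": {("a1", "b1"), ("a2", "b2")},
    "sysC": {("a1", "b1"), ("a4", "b4")},
}
reference = {("a1", "b1"), ("a2", "b2"), ("a5", "b5")}

for k in range(1, len(systems) + 1):
    voted = harmonize(systems, k)
    p, r, f = precision_recall_f(voted, reference)
    print(f"votes>={k}: {len(voted)} mappings, P={p:.2f} R={r:.2f} F={f:.2f}")
```

On the toy data, raising the vote threshold shrinks the harmonized set, so precision can only be maintained or improved at the cost of recall, which mirrors the trend reported for Figure 1.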

Mapping similarity among systems

We have compared the similarity among (i) the representative mapping sets from the harmonization (see Section 2), (ii) the UMLS-based reference mappings of the track, and (iii) the mapping sets computed by the top-8 ontology matching systems in the FMA-NCI and FMA-SNOMED matching problems [1]. To this end, we have calculated the Jaccard distance $(|M_A \cup M_B| - |M_A \cap M_B|) / |M_A \cup M_B|$, which ranges from 0 (identical sets) to 1 (disjoint sets), between each pair of mapping sets $M_A$ and $M_B$ from (i)-(iii), and represented these distances in a two-dimensional scatterplot (see Figure 2). System names that are distant from each other indicate that their computed mappings differ to a large degree. For example, in Figure 2 (right), the mappings computed by LogMapLt and GOMMA are very different from those computed by the other systems, as well as from the harmonized and reference mapping sets.
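The Jaccard distance defined above can be computed for every pair of mapping sets with a short Python sketch. The names `jaccard_distance` and `pairwise_distances` and the toy sets are assumptions made for illustration; the sketch stops at the pairwise distances and does not reproduce the two-dimensional layout of Figure 2, since the text does not state how that projection was obtained.

```python
from itertools import combinations


def jaccard_distance(ma, mb):
    """(|MA ∪ MB| - |MA ∩ MB|) / |MA ∪ MB| for two mapping sets."""
    union = ma | mb
    if not union:
        # Two empty sets are treated as identical (an assumption).
        return 0.0
    return (len(union) - len(ma & mb)) / len(union)


def pairwise_distances(mapping_sets):
    """Jaccard distance for each unordered pair of named mapping sets."""
    return {
        (a, b): jaccard_distance(mapping_sets[a], mapping_sets[b])
        for a, b in combinations(sorted(mapping_sets), 2)
    }


# Toy mapping sets (illustrative only, not taken from the track results).
toy_sets = {
    "sysA": {1, 2, 3},
    "sysB": {2, 3, 4},
    "reference": {1, 2, 3},
}

for pair, dist in pairwise_distances(toy_sets).items():
    print(pair, round(dist, 2))
```

The resulting dictionary is a symmetric distance matrix in sparse form, which could then be handed to any dimensionality-reduction method to obtain a scatterplot.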

1. Aguirre, J., et al.: Results of the Ontology Alignment Evaluation Initiative 2012. In: Ontology Matching Workshop. Vol. 946 of CEUR Workshop Proceedings (2012)

2. Euzenat, J., Meilicke, C., Stuckenschmidt, H., Shvaiko, P., Trojahn, C.: Ontology alignment evaluation initiative: Six years of experience. J. Data Sem. 15, 158-192 (2011)