
Anchor-Flood: Results for OAEI 2009

Md. Hanif Seddiqui

hanif@kde.ics.tut.ac.jp

Masaki Aono

aono@ics.tut.ac.jp

Toyohashi University of Technology, Japan

Our ontology schema matching algorithm exploits the locality of reference: it considers neighboring concepts and relations when aligning the entities of two ontologies. It starts from a seed point called an anchor (a pair of “look-alike” concepts across ontologies) and collects two blocks of neighboring concepts, one from each ontology. The concepts of the two blocks are aligned, and the process is repeated for each newly found aligned pair. This year, we build a semantically reformed dynamic block of concepts around each anchor concept, producing two blocks from one anchor to obtain the alignment. We have also improved our memory management. The experimental results show the effectiveness of the algorithm on the benchmark, the anatomy track and other datasets. We also extend our algorithm to match instances of the IIMB benchmark, where we obtained effective results.

1 Presentation of the system

During OAEI-2008, our ontology alignment system used the locality of reference to collect neighboring concepts within an arbitrary depth for aligning concepts across a pair of ontologies. This year, we incorporate a process that collects concepts with strong intrinsic semantic similarity within an ontology, based on intrinsic Information Content (IC) [6], to form a dynamic block. Hence our system forms a pair of dynamic blocks starting from an anchor across ontologies. We have improved our memory management to cope with large-scale ontology alignment effectively. Our algorithm has a shorter run time than last year's version, and it needs less memory and less time to align large ontologies. We participate in the benchmark track, in all four tasks of the anatomy track, and in the conference and directory tracks. Our participation in the instance matching track is limited to its IIMB benchmark.

1.1 State, purpose, general statement

The purpose of our Anchor-Flood algorithm [8] is primarily ontology matching. However, we also use it in a patent mining system to classify research abstracts according to the International Patent Classification (IPC). Because an abstract contains mostly general terminology, classifying it is a formidable task. A taxonomy of related terms, extracted automatically from an abstract, is successfully aligned with the taxonomy of the IPC ontology by our algorithm.

Furthermore, we use our algorithm to integrate the multimedia resources represented by MPEG-7 [5] ontologies [11]. We have achieved good performance with effective results in the field of multimedia resource integration [7].

To be specific, we describe our Anchor-Flood algorithm, instance matching algorithm and their results against OAEI 2009 datasets here.

1.2 Specific techniques used

Our system has two parts. One is the ontology schema matching algorithm, Anchor-Flood, which aligns the concepts and properties of a pair of ontologies. The other is the instance matching approach, which essentially uses our Anchor-Flood algorithm. We implement our system in Java. We create our own memory model of an ontology using the ARP triple parser of the Jena module.
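As a rough illustration of what such a memory model might hold, the following Python sketch builds parent, child and sibling lookups from subject-predicate-object triples. It is not the authors' Java implementation and does not use Jena's ARP parser; the class `OntologyModel`, the `load` helper and the toy triples are hypothetical names introduced here.

```python
from collections import defaultdict

SUBCLASS_OF = "rdfs:subClassOf"
LABEL = "rdfs:label"


class OntologyModel:
    """Hypothetical in-memory model: labels plus parent/child links."""

    def __init__(self):
        self.labels = {}
        self.parents = defaultdict(set)
        self.children = defaultdict(set)

    def add_triple(self, subject, predicate, obj):
        # Only the two predicates needed for this sketch are indexed.
        if predicate == SUBCLASS_OF:
            self.parents[subject].add(obj)
            self.children[obj].add(subject)
        elif predicate == LABEL:
            self.labels[subject] = obj

    def siblings(self, concept):
        # Siblings are the other children of any parent of the concept.
        found = set()
        for parent in self.parents[concept]:
            found |= self.children[parent]
        found.discard(concept)
        return found


def load(triples):
    """Feed an iterable of (subject, predicate, object) into a model."""
    model = OntologyModel()
    for subject, predicate, obj in triples:
        model.add_triple(subject, predicate, obj)
    return model


# Toy triples (assumed data, not taken from any OAEI ontology).
triples = [
    ("Heart", SUBCLASS_OF, "Organ"),
    ("Lung", SUBCLASS_OF, "Organ"),
    ("Heart", LABEL, "heart"),
]
model = load(triples)
print(model.siblings("Heart"))
```

Indexing the triples once makes later neighbor lookups (super-concepts, sub-concepts, siblings) cheap, which is the kind of access a block-building step would need.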

1.2.1 Ontology Schema Matching

As part of preprocessing, our system parses the ontologies into our own memory model using the ARP triple parser of Jena. We also normalize the lexical descriptions of ontology entities. Our schema matching algorithm starts from an anchor. It dynamically collects small blocks of concepts and related properties by considering the super-concepts, sub-concepts, siblings and a few other neighbors of the anchor point. The size of the blocks adversely affects the running time. Therefore, we incorporate semantic similarity based on intrinsic Information Content (IC) when building blocks of neighboring concepts from anchor concepts.
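The block-building idea can be sketched as follows. The text does not give the exact IC formula or similarity measure, so this example assumes a hyponym-count form of intrinsic IC, `1 - log(descendants + 1) / log(total concepts)`, and a Lin-style similarity; the toy taxonomy, the threshold of 0.5 and all function names are hypothetical.

```python
import math
from collections import deque

# Toy single-inheritance taxonomy: child -> parent (assumed data).
PARENT = {
    "Organ": "Thing", "Vehicle": "Thing",
    "Heart": "Organ", "Lung": "Organ",
    "LeftVentricle": "Heart", "RightVentricle": "Heart",
    "Car": "Vehicle",
}
CONCEPTS = set(PARENT) | set(PARENT.values())


def ancestors(concept):
    """Return the concept followed by its super-concepts up to the root."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def intrinsic_ic(concept):
    """Assumed intrinsic IC: concepts with fewer descendants score higher."""
    descendants = sum(
        1 for x in CONCEPTS if x != concept and concept in ancestors(x)
    )
    return 1.0 - math.log(descendants + 1) / math.log(len(CONCEPTS))


def lin_similarity(a, b):
    """Lin-style similarity using the most informative common ancestor."""
    ancestors_b = set(ancestors(b))
    common = [x for x in ancestors(a) if x in ancestors_b]
    denominator = intrinsic_ic(a) + intrinsic_ic(b)
    if not common or denominator == 0:
        return 0.0
    return 2.0 * max(intrinsic_ic(x) for x in common) / denominator


def neighbors(concept):
    """Super-concept, siblings and sub-concepts of a concept."""
    result = {x for x, p in PARENT.items() if p == concept}
    if concept in PARENT:
        parent = PARENT[concept]
        result.add(parent)
        result |= {x for x, p in PARENT.items() if p == parent}
    result.discard(concept)
    return result


def build_block(anchor, threshold=0.5):
    """Grow a block from the anchor, keeping only similar neighbors."""
    block, queue = {anchor}, deque([anchor])
    while queue:
        current = queue.popleft()
        for candidate in neighbors(current):
            if candidate in block:
                continue
            if lin_similarity(anchor, candidate) >= threshold:
                block.add(candidate)
                queue.append(candidate)
    return block


print(sorted(build_block("Heart")))
```

In this toy run the sibling `Lung` is left out of the block around `Heart` because its similarity to the anchor falls below the threshold, which shows how a similarity cut-off keeps blocks small.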

The local alignment process aligns concepts and their related properties based on lexical information [2, 10, 12] and structural relations [1, 3, 4]. Retrieved aligned pairs are considered as anchors for further processing. The process is repeated until there are no more aligned pairs to process. Hence, it bursts out with a pair of aligned fragments of the ontologies, giving the taste of segmentation [9]. Multiple anchors from different parts of the ontologies ensure a fair collection of aligned pairs as a whole.
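The repeat-until-exhausted loop described above can be sketched as a worklist algorithm. This is a simplified illustration, not the authors' implementation: it uses only a lexical score from Python's `difflib.SequenceMatcher` (the structural measures are omitted), blocks are plain neighbor lists, and the names `anchor_flood`, `O1`, `O2` and the threshold of 0.8 are assumptions.

```python
from collections import deque
from difflib import SequenceMatcher


def normalize(label):
    """Lower-case a label and treat '_' and '-' as spaces."""
    return label.replace("_", " ").replace("-", " ").lower().strip()


def lexical_similarity(a, b):
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def anchor_flood(anchors, block_of_1, block_of_2, threshold=0.8):
    """Align the blocks around each anchor; new pairs become anchors."""
    aligned = set(anchors)
    queue = deque(anchors)
    while queue:  # stops when no unprocessed aligned pair is left
        c1, c2 = queue.popleft()
        for n1 in block_of_1(c1):
            for n2 in block_of_2(c2):
                pair = (n1, n2)
                if pair in aligned:
                    continue
                if lexical_similarity(n1, n2) >= threshold:
                    aligned.add(pair)
                    queue.append(pair)
    return aligned


# Toy neighbor lists standing in for the two ontologies (assumed data).
O1 = {
    "Heart": ["Left_Ventricle", "Organ"],
    "Left_Ventricle": ["Heart", "Mitral_Valve"],
    "Organ": ["Heart"],
    "Mitral_Valve": ["Left_Ventricle"],
}
O2 = {
    "heart": ["left ventricle", "organ"],
    "left ventricle": ["heart", "mitral valve"],
    "organ": ["heart"],
    "mitral valve": ["left ventricle"],
}

result = anchor_flood(
    [("Heart", "heart")],
    lambda c: O1.get(c, []),
    lambda c: O2.get(c, []),
)
print(sorted(result))
```

Only concepts inside the two current blocks are compared, so the number of comparisons depends on block size rather than on the product of the ontology sizes. `Mitral_Valve` is reached only through the newly aligned `Left_Ventricle` pair, which is the flooding behavior.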

1.2.2 Ontology Instance Matching

In an ontology, neither a concept nor an instance carries its full specification in its name or URI alone. Therefore we consider the semantically linked information, which includes linked concepts, properties and their values, and other instances as well. Together they form an information cloud that specifies the meaning of that particular instance. We refer to this collective information of association as a Semantic Link cloud. The degree of certainty is proportional to the number of semantic links associated with a particular instance by means of property values and other instances. First, the pair of TBoxes is aligned with our Anchor-Flood algorithm. Then, we check whether the type of an instance is aligned to any concept among the neighbors of the type of another instance across the ABoxes. We measure the structural similarity among the elements available in a pair of clouds to produce the instance alignment. The instance matching algorithm is depicted in Fig. 2 and in Fig. 3.

The Anchor-Flood algorithm needs an anchor to start from. Therefore, we use a tiny program module to extract probable aligned pairs as anchors. It uses lexical information and some statistical information to extract a small number of aligned pairs from different parts of the ontologies. The program is essentially smaller, simpler and faster. We also removed the subsumption module of our algorithm to keep it faster.
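The Semantic Link cloud comparison used for instance matching can be illustrated with a small sketch. It assumes that a cloud is a set of (property, value) pairs, that the structural similarity is a Jaccard overlap computed after rewriting terms through the TBox alignment, and that a threshold of 0.5 decides a match; none of these choices is stated in the text, and all identifiers and data below are hypothetical.

```python
def map_cloud(cloud, alignment):
    """Rewrite source-ontology terms in a cloud into target terms."""
    return {(alignment.get(p, p), alignment.get(v, v)) for p, v in cloud}


def cloud_similarity(cloud1, cloud2, alignment):
    """Jaccard overlap of two clouds under the TBox alignment."""
    mapped = map_cloud(cloud1, alignment)
    union = mapped | cloud2
    return len(mapped & cloud2) / len(union) if union else 0.0


def types_compatible(type1, type2, alignment, neighbors2):
    """True if type1 is aligned to type2 or to one of its neighbors."""
    candidates = {type2} | neighbors2.get(type2, set())
    return alignment.get(type1) in candidates


def match_instances(inst1, inst2, alignment, neighbors2, threshold=0.5):
    matches = []
    for i1, (type1, cloud1) in inst1.items():
        for i2, (type2, cloud2) in inst2.items():
            if not types_compatible(type1, type2, alignment, neighbors2):
                continue  # skip pairs whose types are unrelated
            score = cloud_similarity(cloud1, cloud2, alignment)
            if score >= threshold:
                matches.append((i1, i2, score))
    return matches


# TBox alignment as it might come out of schema matching (assumed data).
alignment = {
    "a:Actor": "b:Performer", "a:Film": "b:Movie",
    "a:name": "b:label", "a:bornIn": "b:birthPlace",
}
neighbors2 = {"b:Person": {"b:Performer"}}
inst1 = {
    "a:i1": ("a:Actor", {("a:name", "Tom Hanks"), ("a:bornIn", "Concord")}),
    "a:i2": ("a:Film", {("a:name", "Big")}),
}
inst2 = {
    "b:x1": ("b:Person", {("b:label", "Tom Hanks"),
                          ("b:birthPlace", "Concord"),
                          ("b:gender", "male")}),
    "b:x2": ("b:Movie", {("b:label", "Big")}),
}

matches = match_instances(inst1, inst2, alignment, neighbors2)
for source, target, score in matches:
    print(source, target, round(score, 3))
```

The type check prunes most candidate pairs before any cloud is compared, and an instance with more links produces a larger cloud, so a high overlap for it is stronger evidence of a match.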

1.4 Link to the system and parameters file

The version of Anchor-Flood for OAEI-2009 can be downloaded from our website: http://www.kde.ics.tut.ac.jp/~hanif/res/2009/anchor_flood.zip. The parameter file is also included in the anchor_flood.zip file. We recommend that readers read the readme.txt file first; it briefly gives the necessary description and the parameters.

1.5 Link to the set of provided alignments (in align format)

The results for OAEI-2009 are available at our website: http://www.kde.ics.tut.ac.jp/~hanif/res/2009/aflood.zip.

2 Results

In this section, we describe the results of the Anchor-Flood algorithm on the benchmark, anatomy, directory and conference ontologies and on the IIMB instance matching benchmark provided by the OAEI 2009 campaign. This year we also participate in the directory and conference tracks for the first time.

2.1 benchmark

2.4 Instance Matching: IIMB Benchmarks

Based on the type of transformation, the benchmark dataset is divided into four groups: 001-010, 011-019, 020-029 and 030-037. Table 3 shows the precision and recall for each group. The detailed results are given in the Annex of this paper.
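As a quick consistency check on Table 3, the sketch below recomputes the F-measure from the rounded precision and recall values, assuming the usual harmonic-mean definition F = 2PR / (P + R); the paper does not state the formula explicitly. Small differences from the reported figures are expected, since the reported F-measures were presumably computed from unrounded precision and recall.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Group -> (precision, recall, reported F-measure), copied from Table 3.
TABLE_3 = {
    "001-010": (0.99, 0.99, 0.991),
    "011-019": (0.72, 0.79, 0.751),
    "020-029": (1.00, 0.96, 0.981),
    "030-037": (0.75, 0.82, 0.786),
}

for group, (p, r, reported) in TABLE_3.items():
    print(group, round(f_measure(p, r), 3), reported)
```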

Table 3. Instance matching results against IIMB benchmarks

Datasets   Transformation                                         Prec.   Rec.   F-Measure
001-010    Value transformations                                  0.99    0.99   0.991
011-019    Structural transformations                             0.72    0.79   0.751
020-029    Logical transformations                                1.00    0.96   0.981
030-037    Several combinations of the previous transformations   0.75    0.82   0.786

3 General comments

In this section, we comment on the results of our system and on ways to improve it.

3.1 Comments on the results

The main strength of our schema matching system is that it minimizes the number of comparisons between entities, which improves the running time. In instance matching, our system shows its strength on value and logical transformations. Its weak point in ontology alignment is that it ignores some distantly placed aligned pairs. In instance matching, we still have room to improve on structural transformations.

3.2 Discussions on the way to improve the proposed system

There is still room to improve the alignments by strengthening the semantic and structural analysis and by adding background knowledge. We also want to incorporate complex alignments such as subsumption and 1:n alignments. In instance matching, we want to improve our system on structural transformations.

4 Conclusion

Ontology matching is very important for attaining interoperability, as an ontology is the core of every semantic application. We implemented a faster algorithm that aligns specific interrelated parts across ontologies, which gives the flavor of segmentation. The anatomy ontology matching shows the effectiveness of our Anchor-Flood algorithm. Our instance matching algorithm also shows its strength on value and logical transformations. On structural transformations our algorithm is also effective, despite the challenging transformations. We improved our previous Anchor-Flood algorithm in several respects to retrieve ontology alignments. Furthermore, we improve its versatility by using it in different applications, including instance matching, patent classification and multimedia resource integration.

Annex (fragment of the detailed IIMB results; the column heading is not recoverable)

Dataset   018  019  020  021  022  023  024  025  026  027  028  029  030  031  032  033  034  035  036  037
Value      51   26   93   93   93   93   93   93   93   93   93   93   65   26   99   95   76   36   95   30

References

1. Bouquet, P., Serafini, L. and Zanobini, S.: Semantic Coordination: A New Approach and an Application. Proceedings of the 2nd International Semantic Web Conference (ISWC2003), Sanibel Island, Florida, USA (2003) pp. 130-145.

2. Proceedings of the 16th European Conference on Artificial Intelligence (ECAI2004), Valencia, Spain (2004) pp. 333-337.

3. Giunchiglia, F. and Shvaiko, P.: Semantic Matching. The Knowledge Engineering Review, Cambridge Univ Press, Vol. 18(3), 2004, pp. 265-280.

4. Giunchiglia, F., Shvaiko, P. and Yatskevich, M.: S-Match: an Algorithm and an Implementation of Semantic Matching. Proceedings of the 1st European Semantic Web Symposium (ESWS2004), Heraklion, Greece (2004) pp. 61-75.

5. Nack, F. and Lindsay, A.T.: Everything you wanted to know about MPEG-7 (Part I). IEEE Multimedia, Vol. 6(3), 1999, pp. 65-77.

6. Proceedings of the 14th International Joint Conference on Artificial Intelligence, Montreal, Canada (1995) pp. 448-453.

7. Seddiqui, M.H. and Aono, M.: MPEG-7 based Multimedia Information Integration through Instance Matching. IEEE International Conference on Semantic Computing, Berkeley, CA, USA (2009) pp. 618-623.

8. Seddiqui, M.H. and Aono, M.: An Efficient and Scalable Algorithm for Segmented Alignment of Ontologies of Arbitrary Size. Web Semantics: Science, Services and Agents on the World Wide Web (2008), doi:10.1016/j.websem.2009.09.001.

9. Seidenberg, J. and Rector, A.: Web Ontology Segmentation: Analysis, Classification and Use. Proceedings of the 15th International Conference on World Wide Web (WWW2006), Edinburgh, Scotland (2006) pp. 13-22.

10. Proceedings of the 4th International Semantic Web Conference (ISWC2005), Galway, Ireland (2005) pp. 623-637.

11. Troncy, R., et al.: Mpeg-7 based Multimedia Ontologies: Interoperability Support or Interoperability Issue. Proceedings of the 1st International Workshop on Multimedia Annotation and Retrieval enabled by Shared Ontologies (MAReSO), Genova, Italy (2007).

12. Technical Report, Statistical Research Division, U.S. Census Bureau, Washington, USA (1999).