=Paper= {{Paper |id=None |storemode=property |title=Thesaurus mapping: a challenge for ontology alignment? |pdfUrl=https://ceur-ws.org/Vol-946/om2012_poster8.pdf |volume=Vol-946 |dblpUrl=https://dblp.org/rec/conf/semweb/Ritze012 }} ==Thesaurus mapping: a challenge for ontology alignment?== https://ceur-ws.org/Vol-946/om2012_poster8.pdf
       Thesaurus Mapping: A Challenge for Ontology
                      Alignment?

                            Dominique Ritze and Kai Eckert

                           Mannheim University Library
                dominique.ritze,eckert@bib.uni-mannheim.de

    Thesauri are hierarchical knowledge organization systems commonly used in
libraries to categorize and index publications. Although they are sometimes called
lightweight ontologies [4], they differ fundamentally from ontologies in several
respects. Nevertheless, because thesauri are actively used and constantly maintained
and improved, they offer interesting background knowledge for semantic applications.
This year, we revived the OAEI library track^1, i.e., we challenge the ontology
matching community to create alignments between thesauri. First, we hope to gain
insights into the differences between ontologies and thesauri. Second, we aim to
further integrate existing thesauri by means of new alignments, which should lead
to better search experiences within library systems. From 2007 to 2009, the OAEI
already included a library track [1], which focused on matching thesauri that
describe the same topic at different levels of granularity. For our track, we
selected two closely comparable thesauri with topical overlaps. To make sure that
the created alignments are actually used, we work closely with the maintaining
institutions. We use the following two thesauri:

STW: The Thesaurus for Economics (STW) provides vocabulary on any economic
   subject: more than 6,000 standardized subject headings (in English and German)
   and 19,000 additional keywords. The entries are richly interconnected by 16,000
   broader/narrower and 10,000 related relations. The vocabulary is maintained on a
   regular basis by ZBW German National Library of Economics – Leibniz Centre for
   Economics^2. The thesaurus is available in SKOS [3].
TheSoz: Similar to the STW, the Thesaurus for the Social Sciences (TheSoz) serves as
   a crucial instrument for indexing documents and research information in the social
   sciences. Overall, it contains about 12,000 keywords, of which 8,000 are
   standardized subject headings (in English, German and French) and 4,000 are
   additional keywords. The thesaurus is owned and maintained by GESIS – Leibniz
   Institute for the Social Sciences^3. The thesaurus is available in SKOS [5].

 ^1 http://web.informatik.uni-mannheim.de/oaei-library/2012/
 ^2 http://zbw.eu/index-e.html
 ^3 http://www.gesis.org/en/home/

   The matching results are evaluated against a reference alignment that was
manually created by domain experts in 2006 [2]. It has not been adapted or further
developed since its initial creation and hence does not cover later changes to the
thesauri. Within the reference alignment, a concept can be aligned to more than one
concept (n:m mapping). All in all, the alignment contains 2,839 exact matches and
1,450 subsumptions. Generated correspondences that are not part of the reference
alignment will be evaluated by domain experts as well. It is planned to extend the
reference alignment on the basis of manually evaluated matching results, if the
quality is sufficient to justify the effort.
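
   Evaluating a matcher against the reference alignment can be sketched as follows.
The text does not name the evaluation measures; precision, recall and F-measure are
assumed here because they are customary in alignment evaluation. The function name
`evaluate`, the tuple layout and all concept identifiers are hypothetical. Because
correspondences are handled as a set of (source, target, relation) tuples, n:m
mappings need no special treatment.

```python
def evaluate(found, reference):
    """Return (precision, recall, f1) of found correspondences vs. a reference."""
    found, reference = set(found), set(reference)
    correct = found & reference
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical identifiers; "stw:a" is aligned to two TheSoz concepts (n:m).
reference = {
    ("stw:a", "thesoz:x", "exact"),
    ("stw:a", "thesoz:y", "exact"),
    ("stw:b", "thesoz:z", "exact"),
}
found = {
    ("stw:a", "thesoz:x", "exact"),   # in the reference
    ("stw:c", "thesoz:z", "exact"),   # not in the reference
}

precision, recall, f1 = evaluate(found, reference)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

A correspondence that is missing from the reference counts as wrong in this sketch,
although it may simply postdate the 2006 alignment; this is one reason for the
additional manual evaluation by domain experts.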
   The matchers participating in the OAEI are currently developed for (OWL) ontology
matching. As a starting point for them, we provide an OWL version of the thesauri.
To this end, the SKOS constructs are mapped to RDF/OWL as follows:

         SKOS                                             RDF/OWL
         skos:Concept                                     owl:Class
         skos:prefLabel, skos:altLabel                    rdfs:label
         skos:scopeNote, skos:notation                    rdfs:comment
         skos:related                                     rdfs:seeAlso
         skos:narrower                                    rdfs:superClassOf
         skos:broader                                     rdfs:subClassOf
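
    The table above can be read as a simple predicate lookup. The following is a
minimal sketch, not the conversion actually used for the track: triples are
represented as plain tuples of prefixed names, the `ex:` identifiers are invented,
and dropping predicates that have no entry in the table is an assumption.

```python
# Predicate mapping copied from the table; everything else is illustrative.
SKOS_TO_OWL = {
    "skos:prefLabel": "rdfs:label",
    "skos:altLabel": "rdfs:label",
    "skos:scopeNote": "rdfs:comment",
    "skos:notation": "rdfs:comment",
    "skos:related": "rdfs:seeAlso",
    "skos:narrower": "rdfs:superClassOf",
    "skos:broader": "rdfs:subClassOf",
}


def convert_triple(triple):
    """Map one SKOS triple to RDF/OWL, or return None if it is not covered."""
    subject, predicate, obj = triple
    if predicate == "rdf:type" and obj == "skos:Concept":
        return (subject, "rdf:type", "owl:Class")
    if predicate in SKOS_TO_OWL:
        return (subject, SKOS_TO_OWL[predicate], obj)
    return None  # assumption: unmapped predicates are dropped


def convert(triples):
    """Convert a list of triples, keeping only those covered by the table."""
    converted = (convert_triple(t) for t in triples)
    return [t for t in converted if t is not None]


# Hypothetical input data.
skos_triples = [
    ("ex:metals", "rdf:type", "skos:Concept"),
    ("ex:metals", "skos:prefLabel", "Metals"),
    ("ex:metals", "skos:altLabel", "Metal"),
    ("ex:metals", "skos:broader", "ex:commodities"),
    ("ex:metals", "skos:inScheme", "ex:stw"),
]

for triple in convert(skos_triples):
    print(triple)
```

Both label types end up as rdfs:label, so after the conversion a matcher can no
longer tell the preferred label from an alternative one.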

    There are several issues with such a mapping. First and foremost, a skos:Concept
is not a class. Concepts sometimes represent classes, like COMMODITIES, but other
concepts clearly represent instances, like GERMANY. The mapping of the
broader/narrower relationships is likewise problematic. In the STW, the narrower path
COMMODITIES → METALS → METAL PRODUCTS → RAZOR is found. All metals are commodities,
but metal products like a razor merely consist of metal; they are not metals
themselves. Finally, the expressiveness of SKOS regarding different types of labels,
additional descriptive notes and general concept relations is lost in RDF/OWL.
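
    The problem with the hierarchy mapping becomes visible once a reasoner is
involved: rdfs:subClassOf is transitive under RDFS semantics, whereas skos:broader
is not. The sketch below computes the transitive closure over the STW path quoted
above; the function names and the string representation of concepts are
illustrative assumptions.

```python
# Narrower path quoted from the STW, from broadest to narrowest concept.
NARROWER_PATH = ["Commodities", "Metals", "Metal Products", "Razor"]


def subclass_edges(path):
    """Turn each narrower step parent -> child into (child, parent),
    read as 'child rdfs:subClassOf parent'."""
    return {(child, parent) for parent, child in zip(path, path[1:])}


def transitive_closure(edges):
    """Naive fixpoint computation of the transitive closure."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


asserted = subclass_edges(NARROWER_PATH)
inferred = transitive_closure(asserted) - asserted

for child, parent in sorted(inferred):
    print(f"{child} rdfs:subClassOf {parent}")
```

Among the inferred statements is that Razor is a subclass of Metals, i.e., that
every razor is a metal, which is exactly the unintended consequence described above.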
    Thus, the question arises to what degree current matching systems are hampered
by these oversimplifications and semantic inconsistencies. We hope that specialized
SKOS matchers will join the challenge and that they will outperform the generic
ontology matchers. In this way, the library track can contribute to the integration
of thesauri in real-world applications. As a side effect, we would like to stimulate
discussion on how thesauri relate to ontologies and which role they might play in
the Semantic Web.


References
1. Antoine Isaac, Lourens van der Meij, Shenghui Wang, and Henk Matthezing. Results of the
   OAEI 2007 Library Thesaurus Mapping Track. Technical report, VU Amsterdam, 2007.
2. Philipp Mayr and Vivien Petras. Building a Terminology Network for Search: The KoMoHe
   Project. In Proc. of the Int. Conference on Dublin Core and Metadata Applications, pages
   177–182, 2008.
3. Joachim Neubert. Bringing the “Thesaurus for Economics” on to the Web of Linked Data. In
   Proc. of the WWW Workshop on Linked Data on the Web (LDOW), 2009.
4. Michael Uschold and Michael Gruninger. Ontologies and semantics for seamless connectivity.
   SIGMOD Rec., 33(4):58–64, 2004.
5. Benjamin Zapilko, Johann Schaible, Philipp Mayr, and Brigitte Mathiak. TheSoz: A SKOS
   Representation of the Thesaurus for the Social Sciences. Semantic Web – Interoperability,
   Usability, Applicability. Accepted.