<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Ontology-based Data Matching Framework: Use Case Competency-based HRM</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Peter De Baer</string-name>
          <email>Peter.De.Baer@vub.ac.be</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yan Tang</string-name>
          <email>Yan.Tang@vub.ac.be</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pieter De Leenheer</string-name>
          <email>Pieter@Collibra.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Collibra nv/sa</institution>
          ,
          <addr-line>Ransbeekstraat 230, 12 Brussels</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>VUB STARLab, Vrije Universiteit Brussel</institution>
          ,
          <addr-line>Pleinlaan 2, 1050 Brussels</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>As part of the European PROLIX (Process Oriented Learning and Information eXchange) project, VUB STARLab designed a generic ontology-based data matching framework (ODMF). Within the project, the ODMF is used to calculate the similarity between data elements, e.g. competency, function, person, task, and qualification, based on competency information. Several ontology-based data matching strategies were implemented and evaluated as part of the ODMF. In this article we describe the ODMF and discuss the implemented matching strategies.</p>
      </abstract>
      <kwd-group>
        <kwd>data matching</kwd>
        <kwd>competency management</kwd>
        <kwd>matchmaking</kwd>
        <kwd>ontology</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>ODMF</title>
      <p>
        Semantic data matching plays an important role in many modern ICT systems.
Examples are data mining [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], electronic markets [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], HRM [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], service discovery [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ],
etc. Many existing solutions, for example [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], make use of description logics and are
often tightly coupled to specific ontology engineering platforms and/or data matching
domains. This often creates a knowledge bottleneck, because many potential domain
users and domain experts may not be familiar with description logics or with the specific
platform at hand. To avoid this technical barrier, we designed the ODMF so
that it is independent of any particular ontology engineering platform and does not
require the use of description logics. Instead, we combine an
ontologically structured terminological database [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and a DOGMA ontology [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] to
describe data. Both the DOGMA ontology and the terminological database make use
of natural language to describe meaning. On top of this semantic data model we
developed an interpreter module and a comparison module. Both the interpreter and
the comparator make use of a library of matching algorithms. The matching
algorithms have access to the data model via an API, and may be written in any
programming language that can access this Java API. Via the terminology base, data
can be described and interpreted in different natural languages. We believe that this
multilingualism will improve the usefulness of the framework within an international
setting.
      </p>
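      <p>The architecture described above, in which an interpreter and a comparator share one library of matching algorithms, can be sketched as follows. This is a minimal illustration only: the names MatchingAlgorithm, MatcherLibrarySketch, register, and compare are hypothetical and do not reflect the actual ODMF API, and the data model is reduced to plain strings.</p>
      <preformat><![CDATA[
```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical contract for a matching algorithm; not the actual ODMF API. */
interface MatchingAlgorithm {
    /** Returns a similarity score in [0, 1] for two data-element labels. */
    double similarity(String left, String right);
}

/** Hypothetical library that an interpreter and a comparator could both consult. */
class MatcherLibrarySketch {
    private final Map<String, MatchingAlgorithm> algorithms = new LinkedHashMap<>();

    /** Adds an algorithm under a name, replacing any earlier entry with that name. */
    void register(String name, MatchingAlgorithm algorithm) {
        algorithms.put(name, algorithm);
    }

    /** Runs the named algorithm; fails loudly if it was never registered. */
    double compare(String name, String left, String right) {
        MatchingAlgorithm algorithm = algorithms.get(name);
        if (algorithm == null) {
            throw new IllegalArgumentException("Unknown matching algorithm: " + name);
        }
        return algorithm.similarity(left, right);
    }

    public static void main(String[] args) {
        MatcherLibrarySketch library = new MatcherLibrarySketch();
        // A trivial case-insensitive exact matcher, used only as a placeholder.
        library.register("exact", (a, b) -> a.equalsIgnoreCase(b) ? 1.0 : 0.0);
        System.out.println(library.compare("exact", "Java programming", "java programming"));
    }
}
```
]]></preformat>
      <p>Keeping the algorithms behind one small interface is one way to let new matching strategies be added without changing the modules that call them.</p>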
      <p>The ODMF is designed to support data matching in general. So far, however, it
has only been implemented and evaluated as part of the European
integrated PROLIX project<sup>1</sup>. Within the PROLIX platform<sup>2</sup>, the ODMF supports
semantic matching of competency-based data elements, e.g. competency, function,
person, task, and qualification.</p>
    </sec>
    <sec id="sec-2">
      <title>Matching strategies</title>
      <p>We implemented and evaluated several ontology-based data matching algorithms
within the ODMF. These algorithms fall into three major groups: (1) string matching,
(2) lexical matching, and (3) graph matching. Most matching algorithms, however,
combine several of these techniques.</p>
      <list list-type="order">
        <list-item>
          <p>String matching techniques are useful to identify data objects, e.g. competences
and qualifications, using a (partial) lexical representation of the object. We
selected two matching tools for this type of data matching: (a) regular
expressions and (b) the SecondString<sup>3</sup> library.</p>
        </list-item>
        <list-item>
          <p>Lexical matching techniques are useful to identify data objects, e.g. competences
and qualifications, using a (partial) lexical representation of the object. In
addition to plain string matching techniques, linguistic information is used to
improve the matching. We selected two techniques to improve the matching: (a)
tokenization and lemmatization and (b) the use of an ontologically structured
terminological database.</p>
        </list-item>
        <list-item>
          <p>Graph matching techniques are useful (a) to calculate the similarity between two
given objects and (b) to find related objects for a given object.</p>
        </list-item>
      </list>
      <p><sup>1</sup> http://www.prolixproject.org/</p>
      <p><sup>2</sup> http://prolixportal.prolix-dev.de/</p>
      <p><sup>3</sup> http://secondstring.sourceforge.net/</p>
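      <p>The three groups of techniques listed above can be illustrated with a small sketch. It is not the ODMF implementation: the class and method names are hypothetical, the SecondString library is not used, lemmatization is replaced by a naive plural-stripping stand-in, and graph similarity is assumed to be the Jaccard overlap of direct neighbours, which is only one of many possible measures. Only case (a) of graph matching, the similarity between two given objects, is covered.</p>
      <preformat><![CDATA[
```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

/** Illustrative sketch only; names and similarity measures are assumptions. */
class MatchingStrategiesSketch {

    /** String matching: does the label contain the (partial) pattern, ignoring case? */
    static boolean regexMatch(String partialPattern, String label) {
        return Pattern.compile(partialPattern, Pattern.CASE_INSENSITIVE).matcher(label).find();
    }

    /**
     * Lexical matching, step 1: lower-case ASCII tokens. Stripping a final "s"
     * is a crude stand-in for real lemmatization.
     */
    static Set<String> tokens(String label) {
        Set<String> result = new HashSet<>();
        for (String token : label.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) {
                continue;
            }
            boolean plural = token.length() > 3 && token.endsWith("s") && !token.endsWith("ss");
            result.add(plural ? token.substring(0, token.length() - 1) : token);
        }
        return result;
    }

    /** Jaccard overlap: size of intersection divided by size of union; 0 if both sets are empty. */
    static <T> double jaccard(Set<T> a, Set<T> b) {
        Set<T> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 0.0;
        }
        Set<T> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        return (double) intersection.size() / union.size();
    }

    /** Lexical matching, step 2: compare the normalized token sets. */
    static double lexicalSimilarity(String left, String right) {
        return jaccard(tokens(left), tokens(right));
    }

    /** Graph matching: similarity of two nodes as the overlap of their direct neighbours. */
    static double graphSimilarity(Map<String, Set<String>> graph, String left, String right) {
        Set<String> none = Collections.emptySet();
        return jaccard(graph.getOrDefault(left, none), graph.getOrDefault(right, none));
    }

    public static void main(String[] args) {
        // Hypothetical competency graph: each object points to the competences it involves.
        Map<String, Set<String>> graph = new HashMap<>();
        graph.put("person:alice", new HashSet<>(Arrays.asList("java", "sql")));
        graph.put("function:developer", new HashSet<>(Arrays.asList("java", "sql", "testing")));

        System.out.println(regexMatch("program", "Java Programming"));
        System.out.println(lexicalSimilarity("Writing reports", "report writing"));
        System.out.println(graphSimilarity(graph, "person:alice", "function:developer"));
    }
}
```
]]></preformat>
      <p>In this sketch a person and a function are compared through the competences they share, which mirrors the competency-based matching use case. A terminological database could be plugged in at the tokens step to map synonyms or translations onto a common term.</p>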
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Agarwal</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lamparter</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>SMART - A Semantic Matchmaking Portal for Electronic Markets</article-title>
          .
          <source>In: Proceedings of the Seventh IEEE International Conference on E-Commerce Technology</source>
          , pp.
          <fpage>405</fpage>
          -
          <lpage>408</lpage>
          , (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Biesalski</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Breiter</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abecker</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Towards Integrated, Intelligent Human Resource Management</article-title>
          .
          <source>In: 1st workshop "FOMI 2005", Formal Ontologies Meet Industry</source>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>De Baer</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kerremans</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Temmerman</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Constructing Ontology-underpinned Terminological Resources. A Categorisation Framework API</article-title>
          .
          <source>Proceedings of the 8th International Conference on Terminology and Knowledge Engineering</source>
          , Copenhagen (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Jarrar</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meersman</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Formal Ontology Engineering in the DOGMA Approach</article-title>
          .
          <source>In: On the Move to Meaningful Internet Systems: CoopIS, DOA, and ODBASE</source>
          , LNCS, Springer Verlag, pp.
          <fpage>1238</fpage>
          -
          <lpage>1254</lpage>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Shu</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rana</surname>
            ,
            <given-names>O. F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Avis</surname>
            ,
            <given-names>N. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dingfang</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Ontology-based semantic matchmaking approach</article-title>
          .
          <source>In: Advances in Engineering Software 38</source>
          , pp.
          <fpage>59</fpage>
          -
          <lpage>67</lpage>
          ,
          ScienceDirect
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Stamou</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ntoulas</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Christodoulakis</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>TODE- an ontology based model for the dynamic population of web directories</article-title>
          . In:
          <source>“Data Mining with Ontologies: Implementations, Findings and Frameworks”</source>
          , edited by
          <string-name>
            <surname>Nigro</surname>
            ,
            <given-names>H. O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cisaro</surname>
            ,
            <given-names>S. E. G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xodo</surname>
            ,
            <given-names>D. H.</given-names>
          </string-name>
          ,
          <publisher-name>IGI Global</publisher-name>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>