<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The uComp Protege Plugin for Crowdsourcing Ontology Validation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Florian Hanika</string-name>
          <email>florian.hanika@wu.ac.at</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gerhard Wohlgenannt</string-name>
          <email>gerhard.wohlgenannt@wu.ac.at</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marta Sabou</string-name>
          <email>marta.sabou@modul.ac.at</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>MODUL University Vienna</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>The validation of ontologies using domain experts is expensive. Crowdsourcing has been shown to be a viable alternative for many knowledge acquisition tasks. We present a Protege plugin and a workflow for outsourcing a number of ontology validation tasks to Games with a Purpose and paid micro-task crowdsourcing. Protege<sup>3</sup> is a well-known free and open-source platform for ontology engineering. Protege can be extended with plugins using the Protege Development Kit. We present a plugin for crowdsourcing ontology engineering tasks, as well as the underlying technologies and workflows. More specifically, the plugin supports outsourcing of some typical ontology validation tasks (see Section 2.2) to Games with a Purpose (GWAP) and paid-for crowdsourcing. The research question our work focuses on is how to integrate ontology engineering processes with human computation (HC), to study which tasks can be outsourced, how this affects the quality of the ontological elements, and to provide tool support for HC. This paper concentrates on the integration process and tool support. As manual ontology construction by domain experts is expensive and cumbersome, HC helps to decrease cost and increase scalability by distributing jobs to multiple workers.</p>
      </abstract>
      <kwd-group>
        <kwd>Protege plugin</kwd>
        <kwd>ontology engineering</kwd>
        <kwd>crowdsourcing</kwd>
        <kwd>human computation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>The uComp Protege Plugin</title>
      <p>The uComp Protege Plugin allows the validation of certain parts of an ontology,
which makes it useful in any setting where the quality of an ontology is
questionable, for example if an ontology was generated automatically with ontology
learning methods, or if a third-party ontology needs to be evaluated before use.
This section covers the uComp API, and the uComp Protege plugin
(functionality and installation).</p>
      <sec id="sec-2-1">
        <title>3 protege.stanford.edu</title>
        <sec id="sec-2-1-1">
          <title>The uComp API</title>
          <p>The Protege plugin sends all validation tasks to the uComp HC API. Depending
on the settings, the API further delegates the tasks to a GWAP or to
CrowdFlower<sup>4</sup>. CrowdFlower is a platform for paid micro-task crowdsourcing. The
uComp API<sup>5</sup> currently supports classification tasks (other task types are under
development). The API user can create new HC jobs, cancel jobs, and collect
results from the service. All communication is done via HTTP and JSON.</p>
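          <p>To make the HTTP and JSON exchange more concrete, the following minimal sketch assembles a JSON body for a classification job. It is an illustration under stated assumptions only: the function name build_job_request and every field name in the payload are hypothetical and are not taken from the uComp API documentation, and no request is actually sent.</p>
          <preformat><![CDATA[
```python
import json


def build_job_request(api_key, question, items, judgments_per_unit):
    """Assemble a JSON body for a hypothetical 'create job' call.

    All field names are illustrative assumptions, not the documented
    uComp API schema.
    """
    payload = {
        "api_key": api_key,
        "task_type": "classification",
        "question": question,
        # One unit per ontology element that the crowd should judge.
        "units": [{"id": i, "label": label} for i, label in enumerate(items)],
        "judgments_per_unit": judgments_per_unit,
    }
    return json.dumps(payload)


# Placeholder key and example classes taken from the running example.
body = build_job_request(
    "abcdefghijklmnopqrst",
    "Is this term relevant for the domain Finance?",
    ["bond", "dollar"],
    5,
)
print(body)
```
]]></preformat>
          <p>A client would send such a body to the job-creation endpoint and later poll for results; cancelling a job and collecting results would presumably use similarly small JSON messages.</p>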
        </sec>
        <sec id="sec-2-1-2">
          <title>The plugin</title>
          <p>The plugin supports the validation of various parts of an ontology: relevance of
classes, subClassOf relations, domain and range axioms, instanceOf relations,
etc. The general usage pattern is as follows: the user selects the respective part
of the ontology, provides some information for the crowdworkers, and submits
the job. As soon as available, the results are presented to the user.</p>
          <p>Class relevance check. For the sake of brevity, we only describe the Class
Relevance Check and SubClass Relation Validation in some detail. The other
task types follow a very similar pattern. The Class Relevance Check helps to decide whether a
given class (or a set of classes) is relevant for the given domain, based on the class label.
Figure 1 shows an example class relevance check for the class bond. After
selecting a class, the user can enter an ontology domain (here: Finance) to validate
against, and give additional advice to the crowdworkers. Furthermore, (s)he can
choose between the GWAP and CrowdFlower for validation. If CrowdFlower is
selected, the expected cost of the job can be calculated. The validate subtree
option allows validating not only the current class, but also all its subclasses
(recursively). To validate the whole ontology in one go, the user selects the root
class (Thing) and marks the validate subtree option. When available, the results
of the HC task are presented in a textbox. In Figure 1 only one judgment was
collected: the crowdworker stated that the class bond is relevant for the domain.</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>4 www.crowdflower.com</title>
        <p>5 tinyurl.com/mkarmk9</p>
        <p>Validation of SubClass Relations. With this component, a user can ask the
crowd whether a subClass relation exists between a given class and its
superclasses.</p>
        <p>Similar to the class relevance check, users can set the ontology domain, and
choose CrowdFlower or GWAP ("uComp-Quiz"). In Figure 2 the subClass
relation between dollar and currency is evaluated. Before sending to CrowdFlower,
expected costs can be calculated as number of units (elements to evaluate)
multiplied by number of judgments per unit and payment per judgment.</p>
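        <p>The cost estimate described above is a plain product of three numbers. A minimal sketch follows, assuming that the payment is given as a number in whatever currency unit the crowdsourcing platform uses; the function name expected_cost and the example figures are hypothetical.</p>
        <preformat><![CDATA[
```python
def expected_cost(units, judgments_per_unit, payment_per_judgment):
    """Return units * judgments per unit * payment per judgment."""
    if units < 0 or judgments_per_unit < 0 or payment_per_judgment < 0:
        raise ValueError("all arguments must be non-negative")
    return units * judgments_per_unit * payment_per_judgment


# Hypothetical job: 12 subClass relations, 5 judgments each,
# and a payment of 2 per judgment.
cost = expected_cost(12, 5, 2)
print(cost)
```
]]></preformat>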
        <sec id="sec-2-2-1">
          <title>Installation and Configuration</title>
          <p>As the uComp plugin is part of the official Protege repository, it can easily
be installed from within Protege with File → Check for plugins →
Downloads. To configure and use the plugin, the user needs to create a file named
ucomp_api_settings.txt in folder .Protege. The file contains the uComp API
key<sup>6</sup>, the number of judgments per unit which will be collected, and the payment
per judgment (if using CrowdFlower), for example: abcdefghijklmnopqrst,5,2</p>
          <p>6 For API requests see tinyurl.com/mkarmk9</p>
          <p>Detailed information about the functionality, usage and installation of the plugin
is provided with the plugin documentation.</p>
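          <p>The settings line described above has three comma-separated fields: API key, judgments per unit, and payment per judgment. The sketch below shows one way such a line could be parsed. It is not the plugin's own parsing code; the names UCompSettings and parse_settings are hypothetical, and treating the payment as an integer is an assumption based only on the example value.</p>
          <preformat><![CDATA[
```python
from typing import NamedTuple


class UCompSettings(NamedTuple):
    api_key: str
    judgments_per_unit: int
    payment_per_judgment: int  # assumed to be an integer, as in the example


def parse_settings(line):
    """Parse one 'key,judgments,payment' line into a UCompSettings tuple."""
    parts = [part.strip() for part in line.strip().split(",")]
    if len(parts) != 3:
        raise ValueError("expected three comma-separated fields")
    api_key, judgments, payment = parts
    return UCompSettings(api_key, int(judgments), int(payment))


# The example line from the text, with its placeholder API key.
settings = parse_settings("abcdefghijklmnopqrst,5,2")
print(settings)
```
]]></preformat>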
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Related Work</title>
      <p>
        Human computation outsources computing steps to humans, typically for
problems computers cannot solve (yet). Together with altruism, fun (as in GWAPs)
and monetary incentives are central ways to motivate humans to participate.
Early work in the field of GWAPs was done by von Ahn [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Games have
successfully been used for example in ontology alignment [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] or to verify class definitions [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Micro-task crowdsourcing has recently become very popular in knowledge
acquisition and natural language processing, and has also been integrated into the
popular NLP framework GATE [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. A number of studies show that crowdworkers
provide results of similar quality to those of domain experts [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ].
      </p>
    </sec>
    <sec id="sec-4">
      <title>Conclusions</title>
      <p>In this paper we introduce a Protege plugin for validating ontological elements,
and its integration into a human computation workflow. The plugin delegates
validation tasks to a GWAP or to CrowdFlower and displays the results to the
user. Future work includes an extensive evaluation of various aspects: HC
workflows in ontology engineering, quality of crowdsourcing results, and the usability
of the plugin itself.</p>
      <p>Acknowledgments. The work presented was developed within project uComp,
which receives the funding support of EPSRC EP/K017896/1, FWF 1097-N23,
and ANR-12-CHRI-0003-03, in the framework of the CHIST-ERA ERA-NET.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>von Ahn</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Games With a Purpose</article-title>
          .
          <source>Computer</source>
          <volume>39</volume>
          (
          <issue>6</issue>
          ),
          <fpage>92</fpage>
          –
          <lpage>94</lpage>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bontcheva</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roberts</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Derczynski</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rout</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>The GATE Crowdsourcing Plugin: Crowdsourcing Annotated Corpora Made Easy</article-title>
          . In:
          <source>Proc. of the 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL)</source>
          .
          <publisher-name>ACL</publisher-name>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Markotschi</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Voelker</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Guess What?! Human Intelligence for Mining Linked Data</article-title>
          .
          In:
          <source>Proceedings of the Workshop on Knowledge Injection into and Extraction from Linked Data (KIELD) at the International Conference on Knowledge Engineering and Knowledge Management (EKAW)</source>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Noy</surname>
            ,
            <given-names>N.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mortensen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Musen</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alexander</surname>
            ,
            <given-names>P.R.</given-names>
          </string-name>
          :
          <article-title>Mechanical Turk As an Ontology Engineer?: Using Microtasks As a Component of an Ontology-engineering Workflow</article-title>
          .
          In:
          <source>Proc. 5th ACM WebSci Conf</source>
          . pp.
          <fpage>262</fpage>
          –
          <lpage>271</lpage>
          . WebSci '13,
          <publisher-name>ACM</publisher-name>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Sabou</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bontcheva</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scharl</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , Fols, M.:
          <article-title>Games with a Purpose or Mechanised Labour?: A Comparative Study</article-title>
          .
          <source>In: Proc. of the 13th Int. Conf. on Knowledge Management and Knowledge Technologies</source>
          . pp.
          <volume>1</volume>
          {
          <issue>8</issue>
          . i-Know '
          <fpage>13</fpage>
          ,
          <string-name>
            <surname>ACM</surname>
          </string-name>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Siorpaes</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hepp</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Games with a Purpose for the Semantic Web</article-title>
          .
          <source>IEEE Intelligent Systems</source>
          <volume>23</volume>
          (
          <issue>3</issue>
          ),
          <fpage>50</fpage>
          –
          <lpage>60</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>