<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Frame Semantics Annotation Made Easy with DBpedia</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marco Fossati</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sara Tonelli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Claudio Giuliano</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Fondazione Bruno Kessler - via Sommarive</institution>
          ,
          <addr-line>18 - 38123 Trento</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Crowdsourcing techniques applied to natural language processing have recently experienced a steady growth and represent a cheap and fast, yet valid, solution to create benchmarks and training data. Nevertheless, some particularly complex tasks, such as semantic role annotation, have rarely been conducted in a crowdsourcing environment, due to their intrinsic difficulty. In this paper, we present a novel approach to accomplish this task by leveraging information automatically extracted from DBpedia. We show that replacing role definitions, typically meant for expert annotators, with a list of DBpedia types makes the identification and assignment of role labels more intuitive also for non-expert workers. Results show that such a strategy improves on the standard annotation workflow, both in terms of accuracy and of time consumption.</p>
      </abstract>
      <kwd-group>
        <kwd>Natural Language Processing</kwd>
        <kwd>Frame Semantics</kwd>
        <kwd>Entity Linking</kwd>
        <kwd>DBpedia</kwd>
        <kwd>Crowdsourcing</kwd>
        <kwd>Task Modeling</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Frame semantics [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] is one of the theories originating from the long strand
of linguistic research in artificial intelligence. A frame can be informally defined
as an event triggered by some term in a text and embedding a set of
participants. For instance, the sentence Goofy has murdered Mickey Mouse evokes
the Killing frame (triggered by murdered) together with the Killer and
Victim participants (respectively Goofy and Mickey Mouse). This theory has led
to the creation of FrameNet [
to the creation of FrameNet [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], namely an English lexical database containing
manually annotated textual examples of frame usage.
      </p>
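      <p>For illustration, such an annotation can be represented as a simple data structure. The following minimal Python sketch is our own illustration of the example above, not part of any FrameNet tooling:</p>
      <preformat><![CDATA[
# A frame annotation for: "Goofy has murdered Mickey Mouse"
annotation = {
    "sentence": "Goofy has murdered Mickey Mouse",
    "lexical_unit": "murdered",    # the trigger (LU)
    "frame": "Killing",            # the frame it evokes
    "frame_elements": {            # participants (FEs) and their fillers
        "Killer": "Goofy",
        "Victim": "Mickey Mouse",
    },
}
]]></preformat>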
      <p>Annotating frame information is a complex task, usually modeled in two
steps: given a sentence, annotators are first asked to choose the frame activated
by a predicate (or lexical unit, LU, e.g. murdered in the example above evoking
Killing). Second, they assign the semantic roles (or frame elements, FEs) that
describe the participants involved in the chosen frame. In this work, we focus on
the second step, namely FE recognition.</p>
      <p>
        Currently, FrameNet development follows a strict protocol for data
annotation and quality control. The entire procedure is known to be both
time-consuming and costly, thus representing a burden for the extension of the
resource [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Furthermore, deep linguistic knowledge is needed to tackle this
annotation task, and the resource developed so far would not have come to light
without the contribution of skilled linguists and lexicographers. On one hand, the
task complexity depends on the inherently complex theory behind frame
semantics, with a repository of thousands of roles available for the assignment. On the
other hand, these roles are defined for expert annotators, and their descriptions
are often obscure to common readers. We report three examples below:
- Support: Support is a fact that lends epistemic support to a claim, or that
provides a reason for a course of action. Typically it is expressed as an
External Argument. (Evidence frame)
- Protagonist: A person or self-directed entity whose actions may potentially
change the mind of the Cognizer. (Influence of Event on Cognizer
frame)
- Locale: A stable bounded area. It is typically the designation of the nouns
of Locale-derived frames. (Locale by Use frame)
      <p>Since we aim at investigating whether such activity can be cast to a crowd of
non-expert contributors, we need to reduce its complexity by intervening on the
FE descriptions. In particular, we want to assess to what extent more information
on the role semantics coming from external knowledge sources such as DBpedia
(http://dbpedia.org) can improve non-expert annotators' performance. We leverage
the CrowdFlower platform (https://crowdflower.com), which serves as a bridge to a
plethora of crowdsourcing services, the most popular being Amazon's Mechanical
Turk (AMT, https://www.mturk.com).</p>
      <p>We claim that providing annotators with information on the semantic types
typically associated with FEs will enable faster and cheaper annotations, while
maintaining an equivalent accuracy. The additional information is extracted in
a completely automatic way, and the workflow we present can potentially be
applied to any crowdsourced annotation task in which semantic typing is relevant.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        The construction of annotation datasets for natural language processing tasks
via non-expert contributors has been approached in different ways, the most
prominent being games with a purpose (GWAP) and micro-tasks. While the
former technique leverages fun as the motivation for attracting participants,
the latter mainly relies on a monetary reward. The effects of such factors on a
contributor's behavior have been analyzed in the motivation theory literature,
but are beyond the scope of this paper. The reader may refer to [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] for an
overview focusing on AMT.
      </p>
      <p>
        Games with a Purpose. Verbosity [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] was one of the first attempts at
gathering annotations with a GWAP. Phrase Detectives [
        <xref ref-type="bibr" rid="ref4 ref5">5,4</xref>
        ] was meant to harvest
a corpus with coreference resolution annotations. The game included a
validation mode, where participants could assess the quality of previous contributions.
A data unit, namely a resolved coreference for a given entity, is judged
complete only if the agreement is unanimous. Disagreement between experts and
the crowd appeared to be a potential indicator of ambiguous input data. Indeed,
it has been shown that in most cases disagreement did not represent a poor
annotation, but rather a valid alternative.
      </p>
      <p>
        Micro-tasks. Design and evaluation guidelines for five natural language
micro-tasks are described in [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Similarly to our approach, the authors compared
crowdsourced annotations with expert ones for quality estimation. Moreover,
they used the collected annotations as training sets for machine learning
classifiers and measured their performance. However, they explicitly chose a set of
tasks that could be easily understood by non-expert contributors. Similarly, [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]
built a multilingual textual entailment dataset for statistical machine translation
by developing an annotation pipeline to decompose the annotators' task into a
sequence of activities. Finally, [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] exploited Google AdWords, a tool for web
advertising, to measure message persuasiveness while keeping subjects unaware
of the experiment and unbiased by external rewards.
Semantic Role Annotation. Manual annotation of semantic roles has recently
been addressed via crowdsourcing in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Furthermore, [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] highlighted
the crucial role of recruiting people from the crowd in order to bypass the need
for expert linguistic annotations. Similarly to our contribution, the task
described in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] was modeled in a multiple-choice fashion. Nevertheless, its
focus is narrowed to the frame discrimination task, namely selecting the correct
frame evoked by a given LU. Such a task is comparable to the word sense
disambiguation task as per [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], although the difficulty appears higher, due to lower
inter-annotator agreement values. The authors experienced issues related
to our work with respect to the quality check mechanism in CrowdFlower,
as well as the complexity of the frame names and definitions. Outsourcing the
task to the CrowdFlower platform has two major drawbacks: (a) the
proprietary nature of the aggregated inter-annotator agreement value provided in the
response data, and (b) the need to manually simplify FE definitions that
generated high disagreement. In this respect, our previous work [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] was the first
attempt to address item (b) by manually simplifying the way FEs are described.
In this work, we further investigate this aspect by exploiting automatically
extracted links to DBpedia.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Annotation Workflow</title>
      <p>Our goal is to determine whether crowdsourced annotation of semantic roles can be
improved by providing non-expert annotators with information from DBpedia
on the roles they are supposed to label. Specifically, instead of displaying the
lexicographic definition of each possible role to be labeled, annotators are shown a
set of semantic types associated with each role coming from FrameNet. Based on
this, annotators should better recognize such roles in an unseen sentence.
Evaluation is performed by comparing this annotation framework with a baseline,
where standard FE definitions substitute DBpedia information.</p>
      <p>
        Before performing the annotation task, we need to build the list of
semantic types that best characterizes each FE in a frame. We extract these statistics
by connecting the FrameNet database 1.5 [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] to DBpedia, after isolating a set
of sentences to be used as test data (cf. Section 4). The workflow to prepare the
input for the crowdsourced task is based on the following steps.
      </p>
      <p>
        Linking to Wikipedia. For each annotated sentence in the FrameNet database,
we first link each textual span labeled as FE to a Wikipedia page W. We employ
The Wiki Machine, a kernel-based linking system (details on the
implementation are reported in [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]), which was trained on the Wikipedia dump of March
2010 (http://download.wikimedia.org/enwiki/20100312). Since FEs can be
expressed by both common nouns and real-world entities, we needed a linking
system that satisfactorily processes both nominal
types. A comparison with the state-of-the-art system Wikipedia Miner [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] on
the ACE05-WIKI dataset [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] showed that The Wiki Machine achieved a suitable
performance on both types (.76 F1 on real-world entities and .63 on common
nouns), while Wikipedia Miner had a poorer performance on the second noun
type (respectively .76 and .40 F1). These results were also confirmed in a more
recent evaluation [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], in which The Wiki Machine achieved the highest F1
compared with an ensemble of academic and commercial systems, such as DBpedia
Spotlight, Zemanta, Open Calais, Alchemy API, and Ontos.
      </p>
      <p>The system applies an `all word' linking strategy, in that it tries to connect
each word (or multiword) in a given sentence to a Wikipedia page. When
a linked textual span (partially) matches a string corresponding to a FE, we
assume that one possible sense of the FE is represented in Wikipedia through W.
The Wiki Machine also assigns a confidence score to each linked term. This
confidence is higher when the words occurring in the same context as the
linked term show high similarity, because the system considers that the linking
is likely to be more accurate.</p>
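      <p>The partial-match assumption can be made concrete with a small sketch. The helper below is hypothetical and works on (start, end) character offsets; it is not the actual pipeline code:</p>
      <preformat><![CDATA[
def link_matches_fe(link_span, fe_span):
    """Return True if a linked textual span (partially) overlaps
    the span labeled as a frame element (FE)."""
    link_start, link_end = link_span
    fe_start, fe_end = fe_span
    # Any overlap counts as a (partial) match.
    return min(link_end, fe_end) > max(link_start, fe_start)
]]></preformat>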
      <p>We illustrate in Figure 1 the Wikipedia pages (and confidence scores) that The
Wiki Machine system associates with the sentence Sardar Patel was assisting
Gandhiji in the Salt Satyagraha with great wisdom, an example sentence
for the Assistance frame originally annotated with four FEs, namely Helper,
Benefited party, Goal and Manner. Since Wikipedia is a repository of concepts,
which are usually expressed by nouns, we are able to link only nominal fillers.
Linking to DBpedia. In order to obtain the semantic types that are typical
for each FE, linking to Wikipedia is not enough. In fact, too many different pages
would be connected to a FE, making it difficult to generalize over the Wikipedia
pages (i.e. concepts). This emerges also from the example above, where the
pages linked to Sardar Patel, Gandhiji and Salt Satyagraha do not provide
information on the typical fillers of Helper, Benefited party and Goal respectively.
One possible option could be to resort to Wikipedia categories, which however
are not homogeneous enough to allow for a consistent extraction of FE semantic
types.</p>
      <p>We tackle this problem by using Wikipedia pages as a bridge to DBpedia.
In fact, Wikipedia page URLs directly map to DBpedia resource URIs. Hence,
for each linked FE, we query DBpedia for rdf:type objects. In this way, we are
able to compute statistics on the most frequent semantic types associated with
a given FE from a given frame. For instance, the FE Victim from the Killing
frame has a top DBpedia type Animal with a frequency of 38. We aim at
investigating whether such top-occurring types represent both valid generalizations
and simplifications of a standard FE definition, and may thus substitute it. At
the end of this pre-processing step, we create a repository where, for each FE, a
set of DBpedia types is listed and ranked by frequency.</p>
      <fig id="fig-1">
        <caption>
          <p>Fig. 1. Wikipedia pages (with confidence scores) that The Wiki Machine associates with the sentence [Sardar Patel] was assisting [Gandhiji] in the [Salt Satyagraha] [with great wisdom]: Vallabhbhai_Patel (154.51) for Helper, Mohandas_Karamchand_Gandhi (139.16) for Benefited_party, Salt_Satyagraha (197.54) for Goal, and Wisdom (186.30) for Manner.</p>
        </caption>
      </fig>
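      <p>As a minimal sketch of this step, one can query the public DBpedia SPARQL endpoint and aggregate type counts per FE. The snippet below assumes the SPARQLWrapper Python library and a pre-computed mapping from FEs to their linked resources; it is an illustration, not the exact pipeline used here:</p>
      <preformat><![CDATA[
from collections import Counter
from SPARQLWrapper import SPARQLWrapper, JSON

def dbpedia_types(resource):
    """Fetch the rdf:type objects of a DBpedia resource, e.g. 'Salt_Satyagraha'."""
    sparql = SPARQLWrapper("http://dbpedia.org/sparql")
    sparql.setQuery("""
        SELECT ?type WHERE {
            <http://dbpedia.org/resource/%s> rdf:type ?type .
        }
    """ % resource)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()
    return [b["type"]["value"] for b in results["results"]["bindings"]]

def type_statistics(fe_links):
    """fe_links: FE name -> list of DBpedia resources linked to its fillers
    across all annotated sentences of one frame. Returns, per FE, the
    semantic types ranked by frequency."""
    return {fe: Counter(t for r in resources for t in dbpedia_types(r)).most_common()
            for fe, resources in fe_links.items()}
]]></preformat>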
      <p>Posting the Annotation Task on CrowdFlower. We finally set up a
crowdsourced experiment where, in each test sentence, annotators have to choose the
most appropriate FE given the most frequent DBpedia types (proper task) or the
standard FE definition (baseline). Details are reported in the following section.</p>
    </sec>
    <sec id="sec-4">
      <title>Experiments</title>
      <p>We first provide an overview of critical aspects underpinning a generic
crowdsourced experiment. Subsequently, we describe the anatomy and the modeling of
the tasks we outsourced to the CrowdFlower platform. Input data, full results,
interface code and screenshots are available at http://db.tt/iogsU7RI.
Golden Data. Quality control of the collected judgments is a key factor for
the success of the experiments. The essential drawback of crowdsourcing services
lies in the risk of cheating. Workers are generally paid a few cents for tasks which
may only need a single click to be completed. Hence, it is highly probable to
collect data coming from random choices that can heavily pollute the results.
The issue is resolved by adding gold units, namely data for which the requester
already knows the answer. If a worker misses too many gold answers within a
given threshold, he or she will be flagged as untrusted and his or her judgments
will be automatically discarded.</p>
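      <p>The gating logic can be sketched as follows. The threshold value is a hypothetical placeholder, since CrowdFlower's actual policy is proprietary:</p>
      <preformat><![CDATA[
def is_trusted(gold_answers, min_accuracy=0.7):
    """gold_answers: list of booleans, one per gold unit answered so far.
    min_accuracy is an assumed threshold, not CrowdFlower's real one."""
    if not gold_answers:
        return True   # no evidence yet; assume trusted
    return sum(gold_answers) / len(gold_answers) >= min_accuracy
]]></preformat>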
      <p>Worker Switching Effect. Depending on their accuracy in answering
gold units, workers may switch from a trusted to an untrusted status and vice
versa. In practice, a worker submits his or her responses via a web page. Each
page contains one gold unit and a variable number of regular units that can be
set by the requester during the calibration phase. If a worker becomes untrusted,
the platform collects another judgment to fill the gap. If a worker moves back
to the trusted status, his or her previous contribution is added to the results as
free extra judgments. Such a phenomenon typically occurs when the complexity of
gold units is high enough to induce low agreement in workers' answers. Thus, the
requester is forced to review gold units and possibly forgive workers
who missed them. This has not been a blocking issue in our experiments, since
we assessed a relatively low average percentage of missed judgments for gold
units, namely 28%.</p>
      <p>Cost Calibration. The total cost of a crowdsourced task is naturally bound
to a data unit. This represents an issue in our experiments, as the number of
questions per unit (i.e. a sentence) varies according to the number of frames and
FEs evoked by the LU contained in a sentence. Therefore, we need to use the
average number of questions per sentence as a multiplier to a constant cost per
sentence. We set the payment per working page to 3 dollar cents and the number of
sentences per page to 3. Since most of the sentences in our annotation task have
3 FEs, the average cost per FE results in 0.325 dollar cents (see Table 2 below;
a worked computation follows this paragraph).
Pre-processing of FrameNet Data for DBpedia Types Extraction.
Table 1 provides some statistics on the processed FrameNet data that were leveraged
to extract DBpedia types (cf. Section 3). More specifically:
1. From the FrameNet 1.5 database, The Wiki Machine managed to link 77%
of the total number of FE instances. Hence, unlinked data is skipped for the
next step.
2. DBpedia provided type information for 42% of the total number of linked
FE instances. Types occurring once are ignored, as they reflect the content
of a single sentence and are likely to convey misleading suggestions. The too
generic owl#Thing type is filtered out as well.
Test Data Preparation. Before linking the FrameNet database to DBpedia,
we isolate a subset to be used as test data. From 500 randomly chosen sentences,
we select those in which the number of FEs per frame is between 3 and 4.</p>
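      <p>The following worked computation makes the cost figures explicit. The exact average number of questions per sentence is not stated above, so the value used here is an assumption inferred from the reported 0.325 figure (cf. Table 2):</p>
      <preformat><![CDATA[
payment_per_page_cents = 3.0     # stated above
sentences_per_page = 3           # stated above
cost_per_sentence = payment_per_page_cents / sentences_per_page   # 1.0 cent

# Most sentences have 3 FEs (questions); the average is slightly higher.
avg_questions_per_sentence = 3.08   # assumed, consistent with 0.325
cost_per_fe = cost_per_sentence / avg_questions_per_sentence
print(round(cost_per_fe, 3))        # approx. 0.325 cents per FE
]]></preformat>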
      <p>This small dataset serves as input for our experiments. Table 2 details the
final settings. We hand-pick six sentences and for each of them we mark one
question as gold for quality check. Almost all sentences contain three FEs, with
few exceptions (cf. the average value in Table 2). We extract the five most
frequent DBpedia types from the statistics and assign them to the corresponding
FEs in our input. Since not all FEs have exactly five associated types (cf. the
average value in Table 2), we provide workers with variable suggestion sets.
Finally, we ensure all workers are native English speakers.
Modeling. Data units are delivered to workers via a web interface. Our task is
illustrated in Figure 2 and is presented as follows:
(a) Workers are invited to read a sentence and to focus on the bolded word
appearing as a title above the sentence (e.g. taste in the screenshot).
(b) A question concerning each FE is then shown together with a set of answers
corresponding to the sentence chunks that may express the given FE. For
instance, in Figure 2, the question Which is the Perceiver Passive? is
coupled with multiple choices taken from the given sentence.
(c) For each question, a suggestion box displays the top types retrieved from
DBpedia and connected to the given FE (cf. Section 3 for details). This
should help annotators in choosing the text chunk that better fits the given
FE.
(d) Finally, workers match each question with the proper text chunk.</p>
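      <p>The shape of one data unit can thus be sketched as follows. The field names and values are hypothetical illustrations of steps (a)-(d), not CrowdFlower's actual unit format:</p>
      <preformat><![CDATA[
unit = {
    "sentence": "Goofy has murdered Mickey Mouse",   # (a) the sentence to read
    "lexical_unit": "murdered",                      # (a) the bolded trigger word
    "questions": [
        {
            "fe": "Victim",
            "question": "Which is the Victim?",      # (b) one question per FE
            "choices": ["Goofy", "Mickey Mouse"],    # (b) candidate sentence chunks
            "suggested_types": ["Animal", "Person"], # (c) top DBpedia types (assumed)
        },
    ],
}
]]></preformat>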
      <p>On the other hand, the baseline differs from our strategy in that (i) it does
not display the suggestion box and (ii) questions are replaced with the FE
definition extracted from FrameNet. For instance, in Figure 2, the question about
the Perceiver Passive would be replaced with This FE is the being who has a
perceptual experience, not necessarily on purpose. The baseline is closer
to the standard approach adopted to annotate FEs in the FrameNet
project.</p>
    </sec>
    <sec id="sec-5">
      <title>Results</title>
      <p>Our main purpose is to evaluate the validity of the proposed approach against
the conventional FrameNet annotation procedure. We leverage expert-annotated
sentences and are thus able to directly measure workers' accuracy. Specifically,
we compute two values:
- Majority vote. An answer is considered correct only if the majority of
judgments are correct.
- Absolute. The total number of correct judgments divided by the total number
of collected judgments.</p>
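      <p>A minimal sketch of the two measures, assuming per-question worker judgments and expert gold answers (not the authors' actual evaluation code):</p>
      <preformat><![CDATA[
from collections import Counter

def accuracies(judgments, gold):
    """judgments: question id -> list of worker answers;
    gold: question id -> expert answer.
    Returns (majority-vote accuracy, absolute accuracy)."""
    maj_correct, abs_correct, total = 0, 0, 0
    for q, answers in judgments.items():
        majority, _ = Counter(answers).most_common(1)[0]
        maj_correct += majority == gold[q]           # majority vote per question
        abs_correct += sum(a == gold[q] for a in answers)
        total += len(answers)
    return maj_correct / len(judgments), abs_correct / total
]]></preformat>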
      <p>The results of our experiments are detailed in Table 3. The number of untrusted
judgments may be considered a shallow indicator of the overall task
complexity. In fact, we tried to maximize objectivity and simplicity when choosing gold
units. Moreover, the input dataset (and the gold units as well) is identical in both
experiments. Therefore, we can infer that the number of workers who missed
gold is directly influenced by the question model, which is the only variable
parameter. We compute the execution time as the interval between the first and
the last judged unit.</p>
      <p>Our approach outperformed the baseline both in terms of accuracy and time.
While majority vote accuracy values differ slightly, absolute accuracy clearly
favors our strategy. Such a measure can be seen as a further indicator of the task
complexity. A higher score implies a higher number of correct judgments, which
may designate better inter-worker agreement, thus a more straightforward task.
This claim is supported not only by the moderate decrease of untrusted
judgments, but also by the dramatic reduction of the execution time. Consequently,
the results we obtained demonstrate that entity linking techniques combined
with DBpedia types simplify FE annotation.</p>
    </sec>
    <sec id="sec-6">
      <title>Discussion and Conclusions</title>
      <p>In this work, we present a novel approach to annotate frame elements in a
crowdsourcing environment using information extracted from DBpedia. The task is
simplified for non-expert annotators by replacing FE definitions, usually meant
for linguistic experts, with semantic types obtained from DBpedia. This is
accomplished without manual simplification, in a completely automatic fashion.</p>
      <p>Results show that such a method improves on the standard annotation
workflow, both in terms of accuracy and of time consumption. Although the
interconnection between FEs and DBpedia is semantically not perfect, extracting
frequency statistics from the whole FrameNet database and considering only the
most frequent types from DBpedia make the procedure quite robust to wrong
links.</p>
      <p>Possible issues may arise when two or more frame elements in the same
frame share the same semantic type. For instance, the Goal and Place FEs in
the Arriving frame are both likely to be filled by elements describing a location.
We also expect that our approach is less accurate with FEs that can be filled
both by nouns and by verbs, for instance the Activity FE in the Activity finish
frame. In such cases, information extracted from DBpedia would probably be
inconsistent. Besides, DBpedia statistics are reliable when several annotated
sentences are available for a frame, while they may be misleading if extracted
from few instances. We plan to investigate these issues and to explore possible
solutions to cope with data sparseness.</p>
      <p>Additional future work will involve the following aspects:
- Evaluation of an ad-hoc strategy for the extraction of semantic types, namely
providing workers with suggestions by matching information that is
dynamically derived from each given sentence with DBpedia types.
- Clustering of similar semantic types with respect to the meaning they convey
and to their frequency, e.g. Place and Location Underspecified.
Finally, the overall effectiveness of our approach depends both on the
performance of the entity linking system and on the coverage of the knowledge base.
Hence, long-term research will focus on enhancing The Wiki Machine's precision
and recall, and on extending DBpedia type coverage.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Baker</surname>
            ,
            <given-names>C.F.</given-names>
          </string-name>
          :
          <article-title>FrameNet, current collaborations and future goals</article-title>
          .
          <source>Language Resources</source>
          and Evaluation pp.
          <volume>1</volume>
          {
          <issue>18</issue>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Baker</surname>
            ,
            <given-names>C.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fillmore</surname>
            ,
            <given-names>C.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lowe</surname>
            ,
            <given-names>J.B.</given-names>
          </string-name>
          :
          <article-title>The Berkeley FrameNet Project</article-title>
          .
          <source>In: Proceedings of the 17th international conference on Computational linguistics-Volume</source>
          <volume>1</volume>
          . pp.
          <volume>86</volume>
          {
          <fpage>90</fpage>
          .
          <article-title>Association for Computational Linguistics (</article-title>
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bentivogli</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Forner</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giuliano</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marchetti</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pianta</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tymoshenko</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Extending English ACE 2005 Corpus Annotation with Ground-truth Links to Wikipedia</article-title>
          .
          <source>In: 23rd International Conference on Computational Linguistics</source>
          . pp.
          <volume>19</volume>
          {
          <issue>26</issue>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Chamberlain</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kruschwitz</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poesio</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Constructing an anaphorically annotated corpus with non-experts: Assessing the quality of collaborative annotations</article-title>
          .
          <source>In: Proceedings of the 2009 Workshop on The People's Web Meets NLP: Collaboratively Constructed Semantic Resources</source>
          . pp.
          <volume>57</volume>
          {
          <fpage>62</fpage>
          .
          <article-title>Association for Computational Linguistics (</article-title>
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Chamberlain</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poesio</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kruschwitz</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>Phrase detectives: A web-based collaborative annotation game</article-title>
          .
          <source>Proceedings of I-Semantics, Graz</source>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Fillmore</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Frame semantics</article-title>
          . Linguistics in the morning calm pp.
          <volume>111</volume>
          {
          <issue>137</issue>
          (
          <year>1982</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Fossati</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giuliano</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tonelli</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Outsourcing FrameNet to the Crowd</article-title>
          .
          <source>In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics</source>
          . pp.
          <volume>742</volume>
          {
          <fpage>747</fpage>
          .
          <string-name>
            <surname>So</surname>
            <given-names>a</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bulgaria</surname>
          </string-name>
          (
          <year>August 2013</year>
          ), http://www.aclweb.org/ anthology/P13-2130
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Guerini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Strapparava</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stock</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Ecological Evaluation of Persuasive Messages Using Google AdWords</article-title>
          . In:
          <article-title>Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics</article-title>
          .
          <source>ACL2012 (July</source>
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Hong</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baker</surname>
            ,
            <given-names>C.F.</given-names>
          </string-name>
          :
          <article-title>How Good is the Crowd at \real"</article-title>
          <source>WSD? In: Proceedings of the 5th Linguistic Annotation Workshop</source>
          . pp.
          <volume>30</volume>
          {
          <issue>37</issue>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Kaufmann</surname>
          </string-name>
          , N.,
          <string-name>
            <surname>Schulze</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Veit</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>More than fun and money. Worker motivation in crowdsourcing { A study on Mechanical Turk</article-title>
          .
          <source>In: Proceedings of the Seventeenth Americas Conference on Information Systems</source>
          , Detroit, MI (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Mendes</surname>
            ,
            <given-names>P.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jakob</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garc</surname>
            a-Silva,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Dbpedia spotlight: shedding light on the web of documents</article-title>
          .
          <source>In: Proceedings of the 7th International Conference on Semantic Systems</source>
          . pp.
          <volume>1</volume>
          {
          <issue>8</issue>
          . I-Semantics '
          <fpage>11</fpage>
          ,
          <string-name>
            <surname>ACM</surname>
          </string-name>
          , New York, NY, USA (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Milne</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Witten</surname>
            ,
            <given-names>I.H.</given-names>
          </string-name>
          :
          <article-title>Learning to link with Wikipedia</article-title>
          .
          <source>In: CIKM '08: Proceedings of the 17th ACM conference on Information and knowledge management</source>
          . pp.
          <volume>509</volume>
          {
          <fpage>518</fpage>
          . ACM, NY, USA (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Negri</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bentivogli</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mehdad</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giampiccolo</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marchetti</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Divide and conquer: crowdsourcing the creation of cross-lingual textual entailment corpora</article-title>
          .
          <source>In: Proceedings of the Conference on Empirical Methods in Natural Language Processing</source>
          . pp.
          <volume>670</volume>
          {
          <fpage>679</fpage>
          . EMNLP '
          <volume>11</volume>
          ,
          <string-name>
            <surname>Association</surname>
          </string-name>
          for Computational Linguistics, Stroudsburg, PA, USA (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Ruppenhofer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ellsworth</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Petruck</surname>
            ,
            <given-names>M.R.</given-names>
          </string-name>
          , Johnson,
          <string-name>
            <given-names>C.R.</given-names>
            ,
            <surname>Scheffczyk</surname>
          </string-name>
          , J.:
          <source>FrameNet II: Extended Theory and Practice</source>
          . Available at http://framenet.icsi.berkeley.edu/book/book.html (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Snow</surname>
            , R.,
            <given-names>O</given-names>
          </string-name>
          <string-name>
            <surname>'Connor</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jurafsky</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ng</surname>
            ,
            <given-names>A.Y.</given-names>
          </string-name>
          :
          <article-title>Cheap and fast|but is it good?: evaluating non-expert annotations for natural language tasks</article-title>
          .
          <source>In: Proceedings of the Conference on Empirical Methods in Natural Language Processing</source>
          . pp.
          <volume>254</volume>
          {
          <fpage>263</fpage>
          .
          <article-title>Association for Computational Linguistics (</article-title>
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Tonelli</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giuliano</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tymoshenko</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Wikipedia-based WSD for multilingual frame annotation</article-title>
          .
          <source>Arti cial Intelligence</source>
          <volume>194</volume>
          ,
          <fpage>203</fpage>
          {
          <fpage>221</fpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Von Ahn</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kedia</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blum</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Verbosity: a game for collecting common-sense facts</article-title>
          .
          <source>In: Proceedings of the SIGCHI conference on Human Factors in computing systems</source>
          . pp.
          <volume>75</volume>
          {
          <fpage>78</fpage>
          .
          <string-name>
            <surname>ACM</surname>
          </string-name>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>