<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Crowdsourcing Language Resources for Dutch using PYBOSSA: Case Studies on Blends, Neologisms and Language Variation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Peter Dekker</string-name>
          <email>peter.dekker@ivdnt.org</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tanneke Schoonheim</string-name>
          <email>tanneke.schoonheim@ivdnt.org</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Instituut voor de Nederlandse Taal, Dutch Language Institute</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <fpage>24</fpage>
      <lpage>25</lpage>
      <abstract>
        <p>In this paper, we evaluate PYBOSSA, an open-source crowdsourcing framework, by performing case studies on blends, neologisms and language variation. We describe the procedural aspects of crowdsourcing, such as working with a crowdsourcing platform and reaching the desired audience. Furthermore, we analyze the results, and show that crowdsourcing can shed new light on how language is used by speakers.</p>
      </abstract>
      <kwd-group>
        <kwd>crowdsourcing</kwd>
        <kwd>lexicography</kwd>
        <kwd>neologisms</kwd>
        <kwd>language variation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Crowdsourcing (or: citizen science) has been shown to be a
quick and cost-efficient way to have a large number of lay
people perform tasks that would normally be performed by a
small number of experts
        <xref ref-type="bibr" rid="ref4 ref8">(Holley, 2010; Causer
et al., 2018)</xref>
        . In this paper, we use the PYBOSSA (PB)
framework<sup>1</sup> for crowdsourcing language resources for the
Dutch language. We will describe our experiences with
this framework, to accomplish the goals of language
documentation and generation of language learning material.
In addition to sharing our experiences, we will report on
linguistic findings based on the experiments we performed
on blends, neologisms and language variation.
      </p>
      <p>
        For the Dutch language, crowdsourcing has been valuable
in the past. We distinguish two types of approaches. On
the one hand, there are fixed tasks, where more or less one
answer is correct. As fixed tasks, crowdsourcing has been
applied to the transcription of letters from 17th and
18th-century Dutch sailors
        <xref ref-type="bibr" rid="ref16">(Van der Wal et al., 2012)</xref>
        and
historical Dutch Bible translations
        <xref ref-type="bibr" rid="ref15 ref2">(Beelen and Van der Sijs,
2014)</xref>
        .
      </p>
      <p>
        On the other hand, there are open tasks, referred to in
empirical sciences as elicitation tasks, where different answers
by different users are welcomed, in order to capture
variation. Examples of open tasks are Palabras
        <xref ref-type="bibr" rid="ref12 ref3">(Burgos et al.,
2015; Sanders et al., 2016)</xref>
        , where lay native Dutch
speakers were asked to transcribe vowels produced by L2
learners, and Emigrant Dutch<sup>2</sup>, which tries to capture the
language use of emigrant Dutch speakers. Of course, mixed
forms between open and fixed tasks are possible.
      </p>
      <p>
        The Dutch Language Institute strives to document the
language as it is used, by compiling language resources (e.g.
dictionaries) based on corpora from different sources, such
as newspapers and websites. Fixed-task crowdsourcing,
such as transcription and correction, can help in this
process. However, we see even greater possibilities for
open-task crowdsourcing, asking speakers how they use and
perceive the language, which we will explore in this paper.
      </p>
      <p><sup>1</sup> Homepage: http://pybossa.com. DOI: https://doi.org/10.5281/zenodo.1485460</p>
      <p><sup>2</sup> http://www.meertens.knaw.nl/vertrokken-nederlands/</p>
      <p>
        Open-task crowdsourcing has been applied to lexicography
for other languages, such as Slovene, where crowdsourcing
was integrated in the thesaurus and collocation dictionary
applications
        <xref ref-type="bibr" rid="ref7 ref9">(Holdt et al., 2018; Kosem et al., 2018)</xref>
        . On
top of this goal of language documentation, we would like
to use crowdsourcing to make language material available
for language learners.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Method</title>
      <p>As the basis for our experiments, we hosted an instance
of PYBOSSA at our institute. We named our
crowdsourcing platform Taalradar (‘language radar’): this signifies
both the ‘radar’ (overview) we would like to gain over
the entire language through crowdsourcing, and the
personal ‘language radar’ or linguistic intuition of
contributors, which we would like to draw on. We ran two
crowdsourcing rounds: in September 2018 and in
November-December 2018.</p>
    </sec>
    <sec id="sec-3">
      <title>Tasks</title>
      <p>
        We designed four tasks, which are well-suited to reach our
goals: documentation of the Dutch language and developing
material for language learning. Since we would like to
get a picture of the speakers of the language, we ask for
user details (gender, age and city of residence) in all tasks.
The tasks were created as JavaScript/HTML files inside PB.
      </p>
      <p>
        Tasks 1 and 2: Blends analysis and recognition. Blends
are compound words, formed “by fusing parts of at least
two other source words of which either one is shortened in
the fusion and/or where there is some form of phonemic or
graphemic overlap of the source words”
        <xref ref-type="bibr" rid="ref6">(Gries, 2004)</xref>
        . An
example of a blend in both English and Dutch is brunch,
which consists of breakfast and lunch. For our
experiments, we used blends collected for the Algemeen
Nederlands Woordenboek (Dictionary of Contemporary Dutch;
ANW)
        <xref ref-type="bibr" rid="ref11 ref11 ref13 ref13 ref14 ref14 ref15">(Tiberius and Niestadt, 2010; Schoonheim and
Tempelaars, 2010; Tiberius and Schoonheim, 2016)</xref>
        .
We developed two tasks: analysis and recognition of
blends. In the analysis task, contributors are presented with
a blend and asked which source words it consists of.
No context of the blend is provided. Ten blends are
presented in total. Figure 1 shows the task as it is presented
to the contributor.
      </p>
      <p>
        In the recognition task, contributors are presented with a
citation from the ANW dictionary. Ten citations are presented.
Contributors should recognize the blend in the citation.
Every citation contains one blend, but we ask for “one or
multiple blends” and present users with three input fields to enter
blends. We deliberately designed the task in this somewhat
deceptive way, to see which other words are candidates for
being perceived as blends.
      </p>
      <p>
        Task 3: Neologisms. In this task, contributors were asked
to judge neologisms (new words) in a citation, on two
criteria: endurance of the concept (“This word will be used
for a long time.”) and diversity of users and situations (“This
word will be used by different people [e.g. young, old] in
different situations [e.g. conversation, newspaper].”). We
selected these two criteria from the FUDGE test, a test
to rate the sustainability of a neologism, which normally
consists of five criteria
        <xref ref-type="bibr" rid="ref10">(Metcalf, 2004)</xref>
        . The neologisms and
their citations were taken from newspaper material, which
is used in the lexicographic workflow (see Section 4, Future
applications). From this corpus, sentences which contain a
hitherto unknown word are extracted: these are possible
neologisms, but can also be words that have been formed ad hoc.
Lexicographers accept or reject a word as a neologism. We presented
15 words in a citation to users: 5 which have been attested
by lexicographers as neologisms, 5 which have been
rejected as neologisms, and 5 unattested words.
      </p>
      <p>Task 4: Language variation. In this task, contributors are
asked what they call a certain concept or how they would
express a certain sentence. The goal is to chart dialectal
variation, but also other kinds of language variation. We used
a list of questions from Taalverhalen.be, a website which
tries to chart language variation using questionnaires<sup>3</sup>. The
list contains 16 questions: 9 questions on words for sweets,
and 6 questions about the general vocabulary. An example
of a question is: “What do you call VINEGAR?”. On top
of the user details we ask for in the other tasks (gender, age, city
of residence), we also ask for province, mother tongue and
educational level.</p>
    </sec>
    <sec id="sec-4">
      <title>2.2. Audience</title>
      <p>Our experiments were advertised via our institutional
newsletter, which reaches 3891 subscribers with an
interest in language. We assume this was the channel with the
largest reach: in the first round, the newsletter article
received 519 clicks, and in the second round, 65 clicks. In
both rounds, we observed an increase in contributions after
the release of the newsletter. Additionally, we attended two
linguistics events, where we offered visitors the possibility
to engage in our crowdsourcing experiments: the meeting
of the international society of Dutch linguistics and the Drongo
festival, an event for the language sector in the
Netherlands. Finally, we advertised our experiments via social
media (Twitter, LinkedIn) and a Dutch linguistics blog.</p>
      <p><sup>3</sup> http://taalverhalen.be, maintained by Miet Ooms.</p>
    </sec>
    <sec id="sec-5">
      <title>Results</title>
      <p>The results section consists of two parts. We will first
describe our experiences with PYBOSSA as a crowdsourcing
platform. Then, we will report on the linguistic findings on
the language phenomena we performed experiments on.</p>
    </sec>
    <sec id="sec-7">
      <title>3.1. Experiences of crowdsourcing with PYBOSSA</title>
      <p>Table 1 shows the number of contributors for each of the
tasks. Only a small number of visitors did not finish the
whole task, which could be due to the small number of
questions we offered per task. The tasks in
the second round (November-December 2018) received fewer
contributors than those in the first round (September 2018); this
could be due to a less prominent place of the announcement
in our newsletter in the second round. In all experiments,
more women than men participated. Also, participants
aged over 50 were well represented. More participants
came from the Netherlands than from Flanders.</p>
      <p>Table 1: Number of contributors who started and completed each task, per period (columns: task, period, # started, # completed; rows: blends analysis, blends recognition, neologisms, language variation).</p>
      <p>We will now discuss our experiences with the PYBOSSA
platform. A strength of PB is the freedom it offers when
designing a task: the whole interface can be written in HTML
and JavaScript and can be customized. This also makes it
easy to share tasks with other researchers<sup>4</sup>. The account
system and the saving and loading of tasks are handled by PB, so
these do not have to be implemented by the task developer.
Responses of the PB authors on the bug tracker are quick
and concise.</p>
      <p>It is clear that PB is mainly designed for
fixed-task crowdsourcing, and does not focus on variation or the
details of the contributor. For open-task, linguistic purposes,
some points require attention (at the time of writing). Firstly,
there is no built-in support for asking contributor details.
We handled this by asking for contributor details via a normal
question. However, since all given answers are publicly visible
in PB, this also applies to the details, which may not
be ideal from a privacy perspective. Secondly, contributors
cannot go back to a previous task and change their answers.
Thirdly, multiple anonymous logins from the same
computer are not allowed, making it harder to use PB at, e.g.,
a trade fair. A workaround is possible, but not built into PB
by default. Also, anonymous users are identified by IP
address: this can cause problems when multiple anonymous
contributors connect via a shared internet connection, such
as in classroom use. Finally, there is no built-in possibility
for a contributor to stop answering after a subset of the total
number of available questions and be shown an end screen.
All in all, PYBOSSA is a convenient crowdsourcing tool,
but it has its limitations with regard to open-task
crowdsourcing.</p>
      <p><sup>4</sup> Our tasks can be downloaded from: https://github.com/INL/taalradar.</p>
    </sec>
    <sec id="sec-8">
      <title>3.2. Linguistic findings</title>
      <p>Blends. For the blends analysis task, we compared the
contributor answers to the attested analyses from the ANW.
Contributors showed an average accuracy of 42%, with
average accuracies per word ranging from 2% to 83%. Table 2
shows the given answers for the analysis of the blend
preferendum. This shows that there is not always one correct
analysis of a blend, when a related noun and verb can both
be filled in as source word: while prefereren ‘to prefer’ +
referendum is the attested analysis, preferentie ‘preference’
+ referendum may also be an option. It is even more
interesting to see that a number of contributors analyze this
blend entirely differently from the attested analysis: they
analyze the blend as pre ‘before’ + referendum.</p>
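      <p>The comparison of contributor answers with attested analyses can be sketched as follows. This is a minimal illustration, not the procedure actually used: the function names, the order-insensitive and case-insensitive matching, and the responses are assumptions made for the example, and the overall figure is computed here as the mean of the per-word accuracies, which may differ from how the 42% above was obtained. Only the attested analyses of brunch and preferendum are taken from the text.</p>
      <preformat>
```python
from collections import defaultdict

# Attested source words per blend (both analyses are mentioned in the text).
ATTESTED = {
    "brunch": {"breakfast", "lunch"},
    "preferendum": {"prefereren", "referendum"},
}


def normalise(answer):
    """Lower-case and strip each source word; the order of the words is ignored."""
    return {word.strip().lower() for word in answer}


def accuracy_per_blend(responses, attested):
    """responses: iterable of (blend, (source_word_1, source_word_2)) pairs."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for blend, answer in responses:
        total[blend] += 1
        if normalise(answer) == attested[blend]:
            correct[blend] += 1
    return {blend: correct[blend] / total[blend] for blend in total}


# Invented responses, for illustration only (not data from the experiment).
responses = [
    ("preferendum", ("referendum", "prefereren")),
    ("preferendum", ("Referendum", "preferentie")),
    ("preferendum", ("pre", "referendum")),
    ("preferendum", ("prefereren", "referendum")),
    ("brunch", ("breakfast", "lunch")),
]

per_blend = accuracy_per_blend(responses, ATTESTED)
# Assumption: overall accuracy is the mean of the per-word accuracies.
overall = sum(per_blend.values()) / len(per_blend)
print(per_blend, overall)
```
      </preformat>
      <p>Strict matching of this kind counts preferentie + referendum as incorrect, although the discussion above suggests it is a defensible analysis; a more lenient scoring could accept a set of alternative analyses per blend.</p>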
      <p>Table 2: Answers given for the analysis of the blend preferendum (columns: answer, frequency). Answers: referendum, prefereren; referendum, preferentie; pre, referendum; do not know; preferent, referendum.</p>
      <p>For the blends recognition task, the contributor answers
were compared to the ANW entry in which the citation
occurs. Contributors had an average recognition accuracy of
87%, with average accuracies per word ranging from
54% to 97%. The accuracies are high: most blends are
recognized correctly. Table 3 shows the given answers for the
recognition of one specific blend: twittie ‘twitter
fight’, a blend of twitter and fittie ‘fight’ (slang). Most
contributors correctly recognize this as a blend. However, many
people also perceive fittie (which also occurs in
this citation) and tweet as blends, possibly because these
words appear new or unknown.</p>
      <p>Neologisms. Table 4 shows the endurance and diversity
judgments for the 15 words in the neologisms task. These
results show that in general, neologisms rejected by
lexicographers also receive lower crowd endurance scores. For
diversity, this pattern is not as clear.</p>
      <p>Table 4: Endurance and diversity judgments for the 15 words in the neologisms task (columns: word, endurant, diverse, status). Words: gendertransformatie, insectenafname, dreigingsmonitor, belevenisstad, vluchtelingenpraktijk, multimediamerk, zonnepriesteres, seniorenmodebranche, moeilijkheidsparadox, afradertje, tijdstrends, nachtnanny, lighttaks, korttheater, dieetopenbaring.</p>
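      <p>The relation between lexicographer status and crowd endurance scores can be examined by averaging the judgments per status group. The sketch below shows one possible way to do this; the placeholder words, the ratings and the 1 to 5 rating scale are all invented for illustration, since the scale is not specified here.</p>
      <preformat>
```python
from statistics import mean

# Invented judgments, for illustration only: (word, status, endurance rating).
# The 1-5 rating scale is an assumption; it is not taken from the experiment.
judgments = [
    ("word_a", "accepted", 4),
    ("word_a", "accepted", 5),
    ("word_b", "rejected", 2),
    ("word_b", "rejected", 1),
    ("word_c", "unattested", 3),
]


def mean_score_by_status(rows):
    """Group ratings by lexicographer status and return the mean per group."""
    grouped = {}
    for _word, status, score in rows:
        grouped.setdefault(status, []).append(score)
    return {status: mean(scores) for status, scores in grouped.items()}


scores = mean_score_by_status(judgments)
print(scores)
```
      </preformat>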
      <p>Language variation. In the language variation task, we
found that most people used the standard Dutch term for a
concept; only a minority of the given forms were
dialectal forms. However, it is interesting to investigate the
differences between Dutch and Flemish contributors. The
number of contributors from the Netherlands (around 100
per question) is larger than the number of Flemish
contributors (around 15 per question). Table 5 shows the relative
frequencies of given answers for the concept TAKE A SEAT,
split per language area. ga lekker zitten is given only by
contributors from the Netherlands (18% of their answers),
while zet u is given only in Flanders (31%).</p>
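      <table-wrap>
        <label>Table 5</label>
        <caption>
          <p>Relative frequencies of given answers for the concept TAKE A SEAT, split per language area.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Utterance</th><th>Flanders</th><th>The Netherlands</th></tr>
          </thead>
          <tbody>
            <tr><td>ga zitten</td><td>31%</td><td>38%</td></tr>
            <tr><td>ga lekker zitten</td><td>0%</td><td>18%</td></tr>
            <tr><td>neem plaats</td><td>6%</td><td>7%</td></tr>
            <tr><td>zet u</td><td>31%</td><td>0%</td></tr>
            <tr><td>pak een stoel</td><td>6%</td><td>4%</td></tr>
            <tr><td>Total answers</td><td>16</td><td>115</td></tr>
          </tbody>
        </table>
      </table-wrap>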
      <p>Such differences between areas are observed for more questions.
For example, a SWEET ON A STICK is referred to by many
Flemish contributors as lekstok, whereas contributors from
the Netherlands mainly use the form lolly. And WISHING
A GOOD NIGHT is done by saying slaap wel in Flanders,
while welterusten is used more in the Netherlands.</p>
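      <p>Relative frequencies per language area, of the kind reported for TAKE A SEAT, can be derived from the raw answers with a few lines of code. The following is a sketch under stated assumptions: the function name, the input format of (area, utterance) pairs, the case-insensitive matching and the sample answers are all illustrative and are not taken from the experiment.</p>
      <preformat>
```python
from collections import Counter, defaultdict


def relative_frequencies(answers):
    """answers: iterable of (area, utterance) pairs.

    Returns {area: {utterance: percentage rounded to a whole number}}.
    """
    counts = defaultdict(Counter)
    for area, utterance in answers:
        # Assumption: answers are matched case-insensitively after trimming.
        counts[area][utterance.strip().lower()] += 1
    result = {}
    for area, counter in counts.items():
        total = sum(counter.values())
        result[area] = {u: round(100 * n / total) for u, n in counter.items()}
    return result


# Invented answers, for illustration only (not data from the experiment).
answers = [
    ("Flanders", "zet u"),
    ("Flanders", "ga zitten"),
    ("Flanders", "zet u"),
    ("Flanders", "neem plaats"),
    ("The Netherlands", "ga zitten"),
    ("The Netherlands", "ga lekker zitten"),
    ("The Netherlands", "Ga zitten"),
    ("The Netherlands", "neem plaats"),
]

table = relative_frequencies(answers)
print(table)
```
      </preformat>
      <p>Because percentages are computed within each area, the much larger number of contributors from the Netherlands does not distort the comparison, although percentages based on a small group remain sensitive to single answers.</p>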
    </sec>
    <sec id="sec-9">
      <title>Future applications</title>
      <p>Integrating crowdsourcing into a lexicographic
workflow. Our case study on neologisms shows the potential of
crowdsourcing for lexicography. Crowdsourcing becomes
even more useful if it is fully integrated into the
lexicographic workflow. Currently, at the INT, newspaper
material is fed in and sentences with unknown words are
automatically extracted. Lexicographers then manually decide
on inclusion in the dictionary. In an ideal workflow, the
extracted sentences are automatically imported into a
crowdsourcing application and shown to the public. Contributor
judgments can help lexicographers in deciding on
dictionary inclusion. A challenge will be to motivate a crowd to
contribute over a long period of time. To maintain workflow
stability, also in case of a temporary drop in crowd
participation, crowd consultation will be an optional step in the
workflow.</p>
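      <p>The envisaged workflow could be prototyped along the following lines. This is a hypothetical sketch, not an existing component: the function names, the keys inside "info", the minimum of five judgments and the example sentences are assumptions. The "info" and "n_answers" fields are modelled on our understanding of the PYBOSSA task model and should be checked against the PYBOSSA documentation before use; no PYBOSSA API call is made here.</p>
      <preformat>
```python
import json


def build_task_payloads(extracted, n_answers=10):
    """Turn (candidate word, sentence) pairs into task dictionaries.

    The keys inside "info" are hypothetical; "n_answers" stands for the
    number of contributor judgments requested per task.
    """
    payloads = []
    seen = set()
    for word, sentence in extracted:
        key = (word.lower(), sentence)
        if key in seen:  # skip verbatim duplicates from the extraction step
            continue
        seen.add(key)
        payloads.append({
            "info": {"word": word, "sentence": sentence},
            "n_answers": n_answers,
        })
    return payloads


def crowd_advice(ratings, minimum=5):
    """Return the mean rating, or None when too few judgments came in,
    so that the crowd step stays optional for the lexicographer."""
    if minimum > len(ratings):
        return None
    return sum(ratings) / len(ratings)


# Candidate words from the neologisms task; the sentences are invented.
extracted = [
    ("nachtnanny", "Invented example sentence with nachtnanny."),
    ("nachtnanny", "Invented example sentence with nachtnanny."),
    ("lighttaks", "Invented example sentence with lighttaks."),
]

payloads = build_task_payloads(extracted)
print(json.dumps(payloads[0], indent=2))
```
      </preformat>
      <p>Returning no advice when too few judgments are available reflects the requirement that the workflow keeps functioning during a temporary drop in crowd participation.</p>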
      <p>
        Language learning. We have not yet performed
crowdsourcing experiments for language learning, but we are
looking into future directions which seem promising.
Crowdsourcing can be used to cluster word senses, which
could help people with language or speech disabilities.
Crowdsourcing has been used for word sense
disambiguation before
        <xref ref-type="bibr" rid="ref1 ref17">(Akkaya et al., 2010; Venhuizen et al., 2013)</xref>
        ,
also specifically targeted at creating language learning
material
        <xref ref-type="bibr" rid="ref11 ref13 ref14">(Parent and Eskenazi, 2010)</xref>
        . It would be worthwhile
to apply this methodology to the ANW dictionary or the
semantic lexicon DiaMaNT
        <xref ref-type="bibr" rid="ref5">(Depuydt and De Does, 2018)</xref>
        .
Another idea could be to use crowdsourcing to select
suitable learning sentences for collocations or proverbs from a
corpus.
      </p>
    </sec>
    <sec id="sec-10">
      <title>Conclusion</title>
      <p>Our experiments have shown that crowdsourcing is
useful for documenting the Dutch language, and can be
valuable for developing Dutch language learning material
in the future. We used the PYBOSSA framework for our
crowdsourcing experiments, which is very powerful, but
also has its limitations when used for linguistic purposes.</p>
    </sec>
    <sec id="sec-11">
      <title>Acknowledgements</title>
      <p>This work was supported by EU COST Action CA16105
enetCollect, which is gratefully acknowledged. We thank
Miet Ooms for supplying the questions for the language
variation task. We thank our colleagues at the INT for
valuable advice.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Akkaya</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Conrad</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wiebe</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Mihalcea</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Amazon mechanical turk for subjectivity word sense disambiguation</article-title>
          .
          <source>In Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk</source>
          , pages
          <fpage>195</fpage>
          -
          <lpage>203</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Beelen</surname>
          </string-name>
          , H. and
          <string-name>
            <surname>Van der Sijs</surname>
          </string-name>
          , N. (
          <year>2014</year>
          ).
          <article-title>Crowdsourcing de Bijbel</article-title>
          .
          <source>Neerlandia / Nederlands van Nu</source>
          , (2-2014).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Burgos</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sanders</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cucchiarini</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Hout</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Strik</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Auris populi: crowdsourced native transcriptions of Dutch vowels spoken by adult Spanish learners</article-title>
          .
          <source>InterSpeech</source>
          <year>2015</year>
          . Dresden, Germany, page 7.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Causer</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grint</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sichani</surname>
            ,
            <given-names>A.-M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Terras</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>'Making such bargain': Transcribe Bentham and the quality and cost-effectiveness of crowdsourced transcription</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Depuydt</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          and
          <string-name>
            <surname>De Does</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>The Diachronic Semantic Lexicon of Dutch as Linked Open Data</article-title>
          .
          <source>In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)</source>
          , Paris, France, May. European Language Resources Association (ELRA).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Gries</surname>
            ,
            <given-names>S. T.</given-names>
          </string-name>
          (
          <year>2004</year>
          ).
          <article-title>Shouldn't it be breakfunch? A quantitative analysis of blend structure in English</article-title>
          . Linguistics,
          <volume>42</volume>
          (
          <issue>3</issue>
          ), January.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Holdt</surname>
            ,
            <given-names>Š. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Čibej</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dobrovoljc</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gantar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gorjanc</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klemenc</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kosem</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krek</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Laskowski</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Robnik-Šikonja</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Thesaurus of Modern Slovene: By the Community for the Community</article-title>
          .
          <source>Proceedings of the XVIII EURALEX International Congress: Lexicography in Global Contexts</source>
          , pages
          <fpage>989</fpage>
          -
          <lpage>997</lpage>
          ,
          July
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Holley</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Crowdsourcing: How and Why Should Libraries Do It?</article-title>
          <source>D-Lib Magazine</source>
          ,
          <volume>16</volume>
          (
          <issue>3/4</issue>
          ), March.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Kosem</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krek</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gantar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Holdt</surname>
            ,
            <given-names>Š. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Čibej</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Laskowski</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Collocations Dictionary of Modern Slovene</article-title>
          .
          <source>Proceedings of the XVIII EURALEX International Congress: Lexicography in Global Contexts</source>
          , pages
          <fpage>989</fpage>
          -
          <lpage>997</lpage>
          ,
          July
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Metcalf</surname>
            ,
            <given-names>A. A.</given-names>
          </string-name>
          (
          <year>2004</year>
          ).
          <article-title>Predicting new words: the secrets of their success</article-title>
          .
          <source>Houghton Mifflin Harcourt.</source>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Parent</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Eskenazi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Clustering dictionary definitions using Amazon Mechanical Turk</article-title>
          .
          <source>In Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk</source>
          , pages
          <fpage>21</fpage>
          -
          <lpage>29</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Sanders</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burgos</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cucchiarini</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>van Hout</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>Palabras: Crowdsourcing Transcriptions of L2 Speech</article-title>
          .
          <source>International Conference on Language Resources and Evaluation (LREC) 2016</source>
          . Portorož, Slovenia, page 7.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Schoonheim</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Tempelaars</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Dutch Lexicography in Progress, The Algemeen Nederlands Woordenboek (ANW)</article-title>
          .
          <source>In Proceedings of the XIV Euralex International Congress</source>
          , Ljouwert, Fryske Akademy/Afuk, abstract, page 179.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Tiberius</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Niestadt</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>The ANW: An online Dutch dictionary</article-title>
          .
          <source>Proceedings of the XIV Euralex International Congress. Ljouwert</source>
          , Fryske Akademy/Afuk.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Tiberius</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Schoonheim</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>The Algemeen Nederlands Woordenboek (ANW) and its Lexicographical Process</article-title>
          .
          <source>Der lexikografische Prozess bei Internetwörterbüchern. 4. Arbeitsbericht "Internetlexikografie"</source>
          .
          <source>Mannheim: Institut für Deutsche Sprache</source>
          . (OPAL X/
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>Van der Wal</surname>
          </string-name>
          , M. J.,
          <string-name>
            <surname>Rutten</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Simons</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>Letters as loot: Confiscated Letters filling major gaps in the History of Dutch</article-title>
          . In Marina Dossena et al., editors,
          <source>Pragmatics &amp; Beyond New Series</source>
          , volume
          <volume>218</volume>
          , pages
          <fpage>139</fpage>
          -
          <lpage>162</lpage>
          . John Benjamins Publishing Company, Amsterdam.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Venhuizen</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Evang</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Basile</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Bos</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>Gamification for word sense labeling</article-title>
          .
          <source>In Proceedings of the 10th International Conference on Computational Semantics (IWCS 2013)</source>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>