<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Condition-Task-Store: A Declarative Abstraction for Microtask-based Complex Crowdsourcing</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kenji Gonnokami</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Atsuyuki Morishima</string-name>
          <email>mori@slis.tsukuba.ac.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hiroyuki Kitagawa</string-name>
          <email>kitagawa@cs.tsukuba.ac.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Tsukuba</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2013</year>
      </pub-date>
      <abstract>
        <p>Microtasks have been widely adopted by many crowdsourcing platforms as a unit for human computation. Recently, tools to support programmers to implement complex crowdsourcing applications with microtasks have been proposed. One approach is to provide a library of functions that can be called by programs written in imperative programming languages. Another approach is to allow SQL queries to invoke microtasks. The former approach provides large expressive power, while the latter allows declarative descriptions with limited expressive power. This paper proposes the Condition-Task-Store (CTS) abstraction, which is an alternative declarative approach to implement complex datacentric crowdsourcing with microtasks. The CTS abstraction is unique in that it has all the following features: (1) it naturally extends the task template adopted by many crowdsourcing platforms to define microtasks, (2) it allows declarative descriptions of crowdsourcing systems, and (3) it has large expressive power.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>As computer network technologies evolved,
crowdsourcing became popular in many application domains. Software
systems that take the crowdsourcing approach are called
crowdsourcing systems [2].</p>
      <p>Crowdsourcing systems are often constructed on
crowdsourcing platforms, which provide fundamental functions
for implementing crowdsourcing systems. For example, the
Amazon Mechanical Turk (MTurk) [3] is a crowdsourcing
platform that provides a market in which workers perform
microtasks (called HITs in MTurk) with a small payment
amount per task. Crowdsourcing platforms often provide
APIs for requesters to register microtasks in the platforms.</p>
      <p>Because there are frequent patterns appearing in
programs for crowdsourcing systems, software tools have been
developed to support the implementation of complex
crowdsourcing systems. For example, Turkit [4] provides a library
of functions to define and call tasks from general-purpose
programming languages and introduces the crash-and-rerun
Copyright c 2013 for the individual papers by the papers’ authors. Copying
permitted for private and academic purposes. This volume is published and
copyrighted by its editors.
programming model for minimizing the cost of re-running
programs. Recently, the declarative approach to the
development of crowdsourcing systems has emerged because
declarative abstractions have an affinity toward crowdsourcing
applications. Declarative descriptions do not impose
unnecessary timing constraints, and we can adopt many well-known
optimization techniques. For example, there are proposals
that use SQL-like languages to describe data-centric
crowdsourcing systems [7] [8] [13]. However, they provide limited
expressive power (see Section 4).</p>
      <p>
        In this paper, we offer the following two key contributions:
(
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) Alternative approach to declarative
crowdsourcing. We first introduce the Condition-Task-Store (CTS)
abstraction, which is an alternative declarative approach for
implementing complex data-centric crowdsourcing with
microtasks. The CTS abstraction describes a crowdsourcing
system as a set of CTS rules, each of which is a natural
extension of task template adopted by many
crowdsourcing platforms to define microtasks. Therefore, programmers
who are familiar with task templates can easily write simple
programs with the CTS abstraction.
      </p>
      <p>
        The CTS abstraction is unique in that it has all the
following features: (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) it naturally extends the task template, (
        <xref ref-type="bibr" rid="ref2">2</xref>
        )
it allows declarative descriptions of crowdsourcing systems,
and (
        <xref ref-type="bibr" rid="ref3">3</xref>
        ) it has large expressive power.
(
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) Novel criterion for the expressive power of
languages. Next, we discuss the expressive power of the CTS
abstraction. We introduce a novel criterion to measure the
expressive power of programming languages for
crowdsourcing; the criterion focuses on the class of interactions with
humans we can implement with the language. The
criterion is important for the following two reasons. First,
complex crowdsourcing often requires various types of
interactions with humans. For example, one of such interactions
of crowdsourcing is the iterative collaboration [11], which is
not necessarily supported by every existing framework.
Second, the class of interactions is closely related to the class of
games in game theory: because human behavior is affected
by the incentives and rules defined by the game structure,
the class represents the size of mechanism design space, i.e.,
the set of possible mechanisms we can implement to
exploit the wisdom of the crowd. In fact, the change of game
structure affects the quality of the data produced by
crowdsourcing systems [5]. Our examples in Sections 3 and 4 also
suggest how game structure is important in crowdsourcing.
Then, we theoretically show that our CTS abstraction is not
only Turing complete, but can also support a wide range of
game structures.
The remainder of the paper is organized as follows.
Section 2 explains related work. Section 3 introduces the CTS
abstraction. Section 4 discusses its expressive power.
Section 5 explains a prototype to support the software
development using the CTS abstraction. Section 6 is the summary.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. RELATED WORK</title>
      <p>
        Many approaches to support the development of complex
crowdsourcing systems have been proposed. They differ
from one another in the abstraction they use to describe
crowdsourcing systems.
(
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) Imperative programming languages. TurKit [4]
provides a function library for implementing crowdsourcing,
which can be used via codes written in imperative
programming languages. It supports the crash-and-rerun model to
avoid re-performing costly operations.
(
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) MapReduce-like abstraction. CrowdForge [14] is
a MapReduce-like framework for describing complex tasks
on MTurk. It models a crowdsourcing system as a set of
tasks to implement partition, map, and reduce functions.
As we discuss in Section 4.3, the expressive power of the
CTS abstraction is larger than that of CrowdForge.
(
        <xref ref-type="bibr" rid="ref3">3</xref>
        ) Control/data flows. CrowdLang [11] is a language
for describing crowdsourcing systems in terms of basic
operators, data items, and control flow constructs. Currently,
it seems that CrowdLang is used as a language for writing
crowdsourcing systems at a high-abstraction level and that
it does not provide the means to describe the details required
to directly execute the code.
(
        <xref ref-type="bibr" rid="ref4">4</xref>
        ) Rule-based abstraction. The CTS abstraction is not
the first rule-based abstraction. CyLog [9] is a Datalog-like,
rule-based language for describing crowdsourcing systems.
A weakness of CyLog is that it requires programmers to be
familiar with programming by Horn clauses even for simple
crowdsourcing. In contrast, the core component of the CTS
abstraction is a task template. Therefore, although the CTS
abstraction borrows concepts from logic-based languages,
programmers can start with a set of simple task templates
and then naturally proceed to more complex crowdsourcing.
(
        <xref ref-type="bibr" rid="ref5">5</xref>
        ) SQL-like abstraction. CrowdDB [8], Qurk [7], and
Deco [13] use SQL to describe crowdsourcing systems. They
propose novel query processing and optimization schemes
and we believe that some of the proposed techniques can be
applied to the CTS abstraction. As shown in Section 4.3,
the expressive power of the CTS abstraction is larger.
      </p>
      <p>To our knowledge, this paper is the first to investigate the
CTS abstraction. The abstraction models crowdsourcing
systems as a set of CTS rules, each of which is similar to a
task template. Technically, a CTS rule can be implemented
by combining two ECA rules [12]: one generates microtasks
and the other stores results in the database (and can be
implemented by using imperative languages). The CTS
abstraction provides a higher-level, user-friendly abstraction
designed for describing crowdsourcing systems, which has a
well-defined semantics and proven large expressive power.</p>
      <p>Our discussion on the expressive power is related to game
theory. Recently, the literature on algorithmic game
theory has addressed various aspects involving both algorithms
and games, such as complexities of computing equilibrium
of games [15]. To our knowledge, our paper is the first to
discuss classes of games that can be implemented by
abstractions for crowdsourcing.
3. THE CTS ABSTRACTION</p>
      <p>The task template is a popular form for defining and
registering microtasks into crowdsourcing platforms. Figure 1
shows an example of a task template in XML format for
microtasks (HITs) of MTurk, which asks a worker to enter
how many movies he or she watched in a month. The
essential components of a task template are the question to
be shown (QuestionContent) and the type of the values to
be received by workers (AnswerSpecification). Task
templates can contain variables (called placeholders) with which
we can define many microtasks that are based on the same
template but differ in the values bound to the variables.</p>
      <p>Requesters first write task templates to define and insert
microtasks into the task pool. Then, workers perform the
tasks that exist in the task pool.
3.2</p>
    </sec>
    <sec id="sec-3">
      <title>Overview of the CTS Abstraction</title>
      <p>In the CTS abstraction, a crowdsourcing system is
described by a set of CTS rules. We assume that there exists
a relational database. CTS rules read and write data to and
from the database.</p>
      <p>A CTS rule is the fundamental building block of the CTS
abstraction. We first give a simple example and next show
another example that requires more than one CTS rule.</p>
      <sec id="sec-3-1">
        <title>Example 1. A Simple Crowdsourcing System</title>
        <p>We use the task shown in Figure 2 to ask workers to tag
books. The details are as follows:
• The database has the Book(bid, title, author)
relation to store book information.
• For each book stored in the Book relation, tags are given
by three workers.
• The result is stored in the Tag(bid, tag) relation.
• Workers are paid 10 (cents or any currency) per task, if
any other workers entered the same tag.</p>
        <p>The crowdsourcing system can be described only by the
CTS rule in Figure 3. A CTS rule consists of three parts:
condition, task, and store. We explain each part below.
The condition part: We write the condition to generate
and insert a task into the task pool. The condition specifies
what tuples need to exist in the database for generating a
task. For example, the condition in Figure 3 states that a
task is generated for each tuple existing in the Book relation.
The task part: We write a task template that contains a
question to be posed to workers. The question can contain
variables (such as $author and $title) bound to values in
the condition (i.e., the variables are replaced with values
each time the task is generated). The task part also
specifies additional information, including the variables and its
associated types, to store entered values.</p>
        <p>
          Because there are frequently occurring patterns that
appear in task template specifications, we provide template
generators that allow users to describe task templates
without specifying implementation details such as HTML tags.
The task part in Figure 3 states that we use the Entry
template generator with the following parameters: (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) the
input field is labeled as “Tag,” (
          <xref ref-type="bibr" rid="ref2">2</xref>
          ) the entered value is to
be stored in the tag variable, and (
          <xref ref-type="bibr" rid="ref3">3</xref>
          ) the type of tag is
text. Count is the number of tasks to be generated for the
same value. Payoff describes how much is paid to workers
per task. For example, PayIf(count(Tag(bid, tag)) &gt;=
2, 10) states that 10 cents will be paid if there are other
workers that entered the same tag to the same book.
The store part: We specify the relations and attributes
in which the entered values will be stored. The store part
in Figure 3 states that we use the Tag relation to store bid
and the entered tag. 2
        </p>
        <p>As the example above suggests, a CTS rule is a natural
extension of the widely-used task template. Because we can
omit the condition and task parts, a CTS rule can be used
to describe the following four types of processing.
1. Generate a task when the condition is satisfied, and
store the result into the database.
2. If the condition part is omitted, generate a task with
no condition, and store the result into the database.
3. If the task part is omitted, compute and store values
into the database when the condition is satisfied.
4. If both of the condition and task parts are omitted,
store values into the database with no condition.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Example 2. More Complex Crowdsourcing.</title>
        <p>We consider a crowdsourcing system to rate restaurants,
in which workers perform the following two types of tasks:
Task 1: Enter names of restaurants (Figure 4). A 10 cent
payment is paid if the average rating by others is higher
than 3.</p>
        <p>Task 2: Enter an evaluation rating (1 to 5) for the given
restaurant (Figure 5). Three workers perform this task
for each restaurant, and they receive 10 cents per task.</p>
        <p>We assume that the results of Task 1 are stored in
Restaurant(rname), and that those of Task 2 are stored
in Rating(rname, value). Then, Figure 6 shows CTS rules
for Tasks 1 and 2.</p>
        <p>
          The CTS rule for Task 1 has no condition: thus, the task is
unconditionally generated. Unless the count number is
specified, the number of generated tasks is determined as follows:
(
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) if the key of the predicate (rname of Restaurant(rname))
is bound to values by the condition, the task is generated
only once for each case in which the condition is satisfied; (
          <xref ref-type="bibr" rid="ref2">2</xref>
          )
otherwise the same task is repeatedly generated everytime
the task is performed and removed from the task pool.
        </p>
        <p>The condition of the CTS rule for Task 2 states that it
generates a task for each tuple stored in Restaurant relation.
Thus, the task is generated for each restaurant entered in
Task 1. Workers receive 10 cents for performing a task. 2
Discussion. As the examples above suggest, the
description is declarative, and each rule is invoked when its
condition is satisfied. Compared to the code written in imperative
programming languages, CTS rules naturally describe the
parallel and asynchronous processing of computation
involving human workers. On the other hand, the CTS abstraction
is more expressive than the declarative query languages that
do not support the transitive closure, because it essentially
supports a loop with a dynamic condition check [1].</p>
        <p>An important point is that the incentive structure plays
a critical role to appropriately exploit the wisdom of crowd
and (at least theoretically) ensure data quality. Because the
incentive structure and rules involving multiple humans can
be modeled as games, we can use game theory to discuss
their behaviors. For example, with a simple game-theoretic
analysis, the incentive structure of Example 1 guarantees
that rational workers enter tags that others can easily come
up with. Similarly, the incentive structure of Example 2
guarantees that rational workers for Task 1 enter the names
of restaurants that are likely to receive high ratings.
3.3</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Formal Definition</title>
      <p>This section first defines the CTS rules and then explains
the syntactic sugar for the concise description of rules.
3.3.1</p>
      <sec id="sec-4-1">
        <title>Definition of CTS Rules</title>
        <p>Definition 1. A program of the CTS abstraction is a
set of CTS rules, each of which is modeled as a triple
Ri = (Ci, Ti, Si), running over a relational database schema.
Here, Ti is a task template, Ci is the condition to generate
a task using Ti, and Si is the description of how to store
the result in the database. The database contains a
predefined relation with the schema Worker(pid, payoff) to
store information on workers and the accumulated values of
the payoffs they received so far.</p>
        <p>• Ci is a sequence P1(x11, . . . , x1n1 ), . . . , Pm(xm1, . . . , xmnm )
of zero or more atoms. Some of the atoms can be
arithmetic atoms, such as x11 = 3.
• Ti is either a triple (q, x¯, y¯) or null. Here, q is a question
for workers, x¯ is a sequence of variables bound by Ci,
and y¯ is a sequence of variables to store the results of
performing the task.
• Si is a sequence Q1(y11, . . . , y1n1 ), . . . , Qm(ym1, . . . , ymnm )
of one or more atoms, where each yjk is any of a
variable bound by Ci, a variable that appears in y¯, or a
constant. Each atom can be followed by either /update
or /delete. 2
3.3.2</p>
      </sec>
      <sec id="sec-4-2">
        <title>Syntactic Sugar</title>
        <p>
          We introduce the following variety of syntactic sugars to
make the rule description concise.
(
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) Attributes of atoms. The notation of atoms, which
appear in the condition and store part of CTS rules, are
essentially the same as those of Prolog and Datalog. A
key difference is that they explicitly specify attribute names
for their parameters. Each parameter is specified in the
form attribute:variable or attribute:constant. For example,
Restaurant(name:x, zip:305) is an atom.
        </p>
        <p>There are cases in which parameters and attributes can
be omitted in each atom, as described below.</p>
        <p>
          • We can omit a parameter if the rule does not
consume the value of the bounded variable. For example,
Restaurant(name:x) is an atom that omits zip.
• We can omit an attribute name if the attribute name
is the same as the variable name. For example,
Restaurant(name:name, zip:y) can be represented
as Restaurant(name, zip:y) because the attribute
and the variable have the same name.
(
          <xref ref-type="bibr" rid="ref2">2</xref>
          ) Task part. Because we have frequently occuring
patterns in the description, we introduce task-template
generators and the following three fields for the task part.
•Generator is the name of a task-template generator,
associated with its parameters. For example, Entry and
Choice in Figure 6 are task-template generators. They
generate actual task templates based on the given
parameters and the sentence written in the Question
field. These allow users to define task templates
without specifying implementation details (such as HTML
tags). The Crapid system (Section 5) implements
various task-template generators.
•Count specifies the number of generated tasks. Let N
be the number specified in the field. For every case in
which the condition holds, the same task is generated
N times and N tuples are inserted into the relation.
This is implemented by adding the count attribute
to the schema of the relation in the store part, and
copying the CTS rule N times in the program.
•Payoff specifies a function to compute payoff values
given to workers. If a function is specified, rules to
update Worker(pid, payoff) are automatically
generated and added to the set of rules. The Crapid
system provides pre-defined payoff functions.
3.4
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Evaluation Model and Semantics</title>
      <p>Given a description d of the CTS abstraction, the
evaluation of d on instance ins of the database is performed by
evaluating each CTS rule in a bottom-up way on ins. More
precisely,
• For each CTS rule (Ci, Ti, Si) ∈ d, check if Ci is
satisfied with ins in the following way: for each atom ak in
Ci (from left to right), check if there exists any tuple
to bind variables in ak to the values that are
consistent with the values bound to the variables appeared
in a0 . . . ak−1.
• For every combination V of values that satisfies all the
atoms in Ci, perform one of the following:
– If Ti 6= null, replace variables in Ti with values in</p>
      <p>
        V and insert the task into the task pool.
– If Ti = null, create new tuples using values in V
for the atoms that appear in Si. Then, perform
one of the following: insert the tuples into ins,
update ins with the tuples (when the atom is
followed by /update), or delete the tuples from ins
(when followed by /delete).
• If a worker completes the generated task, (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) create
new tuples for the atoms that appear in Si, using
values in V and the entered values for the task, then (
        <xref ref-type="bibr" rid="ref2">2</xref>
        )
apply the insertion, delete, or update operation to ins
with the new tuples.
• If ins is updated, check if there exist rules for which Ci
is satisfied with the new ins. If such rules exist, process
them. Terminate the process if we cannot find such a
rule. If there are multiple rules that can be executed
at the same time, the rule that appears earlier in the
code is evaluated first.
      </p>
      <p>For example, assume that we have the CTS rule shown in
Figure 7. We first check if there exists a tuple that matches
Image(i, size, type:"photo") in the database. Assume
that the Image relation has tuple t = (img98, 100, photo).
Then, i and size are bound to img98 and 100, respectively.
Next, check if there exists a tuple that matches Large(size)
with size=100. If exists, we replace $i in the task template
with img98 and insert the task into the task pool to ask
workers to choose the category for the image img98. If the
task is performed, the result of performing the task is stored
into LargePhoto, and the payoff attribute of the Worker
relation is incremented by 1.</p>
      <p>Given a set d of CTS rules, the semantics of d is defined
as a set of rational consequences of d, in a similar way as
the semantics of CyLog codes [9]. A rational consequence
of d is a set of facts that are derived from the rules and
the equilibrium [16] of the games, i.e., the states reached by
rational workers.</p>
    </sec>
    <sec id="sec-6">
      <title>4. EXPRESSIVE POWER</title>
      <p>This section discusses the expressive power of the CTS
abstraction. First, we show that the CTS abstraction is Turing
complete. Then, we introduce a criterion to measure the
expressive power of programming languages, which focuses on
the class of games the language can implement. Finally, we
compare the expressive power of the CTS abstraction with
those of other frameworks.
4.1</p>
    </sec>
    <sec id="sec-7">
      <title>Turing Completeness</title>
      <p>Theorem 1. The CTS abstraction is Turing complete.
Proof Outline. Figure 8 is a set of CTS rules that
implements any Turing machine. Formally, a Turing machine
consists of a quintuple (K, Σ, δ, s, H) where K is a finite set
of states, Σ is an alphabet, s ∈ K is the initial state, H ∈ K
is the set of halting states, and δ is the transition function
[6]. Intuitively, we need the following three components to
implement a turing machine.</p>
      <p>Element 1. Memory of the machine’s inner state
Element 2. Head reading and writing information stored in
the tape
Element 3. The long tape in infinitum.</p>
      <p>In Figure 8, TuringMachine(id, st, head) implements
Elements 1 and 2. Here, st records the current state whose
domain is K, and head stores the position of the head.
Tape(pos, sym) implements Element 3, in which each tuple
(p, s) states that symbol s (whose domain is Σ) is written at
position p of the tape. Rule(st, sym, new st, new sym,
dir) stores the rules δ to read and write symbols on the
tape and move the head.</p>
      <p>Rule 1 initializes a Turing machine (the initial state is s).
Rule 2 states how the head moves and writes symbols onto
the tape according to the rules stored in Rule(st, sym,
new st, new sym, dir). It states that if the inner state is
st and the symbol at the current position of the head is
sym, write new sym at the position, update the inner state
by new st, and move the head to pos+dir. Rule 3 extends
the tape when the head reaches a position that the head
never visited before. We need Rule 3 because Rule 2 always
requires that Tape(pos, sym) exists. We also need a rule
to stop the machine if it reaches the halting states H ⊆ K.
The rule is obvious and omitted due to the space limitation.</p>
      <p>Defining the CTS rules to implement a Turing machine
proves that the CTS abstraction is Turing complete. 2
4.2</p>
    </sec>
    <sec id="sec-8">
      <title>Expressive Power in Terms of Games</title>
      <p>This section proposes to use the game concept as a
measure of the expressive power of programming languages for
crowdsourcing, because the class of games that the language
can implement affects the way in which the implemented
system can exploit the power of the wisdom of crowd. First, we
enumerate several classes of games and show the relationship
among the classes.</p>
      <p>Definition 2. G1 is a class of games that satisfy the
following conditions:
1. Every input from a human is not affected by the inputs
from others, and
2. The payoffs are computed by a primitive recursive
function of the input values. 2
An example of a game in G1 is one that asks humans to
enter tags for a given image without telling them what tags
are entered by other humans. Payoffs are defined for each
combination of worker inputs. For example, workers receive
payoffs when they enter the same tag. Games in G1 are
called simultaneous games in game theory.</p>
      <p>
        Definition 3. GN is a class of programs in which (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) N (&gt; 0)
is known in advance; (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) each game has at most N -phases of
interactions, each of which asks a worker to enter data; (
        <xref ref-type="bibr" rid="ref3">3</xref>
        ) at
each i-th phase workers are shown some information based
on what was entered in the first to the i − 1-th phases; and
(
        <xref ref-type="bibr" rid="ref4">4</xref>
        ) payoffs are computed by a primitive recursive function
of the entered values. 2
      </p>
      <p>
        Each game in GN has at most N phases, each of which
asks a human to enter data considering some information
computed from data in the previous phases. For example,
assume that we want to divide a set of cakes into two groups
whose total prices are equivallent to each other. Then, a
program that (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) asks one worker to divide the cakes into
two groups at the first phase, then (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) asks another worker
to choose one group, and finally (
        <xref ref-type="bibr" rid="ref3">3</xref>
        ) gives to each worker the
total price of cakes in his group, belongs to GN (with N =2).
Note that the game guarantees that the prices of the two
groups become the same if workers are rational; it exploits
the power of human intelligence to compute how they can
make two groups whose prices are equivalent to each other.
From the definition, every game in G1 belongs to GN .
Definition 4. G∗ is a class of programs in which each
program (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) executes a sequence of interactions with workers,
with the sequence being generated by a primitive recursive
function, (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) shows workers at each interaction the
information computed by a function whose parameters are taken
from the results of past interactions, and (
        <xref ref-type="bibr" rid="ref3">3</xref>
        ) payoffs are
computed by a primitive recursive function of the entered
values. 2
      </p>
      <p>An example of a game in G∗ is to ask workers to write a
paragraph that explains a given keyword. Workers update
the paragraph until the paragraph is satisfied by the
majority of the crowd. When the paragraph is completed, every
writer (worker) who contributed to the paragraph receives a
payoff, which is computed by dividing a fixed total payment
by the number of the contributors.</p>
      <p>Theorem 2. GN is a proper subset of G∗.</p>
      <p>Proof. For every g, g ∈ GN ⇒ g ∈ G∗ and for any given i,
there exists g ∈ G∗ s.t. g has i+1 input phases and therefore
g 6∈ GN . 2
Theorem 3. Assume that we allow Turing machines to
interact with humans at any step of its execution. Let M be
the set of all such machines. M can implement any g ∈ G∗.
Proof. The sequence of interactions with workers that can
be generated by a primitive recursive function can be
implemented by some m ∈ M . The information shown and the
payoffs can be computed if m is a Turing machine. 2</p>
      <p>Note that being Turing complete is not a sufficient
condition to be able to implement G∗, because the power of
interactions with humans does not matter for a language to
be Turing complete. G∗ contains indefinite-length sequential
games (in game theory) that can be expressed with Turing
machines that are able to interact with humans at any step
of its execution.</p>
      <p>From the definition of the CTS rules, the following holds.</p>
      <sec id="sec-8-1">
        <title>Theorem 4.</title>
        <p>games in G∗.</p>
        <p>The CTS abstraction can implement any
2
4.3</p>
      </sec>
    </sec>
    <sec id="sec-9">
      <title>Comparison of Expressive Powers</title>
    </sec>
    <sec id="sec-10">
      <title>PROTOTYPE SYSTEM</title>
      <p>We implemented the Crapid system, a prototype system
to develop crowdsourcing systems using the CTS
abstraction. Crapid takes as input a set of CTS rules and outputs
executable code. Crapid supports various task-template
generators (i.e., Entry, Choice, Decision, and Comparisons
as of May 2013) and payoff functions to help users easily
define microtasks in CTS rules. Crapid provides a
formbased user interface. When a user chooses a task-template
generator, Crapid provides an appropriate set of selection
boxes and drop-down menus to specify CTS rules. The
output code is executable on Crowd4U [10], a crowdsourcing
platform deployed at universities.</p>
    </sec>
    <sec id="sec-11">
      <title>SUMMARY</title>
      <p>
        This paper introduced the CTS abstraction, a declarative
approach for implementing complex crowdsourcing with
microtasks. The paper also introduced a novel criterion to
measure the expressive power of programming languages for
crowdsourcing by focusing on the class of games we can
implement with the language. The class represents the size of
mechanism design space, i.e., the set of possible mechanisms
we can implement to exploit the wisdom of the crowd. The
CTS abstraction is unique in that it has all the following
features: (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) it naturally extends the task template adopted
by many crowdsourcing platforms, (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) it allows declarative
description, and (
        <xref ref-type="bibr" rid="ref3">3</xref>
        ) it has large expressive power.
      </p>
      <p>Future work includes the development of various rewriting
techniques for the CTS abstraction. For example, we plan
to adapt various optimization techniques for crowdsourcing
systems [7] [8] [13] into our context.</p>
      <p>Acknowledgements. The authors are grateful to Prof.
Shigeo Matsubara of Kyoto university for his helpful
comments, and to the contributors of Crowd4U, whose names
are partially listed at http://crowd4u.org. This research was
partially supported by PRESTO from the Japan Science and
Technology Agency, and by the Grant-in-Aid for Scientific
Research (#25240012) from MEXT, Japan.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Abiteboul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hull</surname>
          </string-name>
          , V. Vianu: Foundations of Databases. Addison-Wesley
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Doan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ramakrishnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Y.</given-names>
            <surname>Halevy</surname>
          </string-name>
          . “
          <source>Crowdsourcing systems on the World-Wide Web. Commun.ACM</source>
          .
          <year>2011</year>
          , vol.
          <volume>54</volume>
          , no.
          <issue>4</issue>
          , p.
          <fpage>86</fpage>
          -
          <lpage>96</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Amazon</given-names>
            <surname>Mechanical Turk</surname>
          </string-name>
          , https://www.mturk.com/.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G.</given-names>
            <surname>Little</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. B.</given-names>
            <surname>Chilton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Goldman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. C.</given-names>
            <surname>Miller</surname>
          </string-name>
          . “
          <article-title>Turkit: human computation algorithms on mechanical turk”</article-title>
          .
          <source>Proc. UIST</source>
          (
          <year>2010</year>
          ), ACM, New York,
          <fpage>57</fpage>
          -
          <lpage>66</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Jain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. C.</given-names>
            <surname>Parkes</surname>
          </string-name>
          . “
          <article-title>A game-theoretic analysis of the ESP game”</article-title>
          .
          <source>ACM Trans. Economics and Comput</source>
          .
          <volume>1</volume>
          (
          <issue>1</issue>
          ): 3 (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>H. R.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. H.</given-names>
            <surname>Papadimitriou</surname>
          </string-name>
          . “
          <article-title>Elements of the theory of computation</article-title>
          .”
          <string-name>
            <surname>Prentice-Hall</surname>
          </string-name>
          , Englewood Cliffs, New Jersey,
          <year>1981</year>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Marcus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Karger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Madden</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Miller</surname>
          </string-name>
          . “
          <article-title>Human-powered sorts and joins”</article-title>
          .
          <source>In VLDB</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Franklin</surname>
          </string-name>
          .,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kossmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Kraska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ramesh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Xin</surname>
          </string-name>
          . “
          <article-title>CrowdDB: answering queries with crowdsourcing”</article-title>
          .
          <source>SIGMOD Conference</source>
          .
          <year>2011</year>
          , p.
          <fpage>61</fpage>
          -
          <lpage>72</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Morishima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shinagawa</surname>
          </string-name>
          ,
          <string-name>
            <surname>S. Mochizuki.</surname>
          </string-name>
          “
          <article-title>The Power of Integrated Abstraction for Data-centric Human/Machine Computations”</article-title>
          .
          <source>VLDS2011</source>
          , pp.
          <fpage>5</fpage>
          -
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Morishima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shinagawa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mitsuishi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Aoki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Fukusumi</surname>
          </string-name>
          . “
          <article-title>CyLog/Crowd4U: A Declarative Platform for Complex Data-centric Crowdsourcing”</article-title>
          .
          <source>PVLDB</source>
          <volume>5</volume>
          (
          <issue>12</issue>
          ):
          <fpage>1918</fpage>
          -
          <lpage>1921</lpage>
          (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>P.</given-names>
            <surname>Minder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bernstein</surname>
          </string-name>
          . “
          <article-title>CrowdLang: A Programming Language for the Systematic Exploration of Human Computation Systems”</article-title>
          .
          <source>SocInfo</source>
          <year>2012</year>
          :
          <fpage>124</fpage>
          -
          <lpage>137</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>N. W.</given-names>
            <surname>Paton</surname>
          </string-name>
          ,
          <string-name>
            <surname>O.</surname>
          </string-name>
          <article-title>Di'az. “Active Database Systems”</article-title>
          .
          <source>ACM Comput. Surv</source>
          .
          <volume>31</volume>
          (
          <issue>1</issue>
          ):
          <fpage>63</fpage>
          -
          <lpage>103</lpage>
          (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>H.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Pang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. G.</given-names>
            <surname>Parameswaran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Garcia-Molina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Polyzotis</surname>
          </string-name>
          ,
          <string-name>
            <surname>J. Widom. “</surname>
          </string-name>
          <article-title>An overview of the deco system: data model and query language; query processing and optimization”</article-title>
          .
          <source>SIGMOD Record</source>
          <volume>41</volume>
          (
          <issue>4</issue>
          ):
          <fpage>22</fpage>
          -
          <lpage>27</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kittur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Smus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Khamkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. E.</given-names>
            <surname>Kraut</surname>
          </string-name>
          . “
          <article-title>CrowdForge: crowdsourcing complex work”</article-title>
          .
          <source>UIST</source>
          <year>2011</year>
          :
          <fpage>43</fpage>
          -
          <lpage>52</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>T.</given-names>
            <surname>Roughgarden</surname>
          </string-name>
          . “
          <article-title>Algorithmic game theory”</article-title>
          .
          <source>Commun. ACM</source>
          <volume>53</volume>
          (
          <issue>7</issue>
          ):
          <fpage>78</fpage>
          -
          <lpage>86</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>F.</given-names>
            <surname>Vega-Redondo</surname>
          </string-name>
          .
          <source>Economics and Theory of Games</source>
          , Cambridge University Press,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>