Exploiting Twitter as a Social Channel for Human Computation

Ernesto Diaz-Aviles (diaz@L3S.de), Ricardo Kawase (kawase@L3S.de), Wolfgang Nejdl (nejdl@L3S.de)
L3S Research Center / University of Hannover, Hannover, Germany

ABSTRACT
Fully leveraging the innate problem-solving capabilities of humans necessitates a paradigm shift towards the decentralization of human computation systems, making the existence of central authorities superfluous, or even impossible. In this position paper, we propose a novel decentralized architecture that exploits the Twitter social network as a communication channel for harnessing human computation. Our framework provides individuals and organizations the necessary infrastructure for human computation, facilitating human task submission, assignment, and aggregation. We present a proof of concept and explore the feasibility of our approach in the light of several use cases.

Categories and Subject Descriptors
H.3 [Information Storage and Retrieval]: Information Search and Retrieval—Information Filtering; K.4 [Computers and Society]: General

General Terms
Human Factors; Design

Keywords
Human Computation; Social Computer; Twitter

1. INTRODUCTION
Today's most successful crowdsourcing services, such as Amazon's Mechanical Turk (mturk.com) and CrowdFlower (crowdflower.com), share a common characteristic: they are all based on centralized architectures. In these services, both the users' profile information and the task distribution engine are centralized.

However, the Social Computer vision that we share is more likely to be based on decentralized architectures [6], like the ones provided by social networks and mobile devices, where humans would interact via free information exchange or trading to solve large-scale problems that cannot easily be addressed by conventional computer systems and algorithms.

Social networking sites such as Twitter (twitter.com) have experienced an explosion in global Internet traffic over the past years. As of June 2011, it is estimated that Twitter surpassed 300 million users, who generate more than 200 million 140-character Twitter messages – tweets – every day [8, 9]. Interestingly, nearly two-thirds of active Twitter users access the microblogging service using a mobile phone [2].

The massive number of mobile users and the simplicity of interactions on Twitter, together with its scalability and real-time message exchange, make this social networking system an appealing environment in which to assign and collect feedback for human computation tasks that fit the nature of a tweet: short and simple.

In this work we propose MechSwarm, a decentralized framework for human computation built upon Twitter's infrastructure. Our primary contributions can be summarized as follows:

• We present the building blocks necessary for a decentralized crowdsourcing architecture.

• We introduce simple yet powerful idioms for human intelligence task assignment over the Twitter social network, which can be considered a protocol for human computation over a transport layer.

• We present a conceptual design of our framework and identify a number of use cases for human computation that can take advantage of the proposed approach.

The rest of this paper is structured as follows: Section 2 introduces the terminology and core concepts of the framework, as well as its components and workflow. In Section 3, we present a conceptual design of our approach that shows its feasibility. Section 4 introduces several use case scenarios and presents how human intelligence tasks can be described using the MechSwarm Task Language. We discuss current and future issues in Section 5. Section 6 presents related work. In Section 7, we conclude the paper. Finally, Appendix A includes basic Twitter terminology as a reference.
Copyright © 2012 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors. CrowdSearch 2012 workshop at WWW 2012, Lyon, France.

2. MECHSWARM FRAMEWORK
First, we introduce the key concepts of our proposed framework, MechSwarm. We borrow some terminology from Amazon's Mechanical Turk [1] and extend it in order to explain our approach.

Human Intelligence Task (HIT): A fine-grained task such as "Is this a picture of the Golden Gate Bridge?" or "Is the video segment about sports or technology?", which can easily be performed by humans and can be rendered as part of a Web-based user interface.

Contributor: A human being who is willing and able to perform a HIT. Each contributor also has a Twitter account that is used to receive a HIT request and to reply with the solution. This is equivalent to the concept of a worker in Amazon's Mechanical Turk terms, but we rather use the term contributor, since it is a more general concept; for example, volunteers who do not expect a monetary payment for performing a given HIT are also considered contributors.

Requesters: The individuals or organizations that need a set of HITs to be done. Each requester has a Twitter account that is used to send HIT requests and receive HIT responses.

HIT Assignment: When a requester needs a particular HIT to be done, he uses MechSwarm to assign it to a candidate contributor who will perform the task.
More formally, we define a Human Computation system (HCOMP-system) as a triple (T, H, A), where

• T is called the problem and corresponds to a set of Human Intelligence Tasks (HITs),

• H is a set of human candidates to perform a task t (i.e., contributors), and

• A : T → H is a function that assigns each task t to a human A(t) ∈ H.

The solution to the problem T is denoted by Solution(T). Note that this definition does not impose any restriction on where task submission, assignment, and completion take place.

MechSwarm provides (i) the selection of candidate contributors H, (ii) a task assignment over this set (i.e., A), and (iii) an aggregation mechanism to compute the final solution of the problem, i.e., Solution(T).

Figure 1: MechSwarm Components and Workflow. (1) A requester defines a problem (i.e., the set of HITs T). Each HIT is defined using the MechSwarm Task Language. (2) The requester submits the HITs to the Human Computation Optimizer (HCO). The HCO identifies for each HIT a contributor from the requester's Twitter social graph (i.e., from his followers), and assigns him the HIT. (3) The HCO uses the Twitter social network to route the HIT request to the contributor. (4) The contributor receives the HIT, completes the assignment, and issues a HIT response to the requester. (5) The HIT-Solver component collects the HIT responses to the specified problem on behalf of the requester, and (6) computes the problem's final solution, i.e., Solution(T).

In the rest of this section we detail the different components of the framework and the system workflow.
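For concreteness, the HCOMP-system triple (T, H, A) defined above can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions: the class and function names are ours, and the round-robin assignment merely stands in for the social-proximity-based selection described later.

```python
from dataclasses import dataclass
from itertools import cycle

@dataclass(frozen=True)
class HIT:
    hit_id: str        # e.g. "favSearch"
    question: str
    choices: tuple     # optional list of allowed answers

def assign(tasks, contributors):
    """A : T -> H. A toy assignment that cycles through the candidate
    contributors; MechSwarm's HCO would use social proximity instead."""
    pool = cycle(contributors)
    return {t: next(pool) for t in tasks}

# T: the problem, a set of HITs
T = [HIT("q1", "Is this a picture of the Golden Gate Bridge?", ("yes", "no")),
     HIT("q2", "Is the video segment about sports or technology?",
         ("sports", "technology"))]
# H: candidate contributors (Twitter handles)
H = ["@alice", "@bob"]

A = assign(T, H)   # maps each task t to a human A(t) in H
```

Solution(T) would then be computed by the HIT-Solver from the responses to the assigned tasks, as described in the following subsection.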
2.1 Components and Workflow
The basic workflow of the system is shown in Figure 1. A requester begins by defining a problem (i.e., T) that can be split into several HITs easily tackled by humans. The HITs are expressed using MechSwarm's task language. The requester submits the problem to the Human Computation Optimizer, which selects a set of contributors as candidates to solve the HITs (i.e., H) and assigns to each of them a task to perform (i.e., A).

Each contributor completes the assigned HIT and sends back a response with the solution. The HIT-Solver collects the set of completed HITs and computes a final solution to the problem, i.e., Solution(T).

The MechSwarm Task Language, the Human Computation Optimizer, and the HIT-Solver are the fundamental components of the framework. They are specified as follows:

MechSwarm Task Language. The language used to specify the HITs and the basic protocol for message exchange.

Human Computation Optimizer (HCO). The component that manages HIT requests, contributor selection, and task assignment.

HIT-Solver. The component responsible for aggregating the completed HITs and computing a final solution.

In Figure 1, we can observe that the HCO and HIT-Solver components run on the requester's infrastructure, and not on a centralized system. As a consequence, requesters' profile information remains private and does not need to be disclosed to third parties.

In the next section we present an instance of how to realize these concepts.

3. CONCEPTUAL DESIGN
In this section we demonstrate the feasibility of the proposed decentralized framework. We present and discuss how each component can be realized.

3.1 MechSwarm Task Language
One crucial point in distributing tasks among many contributors is to make sure that they are familiar with the chosen language (i.e., protocol) for communicating with the framework. If the communication channel and the communication language are not coordinated, any human computation is in vain. To this end we propose a response format that is short, simple to use, familiar to Twitter users, and customizable. The basic format is a tweet containing a Twitter mention to the framework, followed by the identification of a HIT, followed by the choices from a list of possible answers. For basic Twitter terminology, please refer to Appendix A.

In Table 1 we list the predefined character delimiters used in the MechSwarm Task Language. Note that the only character that is not customizable is the Twitter reserved symbol (@), used for mentioning. In the end, a request and a response should be formatted as follows:

• Request Template:
@<contributor> <question> #<HIT-ID>?<choice-1>&<choice-2>&...&<choice-n>!

• Response Template:
@<requester> #<HIT-ID>? <choice-1>&<choice-2>&...&<choice-n>!

Function              Character
Question ID start     #
Question ID end       ?
Parameter delimiter   &
Parameter terminator  !

Table 1: List of reserved characters of the MechSwarm Task Language used to identify each part of the message. These reserved characters are configurable by the developer.

Please note that the list of choices in the request is optional. Furthermore, observe that complex and massive task definitions require additional software tools, e.g., to select from a database the set of questions to be asked in a questionnaire, but the basic idioms presented in this section can even be input directly by the requesters.

The request and response length is restricted to 140 characters, given the message limit imposed by Twitter. Concrete examples of HITs specified using the MechSwarm Task Language can be found in Section 4.

3.2 Human Computation Optimizer (HCO)
The Human Computation Optimizer (HCO) is a core component in charge of managing HIT requests, contributor selection, and task assignment. The HCO exploits social proximity to assign HITs to contributors belonging to the requester's social graph.

We are exploring more sophisticated methods for contributor selection and task assignment; in particular, we want to (i) automatically identify the nature and semantics of the problem (e.g., its HITs), and (ii) learn and maintain a contributor profile in order to optimize the task assignments according to the capabilities of each contributor.

3.3 HIT-Solver
We consider each HIT as part of a problem. The solution of the problem does not only imply solving each HIT, but also producing an aggregated result, or a meaningful combination of the output produced by the individual HITs. For example, in order to translate into Spanish a text document written in German, we could split the document into paragraphs, and then create and assign a HIT to a contributor requesting the translation of each of them. The final solution corresponds to the result of each HIT plus the ordering of the translated paragraphs with respect to the original document.

The final step of computing the aggregated solution of a problem is performed by the HIT-Solver. The framework is prepared to listen to all tweets that mention the account (@MechSwarm) and, if required, to acknowledge the received response. Additionally, the HIT-Solver computes the final solution based on the HIT responses received. The framework logs all responses received, in order not to process them more than once.

4. USE CASES
We reserve this section to expose a list of use cases (UC), encompassing several human computation tasks, that can be effectively solved using our framework. We use the Twitter account "MechSwarm" in our discussion below.

UC1: Pairwise Comparisons. The contributor is asked to choose between two alternatives.

Request: "@Contributor Which one is your favorite search engine? #favSearch?Google&Yahoo!"

Response: "@MechSwarm #favSearch? Yahoo!"

UC2: Sound Verification. The framework can be used to confirm results from unsupervised methods, such as automatic tagging of images, videos, or sounds.

Request: "@Contributor Is http://example.com/sound.mp3 a bird? #soundHIT?yes&no!"

Response: "@MechSwarm #soundHIT? yes!"

UC3: Image Tagging. Yet another application is to provide means for contributors to add correct, human-judged metadata to resources.

Request: "@Contributor Tag image http://example.com/picture.png #tagImageHIT?"

Response: "@MechSwarm #tagImageHIT? dog&animal&nature!"

UC4: Near-Duplicate Detection. For the task of video duplicate detection, the contributors could access a simple interface displaying two videos and two buttons ("yes" and "no"). Once the contributor clicks on one of the buttons, this triggers his Twitter account to post the formatted message understandable by the framework.

Request: "@Contributor Are these two videos the same? http://example.com/V1V2/ #matchVideoHIT?yes&no!"

Response: "@MechSwarm #matchVideoHIT? yes!"
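The request and response idioms above are simple enough to build and parse mechanically. The following sketch assumes the default delimiters of Table 1 (configurable in MechSwarm, hard-coded here for brevity); the function names are our own illustration, not part of the framework.

```python
import re

# Reserved characters from Table 1:
#   '#' question-ID start, '?' question-ID end,
#   '&' parameter delimiter, '!' parameter terminator.

def build_request(contributor, question, hit_id, choices=()):
    """Format a HIT request tweet; the list of choices is optional."""
    msg = f"@{contributor} {question} #{hit_id}?"
    if choices:
        msg += "&".join(choices) + "!"
    return msg

def parse_response(tweet):
    """Extract (requester, hit_id, answers) from a response tweet,
    or return None if the tweet does not follow the protocol."""
    m = re.match(r"@(\w+)\s+#(\w+)\?\s*(.+?)!\s*$", tweet)
    if not m:
        return None
    requester, hit_id, body = m.groups()
    return requester, hit_id, body.split("&")

req = build_request("Contributor",
                    "Is http://example.com/sound.mp3 a bird?",
                    "soundHIT", ("yes", "no"))
# req == "@Contributor Is http://example.com/sound.mp3 a bird? #soundHIT?yes&no!"
resp = parse_response("@MechSwarm #tagImageHIT? dog&animal&nature!")
# resp == ("MechSwarm", "tagImageHIT", ["dog", "animal", "nature"])
```

A production implementation would additionally enforce the 140-character limit and the acknowledgement and duplicate-logging behavior of the HIT-Solver described in Section 3.3.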
UC5: Translation. Requesters can post HITs that require sentences to be translated into a certain language.

Request: "@Contributor Translate to Portuguese: Hello world #translateHIT?"

Response: "@MechSwarm #translateHIT? Olá Mundo!"

5. DISCUSSION AND FUTURE WORK
Twitter aggregates millions of users that are interconnected through follower/followee ties. User interactions on Twitter, using mobile devices, open the opportunity to achieve large-scale human computation similar to that performed in centralized crowdsourcing systems, with the additional benefits of contributors' social ties and real-time information exchange.

Regarding the monetary motivation supported by crowdsourcing systems like Amazon Mechanical Turk, we think that alternative decentralized trade spaces for human computation are possible, where rewards and incentives for individuals do not necessarily involve a monetary payment for their contributions. Clear examples of such spaces exist that support our vision; Wikipedia, for instance, can be considered a massive human computation task of knowledge gathering, where the vast majority of contributors do not receive money for their efforts, but are motivated by the intrinsic rewards that come from the work achieved itself [5]. The decentralized framework we discuss here is flexible enough to also incorporate monetary rewards if required, but the contract would be established directly between the requester and the contributors, without any intermediaries.

Our work opens the door to interesting future directions. One interesting question is how to exploit plurality for error-resilient HIT solving. Additionally, one issue to be examined is how the public HIT responses from one contributor influence others. It is a reasonable assumption that any suggestion or recommendation before the execution of a task may bias its outcome; this should be empirically verified.

We plan to deploy a live implementation of our framework. We want to explore how more complex tasks can be solved using the basic idioms we proposed: is it possible to achieve the functionality provided by tools like TurKit [4] using our framework? In particular, we are interested in use case scenario UC1: Pairwise Comparisons, which is at the core of learning-to-rank and collaborative filtering algorithms, and which can be realized using a decentralized crowdsourcing workforce.

6. RELATED WORK
When talking about Human Computation, there are two main concepts that come to mind: crowdsourcing and Games With A Purpose (GWAP) [10]. Crowdsourcing is the act of gathering together the solutions produced by large groups of people for some specific task. Today the most prominent human computation application is Amazon Mechanical Turk, a marketplace for crowdsourcing. Amazon Mechanical Turk works as a platform to coordinate humans to perform simple tasks that computers usually cannot, in exchange for monetary rewards.

Games With A Purpose, or human computation games, exploit the idea of having human players compete in solving problems. Many domains have profited from the GWAP approach, mostly the annotation of images and music [12, 3], and also the collection of common-sense facts [11, 7].

Another great example that exploits Human Computation is the reCAPTCHA project (google.com/recaptcha), which provides a captcha service that is primarily used to identify whether it is a human accessing some online content, and at the same time collects feedback to correct words in digitized books that optical character recognition (OCR) programs fail to recognize with certainty.

First, like Amazon Mechanical Turk, we propose a framework that defines workflows and terminology for modeling human computation tasks. Second, from GWAPs we share the motivational power embedded in games and social networks to leverage the distribution and completion of tasks. Lastly, like reCAPTCHA, instead of forcing users to search for tasks, we bring the tasks to the users, using Twitter's nature of pushing notifications.

Like a crowdsourcing marketplace, our goal is to support Human Computation, but we propose a decentralized approach based on Twitter's architecture and social graph, which is quite different from the aforementioned works.

7. CONCLUSION
In this paper, we introduced a decentralized human computation framework, MechSwarm, that exploits Twitter's social network to harness the problem-solving power of human intelligence.

The framework does not require a centralized system to manage task submission, assignment, and completion, and it has the potential to empower individuals and organizations to distribute tasks across large numbers of human contributors over Twitter's social graph. We presented a conceptual design and explored the feasibility of our approach in the light of several use cases.

We envision that decentralized architectures for human computation will emerge as viable alternatives to well-established crowdsourcing services. Our approach is a small step towards realizing this vision.
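One future direction raised in Section 5, exploiting plurality for error-resilient HIT solving, could start from assigning the same HIT redundantly and taking a simple majority vote over the responses. The following is a minimal sketch under our own assumptions (the function name and the redundancy threshold are not part of the framework):

```python
from collections import Counter

def aggregate_by_plurality(responses, min_votes=3):
    """Sketch of error-resilient HIT solving via redundancy: the same
    HIT is assigned to several contributors, and the most frequent
    answer wins. `responses` maps hit_id -> list of collected answers."""
    solution = {}
    for hit_id, answers in responses.items():
        if len(answers) < min_votes:
            continue  # not enough redundancy yet; keep waiting
        answer, votes = Counter(answers).most_common(1)[0]
        solution[hit_id] = answer
    return solution

votes = {"soundHIT": ["yes", "yes", "no"],
         "matchVideoHIT": ["no"]}   # only one response so far
# aggregate_by_plurality(votes) -> {"soundHIT": "yes"}
```

How many redundant assignments are needed, and how public HIT responses bias later contributors, are exactly the empirical questions left open above.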
8. REFERENCES
[1] J. Barr and L. F. Cabrera. AI gets a brain. Queue, 4:24–29, May 2006.
[2] Edison Research. Twitter usage in America: 2010. http://www.edisonresearch.com/, 2010.
[3] E. Law, L. von Ahn, R. Dannenberg, and M. Crawford. TagATune: A game for music and sound annotation. In Proceedings of the 8th International Conference on Music Information Retrieval (ISMIR 2007), 2007.
[4] G. Little, L. B. Chilton, M. Goldman, and R. C. Miller. TurKit: Human computation algorithms on Mechanical Turk. In Proceedings of the 23rd Annual ACM Symposium on User Interface Software and Technology, UIST '10, pages 57–66, New York, NY, USA, 2010. ACM.
[5] M. Poppendieck. Unjust desserts? Better Software, pages 33–47, July/August 2004.
[6] D. Robertson and F. Giunchiglia. The social computer: Combining machine and human computation (DISI-10-036). Technical report, University of Trento, 2010.
[7] P. Singh, T. Lin, E. T. Mueller, G. Lim, T. Perkins, and W. L. Zhu. Open Mind Common Sense: Knowledge acquisition from the general public. In On the Move to Meaningful Internet Systems, pages 1223–1237, 2002.
[8] C. Taylor. Social networking 'utopia' isn't coming. CNN Tech. http://goo.gl/emF5j, June 2011.
[9] @twittereng. 200 million tweets per day. Twitter Blog. http://goo.gl/eybp0, June 2011.
[10] L. von Ahn. Games with a purpose. IEEE Computer, 39(6):92–94, 2006.
[11] L. von Ahn, M. Kedia, and M. Blum. Verbosity: A game for collecting common-sense facts. In CHI '06, pages 75–78, 2006.
[12] L. von Ahn, R. Liu, and M. Blum. Peekaboom: A game for locating objects in images. In CHI '06, pages 55–64. ACM Press, 2006.

APPENDIX
A. Basic Twitter Terminology

• Tweet: A message posted via Twitter containing 140 characters or fewer.

• @: The @ sign is used to call out usernames in tweets.

• Mention: Any Twitter update that contains @username anywhere in the body of the tweet.

• Follower: Another Twitter user who follows a specific account.

• Followee: A Twitter user that a specific account has chosen to follow.

• Lists: Curated groups of other Twitter users, used to tie specific individuals into a group on your Twitter account.

• Reply: A tweet posted in reply to another user's message, usually posted by clicking the "reply" button next to their tweet. Always begins with @username.

• Retweet: A tweet by another user, forwarded by someone else.