DBCrowd 2013: First VLDB Workshop on Databases and Crowdsourcing

Crowdsourcing to Mobile Users: A Study of the Role of Platforms and Tasks

Vincenzo Della Mea, Eddy Maddalena, Stefano Mizzaro
Department of Mathematics and Computer Science, University of Udine, Udine, Italy
vincenzo.dellamea@uniud.it, eddy.maddalena@uniud.it, mizzaro@uniud.it

ABSTRACT

We study whether the tasks currently proposed on crowdsourcing platforms are adequate to mobile devices. We aim at understanding both (i) which crowdsourcing platforms, among the existing ones, are more adequate to mobile devices, and (ii) which kinds of tasks are more adequate to mobile devices. Results of a user study hint that: some crowdsourcing platforms seem more adequate to mobile devices than others; some inadequacy issues seem rather superficial and can be resolved by a better task design; some kinds of tasks are more adequate than others; and there might be some unexpected opportunities with mobile devices.

Categories and Subject Descriptors

H.4.m [Information systems applications]: Miscellaneous

General Terms

Experimentation, Measurement.

Keywords

Crowdsourcing, mobile devices.

1. INTRODUCTION AND AIMS

Among the phenomena that are acquiring increasing importance in the information technology landscape, two are the subjects of this paper: (i) crowdsourcing, and (ii) mobile devices and applications.

Crowdsourcing, i.e., the outsourcing of tasks typically performed by a few experts to a large crowd as an open call, has been shown to be reasonably effective in many cases, like Wikipedia, the chess match of Kasparov against the world in 1999, and several others (see, e.g., [4] or even http://en.wikipedia.org/wiki/Crowdsourcing). Several crowdsourcing platforms (Amazon Mechanical Turk being probably the best known) have also appeared on the Web: they allow requesters to post the tasks they want to crowdsource and workers to perform those tasks for a small reward (usually a few cents).

Meanwhile, mobile devices (phones, smartphones, tablets, and in the near future glasses, watches, and so on) have become ubiquitous and are used to access the Web. According to several statistics, in the next few years there will be more Web accesses by mobile devices than by classical desktop/laptop computers (see, e.g., [6]).

In this paper we study the intersection of mobile and crowdsourcing. We aim at understanding whether the tasks currently proposed on crowdsourcing platforms are adequate to mobile devices. By "adequate" we mean that they can be performed effectively by using a mobile device in place of a desktop/laptop computer. We specifically seek to answer two research questions:

Q1 Which crowdsourcing platforms, among the existing ones, are more adequate to mobile devices?

Q2 Which kinds of tasks are more adequate to mobile devices?

Besides the above mentioned statistics on increasing mobile usage, this research is also justified by the fact that today people quite often access the Web on their mobile phones for short periods of time, for example while commuting to work by train or underground, while waiting for a bus or for a friend, while in a car (and not driving), while standing in a queue, etc. In other terms, there is plenty of human workforce available in bursts of a few minutes (or seconds), and this kind of workforce seems perfect for the crowdsourcing scenario, where the tasks are usually short and the reward is usually low. Moreover, some crowdsourcing tasks could be more adequate to a mobile scenario than to a classical desktop one: for example, taking pictures of some point of interest (like a monument, a painting, or a billboard), describing a real-life scene, or even recording movements, destinations, and trajectories in an urban traffic setting.
However, to fruitfully exploit this workforce, it is necessary that the platforms are adequate and the tasks are feasible. This consideration also underlies our choice of focusing on the worker side and neglecting the requester part.

The paper is structured as follows. In Section 2 we briefly survey the related work on mobile and crowdsourcing, trying to focus on the research involving both aspects. In Sections 3 and 4 we describe two experiments aiming at answering the two research questions above. In Section 5 we draw conclusions and sketch future developments.

2. RELATED WORK

Although commercial crowdsourcing platforms seem designed with a desktop/laptop user in mind, there has already been some work on the idea of having workers use mobile devices. We briefly survey it in this section.

Musthag and Ganesan [7] focus on the mobile micro-task market and present some statistics on mobile workers' behavior. The mCrowd platform [11] is an iPhone-based mobile crowdsourcing platform that enables mobile users to act as both requesters and workers, and focuses on tasks like geolocation-aware image collection, road traffic monitoring, etc., that exploit the rich array of sensors available on iPhones. Eagle [2] describes txteagle, a mobile crowdsourcing marketplace used in Kenya and Rwanda for tasks like translations, polls, and transcriptions.

Location-based distribution of tasks to mobile workers is proposed in [1], where some design criteria for mobile crowdsourcing platforms are also presented and discussed. A similar approach, focused on the specific domain of news reporting, is presented in [9]: SMS messages are used for location-based task assignment for crowdsourcing news.

Narula and colleagues [8] focus on low-end mobile devices and present MobileWorks, a platform for OCR tasks specifically aimed at users from the developing world. Experimental results demonstrate a high rate of task completion (120 per hour) and a high accuracy (99%). A similar approach is presented in [3], where the mClerk system is described. Some experimental results again witness the feasibility of the approach, and the viral diffusion of the system among workers is also discussed.

As a different approach, the CrowdSearch system, an image search service for mobile phones that relies on Amazon Mechanical Turk, is presented in [10]. It is interesting because, although it does not exploit a mobile crowd, it is an example of exploiting a crowd in (almost) real time.

3. EXPERIMENT 1

3.1 Aims

The first experiment aims to verify the suitability of existing crowdsourcing platforms for mobile devices (see question Q1 in Section 1). We asked the participants to estimate the difficulty of performing a task on both a mobile device and a desktop/laptop computer.

3.2 Participants

Sixteen participants were involved in the experiment. All of them were Italian students, aged between 16 and 30. We required a good knowledge of English and familiarity with computers and smartphones. Participants were randomly subdivided into 4 groups (U1, U2, U3, U4), each one containing four participants.

3.3 Data

We selected four among the most popular crowdsourcing platforms (see Table 1). We downloaded some randomly selected tasks from these platforms, for a total of 2717 tasks (the exact number for each platform is shown in the last column of Table 1). The download was performed in October and November 2012. The downloaded tasks are among those that can be performed by any worker, i.e., without any qualification. These are not huge samples: for example, on mTurk one can count hundreds of thousands of tasks available per month [5]. However, the samples are not negligible either, since they amount to around 1%-5%. For each task we extracted: identifier, title, required proof, remuneration, time needed, requester identifier, and description (a sketch of this record is given below, after the examples). The task collection is available upon request.

id      Platform name           URL                # of tasks
mTurk   Amazon Mechanical Turk  mturk.com          1154
micW    Micro Workers           microworkers.com   1302
minW    Minute Workers          minuteworkers.com  86
shortT  Short Task              shorttask.com      175

Table 1: Platforms

Three examples of tasks in our collection are (errors included):

• Task example 1:
  1. Go to http://goo.gl/Dlzk
  2. Click the link to go to the download
  3. Complete a survey/offer on Sharecash and download the file
  4. Send proof

• Task example 2:
  1. Go to http://OneDollarRiches.com/5737
  2. Click on Join Now button
  3. Invest 1 dollar by logging in into your Alertpay account
  4. After that enter you personal details and login.
  5. Join and finish signing up
  While Sign up use same e-mail of your Alertpay account. because when u make ur refferaf there 1$ sing up go direct into ur alterpay account.

• Task example 3: Find the details for this Restaurant
  - For this restaurant below, enter the details below
  - You must confirm that the restaurant is still open
  - Include the full address, e.g. http://www.thecheesecakefactory.com
  - Do not include URLs to city guides and listings like Citysearch
  Restaurant: Akasha Organics 160 North Main St. Ketchum
  Fill in the text fields with this information: Still open, Restaurant name, Website Address, Phone number, Street Address, City, State, Zip code.
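As an aside, the per-task record extracted above can be represented by a simple data structure (a minimal Python sketch; the field names are ours and do not correspond to any platform API):

    from dataclasses import dataclass

    @dataclass
    class Task:
        """One downloaded task, with the fields extracted in Section 3.3."""
        task_id: str          # platform-specific identifier
        title: str
        required_proof: str   # what the worker must submit as evidence
        remuneration: float   # reward, usually a few cents
        time_needed: int      # allotted time in seconds (assumed unit)
        requester_id: str
        description: str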
3.4 Methods

We randomly extracted 48 tasks, 12 from each platform, and divided them into 4 groups (T1, T2, T3, T4). Each group contains 12 tasks (3 tasks from each of the 4 platforms). Task group Ti was assigned to user group Ui (e.g., task group T1 was assigned to user group U1). We developed a web application to show each participant the group of 12 tasks assigned to his/her user group (see Figure 1). By using this application, each participant recorded two estimates of difficulty for each task, one for a desktop and one for a mobile device (see the bottom part of the figure). Tasks were presented in random order and participants did not know from which platform the tasks were extracted.

Difficulty was expressed on a seven-point scale ranging from trivial to impossible. For each task we therefore obtained 4 estimates (from the participants in the same group). We then converted the labels into the [0..6] range and calculated the average of the difficulty estimates, as sketched below.
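For concreteness, the conversion and averaging step can be expressed as follows (a minimal Python illustration with names of our own choosing; the five intermediate labels are assumed, since only the endpoints of the scale are named above):

    # Map the seven-point verbal scale onto 0..6 and average the four
    # per-task estimates, as described in Section 3.4.
    SCALE = ["trivial", "very easy", "easy", "medium",   # intermediate labels
             "hard", "very hard", "impossible"]          # are assumed here

    def average_difficulty(labels):
        """Average difficulty of one task on one device (four verbal estimates)."""
        return sum(SCALE.index(label) for label in labels) / len(labels)

    # Example: average_difficulty(["easy", "medium", "easy", "hard"]) -> 2.75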
Figure 1: The interface used in the first experiment (translated into English)

3.5 Results

Figure 2 shows the average estimated difficulty, on desktop and mobile, for each platform. Tasks from mTurk are estimated as slightly more difficult than those from MicroWorkers, MinuteWorkers, and ShortTask.

[Figure 2: Estimated difficulty (bar chart; difficulty on the y-axis, platforms mTurk, micW, minW, shortT on the x-axis; paired bars for desktop and mobile)]

The difference of the difficulty estimates between desktop and mobile is shown in Figure 3: difficulty estimates are consistently higher on mobile devices, both in absolute terms and as a percentage of the desktop difficulty.

[Figure 3: Mobile-desktop difference of estimated difficulty, as absolute difference (bars on the left) and as a percentage of the desktop difficulty (right)]

By manually analyzing the task collection we realized that some tasks are inadequate to mobile devices for a few typical reasons:

• too long a description;
• technical obstacles like scrolling problems, unsupported audio formats and/or plugins, pages with Adobe Flash, etc.;
• use of the frame attribute in HTML pages;
• bad layout on a small-resolution display;
• need for a high-power CPU.

Some of these issues seem due to the task content, while others depend on how the Web interface is realized. Many of them seem rather superficial and can be overcome by a better task design and/or better user interfaces; a simple check along these lines is sketched below.
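As an illustration only (this checker is ours, not part of the study), the more mechanical of the reasons above could be detected automatically from a task's description and HTML page:

    import re

    # Hypothetical heuristics for flagging mobile-unfriendly tasks, based on
    # the reasons listed in Section 3.5; the threshold is an arbitrary example.
    MAX_DESCRIPTION_LENGTH = 1000  # characters

    def mobile_unfriendly_reasons(description, html):
        """Return the list of reasons why a task may be hard on mobile."""
        reasons = []
        if len(description) > MAX_DESCRIPTION_LENGTH:
            reasons.append("too long description")
        if re.search(r"<(frame|frameset|iframe)\b", html, re.IGNORECASE):
            reasons.append("uses frames")
        if re.search(r"\.swf\b|shockwave-flash", html, re.IGNORECASE):
            reasons.append("requires Adobe Flash")
        return reasons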
4. EXPERIMENT 2

4.1 Aims

The aim of the second experiment is to identify which kinds of tasks are more adequate for mobile devices (see question Q2 in Section 1). We therefore now focus on task features, not on platforms. Also, instead of asking participants for estimates, we required them to actually perform the tasks on both desktop and mobile devices, and we measured the time spent on each task. Participants used two prototype platforms that we built ad hoc for the experiment: one for desktop devices, using Google Web Toolkit, and the other specifically made for mobile devices, by means of an Android application. Figure 4 shows the resulting user interfaces.

Figure 4: The interface used in the second experiment: desktop (left) and mobile (right)

4.2 Participants and Data

The 16 participants (the same as in the previous experiment) were subdivided into 4 groups labeled U1, U2, U3, U4.

To identify the kinds of tasks in a somewhat objective way, we relied on the task categories usually requested in crowdsourcing marketplaces. More in detail, we started from the 11 categories suggested by Amazon Mechanical Turk when creating a new task (see https://requester.mturk.com/create/projects/new): Categorization, Data Collection, Moderation of an Image, Sentiment, Survey, Survey Link, Tagging of an Image, Transcription from A/V, Transcription from an Image, Writing, and Other. To obtain a manageable number of categories in our experiment, we excluded 5 Mechanical Turk categories: Data Collection, Survey and Survey Link (considered somehow similar to Sentiment), Transcription from A/V (to avoid technical issues on mobile devices), and Other. We therefore selected the 6 task categories shown in Table 2. Then we created 4 new tasks for each category, for a total of 24 tasks, and grouped them into four task groups (labeled Ta, Tb, Tc, Td), each group containing six tasks, one from each category.

Id   Category                     Description
Cat  Content categorization       Some images are proposed to the worker, who is required to assign each of them to the correct category.
Mod  Moderation of an image       The worker is required to flag adult content pictures that are inappropriate for children.
Sen  Sentiment                    Some sentences are proposed to the worker, who is required to record his agreement by means of a Likert scale.
ImT  Image tagging                Some images are proposed to the worker, who is required to tag each of them with keywords.
Tra  Transcription from an image  The worker is required to extract and write the textual content from a picture.
Wri  Writing                      The worker is required to write a short text about a specific topic.

Table 2: Task categories

Using artificial tasks (i.e., tasks created by ourselves) allowed us to remove any platform bias and the issues discussed at the end of Section 3.5, which might have affected the results. Also, their classification was easier (sometimes it is not clear how to classify real tasks). Finally, this allowed us to create task descriptions written in Italian, thus removing any language issue from the experiment (all participants were Italian native speakers). The created tasks are in all respects similar to real tasks.

4.3 Methods

We took the usual special care to avoid any order and learning bias. Each participant performed 6 tasks (one for each of the categories in Table 2) on the desktop platform and 6 other tasks (again, one for each category) on the mobile one. His/her tasks were selected from two task groups, depending on the user group the participant was assigned to. To further avoid bias, participants in each group alternately started from desktop or from mobile. Therefore, each participant performed a total of 12 different tasks, half on desktop and half on mobile, and each task was performed by 8 participants in two user groups, half of whom performed it on mobile and half on desktop.

Statistics were calculated as follows: first, the average time needed for task completion was calculated for each task, separately for mobile and desktop performance (i.e., averaged over 4 subjects each); then, category averages were calculated from the task averages, again separately for mobile and desktop devices. A sketch of this two-level averaging is given below.
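For clarity, the two-level averaging just described can be sketched as follows (a minimal Python illustration with our own names; it is not the actual analysis code):

    from statistics import mean

    # timings[(task_id, device)] is the list of completion times in seconds
    # recorded for one task on one device (4 participants each, Section 4.3).
    def category_average(timings, category_tasks, device):
        """Average completion time for a category on one device:
        first average each task over its participants, then average
        the per-task means over the tasks of the category."""
        return mean(mean(timings[(task, device)]) for task in category_tasks)

    # Example: category_average(timings, ["wri1", "wri2", "wri3", "wri4"], "mobile")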
4.4 Results

Figure 5 shows the average time to complete a task, for each category and on both mobile and desktop devices. Figure 6 shows the differences in average time to complete.

[Figure 5: Average time to complete for each task category on both mobile and desktop devices (bar chart; time in seconds on the y-axis, categories Cat, Mod, Sen, ImT, Tra, Wri on the x-axis)]

[Figure 6: Mobile-desktop differences in average time to complete for each task category, as absolute time (bars on the left) and as a percentage (right)]

Some tasks are quicker: Cat, Mod, and Sen required less than one minute on average, on both desktop and mobile. ImT and Tra are a bit longer, between one and two minutes on average, and Wri is even longer. As expected, all tasks are faster on desktop, with the only exception of Wri: there, the participants autonomously decided to use the voice-to-text functionality when on mobile, and this turned out to be quicker than writing with a keyboard (although we did not investigate the quality of the transcription). As highlighted in Figure 6, ImT and Tra show a higher mobile-desktop difference, both in absolute time and in percentage, probably because they require entering multiple texts in several fields, a cumbersome activity when carried out on mobile.

Looking at the percentage differences in Figure 6, one can notice that the small absolute difference of Cat is actually quite high in percentage: since Cat tasks are quite short (as can be seen in Figure 5), even a small absolute difference is important in percentage terms. Conversely, looking at the two rightmost bars, the percentage difference for Wri looks smaller than the absolute time difference; this is again due to the average length of the Wri tasks, which is quite high (see Figure 5). Still, the improvement on mobile is substantial, being around 20%.

5. CONCLUSIONS AND FUTURE WORK

The work described in this paper is a first exploration of the opportunities and challenges of outsourcing tasks to a mobile crowd. Results provide preliminary evidence of the inadequacy of current crowdsourcing platforms for mobile devices, even if task complexity would be adequate for being carried out in mobile scenarios. More in detail, results are fourfold:

• Experiment 1 results show that, according to user perception of difficulty, some crowdsourcing platforms might be slightly more adequate to mobile devices than others.

• Some inadequacy issues seem rather superficial and can be resolved by a better task or interface design.

• Experiment 2 shows that tasks of different kinds, as defined by mTurk categories, might present different difficulties when carried out on desktop or on mobile devices. This might hint at a first specialization of task assignment, although examining the features of easy and difficult tasks might provide a better ad-hoc specialization, perhaps even independent of the kind of task.

• Experiment 2 also confirms that mobile devices might offer some unexpected opportunities, like the voice-to-text solution, unexpected (by us) and autonomously adopted by participants.

We carried out two separate experiments, although sharing subjects, in order to study two different aspects of mobile crowdsourcing: crowdsourcing platform effects and task category effects. The experiments are preliminary and the results are not final, but this is consistent with our aims, which were to begin to study the general issue of mobile crowdsourcing. This exploratory attitude is also a motivation for having the two experiments performed with different methodologies (asking the participants for an estimate of difficulty, and having participants perform the actual tasks). Of course, these experiments, or similar ones, could have been run by means of some crowdsourcing platform themselves. We preferred a more traditional approach and started with classical user studies, but we do plan to do that in the future.

To further develop this work, other experiments can be imagined. For example, the same experiments described here could be repeated in real-world scenarios (on the train, on the road, in school rooms, or in crowded places) to obtain more realistic results. It is also feasible to imagine an extended crowdsourcing platform that, on the basis of the context of a worker (time, date, geolocation, habits and preferences, mobile device sensors, etc.), automatically filters and selects tasks tailored for that specific context; a sketch of such a filter is given below.
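Purely as an illustration of this idea (no such platform exists in the paper; the context fields and selection rules are invented examples), a context-based task filter might look like:

    # Hypothetical sketch of context-aware task filtering; task and context
    # are plain dictionaries here, not a real platform API.
    def filter_tasks(tasks, context):
        """Keep only the tasks suited to the worker's current context."""
        selected = []
        for task in tasks:
            # Long tasks do not fit short idle-time bursts (e.g., commuting).
            if task["expected_seconds"] > context["available_seconds"]:
                continue
            # Location-bound tasks (e.g., photographing a monument) make
            # sense only if the worker's position is known.
            if task.get("needs_location") and not context.get("gps_available"):
                continue
            # Audio tasks are unsuitable in noisy public places.
            if task.get("needs_audio") and context.get("noisy_environment"):
                continue
            selected.append(task)
        return selected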
6. REFERENCES

[1] F. Alt, A. S. Shirazi, A. Schmidt, U. Kramer, and Z. Nawaz. Location-based crowdsourcing: extending crowdsourcing to the real world. In Proceedings of the 6th Nordic Conference on Human-Computer Interaction: Extending Boundaries, NordiCHI '10, pages 13-22, New York, NY, USA, 2010. ACM.

[2] N. Eagle. txteagle: Mobile crowdsourcing. In Proceedings of the 3rd International Conference on Internationalization, Design and Global Development: Held as Part of HCI International 2009, IDGD '09, pages 447-456, Berlin, Heidelberg, 2009. Springer-Verlag.

[3] A. Gupta, W. Thies, E. Cutrell, and R. Balakrishnan. mClerk: enabling mobile crowdsourcing in developing regions. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '12, pages 1843-1852, New York, NY, USA, 2012. ACM.

[4] J. Howe. Crowdsourcing: Why the Power of the Crowd Is Driving the Future of Business. Random House Inc., 2008.

[5] P. G. Ipeirotis. Analyzing the Amazon Mechanical Turk marketplace. XRDS, 17(2):16-21, Dec. 2010.

[6] M. Meeker and L. Wu. Internet Trends, D11 Conference (the annual Internet Trends Report), 2013. http://www.slideshare.net/kleinerperkins/kpcb-internet-trends-2013.

[7] M. Musthag and D. Ganesan. Labor dynamics in a mobile micro-task market. In W. E. Mackay, S. A. Brewster, and S. Bødker, editors, CHI, pages 641-650. ACM, 2013.

[8] P. Narula, P. Gutheim, D. Rolnitzky, A. Kulkarni, and B. Hartmann. MobileWorks: A mobile crowdsourcing platform for workers at the bottom of the pyramid. In Proc. HCOMP '11, 2011.

[9] H. Väätäjä, T. Vainio, E. Sirkkunen, and K. Salo. Crowdsourced news reporting: supporting news content creation with mobile phones. In Proceedings of the 13th International Conference on Human Computer Interaction with Mobile Devices and Services, MobileHCI '11, pages 435-444, New York, NY, USA, 2011. ACM.

[10] T. Yan, V. Kumar, and D. Ganesan. CrowdSearch: exploiting crowds for accurate real-time image search on mobile phones. In MobiSys '10: Proceedings of the 8th International Conference on Mobile Systems, Applications and Services, pages 77-90. ACM Press, 2010.

[11] T. Yan, M. Marzilli, R. Holmes, D. Ganesan, and M. Corner. mCrowd: a platform for mobile crowdsourcing. In Proceedings of the 7th ACM Conference on Embedded Networked Sensor Systems, SenSys '09, pages 347-348, New York, NY, USA, 2009. ACM.