=Paper=
{{Paper
|id=None
|storemode=property
|title=Crowds, not Drones: Modeling Human Factors in Interactive Crowdsourcing
|pdfUrl=https://ceur-ws.org/Vol-1025/vision1.pdf
|volume=Vol-1025
|dblpUrl=https://dblp.org/rec/conf/dbcrowd/RoyLTAD13
}}
==Crowds, not Drones: Modeling Human Factors in Interactive Crowdsourcing==
''DBCrowd 2013: First VLDB Workshop on Databases and Crowdsourcing''

'''Senjuti Basu Roy''' (UW Tacoma), '''Ioanna Lykourentzou''' (CRP Henri Tudor / INRIA Nancy Grand-Est), '''Saravanan Thirumuruganathan''' (UT Arlington, QCRI), '''Sihem Amer-Yahia''' (CNRS, LIG), '''Gautam Das''' (UT Arlington, QCRI)

senjutib@uw.edu, ioanna.lykourentzou@{tudor.lu,inria.fr}, saravanan.thirumuruganathan@mavs.uta.edu, sihem.amer-yahia@imag.fr, gdas@uta.edu

===Abstract===

In this vision paper, we propose SmartCrowd, an intelligent and adaptive crowdsourcing framework. Contrary to existing crowdsourcing systems, in which the processes of hiring workers (the crowd), learning their skills, and evaluating the accuracy of the tasks they perform are fragmented, siloed, and often ad hoc, SmartCrowd foresees a paradigm shift that accounts for the unpredictability of human nature, namely human factors. SmartCrowd offers opportunities to make crowdsourcing intelligent through iterative interaction with the workers, adaptively learning and improving the underlying processes. Both existing crowdsourcing applications (the majority of which do not require long engagement from volatile and mostly non-recurrent workers) and next-generation applications (which require longer engagement from the crowd) stand to benefit from SmartCrowd. We outline the opportunities SmartCrowd opens up, and discuss challenges and directions that can potentially revolutionize the existing crowdsourcing landscape.

===1. Introduction===

Crowdsourcing systems have gained popularity in a variety of domains. Common crowdsourcing scenarios include data gathering (asking volunteers to tag a picture or a video), document editing (as in Wikipedia), opinion solicitation (asking foodies to summarize their experience at a restaurant), and collaborative intelligence (asking residents to match old city maps). The action of each worker involved in crowdsourcing can be viewed as an approximation of ground truths. In the examples above, the truth could be a complete set of tags describing a picture, a Wikipedia article, or an exhaustive opinion on a restaurant. Truth can be objective (a single ground truth) or subjective, where there may be different truths for different users (e.g., youngsters tend to like fast-food restaurants while young professionals may not; photography professionals tend to prefer tags reflecting photo quality as opposed to photo content). In this paper, we are interested in the question of harnessing the crowd to approximate truth(s) effectively and efficiently while taking into account the innate uncertainty of human behavior, termed human factors.

'''Crowdsourcing today:''' Existing systems are built on top of private or public platforms such as Mechanical Turk, TurKit, Mob4hire, uTest, Freelancer, eLance, oDesk, Guru, Topcoder, Trada, 99designs, Innocentive, CloudCrowd, and CrowdFlower [3]. Tasks are typically small, independent, and homogeneous, carry minor incentives, and do not require long engagement from workers. Similarly, the crowd is typically volatile, arriving and departing asynchronously, with varying levels of attention and accuracy.

'''Limitations of current approaches:''' There are two primary limitations of current crowdsourcing approaches. The first is the separation and non-optimization of the underlying processes in a dynamic environment. The second is the omission of human factors when designing an optimized crowdsourcing solution. While recent research investigates some of the optimization aspects, those aspects are not studied in conjunction with human factors.

Three major processes are involved in the task of ground-truth approximation: worker skill estimation, worker-to-task assignment, and task accuracy evaluation. Most current commercial crowdsourcing systems (a survey of which can be found in [3]) either do not offer algorithmic optimization, or do so partially and in isolation.
Pre-qualification tests, the use of golden-standard data, and hiring workers based on past performance are the norm. Task assignment is completely open and allows self-appointment by workers, thus undermining quality (workers prefer to increase their individual profit over accomplishing qualitative tasks). Worker wages are often pre-determined and fixed per task, oblivious to the quality of the actual pool of workers who undertake the task. Recent research takes on some of the challenges left unsolved by commercial platforms, proposing active learning strategies for task evaluation [10, 1, 7], the task assignment process [5], and adjusting worker wages according to skills [11]. However, these works (i) focus on a specific crowdsourcing application type (mostly real-time crowdsourcing with highly volatile crowds), thus losing generality, and (ii) focus on the algorithmic optimization of some but not all of the involved processes (e.g., skill learning, or wage determination, or task assignment).

A more critical limitation is the omission or inadequate incorporation of the uncertainty stemming from human factors into the design of the crowdsourcing optimization algorithm. Algorithmic solutions rely on simple, idealized models (e.g., known worker skills or steady worker performance). A recent work [8] proposes probabilistic worker skill estimation models based on workers' past performance, considering potential deviations in worker performance. Another recent work studies the egoistic, profit-oriented objectives of individual workers in order to incentivize them (e.g., by properly adjusting wages) and thereby calibrate algorithms that approximate the ground truth of the crowdsourcing task [2]. The benefit of explicit feedback and information exchange between workers has been studied [4, 6] as a way to improve worker self-coordination, but no existing research incorporates these aspects in a dynamic and interactive environment, nor are there optimized solutions for ground-truth discovery that consider human factors.

'''Opportunities:''' Future crowdsourcing systems therefore need to, first, treat crowdsourcing not as a set of optimization silos but as an adaptive optimization problem, seamlessly handling the three main crowdsourcing processes (worker skill estimation, task assignment, task evaluation). Second, and equally important, the uncertainty stemming from human factors needs to be quantified and incorporated into the design of any future algorithm that seeks to optimize this adaptive crowdsourcing problem. For example, the estimation of every worker parameter that can be influenced by uncertainty needs to be incorporated into the design of the crowdsourcing optimization process. The planning horizon and the optimization boundaries of any algorithm applied to facilitate crowdsourcing consequently need to be determined with this uncertainty in mind. New challenges arise from these two opportunities: adopting a seamless crowdsourcing process, and incorporating uncertainty into it.

In summary, crowdsourcing has transitioned from a research tool into a research topic in its own right. Sooner or later, database researchers will have to confront the issues arising from hybrid processing involving humans and computers. The uncertainties due to human factors in crowdsourcing are very different from traditional uncertainty, such as that in probabilistic databases [9]. SmartCrowd envisions crowdsourcing as an adaptive process where human factors are given the significance they deserve. Further, we introduce a mechanism of crowd-indexing by which workers are organized into groups. Such indices are triggered by human factors, are dynamically maintained, and provide an efficient way to search for workers.

===2. Our Vision===

We propose to rethink crowdsourcing as an adaptive process that relies on an interactive dialogue between the workers and the system in order to build and refine worker skills while tasks are being completed. In parallel, as workers complete more tasks, the system "learns" their skills more accurately, and this adaptive learning is used to dynamically assign tasks to workers in the next iteration, taking into account the intrinsic uncertainty of human behavior. Note that key to the success of these steps is knowledge of the ground truth, of which the system is oblivious (and which it wishes to discover) in the first place. The primary paradigm shift in SmartCrowd is in envisioning the process of ground-truth discovery as dynamic, adaptive, and iterative in discovering the skills required for tasks, evaluating the accuracy of completed tasks, learning the skills of involved workers, assigning tasks to workers, and determining the number of workers and the offered incentives, all while considering human factors. Interestingly, these intermediate objectives are often inter-dependent, and improving one improves the others. The overall objective of this adaptive process is to maximize accuracy and efficiency while reducing cost and effort.
===2.1 High-Level Architecture===

The primary distinction of our framework is its deliberate acknowledgement of the importance of human factors in crowdsourcing, and of how they guide each of our objectives in a dynamic environment. Further, we envision our framework maintaining an interactive dialogue with the workers to enable adaptive learning while the workers participate in crowdsourcing tasks. The first two dimensions we tackle are:

* '''"Who knows what"''': evaluating the contributions of workers and, based on that, estimating their skills with the least possible error (the skill learning process).
* '''"Who will be asked to contribute to what"''': by learning the skills required for tasks and estimating workers' skills, assigning tasks to workers (the task assignment process).

SmartCrowd functions as follows: workers enter the crowdsourcing platform and complete tasks. Many crowdsourced tasks require multiple skills. In the beginning, SmartCrowd holds no knowledge of the skills of newcomers. Furthermore, some required skills may be latent and initially unknown to SmartCrowd. As workers undertake and complete more tasks, SmartCrowd discovers latent skills, evaluates workers' contributions to the tasks, learns their skills, and therefore assigns appropriate tasks to the workers, which in turn achieves higher accuracy and improved efficiency. Moreover, this process is adaptive and iterative: worker skills are "learnt more accurately" and "used more appropriately" over time, ensuring gradual improvement.
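The paper gives no pseudocode for this loop; as a rough illustration only, the sketch below (our own construction, not the authors' design) wires the three processes together. The <code>assign</code>, <code>collect</code>, <code>evaluate</code>, and <code>update_skills</code> parameters are hypothetical placeholders for the processes named above.

<syntaxhighlight lang="python">
# Illustrative sketch of the adaptive loop described in Section 2.1.
# All callables are hypothetical stand-ins for SmartCrowd's processes.
from typing import Callable, Dict, List

def smartcrowd_loop(
    workers: List[str],
    tasks: List[str],
    assign: Callable[..., Dict[str, str]],          # task assignment process
    collect: Callable[..., Dict[str, object]],      # workers complete or decline tasks
    evaluate: Callable[..., Dict[str, float]],      # task evaluation process
    update_skills: Callable[..., Dict[str, float]], # skill learning process
    rounds: int = 10,
) -> Dict[str, float]:
    # Cold start: nothing is known about newcomers, so start from an
    # uninformed prior skill estimate for every worker.
    skills: Dict[str, float] = {w: 0.5 for w in workers}
    for _ in range(rounds):
        assignment = assign(workers, tasks, skills)
        submissions = collect(assignment)
        contributions = evaluate(submissions)
        skills = update_skills(skills, contributions)
    return skills
</syntaxhighlight>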
Figure 1 shows two primary functionalities that are improved adaptively in SmartCrowd: one depicting the learning of worker skills, and the other depicting the completion time of the (ground-truth discovery) tasks. More precisely, the steeper the skill estimation error curve, the faster we arrive at an accurate approximation of workers' skills, i.e., the faster we can profile workers with low error. There is also a moment in time when the approximation error in skill estimation becomes acceptable, marked in the figure with a dashed vertical line. Before that, the system is in a "cold start" phase and does not know "much" about the workers. Traditionally, this problem is tackled with uniform-prior assumptions, the spammer-hammer model, or the multi-dimensional wisdom of crowds, to bootstrap user skills [3]. After that, the framework continues to improve its knowledge of workers' skills and adaptively assigns tasks to workers in iterations, until the system determines that a stopping condition has been reached. Interestingly, faster minimization of the skill estimation error leads to earlier termination of the cold-start period (i.e., the dashed vertical line moves to the left), which opens up better opportunities in designing the task assignment process (the task assignment improvement area).

''Figure 1: Tradeoff between Skill Estimation Accuracy and Task Completion Efficiency. (Figure omitted: it plots a skill estimation error curve and a task completion accuracy curve over time; a dashed vertical line marks the end of the cold-start problem in skill estimation, separating the skill estimation improvement area from the task assignment improvement area, which culminates in optimal task assignment.)''

As skill estimation improves, task completion efficiency is also expected to improve, since the system can assign tasks to workers more intelligently. However, worker skill estimation is critically related to an accurate task evaluation process, i.e., to evaluating the accuracy of the tasks completed by the workers. In the absence of explicit ground truth, SmartCrowd resorts to uncovering the ground truth using the workers themselves. While this interactive process does not necessarily require longer engagement from the workers, it offers opportunities for improved learning. Therefore, the third and final dimension we tackle is:

* '''"Engaging workers explicitly to improve learning"''': how to further exploit the learned expertise of workers by engaging them explicitly in evaluating the skills of other workers, or by completing more tasks.

Most importantly, these dimensions in SmartCrowd are studied in conjunction with two key aspects that are exclusive to crowdsourcing: human factors and scale. The unpredictability and inconsistency of human behavior are deliberately part of the design of SmartCrowd. Additionally, SmartCrowd envisions the designed solutions to be scalable, i.e., tolerant to the size of the crowd and to its volatility. To the best of our knowledge, SmartCrowd is the first framework that considers these factors explicitly in crowdsourcing. Finally, SmartCrowd could be adopted inside existing systems, since it is designed assuming current crowdsourcing infrastructure.

In summary, to design accurate and efficient crowdsourcing, SmartCrowd relies on a formal modeling of the '''task evaluation''', '''worker skill estimation''', and '''task assignment''' processes, considering human factors and scale.

===3. Challenges and Directions===

While the opportunities foreseen in SmartCrowd are novel, the challenges in achieving them are exceptionally arduous. These challenges are further magnified by (1) human factors, which require the key challenges to be modeled and solved considering the unpredictability and inconsistency of worker behavior, worker volatility, and asynchronous arrival and departure; and (2) scale, which requires the solutions to be incremental and tolerant to the volatility of the crowd and its size. SmartCrowd proposes novel indexing opportunities and argues that human-factor-induced crowd-indexing provides a transparent way of achieving the objectives of SmartCrowd in conjunction with human factors and scale.

===3.1 Human Factors===

Human factors, a key distinction of SmartCrowd, relate to the uncertainty and non-deterministic nature of the behavior of human workers. For example, there is uncertainty regarding worker availability: workers can enter the crowdsourcing platform when they want, remain connected for as long as they like, and may or may not accept to make a contribution. In the same sense, there is uncertainty regarding the wage that workers may request: wages may vary from person to person, even among persons with the same profile in the system, and may also vary for the same person at different times, for example due to the person's workload and available time, but also due to unseen factors. Finally, uncertainty also applies to skills: the efficiency with which a person completes a task cannot be considered fixed and is rather uncertain; for example, it may decline with the person's previous workload, or it may depend on the offered wage or on the worker's motivation and personal engagement in the task.

The uncertainty stemming from human factors does not preclude designing a crowdsourcing solution with a global optimization target. What it does mean, however, is that instead of fixed parameter values, SmartCrowd needs to model the aforementioned dimensions with probabilities and confidence bounds (e.g., we cannot determine the "exact wage" of a person, but only an approximation with a certain deviation around a central wage value), and must be able to update these probabilities as workers complete more tasks.
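One plausible reading of "probabilities and confidence bounds" is a small per-worker Bayesian model. The sketch below is our own illustration, not the paper's model: task acceptance is tracked with a Beta distribution, and the requested wage with an incrementally updated mean and variance (Welford's online method); both update as the worker completes more tasks.

<syntaxhighlight lang="python">
# Illustrative sketch (our assumption, not the paper's model) of two
# uncertain worker parameters updated online from observations.
from dataclasses import dataclass

@dataclass
class WorkerModel:
    accept_a: float = 1.0   # Beta prior pseudo-count: offers accepted
    accept_b: float = 1.0   # Beta prior pseudo-count: offers declined
    wage_mean: float = 1.0  # current estimate of the requested wage
    wage_var: float = 0.0   # uncertainty around that estimate
    n_wages: int = 0

    def observe_offer(self, accepted: bool) -> None:
        """Update the acceptance model after each task offer."""
        if accepted:
            self.accept_a += 1.0
        else:
            self.accept_b += 1.0

    def observe_wage(self, wage: float) -> None:
        """Welford's online update of the wage mean and variance."""
        self.n_wages += 1
        delta = wage - self.wage_mean
        self.wage_mean += delta / self.n_wages
        self.wage_var += (delta * (wage - self.wage_mean) - self.wage_var) / self.n_wages

    @property
    def acceptance_rate(self) -> float:
        """Posterior mean probability that this worker accepts a task."""
        return self.accept_a / (self.accept_a + self.accept_b)
</syntaxhighlight>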
===3.2 Who Evaluates What, and How?===

Tasks submitted by workers need to be evaluated for accuracy. Interestingly, the process of evaluating completed tasks is tightly coupled with assessing each worker's contribution, which in turn helps in learning worker skills. A question, however, is: who evaluates what, and how?

A worker's contribution to a task can be evaluated in a fully automated and implicit way, by comparing submitted results against each other. In lieu of a known ground truth, a worker's contribution could be measured by computing the divergence of the contributions submitted thus far, using simple or weighted averages, majority voting, etc. More sophisticated models, such as multivariate data analysis, could also be used to approximate the ground truth. In all cases, implicit evaluation becomes effective when the acquired aggregated data approximates the unknown ground truth. A faster, more reliable, but costlier alternative is to explicitly designate some of the current workers as evaluators of submitted tasks. We envision a hybrid method instead: task evaluation is performed by combining the system's acquired intelligence with explicit human expertise. This requires complex modeling: 1) how to combine implicit and explicit evaluations, 2) when and how to hire explicit evaluators, and 3) how many explicit evaluators are required. In addition, human factors contribute multiple new parameters, such as 4) what the offered incentives should be, 5) how to model the inconsistent attention and arbitrary departure of explicit evaluators, and 6) how to compute all of this incrementally, as workers enter and exit asynchronously.
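As a concrete (and deliberately simple) instance of the implicit evaluation described above, the sketch below infers a task's answer by majority vote and scores each worker's contribution by agreement with it. The function name and the 0/1 scoring rule are our own assumptions; weighted averages or multivariate models would slot in the same place.

<syntaxhighlight lang="python">
# Illustrative sketch: implicit evaluation via majority voting, one of the
# aggregation schemes named in Section 3.2. Scoring rule is our assumption.
from collections import Counter
from typing import Dict, Hashable, Tuple

def implicit_evaluate(answers: Dict[str, Hashable]) -> Tuple[Hashable, Dict[str, float]]:
    """answers maps worker id -> submitted label for a single task."""
    inferred, _ = Counter(answers.values()).most_common(1)[0]  # majority answer
    # Contribution score: 1.0 if the worker agreed with the majority, else 0.0.
    contributions = {w: float(a == inferred) for w, a in answers.items()}
    return inferred, contributions

# Example: three workers label one image-moderation task.
label, scores = implicit_evaluate({"w1": "safe", "w2": "safe", "w3": "unsafe"})
# label == "safe"; scores == {"w1": 1.0, "w2": 1.0, "w3": 0.0}
</syntaxhighlight>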
===3.3 How to Estimate Worker Skills?===

Skill estimation pertains to learning worker skills accurately and effectively. In SmartCrowd, the output of task evaluation (i.e., a worker's contribution to each completed task) is used to estimate worker skills. The first challenge, therefore, is how to identify and quantify a skill set.

For many complex tasks, some skills may be latent. For example, in image moderation, the relevant skills might vary across images. In SmartCrowd, we envision learning such latent skills as the tasks are being executed by workers. Discovering a set of latent skills could be formulated as a structure learning problem in machine learning, with the objective of uncovering a multi-layer probabilistic model. Alternatively, the problem could be formulated over a fixed probabilistic model, with the objective of learning to perform inference over it. Unlike traditional machine learning problems, where the end objective is accurate prediction, one unique requirement of SmartCrowd is to make the discovered skills contextual and interpretable by the applications.

Irrespective of the specific algorithm used to quantify worker skills, additional modeling challenges involve: 1) determining the minimal number of tasks that workers (or certain groups of workers) need to complete before their skills can be estimated with high accuracy, considering that they may not behave consistently; 2) identifying the "stopping condition" for deciding whether a worker's skills have been estimated with adequate certainty; and 3) enabling fast and incremental computation of skills (using worker clustering or view maintenance) as new workers arrive. In addition, human factors cause further challenges, such as identifying the decline of skills (possibly due to boredom) and modeling how worker skills change over time.
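To make challenge 2), the "stopping condition", concrete: a minimal sketch, assuming a worker's skill is a single Beta-distributed accuracy parameter (our simplification; the paper envisions richer, multi-skill and latent-skill models). Estimation stops once the posterior standard deviation falls below a tolerance, and each update is O(1), matching challenge 3)'s incrementality requirement.

<syntaxhighlight lang="python">
# Illustrative sketch (our assumption): one scalar skill as a Beta posterior
# over task-evaluation outcomes, with an uncertainty-based stopping condition.
import math

class SkillEstimate:
    def __init__(self, prior_success: float = 1.0, prior_failure: float = 1.0):
        self.a = prior_success  # contributions evaluated as correct (+ prior)
        self.b = prior_failure  # contributions evaluated as incorrect (+ prior)

    def update(self, contribution_ok: bool) -> None:
        """Fold in one task-evaluation outcome (incremental, O(1))."""
        if contribution_ok:
            self.a += 1.0
        else:
            self.b += 1.0

    @property
    def mean(self) -> float:
        return self.a / (self.a + self.b)

    def std(self) -> float:
        """Standard deviation of the Beta(a, b) posterior."""
        n = self.a + self.b
        return math.sqrt(self.a * self.b / (n * n * (n + 1.0)))

    def estimated_well_enough(self, tolerance: float = 0.05) -> bool:
        """One possible stopping condition: posterior std below a threshold."""
        return self.std() < tolerance
</syntaxhighlight>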
===3.4 How to Assign Tasks to Workers?===

In SmartCrowd, we envision that workers are assigned to tasks based on the learned workers' skills and the remaining unfinished tasks. Interestingly, unlike traditional task assignment problems in project management, in SmartCrowd the workers' skills are unknown in the beginning, and the learned skills evolve as workers engage in more tasks, subject to the inconsistency and unpredictability of human factors.

In SmartCrowd, we model the assignment of tasks to workers as a probabilistic optimization problem, with the objective of maximizing accuracy, minimizing time, or probabilistically optimizing both at the same time. Furthermore, additional factors such as cost (money) could be considered. Several related questions (or constraints) need to be factored into this formulation as well: (1) what if a worker declines an assigned task; (2) can multiple tasks be allocated to the same worker; (3) in the case of multiple task allocations, does SmartCrowd suggest an ordering of the tasks to the worker; (4) during task assignment, does SmartCrowd need to assign tasks such that there are no idle workers; (5) is there an upper limit on the number of tasks that a single worker can be assigned in one iteration; and (6) how important is the system's benefit versus the worker's benefit, i.e., should the system optimize across tasks (exploit), or give newcomers opportunities to prove their skills (explore)?
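Question (6) is the classic explore/exploit tradeoff. A minimal epsilon-greedy sketch, assuming a single scalar skill estimate per worker (our simplification; a fuller model would condition skill on the task's required skills and fold in acceptance probability, wage, and availability), could look like this:

<syntaxhighlight lang="python">
# Illustrative sketch (not the paper's algorithm): epsilon-greedy assignment.
# Mostly exploit the highest-skill worker; occasionally explore a random one
# so newcomers get a chance to prove their skills.
import random
from typing import Dict, List

def assign_task(task: str,
                skill_means: Dict[str, float],
                epsilon: float = 0.1) -> str:
    """Return the id of the worker chosen for `task`."""
    workers: List[str] = list(skill_means)
    if random.random() < epsilon:
        return random.choice(workers)                 # explore: try a newcomer
    return max(workers, key=skill_means.__getitem__)  # exploit: best estimate

# Example: the newcomer "w3" (uninformed 0.5 prior) is still chosen roughly
# an epsilon fraction of the time, letting the system refine its estimate.
chosen = assign_task("tag-photo", {"w1": 0.92, "w2": 0.75, "w3": 0.5})
</syntaxhighlight>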
===3.5 Crowd-Indexing===

Crowdsourcing is an adaptive process: workers and tasks arrive asynchronously, and the system learns more about workers as they complete assigned tasks. Satisfying the key objectives of worker skill estimation, worker-to-task assignment, and task accuracy evaluation, while accounting for human factors at scale, necessitates the development of efficient search techniques. SmartCrowd proposes crowd-indexing to that end, where workers are organized and indexed into groups, and the indexes are dynamically maintained.

Interestingly, SmartCrowd demands new forms of indexing triggered by human factors, such as predictive skill estimates and task acceptance rates. These factors are dynamic and vary over time as workers undertake and complete more tasks. Efficiently determining the right group of workers for collaborative tasks is a key question when optimizing cost (time and money). Similarly, efficiently selecting explicit evaluator(s) for task evaluation could benefit tremendously from index design. In SmartCrowd, we envision incremental indexing strategies that are adaptive to this dynamic environment. In contrast to traditional database indexing, crowd-indexing is (a) on-demand indexing, where the notion of a query workload is akin to tasks arriving at different rates; (b) constrained indexing, with different objectives such as latency, budget, and worker skill diversity; and (c) alternate indexing, as it requires a fall-back option (due to the uncertainty of workers accepting a task).

===4. Conclusion===

In this paper, we developed a vision for intelligent crowdsourcing and presented our framework, SmartCrowd. In contrast to existing systems, SmartCrowd promotes iterative interaction with workers and the involvement of those workers beyond task completion (they are involved in evaluating each other's contributions), in order to adaptively learn and improve the processes of learning workers' skills and assigning tasks. Both existing applications (which do not require long engagement from a volatile and mostly non-recurrent workforce) and next-generation crowdsourcing applications (which require longer engagement from the crowd) could benefit from our vision. As discussed in this paper, increasing the intelligence of SmartCrowd comes with several hard challenges. SmartCrowd aims to be principled yet efficient in solving those challenges.

===References===

[1] R. Boim, O. Greenshpan, T. Milo, S. Novgorodov, N. Polyzotis, and W. C. Tan. Asking the right questions in crowd data sourcing. In ICDE, pages 1261-1264, 2012.

[2] R. Cavallo and S. Jain. Efficient crowdsourcing contests. In AAMAS, pages 677-686, 2012.

[3] A. Doan, R. Ramakrishnan, and A. Y. Halevy. Crowdsourcing systems on the World-Wide Web. Communications of the ACM, pages 385-396, 2011.

[4] S. Dow, A. Kulkarni, S. Klemmer, and B. Hartmann. Shepherding the crowd yields better work. In CSCW, 2012.

[5] C.-J. Ho and J. W. Vaughan. Online task assignment in crowdsourcing markets. In AAAI, 2012.

[6] S.-W. Huang and W.-T. Fu. Don't hide in the crowd! Increasing social transparency between peer workers improves crowdsourcing outcomes. In CHI, 2013.

[7] D. R. Karger, S. Oh, and D. Shah. Budget-optimal task allocation for reliable crowdsourcing systems. CoRR, abs/1110.3564, 2011.

[8] X. Liu, M. Lu, B. C. Ooi, Y. Shen, S. Wu, and M. Zhang. CDAS: a crowdsourcing data analytics system. Proc. VLDB Endow., 5(10):1040-1051, 2012.

[9] A. Parameswaran and N. Polyzotis. Answering queries using humans, algorithms and databases. In CIDR, 2011.

[10] A. Ramesh, A. Parameswaran, H. Garcia-Molina, and N. Polyzotis. Identifying reliable workers swiftly. Technical report, Stanford University.

[11] Y. Singer and M. Mittal. Pricing mechanisms for crowdsourcing markets. In WWW, pages 1157-1166, 2013.