Gamification for Machine Learning: The Classification Game Giorgio Maria Di Nunzio Maria Maistro Daniel Zilio Dept. of Inf. Eng. (DEI) Dept. of Inf. Eng. (DEI) Dept. of Inf. Eng. (DEI) University of Padua University of Padua University of Padua dinunzio@dei.unipd.it maistro@dei.unipd.it daniel.zilio.3@studenti.unipd.it shown that it is possible to create annotated datasets at a↵ordable costs. In this paper, we want to apply Abstract game mechanics to the problem of classification of ob- jects, a supervised machine learning problem, with a The creation of a labelled dataset for machine two-fold goal in mind: i) how the gamification of a learning purposes is a costly process. In recent classification problem can be used to understand what works, it has been shown that a mix of crowd- is the ‘price’ of labelling a small amount of objects sourcing and active learning approaches can for building a reasonably accurate classifier, ii) to ana- be used to annotate objects at an a↵ordable lyze the classification performance given the presence cost. In this paper, we study the gamification of small sample sizes and little training [3, 10]. of machine learning techniques; in particular, In this first pilot study, we designed a simple game the problem of classification of objects. In this based on a visual interpretation of probabilistic clas- first pilot study, we designed a simple game, sifiers [6, 5, 7]. The game consists in separating two based on a visual interpretation of probabilis- sets of coloured points on a two-dimensional plane by tic classifiers, that consists in separating two means of a straight line. Despite its simplicity, this sets of coloured points on a two-dimensional very abstract scenario can be, and will be in the next plane by means of a straight line. We present version of the game, substituted with more captivating the current results of this first experiment that ones (see Section 5.1). At the beginning of the game, we used to collect the requirements for the players know nothing about the type of objects that next version of the game and to analyze i) they have to separate (in our case the objects are text what is the ‘price’ to build a reasonably accu- documents), but they know that the points they see rate classifier with a small amount of labelled on the plane are a small subset of the total and their objects, ii) and compare the accuracy of the position on the plane is not accurate. Players have a player to the state-of-the-art classification al- limited amount of resources that can be used to im- gorithms. prove the position of the points and/or visualize more points (see Section 3 for a detailed explanation of the 1 Introduction game). Supervised machine learning algorithms require la- To summarize, this experiment has been designed belled examples to be trained and be evaluated prop- to study: erly. However, the labelling process is a costly, time- consuming and non-trivial task. Manual annotation • how the gamification of a classification problem by experts is the obvious choice [24], but it is slow and can be used to understand what is the ‘price’ a expensive. In the last years, mixed approaches that user is willing to pay to build a classifier; use crowd-sourcing [13] and active learning [19] have • how the performance of a ‘human’ classifier com- Copyright c by the paper’s authors. Copying permitted for pares to the state-of-the-art algorithms on a small private and academic purposes. scale training dataset. In: F. Hopfgartner, G. Kazai, U. Kruschwitz, and M. Meder (eds.): Proceedings of the GamifIR 2016 Workshop, Pisa, Italy, The first goal is related to the problem of minimizing 21-July-2016, published at http://ceur-ws.org the cost of labelling the dataset and at the same time building a reasonably accurate classifier. The second 3 The Classification Game goal is related to the problem of classification perfor- The game is based on the two-dimensional represen- mance given the presence of small sample sizes and tation of probabilities [6, 21] which is a very intuitive little training [3, 10]. way of presenting the problem of classification on a The paper is organized as follows: in Section 2, we two-dimensional space. Given two classes c1 and c2 , present some background literature on gamification in an object o is assigned to category c1 if the following Information Retrieval (IR). In Section 3, we describe inequality holds: in detail the game, the rules and the interactive ap- plication. Section 4 discusses the pilot study and the P (o|c2 ) < m P (o|c1 ) +q (1) initial results. Section 5 is dedicated to the require- | {z } | {z } y x ments and suggestions for future improvements col- lected from the players during this study. In Section 6, where P (o|c1 ) and P (o|c2 ) are the likelihoods of the we give our final remarks. object o given the two categories, while m and q are two parameters that depend on the misclassification costs that can be assigned by the user to compensate 2 Related Work for either the unbalanced classes situation or di↵erent class costs. Gamification is defined as “the use of game design el- If we interpret the two likelihoods as two coordi- ements in non-game contexts” [4], i.e. tipical game el- nates x and y of a two dimensional space, the problem ements are used for purposes di↵erent from their nor- of classification can be studied on a two-dimensional mal expected employment. In this context, we can plot. The decision of the classification is represented define our web application as a Game with a Pur- by the ‘line’ y = mx + q that splits the plane into pose (GWAP), that is a game which presents some two parts and all the points that fall ‘below’ this line purposes, usually boring and dull for people, within are classified as objects that belong to class c1 (see an entertaining setting, in order to make them enjoy- Figure 1 for an example). Without entering into the able and to use human computation to solve problems. mathematical details of this approach [6], the basic Nowadays, gamification spreads through a wide range idea of the game is that players can adapt the two pa- of disciplines and its applications are implemented in rameters m and q in order to optimize the separation di↵erent and various aspects of scientific fields of study. of points and, at the same time, can use their resources For instances, gamification is applied to learning activ- to improve the estimate of the two likelihoods by buy- ities [16, 15], business and enterprise [14, 22, 23] and ing training data, and/or add more points to the plot medicine [9, 2]. by buying validation data. In recent years, also IR has dealt with gamifica- tion, as witnessed by the Workshop on Gamification 3.1 Rules of the game for Information Retrieval (GamifIR) in 2014, 2015 and The game is organized in 10 levels, which are presented 2016. In [12] the authors describe the fundamental from the easiest to the most difficult and which corre- elements and mechanics of a game and provide an spond to the di↵erent classification tasks of the top 10 overview of possible applications of gamification to the classes of the Reuters 21578 dataset1 . A level is dif- IR process. Moreover, [20] investigates possible ap- ficult when it is hard to linearly separate the positive proaches to properly gamify web search, i.e. making and the negative class (class c1 and c2 in Equation 1), the search of information and the scanning of results a i.e. when the positive and negative class partially over- more enjoyable activity. Actually, many proposals of lap, and/or when there are few examples of objects game applied to di↵erent aspects of IR have been pre- in the positive class. We used the standard Reuters sented. For example, [18] describes a game that turns ModApte split to obtain the training and test docu- document tagging into the activity of taking care of ments for the ten classes. An objects in the training a garden, with the aim of managing private archives. set can be used during the game either as a training In [17], the authors propose a method to obtain rank- example or a validation sample, but not both. In this ing of images by utilizing human computation through way we are setting the machine learning problem in a gamified web application. Finally, [11] introduces a terms of a hold-out method that, despite being not strategy to gamifying the annotation of a French cor- very accurate compared to a cross-validation, is much pora. Although a lot of gamified applications were easier to use in a game. presented in the IR field in the last few years, to the best of our knowledge there are no applications to ma- 1 http://www.daviddlewis.com/resources/ chine learning and in particular to machine classifiers. testcollections/reuters21578/ The goal of each level (an in general of the game) low these two buttons, there is a field which indicates is to find the best classifier, i.e. the one which maxi- the remaining resources, and the available ‘clicks’, to mizes the F1 score, with the least amount or resources. training/validate the model for the current level. The Resources (we intentionally did not use the word cred- player can utilize the positive and negative check boxes its or money) can be spent to increase the training to visualize only the positive and negative points re- and the validation set. At the beginning of the game, spectively. The player can center the line by clicking the players has already a free 10% of the collection directly on the plot image and then shift and rotate it annotated: 5% used for training and 5% used for val- by using the sliders, which corresponds to the q and m idation. At any point in the game, the player can parameter of Equation 1. Finally, we added a button use some resources to buy additional training or vali- that allows the player to go back to the best parame- dation objects. When he/she selects the training op- ters setting that he/she has found during the current tion, an additional 5% of the collection is added to level. The color of the button indicates whether it is the training, this action will set the points more pre- active (blue) or disabled (red). cisely (because probabilities are estimated more ac- In the top right panel, there is the main window that curately), while when he/she chooses the validation shows the two setsw of points: light blue for the posi- option, an additional 5% is included in the validation tive class and red for the negative one. The dark blue set and more points will appear in the plot. Notice line is the decision line used to split the two classes. In that, when the player starts the game, he/she is pro- the left upper corner of the plot, there are three num- vided with a sufficient amount of resources to ‘buy’ the bers: the first one in yellow is the current F1 score, the whole dataset for each level. In this way, the player second one in blue is the best F1 score obtained by the does not have to think about saving resources for dif- user in the current class with the amount of training ficult levels. There is no limit to the time the user and validation set, and the red one is the value that can spend in a level since the important part of the the player should ‘beat’ 3 . game is to understand the amount of resources that Finally, below the plot area, on the right bottom the player consider sufficient to solve the problem, not box, there is a summary of the user actions and results. how fast he/she can do it. Once the player has found The histogram describes the amount of resources spent what he/she considers the best classifier, he/she can in training actions, with the green color, and valida- proceed with the test, thus the classifier is tested on tion actions, with the pink color. For each class, the the test set and the F1 score is computed. At this F1 score obtained on the test set is reported below point, the level is completed and the player is forced the corresponding column. At the end of the game, to go to the next level or conclude the game. i.e. when the user has completed all the levels, he/she will see his/her mean F1 score, averaged on each cate- 3.2 Interactive Application gory, and the mean goal score, which is the average of the F1 scores obtained by the automatic algorithm on We have implemented the game of classification with each category. the Shiny package in R [1], and the source code of the application is freely available for download2 . This interactive Web application can be used as a show case 4 Pilot Study online, but we had to use a local version in order to A pilot study was carried out to test this preliminary avoid lagging and server disconnection that would have version of the game and to collect opinions and sug- made the game very annoying for our players. The gestions regarding possible improvements of the game. objects plotted on the two-dimensional space are news The experiment was conducted on a sample of 20 stu- agencies from the Reuters newswires dataset. dent and researchers of the University of Padua. As In Figure 1, you can see the layout of the web ap- future work, we aim at spreading this game through plication for this pilot study. The interface can be the use of social media in order to collect a bigger and divided in three main areas: the left panel, which con- a more diversified sample of users. tains buttons and controls to adjust the parameters The majority of the users that participate in this of the classifier, the top right panel, which displays test had just a naı̈ve idea of what machine learning is the plot of the two classes, and the bottom right area, and how an automatic classifier operates. Since the which summarizes the employed resources and the F1 majority of our users were not machine learning ex- scores. pert, we provided them a brief explanation of the prob- In the left side panel, there is a text field that allows the player to choose a username, and two buttons to 3 This value corresponds to the best score of the Bernoulli start the game and to move to the next category. Be- Naı̈ve Bayes classifier trained on whole dataset. We used the results reported in [6] and we rounded down some values in 2 https://github.com/gmdn/Classification order to motivate the players. Figure 1: Layout of the web application lem and the fundamental concepts (especially about In this initial analysis, we were impressed by two training and validation) before starting the game. results: on average, the players could beat the ‘goal’ Then, we introduced them to the interface as described score more easily than expected, which means that in section 3.2. the probabilistic classifiers can be trained/validated with just 25% of the original dataset and obtain in 4.1 Results many cases even better results than a cross-validation In our experiments, for each player we collected the F1 on the whole dataset. We will investigate this prob- score for each class and the amount of resources used. lem in future works. The second interesting aspect is For a comparison with the state-of-the-art, we trained that SVM performed as well as the whole dataset with on the exact same training and validation set used by only 25% of the annotated dataset and without cross- each player for each class a SVM with linear kernel validation. This results is very promising since, po- 4 using the ‘kernlab’ package in R together with the tentially, the gamification problem may give a strong 5 ‘caret’ package . For a fair comparison, since the play- indication about how to stop the labelling process and ers acted as ‘optimizers’ of the Naı̈ve Bayes classifier use the annotated dataset to train with a very high decision, we also optimized the SVM by adjusting the accuracy state-of-the-art-algorithm. This second part cost parameter C within the range [0.01, 0.05] (smaller will require a deep analysis and further experiments to or larger values of C did not produce any significant confirm the statistical significance of this process. change). In Table 1, we report the F1 measure for each player 5 Further Developments on each level. The name of the column are the orig- During the game and at the end of each game ses- inal names of the top 10 Reuters-21578 classes. The sion, we discussed together with each player about im- last three column shows the average of the F1 measure provements and issues of the interface and the game across the classes for each player, the average F1 mea- in general. We report in this section a summary of the sure of the SVM across the data of each player, the discussions. percentage of resources used by each player in a game. The last two rows represents: the average F1 score for each class for the players and for the SVM. 5.1 Game Scenarios 4 https://cran.r-project.org/web/packages/kernlab/ Together with the users who participated to this pilot 5 https://cran.r-project.org/web/packages/caret/ study, we started to sketch some possible scenarios of Table 1: F1 results for each player and level from easiest to hardest. Average performance of the SVM on the same training validation is shown for each player and level. Last column shows the resources used by each player. username earn acq grain crude money.fx ship wheat interest trade corn average SVM resources airamoigroig 0.97 0.95 0.88 0.83 0.76 0.78 0.74 0.74 0.78 0.56 0.80 0.86 29% Alan 0.96 0.91 0.79 0.67 0.75 0.68 0.68 0.63 0.76 0.59 0.74 0.85 16% Ale 0.94 0.95 0.83 0.86 0.76 0.79 0.77 0.63 0.71 0.49 0.77 0.85 18% CalebTheGame 0.96 0.90 0.88 0.87 0.76 0.84 0.77 0.75 0.76 0.61 0.81 0.87 56% ClaudioBarba 0.96 0.96 0.85 0.83 0.72 0.74 0.75 0.69 0.63 0.63 0.78 0.80 12% dz 0.96 0.95 0.85 0.79 0.74 0.76 0.72 0.74 0.74 0.62 0.79 0.83 17% edoardo verona 0.97 0.96 0.86 0.89 0.75 0.80 0.76 0.74 0.75 0.57 0.80 0.87 42% Erica 0.96 0.94 0.82 0.78 0.75 0.72 0.59 0.67 0.59 0.53 0.74 0.84 23% gadaleta 0.96 0.96 0.91 0.87 0.74 0.81 0.70 0.73 0.72 0.53 0.79 0.87 26% Giada 0.96 0.95 0.83 0.82 0.75 0.80 0.71 0.74 0.67 0.62 0.78 0.87 23% Hector 0.95 0.95 0.90 0.80 0.74 0.86 0.77 0.56 0.60 0.56 0.77 0.86 31% jeppy 0.96 0.95 0.81 0.67 0.69 0.74 0.60 0.65 0.71 0.51 0.73 0.79 1% ottoX8 0.97 0.94 0.89 0.71 0.75 0.78 0.75 0.75 0.67 0.56 0.78 0.82 26% pil 0.95 0.93 0.84 0.83 0.72 0.82 0.71 0.73 0.60 0.61 0.77 0.84 12% poiopoio 0.95 0.95 0.85 0.83 0.77 0.77 0.73 0.73 0.65 0.52 0.78 0.85 17% power23 0.96 0.95 0.86 0.90 0.73 0.78 0.74 0.77 0.68 0.52 0.79 0.86 28% renberche 0.96 0.95 0.88 0.77 0.71 0.75 0.76 0.72 0.72 0.55 0.78 0.87 39% signoraMaria 0.97 0.96 0.85 0.88 0.78 0.78 0.74 0.74 0.72 0.60 0.80 0.84 23% Ste 0.97 0.96 0.92 0.77 0.71 0.80 0.71 0.72 0.69 0.59 0.78 0.86 45% veronica 0.97 0.96 0.89 0.87 0.73 0.82 0.69 0.74 0.47 0.54 0.77 0.83 12% average 0.96 0.95 0.86 0.81 0.74 0.78 0.72 0.71 0.68 0.57 SVM 0.97 0.95 0.89 0.85 0.73 0.82 0.89 0.74 0.82 0.80 this game that will improve the game experience. We cans that have been left on the field. You have have come up with four possible alternatives: limited resources and you can only split the area in two sides: one side should contain more plastic • Plants and gardening: we have a field sown with bottles than glass bottles and the other side more di↵erent types of seeds, but we do not know ex- glass bottles than plastic bottles. You have some actly where these seeds are. Some of the seeds kids that can help you to spot where are the parts will grow into edible plants, others will grow into with more glass or plastic. weeds. The goal is to build a fence that separated the field in a way that “our” part of the field will 5.2 Game Controls contain the most of the edible plants and the least We received some very good feedback about game con- of the weeds. We can ask some help to our ani- trols and interaction and how to improve them to ob- mal granivore friends (like pigeons and doves) to tain a better feeling of the game. fly over and check some part of the fields to see whether the seeds are good or not. 5.2.1 Main Window • Gold mines: there is an area with a lot of gold Most of the players would prefer a full screen window nuggets as well as useless common stones and you to see the points more clearly without any distraction. are the first explorer to mine this area. Your re- The amount of credits used per class is not very rel- sources are limited, and you can only choose one evant for their game when they play as well as the part of the area while the rest remains untouched. score obtained in previous classes. It would be better The goal is to choose the area with loads of gold to have a window with the ranking of the scores that and the smallest number of stones. You have a players can open when they need to see their status. friend who is an expert in gold mining and can probe the area to understand whether there is 5.3 Line Control gold or stone. Players began to understand the use of the sliders af- • Aerial warfare: in this war scenario, we have an ter a few attempts. In particular, the rotation of the army that is involved in a military zone, and we line was not immediately clear since the plot is cen- are forced to perform a raid to seize an area in tred around the minimum and maximum values of the order to secure the zone. The goal is to send our coordinates, while the slope is computed given the in- air forces to clear the area that contains the most tercept of the line with the y axis. The thing that puz- of the enemies and the least of our ground troops zled the players was the non-intuitive rotation around and civilians. Before the raid, we can send our he- a point far from the plot limits. For this reason they licopters to explore the area and check the current suggested two alternatives: situation. • to select a fixed point within the plot (i.e. the • Plastic and Glass Recycling: after a music festival center of the line) and to rotate the line along in a park, you have to collect all the bottles and this point; • to maintain the slope fixed and rotate the plane called ‘goal’ score on the same category. We indicate instead of the line. with ti and vi the amount of resources spent by the user in training and validation documents respectively, Moreover, the control of the slope of the line would be and R is the total amount of resources provided at the easier with a “knob” rather than a slider6 . There are beginning of the game, 1800 in our game. We define also new ideas about the interactions with the game in the user rating J as terms of touch screen technologies. In fact, it would be much easier for the players to interact with the game N N ! X si 1 X with the gestures that are now “natural” on mobile J =a +b 1 (ti + vi ) (2) devices: rotation, swipe, zooming in and out, may en- g i=1 i R i=1 hance the user experience and bring the game to a di↵erent level. Finally, line controls should be overlaid where a and b are two parameters, which range in on top of the main window instead of being on one the interval [0, 1]. Notice that these parameters repre- side, in this way the eye of the player does not have sent the significance assigned to the two di↵erent tasks to move from one side to the other of the screen every which define the game, indeed, a influences the impor- time the line has to be adjusted. tance of the e↵ectiveness objective, while b determine the importance of the efficiency purpose. For this pre- 5.4 Game Incentives liminary version of the game, we chose a and b with equal value, a = b = 1, since we consider both the effi- From the live interaction with the players during the ciency and e↵ectiveness task of equivalent significance. game sessions, it was clear that one of the strongest Future work and further applications of this game can motivations to replay the game was to have ranking of justify the preference for one of the two tasks, moti- the players with the scores obtained, to know whether vating the choice of di↵erent weights for a and b. the friends/colleagues performed better or worse. At the time of the pilot experiments, we could give them 6 Final Remarks hints about their performance compared to the other players and just that information was enough for them In this first pilot study of the gamification in machine to sparkle their sense of competition. Some of them learning, we set up a simple game, based on a visual were willing to play a second time to just beat their interpretation of probabilistic classifiers, that consists competitors. This is in line with the literature on gam- in separating two sets of coloured points on a two- ification [12, 8]. In addition to the ranking of the scores dimensional plane by means of a straight line. The of ‘human’ players, we want to introduce the compar- 20 players that participated in this study already gave ison of scores between each player and a set of state- us important suggestions in order to improve both the of-the-art classification algorithm trained on the exact game mechanics and the game controls. We believe same game. We want to see how strong this incentive that, with the right game scenario (plants or gold would be for a human to know that his performance is mine for example), this game could be easily played better or worse compared to a computer. We are also by users that do not need any information about train- planning a set of virtual goods and badges to motivate ing/validation. the player during the game and after each session. Moreover, the classification results of the game were Since the competitiveness is one of the main moti- also very high compared to the small amount of la- vations which encourages the users to play the game belled objects and we also found a very promising re- and reach high performances, it is very important to lation with the results of the SVM trained on the same find the definition of a formal criterion to rate and labelled dataset. We are currently study a new version rank players by taking into account the F1 score and of the game with some options that implements cross the resources spent. As described in section 3.1, the validation, which would bring the machine learning as- goal of the game is multiobjective, indeed the main pect to a new di↵erent level. task consists in defining, at the same time, a classifier which is e↵ective, i.e. it reaches high values of accuracy References and precision, and efficient, i.e. it uses a few amount [1] Winston Chang. Shiny: Web Application Frame- of resources in terms of training and validation. work for R, 2015. R package version 0.11. Let C = {c1 , . . . , ci , . . . , cN } be the set of categories, we denote with si the F1 score obtained on the test [2] Yu Chen and Pearl Pu. Healthytogether: Ex- set of the i-th category by the player, and with gi the ploring social incentives for mobile fitness applica- F1 score obtained by the automatic algorithm that we tions. In Proceedings of the Second International 6 See for example the gallery of this type of control realized Symposium of Chinese CHI, Chinese CHI ’14, with d3js https://radmie.github.io/ng-knob/ pages 25–34, New York, NY, USA, 2014. ACM. [3] Giorgio Corani and Marco Za↵alon. Learning reli- [12] Luca Galli, Piero Fraternali, and Alessandro Boz- able classifiers from small or incomplete data sets: zon. On the Application of Game Mechanics in The naive credal classifier 2. J. Mach. Learn. Res., Information Retrieval. In Proc. of the 1st Int. 9:581–621, June 2008. Workshop on Gamification for Information Re- trieval, GamifIR’14, pages 7–11, New York, NY, [4] Sebastian Deterding, Dan Dixon, Rilla Khaled, USA, 2014. ACM. and Lennart Nacke. From Game Design El- ements to Gamefulness: Defining “Gamifica- [13] Chien-Ju Ho, Shahin Jabbari, and Jennifer Wort- tion”. In Proc. of the 15th International Aca- man Vaughan. Adaptive task assignment for demic MindTrek Conference: Envisioning Future crowdsourced classification. In ICML (1), vol- Media Environments, MindTrek ’11, pages 9–15, ume 28 of JMLR Proceedings, pages 534–542. New York, NY, USA, 2011. ACM. JMLR.org, 2013. [5] Giorgio Maria Di Nunzio. Using Scatterplots to [14] Jose Luis Jurado, Alejandro Fernandez, and Ce- Understand and Improve Probabilistic Models for sar A. Collazos. Applying gamification in the con- Text Categorization and Retrieval. Int. J. Ap- text of knowledge management. In Proceedings of prox. Reasoning, 50(7):945–956, 2009. the 15th International Conference on Knowledge Technologies and Data-driven Business, i-KNOW [6] Giorgio Maria Di Nunzio. A New Decision to Take ’15, pages 43:1–43:4, New York, NY, USA, 2015. for Cost-Sensitive Naı̈ve Bayes Classifiers. In- ACM. formation Processing & Management, 50(5):653 – 674, 2014. [15] Karl M Kapp. The gamification of learning and instruction: game-based methods and strategies [7] Giorgio Maria Di Nunzio and Alessandro Sordoni. for training and education. John Wiley & Sons, A Visual Tool for Bayesian Data Analysis: The 2012. Impact of Smoothing on Naive Bayes Text Clas- [16] Isabella Kotini and Sofia Tzelepi. Gamification in sifiers. In Proc. of the ACM SIGIR’12 conference Education and Business, chapter A Gamification- on research and development in Information Re- Based Framework for Developing Learning Activ- trieval, Portland, OR, USA, August 12-16, 2012, ities of Computational Thinking, pages 219–252. page 1002, 2012. Springer Int. Publ., Cham, 2015. [8] David Easley and Arpita Ghosh. Incentives, gami- [17] Mathias Lux, Mario Guggenberger, and Michael fication, and game theory: An economic approach Riegler. Picturesort: Gamification of image rank- to badge design. In Proceedings of the Fourteenth ing. In Proceedings of the First International ACM Conference on Electronic Commerce, EC Workshop on Gamification for Information Re- ’13, pages 359–376, New York, NY, USA, 2013. trieval, GamifIR ’14, pages 57–60, New York, NY, ACM. USA, 2014. ACM. [9] Carsten Eickho↵. Crowd-powered experts: Help- [18] Carlos Maltzahn, Arnav Jhala, Michael Mateas, ing surgeons interpret breast cancer images. In and Jim Whitehead. Gamification of private dig- Proceedings of the First International Workshop ital data archive management. In Proceedings of on Gamification for Information Retrieval, Gam- the First International Workshop on Gamification ifIR ’14, pages 53–56, New York, NY, USA, 2014. for Information Retrieval, GamifIR ’14, pages 33– ACM. 37, New York, NY, USA, 2014. ACM. [10] George Forman and Ira Cohen. Learning from [19] Burr Settles. Closing the loop: Fast, interac- Little: Comparison of Classifiers Given Little tive semi-supervised annotation with queries on Training. In Knowledge Discovery in Databases: features and instances. In Proceedings of the PKDD 2004, Pisa, Italy, September 20-24, 2004, 2011 Conference on Empirical Methods in Nat- Proceedings, pages 161–172, 2004. ural Language Processing, EMNLP 2011, 27-31 July 2011, John McIntyre Conference Centre, Ed- [11] Karën Fort, Bruno Guillaume, and Hadrien Chas- inburgh, UK, A meeting of SIGDAT, a Special In- tant. Creating zombilingo, a game with a pur- terest Group of the ACL, pages 1467–1478, 2011. pose for dependency syntax annotation. In Pro- ceedings of the First International Workshop on [20] Mark Shovman. The Game of Search: What is Gamification for Information Retrieval, GamifIR the Fun in That? In Proc. of the 1st Int. Work- ’14, pages 2–6, New York, NY, USA, 2014. ACM. shop on Gamification for Information Retrieval, GamifIR’14, pages 46–48, New York, NY, USA, 2014. ACM. [21] Rita Singh and Bhiksha Raj. Classification in likelihood spaces. Technometrics, 46(3):318–329, 2004. [22] Laurentiu Catalin Stanculescu, Alessandro Boz- zon, Robert-Jan Sips, and Geert-Jan Houben. Work and play: An experiment in enterprise gam- ification. In Proceedings of the 19th ACM Confer- ence on Computer-Supported Cooperative Work & Social Computing, CSCW ’16, pages 346–358, New York, NY, USA, 2016. ACM. [23] Jennifer Thom, David Millen, and Joan DiMicco. Removing gamification from an enterprise sns. In Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work, CSCW ’12, pages 1067–1070, New York, NY, USA, 2012. ACM. [24] Salud M. Jiménez Zafra, Giacomo Be- rardi, Andrea Esuli, Diego Marcheggiani, Maria Teresa Martı́n-Valdivia, and Alejan- dro Moreo Fernández. A multi-lingual annotated dataset for aspect-oriented opinion mining. In Proceedings of the 2015 Conference on Empir- ical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, September 17-21, 2015, pages 2533–2538, 2015.