Gamification for Machine Learning: The Classification Game

Giorgio Maria Di Nunzio, Maria Maistro, Daniel Zilio
Dept. of Information Engineering (DEI), University of Padua
dinunzio@dei.unipd.it, maistro@dei.unipd.it, daniel.zilio.3@studenti.unipd.it

Abstract

The creation of a labelled dataset for machine learning purposes is a costly process. In recent works, it has been shown that a mix of crowd-sourcing and active learning approaches can be used to annotate objects at an affordable cost. In this paper, we study the gamification of machine learning techniques, in particular the problem of classification of objects. In this first pilot study, we designed a simple game, based on a visual interpretation of probabilistic classifiers, that consists in separating two sets of coloured points on a two-dimensional plane by means of a straight line. We present the current results of this first experiment, which we used to collect the requirements for the next version of the game and to analyze i) what is the 'price' to build a reasonably accurate classifier with a small amount of labelled objects, and ii) how the accuracy of the player compares to state-of-the-art classification algorithms.

1 Introduction

Supervised machine learning algorithms require labelled examples to be trained and evaluated properly. However, the labelling process is a costly, time-consuming and non-trivial task. Manual annotation by experts is the obvious choice [24], but it is slow and expensive. In recent years, mixed approaches that use crowd-sourcing [13] and active learning [19] have shown that it is possible to create annotated datasets at affordable costs. In this paper, we apply game mechanics to the problem of classification of objects, a supervised machine learning problem, with a two-fold goal in mind: i) to understand, through the gamification of a classification problem, the 'price' of labelling a small amount of objects for building a reasonably accurate classifier; ii) to analyze the classification performance in the presence of small sample sizes and little training [3, 10].

In this first pilot study, we designed a simple game based on a visual interpretation of probabilistic classifiers [6, 5, 7]. The game consists in separating two sets of coloured points on a two-dimensional plane by means of a straight line. Despite its simplicity, this very abstract scenario can be, and will be in the next version of the game, substituted with more captivating ones (see Section 5.1). At the beginning of the game, players know nothing about the type of objects they have to separate (in our case, the objects are text documents), but they know that the points they see on the plane are a small subset of the total and that their position on the plane is not accurate. Players have a limited amount of resources that can be used to improve the position of the points and/or to visualize more points (see Section 3 for a detailed explanation of the game).

To summarize, this experiment has been designed to study:

• how the gamification of a classification problem can be used to understand the 'price' a user is willing to pay to build a classifier;

• how the performance of a 'human' classifier compares to state-of-the-art algorithms on a small-scale training dataset.

Copyright © by the paper's authors. Copying permitted for private and academic purposes. In: F. Hopfgartner, G. Kazai, U. Kruschwitz, and M. Meder (eds.): Proceedings of the GamifIR 2016 Workshop, Pisa, Italy, 21 July 2016, published at http://ceur-ws.org
The first goal is related to the problem of minimizing the cost of labelling the dataset while at the same time building a reasonably accurate classifier. The second goal is related to the problem of classification performance in the presence of small sample sizes and little training [3, 10].

The paper is organized as follows: in Section 2, we present some background literature on gamification in Information Retrieval (IR). In Section 3, we describe in detail the game, its rules and the interactive application. Section 4 discusses the pilot study and the initial results. Section 5 is dedicated to the requirements and suggestions for future improvements collected from the players during this study. In Section 6, we give our final remarks.

2 Related Work

Gamification is defined as "the use of game design elements in non-game contexts" [4], i.e. typical game elements are used for purposes different from their normal expected employment. In this context, we can define our web application as a Game with a Purpose (GWAP), that is a game which embeds tasks that are usually boring and dull for people within an entertaining setting, in order to make them enjoyable and to use human computation to solve problems. Nowadays, gamification spreads through a wide range of disciplines and its applications can be found in many different scientific fields. For instance, gamification is applied to learning activities [16, 15], business and enterprise [14, 22, 23] and medicine [9, 2].

In recent years, IR has also dealt with gamification, as witnessed by the Workshop on Gamification for Information Retrieval (GamifIR) in 2014, 2015 and 2016. In [12], the authors describe the fundamental elements and mechanics of a game and provide an overview of possible applications of gamification to the IR process. Moreover, [20] investigates possible approaches to properly gamify web search, i.e. making the search for information and the scanning of results a more enjoyable activity. Many games applied to different aspects of IR have indeed been presented. For example, [18] describes a game that turns document tagging into the activity of taking care of a garden, with the aim of managing private archives. In [17], the authors propose a method to obtain rankings of images by utilizing human computation through a gamified web application. Finally, [11] introduces a strategy to gamify the annotation of a French corpus. Although many gamified applications have been presented in the IR field in the last few years, to the best of our knowledge there are no applications to machine learning and, in particular, to machine classifiers.

3 The Classification Game

The game is based on the two-dimensional representation of probabilities [6, 21], which is a very intuitive way of presenting the problem of classification on a two-dimensional space. Given two classes c1 and c2, an object o is assigned to category c1 if the following inequality holds:

\[ \underbrace{P(o \mid c_2)}_{y} < m \, \underbrace{P(o \mid c_1)}_{x} + q \tag{1} \]

where P(o|c1) and P(o|c2) are the likelihoods of the object o given the two categories, while m and q are two parameters that depend on the misclassification costs, which can be assigned by the user to compensate for either an unbalanced classes situation or different class costs.

If we interpret the two likelihoods as the two coordinates x and y of a two-dimensional space, the problem of classification can be studied on a two-dimensional plot. The classification decision is represented by the 'line' y = mx + q that splits the plane into two parts: all the points that fall 'below' this line are classified as objects that belong to class c1 (see Figure 1 for an example). Without entering into the mathematical details of this approach [6], the basic idea of the game is that players can adapt the two parameters m and q in order to optimize the separation of the points and, at the same time, can use their resources to improve the estimate of the two likelihoods by buying training data and/or to add more points to the plot by buying validation data.
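To make the decision rule concrete, the following minimal R sketch applies Equation 1 to a handful of points on the likelihood plane; the coordinates and the values of m and q are illustrative, not taken from the game.

```r
# Minimal sketch of the decision rule in Equation 1. In the game, the
# coordinates are the estimated likelihoods x = P(o|c1) and y = P(o|c2).
classify <- function(x, y, m, q) {
  # Assign c1 when the point falls 'below' the line y = m * x + q
  ifelse(y < m * x + q, "c1", "c2")
}

# Hypothetical likelihood coordinates for three objects
x <- c(0.8, 0.2, 0.6)   # P(o|c1)
y <- c(0.3, 0.7, 0.5)   # P(o|c2)
classify(x, y, m = 1, q = 0)   # "c1" "c2" "c1" (m = 1, q = 0 is the diagonal)
```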
3.1 Rules of the Game

The game is organized in 10 levels, presented from the easiest to the most difficult, which correspond to the classification tasks of the top 10 classes of the Reuters-21578 dataset¹. A level is difficult when it is hard to linearly separate the positive and the negative class (classes c1 and c2 in Equation 1), i.e. when the positive and negative classes partially overlap, and/or when there are few examples of objects in the positive class. We used the standard Reuters ModApte split to obtain the training and test documents for the ten classes. An object in the training set can be used during the game either as a training example or as a validation sample, but not both. In this way, we are setting the machine learning problem in terms of a hold-out method which, despite being less accurate than cross-validation, is much easier to use in a game.

The goal of each level (and, in general, of the game) is to find the best classifier, i.e. the one which maximizes the F1 score, with the least amount of resources. Resources (we intentionally did not use the words credits or money) can be spent to increase the training and the validation sets. At the beginning of the game, the player already has a free 10% of the collection annotated: 5% used for training and 5% used for validation. At any point in the game, the player can use some resources to buy additional training or validation objects. When he/she selects the training option, an additional 5% of the collection is added to the training set; this action will set the points more precisely (because the probabilities are estimated more accurately). When he/she chooses the validation option, an additional 5% is included in the validation set and more points will appear on the plot. Notice that, when the player starts the game, he/she is provided with a sufficient amount of resources to 'buy' the whole dataset for each level. In this way, the player does not have to think about saving resources for difficult levels. There is no limit to the time the user can spend on a level, since the important part of the game is to understand the amount of resources that the player considers sufficient to solve the problem, not how fast he/she can do it. Once the player has found what he/she considers the best classifier, he/she can proceed with the test: the classifier is evaluated on the test set and the F1 score is computed. At this point, the level is completed and the player moves to the next level or concludes the game.

¹ http://www.daviddlewis.com/resources/testcollections/reuters21578/
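As an illustration of the level objective, the following R sketch computes the F1 score on a held-out test set; the label vectors and the "pos"/"neg" encoding are hypothetical, not part of the actual game code.

```r
# Sketch of the F1 score used to evaluate a level. In the binary setting of
# the game, 'pos' is the Reuters class of the level and 'neg' is the rest.
f1_score <- function(predicted, actual) {
  tp <- sum(predicted == "pos" & actual == "pos")   # true positives
  fp <- sum(predicted == "pos" & actual == "neg")   # false positives
  fn <- sum(predicted == "neg" & actual == "pos")   # false negatives
  precision <- tp / (tp + fp)
  recall    <- tp / (tp + fn)
  2 * precision * recall / (precision + recall)
}

# Hypothetical hold-out evaluation: each training object is used either as
# a training example or as a validation sample, never both.
actual    <- c("pos", "pos", "neg", "neg", "pos")
predicted <- c("pos", "neg", "neg", "pos", "pos")
f1_score(predicted, actual)   # 0.667: precision 2/3, recall 2/3
```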
cisely (because probabilities are estimated more ac- In the top right panel, there is the main window that curately), while when he/she chooses the validation shows the two setsw of points: light blue for the posi- option, an additional 5% is included in the validation tive class and red for the negative one. The dark blue set and more points will appear in the plot. Notice line is the decision line used to split the two classes. In that, when the player starts the game, he/she is pro- the left upper corner of the plot, there are three num- vided with a sufficient amount of resources to ‘buy’ the bers: the first one in yellow is the current F1 score, the whole dataset for each level. In this way, the player second one in blue is the best F1 score obtained by the does not have to think about saving resources for dif- user in the current class with the amount of training ficult levels. There is no limit to the time the user and validation set, and the red one is the value that can spend in a level since the important part of the the player should ‘beat’ 3 . game is to understand the amount of resources that Finally, below the plot area, on the right bottom the player consider sufficient to solve the problem, not box, there is a summary of the user actions and results. how fast he/she can do it. Once the player has found The histogram describes the amount of resources spent what he/she considers the best classifier, he/she can in training actions, with the green color, and valida- proceed with the test, thus the classifier is tested on tion actions, with the pink color. For each class, the the test set and the F1 score is computed. At this F1 score obtained on the test set is reported below point, the level is completed and the player is forced the corresponding column. At the end of the game, to go to the next level or conclude the game. i.e. when the user has completed all the levels, he/she will see his/her mean F1 score, averaged on each cate- 3.2 Interactive Application gory, and the mean goal score, which is the average of the F1 scores obtained by the automatic algorithm on We have implemented the game of classification with each category. the Shiny package in R [1], and the source code of the application is freely available for download2 . This interactive Web application can be used as a show case 4 Pilot Study online, but we had to use a local version in order to A pilot study was carried out to test this preliminary avoid lagging and server disconnection that would have version of the game and to collect opinions and sug- made the game very annoying for our players. The gestions regarding possible improvements of the game. objects plotted on the two-dimensional space are news The experiment was conducted on a sample of 20 stu- agencies from the Reuters newswires dataset. dent and researchers of the University of Padua. As In Figure 1, you can see the layout of the web ap- future work, we aim at spreading this game through plication for this pilot study. The interface can be the use of social media in order to collect a bigger and divided in three main areas: the left panel, which con- a more diversified sample of users. tains buttons and controls to adjust the parameters The majority of the users that participate in this of the classifier, the top right panel, which displays test had just a naı̈ve idea of what machine learning is the plot of the two classes, and the bottom right area, and how an automatic classifier operates. 
4 Pilot Study

A pilot study was carried out to test this preliminary version of the game and to collect opinions and suggestions regarding possible improvements. The experiment was conducted on a sample of 20 students and researchers of the University of Padua. As future work, we aim at spreading this game through the use of social media in order to collect a bigger and more diversified sample of users.

The majority of the users that participated in this test had just a naïve idea of what machine learning is and how an automatic classifier operates. Since the majority of our users were not machine learning experts, we provided them with a brief explanation of the problem and of the fundamental concepts (especially about training and validation) before starting the game. Then, we introduced them to the interface as described in Section 3.2.

4.1 Results

In our experiments, for each player we collected the F1 score for each class and the amount of resources used. For a comparison with the state of the art, we trained, on the exact same training and validation sets used by each player for each class, an SVM with linear kernel using the 'kernlab' package⁴ in R together with the 'caret' package⁵. For a fair comparison, since the players acted as 'optimizers' of the Naïve Bayes classifier decision, we also optimized the SVM by adjusting the cost parameter C within the range [0.01, 0.05] (smaller or larger values of C did not produce any significant change).

In Table 1, we report the F1 measure for each player on each level. The column names are the original names of the top 10 Reuters-21578 classes. The last three columns show the average of the F1 measure across the classes for each player, the average F1 measure of the SVM across the data of each player, and the percentage of resources used by each player in a game. The last two rows report the average F1 score for each class for the players and for the SVM, respectively.

In this initial analysis, we were impressed by two results. On average, the players could beat the 'goal' score more easily than expected, which means that the probabilistic classifiers can be trained/validated with just 25% of the original dataset and, in many cases, obtain even better results than a cross-validation on the whole dataset. We will investigate this finding in future work. The second interesting aspect is that the SVM trained with only 25% of the annotated dataset and without cross-validation performed as well as on the whole dataset. This result is very promising since, potentially, the gamification approach may give a strong indication about when to stop the labelling process and use the annotated dataset to train a state-of-the-art algorithm with very high accuracy. This second part will require a deeper analysis and further experiments to confirm the statistical significance of this process.

⁴ https://cran.r-project.org/web/packages/kernlab/
⁵ https://cran.r-project.org/web/packages/caret/
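A hedged sketch of this SVM baseline is shown below, assuming the document-term feature matrices have already been computed; 'x_train', 'y_train', 'x_test', 'y_test' and the "pos"/"neg" labels are placeholders, and only the kernlab part is sketched (caret was used in the paper for tuning C).

```r
library(kernlab)

# Sketch of the SVM baseline: a linear-kernel SVM trained on the same
# labelled subset 'bought' by a player, evaluated with F1 on the test set.
svm_f1 <- function(x_train, y_train, x_test, y_test, C = 0.01) {
  model <- ksvm(x = x_train, y = as.factor(y_train),
                kernel = "vanilladot",  # linear kernel
                C = C)                  # cost parameter, tuned within [0.01, 0.05]
  pred <- predict(model, x_test)
  tp <- sum(pred == "pos" & y_test == "pos")
  fp <- sum(pred == "pos" & y_test == "neg")
  fn <- sum(pred == "neg" & y_test == "pos")
  2 * tp / (2 * tp + fp + fn)   # F1 = 2TP / (2TP + FP + FN)
}
```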
Table 1: F1 results for each player and level, from easiest to hardest. The average performance of the SVM on the same training/validation data is shown for each player and level. The last column shows the resources used by each player.

username        earn  acq   grain crude money.fx ship  wheat interest trade corn  average SVM   resources
airamoigroig    0.97  0.95  0.88  0.83  0.76     0.78  0.74  0.74     0.78  0.56  0.80    0.86  29%
Alan            0.96  0.91  0.79  0.67  0.75     0.68  0.68  0.63     0.76  0.59  0.74    0.85  16%
Ale             0.94  0.95  0.83  0.86  0.76     0.79  0.77  0.63     0.71  0.49  0.77    0.85  18%
CalebTheGame    0.96  0.90  0.88  0.87  0.76     0.84  0.77  0.75     0.76  0.61  0.81    0.87  56%
ClaudioBarba    0.96  0.96  0.85  0.83  0.72     0.74  0.75  0.69     0.63  0.63  0.78    0.80  12%
dz              0.96  0.95  0.85  0.79  0.74     0.76  0.72  0.74     0.74  0.62  0.79    0.83  17%
edoardo verona  0.97  0.96  0.86  0.89  0.75     0.80  0.76  0.74     0.75  0.57  0.80    0.87  42%
Erica           0.96  0.94  0.82  0.78  0.75     0.72  0.59  0.67     0.59  0.53  0.74    0.84  23%
gadaleta        0.96  0.96  0.91  0.87  0.74     0.81  0.70  0.73     0.72  0.53  0.79    0.87  26%
Giada           0.96  0.95  0.83  0.82  0.75     0.80  0.71  0.74     0.67  0.62  0.78    0.87  23%
Hector          0.95  0.95  0.90  0.80  0.74     0.86  0.77  0.56     0.60  0.56  0.77    0.86  31%
jeppy           0.96  0.95  0.81  0.67  0.69     0.74  0.60  0.65     0.71  0.51  0.73    0.79   1%
ottoX8          0.97  0.94  0.89  0.71  0.75     0.78  0.75  0.75     0.67  0.56  0.78    0.82  26%
pil             0.95  0.93  0.84  0.83  0.72     0.82  0.71  0.73     0.60  0.61  0.77    0.84  12%
poiopoio        0.95  0.95  0.85  0.83  0.77     0.77  0.73  0.73     0.65  0.52  0.78    0.85  17%
power23         0.96  0.95  0.86  0.90  0.73     0.78  0.74  0.77     0.68  0.52  0.79    0.86  28%
renberche       0.96  0.95  0.88  0.77  0.71     0.75  0.76  0.72     0.72  0.55  0.78    0.87  39%
signoraMaria    0.97  0.96  0.85  0.88  0.78     0.78  0.74  0.74     0.72  0.60  0.80    0.84  23%
Ste             0.97  0.96  0.92  0.77  0.71     0.80  0.71  0.72     0.69  0.59  0.78    0.86  45%
veronica        0.97  0.96  0.89  0.87  0.73     0.82  0.69  0.74     0.47  0.54  0.77    0.83  12%
average         0.96  0.95  0.86  0.81  0.74     0.78  0.72  0.71     0.68  0.57
SVM             0.97  0.95  0.89  0.85  0.73     0.82  0.89  0.74     0.82  0.80

5 Further Developments

During the game and at the end of each game session, we discussed with each player improvements and issues of the interface and of the game in general. We report in this section a summary of these discussions.

5.1 Game Scenarios

Together with the users who participated in this pilot study, we started to sketch some possible scenarios for this game that would improve the game experience. We have come up with four possible alternatives:

• Plants and gardening: we have a field sown with different types of seeds, but we do not know exactly where these seeds are. Some of the seeds will grow into edible plants, others will grow into weeds. The goal is to build a fence that separates the field in such a way that "our" part of the field contains the most edible plants and the fewest weeds. We can ask for help from our granivorous animal friends (like pigeons and doves), who fly over and check some parts of the field to see whether the seeds are good or not.

• Gold mines: there is an area with a lot of gold nuggets as well as useless common stones, and you are the first explorer to mine this area. Your resources are limited, and you can only choose one part of the area while the rest remains untouched. The goal is to choose the area with the most gold and the smallest number of stones. You have a friend who is an expert in gold mining and can probe the area to tell whether there is gold or stone.

• Aerial warfare: in this war scenario, we have an army that is engaged in a military zone, and we are forced to perform a raid to seize an area in order to secure the zone. The goal is to send our air forces to clear the area that contains the most enemies and the fewest of our ground troops and civilians. Before the raid, we can send our helicopters to explore the area and check the current situation.

• Plastic and glass recycling: after a music festival in a park, you have to collect all the bottles and cans that have been left on the field. You have limited resources and you can only split the area into two sides: one side should contain more plastic bottles than glass bottles, and the other side more glass bottles than plastic bottles. You have some kids who can help you spot the parts with more glass or plastic.

5.2 Game Controls

We received some very good feedback about the game controls and interaction, and about how to improve them to obtain a better feel for the game.

5.2.1 Main Window

Most of the players would prefer a full-screen window to see the points more clearly, without any distraction. While they play, the amount of credits used per class is not very relevant to their game, nor is the score obtained on previous classes. It would be better to have a window with the ranking of the scores that players can open when they need to check their status.
5.3 Line Control

Players began to understand the use of the sliders after a few attempts. In particular, the rotation of the line was not immediately clear, since the plot is centred around the minimum and maximum values of the coordinates, while the slope is computed from the intercept of the line with the y axis. What puzzled the players was the non-intuitive rotation around a point far from the plot limits. For this reason, they suggested two alternatives:

• to select a fixed point within the plot (i.e. the center of the line) and to rotate the line around this point;

• to keep the slope fixed and rotate the plane instead of the line.
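The first alternative maps directly onto the parameters of Equation 1: if the line is forced through a fixed point (x0, y0) and rotated by an angle theta, then m = tan(theta) and q = y0 − m·x0. A small R sketch of this hypothetical control (the function name and arguments are ours, not part of the application):

```r
# Sketch of the suggested control: rotate the decision line around a fixed
# point (x0, y0) inside the plot. Angles near ±pi/2 give near-vertical
# lines, which a real slider/knob range would have to cap.
line_from_rotation <- function(x0, y0, theta) {
  m <- tan(theta)    # slope set by the rotation angle
  q <- y0 - m * x0   # intercept so that the line passes through (x0, y0)
  c(m = m, q = q)
}

line_from_rotation(x0 = 0.5, y0 = 0.5, theta = pi / 4)  # m = 1, q = 0
```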
Moreover, controlling the slope of the line would be easier with a "knob" rather than a slider⁶. There are also new ideas about the interactions with the game in terms of touch-screen technologies. In fact, it would be much easier for the players to interact with the game through the gestures that are now "natural" on mobile devices: rotation, swipe, and zooming in and out may enhance the user experience and bring the game to a different level. Finally, the line controls should be overlaid on top of the main window instead of sitting on one side; in this way, the eye of the player does not have to move from one side of the screen to the other every time the line has to be adjusted.

⁶ See for example the gallery of this type of control realized with d3js: https://radmie.github.io/ng-knob/

5.4 Game Incentives

From the live interaction with the players during the game sessions, it was clear that one of the strongest motivations to replay the game was to have a ranking of the players with the scores obtained, so as to know whether friends/colleagues performed better or worse. At the time of the pilot experiments, we could give them hints about their performance compared to the other players, and just that information was enough to spark their sense of competition. Some of them were willing to play a second time just to beat their competitors. This is in line with the literature on gamification [12, 8]. In addition to the ranking of the scores of 'human' players, we want to introduce the comparison of scores between each player and a set of state-of-the-art classification algorithms trained on the exact same game. We want to see how strong an incentive it is for a human to know that his/her performance is better or worse than a computer's. We are also planning a set of virtual goods and badges to motivate the player during the game and after each session.

Since competitiveness is one of the main motivations that encourages users to play the game and reach high performance, it is very important to define a formal criterion to rate and rank players that takes into account both the F1 score and the resources spent. As described in Section 3.1, the goal of the game is multi-objective: the main task consists in defining, at the same time, a classifier which is effective, i.e. it reaches high values of accuracy and precision, and efficient, i.e. it uses a small amount of resources in terms of training and validation.

Let C = {c1, ..., ci, ..., cN} be the set of categories. We denote with si the F1 score obtained on the test set of the i-th category by the player, and with gi the F1 score obtained on the same category by the automatic algorithm, which we called the 'goal' score. We indicate with ti and vi the amount of resources spent by the user on training and validation documents, respectively, and with R the total amount of resources provided at the beginning of the game (1800 in our game). We define the user rating J as

\[ J = a \sum_{i=1}^{N} \frac{s_i}{g_i} + b \left( 1 - \frac{1}{R} \sum_{i=1}^{N} (t_i + v_i) \right) \tag{2} \]

where a and b are two parameters which range in the interval [0, 1]. These parameters represent the importance assigned to the two different tasks which define the game: a influences the importance of the effectiveness objective, while b determines the importance of the efficiency objective. For this preliminary version of the game, we chose a and b with equal value, a = b = 1, since we consider the efficiency and the effectiveness tasks of equivalent significance. Future work and further applications of this game may justify a preference for one of the two tasks, motivating the choice of different weights for a and b.
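Equation 2 translates directly into code; the following R sketch is a plain transcription with the default a = b = 1 (the argument values in the example are invented for illustration).

```r
# Direct transcription of the user rating J in Equation 2.
#   s: player F1 scores per category; g: 'goal' scores per category;
#   t, v: resources spent on training and validation per category;
#   R: total resources at the start of the game (1800 here).
user_rating <- function(s, g, t, v, R = 1800, a = 1, b = 1) {
  a * sum(s / g) + b * (1 - sum(t + v) / R)
}

# Hypothetical player: matches the goal score on all 10 categories and
# spends 100 resources per category: 10 + (1 - 1000/1800) ~ 10.44
user_rating(s = rep(0.8, 10), g = rep(0.8, 10), t = rep(50, 10), v = rep(50, 10))
```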
6 Final Remarks

In this first pilot study of gamification in machine learning, we set up a simple game, based on a visual interpretation of probabilistic classifiers, that consists in separating two sets of coloured points on a two-dimensional plane by means of a straight line. The 20 players who participated in this study already gave us important suggestions to improve both the game mechanics and the game controls. We believe that, with the right game scenario (plants or gold mines, for example), this game could easily be played by users without any knowledge about training/validation.

Moreover, the classification results of the game were very high given the small amount of labelled objects, and we also found a very promising relation with the results of the SVM trained on the same labelled dataset. We are currently studying a new version of the game with options that implement cross-validation, which would bring the machine learning aspect to a new, different level.

References

[1] Winston Chang. Shiny: Web Application Framework for R, 2015. R package version 0.11.

[2] Yu Chen and Pearl Pu. HealthyTogether: Exploring social incentives for mobile fitness applications. In Proceedings of the Second International Symposium of Chinese CHI, Chinese CHI '14, pages 25–34, New York, NY, USA, 2014. ACM.

[3] Giorgio Corani and Marco Zaffalon. Learning reliable classifiers from small or incomplete data sets: The naive credal classifier 2. J. Mach. Learn. Res., 9:581–621, June 2008.

[4] Sebastian Deterding, Dan Dixon, Rilla Khaled, and Lennart Nacke. From game design elements to gamefulness: Defining "gamification". In Proc. of the 15th International Academic MindTrek Conference: Envisioning Future Media Environments, MindTrek '11, pages 9–15, New York, NY, USA, 2011. ACM.

[5] Giorgio Maria Di Nunzio. Using scatterplots to understand and improve probabilistic models for text categorization and retrieval. Int. J. Approx. Reasoning, 50(7):945–956, 2009.

[6] Giorgio Maria Di Nunzio. A new decision to take for cost-sensitive naïve Bayes classifiers. Information Processing & Management, 50(5):653–674, 2014.

[7] Giorgio Maria Di Nunzio and Alessandro Sordoni. A visual tool for Bayesian data analysis: The impact of smoothing on naive Bayes text classifiers. In Proc. of the ACM SIGIR '12 Conference on Research and Development in Information Retrieval, Portland, OR, USA, August 12–16, 2012, page 1002, 2012.

[8] David Easley and Arpita Ghosh. Incentives, gamification, and game theory: An economic approach to badge design. In Proceedings of the Fourteenth ACM Conference on Electronic Commerce, EC '13, pages 359–376, New York, NY, USA, 2013. ACM.

[9] Carsten Eickhoff. Crowd-powered experts: Helping surgeons interpret breast cancer images. In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 53–56, New York, NY, USA, 2014. ACM.

[10] George Forman and Ira Cohen. Learning from little: Comparison of classifiers given little training. In Knowledge Discovery in Databases: PKDD 2004, Pisa, Italy, September 20–24, 2004, Proceedings, pages 161–172, 2004.

[11] Karën Fort, Bruno Guillaume, and Hadrien Chastant. Creating Zombilingo, a game with a purpose for dependency syntax annotation. In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 2–6, New York, NY, USA, 2014. ACM.

[12] Luca Galli, Piero Fraternali, and Alessandro Bozzon. On the application of game mechanics in information retrieval. In Proc. of the 1st Int. Workshop on Gamification for Information Retrieval, GamifIR '14, pages 7–11, New York, NY, USA, 2014. ACM.

[13] Chien-Ju Ho, Shahin Jabbari, and Jennifer Wortman Vaughan. Adaptive task assignment for crowdsourced classification. In ICML (1), volume 28 of JMLR Proceedings, pages 534–542. JMLR.org, 2013.

[14] Jose Luis Jurado, Alejandro Fernandez, and Cesar A. Collazos. Applying gamification in the context of knowledge management. In Proceedings of the 15th International Conference on Knowledge Technologies and Data-driven Business, i-KNOW '15, pages 43:1–43:4, New York, NY, USA, 2015. ACM.

[15] Karl M. Kapp. The Gamification of Learning and Instruction: Game-Based Methods and Strategies for Training and Education. John Wiley & Sons, 2012.

[16] Isabella Kotini and Sofia Tzelepi. Gamification in Education and Business, chapter A Gamification-Based Framework for Developing Learning Activities of Computational Thinking, pages 219–252. Springer Int. Publ., Cham, 2015.

[17] Mathias Lux, Mario Guggenberger, and Michael Riegler. PictureSort: Gamification of image ranking. In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 57–60, New York, NY, USA, 2014. ACM.

[18] Carlos Maltzahn, Arnav Jhala, Michael Mateas, and Jim Whitehead. Gamification of private digital data archive management. In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 33–37, New York, NY, USA, 2014. ACM.

[19] Burr Settles. Closing the loop: Fast, interactive semi-supervised annotation with queries on features and instances. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, EMNLP 2011, 27–31 July 2011, Edinburgh, UK, pages 1467–1478, 2011.
Work- ’14, pages 2–6, New York, NY, USA, 2014. ACM. shop on Gamification for Information Retrieval, GamifIR’14, pages 46–48, New York, NY, USA, 2014. ACM. [21] Rita Singh and Bhiksha Raj. Classification in likelihood spaces. Technometrics, 46(3):318–329, 2004. [22] Laurentiu Catalin Stanculescu, Alessandro Boz- zon, Robert-Jan Sips, and Geert-Jan Houben. Work and play: An experiment in enterprise gam- ification. In Proceedings of the 19th ACM Confer- ence on Computer-Supported Cooperative Work & Social Computing, CSCW ’16, pages 346–358, New York, NY, USA, 2016. ACM. [23] Jennifer Thom, David Millen, and Joan DiMicco. Removing gamification from an enterprise sns. In Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work, CSCW ’12, pages 1067–1070, New York, NY, USA, 2012. ACM. [24] Salud M. Jiménez Zafra, Giacomo Be- rardi, Andrea Esuli, Diego Marcheggiani, Maria Teresa Martı́n-Valdivia, and Alejan- dro Moreo Fernández. A multi-lingual annotated dataset for aspect-oriented opinion mining. In Proceedings of the 2015 Conference on Empir- ical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, September 17-21, 2015, pages 2533–2538, 2015.