                       Gamification for IR: The Query Aspects Game

Giorgio Maria Di Nunzio, Maria Maistro, Daniel Zilio
Dept. of Inf. Eng. (DEI), University of Padua, Italy
Via Gradenigo 6/a, 35131 Padua
dinunzio@dei.unipd.it, maistro@dei.unipd.it, daniel.zilio.3@studenti.unipd.it




Abstract

English. The creation of a labelled dataset for Information Retrieval (IR) purposes is a costly process. For this reason, a mix of crowdsourcing and active learning approaches has been proposed in the literature in order to assess the relevance of the documents of a collection, given a particular query, at an affordable cost. In this paper, we present the design of the gamification of this interactive process, which draws inspiration from recent works in the area of gamification for IR. In particular, we focus on three main points: i) we want to create a set of relevance judgements with the least effort by human assessors, ii) we use interactive search interfaces that employ game mechanics, iii) we use Natural Language Processing (NLP) to collect different aspects of a query.¹

¹ This paper is partially an extended abstract of the paper "Gamification for Machine Learning: The Classification Game" presented at the GamifIR 2016 Workshop co-located with SIGIR 2016 (Di Nunzio et al., 2016).

Italiano. The creation of an experimental collection for Information Retrieval (IR) is a costly process, both economically and in terms of human effort. In order to reduce the costs of assigning relevance judgements to the documents of a collection, approaches that combine crowdsourcing and active learning techniques have been proposed. This paper presents an idea based on the use of gamification in IR for the semi-automatic assignment of relevance judgements. In particular, we focus on three main aspects: i) we want to build the collection in such a way that the assignment of judgements by the assessors requires the least possible effort, ii) by means of an interface that uses game mechanics, iii) together with NLP techniques for query reformulation.

1 Introduction

In Information Retrieval (IR), the performance of a system is evaluated using experimental test collections. These collections consist of a set of documents, a set of queries, and a set of relevance judgements, where each judgement states whether a document is relevant or not to each query. The creation of relevance judgements is a costly, time-consuming, and non-trivial task; for these reasons, interest in approaches that generate relevance judgements with the least amount of effort has increased in IR and related areas (e.g., supervised Machine Learning (ML)). In recent years, mixed approaches that use crowdsourcing (Ho et al., 2013) and active learning (Settles, 2011) have shown that it is possible to create annotated datasets at affordable costs. Specifically, crowdsourcing has become part of the IR toolbox as a cheap and fast mechanism to obtain labels for system evaluation. However, the successful deployment of crowdsourcing at scale requires the adjustment of many variables, a very important one being the number of assessors needed per task, as explained in (Abraham et al., 2016).

1.1 Search Diversification and Query Reformulation

The effectiveness of a search and the satisfaction of users can be enhanced by providing varied results for a search query, ordered by relevance. The technique used
to avoid presenting similar, though relevant, results to the user is known as diversification of search results (Abid et al., 2016). While existing research in search diversification offers several solutions for introducing variety into the results, most of this work is based on the assumption that a single relevant document will fulfil a user's information need, which makes it inadequate for many informational queries. In (Welch et al., 2011), the authors propose a model that trades off a user's desire for multiple relevant documents, probabilistic information about an average user's interest in the subtopics of a multifaceted query, and the uncertainty in classifying documents into those subtopics.

Most information retrieval systems operate by performing a single retrieval in response to a query. Effective results sometimes require several manual reformulations by the user or semi-automatic reformulations assisted by the system. Diaz presents an approach to automatic query reformulation which combines the iterated nature of human query reformulation with the automatic behavior of pseudo-relevance feedback (Diaz, 2016). In (Azzopardi, 2009), the author proposes a method for generating queries for ad-hoc topics in order to provide the data needed for a comprehensive analysis of query performance. Bailey et al. explore the impact of the widely differing queries that searchers construct for the same information need description; by executing those queries, they demonstrate that query formulation is critical to query effectiveness (Bailey et al., 2015).

1.2 Gamification in IR

Gamification is defined as "the use of game design elements in non-game contexts" (Deterding et al., 2011), i.e. typical game elements are used for purposes different from their normal expected employment. Nowadays, gamification has spread through a wide range of disciplines, and its applications can be found in many different fields of study. For instance, gamification has been applied to learning activities (Kotini and Tzelepi, 2015; Kapp, 2012), business and enterprise (Jurado et al., 2015; Stanculescu et al., 2016; Thom et al., 2012), and medicine (Eickhoff, 2014; Chen and Pu, 2014).

IR has also recently dealt with gamification, as witnessed by the Workshop on Gamification for Information Retrieval (GamifIR) in 2014, 2015 and 2016. In (Galli et al., 2014), the authors describe the fundamental elements and mechanics of a game and provide an overview of possible applications of gamification to the IR process. In (Shovman, 2014), approaches to properly gamify Web search are presented, i.e. to make the search for information and the scanning of results a more enjoyable activity. Indeed, many games applied to different aspects of IR have been proposed. For example, in (Maltzahn et al., 2014) the authors describe a game that turns document tagging into the activity of taking care of a garden, with the aim of managing private archives. A method to obtain rankings of images through human computation in a gamified web application is proposed in (Lux et al., 2014). Fort et al. introduce a strategy to gamify the annotation of a French corpus (Fort et al., 2014).

In this paper, we want to apply game mechanics to the problem of relevance assessment with three goals in mind: i) we want to create a set of relevance judgements with the least effort by human assessors, ii) we use interactive search interfaces that employ game mechanics, iii) we use NLP to collect different aspects of a query. In this context, we can define our web application as a Game With a Purpose (GWAP), that is, a game that wraps tasks which would otherwise be boring and dull for people in an entertaining setting, in order to make them enjoyable and to solve problems with the aid of human computation. The design and the implementation of this interactive interface will be used for a post-hoc analysis of two Text REtrieval Conference (TREC)² 2016 tracks, namely the Total Recall Track and the Dynamic Domain Track. These two tracks are interesting for our problem since they both re-create a situation where we need to find the best set (or the total amount) of relevant documents with the minimum effort by the assessor who has to judge the documents proposed by the system for a given information need.

² http://trec.nist.gov

2 Design of the Experiment

In this first pilot study, we will implement a simple game based on a visual interpretation of probabilistic classifiers (Di Nunzio, 2014; Di Nunzio, 2009; Di Nunzio and Sordoni, 2012). The game consists in separating two sets of colored points on a two-dimensional plane by means of a straight line, as shown in Figure 1. Despite its simplicity,
this very abstract scenario received good feedback from the primary school children who tested it during the European Researchers' Night at the University of Padua³. The next step will be to design and implement the game with real game development platforms such as Unity⁴ or Marmalade⁵.

³ http://www.venetonight.it/padova/
⁴ https://unity3d.com
⁵ https://www.madewithmarmalade.com/

2.1 The Classification Game

The 'original game' (Di Nunzio et al., 2016) is based on the two-dimensional representation of probabilities (Di Nunzio, 2014; Singh and Raj, 2004), which is a very intuitive way of presenting the problem of classification on a two-dimensional space. Given two classes c1 and c2, an object o is assigned to category c1 if the following inequality holds:

    P(o|c2) < m P(o|c1) + q        (1)

where P(o|c1) and P(o|c2) are the likelihoods of the object o given the two categories, while m and q are two parameters that depend on the misclassification costs and can be set by the user to compensate for either unbalanced classes or different class costs.

If we interpret the two likelihoods as the coordinates x = P(o|c1) and y = P(o|c2) of a two-dimensional space, the problem of classification can be studied on a two-dimensional plot. The classification decision is represented by the 'line' y = mx + q, which splits the plane into two parts: all the points that fall 'below' this line are classified as objects that belong to class c1 (see Figure 1 for an example). Without entering into the mathematical details of this approach (Di Nunzio, 2014), the basic idea of the game is that players can adjust the two parameters m and q in order to optimize the separation of the points and, at the same time, can spend their resources to improve the estimates of the two likelihoods by buying training data and/or to add more points to the plot by buying validation data.
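To make the mechanics concrete, the following minimal sketch (in Python; the function names and the toy points are our own illustration, not the actual game code) implements the decision rule of Equation (1) and the score that a player improves by moving the line:

```python
def classify(x, y, m, q):
    """Decision rule of Equation (1): the point (x, y) =
    (P(o|c1), P(o|c2)) is assigned to class c1 when it falls
    'below' the player's line y = m*x + q, and to c2 otherwise."""
    return "c1" if y < m * x + q else "c2"


def score(points, m, q):
    """Fraction of labelled points separated correctly by the
    current line: the quantity the player tries to maximize by
    adjusting the slope m and the intercept q."""
    hits = sum(1 for x, y, label in points if classify(x, y, m, q) == label)
    return hits / len(points)


# Two toy points, one per class; the line y = x separates them.
points = [(0.8, 0.2, "c1"), (0.1, 0.6, "c2")]
print(score(points, m=1.0, q=0.0))  # -> 1.0
```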
                                                       come up with effective queries for collecting doc-
3 The Query Aspects Game

The classification game can easily be turned into a relevance assessment game if the two classes are 'relevant' and 'non-relevant' (we assume binary relevance assessments for the moment). However, while in the classification game we already have a set of labelled documents and the goal is to find the optimal classifier, in this new game we need to find the relevant documents. To this purpose, we will follow the ideas of the works described in the following subsections: i) building assessments by varying the description of the information need, ii) using an interactive interface that suggests the amount of relevant information that still has to be judged, iii) using NLP approaches to generate variations of a query.

3.1 Building Relevance Assessments With Query Aspects

In (Efron, 2009), the author presents a method for creating relevance judgements without explicit relevance assessments. The idea is to create different "aspects" of a query: given a query q and a set of documents D, the same information need that generated q could also generate other queries that focus on other aspects of the same need. A query aspect is an articulation of a searcher's information need which might be a re-elaboration (for example, a rephrasing, a specification, or a generalization) of the query. By generating several queries related to an information need and running each of them against our document collection, we can create, with limited human effort, a pool of results where each result set pertains to a particular aspect of the information need.

In practice, in order to build a set of relevance assessments for q, we generate a number of query aspects using a single IR system. Then, the union of the top k documents retrieved for each aspect constitutes a list of pseudo-relevance assessments for the query q, as illustrated in the sketch below.
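A minimal sketch of this pooling step (Python), assuming a generic `search` function that returns a ranked list of document identifiers; the function and the toy example are our assumptions, not part of (Efron, 2009):

```python
def pseudo_qrels(aspects, search, k=100):
    """Pseudo-relevance assessments for a query q: run every
    aspect of q against the same IR system and pool the union
    of the top-k documents retrieved for each aspect."""
    pool = set()
    for aspect in aspects:
        pool.update(search(aspect)[:k])  # top-k doc ids for this aspect
    return pool


# Usage with a toy 'search engine' (any function that maps a
# query string to a ranked list of document identifiers works).
def search(query):
    index = {"jaguar car": ["d1", "d2"], "jaguar xk price": ["d2", "d3"]}
    return index.get(query, [])

print(pseudo_qrels(["jaguar car", "jaguar xk price"], search, k=2))
# -> {'d1', 'd2', 'd3'}
```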
Figure 1: Layout of the original "classification game" that will be adapted to the "query aspects game".

3.2 An Interactive Interface to Generate Query Aspects

Building different aspects of the same information need is not an easy task. As explained in (Umemoto et al., 2016), searchers often cannot easily come up with effective queries for collecting documents that cover diverse aspects. In general, experts who have to search for relevant documents usually have to issue more queries to complete their tasks if search engines return few documents relevant to unexplored aspects. Moreover, quitting the task too early, without an in-depth exploration, prevents searchers from finding essential information.

Figure 2: ScentBar and the visualization of missing information. Figure from (Umemoto et al., 2016).

Umemoto et al. propose an interactive interface, named ScentBar, that helps searchers visualize the amount of missing information for both the search query and the suggested queries in the form of a stacked bar chart. The interface, a portion of which is shown in Figure 2, visualizes an estimate of the missing information that the searcher could still obtain for each aspect of a query. When the user collects new information while browsing the results, the bars of the different query aspects change color to indicate the amount of effort that the system estimates is still necessary to find most of the relevant information. The estimate of the effort required to complete a task is formalized as a set-wise metric where the gain of each aspect is represented by a conditional probability.
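We do not reproduce the exact formalization of (Umemoto et al., 2016) here; the sketch below only illustrates the flavor of such a set-wise estimate, under our own simplifying assumption that the residual gain of an aspect is the conditional probability mass of the relevant documents the user has not examined yet:

```python
def residual_gain(p_rel, examined):
    """Set-wise estimate of the information still missing for one
    aspect: the summed conditional probability P(rel | d, aspect)
    over the documents the searcher has not examined yet.
    NOTE: a simplified stand-in, not the metric of Umemoto et al."""
    return sum(p for doc, p in p_rel.items() if doc not in examined)


# As the user browses results, 'examined' grows and the bar of
# the corresponding aspect shrinks accordingly.
aspect = {"d1": 0.9, "d2": 0.4, "d3": 0.1}
print(residual_gain(aspect, examined={"d1"}))  # -> 0.5
```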
3.3 Using NLP to Generate Query Aspects

The last part of the design of the query aspects game involves the use of natural language processing techniques to propose variations of a query that express the same information need. This problem has been studied for more than twenty years in IR. In (Strzalkowski et al., 1997), the authors discuss how the simplest word-based representations of content, while relatively better understood, are usually inadequate, since single words are rarely specific enough for accurate discrimination. Consequently, a better method is to identify groups of words that form meaningful phrases, especially if these phrases denote important concepts in the domain.

Some advanced phrase-extraction techniques, including extended N-grams and syntactic parsing, attempt to uncover concepts that capture the underlying semantic uniformity across various surface forms of expression. Syntactic phrases, for example, appear to be reasonable indicators of content, since they can adequately deal with word order changes and other structural variations. In the literature, there are examples of query reformulation based on NLP approaches, for example to modify and/or expand both the thematic and the geospatial parts that are usually recognized in a geographical query (Perea-Ortega et al., 2013), to support the refinement of a vague, non-technical initial query into a more precise problem description (Roulland et al., 2007), or to predict search satisfaction (Hassan et al., 2013).
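As a toy illustration of this idea, the following sketch generates candidate aspects by expanding the original query with frequent word bigrams extracted from a few top-ranked documents; it is our own minimal stand-in, in the spirit of pseudo-relevance feedback, and not an implementation of the techniques cited above:

```python
import re
from collections import Counter


def aspect_candidates(query, top_docs, n=5):
    """Propose query variations by expanding the original query with
    the most frequent word bigrams ('phrases') observed in a few
    top-ranked documents."""
    bigrams = Counter()
    for text in top_docs:
        tokens = re.findall(r"[a-z]+", text.lower())
        bigrams.update(zip(tokens, tokens[1:]))
    return [f"{query} {w1} {w2}" for (w1, w2), _ in bigrams.most_common(n)]


docs = ["total recall of relevant documents",
        "high recall retrieval of relevant documents"]
print(aspect_candidates("total recall", docs, n=2))
# -> ['total recall of relevant', 'total recall relevant documents']
```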
4 Conclusions and Future Works

In this work, we presented the requirements of the design of an interactive interface that uses game mechanics together with NLP techniques to generate variations of an information need in order to label a collection of documents. Starting from the successful experience of the gamification of a machine learning problem (Di Nunzio et al., 2016), we are preparing a new pilot study of the 'query aspects game' that will be used to generate relevant documents for two TREC tracks: the Total Recall track and the Dynamic Domain track. The results of this study will be available at the end of November 2016 and can be presented and discussed at the workshop.

References

Adnan Abid, Naveed Hussain, Kamran Abid, Farooq Ahmad, Muhammad Shoaib Farooq, Uzma Farooq, Sher Afzal Khan, Yaser Daanial Khan, Muhammad Azhar Naeem, and Nabeel Sabir. 2016. A survey on search results diversification techniques. Neural Computing and Applications, 27(5):1207–1229.

Ittai Abraham, Omar Alonso, Vasilis Kandylas, Rajesh Patel, Steven Shelford, and Aleksandrs Slivkins. 2016. How many workers to ask? Adaptive exploration for collecting high quality labels. In Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '16, pages 473–482, New York, NY, USA. ACM.

Leif Azzopardi. 2009. Query side evaluation: An empirical analysis of effectiveness and effort. In Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '09, pages 556–563, New York, NY, USA. ACM.

Peter Bailey, Alistair Moffat, Falk Scholer, and Paul Thomas. 2015. User variability and IR system evaluation. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '15, pages 625–634, New York, NY, USA. ACM.

Yu Chen and Pearl Pu. 2014. HealthyTogether: Exploring social incentives for mobile fitness applications. In Proceedings of the Second International Symposium of Chinese CHI, Chinese CHI '14, pages 25–34, New York, NY, USA. ACM.

Sebastian Deterding, Dan Dixon, Rilla Khaled, and Lennart Nacke. 2011. From Game Design Elements to Gamefulness: Defining "Gamification". In Proceedings of the 15th International Academic MindTrek Conference: Envisioning Future Media Environments, MindTrek '11, pages 9–15, New York, NY, USA. ACM.

Giorgio Maria Di Nunzio and Alessandro Sordoni. 2012. A Visual Tool for Bayesian Data Analysis: The Impact of Smoothing on Naive Bayes Text Classifiers. In Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '12, Portland, OR, USA, August 12–16, 2012, page 1002.

Giorgio Maria Di Nunzio, Maria Maistro, and Daniel Zilio. 2016. Gamification for machine learning: The classification game. In Proceedings of the Third International Workshop on Gamification for Information Retrieval (GamifIR 2016), co-located with the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016), Pisa, Italy, July 21, 2016, pages 45–52.

Giorgio Maria Di Nunzio. 2009. Using Scatterplots to Understand and Improve Probabilistic Models for Text Categorization and Retrieval. International Journal of Approximate Reasoning, 50(7):945–956.

Giorgio Maria Di Nunzio. 2014. A New Decision to Take for Cost-Sensitive Naïve Bayes Classifiers. Information Processing & Management, 50(5):653–674.

Fernando Diaz. 2016. Pseudo-Query Reformulation, pages 521–532. Springer International Publishing, Cham.

Miles Efron. 2009. Using multiple query aspects to build test collections without human relevance judgments. In Proceedings of the 31st European Conference on IR Research on Advances in Information Retrieval, ECIR '09, pages 276–287, Berlin, Heidelberg. Springer-Verlag.

Carsten Eickhoff. 2014. Crowd-powered experts: Helping surgeons interpret breast cancer images. In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 53–56, New York, NY, USA. ACM.

Karën Fort, Bruno Guillaume, and Hadrien Chastant. 2014. Creating Zombilingo, a game with a purpose for dependency syntax annotation. In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 2–6, New York, NY, USA. ACM.

Luca Galli, Piero Fraternali, and Alessandro Bozzon. 2014. On the Application of Game Mechanics in Information Retrieval. In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 7–11, New York, NY, USA. ACM.

Ahmed Hassan, Xiaolin Shi, Nick Craswell, and Bill Ramsey. 2013. Beyond clicks: Query reformulation as a predictor of search satisfaction. In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, CIKM '13, pages 2019–2028, New York, NY, USA. ACM.
Chien-Ju Ho, Shahin Jabbari, and Jennifer Wortman Vaughan. 2013. Adaptive task assignment for crowdsourced classification. In Proceedings of the 30th International Conference on Machine Learning, ICML 2013, volume 28 of JMLR Proceedings, pages 534–542. JMLR.org.

Jose Luis Jurado, Alejandro Fernandez, and Cesar A. Collazos. 2015. Applying gamification in the context of knowledge management. In Proceedings of the 15th International Conference on Knowledge Technologies and Data-driven Business, i-KNOW '15, pages 43:1–43:4, New York, NY, USA. ACM.

Karl M. Kapp. 2012. The Gamification of Learning and Instruction: Game-based Methods and Strategies for Training and Education. John Wiley & Sons.

Isabella Kotini and Sofia Tzelepi. 2015. A Gamification-Based Framework for Developing Learning Activities of Computational Thinking. In Torsten Reiners and Lincoln C. Wood, editors, Gamification in Education and Business, pages 219–252. Springer International Publishing, Cham.

Mathias Lux, Mario Guggenberger, and Michael Riegler. 2014. PictureSort: Gamification of image ranking. In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 57–60, New York, NY, USA. ACM.

Carlos Maltzahn, Arnav Jhala, Michael Mateas, and Jim Whitehead. 2014. Gamification of private digital data archive management. In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 33–37, New York, NY, USA. ACM.

José M. Perea-Ortega, Miguel A. García-Cumbreras, and L. Alfonso Ureña-López. 2013. Applying NLP techniques for query reformulation to information retrieval with geographical references. In Proceedings of the 2012 Pacific-Asia Conference on Emerging Trends in Knowledge Discovery and Data Mining, PAKDD '12, pages 57–69, Berlin, Heidelberg. Springer-Verlag.

Frédéric Roulland, Aaron Kaplan, Stefania Castellani, Claude Roux, Antonietta Grasso, Karin Pettersson, and Jacki O'Neill. 2007. Query Reformulation and Refinement Using NLP-Based Sentence Clustering, pages 210–221. Springer Berlin Heidelberg, Berlin, Heidelberg.

Burr Settles. 2011. Closing the loop: Fast, interactive semi-supervised annotation with queries on features and instances. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, EMNLP 2011, Edinburgh, UK, July 27–31, 2011, pages 1467–1478.

Mark Shovman. 2014. The Game of Search: What is the Fun in That? In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 46–48, New York, NY, USA. ACM.

Rita Singh and Bhiksha Raj. 2004. Classification in likelihood spaces. Technometrics, 46(3):318–329.

Laurentiu Catalin Stanculescu, Alessandro Bozzon, Robert-Jan Sips, and Geert-Jan Houben. 2016. Work and play: An experiment in enterprise gamification. In Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing, CSCW '16, pages 346–358, New York, NY, USA. ACM.

Tomek Strzalkowski, Fang Lin, Jose Perez-Carballo, and Jin Wang. 1997. Building effective queries in natural language information retrieval. In Proceedings of the Fifth Conference on Applied Natural Language Processing, ANLC '97, pages 299–306, Stroudsburg, PA, USA. Association for Computational Linguistics.

Jennifer Thom, David Millen, and Joan DiMicco. 2012. Removing gamification from an enterprise SNS. In Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work, CSCW '12, pages 1067–1070, New York, NY, USA. ACM.

Kazutoshi Umemoto, Takehiro Yamamoto, and Katsumi Tanaka. 2016. ScentBar: A query suggestion interface visualizing the amount of missed relevant information for intrinsically diverse search. In Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '16, pages 405–414, New York, NY, USA. ACM.

Michael J. Welch, Junghoo Cho, and Christopher Olston. 2011. Search result diversity for informational queries. In Proceedings of the 20th International Conference on World Wide Web, WWW '11, pages 237–246, New York, NY, USA. ACM.