=Paper=
{{Paper
|id=Vol-1749/paper21
|storemode=property
|title=Gamification for IR: The Query Aspects Game
|pdfUrl=https://ceur-ws.org/Vol-1749/paper21.pdf
|volume=Vol-1749
|authors=Giorgio Maria Di Nunzio,Maria Maistro,Daniel Zilio
|dblpUrl=https://dblp.org/rec/conf/clic-it/NunzioMZ16
}}
==Gamification for IR: The Query Aspects Game==
Giorgio Maria Di Nunzio, Maria Maistro, Daniel Zilio
Dept. of Information Engineering (DEI), University of Padua, Italy
Via Gradenigo 6/a, 35131
dinunzio@dei.unipd.it, maistro@dei.unipd.it, daniel.zilio.3@studenti.unipd.it

Abstract

English. The creation of a labelled dataset for Information Retrieval (IR) purposes is a costly process. For this reason, a mix of crowdsourcing and active learning approaches has been proposed in the literature in order to assess the relevance of the documents of a collection, given a particular query, at an affordable cost. In this paper, we present the design of the gamification of this interactive process, which draws inspiration from recent works in the area of gamification for IR. In particular, we focus on three main points: i) we want to create a set of relevance judgements with the least effort by human assessors, ii) we use interactive search interfaces that use game mechanics, iii) we use Natural Language Processing (NLP) to collect different aspects of a query.[1]

Italiano (translated). The creation of an experimental collection for Information Retrieval (IR) is a costly process, both economically and in terms of human effort. To reduce the costs of assigning relevance judgements to the documents of a collection, approaches that combine crowdsourcing and active learning techniques have been proposed. This paper presents an idea based on the use of gamification in IR for the semi-automatic assignment of relevance judgements. In particular, we focus on three main aspects: i) we want to create a collection such that the assignment of judgements by the assessors requires the least possible effort, ii) by means of an interface that uses game mechanics, iii) together with NLP techniques for query reformulation.

[1] This paper is partially an extended abstract of the paper "Gamification for Machine Learning: The Classification Game" presented at the GamifIR 2016 Workshop co-located with SIGIR 2016 (Di Nunzio et al., 2016).

1 Introduction

In Information Retrieval (IR), the performance of a system is evaluated using experimental test collections. These collections consist of a set of documents, a set of queries, and a set of relevance judgements, where each judgement states whether a document is relevant or not to each query. The creation of relevance judgements is a costly, time-consuming, and non-trivial task; for these reasons, interest in approaches that generate relevance judgements with the least amount of effort has increased in IR and related areas (e.g., supervised Machine Learning (ML) algorithms). In the last years, mixed approaches that use crowdsourcing (Ho et al., 2013) and active learning (Settles, 2011) have shown that it is possible to create annotated datasets at affordable costs. Specifically, crowdsourcing has been part of the IR toolbox as a cheap and fast mechanism to obtain labels for system evaluation. However, the successful deployment of crowdsourcing at scale involves the adjustment of many variables, a very important one being the number of assessors needed per task, as explained in (Abraham et al., 2016).

1.1 Search Diversification and Query Reformulation

The effectiveness of a search and the satisfaction of users can be enhanced by providing the various results of a search query in a certain order of relevance and concern. The technique used to avoid presenting similar, though relevant, results to the user is known as diversification of search results (Abid et al., 2016). While existing research in search diversification offers several solutions for introducing variety into the results, the majority of such work is based on the assumption that a single relevant document will fulfil a user's information need, making it inadequate for many informational queries. In (Welch et al., 2011), the authors propose a model that trades off a user's desire for multiple relevant documents, probabilistic information about an average user's interest in the subtopics of a multifaceted query, and the uncertainty in classifying documents into those subtopics.

Most information retrieval systems operate by performing a single retrieval in response to a query. Effective results sometimes require several manual reformulations by the user, or semi-automatic reformulations assisted by the system. Diaz presents an approach to automatic query reformulation which combines the iterated nature of human query reformulation with the automatic behavior of pseudo-relevance feedback (Diaz, 2016). In (Azzopardi, 2009), the author proposes a method for generating queries for ad-hoc topics to provide the necessary data for a comprehensive analysis of query performance. Bailey et al. explore the impact of the widely differing queries that searchers construct for the same information need description; by executing those queries, they demonstrate that query formulation is critical to query effectiveness (Bailey et al., 2015).

1.2 Gamification in IR

Gamification is defined as "the use of game design elements in non-game contexts" (Deterding et al., 2011), i.e. typical game elements are used for purposes different from their normal expected employment. Nowadays, gamification spreads through a wide range of disciplines, and its applications are implemented in many different aspects of scientific fields of study. For instance, gamification has been applied to learning activities (Kotini and Tzelepi, 2015; Kapp, 2012), business and enterprise (Jurado et al., 2015; Stanculescu et al., 2016; Thom et al., 2012), and medicine (Eickhoff, 2014; Chen and Pu, 2014).

IR has recently dealt with gamification, as witnessed by the Workshop on Gamification for Information Retrieval (GamifIR) in 2014, 2015 and 2016. In (Galli et al., 2014), the authors describe the fundamental elements and mechanics of a game and provide an overview of possible applications of gamification to the IR process. In (Shovman, 2014), approaches to properly gamify Web search are presented, i.e. making the search for information and the scanning of results a more enjoyable activity. Many proposals of games applied to different aspects of IR have been presented. For example, in (Maltzahn et al., 2014), the authors describe a game that turns document tagging into the activity of taking care of a garden, with the aim of managing private archives. A method to obtain a ranking of images by utilizing human computation through a gamified web application is proposed in (Lux et al., 2014). Fort et al. introduce a strategy to gamify the annotation of a French corpus (Fort et al., 2014).

In this paper, we want to apply game mechanics to the problem of relevance assessment with three goals in mind: i) we want to create a set of relevance judgements with the least effort by human assessors, ii) we use interactive search interfaces that use game mechanics, iii) we use NLP to collect different aspects of a query. In this context, we can define our web application as a Game with a Purpose (GWAP), that is, a game which presents some purpose, usually boring and dull for people, within an entertaining setting, in order to make it enjoyable and to solve problems with the aid of human computation. The design and the implementation of this interactive interface will be used as a post-hoc analysis of two Text REtrieval Conference (TREC, http://trec.nist.gov) 2016 tracks, namely the Total Recall Track and the Dynamic Domain Track. These two tracks are interesting for our problem since they both re-create a situation where we need to find the best set (or the total amount) of relevant documents with the minimum effort by the assessor who has to judge the documents proposed by the system given an information need.

2 Design of the Experiment

In this first pilot study, we will implement a simple game based on a visual interpretation of probabilistic classifiers (Di Nunzio, 2014; Di Nunzio, 2009; Di Nunzio and Sordoni, 2012). The game consists in separating two sets of colored points on a two-dimensional plane by means of a straight line, as shown in Figure 1. Despite its simplicity, this very abstract scenario received good feedback from primary school children who tested it during the European Researchers' Night at the University of Padua (http://www.venetonight.it/padova/). The next step will be to design and implement the game with real game development platforms such as, for example, Unity (https://unity3d.com) or Marmalade (https://www.madewithmarmalade.com/).

Figure 1: Layout of the original "classification game" that will be adapted to the "query aspects game".

2.1 The Classification Game

The 'original game' (Di Nunzio et al., 2016) is based on the two-dimensional representation of probabilities (Di Nunzio, 2014; Singh and Raj, 2004), which is a very intuitive way of presenting the problem of classification on a two-dimensional space. Given two classes c1 and c2, an object o is assigned to category c1 if the following inequality holds:

P(o|c2) < m P(o|c1) + q    (1)

where P(o|c1) and P(o|c2) are the likelihoods of the object o given the two categories, while m and q are two parameters that depend on the misclassification costs and can be assigned by the user to compensate for either an unbalanced class situation or different class costs.

If we interpret the two likelihoods as the two coordinates x = P(o|c1) and y = P(o|c2) of a two-dimensional space, the problem of classification can be studied on a two-dimensional plot. The classification decision is represented by the 'line' y = mx + q, which splits the plane into two parts; therefore, all the points that fall 'below' this line are classified as objects that belong to class c1 (see Figure 1 for an example). Without entering into the mathematical details of this approach (Di Nunzio, 2014), the basic idea of the game is that the players can adapt the two parameters m and q in order to optimize the separation of the points and, at the same time, can use their resources to improve the estimate of the two likelihoods by buying training data and/or add more points to the plot by buying validation data.

3 The Query Aspects Game

The classification game can be easily adjusted into a relevance assessment game if the two classes are 'relevant' and 'non-relevant' (we assume only binary relevance assessments for the moment). However, while in the classification game we already have a set of labelled documents and the goal is to find the optimal classifier, in this new game we need to find the relevant documents. To this purpose, we will follow the ideas of the works described in the following subsections: i) building assessments by varying the description of the information need, ii) using an interactive interface that suggests the amount of relevant information that has to be judged, iii) using NLP approaches to generate variations of a query.

3.1 Building Relevance Assessments With Query Aspects

In (Efron, 2009), the author presents a method for creating relevance judgements without explicit relevance assessments. The idea is to create different "aspects" of a query: given a query q and a set of documents D, the same information need that generated q could also generate other queries that focus on other aspects of the same need. A query aspect is an articulation of a searcher's information need which might be a re-elaboration (for example, a rephrasing, specification, or generalization) of the query. By generating several queries related to an information need and running each of these against our document collection, we can create a pool of results where each result set pertains to a particular aspect of the information need, with a limited human effort.

In practice, in order to build a set of relevance assessments for q, we generate a number of query aspects using a single IR system. Then, the union of the top k documents retrieved for each aspect constitutes a list of pseudo-relevance assessments for the query q.

3.2 An Interactive Interface to Generate Query Aspects

Building different aspects of the same information need is not an easy task. As explained in (Umemoto et al., 2016), searchers often cannot easily come up with effective queries for collecting documents that cover diverse aspects. In general, experts who have to search for relevant documents usually have to issue more queries to complete their tasks if search engines return few documents relevant to unexplored aspects. Moreover, quitting these tasks too early, without in-depth exploration, prevents searchers from finding essential information.

Umemoto et al. propose an interactive interface, named ScentBar, that helps searchers visualize the amount of missing information for both the search query and the suggested queries in the form of a stacked bar chart. The interface, a portion of which is shown in Figure 2, visualizes an estimate of the missing information for each aspect of a query that could be obtained by the searcher. When the user collects new information while browsing the results, the bars of the different query aspects change color to indicate the amount of effort that the system estimates is necessary to find most of the relevant information. The estimates of the effort required to complete a task are formalized as a set-wise metric where the gain for each aspect is represented by a conditional probability.

Figure 2: ScentBar and the visualization of missing information. Figure from (Umemoto et al., 2016).

3.3 Using NLP to Generate Query Aspects

The last part of the design of the query aspects game involves the use of natural language processing techniques to propose variations of a query that express the same information need. This problem has been studied for more than twenty years in IR. In (Strzalkowski et al., 1997), the authors discuss how the simplest word-based representations of content, while relatively better understood, are usually inadequate, since single words are rarely specific enough for accurate discrimination. Consequently, a better method is to identify groups of words that create meaningful phrases, especially if these phrases denote important concepts in the domain.

Some examples of advanced techniques for phrase extraction, including extended N-grams and syntactic parsing, attempt to uncover concepts which would capture underlying semantic uniformity across various surface forms of expression. Syntactic phrases, for example, appear to be reasonable indicators of content since they can adequately deal with word order changes and other structural variations. In the literature, there are examples of query reformulation using NLP approaches, for example for the modification and/or expansion of the thematic and geospatial parts that are usually recognized in a geographical query (Perea-Ortega et al., 2013), to support the refinement of a vague, non-technical initial query into a more precise problem description (Roulland et al., 2007), or to predict search satisfaction (Hassan et al., 2013).

4 Conclusions and Future Works

In this work, we presented the requirements of the design of an interactive interface that uses game mechanics together with NLP techniques to generate variations of an information need in order to label a collection of documents. Starting from the successful experience of the gamification of a machine learning problem (Di Nunzio et al., 2016), we are preparing a new pilot study of the 'query aspects game' that will be used to generate relevant documents for two TREC tracks: the Total Recall track and the Dynamic Domain track. The results of this study will be available at the end of November 2016 and can be presented and discussed at the workshop.

References

Adnan Abid, Naveed Hussain, Kamran Abid, Farooq Ahmad, Muhammad Shoaib Farooq, Uzma Farooq, Sher Afzal Khan, Yaser Daanial Khan, Muhammad Azhar Naeem, and Nabeel Sabir. 2016. A survey on search results diversification techniques. Neural Computing and Applications, 27(5):1207–1229.

Ittai Abraham, Omar Alonso, Vasilis Kandylas, Rajesh Patel, Steven Shelford, and Aleksandrs Slivkins. 2016. How many workers to ask? Adaptive exploration for collecting high quality labels. In Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '16, pages 473–482, New York, NY, USA. ACM.

Leif Azzopardi. 2009. Query side evaluation: An empirical analysis of effectiveness and effort. In Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '09, pages 556–563, New York, NY, USA. ACM.

Peter Bailey, Alistair Moffat, Falk Scholer, and Paul Thomas. 2015. User variability and IR system evaluation. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '15, pages 625–634, New York, NY, USA. ACM.

Yu Chen and Pearl Pu. 2014. HealthyTogether: Exploring social incentives for mobile fitness applications. In Proceedings of the Second International Symposium of Chinese CHI, Chinese CHI '14, pages 25–34, New York, NY, USA. ACM.

Sebastian Deterding, Dan Dixon, Rilla Khaled, and Lennart Nacke. 2011. From game design elements to gamefulness: Defining "gamification". In Proceedings of the 15th International Academic MindTrek Conference: Envisioning Future Media Environments, MindTrek '11, pages 9–15, New York, NY, USA. ACM.

Giorgio Maria Di Nunzio and Alessandro Sordoni. 2012. A visual tool for Bayesian data analysis: The impact of smoothing on naive Bayes text classifiers. In Proceedings of the ACM SIGIR '12 Conference on Research and Development in Information Retrieval, Portland, OR, USA, August 12–16, 2012, page 1002.

Giorgio Maria Di Nunzio, Maria Maistro, and Daniel Zilio. 2016. Gamification for machine learning: The classification game. In Proceedings of the Third International Workshop on Gamification for Information Retrieval (GamifIR 2016), co-located with the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016), Pisa, Italy, July 21, 2016, pages 45–52.

Giorgio Maria Di Nunzio. 2009. Using scatterplots to understand and improve probabilistic models for text categorization and retrieval. International Journal of Approximate Reasoning, 50(7):945–956.

Giorgio Maria Di Nunzio. 2014. A new decision to take for cost-sensitive naïve Bayes classifiers. Information Processing & Management, 50(5):653–674.

Fernando Diaz. 2016. Pseudo-query reformulation. Pages 521–532. Springer International Publishing, Cham.

Miles Efron. 2009. Using multiple query aspects to build test collections without human relevance judgments. In Proceedings of the 31st European Conference on IR Research on Advances in Information Retrieval, ECIR '09, pages 276–287, Berlin, Heidelberg. Springer-Verlag.

Carsten Eickhoff. 2014. Crowd-powered experts: Helping surgeons interpret breast cancer images. In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 53–56, New York, NY, USA. ACM.

Karën Fort, Bruno Guillaume, and Hadrien Chastant. 2014. Creating Zombilingo, a game with a purpose for dependency syntax annotation. In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 2–6, New York, NY, USA. ACM.

Luca Galli, Piero Fraternali, and Alessandro Bozzon. 2014. On the application of game mechanics in information retrieval. In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 7–11, New York, NY, USA. ACM.

Ahmed Hassan, Xiaolin Shi, Nick Craswell, and Bill Ramsey. 2013. Beyond clicks: Query reformulation as a predictor of search satisfaction. In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, CIKM '13, pages 2019–2028, New York, NY, USA. ACM.

Chien-Ju Ho, Shahin Jabbari, and Jennifer Wortman Vaughan. 2013. Adaptive task assignment for crowdsourced classification. In Proceedings of the 30th International Conference on Machine Learning, ICML 2013, volume 28 of JMLR Proceedings, pages 534–542. JMLR.org.

Jose Luis Jurado, Alejandro Fernandez, and Cesar A. Collazos. 2015. Applying gamification in the context of knowledge management. In Proceedings of the 15th International Conference on Knowledge Technologies and Data-driven Business, i-KNOW '15, pages 43:1–43:4, New York, NY, USA. ACM.

Karl M. Kapp. 2012. The Gamification of Learning and Instruction: Game-Based Methods and Strategies for Training and Education. John Wiley & Sons.

Isabella Kotini and Sofia Tzelepi. 2015. A gamification-based framework for developing learning activities of computational thinking. In Torsten Reiners and C. Lincoln Wood, editors, Gamification in Education and Business, pages 219–252. Springer International Publishing, Cham.

Mathias Lux, Mario Guggenberger, and Michael Riegler. 2014. PictureSort: Gamification of image ranking. In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 57–60, New York, NY, USA. ACM.

Carlos Maltzahn, Arnav Jhala, Michael Mateas, and Jim Whitehead. 2014. Gamification of private digital data archive management. In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 33–37, New York, NY, USA. ACM.

José M. Perea-Ortega, Miguel A. García-Cumbreras, and L. Alfonso Ureña López. 2013. Applying NLP techniques for query reformulation to information retrieval with geographical references. In Proceedings of the 2012 Pacific-Asia Conference on Emerging Trends in Knowledge Discovery and Data Mining, PAKDD '12, pages 57–69, Berlin, Heidelberg. Springer-Verlag.

Frédéric Roulland, Aaron Kaplan, Stefania Castellani, Claude Roux, Antonietta Grasso, Karin Pettersson, and Jacki O'Neill. 2007. Query reformulation and refinement using NLP-based sentence clustering. Pages 210–221. Springer Berlin Heidelberg, Berlin, Heidelberg.

Burr Settles. 2011. Closing the loop: Fast, interactive semi-supervised annotation with queries on features and instances. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, EMNLP 2011, Edinburgh, UK, pages 1467–1478.

Mark Shovman. 2014. The game of search: What is the fun in that? In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 46–48, New York, NY, USA. ACM.

Rita Singh and Bhiksha Raj. 2004. Classification in likelihood spaces. Technometrics, 46(3):318–329.

Laurentiu Catalin Stanculescu, Alessandro Bozzon, Robert-Jan Sips, and Geert-Jan Houben. 2016. Work and play: An experiment in enterprise gamification. In Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing, CSCW '16, pages 346–358, New York, NY, USA. ACM.

Tomek Strzalkowski, Fang Lin, Jose Perez-Carballo, and Jin Wang. 1997. Building effective queries in natural language information retrieval. In Proceedings of the Fifth Conference on Applied Natural Language Processing, ANLC '97, pages 299–306, Stroudsburg, PA, USA. Association for Computational Linguistics.

Jennifer Thom, David Millen, and Joan DiMicco. 2012. Removing gamification from an enterprise SNS. In Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work, CSCW '12, pages 1067–1070, New York, NY, USA. ACM.

Kazutoshi Umemoto, Takehiro Yamamoto, and Katsumi Tanaka. 2016. ScentBar: A query suggestion interface visualizing the amount of missed relevant information for intrinsically diverse search. In Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '16, pages 405–414, New York, NY, USA. ACM.

Michael J. Welch, Junghoo Cho, and Christopher Olston. 2011. Search result diversity for informational queries. In Proceedings of the 20th International Conference on World Wide Web, WWW '11, pages 237–246, New York, NY, USA. ACM.
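The decision rule of Equation 1 in Section 2.1 can be made concrete with a short sketch. This is only an illustration, not the authors' implementation: the toy point data, the fixed line parameters, and the `classify` and `accuracy` helpers are all assumed for the example.

```python
# Sketch of the decision rule of Equation 1: a point
# (x, y) = (P(o|c1), P(o|c2)) is assigned to class c1 when it
# falls below the line y = m*x + q chosen by the player.

def classify(x: float, y: float, m: float, q: float) -> str:
    """Assign class c1 if P(o|c2) < m * P(o|c1) + q, else c2."""
    return "c1" if y < m * x + q else "c2"

def accuracy(points, labels, m, q):
    """Fraction of labelled points that the player's line (m, q)
    separates correctly; the score a player tries to maximize."""
    hits = sum(classify(x, y, m, q) == lab
               for (x, y), lab in zip(points, labels))
    return hits / len(points)

# Toy example: two small clouds of points, one per class.
points = [(0.8, 0.1), (0.7, 0.3), (0.2, 0.9), (0.1, 0.6)]
labels = ["c1", "c1", "c2", "c2"]
print(accuracy(points, labels, m=1.0, q=0.0))  # line y = x
```

In the game, a player would repeatedly adjust `m` and `q` (tilting and shifting the line) and watch the score change, which is exactly the separation task shown in Figure 1.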
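The pooling step of Section 3.1 — taking the union of the top-k documents retrieved for each query aspect as pseudo-relevance assessments (after Efron, 2009) — can be sketched as follows. The `run_query` callable and the toy ranked lists are illustrative assumptions; any retrieval system returning a ranked list of document ids would play that role.

```python
# Sketch of the pooling step of Section 3.1: the union of the
# top-k documents retrieved for each aspect of a query forms the
# pseudo-relevance assessments for that query.

def pseudo_qrels(aspects, run_query, k=10):
    """Union of the top-k results over all aspects of one query."""
    pool = set()
    for aspect in aspects:
        pool.update(run_query(aspect)[:k])
    return pool

# Illustrative stand-in for an IR system: a tiny mapping from each
# aspect string to a ranked list of document ids.
ranked_lists = {
    "jaguar car": ["d1", "d2", "d3"],
    "jaguar automobile review": ["d2", "d4"],
    "jaguar vehicle price": ["d5", "d1"],
}
qrels = pseudo_qrels(ranked_lists.keys(), ranked_lists.get, k=2)
print(sorted(qrels))  # → ['d1', 'd2', 'd4', 'd5']
```

Note how documents retrieved for several aspects (here `d1` and `d2`) enter the pool only once; the resulting set is what the game would then ask players to confirm with minimal effort.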