=Paper=
{{Paper
|id=Vol-1749/paper21
|storemode=property
|title=Gamification for IR: The Query Aspects Game
|pdfUrl=https://ceur-ws.org/Vol-1749/paper21.pdf
|volume=Vol-1749
|authors=Giorgio Maria Di Nunzio,Maria Maistro,Daniel Zilio
|dblpUrl=https://dblp.org/rec/conf/clic-it/NunzioMZ16
}}
==Gamification for IR: The Query Aspects Game==
Giorgio Maria Di Nunzio, Maria Maistro, Daniel Zilio
Dept. of Information Engineering (DEI), University of Padua, Italy
Via Gradenigo 6/a, 35131
dinunzio@dei.unipd.it, maistro@dei.unipd.it, daniel.zilio.3@studenti.unipd.it

Abstract

English. The creation of a labelled dataset for Information Retrieval (IR) purposes is a costly process. For this reason, a mix of crowdsourcing and active learning approaches has been proposed in the literature in order to assess the relevance of the documents of a collection, given a particular query, at an affordable cost. In this paper, we present the design of the gamification of this interactive process, which draws inspiration from recent works in the area of gamification for IR. In particular, we focus on three main points: i) we want to create a set of relevance judgements with the least effort by human assessors, ii) we use interactive search interfaces that use game mechanics, iii) we use Natural Language Processing (NLP) to collect different aspects of a query.[1]

Italiano (translated). The creation of an experimental collection for Information Retrieval (IR) is a costly process, both economically and in terms of human effort. To reduce the costs of assigning relevance judgements to the documents of a collection, approaches that combine crowdsourcing and active learning techniques have been proposed. This paper presents an idea based on the use of gamification in IR for the semi-automatic assignment of relevance judgements. In particular, we focus on three main aspects: i) we want to create a collection such that the assignment of judgements by the assessors requires the least possible effort, ii) by means of an interface that uses game mechanics, iii) together with NLP techniques for query reformulation.

[1] This paper is partially an extended abstract of the paper "Gamification for Machine Learning: The Classification Game" presented at the GamifIR 2016 Workshop co-located with SIGIR 2016 (Di Nunzio et al., 2016).

1 Introduction

In Information Retrieval (IR), the performance of a system is evaluated using experimental test collections. These collections consist of a set of documents, a set of queries, and a set of relevance judgements, where each judgement states whether a document is relevant or not to each query. The creation of relevance judgements is a costly, time-consuming, and non-trivial task; for these reasons, interest in approaches that generate relevance judgements with the least amount of effort has increased in IR and related areas (e.g., supervised Machine Learning (ML) algorithms). In the last years, mixed approaches that use crowdsourcing (Ho et al., 2013) and active learning (Settles, 2011) have shown that it is possible to create annotated datasets at affordable costs. Specifically, crowdsourcing has been part of the IR toolbox as a cheap and fast mechanism to obtain labels for system evaluation. However, the successful deployment of crowdsourcing at scale involves the adjustment of many variables, a very important one being the number of assessors needed per task, as explained in (Abraham et al., 2016).

1.1 Search Diversification and Query Reformulation

The effectiveness of a search and the satisfaction of users can be enhanced by providing the various results of a search query in a certain order of relevance and concern. The technique used to avoid presenting similar, though relevant, results to the user is known as diversification of search results (Abid et al., 2016). While existing research in search diversification offers several solutions for introducing variety into the results, the majority of such work is based on the assumption that a single relevant document will fulfil a user's information need, making it inadequate for many informational queries. In (Welch et al., 2011), the authors propose a model that trades off a user's desire for multiple relevant documents, probabilistic information about an average user's interest in the subtopics of a multifaceted query, and the uncertainty in classifying documents into those subtopics.

Most information retrieval systems operate by performing a single retrieval in response to a query. Effective results sometimes require several manual reformulations by the user, or semi-automatic reformulations assisted by the system. Diaz presents an approach to automatic query reformulation which combines the iterated nature of human query reformulation with the automatic behavior of pseudo-relevance feedback (Diaz, 2016). In (Azzopardi, 2009), the author proposes a method for generating queries for ad-hoc topics to provide the necessary data for a comprehensive analysis of query performance. Bailey et al. explore the impact of the widely differing queries that searchers construct for the same information need description; by executing those queries, they demonstrate that query formulation is critical to query effectiveness (Bailey et al., 2015).

1.2 Gamification in IR

Gamification is defined as "the use of game design elements in non-game contexts" (Deterding et al., 2011), i.e. typical game elements are used for purposes different from their normal expected employment. Nowadays, gamification spreads through a wide range of disciplines, and its applications are implemented in many different aspects of scientific fields of study. For instance, gamification has been applied to learning activities (Kotini and Tzelepi, 2015; Kapp, 2012), business and enterprise (Jurado et al., 2015; Stanculescu et al., 2016; Thom et al., 2012), and medicine (Eickhoff, 2014; Chen and Pu, 2014).

IR has recently dealt with gamification, as witnessed by the Workshop on Gamification for Information Retrieval (GamifIR) in 2014, 2015 and 2016. In (Galli et al., 2014), the authors describe the fundamental elements and mechanics of a game and provide an overview of possible applications of gamification to the IR process. In (Shovman, 2014), approaches to properly gamify Web search are presented, i.e. making the search for information and the scanning of results a more enjoyable activity. Many proposals of games applied to different aspects of IR have been presented. For example, in (Maltzahn et al., 2014), the authors describe a game that turns document tagging into the activity of taking care of a garden, with the aim of managing private archives. A method to obtain a ranking of images by utilizing human computation through a gamified web application is proposed in (Lux et al., 2014). Fort et al. introduce a strategy to gamify the annotation of a French corpus (Fort et al., 2014).

In this paper, we want to apply game mechanics to the problem of relevance assessment with three goals in mind: i) we want to create a set of relevance judgements with the least effort by human assessors, ii) we use interactive search interfaces that use game mechanics, iii) we use NLP to collect different aspects of a query. In this context, we can define our web application as a Game with a Purpose (GWAP), that is, a game which presents some purpose, usually boring and dull for people, within an entertaining setting, in order to make it enjoyable and to solve problems with the aid of human computation. The design and the implementation of this interactive interface will be used as a post-hoc analysis of two Text REtrieval Conference (TREC, http://trec.nist.gov) 2016 tracks, namely the Total Recall Track and the Dynamic Domain Track. These two tracks are interesting for our problem since they both re-create a situation where we need to find the best set (or the total amount) of relevant documents with the minimum effort by the assessor who has to judge the documents proposed by the system given an information need.

2 Design of the Experiment

In this first pilot study, we will implement a simple game based on a visual interpretation of probabilistic classifiers (Di Nunzio, 2014; Di Nunzio, 2009; Di Nunzio and Sordoni, 2012). The game consists in separating two sets of colored points on a two-dimensional plane by means of a straight line, as shown in Figure 1. Despite its simplicity, this very abstract scenario received good feedback from primary school children who tested it during the European Researchers' Night at the University of Padua (http://www.venetonight.it/padova/). The next step will be to design and implement the game with real game development platforms such as, for example, Unity (https://unity3d.com) or Marmalade (https://www.madewithmarmalade.com/).

Figure 1: Layout of the original "classification game" that will be adapted to the "query aspects game".

2.1 The Classification Game

The 'original game' (Di Nunzio et al., 2016) is based on the two-dimensional representation of probabilities (Di Nunzio, 2014; Singh and Raj, 2004), which is a very intuitive way of presenting the problem of classification on a two-dimensional space. Given two classes c1 and c2, an object o is assigned to category c1 if the following inequality holds:

P(o|c2) < m P(o|c1) + q    (1)

where P(o|c1) and P(o|c2) are the likelihoods of the object o given the two categories, while m and q are two parameters that depend on the misclassification costs and can be assigned by the user to compensate for either an unbalanced class situation or different class costs.

If we interpret the two likelihoods as the two coordinates x = P(o|c1) and y = P(o|c2) of a two-dimensional space, the problem of classification can be studied on a two-dimensional plot. The classification decision is represented by the 'line' y = mx + q, which splits the plane into two parts; therefore, all the points that fall 'below' this line are classified as objects that belong to class c1 (see Figure 1 for an example). Without entering into the mathematical details of this approach (Di Nunzio, 2014), the basic idea of the game is that the players can adapt the two parameters m and q in order to optimize the separation of the points and, at the same time, can use their resources to improve the estimate of the two likelihoods by buying training data and/or add more points to the plot by buying validation data.

3 The Query Aspects Game

The classification game can be easily adjusted into a relevance assessment game if the two classes are 'relevant' and 'non-relevant' (we assume only binary relevance assessments for the moment). However, while in the classification game we already have a set of labelled documents and the goal is to find the optimal classifier, in this new game we need to find the relevant documents. To this purpose, we will follow the ideas of the works described in the following subsections: i) building assessments by varying the description of the information need, ii) using an interactive interface that suggests the amount of relevant information that has to be judged, iii) using NLP approaches to generate variations of a query.

3.1 Building Relevance Assessments With Query Aspects

In (Efron, 2009), the author presents a method for creating relevance judgements without explicit relevance assessments. The idea is to create different "aspects" of a query: given a query q and a set of documents D, the same information need that generated q could also generate other queries that focus on other aspects of the same need. A query aspect is an articulation of a searcher's information need which might be a re-elaboration (for example, a rephrasing, specification, or generalization) of the query. By generating several queries related to an information need and running each of these against our document collection, we can create a pool of results where each result set pertains to a particular aspect of the information need, with a limited human effort.

In practice, in order to build a set of relevance assessments for q, we generate a number of query aspects using a single IR system. Then, the union of the top k documents retrieved for each aspect constitutes a list of pseudo-relevance assessments for the query q.

3.2 An Interactive Interface to Generate Query Aspects

Building different aspects of the same information need is not an easy task. As explained in (Umemoto et al., 2016), searchers often cannot easily come up with effective queries for collecting documents that cover diverse aspects. In general, experts who have to search for relevant documents usually have to issue more queries to complete their tasks if search engines return few documents relevant to unexplored aspects. Moreover, quitting these tasks too early, without in-depth exploration, prevents searchers from finding essential information.

Umemoto et al. propose an interactive interface, named ScentBar, that helps searchers visualize the amount of missing information for both the search query and the suggested queries in the form of a stacked bar chart. The interface, a portion of which is shown in Figure 2, visualizes an estimate of the missing information for each aspect of a query that could be obtained by the searcher. When the user collects new information while browsing the results, the bars of the different query aspects change color to indicate the amount of effort that the system estimates is necessary to find most of the relevant information. The estimates of the effort required to complete a task are formalized as a set-wise metric where the gain for each aspect is represented by a conditional probability.

Figure 2: ScentBar and the visualization of missing information. Figure from (Umemoto et al., 2016).

3.3 Using NLP to Generate Query Aspects

The last part of the design of the query aspects game involves the use of natural language processing techniques to propose variations of a query that express the same information need. This problem has been studied for more than twenty years in IR. In (Strzalkowski et al., 1997), the authors discuss how the simplest word-based representations of content, while relatively better understood, are usually inadequate, since single words are rarely specific enough for accurate discrimination. Consequently, a better method is to identify groups of words that create meaningful phrases, especially if these phrases denote important concepts in the domain.

Some examples of advanced techniques for phrase extraction, including extended N-grams and syntactic parsing, attempt to uncover concepts which would capture underlying semantic uniformity across various surface forms of expression. Syntactic phrases, for example, appear to be reasonable indicators of content since they can adequately deal with word order changes and other structural variations. In the literature, there are examples of query reformulation using NLP approaches, for example for the modification and/or expansion of the thematic and geospatial parts that are usually recognized in a geographical query (Perea-Ortega et al., 2013), to support the refinement of a vague, non-technical initial query into a more precise problem description (Roulland et al., 2007), or to predict search satisfaction (Hassan et al., 2013).

4 Conclusions and Future Works

In this work, we presented the requirements of the design of an interactive interface that uses game mechanics together with NLP techniques to generate variations of an information need in order to label a collection of documents. Starting from the successful experience of the gamification of a machine learning problem (Di Nunzio et al., 2016), we are preparing a new pilot study of the 'query aspects game' that will be used to generate relevant documents for two TREC tracks: the Total Recall track and the Dynamic Domain track. The results of this study will be available at the end of November 2016 and can be presented and discussed at the workshop.

References

Adnan Abid, Naveed Hussain, Kamran Abid, Farooq Ahmad, Muhammad Shoaib Farooq, Uzma Farooq, Sher Afzal Khan, Yaser Daanial Khan, Muhammad Azhar Naeem, and Nabeel Sabir. 2016. A survey on search results diversification techniques. Neural Computing and Applications, 27(5):1207–1229.

Ittai Abraham, Omar Alonso, Vasilis Kandylas, Rajesh Patel, Steven Shelford, and Aleksandrs Slivkins. 2016. How many workers to ask? Adaptive exploration for collecting high quality labels. In Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '16, pages 473–482, New York, NY, USA. ACM.

Leif Azzopardi. 2009. Query side evaluation: An empirical analysis of effectiveness and effort. In Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '09, pages 556–563, New York, NY, USA. ACM.

Peter Bailey, Alistair Moffat, Falk Scholer, and Paul Thomas. 2015. User variability and IR system evaluation. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '15, pages 625–634, New York, NY, USA. ACM.

Yu Chen and Pearl Pu. 2014. HealthyTogether: Exploring social incentives for mobile fitness applications. In Proceedings of the Second International Symposium of Chinese CHI, Chinese CHI '14, pages 25–34, New York, NY, USA. ACM.

Sebastian Deterding, Dan Dixon, Rilla Khaled, and Lennart Nacke. 2011. From game design elements to gamefulness: Defining "gamification". In Proceedings of the 15th International Academic MindTrek Conference: Envisioning Future Media Environments, MindTrek '11, pages 9–15, New York, NY, USA. ACM.

Giorgio Maria Di Nunzio and Alessandro Sordoni. 2012. A visual tool for Bayesian data analysis: The impact of smoothing on naive Bayes text classifiers. In Proceedings of the ACM SIGIR '12 Conference on Research and Development in Information Retrieval, Portland, OR, USA, August 12–16, 2012, page 1002.

Giorgio Maria Di Nunzio, Maria Maistro, and Daniel Zilio. 2016. Gamification for machine learning: The classification game. In Proceedings of the Third International Workshop on Gamification for Information Retrieval (GamifIR 2016), co-located with the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016), Pisa, Italy, July 21, 2016, pages 45–52.

Giorgio Maria Di Nunzio. 2009. Using scatterplots to understand and improve probabilistic models for text categorization and retrieval. International Journal of Approximate Reasoning, 50(7):945–956.

Giorgio Maria Di Nunzio. 2014. A new decision to take for cost-sensitive naïve Bayes classifiers. Information Processing & Management, 50(5):653–674.

Fernando Diaz. 2016. Pseudo-query reformulation. Pages 521–532. Springer International Publishing, Cham.

Miles Efron. 2009. Using multiple query aspects to build test collections without human relevance judgments. In Proceedings of the 31st European Conference on IR Research on Advances in Information Retrieval, ECIR '09, pages 276–287, Berlin, Heidelberg. Springer-Verlag.

Carsten Eickhoff. 2014. Crowd-powered experts: Helping surgeons interpret breast cancer images. In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 53–56, New York, NY, USA. ACM.

Karën Fort, Bruno Guillaume, and Hadrien Chastant. 2014. Creating Zombilingo, a game with a purpose for dependency syntax annotation. In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 2–6, New York, NY, USA. ACM.

Luca Galli, Piero Fraternali, and Alessandro Bozzon. 2014. On the application of game mechanics in information retrieval. In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 7–11, New York, NY, USA. ACM.

Ahmed Hassan, Xiaolin Shi, Nick Craswell, and Bill Ramsey. 2013. Beyond clicks: Query reformulation as a predictor of search satisfaction. In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, CIKM '13, pages 2019–2028, New York, NY, USA. ACM.

Chien-Ju Ho, Shahin Jabbari, and Jennifer Wortman Vaughan. 2013. Adaptive task assignment for crowdsourced classification. In Proceedings of the 30th International Conference on Machine Learning, ICML 2013, volume 28 of JMLR Proceedings, pages 534–542. JMLR.org.

Jose Luis Jurado, Alejandro Fernandez, and Cesar A. Collazos. 2015. Applying gamification in the context of knowledge management. In Proceedings of the 15th International Conference on Knowledge Technologies and Data-driven Business, i-KNOW '15, pages 43:1–43:4, New York, NY, USA. ACM.

Karl M. Kapp. 2012. The Gamification of Learning and Instruction: Game-Based Methods and Strategies for Training and Education. John Wiley & Sons.

Isabella Kotini and Sofia Tzelepi. 2015. A gamification-based framework for developing learning activities of computational thinking. In Torsten Reiners and C. Lincoln Wood, editors, Gamification in Education and Business, pages 219–252. Springer International Publishing, Cham.

Mathias Lux, Mario Guggenberger, and Michael Riegler. 2014. PictureSort: Gamification of image ranking. In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 57–60, New York, NY, USA. ACM.

Carlos Maltzahn, Arnav Jhala, Michael Mateas, and Jim Whitehead. 2014. Gamification of private digital data archive management. In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 33–37, New York, NY, USA. ACM.

José M. Perea-Ortega, Miguel A. García-Cumbreras, and L. Alfonso Ureña López. 2013. Applying NLP techniques for query reformulation to information retrieval with geographical references. In Proceedings of the 2012 Pacific-Asia Conference on Emerging Trends in Knowledge Discovery and Data Mining, PAKDD '12, pages 57–69, Berlin, Heidelberg. Springer-Verlag.

Frédéric Roulland, Aaron Kaplan, Stefania Castellani, Claude Roux, Antonietta Grasso, Karin Pettersson, and Jacki O'Neill. 2007. Query reformulation and refinement using NLP-based sentence clustering. Pages 210–221. Springer Berlin Heidelberg, Berlin, Heidelberg.

Burr Settles. 2011. Closing the loop: Fast, interactive semi-supervised annotation with queries on features and instances. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, EMNLP 2011, Edinburgh, UK, pages 1467–1478.

Mark Shovman. 2014. The game of search: What is the fun in that? In Proceedings of the First International Workshop on Gamification for Information Retrieval, GamifIR '14, pages 46–48, New York, NY, USA. ACM.

Rita Singh and Bhiksha Raj. 2004. Classification in likelihood spaces. Technometrics, 46(3):318–329.

Laurentiu Catalin Stanculescu, Alessandro Bozzon, Robert-Jan Sips, and Geert-Jan Houben. 2016. Work and play: An experiment in enterprise gamification. In Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing, CSCW '16, pages 346–358, New York, NY, USA. ACM.

Tomek Strzalkowski, Fang Lin, Jose Perez-Carballo, and Jin Wang. 1997. Building effective queries in natural language information retrieval. In Proceedings of the Fifth Conference on Applied Natural Language Processing, ANLC '97, pages 299–306, Stroudsburg, PA, USA. Association for Computational Linguistics.

Jennifer Thom, David Millen, and Joan DiMicco. 2012. Removing gamification from an enterprise SNS. In Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work, CSCW '12, pages 1067–1070, New York, NY, USA. ACM.

Kazutoshi Umemoto, Takehiro Yamamoto, and Katsumi Tanaka. 2016. ScentBar: A query suggestion interface visualizing the amount of missed relevant information for intrinsically diverse search. In Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '16, pages 405–414, New York, NY, USA. ACM.

Michael J. Welch, Junghoo Cho, and Christopher Olston. 2011. Search result diversity for informational queries. In Proceedings of the 20th International Conference on World Wide Web, WWW '11, pages 237–246, New York, NY, USA. ACM.
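The decision rule of Equation 1 in Section 2.1 can be made concrete with a short sketch. This is only an illustration, not the authors' implementation: the toy point data, the fixed line parameters, and the `classify` and `accuracy` helpers are all assumed for the example.

```python
# Sketch of the decision rule of Equation 1: a point
# (x, y) = (P(o|c1), P(o|c2)) is assigned to class c1 when it
# falls below the line y = m*x + q chosen by the player.

def classify(x: float, y: float, m: float, q: float) -> str:
    """Assign class c1 if P(o|c2) < m * P(o|c1) + q, else c2."""
    return "c1" if y < m * x + q else "c2"

def accuracy(points, labels, m, q):
    """Fraction of labelled points that the player's line (m, q)
    separates correctly; the score a player tries to maximize."""
    hits = sum(classify(x, y, m, q) == lab
               for (x, y), lab in zip(points, labels))
    return hits / len(points)

# Toy example: two small clouds of points, one per class.
points = [(0.8, 0.1), (0.7, 0.3), (0.2, 0.9), (0.1, 0.6)]
labels = ["c1", "c1", "c2", "c2"]
print(accuracy(points, labels, m=1.0, q=0.0))  # line y = x
```

In the game, a player would repeatedly adjust `m` and `q` (tilting and shifting the line) and watch the score change, which is exactly the separation task shown in Figure 1.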
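The pooling step of Section 3.1 — taking the union of the top-k documents retrieved for each query aspect as pseudo-relevance assessments (after Efron, 2009) — can be sketched as follows. The `run_query` callable and the toy ranked lists are illustrative assumptions; any retrieval system returning a ranked list of document ids would play that role.

```python
# Sketch of the pooling step of Section 3.1: the union of the
# top-k documents retrieved for each aspect of a query forms the
# pseudo-relevance assessments for that query.

def pseudo_qrels(aspects, run_query, k=10):
    """Union of the top-k results over all aspects of one query."""
    pool = set()
    for aspect in aspects:
        pool.update(run_query(aspect)[:k])
    return pool

# Illustrative stand-in for an IR system: a tiny mapping from each
# aspect string to a ranked list of document ids.
ranked_lists = {
    "jaguar car": ["d1", "d2", "d3"],
    "jaguar automobile review": ["d2", "d4"],
    "jaguar vehicle price": ["d5", "d1"],
}
qrels = pseudo_qrels(ranked_lists.keys(), ranked_lists.get, k=2)
print(sorted(qrels))  # → ['d1', 'd2', 'd4', 'd5']
```

Note how documents retrieved for several aspects (here `d1` and `d2`) enter the pool only once; the resulting set is what the game would then ask players to confirm with minimal effort.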