=Paper=
{{Paper
|id=Vol-1441/recsys2015_poster18
|storemode=property
|title=Recommendations to Enhance Children Web Searches
|pdfUrl=https://ceur-ws.org/Vol-1441/recsys2015_poster18.pdf
|volume=Vol-1441
|dblpUrl=https://dblp.org/rec/conf/recsys/KarimiP15
}}
==Recommendations to Enhance Children Web Searches==
Shahrzad Karimi and Maria Soledad Pera

Department of Computer Science, Boise State University, Boise, ID 83725 USA

shahrzadkarimi@u.boisestate.edu, solepera@boisestate.edu

===ABSTRACT===
We present the initial design and development of KidsQR, a query recommendation system tailored exclusively for children. KidsQR aids children in their quest for online information by considering children's vocabulary, child-friendly phrases, and entities children are familiar with. Initial experiments, based on the assessments of parents and elementary school teachers acting as appraisers, verify the promising performance of KidsQR.

===Keywords===
Information retrieval, query recommendation, children

===1. INTRODUCTION===
Children represent a noticeably increasing group of Internet users [5]. The amount of time that children spend online has significantly increased in the past years [3]. However, finding appropriate online resources that are interesting from a child's viewpoint is a challenge. Consequently, children looking for online resources are constantly exposed to material that is unsuitable and non-relevant. A crucial requirement of conducting a successful search using search engines is the formulation of a well-defined query [3]. Unfortunately, creating an appropriate query that leads to retrieving relevant information is not an easy task for child users. Previous studies suggest that queries meant to retrieve information for children perform worse than queries intended to retrieve content for non-children users [4]. The reason lies in children's limited vocabulary, their tendency to use natural language constructions and complex phrasing for expressing their information needs, and their lack of ability to utilize keywords in phrase formulations [5]. A query recommendation system can help these users "formulate elaborated information needs" [3] by providing queries that these users can apply to initiate a search.

Despite the large number of studies conducted in the field of query recommendation, relatively few focus explicitly on the young group of Internet users and their difficulties. Consequently, literature pertaining to query recommendation for children is very limited. In fact, most of the existing query recommendation systems are designed based on the information needs of adults [4], which is why they suggest queries that often do not lead to retrieving online resources that "suit the characteristics of content for children" [4]. Existing query recommendation systems targeting children have taken different approaches, using large-scale query logs, tags, biased random walk methods, and bipartite graphs [4, 5]. These approaches, however, are based on texts that are generated by adults, disregarding the informal phrasing found in children's writing.

We deem the formulation of keyword queries that children can relate to as the solution to this problem. With that in mind, we have developed KidsQR, a query recommendation system that suggests keyword queries in response to a child-initiated query. Unlike previous works, we do not primarily rely on child-related data produced by adults. Instead, we attempt to consider the patterns of children's informal phrasing and natural language by utilizing texts that have been written by children, in order to recommend queries that are adequate to initiate the search for content of interest to children, which can lead to a more child-friendly and suitable search experience. KidsQR is unique since it considers child-friendly characteristics to generate query recommendations, including children's vocabulary, phrasing patterns, pop-culture, and the popularity of the terms among children. Our intention is to recommend queries that more closely resemble a child's search intent, which results in retrieving suitable documents.

===2. METHODOLOGY===
In this section we present a brief overview of KidsQR.

'''Generating Candidates.''' To identify possible queries to be recommended, i.e., candidate queries, in response to a given user's initial query, KidsQR employs Ubersuggest.org<sup>1</sup>. Ubersuggest is a query generation tool that provides hundreds of possible suggestions given an initial user query and offers topical diversity among the suggestions. We have verified that phrases provided by Ubersuggest include terms related to children's pop culture, such as the names of cartoon characters.

'''Analyzing Candidates.''' To determine the adequacy of the candidates being recommended to the user, i.e., to distinguish child-friendly candidates from non-child-friendly ones, KidsQR evaluates each candidate query based on a number of child-related characteristics that are applicable to it. In other words, KidsQR considers child-related properties to determine how closely a candidate phrase is related to children's interests, or whether the candidate relates to child-friendly content. The properties used to quantify the degree to which a candidate query is likely to reflect children's search intent are described as follows:

* '''Vocabulary.''' A fundamental step in differentiating child-friendly queries among the candidates is to examine the presence of children's vocabulary terms in each query. We consider children's vocabulary lists extracted from children's dictionaries and schools' academic vocabulary (such as www.opsu.edu/www/education/BuildAcademicVoc.pdf and kids.wordsmyth.net/we/) and prioritize candidates that include keywords frequently occurring in these pre-defined children's vocabularies. We do so since it is anticipated that children will favor queries including keywords they are familiar with. For example, for the queries "color" and "city," the candidates "coloring pages" and "pig in a city" are preferred over "color spectrum" and "city infrastructure" since "spectrum" and "infrastructure" are not common words among children.
* '''Popularity.''' The popularity of terms among children is considered by analyzing term frequency distributions<sup>2</sup> on children's stories, poems, and blog posts. Candidate queries including terms popular among children are also given precedence.
* '''Phrase-Formulating.''' Examining the child-friendliness of individual terms in candidate queries is crucial, but not sufficient to confirm the appropriateness of a candidate query, since it does not consider the query phrase as a whole. For example, having the words "bar" (as in "chocolate bar") and "open" in children's vocabulary does not imply that "open bar" is a child-related phrase. We consider stories and poems written for children, as well as texts, blog posts, and online reviews written by children, to determine the appropriateness of the combination of the words and to capture children's informal phrasing patterns. Candidate queries that have similar patterns to children's informal phrasing behavior, or are child-appropriate as a phrase, most likely address a child's search intention and hence are prioritized.
* '''Pop-Culture.''' We observed that candidate queries that do not include children's vocabulary, or do not literally make sense as a phrase, can still be related to children's popular culture. KidsQR examines candidate queries in the context of children's pop-culture and prioritizes queries including terms related to children's movies, songs, and toys (extracted from Pixar.com and Allmovie.com, to name a few). For example, "Mary Poppins" and "Mr. Potato Head" are valid candidates since they refer to a movie character and a toy, respectively, even though the former contains "Poppins", a word not included in children's vocabulary, and the latter consists of child-related words but does not have a literal meaning as a phrase.

'''Ranking.''' KidsQR analyzes each of the candidate queries based on the characteristics mentioned above, and prioritizes candidates that (i) are simple, (ii) refer to children's topics of interest, (iii) include terms children are familiar with, and (iv) resemble children's informal phrasing behavior. KidsQR relies on a multiple regression analysis model that simultaneously considers the different contributing factors in determining whether a candidate query is, in fact, child-friendly and generates a single ranking score for each candidate query recommendation. The top-N candidates are presented to the user as the corresponding query recommendations that can help capture the user's search intent and guide the online search process.

<sup>1</sup> While we used Ubersuggest for development purposes, other tools, such as keywordtool.io, can be considered as well.

<sup>2</sup> Sample sources considered for determining term popularity and phrase suitability include kidsblogclub.com and storybud.org.

===3. INITIAL EXPERIMENTS===
As far as we know, a benchmark dataset that specifically addresses queries conducted by children has yet to be developed. Thus, we created our own dataset by conducting a user study and collecting data from 10 appraisers who were either parents of children between the ages of 3 and 12, or elementary school teachers. We presented each appraiser with 8 queries and the corresponding set of query recommendations, comprised of randomly-positioned recommendations generated by Google, Bing, and KidsQR. Appraisers were then asked to select the two recommendations that they found most child-friendly for each query, and their selections were treated as the gold standard.

Using the created dataset we evaluated KidsQR based on Mean Reciprocal Rank (MRR) and Normalized Discounted Cumulative Gain (NDCG). We also compared the performance of KidsQR with that of Google and Bing, two well-known search engines that offer query recommendations and that are frequently used by children [1]. As shown in Table 1, KidsQR outperforms both Bing and Google in terms of MRR and NDCG. The higher NDCG implies that queries useful for children are positioned higher in the ranking of recommended queries by KidsQR. The higher MRR indicates that, on average, users of KidsQR need to scan through fewer query recommendations before locating a suitable, useful one than users of the other systems.

{| class="wikitable"
|+ Table 1. Performance analysis for Bing, Google, and KidsQR
! System !! MRR !! NDCG
|-
| Google || 0.36 || 0.51
|-
| Bing || 0.27 || 0.35
|-
| KidsQR || 0.7 || 0.72
|}

===4. CONCLUSION===
We have developed a query recommendation system, KidsQR, designed specifically to address the challenges children face in query formulation. KidsQR distinguishes child-friendly query candidates from non-child-friendly ones by simultaneously considering multiple desired properties of children's queries.

We aim to further enhance the initial development of KidsQR so that it can adequately handle informal as well as natural language phrasing, both of which are very common among children. We also intend to further enhance the performance of KidsQR by addressing children's pop-culture more comprehensively. We believe that the more aspects of children's pop-culture we consider, the more closely we can predict a child user's search intention, i.e., the better we can recommend queries that are anticipated to be appealing from a child's perspective. Moreover, we will examine children's vocabulary and the words provided by school vocabulary lists more accurately and consider the age gap among young children, i.e., we will group children by age and explicitly consider their reading ability in making query recommendations for children in the respective groups.

===5. REFERENCES===
[1] D. Bilal and M. Boehm. Towards New Methodologies for Assessing Relevance of Information Retrieval from Web Search Engines on Children's Queries. QQRM, 1:93-100, 2013.

[2] S. Duarte Torres, D. Hiemstra, I. Weber, and P. Serdyukov. Query Recommendation for Children. In ACM CIKM, pp. 2012-2014, 2010.

[3] S. Duarte Torres and I. Weber. What and How Children Search on the Web. In ACM CIKM, pp. 393-402, 2011.

[4] S. Duarte Torres, D. Hiemstra, and P. Serdyukov. An Analysis of Queries Intended to Search Information for Children. In IIiX, pp. 235-244, 2010.

[5] S. Duarte Torres, D. Hiemstra, I. Weber, and P. Serdyukov. Query Recommendation in the Information Domain of Children. JASIST, 65(7): 1368-1384, 2014.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyright is held by the author/owner(s). RecSys 2015 Poster Proceedings, September 16–20, 2015, Vienna, Austria.
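The ranking step and the MRR/NDCG evaluation described in the paper can be illustrated with a minimal sketch. This is not the authors' implementation. The feature names, the weight values, and the per-candidate feature scores are hypothetical placeholders, since the paper does not report its regression coefficients or how each characteristic is quantified. The NDCG variant (binary relevance with a log2 position discount) is likewise an assumption, as the paper does not state which formulation it uses.

```python
import math

# Hypothetical weights standing in for fitted regression coefficients;
# the paper does not report the actual values.
WEIGHTS = {"vocabulary": 0.3, "popularity": 0.2, "phrasing": 0.3, "pop_culture": 0.2}


def score(features):
    """Linear combination of feature values (each assumed to lie in [0, 1])."""
    return sum(WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS)


def top_n(candidates, n):
    """Return the n candidate queries with the highest scores, best first."""
    ranked = sorted(candidates, key=lambda q: score(candidates[q]), reverse=True)
    return ranked[:n]


def reciprocal_rank(ranking, gold):
    """1 / rank of the first gold-standard query, or 0.0 if none appears."""
    for position, query in enumerate(ranking, start=1):
        if query in gold:
            return 1.0 / position
    return 0.0


def ndcg(ranking, gold):
    """Binary-relevance NDCG with a log2 position discount (assumed variant)."""
    dcg = sum(
        1.0 / math.log2(position + 1)
        for position, query in enumerate(ranking, start=1)
        if query in gold
    )
    ideal_hits = min(len(gold), len(ranking))
    idcg = sum(1.0 / math.log2(position + 1) for position in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


def mean(values):
    """Arithmetic mean; 0.0 for an empty input."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Made-up feature values for candidates of the initial query "color".
candidates = {
    "coloring pages": {"vocabulary": 0.9, "popularity": 0.8, "phrasing": 0.9, "pop_culture": 0.1},
    "color spectrum": {"vocabulary": 0.3, "popularity": 0.1, "phrasing": 0.2, "pop_culture": 0.0},
    "color wheel": {"vocabulary": 0.7, "popularity": 0.4, "phrasing": 0.6, "pop_culture": 0.0},
}
recommended = top_n(candidates, 2)

# Appraiser selections act as the gold standard for this query.
gold = {"coloring pages", "color wheel"}
mrr = mean([reciprocal_rank(recommended, gold)])  # mean over queries; one query here
quality = ndcg(recommended, gold)
```

In a full evaluation, `reciprocal_rank` and `ndcg` would be computed once per query and averaged with `mean` across all queries and appraisers. Each system's recommendation list would be scored against the same gold standard.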