=Paper= {{Paper |id=Vol-1441/recsys2015_poster18 |storemode=property |title=Recommendations to Enhance Children Web Searches |pdfUrl=https://ceur-ws.org/Vol-1441/recsys2015_poster18.pdf |volume=Vol-1441 |dblpUrl=https://dblp.org/rec/conf/recsys/KarimiP15 }} ==Recommendations to Enhance Children Web Searches== https://ceur-ws.org/Vol-1441/recsys2015_poster18.pdf
     Recommendations to Enhance Children Web Searches
                         Shahrzad Karimi                                                      Maria Soledad Pera
                Department of Computer Science                                           Department of Computer Science
                     Boise State University                                                   Boise State University
                     Boise, ID 83725 USA                                                      Boise, ID 83725 USA
           shahrzadkarimi@u.boisestate.edu                                                solepera@boisestate.edu

ABSTRACT
We present the initial design and development of KidsQR, a query
recommendation system tailored exclusively for children. KidsQR aids
children in their quest for online information by considering children’s
vocabulary, child-friendly phrases, and entities children are familiar with.
Initial experiments, based on the assessments of parents and elementary
school teachers acting as appraisers, verify the promising performance of
KidsQR.

Keywords
Information retrieval, query recommendation, children

Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page.
Copyright is held by the author/owner(s).
RecSys 2015 Poster Proceedings, September 16–20, 2015, Vienna, Austria.

1. INTRODUCTION
Children represent a noticeably increasing group of Internet users [5]. The
amount of time that children spend online has significantly increased in the
past years [3]. However, finding appropriate online resources that are
interesting from a child’s viewpoint is a challenge. Consequently, children
looking for online resources are constantly exposed to material that is
unsuitable and non-relevant. A crucial requirement of conducting a
successful search using search engines is the formulation of a well-defined
query [3]. Unfortunately, creating an appropriate query that leads to
retrieving relevant information is not an easy task for child users.
Previous studies suggest that the relative performance of queries meant to
retrieve information for children is poorer than that of queries intended to
retrieve content for non-child users [4]. The reason lies in children’s
limited vocabulary, their tendency to use natural language constructions and
complex phrasing for expressing their information needs, and their lack of
ability to utilize keywords in phrase formulations [5]. A query
recommendation system can help these users “formulate elaborated information
needs” [3], by providing queries that these users can apply to initiate a
search.

Despite the large number of studies conducted in the field of query
recommendation, relatively few focus explicitly on the young group of
Internet users and their difficulties. Consequently, literature pertaining
to query recommendation for children is very limited. In fact, most of the
existing query recommendation systems are designed based on the information
needs of adults [4], which is why they suggest queries that often do not
lead to retrieving online resources that “suit the characteristics of
content for children” [4]. Existing query recommendation systems targeting
children have taken different approaches, using large-scale query logs,
tags, biased random walk methods, and bipartite graphs [4, 5]. These
approaches, however, are based on texts that are generated by adults,
disregarding the informal phrasing found in children’s writing.

We deem the formulation of keyword queries that children can relate to as
the solution to this problem. With that in mind, we have developed KidsQR, a
query recommendation system that suggests keyword queries in response to a
child-initiated query. Unlike previous works, we do not primarily rely on
child-related data produced by adults. Instead, we attempt to consider the
patterns of children’s informal phrasing and natural language by utilizing
texts that have been written by children, to recommend queries that are
adequate to initiate the search of content of interest to children, which
can lead to a more child-friendly and suitable search experience. KidsQR is
unique since it considers child-friendly characteristics to generate query
recommendations, including children’s vocabulary, phrasing patterns, pop
culture, and the popularity of the terms among children. Our intention is to
recommend queries that have a closer resemblance to a child’s search intent,
which results in retrieving suitable documents.

2. METHODOLOGY
In this section we present a brief overview of KidsQR.

Generating Candidates. To identify possible queries to be recommended, i.e.,
candidate queries, in response to a given user’s initial query, KidsQR
employs Ubersuggest.org¹. Ubersuggest is a query generation tool that
provides hundreds of possible suggestions given an initial user query and
offers topical diversity among the suggestions. We have verified that
phrases provided by Ubersuggest include terms related to children’s pop
culture, such as the names of cartoon characters.

¹ While we used Ubersuggest for development purposes, other tools, such as
  keywordtool.io, can be considered as well.

Analyzing Candidates. To determine the adequacy of the candidates being
recommended to the user, i.e., distinguishing child-friendly candidates from
the non-child-friendly ones, KidsQR evaluates each candidate query based on
a number of child-related characteristics that are applicable to it. In
other words, KidsQR considers child-related properties to determine how
closely a candidate phrase is related to children’s interests, or if the
candidate relates to child-friendly content. The properties/characteristics
to be observed in quantifying the degree to which a candidate query is
likely reflecting children’s search intent are described as follows:

• Vocabulary. A fundamental step to differentiate child-friendly queries
  among the candidate ones is to examine the existence of children’s
  vocabulary terms in each query. We consider children’s vocabulary lists
  extracted from children’s dictionaries and schools’ academic vocabulary
  (such as www.opsu.edu/www/education/BuildAcademicVoc.pdf and
  kids.wordsmyth.net/we/) and prioritize candidates that include keywords
  frequently occurring in children’s pre-defined vocabularies. We do so
  since it is anticipated that children will favor queries including
  keywords they are familiar with. For example, for the queries “color” and
  “city,” the candidates “coloring pages” and “pig in a city” are preferred
  over “color spectrum” and “city infrastructure” since “spectrum” and
  “infrastructure” are not common words among children.
• Popularity. The popularity of terms among children is considered by
  analyzing term frequency distributions² on children’s stories, poems, and
  blog posts. Candidate queries including popular children’s terms are also
  given precedence.
• Phrase-Formulating. Examining the child-friendliness of individual terms
  in candidate queries is crucial, but not sufficient in confirming the
  appropriateness of a candidate query since it does not consider the query
  phrase as a whole. For example, having the words “bar” (as in “chocolate
  bar”) and “open” in children’s vocabulary does not imply that “open bar”
  is a child-related phrase. We consider stories and poems written for
  children, as well as texts, blog posts, and online reviews written by
  children, to determine the appropriateness of the combination of the
  words, and capture children’s informal phrasing patterns. Candidate
  queries that have similar patterns to children’s informal phrasing
  behavior, or are child-appropriate as a phrase, most likely address a
  child’s search intention and hence are prioritized.
• Pop-Culture. We observed that candidate queries that do not include
  children’s vocabulary, or do not literally make sense as a phrase, can
  still be related to children’s popular culture. KidsQR examines candidate
  queries in the context of children’s pop culture and prioritizes queries
  including terms related to children’s movies, songs, and toys (extracted
  from Pixar.com and Allmovie.com, to name a few). For example, “Mary
  Poppins” and “Mr. Potato Head” are valid candidates since they refer to a
  movie character and a toy, respectively, even though the former contains
  “Poppins”, a word not included in children’s vocabulary, and the latter
  consists of child-related words but does not have a literal meaning as a
  phrase.

² Sample sources considered for determining term popularity and phrase
  suitability include kidsblogclub.com and storybud.org.

Ranking. KidsQR analyzes each of the candidate queries based on the
characteristics mentioned above, and prioritizes candidates that (i) are
simple, (ii) refer to children’s topics of interest, (iii) include terms
children are familiar with, and (iv) resemble children’s informal phrasing
behavior. KidsQR relies on a multiple regression analysis model that
simultaneously considers the different contributing factors in determining
whether a candidate query is, in fact, child-friendly and generates a single
ranking score for each candidate query recommendation. The top-N candidates
are presented to the user as the corresponding query recommendations that
can help capture the child’s search intent and guide the online search
process.

3. INITIAL EXPERIMENTS
As far as we know, a benchmark dataset that specifically addresses queries
conducted by children has yet to be developed. Thus, we created our own
dataset by conducting a user study and collecting data from 10 appraisers
who were either parents of children between the ages of 3 and 12, or
elementary school teachers. We presented each appraiser with 8 queries and
the corresponding set of query recommendations, comprised of
randomly-positioned recommendations generated by Google, Bing, and KidsQR.
Appraisers were then asked to select the two recommendations that they found
most child-friendly for each query, and their selections were treated as the
gold standard.

Using the created dataset we evaluated KidsQR based on Mean Reciprocal Rank
(MRR) and Normalized Discounted Cumulative Gain (NDCG). We also compared the
performance of KidsQR with that of Google and Bing, two well-known search
engines that offer query recommendations and that are frequently used by
children [1]. As shown in Table 1, KidsQR outperforms both Bing and Google
in terms of MRR and NDCG. The higher NDCG implies that queries useful for
children are positioned higher in the ranking of recommended queries by
KidsQR. The higher MRR indicates that, on average, users of KidsQR need to
scan through fewer query recommendations before locating a suitable, useful
one than users of the other systems.

Table 1. Performance analysis for Bing, Google, and KidsQR

    System      MRR     NDCG
    Google      0.36    0.51
    Bing        0.27    0.35
    KidsQR      0.7     0.72

4. CONCLUSION
We have developed a query recommendation system, KidsQR, designed
specifically to address the challenges of children in query formulation.
KidsQR distinguishes the child-friendly query candidates from the
non-child-friendly ones by simultaneously considering multiple desired
properties of children’s queries.

We aim to further enhance the initial development of KidsQR so that it can
adequately handle informal as well as natural language phrasing, which is
very common among children. We also intend to further enhance the
performance of KidsQR by addressing children’s pop culture more
comprehensively. We believe the more aspects of children’s pop culture that
we consider, the more closely we can predict a child user’s search
intention, i.e., the better we can generate recommended queries that are
anticipated to be appealing from a child’s perspective. Moreover, we will
examine children’s vocabulary and words provided by school vocabulary lists
more accurately and consider the age gap among young children, i.e., we will
group children by age and explicitly consider their reading ability in
making query recommendations for children in the respective groups.

5. REFERENCES
[1] D. Bilal and M. Boehm. Towards New Methodologies for Assessing Relevance
    of Information Retrieval from Web Search Engines on Children’s Queries.
    QQRM, 1:93-100, 2013.
[2] S. Duarte Torres, D. Hiemstra, I. Weber, and P. Serdyukov. Query
    Recommendation for Children. In ACM CIKM, pp. 2012-2014, 2010.
[3] S. Duarte Torres and I. Weber. What and How Children Search on the Web.
    In ACM CIKM, pp. 393-402, 2011.
[4] S. Duarte Torres, D. Hiemstra, and P. Serdyukov. An Analysis of Queries
    Intended to Search Information for Children. In IIiX, pp. 235-244, 2010.
[5] S. Duarte Torres, D. Hiemstra, I. Weber, and P. Serdyukov. Query
    Recommendation in the Information Domain of Children. JASIST,
    65(7):1368-1384, 2014.
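
As an illustration of the ranking step in Section 2 and the measures reported in Section 3, the following is a minimal sketch rather than the KidsQR implementation. It assumes that each candidate query has already been assigned one numeric value per characteristic (vocabulary, popularity, phrasing, pop culture), that the regression model reduces to a weighted sum of those values, and that NDCG uses binary gains derived from the appraisers' selections. The weights, feature values, and function names are hypothetical; the paper does not report the fitted coefficients or the exact NDCG gain definition.

```python
from math import log2

# Hypothetical weights: placeholders, not the coefficients fitted for KidsQR.
WEIGHTS = {"vocabulary": 0.3, "popularity": 0.2, "phrasing": 0.3, "pop_culture": 0.2}


def score(features, weights=WEIGHTS, intercept=0.0):
    """Weighted sum of feature values (a multiple-regression-style score)."""
    return intercept + sum(w * features.get(name, 0.0) for name, w in weights.items())


def top_n(candidates, n=2):
    """Return the n candidate queries with the highest scores, best first."""
    ranked = sorted(candidates, key=lambda query: score(candidates[query]), reverse=True)
    return ranked[:n]


def reciprocal_rank(ranking, relevant):
    """1 / position of the first relevant query, or 0.0 if none appears."""
    for position, query in enumerate(ranking, start=1):
        if query in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(rankings, relevants):
    """Average reciprocal rank over a set of (ranking, gold standard) pairs."""
    total = sum(reciprocal_rank(r, g) for r, g in zip(rankings, relevants))
    return total / len(rankings)


def ndcg(ranking, relevant):
    """NDCG with binary gains: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(1.0 / log2(pos + 1)
              for pos, query in enumerate(ranking, start=1) if query in relevant)
    ideal_hits = min(len(relevant), len(ranking))
    idcg = sum(1.0 / log2(pos + 1) for pos in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


# Made-up feature values for candidates generated from the query "color".
candidates = {
    "coloring pages": {"vocabulary": 1.0, "popularity": 0.9, "phrasing": 0.8},
    "color spectrum": {"vocabulary": 0.5, "popularity": 0.1, "phrasing": 0.2},
    "color wheel": {"vocabulary": 0.9, "popularity": 0.5, "phrasing": 0.6},
}
recommended = top_n(candidates, n=2)
gold = {"coloring pages"}  # stand-in for the appraisers' selections
print(recommended, reciprocal_rank(recommended, gold), ndcg(recommended, gold))
```

With these made-up values, “coloring pages” is ranked above “color wheel”, and “color spectrum” is left out of the top two, mirroring the vocabulary example in Section 2. In an actual evaluation, the gold sets would come from the appraisers' selections for each of the 8 queries.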