<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Recommendations to Enhance Children Web Searches</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Shahrzad Karimi</string-name>
          <email>shahrzadkarimi@u.boisestate.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maria Soledad Pera</string-name>
          <email>solepera@boisestate.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, Boise State University</institution>
          ,
          <addr-line>Boise, ID 83725</addr-line>
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <abstract>
        <p>We present the initial design and development of KidsQR, a query recommendation system tailored exclusively for children. KidsQR aids children in their quest for online information by considering children’s vocabulary, child-friendly phrases, and entities children are familiar with. Initial experiments, based on the assessments of parents and elementary school teachers serving as appraisers, demonstrate the promising performance of KidsQR.</p>
      </abstract>
      <kwd-group>
        <kwd>Information retrieval</kwd>
        <kwd>query recommendation</kwd>
        <kwd>children</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        Despite the large number of studies conducted in the field of
query recommendation, relatively few focus explicitly on young
Internet users and the difficulties they face.
Consequently, the literature pertaining to query recommendation for
children is very limited. In fact, most of the existing query
recommendation systems are designed based on the information
needs of adults [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], which is why they suggest queries that often
do not lead to retrieving online resources that “suit the
characteristics of content for children” [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Existing query
recommendation systems targeting children have taken different
approaches, using large-scale query logs, tags, biased random
walk methods, and bipartite graphs [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. These approaches,
however, are based on texts generated by adults and
disregard the informal phrasing found in children’s own writing.
      </p>
      <p>
We deem the formulation of keyword queries that children can
relate to as the solution to this problem. With that in mind, we
have developed KidsQR, a query recommendation system that
suggests keyword queries in response to a child-initiated query.
Unlike previous work, we do not rely primarily on child-related
data produced by adults. Instead, we consider the
patterns of children’s informal phrasing and natural language by
utilizing texts written by children. This lets us recommend
queries that are adequate for initiating a search for content of
interest to children, which can lead to a more child-friendly and
suitable search experience. KidsQR is unique in that it considers
child-friendly characteristics to generate query recommendations,
including children’s vocabulary, phrasing patterns, pop-culture, and
the popularity of terms among children. Our intention is to
recommend queries that more closely resemble a child’s
search intent and therefore lead to the retrieval of suitable documents.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. METHODOLOGY</title>
      <p>In this section we present a brief overview of KidsQR.</p>
      <p>Generating Candidates. To identify possible queries to be
recommended, i.e., candidate queries, in response to a given
user’s initial query, KidsQR employs Ubersuggest.org<sup>1</sup>.
Ubersuggest is a query generation tool that provides hundreds of
possible suggestions given an initial user query and offers topical
diversity among the suggestions. We have verified that phrases
provided by Ubersuggest include terms related to children’s
pop-culture, such as the names of cartoon characters.</p>
      <p><sup>1</sup> While we used Ubersuggest for development purposes, other
tools, such as keywordtool.io, can be considered as well.</p>
      <p>Analyzing Candidates. To determine the adequacy of the
candidates being recommended to the user, i.e., to distinguish
child-friendly candidates from non-child-friendly ones,
KidsQR evaluates each candidate query based on a number of
child-related characteristics. In other
words, KidsQR considers child-related properties to determine
how closely a candidate phrase is related to children’s interests, or
whether the candidate relates to child-friendly content. The
characteristics used to quantify the degree
to which a candidate query is likely to reflect children’s search
intent are as follows:</p>
      <p>Vocabulary. A fundamental step in identifying
child-friendly queries among the candidates is to examine the
presence of children’s vocabulary terms in each query. We
consider children’s vocabulary lists extracted from children’s
dictionaries and schools’ academic vocabulary (such as
www.opsu.edu/www/education/BuildAcademicVoc.pdf and
kids.wordsmyth.net/we/) and prioritize candidates that include
keywords frequently occurring in these pre-defined
vocabularies. We do so because we anticipate that children
will favor queries that include keywords they are familiar with.
For example, for the queries “color” and “city,” the candidates
“coloring pages” and “pig in a city” are preferred over “color
spectrum” and “city infrastructure” since “spectrum” and
“infrastructure” are not common words among children.
</p>
      <p>Popularity. The popularity of terms among children is
considered by analyzing term frequency distributions<sup>2</sup> on
children’s stories, poems, and blog posts. Candidate queries
that include terms popular among children are also given precedence.</p>
      <p>Phrase-Formulating. Examining the child-friendliness of
individual terms in candidate queries is crucial, but not
sufficient to confirm the appropriateness of a candidate
query, since it does not consider the query phrase as a whole.
For example, having the words “bar” (as in “chocolate
bar”) and “open” in children’s vocabulary does not imply that
“open bar” is a child-related phrase. We consider stories and
poems written for children, as well as texts, blog posts, and
online reviews written by children, to determine the
appropriateness of word combinations and to capture
children’s informal phrasing patterns. Candidate queries that
follow patterns similar to children’s informal phrasing,
or that are child-appropriate as a phrase, most likely address a
child’s search intention and are therefore prioritized.</p>
      <p>Pop-Culture. We observed that candidate queries that do not
include children’s vocabulary, or do not literally make sense as
a phrase, can still be related to children’s popular culture.
KidsQR examines candidate queries in the context of children’s
pop-culture and prioritizes queries that include terms related to
children’s movies, songs, and toys (extracted from Pixar.com
and Allmovie.com, to name a few). For example, “Mary
Poppins” and “Mr. Potato Head” are valid candidates since
they refer to a movie character and a toy, respectively, even
though the former contains “Poppins”, a word not included in
children’s vocabulary, and the latter consists of child-related
words but does not have a literal meaning as a phrase.</p>
      <p>Ranking. KidsQR analyzes each of the candidate queries based
on the characteristics mentioned above, and prioritizes candidates
that (i) are simple, (ii) refer to children’s topics of interest, (iii)
include terms children are familiar with, and (iv) resemble
children’s informal phrasing behavior. KidsQR relies on a
multiple regression analysis model that simultaneously considers
the different contributing factors in determining whether a
candidate query is, in fact, child-friendly and generates a single
ranking score for each candidate query recommendation. The
top-N candidates are presented to the user as the corresponding query
recommendations that can help capture the child’s search intent and guide
the online search process.</p>
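      <p>As a rough illustration of the analysis and ranking steps described above, the following Python sketch scores each candidate with a linear combination of the four characteristics and returns the top-N candidates. The word lists, the feature definitions, and the coefficient values are hypothetical placeholders chosen for this example; they are not the resources or the fitted regression model used by KidsQR.</p>
      <preformat>
```python
# Toy, hand-made resources (assumptions for illustration only). The paper
# derives such resources from children's dictionaries, stories, blog posts,
# and pop-culture sites.
CHILD_VOCAB = {"color", "coloring", "pages", "pig", "in", "a", "city", "open", "bar"}
TERM_POPULARITY = {"coloring": 0.9, "pages": 0.6, "pig": 0.8, "city": 0.5, "color": 0.7}
CHILD_PHRASES = {"coloring pages", "pig in a city", "chocolate bar"}
POP_CULTURE = {"mary poppins", "mr. potato head"}

# Hypothetical regression coefficients: an intercept plus one weight per feature.
INTERCEPT = 0.0
WEIGHTS = {"vocabulary": 0.2, "popularity": 0.2, "phrase": 0.3, "pop_culture": 0.3}


def features(candidate: str) -> dict[str, float]:
    """Map a candidate query to four child-related feature values in [0, 1]."""
    text = candidate.lower()
    terms = text.split()
    n = max(len(terms), 1)
    return {
        # Fraction of terms found in the children's vocabulary list.
        "vocabulary": sum(t in CHILD_VOCAB for t in terms) / n,
        # Average popularity of the terms in text written by or for children.
        "popularity": sum(TERM_POPULARITY.get(t, 0.0) for t in terms) / n,
        # Whether the query as a whole is a known child-appropriate phrase.
        "phrase": 1.0 if text in CHILD_PHRASES else 0.0,
        # Whether the query names a known pop-culture entity.
        "pop_culture": 1.0 if text in POP_CULTURE else 0.0,
    }


def score(candidate: str) -> float:
    """Combine the features into a single ranking score (linear model)."""
    feats = features(candidate)
    return INTERCEPT + sum(WEIGHTS[name] * value for name, value in feats.items())


def recommend(candidates: list[str], top_n: int = 2) -> list[str]:
    """Return the top_n candidates ordered by decreasing score."""
    return sorted(candidates, key=score, reverse=True)[:top_n]


if __name__ == "__main__":
    pool = ["color spectrum", "coloring pages", "open bar", "mary poppins"]
    print(recommend(pool, top_n=2))
```
      </preformat>
      <p>With these toy values, the example prints the list containing “coloring pages” followed by “mary poppins”. The candidate “open bar” scores low even though both of its words are in the vocabulary list, because it matches neither a child-appropriate phrase nor a pop-culture entity.</p>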
    </sec>
    <sec id="sec-3">
      <title>3. INITIAL EXPERIMENTS</title>
      <p>
        As far as we know, a benchmark dataset that specifically
addresses queries issued by children has yet to be developed.
Thus, we created our own dataset by conducting a user study and
collecting data from 10 appraisers who were either parents of
children between the ages of 3 and 12, or elementary school
teachers. We presented each appraiser with 8 queries and the
corresponding set of query recommendations, composed of
randomly-positioned recommendations generated by Google,
Bing, and KidsQR. Appraisers were then asked to select the two
recommendations that they found most child-friendly for each
query, and their selections were treated as the gold standard.
Using the created dataset, we evaluated KidsQR based on Mean
Reciprocal Rank (MRR) and Normalized Discounted Cumulative
Gain (NDCG). We also compared the performance of KidsQR
with that of Google and Bing, two well-known search engines that
offer query recommendations and that are frequently used by
children [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. As shown in Table 1, KidsQR outperforms both
Bing and Google in terms of MRR and NDCG. The higher
NDCG implies that queries useful for children are positioned
higher in the ranking of recommended queries by KidsQR. The
higher MRR indicates that, on average, users of KidsQR need to
scan through fewer query recommendations before locating a
suitable, useful one than users of the other systems.
      </p>
      <p><sup>2</sup> Sample sources considered for determining term popularity and
phrase suitability include kidsblogclub.com and storybud.org.</p>
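      <p>For readers unfamiliar with the two metrics, the following Python sketch shows one common way to compute them when relevance is binary, as it is here, where a recommendation is relevant if an appraiser selected it. The function names and the sample data are assumptions made for this example, and the log2 discount is the standard textbook choice rather than a detail confirmed by the experiments above.</p>
      <preformat>
```python
import math


def reciprocal_rank(ranked: list[str], relevant: set[str]) -> float:
    """Return 1 / position of the first relevant item, or 0.0 if there is none."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(runs: list[tuple[list[str], set[str]]]) -> float:
    """Average the reciprocal rank over all (ranking, gold standard) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, gold) for ranked, gold in runs) / len(runs)


def dcg(gains: list[float]) -> float:
    """Discounted cumulative gain with a log2(position + 1) discount."""
    return sum(g / math.log2(pos + 1) for pos, g in enumerate(gains, start=1))


def ndcg(ranked: list[str], relevant: set[str]) -> float:
    """DCG of the ranking divided by the DCG of an ideal ranking."""
    gains = [1.0 if item in relevant else 0.0 for item in ranked]
    # An ideal ranking places every relevant item that fits at the top.
    ideal = dcg([1.0] * min(len(relevant), len(ranked)))
    return dcg(gains) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical rankings and appraiser selections for two queries.
    runs = [
        (["coloring pages", "color wheel", "color games"],
         {"coloring pages", "color games"}),
        (["city hall", "pig in a city", "city map"], {"pig in a city"}),
    ]
    print(mean_reciprocal_rank(runs))
    print([round(ndcg(ranked, gold), 4) for ranked, gold in runs])
```
      </preformat>
      <p>In this toy data the first relevant recommendation appears at position 1 for the first query and at position 2 for the second, so the MRR is (1 + 1/2) / 2 = 0.75.</p>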
    </sec>
    <sec id="sec-4">
      <title>4. CONCLUSION</title>
      <p>We have developed a query recommendation system, KidsQR,
designed specifically to address the challenges children face in
query formulation. KidsQR distinguishes child-friendly query
candidates from non-child-friendly ones by simultaneously
considering multiple desired properties of children’s queries.</p>
      <p>We aim to further enhance the initial development of KidsQR so
that it can adequately handle the informal and natural language
phrasing that is very common among children. We also intend
to improve the performance of KidsQR by addressing
children’s pop-culture more comprehensively. We believe that the more
aspects of children’s pop-culture we consider, the more
closely we can predict a child user’s search intention, i.e.,
the better we can recommend queries that are likely to be appealing from a
child’s perspective. Moreover, we will examine
children’s vocabulary and the words provided by school vocabulary
lists more carefully and account for age differences among young
children, i.e., we will group children by age and explicitly
consider their reading ability in making query recommendations
for children in the respective groups.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Bilal</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Boehm</surname>
          </string-name>
          .
          <article-title>Towards New Methodologies for Assessing Relevance of Information Retrieval from Web Search Engines on Children's Queries</article-title>
          .
          <source>QQRM</source>
          ,
          <volume>1</volume>
          :
          <fpage>93</fpage>
          -
          <lpage>100</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Duarte Torres</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hiemstra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Weber</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Serdyukov</surname>
          </string-name>
          .
          <article-title>Query Recommendation for Children</article-title>
          .
          In
          <source>ACM CIKM</source>
          , pp.
          <fpage>2012</fpage>
          -
          <lpage>2014</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Duarte Torres</surname>
          </string-name>
          and
          <string-name>
            <given-names>I.</given-names>
            <surname>Weber</surname>
          </string-name>
          .
          <article-title>What and How Children Search on the Web</article-title>
          .
          In
          <source>ACM CIKM</source>
          , pp.
          <fpage>393</fpage>
          -
          <lpage>402</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Duarte Torres</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hiemstra</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Serdyukov</surname>
          </string-name>
          .
          <article-title>An Analysis of Queries Intended to Search Information for Children</article-title>
          .
          In
          <source>IIiX</source>
          , pp.
          <fpage>235</fpage>
          -
          <lpage>244</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Duarte Torres</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hiemstra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Weber</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Serdyukov</surname>
          </string-name>
          .
          <article-title>Query Recommendation in the Information Domain of Children</article-title>
          .
          <source>JASIST</source>
          ,
          <volume>65</volume>
          (
          <issue>7</issue>
          ):
          <fpage>1368</fpage>
          -
          <lpage>1384</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>