<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Workshops, Los Angeles, USA, March</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Impact of Voice-based Interaction on Learning Practices and Behavior of Children</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Subhasree Sengupta</string-name>
          <email>susengup@syr.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Radhika Garg</string-name>
          <email>rgarg01@syr.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Information Studies, Syracuse University</institution>
          ,
          <addr-line>Syracuse, NY</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <volume>20</volume>
      <issue>2019</issue>
      <abstract>
        <p>Smart devices have become an integral part of the everyday lives of children. Today, children can even use voice-based interactions to interact with devices for a wide range of activities. Previous research has shown that voice-driven interfaces have the potential to offer a potent new mechanism for teaching, engaging, and supporting children in daily life. Our paper, therefore, argues that it is critical to investigate not only how children use voice-based interactions to communicate with devices (e.g., smart speakers) but also the nature of the relationships that children form with these devices, the influence such use has on children's learning and behavior, and the role that parents or guardians play in deciding the norms of use for children. We also propose to explicitly investigate complexities in use and its impact relative to entangled identities (conveyed through overlapping attributes of gender, ethnicity, race, and class) and larger social systems. To this end, we propose to use Social Learning Theory to understand how children learn through observing and interacting with smart devices, specifically using voice-based commands. Methodologically, we will conduct participatory design sessions and follow-up interviews to gain a nuanced understanding of how children mentally contextualize voice-enabled smart devices and how social influence (e.g., parental expectations/norms), social function of identification (e.g., children's emotional connection with technology), and learning goals impact their usage patterns.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>CCS CONCEPTS</title>
      <p>• Human-centered computing → User studies; Empirical
studies in HCI; Scenario-based design; Participatory design.</p>
      <p>Keywords: Voice-based interactions; Children’s behavior and learning
practices; Social learning theory; Parasocial relationships</p>
      <p>IUI Workshops’19, March 20, 2019, Los Angeles, USA
Copyright © 2019 for the individual papers by the papers’ authors. Copying permitted
for private and academic purposes. This volume is published and copyrighted by its
editors.</p>
    </sec>
    <sec id="sec-2">
      <title>INTRODUCTION</title>
      <p>
        Today, various types of smart devices are deeply integrated into our
day-to-day lives. The use of technology has increased not only for
adults but also for children, be it as a source of entertainment or as
a learning aid. Indeed, exposure to and use of technology
is considered a crucial influence on
children’s learning and development [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Bower and Sturman
demonstrated that wearable devices offer a range of pedagogical uses
(in-situ contextual information, recording, simulation,
communication, first-person view, in-situ guidance, feedback, distribution and
gamification), afford benefits to educational quality (engagement,
efficiency, and presence), and provide logistical advantages
(hands-free access and freeing up space) in a classroom setting [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. More
recently, smart devices (e.g., smartphones, tablets, smart speakers)
have started to offer conversational assistants (e.g., Amazon Alexa,
Siri, and Google Now) that lend flexible means of interacting with
the device. Due to the presence of such voice assistants, children no
longer need to read or write to be able to interact with the devices
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Because a child now needs less background knowledge to use
these devices, such use can affect the information-seeking,
behavioral (e.g., children might imitate and emulate certain
characteristics of these devices), and learning practices pursued by
children, as well as the factors that affect these practices. Hence, in this
paper we argue that it is critical to investigate how and why children
are using these devices (e.g., voice-connected speakers), and the
influence that voice-based interactions with devices have on children’s
behavior and learning practices.
      </p>
      <p>
        We propose to investigate this issue through the lens of
Bandura’s Social Learning Theory (SLT). It explains ‘observational
learning’ in terms of how people learn through observing others’
behavior, attitudes, and the outcomes (penalty or reward) one might
incur due to such behavior [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. However, SLT is a complex and
subjective concept with many different facets, exploring all
of which is beyond the scope of this study. Therefore, to
understand ‘observational learning’ our study centers on the three
social factors provided by Over et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. These three factors are:
social function of identification (parasocial relationships), the type
of role that children associate with voice assistants; learning goals,
the type of learning tasks that children use voice assistants for; and
social norms and customs, particularly focusing on the role that
parents/guardians play in regulating children’s use of technology.
Therefore, we will investigate three fundamental research questions
in this paper:
• RQ1: What types of parasocial relationships do children
form with voice-enabled smart devices?
• RQ2: What types of learning objectives or tasks are
children interested in using voice-enabled smart devices
for, and how might parasocial relationships affect
those?
• RQ3: How do social norms and customs (especially those
instilled/followed by parents and guardians) affect the way
children use voice-connected smart devices?
      </p>
      <p>To answer these questions we aim to conduct Participatory
Design (PD) sessions and a set of interviews with design groups
consisting of children from different age groups: 7-9 years, 10-12 years
and 13-17 years. This will enable us to investigate the role of children’s age and
gender within the context of our research questions. For
example, we will explore whether the nature of the parasocial relations/roles
that younger children associate with voice-enabled smart devices
differs from those that older children associate with these devices.</p>
    </sec>
    <sec id="sec-3">
      <title>BACKGROUND AND RELATED WORK</title>
      <p>
        In this section we first discuss the framework by Over et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]
that acts as the foundation of our proposed study. Then, we discuss
prior studies that have looked into the types of activities children
use voice-enabled devices for, the parental role in children’s use of
voice-enabled devices, and the type and influence of parasocial
relationships that children develop with such devices.
      </p>
      <p>
        Over et al. highlighted three crucial factors that impact the
selectivity in ‘observational learning’ (i.e., the fact that people selectively
choose to imitate or emulate certain behaviors they observe) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
We use these factors to operationalize SLT in order to
understand how children use these devices, how they relate to them, and how
such use affects their learning practices.
      </p>
      <p>
        Parasocial relationships: The first factor that Over et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]
stated is ‘social function of identification’, which explains that
children emulate and establish an emotional connection with those
they feel they resemble or want to be like. For example, they may
find the voice of a device relatable, or they may find a character in
a game that they play on a device relatable and start to imitate
its characteristics. More importantly, children sometimes
assign roles (such as friend, mentor, or pet) to these devices, thereby
personifying them and forming a relationship with them.
Such connections have also been termed parasocial
relationships (i.e., one-sided, emotionally driven relationships that
children develop with media characters) [
        <xref ref-type="bibr" rid="ref3 ref7">3, 7</xref>
        ]. Druga et al.
demonstrated that the ability to have voice-based interactions, elements
of social realism, and human-like characteristics make these smart
devices more relatable and easier to use for children [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Brunick et
al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] highlighted how parasocial relationships can be useful for
developing educational tools for children by embedding in
intelligent agents the ability to generate parasocial interactions, such as
conversational timing and response personalization. Gray et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]
emphasized that factors such as social realism and personification
should be considered when designing an intelligent agent for
children. Therefore, we propose to explore the type of relationships
children form particularly with voice-enabled smart devices and
how that impacts the type of learning goals that children use these
devices for.
      </p>
      <p>
        Learning goals: The second factor that Over et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] put forth
is ‘learning goals’, which includes self-established or self-motivated
learning goals of children. For example, if a child does not know
how to do something and wants to learn about it, he/she may ask the
device for help and use the information gained from these devices
to perform that task. Preliminary work by Lovato et al. highlighted
that children in general either ‘explore’ voice assistants such as
Siri and Google Now or use them to ‘seek new information’ [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. In
‘exploration’ children use the voice assistants for fun and are even
able to develop a bond with them. In ‘information
seeking’ children use the voice assistants to find facts and develop
a knowledge base. Both forms of use have an impact on children’s
development. However, the main source of data for their study
was YouTube videos of children’s activity, which might not be
representative of children’s actual usage patterns. We aim to add
to this work by investigating actual logs of voice commands to
understand the categories of use. Further, usage patterns can differ
greatly by age and gender; thus, while annotating the voice commands
we will also explore how usage differs based on age and gender. To
understand this better, we will conduct a design session
to investigate the kinds of devices children would like to use for
the varying learning goals that they might have, and the role of parasocial
relationships in this process.
      </p>
      <p>
        Role of parents: The final influencing factor that Over et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]
put forth is ‘social influence’, which comprises social customs,
expectations, and norms that might affect the way children use or
communicate with a device. In particular, we will focus on the key role
that parents or guardians play in introducing a smart device
to children and establishing the norms and extent of use.
Children’s use of devices might also be influenced by the way their
parents or other older members of the family use these devices.
Cheng et al. presented four roles parents play in helping children
communicate with voice-controlled devices [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Parents may also
help establish boundaries for device usage by children as illustrated
by related work (e.g., [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]). They may control the amount of time,
the type of content, and the nature of interactions children may
have with these devices, thereby influencing the learning practices
of children. Hiniker et al. demonstrated that parents are vital in
scaffolding children’s use of a novel/relatively newer technology
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Therefore, we propose to expand on prior work by including
parents in PD sessions to identify how they support or regulate
children’s use of voice-connected devices for learning, and how
this differs by age and gender of children.
      </p>
    </sec>
    <sec id="sec-4">
      <title>PROPOSED METHOD</title>
      <p>Our study will be conducted in the following two stages:</p>
    </sec>
    <sec id="sec-5">
      <title>Historical Log Analysis</title>
      <p>In this stage we will deploy a survey on Amazon Mechanical Turk
(AMT) to collect children’s voice history logs comprising their
interactions with smart speakers. The survey will also include
questions on family structure, number of children in the family, and
types of smart speakers. The primary goal of the survey is to
annotate voice history logs to understand the types of
activities that children use smart speakers for and to estimate
the percentage of those used for learning. Further, we will also
analyze how usage patterns and learning tasks differ by the age and
gender of children.</p>
    </sec>
    <sec id="sec-6">
      <title>Participatory Design Sessions</title>
      <p>
        In the second stage we will employ Cooperative Inquiry [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] for
conducting PD sessions that will focus on co-designing, with children,
devices/technology that they might like to use for learning. Each of these
sessions will be divided into two parts. The first part, called circle time,
will be used to help the participants better contextualize the task
they are about to do in the session. The second part will be the
actual design prompt, based on which the participants will perform
a design activity.
      </p>
      <sec id="sec-6-1">
        <title>Design Session 1 (DS1)</title>
        <p>
          Prior research has shown that children perceive interactive media
characters as enjoyable companions and develop different
parasocial relationships with them. Therefore, the design session will begin
with the circle time (15 minutes), where we will ask participants to
share with us their favorite cartoon or media character. The aim of
circle time is to pose a “question of the day” to get adults and children
started. After that we will ask them to design an interactive
technology/device to identify the kinds of roles children would like such
devices to take, specifically as they use them for various learning
tasks (e.g., to explore unknown facts, improve their language skills, or
hone their deductive reasoning). We will utilize the
Bags-of-Stuff, Big Paper, and Layered Elaboration PD techniques [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
Through such activities we will investigate two things: 1) the type
of parasocial roles that children envision these smart devices taking,
and 2) the connection between the parasocial role and the type of
learning task.
        </p>
      </sec>
      <sec id="sec-6-2">
        <title>Design Session 2 (DS2)</title>
        <p>
          In this design session we will elicit information regarding how
children think about using different speech agents (e.g., Amazon
Alexa, Google Now) for various learning tasks, using the Stickies PD
[
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] technique. The circle time prompt for this session will ask
participants how they (would) use a device/technology to explore
a fact they wish to learn about. The design prompt will ask participants
to note what they like, dislike, or would like to improve about
the current devices, using the design technique of Stickies.
        </p>
      </sec>
      <sec id="sec-6-3">
        <title>Design Session 3 (DS3)</title>
        <p>
          In the final session we will use Stickies [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] and Layered Elaboration
[
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] as design methods to have children and parents revisit the design
ideas that children built in DS1 and build on them with the help of
parents/guardians. One member of the research team will
present the ideas generated during DS1 through storyboards. The
circle time prompt will have participants think about how parents
influence children’s use of technology. For the design prompt, the
parents and children will then be asked to explain their likes,
dislikes, and further design ideas. In this way parents and children
will work together to improve the storyboards based on each
other’s ideas and opinions. For example, parents might want to
include the possibility of regulating children’s use (e.g., permitted
duration, tone of the device) in the designs proposed by children.
        </p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>CONCLUSION</title>
      <p>Our position paper proposes to investigate how voice-based
interactions with smart devices are affecting or can affect the learning practices
of children. In particular, we use the three factors by Over et al. to
operationalize the use of SLT as a tool to answer our research
questions on how children mentally contextualize voice-enabled smart
devices and how social influence (e.g., parental expectations/norms),
social function of identification (e.g., children’s emotional
connection with technology), and learning goals impact their usage
patterns. To this end, we propose to employ both historical log analysis
and participatory design sessions.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Albert</given-names>
            <surname>Bandura</surname>
          </string-name>
          .
          <year>1977</year>
          .
          <article-title>Social learning theory</article-title>
          . Prentice Hall.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Matt</given-names>
            <surname>Bower</surname>
          </string-name>
          and
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Sturman</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>What are the educational affordances of wearable technologies?</article-title>
          <source>Computers &amp; Education</source>
          <volume>88</volume>
          (
          <year>2015</year>
          ),
          <fpage>343</fpage>
          -
          <lpage>353</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
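          <string-name>
            <given-names>Kaitlin L</given-names>
            <surname>Brunick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Marisa M</given-names>
            <surname>Putnam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Lauren E</given-names>
            <surname>McGarry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Melissa N</given-names>
            <surname>Richards</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Sandra L</given-names>
            <surname>Calvert</surname>
          </string-name>
          .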
          <year>2016</year>
          .
          <article-title>Children's future parasocial relationships with media characters: the age of intelligent characters</article-title>
          .
          <source>Journal of Children and Media</source>
          <volume>10</volume>
          ,
          <issue>2</issue>
          (
          <year>2016</year>
          ),
          <fpage>181</fpage>
          -
          <lpage>190</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Yi</given-names>
            <surname>Cheng</surname>
          </string-name>
          , Kate Yen, Yeqi Chen, Sijin Chen, and
          <string-name>
            <given-names>Alexis</given-names>
            <surname>Hiniker</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Why Doesn't It Work? Voice-Driven Interfaces and Young Children's Communication Repair Strategies</article-title>
          .
          <source>In Proceedings of the 17th ACM Conference on Interaction Design and Children</source>
          . ACM,
          <fpage>337</fpage>
          -
          <lpage>348</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Cynthia</given-names>
            <surname>Chiong</surname>
          </string-name>
          and
          <string-name>
            <given-names>Carly</given-names>
            <surname>Shuler</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Learning: Is there an app for that. In Investigations of young children's usage and learning with mobile devices and apps</article-title>
          . New York: The Joan Ganz Cooney Center at Sesame Workshop.
          <fpage>13</fpage>
          -
          <lpage>20</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Stefania</given-names>
            <surname>Druga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Randi</given-names>
            <surname>Williams</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Cynthia</given-names>
            <surname>Breazeal</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Mitchel</given-names>
            <surname>Resnick</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Hey Google is it OK if I eat you?: Initial Explorations in Child-Agent Interaction</article-title>
          .
          <source>In Proceedings of the 2017 Conference on Interaction Design and Children</source>
          . ACM,
          <fpage>595</fpage>
          -
          <lpage>600</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>James H</given-names>
            <surname>Gray</surname>
          </string-name>
          ,
          Emily Reardon, and Jennifer A Kotler
          .
          <year>2017</year>
          .
          <article-title>Designing for Parasocial Relationships and Learning: Linear Video, Interactive Media, and Artificial Intelligence</article-title>
          .
          <source>In Proceedings of the 2017 Conference on Interaction Design and Children</source>
          . ACM,
          <fpage>227</fpage>
          -
          <lpage>237</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Alexis</given-names>
            <surname>Hiniker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Bongshin</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kiley</given-names>
            <surname>Sobel</surname>
          </string-name>
          , and Eun Kyoung Choe.
          <year>2017</year>
          .
          <article-title>Plan &amp; play: supporting intentional media use in early childhood</article-title>
          .
          <source>In Proceedings of the 2017 Conference on Interaction Design and Children</source>
          . ACM,
          <fpage>85</fpage>
          -
          <lpage>95</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Silvia</given-names>
            <surname>Lovato</surname>
          </string-name>
          and Anne Marie Piper.
          <year>2015</year>
          .
          <article-title>Siri, is this you?: Understanding young children's interactions with voice input systems</article-title>
          .
          <source>In Proceedings of the 14th International Conference on Interaction Design and Children</source>
          . ACM,
          <fpage>335</fpage>
          -
          <lpage>338</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Harriet</given-names>
            <surname>Over</surname>
          </string-name>
          and
          <string-name>
            <given-names>Malinda</given-names>
            <surname>Carpenter</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Putting the social into social learning: explaining both selectivity and fidelity in children's copying behavior</article-title>
          .
          <source>Journal of Comparative Psychology</source>
          <volume>126</volume>
          ,
          <issue>2</issue>
          (
          <year>2012</year>
          ),
          <fpage>182</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Greg</given-names>
            <surname>Walsh</surname>
          </string-name>
          , Elizabeth Foss, Jason Yip, and
          <string-name>
            <given-names>Allison</given-names>
            <surname>Druin</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>FACIT PD: a framework for analysis and creation of intergenerational techniques for participatory design</article-title>
          .
          <source>In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems</source>
          . ACM,
          <fpage>2893</fpage>
          -
          <lpage>2902</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>