=Paper= {{Paper |id=Vol-1956/GHItaly17_paper_08 |storemode=property |title=A Dialogue-based Software Architecture for Gamified Discrimination Tests |pdfUrl=https://ceur-ws.org/Vol-1956/GHItaly17_paper_08.pdf |volume=Vol-1956 |authors=Antonio Origlia,Piero Cosi,Antonio Rodà,Claudio Zmarich |dblpUrl=https://dblp.org/rec/conf/chitaly/OrigliaCRZ17 }} ==A Dialogue-based Software Architecture for Gamified Discrimination Tests== https://ceur-ws.org/Vol-1956/GHItaly17_paper_08.pdf
A dialogue-based software architecture for gamified discrimination tests

Antonio Origlia, Dept. of Information Engineering, University of Padua, antonio.origlia@dei.unipd.it
Piero Cosi, Institute of Cognitive Sciences and Technology (CNR-ISTC), piero.cosi@pd.istc.cnr.it
Antonio Rodà, Dept. of Information Engineering, University of Padua, roda@dei.unipd.it
Claudio Zmarich, Institute of Cognitive Sciences and Technology (CNR-ISTC), claudio.zmarich@cnr.it
ABSTRACT
In this work we describe the current stage of development of a software architecture designed to present discrimination tests to pre-school children in the form of gamified tasks. We interpret the problem of administering these tests as a dialogue model that uses probabilistic rules to generate customised tests on the basis of the child’s performance. In the proposed architecture, the dialogue system controls a gaming setup composed of a virtual agent and a robotic companion that needs to be taught how to talk. This learning-by-teaching approach is used to camouflage a phoneme discrimination test that has the added value of being generated at runtime on the basis of the child’s performance. We describe the architectural components involved and how the dialogue system can make use of linguistic knowledge to generate the discrimination test and administer it by controlling the agents involved in the game.

Author Keywords
Gamification; software architecture; discrimination tests

Permission to make digital or hard copies of all or part of this work for personal or academic purposes is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Copyright © 2017 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.
GHITALY17: 1st Workshop on Games-Human Interaction, September 18th, 2017, Cagliari, Italy.

INTRODUCTION
Phonetic perception abilities are already in place and active in the fetus, and their integrity is necessary for normal future speech development [12, 23]. Since the ability to discriminate linguistic sounds is associated with the correct acquisition and production of the same sounds, an alteration of this ability could contribute to the onset of speech and language disorders [2]. For this reason, evaluating phonetic discrimination ability is important in order to identify at-risk subjects, allowing clinicians and caregivers to operate in focused and specific ways. For preschool children (from 3 years old onward), the paradigms of identification and discrimination are the same as those used with adults [16]. Among several types of discrimination tests, we choose the standard AX or “same-different” procedure. Traditionally, AX tests that evaluate the phoneme discrimination capability of young children are designed as scripts, and the software used to administer this kind of test also follows scripts (e.g. [1]). These contain a series of (non-)word pairs presenting phoneme oppositions (e.g. ’pepi / ’pemi) in different syllabic structures (e.g. CV-CV is a disyllabic structure where each syllable has a single heading consonant). The child is given the task of indicating, after listening to the experimenter reading the stimuli, whether the two (non-)words are the same or different. These tests are designed so that consonants differing in a single distinctive trait are opposed each time (e.g. voiced/unvoiced, sonorant/non-sonorant). Control stimuli are present in such tests as pairs composed of the same word repeated twice and pairs composed of completely different words. This approach is necessary as it is impossible for a human expert to dynamically select word pairs that comply with a set of very strict constraints. Specifically, each word pair must:
• present opposed consonants that differ in exactly one trait
• have the same syllabic structure in the two (non-)words
• present the opposition in a precise position in the syllabic structure (e.g. the head consonant of the second syllable)
• have the accent in the same place in the two (non-)words

Given the young age of the considered subjects, it is necessary to mask the test in a game-like scenario to make it less imposing. Healthy contact with language, in the first years of life, consists of a playful activity where parents and infants engage in protoconversations made of rhythmical and musical content. This manifests the emotional regulation of primary inter-subjectivity [19], where interaction with the caregiver, either reciprocally directed or mediating access to objects of interest for the infant, manifests the typical playfulness often observed in mammals. At 9 months, secondary inter-subjectivity arises [22] and the baby’s interest moves onto sharing the ways companions use objects as she starts to interact with the material world in a more informed way. The caregivers’ language also shifts, in this phase, from questions
and rhetorical comments to instructions and informative comments to support the baby’s interest in participating in a task [10]. This is “[...] the start of cultural information transfer between generations” [20, p. 74]. Playful behaviour adapts to new roles as the child grows older but always stays in the background, motivating access to cultural information, reinforcing memory and supporting the creation of meaning [17]. Language development strongly depends on inter-subjective experiences: cultural learning depends on the effective engagement of minds and bodies [9]. Although humans appear to be born with a natural disposition towards cultural learning [21], successful acquisition of cultural skills depends on the quality of the interaction, especially considering social feedback. Given the social nature of cultural transfer, it is not enough to expose children to new words without providing an adequate context for them. Engaging and meaningful activities are especially important to attract the children’s interest and show them how words can provide the natural pleasure that comes with gaining competence in interacting with their loved ones and with peers. Storytelling has been demonstrated to be a powerful means to accomplish this, as children are born with “[...] an abundant and early armament of narrative tools” [3, p. 90]. Through storytelling, children acquire skills related to the so-called emergent literacy [18], which is a necessary prerequisite to mastering reading and writing. These skills cover metalinguistic awareness, cohesion and reference in oral communication and the capability of making one’s own intentions known to others. Emergent literacy capabilities “[...] are acquired first in language play and in storytelling. Many of them are acquired in the context of children’s interactions with peers, in early play contexts.” [5, p. 76]. Once again, the importance of social context and playful interaction is highlighted concerning the acquisition of literacy skills. Wordplay for children appears to be based on matching and substituting words on the basis of their sound rather than their meaning, as they appear to [5, p. 78] “[...] derive tremendous pleasure from rhyming words (“you silly”; “no, you pilly”) or words that sound similar (adult: “Indians lived in a teepee”; child: “pee-pee!”)”. In order to become meaningful and precious for children, teaching activities need to have a basis of experiences showing language as a tool to provide pleasure in social activities. In this paper, we present a software architecture designed to present discrimination tests in a playful setup depicting a social situation with different kinds of virtual agents. This ongoing work builds upon the experience of the Colorado Literacy Tutor [6] and of the Italian Literacy Tutor [7].

SYSTEM ARCHITECTURE
The scripted approach has the disadvantage of not being able to adjust the test depending on the subject’s performance. As a limited amount of time is available to administer the test before the child gets tired, choosing the most informative stimulus at each step of the test would represent an advantage when information is clearer on some traits and more uncertain on others. Being able to concentrate on collecting information on specific aspects of linguistic competence that have been observed to be challenging for the child would optimise the available time. The system architecture we designed to administer the discrimination tests has two main purposes:
• dynamically adapt the test to the child’s performance;
• support groups of virtual agents to establish social setups.

To pursue the first goal, we represent the discrimination test as a dialogue model where each stimulus, once paired with the child’s answer, generates a new stimulus as a system response. This stimulus is selected depending on a utility function taking into account linguistic knowledge and the child’s performance. From an architectural point of view, this is reflected in a dialogue manager acting as the system’s controller and in linguistic knowledge being distributed between the dialogue manager and a database of Italian words. The dialogue manager is provided with the capability to establish which kind of information can be obtained by presenting each available stimulus and with a non-word generator using phonotactic rules to avoid structures not belonging to the Italian language. The database contains morpho-syntactic, phonological and frequency data about words to improve the quality of the selected stimuli.

To present the discrimination test in a social setup, the dialogue manager controls a set of virtual agents with different characteristics. In our case, a virtual avatar is presented on a computer screen and acts as the game’s guide while a social robot is used to implement a learning-by-teaching approach, detailed in the Interface section. The virtual avatar is controlled using the Unreal Engine 4 (www.unrealengine.com) and its voice is dynamically generated using the Mivoq Voice Synthesis Engine (www.mivoq.it). The synthetic voice has a number of advantages: it allows the system to be easily updated as the proposed stimuli are not pre-recorded, it allows the 3D characters to address the child by name, thus establishing a closer relationship, and it can be adapted to different kinds of characters. In the specific case of Mivoq, personalised voices and specific prosodic styles can also be synthesised, opening up a number of applications for game-like software artefacts. A tablet interface, also controlled using the Unreal Engine 4, is provided to the child to evaluate the proposed stimuli. Since the ability to adequately use a tablet interface appears to be reliable for children aged 5 and older [24], this is the minimum age recommended to apply this technology. The robot used in our implementation is Nao (www.softbank.jp/en/corp/group/sbr/), which is a well-established robotic platform for working with children.

The dialogue manager does not make assumptions about the nature of the virtual agents it is connected to. The commands it generates are the same for both the robotic platform and the 3D character (e.g. Synthesise, Speak...). Command implementation is delegated to the specific platform to separate the test logic from its actual implementation. The full schema of the architecture we present is shown in Figure 1. In the following sections, we detail how each module was designed and its role in the general setting. While the system we are developing is able to administer the test without human supervision, we do not exclude the human expert from the experimental setup. The presence of a reference human figure is important to reassure
the child and to integrate the obtained results in the light of direct observation of the child’s behaviour. In these development stages, moreover, the experience of practitioners is precious to improve the quality of the overall experience without altering the validity of the test.

LINGUISTIC KNOWLEDGE BASE
With the advent of Big Data and, in particular, with the increasing availability of Linked Open Data, the need arose to establish a representation format suitable for dynamic, rapidly changing and interconnected objects. RDF represents the most widely used solution to this need and has been adopted to implement the most widely known repositories of linked knowledge available today. An alternative to RDF is now represented by graph databases. Neo4J [25] is the graph database solution we used in our architecture. It is an open source graph database manager that has been developed over the last sixteen years and has been applied to a high number of tasks related, among others, to data representation [8] and visualisation [11]. In Neo4J, nodes and relationships may be assigned labels, which describe the type of the object they are associated with. In this work, labels are mainly used to represent morpho-syntactic characteristics of words and the nature of the relationships among nodes. Nodes and relationships may have properties, which are used here to store the details of each single node or relationship. Labels and properties are the main way used by Neo4J to filter data and retrieve answers to user queries.

In this work, we use the MultiWordNet-Extended (MWN-E) dataset [14] as the knowledge base to control the decision process for the discrimination test. The MWN-E dataset is based on the MultiWordNet dataset [15] and extended by introducing morpho-syntactic data (e.g. gender, number...), derived forms (e.g. plurals, conjugations...) and SAMPA pronunciations. Also, phonological neighbourhoods are computed and are of particular interest for this work. A word A is defined to be a phonological neighbour of the word B if it is possible to obtain B by altering the phonological representation of A using exactly one Insertion/Deletion/Substitution operation. Phonological neighbourhoods are represented by establishing relationships of type HAS_PHONOLOGICAL_NEIGHBOUR between two words if the Minimum Edit Distance of their phonological transcriptions equals 1. This kind of relationship has a distance property that, in these cases, is set to 1. Relationships of type HAS_PHONOLOGICAL_NEIGHBOUR are also established between words that have the same pronunciation but have different written forms. In this case the value of the distance property is set to 0. In addition to the data included in the version presented in [14], the MWN-E version used in this work also contains frequency data for the terms in the vocabulary presented in the Primo Vocabolario del Bambino (Children’s first vocabulary) [4] and from the Italian Wikipedia (data extracted from the 20/04/2017 Wikipedia.it dump). Currently, MWN-E consists of 292282 nodes containing 1536550 properties. 943174 relationships among these nodes are found, phonological neighbourhood relationships at distance 1 representing the majority.

The querying language used to extract data from a Neo4J database is Cypher. Cypher is designed to be a declarative language that highlights pattern structure by using an SQL-inspired ASCII-art syntax. A brief overview of the syntactic elements of Cypher queries is given here to help in understanding the example queries presented in this paper. The reader is referred to the online Cypher manual (https://neo4j.com/developer/cypher-query-language) for a more detailed presentation of Cypher. As nodes are usually represented by circles in graphical representations of graphs, in Cypher nodes are represented by round brackets. For example, the query MATCH (n:VERB) RETURN n returns all the nodes of the graph labelled as verbs. In the same way, since relationships are usually represented by labelled arrows in graph schemas, relationships between nodes are described by using ASCII arrows, too. The query MATCH (m)-[:DERIVES_FROM]->(:VERB {word: 'essere'}) RETURN m returns all the nodes that contain a term that derives from the verb essere (to be). The SQL-like WHERE clause may also be used to filter results using boolean logic.

The query shown in Figure 2 shows how to obtain a pair (w1, w2) consisting of disyllabic words that are phonological neighbours and are obtained by substituting the /p/ phoneme in the first word with the /b/ phoneme in the second word. Sets of words to be excluded after having been presented are also included (in this example, cubo and cupo) as well as the sorting logic. The first part of the Cypher query matches words that are linked by phonological neighbourhood relationships at distance 1, regardless of arc orientation. A filter is then applied on the syllabic structure using a regular expression on the SAMPA transcription property. In this case, only words presenting a CV-CV structure with the accent on the first syllable and presenting the phonemes /p/ and /b/ in opposition on the head of the second syllable are accepted. The regular expression is dynamically generated by the dialogue system depending on the opposition to present and on the word structure complexity. The former comes from a decision process implemented in the dialogue manager while the latter becomes more complex as the words available for each considered structure become less informative, as in the case of words presenting oppositions that have already been investigated.

OPENDIAL
Opendial [13] is a dialogue management framework based on probabilistic rules aiming at merging the best of rule-based and probabilistic dialogue management. In cases where the dialogue designer possesses a good amount of previous knowledge about the domain and has specific needs for fine-tuning rules, the rule-based approach can be integrated with probability and utility-based reasoning to fine-tune the system’s response. Probabilistic rules, in Opendial, are used to set up and update a Bayesian network consisting of variables representing the dialogue state. Depending on this, the dialogue manager selects the most probable user action given a set of possibly inaccurate inputs. Using a set of utility functions provided by the dialogue designer, the manager computes the most useful system reaction, possibly generating natural language responses or executing actions. In Opendial, it is possible to apply a priori estimates on future values of state variables. The probability distributions providing a
priori estimates can be updated, using Bayesian inference, after the actual observation arrives to dynamically improve the model. In Opendial, dialogue domains are described in an XML format specifically designed for the dialogue system. This is composed of a set of models triggered by variable updates and containing sets of rules to change the dialogue state. Opendial supports unification in its dialogue specification language so that variables can be included to obtain generic rules. In the example shown in Figure 3, a part of the model that identifies opposing traits given two phonemes is presented. The condition for the considered rule to fire is that the two phonemes in the opposition variable are not the same. If the condition is verified, a custom HasTraits function is used to determine if the two phonemes have the sonorant trait. Then, the probability that the set of opposing traits contains the sonorant trait is equal to the XOR of the results obtained by applying the HasTraits function to the considered phonemes.

Opendial can also be extended with Java-based plugins and functions. In our case, we developed a set of plugins to connect the dialogue system to the Neo4J database and to the remote actors providing the user interface. We also developed the custom function to compute the set of opposed traits given two phonemes and a utility model to select the most informative stimulus at each step. The system makes use of the prediction and feedback mechanism provided by Opendial to build the probability distributions describing the likelihood that a subject discriminates a specific trait. This is used to select the next stimulus that improves the user model the most, given previous answers. This approach results in an adaptive test. The description of the utility model is beyond the scope of this work so we provide only a brief description of the aspects it takes into account. The model considers the information entropy for each trait, the syllabic structures already used to present the available oppositions, the number of traits opposed in each possible phoneme pair and the intrinsic phoneme complexity evaluated on an acquisitional basis [26]. For each of these aspects, a specific utility value is computed. The obtained measures are combined into a utility value that is used to select the best stimulus at each step.

Figure 1. System Architecture.
Figure 2. Example query. Extracts a word pair of disyllabic phonological neighbours opposing the /p/ sound and the /b/ sound in the head of the second syllable.
Figure 3. Example rule to check whether two phonemes have the Sonorant trait in opposition. The HasTraits function has been implemented in Java and exposed to the domain specification language.

INTERFACE
The interface proposed to the child to mask the discrimination test supports a narrative in which the Nao robot wants to learn how to speak and the 3D character needs the child’s help to
                                                                                           Figure 4. The experimental setup.

teach it. A three-polar setup, shown in Figure 4, is established
to involve the child in a socially engaging situation. Through
this learning-by-teaching approach, the child is given an
authoritative role so that the child does not feel threatened or
evaluated. When the system starts, an introductory scenario is
presented and the 3D character, shown in Figure 5, introduces
itself. The scenario ends with the 3D character asking the
child to caress Nao in order to wake it up. This serves both as
an invitation to play and as a way to establish physical contact
between Nao and the child. Whether the physical attributes of
robots constitute an advantage for acceptability per se is still
a debated issue. In our work, we attempt to fully exploit the
physical presence of the robot by presenting tasks that require
the child to physically interact with it. By proposing activities
that a 3D character simply cannot be involved in, we attempt to
capitalise on the robot’s potential to provide a more engaging
multisensory experience. Caressing is also a powerful social
means of building attachment. On the other hand, the high level
of control over the 3D character’s movements makes it possible to
represent efficiently its higher competence in the considered
setup: unlike Nao, this avatar can move its lips and change its
facial expressions, providing effective indications on how to
continue playing. An advantage of the presented architecture is
that different virtual agents can be combined so that the test
builds on the distinct advantages each one offers. After a
tutorial session in which Nao performs a small set of funny
behaviours, the child is introduced to the actual test. The
dialogue manager selects the most appropriate stimulus and
coordinates the two agents so that one presents the first
(non-)word and the other presents the second. The child is given
one opportunity to listen to the stimulus again and is required
to provide same/different feedback using an evaluation card that
appears on the tablet. The interface for providing feedback is
shown in Figure 6.

Figure 5. The 3D character. It guides the child through the game
and interacts with Nao during cutscenes.
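
The trial flow described above (two items presented in sequence, at most one replay, then a same/different answer) can be illustrated with a minimal Java sketch. This is not the authors' implementation: the class, method, and enum names are hypothetical, the example items are invented for illustration, and the coordination of the two agents by the dialogue manager is left out.

```java
// Hypothetical sketch of one AX ("same-different") trial; all names are
// assumptions made for illustration and do not come from the system itself.
final class AxTrialSketch {

    /** The two answers available on the evaluation card. */
    enum Answer { SAME, DIFFERENT }

    /** One stimulus: the first item is spoken by one agent, the second by the other. */
    static final class Trial {
        private final String first;
        private final String second;
        private boolean repeatUsed = false;
        private Answer answer = null;

        Trial(String first, String second) {
            this.first = first;
            this.second = second;
        }

        /** Grants a replay only once, and only before the child has answered. */
        boolean requestRepeat() {
            if (repeatUsed || answer != null) {
                return false;
            }
            repeatUsed = true;
            return true;
        }

        /** Records the child's answer; a trial can be answered only once. */
        void respond(Answer given) {
            if (answer != null) {
                throw new IllegalStateException("Trial already answered");
            }
            answer = given;
        }

        /** Compares the recorded answer with the one implied by the two items. */
        boolean isCorrect() {
            if (answer == null) {
                throw new IllegalStateException("Trial not answered yet");
            }
            Answer expected = first.equals(second) ? Answer.SAME : Answer.DIFFERENT;
            return answer == expected;
        }
    }

    public static void main(String[] args) {
        Trial trial = new Trial("pane", "bane"); // illustrative items only
        System.out.println(trial.requestRepeat()); // first replay request is granted
        System.out.println(trial.requestRepeat()); // second replay request is refused
        trial.respond(Answer.DIFFERENT);
        System.out.println(trial.isCorrect());
    }
}
```

Keeping the replay budget inside the trial object, rather than in the tablet interface, is one way to enforce the "only once for each stimulus" constraint regardless of which agent or device triggers the replay.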

Figure 6. The tablet interface. The child gives feedback by touching
the red or green areas to evaluate Nao’s performance. A repeat button
is also present to allow the child to listen to the opposed words
again. This is allowed only once for each stimulus.

CONCLUSIONS AND FUTURE WORK
We have presented work in progress on an architecture designed
to administer gamified discrimination tests. We model the test
as a dialogue between the child and a group of virtual characters
controlled by a single artificial intelligence. Instead of
providing pre-scripted tests, we propose an approach in which the
test is generated dynamically. The system exploits a significant
amount of linguistic knowledge to automatically select the most
informative stimulus to present at each step. The architecture
makes no assumptions about the nature of the virtual agents
involved and can be reused to design other types of tests. Future
work will evaluate the usability and appreciation of the
discrimination test we are designing with children who show no
problems in language acquisition, in order to establish a baseline
for evaluating the approach with children with potential language
problems. The possibilities offered by the Mivoq engine for
training personalised voices will also be explored.

ACKNOWLEDGMENTS
Antonio Origlia’s work is supported by the Veneto Region and the
European Social Fund (grant C92C16000250006).

REFERENCES
 1. André, C., Ghio, A., Cavé, C., and Teston, B.
    PERCEVAL: a computer-driven system for
    experimentation on auditory and visual perception.
    CoRR abs/0705.4415 (2007).
 2. Brancalioni, A. R., Bertagnolli, A. P. C., Bonini, J. B.,
    Gubiani, M. B., and Keske-Soares, M. The relation
    between auditory discrimination and phonological
    disorder. Jornal da Sociedade Brasileira de
    Fonoaudiologia 24, 2 (2012), 157–161.
 3. Bruner, J. S. Acts of meaning, vol. 3. Harvard University
    Press, 1990.
 4. Caselli, M. C., and Casadio, P. Il primo vocabolario del
    bambino. Milano: Franco Angeli, 1995.
 5. Cassell, J. Towards a model of technology and literacy
    development: Story listening systems. Journal of
    Applied Developmental Psychology 25, 1 (2004),
    75–105.
 6. Cole, R. A. Roadmaps, journeys and destinations
    speculations on the future of speech technology
    research. In Eighth European Conference on Speech
    Communication and Technology (2003).
 7. Cosi, P., Delmonte, R., Biscetti, S., Cole, R. A., Pellom,
    B., and Vuren, S. v. Italian literacy tutor-tools and
    technologies for individuals with cognitive disabilities.
    In InSTIL/ICALL Symposium 2004 (2004).
 8. Dietze, F., Karoff, J., Valdez, A. C., Ziefle, M., Greven,
    C., and Schroeder, U. An open-source
    object-graph-mapping framework for neo4j and scala:
    Renesca. In International Conference on Availability,
    Reliability, and Security, Springer (2016), 204–218.
 9. Donald, M. A mind so rare: The evolution of human
    consciousness. WW Norton & Company, 2001.
10. Halliday, M. A. K. Learning How to Mean–Explorations
    in the Development of Language. ERIC, 1975.
11. Jiménez, P., Diez, J. V., and Ordieres-Mere, J. Hoshin
    kanri visualization with neo4j. empowering leaders to
    operationalize lean structural networks. Procedia CIRP
    55 (2016), 284–289.
12. Kuhl, P. K. Early language acquisition: cracking the
    speech code. Nature reviews neuroscience 5, 11 (2004),
    831–843.
13. Lison, P., and Kennington, C. Opendial: A toolkit for
    developing spoken dialogue systems with probabilistic
    rules. ACL 2016 (2016), 67.
14. Origlia, A., Paci, G., and Cutugno, F. MWN-E: a graph
    database to merge morpho-syntactic and phonological
    data for italian. In Proceedings of Subsidia (2017).
15. Pianta, E., Bentivogli, L., and Girardi, C. Developing an
    aligned multilingual database. In Proc. of the 1st
    International Conference on Global WordNet (2002).
16. Polka, L., Jusczyk, P. W., and Rvachew, S. Methods for
    studying speech perception in infants and children.
    Speech perception and linguistic experience: Issues in
    cross-language research (1995), 49–89.
17. Reddy, V. How infants know minds. Harvard University
    Press, 2008.
18. Teale, W. H., and Sulzby, E. Emergent Literacy: Writing
    and Reading. Writing Research: Multidisciplinary
    Inquiries into the Nature of Writing Series. ERIC, 1986.
19. Trevarthen, C. Communication and cooperation in early
    infancy: A description of primary intersubjectivity.
    Before speech: The beginning of interpersonal
    communication (1979), 321–347.
20. Trevarthen, C. The functions of emotion in infancy. In
    The healing power of emotion: Affective neuroscience,
    development & clinical practice (Norton Series on
    Interpersonal Neurobiology), D. Fosha and S. M. F.
    Siegel, D. J., Eds. WW Norton & Company, 2009,
    55–85.
21. Trevarthen, C., and Aitken, K. Regulation of brain
    development and age-related changes in infants.
    Motives: The Developmental Function of Regressive
    Periods. In M. Heimann (ed.) Regression Periods in
    Human Infancy. Mahwah, NJ: Erlbaum (2003),
    107–184.
22. Trevarthen, C., Hubley, P., et al. Secondary
    intersubjectivity: Confidence, confiding and acts of
    meaning in the first year. Action, gesture and symbol:
    The emergence of language (1978), 183–229.
23. Tsao, F.-M., Liu, H.-M., and Kuhl, P. K. Speech
    perception in infancy predicts language development in
    the second year of life: A longitudinal study. Child
    development 75, 4 (2004), 1067–1084.
24. Vatavu, R.-D., Cramariuc, G., and Schipor, D. M. Touch
    interaction for children aged 3 to 6 years: Experimental
    findings and relationship to motor skills. International
    Journal of Human-Computer Studies 74 (2015), 54–76.
25. Webber, J. A programmatic introduction to neo4j. In
    Proceedings of the 3rd annual conference on Systems,
    programming, and applications: software for humanity,
    ACM (2012), 217–218.
26. Zmarich, C., and Bonifacio, S. Phonetic inventories in
    italian children aged 18-27 months: a longitudinal study.
    In INTERSPEECH (2005), 757–760.