Artificial Intelligence for Severe Speech Impairment:
    Innovative approaches to AAC and Communication

      Monica Murero1, 2, Salvatore Vita1, Andrea Mennitto1, Giuseppe D’Ancona3
                                 1 University Federico II, (Italy)
    2 Distributed Artificial Intelligence Laboratory, Technische Universität, Berlin (Germany)
                         3 Vivantes Klinikum Group, Berlin (Germany)

                                monica.murero@unina.it


        Abstract. This paper aims to analyze how innovative Artificial Intelligence (AI)
        systems (Voiceitt®) for non-standard speech recognition may revolutionize
        Augmentative Alternative Communication (AAC) technology for people with
        severe speech impairments. By using built-in capabilities of portable devices,
        the AI-based algorithm may "understand" dysarthric speech and “translate” it
        into a fluid real-time user communication, thanks to a “voice donor” outcome
        system. The pattern classification algorithm is customized for non-standard
        speech recognition. The AI-based system is personalized for each person's unique
        language production and offers a real step forward in AAC efficiency. Earlier
        empirical findings show limitations in analog assistive tools addressing
        Speech, Language, and Communication Needs (SLCNs). Recently, Speech-
        Generating Devices (SGDs) have been successfully used to support communi-
        cation in patients with Autism and Dysarthria.
           With impressive improvements in recognizing non-standard natural lan-
        guage, AI-based technology (supported by deep learning, big data, and cloud
        processing) is offering a turning point for personalized Augmentative Alternative
        Communication (AAC). Upcoming AI-based innovations promise to have an immense
        transformative effect on the everyday life of speech-impaired people, their
        caregivers, their significant others, and society as a whole.

        Keywords: Communication Disorders, Artificial Intelligence, non-standard
        speech recognition system.


1       Introduction

Speech, language, and communication needs (SLCNs) affect up to 1% of the world
population [1]. Recent empirical findings have shown that about 8% of children in the
United States between 3 and 17 years of age are affected by a communication disorder [2],
defined as a deficit in speech, language, or voice quality, or a swallowing problem. A deficit in
learning, using, and understanding words may also result in a communication prob-
lem. Reducing communication difficulties is fundamental because when a child has a
linguistic and communication deficit, the ability to obtain information from the envi-


       Copyright © 2020 for this paper by its authors. Use permitted under Creative
            Commons License Attribution 4.0 International (CC BY 4.0).


ronment, to develop cognitive potential, and to interact socially is significantly com-
promised, with negative consequences on the child's development and behavior [3].
   In the era of Information and Communication Technologies (ICTs), Augmentative
Alternative Communication (AAC) systems may offer new opportunities to address
communication challenges. AAC's primary mission is to "compensate, temporarily or
permanently, the patterns of disorder or disability of individuals with severe disorders
of communication" (American Speech-Language-Hearing Association [ASHA],
1989). AAC systems provide effective means of communicating and substituting
conventional language, representing a real option for people who do not learn the
language normally and readily [1, 2]. A systematic literature review [4, 5] has shown
the effects of augmented input techniques on communication production (reception,
expression, pragmatics, and syntax) in people with developmental disabilities and
verbal apraxia. AAC systems can improve children's skills in producing single words and
multi-symbol sentences. AAC tools include communication boards [6], handbooks of signs,
tangible objects that stimulate children's creativity and initiative, and
Speech-Generating Devices (SGDs).


2      SGD

Among AAC tools, SGDs have long played a leading role, thanks to their multiple
advantages over other assistive communication tools. People with SLCNs often report
difficulties in everyday interpersonal interactions [1]; the capabilities of SGDs may
help reduce the resulting social distance. The use of SGDs may apply to various
forms of speech disabilities that are secondary to chronic pathologies, such as dysar-
thria (difficulty in articulating words), apraxia (difficulty coordinating mouth and
speech movements), or aphasia (partial or total loss of the ability to communicate
verbally or using written words) [7]. Moreover, SGDs can be easily integrated and
used in everyday life. Some studies underline that SGDs are preferred over Picture
Exchange Communication Systems (PECSs) and sign manuals [4, 8].
   The diffusion of new ICTs [9] has made SGDs increasingly efficient and ergonom-
ic [1], primarily when supported by mobile devices such as tablets and smartphones
[4]. Recent research shows that SGDs are not simple substitutes for language, but
rather a framework supporting the entire communication process by increasing the
request-response process [10]. At the present stage, SGDs are among the most effec-
tive and efficient AAC tools for addressing numerous diseases [4, 10]. For example,
in the case of autism, recent literature has shown that the use of SGDs could have a
significant impact on the participant's communication skills, resulting in an increased
speech production [11, 12], increased vocabulary (even in non-vocal children) [11],
and increased social-relational skills [13]. Furthermore, a significant advantage
offered by SGDs is that their communication output consists of voice messages.
   However, the efficiency and effectiveness of SGD use remain related to and limited
by the user's capabilities [14].


3      Voiceitt: Artificial Intelligence For Non-Standard Speech
       Recognition

Speech impairments can result in isolation, depression, frustration, sense of inadequa-
cy, lack of self-esteem, and lower quality of life not only for the person living with
the disability, but also for those interacting with the disabled person including her/his
family, caregivers, friends, and colleagues [15, 16].
    A real innovation for Speech-Generating Devices that may overcome the limita-
tions imposed by the users' capabilities and the current obstacles to Speech, Lan-
guage, and Communication Needs lies in the use of Artificial Intelligence (AI). The
term "artificial intelligence" is generally attributed to John McCarthy, in connection
with the 1956 Dartmouth workshop; since then, AI has grown into a structured field of
science and engineering. According
to Russell & Norvig [17], there are eight prominent AI definitions that can be applied
according to the AI context, design, and application.
    For example, Artificial Intelligence systems based on standard or non-standard
voice recognition can be defined as "computational intelligence", or "the study of the
design of intelligent agents" [18]. In this context, the definition of AI reflects the "ra-
tional agent" approach – i.e., machines that can act and behave in "rational" ways and
try to achieve the best possible outcome, or the best-expected result [17, 19].
    In recent years, AI has been providing innovative approaches and disruptive prom-
ises to our society. Rational agents for standard speech-recognition software have
become increasingly accurate. Notably, once a Machine Learning (ML) model has been
trained, it performs its task without humans explicitly programming task-specific rules
[20]. Recent advances in Deep Learning and algorithm development have further improved
standard speech recognition systems, thanks to the availability of large datasets,
increased access to vast computational power, and the reduced cost of accessing and
storing large amounts of data [20, 21]. Thanks to AI, SGD systems are becoming
increasingly capable and ergonomic [1, 11], giving rise to a new kind of SGD:
communicators that learn from the speaker.
    AI systems for standard speech recognition (Alexa, Siri, and thousands of
voice-activated apps) do not have non-standard speech recognition capabilities and, for
this reason, are not able to "understand" speech-impaired people.
    Recently, Deep Learning speech recognition technology has been designed to un-
derstand unique speech impairments, disorders, or disabilities [22]. Voiceitt® is a
smart device software application that translates unintelligible sounds into clear
speech in real time. It can foster inclusion and allow people with disabilities to be
more independent by enabling those with a motor or cognitive disability to communicate
with caregivers, family members, health care professionals, and society as a whole. In
this context, Voiceitt® represents an unprecedented AI-based system designed to
understand non-standard speech (i.e., dysarthria) secondary to congenital or acquired
conditions, including cerebral palsy, autism, cerebrovascular accident, Parkinson's
disease, brain tumor, or traumatic brain injury. Non-standard speech
recognition uses different Deep Learning techniques, including pattern classification
technology, that is personalized for each speaker. Unlike standard speech recognition,
the system is not language-dependent but rather speaker-dependent. It is the only
Augmentative and Alternative Communication (AAC) device that allows people with
speech disabilities to communicate with their own voice as input, and to speak
through a donor voice technique as output.
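   As an illustration only, the speaker-dependent input and donor-voice output described
above can be sketched in Python. This is not Voiceitt®'s actual algorithm: the sketch
assumes that each utterance has already been reduced to a short sequence of numeric
features, uses dynamic time warping (DTW) template matching as a stand-in for the
pattern classifier, and replaces the donor-voice stage with a placeholder string. All
names (SpeakerDependentRecognizer, enroll, recognize, speak_with_donor_voice) and all
feature values are hypothetical.

```python
from math import inf


def dtw_distance(a, b):
    """Dynamic-time-warping distance between two 1-D feature sequences."""
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed alignments.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


class SpeakerDependentRecognizer:
    """Hypothetical recognizer that stores one speaker's own recordings as templates."""

    def __init__(self):
        self.templates = {}  # intended phrase -> list of feature sequences

    def enroll(self, phrase, features):
        """Store one recording of a known phrase spoken by this user."""
        self.templates.setdefault(phrase, []).append(list(features))

    def recognize(self, features):
        """Return the enrolled phrase whose template is closest, or None if empty."""
        best_phrase, best_dist = None, inf
        for phrase, recordings in self.templates.items():
            for recording in recordings:
                d = dtw_distance(features, recording)
                if d < best_dist:
                    best_phrase, best_dist = phrase, d
        return best_phrase


def speak_with_donor_voice(phrase):
    """Placeholder for the output stage; a real system would synthesize audio."""
    return f"[donor voice] {phrase}"


# Assumed toy data: each phrase is enrolled twice, as in a short training session.
recognizer = SpeakerDependentRecognizer()
recognizer.enroll("water", [1.0, 2.0, 3.0, 2.0, 1.0])
recognizer.enroll("water", [1.0, 2.0, 2.0, 3.0, 2.0, 1.0])
recognizer.enroll("yes", [5.0, 5.0, 4.0, 5.0])
recognizer.enroll("yes", [5.0, 4.0, 4.0, 5.0])

heard = recognizer.recognize([1.0, 1.0, 2.0, 3.0, 3.0, 2.0, 1.0])
print(speak_with_donor_voice(heard))
```

   Because the templates come from the user's own recordings, the matching depends on
the speaker rather than on a language model, which mirrors the speaker-dependent design
described in the text.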
   Voiceitt® combines statistical modeling and machine learning in its app, so that
speech-impaired people can be understood by anyone, not only by caregivers. People who
do not routinely interact with a speech-impaired person may struggle to understand
her/his voice, whereas family members, friends, and caregivers can often understand it
with ease, because they have learned to adapt to her/his unique pattern of speech.
Building on this observation, the app's algorithms are designed to replicate this
adaptation, offering a new algorithmic solution for recognizing impaired speech. Each
individual user has a distinct phonetic inventory, so adapting to an individual's
communication is similar to adjusting the system to a new language. Voiceitt® manages
this with a two-stage solution:


• Collect recordings of known utterances: When the application is first launched, an
  initialization step asks the user to provide a small sample of recordings (five words,
  each repeated twice). The algorithm learns how the user says these specific words and
  recognizes them when spoken. As the user continues to use the application in this
  way, the algorithm steadily learns from the user's new recordings.
• Apply clustering techniques to the user's phonetic inventory: Once a large number of
  recordings has been collected from a user, the algorithm can run clustering methods on
  the phonetic data and identify units of sound in the user's unique speech style. These
  units can then be mapped to the units of sound used in standard speech recognition,
  extending the application to a larger, open-ended vocabulary.

   When used in combination with the Internet of Things (IoT), the app has proven to be
effective in increasing autonomous behavior [1]. Speech-impaired people may ultimately
control their environment and internet-connected digital objects through clear
voice-commanded applications [22].
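
   The clustering stage described above can be sketched as follows. This is a minimal
illustration under stated assumptions, not the app's implementation: it assumes that
each short audio frame has been reduced to a two-dimensional feature vector, and it uses
plain k-means as one possible clustering method (the source does not name the method
used). The function names, the number of clusters, and the sample values are
hypothetical.

```python
from math import dist


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            for p in points]


def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-point initialization."""
    centroids = [tuple(points[0])]
    while len(centroids) < k:
        # Pick the point that is farthest from all centroids chosen so far.
        farthest = max(points, key=lambda p: min(dist(p, c) for c in centroids))
        centroids.append(tuple(farthest))
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[i])  # keep an empty cluster's centroid
        if updated == centroids:
            break  # converged
        centroids = updated
    return centroids, assign(points, centroids)


# Assumed toy "phonetic" frames from one user: three loose groups of sounds.
frames = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
    (9.0, 1.0), (9.1, 1.2), (8.9, 0.9),
]
centroids, labels = kmeans(frames, k=3)
print(labels)
```

   Each resulting cluster plays the role of one unit of sound in the user's personal
inventory. Mapping those clusters to the units used by a standard recognizer is a
separate step that this sketch does not attempt.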


4      Conclusion

Speech impairments can result in isolation, depression, frustration, sense of inadequa-
cy, lack of self-esteem, and reduced quality of life [15, 16, 23].
   AI may accelerate progress in serving people with complex communication needs,
non-standard speech impairments, and multiple disabilities. Voiceitt® may offer
voice-activated support for daily living, opening up more autonomous prospects for
people with disabilities. Moreover, upcoming AI-based technological systems combined
with convergent applications (Internet of Things, IoT) will bring both new opportunities
and implementation challenges. Future research should focus on improving the prototype's
features, enlarging non-standard speech datasets, broadening the range of possible
users, and expanding AI-based solutions that converge with IoT. Addressing these
challenges will lead to new opportunities for speech-impaired people, including
improved chances to communicate and participate in society. In conclusion,
innovative AI systems are a turning point for the production of unprecedented Aug-
mentative Alternative Communication (AAC) solutions addressing Speech, Lan-
guage, and Communication Needs in people with speech impairment.


References

1. Elsahar, Y., Hu, S., Bouazza-Marouf, K., Kerr, D., Mansor, A.: Augmentative and
    alternative communication (AAC) advances: A review of configurations for
    individuals with a speech disability. Sensors. 19, 1911 (2019).
2. Black, L.I., Vahratian, A., Hoffman, H.J.: Communication Disorders and Use of
    Intervention Services among Children Aged 3-17 Years: United States, 2012.
    NCHS Data Brief. Number 205. Centers Dis. Control Prev. (2015).
3. Hirotomi, T.: An AAC system designed for improving behaviors and attitudes in
    communication between children with CCN and their peers. In: International
    Conference on Universal Access in Human-Computer Interaction. pp. 530–541.
    Springer (2018).
4. Rega, A., Mennitto, A., Iovino, L.: Liar (Language Interface For Autistic’s
    Rehabilitation): Technological Aids For Specialists Supporting The Acquisition
    Of Verbal Behavior In Persons With Autism. In: 9th International Conference on
    Education and New Learning Technologies. IATED (2017).
5. Logan, K., Iacono, T., Trembath, D.: A systematic review of research into aided
    AAC to increase social-communication functions in children with autism
    spectrum disorder. Augment. Altern. Commun. 33, 51–64 (2017).
6. Bondy, A.S., Frost, L.A.: The picture exchange communication system. Focus
    autistic Behav. 9, 1–19 (1994).
7. Kerr, D., Bouazza-Marouf, K., Gaur, A., Sutton, A., Green, R.: A breath
    controlled AAC system. (2016).
8. Gevarter, C., O’Reilly, M.F., Rojeski, L., Sammarco, N., Lang, R., Lancioni, G.E.,
    Sigafoos, J.: Comparisons of intervention components within augmentative and
    alternative communication systems for individuals with developmental
    disabilities: A review of the literature. Res. Dev. Disabil. 34, 4404–4414 (2013).
9. Murero, M., Rice, R.E.: The Internet and health care: theory, research, and
    practice. Routledge (2013).
10. Ricci, C., Miglino, O., Alberti, G., Perilli, V., Lancioni, G.E.: Speech generating
    technology to support request responses of persons with intellectual and multiple
    disabilities. Int. J. Dev. Disabil. 63, 238–245 (2017).
11. Wahl, B., Cossy-Gantner, A., Germann, S., Schwalbe, N.R.: Artificial intelligence
    (AI) and global health: how can AI contribute to health in resource-poor settings?
    BMJ Glob. Heal. 3, e000798 (2018).
12. Kasari, C., Kaiser, A., Goods, K., Nietfeld, J., Mathy, P., Landa, R., Murphy, S.,
    Almirall, D.: Communication interventions for minimally verbal children with
    autism: A sequential multiple assignment randomized trial. J. Am. Acad. Child
    Adolesc. Psychiatry. 53, 635–646 (2014).
13. Allen, A.A., Schlosser, R.W., Brock, K.L., Shane, H.C.: The effectiveness of
    aided augmented input techniques for persons with developmental disabilities: A
    systematic review. Augment. Altern. Commun. 33, 149–159 (2017).
14. Bourque, K.S., Goldstein, H.: Expanding communication modalities and functions
    for preschoolers with autism spectrum disorder: Secondary analysis of a peer
    partner speech-generating device intervention. J. Speech, Lang. Hear. Res. 63,
    190–205 (2020).
15. Feeney, R., Desha, L., Ziviani, J., Nicholson, J.M.: Health-related quality-of-life
    of children with speech and language difficulties: A review of the literature. Int. J.
    Speech. Lang. Pathol. 14, 59–72 (2012).
16. Zhong, B.-L., Luo, W., Xu, Y.-M., Li, W.-X., Chen, W.-C., Liu, L.-F.: Major
    depressive disorder in Chinese persons with speech disability: High rates of
    prevalence and perceived need for mental health care but extremely low rate of
    use of mental health services. J. Affect. Disord. 263, 25–30 (2020).
17. Intelligence, A.: Artificial Intelligence. Wikipedia Semi-Structured Resour.
    (2013).
18. Poole, D., Mackworth, A., Goebel, R.: Computational Intelligence. (1998).
19. Iovino, L., Vita, S., Mennitto, A.: “CareMe”: a new way to face problem
    behaviors at school. In: PSYCHOBIT (2019).
20. Parloff, R.: Why deep learning is suddenly changing your life. Fortune. New York
    Time Inc. (2016).
21. Vander Ark, T.: The Rise of AI: What’s Happening, What it Means, How to
    Prepare?, https://www.gettingsmart.com/2018/03/rise-ai-whats-happening-means-
    prepare/.
22. Murero, M.: Wearable internet for wellness and health. Geogr. Internet. (2020).
23. Scherbaum, R., Hartelt, E., Kinkel, M., Gold, R., Muhlack, S., Tönges, L.:
    Parkinson’s Disease Multimodal Complex Treatment improves motor symptoms,
    depression and quality of life. J. Neurol. 267, 954–965 (2020).