The Role of Customization in Supporting Speech Therapy: the Case of e-SpeechT

Vita Santa Barletta1, Miriana Calvano1, Antonio Curci1,2, Rosa Lanzilotti1 and Antonio Piccinno1

1 University of Bari "Aldo Moro", Via Edoardo Orabona 4, 70125, Bari, Italy
2 University of Pisa, Largo B. Pontecorvo 3, 56127, Pisa, Italy

Abstract
Speech therapy is the medical field in which speech impairments are treated. It mainly concerns the inability of people to adequately enunciate words, to construct and elaborate appropriate sentences when speaking, and, more generally, a lack of linguistic skills. This research work presents a software system, called e-SpeechT, that supports speech therapies performed both in presence and remotely, from the side of physicians as well as of the other actors involved in the therapies, i.e., patients, caregivers, and also teachers, schools, etc. The study explores the requirements and functionalities with respect to customization elements and how Artificial Intelligence (AI) can support all parties involved in the process in performing their tasks. The system was designed and developed following a participatory design approach, focusing on the different roles, experiences, and demands of the end users involved and inviting them to contribute from different perspectives.

Keywords
Participatory design, Speech Therapy, Customization, Software engineering, Artificial Intelligence (AI)

Proceedings of the 8th International Workshop on Cultures of Participation in the Digital Age (CoPDA 2024): Differentiating and Deepening the Concept of "End User" in the Digital Age, June 2024, Arenzano, Italy
Email: vita.barletta@uniba.it (V. S. Barletta); miriana.calvano@uniba.it (M. Calvano); antonio.curci@uniba.it (A. Curci); rosa.lanzilotti@uniba.it (R. Lanzilotti); antonio.piccinno@uniba.it (A. Piccinno)
ORCID: 0000-0002-0163-6786 (V. S. Barletta); 0000-0002-9507-9940 (M. Calvano); 0000-0001-6863-872X (A. Curci); 0000-0002-2039-8162 (R. Lanzilotti); 0000-0003-1561-7073 (A. Piccinno)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1. Introduction

Today, medicine is constantly exposed to innovative technologies and to new products and services. Many professionals are beginning to trust new instruments for performing their daily activities, such as administering therapies to their patients. This research work focuses on speech therapy and on how technology can be a powerful instrument to support all parties involved. More specifically, this case study examines the roles of speech therapists, caregivers, and patients, who are children aged 4 to 7 years.

Speech therapy refers to the medical field that studies linguistic impairments that can negatively influence the personal and professional sphere of an individual, affecting their ability to understand and use spoken language with others [1]. However, subjects who suffer from speech disorders can be rehabilitated by performing exercises that aim to strengthen the muscles of the face and throat, as well as their cognitive skills [2]. The traditional approach to speech therapy, without the aid of technology, is limited in multiple aspects, such as accessibility, feasibility, and personalization. The introduction of e-health therefore becomes crucial: it aims to overcome the challenges that patients and physicians face when dealing with medical issues, with benefits in terms of cost and time [3].
In the current scenario, there are systems and mobile applications developed to support patients in performing therapy activities and therapists in administering treatment. For instance, Happi Scrive1 and KidEWords2 are two crossword-puzzle applications for children, used to develop writing skills through an attractive graphical interface. Another example is Teach and Touch3, an application designed to support the speech therapist in administering the therapy; the child can follow a personalized rehabilitation path targeting specific morphosyntactic difficulties.

1 https://apps.apple.com/it/app/happi-scrive/id464675842
2 https://apps.apple.com/it/app/kidewords-by-chocolapps/id879490139
3 http://www.teachandtouch.it/

In this context, a new web application integrating AI and customization functionalities, called e-SpeechT, can be introduced. The integration of Artificial Intelligence (AI) in this scenario provides new solutions based on the needs and requirements of patients and therapists: it not only offers functionalities that automate tasks and reduce the cognitive demand of repetitive activities, but also supports the management and evolution of the system [4, 5]. In particular, patients can perform therapies in the comfort of their homes, while therapists can continuously monitor and control the process to obtain a broader and more complete understanding of their patients' medical situation [6, 7]. Customization in e-health has the potential to revolutionize speech therapies, promoting inclusion and the establishment of a symbiosis between humans and machines.

This paper presents and describes a web application, called e-SpeechT, created to support the actors involved (e.g., patients, caregivers, and therapists) in performing, administering, and managing therapy; through the integration of AI techniques, these activities can be automated and the system can further adapt to the patient's performance. This solution aims to improve the effectiveness and accessibility of speech rehabilitation; the paper presents the requirements and architectural components of e-SpeechT and illustrates how artificial intelligence is integrated into it.

2. e-SpeechT Design

e-SpeechT was developed as an agile project according to the SCRUM framework; the framework proved to be powerful and useful, especially because of its iterative nature and flexibility, since it allows user needs to be met efficiently and effectively [8]. Speech therapists were involved throughout the process, employing Human-Centered Design (HCD), which made it possible to obtain rich insights into their needs and preferences and to create compliant solutions [9]. The ultimate objective is to create an interactive system that is customizable and tailorable by end users acting as domain experts in the design team [10].

2.1. Requirements

User requirements are fundamental when designing or developing any kind of software; they describe the functionalities the final system will have and provide guidance on the priority of the tasks and activities to carry out throughout the process. For e-SpeechT, a number of actors were identified: speech therapist, caregiver, patient, school, teacher, and others, to encompass the multi-faceted roles that end users can play; for the sake of simplicity, only the three main actors are represented here. The requirements were grouped into three categories, one for each of the main actors, and elicited through multiple interviews involving professionals.
Through the therapists' collaboration in every sprint of the SCRUM process, it was possible to understand their preferences, priorities, and needs, as well as those of caregivers and patients. Some of the main requirements and the corresponding functionalities identified for each role are listed below and represented in Figure 1.

Speech therapist. Speech therapists are professionals who organize, manage, and administer therapies to children; they also have the important duty of communicating with and supporting caregivers, such as parents, so that they can help their children throughout the process [11]. In particular, the application must allow therapists to:

• Create and manage diagnoses for their patients.
• Share words and exercises with other professionals who use the system.
• Manage and schedule appointments, including those held in their office.
• Verify the results of a therapy performed by a patient.
• Be supported by an automatic correction feature that reduces their workload, while being able to change the decisions it makes so that professionals remain in control.
• Send notifications to patients about the status of their therapy.
• Automatically classify new patients according to the level of severity of their impairment.
• Change the criteria on which the system classifies patients.

Caregiver. Caregivers are the intermediaries between therapists and patients. This is a case of a multi-tiered proxy design problem, in which the user who must be able to tailor and customize the system, i.e., the caregiver, does not coincide with the actual end user of the system, i.e., the child [12]. Caregivers have to follow the children, ensure that they perform all assigned tasks, and accompany them to appointments. In particular, the application must allow caregivers to:

• Start and stop the child's execution of a therapy.
• Manage the way the system is presented to the child, in terms of wallpapers, scenarios, and graphical elements.

Patient. Patients are the target of therapies and are affected by speech impairments of different degrees of severity. Their age ranges from 4 to 7 years. The requirements collected for their personal area of e-SpeechT state that the system must:

• Display playful and gaming aspects to avoid feelings of stress or frustration.
• Include gamification, encompassing game mechanics such as rewards and playful feedback.
• Allow the child to log in without the aid of a professional or caregiver.

Figure 1: Use case diagram representing the actors and their respective functionalities.

3. Customizing the e-SpeechT Behavior

After eliciting the requirements, creating the use cases, and carrying out the preliminary phases of Human-Centered Design (HCD), e-SpeechT was designed and developed. This section illustrates how customization plays a pivotal role in the web application and how it aims to guarantee user-specific solutions to health problems, support the right diagnoses, and avoid stress, boredom, and frustration. e-SpeechT embodies End-User Development (EUD) in multiple aspects and functionalities, which differ depending on the type of user involved.
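As a purely illustrative sketch of how such per-role settings could be grouped, assuming invented names, fields, and values that are not the actual e-SpeechT data model, the different kinds of customization might be represented as follows:

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical severity labels, mirroring the Slight/Moderate/Severe grouping
# described in Section 3.1 (illustrative only).
SEVERITY_LEVELS = ("Slight", "Moderate", "Severe")


@dataclass
class TherapistSettings:
    # One automatic-correction threshold per severity level (values invented).
    correction_thresholds: Dict[str, float] = field(
        default_factory=lambda: {"Slight": 0.85, "Moderate": 0.70, "Severe": 0.55}
    )


@dataclass
class CaregiverSettings:
    # Look and feel of the child's area: the scenario chosen (or created) by the
    # caregiver and the graphical password the child logs in with.
    scenario: str = "underwater"
    graphical_password: List[str] = field(default_factory=lambda: ["star", "fish", "moon"])


@dataclass
class PatientProfile:
    # Per-child profile combining the customization coming from both adult roles.
    name: str
    severity: str = "Moderate"
    therapist_settings: TherapistSettings = field(default_factory=TherapistSettings)
    caregiver_settings: CaregiverSettings = field(default_factory=CaregiverSettings)
```

In such a scheme, therapist-side settings affect how exercises are evaluated, while caregiver-side settings only affect how the system is presented to the child; the subsections below describe the actual functionalities offered to each role.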
3.1. Speech Therapist's Participation

The e-SpeechT application allows different types of intervention, from customization to EUD activities, to personalize its behavior. On the speech therapist's side, the application allows them to create new examples with which to evaluate their patients, since each word has peculiarities that lead to different outcomes in exercises; it is also possible to assign images to words without constraints, meaning that therapists can pick images that give continuity to the work done during on-site appointments. Exercises can be created and designed based on the patients' needs, diagnosis, and level of impairment; each exercise belongs to one of three categories, i.e., naming images, minimal pair recognition, and repetition of words. In addition, therapists are free to create and modify therapies depending on the patient and their diagnosis by manipulating attributes such as the exercises to be administered, the start and end dates of the therapy, and the date, time, and place of the appointment. Therapies can contain one or more exercises, which can be administered individually or in series.

e-SpeechT offers an automatic exercise correction functionality, which exploits machine learning and speech recognition algorithms to facilitate the correction of exercises from the therapist's standpoint. Thresholds are used to adapt how the system behaves for specific groups of patients: they adjust the level of strictness applied when exercises are automatically corrected, i.e., when the patient's vocal interaction is evaluated. Three categories of thresholds are available, corresponding to three levels of severity of impairment: Slight, Moderate, and Severe, as shown in Figure 2.

Figure 2: Examples of functionalities that include user preferences and customization: (a) customization of the thresholds of the automatic correction; (b) customization of the patient's area.

3.2. Participation of the Caregiver

User-specific personalization is crucial to ensure that patients (children aged 4 to 7 years) feel welcomed, comfortable, and not under examination. In e-SpeechT, scenarios are used to differentiate the style, aesthetics, and atmosphere of the patient's page during therapy sessions, when exercises are carried out. Caregivers, through EUD activities, are able to modify entire scenarios by choosing among the ones proposed by the system or by creating a new one from scratch. Visual familiarity can prevent children from feeling examined or judged. Another important aspect of the system is the possibility of setting a graphical password for the patient, with which children can log in to the application.

4. Discussion and Future Work

This research work presents a web application, e-SpeechT, that aims to support therapists, patients, and caregivers during treatments. It describes the design and development process of the application, which was performed in accordance with Human-Centered Design and highlights its participatory nature. In this way, the role of end users can be further investigated to analyse how they progress into active participants during the design and development phases. In this context, e-health systems that include AI can bring advantages by automating tasks performed by therapists and by providing patient-specific solutions. Additionally, with improved speech recognition algorithms it would be possible to easily and automatically detect errors in enunciating or repeating words [13].
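As a hedged illustration of this idea, and not of the actual e-SpeechT implementation, the sketch below compares the text returned by a speech recognizer with the target word using a simple string-similarity score and applies the severity-dependent threshold set by the therapist (Section 3.1); all names and values are assumptions.

```python
from difflib import SequenceMatcher

# Assumed thresholds per severity level (illustrative values): stricter for
# slight impairments, more tolerant for severe ones.
THRESHOLDS = {"Slight": 0.85, "Moderate": 0.70, "Severe": 0.55}


def pronunciation_score(target_word: str, recognized_text: str) -> float:
    """Similarity in [0, 1] between the target word and the ASR transcript."""
    return SequenceMatcher(None, target_word.lower(), recognized_text.lower()).ratio()


def auto_correct(target_word: str, recognized_text: str, severity: str) -> bool:
    """Provisional pass/fail decision; the therapist can always override it."""
    return pronunciation_score(target_word, recognized_text) >= THRESHOLDS[severity]


# Example: a child with a moderate impairment repeating the word "casa".
print(auto_correct("casa", "cassa", "Moderate"))   # True: similarity ~0.89 >= 0.70
print(auto_correct("casa", "tavolo", "Moderate"))  # False: similarity ~0.20 < 0.70
```

In a real system the similarity would come from a phoneme-level or acoustic model rather than plain string matching; the sketch only shows where the therapist-configured thresholds enter the decision and why the therapist must remain able to override it.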
However, AI also presents limitations, such as limited context understanding and the risk of overreliance. For this reason, it is necessary to guarantee the right balance between technology and human intervention in clinical practice. Another important aspect is understanding how to balance the contributions of the different end users and how to ensure that they remain motivated to contribute to the system's performance over long periods of time. Therefore, as a starting point, usability studies were carried out to understand the strengths and weaknesses of the system; as future work, a longitudinal study is planned to test medical feasibility by gathering information on how children respond to therapies when using the system and on whether all the involved actors remain motivated to use e-SpeechT.

Future work also involves conducting tests to evaluate the application with respect to the WCAG (Web Content Accessibility Guidelines)4; this is an important aspect because the application is primarily aimed at individuals who may be affected by disabilities.

4 https://www.w3.org/WAI/WCAG22/Understanding/understanding-act-rules.html

Another line of future work is to expand the application's environment by integrating smart assistants and Internet of Things (IoT) devices. This feature has the objective of automating the process of guiding children in their daily activities when they have to carry out exercises at home. This work has already started but is still in its early development stages.

In conclusion, the intersection between speech therapy and technology can revolutionize the approach to dealing with speech disorders. However, it is important to underline that AI in this scenario is not a panacea, but an instrument capable of helping users perform their tasks without replacing them.

References

[1] P. Butcher, A. Elias, R. Raven, J. Yeatman, D. Littlejohns, Psychogenic voice disorder unresponsive to speech therapy: Psychological characteristics and cognitive-behaviour therapy, International Journal of Language & Communication Disorders 22 (1987) 81–92. doi:10.3109/13682828709088690.
[2] V. S. Barletta, F. Cassano, A. Pagano, A. Piccinno, A collaborative AI dataset creation for speech therapies, in: CEUR Workshop Proceedings, volume 3136, 2022, pp. 81–85. URL: https://hdl.handle.net/11586/407013.
[3] E. Sillence, L. Little, P. Briggs, E-health, in: British Computer Society Conference on Human-Computer Interaction, 2008. URL: https://api.semanticscholar.org/CorpusID:209022601.
[4] V. S. Barletta, D. Caivano, L. Colizzi, G. Dimauro, M. Piattini, Clinical-chatbot AHP evaluation based on "quality in use" of ISO/IEC 25010, International Journal of Medical Informatics 170 (2023) 104951. URL: https://www.sciencedirect.com/science/article/pii/S1386505622002659. doi:10.1016/j.ijmedinf.2022.104951.
[5] P. Tagde, S. Tagde, T. Bhattacharya, P. Tagde, H. Chopra, R. Akter, D. Kaushik, M. H. Rahman, Blockchain and artificial intelligence technology in e-Health, Environmental Science and Pollution Research 28 (2021) 52810–52831. URL: https://link.springer.com/10.1007/s11356-021-16223-0. doi:10.1007/s11356-021-16223-0.
[6] F. Cassano, A. Pagano, A. Piccinno, Supporting speech therapies at (smart) home through voice assistance, volume 483, 2022, pp. 105–113.
URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85137986566&doi=10.1007%2f978-3-031-06894-2_10&partnerID=40&md5=de8c5b2f0a3ed3f354cb010281ad4004. doi:10.1007/978-3-031-06894-2_10.
[7] V. Barletta, M. Calvano, A. Curci, A. Piccinno, A new interactive paradigm for speech therapy, in: J. Abdelnour Nocera, M. Kristín Lárusdóttir, H. Petrie, A. Piccinno, M. Winckler (Eds.), Human-Computer Interaction – INTERACT 2023, Springer Nature Switzerland, Cham, 2023, pp. 380–385. doi:10.1007/978-3-031-42293-5_39.
[8] T. Badriyah, E. N. Fadila, I. Syarif, N. Jauari Akhmad, Rapid development of m-health application with the sprint design approach and scrum process: Application development for e-prescribing, in: 2018 International Conference on Applied Science and Technology (iCAST), 2018, pp. 336–342. doi:10.1109/iCAST1.2018.8751603.
[9] V. S. Barletta, M. Calvano, A. Curci, A. Piccinno, A. Pagano, Poster: Speech therapies and smart assistants: An interaction paradigm proposal, 2023. URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85173634861&doi=10.1145%2f3605390.3610823&partnerID=40&md5=62e9ad3f964c7c3df3b04931fd4fd68a. doi:10.1145/3605390.3610823.
[10] M. F. Costabile, D. Fogli, A. Marcante, P. Mussio, L. Parasiliti Provenza, A. Piccinno, Designing customized and tailorable visual interactive systems, International Journal of Software Engineering and Knowledge Engineering 18 (2008) 305–325. URL: http://www.worldscientific.com/doi/abs/10.1142/S0218194008003702. doi:10.1142/S0218194008003702.
[11] T. Pring, E. Flood, B. Dodd, V. Joffe, The working practices and clinical experiences of paediatric speech and language therapists: a national UK survey, International Journal of Language & Communication Disorders 47 (2012) 696–708. URL: https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1460-6984.2012.00177.x. doi:10.1111/j.1460-6984.2012.00177.x.
[12] D. Fogli, A. Piccinno, Co-evolution of End-User Developers and Systems in Multi-tiered Proxy Design Problems, volume 7897 of Lecture Notes in Computer Science, Springer, Berlin Heidelberg, 2013, pp. 153–168. URL: http://dx.doi.org/10.1007/978-3-642-38706-7_12. doi:10.1007/978-3-642-38706-7_12.
[13] M. Calvano, A. Curci, A. Pagano, A. Piccinno, Speech therapy supported by AI and smart assistants, in: R. Kadgien, A. Jedlitschka, A. Janes, V. Lenarduzzi, X. Li (Eds.), Product-Focused Software Process Improvement, Springer Nature Switzerland, Cham, 2024, pp. 97–104.