Constructing a neurovisual therapy with a social robot for a neglect handicap

Alexandru Bundea, Peter Forbrig
Universität Rostock, Fakultät für Informatik, Lehrstuhl für Softwaretechnik, Albert-Einstein-Str. 22, 18059 Rostock, Germany

Abstract
There is a global shortage of medical personnel to cope with the demand for stroke therapies. Social robots may take on the role of therapists to guide patients through selected therapies. This paper presents a neurovisual therapy for the neglect disorder using a social robot. The system can perform tasks for optokinetic stimulation, training of gaze saccades, and visual exploration. Some of the feedback from the robot is of fundamental importance: in addition to instructions, the patient receives cues from the robot that help him acquire a search strategy for himself within the therapy tasks. In the course of the paper, besides an introduction to the topic, we give an overview of other neurovisual therapies and then introduce our system.

Keywords: Social Robot, Collaboration, Therapy

EICS '22: Engineering Interactive Computing Systems conference, June 21–24, 2022, Sophia Antipolis, France.
EMAIL: alexandru-nicolae.bundea@uni-rostock.de (A. Bundea); peter.forbrig@uni-rostock.de (P. Forbrig)
ORCID: 0000-0001-8315-3405 (A. Bundea); 0000-0003-3427-0909 (P. Forbrig)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org).

1. Introduction

People around the world are experiencing increased life expectancy. With increasing age, the chance of suffering an affliction such as a stroke increases. Consequently, strokes are becoming more common among the population in absolute terms. Already today, it is difficult to provide sufficient medical care to people after a stroke. A stroke can result in different disorders; here we consider a neurovisual disorder. The general probability of suffering such a disorder after a stroke is between 40% and 60% in people over 65 [1]. A "neglect" disorder is one of these possible outcomes [2]. Patients suffering from a neglect disorder show impaired or lost awareness of events and of visual, auditory, and tactile stimuli located on the contralesional side of space. For example, a patient may not recognize all objects on a table, even though it appears that he has a full view of the table. Situations such as daily road traffic can therefore become particularly problematic, but everyday activities such as eating and washing can also become difficult [1].

Spontaneous recovery usually occurs in the first 3 months [3]. After this time, recovery becomes increasingly unlikely, and most sufferers will have permanent damage to their field of vision thereafter [4]. Typically, training against a neglect disorder is compensatory. Consequently, it tries to help patients live better with their handicap. The goal of these therapies should be that patients can better manage their daily lives. For this purpose, neurovisual therapies are performed with a therapist in a one-to-one setting. The therapist explains the tasks to the patient, gives feedback, and motivates him. Due to the shortage of medical personnel, patients are often not provided enough therapy hours. Although patients at this stage will likely not experience a real increase in their ability to see, additional therapy could still help them to be more active in their daily lives. Therefore, the idea is to use a socially assistive robot (SAR) in a neglect therapy that accompanies the patient through the sessions. The patient may still want to see a real medical person for their therapy, but if future robot therapies could provide a similar effect to human therapists, they could provide complementary therapy sessions for a patient.
Theoretically, a patient can work with the neglect application alone, but without interaction with a human-like counterpart, the motivation to continue the sessions may decrease, and patients are more tempted to drop out of their scheduled therapy. Thus, we intend to use the robot as a motivational helper. In this short paper, we present a system for such a neurovisual therapy with a SAR. In the following, we review related digital neurovisual therapy applications and give a system overview. We then briefly discuss our experience with this therapy system and its limitations, based on our preliminary work within the E-BRAiN project.

2. Related work

To the best of our knowledge, there is no neurovisual therapy system that works with a SAR. Therefore, at this point, we look at other, already established therapy systems. Kerkhoff et al. [5] present their comprehensive software EYEMOVE for the standardized diagnosis and therapy of visual exploration disorders. The software was developed for ordinary Windows computers and contains a variety of tasks. Among them are tasks for training gaze saccades, but also tasks for the visual exploration of everyday situations. It also includes a diagnostic function for testing visual exploration ability. During the actual procedure, a head support is worn and a one-time calibration is performed for the patient. The patient is supposed to learn how to handle the tasks in practice rounds. Another software package is the "OK-Neglecttraining" by Psycware [6]. It offers saccade, exploration, and reading exercises. However, this software has not been actively developed since 2004.

Currently, a new development direction is being pursued with virtual reality. The use of corresponding head-mounted devices has not yet been widely tested with neurovisual therapy, but certain experiments are in the process of being researched. One advantage of VR devices is the representation of a putative three-dimensional space, which exerts a greater influence on spatial encoding [7]. Knobel et al. [8] investigated an immersive 3D swiping task for the neglect disorder. Patients were asked to mark objects in a three-dimensional space in a hemisphere stretched around them. The task could be performed well and was accepted, and it showed a high correlation with a control group who did a comparable task using pencil and paper.

3. Technical overview and design of the system

A graphical representation of the technical overview of the E-BRAiN system is given in Figure 1. The system was built for different post-stroke therapies, but here we show the case of a neglect therapy. This therapy was developed together with the medical partners of the E-BRAiN project and consists of different single exercises. These are optokinetic stimulation [9], gaze saccades [10], and three forms of visual exploration [11]. The exercises have already been clinically tested but still had to be adapted so that the system and the robot can provide appropriate feedback when the patient needs help. The hardware setup consists of several components, as seen in Figure 1.
The data flows between the devices, roughly separated by functionality, are also shown there. We have a (1) central (Linux) computer with the stored patient and therapy data, (2) the Pepper robot, (3) a monitor, and (4) an (Android) tablet. The therapy is started and controlled by a therapist who is present, via a separate (5) administrator PC. The monitor and robot are shown in Figure 2.

On the software side, most of the programs on the central computer were written in Python. Services that we used to a significant extent are MQTT [12] as the communication protocol between the devices and the central computer, and RosaeNLG [13] for creating the text content for the robot. The process of a therapy is such that first a potential stroke patient is medically screened. Only if the patient shows enough potential to profit from the training do we continue. This data is entered into the database via a web interface hosted on the central computer. Depending on the severity of the handicap, a specific configuration is set for the neurovisual therapy (NVT) exercises. Therapy appointments are then created and started there by command. A specific Python script is executed, which queries the patient data at the beginning and then performs the therapy with the patient-specific configuration. The therapy interaction was built on a finite-state model, so the therapy session starts in the first state "Greeting" and progresses through the model until the final state "Saying Goodbye" (a minimal sketch of this flow is given at the end of this section). Before the therapy session starts, all end devices must already be switched on. This concerns mainly the robot and a 27-inch monitor with an attached Android tablet. The central script sends the content to be displayed as a JSON message to the end devices (see the second sketch at the end of this section); the Android NVT app we developed then interprets these messages and displays the content. The app was implemented as an Android app so that eventually the robot could also display the NVT content, because our version of the Pepper robot can only be operated through an Android app.

Figure 1: Overview of the system components [14]
Figure 2: Visible patient setup, task 1, optokinetic stimulation

The design of the therapy was driven by the past experience of our medical partners with neglect patients. Since the E-BRAiN system had already been built for other stroke therapies, we were able to use this infrastructure to create the program logic and to develop the front end for the patients. For medical and ethical reasons, we included affected stroke patients only later in the development phase, when the robot therapy was running stably. In this study, we mainly want to explore the therapeutic success and acceptance of utilizing a SAR in this kind of therapy. Therefore, we orient ourselves on the course of existing therapies. The medical partners already had experience with treating neglect patients with the other neurovisual therapy programs mentioned above. On this basis, a therapy script was defined that specifies what a session should contain and what the patient should see. The development team met regularly with the medical partners to discuss and evaluate the progress of the implementation. The participating patients have confirmed that the therapy tasks can be carried out well with the visual content and the touch screen.
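To illustrate the finite-state session flow, the following minimal Python sketch runs a session from "Greeting" to "Saying Goodbye". Only these two state names are taken from the system description; the intermediate states, the task list, and the handler bodies are hypothetical placeholders, not the actual E-BRAiN dialog scripts.

```python
# Minimal sketch of the finite-state session flow (illustrative only).
from typing import Callable, Dict, Optional

class TherapySession:
    """Runs one session as a simple finite-state machine."""

    def __init__(self, patient_config: dict):
        self.patient_config = patient_config
        self.remaining_tasks = ["optokinetic_stimulation", "gaze_saccades",
                                "visual_exploration"]  # placeholder task list
        # Each handler performs one state and returns the next state name,
        # or None when the session is over.
        self.handlers: Dict[str, Callable[[], Optional[str]]] = {
            "Greeting": self.greeting,
            "ExplainTask": self.explain_task,
            "RunTask": self.run_task,
            "ShowDiagram": self.show_diagram,
            "Saying Goodbye": self.saying_goodbye,
        }

    def run(self) -> None:
        state: Optional[str] = "Greeting"
        while state is not None:
            state = self.handlers[state]()

    def greeting(self) -> Optional[str]:
        # Robot welcomes the patient and introduces today's session.
        return "ExplainTask"

    def explain_task(self) -> Optional[str]:
        # Robot explains the next exercise in detail.
        return "RunTask"

    def run_task(self) -> Optional[str]:
        # Exercise rounds run here; repeated prompts keep the patient focused.
        self.remaining_tasks.pop(0)
        return "ShowDiagram"

    def show_diagram(self) -> Optional[str]:
        # Daily performance diagram is shown and commented on by the robot.
        return "ExplainTask" if self.remaining_tasks else "Saying Goodbye"

    def saying_goodbye(self) -> Optional[str]:
        # Robot gives a closing comment and says goodbye.
        return None

if __name__ == "__main__":
    TherapySession(patient_config={"affected_side": "left"}).run()
```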
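The dispatch of display content as a JSON message over MQTT could look roughly like the following sketch using the paho-mqtt client. The topic name, the payload fields, and the broker address are assumptions for illustration; they do not reproduce the actual E-BRAiN message format.

```python
# Minimal sketch: the central script publishes display content as JSON via MQTT.
# Topic name, payload fields, and broker address are assumptions.
import json

import paho.mqtt.client as mqtt

def publish_display_content(client: mqtt.Client, content: dict) -> None:
    """Send one piece of screen content to the Android NVT app."""
    client.publish("ebrain/nvt/display", json.dumps(content), qos=1)

if __name__ == "__main__":
    # paho-mqtt 1.x style constructor; version 2.x additionally expects a
    # callback API version as the first argument.
    client = mqtt.Client()
    client.connect("localhost", 1883)  # Mosquitto broker on the central computer
    client.loop_start()

    # Hypothetical message: show a gaze-saccade template with one target symbol.
    publish_display_content(client, {
        "task": "gaze_saccades",
        "action": "show_template",
        "symbol": "star",
        "position": {"x": 0.8, "y": 0.25},  # relative screen coordinates
    })

    client.loop_stop()
    client.disconnect()
```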
4. Implemented neglect therapy

At the beginning, the patient receives an introduction to the therapy and to the respective session day. The patient should sit with his eyes 30 cm in front of the monitor. The patient should look at a fixation point placed at 1/3 or 2/3 of the width of the screen and vertically in the center, and concentrate on it. The affected side is to be trained, and this side should be exposed to the screen, which means that the healthy side should face the narrow edge of the monitor. Each task is explained again in detail before it is performed. We had to develop and adapt the tasks so that the robot can provide feedback accurately. During therapy, it is important to give the patient instructions repeatedly. Patients with a neurovisual handicap are noticeably easy to distract [15]. For this reason, a large part of a therapy session consists of explanatory passages and prompts to keep the patient focused on the tasks. In absolute terms, this means that in a session of, e.g., 60 minutes, about half of the time is not spent on active exercises. Concerning the visual fields which could be trained in a neglect therapy (left, right, top, bottom), we focused on training an affected left or right side. Our therapy comprises 5 different tasks.

Task 1 "Optokinetic stimulation": At the beginning of the task, symbols are shown to the patient. During this task, the patient is supposed to touch a symbol at the edge of the screen on the healthy side and follow this symbol with the eyes without turning the head. When the touched symbol has reached the edge of the screen, the patient is supposed to say the word "Now" soon afterwards. The number of times the patient manages to do this correctly is measured. Speech recognition works with the built-in tablet microphone and the VOSK library for recognizing the word (a sketch of such keyword spotting is given after the task descriptions). The robot gives feedback on all of the patient's actions.

Task 2 "Gaze saccades": The patient first sees a white cross at the location of the fixation point. After a short time, it turns red, and a single symbol appears at a random position on the screen. The patient should then touch it. If the patient does not find the symbol after a short time, the robot has a "help mode" and gives a hint of where the searched symbol should be. The help mode is an important feature of the neglect therapy, as it prompts the patient to search the template in a specific way and thereby learn a "search strategy". For this, the screen is conceptually divided into 9 equal rectangles, and the robot gives a hint of the rectangle in which the symbol can be found. The patient is given prompts in a certain order: he should search from top to bottom, starting with the "healthy" third of the screen, through the middle, to the handicapped side (a sketch of this grid logic follows below). If this strategy is used, a neglect patient can help his brain to recognize more objects in his field of vision. If the patient is unsuccessful in finding the symbol, a new gaze saccade template is presented.

Task 3 "Visual exploration" - (3a) "Detection of all target stimuli", (3b) "Detection of 'the other' target stimulus", (3c) "Visual exploration with photographic material": These tasks are similar in principle; they involve the search for target stimuli on templates. For this purpose, randomly generated templates are used, on which the patient has to touch the correct target stimuli. There are 5 levels of templates; on higher levels, (many) "distractor" symbols appear. The patient is supposed to try to get as many templates correct as possible. Again, as in the second task, the robot helps with the help mode if the patient cannot find the remaining target(s). Task 3c is special because it no longer works with random templates but with photo series. In this task, patients are supposed to find, e.g., cows in pictures. To make this task work reliably, we have "masked" the target stimuli (i.e., the cows) and saved these target masks of the objects as individual mask files in PNG format. When a patient touches an image, we compute whether the touch point hit a target object or not (a sketch of this hit test is given below). The preparation of this task required a lot of work beforehand, and the masks should be as exact as possible. Drawing the outlines of irregular objects (here the cows) required careful work; this was done for 233 images. For future analysis of the images, we log the result data of each image to be able to identify which images have been problematic for patients.

Figure 3: Screenshot of task 3b, visual exploration, level 1
Figure 4: Screenshot of task 3b, visual exploration, level 5
Figure 5: Screenshot of task 3c, visual exploration, level 2
Figure 6: Example execution of the visual exploration task 3c
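For Task 1, recognizing the spoken keyword could be implemented with VOSK roughly as in the following sketch. The model path, the grammar restriction, and the use of the sounddevice library for microphone input are assumptions for illustration; the actual system listens on the Android tablet's built-in microphone.

```python
# Minimal sketch of keyword spotting for the spoken "Now" with VOSK.
import json
import queue

import sounddevice as sd
from vosk import Model, KaldiRecognizer

SAMPLE_RATE = 16000
audio_queue: "queue.Queue[bytes]" = queue.Queue()

def audio_callback(indata, frames, time, status):
    # Called by sounddevice for every captured audio block.
    audio_queue.put(bytes(indata))

def wait_for_keyword(keyword: str = "now") -> None:
    """Block until the patient says the keyword."""
    model = Model("model")  # path to a downloaded VOSK model (assumption)
    # Restrict recognition to the keyword plus "unknown" to reduce false hits.
    recognizer = KaldiRecognizer(model, SAMPLE_RATE,
                                 json.dumps([keyword, "[unk]"]))
    with sd.RawInputStream(samplerate=SAMPLE_RATE, blocksize=8000,
                           dtype="int16", channels=1,
                           callback=audio_callback):
        while True:
            data = audio_queue.get()
            if recognizer.AcceptWaveform(data):
                result = json.loads(recognizer.Result())
                if keyword in result.get("text", ""):
                    return  # the robot would now give positive feedback
```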
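For the help mode of Task 2, the division of the screen into nine rectangles and the order of the prompts can be sketched as follows. The column-wise ordering (healthy third first, top to bottom) is our reading of the described search strategy, and all names are illustrative rather than taken from the actual implementation.

```python
# Sketch of the help-mode hint order over a 3x3 screen grid (illustrative only).
from typing import List, Tuple

Cell = Tuple[int, int]  # (row, column); row 0 = top, column 0 = left

def hint_order(affected_side: str) -> List[Cell]:
    """Order in which the nine screen rectangles are mentioned in prompts."""
    # Start with the column on the healthy side, then the middle,
    # then the affected side; within each column go from top to bottom.
    columns = [2, 1, 0] if affected_side == "left" else [0, 1, 2]
    return [(row, col) for col in columns for row in range(3)]

def cell_of(x: float, y: float) -> Cell:
    """Map a relative screen position (0..1, 0..1) to its grid cell."""
    return (min(int(y * 3), 2), min(int(x * 3), 2))

def spoken_hint(x: float, y: float) -> str:
    """Hypothetical phrasing of a hint for the cell containing the target."""
    rows = ["upper", "middle", "lower"]
    cols = ["left", "middle", "right"]
    row, col = cell_of(x, y)
    return f"Look at the {rows[row]} {cols[col]} part of the screen."

if __name__ == "__main__":
    # Prompt order for a patient with an affected left side:
    print(hint_order("left"))
    # Hint for a target at 20% of the width and 70% of the height:
    print(spoken_hint(0.2, 0.7))
```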
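For Task 3c, the hit test of a touch point against a PNG target mask could be implemented roughly as below, here using the Pillow library. The file naming and the convention that a non-transparent mask pixel marks the target are assumptions; the production code may encode the masks differently.

```python
# Minimal sketch of the touch hit-test against a PNG target mask (Task 3c).
from PIL import Image

def touch_hits_target(mask_path: str, touch_x: int, touch_y: int) -> bool:
    """Return True if the touched pixel lies inside the masked target object."""
    mask = Image.open(mask_path).convert("RGBA")
    if not (0 <= touch_x < mask.width and 0 <= touch_y < mask.height):
        return False
    _, _, _, alpha = mask.getpixel((touch_x, touch_y))
    return alpha > 0  # assumption: non-transparent pixels belong to the target

if __name__ == "__main__":
    # Hypothetical usage: one mask file per target object of a photo.
    if touch_hits_target("cow_photo_012_mask_01.png", 640, 380):
        print("Target found")  # the robot would give positive feedback here
```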
5. Therapy sequence

A therapy runs in such a way that a patient sits down in front of the monitor and presses a "Start" button on the screen. This is followed by a welcome to today's session and a brief general introduction to the therapy. Shortly thereafter, the explanation of the first task starts. The tasks are performed in the order described in Section 4. After the explanation of a task, the patient can choose alternating symbols and colors to be used in the following exercise round. In Task 1 (optokinetic stimulation), the patient goes through the configured number of rounds, each of which runs for a fixed time of 3 minutes. This also applies to Task 2 (gaze saccades). After each of these rounds, a diagram showing the patient's performance is displayed, and the robot says a few sentences about the patient's last performance. In Tasks 3a, 3b, and 3c, no rounds are performed; instead, they run continuously for a configured number of minutes. At the end of each exercise of a day, there is a diagram which shows the total daily performance of the exercise and compares it, if available, with the performance data of the previous days, as seen in Figure 7. This is followed by an optional break, which patients can skip to move on to the next exercise. Once all of the day's exercises are completed, a closing commentary follows to motivate the patient, and the patient is then allowed to give feedback on today's session. Lastly, the robot says goodbye.

Figure 7: Screenshot of the monitor with an example performance diagram of task 2 "gaze saccades" (in German)

6. Discussion and limitations of the system

The system can provide patients with the neurovisual tasks we have programmed and displays exercises on the screen comparable to those presented in the related work. Nevertheless, we have kept the therapy application tasks close to existing task types and other therapy software programs. For example, the help prompts for the tasks are now spoken by the robot instead of the human therapist. Additionally, as in other neurovisual applications, these prompts are not shown on the screen; we did not implement this either. One advantage of the system is the robot, which can motivate the patient.
Furthermore, we provide a comprehensive introduction to each task with the explanatory texts, which should allow a patient to train with the system and the robot by himself. In addition, the system can create motivating incentives through the daily performance analysis and the overall comparative monitoring of the therapy performance. Certain methods in neurovisual rehabilitation, such as the use of touch input on screens, are still relatively new but can provide interesting new opportunities and observations. This differs from other visual exploration programs, in which target stimuli are counted and then entered into the program via keyboard rather than via touch input as in our system. Whether this approach provides an advantage could be investigated further.

One more medical proposal was the simplification of texts and diagrams to avoid confusing patients with complicated content. This mainly affected the feedback dialogue and the performance diagram of a task. Specifically, certain metrics were removed from the therapy diagrams, among other things, so that patients do not focus on other performance values during the tasks. For example, in Task 3, the performance diagram only consists of the average level of exploration templates achieved today. In addition, we do not use decimal numbers but always round to a whole number. Nevertheless, since this system is only a prototype that has been tested on only a few patients so far, there are likely some points that can be improved.

During the development of the system with the medical partners, several things had to be implemented differently than when a human therapist treats the patient. The biggest problem is the lengthening of the therapy caused by the additional time needed for the still necessary explanation texts and repeated prompts to perform the tasks correctly. One such phrase could be "Well done in the last template! Now please look again at the fixation point!". This means that, depending on the exercise configuration, in a case such as 60 minutes of therapy time there are approximately 30 minutes of "net" exercise time included. This amount of non-task time is due in part to the explanatory texts that are repeated on each therapy day, and also to frequently repeated prompts during the exercises to focus on, e.g., the fixation point. This may be worrying, since patients in an early stage of their rehabilitation are often already at their limit after 20-30 minutes of continuous neglect therapy [16]. A shorter active practice time can also be selected in our system, but this worsens the time ratio in favor of the explanatory texts, which always remain the same length. The reason for these repeated prompts and lengthy explanations is to keep the patient focused on the task. As mentioned before, a patient with a neglect disorder is rather easily distracted. This can manifest itself in incidents such as the patient doing something else in the middle of a task. For example, as described earlier for Task 3c, patients should look for target stimuli such as cows. However, patients may suddenly start counting flowers or respond to sounds outside the therapy room. In such situations, a human therapist can intervene and remind the patient of the task in progress as needed. However, our therapy system currently cannot do this; it would require certain additional methods and hardware.
In the future, mechanisms could be built in that use, for example, eye tracking to monitor whether the patient is probably still engaged in the actual task. From our current impressions, the robot changes the way the therapy is carried out mainly in that only the robot speaks and the patient only listens. This is in contrast to sessions with a human therapist, in which dialogues between patient and therapist can arise. A (verbal) dialogue option is not included in our current system, since the patients should primarily concentrate on carrying out the session.

7. Conclusion and outlook

We presented a neurovisual therapy for the neglect disorder using a social robot as an instructor. We showcased the system with its hardware components and explained the tasks we adapted from clinically proven therapies. We then briefly described the course of a therapy and discussed the key features of the system. With the larger patient study that is starting soon, we hope to provide two contributions: (1) to show that a neurovisual therapy with a SAR can provide an objectively comparable training improvement, and (2) to report the experiences from the operation of such a therapy application and which features and special characteristics such a system must have in order to offer real patients the greatest possible chance of coping better with everyday life again.

8. Acknowledgements

This joint research project "E-BRAiN - Evidence-based Robot Assistance in Neurorehabilitation" is supported by the European Social Fund (ESF), reference: ESF/14-BM-A55-0001/19-A01, and the Ministry of Education, Science and Culture of Mecklenburg-Vorpommern, Germany. The sponsors had no role in the decision to publish or in any content of the publication.

9. References

[1] Irwin B. Suchoff, Neera Kapoor, Kenneth J. Ciuffreda, Daniella Rutner, Esther Han, and Shoshana Craig. 2008. The frequency of occurrence, types, and characteristics of visual field defects in acquired brain injury: A retrospective analysis. Optometry - Journal of the American Optometric Association 79, 5, 259–265. DOI: https://doi.org/10.1016/j.optm.2007.10.012.
[2] Darren S. J. Ting, Alex Pollock, Gordon N. Dutton, Fergus N. Doubal, Daniel S. W. Ting, Michelle Thompson, and Baljean Dhillon. 2011. Visual neglect following stroke: current concepts and future focus. Survey of Ophthalmology 56, 2, 114–134. DOI: https://doi.org/10.1016/j.survophthal.2010.08.001.
[3] Tanja C. W. Nijboer, Boudewijn J. Kollen, and Gert Kwakkel. 2013. Time course of visuospatial neglect early after stroke: a longitudinal cohort study. Cortex 49, 8, 2021–2027. DOI: https://doi.org/10.1016/j.cortex.2012.11.006.
[4] Josef Zihl. 2013. Rehabilitation of visual disorders after brain injury (2nd ed.). Neuropsychological Rehabilitation. Psychology Press, Hove.
[5] G. Kerkhoff and C. Marquardt. 2009. EYEMOVE. Standardisierte Diagnostik und Therapie visueller Explorationsstörungen. Der Nervenarzt 80, 10, 1190, 1192-4, 1196-204. DOI: https://doi.org/10.1007/s00115-009-2811-4.
[6] A. Beer. 2022. OK-Neglecttraining. Psycware: Psychologische Software und Medien (2022). Retrieved April 11, 2022 from https://www.psycware.de/index.html.
[7] Michael Knodt. 2022. Einsatz immersiver virtueller Realitäten präsentiert über ein Head-mounted Display in der neurologischen Rehabilitation. Dissertation. Universität Saarland.
[8] Samuel E. J. Knobel, Brigitte C. Kaufmann, Stephan M. Gerber, Dario Cazzoli, René M. Müri, Thomas Nyffeler, and Tobias Nef. 2020. Immersive 3D Virtual Reality Cancellation Task for Visual Neglect Assessment: A Pilot Study. Frontiers in Human Neuroscience 14, 180. DOI: https://doi.org/10.3389/fnhum.2020.00180.
[9] Jong H. Kim, Byung H. Lee, Seok M. Go, Sang W. Seo, Kenneth M. Heilman, and Duk L. Na. 2015. Improvement of hemispatial neglect by a see-through head-mounted display: a preliminary study. Journal of NeuroEngineering and Rehabilitation 12, 114. DOI: https://doi.org/10.1186/s12984-015-0094-5.
[10] Dale Purves, Ed. 2001. Neuroscience (2nd ed.). Sinauer Associates, Sunderland, Mass.
[11] Brigitte C. Kaufmann, Samuel E. J. Knobel, Tobias Nef, René M. Müri, Dario Cazzoli, and Thomas Nyffeler. 2019. Visual Exploration Area in Neglect: A New Analysis Method for Video-Oculography Data Based on Foveal Vision. Frontiers in Neuroscience 13, 1412. DOI: https://doi.org/10.3389/fnins.2019.01412.
[12] Roger A. Light. 2017. Mosquitto: server and client implementation of the MQTT protocol. Journal of Open Source Software 2, 13, 265. DOI: https://doi.org/10.21105/joss.00265.
[13] 2022. What is NLG: RosaeNLG // Docs (March 2022). Retrieved April 11, 2022 from https://rosaenlg.org/rosaenlg/3.2.1/about/nlg.html.
[14] Peter Forbrig, Alexandru Bundea, Ann Pedersen, and Thomas Platz. 2022. Using a Humanoid Robot to Assist Post-stroke Patients with Standardized Neurorehabilitation Therapy. In Intelligent Sustainable Systems, Atulya K. Nagar, Dharm S. Jat, Gabriela Marín-Raventós, and Durgesh K. Mishra, Eds. Lecture Notes in Networks and Systems. Springer Singapore, Singapore, 19–28. DOI: https://doi.org/10.1007/978-981-16-6369-7_3.
[15] Kathleen Vancleef, Michael J. Colwell, Olivia Hewitt, and Nele Demeyere. 2019. Current practice and challenges in screening for visual perception deficits after stroke: a qualitative study, 14.
[16] Georg Kerkhoff, Gilles Rode, and Stephanie Clarke. 2021. Treating Neurovisual Deficits and Spatial Neglect. In Clinical Pathways in Stroke Rehabilitation, Thomas Platz, Ed. Springer International Publishing, Cham, 191–217. DOI: https://doi.org/10.1007/978-3-030-58505-1_11.