How do Physiotherapists and Patients talk? Developing the RiMotivAzione dialogue corpus.

Andrea Bolioli[1], Francesca Alloatti[1,2], Mariafrancesca Guadalupi[1], Roberta Iolanda Lanzi[1], Giorgia Pregnolato[3], Andrea Turolla[3]

[1] CELI - Language Technology, Italy
[2] Department of Computer Science - Università degli Studi di Torino, Italy
[3] IRCCS Fondazione Ospedale San Camillo, Italy

Abstract

The research project RiMotivAzione aims at helping post-stroke patients who are following an arm and hand rehabilitation path. In this paper we present the RiMotivAzione corpus, the first collection of dialogues between physiotherapists and patients recorded in an Italian hospital and annotated following the RIAS annotation protocol. We describe the dataset, the methodologies applied and our first investigations on relevant features of the dialogue process. The corpus was the basis for the design of a conversational interface integrated with a wearable device for rehabilitation, to be used by the patient during the exercises that he or she may perform independently.

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 Introduction

In recent years, computational linguistics and medical research have started to collaborate in order to analyze communication in the healthcare domain, in particular between clinicians and patients. From a medical perspective, linguistic analysis and dialogue modeling can be used to better understand and potentially enhance communication in different healthcare settings (Sen et al., 2017; Chang et al., 2013; Marzuki et al., 2017), as well as to identify "preclinical" or "pre-symptomatic" diseases for specific ranges of patients, e.g. discovering early linguistic signs of cognitive decline (Beltrami et al., 2018). Natural Language Processing (NLP) technologies are also used to develop new communicative tools, e.g. virtual assistants, to alleviate the burden on medical personnel or to shift to a home-based, patient-centered model of care. Through mHealth (mobile health), for example, people can receive assistance at home, and monitoring devices can check the well-being of a person (Sezgin et al., 2018). A recent review of the scientific literature about Artificial Intelligence and IoT in healthcare can be found in (Shah and Chircu, 2018).

The research project RiMotivAzione aims at helping patients who suffered a stroke and are following an arm and hand rehabilitation path. The goal is to motivate the patients to follow the assigned exercises through the use of a new wearable device with motion sensors developed by the Istituto Italiano di Tecnologia (IIT), integrated with a visual App and a conversational interface. This last component guides the user through the therapeutic path by proposing the exercises, giving advice and asking for feedback.

The implementation of voice technologies in the healthcare domain allows patients with motor impairments to interact with devices through spoken language (Moore et al., 2018) while arm and hand are busy performing the assigned exercises. The interaction is seamless and spontaneous. The patient can keep up with the therapy autonomously thanks to the guidance provided by the voice assistant. The physiotherapist can monitor the patients at a distance to evaluate their progress and can prevent therapy neglect, while the patient is motivated to stick to the path and can reach the rehabilitation goals on time. Needless to say, these digital assistants are not meant to substitute the clinician.

2 Methodological Background and Related Work

As described in the previous section, the study of communication and conversation in the medical domain has been growing in recent years, as has the introduction of conversational agents in the healthcare sector. A review of current applications and evaluation measures of conversational agents used for health-related purposes can be found, for example, in (Laranjo et al., 2018). However, there is no systematic review of the scientific literature concerning the linguistic analysis of dialogues in healthcare. Some studies describe how communication can influence clinical outcomes in the rehabilitation setting, e.g. how patient satisfaction, decision-making, and stress level correlate with physicians' communicative acts (Hall and Roter, 2012). Some researchers propose methods to detect and track topics in psycho-therapeutic conversations (Chaoua et al., 2018). Other researchers conducted an analysis of actual communicative behaviors, including nonverbal ones, between physicians and patients in rehabilitation, using transcription and coding of utterances (Chang et al., 2013).

The analysis of speech acts and conversational interaction can play a relevant role in dialogue modeling for healthcare thanks to the classification of utterances, the analysis of dialogue turns and threads, and the discovery of recurrent patterns. Speech acts have long been investigated in linguistics and computational linguistics. Specifically, the task of automatic speech act recognition has been addressed with both supervised and unsupervised approaches (Basile and Novielli, 2018). In the healthcare domain, however, there is still much room for investigation.

In the RiMotivAzione project, we deal with physiotherapy sessions in a hospital. The task is to collect and analyze para-linguistic and linguistic data, according to the aforementioned goal of the research project. In this specific setting, i.e. conversational analysis of physician-patient discourse, the most widely used method is the Roter Interaction Analysis System (RIAS). RIAS has been developed as a tagset for coding medical dialogue since 1991 by Debra Roter et al. (Roter, 1991; Roter and Larson, 2002) and it was constructed to be viable for all kinds of sessions, e.g. conversations in the oncological setting (Roter et al., 2017), between patients and psychotherapists or even between patients and pharmacists. Moreover, RIAS was originally developed to annotate audio, while we transcribed the speech and annotated the transcriptions. This is motivated by the NLP analysis we wanted to perform on the text, e.g. syntactic and semantic analysis, machine learning, and automatic dialogue act classification. Other dialogue annotation schemes exist (Bunt et al., 2017; Serban et al., 2017; Stolcke et al., 2000) that include rich taxonomies of communicative functions. The ISO 24617-2 standard, for example, includes the specification of the Dialogue Act Markup Language (DiAML), used in many annotated corpora. In the RiMotivAzione project, we deemed RIAS the most useful one for its specific focus on medical conversation. Even though RIAS is the tagset closest to the domain of our corpus, some problems still emerged; they are presented in the next section.

3 Corpus Annotation

The RiMotivAzione corpus includes two complete cycles of physiotherapy sessions with two patients in post-stroke rehabilitation (namely, P1 and P2) and three physiotherapists (T1, T2, T3). The interviews were video recorded in IRCCS Fondazione Ospedale "San Camillo" in Venice. Each session lasted about 1 hour. The physiotherapy cycle for patient P1 included 14 sessions, while P2 took 16 sessions. Therefore the total duration of the recordings is about 30 hours.

The patients were carefully selected by the doctors, since they had to satisfy some requirements. Above all, they had to agree to be part of the experimentation and they needed to talk in Italian. In an environment where dialect is still strong, their ability to speak Italian was not to be treated lightly. Moreover, the patients had to be free of any issues related to aphasia. These requirements narrowed the viable options to two candidates.

Both speakers were encouraged to talk freely about any topic that may have emerged. Their only constraint was the use of Italian; when people slipped into dialectal terminology (in this case, Venetian), it was explicitly marked with a dedicated tag in the corpus. The audio tracks were transcribed and annotated following Savy's (2005) guidelines for the orthographic transcription of spoken Italian, where applicable. As a pre-processing step, we used two Automatic Speech Recognition (ASR) systems, i.e. Google Speech-to-Text and Nuance Transcription Engine. Automatic transcriptions were corrected manually and anonymized. Video and audio tracks have been saved separately for future projects.

Overlaps between the two speakers and pauses were not marked, as they were not relevant to our study. Similarly, any intervention in the dialogue from a third party was not transcribed, since our interest was solely in the doctor's and the patient's linguistic behaviours. Each dialogue turn of the corpus was annotated by two different annotators following the RIAS guidelines. All the annotators have a background in linguistics and a specific education about linguistic corpora. As a single dialogue turn may contain more than one sentence and more than one speech act, more than one tag may be assigned to each turn.

The RIAS tagset includes 29 categories divided into four macro-categories called Medical Interview Functions (MIF) that cover the majority of the exchanges between a doctor and a patient: Data Gathering, Information Exchange, Emotional Expression and Responsiveness, Partnership Building and Activation. Table 1 contains the list of categories occurring at least 200 times in the corpus, together with examples.

Specific RIAS code | Example | English gloss
Social talk | non vedevo l'ora di venirla a trovare. | I couldn't wait to come and see you.
Directions | per scendere chiudo, per salire apro la mano. | to go down I close, to go up I open my hand.
Agreements | esatto, perché lo abbiamo registrato proprio così. | exactly, because we recorded it just like that.
Medical condition | un po', poco, fastidio più che male. | a little, not much, discomfort more than pain.
Approvals | bravissimo. | very good.
Unclear | [dialect] vara! | [dialect] look!
Therapeutic regimen | venerdì faremo la parte clinica ti farò io la scala di valutazione. | on Friday we'll do the clinical part, I'll do the assessment scale with you.
Jokes and laughter | ci vediamo domani, è più una minaccia che un invito. | see you tomorrow, it's more a threat than an invitation.
Asking for understanding | vorrei portarla così, hai capito? | I'd like to bring it like this, do you understand?
Checking for understanding | chiudo le dita. così? | I close my fingers. like this?
Concerns | sei sicura che funziona? | are you sure it works?
CeQ Medical condition | a fare gli esercizi non ha dolore? | you don't feel pain when doing the exercises?

Table 1: Tags and examples of categories occurring at least 200 times in the corpus.

To the best of the authors' knowledge, the RIAS system has never been used to annotate sessions of physiotherapy until now. This means that not all of the tags applied completely to the situation, or that some tags may be under-represented compared to other studies: for instance, the tag Concerns was applied to few sentences, since patients in physiotherapy sessions may inherently express less concern than oncological patients. All the categories defined in Roter et al. (2017) were used. Moreover, two more tags were added to cover all the exchanges: Unclear and Technical problems. The first applied to incomplete sentences, unintelligible ones (also marked with a dedicated tag), and cases where the sentence referred to the physical context, making the general meaning impossible for the annotator to retrieve. The second applied to situations where the wearable device was not working properly, resulting in a technical issue outside the scope of the therapy.

Another issue concerns the use of irony. Specifically, Patient 2 heavily employed irony while talking to the therapist, even when the dialogue concerned his health and well-being. Irony is hard to interpret, which made it difficult to assign the correct tag to those sentences. The tag Jokes was used in this case and, where it was inappropriate, a discussion between the annotators guided the choice.

As the annotation task was difficult and inherently affected by subjectivity, we measured the resulting inter-annotator agreement and put in place strategies to resolve the disagreement, in order to annotate all the dialogue turns. The agreement calculated at this stage, according to Cohen's kappa, was promising (k = 0.63). In case of disagreement (about 25% of the data), the process was followed by reconciliation or, where the two annotators could not overcome the disagreement, by a final decision of a super annotator.

The RiMotivAzione corpus has been built and archived according to GDPR norms. It is not publicly available but it can be requested from the authors for research purposes.

4 Corpus Analysis

The RiMotivAzione corpus contains about 98778 tokens. The total number of dialogue turns is 7670: 3377 dialogue turns in P1 sessions, 4293 in P2 sessions.

In Table 2 and Table 3 we report the number of types, the number of tokens, the ratio between types and tokens (the Lexical Richness Index) and the number of questions for the two patients.

Parameters | Patient 1 | Therapist
Types | 2065 | 3017
Tokens | 10533 | 39305
Lexical Richness Index | 0.19 | 0.07
Questions | 40 | 667

Table 2: Patient 1 corpus.

Parameters | Patient 2 | Therapist
Types | 2451 | 2406
Tokens | 18233 | 30707
Lexical Richness Index | 0.13 | 0.07
Questions | 380 | 805

Table 3: Patient 2 corpus.

It is worth noticing that the Lexical Richness Index ranges from 0 to 1 and is closer to 0 in the doctors' speech, meaning that medical personnel employ a poorer vocabulary while talking to a patient. This is due to the fact that a therapist needs to stick to a protocol and cannot digress beyond a certain limit. On the other hand, the patient talks quantitatively less: he utters fewer words, and most of the time those words are simple answers to the questions posed by the clinician. The patient talks less but he can wander more across conversation topics: he may disclose some personal detail about his life or just chit-chat. This behavior is actually encouraged by the therapist, since it makes the therapy session less dull and more spontaneous for both participants (Delany et al., 2010; Edwards et al., 2004). To sum up, the doctor needs to talk a lot to instruct the patients about the exercises they need to perform, as well as to ask questions (mainly regarding general well-being and inquiries about the therapy itself). Meanwhile, the patient may talk less because most of the time he just has to answer short questions (such as "Does it hurt?"); or, when he talks more, it is about some external topic, which increases the vocabulary richness index.

As the main goal of the study is to replicate the clinician's communicative style in a conversational interface, the major interest is in how the therapists talk, rather than the patients. The patients' manner of speaking is taken into consideration when imagining all the orders or phrases that the user could say to the voice assistant to express his needs. Table 4 and Table 5 list the most frequent verbs and adjectives uttered by the physiotherapists. Apart from "Okay", which is the most frequent word for both therapists (1231 and 1019 occurrences), both therapists often use adjectives of positive value: bravissimo, bravo, ottimo, buono. Other frequent words are mainly verbs in the first person plural, such as we do, we'll try, or equivalent expressions (let's relax). The use of "we" is a communication element that aims at putting the clinician and the patient on the same level; the goal is to make the patient feel more comfortable and therefore to enhance the probability of therapy adherence. At the same time, adjectives such as "good" and "very good" praise the patient's efforts, underlining the progress he is making. The psychological component is of paramount importance during physiotherapy, especially for patients that suffered a stroke (Palma and Sidoti, 2019).

Word | Frequency | English gloss
vai | 1166 | go
apri | 432 | open
rilassa | 400 | relax
bravissimo | 353 | very good
mantieni | 314 | hold
bravo | 288 | good
lascia | 199 | let go
fare | 187 | to do
prova | 156 | try
ottimo | 153 | excellent

Table 4: Most frequent Verbs and Adjectives used by therapist 1.

Word | Frequency | English gloss
vai | 340 | go
proviamo | 199 | let's try
apro | 198 | I open
pronto | 174 | ready
facciamo | 134 | let's do
attento | 124 | careful
andare | 123 | to go
scendere | 120 | to go down
vediamo | 115 | let's see
fare | 111 | to do

Table 5: Most frequent Verbs and Adjectives used by therapist 2.

The quantitative analysis carried out on the annotated corpus confirms the qualitative remarks made so far. In Figure 1 we present the distribution of dialogue tags, both for patients and therapists, i.e. the distribution of utterance types according to RIAS categories. We plotted the frequencies of the tags on a logarithmic scale.

Figure 1: Distribution of dialogue tags in RiMotivAzione corpus

Sentences annotated as Social talk were abundant, while those marked as Concerns were copious for one patient only, because he was frustrated about his health situation and the difficulties of managing the physiotherapy. During the sessions with Patient 1, the physiotherapist was able to engage in a conversation about a hobby of his (motorcycles); even though this discussion topic is not relevant to the therapy, the fact that they were talking about something interesting for the patient contributed to the improvement of his medical condition (Gard and Gyllenstein, 2000).

All of these conversational elements are put in place deliberately by the clinician and, even more, they make up the style patients are used to. In the voice assistant design we try to mirror these strategies, providing praise when appropriate and asking questions to constantly monitor the user's well-being. The data extracted from the transcription and the annotation represent the most frequent linguistic behaviors that emerged during the conversations. These patterns were used to build the conversational style and infrastructure of the dialogue system.

5 Conclusions and Next Steps

We created a corpus of conversations between patients and clinicians, in Italian, and we annotated the dialogue turns according to the Roter Interaction Analysis System (RIAS). This corpus was the first step in the design of a conversational interface integrated with a smart wearable device, to guide and assist the patients through the exercises assigned by the physiotherapist.

The first step in the future work will be to deepen the linguistic analysis conducted on the corpus, especially regarding the tagged dialogue acts. A stronger qualitative investigation of the data will be carried out. The second step will be to enrich the dataset: unfortunately, only two patients were deemed appropriate for the experimentation, while a corpus should contain dialogues from more speakers.

The RiMotivAzione corpus can be requested from the authors for research purposes.

The system prototype will be tested in San Camillo Hospital by a set of stroke patients, following the clinical trial procedures. Thanks to the results of the test, we will produce experimental data to investigate if and how a voice assistant integrated with a wearable device can increase the effectiveness of the therapy.

6 Acknowledgments

RiMotivAzione is a two-year Research and Innovation project supported by POR FESR 2014-2020 Regione Piemonte. The partners are Koiné Sistemi, CELI, IRCCS Fondazione Ospedale San Camillo, Synesthesia, Istituto Italiano di Tecnologia (IIT) and Morecognition. We are thankful to our colleagues and project partners, in particular Paolo Ariano and Nicolò Celadon.

References

P. Basile and N. Novielli. 2018. Overview of the EVALITA 2018 Italian Speech Act Labeling (iLISTEN) task. In Proceedings of the Sixth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2018) co-located with the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018).

D. Beltrami, G. Gagliardi, R. Rossini Favretti, E. Ghidoni, F. Tamburini, and L. Calzà. 2018. Speech analysis by natural language processing techniques: a possible tool for very early detection of cognitive decline? Frontiers in Aging Neuroscience, 10:369.

Harry Bunt, Volha Petukhova, David Traum, and Jan Alexandersson. 2017. Dialogue act annotation with the ISO 24617-2 standard. Multimodal Interaction with W3C Standards.

C.L. Chang, B.K. Park, and S.S. Kim. 2013. Conversational analysis of medical discourse in rehabilitation: A study in Korea. The Journal of Spinal Cord Medicine, 36(1):24–30.

I. Chaoua, D. R. Recupero, S. Consoli, A. Harma, and R. Helaoui. 2018. Detecting and tracking ongoing topics in psychotherapeutic conversations. AIH@IJCAI, pages 97–108.

C.M. Delany, I. Edwards, G.M. Jensen, and E. Skinner. 2010. Closing the gap between ethics knowledge and practice through active engagement: an applied model of physical therapy ethics. Physical Therapy, 90(7):1068–1078.

I. Edwards, M. Jones, J. Carr, A. Braunack-Mayer, and G.M. Jensen. 2004. Clinical reasoning strategies in physical therapy. Physical Therapy, 84(4):312–330.

G. Gard and A. L. Gyllenstein. 2000. The importance of emotions in physiotherapeutic practice. Physical Therapy Reviews, 5(3):155–160.

J. A. Hall and D. L. Roter. 2012. Physician-patient communication. In H. A. Friedman, editor, The Oxford Handbook of Health Psychology. Oxford University Press.

L. Laranjo, A.G. Dunn, H.L. Tong, A.B. Kocaballi, et al. 2018. Conversational agents in healthcare: a systematic review. Journal of the American Medical Informatics Association, 25(9):1248–1258.

E. Marzuki, C. Cummins, H. Rohde, H. Branigan, and G. Clegg. 2017. Resuscitation procedures as multi-party dialogue. In Proc. SEMDIAL 2017 (SaarDial) Workshop on the Semantics and Pragmatics of Dialogue, pages 60–69.

R.J. Moore, M.H. Szymanski, R. Arar, and G. J. Ren. 2018. Studies in Conversational UX Design. Springer.

S. Palma and E. Sidoti. 2019. La comunicazione nei processi di cura. COMUNIT IMPERFET, 46(4):243–251.

D. Roter. 1991. The Roter method of interaction process analysis (RIAS manual). The Johns Hopkins University, Baltimore.

D. Roter, S. Isenberg, and L. Czaplicki. 2017. The Roter Interaction Analysis System: Applicability within the context of cancer and palliative care. Oxford Textbook of Communication in Oncology and Palliative Care, pages 717–726.

D. Roter and S. Larson. 2002. The Roter Interaction Analysis System (RIAS): utility and flexibility for analysis of medical interactions. Patient Education and Counseling, pages 128–132.

R. Savy. 2005. Specifiche per la trascrizione ortografica annotata dei testi in Italiano Parlato. Analisi di un dialogo. Liguori, Napoli.

T. Sen, M.R. Ali, M.E. Hoque, R. Epstein, and P. Duberstein. 2017. Modeling doctor-patient communication with affective text analysis. In 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII), pages 170–177.

Iulian Vlad Serban, Ryan Lowe, Peter Henderson, Laurent Charlin, and Joelle Pineau. 2017. A survey of available corpora for building data-driven dialogue systems. arXiv:1512.05742.

E. Sezgin, S. Yildirim, S. Ozkan-Yildirim, and E. Sumuer. 2018. Current and Emerging MHealth Technologies: Adoption, Implementation, and Use. Springer.

R. Shah and A. Chircu. 2018. IoT and AI in healthcare: A systematic literature review. Issues in Information Systems, 19(3):33–41.

Andreas Stolcke, Klaus Ries, Noah Coccaro, and Elizabeth Shriberg. 2000. Dialogue act modeling for automatic tagging and recognition of conversational speech. Computational Linguistics, 26(3).
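
As a worked illustration of the two quantitative measures used in this paper, the Lexical Richness Index of Tables 2 and 3 (types divided by tokens) and the Cohen's kappa reported for inter-annotator agreement, the following is a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the label lists in the kappa example are invented toy data, and the kappa function assumes exactly one tag per item, whereas the corpus allows several tags per dialogue turn, so the way multi-tag turns were handled in the reported k = 0.63 remains unspecified here.

```python
from collections import Counter


def lexical_richness(tokens):
    """Type/token ratio: number of distinct word forms over total tokens."""
    if not tokens:
        raise ValueError("need at least one token")
    return len(set(tokens)) / len(tokens)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators assigning one label per item.

    Assumption: single-label items. Multi-tag turns would need a
    different scheme (e.g. per-tag binary decisions).
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's own label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[tag] * freq_b[tag] for tag in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Ratio recomputed from the type and token counts in Table 2.
print("Patient 1:", round(2065 / 10533, 3))
print("Therapist:", round(3017 / 39305, 3))

# Toy labels, invented only to exercise the function.
ann_1 = ["Directions", "Approvals", "Social talk", "Directions", "Concerns"]
ann_2 = ["Directions", "Approvals", "Directions", "Directions", "Concerns"]
print("kappa:", round(cohen_kappa(ann_1, ann_2), 3))
```

Type/token ratios are known to decrease as a text gets longer, so a comparison between speakers who produce very different amounts of speech (as the patients and therapists do in Tables 2 and 3) is best read with that length effect in mind.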