Video-Based Automated Emotional Monitoring in Mental Health Care Supported by a Generic Patient Data Management System¹

Hayette Hadjar¹, Julian Lange¹, Binh Vu¹, Felix Engel², Gwendolyn Mayer³, Paul Mc Kevitt⁴, and Matthias Hemmje²

¹ University of Hagen, Faculty of Mathematics and Computer Science, Hagen, Germany
{hayette.hadjar, julian.lange, binh.vu}@fernuni-hagen.de
² Research Institute for Telecommunication and Cooperation, Dortmund, Germany
{fengel, mhemmje}@ftk.de
³ Heidelberg University, Department of Internal Medicine II, General Internal Medicine and Psychosomatics, Heidelberg, Germany
gwendolyn.mayer@med.uni-heidelberg.de
⁴ Ulster University, Derry/Londonderry, Northern Ireland
p.mckevitt@ulster.ac.uk

Abstract. The detection of emotions and expressions from video streams plays a very important role in the mental health care of a patient. The data obtained can be used to support the diagnosis of emotional needs related to depression or other kinds of mental illness. Such data can provide useful emotion-monitoring information for health monitoring systems by automatically computing this Affective Computing (AC) information and storing it in patient data management systems. This research has been developed in the context of the SenseCare project, in order to support the treatment of patients with primary or comorbid mental disorders. There are two processes for tracking emotion in video: real-time and offline facial expression video analysis. Real-time video analysis uses streamed webcam videos as data input, whereas offline video analysis uses pre-recorded video files as input. In this paper, we focus on the real-time video analysis process and employ deep learning in web browsers for face detection and recognition using JavaScript.

Keywords: Video Content Analysis, Affective Computing (AC), Emotion Recognition, Facial Expression Analysis, Emotion Representation, Emotional Monitoring.

¹ Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)

1 Introduction and Motivation

Understanding and utilizing psychological knowledge in order to, e.g., automatically detect psychological events is one of the key research challenges in Affective Computing (AC), especially in Research and Development (R&D) of software related to automatic emotion detection. Furthermore, ambient assisted living and tele-monitoring health care technologies can facilitate the remote collection of vital signal data (e.g., ECG, heart, and breath sounds) as well as the collection of software-based automatic assessment and monitoring signals of mental or emotional status [1].

The SenseCare (Sensor Enabled Affective Computing for Enhancing Medical Care) platform [2] has been developed as a prototypical AC R&D platform providing software services applied to the care of patients with different support needs in the field of mental health care. This technology provides various opportunities for physicians, psychotherapists, clinicians, and other healthcare professionals. Such target user groups can, e.g., be enabled to intervene early in the case of a critical mental state that could result in a crisis and thus a worsening of the patient's state of health. Hence, primary care professionals can achieve an improved overview of the emotional well-being of patients through the SenseCare [3] AC R&D platform's software services.
SenseCare integrates data streams from multiple sensors and fuses the data from these sensor signal streams to provide a global assessment that includes objective levels of understanding of emotional expressions as well as the corresponding well-being and cognitive state of the patients. Several potential use cases for a system like SenseCare underline its topicality; the recent crisis due to COVID-19 is only one of them: patients with mental disorders on isolation wards have to stay outside the support system, as, e.g., psychiatrists, psychologists, and other clinical staff fear infection [4]. Remote, emotion-sensitive support could have served these patients better, and first solutions in tele-medical intervention and corresponding pathways have been developed in the meantime [5]. Additionally, patients in an online group therapy setting (e.g., due to living in rural areas with a low density of psychotherapists) can be supported by emotion-sensitive videoconferencing tools. Recent changes in the accounting system of e-health applications by health insurances will promote this development [6]. Furthermore, patients with complex psychosomatic diseases often suffer from a comorbid depression or anxiety, which leads to a vicious circle of deleterious effects. For example, every fifth patient with heart failure suffers from depression, which may lead to a lack of treatment adherence [7]. Continuous monitoring of these patients by software services of platforms like SenseCare may reduce high health-related costs. Finally, elderly patients in ambient assisted living are in need of continuous monitoring of their emotional state, as sudden changes in mood can be a risk marker for dementia [8]. The processing of voluminous data streams from video recordings builds on the recently introduced Information Visualization for Big Data (IVIS4BigData) model [9], which elaborates the data stream types addressed by our visualization approach.

The so-called Knowledge Management Ecosystem Portal (KM-EP) is the backbone system of the SenseCare platform [10] and comprises five subsystems, each of which has several components of its own. The Information Retrieval Subsystem (IRS) of the SenseCare KM-EP indexes AC content and enables users to search for AC content using keywords, faceted search, and taxonomies. The Learning Management Subsystem (LMS) of the SenseCare KM-EP provides tools for authors and trainers to create AC-related e-learning courses using content in the SenseCare KM-EP. SenseCare KM-EP users can later register in these SenseCare courses to obtain new AC knowledge. The Content and Knowledge Management Subsystem (CKMS) of the SenseCare KM-EP acts as a central repository for AC publications, AC multimedia, AC software, AC R&D dialogs, and AC-related medical records in the SenseCare KM-EP. Producers of AC content can use components in this KM-EP subsystem to import, create, manage, and classify their AC contents. Furthermore, SenseCare KM-EP users can access these AC contents and rate their quality. The User Management Subsystem (UMS) of the SenseCare KM-EP manages all users and groups of the SenseCare KM-EP. Other systems can authenticate a SenseCare KM-EP user's identity using OpenID Connect [11], which is integrated into this SenseCare KM-EP subsystem. The Storage Management Subsystem (SMS) of the SenseCare KM-EP provides storage for files and documents; they can be stored either on a local server or in the cloud for better access speed and availability.
Solutions already exist for the administration of medical data and processes. Within a specialist internship at the FernUniversität in Hagen, the exemplary projects IndivoHealth [12] and Tolven [13] were considered as solutions for electronic patient records and validated with regard to the requirements of SenseCare.

On the practical side, our objective is to develop and implement further new software modules as R&D prototypes and corresponding AC software services that can be integrated with the SenseCare KM-EP. Such R&D results can then be re-used to derive directions for future R&D work in this domain. The main contributions of this paper are:

- Implementation of a prototype module that collects patients' facial expressions and corresponding emotion data in real time, during treatment sessions or at home, for patients with or at risk of a mental disorder. The software categorizes emotional states according to the seven basic emotions described by Paul Ekman (anger, contempt, disgust, enjoyment, fear, sadness, and surprise) [14].
- The prototype employs deep learning in browsers using JavaScript and stores the results (date, time, detected emotion) in a MongoDB database.

The remainder of this paper is organized as follows. Section 2 discusses the state of the art of using sensors in healthcare, existing tools, and Convolutional Neural Networks (CNNs). Section 3 details the conceptual architecture, API modeling, and information model of the solution, Section 4 describes the prototype implementation, Section 5 discusses our findings, and finally we conclude in Section 6.

2 Selected State of the Art and Related Work

Research into wireless sensor networks and smart environments for remote monitoring in healthcare applications [15] employs wearable micro-machined sensors to provide accurate biomechanical analysis under ambulatory conditions. In the continuous monitoring of human activities, wearable sensors can, e.g., detect abnormal and/or unforeseen situations by monitoring physiological parameters along with other symptoms [16]. There are many software tools that employ machine learning methods to assist people in the area of health.

EQ-Radio [17]: Researchers from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) have developed EQ-Radio, a device that can detect a person's emotions using wireless signals. It transmits an RF signal and analyzes its reflections off a person's body to recognize their emotional state (e.g., happy, sad). The key enabler underlying EQ-Radio is a new algorithm for extracting individual heartbeats from the wireless signal at an accuracy comparable to on-body ECG monitors. EQ-Radio has three components: a radio for capturing RF reflections, a heartbeat extraction algorithm, and a classification subsystem that maps the learned physiological signals to emotional states [17].

Valossa AI [18] can recognize sentiments and emotions from facial expressions and speech, either from recorded video content or from a live feed. Mika Rautiainen, founder and CEO of Valossa, notes that going through a video of a therapy session takes a human being a whole day, whereas the AI reports in a real-time analysis what happens on the patient's face.

FaceReader [19] from Noldus Information Technology is a facial expression analysis software. It can automatically analyze the expressions happy, sad, angry, surprised, scared, disgusted, and neutral.
It can also calculate Action Units, valence, arousal, gaze direction, head orientation, and personal characteristics such as gender and age.

The SHORE® software [20] of Fraunhofer IIS enables the quick detection of faces and objects as well as the analysis of faces in image sequences, videos, and single frames. It can estimate gender, age, and facial expressions in real time, and the software runs on standard hardware.

Convolutional Neural Networks (CNNs) are a type of deep neural network designed to process multiple data types, although they were initially designed to analyze images [21]. CNNs are the most popular neural network model employed in image classification [22]. CNNs comprise several layers, such as the Convolutional Layer, Non-Linearity Layer, Rectification Layer with Rectified Linear Units (ReLU), Pooling Layer, Fully Connected Layer, and Dropout Layer.

Existing solutions stream frames from a video stream over a network with OpenCV [23] for two reasons: (i) a security application may require all frames to be sent to a central hub for additional processing and logging, and (ii) the client machine may be highly resource-constrained (such as a Raspberry Pi) and lack the computational horsepower required to run computationally expensive algorithms (such as CNNs).

3 Conceptual Architecture of Health Information Subsystems

The conceptual architecture of the Health Information System (HIS) within the SenseCare KM-EP can be characterized by several subsystems which organize and process information by specifying the type of data processed in each subsystem independently of the others.

Within the SenseCare KM-EP's HIS, the Carna subsystem for data management and information systems can run workflows for processing different types of AC data (offline data and real-time data). Hence, it utilizes a workflow engine and enables the implementation of customized workflow action steps in Java code. The system consists of different modules, the most important of which are Carna.dms (Data Management System), Carna.process[emotion detection] (support processes, using the example of Emotion Detection), and Carna.tenantmodules (general tenant-based modules). Each Carna module within the SenseCare KM-EP's HIS implements a REST API to access its functionality. Fig. 1 shows the most important of the implemented REST interfaces; a sketch of how a client might interact with such interfaces follows the use case list below.

Fig. 1. SenseCare KM-EP HIS's conceptual architecture of the Carna modules supporting the integration of healthcare support processes [24].

In the carna.dms module, among other data, process-related data and registered processes are saved. When a workflow process is started for a patient, a new process instance is initialized by the process module and an associated data record is created. When a healthcare task (that is implemented by a process) is finished, a documentation record is appended to the process-instance record table.

To support the conceptual architecture and API modeling for the KM-EP HIS's Emotion Detection system, the activities of the SenseCare Emotional Monitoring Use Case Scenario are:

1. The offline video analysis pipeline of the KM-EP HIS's Emotion Detection uses pre-recorded patient videos. These files are stored offline and analyzed with pre-trained CNN models and classifiers in order to detect emotions from facial expressions.
2. The real-time video analysis of the KM-EP HIS's Emotion Detection uses data input from webcams for the detection and recognition of facial expressions in real time; this process is the main focus of the remainder of this paper.
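To illustrate how a client could interact with such REST interfaces, a minimal JavaScript sketch is given below. The base URL, endpoint paths, and payload fields are hypothetical illustrations introduced for this example only; the actual interfaces are those shown in Fig. 1.

// Minimal sketch of a client of the Carna REST interfaces.
// All paths and payload fields below are hypothetical illustrations.
const BASE = 'https://carna.example.org'; // hypothetical base URL

// Start a new workflow process instance for a patient.
async function startProcessInstance(processId, patientId) {
  const res = await fetch(`${BASE}/processes/${processId}/instances`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ patientId }),
  });
  return res.json(); // e.g., { instanceId: '...' }
}

// Append a documentation record once a healthcare task is finished.
async function appendDocumentation(instanceId, documentation) {
  await fetch(`${BASE}/instances/${instanceId}/documentation`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(documentation),
  });
}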
4 Prototype Implementation of Emotion Detection

A prototype solution for the SenseCare KM-EP Emotion Detection has been developed following the Model-View-Controller (MVC) architecture paradigm. Hence, the software prototype's source code is divided into three layers. The model layer is responsible for data storage, integrity, consistency, querying, and access support. The global neural network models that are exported on this level from face-api.js are AgeGenderNet, FaceExpressionNet, FaceLandmark68Net, FaceLandmark68TinyNet, FaceRecognitionNet, SsdMobilenetv1, TinyFaceDetector, Mtcnn, and TinyYolov2. On the controller level, the operations to receive, interpret, and validate input as well as to create and update data are specified and implemented. On the view level, the operations to query and present the models are specified and implemented. In our case, the clinical user or the patient interacts with the interface by means of a webcam on the view layer.

The implementation of a corresponding REST API requires these elements:
1. Identify the objects that will be presented as resources; this is the very first step in designing a REST API-based application.
2. Create model URIs by designing the resource URIs, focusing on the relationships between resources and their sub-resources. These resource URIs are the endpoints of the RESTful services.
3. Determine representations: representations are mostly defined in either XML or JSON format. For example:
emotions: {angry: number, disgusted: number, fearful: number, happy: number, neutral: number, sad: number, surprised: number}
Here, number is the confidence of the model that the detected face shows a particular emotion; each detected face element has an expressions attribute. Example: => surprised: 0.990011861078746733256

In the initial prototype implementation, the following base technologies are employed:
- TensorFlow.js [25] is a library for machine learning in JavaScript; it allows developing ML models in JavaScript and using ML directly in the browser or in Node.js.
- Face-api.js [26] is a JavaScript module built on top of the TensorFlow.js core; it implements several CNNs for face detection and recognition and has been optimized to work on web and mobile devices.
- Node.js [27] provides synchronous and real-time communication in the web application; it is employed to support highly accurate face recognition and detection.
- MongoDB [28] is an open-source NoSQL database and a popular choice for handling big data.
- Mongoose and Node.js Express [29] handle transactions written in real time and the database connectivity to MongoDB in order to store the results of the real-time video analysis of facial expressions.

The overall distribution and operational deployment of the system within a client-server distribution architecture is shown in Fig. 2 below.

Fig. 2. Client-server architecture of the initial Emotion Detection prototype.

The system is divided into the frontend and the backend. The frontend on the client's machine combines face-api.js on top of TensorFlow.js with HTML/CSS/JavaScript in the browser. The backend server is developed using Node.js Express, Mongoose, and MongoDB. The implementation allows both offline videos and streamed video to be uploaded and processed. We can input an HTML element like an image or offline video using the id of the element, and input streamed video with the function startVideo(), which starts the webcam in the browser.
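A minimal sketch of this client-side pipeline, assuming the pre-trained face-api.js models are served under a /models path, is shown below; the element id inputVideo and the exact wiring of startVideo() are our own illustrative assumptions rather than the prototype's exact code.

// Start the webcam stream in the browser (illustrative startVideo()).
async function startVideo() {
  const video = document.getElementById('inputVideo'); // hypothetical element id
  video.srcObject = await navigator.mediaDevices.getUserMedia({ video: true });
  await video.play();
}

// Load the pre-trained models and detect faces with their expression scores.
async function detectExpressions() {
  await faceapi.nets.tinyFaceDetector.loadFromUri('/models');
  await faceapi.nets.faceExpressionNet.loadFromUri('/models');
  await startVideo();
  const video = document.getElementById('inputVideo');
  const detections = await faceapi
    .detectAllFaces(video, new faceapi.TinyFaceDetectorOptions())
    .withFaceExpressions();
  // Each detection carries an expressions attribute, e.g.
  // { angry: 0.001, ..., neutral: 0.95, ..., surprised: 0.002 }
  return detections;
}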
5 Discussion of the Findings

The conducted experiment showed that the developed module functionally meets the basic requirements and that it is important to implement additional functionality in order to increase the benefits for research studies. In the case of real-time video emotion recognition, the SenseCare KM-EP HIS's Emotion Detection API stores the best emotion detected from the webcam every 500 milliseconds; this choice of timing can be changed in the API. The stored data has the following format:

AllExpressiondetected: {date + time, label of best expression}

A part of the data stored in MongoDB can be seen in Table 1 below.

Table 1. Data stored in MongoDB

# db.emotionsave.find()
{ "_id" : ObjectId("5f399f182232afa8b58f96ab"), "dateTime" : "2020-8-16 22:2:0", "expression" : "neutral", "__v" : 0 }
{ "_id" : ObjectId("5f399f1e2232afa8b58f96ac"), "dateTime" : "2020-8-16 22:2:6", "expression" : "neutral", "__v" : 0 }
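A minimal backend sketch of how such records could be persisted with Express and Mongoose is given below. The schema fields and the collection name emotionsave follow Table 1, while the route path, database name, and port are illustrative assumptions rather than the prototype's exact code.

// Minimal backend sketch for persisting detected expressions.
const express = require('express');
const mongoose = require('mongoose');

// Schema matching the documents in Table 1; the third argument pins the
// collection name to "emotionsave" as in the query above.
const emotionSchema = new mongoose.Schema({ dateTime: String, expression: String });
const EmotionSave = mongoose.model('EmotionSave', emotionSchema, 'emotionsave');

const app = express();
app.use(express.json());

// Hypothetical route: the client posts the best detected expression every 500 ms.
app.post('/api/emotions', async (req, res) => {
  const record = await EmotionSave.create({
    dateTime: req.body.dateTime,
    expression: req.body.expression,
  });
  res.status(201).json(record);
});

mongoose
  .connect('mongodb://localhost/sensecare') // hypothetical connection string
  .then(() => app.listen(3000));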
A demonstration of facial expression recognition on images from "FACES: A database of facial expressions in younger, middle-aged, and older women and men" [30] is shown in Fig. 3.

Fig. 3. Demonstration of an Emotion Detection output (facial expression recognition on two sample images).

The prototype is under development. In our first observations during tests on machines with different operating systems (e.g., Windows, Ubuntu, macOS), the results of real-time video emotion analysis and the response time varied according to the capacity and hardware performance of the web server. Hence, real-time detection of emotions requires powerful hardware; e.g., the memory of the server must be greater than 6 GB. In addition, high-quality images in the input stream are required to identify a face (descriptor). We also observed that the SSD MobileNet V1 neural network gives better accuracy than the Tiny Face Detector and MTCNN, and that the accuracy of detecting emotions based on facial expressions decreases when the light quality at the experiment site decreases. Finally, the challenge is to recognize facial expressions in video with increased accuracy and fast inference times.

6 Conclusion and Future Work

In this paper, we describe the implementation of a video-based automated emotional monitoring prototype consisting of two new subsystems of the SenseCare KM-EP. The first subsystem is a prototype implementation of the Carna patient data management and information system for managing healthcare service processes. The second subsystem is the Emotion Detection subsystem, implemented prototypically to detect emotions by analyzing facial expressions in videos. We discuss the use of CNNs in an initial prototype implementation to support face detection and expression recognition, from which corresponding emotions are derived as AC results. To establish a REST API, we employ the face-api.js package, Node.js, and the TensorFlow.js core, and we use MongoDB to store patients' detected expressions with date and time in real time. We have also presented an initial conceptual architecture as well as an initial information model of our system, specified the technical software architecture of the API, and discussed our first findings during its implementation.

Future work includes:
- Integration of the video-based automated emotional monitoring module into carna.dms/KM-EP, and evaluation of the solution in a real HIS (Hospital Information System, e.g., GNU Health).
- Visualization of all stored expressions, i.e., a graphical representation of emotions over time, in order to support optimal decisions in healthcare.
- Implementation of additional support processes in carna.dms, and integration of real sources such as video/audio data.

References

1. Crist, T.M., Kaufman, S.B., Crampton, K.R.: Home telemedicine: a home health care agency strategy for maximizing resources. Home Health Care Management Practice 8(4), 1-9 (1996).
2. Sensor Enabled Affective Computing for Enhancing Medical Care, http://www.sensecare.eu/, (viewed 24 July 2020).
3. Engel, F., Bond, R., Keary, A., Mulvenna, M., Walsh, P., Hiuru, Z., Kowohl, U., Hemmje, M.L.: SenseCare: Towards an experimental platform for home-based, visualisation of emotional states of people with dementia. Computer Science, Springer (2016).
4. Duan, L., Zhu, G.: Psychological interventions for people affected by the COVID-19 epidemic. Lancet Psychiatry 7(4), 300-302 (2020). doi:10.1016/s2215-0366(20)30073-0.
5. Torous, J., Jän Myrick, K., Rauseo-Ricupero, N., Firth, J.: Digital Mental Health and COVID-19: Using Technology Today to Accelerate the Curve on Access and Quality Tomorrow. JMIR Ment Health 7(3), e18848 (2020). doi:10.2196/18848.
6. Gerke, S., Stern, A.D., Minssen, T.: Germany's digital health reforms in the COVID-19 era: lessons and opportunities for other countries. NPJ Digit Med 3, 94 (2020). doi:10.1038/s41746-020-0306-7.
7. Celano, C.M., Villegas, A.C., Albanese, A.M., Gaggin, H.K., Huffman, J.C.: Depression and Anxiety in Heart Failure: A Review. Harv Rev Psychiatry 26(4), 175-184 (2018). doi:10.1097/hrp.0000000000000162.
8. Ismail, Z., Gatchel, J., Bateman, D.R., Barcelos-Ferreira, R., Cantillon, M., Jaeger, J., ..., Mortby, M.E.: Affective and emotional dysregulation as pre-dementia risk markers: exploring the mild behavioral impairment symptoms of depression, anxiety, irritability, and euphoria. Int Psychogeriatr 30(2), 185-196 (2018). doi:10.1017/s1041610217001880.
9. Bornschlegl, M.X., Berwind, K., Kaufmann, M., Engel, F.C., Walsh, P., Hemmje, M.L., Riestra, R.: IVIS4BigData: A reference model for advanced visual interfaces supporting big data analysis in virtual research environments. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10084 LNCS, pp. 1-18 (2016).
10. Vu, B.: Realizing an Applied Gaming Ecosystem - Extending an Education Portal Suite towards an Ecosystem Portal. Technische Universität Darmstadt (2015).
11. OpenID Connect, https://openid.net/connect/, (viewed 24 July 2020).
12. IndivoHealth, http://indivohealth.org/.
13. Tolven, http://tolvenhealth.com.
14. Ekman, P., Yamey, G.: Emotions revealed: recognising facial expressions: in the first of two articles on how recognising faces and feelings can help you communicate, Paul Ekman discusses how recognising emotions can benefit you in your professional life. Student BMJ 12, 140-142 (2004).
15. Ko, J., Lu, C., Srivastava, M., Stankovic, J., Terzis, A., Welsh, M.: Wireless sensor networks for healthcare. In: Proceedings of the IEEE (2010).
16. Mukhopadhyay, S.C.: Wearable Sensors for Human Activity Monitoring: A Review. IEEE Sensors Journal 15(3), 1321-1330 (2015).
17. Zhao, M., Adib, F., Katabi, D.: Emotion Recognition using Wireless Signals (2016), http://eqradio.csail.mit.edu/files/eqradio-paper.pdf.
18. Valossa Video AI | Video Recognition | Image Analysis | Content Intelligence, https://valossa.com/, (viewed 24 July 2020).
19. Facial expression recognition software | FaceReader, https://www.noldus.com/facereader, (viewed 24 July 2020).
20. SHORE®, https://www.iis.fraunhofer.de/en/ff/sse/imaging-and-analysis/ils/tech/shore-facedetection.html, (viewed 24 July 2020).
21. Guo, T., Dong, J., Li, H., Gao, Y.: Simple convolutional neural network on image classification. In: 2017 IEEE 2nd International Conference on Big Data Analysis (ICBDA), Beijing, pp. 721-724 (2017).
22. Machine Intelligence and Signal Processing, Proceedings of International Conference, MISP 2019. Springer, Singapore. ISBN 978-981-13-0923-6.
23. OpenCV (Open Source Computer Vision Library), https://opencv.org/, (viewed 23 June 2020).
24. Lange, J.: Use of a data management system to support the treatment of patients in the psychological field (2019).
25. TensorFlow.js, JavaScript library for machine learning, https://www.tensorflow.org/js, (viewed 23 June 2020).
26. Face-api.js, JavaScript API for face detection and face recognition in the browser and nodejs with tensorflow.js, https://github.com/justadudewhohacks/face-api.js/, (viewed 23 June 2020).
27. Node.js, https://nodejs.org/en/, (viewed 24 July 2020).
28. The most popular database for modern apps | MongoDB, https://www.mongodb.com/, (viewed 24 July 2020).
29. Express - Node.js web application framework, https://expressjs.com/, (viewed 24 July 2020).
30. FACES: A database of facial expressions in younger, middle-aged, and older women and men, https://faces.mpdl.mpg.de/imeji/collection/IXTdg721TwZwyZ8e?q=#, (viewed 24 July 2020).