MILKI-PSY Cloud: MLOps-based Multimodal Sensor Stream Processing Pipeline for Learning Analytics in Psychomotor Education

Michal Slupczynski, Ralf Klamma
RWTH Aachen University, Aachen, Germany

Abstract
Psychomotor learning develops our bodies in organized patterns with the help of environmental signals. With modern sensor arrays, we can acquire multimodal data and compare activities with stored reference models of body motions. To do this efficiently at scale, we need cloud-based infrastructures for the storage, processing and visualization of psychomotor learning analytics data. In this paper, we propose a conceptual sensor stream processing pipeline for this purpose. The solution is based on the so-called MLOps approach, a variation of the successful DevOps model from open source software engineering, applied to large-scale machine learning solutions built from standard components. This processing pipeline will facilitate the collaborative multimodal analysis of many training scenarios.

Keywords
multimodal learning analytics, explainable AI, cloud infrastructuring, machine learning as a service, psychomotor learning, big data, MILKI-PSY

1. Introduction

Psychomotor learning is the development of muscular activities as organized patterns informed by signals taken from the environment. We research sports activities such as running and engineering activities such as human-robot collaboration. Streaming data sensors such as video, motion capture suits, and IMU sensors are used to acquire data from real-world training scenarios and compare it to stored reference models of motion in the training situation. These new streaming data sensors demand big data solutions for the storage, processing and visualization of multimodal learning analytics data.
This conceptual paper presents the MILKI-PSY Cloud (MPC), an application of existing Machine Learning for IT Operations (MLOps) approaches to provide a collaborative machine learning (ML)-based cloud solution that processes multimodal sensor streams and provides direct and long-term feedback in psychomotor learning scenarios. We describe the background and related work of our contribution in Section 2. In Section 3, we present our data processing pipeline as an application of the MLOps methodology. We conclude this contribution and give an outlook on further research in Section 4.

MILeS 22: Proceedings of the Second International Workshop on Multimodal Immersive Learning Systems, September 13, 2022, Toulouse, France
Email: slupczynski@dbis.rwth-aachen.de (M. Slupczynski); klamma@dbis.rwth-aachen.de (R. Klamma)
Homepage: https://dbis.rwth-aachen.de/dbis/index.php/user/slupczynskim/ (M. Slupczynski)
ORCID: 0000-0002-0724-5006 (M. Slupczynski); 0000-0002-2296-3401 (R. Klamma)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

2. Related Work

Multimodal stream data ingestion
Multimodal sensor applications generate vast amounts of heterogeneous and multidimensional data with unique properties [1], which must be transmitted, stored and analyzed in a scalable manner. Distributed communication systems such as Apache Kafka (https://kafka.apache.org) provide a foundation optimized for efficient, low-latency ingestion of stream data [2]. The processing of different modalities is then done via data fusion, which can be subdivided into early fusion (merging before the ML process), late fusion (learning each modality independently, then fusing) and cross-modality fusion (learning across modalities) [3].
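The distinction between early and late fusion can be illustrated with a minimal sketch. The feature dimensions, class set and fusion weights below are illustrative assumptions, not part of any specific fusion framework:

```python
import numpy as np

def early_fusion(imu_features, video_features):
    """Early fusion: concatenate per-modality feature vectors into one
    joint representation before a single model is trained on it."""
    return np.concatenate([imu_features, video_features])

def late_fusion(imu_probs, video_probs, weights=(0.5, 0.5)):
    """Late fusion: each modality is classified independently; the
    per-class probabilities are merged afterwards (weighted average)."""
    fused = weights[0] * np.asarray(imu_probs) + weights[1] * np.asarray(video_probs)
    return fused / fused.sum()  # renormalize to a probability distribution

# One time window of hypothetical per-modality data:
imu = np.array([0.1, 0.9, 0.3])    # e.g. statistical features of an IMU window
video = np.array([0.7, 0.2])       # e.g. pose features from a video frame
joint = early_fusion(imu, video)   # 5-dimensional joint feature vector

# Per-modality class probabilities for classes ("correct", "incorrect"):
fused = late_fusion([0.8, 0.2], [0.6, 0.4])
```

Cross-modality fusion, in contrast, would require a model that learns interactions between the modalities directly and is not easily reduced to a few lines.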
Time series databases (TSDB)
Time series databases (TSDB) are database systems designed for the storage, processing, and visualization of time series data (TSD) such as sensor readings [4, 5, 6]. They can manage large quantities of information and enable stream data versioning, time-related searches [7] and analysis mechanisms such as trend detection and anomaly/outlier sensing [1]. One of the most notable examples of a TSDB is InfluxDB (https://www.influxdata.com).

Annotation of multimodal data
Raw sensor data requires additional annotation to provide the labels the ML systems need to infer prediction outcomes correctly. Usually, different data modalities require specialized applications to support the annotation process. Several offline and online tools have been proposed for data annotation, such as Mova [8], Microsoft PSI [9] or the Visual Inspection Tool [10].

Multimodal data processing pipelines
The usage of processing pipelines for multimodal ML data has been discussed and compared in several publications [3, 11, 12]. Commonly, the pipeline phases include feature extraction, data fusion, model training and exploitation of the trained ML applications. Implementing a user-oriented approach during the development of multimodal learning environments is important to better align with learner needs [12]. A relatively novel addition to multimodal ML pipelines is the application of MLOps principles [13, 14] to build a collaborative environment and provide immersive learning feedback in psychomotor education.

ML artifact management
A versioning system for ML models such as DVC (https://dvc.org/) provides crucial tracking and management of changes to the ML pipeline. A system like TorchServe (https://github.com/pytorch/serve) can then be used to publish and serve the different versions of the ML models. While the different stages of the MLOps pipeline can be supported by separate tools, an integrated platform like OpenML (https://openml.org) can provide all the benefits in a central place.
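The core of annotation, independent of the tool used, is aligning labeled time intervals with fixed-rate sensor samples so that each sample carries a label for supervised training. A minimal sketch, in which the sampling rate, interval format and phase names are illustrative assumptions:

```python
def label_samples(timestamps, annotations, default="unlabeled"):
    """Assign each sensor timestamp the label of the annotation
    interval it falls into. annotations: list of (start, end, label)
    tuples with half-open intervals [start, end)."""
    labels = []
    for t in timestamps:
        label = default
        for start, end, name in annotations:
            if start <= t < end:
                label = name
                break
        labels.append(label)
    return labels

# 10 Hz IMU timestamps (seconds) and two annotated movement phases:
ts = [i / 10 for i in range(10)]                      # 0.0 .. 0.9 s
ann = [(0.0, 0.4, "wind-up"), (0.4, 0.8, "throw")]
labels = label_samples(ts, ann)
```

Real annotation tools such as PSI or the Visual Inspection Tool add synchronization across modalities and interactive inspection on top of this basic alignment step.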
Machine Learning for IT Operations (MLOps)
MLOps [13, 14] is a DevOps-based [15] practice that aims to reliably [16] and sustainably [17] design, deploy, test and monitor ML models in cloud and edge systems [18, 19]. MLOps extends DevOps as follows: continuous integration additionally requires data and model versioning and validation, continuous delivery covers the automatic and steady delivery of the ML models, and continuous testing involves continuous retraining of the ML models to adapt to a steady stream of input data [20, 21, 22].

Explainable AI (XAI)
Explanations are meta-information in ML-based systems that describe the meaning or significance of an input instance for a certain output categorization [23]. A variety of ML mechanisms can be used on sensor-based TSD [24], while the usage of XAI methods facilitates trust and confidence in the models applied to the TSD [25].

3. MILKI-PSY Cloud (MPC): ML-based Multimodal Sensor Stream Processing Pipeline for Direct Learner Feedback

A multimodal sensor network produces continuous data streams with various sampling frequencies, which should be processed in near real-time to offer learning analytics and learner feedback [26]. Figure 1 gives an overview of the proposed MPC infrastructure, which is structured in three phases: data ingestion (phase 1), machine learning (phase 2) and data exploitation (phase 3). This contribution is based on the MPC architecture [27].

Feature pipeline (phase 1) - Multimodal stream data ingestion
First, a data ingestion system (1.1) should be used to provide scalable communication between the sensor devices and the MPC. Annotating the raw sensor data is done in the next step of the pipeline (1.2) to provide the labels necessary for ML feature detection and decision-making.
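For the ingestion step (1.1), each sensor reading must be serialized into a compact stream message. The sketch below is an illustrative assumption — the topic name, field names and `encode_reading` helper are hypothetical, not a fixed MPC schema; the commented lines show how such a message could be published with the kafka-python client:

```python
import json
import time

def encode_reading(sensor_id, values, ts=None):
    """Serialize one multimodal sensor reading as a JSON message
    suitable for a stream-ingestion topic (illustrative schema)."""
    return json.dumps({
        "sensor_id": sensor_id,
        "ts": ts if ts is not None else time.time(),
        "values": values,
    }).encode("utf-8")

msg = encode_reading("imu-left-wrist",
                     {"ax": 0.02, "ay": -0.98, "az": 0.11},
                     ts=1662000000.0)

# With a running broker, the message could then be published, e.g.:
#   from kafka import KafkaProducer
#   producer = KafkaProducer(bootstrap_servers="localhost:9092")
#   producer.send("sensor-readings", msg)
```

Keeping serialization separate from transport makes the same payload reusable for both the streaming path and the TSDB storage step (1.3).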
To allow repeatable ML training, the data from multiple concurrent source streams needs to be aggregated and stored with a versioning mechanism. A TSDB provides mechanisms for versioned data storage and analysis (1.3) and should thus be used to store and leverage the ingested data.

ML pipeline, continuous training (phase 2)
Next, we use a federated ML approach for model building and evaluation (2.1) [28]. Trained and evaluated ML models need to be versioned (2.2). Once a new version of a model is checked into the storage system, it needs to be deployed and made available (2.3). After training and serving the ML model, the deployed pipeline needs to be monitored and its model performance continuously evaluated (2.4). The measures used for this evaluation need to be defined by domain experts and calculated automatically in the context of the project, depending on the specific ML use case. Alternatively, an integrated MLOps solution (2.X) can cover model building, versioning and serving in one place.

If a drop in accuracy or a sufficiently large amount of new input data (e.g. a new expert recording) is detected through pipeline monitoring (2.4), a retraining of the ML pipeline (phase 2) can be triggered, as described by continuous training. For example, an ML algorithm could be trained to detect whether the feedback given by the system correlates with positive learner progress and change the feedback mechanism if it does not.

Data exploitation & feedback (phase 3)
With a properly trained and deployed ML model, a frontend application can use it to provide feedback to the learner (3.1), either directly in near real-time through interactive AR/VR elements, or long-term through performance reports and other gamification elements. The usage of explainable AI principles could provide an additional layer of context to the decisions made by the ML algorithm, which would help domain experts and learners gain insights into the feedback mechanism.
This would not only reduce potential bias embedded in the ML model but also enhance the trust of the end-users in the system, leading to better learning results. Finally, new multimodal sensor data is generated through learner interaction with the system (3.2), leading to a new MPC pipeline cycle with further data ingestion.

Figure 1: Proposed architecture of the MILKI-PSY Cloud. Phase 1, data pipeline: stream data ingestion (1.1, e.g. Apache Kafka), data annotation (1.2, e.g. PSI, VIT), data storage & versioning (1.3, time-series DB such as InfluxDB). Phase 2, machine learning pipeline with continuous training: model building & evaluation (2.1, federated ML, e.g. PyTorch), model storage & versioning (2.2, e.g. Git, DVC), model deployment & serving (2.3, e.g. TorchServe), ML performance monitoring (2.4, e.g. Evidently) — or alternatively an integrated MLOps platform (2.X, e.g. ClearML, MLReef, Kubeflow, MLFlow). Phase 3, data exploitation & feedback: learner feedback (3.1, direct feedback via AR/VR, reports) and learner progress (3.2, data generation through multimodal sensors).

3.0.1. Stakeholders

At different stages of the proposed pipeline, different stakeholders are involved in decision-making, execution and practice. The stakeholder groups involved in the project range from domain experts to end-users, including pedagogical specialists, data science engineers, coaches, teachers and students. Recording and annotation of training data and raw sensor readings are performed by sports scientists due to their expertise in the psychomotor skills to be taught using the proposed pipeline. The intricacies of ML training and evaluation need to be handled by experienced ML experts, while the results of the "data exploitation & feedback" phase will mostly benefit coaches, teachers and learners.
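The monitoring-triggered retraining described for phase 2 can be sketched as a simple decision rule. The thresholds, sample counts and the `should_retrain` helper below are illustrative assumptions; in the MPC, the actual measures would be defined by domain experts:

```python
def should_retrain(recent_accuracy, baseline_accuracy, new_samples,
                   max_drop=0.05, min_new_samples=1000):
    """Decide whether pipeline monitoring (step 2.4) should trigger a
    new continuous-training cycle of phase 2: either model accuracy
    has dropped noticeably, or enough new input data has arrived."""
    accuracy_dropped = recent_accuracy < baseline_accuracy - max_drop
    enough_new_data = new_samples >= min_new_samples
    return accuracy_dropped or enough_new_data

# Accuracy fell from 0.91 to 0.83 -> retrain even without much new data:
trigger_a = should_retrain(0.83, 0.91, new_samples=120)
# Accuracy stable, but a new expert recording added 5000 samples -> retrain:
trigger_b = should_retrain(0.90, 0.91, new_samples=5000)
# Accuracy stable and little new data -> keep serving the current model:
trigger_c = should_retrain(0.90, 0.91, new_samples=120)
```

In a deployed pipeline, a monitoring tool would evaluate such a rule continuously against the serving metrics rather than on demand.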
For effective interaction with the toolset used in the MPC, each stakeholder group needs to be provided with a different set of interfaces. Sports scientists need a simple way to feed their data into the system and a powerful annotation engine to add meaning to the multimodal data streams. Application developers and designers can use the data stored in the TSDB, together with the trained and deployed ML models, to create immersive learning environments and provide learner feedback. Since the proposed pipeline will run its toolset in a cloud with powerful hardware, ML experts and researchers will have access to computing resources to process the TSD, so they can focus on model training and development. The versioning in the MLOps pipeline allows coaches, teachers and students to use the best-performing version of the ML models without requiring extensive ML expertise. The proposed cloud pipeline enables the stakeholders to collaborate on different aspects of the immersive learning environment, from data ingestion and annotation over model training to interaction with the learners.

4. Conclusions and Outlook

This contribution presents an ML-based sensor data processing pipeline to facilitate direct and long-term learner feedback in psychomotor education. Pipeline automation through MLOps is critical for ensuring reproducibility, helping to reduce the manual infrastructuring inherent in ML development. The usage of XAI to support the ML process is beneficial in facilitating trust and providing individualized feedback for multimodal psychomotor learning scenarios. Future work includes further implementation and evaluation of the proposed framework in a practical environment. We will focus on an end-user-oriented MLOps execution and evaluation.

Acknowledgments

We thank Aleksandra Nekhviadovich for their help in the study.
This work was funded by the German Federal Ministry of Education and Research (BMBF) within the project "MILKI-PSY" (https://milki-psy.de/) under the project id 16DHB4015.

References

[1] F. A. Tsvetanov, Storing data from sensors networks, IOP Conference Series: Materials Science and Engineering 1032 (2021) 012012. doi:10.1088/1757-899x/1032/1/012012.
[2] B. R. Hiraman, C. Viresh M., K. Abhijeet C., A study of Apache Kafka in big data stream processing, in: 2018 International Conference on Information, Communication, Engineering and Technology (ICICET), IEEE, 2018, pp. 1–3. doi:10.1109/ICICET.2018.8533771.
[3] W. C. Sleeman IV, R. Kapoor, P. Ghosh, Multimodal classification: Current landscape, taxonomy and future directions, ACM Computing Surveys (2022). doi:10.1145/3543848.
[4] T. Dunning, B. E. Friedman, Time Series Databases: New Ways to Store and Access Data, first edition, O'Reilly, Beijing, 2014.
[5] Google Cloud Architecture Center, Processing streaming time series data: overview, 2021. URL: https://cloud.google.com/architecture/processing-streaming-time-series-data-overview.
[6] S. K. Jensen, T. B. Pedersen, C. Thomsen, Time series management systems: A survey, IEEE Transactions on Knowledge and Data Engineering 29 (2017) 2581–2600. doi:10.1109/TKDE.2017.2740932.
[7] H. Parviainen, The complete guide to time series data, 2021. URL: https://www.clarify.io/learn/time-series-data.
[8] O. Alemi, P. Pasquier, C. Shaw, Mova, in: Proceedings of the 2014 International Workshop on Movement and Computing, 2014, pp. 37–42. doi:10.1145/2617995.2618002.
[9] D. Bohus, S. Andrist, A. Feniello, N. Saw, M. Jalobeanu, P. Sweeney, A. L. Thompson, E. Horvitz, Platform for situated intelligence, 2021. doi:10.48550/arXiv.2103.15975.
[10] D. Di Mitri, J. Schneider, R. Klemke, M. Specht, H. Drachsler, Read between the lines, in: Proceedings of the 9th International Conference on Learning Analytics & Knowledge (LAK19), ACM Press, New York, NY, USA, 2019, pp. 51–60. doi:10.1145/3303772.3303776.
[11] T. Baltrusaitis, C. Ahuja, L.-P. Morency, Multimodal machine learning: A survey and taxonomy, IEEE Transactions on Pattern Analysis and Machine Intelligence 41 (2019) 423–443. doi:10.1109/TPAMI.2018.2798607.
[12] K. Sharma, M. Giannakos, Multimodal data capabilities for learning: What can multimodal data tell us about learning?, British Journal of Educational Technology 51 (2020) 1450–1484. doi:10.1111/bjet.12993.
[13] D. Kreuzberger, N. Kühl, S. Hirschl, Machine learning operations (MLOps): Overview, definition, and architecture, 2022. doi:10.48550/arXiv.2205.02302.
[14] N. Talagala, Why MLOps (and not just ML) is your business' new competitive frontier, 2018. URL: https://www.aitrends.com/machine-learning/mlops-not-just-ml-business-new-competitive-frontier/.
[15] A. Dyck, R. Penners, H. Lichter, Towards definitions for release engineering and DevOps, in: 2015 IEEE/ACM 3rd International Workshop on Release Engineering, IEEE, 2015, p. 3. doi:10.1109/RELENG.2015.10.
[16] M. Borg, Agility in software 2.0 – notebook interfaces and MLOps with buttresses and rebars, in: A. Przybylek, A. Jarzebowicz, I. Lukovic, Y. Y. Ng (Eds.), Lean and Agile Software Development, volume 438 of Lecture Notes in Business Information Processing, Springer International Publishing, Cham, 2022, pp. 3–16. doi:10.1007/978-3-030-94238-0_1.
[17] D. A. Tamburri, Sustainable MLOps: Trends and challenges, in: 2020 22nd International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), IEEE, 2020, pp. 17–23. doi:10.1109/SYNASC51798.2020.00015.
[18] C. Min, A. Mathur, U. G. Acer, A. Montanari, F. Kawsar, SensiX++: Bringing MLOps and multi-tenant model serving to sensory edge devices, 2021. URL: http://arxiv.org/pdf/2109.03947v1.
[19] E. Raj, D. Buffoni, M. Westerlund, K. Ahola, Edge MLOps: An automation framework for AIoT applications, in: 2021 IEEE International Conference on Cloud Engineering (IC2E), IEEE, 2021, pp. 191–200. doi:10.1109/IC2E52221.2021.00034.
[20] R. Dawson, Why is DevOps for machine learning so different?, 2019. URL: https://hackernoon.com/why-is-devops-for-machine-learning-so-different-384z32f1.
[21] A. Tripathi, MLOps: A complete guide to machine learning operations: MLOps vs DevOps, 2021. URL: https://ashutoshtripathi.com/2021/08/18/mlops-a-complete-guide-to-machine-learning-operations-mlops-vs-devops/.
[22] Y. Zhou, Y. Yu, B. Ding, Towards MLOps: A case study of ML pipeline platform, in: 2020 International Conference on Artificial Intelligence and Computer Engineering (ICAICE), IEEE, 2020, pp. 494–500. doi:10.1109/ICAICE51518.2020.00102.
[23] A. Das, P. Rad, Opportunities and challenges in explainable artificial intelligence (XAI): A survey, 2020. URL: http://arxiv.org/pdf/2006.11371v2.
[24] L.-R. Jácome-Galarza, M.-A. Realpe-Robalino, J. Paillacho-Corredores, J.-L. Benavides-Maldonado, Time series in sensor data using state-of-the-art deep learning approaches: A systematic literature review, in: Á. Rocha, P. C. López-López, J. P. Salgado-Guerrero (Eds.), Communication, Smart Technologies and Innovation for Society, volume 252 of Smart Innovation, Systems and Technologies, Springer Singapore, Singapore, 2022, pp. 503–514. doi:10.1007/978-981-16-4126-8_45.
[25] T. Rojat, R. Puget, D. Filliat, J. Del Ser, R. Gelin, N. Díaz-Rodríguez, Explainable artificial intelligence (XAI) on time series data: A survey, 2021. doi:10.48550/arXiv.2104.00950.
[26] M. K. Garg, D.-J. Kim, D. S. Turaga, B. Prabhakaran, Multimodal analysis of body sensor network data streams for real-time healthcare, in: J. Z. Wang, N. Boujemaa, N. O. Ramirez, A. Natsev (Eds.), Proceedings of the International Conference on Multimedia Information Retrieval (MIR '10), ACM Press, New York, NY, USA, 2010, p. 469. doi:10.1145/1743384.1743467.
[27] M. Slupczynski, R. Klamma, MILKI-PSY Cloud: Facilitating multimodal learning analytics by explainable AI and blockchain, in: R. Klemke, K. Sanusi, D. Majonica, A. Richert, V. Varney, T. Keller, J. Schneider, D. Di Mitri, ..., N. Riedl (Eds.), First International Workshop on Multimodal Immersive Learning Systems (MILeS 2021), 2021, pp. 22–28. URL: http://ceur-ws.org/Vol-2979/paper3.pdf.
[28] P. Kairouz, H. B. McMahan, B. Avent, A. Bellet, M. Bennis, A. Nitin Bhagoji, K. Bonawitz, Z. Charles, ..., S. Zhao, Advances and open problems in federated learning, Foundations and Trends in Machine Learning 14 (2021) 1–210. doi:10.1561/2200000083.