=Paper=
{{Paper
|id=Vol-2979/paper3
|storemode=property
|title=MILKI-PSY Cloud: Facilitating multimodal learning analytics by explainable AI and blockchain
|pdfUrl=https://ceur-ws.org/Vol-2979/paper3.pdf
|volume=Vol-2979
|authors=Michal Slupczynski,Ralf Klamma
|dblpUrl=https://dblp.org/rec/conf/ectel/SlupczynskiK21
}}
==MILKI-PSY Cloud: Facilitating multimodal learning analytics by explainable AI and blockchain==
MILKI-PSY Cloud: Facilitating multimodal learning analytics by explainable AI and blockchain

Michal Slupczynski1[0000−0002−0724−5006] and Ralf Klamma1[0000−0002−2296−3401]

RWTH Aachen University, Aachen, Germany
{lastname}@dbis.rwth-aachen.de

Abstract. Modern cloud-based big data engineering approaches like machine learning and blockchain enable the collection of learner data from numerous sources of different modalities (like video feeds, sensor data, etc.), allowing multimodal learning analytics (MMLA) and reflection on the learning process. In particular, complex psychomotor skills like dancing or operating a complex machine profit from MMLA. However, instructors, learners, and other institutional stakeholders may have issues with the traceability and transparency of machine learning processes applied to learning data on the one side, and with privacy, data protection, and security on the other side. We propose an approach for the acquisition, storage, processing, and presentation of multimodal learning analytics data using machine learning and blockchain as services to reach explainable artificial intelligence (AI) and certified traceability of learning data processing. Moreover, we facilitate end-user involvement in the whole development cycle by extending established open-source software DevOps processes with participative design and community-oriented monitoring of MMLA processes. The MILKI-PSY Cloud (MPC) architecture extends existing MMLA approaches and Kubernetes-based automation of learning analytics infrastructure deployment from a number of research projects. The MPC will facilitate further research and development in this field.
Keywords: multimodal learning analytics · explainable AI · cloud infrastructuring · machine learning as a service · blockchain as a service · psychomotor learning · big data · MILKI PSY

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 Introduction

Learning complex psychomotor skills involves coordinating physical movements according to a predefined reference model. The increasing availability of big data solutions presents opportunities in education to converge cloud infrastructures and learning infrastructures. This allows professional communities of practice of instructors, learners, and other institutional stakeholders to create better collaborative environments for multimodal learning analytics (MMLA). These cloud-based environments can be used to enhance collaborative knowledge sharing within and between the communities by sticking to established standards, by using established open-source software development processes, and by building knowledge repositories. Artificial Intelligence (AI)-based systems are used to enable efficient analysis of the vast amounts of data that can be collected while performing learning activities. However, the communities will not accept AI-based solutions if the data are not secured and the processing is not transparent to all community members by design. To realize this, all community members should be involved in the design of machine learning [8] and other related processes. In the end, it must be explainable to the end users of a system why the AI arrived at the presented results.

This conceptual paper presents the MILKI-PSY Cloud architecture, an extension of existing MMLA approaches and of the results of a number of European and national research projects for infrastructure building in complex learning domains. After the related work section, we present our MILKI-PSY Cloud.
The paper then concludes and gives an outlook on further research.

2 Related Work

Machine learning as a service (MLaaS) [10] is an umbrella term for various cloud-based platforms that cover most infrastructure issues in training AIs, such as data preprocessing, model training, and model evaluation. This approach is very useful and effective not only for data scientists, data engineers, and other machine learning professionals, but also for students and researchers, who can use it to train machine learning models while benefiting from the scalability of the cloud provider. Blockchain as a Service (BaaS) [11] enables enterprises to use cloud-based solutions to build, host, and use their own blockchain apps, smart contracts, and functions on blockchain infrastructure developed by a vendor. BaaS provides access to a blockchain network of a desired configuration without the need to develop, host, and deploy an on-premises blockchain and build in-house expertise on the subject. The distributed ledger of a blockchain can be used to manage the process of issuing, storing, and releasing students' academic certificates, to store and share competencies and learning outcomes that students have achieved, and to assess learning progress [1]. In addition, blockchain technology can be used to enable an easier and more secure transfer of credits between learning centers. Explainable AI builds a common communication platform between humans and AI that helps perform learning analytics and improves the usability of AI-powered mentoring processes. To this end, machine-learned features should correlate with human-derived thought constructs and mental models to facilitate understanding of the neural network learning process [5]. In this context, the US National Institute of Standards and Technology (NIST, https://www.nist.gov/) presented four principles of explainable AI [9]:

1.
Explanation: AI should provide evidence, support, or rationale for each output. This principle does not require that the evidence be correct or intelligible; it merely states that a system is capable of providing an explanation.
2. Meaningful: Systems provide explanations that are understandable to the individual user. A system satisfies this principle if the recipient understands the system's explanations and/or they are useful in accomplishing a task.
3. Explanation accuracy: The explanation correctly reflects the system's process for producing the output. Taken together, the first two principles only require that a system produce explanations that are understandable to the target audience, without requiring the explanation to correctly reflect the system's process for producing its output. "Explanation accuracy" requires that a system's explanations also be accurate.
4. Knowledge limits: The system operates only under the conditions for which it was designed or when the system achieves sufficient confidence in its output. The previous principles implicitly assume that a system operates within its knowledge limits. This principle states that systems identify cases for which they were not designed or approved, or for which their responses are not reliable. By identifying and declaring knowledge boundaries, this practice safeguards responses so that judgment is not made when it may be inappropriate.

Due to the lack of an end-user engagement concept in DevOps, Koren et al. introduced the extended DevOpsUse approach [3] in our research group, which aims to unify the agile practices of developers, operators, and end-users. The innermost circle of the schema (see Fig. 1) reflects the standard DevOps lifecycle as a basis.
To reflect the importance of end-user contributions as contributions to Societal Software Engineering, an additional USE ring is added to represent end-user activities in the different phases of the DevOps cycle. To improve usability and understanding of complex information systems, such as AI-based cloud solutions, the involvement of end-users in the design and creation process is crucial. In particular, this means that users are not only involved in the elicitation of requirements, but are also instrumental for beta testing, providing deployment context, and using the application for their practice. This in turn provides awareness of issues and is a valuable source of ideas and feedback to improve the usability of the designed technology.

Fig. 1: DevOpsUse Cycle [3]

Learning analytics [2] is the measurement, collection, analysis, and reporting of data about learners and their contexts for the purpose of understanding and optimizing learning and the environments in which it occurs. Most conventional learning analytics approaches examine learning processes and contexts from a single data source (e.g., the logs of an LMS), which provides only a partial view of learning. Multimodal learning analytics (MMLA) involves complex technical issues in collecting, merging, and analyzing different types of learning data from heterogeneous data sources. Di Mitri et al. [4] proposed the Multimodal Learning Analytics Pipeline (MMLAP), which provides a generic approach to collecting and analyzing multimodal data to support learning activities in physical and digital spaces. The pipeline is structured in five steps, namely: (1) collection of data, (2) storage of data, (3) data labeling, (4) data processing, and (5) data application.
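The five MMLAP steps can be illustrated as a minimal processing chain. The following Python sketch is our own illustration; the function names, the record layout, and the toy "error" label are assumptions for demonstration, not part of the pipeline specification by Di Mitri et al. [4]:

```python
# Hypothetical in-memory sketch of the five MMLAP steps:
# (1) collection, (2) storage, (3) labeling, (4) processing, (5) application.

def collect(sensor_readings):
    """(1) Merge readings from heterogeneous sources into one stream."""
    return [{"source": src, "value": val, "t": t}
            for src, val, t in sensor_readings]

def store(stream, storage):
    """(2) Persist the stream in chronological order for later retrieval."""
    storage.extend(sorted(stream, key=lambda r: r["t"]))
    return storage

def label(storage, annotations):
    """(3) Attach expert annotations (the reference interpretation) by time."""
    for record in storage:
        record["label"] = annotations.get(record["t"], "unlabeled")
    return storage

def process(storage):
    """(4) Aggregate and clean the raw stream to extract relevant information."""
    errors = [r for r in storage if r["label"] == "error"]
    return {"n_records": len(storage), "n_errors": len(errors)}

def apply_feedback(summary):
    """(5) Turn the processed result into direct learner feedback."""
    return f"{summary['n_errors']} of {summary['n_records']} movements deviated"

readings = [("imu", 0.3, 2), ("video", 0.9, 1), ("imu", 0.5, 3)]
feedback = apply_feedback(
    process(label(store(collect(readings), []), {3: "error"})))
print(feedback)
```

The point of the chain is that each step consumes exactly the output of the previous one, which is what makes the pipeline generic across modalities.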
This means that after merging data streams from different sources to create a model of the physiological state of the user (1), the multimodal data is organised for storage and retrieval (2) and labelled to assign meaning and expert interpretations to the multimodal recordings (3). The "raw" data stream needs to be aggregated, cleaned, aligned, and interpreted to extract the relevant information (4) necessary to give the learner direct and immediate feedback (5). This architecture serves as a reference model for our system structure.

3 MILKI PSY Cloud: An infrastructure for distributed multimodal learning data analysis with a focus on informational self-determination

A cloud infrastructure for distributed multimodal learning data analysis should be able to collect data from heterogeneous input streams from a variety of sources, like software solutions and hardware sensors. In addition to learner data, a data annotation layer is required to apply expert knowledge, which serves as a reference model to compare the learner data to. The learning data then needs to be collected, stored, analyzed, and processed to provide insights into the multimodal data stream. The design of AI elements in the MILKI PSY Cloud (MPC) should not only involve the end-users (DevOpsUse), but also follow the principles of explainable AI. This understanding can then be used to provide direct feedback to the learners. Additionally, blockchain approaches can be used to provide secure certification of learner progress. Blockchain approaches enable data self-sovereignty by allowing individuals to decide who can access and use their data and personal information. We use OpenID Connect for authentication and authorization, which enables modern and secure access to all services. In the educational context, this allows learners to manage their credentials without relying on the educational institution as a trusted intermediary.
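One common pattern for blockchain-backed certification, sketched below in Python, is to anchor only a tamper-evident digest of the certificate on-chain while the personal data stays off-chain. This is our own illustration: the field names, the `did:example` identifier, and the digest scheme are assumptions, not the MPC's actual certificate format or smart-contract interface:

```python
import hashlib
import json

def certificate_digest(certificate: dict) -> str:
    """Canonical SHA-256 digest of a certificate record.

    Only this digest would be anchored on a blockchain (e.g., in an
    Ethereum smart contract); the learner's personal data stays off-chain,
    which is one common way to reconcile immutable ledgers with data
    protection requirements.
    """
    canonical = json.dumps(certificate, sort_keys=True,
                           separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

cert = {
    "learner": "did:example:alice",           # hypothetical decentralized ID
    "skill": "robot-cell operation, level 2",  # hypothetical competency label
    "issuer": "RWTH Aachen University",
    "issued": "2021-09-01",
}

digest = certificate_digest(cert)
# A verifier recomputes the digest and compares it with the on-chain copy;
# a single changed field breaks the match.
tampered = dict(cert, skill="robot-cell operation, level 3")
assert certificate_digest(cert) == digest
assert certificate_digest(tampered) != digest
```

Canonical serialization (sorted keys, fixed separators) matters here: without it, two logically identical certificates could hash differently.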
This becomes particularly important when learners change their institutions.

Figure 2 gives an overview of the proposed infrastructure. The central component here is the MPC [12], marked with a blue frame in the figure.

(OpenID Connect: http://results.learning-layers.eu/infrastructure/oidc/)

Fig. 2: Architecture of the MILKI PSY Cloud

The data pipeline, which is based on the Multimodal Learning Analytics Pipeline by Di Mitri et al. [4], starts with the aggregation and processing of multimodal sensor data from learners. Thus, the first step is to collect multimodal sensor data from learners (1), by means of body-mounted sensors to measure various physiological parameters, video feeds to detect a skeletal map of the learner's movements, or learning progress from a Learning Management System (LMS). The resulting data streams are sent from the respective devices to the cloud. Data collected in this way is loaded into the Apache Kafka cluster via data brokers for data storage (2), where it is stored in chronological sequence. The third step of the MMLAP is the annotation of data (3), i.e., collecting expert knowledge and recording a multimodal reference model of motion that learners can use to orient themselves. These data will be analyzed together with the raw sensor data.
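The collection and storage steps above amount to turning each sensor reading into a keyed, timestamped record before handing it to the broker. The following Python sketch is a hypothetical illustration of that framing; the topic naming scheme and record layout are our own assumptions, and no real Kafka connection is made (with the kafka-python client, the resulting record would roughly correspond to a `KafkaProducer.send()` call):

```python
import json
import time

def make_record(learner_id, modality, payload, timestamp=None):
    """Build a keyed, timestamped record for a broker topic.

    Keying by learner_id keeps one learner's readings together and in
    order within a partition; the timestamp preserves the chronological
    sequence that the storage step (2) relies on.
    """
    return {
        "topic": f"mmla.{modality}",  # hypothetical topic naming scheme
        "key": learner_id,
        "timestamp": timestamp if timestamp is not None else time.time(),
        "value": json.dumps(payload),
    }

record = make_record(
    "learner-42", "imu",
    {"accel": [0.1, 0.0, 9.8], "gyro": [0.01, 0.02, 0.0]},
    timestamp=1625000000.0)

# With kafka-python this record would be sent roughly as:
#   producer.send(record["topic"], key=record["key"].encode(),
#                 value=record["value"].encode(),
#                 timestamp_ms=int(record["timestamp"] * 1000))
print(record["topic"])  # mmla.imu
```

Serializing the payload once at the edge keeps the broker agnostic to the modality, which is what lets heterogeneous sources share one pipeline.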
In the next step (4), a distributed machine learning cluster performs Human Erroneous Activity Recognition (HEAR) to not only recognize which actions the human learner is performing, but also to track their errors and identify which body parts performed movements that do not match the reference model. This information can be used to provide localized feedback to the learner. The collected action-based information about the learner's activity and the errors made in their movement is then processed (4) and sent via the Experience API (xAPI) or the ARLEM standard [13] to the Learning Record Store (LRS). To complete the MMLAP, the data exploitation step (5) sends the data to the learner for direct and immediate feedback. Both the sensor data and the LRS information can be used to analyze the learner activities, compare them with the reference model, and provide targeted feedback to the learner. This can be done using intelligent tutoring systems, social bots, or analytics dashboards. Additionally, blockchain-backed certificates can be used to facilitate the traceability of learning records. The Kubernetes platform operated by the Research Group for Advanced Community Information Systems (ACIS) creates a decentralized environment for the development, deployment, and monitoring of community-oriented microservices that include las2peer [7] nodes in a p2p fashion. Part of this cluster are services developed by the las2peer community to deliver dynamic and adaptive learning content, providing a socio-technical infrastructure to scale mentoring processes using distributed artificial intelligence [6]. Within the cloud, we rely on a Kubernetes-based solution. This enables a modern and scalable infrastructure for decentralized storage of data. AI-based tools developed by project partners during the course of the project can thus also be hosted directly in the Kubernetes cloud.
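The localized error detection that HEAR performs can be reduced to a per-body-part comparison between the learner's pose and the reference model. The Python sketch below is our own simplification, not the project's actual recognition model: the joint names, coordinate frame, and deviation threshold are illustrative assumptions:

```python
import math

def deviating_joints(learner_pose, reference_pose, threshold=0.15):
    """Return the body parts whose position deviates from the reference.

    Poses map joint names to (x, y, z) coordinates in a shared frame;
    the threshold (in the same units) is a tunable sketch parameter.
    Localizing the deviation per joint is what allows feedback like
    "left elbow too low" instead of a global pass/fail.
    """
    deviations = {}
    for joint, ref in reference_pose.items():
        cur = learner_pose.get(joint)
        if cur is None:
            continue  # joint not tracked in this frame
        dist = math.dist(cur, ref)  # Euclidean distance, Python 3.8+
        if dist > threshold:
            deviations[joint] = round(dist, 3)
    return deviations

reference = {"left_elbow": (0.2, 1.1, 0.0), "right_knee": (0.1, 0.5, 0.0)}
learner = {"left_elbow": (0.2, 0.8, 0.0), "right_knee": (0.12, 0.5, 0.0)}

errors = deviating_joints(learner, reference)
print(errors)  # only the left elbow exceeds the threshold
```

In the pipeline, such a per-joint deviation map would then be wrapped into an xAPI or ARLEM statement for the LRS and surfaced to the learner as targeted feedback.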
(Apache Kafka: https://kafka.apache.org/; ARLEM: IEEE 1589-2020, https://standards.ieee.org/standard/1589-2020.html; Kubernetes: https://kubernetes.io/)

4 Conclusions and Outlook

The increasing availability of cloud-based big data solutions facilitates the integration of learning infrastructures across institutional and national boundaries. The collaborative exchange of knowledge and the interoperability of learning approaches improve the spread of multimodal learning analytics, but also raise issues of trust in the infrastructures and the complex services hosted on them. Learners, instructors, and institutional stakeholders need transparency and traceability of learning records and of the further processing of learning analytics data. In particular, there is an emphasis on self-sovereignty, which is especially important in the context of storing and processing privacy-sensitive learner data collected in the course of learning activities. The increased use of AI algorithms means that machine learning and AI-based services are becoming part of the infrastructure (MLaaS). Explainable AI is used here as a communication platform between humans and AI to improve the usability of AI-assisted mentoring processes and can be used to provide personalized and direct learning feedback. Blockchain approaches support traceable management of learner data that complies with data protection law, even when learners change their learning institutions. Thus, to verify the use of complex AI systems, all stakeholders are involved in the design of machine learning and blockchains. In the project context, the Multimodal Learning Analytics Pipeline will be used to support the development of psychomotor skills with artificial intelligence. In particular, this will be investigated in two application domains: sports and complex processes in human-robot interaction.
The goal of the MPC is to acquire, store, process, and display real-time multimodal data to promote the digital learning of psychomotor activities such as human-robot interaction or playing sports. In cooperation with project partners, the organizational basis for the creation of a distributed cloud development and learning platform (MPC) with special involvement of learners and end users (DevOpsUse) will be created using an agile development process. The use of OpenID Connect creates an open learning environment that enables secure and easy access to the provided services. The multimodal learning data is collected by different sensors, cameras, and learning systems via data brokers and processed in the MPC. To create a reference model, motion profiles are recorded by experts and analyzed in the system. These are then compared with the raw sensor data by an AI-based Human Erroneous Activity Recognition mechanism to detect differences between the learner's activities and the targeted learning goal and to provide contextual feedback. Cloud-, fog-, and edge-based machine learning models will be compared with client-side solutions in the future.

References

1. Alammary, A., Alhazmi, S., Almasri, M., Gillani, S.: Blockchain-Based Applications in Education: A Systematic Review. Applied Sciences 9(12), 2400 (2019). https://doi.org/10.3390/app9122400
2. Clow, D.: An overview of learning analytics. Teaching in Higher Education 18(6), 683–695 (2013). https://doi.org/10.1080/13562517.2013.827653
3. de Lange, P., Nicolaescu, P., Klamma, R., Koren, I.: DevOpsUse for Rapid Training of Agile Practices Within Undergraduate and Startup Communities. In: Verbert, K., Sharples, M., Klobučar, T. (eds.) Adaptive and Adaptable Learning. Lecture Notes in Computer Science, vol. 9891, pp. 570–574. Springer International Publishing, Cham (2016). https://doi.org/10.1007/978-3-319-45153-4_65
4.
Di Mitri, D., Schneider, J., Klemke, R., Specht, M., Drachsler, H.: Read Between the Lines. In: Proceedings of the 9th International Conference on Learning Analytics & Knowledge (LAK19). pp. 51–60. ACM Press, New York, NY, USA (2019). https://doi.org/10.1145/3303772.3303776
5. Holzinger, A., Malle, B., Saranti, A., Pfeifer, B.: Towards multi-modal causability with Graph Neural Networks enabling information fusion for explainable AI. Information Fusion 71, 28–37 (2021). https://doi.org/10.1016/j.inffus.2021.01.008
6. Klamma, R., de Lange, P., Neumann, A.T., Hensen, B., Kravcik, M., Wang, X., Kuzilek, J.: Scaling Mentoring Support with Distributed Artificial Intelligence. In: Kumar, V., Troussas, C. (eds.) Intelligent Tutoring Systems. Lecture Notes in Computer Science, vol. 12149, pp. 38–44. Springer International Publishing, Cham (2020). https://doi.org/10.1007/978-3-030-49663-0_6
7. Klamma, R., Renzel, D., de Lange, P., Janßen, H.: las2peer – a primer. https://doi.org/10.13140/RG.2.2.31456.48645
8. Maadi, M., Akbarzadeh Khorshidi, H., Aickelin, U.: A Review on Human-AI Interaction in Machine Learning and Insights for Medical Applications. International Journal of Environmental Research and Public Health 18(4) (2021). https://doi.org/10.3390/ijerph18042121
9. Phillips, P.J., Hahn, C.A., Fontana, P.C., Broniatowski, D.A., Przybocki, M.A.: Four Principles of Explainable Artificial Intelligence (2020). https://doi.org/10.6028/NIST.IR.8312-draft
10. Ribeiro, M., Grolinger, K., Capretz, M.A.: MLaaS: Machine Learning as a Service. In: 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA). pp. 896–902. IEEE (2015). https://doi.org/10.1109/ICMLA.2015.152
11. Samaniego, M., Jamsrandorj, U., Deters, R.: Blockchain as a Service for IoT.
In: 2016 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData). pp. 433–436. IEEE (2016). https://doi.org/10.1109/iThings-GreenCom-CPSCom-SmartData.2016.102
12. Slupczynski, M., Klamma, R.: Vorschlag einer Infrastruktur zur verteilten Datenanalyse von multimodalen psychomotorischen Lerneraktivitäten [Proposal of an infrastructure for distributed data analysis of multimodal psychomotor learner activities]. https://doi.org/10.13140/RG.2.2.23664.58886/1
13. Wild, F., Perey, C., Hensen, B., Klamma, R.: IEEE Standard for Augmented Reality Learning Experience Models. In: 2020 IEEE International Conference on Teaching, Assessment, and Learning for Engineering (TALE). pp. 1–3. IEEE (2020). https://doi.org/10.1109/TALE48869.2020.9368405