MILKI-PSY Cloud: Facilitating multimodal learning analytics by explainable AI and blockchain

Michal Slupczynski [0000-0002-0724-5006] and Ralf Klamma [0000-0002-2296-3401]

RWTH Aachen University, Aachen, Germany
{lastname}@dbis.rwth-aachen.de



Abstract. Modern cloud-based big data engineering approaches like machine learning and blockchain enable the collection of learner data from numerous sources of different modalities (such as video feeds, sensor data, etc.), allowing multimodal learning analytics (MMLA) and reflection on the learning process. In particular, complex psycho-motor skills like dancing or operating a complex machine profit from MMLA. However, instructors, learners, and other institutional stakeholders may have issues with the traceability and transparency of machine learning processes applied to learning data on the one side, and with privacy, data protection and security on the other side. We propose an approach for the acquisition, storage, processing and presentation of multimodal learning analytics data using machine learning and blockchain as services to reach explainable artificial intelligence (AI) and certified traceability of learning data processing. Moreover, we facilitate end-user involvement in the whole development cycle by extending established open-source software DevOps processes with participatory design and community-oriented monitoring of MMLA processes. The MILKI-PSY Cloud (MPC) architecture extends existing MMLA approaches and Kubernetes-based automation of learning analytics infrastructure deployment from a number of research projects. The MPC will facilitate further research and development in this field.

              Keywords: multimodal learning analytics · explainable AI · cloud in-
              frastructuring · machine learning as a service · blockchain as a service ·
psychomotor learning · big data · MILKI-PSY


   1       Introduction

   Learning complex psychomotor skills involves coordinating physical movements
   according to a predefined reference model. The increasing availability of big
   data solutions presents opportunities in education to converge cloud infrastruc-
   tures and learning infrastructures. This allows professional communities of prac-
   tice of instructors, learners and other institutional stakeholders to create better
   collaborative environments for multimodal learning analytics (MMLA). These




Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

cloud-based environments can be used to enhance collaborative knowledge shar-
ing within and between the communities by sticking to established standards,
by using established open-source software development processes and by build-
ing knowledge repositories. Artificial Intelligence (AI)-based systems are used
to enable efficient analysis of the vast amounts of data that can be collected
while performing learning activities. However, the communities will not accept
AI-based solutions unless the data are secured and the processing is transparent
to all community members by design. To realize this, all community members
should be involved in the design of machine learning [8] and other related pro-
cesses. In the end, it must be explainable to the end users of a system why the
AI arrived at the presented results.
    This conceptual paper presents the MILKI-PSY cloud architecture, an ex-
tension of existing MMLA approaches and the results of a number of European
and national research projects for infrastructure building in complex learning
domains.
    After the related work section, we present our MILKI-PSY Cloud. The paper
then concludes and gives an outlook on further research.


2     Related Work

Machine learning as a service (MLaaS) [10] is an umbrella term for various cloud-
based platforms that cover most infrastructure issues in training AIs, such as
data preprocessing, model training, and model evaluation. This approach is very
useful and effective not only for data scientists, data engineers, and other ma-
chine learning professionals, but also for students and researchers who can use
it to train machine learning models while benefiting from the scalability of the
cloud provider. Blockchain as a Service (BaaS) [11] enables enterprises to use
cloud-based solutions to build, host, and use their own blockchain apps, smart
contracts, and functions on blockchain infrastructure developed by a vendor.
BaaS provides access to a blockchain network of a desired configuration with-
out the need to develop, host and deploy an on-premises blockchain and build
in-house expertise on the subject. The distributed ledger of a blockchain can be
used to manage the process of issuing, storing, and releasing students’ academic
certificates, to store and share competencies and learning outcomes that students
have achieved, to assess learning progress [1]. In addition, blockchain technology
can be used to enable easier and more secure transfer of credits between learning
centers. Explainable AI builds a common communication platform between hu-
mans and AI that helps perform learning analytics and improves the usability of
AI-powered mentoring processes. To this end, machine-learned features should
correlate with human-derived thought constructs and mental models to facili-
tate understanding of the neural network learning process [5]. In this context,
the US National Institute of Standards and Technology (NIST1) presented four
principles of explainable AI [9]:
1 https://www.nist.gov/

 1. Explanation: AI should provide evidence, support, or rationale for each output.
    This principle does not require that the evidence be correct or intelligible; it
    merely states that a system is capable of providing an explanation.
 2. Meaningful: Systems provide explanations that are understandable to the
    individual user. A system satisfies this principle if the recipient understands
    the system's explanations and/or they are useful in accomplishing a task.
 3. Explanation accuracy: The explanation correctly reflects the system's process
    for producing the output. Taken together, the first two principles only require
    that a system produce explanations that are understandable to the target
    audience, without requiring the explanation to correctly reflect the system's
    process for producing its output. "Explanation accuracy" requires that a
    system's explanations be accurate.
 4. Knowledge limits: The system operates only under the conditions for which it
    was designed or when the system achieves sufficient confidence in its output.
    The previous principles implicitly assume that a system operates within its
    knowledge limits. This principle states that systems identify cases for which
    they were not designed or approved, or for which their responses are not reliable.
    By identifying and declaring knowledge boundaries, this practice safeguards
    responses so that judgment is not made when it may be inappropriate.
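
As a toy illustration of the first and fourth principles, the following Python sketch (our own illustration, not part of the NIST report or of the MILKI-PSY implementation; all thresholds and names are hypothetical) attaches a human-readable rationale to each output and withholds a judgment when the confidence falls below a declared knowledge limit:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ExplainedPrediction:
    label: Optional[str]   # None when the system stays within its knowledge limits
    confidence: float
    rationale: str         # evidence offered for the output (Explanation principle)

def classify_movement(joint_error_cm: float, threshold_cm: float = 5.0,
                      min_confidence: float = 0.7) -> ExplainedPrediction:
    """Hypothetical check: flag a movement as erroneous when the deviation
    from the reference model exceeds a threshold."""
    # Crude confidence proxy: the further from the threshold, the more certain.
    confidence = min(1.0, abs(joint_error_cm - threshold_cm) / threshold_cm)
    if confidence < min_confidence:
        # Knowledge limits: refuse to judge borderline cases.
        return ExplainedPrediction(None, confidence,
            f"Deviation of {joint_error_cm:.1f} cm is too close to the "
            f"{threshold_cm:.1f} cm threshold to judge reliably.")
    label = "erroneous" if joint_error_cm > threshold_cm else "correct"
    return ExplainedPrediction(label, confidence,
        f"Deviation of {joint_error_cm:.1f} cm vs. reference threshold of "
        f"{threshold_cm:.1f} cm.")

print(classify_movement(12.0))   # confident "erroneous", with a rationale
print(classify_movement(5.5))    # judgment withheld: outside knowledge limits
```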

    Due to the lack of an end-user engagement concept in DevOps, Koren et al.
introduced the extended DevOpsUse approach [3] in our research group, which
aims to unify the agile practices of developers, operators, and end-users. The
innermost circle of the schema (see Fig. 1) reflects the standard DevOps lifecycle
as a basis. To reflect the importance of end-user contributions as contributions
to Societal Software Engineering, an additional USE ring is added to represent
end-user activities in the different phases of the DevOps cycle. To improve the
usability and understanding of complex information systems, such as AI-based
cloud solutions, the involvement of end-users in the design and creation process
is crucial. In particular, this means that users are not only involved in the
elicitation of requirements, but are also instrumental for beta testing, providing
deployment context, and using the application in their practice. This in turn
provides awareness of issues and is a valuable source of ideas and feedback to
improve the usability of the designed technology.

Fig. 1: DevOpsUse Cycle [3] (figure not reproduced in the text extraction)
    Learning analytics [2] is the measurement, collection, analysis, and reporting
of data about learners and their contexts for the purpose of understanding and

optimizing learning and the environments in which it occurs. Most conventional
learning analytics approaches examine learning processes and contexts from a
single data source (e.g., the logs of an LMS), which provides only a partial view
of learning.
    Multimodal Learning Analytics (MMLA) involves complex technical issues
in collecting, merging, and analyzing different types of learning data from
heterogeneous data sources. Di Mitri et al. [4] proposed a Multimodal Learning
Analytics Pipeline (MMLAP), which provides a generic approach to collecting
and analyzing multimodal data to support learning activities in physical and
digital spaces. The pipeline is structured in five steps, namely: (1) collection
of data, (2) storage of data, (3) data labeling, (4) data processing and (5) data
application. This means that after merging of data streams from different sources
to create a model for the physiological state of the user (1), the multimodal data
is organised for storage and retrieval (2) and labelled to assign meaning and
expert interpretations to the multimodal recordings (3). The "raw" data stream
needs to be aggregated, cleaned, aligned and interpreted to extract relevant
information (4) necessary to give the learner direct and immediate feedback (5).
This architecture serves as a reference model of our system structure.
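
To make the five MMLAP steps concrete, the following minimal Python sketch (our own schematic illustration; the function bodies and names are placeholders, not the pipeline's actual implementation) chains the stages into a single flow:

```python
from typing import Any, Dict, List

def collect(sensors: List[str]) -> List[Dict[str, Any]]:
    """(1) Collect and merge multimodal data streams from the given sources."""
    return [{"sensor": s, "reading": 0.0, "t": 0} for s in sensors]  # placeholder readings

def store(samples: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """(2) Organise the data for storage and retrieval (here: chronologically)."""
    return sorted(samples, key=lambda s: s["t"])

def label(samples: List[Dict[str, Any]], annotations: Dict[int, str]) -> List[Dict[str, Any]]:
    """(3) Attach expert interpretations to the recordings."""
    return [{**s, "label": annotations.get(s["t"], "unlabelled")} for s in samples]

def process(samples: List[Dict[str, Any]]) -> Dict[str, Any]:
    """(4) Aggregate, clean, align and interpret the raw stream."""
    return {"n_samples": len(samples),
            "errors": [s for s in samples if s["label"] == "error"]}

def exploit(result: Dict[str, Any]) -> str:
    """(5) Turn the processed result into direct, immediate learner feedback."""
    return f"{len(result['errors'])} deviations detected in {result['n_samples']} samples."

feedback = exploit(process(label(store(collect(["imu", "camera"])), {0: "error"})))
print(feedback)
```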


3     MILKI-PSY Cloud: An infrastructure for distributed
      multimodal learning data analysis with a focus on
      informational self-determination

A cloud infrastructure for distributed multimodal learning data analysis should
be able to collect data from heterogeneous input streams from a variety of
sources, such as software solutions and hardware sensors. In addition to learner
data, a data annotation layer is required to capture expert knowledge, which serves
as a reference model against which the learner data is compared. The learning data
then needs to be collected, stored, analyzed and processed to provide insights into
the multimodal data stream. The design of AI elements in the MILKI-PSY Cloud
(MPC) should not only involve the end-users (DevOpsUse), but also follow the
principles of explainable AI. These insights can then be used to provide direct
feedback to the learners. Additionally, blockchain approaches
can be used to provide secure certification of learner progress. Blockchain ap-
proaches enable data self-sovereignty by allowing individuals to decide who can
access and use their data and personal information. We use OpenID Connect2
for authentication and authorization, which enables modern and secure access
to all services. In the educational context, this allows learners to manage their
credentials without relying on the educational institution as a trusted interme-
diary. This becomes particularly important when learners change institutions.
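
As a hedged illustration of how a MILKI-PSY service could validate such OpenID Connect tokens, the following Python sketch uses the PyJWT library; the issuer URL and client ID are hypothetical placeholders, not the project's actual configuration:

```python
# pip install pyjwt[crypto] requests
import jwt        # PyJWT
import requests

ISSUER = "https://auth.example.org/auth/realms/milki-psy"   # hypothetical OIDC issuer
CLIENT_ID = "mpc-dashboard"                                  # hypothetical client / audience

def verify_id_token(id_token: str) -> dict:
    """Verify an OIDC ID token's signature and standard claims via the issuer's JWKS."""
    # Discover the issuer's key set from its well-known OIDC configuration.
    config = requests.get(f"{ISSUER}/.well-known/openid-configuration", timeout=10).json()
    jwk_client = jwt.PyJWKClient(config["jwks_uri"])
    signing_key = jwk_client.get_signing_key_from_jwt(id_token)
    # Checks signature, expiry, audience and issuer; raises a jwt exception on failure.
    return jwt.decode(id_token, signing_key.key, algorithms=["RS256"],
                      audience=CLIENT_ID, issuer=ISSUER)
```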
    Figure 2 gives an overview of the proposed infrastructure. The central com-
ponent here is the MPC [12], marked with a blue frame in the figure.
2 http://results.learning-layers.eu/infrastructure/oidc/
Fig. 2: Architecture of the MILKI-PSY Cloud (figure not reproduced in the text
extraction; it depicts the stages 1. Data Collection, 2. Data Storage, 3. Data
Annotation, 4. Data Processing and 5. Data Exploitation, connecting multimodal
sensor kits, camera feeds and recorded movements via Apache Kafka, distributed
machine learning for human erroneous activity recognition, Learning Record
Stores (ARLEM, xAPI), an Ethereum blockchain, and a Kubernetes cluster to
learner feedback channels such as intelligent tutors, social bots, direct feedback,
dashboards and certificates)

    The data pipeline, which is based on the Multimodal Learning Analytics
Pipeline by Di Mitri et al. [4], starts with the aggregation and processing of
multimodal sensor data from learners. Thus, the first step is to collect mul-
timodal sensor data from learners (1), by means of body-mounted sensors to
measure various physiological parameters, video feeds to detect a skeletal map
of the learner’s movements, or learning progress from a Learning Management
System (LMS). The resulting data streams are sent from the respective devices
to the cloud. Data collected in this way is loaded into the Apache Kafka3 clus-
ter via data brokers for data storage (2), where it is stored in a chronological
sequence. The third step of MMLAP is annotation of data (3), i.e., collecting
expert knowledge and recording a multimodal reference model of motion that
learners can use to orient themselves. These data will be analyzed together with
the raw sensor data. In the next step (4), a distributed machine learning cluster
is tasked with performing Human Erroneous Activity Recognition (HEAR) to
not only recognize what actions the human learner is performing, but also to
track their errors and identify which body parts performed movements that do
not match the reference model. This information can be used
to provide localized feedback to the learner. The collected action-based informa-
tion about the learner’s activity and the errors they made in their movement is
then processed (4) and sent via the Experience API (xAPI) or the ARLEM4 stan-
dard [13] to the Learning Record Store (LRS). To complete the MMLAP, the
data exploitation step (5) is to send the data to the learner for direct and im-
mediate feedback. Both the sensor data and the LRS information can be used
to analyze the learner activities to compare them with the reference model and
provide targeted feedback to the learner. This can be done using intelligent
tutoring systems, social bots or analytics dashboards. Additionally, blockchain-
backed certificates can be used to facilitate traceability of learning records. The
Kubernetes5 platform, operated by the Research Group for Advanced Community
Information Systems (ACIS), creates a decentralized environment for the devel-
opment, deployment, and monitoring of community-oriented microservices that
include las2peer [7] nodes in a p2p fashion. Part of this cluster are services de-
veloped by the las2peer community to deliver dynamic and adaptive learning
content, providing a socio-technical infrastructure to scale mentoring processes
using distributed artificial intelligence [6].
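
The following hedged Python sketch illustrates two of these stages; the broker address, topic name, LRS endpoint, credentials and verb IRI are placeholder assumptions for illustration, not the MPC's actual configuration. It forwards a sensor reading to Kafka for storage (2) and records a detected movement error as an xAPI statement in an LRS (4), assuming the kafka-python and requests libraries:

```python
# pip install kafka-python requests
import json
import requests
from kafka import KafkaProducer

# (2) Data storage: push a multimodal sensor reading into the Kafka cluster.
producer = KafkaProducer(
    bootstrap_servers="kafka.mpc.example:9092",               # hypothetical broker address
    value_serializer=lambda v: json.dumps(v).encode("utf-8"))
producer.send("learner-sensor-data", {                         # hypothetical topic name
    "learner": "learner-42", "sensor": "imu-left-wrist",
    "timestamp": "2021-06-01T10:15:00Z", "acceleration": [0.1, 9.7, 0.3]})
producer.flush()

# (4) Data processing result: record a detected movement error as an xAPI statement.
statement = {
    "actor": {"objectType": "Agent", "mbox": "mailto:learner-42@example.org"},
    "verb": {"id": "http://example.org/verbs/deviated-from-reference",   # hypothetical verb
             "display": {"en-US": "deviated from reference model"}},
    "object": {"objectType": "Activity",
               "id": "http://example.org/activities/arm-swing-exercise"},
    "result": {"success": False,
               "extensions": {"http://example.org/ext/body-part": "left arm"}},
}
resp = requests.post("https://lrs.mpc.example/xapi/statements",           # hypothetical LRS
                     json=statement,
                     headers={"X-Experience-API-Version": "1.0.3"},
                     auth=("lrs-key", "lrs-secret"), timeout=10)
resp.raise_for_status()
```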

    Within the cloud, we rely on a Kubernetes-based solution. This enables a
modern and scalable infrastructure for the decentralized storage of data. AI-based
tools developed by project partners during the course of the project can thus also
be hosted directly by us in the Kubernetes cloud.
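
As a sketch of how such an AI service could be deployed into the cluster programmatically, the following example uses the official Kubernetes Python client; the image name, service name and namespace are hypothetical placeholders:

```python
# pip install kubernetes
from kubernetes import client, config

config.load_kube_config()   # or config.load_incluster_config() when run inside the cluster

container = client.V1Container(
    name="hear-service",
    image="registry.example.org/milki-psy/hear:latest",   # hypothetical container image
    ports=[client.V1ContainerPort(container_port=8080)])

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="hear-service"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"app": "hear-service"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "hear-service"}),
            spec=client.V1PodSpec(containers=[container]))))

# Create the deployment in a hypothetical project namespace.
client.AppsV1Api().create_namespaced_deployment(namespace="milki-psy", body=deployment)
```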




3 https://kafka.apache.org/
4 IEEE 1589-2020 https://standards.ieee.org/standard/1589-2020.html
5 https://kubernetes.io/

4     Conclusions and Outlook

The increasing availability of cloud-based big data solutions facilitates the in-
tegration of learning infrastructures over institutional and national boundaries.
The collaborative exchange of knowledge and the interoperability of the learning
approaches improve the spread of multimodal learning analytics, but also raise
issues of trust in the infrastructures and the complex services that are hosted on
these infrastructures. Learners, instructors and institutional stakeholders need
transparency and traceability of learning records and the further processing of
learning analytics data. In particular, there is an emphasis on self-sovereignty,
which is especially important in the context of storing and processing privacy-
sensitive learner data collected in the course of learning activities. The increased
use of AI algorithms means that machine learning and AI-based services are
becoming part of the infrastructure (MLaaS). Explainable AI is used here as a
communication platform between humans and AI to improve the usability of AI-
assisted mentoring processes and can be used to provide personalized and direct
learning feedback. Blockchain approaches support data-protection-compliant and
traceable management of learner data, even when a learner changes learning
institutions. Thus, to verify the use of complex AI systems, all stake-
holders are involved in the design of machine learning and blockchains.
    In the project context, the Multimodal Learning Analytics Pipeline will be
used to support the development of psychomotor skills with artificial intelligence.
In particular, this will be investigated in two application domains: sports and
complex processes in human-robot interaction. The goal of the MPC is to ac-
quire, store, process, and display real-time multimodal data to promote digital
learning of psychomotor activities such as human-robot interaction or sports. In
cooperation with project partners, the organizational basis for the creation of a
distributed cloud development and learning platform (MPC), with special
involvement of learners and end users (DevOpsUse), will be created
using an agile development process. The use of OpenID creates an open learning
environment that enables secure and easy access to the provided services. The
multimodal learning data is collected from different sensors, cameras, and learn-
ing systems via data brokers and processed in the MPC. To create a reference
model, motion profiles are recorded by experts and analyzed in the system. These
are then compared together with the raw sensor data by an AI-based Human
Erroneous Activity Recognition mechanism to detect differences between the
learner’s activities and the targeted learning goal and provide contextual feed-
back. Cloud-, Fog- and Edge-based machine learning models will be compared
with client-side solutions in the future.


References

 1. Alammary, A., Alhazmi, S., Almasri, M., Gillani, S.: Blockchain-Based Applica-
    tions in Education: A Systematic Review. Applied Sciences 9(12), 2400 (2019).
    https://doi.org/10.3390/app9122400

 2. Clow, D.: An overview of learning analytics. Teaching in Higher Education 18(6),
    683–695 (2013). https://doi.org/10.1080/13562517.2013.827653
 3. de Lange, P., Nicolaescu, P., Klamma, R., Koren, I.: DevOpsUse for Rapid Training
    of Agile Practices Within Undergraduate and Startup Communities. In: Verbert,
    K., Sharples, M., Klobučar, T. (eds.) Adaptive and Adaptable Learning. Lecture
    Notes in Computer Science, vol. 9891, pp. 570–574. Springer International Pub-
    lishing, Cham (2016). https://doi.org/10.1007/978-3-319-45153-4_65
 4. Di Mitri, D., Schneider, J., Klemke, R., Specht, M., Drachsler, H.: Read Between
    the Lines. In: Proceedings of the 9th International Conference on
    Learning Analytics & Knowledge - LAK19. pp. 51–60. ACM Press, New York, New
    York, USA (2019). https://doi.org/10.1145/3303772.3303776
 5. Holzinger, A., Malle, B., Saranti, A., Pfeifer, B.: Towards multi-modal causabil-
    ity with Graph Neural Networks enabling information fusion for explainable AI.
    Information Fusion 71, 28–37 (2021). https://doi.org/10.1016/j.inffus.2021.01.008
 6. Klamma, R., de Lange, P., Neumann, A.T., Hensen, B., Kravcik, M., Wang, X.,
    Kuzilek, J.: Scaling Mentoring Support with Distributed Artificial Intelligence.
    In: Kumar, V., Troussas, C. (eds.) Intelligent Tutoring Systems, Lecture Notes in
    Computer Science, vol. 12149, pp. 38–44. Springer International Publishing, Cham
    (2020). https://doi.org/10.1007/978-3-030-49663-0_6
 7. Klamma, R., Renzel, D., de Lange, P., Janßen, H.: las2peer – a primer.
    https://doi.org/10.13140/RG.2.2.31456.48645
 8. Maadi, M., Akbarzadeh Khorshidi, H., Aickelin, U.: A Review on Human-
    AI Interaction in Machine Learning and Insights for Medical Applications. In-
    ternational journal of environmental research and public health 18(4) (2021).
    https://doi.org/10.3390/ijerph18042121
 9. Phillips, P.J., Hahn, C.A., Fontana, P.C., Broniatowski, D.A., Przy-
    bocki, M.A.: Four Principles of Explainable Artificial Intelligence (2020).
    https://doi.org/10.6028/NIST.IR.8312-draft
10. Ribeiro, M., Grolinger, K., Capretz, M.A.: MLaaS: Machine Learning
    as a Service. In: 2015 IEEE 14th International Conference on Ma-
    chine Learning and Applications (ICMLA). pp. 896–902. IEEE (2015).
    https://doi.org/10.1109/ICMLA.2015.152
11. Samaniego, M., Jamsrandorj, U., Deters, R.: Blockchain as a Service for IoT.
    In: 2016 IEEE International Conference on Internet of Things (iThings) and
    IEEE Green Computing and Communications (GreenCom) and IEEE Cyber,
    Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData).
    pp. 433–436. IEEE (2016). https://doi.org/10.1109/iThings-GreenCom-CPSCom-
    SmartData.2016.102
12. Slupczynski, M., Klamma, R.: Vorschlag einer Infrastruktur zur verteil-
    ten Datenanalyse von multimodalen psychomotorischen Lerneraktivitäten.
    https://doi.org/10.13140/RG.2.2.23664.58886/1
13. Wild, F., Perey, C., Hensen, B., Klamma, R.: IEEE Standard for Augmented Real-
    ity Learning Experience Models. In: 2020 IEEE International Conference on Teach-
    ing, Assessment, and Learning for Engineering (TALE). pp. 1–3. IEEE (2020).
    https://doi.org/10.1109/TALE48869.2020.9368405