=Paper=
{{Paper
|id=Vol-2610/paper1
|storemode=property
|title=Context-aware Multimodal Learning Analytics Taxonomy
|pdfUrl=https://ceur-ws.org/Vol-2610/paper1.pdf
|volume=Vol-2610
|authors=Maka Eradze,María Jesús Rodríguez-Triana,Mart Laanpere
|dblpUrl=https://dblp.org/rec/conf/lak/EradzeRL20
}}
==Context-aware Multimodal Learning Analytics Taxonomy==
                  Companion Proceedings 10th International Conference on Learning Analytics & Knowledge (LAK20)
          Context-aware Multimodal Learning Analytics Taxonomy
                                                       Maka Eradze
                       School of Digital Technologies, Tallinn University, Estonia
                     Faculty of Engineering, University of Naples Federico II, Italy
                                                      maka@tlu.ee
                               María Jesús Rodríguez Triana, Mart Laanpere
                       School of Digital Technologies, Tallinn University, Estonia
                                           mjrt@tlu.ee, martlaa@tlu.ee
          ABSTRACT: Analysis of learning interactions can happen for different purposes. As
          educational practices increasingly take place in hybrid settings, data from both spaces are
          needed. At the same time, to analyse and make sense of machine-aggregated data afforded
          by Technology-Enhanced Learning (TEL) environments, contextual information is needed. We
          posit that human-labelled (classroom observations) and automated observations
          (multimodal learning data) can enrich each other. Researchers have suggested learning
          design (LD) for contextualisation, the availability of which is often limited in authentic
          settings. This paper proposes a Context-aware MMLA Taxonomy, where we categorize
          systematic documentation and data collection within different research designs and
          scenarios, paying special attention to authentic classroom contexts. Finally, we discuss
          further research directions and challenges.
          Keywords: multimodal learning analytics, human-labelled observations, automated
          observations, classroom observations, technology-enhanced classrooms, learning design,
          context
1         INTRODUCTION AND BACKGROUND
As teaching and learning processes most often take place in blended learning settings, different
data sources and collection methods are needed to create a holistic picture of the educational
context and to analyse these processes for different purposes. Learning interaction (between people
and/or with artefacts) has been an important part of educational research. While some decades ago
researchers focused on face-to-face interactions and used traditional data-collection techniques
such as observations, technological advancements shifted the attention of Technology-Enhanced
Learning (TEL) researchers towards digital interactions, as illustrated by the emergence of the
Learning Analytics (LA) community. Thus, both research trends often cover only one part of the
educational process due to the data available. The Multimodal Learning Analytics (MMLA) field
emerged as a response to this need, combining different data sources from different spaces, e.g.,
with the help of sensors, EEG devices, etc. At the same time, to guide the data collection and
analysis process, human inference and contextual information (such as learning designs, where
teachers report their intentions, actors, roles, media use and other information about the learning
context) are often needed (Hernández-Leo, Rodriguez Triana, Inventado, & Mor, 2017). To address
this need, previous research proposes to benefit from the synergistic relationship between learning
design (LD) and LA, where LD contextualizes data analysis and LA informs LD.
Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
The Learning Analytics (LA) community emerged with the widespread adoption of digital learning
platforms, mainly focusing on the analysis of digital interactions (Ochoa & Worsley, 2016). However,
depending on the learning activity, meaningful interactions may not be tracked in these spaces, and
narrowing the interaction analysis down to the data available in the digital platforms can lead to a
streetlight effect (Freedman, 2010). To respond to this limitation, the Multimodal Learning
Analytics (MMLA) community promotes the collection and analysis of different data sources across
spaces (Blikstein & Worsley, 2016). Typically, MMLA datasets include not only log data, but also
data generated by sensors located in mobile and wearable devices (Ochoa & Worsley, 2016). To make
sense of MMLA data, human-mediated labelling is often used to relate raw data to more abstract
constructs (Worsley et al., 2016; Di Mitri, Schneider, Klemke, Specht, & Drachsler, 2019). At the
same time, analytics approaches need theory (Joksimović, Kovanović, & Dawson, 2019) to create a
hypothesis space (Di Mitri, Schneider, Specht, & Drachsler, 2018). Moreover, contextual information
such as the learning design can guide the data collection and interpretation (Lockyer & Dawson,
2011; Rodríguez-Triana, Martínez-Monés, Asensio-Pérez, & Dimitriadis, 2013). However, it is worth
noting that in authentic settings the LD may not be available due to different limitations and
adoption problems (Dagnino, Dimitriadis, Pozzi, Asensio-Pérez, & Rubia-Avi, 2018; Lockyer,
Heathcote, & Dawson, 2013; Mangaroska & Giannakos, 2018).
Traditional human-mediated data collection methods, such as observations, can also respond to the
aforementioned need for contextual information, as they are inherently highly contextual. Through
observational methods, quantitative and qualitative data can be systematically collected and
analysed (Cohen, Manion, & Morrison, 2018; Eradze, Rodríguez Triana, & Laanpere, 2017). However,
despite the richness of observational data, several constraints prevent researchers and practitioners
from applying them (e.g., time-consuming data collection and analysis, intrusiveness, difficulties
in registering fine-grained events or multiple simultaneous events, etc.). Therefore, educational
research and practice may benefit from aligning traditional (human-labelled) and modern (automated)
classroom observations; thanks to the evidence collected from the physical space, observations can
support the triangulation, contextualization and sensemaking of MMLA data. On the one hand,
observations could address the contextual and methodological needs of MMLA; on the other hand,
MMLA could alleviate the complexity and workload of human-driven observations by enriching the
data, speeding up the observation process through automation, or gathering evidence on indicators
unobservable to the human eye, as already indicated by previous authors (Anguera, Portell,
Chacón-Moscoso, & Sanduvete-Chaves, 2018; Bryant et al., 2017). Furthermore, technological
solutions may reinforce the use of specific coding schemas, contributing to the quality and
availability of the data, speed up the observation process (Kahng & Iwata, 1998), and enhance the
validity and reliability of data (Ocumpaugh et al., 2015).
Based on the community challenges and concerns overviewed above, and to provide both a holistic
picture of teaching and learning processes and a systematic picture of the use of MMLA in different
scenarios, this research connects two research paradigms (traditional and modern) based on
systematic human-labelled and automated observations. More concretely, we explore synergies between
these two approaches in authentic, blended, TEL classroom settings. Also, to reinforce the
contextualization, we propose to use the LD whenever it is available, as it reflects the pedagogical
grounding and the teacher intentions behind the activity. Connecting these three factors
(human-mediated observations, automated observations and LD contextualization) is not a trivial
task, and special attention needs to be paid to the specificities, meaning, affordances, constraints
and quality of the data sources, as well as to LD availability challenges.
To envision the data collection and documentation process, we propose a Context-Aware MMLA
Taxonomy. The presented taxonomy classifies different research designs depending on how
systematic the documentation of the learning design and the data collection have been. The
following section presents the taxonomy, and the final section closes the paper with a discussion
of further research directions and challenges.
2         CONTEXT-AWARE MULTIMODAL LEARNING ANALYTICS TAXONOMY
To provide a contextualized and holistic view of the teaching and learning activities taking place in
TEL classrooms, connecting two research paradigms (Daniel, 2019), this paper proposes a
Context-aware MMLA Taxonomy to support the alignment of LD, human-labelled observations and
automated observations (MMLA). In this taxonomy, in line with previous research pointing to LA
adoption challenges (Buckingham Shum, Ferguson, & Martinez-Maldonado, 2019), we regard authentic
learning contexts as the baseline, anchoring scenario. The taxonomy (Figure 1) classifies
human-labelled and automated data collection on two axes: systematic documentation and systematic
data collection. These two axes represent context-awareness (systematic documentation) and rigorous
quantitative classroom observation data collection (systematic data collection), which together
enable the alignment of data sources and rich MMLA analysis.
Ideal - Systematic documentation and data collection: In the most desirable case, the learning
design (including actors, roles, resources, activities, timeline, and learning objectives) is set up
front and documented in an authoring tool. Then, during the enactment, logs are collected
automatically from the digital space and systematic observations from the physical one. In addition,
the structure of the lesson as enacted is inferred through unstructured observations, adding a
further layer of context. To ensure interoperability, actors and objects should be identifiable
(across the learning design, logs and observations) and a timestamp should be registered for each
event. Once the data are aggregated in a multimodal dataset, further analysis can be executed.
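The interoperability requirements of the ideal scenario (shared actor identifiers and a timestamp per event) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Event` and `PlannedActivity` classes, the field names and the sample data are hypothetical, and the learning design is assumed to be reducible to a list of timed activities.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """One timestamped event; hypothetical schema shared by logs and observations."""
    timestamp: datetime
    actor_id: str   # must be the same identifier in the LD, logs and observations
    action: str
    source: str     # "log" (digital space) or "observation" (physical space)


@dataclass(frozen=True)
class PlannedActivity:
    """One activity of the documented learning design (assumed to carry a time window)."""
    name: str
    start: datetime
    end: datetime


def activity_at(ts: datetime, design: List[PlannedActivity]) -> Optional[str]:
    """Return the planned activity whose window contains ts, or None."""
    for activity in design:
        if activity.start <= ts < activity.end:
            return activity.name
    return None


def build_multimodal_dataset(
    logs: List[Event], observations: List[Event], design: List[PlannedActivity]
) -> List[Tuple[Event, Optional[str]]]:
    """Merge both sources on a common timeline and attach the LD context to each event."""
    merged = sorted(logs + observations, key=lambda e: e.timestamp)
    return [(event, activity_at(event.timestamp, design)) for event in merged]


# Hypothetical sample data for one lesson.
design = [
    PlannedActivity("introduction", datetime(2020, 3, 23, 9, 0), datetime(2020, 3, 23, 9, 10)),
    PlannedActivity("group work", datetime(2020, 3, 23, 9, 10), datetime(2020, 3, 23, 9, 40)),
]
logs = [Event(datetime(2020, 3, 23, 9, 12), "student-07", "opened quiz", "log")]
observations = [
    Event(datetime(2020, 3, 23, 9, 5), "teacher-01", "explains task", "observation"),
    Event(datetime(2020, 3, 23, 9, 45), "student-07", "asks question", "observation"),
]

dataset = build_multimodal_dataset(logs, observations, design)
for event, activity in dataset:
    print(event.timestamp.isoformat(), event.source, event.actor_id, event.action, activity)
```

Events that fall outside every planned activity keep `None` as context, which makes deviations between the design and its enactment visible rather than hiding them.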
                                    Figure 1: Context-Aware MMLA Taxonomy
Authentic (baseline) - Non-systematic documentation but systematic data collection: We regard
this level as a compromise that accommodates the limitations of authentic settings while remaining
rich in terms of data. Here, the predefined learning design cannot be automatically used to guide
the analysis (either because of its format or because it is not available). Instead, the timestamped
lesson structure is inferred by the observer. Without a usable learning design, the actors are not
identifiable across observations and digital traces. Nevertheless, both structured observations and
logs are systematically gathered and stored in a Learning Record Store using a common format (e.g.,
xAPI). These conditions enable the application of contextualized analysis at a baseline level, using
multimodal analytics.
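As an illustration of the common format mentioned above, the following Python sketch serializes one human-labelled observation as an xAPI-style statement (actor, verb, object, timestamp, context). The `example.org` IRIs, the verb and object names, and the `lesson-phase` extension are placeholders chosen for this example and are not taken from the paper or from a published xAPI profile. Since individual actors are not identifiable in this scenario, the actor is recorded at role level.

```python
import json
from datetime import datetime, timezone

BASE = "http://example.org/xapi"  # placeholder namespace, an assumption of this sketch


def observation_to_xapi(role: str, verb: str, target: str,
                        when: datetime, lesson_phase: str) -> dict:
    """Build an xAPI-style statement for one coded classroom observation."""
    return {
        # Role-level actor: individuals are not identifiable in the baseline scenario.
        "actor": {
            "objectType": "Agent",
            "account": {"homePage": f"{BASE}/roles", "name": role},
        },
        "verb": {"id": f"{BASE}/verbs/{verb}", "display": {"en-US": verb}},
        "object": {
            "objectType": "Activity",
            "id": f"{BASE}/objects/{target}",
            "definition": {"name": {"en-US": target}},
        },
        "timestamp": when.isoformat(),
        # The lesson phase inferred by the observer travels as a context extension.
        "context": {"extensions": {f"{BASE}/extensions/lesson-phase": lesson_phase}},
    }


statement = observation_to_xapi(
    role="teacher",
    verb="explained",
    target="task-instructions",
    when=datetime(2020, 3, 23, 9, 5, tzinfo=timezone.utc),
    lesson_phase="introduction",
)
print(json.dumps(statement, indent=2))
```

Log events from the digital space could be expressed in the same shape, so that both sources can be stored side by side and queried by timestamp and lesson phase.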
Limited - Non-systematic documentation or data collection: Data collection happens
non-systematically. As in the previous case, no information about the learning design is available
(i.e., actors are not known). In terms of the design of the data collection, the observation
protocol with its corresponding codes may not be predefined, and semi-structured (non-systematic)
observations are used. Thus, even if logs are systematically gathered, the lack of systematization
of the observations hinders the application of multimodal data analysis. Although this is not an
advisable scenario, logs and observations can be analysed independently and still provide an
overview of what happened in the physical and digital planes. Moreover, even if observations are
done systematically, the potential of the multimodal analysis could be limited if the vocabulary
(actors, objects and actions) is not agreed across datasets.
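The three levels described above can be summarized as a decision over the two axes of the taxonomy. The Python sketch below is one possible reading, not a definition given by the authors: the names `Level` and `classify` are hypothetical, and placing the combination of systematic documentation with non-systematic data collection under "limited" is an assumption based on the "or" in that level's heading.

```python
from enum import Enum


class Level(Enum):
    IDEAL = "ideal"
    AUTHENTIC = "authentic (baseline)"
    LIMITED = "limited"


def classify(systematic_documentation: bool, systematic_data_collection: bool) -> Level:
    """Map a research design onto the taxonomy using its two axes."""
    if not systematic_data_collection:
        # Assumption: without systematic data collection the scenario is limited,
        # regardless of how well the learning design is documented.
        return Level.LIMITED
    if systematic_documentation:
        return Level.IDEAL
    return Level.AUTHENTIC


# A design with systematic observations and logs but no machine-readable LD.
print(classify(systematic_documentation=False, systematic_data_collection=True).value)
```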
3         DISCUSSION, CHALLENGES AND FUTURE RESEARCH
This paper overviewed current challenges in the MMLA community regarding data contextualization
and sense-making needs, especially in authentic learning scenarios. Based on these challenges and
problems, we suggested aligning modern and traditional data collection methods (human-labelled
and automated) with LD. As researchers and practitioners need to take authentic learning settings
into account in MMLA data collection, we proposed the Context-aware MMLA Taxonomy to classify
different levels of data collection and documentation for different research designs. It is worth
noting that we also created specific conceptual and technological tools (Eradze & Laanpere, 2017;
Eradze, Rodríguez-Triana, & Laanpere, 2017). Both the taxonomy and the tools have been evaluated in
authentic settings (corresponding to the baseline scenario) through an iterative analysis of
multimodal data (human-labelled and automated observations) involving different qualitative
sources such as teacher reflections and qualitative observations. Preliminary results show that, in
authentic settings, the baseline scenario was useful for two-level contextualization: the observed
lesson structure and the human-labelled observations. At the same time, in this specific case,
systematic human-labelled observations introduced additional semantics and pedagogical constructs,
and indicated the potential of using theoretical constructs in the automated observation datasets
through (validated) coding schemas. This factor further contributes to the creation of a hypothesis
space.
However, to enable the alignment of MMLA observations and LD in ideal scenarios (see Figure 1), and
to facilitate the adoption of MMLA in the context of classroom observations by end users,
sense-making and analysis need further support so that MMLA data can yield actionable insights. To
reach that goal, it would be necessary to create MMLA architectures and pipelines that integrate
MMLA data and visualize them in a dashboard. In this regard, the ongoing MMLA research efforts
(Schneider, Di Mitri, Limbu, & Drachsler, 2018; Shankar et al., 2019) look very promising. At
the same time, further research is needed on the pedagogically grounded and theory-driven
analysis of data and on understanding how the Context-aware MMLA Taxonomy and the related
solutions can inform teaching practice.
REFERENCES
Anguera, M. T., Portell, M., Chacón-Moscoso, S., & Sanduvete-Chaves, S. (2018). Indirect observation
      in everyday contexts: Concepts and methodological guidelines within a mixed methods
      framework. Frontiers in Psychology, 9, 13. https://doi.org/10.3389/fpsyg.2018.00013
Bryant, J. A., Liebeskind, K., & Gestin, R. (2017). Observational Methods. In The International
      Encyclopedia of Communication Research Methods (pp. 1–10).
      https://doi.org/10.1002/9781118901731.iecrm0171
Blikstein, P., & Worsley, M. (2016). Multimodal Learning Analytics and Education Data Mining: Using
      computational technologies to measure complex learning tasks. Journal of Learning Analytics,
      3(2), 220–238. https://doi.org/10.18608/jla.2016.32.11
Buckingham Shum, S., Ferguson, R., & Martinez-Maldonado, R. (2019). Human-Centred Learning
      Analytics. Journal of Learning Analytics, 6(2), 1–9. https://doi.org/10.18608/jla.2019.62.1
Cohen, L., Manion, L., & Morrison, K. (2018). Research methods in education (8th ed.). Routledge.
      https://doi.org/10.1080/19415257.2011.643130
Dagnino, F. M., Dimitriadis, Y. A., Pozzi, F., Asensio-Pérez, J. I., & Rubia-Avi, B. (2018). Exploring
      teachers’ needs and the existing barriers to the adoption of Learning Design methods and
      tools: A literature survey. British Journal of Educational Technology, 49(6), 998–1013.
      https://doi.org/10.1111/bjet.12695
Daniel, B. K. (2019). Big Data and data science: A critical review of issues for educational research.
      British Journal of Educational Technology. https://doi.org/10.1111/bjet.12595
Di Mitri, D., Schneider, J., Klemke, R., Specht, M., & Drachsler, H. (2019). Read Between the Lines: An
      Annotation Tool for Multimodal Data for Learning. Proceedings of the 9th International
      Conference on Learning Analytics & Knowledge, 51–60. ACM.
Di Mitri, D., Schneider, J., Specht, M., & Drachsler, H. (2018). From signals to knowledge: A
      conceptual model for multimodal learning analytics. Journal of Computer Assisted Learning,
      34(4), 338–349. https://doi.org/10.1111/jcal.12288
Eradze, M., & Laanpere, M. (2017). Lesson observation data in learning analytics datasets:
     Observata. In É. Lavoué, H. Drachsler, K. Verbert, J. Broisin, & M. Pérez-Sanagustín (Eds.),
     Data Driven Approaches in Digital Education. EC-TEL 2017. Lecture Notes in Computer Science,
     Vol. 10474 (pp. 504–508). Springer, Cham. https://doi.org/10.1007/978-3-319-66610-5_50
Eradze, M., Rodríguez-Triana, M. J., & Laanpere, M. (2017). Semantically Annotated Lesson
     Observation Data in Learning Analytics Datasets: a Reference Model. Interaction Design and
     Architecture(s) Journal, 33, 75–91. Retrieved from
     http://www.mifav.uniroma2.it/inevent/events/idea2010/doc/33_4.pdf
Eradze, M., Rodríguez-Triana, M. J., & Laanpere, M. (2017). How to aggregate lesson observation
     data into learning analytics dataset? Joint Proceedings of the 6th Multimodal Learning Analytics
     (MMLA) Workshop and the 2nd Cross-LAK Workshop Co-Located with 7th International
     Learning Analytics and Knowledge Conference (LAK 2017), CEUR Workshop Proceedings, Vol. 1828,
     74–81. CEUR.
Freedman, D. H. (2010). Why scientific studies are so often wrong: The streetlight effect. Discover
     Magazine, 26.
Hernández-Leo, D., Rodriguez Triana, M. J., Inventado, P. S., & Mor, Y. (2017). Preface: Connecting
     Learning Design and Learning Analytics. Interaction Design and Architecture(s) Journal, 33,
     3–8. Retrieved from https://infoscience.epfl.ch/record/231720
Joksimović, S., Kovanović, V., & Dawson, S. (2019). The Journey of Learning Analytics. HERDSA
     Review of Higher Education, 6, 27–63.
Kahng, S., & Iwata, B. A. (1998). Computerized systems for collecting real‐time observational data.
     Journal of Applied Behavior Analysis, 31(2), 253–261.
Lockyer, L., & Dawson, S. (2011). Learning designs and learning analytics. Proceedings of the 1st
     International Conference on Learning Analytics and Knowledge - LAK ’11, 153.
     https://doi.org/10.1145/2090116.2090140
Lockyer, L., Heathcote, E., & Dawson, S. (2013). Informing pedagogical action: Aligning learning
     analytics with learning design. American Behavioral Scientist, 57(10), 1439–1459.
Mangaroska, K., & Giannakos, M. N. (2018). Learning analytics for learning design: A systematic
     literature review of analytics-driven design to enhance learning. IEEE Transactions on Learning
     Technologies, 1–1. https://doi.org/10.1109/TLT.2018.2868673
Ochoa, X., & Worsley, M. (2016). Augmenting Learning Analytics with Multimodal Sensory Data.
     Journal of Learning Analytics, 3(2), 213–219.
Ocumpaugh, J., Baker, R. S., Rodrigo, M. M., Salvi, A., van Velsen, M., Aghababyan, A., & Martin, T.
     (2015). HART. Proceedings of the 33rd Annual International Conference on the Design of
     Communication - SIGDOC ’15, 1–6. https://doi.org/10.1145/2775441.2775480
Rodríguez-Triana, M. J., Martínez-Monés, A., Asensio-Pérez, J. I., & Dimitriadis, Y. (2013). Towards a
     script-aware monitoring process of computer-supported collaborative learning scenarios.
     International Journal of Technology Enhanced Learning, 5(2), 151–167.
     https://doi.org/10.1504/IJTEL.2013.059082
Schneider, J., Di Mitri, D., Limbu, B., & Drachsler, H. (2018). Multimodal learning hub: A tool for
     capturing customizable multimodal learning experiences. European Conference on Technology
     Enhanced Learning, 45–58. Springer.
Shankar, S. K., Ruiz-Calleja, A., Serrano-Iglesias, S., Ortega-Arranz, A., Topali, P., & Martínez-Monés,
     A. (2019). A Data Value Chain to Model the Processing of Multimodal Evidence in Authentic
     Learning Scenarios.
Worsley, M., Abrahamson, D., Blikstein, P., Grover, S., Schneider, B., & Tissenbaum, M. (2016).
     Situating multimodal learning analytics. 12th International Conference of the Learning Sciences:
     Transforming Learning, Empowering Learners, ICLS 2016, 1346–1349. International Society of
     the Learning Sciences (ISLS).