<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Generative Multimodal Analysis (GMA) for Learning Process Data Analytics</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ridwan Whitehead</string-name>
          <email>ridwan.whitehead@oulu.fi</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andy Nguyen</string-name>
          <email>Andy.nguyen@oulu.fi</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sanna Järvelä</string-name>
          <email>sanna.jarvela@oulu.fi</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Learning and Educational Technology (LET) Research Lab, University of Oulu</institution>
          ,
          <addr-line>Pentti Kaiteran katu 1, Oulu</addr-line>
          ,
          <country country="FI">Finland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper introduces Generative Multimodal Analysis (GMA), a novel method for applying Generative Artificial Intelligence (GenAI) to the analysis of multimodal data derived from learning processes. The method is encapsulated in a systematic framework that integrates GenAI technology, in particular multimodal large language models (MLLMs), into multimodal learning analytics. The recent emergence and advancement of GenAI, particularly MLLMs, has opened new avenues for the automated interpretation and meaningful analysis of varied data sources. Current research in the field has explored diverse applications of GenAI in transforming learning and teaching practice. However, there is a noticeable gap in systematic methodologies for applying GenAI to learning process data. This paper aims to bridge this gap by proposing the GMA method for multimodal learning analytics with learning process data. In addition to the methodological framework, this study presents an operational prototype for the practical implementation of GMA, which serves as a tool for examining multimodal data in learning processes. To demonstrate the applicability of the proposed method, we conducted a case study and present it here. Our approach offers guidance for learning scientists and educational technology application developers, reflecting contemporary trends and needs in educational technologies. By providing a structured approach for employing GenAI in learning process data analysis, this study contributes to the advancement of learning analytics methods.</p>
      </abstract>
      <kwd-group>
        <kwd>Generative Artificial Intelligence</kwd>
        <kwd>Learning Analytics</kwd>
        <kwd>Multimodal Data</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The rapid evolution of educational technologies and learning sciences, particularly the groundbreaking
strides in artificial intelligence (AI) technology, has heralded an era where data-driven insights have
become valuable in enhancing learning processes. This intersection of AI with educational
methodologies is reshaping how information is analyzed, interpreted, and applied to improve teaching
strategies and learning outcomes. For instance, Luckin and Cukurova [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] highlighted the potential of
AI to provide personalized learning experiences, adapting to individual learners' needs. Likewise,
Holmes et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] discussed the transformative role of AI in education, particularly in providing insights
into learning patterns. Recently, Järvelä et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] proposed a human-AI collaboration approach for
better unfolding the learning processes in the context of socially shared regulation of learning. Despite
these advancements, the application of AI in learning analytics has remained largely fragmented. Current
research predominantly focuses on isolated applications of AI in educational contexts [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], lacking
a comprehensive methodology for systematic analysis. This gap is particularly apparent in the sphere
of generative Artificial Intelligence (GenAI), where its potential for learning process data analytics
remains underexplored.
      </p>
      <p>
        The advent of GenAI and its integration into educational contexts, particularly through multimodal
large language models (MLLMs), represents a significant leap in the domain of educational technology.
GenAI, a specialized subset of artificial intelligence that focuses on generating new content, has seen
rapid evolution over the past decade. A key development in this evolution has been the creation of
sophisticated MLLMs, such as OpenAI's GPT (Generative Pre-trained Transformer) series, which
exemplify the advancements in natural language processing and understanding. The release of
ChatGPT, based on the GPT architecture, in November 2022 represented
a significant milestone [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. ChatGPT, with its advanced language understanding and generation
capabilities, offered a more interactive and intuitive way for users to engage with AI. Furthermore,
GenAI opens new possibilities in content creation and data analysis. Accordingly, learning analytics,
encompassing the collection, measurement, analysis, and reporting of data about learners and their
contexts, could greatly benefit from the integration of GenAI. This integration enables more nuanced
extraction of insights from diverse learning process data, thereby enhancing our understanding of
student learning behaviors and outcomes.
      </p>
      <p>ORCID: 0009-0002-2888-7304 (R. Whitehead); 0000-0002-0759-9656 (A. Nguyen); 0000-0001-6223-3668 (S. Järvelä)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>
        This paper addresses the identified gap in comprehensive methodologies for GenAI in multimodal
learning analytics and proposes Generative Multimodal Analysis (GMA) as a structured approach to
employ GenAI effectively in this field. Such a methodology is essential due to the growing complexity
of learning environments and the variety of data they produce. As argued by Järvelä and Bannert [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ],
the integration of multimodal data analysis in educational research is crucial for a holistic understanding
of learning processes. As an integral part of this study, we have developed a specialized software tool
designed for GMA. This tool is intended to serve as a practical resource for researchers and educational
developers, providing them with a robust and user-friendly platform to apply GMA methodologies in
their work. The effectiveness and relevance of this software tool are demonstrated through its
application in a case study. This implementation not only showcases the tool's functionality but also
affirms GMA’s efficacy and broad applicability in the field of educational research and development.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Generative Multimodal Analysis (GMA) for Learning Process Data Analytics</title>
      <p>Generative Multimodal Analysis (GMA) represents a comprehensive methodological framework
designed to transform the approach of researchers and analysts in educational settings. This framework
leverages the capabilities of GenAI to expedite and enrich the process of extracting and integrating
verbal and nonverbal elements from learning process data. Furthermore, GMA extends its integration
to include material objects and the environment, capturing how they are utilized by learners engaged in
active interaction within the learning context. By harnessing the power of GenAI, GMA facilitates a
more profound and holistic understanding of the multifaceted nature of learning environments, where
verbal communication, nonverbal cues, physical objects, and the surrounding environment all play
integral roles in the learning process.</p>
      <p>
        Figure 1 provides a visual depiction of the Generative Multimodal Analysis (GMA) Framework,
illustrating its dynamic capabilities in leveraging generative AI for educational research. Within this
framework, generative AI is adept at producing various types of outputs, including 1) predefined events,
2) improvisational events, 3) detailed descriptions of the learning process, and 4) comprehensive
descriptions of the learning context. Crucially, as per the human-AI collaboration approach in research
suggested by Järvelä, Nguyen and Hadwin [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], these AI-generated outputs must be
evaluated and validated by human researchers or analysts. This collaborative approach
grounds the insights offered by AI in human understanding and expertise and helps
establish their reliability.
      </p>
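      <p>To make the four output types and the human validation step concrete, the following minimal Python sketch models one possible record format. It is not the implementation of the GMA Toolkit: the names OutputKind, GeneratedOutput, review, and validated, the timestamp fields, and the sample entries are all hypothetical and chosen only for illustration.</p>
      <preformat preformat-type="code">
```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OutputKind(Enum):
    """The four output types named in the framework description."""
    PREDEFINED_EVENT = "predefined_event"
    IMPROVISATIONAL_EVENT = "improvisational_event"
    PROCESS_DESCRIPTION = "process_description"
    CONTEXT_DESCRIPTION = "context_description"


@dataclass
class GeneratedOutput:
    """One AI-generated item awaiting human review (hypothetical schema)."""
    kind: OutputKind
    content: str
    start_s: float                     # segment start in seconds (assumed unit)
    end_s: float                       # segment end in seconds (assumed unit)
    reviewer: Optional[str] = None     # who reviewed the item, if anyone
    accepted: Optional[bool] = None    # None means "not yet reviewed"

    def review(self, reviewer: str, accepted: bool) -> None:
        """Record a human decision on this output."""
        self.reviewer = reviewer
        self.accepted = accepted


def validated(outputs):
    """Keep only outputs that a human reviewer has explicitly accepted."""
    return [o for o in outputs if o.accepted is True]


# Invented sample entries, for demonstration only.
outputs = [
    GeneratedOutput(OutputKind.PREDEFINED_EVENT, "Learner A leans forward", 12.0, 15.5),
    GeneratedOutput(OutputKind.CONTEXT_DESCRIPTION, "Three learners share one laptop", 0.0, 60.0),
]
outputs[0].review("analyst_1", accepted=True)
outputs[1].review("analyst_1", accepted=False)

kept = validated(outputs)
print([o.kind.value for o in kept])
```
      </preformat>
      <p>Keeping the review decision on the record itself means that downstream analyses can be restricted to outputs a human has accepted, in line with the human-AI collaboration approach described above.</p>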
      <p>Furthermore, the information generated through the GMA Framework can be dissected and
examined through various analytical lenses. These include a) process-oriented analysis, which focuses
on the dynamics and phases of the learning process; b) quantitative modeling, offering a statistical
perspective and uncovering patterns and correlations; and c) qualitative inquiry, which delves into the
deeper, nuanced aspects of the learning environment and experiences. This multifaceted analytical
approach allows for a comprehensive and multi-dimensional understanding of the learning process,
harnessing the strengths of both AI and human analysis.</p>
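      <p>As a small illustration of the process-oriented and quantitative lenses, the sketch below counts how often each label occurs and how often one label directly follows another in a sequence of validated events. The posture labels and the sequence are invented for the example, and the helper names event_frequencies and transition_counts are assumptions rather than part of GMA.</p>
      <preformat preformat-type="code">
```python
from collections import Counter


def event_frequencies(events):
    """Count how often each event label occurs (a simple quantitative view)."""
    return Counter(events)


def transition_counts(events):
    """Count first-order transitions between consecutive labels (a process view)."""
    return Counter(zip(events, events[1:]))


# Hypothetical, hand-written sequence of validated labels for one learner.
sequence = ["upright", "leaning_forward", "upright", "leaning_forward", "leaning_back"]

freq = event_frequencies(sequence)
trans = transition_counts(sequence)

print(freq.most_common())
print(trans[("upright", "leaning_forward")])
```
      </preformat>
      <p>Frequencies of this kind could feed statistical models, while transition counts are one possible starting point for sequence or process analyses; qualitative inquiry would instead return to the underlying descriptions and recordings.</p>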
      <p>As an illustration, Figure 2 shows how GMA is applied to automatically extract specific
predefined events. In this case, the focus is on identifying and analysing the non-verbal
posture states of learners engaged in a collaborative learning setting. The figure shows how
GMA discerns and categorizes non-verbal cues, particularly learners' postures, which play a
significant role in understanding engagement, interaction dynamics, and the overall
effectiveness of collaborative learning processes. The example highlights the capacity of
GMA to recognize and interpret subtle yet important aspects of learner behaviour in
collaborative learning.</p>
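      <p>One way to think about extracting predefined events such as posture states is as a closed-vocabulary labelling task: the prompt lists the allowed codes, and any label outside that list is set aside for human review. The sketch below illustrates this idea only. The posture codes are hypothetical (the categories used in the study are not listed here), fake_model is a stand-in that returns a canned reply instead of calling a real MLLM, and the JSON reply format is an assumption.</p>
      <preformat preformat-type="code">
```python
import json

# Hypothetical codebook; the actual posture categories are not specified.
POSTURE_CODES = ("upright", "leaning_forward", "leaning_back")


def build_prompt(codes):
    """Ask the model to label each learner with exactly one predefined code."""
    return (
        "For each learner visible in the frame, choose one posture from: "
        + ", ".join(codes)
        + '. Answer as JSON, e.g. {"learner_1": "upright"}.'
    )


def parse_postures(raw, codes):
    """Parse the JSON reply, separating labels in the codebook from the rest."""
    reply = json.loads(raw)
    valid, rejected = {}, {}
    for learner, label in reply.items():
        if label in codes:
            valid[learner] = label
        else:
            rejected[learner] = label   # left for a human analyst to inspect
    return valid, rejected


def fake_model(prompt, frame):
    """Stand-in for an MLLM call; returns a canned reply for demonstration."""
    return '{"learner_1": "leaning_forward", "learner_2": "slouching"}'


raw = fake_model(build_prompt(POSTURE_CODES), frame=None)
valid, rejected = parse_postures(raw, POSTURE_CODES)
print(valid)
print(rejected)
```
      </preformat>
      <p>Separating out-of-codebook labels rather than discarding them keeps a trace of what the model produced, which supports the human evaluation and validation step of the framework.</p>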
      <p>Figure 3 showcases the user interface of our newly developed Generative Multimodal Analysis (GMA)
Toolkit for describing video data. This interface is specifically designed to facilitate direct interaction
with video data, particularly focusing on observational data from learning processes.
The example presented within the figure provides a detailed demonstration of how the toolkit can be
utilized to analyze observational video data. Specifically, it illustrates an analysis generated by the
toolkit from a video capturing a collaborative learning session. This visual representation highlights the
toolkit’s capabilities in processing and interpreting complex, real-time learning environments, thereby
offering valuable insights into the dynamics of collaborative learning.</p>
    </sec>
    <sec id="sec-4">
      <title>3. Discussion and Future Research Directions</title>
      <p>Future studies will significantly contribute to enhancing the utility and scalability of GMA, thereby
advancing its role in educational technology, learning analytics, and learning sciences.</p>
    </sec>
    <sec id="sec-5">
      <title>4. Acknowledgements</title>
      <p>This research has been funded by the Research Council of Finland (also known as the Academy of Finland)
through grant 350249 and the University of Oulu profiling project Profi7 Hybrid Intelligence (grant 352788).</p>
    </sec>
    <sec id="sec-6">
      <title>5. References</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>Luckin</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Cukurova</surname>
          </string-name>
          , '
          <article-title>Designing educational technologies in the age of AI: A learning sciences-driven approach'</article-title>
          ,
          <source>British Journal of Educational Technology</source>
          , vol.
          <volume>50</volume>
          , no.
          <issue>6</issue>
          , pp.
          <fpage>2824</fpage>
          -
          <lpage>2838</lpage>
          , Nov.
          <year>2019</year>
          , doi: 10.1111/bjet.12861.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>W.</given-names>
            <surname>Holmes</surname>
          </string-name>
          et al.,
          <article-title>'Ethics of AI in Education: Towards a Community-Wide Framework'</article-title>
          ,
          <source>Int J Artif Intell Educ</source>
          , Apr.
          <year>2021</year>
          , doi: 10.1007/s40593-021-00239-1.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Järvelä</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Hadwin</surname>
          </string-name>
          , '
          <article-title>Human and artificial intelligence collaboration for socially shared regulation in learning'</article-title>
          ,
          <source>British Journal of Educational Technology</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>H.</given-names>
            <surname>Crompton</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Burke</surname>
          </string-name>
          , '
          <article-title>Artificial intelligence in higher education: the state of the field'</article-title>
          ,
          <source>International Journal of Educational Technology in Higher Education</source>
          , vol.
          <volume>20</volume>
          , no.
          <issue>1</issue>
          , p.
          <fpage>22</fpage>
          ,
          Apr.
          <year>2023</year>
          , doi: 10.1186/s41239-023-00392-8.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>O.</given-names>
            <surname>Zawacki-Richter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. I.</given-names>
            <surname>Marín</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bond</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Gouverneur</surname>
          </string-name>
          , '
          <article-title>Systematic review of research on artificial intelligence applications in higher education - where are the educators?'</article-title>
          ,
          <source>International Journal of Educational Technology in Higher Education</source>
          , vol.
          <volume>16</volume>
          , no.
          <issue>1</issue>
          , p.
          <fpage>39</fpage>
          ,
          Oct.
          <year>2019</year>
          , doi: 10.1186/s41239-019-0171-0.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Y. K.</given-names>
            <surname>Dwivedi</surname>
          </string-name>
          et al.,
          <article-title>'“So what if ChatGPT wrote it?” Multidisciplinary perspectives on opportunities, challenges and implications of generative conversational AI for research, practice and policy'</article-title>
          ,
          <source>International Journal of Information Management</source>
          , vol.
          <volume>71</volume>
          , p.
          <fpage>102642</fpage>
          , Aug.
          <year>2023</year>
          , doi: 10.1016/j.ijinfomgt.2023.102642.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>E.</given-names>
            <surname>Kasneci</surname>
          </string-name>
          et al.,
          <article-title>'ChatGPT for good? On opportunities and challenges of large language models for education'</article-title>
          ,
          <source>Learning and Individual Differences</source>
          , vol.
          <volume>103</volume>
          , p.
          <fpage>102274</fpage>
          ,
          Apr.
          <year>2023</year>
          , doi: 10.1016/j.lindif.2023.102274.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Järvelä</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Bannert</surname>
          </string-name>
          , '
          <article-title>Temporal and adaptive processes of regulated learning - What can multimodal data tell?'</article-title>
          ,
          <source>Learning and Instruction</source>
          , vol.
          <volume>72</volume>
          , p.
          <fpage>101268</fpage>
          , Apr.
          <year>2021</year>
          , doi: 10.1016/j.learninstruc.2019.101268.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Baker</surname>
          </string-name>
          , '
          <article-title>Challenges for the Future of Educational Data Mining: The Baker Learning Analytics Prizes'</article-title>
          ,
          <source>Journal of Educational Data Mining</source>
          , vol.
          <volume>11</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>17</lpage>
          , Jun.
          <year>2019</year>
          , doi: 10.5281/ZENODO.3554745.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Gardner</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Sheridan</surname>
          </string-name>
          , '
          <article-title>Data Analytics in Higher Education: An Integrated View'</article-title>
          ,
          <source>Journal of Information Systems Education</source>
          , vol.
          <volume>31</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>61</fpage>
          -
          <lpage>71</lpage>
          , Jan.
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Järvelä</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rosé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Järvenoja</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Malmberg</surname>
          </string-name>
          , '
          <article-title>Examining socially shared regulation and shared physiological arousal events with multimodal learning analytics'</article-title>
          ,
          <source>British Journal of Educational Technology</source>
          , vol.
          <volume>54</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>293</fpage>
          -
          <lpage>312</lpage>
          ,
          <year>2023</year>
          , doi: 10.1111/bjet.13280.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. N.</given-names>
            <surname>Ngo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Hong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Dang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.-P. T.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          , '
          <article-title>Ethical principles for artificial intelligence in education'</article-title>
          ,
          <source>Educ Inf Technol</source>
          , vol.
          <volume>28</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>4221</fpage>
          -
          <lpage>4241</lpage>
          ,
          <year>2023</year>
          , doi: 10.1007/s10639-022-11316-w.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>