<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Multimodal Analogies for Science Education</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Shradha Sehgal</string-name>
          <email>ssehgal4@illinois.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bhavya</string-name>
          <email>bhavya2@illinois.edu</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Krishna Phani Datta</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aditi Mallavarapu</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>ChengXiang Zhai</string-name>
          <email>czhai@illinois.edu</email>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Illinois Urbana-Champaign</institution>, <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Analogies are an effective teaching tool for helping students understand new concepts by connecting them to familiar contexts. However, generating analogies that aid students' learning is non-trivial and requires a nuanced understanding that draws meaningful parallels between familiar concepts. Researchers have addressed this challenge by using computational models to generate textual or word-level analogies. We believe that adding visual elements to textual analogical explanations can offer greater comprehension to students than relying solely on textual analogies. Accordingly, we introduce the idea of multimodal analogies: a fusion of textual analogies and their visual counterparts to enhance understanding of scientific concepts. Further, we introduce and explore generating three types of multimodal analogies for science education, namely, general analogies; adaptive analogies tailored to the background, needs, and preferences of learners; and iteratively refined analogies via human-AI and multi-agent collaboration. We leverage models like GPT-4 for text generation, followed by DALL-E-3 for images, and qualitatively analyze the created analogies of each of the three types. Our analysis helps identify some limitations of existing models and pinpoint future research directions in this area. Moreover, we showcase a demo system where students can engage with multimodal analogies and provide feedback. We aim to use this system to garner feedback on the AI-generated analogies and ultimately create a large-scale, high-quality dataset of multimodal analogies for science education.</p>
      </abstract>
      <kwd-group kwd-group-type="author">
        <kwd>Analogies</kwd>
        <kwd>Education</kwd>
        <kwd>Large Language Models</kwd>
        <kwd>Multimodal</kwd>
        <kwd>CEUR-WS</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Analogies are comparisons that highlight similarities
between two different things to clarify or explain concepts
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. They function by transferring knowledge from a
well-understood subject (the source or analogue) to one that is
less familiar (the target) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Analogies are a useful
educational tool, as they have been proven to boost
understanding and critical thinking among students [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. By connecting
new and complex concepts to familiar ones, analogies help
students bridge gaps in knowledge and make the learning
process more effective [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Generating analogies requires
extensive topic knowledge and the ability to think abstractly
and creatively [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Thus, analogies are typically created by
specialists within a field who have comprehensive
knowledge of the concepts [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. To automate this process and
reduce the time for generating analogies, researchers have
studied the automated generation of word-level analogies
like “king:man :: queen:woman” using computational
methods [
        <xref ref-type="bibr" rid="ref7 ref8 ref9">7, 8, 9</xref>
        ]. A few works have investigated creating science
analogies with explanations, but these have been limited to
using only the text modality [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ].
      </p>
      <p>
        Given the effectiveness of visual elements in aiding
student learning [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], we propose to augment explanation-type
textual analogies with image representations to create
multimodal analogies. We believe adding visual components to
textual analogies can enhance the overall understandability
of the content and increase student engagement. Especially
for science concepts, which often involve structural diagrams
and complex relations, visual analogies can help students
understand concepts alongside text.
      </p>
      <sec id="sec-1-1">
        <title>To this end, we explore how to leverage LLMs and difusion models to generate three types of multi-modal analogies, namely general analogies, adaptive analogies tailored</title>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Related</title>
    </sec>
    <sec id="sec-3">
      <title>Work</title>
      <sec id="sec-3-1">
        <title>In this section, we describe related work on computational models of analogies, application of analogies to education, and leveraging LLMs for education.</title>
        <sec id="sec-3-1-1">
          <title>2.1. Computational Models of Analogies</title>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>Computational modeling of analogies refers to the algorithms and models for generating analogies. This section reviews the computational models used to generate analogies of diferent modalities - text and visual.</title>
        <p>
          2.1.1. Text-based analogies
Analogies have predominantly been studied at the
wordlevel, in the form of “A:B::C:D”, such as, “king:man::queen:
woman” [
          <xref ref-type="bibr" rid="ref7 ref8 ref9">7, 8, 9</xref>
          ]. These type of proportional analogies are
commonly used in entrance exams like the SAT or NCEE
to test student understanding. There exist multiple ways
to create word-level analogies, one prominent approach is
the Structural Mapping Engine (SME) [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], which is a
rulebased approach to finding analogies based on structural
representations and attributes of target and source concepts.
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>More recently, deep-learning based methods have also been developed to study such analogies[14, 15, 16].</title>
        <p>CEUR</p>
        <p>ceur-ws.org</p>
        <p>
          However, most of these works focused on word-level and
proportional analogies, only recently have researchers
studied generating explanations using deep learning and LLM
approaches [
          <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
          ]. Work by Bhavya et al [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] is closest
to ours as they use pre-trained language models (PLMs) for
complex analogy generation. However, they broadly study
the applicability of LLMs for text analogy generation.
Instead, our focus is on understanding the potential of the
generated multimodal analogies for science education, a
pivotal application domain.
2.1.2. Visual Analogies
Prior works treat visual analogies similar to word-level
analogies (where A,B,C,D are images). These often include
identifying the missing image portion represented by ”?”
in A:B::C:? [
          <xref ref-type="bibr" rid="ref17 ref18">17, 18, 19</xref>
          ] or stylistic and geometric
transformations between images [
          <xref ref-type="bibr" rid="ref18">19, 18</xref>
          ]. Adding to the line of
research in both textual and visual analogies, our work
focuses on generating multimodal analogies, consisting of a
textual explanation and a visual representation of the same.
We do not relate images, but relate the target and source
concepts, that are depicted in the image. Chakrabarty et al.
(2023) created similar visual metaphors by elaborating on
existing metaphors using LLMs. However, metaphors are
much shorter in length and are more abstract than the
longform explanatory analogies we work with. Additionally,
they employ existing metaphors in their study, whereas we
generate both the textual analogy and the corresponding
image from scratch given a target concept.
        </p>
        <sec id="sec-3-3-1">
          <title>2.2. Applications of Analogies To Education</title>
        </sec>
      </sec>
      <sec id="sec-3-4">
        <title>The application of analogies in education has been exten</title>
        <p>
          sively explored across various disciplines (e.g., science, math,
computer science), highlighting their significant role in
enhancing learning and understanding concepts and language.
Some studies [21, 22] have demonstrated how analogies can
simplify complex concepts and foster problem-solving skills,
particularly in science and mathematics. By linking new
information to pre-existing knowledge, analogies facilitate
deeper comprehension and retention [23, 24]. Vieira et al.
(2022) showcase the innovative use of musical analogies
to teach abstract scientific theories, thereby making
challenging concepts more accessible to students. Collectively,
these studies underscore the efectiveness of analogies as a
powerful educational tool, capable of enhancing student
engagement, understanding, and cognitive development [
          <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
          ].
Considering the numerous advantages of employing
analogies in education and the need for proactively generating
analogies that aid learning in digital interactive
environments, we explore the creation of scientific analogies using
the combination of language and difusion generative
models.
        </p>
        <sec id="sec-3-4-1">
          <title>2.3. LLMs for Education</title>
          <p>Generative AI models (e.g., GPT [26], Claude1, DALL-E
[27], LLama [28]) with billions of parameters that have been
trained on tremendous amounts of data have recently shown
great promise on several tasks including text and image
generation in multiple domains [29, 30]. The success of Large
Language Models (LLMs) across diverse tasks has led
researchers to explore their potential for education as well.</p>
        </sec>
      </sec>
      <sec id="sec-3-5">
        <title>1https://claude.ai</title>
        <p>This includes improving teaching and learning capabilities
using LLMs across several domains [31, 32], such as
personalized learning [33], intelligent tutoring [34], adaptive
assessment [35] and course content generation [36].
Moreover, LLMs can also be utilized to provide automated and
personalized feedback to students [37]. Human-LM
interaction is also being researched in the context of education,
since students and teachers interact with these tutors and
chatbots [38, 39]. Our study builds upon this rich body of
work.</p>
        <p>Educational resource and content creation has emerged as
a key application area where LLMs have been harnessed [40,
41]. However, most work focuses on generating other kinds
of content, such as educational questions [42], explanation,
or assessment[37]. To the best of our knowledge, ours is
the first work to explore multimodal analogy generation for
educational purposes, using LLMs.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. LLM-based Multimodal Analogy Generation</title>
      <sec id="sec-4-1">
        <title>3.1. Analogy Generation Pipeline</title>
        <sec id="sec-4-1-1">
          <title>In this section, we examine the potential of using LLMs and</title>
          <p>difusion models for generating three types of multi-modal
analogies (general, adaptive and iteratively refined).</p>
          <p>For our exploration, we used the text labels in biology
diagrams found on grade 6-12 educational websites2, as
target concepts. We seeded our search with biology diagrams
since they contain visual representations of concepts and
can be shown pictorially in image analogies. We collected
30 biology concepts this way.</p>
          <p>To generate multi-modal (i.e., both text and image)
anlogies, we first use GPT-4 3 to create a text-based analogy and
feed the analogy into DALL-E-34. Some of our exploration
was done via the chat interface on their websites and the
rest programatically via API calls.</p>
        </sec>
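        <p>As a concrete illustration, the two-stage pipeline can be sketched with the openai Python client as follows. This is a minimal sketch based on the description above: the prompt wording mirrors the examples in Section 3.2, and the function name is ours, not released code.</p>
        <preformat>
# Minimal sketch of the GPT-4 -> DALL-E-3 pipeline (assumptions: openai
# Python client v1, OPENAI_API_KEY set in the environment).
from openai import OpenAI

client = OpenAI()

def generate_multimodal_analogy(target, main_topic):
    # Stage 1: GPT-4 writes the text-based analogy.
    chat = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": f"Generate a structural analogy for the biology "
                       f"concept {target} (part of {main_topic}).",
        }],
    )
    text_analogy = chat.choices[0].message.content
    # Stage 2: DALL-E-3 renders the text analogy as an image.
    image = client.images.generate(
        model="dall-e-3",
        prompt="Generate an image representing the scientific analogy "
               "given below.\n\n" + text_analogy,
    )
    return text_analogy, image.data[0].url
        </preformat>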
      </sec>
      <sec id="sec-4-2">
        <title>3.2. General Analogies</title>
        <sec id="sec-4-2-1">
          <title>For use-cases where the intended learners are unknown</title>
          <p>or too broad, an educator might wish to generate general
analogies that are broadly relevant to learners across
gradelevels and backgrounds. To this end, we explored prompts
like the following: “Generate a structural analogy for the
biology concept &lt;target&gt; (part of &lt;main_topic&gt;).” with the
GPT-4 model. We found analogies comparing the structure
or the function (e.g., procedure of a science phenomenon)
of the target and source concepts. For the DALL-E-3 model,
we explored prompts like “Generate an image representing
the scientific analogy given below.” along with the GPT-4
generated text analogy in the input.</p>
          <p>Figure 2 and 3 present examples of the visual analogies
and their corresponding texts. The authors of this paper
(three graduate students with a background in Computer
Science) qualitatively analyzed 30 multimodal analogies,
comparing the text analogies alongside the images. We found
the image analogies to be useful in providing an overview of
the analogy as the text itself can sometimes be too verbose
or complicated to understand. Moreover, the image
analogies help visualize the analogical ideas detailed in the text.</p>
        </sec>
        <sec id="sec-4-2-2">
          <title>2https://byjus.com/biology/important-diagrams/ 3https://openai.com/index/gpt-4-research/ 4https://openai.com/index/dall-e-3/</title>
        </sec>
        <sec id="sec-4-2-3">
          <title>This reinforces the utility of a visual representation along</title>
          <p>side a text-based analogy to enhance its quality. We plan
to release this manually validated dataset of 30 multimodal
analogies of biology concepts for educators and researchers
to use.</p>
          <p>We also found some limitations with this approach. For
example, the labels in images were often incoherent with
the text not being in English. This is a known limitation
of text-to-image models that are not good at rendering text
in images [43]. Image-based analogies could benefit from
better text labels as the images could then explicitly mention
the similar representations between the two topics. Future
work can look at how we can add text labels after image
generation, as they could be useful for students to learn to
draw structural diagrams. Another limitation that emerged
was that the image analogies often represented multiple
surrounding concepts as opposed to just the target concept (for
example, the analogy for stomach also portrays other parts
like esophagus and intestine). Thus, someone unfamiliar
with the concept may not be able to discern which part the
analogy is about. Finally, some images appeared ominous
due to the nature of the target concept. For example,
analogies about the concept ‘eye’ depicted its various subparts
and could appear eerie depending on the audience. Thus,
we recommend that the images be sensitized by educators
before presenting them to the students.</p>
        </sec>
      </sec>
      <sec id="sec-4-3">
        <title>3.3. Adaptive Analogies</title>
        <sec id="sec-4-3-1">
          <title>For more tailored use cases, one might wish to have analogies that are customized to learners’ backgrounds, needs, and preferences (e.g., grade level, interest, cultural background.).</title>
          <p>To this end, we explore prompting GPT-4 to create
gradelevel analogies. Figure 4 shows examples of how source
concepts and analogies generated for the same scientific
concept can be diferent based on the grade-level context.
We found a diference in the relatability and complexity
of the analogies based on the knowledge levels, thereby
suggesting that it is possible to create more personalized
and contextually appropriate analogies. We envision
encoding the knowledge level for diferent grades in a language
model, to generate these customized analogies. This can be
done through providing knowledge of diferent subject and
(a) Analogy for the concept ‘States of Matter’ for Grade 3
students - image generated by AI model DALL-E-3: Imagine
matter as diferent types of snacks. Solids are like a bar of
chocolate — firm and holding its shape. Liquids are like a
smoothie — you can pour it and it takes the shape of its
container, but it’s still touchable. Gases are like the steam from
a hot bowl of soup — you can see it moving freely into the air,
and it doesn’t keep its shape at all.
(b) Analogy for the concept ‘States of Matter’ for Grade 12
students - image generated by AI model DALL-E-3:
Consider matter as if it were a crowd at diferent types of events.
At a lecture, attendees sit close together, mostly stationary,
like particles in a solid. At a networking event, people move
around the room, mingling and shifting positions, similar to
the movement of particles in a liquid. At a festival, attendees
are spread out, moving freely around a large space, akin to
particles in a gas that move independently and occupy any
available space.
textbook chapters and online resources as context
information in the prompt or through fine-tuning on grade-level
information.</p>
          <p>In addition to introducing adaptivity at the text level,
future work could also investigate adaptivity at the image
level. For example, image style (e.g., cartoon, abstract), color
palette, etc. could all be adapted to a particular learner. This
could have important implications for accessible education.
For example, images could be adjusted for low-vision or
color-blind learners.</p>
          <p>One important point to be mindful of is that adapting to
certain learner traits (e.g., culture) could potentially lead to
(a) Initial image generated using the text
analogy comparing Cell Wall and
Castle Wall.
(b) Image generated based on image 5a
and the prompt: ‘Make image more
colourful’.
(c) Image generated based on image 5b and the
prompt ‘Make castle walls more prominent’.
the generation of ofensive or stereotypical analogies. Thus,
practitioners must exercise caution while generating
adaptive analogies and researchers should investigate methods
to prevent ofensive generation (e.g., safeguard models to
detect such text and images).</p>
        </sec>
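        <p>To make the adaptation step concrete, the sketch below conditions the text prompt on a grade level, under the same assumptions as the pipeline sketch in Section 3.1 (openai Python client); the prompt wording is illustrative, not the exact prompt we used.</p>
        <preformat>
# Illustrative sketch of grade-adaptive analogy generation (Sec. 3.3).
# The grade level is injected into the GPT-4 prompt as context.
from openai import OpenAI

client = OpenAI()

def generate_adaptive_analogy(target, grade_level):
    prompt = (f"Generate an analogy for the science concept '{target}' "
              f"for Grade {grade_level} students. Use source concepts "
              f"that are familiar and relatable at that grade level.")
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
        </preformat>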
      </sec>
      <sec id="sec-4-4">
      <title>3.4. Iteratively Refined Analogies</title>
      <p>In the above two types of analogies, we have described a single round of generation. However, that might not always be sufficient to get the best or desired analogies. Naturally, we can think of an iterative approach to continually refine the generated analogies. To this end, we explored two ways of refinement: (1) human-AI collaboration, where humans iteratively prompt the model to tweak the analogies, and (2) multi-agent collaboration, where multiple large image and language models (agents) iteratively generate and critique analogies for improvement.</p>
      <sec id="sec-4-4-1">
        <title>3.4.1. Human-AI Collaboration</title>
        <p>We propose a human-feedback approach to improve the image analogy quality iteratively. Figure 5 showcases the example of the ‘Cell Wall’, where we prompt the text-to-image model DALL-E to iterate on the images based on our feedback: (a) the initial image generated using the text analogy comparing the Cell Wall and a Castle Wall; (b) an image generated based on image 5a and the prompt ‘Make image more colourful’; and (c) an image generated based on image 5b and the prompt ‘Make castle walls more prominent’. We find that the model adjusts the visual analogies based on human feedback, such as making the image more colourful and emphasizing different structural aspects. This shows promise for mining a large collection of multimodal analogies through human-AI collaboration. Specifically, we believe that by working closely with educators and students, we can iterate on and improve the multimodal analogies, and generate a high-quality dataset tailored for educational purposes.</p>
        <p>To realize this goal, we are currently developing a platform to enable efficient and large-scale human-AI collaboration. Figure 6 showcases a current version of the demo system where users (e.g., educators and students) can search for and provide feedback on previously generated analogies through the like, dislike, and comment features. Moreover, the system has a feature to report inappropriate or offensive analogies that should not be shown in the future. User feedback could then be integrated into the system in multiple ways, such as refining the prompts or tuning the generation models via Reinforcement Learning from Human Feedback (RLHF) so that they are better aligned with user needs [44]. The system could also be expanded to obtain finer-grained feedback (e.g., based on factuality, grade-level appropriateness, etc.) to enhance the quality of multimodal analogies with humans in the loop.</p>
      </sec>
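      <p>A minimal sketch of this feedback loop, assuming the openai Python client, is given below. Since the public images API does not directly edit a previous DALL-E-3 output, the sketch approximates iteration by appending each round of human feedback to the generation prompt; the exploration above used the chat interface, so this is an illustration rather than the exact setup.</p>
      <preformat>
# Illustrative sketch of human-in-the-loop image refinement (Sec. 3.4.1).
# Assumption: each feedback round is folded into the DALL-E-3 prompt,
# since the public API does not edit a prior DALL-E-3 image in place.
from openai import OpenAI

client = OpenAI()

def refine_image_analogy(text_analogy, feedback_rounds):
    prompt = ("Generate an image representing the scientific analogy "
              "given below.\n\n" + text_analogy)
    image = client.images.generate(model="dall-e-3", prompt=prompt)
    urls = [image.data[0].url]
    for feedback in feedback_rounds:  # e.g., 'Make image more colourful'
        prompt += "\nRevision request: " + feedback
        image = client.images.generate(model="dall-e-3", prompt=prompt)
        urls.append(image.data[0].url)
    return urls
      </preformat>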
        <sec id="sec-4-4-1">
          <title>In general, there could be several ways in which multiple</title>
          <p>agents collaborate together to generate high quality
analogies. We explored one such approach, where we leverage the
Claude3 Sonnet model5 to simulate a teacher and critique
the GPT-4+DALL-E-3 generated analogy. The generated
critique is then passed on to GPT-4 and the model is asked
to refine the original analogy based on the critique. Figure 7
shows the example of multi-agent collaboration for the cell
wall concept, where the Claude model provides feedback to
improve the structural representation, such as adding more
layers and openings in the wall, to make the analogy more
scientifically accurate and understandable to students. The
critique is fed to DALL-E-3 and it incorporates the suggested
changes to update the image analogy.</p>
          <p>In future, it would be interesting to explore how to
optimize such a multi-agent collaboration to improve the quality
of generated analogies with no or minimal human
interac5https://claude.ai/
tion. For example, model-generated critique could be shown
to teachers as starting points for improving the analogy.
Another possibility could be to share model-generated critique
with AI researchers and system engineers to distill
common model failures and develop guidelines for future users
generating analogies.</p>
        </sec>
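      <p>The critique-and-refine round can be sketched as follows, assuming the openai and anthropic Python clients; the prompt wording and the single-round structure are illustrative assumptions rather than released code.</p>
      <preformat>
# Illustrative sketch of the multi-agent critique loop (Sec. 3.4.2):
# a Claude "teacher" critiques the analogy and GPT-4 refines it.
from anthropic import Anthropic
from openai import OpenAI

openai_client = OpenAI()
claude_client = Anthropic()

def critique_and_refine(text_analogy):
    # The simulated teacher critiques the GPT-4 generated analogy.
    critique = claude_client.messages.create(
        model="claude-3-sonnet-20240229",
        max_tokens=512,
        messages=[{
            "role": "user",
            "content": "You are a science teacher. Critique this analogy "
                       "for scientific accuracy and clarity:\n" + text_analogy,
        }],
    ).content[0].text
    # GPT-4 refines the original analogy based on the critique.
    refined = openai_client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": "Analogy:\n" + text_analogy + "\n\nCritique:\n"
                       + critique + "\n\nRefine the analogy to address "
                       "the critique.",
        }],
    )
    return refined.choices[0].message.content
      </preformat>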
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Discussion and Conclusion</title>
      <p>We introduce the theme of multimodal analogies for science education, consisting of text- and image-based analogies. We explore how to generate three types of multimodal analogies, leveraging GPT-4 for textual analogy generation and feeding that into DALL-E-3 to create visual analogies. Our qualitative validation and exploration suggest that the generated image analogies successfully contain and represent the text-based analogies in most cases. Furthermore, we show how they can be adapted to learners at different grade levels and can even be refined iteratively via human or multi-agent collaboration.</p>
      <p>As next steps, we must work closely with students and educators to assess and improve the quality of multimodal analogies for science education. We showcased our demo system for displaying multimodal analogies to student learners and educators, through which we hope to gather feedback. While automatically generated analogies are helpful as a starting point, incorporating human-AI collaboration and crowdsourcing with educators and practitioners can provide valuable feedback and adjustments. Such collaboration can enhance the system’s utility and ensure its analogies are valid and appropriate for students and resonate with different contexts and cultures.</p>
      <p>We have identified several interesting research challenges that still need to be solved (e.g., how to generate legible labels in images, how to mitigate the generation of potentially offensive images, and how to effectively support multi-agent and human-AI collaboration). Another important direction for which we hope to use the demo system is to study the impact of the generated analogies on science learning among students.</p>
      <p>Overall, we believe our work highlights a unique application of AI for generating multimodal content and resources for science education. Multimodal content [45, 46] is known to help with engaging learners, and our work suggests that LLMs and diffusion models have great potential for generating such content. Thus, our approach and findings could be widely useful for generating other kinds of multimodal, scientific, and educational content (e.g., stories and dialogues), in addition to analogies, to enable more engaging learning environments.</p>
    </sec>
    <sec id="sec-6">
      <title>5. Acknowledgements</title>
      <sec id="sec-6-1">
        <title>This work is supported by the National Science Foundation and the Institute of Education Sciences, U.S. Department of Education, through Award # 2229612 (National AI Institute for Inclusive Intelligent Technologies for Education).</title>
        <p>254536039. S. Moore, A. N. Raferty, A. Singla, Generative ai for
[19] Y. Tewel, Y. Shalev, I. Schwartz, L. Wolf, Zerocap: Zero- education (gaied): Advances, opportunities, and
chalshot image-to-text generation for visual-semantic lenges, 2024. arXiv:2402.01580.</p>
        <p>arithmetic, 2022. arXiv:2111.14447. [33] T. Alqahtani, H. Badreldin, M. Alrashed, A. Alshaya,
[20] T. Chakrabarty, A. Saakyan, O. Winn, S. Alghamdi, K. Saleh, S. Alowais, O. Alshaya, I.
RahA. Panagopoulou, Y. Yang, M. Apidianaki, S. Mure- man, M. Al Yami, A. Albekairy, The emergent role
san, I spy a metaphor: Large language models of artificial intelligence, natural learning processing,
and difusion models co-create visual metaphors, and large language models in higher education and
in: A. Rogers, J. Boyd-Graber, N. Okazaki (Eds.), research, Research in Social and Administrative
Findings of the Association for Computational Pharmacy 19 (2023). doi:10.1016/j.sapharm.2023.
Linguistics: ACL 2023, Association for Computational 05.016.</p>
        <p>Linguistics, Toronto, Canada, 2023, pp. 7370–7388. [34] S. P. Chowdhury, V. Zouhar, M. Sachan,
AutoURL: https://aclanthology.org/2023.findings-acl.465. tutor meets large language models: A language
doi:10.18653/v1/2023.findings-acl.465. model tutor with rich pedagogy and guardrails, 2024.
[21] P. Thagard, Analogy, explanation, and educa- arXiv:2402.09216.</p>
        <p>tion, Journal of Research in Science Teaching [35] M. Javaid, A. Haleem, R. Singh, S. Khan,
29 (1992) 537–544. URL: https://onlinelibrary. I. Haleem Khan, Unlocking the opportunities
wiley.com/doi/abs/10.1002/tea.3660290603. through chatgpt tool towards ameliorating the
doi:https://doi.org/10.1002/tea.3660290603. education system, BenchCouncil Transactions on
arXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1002/Bteenac.h3m6a6r0k2s9,06S0ta3n.dards and Evaluations 3 (2023)
[22] L. Novick, K. Holyoak, Mathematical problem solv- 100115. doi:10.1016/j.tbench.2023.100115.
ing by analogy, Journal of experimental psychology. [36] D. Leiker, S. Finnigan, A. R. Gyllen, M. Cukurova,
Learning, memory, and cognition 17 (1991) 398–415. Prototyping the use of large language models (llms)
doi:10.1037/0278-7393.17.3.398. for adult learning content creation at scale, 2023.
[23] S. M. Glynn, B. K. Britton, M. Semrud-Clikeman, arXiv:2306.01815.</p>
        <p>K. D. Muth, Analogical Reasoning and Prob- [37] J. Meyer, T. Jansen, R. Schiller, L. W. Liebenow,
lem Solving in Science Textbooks, Springer US, M. Steinbach, A. Horbach, J. Fleckenstein, Using
Boston, MA, 1989, pp. 383–398. URL: https:// llms to bring evidence-based feedback into the
classdoi.org/10.1007/978-1-4757-5356-1_21. doi:10.1007/ room: Ai-generated feedback increases secondary
stu978-1-4757-5356-1_21. dents’ text revision, motivation, and positive emotions,
[24] R. Duit, The role of analogies and metaphors in learn- Computers and Education: Artificial Intelligence 6
ing science, Science Education 75 (1991) 649 – 672. (2024) 100199. URL: https://www.sciencedirect.com/
doi:10.1002/sce.3730750606. science/article/pii/S2666920X23000784. doi:https://
[25] H. Vieira, C. Morais, Musical analogies to teach middle doi.org/10.1016/j.caeai.2023.100199.
school students topics of the quantum model of the [38] M. Lee, M. Srivastava, A. Hardy, J. Thickstun, E.
Duratom, Journal of Chemical Education 99 (2022). doi:10. mus, A. Paranjape, I. Gerard-Ursin, X. L. Li, F.
Lad1021/acs.jchemed.2c00289. hak, F. Rong, R. E. Wang, M. Kwon, J. S. Park,
[26] T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Ka- H. Cao, T. Lee, R. Bommasani, M. Bernstein, P. Liang,
plan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sas- Evaluating human-language model interaction, 2024.
try, A. Askell, et al., Language models are few-shot arXiv:2212.09746.
learners, Advances in neural information processing [39] J. Jeon, S. Lee, Large language models in education:
systems 33 (2020) 1877–1901. A focus on the complementary relationship between
[27] A. Ramesh, M. Pavlov, G. Goh, S. Gray, C. Voss, A. Rad- human teachers and chatgpt, Education and
Informaford, M. Chen, I. Sutskever, Zero-shot text-to-image tion Technologies 28 (2023) 15873–15892. URL: https:
generation, in: International conference on machine //doi.org/10.1007/s10639-023-11834-1. doi:10.1007/
learning, Pmlr, 2021, pp. 8821–8831. s10639-023-11834-1.
[28] H. Touvron, T. Lavril, G. Izacard, X. Martinet, M.-A. [40] W. Gan, Z. Qi, J. Wu, J. C.-W. Lin, Large language
Lachaux, T. Lacroix, B. Rozière, N. Goyal, E. Hambro, models in education: Vision and opportunities, 2023.
F. Azhar, et al., Llama: Open and eficient foundation arXiv:2311.13160.
language models, arXiv preprint arXiv:2302.13971 [41] S. Moore, R. Tong, A. Singh, Z. Liu, X. Hu, Y. Lu,
(2023). J. Liang, C. Cao, H. Khosravi, P. Denny, C. Brooks,
[29] J. Yang, H. Jin, R. Tang, X. Han, Q. Feng, H. Jiang, J. Stamper, Empowering education with llms: The
S. Zhong, B. Yin, X. Hu, Harnessing the power of llms next-gen interface and content generation, in:
Comin practice: A survey on chatgpt and beyond, ACM munications in Computer and Information Science,
Transactions on Knowledge Discovery from Data 18 volume 1831, Springer, 2023, pp. 32–37. doi:10.1007/
(2024) 1–32. 978-3-031-36336-8_4.
[30] M. U. Hadi, R. Qureshi, A. Shah, M. Irfan, A. Zafar, M. B. [42] Z. Wang, J. Valdez, D. B. Mallick, R. Baraniuk,
ToShaikh, N. Akhtar, J. Wu, S. Mirjalili, et al., A survey wards human-like educational question generation
on large language models: Applications, challenges, with large language models, in: International
Conferlimitations, and practical usage, Authorea Preprints ence on Artificial Intelligence in Education, 2022. URL:
(2023). https://api.semanticscholar.org/CorpusID:251137828.
[31] H. Lin, S. Wan, W. Gan, J. Chen, H.-C. Chao, Metaverse [43] J. Betker, G. Goh, L. Jing, TimBrooks, J. Wang, L. Li,
in education: Vision, opportunities, and challenges, LongOuyang, JuntangZhuang, JoyceLee, YufeiGuo,
2022. arXiv:2211.14951. WesamManassra, PrafullaDhariwal, CaseyChu,
Yunx[32] P. Denny, S. Gulwani, N. T. Hefernan, T. Käser, inJiao, A. Ramesh, Improving image generation with
better captions, ???? URL: https://api.semanticscholar.</p>
        <p>org/CorpusID:264403242.
[44] L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. Wainwright,</p>
        <p>P. Mishkin, C. Zhang, S. Agarwal, K. Slama, A. Ray,
et al., Training language models to follow instructions
with human feedback, Advances in neural information
processing systems 35 (2022) 27730–27744.
[45] M. Sankey, D. Birch, M. W. Gardiner, Engaging
students through multimodal learning environments:
The journey continues, Proceedings of the 27th
Australasian Society for Computers in Learning in Tertiary</p>
        <p>Education (2010) 852–863.
[46] B. Bouchey, J. Castek, J. Thygeson, Multimodal
learning, Innovative Learning Environments in STEM
Higher Education: Opportunities, Challenges, and
Looking Forward (2021) 35–54.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Minsky</surname>
          </string-name>
          , The Society of Mind, Simon &amp; Schuster, New York,
          <year>1988</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2] d. hofstadter, E. Sander, Surfaces and Essence: :
          <article-title>Analogy as the Fuel and</article-title>
          Fire of Thinking,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Treagust</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Harrison</surname>
          </string-name>
          , G. Venville,
          <article-title>Teaching science efectively with analogies: An approach for preservice and inservice teacher education</article-title>
          ,
          <source>Journal of Science Teacher Education</source>
          <volume>9</volume>
          (
          <year>1998</year>
          ). doi:
          <volume>10</volume>
          .1023/A:
          <fpage>1009423030880</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>L. E.</given-names>
            <surname>Richland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Simms</surname>
          </string-name>
          ,
          <article-title>Analogy, higher order thinking, and education</article-title>
          ,
          <source>Wiley Interdisciplinary Reviews: Cognitive Science</source>
          <volume>6</volume>
          (
          <year>2015</year>
          )
          <fpage>177</fpage>
          -
          <lpage>192</lpage>
          . URL: https: //doi.org/10.1002/wcs.1336. doi:
          <volume>10</volume>
          .1002/wcs.1336.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M. B.</given-names>
            <surname>Goldwater</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Gentner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. D.</given-names>
            <surname>LaDue</surname>
          </string-name>
          , J. C.
          <article-title>Libarkin, Analogy generation in science experts and novices</article-title>
          ,
          <source>Cognitive Science 45</source>
          (
          <year>2021</year>
          )
          <article-title>e13036</article-title>
          . doi:
          <volume>10</volume>
          .1111/cogs.13036.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Kretz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. C.</given-names>
            <surname>Krawczyk</surname>
          </string-name>
          ,
          <article-title>Expert analogy use in a naturalistic setting, Frontiers in Psychology 5 (</article-title>
          <year>2014</year>
          )
          <article-title>1333</article-title>
          . doi:
          <volume>10</volume>
          .3389/fpsyg.
          <year>2014</year>
          .
          <volume>01333</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>Jurgens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mohammad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Turney</surname>
          </string-name>
          , K. Holyoak, SemEval
          <article-title>-2012 task 2: Measuring degrees of relational similarity</article-title>
          , in: E. Agirre,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Diab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Manandhar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Marton</surname>
          </string-name>
          , D. Yuret (Eds.),
          <source>*SEM 2012: The First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval</source>
          <year>2012</year>
          ),
          <article-title>Association for Computational Linguistics</article-title>
          , Montréal, Canada,
          <year>2012</year>
          , pp.
          <fpage>356</fpage>
          -
          <lpage>364</lpage>
          . URL: https://aclanthology.org/S12-1047.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>V.</given-names>
            <surname>Popov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Hristova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Anders</surname>
          </string-name>
          ,
          <article-title>The relational luring efect: Retrieval of relational information during associative recognition</article-title>
          ,
          <source>Journal of Experimental Psychology: General</source>
          <volume>146</volume>
          (
          <year>2017</year>
          )
          <fpage>722</fpage>
          -
          <lpage>745</lpage>
          . URL: https://api.semanticscholar.org/CorpusID:20177507.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Kmiecik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Brisson</surname>
          </string-name>
          ,
          <string-name>
            <surname>R. G. Morrison,</surname>
          </string-name>
          <article-title>The time course of semantic and relational processing during verbal analogical reasoning</article-title>
          ,
          <source>Brain and Cognition</source>
          <volume>129</volume>
          (
          <year>2019</year>
          )
          <fpage>25</fpage>
          -
          <lpage>34</lpage>
          . URL: https://api.semanticscholar.org/ CorpusID:54169303.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhou</surname>
          </string-name>
          , E-kar:
          <article-title>A benchmark for rationalizing natural language analogical reasoning, in: Findings of the Association for Computational Linguistics: ACL 2022, Association for Computational Linguistics</article-title>
          ,
          <year>2022</year>
          . URL: http://dx.doi.org/ 10.18653/v1/
          <year>2022</year>
          .findings-acl.
          <volume>311</volume>
          . doi:
          <volume>10</volume>
          .18653/v1/
          <year>2022</year>
          .findings- acl.311.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>B.</given-names>
            <surname>Bhavya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Xiong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhai</surname>
          </string-name>
          ,
          <article-title>Analogy generation by prompting large language models: A case study of InstructGPT</article-title>
          , in: S. Shaikh,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ferreira</surname>
          </string-name>
          ,
          <string-name>
            <surname>A</surname>
          </string-name>
          . Stent (Eds.),
          <source>Proceedings of the 15th International Conference on Natural Language Generation</source>
          , Association for Computational Linguistics, Waterville, Maine, USA and virtual meeting,
          <year>2022</year>
          , pp.
          <fpage>298</fpage>
          -
          <lpage>312</lpage>
          . URL: https: //aclanthology.org/
          <year>2022</year>
          .inlg-main.
          <volume>25</volume>
          . doi:
          <volume>10</volume>
          .18653/ v1/
          <year>2022</year>
          .inlg- main.25.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>E.</given-names>
            <surname>Bobek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Tversky</surname>
          </string-name>
          ,
          <article-title>Creating visual explanations improves learning</article-title>
          ,
          <source>Cognitive Research: Principles and Implications</source>
          <volume>1</volume>
          (
          <year>2016</year>
          ).
          <source>doi:10.1186/ s41235- 016- 0031- 6.</source>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>K. D. Forbus</surname>
            ,
            <given-names>R. W.</given-names>
          </string-name>
          <string-name>
            <surname>Ferguson</surname>
            ,
            <given-names>A. M.</given-names>
          </string-name>
          <string-name>
            <surname>Lovett</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Gentner</surname>
          </string-name>
          ,
          <article-title>Extending sme to handle large-scale cognitive modeling</article-title>
          ,
          <source>Cognitive science 41</source>
          <volume>5</volume>
          (
          <year>2017</year>
          )
          <fpage>1152</fpage>
          -
          <lpage>1201</lpage>
          . URL: https://api.semanticscholar.org/CorpusID:4572276.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chen</surname>
          </string-name>
          , G. Corrado,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          ,
          <article-title>Eficient estimation of word representations in vector space</article-title>
          ,
          <year>2013</year>
          . arXiv:
          <volume>1301</volume>
          .
          <fpage>3781</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>G.</given-names>
            <surname>Rossiello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gliozzo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Farrell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Fauceglia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Glass</surname>
          </string-name>
          ,
          <article-title>Learning relational representations by analogy using hierarchical Siamese networks</article-title>
          , in: J.
          <string-name>
            <surname>Burstein</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Doran</surname>
          </string-name>
          , T. Solorio (Eds.),
          <source>Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long and Short Papers),
          <source>Association for Computational Linguistics</source>
          , Minneapolis, Minnesota,
          <year>2019</year>
          , pp.
          <fpage>3235</fpage>
          -
          <lpage>3245</lpage>
          . URL: https://aclanthology.org/N19-1327. doi:
          <volume>10</volume>
          .18653/v1/
          <fpage>N19</fpage>
          - 1327.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ushio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. Espinosa</given-names>
            <surname>Anke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schockaert</surname>
          </string-name>
          ,
          <string-name>
            <surname>J. CamachoCollados</surname>
          </string-name>
          ,
          <article-title>BERT is to NLP what AlexNet is to CV: Can pre-trained language models identify analogies?</article-title>
          , in: C.
          <string-name>
            <surname>Zong</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Xia</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Navigli</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing</source>
          (Volume
          <volume>1</volume>
          :
          <string-name>
            <surname>Long</surname>
            <given-names>Papers)</given-names>
          </string-name>
          ,
          <source>Association for Computational Linguistics</source>
          , Online,
          <year>2021</year>
          , pp.
          <fpage>3609</fpage>
          -
          <lpage>3624</lpage>
          . URL: https: //aclanthology.org/
          <year>2021</year>
          .
          <article-title>acl-long</article-title>
          .
          <volume>280</volume>
          . doi:
          <volume>10</volume>
          .18653/ v1/
          <year>2021</year>
          .acl- long.280.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S. E.</given-names>
            <surname>Reed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Deep visual analogy-making</article-title>
          ,
          <source>in: Neural Information Processing Systems</source>
          ,
          <year>2015</year>
          . URL: https://api.semanticscholar.org/ CorpusID:1836951.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bitton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Yosef</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Strugo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Shahaf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Schwartz</surname>
          </string-name>
          , G. Stanovsky, Vasr:
          <article-title>Visual analogies of situation recognition</article-title>
          ,
          <source>in: AAAI Conference on Artificial Intelligence</source>
          ,
          <year>2022</year>
          . URL: https://api.semanticscholar.org/CorpusID:
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>