<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Empowering Education with LLMs - the Next-Gen Interface and Content Generation, July</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Exploring Pre-Service Teachers' Perceptions of Large Language Models-Generated Hints in Online Mathematics Learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sai Gattupalli</string-name>
          <email>sgattupalli@umass.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Will Lee</string-name>
          <email>williamlee@cs.umass.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Danielle Allessio</string-name>
          <email>allessio@umass.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Danielle Crabtree</string-name>
          <email>dcrabtree@umass.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ivon Arroyo</string-name>
          <email>ivon@cs.umass.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Beverly Woolf</string-name>
          <email>bev@cs.umass.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Large Language Models, Evaluations, Intelligent Tutoring Systems, Mathematics</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Massachusetts Amherst</institution>
          ,
          <addr-line>Amherst, MA 01003</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>07</volume>
      <issue>2023</issue>
      <abstract>
        <p>Despite the potential and emerging applications of large language models (LLMs) in education, little is known about their effectiveness for learning. Similarly, educators' preferences and perceptions of the utility of LLMs have received limited attention. Hence, we conducted an exploratory study investigating pre-service teachers' perceptions (N=33) of the possible utility of LLMs (such as GPT-4) in online mathematics education. Our initial quantitative and qualitative findings indicate that while human-created mathematical content, especially visuals, is still preferred, transformer-generated walkthroughs, instructions, and guidance are helpful for tutoring in math problem-solving. Implications and future directions are also discussed.</p>
      </abstract>
      <kwd-group>
        <kwd>Large Language Models</kwd>
        <kwd>Evaluations</kwd>
        <kwd>Intelligent Tutoring Systems</kwd>
        <kwd>Mathematics</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        In a recent “Dear Colleague” letter from the National Science Foundation [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], the agency stresses
the importance of rapid research at the intersections of education and large language models
(LLMs), both in informal and formal education settings. The letter underscores the hidden value,
potential, and benefit of transformer-based LLMs [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] in education where they are designed to
comprehend human language but might also be applied to support and tutor students.
      </p>
      <p>
        In engaging K-12 students in math learning, effective pedagogical strategies and techniques
often center around understanding students’ learning goals and motivations [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. These strategies
could be further enhanced through the application of LLMs in math education, as their potential
to serve as ”scaffolding” tools becomes increasingly apparent [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ][
        <xref ref-type="bibr" rid="ref4">4</xref>
        ][
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. They can aid young
learners in navigating math learning hurdles and streamlining their educational journey, directly
responding to their learning needs and aspirations.
      </p>
      <p>
        One example of this concept is MathSpring (MS), an intelligent online tutoring system
developed by researchers from the University of Massachusetts Amherst and funded by the NSF.
The platform aids students in their math problem-solving practice, reinforcing each problem
with carefully crafted hints created by researchers with backgrounds in mathematics and
education. However, the hint creation process currently requires significant time and expertise,
and poses challenges related to human resources. The complexity of this process may contribute
to burnout among teachers and researchers [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], underlining the need for methods to alleviate
this burden.
      </p>
      <p>In this light, integrating LLMs could offer a valuable solution, creating a responsive
teaching strategy that supports both student engagement and teacher sustainability. LLMs could
streamline the hint-creation process in ways that help reduce the workload of educators while
maintaining, or even enhancing, the quality of support provided to students. In turn, this could
lead to more engaging math classes that are capable of flexibly responding to K-12 students’
learning goals and motivations.</p>
      <p>
        LLMs such as OpenAI’s ChatGPT [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] emerge as potential game-changers, especially in K-12
math education. ChatGPT produces human-like explanations in text format and has shown
efficacy across various sectors, including education, law, and medicine [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ][
        <xref ref-type="bibr" rid="ref9">9</xref>
        ][
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. It also has
potential in various learning tasks, including answering student math queries, crafting math hint
narratives, summarizing math problems, and acting as a virtual math tutor [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. An enhanced
iteration of ChatGPT, based on GPT-4 and introduced in March 2023, is the focus of our work;
it showcases even more advanced natural language generation, understanding, and learning
capabilities [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. GPT-4’s abilities are transforming not only learning but also higher education
research [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Its knack for answering queries, providing explanations, and assisting in math
problem-solving makes it an asset in creating interactive learning experiences. Our work aims
to explore the potential and effectiveness of GPT-4-produced hints for the MS online tutor, in an
attempt to make math learning more personalized and affective and to aid learners (Grades
4 and up) in their math problem-solving efforts.
      </p>
      <p>
        In this exploratory study, the term ”transformer-generated hints” refers to suggestions produced
via prompt engineering techniques [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. These hints can significantly benefit students grappling
with mathematical word problems, a key skill that students often find challenging [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. The term
”human-crafted hints” refers to hints created by real humans. Building on [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]’s assertion
that “automatically generating high-quality step-by-step solutions to math word problems has
many applications in education,” we stress the importance of teacher involvement in reviewing
hints, irrespective of their origin — human-crafted or transformer-generated. We think a blend
of transformer-generated hints and teacher input can promote learning while expediting the
hint-creation process. This is essential for maintaining the reliability and effectiveness of MS,
or any online learning platform.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Research Questions and Contribution of the Study</title>
      <p>The value of this study lies in its dual focus: the advancement of online math education and the
exploration of LLMs in educational contexts. As LLMs are gaining momentum for their potential
to bolster learner engagement, academic achievement, and student success, little evaluation
has been conducted. Hence, the outcomes of this study are primed to serve as a roadmap for
educational stakeholders. This includes administrators and policymakers, assisting them to
discern the potential advantages, constraints, and practical facets of integrating LLMs into
online mathematics learning.</p>
      <p>Moreover, by collecting the perceptions of pre-service teachers towards
transformer-generated hints (Ht) compared to human-crafted hints (Hm), we aim to uncover preferences
that could potentially inform future enhancements and integration of LLM-based tools within
mathematics education. This inquiry thus proposes three contributions:
• Implementation of transformer-generated hints for K-12 mathematics education
• Assessment of teachers’ perceptions and preferences regarding LLMs-generated hints,
and their ensuing pedagogical implications
• The combined expertise in education, computer science, artificial intelligence, and math
education, to investigate the utility of LLMs in math education.</p>
      <p>Central to our study is our primary research question (RQ), which delves into the perceptions
of pre-service teachers:</p>
      <p>RQ: How do pre-service teachers perceive the effectiveness and appropriateness of
transformer-generated hints in comparison to human-crafted hints?</p>
      <p>Addressing this RQ allows us to underline the possible influences of transformer-generated
hints on student learning.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Related Work</title>
      <p>
        Our study serves as an initial effort toward understanding the implications of LLMs in math education,
particularly in relation to transformer-generated hints and their prospective role in elevating
teaching and learning experiences. Radford [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] proposed the idea of applying transformers
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] using a semi-supervised approach in which a natural language model is trained on a large corpus
of unlabelled text. In learning sciences, similar language models have been
proposed and deployed for mathematical learning. For instance, MathBERT, introduced by
[
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], was trained using large volumes of mathematics-related text from K-12 and college-level
mathematics textbooks, open-source course syllabuses, and research papers in mathematics.
Griffith et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] undertook quantitative experiments to compare different transformer-based
neural networks for solving mathematical word problems, while other studies proposed using
transformer models for solving differential equations [<xref ref-type="bibr" rid="ref18">18</xref>], performing reasoning [<xref ref-type="bibr" rid="ref19">19</xref>], and
verifying solutions [<xref ref-type="bibr" rid="ref20">20</xref>].
      </p>
      <p>However, these prior efforts have not tested their prototypes with real students and educators
for feedback, leaving a crucial element unexplored. Moreover, many of these systems provide
solutions without guiding students with hints, which is a significant aspect of learning.</p>
      <p>Our work extends experiments conducted by [<xref ref-type="bibr" rid="ref21">21</xref>], where the focus was on investigating the
mathematical capabilities of ChatGPT by treating ChatGPT as an assistant to professional
mathematicians and posing various use cases such as question answering and theorem
searching. Our approach is similar in that we aim to gather insights on the effectiveness and
acceptance of LLMs-generated hints in our MS online tutoring system. The target demographic
is undergraduate students preparing to become K-12 teachers and education professionals in
the US.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <p>Participants (N=33) in this study are undergraduate students enrolled in education courses
at a Northeastern university. All participants aspire to become K-12 teachers and education
professionals. The participants are active members of the Education Club, a student-led group
committed to nurturing connections among educators and the wider community. Their
perceptions and responses towards both human-crafted and transformer-generated hints were
gathered within an hour-long session during the club’s weekly meetings. Given this is an
exploratory study, we did not collect any demographic data.</p>
      <p>We adopted a mixed-methods approach in this exploratory study, striking a balance between
the quantitative analysis of hint counts and the qualitative insights derived from open-ended
responses. Our qualitative analysis employs the grounded theory approach, which guides the
thematic dissection of the participant responses.</p>
      <p>Five multiple-choice math word problems based on Common Core standards were randomly
selected from the MS SQL production database and presented to participants via a Qualtrics
survey. Participants gave their preference between two distinct hint variants (one
human-created by education professionals and the other generated by GPT-4) and answered a ”Why
did you choose this hint variant?” open-response question. All survey questions have been
made available at https://osf.io/t84v7/.</p>
      <p>For definitions, human-crafted hints (Hm) embody the collective wisdom of math teachers,
research scientists, and university professors. These hints emphasize pedagogical strategies that
enhance individual learning and problem-solving abilities. However, as discussed above, their
creation is a time-consuming process requiring the skillful expertise of educators. In contrast,
the transformer-generated hints (Ht) were produced using GPT-4, with prompts tailored from a
Prompt Engineering GitHub repo [<xref ref-type="bibr" rid="ref22">22</xref>]. This approach aimed to
simulate the instructional strategies of an experienced US math teacher.</p>
      <p>The prompts fed to GPT-4 were designed to place
the model in the role of a math teacher with a decade’s
worth of experience teaching students from a variety of
economic, ethnic, cultural, and language backgrounds.</p>
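      <p>As an illustration of this role-based prompting setup, a minimal sketch is given below. It assumes the OpenAI Python client with the 2023-era ChatCompletion interface; the exact prompt wording, the temperature setting, and the generate_hint helper are hypothetical and are not the prompts used in this study.</p>
      <preformat>
# Hypothetical sketch of role-based hint generation with GPT-4 (not the study's exact prompt).
import openai

SYSTEM_PROMPT = (
    "You are a US math teacher with ten years of experience teaching students "
    "from diverse economic, ethnic, cultural, and language backgrounds. "
    "Give a short, step-by-step hint without revealing the final answer."
)

def generate_hint(problem_text: str) -> str:
    # Single chat completion call; the system message carries the teacher persona.
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": "Write a hint for this problem: " + problem_text},
        ],
        temperature=0.7,
    )
    return response["choices"][0]["message"]["content"]
      </preformat>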
      <p>Figure 2 shows a sample MS question that all
participants responded to, along with the hint variants.</p>
      <sec id="sec-4-1">
        <title>4.1. Data Collection</title>
        <p>Every participant responded to the Qualtrics survey
by indicating their preferred hint variant and their
rationale behind choosing it. Notably, the participants
were blind to whether the hints were human-created or
transformer-generated, which is an approach taken to
eliminate any potential biases and preconceived notions
about advanced technologies or GPT-based services in
general. The collected data was organized into a
spreadsheet, which provided the basis for our ensuing analysis
detailed in the section below. The full dataset has been
made available at https://osf.io/t84v7/.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <sec id="sec-5-1">
        <title>5.1. Analysis</title>
        <p>The analysis of the participants’ hint preferences was carried out via a histogram representation
for clarity and ease of understanding. In addition, we employed a pre-trained BART language
model from the Hugging Face API [<xref ref-type="bibr" rid="ref23">23</xref>] to compile and interpret participants’ written responses,
the insights from which are further discussed in the Discussion and Conclusion section.</p>
        <p>To comprehensively categorize the collected written responses, we used a pre-trained BERT
model in conjunction with k-means clustering. This allowed us to discern prominent themes
associated with the three types of hints under consideration: human-crafted hints (Hm),
transformer-generated hints (Ht), and mixed-result (Hmt). A detailed summary of these
preferences can be found in Table 1. We conducted our analysis using BART and BERT models, along
with k-means clustering and visualization, all within Google Colab. Our notebooks have been
made available online at https://github.com/wlee-umass/AIEDLLM_Workshop.</p>
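        <p>For readers who want to follow the pipeline, a minimal sketch is shown below. The checkpoint names (facebook/bart-large-cnn, bert-base-uncased), the number of clusters, and the sample responses are illustrative assumptions; the exact settings are in our released notebooks.</p>
        <preformat>
# Illustrative sketch: summarize open-ended responses with BART, then embed them with BERT
# and cluster with k-means. Checkpoints and cluster count are assumptions, not our exact settings.
import numpy as np
from transformers import pipeline
from sklearn.cluster import KMeans

responses = [
    "This is less dense to read and easier to understand.",
    "This guides students through the problem.",
    "Leads the student into the correct 1st step.",
    "Makes it very easy for a student to understand what they should be doing.",
]

# Summarize the concatenated responses with a pre-trained BART checkpoint.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
summary = summarizer(" ".join(responses), max_length=40, min_length=10)[0]["summary_text"]

# Mean-pool token embeddings from a pre-trained BERT checkpoint, then run k-means.
embedder = pipeline("feature-extraction", model="bert-base-uncased")
embeddings = np.array([np.mean(embedder(r)[0], axis=0) for r in responses])
labels = KMeans(n_clusters=2, random_state=0, n_init=10).fit_predict(embeddings)
print(summary, labels)
        </preformat>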
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Findings</title>
        <p>The results, as illustrated in Table 1, are segmented into three sections. We first present the
instances where participants showed a preference for human-crafted hints (Q1 and Q4). This is
followed by an examination of the instances where participants favored transformer-generated
hints (Q2 and Q3). Lastly, we delve into the case where the results were mixed (Q5).</p>
        <p>Each question’s results were organized according to the hint variant participants preferred:
human-crafted or transformer-generated. Therefore, Q1 and Q4, which both indicated a
preference for human-crafted hints, are discussed together. Similarly, we grouped Q2 and Q3, as
they shared a favorability for transformer-generated hints. The analysis of Q5, which revealed
mixed preferences among participants, is in the discussion section.</p>
      </sec>
      <sec id="sec-5-3">
        <title>5.2.1. Human-Crafted Hints (Hm)</title>
        <p>Participants preferred human-crafted hints over transformer-generated hints for Q1 and Q4;
see Figure 3. The visual-cue-related themes that emerged from the qualitative data (responses)
through grounded theory and thematic analysis are Explanation with Example, Visualization, and
Explanation in Detail; they are displayed in Table 2 along with sample participant utterances.</p>
      </sec>
      <sec id="sec-5-4">
        <title>5.2.2. Transformer-Generated Hints (Ht)</title>
        <p>Participants preferred transformer-generated hints over human-crafted hints for Q2 and Q3; see
Figure 4. The themes that emerged from the qualitative data (responses) through grounded theory
and thematic analysis are Explanation Through Connection, Guidance, and Simple Walkthrough,
and they are displayed in Table 3 along with sample participant utterances.
• ”The language is more descriptive ie the explanation that a right angle forms an L shape.”
• ”I feel like when it comes to applying concepts in an equation format it becomes more confusing
to not elaborate on how equations are formed or why they appear the way they do. Even if the
previous two hints lead up to the equation itself, having this hint appear like this might draw the
two hints into a full circle with this elaboration, because if the child struggles with just the first
two hints, they might need more than just given steps, and rather, scaffolding alongside with the
information provided.”
• ”This coincides with the first question and is closer to what they have most likely been taught!”
• ”[T]his guides students through the problem.”
• ”[L]eads the student into the correct 1st step.”
• ”[M]akes it very easy for a student to understand what they should be doing.”</p>
      </sec>
      <sec id="sec-5-5">
        <title>5.2.3. Mixed - Human-Crafted Hints/Transformer-Generated Hints (H mt)</title>
        <p>Participants had a mixed preference between the human-created and transformer-generated
hints for Q5; see Figure 5. The themes that arose from the participants’ responses regarding why they
preferred certain hints were ease of understanding and comprehension. Examples of utterances
from these themes include ”This is less dense to read and easier to understand” and ”This
explains the process of converting a fraction to a decimal which is important for the students
comprehension instead of just giving them the answer.”</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Discussion and Conclusion</title>
      <p>When it comes to human-crafted hints, one trend we observed from the thematic analysis is
that participants preferred the teacher-crafted visual cues (Hm). This is evident from the
summaries of participants’ written responses. For example, we identified that participants in
this category preferred ”seeing the numbers (is) easier than just seeing the words ... and this is more
helpful because it has both a verbal component and a visual component.” The emphasis on visual
cues was perceived as beneficial because they complemented the textual information. This
observation substantiates the pedagogical importance of visual aids in enhancing comprehension
and reinforcing concepts.</p>
      <p>In the case of transformer-generated hints (Ht), our analysis revealed that participants favored
detailed, step-by-step instructions, and clear narratives. Participants appreciated the guidance
towards the subsequent stages of problem-solving. For example, from the summarization we
conducted, participants indicated that ”I liked being talked through the problem. I like the
inclusion of “think about the letter L’. This has a more readable, in depth explanation. This
helps explain what a complementary angle is AND gives a sufficient example.” This feedback
suggests that future applications of LLMs in intelligent tutoring systems should simulate a
conversational problem-solving process, while allowing for interactive queries from the students.
An interesting observation from a consultant K-12 teacher suggested that GPT-4 generated
hints might be too lengthy and detailed for young learners with shorter attention spans. The
transformer-generated hints were deemed beneficial by the participants despite the lack of
visuals, attributed mainly to their explanatory verbiage.</p>
      <p>One interesting finding is a mixed result of Q5. As seen in Figure 5, participants rated
human-crafted hints and transformer-generated hints almost equally. One reason we observed
is that the human-crafted and transformer-generated hints were framed similarly, resulting in a
balanced preference and hence the mixed result. Unlike the other four problems, we speculate
that the use of a third-person narrative in Q5 may have influenced the participants to perceive
the hints as less personally relevant to their math problem solving. Therefore, the resemblance
between the two hint variations might have led the participants to not pay as much attention to
the minute differences between the hint variants.</p>
      <p>There are two limitations in this exploratory study. The first is that, due to time
constraints, we selected only five mathematical problems, although the problems spanned
a wide range of topics. The second is our exclusive use of OpenAI’s GPT-4 model for
generating hints. We believe exploring and comparing other pre-trained decoder-based models,
or even fine-tuning a model, in the future is essential to generate readable and interpretable
hints.</p>
      <p>Looking ahead, our plan is to evaluate hints generated by LLMs using the Flesch Reading Ease
Formula [<xref ref-type="bibr" rid="ref24">24</xref>]. This tool will help us understand the effectiveness of generated hints to further
explain participants’ responses. This is important because students with reading disabilities
may find the substantial reading involved with LLMs challenging. Although our current work
does not focus on translations, we believe LLMs can facilitate the scaling of auto-translation for
generated hints on demand. The ability to translate hints and math content, and to provide
emotional support in different languages on demand, is crucial if LLMs are to accommodate
bilingualism [<xref ref-type="bibr" rid="ref25">25</xref>] and non-native English Language Learners (ELL).</p>
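      <p>For reference, the Flesch Reading Ease score combines average sentence length and average syllables per word with fixed coefficients (206.835, 1.015, and 84.6). The sketch below follows that standard formula; the vowel-group syllable counter is a rough heuristic assumed here for illustration.</p>
      <preformat>
# Flesch Reading Ease: 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words).
# The syllable counter is a crude vowel-group heuristic, used only for illustration.
import re

def count_syllables(word: str) -> int:
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text: str) -> float:
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (n_words / sentences) - 84.6 * (syllables / n_words)

# Example: score a generated hint; higher scores indicate easier reading.
print(flesch_reading_ease("Think about the letter L. A right angle forms an L shape."))
      </preformat>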
      <p>We foresee using LLMs in crafting digital learning companions [<xref ref-type="bibr" rid="ref26">26</xref>], with varied skin tones
and languages, to be embedded into learning platforms such as MS, in our attempt to foster a
more personalized learning experience for all learners. We believe our collected log data from
past experiments, such as math mastery, time spent, and correctness, can be utilized to
fine-tune or introduce new layers into a scalable LLM model. The outcome of this adaptation may
serve as an advanced intelligent tutor capable of guiding and supporting students interactively,
irrespective of their background.</p>
      <p>To conclude, our study utilizing OpenAI’s GPT-4 LLM aimed to explore the perspectives and
preferences of pre-service teachers concerning the efficacy of transformer-generated hints. This
research provides broad implications, one being the potential development of more personalized
and intelligent features and tools that support all MS users. We aim to further the understanding
of personalized learning experiences and the role of LLMs in shaping education.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <article-title>Dear Colleague Letter: Rapidly Accelerating Research on Artificial Intelligence in K-12 Education in Formal and Informal Settings (nsf23097)</article-title>
          , NSF - National Science Foundation, https://www.nsf.gov/pubs/2023/nsf23097/nsf23097.jsp?org=NSF,
          <year>2023</year>
          . [Accessed 28-May-2023].
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Gomez</surname>
          </string-name>
          , Ł. Kaiser,
          <string-name>
            <surname>I. Polosukhin</surname>
          </string-name>
          ,
          <article-title>Attention is all you need</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>30</volume>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>I.</given-names>
            <surname>Arroyo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Woolf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Burelson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Muldner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Rai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tai</surname>
          </string-name>
          ,
          <article-title>A multimedia adaptive tutoring system for mathematics that addresses cognition, metacognition and afect</article-title>
          ,
          <source>International Journal of Artificial Intelligence in Education</source>
          <volume>24</volume>
          (
          <year>2014</year>
          ).
          doi:10.1007/s40593-014-0023-y.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Anghileri</surname>
          </string-name>
          ,
          <article-title>Scafolding practices that enhance mathematics learning</article-title>
          ,
          <source>Journal of Mathematics Teacher Education</source>
          <volume>9</volume>
          (
          <year>2006</year>
          )
          <fpage>33</fpage>
          -
          <lpage>52</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bakker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Smit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Wegerif</surname>
          </string-name>
          ,
          <article-title>Scafolding and dialogic teaching in mathematics education: Introduction and review</article-title>
          ,
          <source>ZDM</source>
          <volume>47</volume>
          (
          <year>2015</year>
          )
          <fpage>1047</fpage>
          -
          <lpage>1065</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Woods</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sebastian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. C.</given-names>
            <surname>Herman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. L.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. M.</given-names>
            <surname>Reinke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Thompson</surname>
          </string-name>
          ,
          <article-title>The relationship between teacher stress and job satisfaction as moderated by coping, Psychology in the Schools (</article-title>
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7] OpenAI,
          <source>GPT-4 technical report</source>
          ,
          <year>2023</year>
          . arXiv:2303.08774.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>H.</given-names>
            <surname>Gimpel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Hall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Decker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Eymann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lämmermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mädche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Röglinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ruiner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schoch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schoop</surname>
          </string-name>
          , et al.,
          <article-title>Unlocking the power of generative AI models and systems such as GPT-4 and ChatGPT for higher education</article-title>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>H.</given-names>
            <surname>Nori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>King</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>McKinney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Carignan</surname>
          </string-name>
          , E. Horvitz,
          <article-title>Capabilities of gpt-4 on medical challenge problems</article-title>
          , arXiv preprint arXiv:2303.13375 (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>D. M. Katz</surname>
            ,
            <given-names>M. J.</given-names>
          </string-name>
          <string-name>
            <surname>Bommarito</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Gao</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Arredondo</surname>
          </string-name>
          ,
          <article-title>Gpt-4 passes the bar exam</article-title>
          ,
          <source>Available at SSRN</source>
          <volume>4389233</volume>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wardat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Tashtoush</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>AlAli</surname>
          </string-name>
          ,
          <string-name>
            <surname>A. M. Jarrah</surname>
          </string-name>
          ,
          <article-title>Chatgpt: A revolutionary tool for teaching and learning mathematics</article-title>
          ,
          <source>Eurasia Journal of Mathematics, Science and Technology Education</source>
          <volume>19</volume>
          (
          <year>2023</year>
          )
          <article-title>em2286</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <article-title>GitHub - f/awesome-chatgpt-prompts: This repo includes ChatGPT prompt curation to use ChatGPT better</article-title>
          . - github.com, https://github.com/f/awesome-chatgpt-prompts,
          <year>2023</year>
          . [Accessed 29-May-2023].
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>R.</given-names>
            <surname>Misquitta</surname>
          </string-name>
          ,
          <article-title>A review of the literature: Fraction instruction for struggling learners in mathematics</article-title>
          ,
          <source>Learning Disabilities Research &amp; Practice</source>
          <volume>26</volume>
          (
          <year>2011</year>
          )
          <fpage>109</fpage>
          -
          <lpage>119</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J.</given-names>
            <surname>He-Yueya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Poesia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. E.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. D.</given-names>
            <surname>Goodman</surname>
          </string-name>
          ,
          <article-title>Solving math word problems by combining language models with symbolic solvers</article-title>
          ,
          <source>arXiv preprint arXiv:2304.09102</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>A.</given-names>
            <surname>Radford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Narasimhan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Salimans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Sutskever</surname>
          </string-name>
          , et al.,
          <article-title>Improving language understanding by generative pre-training (</article-title>
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>J. T.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yamashita</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Prihar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Heffernan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Graf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Mathbert: A pre-trained language model for general nlp tasks in mathematics education</article-title>
          ,
          <source>arXiv preprint arXiv:2106.07340</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>K.</given-names>
            <surname>Griffith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kalita</surname>
          </string-name>
          ,
          <article-title>Solving arithmetic word problems with transformers and preprocessing of problem text</article-title>
          ,
          <source>arXiv preprint arXiv:2106.00893</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>[18] G. Lample, F. Charton, Deep learning for symbolic mathematics, arXiv preprint arXiv:1912.01412 (2019).</mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>[19] A. Lewkowycz, A. Andreassen, D. Dohan, E. Dyer, H. Michalewski, V. Ramasesh, A. Slone, C. Anil, I. Schlag, T. Gutman-Solo, et al., Solving quantitative reasoning problems with language models, arXiv preprint arXiv:2206.14858 (2022).</mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>[20] K. Cobbe, V. Kosaraju, M. Bavarian, M. Chen, H. Jun, L. Kaiser, M. Plappert, J. Tworek, J. Hilton, R. Nakano, et al., Training verifiers to solve math word problems, arXiv preprint arXiv:2110.14168 (2021).</mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>[21] S. Frieder, L. Pinchetti, R.-R. Griffiths, T. Salvatori, T. Lukasiewicz, P. C. Petersen, A. Chevalier, J. Berner, Mathematical capabilities of ChatGPT, arXiv preprint arXiv:2301.13867 (2023).</mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>[22] F. K. Akın, Awesome ChatGPT prompts, GitHub, 2023. URL: https://github.com/f/awesome-chatgpt-prompts. [Accessed 2023-05-23].</mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>[23] M. Lewis, Y. Liu, N. Goyal, M. Ghazvininejad, A. Mohamed, O. Levy, V. Stoyanov, L. Zettlemoyer, BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension, arXiv preprint arXiv:1910.13461 (2019).</mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>[24] J. N. Farr, J. J. Jenkins, D. G. Paterson, Simplification of Flesch reading ease formula, Journal of Applied Psychology 35 (1951) 333.</mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>[25] D. Allessio, B. Woolf, N. Wixon, F. R. Sullivan, M. Tai, I. Arroyo, Ella me ayudó (she helped me): Supporting Hispanic and English language learners in a math ITS, in: Artificial Intelligence in Education: 19th International Conference, AIED 2018, London, UK, June 27–30, 2018, Proceedings, Part II 19, Springer, 2018, pp. 26–30.</mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>[26] B. P. Woolf, I. Arroyo, K. Muldner, W. Burleson, D. G. Cooper, R. P. Dolan, R. Christopherson, The effect of motivational learning companions on low achieving students and students with disabilities, in: Intelligent Tutoring Systems (1), 2010, pp. 327–337.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>