<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>with an LLM-powered Chatbot when Completing CS1 Assignments</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ruiwei Xiao</string-name>
          <email>ruiweix@andrew.cmu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xinying Hou</string-name>
          <email>xyhou@umich.edu</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Harsh Kumar</string-name>
          <email>harsh@cs.toronto.edu</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Steven Moore</string-name>
          <email>stevenmo@andrew.cmu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>John Stamper</string-name>
          <email>jstamper@cmu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Liut</string-name>
          <email>michael.liut@utoronto.ca</email>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Carnegie Mellon University</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Multiple recent studies have integrated large language models (LLMs) into diverse educational contexts, including CS1 classrooms. One common application is integrating a chatbot to serve as a teaching assistant. In this preliminary analysis, we explored four methods (correlation analysis, Latent Dirichlet Allocation, expert evaluation, LLM labeling, and evaluation) with multiple levels of data to analyze students' help requests with a basic chat-based LLM tutor when completing CS1 assignments. This dataset contains 73 initial help-seeking conversation sessions with corresponding student self-reported survey answers. It also included 18 hallucinating responses from all the conversation sessions. Our results indicate that students with lower self-eficacy tended to create longer help requests, while students with higher self-eficacy tended to conduct more concise ones. Other than this, we found that learners shared more commonalities than diferences when conducting help requests, including the length of turn-taking and the struggle to locate LLM hallucinations. As AI-based chatbots become prevalent in education settings, this preliminary analysis sheds light on what types of learner data can be collected, and what analytic approaches can be leveraged to unpack students' help-seeking with these LLM-based learning systems.</p>
      </abstract>
      <kwd-group>
        <kwd>large language models</kwd>
        <kwd>computing education</kwd>
        <kwd>intelligent tutors</kwd>
        <kwd>learner-centered design</kwd>
        <kwd>field experiment</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The rise of large language models in educational contexts
has marked a significant increase in the use of LLMs for
teaching and learning [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. These models have been
integrated into various educational settings, including
introductory computer science (CS1) education, providing
innovative ways to support student learning [
        <xref ref-type="bibr" rid="ref3 ref4 ref5">3, 4, 5</xref>
        ]. As the use of
LLM-powered learning systems became prevalent, taking
a closer look at students’ educational behaviors with these
systems became important. Previous educational data
mining work highlights that there are diferent types of features
in the datasets, such as demographic features, performance
features, and activity/engagement features [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In this work,
we collected a comprehensive set of data and conducted a
preliminary analysis of how students interact with a
chatbased LLM tutor to complete a CS1 assignment. Through
a mixed-method analysis of multifactual data, our study
reveals that, while self-eficacy is negatively correlated with
the length of a student’s initial help request, learners exhibit
few diferences in other help-seeking behaviors. Instead,
students share common flaws in their help-seeking
strategies, such as providing insuficient information in the help
request, or failing to locate errors in the LLM hallucination.
These findings highlight some needs for future research
and development of LLM-based educational systems. For
instance, these systems should not only be more
learnercentered and adaptive, but also capable of automatically
recognizing and scafolding the context of learners’ questions,
thereby supporting students from diverse backgrounds more
efectively.
      </p>
      <p>By using multiple methods and data levels to understand</p>
      <p>CEUR</p>
      <p>ceur-ws.org
student help-seeking with a chat-based LLM tutor, this work
aims to contribute to the long line of research on designing
educational tools that are sensitive to learners’ individual
needs and features. By addressing the identified
commonalities and diferences in help-seeking behaviors, we aimed
to inform the development of a more efective,
contextaware LLM-based educational system that can enhance the
learning experience for all students, regardless of their
backgrounds.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related</title>
    </sec>
    <sec id="sec-3">
      <title>Work</title>
      <sec id="sec-3-1">
        <title>2.1. LLM-based Programming Tutors</title>
        <p>
          Prior studies showed that LLM can generate code,
explanations, and conversations [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. After uncovering its
potential, significant efort has been invested in creating and
evaluating LLM-based intelligent programming tutors with
diverse content focus and granularity. However, as LLM
products have shown enhanced flexibility and are more
accessible to students, there are rising concerns about their
over-utilization in computing education [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. Therefore,
recent developments have emphasized incorporating LLM
with intelligent programming tutors in a more
pedagogical way [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. One direction is to provide an LLM-powered
middle-stage code puzzle to support students actively in
completing their programming experience [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. The other
direction is to provide help similar to that of a teaching
assistant to support students in completing programming
exercises but avoid providing direct code [
          <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
          ]. The QA
bot from which we collected data in this paper is from this
direction. In this work, we are mainly interested in
understanding how students interact with the QA bot and
how their help-seeking queries relate to their learning
backgrounds, such as self-eficacy and fluency in English.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>2.2. Granularity of Data in Current LLM-Based Tutoring Data Analysis</title>
        <p>
          The multilevel nature of data in social sciences is
pervasive, however, reporting practices for data aggregation lack
standardization, resulting in considerable variability in the
information and statistics included by authors [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. In the
learning engineering domain, the levels of data have been
categorized based on their granularity from low to high
as: log data (records of students’ usage events, e.g. xAPI
data [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]), inferred data (characteristics or prediction of
behaviors of the data subjects inferred from other data, e.g.
anxiety level [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], student model [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]), aggregate data (the
aggregation of multiple events or student’s data, e.g.
learning curve analysis [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]), program-level data (course or
program-level data used to track or predict the performance
of a group of students during the entire course or program,
e.g. course recommendation data [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]), and profile data
(student’s personal information, such as demographic data
[
          <xref ref-type="bibr" rid="ref18">18</xref>
          ], learning preferences, behavioral data, goals and
aspirations, socio-emotional data, etc.) [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ].
        </p>
        <p>
          To analyze the usage of LLM-based programming tutors,
most of the existing works conducted analysis using log data,
inferred data from surveys, and aggregate data of LLM-tutor
usage; few of them have analysis on profile data on
personalized learning experiences and program-level inferences.
To bridge the gap, we use our system, QuickTA [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ], to
collect the data levels mentioned above, and explore
extensive approaches and aspects of analysis on help-seeking
requests with emphasis on profile data to understand to
what extent learners’ diferences would influence the usage
of LLM-tutors.
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>2.3. Self-Eficacy and Help-Seeking</title>
        <p>
          Self-eficacy refers to individuals’ subjective evaluations of
their abilities to successfully perform an activity [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]. In
computer science education, students’ self-eficacy refers to
their perception of competence to complete CS courses and
ifnish programming tasks [
          <xref ref-type="bibr" rid="ref22 ref23">22, 23</xref>
          ]. Prior research looked
into how specific learning supports might impact student
self-eficacy in CS learning [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ] and whether the advantages
of support mechanisms could benefit students with varying
levels of self-eficacy [
          <xref ref-type="bibr" rid="ref22 ref25">22, 25</xref>
          ]. According to the relationship
between students’ self-eficacy and help-seeking behaviors,
results from previous work are mixed [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ]. Some scholars
found that students with high self-eficacy tend to show high
help-seeking behavior [
          <xref ref-type="bibr" rid="ref27 ref28">27, 28</xref>
          ]. However, others reported
the opposite direction and found that students with a high
sense of self-eficacy avoid seeking help even in times of
need [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ].
        </p>
        <p>As such, the student’s self-eficacy might afect their use
and interaction with an LLM tutor. We collected this type
of data by asking students to self-report their self-eficacy
regarding the assignment topic each time before interacting
with the LLM tutor.</p>
      </sec>
      <sec id="sec-3-4">
        <title>2.4. Impact of English Fluency on Help-Seeking</title>
        <p>
          Previous research has examined user behavior in searching
for information in English as a foreign language [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ]. When
using English to search, non-native speakers identified the
query formulation as the most challenging task. Due to
their relatively low English language proficiency level,
identifying keywords to build a query in a non-native language
is their main dificulty [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ]. Similarly to online search, an
important step in using an LLM-based QA bot is also
formulating an appropriate query [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ]. However, limited work
has been done to investigate the impact of English fluency
on Learner-LLM interactions in the context of a CS course.
        </p>
        <p>Given that our context involved a significant number of
non-English native speakers, this provided us a chance to
investigate the relationship between students’ self-reported
level of English fluency and their interactions with the LLM
tutors.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Methods</title>
      <p>We deployed QuickTA, an LLM-powered chat-based
tutoring system, in a large introduction to computer
programming course. Students were given access to QuickTA when
solving weekly lab assignments. This section describes the
context of the deployment, the design of QuickTA, and the
dataset we used in this analysis.</p>
      <sec id="sec-4-1">
        <title>3.1. Classroom Context and QuickTA</title>
        <p>This study was conducted within an “Introduction to
Computer Programming” course (CS1), ofered at a prominent
research-intensive post-secondary institution in Canada
during the Fall 2023 semester. The course is interdisciplinary,
with the majority of students (typically over 75%) in their
ifrst year of post-secondary study intending to major in
computer science. The 12-week course utilized a flipped
classroom model and included ten assignments (starting in
week 2 and ending in week 11), pre-and post-homework
due weekly, two-term tests (in weeks 6 and 11, respectively),
and a final exam held two weeks after the conclusion of the
regular semester.</p>
        <p>The specific lab assignment we selected for this analysis
focused on the topics of if statements and for loops. At the
end of the assignment introduction, students were given a
link to access QuickTA Figure 1. Students were informed
that QuickTA was designed to help them with the
assignment and that they could access it as often as needed. The
students were also told that this was an experimental tool,
so they should be careful about relying on its responses and
should proactively report any issues.</p>
        <p>
          We used GPT-4 (the most advanced model back in Fall
2023) to power QuickTA in helping students with the
assignment. As shown in previous research, a system-prompted
LLM might be more efective as a tutor [
          <xref ref-type="bibr" rid="ref32">32</xref>
          ]. Therefore,
we designed a system prompt based on Hattie’s feedback
model [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ]. The prompt was tested internally through
multiple iterations with the teaching team before being used
for deployment. The detailed configuration of the model is
described in Appendix A.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Dataset</title>
        <p>At the time of the assignment, there were 1,068 students
enrolled in the class. While QuickTA is available for
every lab activity and exam preparation, the interaction with
QuickTA is optional. In this analysis, we selected student
data for only one lab activity (Lab 4).</p>
        <p>For each lab activity, students were asked to complete
three questions every time before starting a QuickTA
conversation. The definition and average score of each
question are as follows: 1) self-eficacy (M=4.74, SD=1.38): a
self-reported question from a scale of 1 to 7, representing
the student’s confidence on the topic before using QuickTA
(1-not confident at all, 7-the most confident); 2) English
lfuency: (M=4.15, SD=0.88) a self-reported question from a
scale of 1 to 7, representing the student’s English fluency
(1-not fluent at all, 7-the most fluent); and 3) conceptual
knowledge (M=0.60, SD=0.49): a score of a multiple choice
question that tests student’s conceptual knowledge on this
task, indicating how well the student mastered on the topic
before using QuickTA.</p>
        <p>A total of 73 students used QuickTA when completing lab
4 homework, and completed all the required sections. When
using QuickTA, 23 students conducted multiple
conversation sessions (closed and then reopened) with QuickTA. As
we focused on students’ initial help requests with QuickTA,
when answering RQ1 and RQ2, we only kept their 73 initial
help requests and follow-up turn-takings in these initial
conversation sessions. We also noticed that QuickTA sometimes
expressed errors when answering students’ help requests.
Therefore, in RQ3, we looked into how students dealt with
these erroneous answers. A total of 18 hallucinating
responses emerged from all the conversation sessions.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Results</title>
      <p>We applied a mixed analysis approach to analyze our
multiple levels of data. More specifically, we adopted statistical
analysis in 4.1 and 4.2, we applied an LLM model for
thematic coding and content extraction unsupervised learning
models for topic extraction in 4.2, and qualitative expert
evaluation in 4.2 and 4.3.</p>
      <sec id="sec-5-1">
        <title>4.1. RQ1: QuickTA usage with learner feature data</title>
        <p>To understand the relationship between learner-level
features and students’ usage of QuickTA, we investigated the
correlations between learner features with the number of
LDA Result of Students’ Requests
turn-takings (one turn is considered as initiated by the
student and then QuickTA answered) in each conversation
session and the number of words in each initial help request
using correlation analyses. We applied Pearson correlation
when the two variables were continuous and Point-Biserial
Correlation when one of them was dichotomous.
4.2. RQ2: Information coverage in learners’
help requests
A major aspect that influences the length of the help request
would be the number of diferent types of information
covered in the content. For instance, the help requests with
learner’s code or execution messages (normally above 50
words) would be longer than requests with learners’
questions only (usually within 20 words). Therefore, we explored
the similarities and diferences between learners in their
queries’ information coverage.</p>
        <p>
          We first applied Latent Dirichlet Allocation (LDA) in
Figure 2 to explore diferences in frequent topics that learners
are interested in while finding more commonality than
differences in help query content related to learner traits.
Without significant diferences between most-frequent words in
prompts for diferent learners, students often include words
related to lab name (e.g., lab4), programming problem name
or description (e.g., ispalindrome, lowercase), and some
programming syntax (e.g., return, true).
Then we used GPT-4 to label the types of information
covered in each help request to further investigate whether
certain types of learners are more skillful prompt writers. The
information coverage rate can provide another perspective
on users’ prompting behavior. In this analysis, we focused
on whether learners include suficient context information
in their help requests to increase the help-seeking responses’
quality of accuracy and concreteness. We acknowledged
that ”suficient context” can be diferent given diferent
request types. Thus we first defined the types of requests
adopting from the help-seeking model [
          <xref ref-type="bibr" rid="ref34">34</xref>
          ] as requests with
clear motivation: debug-output, debug-syntax, next-step,
create test cases, knowledge-concept, knowledge-procedure,
capability, clarification; and requests with unclear
motivation (Table 3). Two domain experts coded learners’ initial
help request data. These experts identified the potential
components that can be included in each help request and
labeled what are essential components for diferent types
of requests.
        </p>
        <p>After the definition of request types and information
types, we applied a multi-shot prompt engineering
strategy with examples to make GPT-4 label each user request
using the rubric in Table 4. Lastly, each student’s request
is grouped into one help-seeking type, and its information
coverage is visualized as shown in Figure 3.</p>
        <p>The result of diferent learners’ information coverage
indicates no significant diference between learners (Table 5).
A common pattern of all learners is that they tend to miss
essential context for multiple types of help requests. Overall
speaking, only 24.7% of students included all information
components identified as important by researchers, with
particularly low coverage on queries related to debug output
(20%), debug syntax (0%), and next-step (3%). More
specifically, when asking for the next step to do, 63.3% of students
only include a question (e.g., “how do I do the function
reverse sentence”) without mentioning their current code or
problem description.
4.3. RQ3: Learner diferences when reacting
to LLM hallucinations
The previous section reveals the improvement needed in
learners’ help-initiating behaviors, and in this section,
we closely looked at learners’ reactions to hallucinating
QuickTA-generated responses.</p>
        <p>We manually coded all the conversation sessions and
identified 18 hallucinating responses in 8 conversation sessions
from 6 unique students. Of these 8 conversation sessions, 5
were in the initial conversation session, and 3 were in later
conversation sessions. Due to the nature of programming
tasks, students can run the code with tests to validate the
responses. Therefore, in most cases(94%), students are able
to identify the answer that is wrong from the test result.
However, only one student successfully debugged with the
system and reached the correct code state, while other
students failed to identify the source of errors before quitting
the conversation. According to the conversation log, the
successful student already completed the lab assignment
and only wanted to test the capability of our system, thus
this student can accurately identify the cause of error in
LLMś incorrect responses within one round of turn-taking
(e.g., “Why would sorted_string be initialized with the first
character of the string? We would not be sure of its position
until during the sorting process rather than prior”). Other
students who had no access to the correct answer are often
trapped in the same hallucinating answer for an average of
15 rounds of turn-taking with the problem unsolved, which
can potentially cause low learning eficiency and frustration.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Discussion</title>
      <p>In this work, we applied multiple analysis methods with
multifactual data to understand how learners used a chat-based
LLM tutor when completing a CS1 lab assignment. In
general, we found more commonalities than diferences among
Information coverage by learners’ request type
diferent learners regarding their usage, information
coverage, and responses to possible hallucinations. These
commonalities suggest future directions for learner feature data
collection and highlight shared challenges among novices
in both programmers and AI tool users.</p>
      <p>By addressing the three research questions, we found
help requests written by learners with lower self-eficacy
are significantly longer than those from higher self-eficacy
learners. When looking at their actual help requests,
learners with higher self-eficacy were more capable of describing
the problem (e.g., “I want to write a code where it checks the
letter after the separator” ), while lower self-eficacy students
tended to ask more general questions (e.g. “I am trying to
solve is_palindrome_
string”) or no specific questions but mainly provided
information such as the problem statement and their current code.
Aside from the diference between learners with diferent
self-eficacy on the help request formulation, we found no
other diferences in behavior and usage of QuickTA among
learners with varying self-eficacy, language ability, and
conceptual knowledge. This suggests that representing learners
with a more comprehensive set of features may be
necessary to uncover the diferences in their help-seeking.
Additionally, factors beyond learners’ features, such as tool
accessibility to the intended audience, could also influence
system usage. Follow-up surveys or interviews with
students can help identify these blockers and provide insights
into increasing the use rates and improving the eficiency
of help requests.</p>
      <p>Secondly, since many learners could not efectively
initiate a help-seeking request, more design considerations
should be given to this process. Currently, there are two
main approaches to reducing learners’ cognitive load
during help-seeking: 1) Proactive, Context-Aware Systems:
These systems automatically incorporate all relevant
information into the prompt and regulate learners’ help-seeking
behaviors. This allows learners to focus on their tasks
without worrying about the help-seeking process, thereby
reducing the likelihood of errors and time wasted. 2)
Scaffolded Query Formation System: This approach uses
menu-based selections or templates to assist learners in
forming their queries. By providing structured support,
learners can gradually develop meta-cognitive skills and
become more adept at the help-seeking process through
practice. A potential next step to facilitate the initiation of
help-seeking requests could involve conducting controlled
experiments to evaluate the advantages and disadvantages
of these two systems in terms of their impact on learning
and help-seeking behaviors.</p>
      <p>In response to LLM hallucinations, most learners were
able to diferentiate incorrect responses from correct ones
by running the code and comparing the outputs. Although
there is a lean chance in our setting that learners submit the
incorrect answer without awareness, they did not know how
to proceed either. Only one student successfully located and
corrected the error in the responses. The conversation log
indicates that this student had already completed the lab
assignment before querying the LLM. Despite the relatively
small sample size, this finding suggests that hallucinations
can impact a wide range of learners, and a higher level of
prior knowledge or more advanced debugging skills may
be necessary to identify and resolve hallucinations at the
programming problem level.</p>
    </sec>
    <sec id="sec-7">
      <title>6. Conclusion</title>
      <p>This analysis explored the use of multiple methods and data
levels to understand student help-seeking with a chat-based
LLM tutor. Our findings indicated that students with lower
self-eficacy tend to write longer help requests compared to
those with higher self-eficacy. Beyond this, learners
generally exhibited more similarities than diferences in their help
requests, such as providing insuficient context information
when asking for assistance, or not being able to identify the
exact error when dealing with LLM hallucination. These
insights highlight the need for future LLM-powered learning
systems to better support learners with learners’ features
from more dimensions and better scafolding on the help
request initiation and hallucination handling process. By
addressing these, we can enhance the efectiveness of LLM
tutors and improve the overall learning experience for
students from various backgrounds.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>We acknowledge the financial support of: the Learning &amp;
Education Advancement Fund from the Ofice of the
ViceProvost, Innovations in Undergraduate Education,
University of Toronto, Microsoft’s Accelerating Foundation Model
Research program, the Natural Sciences and Engineering
Research Council of Canada grant #RGPIN-2024-04348, and
the Defense Advanced Research Projects Agency award for
AI Tools for Adult Learning awarded to QuickTA.</p>
    </sec>
    <sec id="sec-9">
      <title>A. LLM Tutor Specification</title>
      <p>• model version: OpenAI GPT-4 model
• dates of use: Sept-Dec 2023</p>
      <sec id="sec-9-1">
        <title>Configuration Settings:</title>
        <p>• temperature: 0
• max tokens: 300
• top-p: 1
• frequency penalty: 0
• presence penalty: 0.6</p>
      </sec>
      <sec id="sec-9-2">
        <title>Prompt Design : Figure 4.</title>
        <p>Under no circumstances should you provide direct code
snippets. You are an AI tutor, called QuickTA, for CSC-X Lab
4, focused on assisting with programming tasks on loops,
conditional statements, and string manipulations without
providing direct solutions. Your role is to clarify doubts,
provide hints, and ofer feedback based on the lab guidelines.
Here are the details of Lab 4:
Lab 4: CSC-X - if statements &amp; for loops
Objective: Apply for loops and if statements to manipulate
strings.</p>
        <p>Lab Tasks:
• Implement two functions in lab4.py as per the
docstrings.
• Reuse or modify the ”is_palindrome” function
from Lab 3 to write is_palindrome_string and
reverse_sentence.</p>
        <p>• Test your code with 3-5 test cases per function.
Lab Restrictions:
• No lists or list methods.</p>
        <p>• No try-except statements.</p>
        <p>Your responses should be structured as follows:
1. Understand the student’s strategy for tackling the
functions in lab4.py.
2. Provide hints or clarifications without giving direct
answers.
3. Encourage testing and understanding of the code, and
celebrate their moments of clarity.</p>
        <p>Note: Use simple words avoiding technical jargons, and utilize
real-world examples to illustrate concepts. Do not provide
direct answers or the exact code to solve the lab tasks. Engage
only in programming and assignment-related discussions. No
role-playing or of-topic engagements are allowed.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Meyer</surname>
          </string-name>
          , T. Jansen,
          <string-name>
            <given-names>R.</given-names>
            <surname>Schiller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. W.</given-names>
            <surname>Liebenow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Steinbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Horbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Fleckenstein</surname>
          </string-name>
          ,
          <article-title>Using llms to bring evidence-based feedback into the classroom: Ai-generated feedback increases secondary students' text revision, motivation, and positive emotions</article-title>
          ,
          <source>Computers and Education: Artificial Intelligence</source>
          <volume>6</volume>
          (
          <year>2024</year>
          )
          <fpage>100199</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Milano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>McGrane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Leonelli</surname>
          </string-name>
          ,
          <article-title>Large language models challenge the future of higher education</article-title>
          ,
          <source>Nature Machine Intelligence</source>
          <volume>5</volume>
          (
          <year>2023</year>
          )
          <fpage>333</fpage>
          -
          <lpage>334</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kazemitabaar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Hou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Henley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. J.</given-names>
            <surname>Ericson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Weintrop</surname>
          </string-name>
          , T. Grossman,
          <article-title>How novices use llmbased code generators to solve cs1 coding tasks in a self-paced learning environment</article-title>
          ,
          <source>arXiv preprint arXiv:2309.14049</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>H.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Williams</surname>
          </string-name>
          ,
          <article-title>Exploring the potential of chatbots to provide mental well-being support for computer science students</article-title>
          ,
          <source>in: Proceedings of the 54th ACM Technical Symposium on Computer Science Education V. 2</source>
          ,
          <issue>2022</issue>
          , pp.
          <fpage>1339</fpage>
          -
          <lpage>1339</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Hou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Stamper</surname>
          </string-name>
          ,
          <article-title>Exploring how multiple levels of gpt-generated programming hints support or disappoint novices</article-title>
          ,
          <source>in: Extended Abstracts of the CHI Conference on Human Factors in Computing Systems</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>L.</given-names>
            <surname>Cohausz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tschalzev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bartelt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Stuckenschmidt</surname>
          </string-name>
          ,
          <article-title>Investigating the importance of demographic features for edm-predictions</article-title>
          .,
          <source>International Educational Data Mining Society</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>Cambaz</surname>
          </string-name>
          ,
          <string-name>
            <surname>X. Zhang,</surname>
          </string-name>
          <article-title>Use of ai-driven code generation models in teaching and learning programming: a systematic literature review</article-title>
          ,
          <source>in: Proceedings of the 55th ACM Technical Symposium on Computer Science Education V. 1</source>
          ,
          <issue>2024</issue>
          , pp.
          <fpage>172</fpage>
          -
          <lpage>178</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Stamper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Hou</surname>
          </string-name>
          ,
          <article-title>Enhancing llm-based feedback: Insights from intelligent tutoring systems and the learning sciences</article-title>
          ,
          <source>arXiv preprint arXiv:2405.04645</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>X.</given-names>
            <surname>Hou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. J.</given-names>
            <surname>Ericson</surname>
          </string-name>
          , Codetailor:
          <article-title>Llmpowered personalized parsons puzzles for engaging support while learning programming</article-title>
          ,
          <source>arXiv preprint arXiv:2401.12125</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lifiton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. E.</given-names>
            <surname>Sheese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Savelka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Denny</surname>
          </string-name>
          , Codehelp:
          <article-title>Using large language models with guardrails for scalable support in programming classes</article-title>
          ,
          <source>in: Proceedings of the 23rd Koli Calling International Conference on Computing Education Research</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>11</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kazemitabaar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Z.</given-names>
            <surname>Henley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Denny</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Craig</surname>
          </string-name>
          , T. Grossman, Codeaid:
          <article-title>Evaluating a classroom deployment of an llm-based programming assistant that balances student and educator needs</article-title>
          ,
          <source>in: Proceedings of the CHI Conference on Human Factors in Computing Systems</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>20</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>J. M. LeBreton</surname>
            ,
            <given-names>A. N.</given-names>
          </string-name>
          <string-name>
            <surname>Moeller</surname>
            ,
            <given-names>J. L.</given-names>
          </string-name>
          <string-name>
            <surname>Wittmer</surname>
          </string-name>
          ,
          <article-title>Data aggregation in multilevel research: Best practice recommendations and tools for moving forward</article-title>
          ,
          <source>Journal of Business and Psychology</source>
          <volume>38</volume>
          (
          <year>2023</year>
          )
          <fpage>239</fpage>
          -
          <lpage>258</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.</given-names>
            <surname>Manso-Vázquez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Caeiro-Rodríguez</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          <article-title>LlamasNistal, An xapi application profile to monitor selfregulated learning strategies</article-title>
          ,
          <source>IEEE Access 6</source>
          (
          <year>2018</year>
          )
          <fpage>42467</fpage>
          -
          <lpage>42481</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Baker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hutt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Bosch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ocumpaugh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Biswas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Paquette</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Andres</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Nasiar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Munshi</surname>
          </string-name>
          ,
          <article-title>Detectordriven classroom interviewing: focusing qualitative researcher time by selecting cases in situ, Educational technology research and development (</article-title>
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>23</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>K. R.</given-names>
            <surname>Koedinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. A.</given-names>
            <surname>McLaughlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Stamper</surname>
          </string-name>
          ,
          <source>Automated student model improvement</source>
          .,
          <source>International Educational Data Mining Society</source>
          (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>K. R.</given-names>
            <surname>Koedinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. F.</given-names>
            <surname>Carvalho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. A.</given-names>
            <surname>McLaughlin</surname>
          </string-name>
          ,
          <article-title>An astonishing regularity in student learning rate</article-title>
          ,
          <source>Proceedings of the National Academy of Sciences</source>
          <volume>120</volume>
          (
          <year>2023</year>
          )
          <article-title>e2221311120</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z. A.</given-names>
            <surname>Pardos</surname>
          </string-name>
          ,
          <article-title>Extracting course similarity signal using subword embeddings</article-title>
          ,
          <source>in: Proceedings of the 14th Learning Analytics and Knowledge Conference</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>857</fpage>
          -
          <lpage>863</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>L.</given-names>
            <surname>Green</surname>
          </string-name>
          , G. Celkan,
          <article-title>Student demographic characteristics and how they relate to student achievement</article-title>
          ,
          <source>Procedia-Social and Behavioral Sciences</source>
          <volume>15</volume>
          (
          <year>2011</year>
          )
          <fpage>341</fpage>
          -
          <lpage>345</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>J.</given-names>
            <surname>Goodell</surname>
          </string-name>
          , J. Kolodner, Learning engineering toolkit:
          <article-title>Evidence-based practices from the learning sciences, instructional design, and beyond</article-title>
          , Taylor &amp; Francis,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>H.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Lawson</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Musabirov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Luo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Williams</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Raferty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Stamper</surname>
          </string-name>
          , et al.,
          <article-title>Supporting self-reflection at scale with large language models: Insights from randomized field experiments in classrooms</article-title>
          , arXiv e-prints (
          <year>2024</year>
          )
          <article-title>arXiv2406</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bandura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. H.</given-names>
            <surname>Walters</surname>
          </string-name>
          ,
          <article-title>Social learning theory</article-title>
          , volume
          <volume>1</volume>
          , Englewood clifs Prentice Hall,
          <year>1977</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>C.-Y. Tsai</surname>
          </string-name>
          ,
          <article-title>Improving students' understanding of basic programming concepts through visual programming language: The role of self-eficacy</article-title>
          ,
          <source>Computers in Human Behavior</source>
          <volume>95</volume>
          (
          <year>2019</year>
          )
          <fpage>224</fpage>
          -
          <lpage>232</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <surname>J. B. Wiggins</surname>
            ,
            <given-names>J. F.</given-names>
          </string-name>
          <string-name>
            <surname>Grafsgaard</surname>
            ,
            <given-names>K. E.</given-names>
          </string-name>
          <string-name>
            <surname>Boyer</surname>
            ,
            <given-names>E. N.</given-names>
          </string-name>
          <string-name>
            <surname>Wiebe</surname>
            ,
            <given-names>J. C.</given-names>
          </string-name>
          <string-name>
            <surname>Lester</surname>
          </string-name>
          ,
          <article-title>Do you think you can? the influence of student self-eficacy on the efectiveness of tutorial dialogue for computer science</article-title>
          ,
          <source>International Journal of Artificial Intelligence in Education</source>
          <volume>27</volume>
          (
          <year>2017</year>
          )
          <fpage>130</fpage>
          -
          <lpage>153</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>D.</given-names>
            <surname>Zingaro</surname>
          </string-name>
          ,
          <article-title>Peer instruction contributes to self-eficacy in cs1</article-title>
          ,
          <source>in: Proceedings of the 45th ACM technical symposium on Computer Science Education</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>373</fpage>
          -
          <lpage>378</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>X.</given-names>
            <surname>Hou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. J.</given-names>
            <surname>Ericson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Understanding the efects of using parsons problems to scafold code writing for students with varying cs self-eficacy levels</article-title>
          ,
          <source>in: Proceedings of the 23rd Koli Calling International Conference on Computing Education Research</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Williams</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Takaku</surname>
          </string-name>
          ,
          <article-title>Help seeking, self-eficacy, and writing performance among college students</article-title>
          ,
          <source>Journal of writing research 3</source>
          (
          <year>2011</year>
          )
          <fpage>1</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>C. X.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. P.</given-names>
            <surname>Ang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Klassen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. S.</given-names>
            <surname>Yeo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. Y.</given-names>
            <surname>Wong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. S.</given-names>
            <surname>Huan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. H.</given-names>
            <surname>Chong</surname>
          </string-name>
          ,
          <article-title>Correlates of academic procrastination and students' grade goals</article-title>
          ,
          <source>Current Psychology</source>
          <volume>27</volume>
          (
          <year>2008</year>
          )
          <fpage>135</fpage>
          -
          <lpage>144</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>B. C.</given-names>
            <surname>Nelson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Ketelhut</surname>
          </string-name>
          ,
          <article-title>Exploring embedded guidance and self-eficacy in educational multi-user virtual environments</article-title>
          ,
          <source>International Journal of ComputerSupported Collaborative Learning</source>
          <volume>3</volume>
          (
          <year>2008</year>
          )
          <fpage>413</fpage>
          -
          <lpage>427</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <surname>A. M. Ryan</surname>
            ,
            <given-names>P. R.</given-names>
          </string-name>
          <string-name>
            <surname>Pintrich</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Midgley</surname>
          </string-name>
          ,
          <article-title>Avoiding seeking help in the classroom: Who and why?</article-title>
          ,
          <source>Educational Psychology Review</source>
          <volume>13</volume>
          (
          <year>2001</year>
          )
          <fpage>93</fpage>
          -
          <lpage>114</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>P.</given-names>
            <surname>Chu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Komlodi</surname>
          </string-name>
          , G. Rózsa,
          <article-title>Online search in english as a non-native language</article-title>
          ,
          <source>Proceedings of the Association for Information Science and Technology</source>
          <volume>52</volume>
          (
          <year>2015</year>
          )
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>S. I.</given-names>
            <surname>Hwang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Lim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. W.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Matsui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Iguchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hiraki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ahn</surname>
          </string-name>
          ,
          <article-title>Is chatgpt a “fire of prometheus” for non-native english-speaking researchers in academic writing?</article-title>
          ,
          <source>Korean Journal of Radiology</source>
          <volume>24</volume>
          (
          <year>2023</year>
          )
          <fpage>952</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>H.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Rothschild</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. G.</given-names>
            <surname>Goldstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Hofman</surname>
          </string-name>
          ,
          <article-title>Math education with large language models: Peril or promise?</article-title>
          ,
          <source>Available at SSRN</source>
          <volume>4641653</volume>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>J.</given-names>
            <surname>Hattie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Timperley</surname>
          </string-name>
          ,
          <article-title>The power of feedback</article-title>
          ,
          <source>Review of educational research 77</source>
          (
          <year>2007</year>
          )
          <fpage>81</fpage>
          -
          <lpage>112</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>I.</given-names>
            <surname>Roll</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Aleven</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. M.</given-names>
            <surname>McLaren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. R.</given-names>
            <surname>Koedinger</surname>
          </string-name>
          ,
          <article-title>Improving students' help-seeking skills using metacognitive feedback in an intelligent tutoring system</article-title>
          ,
          <source>Learning and instruction 21</source>
          (
          <year>2011</year>
          )
          <fpage>267</fpage>
          -
          <lpage>280</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>