<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Investigating the Impact and Student Perceptions of Guided Parsons Problems for Learning Logic with Subgoals</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sutapa Dey Tithi</string-name>
          <email>stithi@ncsu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xiaoyi Tian</string-name>
          <email>xtian9@ncsu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Min Chi</string-name>
          <email>mchi@ncsu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tiffany</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>North Carolina State University</institution>
        </aff>
      </contrib-group>
      <conference>
        <conf-name>CSEDM'25: 9th Educational Data Mining in Computer Science Education (CSEDM) Workshop</conf-name>
      </conference>
      <kwd-group>
        <kwd>Intelligent Tutoring Systems</kwd>
        <kwd>Parsons Problems</kwd>
        <kwd>Subgoal Scaffolding</kwd>
        <kwd>Logic Education</kwd>
      </kwd-group>
      <abstract>
        <p>Parsons problems (PPs) have shown promise in structured problem solving by providing scaffolding that decomposes the problem and requires learners to reconstruct the solution. However, some students face difficulties when first learning with PPs or when solving more complex Parsons problems. This study introduces Guided Parsons problems (GPPs) designed to provide step-specific hints and improve learning outcomes in an intelligent logic tutor. In a controlled experiment with 76 participants, GPP students achieved significantly higher accuracy of rule application in both level-end tests and post-tests, with the strongest gains among students with lower prior knowledge. GPP students initially spent more time in training (1.52 vs. 0.81 hours) but required less time for post-tests, indicating improved problem solving efficiency. Our thematic analysis of GPP student self-explanations revealed task decomposition, better rule understanding, and reduced difficulty as key themes, while some students felt the structured nature of GPPs restricted their own way of reasoning. These findings reinforce that GPPs can effectively combine the benefits of worked examples and problem solving practice, but could be further improved by individual adaptation.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Parsons problems have emerged as a promising scaffold for teaching structured problem solving in
logic education, enabling learners to reconstruct jumbled proof steps into valid solutions while
reducing cognitive load [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. While Parsons problems broken into smaller sub-problems or subgoals can
effectively train problem solving skills at a low difficulty level, they often pose challenges when students
first encounter them or when the proof structure is complex. The open-ended nature of Parsons problems can
make it difficult for these learners to determine which rules to apply and how to connect logical steps
into well-structured arguments [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. These challenges reflect broader limitations in teaching problem
solving techniques. Worked examples often lead to passive engagement by not clearly explaining the
reasoning behind each step [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], which can make it difficult for students to understand why certain choices
were made. Conversely, unstructured problem solving can place high cognitive demands on students as
they try to construct multi-step proofs [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>To address these gaps, we introduce Guided Parsons problems (GPPs), a new problem solving approach
that augments subgoal-oriented Parsons problems by providing step-specific hints in an intelligent logic
tutor. In a GPP, proofs are divided into chunks or “subgoals”: data-driven groupings of
logic statements that represent key logical units, with embedded contextual hints that clarify the relevant
rule application (e.g., “Apply Simplification here to isolate ¬P.”). We also designed a self-explanation
module to understand students’ perceptions of GPPs. After each GPP, students described how the GPP
subgoals helped them. GPPs were designed to promote more active student engagement in problem
solving during the guided practice of partially-worked examples.</p>
      <p>We deployed our tutor with GPPs in an undergraduate classroom of CS majors and conducted a
controlled experiment. In the controlled experiment, we implemented two training conditions: 1) the
Control group, who received worked example (WE) and problem solving (PS) logic-proof construction
problems, and 2) the GPP group, who received Guided Parsons problems (GPPs) along with PS. We used
a mixed-methods approach to analyze the impact of GPPs on learning outcomes quantitatively
and student perceptions qualitatively. In this study, we investigate the following research questions:
• RQ1: What is the impact of Guided Parsons problems (GPPs) on student performance and learning
outcomes?
• RQ2: To what extent does student proficiency level moderate the relationship between GPPs
and student learning outcomes?
• RQ3: What common themes emerge from students’ self-explanations on their learning
experiences with GPPs?</p>
      <p>∗First Author (led and carried out most of the work)</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>
        GPPs build on the principles of Cognitive Load Theory [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and scaffolded problem solving frameworks
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Sweller et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] suggested three types of cognitive load: intrinsic (inherent to the material and
may vary based on prior knowledge), extraneous (unnecessary processing, may vary based on how
information is presented), and germane (productive mental efort). They argued that learning outcomes
are optimized when the intrinsic load is managed, the extraneous load is minimized, and the germane
load is promoted.
      </p>
      <p>
        Worked examples have been shown to reduce the intrinsic cognitive load and improve learning [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
However, Nievelstein et al. found that worked examples may not be beneficial for students with high
prior knowledge when problems are structured [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. On the other hand, Renkl et al. demonstrated that
worked examples are most effective when they provide instructional explanations or rationales for the
solution steps [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        Parsons problems, which can be considered as partially worked examples, require students to construct
a solution from a given set of jumbled solution steps [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. In programming education, Parsons problems
have been extensively explored and found to improve students’ code writing abilities [
        <xref ref-type="bibr" rid="ref9 ref10 ref11 ref12">9, 10, 11, 12</xref>
        ].
Poulsen et al. showed that Parsons problems reduced the difficulty of constructing mathematical
proofs [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Understanding high-level contextual significance [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and subgoal labels [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] can help
students solve Parsons problems and improve their learning outcomes. Shabrina et al. demonstrated
that data-driven, subgoal-oriented Parsons problems can enhance students’ subgoaling skills in solving
propositional logic proofs [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. However, they also found that students struggle with Parsons problems
when they first encounter this type of structured or chunked problem or when the connections among
different parts of the problem are complex. These results suggest that the design of Parsons problems
and their support have important implications for learning.
      </p>
      <p>
        In this study, we explore GPPs, a new graphical representation of Parsons problems with step-specific
hints attached, to improve students’ problem solving skills in the context of logic-proof problems. GPPs
decompose the proof structure into chunks or subgoals that group statements into logically meaningful
units, aligning with Renkl’s concept of “meaningful building blocks” [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Renkl et al. also emphasized
the importance of reflecting on the general aspects of specific problem solutions for transfer learning.
Margulieux et al. showed that learners’ explanations of the problem solving process can lead to better
problem solving performance [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. To understand the impact of students’ self-explaining problem
subgoals, we incorporate a self-explanation module in GPPs. Additionally, we integrate step-specific
hints into GPPs to address the “rationale gap” identified in traditional worked examples [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Thus, our
intervention is designed to maintain low intrinsic load through subgoals and step-specific scaffolding
while facilitating active problem solving participation, a balance that research suggests optimizes
cognitive engagement [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Tutor Context</title>
      <p>Our logic tutor teaches how to construct propositional logic proofs where a set of given premises and a
conclusion are presented as visual nodes for each problem in a graphical representation, as shown in
Figure 1. Students iteratively derive new logic statement nodes to complete each proof. A new logic
statement node can be derived working forwards by clicking on 1-2 parent nodes and a logic rule, or
working backwards by clicking on a node and hypothesizing a rule and parent nodes to justify it.</p>
      <p>We measure students’ prior competency based on how they solve two pretest problems. Next, the
training session consists of five ordered levels of increasing difficulty, and each level consists of four
problems. The last problem in each training level is a level-end test problem to measure how the student
learns at that level. The posttest level (level 7) consists of six problems. During the pretest, training
level-end test, and posttest problems, the tutor offers no help. For each problem, the students receive a
score between 0 and 100 based on efficient proof construction, with higher scores corresponding to
attempts with smaller solution size, higher accuracy of rule application, and shorter time.</p>
      <p>Based on the intervention, the tutor presents three types of problems during training: worked
examples (WEs), problem solving (PS), and guided Parsons problems (GPPs).</p>
      <p>Worked examples (WEs) are solved by the tutor step-by-step as the students click on the next step (&gt;)
button (Figure 1b). On the other hand, PS problems require students to derive all the steps (nodes) of a
proof and provide a justification for each step using logic rules (edges) applied to parent logic statement
nodes (Figure 1c). Along with WE and PS problem types, our intervention presents a new problem
type, Guided Parsons problem (GPP), where a partially solved proof is presented, and students must
complete it by justifying a few of the nodes by selecting a rule and parent nodes to derive them.</p>
      <p>
        A clustering algorithm was used on previous years’ student interaction log data to extract the most
commonly related steps as intermediate goals/subgoals, and these subgoals were verified by an expert
instructor to be important in solving each tutor problem [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The most commonly related steps are
presented together as a chunk on the screen. Figure 1a shows a screenshot of the initial presentation
of a guided Parsons problem (GPP). At the top of the screen, the problem’s given statement nodes
1, 2, and 3 are shown in purple. The goal is to derive the conclusion (the purple node labeled with
C:  ∨  ), and the complete proof is a connected graph with edges from the givens to the conclusion.
Generally, to solve a logic proof in the tutor, students derive new nodes by clicking on given statement nodes
and a rule, until the conclusion is derived. A derived statement node is said to be justified when it has
arrows from its parent nodes to the derived node, labeled with the rule that justifies the statement.
For example, in Figure 1a, the node 2.C for statement J is a parent node with the Addition (Add) rule
to derive the conclusion C:  ∨  , illustrated with an arrow from node 2.C to C, the conclusion.
Each GPP provides students with all the statement nodes needed to complete a proof, but students
must add a few justifications to connect all the nodes to one another with missing edges for rules. The
nodes without incoming edges are unjustified. GPPs guide students to justify each unjustified node by
specifying the rule used to derive it. In Figure 1a, statement 2.1: − can be derived from 1.C:  ∧ −
using Simplification. GPPs guide students with a hint, as shown in Figure 1a, to derive the statement.
To complete this step, students click on the yellow question mark above 2.1, choose the rule Simp, and
click on statement 1.C to show that there should be an edge from 1.C to 2.1. GPPs are divided into
chunks, where important subgoals in the problem are shown in light blue/cyan, grouped with the nodes
used to derive them. For example, in this problem there are two chunks, 1 and 2, with subgoals 1.C and
2.C respectively, that are needed to complete the problem. The tutor guides students using popup hints with
instructions to work backwards from the conclusion to connect to node 2.C, then connect 2.1 to chunk
1’s conclusion 1.C, and then 1.1 to the givens.
      </p>
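The justification mechanics described above can be sketched in a few lines. This is a minimal illustration under our own assumptions: the `Node` class, the `&`-delimited statement strings, and the `justify_by_simplification` helper are all hypothetical, not the tutor's actual implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    statement: str                                 # e.g. "A & ~B"
    parents: list = field(default_factory=list)    # incoming edges (parent nodes)
    rule: str = ""                                 # rule labeling those edges

def justify_by_simplification(child: Node, parent: Node) -> bool:
    """Justify `child` if it is a conjunct of `parent`.

    Simplification: from P & Q, derive P (or Q).
    """
    if "&" not in parent.statement:
        return False                               # nothing to simplify
    conjuncts = [s.strip() for s in parent.statement.split("&")]
    if child.statement.strip() in conjuncts:
        child.parents.append(parent)               # draw the edge parent -> child
        child.rule = "Simp"                        # label it with the rule
        return True
    return False

given = Node("A & ~B")      # a justified node, analogous to 1.C
target = Node("~B")         # an unjustified node, analogous to 2.1
justify_by_simplification(target, given)
print(target.rule)          # Simp
```

A full tutor would parse statements into syntax trees rather than split strings, but the same edge-plus-rule bookkeeping is what makes a node "justified."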
    </sec>
    <sec id="sec-4">
      <title>4. Methods</title>
      <sec id="sec-4-1">
        <title>4.1. Experimental Conditions</title>
        <p>We designed two training conditions in the logic tutor:</p>
        <p>[Figure 1: (a) GPP: Guided Parsons Problem; (b) WE: Worked Example; (c) PS: Problem Solving]</p>
        <p>• Control: Students assigned to the Control condition received PS or WE (selected randomly)
during training.</p>
        <p>• GPP: Students assigned to the GPP condition received PS or GPP (selected randomly) during training.</p>
        <p>The tutor was deployed with the two training conditions in an undergraduate Discrete Mathematics
course at a public research university in the United States in Spring 2024. We did not collect
course-specific demographics, but discrete math is a mandatory course for all CS
majors. Therefore, as an approximation, we report the demographics of the 2021-22 graduating class of
CS majors: a gender composition of 83% men and 17% women, and a race/ethnicity composition of 58% White,
18.5% Asian, 3% Hispanic/Latino, 2% Black/African American, and 9% other races, with the remaining 9.5%
having international student status, for whom race/ethnicity information was not available. This study
was approved by the university IRB, and only authorized researchers could access the data collected
from the participants.</p>
        <p>Each participating student in that course was assigned to one of the two training conditions after
they completed the pretest problems. We used stratified random sampling, assigning students to groups
after level 1: students were assigned randomly while balancing lower and higher level 1 scores
across all conditions implemented that semester. We compare only students who completed all
7 tutor levels, with 30 students in the Control group and 46 students in the GPP group.</p>
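The stratified assignment described above can be sketched as follows. This is an illustrative reconstruction, not the study's actual code: the function name, the two-stratum median split, and the alternating within-stratum assignment are our own assumptions.

```python
import random

def assign_conditions(level1_scores: dict, seed: int = 0) -> dict:
    """Stratified random assignment: split students into lower/higher
    strata by level-1 score, then assign within each stratum so that
    both conditions see a similar score distribution."""
    rng = random.Random(seed)
    ranked = sorted(level1_scores, key=level1_scores.get)
    half = len(ranked) // 2
    strata = [ranked[:half], ranked[half:]]        # lower / higher scorers
    assignment = {}
    for stratum in strata:
        rng.shuffle(stratum)                       # random order within stratum
        for i, student in enumerate(stratum):
            assignment[student] = "Control" if i % 2 == 0 else "GPP"
    return assignment

scores = {"s1": 40, "s2": 85, "s3": 62, "s4": 77, "s5": 55, "s6": 90}
print(assign_conditions(scores))
```

With odd-sized strata the counts will not be perfectly equal, which mirrors the study's unequal group sizes (30 vs. 46).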
        <p>
          Data collection and analysis. For both training conditions, we analyzed their interaction logs to
measure their learning and performance. For students in the GPP group, we collected students’
self-explanation responses after solving each GPP problem (e.g., “How did the subgoals ( ∧ ¬ ),  help you
derive the conclusion?”). Research shows that self-explanation promotes learning [
          <xref ref-type="bibr" rid="ref18 ref19">18, 19</xref>
          ]. Answering
the self-explanation question was mandatory, and a total of 326 explanations were collected from GPP
students. We hoped that self-explanations would help students use the GPPs to better understand the
structure of logic proofs and how a directed strategy guides experts to build subgoals that link the givens
to the conclusions. Therefore, we conducted a thematic analysis of the self-explanations to determine
whether and how students were learning about subgoals through GPPs (RQ3). The themes were derived
through an inductive coding process [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] following established thematic analysis methodology [21, 22].
Two researchers independently coded a subset of explanations (initial agreement: 91.9%), and then they
discussed any code discrepancies. After reaching consensus on the codes, one researcher coded the
remaining self-explanations.
        </p>
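The reported inter-rater agreement is a simple percent agreement over the jointly coded subset; a toy illustration (the code labels and data here are hypothetical, not the study's codebook):

```python
# Two coders label the same subset of explanations; agreement is the
# share of items with identical codes.
codes_a = ["decomposition", "rules", "decomposition", "difficulty", "rules", "decomposition"]
codes_b = ["decomposition", "rules", "decomposition", "rules", "rules", "decomposition"]

agreement = sum(a == b for a, b in zip(codes_a, codes_b)) / len(codes_a)
print(f"{agreement:.1%}")   # 83.3%
```

Disagreements (here, item 4) are then discussed to consensus before one researcher codes the remainder, as described above.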
        <p>Performance Metrics. A student’s problem score is a combination of normalized metrics for the
problem completion time, total number of steps, and rule application accuracy on a single
problem, which ranks a student based on how fast, efficient, and accurate they are. Total steps in
a problem include any attempt a student makes to derive a new node (including mistakes). Rule
application accuracy is the total number of correct rule applications divided by all rule application
attempts.</p>
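The rule application accuracy metric can be expressed directly; a small sketch (the function name and the zero-attempt guard are our own conventions, not the tutor's code):

```python
def rule_application_accuracy(correct: int, attempts: int) -> float:
    """Correct rule applications divided by all rule application attempts."""
    if attempts == 0:
        return 0.0                      # convention for "no attempts yet"
    return correct / attempts

# e.g. 19 correct applications out of 25 attempts
print(rule_application_accuracy(19, 25))   # 0.76
```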
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>We report the results by research question: RQ1 on student performance and learning, RQ2 on the
moderating role of prior proficiency, and RQ3 on GPP self-explanation themes.</p>
      <sec id="sec-5-1">
        <title>5.1. RQ1: Student Performance and Learning</title>
        <p>To evaluate the impact of GPPs on learning outcomes, we analyzed students’ performance scores and
normalized learning gains (NLG) across training conditions. Our analyses focused on performance
metrics derived from the training level-end test problems (2.8, 3.8, 4.8, 5.8, and 6.8) and the posttest
problems (7.1–7.6). Students solved these problems independently without any tutor help. We performed
a combination of statistical comparisons, including mixed-effects regression and Mann-Whitney U tests,
to account for the non-normal nature of our data and problem-specific variability.</p>
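For readers unfamiliar with the rank-based test used here, the Mann-Whitney U statistic can be computed by hand. This is a self-contained sketch with toy data (not the study's scores); ties receive averaged ranks, as in the standard formulation:

```python
def mann_whitney_u(xs, ys):
    """U statistic for group xs vs. ys, with averaged ranks for ties."""
    combined = sorted([(v, "x") for v in xs] + [(v, "y") for v in ys])
    rank = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1                       # extend over a run of tied values
        for k in range(i, j):
            rank[k] = (i + j + 1) / 2.0  # average of 1-based ranks i+1..j
        i = j
    r_x = sum(r for r, (_, g) in zip(rank, combined) if g == "x")
    return r_x - len(xs) * (len(xs) + 1) / 2.0

control = [55, 60, 48, 72]               # toy scores, not the study's data
gpp = [63, 70, 59, 75]
print(mann_whitney_u(control, gpp))      # 4.0
```

In practice one would use a library routine (e.g., `scipy.stats.mannwhitneyu`) to also obtain the p-value; the sketch shows only the statistic itself.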
        <p>Problem Score &amp; NLG. In the pretest problems, there were no significant differences in problem
scores across the two conditions (Control (mean) = 62.8, GPP (mean) = 63.8, p = 0.9). We conducted
mixed-effects regression analyses to examine the relationship between training conditions and problem
score, with problem IDs as random effects and training conditions as fixed effects. For training
level-end test problems, results indicated a marginally significant association (p = 0.055) between training
conditions and problem score (Control (mean) = 57.9, GPP (mean) = 62.8). Problem scores were not
significantly different in posttest problems across conditions (p = 0.46) (Control (mean) = 70.4, GPP
(mean) = 72.6) (Table 1).</p>
        <p>[Figure 2: (a) Average number of incorrect steps across conditions in the training level-end test and
posttest problems. (b) Average number of backward attempts across conditions in the training level-end
test and posttest problems.]</p>
        <p>
          To identify the effectiveness of the training conditions in promoting learning, we analyzed students’
normalized learning gain (NLG) across the two training conditions. Normalized learning gain is
calculated using the average problem scores on the pretest and posttest problems with the equation [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]:
NLG = (posttest score − pretest score) / √(100 − pretest score)   (1)
        </p>
      <p>Note that we normalize NLG scores between -1 and 1. In our analyses, NLG may be negative. These
negative values are due to the much higher difficulty of the posttest section compared to the pretest
section and do not indicate negative learning. Table 1 shows there was
no significant difference in NLG between the two conditions, although the GPP group yielded higher
positive NLG rates (78% in GPP vs. 73% in Control), moving their NLG toward positive more often.</p>
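Equation (1) translates directly to code; a small sketch (the guard for a perfect pretest score is our own convention, and scores are on the tutor's 0-100 scale):

```python
import math

def normalized_learning_gain(pretest: float, posttest: float) -> float:
    """NLG per Eq. (1): (posttest - pretest) / sqrt(100 - pretest)."""
    if pretest >= 100:
        return 0.0                      # a perfect pretest leaves no room to gain
    return (posttest - pretest) / math.sqrt(100 - pretest)

# Illustrative call with the Control group's mean pretest/posttest scores
print(normalized_learning_gain(62.8, 70.4))
```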
      <p>Time. The training time for the GPP group was significantly higher than for the Control group (GPP
(mean): 1.52 hrs vs. Control (mean): 0.81 hrs). However, in the training level-end test problems as well as
the posttest problems, there was no significant difference in time between the two groups (Table 2).</p>
      <p>Rule Application Accuracy. In addition to the overall problem score, we observed students’ step
derivation behavior. Total Correct Steps are the number of correct rule applications over the whole tutor.
Total Incorrect Steps are the number of rule applications that are incorrect either by virtue of selecting
a rule that does not apply to the arguments (e.g., using the rule Simplification on a logical statement that
cannot be simplified) or by failing to correctly identify what statement a rule application would
derive. There was a significant difference in the mean number of incorrect steps between groups in
the level-end tests (Control (mean) = 37.4, GPP (mean) = 19.6, p &lt; 0.001) and a marginal difference in
incorrect steps in the final posttest (Control (mean) = 16.4, GPP (mean) = 8.4, p = 0.06). The trend in
the average number of incorrect steps is shown in Figure 2a.</p>
      <p>Rule application accuracy is defined as the number of total correct rule applications divided by the
total number of rule application attempts. Mann-Whitney U tests indicate that training with GPP
problems significantly improved rule application accuracy among students in both the training
level-end test problems and the posttest problems (Table 3).</p>
      <p>Backward Actions. As a measure of students’ response to training with the backward strategy in
GPPs, we count their independent attempts to work backwards. In Figure 2b, we report mean backward
attempt counts for the training level-end test and posttest problems. In the posttest problems, GPP
students had significantly more backward attempts (Control (mean) = 1.1, GPP (mean) = 1.8, p = 0.01).</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. RQ2: Impact of GPP by Pretest Score</title>
        <p>
          Prior research suggests that instructional interventions like worked examples and Parsons problems may
have varying effectiveness based on learners’ prior knowledge [
          <xref ref-type="bibr" rid="ref17 ref23">17, 23</xref>
          ]. To investigate this phenomenon
in our context, we analyzed how the impact of GPPs differed based on prior proficiency. We compared
metrics across three phases (pretest, level-end test, posttest) using Mann-Whitney U tests. We used
the Bonferroni correction to adjust the significance thresholds. We categorized students into two prior
proficiency groups based on a median split of their pretest scores.
        </p>
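The median split and the Bonferroni adjustment described above can be sketched as follows (student IDs and scores here are hypothetical, not the study's data):

```python
pretest = {"s1": 42.0, "s2": 81.5, "s3": 63.8, "s4": 55.0, "s5": 90.2, "s6": 60.1}

# Median of the pretest scores (average the two middle values if n is even)
scores = sorted(pretest.values())
mid = len(scores) // 2
median = scores[mid] if len(scores) % 2 else (scores[mid - 1] + scores[mid]) / 2

low = [s for s, v in pretest.items() if v < median]    # low prior knowledge
high = [s for s, v in pretest.items() if v >= median]  # high prior knowledge

n_comparisons = 3                  # pretest, level-end test, posttest
alpha = 0.05 / n_comparisons       # Bonferroni-adjusted significance threshold
print(sorted(low), sorted(high), round(alpha, 4))
```

How students exactly at the median are grouped is a design choice; here they fall into the high group.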
        <p>As shown in Table 4, GPPs significantly improved rule application accuracy for low prior knowledge
students at both level-end test (Control (mean) = 51.6, GPP (mean) = 68.3, p &lt; .001) and posttest
(Control (mean) = 66.7, GPP (mean) = 79.1, p &lt; .001) problems. High prior knowledge students showed
comparable final rule accuracy across groups (Control (mean) = 78.5, GPP (mean) = 80.5, p = 0.68),
potentially suggesting a ceiling effect for learning rules. GPPs reduced redundant steps for high prior
knowledge students during level-end test problems (Control (mean) = 11.5, GPP (mean) = 9.66, p = 0.02)
but increased steps for their low prior knowledge counterparts (Control (mean) = 9.71, GPP (mean) = 11.64,
p = 0.03). High prior knowledge students in the GPP condition showed significantly reduced posttest
times (p = 0.01). Low prior knowledge students showed no significant time differences. These results
suggest that GPPs helped advanced learners become more efficient, while they mainly helped novices
gain a better understanding of logic rules.</p>
      </sec>
      <sec id="sec-5-3">
        <title>5.3. RQ3: GPP Self-explanation Themes</title>
        <p>To understand student perceptions and experiences with GPPs, we conducted a thematic analysis of 326
unique student explanations from 46 students in the GPP group (method detailed in Section 4) and identified
five key themes: Task Decomposition, Rule Understanding, Reduced Difficulty, Backward Reasoning, and
Difficulty. Next, we discuss these emergent themes from student explanations.</p>
        <p>Task Decomposition (n = 178): The most common benefit (mentioned in 54.6% of explanations)
students noted about GPPs is that the subgoal-oriented solutions naturally helped them break
down the proofs into manageable steps. As one student noted, “They broke down the problem into more
understandable smaller problems that I was able to solve and then piece together.”, and another student
wrote, “They showed me short-term goals so that I know how to piece together the puzzle pieces.”. As a
result, GPPs made it “much less intimidating to solve” a logic problem than those without subgoals.</p>
        <p>Rule Understanding (n = 124): Step-specific hints in GPPs provided context to explain the
connections among different subgoals. Thirty-eight percent of explanations highlighted improved rule
understanding. As one student mentioned, “They helped by giving me a better understanding of the rules
needed to complete the problem”, including when and how to apply them appropriately (e.g., “The hints were
useful, I wouldn’t really have thought to double negate something like that. It’s certainly something I’ll
keep in mind in future.”).</p>
        <p>Reduced Difficulty (n = 73): GPPs show a skeleton of the solution first, and thus encourage task
planning before working through a problem. In 22% of explanations, students reported reduced
cognitive load through guided workflows (e.g., “Made the problem easier to solve.”, “The problem showed
an easy relationship.”, “It allowed me to work on simpler goals and not get distracted on long mistakes.”).</p>
        <p>Backward Reasoning (n = 31): GPP hints were designed to help students complete the proof using
a backward strategy. Although in our tutor students derive new statements using forward reasoning
more often, GPPs could help students practice backward reasoning while being a low-effort training
intervention. In 9.5% of their explanations, students reported that the hints and subgoals helped them
perform backward chaining easily on this problem type (e.g., “They provided obvious stepping stones to
move backward through the logic in a readily apparent path.”).</p>
        <p>Difficulty (n = 24): Some student explanations (7.4%) reported struggling with the
constrained GPP workflows. For example, one student noted, “It didn’t help, and it made the problem
harder by disrupting my own way of working through the problem, and forced me to work it the way they
want me to.”; another wrote, “They made the problem harder, rather than helping.”. Within this difficulty
category, a few explanations reported “confusion”, implying that parts of the GPP design might confuse
students and impose extraneous cognitive load on them.</p>
        <p>In summary, student self-explanations indicated an overall positive experience with GPP
problems, with the majority noting the benefits of task decomposition; other benefits included improved
rule understanding and guided workflows. However, we also noted a potential limitation of GPPs, with
a small subset of students reporting that the structured nature of GPPs might disrupt their approach to
independent problem solving. Additionally, we received suggestions for improving the design of GPPs.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Discussion</title>
      <p>
        Our findings highlight the effectiveness of Guided Parsons problems (GPPs) as an intervention for
enhancing problem solving skills. By maintaining a balance between structured scaffolding and student
autonomy, GPPs address critical gaps in traditional PPs. The chunking of proofs into semantically
meaningful segments, along with step-specific hints, aligns with the principles of Cognitive Load
Theory [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], potentially reducing intrinsic cognitive load. This approach proved particularly beneficial
for students with low prior knowledge, who demonstrated significant improvements in rule application
accuracy. Such improvements also reflect Renkl’s argument that well-structured examples, along with
clear rationales, are critical for schema acquisition and deeper understanding [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>In addition, the moderation analysis reveals that GPPs can be helpful at different proficiency levels. While
low prior knowledge students benefited primarily from the added scaffolding, resulting in fewer incorrect
steps, high prior knowledge students benefited from the GPP framework by improving their efficiency, as
evidenced by the reduced number of steps (p = 0.02). These findings align with the expertise reversal
effect [23], suggesting that learners with high prior knowledge may require less detailed support and
can benefit from more advanced scaffolding techniques. Future iterations of GPPs could incorporate adaptive
hints to ensure that learners with different prior proficiency receive appropriately tailored support.</p>
      <p>Student self-explanations on GPPs aligned with the quantitative findings from RQ1 and
RQ2, highlighting the benefits of subgoal-oriented proof structures: revealing clear, logical
flows and illustrating the relevance of necessary rules. Three prominent themes, Task Decomposition,
Rule Understanding, and Reduced Difficulty, emphasize how the subgoals and step-specific hints made
the proofs more manageable, potentially reducing cognitive load. Conversely, several students perceived
the structured nature of the proof as disruptive to their own reasoning processes. These results suggest
that GPPs could be further enhanced by making them adaptive to individual student skill levels, which
has been shown to be effective for programming [24]. This could enable high prior knowledge students
to maintain autonomy and exploration, while offering additional scaffolding only as needed.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion and Future Work</title>
      <p>This study introduces Guided Parsons Problems (GPPs) as a novel intervention for logic education,
combining the structured guidance of worked examples with active engagement in subgoal-focused problem
solving. Our results demonstrate the effectiveness of GPPs in improving rule application accuracy and
reducing cognitive load, particularly among low prior knowledge students. However, our study had
several limitations. It was conducted on a single ITS platform, limiting generalizability. In addition, the
lack of self-explanation prompts in the control group could be a confounding variable; future work can
address this by asking students what aspects of the worked examples they found helpful when solving
problems. Future research should also explore a more adaptive implementation of GPPs that dynamically
adjusts the amount of scaffolding according to learners’ performance and metacognitive
needs. Longitudinal studies could further examine how GPPs influence knowledge retention over
extended periods or contribute to transferable problem solving skills. Future studies may also benefit
from measuring perceived cognitive load using standardized survey questions.</p>
    </sec>
    <sec id="sec-8">
      <title>8. Acknowledgement</title>
      <p>The work is supported by National Science Foundation (NSF) grant 2013502.</p>
    </sec>
    <sec id="sec-9">
      <title>9. Declaration on Generative AI</title>
      <p>During the preparation of this work, the author(s) used ChatGPT, Claude, and Grammarly for grammar
and spelling checks. After using these tools/services, the author(s) reviewed and edited the content as
needed and take(s) full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Prather</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Homer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Denny</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. A.</given-names>
            <surname>Becker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Marsden</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Powell</surname>
          </string-name>
          ,
          <article-title>Scaffolding task planning using abstract parsons problems</article-title>
          , in: IFIP World Conference on Computers in Education, Springer,
          <year>2022</year>
          , pp.
          <fpage>591</fpage>
          -
          <lpage>602</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P.</given-names>
            <surname>Shabrina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mostafavi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. D.</given-names>
            <surname>Tithi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Barnes</surname>
          </string-name>
          ,
          <article-title>Learning problem decomposition-recomposition with data-driven chunky parsons problems within an intelligent logic tutor</article-title>
          ,
          <source>International Educational Data Mining Society</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Renkl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. K.</given-names>
            <surname>Atkinson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U. H.</given-names>
            <surname>Maier</surname>
          </string-name>
          ,
          <article-title>From studying examples to solving problems: Fading worked-out solution steps helps learning</article-title>
          ,
          <source>in: Proceeding of the 22nd Annual Conference of the Cognitive Science Society</source>
          ,
          <year>2000</year>
          , pp.
          <fpage>393</fpage>
          -
          <lpage>398</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Sweller</surname>
          </string-name>
          ,
          <article-title>Cognitive load during problem solving: Effects on learning</article-title>
          ,
          <source>Cognitive science 12</source>
          (
          <year>1988</year>
          )
          <fpage>257</fpage>
          -
          <lpage>285</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Wood</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Bruner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Ross</surname>
          </string-name>
          ,
          <article-title>The role of tutoring in problem solving</article-title>
          ,
          <source>Journal of child psychology and psychiatry 17</source>
          (
          <year>1976</year>
          )
          <fpage>89</fpage>
          -
          <lpage>100</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>F.</given-names>
            <surname>Paas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Renkl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sweller</surname>
          </string-name>
          ,
          <article-title>Cognitive load theory and instructional design: Recent developments</article-title>
          ,
          <source>Educational psychologist 38</source>
          (
          <year>2003</year>
          )
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>F.</given-names>
            <surname>Nievelstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. Van</given-names>
            <surname>Gog</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Van Dijck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. P.</given-names>
            <surname>Boshuizen</surname>
          </string-name>
          ,
          <article-title>The worked example and expertise reversal effect in less structured tasks: Learning to reason about legal cases</article-title>
          ,
          <source>Contemporary Educational Psychology</source>
          <volume>38</volume>
          (
          <year>2013</year>
          )
          <fpage>118</fpage>
          -
          <lpage>125</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Renkl</surname>
          </string-name>
          ,
          <article-title>The worked-out examples principle in multimedia learning</article-title>
          . (
          <year>2005</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>P.</given-names>
            <surname>Denny</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Luxton-Reilly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Simon</surname>
          </string-name>
          ,
          <article-title>Evaluating a new exam question: Parsons problems</article-title>
          ,
          <source>in: Proceedings of the fourth international workshop on computing education research</source>
          ,
          <year>2008</year>
          , pp.
          <fpage>113</fpage>
          -
          <lpage>124</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>N.</given-names>
            <surname>Weinman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fox</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Hearst</surname>
          </string-name>
          ,
          <article-title>Improving instruction of programming patterns with faded parsons problems</article-title>
          ,
          <source>in: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>V.</given-names>
            <surname>Karavirta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Helminen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ihantola</surname>
          </string-name>
          ,
          <article-title>A mobile learning application for parsons problems with automatic feedback</article-title>
          ,
          <source>in: Proceedings of the 12th koli calling international conference on computing education research</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>11</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>B. J.</given-names>
            <surname>Ericson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Foley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rick</surname>
          </string-name>
          ,
          <article-title>Evaluating the efficiency and effectiveness of adaptive parsons problems</article-title>
          ,
          <source>in: Proceedings of the 2018 ACM Conference on International Computing Education Research</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>60</fpage>
          -
          <lpage>68</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Poulsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Viswanathan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. L.</given-names>
            <surname>Herman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>West</surname>
          </string-name>
          ,
          <article-title>Evaluating proof blocks problems as exam questions</article-title>
          ,
          <source>ACM Inroads 13</source>
          (
          <year>2022</year>
          )
          <fpage>41</fpage>
          -
          <lpage>51</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>B. B.</given-names>
            <surname>Morrison</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. E.</given-names>
            <surname>Margulieux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ericson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Guzdial</surname>
          </string-name>
          ,
          <article-title>Subgoals help students solve parsons problems</article-title>
          ,
          <source>in: Proceedings of the 47th ACM Technical Symposium on Computing Science Education</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>42</fpage>
          -
          <lpage>47</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>L.</given-names>
            <surname>Margulieux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Catrambone</surname>
          </string-name>
          ,
          <article-title>Using learners' self-explanations of subgoals to guide initial problem solving in app inventor</article-title>
          ,
          <source>in: Proceedings of the 2017 ACM Conference on International Computing Education Research</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>21</fpage>
          -
          <lpage>29</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A.</given-names>
            <surname>Renkl</surname>
          </string-name>
          ,
          <article-title>Worked-out examples: Instructional explanations support learning by self-explanations</article-title>
          ,
          <source>Learning and Instruction</source>
          <volume>12</volume>
          (
          <year>2002</year>
          )
          <fpage>529</fpage>
          -
          <lpage>556</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>K. R.</given-names>
            <surname>Koedinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. T.</given-names>
            <surname>Corbett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Perfetti</surname>
          </string-name>
          ,
          <article-title>The knowledge-learning-instruction framework: Bridging the science-practice chasm to enhance robust student learning</article-title>
          ,
          <source>Cognitive science 36</source>
          (
          <year>2012</year>
          )
          <fpage>757</fpage>
          -
          <lpage>798</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>K.</given-names>
            <surname>VanLehn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Chi</surname>
          </string-name>
          ,
          <article-title>A model of the self-explanation effect</article-title>
          ,
          <source>The journal of the learning sciences 2</source>
          (
          <year>1992</year>
          )
          <fpage>1</fpage>
          -
          <lpage>59</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>K.</given-names>
            <surname>Bisra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Nesbit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Salimi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. H.</given-names>
            <surname>Winne</surname>
          </string-name>
          ,
          <article-title>Inducing self-explanation: A meta-analysis</article-title>
          ,
          <source>Educational Psychology Review</source>
          <volume>30</volume>
          (
          <year>2018</year>
          )
          <fpage>703</fpage>
          -
          <lpage>725</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Thomas</surname>
          </string-name>
          ,
          <article-title>A general inductive approach for analyzing qualitative evaluation data</article-title>
          ,
          <source>American journal of evaluation 27</source>
          (
          <year>2006</year>
          )
          <fpage>237</fpage>
          -
          <lpage>246</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>G.</given-names>
            <surname>Guest</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. M.</given-names>
            <surname>MacQueen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. E.</given-names>
            <surname>Namey</surname>
          </string-name>
          ,
          <source>Applied thematic analysis</source>
          , Sage Publications,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>M.</given-names>
            <surname>Muller</surname>
          </string-name>
          ,
          <article-title>Curiosity, creativity, and surprise as analytic tools: Grounded theory method</article-title>
          ,
          <source>in: Ways of Knowing in HCI</source>
          , Springer,
          <year>2014</year>
          , pp.
          <fpage>25</fpage>
          -
          <lpage>48</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kalyuga</surname>
          </string-name>
          ,
          <article-title>The expertise reversal effect</article-title>
          ,
          <source>in: Managing cognitive load in adaptive multimedia learning</source>
          , IGI Global,
          <year>2009</year>
          , pp.
          <fpage>58</fpage>
          -
          <lpage>80</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>C.</given-names>
            <surname>Haynes-Magyar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ericson</surname>
          </string-name>
          ,
          <article-title>The impact of solving adaptive parsons problems with common and uncommon solutions</article-title>
          ,
          <source>in: Proceedings of the 22nd Koli Calling International Conference on Computing Education Research</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>14</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>