A Longitudinal Study on Student Persistence in Programming Self-assessments∗

Cheng-Yu Chung, Yancy Vance Paredes, Mohammed Alzaid, Kushal Reddy Papakannu, I-Han Hsiao
Arizona State University, Tempe, Arizona
Cheng.Yu.Chung@asu.edu, yvmparedes@asu.edu, Mohalzaid@asu.edu, kushalreddy95@gmail.com, Sharon.Hsiao@asu.edu

∗Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

ABSTRACT
Self-assessment is an educational practice that helps students evaluate their own learning through distributed practice. This evaluation potentially affects students' self-efficacy and can therefore influence their choice of activities and the likelihood of their success. The variation in students' self-assessing behavior over the course of learning is less often explored. For instance, a student's short-term behavior does not necessarily indicate how they will behave in the long term. It is unclear how the development of self-assessing behavior relates to academic performance and to the corresponding self-assessment strategies. This longitudinal study aims to fill that gap by examining a self-assessment platform used in an introductory programming class across three different semesters. We analyzed the activity logs and modeled students' short-term and long-term study persistence on the platform using a probabilistic mixture model. The results suggest that short-term persistence was not related to short-term performance. However, performance in the final exam was associated with earlier persistence patterns. A further analysis showed that low-performing students who maintained their self-assessment pattern improved in exams. Overall, this longitudinal study contributes empirical evidence to the understanding of the development of self-assessment behavior in relation to academic performance.

Keywords
self-assessment, self-efficacy, study persistence, programming concepts, learning analytics, computing education, probabilistic mixture model

1. INTRODUCTION
Engagement and persistence in learning have been considered a key to attaining achievements in computing education [19]. Researchers in self-regulated learning (SRL) have shown that there is a relationship between students' belief in the effectiveness of learning strategies and such motivational responses [36]. This belief about one's "perceived capabilities for learning" is referred to as self-efficacy [31]. It affects not only how a student assesses his or her own learning outcome at the moment but also how the student chooses certain tasks and adapts particular learning strategies.

Self-assessment is an educational practice that helps students evaluate their learning condition [7]. This practice can be extended by the theory of the spacing effect and distributed practice to provide continuous evaluation of learning outcomes that can help students improve in a course [3]. When students keep receiving learning feedback from such a tool and attribute the outcomes to their effort in self-assessment [21], their belief in their self-efficacy may change [30], and they may therefore be able to adjust and adapt their learning strategies to best fit their conditions [2]. Memory research has shown that such temporally spaced and distributed practices are better than "compressed" ones [6]. There has also been research correlating students' activity traces of self-assessments to SRL [15].

Following this line of thought about the relationship between self-assessment, self-efficacy, and learning strategies, we further hypothesize that students' self-assessment behavior is not stationary throughout the course of learning [14]. It is intuitive to assume that active and higher
usage of self-assessment should be positively correlated to a student's performance. However, in our previous work we observed that this was not always the case. We found that students adjusted their usage of self-assessment according to, hypothetically, the attribution of its effectiveness in terms of exam performance [14]. An active user did not necessarily end up with higher performance in the exam. Moreover, while there has been research examining the effectiveness of self-assessment in terms of memory, long-term retention, and cognitive outcomes such as motivation, persistence, and self-efficacy, only a few research papers have focused on explaining the variance between changes in self-assessment behavior and students' performance in a course. Therefore, this work aims to examine the dynamics of the self-assessment usage pattern, referred to as the persistence pattern from here on, and to evaluate how it is correlated with the variance in exam performance. Specifically, this work is guided by the following research questions:

RQ1 What are the persistence patterns of self-assessments in students from an introductory computer programming course? Are these patterns generalizable for students from different semesters?

RQ2 What is the relationship between the dynamics of persistence patterns and the variance in exam performance in the course? What are the changes that positively or negatively correlate with the performance?

RQ3 What are the effective practices of self-assessment for students whose performance is relatively low in the course? What are such practices for students who have relatively higher performance?

We have organized the rest of this paper in the following way. In the next section, we discuss related work in SRL in computing education and behavioral analytics in programming learning. Section 3 describes how we modeled the persistence patterns using a probabilistic model. Section 4 illustrates our findings on the dynamics of persistence patterns in relation to the variance in students' exam performance, which is followed by a discussion of the results with respect to previous work and SRL theory. Finally, the conclusion and limitations of the model are described in Section 6.

2. RELATED WORK

2.1 Self-regulated Learning and Academic Success in Computing Education
Theories and practices of SRL have been established and evaluated since the late 1990s with a focus on student development, cognitive-behavioral processes, and social and motivational aspects [36]. On the line of social and motivational aspects, researchers discussed the construct of a feedback cycle within which a student experiences choosing a task of interest [5], judging the performance and comparing it to a standard [5, 36], building a perception of self-efficacy, and being persistent in the process [29]. To assess multifaceted SRL behavior, various methodological instruments can be employed for different constructs, e.g., diaries for personal and offline events, SRL scales for self-efficacy, and "online" think-aloud protocols for SRL processes occurring during learning [36]. Discipline-specific strategies of SRL in computing education have also been examined in the context of programming problem-solving [22], self-awareness [23], and metacognitive strategies [16].

In recent years, researchers in educational data mining (EDM) have started the discussion of applying EDM methods, which focus on using computerized methodologies to link students' trace records to performance metrics by which researchers can optimize the learning process on both the research and practice sides. For example, Winne and Baker discussed such a potential in terms of SRL: "Self-regulated learning is a behavioral expression of metacognitively guided motivation" [34]. Students' activity and trace records could be a source of information about the process of learning, in which the challenge is to "obtain representations of learning as it unfolds...that are clearly and precisely matched to what theory describes" [34]. An example of such a theory is the 4-phase model of SRL proposed by Winne and Hadwin: 1) identification of resources, 2) goal setting, 3) carrying out the task, and 4) reviewing the work [35, 34]. In this model, learners are assumed to be agents who decide and choose what to do depending on information from the environment, e.g., feedback, assessment outcomes, etc. Such an SRL procedure may not follow a fixed sequence of phases ("weakly sequenced"), and its results can be used recursively in the current or successive SRL phases [34].

The subject of this work is the relationship between students' self-assessment behavior, changes in such behavior, and how they are correlated with performance in a course. We hypothesize that this process can be considered an "unfolding" process following the 4-phase model of SRL by Winne and Hadwin, where students, who use the system based on their own decisions, try to obtain feedback (resources) on their understanding of learning topics and can therefore identify what they need to study further.

2.2 Behavioral Analytics in Programming Learning
Behavioral analytics is an area of research that is gaining popularity. It traces its roots to research on understanding data captured from e-commerce. Research in this field, such as exploratory studies, was driven by the advancement of technology, as systems became capable of capturing large quantities of data from multiple sources. There has been growing interest in exploring the application of behavioral analytics to education data to support pedagogy. This spans student performance prediction, intelligent course recommendation, data-driven learning analytics, and personalized learning [12]. Some of these education data are considered ambient data (accretion data) that learners generate [33] while using learning environments. This could be in the form of capturing the event where a user clicks on a hyperlink to open a web resource, which can reflect the user's cognition and motivation. In another work, sequential analysis was applied to behavioral data to explore how it was affected by students' motivation to learn [32]. They indicated that online reading duration in the online learning system was a better indicator of reading seriousness in learners. Another work proposes to extend how behavioral analytics is perceived [4]. In this case, they proposed investigating the deviation of a student from a normal behavior, where this normal behavior is contextually dependent on the issue at focus.

Modeling students' learning is ongoing research in the field. Such student models reside in intelligent tutoring systems or other adaptive educational systems. In these systems, behavior logs are often used to estimate students' learning (i.e., interaction with the tutors results in updates to the knowledge components). In the context of learning a programming language, several parameters have been used to estimate the coding knowledge of students. These include the sequence of success in programming problem-solving [17]; how students progressed in solving programming assignments [26]; the dialogic strategies between students [8]; identifying the strategies of students when seeking information related to programming [24]; assignment submission compilation behavior [1, 20]; how students troubleshoot and test their solutions [9]; and code snapshot process state [11].

3. METHODOLOGY

3.1 Research Platform
The research platform [3] utilized in this study is a homegrown system designed as an educational tool grounded in learning science principles (Figure 1). It is based on distributed practice [3], retrieval practice and testing effects [27], reflection and metacognition [18], feedback [10], and peer interaction [28].

Figure 1: A Screenshot of the Self-assessment Platform. The left panel shows the question and options. The right panel shows the feedback/discussion from other students (which is only accessible after the student answers a question).

This platform acts as a supplemental self-assessment tool for introductory programming courses. It provides students with small distributed opportunities to master their programming knowledge. It publishes daily questions to measure the learning of a specified programming knowledge component and provides extended learning and reflection opportunities to the students. The design rationale of the system is based on the following learning concepts:

• Distributed practice: rather than having the content presented all at once, learning becomes more effective when broken into chunks. This strategy is even more effective with constant increments of small practices over time.

• Retrieval practice and testing effects ensure that students remember what they have learned. They also enhance long-term retention.

• Reflection and metacognition encourage students to take the time to think about the learned content and about what they think of it, which helps them develop and grow.

• Feedback: when students receive feedback, it facilitates their development as independent learners. When the feedback is immediate, it helps students evaluate and regulate their learning at their own pace.

• Peer interaction: the benefit of peer interaction in learning is significant. Therefore, a designated discussion board for each question was provided to facilitate this interaction. This also increases the social benefit from the reflections.

• Persistence and regularity: providing one multiple-choice question a day keeps students interested in checking for newly posted questions and encourages them to practice regularly.

3.2 Data Collection
As students access the system, they are prompted with the quiz of the day. They can attempt the question right away or leave it for later and move to the question history. The system allows students to attempt a question multiple times until the correct answer is selected. Each attempt is marked with the appropriate flag indicating the review source (quiz of the day, review, attempt & retry) and whether the student answered correctly. At the beginning of the course, students were encouraged to reflect on their attempts. In the system, they are prompted to reflect right after an attempt through the discussion board, where they can interact with peers. The credentials of the peers are anonymized to preserve privacy and to facilitate unbiased discussions and interactions. Finally, students can access previously posted questions at any time using the calendar feature or the question history list.

In this study, we collected data from an introductory computer programming course offered at a university. The dataset was from three different semesters. An overview of the raw dataset is shown in Table 1. After dropping students without a grade or any activity on the platform, we were left with 344 students for the analysis (number of attempts: M = 17.68, SD = 28.25).

Semester     # of Students   Statistics of Attempts
Fall 2016    217             M = 26.90, SD = 29.23
Spring 2018  112             M = 17.59, SD = 27.20
Fall 2018    211             M = 9.50, SD = 21.46

Table 1: Statistics of Datasets

3.3 Discovering Persistence Patterns by Probabilistic Mixture Model
Activity stream data is known for its rich properties, for example in analyzing students' time management behaviors [25]. The activity made by a student during practice, such as submitting an answer to a question, is recorded as transactions in the activity-stream data. To determine persistence, we consolidated the click-stream data by counting the number of times transactions were recorded in each week [14, 13]. The data was then grouped into three exam periods. A mixture model was applied to the activity stream. Our exploratory analysis revealed three micro patterns: Active, Cramming, and Inactive. Active represents students who practice actively in all the weeks in a given time frame. Cramming represents students who use the platform only right before an exam. Inactive represents users who use the platform minimally.
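The consolidation step described above (counting transactions per week, then grouping the weekly series into exam periods) can be sketched in a few lines. This is a minimal illustration under assumed inputs, not the authors' implementation; the week indices and period boundaries are hypothetical:

```python
from collections import Counter

def weekly_activity(attempt_weeks, period_starts, n_weeks):
    """Count a student's transactions per course week, then split the
    weekly series into exam periods.

    attempt_weeks  -- one course-week index per recorded transaction
    period_starts  -- first week of each exam period, e.g. [0, 5, 10]
    n_weeks        -- total number of course weeks
    """
    counts = Counter(attempt_weeks)
    series = [counts.get(w, 0) for w in range(n_weeks)]
    bounds = list(period_starts) + [n_weeks]
    # One sub-series of weekly counts per exam period.
    return [series[bounds[i]:bounds[i + 1]] for i in range(len(period_starts))]
```

Each per-period sub-series is the unit on which a micro pattern (Active, Cramming, or Inactive) is identified.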
To ground the identification of these patterns, we adapted tools from time-series analysis. A moving average (MA) model with a time lag of 1 was applied to the component averages of the patterns. Each exam period where more than half of the values were less than 0.05 (the minimum value of components after normalization) was marked as Condition 1. To capture peaks of change in the amount of activity, we also calculated the difference and marked exam periods with values beyond M + SD or M − SD as Condition 2. Afterward, the tagging of patterns was done in this order: if an exam period is marked as Condition 2, tag it as Cramming; if it is marked as Condition 1, tag it as Active; otherwise, tag it as Inactive. An illustration of these characteristics and the tagging process is shown in Figure 2.

4. RESULTS
The major goal of this study is to find the correlation between students' persistence patterns and their performance in the class. A macro (long-term) persistence pattern consists of three micro (short-term) patterns distributed over the three exam periods. The assumption behind this model is that a student's effort can be represented by a sequence of events where a later event (e.g., an exam score) or an effort (e.g., the decision to study actively every week) is a decision based on past events. Specifically, a student's activity is modeled by a sequence (P1, E1, P2, E2, P3, E3) where P1, P2, P3 are three micro persistence patterns and E1, E2, E3 are the normalized performance of the three exams in the course. The composition of P1, P2, P3 is referred to as the macro persistence pattern.

Out of the records of 344 students, we found 10 different macro persistence patterns. The majority of students were categorized into the macro patterns CCI (∼44%) and CII (∼35%). The distribution of the found patterns is shown in Table 2. This result is not surprising because use of the platform every week was not mandatory and students tended to study intensively right before the exam regardless of the reason.

Pattern   Quantity   Ratio
CCI       151        0.44
CII       119        0.35
ACC       24         0.07
CCC       14         0.04
AAC       10         0.03
ACI       8          0.02
ICC       6          0.02
ICI       5          0.01
ACA       4          0.01
IIC       3          0.01

Table 2: The Distribution of Persistence Patterns

4.1 Exploring the Relationship of Persistence Patterns and Exam Performance
Programming concepts are often complex and coupled. In an introductory programming course, we can expect that an advanced concept is usually built up from sets of fundamental concepts. The content of a later exam inevitably accumulates from previous exams. To illustrate the complexity of accumulated programming concepts over time, we calculated the correlations between exam performances and found that the correlation between E1 and E2 (Pearson's r(E1, E2)) is 0.74; r(E1, E3) = 0.71; and r(E2, E3) = 0.79. We also found that the partial correlation between E2 and E3, controlling for E1, is 0.56. Following the heuristic interpretation of Pearson's r, this result suggested that the performances on E1, E2, and E3 were moderately to highly correlated, and that we should consider the effect of previous exam performance when analyzing the variance in students' persistence patterns and exam performance.

Based on this result, we believed that the only period in a semester where we could observe the marginal correlation between persistence patterns and exam performance was the first exam period, where P1 and E1 occurred. We hypothesized that different micro persistence patterns were related to exam performance, considering that students might improve their understanding of learning topics by active self-assessment on the platform. To test this hypothesis, we conducted a one-way ANOVA on P1 and E1. The result showed that P1 did not have a significant main effect on E1 (F(2, 341) = 0.72, p = 0.48); that is, the micro/short-term persistence pattern was not marginally related to short-term performance.
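The partial correlation reported above can be recovered from the three pairwise Pearson coefficients alone, using the standard first-order partial-correlation identity (a sketch, not the authors' code):

```python
import math

def partial_r(r12, r13, r23):
    """First-order partial correlation between variables 2 and 3,
    controlling for variable 1."""
    return (r23 - r12 * r13) / math.sqrt((1 - r12 ** 2) * (1 - r13 ** 2))

# With r(E1,E2) = 0.74, r(E1,E3) = 0.71, and r(E2,E3) = 0.79 as reported,
# this yields r(E2,E3 | E1) of about 0.56, matching the value in the text.
```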
The annotation of persistence pattern was done by checking whether the moving average and differencing passed predefined thresholds (see Section 3.3 for detail). From top to bottom, the annotated patterns are CCC, AAC, and IIC. Figure 3: The Relationship of Exam Performance and Persistence Patterns for Students Starting with Active (A; the upper row) and Cramming (C; the lower row). The left column represents low-performing (LP) students and the right one represents high-performing (HP) students. The patterns of interest with significant difference were found mainly in the LP group, including the pairs (AAC, ACA), (CCC, CCI), (CCC, CII). of LP-AAC was relatively consistent in the first two periods and changed to Cramming in the third period. This result indicated that keeping Active in the first two periods might be helpful for LP students. This was probably due to the difficulty of learning topics in the course. Since topics in E1 and E2 were important for students to build up the fundamentals of programming languages, if they did not self-assess actively, they might not know that they needed to catch up as soon as possible. Note the sample size of these pattern was extremely small. A future study with a large sample size is needed to cross-validate this finding. Figure 4: The Comparison Exam Performance in LP Students with Different Persistence Patterns. We can see that the group “CCC” seemed to outper- formed all the other groups in E3. 4.2.2 Effective Persistence Patterns in LP Students Starting with Cramming The Cramming micro pattern (C) represents an intense amo- did not show any significant main effects from P1, P2, or unt of effort on the self-assessment platform in a short period the interaction of P1 and P2. However, the test of factorial of time. This is a common pattern we can find when the time design over P1 and the interaction of exam pairs (i.e., the is close to an exam date. 
In our sample dataset, we found formula P 1 + P 1 : P 2 + P 2 : P 3 + P 1 : P 3) on E3 showed that the pairs (LP-CCC, LP-CCI), (LP-CCC, LP-CII), and that the main effect of the interaction of P1 and P2 was (LP-CCC, HP-CCC) revealed interesting patterns in terms significant (F (4, 327) = 3.25, p = 0.01). Together with our of the variance in exam performance. previous test, this result suggested that even though short- term behavior might not bring an effect to the immediate For the first pair, (LP-CCC, LP-CCI), the analysis showed exam performance, in the long run, a student’s earlier be- there was no significant difference in their performance of havior (P1 and P2) might have an effect on the performance E2, however, in E3 LP-CCC (M = 0.61, SD = 0.17) per- of the final exam (E3). formed significantly better than LP-CCI (M = 0.33, SD = 0.21) (t(91) = 3.19, p = 0.02, d = 1.04). When compar- ing LP-CCC to LP-CII (M = 0.33, SD = 0.21), the sig- 4.2 Correlating the Dynamics of Persistence nificance was only found, again, in E3 (t(65) = 3.60, p = Patterns to Exam Performance 0.01, d = 1.36). In other words, LP students starting with Our next question was about the relationship between the Cramming and keeping this persistence pattern across the trajectory of persistence patterns (i.e., the variance in P1, semester performed better than those who did not keep the P2, P3) and student performance throughout the course of persistence pattern (CCI and CII) in the final exam. We three exams. We first grouped students into high-performing also found that LP-CCC students even had the best perfor- (HP) and low-performing (LP) by checking whether their mance in E3 compared to those with other macro patterns performance in E1 was higher than 0.6 or not. This cut (M = 0.35, SD = 0.23) (see Figure 4; t(175) = 3.48, p = point was in accordance with the fact that 60% is a com- 0.02, d = 1.15). An ANOVA analysis of 3-by-3 factorial mon cut point which decides the passing grade. 
The choice design over P2*P3 on E3 also showed that the main ef- of E1 was based on the finding in Section 4.1. Moreover, fect from P3 was significant (F (1, 144) = 6.73, p = 0.01), this transformation made the analysis of multiple categor- which further emphasized the importance of the persistence ical groups cleaner and easier to follow. An overview of pattern in the final period. Moreover, when comparing LP- students’ macro patterns and exam performance is shown in CCC students to their high performing sibling, HP-CCC, we Figure 3. found that although in E1 LP-CCC (M = 0.45, SD = 0.12) performed worse than HP-CCC (M = 0.83, SD = 0.10) (t(12) = −5.80, p = 0.00, d = −3.23; which was mainly due 4.2.1 Effective Persistence Patterns in LP Students to the grouping), their performance in E2 and E3 was not Starting with Active significantly different. The Active micro pattern (A) represents a continuous ef- fort on the self-assessment platform for a period of time The analysis showed that LP-CCC students not only outper- (see Section 3.3 for the definition). In the sample dataset, formed those with other macro persistence patterns C in the we identified four macro patterns starting with A: ACC, LP group, but performed on a par with HP students with ACI, AAC, and ACA. Among these patterns, we found that the same pattern in the final exam. These results collab- the development of AAC and ACA students was of interest. oratively suggested that being consistent on the Cramming First, LP students with AAC (LP-AAC) had similar perfor- behavior was an effective practice for LP students. One pos- mance in E1 and E2 compared to LP-ACA students. How- sible explanation was that LP students who kept the Cram- ever, LP-AAC students (M = 0.41, SD = 0.19) performed ming behavior on the self-assessment platform throughout significantly better than LP-ACA (M = 0.08, SD = 0.08) the semester might be showing their grip and willingness to (t(6) = 2.82, p = 0.05, d = 1.659). 
One apparent difference improve in the class. Another possible assumption of this in these two groups of students was that LP-ACA changed effective practice was that intensive self-assessment helped from A to C in the second period, and changed back to A in students to identify the learning topics or concepts they need the third period. On the other hand, the persistence pattern to further review and study. plicit from the perspective of the students. It is assumed that as students use the self-assessment platform, they be- come aware of which learning content they currently need to improve on (“the identification of needed skills”). This will allow them to review and address these learning gaps accordingly (“the development of time management and sub- goal planning”). Following this hypothesis, the practice of self-assessment may reflect a part of SRL strategies which allows for the interpretation of the findings in this work in terms of SRL behaviors. Our analysis showed that low-performing students who kept the cramming persistence pattern in self-assessment were able to improve and achieve competitive performance with the high-performing counterpart in the final exam. This re- sult suggests that although generally, cramming or procras- tination is a less-desirable behavior in learning [25], when such behavior is found in self-assessment platforms, a pos- Figure 5: Comparing Exam Performance of High itive outcome may be seen. An optimistic explanation to Performing Students in Exam1 and Exam2. We can this result is that the platform was seen as supplemental see that the performance of students with the pat- material to the course since it follows a format that closely tern “CC” dropped the most compared to students resembles the formal assessments (i.e., multiple-choice ques- with the other patterns. Only students with the tions). 
Thus, students who wanted to improve their per- pattern “AA” had the tendency to improve the per- formance could obtain actionable feedback allowing them to formance. review exam content in a short period of time. On the other hand, a relatively pessimistic explanation is that students were simply gaming the system to memorize the content in 4.2.3 Effective Persistence Patterns in HP Students the hopes of seeing similar questions in the exams. The performance of HP students was relatively stable com- pared to LP students (Figure 3). An ANOVA analysis of One interesting pattern we found from the analysis was that 3-by-3 factorial design over P1 and P2 on the difference of the effect of early behavior might reflect on performance at E1 and E2 showed that the main effect of the interaction of a later time (see Section 4.1). Our analysis showed that low- P1 and P2 was significant (F (2, 148) = 3.40, p = 0.03). Fol- performing students who kept the same persistence patterns lowing this outcome, we further compared and examined the in the first two exam periods might perform better than the value with different persistence patterns (see Figure 5). The others. This result may suggest that 1) in this course, the only pattern of interest we found was that HP students start- effort in the first two exam periods was crucial, which is not ing with Cramming and kept doing so were not able to keep surprising considering the comprehensive nature of midterm their performance in E2. A statistics test showed that the exams; and 2) being persistent only for a short term was performance of this group of students dropped significantly not enough. This can be leveraged to guide students in from E1 (M = 0.81, SD = 0.11) to E2 (M = 0.70, SD = practice. For example, a recommender can be implemented 0.21) (t(142) = −3.85, p = 0.00, d = −0.65). Such a pattern to inform students that being persistent in the long run is was not found from E2 to E3. important. 
This recommender can also adapt to students’ self-assessments in a short term. This result could be a signal that for HP students who wanted to stay competitive, intensive self-assessment might This study gave us a glimpse of the students’ behavior as not help much. A possible explanation to this outcome was they progressed into the course. We observed how many that due to that the topics on the self-assessment platform students utilized the system and crammed as they prepared were “limited” in terms of scope and the amount of content, for an upcoming exam. Those who kept their persistence when the complexity of topics increased, students were not patterns outperformed those who did not in terms of perfor- able to use the platform to review the important and nec- mance in the formal assessments. This result suggested that essary learning content that were not covered by the self- persistently putting an effort to utilize additional materials, assessment platform. in general, would positively be reflected on the course per- formance. However, we did not look into their self-reported 5. DISCUSSION motivations or reasons for coming back to the system, e.g., whether it was for them to self-assess or to practice on to ad- One SRL strategy in the literature relevant to the perfor- ditional learning materials. Since what we know about the mance of self-assessment is assessment of task difficulty pro- students is currently limited to their activity log data on the posed by Falkner and colleagues. When the identification system, we may only assume that those who persistently of needed skills is incorporated, it would lead to the de- used the system belonged to the so-called “hard-working” velopment of time management and sub-goal planning [16]. ones who had been the better-performing students prior to The performance of self-assessment has two fundamental taking the course or had a better metacognitive skills. 
goals: for students to practice the learning concepts, and for them to obtain feedback about their current span of knowledge. While the former is explicit, the latter is rather implicit.

Additionally, although the findings may have a potential connection to SRL, in this work we are not able to ground this hypothesis due to the lack of data from authentic SRL measurements (e.g., the qualitative questionnaires for mapping constructs that are widely used in the literature). Moreover, the lack of some persistence patterns, or of sufficient samples of them, potentially biases our interpretation toward certain kinds of persistence patterns. This is despite the fact that the analyses were based on data collected from the same course given in three different semesters, which to some extent consolidated the possible patterns that could be found in this specific course. These are the current limitations of this work, and a future study can further evaluate our findings with these issues in mind.

6. CONCLUSIONS AND LIMITATIONS
This work examined the relationship between the persistence of self-assessment and exam performance in an introductory course of computer programming. We collected data from the same course given in three different semesters. Out of 344 students, we identified 10 different long-term persistence patterns with a probabilistic mixture model, each consisting of three short-term persistence patterns drawn from Active, Cramming, and Inactive. From a series of analyses that explained the variance of exam performance by the dynamics, or changes, in persistence patterns, we found that low-performing students benefited from the continuity of intensive or active self-assessment; somewhat on the contrary, high-performing students might not achieve similar effectiveness from the continuity of cramming. We discussed these outcomes under the framework of self-regulated learning and provided possible assumptions and explanations in the context of the course.

There are some limitations to the methodology of this work. First, the persistence model in use was built on the amount of students' activity on the self-assessment platform, which only included the count of unique question attempts. In other words, our model did not consider other potentially valuable information in self-assessment, such as the correctness of first attempts or the coverage and difficulty of learning topics, among others. The second limitation is that our definition of the micro persistence pattern Cramming may overlap with Active and Inactive to some extent. Specifically, we did not discriminate students who were "Active and Cramming" from those who were "Inactive and Cramming". One reason for this decision was to avoid the overfitting that made the resulting patterns too sparse. Nevertheless, we might have introduced some bias toward the Cramming pattern in our analysis. Finally, the persistence patterns found in the same course seemed consistent across the three different semesters. However, this could be due to the structure or the nature of the content provided in this specific course of computer programming. Although we can expect that some behavioral patterns may be general across fields of study (e.g., cramming before an exam), our findings related to exam performance may not be. Future research should take the organization of course content into account when trying to replicate the result in a different course.
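The persistence model above reduces each student to activity counts, which were then grouped with a probabilistic mixture model. As a rough, self-contained illustration of that idea (not the paper's actual model or features), a small EM routine can separate hypothetical per-period unique-attempt counts into latent activity levels:

```python
import math

def em_gaussian_mixture(xs, k=2, iters=50):
    """Tiny EM for a 1-D Gaussian mixture, an illustrative stand-in for the
    probabilistic mixture model that grouped students' activity levels.

    xs: per-student unique-attempt counts for one exam period; k >= 2.
    """
    srt = sorted(xs)
    # Deterministic init: spread the k means over the order statistics.
    mus = [srt[round(i * (len(srt) - 1) / (k - 1))] for i in range(k)]
    sigmas, weights = [1.0] * k, [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in xs:
            ps = [w / (s * math.sqrt(2 * math.pi))
                  * math.exp(-(x - m) ** 2 / (2 * s * s))
                  for w, m, s in zip(weights, mus, sigmas)]
            z = sum(ps) or 1e-300
            resp.append([p / z for p in ps])
        # M-step: re-estimate mixing weights, means, and variances.
        for j in range(k):
            rj = [r[j] for r in resp]
            nj = sum(rj) or 1e-12
            weights[j] = nj / len(xs)
            mus[j] = sum(r * x for r, x in zip(rj, xs)) / nj
            var = sum(r * (x - mus[j]) ** 2 for r, x in zip(rj, xs)) / nj
            sigmas[j] = max(math.sqrt(var), 1e-6)
    return weights, mus, sigmas
```

With two well-separated groups of counts, the fitted means land near the group averages, which is the intuition behind recovering latent persistence levels from raw activity logs.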
7. REFERENCES
[1] A. Altadmri and N. C. Brown. 37 million compilations: Investigating novice programming mistakes in large-scale student data. In Proceedings of the 46th ACM Technical Symposium on Computer Science Education, pages 522–527. ACM, 2015.
[2] M. Alzaid and I.-H. Hsiao. Behavioral analytics for distributed practices in programming problem-solving. In 2019 IEEE Frontiers in Education Conference (FIE), pages 1–8. IEEE, 2019.
[3] M. Alzaid, D. Trivedi, and I.-H. Hsiao. The effects of bite-size distributed practices for programming novices. In 2017 IEEE Frontiers in Education Conference (FIE), pages 1–9. IEEE, 2017.
[4] A. R. Baig and H. Jabeen. Big data analytics for behavior monitoring of students. Procedia Computer Science, 82:43–48, 2016.
[5] A. Bandura and D. H. Schunk. Cultivating competence, self-efficacy, and intrinsic interest through proximal self-motivation. Journal of Personality and Social Psychology, 1981.
[6] A. S. Benjamin and J. Tullis. What makes distributed practice effective? Cognitive Psychology, 2010.
[7] D. Boud and N. Falchikov. Quantitative studies of student self-assessment in higher education: A critical analysis of findings. Higher Education, 18(5):529–549, 1989.
[8] K. E. Boyer, R. Phillips, A. Ingram, E. Y. Ha, M. Wallis, M. Vouk, and J. Lester. Investigating the relationship between dialogue structure and tutoring effectiveness: A hidden Markov modeling approach. International Journal of Artificial Intelligence in Education, 21(1-2):65–81, 2011.
[9] K. Buffardi and S. H. Edwards. Effective and ineffective software testing behaviors by novice programmers. In Proceedings of the Ninth Annual International ACM Conference on International Computing Education Research, pages 83–90. ACM, 2013.
[10] D. L. Butler and P. H. Winne. Feedback and self-regulated learning: A theoretical synthesis. Review of Educational Research, 65(3):245–281, 1995.
[11] A. S. Carter, C. D. Hundhausen, and O. Adesope. The normalized programming state model: Predicting student performance in computing courses based on programming behavior. In Proceedings of the Eleventh Annual International Conference on International Computing Education Research, pages 141–150. ACM, 2015.
[12] L. Cen, D. Ruta, and J. Ng. Big education: Opportunities for big data analytics. In 2015 IEEE International Conference on Digital Signal Processing (DSP), pages 502–506. IEEE, 2015.
[13] C.-Y. Chung and I.-H. Hsiao. Quantitative analytics in exploring self-regulated learning behaviors: Effects of study persistence and regularity. In 2019 IEEE Frontiers in Education Conference (FIE), pages 1–9. IEEE, 2019.
[14] C.-Y. Chung and I.-H. Hsiao. Investigating patterns of study persistence on self-assessment platform of programming problem-solving. In Proceedings of the 51st ACM Technical Symposium on Computer Science Education, pages 162–168. ACM, 2020.
[15] A. Cicchinelli, E. Veas, A. Pardo, V. Pammer-Schindler, A. Fessl, C. Barreiros, and S. Lindstädt. Finding traces of self-regulated learning in activity streams. In Proceedings of the 8th International Conference on Learning Analytics and Knowledge (LAK '18), pages 191–200, 2018.
[16] K. Falkner, R. Vivian, and N. J. Falkner. Identifying computer science self-regulated learning strategies. In Proceedings of the 2014 Conference on Innovation & Technology in Computer Science Education (ITiCSE '14), pages 291–296. ACM, 2014.
[17] J. Guerra, S. Sahebi, P. Brusilovsky, and Y. Lin. The problem solving genome: Analyzing sequential patterns of student work with parameterized exercises. In 7th International Conference on Educational Data Mining, pages 153–160. International Educational Data Mining Society, 2014.
[18] D. J. Hacker, J. Dunlosky, and A. C. Graesser. Metacognition in Educational Theory and Practice. Routledge, 1998.
[19] B. Hoffman, R. Morelli, and J. Rosato. Student engagement is key to broadening participation in CS. In Proceedings of the 50th ACM Technical Symposium on Computer Science Education (SIGCSE '19), pages 1123–1129. ACM, 2019.
[20] M. C. Jadud and B. Dorn. Aggregate compilation behavior: Findings and implications from 27,698 users. In Proceedings of the Eleventh Annual International Conference on International Computing Education Research, pages 131–139. ACM, 2015.
[21] H. H. Kelley and J. L. Michela. Attribution theory and research. Annual Review of Psychology, 1980.
[22] D. Loksa and A. J. Ko. The role of self-regulation in programming problem solving process and success. In Proceedings of the 2016 ACM Conference on International Computing Education Research, pages 83–91. ACM, 2016.
[23] D. Loksa, A. J. Ko, W. Jernigan, A. Oleson, C. J. Mendez, and M. M. Burnett. Programming, problem solving, and self-awareness. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pages 1449–1461. ACM, 2016.
[24] Y. Lu and I.-H. Hsiao. Seeking programming-related information from large scaled discussion forums, help or harm? In Proceedings of the 9th International Conference on Educational Data Mining, pages 442–447, 2016.
[25] J. Park, R. Yu, F. Rodriguez, R. Baker, P. Smyth, and M. Warschauer. Understanding student procrastination via mixture models. In Educational Data Mining, pages 187–197. International Educational Data Mining Society, 2018.
[26] C. Piech, M. Sahami, D. Koller, S. Cooper, and P. Blikstein. Modeling how students learn to program. In Proceedings of the 43rd ACM Technical Symposium on Computer Science Education, pages 153–160. ACM, 2012.
[27] H. L. Roediger and A. C. Butler. The critical role of retrieval practice in long-term retention. Trends in Cognitive Sciences, 15(1):20–27, 2011.
[28] R. D. Roscoe and M. T. H. Chi. Tutor learning: The role of explaining and responding to questions. Instructional Science, 36(4):321–350, 2008.
[29] D. H. Schunk. Enhancing self-efficacy and achievement through rewards and goals: Motivational and informational effects. Journal of Educational Research, 1984.
[30] D. H. Schunk. Sequential attributional feedback and children's achievement behaviors. Journal of Educational Psychology, 76(6):1159–1169, 1984.
[31] D. H. Schunk and E. L. Usher. Assessing self-efficacy for self-regulated learning. In B. J. Zimmerman and D. H. Schunk, editors, Educational Psychology Handbook: Handbook of Self-Regulation of Learning and Performance. Routledge, 2011.
[32] J. C. Sun, C. Lin, and C. Chou. Applying learning analytics to explore the influence of online learners' motivation on their online learning behavioral patterns. In 2016 5th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI), pages 377–380, 2016.
[33] E. J. Webb, D. T. Campbell, R. D. Schwartz, and L. Sechrest. Unobtrusive Measures, volume 2. Sage Publications, 1999.
[34] P. H. Winne and R. S. Baker. The potentials of educational data mining for researching metacognition, motivation and self-regulated learning. JEDM - Journal of Educational Data Mining, 2013.
[35] P. H. Winne and A. F. Hadwin. Studying as self-regulated learning. In Metacognition in Educational Theory and Practice. Routledge, 1998.
[36] B. J. Zimmerman and D. H. Schunk. Self-regulated learning and performance: An introduction and an overview. In Educational Psychology Handbook: Handbook of Self-Regulation of Learning and Performance, page 1. Routledge, 2011.