A Longitudinal Study on Student Persistence in Programming Self-assessments∗

Cheng-Yu Chung, Yancy Vance Paredes, Mohammed Alzaid, Kushal Reddy Papakannu, I-Han Hsiao
Arizona State University, Tempe, Arizona
Cheng.Yu.Chung@asu.edu, yvmparedes@asu.edu, Mohalzaid@asu.edu, kushalreddy95@gmail.com, Sharon.Hsiao@asu.edu

∗Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

ABSTRACT
Self-assessment is an educational practice that helps students evaluate their own learning through distributed practice. This evaluation potentially affects students' self-efficacy and can therefore influence their choice of activities and the likelihood of their success. The variation in students' self-assessing behavior over the course of learning is less often explored. For instance, a student's short-term behavior does not necessarily indicate how they will behave in the long term. It is unclear how the development of self-assessing behavior relates to academic performance and to the corresponding self-assessment strategies. This longitudinal study aims to fill that gap by examining a self-assessment platform used in an introductory programming class across three different semesters. We analyzed the activity logs and modeled students' short-term and long-term study persistence on the platform using a probabilistic mixture model. The results suggest that short-term persistence was not related to short-term performance. However, performance in the final exam was associated with earlier persistence patterns. A further analysis showed that low-performing students who maintained their self-assessment pattern improved in exams. Overall, this longitudinal study contributes empirical evidence to the understanding of the development of self-assessment behavior in relation to academic performance.

Keywords
self-assessment, self-efficacy, study persistence, programming concepts, learning analytics, computing education, probabilistic mixture model

1. INTRODUCTION
Engagement and persistence in learning have been considered a key to attaining achievements in computing education [19]. Researchers in self-regulated learning (SRL) have shown that there is a relationship between students' belief in the effectiveness of learning strategies and such motivational responses [36]. This belief about one's "perceived capabilities for learning" is referred to as self-efficacy [31]. It affects not only how a student assesses his or her own learning outcome at the moment but also how the student chooses certain tasks and adapts particular learning strategies.

Self-assessment is an educational practice that helps students evaluate their learning condition [7]. This practice can be extended by the theory of the spacing effect and distributed practice to provide continuous evaluation of learning outcomes that can help students improve in a course [3]. When students keep receiving learning feedback from such a tool and attribute the outcomes to their effort in self-assessment [21], their belief in their self-efficacy may change [30], and they may therefore be able to adjust and adapt their learning strategies to best fit their conditions [2]. Memory research has shown that such temporally spaced and distributed practices are better than "compressed" ones [6]. There has also been research correlating students' activity traces of self-assessments to SRL [15].

Following this line of thought about the relationship between self-assessment, self-efficacy, and learning strategies, we further hypothesize that students' self-assessment behavior is not stationary throughout the course of learning [14]. It is intuitive to assume that active and higher
usage of self-assessment should be positively correlated to a student's performance. However, in our previous work we observed that this was not always the case. We found that students adjusted their usage of self-assessment according to, hypothetically, the attribution of its effectiveness in terms of exam performance [14]. An active user did not necessarily end up with higher performance in the exam. Moreover, while there has been research examining the effectiveness of self-assessment in terms of memory, long-term retention, and cognitive outcomes such as motivation, persistence, and self-efficacy, only a few research papers have focused on explaining the variance between changes in self-assessment behavior and students' performance in a course. Therefore, this work aims to examine the dynamics of the self-assessment usage pattern, referred to as the persistence pattern from here on, and to evaluate how it is correlated with the variance in exam performance. Specifically, this work is guided by the following research questions:

RQ1 What are the persistence patterns of self-assessments in students from an introductory computer programming course? Are these patterns generalizable for students from different semesters?

RQ2 What is the relationship between the dynamics of persistence patterns and the variance in exam performance in the course? What are the changes that positively or negatively correlate with the performance?

RQ3 What are the effective practices of self-assessment for students whose performance is relatively low in the course? What are such practices for students who have relatively higher performance?

We have organized the rest of this paper in the following way. In the next section, we discuss related work in SRL in computing education and behavioral analytics in programming learning. Section 3 describes how we modeled the persistence patterns using a probabilistic model. Section 4 illustrates our findings on the dynamics of persistence patterns in relation to the variance in students' exam performance, which is followed by a discussion of the results with respect to previous work and SRL theory. Finally, the conclusion and limitations of the model are described in Section 6.

2. RELATED WORK

2.1 Self-regulated Learning and Academic Success in Computing Education
Theories and practices of SRL have been established and evaluated since the late 1990s with a focus on student development, cognitive-behavioral processes, and social and motivational aspects [36]. On the line of social and motivational aspects, researchers discussed the construct of a feedback cycle within which a student experiences choosing a task of interest [5], judging the performance and comparing it to a standard [5, 36], building a perception of self-efficacy, and being persistent in the process [29]. To assess multifaceted SRL behavior, various methodological instruments can be employed for different constructs, e.g., diaries for personal and offline events, SRL scales for self-efficacy, and "online" think-aloud protocols for SRL processes occurring during learning [36]. Discipline-specific strategies of SRL in computing education have also been examined in the context of programming problem-solving [22], self-awareness [23], and metacognitive strategies [16].

In recent years, researchers in educational data mining (EDM) have started the discussion of applying EDM methods, which focus on using computerized methodologies to link students' trace records to performance metrics by which researchers can optimize the learning process on both the research and practice sides. For example, Winne and Baker discussed such a potential in terms of SRL: "Self-regulated learning is a behavioral expression of metacognitively guided motivation" [34]. Students' activity and trace records could be a source of information about the process of learning, in which the challenge is to "obtain representations of learning as it unfolds...that are clearly and precisely matched to what theory describes" [34]. An example of such a theory is the 4-phase model of SRL proposed by Winne and Hadwin: 1) identification of resources, 2) goal setting, 3) carrying out the task, and 4) reviewing the work [35, 34]. In this model, learners are assumed to be agents who decide and choose what to do depending on information from the environment, e.g., feedback, assessment outcomes, etc. Such an SRL procedure may not follow a fixed sequence of phases ("weakly sequenced"), and its results can be used recursively in the current or successive SRL phases [34].

The subject of this work is the relationship between students' self-assessment behavior, changes in such behavior, and how they are correlated with performance in a course. We hypothesize that this process can be considered an "unfolding" process following the 4-phase model of SRL by Winne and Hadwin, where students, who use the system based on their own decisions, try to obtain feedback (resources) on their understanding of learning topics and can therefore identify what they need to study further.

2.2 Behavioral Analytics in Programming Learning
Behavioral analytics is an area of research that is gaining popularity. It traces its roots to research on understanding data captured from e-commerce. Research in this field, such as exploratory studies, was driven by the advancement of technology, as systems became capable of capturing large quantities of data from multiple sources. There has been growing interest in exploring the application of behavioral analytics to education data to support pedagogy. This spans student performance prediction, intelligent course recommendation, data-driven learning analytics, and personalized learning [12]. Some of these education data are considered ambient data (accretion data) that learners generate [33] while using learning environments. This could be in the form of capturing the event where a user clicks on a hyperlink to open a web resource, which can reflect the user's cognition and motivation. In another work, sequential analysis was applied to behavioral data to explore how it was affected by students' motivation to learn [32]. They indicated that online reading duration in the online learning system was a better indicator of reading seriousness in learners. Another work proposes to extend how behavioral analytics is perceived [4]. In this case, they proposed investigating the deviation of a student from a normal behavior, where this normal behavior is contextually dependent on the issue at focus.

Modeling students' learning is ongoing research in the field. Such student models reside in intelligent tutoring systems or other adaptive educational systems. In these systems, behavior logs are often used to estimate students' learning (i.e., interaction with the tutors results in updates to the knowledge components). In the context of learning a programming language, several parameters have been used to estimate the coding knowledge of students. These include the sequence of success in programming problem-solving [17]; how students progressed in solving programming assignments [26]; the dialogic strategies between students [8]; identifying the strategies of students when seeking information related to programming [24]; assignment submission compilation behavior [1, 20]; how students troubleshoot and test their solutions [9]; and code snapshot process state [11].

3. METHODOLOGY

3.1 Research Platform
The research platform [3] utilized in this study is a homegrown system designed as an educational tool grounded in learning science principles (Figure 1). It is based on distributed practice [3], retrieval practice and testing effects [27], reflection and metacognition [18], feedback [10], and peer interaction [28].

Figure 1: A Screenshot of the Self-assessment Platform. The left panel shows the question and options. The right panel shows the feedback/discussion from other students (which is only accessible after the student answers a question).

This platform acts as a supplemental self-assessment tool for introductory programming courses. It provides students with small distributed opportunities to master their programming knowledge. It publishes daily questions to measure the learning of a specified programming knowledge component and provides extended learning and reflection opportunities to the students. The design rationale of the system is based on the following learning concepts:

• Distributed practice: rather than having the content presented all at once, learning becomes more effective when broken into chunks. This strategy is even more effective with constant increments of small practices over time.

• Retrieval practice and testing effects ensure that students remember what they have learned. They also enhance long-term retention.

• Reflection and metacognition encourage students to take the time to think about the learned content and about what they think of it, which helps them develop and grow.

• Feedback: when students receive feedback, it facilitates their development as independent learners. When the feedback is immediate, it helps students evaluate and regulate their learning at their own pace.

• Peer interaction: the benefit of peer interaction in learning is significant. Therefore, a designated discussion board for each question was provided to facilitate this interaction. This also increases the social benefit from the reflections.

• Persistence and regularity: providing one multiple-choice question a day keeps students interested in checking for newly posted questions and encourages them to practice regularly.

3.2 Data Collection
As students access the system, they are prompted with the quiz of the day. They can attempt the question right away or leave it for later and move to the question history. The system allows students to attempt a question multiple times until the correct answer is selected. Each attempt is marked with the appropriate flag indicating the review source (quiz of the day, review, attempt & retry) and whether the student answered correctly. At the beginning of the course, students were encouraged to reflect on their attempts. In the system, they are prompted to reflect right after an attempt through the discussion board, where they can interact with peers. The credentials of the peers are anonymized to preserve privacy and to facilitate unbiased discussions and interactions. Finally, students can access previously posted questions at any time using the calendar feature or the question history list.

In this study, we collected data from an introductory computer programming course offered at a university. The dataset was from three different semesters. An overview of the raw dataset is shown in Table 1. After dropping students without a grade or any activity on the platform, we were left with 344 students for the analysis (number of attempts: M = 17.68, SD = 28.25).

Semester     # of Students   Statistics of Attempts
Fall 2016    217             M = 26.90, SD = 29.23
Spring 2018  112             M = 17.59, SD = 27.20
Fall 2018    211             M = 9.50, SD = 21.46

Table 1: Statistics of Datasets

3.3 Discovering Persistence Patterns by Probabilistic Mixture Model
Activity stream data is known for its rich properties, for example in analyzing students' time management behaviors [25]. The activity made by a student during practice, such as submitting an answer to a question, is recorded as transactions in the activity-stream data. To determine persistence, we consolidated the click-stream data by counting the number of times transactions were recorded in each week [14, 13]. The data was then grouped into three exam periods. A mixture model was applied to the activity stream. Our exploratory analysis revealed three micro patterns: Active, Cramming, and Inactive. Active represents students who practice actively in all the weeks in a given time frame. Cramming represents students who use the platform only right before an exam. Inactive represents users who use the platform minimally.
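The consolidation step described above (counting transactions per week, then grouping the weekly series into exam periods) can be sketched in a few lines. This is a minimal illustration under assumed inputs, not the authors' implementation; the week indices and period boundaries are hypothetical:

```python
from collections import Counter

def weekly_activity(attempt_weeks, period_starts, n_weeks):
    """Count a student's transactions per course week, then split the
    weekly series into exam periods.

    attempt_weeks  -- one course-week index per recorded transaction
    period_starts  -- first week of each exam period, e.g. [0, 5, 10]
    n_weeks        -- total number of course weeks
    """
    counts = Counter(attempt_weeks)
    series = [counts.get(w, 0) for w in range(n_weeks)]
    bounds = list(period_starts) + [n_weeks]
    # One sub-series of weekly counts per exam period.
    return [series[bounds[i]:bounds[i + 1]] for i in range(len(period_starts))]
```

Each per-period sub-series is the unit on which a micro pattern (Active, Cramming, or Inactive) is identified.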
To ground the identification of these patterns, we adapted tools from time-series analysis. A moving average (MA) model with a time lag of 1 was applied to the component averages of the patterns. Each exam period where more than half of the values were less than 0.05 (the minimum value of components after normalization) was marked as Condition 1. To capture peaks of change in the amount of activity, we also calculated the difference and marked exam periods with values beyond M + SD or M − SD as Condition 2. Afterward, the tagging of patterns was done in this order: if an exam period is marked as Condition 2, tag it as Cramming; if it is marked as Condition 1, tag it as Active; otherwise, tag it as Inactive. An illustration of these characteristics and the tagging process is shown in Figure 2.

4. RESULTS
The major goal of this study is to find the correlation between students' persistence patterns and their performance in the class. A macro (long-term) persistence pattern consists of three micro (short-term) patterns distributed over the three exam periods. The assumption behind this model is that a student's effort can be represented by a sequence of events where a later event (e.g., an exam score) or an effort (e.g., the decision to study actively every week) is a decision based on past events. Specifically, a student's activity is modeled by a sequence (P1, E1, P2, E2, P3, E3) where P1, P2, P3 are three micro persistence patterns and E1, E2, E3 are the normalized performance of the three exams in the course. The composition of P1, P2, P3 is referred to as the macro persistence pattern.

Out of the records of 344 students, we found 10 different macro persistence patterns. The majority of students were categorized into the macro patterns CCI (∼44%) and CII (∼35%). The distribution of the found patterns is shown in Table 2. This result is not surprising because use of the platform every week was not mandatory and students tended to study intensively right before the exam regardless of the reason.

Pattern   Quantity   Ratio
CCI       151        0.44
CII       119        0.35
ACC       24         0.07
CCC       14         0.04
AAC       10         0.03
ACI       8          0.02
ICC       6          0.02
ICI       5          0.01
ACA       4          0.01
IIC       3          0.01

Table 2: The Distribution of Persistence Patterns

4.1 Exploring the Relationship of Persistence Patterns and Exam Performance
Programming concepts are often complex and coupled. In an introductory programming course, we can expect that an advanced concept is usually built up from sets of fundamental concepts. The content of a later exam inevitably accumulates from previous exams. To illustrate the complexity of accumulated programming concepts over time, we calculated the correlations between exam performances and found that the correlation between E1 and E2 (Pearson's r(E1, E2)) is 0.74; r(E1, E3) = 0.71; and r(E2, E3) = 0.79. We also found that the partial correlation between E2 and E3, controlling for E1, is 0.56. Following the heuristic interpretation of Pearson's r, this result suggested that the performances on E1, E2, and E3 were moderately to highly correlated, and that we should consider the effect of previous exam performance when analyzing the variance in students' persistence patterns and exam performance.

Based on this result, we believed that the only period in a semester where we could observe the marginal correlation between persistence patterns and exam performance was the first exam period, where P1 and E1 occurred. We hypothesized that different micro persistence patterns were related to exam performance, considering that students might improve their understanding of learning topics by active self-assessment on the platform. To test this hypothesis, we conducted a one-way ANOVA on P1 and E1. The result showed that P1 did not have a significant main effect on E1 (F(2, 341) = 0.72, p = 0.48); that is, the micro/short-term persistence pattern was not marginally related to short-term performance.
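The partial correlation reported above can be recovered from the three pairwise Pearson coefficients alone, using the standard first-order partial-correlation identity (a sketch, not the authors' code):

```python
import math

def partial_r(r12, r13, r23):
    """First-order partial correlation between variables 2 and 3,
    controlling for variable 1."""
    return (r23 - r12 * r13) / math.sqrt((1 - r12 ** 2) * (1 - r13 ** 2))

# With r(E1,E2) = 0.74, r(E1,E3) = 0.71, and r(E2,E3) = 0.79 as reported,
# this yields r(E2,E3 | E1) of about 0.56, matching the value in the text.
```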
The annotation of persistence pattern was done by checking whether the moving average and differencing passed predefined thresholds (see Section 3.3 for detail). From top to bottom, the annotated patterns are CCC, AAC, and IIC. Figure 3: The Relationship of Exam Performance and Persistence Patterns for Students Starting with Active (A; the upper row) and Cramming (C; the lower row). The left column represents low-performing (LP) students and the right one represents high-performing (HP) students. The patterns of interest with significant difference were found mainly in the LP group, including the pairs (AAC, ACA), (CCC, CCI), (CCC, CII). of LP-AAC was relatively consistent in the first two periods and changed to Cramming in the third period. This result indicated that keeping Active in the first two periods might be helpful for LP students. This was probably due to the difficulty of learning topics in the course. Since topics in E1 and E2 were important for students to build up the fundamentals of programming languages, if they did not self-assess actively, they might not know that they needed to catch up as soon as possible. Note the sample size of these pattern was extremely small. A future study with a large sample size is needed to cross-validate this finding. Figure 4: The Comparison Exam Performance in LP Students with Different Persistence Patterns. We can see that the group “CCC” seemed to outper- formed all the other groups in E3. 4.2.2 Effective Persistence Patterns in LP Students Starting with Cramming The Cramming micro pattern (C) represents an intense amo- did not show any significant main effects from P1, P2, or unt of effort on the self-assessment platform in a short period the interaction of P1 and P2. However, the test of factorial of time. This is a common pattern we can find when the time design over P1 and the interaction of exam pairs (i.e., the is close to an exam date. 
In our sample dataset, we found formula P 1 + P 1 : P 2 + P 2 : P 3 + P 1 : P 3) on E3 showed that the pairs (LP-CCC, LP-CCI), (LP-CCC, LP-CII), and that the main effect of the interaction of P1 and P2 was (LP-CCC, HP-CCC) revealed interesting patterns in terms significant (F (4, 327) = 3.25, p = 0.01). Together with our of the variance in exam performance. previous test, this result suggested that even though short- term behavior might not bring an effect to the immediate For the first pair, (LP-CCC, LP-CCI), the analysis showed exam performance, in the long run, a student’s earlier be- there was no significant difference in their performance of havior (P1 and P2) might have an effect on the performance E2, however, in E3 LP-CCC (M = 0.61, SD = 0.17) per- of the final exam (E3). formed significantly better than LP-CCI (M = 0.33, SD = 0.21) (t(91) = 3.19, p = 0.02, d = 1.04). When compar- ing LP-CCC to LP-CII (M = 0.33, SD = 0.21), the sig- 4.2 Correlating the Dynamics of Persistence nificance was only found, again, in E3 (t(65) = 3.60, p = Patterns to Exam Performance 0.01, d = 1.36). In other words, LP students starting with Our next question was about the relationship between the Cramming and keeping this persistence pattern across the trajectory of persistence patterns (i.e., the variance in P1, semester performed better than those who did not keep the P2, P3) and student performance throughout the course of persistence pattern (CCI and CII) in the final exam. We three exams. We first grouped students into high-performing also found that LP-CCC students even had the best perfor- (HP) and low-performing (LP) by checking whether their mance in E3 compared to those with other macro patterns performance in E1 was higher than 0.6 or not. This cut (M = 0.35, SD = 0.23) (see Figure 4; t(175) = 3.48, p = point was in accordance with the fact that 60% is a com- 0.02, d = 1.15). An ANOVA analysis of 3-by-3 factorial mon cut point which decides the passing grade. 
The choice design over P2*P3 on E3 also showed that the main ef- of E1 was based on the finding in Section 4.1. Moreover, fect from P3 was significant (F (1, 144) = 6.73, p = 0.01), this transformation made the analysis of multiple categor- which further emphasized the importance of the persistence ical groups cleaner and easier to follow. An overview of pattern in the final period. Moreover, when comparing LP- students’ macro patterns and exam performance is shown in CCC students to their high performing sibling, HP-CCC, we Figure 3. found that although in E1 LP-CCC (M = 0.45, SD = 0.12) performed worse than HP-CCC (M = 0.83, SD = 0.10) (t(12) = −5.80, p = 0.00, d = −3.23; which was mainly due 4.2.1 Effective Persistence Patterns in LP Students to the grouping), their performance in E2 and E3 was not Starting with Active significantly different. The Active micro pattern (A) represents a continuous ef- fort on the self-assessment platform for a period of time The analysis showed that LP-CCC students not only outper- (see Section 3.3 for the definition). In the sample dataset, formed those with other macro persistence patterns C in the we identified four macro patterns starting with A: ACC, LP group, but performed on a par with HP students with ACI, AAC, and ACA. Among these patterns, we found that the same pattern in the final exam. These results collab- the development of AAC and ACA students was of interest. oratively suggested that being consistent on the Cramming First, LP students with AAC (LP-AAC) had similar perfor- behavior was an effective practice for LP students. One pos- mance in E1 and E2 compared to LP-ACA students. How- sible explanation was that LP students who kept the Cram- ever, LP-AAC students (M = 0.41, SD = 0.19) performed ming behavior on the self-assessment platform throughout significantly better than LP-ACA (M = 0.08, SD = 0.08) the semester might be showing their grip and willingness to (t(6) = 2.82, p = 0.05, d = 1.659). 
One apparent difference improve in the class. Another possible assumption of this in these two groups of students was that LP-ACA changed effective practice was that intensive self-assessment helped from A to C in the second period, and changed back to A in students to identify the learning topics or concepts they need the third period. On the other hand, the persistence pattern to further review and study. plicit from the perspective of the students. It is assumed that as students use the self-assessment platform, they be- come aware of which learning content they currently need to improve on (“the identification of needed skills”). This will allow them to review and address these learning gaps accordingly (“the development of time management and sub- goal planning”). Following this hypothesis, the practice of self-assessment may reflect a part of SRL strategies which allows for the interpretation of the findings in this work in terms of SRL behaviors. Our analysis showed that low-performing students who kept the cramming persistence pattern in self-assessment were able to improve and achieve competitive performance with the high-performing counterpart in the final exam. This re- sult suggests that although generally, cramming or procras- tination is a less-desirable behavior in learning [25], when such behavior is found in self-assessment platforms, a pos- Figure 5: Comparing Exam Performance of High itive outcome may be seen. An optimistic explanation to Performing Students in Exam1 and Exam2. We can this result is that the platform was seen as supplemental see that the performance of students with the pat- material to the course since it follows a format that closely tern “CC” dropped the most compared to students resembles the formal assessments (i.e., multiple-choice ques- with the other patterns. Only students with the tions). 
Thus, students who wanted to improve their per- pattern “AA” had the tendency to improve the per- formance could obtain actionable feedback allowing them to formance. review exam content in a short period of time. On the other hand, a relatively pessimistic explanation is that students were simply gaming the system to memorize the content in 4.2.3 Effective Persistence Patterns in HP Students the hopes of seeing similar questions in the exams. The performance of HP students was relatively stable com- pared to LP students (Figure 3). An ANOVA analysis of One interesting pattern we found from the analysis was that 3-by-3 factorial design over P1 and P2 on the difference of the effect of early behavior might reflect on performance at E1 and E2 showed that the main effect of the interaction of a later time (see Section 4.1). Our analysis showed that low- P1 and P2 was significant (F (2, 148) = 3.40, p = 0.03). Fol- performing students who kept the same persistence patterns lowing this outcome, we further compared and examined the in the first two exam periods might perform better than the value with different persistence patterns (see Figure 5). The others. This result may suggest that 1) in this course, the only pattern of interest we found was that HP students start- effort in the first two exam periods was crucial, which is not ing with Cramming and kept doing so were not able to keep surprising considering the comprehensive nature of midterm their performance in E2. A statistics test showed that the exams; and 2) being persistent only for a short term was performance of this group of students dropped significantly not enough. This can be leveraged to guide students in from E1 (M = 0.81, SD = 0.11) to E2 (M = 0.70, SD = practice. For example, a recommender can be implemented 0.21) (t(142) = −3.85, p = 0.00, d = −0.65). Such a pattern to inform students that being persistent in the long run is was not found from E2 to E3. important. 
This recommender can also adapt to students’ self-assessments in a short term. This result could be a signal that for HP students who wanted to stay competitive, intensive self-assessment might This study gave us a glimpse of the students’ behavior as not help much. A possible explanation to this outcome was they progressed into the course. We observed how many that due to that the topics on the self-assessment platform students utilized the system and crammed as they prepared were “limited” in terms of scope and the amount of content, for an upcoming exam. Those who kept their persistence when the complexity of topics increased, students were not patterns outperformed those who did not in terms of perfor- able to use the platform to review the important and nec- mance in the formal assessments. This result suggested that essary learning content that were not covered by the self- persistently putting an effort to utilize additional materials, assessment platform. in general, would positively be reflected on the course per- formance. However, we did not look into their self-reported 5. DISCUSSION motivations or reasons for coming back to the system, e.g., whether it was for them to self-assess or to practice on to ad- One SRL strategy in the literature relevant to the perfor- ditional learning materials. Since what we know about the mance of self-assessment is assessment of task difficulty pro- students is currently limited to their activity log data on the posed by Falkner and colleagues. When the identification system, we may only assume that those who persistently of needed skills is incorporated, it would lead to the de- used the system belonged to the so-called “hard-working” velopment of time management and sub-goal planning [16]. ones who had been the better-performing students prior to The performance of self-assessment has two fundamental taking the course or had a better metacognitive skills. 
goals: for students to practice the learning concepts, and for them to obtain feedback about their current span of knowledge. While the former is explicit, the latter is rather implicit.

Additionally, although the findings may have a potential connection to SRL, in this work we are not able to ground this hypothesis due to the lack of data from authentic SRL measurements (e.g., the qualitative questionnaires for mapping constructs that are widely used in the literature). Moreover, the lack of some persistence patterns, or of sufficient samples of them, potentially biases our interpretation toward certain kinds of persistence patterns. This is despite the fact that the analyses were based on data collected from the same course given in three different semesters, which to some extent consolidated the possible patterns that could be found in this specific course. These are the current limitations of this work, and a future study can further evaluate our findings with these issues in mind.

6. CONCLUSIONS AND LIMITATIONS
This work examined the relationship between the persistence of self-assessment and exam performance in an introductory course of computer programming. We collected data from the same course given in three different semesters. Out of 344 students, we identified 10 different long-term persistence patterns with a probabilistic mixture model, each consisting of three short-term persistence patterns drawn from Active, Cramming, and Inactive. From a series of analyses that explained the variance of exam performance by the dynamics, or changes, in persistence patterns, we found that low-performing students benefited from the continuity of intensive or active self-assessment; somewhat on the contrary, high-performing students might not achieve similar effectiveness from the continuity of cramming. We discussed these outcomes under the framework of self-regulated learning and provided possible assumptions and explanations in the context of the course.

There are some limitations to the methodology of this work. First, the persistence model in use was built on the amount of students' activity on the self-assessment platform, which only included the count of unique question attempts. In other words, our model did not consider other potentially valuable information in self-assessment, such as the correctness of first attempts or the coverage and difficulty of learning topics, among others. The second limitation is that our definition of the micro persistence pattern Cramming may overlap with Active and Inactive to some extent. Specifically, we did not discriminate students who were "Active and Cramming" from those who were "Inactive and Cramming". One reason for this decision was to avoid the overfitting that made the resulting patterns too sparse. Nevertheless, we might have introduced some bias toward the Cramming pattern in our analysis. Finally, the persistence patterns found in the same course seemed consistent across the three different semesters. However, this could be due to the structure or the nature of the content provided in this specific course of computer programming. Although we can expect that some behavioral patterns may be general across fields of study (e.g., cramming before an exam), our findings related to exam performance may not be. Future research should take the organization of course content into account when trying to replicate the result in a different course.
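The persistence model above reduces each student to activity counts, which were then grouped with a probabilistic mixture model. As a rough, self-contained illustration of that idea (not the paper's actual model or features), a small EM routine can separate hypothetical per-period unique-attempt counts into latent activity levels:

```python
import math

def em_gaussian_mixture(xs, k=2, iters=50):
    """Tiny EM for a 1-D Gaussian mixture, an illustrative stand-in for the
    probabilistic mixture model that grouped students' activity levels.

    xs: per-student unique-attempt counts for one exam period; k >= 2.
    """
    srt = sorted(xs)
    # Deterministic init: spread the k means over the order statistics.
    mus = [srt[round(i * (len(srt) - 1) / (k - 1))] for i in range(k)]
    sigmas, weights = [1.0] * k, [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in xs:
            ps = [w / (s * math.sqrt(2 * math.pi))
                  * math.exp(-(x - m) ** 2 / (2 * s * s))
                  for w, m, s in zip(weights, mus, sigmas)]
            z = sum(ps) or 1e-300
            resp.append([p / z for p in ps])
        # M-step: re-estimate mixing weights, means, and variances.
        for j in range(k):
            rj = [r[j] for r in resp]
            nj = sum(rj) or 1e-12
            weights[j] = nj / len(xs)
            mus[j] = sum(r * x for r, x in zip(rj, xs)) / nj
            var = sum(r * (x - mus[j]) ** 2 for r, x in zip(rj, xs)) / nj
            sigmas[j] = max(math.sqrt(var), 1e-6)
    return weights, mus, sigmas
```

With two well-separated groups of counts, the fitted means land near the group averages, which is the intuition behind recovering latent persistence levels from raw activity logs.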
7. REFERENCES
[1] A. Altadmri and N. C. Brown. 37 million compilations: Investigating novice programming mistakes in large-scale student data. In Proceedings of the 46th ACM Technical Symposium on Computer Science Education, pages 522–527. ACM, 2015.
[2] M. Alzaid and I.-H. Hsiao. Behavioral analytics for distributed practices in programming problem-solving. In 2019 IEEE Frontiers in Education Conference (FIE), pages 1–8. IEEE, 2019.
[3] M. Alzaid, D. Trivedi, and I.-H. Hsiao. The effects of bite-size distributed practices for programming novices. In 2017 IEEE Frontiers in Education Conference (FIE), pages 1–9. IEEE, 2017.
[4] A. R. Baig and H. Jabeen. Big data analytics for behavior monitoring of students. Procedia Computer Science, 82:43–48, 2016.
[5] A. Bandura and D. H. Schunk. Cultivating competence, self-efficacy, and intrinsic interest through proximal self-motivation. Journal of Personality and Social Psychology, 1981.
[6] A. S. Benjamin and J. Tullis. What makes distributed practice effective? Cognitive Psychology, 2010.
[7] D. Boud and N. Falchikov. Quantitative studies of student self-assessment in higher education: A critical analysis of findings. Higher Education, 18(5):529–549, 1989.
[8] K. E. Boyer, R. Phillips, A. Ingram, E. Y. Ha, M. Wallis, M. Vouk, and J. Lester. Investigating the relationship between dialogue structure and tutoring effectiveness: A hidden Markov modeling approach. International Journal of Artificial Intelligence in Education, 21(1-2):65–81, 2011.
[9] K. Buffardi and S. H. Edwards. Effective and ineffective software testing behaviors by novice programmers. In Proceedings of the Ninth Annual International ACM Conference on International Computing Education Research, pages 83–90. ACM, 2013.
[10] D. L. Butler and P. H. Winne. Feedback and self-regulated learning: A theoretical synthesis. Review of Educational Research, 65(3):245–281, 1995.
[11] A. S. Carter, C. D. Hundhausen, and O. Adesope. The normalized programming state model: Predicting student performance in computing courses based on programming behavior. In Proceedings of the Eleventh Annual International Conference on International Computing Education Research, pages 141–150. ACM, 2015.
[12] L. Cen, D. Ruta, and J. Ng. Big education: Opportunities for big data analytics. In 2015 IEEE International Conference on Digital Signal Processing (DSP), pages 502–506. IEEE, 2015.
[13] C.-Y. Chung and I.-H. Hsiao. Quantitative analytics in exploring self-regulated learning behaviors: Effects of study persistence and regularity. In 2019 IEEE Frontiers in Education Conference (FIE), pages 1–9. IEEE, 2019.
[14] C.-Y. Chung and I.-H. Hsiao. Investigating patterns of study persistence on self-assessment platform of programming problem-solving. In Proceedings of the 51st ACM Technical Symposium on Computer Science Education, pages 162–168. ACM, 2020.
[15] A. Cicchinelli, E. Veas, A. Pardo, V. Pammer-Schindler, A. Fessl, C. Barreiros, and S. Lindstädt. Finding traces of self-regulated learning in activity streams. In Proceedings of the 8th International Conference on Learning Analytics and Knowledge (LAK '18), pages 191–200, 2018.
[16] K. Falkner, R. Vivian, and N. J. Falkner. Identifying computer science self-regulated learning strategies. In Proceedings of the 2014 Conference on Innovation & Technology in Computer Science Education (ITiCSE '14), pages 291–296. ACM, 2014.
[17] J. Guerra, S. Sahebi, P. Brusilovsky, and Y. Lin. The problem solving genome: Analyzing sequential patterns of student work with parameterized exercises. In 7th International Conference on Educational Data Mining, pages 153–160. International Educational Data Mining Society, 2014.
[18] D. J. Hacker, J. Dunlosky, and A. C. Graesser. Metacognition in Educational Theory and Practice. Routledge, 1998.
[19] B. Hoffman, R. Morelli, and J. Rosato. Student engagement is key to broadening participation in CS. In Proceedings of the 50th ACM Technical Symposium on Computer Science Education (SIGCSE '19), pages 1123–1129. ACM, 2019.
[20] M. C. Jadud and B. Dorn. Aggregate compilation behavior: Findings and implications from 27,698 users. In Proceedings of the Eleventh Annual International Conference on International Computing Education Research, pages 131–139. ACM, 2015.
[21] H. H. Kelley and J. L. Michela. Attribution theory and research. Annual Review of Psychology, 1980.
[22] D. Loksa and A. J. Ko. The role of self-regulation in programming problem solving process and success. In Proceedings of the 2016 ACM Conference on International Computing Education Research, pages 83–91. ACM, 2016.
[23] D. Loksa, A. J. Ko, W. Jernigan, A. Oleson, C. J. Mendez, and M. M. Burnett. Programming, problem solving, and self-awareness. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pages 1449–1461. ACM, 2016.
[24] Y. Lu and I.-H. Hsiao. Seeking programming-related information from large scaled discussion forums, help or harm? In Proceedings of the 9th International Conference on Educational Data Mining, pages 442–447, 2016.
[25] J. Park, R. Yu, F. Rodriguez, R. Baker, P. Smyth, and M. Warschauer. Understanding student procrastination via mixture models. In Educational Data Mining, pages 187–197. International Educational Data Mining Society, 2018.
[26] C. Piech, M. Sahami, D. Koller, S. Cooper, and P. Blikstein. Modeling how students learn to program. In Proceedings of the 43rd ACM Technical Symposium on Computer Science Education, pages 153–160. ACM, 2012.
[27] H. L. Roediger and A. C. Butler. The critical role of retrieval practice in long-term retention. Trends in Cognitive Sciences, 15(1):20–27, 2011.
[28] R. D. Roscoe and M. T. H. Chi. Tutor learning: The role of explaining and responding to questions. Instructional Science, 36(4):321–350, 2008.
[29] D. H. Schunk. Enhancing self-efficacy and achievement through rewards and goals: Motivational and informational effects. Journal of Educational Research, 1984.
[30] D. H. Schunk. Sequential attributional feedback and children's achievement behaviors. Journal of Educational Psychology, 76(6):1159–1169, 1984.
[31] D. H. Schunk and E. L. Usher. Assessing self-efficacy for self-regulated learning. In B. J. Zimmerman and D. H. Schunk, editors, Educational Psychology Handbook: Handbook of Self-Regulation of Learning and Performance. Routledge, 2011.
[32] J. C. Sun, C. Lin, and C. Chou. Applying learning analytics to explore the influence of online learners' motivation on their online learning behavioral patterns. In 2016 5th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI), pages 377–380, 2016.
[33] E. J. Webb, D. T. Campbell, R. D. Schwartz, and L. Sechrest. Unobtrusive Measures, volume 2. Sage Publications, 1999.
[34] P. H. Winne and R. S. Baker. The potentials of educational data mining for researching metacognition, motivation and self-regulated learning. JEDM - Journal of Educational Data Mining, 2013.
[35] P. H. Winne and A. F. Hadwin. Studying as self-regulated learning. In Metacognition in Educational Theory and Practice. Routledge, 1998.
[36] B. J. Zimmerman and D. H. Schunk. Self-regulated learning and performance: An introduction and an overview. In Educational Psychology Handbook: Handbook of Self-Regulation of Learning and Performance, page 1. Routledge, 2011.