Comparing Ebook Student Interactions With Test Scores: A Case Study Using CSAwesome

Hisamitsu Maeda, Barbara Ericson, Paramveer Dhillon
University of Michigan, Ann Arbor, MI, USA
{himaeda|barbarer|dhillonp}@umich.edu

Copyright ©2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

ABSTRACT
Interactive ebooks produce better learning gains than static ebooks, students prefer them, and more computing courses are using them. The number of computing ebooks on the open-source Runestone interactive ebook platform increased from one in 2011 to over 30 in 2020. Runestone currently serves over two million page views a week. It includes books for CS0, CS1, CS2, data science, and web development and supports coding in Java, Python, and C++. The ebooks include instructional material (text, images, and videos) and interactive practice problems with immediate feedback: multiple-choice, fill-in-the-blank, write code (active code), mixed-up code (Parsons), clickable code, and matching. User interaction with the ebooks is timestamped and logged. This information includes page views, video plays, video completions, Parsons problem moves, problem answers, and learner-written code. This fine-grained data may help us automatically identify struggling students. This paper reports on several analyses comparing student activities to the midterm score from one of the Runestone ebooks, CSAwesome. Specifically, we compared the major types of log file entries to the midterm score and also conducted an in-depth analysis of mixed-up code (Parsons) problem data.

Keywords
e-book, data mining, Parsons problem

1. INTRODUCTION
Research has shown that interactive ebooks produce better learning gains than static ebooks [20]. In addition, most students report that the interactive features help them learn and that they want to use interactive ebooks in future courses [15]. A 2013 working group predicted that traditional CS textbooks would be replaced by interactive ebooks [13].

Runestone is an open-source ebook platform that has grown from serving one interactive ebook [15] in 2011 to over 30 free ebooks in 2021. It supports several languages, including Python, Java, C++, and SQL [10]. During the 2020-21 academic year, Runestone had 69,400 registered users and served an average of over two million page views a week.

Runestone interactive ebooks log timestamped user interactions. (Researchers can request an anonymized log file from Brad Miller, the founder of Runestone.) For the analyses in this paper, we use the CSAwesome interactive ebook. CSAwesome has been endorsed by the College Board for the Advanced Placement (AP) Computer Science A (CSA) course [4]. Advanced Placement courses are taken by secondary students for college credit and/or placement. The AP CSA course is equivalent to a first course for majors (CS1) at the college level and covers object-oriented programming in Java. Our CSAwesome data is from custom courses: instructors can create a custom course for any of the free ebooks on Runestone and have their students register for that custom course.

Interactive ebooks provide rich data that could be leveraged to improve instruction [23]. We may be able to identify struggling students in order to provide help. Our research questions were: 1) What student activities correlate with scores on the midterm? and 2) What Parsons problem activities correlate with the midterm score? To answer these questions, we performed regression analyses at both a higher level, based on the major types of activities, and at a lower level, with a more in-depth analysis of Parsons problem data.

2. RELATED WORK
We are interested in the relationship between students' activities in the ebook and their pretest and midterm scores. In a related study, Pollari-Malmi et al. [20] found an increase in use, motivation, and learning gains from use of an interactive ebook versus an equivalent static ebook. Ericson et al. [6] found that teachers who used more of the interactive features in an ebook had higher gains in confidence and higher scores on the final posttest. Parker et al. [18] found that students and teachers used an interactive ebook differently, with teachers showing more characteristics of expert learners. Akçapınar et al. [1] used student reading behavior in an ebook to predict at-risk students. Park et al. [17] used statistical change detection techniques to detect changes in student behavior from clickstream data in a Learning Management System (LMS). They found that students who increased their reviewing activity relative to other students in the course had a higher probability of passing the course.

Several researchers [3, 24, 21, 22, 9, 19, 2] have been exploring Parsons problems as both an alternative to writing code from scratch and as a summative assessment. In Parsons problems, learners must place mixed-up blocks of code in the correct order. They may also have to correctly indent the blocks. Parsons problems can also have distractors, which are code blocks that are not needed in a correct solution. Helminen et al. [12] visualized students' problem-solving process on Parsons problems using a graphical representation. They detected several different approaches to solving Parsons problems, including top-down, control structures first, and trial and error. Some learners got stuck in circular loops and repeated the same incorrect solution. Ericson et al. [8] found that more learners attempted Parsons problems than nearby multiple-choice questions in an interactive ebook. Maharjan et al. [14] proposed an edit distance trail to show students' solution paths in Parsons problems in a simpler fashion than a graphical representation.
They also discussed potential issues with studying Parsons problems using descriptive statistics methods. Morrison et al. [16] found that subgoal labels help students solve Parsons problems. Du et al. [3] conducted a review of recent studies on Parsons problems.

Ericson invented two types of adaptation for Parsons problems [7, 5]. In intra-problem adaptation, the problem can be dynamically made easier by removing a distractor, providing the indentation, or combining two blocks into one; the adaptation is triggered by clicking a "Help" button. In inter-problem adaptation, the difficulty of the next problem is automatically modified based on the learner's performance on the last problem. The problem can be made easier by pairing distractor and correct code blocks or by removing distractors. It can be made more difficult by using all distractors and randomly mixing the distractor blocks in with the correct code blocks. Learners are nearly twice as likely to correctly solve adaptive Parsons problems as non-adaptive ones, and they report that solving Parsons problems helps them learn to fix and write code [5]. A randomized controlled study provided evidence that solving adaptive Parsons problems takes significantly less time than writing the equivalent code, with similar learning gains from pretest to posttest [7].

3. DATASET
The log file used in our analysis comes from the CSAwesome interactive ebook on the open-source Runestone platform. This book was revised by Beryl Hoffman of Elms College and the Mobile CSP project in 2019 for the 2019 AP CSA exam [4]. The log includes page views, video interaction, and the results from interactive practice problems: multiple-choice, write code (active code), and mixed-up code (Parsons problems).

An active code exercise is a traditional programming exercise in which students write, edit, and run code in the ebook. Many active code exercises have unit tests that students can use to verify that their code is correct. Students can also use a slider to view previous versions of the code. The "Show CodeLens" button on an active code exercise displays a program visualizer (CodeLens), a version of Python Tutor [11], which allows students to step through their code line by line and visualize the variables. A Parsons problem provides mixed-up code blocks that the learner must place in the correct order [19], as shown in Figure 1. The Parsons problems in this ebook were adaptive. While some adaptive systems use selection adaptation, in which the next problem is selected from a set of possible problems based on the learner's performance, this system modifies the difficulty of the current or next problem in the ebook based on the learner's performance.

[Figure 1: Example Parsons Problem with Paired Distractors]

The AP CSA exam includes 40 multiple-choice questions and four free-response questions in which the student must write Java code to solve a problem. The ebook is broken into 10 content-based units and five practice units. At the end of each content-based unit there are at least 10 multiple-choice questions, mixed-up code (Parsons) problems, and write code problems [4].

We analyzed five categories of log file entries: page views, video (play, pause, and completion), active code interaction (run, edit, slide, and unit test), Parsons problem block moves and answers, and answers to multiple-choice questions. Overall, the log file contained data from 1,893 students in 57 custom classes. Most of these were high school classes, but some were college classes. This paper analyzed a subset of the log data from 505 students who took both the pretest and the midterm in their course. The summary statistics of the data are shown in Table 1. Over 37% of the log file activities were interactions with active code. This included running the code, editing the code, sliding the history to view different versions of the code, and running unit tests.

We were also interested in the effect of class size on students' performance. For this analysis, we divided the students into two groups: classes with fewer than 30 students, which is the typical maximum class size in high school, and classes with more than 30 students. 224 students were in large classes, and 281 were in small classes.

| Event Type              | Count     | Percentage |
| active code             | 1,020,735 | 37.66%     |
| page view               | 651,136   | 24.02%     |
| Parsons move and answer | 524,206   | 19.34%     |
| multiple choice answer  | 261,321   | 9.64%      |
| video                   | 83,892    | 3.09%      |
| other                   | 169,363   | 6.25%      |
| Total                   | 2,710,653 | 100%       |

Table 1: Various types of events in our dataset.

4. METHODOLOGY
4.1 Regression analysis of student activities
In this section, we analyze the impact of student activities on student performance. First, we explore the data and check the relationship between different student activities. Figure 2 shows that the percentage correct on the midterm has a positive correlation with the percentage correct for each activity.

[Figure 2: Correlation structure between the different activities.]

Multiple-choice problems have the highest correlation with the midterm. This could be because the midterm is a set of 20 multiple-choice questions. On the other hand, there is a negative correlation between the midterm and some other activities (i.e., the number of times that CodeLens was used and the percent of videos that were completed).

Next, we fit a linear regression model to this data with the percentage of correct answers on the midterm exam as the dependent variable. The count and type of student activities were used as independent variables.
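As a concrete sketch of this regression setup, the snippet below builds synthetic, skewed activity counts, applies the log(x + 1) transform and standardization described in this section, adds a class-size dummy, and fits ordinary least squares. All column names and coefficient values are invented for illustration; this is not the authors' actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500  # students (matches the size of the analyzed subset)

# Hypothetical activity measures; real event-log counts are heavily skewed.
mc_correct = rng.exponential(scale=50, size=n)      # multiple-choice interactions
page_views = rng.exponential(scale=200, size=n)     # page views
large_class = (rng.random(n) < 0.45).astype(float)  # 1 = class with > 30 students

def log_std(x):
    """log(x + 1) transform, then standardize to mean 0 and sd 1."""
    lx = np.log1p(x)
    return (lx - lx.mean()) / lx.std()

mc_z, pv_z = log_std(mc_correct), log_std(page_views)

# Synthetic midterm percentage with known signs (positive activity effects,
# negative large-class effect), so the fit below should recover them.
midterm = 0.5 + 0.2 * mc_z + 0.05 * pv_z - 0.15 * large_class \
          + rng.normal(0, 0.05, n)

# Ordinary least squares (intercept first).
X = np.column_stack([np.ones(n), mc_z, pv_z, large_class])
beta, *_ = np.linalg.lstsq(X, midterm, rcond=None)
print(dict(zip(["intercept", "mc", "views", "large_class"], beta.round(3))))
```

In practice one would use a package such as statsmodels (or R's lm) to also obtain the p-values reported with the coefficients; the plain least-squares fit above only recovers the coefficient estimates.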
Since the activity variables have a skewed distribution, we log-transformed them as x → log(x + 1) and standardized them before running the regression analysis. Also, since there is a difference in test results based on class size (as can be seen in Figure 3), we added class size as a dummy variable in our regression model. The regression results are shown in Table 2.

[Figure 3: Results for the pretest and midterm exams.]

The percent correct on the midterm was negatively correlated with being in a larger class, perhaps because there is less one-on-one interaction with the instructor and, as a result, lower learning outcomes. Midterm results were also negatively correlated with the number of interactions with CodeLens and the number of videos completed. This may indicate that both of these activities were more likely to be used by struggling students. Midterm results were positively correlated with the percent correct on the pretest, the percentage correct on other multiple-choice questions, the number of page views, and the number of videos played. It is interesting that the midterm score is positively correlated with the number of videos played but negatively correlated with the number of videos completed. It could be that stronger students watch a video until they find what they need and then quit.

4.2 Regression Analysis of Parsons Problems
In this section, we conduct an in-depth analysis of Parsons problems. As described earlier, these Parsons problems used both intra-problem and inter-problem adaptation. In intra-problem adaptation, if a student submits at least three incorrect solutions, they are notified that they can use a "Help" button to make the problem easier. Each time the student clicks the "Help" button, the ebook will remove a distractor block from the solution, provide the indentation, or combine two blocks into one, hence providing an implicit hint.

First, we pre-process the log data to gather detailed information on the Parsons interactions. As shown in Figure 4, students have to move from a state where all the blocks are jumbled to the state in which all the blocks are placed correctly. The final state is the correct solution. We count each step taken by the students and the number of failures incurred until a student finds the correct ordering.
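The per-student counting just described can be sketched as follows. The event schema here is invented for illustration (the actual Runestone log format differs); the function derives the kinds of quantities used in this analysis: total block moves, failed submissions, help clicks, and the steps and time before the first help request.

```python
from dataclasses import dataclass

@dataclass
class Event:
    """One hypothetical Parsons log entry (schema invented for illustration)."""
    ts: float  # seconds since the student opened the problem
    kind: str  # "move", "incorrect", "correct", or "help"

def parsons_features(events):
    """Derive per-student counts for one Parsons problem attempt."""
    steps = failures = helps = 0
    steps_before_help = time_before_help = None
    for e in sorted(events, key=lambda e: e.ts):
        if e.kind == "move":
            steps += 1
        elif e.kind == "incorrect":
            failures += 1
        elif e.kind == "help":
            helps += 1
            if steps_before_help is None:  # record state at first help click
                steps_before_help, time_before_help = steps, e.ts
        elif e.kind == "correct":
            break  # solved: stop counting
    return {"steps": steps, "failures": failures, "helps": helps,
            "steps_before_help": steps_before_help,
            "time_before_help": time_before_help}

log = [Event(5, "move"), Event(9, "move"), Event(12, "incorrect"),
       Event(20, "help"), Event(31, "move"), Event(40, "correct")]
print(parsons_features(log))
# → {'steps': 3, 'failures': 1, 'helps': 1,
#    'steps_before_help': 2, 'time_before_help': 20}
```

Subtracting the minimum number of moves a problem requires from the `steps` count would give an "extra steps" style measure of wasted moves.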
We count the number of times a student got help (clicked the "Help" button). We also count the number of steps taken and the time elapsed until a student asked for help. In order to tease apart the effect of getting help, we add an interaction term with the "help flag" in our regression.

[Figure 4: The process of solving the Parsons problem]

The regression result is shown in Table 3. As can be seen from the result, being in a large class is negatively related to the midterm score, as we found in our previous regression analysis. We also found that the number of steps before a student got help from the software was positively correlated with the midterm test results. This could be because students who received help from the software after performing more correct steps were more motivated to succeed in the course in the first place. In other words, stronger learners could figure out more of the problem before they asked for help. On the other hand, the time taken to get support was negatively associated with the test score. This implies that students who took too long to get support scored poorly on the midterm. In addition, students who received more help did not score as well on the midterm.

| Variable                               | Coefficient       |
| Large Class (or not)                   | -0.362*** (0.00)  |
| Percentage correct for pretest         | 0.1045*** (0.009) |
| Percentage correct for multiple choice | 0.5366*** (0.000) |
| Number of active code interactions     | 0.0901* (0.09)    |
| Number of CodeLens interactions        | -0.1403*** (0.002)|
| Number of page views                   | 0.0931** (0.03)   |
| Number of videos played                | 0.158*** (0.005)  |
| Number of videos completed             | -0.08** (0.03)    |
| N                                      | 417               |
| R2                                     | 0.416             |
*** p < 0.01, ** p < 0.05, * p < 0.1

Table 2: Regression result. The outcome variable is the midterm score and the independent variables are the various student activities. Since the activity variables have a skewed distribution, we log-transformed them as x → log(x + 1) and standardized them before running the regression analysis. p-values are shown in parentheses.

| Variable                                                 | Coefficient        |
| Large Class (or not)                                     | -0.0862*** (0.000) |
| Parsons problem correct                                  | 0.0797* (0.09)     |
| Number of incorrect submissions                          | -0.0136*** (0.000) |
| Number of times help is used (when help is used)         | -0.0279** (0.03)   |
| Number of steps before getting help (when help is used)  | 0.413* (0.07)      |
| Elapsed time before getting help (when help is used)     | -0.3472* (0.08)    |
| N                                                        | 402                |
| R2                                                       | 0.141              |
*** p < 0.01, ** p < 0.05, * p < 0.1

Table 3: Regression analysis of Parsons problems. The dependent variable is the midterm score and the independent variables are the various student activities pertaining to Parsons problems. p-values are shown in parentheses.

Next, for a particular problem that the learners solved (drawing a sideways L with a turtle, in Unit 2, as shown in Figure 1), we also analyzed the number of block moves and the time elapsed before the learner found the correct answer. As discussed earlier, we counted the number of steps it took from the initial state to the correct order. We define "extra steps" as the number of code-block moves beyond the number required to reach a correct answer. For example, if a solution can be reached in 10 steps and a student took 12 steps, that would be two extra steps. Additionally, we measured the amount of time it took students to order the jumbled code blocks in the Parsons problem. Figure 5 shows the number of extra steps taken as a function of the time taken for students in both large and small classes.

[Figure 5: Relationship between extra steps and time]

Figure 6 compares the extra steps taken as a function of the class size and the midterm test score. As can be seen, there is a significant relationship between students with good midterm test scores and the number of extra steps taken (t-statistic 5.082, p < 0.001). On the other hand, there is no relationship between class size and the number of extra steps taken (t-statistic 1.151, p > 0.1).

[Figure 6: Comparison of extra steps by midterm score and class size. Groups: 1. small class and midterm score less than median, 2. small class and midterm score greater than median, 3. large class and midterm score less than median, and 4. large class and midterm score greater than median]

Figure 7 shows the result of comparing time by class size and midterm test score. Unlike the number of extra steps, there is no significant association between students with good midterm test scores and the time taken by the learners (t-statistic 1.7843, p < 0.07). Also, there is no relationship between class size and time (t-statistic 0.683, p > 0.1).

[Figure 7: Comparison of time by midterm score and class size. Groups: 1. small class and midterm score less than median, 2. small class and midterm score greater than median, 3. large class and midterm score less than median, and 4. large class and midterm score greater than median]

From this analysis, it appears that students who took fewer extra steps while solving this Parsons problem had better midterm scores. A similar trend is also observed in another problem from Unit 4, as shown in Figure 8. This could mean that taking extra steps and/or taking longer to solve a Parsons problem indicates that a student is struggling.

5. LIMITATIONS
The log file data was from a random selection of custom courses on the Runestone platform. We do not have any additional information about these courses, such as which items were assigned, final grades, or student demographics.
[Figure 8: Comparison of extra steps by midterm score and class size for another problem, in Unit 4 on nested loops]

In the summer of 2021, we will be receiving log file data for this same ebook from teachers who attended professional development with the Mobile CSP team. That data should allow for a more in-depth analysis.

6. CONCLUSION
In this paper, we performed several quantitative analyses of the clickstream data from student interaction with the CSAwesome ebook. We analyzed the relationship between students' activities in the ebook and their midterm scores. We found several positive and negative correlations. In a regression analysis, the most highly weighted variable with a positive correlation was the percentage correct on other multiple-choice questions, and the most highly weighted variable with a negative correlation was being in a large class.

We also analyzed learner interaction patterns on the mixed-up code (Parsons) problems. Specifically, we examined the impact of the number of steps taken, the time taken, and the frequency of help use on students' midterm scores. The results show a positive association between the number of correctly completed Parsons problems and the learners' midterm scores. Further, there was a negative association between the midterm scores and the time the learner took to get help on a Parsons problem, as well as a negative association between class size and the midterm score. A close look at two Parsons problems showed a negative correlation between the number of extra steps and the time taken to solve a Parsons problem and the midterm score.

While our analyses uncover subtle patterns in students' interactions with the CSAwesome ebook, it will be interesting to test the robustness of our findings and to see whether they generalize to other interactive ebook platforms or across different programming courses, e.g., in C++ or Python. If Parsons problems can help detect struggling students early in a course, it may be possible to intervene to improve student performance.

7. REFERENCES
[1] G. Akçapınar, M. N. Hasnine, R. Majumdar, B. Flanagan, and H. Ogata. Developing an early-warning system for spotting at-risk students by using ebook interaction logs. Smart Learning Environments, 6(1):4, 2019.
[2] P. Denny, A. Luxton-Reilly, and B. Simon. Evaluating a new exam question: Parsons problems. In Proceedings of the Fourth International Workshop on Computing Education Research, pages 113–124, 2008.
[3] Y. Du, A. Luxton-Reilly, and P. Denny. A review of research on Parsons problems. In Proceedings of the Twenty-Second Australasian Computing Education Conference, ACE '20, pages 195–202, New York, NY, USA, 2020. Association for Computing Machinery.
[4] B. Ericson, B. Hoffman, and J. Rosato. CSAwesome: AP CSA curriculum and professional development (practical report). In Proceedings of the 15th Workshop on Primary and Secondary Computing Education, WiPSCE '20, New York, NY, USA, 2020. Association for Computing Machinery.
[5] B. Ericson, A. McCall, and K. Cunningham. Investigating the affect and effect of adaptive Parsons problems. In Proceedings of the 19th Koli Calling International Conference on Computing Education Research, pages 1–10, 2019.
[6] B. Ericson, S. Moore, B. Morrison, and M. Guzdial. Usability and usage of interactive features in an online ebook for CS teachers. In Proceedings of the Workshop in Primary and Secondary Computing Education, pages 111–120, 2015.
[7] B. J. Ericson, J. D. Foley, and J. Rick. Evaluating the efficiency and effectiveness of adaptive Parsons problems. In Proceedings of the 2018 ACM Conference on International Computing Education Research, pages 60–68, 2018.
[8] B. J. Ericson, M. J. Guzdial, and B. B. Morrison. Analysis of interactive features designed to enhance learning in an ebook. In Proceedings of the Eleventh Annual International Conference on International Computing Education Research, pages 169–178, 2015.
[9] B. J. Ericson, L. E. Margulieux, and J. Rick. Solving Parsons problems versus fixing and writing code. In Proceedings of the 17th Koli Calling International Conference on Computing Education Research, pages 20–29, 2017.
[10] B. J. Ericson and B. N. Miller. Runestone: A platform for free, on-line, and interactive ebooks. In Proceedings of the 51st ACM Technical Symposium on Computer Science Education, pages 1012–1018, 2020.
[11] P. J. Guo. Online Python Tutor: embeddable web-based program visualization for CS education. In Proceedings of the 44th ACM Technical Symposium on Computer Science Education, pages 579–584, 2013.
[12] J. Helminen, P. Ihantola, V. Karavirta, and L. Malmi. How do students solve Parsons programming problems? An analysis of interaction traces. In Proceedings of the Ninth Annual International Conference on International Computing Education Research, ICER '12, pages 119–126, New York, NY, USA, 2012. Association for Computing Machinery.
[13] A. Korhonen, T. Naps, C. Boisvert, P. Crescenzi, V. Karavirta, L. Mannila, B. Miller, B. Morrison, S. H. Rodger, and C. A. Shaffer. Requirements and design strategies for open source interactive computer science ebooks. In Proceedings of the ITiCSE Working Group Reports Conference on Innovation and Technology in Computer Science Education, pages 53–72, 2013.
[14] S. Maharjan and A. Kumar. Using edit distance trails to analyze path solutions of Parsons puzzles. In EDM, 2020.
[15] B. N. Miller and D. L. Ranum. Beyond PDF and EPUB: toward an interactive textbook. In Proceedings of the 17th ACM Annual Conference on Innovation and Technology in Computer Science Education, pages 150–155, 2012.
[16] B. B. Morrison, L. E. Margulieux, B. Ericson, and M. Guzdial. Subgoals help students solve Parsons problems. In Proceedings of the 47th ACM Technical Symposium on Computing Science Education, pages 42–47, 2016.
[17] J. Park, K. Denaro, F. Rodriguez, P. Smyth, and M. Warschauer. Detecting changes in student behavior from clickstream data. In Proceedings of the Seventh International Learning Analytics & Knowledge Conference, LAK '17, pages 21–30, New York, NY, USA, 2017. Association for Computing Machinery.
[18] M. C. Parker, K. Rogers, B. J. Ericson, and M. Guzdial. Students and teachers use an online AP CS Principles ebook differently: Teacher behavior consistent with expert learners. In Proceedings of the 2017 ACM Conference on International Computing Education Research, pages 101–109, 2017.
[19] D. Parsons and P. Haden. Parson's programming puzzles: a fun and effective learning tool for first programming courses. In Proceedings of the 8th Australasian Conference on Computing Education, Volume 52, pages 157–163, 2006.
[20] K. Pollari-Malmi, J. Guerra, P. Brusilovsky, L. Malmi, and T. Sirkiä. On the value of using an interactive electronic textbook in an introductory programming course. In Proceedings of the 17th Koli Calling International Conference on Computing Education Research, pages 168–172, 2017.
[21] W. Wang, R. Zhi, A. Milliken, N. Lytle, and T. W. Price. Crescendo: Engaging students to self-paced programming practices. In Proceedings of the 51st ACM Technical Symposium on Computer Science Education, pages 859–865, 2020.
[22] N. Weinman, A. Fox, and M. A. Hearst. Improving instruction of programming patterns with faded Parsons problems. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, CHI '21, New York, NY, USA, 2021. Association for Computing Machinery.
[23] H. Yan, F. Lin, et al. Including learning analytics in the loop of self-paced online course learning design. International Journal of Artificial Intelligence in Education, pages 1–18, 2020.
[24] R. Zhi, M. Chi, T. Barnes, and T. W. Price. Evaluating the effectiveness of Parsons problems for block-based programming. In Proceedings of the 2019 ACM Conference on International Computing Education Research, pages 51–59, 2019.