Comparing Ebook Student Interactions With Test Scores: A Case Study Using CSAwesome

Hisamitsu Maeda, Barbara Ericson, Paramveer Dhillon
University of Michigan, Ann Arbor, MI, USA
{himaeda|barbarer|dhillonp}@umich.edu

Copyright ©2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

ABSTRACT
Interactive ebooks produce better learning gains than static ebooks, students prefer them, and more computing courses are using them. The number of computing ebooks on the open-source Runestone interactive ebook platform increased from one in 2011 to over 30 in 2020. Runestone currently serves over two million page views a week. It includes books for CS0, CS1, CS2, data science, and web development and supports coding in Java, Python, and C++. The ebooks include instructional material (text, images, and videos) and interactive practice problems with immediate feedback: multiple-choice, fill-in-the-blank, write code (active code), mixed-up code (Parsons), clickable code, and matching. User interaction with the ebooks is timestamped and logged. This information includes page views, video plays, video completions, Parsons problem moves, problem answers, and learner-written code. This fine-grained data may help us automatically identify struggling students. This paper reports on several analyses comparing student activities to the midterm score from one of the Runestone ebooks, CSAwesome. Specifically, we compared the major types of log file entries to the midterm score and also conducted an in-depth analysis of mixed-up code (Parsons) problem data.

Keywords
e-book, data mining, Parsons problem

1. INTRODUCTION
Research has shown that interactive ebooks produce better learning gains than static ebooks [20]. In addition, most students report that the interactive features help them learn and that they want to use interactive ebooks in future courses [15]. A 2013 working group predicted that traditional CS textbooks would be replaced by interactive ebooks [13].

Runestone is an open-source ebook platform that has grown from serving one interactive ebook [15] in 2011 to over 30 free ebooks in 2021. It supports several languages, including Python, Java, C++, and SQL [10]. During the 2020-21 academic year, Runestone had 69,400 registered users and served an average of over two million page views a week.

Runestone interactive ebooks log timestamped user interactions. (Researchers can request an anonymized log file from Brad Miller, the founder of Runestone.) For the analyses in this paper, we use the CSAwesome interactive ebook. CSAwesome has been endorsed by the College Board for the Advanced Placement (AP) Computer Science A (CSA) course [4]. Advanced Placement courses are taken by secondary students for college credit and/or placement. The AP CSA course is equivalent to a first course for majors (CS1) at the college level and covers object-oriented programming in Java. Our CSAwesome data is from custom courses: instructors can create a custom course for any of the free ebooks on Runestone and have their students register for that custom course.

Interactive ebooks provide rich data that could be leveraged to improve instruction [23]. We may be able to identify struggling students in order to provide help. Our research questions were: 1) What student activities correlate with scores on the midterm? and 2) What Parsons problem activities correlate with the midterm score? To answer these questions, we performed regression analyses at both a higher level, based on the major types of activities, and at a lower level, with a more in-depth analysis of Parsons problem data.

2. RELATED WORK
We are interested in the relationship between students' activities in the ebook and their pretest and midterm scores. In a related study, Pollari-Malmi et al. [20] found an increase in use, motivation, and learning gains from use of an interactive ebook versus an equivalent static ebook. Ericson et al. [6] found that teachers who used more of the interactive features in an ebook had higher gains in confidence and higher scores on the final posttest. Parker et al. [18] found that students and teachers used an interactive ebook differently, with teachers showing more characteristics of expert learners. Akçapınar et al. [1] used student reading behavior in an ebook to predict at-risk students. Park et al. [17] used statistical change detection techniques to detect changes in student behavior from clickstream data in a Learning Management System (LMS). They found that students who increased their reviewing activity relative to other students in the course had a higher probability of passing the course.

Several researchers [3, 24, 21, 22, 9, 19, 2] have been exploring Parsons problems as both an alternative to writing code from scratch and as a summative assessment. In Parsons problems, learners must place mixed-up blocks of code in the correct order. They may also have to correctly indent the blocks. Parsons problems can also have distractors, which are code blocks that are not needed in a correct solution. Helminen et al. [12] visualized students' problem-solving process on Parsons problems using a graphical representation. They detected several different approaches to solving Parsons problems, including top-down, control structures first, and trial and error. Some learners got stuck in circular loops and repeated the same incorrect solution. Ericson et al. [8] found that more learners attempted Parsons problems than nearby multiple-choice questions in an interactive ebook. Maharjan et al. [14] proposed an edit distance trail to show students' solution paths in Parsons problems in a simpler fashion than a graphical representation.
They also discussed potential issues with studying Parsons problems using descriptive statistics methods. Morrison et al. [16] found that subgoal labels help students solve Parsons problems. Du et al. [3] conducted a review of recent studies on Parsons problems.

Ericson invented two types of adaptation for Parsons problems [7, 5]. In intra-problem adaptation, the problem can be dynamically made easier by removing a distractor, providing the indentation, or combining two blocks into one; the adaptation is triggered by clicking a "Help" button. In inter-problem adaptation, the difficulty of the next problem is automatically modified based on the learner's performance on the last problem. The problem can be made easier by pairing distractor and correct code blocks or by removing distractors. It can be made more difficult by using all distractors and randomly mixing the distractor blocks in with the correct code blocks. Learners are nearly twice as likely to correctly solve adaptive Parsons problems as non-adaptive ones, and they report that solving Parsons problems helps them learn to fix and write code [5]. A randomized controlled study provided evidence that solving adaptive Parsons problems takes significantly less time than writing the equivalent code, with similar learning gains from pretest to posttest [7].

3. DATASET
The log file used in our analysis comes from the CSAwesome interactive ebook on the open-source Runestone platform. This book was revised by Beryl Hoffman of Elms College and the Mobile CSP project in 2019 for the 2019 AP CSA exam [4]. The log includes page views, video interaction, and the results from interactive practice problems: multiple-choice, write code (active code), and mixed-up code (Parsons problems).

An active code exercise is a traditional programming exercise in which students write, edit, and run code in the ebook. Many active code exercises have unit tests that students can use to verify that their code is correct. Students can also use a slider to view previous versions of the code. The "Show CodeLens" button on an active code exercise displays a program visualizer (CodeLens), a version of Python Tutor [11], which allows students to step through their code line by line and visualize the variables. A Parsons problem provides mixed-up code blocks that the learner must place in the correct order [19], as shown in Figure 1. The Parsons problems in this ebook were adaptive. While some adaptive systems use selection adaptation, in which the next problem is selected from a set of possible problems based on the learner's performance, this system modifies the difficulty of the current or next problem in the ebook based on the learner's performance.

[Figure 1: Example Parsons Problem with Paired Distractors]

The AP CSA exam includes 40 multiple-choice questions and four free-response questions in which the student must write Java code to solve a problem. The ebook is broken into 10 content-based units and five practice units. At the end of each content-based unit there are at least 10 multiple-choice questions, mixed-up code (Parsons) problems, and write code problems [4].

We analyzed five categories of log file entries: page views, video (play, pause, and completion), active code interaction (run, edit, slide, and unit test), Parsons problem block moves and answers, and answers to multiple-choice questions. Overall, the log file contained data from 1,893 students in 57 custom classes. Most of these were high school classes, but some were college classes. This paper analyzed a subset of the log data from 505 students who took both the pretest and the midterm in their course. The summary statistics of the data are shown in Table 1. Over 37% of the log file activities were interactions with active code. This included running the code, editing the code, sliding the history to view different versions of the code, and running unit tests.

We were also interested in the effect of class size on students' performance. For this analysis, we divided the students into two groups: classes with fewer than 30 students, which is the typical maximum class size in high school, and classes with more than 30 students. 224 students were in large classes, and 281 were in small classes.

| Event Type              | Count     | Percentage |
| active code             | 1,020,735 | 37.66%     |
| page view               | 651,136   | 24.02%     |
| Parsons move and answer | 524,206   | 19.34%     |
| multiple choice answer  | 261,321   | 9.64%      |
| video                   | 83,892    | 3.09%      |
| other                   | 169,363   | 6.25%      |
| Total                   | 2,710,653 | 100%       |

Table 1: Various types of events in our dataset.

4. METHODOLOGY
4.1 Regression analysis of student activities
In this section, we analyze the impact of student activities on student performance. First, we explore the data and check the relationship between different student activities. Figure 2 shows that the percentage correct on the midterm has a positive correlation with the percentage correct for each activity.

[Figure 2: Correlation structure between the different activities.]

Multiple-choice problems have the highest correlation with the midterm. This could be because the midterm is a set of 20 multiple-choice questions. On the other hand, there is a negative correlation between the midterm and some other activities (i.e., the number of times that CodeLens was used and the percent of videos that were completed).

Next, we fit a linear regression model to this data with the percentage of correct answers on the midterm exam as the dependent variable. The count and type of student activities were used as independent variables.
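As a concrete sketch of this regression setup, the snippet below builds synthetic, skewed activity counts, applies the log(x + 1) transform and standardization described in this section, adds a class-size dummy, and fits ordinary least squares. All column names and coefficient values are invented for illustration; this is not the authors' actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500  # students (matches the size of the analyzed subset)

# Hypothetical activity measures; real event-log counts are heavily skewed.
mc_correct = rng.exponential(scale=50, size=n)      # multiple-choice interactions
page_views = rng.exponential(scale=200, size=n)     # page views
large_class = (rng.random(n) < 0.45).astype(float)  # 1 = class with > 30 students

def log_std(x):
    """log(x + 1) transform, then standardize to mean 0 and sd 1."""
    lx = np.log1p(x)
    return (lx - lx.mean()) / lx.std()

mc_z, pv_z = log_std(mc_correct), log_std(page_views)

# Synthetic midterm percentage with known signs (positive activity effects,
# negative large-class effect), so the fit below should recover them.
midterm = 0.5 + 0.2 * mc_z + 0.05 * pv_z - 0.15 * large_class \
          + rng.normal(0, 0.05, n)

# Ordinary least squares (intercept first).
X = np.column_stack([np.ones(n), mc_z, pv_z, large_class])
beta, *_ = np.linalg.lstsq(X, midterm, rcond=None)
print(dict(zip(["intercept", "mc", "views", "large_class"], beta.round(3))))
```

In practice one would use a package such as statsmodels (or R's lm) to also obtain the p-values reported with the coefficients; the plain least-squares fit above only recovers the coefficient estimates.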
Since the activity variables have a skewed distribution, we log-transformed them as x → log(x + 1) and standardized them before running the regression analysis. Also, since there is a difference in test results based on class size (as can be seen in Figure 3), we added class size as a dummy variable in our regression model. The regression results are shown in Table 2.

[Figure 3: Results for the pretest and midterm exams.]

The percent correct on the midterm was negatively correlated with being in a larger class, perhaps because there is less one-on-one interaction with the instructor and, as a result, lower learning outcomes. Midterm results were also negatively correlated with the number of interactions with CodeLens and the number of videos completed. This may indicate that both of these activities were more likely to be used by struggling students. Midterm results were positively correlated with the percent correct on the pretest, the percentage correct on other multiple-choice questions, the number of page views, and the number of videos played. It is interesting that the midterm score is positively correlated with the number of videos played but negatively correlated with the number of videos completed. It could be that stronger students watch a video until they find what they need and then quit.

4.2 Regression Analysis of Parsons Problems
In this section, we conduct an in-depth analysis of Parsons problems. As described earlier, these Parsons problems used both intra-problem and inter-problem adaptation. In intra-problem adaptation, if a student submits at least three incorrect solutions, they are notified that they can use a "Help" button to make the problem easier. Each time the student clicks the "Help" button, the ebook will remove a distractor block from the solution, provide the indentation, or combine two blocks into one, hence providing an implicit hint.

First, we pre-process the log data to gather detailed information on the Parsons interactions. As shown in Figure 4, students have to move from a state where all the blocks are jumbled to the state in which all the blocks are placed correctly. The final state is the correct solution. We count each step taken by the students and the number of failures incurred until a student finds the correct ordering.
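The per-student counting just described can be sketched as follows. The event schema here is invented for illustration (the actual Runestone log format differs); the function derives the kinds of quantities used in this analysis: total block moves, failed submissions, help clicks, and the steps and time before the first help request.

```python
from dataclasses import dataclass

@dataclass
class Event:
    """One hypothetical Parsons log entry (schema invented for illustration)."""
    ts: float  # seconds since the student opened the problem
    kind: str  # "move", "incorrect", "correct", or "help"

def parsons_features(events):
    """Derive per-student counts for one Parsons problem attempt."""
    steps = failures = helps = 0
    steps_before_help = time_before_help = None
    for e in sorted(events, key=lambda e: e.ts):
        if e.kind == "move":
            steps += 1
        elif e.kind == "incorrect":
            failures += 1
        elif e.kind == "help":
            helps += 1
            if steps_before_help is None:  # record state at first help click
                steps_before_help, time_before_help = steps, e.ts
        elif e.kind == "correct":
            break  # solved: stop counting
    return {"steps": steps, "failures": failures, "helps": helps,
            "steps_before_help": steps_before_help,
            "time_before_help": time_before_help}

log = [Event(5, "move"), Event(9, "move"), Event(12, "incorrect"),
       Event(20, "help"), Event(31, "move"), Event(40, "correct")]
print(parsons_features(log))
# → {'steps': 3, 'failures': 1, 'helps': 1,
#    'steps_before_help': 2, 'time_before_help': 20}
```

Subtracting the minimum number of moves a problem requires from the `steps` count would give an "extra steps" style measure of wasted moves.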
We count the number of times a student got help (clicked the "Help" button). We also count the number of steps taken and the time elapsed until a student asked for help. In order to tease apart the effect of getting help, we add an interaction term with the "help flag" in our regression.

[Figure 4: The process of solving the Parsons problem]

The regression result is shown in Table 3. As can be seen from the result, being in a large class is negatively related to the midterm score, as we found in our previous regression analysis. We also found that the number of steps before a student got help from the software was positively correlated with the midterm test results. This could be because students who received help from the software after performing more correct steps were more motivated to succeed in the course in the first place. In other words, stronger learners could figure out more of the problem before they asked for help. On the other hand, the time taken to get support was negatively associated with the test score. This implies that students who took too long to get support scored poorly on the midterm. In addition, students who received more help did not score as well on the midterm.

| Variable                               | Coefficient       |
| Large Class (or not)                   | -0.362*** (0.00)  |
| Percentage correct for pretest         | 0.1045*** (0.009) |
| Percentage correct for multiple choice | 0.5366*** (0.000) |
| Number of active code interactions     | 0.0901* (0.09)    |
| Number of CodeLens interactions        | -0.1403*** (0.002)|
| Number of page views                   | 0.0931** (0.03)   |
| Number of videos played                | 0.158*** (0.005)  |
| Number of videos completed             | -0.08** (0.03)    |
| N                                      | 417               |
| R2                                     | 0.416             |
*** p < 0.01, ** p < 0.05, * p < 0.1

Table 2: Regression result. The outcome variable is the midterm score and the independent variables are the various student activities. Since the activity variables have a skewed distribution, we log-transformed them as x → log(x + 1) and standardized them before running the regression analysis. p-values are shown in parentheses.

| Variable                                                 | Coefficient        |
| Large Class (or not)                                     | -0.0862*** (0.000) |
| Parsons problem correct                                  | 0.0797* (0.09)     |
| Number of incorrect submissions                          | -0.0136*** (0.000) |
| Number of times help is used (when help is used)         | -0.0279** (0.03)   |
| Number of steps before getting help (when help is used)  | 0.413* (0.07)      |
| Elapsed time before getting help (when help is used)     | -0.3472* (0.08)    |
| N                                                        | 402                |
| R2                                                       | 0.141              |
*** p < 0.01, ** p < 0.05, * p < 0.1

Table 3: Regression analysis of Parsons problems. The dependent variable is the midterm score and the independent variables are the various student activities pertaining to Parsons problems. p-values are shown in parentheses.

Next, for a particular problem that the learners solved (drawing a sideways L with a turtle, in Unit 2, as shown in Figure 1), we also analyzed the number of block moves and the time elapsed before the learner found the correct answer. As discussed earlier, we counted the number of steps it took from the initial state to the correct order. We define "extra steps" as the number of code-block moves beyond the number required to reach a correct answer. For example, if a solution can be reached in 10 steps and a student took 12 steps, that would be two extra steps. Additionally, we measured the amount of time it took students to order the jumbled code blocks in the Parsons problem. Figure 5 shows the number of extra steps taken as a function of the time taken for students in both large and small classes.

[Figure 5: Relationship between extra steps and time]

Figure 6 compares the extra steps taken as a function of the class size and the midterm test score. As can be seen, there is a significant relationship between students with good midterm test scores and the number of extra steps taken (t-statistic 5.082, p < 0.001). On the other hand, there is no relationship between class size and the number of extra steps taken (t-statistic 1.151, p > 0.1).

[Figure 6: Comparison of extra steps by midterm score and class size. Groups: 1. small class and midterm score less than median, 2. small class and midterm score greater than median, 3. large class and midterm score less than median, and 4. large class and midterm score greater than median]

Figure 7 shows the result of comparing time by class size and midterm test score. Unlike the number of extra steps, there is no significant association between students with good midterm test scores and the time taken by the learners (t-statistic 1.7843, p < 0.07). Also, there is no relationship between class size and time (t-statistic 0.683, p > 0.1).

[Figure 7: Comparison of time by midterm score and class size. Groups: 1. small class and midterm score less than median, 2. small class and midterm score greater than median, 3. large class and midterm score less than median, and 4. large class and midterm score greater than median]

From this analysis, it appears that students who took fewer extra steps while solving this Parsons problem had better midterm scores. A similar trend is also observed in another problem from Unit 4, as shown in Figure 8. This could mean that taking extra steps and/or taking longer to solve a Parsons problem indicates that a student is struggling.

5. LIMITATIONS
The log file data was from a random selection of custom courses on the Runestone platform. We do not have any additional information about these courses, such as which items were assigned, final grades, or student demographics.
[Figure 8: Comparison of extra steps by midterm score and class size for another problem, in Unit 4 on nested loops]

In the summer of 2021, we will be receiving log file data for this same ebook from teachers who attended professional development with the Mobile CSP team. That data should allow for a more in-depth analysis.

6. CONCLUSION
In this paper, we performed several quantitative analyses of the clickstream data from student interaction with the CSAwesome ebook. We analyzed the relationship between students' activities in the ebook and their midterm scores. We found several positive and negative correlations. In a regression analysis, the most highly weighted variable with a positive correlation was the percentage correct on other multiple-choice questions, and the most highly weighted variable with a negative correlation was being in a large class.

We also analyzed learner interaction patterns on the mixed-up code (Parsons) problems. Specifically, we examined the impact of the number of steps taken, the time taken, and the frequency of help use on students' midterm scores. The results show a positive association between the number of correctly completed Parsons problems and the learners' midterm scores. Further, there was a negative association between the midterm scores and the time the learner took to get help on a Parsons problem, as well as a negative association between class size and the midterm score. A close look at two Parsons problems showed a negative correlation between the number of extra steps and the time taken to solve a Parsons problem and the midterm score.

While our analyses uncover subtle patterns in students' interactions with the CSAwesome ebook, it will be interesting to test the robustness of our findings and to see whether they generalize to other interactive ebook platforms or across different programming courses, e.g., in C++ or Python. If Parsons problems can help detect struggling students early in a course, it may be possible to intervene to improve student performance.

7. REFERENCES
[1] G. Akçapınar, M. N. Hasnine, R. Majumdar, B. Flanagan, and H. Ogata. Developing an early-warning system for spotting at-risk students by using ebook interaction logs. Smart Learning Environments, 6(1):4, 2019.
[2] P. Denny, A. Luxton-Reilly, and B. Simon. Evaluating a new exam question: Parsons problems. In Proceedings of the Fourth International Workshop on Computing Education Research, pages 113–124, 2008.
[3] Y. Du, A. Luxton-Reilly, and P. Denny. A review of research on Parsons problems. In Proceedings of the Twenty-Second Australasian Computing Education Conference, ACE '20, pages 195–202, New York, NY, USA, 2020. Association for Computing Machinery.
[4] B. Ericson, B. Hoffman, and J. Rosato. CSAwesome: AP CSA curriculum and professional development (practical report). In Proceedings of the 15th Workshop on Primary and Secondary Computing Education, WiPSCE '20, New York, NY, USA, 2020. Association for Computing Machinery.
[5] B. Ericson, A. McCall, and K. Cunningham. Investigating the affect and effect of adaptive Parsons problems. In Proceedings of the 19th Koli Calling International Conference on Computing Education Research, pages 1–10, 2019.
[6] B. Ericson, S. Moore, B. Morrison, and M. Guzdial. Usability and usage of interactive features in an online ebook for CS teachers. In Proceedings of the Workshop in Primary and Secondary Computing Education, pages 111–120, 2015.
[7] B. J. Ericson, J. D. Foley, and J. Rick. Evaluating the efficiency and effectiveness of adaptive Parsons problems. In Proceedings of the 2018 ACM Conference on International Computing Education Research, pages 60–68, 2018.
[8] B. J. Ericson, M. J. Guzdial, and B. B. Morrison. Analysis of interactive features designed to enhance learning in an ebook. In Proceedings of the Eleventh Annual International Conference on International Computing Education Research, pages 169–178, 2015.
[9] B. J. Ericson, L. E. Margulieux, and J. Rick. Solving Parsons problems versus fixing and writing code. In Proceedings of the 17th Koli Calling International Conference on Computing Education Research, pages 20–29, 2017.
[10] B. J. Ericson and B. N. Miller. Runestone: A platform for free, on-line, and interactive ebooks. In Proceedings of the 51st ACM Technical Symposium on Computer Science Education, pages 1012–1018, 2020.
[11] P. J. Guo. Online Python Tutor: embeddable web-based program visualization for CS education. In Proceedings of the 44th ACM Technical Symposium on Computer Science Education, pages 579–584, 2013.
[12] J. Helminen, P. Ihantola, V. Karavirta, and L. Malmi. How do students solve Parsons programming problems? An analysis of interaction traces. In Proceedings of the Ninth Annual International Conference on International Computing Education Research, ICER '12, pages 119–126, New York, NY, USA, 2012. Association for Computing Machinery.
[13] A. Korhonen, T. Naps, C. Boisvert, P. Crescenzi, V. Karavirta, L. Mannila, B. Miller, B. Morrison, S. H. Rodger, and C. A. Shaffer. Requirements and design strategies for open source interactive computer science ebooks. In Proceedings of the ITiCSE Working Group Reports Conference on Innovation and Technology in Computer Science Education, pages 53–72, 2013.
[14] S. Maharjan and A. Kumar. Using edit distance trails to analyze path solutions of Parsons puzzles. In EDM, 2020.
[15] B. N. Miller and D. L. Ranum. Beyond PDF and EPUB: toward an interactive textbook. In Proceedings of the 17th ACM Annual Conference on Innovation and Technology in Computer Science Education, pages 150–155, 2012.
[16] B. B. Morrison, L. E. Margulieux, B. Ericson, and M. Guzdial. Subgoals help students solve Parsons problems. In Proceedings of the 47th ACM Technical Symposium on Computing Science Education, pages 42–47, 2016.
[17] J. Park, K. Denaro, F. Rodriguez, P. Smyth, and M. Warschauer. Detecting changes in student behavior from clickstream data. In Proceedings of the Seventh International Learning Analytics & Knowledge Conference, LAK '17, pages 21–30, New York, NY, USA, 2017. Association for Computing Machinery.
[18] M. C. Parker, K. Rogers, B. J. Ericson, and M. Guzdial. Students and teachers use an online AP CS Principles ebook differently: Teacher behavior consistent with expert learners. In Proceedings of the 2017 ACM Conference on International Computing Education Research, pages 101–109, 2017.
[19] D. Parsons and P. Haden. Parson's programming puzzles: a fun and effective learning tool for first programming courses. In Proceedings of the 8th Australasian Conference on Computing Education, Volume 52, pages 157–163, 2006.
[20] K. Pollari-Malmi, J. Guerra, P. Brusilovsky, L. Malmi, and T. Sirkiä. On the value of using an interactive electronic textbook in an introductory programming course. In Proceedings of the 17th Koli Calling International Conference on Computing Education Research, pages 168–172, 2017.
[21] W. Wang, R. Zhi, A. Milliken, N. Lytle, and T. W. Price. Crescendo: Engaging students to self-paced programming practices. In Proceedings of the 51st ACM Technical Symposium on Computer Science Education, pages 859–865, 2020.
[22] N. Weinman, A. Fox, and M. A. Hearst. Improving instruction of programming patterns with faded Parsons problems. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, CHI '21, New York, NY, USA, 2021. Association for Computing Machinery.
[23] H. Yan, F. Lin, et al. Including learning analytics in the loop of self-paced online course learning design. International Journal of Artificial Intelligence in Education, pages 1–18, 2020.
[24] R. Zhi, M. Chi, T. Barnes, and T. W. Price. Evaluating the effectiveness of Parsons problems for block-based programming. In Proceedings of the 2019 ACM Conference on International Computing Education Research, pages 51–59, 2019.