<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Promoting Self-regulated Learning in Online Learning by Triggering Tailored Interventions</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>HaeJin Lee</string-name>
          <email>haejin2@illinois.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Suma Bhat</string-name>
          <email>spbhat2@illinois.edu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paul Hur</string-name>
          <email>khur4@illinois.edu</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nigel Bosch</string-name>
          <email>pnb@illinois.edu</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Illinois</institution>
          ,
          <addr-line>Urbana-Champaign</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Illinois</institution>
          ,
          <addr-line>Urbana-Champaign</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Illinois</institution>
          ,
          <addr-line>Urbana-Champaign</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Illinois</institution>
          ,
          <addr-line>Urbana-Champaign</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In online education, students are expected to be independent learners who can self-regulate and reflect on their activities during the learning process. However, not all students have self-regulated learning (SRL) skills, and students with weak SRL skills tend to underperform in distance learning environments. The aim of our pilot study was to promote self-regulated learning in online education by triggering tailored SRL interventions automatically. As a first step toward this goal, we constructed a quantitative research design in which 58 students 1) learned about introductory descriptive statistical concepts and 2) interacted with self-paced online learning software throughout the experiment. We used the participants' action log files as a dataset to extract generalizable features, including pretest grade, quiz grade, reading time, and posttest grade. Then, we trained a random forest regressor model to predict student outcome (posttest). The correlation between actual and predicted posttest score was r = .576, indicating promise for accurately predicting and intervening. In the next phase of this work, we will apply SHAP (SHapley Additive exPlanations) to personalize SRL interventions by recommending that each student review the single topic that most negatively contributes to predicted posttest grade.</p>
      </abstract>
      <kwd-group>
        <kwd>Self-regulated learning</kwd>
        <kwd>interventions</kwd>
        <kwd>machine learning explanations</kwd>
        <kwd>computer-based learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        Students with different online learning skills, academic
performance, and levels of technology experience may face
hardships in becoming autonomous learners who succeed
academically in the ever-growing number of online courses at
universities. Previous studies demonstrated that successful
academic achievement is strongly connected to the level of
students' self-regulatory abilities, in which students take
initiative in the learning process [
        <xref ref-type="bibr" rid="ref14 ref17 ref8">8, 17, 14</xref>
        ].
      </p>
      <p>Copyright ©2021 for this paper by its authors. Use permitted under
Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>
        Self-regulated learning (SRL) skills are techniques for
achieving academic success by regulating a student's own actions
and decisions during the learning process [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. Students with
SRL skills are able to manage their own plans and reflect on
learning progress throughout their learning.
      </p>
      <p>
        However, distance learning poses threats to student
success, especially for students without SRL skills, since they are
supposed to learn and complete assignments on their own.
Having SRL skills in e-learning environments is particularly
important for students to achieve academic goals, but SRL
ability is not an inherent skill that every student possesses [
        <xref ref-type="bibr" rid="ref3 ref9">9,
3</xref>
        ]. Students lacking self-regulation skills are prone to
underperform relative to peers who are able to direct their own learning
process [
        <xref ref-type="bibr" rid="ref21 ref22">22, 21</xref>
        ]. Therefore, supporting students lacking SRL
skills to develop into responsible and autonomous learners
in online learning is crucial.
      </p>
      <p>
        Our study responds to the need for SRL support by
promoting self-regulated learning in an online learning environment
via interventions automatically customized for each student.
Many studies have demonstrated that aspects of student
performance and experiences can be predicted by using students'
action log files [
        <xref ref-type="bibr" rid="ref1 ref11 ref15 ref16 ref4 ref5 ref6">15, 1, 16, 4, 6, 11, 5</xref>
        ].
      </p>
      <p>We take a step forward by using predicted student outcomes
to trigger personalized SRL interventions in online courses.
In our study, SRL interventions consist of suggesting that
students engage in specific SRL behaviors, such as
reviewing particular readings or quizzes that contribute to a lower
predicted posttest grade. We emphasize triggering tailored
interventions automatically for each student because doing so can help
a particular student at the right time with interventions
chosen by a machine learning model. The machine learning
model in this case is designed to predict students' outcomes
(specifically, posttest grade), while interventions are based
on model explainability methods intended to discover
SRL-related reasons why the model made a particular prediction.
Our study involves three conditions: the model training
condition, the treatment condition, and the placebo condition.
In this paper, however, we primarily discuss an
investigation of the model training condition, in which we collected
data from students' action log files to train the machine
learning model that will ultimately trigger SRL
interventions in the treatment condition. We propose a method
of 1) predicting student outcome (posttest grade) in online
learning using machine learning techniques, and 2)
triggering tailored SRL interventions for students by
applying SHAP (SHapley Additive exPlanations) analysis to the
predicted student outcome. We focus on
methodological step #1 in this paper, but discuss ongoing work
toward step #2. We evaluate aspects of this methodology in
a study where students learned about introductory
descriptive statistics concepts and interacted with a self-paced online
learning environment.</p>
      <p>In our pilot study, we attempted to answer the following
research questions:</p>
      <p>RQ1) How well do generalizable learning features (e.g.,
quiz/test grades) extracted from e-learning platforms
predict student outcomes in machine learning models?</p>
      <p>RQ2) Is it possible to prompt each student to review
specific topics from the learning module by triggering
SRL interventions at the right moment?</p>
    </sec>
    <sec id="sec-2">
      <title>2. RELATED WORK</title>
    </sec>
    <sec id="sec-3">
      <title>2.1 Self-regulated Learning (SRL)</title>
      <p>
        Previous studies have demonstrated the importance of
self-regulated learning (SRL) in academic contexts [
        <xref ref-type="bibr" rid="ref14 ref23">23, 14</xref>
        ].
In particular, researchers have focused on associations between
self-regulated learning and academic achievement. Xiao
et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], Yusuf [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], and Zimmerman [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] investigated
the reasons why students with self-regulation skills tend to
attain strong academic achievement in their studies.
Among the 14 different types of self-regulated learning strategies
that Zimmerman and Pons [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] identified in their research
on students' learning strategies, our study primarily focused
on reviewing records, which refers to student-initiated
efforts to review tests, notes, or textbooks in preparation for
further testing. Zimmerman and Pons collected data about
participants' SRL strategies by conducting a structured
interview and demonstrated that students from a high
achievement group used reviewing strategies more frequently than
lower achievers. The significance of developing SRL skills
for university students has been further shown by other
previous analyses [
        <xref ref-type="bibr" rid="ref12 ref8">12, 8</xref>
        ]. In an online learning environment,
the importance of self-monitoring skills only increases since
students are responsible for their own learning.
      </p>
    </sec>
    <sec id="sec-4">
      <title>2.2 Modeling SRL</title>
      <p>
        To analyze and measure students' self-regulatory
behaviors in distance learning, researchers have used students'
action log files in various ways. For instance,
Maldonado-Mahauad et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] applied a process mining technique to
detect self-regulated learning strategies and identified
clusters of learners in Massive Open Online Courses (MOOCs).
In another example, Segedy et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] applied coherence
analysis to interpret and characterize learners' behaviors in
open-ended computer-based learning environments to shed
light on students' SRL. In addition, there is a large body
of research indicating that predicting student performance
using event log files in e-learning is feasible [
        <xref ref-type="bibr" rid="ref12 ref2">12, 2</xref>
        ]. These
studies concentrated on detecting and
characterizing students' SRL behaviors in online learning.
      </p>
    </sec>
    <sec id="sec-5">
      <title>2.3 Intervening to Support SRL</title>
      <p>
        Our study's primary objective is to apply SHAP analysis to
extend current modeling methods so that they can support
SRL in online education. In this regard, Mu et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] took
a very similar approach in which they aimed to support
wheel-spinning students in computerized educational systems by
suggesting actionable interventions. Their study is especially
related to ours since they used SHAP to trigger
individualized interventions. Our study differs in that we focus
on SRL specifically, and will test the effectiveness of
interventions experimentally, which is critical because the act of
intervening based on a model's input features may then
affect the model (e.g., perhaps reducing its accuracy).
      </p>
    </sec>
    <sec id="sec-6">
      <title>3. METHOD</title>
      <p>Our overall study design consists of an experiment with
three conditions: the model training condition, the
treatment condition, and the placebo condition. In this paper,
we primarily focus on the model training condition, in which we
collected data for machine learning model training and
implemented SHAP analysis. Data collection for the other two conditions
(treatment and placebo) is currently under way,
with interventions driven by the model described here.
Participants in the treatment condition will receive
tailored SRL interventions based on machine learning
predictions, whereas students assigned to the placebo condition will
receive SRL interventions that are almost identical to those in the
treatment condition but are not based on machine
learning predictions. In all conditions, including the model
training condition we study in this paper, participants
engaged in learning about introductory statistical concepts by
using custom web-based online learning software (Figure 1).
Figure 2 illustrates an overview of our research
procedure.</p>
      <p>At the start of the study session, students completed a brief
survey regarding their demographics and prior academic
history. Following the survey, participants took a 10-minute
pretest and used the self-guided learning session for up to 1
hour. The self-paced online learning environment included
12 different illustrated statistical readings, and one quiz to
go with each reading (see partial screenshot in Figure 1).
Throughout the learning session, students were not required
to complete all the modules, and were allowed to complete
the components more than once if they wanted to. After
30 minutes had elapsed in the learning session, each student
received a simple baseline SRL intervention in which they were
told which topics they had not yet viewed, or given a list of
topics in order from least- to most-viewed if they had viewed
them all. Hence, in the model training condition, the
intervention was not based on machine learning. The message
prompt and the corresponding list of topics would, in
theory, make students aware of patterns in their learning
behavior up to that point in time, which could lead them to
reflect more deeply on their current learning trajectory.
Based on this information, we expected that students would
be able to make more informed decisions on how to
regulate their learning by adapting their future learning
behaviors. This would allow for more systematic and controllable
decision-making about which topics to visit
or review, and make salient the areas they may feel they have
not sufficiently studied. This intervention repeated every 10
minutes thereafter until the 1-hour learning session was over
or the student chose to end it before the hour. Subsequently,
students completed a 10-minute posttest with questions modeled
after the pretest questions (but not identical).</p>
    </sec>
    <sec id="sec-7">
      <title>3.1 Data Collection</title>
      <p>We collected data from 58 university students who
participated in our pilot study. Students were required to have
completed either zero or one college-level statistics course,
but not more, to avoid inappropriately matching
introductory material to expert students. Students' event log files
were extracted from the online learning system, which recorded
their learning activities in real time, including information
needed to provide interventions. These log files contained
activities that were recorded during the students'
interactions with every stage of the web-based online learning
software.</p>
    </sec>
    <sec id="sec-8">
      <title>3.2 Feature Extraction</title>
      <p>Following data collection, we extracted various features for
each student, including some attributes related to SRL
behaviors, to use as predictors for training the
machine learning model. Students' interactions were
distributed across various log files for each possible learning
activity, quiz and test scores, time spent reading, and other
files. For feature extraction, we merged all students' feature
outputs into a single table, which we later used for machine
learning data analysis. Extracted features consisted of:</p>
      <sec id="sec-8-1">
        <title>Prior to the self-guided learning session:</title>
        <list list-type="bullet">
          <list-item><p>Pretest grade (mean of multiple-choice question correctness)</p></list-item>
          <list-item><p>Time spent taking the pretest</p></list-item>
        </list>
      </sec>
      <sec id="sec-8-2">
        <title>During the self-guided learning session:</title>
        <list list-type="bullet">
          <list-item><p>Quiz grade for each of the 12 descriptive statistics quizzes</p></list-item>
          <list-item><p>Time spent reading each of the 12 descriptive statistics readings</p></list-item>
          <list-item><p>Number of times the student reviewed statistical readings/quizzes</p></list-item>
          <list-item><p>Number of events where the student clicked the button for going back to the Main Topics Menu</p></list-item>
          <list-item><p>Whether a student looked at other windows/tabs (i.e., the learning environment lost focus)</p></list-item>
          <list-item><p>Time spent completing the learning session</p></list-item>
        </list>
      </sec>
      <sec id="sec-8-3">
        <title>Following the self-guided learning session:</title>
        <list list-type="bullet">
          <list-item><p>Posttest grade</p></list-item>
          <list-item><p>Time taken for the posttest</p></list-item>
        </list>
        <p>Note that there were 12 versions of each quiz grade and
reading time feature, each extracted from one of the
readings or quizzes. Features following the learning session were
outcomes, rather than predictors; in particular, we focus on
posttest grade in this paper.</p>
      </sec>
    </sec>
    <sec id="sec-9">
      <title>3.3 Data Analysis</title>
      <p>In the extracted feature data, consisting of 58 instances
(1 per student), we found that some students did not
appear to participate in earnest. Five
students did not attempt any of the quizzes in
the learning session, and their posttest scores were lower
than their pretest scores. Nevertheless, we did not exclude
these observations from our dataset since these students may
reflect future treatment condition students and real-world
classroom students.</p>
      <p>We performed exploratory data analysis on the feature dataset
and discuss our findings in the Results (Section 4).</p>
    </sec>
    <sec id="sec-10">
      <title>3.4 Model Training</title>
      <p>
        Initially, we trained a decision tree regressor
to predict student performance (posttest score) from quiz
grades, reading times, and pretest score (in Python using
scikit-learn [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]). However, the decision tree model did not
yield stable R<sup>2</sup> values across 5-fold cross-validation. Because
our dataset was relatively small for training a
machine learning model, folds had only 11–12 instances each.
Small folds contributed to instability in results since a
decision tree might produce predictions for one fold with little
or no variance. We thus changed to random forest
regression, which randomly samples observations and features
to build a forest of trees with ample variation between trees,
and which eliminated the problem of invalid R<sup>2</sup> values.
      </p>
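      <p>The comparison described above can be sketched as follows. This is a minimal illustration rather than the study's actual code: the data are synthetic, and the fold shuffling, the number of trees (100), and the variable names are assumptions. The maximum depth of 4 and the fixed random seed follow the settings reported in Section 4.3.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the study data: 58 students, 25 predictors
# (1 pretest grade, 12 quiz grades, 12 reading times). Values are invented.
n_students, n_features = 58, 25
X = rng.uniform(0.0, 1.0, size=(n_students, n_features))
y = 100.0 * (0.6 * X[:, 0] + 0.4 * X[:, 1:13].mean(axis=1))
y = y + rng.normal(0.0, 5.0, n_students)


def cross_validated_r2(model_factory, X, y, n_splits=5, seed=0):
    """Fit a fresh model per fold and return the per-fold R^2 scores."""
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in folds.split(X):
        model = model_factory()
        model.fit(X[train_idx], y[train_idx])
        # For scikit-learn regressors, score() is R^2 on the held-out fold.
        scores.append(model.score(X[test_idx], y[test_idx]))
    return np.array(scores)


# With 58 instances and 5 folds, each held-out fold has 11 or 12 instances.
fold_sizes = [len(test_idx) for _, test_idx in KFold(n_splits=5).split(X)]

tree_r2 = cross_validated_r2(lambda: DecisionTreeRegressor(random_state=0), X, y)
forest_r2 = cross_validated_r2(
    lambda: RandomForestRegressor(n_estimators=100, max_depth=4, random_state=0),
    X,
    y,
)

print("held-out fold sizes:", fold_sizes)
print("decision tree R^2 per fold:", np.round(tree_r2, 3))
print("random forest R^2 per fold:", np.round(forest_r2, 3))
```
]]></preformat>
      <p>Comparing the spread of the two per-fold score arrays gives a rough picture of the stability issue; on synthetic data the gap need not match what was observed in the study.</p>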
    </sec>
    <sec id="sec-11">
      <title>3.5 SHAP (SHapley Additive exPlanations)</title>
      <p>Using posttest scores predicted by the random forest
regressor model, we implemented SHAP analysis to interpret
the model predictions. We calculated SHAP values using
the tree explainer to explain model predictions and will use
these explanations (in the next steps of the project) to
trigger individualized SRL interventions that meet each student's
needs. We sorted the calculated SHAP values in ascending order
to determine the specific features that contribute to a
lower posttest score.</p>
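      <p>The ranking step can be sketched as follows, assuming the SHAP values have already been computed (for example, with a tree explainer) and are available as one row per student. The function name, feature names, topic names, and values are hypothetical, and skipping features that do not map to a topic (such as pretest grade) is an assumption about how an intervention might be selected, not a description of the deployed system.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def topic_to_review(shap_row, feature_names, topic_of_feature):
    """Pick the topic whose feature lowers the predicted posttest grade most.

    shap_row: SHAP values for one student (one value per feature).
    topic_of_feature: maps a feature name to its topic; features missing from
    the mapping (e.g., pretest grade) are treated as not actionable.
    Returns None when no topic feature has a negative contribution.
    """
    order = np.argsort(shap_row)  # ascending: most negative contribution first
    for idx in order:
        if shap_row[idx] >= 0:
            break  # the remaining features do not lower the prediction
        name = feature_names[idx]
        if name in topic_of_feature:
            return topic_of_feature[name]
    return None


# Hypothetical feature names and SHAP values, invented for illustration.
feature_names = [
    "pretest_grade",
    "quiz_grade_mean",
    "quiz_grade_median",
    "reading_time_mean",
]
topic_of_feature = {
    "quiz_grade_mean": "Mean",
    "quiz_grade_median": "Median",
    "reading_time_mean": "Mean",
}
shap_values = np.array(
    [
        [-6.0, 1.5, -2.0, 0.5],  # pretest is most negative but not actionable
        [2.0, 0.3, 0.1, 0.4],  # nothing lowers the prediction
    ]
)
recommendations = [
    topic_to_review(row, feature_names, topic_of_feature) for row in shap_values
]
print(recommendations)
```
]]></preformat>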
    </sec>
    <sec id="sec-12">
      <title>4. RESULTS</title>
      <p>In this section, we present findings on each stage of our
research process in detail.</p>
    </sec>
    <sec id="sec-13">
      <title>4.1 Data Analysis</title>
      <p>We calculated Pearson correlations as a first step of data
analysis to measure the strength of the linear
relationship between posttest grade (target variable) and other
potential predictor variables. We expected that clear
relationships would be needed for the machine learning model
to succeed given the small dataset size. Among the
predictor variables, pretest grade had the highest correlation
(r = .530), indicating at least one sizable (if unsurprising)
relationship in the data. From this we found that
students' initial knowledge (as evident from pretest score) was
closely related to posttest grade.</p>
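      <p>As a minimal sketch of this screening step, the snippet below computes Pearson's r between each candidate predictor and the target and ranks the predictors by absolute correlation. The data and feature names are toy values invented for illustration, not the study's data.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation between two equal-length 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_centered = x - x.mean()
    y_centered = y - y.mean()
    denom = np.sqrt((x_centered ** 2).sum() * (y_centered ** 2).sum())
    if denom == 0.0:
        return float("nan")  # undefined when either variable is constant
    return float((x_centered * y_centered).sum() / denom)


def rank_predictors(features, target):
    """Return (name, r) pairs sorted from largest to smallest |r|."""
    scored = [(name, pearson_r(values, target)) for name, values in features.items()]
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)


# Toy data, invented for illustration only.
posttest = [55.0, 70.0, 65.0, 90.0, 40.0, 80.0]
features = {
    "pretest_grade": [50.0, 60.0, 70.0, 85.0, 35.0, 75.0],
    "quiz_grade_topic_1": [0.5, 1.0, 0.5, 1.0, 0.0, 0.5],
    "reading_time_topic_1": [120.0, 90.0, 200.0, 150.0, 60.0, 180.0],
}
ranking = rank_predictors(features, posttest)
for name, r in ranking:
    print(f"{name}: r = {r:.3f}")
```
]]></preformat>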
      <p>Beyond pretest grade, quiz grade features and reading time features had
promising positive correlation coefficients (up to r = .395).
Though most of these features were not statistically significantly
related to posttest score given the large number of
predictors and small dataset size, the trends indicated that they
were promising indicators for the success of machine
learning methods. Since our goal was to suggest specific
reading/quiz topics that students should review, we included 12
quiz grades and 12 reading time features as predictors,
along with pretest grade. Some features related to SRL had
correlation coefficients trending in the expected direction, but
to keep our machine learning model simple, we
did not use them as predictors. Moreover, we wanted to
use highly generalizable feature types that could be easily
extracted from diverse online learning platforms.</p>
      <p>Within the chosen predictors, we checked whether pretest
grade and posttest grade were normally distributed by
plotting frequency histograms for both features. The histograms were relatively
bell-shaped and symmetric about the mean values, so we
concluded that pretest and posttest grades were approximately
normally distributed. This is important because it indicates the absence of ceiling or
floor effects that would distort the analysis of learning.</p>
    </sec>
    <sec id="sec-14">
      <title>4.2 Extracted Features</title>
      <p>
        Initially, we extracted 11 additional features from students'
log files, but we used only pretest grade, quiz grade, and
reading time as features for predicting posttest grade. Since
we had a small sample of 58 observations, we had
to reduce the number of predictors to keep the model
simple. However, in future work with more participants we
plan to use more predictors, such as features related to
specific SRL constructs extracted via coherence analysis [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
Moreover, we can include both SRL-related and unrelated
features (e.g., pretest score in this study) and apply SHAP
to disentangle the effects of SRL features specifically, so as to
provide interventions only on those.
      </p>
      <p>In Figure 3, the left grey boxes represent the overall study
process that students went through in our experiment. On
the right, the figure shows the composition of a student's
recorded action log file as a whole, which is composed of
demographic survey, pretest, descriptive statistics surveys,
reading times, log, browser tab focus, and posttest files. The
diagram shows from which log files the predictors and the
target variable were extracted.</p>
      <p>Since students were not asked to complete all the
reading/quiz components from the learning session,
many participants skipped several readings or quizzes.
In these cases, we assigned -1 to the corresponding reading
time and quiz grade features to differentiate them from completed components.
Table 1 shows a statistical summary of the extracted features,
including minimum, maximum, and possible values that the
features can take.
      </p>
      <p>[Figure 3: Study process (demographic survey, pretest, self-paced online learning, posttest) and the log files from which the predictors (pretest grade, quiz grade, reading time) and the target variable (posttest grade) were extracted.]</p>
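      <p>The handling of skipped components can be sketched as follows. This is an illustrative assumption about how one student's logs might be flattened into a fixed-length feature row; the function and feature names are hypothetical, and only the -1 sentinel and the 1 + 12 + 12 predictor layout come from the text.</p>
      <preformat preformat-type="code"><![CDATA[
```python
SKIPPED = -1  # sentinel assigned to readings/quizzes a student never completed
N_TOPICS = 12


def build_feature_row(pretest_grade, quiz_grades, reading_times):
    """Flatten one student's logs into a fixed-length feature dict.

    quiz_grades and reading_times map a topic index (0 to 11) to a value and
    simply omit the topics the student skipped.
    """
    row = {"pretest_grade": pretest_grade}
    for topic in range(N_TOPICS):
        row[f"quiz_grade_{topic}"] = quiz_grades.get(topic, SKIPPED)
        row[f"reading_time_{topic}"] = reading_times.get(topic, SKIPPED)
    return row


# Hypothetical student who only completed topics 0 and 3.
row = build_feature_row(
    pretest_grade=62.5,
    quiz_grades={0: 1.0, 3: 0.5},
    reading_times={0: 95.0, 3: 140.0},
)
print(len(row))
print(row["quiz_grade_3"], row["quiz_grade_7"])
```
]]></preformat>
      <p>A sentinel outside the valid range works with tree-based models because a split can separate it from real values; other model families would likely need a different treatment of missing components.</p>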
    </sec>
    <sec id="sec-15">
      <title>4.3 Machine Learning Model Training</title>
      <p>We trained the random forest regressor model using pretest
grade, quiz grade, and reading time features to predict posttest
grade. We used 5-fold cross-validation, as
mentioned earlier, and measured R<sup>2</sup>, root mean squared
error (RMSE), and Pearson's r as model evaluation metrics.
During model training, we set the maximum depth of the
trees to 4 and fixed the random seed to produce
a consistent outcome. We chose a maximum depth of 4
because, in initial tests (with a partially collected dataset), the results
became notably more stable when
the depth of the trees was reduced.</p>
      <p>After training, we evaluated the model and obtained the
following results: 1) the mean R<sup>2</sup> value across 5-fold
cross-validation was .262, 2) the mean RMSE was 15.17 (on a 0–100
posttest grade scale), and 3) the mean Pearson's r was .576.
These results show that the mean R<sup>2</sup>
value, averaged across folds, was far from perfect. However,
given the small dataset, the trained model works
relatively well. Furthermore, the accuracy was stable across
folds: across the 5 folds, the standard deviation of R<sup>2</sup> was
.067 and the standard deviation of RMSE was 2.20.</p>
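      <p>For reference, the three metrics and their across-fold mean and standard deviation can be computed as in the sketch below. The per-fold predictions are synthetic, and the use of the population standard deviation (NumPy's default) is an assumption, since the text does not say which estimator was used.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def pearson_r(y_true, y_pred):
    """Pearson correlation between actual and predicted values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def summarize_folds(folds):
    """folds: list of (y_true, y_pred) array pairs, one per held-out fold.

    Returns {metric_name: (mean across folds, SD across folds)}.
    """
    rows = np.array([[r_squared(t, p), rmse(t, p), pearson_r(t, p)] for t, p in folds])
    names = ["r2", "rmse", "pearson_r"]
    return {
        name: (float(rows[:, i].mean()), float(rows[:, i].std()))
        for i, name in enumerate(names)
    }


# Synthetic per-fold predictions, invented for illustration.
rng = np.random.default_rng(1)
folds = []
for _ in range(5):
    y_true = rng.uniform(20.0, 100.0, size=12)
    y_pred = y_true + rng.normal(0.0, 10.0, size=12)
    folds.append((y_true, y_pred))

summary = summarize_folds(folds)
for name, (mean, sd) in summary.items():
    print(f"{name}: mean = {mean:.3f}, SD = {sd:.3f}")
```
]]></preformat>
      <p>Note that R<sup>2</sup> penalizes systematic offsets and scaling errors in the predictions while Pearson's r does not, which is one reason the two can differ on held-out folds.</p>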
    </sec>
    <sec id="sec-16">
      <title>5. DISCUSSION</title>
      <p>The objective of our pilot study was to promote self-regulated
learning in online education by triggering individualized SRL
interventions using machine learning techniques and SHAP
values. To accomplish this, we began by extracting
learning-relevant features from 58 students' action log files, which
recorded students' learning traces during the online
learning process. Using three types of predictors (pretest grade,
quiz grade, and reading time), we trained a random
forest regressor model to predict student outcome (posttest
grade). Using the predicted student outcome, we applied
the SHAP technique to personalize SRL interventions
by recommending that each student review the single
learning topic that contributes most to a lower predicted
outcome. As a result, we developed a flexible way of
triggering individualized SRL interventions in a digital learning
environment.</p>
    </sec>
    <sec id="sec-17">
      <title>5.1 Significance</title>
      <p>
        Our method is noteworthy in several respects. First,
it can be used to help
students who lack SRL skills or technology experience, or who have
weak academic performance, to develop SRL skills in e-learning
environments. Especially with the unexpected outbreak of
the global COVID-19 pandemic, a large number of students
are using online learning platforms as a way of learning and
assessing learning outcomes. The need for methods to help
students learn effectively online is thus increasing rapidly;
we hope our proposed method can contribute to the
development of technology for helping students in online education.
Second, our approach is original in that we integrated
a complex machine learning technique (i.e., a non-linear
regression model) with SHAP, a recent machine learning
interpretability method, to decide how to intervene.
Previously, this idea had only been explored on archival data [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ],
and not for SRL interventions in particular.
      </p>
    </sec>
    <sec id="sec-18">
      <title>5.2 Implications</title>
      <p>The presented method can be useful in several respects.
Students can receive personalized SRL interventions, which can
help them to improve their outcomes and develop SRL skills
in online learning. In particular, we can support specific
groups of students who need the most help in online
education. For instance, students with no prior experience using
e-learning platforms tend to be less experienced with
monitoring their learning activities. This group of students can
benefit from the implementation of our method of triggering
tailored SRL interventions since they would be exposed to
suggestions that encourage them to engage in specific SRL
activities.</p>
      <p>Our method is highly flexible since we required only 58
students as a training sample (which is sufficient here, but could
be expanded) and 3 types of predictors extracted from event
log files for training the model and administering SRL
interventions. In particular, the three predictor types are
generalizable features (pretest grade, quiz grade, and reading
time) that can be easily extracted from many online
learning platforms. Moreover, most e-learning platforms store
students' actions in log files, so we expect that it is feasible
to introduce our SRL intervention mechanism into online
learning platforms. Thus, we expect this method to be
applicable in a variety of computer-based learning contexts.</p>
      <p>Our method will also allow researchers to explore the causal
nature of educational interventions driven by machine
learning models. For example, are students who spend more time
on a specific topic doing well because of time spent on that
topic (as implied by a causal interpretation of the model),
or do students who do well also happen to spend time on
that particular topic because of some unobserved trait? Our
explorations of interventions based on the features will
allow us to manipulate the inputs of models and explore the
nature of these connections.</p>
    </sec>
    <sec id="sec-19">
      <title>5.3 Limitations</title>
      <p>Even though model accuracy was fairly stable within
our sample (N = 58), it might be
improved substantially with more data. The current training
data size limits the feasibility of extracting a large
number of specialized features that might only apply to a small
fraction of students. Moreover, there may be complex
interactions between students' learning behaviors on different
topics that require additional data to uncover. Likewise, if
our model is applied to online courses that students take
for no credit, then the effectiveness of
our method might be weak since students may not be
motivated to study in the same way as students taking actual
courses for credit. Further data is needed to explore these
effects.</p>
      <p>Collecting additional data would also afford the opportunity
to explore new types of features that could address some of
the unexplained variance in our model. For example, there
are many features unrelated to SRL, such as prior
experience level, perceptions of statistics, and others that might
improve model accuracy. We hope to address these gaps in
the model in future work. Notably, the model explanation
method used here will allow us to still provide interventions
based on SRL features even with other features included.
Improved model accuracy might be most helpful when
determining when to provide interventions and to whom,
unlike our current approach in which every student receives an
intervention at predetermined points in time.</p>
    </sec>
    <sec id="sec-20">
      <title>6. CONCLUSION</title>
      <p>Notwithstanding the aforementioned limitations, our pilot
study made an effort to shed light on the matter of students
struggling in online courses by integrating machine
learning and SHAP to promote self-regulated learning in
digital learning. In the near future, we hope students in the
treatment condition produce better learning behaviors and
academic outcomes when our proposed method is applied.</p>
    </sec>
    <sec id="sec-21">
      <title>7. ACKNOWLEDGEMENTS</title>
      <p>This research was supported by a grant from the Technology
Innovation in Educational Research and Design (TIER-ED)
initiative at the University of Illinois Urbana-Champaign.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>Baker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ocumpaugh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Gowda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Kamarainen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Metcalf</surname>
          </string-name>
          .
          <article-title>Extending log-based affect detection to a multi-user virtual environment for science</article-title>
          .
          <source>In 22nd Conference on User Modeling, Adaptation and Personalization (UMAP</source>
          <year>2014</year>
          ), pages
          <fpage>290</fpage>
          –
          <fpage>300</fpage>
          . Springer,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>F.</given-names>
            <surname>Dalipi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Imran</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Z.</given-names>
            <surname>Kastrati</surname>
          </string-name>
          .
          <article-title>MOOC dropout prediction using machine learning techniques: Review and research challenges</article-title>
          .
          <source>In Proceedings of the 2018 IEEE Global Engineering Education Conference (EDUCON)</source>
          , pages
          <fpage>1007</fpage>
          –
          <fpage>1014</fpage>
          . IEEE, May
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E. S.</given-names>
            <surname>Ghatala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Levin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pressley</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Goodwin</surname>
          </string-name>
          .
          <article-title>A componential analysis of the effects of derived and supplied strategy-utility information on children's strategy selections</article-title>
          .
          <source>Journal of Experimental Child Psychology</source>
          ,
          <volume>41</volume>
          (
          <issue>1</issue>
          ):
          <volume>76</volume>
          –
          <fpage>92</fpage>
          ,
          <string-name>
            <surname>Feb</surname>
          </string-name>
          .
          <year>1986</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Gobert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Pedro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Raziuddin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Baker</surname>
          </string-name>
          .
          <article-title>From log files to assessment metrics: Measuring students' science inquiry skills using educational data mining</article-title>
          .
          <source>Journal of the Learning Sciences</source>
          ,
          <volume>22</volume>
          (
          <issue>4</issue>
          ):
          <volume>521</volume>
          –
          <fpage>563</fpage>
          ,
          <string-name>
            <surname>Oct</surname>
          </string-name>
          .
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>P.</given-names>
            <surname>Hur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Bosch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Paquette</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Mercier</surname>
          </string-name>
          .
          <article-title>Harbingers of collaboration? The role of early-class behaviors in predicting collaborative problem solving</article-title>
          .
          <source>In Proceedings of the 13th International Conference on Educational Data Mining (EDM</source>
          <year>2020</year>
          ), pages
          <fpage>104</fpage>
          –
          <fpage>114</fpage>
          .
          <string-name>
            <surname>International Educational Data Mining Society</surname>
          </string-name>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Bosch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Baker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Paquette</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ocumpaugh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M. A. L.</given-names>
            <surname>Andres</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. L.</given-names>
            <surname>Moore</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Biswas</surname>
          </string-name>
          .
          <article-title>Expert feature-engineering vs. deep neural networks: Which is better for sensor-free affect detection?</article-title>
          In C. P. Rosé,
          <string-name>
            <given-names>R.</given-names>
            <surname>Martínez-Maldonado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. U.</given-names>
            <surname>Hoppe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Luckin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mavrikis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Porayska-Pomsta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>McLaren</surname>
          </string-name>
          , and B. du Boulay, editors,
          <source>Proceedings of the 19th International Conference on Artificial Intelligence in Education (AIED</source>
          <year>2018</year>
          ), pages
          <fpage>198</fpage>
          –
          <fpage>211</fpage>
          ,
          <string-name>
            <surname>Cham</surname>
            ,
            <given-names>CH</given-names>
          </string-name>
          ,
          <year>2018</year>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Maldonado-Mahauad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pérez-Sanagustín</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. F.</given-names>
            <surname>Kizilcec</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Morales</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Munoz-Gama</surname>
          </string-name>
          .
          <article-title>Mining theory-based patterns from Big data: Identifying self-regulated learning strategies in Massive Open Online Courses</article-title>
          .
          <source>Computers in Human Behavior</source>
          ,
          <volume>80</volume>
          :
          <fpage>179</fpage>
          –
          <fpage>196</fpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>C.</given-names>
            <surname>Mega</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Ronconi</surname>
          </string-name>
          , and R. De Beni.
          <article-title>What makes a good student? How emotions, self-regulated learning, and motivation contribute to academic achievement</article-title>
          .
          <source>Journal of Educational Psychology</source>
          ,
          <volume>106</volume>
          (
          <issue>1</issue>
          ):
          <volume>121</volume>
          –
          <fpage>131</fpage>
          ,
          <string-name>
            <surname>Feb</surname>
          </string-name>
          .
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>E. D.</given-names>
            <surname>Moynahan</surname>
          </string-name>
          .
          <article-title>Assessment and selection of paired associate strategies: A developmental study</article-title>
          .
          <source>Journal of Experimental Child Psychology</source>
          ,
          <volume>26</volume>
          (
          <issue>2</issue>
          ):
          <volume>257</volume>
          –
          <fpage>266</fpage>
          ,
          <string-name>
            <surname>Oct</surname>
          </string-name>
          .
          <year>1978</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jetten</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Brunskill</surname>
          </string-name>
          .
          <article-title>Towards suggesting actionable interventions for wheel-spinning students</article-title>
          .
          <source>In Proceedings of the 13th International Conference on Educational Data Mining (EDM</source>
          <year>2020</year>
          ), pages
          <fpage>183</fpage>
          –
          <fpage>193</fpage>
          .
          <string-name>
            <surname>International Educational Data Mining Society</surname>
          </string-name>
          ,
          <year>July 2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>L.</given-names>
            <surname>Paquette</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rowe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Baker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lester</surname>
          </string-name>
          , J. DeFalco, K. Brawner,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sottilare</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Georgoulas</surname>
          </string-name>
          .
          <article-title>Sensor-free or sensor-full: A comparison of data modalities in multi-channel affect detection</article-title>
          .
          <source>In Proceedings of the 8th International Conference on Educational Data Mining (EDM</source>
          <year>2015</year>
          ), pages
          <fpage>93</fpage>
          –
          <fpage>100</fpage>
          .
          <string-name>
            <surname>International Educational Data Mining Society</surname>
          </string-name>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Pardo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Han</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Ellis</surname>
          </string-name>
          .
          <article-title>Combining University Student Self-Regulated Learning Indicators and Engagement with Online Learning Events to Predict Academic Performance</article-title>
          .
          <source>IEEE Transactions on Learning Technologies</source>
          ,
          <volume>10</volume>
          (
          <issue>1</issue>
          ):
          <volume>82</volume>
          –
          <fpage>92</fpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>F.</given-names>
            <surname>Pedregosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Varoquaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gramfort</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Michel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Thirion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Grisel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Blondel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Prettenhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dubourg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vanderplas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Passos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cournapeau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Perrot</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Duchesnay</surname>
          </string-name>
          .
          <article-title>Scikit-learn: Machine learning in Python</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          ,
          <volume>12</volume>
          :
          <fpage>2825</fpage>
          –
          <fpage>2830</fpage>
          ,
          <string-name>
            <surname>Nov</surname>
          </string-name>
          .
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>P. R.</given-names>
            <surname>Pintrich</surname>
          </string-name>
          .
          <article-title>Understanding self-regulated learning</article-title>
          .
          <source>New Directions for Teaching and Learning</source>
          ,
          <volume>1995</volume>
          (
          <issue>63</issue>
          ):
          <volume>3</volume>
          –
          <fpage>12</fpage>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Segedy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Kinnebrew</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Biswas</surname>
          </string-name>
          .
          <article-title>Using coherence analysis to characterize self-regulated learning behaviours in open-ended learning environments</article-title>
          .
          <source>Journal of Learning Analytics</source>
          ,
          <volume>2</volume>
          (
          <issue>1</issue>
          ):
          <volume>13</volume>
          –
          <fpage>48</fpage>
          , May
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M.</given-names>
            <surname>Taub</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. V.</given-names>
            <surname>Mudrick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Azevedo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. C.</given-names>
            <surname>Millar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rowe</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Lester</surname>
          </string-name>
          .
          <article-title>Using multi-channel data with multi-level modeling to assess in-game performance during gameplay with Crystal Island</article-title>
          .
          <source>Computers in Human Behavior</source>
          ,
          <volume>76</volume>
          :
          <fpage>641</fpage>
          –
          <fpage>655</fpage>
          ,
          <string-name>
            <surname>Nov</surname>
          </string-name>
          .
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Yao</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Wang</surname>
          </string-name>
          .
          <article-title>The relationships of self-regulated learning and academic achievement in university students</article-title>
          . volume
          <volume>60</volume>
          , 01003,
          <string-name>
            <surname>Jan</surname>
          </string-name>
          .
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Yusuf</surname>
          </string-name>
          .
          <article-title>The impact of self-efficacy, achievement motivation, and self-regulated learning strategies on students' academic achievement</article-title>
          .
          <source>Procedia-Social and Behavioral Sciences</source>
          ,
          <volume>15</volume>
          :
          <fpage>2623</fpage>
          –
          <fpage>2626</fpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>B. J.</given-names>
            <surname>Zimmerman</surname>
          </string-name>
          .
          <article-title>Self-regulated learning and academic achievement: An overview</article-title>
          .
          <source>Educational Psychologist</source>
          ,
          <volume>25</volume>
          (
          <issue>1</issue>
          ):3–
          <fpage>17</fpage>
          ,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>B. J.</given-names>
            <surname>Zimmerman</surname>
          </string-name>
          .
          <article-title>Becoming a self-regulated learner: An overview</article-title>
          .
          <source>Theory Into Practice</source>
          ,
          <volume>41</volume>
          (
          <issue>2</issue>
          ):
          <volume>64</volume>
          –
          <fpage>70</fpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>B. J.</given-names>
            <surname>Zimmerman</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Martinez-Pons</surname>
          </string-name>
          .
          <article-title>Construct validation of a strategy model of student self-regulated learning</article-title>
          .
          <source>Journal of Educational Psychology</source>
          ,
          <volume>80</volume>
          (
          <issue>3</issue>
          ):
          <volume>284</volume>
          –
          <fpage>290</fpage>
          ,
          <year>1988</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>B. J.</given-names>
            <surname>Zimmerman</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Pons</surname>
          </string-name>
          .
          <article-title>Development of a structured interview for assessing student use of self-regulated learning strategies</article-title>
          .
          <source>American Educational Research Journal</source>
          ,
          <volume>23</volume>
          (
          <issue>4</issue>
          ):
          <volume>614</volume>
          –
          <fpage>628</fpage>
          ,
          <string-name>
            <surname>Jan</surname>
          </string-name>
          .
          <year>1986</year>
          . Publisher: American Educational Research Association.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>B. J.</given-names>
            <surname>Zimmerman</surname>
          </string-name>
          and
          <string-name>
            <given-names>D. H.</given-names>
            <surname>Schunk</surname>
          </string-name>
          .
          <article-title>Handbook of self-regulation of learning and performance. Educational psychology handbook series</article-title>
          . Routledge/Taylor &amp; Francis Group,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>