CEUR-WS Vol-3051, paper UGR_4: https://ceur-ws.org/Vol-3051/UGR_4.pdf (dblp: https://dblp.org/rec/conf/edm/LeeHBB21)
     Promoting Self-regulated Learning in Online Learning by
                Triggering Tailored Interventions

HaeJin Lee, University of Illinois Urbana–Champaign, haejin2@illinois.edu
Paul Hur, University of Illinois Urbana–Champaign, khur4@illinois.edu
Suma Bhat, University of Illinois Urbana–Champaign, spbhat2@illinois.edu
Nigel Bosch, University of Illinois Urbana–Champaign, pnb@illinois.edu

ABSTRACT
In online education, students are expected to be independent learners who can self-regulate and reflect on their activities during the learning process. However, not all students have self-regulated learning (SRL) skills, and students with weak SRL skills tend to underperform in distance learning environments. The aim of our pilot study was to promote self-regulated learning in online education by triggering tailored SRL interventions automatically. As a first step toward this goal, we constructed a quantitative research design in which 58 students participated in 1) learning about introductory descriptive statistical concepts and 2) interacting with self-paced online learning software throughout the experiment. We used the participants' action log files as a dataset to extract generalizable features, including pretest grade, quiz grade, reading time, and posttest grade. We then trained a random forest regressor model to predict student outcome (posttest grade). The correlation between actual and predicted posttest scores was r = .576, indicating promise for accurately predicting and intervening. In the next phase of this work, we will apply SHAP (SHapley Additive exPlanations) to personalize SRL interventions by recommending that each student review the single topic that most negatively contributes to predicted posttest grade.

Keywords
Self-regulated learning, interventions, machine learning explanations, computer-based learning

Copyright ©2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1.    INTRODUCTION
Students with different online learning skills, academic performance, and levels of technology experience may face hardships in becoming autonomous learners who succeed academically in universities' ever-growing online learning courses. Previous studies have demonstrated that successful academic achievement is strongly connected to the level of students' self-regulative abilities, in which students take initiative in the learning process [8, 17, 14].

Self-regulated learning (SRL) skills are techniques for achieving academic success by regulating a student's own actions and decisions during the learning process [20]. Students with SRL skills are able to manage their own plans and reflect on learning progress throughout their learning.

However, distance learning poses threats to student success, especially for students without SRL skills, since they are expected to learn and complete assignments on their own. Having SRL skills in e-learning environments is particularly important for students to achieve academic goals, but SRL ability is not an inherent skill that every student possesses [9, 3]. Students lacking self-regulation skills are prone to underperforming peers who are able to direct their own learning process [22, 21]. Therefore, supporting students lacking SRL skills to develop into responsible and autonomous learners in online learning is crucial.

Our study responds to the need for SRL support by promoting self-regulated learning in an online learning environment via interventions automatically customized for each student. Many works have demonstrated that aspects of student performance and experiences can be predicted by using students' action log files [15, 1, 16, 4, 6, 11, 5].

We take a step forward by using predicted student outcomes to trigger personalized SRL interventions in online courses. In our study, SRL interventions consist of suggesting that students engage in specific SRL behaviors, such as reviewing particular readings or quizzes that contribute to a lower predicted posttest grade. We emphasize triggering tailored interventions automatically for each student since doing so can help a particular student at the right time with interventions chosen by a machine learning model. The machine learning model in this case is designed to predict students' outcomes (specifically, posttest grade), while interventions are based on model explainability methods intended to discover SRL-related reasons why the model made a particular prediction.

Our study involves three conditions: the model training condition, the treatment condition, and the placebo condition. However, in this paper, we primarily discuss an investigation with the model training condition, in which we collect the data from students' action log files to train the machine learning model that will ultimately trigger SRL interventions in the treatment condition. We propose a method of 1) predicting student outcome (posttest grade) in online learning using machine learning techniques, and 2) triggering tailored SRL interventions for students by implementing SHAP (SHapley Additive exPlanations) analysis for the predicted student outcome. We focus on methodological step #1 in this paper, but discuss ongoing work toward step #2. We evaluate aspects of this methodology in a study where students learned about introductory descriptive statistics concepts and interacted with a self-paced online learning environment.

In our pilot study, we attempted to answer the following research questions:

   • RQ1) How well do generalizable learning features (e.g., quiz/test grades) extracted from e-learning platforms predict student outcomes in machine learning models?

   • RQ2) Is it possible to suggest that each student review specific topics from the learning module by triggering SRL interventions at the right moment?

2.    RELATED WORK
2.1   Self-regulated Learning (SRL)
Previous studies have demonstrated the importance of self-regulated learning (SRL) in academic contexts [23, 14]. Particularly, researchers have focused on associations between self-regulated learning and academic achievement. Xiao et al. [17], Yusuf [18], and Zimmerman [19] investigated the reasons why students with self-regulation skills tend to accomplish strong academic achievement in their studies. Among the 14 different types of self-regulated learning strategies that Zimmerman and Pons [22] identified in their research on students' learning strategies, our study primarily focused on reviewing records, which indicates student-initiated endeavors to review tests, notes, or textbooks to prepare for further testing. Zimmerman and Pons collected data about participants' SRL strategies by conducting structured interviews and demonstrated how students from a high-achievement group used reviewing strategies more frequently than lower achievers. The significance of developing SRL skills for university students has been further shown by other previous analyses [12, 8]. In an online learning environment, the importance of self-monitoring skills only increases, since students are responsible for their own learning.

2.2   Modeling SRL
In order to analyze and measure students' self-regulatory behaviors in distance learning, researchers have used students' action log files in various ways. For instance, Maldonado-Mahauad et al. [7] implemented a process mining technique to detect self-regulated learning strategies and identified clusters of learners in Massive Open Online Courses (MOOCs). In another example, Segedy et al. [15] applied coherence analysis to interpret and characterize learners' behaviors in open-ended computer-based learning environments to shed light on students' SRL. In addition, there is a large body of research indicating that predicting student performance using event log files in e-learning is feasible [12, 2]. These studies concentrated on detecting and characterizing students' SRL behaviors in online learning.

2.3   Intervening to Support SRL
Our study's primary objective is to apply SHAP analysis to extend current modeling methods so that they can support SRL in online education. In this regard, Mu et al. [10] took a very similar approach in which they aimed to support wheel-spinning students in computerized educational systems by suggesting actionable interventions. Their study is especially related to ours since they used SHAP to trigger individualized interventions. Our study differs in that we focus on SRL specifically, and will test the effectiveness of interventions experimentally, which is critical because the act of intervening based on a model's input features may then affect the model (e.g., perhaps reducing its accuracy).

3.    METHOD
Our overall study design consists of an experiment with three conditions: the model training condition, the treatment condition, and the placebo condition. In this paper, we primarily focus on the model training condition, in which we collected data for machine learning model training and implemented SHAP analysis. The other two conditions (treatment and placebo) are currently collecting data with interventions from the model described here. Participants in the experimental condition group will receive tailored SRL interventions based on machine learning predictions, whereas students assigned to the placebo group will get SRL interventions almost identical to the ones from the experimental condition group, except not based on machine learning predictions. In all conditions, including the model training condition we study in this paper, participants engaged in learning about introductory statistical concepts by using custom web-based online learning software (Figure 1). Figure 2 below illustrates an overview of our research procedure.

At the start of the study session, students completed a brief survey regarding their demographics and prior academic history. Following the survey, participants took a 10-minute pretest and used the self-guided learning session for up to 1 hour. The self-paced online learning environment included 12 different illustrated statistical readings, and one quiz to go with each reading (see partial screenshot in Figure 1). Throughout the learning session, students were not required to complete all the modules, and were allowed to complete the components more than once if they wanted to. After 30 minutes elapsed during the learning session, each student received a simple baseline SRL intervention in which they were told which topics they had not yet viewed, or a list of topics in order from least- to most-viewed if they had viewed them all. Hence, in the model training condition, the intervention was not based on machine learning. The message prompt and the corresponding list of topics would, in theory, help make the student aware of patterns in their learning behavior up to that point in time, which could lead them to reflect more deeply about their current learning trajectory. Based on this information, we expected the student would be able to make more informed decisions on how to regu-
late their learning by adapting their future learning behaviors. This would allow for more systematic and controllable decision-making processes to determine which topics to visit or review, and make salient the areas they may feel they have not sufficiently studied. This intervention was repeated every 10 minutes thereafter until the 1-hour learning session was over or the student chose to end it before the hour. Subsequently, they completed a 10-minute posttest with questions modeled after the pretest questions (but not identical).

Figure 1: A portion of the topics menu from a self-paced learning session.

3.1   Data Collection
We collected data from 58 university students who participated in our pilot study. Students were required to have completed either zero or one college-level statistics course, but not more, to avoid inappropriately matching introductory material to expert students. Students' event log files were extracted from the online learning system, which recorded their learning activities in real time, including information needed to provide interventions. These log files contained activities that were recorded during the students' interactions with every stage of the web-based online learning software.

3.2   Feature Extraction
Following data collection, we extracted each student's various features, including some attributes related to SRL behaviors, in order to use them as predictors for training the machine learning model. Students' interactions were distributed across various log files for each possible learning activity, quiz and test scores, time spent reading, and other files. For feature extraction, we merged all students' feature outputs into a single table, which we later used for machine learning data analysis. Extracted features consisted of:

Prior to the self-guided learning session:

   • Pretest grade (mean of multiple-choice question correctness)
   • Time spent taking the pretest

During the self-guided learning session:

   • Quiz grade for each of the 12 descriptive statistics quizzes
   • Time spent reading each of the 12 descriptive statistics readings
   • Number of times the student reviewed statistical readings/quizzes
   • Number of events where the student clicked the button for going back to the Main Topics Menu
   • Whether a student looked at other windows/tabs (i.e., the learning environment lost focus)
   • Time spent completing the learning session

Following the self-guided learning session:

   • Posttest grade
   • Time taken for the posttest

Note that there were 12 versions of each quiz grade and reading time feature, each extracted from one of the readings or quizzes. Features following the learning session were outcomes, rather than predictors; in particular, we focus on posttest grade in this paper.

3.3    Data Analysis
From the extracted feature data, consisting of 58 instances (1 per student), we discovered that some of the students did not seem to try their best to participate in our study. Five students did not attempt any of the quizzes from the learning session, and their posttest scores were lower than their pretest scores. Nevertheless, we did not exclude these observations from our dataset, since these students may reflect future treatment condition students and real-world classroom students.

We performed exploratory data analysis on the feature dataset and will discuss our findings in the Results (Section 4).

3.4    Model Training
Initially, we trained our model using a decision tree regressor to predict student performance (posttest score) using quiz grades, reading times, and pretest score (in Python using Scikit-learn [13]). However, the decision tree model did not yield stable R² values across 5-fold cross-validation. Since we had a relatively small dataset for training a machine learning model, folds had, on average, 11–12 instances.
Figure 2: An overview of the research procedure.

Small folds contributed to instability in results, since a decision tree might produce predictions for one fold with little or no variance. We thus changed to random forest regression, which can randomly sample observations and features to build a forest of trees with ample variation between trees, which eliminated the problem of invalid R² values.

3.5    SHAP (SHapley Additive exPlanations)
Using posttest scores predicted by the random forest regressor model, we implemented SHAP analysis to interpret the model predictions. We calculated SHAP values using the tree explainer to explain model predictions and will use these explanations (in the next steps of the project) to trigger individualized SRL interventions to meet each student's needs. We sorted the calculated SHAP values in ascending order to determine the specific features that contribute to a lower posttest score.

4.    RESULTS
In this section, we present findings on each stage of our research process in detail.

4.1    Data Analysis
We calculated Pearson correlations as a first step of data analysis in order to measure the strength of the linear relationship between posttest grade (the target variable) and other potential predictor variables. We expected that clear relationships would be needed for the machine learning model to succeed given the small dataset size. Among the predictor variables, pretest grade had the highest correlation (r = .530), indicating at least one sizable, if unsurprising, relationship in the data. From this we discovered that students' initial knowledge (as evident from pretest score) was closely related to posttest grade.

However, quiz grade features and reading time features also had promising positive correlation coefficients (up to r = .395). Though most of these features were not statistically significantly related to posttest score, given the large number of predictors and small dataset size, trends indicated that these were promising indicators for the success of machine learning methods. Since our goal was to suggest specific reading/quiz topics that students should review, we included 12 quiz grades and 12 reading time features among our predictors, along with pretest grade. Some features related to SRL had correlation coefficients trending in the expected direction, but in order to make our machine learning model simple, we did not use them as predictors. Moreover, we wanted to use highly generalizable feature types that could be easily extracted from diverse online learning platforms.

Within the chosen predictors, we checked whether pretest grade and posttest grade were normally distributed. We plotted frequency histograms for the pretest grade and posttest grade features. The corresponding histograms were relatively bell-shaped and symmetric about the mean values, so we concluded that the pretest and posttest grade features are normally distributed. This is essential to avoiding ceiling or floor effects for analysis of learning.

4.2    Extracted Features
Initially, we extracted 11 additional features from students' log files, but we used only pretest grade, quiz grade, and reading time as features for predicting posttest grade. Since we had a small sample size of 58 observations, we had to reduce the number of predictors to keep the model simple. However, in future work with more participants, we plan to use more predictors, such as features related to specific SRL constructs extracted via coherence analysis [15]. Moreover, we can include both SRL-related and unrelated features (e.g., pretest score in this study) and apply SHAP to disentangle the effects of the SRL features specifically, to provide interventions only on those.

In Figure 3, the left grey boxes represent the overall study process that students went through in our experiment. On the right, the figure shows the composition of a student's recorded action log file as a whole, which is composed of demographic survey, pretest, descriptive statistics surveys, reading times, log, browser tab focus, and posttest files. The diagram shows from which log files the predictors and the target variable were extracted.

Since students were not asked to complete all the reading/quiz components from the learning session, many participants skipped several readings or quizzes. In these cases, we assigned -1 for the corresponding reading time and quiz grade features to differentiate these cases. Table 1 below shows a statistics summary of the extracted features, including the minimum, maximum, and possible values that the features can take.
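The skip handling described above (assigning -1 when a student never attempted a quiz or never opened a reading) can be sketched as follows. This is a minimal illustration, not the study's actual code; the topic indexing, field names, and helper function are hypothetical.

```python
# Sketch of building one student's feature row with -1 sentinels for
# skipped components. The -1 distinguishes "never attempted" from a
# legitimately low score of 0 or a short reading time.
N_TOPICS = 12

def build_feature_row(pretest_grade, quiz_grades, reading_times):
    """quiz_grades/reading_times map topic index -> observed value;
    topics the student skipped are simply absent from the dicts."""
    row = {"pretest_grade": pretest_grade}
    for topic in range(N_TOPICS):
        row[f"quiz_grade_{topic}"] = quiz_grades.get(topic, -1)
        row[f"reading_time_{topic}"] = reading_times.get(topic, -1)
    return row

# A student who only attempted topics 0 and 2:
row = build_feature_row(58.3, {0: 100, 2: 75}, {0: 4.5, 2: 12.0})
```

One row per student, stacked over all 58 participants, yields the single feature table used for model training.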
                     Figure 3: Overview of study stages (left) and data sources resulting from the study (right).
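As Figure 3 indicates, each student's predictors and target variable come from several separate log files (pretest, quiz grades, reading times, tab focus, posttest) that must be merged into a single feature table keyed by student. A minimal sketch of that merge step; the source and field names here are hypothetical stand-ins for the study's actual log files:

```python
# Merge per-source records (one dict of students per data source)
# into a single feature table keyed by student ID.
def merge_sources(*sources):
    table = {}
    for source in sources:
        for student_id, fields in source.items():
            # Later sources add columns for the same student.
            table.setdefault(student_id, {}).update(fields)
    return table

pretest = {"s01": {"pretest_grade": 58.3}}
quizzes = {"s01": {"quiz_grade_0": 100.0}}
focus   = {"s01": {"lost_focus": True}}
table = merge_sources(pretest, quizzes, focus)
```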




Table 1: Statistics of features. Note that it is possible for reading time to exceed the 60-minute session time if a student was not interacting (e.g., walked away from the computer).

                      Pretest Grade   Posttest Grade   Quiz Grade   Reading Time
  Min                           8.3               25           -1             -1
  Max                          91.7              100          100    358 minutes
  Possible Values           [0,100]         [-1,100]     [-1,100]         [-1,∞]
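Because the study evaluates with 5-fold cross-validation over only 58 students, each held-out fold contains just 11 or 12 instances, which helps explain the fold-to-fold instability discussed in this paper. A quick stdlib check of the fold sizes (a sketch, not the study's code):

```python
# Partition n_instances into n_folds cross-validation folds and
# inspect the held-out fold sizes; such small folds are why per-fold
# regression metrics can be unstable.
def fold_sizes(n_instances, n_folds):
    base, extra = divmod(n_instances, n_folds)
    # The first `extra` folds get one additional instance each.
    return [base + (1 if i < extra else 0) for i in range(n_folds)]

sizes = fold_sizes(58, 5)  # [12, 12, 12, 11, 11]
```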
4.3   Machine Learning Model Training
We trained the random forest regressor model using pretest grade, quiz grade, and reading time features to predict posttest grade. We used 5-fold cross-validation for validation, as mentioned earlier, and measured R², root mean squared error (RMSE), and Pearson's r as model evaluation metrics. During model training, we set the maximum depth of the trees to be 4 and fixed the random seed in order to produce a consistent outcome. We used 4 as the maximum depth of the trees because the results became notably more stable in initial tests (with a partially collected dataset) when we reduced the depth of the trees.

After training, we evaluated the model and obtained the following results: 1) mean R² value within 5-fold cross-validation was .262, 2) mean RMSE was 15.17 (on a 0–100 posttest grade scale), and 3) mean Pearson's r was .576. From these observations, it is noticeable that our mean R² value, averaged across folds, was far from perfect. However, given our small dataset, the trained model works relatively well. Furthermore, the accuracy was stable across folds: across the 5 folds, the standard deviation of R² was .067 and the standard deviation of RMSE was 2.20.

5.    DISCUSSION
The objective of our pilot study was to promote self-regulated learning in online education by triggering individualized SRL interventions using machine learning techniques and SHAP values. To accomplish this, we began by extracting learning-relevant features from 58 students' action log files, which recorded students' learning traces during the online learning process. Using three types of predictors (pretest grade, quiz grade, and reading time), we trained a random forest regressor model to predict student outcome (posttest grade). Using the predicted student outcome, we applied the SHAP technique to trigger personalized SRL interventions by recommending that each student review the single topic that contributes most to a lower predicted student outcome. As a result, we developed a flexible way of triggering individualized SRL interventions in a digital learning environment.

5.1   Significance
Our presented method is noteworthy in several respects. Firstly, our proposed technique can be used as a means to help students who lack SRL skills or technology experience, or who have weak academic performance, to develop SRL skills in e-learning environments. Especially with the unexpected outbreak of the global COVID-19 pandemic, a large number of students are using online learning platforms as a way of learning and assessing learning outcomes. The need for methods to help students learn effectively online is thus increasing rapidly; we hope our proposed method can contribute to the development of technology that helps students in online education.

Secondly, our approach is original in that we integrated complex machine learning techniques (i.e., a complex, non-linear

5.2    Implications
The presented method can be useful in some respects. Students can receive personalized SRL interventions, which can help them to improve their outcomes and develop SRL skills in online learning. In particular, we can support specific groups of students who need the most help in online education. For instance, students with no prior experience using e-learning platforms tend to be less experienced with monitoring their learning activities. This group of students can benefit from the implementation of our method of triggering tailored SRL interventions, since they would be exposed to suggestions that encourage them to engage in specific SRL activities.

Our method is highly flexible, since we required only 58 students as a training sample (which is sufficient here, but could be expanded) and 3 types of predictors extracted from event log files for training the model and administering SRL interventions. In particular, the three predictor types are generalizable features (pretest grade, quiz grade, and reading time) which can be easily extracted from many online learning platforms. Moreover, most e-learning platforms store students' actions in log files, so we expect that it is feasible to introduce our SRL intervention mechanism into online learning platforms. Thus, we expect this method to be applicable in a variety of computer-based learning contexts.

Our method will also allow researchers to explore the causal nature of educational interventions driven by machine learning models. For example, are students who spend more time on a specific topic doing well because of the time spent on that topic (as implied by a causal interpretation of the model), or do students who do well also happen to spend time on that particular topic because of some unobserved trait? Our explorations of interventions based on the features will allow us to manipulate the inputs of models and explore the nature of these connections.

5.3    Limitations
Even though we had fairly stable model accuracy within the sample size (N = 58), model accuracy might be improved substantially with more data. The current training data size limits the feasibility of extracting a large number of specialized features that might only apply to a small fraction of students. Moreover, there may be complex interactions between students' learning behaviors on different topics that require additional data to uncover. Likewise, if our model is applied in online courses where students take the course for no credit, then the expected effectiveness of our method might be weak, since students may not be motivated to study in the same way as students taking actual courses for credit. Further data is needed to explore these effects.

Collecting additional data would also afford the opportunity to explore new types of features that could address some of the unexplained variance in our model. For example, there
regression model) with SHAP as a recent machine learning           are many features unrelated to SRL, such as prior experi-
interpretability method to decide how to intervene. Previ-         ence level, perceptions of statistics, and others that might
ously, this idea had only been explored on archival data [10],     improve model accuracy. We hope to address these gaps in
and not for SRL interventions in particular.                       the model in future work. Notably, the model explanation
                                                                   method used here will allow us to still provide interventions
                                                                   based on SRL features even with other features included.
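For concreteness, the intervention-triggering pipeline described in this section can be sketched in Python. This is a minimal, dependency-free illustration rather than the authors’ implementation: a hand-written `predict()` stands in for the trained random forest regressor, and because there are only three predictor types, exact Shapley values are brute-forced over all feature coalitions instead of calling the `shap` library; all feature names and numbers are hypothetical.

```python
from itertools import combinations
from math import factorial

# Hypothetical per-student features, mirroring the paper's three predictor
# types: pretest grade, quiz grade, and reading time (in minutes).
FEATURES = ["pretest", "quiz", "reading_time"]

def predict(x):
    """Stand-in for the trained random forest regressor: maps the three
    predictors to a predicted posttest grade."""
    return 0.3 * x["pretest"] + 0.5 * x["quiz"] + 0.1 * min(x["reading_time"], 120)

def shapley_values(x, baseline):
    """Exact Shapley values for one student's prediction, brute-forced over
    all feature coalitions (2^3 = 8 here). Features absent from a coalition
    are replaced by a baseline value such as the training-set mean."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):
            for S in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_f = {g: x[g] if (g in S or g == f) else baseline[g] for g in FEATURES}
                without_f = {g: x[g] if g in S else baseline[g] for g in FEATURES}
                total += weight * (predict(with_f) - predict(without_f))
        phi[f] = total
    return phi

def intervention_target(x, baseline):
    """Trigger rule: recommend reviewing the predictor whose contribution
    lowers the predicted posttest grade the most (most negative value)."""
    phi = shapley_values(x, baseline)
    return min(phi, key=phi.get)

baseline = {"pretest": 70.0, "quiz": 75.0, "reading_time": 60.0}
student = {"pretest": 72.0, "quiz": 40.0, "reading_time": 65.0}
print(intervention_target(student, baseline))  # prints: quiz
```

Re-calling `predict()` with one input changed (for example, more reading time) performs the kind of model-input manipulation suggested for exploring the causal nature of these interventions, although only an experiment can establish causality.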
Improved model accuracy might be most helpful when determining when to provide interventions and to whom, unlike our current approach in which every student receives an intervention at predetermined points in time.

6. CONCLUSION
Notwithstanding the aforementioned limitations, our pilot study sought to shed light on the problem of students struggling in online courses by integrating machine learning and SHAP to promote self-regulated learning in digital environments. In the near future, we hope to find that students in the treatment condition show better learning behaviors and academic outcomes when our proposed method is applied.

7. ACKNOWLEDGEMENTS
This research was supported by a grant from the Technology Innovation in Educational Research and Design (TIER-ED) initiative at the University of Illinois Urbana–Champaign.

8. REFERENCES
[1] R. Baker, J. Ocumpaugh, S. M. Gowda, A. M. Kamarainen, and S. J. Metcalf. Extending log-based affect detection to a multi-user virtual environment for science. In 22nd Conference on User Modeling, Adaptation and Personalization (UMAP 2014), pages 290–300. Springer, 2014.
[2] F. Dalipi, A. S. Imran, and Z. Kastrati. MOOC dropout prediction using machine learning techniques: Review and research challenges. In Proceedings of the 2018 IEEE Global Engineering Education Conference (EDUCON), pages 1007–1014. IEEE, May 2018.
[3] E. S. Ghatala, J. R. Levin, M. Pressley, and D. Goodwin. A componential analysis of the effects of derived and supplied strategy-utility information on children’s strategy selections. Journal of Experimental Child Psychology, 41(1):76–92, Feb. 1986.
[4] J. D. Gobert, M. S. Pedro, J. Raziuddin, and R. S. Baker. From log files to assessment metrics: Measuring students’ science inquiry skills using educational data mining. Journal of the Learning Sciences, 22(4):521–563, Oct. 2013.
[5] P. Hur, N. Bosch, L. Paquette, and E. Mercier. Harbingers of collaboration? The role of early-class behaviors in predicting collaborative problem solving. In Proceedings of the 13th International Conference on Educational Data Mining (EDM 2020), pages 104–114. International Educational Data Mining Society, 2020.
[6] Y. Jiang, N. Bosch, R. S. Baker, L. Paquette, J. Ocumpaugh, J. M. A. L. Andres, A. L. Moore, and G. Biswas. Expert feature-engineering vs. deep neural networks: Which is better for sensor-free affect detection? In C. P. Rosé, R. Martínez-Maldonado, H. U. Hoppe, R. Luckin, M. Mavrikis, K. Porayska-Pomsta, B. McLaren, and B. du Boulay, editors, Proceedings of the 19th International Conference on Artificial Intelligence in Education (AIED 2018), pages 198–211, Cham, Switzerland, 2018. Springer.
[7] J. Maldonado-Mahauad, M. Pérez-Sanagustín, R. F. Kizilcec, N. Morales, and J. Munoz-Gama. Mining theory-based patterns from Big data: Identifying self-regulated learning strategies in Massive Open Online Courses. Computers in Human Behavior, 80:179–196, 2018.
[8] C. Mega, L. Ronconi, and R. De Beni. What makes a good student? How emotions, self-regulated learning, and motivation contribute to academic achievement. Journal of Educational Psychology, 106(1):121–131, Feb. 2014.
[9] E. D. Moynahan. Assessment and selection of paired associate strategies: A developmental study. Journal of Experimental Child Psychology, 26(2):257–266, Oct. 1978.
[10] T. Mu, A. Jetten, and E. Brunskill. Towards suggesting actionable interventions for wheel-spinning students. In Proceedings of the 13th International Conference on Educational Data Mining (EDM 2020), pages 183–193. International Educational Data Mining Society, July 2020.
[11] L. Paquette, J. Rowe, R. Baker, B. Mott, J. Lester, J. DeFalco, K. Brawner, R. Sottilare, and V. Georgoulas. Sensor-free or sensor-full: A comparison of data modalities in multi-channel affect detection. In Proceedings of the 8th International Conference on Educational Data Mining (EDM 2015), pages 93–100. International Educational Data Mining Society, 2015.
[12] A. Pardo, F. Han, and R. A. Ellis. Combining university student self-regulated learning indicators and engagement with online learning events to predict academic performance. IEEE Transactions on Learning Technologies, 10(1):82–92, 2017.
[13] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, Nov. 2011.
[14] P. R. Pintrich. Understanding self-regulated learning. New Directions for Teaching and Learning, 1995(63):3–12, 1995.
[15] J. R. Segedy, J. S. Kinnebrew, and G. Biswas. Using coherence analysis to characterize self-regulated learning behaviours in open-ended learning environments. Journal of Learning Analytics, 2(1):13–48, May 2015.
[16] M. Taub, N. V. Mudrick, R. Azevedo, G. C. Millar, J. Rowe, and J. Lester. Using multi-channel data with multi-level modeling to assess in-game performance during gameplay with Crystal Island. Computers in Human Behavior, 76:641–655, Nov. 2017.
[17] S. Xiao, K. Yao, and T. Wang. The relationships of self-regulated learning and academic achievement in university students. Volume 60, 01003, Jan. 2019.
[18] M. Yusuf. The impact of self-efficacy, achievement motivation, and self-regulated learning strategies on students’ academic achievement. Procedia-Social and Behavioral Sciences, 15:2623–2626, 2011.
[19] B. J. Zimmerman. Self-regulated learning and academic achievement: An overview. Educational Psychologist, 25(1):3–17, 1990.
[20] B. J. Zimmerman. Becoming a self-regulated learner:
     An overview. Theory Into Practice, 41(2):64–70, 2002.
[21] B. J. Zimmerman and M. Martinez-Pons. Construct
     validation of a strategy model of student self-regulated
     learning. Journal of Educational Psychology,
     80(3):284–290, 1988.
[22] B. J. Zimmerman and M. M. Pons. Development of a
     structured interview for assessing student use of
     self-regulated learning strategies. American
     Educational Research Journal, 23(4):614–628, Jan.
1986.
[23] B. J. Zimmerman and D. H. Schunk. Handbook of
     self-regulation of learning and performance.
     Educational psychology handbook series.
     Routledge/Taylor & Francis Group, 2011.