<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Comparison of Off the Shelf Data Mining Methodologies in Educational Game Analytics</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>David J. Gagnon</string-name>
          <email>david.gagnon@wisc.edu</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Erik Harpstead</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefan Slater</string-name>
          <email>slater.research@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Carnegie Mellon University</institution>
          ,
          <addr-line>Pittsburgh, PA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Pennsylvania</institution>
          ,
          <addr-line>Philadelphia, PA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Wisconsin-Madison</institution>
          ,
          <addr-line>Madison, WI</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper we compare the accuracy of nine common machine learning algorithms in predicting quitting and performance on knowledge assessment tests in the context of two middle school science learning games. The games being studied, the Crystal Cave and Wave Combinator, are both short-duration (played for an average of 25 and 28 minutes, respectively), web-based games designed for use in classroom contexts. We used samples of 1,254 and 5,308 anonymous internet players, respectively, collected during the fall of 2018. We recorded raw clickstream data and used feature engineering methods to calculate simple descriptive features such as average timings between events and the number and types of player moves. We then used these features to model players quitting the game at each level, as well as content knowledge measured by subsequent assessment. We found that logistic regression produced the best models overall and that model quality was influenced by specific game levels and assessment items. We conclude by discussing future work to improve the prediction of player quitting and player knowledge assessment.</p>
      </abstract>
      <kwd-group>
        <kwd>Feature engineering</kwd>
        <kwd>digital games</kwd>
        <kwd>videogames</kwd>
        <kwd>modeling</kwd>
        <kwd>prediction</kwd>
        <kwd>quitting</kwd>
        <kwd>assessment</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        Digital games are increasingly being used to support learning in
educational contexts across a wide variety of subjects, including
social studies [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], mathematics [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], physics [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], and history [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
Beyond content knowledge, games have also been used to support
the development of cognitive and noncognitive skills, such as
persistence and spatial reasoning [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. As video games see
increasing use in classroom contexts, the need to analyze the rich
interaction data that they produce for meaningful behavioral and
learning indicators from play becomes greater as well.
Educational data mining (EDM) is well-suited to the problem of
analyzing digital games which feature rich interaction data, and
methods common to EDM have been frequently deployed to
better understand data produced by digital games. For instance,
EDM techniques have been used to model quitting behavior
among students playing an educational physics simulation game
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], problem-solving in a game-based programming task [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], and
computational thinking skills in Zoombinis [
        <xref ref-type="bibr" rid="ref14 ref7">7, 14</xref>
        ].
      </p>
      <p>
        In this paper, we use EDM techniques to predict quitting behavior
and content knowledge within two middle school science games,
Crystal Cave and Wave Combinator [
        <xref ref-type="bibr" rid="ref1 ref11">1,11</xref>
        ]. We sought to model
these outcomes because of their relevance for the use of these
games in educational contexts. The identification of quitting
behavior affords game designers and educators the opportunity to
intervene with scaffolds or feedback that can help keep students
on-task and working productively [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Predicting and modeling
students’ content understanding gives game designers and
educators the opportunity to provide additional practice on
a given skill, or to correct specific
misconceptions that might exist about that content. While such
techniques have been employed in intelligent tutoring systems via
knowledge tracing and knowledge inference methods [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], the
open-ended structure of many educational games makes these
methods difficult to employ successfully.
      </p>
    </sec>
    <sec id="sec-2">
      <title>1.1 Games Being Studied</title>
      <p>The two games used in this study, Crystal Cave and Wave
Combinator, are available online for free public use and are short
duration experiences, played for an average of 25 and 28 minutes
respectively. They are primarily used in classroom contexts.
In Wave Combinator, players must manipulate the amplitude,
frequency and offset of a wave in order to match the shape of a
target wave (Figure 1). Once the player’s wave is within a certain
range of the target wave, they are allowed to continue to the next
level. At key points of the game, a multiple-choice question
appears on screen that assesses the vocabulary used in the game
(Figure 2). While these assessment items are presented as being
asked by in-game characters, they are not situated within a
broader narrative context, but were retrofitted into the game for
the sake of this research. This study will be examining play data
from the first 7 levels of the game and the 2 multiple choice items
that follow.</p>
      <p>In Crystal Cave, players assemble differently shaped molecules to
form crystals of varying stabilities (Figure 3). Stability is
determined by the density of the resultant molecular pattern as
well as the proper alignment of the positive and negative charges
on portions of the molecules. For each level, different thresholds
of stability for the players’ molecular design will result in
completing the level with 1 to 3 stars. Each level unlocks when
the player has achieved a certain number of stars, leading to a
semi-structured progression through the game that allows students
to repeat challenges to find optimal molecular arrangements or to
progress to new challenges. As with the Wave Combinator,
multiple choice quiz items are presented by game characters, but
without meaningful integration into the game context. These
questions appear after completing specific levels (Figure 4). This
study will be examining play data from the entire set of 7 levels of
the game and the 3 multiple choice items that are intermixed.</p>
      <p>These two games were chosen because they represent different
archetypes of educational games. Wave Combinator provides
players with controls that manipulate the outputs of a simple
simulation in real-time. Players have to construct meanings about
the purpose of each control in order to find a solution. Crystal
Cave is a more constructive task that delays feedback for several
moves and requires players to apply simple chemistry rules to
develop reasonable strategies. Our goal in looking at two different
games was to explore the degree to which similar minimal feature
engineering approaches would perform across differing game
structures and design attributes.</p>
    </sec>
    <sec id="sec-3">
      <title>2. METHODS</title>
    </sec>
    <sec id="sec-5">
      <title>2.1 Process of Data Collection &amp; Instrumentation</title>
      <p>Data were collected from the games using both Google Analytics
(GA) and a researcher-developed event logging system. GA
was used to quickly record and visualize overall game metrics
such as number and location of player sessions, session length,
and high-level progression through each game (e.g., completed
level 5, then quit). GA data were primarily used during development
and to understand audiences and usage patterns; they are not
otherwise included in the current analyses.
Multiple choice knowledge assessment measures were designed
by the researchers for both games. Each item was aligned with the
documented learning goals of the game. The instruments were
designed to use a similar visual style as the rest of the game play
(See Figures 2 and 4). Players completed the assessment measures
after finishing gameplay; the assessment items were not
embedded into the game itself.</p>
      <p>Our analyses focused on two labels: quitting behavior, when a
player quits a level before it is completed and leaves the game,
and performance on the post-test assessment measures.</p>
      <sec>
        <title>Population and Sampling Process</title>
        <p>Based on GA data, 93% of the games’ usage came from the United States,
as determined by IP addresses. Gameplay sessions were primarily
recorded during school hours and on weekdays, leading the
researchers to believe that the games are used primarily in
classroom contexts. Gameplay sessions were recorded
anonymously, making it impossible to tell whether a session represented
a new or returning player. While we acknowledge this limitation,
our analysis assumes that an individual session represents a
unique player. During the data collection period of September 1
through December 31, 2018, Crystal Cave was played 20,963
times with an average of 24.68 minutes/session. Wave
Combinator was played 23,353 times with an average of 28.78
minutes/session. Of these sessions, 1,254 Crystal Cave
sessions and 5,308 Wave Combinator sessions are included in this
study based on the availability of the logging system and the data
exclusion rules described below.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>2.2 Data Logging</title>
      <p>Within each game, a JavaScript based logging client captures and
transmits clickstream events to a server for storage. Events are
recorded for all discrete player actions such as starting a level,
making a move, and completing a challenge. Each event is
timecoded using the client browser’s native time, an automatically
generated session identifier, and details about the event that took
place. The events are encoded as JSON and sent via an HTTP
POST request. These requests are scheduled for delivery to the
backend logging server using a first-in-first-out queue and are
only dismissed after delivery is confirmed.</p>
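      <p>The delivery behavior described above (a first-in-first-out queue whose entries are dismissed only after delivery is confirmed) can be illustrated with a minimal sketch. It is written in Python for consistency with the other examples in this paper rather than in the JavaScript of the actual client, and the class name, field names, and the <monospace>send</monospace> callback are hypothetical stand-ins, not the study's code.</p>
      <preformat><![CDATA[
```python
import json
from collections import deque


class EventQueue:
    """FIFO queue that keeps each event until its delivery is confirmed."""

    def __init__(self, send):
        # `send` is a hypothetical callable: it takes a JSON string and
        # returns True only when the server confirmed receipt.
        self._send = send
        self._pending = deque()

    def log(self, session_id, event_type, client_time, details=None):
        # Field names are illustrative assumptions, not the study's schema.
        event = {
            "session_id": session_id,
            "event": event_type,
            "client_time": client_time,
            "details": details or {},
        }
        self._pending.append(json.dumps(event))

    def flush(self):
        """Try to deliver events oldest-first; stop at the first failure."""
        delivered = 0
        while self._pending:
            payload = self._pending[0]      # peek at the oldest event
            if not self._send(payload):
                break                       # keep it queued and retry later
            self._pending.popleft()         # dismiss only after confirmation
            delivered += 1
        return delivered

    def __len__(self):
        return len(self._pending)
```
]]></preformat>
      <p>Because an event is removed only after a confirmed send, a transient network failure delays events but does not reorder or drop them.</p>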
      <p>The backend server consists of a researcher-built, open-source,
PHP-based web service. Client requests are parsed,
appended with the server’s system time, and inserted as individual
records into a MySQL database. As each clickstream event is sent
as a separate network request and recorded as an individual row,
the system is easily parallelized for large numbers of clients. For
this study, a single quad core Apache / PHP server Virtual
Machine and a single quad core MySQL Virtual Machine server
were provisioned in a University data-center.</p>
    </sec>
    <sec id="sec-7">
      <title>2.3 Feature Engineering and Distribution</title>
      <p>We designed features that describe the actions players are able to
take in each game. We intentionally explored basic features that
could conceivably be extended to other educational games. We
developed features that describe the counts of each (game
specific) move type, the average number of each move type in
each level of play, timings and attributes of each move, the
scoring the game provided to the player in each level, and
attributes of re-starting and replaying challenges by players.
Features were calculated using data collected chronologically
before the outcome being modeled. For example, features to
model quitting in level 5 were calculated using play data derived
from levels 1 through 4, and any available data from level 5 play,
but not level 6 or greater. This was done to preserve the predictive
nature of the research and to create a model that could
conceivably be used to make predictions in real-time within a
gameplay session. Play sessions with fewer than 10 total moves
were excluded from the final dataset. The table below describes
the features used for each game and attempts to align them across
games when appropriate.</p>
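      <p>As a rough illustration of this kind of feature calculation, the sketch below derives counts and average timings from a time-ordered event list, using only events from levels up to and including the level being modeled. The event dictionary layout and the feature names are assumptions made for the example; the actual feature scripts used in the study are not reproduced here.</p>
      <preformat><![CDATA[
```python
from collections import Counter


def session_features(events, target_level):
    """Compute simple descriptive features for one session.

    `events` is a time-ordered list of dicts with hypothetical keys
    'time' (seconds), 'level' (int) and 'type' (str). Events from levels
    after `target_level` are ignored so that no later data leaks into
    the prediction.
    """
    usable = [e for e in events if e["level"] <= target_level]
    counts = Counter(e["type"] for e in usable)
    gaps = [b["time"] - a["time"] for a, b in zip(usable, usable[1:])]
    levels_seen = {e["level"] for e in usable}

    features = {"count_" + kind: n for kind, n in counts.items()}
    features["total_events"] = len(usable)
    features["mean_gap_seconds"] = sum(gaps) / len(gaps) if gaps else 0.0
    features["events_per_level"] = (
        len(usable) / len(levels_seen) if levels_seen else 0.0
    )
    return features


example = [
    {"time": 0, "level": 1, "type": "level_start"},
    {"time": 4, "level": 1, "type": "move"},
    {"time": 10, "level": 1, "type": "level_complete"},
    {"time": 12, "level": 2, "type": "level_start"},
    {"time": 20, "level": 2, "type": "move"},
    {"time": 30, "level": 3, "type": "level_start"},
]
level_2_features = session_features(example, target_level=2)
```
]]></preformat>
      <p>In this toy session the level 3 event is excluded when modeling level 2, mirroring the rule that features are computed only from data collected before the outcome.</p>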
      <p>We defined “quitting” on a given level as a session where the
session ends (log events halt abruptly) before the current level is
completed. Using this definition, each session is labeled with
either a “quit” or a “complete” for each level. The distribution for
these events leans strongly toward completing, with an average of
70.3% and 81.4% of sessions completing each level in Crystal
Cave and Wave Combinator, respectively. We use each level’s
quitting distribution as our baseline model.</p>
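      <p>A minimal sketch of this labeling and of a distribution-based baseline follows. It assumes hypothetical <monospace>level_start</monospace> and <monospace>level_complete</monospace> event types and treats the baseline as always predicting the more common label; it is an illustration, not the labeling script used in the study.</p>
      <preformat><![CDATA[
```python
from collections import Counter


def label_quit(events, level):
    """Label one session for one level.

    Returns 'complete' if the session logged a completion event for the
    level, 'quit' if the level was started but never completed, and None
    if the session never reached the level. Event type names are assumed.
    """
    started = any(
        e["level"] == level and e["type"] == "level_start" for e in events
    )
    if not started:
        return None
    completed = any(
        e["level"] == level and e["type"] == "level_complete" for e in events
    )
    return "complete" if completed else "quit"


def majority_baseline_accuracy(labels):
    """Accuracy of a model that always predicts the most common label."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


# Illustrative labels only: 7 completions and 3 quits.
toy_labels = ["complete"] * 7 + ["quit"] * 3
toy_baseline = majority_baseline_accuracy(toy_labels)
```
]]></preformat>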
      <p>We defined “incorrect” answers as a session selecting any of the 3
options that were not correct for each assessment item. As with
quitting, the distribution for the assessments leans toward correct
answers being provided. An average of 65.6% of sessions selected
correct answers for the three Crystal Cave questions while an
average of 65.8% of sessions selected correct answers for the two
Wave Combinator questions. As with quitting, we use each
question’s distribution as the baseline model.</p>
    </sec>
    <sec id="sec-8">
      <title>2.4 Modeling process</title>
      <p>
        We modeled the data using several algorithms provided by
RapidMiner, a multiplatform data science tool [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. This tool was
chosen for ease of use and free use in educational contexts across
the vast majority of computational platforms (OSX, Windows,
and Linux). Individual models were generated for quitting at
levels 1, 3, 5 and 7 for each game. Individual models were also
generated for each knowledge assessment, where the results of the
assessment were represented as a binomial indicating a correct vs.
incorrect response. Models that were used included RapidMiner’s
implementations of Naive Bayes, Generalized Linear Model,
Logistic Regression, Fast Large Margin, Deep Learning,
Decision Tree, Random Forest, Gradient Boosted Trees and
Support Vector Machine. RapidMiner’s Deep Learner component is
based on a multi-layer feed-forward artificial neural network. For
more details see:
https://docs.rapidminer.com/latest/studio/operators/modeling/predictive/neural_nets/deep_learning.html
RapidMiner’s default hyperparameters
were used for all models, including a preprocessing step to
standardize all values to have zero mean and unit variance as well
as the option to use a single thread to ensure reproducibility.
Model-specific hyperparameters are listed in Appendix A. A single
60/40 split process randomly divided the source data into a 60%
training set and a 40% validation set. The accuracy of each
model was determined, along with baseline accuracies for quitting
and knowledge assessment. The same initial feature space was
used to predict both quitting behavior and post-test performance.
      </p>
    </sec>
    <sec id="sec-9">
      <title>3. RESULTS</title>
      <p>Table 3. Performance for predicting instances of quitting at
each level within Crystal Cave.</p>
      <p>
Baseline calculations were quite high for predicting quitting at
each of the different levels across both games. For example,
91.1% of players who start level 7 in Wave Combinator also
complete it. By predicting that all players will complete level 7, a
model will have a 0.911 accuracy, leaving very little room for
improvement. Across all levels, a baseline model that always
predicts completing the level will have an average accuracy of
0.703 for Crystal Cave and 0.814 for Wave Combinator.</p>
      <p>For predictions of quitting at each level in Crystal Cave, Deep
Neural Networks performed best on average. The most accurate
prediction was 0.908 for level 1, followed by 0.786 for level 5,
0.737 for level 3 and 0.707 for level 7. The largest improvement
over the baseline was for level 7, with the model performing
22.7% more accurately. This was followed by level 5 with a
15.8% improvement over baseline, and level 1 with a 15.1%
improvement over baseline. The model performed slightly worse
than baseline (0.958) predicting quitting for level 3. On average,
the Deep Neural Networks predicted quitting with an accuracy of
0.784. All models performed better than the baseline model
except for Fast Large Margin.</p>
      <p>For predictions of quitting at each level in Wave Combinator,
Logistic Regression was the most accurate, but offered little
improvement over baseline for most levels. The most accurate
prediction was 0.999 for quitting in level 1 followed by 0.914 for
level 7, 0.819 for level 3 and 0.630 for level 5. The largest
improvement over baseline was seen in level 1 with the model
performing 11.2% more accurately. This advantage quickly
dissolves with only a 0.8% improvement in level 5, a 0.3%
improvement for level 7 and a 0.1% improvement for level 3. On
average, Logistic Regression predicted quitting with an accuracy
of 0.841. Deep Learning and Gradient Boosted Tree algorithms
failed to perform better than the baseline model for this
prediction.</p>
      <p>Baseline predictions for the assessment items were lower than for
quitting, but still much higher than a fair coin toss. Averaging
across the 3 items in the Crystal Cave and the 2 items in the Wave
Combinator, a baseline model that always predicts a correct
answer will have an accuracy of 0.656 and 0.658, respectively.</p>
      <p>For predicting incorrect answers on the 3 assessment items in the
Crystal Cave, Logistic Regression was the most accurate. The
model best predicts the outcome of question 1 with an accuracy of
0.745, followed by question 0 with an accuracy of 0.603 and
question 2 with an accuracy of 0.600. Compared to the baseline,
the greatest improvement was 4.4% on question 2. The model
demonstrated a 2.8% improvement on question 1 and 2.5%
improvement on question 0. On average, the model predicted
incorrect answers with an accuracy of 0.649.</p>
      <p>For predicting incorrect answers on the 2 assessment items in the
Wave Combinator, Logistic Regression was the most accurate.
The model has an accuracy of 0.776 for question 1 followed by an
accuracy of 0.590 for question 0. This translates to an
improvement of 9.2% for question 0 and identical accuracy to
baseline for question 1. On average, the model predicted incorrect
answers with an accuracy of 0.683.</p>
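      <p>For readers who want to reproduce the general evaluation protocol behind these accuracies outside RapidMiner, the following sketch shows the three generic steps described in Section 2.4: standardizing features, a single random 60/40 split, and an accuracy calculation. It is not RapidMiner's implementation. The use of the population standard deviation, the fixed seed, and standardizing before splitting are assumptions made for the example.</p>
      <preformat><![CDATA[
```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Scale each feature column to zero mean and unit variance.

    Assumption: population standard deviation; constant columns are
    left at zero by dividing by 1.0 instead of 0.
    """
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    deviations = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, deviations)]
        for row in rows
    ]


def split_60_40(items, seed=0):
    """Randomly divide items into a 60% training and 40% validation set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)   # seed is an assumed choice
    cut = round(0.6 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```
]]></preformat>
      <p>A single split gives one accuracy estimate per model; the sketch makes no attempt to estimate its variance.</p>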
    </sec>
    <sec id="sec-10">
      <title>4. DISCUSSION</title>
      <p>This paper compares the accuracy of 9 common modeling
algorithms for predicting quitting and knowledge assessment in
two different learning games using the simplest possible feature
engineering. We found that, on the whole, these models were able
to successfully predict quitting behavior and correct answers in
our two games and their associated post-tests. This is a promising
finding for continuing to deploy educational data mining methods
in order to capture and identify learning and behaviors of interest
within digital games.</p>
      <p>Accurate prediction of quitting behaviors and post-test
performance has a number of practical applications within
educational settings. For instance, players who are identified as
being at-risk for quitting a level may be given targeted behavioral
or affective scaffolds to keep them on-task and working
productively. Players who have a low predicted score for a
post-test assessment can be given additional practice opportunities
on demand, based on the specific misconceptions or difficulties they
are having.</p>
      <p>
        That said, there is room for improvement in the performance of
these models. More complex, move sequence features may lead to
more meaningful descriptors of the player’s thinking. While the
features that were used in this paper were certainly grounded in
the interactions afforded to the player, they were only computed
in terms of simple counts and averages. One possible next step
would be to use sequential pattern mining to first identify
common sequences of moves that correlate with outcomes of
interest [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. The presence of these patterns could then be used as
an engineered feature to train the models.
      </p>
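      <p>One simple way to prototype this idea is sketched below. It counts contiguous move n-grams, keeps those that occur in at least a minimum share of sessions, and encodes their presence as binary features. This is a deliberately simplified stand-in for sequential pattern mining, which typically also allows gaps between moves; the move names and the support threshold are hypothetical.</p>
      <preformat><![CDATA[
```python
from collections import Counter


def move_ngrams(moves, n=2):
    """Count contiguous n-grams in one session's move sequence."""
    return Counter(tuple(moves[i:i + n]) for i in range(len(moves) - n + 1))


def frequent_patterns(sessions, n=2, min_support=0.5):
    """Return n-grams that appear in at least `min_support` of sessions."""
    support = Counter()
    for moves in sessions:
        support.update(set(move_ngrams(moves, n)))  # count once per session
    return sorted(
        pattern
        for pattern, count in support.items()
        if count / len(sessions) >= min_support
    )


def pattern_features(moves, patterns, n=2):
    """Encode presence of each pattern as a 0/1 feature for one session."""
    grams = move_ngrams(moves, n)
    return {"has_" + "_".join(p): int(grams[p] > 0) for p in patterns}


# Hypothetical move names for a wave-matching game.
toy_sessions = [
    ["amplitude", "frequency", "amplitude"],
    ["amplitude", "frequency", "offset"],
    ["offset", "offset"],
]
toy_patterns = frequent_patterns(toy_sessions)
```
]]></preformat>
      <p>The resulting 0/1 columns could be appended to the count and timing features before training, with the correlation between each pattern and the outcome checked separately.</p>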
      <p>The extreme accuracy (0.999) of level 1 quitting predictions for
Wave Combinator invites speculation about the usefulness of
building models based only on very recent events. The approach
used here was to use all player actions leading up to the quitting
or assessment event. This may have the unintended consequence
of diluting player moves that may immediately lead to a success
in a specific level, with moves from much earlier in the gameplay
that are now irrelevant to the challenge at hand. A next step would
be to modify the feature generating scripts to experiment with
different time windows for modeling.</p>
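      <p>A time-window variant of the feature scripts could start from a filter like the one sketched here, which keeps only events from the last <monospace>window_seconds</monospace> before the prediction point. The function name, the event layout, and the choice to measure the window back from the most recent event are assumptions for illustration.</p>
      <preformat><![CDATA[
```python
def recent_events(events, window_seconds, now=None):
    """Keep only events inside a trailing time window.

    `events` is a time-ordered list of dicts with a 'time' key (seconds).
    If `now` is omitted, the window ends at the last recorded event.
    """
    if not events:
        return []
    end = events[-1]["time"] if now is None else now
    start = end - window_seconds
    return [e for e in events if start <= e["time"] <= end]


toy_events = [{"time": t, "type": "move"} for t in (0, 30, 55, 58, 60)]
last_ten_seconds = recent_events(toy_events, window_seconds=10)
```
]]></preformat>
      <p>Features computed on the filtered list can then be compared against features computed on the full history to test whether older moves dilute the signal.</p>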
      <p>Another limitation of this work is that accuracy may not be the
best measure of the effectiveness of the predictions. In future
work, the performance of the models should be reported by
providing precision, recall and F1 scores. This issue is
compounded by the fact that baseline predictions, based only on
the percentages of players that complete a level or correctly
answer a quiz item, are quite high, leaving very little room for
improvement. The authors are unable to conclude that the models
are deriving their accuracy from the strength of the features and
not simply the unbalanced distribution of the phenomena.</p>
      <p>Finally, the validity of the answers provided for the multiple
choice assessment items could be studied. These items are not
standardized measures, but reasonable assessments designed by
the researchers. Further evaluating their validity and reliability
may offer insights into why they are harder to predict.
Additionally, modifying the system to record the time spent
answering each assessment item would help identify obvious issues
such as a player spending less than 1 second before answering, which is not nearly
enough time to read and decide on a correct answer.</p>
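      <p>The concern about accuracy on unbalanced outcomes raised in this section can be made concrete with a small sketch. The counts below are illustrative only, chosen to mirror a level that 91.1% of players complete. A model that always predicts completion reaches 0.911 accuracy while never identifying a single quitter, which precision, recall and F1 for the quit class make visible.</p>
      <preformat><![CDATA[
```python
def precision_recall_f1(y_true, y_pred, positive="quit"):
    """Precision, recall and F1 for one class, with 0.0 when undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Illustrative data: 911 completions, 89 quits, always predict 'complete'.
truth = ["complete"] * 911 + ["quit"] * 89
always_complete = ["complete"] * 1000
baseline_accuracy = sum(
    t == p for t, p in zip(truth, always_complete)
) / len(truth)
baseline_scores = precision_recall_f1(truth, always_complete)
```
]]></preformat>
      <p>Here the baseline's accuracy is high, yet its recall for quitting is zero, so it would never trigger an intervention for an at-risk player.</p>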
    </sec>
    <sec id="sec-11">
      <title>5. SUMMARY</title>
      <p>In summary, logistic regressions performed better than all
competing algorithms for quitting in Wave Combinator and
content knowledge tests in both games. Deep Learning models
performed best in predicting quitting in the Crystal Cave game.
Level quits can be predicted with an average accuracy of 0.784 for
Crystal Cave and 0.841 for Wave Combinator, an improvement of
12.4% and 3.1% over baseline, respectively. Correct answers
across the embedded knowledge assessment items can be
predicted with an average accuracy of 0.649 for Crystal Cave and
0.683 for the Wave Combinator. The models provided a 3.3% and
4.6% improvement over baseline for these games.</p>
      <p>These results show that educational data mining techniques can
provide some predictive value to different kinds of educational
games even with relatively minimal feature engineering. We hope
that other researchers can be encouraged to apply similar methods
to their own games given our results.</p>
    </sec>
    <sec id="sec-12">
      <title>6. ACKNOWLEDGMENTS</title>
      <p>The authors gratefully acknowledge partial support of this
research by NSF through the University of Wisconsin Materials
Research Science and Engineering Center (DMR-1720415) and
the Wisconsin Department of Public Instruction.</p>
      <p>Hyperparameters:
Family = binomial;
Solver = L_BFGS;
Solver = L_BFGS;
Strategy = 1 against all;
Activation = rectifier;
Hidden layer sizes = 50,50;
Epochs = 10.0;
Criterion = gain_ratio;
Maximal depth = 2</p>
      <sec id="sec-12-1">
        <title>Random Forest Voting Strategy = confidence vote</title>
      </sec>
      <sec id="sec-12-2">
        <title>Naive Bayes</title>
      </sec>
      <sec id="sec-12-3">
        <title>Generalized Linear Model Logistic Regression Fast Large Margin Deep Learning</title>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <source>Crystal Cave</source>
          [Computer Software]. (
          <year>2017</year>
          ). Madison: Field Day.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Karumbaiah</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baker</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shute</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          (
          <year>2018</year>
          )
          <article-title>Predicting Quitting in Students Playing a Learning Game</article-title>
          .
          <source>Proceedings of the 11th International Conference on Educational Data Mining</source>
          ,
          <fpage>21</fpage>
          -
          <lpage>31</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Kiili</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Devlin</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perttula</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tuomi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Lindstedt</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Using video games to combine learning and assessment in mathematics education</article-title>
          .
          <source>International Journal of Serious Games</source>
          ,
          <volume>2</volume>
          (
          <issue>4</issue>
          ),
          <fpage>37</fpage>
          -
          <lpage>55</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Maguth</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>List</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Wunderle</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Teaching Social Studies with Video Games</article-title>
          .
          <source>The Social Studies</source>
          ,
          <volume>106</volume>
          (
          <issue>1</issue>
          ),
          <fpage>32</fpage>
          -
          <lpage>36</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Malkiewich</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baker</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shute</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kai</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paquette</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          (
          <year>2016</year>
          )
          <article-title>Classifying behavior to elucidate elegant problem solving in an educational game</article-title>
          .
          <source>Proceedings of the 9th International Conference on Educational Data Mining</source>
          ,
          <fpage>448</fpage>
          -
          <lpage>453</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Mierswa</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Klinkenberg</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <source>RapidMiner Studio (9.2) [Data science, machine learning, predictive analytics]</source>
          . Retrieved from https://rapidminer.com/
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Rowe</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Asbell-Clarke</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baker</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gasca</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bardar</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Scruggs</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Labeling Implicit Computational Thinking in Pizza Pass Gameplay</article-title>
          .
          <source>In Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems.</source>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Shute</surname>
            ,
            <given-names>V. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ventura</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Ke</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>The power of play: The effects of Portal 2 and Lumosity on cognitive and noncognitive skills</article-title>
          .
          <source>Computers &amp; Education</source>
          ,
          <volume>80</volume>
          ,
          <fpage>58</fpage>
          -
          <lpage>67</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Shute</surname>
            ,
            <given-names>V. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ventura</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>Y. J.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>Assessment and learning of qualitative physics in Newton's Playground</article-title>
          .
          <source>The Journal of Educational Research</source>
          ,
          <volume>106</volume>
          (
          <issue>6</issue>
          ),
          <fpage>423</fpage>
          -
          <lpage>430</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Watson</surname>
            ,
            <given-names>W. R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mong</surname>
            ,
            <given-names>C. J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Harris</surname>
            ,
            <given-names>C. A.</given-names>
          </string-name>
          (
          <year>2011</year>
          ).
          <article-title>A case study of the in-class use of a video game for teaching high school history</article-title>
          .
          <source>Computers &amp; Education</source>
          ,
          <volume>56</volume>
          (
          <issue>2</issue>
          ),
          <fpage>466</fpage>
          -
          <lpage>474</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <source>Wave Combinator</source>
          [Computer Software]. (
          <year>2017</year>
          ). Madison: Field Day.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Wallner</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Sequential Analysis of Player Behavior</article-title>
          .
          In
          <source>CHI PLAY '15: Proceedings of the 2015 Annual Symposium on Computer-Human Interaction in Play</source>
          ,
          <fpage>349</fpage>
          -
          <lpage>358</lpage>
          . https://doi.org/10.1145/2793107.2793112
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Pavlik</surname>
            <suffix>Jr.</suffix>
            ,
            <given-names>P. I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cen</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Koedinger</surname>
            ,
            <given-names>K. R.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>Performance Factors Analysis - A New Alternative to Knowledge Tracing</article-title>
          . In V. Dimitrova &amp; R. Mizoguchi (Eds.),
          <source>Proceedings of the 14th International Conference on Artificial Intelligence in Education</source>
          . Brighton, England.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <source>Zoombinis</source>
          [Computer Software]. (
          <year>2015</year>
          ).
          TERC.
        </mixed-citation>
      </ref>
    </ref-list>
    <app-group>
      <app>
        <label>8</label>
        <title>Appendix A: Hyperparameters used for each model</title>
      </app>
    </app-group>
  </back>
</article>