Expanding Knowledge Tracing to Prediction of Gaming Behaviors
                       Sarah E Schultz                                                   Ivon Arroyo
                Worcester Polytechnic Institute                                Worcester Polytechnic Institute
                      100 Institute Rd                                               100 Institute Rd
                       Worcester, MA                                                  Worcester, MA
                    seschultz@wpi.edu                                                iarroyo@wpi.edu
ABSTRACT
Knowledge tracing has been used to predict students’ knowledge and
performance for almost twenty years. Recently, researchers have become
interested in looking at students’ behaviors, especially those
considered gaming behaviors. In this work, we attempt to leverage a
variation of knowledge tracing to predict gaming behaviors without
damaging the prediction of performance. We compare the predictions of
this model to those of knowledge tracing and a separate engagement
tracing model.

Keywords
Knowledge tracing, affect, engagement, gaming, behavior

1. INTRODUCTION
When Corbett and Anderson first published the knowledge tracing model
in 1995, they claimed that their goal was “to implement a simple
student modeling process that would allow the tutor to […] tailor the
sequence of practice exercises to the student’s needs” [1]. While
knowledge tracing is generally able to predict students’ performance
“quite well,” it does not take into account the possibility of
disengagement. Traditionally, knowledge tracing is used with the
probability of transition from a learned to an unlearned state set at
0, so students who become disengaged are not presumed to be forgetting
the skill. When the forgetting transition is allowed, models such as
knowledge tracing can become confounded, mistaking disengagement for
unlearning, as illustrated in Figure 1.

Figure 1- Bayesian Knowledge Estimation of a student on one skill
(bottom line)

Figure 1 suggests that this student was un-learning. However, after
looking at the logs in detail, it was clear that, after the 7th
problem, the student was just clicking through all the available
multiple-choice answers without attempting to answer correctly. This
type of behavior is defined by Baker et al. as “gaming the system” [2]
and is considered to be an indicator of disengagement or negative
affect. Some work has been done in modeling engagement and affect in
Intelligent Tutoring Systems [3], but relatively little research has
focused on combining these methods with ways of tracking knowledge,
such as knowledge tracing, in order to create a model that can predict
both student performance and disengaged behavior and intervene
appropriately.

2. PREVIOUS WORK
2.1 Bayesian Knowledge Tracing
Corbett and Anderson’s Bayesian Knowledge Tracing (BKT) [1] (Figure 2)
is a hidden Markov model. At each time step there is a latent node,
knowledge, and an observed node, performance. The parameters for this
model are P(L0), the probability that a student already knows the
skill; P(T), the probability of learning the skill from one time-step
to the next; P(G), the probability that a student who does not know
the skill guesses correctly; and P(S), the probability that a student
who does know the skill slips and gets the answer incorrect. As
mentioned in the introduction, P(F), forgetting, is traditionally set
at 0; however, for this work we allow forgetting in order to see if
looking at behavior affects the amount of forgetting that students
appear to do.

Figure 2- Bayesian Knowledge Tracing

2.2 HMM-IRT
In 2006, Johns and Woolf proposed the Dynamic Mixture Model (DMM) for
predicting student knowledge and engagement [4]. They used a hidden
Markov model like BKT for tracing engagement, but paired it with an
Item Response Theory-like model for predicting knowledge. Rather than
predicting knowledge at each time step, there is a single knowledge
node for every skill, and students’ performance relies on that in
addition to their engagement state. This allowed more accurate
knowledge predictions than IRT alone, as disengagement, indicated by
gaming behaviors, could explain away some incorrect attempts, rather
than attributing those to knowledge.
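The BKT parameters listed in Section 2.1 can be made concrete with a
short sketch. The code below is an illustrative implementation of the
standard BKT update with a forgetting transition, not the authors’
code; the class and function names (BKTParams, predict_correct, update,
trace) and the parameter values in the usage note are assumptions
introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class BKTParams:
    """Hypothetical container for the BKT parameters named in Section 2.1."""
    p_l0: float        # P(L0): prior probability the skill is already known
    p_t: float         # P(T): probability of learning between time steps
    p_g: float         # P(G): probability of guessing correctly when unlearned
    p_s: float         # P(S): probability of slipping when learned
    p_f: float = 0.0   # P(F): probability of forgetting (traditionally 0)


def predict_correct(p_know: float, params: BKTParams) -> float:
    """Probability that the next response is correct."""
    return p_know * (1 - params.p_s) + (1 - p_know) * params.p_g


def update(p_know: float, correct: bool, params: BKTParams) -> float:
    """Condition on one observed response, then apply learn/forget transitions.

    Assumes non-degenerate parameters so the denominator is never zero.
    """
    if correct:
        num = p_know * (1 - params.p_s)
        den = num + (1 - p_know) * params.p_g
    else:
        num = p_know * params.p_s
        den = num + (1 - p_know) * (1 - params.p_g)
    posterior = num / den
    # Transition: stay learned unless forgotten; unlearned may become learned.
    return posterior * (1 - params.p_f) + (1 - posterior) * params.p_t


def trace(responses, params: BKTParams):
    """Return the knowledge estimate held before each response."""
    p_know = params.p_l0
    estimates = []
    for correct in responses:
        estimates.append(p_know)
        p_know = update(p_know, correct, params)
    return estimates
```

For example, with assumed values BKTParams(p_l0=0.5, p_t=0.1, p_g=0.2,
p_s=0.1, p_f=0.2), calling trace on a run of incorrect responses drives
the estimate downward, which is the pattern that can be mistaken for
unlearning when the student is in fact gaming.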
Figure 3- Dynamic Mixture Model

2.3 The KAT Model
In our previous work [5], we proposed the knowledge and affect tracing
(KAT) model (Figure 4), which combines two hidden Markov models, BKT
and the engagement tracing piece of DMM. As in DMM, affect influences
performance. This model was able to predict both performance and
behavior better than the dynamic mixture model, but did not predict
performance as well as standard BKT, perhaps due to
over-parameterization [5].

Figure 4- The KAT Model

3. THE KTB MODEL
We propose the “Knowledge Tracing with Behavior” (KTB) model. This
model has only one latent node, which we call “knowledge”, although in
reality it is a combination of both knowledge and engagement, and two
observables, performance and gaming behaviors. This model is shown in
Figure 5.

Figure 5- KTB Model

This model has fewer parameters than the dynamic mixture model or KAT
model, but still can predict both performance and disengaged behavior
of the students.

The variable called Gaming Behavior (B) is defined as either gaming or
normal. See our definition for “gaming” in this context in our
previous work [5].

4. BAYESIAN ENGAGEMENT TRACING
Since the performance prediction of the KTB model can be compared to
that of Bayesian Knowledge Tracing, it is necessary to have a model of
engagement tracing to compare the behavior predictions. To that end,
we include a model of “Bayesian Engagement Tracing” (BET) in this
work, which is the same as the HMM part of Johns and Woolf’s model or
the engagement piece of the KAT model, but not connected to any other
model (top part of Figure 4).

5. DATASETS AND METHODS
The data and methods used in this work were the same as those used in
[5]. The data came from two tutors for middle and high school
mathematics, ASSISTments and Wayang Outpost. For details, please see
[5] in the main conference proceedings.

6. RESULTS AND ANALYSIS
While KT and KTB both outperform KAT and DMM in all predictions, in
seven of the nine knowledge components, KTB was better able to predict
performance than standard knowledge tracing, although the only
significant difference between the two was in the ASSISTments skill
“Circle Graph” (p=0.03). Interestingly, the Bayesian engagement
tracing model was better able to predict students’ behavior than KTB
in eight of the nine knowledge components, although the differences
are again not significant, except in two cases, “Box and Whisker,” and
“Triangles” (p=0.02).

7. DISCUSSION
We have proposed a new model, knowledge tracing with behavior, which
can predict both student performance and behavior, and have shown that
it predicts future behaviors (correctness in responding to math
problems, and gaming behaviors) at least as well as BKT and a separate
Bayesian engagement tracing model. KTB seems to stop the false
forgetting effect that is recorded by KT when the forgetting
probability is allowed to be nonzero.

ACKNOWLEDGEMENTS
This research is supported by the Office of Naval Research, STEM
Challenge Award, # N0001413C0127US. We also acknowledge funding from
NSF (#1316736, 1252297, 1109483, 1031398, 0742503), and IES
(# R305A120125 & R305C100024). Any opinions or conclusions expressed
are those of the authors, not necessarily of the funders.

REFERENCES
[1] Corbett, A.T. and Anderson, J.R. “Knowledge tracing: Modeling the
acquisition of procedural knowledge.” User Modeling and User-Adapted
Interaction, 1995, 4, p. 253-278.

[2] Baker, R.S., Corbett, A.T., Koedinger, K.R. and Wagner, A.Z.
“Off-Task Behavior in the Cognitive Tutor Classroom: When Students
‘Game The System’.” Proceedings of ACM CHI 2004: Computer-Human
Interaction, 2004, p. 383-390.

[3] Beck, J.E. “Engagement tracing: using response times to model
student disengagement.” Proceedings of AIED conference, 2005,
p. 88-95. IOS Press.

[4] Johns, J. and Woolf, B.P. “A Dynamic Mixture Model to Detect
Student Motivation and Proficiency.” Proceedings of AAAI Conference,
2006, 1, p. 163-168.

[5] Schultz, S. and Arroyo, I. “Tracing Knowledge and Engagement in
Parallel in an Intelligent Tutoring System.” To appear in Proceedings
of the 7th Annual International Conference on Educational Data Mining,
2014.
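As a supplementary illustration of the KTB model from Section 3, the
sketch below filters a single latent state from two observables,
performance and gaming behavior. It is not the authors’ implementation.
It assumes that the two observables are conditionally independent given
the latent state, and the names KTBParams, ktb_predict, ktb_update,
p_game_known and p_game_unknown are hypothetical, introduced only for
this example.

```python
from dataclasses import dataclass


@dataclass
class KTBParams:
    """Hypothetical parameterization of a one-latent, two-observable HMM."""
    p_l0: float            # prior probability of the "knowledge" state
    p_t: float             # probability of moving unlearned -> learned
    p_f: float             # probability of moving learned -> unlearned
    p_g: float             # P(correct | unlearned), i.e. guess
    p_s: float             # P(incorrect | learned), i.e. slip
    p_game_known: float    # assumed: P(gaming | learned)
    p_game_unknown: float  # assumed: P(gaming | unlearned)


def ktb_predict(p_know: float, params: KTBParams):
    """Return (P(correct), P(gaming)) for the next time step."""
    p_correct = p_know * (1 - params.p_s) + (1 - p_know) * params.p_g
    p_gaming = (p_know * params.p_game_known
                + (1 - p_know) * params.p_game_unknown)
    return p_correct, p_gaming


def ktb_update(p_know: float, correct: bool, gaming: bool,
               params: KTBParams) -> float:
    """Condition on both observables, then apply the state transition.

    Assumption: performance and behavior are independent given the state,
    so their likelihoods multiply. Parameters must be non-degenerate.
    """
    like_known = ((1 - params.p_s) if correct else params.p_s) * (
        params.p_game_known if gaming else 1 - params.p_game_known)
    like_unknown = (params.p_g if correct else 1 - params.p_g) * (
        params.p_game_unknown if gaming else 1 - params.p_game_unknown)
    num = p_know * like_known
    posterior = num / (num + (1 - p_know) * like_unknown)
    return posterior * (1 - params.p_f) + (1 - posterior) * params.p_t
```

Under these assumptions, one latent estimate drives both predictions,
so the sketch needs only two emission parameters beyond those of BKT
with forgetting.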