=Paper= {{Paper |id=Vol-1183/bkt20y_paper03 |storemode=property |title= The Sequence of Action Model: Leveraging the Sequence of Attempts and Hints |pdfUrl=https://ceur-ws.org/Vol-1183/bkt20y_paper03.pdf |volume=Vol-1183 |dblpUrl=https://dblp.org/rec/conf/edm/ZhuWH14 }} == The Sequence of Action Model: Leveraging the Sequence of Attempts and Hints== https://ceur-ws.org/Vol-1183/bkt20y_paper03.pdf
The Sequence of Action Model: Leveraging the Sequence of Attempts and Hints

Linglong Zhu (lzhu@wpi.edu), Yutao Wang (yutaowang@wpi.edu), Neil T. Heffernan (nth@wpi.edu)
Department of Computer Science, Worcester Polytechnic Institute
100 Institute Road, Worcester, MA



ABSTRACT
Intelligent Tutoring Systems (ITS) have been proven to be efficient at providing student assistance and assessing students' performance when they do their homework. Researchers have analyzed how students’ knowledge grows and have predicted their performance from within intelligent tutoring systems. Most of them focus on using the correctness of the previous question, or the number of hints and attempts students need, to predict future performance, but ignore the sequence of hints and attempts. In this work, we build a Sequence of Action (SOA) model that takes advantage of the sequence of hints and attempts a student needed for the previous question to predict the student's performance. A two-step modeling methodology is put forward in this work; it combines a Tabling method with Logistic Regression. We compared SOA with Knowledge Tracing (KT) and the Assistance Model (AM), and with combinations of SOA/AM and KT. The experimental results showed that the Sequence of Action model has reliably better predictive accuracy than KT and AM, and that its predictive performance improves after combining it with KT.

Keywords
Knowledge Tracing, Educational Data Mining, Student Modeling, Sequence of Action, Assistance Model, Ensemble.

1. INTRODUCTION
One of the student modeling tasks is to trace a student’s knowledge by using the student’s performance. Corbett and Anderson (1995) put forward the well-known Knowledge Tracing (KT) model based on their observation that students’ knowledge is not fixed, but is assumed to be increasing. The KT model makes use of a Bayesian network to model students’ learning process and predict their performance.

A variety of extensions of the KT model have been put forward in recent years. Baker, Corbett, and Aleven (2008) built a contextual guess and slip model based on KT that provides more accurate and reliable student modeling than KT. Pardos and Heffernan extended the four-parameter KT model to support individualization and skill-specific parameters, and obtained better predictions of students’ performance. Qiu and Qi et al. found that forgetting is a more likely cognitive explanation for the over-prediction of KT when considering the time students take to finish their tasks.

Alternative methods to the KT model have been developed. For example, in order to generate adaptive instructions for students, Pavlik Jr., Cen, and Koedinger (2009) put forward the Performance Factor Analysis (PFA) model, which can make predictions for individual students with individual skills. Gong, Beck, and Heffernan (2010) compared KT with PFA using multiple model fitting procedures and showed that there are no real differences in predictive accuracy between these two models.

However, little attention has been paid to the data generated when students interact with computer tutors. Shih, Koedinger, and Scheines (2010) used Hidden Markov Model clustering to discover the different strategies students used while working on an ITS and to predict learning outcomes based on these strategies. Their work is based on a dataset that consists of a series of transactions, where each transaction is a tuple. This model takes into account both students’ actions (an attempt or a help request) and action duration. The experimental results of their Stepwise-HMM-Cluster model show that persistent attempts lead to better performance than a hint-scaffolding strategy.

Some papers have shown the value of using the raw number of attempts and hints. In fact, the National Educational Technology Plan cited Feng, Heffernan, and Koedinger’s work (2006), and the User Modeling community gave it an award for best paper for showing that the raw number of hints and attempts is informative in predicting state test scores. Wang and Heffernan (2011) built an Assistance Model (AM) and generated a performance table based on students’ behavior on the previous question. Hawkins et al. (2013) extended AM by looking at students’ behavior on the two previous questions.

These educational data mining models, which use the number of assistance requests and the number of attempts students make to predict their performance, have ignored the sequencing of students’ interactions with the ITS. Consider a thought experiment. Suppose you know that Bob Smith asked for one of the three hints and made one wrong attempt before eventually getting the question correct. What if someone told you that Bob first made an attempt and then had to ask for a hint, as opposed to first requesting a hint and then making a wrong attempt? Would this information (whether he started with an attempt or a hint) add value to your ability to predict whether Bob will get the next question correct? We suspected that a student who first makes an attempt tends to learn by himself and has a higher probability of mastering the knowledge and answering the next question on the same skill correctly.

In our previous work, we showed that a Sequence of Action (SOA) model that made use of information about a student's action sequence of attempts and hints on the previous question better predicted the correctness of the current question. We reported experimental results showing an improvement upon the KT model. However, we later found a mistake in that experiment. This paper therefore serves as a correction of the previous results and as a formal presentation of the SOA model to the community. We present the SOA model and compare it to the KT model and the Assistance Model, as well as to the combined models, to see if knowing sequence of action information does improve upon a
standard Knowledge Tracing model, or even upon knowing the number of hints and number of attempts alone.

The raw data and experiment results are available online: https://sites.google.com/site/assistmentsdata/projects/zhu2014.

1.1 The Tutoring System and Dataset
The data we used originated from the ASSISTments platform, an online tutoring system for K12 students that gives immediate feedback to teachers, students, and parents. ASSISTments gives tutorial assistance if a student makes a wrong attempt or asks for help. Figure 1 shows an example of a hint, which is one type of assistance. Other types of assistance include scaffolding questions and context-sensitive feedback messages, known as “buggy messages.”

Figure 1. Assistance in ASSISTments. Which is first: asking for a hint or making an attempt?

Figure 1 shows a student who asked for a hint (shown in yellow and also indicated by the button that says “Show hint 2 of 4”), but it also shows that the student typed in eight and got feedback that this was wrong. Though Figure 1 shows the number of hints and attempts, interestingly you cannot tell whether the student asked for a hint first or made an attempt first. This paper’s argument is that this information is very important.

ASSISTments records all the details about how a student does his or her homework and tests, from which scientists can get valuable material to investigate students’ behavior and their learning process. These records include the start time and end time of a problem, the time interval between attempts, whether he or she asks for a hint, the number of attempts a student makes, the number of hints a student asks for, as well as the answer and result for each attempt a student makes.

Figure 2 shows an example of a detailed sequence of actions recorded by the system. A row in blue means that the answer is correct, a row in red means that the answer is wrong, and a row in orange means the student asked for a hint. We can see that this student answered correctly on his first attempt for the first problem, PRAQM5U. The sequence of action is ‘a’ (‘a’ represents an attempt). For the second problem, PRAQM2W, he asked for three hints in a row before giving the correct answer. The sequence of action is ‘hhha’ (‘h’ represents a hint). For the third problem, PRAQM2F, he alternately asked for hints and made attempts, and the sequence of action is ‘hahaha’. For the last problem, PRAQZPN, he made one wrong attempt before giving the correct answer, and its action sequence is ‘aa’.

Figure 2. Students’ action records in ASSISTments

We used data from one Mastery Learning class. Mastery Learning is a strategy that requires students to continually work on a problem set until they have achieved a preset criterion (typically three consecutive correct answers). Questions in each problem set are generated randomly from several templates, and there is no problem-selection algorithm used to choose the next question.

Sixty-six 12-14 year-old, 8th grade students participated in these classes and generated 34,973 problem logs. We only used data from a problem set for a given student if they had reached the mastery criterion. This data was collected in a suburban middle school in central Massachusetts. Students worked on these problems in a special “math lab” period, which was held in addition to their normal math class.

If a problem has only one hint, the hint is the answer to the problem and is called the bottom hint. After a student asks for a bottom hint, any other attempt is meaningless because he or she already knows the answer. In the experiment, we only consider the problem logs that have at least two hints. The answer is marked as incorrect if the student asks for a hint or if the first attempt is incorrect. Moreover, we excluded problem logs where: 1) students quit the system immediately after they saw the question and the action logs were blank, or 2) students requested hints but did not make any attempts, and no answer was recorded.

Here we only consider question pairs that have the same skill; skills having only one question were removed because they do not help in predicting. Questions of the same skill were sorted by start time in ASSISTments. We split the 66 students equally into six groups of 11 students each to run 6-fold cross validation. We trained the SOA model and the KT model on the data from five of the groups and then computed the prediction accuracy on the sixth group. We did this for all six groups.

2. INDIVIDUAL MODELS
2.1 KT
Knowledge Tracing (KT) is one of the most common methods used to model the process of students’ knowledge gain and to predict students’ performance. The KT model is a Hidden Markov Model (HMM) with a hidden node (student
knowledge node) and an observed node (student performance node). It assumes that a skill has four parameters: two knowledge parameters and two performance parameters. The two knowledge parameters are prior and learn. The prior knowledge parameter is the probability that a particular skill was known by the student before interacting with the tutor. The learn parameter is the probability that a student transitions from the unlearned state to the learned state after each learning opportunity, i.e., after seeing a question. The two performance parameters are guess and slip. Guess is the probability that a student will guess the answer correctly even if the skill associated with the question is in the unlearned state. Slip is the probability that a student will answer incorrectly even if he or she has mastered the skill for that question.

The goal of KT is to estimate the student's knowledge from his or her observed actions. At each successive opportunity to apply a skill, KT updates its estimated probability that the student knows the skill, based on the skill-specific learning and performance parameters and the observed student performance (evidence). It is able to capture the temporal nature of data produced where student knowledge is changing over time. KT provides both the ability to predict future student response values and the different states of student knowledge. For this reason, KT provides insight that makes it useful beyond the scope of simple response prediction.

2.2 Assistance Model
Motivated by the intuition that students who need more assistance have a lower probability of possessing the knowledge, Wang and Heffernan (2011) built a purely data-driven “Assistance” model to discover the relationship between assistance information and students’ knowledge.

A parameter table was built in which rows represent the number of attempts a student required in the previous question and columns represent the number of hints the student asked for. Each cell contains the probability that the student will answer the current question correctly. The attempts are separated into three bins: one attempt, a small number of attempts (2-5), and a large number of attempts (more than five). Hints are separated into four bins: no hint, a small number of hints (1, 50%], a large number of hints [50%, 100%), and all hints, where the student asked for all of the hints. Table 1 shows the parameter table obtained from our dataset. As with Wang and Heffernan’s experimental results, the parameter table confirms that students requiring more assistance to solve a problem probably have less corresponding knowledge.

Table 1. Assistance Model parameter table, average across six folds

                     attempt = 1    attempt 2-5    attempt > 5
hint_percent = 0       0.8410         0.7963         0.7808

and some students kept trying many times. Some students asked for hints and made attempts alternately, and we believe they were learning by themselves. In the data, there are 217 different sequences of actions. Intuitively, students’ actions reflect their study attitude, and this determines their performance. Based on the assumption that students who make more attempts tend to master knowledge better than students who ask for more hints, we divided the sequences into five categories or bins: (1) One Attempt: the student correctly answered the question after one attempt; (2) All Attempts: the student made many attempts before finally getting the question correct; (3) All Hints: the student only asked for hints, without any attempts, before entering the correct answer; (4) Alternative, Attempt First: the student asked for hints and made attempts alternately, starting with an attempt; and (5) Alternative, Hint First: the student asked for hints and made attempts alternately, starting with a hint. Table 2 shows the division and some examples of the action sequences in each category.

Table 2. Sequence of Action Category and Examples

Sequence of Action Category/Bin Name       Examples
One Attempt/Bin ‘a’                        a
All Attempts/Bin ‘a+’                      aa, aaa, …, aaaaaaaaaaaa
All Hints/Bin ‘h+’                         ha, hha, …, hhhhhhha
Alternative, Attempt First/Bin ‘a-mix’     aha, aahaaha, …, aahhhhaaa
Alternative, Hint First/Bin ‘h-mix’        haa, haha, …, hhhhaha

Notice that each sequence ends with an attempt because in ASSISTments a student cannot continue to the next question unless he or she fills in the right answer to the current problem. In Table 2, ‘a’ stands for answer and ‘h’ stands for hint. An action sequence “ahha” means that a student makes an attempt and then asks for two hints before he or she types the correct answer and moves on to the next question.

2.3.1 Sequence of Action Tabling
After dividing all of the sequences of actions into five categories, we use a Tabling method, which gets the next percent correct directly from the training data. For each fold, one table is generated by the tabling method by counting, for each bin, the total number of appearances and the number of times the next question was answered correctly. After counting, a next correct percent is calculated by dividing Next Correct Count by Total Count of Bin.

Table 2. Next correct percent table of training group of fold 1

Bin Name    Total Count    Next Correct Count    Next Correct Percent
0
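The binning into five categories and the tabling step described in Section 2.3.1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `soa_bin` and `build_table`, the bin-assignment rule (inferred from the examples in the Sequence of Action category table), and the toy data are all assumptions made for this example.

```python
from collections import defaultdict


def soa_bin(sequence: str) -> str:
    """Map an action sequence such as 'hha' to one of the five bins.

    Assumed rule, inferred from the category examples: the last action is
    always the final correct attempt, so the bin depends on what came before.
    """
    if not sequence or set(sequence) - {"a", "h"} or not sequence.endswith("a"):
        raise ValueError(f"not a valid action sequence: {sequence!r}")
    before_final = sequence[:-1]
    if before_final == "":
        return "a"        # One Attempt
    if "h" not in before_final:
        return "a+"       # All Attempts
    if "a" not in before_final:
        return "h+"       # All Hints
    # Mixed hints and attempts: bin by the first action.
    return "a-mix" if sequence[0] == "a" else "h-mix"


def build_table(pairs):
    """Tabling: per bin, (total count, next correct count, next correct percent).

    `pairs` is an iterable of (previous_sequence, next_correct) where
    next_correct is 1 if the following question was answered correctly, else 0.
    """
    total = defaultdict(int)
    correct = defaultdict(int)
    for sequence, next_correct in pairs:
        bin_name = soa_bin(sequence)
        total[bin_name] += 1
        correct[bin_name] += int(next_correct)
    return {b: (total[b], correct[b], correct[b] / total[b]) for b in total}


# Hypothetical toy training data, not taken from the ASSISTments dataset.
toy_pairs = [("a", 1), ("a", 1), ("a", 0), ("aa", 1),
             ("hha", 0), ("aha", 1), ("haha", 0), ("haa", 1)]
table = build_table(toy_pairs)
```

In this sketch, a prediction for a held-out question pair would simply look up the next correct percent of the bin that the previous question's action sequence falls into.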