=Paper=
{{Paper
|id=Vol-1137/LA_machinelearning_submission_4
|storemode=property
|title=Modelling student online behaviour in a virtual learning environment
|pdfUrl=https://ceur-ws.org/Vol-1137/LA_machinelearning_submission_4.pdf
|volume=Vol-1137
|dblpUrl=https://dblp.org/rec/conf/lak/HlostaHVKZW14
}}
==Modelling student online behaviour in a virtual learning environment==
<pdf width="1500px">https://ceur-ws.org/Vol-1137/LA_machinelearning_submission_4.pdf</pdf>
<pre>
Modelling student online behaviour in a virtual learning environment

Martin Hlosta†, Drahomira Herrmannova†, Lucie Vachova††, Jakub Kuzilek†, Zdenek Zdrahal†, Annika Wolff†

Knowledge Media Institute, The Open University†
Walton Hall
Milton Keynes, MK7 6AA
{martin.hlosta; d.herrmannova; jakub.kuzilek; z.zdrahal; annika.wolff}@open.ac.uk

University of Economics, Prague†† ∗
Department of Exact Methods, Faculty of Management
Jarosovska 1117/II
Jindrichuv Hradec, 377 01
vachova@fm.vse.cz

ABSTRACT
In recent years, distance education has enjoyed a major boom. Much work at The Open University
(OU) has focused on improving retention rates in its modules by providing timely support to
students who are at risk of failing the module. In this paper we explore two methods for
analysing student activity in an online virtual learning environment (VLE), the General Unary
Hypotheses Automaton (GUHA) and Markov chain-based analysis, and we explain how this analysis
can be relevant for module tutors and other student support staff. We show that both methods
are a valid approach to modelling student activities. An advantage of the Markov chain-based
approach lies in its graphical output and in the possibility of modelling time dependencies of
the student activities.

Categories and Subject Descriptors
D.4.8 [Performance]: Modelling and Prediction;
H.2.8 [Database Applications]: Data Mining

General Terms
Algorithms, Design, Experimentation, Human Factors

Keywords
Student Data, Distance Learning, Predictive Models, Machine Learning, Information Visualisation

∗ This work was carried out at the Knowledge Media Institute.

Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed for
profit or commercial advantage and that copies bear this notice and the full citation on the
first page. To copy otherwise, to republish, to post on servers or to redistribute to lists,
requires prior specific permission and/or a fee. Request permissions from Permissions@acm.org.
LAK ’14, March 24 – 28 2014, Indianapolis, IN, USA
Copyright 2014 ACM 978-1-4503-2664-3/14/03 ...$15.00.

1.  INTRODUCTION
  Recent years have seen massive growth in the range of online education options, such as the
well-known massive open online courses (MOOCs) [Cormier,2008]. The concept of distance
education is, however, not new. The Open University is an institution with over forty years of
experience with distance education, historically based on off-line materials and nowadays
making increasing use of the Internet. The great advantage of online courses is that they are
accessible to virtually anyone with Internet access.
  The other side of the coin is that the retention rates in these courses are often low.
[Koller et al.,2013] mention that the average retention rate of a Coursera(1) course is around
5%. The situation at traditional universities, as well as at The Open University, is
significantly better; however, there is still room for improvement.
  There might be many reasons for the low retention rates, from the fact that online courses
are often offered to anybody interested to the fact that the performance of each student
depends almost exclusively on how much they are willing to study on their own at home. Our
work at The Open University aims at analysing students’ activities in the online courses in
order to gain insight into their behavioural patterns, which can be utilised for building
prediction models.

(1) https://www.coursera.org/, a well known and one of the biggest platforms providing open
    online courses.

1.1  Problem Description
  The Open University(2) is the biggest university in the United Kingdom, offering several
hundred distance learning modules, which can be studied either as standalone modules or as
part of a university degree. Anybody can sign up for a module provided by The OU, without any
previous education whatsoever. The students receive their study material and submit their
assignments through an online virtual learning environment.
  Students participating in a module are generally split into smaller study groups of no more
than a few tens of students, typically according to their geographic location. Each group has
an assigned tutor. The tutors grade the students’ assignments and exams, answer their
questions in the online forums, provide general advice and guidance, etc.
  In order to support the students who are at risk of failing the module, The OU also
implements various interventions (such as phone calls from specialised student support teams)
during the course of the module. Because the number of students studying each module can reach
several thousand (the modules used in our analysis have an enrolment of around two thousand
students) and the resources available for the interventions are limited, the interventions
have to be carefully planned. Therefore, an important question one might ask is how to
identify students at risk of failing the module so that the intervention is meaningful and
efficient.
  Improving student retention through these focused interventions and helping the tutors to
focus on the students who require help provides many benefits, from improved student
satisfaction to financial savings for the university. In order to support the identification
of students who are currently at risk, we utilise several statistical and machine learning
methods. The available data contain both information about the students’ activity in the VLE
and their demographic information. However, for modelling student behaviour, only the data
from the VLE were used. A more detailed description of our data set can be found in Section 3.

(2) http://www.open.ac.uk/

2.  PREVIOUS WORK
  The current work builds on previous research done at The Open University. The initial
experiments with machine learning techniques used the VLE and assessment data [Wolff and
Zdrahal,2012]. One of the main findings of this research was that decision trees generally
outperform the other methods [Wolff and Zdrahal,2012]. This research also included the
creation of a dashboard providing the university staff with real-time information about
student performance.
  Additional methods were tested in [Wolff et al.,2013b] and in [Wolff et al.,2013a]. In the
latter work, demographic data were added to the predictions; however, this research did not
confirm that these data provide a significant increase in performance.

3.  DATA SPECIFICATION
  The analyses were performed on real data from several modules from The Open University. We
examined a number of subsequent presentations of each module.
  The available data contain two types of information:

     • Information about the results of student assignments (TMAs, tutor marked assignments).
       There are several assignments in each module, typically between five and seven.
       Generally, the module ends with a final exam.

     • Data about student activity from the virtual learning environment (VLE).

  The VLE data are aggregated by days and content type (e.g. forum, wiki, resource, ...). This
means that for each day we know how many times the student interacted with a given content
type. For our analysis we summarise the data by weeks and content types. Summarising the data
by weeks seemed reasonable, as it simplifies our analyses without losing too much detail. The
features generated by the summarisation and used in the methods are:

     • click counts aggregated by week,

     • click counts aggregated by week and content type,

     • binary flags indicating whether the student was active in the VLE and in the various
       content types.

4.  METHODS
  For the analysis of student behaviour in the virtual learning environment, we used two
different approaches: GUHA [Hájek et al.,1966] and modelling based on Markov chains
[Norris,1998].

4.1  Activity types analysis
  As mentioned in Section 3, the VLE data contain information about the type of content the
student accessed. The content type can be, for example, forum, wiki, resource or quiz. In
total there are 11 different content types. Using the binary flags indicating whether the
student was active in a given week and content type, we applied Bayes’ theorem [Bishop and
Nasrabadi,2006] to determine the probability that the student will fail to complete the
module. Moreover, we analysed each of the content types in terms of the mean number of
students succeeding in the module based on activity or inactivity in the given content type.
Based on this investigation we selected a set of content types which significantly influence
students’ performance; these were then used in the further analyses.

4.2  GUHA
  General Unary Hypotheses Automaton (GUHA), originally published in [Hájek et al.,1966], is
one of the oldest data mining methods for automatic discovery of new interesting hypotheses
from the data. To achieve this goal, GUHA uses various procedures (ASSOC, IMPL, CORREL). The
choice of procedure depends mainly on the user’s needs and experience. We selected the ASSOC
procedure [Rauch and Simunek,2001], which allows the discovery of interesting associations
between attributes in the data. The interestingness of an association is mostly based on the
co-occurrence of the attributes. The ASSOC procedure allows the resulting rules to be limited
by specifying constraints on the attributes. This property is important for our field of
interest.
  For our research, we used the ASSOC procedure that is implemented in the 4ft-Miner module
within the LISp-Miner software tool(3). The specification of the constraints enabled us to
restrict the rules only to those which cover students that fail or succeed in the TMA. We used
the three basic types of features introduced in Section 3 for the analysis.
  The search space for both types of binary flags was small enough to analyse directly. For
weekly aggregated counts, it was necessary to reduce the search space via interval
discretisation. For this purpose, we utilised LISp-Miner and unsupervised equal frequency
discretisation [Wong and Chiu,1987].
  The GUHA method produces a large set of hypotheses. An example of such results with the
categorised binary flags is depicted in Figure 1. Unfortunately, the information contained in
the output is difficult to interpret. Moreover, information about the time dimension is lost,
and this problem is even worse when using various content types. For us, this was the
motivation to look for another modelling method.

(3) LISp-Miner (lispminer.vse.cz/), a software tool implementing the GUHA method.

4.3  Markov chain-based analysis of student activity
Figure 1: Screenshot with discovered rules from LISp-Miner.

  In this part of the analysis we examined the differences in the intensity of student
activity between students who were successful in the first TMA (TMA 1) and those who did not
submit TMA 1. Students who failed TMA 1 are not included in this analysis, because they
represent only a small portion of the students who submitted TMA 1. At this stage of the
research we analysed only students who had zero VLE activity in at least one of the weeks
under consideration. Students who are passive in the VLE form a specific group that is
important when looking for potentially at-risk students. For this group we studied different
scenarios, which are listed in Table 1. Table 1 also shows the percentage of students who
behaved according to a given scenario and were successful in TMA 1, in comparison to those
who did not submit TMA 1. The numbers in the Scenario column denote particular weeks of the
course.

Table 1: Summary of examined situations for students’ behaviour evaluation (data in % of total
number of course students)

       Scenario                                            TMA 1            TMA 1
                                                           not submitted    success
   1.  zero in any of 0 - 4                                   51.6            42.6
   2.  zero only in 1 - 4                                     95.7             4.3
   3.  zero only in 2 - 4                                     92.3             7.7
   4.  zero only in 3 - 4                                    100.0             0.0
   5.  zero only in 4                                         54.3            45.7
   6.  zero only in 0                                         15.4            71.8
   7.  zero only in 0 - 1                                      6.7            86.7
   8.  zero only in 0 - 2                                     15.4            69.2
   9.  zero only in 0 - 3                                     57.1            42.9
  10.  zero in at least one of 0 - 3, non-zero in 4           18.2            71.6
  11.  zero in at least one of 0 - 2, non-zero in 3 - 4       14.6            75.1
  12.  zero in at least one of 0 - 1, non-zero in 2 - 4        9.3            80.4

  Based on the data in Table 1, we can identify the behaviour of at-risk students. This is
especially evident in scenarios 2, 3, 4 and 7. Students who tend to reach zero VLE activity in
later weeks are more likely not to submit (scenarios 2, 3, 4). On the other hand, those who
have zero VLE activity in earlier weeks and later start to show interest, represented by VLE
activity, raise their chance of success in TMA 1 (scenarios 6, 7, 8).
  Figures 2 and 3 show scenario 3 in more detail (with TMA 1 not submitted and TMA 1 passed,
respectively). The colour tones of the arrows range from white to red depending on the
percentage of students who moved in the given direction: the redder the arrow, the larger the
percentage of students it represents. The rows represent the weeks in which VLE activity was
measured. The first row shows activity before the beginning of the course (week 0), and the
other four rows capture the VLE activity in weeks 1, 2, 3 and 4 respectively. The columns
represent intervals of VLE activity, and a different node colour means a different interval.
The first column is zero VLE activity, while in the other columns the activity is divided into
intervals with cut points at multiples of 30.

Figure 2: Representation of students’ behaviour in scenario 3 (TMA 1 was not submitted)

Figure 3: Representation of students’ behaviour in scenario 3 (TMA 1 was passed)

  This type of analysis enables us to look for specific patterns in students’ behaviour. In a
similar way we can analyse the different types of VLE activities, as shown in Figure 4. In
this case the nodes do not represent the intensity of student activity but capture the
student’s interest in a specific content type. Unlike the previous two figures, this directed
acyclic graph as a whole depicts the Markov chain [Norris,1998].

5.  CONCLUSION
  In this paper we have examined two methods for analysing the activity of students in the
online virtual learning environment before the first tutor marked assignment: GUHA and Markov
chain-based graphical models. Both methods provide useful insights into the students’
behaviour during their studies. The benefit of the latter lies mostly in its graphical output,
which might be easier to interpret and could potentially provide support in planning
interventions, and in the possibility of modelling time dependencies of the student
activities. We believe that understanding the student behavioural patterns will also be useful
for building better predictive models of student performance.

Figure 4: Markov chain for the various activity types combinations in weeks 0-5.

References

Christopher M Bishop and Nasser M Nasrabadi. 2006. Pattern recognition and machine learning,
    volume 1. Springer New York.

Dave Cormier. 2008. The CCK08 MOOC–connectivism course, 1/4 way. Dave’s Educational Blog, 2.

Petr Hájek, I. Havel, and Michal Chytil. 1966. The GUHA method of automatic hypotheses
    determination. Computing, 1(4):293–308.

Daphne Koller, Andrew Ng, Chuong Do, and Zhenghao Chen. 2013. Retention and intention in
    massive open online courses: In depth. EDUCAUSE, June.

James R Norris. 1998. Markov chains. Number 2008 in Cambridge series in statistical and
    probabilistic mathematics. Cambridge University Press.

Jan Rauch and Milan Simunek. 2001. Mining for association rules by 4ft-miner. In INAP, pages
    285–295.

Annika Wolff and Zdenek Zdrahal. 2012. Improving retention by identifying and supporting
    ”at-risk” students. EDUCAUSE Review Online, July/Summer.

Annika Wolff, Zdenek Zdrahal, Drahomira Herrmannova, and Petr Knoth. 2013a. Predicting student
    performance from combined data sources. In Alejandro Peña-Ayala, editor, Educational Data
    Mining: Applications and Trends, number 524 in Studies in Computational Intelligence,
    pages 175–202. Springer International Publishing, Cham.

Annika Wolff, Zdenek Zdrahal, Andriy Nikolov, and Michal Pantucek. 2013b. Improving retention:
    predicting at-risk students by analysing clicking behaviour in a virtual learning
    environment. In Third Conference on Learning Analytics and Knowledge (LAK 2013). ISBN
    978-1-4503-1785-6.

Andrew K. C. Wong and David K. Y. Chiu. 1987. Synthesizing statistical knowledge from
    incomplete mixed-mode data. IEEE Trans. Pattern Anal. Mach. Intell., 9(6):796–805, June.
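
Illustrative sketch for Sections 3 and 4.1. The weekly summarisation of the VLE data and the
use of Bayes' theorem on a binary activity flag can be illustrated with a short Python
example. This is only a minimal sketch under stated assumptions, not the implementation used
in the paper: the row layout (student, day, content type, clicks), the convention that day 0
opens week 0, the function names and the toy data are all hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def weekly_features(daily_clicks):
    """Sum daily click rows into per-week and per-week-and-content-type counts.

    daily_clicks: iterable of (student, day, content_type, clicks) rows.
    Assumption: day 0 is the first day of week 0, so week = day // 7.
    """
    per_week = defaultdict(int)
    per_week_type = defaultdict(int)
    for student, day, content_type, clicks in daily_clicks:
        week = day // 7
        per_week[(student, week)] += clicks
        per_week_type[(student, week, content_type)] += clicks
    return dict(per_week), dict(per_week_type)


def was_active(counts, key):
    """Binary flag: True when the aggregated click count for key is non-zero."""
    return counts.get(key, 0) > 0


def p_fail_given_flag(flags, failed, flag_value):
    """Bayes' theorem: P(fail | flag) = P(flag | fail) * P(fail) / P(flag).

    flags and failed are dicts keyed by student. Returns None when the
    probability is undefined (no failing students or no student with the flag).
    """
    students = list(flags)
    n_fail = sum(1 for s in students if failed[s])
    n_flag = sum(1 for s in students if flags[s] == flag_value)
    n_both = sum(1 for s in students if failed[s] and flags[s] == flag_value)
    if n_fail == 0 or n_flag == 0:
        return None
    prior = Fraction(n_fail, len(students))      # P(fail)
    likelihood = Fraction(n_both, n_fail)        # P(flag | fail)
    evidence = Fraction(n_flag, len(students))   # P(flag)
    return likelihood * prior / evidence


# Hypothetical toy data: six students, clicks on days 0-10, made-up outcomes.
daily_clicks = [
    ("s1", 0, "forum", 3), ("s1", 8, "quiz", 2),
    ("s2", 1, "resource", 5),
    ("s3", 9, "forum", 1),
    ("s4", 2, "wiki", 4),
    ("s5", 3, "forum", 2),
    ("s6", 10, "quiz", 1),
]
failed = {"s1": False, "s2": True, "s3": False, "s4": True, "s5": False, "s6": True}

per_week, per_week_type = weekly_features(daily_clicks)
flags_week1 = {s: was_active(per_week, (s, 1)) for s in failed}
print(p_fail_given_flag(flags_week1, failed, False))  # inactive in week 1
print(p_fail_given_flag(flags_week1, failed, True))   # active in week 1
```

Exact fractions are used so that the three terms of Bayes' theorem stay visible and the result
is not blurred by floating point rounding. The same function could be applied to a flag built
from per_week_type to compare individual content types.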

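Illustrative sketch for Section 4.3. The week-to-week movement of students between activity
intervals, which Figures 2 and 3 depict as coloured arrows, can be approximated by counting
transitions between consecutive weeks. The sketch below is an assumption-laden illustration,
not the authors' procedure: state 0 is zero activity and the remaining states use cut points
at multiples of 30 as described in the text, but the closed upper bound of each interval, the
normalisation per source node, the function names and the toy click counts are hypothetical.

```python
from collections import Counter, defaultdict


def activity_state(clicks, width=30):
    """Map a weekly click count to an interval index.

    State 0 means zero activity. State k (k >= 1) covers counts in
    ((k - 1) * width, k * width]; the closed upper bound is an assumption.
    """
    if clicks <= 0:
        return 0
    return (clicks - 1) // width + 1


def weekly_transitions(weekly_clicks):
    """Estimate P(state in week w + 1 | state in week w) separately for each week w.

    weekly_clicks: dict mapping student -> list of weekly click counts (week 0 first).
    Returns dict week -> dict source_state -> dict target_state -> probability.
    """
    counts = defaultdict(Counter)
    for clicks in weekly_clicks.values():
        states = [activity_state(c) for c in clicks]
        # Pair each week's state with the following week's state.
        for week, (src, dst) in enumerate(zip(states, states[1:])):
            counts[(week, src)][dst] += 1
    result = {}
    for (week, src), dst_counts in counts.items():
        total = sum(dst_counts.values())
        result.setdefault(week, {})[src] = {
            dst: n / total for dst, n in dst_counts.items()
        }
    return result


# Hypothetical weekly click counts for weeks 0-4.
weekly_clicks = {
    "s1": [0, 12, 45, 31, 0],
    "s2": [0, 0, 20, 60, 75],
    "s3": [5, 30, 0, 0, 0],
    "s4": [0, 40, 10, 0, 0],
}
transitions = weekly_transitions(weekly_clicks)
for week in sorted(transitions):
    print(week, transitions[week])
```

Estimating a separate transition table for each week mirrors the row-per-week layout of the
figures. Pooling all weeks into a single table would instead give a time-homogeneous chain,
which is a different modelling choice.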
</pre>