Inferring Knowledge Acquisition through Web Navigation Behaviour

He Yu
School of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL, UK
he.yu@postgrad.manchester.ac.uk

Abstract. As online learning grows in popularity, we address the problem of knowing whether users are actually learning. Traditional assessment approaches involve tests, assignments and peer assessments. We explore whether there is a way to measure learning and personalise the user's learning experience in an unobtrusive manner. My PhD proposes using data-driven methods to measure learning by mining user interaction data to identify regularities that could be indicators of learning.

Keywords: Navigation behaviour · Knowledge acquisition · MOOC.

1 Introduction

MOOCs were first introduced in 2006 and quickly gained popularity in 2012 [8]. There are many well-known providers, such as Coursera, Udacity and edX; in total, MOOCs had 81 million registered learners [3]. While each platform has its own structure and style, they can generally be divided into two categories: cMOOCs and xMOOCs. These terms were first proposed by Stephen Downes: an xMOOC resembles a traditional course, while a cMOOC focuses on generating knowledge through communities [9].

The goal of this project is to examine the hypothesis that knowledge acquisition and learning can be inferred from users' navigational behaviour. These interactive behaviours can be low-level events, such as a mouse click, or high-level events, such as searching for certain keywords. Both low-level and high-level events will be included in the investigation, as it has been shown that context may be spread across multiple events, so the composition of these events needs to be taken into consideration to find appropriate interpretations [5] (a minimal sketch of such event composition is given at the end of Section 2).

2 Motivation

Measuring learning and its properties has always been an important practice in education. Traditional methods include modelling students automatically using an assessment system, such as knowledge tracing [14], knowledge assessment [13], peer review [4] or endorser review [6], or using automatic analysis tools such as adaptive assessment functionality [11]. These systems are convenient because modelling is performed automatically, but they lack the personalisation that suits the needs of individual learners [10], such as content customisation based on individual goals and a personal trajectory that optimises the learning process. Using Web navigation behaviour to assess learning can take such individuality into account, for example different levels of expertise.

Another motivation for using Web navigation behaviour to measure properties of learning is to provide context to assessments. Additional information such as engagement, material coverage and self-efficacy can potentially offer context and explanation for traditional assessment results. Work has been done on MOOC platforms to measure properties of learning, for example using various indicators to measure engagement [2] and using Bayesian networks to predict dropout [7].

Our research will be conducted on the MOVING platform, which is a cMOOC platform (see Fig. 1).

Fig. 1. A screenshot of the MOVING MOOC.

The MOVING platform will measure properties of learning through the data generated by the following four sources; these measurements will serve as reference measures in our investigation:

– The adaptive training support widget, which displays charts about the usage of the different features of the platform.
– Self-assessment data from user answers to questions provided by the adaptive training support, together with their written feedback.
– The user's prior knowledge level, according to a self-reported prior knowledge assessment.
– Progress data from the curriculum progress widget, which shows the status of the user's progress in the entire curriculum.
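Returning to the event composition discussed in Section 1, the following minimal sketch shows how low-level interface events might be composed into a single high-level action. The event schema, the element names and the five-second gap threshold are illustrative assumptions, not the MOVING platform's actual logging format.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical event record; the real log schema may differ.
@dataclass
class Event:
    user_id: str
    timestamp: float   # seconds since session start
    kind: str          # e.g. "click", "keypress", "submit"
    target: str        # UI element the event was dispatched on

def compose_search_actions(events: List[Event], max_gap: float = 5.0) -> int:
    """Count high-level 'search' actions: a run of keypresses on the
    search box followed by a submit, with no gap longer than max_gap."""
    searches = 0
    typing = False
    last_ts = None
    for e in sorted(events, key=lambda e: e.timestamp):
        if last_ts is not None and e.timestamp - last_ts > max_gap:
            typing = False          # interaction context was broken
        if e.kind == "keypress" and e.target == "search-box":
            typing = True
        elif e.kind == "submit" and e.target == "search-form" and typing:
            searches += 1           # keypress run + submit = one search
            typing = False
        last_ts = e.timestamp
    return searches

if __name__ == "__main__":
    log = [Event("u1", 0.0, "keypress", "search-box"),
           Event("u1", 1.2, "keypress", "search-box"),
           Event("u1", 2.0, "submit", "search-form"),
           Event("u1", 30.0, "click", "video-play")]
    print(compose_search_actions(log))  # -> 1
```

In a pipeline of this kind, such composed actions, rather than raw clicks, would be the units over which navigational patterns are mined.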
3 Research Challenges

To investigate the relations between Web navigation behaviour and learning, three research questions are identified:

RQ1: What are the traditional assessment methods on MOOC platforms, and what properties of learning can we investigate?

We explore the current state-of-the-art assessment methods, how they are used and their advantages/disadvantages. We will discuss the challenges of these methods in MOOC settings. We will also investigate the properties that are associated with learning, for example the level of engagement or participation; in particular, how these properties indicate learning and how they can be measured. This research question will be approached by conducting a literature review on assessments and properties of learning.

RQ2: Can we use Web navigation behaviour to measure these properties of learning?

We explore whether a user's interactive navigational behaviour can be used as a reliable and effective way to measure their knowledge acquisition. Two approaches will be taken to investigate this research question. The first is literature-driven: potential interactive behaviours that are connected to learning in general can be extracted from the related literature. The second is data-driven: we identify frequent navigational behaviours/patterns using pattern-mining algorithms (a minimal sketch of this idea is given at the end of this section).

RQ3: How do we evaluate our findings?

If correlations are discovered between navigation behaviour and properties of learning, how can we evaluate these findings? One possibility is to analyse longitudinal data and compare the results with our earlier findings; another is to gather data from other platforms to test whether the correlations generalise.
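As a minimal illustration of the data-driven approach in RQ2, the following sketch counts frequent contiguous subsequences of event labels across sessions. It is a deliberately simplified stand-in for full sequential pattern-mining algorithms such as PrefixSpan; the session data and event labels are invented for illustration.

```python
from collections import Counter

def frequent_patterns(sessions, n=2, min_support=2):
    """Count contiguous length-n subsequences (n-grams) of event labels
    across sessions and keep those meeting min_support, where support is
    the number of sessions in which the pattern occurs at least once."""
    counts = Counter()
    for seq in sessions:
        seen = set()  # count each pattern at most once per session
        for i in range(len(seq) - n + 1):
            seen.add(tuple(seq[i:i + n]))
        counts.update(seen)
    return {p: c for p, c in counts.items() if c >= min_support}

# Hypothetical event labels; real sequences would come from platform logs.
sessions = [
    ["video", "quiz", "forum", "video", "quiz"],
    ["search", "video", "quiz", "forum"],
    ["video", "quiz", "search"],
]
print(frequent_patterns(sessions, n=2, min_support=2))
# -> {('video', 'quiz'): 3, ('quiz', 'forum'): 2}
```

Patterns surfaced this way would then be candidate behavioural indicators to be tested against the reference measures described in Section 2.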
4 Contribution

The main contributions of this project will be in the areas of learning analytics and human-computer interaction. The project will collect learner-generated interactive weblogs, perform analysis and use the reference measures described above to discover connections that can predict learning, which will be a contribution to learning analytics [12]. It focuses on the interaction between users and an online learning platform, which suggests contributions to human-computer interaction.

Using Web navigation behaviour to measure learning can complement assessments in many aspects, for example:

– Unobtrusiveness. Traditional assessments can be used to motivate students; however, in certain online learning settings they can pose additional problems, such as a higher dropout rate. It has been noted that assessments become a problem when they are obtrusive, and learners can be reluctant towards them [1]. Assessing users through their automatically generated data is less disruptive to their learning process and can be a significant improvement to current assessment methods and online learning platforms.
– Automatic analysis. The implementation of efficient assessments is still problematic despite advancements in learning technology [15], while our approach can be much more efficient compared to other assessment methods. Monitoring knowledge acquisition using Web behaviour can be achieved automatically with the users' data. With a reliable model, analysis and feedback could be produced automatically as well.
– Context. Measuring the properties of learning can provide more context for traditional assessment methods. It can be used to interpret and explain test results, to demonstrate individual strengths/weaknesses and to generate appropriate assessments.

The project can also contribute to the adaptation of online learning platforms. Using automatically generated data and analysis, feedback can be provided automatically. With this feedback, personalised guidance or even personalised learning content may be provided, opening up further possibilities.

4.1 Future Work

Our next goal is to find correlates between Web navigation patterns and knowledge acquisition. This will be approached from both a data-driven and a hypothesis-driven angle. However, in the event that the MOVING platform does not contain enough users to carry out the study, a separate study with recruited participants may be needed to continue the project. Other options include acquiring and utilising interaction data from other MOOC platforms that contain more active users.

Statistical analysis, such as regression analysis, will be performed between knowledge acquisition and Web navigation behaviour to identify correlations between different metrics and behaviours (a minimal sketch is given at the end of this subsection); establishing causality would require further evidence, for example from longitudinal or controlled studies. If no such connections are found, it is possible to investigate other metrics against navigational behaviours, such as the level of engagement, the effectiveness of the course and dropout behaviour.

The next step is evaluating the models generated from the statistical analyses. For iterative/exploratory testing purposes, small-scale studies will be designed to evaluate the models generated in the previous stage. For confirmatory testing purposes, the models will be examined on longitudinal data to see how they scale.

The final stage is designing interventions. If knowledge acquisition can be identified over time, the remaining questions are: how to support those who might be struggling? How to encourage engagement? How to speed up the learning process? Is there any way of delivering interventions without being too intrusive?
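As a minimal sketch of the planned statistical analysis, the following example regresses a hypothetical knowledge-gain score on a single behavioural feature using SciPy's linregress. The feature, the score and all values are invented for illustration; a real analysis would use features extracted from the platform's logs and the reference measures of Section 2, and would need to account for confounds before drawing any causal conclusions.

```python
import numpy as np
from scipy import stats

# Hypothetical, made-up data: one behavioural feature per learner
# (e.g. number of high-level search actions in a course) and a
# knowledge-gain score (post-test minus pre-test).
searches = np.array([2, 5, 1, 8, 4, 7, 3, 6])
gain     = np.array([0.1, 0.4, 0.0, 0.7, 0.3, 0.5, 0.2, 0.6])

# Simple linear regression: gain ~ searches
result = stats.linregress(searches, gain)
print(f"slope={result.slope:.3f}, "
      f"r={result.rvalue:.3f}, p={result.pvalue:.4f}")
```

A significant slope here would only indicate an association; the longitudinal and small-scale studies described above are what would be needed to probe whether the behaviour actually drives knowledge acquisition.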
5 Acknowledgments

This work was supported by the EU's Horizon 2020 programme under grant agreement H2020-693092 MOVING (http://moving-project.eu).

References

1. Apostolos K.: Multilitteratus Incognitus: Allergic to assessment or measurement? (2011), http://idstuff.blogspot.com/2011/10/allergic-to-assessment-or-measurement.html
2. Bote-Lorenzo, M.L., Gómez-Sánchez, E.: Predicting the decrease of engagement indicators in a MOOC. In: Proceedings of the Seventh International Learning Analytics & Knowledge Conference - LAK '17. pp. 143–147. ACM Press, New York, New York, USA (2017). https://doi.org/10.1145/3027385.3027387
3. Dhawal Shah: By The Numbers: MOOCs in 2017. Class Central (2018), https://www.class-central.com/report/mooc-stats-2017/
4. Gamage, D., Whiting, M.E., Rajapakshe, T., Thilakarathne, H., Perera, I., Fernando, S.: Improving Assessment on MOOCs Through Peer Identification and Aligned Incentives. In: Proceedings of the Fourth (2017) ACM Conference on Learning @ Scale - L@S '17. pp. 315–318. ACM Press, New York, New York, USA (2017). https://doi.org/10.1145/3051457.3054013
5. Hilbert, D.M., Redmiles, D.F.: Extracting usability information from user interface events. ACM Computing Surveys 32(4), 384–421 (Dec 2000). https://doi.org/10.1145/371578.371593
6. Kay, J.S., Nolan, T.J., Grello, T.M.: The Distributed Esteemed Endorser Review. In: Proceedings of the Third (2016) ACM Conference on Learning @ Scale - L@S '16. pp. 157–160. ACM Press, New York, New York, USA (2016). https://doi.org/10.1145/2876034.2893396
7. Lacave, C., Molina, A.I., Cruz-Lemus, J.A.: Learning Analytics to identify dropout factors of Computer Science studies through Bayesian networks. Behaviour & Information Technology pp. 1–15 (Jun 2018). https://doi.org/10.1080/0144929X.2018.1485053
8. Laura Pappano: Massive Open Online Courses Are Multiplying at a Rapid Pace. The New York Times (2012), https://www.nytimes.com/2012/11/04/education/edlife/massive-open-online-courses-are-multiplying-at-a-rapid-pace.html
9. Prpic, J., Melton, J., Taeihagh, A., Anderson, T.: MOOCs and Crowdsourcing: Massive Courses and Massive Resources (Feb 2017). https://doi.org/10.5210/fm.v20i12.6143
10. Ren, Z., Rangwala, H., Johri, A.: Predicting Performance on MOOC Assessments using Multi-Regression Models (May 2016), http://arxiv.org/abs/1605.02269
11. Rosen, Y., Rushkin, I., Ang, A., Federicks, C., Tingley, D., Blink, M.J.: Designing Adaptive Assessments in MOOCs. In: Proceedings of the Fourth (2017) ACM Conference on Learning @ Scale - L@S '17. pp. 233–236. ACM Press, New York, New York, USA (2017). https://doi.org/10.1145/3051457.3053993
12. Siemens, G.: What Are Learning Analytics? (2010), http://www.elearnspace.org/blog/2010/08/25/what-are-learning-analytics/
13. Wang, S., He, F., Andersen, E.: A Unified Framework for Knowledge Assessment and Progression Analysis and Design. In: Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems - CHI '17. pp. 937–948. ACM Press, New York, New York, USA (2017). https://doi.org/10.1145/3025453.3025841
14. Wang, Z., Zhu, J., Li, X., Hu, Z., Zhang, M.: Structured Knowledge Tracing Models for Student Assessment on Coursera. In: Proceedings of the Third (2016) ACM Conference on Learning @ Scale - L@S '16. pp. 209–212. ACM Press, New York, New York, USA (2016). https://doi.org/10.1145/2876034.2893416
15. Xiong, Y., Suen, H.K.: Assessment approaches in massive open online courses: Possibilities, challenges and future directions. International Review of Education 64(2), 241–263 (Apr 2018). https://doi.org/10.1007/s11159-018-9710-5