Users behavioural inference with Markovian decision process and active learning

Firas Jarboui (0, 1), fjarboui@aneo.fr

Vincent Rocchisani (0), vrocchisani@aneo.fr

Wilfried Kirchenmann (0)

0 ANEO, Boulogne-Billancourt, France
1 ENSTA, France and ENIT, Tunisia

Introduction

Studies of Massive Open Online Course (MOOC) users discuss the existence of typical profiles and their impact on the students' learning process. One concern when creating a new MOOC is knowing how users behave as they go through the content. Existing approaches are either quantitative methods that infer groups of similar behaviour that are hard to interpret [1], or qualitative methods that are hard to transpose to another context [2]. Our ambition is to find an efficient way to identify the behavioural patterns of interest to a given human expert. Within the #MOOCLive project, we developed a mixed method that matches the quantitative interpretation to the needs of the context.

Methodology

We tackled the following three problems to achieve our goal.

1. Quantitative modelling: The value associated with each gain function G in the sample S_GF is the sum of the rewards that the user's actions (the history H) would yield under that gain function. This is discussed thoroughly in [3]. Each user is then characterised by the expected utility of each state with a discount factor γ.

\[
U(G \mid H) = \sum_{a \in H} G(a), \qquad
P(G \mid H) = \frac{e^{U(G \mid H)}}{\sum_{G' \in S_{GF}} e^{U(G' \mid H)}}
\;\Rightarrow\;
\widehat{G}_H = \sum_{G \in S_{GF}} G \times P(G \mid H)
\]

2. Qualitative class definition: This step is purely human. The experts are asked to intervene and define the classes that will be used to build the quantitative classification. At this stage, the expert's intervention is based purely on their a priori. If that a priori is invalidated during the process, the expert has to restart from this step with an updated point of view.

3. Fitting the classification: To classify the users well, a Gaussian kernel label propagation is used. It provides, for each behaviour, a probability distribution of membership over the patterns. An active learning process iterates the propagation of the labels under the supervision of the human expert. After each fold, we sample users at random and check whether the output probability distribution makes sense. The human expert either agrees with the results, changes them, or tags them as unsure. If the rate of changed results is high, we continue the active learning loop. As a result, the rate of bad labels decays.
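The posterior over gain functions used in the quantitative modelling step can be illustrated with a minimal sketch. It assumes that a gain function is a mapping from actions to rewards and that a history is a list of actions; the action names and the three sampled gain functions are hypothetical and are not taken from the #MOOCLive data.

```python
import math


def history_value(gain, history):
    """U(G|H): sum of the rewards G(a) over the actions a in the history H."""
    return sum(gain[a] for a in history)


def gain_posterior(gains, history):
    """P(G|H): softmax of U(G|H) over the sampled gain functions."""
    values = [history_value(g, history) for g in gains]
    top = max(values)  # subtracting the max keeps exp() numerically stable
    weights = [math.exp(v - top) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def expected_gain(gains, history):
    """Posterior-mean gain function: sum over G of G * P(G|H)."""
    probs = gain_posterior(gains, history)
    return {
        action: sum(p * g[action] for p, g in zip(probs, gains))
        for action in gains[0]
    }


# Hypothetical sample of gain functions over three made-up actions.
sampled_gains = [
    {"watch_video": 1.0, "take_quiz": 0.0, "read_forum": 0.0},
    {"watch_video": 0.0, "take_quiz": 1.0, "read_forum": 0.0},
    {"watch_video": 0.0, "take_quiz": 0.0, "read_forum": 1.0},
]
history = ["watch_video", "watch_video", "take_quiz"]

posterior = gain_posterior(sampled_gains, history)
estimate = expected_gain(sampled_gains, history)
```

In this toy case the gain function that rewards watching videos receives the largest posterior weight, because the history contains two video actions and a single quiz action.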

Once the classifier stabilizes, we consider the rate of behaviours that the expert tagged as unsure. If this rate exceeds a threshold, we roll back to the second step to challenge the a priori class definitions.

If the rate of unsure tags is low enough, we can safely assume that the two models, quantitative and qualitative, have converged with respect to the expert.
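The classification step and its stopping logic can be sketched as follows. This is one possible reading and not the authors' implementation: the propagation is a plain iterative scheme with clamped expert labels, and the feature vectors, the kernel width `sigma`, the number of sweeps, and both thresholds are hypothetical values chosen for illustration.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Gaussian similarity between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def propagate_labels(features, labels, n_classes, sigma=0.5, sweeps=50):
    """Return one class-membership distribution per user.

    labels[i] is a class index for expert-labelled users, None otherwise.
    """
    n = len(features)
    probs = []
    for lab in labels:
        if lab is None:
            probs.append([1.0 / n_classes] * n_classes)
        else:
            probs.append([1.0 if k == lab else 0.0 for k in range(n_classes)])
    weights = [
        [0.0 if i == j else gaussian_kernel(features[i], features[j], sigma)
         for j in range(n)]
        for i in range(n)
    ]
    for _ in range(sweeps):
        updated = []
        for i in range(n):
            total = sum(weights[i])
            if labels[i] is not None or total == 0.0:
                updated.append(probs[i])  # expert labels stay clamped
                continue
            updated.append([
                sum(weights[i][j] * probs[j][k] for j in range(n)) / total
                for k in range(n_classes)
            ])
        probs = updated
    return probs


def next_step(n_reviewed, n_changed, n_unsure,
              change_threshold=0.2, unsure_threshold=0.2):
    """Decide what to do after the expert reviews a random sample."""
    if n_changed / n_reviewed > change_threshold:
        return "continue active learning"
    if n_unsure / n_reviewed > unsure_threshold:
        return "redefine classes"
    return "converged"


# Hypothetical one-dimensional user features; users 0 and 2 are labelled.
features = [[0.0], [0.2], [1.0], [1.2]]
labels = [0, None, 1, None]
memberships = propagate_labels(features, labels, n_classes=2)
```

Each unlabelled user ends up with a membership distribution dominated by the class of its nearest labelled neighbour. `next_step` encodes the three outcomes described above: keep iterating while the expert changes many labels, return to the class definitions when too many behaviours are tagged as unsure, and stop otherwise.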

We applied this methodology to a MOOC (footnote 4) with a sociologist. We started with an a priori of three user profiles. To date, after three iterations of the methodology, we have identified seven profiles that meet the needs of the context, and we have classified the users accordingly.

Conclusion

Our method helps a human expert find the most relevant information about the studied population. Although this work is still in progress and has only been tested on MOOC log data, it should be applicable to other streams of log data. Future tests will involve marketing-related data. We are currently investigating the efficiency of this method as well as the best techniques to use for each step. This is part of preliminary work for a thesis.

4 https://www.fun-mooc.fr/courses/VirchowVillerme/06005/session01/about

[Figure: flowchart of the methodology. Box labels: "Course model as an MDP", "Users log data", "Quantitative modelling of the users" (MDP utility functions as users features), "Qualitative class definition" (user classes, expert a priori), "Qualitative analysis to redefine the classes", "Sample of users", "Users sample labeling", "Fitting the classification", "Active learning", "Label propagation (Gaussian kernel)", "Evaluate propagation", "Evaluate classification", "Satisfactory results". Edge labels: "error > threshold", "error < threshold".]

1. Chase Geigle and ChengXiang Zhai: Modelling MOOC Student Behaviour With Two-Layer Hidden Markov Models. Learning at Scale (2017)

2. Paula de Barba, Carleton Coffrin, Linda Corrin and Gregor Kennedy: Visualizing patterns of student engagement and performance in MOOCs. (2014)

3. Constantin A. Rothkopf and Christos Dimitrakakis: Preference Elicitation and Inverse Reinforcement Learning. Cornell University Library (2011)