<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Users behavioural inference with Markovian decision process and active learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Firas Jarboui</string-name>
          <email>fjarboui@aneo.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vincent Rocchisani</string-name>
          <email>vrocchisani@aneo.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wilfried Kirchenmann</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>ANEO</institution>
          ,
          <addr-line>Boulogne Billancourt</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>ENSTA</institution>
          ,
          <country country="FR">France</country>
          and
          <institution>ENIT</institution>
          ,
          <country country="TN">Tunisia</country>
        </aff>
      </contrib-group>
      <fpage>59</fpage>
      <lpage>61</lpage>
      <abstract>
        <p>Studies of Massive Open Online Course (MOOC) users discuss the existence of typical profiles and their impact on the students' learning process. One concern when creating a new MOOC is knowing how users behave as they go through the content. Existing approaches are either quantitative methods, which infer groups of similar behaviour that are hard to interpret [1], or qualitative methods, which are hard to transpose to another context [2]. Our ambition is to find an efficient way to identify the behavioural patterns of interest to a given human expert. Within the #MOOCLive project, we developed a mixed method that matches the quantitative interpretation to the needs of the context.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>Methodology</title>
      <p>
        We tackled the following three problems in order to achieve our goal.
1. Quantitative modelling of the users: The course is modelled as a Markov
decision process (MDP), and a sample of candidate gain functions is considered.
The value associated with each element of this sample is the sum of the rewards
that the user's actions would yield under that gain function. This is
thoroughly discussed in [<xref ref-type="bibr" rid="ref3">3</xref>]. Each user is then characterised by the expected
utility of each state with a discount factor γ.
      </p>
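      <p>As an illustration of this scoring step, here is a minimal Python sketch under stated assumptions: a logged history is a list of (state, action) pairs, each sampled gain function is a dictionary from such pairs to rewards, transitions in the toy course are deterministic, and the user is assumed to pick the best action in every state. None of these choices, nor the names history_score, posterior, expected_gain and state_utilities, come from the original work; they only show how a softmax-weighted gain function and discounted state utilities could be computed.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math

def history_score(gain, history):
    """Sum of the rewards the logged (state, action) pairs earn under `gain`."""
    return sum(gain.get(step, 0.0) for step in history)

def posterior(gains, history):
    """Softmax of the history scores over the sampled gain functions."""
    scores = [history_score(g, history) for g in gains]
    top = max(scores)  # shift by the max so exp() cannot overflow
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def expected_gain(gains, history):
    """Posterior-weighted average of the sampled gain functions."""
    probs = posterior(gains, history)
    keys = set().union(*gains)
    return {k: sum(p * g.get(k, 0.0) for p, g in zip(probs, gains)) for k in keys}

def state_utilities(states, actions, move, gain, gamma=0.9, sweeps=200):
    """Discounted utility of each state under `gain`, by value iteration.

    Assumption: `move(state, action)` returns the next state deterministically
    and the user picks the best action in every state.
    """
    values = {s: 0.0 for s in states}
    for _ in range(sweeps):
        values = {
            s: max(gain.get((s, a), 0.0) + gamma * values[move(s, a)] for a in actions)
            for s in states
        }
    return values

# Hypothetical two-page course: the user stays on a page or moves to the other one.
states = ["video", "quiz"]
actions = ["stay", "next"]

def move(state, action):
    if action == "stay":
        return state
    return "quiz" if state == "video" else "video"

# Two hypothetical gain functions: one rewards watching, the other rewards quizzes.
gains = [{("video", "stay"): 1.0}, {("quiz", "stay"): 1.0}]
history = [("video", "stay"), ("video", "stay"), ("video", "next"), ("quiz", "stay")]

probs = posterior(gains, history)        # weight of each sampled gain function
user_gain = expected_gain(gains, history)  # averaged gain function for this user
features = state_utilities(states, actions, move, user_gain)  # one value per state
```
]]></preformat>
      <p>In this toy setting the dictionary features plays the role of the per-user feature vector: one discounted utility per state. A real course model would need its own state space, transition model and choice of γ.</p>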
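      <disp-formula><tex-math><![CDATA[
\begin{cases}
U(G \mid H) = \displaystyle\sum_{a \in H} G(a \mid H) \\[6pt]
P(G \mid H) = \dfrac{e^{U(G \mid H)}}{\displaystyle\sum_{G' \in S_{GF}} e^{U(G' \mid H)}}
\end{cases}
\;\Rightarrow\;
\widehat{G}_H = \sum_{G \in S_{GF}} G \times P(G \mid H)
]]></tex-math></disp-formula>
      <p>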
2. Qualitative class definition: This step is purely human. The experts are
asked to intervene and define the classes that will be used to build the
quantitative classification. At this stage, the expert's intervention is based
purely on their a priori. If this a priori is invalidated during the process,
the expert has to restart from this step with an updated point of view.</p>
      <p>3. Fitting the classification: To classify the users well, we use label
propagation with a Gaussian kernel. This provides, for each behaviour, a
probability distribution of membership over the patterns. An active learning
process iterates the propagation of the labels under the supervision of the
human expert. After each fold, we sample users at random and check whether
the output probability distribution makes sense. The human expert either
agrees with the results, changes them or tags them as unsure.
If the rate of changed results is high, we continue the active learning loop.
As a result, the rate of bad labels decays.</p>
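      <p>The propagation step can be sketched as follows. This is not the authors' implementation: it is a plain-Python version of the usual iterate-and-clamp label propagation scheme with a Gaussian kernel, and the feature vectors, the bandwidth sigma, the number of sweeps and the function names are all hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math

def gaussian_kernel(x, y, sigma):
    """Similarity of two feature vectors under a Gaussian (RBF) kernel."""
    squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared / (2.0 * sigma ** 2))

def propagate_labels(features, labels, n_classes, sigma=1.0, sweeps=100):
    """Return one membership distribution per user.

    `labels[i]` is a class index for users tagged by the expert, None otherwise.
    """
    n = len(features)
    weights = [[gaussian_kernel(features[i], features[j], sigma) for j in range(n)]
               for i in range(n)]
    # Row-normalise so that each user averages over its neighbours.
    transition = [[w / sum(row) for w in row] for row in weights]

    def clamp(i):
        # Expert labels are kept fixed as one-hot distributions.
        return [1.0 if c == labels[i] else 0.0 for c in range(n_classes)]

    dist = [clamp(i) if labels[i] is not None else [1.0 / n_classes] * n_classes
            for i in range(n)]
    for _ in range(sweeps):
        spread = [[sum(transition[i][j] * dist[j][c] for j in range(n))
                   for c in range(n_classes)]
                  for i in range(n)]
        dist = [clamp(i) if labels[i] is not None else spread[i] for i in range(n)]
    return dist

# Hypothetical one-dimensional user features; users 0 and 2 were tagged by the expert.
user_features = [[0.0], [0.2], [5.0], [5.1], [0.1]]
expert_labels = [0, None, 1, None, None]
memberships = propagate_labels(user_features, expert_labels, n_classes=2)
```
]]></preformat>
      <p>Each row of memberships is a probability distribution over the classes, which is the kind of output the expert would be asked to review. The dense similarity matrix makes this sketch quadratic in the number of users, so it is only suitable for small samples.</p>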
      <p>Once the classifier stabilizes, we consider the rate of behaviours that the
expert tagged as unsure. If this rate exceeds a threshold, we roll back to the
second step to challenge the a priori class definitions.</p>
      <p>If the rate of unsure tags is low enough, we can safely assume that the
quantitative and qualitative models have converged with respect to the expert.</p>
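      <p>The control flow of the review loop described above can be summarised in a short Python sketch. The verdict strings, the function names and both threshold values are assumptions made for illustration; the original text does not state numeric thresholds.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import random
from collections import Counter

def sample_for_review(user_ids, size, seed=None):
    """Draw the users whose predicted memberships are shown to the expert."""
    ids = list(user_ids)
    rng = random.Random(seed)
    return rng.sample(ids, min(size, len(ids)))

def next_step(verdicts, change_threshold=0.2, unsure_threshold=0.1):
    """Decide what to do after one review round.

    `verdicts` holds one of "agree", "change" or "unsure" per reviewed user.
    Both thresholds are hypothetical values, not taken from the original work.
    """
    if not verdicts:
        raise ValueError("at least one verdict is needed")
    counts = Counter(verdicts)
    change_rate = counts["change"] / len(verdicts)
    unsure_rate = counts["unsure"] / len(verdicts)
    if change_rate > change_threshold:
        return "continue_active_learning"  # classifier not stable yet
    if unsure_rate > unsure_threshold:
        return "redefine_classes"          # roll back to the second step
    return "converged"

reviewed = sample_for_review(range(100), size=10, seed=0)
decision = next_step(["agree"] * 8 + ["unsure"] * 2)
```
]]></preformat>
      <p>Checking the change rate before the unsure rate mirrors the order in the text: the unsure tags are only examined once the classifier has stabilised.</p>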
      <p>We applied this methodology to a MOOC<sup>4</sup> together with a sociologist. We
started with an a priori of three user profiles. To date, after three iterations
of the methodology, we have identified seven profiles that fulfil the needs of
the context and classified the users accordingly.</p>
    </sec>
    <sec id="sec-3">
      <title>Conclusion</title>
      <p>Our method assists a human expert in finding the optimal information about the
studied population. Although this work is still in progress and has only been
tested on MOOC log data, it should be applicable to other streams of log data.
Future tests will involve marketing-related data. We are currently investigating
the efficiency of this method as well as the best techniques to use for each step.
This is part of the preliminary work for a thesis.</p>
      <p><sup>4</sup> https://www.fun-mooc.fr/courses/VirchowVillerme/06005/session01/about</p>
      <p>[Figure: flowchart of the methodology. Panel labels: Course model as an MDP; Users log data; Quantitative modelling of the users (MDP utility functions as users features); Qualitative class definition; User classes (expert a priori); Sample of users; Users sample labeling; Fitting the classification; Label propagation (Gaussian kernel); Evaluate propagation; Evaluate classification; Active learning (error &gt; threshold); Satisfactory results (error &lt; threshold); Qualitative analysis to redefine the classes.]</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Chase</given-names>
            <surname>Geigle</surname>
          </string-name>
          and
          <string-name>
            <given-names>ChengXiang</given-names>
            <surname>Zhai</surname>
          </string-name>
          :
          <article-title>Modelling MOOC Student Behaviour With Two-Layer Hidden Markov Models</article-title>
          . Learning at Scale (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
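        <mixed-citation>
          2.
          <string-name>
            <given-names>Paula</given-names>
            <surname>de Barba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Carleton</given-names>
            <surname>Coffrin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Linda</given-names>
            <surname>Corrin</surname>
          </string-name>
          and
          <string-name>
            <given-names>Gregor</given-names>
            <surname>Kennedy</surname>
          </string-name>
          :
          <article-title>Visualizing patterns of student engagement and performance in MOOCs</article-title>
          . (
          <year>2014</year>
          )
        </mixed-citation>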
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Constantin A.</given-names>
            <surname>Rothkopf</surname>
          </string-name>
          and
          <string-name>
            <given-names>Christos</given-names>
            <surname>Dimitrakakis</surname>
          </string-name>
          :
          <article-title>Preference Elicitation and Inverse Reinforcement Learning</article-title>
          . Cornell University Library (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>