<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Predicting Peer-Review Participation at Large Scale Using an Ensemble Learning Method</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Erkan Er</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eduardo Gómez-Sánchez</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Miguel L. Bote-Lorenzo</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yannis Dimitriadis</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Juan I. Asensio-Pérez</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Juan I. Asensio-Pérez</string-name>
        </contrib>
        <aff>GSIC/EMIC, Universidad de Valladolid, Valladolid, Spain. erkan@gsic.uva.es, {edugom, migbot, yannis, juaase}@tel.uva.es</aff>
      </contrib-group>
      <abstract>
        <p>Peer review has been an effective approach for assessing the massive numbers of student artefacts in MOOCs. However, low student participation is a barrier that can make the implementation of peer reviews inefficient, disrupting student learning. In this regard, knowing in advance the estimated number of peer works that students will review could serve numerous pedagogical purposes in MOOCs. Previously, we attempted to predict student participation in peer review in a MOOC context. Building on that work, in this study we propose an ensemble learning approach with a refined set of features. Results show that prediction performance improves when a preceding classification model is trained to identify students with no peer-review participation, and that the refined features were effective while being more transferable to other contexts.</p>
      </abstract>
      <kwd-group>
        <kwd>MOOC</kwd>
        <kwd>Peer review</kwd>
        <kwd>Engagement prediction</kwd>
        <kwd>Ensemble learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Peer review (or peer assessment), in which an equal-status student assesses a peer’s
work [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], has been a solution to the evaluation of thousands of student artefacts (e.g.,
an essay) in MOOCs [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. However, this solution itself brings some practical challenges
at large scale, one of which is low student participation [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Given that MOOC
participants have different goals and come from diverse backgrounds, their participation
in peer reviews might not be persistent [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. With low participation rates, a peer review
activity might yield various issues. For example, submissions of striving students may
receive neither feedback nor a grade, which may lead to a decrease in their motivation
to continue the course. Nevertheless, not many researchers have focused on student
participation in peer review at large scales [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. More research is needed to develop
practical solutions for effective peer-review activities at large scale. One research line
could involve the prediction of students’ participation in peer reviews. An accurate
estimation of peer-review participation can be utilized in various practical ways. For
example, instructors can use this information to tune peer-review activities (e.g.,
incorporating an adaptive time schedule for completing peer reviews based on students’
expected level of participation). This information can also be used to inform the design
of other collaborative activities (e.g., forming groups that are inter-homogeneous in
terms of students’ desire to review teammates’ work).
      </p>
      <p>Copyright © 2017 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.</p>
      <p>
        The work presented in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] was our first attempt to predict the number of peer works
a student will review, using regression methods with a large feature set. The results
were promising, with a reasonably low error that decreases as the course progresses and
more data reflecting student behaviour becomes available. However, the model was
built with a large feature set, which may result in overfitting in MOOC contexts with
considerably fewer students participating in peer reviews. Further, a large part of the error was
concentrated on those students who submitted their assignment but did not review any
peer submission. This paper addresses these limitations by building a new feature set
with fewer yet more informative variables, and by proposing an ensemble learning model.
      </p>
      <p>In the following section, we describe the course data at hand and provide the details of
our feature-generation approach. Next, we present the experimental study, describing
the feature-selection approach and the details of the ensemble method. Then, the
prediction performance of each prediction model employed is reported. We conclude by
discussing follow-up research ideas.</p>
    </sec>
    <sec id="sec-2">
      <title>Previous Findings</title>
      <p>
        In our previous work [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], we obtained promising results using regression methods
to predict student participation in peer reviews in a MOOC (with 3620 enrollments)
published by Canvas Network. The feature set contained more than 80 items, including
weekly cumulative features (e.g., the total number of discussion activities during the
whole week) as well as daily features (e.g., the number of content visits on each day
before the peer-review activity). There were four assignments involving the submission
of a learning artefact, and they were evaluated using peer reviews. Figure 1 provides
histograms along with descriptive statistics regarding the number of peer works reviewed
by each student. The recommended (or required) number of peer reviews appears to be
three, as most students performed three peer reviews in each session.
      </p>
      <p>[Fig. 1: Histograms of the number of peer works reviewed by each student, per
session. 1st peer reviews: µ=2.62, SD=1.42; 2nd: µ=2.56, SD=1.24; 3rd: µ=2.41,
SD=1.67; 4th: µ=2.46, SD=1.35.]</p>
      <p>The dataset is available at
https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/XB2TLU.
The id of the course is 770000832960949.</p>
      <p>The prediction models each used one of three regression methods (LASSO, i.e. least
absolute shrinkage and selection operator; ridge; and elastic net), and the performance
of each method was tested. Table 1 shows the results of the prediction performance
with LASSO regression (which was chosen as it was the best-performing method).
The total mean absolute error (MAE) scores were reasonably low in general, and the
performance improved considerably with the inclusion of past peer-review activities
starting from the 2nd peer-review session. However, the prediction of the participation
of students with no actual peer-review participation was inaccurate. This finding has a
non-negligible impact on the overall error (note that around 1/6 of the students who
submitted their assignment did not review any of their peers), suggesting a need to
reduce the error resulting from the disengaged students in order to improve the overall
prediction performance. Furthermore, we found that many features were redundant,
particularly those derived from student activities on a specific day (e.g., quiz activity 2
days before the peer reviews). Therefore, the predictive model obtained was complex,
with many features that were particular to the context, limiting the transferability of the
model to other MOOCs. Another possible problem could be overfitting, as this complex
model was trained and tested on a small sample. The current study addresses the
limitations of the previous work by studying the feature space more deeply and by
proposing an ensemble learning approach, as described in the following sections.</p>
    </sec>
    <sec id="sec-3">
      <title>Improvements</title>
      <sec id="sec-3-1">
        <title>Feature Generation</title>
        <p>
          Given the limitations of the features used previously, we have revised them to obtain a
reduced yet predictive set that can be transferable over different peer-review sessions
within the same course and that can also apply to other MOOC contexts. For this
purpose, we mainly adopted the features proposed in [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], which are based on edX MOOCs.
Given that Canvas Network MOOCs have a different database structure than edX
MOOCs, we have either adopted similar features or extracted the same ones when
possible. The effectiveness of such features in predicting student engagement in MOOCs
has been shown [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. These features could be effective in predicting students’
peer-review participation as their overall course engagement is likely to be associated with
their peer-review engagement [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. Each feature was computed using the data between
consecutive peer-review sessions (e.g., features for the 3rd peer reviews were calculated
using the data obtained after the 2nd peer reviews) since students’ recent activities could
be more relevant to their subsequent peer-review participation.
        </p>
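<p>As a minimal illustration (not the actual pipeline used in the study), such between-session features can be computed from an event log with pandas; the column names and dates below are hypothetical:</p>

```python
import pandas as pd

# Hypothetical event log: one row per student request.
events = pd.DataFrame({
    "student": ["s1", "s1", "s2", "s1", "s2"],
    "type": ["content", "quiz", "content", "discussion", "content"],
    "ts": pd.to_datetime(["2014-01-02", "2014-01-03", "2014-01-04",
                          "2014-01-10", "2014-01-11"]),
})

def window_features(events, start, end):
    """Count requests per student and activity type in the window
    between two consecutive peer-review sessions."""
    window = events[(events["ts"] > start) & (events["ts"] <= end)]
    return window.groupby(["student", "type"]).size().unstack(fill_value=0)

# Features for a session opening on Jan 15, using only the data
# generated after the previous session (Jan 5).
feats = window_features(events,
                        pd.Timestamp("2014-01-05"),
                        pd.Timestamp("2014-01-15"))
print(feats)
```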
        <p>
          Furthermore, features about learners’ activity sequences (e.g., taking a quiz followed
by reading) can be powerful predictors of engagement in MOOC contexts [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. The
sequence features are about the order of student activities and can help to identify
different student profiles. Sequence features can easily scale up to thousands as activities
could follow many different orders [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. To obtain a small yet relevant set, we decided to focus on assignment,
discussion, and content activities and generated features of 2-activity length. The
complete list of the features generated (n=41) is provided in Table 2. In the table, a
denotes the type of the request (content, quiz, assignment, or discussion); features
marked with 1 are also calculated combining all requests; the feature marked with 2
differs from pr_subms_count if students reviewed the same submission multiple times;
and features marked with 3 are divided by the total number of requests.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>Ensemble Learning Method</title>
        <p>
          Ensemble learning is a machine learning technique that uses multiple learning
algorithms to achieve higher predictive performance than could be achieved with any
single learning algorithm. Ensemble methods have been found to improve predictive
models in the MOOC literature [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. The motivation for using an ensemble learning method for the current
prediction task emerged from our previous work, in which we found that the overall
prediction performance suffers largely from poorly predicting the participation of
students who have zero actual peer-review participation. Identifying such students
beforehand using classification methods (i.e., nonparticipants vs. participants) and
running the regression models only for the participants of peer reviews might lead to
higher accuracy. Therefore, to improve the prediction accuracy, we propose a
sequential ensemble approach [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], in which a classification step is integrated prior to regression to identify
those with no peer reviews ahead of time and exclude them from the regression analysis.
The students classified as having no participation are then combined with the regression
predictions to evaluate the overall performance. Figure 2 depicts the proposed ensemble
method.
        </p>
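<p>A minimal sketch of such a sequential ensemble with scikit-learn (synthetic data; the models and hyperparameters are illustrative, not those tuned in the study):</p>

```python
import numpy as np
from sklearn.linear_model import Lasso, LogisticRegression

rng = np.random.default_rng(0)

# Synthetic data: y is the number of peer works reviewed (0-4),
# with a fraction of students reviewing no one at all.
X = rng.normal(size=(200, 5))
y = np.clip(np.round(2 + X[:, 0] + rng.normal(scale=0.5, size=200)), 0, 4)
y[rng.random(200) < 0.15] = 0

# Stage 1: classify nonparticipants (y == 0) vs. participants.
clf = LogisticRegression().fit(X, (y == 0).astype(int))

# Stage 2: regression trained on the participants only.
part = y > 0
reg = Lasso(alpha=0.1).fit(X[part], y[part])

def predict(X_new):
    # Predicted nonparticipants get 0; the rest get the regression
    # estimate, rounded and clipped to the 0-4 range.
    preds = np.clip(np.round(reg.predict(X_new)), 0, 4)
    preds[clf.predict(X_new).astype(bool)] = 0
    return preds

print(predict(X[:10]))
```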
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experimental Study</title>
      <sec id="sec-4-1">
        <title>Method</title>
        <p>
          First, we replicated our previous study with the revised feature set. Two regression
methods were tested. The first is LASSO, which has an internal feature-selection mechanism
based on L1 regularization. LASSO has been effective in previous MOOC research
[
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. However, LASSO may have performance issues when features are correlated [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ],
which might be the case in the current study as some features were extracted from
similar data. Therefore, we also used correlation-based feature selection (CFS) [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] to
train a linear regression (LR) model. CFS focuses on the predictive ability of each
feature while maintaining low correlation among the selected features to minimize
redundancy.
        </p>
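<p>As a rough sketch of the idea behind CFS (a simplified re-implementation of Hall's merit heuristic with a greedy forward search, not the WEKA code used in the study):</p>

```python
import numpy as np
from itertools import combinations

def cfs_merit(X, y, subset):
    """Hall's CFS merit: favours features highly correlated with the
    target but weakly correlated with each other."""
    k = len(subset)
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return r_cf
    r_ff = np.mean([abs(np.corrcoef(X[:, i], X[:, j])[0, 1])
                    for i, j in combinations(subset, 2)])
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)

def greedy_cfs(X, y):
    """Forward search: keep adding the feature that most improves merit."""
    selected, remaining, best = [], list(range(X.shape[1])), -np.inf
    while remaining:
        scores = {j: cfs_merit(X, y, selected + [j]) for j in remaining}
        j = max(scores, key=scores.get)
        if scores[j] <= best:
            break
        best = scores[j]
        selected.append(j)
        remaining.remove(j)
    return selected

# Demo: feature 1 nearly duplicates feature 0; feature 2 is pure noise.
rng = np.random.default_rng(1)
x0 = rng.normal(size=300)
X = np.column_stack([x0, x0 + 0.01 * rng.normal(size=300), rng.normal(size=300)])
y = x0 + 0.1 * rng.normal(size=300)
print(greedy_cfs(X, y))
```

The redundant copy and the noise feature add little merit, so the search keeps only the informative feature(s).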
        <p>
          In the ensemble learning model, logistic regression (LGR) was chosen as the
classifier, as it was found to be more accurate than the other methods that were
pilot-tested (e.g., stochastic gradient descent and decision trees). L1 regularization and
CFS were also used to perform feature selection for the classification model. While the
whole dataset was used to train the classification model, only the data about students
with at least one peer review was used to train the regression model. Only students who
submitted the corresponding assignment were included in the predictions, since only
those students could review others' submissions. Beginning with the 2nd assignment,
the previous assignment score and peer-review participation were included as features
in the predictions. Since the sample size was small, 10-fold cross-validation was used,
and the performance was evaluated using MAE [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. MAE was used as the metric since it provides a
plain interpretation of performance when the target variable has a narrow range (i.e.,
0-4). Also, note that prediction scores were rounded to the closest integer value (as
decimal numbers would not be practical in a real course). We used the scikit-learn
implementations of LASSO, LGR, and LR, and the WEKA implementation of CFS.
        </p>
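<p>The evaluation protocol can be sketched as follows (synthetic data standing in for the real feature matrix and participation counts; the Lasso alpha is illustrative):</p>

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 8))
y = np.clip(np.round(2.5 + X[:, 0] + rng.normal(scale=0.7, size=150)), 0, 4)

maes = []
for train, test in KFold(n_splits=10, shuffle=True, random_state=0).split(X):
    model = Lasso(alpha=0.05).fit(X[train], y[train])
    # Round to the closest integer: fractional numbers of peer
    # reviews would not be practical in a real course.
    preds = np.clip(np.round(model.predict(X[test])), 0, 4)
    maes.append(mean_absolute_error(y[test], preds))

print(f"10-fold MAE: {np.mean(maes):.2f}")
```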
      </sec>
      <sec id="sec-4-2">
        <title>Results and Discussion</title>
        <p>The MAE scores at each actual participation level (from 0 to 4), as well as the total
MAE scores of each prediction model, are provided in Table 3 and Table 4. Compared
to the previous results (see Table 1), the performance of the regression model (see
Table 3) remained almost the same with the refined list of features, with a similar trend
of increasing accuracy at each subsequent prediction. The error rates were highest at
the 0-participation level. Given the likelihood of overfitting with complex models, we
favour the use of the refined feature set to minimize this possibility. Also, the current
feature set has the capacity to be transferred to any other week involving a peer-review
prediction, as well as to other MOOCs.</p>
        <p>According to the results of the ensemble model in Table 3, the prediction
performance increased slightly (except for the 1st peer reviews) when a classification
phase was incorporated before running the regression model, compared to the
performance of the regression alone. That is, the classification model helped reduce the error
introduced by students with zero peer-review participation. At the same time, however,
the error increased in the prediction of the other levels of participation due to the
imperfect classification performance. Also, no improvement was noted for the predictions
at the 1st peer-review session, probably because students who do and who do not
contribute to peer reviews have very similar profiles at this stage of the course, based on
the current feature set used. Further, the two feature-selection methods did not appear
to have different effects on the prediction performance.</p>
        <p>The results showed that the proposed ensemble method produced better predictions
than those obtained using the regression method alone. This was because students with
no peer-review participation were undermining the performance of the regression
model, which was addressed by incorporating a classification phase to identify and
exclude those with no participation when training the regression model. However, the
overall performance did not improve considerably, as the students with no peer-review
participation were not classified perfectly, yielding a mediocre performance at certain
levels of participation. Nonetheless, given that the mean actual peer-review
participation ranges from 2.41 to 2.62, the MAE scores achieved with the ensemble method
seem promising, ranging from 0.45 to 1.04. Thus, the proposed predictive model holds
potential to be utilized in a real MOOC context.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion and Future Work</title>
      <p>
        In this study, building on our previous work, we proposed a sequential ensemble
learning method with a refined set of features to obtain an accurate prediction of students'
peer-review participation. The results showed that the proposed ensemble model holds
potential to be further explored in future research. First, the classification model needs
further attention. The reasons for its moderate performance need to be explored and
addressed accordingly, using different classification approaches and more relevant
features. For example, a nested ensemble approach could be utilized. Second, the ensemble
method failed to improve the prediction performance at the 1st peer reviews. Possibly,
the student profiles identified with the current feature set were not distinctive early in
the course, and therefore offered no benefit for the classification. More distinctive
features need to be identified to improve the classification performance. Nonetheless,
the challenge of identifying students who will not participate in peer reviews early in
the semester constitutes an interesting research opportunity. Moreover, although the
approach used in this study demonstrates the validity of the prediction model, it is not
applicable to an ongoing MOOC, as the values of the target variable (the number of
peer works reviewed) would be needed to train the models. Therefore, other relevant
training paradigms (e.g., in-situ learning) should be used to build accurate yet practical
models that can be useful in continuing MOOCs [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>Access to the data used in this paper was granted by Canvas Network. This work has
been partially funded by research projects TIN2014-53199-C3-2-R and VA082U16,
and by the Spanish network of excellence SNOLA (TIN2015-71669-REDT).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Topping</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Peer assessment between students in colleges and universities</article-title>
          .
          <source>Rev. Educ. Res</source>
          .
          <volume>68</volume>
          ,
          <fpage>249</fpage>
          -
          <lpage>276</lpage>
          (
          <year>1998</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Piech</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Do</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ng</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koller</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Tuned Models of Peer Assessment in MOOCs</article-title>
          .
          <source>In: International Conference on Educational Data Mining</source>
          . pp.
          <fpage>153</fpage>
          -
          <lpage>160</lpage>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Estevez-Ayres</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Crespo-García</surname>
            ,
            <given-names>R.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fisteus</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Delgado-Kloos</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>An algorithm for peer review matching in Massive courses for minimising students' frustration</article-title>
          .
          <source>J. Univers. Comput. Sci</source>
          .
          <volume>19</volume>
          ,
          <fpage>2173</fpage>
          -
          <lpage>2197</lpage>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Suen</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Peer assessment for massive open online courses (MOOCs)</article-title>
          .
          <source>Int. Rev. Res. Open Distrib. Learn</source>
          .
          <volume>15</volume>
          , (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Er</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bote-Lorenzo</surname>
            ,
            <given-names>M.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gómez-Sánchez</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dimitriadis</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Asensio-Pérez</surname>
            ,
            <given-names>J.I.</given-names>
          </string-name>
          :
          <article-title>Predicting Student Participation in Peer Reviews in MOOCs</article-title>
          .
          <source>In: Proceedings of the Second European MOOCs Stakeholder Summit</source>
          , Madrid (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Veeramachaneni</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>O'Reilly</surname>
            ,
            <given-names>U.-M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taylor</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Towards Feature Engineering at Scale for Data from Massive Open Online Courses</article-title>
          .
          <source>arXiv:1407.5238v1. 6</source>
          , (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Jayaprasad</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jayaprasad</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Transfer Learning for Predictive Models in Massive Open Online Courses</article-title>
          .
          <source>Artif. Intell</source>
          .
          <volume>1</volume>
          -
          <fpage>12</fpage>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Tseng</surname>
            ,
            <given-names>S.-F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsao</surname>
            ,
            <given-names>Y.-W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>L.-C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chan</surname>
            ,
            <given-names>C.-L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lai</surname>
            ,
            <given-names>K.R.</given-names>
          </string-name>
          :
          <article-title>Who will pass? Analyzing learner behaviors in MOOCs</article-title>
          .
          <source>Res. Pract. Technol. Enhanc. Learn</source>
          .
          <volume>11</volume>
          ,
          <fpage>1</fpage>
          -
          <lpage>11</lpage>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Crossley</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paquette</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dascalu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McNamara</surname>
            ,
            <given-names>D.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baker</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          :
          <article-title>Combining clickstream data with NLP tools to better understand MOOC completion</article-title>
          .
          <source>Proc. Sixth Int. Conf. Learn. Anal. Knowl</source>
          . -
          <source>LAK '16</source>
          .
          <fpage>6</fpage>
          -
          <lpage>14</lpage>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Brooks</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thompson</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Teasley</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>A time series interaction analysis method for building predictive models of learners using log data</article-title>
          .
          <source>Proc. Fifth Int. Conf. Learn. Anal. Knowl</source>
          . - LAK '15.
          <fpage>126</fpage>
          -
          <lpage>135</lpage>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Boyer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Veeramachaneni</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Robust Predictive Models on MOOCs: Transferring Knowledge across Courses</article-title>
          .
          <source>Proc. 9th Int. Conf. Educ. Data Min</source>
          .
          <fpage>298</fpage>
          -
          <lpage>305</lpage>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>Z.-H.</given-names>
          </string-name>
          :
          <article-title>Ensemble Learning</article-title>
          . In: Li,
          <string-name>
            <surname>S.Z</surname>
          </string-name>
          . (ed.)
          <source>Encyclopedia of Biometrics</source>
          . pp.
          <fpage>2170</fpage>
          -
          <lpage>273</lpage>
          (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Robinson</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yeomans</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reich</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hulleman</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gehlbach</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Forecasting student achievement in MOOCs with natural language processing</article-title>
          .
          <source>Proc. Sixth Int. Conf. Learn. Anal. Knowl</source>
          . - LAK '16.
          <fpage>383</fpage>
          -
          <lpage>387</lpage>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Zou</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hastie</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Regularization and variable selection via the elastic-net</article-title>
          .
          <source>J. R. Stat. Soc</source>
          .
          <volume>67</volume>
          ,
          <fpage>301</fpage>
          -
          <lpage>320</lpage>
          (
          <year>2005</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Hall</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Correlation-based Feature Selection for Discrete and Numeric Class Machine Learning</article-title>
          .
          <source>In: Proceedings of the Seventeenth International Conference on Machine Learning</source>
          . pp.
          <fpage>359</fpage>
          -
          <lpage>366</lpage>
          (
          <year>2000</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Sawyer</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Sample size and the accuracy of predictions made from multiple regression equations</article-title>
          .
          <source>J. Educ. Stat</source>
          .
          <volume>7</volume>
          ,
          <fpage>91</fpage>
          -
          <lpage>104</lpage>
          (
          <year>1982</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Bote-Lorenzo</surname>
            ,
            <given-names>M.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gómez-Sánchez</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Predicting the decrease of engagement indicators in a MOOC</article-title>
          .
          <source>In: Seventh International Conference on Learning Analytics and Knowledge</source>
          . pp.
          <fpage>143</fpage>
          -
          <lpage>147</lpage>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>