<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Predicting Student Participation in Peer Reviews in MOOCs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Erkan Er</string-name>
          <email>erkan@gsic.uva.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Miguel Luis Bote-Lorenzo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eduardo Gómez-Sánchez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yannis Dimitriadis</string-name>
          <email>yannis@tel.uva.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Juan Ignacio Asensio-Pérez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>GSIC/EMIC, Universidad de Valladolid</institution>
          ,
          <addr-line>Valladolid</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
      <fpage>65</fpage>
      <lpage>70</lpage>
      <abstract>
        <p>Assessing and providing feedback on thousands of student artefacts in MOOCs is an infeasible task for instructors. Peer review, a well-known pedagogical approach that offers various learning gains, has commonly been used to address this practical challenge. However, low student participation is a potential barrier to the success of peer reviews. The present study proposes an approach to predict student participation in peer reviews in a MOOC context, which can be utilized to achieve an effective peer-review activity. We attempt to predict the number of different peer works that students will review for each of four assignments based on their past activities in the course. Results show that students' preceding activities were predictive of their participation in peer reviews starting from the first assignment, and that the prediction accuracy improved considerably with the inclusion of past peer-review activities.</p>
      </abstract>
      <kwd-group>
        <kwd>MOOC</kwd>
        <kwd>Peer review</kwd>
        <kwd>Engagement prediction</kwd>
        <kwd>Regression</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Massive open online courses (MOOCs) enable millions to receive university-level
courses at no cost. However, the massiveness comes with several practical challenges.
One known challenge is the assessment of thousands of student artefacts (submitted to
open-ended assignments) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. One approach to address this challenge has been the use
of peer review (or peer assessment). Peer review is an active learning process in which
a student work is examined and rated by another equal-status student [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Besides its
utility in terms of reducing the workload of instructors, which is considered a main
benefit in the MOOC context, peer review offers learning gains for both those students
who performed the review and those whose work was reviewed. These benefits include,
but are not limited to, the development of higher-order thinking skills, problem solving
skills, communication skills, and teamwork skills [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. However, conducting an
effective peer review is itself a challenge at large scale. One barrier to its successful
implementation is low student participation. Considering the lack of instructor
mediation and the large diversity of MOOC participants (e.g., in native language and culture),
there is a high chance that many students will not be naturally motivated to
review a peer’s work [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Lack of participation in peer review may result in situations in
which the submissions of striving students remain ungraded, leading to a decrease in
their motivation to continue the course. Nevertheless, as opposed to numerous studies
that are concerned about resolving the validity issues of peer reviews [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ], few
works have investigated student participation in peer review at large scale [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
Thus, there is a need for further research to contribute to the solution of this problem.
      </p>
      <p>
        The present study proposes an approach to predict student participation in reviewing
peers’ work in a MOOC context, and in this paper, we share the preliminary findings
of this in-progress research. In particular, we attempt to predict the number of different
peer works that students will review for a specific assignment based on their past
activities in the course. An accurate estimate of the number of times a student will perform
peer review can help instructors take timely actions to achieve a successful peer-review
process [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. For example, the peer-review task might be rather challenging for some
students depending on their abilities [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], and these students may need more time for
completing their reviews. Therefore, instead of a firm deadline for peer reviews, an
adaptive schedule based on the predicted participation levels can be used to promote
participation in peer reviews. In addition, this estimation might be utilized in designing
other effective collaborative learning activities. For example, using the information
regarding the levels of participation, student groups can be formed in a way that
maximizes the likelihood that each peer work will be reviewed by another group member. As
student participation in peer reviews can also be considered an engagement indicator,
other approaches that are used to foster engagement can be applied [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>In the following section, we describe the course data at hand and the features
generated for the prediction task. Next, we present the experimental study by describing the
details of the method and the results regarding the performance of each prediction
model employed. We conclude by presenting the follow-up research ideas.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Predicting Participation in Peer Reviews</title>
      <sec id="sec-2-1">
        <title>Course Data</title>
        <p>The course data for this study was retrieved from a public dataset published by Canvas
Network. No contextual information was available (e.g., whether the peer review was
mandatory or not), but we attempted to make some inferences about the course design
based on the available log data, since such contextual information may help us better
explain the prediction results. The course had 3620 enrolments and contained four main
assignments (each worth 25 points) for which students needed to upload a specific
artefact. These assignments were reviewed by peers, and they were scheduled starting
from the second week of the course with a one-week interval between each one of them.</p>
        <p>
          The course data contains fine-grained information regarding students’ content visits
as well as their various activities in discussions, assignments, and quizzes (e.g., creating,
viewing, or subscribing to a discussion topic, submitting or viewing an assignment). The dataset
is available at https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/XB2TLU
and the id of the course is 770000832960949. Moreover, we identified the number of peer
submissions reviewed by each student (at each assignment), which is the outcome (or dependent)
variable in this study. Given that most students reviewed three different peer works at each
assignment (see Figure 1), it is likely that the course instructors asked students to perform
at least three reviews. Descriptive statistics regarding the outcome variable are given in
Figure 1.
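        </p>
        <p>Figure 1. Descriptive statistics of the outcome variable (number of peer works reviewed) for each assignment. 1st Peer Reviews: µ=2.62, SD=1.42; 2nd Peer Reviews: µ=2.56, SD=1.24; 3rd Peer Reviews: µ=2.41, SD=1.67; 4th Peer Reviews: µ=2.46, SD=1.35.</p>
        <p>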
In this subsection, we briefly discuss the rationale for the features generated to be used
in the prediction of student participation in peer reviews. Active MOOC learners are
likely to perform well as a result of their consistent participation in most activities of
the course including the peer reviews [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. Such active students are likely to achieve a
good understanding of the course content as a result of their engagement (e.g., viewing
course content pages, participating in discussions, completing quizzes) [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], and
therefore they are more likely to feel confident reviewing a peer’s work. Accordingly, in the
present study, we hypothesize that students’ preceding engagement in the course is
associated with their subsequent participation in peer-review activities. For this purpose,
we built a set of predictors (or features) based on various student activities in the course
(e.g., discussions, assignments, and quizzes) and used them to predict students’
participation in peer-review activities. Based on the overview of the data at hand and the
previous research [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ], a set of features (see Table 1) was generated to characterize the
student engagement in the course. These features considered only student activities
during the last 6 days before the deadline of the corresponding assignment (since there was
a one-week interval between assignments).
        </p>
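        <p>As a minimal sketch of how such window-based features could be computed, the Python example below counts each student's events per activity type within the six days preceding an assignment deadline. The event-log layout (student id, activity type, timestamp), the activity names, and the function name window_counts are hypothetical; they are not taken from the Canvas Network dataset or from Table 1.</p>
        <preformat><![CDATA[
```python
from collections import Counter
from datetime import datetime, timedelta


def window_counts(events, deadline, days=6):
    """Count events per student and activity type in the `days` days before `deadline`.

    `events` is an iterable of (student_id, activity_type, timestamp) tuples.
    This layout is an assumption made for illustration only.
    """
    start = deadline - timedelta(days=days)
    counts = {}
    for student, activity, timestamp in events:
        # Keep only events inside the half-open window [start, deadline).
        if start <= timestamp < deadline:
            counts.setdefault(student, Counter())[activity] += 1
    return counts


# Hypothetical event log for two students.
deadline = datetime(2017, 1, 15)
events = [
    ("s1", "discussion_view", datetime(2017, 1, 10)),
    ("s1", "discussion_view", datetime(2017, 1, 14)),
    ("s1", "quiz_submit", datetime(2017, 1, 12)),
    ("s1", "discussion_view", datetime(2017, 1, 2)),  # before the window, ignored
    ("s2", "assignment_view", datetime(2017, 1, 9)),
]
features = window_counts(events, deadline)
```
]]></preformat>
        <p>Each resulting counter could serve as one row of predictors for a given student and assignment period. Restricting the window to six days keeps the features of consecutive assignments from overlapping when assignments are one week apart.</p>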
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Experimental Study</title>
      <sec id="sec-3-1">
        <title>Method</title>
        <p>
          Considering the large set of features, we preferred to use regularized regression
methods, which penalize weak predictors by shrinking their coefficients and can thereby
improve model performance. Three regularized regression methods were chosen: least absolute
shrinkage and selection operator (LASSO), elastic net, and ridge regression. LASSO and
elastic net incorporate an internal feature-selection mechanism because they can shrink
coefficients exactly to zero, whereas ridge regression shrinks coefficients without
eliminating features [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. These three methods
were applied to make a prediction regarding the number of different peer works that
were reviewed by students at each assignment. To evaluate the model performance, the
mean absolute error (MAE) scores were used [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. Since the sample size was small,
10-fold cross-validation was used.
        </p>
        <p>
Using only those features that are available prior to the peer-review activity, the number
of peer reviews performed was predicted for each assignment period separately. In each
prediction, only students who submitted the corresponding assignment were included,
since only those students can review others’ submissions. The MAE scores of all
models are provided as a boxplot in Figure 2. For all methods, the
accuracy increased at each subsequent prediction and levelled off at the 4th
peer-review activity. This increase was expected, as features on previous peer-review
activities were included starting from the 2nd set of submissions. That is, features derived from
student activities in the course were relevant when predicting subsequent participation
in peer reviews starting from the first assignment, and features regarding past
peer-review participation were the strongest predictors. These results support our hypothesis
regarding the predictive potential of students’ overall engagement in a course for their
subsequent peer-review participation. Moreover, LASSO and elastic net yielded a
higher accuracy at each assignment period than ridge regression. Although the
majority of the predictions were within an acceptable error range, particularly for LASSO
and elastic net, there were some outliers, which are to be further examined.
        </p>
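        <p>The evaluation procedure described here, 10-fold cross-validation scored by the mean absolute error, can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the code used in the study: the mean-predicting baseline merely stands in for the regularized regression models to keep the example self-contained, and all function names and the toy data are hypothetical.</p>
        <preformat><![CDATA[
```python
import random


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_val_mae(X, y, fit, k=10, seed=0):
    """Return one MAE score per fold; `fit` maps training data to a predict function."""
    scores = []
    for test_idx in k_fold_indices(len(y), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        y_pred = [predict(X[i]) for i in test_idx]
        scores.append(mean_absolute_error([y[i] for i in test_idx], y_pred))
    return scores


def fit_mean_baseline(X_train, y_train):
    """Stand-in model (an assumption): always predict the training mean."""
    mean = sum(y_train) / len(y_train)
    return lambda x: mean


# Toy data: 40 hypothetical students with review counts between 0 and 3.
X = [[i % 5] for i in range(40)]
y = [i % 4 for i in range(40)]
scores = cross_val_mae(X, y, fit_mean_baseline)
```
]]></preformat>
        <p>Because each fold yields its own score, the ten values per model can be summarized as a boxplot, which is how the MAE scores are reported in Figure 2.</p>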
        <p>
          Based on our examination of the coefficients of each feature (as determined by the
regression methods), features on students’ previous peer-review participation were in general the
strongest predictors (e.g., _pr_count, _pr_subms_count). Many of the retained features
were related to students’ discussion activities (e.g., _dr_count, _de_msg_cc,
dr_avg_p_day) and assignment activities (e.g., ar_{x}_days_bef, _ar_count). Features
related to students’ course content views were not strong predictors overall.
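        </p>
        <p>Figure 2. MAE scores of the prediction models for each peer-review activity (three values per activity). 1st Peer Reviews: 1.02, 1.03, 1.08; 2nd Peer Reviews: 0.66, 0.67, 0.78; 3rd Peer Reviews: 0.56, 0.57, 0.59; 4th Peer Reviews: 0.58, 0.59, 0.89.</p>
        <p>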
In this study, we presented the preliminary findings of our ongoing research on
predicting student participation in peer reviews. The results suggest that students’ preceding
activities in a MOOC might be useful in predicting their participation in peer-review
activities. The strongest predictors were not among the features associated with course
content views; rather, they were among those associated with discussion and assignment
activities. This finding is not surprising in a MOOC context, since many MOOC
learners may only view course content without active participation [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ].
        </p>
        <p>Among the regression methods, LASSO and elastic net performed better than
ridge regression. That is, the methods with more extreme penalization (e.g., LASSO)
yielded a higher prediction accuracy, suggesting the presence of some irrelevant
features. Therefore, we plan to perform a deeper analysis of the feature space and generate
more features related to those with stronger predictive ability. We also plan to take a
closer look at the outliers and explore the possible reasons, which may also inform our
analysis of the feature space. Once the prediction model is refined and finalized, we
plan to evaluate its performance in a real course.</p>
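        <p>The difference between the penalties can be illustrated with a textbook special case. Assuming orthonormal predictors (a simplification that is not claimed to hold for the course features), ridge regression rescales each least-squares coefficient, whereas LASSO soft-thresholds it and sets small coefficients exactly to zero. The function names, coefficient values, and penalty strength in the sketch below are illustrative assumptions and do not come from the fitted models.</p>
        <preformat><![CDATA[
```python
def ridge_shrink(beta_ols, lam):
    """Ridge estimate under orthonormal predictors: proportional shrinkage."""
    return beta_ols / (1.0 + lam)


def lasso_shrink(beta_ols, lam):
    """LASSO estimate under orthonormal predictors: soft-thresholding."""
    magnitude = max(abs(beta_ols) - lam, 0.0)
    return magnitude if beta_ols >= 0 else -magnitude


# Hypothetical least-squares coefficients: one strong, one moderate, one weak.
ols_coefficients = [2.0, -0.5, 0.1]
lam = 0.25  # illustrative penalty strength

ridge = [ridge_shrink(b, lam) for b in ols_coefficients]
lasso = [lasso_shrink(b, lam) for b in ols_coefficients]

# Ridge keeps every feature with a smaller coefficient; LASSO drops the weak one.
retained_by_ridge = sum(1 for b in ridge if b != 0)
retained_by_lasso = sum(1 for b in lasso if b != 0)
```
]]></preformat>
        <p>In this special case the weak coefficient survives under ridge regression but is removed by LASSO, which is consistent with the interpretation that some features were irrelevant.</p>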
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgements</title>
      <p>Access to the data used in this paper was granted by Canvas Network. This work has
been partially funded by research projects TIN2014-53199-C3-2-R and VA082U16.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Meek</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blakemore</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marks</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Is peer review an appropriate form of assessment in a MOOC? Student participation and performance in formative peer review</article-title>
          .
          <source>Assess. Eval. High. Educ</source>
          .
          <fpage>1</fpage>
          -
          <lpage>14</lpage>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Topping</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Peer assessment between students in colleges and universities</article-title>
          .
          <source>Rev. Educ. Res</source>
          .
          <volume>68</volume>
          ,
          <fpage>249</fpage>
          -
          <lpage>276</lpage>
          (
          <year>1998</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Wen</surname>
            ,
            <given-names>M.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsai</surname>
            ,
            <given-names>C.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>C.Y.</given-names>
          </string-name>
          :
          <article-title>Attitudes towards peer assessment: a comparison of the perspectives of pre- service and in- service teachers</article-title>
          .
          <source>Innov. Educ. Teach. Int</source>
          .
          <volume>43</volume>
          ,
          <fpage>83</fpage>
          -
          <lpage>92</lpage>
          (
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Suen</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Peer assessment for massive open online courses (MOOCs)</article-title>
          .
          <source>Int. Rev. Res. Open Distrib. Learn</source>
          .
          <volume>15</volume>
          , (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Kulkarni</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wei</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Le</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
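          <string-name>
            <surname>Papadopoulos</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cheng</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,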
          <string-name>
            <surname>Koller</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klemmer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Peer and self assessment in massive online classes</article-title>
          .
          <source>ACM Trans. Comput. Interact</source>
          .
          <volume>20</volume>
          ,
          <fpage>1</fpage>
          -
          <lpage>31</lpage>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Luo</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Robinson</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Park</surname>
            ,
            <given-names>J.-Y.</given-names>
          </string-name>
          :
          <article-title>Peer grading in a MOOC: Reliability, validity, and perceived effects</article-title>
          .
          <source>J. Asynchronous Learn. Netw</source>
          .
          <volume>18</volume>
          ,
          <fpage>1</fpage>
          -
          <lpage>14</lpage>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Estevez-Ayres</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Crespo-García</surname>
            ,
            <given-names>R.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fisteus</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Delgado-Kloos</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>An algorithm for peer review matching in Massive courses for minimising students' frustration</article-title>
          .
          <source>J. Univers. Comput. Sci</source>
          .
          <volume>19</volume>
          ,
          <fpage>2173</fpage>
          -
          <lpage>2197</lpage>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Gikandi</surname>
            ,
            <given-names>J.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morrow</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Davis</surname>
            ,
            <given-names>N.E.</given-names>
          </string-name>
          :
          <article-title>Online formative assessment in higher education: A review of the literature</article-title>
          .
          <source>Comput. Educ</source>
          .
          <volume>57</volume>
          ,
          <fpage>2333</fpage>
          -
          <lpage>2351</lpage>
          (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Neubaum</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wichmann</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eimler</surname>
            ,
            <given-names>S.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krämer</surname>
            ,
            <given-names>N.C.</given-names>
          </string-name>
          :
          <article-title>Investigating incentives for students to provide peer feedback in a semi-open online course</article-title>
          .
          <source>In: The International Symposium on Open Collaboration</source>
          . pp.
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Tseng</surname>
            ,
            <given-names>S.-F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsao</surname>
            ,
            <given-names>Y.-W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>L.-C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chan</surname>
            ,
            <given-names>C.-L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lai</surname>
            ,
            <given-names>K.R.</given-names>
          </string-name>
          :
          <article-title>Who will pass? Analyzing learner behaviors in MOOCs</article-title>
          .
          <source>Res. Pract. Technol. Enhanc. Learn</source>
          .
          <volume>11</volume>
          ,
          <fpage>1</fpage>
          -
          <lpage>11</lpage>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Bote-Lorenzo</surname>
            ,
            <given-names>M.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gómez-Sánchez</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Predicting the decrease of engagement indicators in a MOOC</article-title>
          .
          <source>In: Seventh International Conference on Learning Analytics and Knowledge</source>
          . pp.
          <fpage>143</fpage>
          -
          <lpage>147</lpage>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Hastie</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tibshirani</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Friedman</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>The elements of statistical learning: Data mining, inference, and prediction</article-title>
          . (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Sawyer</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Sample size and the accuracy of predictions made from multiple regression equations</article-title>
          .
          <source>J. Educ. Stat</source>
          .
          <volume>7</volume>
          ,
          <fpage>91</fpage>
          -
          <lpage>104</lpage>
          (
          <year>1982</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Kizilcec</surname>
            ,
            <given-names>R.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Piech</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schneider</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Deconstructing disengagement: Analyzing learner subpopulations in massive open online courses</article-title>
          .
          <source>In: Third International Conference on Learning Analytics and Knowledge</source>
          . pp.
          <fpage>170</fpage>
          -
          <lpage>179</lpage>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>