<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Patricia Gutierrez</string-name>
          <email>patricia@iiia.csic.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nardine Osman</string-name>
          <email>nardine@iiia.csic.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carles Sierra</string-name>
          <email>sierra@iiia.csic.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>IIIA-CSIC</institution>
          ,
          <addr-line>Campus de la UAB, Barcelona</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper we introduce an automated assessment service for online learning support in the context of communities of learners. The goal is to introduce automatic tools to support the task of assessing the massive numbers of students found in Massive Open Online Courses (MOOCs). The final assessments are a combination of the tutor's assessments and peer assessments. We build a trust graph over the referees and use it to compute weights for the assessment aggregations. The model proposed intends to be a support for intelligent online learning applications that encourage students' interactions within communities of learners and benefit from their feedback to build trust measures and provide automatic marks.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        The authors of [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] propose methods to estimate peer
reliability and correct peer biases. They present results over
real-world data from 63,000 peer assessments of two Coursera
courses. The models proposed are probabilistic, and they
are compared to the grade estimation algorithm used on
Coursera's platform, which does not take into account
individual biases and reliabilities. Differently from them, we
place more trust in students who grade like the tutor and
do not consider students' biases. When a student is biased,
his/her trust measure will be very low and his/her opinion will
have a moderate impact on the final marks.
[
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] proposes the CrowdGrader framework, which defines a
crowdsourcing algorithm for peer evaluation. The accuracy
degree (i.e. reputation) of each student is measured as the
distance between his/her self-assessment and the aggregated
opinion of the peers, weighted by their accuracy degrees. The
algorithm thus implements a reputation system for students,
where higher accuracy leads to higher influence on the
consensus grades. Differently from this work, we give more
weight to those peers that have similar opinions to those of
the tutor.
      </p>
      <p>In this paper, and differently from previous works, we want
to study the reliability of student assessments when
compared with tutor assessments. Although part of the learning
process is that students participate in the definition of the
evaluation criteria, tutors want to be certain that the
scoring of the students' works is fair and as close as possible to
their own expert opinion.</p>
      <p>
        Our inspiration comes from a use case explored in the
EU-funded project PRAISE [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. PRAISE enables online virtual
communities of students with shared interests and goals to
come together and share their music practice with each other,
so the process of learning becomes social. It provides tools
for giving and receiving feedback, as feedback is considered
an essential part of the learning process. Tutors define lesson
plans as pedagogical workflows of activities, such as
uploading recorded songs, automatic performance analysis, peer
feedback, or reflexive pedagogy analysis. The goal of any
lesson plan is to improve student skills, for instance, the
performance speed competence or the interpretation maturity
level. Assessments of students' performances have to
evaluate the achievement of these skills. Once a lesson plan is
defined, PRAISE's interface tools allow students to navigate
through the activities, to upload assignments, to practice, to
assess each other, and so on. The tools allow tutors to
monitor what students have done and to assess them. In this
work we concentrate on the development of a service that
can be included as part of a lesson plan and helps tutors
in the overall task of assessing the students participating in
the lesson plan. This assessment is based on aggregating
students' assessments, taking into consideration the trust
that tutors have in the students' individual capabilities in
judging each other's work.
      </p>
      <p>To achieve our objective we propose in this paper an
automated assessment method (Section 2) based on tutor
assessments, aggregations of peer assessments, and trust
measures derived from peer interactions. We experimentally
evaluate (Section 3) the accuracy of the method over
different topologies of student interactions (i.e. different types of
student grouping). The results obtained are based on
simulated data, leaving the validation with real data for future
work. We then conclude with a discussion of the results
(Section 4).
      </p>
    </sec>
    <sec id="sec-1b">
      <title>2. COLLABORATIVE ASSESSMENT</title>
      <p>In this section we introduce the formal model of the method
and the algorithms for collaborative assessment.</p>
    </sec>
    <sec id="sec-2">
      <title>2.1 Notation and preliminaries</title>
      <p>We say an online course has a tutor τ, a set of peer students
S, and a set of assignments A that need to be marked by the
tutor and/or students with respect to a given set of criteria
C.</p>
      <p>The automated assessment state S is then defined as the
tuple:</p>
      <p>S = ⟨R, A, C, L⟩</p>
      <p>R = {τ} ∪ S defines the set of possible referees (or markers),
where a referee could either be the tutor τ or some student
s ∈ S. A is the set of submitted assignments that need to
be marked, and C = ⟨c_1, …, c_n⟩ is the set of criteria that
assignments are marked upon. L is the set of marks (or
assessments) made by referees, such that L : R × A → [0, θ]^n (we
assume marks to be real numbers between 0 and some
maximum value θ). In other words, we define a single assessment
as: ε_a^α = M⃗, where a ∈ A, α ∈ R, and M⃗ = ⟨m_1, …, m_n⟩
describes the marks provided by the referee α on the n criteria
of C, with m_i ∈ [0, θ].</p>
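      <p>As a concrete illustration, the state S = ⟨R, A, C, L⟩ and its assessment records can be sketched as plain data structures. This is a hypothetical Python encoding of ours (names such as Assessment and THETA are not from the paper):</p>

```python
# Hypothetical encoding of the assessment state S = <R, A, C, L>.
from dataclasses import dataclass

THETA = 10.0  # maximum mark value (theta)

@dataclass(frozen=True)
class Assessment:
    referee: str     # the tutor or a student id (an element of R)
    assignment: str  # an element of A
    marks: tuple     # one mark in [0, THETA] per criterion in C

criteria = ("speed", "maturity")            # C, with n = 2
referees = {"tutor", "dave", "patricia"}    # R = {tutor} union S
L = [
    Assessment("tutor", "ex1", (5.0, 5.0)),
    Assessment("dave", "ex1", (6.0, 6.0)),
]
```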
      <p>
        Similarity between marks. We define a similarity function
sim : [0, θ]^n × [0, θ]^n → [0, 1] to determine how close two
assessments ε and ε′ are. We calculate the similarity between
assessments ε = ⟨m_1, …, m_n⟩ and ε′ = ⟨m′_1, …, m′_n⟩ as
follows:

sim(ε, ε′) = 1 − (1/(n·θ)) · Σ_{i=1..n} |m_i − m′_i|

This measure satisfies the basic properties of a fuzzy
similarity [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Other similarity measures could be used.
Trust relations between referees. Tutors need to decide
up to which point they can believe in the assessments made
by peers. We use two different intuitions to build up this
belief. First, if the tutor and a student have both assessed
some assignments, their similarity gives a hint of how close
the judgements of the student and the tutor are. Similarly,
we can define the judgement closeness of any two students by
looking into the assignments evaluated by both of them. In
case there are no assignments evaluated by both the tutor and one
particular student, we could simply not take that student's
opinion into account, because the tutor would not know how
much to trust the judgement of this student; or, as we do
in this paper, we approximate that unknown trust by looking
into the chain of trust between the tutor and the student
through other students. To model this we define two
different types of trust relations:
      </p>
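      <p>The similarity function can be sketched in code, assuming (as in the definition above) the mean absolute difference per criterion, normalised by n·θ:</p>

```python
# Sketch of sim : [0, theta]^n x [0, theta]^n -> [0, 1].
def sim(m, m_prime, theta=10.0):
    n = len(m)
    return 1.0 - sum(abs(a - b) for a, b in zip(m, m_prime)) / (n * theta)
```

      <p>For example, sim(⟨5, 5⟩, ⟨6, 6⟩) = 0.9 and sim(⟨2, 2⟩, ⟨8, 8⟩) = 0.4, the values that appear in the example of Section 2.2.</p>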
      <p>Direct trust: This is the trust between referees α, β ∈ R
that have at least one assignment assessed in common.
The trust value is the average of the similarities of their
assessments over the same assignments. Let A_{α,β} be
the set of all assignments that have been assessed by
both referees, that is, A_{α,β} = {a | ε_a^α ∈ L and ε_a^β ∈ L}. Then,</p>
      <p>T_D(α, β) = ( Σ_{a ∈ A_{α,β}} sim(ε_a^α, ε_a^β) ) / |A_{α,β}|</p>
      <p>We could also define direct trust as the conjunction of
the similarities for all common assignments as:</p>
      <p>T_D(α, β) = ⋀_{a ∈ A_{α,β}} sim(ε_a^α, ε_a^β)</p>
      <p>However, this would not be practical, as a significant
difference in just one assessment of those assessed by
two referees would make their mutual trust very low.</p>
      <p>Indirect trust: This is the trust between referees α, β ∈ R
without any assignment assessed by both of them.
We compute this trust as a transitive measure over
chains of referees for which we have pair-wise direct
trust values. We define a trust chain as a sequence of
referees q_j = ⟨α_1, …, α_i, α_{i+1}, …, α_{m_j}⟩ where α_i ∈ R,
α_1 = α, α_{m_j} = β, and T_D(α_i, α_{i+1}) is defined for
all pairs (α_i, α_{i+1}) with i ∈ [1, m_j − 1]. We denote by
Q(α, β) the set of all trust chains between α and β.
Thus, indirect trust is defined as an aggregation of the
direct trust values over these chains as follows:</p>
      <p>T_I(α, β) = max_{q_j ∈ Q(α,β)} Π_{i ∈ [1, m_j − 1]} T_D(α_i, α_{i+1})</p>
      <p>
        Hence, indirect trust is based on the notion of
transitivity.¹
¹T_I is based on the fuzzy similarity relation sim
presented before, which fulfils the ⊗-Transitivity property:
sim(u, v) ⊗ sim(v, w) ≤ sim(u, w), ∀ u, v, w ∈ V, where ⊗ is
a t-norm [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>Ideally, we would like not to overrate the trust of the tutor in
a student; that is, we would like that T_D(α, β) ≥ T_I(α, β) in
all cases. Guaranteeing this in all cases is impossible, but we
can decrease the number of overtrusted students by selecting
an operator that gives low values to T_I. In particular, we
prefer to use the product (Π) operator, because this is the
t-norm that gives the smallest possible values. Other operators
could be used, for instance the min function.</p>
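      <p>The max-product aggregation over trust chains can be sketched as a small depth-first search. This is an illustrative implementation of ours (the paper does not prescribe one); it returns 0 when no chain exists, in which case T_I is actually undefined:</p>

```python
# Sketch of indirect trust T_I(alpha, beta): the maximum, over all chains
# of direct-trust edges from alpha to beta, of the product of the direct
# trust values along the chain.
def indirect_trust(direct_trust, alpha, beta):
    # Build an adjacency list; direct trust is symmetric (sim is symmetric).
    neighbours = {}
    for (u, v), t in direct_trust.items():
        neighbours.setdefault(u, []).append((v, t))
        neighbours.setdefault(v, []).append((u, t))
    best = 0.0

    def dfs(node, product, visited):
        nonlocal best
        if node == beta:
            best = max(best, product)
            return
        for nxt, t in neighbours.get(node, []):
            if nxt not in visited:
                dfs(nxt, product * t, visited | {nxt})

    dfs(alpha, 1.0, {alpha})
    return best
```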
      <p>Trust Graph. To provide automated assessments, our
proposed method aggregates the assessments on a given
assignment taking into consideration how trusted each
marker/referee is from the point of view of the tutor (i.e.
taking into consideration the trust of the tutor in the referee's
marking of assignments). The algorithm that computes the
students' final assessments is based on a graph defined as
follows:</p>
      <p>G = ⟨R, E, w⟩</p>
      <p>where the set of nodes R is the set of referees in S, E ⊆
R × R is the set of edges between referees with direct or indirect
trust relations, and w : E → [0, 1] provides the trust value.
We denote by D ⊆ E the set of edges that link referees with
direct trust, that is, D = {e ∈ E | T_D(e) ≠ ⊥}. And similarly,
I ⊆ E for indirect trust, I = {e ∈ E | T_I(e) ≠ ⊥} \ D. The w
values will be used as weights to combine peer assessments
and are defined as:
w(e) = T_D(e), if e ∈ D;</p>
      <p>w(e) = T_I(e), if e ∈ I.</p>
    </sec>
    <sec id="sec-2b">
      <title>2.2 Computing collaborative assessments</title>
      <p>Algorithm 1 implements the collaborative assessment method.
We keep the notation (α, β) to refer to the edge connecting
nodes α and β in the trust graph, and Q(α, β) to refer to the set
of trust chains between α and β.</p>
      <p>The first thing the algorithm does is to build a trust graph
from L. Then, the final assessments are computed as
follows. If the tutor marks an assignment, then the tutor's mark
is considered the final mark. Otherwise, a weighted average
(μ) of the marks of student peers is calculated for this
assignment, where the weight of each peer is the trust value
between the tutor and that peer. Other forms of
aggregation could be considered to calculate μ; for instance, a peer
assessment may be discarded if it is very far from the rest
of the assessments, or if the referee's trust falls below a certain
threshold.</p>
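      <p>The decision rule just described (the tutor's mark is final when it exists; otherwise a trust-weighted average of the peers' marks) can be sketched as follows. The function name and data layout are illustrative, not the paper's:</p>

```python
# Sketch of the aggregation step of the collaborative assessment method.
def collaborative_mark(marks, w):
    """marks: {referee: tuple of marks}; w: {referee: tutor's trust in [0, 1]}."""
    if "tutor" in marks:
        return marks["tutor"]          # the tutor's mark is final
    total = sum(w[r] for r in marks)   # normalising constant
    n = len(next(iter(marks.values())))
    return tuple(
        sum(w[r] * marks[r][i] for r in marks) / total for i in range(n)
    )
```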
      <p>Figure 1 shows four trust graphs built from four assessment
histories that correspond to a chronological sequence of
assessments. The criteria C in this example are speed
and maturity, and the maximum mark value is θ = 10. For
simplicity we only represent those referees that have made
assessments in L. In Figure 1(a) there is one node,
representing the tutor, who has made the only assessment, over the
assignment ex1, and there are no links to other nodes as no one
else has assessed anything. In (b) student Dave assesses the
same exercise as the tutor and thus a link is created between
them. The trust value w(tutor, Dave) = T_D(tutor, Dave) is
high since their marks were similar. In (c) a new assessment
by Dave is added to L with no consequences in the graph
construction. In (d) student Patricia adds an assessment on
ex2 that allows us to build a direct trust between Dave and
Patricia and an indirect trust between the tutor and
Patricia, through Dave. The automated assessments generated
in case (d) are: ⟨5, 5⟩ for exercise 1 (which preserves the
tutor's assessment) and ⟨3.7, 3.7⟩ for exercise 2 (which uses a
weighted aggregation of the peers' assessments).</p>
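      <p>The figures reported for case (d) can be re-derived numerically. A sketch, assuming the similarity of Section 2.1 with θ = 10 and n = 2 criteria:</p>

```python
# Re-deriving the automated mark <3.7, 3.7> for exercise 2 in case (d).
theta, n = 10.0, 2

def sim(m, m_prime):
    return 1.0 - sum(abs(a - b) for a, b in zip(m, m_prime)) / (n * theta)

w_dave = sim((5, 5), (6, 6))               # direct trust tutor-Dave
w_patricia = w_dave * sim((2, 2), (8, 8))  # indirect trust tutor-Patricia, via Dave
mark_ex2 = (w_dave * 2 + w_patricia * 8) / (w_dave + w_patricia)
```

      <p>This gives w(tutor, Dave) = 0.9, w(tutor, Patricia) = 0.36 and a weighted mark of about 3.71 on each criterion, reported as 3.7.</p>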
      <p>Note that the trust graph built from L is not necessarily
connected. A tutor wants to reach a point in which the graph
is totally connected because that means that the
collaborative assessment algorithm generates an assessment for every
assignment. Figure 2 shows an example of a trust graph of
a particular learning community involving 50 peer students
and a tutor. When S has a history of 5 tutor assessments
and 25 student assessments (|L| = 30), we observe that not
all nodes are connected. As the number of assessments
[Figure 1: (a) L = {ε_ex1^tutor = ⟨5,5⟩}; (b) L = {ε_ex1^tutor = ⟨5,5⟩, ε_ex1^dave = ⟨6,6⟩};
(c) L = {ε_ex1^tutor = ⟨5,5⟩, ε_ex1^dave = ⟨6,6⟩, ε_ex2^dave = ⟨2,2⟩};
(d) L = {ε_ex1^tutor = ⟨5,5⟩, ε_ex1^dave = ⟨6,6⟩, ε_ex2^dave = ⟨2,2⟩, ε_ex2^patricia = ⟨8,8⟩}]
[Figure 2: (a) |L| = 30; (b) |L| = 200]
increases, the trust graph becomes denser and eventually it
gets completely connected. In (b) and (c) we see a complete
graph.</p>
    </sec>
    <sec id="sec-2c">
      <title>3. EXPERIMENTAL PLATFORM AND EVALUATION</title>
      <p>In this section we describe how we generate simulated
social networks, describe our experimental platform, define our
benchmarks and discuss experimental results.</p>
    </sec>
    <sec id="sec-3">
      <title>3.1 Social Network Generation</title>
      <p>
        Several models for social network generation have been
proposed, reflecting different characteristics present in real social
communities. Topological and structural features of such
networks have been explored in order to understand which
generating model best resembles the structure of real
communities [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>A social network can be defined as a graph N where the set
of nodes represents the individuals of the network and the
set of edges represents connections or social ties among those
individuals. In our case, individuals are the members of the
learning community: the tutor and students. Connections
represent the social ties and they are usually the result of
interactions in the learning community. For instance, a social
relation will be born between two students if they interact
with each other, say by collaboratively working on a project
together. In our experimentation, we rely on the social
network in order to simulate which student will assess the
assignment of which other student. We assume students will
assess the assignments of students they know, as opposed
to picking random assignments. As such, we clarify that
social networks are different from the trust graph of
Section 2. While the nodes of both graphs are the same, edges
[Figure 2: (c) |L| = 400]
of the social network represent social ties, whereas edges in
the trust graph represent how much one referee trusts
another in judging others' work.</p>
      <p>
        To model social networks where relations represent social
ties, we follow three different approaches: the Erdős–Rényi
model for random networks [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], the Barabási–Albert model
for power law networks [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and a hierarchical model for
cluster networks.
      </p>
      <sec id="sec-3-1">
        <title>3.1.1 Random Networks</title>
        <p>The Erdős–Rényi model for random networks consists of a
graph containing n nodes connected randomly. Each
possible edge between two vertices may be included in the graph
with probability p and may not be included with probability
(1 − p). In addition, in our case there is always an edge
between the node representing the tutor and the rest of the nodes,
as the tutor knows all of its students (and may eventually
mark any of those students).</p>
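        <p>The generation scheme just described (Erdős–Rényi links among the students, plus a tutor linked to everyone) can be sketched as:</p>

```python
# Sketch of random (Erdos-Renyi) network generation with a tutor hub.
import random

def random_network(n_students=50, p=0.5, seed=0):
    rng = random.Random(seed)
    nodes = list(range(n_students + 1))                  # node 0 is the tutor
    edges = {(0, s) for s in range(1, n_students + 1)}   # tutor knows everyone
    for u in range(1, n_students + 1):
        for v in range(u + 1, n_students + 1):
            if rng.random() < p:                         # keep edge with prob. p
                edges.add((u, v))
    return nodes, edges
```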
        <p>The degree distribution of random graphs follows a Poisson
distribution. Figure 3(a) shows an example of a random
graph with 51 nodes and p = 0.5, and its degree distribution.
Note that the point with degree 50 represents the tutor node,
while the degrees of the rest of the nodes fit a Poisson distribution.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.1.2 Power Law Networks</title>
        <p>The Barabási–Albert model for power law networks bases
its graph generation on the notions of growth and
preferential attachment. The generation scheme is as follows.
Nodes are added one at a time. Starting with a small
number of initial nodes, at each time step we add a new node
with m edges linked to nodes already part of the network.
In our experiments, we start with m + 1 initial nodes. The
edges are not placed uniformly at random but preferentially,
in proportion to the degree of the network nodes. The
probability p that the new node is connected to a node i already
in the network depends on the degree k_i of node i, such
that: p = k_i / Σ_{j=1..n} k_j.</p>
        <p>As above, there is also always an
edge between the node representing the tutor and the rest
of the nodes.</p>
        <p>
          The degree distribution of this network follows a power law.
Figure 3(b) shows an example of a power law
graph with 51 nodes and m = 16, and its degree distribution.
The point with degree 50 describes the tutor node, while the
rest of the nodes closely resemble a power law distribution.
Recent empirical results on large real-world networks often
show, among other features, degree distributions
following a power law [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
        </p>
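        <p>A minimal sketch of the preferential-attachment scheme described above; for brevity, the always-present tutor edges of our experiments are omitted here:</p>

```python
# Sketch of Barabasi-Albert generation: start from m+1 fully connected
# nodes, then attach each new node with m edges chosen with probability
# proportional to the current degrees.
import random

def power_law_network(n_nodes=51, m=16, seed=0):
    rng = random.Random(seed)
    nodes = list(range(m + 1))
    edges = {(i, j) for i in nodes for j in nodes if i < j}  # seed clique
    degree = {v: m for v in nodes}
    for new in range(m + 1, n_nodes):
        # Degree-proportional sampling: each node appears degree[v] times.
        population = [v for v in nodes for _ in range(degree[v])]
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(population))
        for t in targets:
            edges.add((t, new))
            degree[t] += 1
        degree[new] = m
        nodes.append(new)
    return nodes, edges
```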
      </sec>
      <sec id="sec-3-3">
        <title>3.1.3 Cluster Networks</title>
        <p>As our focus is on learning communities, we also experiment
with a third type of social network: the cluster network,
which is based on the notions of groups and hierarchy. Such
networks consist of a graph composed of a number of fully
connected clusters (where we believe clusters may represent
classrooms or similar pedagogical entities). Additionally,
as above, all the nodes are connected with the tutor node.
Figure 3(c) shows an example of a cluster graph with 51
nodes and 5 clusters of 10 nodes each, and its degree distribution.
The point with degree 50 describes the tutor, while the rest
of the nodes have degree 10, since every student is fully
connected with the rest of his/her classroom.</p>
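        <p>The cluster topology can be sketched as follows (node 0 stands for the tutor; parameters follow the example above):</p>

```python
# Sketch of a cluster network: fully connected clusters of students,
# plus a tutor (node 0) connected to every student.
def cluster_network(n_clusters=5, cluster_size=10):
    students = list(range(1, n_clusters * cluster_size + 1))
    edges = {(0, s) for s in students}            # tutor knows every student
    for c in range(n_clusters):
        members = students[c * cluster_size:(c + 1) * cluster_size]
        for i, u in enumerate(members):
            for v in members[i + 1:]:
                edges.add((u, v))                 # fully connect the cluster
    return edges
```

        <p>Each student then has degree 10: the 9 classmates plus the tutor, as in Figure 3(c).</p>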
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3.2 Experimental Platform</title>
      <p>In our experimentation, given an initial automated
assessment state S = ⟨R, A, C, L⟩ with an empty set of assessments
L = {}, we want to simulate tutor and peer assessments
so that the collaborative assessment method can eventually
generate a reliable and definitive set of assessments for all
assignments.</p>
      <p>To simulate assessments, we say each student is defined by
a profile that describes how good his/her assessments are. The
profile is essentially defined by the measure, or distance, d ∈
[0, 1] that specifies how close the student's assessments are
to those of the tutor.</p>
      <p>We then assume the simulator knows how the tutor and each
student would assess an assignment. This becomes necessary
in our simulation, since we generate student assessments in
terms of their distance to the tutor's, even if the
tutor does not choose to actually assess the assignment in
question. This simulator's knowledge of the values of all
possible assessments is generated as follows:</p>
      <p>For every assignment a ∈ A, we calculate the tutor's
assessment, which is randomly generated according to
the function f_τ : A → [0, θ]^n. This assessment
essentially describes what mark the tutor would give to a, if
the tutor decided to assess it.</p>
      <p>For every assignment a ∈ A, we also calculate the
assessment of each student s ∈ S. This is calculated
according to the function f_s : A → [0, θ]^n, such that
sim(f_τ(a), f_s(a)) = 1 − d_s.
[Figure 3: (a) Random Network (approx. graph density 0.5);
(b) Power Law Network (approx. graph density 0.5);
(c) Cluster Network (approx. graph density 0.2)]
We note that we only need
to calculate s's assessment of a if the student who
submitted the assignment is a neighbour of s in N.
We note that the above only calculates what the assessments
would be, if referees were to assess assignments.</p>
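      <p>One simple way to realise f_s at a given profile distance d is to shift each tutor mark by d·θ and clip to [0, θ], so that (absent clipping) sim(f_τ(a), f_s(a)) = 1 − d. This shifting scheme is our assumption; the paper only fixes the distance:</p>

```python
# Sketch: derive a student's assessment from the tutor's at distance d.
def student_assessment(tutor_marks, d, theta=10.0):
    # Shift every mark by d * theta, clipped to the valid range [0, theta].
    return tuple(min(theta, max(0.0, m + d * theta)) for m in tutor_marks)
```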
    </sec>
    <sec id="sec-5">
      <title>3.3 Benchmark</title>
      <p>Given an initial automated assessment state S = ⟨R, A, C, L⟩
with an empty set of assessments L = {}, a set of student
profiles Pr = {d_s}_{s ∈ S}, and a social network N (whose
set of nodes is R), we simulate individual tutor and
students' assessments. When a referee in R assesses an
assignment in A is explained shortly. However, we note here
that the value of each generated assessment is equivalent to
that calculated for the simulator's knowledge (see Section 3.2
above).</p>
      <p>In our benchmark, we consider the three types of social
networks introduced earlier: random social networks (with 51
nodes, p = 0.5, and approximate density of 0.5), power law
networks (with 51 nodes, m = 16, and approximate density
of 0.5), and cluster networks (with 51 nodes, 5 clusters of 10
nodes each, and approximate density of 0.2). Examples of
these generated networks are shown in Figure 3.
We say one assignment is submitted by each student,
resulting in |S| = 50 and |A| = 50. The range within which a referee
(tutor or student) may mark a given assignment with
respect to a given criterion is [0, 10]. And the set of criteria is
C = ⟨speed, maturity⟩. The criteria essentially measure the
speed of playing a musical piece, and the maturity level of
the student's performance.</p>
      <p>An assessment profile is generated for each student at the
beginning of the execution, resulting in a set of student
profiles Pr = {d_s}_{s ∈ S}, where d ∈ [0, 0.5]. We consider here two
cases for generating the set of student profiles Pr: a first
case where d is picked randomly following a power law
distribution (Figure 4(a)), and a second case where d is picked
randomly following a uniform distribution (Figure 4(b)).
With simulated individual assessments, we then run the
collaborative assessment method in order to compute an
automated assessment. We also compute the `error' of the
collaborative assessment method, whose range is [0, 1], over
the set of assignments A as follows:</p>
      <p>error = (1/|A|) Σ_{a ∈ A} (1 − sim(f_τ(a), μ(a)))</p>
      <p>where μ(a) describes the automated assessment for a given
assignment a ∈ A.
[Figure 4: (a) Power law profile generation; (b) Uniform profile generation]
With the settings presented above, we run two different
experiments. The results presented are an average over 50
executions. The two experiments are presented next.
In experiment 1, students provide their assessments before
the tutor. Each student provides assessments for a
randomly chosen number a of peer assignments (of course, where
assignments are those of their neighboring peers in N). We
run the experiment for 5 different values of a = {3, 4, 5, 6, 7}.
After the students provide their assessments, the tutor starts
assessing assignments incrementally. After every tutor
assessment, the error over the set of automated assessments is
calculated. Notice that the collaborative assessment method
takes the tutor's assessment, when it exists, to be the final
assessment. As such, the number of automated assessments
calculated based on aggregating students' assessments is
reduced over time. Finally, when the tutor has assessed all 50
students, the resulting error is 0.</p>
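      <p>The error measure of Section 3.3 can be sketched as follows, assuming the mean over assignments of 1 − sim between the tutor's would-be mark f_τ(a) and the automated mark μ(a):</p>

```python
# Sketch of the error of the collaborative assessment method, in [0, 1].
def error(tutor_marks, automated_marks, theta=10.0):
    """Both arguments map assignment ids to tuples of marks."""
    def sim(m, m_prime):
        n = len(m)
        return 1.0 - sum(abs(a - b) for a, b in zip(m, m_prime)) / (n * theta)
    return sum(
        1.0 - sim(tutor_marks[a], automated_marks[a]) for a in tutor_marks
    ) / len(tutor_marks)
```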
      <p>In experiment 2, the tutor provides its assessments before
the students. The tutor in this experiment will assess a
randomly chosen number of assignments, where this
number is based on the percentage a of the total number of
assignments. We run the experiment for 4 different values
of a = {5, 10, 15, 20}. After the tutor provides its
assessments, students' assessments are performed. In every
iteration, a student s randomly selects a neighbor in N and
assesses his/her assignment (in case it has not been assessed before
by s; otherwise another connected peer is chosen). We note
that in the case of random and power law networks (denser
networks), a total number of 1000 student assessments is
performed, whereas in the case of cluster networks (a looser
network), a total of 400 student assessments is performed.
We note that initially the trust graph is not fully connected,
so the service is not able to provide automated assessments
for all assignments. When the graph gets fully connected, the
service generates automated assessments for all assignments,
and we start measuring the error after every new iteration.</p>
    </sec>
    <sec id="sec-6">
      <title>3.4 Evaluation</title>
      <p>In experiment 1, we observe (Figure 5) that the error
decreases when the number of tutor assessments increases, as
expected, until it reaches 0 when the tutor has assessed all 50
students. This decrease is quite stable and we do not
observe abrupt error variations or important error increments
from one iteration to the next. More variations are observed
in the initial iterations, since the service has only a few
assessments from which to deduce the weights of the trust graph and to
calculate the final outcome.</p>
      <p>In the case of experiment 2 (Figure 6), the error diminishes
slowly as the number of student assessments increases,
although it never reaches 0. Since the number of tutor
assessments is fixed in this experiment, we have an error threshold
(a lower bound) which is linked to the students' assessment
profiles: the closer they are to the tutor's, the lower this threshold will
be. In fact, in both experiments we observe that when using
a power law distribution profile (Figure 4(a)) the automated
assessment error is lower than when using a uniform
distribution profile (Figure 4(b)). This is because, when using a
power law distribution, more student profiles are generated
whose assessments are closer to the tutor's.</p>
      <p>In general, the error trends observed in all experiments
comparing different social network scenarios (random, cluster or
power law) show a similar behavior. Taking a closer look at
experiment 2, cluster social graphs have the lowest error, and
we observe that assessments on all assignments are achieved
earlier (that is, the trust graph gets connected earlier). We
attribute this to the topology of the fully connected
clusters, which favors the generation of indirect edges earlier
in the graph between the tutor and the nodes of each
cluster. Power law social graphs have lower error than random
networks in most cases. This can be attributed to the
criterion of preferential attachment in their network generation,
which favors the creation of some highly connected nodes.
[Figure 5: Experiment 1] [Figure 6: Experiment 2]
Such nodes are likely to be assessed more frequently, since
more peers are connected to them. Then, the automated
assessments of these highly connected peers are performed
with more available information, which could lead to more
accurate outcomes.</p>
    </sec>
    <sec id="sec-7">
      <title>4. DISCUSSION</title>
      <p>The collaborative assessment model proposed in this paper
is thought of as a support in the creation of intelligent
online learning applications that encourage student
interactions within communities of learners. It goes beyond
current tutor-student online learning tools by making students
participate in the learning process of the whole group,
providing mutual assessment and making the overall learning
process much more collaborative.</p>
      <p>The use of AI techniques is key for the future of online
learning communities. The application presented in this paper is
especially useful in the context of MOOCs: with a low
number of tutor assessments, and encouraging students to
interact and provide assessments among each other, direct and
indirect trust measures can be calculated among peers and
automated assessments can be generated.</p>
      <p>Several error indicators can be designed and displayed to the
tutor managing the course, which we leave for future work.
For example, the error indicators may inform the tutor which
assignments have not received any assessments yet, or which
deduced marks are considered unreliable. For example, a
deduced mark on a given assignment may be considered
unreliable if all the peer assessments that have been provided
for that assignment are considered not to be trusted by the
tutor, as they fall below a preselected acceptable trust
threshold. Alternatively, a reliability measure may also be assigned
to the computed trust measure T_D. For instance, if there
is only one assignment that has been assessed by α and β,
then the computed T_D(α, β) will not be as reliable as when
a number of assignments have been assessed by α and β. As such,
some reliability threshold may be used that defines the
minimum number of assignments that both α and β need
to assess for T_D(α, β) to be considered reliable. Observing
such error indicators, the tutor can decide to assess more
assignments, and as a result the error may improve or the set
of deduced assessments may increase. Finally, if the error
reaches a level of acceptance, the tutor can decide to
endorse and publish the marks generated by the collaborative
assessment method.</p>
      <p>Another interesting question for future work is presented
next. Missing connections might be detected in the trust
graph that would improve its connectivity or maximize the
number of direct edges. The question that follows then is:
what assignments should be suggested to which peers such
that the trust graph and the overall assessment outcome
would improve?
Additionally, future work may also study different approaches
for calculating the indirect trust value between two referees.
In this paper, we use the product operator. We suggest
studying a number of operators, and running an experiment to test
which is most suitable. To do such a test, we may
calculate the indirect trust values for edges that do have a direct
trust measure, and then see which approach for calculating
indirect trust gets closest to the direct trust measures.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgements</title>
      <p>This work is supported by the Agreement Technologies project
(CONSOLIDER CSD2007-0022, INGENIO 2010) and the
PRAISE project (EU FP7 grant number 388770).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] Praise project: http://www.iiia.csic.es/praise/.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Barabási</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Albert</surname>
          </string-name>
          .
          <article-title>Emergence of scaling in random networks</article-title>
          .
          <source>Science</source>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L.</given-names>
            <surname>de Alfaro</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Shavlovsky</surname>
          </string-name>
          .
          <article-title>CrowdGrader: Crowdsourcing the evaluation of homework assignments</article-title>
          .
          <source>Technical report 1308.5273, arxiv.org</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Erdős</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Rényi</surname>
          </string-name>
          .
          <article-title>On random graphs</article-title>
          .
          <source>Publicationes Mathematicae</source>
          ,
          <year>1959</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>E.</given-names>
            <surname>Ferrara</surname>
          </string-name>
          and
          <string-name>
            <given-names>G.</given-names>
            <surname>Fiumara</surname>
          </string-name>
          .
          <article-title>Topological features of online social networks</article-title>
          .
          <source>Communications in Applied and Industrial Mathematics</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>L.</given-names>
            <surname>Godo</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Rodríguez</surname>
          </string-name>
          .
          <article-title>Logical approaches to fuzzy similarity-based reasoning: an overview</article-title>
          .
          <source>Preferences and Similarities</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>C.</given-names>
            <surname>Piech</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Do</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ng</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Koller</surname>
          </string-name>
          .
          <article-title>Tuned models of peer assessment in MOOCs</article-title>
          .
          <source>Proc. of the 6th International Conference on Educational Data Mining (EDM</source>
          <year>2013</year>
          ),
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>