<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Reputation in the Academic World</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nardine Osman</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carles Sierra</string-name>
          <email>sierrag@iiia.csic.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Artificial Intelligence Research Institute (IIIA-CSIC)</institution>
          ,
          <addr-line>Barcelona</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>With open access gaining momentum, open reviews become a more persistent issue. Institutional and multidisciplinary open access repositories play a crucial role in knowledge transfer by enabling immediate accessibility to all kinds of research output. However, they still lack the quantitative assessment of the hosted research items that would facilitate the process of selecting the most relevant and distinguished content. This paper addresses this issue by proposing a computational model based on peer reviews for assessing the reputation of researchers and their research work. The model is developed as an overlay service to existing institutional or other repositories. We argue that by relying on peer opinions, we address some of the pitfalls of current approaches for calculating the reputation of authors and papers. We also introduce a much-needed feature for review management: calculating the reputation of reviews and reviewers.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Motivation</title>
      <p>our module can be evaluated by an unlimited number of peers that offer not only a qualitative assessment in
the form of text, but also quantitative measures to build the work's reputation. Crucially, our open peer review
module also includes a reviewer reputation system based on the assessment of reviews themselves, both by the
community of users and by other peer reviewers. This allows for a sophisticated scaling of the importance of
each review on the overall assessment of a research work, based on the reputation of the reviewer.</p>
      <p>As a result of calculating the reputation of authors, reviewers, papers, and reviews, by relying on peer opinions,
we argue that the model addresses some of the pitfalls of current approaches for calculating the reputation of
authors and papers. It also introduces a much needed feature for review management, and that is calculating
the reputation of reviews and reviewers. This is discussed further in the concluding remarks.</p>
      <p>In what follows, we present the ARM reputation model and how it quantifies the reputation of papers,
authors, reviewers, and reviews (Section 2), followed by some evaluation where we use simulations to evaluate
the correctness of the proposed model (Section 3), before closing with some concluding remarks (Section 4).</p>
      <p>2 ARM: The Academic Reputation Model</p>
      <sec id="sec-1-1">
        <title>2.1 Data and Notation</title>
        <p>In order to compute reputation values for papers, authors, reviewers, and reviews we require a Reputation Data
Set, which in practice should be extracted from existing paper repositories.</p>
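        <p>Definition 2.1 (Data). A Reputation Data Set is a tuple <inline-formula><tex-math>\langle P, R, E, D, a, o, v \rangle</tex-math></inline-formula>, where</p>
        <p><inline-formula><tex-math>P = \{p_i\}_{i \in P}</tex-math></inline-formula> is a set of papers (e.g. DOIs).</p>
        <p><inline-formula><tex-math>R = \{r_j\}_{j \in R}</tex-math></inline-formula> is a set of researcher names or identifiers (e.g. the ORCID identifier).</p>
        <p><inline-formula><tex-math>E = \{e_i\}_{i \in E} \cup \{\bot\}</tex-math></inline-formula> is a totally ordered evaluation space, where <inline-formula><tex-math>e_i \in \mathbb{N} \setminus \{0\}</tex-math></inline-formula> and <inline-formula><tex-math>e_i &lt; e_j</tex-math></inline-formula> iff <inline-formula><tex-math>i &lt; j</tex-math></inline-formula>, and <inline-formula><tex-math>\bot</tex-math></inline-formula>
stands for the absence of evaluation. We suggest the range [0,100], although any other range may be used,
and the choice of range will not affect the performance.</p>
        <p><inline-formula><tex-math>D = \{d_k\}_{k \in K}</tex-math></inline-formula> is a set of evaluation dimensions, such as originality, technical soundness, etc.</p>
        <p><inline-formula><tex-math>a : P \rightarrow 2^{R}</tex-math></inline-formula> is a function that gives the authors of a paper.</p>
        <p><inline-formula><tex-math>o : R \times P \times D \times \mathit{Time} \rightarrow E</tex-math></inline-formula>, where <inline-formula><tex-math>o(r, p, d, t) \in E</tex-math></inline-formula>, is a function that gives the opinion of a reviewer, as a
value in E, on a dimension d of a paper p at a given instant of time t.</p>
        <p><inline-formula><tex-math>v : R \times R \times P \times \mathit{Time} \rightarrow E</tex-math></inline-formula>, where <inline-formula><tex-math>v(r, r', p, t) = e</tex-math></inline-formula>, is a function that gives the judgement of researcher
r over the opinion of researcher r', on paper p, as a value <inline-formula><tex-math>e \in E</tex-math></inline-formula>.<sup>1</sup> Therefore, a judgement is a reviewer's
opinion about another reviewer's opinion. Note that while opinions about a paper are made with respect to
a given dimension in D, judgements are not related to dimensions. We assume a judgement is only made
with respect to one dimension, which describes how good the review is in general.</p>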
        <p>We will not include the dimension (or the criterion being evaluated, such as originality, soundness, etc.) in the
equations, to simplify the notation. There are no interactions among dimensions, so the set of equations applies to
each of the dimensions under evaluation.</p>
        <p>We will also omit the reference to time in all the equations. Time is essential, as all measures are dynamic
and thus evolve over time. We will make the simplifying assumption that all opinions and judgements are
maintained in time, that is, they are not modified. Including time would not change the essence of the equations;
it would simply make the computation heavier.</p>
        <p>Finally, if a data set allowed for papers, reviews, and/or judgements to have different versions, then our model
would simply consider the latest version only.</p>
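        <p>As an illustration only, the tuple of Definition 2.1 could be held in a small Python structure such as the one below. The class name ReputationData and all of its field names are hypothetical, None stands in for the absence of evaluation, and the dimension and time arguments are dropped, following the simplifications stated above.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

# None plays the role of the "absence of evaluation" value.
Score = Optional[int]


@dataclass
class ReputationData:
    """Hypothetical container for a Reputation Data Set (one dimension, no time)."""
    papers: Set[str] = field(default_factory=set)                # P, e.g. DOIs
    researchers: Set[str] = field(default_factory=set)           # R, e.g. ORCID iDs
    authors: Dict[str, Set[str]] = field(default_factory=dict)   # a: paper -> authors
    # o: (reviewer, paper) -> value in E
    opinions: Dict[Tuple[str, str], int] = field(default_factory=dict)
    # v: (judge, reviewer, paper) -> value in E
    judgements: Dict[Tuple[str, str, str], int] = field(default_factory=dict)

    def o(self, r: str, p: str) -> Score:
        """Opinion of reviewer r on paper p, or None if r has not reviewed p."""
        return self.opinions.get((r, p))

    def v(self, r: str, r2: str, p: str) -> Score:
        """Judgement of r over r2's opinion on paper p, or None if absent."""
        return self.judgements.get((r, r2, p))
```
]]></preformat>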
        <p><sup>1</sup> In tools like ConfMaster (www.confmaster.net) this information could be gathered by simply adding a private question to each
paper review, answered with elements in E, one value in E for the judgement on each fellow reviewer's review.</p>
        <p>2.2 Reputation of a Paper</p>
        <p>We say the reputation of a paper is a weighted aggregation of its reviews, where the weight is the reputation of
the reviewer (Section 2.4).</p>
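        <p>
          <disp-formula>
            <label>(1)</label>
            <tex-math><![CDATA[R_P(p) = \begin{cases} \dfrac{\sum_{\forall r \in rev(p)} R_R(r) \cdot o(r, p)}{\sum_{\forall r \in rev(p)} R_R(r)} & \text{if } |rev(p)| \geq k \\ \bot & \text{otherwise} \end{cases}]]></tex-math>
          </disp-formula>
        </p>
        <p>where <inline-formula><tex-math>rev(p) = \{r \in R \mid o(r, p) \neq \bot\}</tex-math></inline-formula> denotes the reviewers of a given paper.</p>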
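        <p>A minimal Python sketch of this weighted aggregation follows. The function name paper_reputation and its argument names are hypothetical, None stands for the unknown reputation, and returning None when all reviewer weights are zero is an assumption, since the text does not cover that case.</p>
        <preformat><![CDATA[
```python
from typing import Dict, Optional


def paper_reputation(opinions: Dict[str, float],
                     reviewer_rep: Dict[str, float],
                     k: int = 2) -> Optional[float]:
    """Reputation of one paper.

    opinions maps each reviewer of the paper to her opinion o(r, p);
    reviewer_rep maps each reviewer to her reviewer reputation.
    Returns None (unknown) when fewer than k reviews exist.
    """
    if len(opinions) < k:
        return None
    total_weight = sum(reviewer_rep[r] for r in opinions)
    if total_weight == 0:
        return None  # assumption: all-zero weights are treated as unknown
    weighted = sum(reviewer_rep[r] * value for r, value in opinions.items())
    return weighted / total_weight
```
]]></preformat>
        <p>For instance, two reviewers with reputations 75 and 25 who give opinions 80 and 40 yield (75 · 80 + 25 · 40) / 100 = 70, so the better-reputed reviewer pulls the result towards her opinion.</p>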
        <p>Note that when a paper receives fewer than k reviews, its reputation is defined as unknown, or <inline-formula><tex-math>\bot</tex-math></inline-formula>. We currently
leave k as a parameter, though we suggest that k &gt; 1, so that the reputation of a paper is not dependent on a
single review. We also recommend small numbers for k, such as 2 or 3, because we believe it is usually difficult
to obtain reviews. As such, new papers can quickly start building a reputation.</p>
        <p>2.3 Reputation of an Author</p>
        <p>We consider that a researcher's author reputation is an aggregation of the reputation of her papers. The
aggregation is based on the concept that the impact of a paper's reputation on its authors' reputation is inversely
proportional to the total number of its authors. In other words, if one researcher is the sole author of a paper, then
this author is the only person responsible for this paper, and any (positive or negative) feedback about this paper
is propagated as is to its sole author. However, if the researcher has co-authored the paper with several other
researchers, then the impact (whether positive or negative) that this paper has on the researcher decreases with
the increasing number of co-authors. We argue that collaborating with different researchers usually increases the
quality of a research work, since the combined expertise of more than one researcher is always better than the
expertise of a single researcher. Nevertheless, the gain in a researcher's reputation decreases as the number of
co-authors increases. Hence, our model might cause researchers to be more careful when selecting their collaborators,
since they should aim at increasing the quality of the papers they produce in such a way that the gain for each
author is still larger than the gain she could have received had she worked on the same research problem on her
own. As such, adding authors who do not contribute to the quality of the paper will also be discouraged.</p>
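        <p>
          <disp-formula>
            <label>(2)</label>
            <tex-math><![CDATA[R_A(r) = \begin{cases} \dfrac{\sum_{\forall p \in pap(r)} \left[ \gamma(p)^{\gamma} \cdot R_P(p) + \left(1 - \gamma(p)^{\gamma}\right) \cdot 50 \right]}{|pap(r)|} & \text{if } pap(r) \neq \emptyset \\ \bot & \text{otherwise} \end{cases}]]></tex-math>
          </disp-formula>
        </p>
        <p>where <inline-formula><tex-math>pap(r) = \{p \in P \mid r \in a(p) \wedge R_P(p) \neq \bot\}</tex-math></inline-formula> denotes the papers authored by a given researcher r, <inline-formula><tex-math>\bot</tex-math></inline-formula> describes
ignorance, <inline-formula><tex-math>\gamma(p) = \frac{1}{|a(p)|}</tex-math></inline-formula> is the coefficient that takes into consideration the number of authors of a paper (recall
that a(p) denotes the authors of a paper p), and <inline-formula><tex-math>\gamma</tex-math></inline-formula> is a tuning factor that controls the rate of decrease of the
<inline-formula><tex-math>\gamma(p)</tex-math></inline-formula> coefficient. Also note the multiplication by 50, which describes ignorance, as 50 is the median of the chosen
range [0, 100]. If another range were chosen, the median of that range would be used here. The choice of range
and its median does not affect the performance of the model (i.e. the results of the simulation of Section 3 would
remain the same).</p>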
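        <p>The author aggregation can be sketched in Python as below. This is a hedged reading of the formula, assuming that the tuning factor (called gamma here) acts as an exponent on the coefficient 1/|a(p)|; the function name author_reputation and its arguments are hypothetical.</p>
        <preformat><![CDATA[
```python
from typing import List, Optional, Tuple


def author_reputation(papers: List[Tuple[float, int]],
                      gamma: float = 1.0,
                      ignorance: float = 50.0) -> Optional[float]:
    """Author reputation of one researcher.

    papers holds one (paper reputation, number of authors) pair for every
    paper of the researcher whose reputation is known.
    Returns None (unknown) when the researcher has no such paper.
    """
    if not papers:
        return None
    total = 0.0
    for paper_rep, n_authors in papers:
        # Assumption: the tuning factor is an exponent on 1 / |a(p)|.
        coeff = (1.0 / n_authors) ** gamma
        # The share not attributed to the author falls back to ignorance.
        total += coeff * paper_rep + (1.0 - coeff) * ignorance
    return total / len(papers)
```
]]></preformat>
        <p>With gamma = 1, a sole author of a paper with reputation 90 receives 90, whereas each of two co-authors of the same paper receives 0.5 · 90 + 0.5 · 50 = 70.</p>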
        <p>2.4 Reputation of a Reviewer</p>
        <p>Similar to the reputation of authors (Section 2.3), we consider that if a reviewer produces 'good' reviews, then
the reviewer is considered to be a 'reputed' reviewer. Furthermore, we consider that the reputation of a reviewer
is essentially an aggregation of the opinions over her reviews.<sup>2</sup></p>
        <p>We assume that the opinions on how good a review is can be obtained, in a first instance, from other reviewers
that also reviewed the same paper. However, as this is a new feature to be introduced in open access repositories
and conference and journal paper management systems, we believe collecting such information might take some
time. An alternative that we consider here is that in the meantime we can use the 'similarity' between reviews
as a measure of the reviewers' opinions about reviews. In other words, the heuristic could be phrased as 'if my
review is similar to yours then I may assume your judgement of my review would be good.'</p>
        <p><sup>2</sup> We assume a review can only be written by one reviewer, and as such, the number of co-authors of a review is not relevant as
it was when calculating the reputation of authors.</p>
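        <p>We write <inline-formula><tex-math>v^{*}(r_i, r_j, p) \in E</tex-math></inline-formula> for the 'extended judgement' of <inline-formula><tex-math>r_i</tex-math></inline-formula> over <inline-formula><tex-math>r_j</tex-math></inline-formula>'s opinion on paper p, and define it as an
aggregation of opinions and similarities as follows:</p>
        <p>
          <disp-formula>
            <label>(3)</label>
            <tex-math><![CDATA[v^{*}(r_i, r_j, p) = \begin{cases} v(r_i, r_j, p) & \text{if } v(r_i, r_j, p) \neq \bot \\ Sim(o(r_i, p), o(r_j, p)) & \text{if } o(r_i, p) \neq \bot \text{ and } o(r_j, p) \neq \bot \\ \bot & \text{otherwise} \end{cases}]]></tex-math>
          </disp-formula>
        </p>
      </sec>
      <sec id="sec-1-2">
        <p>where Sim stands for an appropriate similarity measure. We define the similarity between two opinions as 100 minus the
absolute difference between the two: <inline-formula><tex-math>Sim(o(r_i, p), o(r_j, p)) = 100 - |o(r_i, p) - o(r_j, p)|</tex-math></inline-formula>.</p>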
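        <p>The fallback from direct judgements to similarity can be written as a short Python sketch. The function names sim and extended_judgement are hypothetical, and None again stands for a missing value.</p>
        <preformat><![CDATA[
```python
from typing import Optional


def sim(o_i: float, o_j: float) -> float:
    """Similarity of two opinions on the [0, 100] range."""
    return 100 - abs(o_i - o_j)


def extended_judgement(v: Optional[float],
                       o_i: Optional[float],
                       o_j: Optional[float]) -> Optional[float]:
    """Extended judgement of reviewer i over reviewer j's review of a paper.

    v is the direct judgement (if any); o_i and o_j are the two reviewers'
    opinions on the same paper (if any).
    """
    if v is not None:
        return v                 # a direct judgement always takes priority
    if o_i is not None and o_j is not None:
        return sim(o_i, o_j)     # otherwise fall back to opinion similarity
    return None                  # nothing to base a judgement on
```
]]></preformat>
        <p>For example, with no direct judgement and opinions 80 and 65, the extended judgement is 100 - 15 = 85.</p>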
        <p>Given this, we consider that the overall opinion of a researcher on the capacity of another researcher to
make good reviews is calculated as follows. Consider the set of judgements of <inline-formula><tex-math>r_i</tex-math></inline-formula> over reviews made by <inline-formula><tex-math>r_j</tex-math></inline-formula> as:
<inline-formula><tex-math>V^{*}(r_i, r_j) = \{v^{*}(r_i, r_j, p) \mid v^{*}(r_i, r_j, p) \neq \bot \text{ and } p \in P\}</tex-math></inline-formula>. This set might be empty. Then, we define the judgement
of a reviewer over another one as a simple average:</p>
        <p>
          <disp-formula>
            <label>(4)</label>
            <tex-math><![CDATA[R_R(r_i, r_j) = \begin{cases} \dfrac{\sum_{\forall v \in V^{*}(r_i, r_j)} v}{|V^{*}(r_i, r_j)|} & \text{if } V^{*}(r_i, r_j) \neq \emptyset \\ \bot & \text{otherwise} \end{cases}]]></tex-math>
          </disp-formula>
        </p>
        <p>Finally, the reputation of a reviewer r, <inline-formula><tex-math>R_R(r)</tex-math></inline-formula>, is an aggregation of the judgements that her colleagues make about
her capability to produce good reviews. We weight this with the reputation of the colleagues as reviewers:</p>
        <p>
          <disp-formula>
            <label>(5)</label>
            <tex-math><![CDATA[R_R(r) = \begin{cases} \dfrac{\sum_{\forall r_i \in R^{*}} R_R(r_i) \cdot R_R(r_i, r)}{\sum_{\forall r_i \in R^{*}} R_R(r_i)} & \text{if } R^{*} \neq \emptyset \\ 50 & \text{otherwise} \end{cases}]]></tex-math>
          </disp-formula>
        </p>
        <p>where <inline-formula><tex-math>R^{*} = \{r_i \in R \mid V^{*}(r_i, r) \neq \emptyset\}</tex-math></inline-formula>. When no judgements have been made over r, we take the value 50 to
represent ignorance (as 50 is the median of the chosen range [0, 100]; again, we note that the choice of
range and its median does not affect the performance of the model; that is, the results of the simulation of
Section 3 would remain the same).</p>
      </sec>
      <sec id="sec-1-3">
        <p>Note that the reputation of a reviewer depends on the reputation of other reviewers. In other words, every
time the reputation of one reviewer changes, it triggers changes in the reputation of other reviewers, which
might lead to an infinite loop of modifying the reputation of reviewers. We address this by using an algorithm
similar to the EigenTrust algorithm, as illustrated by Algorithm 4 of the Appendix. In fact, this algorithm may
be considered a variation of the EigenTrust algorithm, which will require some testing to confirm how fast it
converges.</p>
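        <p>To make the circular dependency concrete, the following Python sketch repeats the weighted aggregation of reviewer reputations until the values stop changing. It is not Algorithm 4 of the Appendix, only a simple fixed-point iteration in the same spirit. The names reviewer_reputations, pairwise, tol and max_iter are hypothetical, starting every reviewer at 50 and updating all reviewers simultaneously are assumptions, and convergence is not guaranteed in general, hence the iteration cap.</p>
        <preformat><![CDATA[
```python
from typing import Dict, List, Tuple


def reviewer_reputations(pairwise: Dict[Tuple[str, str], float],
                         reviewers: List[str],
                         tol: float = 1e-6,
                         max_iter: int = 1000) -> Dict[str, float]:
    """Iterate reviewer reputations until they change by less than tol.

    pairwise[(ri, rj)] is the average extended judgement of ri over the
    reviews of rj; missing pairs mean ri never judged rj.
    """
    rep = {r: 50.0 for r in reviewers}  # assumption: start from ignorance
    for _ in range(max_iter):
        new_rep = {}
        for r in reviewers:
            judges = [ri for ri in reviewers if (ri, r) in pairwise]
            weight = sum(rep[ri] for ri in judges)
            if not judges or weight == 0:
                new_rep[r] = 50.0  # no usable judgement: ignorance value
            else:
                new_rep[r] = sum(rep[ri] * pairwise[(ri, r)]
                                 for ri in judges) / weight
        change = max((abs(new_rep[r] - rep[r]) for r in reviewers),
                     default=0.0)
        rep = new_rep
        if change < tol:
            break
    return rep
```
]]></preformat>
        <p>If such a loop proves too slow on a large community, it could be run periodically rather than after every new judgement.</p>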
        <p>2.5 Reputation of a Review</p>
        <p>The reputation of a review is similar to the one for papers but using judgements instead of opinions. We say
the reputation of a review is a weighted aggregation of its judgements, where the weight is the reputation of the
reviewer (Section 2.4).</p>
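        <p>
          <disp-formula>
            <label>(6)</label>
            <tex-math><![CDATA[R_O(r', p) = \begin{cases} \dfrac{\sum_{\forall r \in jud(r', p)} R_R(r) \cdot v^{*}(r, r', p)}{\sum_{\forall r \in jud(r', p)} R_R(r)} & \text{if } |jud(r', p)| \geq k \\ R_R(r') & \text{otherwise} \end{cases}]]></tex-math>
          </disp-formula>
        </p>
        <p>where <inline-formula><tex-math>jud(r', p) = \{r \in R \mid v^{*}(r, r', p) \neq \bot\}</tex-math></inline-formula> denotes the set of judges of a particular review written by r' on a
given paper p.</p>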
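        <p>A Python sketch of the review reputation, under the same conventions as for papers, is given below. The function name review_reputation and its arguments are hypothetical, and falling back to the review author's reviewer reputation when all judge weights are zero is an assumption not covered by the text.</p>
        <preformat><![CDATA[
```python
from typing import Dict


def review_reputation(x_judgements: Dict[str, float],
                      reviewer_rep: Dict[str, float],
                      review_author: str,
                      k: int = 2) -> float:
    """Reputation of the review written by review_author on one paper.

    x_judgements maps each judge to her extended judgement of the review;
    reviewer_rep maps each researcher to her reviewer reputation.
    """
    if len(x_judgements) < k:
        # Too few judges: inherit the author's reputation as a reviewer.
        return reviewer_rep[review_author]
    weight = sum(reviewer_rep[r] for r in x_judgements)
    if weight == 0:
        return reviewer_rep[review_author]  # assumption: same fallback
    return sum(reviewer_rep[r] * value
               for r, value in x_judgements.items()) / weight
```
]]></preformat>
        <p>For instance, judges with reputations 60 and 20 giving extended judgements 90 and 50 yield (60 · 90 + 20 · 50) / 80 = 80, while a review with a single judge and k = 2 simply takes its author's reviewer reputation.</p>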
        <p>Note that when a review receives fewer than k judgements, its reputation will not depend on the judgements,
but it will inherit the reputation of the author of the review (her reputation as a reviewer).</p>
        <p>We currently leave k as a parameter, though we suggest that k &gt; 1, so that the reputation of a review is not
dependent on a single judge. Again, we recommend small numbers for k, such as 2 or 3, because we believe it
will be difficult to obtain large numbers of judgements.</p>
        <p>2.6 A Note on Dependencies</p>
        <p>Figure 1 shows the dependencies between the different measures (reputation measures, opinions, and judgements).
The decision of when to re-calculate those measures is then based on those dependencies. We provide a summary
of this below. Note that measures in white are not calculated, but provided by the users. As such, we only discuss
those in grey (grey rectangles represent reputation measures, whereas the grey oval represents the extended
judgements).</p>
        <p>[Figure 1: dependencies between the measures. Nodes: opinion, judgement, x-judgement, Author Reputation, Paper Reputation, Reviewer Reputation, Review Reputation.]</p>
        <p>Author's Reputation. The reputation of an author depends on the reputation of her papers (Equation 2).
As such, every time the reputation of one of her papers changes, or every time a new paper is created, the
reputation of the author must be recalculated (Algorithm 2 of the Appendix).</p>
        <p>Paper's Reputation. The reputation of a paper depends on the opinions it receives, and the reputation
of the reviewers giving those opinions (Equation 1). As such, every time a paper receives a new opinion,
or every time the reputation of one of the reviewers changes, the reputation of the paper must be
recalculated (Algorithm 1 of the Appendix).</p>
        <p>Review's Reputation. The reputation of a review depends on the extended judgements it receives, and
the reputation of the reviewers giving those judgements (Equation 6). As such, every time a review receives a
new extended judgement, or every time the reputation of one of the reviewers changes, the reputation
of the review must be recalculated (Algorithm 5 of the Appendix).</p>
        <p>Reviewer's Reputation. The reputation of a reviewer depends on the extended judgements of other
reviewers and their reputation (Equation 5). As such, the reputation of the reviewer should be modified
every time there is a new extended judgement or the reputation of one of the reviewers changes. As the
reputation of a reviewer depends on the reputation of other reviewers, we suggest calculating the reputation
of all reviewers repeatedly (in a manner similar to EigenTrust) in order to converge (Algorithm 4 of the
Appendix). If this is computationally expensive, it can be computed once a day, as opposed to being
triggered by extended judgements and changes in reviewers' reputation.</p>
        <p>x-judgement. The extended judgement is calculated either based on judgements (if available) or the
similarity between opinions (when judgements are not available) (Equation 3). As such, the extended
judgement should be recalculated every time a new (direct) judgement is made, or every time a new opinion
is added on a paper which already has opinions by other reviewers (Algorithm 3 of the Appendix).</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Evaluation through Simulation</title>
      <p>3.1 Simulation</p>
      <p>To evaluate the effectiveness of the proposed model, we have simulated a community of researchers, using
NetLogo [TW04]. We clarify that the focus of this work is not implementing a simulation that models the real
world, but a simulation that allows us to verify our model. As such, many assumptions that we make for this
simulation, which will appear shortly, might not be precisely (or always) true in the real world (such as having the
true quality of a paper inherit the quality of the best author).</p>
      <p>In our simulation, a breed in NetLogo (or a node in the research community's graph) represents either a
researcher, a paper, a review, or a judgement. The relations between breeds are: (1) authors of, which specifies
which researchers are authors of a given paper; (2) reviewers of, which specifies which researchers are reviewers of
a given paper; (3) reviews of, which specifies which reviews give opinions on a given paper; (4) judgements of, which
specifies which judgements give opinions on a given review; and (5) judges of, which specifies which researchers
have judged which other researchers.</p>
      <p>Also, each researcher has four parameters that describe: (1) her reputation as an author, (2) her reputation
as a reviewer, (3) her true research quality; and (4) her true reviewing quality. The first two are calculated
by our ARM model, and they evolve over time. However, the last two describe the researcher's true quality
with respect to writing papers as well as reviewing papers or other reviews, respectively. In other words, our
simulation assumes true qualities exist, and that they are constant. In real life, there are no such measures.
Furthermore, how good one is at writing papers or writing reviews or making judgements naturally evolves with
time. Nevertheless, we chose to keep the simulation simple by sticking to constant true qualities, as the purpose
of the simulation is simply to evaluate the correctness of our ARM model.</p>
      <p>Similar to researchers, we say each paper has two parameters that describe it: (1) its reputation, which is
calculated by our ARM model, and it evolves over time; and (2) its true quality. Again, we assume that a paper's
true quality exists. How it is calculated is presented shortly.</p>
      <p>Reviews also have two parameters: (1) the opinion provided by the review, which in real life is set by the
researcher performing the review, while in our simulation it is calculated by the simulator, as illustrated shortly;
and (2) the reputation of the review, which is calculated by our ARM model and it evolves over time.</p>
      <p>Judgements, on the other hand, only have one parameter: the opinion provided by the judgement, which in
real life is set by the researcher judging a review, while in our simulation it is calculated by the simulator, as
illustrated shortly.</p>
      <p>Simulation starts at time zero with no researchers in the community, and hence, no papers, no reviews, and
no judgements. Then, with every tick of the simulation, a new paper is created, which may sometimes require
the creation of new researchers (either as authors or reviewers). With the new paper, reviews and judgements
are also created. How these elements are created is defined next by the simulator's parameters and methods,
which drive and control this behaviour. We note that a tick of the simulation does not represent a fixed unit of
calendar time, but the creation of one single paper.</p>
      <p>The ultimate aim of the evaluation is to investigate how close the calculated reputation values are to the
true values: the reputation of a researcher as an author, the reputation of a researcher as a reviewer, and the
reputation of a paper.</p>
      <p>The parameters and methods that drive and control the evolution of the community of researchers and the
evolution of their research work are presented below.</p>
      <p>1. Number of authors. Every time a new paper is created, the simulator assigns authors for this paper. How
many authors are assigned is defined by the number of authors parameter (#co authors), which is defined as
a Poisson distribution. For every new paper, a random number is generated from this Poisson distribution.
Who to assign is chosen randomly from the set of researchers, although sometimes, a new researcher is
created and assigned to this paper (see the 'researchers birth rate' below). This ensures the number of
researchers in the community grows with the number of papers.</p>
      <p>2. Number of reviewers. Every time a new paper is created, the simulator also assigns reviewers for this paper.
How many reviewers are assigned is defined by the number of reviewers parameter (#reviewers), which is
defined as a Poisson distribution. For every new paper, a random number is generated from this Poisson
distribution. As above, who to assign is chosen randomly from the set of researchers, although sometimes,
a new researcher is created and assigned to this paper.</p>
      <p>3. Researchers birth rate. As illustrated above, every paper requires authors and reviewers to be assigned to
it. When assigning authors and reviewers, the simulation will decide whether to assign an already existing
researcher (if any) or create a new researcher. This decision is controlled by the researchers birth rate
parameter (birth rate), which specifies the probability of creating a new researcher.</p>
      <p>4. Researcher's true research quality. The author's true quality is sampled from a beta distribution specified
by the parameters <inline-formula><tex-math>\alpha_A</tex-math></inline-formula> and <inline-formula><tex-math>\beta_A</tex-math></inline-formula>. We choose the beta distribution because it is a very versatile distribution
which can be used to model several different shapes of probability distributions by playing with only two
parameters, <inline-formula><tex-math>\alpha</tex-math></inline-formula> and <inline-formula><tex-math>\beta</tex-math></inline-formula>.</p>
      <p>5. Researcher's true review quality. The reviewer's true quality is sampled from a beta distribution specified
by the parameters <inline-formula><tex-math>\alpha_R</tex-math></inline-formula> and <inline-formula><tex-math>\beta_R</tex-math></inline-formula>. Again, the beta distribution is a very versatile distribution which can be
used to model several different shapes of probability distributions by playing with only two parameters, as
illustrated shortly by our experiments.</p>
      <p>6. Paper's true quality. We assume that a paper's true quality is the true quality of its best author, that is, the
author with the highest true research quality. We believe this assumption has some ground in real life. For
instance, some behaviour (such as looking for future collaborators, selecting who to give funding to, etc.)
assumes researchers to be of a certain quality, and their research work to follow that quality respectively.</p>
      <p>7. Opinion of a Review. The opinion presented by a review is specified as the paper's true quality plus some
noise, where the noise depends on the reviewer's true quality. This noise is chosen randomly from the range
<inline-formula><tex-math>[-(100 - \mathit{review\ quality})/2,\ +(100 - \mathit{review\ quality})/2]</tex-math></inline-formula>. In other words, the maximum noise that can be
added for the worst reviewer (whose review quality is 0) is 50, and the least noise that can be added for
the best reviewer (whose review quality is 100) is 0.</p>
      <p>8. Opinion of a Judgement. The value (or opinion) of a judgement on a review is calculated as the similarity
between the review's value (opinion) and the judge's review value (opinion), where the similarity is defined
by the metric distance as: <inline-formula><tex-math>100 - |\mathit{review} - \mathit{judge's\ review}|</tex-math></inline-formula>. Note that, for simplification, direct judgements
have not been simulated; we rely only on indirect judgements.</p>
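      <p>The sampling rules of items 4 to 8 can be sketched in Python as follows. This is not the NetLogo simulator used in the paper, and all function names are hypothetical. Two choices are assumptions rather than statements from the text: beta samples, which lie in [0, 1], are scaled to the [0, 100] range, and noisy opinions are clamped to that range.</p>
      <preformat><![CDATA[
```python
import random
from typing import Iterable


def true_quality(alpha: float, beta: float, rng: random.Random) -> float:
    """Sample a researcher's true quality from Beta(alpha, beta)."""
    # Assumption: scale the [0, 1] sample to the [0, 100] evaluation range.
    return 100 * rng.betavariate(alpha, beta)


def paper_true_quality(author_qualities: Iterable[float]) -> float:
    """A paper inherits the true research quality of its best author."""
    return max(author_qualities)


def review_opinion(paper_quality: float, review_quality: float,
                   rng: random.Random) -> float:
    """Paper's true quality plus noise that shrinks as review quality grows."""
    half_width = (100 - review_quality) / 2
    noisy = paper_quality + rng.uniform(-half_width, half_width)
    # Assumption: clamp to [0, 100]; the text does not say how overflow is handled.
    return min(100.0, max(0.0, noisy))


def judgement_opinion(review_value: float, judges_review_value: float) -> float:
    """Indirect judgement: similarity between the two reviews' opinions."""
    return 100 - abs(review_value - judges_review_value)
```
]]></preformat>
      <p>A perfect reviewer (review quality 100) therefore reports the paper's true quality exactly, while the worst reviewer (review quality 0) may be off by up to 50 points in either direction.</p>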
      <p>3.2 Results</p>
      <p>3.2.1 Experiment 1: The impact of the community's quality of reviewers</p>
      <p>Given the above, we ran the simulator for 100 ticks (generating 100 papers). We ran the experiment over 6
different cases. In each, we had the following parameters fixed:</p>
      <p>#co authors = 2</p>
      <p>#reviewers = 3</p>
      <p>birth rate = 3</p>
      <p><inline-formula><tex-math>\alpha_A = \beta_A = 1</tex-math></inline-formula></p>
      <p>k = 3 (of Equations 1 and 6)</p>
      <p><inline-formula><tex-math>\gamma = 1</tex-math></inline-formula> (of Equation 2)</p>
      <p>The only parameters that changed were those defining the beta distribution of the reviewers' qualities. This
experiment illustrates the impact of the community's quality of reviewers on the correctness of the ARM model.</p>
      <p>The results of the simulation are presented by Figure 2. For each case, the distribution of the reviewers' true
quality is illustrated to the right of the results. The results, in numbers, are also presented by Table 1. We notice
that the least error is presented when the reviewers are all of relatively good quality, with the majority being
great reviewers (Figure 2e). The errors start increasing as bad reviewers are added to the community (Figure 2c).
They increase even further in both cases, when the quality of reviewers follows a uniform distribution (Figure 2a),
as well as when the reviewers are equiprobably good or bad, with no average reviewers (Figure 2b). As soon
as the majority of reviewers are of poor quality (Figure 2d), the errors increase even further, with the worst
case being when good reviewers are absent from the community (Figure 2f). These results are not surprising.
A paper's true quality is not something that can be measured, or even agreed upon. As such, the trust model
depends on the opinions of other researchers. As a result, the better the reviewing quality of researchers, the
more accurate the trust model will be, and vice versa.</p>
      <p>The numbers of Table 1 illustrate how the error in the papers' reputation increases with the error in the
reviewers' reputation, though at a smaller rate. One curious thing about these results is the constant error in
the reputation of authors. The next experiment investigates this issue.</p>
      <p>Last, but not least, we note that the error is usually stable. This is because every time a paper is created, all
the reviews it receives and the judgements those reviews receive are created at the same simulation time-step.
In other words, it is not the case that papers accumulate more reviews and judgements over time, for the error
to decrease over time.</p>
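      <table-wrap>
        <label>Table 1</label>
        <table>
          <thead>
            <tr><th>Beta distribution of reviewers' quality</th><th>Error in Reviewers' Reputation</th><th>Error in Papers' Reputation</th><th>Error in Authors' Reputation</th></tr>
          </thead>
          <tbody>
            <tr><td><inline-formula><tex-math>\alpha_R = 5</tex-math></inline-formula> and <inline-formula><tex-math>\beta_R = 1</tex-math></inline-formula></td><td>11 %</td><td>2 %</td><td>22 %</td></tr>
            <tr><td><inline-formula><tex-math>\alpha_R = 2</tex-math></inline-formula> and <inline-formula><tex-math>\beta_R = 1</tex-math></inline-formula></td><td>23 %</td><td>5 %</td><td>23 %</td></tr>
            <tr><td><inline-formula><tex-math>\alpha_R = 1</tex-math></inline-formula> and <inline-formula><tex-math>\beta_R = 1</tex-math></inline-formula></td><td>30 %</td><td>7 %</td><td>23 %</td></tr>
            <tr><td><inline-formula><tex-math>\alpha_R = 0.1</tex-math></inline-formula> and <inline-formula><tex-math>\beta_R = 0.1</tex-math></inline-formula></td><td>34 %</td><td>5 %</td><td>22 %</td></tr>
            <tr><td><inline-formula><tex-math>\alpha_R = 1</tex-math></inline-formula> and <inline-formula><tex-math>\beta_R = 2</tex-math></inline-formula></td><td>44 %</td><td>8 %</td><td>23 %</td></tr>
            <tr><td><inline-formula><tex-math>\alpha_R = 1</tex-math></inline-formula> and <inline-formula><tex-math>\beta_R = 5</tex-math></inline-formula></td><td>60 %</td><td>9 %</td><td>20 %</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>In the second experiment, we investigate the impact of co-authorship on authors' reputation. We choose the
two extreme cases from experiment 1, when there are only relatively good reviewers in the community (<inline-formula><tex-math>\alpha_R = 5</tex-math></inline-formula>
and <inline-formula><tex-math>\beta_R = 1</tex-math></inline-formula>), and when there are only relatively bad reviewers in the community (<inline-formula><tex-math>\beta_R = 5</tex-math></inline-formula> and <inline-formula><tex-math>\alpha_R = 1</tex-math></inline-formula>). For each
of these cases, we then change the number of co-authors, investigating three cases: #co authors = {0, 1, 2}. All
other parameters remain set to those presented in experiment 1 above.</p>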
      <p>The results of this experiment are presented by Figure 3. The numbers are presented in Table 2. The results
show that the error in the reviewers' and papers' reputation barely changes for different numbers of
co-authors. However, the error in the reputation of authors does. When there are no co-authors (#co authors = 0),
the error in authors' reputation is almost equal to the error in papers' reputation (Figures 3a and 3b). As soon as
1 co-author is added (#co authors = 1), the error in authors' reputation increases (Figures 3c and 3d). When 2
co-authors are added (#co authors = 2), the error in authors' reputation reaches the maximum, around 20–22%
(Figures 3e and 3f). In fact, unreported results show that the error in authors' reputation is almost the same in
all cases for #co authors ≥ 2.</p>
    </sec>
    <sec id="sec-3">
      <title>Conclusion</title>
      <p>We have presented the ARM reputation model for the academic world. ARM helps calculate the reputation
of researchers, both as authors and reviewers, and their research work. Additionally, ARM also calculates the
reputation of reviews.</p>
      <p>Concerning the reputation of authors, the most commonly used reputation measure is currently the
h-index [Hir05]. However, the h-index has its flaws. For instance, the h-index can be manipulated through
self-citations [BK10, FR13]. A study has also found that the h-index does not provide a significantly more accurate
measure of impact than the total number of citations [Yon14]. ARM, on the other hand, bases the reputation of
authors on the opinions that their papers receive from other members of their academic community. We believe
this should be a more accurate approach, though future work should aim at comparing both approaches.</p>
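      <p>[Figure 2: results of Experiment 1. Panels: (a) <inline-formula><tex-math>\alpha_R = 1</tex-math></inline-formula> and <inline-formula><tex-math>\beta_R = 1</tex-math></inline-formula>; (b) <inline-formula><tex-math>\alpha_R = 0.1</tex-math></inline-formula> and <inline-formula><tex-math>\beta_R = 0.1</tex-math></inline-formula>; (c) <inline-formula><tex-math>\alpha_R = 2</tex-math></inline-formula> and <inline-formula><tex-math>\beta_R = 1</tex-math></inline-formula>; (d) <inline-formula><tex-math>\alpha_R = 1</tex-math></inline-formula> and <inline-formula><tex-math>\beta_R = 2</tex-math></inline-formula>; (e) <inline-formula><tex-math>\alpha_R = 5</tex-math></inline-formula> and <inline-formula><tex-math>\beta_R = 1</tex-math></inline-formula>; (f) <inline-formula><tex-math>\alpha_R = 1</tex-math></inline-formula> and <inline-formula><tex-math>\beta_R = 5</tex-math></inline-formula>. Each panel is shown next to the distribution of researchers w.r.t. review quality.]</p>
      <p>[Figure 3: results of Experiment 2. Panels: (a) <inline-formula><tex-math>\alpha_R = 5</tex-math></inline-formula>, <inline-formula><tex-math>\beta_R = 1</tex-math></inline-formula>, and #co authors = 0; (b) <inline-formula><tex-math>\alpha_R = 1</tex-math></inline-formula>, <inline-formula><tex-math>\beta_R = 5</tex-math></inline-formula>, and #co authors = 0; (c) <inline-formula><tex-math>\alpha_R = 2</tex-math></inline-formula>, <inline-formula><tex-math>\beta_R = 1</tex-math></inline-formula>, and #co authors = 1; (d) <inline-formula><tex-math>\alpha_R = 1</tex-math></inline-formula>, <inline-formula><tex-math>\beta_R = 2</tex-math></inline-formula>, and #co authors = 1; (e) <inline-formula><tex-math>\alpha_R = 5</tex-math></inline-formula>, <inline-formula><tex-math>\beta_R = 1</tex-math></inline-formula>, and #co authors = 2; (f) <inline-formula><tex-math>\alpha_R = 1</tex-math></inline-formula>, <inline-formula><tex-math>\beta_R = 5</tex-math></inline-formula>, and #co authors = 2. Each panel is shown next to the distribution of researchers w.r.t. review quality.]</p>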
      <p>Concerning the reputation of papers, the most common measure currently used is the total number of citations
a paper gets. Again, this measure can easily be manipulated through self-citations. [OSSM10] presents an
alternative approach based on the propagation of opinions in structural graphs. It allows papers to build
reputation either from the direct reviews they receive, or by inheriting reputation from the place where the paper is
published. In fact, a sophisticated propagation model is proposed to allow reputation to propagate upwards as
well as downwards in structural graphs (e.g. from a section to a chapter to a book, and vice versa). Simulations
presented in [OSMSMC12] illustrate the potential impact of this model. ARM does not have any notion of
propagation. The model is strictly based on direct opinions (reviews and judgements), and when no opinions are
present, ignorance is assumed (as in the default reputation of authors and papers).</p>
      <p>Concerning the reputation of reviews and reviewers, to our knowledge, these reputation measures have not
been addressed yet. Nevertheless, we believe these are important measures. Conference management systems are
witnessing a massive increase in paper submissions, and in many disciplines, finding good reviewers is becoming
a challenging task. Deciding which papers to accept or reject is sometimes a challenge for conference and workshop
organisers. ARM is a reputation model that addresses this issue by helping to distinguish good reviews and reviewers
from bad ones.</p>
      <p>The obvious next step for ARM is applying it to a real dataset. In fact, the model is
currently being integrated with two Spanish repositories: DIGITAL.CSIC (https://digital.csic.es) and e-IEO
(http://www.repositorio.ieo.es/e-ieo/). However, these repositories do not have any opinions or judgements
yet, and as such, time is needed to start collecting this data. We are also working with the IJCAI 2017
conference (http://ijcai-17.org) in order to allow reviewers to review each other. We will collect the data of this
conference, which will provide us with the reviews and judgements needed for evaluating our model. We will
also continue to look through existing datasets.</p>
      <p>Future work can investigate a number of additional issues. For instance, we plan to provide data on the
convergence performance of the algorithm. One can also study the different types of attacks that could impact
the proposed computational model. While the similarity of reviews is currently computed from the similarity of the
quantitative opinions, the similarity between qualitative opinions may also be used in future work by making use
of natural language processing techniques. Also, while we argue that direct opinions can help the model avoid
the pitfalls of the literature, it is also true that direct opinions are usually scarce. As such, if needed, other
information sources for opinions may also be considered, such as citations. This information can be translated
into opinions, and the equations of ARM should then change to give more weight to direct opinions than to other
information sources.</p>
      <p>Acknowledgements</p>
      <p>This work has been supported by CollectiveMind (a project funded by the Spanish Ministry of Economy &amp;
Competitiveness (MINECO), grant # TEC2013-49430-EXP), and Open Peer Review Module for Repositories
(a project funded by OpenAIRE, which in turn is an EU-funded project).</p>
      <sec id="sec-3-1">
        <title>[BK10] Christoph Bartneck and Servaas Kokkelmans. Detecting h-index manipulation through self-citation analysis. Scientometrics, 87(1):85–98, 2010.</title>
      </sec>
      <sec id="sec-3-2">
        <title>Gunther Eysenbach. Citation advantage of open access articles. PLoS Biology, 4(5):e157, May 2006.</title>
      </sec>
      <sec id="sec-3-3">
        <title>Emilio Ferrara and Alfonso E. Romero. Scientific impact evaluation and the effect of self-citations:</title>
        <p>Mitigating the bias by discounting the h-index. Journal of the American Society for Information
Science and Technology, 64(11):2332–2339, 2013.</p>
      </sec>
      <sec id="sec-3-4">
        <title>J. E. Hirsch. An index to quantify an individual's scientific research output. Proceedings of the</title>
        <p>National Academy of Sciences of the United States of America, 102(46):16569–16572, 2005.</p>
      </sec>
      <sec id="sec-3-5">
        <title>Nikolaus Kriegeskorte and Diana Deca, editors. Beyond open access: visions for open evaluation</title>
        <p>of scientific papers by post-publication peer review, Frontiers in Computational Neuroscience.</p>
      </sec>
      <sec id="sec-3-6">
        <title>Frontiers E-books, November 2012.</title>
      </sec>
      <sec id="sec-3-7">
        <title>[OSSM10] Nardine Osman, Carles Sierra, and Jordi Sabater-Mir. Propagation of opinions in structural</title>
        <p>graphs. In Proceedings of the 2010 Conference on ECAI 2010: 19th European Conference on</p>
      </sec>
      <sec id="sec-3-8">
        <title>Artificial Intelligence, pages 595–600, Amsterdam, The Netherlands, 2010. IOS</title>
      </sec>
      <sec id="sec-3-9">
        <title>Press.</title>
      </sec>
      <sec id="sec-3-10">
        <title>[TW04] Seth Tisue and Uri Wilensky. NetLogo: Design and implementation of a multi-agent modeling</title>
        <p>environment. In Proceedings of the Agent Conference, pages 161–184, 2004.</p>
      </sec>
      <sec id="sec-3-11">
        <title>[Yon14] Alexander Yong. Critique of Hirsch's citation index: A combinatorial Fermi problem. Notices of</title>
        <p>the American Mathematical Society, 61(11):1040–1050, 2014.</p>
      </sec>
      <sec id="sec-3-12">
        <title>Algorithm 1: Reputation of a paper</title>
        <p>Function ReputationPaper(p : P):[0,100] =</p>
        <p>Data: p : P /* a paper identifier */
Data: aut : P → R list /* function returning the list of authors of papers */
Data: o : (R × E) list /* list of evaluations of reviewers over paper p */
Data: k : integer /* minimum number of reviewers to compute non-default reputation, k &gt; 1 */
Result: RepPaper : [0, 100] /* the reputation value of paper p */
/* This function computes the reputation of a paper for a given dimension. It must be
called every time a new review is created for this paper, and every time the
reputation of one of the paper's reviewers is modified. */
rev = ∅;
for (r, e) ∈ o do
  if RR(r) ≠ null then
    rev = rev ∪ {(r, e)};
  end
end
if length(rev) &lt; k then
  RepPaper ← null;
else
  normal ← 0;
  for (r, e) ∈ rev do
    normal ← normal + ReputationReviewer(r);
  end
  num ← 0.0;
  for (r, e) ∈ rev do
    num ← num + ReputationReviewer(r) · e;
  end
  RepPaper ← num/normal;
end
return RepPaper;</p>
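        <p>The weighted average of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name reputation_paper, the dictionary reviewer_reputation (standing in for ReputationReviewer and RR), and the handling of a zero total weight are assumptions made for the example.</p>
        <preformat>
```python
from typing import Dict, List, Optional, Tuple


def reputation_paper(
    evaluations: List[Tuple[str, float]],
    reviewer_reputation: Dict[str, Optional[float]],
    k: int = 2,
) -> Optional[float]:
    """Reputation-weighted mean of the reviews of one paper, on a 0-100 scale."""
    # Keep only the reviews whose reviewer has a known (non-null) reputation.
    rev = [(r, e) for r, e in evaluations if reviewer_reputation.get(r) is not None]
    # Fewer than k usable reviews: the paper reputation stays undefined.
    if k > len(rev):
        return None
    normal = sum(reviewer_reputation[r] for r, _ in rev)
    # Assumption: the pseudocode does not say what happens when all weights are 0.
    if normal == 0:
        return None
    num = sum(reviewer_reputation[r] * e for r, e in rev)
    return num / normal
```
        </preformat>
        <p>For instance, with two reviewers of reputation 90 and 30 who rate a paper 80 and 40, the sketch returns (90 · 80 + 30 · 40)/120 = 70, so the opinion of the more reputable reviewer dominates.</p>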
      </sec>
      <sec id="sec-3-13">
        <title>Algorithm 2: Reputation of an author</title>
        <p>Function ReputationAuthor(r : R):[0,100] =</p>
        <p>Data: r : R /* a researcher identifier */
Data: pap : R → P list /* function returning the list of papers of authors */
Data: aut : P → R list /* function returning the list of authors of papers */
Data: alpha : real /* tuning factor for coefficient gamma */
Result: RepAuthor : [0, 100] /* the reputation value of author r */
/* This function computes the reputation of an author. It must be called every time a
new paper is created for this author, and every time the reputation of one of the
author's papers is modified. */
pap2 = ∅;
for p ∈ pap(r) do
  if RP(p) ≠ null then
    pap2 = pap2 ∪ {p};
  end
end
if pap2 ≠ ∅ then
  num ← 0;
  for p ∈ pap2 do
    gamma ← 1/length(aut(p)); /* length gives the length of a list */
    num ← num + exp(gamma, alpha) · ReputationPaper(p) + (1 − exp(gamma, alpha)) · 50;
  end
  RepAuthor = num/|pap2|;
else
  RepAuthor = null
end
return RepAuthor</p>
      </sec>
      <sec id="sec-3-14">
        <title>Algorithm 3: Auxiliary functions, used by Algorithms 4 and 5</title>
        <p>Function v*(ri : R, rj : R, p : P):[0,100]+null =</p>
        <p>Data: ri : R, rj : R /* researcher identifiers */
Data: p : P /* a paper identifier */
Data: obar : (R × E^k) list /* list of vector evaluations of reviewers over p */
Data: v : (R × R × E) list /* list of judgments over paper p */
Result: extjudge : [0, 100] + null /* extended judgment of ri on rj's opinion of p */
/* This function computes extended judgments. It must be called every time a new judgment is made, and
every time a new review is added on a paper which already has reviews by others. It is also called by
the AverageJudgment function below and the ReputationReview function of Algorithm 5. */
if ∃ e : (ri, rj, e) ∈ v then
  extjudge ← e
else
  if ∃ ebar, ebar' : (ri, ebar) ∈ obar and (rj, ebar') ∈ obar then
    extjudge ← sim(ebar, ebar')
  else
    extjudge ← null
  end
end
return extjudge</p>
        <p>Function sim(e : E^k, e' : E^k):[0,100] =</p>
        <p>Data: e : E^k, e' : E^k /* evaluation vectors */
Result: similar : [0, 100] + null /* difference */
/* This function computes the similarity between two vectors. It is only called by the v* function above. */
num ← 0;
num' ← 0;
den ← 0;
den' ← 0;
for i ∈ [1, k] do
  if e[i] ≠ null then
    num ← num + e[i];
    den ← den + 1;
  end
  if e'[i] ≠ null then
    num' ← num' + e'[i];
    den' ← den' + 1;
  end
end
if den ≠ 0 and den' ≠ 0 then
  x ← num/den;
  x' ← num'/den';
  similar ← 100 − |x − x'|;
else
  similar ← null;
end
return similar</p>
        <p>Function AverageJudgment(r : R, r' : R):[0,100]+null =</p>
        <p>Data: r : R, r' : R /* two researcher identifiers */
Result: AvgJudge : [0, 100] + null /* the average judgment of r over r''s opinions */
/* This function computes the average judgment of one reviewer over another. It is only called by the
ReputationReviewer function of Algorithm 4. */
judgements ← 0.0;
num ← 0.0;
for p ∈ P do
  if v*(r, r', p) ≠ null then
    judgements ← judgements + 1;
    num ← num + v*(r, r', p)
  end
end
if judgements ≠ 0.0 then
  AvgJudge ← num/judgements
else
  AvgJudge ← null
end
return AvgJudge</p>
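        <p>The two core auxiliary functions of Algorithm 3 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the names sim and extended_judgment, and the use of dictionaries for the judgment list v and the evaluation list obar of a single paper, are choices made for the example rather than part of the model.</p>
        <preformat>
```python
from typing import Dict, Optional, Sequence, Tuple

Vector = Sequence[Optional[float]]


def sim(e: Vector, e2: Vector) -> Optional[float]:
    """Similarity of two evaluation vectors: 100 minus the gap between their means."""
    a = [x for x in e if x is not None]
    b = [x for x in e2 if x is not None]
    # If either vector has no rated dimension, the similarity is undefined.
    if not a or not b:
        return None
    return 100 - abs(sum(a) / len(a) - sum(b) / len(b))


def extended_judgment(
    ri: str,
    rj: str,
    judgments: Dict[Tuple[str, str], float],
    evaluations: Dict[str, Vector],
) -> Optional[float]:
    """Extended judgment of ri on rj's opinion of one paper.

    judgments maps (judge, judged) to an explicit judgment;
    evaluations maps each reviewer of the paper to an evaluation vector.
    """
    # An explicit judgment always takes precedence.
    if (ri, rj) in judgments:
        return judgments[(ri, rj)]
    # Otherwise fall back on how similar the two reviews are.
    if ri in evaluations and rj in evaluations:
        return sim(evaluations[ri], evaluations[rj])
    return None
```
        </preformat>
        <p>For example, the vectors [80, null, 60] and [50, 70, null] have means 70 and 60, so their similarity is 100 − 10 = 90; this value is used as the extended judgment only when no explicit judgment exists.</p>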
      </sec>
      <sec id="sec-3-15">
        <title>Algorithm 4: Reputation of a reviewer</title>
        <p>Function ReputationReviewer(r : R):[0,100] =</p>
        <p>Data: r : R /* a researcher identifier */
Data: RepReviewer(r) : [0, 100] /* the reputation value of author r */
Result: RepReviewer(r) : [0, 100] /* the new reputation value of author r */
/* This function computes the reputation of a single reviewer. It is only called by
the function ReputationReviewers and itself, ReputationReviewer. */
den ← 0.0;
num ← 0.0;
for r' ∈ R, r' ≠ r do
  if AverageJudgment(r', r) ≠ null then
    den ← den + ReputationReviewer(r');
    num ← num + ReputationReviewer(r') · AverageJudgment(r', r);
  end
end
if den &gt; 0.0 then</p>
        <p>RepReviewer(r)
else
end
end
end
return RepReviewers;</p>
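        <p>A single update step of the reviewer reputation can be sketched in Python as follows. This is a hedged illustration only: the function name update_reviewer_reputation and its dictionary arguments are hypothetical, the recursive calls of the pseudocode are replaced by a lookup of the currently stored reputation values, and keeping the current value when no judgment is available is an assumption, not something stated in the algorithm.</p>
        <preformat>
```python
from typing import Dict, Optional, Tuple


def update_reviewer_reputation(
    r: str,
    reputation: Dict[str, float],
    average_judgment: Dict[Tuple[str, str], Optional[float]],
) -> float:
    """One update of r's reputation from the judgments of the other reviewers.

    reputation holds the current reputation of every reviewer;
    average_judgment maps (judge, judged) to the judge's average judgment.
    """
    den = 0.0
    num = 0.0
    for other, rep in reputation.items():
        if other == r:
            continue
        judged = average_judgment.get((other, r))
        if judged is not None:
            # Each judgment is weighted by the reputation of the judge.
            den += rep
            num += rep * judged
    # Assumption: with no usable judgments the current value is kept.
    return num / den if den > 0.0 else reputation[r]
```
        </preformat>
        <p>With judges of reputation 80 and 20 whose average judgments of a reviewer are 90 and 40, the step yields (80 · 90 + 20 · 40)/100 = 80. Repeating the step over all reviewers until the values stop changing is one possible way to resolve the mutual dependency between reviewer reputations.</p>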
        <p>Algorithm 5: Reputation of a review</p>
        <p>Function ReputationReview(r : R, p : P, k : integer):[0,100] =</p>
        <p>Data: r : R /* a researcher identifier */
Data: p : P /* a paper identifier */
Data: k : integer /* minimum number of judgments to compute non-default reputation review
value, k &gt; 0 */
Result: RepReview : [0, 100] /* the reputation value of the review of r over p */
/* This function computes the reputation of a particular review. It must be called
every time an extended judgment over that opinion of r is created or modified
(calculated by the function v* of Algorithm 3), and every time the reputation of
the author of the review is modified. */
jud = ∅;
for r' ∈ R, r' ≠ r do
  if v*(r', r, p) ≠ null ∧ RR(r') ≠ null then
    jud = jud ∪ {r'};
  end
end
den ← 0.0;
num ← 0.0;
if jud ≠ ∅ then
  for r' ∈ jud do
    den ← den + ReputationReviewer(r');
    num ← num + ReputationReviewer(r') · v*(r', r, p)
  end</p>
        <p>RepReview
else
RepReview</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [OSMSMC12]
          <string-name>
            <given-names>Nardine</given-names>
            <surname>Osman</surname>
          </string-name>
          , Jordi Sabater-Mir,
          <string-name>
            <given-names>Carles</given-names>
            <surname>Sierra</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Jordi</given-names>
            <surname>Madrenas-Ciurana</surname>
          </string-name>
          .
          <article-title>Simulating research behaviour</article-title>
          .
          <source>In Proceedings of the 12th International Conference on Multi-Agent-Based Simulation, MABS'11</source>
          , pages
          <fpage>15</fpage>
          –
          <lpage>30</lpage>
          , Berlin, Heidelberg,
          <year>2012</year>
          . Springer-Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <article-title>Algorithm 5: Reputation of a review</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>