=Paper=
{{Paper
|id=Vol-1778/AmILP_2
|storemode=property
|title=Reputation in the Academic World
|pdfUrl=https://ceur-ws.org/Vol-1778/AmILP_2.pdf
|volume=Vol-1778
|authors=Nardine Osman,Carles Sierra
|dblpUrl=https://dblp.org/rec/conf/ecai/OsmanS16a
}}
==Reputation in the Academic World==
Reputation in the Academic World Nardine Osman and Carles Sierra 1 Abstract. With open access gaining momentum, open reviews be- in repositories using our module can be evaluated by an unlimited comes a more persistent issue. Institutional and multidisciplinary number of peers that offer not only a qualitative assessment in the open access repositories play a crucial role in knowledge transfer form of text, but also quantitative measures to build the works reputa- by enabling immediate accessibility to all kinds of research output. tion. Crucially, our open peer review module also includes a reviewer However, they still lack the quantitative assessment of the hosted re- reputation system based on the assessment of reviews themselves, search items that will facilitate the process of selecting the most rel- both by the community of users and by other peer reviewers. This evant and distinguished content. This paper addresses this issue by allows for a sophisticated scaling of the importance of each review proposing a computational model based on peer reviews for assess- on the overall assessment of a research work, based on the reputation ing the reputation of researchers and their research work. The model of the reviewer. is developed as an overlay service to existing institutional or other As a result of calculating the reputation of authors, reviewers, pa- repositories. We argue that by relying on peer opinions, we address pers, and reviews, by relying on peer opinions, we argue that the some of the pitfalls of current approaches for calculating the reputa- model addresses some of the pitfalls of current approaches for calcu- tion of authors and papers. We also introduce a much needed feature lating the reputation of authors and papers. It also introduces a much for review management, and that is calculating the reputation of re- needed feature for review management, and that is calculating the views and reviewers. reputation of reviews and reviewers. This is discussed further in the concluding remarks. 1 MOTIVATION In what follows, we present the ARM reputation model and how it quantifies the reputation of papers, authors, reviewers, and reviews There has been a strong move towards open access repositories in the (Section 2), followed by some evaluation where we use simulations last decade or so. Many funding agencies — such as the UK Research to evaluate the correctness of the proposed model (Section 3), before Councils, Canadian funding agencies, American funding agencies, closing with some concluding remarks (Section 4). the European Commission, as well as many universities — are pro- moting open access by requiring the results of their funded projects to be published in open access repositories. It is a way to ensure that 2 ARM: ACADEMIC REPUTATION MODEL the research they fund has the greatest possible research impact. Aca- demics are also very much interested in open access repositories, as 2.1 Data and Notation this helps them maximise their research impact. In fact, studies have confirmed that open access articles are more likely to be used and In order to compute reputation values for papers, authors, review- cited than those sitting behind subscription barriers [2]. As a result, a ers, and reviews we require a Reputation Data Set, which in practice growing number of open access repositories are becoming extremely should be extracted from existing paper repositories. popular in different fields, such as PLoS ONE for Biology, arXiv for Definition 2.1 (Data). A Reputation data Set is a tuple Physics, and so on. hP, R, E, D, a, o, vi, where With open access gaining momentum, open reviews becomes a more persistent issue. Institutional and multidisciplinary open access • P = {pi }i2P is a set of papers (e.g. DOIs). repositories play a crucial role in knowledge transfer by enabling im- • R = {rj }j2R is a set of researcher names or identifiers (e.g. the mediate accessibility to all kinds of research output. However, they ORCHID identifier). still lack the quantitative assessment of the hosted research items that • E = {ei }i2E [ {?} is a totally ordered evaluation space, where will facilitate the process of selecting the most relevant and distin- ei 2 N \ {0} and ei < ej iff i < j and ? stands for the absence guished content. Common currently available metrics, such as num- of evaluation. We suggest the range [0,100], although any other ber of visits and downloads, do not reflect the quality of a research range may be used, and the choice of range will not affect the product, which can only be assessed directly by peers offering their performance. expert opinion together with quantitative ratings based on specific • D = {dk }k2K is a set of evaluation dimensions, such as original- criteria. The articles published in the Frontiers book [5] highlight the ity, technical soundness, etc. need for open reviews. • a : P ! 2R is a function that gives the authors of a paper. To address this issue we develop an open peer review module, the • o : R ⇥ P ⇥ D ⇥ T ime ! E, where o(r, p, d, t) 2 E is a Academic Reputation Model (ARM), as an overlay service to exist- function that gives the opinion of a reviewer, as a value in E, on a ing institutional or other repositories. Digital research works hosted dimension d of a paper p at a given instant of time t. 1 Artificial Intelligence Research Institute (IIIA-CSIC), Barcelona, Spain, • v : R ⇥ R ⇥ P ⇥ T ime ! E, where v(r, r0 , p, t) = e is a email: {nardine, sierra}@iiia.csic.es function that gives the judgement of researcher r over the opin- ion of researcher r0 , on paper p as a value e 2 E.2 Therefore, a expertise of more than one researcher is always better than the ex- judgement is a reviewer’s opinion about another reviewer’s opin- pertise of a single researcher. Nevertheless, the gain in a researcher’s ion. Note that while opinions about a paper are made with respect reputation decreases as the number of co-authors increase. Hence, to a given dimension in D, judgements are not related to dimen- our model might cause researchers to be more careful when select- sions. We assume a judgement is only made with respect to one ing their collaborators, since they should aim at increasing the quality dimension, which describes how good the review is in general. of the papers they produce in such a way that the gain for each author is still larger than the gain it could have received if it was to work on We will not include the dimension (or the criteria being evaluated, the same research problem on her own. As such, adding authors who such as originality, soundness, etc.) in the equations to simplify the do not contribute to the quality of the paper will also discouraged. notation. There are no interactions among dimensions so the set of equations apply to each of the dimensions under evaluation. R 8A (r)X = Also, we will also omit the reference to time in all the equations. > > (p) ⇥ RP (p) + (1 (p) ) ⇥ 50 Time is essential as all measures are dynamic and thus they evolve > < 8p2pap(r) along time. We will make the simplifying assumption that all opin- if pap(r) 6= ; > > |pap(r)| ions and judgements are maintained in time, that is, they are not mod- > :? otherwise ified. Including time would not change the essence of the equations, (2) it will simply make the computation complexity heavier. where pap(r) = {p 2 P | r 2 a(p) ^ RP (p) 6= ?} denotes Finally, if a data set allowed for papers, reviews, and/or judge- the papers authored by a given researcher r, ? describes ignorance, ments to have different versions, then our model simply considers 1 the latest version only. (p) = is the coefficient that takes into consideration the |a(p)| number of authors of a paper (recall that a(p) denotes the authors of 2.2 Reputation of a Paper a paper p), and is a tuning factor that controls the rate of decrease of the (p) coefficient. Also note the multiplication by 50, which de- We say the reputation of a paper is a weighted aggregation of its scribes ignorance, as 50 is the median of the chosen range [0, 100]. reviews, where the weight is the reputation of the reviewer. (Sec- If another range was chosen, the median of that range would be used tion 2.4). here. The choice of range and its median does not affect the perfor- mance of the model (i.e. the results of the simulation of Section 3 8 X would remain the same). > > RR (r) · o(r, p) > > > < 8r2rev(p) X if |rev(p)| k 2.4 Reputation of a Reviewer RP (p) = RR (r) (1) > > > > 8r2rev(p) Similar to the reputation of authors (Section 2.3), we consider that if a > : ? otherwise reviewer produces ‘good’ reviews, then the reviewer is considered to be a ‘reputed’ reviewer. Furthermore, we consider that the reputation where rev(p) = {r 2 R | o(r, p) 6= ?} denotes the reviewers of a of a reviewer is essentially an aggregation of the opinions over her given paper. reviews.3 Note that when a paper receives less that k reviews, its reputation We assume that the opinions on how good a review is can be is defined as unknown, or ?. We currently leave k as a parameter, obtained, in a first instance, by other reviewers that also reviewed though we suggest that k > 1, so that the reputation of a paper is not the same paper. However, as this is a new feature to be introduced dependent on a single review. We also recommend small numbers for in open access repositories and conference and journal paper man- k, such as 2 or 3, because we believe it is usually difficult to obtain agement systems, we believe collecting such information might take reviews. As such, new papers can quickly start building a reputation. some time. An alternative that we consider here is that in the mean- time we can use the ‘similarity’ between reviews as a measure of the reviewers opinions about reviews. In other words, the heuristic could 2.3 Reputation of an Author be phrased as ‘if my review is similar to yours then I may assume We consider that a researcher’s author reputation is an aggregation your judgement of my review would be good.’ of the reputation of her papers. The aggregation is based on the con- We note v ⇤ (ri , rj , p) 2 E for the ‘extended judgement’ of ri over cept that the impact of a paper’s reputation on its authors’ reputation rj ’s opinion on paper p, and define it as an aggregation of opinions is inversely proportional to the total number of its authors. In other and similarities as follows: words, if one researcher is the sole author of a paper, then this author is the only person responsible for this paper, and any (positive or neg- v ⇤ (ri , rj , p) = ative) feedback about this paper is propagated as is to its sole author. 8v discuss those in grey (grey rectangles represent reputation measures, > > < 8v2V ⇤ (r ,r ) whereas the grey oval represents the extended judgements). i j RR (ri , rj ) = if V ⇤ (ri , rj ) 6= ; (4) > > |V ⇤ (ri , rj )| > :? Paper Author otherwise opinion Reputation Reputation Finally, the reputation of a reviewer r, RR (r), is an aggregation of judgements that her colleagues make about her capability to produce good reviews. We weight this with the reputation of the colleagues as a reviewer: 8 X Reviewer > Reputation > RR (ri ) · RR (ri , r) > > > < 8ri 2R⇤ X R⇤ 6= ; RR (r) = R R i(r ) (5) x-judgment > > > > 8ri 2R⇤ > : 50 otherwise Review where R⇤ = {ri 2 R | V ⇤ (ri , r) 6= ;}. When no judgements have Reputation been made over r, we take the value 50 to represent ignorance (as 50 is the median of the chosen range [0, 100] — again, we note that any the choice of range and its median does not affect the performance of the model; that is, the results of the simulation of Section 3 would remain the same). judgment Note that the reputation of a reviewer depends on the reputation of other reviewers. In other words, every time the reputation of one reviewer will change, it will trigger changing the reputation of other Figure 1: Dependencies reviewers, which might lead to an infinite loop of modifying the rep- utation of reviewers. We address this by using an algorithm similar to the EigenTrust algorithm, as illustrated by Algorithm ?? of the • Author’s Reputation. The reputation of the author depends on Appendix. In fact, this algorithm may be considered as a variation of the reputation of its papers (Equation 2). As such, every time the the EigenTrust algorithm, which will require some testing to confirm reputation of one of his papers changes, or every time a new paper how fast it converges. is created, the reputation of the author must be recalculated. • Paper’s Reputation. The reputation of the paper depends on the opinions it receives, and the reputation of the reviewers giving 2.5 Reputation of a Review those opinions (Equation 1). As such, every time a paper receives The reputation of a review is similar to the one for papers but using a new opinion, or every time the reputation of one of the reviewers judgements instead of opinions. We say the reputation of a review changes, then the reputation of the paper must be recalculated . is a weighted aggregation of its judgements, where the weight is the • Review’s Reputation. The reputation of a review depends on the reputation of the reviewer (Section 2.4). extended judgements it receives, and the reputation of the review- ers giving those judgements (Equation 6). As such, every time a 8 X review receives a new extended judgements, or every time the rep- > > RR (r) · v ⇤ (r, r0 , p) utation of one of the reviewers changes, then the reputation of the > > > < 8r2jud(r0 ,p) X if |jud(r 0 , p)| k review must be recalculated. RO (r0 , p) = R R (r) • Reviewer’s Reputation. The reputation of a reviewer depends on > > > > 8r2jud(r 0 ,p) the extended judgements of other reviewers and their reputation > : RR (r0 ) otherwise (Equation 5). As such, the reputation of the reviewer should be (6) modified every time there is a new extended judgement or the rep- where jud(r0 , p) = {r 2 R | v ⇤ (r, r0 , p) 6= ?} denotes the set of utation of on of the reviewers changes. As the reputation of a re- judges of a particular review written by r0 on a given paper p. viewer depends on the reputation of reviewers, then we suggest to Note that when a review receives less that k judgements, its repu- calculate the reputation of all reviewers repeatedly (in a manner tation will not depend on the judgements, but it will inherit the repu- similar to EigenTrust) in order to converge. If this will be com- tation of the author of the review (her reputation as a reviewer). putationally expensive, then this can be computed once a day, as We currently leave k as a parameter, though we suggest that k > 1, opposed to triggered by extended judgements and the change in so that the reputation of a review is not dependent on a single judge. reviewers’ reputation. Again, we recommend small numbers for k, such as 2 or 3, because • x-judgement. The extended judgement is calculated either based we believe it will be difficult to obtain large numbers of judgements. on judgements (if available) or the similarity between opinions (when judgements are not available) (Equation 3). As such, the The ultimate aim of the evaluation is to investigate how close are extended judgement should be recalculated every time a new (di- the calculated reputation values to the true values: the reputation of a rect) judgement is made, or every time a new opinion is added on researcher as an author, the reputation of a researcher as a reviewer, a paper which already has opinions by other reviewers. and the reputation of a paper. The parameters and methods that drive and control the evolution 3 Evaluation through Simulation of the community of researchers and the evolution of their research work are presented below. 3.1 Simulation To evaluate the effectiveness of the proposed model, we have simu- 1. Number of authors. Every time a new paper is created, the simula- lated a community of researchers, using NetLogo [8]. We clarify that tor assigns authors for this paper. How many authors are assigned the focus of this work is not implementing a simulation that models is defined by the number of authors parameter (#co-authors ), the real world, but a simulation that allows us to verify our model. which is defined as a Poisson distribution. For every new paper, a As such, many assumptions that we make for this simulation, and random number is generated from this Poisson distribution. Who will appear shortly, might not be precisely (or always) true in the real to assign is chosen randomly from the set of researchers, although world (such as having the true quality of a paper inherit the quality sometimes, a new researcher is created and assigned to this paper of the best author). (see the ‘researchers birth rate’ below). This ensures the number In our simulation, a breed in NetLogo (or a node in the research of researchers in the community grows with the number of papers. community’s graph) represents either a researcher, a paper, a review, 2. Number of reviewers. Every time a new paper is created, the sim- or a judgement. The relations between breeds are: (1) authors of, ulator also assigns reviewers for this paper. How many review- that specifies which researchers are authors of a given paper, (2) re- ers are assigned is defined by the number of reviewers parameter viewers of, that specifies which researchers are reviewers of a given (#reviewers ), which is defined as a Poisson distribution. For every paper, (3) reviews of, that specifies which reviews give opinions on a new paper, a random number is generated from this Poisson distri- given paper, (4) judgements of, that specifies which judgements give bution. As above, who to assign is chosen randomly from the set opinions on a given review; and (5) judges of, that specifies which of researchers, although sometimes, a new researcher is created researchers have judged which other researcher. and assigned to this paper. Also, each researcher has four parameters that describe: (1) her 3. Researchers birth rate. As illustrated above, every paper requires reputation as an author, (2) her reputation as a reviewer, (3) her true authors and reviewers to be assigned to it. When assigning au- research quality; and (4) her true reviewing quality. The first two are thors and reviewers, the simulation will decide whether to assign calculated by our ARM model, and they evolve over time. However, an already existing researcher (if any) or create a new researcher. the last two describe the researcher’s true quality with respect to writ- This decision is controlled by the researchers birth rate parame- ing papers as well as reviewing papers or other reviews, respectively. ter (birth rate), which specifies the probability of creating a new In other words, our simulation assumes true qualities exist, and that researcher. they are constant. In real life, there are no such measures. Further- 4. Researcher’s true research quality. The author’s true quality is more, how good one is at writing papers or writing reviews or mak- sampled from a beta distribution specified by the parameters ↵A ing judgements naturally evolves with time. Nevertheless, we chose and A . We choose the beta distribution because it is a very ver- to keep the simulation simple by sticking to constant true qualities, satile distribution which can be used to model several different as the purpose of the simulation is simply to evaluate the correctness shapes of probability distributions by playing with only two pa- of our ARM model. rameters, ↵ and . Similar to researchers, we say each paper has two parameters that 5. Researcher’s true review quality. The reviewer’s true quality is describe it: (1) its reputation, which is calculated by our ARM model, sampled from a beta distribution specified by the parameters ↵R and it evolves over time; and (2) its true quality. Again, we assume and R . Again, the beta distribution is a very versatile distribution that a paper’s true quality exists. How it is calculated is presented which can be used to model several different shapes of probability shortly. distributions by playing with only two parameters, as illustrated Reviews also have two parameters: (1) the opinion provided by shortly by our experiments. the review, which in real life is set by the researcher performing the 6. Paper’s true quality. We assume that a paper’s true quality is the review, while in our simulation it is calculated by the simulator, as true quality of its best author, that is, the author with the high- illustrated shortly; and (2) the reputation of the review, which is cal- est true research quality). We believe this assumption has some culated by our ARM model and it evolves over time. ground in real life. For instance, some behaviour (such as looking Judgements, on the other hand, only have one parameter: the opin- for future collaborators, selecting who to give a funding to, etc.) ion provided by the judgement, which in real life is set by the re- assumes researchers to be of a certain quality, and their research searcher judging a review, while in our simulation it is calculated by work to follow that quality respectively. the simulator, as illustrated shortly. 7. Opinion of a Review. The opinion presented by a review is spec- Simulation starts at time zero with no researchers in the commu- ified as the paper’s true quality plus some noise, where the noise nity, and hence, no papers, no reviews, and no judgements. Then, depends on the reviewer’s true quality. This noise is chosen ran- with every tick of the simulation, a new paper is created, which may domly from the range [ (100 review quality)/2, +(100 sometimes require the creation of new researchers (either as authors review quality)/2]. In other words, the maximum noise that can or reviewers). With the new paper, reviews and judgements are also be added for the worst reviewer (whose review quality is 0) is created. How these elements are created is defined next by the simu- ±50, and the least noise that can be added for the best reviewer lator’s parameters and methods, that drive and control this behaviour. (whose review quality is 100) is 0. We note that a tick of the simulation does not represent a fixed unit 8. Opinion of a Judgement. The value (or opinion) of a judgement in calendar time, but the creation of one single paper. on a review is calculated as the similarity between the review’s value (opinion) and the judge’s review value (opinion), where the Error in Error in Error in similarity is defined by the metric distance as: 100 |review Reviewers’ Papers’ Authors’ judge0 s review|. Note that, for simplification, direct judgements Reputation Reputation Reputation have not been simulated, we only rely on indirect judgements. ↵R = 5 & ⇠ 11 % ⇠2% ⇠ 22 % R =1 3.2 Results ↵R = 2 & ⇠ 23 % ⇠5% ⇠ 23 % R =1 3.2.1 Experiment 1: The impact of the community’s quality ↵R = 1 & ⇠ 30 % ⇠7% ⇠ 23 % of reviewers R =1 Given the above, we ran the simulator for 100 ticks (generating 100 ↵R = 0.1 & ⇠ 34 % ⇠5% ⇠ 22 % papers). We ran the experiment over 6 different cases. In each, we R = 0.1 had the following parameters fixed: ↵R = 1 & ⇠ 44 % ⇠8% ⇠ 23 % R =2 #co-authors = 2 ↵R = 1 & ⇠ 60 % ⇠9% ⇠ 20 % #reviewers = 3 R =2 birth rate = 3 Table 1: The results of experiment 1, in numbers ↵A = A = 1 authors in the community (↵ = 5 and R = 1). For each of these k = 3 (of Equations 1 and 6) cases, we then change the number of co-authors, investigating three cases: #co-authors = {0, 1, 2}. All other parameters remain set to = 1 (of Equation 2) those presented in experiment 1 above. The only parameters that changed where those defining the beta The results of this experiment are presented by Figure 3. The num- distribution of the reviewers’ qualities. This experiment illustrated bers are presented in Table 2. The results show that the error in the the impact of the community’s quality of reviewers on the correctness reviewers and papers reputation almost does not change for differ- of the ARM model. ent numbers of co-authors. However, the error in the reputation of The results of the simulation are presented by Figure 2. For each authors does. When there are no co-authors (#co-authors = 0), the case, the distribution of the reviewers’ true quality is illustrated to error in authors’ reputation is almost equal to the error in papers’ the right of the results. The results, in numbers, are also presented by reputation (Figures 3a and 3b). As soon as 1 co-author is added Table 1. We notice that the least error is presented when the review- (#co-authors = 0), the error in authors’ reputation increases (Fig- ers are all of relatively good quality, with the majority being great ures 3c and 3d). When 2 co-authors are added (#co-authors = 2), the reviewers (Figure 2e). The errors start increasing as bad reviewers error in authors’ reputation reaches the maximum, around 20–22% are added to the community (Figure 2c). They increase even further (Figures 3e and 3f). In fact, unreported results show that the error in in both cases, when the quality of reviewers follows a uniform dis- authors’ reputation is almost the same in all cases for #co-authors tribution (Figure 2a), as well as when the reviewers are equiprobably 2. good or bad, with no average reviewers (Figure 2b). As soon as the majority of reviewers are of poor quality (Figure 2d), the errors in- Error in Error in Error in crease even further, with the worst case being when good reviewers Reviewers’ Papers’ Authors’ are absent from the community (Figure 2f). These results are not sur- Reputation Reputation Reputation prising. A paper’s true quality is not something that can be measured, ↵R =5; ↵R =1; ↵R =5; ↵R =1; ↵R =5; ↵R =1; R =1 R =5 R =1 R =5 R =1 R =5 or even agreed upon. As such, the trust model depends on the opin- ions of other researchers. As a result, the better the reviewing quality #co-authors = 0 ⇠ 11% ⇠ 60% ⇠ 2% ⇠ 9% ⇠ 22% ⇠ 20% of researchers, the more accurate the trust model will be, and vice #co-authors = 1 ⇠ 13% ⇠ 57% ⇠ 3% ⇠ 9% ⇠ 12% ⇠ 15% versa. #co-authors = 2 ⇠ 13% ⇠ 54% ⇠ 3% ⇠ 9% ⇠ 2% ⇠ 7% The numbers of Table 1 illustrate how the error in the papers’ rep- Table 2: The results of experiment 2, in numbers utation increases with the error in the reviewers’ reputation, though at a smaller rate. One curious thing about these results is the constant error in the reputation of authors. The next experiment investigates this issue. 4 Conclusion Last, but not least, we note that the error is usually stable. This is because every time a paper is created, all the reviews it receives We have presented the ARM reputation model for the academic and the judgements those reviews receive are created at the same world. ARM helps calculate the reputation of researchers, both as simulation time-step. In other words, it is not the case that papers authors and reviewers, and their research work. Additionally, ARM accumulate more reviews and judgements over time, for the error to also calculates the reputation of reviews. decrease over time. Concerning the reputation of authors, the most commonly used reputation measure is currently the h-index [4]. However, the h-index has its flaws. For instance, the h-index can be manipulated through 3.2.2 Experiment 2: The impact of co-authorship self-citations [1, 3]. A study has also found the h-index as not pro- In the second experiment, we investigate the impact of co-authorship viding a significantly more accurate measure of impact than the total on authors’ reputation. We choose the two extreme cases from ex- number of citations [9]. ARM, on the other hand, bases the reputation periment 1, when there are only relatively good authors in the com- of authors on the opinions that their papers receive from other mem- munity (↵ = 5 and R = 1), and when there are only relatively bad bers in their academic community. We believe this should be a more distribution of distribution of researchers w.r.t. researchers w.r.t. review quality: review quality: (a) ↵R = 1 and R = 1 (b) ↵R = 0.1 and R = 0.1 distribution of distribution of researchers w.r.t. researchers w.r.t. review quality: review quality: (c) ↵R = 2 and R = 1 (d) ↵R = 1 and R = 2 distribution of distribution of researchers w.r.t. researchers w.r.t. review quality: review quality: (e) ↵R = 5 and R = 1 (f) ↵R = 1 and R = 5 Figure 2: The impact of reviewers’ quality on reputation measures. For each set of results, the distribution of the reviewers’ true quality is presented to the right of the results. distribution of distribution of researchers w.r.t. researchers w.r.t. review quality: review quality: (a) ↵R = 5, R = 1, and #co-authors = 0 (b) ↵R = 1, R = 5, and #co-authors = 0 distribution of distribution of researchers w.r.t. researchers w.r.t. review quality: review quality: (c) ↵R = 2, R = 1, and #co-authors = 1 (d) ↵R = 1, R = 2, and #co-authors = 1 distribution of distribution of researchers w.r.t. researchers w.r.t. review quality: review quality: (e) ↵R = 5, R = 1, and #co-authors = 2 (f) ↵R = 1, R = 5, and #co-authors = 2 Figure 3: The impact of co-authorship on reputation of authors. For each set of results, the distribution of the reviewers’ true quality is presented to the right of the results. accurate approach, though future work should aim at comparing both REFERENCES approaches. [1] Christoph Bartneck and Servaas Kokkelmans, ‘Detecting h-index ma- Concerning the reputation of papers, the most common measure nipulation through self-citation analysis’, Scientometrics, 87(1), 85–98, currently used is the total number of citations a paper gets. Again, (2010). this measure can easily be manipulated through the self-citations. [7] [2] Gunther Eysenbach, ‘Citation advantage of open access articles’, PLoS presents an alternative approach based on the propagation of opin- Biology, 4(5), e157, (05 2006). [3] Emilio Ferrara and Alfonso E. Romero, ‘Scientific impact evaluation ions in structural graphs. It allows papers to build reputation either and the effect of self-citations: Mitigating the bias by discounting the from the direct reviews it receives, or inherit reputation from the h-index’, Journal of the American Society for Information Science and place where the paper is published. In fact, a sophisticated propa- Technology, 64(11), 2332–2339, (2013). gation model is proposed to allow reputation to propagate upwards [4] J. E. Hirsch, ‘An index to quantify an individual’s scientific research out- put’, Proceedings of the National Academy of Sciences of the United as well as downwards in structural graphs (e.g. from a section to a States of America, 102(46), 16569–16572, (2005). chapter to a book, and vice versa). Simulations presented in [6] il- [5] Nikolaus Kriegeskorte and Diana Deca, eds. Beyond open access: vi- lustrate the potential impact of this model. ARM does not have any sions for open evaluation of scientific papers by post-publication peer notion of propagation. The model is strictly based on direct opinions review, Frontiers in Computational Neuroscience. Frontiers E-books, (reviews and judgements), and when no opinions are present, igno- November 2012. [6] Nardine Osman, Jordi Sabater-Mir, Carles Sierra, and Jordi Madrenas- rance is assumed (as in the default reputation of authors and papers). Ciurana, ‘Simulating research behaviour’, in Proceedings of the 12th Concerning the reputation of reviews and reviewers, to our knowl- International Conference on Multi-Agent-Based Simulation, MABS’11, edge, these reputation measures have not been addressed yet. Never- pp. 15–30, Berlin, Heidelberg, (2012). Springer-Verlag. theless, we believe these are important measures. Conference man- [7] Nardine Osman, Carles Sierra, and Jordi Sabater-Mir, ‘Propagation of opinions in structural graphs’, in Proceedings of the 2010 Conference agement systems are witnessing a massive increase in paper submis- on ECAI 2010: 19th European Conference on Artificial Intelligence, pp. sions, and in many disciplines, finding good reviewers is becoming a 595–600, Amsterdam, The Netherlands, The Netherlands, (2010). IOS challenging task. Deciding what papers to accept/reject is sometimes Press. a challenge for conference and workshop organisers. ARM is a repu- [8] Seth Tisue and Uri Wilensky, ‘Netlogo: Design and implementation of tation model that addresses this issue by helping recognise the good a multi-agent modeling environment’, in In Proceedings of the Agent Conference, pp. 161–184, (2004). reviews/reviewers from the bad. [9] Alexander Yong, ‘Critique of hirschs citation index: A combinatorial The obvious next steps for ARM is applying it to a real dataset. fermi problem’, Notices of the American Mathematical Society, 61(11), In fact, the model is currently being integrated with two Span- 1040–1050, (2014). ish repositories: DIGITAL.CSIC (https://digital.csic.es) and e-IEO (http://www.repositorio.ieo.es/e-ieo/). However, these repositories do not have any opinions or judgements yet, and as such, time is needed to start collecting this data. We are also working with the IJCAI 2017 conference (http://ijcai-17.org) in order to allow review- ers to review each other. We will collect the data of this conference, which will provide us with the reviews and judgements needed for evaluating our model. We will also continue to look through existing datasets. Future work can investigate a number of additional issues. For in- stance, we plan to provide data on the convergence performance of the algorithm. One can also study the different types of attacks that could impact the proposed computational model. While similarity of reviews is now computed based on the similarity of the quantita- tive opinions, the similarity between qualitative opinions may also be used in future work by making use of natural language process- ing techniques. Also, while we argue that direct opinion can help the model avoid the pitfalls of the literature, it is also true that di- rect opinions are usually scarce. As such, if needed, other informa- tion sources for opinions may also be considered, such as citations. This information can be translated into opinions, and the equations of ARM should then change to give more weight to direct opinions than other information sources. ACKNOWLEDGEMENTS This work has been supported by CollectiveMind (a project funded by the Spanish Ministry of Economy & Competitiveness (MINECO), grant # TEC2013-49430-EXP), and Open Peer Review Module for Repositories (a project funded by OpenAIRE, which in turn is an EU funded project).