BIR 2018 Workshop on Bibliometric-enhanced Information Retrieval


     BIBLME RecSys: Harnessing Bibliometric
    Measures for a Scholarly Paper Recommender
                      System

            Anaïs Ollagnier, Sébastien Fournier, and Patrice Bellot

       Aix Marseille Univ, Toulon University, CNRS, LIS, Marseille, France
     {anais.ollagnier,sebastien.fournier,patrice.bellot}@univ-amu.fr


      Abstract. The iterative continuum of scientific production generates
      a need for filtering and specific crossing of ideas and papers. In this
      paper, we present BIBLME RecSys software which is dedicated to the
      analysis of bibliographical references extracted from scientific collections
      of papers. Our goal is to provide users with paper suggestions guided by
      the papers they are reading and by the references they contain. To do
      so, we propose a new approach based on a new bibliometric measure: we
      determine the impact, or inner representativeness, of each bibliographical
      reference according to its occurrences in the paper the user is reading.
      By means of this approach, we suggest the references that are central to
      the author’s paper. As a result, we obtain papers that are related to
      the paper selected by the user according to the influence of references on
      it. We evaluate the recommendation in the context of a digital library
      dedicated to humanities and social sciences.

      Keywords: Recommender systems, Text mining, Digital libraries, Bib-
      liographic information, Bibliometrics.


1    Introduction

An estimated 114 million scholarly papers are archived on the Web [9]. This
unprecedented level of access is certainly advantageous for researchers, but it
also creates a problem of “information overload” [2]. To help researchers find
relevant publications, scholarly paper recommenders have emerged over the last
decade. In this research field, content-based filtering (CBF) algorithms are the
predominant approach [14]. In most CBF systems dedicated to textual applications
(e.g., scholarly papers), item descriptions are represented by textual features
such as plain words, phrases, or n-grams. However, these systems encounter
difficulties that originate from natural language ambiguity. In the context of
digital libraries, references are a major source of links. Indeed,
bibliographical references are an important part of academic writing and convey
various information about the author’s research fields. In this paper, we
propose a scholarly paper recommender system which suggests papers related to
the paper the user is reading. To do that, we focus on textual features through
the identification of references extracted from the body of the text and the
footnotes. Thus, unlike traditional CBF approaches, in which items are
recommended to users according to their similarities (inherent characteristics)
with other items, we propose to determine an active user’s interests from the
content analysis of the selected article.
    Leveraging information extracted from bibliographical reference analysis and
from quantitative measurements, we propose a centrality indicator which
evaluates the importance of each bibliographical reference in the paper the user
is reading. Based on the finding that certain textual features, such as citation
frequency and citation location, may help predict the importance of
references [19], we construct this indicator from factors that use information
on the authors’ citation behavior. With this approach, citations are used in the
same way as words are used in CBF algorithms, and each reference is weighted
according to its level of centrality. Our centrality indicator can thus reflect
the strength of a reference’s influence and therefore point to potentially
relevant readings. Reference analysis is integrated in the BILBO¹ [10] system,
the software we have been developing for some years in the context of
OpenEdition, a large digital library dedicated to Humanities and Social Sciences
(HSS).


2     Related Work

In the context of digital libraries and publishers’ portals, recommender systems
have been introduced to provide relevant related literature. In 1998, Giles et
al. introduced the first scholarly recommender system as part of the CiteSeer
project [6]. Since then, several methods have been proposed and at least 216
papers on scholarly paper recommendation approaches have been published [3].
Two types of algorithms are typically used in recommender systems: collaborative
filtering (CF) and content-based filtering (CBF). CF algorithms recommend items
to users based on the ratings of other users with similar interests. In CBF
algorithms, the user’s interest is inferred from the items that the user has
interacted with. Items are represented by a content model in which features are
typically word-based.
    Several digital libraries have deployed recommender systems, such as
TechLens+ [20], BibTip [12] and CiteULike [5]. Each system uses a different kind
of data: TechLens+ uses citation data and usage data, BibTip relies on the
observation of user patterns and the statistical evaluation of usage data, and
CiteULike is based on the items bookmarked by users. These systems are
essentially based on CBF algorithms. Most of the proposed approaches operate
either by clustering similar items, by profiling users’ behaviour (user-based
collaborative filtering), or by combining the two (hybrid recommenders) [3]. In
recent work, Asabere et al. [1] introduced a folksonomy-based paper
recommendation algorithm which recommends papers issued by active participants
to other Group Profile Participants at the same conference, based on the
similarity of their research interests.
¹ https://github.com/OpenEdition/bilbo




Philip et al. [17] proposed a CBF approach based on the TF-IDF weighting scheme
and the cosine similarity measure.
    Currently, it is not possible to identify the most effective recommendation
approaches [3]. Given our dataset, we have therefore oriented our work towards a
CBF approach. The majority of these approaches are based on plain terms
extracted from papers, n-grams, or topics based on LDA. Some studies have used
non-textual features such as writing style, layout information, and XML tags.
Concerning the use of citations, the CiteSeer authors [6] proposed the CC-IDF
method, in which citations are used in the same way as word-based features and
are weighted with the standard TF-IDF measure. Inspired by this work, we propose
a system in which the user’s interests are inferred from citations extracted
from the paper that the user is currently interacting with, in order to provide
related papers.


    3   Proposed Method

Our method starts with our earlier system for detecting bibliographical
references in scholarly papers [15], which is integrated in the BILBO system.
With this software, the names of the authors, the titles, the year of
publication and other meaningful elements of information are extracted from both
the full texts and the reference sections. From the bibliographical references
annotated and extracted by our system, we build our scholarly recommender
system, called BIBLME RecSys. It consists of four steps:

Step 1: Retrieve references both in the body of the text (ref) and in the
        reference section (P_ref);
Step 2: Match each ref with the P_ref that corresponds to the same paper;
Step 3: Compute for each P_ref a centrality indicator derived from evaluation
        criteria based on objective quantitative measurements;
Step 4: Rank each P_ref according to its centrality indicator and recommend the
        papers with the highest scores.

    As shown in Fig. 1, each candidate paper to recommend (P_ref_i) is
represented by a centrality indicator (cIndic(P_ref_i)) which aggregates the
quantitative measures computed in Step 3. This indicator highlights the papers
on which the authors rely most in their paper. A key innovative step of this
approach is to model a paper of interest from factors based on authors’ citation
behaviours. The first factor, the freqFactor(ref_i, P_ref_i) function, counts
the in-text references ref_i that correspond to P_ref_i in the given paper. The
second factor, the granuFactor(ref_i, ref_j) function, measures for each P_ref_i
how the corresponding in-text references are distributed by the authors in their
paper. This factor has two levels of granularity (fine and coarse), which we
detail in Section 3.2. From these factors, we determine the papers that are
central to the author’s paper. In this way, we leverage the bibliographical
references to model a user’s interests.




                  Fig. 1. Example of BIBLME RecSys operations


    In Step 1, we construct the sets of references extracted from the body of
the text and from the reference section. In Step 2, we process these sets to
determine which references correspond to the same paper, using matching
functions (matchFunc(ref_i, ref_j)). These functions, based on a strict matching
and a fuzzy matching, compare two bibliographical references extracted from the
body of the text or from the reference section. Then, for each reference that
satisfies the matching functions, quantitative measures based on the frequency
factor and the granularity factors are computed. Lastly, candidate papers to
recommend are ranked according to their centrality indicator, which is a linear
combination of the quantitative measures. Users can set the weight assigned to
each factor and, depending on these weights, obtain information about different
aspects of the author’s citation behaviour. For example, if the coarse
granularity factor is given a high weight, references which occur throughout the
paper will be highlighted.
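
    The ranking step can be illustrated with a minimal Python sketch. It assumes
that the three measures are already available as counts for each entry of the
reference section; the names centrality and recommend, the top_n parameter and
the toy values are hypothetical and are not taken from the BIBLME RecSys code.
The default weights of 1/3 correspond to the setting in which the same value is
applied to each factor and the coefficients sum to 1.

```python
from typing import Dict, List, Tuple

# (frequency count, fine-granularity count, coarse-granularity count) per P_ref
Measures = Tuple[float, float, float]


def centrality(measures: Measures, alpha: float = 1 / 3,
               beta: float = 1 / 3, gamma: float = 1 / 3) -> float:
    """Linear combination of the three quantitative measures."""
    freq, fine, coarse = measures
    return alpha * freq + beta * fine + gamma * coarse


def recommend(candidates: Dict[str, Measures], top_n: int = 5,
              **weights: float) -> List[str]:
    """Rank candidate papers by decreasing centrality and keep the top_n."""
    scores = {paper: centrality(m, **weights) for paper, m in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical counts for three entries of a reference section.
candidates = {"P_ref_1": (4, 2, 1), "P_ref_2": (1, 0, 0), "P_ref_3": (2, 0, 3)}
equal_ranking = recommend(candidates)
coarse_ranking = recommend(candidates, alpha=0.1, beta=0.1, gamma=0.8)
```

With equal weights, the most frequently cited entry ranks first; giving the
coarse granularity factor a high weight moves the entry that most often
satisfies the coarse condition to the top.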

3.1   Matching Functions
The purpose of the matching functions (matchFunc(ref_i, ref_j) in Fig. 1) is to
gather the references corresponding to a given paper. Their output is needed to
compute the frequency factor and the granularity factors presented in
Section 3.2. Two functions are necessary: a strict matching function and a fuzzy
matching function. The strict matching function checks whether there is an exact
match between the content of the bibliographic fields found in one reference and
the content found in another reference, even if the fields are not the same. The
fuzzy matching function estimates the degree of matching between references
whose bibliographic fields differ substantially in content. Let R be a set of
references with ref_i, ref_j ∈ R.

   Definition 1. (The strict matching function) This function checks whether
the bibliographic fields of the reference ref_i (i ∈ [1..n]) and of the reference
ref_j (j ∈ [1..m]) match. ref_i is tokenized into (w_k^i) and ref_j into (w_k^j),
with k = min(len(ref_i), len(ref_j)).

\[
\mathit{matching}_{strict}(ref_i, ref_j) =
\begin{cases}
1 & \text{if } [w_1^i \, w_2^i \ldots w_k^i] = [w_1^j \, w_2^j \ldots w_k^j] \\
0 & \text{otherwise}
\end{cases}
\tag{1}
\]

   Definition 2. (The fuzzy matching function) sim(θ, δ) corresponds to the
Levenshtein distance. θ refers to the vector (w_1^i .. w_k^i) of tokens in ref_i
and δ to the vector (w_1^j .. w_k^j) of tokens extracted from ref_j, with
k = min(len(ref_i), len(ref_j)). ω specifies a similarity threshold.

\[
\mathit{matching}_{similarity}(ref_i, ref_j) =
\begin{cases}
1 & \text{if } sim(\theta, \delta) > \omega \\
0 & \text{otherwise}
\end{cases}
\tag{2}
\]
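
   As an illustration of Definitions 1 and 2, the following Python sketch
implements both matching functions on raw reference strings. Several points are
assumptions made for the example and are not specified in the text: the
tokenizer (lowercasing and splitting on non-alphanumeric characters), the
conversion of the Levenshtein distance into a similarity in [0, 1] so that
larger values mean closer strings, and the default threshold ω = 0.8.

```python
from typing import List


def tokenize(ref: str) -> List[str]:
    """Lowercase a reference and split it into alphanumeric tokens."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in ref)
    return cleaned.split()


def levenshtein(a: str, b: str) -> int:
    """Edit distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def matching_strict(ref_i: str, ref_j: str) -> int:
    """Eq. (1): 1 if the first k tokens of both references are identical."""
    wi, wj = tokenize(ref_i), tokenize(ref_j)
    k = min(len(wi), len(wj))
    return int(k > 0 and wi[:k] == wj[:k])


def matching_similarity(ref_i: str, ref_j: str, omega: float = 0.8) -> int:
    """Eq. (2): 1 if the similarity of the first k tokens exceeds omega.

    Assumption: sim = 1 - distance / length of the longer string.
    """
    wi, wj = tokenize(ref_i), tokenize(ref_j)
    k = min(len(wi), len(wj))
    if k == 0:
        return 0
    theta, delta = " ".join(wi[:k]), " ".join(wj[:k])
    sim = 1.0 - levenshtein(theta, delta) / max(len(theta), len(delta))
    return int(sim > omega)
```

For instance, a short in-text mention such as "Doe, J. 2001" strictly matches
a longer entry beginning with "Doe J. (2001)", while a misspelt author name is
only caught by the fuzzy function.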

3.2   BIBLME RecSys Factors
BIBLME RecSys factors are based on the observations related to the construc-
tion of scientific discourse and especially authors’ citation behaviours. They are
computed using quantitative measures such as the frequency and the distribution
granularity of each reference in the body of the text.

The Frequency Factor is computed from the number of in-text references that
correspond to the same paper, even if their structures differ. The underlying
hypothesis is that a reference has more impact if it is cited several times in
the paper. Algorithm 1 describes the processing chain used to compute the
frequency factor.
    Let R be the set of references ref_j (j ∈ [1..m]) extracted from the body of
the text and R′ the set of references P_ref_i (i ∈ [1..n]) extracted from the
reference section. We only match R against R′ because we consider the reference
section to be a list in which every reference extracted from the text occurs.
The biblMeFactors array stores the results used to compute the centrality
indicator. Its size is equal to the size of R′, and each index of biblMeFactors
corresponds to the position of a reference P_ref_i in R′. Then, for each
reference ref_j from R, the matching functions matching_strict(P_ref_i, ref_j)
and matching_similarity(P_ref_i, ref_j) are applied with the references from R′.
If either condition holds, biblMeFactors[i] is incremented by the parameter
value assigned to α.




Algorithm 1 Computing the frequency factor
  for i from 0 to (length of biblMeFactors) − 1 do
      biblMeFactors[i] ← 0
  end for
  for i from 0 to (length of R′) − 1 do
      for j from 0 to (length of R) − 1 do
          if matching_strict(ref_j, P_ref_i) = 1 or matching_similarity(ref_j, P_ref_i) = 1 then
              biblMeFactors[i] ← biblMeFactors[i] + α
          end if
      end for
  end for
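
Algorithm 1 can be sketched in Python as follows. This is an illustrative
transcription, not the BIBLME RecSys implementation: the matcher is passed in as
a function, and same_prefix is a hypothetical stand-in for the strict and fuzzy
matching functions; the reference strings are fictitious.

```python
from typing import Callable, List


def frequency_factor(section_refs: List[str], body_refs: List[str],
                     matches: Callable[[str, str], bool],
                     alpha: float = 1.0) -> List[float]:
    """Return one score per entry of the reference section R'."""
    bibl_me_factors = [0.0] * len(section_refs)
    for i, p_ref in enumerate(section_refs):   # P_ref_i in R'
        for ref in body_refs:                  # ref_j in R
            if matches(ref, p_ref):
                bibl_me_factors[i] += alpha
    return bibl_me_factors


def same_prefix(ref: str, p_ref: str) -> bool:
    """Toy matcher (hypothetical): compare the first k whitespace tokens."""
    a, b = ref.lower().split(), p_ref.lower().split()
    k = min(len(a), len(b))
    return k > 0 and a[:k] == b[:k]


section = ["Doe 2001 A fictitious study", "Roe 1999 Another fictitious study"]
body = ["Doe 2001", "Roe 1999", "Doe 2001"]
scores = frequency_factor(section, body, same_prefix, alpha=0.5)
```

Here the first entry is mentioned twice in the body and the second once, so
their scores are 2α and α respectively.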


The Granularity Factors are computed by taking into account two levels of
distribution granularity, namely the fine granularity and the coarse
granularity. The purpose of these factors is to distinguish how the references
are used by the authors in their paper. To do that, an additional score is
computed for each reference extracted from the reference section if the
corresponding references in the body of the text fulfill distribution
conditions. Based on the finding that the importance of a reference increases
proportionally with the number of mentions and with a more detailed discussion
of the cited document [19], we construct the granularity factors according to
the following assumptions:

 – the more the citations of a reference occur throughout the paper, the more
   the author is influenced by this work;
 – on the contrary, a concentration of citations within low textual density
   areas tends to strengthen the author’s arguments on specific aspects of the
   research.

    The fine granularity is computed from the references in the body of the
text that refer to the same paper within the same paragraph, and from the number
of words between each of these references. A score is assigned if the number of
words between two such references is less than the average distance between
references corresponding to the same paper. The fine granularity function is as
follows:

\[
\mathit{Granularity}_{fine}(ref_i, ref_j) =
\begin{cases}
1 & \text{if } a_j - b_i < AvgRef \\
0 & \text{otherwise}
\end{cases}
\tag{3}
\]

    where ref_i and ref_j are references extracted from an ordered subset
referring to the same document d in the same paragraph P, a_j is the start
position of ref_j in the paragraph P, and b_i is the end position of ref_i in
the paragraph P. AvgRef is the average of all the averages of distances in words
between references to a document d within a paragraph P.
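
    A small Python sketch may help to read Eq. (3). It represents each in-text
reference by its (start, end) word positions inside a paragraph. The way AvgRef
is computed here, as the mean of the per-paragraph mean gaps between consecutive
mentions, is only one possible reading of the definition above, and the helper
names and positions are hypothetical.

```python
from statistics import mean
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) word positions of one in-text reference


def gaps(spans: List[Span]) -> List[int]:
    """Distances a_j - b_i between consecutive mentions in one paragraph."""
    ordered = sorted(spans)
    return [nxt[0] - cur[1] for cur, nxt in zip(ordered, ordered[1:])]


def avg_ref(paragraphs: List[List[Span]]) -> float:
    """Average of the per-paragraph average gaps (one reading of AvgRef)."""
    per_paragraph = [mean(g) for g in map(gaps, paragraphs) if g]
    return mean(per_paragraph) if per_paragraph else 0.0


def granularity_fine(ref_i: Span, ref_j: Span, avg: float) -> int:
    """Eq. (3): 1 if ref_j starts fewer than avg words after ref_i ends."""
    return int(ref_j[0] - ref_i[1] < avg)


# Mentions of one cited document d in two paragraphs (hypothetical positions).
paragraphs = [[(3, 5), (9, 11), (40, 42)], [(10, 12), (20, 22)]]
avg = avg_ref(paragraphs)
close_pair = granularity_fine((3, 5), (9, 11), avg)
distant_pair = granularity_fine((9, 11), (40, 42), avg)
```

In this example the average gap is 12.25 words, so the two mentions separated
by 4 words receive a score while the two separated by 29 words do not.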

   The coarse granularity is measured from the references in the body of the
text that correspond to the same paper throughout a given paper. To do that, we
count the number of paragraphs which separate each of these references. A score
is assigned if the number of paragraphs between two such references is less than
the average of distances. The coarse granularity function is calculated as
follows:

\[
\mathit{Granularity}_{coarse}(ref_i, ref_j) =
\begin{cases}
1 & \text{if } index(Q) - index(P) < AvgRef' \\
0 & \text{otherwise}
\end{cases}
\tag{4}
\]

   where index() is a function which gives the index of a given paragraph, P and
Q are two paragraphs, and ref_i and ref_j are references extracted from an
ordered subset referring to the same document d. AvgRef′ is the average of all
the averages of distances between paragraphs that separate two references to the
same document d.

    Algorithm 2 describes the processing chain used to compute the granularity
factors.


Algorithm 2 Computing the granularity factors
  for i from 0 to (length of R) − 1 do
      for j from 0 to (length of R) − 1 do
          if i ≠ j then
              if matching_strict(ref_i, ref_j) = 1 or matching_similarity(ref_i, ref_j) = 1 then
                  if granularity_fine(ref_i, ref_j) = 1 then
                      biblMeFactors[i] ← biblMeFactors[i] + β
                  end if
                  if granularity_coarse(ref_i, ref_j) = 1 then
                      biblMeFactors[i] ← biblMeFactors[i] + γ
                  end if
              end if
          end if
      end for
  end for


    For each pair (ref_i, ref_j) of references from the set R, we apply the
matching functions matching_strict(ref_i, ref_j) and
matching_similarity(ref_i, ref_j). If either condition is fulfilled, the
granularity functions granularity_fine(ref_i, ref_j) and
granularity_coarse(ref_i, ref_j) are evaluated. For each granularity function
whose average-distance condition holds, biblMeFactors[i] is incremented by the
parameter value assigned to β or γ, respectively.
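
    The loop of Algorithm 2 can be sketched in Python under a few stated
assumptions. Each in-text reference is a hypothetical Mention record whose
target field replaces the matching functions; the thresholds AvgRef and AvgRef′
are passed in as parameters; scores are accumulated per cited paper rather than
per index i; and each pair is scored once, in reading order, following the
"ordered subset" of Eqs. (3) and (4).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Mention:
    target: str     # cited paper; stands in for the matching functions
    paragraph: int  # index of the paragraph containing the mention
    start: int      # start position, in words, inside the paragraph
    end: int        # end position, in words, inside the paragraph


def granularity_scores(mentions: List[Mention], avg_ref: float,
                       avg_ref_prime: float, beta: float,
                       gamma: float) -> Dict[str, float]:
    """Accumulate beta and gamma per cited paper over pairs of mentions."""
    scores = {m.target: 0.0 for m in mentions}
    for i, ref_i in enumerate(mentions):
        for j, ref_j in enumerate(mentions):
            if i == j or ref_i.target != ref_j.target:
                continue  # only distinct mentions of the same paper
            if (ref_j.paragraph, ref_j.start) <= (ref_i.paragraph, ref_i.start):
                continue  # assumption: score each pair once, in reading order
            same_paragraph = ref_i.paragraph == ref_j.paragraph
            if same_paragraph and ref_j.start - ref_i.end < avg_ref:
                scores[ref_i.target] += beta   # fine granularity, Eq. (3)
            if ref_j.paragraph - ref_i.paragraph < avg_ref_prime:
                scores[ref_i.target] += gamma  # coarse granularity, Eq. (4)
    return scores


mentions = [Mention("A", 0, 3, 5), Mention("A", 0, 9, 11),
            Mention("A", 6, 2, 4), Mention("B", 1, 0, 2)]
scores = granularity_scores(mentions, avg_ref=10, avg_ref_prime=3,
                            beta=0.5, gamma=0.25)
```

In this toy input, only the two mentions of paper A in paragraph 0 are close
enough to satisfy both conditions, so A receives β + γ and B receives nothing.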


4     Context of the Experiments
To evaluate BIBLME RecSys, we compare its candidate papers with those proposed
by OpenEdition’s search engine, which is designed to search for documents on the
OpenEdition portal. In the literature, we were unable to find open
recommendation systems based on bibliometric measurements against which to
compare BIBLME RecSys’s suggestions. We therefore used the OpenEdition search
engine, called Search OpenEdition, which is based on the Apache Lucene retrieval
model. For a given query, candidate papers are extracted by a Boolean filter
which identifies papers containing the requested terms. Papers are then ranked
according to BM25 similarity. Like standard search engines, Search OpenEdition
offers different querying modes which apply facets or filters. In our work,
Search OpenEdition was queried without specifying search fields related to
document characteristics and without Boolean operators. However, the advanced
search mode was used with a filter which restricts the query to the OpenEdition
Journals platform² in order to deal with the same data as BIBLME RecSys. The
submitted queries consist of the name of the first author and the full title of
the given paper, which correspond to the main fields available in the references
extracted from the text.
    The set of candidate papers to recommend is constructed from 12 papers³
from various fields in HSS, such as languages, anthropology, ethnology,
communication, law and culture, health, economy and development, education,
agriculture, and the environment. In order to estimate the impact of this
approach and to avoid possible bias introduced by the BILBO software, we
manually annotated the citations and references of these papers. A platform⁴
has been developed to allow an evaluation of the candidate papers extracted
from BIBLME RecSys and Search OpenEdition.
    On this platform, users have access to the list of papers with their
abstracts. For each paper, two lists of five recommended papers are proposed,
one per system. Only the first five recommended papers were displayed for each
system in order to keep the evaluation from becoming too tedious. Users can
choose the list containing the most relevant recommended papers and refine their
evaluation by giving a rating from 0 to 5 (0 means that the recommended paper is
off topic and 5 means that the recommended paper matches the topic of the
current paper). Most of the recommended papers have a clickable link to the
original version of the paper or to an abstract. A “Suggestion” field allows
users to express an opinion or remarks on the suggestions.


5   Experimental Results
Using the platform presented above, we analyze the user feedback. Through this
study, we show the relevance of the recommended papers according to the system
selected by the users. The results were recorded after one month. OpenEdition
employees were the main participants in this evaluation. These participants
come from varying cultural and
² http://journals.openedition.org/
³ Owing to the complexity of the task for the participants, we selected only a
  small sample of papers.
⁴ http://grapheval.openeditionlab.org/




disciplinary backgrounds, such as software engineering, sales and partnerships,
finance, legal and public policy, marketing and communications, and user
services. Over this time frame, we counted 31 participants, with an average of
2.3 items assessed per person.
    Concerning the predefined settings for BIBLME RecSys, each centrality
indicator is computed from the frequency factor and the granularity factors. The
same parameter value is applied to each factor so that the coefficients sum to
1. Table 1 shows the results obtained for each proposed paper. The values
correspond to the number of users who selected the suggestions provided by
BIBLME RecSys or Search OpenEdition.


     Table 1. Results Obtained by BIBLME RecSys and Search OpenEdition.

Proposed paper                                         BIBLME             Search
                                                        RecSys          OpenEdition
Jaubert - Correspondance as an Ethical Genre                7                  5
Bisson - Sufism and Tradition                               8                  0
Nabti - Sufis in Parisian suburbs                           5                  0
Laborde - The hermit and the virtuoso                       6                  0
Esquenazi - From star system to people                      4                  4
Danou - On a Novella of Arthur Schnitzler                   2                  1
  (1862-1931), Flight into Darkness
Wrobel - Gothic, Reform and Panoptic                        3                  0
Amiraly - The impact of a pilot water metering              4                  0
  project in an Indian city on users perception of
  the public water supply
Masdonati - The question of identity in the dual            1                  5
  Swiss vocational training system: a contrasting
  picture
Delannoy - Karst: from palaeogeographic archives            1                  0
  to environmental indicators
Duval - The deceased to the shackles                        2                  2
Angevin - The Magdalenian lithic industry from              2                  0
  the open-air site of la Corne-de-Rollay (Couleuvre,
  Allier): production standards and production lines
  variability
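
   The per-paper counts of Table 1 can be tallied with a few lines of Python.
This is only a convenience sketch for re-deriving the column totals; the counts
are copied from the table, and the dictionary keys (first authors' names) are a
shorthand chosen for the example.

```python
# (BIBLME RecSys, Search OpenEdition) selections per proposed paper, from Table 1
selections = {
    "Jaubert": (7, 5), "Bisson": (8, 0), "Nabti": (5, 0), "Laborde": (6, 0),
    "Esquenazi": (4, 4), "Danou": (2, 1), "Wrobel": (3, 0), "Amiraly": (4, 0),
    "Masdonati": (1, 5), "Delannoy": (1, 0), "Duval": (2, 2), "Angevin": (2, 0),
}

biblme_total = sum(biblme for biblme, _ in selections.values())
search_total = sum(search for _, search in selections.values())
print(biblme_total, search_total)  # prints "45 17"
```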


   Result analyses. Users selected BIBLME RecSys’s suggestions as the most
relevant 45 times, while Search OpenEdition’s suggestions were selected 17
times. However, performance varies depending on the paper. Users selected
BIBLME RecSys’s suggestions as the most effective for 8 proposed papers, while
the others obtain almost similar performances. The user feedback provided
through the “Suggestion” field shows that both systems propose relevant
suggestions. The main difference between the systems concerns the topic covered.
Indeed, inspecting the topics of the recommended papers shows that BIBLME RecSys
provides suggestions closest to the topics of the targeted paper, whereas Search
OpenEdition tends to provide suggestions on related themes. Take the example of
the paper “Sufism and Tradition”, which focuses on the influence of the
intellectual René Guénon on European Sufi Islam. BIBLME RecSys provides
suggestions whose main topics are Sufism and/or René Guénon, which are central
to explaining the content of the targeted paper. Search OpenEdition, by
contrast, proposes suggestions based on the topic of the religious object. From
this analysis, we can observe that BIBLME RecSys’s suggestions are more closely
linked to the topic of the targeted paper, while those of Search OpenEdition
address more generic scientific aspects. The results are nevertheless mixed.
Another example is the paper “From star system to people”, which focuses on
marketing strategies produced through the “star system”. Given the topics of the
candidate papers, BIBLME RecSys provides suggestions about the economic
exploitation of notoriety, and Search OpenEdition proposes suggestions based on
the topic of “celebrity culture”. In this case, each system provides suggestions
with at least one central topic related to the targeted paper. Nevertheless,
these examples indicate how our approach can characterize a user’s interests by
proposing papers based on important topics of the targeted paper.
    Limitations. The user feedback revealed different behaviours for each
system. Indeed, our approach provides suggestions more closely related to the
topics of the targeted paper than Search OpenEdition. The suggestions proposed
by BIBLME RecSys are extracted directly from the text, which explains the close
topic links. It was sometimes difficult for users to evaluate recommended papers
because of the complexity of the topic. Thus, the main limitations concern
users’ satisfaction. In this experiment, we focused on topic links between
recommended papers and targeted papers. However, in the context of scholarly
recommender systems, it is important to assess the scientific content of
recommended papers. In this experiment, we observe that users are able to
evaluate the topic links although they are not specialists in the topics of the
proposed papers. Papers that were not sufficiently evaluated, such as “Karst:
from palaeogeographic archives to environmental indicators”, reflect the need
for specific expertise.
    Finally, leveraging our centrality indicator, BIBLME RecSys is able to
suggest papers that are central to the author’s paper and therefore relevant
readings for the proposed paper. This is reflected in the users’ choices, which
tend to favour BIBLME RecSys’s suggestions. We believe that our approach is
effective in characterizing candidate papers to recommend and can improve
recommendation relevance in the context of scholarly recommender systems.


6   Conclusion

We have explored an approach based on the content analysis of the paper the
user is reading. From bibliographical reference analysis and objective
quantitative measurements, we proposed a centrality indicator. This indicator
evaluates the importance of the bibliographical references in the paper that the
user interacts with. In this approach, we represent a candidate paper by the
number of its mentions and by how its citations are used by the authors. From
this information, we determined the papers that are central for the authors and
therefore relevant readings. Our results showed that, for discovering
potentially relevant readings, BIBLME RecSys’s suggestions are more closely
linked to the topic of the targeted paper.
   We believe that our approach can be applied more generally in the context of
digital libraries. Bibliographical references in the full text can be harnessed
in any scientific domain. Moreover, this indicator is based on authors’ citation
behaviours specific to the targeted paper, unlike recent work in
bibliometrics [4]. Such an indicator can highlight, for a given paper, the
influence of references on the author’s paper. It may be used for the evaluation
of scientific activity and of scientific networks currently practiced in the
bibliometric field. It harnesses information about the central papers used by
authors, which are potentially the most influential on their research.
   To discover more relevant papers, in future work we plan to use a graph data
model for our data in order to exploit recommendation algorithms based on graph
analysis. With this model, we intend to involve external resources based on the
same topics and also to represent the centrality indicator as edges. Moreover,
we plan to determine relations between references and their proximity [21], and
so propose references between two articles (citing-cited) based on their
affiliation to a particular scientific domain, similar author collectives and
their influence on each other. Our future work also aims to examine the degree
of users’ satisfaction with OpenEdition Journals through library user surveys.


Acknowledgment
This research was supported by ANR program Investissements d’Avenir EquipEx
DILOH (ANR-11-EQPX-0013).


References
1. Asabere, N. Y., Xia, F., Meng, Q., Li, F., Liu, H.: Scholarly paper recommenda-
   tion based on social awareness and folksonomy. International Journal of Parallel,
   Emergent and Distributed Systems, vol. 30 no. 3, pp. 211-232, (2015)
2. Baeza-Yates, R. , Ribeiro-Neto, B.: Modern information retrieval: The Concepts
   and Technology behind Search. ACM Press, New York, (1999)
3. Beel, J., Gipp, B., Langer, S., Breitinger, C.: Research-paper recommender
   systems: a literature survey. International Journal on Digital Libraries,
   vol. 17 no. 4, pp. 305–338, (2016)
4. Belter, C.: Bibliometric indicators: opportunities and limits. Journal of the
   Medical Library Association, vol. 103 no. 4, pp. 219, (2015)
5. Bogers, T., Van den Bosch, A.: Recommending scientific articles using citeulike. In:
   the 2008 ACM conference on Recommender systems, pp. 287–290, ACM, (2008)




6. Bollacker, K., Lawrence, S., Giles, L.: CiteSeer: An autonomous web agent for au-
   tomatic retrieval and identification of interesting publications. In: Proc. of the 2nd
   Int. Conf. Autonomous agents, pp. 116–123, vol. 41 no. 1, ACM, (1998)
7. Haruna, K., Ismail, M. A., Damiasih, D., Sutopo, J., Herawan, T.: A collaborative
   approach for research paper recommender system. PloS one, vol. 12 no. 10, (2017)
8. Hristakeva, M., Kershaw, D., Rossetti, M., Knoth, P., Pettit, B., Vargas, S., Jack, K.:
   Building recommender systems for scholarly information. In: 1st International Work-
   shop on Scholarly Web Mining (SWM), International Conference on Web Search and
   Data Mining (2017)
9. Khabsa, M., Giles, C. L.: The number of scholarly documents on the public web.
   PloS one, vol. 9 no. 5, pp. e93949, (2014)
10. Kim, Y.-M., Bellot, P., Faath, E., Dacos, M.: Automatic annotation of bibliograph-
   ical references in digital humanities books, articles and blogs. In: Proc. of the 4th
   ACM workshop on Online books, complementary social media and crowdsourcing,
   pp. 41–48, ACM, (2011)
11. Knoth, P., Anastasiou, L., Charalampous, A., Cancellieri, M., Pearce, S.,
   Pontika, N., Bayer, V.: Towards effective research recommender systems for
   repositories. In: Proc. of Open Repositories, (2017)
12. Mönnich, M., Spiering, M.: Adding value to the library catalog by implementing a
   recommendation system. D-Lib Magazine, 14.5/6, pp. 1082–9873, (2008)
13. Nascimento, C., Laender, A. H., da Silva, A. S., Gonçalves, M. A.: A source
   independent framework for research paper recommendation. In: Proc. of the 11th
   annual international ACM/IEEE joint conference on Digital libraries, pp. 297–306,
   ACM, (2011)
14. Lops, P., De Gemmis, M., Semeraro, G.: Content-based recommender systems:
   State of the art and trends. Recommender systems handbook, pp. 73–105, Springer,
   (2011)
15. Ollagnier, A., Fournier, S., Bellot, P.: A Supervised Approach for Detecting
   Allusive Bibliographical References in Scholarly Publications. In: ACM WIMS
   Conference on Web Intelligence, Mining and Semantics (WIMS), Nîmes, France,
   pp. 36–39, (2016)
16. Pazzani, M. J., Billsus, D.: Content-based recommendation systems. In: The adap-
   tive web, pp. 325–341, Springer, Berlin, Heidelberg. (2007)
17. Philip, S., Shola, P., Ovye, A.: Application of content-based approach in research
   paper recommendation system for a digital library. International Journal of Ad-
   vanced Computer Science and Applications 5.10 (2014)
18. Su, X., Khoshgoftaar, T. M.: A survey of collaborative filtering techniques. Ad-
   vances in artificial intelligence, pp. 4, (2009)
19. Tang, R., Safer, M. A.: Author-rated importance of cited references in biology
   and psychology publications. Journal of Documentation, vol. 62 no. 2, pp. 246–272,
   (2008)
20. Torres, R., Mcnee, S. M., Abel, M., Konstan, J. A., Riedl, J.: Enhancing digital
   libraries with TechLens. In: Proc. of the 2004 Joint ACM/IEEE Conference on.
   IEEE, pp. 228–236, (2004)
21. Tran, H. D., Cabanac, G., Hubert, G.: Expert suggestion for conference program
   committees. In: 11th International Conference on Research Challenges in
   Information Science (RCIS), pp. 221–232, IEEE, (2017)
22. Yang, C., Wei, B., Wu, J., Zhang, Y., Zhang, L.: CARES: a ranking-oriented
   CADAL recommender system. In: Proc. of the 9th ACM/IEEE-CS joint conference
   on Digital libraries, pp. 203–212, ACM, (2009)

