=Paper=
{{Paper
|id=Vol-2360/paper6RARDII
|storemode=property
|title=RARD II: The 94 Million Related-Article Recommendation Dataset
|pdfUrl=https://ceur-ws.org/Vol-2360/paper6RARDII.pdf
|volume=Vol-2360
|authors=Joeran Beel,Barry Smyth,Andrew Collins
|dblpUrl=https://dblp.org/rec/conf/ecir/BeelSC19
}}
==RARD II: The 94 Million Related-Article Recommendation Dataset==
https://ceur-ws.org/Vol-2360/paper6RARDII.pdf
RARD II: The 94 Million Related-Article
Recommendation Dataset
Joeran Beel*1, 2, Barry Smyth3 and Andrew Collins*1
1 Trinity College Dublin, School of Computer Science & Statistics, ADAPT Centre, Ireland
2 National Institute of Informatics Tokyo, Digital Content and Media Sciences Division, Japan
3 University College Dublin, Insight Centre for Data Analytics, Ireland
beelj@tcd.ie, barry.smyth@insight-centre.org, ancollin@tcd.ie
Abstract. The main contribution of this paper is to introduce and describe a new
recommender-systems dataset (RARD II). It is based on data from Mr. DLib, a
recommender-system as-a-service in the digital library and reference-manage-
ment-software domain. As such, RARD II complements datasets from other do-
mains such as books, movies, and music. The dataset encompasses 94m recom-
mendations, delivered in the two years from September 2016 to September 2018.
The dataset covers an item-space of 24m unique items. RARD II provides a range
of rich recommendation data, beyond conventional ratings. For example, in ad-
dition to the usual (implicit) ratings matrices, RARD II includes the original rec-
ommendation logs, which provide a unique insight into many aspects of the al-
gorithms that generated the recommendations. The logs enable researchers to
conduct various analyses about a real-world recommender system. This includes
the evaluation of meta-learning approaches for predicting algorithm perfor-
mance. In this paper, we summarise the key features of this dataset release, de-
scribe how it was generated and discuss some of its unique features. Compared
to its predecessor RARD, RARD II contains 64% more recommendations, 187%
more features (algorithms, parameters, and statistics), 50% more clicks, 140%
more documents, and one additional service partner (JabRef).
Keywords: recommender systems datasets, digital libraries, click logs.
1 Introduction
The availability of large-scale, realistic, and detailed datasets is an essential element of
many research communities, such as the information retrieval community (e.g. TREC
[1,2], NTCIR [3,4], and CLEF [5,6]), the machine learning community (e.g. UCI [7],
OpenML [8], KDD Cup [9]) and the recommender systems community. In particular, the meta-learning and algorithm-selection community [10] as well as the automated machine-learning (AutoML) community [11] depend on large-scale datasets. Such da-
tasets provide data for researchers to benchmark existing techniques, as well as to de-
velop, train and evaluate new algorithms. They can also be essential when it comes to
supporting the development of a research community. The recommender-systems community has been well-served by the availability of a number of datasets in popular domains including movies [12,13], books [14], music [15] and news [5,16]. The importance of these datasets is evidenced by their popularity among researchers and practitioners; for example, the MovieLens datasets were downloaded 140,000 times in 2014 [12], and Google Scholar lists some 13,000 research articles and books that mention one or more of the MovieLens datasets [17].
* This publication has emanated from research conducted with the financial support of Science Foundation Ireland (SFI) under Grant Number 13/RC/2106 and funding from the European Union and Enterprise Ireland under Grant Number CF 2017 0303-1.
The 1st Interdisciplinary Workshop on Algorithm Selection and Meta-Learning in Information Retrieval (AMIR), 14 April 2019, Cologne, Germany. Editors: Joeran Beel and Lars Kotthoff. Co-located with the 41st European Conference on Information Retrieval (ECIR). http://amir-workshop.org/
The recommender-systems community has matured, and the importance of recom-
mender systems has grown rapidly in the commercial arena. Researchers and practi-
tioners alike have started to look beyond the traditional e-commerce / entertainment
domains (e.g. books, music, and movies). However, there are few datasets that are suit-
able for meta-learning in the context of recommender systems, or suitable for research
in the field of digital libraries. It is with this demand in mind that we introduce the
RARD II dataset. RARD II is based on a digital-library recommender-as-a-service plat-
form known as Mr. DLib [18–20]. Primarily, Mr. DLib provides related-article type
recommendations based on a source / query article, to a number of partner services
including the social-sciences digital library Sowiport [21–24] and the JabRef reference-
management software [25].
The unique value of RARD II for recommender-systems and algorithm-selection research stems from the scale and variety of data that it provides in the domain of digital
libraries. RARD II includes data from 94m recommendations, originating from more
than 13.5m queries. The dataset comprises two full years of recommendations, deliv-
ered between September 2016 and September 2018. Importantly, in addition to con-
ventional ratings-type data, RARD II includes comprehensive recommendation logs.
These provide a detailed account of the recommendations that were generated – not
only the items that were recommended, but also the context of the recommendation (the
source query and recommendation destination), and meta-data about the algorithms and
parameters used to generate them. Compared to its predecessor RARD I [26] – which
was downloaded 1,489 times between June 2017 and April 2019 – RARD II contains
64% more recommendations, 187% more features (algorithms, parameters, and statis-
tics), 50% more clicks, 140% more documents, and JabRef as a new partner.
In what follows, we describe this RARD II data release in more detail. We provide
information about the process that generated the dataset and pay particular attention to
a number of the unique features of this dataset.
2 Background / Mr. DLib
Mr. DLib is a recommendation-as-a-service (RaaS) provider [18]. It is designed to sug-
gest related articles through partner services such as digital libraries or reference man-
agement applications. For example, Mr. DLib provides related-article recommenda-
tions to Sowiport to be presented on Sowiport’s website alongside some source/target
article (see Fig. 1). Sowiport was the largest social science digital library in Germany,
with a corpus of 10m articles (the service was discontinued in December 2017).
Fig. 1. Sowiport’s website with a source-article (blue) and recommendations (green)
Fig. 2 summarises the recommendation process, implemented as a RESTful Web Ser-
vice. The starting point for Mr. DLib to calculate recommendations is the ID or title of
some source article. Importantly, Mr. DLib closes the recommendation loop because in
addition to providing recommendations to a user, the response of a user (principally,
article selections) is returned to Mr. DLib for logging.
[Fig. 2 content: a source article shown on Sowiport's website with a list of related-article recommendations, and the four-step flow between Sowiport and the Mr. DLib web service: (1) Sowiport requests related articles for a source document; (2) Mr. DLib returns a list of related articles in XML; (3) Sowiport renders the XML and displays the related articles; (4) if a user clicks a recommendation, Sowiport sends a notification to Mr. DLib.]
Fig. 2. Illustration of the recommendation process
In another use-case, Mr. DLib provides related-article recommendations to JabRef, one
of the most popular open-source reference managers (Fig. 3) [27]. Briefly, when users
select a reference/article in JabRef, the “related articles” tab presents a set of related articles retrieved from Mr. DLib. Related-article recommendations are made from the 10m-document Sowiport corpus and the 14m CORE documents [28–31].
Fig. 3. Related-article recommendations in JabRef
To generate a set of recommendations, Mr. DLib harnesses different recommendation
approaches including content-based filtering. Mr. DLib also uses external recommen-
dation APIs such as the CORE Recommendation API [32,33] as part of a ‘living lab’
[34]. The algorithm selection and parametrization are managed by Mr. DLib’s A/B test-
ing engine. The details of the recommendation approach used, including any relevant
parameters and configurations, are logged by Mr. DLib.
As an example of the data that Mr. DLib logs, when the A/B engine chooses content-
based filtering, it randomly selects whether to use ‘normal keywords’, ‘key-phrases’
[35] or ‘word embeddings’. For each option, additional parameters are randomly cho-
sen; e.g., when key-phrases are chosen, the engine randomly selects whether key-
phrases from the ‘title’ or ‘abstract’ are used. Subsequently, the system randomly se-
lects whether unigrams, bigrams, or trigrams are used. Then, the system randomly se-
lects how many key-phrases to use to calculate document relatedness, from one to
twenty. The A/B engine also randomly chooses which matching mode to use when
parsing queries [standardQP | edismaxQP]. Finally, the engine selects whether to re-
rank recommendations with readership data from Mendeley, and how many recommen-
dations to return.
All this information – the queries and responses, user actions, and the recommendation-process metadata – is made available in the RARD II data release.
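As a rough illustration of this configuration space, the following Python sketch draws a random content-based-filtering configuration in the way the A/B engine is described above. It is a minimal sketch only; the dictionary keys are our own shorthand, not Mr. DLib's internal parameter names.

```python
import random

def draw_cbf_configuration():
    """Randomly draw a content-based-filtering configuration, mirroring the
    parameter choices described for Mr. DLib's A/B testing engine.
    The key names are illustrative shorthand, not internal identifiers."""
    config = {"feature_type": random.choice(["keywords", "keyphrases", "embeddings"])}
    if config["feature_type"] == "keyphrases":
        config["source_field"] = random.choice(["title", "abstract"])
        config["ngram_size"] = random.choice([1, 2, 3])       # unigrams, bigrams, or trigrams
        config["num_keyphrases"] = random.randint(1, 20)      # key-phrases used to compute relatedness
    config["query_parser"] = random.choice(["standardQP", "edismaxQP"])
    config["rerank_by_mendeley_readership"] = random.choice([True, False])
    config["num_recommendations"] = random.randint(1, 15)     # responses contain 1 to 15 recommendations
    return config

if __name__ == "__main__":
    print(draw_cbf_configuration())
```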
3 The RARD II Dataset
RARD II is available at http://data.mr-dlib.org and is published under the “Creative Commons Attribution 3.0 Unported (CC BY)” license [36]. The dataset consists of three sub-
datasets: (1) the recommendation log; (2) the ratings matrices; and (3) the external
document IDs.
3.1 The Recommendation Log
The recommendation_log.csv file (20 GB) contains details on each related-article
query from Sowiport and JabRef, and the individual article recommendations returned
by Mr. DLib. A detailed description of every field in the recommendation log is beyond
the scope of this paper (please refer to the dataset’s documentation for full details).
Briefly, the key statistics of the log are presented in Table 1.
Table 1. Key numbers of the recommendation log

                                Total         Sowiport      JabRef
Requests
  Total                         13,482,392    13,170,639    311,753
  Unique (src_doc_id)           2,663,826 ¹   2,433,024     238,687
Responses
  Total                         13,482,392    13,170,639    311,753
  With 1+ Click(s)              103,856       100,578       3,278
  Avg. #Recs per Response       6.97          6.99          5.92
Recommendations
  Total                         93,955,966    92,110,708    1,845,258
  Unique (rec_doc_id)           7,224,279 ¹   6,819,067     856,158
Clicks
  Total                         113,954       110,003       3,951
  Click-Through Rate            0.12%         0.12%         0.21%
The recommendation log contains 93,955,966 rows (Fig. 4), and each row corresponds
to a single recommended item, i.e. a related-article recommendation. All items were
returned in one of the 13,482,392 responses to the 13,482,392 queries by Sowiport and
JabRef. The 13.5m queries were made for 2,663,826 unique source documents (out of
the 24m documents in the corpus). This means that recommendations were requested 5.2 times on average for each of these 2.7m documents. For around 21.4m documents in the corpus, recommendations were never requested.
Each of the 13.5m responses contains between one and 15 related-article recommen-
dations – 94m recommendations in total and 6.97 on average per response. The 94m
recommendations were made for 7,224,279 unique documents out of the 24m docu-
ments in the corpus. This means that those documents that were recommended were recommended 13 times on average. Around 17m documents in the corpus were never recommended.
¹ The sum of ‘Sowiport’ and ‘JabRef’ does not equal the ‘Total’ number because some documents were requested by / recommended to both Sowiport and JabRef. However, these documents are only counted once for the ‘Total’ numbers.
row_id | query_id | query_received | partner | src_doc_id | rspns_id | rec_id | algo_id | text_field | … | rec_doc_id | re-ranking | responded | clicked
1 | 1 | 18-Sep '16, 4:02 | sowiport | 5,265 | 1 | 1 | 239 | title | … | 95 | yes | 18-Sep '16, 4:03 | NULL
2 | 2 | 18-Sep '16, 4:03 | sowiport | 854 | 2 | 2 | 21 | abstract | … | 4,588 | no | 18-Sep '16, 4:04 | NULL
3 | 2 | 18-Sep '16, 4:03 | sowiport | 854 | 2 | 3 | 21 | abstract | … | 9,648 | no | 18-Sep '16, 4:04 | 18-Sep '16, 4:06
4 | 2 | 18-Sep '16, 4:03 | sowiport | 854 | 2 | 4 | 21 | abstract | … | 445 | no | 18-Sep '16, 4:04 | NULL
5 | 3 | 18-Sep '16, 4:05 | sowiport | 917 | 3 | 5 | 3 | NULL | … | 776 | no | 18-Sep '16, 4:05 | 18-Sep '16, 4:08
6 | 3 | 18-Sep '16, 4:05 | sowiport | 917 | 3 | 6 | 3 | NULL | … | 95 | no | 18-Sep '16, 4:05 | NULL
… | … | … | … | … | … | … | … | … | … | … | … | … | …
93,955,963 | 13,482,391 | 30-Sep '18, 23:48 | jabref | 5,265 | 13,482,391 | 93,955,963 | 21 | abstract | … | 95 | no | 30-Sep '18, 23:48 | 30-Sep '18, 23:48
93,955,964 | 13,482,391 | 30-Sep '18, 23:48 | jabref | 5,265 | 13,482,391 | 93,955,964 | 21 | abstract | … | 5,846 | no | 30-Sep '18, 23:48 | 30-Sep '18, 23:50
93,955,965 | 13,482,391 | 30-Sep '18, 23:48 | jabref | 5,265 | 13,482,391 | 93,955,965 | 21 | abstract | … | 778 | no | 30-Sep '18, 23:48 | NULL
93,955,966 | 13,482,392 | 30-Sep '18, 23:50 | sowiport | 64 | 13,482,392 | 93,955,966 | 12 | title | … | 168 | yes | 30-Sep '18, 23:51 | NULL
Fig. 4. Illustration of the recommendation log
For each recommended article, the recommendation log contains more than 40 features
(columns) including:
• Information about the partner, query article, and time.
• The id of the recommendation response, recommended articles, and various rank-
ing information before and after Mr. DLib’s A/B selection.
• Information about the recommendation technique used, including algorithm fea-
tures (relevancy and popularity metrics, content features where relevant) and the
processing time needed to generate these recommendations.
• The user response (a click/selection, if any) and the time of this response.
The log includes 113,954 clicks received for the 94m recommendations, which equals
a click-through rate (CTR) of 0.12%. Clicks were counted only once, i.e. if a user
clicked the same recommendation multiple times, only the first click was logged.
103,856 of the 13.5m responses contained at least one clicked recommendation
(0.77%). Based on feedback from colleagues and reviewers, we are aware that many consider a CTR of 0.12% to be very low, as some other recommender systems report CTRs of 5% and higher [37]. However, these other recommender systems provide personalized recommendations and display them very prominently, for instance in a pop-up dialog. Recommender systems like Mr. DLib – i.e. systems that provide non-personalized related-item recommendations in an unobtrusive way – achieve CTRs similar to Mr. DLib’s, or lower. The various user-interface factors that influence click-through rates are beyond the scope of this article. Suffice it to say that Mr. DLib’s recommendations are typically presented in a manner designed not to distract the user, which no doubt tends to reduce the CTR.²
RARD II’s recommendation log enables researchers to reconstruct the fine details
of these historic recommendation sessions, and the responses of users, for the purpose
of benchmarking and/or evaluating new article recommendation strategies, and training
machine-learning models for meta-learning and algorithm selection.
² We recently re-implemented the Mr. DLib system and observed higher click-through rates of around 0.7%. This may be caused by the much larger number of indexed items in the new Mr. DLib (120 million instead of 24 million); however, further analysis is necessary.
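As a first step towards such analyses, the recommendation log can be loaded and the headline numbers of Table 1 recomputed. The sketch below assumes the column names shown in Fig. 4 (partner, rspns_id, clicked, …) and that the clicked field is NULL when a recommendation received no click; the exact schema should be taken from the dataset documentation.

```python
import pandas as pd

# Load only the columns needed for CTR statistics; the full file is ~20 GB,
# so a column subset (or chunked reading) is advisable.
cols = ["partner", "rspns_id", "rec_doc_id", "algo_id", "clicked"]
log = pd.read_csv("recommendation_log.csv", usecols=cols)

# A recommendation counts as clicked if the 'clicked' timestamp is not NULL.
log["is_click"] = log["clicked"].notna()

# Overall and per-partner click-through rate (clicks / delivered recommendations).
overall_ctr = log["is_click"].mean()
ctr_by_partner = log.groupby("partner")["is_click"].mean()

# Share of responses that received at least one click.
responses_with_click = log.groupby("rspns_id")["is_click"].any().mean()

print(f"Overall CTR: {overall_ctr:.4%}")
print(ctr_by_partner)
print(f"Responses with at least one click: {responses_with_click:.2%}")
```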
One example would be an analysis of how the effectiveness of the recommender system changes over time (Fig. 5). Between September 2016 and mid-February 2017, Mr. DLib delivered around 10 million recommendations per month to Sowiport. This number decreased to around 2 million recommendations per month from February 2017 onwards, due to a change in technology: initially, Sowiport requested recommendations from its server even when web crawlers were crawling Sowiport’s website; in February 2017, Sowiport switched to a JavaScript client that ignored web crawlers. The click-through rate for Sowiport decreased slightly over time, from around 0.17% at the beginning to around 0.14% in November 2017. From December 2017 onwards, the click-through rate on Sowiport dropped to near 0%. This decrease is due to Sowiport’s termination of its main service, i.e. the search function, in December 2017. The search interface was deactivated, although the individual article detail pages remained online. These pages are still indexed in Google and lead some visitors to Sowiport. However, these visitors typically leave the website quickly and click few or no recommendations.
In April 2017, JabRef integrated Mr. DLib into its Version 4.0 beta (Fig. 5). The users of this beta version requested 1.7 thousand recommendations in April 2017. These numbers increased to 17 thousand recommendations in May and remained stable at around 25 thousand recommendations in the following months until September 2017. The click-through rate during these months was comparatively high (up to 0.82%). Following the final release of JabRef Version 4.0 in October 2017, the volume of recommendations increased by a factor of 6, to an average of 150 thousand recommendations per month. The click-through rate stabilised at around 0.2%.
[Fig. 5 chart: number of sets / recommendations (log scale) and click-through rate (CTR) per month for Sowiport and JabRef. The underlying monthly values follow below, left to right from Sep-16 to Sep-18; the JabRef rows start in Apr-17.]
#Sets (Sowiport) 440K 1339K 1392K 1684K 1986K 834K 280K 304K 462K 416K 392K 386K 301K 361K 432K 383K 551K 151K 271K 80K 308K 112K 213K 56K 36K
#Recs. (Sowiport) 4.4M 10.7M 10.4M 12.8M 15.1M 5.8M 1.7M 1.8M 2.8M 2.5M 2.3M 2.3M 1.8M 2.2M 2.6M 2.3M 3.3M .9M 1.6M .5M 1.8M .7M 1.3M .3M .2M
#Sets (JabRef) 297 3K 4K 4K 5K 4K 22K 24K 21K 29K 24K 27K 27K 26K 24K 23K 23K 21K
#Recs. (JabRef) 1.7K 16.9K 22.4K 21.7K 27.5K 24.7K 131K 142K 125K 173K 144K 160K 160K 153K 144K 135K 139K 126K
CTR (Sowiport) 0.19% 0.17% 0.17% 0.10% 0.08% 0.17% 0.15% 0.16% 0.12% 0.11% 0.10% 0.10% 0.13% 0.15% 0.14% 0.11% 0.07% 0.02% 0.01% 0.02% 0.02% 0.02% 0.01% 0.01% 0.00%
CTR (JabRef) 0.40% 0.82% 0.76% 0.42% 0.60% 0.34% 0.24% 0.16% 0.18% 0.16% 0.20% 0.17% 0.19% 0.17% 0.22% 0.21% 0.20% 0.19%
Fig. 5. Number of Sets and Recommendations, and CTR by Month for Sowiport and JabRef
In addition to content-based filtering approaches, Mr. DLib also recommends documents according to manually defined user models with our stereotype algorithm [38]. We also recommend frequently viewed documents with our most-popular algorithm [38]. Content-based filtering algorithms are used most frequently, as our previous evaluations show that these are most effective for users of Sowiport and JabRef [19,38,39]. The distribution of algorithm usage within RARD II is shown in Fig. 6.
Fig. 6. Total number of recommendations delivered with each algorithm
RARD II’s detailed log entries make it uniquely suitable for analyses of algorithm se-
lection approaches. Because all parameters for each algorithm are logged, as well as
scenario attributes such as the time of day, the relationships between algorithm perfor-
mance and this meta-data may be learned and evaluated. Meta-learning could be used
to predict optimal algorithms per-request, using logged scenario attributes as meta-fea-
tures. For example, a specific variant of content-based filtering might be most effective
for users of JabRef at a certain time of day. Furthermore, as RARD II includes data from
multiple recommendation scenarios, algorithm performance could be learned at a
macro/global level, i.e., per platform [40].
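A minimal sketch of such a meta-learning experiment is given below: a classifier is trained to predict clicks from logged scenario attributes (partner, algorithm, hour of day), and at serving time one would score each candidate algorithm for an incoming request and pick the one with the highest predicted click probability. The feature choice and the use of scikit-learn are our own illustration, not part of the dataset or of Mr. DLib.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Column names follow Fig. 4; in practice, work on a sample or in chunks,
# and adapt the timestamp parsing to the format used in the release.
log = pd.read_csv("recommendation_log.csv",
                  usecols=["query_received", "partner", "algo_id", "clicked"])

log["is_click"] = log["clicked"].notna().astype(int)
log["hour"] = (pd.to_datetime(log["query_received"], errors="coerce")
               .dt.hour.fillna(-1).astype(int))

# Scenario meta-features: partner, algorithm id, and hour of day (one-hot encoded).
X = pd.get_dummies(log[["partner", "algo_id", "hour"]].astype(str))
y = log["is_click"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0, n_jobs=-1)
model.fit(X_train, y_train)

print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```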
Based on RARD II’s recommendation logs, researchers could also analyse the per-
formance of Mr. DLib’s different recommendation algorithms and variations as shown
in Fig. 7. The figure shows the CTR for the content-based filtering recommendations,
stereotype recommendations, most-popular recommendations and a random recom-
mendations baseline. In the first months, until February 2017, the CTR of all recommendation approaches is similar, i.e. between 0.1% and 0.2%. In March 2017, the CTR for all approaches except content-based filtering decreases to near zero. We assume this is due to the JavaScript client that Sowiport used from March 2017 onwards, which did not deliver recommendations to web spiders. Apparently, the comparatively high CTR of stereotype, most-popular and random recommendations had been caused by web spiders following all hyperlinks, including the recommendations. As mentioned before, the CTR for all
recommendations, including content-based filtering, decreases again in December 2017
when Sowiport terminated its search feature.
[Fig. 7 chart: number of recommendations (log scale) and click-through rate (CTR) per month for content-based filtering, stereotype, random, and most-popular recommendations. The underlying monthly values follow below; the months are listed in the first row, and rows with fewer than 25 values do not cover every month.]
Sep-16 Oct-16 Nov-16 Dec-16 Jan-17 Feb-17 Mar-17 Apr-17 May-17 Jun-17 Jul-17 Aug-17 Sep-17 Oct-17 Nov-17 Dec-17 Jan-18 Feb-18 Mar-18 Apr-18 May-18 Jun-18 Jul-18 Aug-18 Sep-18
Cnt. Bsd. Flt., #Recs 4M 10M 9M 11M 13M 5M 1M 1M 3M 2M 2M 2M 2M 2M 3M 2M 3M 1M 2M .6M 1.9M .8M 1.3M .4M .3M
Stereotype, #Recs 45K 123K 151K 180K 198K 137K 136K 60K 54K 51K 50K 39K 46K 56K 49K 72K 20K 36K 10K 40K 15K 28K 8K 5K
Random, #Recs 157472 518140 641630 762873 288387 68052 81468 53934 48672 45672 45006 34999 41622 49104 44310 64134 17538 31980 9516 36210 13302 24732 6444 4326
Most Popular, #Recs 247K 732K 905K 1073K 452K 156K 157K 68K 63K 58K 59K 46K 54K 64K 57K 83K 23K 41K 12K 46K 17K 33K 8K 6K
Cnt. Bsd. Flt., CTR 0.19% 0.17% 0.17% 0.10% 0.08% 0.18% 0.19% 0.20% 0.13% 0.12% 0.11% 0.11% 0.14% 0.16% 0.15% 0.12% 0.07% 0.05% 0.03% 0.07% 0.03% 0.06% 0.03% 0.07% 0.07%
Stereotype, CTR 0.21% 0.14% 0.10% 0.08% 0.08% 0.01% 0.02% 0.01% 0.02% 0.01% 0.01% 0.01% 0.02% 0.02% 0.03% 0.01% 0.01% 0.00% 0.00% 0.00% 0.03% 0.00% 0.00% 0.02%
Random, CTR 0.18% 0.13% 0.10% 0.08% 0.12% 0.01% 0.02% 0.01% 0.01% 0.00% 0.01% 0.01% 0.01% 0.01% 0.02% 0.00% 0.01% 0.04% 0.00% 0.03% 0.00% 0.00% 0.00% 0.00%
Most Popular, CTR 0.18% 0.13% 0.09% 0.08% 0.15% 0.01% 0.03% 0.01% 0.01% 0.01% 0.01% 0.01% 0.04% 0.02% 0.01% 0.02% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
Fig. 7. Number of Recommendations and CTR by Month for Content-Based Filtering, Stereotype
Recommendations, Most-Popular Recommendations, and Random Recommendations.
The recommendation logs may also be used to research the effect that the feature type in content-based filtering has on performance (Fig. 8). Surprisingly, standard keywords perform best. For instance, the CTR for key phrases and word embeddings is 0.04% and 0.05% respectively in October 2017, whereas the CTR of standard terms is 0.16% in that month.
[Fig. 8 chart: number of recommendations (log scale) and click-through rate (CTR) per month for the content-based-filtering variants (keywords, key-phrases, word embeddings). The underlying monthly values follow below; the months are listed in the first row, and rows with fewer than 25 values do not cover every month.]
Sep-16 Oct-16 Nov-16 Dec-16 Jan-17 Feb-17 Mar-17 Apr-17 May-17 Jun-17 Jul-17 Aug-17 Sep-17 Oct-17 Nov-17 Dec-17 Jan-18 Feb-18 Mar-18 Apr-18 May-18 Jun-18 Jul-18 Aug-18 Sep-18
Keywords, #Recs 4.40M .58M 1.36M 1.68M 2.00M 1.72M 1.22M 1.35M 2.46M 2.20M 2.07M 2.09M 1.65M 2.02M 2.40M 2.14M 2.99M .88M 1.55M .54M 1.77M .71M 1.24M .40M .29M
Keyphrases, #Recs 416K 1M 2M 3M 925K 71K 82K 137K 146K 134K 82K 44K 47K 68K 52K 128K 45K 48K 9K 39K 14K 27K 8K 5K
Word Embd., #Recs 9K 33K 45K 35K 79K 27K 31K 8K 28K 10K 19K 6K 4K
Keywords, CTR 0.19% 0.24% 0.18% 0.09% 0.08% 0.15% 0.20% 0.20% 0.14% 0.12% 0.11% 0.11% 0.14% 0.16% 0.15% 0.12% 0.08% 0.05% 0.02% 0.06% 0.03% 0.05% 0.02% 0.05% 0.06%
Keyphrases, CTR 0.17% 0.09% 0.04% 0.03% 0.12% 0.05% 0.07% 0.04% 0.06% 0.04% 0.03% 0.05% 0.04% 0.04% 0.03% 0.01% 0.01% 0.01% 0.04% 0.01% 0.09% 0.02% 0.03% 0.02%
Word Embd., CTR 0.02% 0.05% 0.02% 0.04% 0.02% 0.01% 0.01% 0.04% 0.01% 0.06% 0.01% 0.00% 0.03%
Fig. 8. Number of Recommendations and CTR by Month for Different Content-Based-Filtering
Variations
RARD II also allows researchers to analyse the impact that processing time has on
click-through rate. Fig. 9 shows that the longer users need to wait for recommendations,
the lower the click-through rate becomes. This holds true for all algorithms being used
in Mr. DLib. For instance, when content-based filtering recommendations are returned
within 2 seconds, the average CTR is 0.16%. If recommendations are returned after 5
seconds, CTR decreases to 0.11% and if recommendations are returned after 10 sec-
onds, CTR decreases to 0.06%. This finding correlates with findings in other infor-
mation retrieval applications such as search engines [41,42].
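The analysis behind Fig. 9 can be approximated along the following lines. The processing-time column name used here (processing_time_ms) is an assumption for illustration; the actual field name is given in the dataset documentation.

```python
import pandas as pd

# 'processing_time_ms' is an assumed column name; check the documentation.
log = pd.read_csv("recommendation_log.csv",
                  usecols=["processing_time_ms", "clicked"])
log["is_click"] = log["clicked"].notna()
seconds = log["processing_time_ms"] / 1000.0

# Bin processing time as in Fig. 9: 0, 1, ..., 10, 11-20, 21+ seconds.
bins = list(range(0, 12)) + [21, float("inf")]
labels = [str(i) for i in range(11)] + ["11-20", "21+"]
log["time_bin"] = pd.cut(seconds, bins=bins, labels=labels, right=False)

ctr_by_time = log.groupby("time_bin", observed=False)["is_click"].mean() * 100
print(ctr_by_time.round(2))  # CTR (%) per processing-time bin
```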
RARD II’s recommendation log allows for many more analyses. Examples include our own analyses of the effect of bibliometric re-ranking [43], position bias (effect of a
recommendation’s rank, regardless of its actual relevance) [44], choice overload (effect
of the number of displayed recommendations) [45], and the effect of the document field
being used for content-based filtering [46].
[Fig. 9 chart: number of impressions (log scale) and click-through rate (%) by processing time in seconds. The rows below give, for each processing-time bin (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11-20, 21+ seconds), the number of impressions and the CTR in % per algorithm.]
Impressions 29275k 624k 331k 169k 101k 72k 50k 35k 27k 21k 17k 79k 24k
All algorithms 0.11 0.15 0.14 0.11 0.10 0.13 0.10 0.09 0.06 0.07 0.06 0.05 0.02
Stereotype 0.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Most Popular 0.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Random 0.01 0.00 0.03 0.00 0.02 0.00 0.03 0.00 0.00 0.00 0.00 0.01 0.00
CBF 0.12 0.16 0.15 0.12 0.11 0.13 0.10 0.10 0.06 0.07 0.06 0.05 0.02
Terms 0.12 0.16 0.14 0.12 0.11 0.12 0.11 0.10 0.07 0.10 0.08 0.05 0.01
Keyphrases 0.04 0.01 0.01 0.02 0.02 0.04 0.04 0.00 0.02 0.02 0.03 0.02 0.00
Embeddings 0.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Fig. 9. CTR based on time to calculate recommendations
3.2 The Implicit Ratings Matrices
A ratings matrix is a mainstay of most recommendation datasets. RARD II contains
two implicit, item-item rating matrices discussed below. Implicit ratings are based on
the assumption that when users click a recommended article, it is because they find the recommendation relevant (a positive rating); conversely, if they do not click a recommendation, it is assumed that the recommendation is not relevant (a negative rating). Of course, neither of these assumptions is particularly reliable. For example, just because a user selects a recommendation does not mean it will turn out to be relevant. Likewise, a user may choose not to respond because a recommendation is not relevant, or they may simply not have noticed the recommendation. However, click-related metrics such as click-through rate are commonly used, particularly in industry, and provide a good first indication of relevance.
The full ratings matrix rating_matrix_full.csv (1.8 GB) is illustrated in Fig. 10.
It contains 48,879,170 rows, one for each (source document; recommended document) tuple. For each tuple, the following information is provided:
• The id of the source document (src_doc_id) for which Sowiport or JabRef queried
recommendations. In a typical recommendation scenario, this entity may be in-
terpreted as the user to whom recommendations were made.
• The id of a related-article (rec_doc_id) that was returned for the query. This entity
may be interpreted as the item that was recommended to the ‘user’.
• The number of times (#displayed) the tuple occurs in the recommendation log,
i.e. how often the article was recommended as related to the given source docu-
ment.
• The number of times the recommended article was clicked by a user (#clicks).
• The click-through rate (CTR), which represents an implicit rating of how relevant
the recommended document was for the given source document.
row_id src_doc_id (user) rec_doc_id (item) # displayed # clicks ctr (rating)
1 2 95 18 - 0%
2 18 4,588 5 2 40%
3 18 16,854,445 1 - 0%
4 56 985 12 10 83%
… … … … … …
48,879,167 24,523,730 776 64 1 2%
48,879,168 24,523,730 125,542 5 - 0%
48,879,169 24,523,738 6,645 8 - 0%
48,879,170 24,523,738 68,944 1 1 100%
Fig. 10. Illustration of the full rating matrix
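For collaborative-filtering-style experiments, the full matrix can be loaded into a sparse structure, treating src_doc_id as the ‘user’ and rec_doc_id as the ‘item’ as described above. The sketch below assumes the column names shown in Fig. 10; the actual CSV header may differ slightly.

```python
import pandas as pd
from scipy.sparse import csr_matrix

ratings = pd.read_csv("rating_matrix_full.csv")  # columns as in Fig. 10 (assumed)

# If CTR is stored as a percentage string (e.g. "40%"), convert it to a float.
ctr = ratings["ctr"]
if ctr.dtype == object:
    ctr = ctr.str.rstrip("%").astype(float) / 100.0

# Map document IDs to contiguous row/column indices.
src_codes, src_index = pd.factorize(ratings["src_doc_id"])
rec_codes, rec_index = pd.factorize(ratings["rec_doc_id"])

# Sparse 'user' x 'item' matrix with CTR as the implicit rating.
implicit = csr_matrix((ctr, (src_codes, rec_codes)),
                      shape=(len(src_index), len(rec_index)))

print(implicit.shape, implicit.nnz, "stored entries")
```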
The full rating matrix was computed based on the full recommendation log. Hence, it
also includes data from responses for which none of the recommendations were clicked.
There are at least three reasons why users sometimes did not click on any recommen-
dation: users may not have liked any of the recommendations; users may not have seen the recommendations; or recommendations may have been delivered to a web spider or bot that did not follow the hyperlinks. In the latter two cases, the non-clicks should not be interpreted as a negative vote. However, it is not possible to identify which of the three scenarios applies to those sets in which no recommendation was clicked. There-
fore, we created a filtered rating matrix.
The filtered ratings matrix rating_matrix_filtered.csv (26 MB) contains
745,167 rows and has the same structure (Fig. 11) as the full ratings matrix (Fig. 10).
However, the matrix is based only on responses in which at least one recommendation
from the response was clicked. The rationale is that when at least one recommendation was clicked, a real user must have looked at the recommendations and decided to click some of them and not others. Consequently, the non-clicked recommendations are more likely to correspond to an actual negative vote. Compared to the full matrix, the rows representing tuples that were only delivered in responses without any clicks are missing. Also, the remaining rows tend to have lower #displayed counts than in the full ratings matrix.
id src_doc_id (user) rec_doc_id (item) # displayed # clicks ctr (rating)
1 2 95 13 - 0%
2 18 4,588 2 2 100%
3 56 985 11 10 91%
… … … … … …
745,165 24,523,730 776 32 1 3%
745,166 24,523,730 125,542 2 - 0%
745,167 24,523,738 68,944 1 1 100%
Fig. 11. Illustration of the filtered rating matrix
There are certainly further alternatives for creating implicit ratings matrices. For instance, one may argue that recommendation sets in which all recommendations were clicked might have been ‘clicked’ by a web spider that simply followed all hyperlinks on a
website. Based on the recommendation log, researchers can create their own implicit
ratings matrix.
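Following this idea, a custom matrix that keeps only responses with at least one click but discards responses in which every recommendation was clicked (possible spider traffic) could be derived from the log roughly as follows, again assuming the column names of Fig. 4.

```python
import pandas as pd

log = pd.read_csv("recommendation_log.csv",
                  usecols=["rspns_id", "src_doc_id", "rec_doc_id", "clicked"])
log["is_click"] = log["clicked"].notna()

# Keep responses with at least one click, but not those in which *all*
# recommendations were clicked (which may indicate a web spider).
per_response = log.groupby("rspns_id")["is_click"].agg(["any", "all"])
valid_ids = per_response.index[per_response["any"] & ~per_response["all"]]
filtered = log[log["rspns_id"].isin(valid_ids)]

# Aggregate into an implicit item-item ratings matrix in long form.
matrix = (filtered.groupby(["src_doc_id", "rec_doc_id"])["is_click"]
          .agg(displayed="size", clicks="sum")
          .reset_index())
matrix["ctr"] = matrix["clicks"] / matrix["displayed"]
matrix.to_csv("rating_matrix_custom.csv", index=False)
```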
Fig. 12 shows the distribution of views and clicks of the tuples (query document;
recommended document). 93.9% of all tuples occur only once. This means that it rarely happened that a document pair (source document x and recommended article y) was recommended twice or more often; in fact, only 4.72% of the tuples were delivered twice, and 0.73% of the tuples were delivered three times. Most tuples did
not receive any click (85.48%). 14% of the tuples received one click, and 0.4% received
two clicks.
[Fig. 12 chart: distribution of how often a (source document; recommended document) tuple was displayed and clicked, for 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, and 11+ views/clicks:
Clicked: 85.440% 14.187% 0.279% 0.051% 0.019% 0.008% 0.005% 0.002% 0.002% 0.001% 0.001% 0.007%
Displayed: 94.64% 4.12% 0.70% 0.23% 0.11% 0.06% 0.04% 0.02% 0.02% 0.01% 0.05% (no value for 0, since every tuple was displayed at least once)]
Fig. 12. Statistics of the filtered ratings matrix
3.3 External IDs (Sowiport, Mendeley, arXiv, …)
The third element of the data release is the list of external document IDs (external_IDs.csv, 924 MB). There are 33m such IDs for the Sowiport and CORE documents used. In addition, for a subset of the Sowiport IDs there are associated identifiers for Mendeley (18%), ISSN (16%), DOI (14%), Scopus IDs (13%), ISBN (11%), PubMed IDs (7%) and arXiv IDs (0.4%). This third sub-dataset is provided to facilitate researchers
in obtaining additional document data and meta-data from APIs provided by Sowiport
[47], CORE [48], and Mendeley [49].
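To join such metadata onto the log or the ratings matrices, the external IDs can be merged on the internal document ID. The column names of external_IDs.csv used below (doc_id, id_type, external_id) are assumptions for illustration only; the actual header should be checked before running.

```python
import pandas as pd

# Assumed layout: one row per (internal doc_id, id_type, external_id).
ext = pd.read_csv("external_IDs.csv")
dois = (ext[ext["id_type"] == "doi"][["doc_id", "external_id"]]
        .rename(columns={"external_id": "doi"}))

ratings = pd.read_csv("rating_matrix_filtered.csv")
enriched = ratings.merge(dois, left_on="rec_doc_id", right_on="doc_id", how="left")

# Documents with a DOI can now be looked up via the APIs of CORE or Mendeley.
print(f"{enriched['doi'].notna().mean():.1%} of recommended documents have a DOI")
```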
4 Discussion & Related Work
We have introduced RARD II, a large-scale, richly detailed dataset for recommender
systems research based on more than 94m delivered scientific article recommendations.
It contributes to a growing number of such datasets, most focusing on ratings data,
ranging in scale from a few thousand datapoints to tens or even hundreds of millions
of datapoints [50].
The domain of the RARD II data release (article recommendation) distinguishes it
from more common e-commerce domains, but it is not unique in this regard [51–56].
Among others, CiteULike [57,58] and Bibsonomy [59,60] published datasets contain-
ing the social tags that their users added to research articles and, although not intended
specifically for use in recommender systems research, these datasets have nonetheless
been used to evaluate research-paper recommender systems [61–69]. Jack et al. compiled a Mendeley dataset [70] based on 50,000 randomly selected personal libraries from 1.5m users, covering 3.6m unique articles. Similarly, Docear published a dataset from its recommender system [71], comprising the metadata of 9.4m articles, their citation network, related user information, and the details of 300,000 recommendations.
While RARD II shares some similarities with some of these datasets, particularly
Docear and, of course, its predecessor RARD I [26], it is one of the few examples of a dataset from the digital library domain that has been specifically designed to support recommender systems research, and it provides data at a scale that no other dataset in this domain provides. Because of this, it includes information that is especially valuable to
recommender systems researchers, not just ratings data but also the recommendation
logs, which provide a unique insight into all aspects of the sessions that generated the
recommendations and led to user clicks. Considering the scale and variety of data,
RARD II is a unique and valuable addition to existing datasets.
Many datasets are pruned, i.e. data that is considered not optimal is removed. For instance, the MovieLens datasets contain only data from users who rated at least 20 movies [12]. Such pruned datasets are convenient for researchers because approaches such as collaborative filtering typically work very well on them. However, such datasets are not realistic, as dealing with noise is a crucial task in real-world recommender systems that are used in production. RARD II is not pruned, i.e. no data was removed. We believe that giving access to the full data, including the many non-clicked recommendations, is valuable for researchers who want to conduct research in a realistic scenario.
5 Limitations and Future Work
RARD II is a unique dataset with high value to recommender-systems researchers, par-
ticularly in the domain of digital libraries. However, we see areas for improvement,
which we plan to address in the future.
Currently, RARD only contains the recommendation log and matrices from Mr.
DLib, and Mr. DLib delivers recommendations only to two service partners. In the long
run, we aim to make RARD a dataset that contains data from many RaaS operators. In
addition to Mr. DLib, RaaS operators like Babel [72] and the CORE Recommender
[33,48] could contribute their data. Additional service partners of Mr. DLib could also
increase the value of the RARD releases. RARD would also benefit from having per-
sonalized recommendation algorithms included in addition to the non-personalized re-
lated-article recommendations. We are also aware of the limitations of clicks as evaluation metrics. Future versions of RARD will include additional metrics such as
real user ratings, and other implicit metrics. For instance, knowing whether a recom-
mended document was eventually added to a JabRef user’s library would provide val-
uable information. While RARD contains detailed information about the applied algo-
rithms and parameters, information about the items is limited. We hope to be able to
include more item-related data in future releases (e.g. metadata such as author names
and document titles).
References
[1] D. Harman, “Overview of the First Text REtrieval Conference (TREC-1),” NIST
Special Publication 500-207: The First Text REtrieval Conference (TREC-1),
1992.
[2] E.M. Voorhees and A. Ellis, eds., The Twenty-Fifth Text REtrieval Conference
(TREC 2016) Proceedings, National Institute of Standards and Technology
(NIST), 2016.
[3] A. Aizawa, M. Kohlhase, and I. Ounis, “NTCIR-10 Math Pilot Task Overview.,”
NTCIR, 2013.
[4] R. Zanibbi, A. Aizawa, M. Kohlhase, I. Ounis, G. Topi, and K. Davila, “NTCIR-
12 MathIR task overview,” NTCIR, National Institute of Informatics (NII), 2016.
[5] F. Hopfgartner, T. Brodt, J. Seiler, B. Kille, A. Lommatzsch, M. Larson, R.
Turrin, and A. Serény, “Benchmarking news recommendations: The clef newsreel
use case,” ACM SIGIR Forum, ACM, 2016, pp. 129–136.
[6] M. Koolen, T. Bogers, M. Gäde, M. Hall, I. Hendrickx, H. Huurdeman, J. Kamps,
M. Skov, S. Verberne, and D. Walsh, “Overview of the CLEF 2016 Social Book
Search Lab,” International Conference of the Cross-Language Evaluation Forum
for European Languages, Springer, 2016, pp. 351–370.
[7] D. Dheeru and E. Karra Taniskidou, “UCI Machine Learning Repository,”
University of California, Irvine, School of Information and Computer Sciences.
http://archive.ics.uci.edu/ml, 2017.
[8] J. Vanschoren, J.N. van Rijn, B. Bischl, and L. Torgo, “OpenML: Networked
Science in Machine Learning,” SIGKDD Explorations, vol. 15, 2013, pp. 49–60.
[9] SIGKDD, “KDD Cup Archives,” ACM Special Interest Group on Knowledge
Discovery and Data Mining. http://www.kdd.org/kdd-cup, 2018.
[10] L. Kotthoff, “Algorithm selection for combinatorial search problems: A survey,”
Data Mining and Constraint Programming, Springer, 2016, pp. 149–190.
[11] F. Hutter, L. Kotthoff, and J. Vanschoren, “Automatic machine learning:
methods, systems, challenges,” Challenges in Machine Learning, 2019.
[12] F.M. Harper and J.A. Konstan, “The movielens datasets: History and context,”
ACM Transactions on Interactive Intelligent Systems (TiiS), vol. 5, 2016, p. 19.
[13] S. Dooms, A. Bellogín, T.D. Pessemier, and L. Martens, “A Framework for
Dataset Benchmarking and Its Application to a New Movie Rating Dataset,” ACM
Trans. Intell. Syst. Technol., vol. 7, 2016, pp. 41:1–41:28.
[14] C.-N. Ziegler, S.M. McNee, J.A. Konstan, and G. Lausen, “Improving
recommendation lists through topic diversification,” Proceedings of the 14th
international conference on World Wide Web, ACM, 2005, pp. 22–32.
[15] T. Bertin-Mahieux, D.P.W. Ellis, B. Whitman, and P. Lamere, “The Million Song
Dataset,” Proceedings of the 12th International Conference on Music Information
Retrieval (ISMIR 2011), 2011.
[16] B. Kille, F. Hopfgartner, T. Brodt, and T. Heintz, “The plista dataset,”
Proceedings of the 2013 International News Recommender Systems Workshop
and Challenge, ACM, 2013, pp. 16–23.
[17] Google, “Search Results Mentioning MovieLens on Google Scholar,”
https://scholar.google.de/scholar?q=“movielens”. Retrieved 14 April 2018,
2018.
[18] J. Beel, A. Aizawa, C. Breitinger, and B. Gipp, “Mr. DLib: Recommendations-
as-a-service (RaaS) for Academia,” Proceedings of the 17th ACM/IEEE Joint
Conference on Digital Libraries, Toronto, Ontario, Canada: IEEE Press, 2017,
pp. 313–314.
[19] J. Beel and S. Dinesh, “Real-World Recommender Systems for Academia: The
Gain and Pain in Developing, Operating, and Researching them,” Proceedings of
the Fifth Workshop on Bibliometric-enhanced Information Retrieval (BIR) co-
located with the 39th European Conference on Information Retrieval (ECIR
2017), P. Mayr, I. Frommholz, and G. Cabanac, eds., 2017, pp. 6–17.
[20] J. Beel, B. Gipp, S. Langer, M. Genzmehr, E. Wilde, A. Nuernberger, and J.
Pitman, “Introducing Mr. DLib, a Machine-readable Digital Library,”
Proceedings of the 11th ACM/IEEE Joint Conference on Digital Libraries
(JCDL‘11), ACM, 2011, pp. 463–464.
[21] D. Hienert, F. Sawitzki, and P. Mayr, “Digital library research in action –
supporting information retrieval in Sowiport,” D-Lib Magazine, vol. 21, 2015.
[22] P. Mayr, “Sowiport User Search Sessions Data Set (SUSS),” GESIS Datorium,
2016.
[23] F. Sawitzki, M. Zens, and P. Mayr, “Referenzen und Zitationen zur Unterstützung
der Suche in SOWIPORT,” Internationales Symposium der
Informationswissenschaft (ISI 2013). Informationswissenschaft zwischen
virtueller Infrastruktur und materiellen Lebenswelten, DEU, 2013, p. 5.
[24] M. Stempfhuber, P. Schaer, and W. Shen, “Enhancing visibility: Integrating grey
literature in the SOWIPORT Information Cycle,” International Conference on
Grey Literature, 2008, pp. 23–29.
[25] O. Kopp, U. Breitenbuecher, and T. Mueller, “CloudRef - Towards Collaborative
Reference Management in the Cloud,” Proceedings of the 10th Central European
Workshop on Services and their Composition, 2018.
[26] J. Beel, Z. Carevic, J. Schaible, and G. Neusch, “RARD: The Related-Article
Recommendation Dataset,” D-Lib Magazine, vol. 23, Jul. 2017, pp. 1–14.
[27] J. Beel, “On the popularity of reference managers, and their rise and fall,” Docear
Blog. https://www.docear.org/2013/11/11/on-the-popularity-of-reference-
managers-and-their-rise-and-fall/, 2013.
[28] P. Knoth and N. Pontika, “Aggregating Research Papers from Publishers’
Systems to Support Text and Data Mining: Deliberate Lack of Interoperability or
Not?,” Proceedings of INTEROP2016, INTEROP2016, 2016.
[29] P. Knoth and Z. Zdrahal, “CORE: three access levels to underpin open access,”
D-Lib Magazine, vol. 18, 2012.
[30] N. Pontika, P. Knoth, M. Cancellieri, and S. Pearce, “Developing Infrastructure
to Support Closer Collaboration of Aggregators with Open Repositories,” LIBER
Quarterly, vol. 25, Apr. 2016.
[31] L. Anastasiou and P. Knoth, “Building Scalable Digital Library Ingestion
Pipelines Using Microservices,” Proceedings of the 11th International
Conference on Metadata and Semantic Research (MTSR), Springer, 2018, p. 275.
[32] P. Knoth, L. Anastasiou, A. Charalampous, M. Cancellieri, S. Pearce, N. Pontika,
and V. Bayer, “Towards effective research recommender systems for
repositories,” Proceedings of the Open Repositories Conference, 2017.
[33] N. Pontika, L. Anastasiou, A. Charalampous, M. Cancellieri, S. Pearce, and P.
Knoth, “CORE Recommender: a plug in suggesting open access content,”
http://hdl.handle.net/1842/23359, 2017.
[34] J. Beel, A. Collins, O. Kopp, L. Dietz, and P. Knoth, “Mr. DLib’s Living Lab for
Scholarly Recommendations,” Proceedings of the 41st European Conference on
Information Retrieval (ECIR), 2019.
[35] F. Ferrara, N. Pudota, and C. Tasso, “A Keyphrase-Based Paper Recommender
System,” Proceedings of the IRCDL’11, Springer, 2011, pp. 14–25.
[36] Creative Commons, “Creative Commons Attribution 3.0 Unported (CC BY 3.0),”
https://creativecommons.org/licenses/by/3.0/, 2018.
[37] J. Beel, “Towards Effective Research-Paper Recommender Systems and User
Modeling based on Mind Maps,” PhD Thesis. Otto-von-Guericke Universität
Magdeburg, 2015.
[38] J. Beel, S. Dinesh, P. Mayr, Z. Carevic, and J. Raghvendra, “Stereotype and Most-
Popular Recommendations in the Digital Library Sowiport,” Proceedings of the
15th International Symposium of Information Science (ISI), 2017, pp. 96–108.
[39] S. Feyer, S. Siebert, B. Gipp, A. Aizawa, and J. Beel, “Integration of the Scientific
Recommender System Mr. DLib into the Reference Manager JabRef,”
Proceedings of the 39th European Conference on Information Retrieval (ECIR),
2017, pp. 770–774.
[40] A. Collins, D. Tkaczyk, and J. Beel, “A Novel Approach to Recommendation
Algorithm Selection using Meta-Learning,” Proceedings of the 26th Irish
Conference on Artificial Intelligence and Cognitive Science (AICS), CEUR-WS,
2018, pp. 210–219.
[41] R. Kohavi, A. Deng, R. Longbotham, and Y. Xu, “Seven rules of thumb for web
site experimenters,” Proceedings of the 20th ACM SIGKDD international
conference on Knowledge discovery and data mining, ACM, 2014, pp. 1857–
1866.
[42] E. Schurman and J. Brutlag, “The user and business impact of server delays,
additional bytes, and HTTP chunking in web search,” Velocity Web Performance
and Operations Conference, 2009.
[43] S. Siebert, S. Dinesh, and S. Feyer, “Extending a Research Paper
Recommendation System with Bibliometric Measures,” 5th International
Workshop on Bibliometric-enhanced Information Retrieval (BIR) at the 39th
European Conference on Information Retrieval (ECIR), 2017.
[44] A. Collins, D. Tkaczyk, A. Aizawa, and J. Beel, “Position Bias in Recommender
Systems for Digital Libraries,” Proceedings of the iConference, Springer, 2018,
pp. 335–344.
[45] F. Beierle, A. Aizawa, and J. Beel, “Choice Overload in Research-Paper
Recommender Systems,” International Journal of Digital Libraries, 2019.
[46] A. Collins and J. Beel, “Keyphrases vs. Document Embeddings vs. Terms for
Recommender Systems: An Online Evaluation,” Proceedings of the ACM/IEEE-
CS Joint Conference on Digital Libraries (JCDL), 2019.
[47] GESIS, “Sowiport OAI API,” http://sowiport.gesis.org/OAI/Home, 2017.
[48] CORE, “CORE Open API and Datasets,” https://core.ac.uk, 2018.
[49] Mendeley, “Mendeley Developer Portal (Website),” http://dev.mendeley.com/,
2016.
[50] A. Gude, “The Nine Must-Have Datasets for Investigating Recommender
Systems,” Blog. https://gab41.lab41.org/the-nine-must-have-datasets-for-
investigating-recommender-systems-ce9421bf981c, 2016.
[51] M. Lykke, B. Larsen, H. Lund, and P. Ingwersen, “Developing a test collection
for the evaluation of integrated search,” European Conference on Information
Retrieval, Springer, 2010, pp. 627–630.
[52] K. Sugiyama and M.-Y. Kan, “Scholarly paper recommendation via user’s recent
research interests,” Proceedings of the 10th ACM/IEEE Annual Joint Conference
on Digital Libraries (JCDL), ACM, 2010, pp. 29–38.
[53] K. Sugiyama and M.-Y. Kan, “A comprehensive evaluation of scholarly paper
recommendation using potential citation papers,” vol. 16, 2015, pp. 91–109.
[54] K. Sugiyama and M.-Y. Kan, “Exploiting potential citation papers in scholarly
paper recommendation,” Proceedings of the 13th ACM/IEEE-CS joint conference
on Digital libraries, ACM, 2013, pp. 153–162.
[55] D. Roy, K. Ray, and M. Mitra, “From a Scholarly Big Dataset to a Test Collection
for Bibliographic Citation Recommendation,” AAAI Workshop: Scholarly Big
Data, 2016.
[56] D. Roy, “An improved test collection and baselines for bibliographic citation
recommendation,” Proceedings of the 2017 ACM on Conference on Information
and Knowledge Management, ACM, 2017, pp. 2271–2274.
[57] CiteULike, “Data from CiteULike’s new article recommender,” Blog,
http://blog.citeulike.org/?p=136, Nov. 2009.
[58] K. Emamy and R. Cameron, “Citeulike: a researcher’s social bookmarking
service,” Ariadne, 2007.
[59] D. Benz, A. Hotho, R. Jäschke, B. Krause, F. Mitzlaff, C. Schmitz, and G.
Stumme, “The Social Bookmark and Publication Management System
BibSonomy,” The VLDB Journal, vol. 19, 2010, pp. 849–875.
[60] Bibsonomy, “BibSonomy :: dumps for research purposes.,”
https://www.kde.cs.uni-kassel.de/bibsonomy/dumps/, 2018.
[61] C. Caragea, A. Silvescu, P. Mitra, and C.L. Giles, “Can’t See the Forest for the
Trees? A Citation Recommendation System,” iConference 2013 Proceedings,
2013, pp. 849–851.
[62] R. Dong, L. Tokarchuk, and A. Ma, “Digging Friendship: Paper Recommendation
in Social Network,” Proceedings of Networking & Electronic Commerce
Research Conference (NAEC 2009), 2009, pp. 21–28.
[63] Q. He, J. Pei, D. Kifer, P. Mitra, and L. Giles, “Context-aware citation
recommendation,” Proceedings of the 19th international conference on World
wide web, ACM, 2010, pp. 421–430.
[64] W. Huang, S. Kataria, C. Caragea, P. Mitra, C.L. Giles, and L. Rokach,
“Recommending citations: translating papers into references,” Proceedings of the
21st ACM international conference on Information and knowledge management,
ACM, 2012, pp. 1910–1914.
[65] S. Kataria, P. Mitra, and S. Bhatia, “Utilizing context in generative bayesian
models for linked corpus,” Proceedings of the 24th AAAI Conference on Artificial
Intelligence, 2010, pp. 1340–1345.
[66] D.M. Pennock, E. Horvitz, S. Lawrence, and C.L. Giles, “Collaborative filtering
by personality diagnosis: A hybrid memory-and model-based approach,”
Proceedings of the Sixteenth conference on Uncertainty in artificial intelligence,
Morgan Kaufmann Publishers Inc., 2000, pp. 473–480.
[67] L. Rokach, P. Mitra, S. Kataria, W. Huang, and L. Giles, “A Supervised Learning
Method for Context-Aware Citation Recommendation in a Large Corpus,”
Proceedings of the Large-Scale and Distributed Systems for Information
Retrieval Workshop (LSDS-IR), 2013, pp. 17–22.
[68] R. Torres, S.M. McNee, M. Abel, J.A. Konstan, and J. Riedl, “Enhancing digital
libraries with TechLens+,” Proceedings of the 4th ACM/IEEE-CS joint
conference on Digital libraries, ACM New York, NY, USA, 2004, pp. 228–236.
[69] F. Zarrinkalam and M. Kahani, “SemCiR - A citation recommendation system
based on a novel semantic distance measure,” Program: electronic library and
information systems, vol. 47, 2013, pp. 92–112.
[70] K. Jack, M. Hristakeva, R.G. de Zuniga, and M. Granitzer, “Mendeley’s Open
Data for Science and Learning: A Reply to the DataTEL Challenge,”
International Journal of Technology Enhanced Learning, vol. 4, 2012, pp. 31–
46.
[71] J. Beel, S. Langer, B. Gipp, and A. Nuernberger, “The Architecture and Datasets
of Docear’s Research Paper Recommender System,” D-Lib Magazine, vol. 20,
2014.
[72] I. Wesley-Smith and J.D. West, “Babel: A Platform for Facilitating Research in
Scholarly Article Discovery,” Proceedings of the 25th International Conference
Companion on World Wide Web, International World Wide Web Conferences
Steering Committee, 2016, pp. 389–394.