<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>rrecsys: an R-package for prototyping recommendation algorithms</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ludovik Çoba</string-name>
          <email>lcoba@unishk.edu.al</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Markus Zanker</string-name>
          <email>mzanker@unibz.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Free University of Bozen-Bolzano</institution>
          ,
          <addr-line>piazza Domenicani, 3, 39100 Bolzano</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Universiteti i Shkodrës "Luigj Gurakuqi"</institution>
          ,
          <addr-line>Sheshi 2 Prilli, Shkodër</addr-line>
          ,
          <country country="AL">Albania</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <abstract>
<p>We introduce rrecsys, an open source extension package in R for rapid prototyping and intuitive assessment of recommender system algorithms. As the only currently available R package for recommender algorithms (recommenderlab) did not include popular algorithm implementations such as matrix factorization or one-class Collaborative Filtering algorithms, we developed rrecsys as an easily accessible tool that can, for instance, be employed for interactive demonstrations when teaching. This package replicates state-of-the-art Collaborative Filtering algorithms for rating and binary data, and we compare results with the Java-based LensKit implementation and with recommenderlab for the purpose of benchmarking the implementation. Therefore this work can also be seen as a contribution in the context of replication of algorithm implementations and reproduction of evaluation results.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
R is a popular choice for Data Analytics and
Machine Learning. The software has a low setup cost and
offers a large selection of packages and functionality for
enhancing and prototyping algorithms with compact code and
good visualization tools. Thus R is a suitable
environment for exploring the field of recommender systems.
We present and contribute a novel R package, rrecsys1, that
replicates several state-of-the-art recommender algorithms
for Likert-scaled as well as binary rating values. Up to now
there has been only one package addressing recommender systems,
recommenderlab2, which lacks implementations of several popular
algorithms; we benchmark results in Section 4.
This work can be seen as a contribution towards the
reproducibility of algorithms and results. Although
reproducibility of experimental results is a fundamental
prerequisite for scientific research, it is often not
guaranteed in the recommender systems field. For instance,
Said et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] pointed out that major recommendation
frameworks such as MyMediaLite, LensKit and Apache
Mahout show major differences in the implementation of the
same algorithm variants and in their evaluation
methodology. According to Said et al., these differences are often
much larger than the typically reported performance
1https://cran.r-project.org/package=rrecsys
2https://cran.r-project.org/package=recommenderlab
improvements of a new algorithm over the selected baseline
technique. Prototyping helps shape recommender
algorithms and evaluation methodologies, and is thus a strategy for
tackling the issue of reproducibility directly. Furthermore,
teaching recommendation concepts and evaluation methodology
in hands-on sessions is highly relevant for understanding ideas
and algorithms from a didactic perspective and for making the
learning experience more student-centered.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. THE PACKAGE</title>
      <p>rrecsys has a modular structure and provides
expansion capabilities. The core of the package includes
implementations of several popular algorithms: Most
Popular, Global Average, Item Average, User Average, Item-Based
k-Nearest Neighbors, Simon Funk's SVD, Weighted
Alternating Least Squares and Bayesian Personalized
Ranking. The package's evaluation module is based on the k-fold
cross-validation method. A stratified random selection
procedure is applied when dividing the rated items of each user
into k folds, such that each user is uniformly represented
in each fold. Depending on the task (rating prediction or
recommendation) the following metrics are computed: mean
absolute error (MAE), root mean square error (RMSE),
Precision, Recall, F1, True and False Positives, True and False
Negatives, normalized discounted cumulative gain (NDCG),
rank score, area under the ROC curve (AUC) and catalog
coverage. The RMSE and MAE metrics are computed in
two variants, user-based and global.</p>
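      <p>The stratified fold selection described above can be sketched in a few lines of plain R (an illustrative toy example, not the package's internal code; the list-of-users data layout is hypothetical):</p>

```r
# Assign each user's rated item ids to k folds so that every
# user is (nearly) uniformly represented in each fold.
stratifiedFolds <- function(ratingsPerUser, k = 5) {
  lapply(ratingsPerUser, function(itemIds) {
    shuffled <- sample(itemIds)                     # random order per user
    split(shuffled, rep_len(1:k, length(shuffled))) # round-robin over folds
  })
}

set.seed(1)
folds <- stratifiedFolds(list(u1 = 1:10, u2 = 11:23), k = 5)
lengths(folds$u1)  # every fold gets exactly 2 of u1's 10 ratings
```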
    </sec>
    <sec id="sec-3">
      <title>3. RRECSYS IN ACTION</title>
      <p>In this section we present an executable R script that
exercises some of the functionality of rrecsys in order to
demonstrate its intuitive use.
# Install and load:
install.packages("rrecsys")
library(rrecsys)
# The MovieLens Latest dataset ships with the package.
data("mlLatest100k")
# Define a rating matrix and explore it.
mlLatest &lt;- defineData(mlLatest100k,
  minimum = .5, maximum = 5, halfStar = TRUE)
sparsity(mlLatest); numRatings(mlLatest)
rowRatings(mlLatest); colRatings(mlLatest)
smallMlLatest &lt;- mlLatest[rowRatings(mlLatest) &gt;= 200,
  colRatings(mlLatest) &gt; 10]
# Setting up the number of iterations for FunkSVD.
setStoppingCriteria(nrLoops = 50)
# Training a model using FunkSVD.
svd10 &lt;- rrecsys(smallMlLatest, "FunkSVD", k = 10,
lambda = 0.001, gamma = 0.0015)
# Using the trained model to predict and recommend.
p &lt;- predict(svd10)
r &lt;- recommend(svd10, topN = 10)
# Instantiate an evaluation model.
model &lt;- evalModel(smallMlLatest, folds = 5)
# Using the above model to evaluate predictions.
evalPred(model, "IBKNN", neigh = 10)
# Using the same model to evaluate recommendations.
evalRec(model, "globalAverage", topN = 10,
goodRating = 3)</p>
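      <p>The "FunkSVD" call above learns k latent features by stochastic gradient descent. As a rough sketch of the underlying update rule, here in plain R on a made-up toy matrix with toy parameter values (lambda and gamma play the same regularization and learning-rate roles as in the call above; the package's actual implementation differs in details such as feature-wise training and stopping criteria):</p>

```r
R <- matrix(c(5, 3, NA, 4, NA, 1, 2, 5, 4), nrow = 3)  # toy ratings, NA = unrated
k <- 2; gamma <- 0.01; lambda <- 0.001
U <- matrix(0.1, nrow(R), k)   # user feature matrix
V <- matrix(0.1, ncol(R), k)   # item feature matrix
obs <- which(!is.na(R), arr.ind = TRUE)
for (loop in 1:200) {
  for (n in seq_len(nrow(obs))) {
    u <- obs[n, 1]; i <- obs[n, 2]
    e <- R[u, i] - sum(U[u, ] * V[i, ])  # prediction error on one rating
    U[u, ] <- U[u, ] + gamma * (e * V[i, ] - lambda * U[u, ])
    V[i, ] <- V[i, ] + gamma * (e * U[u, ] - lambda * V[i, ])
  }
}
pred <- U %*% t(V)  # dense matrix of predicted ratings
```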
    </sec>
    <sec id="sec-4">
      <title>4. BENCHMARK RESULTS</title>
      <p>
In Table 1 we report results from benchmarking the
rrecsys implementation against the popular Lenskit [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] Java library
and the recommenderlab R package. The reported results
demonstrate that rrecsys closely reproduces the results of
Lenskit, the most well-known Java-based
recommendation library, in contrast to recommenderlab. Evaluation
was performed using 5-fold cross-validation on the MovieLens100K
dataset. Lenskit and rrecsys were configured identically. In
the case of recommenderlab we selected parameters such
that its configuration was as close as possible to ours and to
Lenskit's evaluation methodology. Reported error metrics
were computed as a global average over all ratings in
the test set. The SVD algorithm implementation in
recommenderlab is based on an approximation estimated by the
EM algorithm, which results in poor prediction accuracy but
allows the developer to vectorize the code, yielding good
computational performance. For the item-based k-nearest
neighbor algorithm, rrecsys replicates the recommenderlab
implementation. Yet the results of recommenderlab differ quite
clearly, proving that disparities in the implementation of the
evaluation methodology significantly influence the reported
results. We deployed a second set of experiments using
MovieLens Latest3[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] dataset, cropped to a smaller subset
containing 620 users, 851 items and 58,801 ratings, where each
user has rated at least 20 items and each item was rated at
least 25 times. In Figure 1 we report the results of evaluation
on this dataset with 5 folds. We ran these examples on a
2012 laptop computer with an Intel i5 at 2.60GHz and 8 GB
of RAM. We report execution times for a single prediction
task and for the full evaluation steps in Tables 2 and 3.
3Authors express their gratitude to GroupLens for allowing
redistribution of the MovieLens Latest data.
      </p>
      <p>[Figure 1: evaluation of the baseline algorithm, IBKNN with 10 to 200 neighbors, BPR (20 and 40 features, 20 iterations) and wALS (20 and 40 features, 20 iterations) on this dataset.]</p>
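      <p>The choice between global and user-based error averaging mentioned earlier is exactly the kind of methodological detail that produces such disparities. A minimal illustration in plain R (toy residuals, hypothetical data layout):</p>

```r
# One row per test-set rating: the user who gave it and the prediction error.
test <- data.frame(user  = c("a", "a", "a", "b"),
                   error = c(1, 1, 1, 3))
globalRMSE <- sqrt(mean(test$error^2))  # every rating weighs equally
userRMSE <- mean(tapply(test$error, test$user,
                        function(e) sqrt(mean(e^2))))  # every user weighs equally
globalRMSE  # sqrt(3), about 1.73: user b's single rating is 1 of 4
userRMSE    # 2: users a and b each contribute half
```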
    </sec>
    <sec id="sec-5">
      <title>5. CONCLUSIONS</title>
      <p>This poster contributed a recently released package for
prototyping and interactively demonstrating
recommendation algorithms in R. It comes with a range of
implementations of standard CF algorithms. Reported results
demonstrate that it reproduces the results of the Java-based Lenskit
toolkit. We hope that this effort will be of
use to the field of recommender systems and the large R
user community.</p>
    </sec>
    <sec id="sec-6">
      <title>6. REFERENCES</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Ekstrand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ludwig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Konstan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. T.</given-names>
            <surname>Riedl</surname>
          </string-name>
          .
          <article-title>Rethinking the recommender research ecosystem: Reproducibility, openness, and lenskit</article-title>
          .
          <source>RecSys '11</source>
          , pages
          <fpage>133</fpage>
          -
          <lpage>140</lpage>
          , New York, NY, USA,
          <year>2011</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Harper</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Konstan</surname>
          </string-name>
          .
          <article-title>The movielens datasets: History and context</article-title>
          .
          <source>ACM Trans. Interact. Intell. Syst.</source>
          ,
          <volume>5</volume>
          (
          <issue>4</issue>
          ):
          <fpage>19:1</fpage>
          -
          <lpage>19:19</lpage>
          , Dec.
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Said</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Bellogín</surname>
          </string-name>
          .
          <article-title>Comparative Recommender System Evaluation: Benchmarking Recommendation Frameworks</article-title>
          .
          <source>RecSys</source>
          , pages
          <fpage>129</fpage>
          -
          <lpage>136</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>