<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>ECML-PKDD 2011 Discovery Challenge Overview</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nino Antulov-Fantulin</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Matko Bošnjak</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin Žnidaršič</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Miha Grčar</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mikolaj Morzy</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tomislav Šmuc</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Jožef Stefan Institute</institution>
          ,
          <addr-line>Ljubljana</addr-line>
          ,
          <country country="SI">Slovenia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Poznań University of Technology</institution>
          ,
          <addr-line>Poznań</addr-line>
          ,
          <country country="PL">Poland</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Rudjer Bošković Institute</institution>
          ,
          <addr-line>Zagreb</addr-line>
          ,
          <country country="HR">Croatia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2011</year>
      </pub-date>
      <fpage>7</fpage>
      <lpage>20</lpage>
      <abstract>
        <p>This year's Discovery Challenge was dedicated to solving video lecture recommendation problems based on data collected at the VideoLectures.Net site. The challenge had two tasks: task 1, which simulated the new-user/new-item recommendation problem, and task 2, which simulated clickstream-based recommendation. In this overview we present the challenge datasets, tasks and evaluation measure, and we analyze the submitted solutions and results.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>the leaderboard and the test set), together with task and evaluation descriptions, is
publicly available for non-commercial research purposes [28].</p>
      <p>We secured prize sponsorship (5500 €) from the European Commission through
the e-LICO EU project (2009-2012), whose primary goal is to build a virtual laboratory
for interdisciplinary collaborative research in data mining and data-intensive sciences.</p>
      <p>The prizes for each of the tracks are:
– 1500 € for the first place
– 700 € for the second place
– 300 € for the third place</p>
    </sec>
    <sec id="sec-2">
      <title>The prizes for the Workflow contest are:</title>
      <p>– 500 € for the best workflow
– Free admission to the RapidMiner Community Meeting and Conference 2012 for the
best RapidMiner workflow (sponsor: Rapid-I)</p>
      <sec id="sec-2-1">
        <title>The challenge has been hosted on TunedIt7.</title>
        <sec id="sec-2-1-1">
          <title>Background</title>
          <p>
            Recommender systems have become an important research area ever since information
overload became a problem for the typical internet user. Personalized
recommender systems take user profiles into account when generating a prediction for a
particular user and item. The prediction techniques for recommender systems
[
            <xref ref-type="bibr" rid="ref1 ref2 ref3">1–3</xref>
            ] can be divided into three main categories: content-based, collaborative,
and hybrid prediction techniques.
          </p>
          <p>
            Content-based techniques [
            <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
            ] are based on interactions between a particular user
and all the items in the system. Content-based recommender systems use information
about items and the user’s past activities on items in order to recommend similar
items.
          </p>
          <p>
            Collaborative filtering techniques [
            <xref ref-type="bibr" rid="ref6 ref7 ref8">6–8</xref>
            ] analyze interactions between all users and
all items through users’ ratings, clicks, comments, tags, etc. Collaborative filtering
recommender systems do not use any specific knowledge about the items except their
unique identifiers. These prediction techniques are domain-independent and can
provide serendipitous recommendations for users. However, collaborative filtering needs
a sufficient amount of collaborative data before it can produce recommendations for a
new user or a new item (the cold-start problem) [
            <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
            ].
          </p>
          <p>
            Hybrid prediction techniques [
            <xref ref-type="bibr" rid="ref11 ref12 ref13">11–13</xref>
            ] merge collaborative-based and content-based
techniques and are more resistant to cold start problems. This challenge was designed
to tackle the problems of cold start and hybridization of content and collaborative data
in realistic setting of the VL.Net website. In comparison to recommender challenges
of recent years (Netflix challenge, KDDCup challenge 2008, KDDCup challenge 2011)
this challenge relies on indirect collaborative data, and is more focussed on utilization
of content and descriptions of items.
7 http://tunedit.org
3
          </p>
        </sec>
        <sec id="sec-2-1-2">
          <title>Description of the challenge dataset</title>
          <p>The data snapshot which is the basis for the VideoLectures.Net dataset was taken in
August 2010. At that time, the database contained 8 105 video lectures, 5 286 of which
were manually categorized into a taxonomy of roughly 350 scientific topics such as Arts,
Computer Science, and Mathematics.</p>
          <p>VideoLectures.Net dataset includes:
1. Data about lectures: every lecture has a title, type (e.g. lecture, keynote,
tutorial, press conference), a language identifier (e.g. en, sl, fr), number of
views, publication date, event identifier, and a set of authors. Many lectures come
with a short textual description and/or with slide titles from the respective
presentations. Specifically, 5 724 lectures are enriched with this additional unstructured
data. The training part of the data also contains lecture-pair co-viewing frequencies
(CVS, common view score) and pooled-sequence collaborative data, which are
not available for the set of test lectures. The test set contains lectures with
publication date after July 01, 2009, which are used for task 1 scoring. Neither
CVS nor pooled viewing sequences containing these lectures are available in the
training data.
2. Data about authors: each author has a name, e-mail address, homepage
address, gender, affiliation, and the respective list of lectures. The dataset contains
8 092 authors. The data about the authors is represented by authors’ names,
VL.Net url, e-mail, homepage, gender, affiliation, and pairwise relations to the
lectures delivered by the author at VL.Net.
3. Data about events: a set of lectures can be associated with an event (e.g. a
specific conference). In a similar fashion, events can be further grouped into
metaevents. An event is described in a similar way as a lecture: it has a title, type (e.g.
project, event, course), language identifier, publication date, and a meta-event
identifier. The VideoLectures.Net dataset contains data about 519 events and
meta-events (245 events are manually categorized, 437 events are enriched with
textual descriptions).
4. Data about the categories: The data about the categories is represented in
the shape of the scientific taxonomy used on VL.Net. The taxonomy is described
in a pairwise form, using parent and child relations.
5. View statistics: The VideoLectures.Net software observes the users accessing
the content. Each browser, identified by a cookie, is associated with the sequence
of lectures that were viewed in the identified browser. Temporal information,
view durations, and/or user demographics are not available. The dataset contains
anonymized data of 329 481 distinct cookie-identified browsers. The data about
view statistics is given in the form of frequencies: (i) for a pair of lectures viewed
together (not necessarily consecutively) with at least two distinct cookie-identified
browsers; (ii) for pooled viewing sequences - triplets of lectures viewed together
prior to a given sequence of ten lectures. This is a special construct based on
aggregation of click-streams, which is used for training and scoring in task 2.
</p>
          <p>Creating pooled viewing sequences</p>
          <p>In order to comply with privacy-preserving constraints, lecture viewing sequences
for task 2 have been transformed into what we call pooled sequences. A pooled
viewing sequence is given by the set of three lectures on the left side (triplet) and a
ranked list of at most ten lectures on the right side. The set of three lectures does
not imply an ordering; it is merely a set that comes upstream of the lectures given on
the right of a pooled viewing sequence. The ranked list on the right side of a pooled
viewing sequence is constructed from all the clickstreams containing the particular triplet
on the left side. The transformation process for constructing pooled viewing
sequences is given below.</p>
          <p>Consider a sequence of viewed lectures:</p>
          <p>id1 → id7 → id2 → id1 → id4 → id5 → id6 → id3</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>We first filter out duplicates (here - id1):</title>
        <p>id1 → id7 → id2 → id4 → id5 → id6 → id3
Then, we determine all possible unordered triplets in the sequence. For each triplet,
cut the sequence after the right-most lecture from the triplet.</p>
        <p>In the above example, if {id1, id4, id5} is the triplet, the sequence is cut right after
id5. Finally, increase triplet-specific counts for all the lectures after the cut. In the
above example, given the triplet {id1, id4, id5}, the triplet-specific counts for id6 and
id3 are increased:</p>
        <p>{id1, id4, id5} → id6 : 1, id3 : 1
Suppose there is another click-stream sequence that, amongst others, contains the
unordered triplet {id1, id4, id5}, and that id6, id3, and id7 are lectures appearing after the
cut. Then the counts for {id1, id4, id5} are increased as follows:</p>
        <p>{id1, id4, id5} → id6 : 2, id3 : 2, id7 : 1</p>
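        <p>The transformation just described can be sketched in Python; this is an illustrative reconstruction of the steps, not the organizers' actual preprocessing code:</p>
        <preformat>
```python
from collections import defaultdict
from itertools import combinations

def pool_sequences(clickstreams, max_list_len=10):
    """Build pooled viewing sequences from raw clickstreams: map each
    unordered triplet of lectures to a ranked list of the lectures
    viewed after the triplet's cut point."""
    counts = defaultdict(lambda: defaultdict(int))
    for stream in clickstreams:
        # Step 1: filter out duplicate lecture ids, keeping first occurrences.
        seen, seq = set(), []
        for lec in stream:
            if lec not in seen:
                seen.add(lec)
                seq.append(lec)
        # Step 2: for every unordered triplet, cut the sequence right after
        # the right-most member of the triplet ...
        for triplet in combinations(seq, 3):
            cut = seq.index(triplet[-1])
            # ... and increase triplet-specific counts for lectures after the cut.
            for lec in seq[cut + 1:]:
                counts[frozenset(triplet)][lec] += 1
    # Step 3: keep a ranked list of at most ten lectures per triplet.
    return {triplet: sorted(c, key=c.get, reverse=True)[:max_list_len]
            for triplet, c in counts.items()}
```
        </preformat>
        <p>Applied to the two example clickstreams above, the function yields the ranked list id6, id3, id7 for the triplet {id1, id4, id5}.</p>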
        <p>Creating lecture co-viewing frequencies</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Consider two sequences of viewed lectures:</title>
      <p>id1 → id7 → id2 → id1,
id2 → id3 → id7.</p>
      <p>We first filter out duplicates in the sequences:
id1 → id7 → id2,
id2 → id3 → id7.</p>
      <p>Then, we determine the lecture co-viewing frequencies (CVS):
CVS (id1, id2) = 1, CVS (id1, id7) = 1,
CVS (id2, id7) = 2, CVS (id2, id3) = 1,
CVS (id3, id7) = 1.</p>
    </sec>
    <sec id="sec-4">
      <title>Table 1: Train-test data statistics</title>
      <p>Moment t1: 01.07.2009. Moment t2: 05.08.2010.
Total number of lectures in the train set: 6983
Total number of lectures in the test set: 1122
Number of common-view pairs in the train set: 363 880
Number of common-view pairs in the test set: 18 450</p>
      <p>A train-test split logic</p>
      <p>
Basic statistics of lectures in the training and test sets are given in Table 1. The common
view score matrix CVS is the lecture co-viewing frequency matrix collected at the site
at some moment t2; it represents the adjacency matrix of the lecture-lecture
graph G at moment t2, an undirected weighted graph over all lectures. Each
lecture in this graph carries temporal information: its date of publication on the
VideoLectures.Net site. Using a publication-date threshold t1, we partition G
into two disjoint graphs G1 and G2: each lecture in G1 was published before
the threshold, while each lecture in G2 was published after the threshold
t1. We define the pair common viewing time as the period that two lectures spend together
in the system. All lecture pairs (xi, xj) : xi ∈ G1, xj ∈ G1 have pair common time
strictly greater than (t2 − t1), and all lecture pairs (xi, xj) : xi ∈ G1, xj ∈ G2
have pair common time strictly less than (t2 − t1).</p>
      <p>In order to make a proper training-test split based on G1 and G2, we
had to ensure a similar distribution of pair common times in both the training and the test
set. We divided the nodes of subgraph G2, in a randomized fashion (with some
constraints), into two approximately equal sets (G21, G22), and appended G21
to the training set. The subset of lecture pairs (xi, xj) : xi ∈ G1, xj ∈ G21 in
the training set then has a distribution of pair common times that overlaps with that of the pairs
(xi, xj) : xi ∈ G1, xj ∈ G22 in the test set. Figure 1 gives the distribution of edges
related to the graphs G1 and G22.</p>
      <p>Finally, the train-test split logic was implemented through a series of steps:
1. Split the lectures by publication date into two subsets: old (publication date &lt;
July 01, 2009) and new (publication date ≥ July 01, 2009). Put the old lectures
into the training set;
2. Move all new lectures with a parent id occurring in the old lecture subset to the
training set;
3. Split the rest of the new lectures randomly into two disjoint sets of similar
cardinality, taking care of their parent ids:
(a) lectures with the same parent id can be only in one of the sets;
(b) lectures without parent id are just randomly divided between two sets.
4. Finally, add one of the disjoint sets to the training set; the other disjoint set
represents the test set.</p>
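      <p>The four steps above can be sketched in Python. This is a simplified illustration: the record format ('id', 'pub_date', 'parent_id') and the exact randomization constraints are assumptions, not the organizers' implementation.</p>
      <preformat>
```python
import random

def split_lectures(lectures, threshold_date, seed=0):
    """Sketch of the four-step train-test split. Each lecture is assumed to
    be a dict with 'id', 'pub_date' (ISO string) and 'parent_id' (or None)."""
    rng = random.Random(seed)
    # Step 1: old lectures (published before the threshold) form the training set.
    new = [lec for lec in lectures if lec['pub_date'] >= threshold_date]
    train = [lec for lec in lectures if not lec['pub_date'] >= threshold_date]
    old_ids = {lec['id'] for lec in train}
    # Step 2: new lectures whose parent id occurs among the old lectures
    # are moved to the training set as well.
    rest = []
    for lec in new:
        (train if lec['parent_id'] in old_ids else rest).append(lec)
    # Step 3: group the remaining new lectures by parent id so that lectures
    # sharing a parent stay on the same side; orphans form singleton groups.
    groups = {}
    for lec in rest:
        key = lec['parent_id'] if lec['parent_id'] is not None else ('solo', lec['id'])
        groups.setdefault(key, []).append(lec)
    side_a, side_b = [], []
    keys = list(groups)
    rng.shuffle(keys)
    for key in keys:
        # Greedily keep the two sides of similar cardinality.
        min(side_a, side_b, key=len).extend(groups[key])
    # Step 4: one disjoint set joins the training set; the other is the test set.
    train.extend(side_a)
    return train, side_b
```
      </preformat>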
      <p>At the end of the process, we get the training set consisting of all the lectures
with publishing date prior to July 01, 2009, together with approximately half of the
lectures after the aforementioned date, and the test set consisting of the rest of the
lectures published after the aforementioned date.
Due to the nature of the problem, each of the tasks has its own merit: task 1
simulates new-user and new-item recommendation (cold start mode); task 2 simulates
clickstream-based (implicit preference) recommendation.
The first task of the challenge is related to solving the so-called cold-start problem,
commonly associated with pure collaborative filtering (CF) recommenders. Generally,
cold start recommending quality should be measured through user satisfaction surveys
and analysis. For the challenge, one needs a quantitative measure and a simulated cold
start situation. In order to be able to score solutions, new video lectures are those that
entered the site more recently, but for which there is already some viewing information
available.</p>
      <p>In this task, we assume that the user has seen one of the lectures that entered the
site earlier (old lectures). As a solution for this task, a ranked list of lectures from
the new-lecture set is to be recommended after viewing one of the old lectures. The
length of the recommended list is fixed at 30 lectures. The overall score for a
submission is based on the mean average R-precision score (MARp), explained in
Section 5.</p>
      <p>The solution for task 1 is based on ranking lectures by the withheld lecture
co-viewing frequencies, in descending order. Suppose the co-viewing frequencies (CVS)
from some old lecture id1 to the new lectures {id2, id3, id4, id5} are:</p>
      <p>CVS (id1, id2) = 12, CVS (id1, id3) = 2,
CVS (id1, id4) = 43, CVS (id1, id5) = 3.</p>
      <p>Then we construct the solution ranked list for old lecture id1:
id1 : id4, id2, id5, id3.</p>
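      <p>Constructing such a reference list amounts to sorting the candidate new lectures by CVS in descending order; a small illustrative helper (names assumed, not from the challenge code):</p>
      <preformat>
```python
def rank_new_lectures(old_id, cvs, new_ids, k=30):
    """Rank new lectures for an old lecture by co-viewing frequency (CVS),
    in descending order; missing pairs count as zero. Returns the top k."""
    ranked = sorted(new_ids, key=lambda nid: cvs.get((old_id, nid), 0),
                    reverse=True)
    return ranked[:k]
```
      </preformat>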
      <p>Pooled lecture viewing sequences task</p>
      <p>In task 2, contestants are asked to recommend a ranked list of ten lectures to be
viewed after a set of three lectures. In contrast to task 1, this situation is close to a
typical recommendation scenario (submission and evaluation for task 2). The solution
for task 2 is based on ranking lectures by their frequencies in the withheld pooled
lecture viewing sequences, in descending order. Test lectures from task 1 are in this
case not included in the training pooled sequences, but they can be part of the ranked
solution list for task 2.</p>
      <p>Suppose there is a pooled lecture viewing sequence:</p>
      <p>{id1, id4, id5} → id6 : 5, id3 : 4, id7 : 2, id2 : 1,
then we construct solution ranked list for triplet {id1, id4, id5}:</p>
      <p>{id1, id4, id5} → id6, id3, id7, id2.</p>
      <sec id="sec-4-1">
        <title>Challenge evaluation function</title>
        <p>Taking into account the relative scarcity of items available for learning, recommending
and evaluation (especially in the cold-start task), we defined R-precision variants
of the standard information retrieval evaluation measures p@k and MAP. The overall
score of a submission is the mean value over all queries R (recommended lists r) given
in the test sets:</p>
        <p>MARp = (1/|R|) Σ_{r∈R} AvgRp(r)</p>
        <p>The average R-precision score AvgRp(r) for a single recommended ranked list r is
defined as:</p>
        <p>AvgRp(r) = (1/|Z|) Σ_{z∈Z} Rp@z(r)</p>
        <p>where Rp@z(r) is the R-precision at cut-off length z ∈ Z, defined as the ratio of the
number of retrieved relevant items to the number of relevant items at the particular
cut-off z of the list:</p>
        <p>Rp@z(r) = |relevant ∩ retrieved|_z / |relevant|_z = |relevant ∩ retrieved|_z / min(m, z)</p>
        <p>The number of relevant items at cut-off length z is defined as min(m, z), where m is
the total number of relevant items. When m ≤ z, the number of relevant items at z is m;
otherwise it is limited to the top z relevant items from the (real) solution
ranked list s. A special situation occurs when there are several equally relevant items
at the same rank (ties) at the cut-off length of the s list. In that case, any of these
items is treated as relevant (a true positive) when calculating Rp@z(r). For task 1,
the cut-off lengths for the calculation of MARp are z ∈ {5, 10, 15, 20, 25, 30}; for
task 2, they are z ∈ {5, 10}.
We introduced R-precision because it is better suited to our situation: it adjusts to
the size of the set of relevant documents. Typically, in information retrieval one
has to filter and rank a large pool of both relevant and irrelevant items. This is
not the case in the simulated cold-start situation of this challenge. As an example,
if there were only 4 items (lectures) in the whole collection relevant to a particular
query, a perfect recommender system would score 1 when measured by Rp@10, whereas its
p@10 would be only 0.4. Using this measure makes more sense for our application,
as the number of relevant items can vary from 1 to above 30; in such situations
Rp@z expresses the quality of retrieval at a predefined retrieval (cut-off) length
more fairly than p@z. The reason we use AvgRp(r) over a set of different Rp@z values
is that the averaging also takes ranking into account and, at the same
time, improves the ability to differentiate between similar solutions (recommenders).</p>
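        <p>Under these definitions the score can be computed directly; a minimal sketch (ignoring the tie-handling rule above, which requires the full ranked solution list):</p>
        <preformat>
```python
def rp_at_z(recommended, relevant, z):
    """R-precision at cut-off z: relevant items retrieved in the top z,
    divided by min(m, z), where m is the total number of relevant items."""
    hits = len(set(recommended[:z]).intersection(relevant))
    return hits / min(len(relevant), z)

def avg_rp(recommended, relevant, cutoffs):
    """Average R-precision over the task's cut-off lengths Z."""
    return sum(rp_at_z(recommended, relevant, z) for z in cutoffs) / len(cutoffs)

def marp(queries, cutoffs=(5, 10, 15, 20, 25, 30)):
    """Mean average R-precision over all queries; each query is a pair
    (recommended ranked list, set of relevant items)."""
    return sum(avg_rp(rec, rel, cutoffs) for rec, rel in queries) / len(queries)
```
        </preformat>
        <p>For the 4-relevant-item example above, rp_at_z at cut-off 10 is 1.0 for a perfect recommender, whereas plain p@10 would be 0.4.</p>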
        <p>We also considered the MAP (mean average precision) measure, which is the
closest to the proposed measure. However, MAP does not take into account the absolute
ranking positions of recommended items, since permutations of relevant (true positive)
items in the recommended list do not affect the MAP score.</p>
        <p>
          Normalized discounted cumulative gain (NDCG) [16, 17] takes into account that
relevant documents are more useful when appearing earlier in a recommendation
list. It is the most common measure used for ranking the results of a search in
information retrieval. This measure has also been used in other challenges where the
main task was to learn ranking [
          <xref ref-type="bibr" rid="ref14">14, 15</xref>
          ].
        </p>
        <p>If the ranking order need not be strict for the top-n item recommendations [18], the
“granularity” of the ranking can be relaxed. This is the main reason why we use the
MARp measure instead of NDCG. The proposed MARp measure takes into account
absolute ranking positions with a granularity of five items. This granularity was chosen
after studying the influence of ranking recall on recommender system evaluation.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Challenge submissions results</title>
        <p>The ECML-PKDD 2011 Discovery Challenge started on 18 April and ended on 8 July
2011. The competition attracted a significant number of participants: 303 teams
with 346 members, with 62 and 22 active teams on the two tasks. More than 2000
submissions were sent, and the best approaches outperformed the baseline solution
several times over.</p>
        <p>Winners of the challenge for task 1 are:</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Winners of the challenge for task 2 are:</title>
      <p>The final scores, for the teams that scored better than the random recommender,
are presented in Figures 3 and 4, one for each task. The scores are
accompanied by graphs of the differences between the preliminary MARp score on the
leaderboard set and the final MARp score on the test set. For task 1 (Figure 3),
the majority of teams had positive difference scores, which may suggest
overtraining. In contrast, the majority of teams had negative difference scores
in task 2 (see Figure 4).</p>
      <p>The distributions of the average R-precision over queries for the winning entry
on each of the tasks are presented in Figure 5. The difference in distributions between
the tasks also reflects the difference in the approaches used: while for the first task the
main features for solving the problem are constructed from lecture content and metadata
similarity, for the second task only co-viewing information is utilized. We have
also noted that these distributions are qualitatively very similar among the three
top-ranked entries on each of the tasks, reflecting a general similarity in the approaches of
different teams.</p>
      <p>The dependence of the query average R-precision score on the size of the solution list for
task 1 is presented in Figure 6 (left graph). On average, the query score diminishes
only slightly as the solution list grows. In contrast, the dependence of the query
average R-precision score on the triplet frequency for task 2 (right graph in
Figure 6) shows that, on average, the quality of the result for a query is
proportional to the triplet frequency.
The teams approached task 1 using quite different learning techniques, with the
primary effort focused on feature engineering and optimization. Almost all of the
participants utilized all the lecture-content-related data (lecture taxonomy, event
tree, lecture types, descriptions, etc.), differing only slightly in their definitions of
the similarity of two lectures. Important for the overall score was
the process of filling in missing values for the lectures that lack some of the content-related
data. The winning solutions used a more sophisticated approach, filling in missing lecture
content and metadata feature values using lecture co-viewing information
(weighted CVS feature vector expansion [19], query expansion [20]), thus utilizing
collaborative information to “enrich” content-based features.</p>
      <p>Table 2 gives a summary of the feature engineering approaches and learning
methods used in solving the challenge tasks.</p>
      <sec id="sec-5-1">
        <title>Conclusion</title>
        <p>In the last couple of years, a number of challenges have been organized around
recommendation problems. Most of them focused on prediction problems related
to large-scale explicit or implicit user preference matrices, in some cases combined with
(mostly obfuscated) user, item and/or context related information. The ECML-PKDD
2011 Discovery Challenge differed from this mainstream in two aspects: (i)
instead of user preferences, only item-to-item preference information is available, in
the form of the co-viewing frequency graph; (ii) a rich and explicit description of the
lectures is available in the form of structured and unstructured text. On both tasks,
participants obtained significantly higher MARp values than those set by the baseline
solutions.</p>
        <p>The analysis of the results shows that the most important part of a successful
solution was careful feature engineering. Defining a similarity scoring function
capable of capturing content, context and temporal information turned out to
be crucial for success in the cold-start competition (task 1). Task 2, the pooled
sequence completion problem, was easier to solve, and both the approaches and the
results of the participants were much more similar to one another. Rather unexpectedly,
content-related information was not used in ranking the lectures to be viewed in
succession to the test set triplets. Most of the participants also reported on the
complexity and scaling of their solutions.</p>
        <p>In our opinion, the results of the challenge could be quite useful for constructing
a new recommendation system for VideoLectures.Net. In particular, there are
several approaches that could significantly improve the recommendation quality for new
lectures at the site, with a modest consumption of additional computational resources.
Using lecture co-viewing frequency information instead of the original preference
information in the form of click-streams should be studied in more detail, in order to
understand the implications of this transformation on personalized recommendation
quality from the user's perspective.</p>
        <p>Acknowledgements</p>
        <p>The Discovery Challenge 2011 has been supported by the EU collaborative project
e-LICO (GA 231519). The organizers of the Challenge are grateful to the Center
for Knowledge Transfer in Information Technologies of the Jožef Stefan Institute
and to Viidea Ltd for the data of the VideoLectures.Net site, and to TunedIT for the
professional support in conducting the competition. Finally, we want to thank all the
active participants of the challenge for their effort and their willingness to
share their solutions and experience through the contributions in this workshop.</p>
        <p>15. Internet Mathematics 2009 contest: Limited Liability Company,
http://imat2009.yandex.ru/academic/mathematic/2009/en/.
16. K. Jarvelin, J. Kekalainen: Cumulated gain-based evaluation of IR techniques. ACM
Transactions on Information Systems 20(4), pp 422-446, (2002).
17. B. Croft, D. Metzler, and T. Strohman: Search Engines: Information Retrieval in
Practice. Addison Wesley, (2009).
18. A. Turpin, W. Hersh: Why batch and user evaluations do not give the same results. In
Proceedings of the 24th Annual ACM SIGIR Conference on Research and Development
in Information Retrieval. ACM, New York, pp 17-24, (2001).
19. A. D’yakonov: Two Recommendation Algorithms Based on Deformed Linear
Combinations. In Proc. of ECML-PKDD 2011 Discovery Challenge Workshop, pp 21-27, (2011).
20. E. Spyromitros-Xioufis, E. Stachtiari, G. Tsoumakas, and I. Vlahavas: A Hybrid
Approach for Cold-start Recommendations of Videolectures. In Proc. of ECML-PKDD 2011
Discovery Challenge Workshop, pp 29-39, (2011).
21. M. Možina, A. Sadikov, and I. Bratko: Recommending VideoLectures with Linear
Regression. In Proc. of ECML-PKDD 2011 Discovery Challenge Workshop, pp 41-49, (2011).
22. J. A. Kreiner and E. Abraham: Recommender system based on purely probabilistic
model from pooled sequence statistics. In Proc. of ECML-PKDD 2011 Discovery
Challenge Workshop, pp 51-57, (2011).
23. V. Nikulin: OpenStudy: Recommendations of the Following Ten Lectures After
Viewing a Set of Three Given Lectures. In Proc. of ECML-PKDD 2011 Discovery Challenge
Workshop, pp 59-69, (2011).
24. H. Liu, S. Das, D. Lee, P. Mitra, C. Lee Giles: Using Co-views Information to Learn
Lecture Recommendations. In Proc. of ECML-PKDD 2011 Discovery Challenge Workshop,
pp 71-82, (2011).
25. M. Chevalier, T. Dkaki, D. Dudognon, J. Mothe: IRIT at VLNetChallenge. In Proc. of
ECML-PKDD 2011 Discovery Challenge Workshop, pp 83-93, (2011).
26. L. Iaquinta and G. Semeraro: Lightweight Approach to the Cold Start Problem in the
Video Lecture Recommendation. In Proc. of ECML-PKDD 2011 Discovery Challenge
Workshop, pp 95-101, (2011).
27. G. Capan, O. Yilmazel: Joint Features Regression for Cold-Start Recommendation on
VideoLectures.Net. In Proc. of ECML-PKDD 2011 Discovery Challenge Workshop, pp
103-109, (2011).
28. N. Antulov-Fantulin, M. Bošnjak, T. Šmuc, M. Jermol, M. Žnidaršič, M. Grčar, P. Keše,
N. Lavrač: ECML/PKDD 2011 - Discovery challenge: VideoLectures.Net Recommender
System Challenge, http://lis.irb.hr/challenge/.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>S.</given-names>
            <surname>Rendle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Tso-Sutter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Huijsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Freudenthaler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Gantner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wartena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Brussee</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Wibbels</surname>
          </string-name>
          :
          <article-title>Report on State of the Art Recommender Algorithms (Update)</article-title>
          .
          <source>MyMedia public deliverable D4.1</source>
          .2., (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>G.</given-names>
            <surname>Adomavicius</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Tuzhilin</surname>
          </string-name>
          <article-title>Towards the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions</article-title>
          .
          <source>IEEE Transactions on knowledge and data engineering</source>
          ,
          <volume>17</volume>
          (
          <issue>6</issue>
          ) (
          <year>2005</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>M.</given-names>
            <surname>Montaner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Lopez</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.L.</given-names>
            <surname>de la Rosa</surname>
          </string-name>
          :
          <article-title>A Taxonomy of Recommender Agents on the Internet</article-title>
          .
          <source>Artificial Intelligence Review</source>
          ,
          <volume>19</volume>
          , (
          <year>2003</year>
          ),
          <fpage>285</fpage>
          -
          <lpage>330</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>G.</given-names>
            <surname>Salton</surname>
          </string-name>
          :
          <article-title>Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer</article-title>
          .
          <source>Addison Wesley</source>
          , (
          <year>1989</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>R.</given-names>
            <surname>Baeza-Yates</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ribeiro-Neto</surname>
          </string-name>
          :
          <article-title>Modern Information Retrieval</article-title>
          .
          <source>Addison Wesley</source>
          , (
          <year>1999</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>W.</given-names>
            <surname>Hill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Stead</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rosenstein</surname>
          </string-name>
          and
          <string-name>
            <given-names>G.</given-names>
            <surname>Furnas</surname>
          </string-name>
          :
          <article-title>Recommending and Evaluating Choices in a Virtual Community of Use</article-title>
          .
          <source>Proc. Conf. Human Factors in Computing Systems</source>
          , (
          <year>1995</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>P.</given-names>
            <surname>Resnick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Iacovou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Suchak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bergstrom</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Riedl</surname>
          </string-name>
          :
          <article-title>GroupLens: An Open Architecture for Collaborative Filtering of Netnews</article-title>
          .
          <source>Proc. Computer Supported Cooperative Work Conf.</source>
          , (
          <year>1994</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>U.</given-names>
            <surname>Shardanand</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Maes</surname>
          </string-name>
          :
          <article-title>Social Information Filtering: Algorithms for Automating 'Word of Mouth'</article-title>
          .
          <source>Proc. Conf. Human Factors in Computing Systems</source>
          , (
          <year>1995</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>C.</given-names>
            <surname>Boutilier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.S.</given-names>
            <surname>Zemel</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Marlin</surname>
          </string-name>
          :
          <article-title>Active Collaborative Filtering</article-title>
          .
          <source>In Proc. of the Nineteenth Annual Conference on Uncertainty in Artificial Intelligence</source>
          , (
          <year>2003</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>A.</given-names>
            <surname>Schein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Popescul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Ungar</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Pennock</surname>
          </string-name>
          :
          <article-title>Generative models for cold-start recommendations</article-title>
          .
          <source>In Proceedings of the 2001 SIGIR Workshop on Recommender Systems</source>
          , (
          <year>2001</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>M.</given-names>
            <surname>Balabanovic</surname>
          </string-name>
          and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shoham</surname>
          </string-name>
          :
          <article-title>Fab: Content-based, collaborative recommendation</article-title>
          .
          <source>Communications of the ACM</source>
          ,
          <volume>40</volume>
          (
          <issue>3</issue>
          ), (
          <year>1997</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>J.</given-names>
            <surname>Basilico</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Hofmann</surname>
          </string-name>
          :
          <article-title>Unifying collaborative and content-based filtering</article-title>
          .
          <source>In Proceedings of the Twenty-First International Conference on Machine Learning</source>
          , pages
          <fpage>65</fpage>
          -
          <lpage>72</lpage>
          , New York, NY, USA, ACM Press, (
          <year>2004</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>R.</given-names>
            <surname>Burke</surname>
          </string-name>
          :
          <article-title>Hybrid recommender systems: Survey and experiments</article-title>
          .
          <source>User Modeling and User-Adapted Interaction</source>
          ,
          <volume>12</volume>
          (
          <issue>4</issue>
          ), pp
          <fpage>331</fpage>
          -
          <lpage>370</lpage>
          , (
          <year>2002</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>O.</given-names>
            <surname>Chapelle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chang</surname>
          </string-name>
          :
          <article-title>Yahoo! Learning to Rank Challenge Overview</article-title>
          .
          <source>JMLR: Workshop and Conference Proceedings</source>
          <volume>14</volume>
          , pp
          <fpage>1</fpage>
          -
          <lpage>24</lpage>
          , (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>