=Paper=
{{Paper
|id=None
|storemode=property
|title=From Recordings to Recommendations: Suggesting Live Events in the DVR Context
|pdfUrl=https://ceur-ws.org/Vol-676/paper6.pdf
|volume=Vol-676
|dblpUrl=https://dblp.org/rec/conf/recsys/BassoMPR10
}}
==From Recordings to Recommendations: Suggesting Live Events in the DVR Context==
From Recordings to Recommendations: Suggesting Live
Events in the DVR Context
Alessandro Basso, Marco Milanesio, André Panisson, Giancarlo Ruffo
Dipartimento di Informatica
Università degli Studi di Torino
Torino, Italy
{basso,milane,panisson,ruffo}@di.unito.it
ABSTRACT known recommendation algorithms, that exploit user pro-
Providing valuable recommendations in the DVR domain is files for increasing accuracy and take into account privacy
quite straightforward when enough information about users issues as well.
and/or contents is known. In this work, we discuss the pos- Second, differently from the Video on Demand domain, the
sibility of recommending future live events without knowing usage of an Electronic Program Guide (EPG) is not always
anything else but past user programmed recording sched- assured. This fact brings two consequences: (a) there is
ules. no knowledge on the content the user is recording and/or
watching, and (b) there is no well defined one-to-one corre-
Categories and Subject Descriptors spondence between a recording and a broadcast event. This
H.3.3 [Information Storage and Retrieval]: Information leads to the impossibility of directly recommending record-
Search and Retrieval—Information filtering; J.4 [Computer ings to users.
Applications]: Social And Behavioral Sciences
Taking into account these considerations brings us our re-
search question: in such a domain, is it possible to give
General Terms valuable live event recommendations to users, only consid-
Algorithms, Experimentation, Human Factors. ering their recording activity on the DVR? Users have to be
brought to contents of interest, but, differently from other
Keywords approaches, we are not using anything but collaborative fil-
Digital Video Recorders, TV Broadcasts, Recommendation tering technique on users’ activity. Thus, the main contri-
Systems, Collaborative Algorithms, Implicit Data bution of our approach is the demonstration that this can be
achieved without any knowledge on what is being broadcast,
1. INTRODUCTION neither EPGs nor content classifications.
A Digital Video Recorder (DVR) is a device aimed at record-
ing digital streams to a storage. DVRs can be either hard-
ware devices, such as set-top-boxes and portable media play-
2. RELATED WORK
The task of recommending live events, such as TV shows,
ers, or software devices, such as web/PC-based Personal
has been already investigated in the past years. Proposed
Video Recorders (PVRs), managing all user interactions and
methods can exploit different ways to collect the required
personalizations. By using DVRs, users are no longer bound
information for user profiling, as well as can make use of
to the broadcaster’s schedule, but are free to define their
various recommendation algorithms. In particular, some ap-
personal lists of programs at any time.
proaches, such as [6], explicitly ask the users about their
interests and build suggestions on top of the resulting user
In order to provide a better user experience by means of fo-
profiles. A different idea, which is adopted in several works
cused advices (e.g., recommendation of new contents), the
[2, 9, 10], makes use of implicit feedbacks, i.e., information
arisen issues can be summarized in two main categories.
derived from the analysis of the user behavior while using
First, some logging activity must be done to find common
the DVR. Other solutions, as [4, 12, 15], propose recom-
usage patterns on which identify potential users’ interests.
mender systems which make use of user’s view history as
Users are not willing to offer an explicit profile when using
well as both explicit and implicit feedbacks. According to
a DVR, thus we do have, possibly, only a set of observations
authors, such a mixed technique allows to obtain the best
on their activity. This is an important challenge for many
performance.
Another feature to tell apart existing methods for live events
recommendation is the recommender algorithm used. A
common approach relies on the content of the programs
broadcast and it is therefore called content-based. Exam-
ples in this category can be found in [10, 12]. Some authors
devised recommenders that make use of multiple content-
Copyright is held by the author/owner(s). Workshop on the Practical Use of
Recommender Systems, Algorithms and Technologies (PRSAT 2010), held based techniques, as in [3, 4].
in conjunction with RecSys 2010. September 30, 2010, Barcelona, Spain. A solution able to increase novelty of recommendations is
collaborative filtering, like the works in [2, 5]. Another in- similar on timings. Specific values are used to define the
teresting method is proposed in [9] and exploits the latent maximum clustering distance for the start and the end times.
factor model. The output of this activity is a set of clusters, each identi-
fying a single event. The centroid of the cluster, i.e., the
In this work, we focus on implicit feedbacks only and we use recording that minimizes the intra-cluster timing distances,
a collaborative filtering approach to compute recommenda- is considered the representative of the event.
tions. Our aim is to minimize the information required as
input of the recommender system, without sacrificing the Aggregation. As the clustering occurs periodically, this
novelty. The real challenge is to be able to recommend pro- operation aims to identify newly created events character-
grams to users without actually knowing anything about ized by the same channel and periodicity of the formerly
what is broadcast on TV, since no EPG is used (differently created ones, but comparable timings. Such elements refer
from existing methods). to the same programs and are therefore merged into unique
events, whose properties are updated by taking into account
3. DATASET the values of all the similar ones.
Faucet is a PVR integrated in a podcasting service1 , which
allows the recording and further downloading of Italian TV Collapsing. A further refinement phase is required to grant
and Radio broadcasts [1]. The activity of the users is in- the consistency of the generated events. In fact, due to the
crementally collected (hourly) into a log file containing the high variability of timings, especially when a new transmis-
scheduled recordings set in the past hour as well as the oc- sion appears, events which are initially considered as non
curred downloads. The resulting dataset is populated by real referring to the same transmission tend to slowly and inde-
users expressing their preferences through the recorded pro- pendently converge to more stable timeframes. This implies
grams. The dataset is publicly available at http://secnet. the need of merging them into single events.
di.unito.it/vcast.
Each registered user can fix the desired settings for the As a result of the processing phase, given a set of recording
recording of interest. At the end of the process, her record- clustered together, each one with the same channel ci and
ing is scheduled for the given time and will be further avail- the same periodicity pi , we compute a discrete event ei in
able for downloading purposes. Each recording ri , thus, is the form of: < {ui }, t, ci , b, e, pi >, where {ui } is the set of
defined as a tuple < ui , ci , ti , bi , ei , pi > with the following users whose recordings were clustered together; t is the user
notation: user ui sets up a recording on channel ci , starting generated title most frequent among users in {ui }; b and e
from time bi and ending at time ei , with a title ti and a pe- are, respectively, the starting and ending time computed as
riodicity pi (e.g., once, every Tuesday, mon-fri). In Faucet, the median value of all the clustered recordings.
channels and periodicity values are fixed (users can choose
their ci and pi from a combobox), while all other fields are
completely up to the user. 4.2 From Events to Recommendations
After the end time expires, the recording is made available When future events are computed from scheduled record-
to the user for downloading. In case of periodic events, the ings, we are thus able to propose them to users by means of
recording step can occur an undefined number of times. Af- two different charts: (1) a global chart returning those events
ter each recording step, the respective download is made computed starting from the largest groups of recordings, i.e.,
available. those chosen by the largest sets of users; and (2) a user-based
recommendation list, returning a set of new events of pos-
4. METHODOLOGY sible interest to each user requesting it, computed through
a similarity function over the whole population. We call
In this section, we want to outline what our approach is.
them Most Popular and Rec2 (Recordings times Recommen-
Given no knowledge on the broadcasts, we collect the users
dations), respectively. Both charts are computed by means
activity to compute what we call discrete events, to be used
of the memory based collaborative filtering approach named
for recommendation purposes and top chart list building.
k -Nearest Neighbors (kNN) [14]. We apply both variants
of the kNN algorithm: the user-based one [8], by identify-
4.1 From Recordings to Events ing users interested in similar contents; and the item-based
The extraction of meaningful information from the unstruc- approach [7], by focusing on items shared by two or more
tured amount of data contained in the dataset is essential users.
to define a set of events which map the broadcast programs.
Through the event discovery phase, we can discretize the In kNN, the weight (i.e., a measure of interest) of an element
continuous domain of timings defined by the recordings, cre- ei for an user uk can be defined as:
ating the basis for the application of a recommender algo-
X
rithm. The basic procedure used in the discretization was w(uk , ei ) = r(ua , ei ) · c(uk , ua ), (1)
first introduced in [1] and covers a number of subsequent ua ∈N (uk )
steps:
where N (uk ) are the neighbors of user uk and r(ua , ei ) is
Clustering. Recordings are clustered together by consider- equal to 1 if user ua is associated to the event ei , and 0 oth-
ing the channel, the periodicity and the difference between erwise. The coefficient c(uk , ua ) represents the neighbor’s
starting and ending times. All recordings belonging to the information weight for user uk . In most of the kNN-based
same cluster are thus equal as channel and periodicity, whilst algorithms [8], the coefficient used is the similarity between
1 uk and ua .
http://www.vcast.it/
Most Popular. The MostPopular algorithm can be defined
by means of eq. (1), assuming the number of neighbors un- 100
bounded, which implies N (uk ) = U, ∀uk ∈ U ; and c(ua , ub ) =
Probability that an element have x recordings
1, ∀ua , ub ∈ U , with U as the setP of all users. Thus, the 10-1
weight is modified as w(uk , ei ) = ua r(ua , ei ).
10-2
After calculating the weight of all elements, they are sorted
in descendant order. In the MostPopular algorithm, as the
10-3
set of neighbors is independent of the user, all users re-
ceive the same recommended elements, i.e., the most popular
ones. 10-4
Rec2 . In order to provide personal suggestions, we have to 10-5
define a similarity function for grouping similar users (items)
from which choosing the appropriate elements to recom- 10-6 0
10 101 102 103 104
mend. Our definition of similarity is based only on implicit number of recordings
feedbacks, resulting from observing the behavior of users: if
she records something, then we assume that she is interested Figure 1: Number of recordings per event
in it; otherwise, we can not infer anything about the interest (Probability density function)
of the user for that element. We are therefore considering
binary feedbacks.
We are giving particular emphasis on the recall measure; in
Given two users u and v and the associated discrete events
fact, since we do not have explicit feedbacks regarding the
Eu and Ev , we can choose the similarity metric, S(u, v),
user’s interest in those items which have not been consid-
considering several well known measures (e.g., Dice, Cosine
ered (i.e., not programmed, nor downloaded), precision is
and Matching) [11]. After choosing a metric, ∀u we can
not very meaningful [9].
compute the subset Nu ⊆ U of neighbors of user u. A user
v such that Ev ∩ Eu 6= ∅ is thus defined as a neighbor of u.
First, we choose different similarity functions to understand
Starting from the neighborhood of u, the similarity with u is
whether similarity influences the results of the user-based
computed for each pair < u, v > such that v ∈ Nu . Finally,
kNN algorithm. From Figure 2(a) it is clear that, in this
if S(u, v) > 0, we consider u similar to v. The value S(u, v)
case, all chosen similarity metrics show nearly the same per-
is used to weight such a relation, therefore determining a
formance.
similarity order among the neighborhood of u, from which
choosing new events to recommend to u.
The second step is to find the optimal value for k. Figure
2(b) shows the results with k ∈ {100, 300, 500, 700, 2000}
Similarly, this approach can be adopted for the item-based
in user-based kNN (Dice similarity), and the MostPopu-
similarity: two events are considered similar if the share at
lar recommender. We omit the values of k = {500, 700}
least a single user that is associated to both of them.
since the results are almost equal to k = 300. Compared
to the MostPopular algorithm (i.e., unbounded neighbors),
5. EVALUATION a value k = 100 is not enough to outperform it, whilst for
In the following, we evaluate the obtained results in the k = 2000, kNN starts to converge to it. Considering the top
event extraction process and in the recommendation of new 10 recommended elements, we can achieve the best results
events to users, both in Most Popular and in Rec2 . for k = 300, whilst k = 500 is more suitable when taking
further elements. As in most cases 10 elements are sufficient
5.1 Event Extraction for a recommendation, k = 300 offers a good trade-off be-
As a remainder, we are dealing with several independent, tween valuable recommendations and resource consumption
user generated recording schedules, that we cluster together for building the neighborhood.
and from which we compute the discrete events. In Figure 1,
a view of the distribution of the recordings is given: for each A comparison among user/item-based kNN and MostPop-
detected event, the number of recordings clustered together ular is depicted in Figure 2(c). We can observe that the
changes according to users’ activity. As it turns out, most latter is clearly outperformed by the other two algorithms,
recordings (and, thus, most users) tend to be clustered and especially when more than 7 recommended items are con-
aggregated on very few events, while there are lots of events sidered. The user-based algorithm performs slightly better
with very few recordings. The Most Popular algorithm ex- than the item-based one (more noticeable with more than
ploit these inner features of the resulting discrete events to 15 recommended items). In general, item-based algorithms
compute the top chart. tend to perform better because usually the number of items
is considerably lower than the users [14], but this property
does not hold in our domain.
5.2 Computing Recommendation
We measure how accurate is the recommendation in predict-
ing the elements that users would program in terms of re- 6. CONCLUSION
call. These values are computed as the average of all users’ In this paper we show how to recommend live events to
recall values using the top n recommended elements [13]. users without any knowledge about the broadcast content
35% 40% 40%
30% 35% 35%
30% 30%
25%
25% 25%
20%
Recall
Recall
Recall
20% 20%
15%
15% 15%
10%
10% 10%
Most Popular
5% Dice similarity 5% k=2000 5% User-based KNN
Cosine similarity k=300 Item-based KNN
Matching similarity k=100 Most Popular
0% 0%
0 5 10 15 20 25 30 0 5 10 15 20 25 30
0%
0 5 10 15 20 25 30
top selection top selection top selection
(a) Comparison between similarity func- (b) Recall for user-based kNN (c) Recall for kNN (k = 300) wrt Most-
tions in user-based kNN Popular
Figure 2: Comparison between recommenders and recall for kNN and Most Popular.
and user’s likes. Recommendations can be given both glob- [7] M. Deshpande and G. Karypis. Item-based top-n
ally and personally. It is important to underline that the recommendation algorithms. ACM Trans. Inf. Syst.,
most popular events are easier to predict since users tend to 22(1):143–177, 2004.
naturally focus on them, even without any specific sugges- [8] J. L. Herlocker, J. A. Konstan, A. Borchers, and
tion. On the contrary, granting a high novelty in personal J. Riedl. An algorithmic framework for performing
recommendations is a more challenging goal due to the re- collaborative filtering. In SIGIR ’99: Proceedings of
duced amount of explicit information. Nevertheless, we can the 22nd annual international ACM SIGIR conference
obtain interesting results even exploiting a simple approach on Research and development in information retrieval,
as the kNN. We are currently attempting other approaches pages 230–237, New York, NY, USA, 1999. ACM.
to recommendation (e.g., latent factor model) with implicit [9] Y. Hu, Y. Koren, and C. Volinsky. Collaborative
feedbacks, with the aim of improving the prediction accu- filtering for implicit feedback datasets. In ICDM ’08:
racy. Proceedings of the 2008 Eighth IEEE International
Conference on Data Mining, pages 263–272,
Washington, DC, USA, 2008. IEEE Computer Society.
7. REFERENCES [10] S. G. Kaushal, S. Gutta, K. Kurapati, K. Lee,
[1] A. Basso, M. Milanesio, and G. Ruffo. Events J. Martino, J. Milanski, J. D. Schaffer, and
discovery for personal video recorders. In EuroITV J. Zimmerman. Tv content recommender system. In
’09: Proceedings of the seventh european conference on In Proceedings of the 17th National Conference on
European interactive television conference, pages Artificial Intelligence, pages 1121–1122. AAAI Press /
171–174, New York, NY, USA, 2009. ACM. The MIT Press, 2000.
[2] P. Baudisch and L. Brueckner. Tv scout: Guiding [11] B. Markines, C. Cattuto, F. Menczer, D. Benz,
users from printed guides to personalized tv program. A. Hotho, and G. Stumme. Evaluating similarity
In In Proceedings of the 2nd Workshop on measures for emergent semantics of social tagging. In
Personalization in Future TV (May 28, Malaga, WWW ’09: Proceedings of the 18th international
Spain), Universidad de Malaga, pages 151–160, 2002. conference on World wide web, pages 641–650, New
[3] Y. Blanco-Fernández, J. J. Pazos-Arias, A. Gil-Solla, York, NY, USA, 2009. ACM.
M. Ramos-Cabrer, B. Barragáns-Martı́nez, [12] M. Rovira, J. Gonzàlez, A. López, J. Mas, A. Puig,
M. López-Nores, J. Garcı́a-Duque, A. Fernández-Vilas, J. Fabregat, and G. Fernandez. Indextv: a mpeg-7
and R. P. Dı́az-Redondo. A multi-agent open based personalized recommendation system for digital
architecture for a tv recommender system: A case tv. In ICME, pages 823–826. IEEE, 2004.
study using a bayesian strategy. Multimedia Software
[13] B. M. Sarwar, G. Karypis, J. A. Konstan, and J. T.
Engineering, International Symposium on, pages
Riedl. Application of dimensionality reduction in
178–185, 2004.
recommender system - a case study. In In ACM
[4] A. L. Buczak, J. Zimmerman, and K. Kurapati. WebKDD Workshop, 2000.
Personalization: Improving ease-of-use, trust and
[14] B. M. Sarwar, G. Karypis, J. A. Konstan, and J. T.
accuracy of a tv show recommender. In in Proceedings
Riedl. Item-based collaborative filtering
of the TV’02 workshop on Personalization in TV,
recommendation algorithms. In WWW ’01:
Malaga, pages 3–12, 2002.
Proceedings of the 10th international conference on
[5] P. Cremonesi and R. Turrin. Analysis of cold-start World Wide Web, pages 285–295, New York, NY,
recommendations in iptv systems. In RecSys ’09: USA, 2001. ACM.
Proceedings of the third ACM conference on
[15] K. K. Srinivas, S. Gutta, D. Schaffer, J. Martino, and
Recommender systems, pages 233–236, New York, NY,
J. Zimmerman. A multi-agent tv recommender. In In
USA, 2009. ACM.
Proceedings of the UM 2001 workshop
[6] D. Das and H. ter Horst. Recommender systems for ”Personalization in Future TV”, 2001.
tv. In In Proceedings of AAAI, 1998.