=Paper=
{{Paper
|id=None
|storemode=property
|title=Employing User-Assigned Tags to Provide Personalized as well as Collaborative TV Recommendations
|pdfUrl=https://ceur-ws.org/Vol-720/Thalhammer.pdf
|volume=Vol-720
}}
==Employing User-Assigned Tags to Provide Personalized as well as Collaborative TV Recommendations==
Employing User-Generated Tags to Provide Personalized as well as Collaborative TV Recommendations

* Andreas Thalhammer, Semantic Technology Institute, University of Innsbruck, Technikerstraße 21a, 6020 Innsbruck, Austria, andreas.thalhammer@sti2.at
* Günther Hölbling, Chair of Distributed Information Systems, University of Passau, Innstraße 43, 94032 Passau, Germany, hoelblin@fim.uni-passau.de
* Dieter Fensel, Semantic Technology Institute, University of Innsbruck, Technikerstraße 21a, 6020 Innsbruck, Austria, dieter.fensel@sti2.at
===ABSTRACT===
Within the Web, the annotation of content has become a common way to provide efficient navigation and recommendation of resources. In the future, TV sets with integrated Web capabilities will offer tagging as a tool for content organization in the realm of home entertainment. The recommendation of TV content is a challenging task, as a system has to consider each user's individual preferences without getting too specific. We present a strategy which employs user-generated tags in a flexible way to address this issue. Our approach provides two different ways of semantic ranking for TV program lists: The first allows a higher ranking of programs that fit well with the user's personal likings. The second introduces collaborative aspects and therefore promotes a community-driven approach rather than an individual way of recommendation.

===Categories and Subject Descriptors===
H.3.3 [INFORMATION STORAGE AND RETRIEVAL]: Information Search and Retrieval—Information filtering; H.5.1 [INFORMATION INTERFACES AND PRESENTATION]: Multimedia Information Systems

===General Terms===
HUMAN FACTORS, MEASUREMENT

===1. INTRODUCTION===
In recent years, the fusion of television and the Web has already begun. In this context, the integration of content from the Web into television and vice versa are two important and not yet completed tasks. Considering the characteristics of both information sources, the following becomes apparent: while television is consumed mostly passively, Web content usually offers a high degree of user interaction. However, in the coming years, this distinction will become more and more blurred. In particular, television will offer common forms of interaction that are currently only well known from the Web, especially social annotation of content.

Our approach applies user-generated tags in order to provide recommendations of TV content. As a result of an information filtering process, we provide two rankings of a program list, each of which is based on the same data but employs different ways of user modeling. The personalized ranking focuses on the semantic similarity between the user's preferences for certain tags and the annotations of upcoming programs. The collaborative ranking measures the similarity between tag clouds in the same way, but from a community point of view, as it considers tags from other users as well.

Both approaches are meant to address the problem of overspecialization (which occurs in purely content-based systems [1]) through social discovery: tags are user-generated and describe the semantics of an item. As a matter of fact, the semantics of a TV program do not necessarily correlate with the descriptions from the metadata.

Note that, in this paper, the terms "personalized" and "collaborative" are used in the context of social annotation.

===2. TAG-BASED TV RECOMMENDATION===
The field of recommendation uses two common kinds of ratings, implicit (extracted from user transactions) and explicit (the user is explicitly asked) ones [1]. As for the latter, instead of using the common numerical ratings, Sen et al. [4] suggest using tags in order to provide a more individual and accurate way of expressing what users like about a specific item. Thus, by employing tags, we switch from the "degree of the user's preference" point of view to the level of "what actually are the user's preferences (in her own words)".

In [5], it is suggested that resource recommendation should be performed by applying traditional collaborative filtering methods to user-item, user-tag, and item-tag datasets. In our system we refine this idea for the television domain by focusing on the item-tag data in combination with different representations of a single user profile.

====2.1 Input data====
As input for the recommendation approach, we consider two entities: The first is an upcoming program, which has not been tagged yet. The second is a user profile that contains the history of previously watched programs along with the tags assigned by the user.

Finding tags for an upcoming program, which is a candidate for recommendation, is a non-trivial task, as users commonly assign tags after and not before media consumption. There exist various options to tackle this problem: Keywords can be extracted from the program descriptions (as it is done by tvister, http://www.tvister.de/) and reused as tags. Furthermore, a professional team could tag upcoming programs in advance.
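The first of these options, extracting keywords from a program description and reusing them as tags, might be sketched as follows. This is an illustrative sketch only, not the actual method of tvister or of [2]; the names `extract_tags` and `STOPWORDS` as well as the tiny stop word list are assumptions.

```python
# Illustrative sketch: reuse the most frequent content words of a
# program description as candidate tags for an upcoming program.
import re
from collections import Counter

# Hypothetical, deliberately tiny stop word list.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "on", "with", "is", "to"}

def extract_tags(description: str, k: int = 5) -> list[str]:
    """Return up to k frequent non-stopword terms as candidate tags."""
    words = re.findall(r"[a-zäöüß-]+", description.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [term for term, _ in counts.most_common(k)]

print(extract_tags(
    "The police series follows two police officers on the highway; "
    "action and suspense on the highway every week."))
```

A real system would additionally need stemming or lemmatization and a weighting scheme to arrive at tag weights such as those in table 1.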
It is clear that the creation of tag clouds in both of these ways differs from the dynamic process of community tagging. In [2], we found a feasible way to address this issue by applying a machine learning approach in combination with a client-server architecture. We use this approach in order to provide an efficient prediction of tags that are very similar to the ones that real users assign. Table 1 exemplifies the result of our tag prediction step by showing generated tags and their weights for three TV programs.

{| class="wikitable"
|+ Table 1: Predicted tags and their weights for upcoming programs.
! Program !! Predicted tags (weights)
|-
| Alarm für Cobra 11 - Die Autobahnpolizei || action (3.937), polizei (1.999), krimi (1.995), spannend (1.990), autos (0.989), aufregend (0.898), aktion (0.896), serie (0.879)
|-
| Asterix - Sieg über Cäsar || film (1.617), comic (1.613), geschichte (1.597), spielfilm (0.859), spass (0.859), comik (0.859), lustig (0.859), zeichentrick (0.858)
|-
| Die Simpsons || zeichentrick (6.357), homer (3.653), comedy (3.552), kult (3.535), lustig (3.434), humor (2.529), serie (1.957), cartoon (1.773), simpsons (1.711), entspannung (1.549), amerika (1.498), marge (0.899), james brooks (0.824), fun (0.761), chillen (0.750), neue folgen (0.736), bart (0.726)
|}

For the personalized and the collaborative part of the recommendation, we use two different representations of the same user profile (containing the tagging history). To illustrate this, we refer to table 2, which shows a small user profile that is present in our dataset [2]. The tag cloud of each TV program contained in the user history is presented in different ways. The personalized approach does not consider tags from other users, but only the ones the current user assigned. This results in a binary representation, as a user either assigns a particular tag to a program or not. To address this issue, we weight each tag by the user's individual preference for it (total number of usages in her profile). An example is provided by the left column of table 2. The collaborative approach incorporates the tags of all users that annotated one specific program and weights each tag by its total number of usages for that program. This way of presenting tag clouds is the most common one within the Web. The right column of table 2 exhibits an example of this notation.

{| class="wikitable"
|+ Table 2: Two different representations of a single user profile.
! Program !! Personalized !! Collaborative
|-
| Die Simpsons || comic (1), satire (2), spass (1) || zeichentrick (1), satire (1), lustig (1), comic (1), spass (1)
|-
| Broken comedy || fun (1), lustig (3), satire (2) || fun (1), lustig (1), satire (1)
|-
| Navy CIS || spannend (2) || gerichtsmedizin (1), spannend (1), ncis (1), navy (1)
|-
| Die Simpsons || cartoon (1), lustig (3) || cartoon (1), lachen (1), chillen (1), lisa (1), homer (1), lustig (1), kult (2), bart (1)
|-
| Stargate || sci-fi (1) || sci-fi (1)
|-
| Verführung einer Fremden || spannend (2), thriller (1) || spannend (1), thriller (1)
|-
| switch reloaded || lustig (3), verarsche (1) || parodie (1), verarsche (1), satire (1), lustig (1), tv (1)
|}

The comparison of upcoming programs with the personalized version of the user profile and also with the collaborative one results in two different program rankings. In the following, we refer to the user profile as either the personalized or the collaborative representation.

====2.2 Similarity Measure====
A similarity measure is used to compare the tag clouds of the programs in the user profile to the ones of the upcoming programs. We represent tag clouds as vectors in a similar way as items are represented as vectors of user ratings in the collaborative filtering domain [1]: each dimension represents a single tag and each entry denotes the respective weight. These vectors can be compared to each other by measuring their degree of similarity. With the use of generated tags [2], we discovered a discrepancy between the tag weights of the upcoming programs (predicted) and the ones of the programs in the user profiles (accumulated). This deviation is related to the different perception of the same rating scale in user-item scenarios that is explained in [3]. Therefore, we decided to employ Pearson correlation as a similarity measure to mitigate this effect. For two programs p, q ∈ P, having attached the tags T<sub>pq</sub> with the weights w, this results in the following formula:

<math>\mathrm{sim}(p,q) = \frac{1}{2}\left(\frac{\sum_{t \in T_{pq}}(w_{p,t} - \bar{w}_p)(w_{q,t} - \bar{w}_q)}{\sqrt{\sum_{t \in T_{pq}}(w_{p,t} - \bar{w}_p)^2 \sum_{t \in T_{pq}}(w_{q,t} - \bar{w}_q)^2}} + 1\right)</math>

The value <math>\bar{w}_k</math> stands for the average tag weight of the program k:

<math>\bar{w}_k = \frac{1}{|\vec{k}|}\sum_{t \in T_k} w_{k,t}</math>

Note that the resulting similarity score lies between 0 and 1, with a neutral point at 0.5 and equality at 1.

====2.3 Score Aggregation====
After having measured the similarity of the new program to all programs in the user profile, we need to aggregate these scores into a final one for each upcoming program. It does not make sense to aggregate the similarity scores of all programs in the profile, as users often like more than one program genre. Hence, we only aggregate the similarity scores of the k-nearest neighbors (k-NN) in order to target the specific genres the user prefers. This helps to obtain more accurate results, as inter-genre measurements often result in a neutral similarity score that would otherwise be incorporated into every aggregation. For the aggregation of the scores of the k-NN, we use a weighted average approach with the similarity scores acting as the weights (which results in squaring the similarity scores).
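The similarity measure of section 2.2 can be sketched in Python as follows. This is a minimal sketch under assumptions the paper does not fix: tag clouds are plain dictionaries mapping tags to weights, T<sub>pq</sub> is taken to be the union of both tag sets (missing tags count as weight 0), and an undefined correlation (zero denominator) is mapped to the neutral score 0.5.

```python
# Sketch of the Pearson-based tag-cloud similarity, rescaled from
# [-1, 1] to [0, 1] as in section 2.2. Tag clouds are dicts tag -> weight.
from math import sqrt

def avg_weight(cloud: dict[str, float]) -> float:
    """Average tag weight of a program's own tag cloud (the w-bar value)."""
    return sum(cloud.values()) / len(cloud)

def sim(p: dict[str, float], q: dict[str, float]) -> float:
    tags = set(p) | set(q)                  # T_pq, assumed to be the union
    wp, wq = avg_weight(p), avg_weight(q)
    num = sum((p.get(t, 0.0) - wp) * (q.get(t, 0.0) - wq) for t in tags)
    den = sqrt(sum((p.get(t, 0.0) - wp) ** 2 for t in tags)
               * sum((q.get(t, 0.0) - wq) ** 2 for t in tags))
    if den == 0.0:
        return 0.5                          # undefined correlation -> neutral
    return 0.5 * (num / den + 1.0)

# Identical tag clouds score 1.0 (equality), per the note in section 2.2.
print(sim({"lustig": 3.0, "satire": 2.0}, {"lustig": 3.0, "satire": 2.0}))
```

Disjoint tag clouds come out at the neutral point 0.5 under these assumptions, which matches the intuition that no shared tags give no evidence either way.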
{| class="wikitable"
|+ Table 3: Personalized vs. Collaborative: 3-NN and the aggregated scores of the upcoming programs.
! Upcoming program !! Personalized: 3-NN, aggregated score !! Collaborative: 3-NN, aggregated score
|-
| Die Simpsons || Die Simpsons (0.677), switch reloaded (0.651), Broken Comedy (0.636); aggregated: 0.655 || Die Simpsons (0.742), Die Simpsons (0.702), Broken Comedy (0.611); aggregated: 0.690
|-
| Asterix - Sieg über Cäsar || Die Simpsons (0.648), Die Simpsons (0.619), switch reloaded (0.619); aggregated: 0.629 || Die Simpsons (0.776), Broken Comedy (0.572), switch reloaded (0.554); aggregated: 0.650
|-
| Alarm für Cobra 11 - Die Autobahnpolizei || Navy CIS (0.679), Verführung einer Fremden (0.659), Stargate (0.499); aggregated: 0.623 || Navy CIS (0.588), Verführung einer Fremden (0.626), Stargate (0.499); aggregated: 0.576
|}
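The weighted-average aggregation of section 2.3, with the similarity scores acting as their own weights, might look like this. The function name `agg` and the sort-based neighbor selection are illustrative assumptions.

```python
def agg(sims: list[float], k: int = 3) -> float:
    """Aggregate the similarity scores of the k nearest neighbors by a
    weighted average in which each score is also its own weight,
    i.e. sum(sim^2) / sum(sim)."""
    knn = sorted(sims, reverse=True)[:k]    # k most similar profile programs
    total = sum(knn)
    return sum(s * s for s in knn) / total if total else 0.0

# 3-NN scores of "Die Simpsons" from the personalized column of table 3.
print(round(agg([0.677, 0.651, 0.636]), 3))  # → 0.655, the aggregated score in table 3
```

For a single top-N listing, the personalized and the collaborative aggregated score of each program could then be combined linearly, e.g. `0.5 * pers + 0.5 * coll`; the equal weighting here is an assumption, not a choice made in the paper.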
For the programs within the k-nearest neighbors <math>kNN \subseteq Profile</math> and the upcoming program <math>p_{new}</math>, this leads to the following formula:

<math>\mathrm{agg}(p_{new}) = \frac{1}{\sum_{p \in kNN} \mathrm{sim}(p_{new}, p)} \sum_{p \in kNN} \mathrm{sim}(p_{new}, p)^2</math>

The aggregated score of an upcoming program can be interpreted as the user's degree of preference for it. In our approach, this value is used to provide a ranking within the list of upcoming programs.

===3. PROOF OF CONCEPT===
Using the upcoming programs of table 1 and the profile of table 2 as input data, we now exemplify how the aforementioned profile representations can provide different scores and rankings. It needs to be pointed out that, for reasons of clarity and brevity, the chosen user profile is very small (only seven tagged programs) and the short list of upcoming programs does not correspond to a real-world scenario (usually more than 200 concurrent programs).

The personalized as well as the collaborative rankings, shown in table 3, demonstrate that a top-N recommendation is possible with only a few ratings. By considering tags, similarities between TV programs can be determined even though they are not strongly correlated through content or metadata. In our case, the Asterix movie gets nearly the same score (on both sides) as the upcoming Simpsons episode, although it has no direct correlation (through TV metadata or content) to any of the user's previously watched programs. In contrast, the upcoming Simpsons episode does have this link: the user has already watched two episodes before. Therefore, the reasonably high score of the Asterix movie indicates that, even with a small profile, the use of tags as semantic descriptors might help to overcome the common problem of overspecialization. This also underlines our efforts to provide collaborative semantic tag prediction [2].

It is apparent that the ranking of the three programs in table 3 is the same for the personalized and for the collaborative representation of the user profile. However, as the differences between the scores in the two lists indicate, the rankings would differ strongly if a larger and more realistic number of upcoming programs (> 200) were taken into account. The similarity scores of the collaborative ranking highlight the community factor of the ranking. The personalized part of the recommender system relies strongly on the user's taste and therefore implements her individual preferences.

For a single top-N listing, it is possible to linearly combine both types of scores for each program.

===4. CONCLUSION AND OUTLOOK===
This paper presents two feasible and promising approaches to provide top-N recommendations through collaborative tagging. Moreover, it is demonstrated that the utilization of user-generated tags might help to overcome the problem of overspecialization in the emergent domain of TV recommendation.

For future work, we plan to conduct a thorough evaluation of the proposed approach that also includes a user survey. Furthermore, the similarity measurements can be enhanced through lemmatization of tags in combination with ontology matching between tag clouds.

===5. REFERENCES===
[1] G. Adomavicius and A. Tuzhilin. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Trans. on Knowl. and Data Eng., 17:734–749, June 2005.

[2] G. Hölbling, A. Thalhammer, and H. Kosch. Content-based tag generation to enable a tag-based collaborative TV-recommendation system. In 8th Int'l Conf. on Interactive TV&Video, pages 273–282, 2010.

[3] T. Segaran. Programming Collective Intelligence. O'Reilly, 2007.

[4] S. Sen, J. Vig, and J. Riedl. Tagommenders: connecting users to items through tags. In 18th Int'l Conf. on World Wide Web, pages 671–680, 2009.

[5] K. H. L. Tso-Sutter, L. B. Marinho, and L. Schmidt-Thieme. Tag-aware recommender systems by fusion of collaborative filtering algorithms. In Proc. of the 2008 ACM Symposium on Applied Computing, SAC '08, pages 1995–1999, 2008.