Employing User-Generated Tags to Provide Personalized as well as Collaborative TV Recommendations

Andreas Thalhammer
Semantic Technology Institute, University of Innsbruck
Technikerstraße 21a, 6020 Innsbruck, Austria
andreas.thalhammer@sti2.at

Günther Hölbling
Chair of Distributed Information Systems, University of Passau
Innstraße 43, 94032 Passau, Germany
hoelblin@fim.uni-passau.de

Dieter Fensel
Semantic Technology Institute, University of Innsbruck
Technikerstraße 21a, 6020 Innsbruck, Austria
dieter.fensel@sti2.at

ABSTRACT
Within the Web, the annotation of content has become a common way to provide efficient navigation and recommendation of resources. In the future, TV sets with integrated Web capabilities will offer tagging as a tool for content organization in the realm of home entertainment. The recommendation of TV content is a challenging task, as a system has to consider each user's individual preferences without getting too specific. We present a strategy which employs user-generated tags in a flexible way to address this issue. Our approach provides two different ways of semantic ranking for TV program lists: the first allows a higher ranking of programs that fit well to the user's personal likings; the second introduces collaborative aspects and therefore promotes a community-driven rather than an individual way of recommendation.

Categories and Subject Descriptors
H.3.3 [INFORMATION STORAGE AND RETRIEVAL]: Information Search and Retrieval—Information filtering; H.5.1 [INFORMATION INTERFACES AND PRESENTATION]: Multimedia Information Systems

General Terms
HUMAN FACTORS, MEASUREMENT
1. INTRODUCTION
In recent years, the fusion of television and the Web has already begun. In this context, the integration of content from the Web into television and vice versa are two important and not yet completed tasks. Considering the characteristics of both information sources, the following stands out: while television is consumed mostly passively, Web content usually offers a high degree of user interaction. In the next years, however, this distinction will become more and more blurred. In particular, television will offer common ways of interaction that are currently only well known from the Web, especially social annotation of content.

Our approach applies user-generated tags in order to provide recommendation of TV content. As a result of an information filtering process, we provide two rankings of a program list, each of which is based on the same data but employs different ways of user modeling. The personalized ranking focuses on the semantic similarity between the user's preferences for certain tags and the annotations of upcoming programs. The collaborative ranking measures the similarity between tag clouds in the same way, but from a community point of view, as it considers tags from other users as well.

Both approaches are meant to address the problem of overspecialization (which occurs in solely content-based systems [1]) through social discovery: tags are user-generated and describe the semantics of an item. As a matter of fact, the semantics of a TV program do not necessarily correlate with the descriptions from the metadata.

Note that, in this paper, the terms "personalized" and "collaborative" are used in the context of social annotation.
2. TAG-BASED TV RECOMMENDATION
The field of recommendation uses two common kinds of ratings: implicit (extracted from user transactions) and explicit (the user is explicitly asked) ones [1]. As for the latter, instead of using the common numerical ratings, Sen et al. [4] suggest using tags in order to provide a more individual and accurate way of expressing what users like about a specific item. Thus, by employing tags, we switch from the "degree of the user's preference" point of view to the level of "what actually are the user's preferences (in her own words)". In [5], it is suggested that resource recommendation should be performed by applying traditional collaborative filtering methods on user-item, user-tag, and item-tag datasets. In our system, we refine this idea for the television domain by focusing on the item-tag data in combination with different representations of a single user profile.

2.1 Input data
As input for the recommendation approach, we consider two entities: the first is an upcoming program, which has not been tagged yet; the second is a user profile that contains the history of previously watched programs along with the tags assigned by the users.

Finding tags for an upcoming program, which is a candidate for recommendation, is a non-trivial task, as users commonly assign tags after and not before media consumption. There exist various options to tackle this problem: keywords can be extracted from the program descriptions (as is done by tvister¹) and reused as tags. Furthermore, a professional team could tag upcoming programs in advance. It is clear that the creation of tag clouds in both of these ways differs from the dynamic process of community tagging. In [2], we found a feasible way to address this issue by applying a machine learning approach in combination with a client-server architecture. We use this approach in order to provide an efficient prediction of tags that are very similar to the ones that real users assign. Table 1 exemplifies the result of our tag prediction step by showing generated tags and their weights for three TV programs.

¹ tvister - http://www.tvister.de/

Table 1: Predicted tags and their weights for upcoming programs.

Alarm für Cobra 11 - Die Autobahnpolizei:
action (3.937), polizei (1.999), krimi (1.995), spannend (1.990), autos (0.989), aufregend (0.898), aktion (0.896), serie (0.879)

Asterix - Sieg über Cäsar:
film (1.617), comic (1.613), geschichte (1.597), spielfilm (0.859), spass (0.859), comik (0.859), lustig (0.859), zeichentrick (0.858)

Die Simpsons:
zeichentrick (6.357), homer (3.653), comedy (3.552), kult (3.535), lustig (3.434), humor (2.529), serie (1.957), cartoon (1.773), simpsons (1.711), entspannung (1.549), amerika (1.498), marge (0.899), james brooks (0.824), fun (0.761), chillen (0.750), neue folgen (0.736), bart (0.726)
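The first of the options above, reusing keywords from program descriptions as tags, can be sketched as a simple frequency count. This is only a minimal illustration, not the machine-learning approach of [2]; the function name, the stop-word list, and the length filter are our assumptions:

```python
import re
from collections import Counter

# Hypothetical stop-word list (German/English fillers); a real system
# would use a proper list for each supported language.
STOP_WORDS = {"der", "die", "das", "und", "ein", "eine", "in", "mit", "auf",
              "the", "a", "an", "and", "of", "to"}

def extract_tags(description: str, top_n: int = 8) -> list[tuple[str, int]]:
    """Extract candidate tags from a program description by plain
    keyword frequency, skipping stop words and very short tokens."""
    words = re.findall(r"[a-zäöüß\-]+", description.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return counts.most_common(top_n)
```

A description such as "Die Polizei jagt mit schnellen Autos. Polizei und Action auf der Autobahn." would then yield "polizei" as its top candidate tag. Unlike the weights of table 1, the weights here are raw occurrence counts.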
For the personalized and the collaborative part of recommendation, we use two different representations of the same user profile (containing the tagging history). To illustrate this, we refer to table 2, which shows a small user profile that is present in our dataset [2]. The tag cloud of each TV program contained in the user history is presented in two different ways.

The personalized approach does not consider tags from other users, but only the ones the current user assigned. This results in a binary representation, as a user either assigns a particular tag to a program or not. To address this issue, we weight each tag by the user's individual preference for it (its total number of usages in her profile). An example is provided by the left column of table 2. The collaborative approach incorporates the tags of all users that annotated one specific program and weights each tag by its total number of usages for that specific program. This way of presenting tag clouds is the most common one within the Web. The right column of table 2 exhibits an example of this notation.

The comparison of upcoming programs with the personalized version of the user profile and also with the collaborative one results in two different program rankings. In the following, we refer to the user profile as either the personalized or the collaborative representation.

Table 2: Two different representations of a single user profile (left: personalized, right: collaborative).

Die Simpsons:
  comic (1), satire (2), spass (1) | zeichentrick (1), satire (1), lustig (1), comic (1), spass (1)
Broken comedy:
  fun (1), lustig (3), satire (2) | fun (1), lustig (1), satire (1)
Navy CIS:
  spannend (2) | gerichtsmedizin (1), spannend (1), ncis (1), navy (1)
Die Simpsons:
  cartoon (1), lustig (3) | cartoon (1), lachen (1), chillen (1), lisa (1), homer (1), lustig (1), kult (2), bart (1)
Stargate:
  sci-fi (1) | sci-fi (1)
Verführung einer Fremden:
  spannend (2), thriller (1) | spannend (1), thriller (1)
switch reloaded:
  lustig (3), verarsche (1) | parodie (1), verarsche (1), satire (1), lustig (1), tv (1)
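Given raw tagging events, the two profile representations can be sketched as follows. This is a minimal sketch under our own assumptions about the input shape (a list of (user, program, tag) triples); the function names are ours, not from the original system:

```python
from collections import Counter, defaultdict

def personalized_profile(events, user):
    """Per-program tag clouds built only from the current user's tags;
    each tag is weighted by its total number of usages in her profile."""
    own = [(p, t) for (u, p, t) in events if u == user]
    total = Counter(t for _, t in own)          # profile-wide tag frequency
    profile = defaultdict(dict)
    for p, t in own:
        profile[p][t] = total[t]
    return dict(profile)

def collaborative_profile(events, user):
    """Tag clouds over all users for the programs the current user watched;
    each tag is weighted by its usage count for that specific program."""
    watched = {p for (u, p, t) in events if u == user}
    profile = defaultdict(Counter)
    for u, p, t in events:
        if p in watched:
            profile[p][t] += 1
    return {p: dict(c) for p, c in profile.items()}
```

This mirrors table 2: a tag like "lustig" picks up its weight from the whole profile in the personalized view, but is counted per program (across all users) in the collaborative view.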
2.2 Similarity Measure
A similarity measure is used to compare the tag clouds of the programs in the user profile to those of the upcoming programs. We represent tag clouds as vectors in a similar way as items are represented as vectors of user ratings in the collaborative filtering domain [1]: each dimension represents a single tag and each entry denotes the respective weight. These vectors can be compared to each other by measuring their degree of similarity.

With the use of generated tags [2], we discovered a discrepancy between the tag weights of the upcoming programs (predicted) and those of the programs in the user profiles (accumulated). This deviation is related to the different perception of the same rating scale in user-item scenarios, as explained in [3]. Therefore, we decided to employ Pearson correlation as a similarity measure to mitigate this effect. For two programs p, q ∈ P, having attached the shared tags T_pq with the weights w, this results in the following formula:

sim(p, q) = \frac{1}{2} \left( \frac{\sum_{t \in T_{pq}} (w_{p,t} - \bar{w}_p)(w_{q,t} - \bar{w}_q)}{\sqrt{\sum_{t \in T_{pq}} (w_{p,t} - \bar{w}_p)^2} \, \sqrt{\sum_{t \in T_{pq}} (w_{q,t} - \bar{w}_q)^2}} + 1 \right)

The value \bar{w}_k stands for the average tag weight of the program k:

\bar{w}_k = \frac{1}{|T_k|} \sum_{t \in T_k} w_{k,t}

Note that the resulting similarity score lies between 0 and 1, with a neutral point at 0.5 and equality at 1.

2.3 Score Aggregation
After having measured the similarity of the new program to all programs in the user profile, we need to aggregate these scores into a final one for each upcoming program. It does not make sense to aggregate the similarity scores of all programs in the profile, as users often like more than one program genre.
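The rescaled Pearson similarity can be sketched as below. The function name is ours, and the fallbacks (returning the neutral score 0.5 when no tags are shared or when a deviation term is zero) are our assumptions, not specified in the text:

```python
import math

def similarity(p_tags: dict, q_tags: dict) -> float:
    """Pearson correlation over the tags shared by two tag clouds,
    rescaled from [-1, 1] to [0, 1] (0.5 = neutral, 1 = equal)."""
    shared = sorted(set(p_tags) & set(q_tags))   # T_pq: tags attached to both
    if not shared:
        return 0.5                               # assumption: neutral if no overlap
    # average tag weight of each program (over all of its tags)
    mp = sum(p_tags.values()) / len(p_tags)
    mq = sum(q_tags.values()) / len(q_tags)
    wp = [p_tags[t] for t in shared]
    wq = [q_tags[t] for t in shared]
    num = sum((a - mp) * (b - mq) for a, b in zip(wp, wq))
    den = (math.sqrt(sum((a - mp) ** 2 for a in wp))
           * math.sqrt(sum((b - mq) ** 2 for b in wq)))
    if den == 0:
        return 0.5                               # assumption: neutral for flat clouds
    return 0.5 * (num / den + 1)
```

By the Cauchy-Schwarz inequality the quotient stays in [-1, 1], so the rescaled score stays in [0, 1]: a cloud compared with itself scores 1, and perfectly anti-correlated weights score 0.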
Hence, we only aggregate the similarity scores of the k-nearest neighbors (k-NN) in order to aim at specific genres the user prefers. This helps to obtain more accurate results, as inter-genre measurements often result in a neutral similarity score that would otherwise be incorporated in every aggregation. For the aggregation of the scores of the k-NN, we use a weighted average approach with the scores themselves being the weights (which results in squaring the similarity scores). For the programs within the k-nearest neighbors kNN ⊆ Profile and the upcoming program p_new, this leads to the following formula:

agg(p_{new}) = \frac{1}{\sum_{p \in kNN} sim(p_{new}, p)} \sum_{p \in kNN} sim(p_{new}, p)^2

The aggregated score of an upcoming program can be interpreted as the user's degree of preference for it. In our approach, this value is used to provide a ranking within the list of upcoming programs.
3. PROOF OF CONCEPT
Using the upcoming programs of table 1 and the profile of table 2 as input data, we now exemplify how the aforementioned profile representations can provide different scores and rankings. It needs to be pointed out that, for reasons of clarity and brevity, the chosen user profile is very small (only seven tagged programs) and the short list of upcoming programs does not relate to a real-case scenario (usually more than 200 concurrent programs).

Table 3: Personalized vs. Collaborative: 3-NN and the aggregated scores of the upcoming programs.

Die Simpsons:
  Personalized: Die Simpsons (0.677), switch reloaded (0.651), Broken Comedy (0.636); aggregated: 0.655
  Collaborative: Die Simpsons (0.742), Die Simpsons (0.702), Broken Comedy (0.611); aggregated: 0.690
Asterix - Sieg über Cäsar:
  Personalized: Die Simpsons (0.648), Die Simpsons (0.619), switch reloaded (0.619); aggregated: 0.629
  Collaborative: Die Simpsons (0.776), Broken Comedy (0.572), switch reloaded (0.554); aggregated: 0.650
Alarm für Cobra 11 - Die Autobahnpolizei:
  Personalized: Navy CIS (0.679), Verführung einer Fremden (0.659), Stargate (0.499); aggregated: 0.623
  Collaborative: Navy CIS (0.588), Verführung einer Fremden (0.626), Stargate (0.499); aggregated: 0.576

The personalized as well as the collaborative rankings, shown in table 3, demonstrate that a top-N recommendation is possible with only few ratings. By considering tags, similarities between TV programs can be determined although they are not strongly correlated through content or metadata. In our case, the Asterix movie gets nearly the same score (in both representations) as the upcoming Simpsons episode, although it has no direct correlation (through TV metadata or content) to any of the user's previously watched programs. In contrast, the upcoming Simpsons episode does have this link: the user has already watched two episodes before. The reasonably high score of the Asterix movie therefore indicates that, even with a small profile, the use of tags as semantic descriptors might help to overcome the common problem of overspecialization. This also underlines our efforts to provide collaborative semantic tag prediction [2].

It is apparent that the ranking of the three programs in table 3 is the same for the personalized and for the collaborative representation of the user profile. However, as the differences between the scores in both lists indicate, the rankings would strongly differ if a larger and more realistic number of upcoming programs (> 200) were taken into account. The similarity scores of the collaborative ranking highlight the community factor of the ranking. The personalized part of the recommender system highly relies on the user's taste and therefore implements her individual preferences. For a single top-N listing, it is possible to linearly combine both types of scores for each program.

4. CONCLUSION AND OUTLOOK
This paper presents two feasible and promising approaches to provide top-N recommendations through collaborative tagging. Moreover, it is demonstrated that the utilization of user-generated tags might help to overcome the problem of overspecialization in the emergent domain of TV recommendation.

For future work, we plan to conduct a thorough evaluation of the proposed approach that also includes a user survey. Furthermore, the similarity measurements can be enhanced through lemmatization of tags in combination with ontology matchings between tag clouds.
5. REFERENCES
[1] G. Adomavicius and A. Tuzhilin. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Trans. on Knowl. and Data Eng., 17:734–749, June 2005.
[2] G. Hölbling, A. Thalhammer, and H. Kosch. Content-based tag generation to enable a tag-based collaborative TV-recommendation system. In 8th Int'l Conf. on Interactive TV&Video, pages 273–282, 2010.
[3] T. Segaran. Programming Collective Intelligence. O'Reilly, 2007.
[4] S. Sen, J. Vig, and J. Riedl. Tagommenders: Connecting users to items through tags. In 18th Int'l Conf. on World Wide Web, pages 671–680, 2009.
[5] K. H. L. Tso-Sutter, L. B. Marinho, and L. Schmidt-Thieme. Tag-aware recommender systems by fusion of collaborative filtering algorithms. In Proc. of the 2008 ACM Symposium on Applied Computing, SAC '08, pages 1995–1999, 2008.