<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Employing User-Generated Tags to Provide Personalized as well as Collaborative TV Recommendations</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andreas Thalhammer</string-name>
          <email>andreas.thalhammer@sti2.at</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Günther Hölbling</string-name>
          <email>hoelblin@fim.unipassau.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dieter Fensel</string-name>
          <email>dieter.fensel@sti2.at</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Chair of Distributed Information Systems, University of Passau</institution>
          ,
          <addr-line>Innstraße 43, 94032 Passau</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Semantic Technology Institute, University of Innsbruck</institution>
          ,
          <addr-line>Technikerstraße 21a, 6020 Innsbruck</addr-line>
          ,
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Within the Web, the annotation of content has become a common way to provide efficient navigation and recommendation of resources. In the future, TV sets with integrated Web capabilities will offer tagging as a tool for content organization in the realm of home entertainment. The recommendation of TV content is a challenging task, as a system has to consider each user's individual preferences without getting too specific. We present a strategy which employs user-generated tags in a flexible way to address this issue. Our approach provides two different ways of semantic ranking for TV program lists: the first allows a higher ranking of programs that fit well with the user's personal likings. The second introduces collaborative aspects and therefore promotes a community-driven approach rather than an individual way of recommendation.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>TAG-BASED TV RECOMMENDATION</title>
      <p>In recent years, the fusion of television and the Web has
begun. In this context, the integration of Web content
into television, and vice versa, are two
important and not yet completed tasks. The two information
sources differ in character:
while television is consumed mostly passively, Web content
usually offers a high degree of user interaction. However, in
the coming years, this distinction will become more and more
blurred. In particular, television will offer common ways
of interaction that are currently well known only from the
Web, especially social annotation of content.</p>
      <p>Our approach applies user-generated tags in order to
provide recommendations of TV content. As a result of an
information filtering process, we provide two rankings of a
program list, each of which is based on the same data
but employs a different way of user modeling. The
personalized ranking focuses on the semantic similarity between</p>
    </sec>
    <sec id="sec-3">
      <title>Input data</title>
      <p>As input for the recommendation approach, we consider
two entities: the first is an upcoming program, which has
not been tagged yet. The second is a user profile that
contains the history of previously watched programs along with
the tags assigned by the users.</p>
      <p>
        Finding tags for an upcoming program, which is a
candidate for recommendation, is a non-trivial task, as users
commonly assign tags after and not before media
consumption. There are various options to tackle this problem:
keywords can be extracted from the program descriptions
(as is done by tvister, http://www.tvister.de/) and reused as tags. Furthermore, a
professional team could tag upcoming programs in advance.
It is clear that the creation of tag clouds in both of these
ways differs from the dynamic process of community
tagging. In [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], we found a feasible way to address this issue by
applying a machine learning approach in combination with a
client-server architecture. We use this approach in order to
provide an efficient prediction of tags that are very similar
to the ones that real users assign. Table 1 exemplifies the
result of our tag prediction step by showing generated tags
and their weights for three TV programs.
      </p>
      <p>
        For the personalized and the collaborative part of the
recommendation, we use two different representations of the same
user profile (containing the tagging history). To illustrate
this, we refer to Table 2, which shows a small user profile
that is present in our dataset [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The tag cloud of each TV
program contained in the user history is presented in
different ways. The personalized approach does not consider
tags from other users, but only the ones the current user
assigned. This results in a binary representation, as a user
either assigns a particular tag to a program or does not. To
address this issue, we weight each tag by the user's
individual preference for it (total number of usages in her profile).
An example is provided by the left side of Table 2. The
collaborative approach incorporates the tags of all users that
annotated one specific program and weights each tag by its
total number of usages for that specific program. This way of
presenting tag clouds is the most common one within the
Web. The right column of Table 2 exhibits an example of
this notation.
      </p>
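      <p>The two profile representations described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the record layout (user, program, tag), the user names, and the function names <monospace>personalized_cloud</monospace> and <monospace>collaborative_cloud</monospace> are hypothetical, and the sample records are invented for the example rather than taken from the dataset.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter

# Hypothetical tagging records (user, program, tag); invented for illustration.
TAGGINGS = [
    ("alice", "Die Simpsons", "lustig"),
    ("alice", "Die Simpsons", "zeichentrick"),
    ("alice", "Asterix", "lustig"),
    ("bob", "Die Simpsons", "lustig"),
    ("bob", "Die Simpsons", "kult"),
]


def personalized_cloud(taggings, user, program):
    """Tags this user assigned to the program, each weighted by the
    total number of times the user used that tag in her whole profile."""
    usage = Counter(tag for u, _, tag in taggings if u == user)
    own_tags = {tag for u, p, tag in taggings if u == user and p == program}
    return {tag: usage[tag] for tag in own_tags}


def collaborative_cloud(taggings, program):
    """Tags all users assigned to the program, each weighted by the
    number of times it was used for that program."""
    return dict(Counter(tag for _, p, tag in taggings if p == program))
```
]]></preformat>
      <p>With the sample records, the personalized cloud of "Die Simpsons" for the first user contains only her own two tags, while the collaborative cloud additionally contains the tag contributed by the second user.</p>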
      <p>The comparison of upcoming programs with the
personalized version of the user profile, and also with the collaborative
one, results in two different program rankings.</p>
      <p>In the following, we refer to the user profile as either the
personalized or the collaborative representation.</p>
    </sec>
    <sec id="sec-4">
      <title>Similarity Measure</title>
      <p>
        A similarity measure is used to compare the tag clouds of
the programs in the user profile to the ones of the upcoming
programs. We represent tag clouds as vectors in a similar
way as items are represented as vectors of user ratings in the
collaborative filtering domain [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]: each dimension represents
a single tag and each entry denotes the respective weight.
These vectors can be compared to each other by measuring
their degree of similarity. With the use of generated tags [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], we discovered a discrepancy between the tag weights of
the upcoming programs (predicted) and the ones of the
programs in the user profiles (accumulated). This deviation is
related to the different perception of the same rating scale
in user-item scenarios that is explained in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Therefore, we
decided to employ Pearson correlation as a similarity
measure to mitigate this effect. For two programs
        <inline-formula><tex-math><![CDATA[p, q \in P]]></tex-math></inline-formula>
        , having attached the tags
        <inline-formula><tex-math><![CDATA[T_{pq}]]></tex-math></inline-formula>
        with the weight
        <inline-formula><tex-math><![CDATA[w]]></tex-math></inline-formula>
        , this results in the following formula:
        <disp-formula><tex-math><![CDATA[\mathrm{sim}(p, q) = \frac{1}{2} \left( \frac{\sum_{t \in T_{pq}} (w_{p,t} - \bar{w}_p)(w_{q,t} - \bar{w}_q)}{\sqrt{\sum_{t \in T_{pq}} (w_{p,t} - \bar{w}_p)^2 \, \sum_{t \in T_{pq}} (w_{q,t} - \bar{w}_q)^2}} + 1 \right)]]></tex-math></disp-formula>
      </p>
      <p>
        The value
        <inline-formula><tex-math><![CDATA[\bar{w}_k]]></tex-math></inline-formula>
        stands for the average tag weight of the program
        <inline-formula><tex-math><![CDATA[k]]></tex-math></inline-formula>
        :
        <disp-formula><tex-math><![CDATA[\bar{w}_k = \frac{1}{|T_k|} \sum_{t \in T_k} w_{k,t}]]></tex-math></disp-formula>
      </p>
      <p>Note that the resulting similarity score lies between 0 and
1, with a neutral point at 0.5 and equality at 1.</p>
      <table-wrap id="tab-1">
        <label>Table 1</label>
        <caption>
          <p>Generated tags and their weights for three TV programs.</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th>Program</th>
              <th>Generated tags (weight)</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>Alarm für Cobra 11 - Die Autobahnpolizei</td>
              <td>action (3.937), polizei (1.999), krimi (1.995), spannend (1.990), autos (0.989), aufregend (0.898), aktion (0.896), serie (0.879)</td>
            </tr>
            <tr>
              <td>Asterix - Sieg über Cäsar</td>
              <td>film (1.617), comic (1.613), geschichte (1.597), spielfilm (0.859), spass (0.859), comik (0.859), lustig (0.859), zeichentrick (0.858)</td>
            </tr>
            <tr>
              <td>Die Simpsons</td>
              <td>zeichentrick (6.357), homer (3.653), comedy (3.552), kult (3.535), lustig (3.434), humor (2.529), serie (1.957), cartoon (1.773), simpsons (1.711), entspannung (1.549), amerika (1.498), fun (0.761), marge (0.899), james brooks (0.824), chillen (0.750), neue folgen (0.736), bart (0.726)</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
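      <p>The similarity measure of this section, a Pearson correlation rescaled to the interval from 0 to 1, can be sketched as follows. This is a minimal illustration, not the authors' implementation. It makes three assumptions that the text does not state explicitly: tag clouds are plain dictionaries from tag to weight, the sums run over the tags that both programs share, and a pair of programs with no shared tags or with an undefined correlation receives the neutral score 0.5. The function names are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def mean_weight(cloud):
    """Average tag weight of one program, taken over all of its tags."""
    return sum(cloud.values()) / len(cloud)


def similarity(cloud_p, cloud_q):
    """Pearson correlation of two tag clouds, rescaled to the range [0, 1]."""
    # Assumption: the sums run over the tags attached to both programs.
    shared = set(cloud_p).intersection(cloud_q)
    if not shared:
        return 0.5  # assumption: no shared tags counts as neutral
    mean_p, mean_q = mean_weight(cloud_p), mean_weight(cloud_q)
    numerator = sum((cloud_p[t] - mean_p) * (cloud_q[t] - mean_q) for t in shared)
    spread_p = sum((cloud_p[t] - mean_p) ** 2 for t in shared)
    spread_q = sum((cloud_q[t] - mean_q) ** 2 for t in shared)
    denominator = math.sqrt(spread_p * spread_q)
    if denominator == 0:
        return 0.5  # assumption: an undefined correlation counts as neutral
    # Clamp to guard against floating-point drift outside [-1, 1].
    pearson = max(-1.0, min(1.0, numerator / denominator))
    return 0.5 * (pearson + 1.0)
```
]]></preformat>
      <p>Because the correlation is computed on deviations from each program's own mean weight, two clouds with the same relative tag emphasis score close to 1 even if one uses predicted weights and the other accumulated counts.</p>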
    </sec>
    <sec id="sec-5">
      <title>Score Aggregation</title>
      <p>After having measured the similarity of the new program
to all programs in the user profile, we need to aggregate
these scores into a final one for each upcoming program. It
does not make sense to aggregate the similarity scores of
all programs in the profile, as users often like more than
one program genre. Hence, we only aggregate the similarity
scores of the k-nearest neighbors (k-NN) in order to target
the specific genres the user prefers. This helps to obtain more
accurate results, as inter-genre measurements often result in
a neutral similarity score that would otherwise be incorporated in
every aggregation. For the aggregation of the scores of the
k-NN, we use a weighted average approach with the scores
being the weights (which results in squaring the similarity
scores). For the programs within the k-nearest neighbors
        <inline-formula><tex-math><![CDATA[kNN_{Profile}]]></tex-math></inline-formula>
        and the upcoming program
        <inline-formula><tex-math><![CDATA[p_{new}]]></tex-math></inline-formula>
        , this leads to the following formula:
        <disp-formula><tex-math><![CDATA[\mathrm{agg}(p_{new}) = \frac{\sum_{p \in kNN_{Profile}} \mathrm{sim}(p, p_{new})^2}{\sum_{p \in kNN_{Profile}} \mathrm{sim}(p, p_{new})}]]></tex-math></disp-formula>
      </p>
      <p>The aggregated score of an upcoming program can be
interpreted as the user's degree of preference for it. In our
approach, this value is used to provide a ranking within the
list of upcoming programs.</p>
    </sec>
    <sec id="sec-6">
      <title>PROOF OF CONCEPT</title>
      <p>Using the upcoming programs of Table 1 and the
profile of Table 2 as input data, we now exemplify how the
aforementioned profile representations can provide different
scores and rankings. It needs to be pointed out that, for
reasons of clarity and brevity, the chosen user profile is very
small (only seven tagged programs), and the short list
of upcoming programs does not correspond to a real-case scenario
(usually more than 200 concurrent programs).</p>
      <p>
        The personalized as well as the collaborative rankings,
shown in Table 3, demonstrate that a top-N recommendation
is possible with only a few ratings. By considering tags,
similarities between TV programs can be determined although
they are not strongly correlated through content or
metadata. In our case, the Asterix movie gets nearly the same
score (on both sides) as the upcoming Simpsons episode,
although it has no direct correlation (through TV metadata or
content) to any of the user's previously watched programs.
In contrast, the upcoming Simpsons episode does have this
link: the user has already watched two episodes before.
Therefore, the reasonably high score of the Asterix movie
indicates that, even with a small profile, the use of tags as
semantic descriptors might help to overcome the common
problem of overspecialization. This also underlines our
efforts to provide collaborative semantic tag prediction [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>It is apparent that the ranking of the three programs in
Table 3 is the same for the personalized and for the
collaborative representation of the user profile. However, as
the differences between the scores in both lists indicate, the
ranking would differ strongly if a larger and more
realistic number of upcoming programs (&gt; 200) were taken into account.
The similarity scores of the collaborative ranking highlight
the community factor of the ranking. The personalized part
of the recommender system relies heavily on the user's taste
and therefore implements her individual preferences.</p>
      <p>For a single top-N listing, it is possible to linearly combine
both types of scores for each program.</p>
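      <p>The aggregation over the k nearest neighbors, the linear combination of the personalized and collaborative scores, and the final ranking can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation behind the reported scores: the function names, the default mixing weight <monospace>alpha</monospace> of 0.5, and the return value 0.0 for an all-zero neighborhood are hypothetical choices, and the numbers in the usage note are invented.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def aggregate(similarities, k):
    """Weighted average of the k highest similarity scores, where each
    score is weighted by itself (sum of squares divided by the sum)."""
    nearest = sorted(similarities, reverse=True)[:k]
    total = sum(nearest)
    if total == 0:
        return 0.0  # assumption: no usable neighbors yields the lowest score
    return sum(s * s for s in nearest) / total


def combine(personalized, collaborative, alpha=0.5):
    """Linear combination of both score types; alpha is a hypothetical weight."""
    return alpha * personalized + (1.0 - alpha) * collaborative


def rank(scores):
    """Program names ordered from highest to lowest aggregated score."""
    return sorted(scores, key=scores.get, reverse=True)
```
]]></preformat>
      <p>For example, with the invented similarities 0.9, 0.5, and 0.1 and k = 2, only 0.9 and 0.5 enter the aggregation, giving (0.81 + 0.25) / 1.4. Weighting each score by itself pulls the result toward the closest neighbors, which matches the intent of emphasizing the genres the user prefers.</p>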
    </sec>
    <sec id="sec-7">
      <title>CONCLUSION AND OUTLOOK</title>
      <p>This paper presents two feasible and promising approaches
to provide top-N recommendations through collaborative
tagging. Moreover, it is demonstrated that the utilization
of user-generated tags might help to overcome the problem
of overspecialization in the emergent domain of TV
recommendation.</p>
      <p>For future work, we plan to conduct a thorough
evaluation of the proposed approach that also includes a user
survey. Furthermore, the similarity measurements can be
enhanced through lemmatization of tags in combination with
ontology matching between tag clouds.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>G.</given-names>
            <surname>Adomavicius</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Tuzhilin</surname>
          </string-name>
          .
          <article-title>Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions</article-title>
          .
          <source>IEEE Trans. on Knowl. and Data Eng.</source>
          ,
          <volume>17</volume>
          :
          <fpage>734</fpage>
          –
          <lpage>749</lpage>
          ,
          <year>June 2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>G.</given-names>
            <surname>Hölbling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Thalhammer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Kosch</surname>
          </string-name>
          .
          <article-title>Content-based tag generation to enable a tag-based collaborative TV-Recommendation System</article-title>
          .
          <source>In 8th Int'l Conf. on Interactive TV&amp;Video</source>
          , pages
          <fpage>273</fpage>
          –
          <lpage>282</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>T.</given-names>
            <surname>Segaran</surname>
          </string-name>
          .
          <article-title>Programming collective intelligence</article-title>
          .
          <source>O'Reilly</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Sen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vig</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Riedl</surname>
          </string-name>
          .
          <article-title>Tagommenders: connecting users to items through tags</article-title>
          .
          <source>In 18th Int'l Conf. on World Wide Web</source>
          , pages
          <fpage>671</fpage>
          –
          <lpage>680</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>K. H. L.</given-names>
            <surname>Tso-Sutter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. B.</given-names>
            <surname>Marinho</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Schmidt-Thieme</surname>
          </string-name>
          .
          <article-title>Tag-aware recommender systems by fusion of collaborative filtering algorithms</article-title>
          .
          <source>In Proc. of the 2008 ACM symposium on Applied computing, SAC '08</source>
          , pages
          <fpage>1995</fpage>
          –
          <lpage>1999</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>