Interactive News Video Recommendation: An Example System

Frank Hopfgartner
International Computer Science Institute
1947 Center Street, Suite 600
Berkeley, CA, 94704
fh@icsi.berkeley.edu

ABSTRACT
This position paper introduces a recommender system which has been developed to study research questions in the field of news video recommendation and personalization. The system is based on semantically enriched video data and can be seen as an example system that allows research on semantic models for adaptive interactive systems.

1.    INTRODUCTION
In recent years, the amount of multimedia content available to users has increased exponentially. This phenomenon has come along with (and to a large extent is the consequence of) a rapid development of tools, devices, and social services which facilitate the creation, storage and sharing of personal multimedia content. A new landscape for business and innovation opportunities in multimedia content and technologies has naturally emerged from this evolution, at the same time that new problems and challenges arise. In particular, the hype around social services dealing with visual content, such as YouTube or Dailymotion, has led to a rather scattered publishing of video data by users worldwide [8]. Due to the sheer volume of these data collections, there is a growing need to develop new methods that support users in searching and finding videos they are interested in.

Video retrieval is a specialization of information retrieval (IR), a research domain that focuses on the effective storage and access of data. In a classical information retrieval scenario, a user aims to satisfy their information need by formulating a search query. This action triggers a retrieval process which results in a list of ranked documents, usually presented in decreasing order of relevance. The activity of performing a search is called the information seeking process. A document can be any type of data accessible by a retrieval system. In the text retrieval domain, documents can be textual documents such as emails or websites. Image documents can be photos, graphics or other types of visual illustrations. Video documents consist of a set of audio-visual signals and accompanying metadata. The audio-visual features can be described by low-level feature descriptors, the main description standard being MPEG-7.

Retrieving videos using low-level features is, due to the Semantic Gap [18], a challenging approach. An analysis of state-of-the-art research on video retrieval indicates that content-based video retrieval performance is still far behind that of its textual counterparts [7]. An interesting approach to narrow this performance gap is to further enrich video documents using external data sources, called metadata. Blanken et al. [4] list three types of metadata: (1) Descriptive Data, (2) Text Annotations and (3) Semantic Annotation. All approaches aim to provide annotations in textual form that make it possible to bridge the Semantic Gap. Fernández et al. [9], for instance, have shown that ontology-based search models that exploit semantic annotations can outperform classical information retrieval models at a web scale. The advantage of these models is that external knowledge is used to set the content into its semantic context.

In [10], we introduced a news video recommender system which relies on such semantic annotations. The system captures daily broadcast news and segments the bulletins into semantically related news stories. DBpedia, a structured representation of Wikipedia [2], is exploited to set these stories into context. This semantic augmentation of news stories is used as the backbone of our news video recommendation. Our first hypothesis was that implicit relevance feedback can be used to create appropriate long-term user profiles. Implicit relevance feedback refers to user interactions that are performed implicitly during a search session, such as clicking a search result or spending time to read/view a document. We introduced an implicit user modeling approach which automatically captured users’ evolving information needs, representing interests in a dynamic user profile. Another research question was to study whether the selection of concepts in a generic ontology can be used for accurate news video recommendations. Therefore, we introduced our approach of exploiting DBpedia to set concepts of news stories into their semantic context. As our evaluation indicates, semantic recommendations can successfully be employed to improve the recommendation quality.
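
As a minimal sketch of how implicit relevance feedback might accumulate into a dynamic user profile, the following Python fragment adds evidence to every concept linked to a story whenever the user clicks or views it. The event names, the weights and the concept labels are illustrative assumptions; they are not taken from the system described in [10].

```python
from collections import defaultdict

# Assumed evidence values; the paper does not specify how interactions are weighted.
CLICK_WEIGHT = 1.0        # evidence contributed by clicking a result
SECONDS_PER_UNIT = 60.0   # assumed: one minute of viewing counts like one click


def feedback_weight(event, seconds=0.0):
    """Map an implicit interaction to a non-negative evidence value."""
    if event == "click":
        return CLICK_WEIGHT
    if event == "view":
        return seconds / SECONDS_PER_UNIT
    return 0.0  # unknown interactions contribute nothing


def update_profile(profile, story_concepts, event, seconds=0.0):
    """Add the interaction's evidence to every concept linked to the story."""
    weight = feedback_weight(event, seconds)
    for concept in story_concepts:
        profile[concept] += weight
    return profile


def top_interests(profile, k):
    """Return the k concepts with the highest accumulated evidence."""
    ranked = sorted(profile.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Hypothetical session: the labels stand in for concepts of a generic ontology.
profile = defaultdict(float)
update_profile(profile, ["Santorini", "Greece"], "click")
update_profile(profile, ["Greece", "Mediterranean_Sea"], "view", seconds=120)
```

Calling top_interests(profile, 2) then lists the two concepts with the most accumulated evidence. The sketch deliberately leaves out any discounting of older evidence.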

Within this work, we evaluated the underlying personalization technique, which takes advantage of an ontology; the impact of the adaptive presentation of the recommendations and search results, i.e. the interface design, has not


Copyright is held by the author/owner(s)
SEMAIS'11, Feb 13 2011, Palo Alto, CA, USA
been evaluated yet. Given a well-evaluated backend which relies on Semantic Web technologies, we argue in this position paper that the introduced personalization system can be seen as an exemplar system which allows for studying the research questions that are within the scope of this workshop. After introducing the research domain in Section 2, we illustrate in Section 3 how users can use the system to receive frequent news video recommendations that match their personal interests. In Section 4, we introduce the interface of the previously mentioned system, which is required to visualize semantically enriched video data. Section 5 discusses how this system can be used as an example to study semantic models for adaptive interactive systems.

2.    SEMANTIC NEWS VIDEO RECOMMENDATION
When interacting with a video retrieval system, users express their information need in search queries. The underlying retrieval engine then retrieves results relevant to the given queries. A necessary requisite for this IR scenario is to correctly interpret the users’ information need. As Spink et al. [19] indicate though, users very often are not sure about their information need. One problem they face is that they are often unfamiliar with the data collection, thus they do not exactly know what information they can expect from the corpus [17]. Further, Jansen et al. [12] have shown that video search queries are rather short, usually consisting of approximately three terms. Considering these observations, it is hence challenging to satisfy users’ information needs, especially when dealing with ambiguous queries. Issuing the short search query “Victoria”, for example, a user might be interested in videos about cities called Victoria (e.g. in Canada, United States or Malta), landmarks (e.g. Victoria Park in Glasgow or London), famous persons (e.g. Queen Victoria or Victoria Beckham) or other entities called Victoria. Without further knowledge, it is a demanding task to understand the users’ intentions. Interactive information retrieval aims at improving the classic information retrieval model by studying how to further engage users in the retrieval process, in a way that the system can have a more complete understanding of their information need. Thus, aiming to minimize the users’ efforts to fulfill their information seeking task, there is a need to personalize search. In a web search scenario, Mobasher et al. [14] define personalization as “any action that tailors the Web experience to a particular user, or a set of users”. Another popular name is adaptive information retrieval, which was coined by Belew [3] to describe the approach of adapting, over time, retrieval results based on users’ interests.

Most of the approaches that follow the interactive information retrieval model are based on relevance feedback techniques [17]. Relevance feedback (RF) is one of the most important techniques within the IR community. An overview of the large amount of research focusing on exploiting relevance feedback is given by Ruthven and Lalmas [16]. The principle of relevance feedback is to identify the user’s information need and then, exploiting this knowledge, to adapt search results. Rocchio [15] defines relevance feedback as follows: The retrieval system displays search results, users provide feedback by specifying keywords or judging the relevance of retrieved documents and the system updates the results by incorporating this feedback. The main benefit of this approach is that it simplifies the information seeking process, e.g. by releasing the user from manually reformulating the search query, which might be problematic especially when the user is not exactly sure what they are looking for or does not know how to formulate their information need. Two types of relevance feedback exist: explicit and implicit feedback. While explicit RF models rely on users permanently providing relevance information about documents they retrieved, implicit RF models rely on automatically mining user interaction data. The main advantage is that this approach relieves the user from providing explicit feedback.

Most personalization services rely on users explicitly specifying preferences. However, users tend not to provide constant explicit feedback on what they are interested in. In a long-term user profiling scenario, this lack of feedback is critical, since feedback is essential for the creation of such profiles.

Considering that each interface feature is designed to allow users to either retrieve or explore document collections, we hypothesized in [10] that the users’ interactions with these features can be exploited as implicit relevance feedback. We introduced a news video recommender system which automatically generates personalized multimedia news that cover topics of the users’ long-term interests.

Defining the technical conditions for such recommender systems, we argued that the creation of a private news video collection is required, consisting of up-to-date news bulletins from different broadcasting stations. Further, we argued that semantic web technology can be exploited to link concepts in the news broadcasts and suggested a categorization of stories into broad news categories. From a user profiling point of view, these links and categories can be of high value to recommend semantically related transcripts, hence creating a semantic-based user profile. For example, a user could show interest in a story about the sunset at the Greek island Santorini. The story transcript might contain the following sentence:

    “This is Peter Miller, reporting live from Santorini, Greece, where we are just about to witness one of the most magnificent sunsets of the decade. [...]”.

If the same user enjoys travel with emphasis on warm Mediterranean sites, he/she might also be interested in a report about the Spanish island Majorca. For example, imagine the following story:

    “Just as every year, thousands of tourists enjoy their annual sun bath here in Majorca. [...]”.

An interesting research question is how to identify whether this story matches the user’s interests. Lioma and Ounis [13] argue that the semantic meaning of a text is mostly expressed by nouns and foreign names, since they carry the highest content load. Indeed, most adaptation approaches rely on these terms to personalize retrieval results, e.g. by performing a simple query expansion. The two example stories, however, do not share similar terms. A personalization


technique exploiting the terms only would hence not be able to recommend the second story. However, linking the concepts of the transcripts using DBpedia reveals the semantic context of both stories. It becomes evident that both stories are about islands in the Mediterranean Sea. Exploiting this link could hence satisfy the user’s interest in warm Mediterranean sites. We therefore proposed to set news broadcasts into their semantic context by exploiting the large pool of linked concepts provided by DBpedia.

Having established a semantically annotated data collection, the recommender system can be operated on a regular basis to retrieve news stories that match the user’s interests. In the next section, we describe a typical use case that illustrates the use of the exemplar system.

3.    USE-CASE SCENARIO
In the previous section, we provided a brief summary of the research challenges that have been tackled in [10]. Users can interact with this system on a regular basis, e.g. over several weeks, to satisfy their information need, allowing for longitudinal user studies where the system can be evaluated. The following example depicts a typical use-case scenario:

    “Imagine a user who is interested in multiple news topics. They registered with a news recommender system with a unique identifier. For a period of several months, they log into the system, which provides them access to the latest news video stories of the day. On the system’s graphical interface, they have a list of the latest stories which have been broadcast on two national television channels. They now interact with the presented results and log off again. On each subsequent day, they log in again and continue the above process.”

In this scenario, a user frequently uses the system to gather the latest news. The interface has been designed to adapt its content based on users’ personal interests by employing the semantic context of the data collection. Each time he/she interacts with the video documents which have been displayed by the graphical user interface, he/she leaves a “semantic fingerprint” of their interests. Based on this fingerprint, more video documents are identified by exploiting the semantic link between the video documents in the collection. Hence, each time the user interacts with retrieval results, other related videos are identified and displayed. A long-term user study focusing on evaluating the performance of different recommendation techniques has been introduced in [11].

While this evaluation is focused on the recommendation techniques, a thorough evaluation of the interface has not been done yet. An overview of the interface is given in the next section.

4.    INTERFACE DESIGN
Figure 1 shows a screenshot of the adaptive news video retrieval interface which was used within the study. It can be split into three main areas: Search queries can be entered in the search panel on top, results are listed on the right side and a navigation panel is placed on the left side of the interface. When logging in, the latest news will be listed in the results panel. Search results are listed based on their relevance to the query. Since we are using a news corpus, however, users can re-arrange the results in chronological order with the latest news listed first. Each entry in the result list is visualized by an example key frame and a text snippet of the story’s transcript. Keywords from the search query are highlighted to ease the access to the results. Moving the mouse over one of the key frames shows a tool tip providing additional information about the story. A user can get additional information about the result by clicking on either the text or the key frame. This will expand the result and present additional information including the full text transcript, broadcasting date, time and channel and a list of extracted named entities. In the example screenshot, the third search result has been expanded. The shots forming the news story are represented by animated key frames of each shot. Users can browse through these animations either by clicking on the key frame or by using the mouse wheel. This action will center the selected key frame and surround it by its neighboring key frames. The user’s interactions with the interface are exploited to identify multiple topics of interest. On the left hand side of the interface, these interests are presented by different categories, i.e. those news categories that the user showed interest in during previous search sessions.

Summarizing, the interface provides access to different news categories in which the user showed interest. These interests can adapt over time, i.e. when a user shows interest in a certain news aspect right now, this aspect might already be irrelevant in a few days. Imagine, for example, a user who has shown high interest in any news regarding the FIFA Soccer World Cup. Just a few days after the end of the tournament, the user’s interest might drop to a minimum again. Our interface serves this evolving need by automatically updating the categories in which the user showed the most interest during the last sessions. The evolving interest is modeled by applying the Ostensive Model [6], which provides a decay function that assigns a higher weighting to more recent user interests.

5.   DISCUSSION AND CONCLUSION
The above description reveals that the interface has been designed to visualize news videos that match users’ interests. The categorization of these interests is highly user-centric. The interface adapts its content, i.e. both categories on the left hand side and news videos on the right hand side, based on the users’ previous interactions. Even though the recommendation technique relies on interlinked data, the interface itself does not support filtering or browsing the data accordingly.

As mentioned before, this constraint is due to the different focus of the research, which was aiming at studying recommendation techniques rather than adaptive interface designs. Nevertheless, given the support of semantically enriched video data, we argue that the system can be seen as an example framework which enables the study of such interface features. Example improvements include visualizing story interlinking by using a hyperbolic tree, as has been


introduced by Bürger et al. [5]. In their Smart Content Factory, each document in the index has been enriched with semantic information, i.e. places mentioned in the transcript are matched with a generic geography thesaurus. Such a tree would allow users to browse the video collection based on the semantic content of each video. Another improvement could be to provide thesaurus-supported query auto-completion features as shown by Amin et al. [1]. This would allow users to get an idea about the collection based on the query suggestions.

Figure 1: News Video Recommender Interface

Acknowledgment
The author was supported by a fellowship within the Postdoc-Program of the German Academic Exchange Service (DAAD).

6.    REFERENCES
 [1] A. Amin, M. Hildebrand, J. van Ossenbruggen, V. Evers, and L. Hardman. Organizing suggestions in autocompletion interfaces. In ECIR’09: Proceedings of the 31st European Conference on IR Research, ECIR 2009, Toulouse, France, pages 521–529, 2009.
 [2] S. Auer, C. Bizer, G. Kobilarov, J. Lehmann, R. Cyganiak, and Z. G. Ives. DBpedia: A Nucleus for a Web of Open Data. In Proc. 6th Int. Semantic Web Conf., pages 722–735. Springer Berlin / Heidelberg, 11 2007.
 [3] R. K. Belew. Adaptive information retrieval: using a connectionist representation to retrieve and learn about documents. SIGIR Forum, 23(SI):11–20, 1989.
 [4] H. M. Blanken, A. P. de Vries, H. E. Bok, and L. Feng. Multimedia Retrieval. Springer Verlag, Heidelberg, Germany, 1 edition, 2007.
 [5] T. Bürger, E. Gams, and G. Güntner. Smart content factory: assisting search for digital objects by generic linking concepts to multimedia content. In Proc. HT, pages 286–287. ACM, 2005.
 [6] I. Campbell and C. J. van Rijsbergen. The ostensive model of developing information needs. In Proc. Library Science, pages 251–268, 1996.
 [7] M. G. Christel. Establishing the utility of non-text search for news video retrieval with real world users. In MULTIMEDIA ’07: Proceedings of the 15th international conference on Multimedia, pages 707–716, New York, NY, USA, 2007. ACM.
 [8] S. J. Cunningham and D. M. Nichols. How people find videos. In Proc. 8th ACM/IEEE-CS Joint Conference on Digital libraries, pages 201–210, New York, NY, USA, 2008. ACM.
 [9] M. Fernández, V. López, M. Sabou, V. Uren, D. Vallet, E. Motta, and P. Castells. Using TREC for cross-comparison between classic IR and ontology-based search models at a Web scale. In SemSearch’09, 4 2009.
[10] F. Hopfgartner and J. M. Jose. Semantic user modelling for personal news video retrieval. In MMM, pages 336–346, 2010.
[11] F. Hopfgartner and J. M. Jose. Semantic user profiling techniques for personalised multimedia recommendation. Multimedia Systems, 16(4):255–274, 2010.
[12] B. J. Jansen, A. Goodrum, and A. Spink. Searching for multimedia: analysis of audio, video and image web queries. World Wide Web, 3(4):249–254, 2000.
[13] C. Lioma and I. Ounis. Examining the Content Load of Part of Speech Blocks for Information Retrieval. In ACL’06: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, Australia, 2006.
[14] B. Mobasher, R. Cooley, and J. Srivastava. Automatic personalization based on web usage mining. Communications of the ACM, 43(8):142–151, 2000.
[15] J. J. Rocchio. Relevance feedback in information retrieval. In G. Salton, editor, The SMART retrieval system: experiments in automatic document processing, pages 313–323, Englewood Cliffs, USA, 1971. Prentice-Hall.
[16] I. Ruthven and M. Lalmas. A survey on the use of relevance feedback for information access systems. The Knowledge Engineering Review, 18(2):95–145, 2003.
[17] G. Salton and C. Buckley. Improving retrieval performance by relevance feedback. Readings in information retrieval, pages 355–364, 1997.
[18] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain. Content-Based Image Retrieval at the End of the Early Years. IEEE Trans. on Pattern Analysis and Machine Intelligence, 22(12):1349–1380, 2000.
[19] A. Spink, H. Greisdorf, and J. Bateman. From highly relevant to not relevant: examining different regions of relevance. Inf. Process. Manage., 34(5):599–621, 1998.

