=Paper= {{Paper |id=None |storemode=property |title=The Importance of Service and Genre in Recommendations for Online Radio and Television Programmes |pdfUrl=https://ceur-ws.org/Vol-793/womrad2011_paper5.pdf |volume=Vol-793 }} ==The Importance of Service and Genre in Recommendations for Online Radio and Television Programmes== https://ceur-ws.org/Vol-793/womrad2011_paper5.pdf
The Importance of Service and Genre in Recommendations for Online Radio and Television Programmes

Ian Knopke
British Broadcasting Corporation
201 Wood Lane, White City
London, UK
ian.knopke@gmail.com

ABSTRACT
The BBC iPlayer is an online delivery system for both radio and television content [1]. One of the unique features of the iPlayer is that programming is based around a seven day “catch-up” window. This paper documents some early investigations into features that may be used to produce quality recommendations for that system. The two features explored here, services and genre, are partly unique to BBC metadata, and are available for all programmes in the schedule. Services are roughly equivalent to channels or stations, while genres are editorially-assigned categorisations of media content. Results of genre / service-based diversity are presented, as well as some simple recommenders based on these, and additional discussion of the topic and results.

Categories and Subject Descriptors
H.4 [Information Systems Applications]: Miscellaneous

General Terms
Recommendations
Keywords
Recommendations, Broadcasting, Collaborative Filtering

WOMRAD 2011 2nd Workshop on Music Recommendation and Discovery, colocated with ACM RecSys 2011 (Chicago, US)
Copyright ©. This is an open-access article distributed under the terms of the Creative Commons Attribution License 3.0 Unported, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

1. INTRODUCTION
  The BBC iPlayer is an online delivery system for both radio and television content. Freely available for users within the geographical borders of the United Kingdom, it has been immensely successful and is used by millions of people each day. Unlike similar systems from commercial broadcasters, the BBC’s system is provided without advertising.
  One of the unique features of the iPlayer is that programming is based around a seven day “catch-up” window. Programming is first shown as a linear broadcast over normal transmission systems (radio and tv). Shortly thereafter the same content becomes available online, without charge, for a period of one week. The BBC maintains a near-perfect synchronicity between their linear and online broadcasting worlds, with over 95% of linear content available as “catch-up” internet television or radio on many different gaming consoles, integrated television platforms and mobile devices, as well as desktop and laptop computers. This synchronicity is completely integrated at both the metadata and transcoding levels, and across both radio and television.

Figure 1: BBC Programme Hierarchy

  A simplified diagram of the BBC programme metadata hierarchy is shown in Figure 1. The most important element for purposes of this paper is the episode. Episodes may be edited into different versions, and then sent out as transmitted broadcasts or made available online as ondemands. Episodes are grouped into series, under a particular brand. Brands are equivalent to what a user might find listed in a programme guide; common UK examples are EastEnders or Dr. Who (tv) or Desert Island Discs (radio).
  This paper documents an investigation into features that
may be used to produce quality recommendations. The two features explored here, services and genre, are partly unique to BBC metadata, but genres are also used in other music and media recommendation systems. While there is a large body of research into content-based features for music recommendation, it should be noted that the research presented here is entirely based on metadata, user histories, and the BBC programme hierarchy.

2. RECOMMENDATION SYSTEMS FOR THE BBC IPLAYER

2.1 Previous Issues and Possible Solutions
   Most recommendation systems generate recommendations by identifying similar users based on their recorded product choices, and then identifying products popular with these users that a new, similar user has not yet chosen. This is often referred to as collaborative filtering. Amazon and last.fm are two examples of such systems [4], and there are many variants [7, 5].
   These systems have proven to be effective in many commercial environments, leading to increased site traffic, sales, and an improved connection between individual users and the items that they are interested in. However, a system of this type was recently incorporated into the BBC iPlayer product and failed to produce similar behaviour, with a daily usage rate of approximately 4% of episode click-throughs. It is useful to examine some of the reasons why a technique that has been successful in other online contexts would perform so poorly in the case of the BBC. Particular issues with standard collaborative filtering systems include:

Dynamic Programme Schedule: Most online stores have a collection of items, such as books or songs, that are largely unchanging. While new items are often added, the amount of new material in relation to the majority of the collection is small enough that one can consider it to be relatively static. In practice, the relatively small number of new items added can be handled through weekly or daily recalculations of recommendations across the entire product set / user histories. In contrast, the list of iPlayer ondemands is primarily limited to a seven day availability window. The composition of programmes within this window changes dynamically, with new programmes being added at least every hour, and older ones expiring. The list of valid programmes, and effectively the viewer’s history of programmes to recommend against, only extends back a week. In effect, a completely new set of programmes is introduced every week, making it difficult to leverage the user’s play history towards generating new recommendations.

Cold Start Problem: In the classical collaborative filtering model, new items do not get recommended until enough users have discovered them through other means. This is really just another aspect of the sparse data problem, where there is not enough user history to make adequate recommendations [3]. New items are often introduced to users through mechanisms such as promotions, or through partial solutions such as artificially introducing non-personalised defaults based on average user ratings of all products [2]. In contrast, the short availability window of iPlayer programmes effectively meant that existing programmes never left this “build-up” phase of generating enough history with which to make effective recommendations. In most cases new programmes weren’t recommended until they were near the end of their availability windows. It is an extremely unfortunate situation to have the BBC place considerable effort into creating world-class content, and then not recommend it for the majority of that programme’s availability, or perhaps not at all.

Eliminating Old Content: In a typical collaborative filtering system, removing items requires recalculation of the mathematical relationships between all users and products (or just products to products). This is a computationally-expensive process, and consequently most online stores only remove products from their catalogues infrequently. If necessary, invalid results can be temporarily filtered until such time as a system-wide batch recalculation can be accomplished. In some cases removal of items can cause referential integrity (foreign key) issues, and many collaborative filtering systems apparently do not have mechanisms for removing content at all. This led to many programmes being recommended that were no longer available, and required the implementation of an expensive, secondary real-time filtering system to remove expired recommendations.

2.2    Possible Solutions
   One obvious but partial solution to these problems would be to filter the output results to only produce recommendations within the current time window. While this would alleviate the problem of producing expired recommendations, it does not solve other issues such as the cold start problem.
   Another approach, and the one explored here, is to instead find more general categorisations for programmes. If all programmes in the current schedule can be assigned to a set of static categories, these can then be used to record user histories against. The experiments in this paper explore the potential of two such features, services and genre, for use in storing cumulative user histories. These have the advantage of being assigned to all radio and television programmes in the BBC linear schedule and are readily available.

2.3    Services
   In linear broadcasting, a service is a particular station or channel such as “BBC One” or “6 Music”. In the world of online “catch-up” broadcasting, services tend to function more as an association of programmes that share some common heritage. The reasons for this are partly historical, but these divisions are also still valid from an audience perspective; the original channel structures were created to fulfill different audience requirements. For instance, “6 Music” tends to focus on very new music, while the “BBC Four” radio audience is more classically oriented. However, one of the advantages of online broadcasting is that audience members have the ability to switch between services more easily than ever before. When removed from the restraints of the linear schedule, one would expect to see users take advantage of this and new listening trends and patterns to be reflected in user play histories.

2.4    Genres
Table 1: Common BBC Services and Genres
  Services              Genres
  bbc 1xtra             childrens
  bbc 6music            religion and ethics
  bbc 7                 entertainment
  bbc london            drama
  bbc radio five live   factual
  bbc radio one         weather
  bbc radio three       music
  bbc radio two         sport
  bbc three             news
  bbc world service     comedy

Figure 2: Accumulated Daily Online Radio Activity
[Plot: Hourly Activity (Radio and Music); x-axis: Hour (0 to 25); y-axis: Activity (0 to 70000)]

Figure 3: Accumulated Daily Online Television Activity
[Plot: Hourly Activity (Television); x-axis: Hour (0 to 25); y-axis: Activity (0 to 1.8e+06)]

Table 2: Diversity of BBC Services
                         Radio / Music   TV
  Gini                   0.03            0.25
  Entropy                0.07            0.6
  Classification Error   0.025           0.19
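As a minimal sketch of how the three diversity measures reported in Table 2 (Gini impurity, entropy and classification error, equations (1) to (3)) could be computed for one user's history: the Python below assumes that a history is a non-empty list of category labels (services or genres) and that p(i|t) is the share of plays in history t that fall in category i. The function names and the sample histories are hypothetical and are not taken from the BBC system.

```python
import math
from collections import Counter


def class_proportions(history):
    """Return p(i|t) for every category i that occurs in history t."""
    counts = Counter(history)
    total = sum(counts.values())
    return [n / total for n in counts.values()]


def gini(history):
    """Equation (1): 1 minus the sum of squared proportions."""
    return 1.0 - sum(p * p for p in class_proportions(history))


def entropy(history):
    """Equation (2): unplayed categories are absent, so log2(0) never occurs."""
    return sum(-p * math.log2(p) for p in class_proportions(history))


def max_class_error(history):
    """Equation (3): 1 minus the proportion of the most played category."""
    return 1.0 - max(class_proportions(history))


def mean_over_users(histories, measure):
    """Average one measure over all users (assumed aggregation for Table 2)."""
    return sum(measure(h) for h in histories) / len(histories)


# Hypothetical histories: a loyal listener and one who splits evenly
# between two services.
loyal = ["bbc 6music"] * 4
split = ["bbc radio one", "bbc radio two", "bbc radio one", "bbc radio two"]

for name, measure in [("gini", gini), ("entropy", entropy), ("error", max_class_error)]:
    print(name, measure(loyal), measure(split), mean_over_users([loyal, split], measure))
```

A history confined to a single service scores 0 on all three measures, while an even split across two services scores 0.5, 1.0 and 0.5 respectively, so low averages such as the radio column of Table 2 indicate users who rarely leave one service.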




   Every BBC programme, both television and radio, has at least one genre assigned to it by an expert editorial staff member. These are used in a variety of marketing and promotional functions, as well as for programming, and are considered to be accurate in the broadcasting industry.
   A list of some common BBC services and genres is given in Table 1. While the properties of services and genre in relation to the linear broadcast audience are well known, similar information about online usage is not as well evaluated. Both features, however, are thought to be influential in the online domain. The value of these for recommending online programming remains relatively unevaluated in an empirical way.

3. EXPERIMENT
   We performed two kinds of experiments. First, the diversity of genres and services was tested. Based on this, four simple recommendation systems were evaluated for how close a match they were to a historical dataset.
   A month of iPlayer play history was made available from May 28 to June 25, 2010, consisting of approximately 18 million instances of user selected ondemands, with most shows lasting a half or full hour. Of this, approximately 17 million are televised selections and 1 million are radio. Daily online radio and television usage patterns, averaged over the time period, are given in Figures 2 and 3 respectively.
   After some discussion and initial exploration, it was decided to test these factors based on the diversity of user play history. To test the diversity of both services and genres, the play histories of 89,574 radio and 747,992 television users for the above time period were extracted, and the diversity of each user’s individual history was calculated. While other more complex diversity evaluation systems are available [6], three common measures of diversity were used: Gini impurity (1), entropy (2), and a standard classification error using the maximum value (3).

$$\mathrm{gini}(t) = 1 - \sum_{i=0}^{c-1} p(i \mid t)^2 \tag{1}$$

$$\mathrm{entropy}(t) = -\sum_{i=0}^{c-1} p(i \mid t) \log_2 p(i \mid t) \tag{2}$$

$$\mathrm{maxclasserror}(t) = 1 - \max_i \, p(i \mid t) \tag{3}$$

   Here p(i|t) is taken to be the fraction of plays in a user history t that fall in category i (a service or a genre), and c is the number of categories.
   Table 2 shows the averaged values for all users. For comparison purposes, similar figures were also calculated for the television users. These results clearly show that the majority of individual radio users concentrate around a very small number of services, with very little diversity. Television users, on the other hand, tend to have much more diverse service histories and do not appear to be as tied to particular services in the online world. Similar figures for genre are shown in Table 3 and to a lesser degree exhibit the same trends.

Table 3: Diversity of BBC Genres
                         Radio / Music   TV
  Gini                   0.15            0.37
  Entropy                0.32            0.93
  Classification Error   0.14            0.31

   Based on these results, four simple recommendation strategies were tested for recommending radio and music programmes. Recommendations were based on:
Figure 4: Markov Chain built from genres using BBC 3
[Graph; states: start_of_day, Comedy, Drama, Factual, Entertainment, Children’s, Sport, end_of_day]

Table 4: Results of Simple Recommenders
  Recommender         Matches (fraction)
  Last programme      .06
  Most common         .14
  Markov              .28
  Markov w/services   .34

   Also, the use of the original service has more of an impact in a radio context than in a television context. To be sure, there are significant differences between television and radio as media formats, and in many ways they are not comparable. Nevertheless, it is interesting to try. One possible interpretation is that television viewers have embraced the online experience to a greater extent than pure music or radio listeners. However, it may also be that radio users are more loyal in general to particular stations/brands than television users for other reasons besides just the music. For instance, online radio stations such as last.fm specialise in automatically generating curated collections of music. Disregarding any differences between their recommendations and those programmed by the human curators at the BBC, the main difference is the other elements such as presenters and news segments, and these may be what keeps listeners from changing services.
   Genre is also useable for radio recommendations, but genre as a single feature appears to work better for recommending television programmes.

5.   FUTURE DIRECTIONS
                                                                    While this was more on the order of an initial exploration
                                                                  of the problem space, the work presented here suggests a
                                                                  number of additional areas of research. It seems clear that
   • The genre of last programme
                                                                  time of day is also probably an important factor. We would
   • The most common genre in the user’s history                  also like to do better comparisons between the linear and
                                                                  online audience behaviours, as it seems that there is prob-
   • A Markov chain of genres derived from all linear broad-      ably a fair amount of common behaviour there. Also, the
     cast schedules                                               study should be expanded to include additional features.
   • Individual Markov chains of genres for each service
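The first two strategies in the list above need nothing beyond the user's own history. A minimal sketch, assuming a history is a chronological list of genre labels; the function names and the sample history are hypothetical:

```python
from collections import Counter


def recommend_last_genre(history):
    """Strategy 1: predict the genre of the most recently played programme."""
    return history[-1] if history else None


def recommend_most_common_genre(history):
    """Strategy 2: predict the genre that occurs most often so far.

    Ties go to the genre seen first, which is an arbitrary choice made here.
    """
    if not history:
        return None
    return Counter(history).most_common(1)[0][0]


# Hypothetical history of played genres, oldest first.
played = ["drama", "comedy", "comedy", "news"]
print(recommend_last_genre(played), recommend_most_common_genre(played))
```

Both functions return None for an empty history, which mirrors the cold start situation: with no plays recorded there is nothing to base a prediction on.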
                                                                  6.   REFERENCES
   The inclusion of Markov chains requires some explanation.      [1] BBC. iPlayer, 2011. http://www.bbc.co.uk/iplayer/.
The order of programmes is traditionally an important fac-        [2] J. S. Breese, D. Heckerman, and C. M. Kadie.
tor in the scheduling of linear broadcasts, with the intention        Empirical analysis of predictive algorithms for
of sustaining audience interest for longer time periods. Con-         collaborative filtering. In Proceedings of the 14th
sequently, a simple Markov chain based on successive genres           Conference on Uncertainty in Artificial Intelligence,
was constructed using the linear schedules. Effectively this          pages 43–52, Madison, WI, 1998. Morgan Kaufmann.
reduces to a probability distribution for each genre where the    [3] C.-N. Hsu, H.-H. Chung, and H.-S. Huang. Mining
most likely genre was compared to that of the next item in            skewed and sparse transaction data for personalized
the user’s history. Note that start and end-of-day states were        shopping recommendation. Machine Learning,
inserted to represent the 6 AM daily schedule changeover,             57(1-2):35–59, 2004.
as no connection is implied between days. In the case of the      [4] G. Linden, B. Smith, and J. York. Amazon.com
fourth recommender, individual Markov chains were built               recommendations: item-to-item collaborative filtering.
for each service and resolved using the service of the previ-         Internet Computing, IEEE, 7(1):76–80, 2003.
ous programme. As an example, Figure 4 shows a simple
                                                                  [5] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl.
Markov chain built on successive genres for BBC 3.
                                                                      Item-based collaborative filtering recommendation
   Each recommender was then tested on each user’s play
                                                                      algorithms. In Proceedings of the 10th international
histories in sequence and a tally of matches / failures kept.
                                                                      conference on World Wide Web, WWW ’01, pages
These were evaluated using the user’s past histories as simple
                                                                      285–295, 2001.
percentages, as shown in Table 4.
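The Markov-based recommenders and the match tally described above could be sketched as follows. This is an illustration under stated assumptions rather than the code used in the experiment: a schedule is assumed to be a list of genre labels per broadcast day, a user history is a chronological list of genres (or of (service, genre) pairs for the per-service variant), ties are broken by first occurrence, and the end-of-day state is skipped when predicting. All function names and sample data are hypothetical.

```python
from collections import Counter, defaultdict

START, END = "start_of_day", "end_of_day"


def build_chain(daily_schedules):
    """Count genre-to-genre transitions, wrapping each day in START/END states."""
    transitions = defaultdict(Counter)
    for day in daily_schedules:
        states = [START, *day, END]
        for current, following in zip(states, states[1:]):
            transitions[current][following] += 1
    return transitions


def most_likely_next(chain, genre):
    """Most frequent successor of `genre`, ignoring END; None if nothing is known."""
    followers = Counter({g: n for g, n in chain.get(genre, {}).items() if g != END})
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def build_service_chains(schedules_by_service):
    """One chain per service, keyed by service name."""
    return {name: build_chain(days) for name, days in schedules_by_service.items()}


def hit_rate(chain, history):
    """Share of consecutive plays where the predicted genre equals the next genre."""
    pairs = list(zip(history, history[1:]))
    if not pairs:
        return 0.0
    hits = sum(most_likely_next(chain, prev) == nxt for prev, nxt in pairs)
    return hits / len(pairs)


def hit_rate_by_service(chains, history):
    """As hit_rate, but the chain is picked by the previous programme's service."""
    pairs = list(zip(history, history[1:]))
    if not pairs:
        return 0.0
    hits = 0
    for (prev_service, prev_genre), (_, next_genre) in pairs:
        chain = chains.get(prev_service, {})
        hits += most_likely_next(chain, prev_genre) == next_genre
    return hits / len(pairs)


# Hypothetical two-day linear schedule for one service.
schedule = [
    ["comedy", "drama", "comedy", "drama", "factual"],
    ["comedy", "drama", "comedy"],
]
chain = build_chain(schedule)
print(most_likely_next(chain, "comedy"))
print(hit_rate(chain, ["comedy", "drama", "comedy", "factual"]))
```

Keeping only transition counts means each per-genre row can be normalised into a probability distribution when needed, while the prediction itself only requires the largest count.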
                                                                  [6] M. Slaney and W. White. Measuring playlist diversity
                                                                      for recommendation systems. In Proceedings of the
4. DISCUSSION                                                         ACM Workshop on Audio and Music Computing for
   While none of the strategies tested could be considered a          Multimedia, pages 22–32, Santa Barbara, CA, USA,
complete recommendation system, it is surprising that cor-            2006. ACM.
rect results can be obtained more than a third of the time        [7] X. Su and T. Khoshgoftaar. A survey of collaborative
using only these two simple features, and a knowledge of the          filtering techniques. Advances in Artificial Intelligence,
programmes found in the linear schedule. One possible way             2009:1–20, 2009.
to interpret this is that the online audience shares some of
the behaviour of the linear scheduling audience, even when
freed of the constraints of only having a single content choice
at any one time.