=Paper=
{{Paper
|id=None
|storemode=property
|title=The Importance of Service and Genre in Recommendations for Online Radio and Television Programmes
|pdfUrl=https://ceur-ws.org/Vol-793/womrad2011_paper5.pdf
|volume=Vol-793
}}
==The Importance of Service and Genre in Recommendations for Online Radio and Television Programmes==
Ian Knopke, British Broadcasting Corporation, 201 Wood Lane, White City, London, UK, ian.knopke@gmail.com

WOMRAD 2011 2nd Workshop on Music Recommendation and Discovery, colocated with ACM RecSys 2011 (Chicago, US). Copyright ©. This is an open-access article distributed under the terms of the Creative Commons Attribution License 3.0 Unported, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

===ABSTRACT===
The BBC iPlayer is an online delivery system for both radio and television content [1]. One of the unique features of the iPlayer is that programming is based around a seven day “catch-up” window. This paper documents some early investigations into features that may be used to produce quality recommendations for that system. The two features explored here, services and genre, are partly unique to BBC metadata, and are available for all programmes in the schedule. Services are roughly equivalent to channels or stations, while genres are editorially-assigned categorisations of media content. Results of genre / service-based diversity are presented, as well as some simple recommenders based on these, and additional discussion of the topic and results.

'''Categories and Subject Descriptors:''' H.4 [Information Systems Applications]: Miscellaneous

'''General Terms:''' Recommendations

'''Keywords:''' Recommendations, Broadcasting, Collaborative Filtering

===1. INTRODUCTION===
The BBC iPlayer is an online delivery system for both radio and television content. Freely available for users within the geographical borders of the United Kingdom, it has been immensely successful and is used by millions of people each day. Unlike similar systems from commercial broadcasters, the BBC’s system is provided without advertising.

One of the unique features of the iPlayer is that programming is based around a seven day “catch-up” window. Programming is first shown as a linear broadcast over normal transmission systems (radio and TV). Shortly thereafter the same content becomes available online, without charge, for a period of one week. The BBC maintains a near-perfect synchronicity between its linear and online broadcasting worlds, with over 95% of linear content available as “catch-up” internet television or radio on many different gaming consoles, integrated television platforms and mobile devices, as well as desktop and laptop computers. This synchronicity is completely integrated at both the metadata and transcoding levels, and across both radio and television.

A simplified diagram of the BBC programme metadata hierarchy is shown in Figure 1. The most important element for purposes of this paper is the episode. Episodes may be edited into different versions, and then sent out as transmitted broadcasts or made available online as ondemands. Episodes are grouped into series, under a particular brand. Brands are equivalent to what a user might find listed in a programme guide; common UK examples are EastEnders or Dr. Who (TV) or Desert Island Discs (radio).

Figure 1: BBC Programme Hierarchy

This paper documents an investigation into features that may be used to produce quality recommendations. The two features explored here, services and genre, are partly unique to BBC metadata, but genres are also used in other music and media recommendation systems. While there is a large body of research into content-based features for music recommendation, it should be noted that the research presented here is entirely based on metadata, user histories, and the BBC programme hierarchy.

===2. RECOMMENDATION SYSTEMS FOR THE BBC IPLAYER===

====2.1 Previous Issues and Possible Solutions====
Most recommendation systems generate recommendations by identifying similar users based on their recorded product choices, and then identifying products popular with these users that a new, similar user has not yet chosen. This is often referred to as collaborative filtering. Amazon and last.fm are two examples of such systems [4], and there are many variants [7, 5].

These systems have proven to be effective in many commercial environments, leading to increased site traffic, sales, and an improved connection between individual users and the items that they are interested in. However, a system of this type was recently incorporated into the BBC iPlayer product and failed to produce similar behaviour, with a daily usage rate of approximately 4% of episode click-throughs. It is useful to examine some of the reasons why a technique that has been successful in other online contexts would perform so poorly in the case of the BBC. Particular issues with standard collaborative filtering systems include:

'''Dynamic Programme Schedule''' Most online stores have a collection of items, such as books or songs, that are largely unchanging. While new items are often added, the amount of new material in relation to the majority of the collection is small enough that one can consider it to be relatively static. In practice, the relatively small number of new items added can be handled through weekly or daily recalculations of recommendations across the entire product set / user histories. In contrast, the list of iPlayer ondemands is primarily limited to a seven day availability window. The composition of programmes within this window changes dynamically, with new programmes being added at least every hour, and older ones expiring. The list of valid programmes, and effectively the viewer’s history of programmes to recommend against, only extends back a week. In effect, a completely new set of programmes is introduced every week, making it difficult to leverage the user’s play history towards generating new recommendations.

'''Cold Start Problem''' In the classical collaborative filtering model, new items do not get recommended until enough users have discovered them through other means. This is really just another aspect of the sparse data problem, where there is not enough user history to make adequate recommendations [3]. New items are often introduced to users through mechanisms such as promotions, or through partial solutions such as artificially introducing non-personalised defaults based on average user ratings of all products [2]. In contrast, the short availability window of iPlayer programmes effectively meant that existing programmes never left this “build-up” phase of generating enough history with which to make effective recommendations. In most cases new programmes weren’t recommended until they were near the end of their availability windows. It is an extremely unfortunate situation to have the BBC place considerable effort into creating world-class content, and then not recommend it for the majority of that programme’s availability, or perhaps not at all.

'''Eliminating Old Content''' In a typical collaborative filtering system, removing items requires recalculation of the mathematical relationships between all users and products (or just products to products). This is a computationally-expensive process, and consequently most online stores only remove products from their catalogues infrequently. If necessary, invalid results can be temporarily filtered until such time as a system-wide batch recalculation can be accomplished. In some cases removal of items can cause referential integrity (foreign key) issues, and many collaborative filtering systems apparently do not have mechanisms for removing content at all. This led to many programmes being recommended that were no longer available, and required the implementation of an expensive, secondary real-time filtering system to remove expired recommendations.

====2.2 Possible Solutions====
One obvious but partial solution to these problems would be to filter the output results to only produce recommendations within the current time window. While this would alleviate the problem of producing expired recommendations, it does not solve other issues such as the cold start problem.

Another approach, and the one explored here, is to instead find more general categorisations for programmes. If all programmes in the current schedule can be assigned to a set of static categories, user histories can then be recorded against these categories. The experiments in this paper explore the potential of two such features, services and genre, for use in storing cumulative user histories. These have the advantage of being assigned to all radio and television programmes in the BBC linear schedule and are readily available.

====2.3 Services====
In linear broadcasting, a service is a particular station or channel such as “BBC One” or “6 Music”. In the world of online “catch-up” broadcasting, services tend to function more as an association of programmes that share some common heritage. The reasons for this are partly historical, but these divisions are also still valid from an audience perspective; the original channel structures were created to fulfill different audience requirements. For instance, “6 Music” tends to focus on very new music, while the “BBC Four” radio audience is more classically oriented. However, one of the advantages of online broadcasting is that audience members have the ability to switch between services more easily than ever before. When removed from the restraints of the linear schedule, one would expect to see users take advantage of this, and new listening trends and patterns to be reflected in user play histories.
In contrast, 2.4 Genres Table 1: Common BBC Services and Genres Figure 3: Accumulated Daily Online Television Ac- Services Genres tivity bbc 1xtra childrens 1.8e+06 Hourly Activity (Television) bbc 6music religion and ethics 1.6e+06 bbc 7 entertainment 1.4e+06 bbc london drama bbc radio five live factual 1.2e+06 bbc radio one weather 1e+06 bbc radio three music Activity bbc radio two sport 800000 bbc three news 600000 bbc world service comedy 400000 200000 Figure 2: Accumulated Daily Online Radio Activity 0 70000 0 5 10 15 20 25 Hourly Activity (Radio and Music) Hour 60000 Table 2: Diversity of BBC Services 50000 Radio / Music TV Gini 0.03 0.25 Activity 40000 Entropy 0.07 0.6 Classification Error 0.025 0.19 30000 20000 the above time period were extracted, and the diversity of each user’s individual history was calculated. While other 10000 more complex diversity evaluation systems are available [6], 0 5 10 Hour 15 20 25 three common measures of diversity were used: Gini impu- rity, entropy (2), and a standard classification error using the maximum value (3). Every BBC programme, both television and radio, has at X c−1 least one genre assigned to it by an expert editorial staff gini(t) = 1 − p(i|t)2 (1) member. These are used in a variety of marketing and pro- i=0 motional functions, as well as for programming, and are con- sidered to be accurate in the broadcasting industry. X c−1 A list of some common BBC services and genres is given entropy(t) = − p(i|t)log2 p(i|t) (2) in Table 1. While the properties of services and genre in i=0 relation to the linear broadcast audience is well known, sim- ilar information about online usage is not as well evaluated. maxclasserror(t) = 1 − maxi p(i|t) (3) Both features, however, are thought to be influential in the online domain. The value of these for recommending online Table 2 shows the averaged values for all users. 
For com- programming remains relatively unevaluated in an empirical parison purposes, similar figures were also calculated for the way. television users. These results clearly show that the ma- jority of individual radio users concentrate around a very small number of services, with very little diversity. Tele- 3. EXPERIMENT vision users, on the other hand, tend to have much more We performed two kinds of experiments. First, the di- diverse service histories and do not appear to be as tied to versity of genres and services were tested. Based on this, particular services in the online world. Similar figures for four simple recommendation systems were evaluated for how genre are show in Table 3 and to a lesser degree exhibit the close a match they were to a historical dataset. same trends. A month of iPlayer play history was made available from Based on these results, four simple recommendation strate- May 28 to June 25, 2010, consisting of approximately 18 mil- gies were tested for recommending radio and music pro- lion instances of user selected ondemands, with most shows grammes. Recommendations were based on: lasting a half or full hour. Of this, approximately 17 million are televised selections and 1 million are radio. Daily online radio and television usage patterns, averaged over the time period are given in Figures 2 and 3 respectively. Table 3: Diversity of BBC Genres After some discussion and initial exploration, it was de- Radio / Music TV cided to test these factors based on the diversity of user play Gini 0.15 0.37 history. To test the diversity of both services and genres, a Entropy 0.32 0.93 play history of 89,574 radio and 747,992 television users for Classification Error 0.14 0.31 Also, the use of the original service has more of an impact Figure 4: Markov Chain built from genres using in a radio context than in a television context. 
To be sure, BBC 3 there are significant differences between television and radio Comedy as media formats, and in many ways are not comparable. Nevertheless, it is interesting to try. One possible inter- pretion is that television viewers have embraced the online Drama start_of_day experience to a greater extent than pure music or radio lis- teners. However, it may also be that radio users are more loyal in general to particular stations/brands than television Factual users for other reasons besides just the music. For instance, online radio stations such as last.fm specialise in automati- cally generating curated collections of music. Disregarding Entertainment any differences between their recommendations and those programmed by the human curators at the BBC, the main difference is the other elements such as presenters and news Children’s end_of_day Sport segments, and these may be what keeps listeners from chang- ing services. Genre is also useable for radio recommendations, but genre Table 4: Results of Simple Recommenders as a single feature appears to work better for recommending Last programme .06 television programmes. Most common .14 Markov .28 5. FUTURE DIRECTIONS Markov w/services .34 While this was more on the order of an initial exploration of the problem space, the work presented here suggests a number of additional areas of research. It seems clear that • The genre of last programme time of day is also probably an important factor. We would • The most common genre in the user’s history also like to do better comparisons between the linear and online audience behaviours, as it seems that there is prob- • A Markov chain of genres derived from all linear broad- ably a fair amount of common behaviour there. Also, the cast schedules study should be expanded to include additional features. • Individual Markov chains of genres for each service 6. REFERENCES The inclusion of Markov chains requires some explanation. [1] BBC. iPlayer, 2011. 
http://www.bbc.co.uk/iplayer/. The order of programmes is traditionally an important fac- [2] J. S. Breese, D. Heckerman, and C. M. Kadie. tor in the scheduling of linear broadcasts, with the intention Empirical analysis of predictive algorithms for of sustaining audience interest for longer time periods. Con- collaborative filtering. In Proceedings of the 14th sequently, a simple Markov chain based on successive genres Conference on Uncertainty in Artificial Intelligence, was constructed using the linear schedules. Effectively this pages 43–52, Madison, WI, 1998. Morgan Kauffman. reduces to a probability distribution for each genre where the [3] C.-N. Hsu, H.-H. Chung, and H.-S. Huang. Mining most likely genre was compared to that of the next item in skewed and sparse transaction data for personalized the user’s history. Note that start and end-of-day states were shopping recommendation. Machine Learning, inserted to represent the 6 AM daily schedule changeover, 57(1-2):35–59, 2004. as no connection is implied between days. In the case of the [4] G. Linden, B. Smith, and J. York. Amazon.com fourth recommender, individual Markov chains were built recommendations: item-to-item collaborative filtering. for each service and resolved using the service of the previ- Internet Computing, IEEE, 7(1):76–80, 2003. ous programme. As an example, Figure 4 shows a simple [5] B. Sarwar, G. Karypis, J. Konstan, and J. Reidl. Markov chain built on successive genres for BBC 3. Item-based collaborative filtering recommendation Each recommender was then tested on each user’s play algorithms. In Proceedings of the 10th international histories in sequence and a tally of matches / failures kept. conference on World Wide Web, WWW ’01, pages These were evaluated using the user’s past histories as simple 285–95, 2001. percentages, as shown in Table 4. [6] M. Slaney and W. White. Measuring playlist diversity for recommendation systems. In Proceedings of the 4. 
DISCUSSION ACM Workshop on Audio and Music Computing for While none of the strategies tested could be considered a Multimedia, pages 22–32, Santa Barbara, CA, USA, complete recommendation system, it is surprising that cor- 2006. ACM. rect results can be obtained more than a third of the time [7] X. Su and T. Khoshgoftaar. A survey of collaborative using only these two simple features, and a knowledge of the filtering techniques. Advances in Artificial Intelligence, programmes found in the linear schedule. One possible way 2009:1–20, 2009. to interpret this is that the online audience shares some of the behaviour of the linear scheduling audience, even when freed of the constraints of only having a single content choice at any one time.