Analysis of User-generated Content for Improving YouTube Video Recommendation

Michele Galli, Davide Feltoni Gurini, Fabio Gasparetti∗, Alessandro Micarelli, Giuseppe Sansonetti
Roma Tre University
Via della Vasca Navale 79 - Rome, 00146 Italy
{michele.galli,feltoni,gaspare,micarel,gsansone}@dia.uniroma3.it

ABSTRACT
Every day, video-sharing websites such as YouTube collect large amounts
of new multimedia resources. Comments left by viewers often provide
valuable information about the sentiments, opinions and tastes of users.
For this reason, we propose a novel re-ranking approach that takes this
information into consideration in order to provide better recommendations
of related videos. Early experiments indicate an improvement in the
recommendation performance.

Categories and Subject Descriptors
H.3.3 [Information Search and Retrieval]: [Information
Filtering]

Keywords
Recommender systems, Web 2.0, YouTube

Figure 1: The YouTube website with metadata and recommended videos highlighted.

1.    INTRODUCTION
   YouTube is the world's most popular web video community, used by
1 billion unique users worldwide each month¹. Four billion videos are
viewed per day, with 100 hours of new ones uploaded every minute. Sifting
through this large repository of multimedia resources poses unique
challenges for the user.
   The YouTube user interface provides, given the current video $l_{id}$,
a list of recommendations as shown in Fig. 1. YouTube selects those
recommendations based on an algorithm that considers signals from a
variety of sources, including the user's favorite, watched and liked
videos [4]. These signals are combined for ranking the list of related
videos compiled by monitoring what other people usually watch next. By
exploring this related-video graph, a candidate list is built.
Characteristics of the videos (e.g., views and ratings) and the
similarities of the videos with the history of videos watched by the user
are combined to rank the candidate resources. A trade-off between
relevance and diversity across categories builds up the related video
list $L_{id} = (l_1, l_2, \ldots, l_n)$. As a result, the user-generated
comments that are shown below the video are not taken into consideration.
Although these user interactions are often short and noisy, they can
convey valuable information about user interests, tastes and, more
generally, debate topics about the videos.
   Related video lists can host a large number of suggestions, i.e., up
to 40. Our hypothesis is that two videos may be related if they give rise
to similar reactions and sentiments from viewers. This sort of implicit
relationship between multimedia resources might improve the original
YouTube ranking in a way that better matches the user expectations. In
this paper we propose a re-ranking method that, for each video, generates
a new ordered list of the videos proposed by the traditional YouTube
recommender.

∗ Contact author email: gaspare@dia.uniroma3.it
¹ http://www.youtube.com/yt/press/statistics.html (Accessed: 2 July 2015)

Copyright is held by the author(s).
RecSys 2015 Poster Proceedings, September 16-20, 2015, Vienna, Austria.

2.   THE PROPOSED VIDEO RECOMMENDATION
   Given the video $l_{id}$, the YouTube Data API² allows us to retrieve
up to 1000 comments $C_{l_{id}} = \{c_1, c_2, \cdots\}$. The API also
provides the top 25 related videos. We filter out comments that are too
short and those containing obscene or profane language. A Bayesian
classifier trained on a subset of spam comments helps us to filter out
the less relevant content.
   A keyword-based approach [2] identifies the words that express a
sentiment, assigning each of them a score in [0, 1] for each of the
following dimensions: positivity, negativity, and objectivity. In
particular, given a comment $c_i \in C_{l_{id}}$, we sum up all the
positivity scores and then subtract the negativity ones. The resulting
normalized real value is encoded as a categorical feature by linearly
discretizing it into 5 intervals, so that each comment is assigned to one
of the following classes: very positive, positive, neutral, negative,
very negative. Those classes are also the five dimensions of a vector
space model, where the sentiment vector

$$\vec{v}^{(ss)}_{l_{id}} = (v_{1,id}, v_{2,id}, v_{3,id}, v_{4,id}, v_{5,id}) \qquad (1)$$

is calculated by summing up the occurrences of the very positive class
for the dimension $v_{1,id}$, positive occurrences for $v_{2,id}$,
neutral occurrences for $v_{3,id}$, and so forth. The same procedure is
followed for each video $l_j \in L_{id}$ by analyzing the set of comments
associated with $l_j$. We obtain $n$ vectors $\vec{v}^{(ss)}_{l_j}$ that
can be compared with $\vec{v}^{(ss)}_{l_{id}}$ by means of a cosine
similarity measure. The related video $l_j$ will thus have a
sentiment-based similarity $r^{(ss)}_{id,j} \in [0, 1]$.
   A second step extracts named entities (e.g., persons, locations) and
nouns from each comment by means of the Stanford Named-entity recognizer
and Part-of-Speech tagger, respectively. As with the previous procedure,
two vectors, $\vec{v}^{(ne)}_{l_j}$ and $\vec{v}^{(pos)}_{l_j}$, are
obtained for each video $l_j$ in $L^{(y)}_{id}$ by summing up the
contribution of the different comments. The two vectors
$\vec{v}^{(ne)}_{l_{id}}$ and $\vec{v}^{(pos)}_{l_{id}}$ are also
computed. The dimensions of the vectors are the distinct named entities
and nouns that appear in the analyzed user-generated data. A cosine
similarity measure assigns the scores $r^{(ne)}_{id,j}$ and
$r^{(pos)}_{id,j}$ between the videos $l_{id}$ and $l_j$ for the named
entity and noun comparisons, respectively.
   The last step calculates the final rank for the video $j$ by linearly
combining the three measures:

$$r_{id,j} = \alpha_1 r^{(ss)}_{id,j} + \alpha_2 r^{(ne)}_{id,j} + \alpha_3 r^{(pos)}_{id,j} \qquad (2)$$

where the three $\alpha$ values are set to the constant $\alpha_0$.

² https://developers.google.com/youtube/ (Accessed: 2 July 2015)

3.   EVALUATION
   A total of 8 persons were involved, mostly students of CS courses, all
regular users of the YouTube service. A Java application was developed to
assist them during the evaluation. We asked them to select 10 videos
$V = \{v_1, \ldots, v_{10}\}$ from their watch history, the
recommendations on the YouTube homepage or their subscribed channels. For
each video $v_i \in V$ the application obtains its related YouTube videos
$L_{v_i}$. A new ordered list $L'_{v_i}$ is built by downloading the
comments and running the proposed approach on them. A randomized list was
proposed to each user, who was asked to rate her interest in watching
each video on a five-level Likert scale. The normalized discounted
cumulative gain (nDCG) is evaluated both for the YouTube list $L_{v_i}$
and the re-ranked one $L'_{v_i}$. After computing the measure for each
video, we averaged the values to obtain an overall performance
evaluation. The YouTube recommender obtains an nDCG of 0.829, while the
proposed approach reaches 0.858, an improvement of 3.51%
(p-value < 0.05).

4.   RELATED WORKS
   To the best of our knowledge, our work makes the first attempt to
analyze user comments in the video recommendation domain. Shmueli et
al. [6] analyze users' co-commenting patterns to predict, for a given
user, suitable news stories that she is likely to comment on. A similar
approach, focused on news recommendation, is proposed by Messenger and
Whittle [5]. Chelaru et al. [3] explore the effectiveness of comments and
other social signals for the video retrieval task, that is, when a user
query must be processed.

5.   CONCLUSIONS AND FUTURE WORK
   Although the benefits obtained in re-ranking YouTube related videos
are limited, the statistical significance of the findings suggests that a
textual comment mining approach should be considered for future
investigations. Much of the computation can be implemented offline, while
the basic cosine similarity computation has limited complexity.
   More experiments are under way to better understand the relationship
between the kinds of opinions and sentiments expressed by the users and
the categories of the videos. By collecting a large training dataset, it
is possible to dynamically assign different weights to the three
parameters of Eq. 2. The temporal dimension is a further element to
consider [1]. There are many videos for which YouTube is not able to
compute a reliable set of related videos due to the scarcity of user
activities. It is interesting to understand whether the proposed approach
can be successfully applied even to new videos that have collected a
sufficient number of comments, partially addressing the data-sparsity
issue due to the scarcity of user activity records.

6.   REFERENCES
[1] G. Arru, D. Feltoni Gurini, F. Gasparetti, A. Micarelli, and
    G. Sansonetti. Signal-based user recommendation on Twitter. In Proc.
    of WWW '13, pages 941–944, Republic and Canton of Geneva,
    Switzerland, 2013.
[2] S. Baccianella, A. Esuli, and F. Sebastiani. SentiWordNet 3.0: An
    enhanced lexical resource for sentiment analysis and opinion mining.
    In N. Calzolari et al., editors, LREC. European Language Resources
    Association, 2010.
[3] S. Chelaru, C. Orellana-Rodriguez, and I. Altingovde. How useful is
    social feedback for learning to rank YouTube videos? World Wide Web,
    17(5):997–1025, 2014.
[4] J. Davidson, B. Liebald, J. Liu, P. Nandy, T. V. Vleet, U. Gargi,
    S. Gupta, Y. He, M. Lambert, B. Livingston, and D. Sampath. The
    YouTube video recommendation system. In Proc. of RecSys '10, pages
    293–296, New York, NY, USA, 2010. ACM.
[5] A. Messenger and J. Whittle. Recommendations based on user-generated
    comments in social media. In Privacy, Security, Risk and Trust
    (PASSAT) and 2011 IEEE SocialCom, pages 505–508, Oct 2011.
[6] E. Shmueli, A. Kagian, Y. Koren, and R. Lempel. Care to comment?:
    Recommendations for commenting on news stories. In Proc. of WWW '12,
    pages 429–438, New York, NY, USA, 2012. ACM.
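
As an illustration of the re-ranking procedure of Section 2 (Eqs. 1 and 2) and of the nDCG measure used in Section 3, the following is a minimal Python sketch, not the Java application used in the evaluation. Several details are assumptions made only for the example: the per-comment polarity is taken as already computed and normalized to [-1, 1], the five classes are equal-width intervals of that range, named entities and nouns are taken as already extracted, each video is a dictionary with the hypothetical keys `id`, `polarities`, `entities` and `nouns`, the three weights are all set to 1/3, and nDCG uses a linear gain with a base-2 logarithmic discount.

```python
import math
from collections import Counter

CLASSES = 5  # very positive, positive, neutral, negative, very negative


def sentiment_vector(polarities):
    """Eq. 1: count the comments falling into each sentiment class.

    `polarities` holds one score per comment, assumed to lie in [-1, 1].
    Index 0 is "very positive", index 4 is "very negative".
    """
    vec = [0] * CLASSES
    for p in polarities:
        # Equal-width discretization of [-1, 1] (an assumption), clamped.
        k = int((1.0 - p) / 2.0 * CLASSES)
        vec[max(0, min(k, CLASSES - 1))] += 1
    return vec


def cosine(u, v):
    """Cosine similarity for dense lists or sparse dict/Counter vectors."""
    if not isinstance(u, dict):
        u = dict(enumerate(u))
    if not isinstance(v, dict):
        v = dict(enumerate(v))
    dot = sum(w * v.get(k, 0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def features(video):
    """Sentiment, named-entity and noun vectors of one video."""
    return (sentiment_vector(video["polarities"]),
            Counter(video["entities"]),
            Counter(video["nouns"]))


def rerank(current, related, alphas=(1 / 3, 1 / 3, 1 / 3)):
    """Eq. 2: order the related videos by combined similarity to `current`."""
    ref = features(current)
    scored = []
    for video in related:
        sims = [cosine(a, b) for a, b in zip(ref, features(video))]
        score = sum(alpha * s for alpha, s in zip(alphas, sims))
        scored.append((score, video["id"]))
    # Highest score first; the sort is stable, so ties keep the input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [video_id for _, video_id in scored]


def ndcg(ratings):
    """nDCG of a ranked list of graded ratings (e.g. Likert values)."""
    def dcg(values):
        return sum(r / math.log2(i + 2) for i, r in enumerate(values))
    ideal = dcg(sorted(ratings, reverse=True))
    return dcg(ratings) / ideal if ideal else 0.0


current = {"id": "a", "polarities": [0.9, 0.8, 0.1],
           "entities": ["Rome"], "nouns": ["pizza", "recipe"]}
related = [
    {"id": "b", "polarities": [-0.9, -0.7],
     "entities": ["Paris"], "nouns": ["car"]},
    {"id": "c", "polarities": [0.9, 0.7, 0.0],
     "entities": ["Rome"], "nouns": ["pizza"]},
]
print(rerank(current, related))  # ['c', 'b']
```

In this toy input, video `c` shares the sentiment profile, the entity and one noun of the current video, so it moves ahead of `b`, whose three similarities are all zero. Because the count vectors are non-negative, each cosine value lies in [0, 1], as stated for the three similarity scores in Section 2.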