=Paper=
{{Paper
|id=Vol-2283/MediaEval_18_paper_12
|storemode=property
|title=Predicting the Interest in News Based on Image Annotations
|pdfUrl=https://ceur-ws.org/Vol-2283/MediaEval_18_paper_12.pdf
|volume=Vol-2283
|authors=Alexandru Ciobanu,Andreas Lommatzsch,Benjamin Kille
|dblpUrl=https://dblp.org/rec/conf/mediaeval/CiobanuLK18
}}
==Predicting the Interest in News Based on Image Annotations==
Alexandru Ciobanu (Technische Universität Berlin, alexandru.ciobanu@campus.tu-berlin.de), Andreas Lommatzsch (DAI-Labor, TU Berlin, andreas@dai-lab.de), Benjamin Kille (DAI-Labor, TU Berlin, benjamin.kille@dai-labor.de)

MediaEval ’18, 29-31 October 2018, Sophia Antipolis, France. © 2018 Copyright held by the owner/author(s).

===ABSTRACT===
In recent years, the World Wide Web has changed from text-focused web pages to multimedia sources featuring photos, videos, and audio. The worldwide growth of broadband connections has facilitated this trend and supports the spread of user-generated content. Navigating and finding interesting content has become a difficult challenge. In this paper, we present approaches which use visual features to predict how interesting a news article will be. This task is part of the NewsREEL Multimedia challenge. The challenge provides a large-scale data set of news items, images, and interactions. We implement a recommender system which can distinguish interesting articles from irrelevant ones based on image features. We evaluate the system’s throughput and predictions. We explain our insights and outline ideas to apply the gained knowledge in additional domains.

===KEYWORDS===
Multimedia, News, Recommender Systems, Image Analysis

===1 INTRODUCTION===
The number of documents and news articles published on the World Wide Web has increased dramatically. Users struggle to find relevant items. Recommender systems support users by reducing information overload. They analyze users’ behavior toward items and derive patterns to determine the most relevant items. Collaborative filtering and content-based filtering have become the most widely used algorithms for recommender systems.

Multimedia content (e.g. photos, videos, and audio) permeates our everyday lives. More and more services emerge that enable us to share photos and videos. Still, research on recommender systems has yet to leverage multimedia content. This work contributes to this effort by focusing on the use of image data for recommending news. In particular, we use methods which automatically determine fitting descriptors for images. The task asks us to estimate how interesting freshly published news articles will become. The evaluation setting equates interestingness with popularity due to the lack of user profiles. Hence, we focus on non-personalized recommender systems. We hypothesize that images play a decisive role as they capture users’ attention. Thus, we use image annotations to implement an estimator.

The remainder of the paper is structured as follows: In Sec. 2 we recapitulate the scenario. Sec. 3 discusses related work. We present the approach in Sec. 4. Subsequently, Sec. 5 illustrates the evaluation results. Finally, Sec. 6 details our findings and gives an outlook on future research.

===2 PROBLEM DESCRIPTION===
Several domains require estimating items’ relevance based on images. In this work, we address the task defined by NewsREEL Multimedia [7]. We determine the most relevant news items based on the multimedia dataset provided by the task organizers. The data include news articles, images displayed next to them, and interactions with readers. We report the evaluation metrics precision at ten (Prec@10) as well as precision at the top ten percent (Prec@10%). We consider each news portal (“domain”) independently. More details can be found in [7].

===3 RELATED WORK===
Recommender systems support users in finding the most interesting information. Traditionally, recommender systems analyze user profiles and provide recommendations based on the similarity in user behavior (“Collaborative Filtering” [4, 5]). On the World Wide Web, users can access most websites anonymously because the sites forgo login procedures. As a result, systems lack access to comprehensive user profiles. They rely on session-based approaches or content-based filtering instead.

Item-based recommender algorithms correlate item features with user feedback, which is taken to indicate the interest in the items. Item features can be defined based on the item content. Typically, text-mining approaches or semantic algorithms, which describe the item based on ontologies, are used to obtain the item features [6, 8].

Reduced computational costs have facilitated deep learning approaches for recognizing patterns in images. Deep learning frameworks such as TensorFlow [1] or Keras [2], trained on large image collections, try to automatically identify concepts in images and to label images with meaningful terms. The quality of the image annotations depends on the concrete scenario and the size of the training dataset.

The use of automatically computed image features for news recommender systems is still a topic for future research. Several case studies [3] suggest that there is potential for developing useful recommender systems based on visual image features. This motivates us to implement new recommendation algorithms with image features. The subsequent sections explain our approach and the implementation.

===4 APPROACH===
We consider only the images displayed next to the news items. We ignore additional meta-data or textual features such as text snippets or headlines. In the first step, we annotate the images. We use Google Vision, Google’s image annotation service, to ensure reliable labels. We annotate all images provided by the dataset in NewsREEL Multimedia. Google Vision outputs a list of labels and their probabilities. We use the five most likely labels.

Having inspected the image annotations, we recognized the need to process the labels further. Many labels exhibited an overly fine-grained level of detail. Consequently, we have trimmed the labels to the first word. For instance, “football equipment and supplies” has become “football”.

Our approach assumes that the labels represent the key information to estimate how exciting news items are. We use the impression information in the training dataset, available for weeks one to three and six to eight, to train an estimator. In other words, we calculate the number of impressions for each label. Some labels appear in more articles than others. Still, readers’ preferences remain uncertain. Thus, we normalize the labels’ weights, obtaining the average number of impressions per label. As a result, we get three figures for each label: the total number of impressions, the average number of impressions, and the number of articles linked to the label. We carry out the calculations for each news portal separately. This accounts for variations in topics among publishers. Furthermore, the publishers vary in the number of impressions, which could bias our features.

We estimate an item’s popularity based on the five labels assigned to its accompanying image. Subsequently, we sort the items according to their scores and submit the top items to the task organizers.

===5 EVALUATION===
Our model uses 94 000 news articles and 704 000 labels. The automatic annotation failed for about 1.4% of all articles. In some cases, Google Vision failed to provide labels. In other cases, labels exhibited a low probability. We successfully process 96% of all items contained in the test set. For the remaining 4%, either no label was found or the label did not exist in the training set. We explored a variety of hyper-parameters to optimize our estimates. For instance, we varied the number of labels and the weeks used to train the estimator. We have validated different settings. Eventually, we observed the best performance for the configuration with five labels and the entire training data. We have submitted these estimates to the task organizers. We obtained results regarding precision at ten, precision at the top ten percent, as well as average precision at the top ten percent. Table 1 lists the results for publishers 13554, 17614, and 39234.

{| class="wikitable"
|+ Table 1: Evaluation results for our approach grouped by news portals and weeks. Columns refer to precision at ten (P@10), precision at the top ten percent (P@10%), and average precision at the top ten percent (AP@10%).
! rowspan="2" | Week
! colspan="3" | domain 13554
! colspan="3" | domain 39234
! colspan="3" | domain 17614
|-
! P@10 !! P@10% !! AP@10% !! P@10 !! P@10% !! AP@10% !! P@10 !! P@10% !! AP@10%
|-
| 04 || 0.70 || 0.63 || 0.58 || 0.10 || 0.14 || 0.11 || 0.10 || 0.25 || 0.21
|-
| 10 || 0.70 || 0.56 || 0.51 || 0.00 || 0.14 || 0.10 || 0.00 || 0.28 || 0.19
|-
| 11 || 0.70 || 0.58 || 0.51 || 0.00 || 0.15 || 0.10 || 0.00 || 0.27 || 0.19
|-
| 12 || 0.70 || 0.54 || 0.52 || 0.10 || 0.17 || 0.11 || 0.00 || 0.27 || 0.19
|-
| avg. || 0.70 || 0.58 || 0.53 || 0.05 || 0.15 || 0.11 || 0.03 || 0.27 || 0.20
|}

The results show that the prediction quality depends strongly on the news portal. Our approach performs very successfully for publisher 13554. Our method achieves 70% Precision@10 and 58% Precision@10%. Analyzing the model in detail, we find that photos of German car brands and items comparing different cars are popular on this domain; articles without car exterior photos (e.g. portraits, buildings, and cockpit designs) get only a small number of impressions.

For the domain 17614, our approach outperforms the baseline [7] but reaches a lower precision than for portal 13554. Analyzing the annotations most important for classifying the items on website 17614, we find that images annotated with police and transportation are popular in this domain.

For the domain 39234, the approach performs similarly to the baseline. The large variance in the observed prediction performance seems to indicate that the computed annotations are only suitable for predicting the popularity of items in certain domains. Moreover, the importance of images for the popularity of items may differ across the considered news portals.

===6 CONCLUSION===
In this paper, we have presented several approaches for estimating the interest in news items based on visual features. Results show that our approach outperforms the baseline. Still, textual features seem to contain more information than visual features. We have observed varying levels of performance depending on the publisher. For some publishers (e.g. 13554), visual features perform far above the baseline. For other publishers, differences remain small.

Our approach determines fitting descriptors for images. Thereby, we optimize the recommendations indirectly. We suppose that readers engage with concepts related to labels. Alternatively, we could hypothesize that readers react more strongly to the image itself than to the concept. If this hypothesis holds, we may be better off designing low-level image features.

We plan to extend this line of research. Currently, our system considers labels separately. We will develop a model for label categories which will allow us to improve the preprocessing. Besides, we will further analyze the labels for each domain. We expect manual inspection to provide valuable clues on how to improve performance.

===REFERENCES===
[1] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, M. Kudlur, J. Levenberg, R. Monga, S. Moore, D. G. Murray, B. Steiner, P. Tucker, V. Vasudevan, P. Warden, M. Wicke, Y. Yu, and X. Zheng. TensorFlow: A system for large-scale machine learning. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation, OSDI’16, pages 265–283, Berkeley, CA, USA, 2016. USENIX Association.
[2] F. Chollet et al. Keras. https://keras.io, 2015.
[3] F. Corsini and M. Larson. CLEF NewsREEL 2016: Image based Recommendation. In Working Notes of the 7th International Conference of the CLEF Initiative, Evora, Portugal. CEUR Workshop Proceedings, 2016.
[4] J. Herlocker, J. Konstan, A. Borchers, and J. Riedl. An algorithmic framework for performing collaborative filtering. In Proceedings of the 22nd International Conference on Research and Development in Information Retrieval (SIGIR’99), 1999.
[5] Y. Koren and R. Bell. Advances in Collaborative Filtering, pages 145–186. Springer US, Boston, MA, 2011.
[6] A. Lommatzsch. Semantic Movie Recommendations, chapter 5, pages 133–154. Advances in Computer Vision and Pattern Recognition. Springer International Publishing, Smart Information Systems edition, 2015.
[7] A. Lommatzsch, B. Kille, M. Larson, F. Hopfgartner, and L. Ramming. NewsREEL Multimedia at MediaEval 2018: News Recommendation with Image and Text Content. In Procs. of MediaEval 2018.
[8] P. Lops, M. Gemmis, and G. Semeraro. Content-based recommender systems: State of the art and trends. In Recommender Systems Handbook, chapter 3, pages 73–105. Springer, 2011.
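
The label processing and scoring described in Section 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The paper does not say how the per-label figures of an item's five labels are combined into one score, so the mean of the labels' average impressions is assumed here. All function names and the toy data are hypothetical.

```python
from collections import defaultdict


def trim_label(label):
    """Keep only the first word of an annotation label."""
    words = label.split()
    return words[0] if words else ""


def train_label_stats(articles):
    """Aggregate impressions per trimmed label for one news portal.

    `articles` is an iterable of (labels, impressions) pairs.
    Returns {label: (total_impressions, article_count, average_impressions)}.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for labels, impressions in articles:
        # A trimmed label counts once per article, even if two raw labels
        # collapse to the same first word.
        for label in {trim_label(raw) for raw in labels}:
            totals[label] += impressions
            counts[label] += 1
    return {
        label: (totals[label], counts[label], totals[label] / counts[label])
        for label in totals
    }


def score_item(labels, stats):
    """Assumed scoring rule: mean of the average impressions of known labels."""
    trimmed = {trim_label(raw) for raw in labels}
    known = [stats[label][2] for label in trimmed if label in stats]
    # Items whose labels never occurred in training get the lowest score.
    return sum(known) / len(known) if known else 0.0


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranked_ids[:k] if item in relevant_ids) / k


# Hypothetical training data for a single portal: (image labels, impressions).
train = [
    (["car", "sports car", "vehicle"], 900),
    (["car", "land vehicle"], 700),
    (["portrait", "face"], 100),
]
stats = train_label_stats(train)

# Hypothetical test items: item id -> image labels.
test_items = {
    "a": ["car", "land vehicle"],
    "b": ["portrait", "building"],
    "c": ["unknown thing"],
}
ranked = sorted(
    test_items, key=lambda item: score_item(test_items[item], stats), reverse=True
)
print(ranked)
print(precision_at_k(ranked, {"a", "c"}, 2))
```

Because the statistics are computed from the articles of one portal only, calling `train_label_stats` once per portal mirrors the per-publisher treatment in the text. For Prec@10 the cut-off `k` would be 10; for Prec@10% it would be derived from the number of test items.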