A Workflow for Cross Media Recommendations based on Linked Data Analysis

Thomas Köllmer¹, Emanuel Berndl², Thomas Weißgerber², Patrick Aichroth¹, and Harald Kosch²

¹ Fraunhofer Institute for Digital Media Technology IDMT, Ehrenbergstraße 31, 98693 Ilmenau, Germany
thomas.koellmer|patrick.aichroth@idmt.fraunhofer.de
² University of Passau, Chair for Distributed Multimedia Systems, Innstraße 43, 94032 Passau, Germany
berndl|weissger|kosch@dimis.fim.uni-passau.de

Abstract. The quality of content-based recommendation depends to a very high degree on the quality of the available metadata. We propose a workflow that combines novel cross media analysis platforms with linked data analysis to generate recommendations. The focus is on an “editor user story” that combines live analysis of content currently being created with a stored data backlog to select suitable content for article enrichment.

Keywords: Recommendation, Cross Media Analysis, Named Entity Recognition, Linked Data, Semantic Web

1 Introduction

Nowadays, a high percentage of (online) computer systems can be supported by recommender systems that assist the decision-making process of customers. One of the largest and best-known domains of recommenders is shopping. Customers are presented with masses of choices, while facing the problem of finding the item that fits them best. Recommenders work in the background to facilitate the process of finding relevant items for a given user. The recommender gathers information in order to generate a profile for every user (and possibly every item), which in turn can be used to determine which items to recommend. Multimedia items are very similar to shopping items in terms of supply and demand, so recommenders can also be used to facilitate various processes that involve finding multimedia content.
For example, knowledge discovery poses similar problems, as users are looking for content items that fit their current needs and fields of interest. The issue in this domain is not only content discovery: it is also more difficult to generate so-called features of items (their characteristics and properties), which are needed in order to compare items with each other. Extracting those features manually is cumbersome and too expensive in terms of time, a problem aggravated by the sheer amount of data. Automated processes can take over this step, but need careful setup in order to generate useful results.

To overcome the aforementioned shortcomings of recommendation in knowledge discovery, the contribution of this paper is the design of a workflow that:

– is based on linked data analysis results, possibly allowing the reuse of data from other problem domains,
– is placed in an autonomous platform, enabling automated analysis of inserted content,
– can be altered and adjusted to given use cases, and
– is kept cross-modal, allowing recommendations to be generated and consumed across different multimedia types.

The paper is structured as follows. Section 2 discusses related work in the fields of recommendation and multimedia platforms. After positioning the work, section 3 describes identified use cases and their requirements, then section 4 specifies the recommendation workflow in detail. Section 5 concludes this position paper by outlining ongoing and future work.

2 Related Work

The two main foundations of this work are the fields of Recommender Systems and Media Analysis Platforms. The next two sections show how the presented work fits into the vast number of proposed recommender systems and introduce two possible analysis platforms. The actual analysis components are considered part of the platform.

Recommender Systems

Ricci et al.
describe recommender systems in [10] as systems that “are software tools and techniques providing suggestions for items to be of use to a user”. While this definition is very broad, it sets the focus on the user and their expectations. In this sense, recommender systems are more than a shopping guide: they are useful in every setting in which a human has to select items from a large set and can do so faster with technical help.

The majority of recommender systems, and of the research on them, falls into two categories that differ in their source of data. The first approach is to observe user behaviour and draw conclusions from it, a technique called Collaborative Filtering. It assumes that user behaviour is similar within a certain user group. The second approach, Content Based Recommendation, relies on analysing the recommendation items, extracting certain features, and recommending similar items based on that analysis.

The approach discussed here is clearly content based: multimedia items will be processed by an analysis platform and suggested according to their semantic similarity. However, collaborative filtering techniques can be used to reorder the list or to prioritize found items.

A recent survey of recommender systems is provided by Bobadilla et al. in [3]. An overview of collaborative filtering techniques was compiled by Su et al. [12]; content based techniques are discussed by Pazzani and Billsus in [8].

Cross Media Analysis Platforms

The work proposed in this paper was conducted within the scope of the MICO project³, which aims to develop a multimodal analysis platform for various kinds of data. However, the proposed setting is not limited to MICO, but can be used with alternative frameworks as well. The MICO project provides a platform that helps to orchestrate different combinations of registered extractors.
As every extractor might produce its results in a different format, it is desirable to find a common denominator in order to make full use of the combined metadata and even to increase the degree of information through its recombination. One approach to achieve this is offered by the Semantic Web and its technologies for producing Linked Data (LD). Among those, the Resource Description Framework (RDF) [7] is the commonly known standard for producing metadata that allows semantic interlinking on the level of single resources and comprehensive querying with SPARQL [9].

The MICO Platform [11][5][2], utilised in our approach in combination with its MICO Metadata Model MMM [1], is an environment that allows discovering the hidden semantics of media in context by orchestrating sets of different components in a pipeline that jointly analyse content, each adding its bit of extra information to the final result.

³ http://cordis.europa.eu/project/rcn/111088_en.html

We will extend this context in order to design a recommendation engine as part of a MICO workflow pipeline. Consequently, the recommendation process and especially its results can be stored as metadata, making it possible to generate cross-media recommendations based on metadata results of various extractor combinations.

A similar semantic web-based approach is proposed by De Meester et al. [4] and Verborgh et al. [13]. They also tackle the problem of the ever-increasing masses of multimedia data and the accompanying costly or tedious task of multimedia retrieval. By supporting a framework that allows automated analysis of multimedia data, they create a rich metadata background for every multimedia item, which in turn allows users to find their desired content more quickly and efficiently. In addition, the process of annotating multimedia is automated and hence is no further burden on the user or the data publisher.
Their proposed framework meets state-of-the-art standards and analysis techniques and, in order to operate with an input base that is as broad and diverse as possible, is kept domain-agnostic by including multimedia analysis procedures as web services. The output follows Semantic Web standards, enabling it to be understood and interpreted in various places. When “injecting” a multimedia item, the platform makes use of the (possibly) already present metadata background. Then, it runs a three-step algorithm that can be re-iterated as often as needed, until the metadata no longer changes and no further increase in quality is achieved. First, the algorithm combines the results of the different analysis processes registered at the platform with the already present metadata. Then it determines which results need improvement, before deciding on a new analysis plan for the given multimedia item. This results in a rich and robust metadata background.

3 Cross media recommendation

As discussed in the previous section, content based recommendation depends to a large extent on the quality of the content analysis. Cross media analysis, i.e., combined analysis of different media types on the input side (e.g., video and corresponding user responses), is one promising approach to obtain the needed metadata quality.

The main user story we focus on in this report is the enrichment of journalistic articles with fitting media items, the so-called editor support use case. MICO covers further use cases: for example, a showcase partner, Zooniverse⁴, has a related use case in which crowd-sourced discussions about a given item can generate recommendations or pointers to other discussions and consequently to other items. In this case, both the items and the user discussions are analysed. The proposed recommendation workflow is designed to deal with both problems; the main difference lies in the analysis components used inside the platform.
Editor Support Use Case

In today’s web, it is vital practice to add (linked) content to an article, both to give the reader further information and to increase commitment to the site, and for search engine optimization. Independent of the motivation, an editor profits highly from a recommendation system that suggests fitting items while the new article is being written. On a high level, the related user story is:

As an editor, while I create or edit articles using WordPress, I want to automatically get related articles and videos that I might link to the article.

⁴ https://www.zooniverse.org/

[Figure 1 components: Content Crawler, Analysis Platform, LD Cache, LD Matching, Editing Platform, Recommendation; arrows numbered 1 to 7]

Fig. 1. Recommendation workflow combining stored results and live analysis to create recommendations. The crawled content will be forwarded to the analysis platform at any time to create background knowledge, while the current item is parsed immediately using the same analysis platform instance.

As a first step for generating recommendations, we implemented a pipeline designed to recommend videos or text comments to a given user writing her or his own comment. We base this on the general “meaning” or topic of both, which is determined by a named entity recognition (NER) component. The recommendation service exposes a REST API that can be used by a WordPress plugin to acquire the recommendation data.

4 Processing Workflow

Figure 1 depicts the general workflow for the proposed recommendation system. The involved components are described in the remainder of this section. Before discussing them, the data flow (as indicated by the numbered arrows) looks like this:

1 (Continuous process) A crawler feeds the domain-specific media items (news, videos, ...)
to the MICO analysis platform
2 (Continuous process) Analysis results are stored as RDF data, e.g., inside MICO’s Marmotta⁵, a Linked Data platform
3 An item gets into focus, e.g., someone is writing an article on a specific topic or a new user post is published somewhere
4 The editing platform feeds the item to the MICO platform
5 The analysis results are preprocessed by the LD Matching component
6 The LD Matching component queries the stored annotation results
7 The matching component calculates a similarity score and feeds relevant items back to the editing platform as a recommendation

Editing Platform

The term Editing Platform stands for a component that produces content that has to be analysed and matched to the cached analysis results. A good example is a content management system like WordPress or TYPO3. All major systems allow plugins to be integrated; the communication with the new service is therefore assumed to be handled by a plugin that communicates with the REST endpoint of the recommender service.

⁵ http://marmotta.apache.org/

Content Crawler

This workflow assumes that every use case has a defined subset of content that is supposed to be recommended. This might be an internal archive of media files, e.g., an in-house video collection, or public databases, e.g., YouTube or Wikimedia Commons. Licensing issues are out of scope of this paper; however, it should be noted that the task of storing and evaluating license information can be accomplished within the linked-data model as well. For the prototype implementation, the Copyright Aware Crawler⁶, developed within the CUbRIK⁷ project, is used. All crawled content is forwarded to the analysis platform for semantic indexing and further use in the recommendation process.
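The numbered data flow at the start of this section can be sketched end to end in a few lines of Python. This is only an illustration under strong assumptions, not the MICO implementation: `analyse`, `LDCache`, `recommend` and `KNOWN_ENTITIES` are hypothetical names, the toy keyword lookup stands in for the NER extractors, an in-memory dictionary replaces the RDF store, the REST layer is omitted, and Jaccard set overlap is swapped in as a placeholder for the similarity score.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical stand-in for the analysis platform: a real deployment would run
# NER (and e.g. speech-to-text) extractors and return RDF annotations instead.
KNOWN_ENTITIES = {"panda", "lion", "penguin", "flower", "tree"}


def analyse(text: str) -> frozenset[str]:
    """Return the known entity names that occur in the text (toy NER)."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return frozenset(words & KNOWN_ENTITIES)


@dataclass
class LDCache:
    """In-memory substitute for the linked data store (steps 1 and 2)."""
    annotations: dict[str, frozenset[str]] = field(default_factory=dict)

    def ingest(self, item_id: str, text: str) -> None:
        # The crawler hands over an item; its analysis result is cached.
        self.annotations[item_id] = analyse(text)


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Set overlap, used here only as a placeholder similarity score."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(draft: str, cache: LDCache, top_k: int = 3) -> list[tuple[str, float]]:
    """Steps 3 to 7: analyse the focused item, query the cache, rank matches."""
    focus = analyse(draft)
    scored = [(item_id, jaccard(focus, entities))
              for item_id, entities in cache.annotations.items()]
    relevant = [(item_id, score) for item_id, score in scored if score > 0.0]
    # Highest score first; ties are broken by item id for stable output.
    return sorted(relevant, key=lambda pair: (-pair[1], pair[0]))[:top_k]


cache = LDCache()
cache.ingest("video-1", "A panda and a lion at the zoo.")
cache.ingest("video-2", "A lion hunting near a tree.")
cache.ingest("video-3", "Penguin colonies in winter.")

recommendations = recommend("My article about the lion and the panda", cache)
print(recommendations)
```

In this toy run, `video-1` shares both entities with the draft and is ranked first, `video-2` shares one and follows, and `video-3` is dropped because it shares none.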
Analysis Platform & LD Cache

As described in section 2, a central part of the proposed workflow is an analysis platform that is able to extract the features needed for a content based recommendation and to output its results in a linked data format, so that the subsequent LD Matching step can profit from the additional semantics. The LD Cache component emphasizes the need of the Matching component to access precomputed analysis results. Depending on the architecture used, it can be integrated into the analysis platform, as is the case for MICO.

LD Matching

As described in section 2, current recommender systems apply metrics to describe the similarity of concepts, behaviours, users, items or related item data. We adopt this kind of content-based approach, basing the similarity of multimedia items on the NER linked data analysis produced by the MICO platform. We envision a component capable of computing a similarity score between stored videos and written comments. By looking into the “meaning” of given analysis results, the LD Matching component will try to identify the major topics of the associated resource. These will be used by a matching logic to gather relevant resources stored in the backing RDF store. To attain this goal, it will try to group named entities into categories weighted by importance. Finding fitting categorisations of the initial input can be interpreted as a mapping of the given named entities into a semantic type hierarchy. The co-occurrence of relations of instances of these types, as well as shared instances and similar graph structure properties, form features for the similarity computation of types and their contained instances. By interpreting this hierarchy, a semantic distance metric will be derived as the foundation for the recommendation system. This method will try to make use of non-textual characteristics implied by the RDF graph.
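One simple way to derive a distance from such a type hierarchy is to count the direct subclass edges between two classes via their closest shared ancestor. The following sketch is an illustration under assumptions, not the metric of the LD Matching component: the toy hierarchy, the class names, the function names and the 1/(1+d) mapping from distance to similarity are all hypothetical.

```python
from __future__ import annotations

# Hypothetical toy hierarchy (class -> direct superclass); not MICO data.
SUBCLASS_OF = {
    "Plant": "Resource", "Animal": "Resource",
    "Flower": "Plant", "Tree": "Plant",
    "Mammal": "Animal", "Bird": "Animal",
    "Panda": "Mammal", "Lion": "Mammal", "Penguin": "Bird",
}


def path_to_root(cls: str) -> list[str]:
    """Follow direct subclass edges only, from cls up to the top class."""
    path = [cls]
    while path[-1] in SUBCLASS_OF:
        path.append(SUBCLASS_OF[path[-1]])
    return path


def distance(a: str, b: str) -> int:
    """Number of direct subclass edges on the path between a and b."""
    up_a = path_to_root(a)
    up_b = path_to_root(b)
    hops_from_b = {cls: hops for hops, cls in enumerate(up_b)}
    for hops_from_a, cls in enumerate(up_a):
        if cls in hops_from_b:  # closest shared ancestor, seen from a
            return hops_from_a + hops_from_b[cls]
    raise ValueError(f"{a} and {b} share no ancestor")


def similarity(a: str, b: str) -> float:
    """Map the distance into (0, 1]; 1.0 means the classes are identical."""
    return 1.0 / (1.0 + distance(a, b))


print(distance("Panda", "Lion"))    # 2
print(distance("Panda", "Flower"))  # 5
print(similarity("Panda", "Lion") > similarity("Panda", "Flower"))  # True
```

Only direct edges are followed here. If inferred transitive edges were added to the mapping, every class would sit one edge below the top class and all distances would collapse to at most two, which would make the measure useless for ranking.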
Information from newly analysed resources could also be applied to expand the knowledge graph.

To clarify the idea of the matching process, consider the example in figure 2, which shows an exemplary RDF subclass hierarchy. Such a hierarchy will be used in our approach to match the results given by the NER analysis. After that, the semantic similarity of the given classes can be calculated to ultimately compute the similarity of two given items. For example, considering three items i_flower, i_panda, and i_lion with the extracted meanings of “flower”, “panda”, and “lion” respectively, one would assume that i_panda is much more similar to i_lion than to i_flower, because they are much “closer” in the hierarchy. As a result, a user writing a text about flowers can receive recommendations about items related to animals, in case there are no fitting videos about plants or trees that would be more similar.

[Figure 2 nodes: Resource, Plant, Animal, Flower, Tree, Mammal, Bird, Panda, Lion, Penguin]

Fig. 2. Exemplary RDF class hierarchy. Solid arrows depict a direct subclass relationship, while dashed arrows symbolise a path of (possibly multiple) subclass relationships towards the top class. In RDF Schema, every class is a subclass of rdfs:Resource; however, transitive edges derived through inference are not considered, as this would break the assumption of having a semantic distance between two classes: every class would reach the Resource class with one “hop”.

⁶ http://www.idmt.fraunhofer.de/en/projects/expired_publicly_financed_research_projects/cubrik.html#tabpanel-2
⁷ http://cordis.europa.eu/project/rcn/100872_en.html

4.1 Demo and Work in Progress

The proposed workflow was showcased as a demo inside the MICO project. During the remainder of the project it will be fully implemented for two showcases. The first is an integration into WordPress that suggests related videos to the editor, based on speech-to-text results on the video and NER on the draft post.
The second user story is about analysing user discussions and linking them with content available for the named entities. A snapshot of the related recommendation activities inside the MICO project can be found in [6]. The source code is hosted in the project’s Bitbucket repository⁸.

5 Ongoing and Future Work

This paper describes a workflow for generating recommendations in a workflow-driven environment. The process makes use of linked metadata produced inside the MICO platform, generating the recommendations based on a named entity recognition extractor that extracts the meaning or trend of a written user comment towards a given video. The platform then uses that information to find fitting, already categorised, similar videos to recommend. As the workflow as well as its underlying analysis process is kept cross-modal, other use cases dealing with other multimedia formats that can be analysed by NER can easily be supported.

Next steps will also include ways of increasing the quality or reliability of the generated recommendations. Using the design of the platform, a feedback loop is envisioned in which users can rate the received recommendations. This information will then be used in the analysis process as well as in the recommendation generation process.

The matching algorithm leaves room for extension as well. Especially in text analysis, textual feature-based approaches such as Word2Vec⁹, a framework designed by Google to compute vector representations of words, can be used to further interpret the similarities of extracted named entities. This promises further improvement of the overall results, as written posts can be analysed in more detail.

⁸ https://bitbucket.org/mico-project/ Please write an email to the authors to get full access to the recommendation and platform repository.
⁹ https://code.google.com/archive/p/word2vec/

Acknowledgements

This work has been partially funded by the European Commission 7th Framework Program, under grant agreement no. 610480.

References

1. Aichroth, P., Weigel, C., Kurz, T., Stadler, H., Drewes, F., Bjorklund, J., Schlegel, K., Berndl, E., Perez, A., Bowyer, A., Volpini, A.: Mico - media in context. In: Multimedia & Expo Workshops (ICMEW), 2015 IEEE International Conference on. pp. 1–4 (June 2015)
2. Aichroth, P., Weigel, C., Kurz, T., Stadler, H., Drewes, F., Bjorklund, J., Schlegel, K., Berndl, E., Perez, A., Bowyer, A., et al.: Mico-media in context. In: Multimedia & Expo Workshops (ICMEW), 2015 IEEE International Conference on. pp. 1–4. IEEE (2015)
3. Bobadilla, J., Ortega, F., Hernando, A., Gutiérrez, A.: Recommender systems survey. Knowledge-Based Systems 46, 109–132 (2013)
4. De Meester, B., Verborgh, R., Pauwels, P., De Neve, W., Mannens, E., Van de Walle, R.: Towards robust and reliable multimedia analysis through semantic integration of services. Multimedia Tools and Applications pp. 1–20 (2015), http://dx.doi.org/10.1007/s11042-014-2445-9
5. Fernández, S., Schaffert, S., Kurz, T.: Mico. Proceedings of the 24th International Conference on World Wide Web - WWW ’15 Companion (2015)
6. Köllmer, T.: Utilizing the mico platform for cross media recommendation (2016), http://www.mico-project.eu/cross-media-recommendation/
7. Manola, F., Miller, E.: RDF primer. W3C recommendation, W3C (Feb 2004), http://www.w3.org/TR/2004/REC-rdf-primer-20040210/
8. Pazzani, M.J., Billsus, D.: Content-based recommendation systems. In: The adaptive web, pp. 325–341. Springer (2007)
9. Prud’Hommeaux, E., Seaborne, A., et al.: Sparql query language for rdf. W3C recommendation 15 (2008), https://www.w3.org/TR/rdf-sparql-query/
10. Ricci, F., Rokach, L., Shapira, B., Kantor, P.B.: Recommender systems handbook. Springer (2011)
11.
Schlegel, K., Berndl, E., Granitzer, M., Kosch, H., Kurz, T.: A platform for contextual multimedia data: Towards a unified metadata model and querying. In: Proceedings of the 15th International Conference on Knowledge Technologies and Data-driven Business. pp. 1:1–1:8. i-KNOW ’15, ACM, New York, NY, USA (2015), http://doi.acm.org/10.1145/2809563.2809586
12. Su, X., Khoshgoftaar, T.M.: A survey of collaborative filtering techniques. Advances in Artificial Intelligence 2009(12), 1–19 (2009)
13. Verborgh, R., Deursen, D., Mannens, E., Poppe, C., Walle, R.: Enabling context-aware multimedia annotation by a novel generic semantic problem-solving platform. Multimedia Tools and Applications 61(1), 105–129 (2011), http://dx.doi.org/10.1007/s11042-010-0709-6