The MULTISENSOR project – Development of Multimedia Content Integration
Technologies for Journalism, Media Monitoring and International Exporting
Decision Support

Dimitris Liparas1, Stefanos Vrochidis1, Ioannis Kompatsiaris1, Gerard Casamayor2,
Leo Wanner2, Ioannis Arapakis3, David García Soriano3, Reinhard Busch4,
Boris Vaisman4, Boyan Simeonov5, Vladimir Alexiev5, Andrey Belous6, Emmanuel Jamin6,
Nicolaus Heise7, Tilman Wagner7, Michael Jugov8, Mirja Eckhoff8, Teresa Forrellat9
and Martí Puigbó9

1 Centre for Research and Technology Hellas, 2 Pompeu Fabra University, 3 Eurecat,
4 Linguatec, 5 Ontotext, 6 everis, 7 Deutsche Welle, 8 pressrelations, 9 PIMEC


Abstract. The rapid development of digital technologies has led to a great increase
in the availability of multimedia content. The consumption of such large amounts of
content regardless of its reliability and cross-validation can have important
consequences on the society and especially on journalism, media monitoring and
international investments. In this context, MULTISENSOR has researched and developed
tools that provide unified access to multilingual and multicultural economic and news
story material across borders, that ensure its context-aware, spatiotemporal,
sentiment-oriented and semantic interpretation, and that correlate and summarise the
content into a coherent whole. The goal of the MULTISENSOR project is to provide a
platform, which allows for an integrated view of heterogeneous resources sensing the
world (i.e. sensors), such as international TV, newspapers, radio and social media.
Three demonstrators have been developed, indicating the potential of the platform and
providing end-user services such as journalism, commercial media monitoring and
decision support for SME (Small and Medium Enterprises) internationalisation.

1    Introduction

Nowadays, the extensive availability of multilingual and multimedia content worldwide
is a result of the advances in digital technologies during the past decade, as well as
the low cost of recording media. In the best case, this content is repetitive or
complementary across political, cultural, or linguistic borders. However, the reality
shows that it is also often contradictory and in some cases unreliable, something that
can greatly impact its consumption. An indicative example is the current crisis of the
financial markets in Europe, which has created an extremely unstable ground for
economic transactions and caused insecurity in the population. The consequence is, on
the one side, an extreme uncertainty and nervousness of politics and economy, which
makes national and international investments really risky. On the other side, the
inability of journalism and media monitoring to equally consider all the media
resources leaves the population in each of these encapsulated areas in its own
perspective, without the realistic opportunity to understand the perspective developed
in the other areas in order to adjust its own.
   To break this isolation, there is a need for technologies capable of capturing,
interpreting and relating economic information and news from various subjective views
as disseminated via TV, radio, newspapers, blogs and social media. On top of this,
semantic integration of heterogeneous media, including computer-mediated interaction,
is required to gain a usable understanding based on social intelligence, while a
correlation with relevant incidents with different spatiotemporal characteristics
would allow for extracting hidden meaning.
   In the MULTISENSOR (Mining and Understanding of multilinguaL contenT for
Intelligent Sentiment Enriched coNtext and Social Oriented inteRpretation) project1,
we have developed a unified platform for enabling the multidimensional content
integration from heterogeneous sensors, with a view to providing end-user services
such as journalism, commercial media monitoring and decision support for SME (Small
and Medium Enterprises) internationalisation. More specifically, potential investors
can benefit from integration and context-aware interpretation of complementary and
contradicting multilingual and multimedia information and get decision support for
international investments. Media companies and archives can also benefit from the
spatiotemporal integration and sentiment-oriented interpretation of heterogeneous
content both for media monitoring and for journalism purposes. Finally, the European
public can benefit from this integration and context-aware interpretation in the
sense that it learns and comes to understand the views, fears and worries of the
citizens all over Europe and gets support for forming an objective opinion with
respect to the state of affairs.

1 FP7-ICT-2013-10: http://www.multisensorproject.eu/

   The approach of MULTISENSOR builds upon the multidimensional content integration
concept (Figure 1) by considering the following dimensions for mining, linking,
understanding and summarising heterogeneous material: language, multimedia,
semantics, context, emotion, time and location.
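
To make the seven dimensions listed in the Introduction (language, multimedia, semantics, context, emotion, time and location) more concrete, one possible way to picture a content item is as a record with one field per dimension. The sketch below is only an illustration: the class name `ContentItem`, its field names, the sentiment scale and the toy linking rule `related` are all assumptions made for this example and are not taken from the MULTISENSOR platform.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class ContentItem:
    """One harvested item, with one field per dimension (hypothetical schema)."""
    language: str                                            # language, e.g. "en"
    media_type: str                                          # multimedia: "text", "video", ...
    concepts: List[str] = field(default_factory=list)        # semantics
    context: Dict[str, str] = field(default_factory=dict)    # context, e.g. source, title
    sentiment: float = 0.0                                   # emotion; assumed scale [-1, 1]
    published: Optional[datetime] = None                     # time
    location: Optional[str] = None                           # location


def related(a: ContentItem, b: ContentItem) -> bool:
    """Toy linking rule (an assumption): same location and a shared concept."""
    same_place = a.location is not None and a.location == b.location
    return same_place and bool(set(a.concepts) & set(b.concepts))


article = ContentItem("en", "text", concepts=["exports"], location="Spain")
broadcast = ContentItem("es", "video", concepts=["exports", "tariffs"], location="Spain")
print(related(article, broadcast))
```

The point of such a record is that items in different languages and media types can still be linked through the dimensions they share, here location and concepts.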
               Figure 1. The MULTISENSOR concept

2   Project objectives

In the context of MULTISENSOR, the following scientific objectives with respect to the
individual research areas of the project are addressed:

• Mining and content distillation of unstructured heterogeneous and distributed
  multimedia and multilingual data: In this objective, MULTISENSOR attempts to
  facilitate the data mining from several international resources, including news
  articles, audiovisual content (TV, radio), blogs and social media and provide
  intelligent mechanisms for the distillation of information. This objective includes
  low- and high-level content analysis.
• User- and context-centric analysis of heterogeneous multimedia and multilingual
  content: Here, the focus is on analysing content from the user perspective to
  extract sentiment and context, analysing computer-mediated interaction on the web
  and specifically in social media, as well as generating high-level information based
  on the outcome of the previously mentioned objective. The aim is to develop and
  integrate into the MULTISENSOR platform research techniques for context extraction,
  sentiment extraction and social media mining (influential user detection and
  community detection).
• Semantic integration and context-aware interpretation over the spatiotemporal and
  psychological dimension of heterogeneous and spatiotemporally distributed multimedia
  and multilingual data: This includes multidimensional content correlation and
  alignment based on reasoning techniques, as well as on multimodal vector-based
  representation and topic-based modelling. The multimodal integration is performed on
  top of the low- and high-level content extracted in the two aforementioned
  objectives.
• Semantic reasoning and intelligent decision support services: The purpose here is
  to make sense of very large amounts of heterogeneous data by providing diverse
  analytics and contextualised decision-making support for different situations, so
  that the information can be viewed from multiple perspectives. In this context,
  MULTISENSOR has researched and developed advanced reasoning techniques that abide by
  requirements for scalability and usability.
• Context-aware multimodal aggregation, multilingual summarisation and adequate
  presentation of the information to the user: This objective also includes
  context-aware interpretation of news by examining their impact on the news consumers
  in the light of cultural aspects, user experience and engagement, as well as impact
  on its condensed presentation along with the content summary.

3     User perspective

Within MULTISENSOR, three pilot use cases (UC) were defined and specific requirements
were extracted for each one of them:
UC1: Journalism: Journalists need to master large amounts of heterogeneous multimedia
and multilingual data when writing a new article. On the basis of a market analysis
that was conducted and from a journalistic point of view, MULTISENSOR should be able
to provide an automatic summarisation of heterogeneous and multilingual digital
information. The platform should also automatically suggest related content and
information that allows journalists to enrich their coverage of a specific topic.
UC2: Commercial media monitoring: Professional clients of media monitoring portals
require direct access to comprehensive and targeted business and consumer
information. This could include information on consumption habits, competitors and
opinions. From a media monitoring point of view, it is important that the MULTISENSOR
system follows the usual workflow for the creation of a media analysis. In a first
step, the user needs to define the sources and time frame that are to be monitored,
along with the search terms to use. In a second step, the search results need to be
curated and validated. The MULTISENSOR system should present the results of these
queries in different output formats and visualisations.
UC3: SME (Small and Medium Enterprises) internationalisation: This UC deals with SME
internationalisation, which refers to small or medium-sized companies that want to
start or are in the process of expanding from a regional or a national market to a new
and foreign market in order to increase turnover and profit. This process is of
particular importance, as it is often the only option to achieve growth. But it is
also associated with considerable challenges, such as a lack of knowledge about market
conditions or the spoken language in the targeted countries. Given the above, for the
MULTISENSOR platform to be fully helpful in SME internationalisation cases and improve
the decision-making process, it should provide information about several related
indicators, regarding the condition of the market, the political and financial
situation of the countries, potential competitors, consumption habits, etc.
Furthermore, two very important requirements from this UC are summarisation (to reduce
the amount of information that the internationalisation expert will need to read and
study) and automatic language detection and translation.

4     MULTISENSOR framework

The architecture of the MULTISENSOR framework is depicted in Figure 2. In this
architecture, a periodic process of content harvesting takes place, which retrieves
source material by crawling a set of sources for news, multimedia and social network
content. Next, the different components of the framework, as well as the functionality
of the modules that they contain and provide, are described.

4.1    Multimedia content extraction

This component aims at extracting knowledge from multimedia input data and presenting
the extracted knowledge in a way that subsequent components can operate on it. It
includes the following technologies:
1. Language Identification: Before a text is stored in the repository, the language
   in which it is written is identified and the text is annotated accordingly. The
   languages considered in MULTISENSOR are English, German, Spanish, Bulgarian and
   French.
2. Named entities extraction: This module aims at identifying names (named entities)
   in texts. Names are words which uniquely identify objects, like ‘Berlin’,
   ‘Siemens’, etc. The module incorporates two linguistic components that allow all
   analysis modules to operate on the same input: sentence segmentation and
   tokenisation.
3. Concept extraction from text: Concept extraction starts from the results of the
   named entities extraction task. The goal of this module is to identify in the text
   mentions of concepts that belong to the project domains. Candidate concepts are
   identified through analysis of multilingual corpora. When processing new documents,
   the module attempts disambiguation of mentions of concepts against relevant
   ontologies and datasets.
4. Concept linking and relations: This module aims at identifying in texts relations
   between mentions of named entities and concepts. Two relation types are considered:
   i) coreference relations, i.e. several mentions make reference to the same entity,
   and ii) n-ary relations describing situations and events involving multiple
   entities and concepts. To this end, a deep dependency parser [1] that delivers
   deep-syntactic dependency structures from sentences in natural language has been
   developed. This parser uses the output of an optimised dependency parser [2] as
   input.
5. Audio recognition and analysis: Automatic speech recognition (ASR) is employed in
   order to provide a channel for analysis of spoken language in audio and video
   files. The transcripts produced follow the same analysis procedure as the input
   from other text sources. The languages covered by the ASR component are English and
   German.
6. Multimedia concept and event detection: This module receives as input a multimedia
   file (i.e. image or video) and computes degrees of confidence for a predefined set
   of visual concepts. The module performs video decoding (applicable for video files
   only), feature extraction and classification in order to assign a confidence value
   to the presence of a concept or event in an image or video shot [3].
7. Machine translation: Automatic machine translation (MT) has two main goals: to
   provide the translation of the summarisation results at the end of the content
   analysis and summarisation chain and to enable full-text translation on demand
   during the development of language-dependent analysis tools in the project, in case
   a subset of required languages is not supported by these tools.

               Figure 2. Architecture of the MULTISENSOR framework

4.2    User- and context-centric analysis

The objectives of this component are to model and represent contextual, sentiment and
online social interaction features, as well as deploy linguistic processing at
different levels of accuracy and completeness.

1. Extraction of contextual features: This module provides a set of contextual
   indicators characterising the content items and a framework for measuring their
   impact in the context of the use cases. Moreover, it provides representation
   techniques to be used in effective context-based search.
2. Polarity and sentiment extraction: The polarity and sentiment extraction module
   aims at modelling a robust opinion mining system that is based on linguistic
   analysis and is applicable to large datasets. Moreover, models that take into
   account the presence of
   named entities in different sentences have been designed within the module.
3. Social interaction analysis: The social interaction analysis task involves a set
   of processes related to analysis of social network data stored in the MULTISENSOR
   repositories, namely crawled Twitter data. Two modules have been developed in the
   context of this task, namely the influential user detection and community detection
   modules. First, a topic-dependent network of contributors based on the mentions in
   the set of monitored tweets is built and next, retweet probabilities between users
   in this network are computed. The goal of the influential user detection module is
   to provide a ranked list of users by decreasing order of influence based on the
   aforementioned network of mentions and retweet probabilities. The goal of the
   community detection module is to make use of crawled Twitter posts in order to
   detect online dynamic communities by means of an appropriate community detection
   algorithm, which is applied to each graph snapshot defined by the user network of
   mentions.

4.3    Multidimensional content integration and retrieval

The objective of this component is to achieve integration and retrieval of content
along different dimensions.
1. Multimodal indexing and retrieval: In this module, a multimedia data
   representation framework that allows for the efficient storage and retrieval of
   socially connected multimedia objects is developed. The representation model is
   called SIMMO (Socially Interconnected MultiMedia-enriched Objects) [4] and has the
   ability to fully capture all the content information of interconnected multimedia
   objects, while at the same time avoiding the complexity of previously proposed
   models.
2. Topic-based modelling: In this module, two subtasks are considered: a)
   category-based classification and b) topic-event detection. The module receives as
   input multimodal features that are created in the multimedia content extraction
   component and provides as output the degree of confidence of a number of categories
   for a specific content item (for category-based classification) [5] or a grouping
   for a list of content items based on the presence or absence of a number of
   topics/events (for topic-event detection) [6].

4.4    Semantic representation and reasoning

MULTISENSOR includes a semantic layer in order to represent heterogeneous content in a
unified way. The following technologies are involved:
1. Semantic representation: This representation includes a number of ontologies that
   are integrated in a common framework, such as DBpedia, GeoNames and Freebase.
2. Ontology alignment: The ontology alignment module discovers candidate semantic
   correspondences between heterogeneous information descriptions and terminologies
   and verifies the correctness and consistency of the discovered mappings in an
   automatic way.
3. Content alignment: This module deals with the semantic processing of the
   multimodal content, in order to identify near duplicate and contradictory
   information relying on semantic technologies.
4. Hybrid reasoning and decision support: In the hybrid reasoning and decision
   support module, four reasoning techniques have been developed: hybrid reasoning
   (consists of a combination of forward and backward chaining), multi-threaded
   reasoning (parallel inference calculation), temporal reasoning (inference based on
   temporal entities and sequence in time) and geo-spatial reasoning (ability to
   reason based on latitude, longitude and altitude of a given location).
   Additionally, a reasoning-based recommendation system with two main functionalities
   has been developed: firstly, it determines relevant facts by navigating the graph
   and secondly, it advises the user by interpreting these facts through the use of
   the aforementioned hybrid reasoning techniques and the assignment of relevance
   weights for each selected fact [7].

4.5     Content summarisation

The content summarisation component implements procedures for producing multilingual
briefings. Two established strategies in the field of text summarisation are
considered in MULTISENSOR:
1. Extractive summarisation: Text-to-text summarisation, where the relevance of
   sentences in the original documents is assessed based on shallow linguistic
   features in order to decide on their inclusion in a summary. A module following
   this strategy is used in order to establish a basic infrastructure for
   summarisation services and implement a fall-back method.
2. Abstractive summarisation: Documents are analysed and the information extracted
   from them is used to generate a summary that is not composed of fragments of the
   original documents, but is generated directly from data. A module implementing
   abstractive methods operates on the semantic layer in order to select contents
   extracted from multimedia documents and also coming from other datasets integrated
   into the MULTISENSOR system. Contents are selected and organised into a text plan
   that guarantees the coherent presentation of information. A multilingual linguistic
   generation system renders text plans into the final summaries.

5     Use case applications

During the three years of the project’s lifetime, three applications have been
developed based on MULTISENSOR technologies, with each application addressing one of
the three use cases considered in MULTISENSOR. The first one provides search and
exploratory functionalities for journalists2, the second one aims at supporting a
media monitoring company to monitor specific profiles for their clients3, while the
third one provides decision support for SME internationalisation4.

2 http://grinder1.multisensorproject.eu/uc1/
3 http://grinder1.multisensorproject.eu/uc2/
4 http://grinder1.multisensorproject.eu/uc3/

5.1     Journalism use case application

The journalism use case demonstrator is an application that assists media
professionals (e.g. journalists, media experts) in finding relevant information in
different formats, coming from different sources, and according to the social activity
produced around it.
   Figure 3 shows the results section of the application, which displays the results
of a search query that the user can make, based on a selection of keywords and
filtering criteria. On the left side, search-related entities are displayed. Clicking
on an entity adds it to the search query, so these entities can be used to extend the
search. On the right side, the following information per article is displayed:
• Context: Contextual features per article (title, source, etc.).
• Summarisation: Display of the output of the summarisation module.
• Translation: This functionality calls the online machine translation service in
  order to translate a summary into one of the available languages of the MULTISENSOR
  project.
• In-depth semantic analysis: Displays the semantic page view. On this page, more
  information extracted from the text is displayed (list of named entities, sentiment
  polarity, cloud of specific concepts and related articles).
• Article to portfolio: The link to add an article to the “portfolio” (a folder that
  contains the user’s favourite documents), for further analysis. This analysis
  generates the aggregated analytical view of the portfolio content.

               Figure 3. Journalism use case application – Results section

5.2    Media monitoring use case application

The media monitoring use case application replicates the workflow of a media
monitoring professional to execute an analysis for a client. This includes checking
articles for relevance by various indicators and saving the relevant articles for a
client’s profile. The relevant articles can then be analysed, so that conclusions can
be drawn from this analysis.
   Figure 4 depicts the search section of the application, where the user is presented
with a view to search for articles, based on keywords and language/country filters.
Alternatively, the user can select a profile, which has settings stored for recurring
searches in order to quickly populate the search mask.
   In order to evaluate whether an article is relevant for the client, the user can
use additional functionalities, such as calling the summarisation and/or translation
service. In addition, the user can take a look at the entities extracted from the text
and read the article’s full text. There is also the analysis section, where visual
results in the form of bar charts are shown for all articles that have been marked as
relevant in the search section. Finally, in the influencer section, the information
that is extracted from the social interaction analysis modules of MULTISENSOR
(influential user detection and community detection) is displayed.

         Figure 4. Media monitoring use case application – Search section

5.3    SME internationalisation use case application

The SME internationalisation use case application supports SMEs in starting a process
of internationalisation with any kind of product. Relevant information related to the
countries, the economic situation of the market, the legal information, and the
exportation/importation conditions can be retrieved easily to support decision making.
   The application supports a number of sectors and products. When a user selects a
specific sector, articles about that sector are shown. After the selection of a
product, the search results contain specific information about it. Another important
functionality of this application is browsing information specific to a given country
for internationalisation support purposes, based on a number of indicators. The
considered indicators have been selected and organised by categories to depict the
relevant information related to a target country: Politics, Economy, Society and
Culture.
   Furthermore, the SME professionals are interested in targeting specific countries
to establish new commercial activities. For this, the application offers the
comparison of several indicators between two targeted countries through the decision
support system of MULTISENSOR, which is depicted in Figure 5.

Figure 5. SME internationalisation use case application – Decision support interface

6     Conclusions

In this paper, an overview of the successful MULTISENSOR project is provided. The
project has developed a platform that supports a)
journalists in mastering heterogeneous content in order to prepare ar-
ticles and identify topics, as well as have access to multilingual sum-
maries; b) commercial media monitoring companies in summarising
the opinions of people for specific products and c) SMEs that want
to internationalise by providing market analysis, product reports and
decision support services. This platform integrates and makes use of
innovative modules, which could be separately exploited.
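
As an illustration of how one such module might be used on its own, the extractive summarisation strategy of Section 4.5 (sentence relevance assessed from shallow linguistic features) can be sketched in a few lines. This is a minimal sketch under assumptions that are not taken from the project: the only shallow feature is the average frequency of a sentence's words within the document, sentences are split with a simple regular expression, and the stopword list and the names `split_sentences`, `tokenize` and `summarise` are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a project resource).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "for", "on", "that", "it"}


def split_sentences(text):
    """Naive sentence segmentation on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lower-cased alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def summarise(text, max_sentences=2):
    """Keep the highest-scoring sentences, returned in their original order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Shallow feature: mean document frequency of the sentence's words.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return [sentences[i] for i in keep]


text = ("Markets fell sharply. Markets in Europe fell as investors sold shares. "
        "The weather was nice.")
print(summarise(text, max_sentences=2))
```

Because the selected sentences are copied verbatim from the input, a method of this kind needs no language generation, which is consistent with its role as a fall-back in Section 4.5.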
   MULTISENSOR technologies will have a big impact from several
perspectives. First, they will actively support journalists (professional
and amateur), commercial media monitoring companies and
international investments by SMEs. Second, SMEs in the ICT
domain will benefit from the open source tools and technologies de-
veloped in MULTISENSOR, in order to improve their existing prod-
ucts and offer new services to their clients. Third, the development
of such tools will boost the competitiveness specifically in the media
monitoring domain and in Europe, since the mobility of SMEs will
be facilitated. Finally, the social impact of MULTISENSOR refers
to the production of cross-validated news articles and the presenta-
tion of news stories from different cultural, political and linguistic
perspectives.


7   Acknowledgements
This work was supported by the MULTISENSOR project5, partially
funded by the European Commission under contract number
FP7-610411.


REFERENCES
[1] M. Ballesteros, B. Bohnet, S. Mille and L. Wanner, “Deep-Syntactic
    Parsing,” COLING 2014, Dublin, Ireland, 2014.
[2] M. Ballesteros and B. Bohnet, “Automatic Feature Selection for
    Agenda-Based Dependency Parsing,” COLING 2014, Dublin, Ireland,
    2014.
[3] N. Gkalelis, F. Markatopoulou, A. Moumtzidou, D. Galanopoulos, K.
    Avgerinakis, N. Pittaras, S. Vrochidis, V. Mezaris, I. Kompatsiaris
    and I. Patras, “ITI-CERTH participation to TRECVID 2014,” Proc.
    TRECVID 2014 Workshop, Orlando, FL, USA, November 2014.
[4] T. Tsikrika, K. Andreadou, A. Moumtzidou, E. Schinas, S. Papadopou-
    los, S. Vrochidis and I. Kompatsiaris, “A Unified Model for Socially
    Interconnected Multimedia-Enriched Objects,” MultiMedia Modelling,
    pp. 372-384, Springer International Publishing, January 2015.
[5] D. Liparas, Y. Hacohen-Kerner, A. Moumtzidou, S. Vrochidis and
    I. Kompatsiaris, “News articles classification using Random Forests
    and weighted multimodal features,” 3rd Open Interdisciplinary MUMIA
    Conference and 7th Information Retrieval Facility Conference
    (IRFC2014), Copenhagen, Denmark, November 10-12, 2014.
[6] I. Gialampoukidis, S. Vrochidis and I. Kompatsiaris, “A hybrid
    framework for news clustering based on the DBSCAN-Martingale and LDA,”
    12th International Conference on Machine Learning and Data Mining,
    New York, 16-21 July 2016.
[7] B. Simeonov, V. Alexiev, D. Liparas, M. Puigbo, S. Vrochidis, E. Jamin
    and I. Kompatsiaris, “Semantic integration of web data for
    international investment decision support,” International Conference on
    Internet Science, pp. 205-217, Springer International Publishing,
    September 2016.


5 http://www.multisensorproject.eu/