=Paper= {{Paper |id=Vol-1542/paper5 |storemode=property |title=Context-Aware User-Driven News Recommendation |pdfUrl=https://ceur-ws.org/Vol-1542/paper5.pdf |volume=Vol-1542 |authors=Jon Espen Ingvaldsen,Ozlem Ozgobek,Jon Atle Gulla |dblpUrl=https://dblp.org/rec/conf/recsys/IngvaldsenOG15 }} ==Context-Aware User-Driven News Recommendation== https://ceur-ws.org/Vol-1542/paper5.pdf
3. IMPLEMENTATION
The backend of the news recommender prototype is constructed as a pipeline of operations that transform Rich Site Summary (RSS) entries and raw text data into a semantic and searchable representation. The pipeline and its operations are implemented using the Apache Storm² framework. This distributed computing framework provides scalability and the ability to continuously handle large amounts of news items from a multitude of publishers.

There are five steps involved in the data processing. The first step creates an input stream by continuously monitoring a set of RSS feeds from a wide range of news publishers. Whenever a new news item appears, RSS entry properties such as the title, lead text and HTML sources are retrieved. The HTML sources are parsed and cleaned to extract a representative body text. In the second step, natural language processing operations such as language identification, sentence detection and part-of-speech tagging are applied to extract entity mentions from the textual data. The third step uses supervised models to map entity mentions to referent entities in the WikiData knowledge base. These models combine textual similarities, WikiData graph relations, entity frequencies and co-occurrence statistics to classify the relevance of multiple referent candidates. First Story Detection is applied in the fourth step to group news items describing the same news story. In the fifth step, this semantic representation is indexed and made searchable. As this backend architecture is stream based, it is able to index and promote recent news items soon after they are discovered.

WikiData is the community-created knowledge base of Wikipedia [13]. Since its public launch in 2012, the knowledge base has gathered more than 15 million entities, including more than 34 million statements and over 80 million labels and descriptions in more than 350 languages [4]. Most geographical entities in WikiData provide a reference to Geonames containing more detailed geographical properties. In the implementation of the Smartmedia prototype, the entity information from these knowledge bases was indexed in a Lucene³-based search index. This index makes the entities searchable and creates a foundation for addressing entity labels, descriptions and aliases, entity relations and geospatial properties.

Figure 1 shows an example of a news article from the Guardian where the text is parsed and enriched with WikiData entity annotations. The fields and nested data structure in this figure are similar to how the news stories are stored and indexed in the Lucene-based index. By running the news text from the news article in the figure through the data processing pipeline, we identified nine WikiData entities, including Bedfordshire, Home Office and Theresa May. Note that the news text and the list of entities and associations in the figure are shortened. All entities contain a textual description and a list of associations. These associations are typed relations to other WikiData entities. We can see that Bedfordshire contains eight such entity associations. Examples of entities linked and related to Bedfordshire are the "instance of" relations to Ceremonial county of England and Administrative territorial entity of the United Kingdom. Both Bedfordshire and Home Office are additionally described with geospatial properties. In this case the geospatial properties are longitude-latitude pairs, but the implementation allows for any geospatial shape described as valid GeoJSON⁴.

When a user opens the news app on a mobile device, a request containing user id, location and preferences is sent to the backend. Here, a multi-factor search query is formed to retrieve relevant news entries from the index.

4. USER INTERFACE
A web-based and responsive user interface has been developed to make the news stream contents explorable on mobile devices. In this interface, the user can extract news items that are relevant to the geospatial locality context, personal interests and a given point in time. These three relevance factors are customizable and the user can select whether or not they should influence the retrieved news items.

To customize the geographical locality, the user specifies a circular relevance region on a map. Figure 2a shows an example of such a relevance region. By default, the relevance region is set to the user's current GPS location with a 50 km radius. By moving the region or modifying the radius, users can generate a local newspaper for any region of the world. If the location factor is disabled, the system recommends news from any location in the world as well as news that does not contain location information.

In the current Smartmedia prototype, we have predefined a handful of user interest profiles. Each user profile contains an alias and a weighted vector of WikiData entities. Examples of predefined profiles in the system are stock trader, soccer fan and technology geek. By selecting any of these interest profiles, the retrieved news will be biased towards the interest topics. When the personal interest factor is disabled, the user retrieves a news composition which is general and without such bias.

By changing the time factor, the user is presented with a calendar where they can move in time and retrieve either recent or historic news items. When the time factor is disabled, the user will retrieve news solely based on the other relevance factors (location and personal interests).

Figure 2b shows an example of how news stories are presented. Here we see the same article as in Figure 1. The three circular buttons at the bottom of the screen allow users to toggle whether their locality, personal interest profile and time setting should influence news story retrieval.

By clicking on a news story, the user gets the ingress (lead text) of the news story and a list of the most salient entities for the selected news story. Figure 2c shows the ingress and relevant WikiData entities from the news article about Theresa May. As we can see, our news story about politics and terror is related to Syria, Theresa May, ISIL and Sky News. By hovering over these items, the user is presented with their textual WikiData description. In Figure 2c, we can see that the WikiData entity for Theresa May contains the description “British politician”.

In general, the three buttons at the bottom of the screen for location, interest profile and time can at any time be activated and deactivated in combinations to provide very different recommendation strategies. For example, keeping all buttons active with default parameters means that the system will recommend news articles about events that have recently taken place in the vicinity of the reader and are consistent with her profile. A screencast video describing the features of the system and its user interface is available at https://vimeo.com/121835936

5. CONCLUSIONS AND FUTURE WORK
Many see the full stack of semantic web technologies as a complex implementation of some really simple and good ideas about adding meaning to data. There are great rewards in understanding the full stack and what it can do, but most news organizations find great rewards by looking into linked data in combination with traditional information retrieval techniques.

In this paper we have shown a prototype of a news recommender system that demonstrates some of the context-aware and geospatially aware features online news services can achieve by using available and open knowledge bases and data processing and storage technologies.

Future work for the Smartmedia prototype will focus on improving entity linking quality and on evaluating user needs. The user evaluations will examine the extent to which users find the ability to control their news feed in terms of location, interest profile and time valuable and useful.

2 http://storm.apache.org/
3 https://lucene.apache.org/core/
4 http://geojson.org/

articleId: "Guardian_254439378"
type: "article"
title: "Theresa May 'allowed state-sanctioned abuse of women' at Yarl's Wood"
leadText: "Shadow home secretary criticises minister after TV documentary alleges rape and self-harm at detention centre were
ignoredTheresa May, the home secretary, has been accused of allowing the “state-sponsored abuse of women” at the Yarl’s Wood detention
centre after a Channel 4 investigation uncovered guards ignoring self-harm and referring to inmates in racist terms.Yvette Cooper..."
entities: [ 9]
          0:   {
                    entityId: "Q23143"
                    name: "Bedfordshire"
                    description: "county in England"
                    associations: [ ... 8]
                    shape: {
                              type: "Point"
                              coordinates: [ 2]
                                        0: -0.41666666666667
                                        1: 52.083333333333
                    }
          }
          1:   {
                    entityId: "Q763388"
                    name: "Home Office"
                    description: "ministerial department of the Government of the United Kingdom"
                    associations: [ ... 3]
                    shape: {
                              type: "Point"
                              coordinates: [ 2]
                                        0: -0.129948
                                        1: 51.4958
                    }
          }
          2:   {
                    entityId: "Q264766"
                    name: "Theresa May"
                    description: "British politician"
                    associations: [ ... 21]
          }


                                         Figure 1. Example of a news article enriched with WikiData entities.
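
The record in Figure 1 and the circular relevance region described in Section 4 can be illustrated with a minimal Python sketch. It assumes that Point shapes store coordinates in GeoJSON order, [longitude, latitude], and it measures distance with the haversine formula. The function names, the dictionary layout and the rule that an article matches when any of its entities lies inside the region are assumptions made for illustration; the prototype itself answers such queries through its Lucene-based index, and the paper does not describe how.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius used by the haversine formula

# Shortened record mirroring Figure 1 (only the fields needed here).
article = {
    "articleId": "Guardian_254439378",
    "entities": [
        {"entityId": "Q23143", "name": "Bedfordshire",
         "shape": {"type": "Point",
                   "coordinates": [-0.41666666666667, 52.083333333333]}},
        {"entityId": "Q763388", "name": "Home Office",
         "shape": {"type": "Point", "coordinates": [-0.129948, 51.4958]}},
        {"entityId": "Q264766", "name": "Theresa May"},  # no geospatial shape
    ],
}


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def in_relevance_region(article, center_lon, center_lat,
                        radius_km=50.0, location_enabled=True):
    """Return True if the article passes the (hypothetical) location factor.

    With the factor disabled every article passes, including articles
    without location information. With it enabled, at least one entity
    with a Point shape must lie within radius_km of the region centre.
    """
    if not location_enabled:
        return True
    for entity in article["entities"]:
        shape = entity.get("shape")
        if shape and shape.get("type") == "Point":
            lon, lat = shape["coordinates"]
            if haversine_km(center_lon, center_lat, lon, lat) <= radius_km:
                return True
    return False
```

Calling `in_relevance_region(article, -0.1278, 51.5074)` tests the article against a 50 km region centred on central London; the default radius mirrors the 50 km default mentioned in Section 4. Only Point shapes are handled here, whereas the paper states that any valid GeoJSON shape is allowed.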





Figure 2. Screenshots from the Smartmedia prototype. a) The map query interface. b) Presentation of news stories. c) Presentation of news details.
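
The way the three toggles combine can be sketched in Python as follows. This is a hedged illustration, not the prototype's query logic: the `Article` class, the `retrieve` function, the seven-day time window, the additive interest score and the example profile weights are all assumptions. Only the idea of a weighted vector of WikiData entities and of three independently switchable factors comes from the paper.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Article:
    title: str
    published: date
    entity_ids: frozenset  # WikiData IDs linked to the article
    in_region: bool        # outcome of the geographic test, assumed precomputed


def interest_score(article, profile):
    """Sum the profile weights of the entities that the article mentions."""
    return sum(w for eid, w in profile.items() if eid in article.entity_ids)


def retrieve(articles, profile, day, window_days=7,
             use_location=True, use_interest=True, use_time=True):
    """Filter and rank articles under the three (hypothetical) factors.

    Location and time act as filters; the interest profile only reorders
    the remaining articles. Ties are broken by recency.
    """
    hits = []
    for a in articles:
        if use_location and not a.in_region:
            continue
        if use_time:
            age_days = (day - a.published).days
            if age_days < 0 or age_days > window_days:
                continue
        score = interest_score(a, profile) if use_interest else 0.0
        hits.append((score, a))
    hits.sort(key=lambda pair: (-pair[0], -pair[1].published.toordinal()))
    return [a for _, a in hits]


# Hypothetical profile: an alias plus weights over WikiData entity IDs
# taken from Figure 1 (the weights themselves are invented).
profile_alias = "uk politics follower"
profile = {"Q264766": 0.8, "Q763388": 0.5}
```

With every factor switched off, `retrieve` returns all articles ordered by recency only, which corresponds to the general, unbiased news composition described in Section 4.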

6. REFERENCES
[1] Asikin, Y. and Wörndl, W. 2014. Stories around You: Location-based Serendipitous Recommendation of News Articles. Proceedings of 2nd International Workshop on News Recommendation and Analytics. (2014).
[2] Cantador, I., Bellogín, A. and Castells, P. 2008. News@hand: A semantic web approach to recommending news. Adaptive hypermedia and adaptive web-based systems. (2008).
[3] Cantador, I., Bellogín, A. and Castells, P. 2008. Ontology-based personalised and context-aware recommendations of news items. Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology. 1, (2008).
[4] Erxleben, F., Günther, M. and Krötzsch, M. 2014. Introducing Wikidata to the Linked Data Web. The Semantic Web–ISWC 2014. (2014).
[5] Goossen, F. and IJntema, W. 2011. News personalization using the CF-IDF semantic recommender. Proceedings of the International Conference on Web Intelligence, Mining and Semantics (WIMS). (2011).
[6] Gulla, J.A., Ingvaldsen, J.E., Fidjestøl, A.D., Nilsen, J.E., Haugen, K.R. and Su, X. 2013. Learning User Profiles in Mobile News Recommendation. Journal of Print and Media Technology Research. II, 3 (2013), 183–194.
[7] IJntema, W. and Goossen, F. 2010. Ontology-based news recommendation. Proceedings of the 2010 EDBT/ICDT Workshops. (2010).
[8] Meguebli, Y. and Kacimi, M. 2014. Building rich user profiles for personalized news recommendation. Proceedings of 2nd International Workshop on News Recommendation and Analytics. (2014).
[9] Ozgobek, O., Gulla, J. and Erdur, R. 2014. A survey on challenges and methods in news recommendation. In Proceedings of the 10th International Conference on Web Information System and Technologies (WEBIST 2014). (2014).
[10] Samet, H., Sankaranarayanan, J., Lieberman, M.D., Adelfio, M.D., Fruin, B.C., Lotkowski, J.M., Panozzo, D., Sperling, J. and Teitler, B.E. 2014. Reading news with maps by exploiting spatial synonyms. Communications of the ACM. 57, 10 (Sep. 2014), 64–77.
[11] Tavakolifard, M., Gulla, J.A., Almeroth, K.C., Ingvaldsen, J.E., Nygreen, G. and Berg, E. 2013. Tailored news in the palm of your hand: a multi-perspective transparent approach to news recommendation. WWW ’13 Companion Proceedings of the 22nd International Conference on World Wide Web. (May 2013), 305–308.
[12] Teitler, B. and Lieberman, M. 2008. NewsStand: A new view on news. Proceedings of the 16th ACM SIGSPATIAL international conference on Advances in geographic information systems. (2008).
[13] Vrandečić, D. and Krötzsch, M. 2014. Wikidata: a free collaborative knowledgebase. Communications of the ACM. (2014).