=Paper= {{Paper |id=Vol-1088/paper6 |storemode=property |title=Cicerone: Design of a Real-Time Area Knowledge-Enhanced Venue Recommender |pdfUrl=https://ceur-ws.org/Vol-1088/paper6.pdf |volume=Vol-1088 |dblpUrl=https://dblp.org/rec/conf/ijcai/VillatoroAPGT13 }} ==Cicerone: Design of a Real-Time Area Knowledge-Enhanced Venue Recommender== https://ceur-ws.org/Vol-1088/paper6.pdf
Cicerone: Design of a Real-Time Area Knowledge-Enhanced Venue Recommender

    Daniel Villatoro, Jordi Aranda, Marc Planagumà, Rafael Gimenez and Marc Torrent-Moreno
                          Barcelona Digital Technology Centre, Barcelona, Spain
                    {dvillatoro,jaranda,mplanaguma,rgimenez,mtorrent}@bdigital.org



Abstract

Smart devices with information-sharing capabilities anytime and anywhere have opened a wide range of ubiquitous applications. Within urban environments citizens have a plethora of locations to choose from and, with the advent of the smart-cities paradigm, it falls to location-based recommender systems to provide citizens with adequate suggestions. In this work we present the design of an in-situ location-based recommender system, where the venue recommendations are built upon the users’ location at request time, but also incorporate the social dimension and the expertise of the neighboring users whose knowledge is used to build the recommendations. Moreover, we propose a specific easy-to-deploy architecture that bases its functioning on participatory social media platforms such as Twitter or Foursquare. Our system constructs its knowledge base from the accessible data in Foursquare, and similarly obtains ratings from geopositioned tweets.

1   Introduction

Urban environments host a plethora of interesting locations such as restaurants, shops, museums, theaters and a wide range of other venues that can neither be known by all users nor be of interest to all of them. However, each citizen can potentially become an expert on the neighborhood he lives in or visits most often, as he will know, and may have visited, more venues in that area. Therefore, it is straightforward to see that a specific citizen may have no problem finding a venue adequate to his taste in his neighborhood of expertise, but the same task potentially becomes cumbersome in a different, less-known neighborhood. It then becomes a problem for citizens to find locations they might enjoy when away from their area of expertise. The problem of finding adequate items for specific users is the one classically solved by recommender systems.

In our case, we focus on location-based recommender systems [Zheng et al., 2009; Park et al., 2007], where users are recommended locations to visit with the aim of maximizing users’ satisfaction. This type of recommender complements the previously analyzed ones, as it also has to take into account the context, the distance from the user to the recommended venue, and possibly several other factors.

With the penetration of smart devices, users can access information anytime and anywhere, and the system we present in this work profits from those ubiquitous computing capabilities; our on-site location-based recommender system allows users to obtain the most adequate venue with respect to their current position. Our approach profits from a different dimension of the users’ parameter space, namely their social relationships and their relative geographical knowledge with respect to the location of the items. This model provides an alternative solution to the problem of providing personalized recommendations in a geospatial domain: user expertise in this type of domain conveys an implicit continuous knowledge of the surrounding geospatial area and the locations within that area. Our solution intelligently combines this user geospatial knowledge with the classical social distances amongst users used in state-of-the-art recommenders.

Our system personalizes recommendations of locations not only considering the past history of a specific user, but also (1) the current location of the user, (2) the social distance to other similar users and (3) their expertise in the area where the recommendation is going to be provided. This aggregation function basically expresses the tendency of a user to visit a certain location given his distance to the location, the past history of the user and his friends, and their knowledge of the area.

To the best of our knowledge, there are no existing recommender systems that profit from the inherent characteristics of geographical location, such as continuity in space, users’ area expertise or word-of-mouth location suggestions, to generate recommendations to users.

2   State of the Art

As we have discussed previously, the problem of finding adequate venues for citizens to visit has already been tackled by the recommender community under the heading of location-based recommender systems [Zheng et al., 2009; Park et al., 2007]. Despite the impressive amount of literature in this area, this is still an open problem, even for those with access to complete datasets and user profiles [Sklar et al., 2012], as new methods and algorithms are being proposed to boost their efficiency.

In this work, however, we propose the integration of social information into the calculation of the recommendations. Some authors have investigated the potential of explicitly including information about users’ relationships from social networks to generate the neighborhood used in classical collaborative filtering (CF) algorithms (social filtering), improving the results obtained by classic CF in the analyzed scenarios [Groh and Ehmig, 2007].

Others [Bonhard and Sasse, 2006] have analyzed how the relationship between advice-seeker and recommender is extremely important in user-centered recommendations, concluding that familiarity and similarity amongst the different roles in the recommendation process aid judgement and decision making. As in our approach, some researchers have considered the important role of experts [Amatriain et al., 2009; Bao et al., 2012]; however, in our case these experts are calculated automatically for each specific area of the city and weighted with respect to the social distance between the advice-seeker and the recommender.

A similar recommendation approach is presented in [Ye et al., 2010], where the authors also propose the usage of Foursquare information to provide venue recommendations to users; more importantly, the social perspective is integrated into their recommendations, developing a Friend-Based Collaborative Filtering (where the neighbours for CF are selected from the social network of users), and an extension of this method, Geo-measured Friend-Based Collaborative Filtering (where only closely located friends are selected as neighbours for CF).

Our method then proposes a combination of the Geo-measured Friend-Based Collaborative Filtering [Ye et al., 2010] and experts [Amatriain et al., 2009; Bao et al., 2012], in our specific case, neighborhood or area experts.

3     Cicerone Recommender System

In this section we provide the theoretical framework of the Cicerone location-based recommender system. Firstly we describe the basic terminology used later in the recommendation algorithm. As we have sketched previously, our system bases its functioning on three information elements: the users’ social network, the users’ area knowledge and the current location of the requesting user.

3.1    Basic Terminology

As used herein, the term “location data item” stands for any location item or representation of a location. A “location item” is intended to encompass any type of location which can be represented on a map using a latitude, a longitude, and possibly a category.

The location recommender may be capable of selecting relevant locations for a given target user. To do so, users should be comparable entities, and locations as well. It should be understood that the implementations described herein are not item-specific and may operate with any other type of item visited/shared by a community of users. For the specific case of bar or restaurant items, users may interact with the items by visiting them. The Recommendations Set is the set of locations formed by the items the user is being recommended. A user A’s Recommendations Set will be denoted herein as $R_A$.

An essential concept is that of the “Check-in” ($CI^t_{U,L}$)¹, which represents the attendance of a user U at a certain location L in the last t days. Our system will generate location recommendations for users by considering not only their geographical position, but also their social relationship with other users and their degree of knowledge of the visited locations.

In order to obtain more adequate recommendations in this type of environment, we envision the necessity of certain estimators. Firstly, we need to quantify how well a certain user knows a specific area (by considering the attendance frequency to locations in that area with respect to the rest of the city). Moreover, it becomes necessary to understand the social distance between the target user and the other users whose opinions are being used to create the recommendations for the target user.

These measures are specified next.

The Area Knowledge ($AK^t_{U,L}$) of a user U with respect to a location L is calculated as:

$$AK^t_{U,L} = \frac{\sum_{l' \in \mathit{PostalCode}_L} CI^t_{U,l'}}{\sum_{\forall a \in \mathit{Locations}} CI^t_{U,a}} \tag{1}$$

and represents how familiar a user is with a specific area of the city (represented by its postal code).

The Location Frequency ($LF^t_{U,L}$) of a user U in a certain location L is calculated as:

$$LF^t_{U,L} = \frac{CI^t_{U,L}}{\sum_{\forall a \in \mathit{Locations}} CI^t_{U,a}} \tag{2}$$

and normalizes the number of visits of user U to location L.

The Social Importance ($SI^t_{U,U'}$) of a user U' for a user U is calculated as:

$$SI^t_{U,U'} = \frac{(\mathit{Degree}_{U'})^{\frac{1}{d(U,U')}}}{\mathit{nodes} - 1} \tag{3}$$

where $\mathit{Degree}_U$ represents the number of connections that U has in its social network, $\mathit{nodes}$ represents the total number of nodes in the social network (used to normalize the SI), and $d(U,U')$ represents the geodesic distance, i.e. the minimum number of hops necessary to reach U' from U using the shortest path in their social network².

The Location Value ($LV^t_{L,U}$) of a location L for a user U at time t is calculated as:

$$LV^t_{L,U} = \frac{\sum_{\mathit{users\ in\ } L} \left( LF^t_{U,L} \times AK^t_{U,L} \times SI^t_{U,U'} \right)}{|\mathit{users}|} \tag{4}$$

¹ The attendance of a user at a certain location can be captured in several ways, for example, a Foursquare check-in, a geopositioned tweet, or a CDR trace of a phone call.
² $d(U,U') < 0$ means that there is no possible path that connects U and U'.
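To make the estimators of Equations (1) to (4) concrete, the following is a minimal Python sketch over toy in-memory data. Everything named here is an assumption made for illustration and not part of the Cicerone implementation: the dictionaries `CHECKINS`, `POSTAL_CODE` and `FRIENDS`, the user and venue names, and the treatment of the social network as undirected. The sketch also reads Equation (4) as a sum over each user V who checked in at L, with LF and AK evaluated for V and SI evaluated between the target user and V, and it sets SI to zero when no path exists or when both users coincide; the text does not fix either choice.

```python
from collections import deque

# Hypothetical toy data: check-in counts per user and location in the last t days.
CHECKINS = {
    "ana": {"bar_gracia": 4, "cafe_gracia": 1, "museum_born": 1},
    "ben": {"bar_gracia": 1, "museum_born": 3},
}
# Hypothetical mapping from each location to its postal code.
POSTAL_CODE = {"bar_gracia": "08012", "cafe_gracia": "08012", "museum_born": "08003"}
# Hypothetical social network, stored as undirected adjacency sets (an assumption).
FRIENDS = {"ana": {"ben"}, "ben": {"ana", "carla"}, "carla": {"ben"}}


def total_checkins(user):
    """Denominator of Eqs. (1) and (2): all check-ins of the user in the city."""
    return sum(CHECKINS.get(user, {}).values())


def area_knowledge(user, location):
    """Eq. (1): share of the user's check-ins that fall in the postal code of `location`."""
    total = total_checkins(user)
    if total == 0:
        return 0.0
    area = POSTAL_CODE[location]
    in_area = sum(n for loc, n in CHECKINS[user].items() if POSTAL_CODE[loc] == area)
    return in_area / total


def location_frequency(user, location):
    """Eq. (2): share of the user's check-ins made at `location` itself."""
    total = total_checkins(user)
    if total == 0:
        return 0.0
    return CHECKINS[user].get(location, 0) / total


def geodesic_distance(source, target):
    """Minimum number of hops between two users (breadth-first search); -1 if no path."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, hops = queue.popleft()
        for neighbour in FRIENDS.get(node, ()):
            if neighbour == target:
                return hops + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return -1


def social_importance(user, other):
    """Eq. (3): Degree_other ** (1 / d(user, other)) / (nodes - 1)."""
    d = geodesic_distance(user, other)
    if d <= 0:  # unreachable or same user: contribution set to 0 (assumption)
        return 0.0
    degree = len(FRIENDS.get(other, ()))
    return degree ** (1.0 / d) / (len(FRIENDS) - 1)


def location_value(location, user):
    """Eq. (4): average of LF * AK * SI over the users who checked in at `location`."""
    visitors = [v for v, counts in CHECKINS.items() if counts.get(location, 0) > 0]
    if not visitors:
        return 0.0
    score = sum(
        location_frequency(v, location) * area_knowledge(v, location) * social_importance(user, v)
        for v in visitors
    )
    return score / len(visitors)
```

With this toy data, `location_value("bar_gracia", "carla")` evaluates to roughly 0.170: the direct friend `ben` contributes with full social importance but little knowledge of the area, while `ana`, two hops away, contributes high area knowledge discounted by the square root in Equation (3).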
In Equation (4), |users| represents the number of users that have “checked in” to that location.

The resulting value basically aggregates the information of surrounding detected locations, considering the social distance from our specific user to the users that visited that location (social information), and their familiarity with the area of the location (geographical information).

3.2    Cicerone Recommendation Algorithm

The general algorithm is the following:
 1. Once the position of a user A is detected, the system automatically captures its latitude and longitude and launches the process that builds the personalized recommendation set for that position and user at that moment.
 2. The system retrieves all the locations within a 100 m radius of the current position.
 3. The recommendation set for user A, $R_A$, is a set constructed with all the locations within a 100 m radius of the user’s current position.
 4. The system calculates the Location Value of each of the locations in that set.
 5. The system orders the retrieved locations according to the calculated Location Values and constructs the Recommendation Set with the 3 locations with the highest values.

4     Functional implementation of Cicerone

As explained previously, the theoretical framework to build the recommendation needs a number of data sources, namely users’ locations, venues and the social relationships amongst users. As this recommendation process is envisioned to be executed when users are in situ, the main functional requirement for our system is to work from a mobile device. Existing working prototypes have opted for the development of a dedicated app (such as Yelp, TimeOut or TripAdvisor) that users have to download onto their devices. An app provides several advantages, such as explicit user profiling and the definition of the information necessary to obtain the recommendation. However, for us it implies the big problem of reaching a critical mass of users that would make the knowledge base and the recommendation more accurate. To avoid this limitation we have opted to develop our system as a service embedded within already massive social networks. Twitter seems to be the ideal candidate for us for the following reasons:
    • Twitter shows a widespread, uniform penetration almost worldwide, with a continuously increasing number of users (288 million monthly active users in July 2012, showing an increase of 40% since July 2009 [GlobalWebIndex, 2013]).
    • It allows users to associate their location when posting a message, attaching the specific coordinates as meta-information.
    • It provides developers with an accessible API to obtain the publicly published tweets in near real time.
    • Twitter captures a social network of followers and followings, publicly available for each user.

As we have initially decided to deploy our application in the city of Barcelona, Twitter proves to be an ideal candidate. The number of geopositioned tweets captured daily within Barcelona is 6200 (according to data from 2012), and the social network inferred from the users posting them can be seen in Figure 1 (with an average degree of 2.93).

      Figure 1: Twitter-sensed Barcelona Social Network

Similarly, to populate our items database, we opt to use the crowdsourced database of Foursquare. Foursquare is a location-based social media platform for communicating the venues a user is in. This platform allows users to add new locations to its database, by introducing not only the venue’s name and specific location (with the GPS position and postal address) but also a semantic category. Foursquare describes places according to a rather complete taxonomy, where about 400 kinds of places are identified and grouped into 9 wide categories³. Foursquare provides an accessible API allowing us to take snapshots of the existing locations in a certain city. Within the city of Barcelona, from a snapshot taken in April 2013, we have detected over 66,000 Foursquare locations, uniformly distributed amongst the different districts.

Moreover, within the OpenData movement, the city of Barcelona provides a machine-readable administrative division necessary for our theoretical calculations (namely the District divisions).

5       System Architecture

In this section we describe the system architecture (sketched in Figure 2) needed to implement a functional instantiation of the theoretical framework previously described. Firstly we describe the social network monitoring used as data input for our platform, and also as the user interface for interaction with the recommendation engine. After that, we sketch the persistence infrastructure used to save the information related to venues and users, and finally we describe the information update process and the component needed to develop the recommendation platform.

³ The detailed categorization of Foursquare categories and parent categories can be found at http://aboutfoursquare.com/foursquare-categories/ (Last access April 2nd 2013).
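Before the individual components are detailed, the five steps of the algorithm in Section 3.2 can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function names `haversine_m` and `recommend`, the `venues` dictionary of coordinates and the `location_value` callable are hypothetical, and the great-circle (haversine) distance is one reasonable way to implement the 100 m radius, not a method stated in the text. A deployed system would more likely delegate the radius query to a geo-spatial index.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def recommend(user, lat, lon, venues, location_value, radius_m=100.0, top_k=3):
    """Return the names of the `top_k` best-valued venues within `radius_m` of the user.

    `venues` maps a venue name to its (latitude, longitude) pair; `location_value`
    is any callable (venue_name, user) -> float standing in for Equation (4).
    Both are hypothetical interfaces chosen for this illustration.
    """
    # Steps 2 and 3: the candidate set holds every venue within the radius.
    candidates = [
        name
        for name, (vlat, vlon) in venues.items()
        if haversine_m(lat, lon, vlat, vlon) <= radius_m
    ]
    # Step 4: compute the Location Value of every candidate.
    scored = [(location_value(name, user), name) for name in candidates]
    # Step 5: order by decreasing value and keep the best `top_k`.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]
```

Passing the scoring function as a parameter keeps the ranking step independent of how Location Values are computed or cached, so the same skeleton works with a stub in tests and with the real estimator in production.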
               Figure 2: Cicerone Architecture

5.1   Social Networks Monitoring: Sensing the City

The usage of social media platforms in our system is twofold: (1) information acquisition to feed the knowledge base of our platform, and (2) a channel for users’ interaction with our technology.

The participatory information provided by users in Foursquare will be used to populate our items database; similarly, we will use geopositioned tweets to calculate users’ Area Knowledge. Therefore, the social network monitor is the first layer of our architecture and it is composed of two crawlers: a Foursquare crawler and a Twitter crawler. The Foursquare crawler is in charge of scanning the target city for new venues. Once a new venue is identified, it is stored in the items database with its associated meta-information such as its specific coordinates, the address or the category. The Twitter crawler is in charge of capturing all the tweets generated in the target city. Its scope is threefold: (1) build and update the users’ social network, (2) update the users’ area knowledge using their geopositioned tweets, and (3) permit users’ communication with the system.

Moreover, and given its popularity, we use Twitter as the communication channel of our recommender system through a bot account managed by our intelligent agent.

5.2   Persistence Infrastructure: Urban data Model

Any recommender system bases its functioning on three main elements: users, items and ratings. These three elements have to be stored according to the inherent properties of the system, which in this case imply real-time information access and update. Fed by the crawlers, the data required for our recommendation solution arrives at the persistence manager and each of these elements is stored in a persistence infrastructure in the following way:

Users: One of the main functional requirements of the recommendation algorithm is access to the social network of users. In order to store this information effectively, we opt for a graph-oriented database, namely Neo4j⁴. This type of database allows us to persist the users’ social network in the form of a directed weighted graph. In this database, we persist users as nodes and establish an edge between two nodes if there exists a social relation between them according to the users’ Twitter profiles; specifically, an edge is created from user A towards user B if user A follows user B on Twitter. The edge’s weight will be defined depending on the users’ interactions: different types of Twitter interactions (such as mentions, retweets or favourites) will affect the weight differently.⁵ Another important piece of information about users is saved, namely their “Check-ins” (as described in Sec. ). These “check-ins” (the specific coordinates of each user’s geopositioned tweet) are stored in a MongoDB⁶ database, as we can profit from its geo-spatial index.

Items: The items in our system are the locations within the target city. The locations database needs to provide efficient information access, as the recommender algorithm needs a high average number of accesses to it to build the recommendations. Moreover, an important characteristic of an item is its location in space, which is exploited through geo-spatial indexing. Given these two characteristics (rapid information access and geo-spatial indexing), as well as the potential for distributed computing, we opt to implement this database using MongoDB.

Ratings: The notion of rating is classically treated as an explicit evaluation of an item by a user. However, in this work we take an alternative approach to ratings: we consider as a constant rating value the user’s presence in a location, sensed through the geopositioned tweets posted from or close to the venue location. This value is not obtained directly: the ratings will be part of the knowledge obtained by the recommendation engine and its information update process, which captures users’ visits to specific locations, as explained in Section 5.3.

⁴ http://www.neo4j.org
⁵ Specific values and functions for edge weight determination will be developed in later versions of our software using empirical information.
⁶ http://www.mongodb.org

5.3    Recommendation Engine: Information Update Process

The last component in our proposed architecture is the recommendation engine, containing the implementation of the theoretical algorithms previously explained in Section 3. Once we have our social networks monitor as an urban data sensor, and the ability to persist all the raw data required by the system, this component will be responsible for the knowledge extraction process and the business logic triggered to generate a recommendation.

Because of the real-time aspect of our system, our recommendation platform (whose workflow is detailed in Figure 3) needs to continuously update some information elements such as users’ Area Knowledge and Location Frequency, the creation or update of social relationships amongst users, or the appearance of new locations. Specifically, we envision the users’ communication with our system through a Twitter personality that encapsulates our recommendation platform; every time a user mentions our system’s username, the platform will capture this tweet (through the Mention’s Service sketched in Figure 2) and identify it as an explicit request for a recommendation that will trigger the whole intelligent process. Even though our technological platform allows us to generate recommendations every time a user’s location is captured (with every geopositioned tweet), we prefer to restrict its functioning to a mention system, reducing the overall intrusiveness.

                Figure 3: Cicerone Workflow

After the recommendation is generated, it is returned to the user also through Twitter with a message posted by our intelligent agent.

6   Conclusion and Future Work

The designed recommender system plans to profit from the information proactively shared by users in the analyzed participatory platforms. However, as recently argued in [Jeske, 2013], this type of crowdsourced system is susceptible to malicious attacks: in our case, and given the lack of restrictions on posting geo-positioned content to Twitter, one could easily envision a method to create a fake user that becomes the one with the highest area knowledge in every area of the city, and then directly influences the resulting recommendations at will.

Despite this potential problem associated with the publishing policy of Twitter and Foursquare, and as we have analyzed in Sec. 2, many others have used information from these sources to generate location-based recommendations. However, to the best of our knowledge, the presented algorithm is the first to explicitly include the user’s expertise about one of the fundamental properties of the items: the area where they are located. By combining this information with some social information, we hypothesize that our system will be able to outperform other location-based recommender systems.

Our main long-term research task is the development of user profiling in terms of the type of venues the user attends, with the overall objective of combining area expertise with specific user profiles.

Acknowledgments

This work has been completed with the support of ACC1Ó, the Catalan Agency to promote applied research and innovation.

References

[Amatriain et al., 2009] Xavier Amatriain, Neal Lathia, Josep M. Pujol, Haewoon Kwak, and Nuria Oliver. The wisdom of the few: a collaborative filtering approach based on expert opinions from the web. In Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, SIGIR ’09, pages 532–539, New York, NY, USA, 2009. ACM.

[Bao et al., 2012] Jie Bao, Yu Zheng, and Mohamed F. Mokbel. Location-based and preference-aware recommendation using sparse geo-social networking data. In Proceedings of the 20th International Conference on Advances in Geographic Information Systems, SIGSPATIAL ’12, pages 199–208, New York, NY, USA, 2012. ACM.

[Bonhard and Sasse, 2006] P. Bonhard and M. A. Sasse. ’knowing me, knowing you’ – using profiles and social networking to improve recommender systems. BT Technology Journal, 24(3):84–98, July 2006.

[GlobalWebIndex, 2013] GlobalWebIndex. Twitter now the fastest growing social platform in the world. Web Report, Jan 2013.

[Groh and Ehmig, 2007] Georg Groh and Christian Ehmig. Recommendations in taste related domains: collaborative filtering vs. social filtering. In Proceedings of the 2007 international ACM conference on Supporting group work, GROUP ’07, pages 127–136, New York, NY, USA, 2007. ACM.

[Jeske, 2013] Tobias Jeske. Floating car data from smartphones: What google and waze know about you and how hackers can control traffic. https://media.blackhat.com/eu-13/briefings/Jeske/bh-eu-13-floating-car-data-jeske-wp.pdf (Last access April 1st 2013), 2013.

[Park et al., 2007] Moon-Hee Park, Jin-Hyuk Hong, and Sung-Bae Cho. Location-based recommendation system using bayesian users preference model in mobile devices. In Ubiquitous Intelligence and Computing, pages 1130–1139. Springer, 2007.

[Sklar et al., 2012] Max Sklar, Blake Shaw, and Andrew Hogue. Recommending interesting events in real-time with foursquare check-ins. In Proceedings of the sixth ACM conference on Recommender systems, RecSys ’12, pages 311–312, New York, NY, USA, 2012. ACM.

[Ye et al., 2010] Mao Ye, Peifeng Yin, and Wang-Chien Lee. Location recommendation for location-based social networks. In Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, GIS ’10, pages 458–461, New York, NY, USA, 2010. ACM.

[Zheng et al., 2009] Yu Zheng, Yukun Chen, Xing Xie, and Wei-Ying Ma. Geolife2.0: a location-based social networking service. In Mobile Data Management: Systems, Services and Middleware, 2009. MDM’09. Tenth International Conference on, pages 357–358. IEEE, 2009.