Culture-aware Point-of-Interest Category Completion in a
     Global Location-Based Social Network Database without
                       Access to User Data
                            Nikolaos Lagos                                                        Ioan Calapodescu
                          Naver Labs Europe                                                        Naver Labs Europe
                           Meylan, France                                                            Meylan, France
                    nikolaos.lagos@naverlabs.com                                            ioan.calapodescu@naverlabs.com

ABSTRACT                                                                              culture-aware POI category imputation. We formally de-
Point of Interest (POI) categories can facilitate a number of ser-                    fine the problem and present a corresponding analysis.
vices, such as location-based search and place recommendation.                      • Culture-related categorisation without requiring access to
However, such information can be incomplete and/or incorrect,                         user data, has not been proposed before. To achieve that,
especially in crowdsourcing environments. In the literature, au-                      we simulate user information, by replacing culture-related
tomatic category imputation has been suggested to tackle this                         training inputs in an appropriate manner, at inference
problem, showing that contextual information is vital for increas-                    time. For instance, the country where a POI is located can
ing the quality of such predictions. To this end, users’ check-in                     be one of the training inputs. We can replace at inference
data, and most particularly location and time of visit, is often                      time the values of this input with the nationality of the
used as the notion of context. In this work, we propose a method                      user.
that considers culture as a contextual parameter. Contrary to                    The rest of this paper is organised as follows. We review related
existing methods, our approach does not require access to user                work in Section 2. We formally define the problem in Section 3
data. We illustrate the feasibility of our method by performing               and describe our method in Section 4. Experiments are presented
experiments on data from Foursquare, a global location-based                  in Section 5. Section 6 includes the conclusions of this work.
social network.
                                                                              1.1     Industrial context
1    INTRODUCTION                                                             Our company, Naver, provides, among other things, location-based
                                                                              services. Good-quality POI data is thus of major importance.
Point of Interest (POI) categories can facilitate a number of ser-
                                                                              In this context, in Naver Labs Europe, we have been exploring
vices, such as location-based search and place recommendation.
                                                                              automatic multi-lingual methods for completing and correcting
However, as discussed in recent work, categories, especially in
                                                                              POI semantic tags found in Foursquare’s database, a global crowd-
crowdsourcing environments, can be incomplete and/or incor-
                                                                              sourced location-based social network1 . The scope of our work
rect. Automatic category prediction has therefore been proposed
                                                                              is to eventually support:
to remedy this problem and impute missing categories [20].
   Recent advances in POI categorisation have shown that con-                       • A user that is not familiar with local culture to discover
textual information is vital for increasing the quality of automatic                  appropriate POIs in the vicinity of her/his position. If POIs
POI category prediction [5, 11, 15, 16]. To this end, users’ check-                   are not categorised appropriately, the user can not easily
in information, and most particularly location and time of visit                      search for them and they will not be included in the search
are often used to define the context. There are two important                         results. In addition, proper POI categorisation could also
shortcomings though with these approaches:                                            help in recommendation.
    (1) Getting relevant data presupposes that users will allow
        their check-in data to be shared with the corresponding               2 RELATED WORK
        service. However, recent initiatives and laws, such as the            2.1 Point-of-Interest Category Prediction
        EU General Data Protection Regulation (GDPR), stipulate               Most of the work on Point-of-Interest category prediction has
        that users should have more control over their personal               taken place in the context of Location-Based Social Networks.
        data, and potentially disallow the use (and storage) of data,             Ye et al. [16] were the first to show that taking into account
        such as their check-in information, from third-parties.               the geographical, local context of users can improve POI recom-
    (2) Context is defined in terms of location proximity in exist-           mendation and categorisation. From then on, other related work
        ing systems. However, recent advances in recommender                  has been systematically using such contextual data, originating
        systems and information search have shown that integrat-              from check-ins and/or related mobile sensors [15].
        ing information about cultural backgrounds, especially                    Krumm [11] showed that other personal information (e.g. gen-
        in a global setting, can help in developing high quality              der and age range) could also help to improve the results further.
        systems [7, 17].                                                          The type of user data explored in the state-of-the-art include
    The main contributions of this work are as follows.                       the number, frequency, time and duration of user check-ins, and
     • To our knowledge this is the first study of a multi-lingual,           sometimes demographic information (e.g. age range, gender).
       global location-based social network database related to               This data is usually combined with the location of the POI and
                                                                              its proximity to other POIs.
© 2020 Copyright for this paper by its author(s). Published in the Workshop
Proceedings of the EDBT/ICDT 2020 Joint Conference (March 30-April 2, 2020,
Copenhagen, Denmark) on CEUR-WS.org. Use permitted under Creative             1 We got access to this data thanks to an agreement between Naver Labs and
Commons License Attribution 4.0 International (CC BY 4.0).                    Foursquare.
    He et al. [5] and Zhou et al. [20], have in addition used POI                          In this work, we assume that there are several "culture
information including user defined semantic tags, and name token                           specific" labelsets, such that $\bigcup_{j=1}^{n} L_j \subseteq L$, where n is the maximum
embeddings pre-computed on a domain-specific corpus.
    Jiang et al. [8] apply machine classification techniques to the                        number of cultures that are represented by the labels in Λ and
problem of fusing different POI databases under a common clas-                             n < ∞. For instance revisiting our example, if we assume that
sification hierarchy, the North American Industry Classification                           L = {Japanese Restaurant, Noodle House, Ramen Restaurant },
System (NAICS). Their study involves only a few American towns                             and we have at least two cultures French and Japanese with
and they use as input features only the categories and their pre-                          corresponding labelsets L 1 and L 2 , then it could be that L 1 =
defined relations, as already manually attributed in the original                          {Japanese Restaurant, Noodle House} and L 2 = {Noodle House,
data sources to the POIs.                                                                  Ramen Restaurant }, while Lo = {Japanese Restaurant }.
    A number of works have been carried out on location prediction                             We denote by $y = (y_1, ..., y_m)$ an m-dimensional binary vector
based on social streams, e.g. Twitter, where the main research                             where $y_i \in \{0, 1\}$ such that $y_i = 1$ if and only if $l_i \in L$. We
interest is using noisy and short text for classification. For                             define a variant of it for culture c as $y_c = (y_{c1}, ..., y_{cm})$ where
instance, Cano et al. [2] use tweets to infer volatile POI classes                         $y_{ci} \in \{0, 1\}$ such that $y_{ci} = 1$ if and only if $l_i \in L_c$ and each $y_i - y_{ci}$
according to specific temporary events happening at a specific                             is non-negative. Accordingly, the m-dimensional binary vector
location. Interested readers may refer to Zheng et al. [19] for a                          $y_o = (y_{o1}, ..., y_{om})$ with $y_{oi} \in \{0, 1\}$ has $y_{oi} = 1$ if and only if $l_i \in L_o$.
comprehensive survey of the domain. Despite superficial com-                                   We assume that C ∪ N = A where C stands for the set of
monalities, this subject is different from the one studied here.                           culture-related attributes, and N the set of culture-independent
    To summarise, as mentioned in the Introduction, all the afore-                         ones. Considering again our example, the C could be instantiated
mentioned approaches assume access to user data at training                                by the country where the restaurant is located, and N by the
time and have not dealt with the notion of user’s culture.                                 name "La Table du Ramen".
                                                                                               We denote by $x$, $x_C$, $x_N$ the vectors that represent A, C, and N
2.2      Culture-aware Recommendation                                                      respectively, such that the observed p in the dataset is
Recent work has highlighted the importance of modelling users’                             defined as
culture in recommendation and information search [7]. Notably,                                                    $p = \{x, y_o\} = \{x_C, x_N, y_o\}$
the cultural background of a user was found to play a vital role                           Our objective is to make culture-specific predictions such that
in how recommended items are judged [14].                                                  for culture c
   To our knowledge, in the area of automatic recommender                                                                   $p = \{x, y_c\}$
systems, Zangerle et al. [17] use culture as a computation                                 We thus formulate our goal as a multi-label classification problem
parameter. The authors base their proxy for defining culture on                            where we want to find a classifier $b_c : X \to Y$ where X is the
the nationality of the users and use Hofstede et al.’s [6] grouping                        input space (all possible attribute vectors) and Y the output space
of nationalities in culturally similar clusters as a guide for their                       (all possible labelset vectors), such that $y_c = b_c(x)$.
computation model2 . However, in this case as well, the authors
assume that they have access to user data.                                                 4    CATEGORY PREDICTION METHOD
                                                                                           The category of POIs, especially in a location-based social net-
3     PROBLEM DEFINITION                                                                   work, is related to the cultural profile of the users that visit it.
Our goal is to predict POIs’ categories that are appropriate to a                          This has been proven via the inclusion of user profiles in related
specific culture. For instance, a typical place found in the database                      work (c.f. Section 2). However, instead of accessing user informa-
of Foursquare is "La Table du Ramen"3 , which is located in Paris,                         tion to discover such profiles, we use the observation that the
France. The category found in Foursquare’s database is Japanese                            majority of POIs are categorised in a manner that reflects local
Restaurant, which may be sufficient for the local culture. However,                        culture in location-based social network databases. For instance,
a Japanese would expect the POI to be categorised at a much                                as shown in Fig. 1, we find in Foursquare that restaurants selling
more fine-grained level, as e.g. Ramen Restaurant or as a Noodle                           noodle dishes4 are usually categorised as Asian Restaurant in
House. The objective is to automatically predict such categories,                          France, while Ramen Restaurant is by a large margin the most
according to one’s culture.                                                                popular category in Japan.
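The observation behind Fig. 1, that POIs sharing a name token are categorised differently from one country to another, can be illustrated with a minimal sketch. The records below are hypothetical toy data rather than Foursquare data, and the helper names `category_distribution` and `most_popular` are assumptions introduced only for this illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy records (name, country, category); not Foursquare data.
POIS = [
    ("Tokyo Noodle Bar", "JP", "Ramen Restaurant"),
    ("Noodle Ichiban", "JP", "Ramen Restaurant"),
    ("Noodle Street", "JP", "Noodle House"),
    ("Noodle Paris", "FR", "Asian Restaurant"),
    ("Happy Noodle", "FR", "Asian Restaurant"),
    ("Noodle Bar Lyon", "FR", "Noodle House"),
]


def category_distribution(pois, token):
    """Return {country: {category: share}} for POIs whose name contains token."""
    counts = defaultdict(Counter)
    for name, country, category in pois:
        # Whole-token match on the lower-cased, whitespace-split name.
        if token.lower() in name.lower().split():
            counts[country][category] += 1
    return {
        country: {cat: n / sum(ctr.values()) for cat, n in ctr.items()}
        for country, ctr in counts.items()
    }


def most_popular(distribution, country):
    """Category with the largest share in the given country."""
    shares = distribution[country]
    return max(shares, key=shares.get)


dist = category_distribution(POIS, "noodle")
```

On this toy sample, `most_popular(dist, "JP")` and `most_popular(dist, "FR")` differ, mirroring the kind of country-level contrast the text describes.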
      We consider that POI category prediction in our context is
equivalent to the problem of completing a specific attribute of
the dataset, the one that represents categories of POIs, based
on data from the remaining attributes. Formally, a POI p should
have an attribute that includes an ideal, complete and correct,
category labelset, i.e. a set of relevant labels L ⊂ Λ, where
$\Lambda = \{l_1, ..., l_m\}$ is the set of all possible labels, while the rest of the
attributes are represented by the set A. However, in practice,
the set of labels attributed to p, which we hereafter call the                             Figure 1: Category distribution of POIs having the token
observed labelset Lo ⊂ Λ, can be incomplete and/or incorrect.                              "noodle" in their name in Japan and France. It is obvious
                                                                                           that "Ramen Restaurant" is the most popular category in
2 Hofstede et al. [6] notes that culture is always a collective phenomenon, as it is, at   Japan and "Asian Restaurant" in France.
least partly, shared with people who live or lived within the same social environment,
which is where it was learned. In that sense, context may encompass a lot of different
aspects, including the notions of social status, education, and language. Hofstede et         Based on this insight, if at training time we use culture related
al. [6] mentions that "one’s country" is an important parameter that defines culture       attributes to learn a latent representation of POI’s categories,
in this sense [6]
3 English translation: The Table of Ramen.                                                 4 The token "noodle" is explicitly mentioned in the POI name.
then at inference time we can replace the corresponding inputs                          latitude and longitude are two of the most frequently used geo-
according to the target cultural profile. For instance, if at training                  graphical coordinates. The predominant way of modelling coor-
time we use the country in which the POI is located as an input                         dinates is to discretise the input space [13, 18]. This could take
parameter to our model, at inference time we can replace the                            the form of a grid separated into a fixed number of cells. Usually
value of this parameter with the target country, simulating what                        in this case the form and granularity of the cells has to be selected
would happen if the same POI was located in the target country                          appropriately. We use countries as a proxy of different cultures.
instead. We can thus generate culturally-appropriate predictions                        We represent countries as categorical variables, as this granular-
and complete the database offline. Revisiting the above example,                        ity can be related to different cultures, as explained in Section 2.2.
we can assume that some of the Ramen restaurants located in                             However, other representations could also be suitable, such as
Japan, would be categorised in a different manner, e.g. as Noodle                       regions etc.
Houses, if the value of the country was changed to France5 . An
overview of the proposed method is shown in Fig. 2.                                     Note: Other cultural variables. This is an optional category,
   Based on the discussion above, we reformulate our problem,                           which can include other parameters related to culture. For in-
and look for a classifier bc such that yc = bc (xc ) where xc is a                      stance, to determine socio-cultural context, opening hours and
culture-specific variant of the input.                                                  the price range of corresponding services may also be impor-
   To find bc , we follow a standard approach and transform our                         tant. Both of these variables could be discretised and considered
problem into finding a real-valued vector function                                      categorical variables.
$f : X \to S \subseteq [0, 1]^m$ that indicates the relevance of a label $l_i$ in
relation to the input, i.e. $f(x_c) = (f(x_c, l_1), f(x_c, l_2), ..., f(x_c, l_m))$,          4.2     Training
where $f(x_c, l_i)$ is the confidence of $l_i \in \Lambda$ being a correct label               Once we vectorise our attributes, as explained in the previous
for $x_c$ and m is the number of labels. This corresponds to                            section, we use a concatenation layer to combine them.
an estimation of $p(y_{ci} \mid x_c)$ with $y_{ci} \in \{0, 1\}$. Note that ideally, observed       If a is a POI attribute such that $a \in A$ and $\phi_a(x_a) \in \mathbb{R}^{D_a}$ is the
outputs should be completely specified vectors; however, in our                         attribute-specific vectorisation function, where $D_a$ denotes the
context the training instances are only partially complete, so of                       dimensionality associated with the attribute a, then the final input
the form $(x_{ci}, y_{oi})$. We follow the Binary Relevance method, and thus                 vector is a concatenation of all vectorised individual attributes:
learn m binary models, each specialised in predicting whether
one label is correct or not, independently of the other labels.                                            $\tilde{x} = [\phi_1(a_1), \phi_2(a_2), ..., \phi_n(a_n)]$
For an unseen xc , the predicted labels are then the union of the
predictions of all the binary models.                                                   where n denotes the number of attributes. We feed this to a dense
   To learn the binary models we perform the following steps.                           layer

      • Attribute selection: We use the name and spatial                                                            $h = \mathrm{relu}(W_h \tilde{x} + b_h)$
        geo-coordinates of the POIs.                                                    After applying a dropout layer, we then calculate
      • Vectorisation: This step includes transforming the attributes
        into a form that can be treated by the classifier, as explained                                     $p(y \mid h, \theta) = \mathrm{sigmoid}(W h + b)$
        in Section 4.1.
                                                                                        where $\theta = (W, b, W_h, b_h)$ are learned parameters of the model.
      • Training: A model that learns to predict the probability of
                                                                                        $\mathrm{sigmoid}(s)$ denotes the logistic function $f(s_i) = \frac{1}{1 + e^{-s_i}}$. The
        $l_i$ being a correct label given x is computed in this step.
                                                                                        parameters $\theta$ are learned by minimising the binary cross-entropy
        As explained in the previous paragraph, our problem is
                                                                                        loss function.
        cast as a supervised machine learning problem. Details
        are provided in Section 4.2.
      • Inference: Whether li is a correct label for culture-specific
                                                                                        4.3     Inference
        variants of x is computed in this step. Details are given in                    The multi-label model learned is given culture-specific inputs at
        Section 4.3.                                                                    this step. It then generates for each label a probability score. To
                                                                                        get from that the corresponding set of accepted labels, a constant
                                                                                        can be applied as threshold (usually this is 0.5) [4].
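The inference step can be sketched as follows: the culture-related input (here the country) is swapped for the target country, the model scores every label, and a constant threshold selects the accepted labels. This is a minimal sketch under stated assumptions. `toy_scores` is a hypothetical stand-in for the trained model with made-up scores, the label list is a three-label toy subset, and treating a score equal to the threshold as accepted is a choice made here, not something the text specifies.

```python
LABELS = ["Asian Restaurant", "Noodle House", "Ramen Restaurant"]


def toy_scores(name, country):
    """Hypothetical stand-in for the trained model f(x_c): one score per label.

    The numbers are invented for illustration; a real model would compute
    them from the vectorised name and country.
    """
    table = {
        "JP": [0.20, 0.35, 0.90],
        "FR": [0.80, 0.55, 0.30],
    }
    return table[country]


def predict_labels(score_fn, name, target_country, threshold=0.5):
    """Replace the culture-related input with the target country, then keep
    every label whose score reaches the threshold (assumed inclusive)."""
    scores = score_fn(name, target_country)
    return {label for label, s in zip(LABELS, scores) if s >= threshold}


poi_name = "Menya Ramen"  # hypothetical POI name
as_seen_in_japan = predict_labels(toy_scores, poi_name, "JP")
as_seen_in_france = predict_labels(toy_scores, poi_name, "FR")
```

Because each label is thresholded independently, the accepted labelset is the union of the per-label decisions, in line with the Binary Relevance formulation.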
4.1      Vectorisation
Categorical variables. We represent them with one-hot en-
                                                                                        5 EXPERIMENTS
coded embeddings, as usually reported in the literature.
                                                                                        5.1 Set up
Sequential variables. Biessmann et al. [1] report that character-                          5.1.1 Data. We perform our experiments on 2.4M POIs ex-
based representations are more robust for a similar setting to ours                     tracted from a large database provided by Foursquare to illustrate
(i.e. sparse data and multiple languages). In addition, Joulin et al.                   the feasibility of our method. Details are provided below.
[9] and Biessmann et al. [1] mention that character n-grams can
perform better than simple, unigram, character-based LSTMs.                                Categorisation hierarchy. Our dataset includes 808 POI cat-
After experimentation we have adopted trigram character based                           egories from the categorisation hierarchy of Foursquare6 , as
LSTMs for POI names.                                                                    shown in Table 1. As the classification hierarchy is based on
                                                                                        crowdsourced data, the parts of the dataset that include more
Spatial variables. Geographical coordinates are the most im-                            POI instances are represented with more categories, resulting in
portant spatial attributes that characterise a POI. For instance,                       it being heavily imbalanced. For instance, the most well devel-
                                                                                        oped category that is located at the root of the hierarchy is Food,

5 In our experiments, 7% of them were categorised as "Noodle House" c.f. Section 5.     6 https://developer.foursquare.com/docs/resources/categories as of 3rd October 2018
                          Figure 2: Overview of proposed method. Images of NN models are adapted from [3]

         Table 1: Category Distribution in the dataset                                    Table 2: Percentage & distribution of semantic tags

   Root Category                           Levels      Categories in Path                     Category               POIs (%)
   Food                                       5                  337                           Café                     9.83
   Shop & Service                             3                  144                        Restaurant                  5.88
   Outdoors & Recreation                      4                   83                        Pizza Place                 4.49
   Professional & Other Places                3                   77                       Coffee Shop                  4.48
   Arts & Entertainment                       3                   53                          Bakery                     3.8
   Travel & Transport                         3                   44                    Fast Food Restaurant              3.23
   College & University                       3                   32
   Nightlife Spot                             3                   25                             ...                      ...
   Event                                      2                   8                    Chinese Restaurant               2.98
   Residence                                  2                   5                    Japanese Restaurant               2.5
                                                                                         Asian Restaurant               2.41
                                                                                                 ...                      ...
with 336 categories distributed in 5 levels. The least developed                          Noodle House                  1.02
one is Residence having only 4 subcategories, over 2 levels7 .                                   ...                      ...
   Label distribution. We have extracted a sample dataset from the                      Ramen Restaurant                0.65
existing Foursquare database for our experiments. We used the                                    ...                      ...
most well developed root category, Food, as seed, and took only
POIs with a high reality index. The distribution of the categories
is similar to the one found in the original hierarchy. To better                      longitude are written in the standard form, normally with
understand the dataset we provide the overall distribution of                         >10-decimal point precision (e.g. latitude: 55.76942424341726,
POIs and the one of the top categories in Table 2. The dataset is                     longitude: 44.948036880105064). Comments are also available for some
skewed in terms of the POI instances attributed to each category,                     of the POIs but as they are relatively sparse we chose not to use
with the first 10 top categories having more than 40% of the POIs                     them for the current experiments.
attributed to them.                                                                      Dataset creation. To generate training, development, and test
   Point-of-Interest attributes. POI attributes include the name,                     data, we used approximate stratified sampling. The goal was to
and the latitude and longitude, transformed into the represen-                        maintain the distribution of positive and negative examples of
tation discussed in Section 4.1. It is important to note that con-                    each label by considering each label independently. Consequently,
trary to completely freely crowdsourced POI databases such as                         we allocated the POIs proportionally into 40% for training, 10%
OpenStreetMap [12], the format is normalised. Latitude and                            for development, and 50% for testing purposes8 .
7 Even if we used Food as seed, we find also the rest of the root categories in the   8 We kept a relatively large percentage of the data for testing purposes, in order
dataset. The reason is that POIs can be categorised using multiple labels, although   to have a large enough sample of POIs belonging to long tail categories, where
at least one of the labels must have as seed Food.                                    cultural differences may be more obvious.
   5.1.2 Model. We use a neural architecture with one hidden            REFERENCES
dense layer, followed by a dropout layer, and the output layer.          [1] Felix Biessmann, David Salinas, Sebastian Schelter, Philipp Schmidt, and
We use the Rectified Linear Unit as the activation function of               Dustin Lange. 2018. "Deep" Learning for Missing Value Imputation in
                                                                             Tables with Non-Numerical Data. In Proceedings of the 27th ACM International
the hidden layer. The dropout rate is set to 0.3. The loss we use            Conference on Information and Knowledge Management (CIKM ’18). ACM, New
is binary crossentropy. We have set an early stopping criterion              York, NY, USA, 2017–2025. https://doi.org/10.1145/3269206.3272005
                                                                         [2] Amparo Elizabeth Cano, Andrea Varga, and Fabio Ciravegna. 2011. Volatile
for the training based on a pre-defined threshold that takes into            Classification of Point of Interests based on Social Activity Streams. In Proceed-
account the delta of the loss between two consecutive epochs. For            ings of the 4th International Workshop on Social Data on the Web, SDoW@ISWC
all sequential features we applied a length of 50. For the LSTM              2011, Bonn, Germany, October 23, 2011 (CEUR Workshop Proceedings), Alexan-
                                                                             dre Passant, Sergio Fernández, John G. Breslin, and Uldis Bojars (Eds.), Vol. 830.
layer we set the dimensions of the embedding layer vector space              CEUR-WS.org. http://ceur-ws.org/Vol-830/sdow2011_paper_7.pdf
to 128 and the number of the LSTM hidden units to 128. The               [3] Exxact. 2017. Discover the Difference Between Deep Learning Training and
LSTM has a recurrent dropout rate of 0.3. Experiments were                   Inference.
                                                                             https://blog.exxactcorp.com/discover-difference-deep-learning-training-inference/. Accessed: 2019-11-06.
run on a single GPU instance (1 GPU with 16GB VRAM, 4 CPUs,              [4] Ouadie Gharroudi. 2017. Ensemble multi-label learning in supervised and
with 256GB RAM). Training was performed with a batch size                    semi-supervised settings. Ph.D. Dissertation. Université de Lyon.
                                                                         [5] Tieke He, Hongzhi Yin, Zhenyu Chen, Xiaofang Zhou, Shazia Sadiq, and Bin
of 32. We used the Adam optimiser with the default parameters                Luo. 2016. A Spatial-Temporal Topic Model for the Semantic Annotation of
recommended in the paper [10]. All the results reported below                POIs in LBSNs. ACM Trans. Intell. Syst. Technol. 8, 1, Article 12 (July 2016),
are on the test dataset, which has not been used for training or             24 pages. https://doi.org/10.1145/2905373
                                                                         [6] Hofstede Gert J. Minkov Michael Hofstede, Geert. 2010. Cultures and organi-
validation purposes.                                                         zations: Software of the mind. McGraw-Hill, New York, 6, 18.
                                                                         [7] Hannu Jaakkola and Bernhard Thalheim. 2017. Supporting culture-aware
                                                                             information search. Frontiers in Artificial Intelligence and Applications,
5.2     Results                                                              Vol. 292. IOS Press, Netherlands, 161–181.               https://doi.org/10.3233/
Table 3 shows shows an excerpt of how POIs are categorised in                978-1-61499-720-7-161 JUFOID=56381.
                                                                         [8] Shan Jiang, Ana Alves, Filipe Rodrigues, Joseph Ferreira, and Francisco C.
different cultures. As mentioned in previous sections, we use as             Pereira. 2015. Mining point-of-interest data from social networks for urban
a proxy to represent culture, one’s country. For each category               land use classification and disaggregation. Computers, Environment and Urban
                                                                             Systems 53 (2015), 36 – 46. https://doi.org/10.1016/j.compenvurbsys.2014.12.
included in Table 3, we provide observations related to the predic-          001 Special Issue on Volunteered Geographic Information.
tion differences between different cultures. Based on the results        [9] Armand Joulin, Edouard Grave, Piotr Bojanowski, Maximilian Nickel, and
we can note the following.                                                   Tomas Mikolov. 2017. Fast Linear Model for Knowledge Graph Embeddings.
                                                                             arXiv e-prints, Article arXiv:1710.10881 (Oct 2017), arXiv:1710.10881 pages.
      • The model learned that some categories are by design (as             arXiv:stat.ML/1710.10881
                                                                        [10] Diederik Kingma and Jimmy Ba. 2015. Adam: a method for stochastic opti-
        defined by Foursquare) allowed only in specific countries            mization (2014). arXiv preprint arXiv:1412.6980 15 (2015).
        e.g. Acai House and Churrascaria only in Brazil.                [11] John Krumm and Dany Rouhana. 2013. Placer: Semantic Place Labels from
      • The predictions reflect, to the best of our knowledge, local         Diary Data. In Proceedings of the 2013 ACM International Joint Conference on
                                                                             Pervasive and Ubiquitous Computing (UbiComp ’13). ACM, New York, NY, USA,
        culture reasonably well e.g. Bistro is popular in France and         163–172. https://doi.org/10.1145/2493432.2493504
        Souvlaki Shop in Greece.                                        [12] OpenStreetMap contributors. 2017.              Planet dump retrieved from
                                                                             https://planet.osm.org . https://www.openstreetmap.org.
      • Semantically similar categories are found by the classi-        [13] Tsangaratos Paraskevas, Rozos Dimitrios, and Benardos Andreas. 2014. Use
        fier between different cultures, even when the original              of artificial neural network for spatial rainfall analysis. Journal of Earth
        category is specific to one culture only. For instance Paste-        System Science 123, 3 (01 Apr 2014), 457–465. https://doi.org/10.1007/
                                                                             s12040-014-0417-0
        laria, a typical Brazilian and Portuguese POI category, is      [14] Markus Schedl and Dominik Schnitzer. 2014. Location-Aware Music Artist
        predicted in other cultures as Bakery and/or Snack Place,            Recommendation. In Proceedings of the 20th Anniversary International Confer-
        which is correct. Similarly, Churrascaria is classified as           ence on MultiMedia Modeling - Volume 8326 (MMM 2014). Springer-Verlag
                                                                             New York, Inc., New York, NY, USA, 205–213. https://doi.org/10.1007/
        BBQ Joint in other cultures.                                         978-3-319-04117-9_19
                                                                        [15] Yan Wang, Zongxu Qin, Jun Pang, Yang Zhang, and Jin Xin. 2017. Semantic
   Latent similarities discovered by the model are questionable              Annotation for Places in LBSN Through Graph Embedding. In Proceedings
though in some cases. For instance, Souvlaki Shop, a prominent               of the 2017 ACM on Conference on Information and Knowledge Management
                                                                             (CIKM ’17). ACM, New York, NY, USA, 2343–2346. https://doi.org/10.1145/
category in the Greek culture, is correlated to the category Kebab           3132847.3133075
Shop in other cultures. In reality, in a number of aspects there are    [16] Mao Ye, Dong Shou, Wang-Chien Lee, Peifeng Yin, and Krzysztof Janowicz.
strong similarities: for instance the form of the corresponding              2011. On the Semantic Annotation of Places in Location-based Social Networks.
                                                                             In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge
sandwiches (pita-based and circular), the meat that is cooked on             Discovery and Data Mining (KDD ’11). ACM, New York, NY, USA, 520–528.
a spit, and the "fast food" type of food delivery. However, the              https://doi.org/10.1145/2020408.2020491
former is made traditionally out of pork and the latter does not        [17] Eva Zangerle, Martin Pichl, and Markus Schedl. 2018. Culture-Aware Music
                                                                             Recommendation. In Proceedings of the 26th Conference on User Modeling,
have pork, which is an important difference. In future work, we              Adaptation and Personalization (UMAP ’18). ACM, New York, NY, USA, 357–
plan to tackle this aspect further.                                          358. https://doi.org/10.1145/3209219.3209258
                                                                        [18] Junbo Zhang, Yu Zheng, Dekang Qi, Ruiyuan Li, and Xiuwen Yi. 2016. DNN-
                                                                             based Prediction Model for Spatio-temporal Data. In Proceedings of the 24th
6     CONCLUSIONS                                                            ACM SIGSPATIAL International Conference on Advances in Geographic Informa-
                                                                             tion Systems (SIGSPACIAL ’16). ACM, New York, NY, USA, Article 92, 4 pages.
We have presented a new method to predict, in a culture-specific             https://doi.org/10.1145/2996913.2997016
manner, POI categories, without requiring access to user informa-       [19] Xin Zheng, Jialong Han, and Aixin Sun. 2018. A Survey of Location Prediction
                                                                             on Twitter. IEEE Transactions on Knowledge and Data Engineering 30, 9 (Sep.
tion. To achieve that, we simulate user information, by replacing            2018), 1652–1671. https://doi.org/10.1109/TKDE.2018.2807840
culture-related training inputs in an appropriate manner, at in-        [20] Jingbo Zhou, Shan Gou, Renjun Hu, Dongxiang Zhang, Jin Xu, Xuehui Wu,
                                                                             Airong Jiang, and Hui Xiong. 2019. A Collaborative Learning Framework
ference time. For instance, the country where a POI is located               to Tag Refinement for Points of Interest. In Proceedings of the 25th ACM
can be replaced at inference time the by the nationality of the              SIGKDD International Conference on Knowledge Discovery & Data Mining
user. We have performed preliminary experiments on data of a                 (KDD ’19). ACM, New York, NY, USA, 1752–1761. https://doi.org/10.1145/
                                                                             3292500.3330698
global location-based social network, Foursquare, that give us
promising results. In future work, these results will be further
verified with user studies.
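
The inference-time substitution summarised in Section 6 can be illustrated with a minimal sketch. It is not the model used in the experiments: the record type PoiFeatures, its field names, the POI name and the toy_classifier stand-in are hypothetical, and the toy classifier only hard-codes one behaviour reported in the notes of Table 3 (Churrascaria in BR, BBQ Joint elsewhere). The only point shown is that the country input is swapped for the user's nationality before prediction, so no user data is needed at training time.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class PoiFeatures:
    """Hypothetical model input; field names are illustrative only."""
    name: str
    country: str  # culture-related input: where the POI is located


def simulate_culture(poi: PoiFeatures, user_nationality: str) -> PoiFeatures:
    """Return a copy of the inputs with the country replaced by the
    user's nationality. The original record is left untouched."""
    return replace(poi, country=user_nationality)


def toy_classifier(poi: PoiFeatures) -> List[str]:
    """Stand-in for a trained classifier (assumption, not the paper's model).
    It hard-codes one pattern from the Table 3 notes."""
    if "churrascaria" in poi.name.lower():
        return ["Churrascaria"] if poi.country == "BR" else ["BBQ Joint"]
    return []  # no prediction


def predict_for_user(model: Callable[[PoiFeatures], List[str]],
                     poi: PoiFeatures,
                     user_nationality: str) -> List[str]:
    """Predict categories as seen from the user's culture."""
    return model(simulate_culture(poi, user_nationality))


poi = PoiFeatures(name="Churrascaria Gaucha", country="BR")  # made-up POI
for nationality in ("BR", "US", "GR"):
    print(nationality, predict_for_user(toy_classifier, poi, nationality))
```

Keeping the substitution in a separate function makes it explicit that the stored POI record is never modified; only the copy passed to the model carries the simulated culture.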
           Table 3: Culture-specific prediction results for different POI categories. Green coloured values are
        significantly higher than in the rest of the cultures, and red significantly lower, indicating a notable
                                                culture-specific influence.

Category         Original data                                                   Culture
                                  KR                FR               US               BR                TR               GR
Acai House      1488             43                29             0                1534            0                0
Note: Except for Brazil, in the rest of the cultures the same POIs are categorised as Snack Place, Juice Bar, Dessert Shop.
According to Foursquare’s documentation, Acai House is a category only supported in Brazil.
Bistro           879               470              3310              351               1162              95              10
Note: Bistros predicted using the French culture are tagged in the original data as: Café, Bar, Gastropub, Diner. Corresponding
predictions in other cultures are: Café, Wine Bar, Bar, Gastropub.
Brasserie        18                0               91               0               3                 0                  0
Note: In other cultures Café is the main predicted category for the same POIs (or there is no prediction at all). In the silver
standard the POIs are also categorised as Bistro or Café.
Café             126665           108908          87381           63223            108837           148050            236436
Note: In the US culture many Cafés seem to be categorised as Coffee Shops instead. In the Greek culture Coffee Shop,
Breakfast Place, Dessert Shop, Bar, Tea Room POIs are categorised as Café (which is actually representative of the culture).
Coffee Shop     54291            49069              45769            68102            50513             52742            23629
Note: As explained in the previous row.
Churrascaria    557              0                 0                0                 674             0                0
Note: Churrascaria is a Portuguese/Brazilian BBQ. In other cultures the majority of the same POIs are classified as BBQ Joint
and a small percentage as Steakhouse (especially in the US).
Creperie        978              775              2968            932              1138             917             1740
Note: Creperies are obviously commonplace in the French culture. In the US and KR ones the same POIs are rather
categorised as Dessert Shop or Breakfast Shop. In the BR one in addition to Dessert Shop we find also Pastelaria9 .
Dessert Shop 15164               19422            4747              9651             15460            21941           13736
Note: In the French culture Dessert Shop POIs are rather classified as Café, Bakery, Creperie, Pastry Shop, Chocolate Shop.
In the US one we have to note the large number of POIs categorised as Ice Cream Shop, Frozen Yogurt Shop, Candy Store.
Diner             2590             2289              2388             3980           1539              1593            100
Note: Some of the POIs categorised in the US culture as Diner are mainly categorised in other cultures as Café or Breakfast
Spot, or there is no prediction (the difference is largest for Greece, where almost all of them are categorised as Café).
Friterie           656               7                4864            2                419               4                62
Note: The model has learned that Friterie is a culture-specific category (supported in FR, BE, NL in the Foursquare database).
It is interesting to note that in the US culture the same POIs are categorised as Burrito Place, Taco Place, Food Truck, in the
TR one as Kofte Place and in the GR one as Snack Place. They all seem to share a fast food aspect.
Meyhane         854              0              0                  0                0               2157            0
Note: The model has learned that Meyhane is a TR culture-specific category. In other cultures Meyhane POIs usually do not get
any prediction.
Pastelaria      340              1               0               0                 898              57               0
Note: The model has learned that Pastelaria is a BR (and Portuguese) culture-specific category. In other cultures the same POIs
are categorised as follows. FR: Creperie, Snack Place, Bakery; US: Bakery, Snack Place; KR: Bakery, Dessert Shop, Snack Place.
Pastry Shop     60               5                307              0                    0                88               0
Note: The model has learned that Pastry Shop is more frequent in FR (i.e. Patisserie). In other cultures we mainly find Dessert
Shop or Bakery.
Souvlaki Shop 94                 0                 0               0                  0                0                2434
Note: The model learned that Souvlaki Shop is specific to GR culture. In other cultures and in the silver standard, the same
POIs are categorised mainly as Fried Chicken Joint and BBQ Joint. There is a strong correlation to Steakhouse as well, or
more precisely to Kebab Shops that are also classified as Steakhouse in the silver standard.
Sports Bar       67               9               2                100              4               5               0
Note: The model learned that Sports Bar is more frequent in the US culture. In the silver standard, the same POIs are
also categorised just as Bar and/or Wings Joint. Furthermore, the model learned a strong correlation between the categories
Wings Joint and Sports Bar: the two categories are predicted together 77% of the time.
Takoyaki Place   186              130              45              34               22               44               30
Note: Takoyaki Place POIs do not get any prediction most of the time in other cultures except for KR. 86% of the predicted
POIs are correct according to the silver standard.
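
Co-occurrence figures such as the one quoted for Sports Bar and Wings Joint can be computed from multi-label predictions along the following lines. This is only a sketch under stated assumptions: the exact definition behind the reported 77% is not given in the text, so the rate below is assumed to be the share of POIs predicted as the first category that are also predicted as the second; the function name co_prediction_rate and the toy predictions are made up for illustration and are not taken from the experiments.

```python
from typing import Iterable, Set


def co_prediction_rate(predictions: Iterable[Set[str]], a: str, b: str) -> float:
    """Share of POIs predicted as category `a` that are also predicted as `b`.

    `predictions` holds one set of predicted categories per POI
    (multi-label output). Returns 0.0 if `a` is never predicted.
    """
    with_a = [labels for labels in predictions if a in labels]
    if not with_a:
        return 0.0
    both = sum(1 for labels in with_a if b in labels)
    return both / len(with_a)


# Made-up predictions for five POIs (illustrative only, not real results).
toy_predictions = [
    {"Sports Bar", "Wings Joint"},
    {"Sports Bar", "Wings Joint"},
    {"Sports Bar", "Bar"},
    {"Sports Bar", "Wings Joint"},
    {"Bar"},
]

rate = co_prediction_rate(toy_predictions, "Sports Bar", "Wings Joint")
print(f"Sports Bar POIs also predicted as Wings Joint: {rate:.0%}")
```

Note that this conditional rate is asymmetric: swapping the two categories generally gives a different value, so the direction should be stated whenever such a figure is reported.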