<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Culture-aware Point-of-Interest Category Completion in a Global Location-Based Social Network Database without Access to User Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nikolaos Lagos</string-name>
          <email>nikolaos.lagos@naverlabs.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ioan Calapodescu</string-name>
          <email>ioan.calapodescu@naverlabs.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Naver Labs Europe</institution>
          ,
          <addr-line>Meylan</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Point of Interest (POI) categories can facilitate a number of services, such as location-based search and place recommendation. However, such information can be incomplete and/or incorrect, especially in crowdsourcing environments. In the literature, automatic category imputation has been suggested to tackle this problem, showing that contextual information is vital for increasing the quality of such predictions. To this end, users' check-in data, and most particularly location and time of visit, is often used as the notion of context. In this work, we propose a method that considers culture as a contextual parameter. Contrary to existing methods, our approach does not require access to user data. We illustrate the feasibility of our method by performing experiments on data from Foursquare, a global location-based social network.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        Point of Interest (POI) categories can facilitate a number of
services, such as location-based search and place recommendation.
However, as discussed in recent work, categories, especially in
crowdsourcing environments, can be incomplete and/or
incorrect. Automatic category prediction has therefore been proposed
to remedy this problem and impute missing categories [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].
      </p>
      <p>
        Recent advances in POI categorisation have shown that
contextual information is vital for increasing the quality of automatic
POI category prediction [
        <xref ref-type="bibr" rid="ref11 ref15 ref16 ref5">5, 11, 15, 16</xref>
        ]. To this end, users’
check-in information, most particularly the location and time of visit,
is often used to define the context. These approaches have two
important shortcomings:
(1) Obtaining relevant data presupposes that users will allow
their check-in data to be shared with the corresponding
service. However, recent initiatives and laws, such as the
EU General Data Protection Regulation (GDPR), stipulate
that users should have more control over their personal
data, and potentially disallow the use (and storage) of data
such as check-in information by third parties.
(2) Context is defined in terms of location proximity in
existing systems. However, recent advances in recommender
systems and information search have shown that
integrating information about cultural backgrounds, especially
in a global setting, can help in developing high-quality
systems [
        <xref ref-type="bibr" rid="ref17 ref7">7, 17</xref>
        ].
      </p>
      <sec id="sec-1-1">
        <title>The main contributions of this work are as follows.</title>
        <p>• To our knowledge this is the first study of a multi-lingual,
global location-based social network database related to
</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Industrial context</title>
      <p>Our company, Naver, provides, among other things,
location-based services. Good-quality POI data is thus of major importance.
In this context, at Naver Labs Europe, we have been exploring
automatic multi-lingual methods for completing and correcting
the POI semantic tags found in the database of Foursquare, a global
crowdsourced location-based social network<sup>1</sup>. The scope of our work
is to eventually support:
• A user who is not familiar with the local culture in discovering
appropriate POIs in the vicinity of her/his position. If POIs
are not categorised appropriately, the user cannot easily
search for them and they will not be included in the search
results. In addition, proper POI categorisation could also
help in recommendation.</p>
      <p>Most of the work on Point-of-Interest category prediction has
taken place in the context of Location-Based Social Networks.</p>
      <p>
        Ye et al. [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] were the first to show that taking into account
the geographical, local context of users can improve POI
recommendation and categorisation. From then on, other related work
has been systematically using such contextual data, originating
from check-ins and/or related mobile sensors [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
      <p>
        Krumm [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] showed that other personal information (e.g.
gender and age range) could also help to improve the results further.
      </p>
      <p>The types of user data explored in the state of the art include
the number, frequency, time and duration of user check-ins, and
sometimes demographic information (e.g. age range, gender).
This data is usually combined with the location of the POI and
its proximity to other POIs.</p>
      <p><sup>1</sup>We obtained access to this data thanks to an agreement between Naver Labs and
Foursquare.</p>
      <p>
        He et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and Zhou et al. [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] have in addition used POI
information, including user-defined semantic tags and name token
embeddings pre-computed on a domain-specific corpus.
      </p>
      <p>
        Jiang et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] apply machine classification techniques to the
problem of fusing different POI databases under a common
classification hierarchy, the North American Industry Classification
System (NAICS). Their study involves only a few American towns,
and the only input features they use are the categories and their
predefined relations, as already manually attributed to the POIs
in the original data sources.
      </p>
      <p>
        A number of works address location
prediction based on social streams, e.g. Twitter, where the main research
interest is the use of noisy and short text for classification. For
instance, Cano et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] use tweets to infer volatile POI classes
according to specific temporary events happening at a specific
location. Interested readers may refer to Zheng et al. [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] for a
comprehensive survey of the domain. Despite superficial
commonalities, this subject is different from the one studied here.
      </p>
      <p>To summarise, as mentioned in the Introduction, all the
aforementioned approaches assume access to user data at training
time and have not dealt with the notion of the user’s culture.</p>
    </sec>
    <sec id="sec-3">
      <title>Culture-aware Recommendation</title>
      <p>
        Recent work has highlighted the importance of modelling users’
culture in recommendation and information search [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Notably,
the cultural background of a user was found to play a vital role
in how recommended items are judged [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>
        To our knowledge, in the area of automatic recommender
systems, Zangerle et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] use culture as a computation
parameter. The authors base their proxy for culture on
the nationality of the users and use Hofstede et al.’s [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] grouping
of nationalities into culturally similar clusters as a guide for their
computation model<sup>2</sup>. However, in this case as well, the authors
assume that they have access to user data.
      </p>
    </sec>
    <sec id="sec-4">
      <title>PROBLEM DEFINITION</title>
      <p>Our goal is to predict POI categories that are appropriate to a
specific culture. For instance, a typical place found in the database
of Foursquare is "La Table du Ramen"<sup>3</sup>, which is located in Paris,
France. The category found in Foursquare’s database is Japanese
Restaurant, which may be sufficient for the local culture. However,
a Japanese user would expect the POI to be categorised at a much
more fine-grained level, e.g. as Ramen Restaurant or as Noodle
House. The objective is to automatically predict such categories
according to one’s culture.</p>
      <p>
        We consider that POI category prediction in our context is
equivalent to the problem of completing a specific attribute of
the dataset, the one that represents the categories of POIs, based
on data from the remaining attributes. Formally, a POI $p$ should
have an attribute that includes an ideal, complete and correct
category labelset, i.e. a set of relevant labels $L \subset \Lambda$, where
$\Lambda = \{l_1, \ldots, l_m\}$ is the set of all possible labels, while the rest of the
attributes are represented by the set $A$. However, in practice,
the set of labels attributed to $p$, which we hereafter call the
observed labelset $L_o \subset \Lambda$, can be incomplete and/or incorrect.
      </p>
      <p>
        <sup>2</sup>Hofstede et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] note that culture is always a collective phenomenon, as it is, at
least partly, shared with people who live or lived within the same social environment,
which is where it was learned. In that sense, context may encompass a lot of different
aspects, including the notions of social status, education, and language. Hofstede et
al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] mention that "one’s country" is an important parameter that defines culture
in this sense.
      </p>
      <p>
        <sup>3</sup>English translation: The Table of Ramen.
      </p>
      <p>In this work, we assume that there are several "culture-specific"
labelsets $L_j$, such that $\bigcup_{j=1}^{n} L_j \subseteq L$, where $n$ is the maximum
number of cultures that are represented by the labels in $\Lambda$ and
$n &lt; \infty$. For instance, revisiting our example, if we assume that
$L = \{\text{Japanese Restaurant}, \text{Noodle House}, \text{Ramen Restaurant}\}$,
and we have at least two cultures, French and Japanese, with
corresponding labelsets $L_1$ and $L_2$, then it could be that
$L_1 = \{\text{Japanese Restaurant}, \text{Noodle House}\}$ and
$L_2 = \{\text{Noodle House}, \text{Ramen Restaurant}\}$, while
$L_o = \{\text{Japanese Restaurant}\}$.</p>
      <p>
        We denote by $y = (y_1, \ldots, y_m)$ an $m$-dimensional binary vector
where $y_i \in \{0, 1\}$ such that $y_i = 1$ if and only if $l_i \in L$. We
define a variant of it for culture $c$ as $y_c = (y_{c1}, \ldots, y_{cm})$ where
$y_{ci} \in \{0, 1\}$ such that $y_{ci} = 1$ if and only if $l_i \in L_c$, and each $y_i - y_{ci}$
is non-negative. Accordingly, the $m$-dimensional binary vector
$y_o = (y_{o1}, \ldots, y_{om})$ with $y_{oi} \in \{0, 1\}$ has $y_{oi} = 1$ if and only if $l_i \in L_o$.
      </p>
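      <p>As an illustration only, the label vectors of the running example can be written out explicitly. The sketch below assumes a label universe restricted to the three categories of the example; the helper name to_binary_vector is hypothetical and not part of the method described here.</p>
      <preformat>
```python
# Hypothetical label universe (Lambda), restricted to the running example.
LABELS = ["Japanese Restaurant", "Noodle House", "Ramen Restaurant"]


def to_binary_vector(labelset, labels=LABELS):
    """Return the m-dimensional 0/1 vector whose i-th entry is 1 iff labels[i] is in labelset."""
    return [1 if label in labelset else 0 for label in labels]


L_full = {"Japanese Restaurant", "Noodle House", "Ramen Restaurant"}  # ideal labelset L
L_fr = {"Japanese Restaurant", "Noodle House"}    # French labelset L_1
L_jp = {"Noodle House", "Ramen Restaurant"}       # Japanese labelset L_2
L_obs = {"Japanese Restaurant"}                   # observed labelset L_o

y = to_binary_vector(L_full)
y_fr = to_binary_vector(L_fr)
y_jp = to_binary_vector(L_jp)
y_obs = to_binary_vector(L_obs)

# Every culture-specific vector is dominated by y: y_i - y_ci is non-negative.
dominated = all(yi - yci >= 0 for yc in (y_fr, y_jp) for yi, yci in zip(y, yc))
print(y, y_fr, y_jp, y_obs, dominated)
```
      </preformat>
      <p>In this toy setting the observed vector marks only one of the three relevant labels, which is exactly the incompleteness that the method aims to repair.</p>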
      <p>We assume that $C \cup N = A$, where $C$ stands for the set of
culture-related attributes and $N$ for the set of culture-independent
ones. Considering again our example, $C$ could be instantiated
by the country where the restaurant is located, and $N$ by the
name "La Table du Ramen".</p>
      <p>We denote by $x$, $x_C$, $x_N$ the vectors that represent
$A$, $C$, and $N$ respectively, such that the observed $p$ in the dataset is
defined as</p>
      <p>$$p = \{x, y_o\} = \{x_C, x_N, y_o\}$$</p>
      <p>Our objective is to make culture-specific predictions such that,
for culture $c$,</p>
      <p>$$p = \{x, y_c\}$$</p>
      <p>We thus formulate our goal as a multi-label classification problem
where we want to find a classifier $b_c : X \to Y$, where $X$ is the
input space (all possible attribute vectors) and $Y$ the output space
(all possible labelset vectors), such that $y_c = b_c(x)$.</p>
    </sec>
    <sec id="sec-5">
      <title>CATEGORY PREDICTION METHOD</title>
      <p>The category of a POI, especially in a location-based social
network, is related to the cultural profile of the users that visit it.
This has been shown via the inclusion of user profiles in related
work (cf. Section 2). However, instead of accessing user
information to discover such profiles, we use the observation that the
majority of POIs in location-based social network databases are
categorised in a manner that reflects local culture. For instance,
as shown in Fig. 1, we find in Foursquare that restaurants selling
noodle dishes<sup>4</sup> are usually categorised as Asian Restaurant in
France, while Ramen Restaurant is by a large margin the most
popular category in Japan.</p>
      <p><sup>4</sup>The token "noodle" is explicitly mentioned in the POI name.</p>
      <p>If culture-related attributes are used as inputs at training time,
then at inference time we can replace the corresponding inputs
according to the target cultural profile. For instance, if at training
time we use the country in which the POI is located as an input
parameter to our model, at inference time we can replace the
value of this parameter with the target country, simulating what
would happen if the same POI were located in the target country
instead. We can thus generate culturally appropriate predictions
and complete the database offline. Revisiting the above example,
we can assume that some of the Ramen restaurants located in
Japan would be categorised in a different manner, e.g. as Noodle
Houses, if the value of the country was changed to France<sup>5</sup>. An
overview of the proposed method is shown in Fig. 2.</p>
      <p>Based on the discussion above, we reformulate our problem
and look for a classifier $b_c$ such that $y_c = b_c(x_c)$, where $x_c$ is a
culture-specific variant of the input.</p>
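      <p>A minimal sketch of how a culture-specific variant of the input might be built is given below. It assumes that the only culture-related attribute is a country code stored under a hypothetical key "country"; the function name culture_variant and the record layout are illustrative assumptions, not the implementation used in the experiments.</p>
      <preformat>
```python
def culture_variant(poi, target_country):
    """Return a copy of the POI attributes with the culture-related input replaced."""
    variant = dict(poi)  # copy, so the original record stays untouched
    variant["country"] = target_country
    return variant


# Hypothetical record: the name is culture-independent, the country is culture-related.
poi = {"name": "La Table du Ramen", "country": "FR"}

variants = {c: culture_variant(poi, c) for c in ["FR", "JP"]}
for country, x_c in variants.items():
    # In a full pipeline, x_c would be vectorised and passed to the trained model here.
    print(country, x_c)
```
      </preformat>
      <p>Only the culture-related part of the input changes between variants; the culture-independent attributes are carried over as they are.</p>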
      <p>
        To find $b_c$, we follow a standard approach and transform our
problem into finding a real-valued vector function $f : X \to [0, 1]^m$
that indicates the relevance of each label $l_i$ in
relation to the input, i.e. $f(x_c) = (f(x_c, l_1), f(x_c, l_2), \ldots, f(x_c, l_m))$,
where $f(x_c, l_i)$ is the confidence of $l_i \in \Lambda$ being a correct label
for $x_c$ and $m$ is the number of labels. This corresponds to
an estimation of $p(y_{ci} \mid x_c)$ with $y_{ci} \in \{0, 1\}$. Note that, ideally, observed
outputs should be completely specified vectors; in our
context, however, the training instances are only partially complete, i.e. of
the form $(x_{ci}, y_{oi})$. We follow the Binary Relevance method and thus
learn $m$ binary models, each specialised in predicting whether
one label is correct or not, independently of the other labels.
For an unseen $x_c$, the predicted labels are then the union of the
predictions of all the binary models.
      </p>
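      <p>The Binary Relevance decomposition can be sketched as follows. The per-label models here are toy stand-ins (plain functions returning a yes/no decision), introduced purely for illustration; in the method described above each of them would be a trained binary classifier.</p>
      <preformat>
```python
def binary_relevance_predict(x, binary_models):
    """Union of the positive decisions of m independent per-label models.

    binary_models maps each label to a function x -> bool.
    """
    return {label for label, is_relevant in binary_models.items() if is_relevant(x)}


# Toy stand-ins for the trained binary models (hypothetical decision rules).
toy_models = {
    "Ramen Restaurant": lambda x: x["country"] == "JP",
    "Noodle House": lambda x: True,
    "Bakery": lambda x: False,
}

pred_jp = binary_relevance_predict({"name": "La Table du Ramen", "country": "JP"}, toy_models)
pred_fr = binary_relevance_predict({"name": "La Table du Ramen", "country": "FR"}, toy_models)
print(sorted(pred_jp), sorted(pred_fr))
```
      </preformat>
      <p>Because each label is decided independently, the predicted labelset can contain any number of labels, including none.</p>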
      <p>To learn the binary models we perform the following steps.
• Attribute selection: We use the name and spatial
geo-coordinates of the POIs.
• Vectorisation: This step transforms the attributes
into a form that can be processed by the classifier, as explained
in Section 4.1.
• Training: A model that learns to predict the probability of
$l_i$ being a correct label given $x$ is computed in this step.
As explained in the previous paragraph, our problem is
cast as a supervised machine learning problem. Details
are provided in Section 4.2.
• Inference: Whether $l_i$ is a correct label for culture-specific
variants of $x$ is computed in this step. Details are given in
Section 4.3.</p>
    </sec>
    <sec id="sec-6">
      <title>Vectorisation</title>
      <p>
        Categorical variables. We represent them with one-hot
encoded embeddings, as usually reported in the literature.
Sequential variables. Biessmann et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] report that
character-based representations are more robust in a setting similar to ours
(i.e. sparse data and multiple languages). In addition, Joulin et al.
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and Biessmann et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] mention that character n-grams can
perform better than simple, unigram, character-based LSTMs.
After experimentation, we have adopted character-trigram-based
LSTMs for POI names.
      </p>
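      <p>To make the character-trigram representation concrete, the sketch below splits a POI name into overlapping trigrams and maps them to integer ids that an embedding layer could consume. Lower-casing, the padding and unknown tokens, and interpreting the sequence length of 50 as a number of trigrams are all assumptions made for this example.</p>
      <preformat>
```python
def char_ngrams(name, n=3, max_len=50):
    """Split a POI name into overlapping character n-grams, keeping at most max_len of them."""
    text = name.lower()
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    return grams[:max_len]


def build_vocab(names, n=3):
    """Assign an integer id to every n-gram seen in the given names."""
    vocab = {"[PAD]": 0, "[UNK]": 1}  # hypothetical special tokens
    for name in names:
        for gram in char_ngrams(name, n):
            vocab.setdefault(gram, len(vocab))
    return vocab


def encode(name, vocab, n=3, max_len=50):
    """Map a name to a fixed-length list of n-gram ids, padded with the [PAD] id."""
    ids = [vocab.get(g, vocab["[UNK]"]) for g in char_ngrams(name, n, max_len)]
    return ids + [vocab["[PAD]"]] * (max_len - len(ids))


vocab = build_vocab(["Ramen"])
encoded = encode("Ramen", vocab)
print(char_ngrams("Ramen"), encoded[:5])
```
      </preformat>
      <p>Trigrams that were never seen when the vocabulary was built fall back to the unknown id, which is one simple way to handle the sparse, multi-lingual names mentioned above.</p>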
      <p>
        Spatial variables. Geographical coordinates are the most
important spatial attributes that characterise a POI. For instance,
latitude and longitude are two of the most frequently used
geographical coordinates. The predominant way of modelling
coordinates is to discretise the input space [
        <xref ref-type="bibr" rid="ref13 ref18">13, 18</xref>
        ]. This could take
the form of a grid separated into a fixed number of cells. In this
case, the form and granularity of the cells usually have to be selected
appropriately. We use countries as a proxy for different cultures.
We represent countries as categorical variables, as this
granularity can be related to different cultures, as explained in Section 2.2.
However, other representations, such as regions, could also be
suitable.
      </p>
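      <p>A one-hot encoding of the country attribute could look like the sketch below. The list of country codes is a hypothetical vocabulary chosen for the example, and mapping unknown countries to an all-zero vector is an assumption rather than a documented design choice.</p>
      <preformat>
```python
# Hypothetical country vocabulary, in a fixed order.
COUNTRIES = ["BR", "FR", "GR", "KR", "TR", "US"]


def one_hot(value, categories=COUNTRIES):
    """Return a 0/1 vector with a single 1 at the position of value (all zeros if unknown)."""
    return [1 if value == category else 0 for category in categories]


x_fr = one_hot("FR")
x_unknown = one_hot("JP")  # not in the toy vocabulary
print(x_fr, x_unknown)
```
      </preformat>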
      <p>Other cultural variables. This is an optional category,
which can include other parameters related to culture. For
instance, to determine socio-cultural context, opening hours and
the price range of the corresponding services may also be
important. Both of these variables could be discretised and treated as
categorical variables.</p>
    </sec>
    <sec id="sec-7">
      <title>Training</title>
      <p>Once we vectorise our attributes, as explained in the previous
section, we use a concatenation layer to combine them.</p>
      <p>If $a$ is a POI attribute such that $a \in A$ and $\phi_a(x_a) \in \mathbb{R}^{D_a}$ is the
attribute-specific vectorisation function, where $D_a$ denotes the
dimensionality associated with the attribute $a$, then the final input
vector is a concatenation of all vectorised individual attributes:</p>
      <p>$$\tilde{x} = [\phi_1(a_1), \phi_2(a_2), \ldots, \phi_n(a_n)]$$</p>
      <p>where $n$ denotes the number of attributes. We feed this to a dense
layer</p>
      <p>$$h = \mathrm{relu}(W^h \tilde{x} + b^h)$$</p>
      <p>After applying a dropout layer, we then calculate</p>
      <p>$$p(y \mid h, \theta) = \mathrm{sigmoid}(W h + b)$$</p>
      <p>where $\theta = (W, b, W^h, b^h)$ are the learned parameters of the model and
$\mathrm{sigmoid}$ denotes the logistic function $f(s_i) = \frac{1}{1 + e^{-s_i}}$, applied element-wise. The
parameters $\theta$ are learned by minimising the binary cross-entropy
loss function.</p>
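      <p>The forward pass and loss above can be traced with a small pure-Python sketch. Dropout is omitted (as it would be at inference time), the weights are tiny hand-picked values rather than learned parameters, and averaging the loss over labels is an assumption of this example.</p>
      <preformat>
```python
import math


def relu(v):
    return [max(0.0, a) for a in v]


def sigmoid(s):
    """Logistic function 1 / (1 + exp(-s))."""
    return 1.0 / (1.0 + math.exp(-s))


def affine(W, v, b):
    """Compute W v + b for a weight matrix given as a list of rows."""
    return [sum(w * a for w, a in zip(row, v)) + b_i for row, b_i in zip(W, b)]


def forward(vectorised_attributes, W_h, b_h, W, b):
    """Concatenate attribute vectors, apply the dense ReLU layer, then the sigmoid output."""
    x_tilde = [a for vec in vectorised_attributes for a in vec]
    h = relu(affine(W_h, x_tilde, b_h))
    return [sigmoid(s) for s in affine(W, h, b)]


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy over the labels."""
    terms = [t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
             for t, p in zip(y_true, y_prob)]
    return -sum(terms) / len(terms)


# Two one-dimensional "attributes", an identity hidden layer and a zero output layer.
probs = forward([[1.0], [-2.0]], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], [[0.0, 0.0]], [0.0])
loss = binary_cross_entropy([1], probs)
print(probs, loss)
```
      </preformat>
      <p>With a zero output layer every label receives probability 0.5, so the loss equals log 2, which is a convenient sanity check for an untrained model.</p>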
    </sec>
    <sec id="sec-8">
      <title>Inference</title>
      <p>
        At this step, the learned multi-label model is given culture-specific
inputs and generates a probability score for each label. To
obtain the corresponding set of accepted labels, a constant
can be applied as a threshold (usually 0.5) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p><sup>5</sup>In our experiments, 7% of them were categorised as "Noodle House" (cf. Section 5).</p>
      <p><sup>6</sup>https://developer.foursquare.com/docs/resources/categories as of 3rd October 2018
      </p>
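      <p>Thresholding the per-label scores can be sketched as below. The scores are invented for illustration and do not come from the experiments; only the default threshold of 0.5 is taken from the text above.</p>
      <preformat>
```python
def accept_labels(scores, labels, threshold=0.5):
    """Keep the labels whose probability score reaches the threshold."""
    return [label for label, score in zip(labels, scores) if score >= threshold]


LABELS = ["Ramen Restaurant", "Noodle House", "Bakery"]

# Hypothetical scores for the same POI under two culture-specific inputs.
scores_jp = [0.91, 0.62, 0.08]
scores_fr = [0.12, 0.74, 0.05]

accepted_jp = accept_labels(scores_jp, LABELS)
accepted_fr = accept_labels(scores_fr, LABELS)
print(accepted_jp, accepted_fr)
```
      </preformat>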
      <sec id="sec-8-1">
        <title>Root Category</title>
      </sec>
      <sec id="sec-8-2">
        <title>Food</title>
        <p>Foursquare’s root categories are Food, Shop &amp; Service,
Outdoors &amp; Recreation, Professional &amp; Other Places,
Arts &amp; Entertainment, Travel &amp; Transport, College &amp; University,
Nightlife Spot, Event, and Residence.</p>
        <p>The most developed root category is Food,
with 336 categories distributed over 5 levels. The least developed
one is Residence, having only 4 subcategories over 2 levels<sup>7</sup>.</p>
        <p>Label distribution. We have extracted a sample dataset from the
existing Foursquare database for our experiments. We used the
most well developed root category, Food, as seed, and took only
POIs with a high reality index. The distribution of the categories
is similar to the one found in the original hierarchy. To better
understand the dataset we provide the overall distribution of
POIs and the one of the top categories in Table 2. The dataset is
skewed in terms of the POI instances attributed to each category,
with the first 10 top categories having more than 40% of the POIs
attributed to them.</p>
        <p>
          Point-of-Interest attributes. POI attributes include the name
and the latitude and longitude, transformed into the
representation discussed in Section 4.1. It is important to note that,
contrary to completely freely crowdsourced POI databases such as
OpenStreetMap [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], the format is normalised. Latitude and
        </p>
        <p><sup>7</sup>Even if we used Food as seed, we also find the rest of the root categories in the
dataset. The reason is that POIs can be categorised using multiple labels, although
at least one of the labels must have Food as its seed.
        </p>
        <p>Dataset creation. To generate training, development, and test
data, we used approximate stratified sampling. The goal was to
maintain the distribution of positive and negative examples of
each label by considering each label independently. Consequently,
we allocated the POIs proportionally: 40% for training, 10%
for development, and 50% for testing<sup>8</sup>.</p>
        <p><sup>8</sup>We kept a relatively large percentage of the data for testing purposes, in order
to have a large enough sample of POIs belonging to long-tail categories, where
cultural differences may be more obvious.</p>
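        <p>The proportional 40/10/50 allocation can be illustrated with the simplified sketch below. It assumes that every POI carries a single label, which is a deliberate simplification: the sampling described above is approximate and treats each label of a multi-label POI independently. The function name and the fixed random seed are hypothetical.</p>
        <preformat>
```python
import random
from collections import defaultdict


def stratified_split(pois, fractions=(0.4, 0.1, 0.5), seed=0):
    """Split (poi_id, label) pairs into train/dev/test, label by label."""
    by_label = defaultdict(list)
    for poi_id, label in pois:
        by_label[label].append(poi_id)

    rng = random.Random(seed)
    train, dev, test = [], [], []
    for label in sorted(by_label):
        ids = by_label[label]
        rng.shuffle(ids)
        n_train = round(fractions[0] * len(ids))
        n_dev = round(fractions[1] * len(ids))
        train += ids[:n_train]
        dev += ids[n_train:n_train + n_dev]
        test += ids[n_train + n_dev:]  # the remainder goes to the test set
    return train, dev, test


# Toy data: 10 POIs labelled Bistro and 20 labelled Cafe.
pois = [(i, "Bistro") for i in range(10)] + [(i, "Cafe") for i in range(10, 30)]
train, dev, test = stratified_split(pois)
print(len(train), len(dev), len(test))
```
        </preformat>
        <p>Splitting within each label keeps rare categories represented in every partition, which is the motivation given above for the large test share.</p>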
        <p>
          Model. We use a neural architecture with one hidden
dense layer, followed by a dropout layer and the output layer.
We use the Rectified Linear Unit (ReLU) as the activation function of
the hidden layer. The dropout rate is set to 0.3. The loss we use
is binary cross-entropy. We have set an early stopping criterion
for training based on a pre-defined threshold on
the change in loss between two consecutive epochs. For
all sequential features we applied a length of 50. For the LSTM
layer we set the dimensions of the embedding layer vector space
to 128 and the number of LSTM hidden units to 128. The
LSTM has a recurrent dropout rate of 0.3. Experiments were
run on a single GPU instance (1 GPU with 16GB VRAM, 4 CPUs,
with 256GB RAM). Training was performed with a batch size
of 32. We used the Adam optimiser with the default parameters
recommended in the original paper [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. All the results reported below
are on the test dataset, which has not been used for training or
validation purposes.
        </p>
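        <p>For reference, the reported settings can be collected in one place, together with a sketch of a loss-delta early stopping rule. The dictionary keys, the function name, and the threshold value min_delta are hypothetical; the text above does not state the threshold that was actually used.</p>
        <preformat>
```python
# Settings reported in the text; the key names are illustrative.
HYPERPARAMS = {
    "dropout_rate": 0.3,
    "sequence_length": 50,
    "embedding_dim": 128,
    "lstm_units": 128,
    "lstm_recurrent_dropout": 0.3,
    "batch_size": 32,
    "optimiser": "adam",
    "loss": "binary_crossentropy",
}


def should_stop(loss_history, min_delta=1e-4):
    """Stop when the loss decreased by less than min_delta between the last two epochs.

    min_delta is an assumed value, chosen only for this example.
    """
    if 2 > len(loss_history):
        return False
    return min_delta > loss_history[-2] - loss_history[-1]


losses = [0.70, 0.52, 0.50, 0.49999]
stop_flags = [should_stop(losses[:k]) for k in range(1, len(losses) + 1)]
print(stop_flags)
```
        </preformat>
        <p>With this rule, training also stops when the loss increases between two epochs, since the improvement is then negative.</p>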
      </sec>
    </sec>
    <sec id="sec-9">
      <title>Results</title>
      <p>• The model learned that some categories are by design (as
defined by Foursquare) allowed only in specific countries,
e.g. Acai House and Churrascaria only in Brazil.
• The predictions reflect, to the best of our knowledge, local
culture reasonably well, e.g. Bistro is popular in France and
Souvlaki Shop in Greece.
• Semantically similar categories are found by the
classifier across different cultures, even when the original
category is specific to one culture only. For instance,
Pastelaria, a typical Brazilian and Portuguese POI category, is
predicted in other cultures as Bakery and/or Snack Place,
which is correct. Similarly, Churrascaria is classified as
BBQ Joint in other cultures.</p>
      <sec id="sec-9-1">
        <p>The latent similarities discovered by the model are, however,
questionable in some cases. For instance, Souvlaki Shop, a prominent
category in Greek culture, is correlated with the category Kebab
Shop in other cultures. The two are indeed strongly similar in a
number of aspects: for instance, the form of the corresponding
sandwiches (pita-based and circular), the meat that is cooked on
a spit, and the "fast food" type of food delivery. However, the
former is traditionally made of pork while the latter does not
contain pork, which is an important difference. In future work, we
plan to tackle this aspect further.</p>
      </sec>
    </sec>
    <sec id="sec-10">
      <title>CONCLUSIONS</title>
      <p>We have presented a new method to predict POI categories in a
culture-specific manner, without requiring access to user
information. To achieve that, we simulate user information by replacing
culture-related training inputs in an appropriate manner at
inference time. For instance, the country where a POI is located
can be replaced at inference time by the nationality of the
user. We have performed preliminary experiments on data from a
global location-based social network, Foursquare, which give
promising results. In future work, these results will be further
verified with user studies.</p>
      <table-wrap>
        <table>
          <thead>
            <tr><th rowspan="2">Category</th><th rowspan="2">Original data</th><th colspan="6">Culture</th></tr>
            <tr><th>KR</th><th>FR</th><th>US</th><th>BR</th><th>TR</th><th>GR</th></tr>
          </thead>
          <tbody>
            <tr><td>Acai House</td><td>1488</td><td>43</td><td>29</td><td>0</td><td>1534</td><td>0</td><td>0</td></tr>
            <tr><td colspan="8">Note: Except for Brazil, in the rest of the cultures the same POIs are categorised as Snack Place, Juice Bar, Dessert Shop. According to Foursquare’s documentation, Acai House is a category only supported in Brazil.</td></tr>
            <tr><td>Bistro</td><td>879</td><td>470</td><td>3310</td><td>351</td><td>1162</td><td>95</td><td>10</td></tr>
            <tr><td colspan="8">Note: Bistros predicted using the French culture are tagged in the original data as Café, Bar, Gastropub, Diner. Corresponding predictions in other cultures are Café, Wine Bar, Bar, Gastropub.</td></tr>
            <tr><td>Brasserie</td><td>18</td><td>0</td><td>91</td><td>0</td><td>3</td><td>0</td><td>0</td></tr>
            <tr><td colspan="8">Note: In other cultures Café is the main predicted category for the same POIs (or there is no prediction at all). In the silver standard the POIs are also categorised as Bistro or Café.</td></tr>
            <tr><td>Café</td><td>126665</td><td>108908</td><td>87381</td><td>63223</td><td>108837</td><td>148050</td><td>236436</td></tr>
            <tr><td colspan="8">Note: In the US culture a lot of Cafés seem to be categorised as Coffee Shops instead. In the Greek culture Coffee Shop, Breakfast Place, Dessert Shop, Bar, Tea Room POIs are categorised as Café (which is actually representative of the culture).</td></tr>
            <tr><td>Coffee Shop</td><td>54291</td><td>49069</td><td>45769</td><td>68102</td><td>50513</td><td>52742</td><td>23629</td></tr>
            <tr><td colspan="8">Note: As explained in the previous row.</td></tr>
            <tr><td>Churrascaria</td><td>557</td><td>0</td><td>0</td><td>0</td><td>674</td><td>0</td><td>0</td></tr>
            <tr><td colspan="8">Note: Churrascaria is a Portuguese/Brazilian BBQ. In other cultures the majority of the same POIs are classified as BBQ Joint and a small percentage as Steakhouse (especially in the US).</td></tr>
            <tr><td>Creperie</td><td>978</td><td>775</td><td>2968</td><td>932</td><td>1138</td><td>917</td><td>1740</td></tr>
            <tr><td colspan="8">Note: Creperies are obviously commonplace in the French culture. In the US and KR ones the same POIs are rather categorised as Dessert Shop or Breakfast Shop. In the BR one, in addition to Dessert Shop, we also find Pastelaria.</td></tr>
            <tr><td>Dessert Shop</td><td>15164</td><td>19422</td><td>4747</td><td>9651</td><td>15460</td><td>21941</td><td>13736</td></tr>
            <tr><td colspan="8">Note: In the French culture Dessert Shop POIs are rather classified as Café, Bakery, Creperie, Pastry Shop, Chocolate Shop. In the US one we have to note the large number of POIs categorised as Ice Cream Shop, Frozen Yogurt Shop, Candy Store.</td></tr>
            <tr><td>Diner</td><td>2590</td><td>2289</td><td>2388</td><td>3980</td><td>1539</td><td>1593</td><td>100</td></tr>
            <tr><td colspan="8">Note: Some of the POIs categorised in the US culture as Diner are mainly categorised in other cultures as Café or Breakfast Spot, or there is no prediction (the difference is really big with Greece, where almost all of them are categorised as Café).</td></tr>
            <tr><td>Friterie</td><td>656</td><td>7</td><td>4864</td><td>2</td><td>419</td><td>4</td><td>62</td></tr>
            <tr><td colspan="8">Note: The model has learned that Friterie is a culture-specific category (supported in FR, BE, NL in the Foursquare database). It is interesting to note that in the US culture the same POIs are categorised as Burrito Place, Taco Place, Food Truck, in the TR one as Kofte Place and in the GR one as Snack Place. They all seem to share a fast food aspect.</td></tr>
            <tr><td>Meyhane</td><td>854</td><td>0</td><td>0</td><td>0</td><td>0</td><td>2157</td><td>0</td></tr>
            <tr><td colspan="8">Note: The model has learned that Meyhane is a TR culture-specific category. In other cultures Meyhane POIs usually do not get any prediction.</td></tr>
            <tr><td>Pastelaria</td><td>340</td><td>1</td><td>0</td><td>0</td><td>898</td><td>57</td><td>0</td></tr>
            <tr><td colspan="8">Note: The model has learned that Pastelaria is a BR (and Portuguese) culture-specific category. In the FR culture the same POIs are categorised as Creperie, Snack Place, Bakery; in the US one as Bakery, Snack Place; in the KR one as Bakery, Dessert Shop, Snack Place.</td></tr>
            <tr><td>Pastry Shop</td><td>60</td><td>5</td><td>307</td><td>0</td><td>0</td><td>88</td><td>0</td></tr>
            <tr><td colspan="8">Note: The model has learned that Pastry Shop is more frequent in FR (i.e. Patisserie). In other cultures we mainly find Dessert Shop or Bakery.</td></tr>
            <tr><td>Souvlaki Shop</td><td>94</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>2434</td></tr>
            <tr><td colspan="8">Note: The model learned that Souvlaki Shop is specific to GR culture. In other cultures and in the silver standard, the same POIs are categorised mainly as Fried Chicken Joint and BBQ Joint. There is a strong correlation to Steakhouse as well, or more precisely to Kebab shops that are also classified as Steakhouse in the silver standard.</td></tr>
            <tr><td>Sports Bar</td><td>67</td><td>9</td><td>2</td><td>100</td><td>4</td><td>5</td><td>0</td></tr>
            <tr><td colspan="8">Note: The model learned that Sports Bar is more frequent in the US culture. In the silver standard, the same POIs are also categorised just as Bar and/or Wing Joint. Furthermore, the model learned a strong correlation between the categories Wings Joint and Sports Bar: the two categories are predicted together 77% of the time.</td></tr>
            <tr><td>Takoyaki Place</td><td>186</td><td>130</td><td>45</td><td>34</td><td>22</td><td>44</td><td>30</td></tr>
            <tr><td colspan="8">Note: Takoyaki Place POIs do not get any prediction most of the time in other cultures, except for KR. 86% of the predicted POIs are correct according to the silver standard.</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Felix</given-names>
            <surname>Biessmann</surname>
          </string-name>
          , David Salinas,
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Schelter</surname>
          </string-name>
          , Philipp Schmidt, and
          <string-name>
            <given-names>Dustin</given-names>
            <surname>Lange</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>"Deep" Learning for Missing Value Imputation in Tables with Non-Numerical Data</article-title>
          .
          <source>In Proceedings of the 27th ACM International Conference on Information and Knowledge Management (CIKM '18)</source>
          . ACM, New York, NY, USA,
          <fpage>2017</fpage>
          -
          <lpage>2025</lpage>
          . https://doi.org/10.1145/3269206.3272005
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Amparo Elizabeth</given-names>
            <surname>Cano</surname>
          </string-name>
          , Andrea Varga, and
          <string-name>
            <given-names>Fabio</given-names>
            <surname>Ciravegna</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Volatile Classification of Point of Interests based on Social Activity Streams</article-title>
          .
          <source>In Proceedings of the 4th International Workshop on Social Data on the Web, SDoW@ISWC 2011, Bonn, Germany, October 23, 2011 (CEUR Workshop Proceedings)</source>
          , Alexandre Passant, Sergio Fernández, John G. Breslin, and Uldis Bojars (Eds.), Vol.
          <volume>830</volume>
          . CEUR-WS.org. http://ceur-ws.org/Vol-830/sdow2011_paper_7.pdf
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Exxact</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Discover the Difference Between Deep Learning Training and Inference</article-title>
          . https://blog.exxactcorp.com/discover-diference-deep-learningtraining-inference/. Accessed: 2019-11-06.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Ouadie</given-names>
            <surname>Gharroudi</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Ensemble multi-label learning in supervised and semi-supervised settings</article-title>
          .
          <source>Ph.D. Dissertation</source>
          . Université de Lyon.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Tieke</given-names>
            <surname>He</surname>
          </string-name>
          , Hongzhi Yin, Zhenyu Chen, Xiaofang Zhou, Shazia Sadiq, and
          <string-name>
            <given-names>Bin</given-names>
            <surname>Luo</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>A Spatial-Temporal Topic Model for the Semantic Annotation of POIs in LBSNs</article-title>
          .
          <source>ACM Trans. Intell. Syst. Technol.</source>
          <volume>8</volume>
          ,
          <issue>1</issue>
          , Article 12 (July
          <year>2016</year>
          ), 24 pages. https://doi.org/10.1145/2905373
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Geert</given-names>
            <surname>Hofstede</surname>
          </string-name>
          , Gert J. Hofstede, and
          <string-name>
            <given-names>Michael</given-names>
            <surname>Minkov</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Cultures and organizations: Software of the mind</article-title>
          .
          <source>McGraw-Hill</source>
          , New York,
          <volume>6</volume>
          ,
          <fpage>18</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Hannu</given-names>
            <surname>Jaakkola</surname>
          </string-name>
          and
          <string-name>
            <given-names>Bernhard</given-names>
            <surname>Thalheim</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Supporting culture-aware information search</article-title>
          .
          <source>Frontiers in Artificial Intelligence and Applications</source>
          , Vol.
          <volume>292</volume>
          . IOS Press, Netherlands,
          <fpage>161</fpage>
          -
          <lpage>181</lpage>
          . https://doi.org/10.3233/978-1-61499-720-7-161 JUFOID=56381.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Shan</given-names>
            <surname>Jiang</surname>
          </string-name>
          , Ana Alves, Filipe Rodrigues, Joseph Ferreira, and Francisco C. Pereira.
          <year>2015</year>
          .
          <article-title>Mining point-of-interest data from social networks for urban land use classification and disaggregation</article-title>
          .
          <source>Computers, Environment and Urban Systems</source>
          <volume>53</volume>
          (
          <year>2015</year>
          ),
          <fpage>36</fpage>
          -
          <lpage>46</lpage>
          . https://doi.org/10.1016/j.compenvurbsys.2014.12.001 Special Issue on Volunteered Geographic Information.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Armand</given-names>
            <surname>Joulin</surname>
          </string-name>
          , Edouard Grave, Piotr Bojanowski, Maximilian Nickel, and
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Fast Linear Model for Knowledge Graph Embeddings</article-title>
          .
          <source>arXiv e-prints</source>
          , Article arXiv:1710.10881 (Oct
          <year>2017</year>
          ). arXiv:stat.ML/1710.10881
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Diederik</given-names>
            <surname>Kingma</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jimmy</given-names>
            <surname>Ba</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Adam: A Method for Stochastic Optimization</article-title>
          .
          <source>arXiv preprint arXiv:1412.6980</source>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>John</given-names>
            <surname>Krumm</surname>
          </string-name>
          and
          <string-name>
            <given-names>Dany</given-names>
            <surname>Rouhana</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Placer: Semantic Place Labels from Diary Data</article-title>
          .
          <source>In Proceedings of the 2013 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp '13)</source>
          . ACM, New York, NY, USA,
          <fpage>163</fpage>
          -
          <lpage>172</lpage>
          . https://doi.org/10.1145/2493432.2493504
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>OpenStreetMap contributors</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Planet dump retrieved from https://planet.osm.org</article-title>
          . https://www.openstreetmap.org.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Paraskevas</given-names>
            <surname>Tsangaratos</surname>
          </string-name>
          , Dimitrios Rozos, and
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Benardos</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Use of artificial neural network for spatial rainfall analysis</article-title>
          .
          <source>Journal of Earth System Science</source>
          <volume>123</volume>
          ,
          <issue>3</issue>
          (01 Apr
          <year>2014</year>
          ),
          <fpage>457</fpage>
          -
          <lpage>465</lpage>
          . https://doi.org/10.1007/s12040-014-0417-0
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Markus</given-names>
            <surname>Schedl</surname>
          </string-name>
          and
          <string-name>
            <given-names>Dominik</given-names>
            <surname>Schnitzer</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Location-Aware Music Artist Recommendation</article-title>
          .
          <source>In Proceedings of the 20th Anniversary International Conference on MultiMedia Modeling - Volume 8326 (MMM 2014)</source>
          . Springer-Verlag New York, Inc., New York, NY, USA,
          <fpage>205</fpage>
          -
          <lpage>213</lpage>
          . https://doi.org/10.1007/978-3-319-04117-9_19
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Yan</given-names>
            <surname>Wang</surname>
          </string-name>
          , Zongxu Qin, Jun Pang, Yang Zhang, and
          <string-name>
            <given-names>Jin</given-names>
            <surname>Xin</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Semantic Annotation for Places in LBSN Through Graph Embedding</article-title>
          .
          <source>In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (CIKM '17)</source>
          . ACM, New York, NY, USA,
          <fpage>2343</fpage>
          -
          <lpage>2346</lpage>
          . https://doi.org/10.1145/3132847.3133075
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Mao</given-names>
            <surname>Ye</surname>
          </string-name>
          , Dong Shou,
          <string-name>
            <given-names>Wang-Chien</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Peifeng</given-names>
            <surname>Yin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Krzysztof</given-names>
            <surname>Janowicz</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>On the Semantic Annotation of Places in Location-based Social Networks</article-title>
          .
          <source>In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '11)</source>
          . ACM, New York, NY, USA,
          <fpage>520</fpage>
          -
          <lpage>528</lpage>
          . https://doi.org/10.1145/2020408.2020491
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Eva</given-names>
            <surname>Zangerle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Martin</given-names>
            <surname>Pichl</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Markus</given-names>
            <surname>Schedl</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Culture-Aware Music Recommendation</article-title>
          .
          <source>In Proceedings of the 26th Conference on User Modeling, Adaptation and Personalization (UMAP '18)</source>
          . ACM, New York, NY, USA,
          <fpage>357</fpage>
          -
          <lpage>358</lpage>
          . https://doi.org/10.1145/3209219.3209258
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Junbo</given-names>
            <surname>Zhang</surname>
          </string-name>
          , Yu Zheng, Dekang Qi,
          <string-name>
            <given-names>Ruiyuan</given-names>
            <surname>Li</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Xiuwen</given-names>
            <surname>Yi</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>DNN-based Prediction Model for Spatio-temporal Data</article-title>
          .
          <source>In Proceedings of the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (SIGSPATIAL '16)</source>
          . ACM, New York, NY, USA, Article 92, 4 pages. https://doi.org/10.1145/2996913.2997016
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Xin</given-names>
            <surname>Zheng</surname>
          </string-name>
          , Jialong Han, and
          <string-name>
            <given-names>Aixin</given-names>
            <surname>Sun</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>A Survey of Location Prediction on Twitter</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>30</volume>
          ,
          <issue>9</issue>
          (Sep.
          <year>2018</year>
          ),
          <fpage>1652</fpage>
          -
          <lpage>1671</lpage>
          . https://doi.org/10.1109/TKDE.2018.2807840
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Jingbo</given-names>
            <surname>Zhou</surname>
          </string-name>
          , Shan Gou, Renjun Hu, Dongxiang Zhang, Jin Xu, Xuehui Wu, Airong Jiang, and
          <string-name>
            <given-names>Hui</given-names>
            <surname>Xiong</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>A Collaborative Learning Framework to Tag Refinement for Points of Interest</article-title>
          .
          <source>In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining (KDD '19)</source>
          . ACM, New York, NY, USA,
          <fpage>1752</fpage>
          -
          <lpage>1761</lpage>
          . https://doi.org/10.1145/3292500.3330698
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>