<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Matrix Factorization for Near Real-time Geolocation Prediction in Twitter Stream</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nghia Duong-Trung</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nicolas Schilling</string-name>
          <email>schilling@ismll.uni-hildesheim.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lucas Rego Drumond</string-name>
          <email>ldrumond@ismll.uni-hildesheim.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lars Schmidt-Thieme</string-name>
          <email>schmidt-thieme@ismll.uni-hildesheim.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Information Systems and Machine Learning Lab (ISMLL), Universitätsplatz 1</institution>
          ,
          <addr-line>31141 Hildesheim</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Geographical location is vital to geospatial applications such as event detection, geo-aware recommendation and local search. Previous research on this topic has framed geolocation prediction as classification over a pre-partitioned representation of the earth's surface. These approaches predict a user's geolocation all at once from a concatenation of the user's tweets. In this paper, we study a novel problem in geolocation: we aim to predict a user's geolocation at the posting time of a given tweet. We propose a geo matrix factorization model to address this problem. First, we map tweets into a latent space using a matrix factorization technique. Second, we use a linear combination in the latent space to predict the exact latitude and longitude. In contrast to previous work, we use only one individual tweet as the input instead of a concatenation of all tweets of a user. Our experimental results show that the proposed model outperforms a set of regression models and state-of-the-art classification approaches.</p>
      </abstract>
      <kwd-group>
        <kwd>Twitter</kwd>
        <kwd>Near real-time Geolocation</kwd>
        <kwd>Matrix Factorization</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        In recent years, online social networking and social media sites such as Twitter
have become a ubiquitous and constant mechanism for sharing and
seeking information. Although a tweet's length is limited to 140 characters, there
is still a huge amount of information to explore. Tweet contents are inherently
multifaceted and dynamic; consequently, they represent people's thoughts and public
announcements with temporal currency and spatial vicinity. Because tweets are
posted in a near real-time fashion, Twitter data is especially interesting for
multi-purpose investigations. Understanding a user's near real-time
geographical location, e.g. a latitude and longitude pair or physical coordinates,
enables policies and intervention strategies targeted at a particular region,
such as localized aid [
        <xref ref-type="bibr" rid="ref31 ref9">31,9</xref>
        ], disaster response [
        <xref ref-type="bibr" rid="ref21 ref27">27,21</xref>
        ], event detection [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ] and
disease surveillance [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        One of the earliest papers on geolocation in Twitter streams was
published in 2010 [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. In that work, the authors concatenated all of a user's tweets
posted during a specified duration into one single representative document. The
geolocation of the first tweet, or of the first available geo-tagged tweet in the collection,
was then taken as the geolocation of the representative document. Using a concatenation
provides enough contextual content to develop a wide variety of
geo-locating techniques, such as content analysis with terms in a gazetteer [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], content
analysis with probabilistic language models [
        <xref ref-type="bibr" rid="ref1 ref11 ref16">11,16,1</xref>
        ], metadata of various sorts such
as follower-following relationships [
        <xref ref-type="bibr" rid="ref17 ref22">17,22</xref>
        ], and behavior-based time zones [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].
Furthermore, the research conducted in [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] exploits the idea of geolocation prediction as
label propagation by interpreting location labels spatially. Additionally, the work
of [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] extends [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] by taking into account edge weights as a function reflecting
user interactions.
      </p>
      <p>
        A prerequisite for these directions is a representation of the earth's surface.
Geolocations can be captured as points, or as clusters based on a pre-partitioning
of regions into discrete sub-regions using city locations [
        <xref ref-type="bibr" rid="ref18 ref26 ref5">5,18,26</xref>
        ], named entities
and location indicative words [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] as well as vernacular expressions with the aid
of comprehensive gazetteers [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Another approach of partitioning the earth's
surface is to use a grid. While the simplest grid is a uniform rectangular one
with cells of equal-sized degrees [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ], more advanced grids are either an adaptive
grid based on k-d trees [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ], an equal-area quaternary triangular mesh [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] or a
hierarchical structure [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ].
      </p>
      <p>However, these approaches have several drawbacks. First
of all, being classification methods, they depend heavily on a pre-partitioning or
framing architecture that splits regions into discrete sub-regions.
Thus, they discard the natural properties of real physical coordinates.
Moreover, concatenating tweets into one representative document requires
time-consuming collection as well as abundant data. In addition, concatenating
tweets over a particular duration, e.g. a month, fails to capture
geolocation in near real-time situations. Effective geolocation of a user while
posting a single short tweet, based purely on its content, is a direction
worth investigating and also constitutes a more difficult task.</p>
      <p>In this paper, we address a novel geolocation prediction scenario via
regression within an indicative latent feature space. By working in the latent feature
space, we show that regression models can be used to solve this
prediction problem. We aim to predict the exact user geolocation at the posting
time of a given tweet, based solely on the textual content of the tweet and ignoring its metadata.</p>
    </sec>
    <sec id="sec-2">
      <title>Proposed Method</title>
      <p>In this section, we present the general notation used in this paper as well as
our approach. It is based on a matrix factorization of the individual tweets,
through which we learn a latent representation of tweets and words. This latent
representation is then used to predict the final geolocation. We also present
a learning algorithm for our approach, which is optimized by stochastic gradient
descent.</p>
      <sec id="sec-2-1">
        <title>Notation</title>
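        <p>Consider a dataset <italic>D</italic> containing a set of tweets, where each tweet is described
by <italic>n</italic> features. The dataset is split into a training set <inline-formula><tex-math>D^{\mathrm{train}}</tex-math></inline-formula>, a test set <inline-formula><tex-math>D^{\mathrm{test}}</tex-math></inline-formula>
and a validation set <inline-formula><tex-math>D^{\mathrm{valid}}</tex-math></inline-formula>, the latter being used for hyperparameter optimization.
We have <italic>m</italic>, <italic>l</italic> and <italic>v</italic> tweets in the training, test and validation
sets, respectively. The tweet features are mapped from a dictionary that
comprises all words/tokens/unigrams in the dataset. We denote the vocabulary
size by <inline-formula><tex-math>|V| = n</tex-math></inline-formula>.</p>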
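        <p>Each tweet is annotated with a ground-truth coordinate pair <inline-formula><tex-math>y \in \mathbb{R}^2</tex-math></inline-formula>, <inline-formula><tex-math>y = (y^{\mathrm{lat}}, y^{\mathrm{lon}})</tex-math></inline-formula>,
where <inline-formula><tex-math>y^{\mathrm{lat}} \in \mathbb{R}</tex-math></inline-formula> is the latitude and <inline-formula><tex-math>y^{\mathrm{lon}} \in \mathbb{R}</tex-math></inline-formula> is the longitude of the
associated tweet. By <inline-formula><tex-math>\bar{y}_{u_i} = (\bar{y}_{u_i}^{\mathrm{lat}}, \bar{y}_{u_i}^{\mathrm{lon}})</tex-math></inline-formula> we denote the average geolocation of a
user <inline-formula><tex-math>u_i</tex-math></inline-formula> in the training set, where <inline-formula><tex-math>\bar{y}_{u_i}^{\mathrm{lat}} \in \mathbb{R}</tex-math></inline-formula> is the average latitude and <inline-formula><tex-math>\bar{y}_{u_i}^{\mathrm{lon}} \in \mathbb{R}</tex-math></inline-formula> is
the average longitude. Using <inline-formula><tex-math>\bar{y}_U = (\bar{y}_U^{\mathrm{lat}}, \bar{y}_U^{\mathrm{lon}})</tex-math></inline-formula>, we denote the average geolocation
of all users in the training set. Given some training data <inline-formula><tex-math>X^{\mathrm{train}} \in \mathbb{R}^{m \times n}</tex-math></inline-formula> and
the respective labels <inline-formula><tex-math>Y^{\mathrm{train}} \in \mathbb{R}^{m \times 2}</tex-math></inline-formula>, we seek to learn a machine learning model
<inline-formula><tex-math>f : \mathbb{R}^n \to \mathbb{R}^2</tex-math></inline-formula> which maps tweets to geolocations such that for some test data
<inline-formula><tex-math>X^{\mathrm{test}} \in \mathbb{R}^{l \times n}</tex-math></inline-formula>, the sum of distances
<disp-formula><label>(1)</label><tex-math>\sum_{i=1}^{l} d\bigl(f(X_i^{\mathrm{test}}), Y_i^{\mathrm{test}}\bigr)</tex-math></disp-formula>
is minimal. By <inline-formula><tex-math>Y^{\mathrm{test}} \in \mathbb{R}^{l \times 2}</tex-math></inline-formula> we denote the ground-truth labels for the
test data. Here <italic>d</italic> is a distance metric; in our learning algorithm we
use the Haversine distance.</p>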
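        <p>The objective in Equation 1 can be illustrated with a minimal sketch. It assumes coordinates are given in degrees and uses the mean earth radius of 6371 km; the function names <monospace>haversine_km</monospace> and <monospace>total_distance</monospace> are hypothetical and introduced only for illustration.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math

EARTH_RADIUS_KM = 6371.0  # mean earth radius (assumed constant for this sketch)


def haversine_km(pred, truth, radius=EARTH_RADIUS_KM):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1 = map(math.radians, pred)
    lat2, lon2 = map(math.radians, truth)
    a = (math.sin(abs(lat1 - lat2) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(abs(lon1 - lon2) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding errors for near-antipodal points.
    return 2 * radius * math.asin(math.sqrt(min(1.0, a)))


def total_distance(predictions, ground_truth):
    """Sum of distances over all test tweets, as in Equation 1."""
    return sum(haversine_km(p, y) for p, y in zip(predictions, ground_truth))
```
]]></preformat>
        <p>For example, <monospace>haversine_km((0.0, 0.0), (1.0, 0.0))</monospace> evaluates to roughly 111.2 km, the length of one degree of latitude.</p>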
      </sec>
      <sec id="sec-2-2">
        <title>The Geo Matrix Factorization Model</title>
        <p>Over the last decade, Matrix Factorization (MF) models have gained much
attention through the Netflix Prize competition, where they showed very good predictive
performance as well as decent run-time complexity when dealing with very
sparse matrices. Based on the vanilla MF, we develop a more
multi-relational-oriented factorization model for the geolocation regression task: the Geo Matrix
Factorization (GMF) model. We approach the user geolocation problem as a
text regression task where we aim to predict the exact latitude and longitude
values using an individual tweet. However, instead of using the highly sparse
word counts as features in a linear regression, we first factorize the input space
by learning a matrix <inline-formula><tex-math>T \in \mathbb{R}^{m \times k}</tex-math></inline-formula> for tweets and a matrix <inline-formula><tex-math>W \in \mathbb{R}^{k \times n}</tex-math></inline-formula> for the individual words
of each tweet to reconstruct <italic>X</italic> as:</p>
        <p><disp-formula><label>(2)</label><tex-math>X \approx T W</tex-math></disp-formula></p>
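        <p>As is usual in this setting, the number of latent features <italic>k</italic> is much
smaller than the number of words <italic>n</italic>, so that tweets
are projected into a lower-dimensional latent feature space. This latent
representation of a tweet is then used within a linear model to predict the geolocation
of the user at the posting time of the tweet:</p>
        <p><disp-formula><label>(3)</label><tex-math>\hat{y}_l^{\mathrm{lat}} = \bar{y}_{u_l}^{\mathrm{lat}} + \alpha_0 + \sum_{k=1}^{K} \alpha_k T_{lk}^{\mathrm{lat}}, \qquad \hat{y}_l^{\mathrm{lon}} = \bar{y}_{u_l}^{\mathrm{lon}} + \beta_0 + \sum_{k=1}^{K} \beta_k T_{lk}^{\mathrm{lon}}</tex-math></disp-formula></p>
        <p>where <inline-formula><tex-math>\alpha \in \mathbb{R}^{k+1}</tex-math></inline-formula> and <inline-formula><tex-math>\beta \in \mathbb{R}^{k+1}</tex-math></inline-formula> are the weight coefficient vectors for
latitude and longitude, respectively, <inline-formula><tex-math>u_l</tex-math></inline-formula> is the user who posted tweet <italic>l</italic>, and the sums run over all
<italic>K</italic> latent features. Note that we actually perform two
factorizations of <italic>X</italic>: one for latitude, which yields <inline-formula><tex-math>T^{\mathrm{lat}}</tex-math></inline-formula>, and one for longitude,
which yields <inline-formula><tex-math>T^{\mathrm{lon}}</tex-math></inline-formula>. Our model thus predicts the average training location of a user
plus a regression term on the latent feature space obtained by the factorization
of <italic>X</italic>.</p>
      </sec>
      <sec id="sec-2-3">
        <title>Model Fitting</title>
        <p>Given the model, we have to learn the parameters <inline-formula><tex-math>T^{\mathrm{lat}}, T^{\mathrm{lon}}, W^{\mathrm{lat}}, W^{\mathrm{lon}}, \alpha, \beta</tex-math></inline-formula>, where
the <italic>W</italic> matrices are only used for reconstructing <italic>X</italic> and not for predicting the
actual geolocation. We optimize both the prediction of the geolocation and the
factorization of <italic>X</italic> with respect to the least-squares error. To prevent the model
from overfitting to the training data, we apply Tikhonov regularization to the
regression parameters <inline-formula><tex-math>\alpha</tex-math></inline-formula> and <inline-formula><tex-math>\beta</tex-math></inline-formula>; the latent feature matrices are regularized using
the Frobenius norm. The overall loss for learning the parameters associated
with predicting latitude is then</p>
        <p><disp-formula><label>(4)</label><tex-math>L^{\mathrm{lat}}(\hat{y}^{\mathrm{lat}}, y^{\mathrm{lat}}) = \frac{1}{|X^{\mathrm{train}}|} \bigl\| \hat{y}^{\mathrm{lat}} - y^{\mathrm{lat}} \bigr\|^2 + \lambda_\alpha \|\alpha\|^2 + \bigl\| X^{\mathrm{train}} - T^{\mathrm{lat}} W^{\mathrm{lat}} \bigr\|_F^2 + \lambda_T \bigl\| T^{\mathrm{lat}} \bigr\|_F^2 + \lambda_W \bigl\| W^{\mathrm{lat}} \bigr\|_F^2</tex-math></disp-formula></p>
        <p>The regularization terms are weighted by the regularization parameters <inline-formula><tex-math>\lambda_\alpha</tex-math></inline-formula>, <inline-formula><tex-math>\lambda_T</tex-math></inline-formula> and <inline-formula><tex-math>\lambda_W</tex-math></inline-formula>,
which control the strength of the regularization.</p>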
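        <p>As a rough illustration of the linear prediction on latent features and of the regularized latitude loss, the following sketch uses plain Python lists. It assumes that the normalizer |X<sup>train</sup>| is the number of training tweets and that the bias is part of the regularized weight vector; all function and parameter names are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def predict(user_mean, bias, weights, latent_row):
    """One coordinate: user average + bias + linear term on latent features."""
    return user_mean + bias + sum(w * t for w, t in zip(weights, latent_row))


def squared_frobenius(matrix):
    """Squared Frobenius norm of a matrix stored as a list of rows."""
    return sum(v * v for row in matrix for v in row)


def reconstruct(T, W):
    """Matrix product T W with T of shape m x k and W of shape k x n."""
    n = len(W[0])
    return [[sum(t_k * W[k][j] for k, t_k in enumerate(row)) for j in range(n)]
            for row in T]


def latitude_loss(X, y, user_means, T, W, alpha0, alpha,
                  lam_alpha, lam_T, lam_W):
    """Prediction error + reconstruction error + regularization terms."""
    m = len(X)  # assumption: |X_train| counts the training tweets
    preds = [predict(user_means[i], alpha0, alpha, T[i]) for i in range(m)]
    fit = sum((p - t) ** 2 for p, t in zip(preds, y)) / m
    residual = [[x - r for x, r in zip(x_row, r_row)]
                for x_row, r_row in zip(X, reconstruct(T, W))]
    return (fit
            + lam_alpha * (alpha0 ** 2 + sum(a * a for a in alpha))
            + squared_frobenius(residual)
            + lam_T * squared_frobenius(T)
            + lam_W * squared_frobenius(W))
```
]]></preformat>
        <p>The longitude loss would be obtained in the same way by passing the longitude labels, user averages and parameters.</p>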
        <p>
          These terms penalize parameters with high magnitudes, which typically lead
to overly complex models with very small training errors but poor generalization
performance. The regularization hyperparameters cannot be learned from the training data
and are instead optimized using a grid search on the validation partition of the data.
To solve the above optimization tasks, we apply Stochastic Gradient Descent
(SGD) [
          <xref ref-type="bibr" rid="ref13 ref2">2,13</xref>
          ], where the learning rate is estimated using the Adaptive Subgradient
Method (AdaGrad) [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], which helps to yield better run-time performance.
The basic idea of SGD is that, instead of expensively calculating the full gradient
of Equation 4 and of its longitude counterpart, it randomly selects a tweet and
calculates the corresponding gradient. Suppose we have chosen the tweet indexed
by <italic>m</italic>; the partial derivatives of Equation 4 with respect to <inline-formula><tex-math>T^{\mathrm{lat}}</tex-math></inline-formula> can then be
computed as:
        </p>
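        <p><disp-formula><label>(5)</label><tex-math>\frac{\partial L^{\mathrm{lat}}}{\partial T_{ml}^{\mathrm{lat}}} = -\Bigl( y_m^{\mathrm{lat}} - \bar{y}_{u_m}^{\mathrm{lat}} - \sum_{k=1}^{K} \alpha_k T_{mk}^{\mathrm{lat}} - \alpha_0 \Bigr) \alpha_l - \sum_{j=1}^{n} \Bigl( X_{mj} - \sum_{k=1}^{K} T_{mk}^{\mathrm{lat}} W_{kj}^{\mathrm{lat}} \Bigr) W_{lj}^{\mathrm{lat}} + \lambda_T T_{ml}^{\mathrm{lat}}</tex-math></disp-formula></p>
        <p>Here and in the following, constant factors are absorbed into the learning rate and the regularization
hyperparameters. The partial derivatives with respect to the latent feature matrix <inline-formula><tex-math>W^{\mathrm{lat}}</tex-math></inline-formula> of the
tokens are obtained by</p>
        <p><disp-formula><label>(6)</label><tex-math>\frac{\partial L^{\mathrm{lat}}}{\partial W_{lj}^{\mathrm{lat}}} = -\Bigl( X_{mj} - \sum_{k=1}^{K} T_{mk}^{\mathrm{lat}} W_{kj}^{\mathrm{lat}} \Bigr) T_{ml}^{\mathrm{lat}} + \lambda_W W_{lj}^{\mathrm{lat}}</tex-math></disp-formula></p>
        <p>Finally, the partial derivatives with respect to the regression parameters have the form</p>
        <p><disp-formula><label>(7)</label><tex-math>\frac{\partial L^{\mathrm{lat}}}{\partial \alpha_l} = -\Bigl( y_m^{\mathrm{lat}} - \bar{y}_{u_m}^{\mathrm{lat}} - \sum_{k=1}^{K} \alpha_k T_{mk}^{\mathrm{lat}} - \alpha_0 \Bigr) T_{ml}^{\mathrm{lat}} + \lambda_\alpha \alpha_l</tex-math></disp-formula></p>
        <p>and</p>
        <p><disp-formula><label>(8)</label><tex-math>\frac{\partial L^{\mathrm{lat}}}{\partial \alpha_0} = -\Bigl( y_m^{\mathrm{lat}} - \bar{y}_{u_m}^{\mathrm{lat}} - \sum_{k=1}^{K} \alpha_k T_{mk}^{\mathrm{lat}} - \alpha_0 \Bigr) + \lambda_\alpha \alpha_0</tex-math></disp-formula></p>
        <p>as well as the respective terms for the longitude parameters <inline-formula><tex-math>\beta</tex-math></inline-formula>. The partial derivatives of the longitude loss with respect to <inline-formula><tex-math>T^{\mathrm{lon}}</tex-math></inline-formula> and <inline-formula><tex-math>W^{\mathrm{lon}}</tex-math></inline-formula>
can be calculated in exactly the same manner as Equations 5, 6 and 7.</p>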
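        <p>A single stochastic update for one sampled tweet might look as follows. This is only a sketch under stated assumptions: it uses a fixed learning rate instead of AdaGrad, treats the bias as part of the regularized weight vector, drops constant factors, and stores matrices as plain Python lists; the function name <monospace>sgd_step</monospace> and its parameters are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def sgd_step(x_row, y, user_mean, t_row, W, alpha0, alpha,
             lr=0.01, lam_T=0.01, lam_W=0.01, lam_alpha=0.01):
    """One update for a single tweet.

    Mutates t_row, W and alpha in place and returns the updated bias.
    """
    k, n = len(t_row), len(x_row)
    # Prediction error: label minus (user mean + bias + linear term).
    err = y - user_mean - alpha0 - sum(a * t for a, t in zip(alpha, t_row))
    # Reconstruction residuals of the tweet's feature row.
    res = [x_row[j] - sum(t_row[q] * W[q][j] for q in range(k)) for j in range(n)]
    # All gradients are evaluated at the current parameters before updating.
    grad_t = [-err * alpha[l] - sum(res[j] * W[l][j] for j in range(n))
              + lam_T * t_row[l] for l in range(k)]
    grad_W = [[-res[j] * t_row[l] + lam_W * W[l][j] for j in range(n)]
              for l in range(k)]
    grad_alpha = [-err * t_row[l] + lam_alpha * alpha[l] for l in range(k)]
    grad_alpha0 = -err + lam_alpha * alpha0
    for l in range(k):
        t_row[l] -= lr * grad_t[l]
        alpha[l] -= lr * grad_alpha[l]
        for j in range(n):
            W[l][j] -= lr * grad_W[l][j]
    return alpha0 - lr * grad_alpha0
```
]]></preformat>
        <p>In a full training loop, this step would be repeated over randomly drawn training tweets, once with the latitude parameters and once with the longitude parameters.</p>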
      </sec>
      <sec id="sec-2-4">
        <title>Inference for Test Data</title>
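        <p>By optimizing the respective loss terms on the training data, we learn the latent
representation <italic>T</italic> of all training tweets as well as the linear regression parameters
<inline-formula><tex-math>\alpha</tex-math></inline-formula> and <inline-formula><tex-math>\beta</tex-math></inline-formula> for predicting the final geolocation. However, as we want to predict
geolocations of unseen test tweets, the latent representations <italic>T</italic> of the individual
training tweets cannot be employed. For this reason, we perform a fold-in,
where we factorize the feature matrix <inline-formula><tex-math>X^{\mathrm{test}}</tex-math></inline-formula> of the test data using the latent
representation <italic>W</italic> of the word tokens that was learned on the training data. To
avoid confusion, we denote the latent tweet representations for the test tweets
by <inline-formula><tex-math>{T'}^{\mathrm{lat}}</tex-math></inline-formula> and <inline-formula><tex-math>{T'}^{\mathrm{lon}}</tex-math></inline-formula> and factorize <inline-formula><tex-math>X^{\mathrm{test}}</tex-math></inline-formula> as</p>
        <p><disp-formula><tex-math>X^{\mathrm{test}} \approx {T'}^{\mathrm{lat}} W^{\mathrm{lat}}, \qquad X^{\mathrm{test}} \approx {T'}^{\mathrm{lon}} W^{\mathrm{lon}}</tex-math></disp-formula></p>
        <p>As we can see, <inline-formula><tex-math>W^{\mathrm{lat}}</tex-math></inline-formula> and <inline-formula><tex-math>W^{\mathrm{lon}}</tex-math></inline-formula> are reused from the learning phase.
Subsequently, in the fold-in phase, we define the objective function that we need to
minimize for <inline-formula><tex-math>{T'}^{\mathrm{lat}}</tex-math></inline-formula> as follows:</p>
        <p><disp-formula><label>(9)</label><tex-math>L^{\mathrm{lat}}\bigl(X^{\mathrm{test}}, {T'}^{\mathrm{lat}} W^{\mathrm{lat}}\bigr) = \frac{1}{|X^{\mathrm{test}}|} \bigl\| X^{\mathrm{test}} - {T'}^{\mathrm{lat}} W^{\mathrm{lat}} \bigr\|_F^2 + \lambda_{T'} \bigl\| {T'}^{\mathrm{lat}} \bigr\|_F^2</tex-math></disp-formula></p>
        <p>The partial derivatives of Equation 9 with respect to <inline-formula><tex-math>{T'}^{\mathrm{lat}}</tex-math></inline-formula> can be
computed, up to constant factors, by:</p>
        <p><disp-formula><label>(10)</label><tex-math>\frac{\partial L^{\mathrm{lat}}}{\partial {T'}_{jl}^{\mathrm{lat}}} = -\sum_{i=1}^{n} \Bigl( X_{ji}^{\mathrm{test}} - \sum_{k=1}^{K} {T'}_{jk}^{\mathrm{lat}} W_{ki}^{\mathrm{lat}} \Bigr) W_{li}^{\mathrm{lat}} + \lambda_{T'} {T'}_{jl}^{\mathrm{lat}}</tex-math></disp-formula></p>
        <p>The partial derivatives with respect to <inline-formula><tex-math>{T'}^{\mathrm{lon}}</tex-math></inline-formula> can be computed in
the same manner as in Equation 10. Having learned the latent representation
of the test tweets using the fold-in procedure, we can then perform predictions
for the test users using Equation 3. However, not all users that appear in the
test data necessarily appear in the training data, hence we cannot use
their average geolocation for the final prediction. For those users, we instead use
the geolocation <inline-formula><tex-math>\bar{y}_U</tex-math></inline-formula> computed over all users of the training data:</p>
        <p><disp-formula><label>(11)</label><tex-math><![CDATA[\bar{y}_{u_l} = \begin{cases} \bar{y}_{u_l}, & \text{if } u_l \in D^{\mathrm{train}} \\ \bar{y}_U, & \text{otherwise} \end{cases}]]></tex-math></disp-formula></p>
        <p>Algorithm 1 illustrates how the overall GMF works.</p>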
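        <p>The fold-in step and the fallback for unseen users can be sketched as below. The sketch assumes plain gradient descent with a fixed step size on a single test tweet, a small constant initialization of the latent vector, and a dictionary of per-user training averages; the names <monospace>fold_in</monospace> and <monospace>user_bias</monospace> are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def fold_in(x_row, W, k, steps=200, lr=0.05, lam=0.01):
    """Learn the latent vector of one unseen tweet while keeping W fixed."""
    n = len(x_row)
    t = [0.1] * k  # small constant initialization (an assumption of this sketch)
    for _ in range(steps):
        # Reconstruction residuals of the test tweet under the fixed W.
        res = [x_row[i] - sum(t[q] * W[q][i] for q in range(k)) for i in range(n)]
        grad = [-sum(res[i] * W[l][i] for i in range(n)) + lam * t[l]
                for l in range(k)]
        t = [t_l - lr * g for t_l, g in zip(t, grad)]
    return t


def user_bias(user, user_means, global_mean):
    """Per-user training average, or the global location for unseen users."""
    return user_means.get(user, global_mean)
```
]]></preformat>
        <p>The resulting latent vector would then be plugged into the linear prediction together with the learned regression weights and the value returned by <monospace>user_bias</monospace>.</p>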
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Experiments</title>
      <p>In this section, we first describe the datasets that we use as well as their
preprocessing. Additionally, we describe how we optimized the hyperparameters of
our model. Finally, we compare our approach to a set of competing methods.</p>
      <sec id="sec-3-1">
        <title>Dataset</title>
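        <p><bold>Algorithm 1</bold> GMF. Require: <inline-formula><tex-math>X^{\mathrm{train}} \in \mathbb{R}^{m \times n}</tex-math></inline-formula>, <inline-formula><tex-math>X^{\mathrm{test}} \in \mathbb{R}^{l \times n}</tex-math></inline-formula>, <inline-formula><tex-math>Y \in \mathbb{R}^{m \times 2}</tex-math></inline-formula>.
Ensure: <inline-formula><tex-math>T \in \mathbb{R}^{m \times k}</tex-math></inline-formula>, <inline-formula><tex-math>T' \in \mathbb{R}^{l \times k}</tex-math></inline-formula>, <inline-formula><tex-math>W \in \mathbb{R}^{k \times n}</tex-math></inline-formula>, <inline-formula><tex-math>\alpha \in \mathbb{R}^{k+1}</tex-math></inline-formula>, <inline-formula><tex-math>\beta \in \mathbb{R}^{k+1}</tex-math></inline-formula>.</p>
        <p>We have worked with three publicly available tweet datasets containing
geolocation information and compiled them to fit user geolocation prediction within
the near real-time scenario. One dataset comprises tweets posted within the
United States, whereas the other two datasets contain tweets localized to North
America and to the whole world, respectively. Through this, we evaluate our model's effectiveness
and generality within different geographical scopes, from a single country to the whole
world. Two splitting protocols are designed for these datasets. First, we randomly
split all tweets of each user by a 60/20/20 scheme, denoted as LocalRandom
(LR). Second, we investigate how our model works when a user appearing
in the test set might not exist in the training data by randomly splitting all tweets, regardless of their user, using
the same 60/20/20 scheme, called GlobalRandom (GR).</p>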
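        <p>The two splitting protocols can be sketched as follows. The sketch assumes that tweets are given as (user, text) pairs and that the three parts are returned in the order training, validation, test, which the text does not specify; all function names are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import random
from collections import defaultdict


def split_60_20_20(items, rng):
    """Shuffle and cut into 60/20/20 parts (train, validation, test assumed)."""
    items = list(items)
    rng.shuffle(items)
    a = len(items) * 6 // 10
    b = len(items) * 8 // 10
    return items[:a], items[a:b], items[b:]


def global_random(tweets, seed=0):
    """GR: split all tweets at once, ignoring who posted them."""
    return split_60_20_20(tweets, random.Random(seed))


def local_random(tweets, seed=0):
    """LR: split the tweets of each user separately, then merge the parts."""
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for user, text in tweets:
        by_user[user].append((user, text))
    train, valid, test = [], [], []
    for user in sorted(by_user):
        tr, va, te = split_60_20_20(by_user[user], rng)
        train.extend(tr)
        valid.extend(va)
        test.extend(te)
    return train, valid, test
```
]]></preformat>
        <p>Under LR, every user with enough tweets appears in all three parts, whereas under GR a user in the test part may be absent from the training part.</p>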
        <p>
          US. This dataset was originally compiled by [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] and was later also used
in [
          <xref ref-type="bibr" rid="ref11 ref16 ref30">11,30,16</xref>
          ]. The dataset comprises tweets gathered from the "Gardenhose"
sample stream in the first week of March 2010. The authors of this dataset
already provide geotagged tweets, which we simply reuse. The resulting dataset
contains 377,616 tweets posted by 9,475 users.
        </p>
        <p>
          NA. The second dataset was collected by [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ] and later used by
[
          <xref ref-type="bibr" rid="ref15 ref29">29,15</xref>
          ]. This dataset contains tweets from within North America, including the United
States and parts of Canada and Mexico, posted from September 4th to November 29th, 2011.
Because Twitter did not allow the distribution of complete tweets at that time,
the NA dataset only contains user IDs and tweet IDs. Consequently, we had
to fetch the tweets from Twitter using its official API to check whether the
tweets were still available and whether they had embedded coordinates. Only
226,595 tweets out of 38 million, posted by 10,950 users, have geotags available
and are therefore considered for the final dataset.
        </p>
        <p>
          WORLD. The last dataset was compiled by [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] and later used by
[
          <xref ref-type="bibr" rid="ref15 ref29">29,15</xref>
          ]. The dataset comprises tweets from all over the world. We apply the same
retrieval procedure as described for the NA dataset. The resulting
dataset contains 121,327 tweets posted by 80,179 users. In the WORLD
dataset, 70% of the users have only one tweet, so we only apply the GR 60/20/20
splitting scheme to it.
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>Data Preprocessing</title>
        <p>In addition to the length restriction, tweets are also characterized by the use of
terms that are not found in natural language, including hashtags, abbreviations,
emoticons and URLs. We therefore propose the following data preprocessing
procedure.</p>
        <p>Tokenization. We apply a unigram tokenization procedure that preserves
hashtags, @-replies, abbreviations, blocks of punctuation, emoticons, Unicode
glyphs and other symbols as tokens. We remove URL tokens to prevent tweets
in which bots post information such as advertisements from entering our dataset.</p>
        <p>Bag-of-words representation. After all tweets are tokenized, they are
converted from sparse vectors of token counts into sparse bag-of-words
vectors weighted by term frequency - inverse document frequency (TF.IDF)
scores. By using the TF.IDF scores, we discard language and grammar
structure, token order, semantics and meaning, as well as part-of-speech information. The
TF.IDF weights reflect how important a token is to an instance. The more
instances a token occurs in, the more it is penalized. The tokens
with the highest TF.IDF weights are often the tokens that best characterize the
instance.</p>
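        <p>The preprocessing described above can be approximated by the following sketch. The regular expression is a hypothetical stand-in for the tokenizer, which is not specified in detail here, and the weighting uses the common variant of raw term count times the logarithm of inverse document frequency, which is an assumption.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math
import re
from collections import Counter

# Hypothetical token pattern: URLs, hashtags and @-replies, words, punctuation blocks.
TOKEN = re.compile(r"https?://\S+|[#@]\w+|\w+(?:'\w+)?|[^\w\s]+")
URL = re.compile(r"https?://")


def tokenize(text):
    """Lower-cased unigram tokens with URL tokens removed."""
    return [t for t in TOKEN.findall(text.lower()) if not URL.match(t)]


def tfidf(docs):
    """Return one {token: weight} dictionary per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents each token occurs in.
    df = Counter(tok for doc in docs for tok in set(doc))
    return [{tok: cnt * math.log(n_docs / df[tok])
             for tok, cnt in Counter(doc).items()}
            for doc in docs]
```
]]></preformat>
        <p>A token that occurs in every document receives weight zero under this variant, which matches the intuition that very common tokens are penalized most.</p>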
      </sec>
      <sec id="sec-3-4">
        <title>Evaluation Metrics</title>
        <p>
          Given the ellipsoidal shape of the earth's surface, we apply the Haversine distance
to calculate the distance between two points represented by their latitude in the range
[-90, 90] and longitude in the range [-180, 180]. The Haversine distance <inline-formula><tex-math>d_H : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}</tex-math></inline-formula>
is the great-circle distance between two geographical coordinate
pairs. We compute the distance between two points by the Haversine formula
[
          <xref ref-type="bibr" rid="ref24">24</xref>
          ]. The central angle between them is given by:
        </p>
        <p><disp-formula><tex-math>\theta = 2 \arcsin \sqrt{ \sin^2\left( \frac{|\hat{y}^{\mathrm{lat}} - y^{\mathrm{lat}}|}{2} \right) + \cos(y^{\mathrm{lat}}) \cos(\hat{y}^{\mathrm{lat}}) \sin^2\left( \frac{|\hat{y}^{\mathrm{lon}} - y^{\mathrm{lon}}|}{2} \right) }, \qquad d_H(\hat{y}, y) = r\,\theta</tex-math></disp-formula></p>
        <p>
          where <italic>r</italic> is the radius of the earth. Because of the ellipsoidal shape of the earth, its
radius varies from the equator to the poles. According to [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], we take the mean
radius of the earth, which amounts to <italic>r</italic> = 6371 km. Finally, the evaluation
metrics are the mean and median Haversine distances <inline-formula><tex-math>d_H</tex-math></inline-formula> in kilometers between
the ground-truth geolocation <inline-formula><tex-math>y</tex-math></inline-formula> and the predicted geolocation <inline-formula><tex-math>\hat{y}</tex-math></inline-formula>.
        </p>
        <p>In order to obtain good predictive performance, we also need to carefully tune
the hyperparameters of our model. By <inline-formula><tex-math>k \in \mathbb{N}^{+}</tex-math></inline-formula> we denote the number of latent
features used within the factorization of <italic>X</italic>. By <inline-formula><tex-math>\lambda_T, \lambda_W, \lambda_\alpha, \lambda_\beta</tex-math></inline-formula> and <inline-formula><tex-math>\lambda_{T'}</tex-math></inline-formula> we denote
the regularization hyperparameters used when learning the latent feature
matrices, the latent vocabulary matrices, the linear regression parameters for predicting
latitude and longitude, and the latent feature matrices for the test tweets,
respectively. With <inline-formula><tex-math>\eta_T, \eta_W, \eta_\alpha, \eta_\beta</tex-math></inline-formula> and <inline-formula><tex-math>\eta_{T'}</tex-math></inline-formula> we denote the respective learning rates.
We tune the hyperparameters by assessing the validation performance of our
model and choosing the hyperparameter configuration that performs best. The
number of latent dimensions is selected from <inline-formula><tex-math>k \in \{2, 4, 8, 16\}</tex-math></inline-formula>,
while the values of all other hyperparameters are selected from
{0.1, 0.01, 0.001, 0.0001, 0.00001}. The preprocessed datasets used in the paper
are publicly and unconditionally available<sup>1</sup>.</p>
        <p>For the Support Vector Machine (SVM) and Factorization Machines (FM) baselines, we
run each model separately to predict latitude and longitude. To allow for a fair
comparison, all these regression models also include the user bias in their estimation.
Finally, we combine the predicted latitude and longitude to compute the final
distance. For these models, we also apply a grid search to
find the best hyperparameter configuration for each prediction of latitude and
longitude. On each dataset, we run the models 10 times and report the
average results. The final results are shown in Table 1. All other regression
models perform worse on average, mainly because
they use the extremely sparse 5,200 TF.IDF features directly. Our model, however,
maps each tweet individually into an eight-dimensional latent feature space and
uses those features for prediction; the number of latent features <italic>k</italic> is found by
grid search. The results show that GMF outperforms all competitors
by large margins.
        </p>
        <p>We also report the state-of-the-art results of classification approaches (see
Table 2). One might notice that there are significant differences in
prediction accuracy between the two geolocation prediction scenarios. By targeting the user's
geolocation at the posting time of a given tweet, our model significantly
reduces the localization error on the US and NA datasets. For the WORLD
dataset, where the average length of an individual tweet is 5 tokens compared to 49 tokens
for the concatenation of tweets, our model still achieves reasonable results.</p>
        <p>We have investigated the geo matrix factorization model for the task of near
real-time text-based geolocation in Twitter. In our work, we tackle the user
geolocation prediction task from a regression perspective. We analyze a single tweet as
the model's input without any concatenation. Through this, we can further
predict the user trajectory and obtain the geolocation at the posting time of a given tweet. This is
a starting point for further investigation of the effect of tweet concatenation,
or of the number of tweets needed to achieve an acceptable distance error.
Furthermore, we address the sparsity and imbalance of online conversational
texts by a matrix factorization technique. Based on the experimental results, our
model outperforms all competitors, including SVM and FM, within the
regression task using dedicated latent feature spaces. In comparison with current
state-of-the-art results of classification approaches, our model either outperforms
them or achieves reasonable results. Future improvements broadly fall into
two directions: optimization, and applying the model to different datasets.
In the optimization direction, we will analyze direct optimization of the
Haversine formula. We will also extend our model to predict the near real-time geolocation of
other types of data, such as Wikipedia articles and Flickr images.</p>
        <p><bold>Acknowledgments.</bold> Nghia Duong-Trung gratefully acknowledges the funding
of his work by the Ministry of Education and Training of Vietnam under the
national project no. 911.</p>
        <fn-group>
          <fn id="fn1">
            <label>1</label>
            <p>Available online at: <ext-link ext-link-type="uri" xlink:href="http://fs.ismll.de/publicspace/GMF/">http://fs.ismll.de/publicspace/GMF/</ext-link></p>
          </fn>
        </fn-group>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Ahmed</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hong</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smola</surname>
            ,
            <given-names>A.J.</given-names>
          </string-name>
          :
          <article-title>Hierarchical geographical modeling of user locations from social media posts</article-title>
          .
          <source>In: Proceedings of the 22nd international conference on World Wide Web</source>
          . pp.
          <fpage>25</fpage>
          –
          <lpage>36</lpage>
          .
          <publisher-name>International World Wide Web Conferences Steering Committee</publisher-name>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bottou</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Large-scale machine learning with stochastic gradient descent</article-title>
          .
          <source>In: Proceedings of COMPSTAT'2010</source>
          , pp.
          <fpage>177</fpage>
          –
          <lpage>186</lpage>
          . Springer (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Burton</surname>
            ,
            <given-names>S.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tanner</surname>
            ,
            <given-names>K.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giraud-Carrier</surname>
            ,
            <given-names>C.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>West</surname>
            ,
            <given-names>J.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barnes</surname>
            ,
            <given-names>M.D.</given-names>
          </string-name>
          :
          <article-title>"Right time, right place" health communication on Twitter: value and accuracy of location information</article-title>
          .
          <source>Journal of medical Internet research 14(6)</source>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>C.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>C.J.:</given-names>
          </string-name>
          <article-title>LIBSVM: A library for support vector machines</article-title>
          .
          <source>ACM Transactions on Intelligent Systems and Technology</source>
          <volume>2</volume>
          ,
          <fpage>27:1</fpage>
          –
          <lpage>27:27</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Cheng</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Caverlee</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>You are where you tweet: a content-based approach to geo-locating twitter users</article-title>
          .
          <source>In: Proceedings of the 19th ACM international conference on Information and knowledge management</source>
          . pp.
          <fpage>759</fpage>
          –
          <lpage>768</lpage>
          .
          <publisher-name>ACM</publisher-name>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Compton</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jurgens</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Allen</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Geotagging one hundred million twitter accounts with total variation minimization</article-title>
          .
          <source>In: Big Data (Big Data)</source>
          ,
          <source>2014 IEEE International Conference on</source>
          . pp.
          <fpage>393</fpage>
          –
          <lpage>401</lpage>
          .
          <publisher-name>IEEE</publisher-name>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Decker</surname>
            ,
            <given-names>B.L.</given-names>
          </string-name>
          :
          <article-title>World geodetic system 1984</article-title>
          .
          <source>Tech. rep., DTIC Document</source>
          (
          <year>1986</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Dias</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Anastacio</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martins</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>A language modeling approach for georeferencing textual documents</article-title>
          .
          <source>In: Actas del Congreso Español de Recuperación de Información</source>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Dredze</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>How social media will change public health</article-title>
          .
          <source>Intelligent Systems, IEEE</source>
          <volume>27</volume>
          (
          <issue>4</issue>
          ),
          <fpage>81</fpage>
          –
          <lpage>84</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Duchi</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hazan</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Singer</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Adaptive subgradient methods for online learning and stochastic optimization</article-title>
          .
          <source>The Journal of Machine Learning Research</source>
          <volume>12</volume>
          ,
          <fpage>2121</fpage>
          –
          <lpage>2159</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Eisenstein</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ahmed</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xing</surname>
            ,
            <given-names>E.P.</given-names>
          </string-name>
          :
          <article-title>Sparse additive generative models of text</article-title>
          .
          <source>In: Proceedings of the 28th International Conference on Machine Learning (ICML-11)</source>
          . pp.
          <fpage>1041</fpage>
          –
          <lpage>1048</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Eisenstein</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>O'Connor</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>N.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xing</surname>
            ,
            <given-names>E.P.</given-names>
          </string-name>
          :
          <article-title>A latent variable model for geographic lexical variation</article-title>
          .
          <source>In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing</source>
          . pp.
          <fpage>1277</fpage>
          –
          <lpage>1287</lpage>
          .
          <publisher-name>Association for Computational Linguistics</publisher-name>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Gemulla</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nijkamp</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haas</surname>
            ,
            <given-names>P.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sismanis</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Large-scale matrix factorization with distributed stochastic gradient descent</article-title>
          .
          <source>In: Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining</source>
          . pp.
          <fpage>69</fpage>
          –
          <lpage>77</lpage>
          .
          <publisher-name>ACM</publisher-name>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Han</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cook</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baldwin</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Geolocation prediction in social media data by finding location indicative words</article-title>
          .
          <source>Proceedings of COLING 2012: Technical Papers</source>
          pp.
          <fpage>1045</fpage>
          –
          <lpage>1062</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Han</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cook</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baldwin</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Text-based twitter user geolocation prediction</article-title>
          .
          <source>Journal of Artificial Intelligence Research</source>
          pp.
          <fpage>451</fpage>
          –
          <lpage>500</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Hong</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ahmed</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gurumurthy</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smola</surname>
            ,
            <given-names>A.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsioutsiouliklis</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Discovering geographical topics in the twitter stream</article-title>
          .
          <source>In: Proceedings of the 21st international conference on World Wide Web</source>
          . pp.
          <fpage>769</fpage>
          –
          <lpage>778</lpage>
          .
          <publisher-name>ACM</publisher-name>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Jurgens</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>That's what friends are for: Inferring location in online social media platforms based on social relationships</article-title>
          .
          <source>In: ICWSM</source>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Kinsella</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Murdock</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>O'Hare</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>I'm eating a sandwich in glasgow: modeling locations with tweets</article-title>
          .
          <source>In: Proceedings of the 3rd international workshop on Search and mining user-generated contents</source>
          . pp.
          <fpage>61</fpage>
          –
          <lpage>68</lpage>
          .
          <publisher-name>ACM</publisher-name>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Deng</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>K.C.C.</given-names>
          </string-name>
          :
          <article-title>Towards social user profiling: unified and discriminative influence model for inferring home locations</article-title>
          .
          <source>In: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining</source>
          . pp.
          <fpage>1023</fpage>
          –
          <lpage>1031</lpage>
          .
          <publisher-name>ACM</publisher-name>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Mahmud</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nichols</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Drews</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Home location identification of twitter users</article-title>
          .
          <source>ACM Transactions on Intelligent Systems and Technology (TIST)</source>
          <volume>5</volume>
          (
          <issue>3</issue>
          ),
          <fpage>47</fpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>McClendon</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Robinson</surname>
            ,
            <given-names>A.C.</given-names>
          </string-name>
          :
          <article-title>Leveraging geospatially-oriented social media communications in disaster response</article-title>
          .
          <source>International Journal of Information Systems for Crisis Response and Management (IJISCRAM)</source>
          <volume>5</volume>
          (
          <issue>1</issue>
          ),
          <fpage>22</fpage>
          –
          <lpage>40</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>McGee</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Caverlee</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cheng</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <article-title>Location prediction in social media based on tie strength</article-title>
          .
          <source>In: Proceedings of the 22nd ACM international conference on Conference on information &amp; knowledge management</source>
          . pp.
          <fpage>459</fpage>
          –
          <lpage>468</lpage>
          .
          <publisher-name>ACM</publisher-name>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Rendle</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Factorization machines with libfm</article-title>
          .
          <source>ACM Transactions on Intelligent Systems and Technology (TIST)</source>
          <volume>3</volume>
          (
          <issue>3</issue>
          ),
          <fpage>57</fpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Robusto</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>The cosine-haversine formula</article-title>
          .
          <source>American Mathematical Monthly</source>
          pp.
          <fpage>38</fpage>
          –
          <lpage>40</lpage>
          (
          <year>1957</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Roller</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Speriosu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rallapalli</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wing</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baldridge</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Supervised text-based geolocation using language models on an adaptive grid</article-title>
          .
          <source>In: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning</source>
          . pp.
          <fpage>1500</fpage>
          –
          <lpage>1510</lpage>
          .
          <publisher-name>Association for Computational Linguistics</publisher-name>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Rout</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bontcheva</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Preotiuc-Pietro</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cohn</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Where's @wally?: a classification approach to geolocating users based on their social ties</article-title>
          .
          <source>In: Proceedings of the 24th ACM Conference on Hypertext and Social Media</source>
          . pp.
          <fpage>11</fpage>
          –
          <lpage>20</lpage>
          .
          <publisher-name>ACM</publisher-name>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Sakaki</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Okazaki</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Matsuo</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Earthquake shakes twitter users: real-time event detection by social sensors</article-title>
          .
          <source>In: Proceedings of the 19th international conference on World wide web</source>
          . pp.
          <fpage>851</fpage>
          –
          <lpage>860</lpage>
          .
          <publisher-name>ACM</publisher-name>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Weng</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>B.S.</given-names>
          </string-name>
          :
          <article-title>Event detection in twitter</article-title>
          .
          <source>ICWSM</source>
          <volume>11</volume>
          ,
          <fpage>401</fpage>
          –
          <lpage>408</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <surname>Wing</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baldridge</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Hierarchical discriminative classification for text-based geolocation</article-title>
          .
          <source>In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing</source>
          . pp.
          <fpage>336</fpage>
          –
          <lpage>348</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <surname>Wing</surname>
            ,
            <given-names>B.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baldridge</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Simple supervised document geolocation with geodesic grids</article-title>
          .
          <source>In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume</source>
          <volume>1</volume>
          . pp.
          <fpage>955</fpage>
          –
          <lpage>964</lpage>
          .
          <publisher-name>Association for Computational Linguistics</publisher-name>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <surname>Yin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lampert</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cameron</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Robinson</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Power</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Using social media to enhance emergency situation awareness</article-title>
          .
          <source>IEEE Intelligent Systems</source>
          (
          <issue>6</issue>
          ),
          <fpage>52</fpage>
          –
          <lpage>59</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>