<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On the Impact of User Movement Simulations in the Evaluation of LBS Privacy-Preserving Techniques</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sergio Mascetti</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dario Freni</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Claudio Bettini</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>X. Sean Wang</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sushil Jajodia</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CSIS, George Mason University</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>DICo, Università di Milano</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Dept. of CS, University of Vermont</institution>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The evaluation of privacy-preserving techniques for LBS is often based on simulations of mostly random user movements that only partially capture real deployment scenarios. We claim that benchmarks tailored to specific scenarios are needed, and we report preliminary results on how they may be generated through an agent-based context-aware simulator. We consider privacy-preserving algorithms based on spatial cloaking and compare the experimental results obtained on two benchmarks: the first based on mostly random movements, and the second obtained from the context-aware simulator. The specific deployment scenario is the provisioning of a friend-finder-like service on weekend nights in a big city. Our results show that, compared to the context-aware simulator, the random user movement simulator leads to significantly different results for a spatial-cloaking algorithm, under-protecting in some cases and over-protecting in others.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Location-based services (LBS) are often cited as killer applications for the latest GPS-equipped 3G phones, which are slated to be massively distributed in 70 countries. While car navigation and identification of nearest points of interest are already widely used services, growing interest surrounds the so-called friend-finder services, a class of LBS that will once more change the way we interact. A friend-finder service reveals to a participating user the presence of other close-by participants belonging to a particular group (friends being only one example), possibly showing their position on a map. From a technical point of view, in contrast to services that find nearest points of interest, this service is characterized by a sequence of LBS requests instead of single ones, since a user may want to periodically check for close-by participants, while moving or even while staying in the same place.</p>
      <p>
        Sociological studies have shown that a large number of users perceive the release of their precise location, as part of an LBS request, as a possible privacy threat [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Considering friend-finder services, it is easy to identify two types of privacy threats: (a) the association of the identity of the user with the specific group of persons she is interested in may reveal her religious, sexual, or political orientation, and (b) the association of the identity of the user with her precise location may reveal what kind of places she has been to, or that she has not been where she was supposed to be at that time.
      </p>
      <p>
        As formally shown in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] the likelihood of a privacy violation, and consequently the defense techniques to be applied, strongly depends on the knowledge that an adversary may have. In the friend-finder service the service provider (SP) may not be trusted, or the communication channels may be insecure; then the adversary's knowledge may include the precise identity and location information submitted with each request, and both of the above privacy threats would become real privacy breaches. The substitution of identities with pseudonyms does not entirely solve the problem if, for example, the adversary happens to know who is at the location at the time reported in the request (e.g., when the issuer of the request uses a fidelity card at a store). In some cases, the adversary may also be able to recognize sequences of requests as issued by the same anonymous user (e.g., by observing the same pseudonym or by spatio-temporal tracking) and use this information to re-identify the issuer.
      </p>
      <p>Several defense techniques against both threats under different adversary models have been proposed and may be applied to the friend-finder service; however, current proposals very rarely have formal assessments of the provided privacy preservation, and are generally supported by experimental results based either on real datasets of questionable significance for real LBS services (e.g., truck or school-bus traces) or on data simulations based on mostly random user movements that hardly match the specific deployment scenario of an LBS service.</p>
      <p>
        In order to understand whether the use of simulations based on mostly random user movements may be a real problem, or whether it is actually useful and safe to use these simulations, we considered a typical deployment scenario for a friend-finder service: a large number of young people using the service on a weekend night in a large city like Milan, Italy. We performed an in-depth study, using different sources including on-line surveys, of the parameters characterizing this scenario. We then used the Brinkhoff simulator [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], widely used in testing LBS privacy preservation, to generate, based on these parameters, a first dataset of user movements. A second dataset was created with a customized version of the Siafu agent-based context-aware simulator [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], which is able to capture many more details of our scenario. Then, based on a common metric for privacy and quality-of-service evaluation, we ran a large number of tests on both datasets, considering different re-identification abilities of the adversary, as well as different privacy-preserving techniques.
      </p>
      <p>Our results consistently show that (i) in some cases the evaluation on random-movement simulations leads to the definition of overprotective techniques, and (ii) in other cases, techniques that are shown to meet privacy requirements based on those simulations do not meet them when tested with more realistic context-aware simulations.</p>
      <p>
        We focus our technical treatment on protecting the association of the user with the request she has issued (e.g., with the group of people she is interested in, as in threat (a) described above), even if we believe that our arguments can be easily extended to techniques aimed only at protecting the location.
Related work
We are not aware of related work in this area specifically considering the relevance of realistic simulations in LBS. There are, however, several studies on user movements with impact on many different application areas, including epidemiology, transportation, computer networks, marketing, as well as LBS. A very interesting study supporting an argument against random-movement simulations recently appeared [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In the following we briefly report the main techniques currently proposed for LBS privacy preservation, identifying the ones similar to those tested in our experiments, and the ones using simulations to generate the datasets for experiments.
      </p>
      <p>
        Privacy-preserving solutions based on cryptographic techniques that totally hide the location information in requests, even from the SP, have recently been proposed [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] for LBS based on 1-NN queries, and may probably be adapted to the service we consider. If proven to be correct, no simulation would be needed for these techniques, since no information would leak from any request and the above privacy threats would not apply. However, this adaptation is still to be investigated, and there are some general concerns with these approaches regarding efficiency and flexibility.
      </p>
      <p>A popular alternative technique is spatial cloaking, consisting in the generalization of the spatial information transmitted to the SP as part of a service request. By receiving generalized locations, the SP can only return approximate results on the presence of close-by group members and their positions; while it may be possible to have a trusted entity in the middle filtering the communication and improving the precision, the related overhead costs should be taken into account in evaluating the trade-off between generalization and quality of service. While in this paper we consider techniques based on spatial cloaking as in [7-9, 2, 10], other proposals have considered different techniques, including the generation of dummy requests, the use of incremental requests, or the substitution in the request of the position of the issuer with a region that does not include her (see, among others, [11-13]).</p>
      <p>Most of the proposals for LBS privacy have only considered requests in isolation, while a few have also addressed the cases in which sequences of requests can be exploited by the adversary ([14, 15] among others), as in the friend-finder service. A related problem is the privacy-aware publication of trajectories [16, 17]; even if this has some aspects more similar to database publication than to service-request privacy preservation, we believe that our results may be important for these studies as well.</p>
      <p>Synthetic, mostly random, user movements obtained from the Brinkhoff simulator or other simulators have been used in most of the above-cited papers, as well as in our own previous work.
The rest of the paper is organized as follows. In Section 2 we formally define how we evaluate the privacy of LBS requests, or equivalently, how we measure the risk of a privacy violation upon issuing a request. In Section 3 we explain how the two datasets were obtained from the generators, based on the parameters characterizing the deployment scenario. In Section 4 we briefly explain the privacy preservation algorithms being used and we report our experimental results. Section 5 concludes the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>Privacy metric of generalized requests</title>
      <p>As mentioned in the introduction, we are concerned with privacy protection via location generalization (also called spatial cloaking). In this section, we formalize the adversary model we consider in this paper, and give a metric to measure the privacy provided by a set of generalized requests against this adversary.</p>
      <p>Requests, original requests, and generalized requests
We first formally define requests and generalized requests for LBS. A request issued by a user without alteration is called an original request, and a generalized request is one that is sent to the service provider and has been altered from the original one for the purpose of privacy protection. Both kinds of requests are called requests and denoted r. A convention in this paper is to use r′ to denote generalized requests, to emphasize their generalized nature, while using r to denote original requests, if not specified otherwise.</p>
      <p>Either the client software or a trusted medium transforms (or generalizes) an original request into a generalized one. In this paper, we are not concerned with how the generalization happened, but rather with the resulting generalized requests and their privacy properties. In the experimental section, we evaluate the performance of generalization algorithms based on the generalized requests they generate.</p>
      <p>Each LBS request r, either original or generalized, is logically divided into three parts: IDdata, STdata, and SSdata, containing user identification data, location and time of the request, and other service parameters, respectively. In the sequel, the spatial and temporal components of STdata are denoted by Sdata and Tdata, respectively. In this paper, for the sake of simplicity, we consider space and time as discrete domains. However, our results can be easily extended to the case in which these two domains are continuous.</p>
      <p>Each generalized request r′ must correspond to an original request r such that the difference between r and r′ is only in Sdata and, furthermore, r.Sdata ⊆ r′.Sdata, i.e., the spatial region of the generalized request must contain (or be equal to) the spatial region of the original request (here a "region" can be a point). We use issuer(r) to denote the actual issuer of the (original or generalized) request r.</p>
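      <p>The request structure and the containment constraint above can be sketched in code. This is a minimal illustration under invented names (Request, Region, is_valid_generalization are not from the paper); spatial regions are modeled as axis-aligned rectangles, with a point as a degenerate rectangle.</p>

```python
from dataclasses import dataclass
from typing import Tuple

# An axis-aligned rectangle stands in for a spatial region; a point is a
# degenerate rectangle with x1 == x2 and y1 == y2.
Region = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

@dataclass
class Request:
    id_data: str      # IDdata: user identification (or pseudonym)
    s_data: Region    # Sdata: spatial component of STdata
    t_data: int       # Tdata: temporal component of STdata
    ss_data: str      # SSdata: service-specific parameters

def contains(outer: Region, inner: Region) -> bool:
    """True if `inner` lies entirely within `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])

def is_valid_generalization(r: Request, r_prime: Request) -> bool:
    """r' may differ from r only in Sdata, and r.Sdata must be
    contained in (or equal to) r'.Sdata."""
    return (r.id_data == r_prime.id_data
            and r.t_data == r_prime.t_data
            and r.ss_data == r_prime.ss_data
            and contains(r_prime.s_data, r.s_data))
```

      <p>For instance, a point request at (1, 1) is validly generalized by the square (0, 0, 2, 2), but not by a region that excludes the point.</p>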
      <p>Adversary model
The objective here is to provide an adversary model that captures a general class of adversary models; in a sense, our adversary model is an adversary "meta-model". This adversary meta-model concerns two aspects of the knowledge that an adversary might have: (1) knowledge of users' whereabouts (i.e., their locations), and (2) correlation of (generalized) requests. These two aspects cover the (explicit or implicit) assumptions that appear in the relevant literature.</p>
      <p>For users' locations, we assume that the adversary has the knowledge
expressed as the following Ident function:</p>
      <p>Ident_t : the Areas → the User sets;
that is, given an area A, Ident_t(A) is the set of users whom, through certain means, the adversary has identified as being located in area A at time t. In the following, when no confusion arises, we omit the time instant t. We further assume that this knowledge is correct, in the sense that these identified users are indeed in area A at that time.</p>
      <p>For a given user i, if there exists an area A such that i ∈ Ident(A), then we say i is identified by the adversary. Furthermore, we say that i is identified in A. Note that there may be users who are also in A but whom the adversary does not identify. This may happen either because the adversary is not aware of the presence of users in A, or because the adversary cannot identify these users even if he is aware of their presence. We do not distinguish these two cases in our adversary model since, as we shall see later, the distinction does not make any perceptible difference in the ability of the adversary when the total population is large.</p>
      <p>Clearly, in reality, there are many different sources of external information that can lead the adversary to estimate the location of users. Some may lead the adversary to know that a user is in a certain area, but not the exact location. For example, an adversary may know that Bob is in a pub (due to his use of a fidelity card at the pub), but may not know which room he is in. Some statistical analysis may be done to derive the probability that Bob is in a particular room, but this is beyond the scope of this paper.</p>
      <p>The most conservative assumption regarding this capability of the adversary is that Ident(A) gives exactly all the users in each area A. It can be seen that if the privacy of the user is guaranteed under this most conservative assumption, then privacy is also guaranteed against any less precise Ident function. However, this conservative assumption is unlikely to hold in reality, while some have observed that it degrades the quality of service unnecessarily. It is interesting to see how much privacy and quality of service change with more realistic Ident functions. This is partly the goal of our paper.</p>
      <p>As part of this adversary model regarding the location and users, we also
assume another function:</p>
      <p>Num_t : the Areas → [0, ∞);
that is, given an area A, Num_t(A) gives an estimate of the number of users in the area at time t. This function can be derived from statistical information publicly available or through some kind of counting mechanism, such as tickets to a theater. Again, when no confusion arises, we do not indicate the time instant t.</p>
      <p>The second part of the adversary model is his ability to correlate requests.
We formalize this with the following function L:</p>
      <sec id="sec-2-1">
        <title>L : the Requests</title>
        <p>! the Request sets;
that is, given a (generalized) request r0, L(r0) gives a set of requests such that
the adversary has concluded, through certain means, are issued by the same user
who issued the request r0. In other words, all the requests in L(r0) are linked to
r0, although the adversary may still not know who the user is.</p>
        <p>Note that L(r) may only give an (often small) subset of all the requests issued by the issuer of r. On the other hand, we assume that the function L is correct, in the sense that each request in L(r) is indeed issued by the same user in reality. A set of requests is called a trace, denoted τ, if from the link function L we understand that all its requests are issued by the same user. The requests in τ are implicitly ordered along the time dimension.</p>
        <p>As in the case of the Ident function, the most conservative assumption on correlation is that L(r) gives exactly all the (generalized) requests that are issued by the issuer of r. This is a very strong assumption that may lead to a severe decrease in quality of service when accompanied by the most conservative assumption about the Ident function. Again, a partial goal of this paper is to study the impact of a less conservative but more realistic assumption on L.</p>
        <p>
          In [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], we proposed a formal framework to model LBS privacy attacks and defenses for the static case. The main idea is that the safety of a defense technique can be formally evaluated only if the context, i.e., the set of assumptions about the adversary's external knowledge, is explicitly stated. Following this methodology, in this paper a context CH is given by three functions Ident, Num, and L, that is
        </p>
        <p>CH = (Ident, Num, L).
In the next section, we formalize the attack on the generalized requests that an adversary can perform in a context CH.</p>
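        <p>The context CH can be sketched as a small data structure bundling the three knowledge functions. The sketch below backs them with plain lookup tables; all names and the table representation are illustrative assumptions, not part of the paper's formalism.</p>

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Tuple

AreaTime = Tuple[str, int]  # (area id, time instant)

@dataclass
class Context:
    """Adversary context CH = (Ident, Num, L), backed by lookup tables.
    The tables stand in for whatever sources the adversary really has
    (fidelity cards, ticket counts, spatio-temporal tracking)."""
    ident_table: Dict[AreaTime, FrozenSet[str]] = field(default_factory=dict)
    num_table: Dict[AreaTime, float] = field(default_factory=dict)
    link_table: Dict[str, List[str]] = field(default_factory=dict)

    def ident(self, area: str, t: int) -> FrozenSet[str]:
        # Ident_t(A): users the adversary has identified in area A at time t.
        return self.ident_table.get((area, t), frozenset())

    def num(self, area: str, t: int) -> float:
        # Num_t(A): estimated number of users in area A at time t.
        return self.num_table.get((area, t), 0.0)

    def link(self, request_id: str) -> List[str]:
        # L(r'): requests linked to r'; r' is always linked to itself.
        linked = self.link_table.get(request_id, [])
        return linked if request_id in linked else [request_id] + linked
```

        <p>With empty tables this Context models the weakest adversary: Ident returns no identified users, Num returns 0, and L links each request only to itself (the snapshot case).</p>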
        <p>A consequence of restricting to context CH is that, analogously to the related work in this area, we focus our attention on using only STdata as a quasi-identifier. Intuitively, a quasi-identifier in a request is a combination of values that can be used to provide more information on who the actual issuer of a request may be than would be available without these values. For example, if the Ident function is given, the STdata in the request is a quasi-identifier, as it may provide information on the actual issuer, as shown in the next subsection. In principle, any information contained in a request should be carefully analyzed to see if it may serve as a quasi-identifier. For example, the IDdata part is an obvious target, and some service-specific parameters may be used to link the request to users. However, these aspects are outside the scope of this paper.
The general question for this subsection is, given a set of generalized requests and a context CH, how much privacy the users who issued these requests have.</p>
        <p>We want to find the following function:</p>
      </sec>
      <sec id="sec-2-2">
        <title>Att : the Request set</title>
        <p>the U sers
! [0; 1];
Intuitively, given a (generalized) request r0 and a user i, Att(r0; i) gives the
probability that the adversary can derive from CH that i is the issuer of r0
among all the users.</p>
        <p>In the rest of this section we show how to specify the attack function for context CH. Once the attack function is specified, we can use the following formula to evaluate the privacy value of a request:</p>
        <p>Privacy(r′) = 1 −</p>
        <sec id="sec-2-2-1">
          <title>AttCH(r′, issuer(r′))</title>
          <p>(1)
Intuitively, this value is the probability that the attacker will not associate the issuer of request r′ with r′.</p>
          <p>In order to specify the Att function, we introduce the function Inside(i, r′), which indicates the probability of user i being located in r′.Sdata at the time of the request. Intuitively, Inside(i, r′) = 1 if user i is identified by the adversary as one of the users located in r′.Sdata at time r′.Tdata, i.e., i ∈ Ident_t(r′.Sdata) for t = r′.Tdata. On the contrary, Inside(i, r′) = 0 if i is recognized by the adversary as one of the users located outside r′.Sdata at time r′.Tdata, i.e., there exists an area A with A ∩ r′.Sdata = ∅ such that i ∈ Ident(A). Finally, if neither of the above cases holds, then the adversary does not know where i is; there is still a probability that i is in r′.Sdata. Theoretically, this probability is the number of users in r′.Sdata that are not recognized by the adversary (i.e., Num(r′.Sdata) − |Ident(r′.Sdata)|) divided by the number of users who are not recognized by the adversary anywhere (i.e., |I| − |Ident(Ω)|, where I is the set of all users and Ω is the entire area of the application). Formally,</p>
          <p>Inside(i, r′) =
1, if i ∈ Ident(r′.Sdata);
0, if ∃A : A ∩ r′.Sdata = ∅ and i ∈ Ident(A);
(Num(r′.Sdata) − |Ident(r′.Sdata)|) / (|I| − |Ident(Ω)|), otherwise.   (2)</p>
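          <p>The three cases of Formula 2 translate directly into code. In the sketch below, the adversary's knowledge is handed over as plain sets and counts; all parameter names are illustrative, not from the paper.</p>

```python
def inside(i, ident_in_r, ident_outside, num_r, total_users, total_ident):
    """Inside(i, r') per the three cases of Formula 2.
    ident_in_r:    users identified inside r'.Sdata at time r'.Tdata
    ident_outside: users identified in some area disjoint from r'.Sdata
    num_r:         Num(r'.Sdata), estimated population of r'.Sdata
    total_users:   |I|, the total user population
    total_ident:   |Ident(Omega)|, users identified anywhere in the map
    """
    if i in ident_in_r:       # identified inside the generalized region
        return 1.0
    if i in ident_outside:    # identified somewhere disjoint from it
        return 0.0
    # Unrecognized anywhere: unidentified seats in r'.Sdata divided by
    # unidentified users overall.
    return (num_r - len(ident_in_r)) / (total_users - total_ident)
```

          <p>With the data of Example 1 below (100 users, i1 identified inside, i2 and i3 identified outside), the exact "otherwise" value is (3 − 1)/(100 − 3) = 2/97.</p>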
          <p>We can now define the Att function in context CH. For the sake of presentation, let us first consider the attack in the snapshot context</p>
          <p>Csnap = (Ident, Num, Lsnap),
where for each generalized request r′, Lsnap(r′) = {r′}. In this special case, the probability of a user i being the issuer of r′ is given by the probability of i being in r′.Sdata at the time of the request, normalized over all the users in I. Formally, the attack can be defined as:</p>
          <p>AttCsnap(r′, i) = Inside(i, r′) / Σ_{i′∈I} Inside(i′, r′)   (3)</p>
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>Inside(i, r′)</title>
        <p>When the total population of users is large (relative to the number of users whose locations are known to the adversary), the "otherwise" value in Formula 2 is very small, albeit nonzero. Intuitively, if a user i falls into this case, then the adversary cannot really distinguish this particular user from all the other users who also fall into this case. For such a user i, we can simply give the value 1/|I| to AttCsnap(r′, i). We could give 1/(|I| − Num(Ω)), but this does not make much impact in practice. It is now easy to see that</p>
        <sec id="sec-2-3-1">
          <title>AttCsnap(r′, i)</title>
          <p>8 1=N um(r0:Sdata) if Inside(i; r0) = 1
&lt; 0 if Inside(i; r0) = 0
: 1=jIj otherwise
(4)
The above formula makes intuitively sense. Indeed, if i is recognized as
inside r0:Sdata, without any other information, the adversary cannot distinguish
him/her from any of the N um(r0:Sdata) people in the area who might be the
issuer. If i is recognized outside, then clearly i cannot be the issuer due to our
de nition of (generalized) requests. If i is not recognized anywhere (meaning
he/she can be anywhere), then the attacker cannot distinguish him/her from
any of the other people who are not recognized. Since we assume the total
population is much greater than N um( ), the probability that i is the issuer is close
to 1=jIj.</p>
          <p>Example 1. Consider the situation shown in Figure 1(a), in which there is a request r′ such that, at time r′.Tdata, there are three users in r′.Sdata: one of them is identified as i1, the other two are not identified. The adversary can also identify users i2 and i3 outside r′.Sdata at time r′.Tdata. Assume that the set I contains 100 users.</p>
          <p>Fig. 1. (a) First request, r′; (b) second request, r″.</p>
          <p>Clearly, i2 and i3 have zero probability of being the issuer, since they are identified outside r′.Sdata, and by the assumption that the spatial region of any generalized request must contain the spatial region of the original request. On the contrary, the adversary is sure that i1 is located in r′.Sdata. By Equation 3, the attack associates i1 with r′ with likelihood 1/(Σ_{i′∈I} Inside(i′, r′)). By Formula 2, for each user i in I ∖ {i1, i2, i3}, Inside(i, r′) ≈ 2/100. Therefore, Σ_{i′∈I} Inside(i′, r′) ≈ 97 · 2/100 + 1 ≈ 3. Consequently, the probability of i1 being the issuer of r′ is approximately 1/3. Moreover, each user i ∈ I ∖ {i1, i2, i3} has a probability of being the issuer of about (2/100)/3 = 2/300.</p>
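          <p>The arithmetic of Example 1 can be checked numerically. The snippet below follows the example's own approximation of 2/100 for the "otherwise" value (the exact Formula 2 value would be 2/97); variable names are illustrative.</p>

```python
# Example 1: 100 users; i1 identified inside r'.Sdata, i2 and i3
# identified outside, and the remaining 97 users each get the example's
# approximate "otherwise" probability of 2/100.
inside_vals = {"i1": 1.0, "i2": 0.0, "i3": 0.0}
for k in range(97):
    inside_vals[f"u{k}"] = 2 / 100

denom = sum(inside_vals.values())        # 97 * 2/100 + 1 = 2.94, close to 3
att_i1 = inside_vals["i1"] / denom       # close to 1/3
att_unknown = inside_vals["u0"] / denom  # close to 2/300
```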
          <p>In the general case L(r′) ⊇ {r′}, we can evaluate, analogously to the snapshot case, the probability that a user is located in the generalized region of all the requests in the trace τ = L(r′). So we can extend the Inside function to traces: given a trace τ and a user i, Inside(i, τ) is the probability that user i is located, for each request r′ in τ, in r′.Sdata at time r′.Tdata. Then, the attack is
AttCH(r′, i) = Inside(i, L(r′)) / Σ_{i′∈I} Inside(i′, L(r′))   (5)</p>
        </sec>
      </sec>
      <sec id="sec-2-4">
        <title>Inside(i, L(r′))</title>
        <p>We now turn to consider how to compute Inside(i, τ).</p>
        <p>First consider some easy cases. If i ∈ Ident(r′.Sdata) for all requests r′ ∈ τ, then Inside(i, τ) = 1. If i ∈ Ident(A) and A ∩ r′.Sdata = ∅ for an area A and at least one request r′ ∈ τ, then Inside(i, τ) = 0.</p>
        <p>The remaining cases are difficult ones. To calculate Inside(i, τ), we need to consider the likelihood of someone moving from one location to/from another at the specific times. In this paper, we advocate the following as a reasonable approach. We assume that for each pair of locations A and B and two times t1 and t2, we know the probability of a user i being in B at time t2 conditioned on the fact that the user is in A at time t1. Formally, consider two random variables X: "i is inside A at time t1" and Y: "i is inside B at time t2", where A and B are two areas and t1 and t2 are two different times. We assume the adversary knows the value P(Y|X).</p>
        <p>We note that P(Y|X) in general is not the same as P(Y). Indeed, how likely user i is to be in B can depend on how likely the same user is to be in A. Take two extreme examples: if A and B are very far away and t1 and t2 are close to each other, then i cannot be in B at t2 if i is in A at t1, i.e., P(Y|X) ≈ 0. On the other hand, if A and B are just two locations along a one-way road and the difference between times t1 and t2 matches the time needed to move from A to B at a normal moving speed, then P(Y|X) ≈ 1. In practice, this value can be derived from historical observations and experiences.</p>
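        <p>One simple way to derive such a P(Y|X) from historical observations is frequency estimation over recorded traces. The function below is an illustrative sketch under invented names, not the paper's procedure.</p>

```python
def estimate_transition(traces, area_a, t1, area_b, t2):
    """Estimate P(user in B at t2 | user in A at t1) by counting over
    historical traces. Each trace is a dict mapping a time instant to
    the area id where the user was observed at that time."""
    in_a = [tr for tr in traces if tr.get(t1) == area_a]
    if not in_a:
        return None  # no historical evidence to condition on
    moved = sum(1 for tr in in_a if tr.get(t2) == area_b)
    return moved / len(in_a)
```

        <p>For instance, if two of three traces observed in A at t1 appear in B at t2, the estimate is 2/3.</p>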
        <p>Now, assume τ consists of the requests r′1, …, r′k. We form a Bayesian network for each user i with X1, …, Xk as the nodes, where each Xj corresponds to the random variable "user i is in r′j.Sdata at time r′j.Tdata". In this network, for each node Xh that satisfies the condition (denoted c) i ∈ Ident_t(r′h.Sdata) with t = r′h.Tdata, we draw an arc towards each other node Xh′ that does not satisfy condition c. In addition, for each pair of nodes Xh and Xh′ such that neither satisfies condition c, we draw an arc from Xh to Xh′ if r′h.Tdata &lt; r′h′.Tdata. (The resulting network is acyclic.) As we have assumed, we know the value P(Xh′|Xh) for each arc from Xh to Xh′. Denote by E the conjunctive fact that P(Xh) = 1 for each r′h ∈ τ that satisfies condition c. What we want to find is P(X1, …, Xk|E) = Inside(i, τ). This is a well-studied belief revision problem, and many computation and approximation methods exist. (Note that if we apply this method to the easier cases mentioned earlier, we arrive at the correct values.)
Example 2. Continuing from Example 1, assume a second request r″ (see Figure 1(b)) is issued after r′ and that r″ is linked with r′, so τ consists of these two requests. At time r″.Tdata, there are 4 users inside r″.Sdata, two of whom are identified as i1 and i2. No user is identified outside r″.Sdata. From the above discussion, it follows that Inside(i2, τ) = Inside(i3, τ) = 0, since i2 and i3 are identified outside the first generalized request r′. All the other users have a non-zero probability of being inside the generalized region of each request in the trace. In particular, Inside(i1, τ) = 1, since i1 is recognized in both requests. Consider a user i ∈ I ∖ {i1, i2, i3}, and denote by X_r′ and X_r″ the assertions "i is in r′.Sdata at time r′.Tdata" and "i is in r″.Sdata at time r″.Tdata". In this case, the Bayesian network for i has two nodes, X_r′ and X_r″, and there is an arc from X_r′ to X_r″ since r″ is issued after r′. Now let us assume P(X_r″|X_r′) = 0.75, i.e., there is a 75% likelihood that someone in r′.Sdata will move to r″.Sdata at the specified times. We compute Inside(i, τ) = P(X_r′, X_r″) = P(X_r′)·P(X_r″|X_r′) = (2/97)·0.75. The sum of all the Inside(j, τ) values is then 1 + 0 + 0 + 97·(2/97)·0.75 = 2.5. The attack values under these assumptions are as follows: AttCH(r″, i1) = 1/2.5 = 40%, AttCH(r″, i2) = AttCH(r″, i3) = 0, and AttCH(r″, i) = (2/97)·0.75/2.5 ≈ 0.6% for each of the other 97 users i.</p>
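        <p>The first scenario of Example 2 is a two-node chain, so Inside reduces to a simple product and the attack values can be checked numerically (variable names are illustrative):</p>

```python
# Example 2, first scenario: trace (r', r''), with P(X_r'' | X_r') = 0.75,
# i1 identified in both requests, i2 and i3 identified outside r'.Sdata,
# and 97 unidentified users with P(X_r') = 2/97 each.
p_unknown = (2 / 97) * 0.75            # Inside value for an unidentified user
total = 1 + 0 + 0 + 97 * p_unknown     # sum of all Inside values = 2.5
att_i1 = 1 / total                     # 40%
att_unknown = p_unknown / total        # about 0.6%
```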
        <p>To make the situation more interesting, let us remove the fact that i2 was recognized outside at time r′.Tdata, and let us figure out the value Inside(i2, τ). In this case, assume P(X_r′|X_r″) = 0.75, namely people who are in r″.Sdata have a 75% likelihood of coming from r′.Sdata. Under the fact E that i2 is in r″.Sdata, we then know Inside(i2, τ) = P(X_r′, X_r″|E) = 0.75. The sum of the Inside values is then 1 + 0.75 + 0 + 97·(2/97)·0.75 = 3.25. Hence, AttCH(r″, i1) ≈ 31%, AttCH(r″, i2) ≈ 23%, AttCH(r″, i3) = 0, and AttCH(r″, i) = (2/97)·0.75/3.25 ≈ 0.47% for each of the other 97 users i. This is an interesting exercise, as it reveals that if we allow i2 to possibly be in r′.Sdata (with 75% probability), then the likelihood that i1 is the issuer decreases, which is intuitively correct.</p>
        <p>It is worth noting that the definition of the attack in context CH is a proper extension of the attack that can be defined in the conservative context in which the adversary knows the location and the identity of each user at each time instant. The historical attack in this context was first proposed in [14]. The idea is that the only users who have a non-zero probability of being the issuer of a trace of requests are those whose spatio-temporal location is contained in the generalized region of every request in the trace. It can easily be seen that, if each user can be identified at each time instant, then the Inside() function returns either 0 or 1, and hence the attack we specified for context CH assigns zero probability to each user located outside the generalized region of any request in the trace.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>The MilanoByNight simulation</title>
      <p>In order to evaluate privacy-preserving techniques applied to LBS, a dataset of users' movements is needed. In our experiments, we want to focus on privacy threats that arise when using a friend-finder service, as described in Section 1. We suppose that this kind of service is primarily used by people during entertainment hours, especially at night. Therefore, the ideal dataset for our experiments should represent the movements of people on a typical Friday or Saturday night in a big city, when users tend to move to entertainment places. To our knowledge, no such datasets are currently publicly available, especially considering that we want large-scale, individual, and precise location data (i.e., with the same approximation as current consumer GPS technology). In this section we describe how we generated this user-movement dataset.</p>
3.1</p>
      <p>Relevant Parameters
For our experiments we want to arti cially generate movements for 100; 000 users
on the road network of Milan5. The total area of the map is 324 km2, and the
resulting average density is 308 users/km2. Very detailed digital vector maps
of the city have been generously provided by the municipality of Milan. The
simulation includes a total of 30,000 home buildings and 1,000 entertainment
places; the first value is strictly related to the considered number of users, while
the second is based on real data from public sources, which also provide the
geographical distribution of the places. Our simulation starts at 7 pm and ends at
1 am. During these hours, each user moves from home to an entertainment place,
spends some time in that place, and possibly moves to another entertainment
place or goes back home.</p>
      <p>All probabilities related to users' choices are modeled with probability
distributions. For this specific data generation, some of the important parameters
of the simulation are:
- Source and destination. These are the locations essential to define
movements. They may be homes or entertainment places. Some places in some
districts are more popular than others.
- StartingTime. The time at which a user leaves her home to go to the first
entertainment place.
- Permanence. How long a user stays at one entertainment place.
- NumPlaces. How many entertainment places a user visits in one night.</p>
      <p>In order to have a realistic model of these distributions, we prepared a survey
to collect real user data. We are still collecting data, but the current parameters
are based on interviews of more than 300 people in our target category.
5 100,000 is an estimate of the number of people participating in the service we
consider.</p>
      <p>3.2 Weaknesses of mostly random movement simulations</p>
      <p>
        Many papers in the field of privacy preservation in LBS use artificial data
generated by moving-object simulators to evaluate their techniques. However, most
of these simulators are not able to reproduce realistic user behavior.
For example, objects generated by the Brinkhoff generator [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] cannot be
aggregated in certain places (e.g., entertainment places). Indeed, once an object
is instantiated, the generator chooses a random destination point on the map;
after reaching the destination, the object disappears from the dataset. For the
same reason, it is not possible to reproduce simple movement patterns (e.g., a
user going out from her home to another place and then coming back home),
nor to simulate that a user remains for a certain time in a place.
      </p>
      <p>Despite these strong limitations, we made our best effort to use the Brinkhoff
simulator to generate a set of user movements with characteristics as close as
possible to those explained in Section 3.1. For example, in order to simulate
entertainment places, some random points on the map, among those points on
the trajectories of users, were picked. The main purpose of this simulation is
to understand whether testing privacy preservation over random movement simulations
gives significantly different results with respect to more realistic simulations.</p>
      <p>3.3 Generation of user movements with a context simulator</p>
      <p>
        In order to obtain a dataset consistent with the parameters specified in
Section 3.1, we need a more sophisticated simulator. For our experiments, we have
chosen to customize the Siafu context simulator [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. With a context simulator
it is possible to design models for agents, places, and context. Therefore, it is
possible to define particular places of aggregation and make users dynamically
choose which place to reach and how long to stay in that place. In our simulation,
homes are distributed almost uniformly on the map, with a minor concentration
in the central zones of the city. Entertainment places are mostly concentrated
in 5 zones of the city.
      </p>
      <p>The distributions for the StartingTime, Permanence, and NumPlaces parameters
introduced in Section 3.1 were modeled with the results of the survey. For
example, the time of permanence in an entertainment place was modeled according
to the following percentages derived from the survey: 9.17% of the users stay
less than 1 hour, 34.20% stay between 1 and 2 hours, 32.92% stay between 2
and 3 hours, 16.04% stay between 3 and 4 hours, and 7.68% stay more than 4
hours.</p>
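      <p>Drawing a user's permanence time from this survey-derived distribution can be sketched as follows. This is our own illustration, not the simulator's code: the bucket bounds in hours, the cap of the open-ended ">4 hours" bucket at 5 hours, and the uniform draw within a bucket are all assumptions.</p>

```python
# Sample a permanence time (in hours) from the survey-derived buckets.
import random

# (lower hour, upper hour, probability); ">4 hours" is capped at 5 here.
PERMANENCE_BUCKETS = [
    (0, 1, 0.0917),
    (1, 2, 0.3420),
    (2, 3, 0.3292),
    (3, 4, 0.1604),
    (4, 5, 0.0768),
]

def sample_permanence(rng=random):
    """Pick a bucket with the survey probabilities, then a uniform time in it."""
    r = rng.random()
    cumulative = 0.0
    for low, high, p in PERMANENCE_BUCKETS:
        cumulative += p
        if r <= cumulative:
            return rng.uniform(low, high)
    return PERMANENCE_BUCKETS[-1][1]  # guard against rounding of the probabilities
```

      <p>Over many draws, roughly 9% of the sampled times fall below one hour, matching the first survey bucket.</p>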
      <p>Following these parameters, in our dataset users spend 50.87% of the time
at home, 7.28% of the time moving from one place to another, and 41.85% of the
time in entertainment places. When a user moves from one place to another, she
decides whether to go on foot or by car. In general, if an entertainment place
is farther than 500 meters, people tend to move by car, and this is reflected in
the simulation. The average speed of movements by car is 20 km/h, while the
average speed on foot is 3.6 km/h. With our parameters, 10.64% of movements
are done on foot, while all the others are done by car.</p>
    </sec>
    <sec id="sec-4">
      <title>Experimental results</title>
      <p>In this section we show the results of our experimental evaluation. We first
define how we evaluate the quality of service in Section 4.1, then we describe
the experimental setting in Section 4.2 and the generalization algorithms we
used in Section 4.3. Finally, in Sections 4.4 and 4.5 we show the impact of the
simulation parameters and of the user movements, respectively, on the evaluation
of the generalization algorithms.</p>
      <p>Different metrics can be defined to measure QoS for different kinds of services. For
instance, for the friend-finder service we are considering, it would be possible to
measure how many times the generalization leads the SP to return an incorrect
result, i.e., the issuer is not notified of a close-by friend or, vice versa, the issuer
is notified of a friend that is not close-by. While this metric is useful for this
specific application, we want to measure the QoS independently of the specific
kind of service. For this reason, in this paper we evaluate how QoS degrades in
terms of the perimeter of the generalized region. If the generalized region is
too large, the service becomes useless. For this purpose, we introduce a new
parameter, called maxP, that indicates this threshold in terms of the maximum
perimeter. We assume that no request is sent to the SP with a perimeter larger
than maxP.</p>
      <p>In our experiments we used two datasets of user movements. The dataset AB
(Agent-Based) was generated with the customized Siafu simulator as described
in Section 3.3, while the dataset MRM (Mostly Random Movement) was created
with the Brinkhoff simulator as described in Section 3.2. In both cases, we
simulate LBS requests for the friend-finder service by choosing random users in the
simulation; for each request we compute the generalization according to a given
algorithm, we evaluate QoS as explained in Section 4.1, and we evaluate privacy according to
formula (1) presented in Section 2.</p>
      <p>The most important parameters that characterize the simulations are
reported in Table 1, with the values in bold denoting default values. The number
of users indicates how many users are in the simulation, and the simulations
are designed so that this number remains almost constant at each time instant.
Every two minutes, each user has a probability Preq of issuing a request. For
technical reasons, the reported tests are based on a time frame of three hours
over the total six hours of the MilanoByNight scenario. This implies that in the
default case we consider a total of 45 requests (one every four minutes of the
considered time frame). The parameter Pid_in indicates the probability that a
user is identified when she is located in an entertainment place, while Pid_out is
the probability that a user is identified in any other location (e.g., while moving
from home to an entertainment place). While we also perform experiments where
the two probabilities are the same, our scenario suggests that a higher value for
Pid_in is much more realistic (it is considered ten times higher than Pid_out). This
is due to the fact that restaurants, pubs, movie theaters, and similar places are
likely to have different ways of identifying people (fidelity or membership cards, wi-fi
hotspots, cameras, credit card payments, etc.), and in several cases more than
one place is owned by the same company, which may have an interest in collecting
data about its customers.</p>
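      <p>The identification model just described can be sketched in a few lines. This is our own toy formulation, not the paper's code; the default probability values here are placeholders chosen only to respect the stated ten-to-one ratio between Pid_in and Pid_out.</p>

```python
# Toy model: a user is identified with probability pid_in inside an
# entertainment place and pid_out (ten times lower by default) elsewhere.
import random

def is_identified(in_entertainment_place, pid_in=0.1, pid_out=0.01, rng=random):
    """Return True if the adversary identifies the user at this location."""
    p = pid_in if in_entertainment_place else pid_out
    return rng.random() < p
```
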
      <p>Finally, Plink indicates the probability that two consecutive requests can be
identified as issued by the same user.6 While we perform our tests considering
a full range of values, the specific default value reported in the table is due
to a recent study on the ability to link positions based on spatio-temporal
correlation [18].</p>
      <p>The experimental results we show in this section are obtained by running the
simulation for 100 issuers and then computing the average values.</p>
      <p>4.3 The Generalization Algorithms Used in the Experiments</p>
      <p>In our experiments we evaluate the privacy and the QoS of requests generalized
by using two algorithms previously proposed in the literature. The first one,
called Grid, was presented in [19], and it is used as a representative of several
algorithms aimed at guaranteeing k-anonymity in the snapshot case, i.e., these
algorithms do not take into account the linking ability of the adversary. Intuitively,
this particular algorithm partitions all users into blocks, each one having
cardinality at least k. Then, it computes the generalized region as the minimum
bounding rectangle (MBR) that covers the locations of the users in the same
block as the issuer.</p>
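      <p>The generalization step can be illustrated as follows. This is not the exact Grid algorithm of [19]; the blocking rule used here (chunking users sorted by their x coordinate) is our own simplification, kept only to show the block-then-MBR structure.</p>

```python
# Put the issuer in a block of at least k users and return the minimum
# bounding rectangle (MBR) of that block.

def generalize(issuer, users, k):
    """issuer: (x, y) tuple that must appear in users; returns (x1, y1, x2, y2)."""
    users = sorted(users, key=lambda u: u[0])          # simplistic 1-D blocking
    blocks = [users[i:i + k] for i in range(0, len(users), k)]
    if len(blocks) > 1 and len(blocks[-1]) < k:        # merge a too-small tail block
        blocks[-2].extend(blocks.pop())
    block = next(b for b in blocks if issuer in b)
    xs = [u[0] for u in block]
    ys = [u[1] for u in block]
    return (min(xs), min(ys), max(xs), max(ys))
```

      <p>Whatever blocking rule is used, every block has cardinality at least k, so the returned MBR always covers the locations of at least k users.</p>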
      <p>The second algorithm, Greedy, was first proposed in [14] and a similar idea
was also described in [15]. The use of Greedy is intended to represent algorithms
aimed at preserving privacy in the historical case, i.e., the general CH context,
assuming that the attacker may actually obtain and recognize traces of requests
from the same issuer. This algorithm computes the generalization of the first
request r in a trace using an algorithm for the snapshot case. While doing this,
the set A of users located in the generalized region is stored. The generalized
region of the successive request r' linked with r is then computed as the MBR
of the locations of the users in A at the time of r'. In our implementation we
use Grid as the snapshot algorithm to compute the generalization of the first
request.
6 The limitation to consecutive requests is because in our specific scenario we assume
linking is performed mainly through spatio-temporal correlation.</p>
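      <p>The core of the Greedy idea can be sketched with a minimal data model of our own (not the authors' implementation): remember the user set A covered by the snapshot generalization of the first request, then generalize each successive linked request as the MBR of A's current locations.</p>

```python
# Greedy-style historical generalization: keep the anonymity set A fixed
# across linked requests and re-cover its members' current positions.

def mbr(points):
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))

class Greedy:
    def __init__(self, snapshot_block):
        # snapshot_block: users covered by the snapshot generalization of r
        self.a_set = list(snapshot_block)

    def generalize_next(self, positions):
        """positions: dict user -> (x, y) at the time of the linked request r'."""
        return mbr([positions[u] for u in self.a_set])

g = Greedy(["issuer", "u1", "u2"])
box = g.generalize_next({"issuer": (0, 0), "u1": (10, 5), "u2": (4, 12)})
# box == (0, 0, 10, 12): the MBR follows A's members wherever they moved
```
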
      <p>For the purpose of our tests, we modified the two algorithms above so that
each generalized region has a perimeter always smaller than maxP. To achieve
this, if the perimeter of the generalized region is larger than maxP, the
region is iteratively shrunk, until its perimeter is below maxP, by excluding
from the MBR the user that is farthest from the issuer. In the Greedy algorithm,
when a user is excluded from the generalized region, she is also excluded from
the set A of users, and hence she is not used in the generalization of the
successive requests. Eventually, when the set A contains only the issuer, a snapshot
generalization is executed again and A is reinitialized.</p>
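      <p>The shrinking step described above can be sketched as follows, under our own geometric simplifications (points in the plane, axis-aligned MBR): while the MBR's perimeter exceeds maxP, drop the user farthest from the issuer and recompute the MBR.</p>

```python
# Iteratively shrink a generalized region until its perimeter is below maxP.

def mbr(points):
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))

def perimeter(box):
    x1, y1, x2, y2 = box
    return 2 * ((x2 - x1) + (y2 - y1))

def shrink_to_max_perimeter(issuer, others, max_p):
    """issuer: (x, y); others: locations of the other users in the block."""
    # Sort by squared distance from the issuer, farthest last.
    block = sorted(others,
                   key=lambda p: (p[0] - issuer[0]) ** 2 + (p[1] - issuer[1]) ** 2)
    while block and perimeter(mbr([issuer] + block)) > max_p:
        block.pop()  # exclude the user farthest from the issuer
    return mbr([issuer] + block), block
```

      <p>Note that the region always keeps the issuer, so in the worst case it degenerates to the issuer's own location, which is when Greedy reinitializes A with a new snapshot generalization.</p>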
      <p>In addition to the input request r and the locations of all the users in the
system, the considered algorithms require two additional parameters: the value
k and the threshold maxP. In our tests, we used values for k between 10 and
60 (default is 10) and values for maxP between 1000 and 4000 meters (default is
1000 meters).</p>
      <p>In our experimental results we also evaluated the privacy threat when no
privacy-preserving algorithm is applied. The label NoAlg is used in the figures
to identify results in this particular case.</p>
      <p>4.4 Impact of Simulation Parameters in the Evaluation of the
Generalization Algorithms</p>
      <p>The objective of the first set of experimental results we present is to show which
parameters of the simulation most affect the evaluation of the generalization
algorithms. In these tests we used the AB dataset only.</p>
      <p>Figure 2(a) shows that the average privacy obtained with Greedy and Grid
is not significantly affected by the size of the total population. Indeed, both
algorithms, independently of the total number of users, try to obtain generalized
regions that cover the locations of k users, so the privacy of the requests is not
affected. However, when the density is high, the two algorithms can generalize
to a small area, while when the density is low, a larger area is necessary to
cover the locations of k users (see Figure 2(b)). On the contrary, the privacy
obtained when no generalization is performed is significantly affected by the
total population. Indeed, a higher density increases the probability of different
users being in the same location, and hence it increases privacy even if the requests
are not generalized.</p>
      <p>A parameter that significantly affects the average privacy is the probability
of identification of a user in a certain place. In Figure 3 we show the
experimental results for different values of Pid_in when, in each test, Pid_out is set to
Pid_in/10. As expected, considering a trace of requests, the higher the
probability of identifying users in one or more of the regions from which the requests
in the trace were performed, the smaller the level of privacy.</p>
      <p>[Figure 2: (a) average privacy and (b) size of the generalized region obtained by Greedy, Grid, and NoAlg as the number of users varies from 40000 to 100000.]</p>
      <p>Figure 4(a) shows the impact of Plink on the average privacy. As expected,
high values of Plink lead to small values of privacy. Our results show that the
relation between Plink and privacy is not linear. Indeed, privacy depends
almost linearly on the average length of the traces identified by the adversary
(Figure 4(b)). However, the average length of the traces grows almost
exponentially with the value of Plink (Figure 5).</p>
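      <p>The blow-up of trace length as Plink approaches 1 can be illustrated with a toy construction of our own (not the paper's experiment): if each pair of consecutive requests is linked independently with probability p, traces are runs whose expected length is 1/(1 - p), which grows very fast near p = 1.</p>

```python
# Estimate the average length of linked traces for a given link probability.
import random

def average_trace_length(n_requests, p_link, rng):
    lengths, current = [], 1
    for _ in range(n_requests - 1):
        if rng.random() < p_link:
            current += 1          # linked: the trace keeps growing
        else:
            lengths.append(current)
            current = 1           # link broken: a new trace starts
    lengths.append(current)
    return sum(lengths) / len(lengths)

rng = random.Random(0)
for p in (0.5, 0.9, 0.99):
    # approaches 1/(1 - p): about 2, 10, and 100 respectively
    print(p, average_trace_length(100000, p, rng))
```
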
      <p>To summarize the first set of experiments, our findings show that many
parameters of the simulation significantly affect the evaluation of the generalization
algorithms. This implies that when a generalization algorithm is evaluated, it is
necessary to carefully estimate realistic values for the parameters of the
simulation. Indeed, an error in the estimation may lead to misleading results.</p>
      <p>4.5 Impact of the User Movements on the Evaluation of the
Generalization Algorithms</p>
      <p>The objective of the second set of experiments is to answer an important
question posed in this paper: what is the impact of the different simulated user
movements on the evaluation of the generalization algorithms? We answer
this question with a set of tests performed on the two different datasets we
obtained as described above.</p>
      <p>The first set of tests, reported in Figure 6, compares the privacy achieved
by the Greedy algorithm on the two datasets for different values of k and for
different values of QoS. The experiments on MRM were repeated trying also
larger values for the QoS threshold (maxP = 2000 and maxP = 4000), so three
different versions of MRM appear in the figures. In order to focus on these
parameters only, in these tests the probability of identification was set to the
same value for every place (Pid_in = Pid_out = 0.1), and for the MRM dataset
the issuer of the requests was randomly chosen only among those users that stay
in the simulation for 3 hours, ignoring the ones staying for a much shorter time
that are inevitably part of this dataset. This setting allowed us to compare the
results on the two datasets using the same average length of traces identified by
the adversary.</p>
      <p>Figure 6(a) shows that the average privacy of the algorithm evaluated on
the AB dataset is much higher than on the MRM dataset. This is mainly
motivated by the fact that in AB users tend to concentrate in a few locations
(the entertainment places), and this enhances privacy. This is also confirmed by a
similar test performed without using any generalization of locations; we obtained
values constantly higher for the AB dataset (the average privacy is 0.67 in AB
and 0.55 in MRM).</p>
      <p>In Figure 6(b) we show the QoS achieved by the algorithm on the two datasets
with respect to the average privacy achieved. This result confirms that the level
of privacy evaluated on the AB dataset using small values of k and maxP for
the algorithm cannot be observed on the MRM dataset even with much higher
values for these parameters.</p>
      <p>From the experiments shown in Figure 6 we can conclude that if the MRM
dataset is used as a benchmark to estimate the values of k and maxP that
are necessary to provide a desired average level of privacy, then the results will
suggest the use of values that are over-protective. As a consequence, it is possible
that the service will exhibit a much lower QoS than the one that could be
achieved with the same algorithm.</p>
      <p>1
iycva 0.9
r
p
e
g
a
rve 0.8
A
1
0.95</p>
      <p>The above results may still suggest that using MRM is safe, since, according
to what we have seen above, a technique achieving a certain level of privacy may
only do better in a real scenario. However, our second set of experiments shows
that this is not the case.</p>
      <p>In Figure 7 we show the results we obtained by varying the probability of
identification. For this test, we considered two sets of issuers in the MRM
dataset. One set is composed of users that stay in the simulation for 3 hours (MRM
long traces in Figure 7), while the other contains issuers randomly chosen from the
entire set of users (MRM all traces in Figure 7), hence including users staying
in the simulation for a much shorter time.</p>
      <p>In Figures 7(a) and 7(b) we can observe that the execution on the MRM
dataset leads to evaluating a privacy level that is higher than the one obtained
on the AB dataset. In particular, the evaluation of the Grid algorithm using
the MRM dataset (Figure 7(b)) would suggest that the algorithm is able to
provide high privacy protection. However, when evaluating the same algorithm
using the more realistic dataset AB, this conclusion turns out to be incorrect. In
this case, the evaluation on the MRM dataset may lead to underestimating the
privacy risk, and hence to deploying services based on generalization algorithms
that may not provide the minimum required level of privacy.</p>
      <p>In this paper we claim that the experimental evaluation of LBS privacy-preserving
techniques should be based on user movement datasets obtained through
simulations tailored to the specific deployment scenario of the target services.
Our results support our thesis for the class of LBS known as friend-finder
services, for techniques based on spatial cloaking, and for adversary models that
include the possibility for the adversary to occasionally recognize people in
certain locations. We believe that these results can be generalized to other LBS,
techniques, and adversary models. For example, as future work, it would be
interesting to also evaluate defense techniques that generalize the issuer's
location to an area that does not necessarily contain the issuer's location.
Moreover, in our experiments we only considered the first of the two privacy threats
presented in the introduction. We do have some ideas on how to extend them
to consider the second, location privacy, as well. Finally, we believe a significant
effort should be devoted to the development of new flexible and efficient
context-aware user movement simulators, as well as to the collection of real
data, possibly even in an aggregated form, to properly tune the simulations. In
our opinion this is a necessary step to have significant common benchmarks to
evaluate LBS privacy-preserving techniques.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>The authors would like to thank Stefano Varesi for his contribution in writing
the code that was used for our simulations. This work was partially supported
by the National Science Foundation (NSF) under grant N. CNS-0716567, and by
the Italian MIUR under grant InterLink II04C0EC1D.</p>
      <p>13. Ardagna, C.A., Cremonini, M., Damiani, E., di Vimercati, S.D.C., Samarati, P.:
Location privacy protection through obfuscation-based techniques. In: Proc. of the
21st Conference on Data and Applications Security. Volume 4602 of Lecture Notes
in Computer Science, Springer (2007) 47-60
14. Bettini, C., Wang, X.S., Jajodia, S.: Protecting privacy against location-based
personal identification. In: Proc. of the 2nd Workshop on Secure Data Management.
Volume 3674 of LNCS, Springer (2005) 185-199
15. Xu, T., Cai, Y.: Location anonymity in continuous location-based services. In:
Proc. of the ACM International Symposium on Advances in Geographic Information
Systems, ACM Press (2007)
16. Abul, O., Bonchi, F., Nanni, M.: Never walk alone: Uncertainty for anonymity in
moving objects databases. In: Proc. of the 24th International Conference on Data
Engineering, IEEE Computer Society (2008)
17. Terrovitis, M., Mamoulis, N.: Privacy preservation in the publication of
trajectories. In: Proc. of the 9th International Conference on Mobile Data Management,
IEEE Computer Society (2008)
18. Vyahhi, N., Bakiras, S., Kalnis, P., Ghinita, G.: Tracking moving objects in
anonymized trajectories. In: Proc. of the 19th International Conference on Database
and Expert Systems Applications, Springer (2008, to appear)
19. Mascetti, S., Bettini, C., Freni, D., Wang, X.S.: Spatial generalization algorithms
for LBS privacy preservation. Journal of Location Based Services 2(1) (2008)</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Barkhuus</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dey</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Location-based services for mobile telephony: a study of users' privacy concerns</article-title>
          .
          <source>In: Proc. of the 9th International Conference on Human-Computer Interaction</source>
          , IOS Press (
          <year>2003</year>
          )
          <volume>709</volume>
          -
          <fpage>712</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bettini</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mascetti</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>X.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jajodia</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>Anonymity in location-based services: towards a general framework</article-title>
          .
          <source>In: Proc. of the 8th International Conference on Mobile Data Management, IEEE Computer Society</source>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Brinkhoff</surname>
          </string-name>
          , T.:
          <article-title>A framework for generating network-based moving objects</article-title>
          .
          <source>GeoInformatica</source>
          <volume>6</volume>
          (
          <issue>2</issue>
          ) (
          <year>2002</year>
          )
          <volume>153</volume>
          -
          <fpage>180</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Martin</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nurmi</surname>
            ,
            <given-names>P.:</given-names>
          </string-name>
          <article-title>A generic large scale simulator for ubiquitous computing</article-title>
          .
          <source>In: 3rd Annual International Conference on Mobile and Ubiquitous Systems: Networking &amp; Services, IEEE Computer Society (July</source>
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Gonzalez</surname>
            ,
            <given-names>M.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hidalgo</surname>
            ,
            <given-names>C.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barabasi</surname>
            ,
            <given-names>A.L.</given-names>
          </string-name>
          :
          <article-title>Understanding individual human mobility patterns</article-title>
          .
          <source>Nature</source>
          <volume>453</volume>
          (
          <year>June 2008</year>
          )
          <volume>779</volume>
          -
          <fpage>782</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Ghinita</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalnis</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khoshgozaran</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shahabi</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tan</surname>
            ,
            <given-names>K.L.</given-names>
          </string-name>
          :
          <article-title>Private queries in location based services: Anonymizers are not necessary</article-title>
          .
          <source>In: Proc. of SIGMOD</source>
          , ACM Press (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Gruteser</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grunwald</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Anonymous usage of location-based services through spatial and temporal cloaking</article-title>
          .
          <source>In: Proc. of the 1st International Conference on Mobile Systems, Applications and Services (MobiSys)</source>
          ,
          <source>The USENIX Association</source>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Mokbel</surname>
            ,
            <given-names>M.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chow</surname>
            ,
            <given-names>C.Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aref</surname>
            ,
            <given-names>W.G.</given-names>
          </string-name>
          :
          <article-title>The new casper: query processing for location services without compromising privacy</article-title>
          .
          <source>In: Proc. of the 32nd International Conference on Very Large Data Bases, VLDB Endowment</source>
          (
          <year>2006</year>
          )
          <volume>763</volume>
          -
          <fpage>774</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Gedik</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Protecting location privacy with personalized k-anonymity: Architecture and algorithms</article-title>
          .
          <source>IEEE Transactions on Mobile Computing</source>
          <volume>7</volume>
          (
          <issue>1</issue>
          ) (
          <year>2008</year>
          )
          <volume>1</volume>
          -
          <fpage>18</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Kalnis</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghinita</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mouratidis</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Papadias</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Preventing location-based identity inference in anonymous spatial queries</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>19</volume>
          (
          <issue>12</issue>
          ) (
          <year>2007</year>
          )
          <volume>1719</volume>
          -
          <fpage>1733</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Kido</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yanagisawa</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Satoh</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Protection of location privacy using dummies for location-based services</article-title>
          .
          <source>In: Proc. of the 21st International Conference on Data Engineering Workshops</source>
          , IEEE Computer Society (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Yiu</surname>
            ,
            <given-names>M.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jensen</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lu</surname>
          </string-name>
          , H.:
          <article-title>Spacetwist: Managing the tradeo s among location privacy, query performance, and query accuracy in mobile services</article-title>
          .
          <source>In: Proc. of the 24th International Conference on Data Engineering</source>
          , IEEE Computer Society (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>