=Paper=
{{Paper
|id=Vol-1558/paper3
|storemode=property
|title=Which is Better for kNN Query Processing in the Cloud: Sequential or Parallel
|pdfUrl=https://ceur-ws.org/Vol-1558/paper3.pdf
|volume=Vol-1558
|authors=Chong Zhang,Xiaoying Chen,Bin Ge,Weidong Xiao
|dblpUrl=https://dblp.org/rec/conf/edbt/ZhangCGX16
}}
==Which is Better for kNN Query Processing in the Cloud: Sequential or Parallel==
<pdf width="1500px">https://ceur-ws.org/Vol-1558/paper3.pdf</pdf>
<pre>
    Which is Better for kNN Query Processing in the Cloud:
                     Sequential or Parallel∗

                              Chong Zhang1 , Xiaoying Chen2 , Bin Ge3 , Weidong Xiao4
                         ∗
                             Science and Technology on Information Systems Engineering Laboratory
                              National University of Defense Technology, Changsha 410073, China
                                +
                                  Collaborative Innovation Center of Geospatial Technology, China

                                 {1 leocheung8286, 2 chenxiaoying1991}@yahoo.com
                                 gebin1978@gmail.com, 4 wilsonshaw@vip.sina.com
                                 3


ABSTRACT                                                                 1.   INTRODUCTION
With the development of various Cloud system, providing                     With the development of Cloud computing, various lay-
powerful kNN query capability to DaaS (Database as a Ser-                ers of computing resources are used in terms of pay-as-you-
vice) is an essential requirement for many applications. In              go, such as IaaS (Infrastructure as a Service), PaaS (Plat-
this paper, we are interested in two opposite approaches for             form as a Service) and SaaS (Software as a Service). Nowa-
processing kNN query in Cloud system, parallel processing                days, Database as a Service (DaaS)[2][13] is a hot topic for
and sequential processing, and we want to explore the an-                database community in Cloud computing era. For DaaS
swer of which one performs better. For addressing such a                 users, it is not necessary to focus on the location of database
question, we devise a new distributed indexing structure VI-             instance, nor the physical storage mechanism of schema or
HCO, which is characterized by fast locating Cloud nodes                 tables, not even data partition fashion or query processing,
capability. Then parallel and sequential processing methods              in one word, the inner of database is transparent to the
are designed upon the structure. For parallel one, we take               users. They just define the structure of table, and insertion,
differential cells between two consecutive range queries into            query or other operations seem similar to use a centralized
consideration, and for sequential one, we elaborately design             local database. However, it is possible for one table, data are
an accurate message delivery algorithm. We verify our ideas              spread over many computing nodes, and querying processing
through experiments, which is conducted on both synthetic                needs the collaboration of these nodes. And as data volume
and real dataset, and the results show that VIHCO outper-                increases, the database should be adaptive to the new scale
forms a previous work RT-CAN, and the sequential method                  and new query requirement, i.e., it should be elastic.
is more efficient under small k query condition and small                   In this paper, we focus on kNN query in DaaS, which is
system size, while parallel one suits for large k and large              an essential function for spatial database. Given a point
scale of computing nodes.                                                in the space, kNN query aims to find k nearest objects to
                                                                         the query point. This topic is addressed well in some previ-
                                                                         ous works[13][9][14], however, there are two opposite ideas
Categories and Subject Descriptors                                       to solve the problem in the state-of-the-art, namely, parallel
H.2.4 [Database Management]: Systems - query processing                  processing and sequential processing. Parallel method ex-
                                                                         ploits the parallelism of Cloud nodes, and make them work
                                                                         simultaneously, while sequential one uses the vicinity rela-
General Terms                                                            tionship between query point and Cloud nodes to accurately
Algorithms, Measurement, Performance                                     deliver query messages. Nevertheless, which is better for
                                                                         DaaS is not studied before. Hence, in this paper, we extend
                                                                         our work in [14] to acquire the answer.
Keywords                                                                    For comparing the two approaches, we use a previous work
kNN query, Cloud, histogram, parallel, sequential                        RT-CAN as a baseline, and propose a new structure, called
                                                                         VIHCO (VIcinity-based Hilbert Cloud Overlay), to index
∗This work is supported by NSF of China grant 61303062                   spatial data in Cloud system and to process kNN query.
and 71331008.                                                            The feature of VIHCO is not only leveraged on fast look up
                                                                         routing table (finger table), but also highlighted on vicinity
                                                                         neighbors to quickly locate the nearby Cloud nodes. Based
                                                                         on such structure, we present the designs of parallel and
                                                                         sequential processing algorithms. Experiments on both syn-
                                                                         thetic and real dataset show that VIHCO outperforms RT-
                                                                         CAN, in efficiency and scalability, and the sequential method
 c 2016, Copyright is with the authors. Published in the Workshop Pro-   is more proper under small k query condition and small sys-
ceedings of the EDBT/ICDT 2016 Joint Conference (March 15, 2016, Bor-
deaux, France) on CEUR-WS.org (ISSN 1613-0073). Distribution of this     tem size, while parallel one suits for large k and large scale
paper is permitted under the terms of the Creative Commons license CC-   of computing nodes.
by-nc-nd 4.0
     To summarize, we make the following contributions:           3.    PROBLEM DEFINITION AND OVERLAY
      • We propose the problem of comparing performances
                                                                        STRUCTURE
        between parallel and sequential processing for kNN
        query in Cloud computing architecture.                    3.1    Problem Definition
                                                                     In this paper, only two-dimensional space is considered.
      • We design an indexing structure VIHCO capable to          A Cloud platform contains a set C of Cloud nodes, C={cni |
        fast locate Cloud nodes.                                  1 ≤ i ≤ n}, where cni denotes the i-th Cloud node with
                                                                  stand alone storage and computational capability. A set O
      • We devise parallel and sequential processing algorithms   of spatial objects is spread over Cloud nodes C according to
        based on VIHCO, and conduct experiments to com-           the underlying Cloud storing policy. And each object obj is
        pare their performances under different parameters.       represented as (x, y, shp), where (x, y) denotes the location
                                                                  of obj, and shp is a set of points denoting the shape of obj.
  The rest of this paper is organized as follows. Section 2       If obj is a point object, shp is null. Otherwise, i.e., obj is
reviews related works. Section 3 formally defines the prob-       an object with spatial extent, an MBR (Minimal Bounding
lem and presents VIHCO structure. Algorithms for parallel         Rectangle) is used as the approximated shape of the obj,
and sequential processing are presented in section 4 and 5,       and (x, y) means the center of the MBR.
respectively. And we experimentally evaluate VIHCO and               A kNN query could be formally defined as: given a set
acquire comparison answers in section 6. Finally, section 7       O of spatial objects, a kNN query (q=(xq , yq ), k), aims to
concludes the paper with directions for future works.             find a set Oq ⊆ O , such that |Oq |=k, and d(o, q) ≤ d(o0 , q),
                                                                  ∀o ∈ Oq , o0 ∈ O \ Oq , where d() is the Euclidean distance
                                                                  function.
2.     RELATED WORKS
   A similar work RT-CAN solving multidimensional queries         3.2    Overlay Structure
in Cloud system is proposed in [13]. Each peer firstly builds
                                                                    We propose an overlay structure called VIHCO (Vicinity-
R-tree to index multidimensional data locally, and then se-
                                                                  based Hilbert Cloud Overlay), which takes advantage of
lects the nodes in the level above the leaf level of R-tree to
                                                                  both linearization of Hilbert curve[5] and space vicinity. The
publish to CAN by interacting with overlay node. RT-CAN
                                                                  overview of the structure is presented in Figure 1.
addresses point query, range query and kNN query for multi-
dimensional data in Cloud system. Later, the authors pro-                                                 key=60
posed a framework for supporting DBMS-like indexes in the                                                          cn5

Cloud[2], enabling users to define their own indexes without                                                                        cn4 key=8
                                                                                                        key=52                         cn8 key=10
knowing the complicated structure of the underlying net-                                               cn3                                     11

work, and three conventional indexes, namely hash indexes,                                                                                 12


B+ -tree-like indexes and multi-dimensional indexes are im-                                       key=48
                                                                                                  cn2
                                                                                                                                      key=14
                                                                                                                                          cn1
plemented. The results of work [14] prove that our proposed
structure DRISTIX is better than RT-CAN, and in this pa-                                                                                  18

per, we extend our last work to exclusively study kNN query                                       42
                                                                                                       cn6
processing.                                                                                             key=41
                                                                                                                                cn7
   Several works from moving objects query and sensor net-                                                                            key=24
                                                                                                                               26
work are also related to our work. DTI[6] is a distributed
index for trajectory queries on moving objects. For each                        (a)                                      (b)
moving object, it builds a SkipList overlay to index trajec-
tory. This method suffers from high overhead, because it                         Figure 1: Overview of VIHCO
maintains complicated overlay connection when the number
of moving objects is large. Hua Lu et al.[7] address con-            The whole space is partitioned by Hilbert curve, in par-
tinuous queries for monitoring moving objects constrained         ticular, for a λ-order Hilbert, the space is divided into 22λ
in road network in a distributed environment. However,            cells, and each cell is attached a Hilbert value varied from
they adopt a central server to control the whole distributed      0 to 22λ − 1. Initially, Cloud nodes are formed into over-
queries, which is a bottle neck in real application. DIST[8]      lay network using Chord protocol[10], a distributed hash-
addresses tracking moving object in sensor network, it uses       based (key-value) P2P network structure. Formally, a set
quad-tree to divide space recursively and builds distributed      of 2m consistent keys [0, 2m − 1] is maintained by a set of
index hierarchically among sensors, and adopts efficient up-      Cloud nodes, and each Cloud node is exclusively respon-
date to reduce maintenance cost. However, due to lack of          sible for maintaining a segment [keys , keye ] (0 ≤ keys ≤
scalability of tree structure, DIST suffers bottle neck prob-     keye ≤ 2m − 1), each key of which is associated with data
lem.                                                              items (value). According to the order of the segment main-
   A related work addresses multi-dimensional queries for         tained in the key domain, each Cloud node connects to its
PaaS, MD-HBase[9] uses k-d tree and quad-tree to partition        predecessor node and successor node, in addition, it main-
space and adopts Z-curve to convert multidimensional data         tains a routing table called finger table containing m entries
points into one dimension, and supports multi-dimensional         where the i-th entry points the node maintaining the key
range and nearest neighbor queries, which leverages a mul-        keye +2i−1 . For the Cloud nodes maintaining the first and
tidimensional index structure layered over HBase. However,        the last segments of keys, they are also connected by each
this work is limited in HBase, it could not be generalized to     other as predecessor and successor, so the whole structure
many Cloud system.                                                looks like a ring. Figure 1(b) illustrates an example, node
cn8 maintains key segment [9, 10], with connecting prede-                             est Cloud node to q, say cnq , is found, and then according to
cessor cn4 and successor cn1 , for maintaining figure table,                          global Hilbert partition policy, cnq finds the set SCq of cells
cn8 finds the nodes which maintain the key 10+20 , 10+21 ,                            intersecting with the circle, after that, the cells with contigu-
10+22 , 10+23 , 10+24 , 10+25 , respectively, resulting cn1 ,                         ous Hilbert values are formed into intervals, i.e., SCq ={[csi ,
cn7 , cn6 , cn2 .                                                                     cei ] | si ≤ei }, and then cnq uses its vicinity neighbors to
   For simplicity, let 2λ = m, which means that each cell in                          search these cells, in particular, for each interval [csi , cei ],
the space is mapped to a key on the ring, i.e., each Cloud                            if there is a cell ck being cnq ’s left, lower, right or upper
node cni is responsible for maintaining a set of cells which                          cell, then cnq sends the interval to the corresponding vicinity
are continuous in Hilbert value domain, and we call these                             neighbor, say cnv , to search the results, otherwise, i.e., there
cells as cni ’s charge-cell s and cni is these cells’ charge-node.                    is no cell being cnq ’s neighbor cell, cnq uses its predecessor,
The term charge here means, if an object obj spatially in-                            successor and/or finger table to find the proper charge-nodes
tersects with cell cj , obj is maintained (stored and indexed)                        responsible for these cells to retrieve the results. Figure 2
by cj ’s charge-node. Figure 1 illustrates an example of VI-                          illustrates an example, assuming cn8 (key=10) is the nearest
HCO, the whole space is partitioned by a 3-order Hilbert                              Cloud node to query point q, after obtaining radius r1 , cn8
(Figure 1(a)), and there are 8 Cloud nodes, namely, cn1 to                            finds SCq ={[6, 8], 11, [30, 32], [53, 54], 57}, note that cell
cn8 , which connect to each other, forming a Chord (Figure                            9 and 10 are cn8 ’s charge-cell s, so they are not contained
1(b)). Hence, for cn8 , whose key is 10, its charge-cell s are                        in SCq . Then, for each interval (or value) in SCq , say [6,
cell 9 and 10, similarly, cn2 (key=48) is responsible from cell                       8], cn8 examines that cell 6 is vicinity cell, so [6, 8] is sent
42 to 48.                                                                             to cn4 , similarly, 11, [30, 32] and [53, 54] are sent to cn1 ,
   Hilbert curve merely tries best to preserve locality of origi-                     cn6 and cn5 , respectively, and then cell 57 is routed through
nal space, hence, some spatial relation is lost after lineariza-                      looking up finger table and is forwarded to cn5 .
tion, e.g., in Figure 1, cell 10 and cell 53 are adjacent to                             After retrieving objects, cnq examines the number of re-
each other in original space, but they are distinguished by 43                        sults, if it is not less than k, then cnq sorts these objects
in linear space, which consequently increases complexity of                           according to the distance to q and returns results, other-
query processing. So we propose a vicinity-based approach                             wise, i.e., there are less than k objects, cnq increases r by δ,
to improve performance. Each Cloud node not only main-                                a new range query (q, r+δ) is formed, and the intersecting
tains finger table, but also connects to its vicinity neighbors,                      cells are also calculated and sent to corresponding charge-
in particular, for each Cloud node cni , assuming the Chord                           nodes. Then the charge-nodes only search for the new com-
key of it is keyi , then the vicinity neighbors are charge-nodes                      ing cells and return results to cnq , and again cnq examines
of cell keyi ’s left, lower, right, upper cells, if the charge-node                   the number, and the above procedure is repeated until k
of left (lower, right, upper) cell is cni itself, then continue                       objects are obtained. Continuing the example in Figure 2,
searching the charge-node of left (lower, right, upper) of left                       when cn8 finds that the number of results is less than k,
(lower, right, upper) cell, until the charge-node is not cni                          it issues a new range query with radius r2 , and the corre-
itself. Take cn8 as an example in Figure 1(a), its Chord key                          sponding cells are sent to charge-nodes until k objects are
is 10, and left, lower, right, upper cells of cn8 are 11, 9, 53,                      retrieved, the iteration is terminated.
31, but cell 9 is cn8 ’s charge-cell, so cell 6 is selected, and
the vicinity neighbors are cn1 , cn4 , cn3 and cn6 .                                  Algorithm 1 kNN Query Parallel Processing
                                                                                      Input:
4.      PARALLEL PROCESSING                                                               q=(xq , yq ), k
   In this section, parallel processing for kNN query is pre-                         Output:
sented. The main workflow is leveraging a series of range                                 Qlist //result list
queries to accomplish kNN query. In particular, firstly, an                            1: r = estimateR(k)
initial radius r is calculated by k and estimation of spa-                             2: cnq = f indN earestCN (q)
tial objects distribution[1][11], then a range query (q, r) is                         3: while true do
formed, i.e., to find objects located in the circular range                            4:   SCq = cnq .f indIntersectingCells(q, r)
centered at q, with the radius r.                                                      5:   for each interi ∈ SCq do
                                                                                       6:      if (cnv =cnq .vN eighbor(interi ))6=null then
                                            key=60                                     7:         cnq .sendTo(cnv , interi )
                                                     cn5

                                                                      cn4 key=8
                                                                                       8:      else
                                          key=52                         cn8 key=10    9:         cnq .forward(interi )
                                         cn3
                                                                             12
                                                                                 11
                                                                                      10:      end if
                                    key=48                              key=14        11:   end for
                                    cn2                                     cn1
                                                                                      12:   ret=cnq .receiveResults()
              r1                                                                      13:   if |ret|≥k then
                   q r2                                                     18
                                                                                      14:      return k nearest neighbors to q
                                    42
                                         cn6
                                          key=41                                      15:   else
                                                                  cn7
                                                                        key=24        16:      r = r+δ
                                                                 26                   17:   end if
                   (a)                                     (b)                        18: end while

                     Figure 2: Parallel Processing                                      Algorithm 1 presents our parallel processing for kNN query.
                                                                                      Line 1 calculates the radius r, and then the nearest Cloud
     For processing such range query, first, the spatially near-                      node cnq to q is found in line 2. From line 3 to the end,
range queries are issued, and the returned results are exam-        query then replying are called responders. Next issue is that
ined until k objects are obtained. Node cnq first finds the         we consider both point objects and extent objects (the ob-
intersecting cells SCq by (q, r) (line 4), and then for each        jects with spatial shape), so for an extent object, it may
interval in SCq , cnq uses vicinity neighbor (line 6 and 7)         cross more than one cells, which means one object might be
or finger table (line 8 and 9) to deliver the query messages,       stored in different Cloud nodes, thus when query is sent to
and then checks the number of returned results, until gets k        different nodes, the results returned may contain duplicates,
objects.                                                            hence the requester should detect duplicates and eliminate
                                                                    them. Another issue is that, when a responder cnj receives
5.    SEQUENTIAL PROCESSING                                         query message, it would be costly for cnj returning all ob-
                                                                    jects maintained by it, hence cnj should just return the ob-
   In this section, we present the sequential processing for
                                                                    jects with distance to q not larger than that of the current
kNN query, which is a progressive procedure to obtain k ob-
                                                                    top element in the queue, however, such a fashion would lost
jects. The difference is that, parallel processing is composed
                                                                    cnj forever, i.e., the rest objects in cnj have no chance to be
of a series of range queries, with constantly extending radius
                                                                    scanned. So we propose a concept called map object, which
of the circular range to retrieve results, while for sequential
                                                                    means this map object is just attached to the current results
processing, the query is progressively delivered to the cor-
                                                                    returned to requester, but it is not a result for this time, it
responding Cloud nodes, i.e., the Cloud nodes receive the
                                                                    might be a result in the following iterations, and its function
query in an order according to the distance to the query
                                                                    is to lead the requester to send query to cnj if necessary. In
point. Although for parallel processing, query messages are
                                                                    the following, we present the description for sequential pro-
almost delivered to different Cloud nodes at the same time,
                                                                    cessing, assuming that, requester cnq issues a kNN query (q,
and then nodes process local search simultaneously, the crit-
                                                                    k), the processing in requester and responders are described
ical point in parallel processing is the estimation of range ra-
                                                                    below:
dius r, i.e., with a large r, k results would be retrieved at one
                                                                       (1) Processing in Requester. A min-priority queue P Q is
time, but some useless objects beyond k objects are also ob-
                                                                    used here to store three kinds of elements: b-MBR (bM), ob-
tained, which promises waste of communication, while with
                                                                    ject (obj) and map object (mobj). The mobj is a structure
a small r, some query messages would be repeatedly deliv-
                                                                    sent from responder, representing an obj stored in some re-
ered to the same Cloud nodes, which also lower down the
                                                                    sponder, and is formatted as hdist, {CloudN odes}, rangei,
query performance. The advantage of sequential process-
                                                                    where dist is the distance between q and the obj being rep-
ing is that, without prior knowledge of k, the results are
                                                                    resented, and {CloudN odes} is a list of identifiers of Cloud
incrementally obtained according to sorting function, with-
                                                                    nodes which are responsible for the key interval range. El-
out wasting communication cost. For achieving sequential
                                                                    ements in P Q are sorted by distance, i.e., M IN DIST (q,
process, we propose to use histogram.
                                                                    bM), distance between q and obj or mobj.dist. In order to
5.1    Design of Histogram                                          break the tie that two elements have the same distance, we
                                                                    make a convention that obj is preferred to mobj, and mobj is
  We use histogram to assist sequential processing. Each
                                                                    preferred to bM. Initially, all bMs of cnq are enqueued into
Cloud node maintains its own histogram, which is com-
                                                                    P Q, in addition, a counter c storing the number of glob-
posed of several buckets. Each bucket is formatted with
                                                                    ally retrieved objs so far is set to zero. After that, the top
hbid, range, numi, where bid identifies a bucket, and range
                                                                    element e of P Q is dequeued.
denotes an interval marked with a starting value and an
ending value of the Chord key domain, and num denotes                  • If e is a bM, cnq uses vicinity-connection or finger
the number of items in range. Different buckets’ ranges                  table to route bM-request hq, ctopdist, intervali, to
are non-overlapping with each other and all buckets’ ranges              the charge-nodes (responders) of the cells in interval,
constitute the key domain.                                               where ctopdist is distance between current top element
  For each Hilbert cell, its charge-node calculates the MBR              and q, and interval is the cell interval covered by e.
(Minimum Bounding Rectangle) of objects intersecting with                Then the thread for operating P Q is suspended until
the cell, hence for each Cloud node, there is a set of MBRs              cnq receives a reply from responders. The reply may
corresponding to its charge-cell s. Relying on continuous                contain both objs and mobj, and objs are added into
maintaining routing tables of Cloud node[4], the MBRs are                result list, and c is increased by the number of objs,
able to be disseminated to other Cloud nodes, i.e., the MBR              and then cnq examines that, if c is larger than or equal
information is piggy-back on the routing table maintenance               to k, meaning that k nearest objects are found, cnq in-
message. Such a design would reduce the communication                    forms all responders to terminate, otherwise, i.e., c is
cost, thus the whole performance would not suffer. Af-                   less than k, mobj is enqueued into P Q.
ter constant message exchanging, each Cloud node is able
to know the MBR for each Hilbert cell, and then for each               • If e is an mobj, cnq sends mobj-request hq, e.dist,
bucket in the histogram, a bucket MBR (b-MBR) is built,                  ctopdist, e.rangei to the responders e.{CloudN odes}.
which summarizes all objects in the bucket.                              Similarly, P Q is blocked until cnq receives a reply and
                                                                         follows process above.
5.2    Processing Description
                                                                       • Otherwise, i.e., e is an obj, e is added to result list and
   The main tool in sequential processing is priority queue[3],          c is increased by one. If c is equal to k, cnq informs all
which is controlled by query issuing node, the node con-                 responders to terminate the processing.
stantly fetches top element from the queue, and parses the
element into query, and deliver the query to corresponding            (2) Processing in Responder. When a responder cnj re-
charge-nodes. To clearly specify such interaction, we call          ceives a request, depending on the type of the request, two
the issuing node as requester, and the nodes receiving the          cases are discussed below:
   • If the request is a bM-request hq, ctopdist, intervali,         Algorithm 2 kNN Query Sequential Processing
     for each obj stored in cnj , which is bounded in interval,      Input:
     cnj calculates the distance between q and the obj.                  q=(xq , yq ), k
     Only the objs with the distance not larger than ctopdist        Output:
     are added to the return list. To eliminate local dupli-             Qlist //result list
     cates, we use a simple but effective method based on             1: Requester cnq ’s Process:
     the observation that, for a bM-request, the responders           2: P Q=φ, c=0
     are always located in a contiguous key set and in the            3: cnq .enqueue(bMs, P Q)
     relationship of predecessor and successor. Assuming              4: while P Q 6= φ do
     that, three responders cn1 , cn2 and cn3 are located for         5:    e=cnq .dequeue(P Q)
     a bM-request with the query key range [key1 , key2 ].            6:    if e is type of bM then
     In particular, [key1 , keya ) is maintained by cn1 , [keya ,     7:       cnq .deliver(hq, ctopdist, intervali)
     keyb ) is for cn2 and [keyb , key2 ] is for cn3 . After they     8:    else if e is type of mobj then
     finds qualified objs, cn1 sends its results to its succes-       9:       cnq .deliver(hq, e.dist, ctopdist, e.rangei)
     sor cn2 , and cn2 sends the results found by itself to          10:    else
     cn3 as well as the ones from cn1 . Now, cn3 is able to          11:       Qlist←e
     use the identifiers of objs to filter the duplicates among      12:    end if
     their findings, then a refined result list is determined.       13:    cnq updates Qlist and c when receives a reply
     After that, cn3 sends a reply to the requester, contain-        14:    if c≥k then
     ing the refined results. To determine mobj, cn1 , cn2           15:       cnq informs all related responders to terminate
     and cn3 follow the collaborative way above to find an           16:       return Qlist
     object nobj (next object) which is the immediate one            17:    end if
     to the last object in current result in ascending order         18: end while
     by distance. After that, cn3 sends a mobj-reply to              19:
     the requester, mobj=hnobj.distance, {cn1 , cn2 , cn3 },         20: Responder cnj ’s Process:
     rangei.                                                         21: msg=cnj .receive()
                                                                     22: if msg is type of bM-request then
   • If the request is a mobj-request hq, dist, ctopdist, rangei,
                                                                     23:    hq, ctopdist, intervali=parse(msg)
     the responders search the objs with the distances be-
                                                                     24:    objlist=cnj .search&F ilter(q, ctopdist, interval)
     tween dist and ctopdist, whose associated keys are in
                                                                     25: else
     range. Then similar to the above processing, the re-
                                                                     26:    hq, dist, ctopdist, rangei=parse(msg)
     sponders aggregate the results and one of them sends
                                                                     27:    objlist=cnj .search&F ilterM Obj(q, dist, ctopdist,
     a mobj-reply to the requester.
                                                                            range)
   We briefly describe sequential query processing in Algo-          28: end if
rithm 2. From line 1 to line 18, the pseudo-codes show               29: objlist0 =duplicateEliminate(objlist)
processing in requester, and the processing in responder is          30: cnj .return(objlist0 )
presented between line 20 and line 30.
   Figure 3 shows an example for kNN query sequential pro-
cessing. Assume that cn6 in Figure 1 issues a 5-NN query             mobj-request to cn4 with dist=5, ctopdist=7, and cn4 finds
specified by a spatial query point q. We illustrate three near-      obj5 and replies to cn6 , then c is set to 5, which means the
est bMs, bM1 for the bucket with range [30, 32], containing          processing is terminated.
obj3 , obj4 , with distance 0 to q; bM2 for [8, 11], containing
obj1 , obj5 , obj6 , with distance 2 to q; and bM3 for [52, 55],     6.     EXPERIMENTAL EVALUATION
containing obj1 , obj2 , with distance 3 to q. And the dis-
                                                                        We use two different datasets to evaluate our approaches.
tances for obj1 ∼ obj6 to q are 3, 7, 1, 4, 5, 2. Initially, cn6
                                                                     The first one is a synthetic dataset generated by GSTD[12]
enqueues the three bMs, resulting hbM1 (0), bM2 (2), bM3 (3)i,
                                                                     in space domain [0, 1]2 and time domain [0, 1]. In particular,
then bM1 is dequeued, because cn6 is the charge-node of cell
                                                                     500,000 objects, each specified by a two-dimensional rectan-
30 to 32, it searches locally, and finds obj3 as the first result,
                                                                     gle, are initialized at t=0, with center coordinates in Gaus-
and the mobj is obj4 , due to obj4 belonging to cn6 itself, so
                                                                     sian distribution (mean=0.5 and variance=0.1) and size uni-
obj4 is directly added into P Q, now the queue is hbM2 (2),
                                                                     formly distributed in [0, 0.1]2 . As t evolves uniformly, the
bM3 (3), obj4 (4)i. Next bM2 is dequeued, and cn6 sends a
                                                                     objects move from central part to the border of the space as
bM-request with ctopdist=3 (bM3 ’s M IN DIST ) to the re-
                                                                     well as resizing themselves, which totally results in almost
sponders cn4 , cn8 and cn1 for [8, 11], then the reply is obj6 ,
                                                                     100 million records. Figure 4 visualizes the synthetic dataset
obj1 and mobj5 (obj5 is not a result for this iteration, be-
                                                                     (for the sake of legibility, only 5,000 objects are visualized).
cause in5 ’s distance(=5) is larger than ctopdist(=3), it is
                                                                     The second dataset is a real one1 which records trajectories
the immediate object following obj1 according to the dis-
                                                                     of taxis in Beijing from Nov. 1st to 3rd, 2013. In particular,
tance to q), so cn6 adds obj6 and obj1 into result list, and
                                                                     each record in the dataset contains vehicle ID, geo-location,
enqueues mobj5 , and increases c by 2, resulting 3. Similarly,
                                                                     recording time stamp, etc. The amount of the records in the
cn6 sends bM-request to cn3 and cn5 which are responsi-
                                                                     dataset is about 60 million.
ble for [52, 55], after searching, cn5 replies the result obj1
                                                                        We implement VIHCO in Java and run it on a set of com-
and mobj2 , because obj1 is a duplicate, it is discarded, and
                                                                     puting units. Each unit (Cloud node) is configured with
mobj2 is enqueued. Then obj4 is inserted into result list,
                                                                     1
c is set to 4. After that, cn6 processes mobj5 and sends a               http://activity.datatang.com/20130830/description
                                                                              key=60                                                                                             initial:
                                                                                       cn5                                                                                       < bM1(0), bM2(2), bM3(3) >
                                                                                                                                                cn4 key=8                        result: null
                                                                            key=52                                                                 cn8 key=10                    < bM2(2), bM3(3), obj4(4) >
                                                                           cn3                                                                                    11
                bM1                                                                                                                                       12
                                                                                                                                                                                 result: obj3
                                    q     obj3
                                                                      key=48                                                                         key=14                      < bM3(3), obj4(4), mobj5(5) >
                            obj4
                                                        bM3           cn2                                                                                cn1
                                                                                                                                                                                 result: obj3, obj6, obj1
                                   obj6
                                          obj1                                                                                                                                   < obj4(4), mobj5(5), mobj2(7) >
                                                                                                                                                         18
                            obj5
                                                                                                                                                                                  result: obj3, obj6, obj1
                                                                      42
                      bM2                        obj2                      cn6
                                                                            key=41                                                                                               < mobj5(5), mobj2(7) >
                                                                                                                                               cn7                                result: obj3, obj6, obj1, obj4
                                                                                                                                                     key=24
                                                                                                                                       26                                        < mobj2(7) >
                                                                                                                                                                                 result: obj3, obj6, obj1, obj4, obj5


                                                                  Figure 3: Sequential Processing


                                                                                               Next, we can see the comparison results from Figure 5(b)
                                                                                             under different system sizes. A small size gives better per-
                                                                                             formance to sequential processing and a large size gives to
                                                                                             parallel one. The reason is similar to the previous one.
      (a) t=0     (b) t=0.25              (c) t=0.5      (d) t=0.75        (e) t=1
                                                                                                                             7 0 0 0                                                                                                            6 5 0 0

           Figure 4: Visualization of Synthetic Dataset                                                                                                                            P a r a lle l
                                                                                                                                                                                                                                                6 0 0 0
                                                                                                                                                                                                                                                                      P a r a lle l
                                                                                                                             6 0 0 0                                                S e q u e n tia l                                                                  S e q u e n tia l
                                                                                                                                                                                     R T -C A N                                                                         R T -C A N
                                                                                                                                                                                                                                                5 5 0 0
                                                                                               Q u e ry T h ro u g h p u t   5 0 0 0


                                                                                                                                                                                                                Q u e ry T h ro u g h p u t
                                                                                                                                                                                                                                                5 0 0 0
Intel(R) Core2 Quad 2.5GHz processor, 2GB memory and                                                                         4 0 0 0
                                                                                                                                                                                                                                                4 5 0 0
500GB disk. The number of nodes varies from 4 to 16.                                                                         3 0 0 0
                                                                                                                                                                                                                                                4 0 0 0
And they are connected via 1Gbps network links. We use                                                                       2 0 0 0
                                                                                                                                                                                                                                                3 5 0 0
RT-CAN for comparison and query throughput (number of
queries processed per second) as the metric in the exper-                                                                    1 0 0 0                                                                                                            3 0 0 0


iments. And we vary k and network size to measure the                                                                                  4                  8            1 6        3 2                    6 4                                                  4                    8                 1 2    1 6

                                                                                                                                                                       k                                                                                                       C lo u d N o d e C o u n t
performance, which are 16 and 8 as default, respectively.
                                                                                                                                                      (a) Effect of k                                                                                   (b) Effect of system size
6.1      Results on Synthetic Dataset                                                                                                                   Figure 5: Results on Synthetic Dataset
   Figure 5 shows the performances of three query processing
methods on synthetic dataset. When we increase k from 4 to
64, both performances degrade, which can be explained that
a larger k involves more Cloud nodes to inspect distances
                                                                                             6.2                                                Results on Real Dataset
from objects to the query point, hence messages and com-                                       Figure 6 shows the results on real dataset. The advantage
parisons are raised. However, we can see from Figure 5(a),                                   of VIHCO and our query processing methods are definitely
RT-CAN degrades faster than our two processing methods,                                      revealed in the results, where RT-CAN deteriorates more
this is due to: first, VIHCO uses not only long-links (fin-                                  seriously with k increased. A detail observation is that the
ger table) to find charge-nodes, but also vicinity neighbors                                 throughput of RT-CAN is only about 1,000 when k is equal
to quickly locate nearby nodes, second, our parallel process-                                to 64, while our methods are above 4,000. We can see VI-
ing method takes differential cells between two consecutive                                  HCO is more suitable for the skewed dataset than RT-CAN
range queries into consideration, which reduces communica-                                   for kNN query.
tion cost, on the other hand, our sequential method is able to
elaborately forward query to the related Cloud nodes, while                                                                  8 0 0 0                                                                                                                7 5 0 0

RT-CAN does not take these details into consideration.                                                                       7 0 0 0
                                                                                                                                                                                         P a r a lle l
                                                                                                                                                                                          S e q u e n tia l                                         7 0 0 0
                                                                                                                                                                                                                                                                       P a r a lle l
                                                                                                                                                                                                                                                                        S e q u e n tia l
   For comparing parallel and sequential method, we can see                                                                  6 0 0 0
                                                                                                                                                                                           R T -C A N
                                                                                                                                                                                                                                                    6 5 0 0
                                                                                                                                                                                                                                                                         R T -C A N
from Figure 5(a), when k is small, sequential processing out-
                                                                                             Q u e ry T h ro u g h p u t


                                                                                                                                                                                                                      Q u e ry T h ro u g h p u t


                                                                                                                                                                                                                                                    6 0 0 0
                                                                                                                             5 0 0 0
performs parallel one, while for a large k, the result is oppo-                                                                                                                                                                                     5 5 0 0
                                                                                                                             4 0 0 0
site. This can be explained that when k is small, there are                                                                                                                                                                                         5 0 0 0
fewer objects to be scanned, and parallel processing does not                                                                3 0 0 0
                                                                                                                                                                                                                                                    4 5 0 0
exploit parallelism of Cloud nodes sufficiently, on the other                                                                2 0 0 0
                                                                                                                                                                                                                                                    4 0 0 0
hand, sequential method is able to deliver messages accu-                                                                    1 0 0 0
                                                                                                                                                                                                                                                    3 5 0 0
rately, any no-related node would not be contacted, hence                                                                                  4                  8            1 6     3 2                    6 4                                                     4                    8             1 2    1 6

                                                                                                                                                                           k                                                                                                   C lo u d N o d e C o u n t
the communication cost is reduced. While with a large k,
the disadvantage of sequential method increases, i.e., Cloud                                                                                          (a) Effect of k                                                                                   (b) Effect of system size
nodes receive query messages one by one, and for parallel
                                                                                                                                                                   Figure 6: Results on Real Dataset
method, the query range is enlarged, more nodes are able to
search results simultaneously, the performance is raised.
7.   CONCLUSIONS                                                      infrastructure for location aware services. In Mobile
   With the increasing of DaaS (Database as a Service) re-            Data Management (MDM), 2011 12th IEEE
quirement, kNN query processing methods would be paid                 International Conference on, volume 1, pages 7–16.
more attention by both academic community and industrial              IEEE, 2011.
circles. In this paper, we propose an interesting problem,       [10] I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and
which is with better performance for Cloud computing, par-            H. Balakrishnan. Chord: A scalable peer-to-peer
allel method or sequential one. For answering such a ques-            lookup service for internet applications. ACM
tion, we first devise an distributed indexing structure VI-           SIGCOMM Computer Communication Review,
HCO for constructing overlay network, featured by vicinity-           31(4):149–160, 2001.
connection which is capable to fast locate the destinations.     [11] Y. Tao, J. Zhang, D. Papadias, and N. Mamoulis. An
Upon VIHCO, we present the algorithms of parallel and se-             efficient cost model for optimization of nearest
quential processing, and explore the answers through exper-           neighbor search in low and medium dimensional
iments both on synthetic and real dataset, and the results            spaces. Knowledge and Data Engineering, IEEE
show that our VIHCO is better than RT-CAN, and the se-                Transactions on, 16(10):1169–1184, 2004.
quential method is more efficient under small k query condi-     [12] Y. Theodoridis, J. R. Silva, and M. A. Nascimento. On
tion and small system size, while parallel one suits for large        the generation of spatiotemporal datasets. In Advances
k and large scale of computing nodes.                                 in Spatial Databases, pages 147–164. Springer, 1999.
   In the future, we plan to extend our work to in high di-      [13] J. Wang, S. Wu, H. Gao, J. Li, and B. C. Ooi.
mensional space and road network space.                               Indexing multi-dimensional data in a cloud system. In
                                                                      Proceedings of the 2010 ACM SIGMOD International
8.   ACKNOWLEDGMENTS                                                  Conference on Management of data, pages 591–602.
  This work is supported by NSF of China grant 61303062               ACM, 2010.
and 71331008. We would like to thank Prof. Dai and Dr.           [14] C. Zhang, X. Chen, B. Ge, and W. Xiao. Indexing
Hu for helping with the proof.                                        historical spatio-temporal data in the cloud. In Big
                                                                      Data (Big Data), 2015 IEEE International Conference
9.   REFERENCES                                                       on. IEEE, 2015.
 [1] S. Berchtold, C. Böhm, D. A. Keim, and H.-P.
     Kriegel. A cost model for nearest neighbor search in
     high-dimensional data space. In Proceedings of the
     sixteenth ACM SIGACT-SIGMOD-SIGART
     symposium on Principles of database systems, pages
     78–86. ACM, 1997.
 [2] G. Chen, H. T. Vo, S. Wu, B. C. Ooi, and M. T. Özsu.
     A framework for supporting dbms-like indexes in the
     cloud. Proceedings of the VLDB Endowment,
     4(11):702–713, 2011.
 [3] G. R. Hjaltason and H. Samet. Distance browsing in
     spatial databases. ACM Transactions on Database
     Systems (TODS), 24(2):265–318, 1999.
 [4] M. Jelasity, S. Voulgaris, R. Guerraoui, A.-M.
     Kermarrec, and M. Van Steen. Gossip-based peer
     sampling. ACM Transactions on Computer Systems
     (TOCS), 25(3):8, 2007.
 [5] I. Kamel and C. Faloutsos. Hilbert r-tree: An
     improved r-tree using fractals. 1993.
 [6] R. Lange, F. Dürr, and K. Rothermel. Scalable
     processing of trajectory-based queries in
     space-partitioned moving objects databases. In
     Proceedings of the 16th ACM SIGSPATIAL
     international conference on Advances in geographic
     information systems, page 31. ACM, 2008.
 [7] H. Lu, Z. Huang, C. S. Jensen, and L. Xu. Distributed,
     concurrent range monitoring of spatial-network
     constrained mobile objects. In Advances in Spatial and
     Temporal Databases, pages 403–422. Springer, 2007.
 [8] A. Meka and A. Singh. Dist: a distributed
     spatio-temporal index structure for sensor networks.
     In Proceedings of the 14th ACM international
     conference on Information and knowledge
     management, pages 139–146. ACM, 2005.
 [9] S. Nishimura, S. Das, D. Agrawal, and A. E. Abbadi.
     Md-hbase: a scalable multi-dimensional data

</pre>