<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Map-Based Visual Exploration of Geolocated Time Series</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Georgios Chatzigeorgakidis</string-name>
          <email>chgeorgakidis@uop.gr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dimitrios Skoutas</string-name>
          <email>dskoutas@imis.athena-innovation</email>
          <email>dskoutas@imis.athena-innovation. gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kostas Patroumpas</string-name>
          <email>kpatro@imis.athena-innovation.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Spiros Athanasiou</string-name>
          <email>spathan@imis.athena-innovation.gr</email>
          <email>spiros@uop.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Spiros Skiadopoulos</string-name>
          <email>spiros@uop.gr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Athena R.C.</institution>
          ,
          <country country="GR">Greece</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Peloponnese</institution>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <fpage>92</fpage>
      <lpage>99</lpage>
      <abstract>
        <p>The amount and significance of time series that are associated with specific locations, such as visitor check-ins at various places or sensor readings, have increased in many domains over the last years. Although several works exist for time series visualization and visual analytics in general, there is a lack of eficient techniques for geolocated time series in particular. In this work, we present an approach that relies on a hybrid spatial-time series index to allow for interactive map-based visual exploration and summarization of geolocated time series data. In particular, we use the BTSR-tree index, which extends the R-tree by maintaining bounds for the time series indexed at each node. We describe the structure of this index and show how it can be directly exploited to produce map-based visualizations of geolocated time series at diferent zoom levels eficiently. We empirically validate our approach using two real-world datasets, as well as a synthetic one that is used to test the scalability of our method.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        Time series are generated and stored at a vastly increasing rate
in many industrial and research applications, including the Web
and the Internet of Things, public utilities, finance, astronomy,
biology, and many more. A significant portion concerns geolocated
time series, i.e., those generated at, or otherwise associated with
specific locations. While indexing, mining and exploring time
series data has attracted a lot of interest from the database and
data mining communities [
        <xref ref-type="bibr" rid="ref12 ref25 ref28 ref4">4, 12, 25, 28</xref>
        ], studying of geolocated
time series is still largely overlooked.
      </p>
      <p>Geolocated time series can be found in various domains and
applications. A typical example is encountered in the DAIAD
project1, where time series are used to represent water
consumption measured by smart meters installed in urban households.
Analyzing such time series can provide valuable insights
regarding trends and patterns of consumer behavior in daily life. Such
results can then be used to forecast and balance water demand,
as well as to plan and prioritize interventions that can guide
consumers towards more sensible water use. Similar use cases
can also be found in other domains, such as in geomarketing or
mobile advertisement, where geolocated time series may
represent the number of visitors or the revenue generated at a certain
location across time. Extracting insights, trends and patterns can
be significantly facilitated by map-based visualizations of
summarized time series data. For example, such visualizations can
reveal which type of consumption patterns are most frequently
observed among consumers in a certain area or what the spatial
distribution of sales for a certain product looks like.</p>
      <p>However, time series is an inherently complex data type, and
such datasets can reach extremely large volumes, both
horizontally (i.e., very long series across time) and vertically (i.e., time
series generated by countless sources). Consequently,
management, analysis and exploration of such Big Time Series Data is a
task of excessive complexity, requiring eficient algorithms. In
particular, visual exploration of geolocated time series needs to
process the required information in real time, while the user
interacts with the map. Whenever the user zooms in or scrolls
the map, visual analytics and aggregates should be computed
on-the-fly, e.g., identifying the predominant patterns in the time
series and their spatial distribution within the map area.</p>
      <p>Consider the example illustrated in Figure 1. When the user
zooms the map into the red rectangle, the visualization tool
should readily identify and present the two patterns (shown
in blue and green color) appearing therein. To avoid cluttering
the map, when the spatial distribution is very dense or the map
area is too large, it is meaningful to display only aggregate
information, e.g., the number of time series per identified pattern and
their respective centroids; individual time series may be identified
only at a greater zoom level.</p>
      <p>
        Such fast computation and retrieval for large datasets of
geolocated time series can be enabled by indexing. Several approaches
have been proposed that eficiently index large amounts of plain
time series data, either relying on Discrete Wavelet Transform
like [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] to reduce dimensionality of time series, or the family
of indices based on Symbolic Aggregate Approximation (SAX)
over the time series [
        <xref ref-type="bibr" rid="ref28 ref3 ref31 ref4">3, 4, 28, 31</xref>
        ]. However, all aforementioned
techniques index the data solely on the time series domain, not
taking the spatial dimension into account. If the analyzed time
series are inherently associated with a spatial attribute (e.g.,
locations of smart meters), such indexing does not eficiently support
queries and visualizations that do not simply apply to the time
series domain, but also involve spatial filters. As in the example
of Figure 1, a user may need to explore time series similar to
a specific pattern, but also having locations within the actual
map area. For such mixed requests, it is ineficient to evaluate
each predicate separately, e.g., using first the time series index to
retrieve a candidate set of time series similar to the pattern, and
then applying a spatial filter against these candidates to retrieve
only those qualifying within the specified query area.
      </p>
      <p>
        To address this shortcoming, we have recently proposed an
extension to the R-tree spatial index [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], ofering eficient
support to similarity search on geolocated time series. The idea
behind our BTSR-tree hybrid index [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] is to combine both spatial
proximity and time series similarity. To that end, in addition to
the standard Minimum Bounding Rectangle (MBR) denoting the
spatial bound of its contents, each node is augmented with a
Minimum Bounding Time Series (MBTS), i.e., a band with upper
and lower bounds that encloses all time series contained in its
subtree. Maintaining both kinds of bounds per node enables
pruning the search space simultaneously in the spatial and the time
series domains while traversing the index. To increase pruning
efectiveness, time series indexed in a given node are further
distinguished into bundles on the basis of their similarity, hence
achieving tighter bounds in the MBTS of these bundles.
      </p>
      <p>In this paper, we take advantage of this novel BTSR-tree
index and propose an interactive method for map-based visual
exploration and summarization over large datasets of geolocated
time series. Intuitively, when the user interacts with the map,
i.e., either zooms in/out or moves the visible area, this technique
can identify the most significant patterns characterizing the time
series located in this area. These patterns are abstracted by the
MBTSs of the bundles contained in the nodes of the BTSR-tree
index within this area, summarizing the various time series therein.
In addition, the corresponding MBRs and the number of
geolocated time series per pattern can be depicted on the map.</p>
      <p>For providing prompt visualizations of summaries over
geolocated time series data and minimizing latency when drawing the
relevant graphic elements, we need early access to both spatial
and time series information while traversing the index. For this
purpose, we adapt the BTSR-tree index so as to also include
aggregates per node, i.e., the number of time series pertaining to
each bundle. Subsequently, we introduce a new traversal
algorithm for eficient retrieval of a given number of bundles that
are the most representative in the map area. To the best of our
knowledge, this is the first work that considers visual exploration
and summarization of geolocated time series.</p>
      <p>In summary, our main contributions are as follows:
• We propose an adapted variant of the BTSR-tree index
as well as a novel algorithm for its traversal in order to
quickly retrieve summaries (bundles) of geolocated time
series within a given area.
• Thanks to robust index support, we introduce a method
that can cluster bundles according to time series similarity,
calculate their support and identify their spatial extent,
thus enabling their interactive map-based exploration.
• We exemplify the proposed visualization method with two
use cases based on real-world datasets.
• We also empirically evaluate the performance of index
traversal, confirming its low execution cost against a large
synthetic dataset of geolocated time series.</p>
      <p>The remainder of this paper is organized as follows. Section 2
reviews related work. Section 3 outlines basic concepts and
formulates the problem. Section 4 describes the BTSR-tree index and
introduces our approach on summarization of geolocated time
series for their eficient visual exploration. Section 5 presents
indicative use cases with map visualizations and also reports
performance results. Finally, Section 6 concludes the paper and
outline future research directions.
2</p>
    </sec>
    <sec id="sec-2">
      <title>RELATED WORK</title>
      <p>As our approach suggests visual exploration of time series and is
based on indexing of such data, next we briefly survey both of
these research topics.</p>
      <p>
        Visual Exploration of Time Series. In constrast to declarative
visualization specifications suggested in [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ], a recent tutorial
[
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] advocates the use of example-based methods in exploration
of large relational, textual, and graph datasets. Such a
queryby-example approach has been applied in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] so as to explore
relevance feedback for retrieval from time series databases.
Instead of returning the top matching time series, this technique
incorporates diversity into the results, which are presented to
the user for feedback and refined in several rounds.
      </p>
      <p>
        RINSE [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ] is a Recursive Interactive Series Explorer
specifically designed for exploration of data series. Built on top of
ADS+ [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ], a special adaptive index structure for data series, it
can progressively build parts of the index on demand at query
time, concerning only those chunks of the data involved in users’
queries. In terms of visualization, users can get those series
qualifying to range or nearest-neighbor queries interactively drawn
on screen, as well as monitor various statistics regarding the
index footprint (e.g., RAM and disk usage) as it gets updated.
      </p>
      <p>
        In contrast, ATLAS [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] is a visual analytics tool specifically
geared towards interactivity when ad hoc filters, arbitrary
aggregations, and trend exploration are applied against massive time
series data. This client-server architecture employs a column
store as its backend equipped with indexing, and preemptively
caches data that may be required in queries so as to reduce latency
when panning, scrolling, and zooming over time series.
      </p>
      <p>
        Recently, the ONEX paradigm [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] concerns online
exploration of time series. It first constructs compact similarity groups
over time series for specific lengths based on Euclidean distance,
and then can eficiently support exploration of these groups with
the Dynamic Time-Warping (DTW) method over their
representatives of diferent lengths and alignments.
      </p>
      <p>
        Besides, smoothing can be applied to streaming time series to
remove noise in visualizations while preserving large-scale
deviations [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]. To highlight important phenomena without harming
representation quality from oversmoothing, this approach
introduces quantitative metrics involving variance of first diferences
and kurtosis to automatically calibrate smoothing parameters.
      </p>
      <p>
        ForeCache [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] leverages two prefetching mechanisms to
facilitate exploration of large geospatial, multidimentional and time
series data stored in a DBMS. By predicting the user’s
behavior, it fetches the necessary data as the user interacts with the
application.
      </p>
      <p>None of the aforementioned methods and systems provides
map-based visual exploration of geolocated time series, as is the
goal of our work in this paper.</p>
      <p>
        Indexing of Time Series. Earlier approaches towards indexing
of time series data were based on leveraging multi-resolution
representations. For instance, the Discrete Wavelet Transform
[
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] is used in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] to gradually reduce the dimensionality of
time series data via the Haar wavelet [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] and generate an index
using the coeficients of the transformed sequences. In [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ], it
is further observed that, other than orthonormal wavelets,
biorthonormal ones can also be used for eficient similarity search
over wavelet-indexed time series data, demonstrating several
such wavelets that outperform the Haar wavelet in terms of
precision and performance. In addition, an alternative approach
regarding k-nearest neighbor search over time series data is
introduced in [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. The proposed method accesses the coeficients
of Haar-wavelet-transformed time series through a sequential
scan over step-wise increasing resolutions.
      </p>
      <p>
        State-of-the-art approaches for time series indexing comprise
methods based on the Symbolic Aggregate Approximation (SAX)
representation [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. This is derived from the Piecewise Aggregate
Approximation (PAA) representation of a time series [
        <xref ref-type="bibr" rid="ref19 ref30">19, 30</xref>
        ], by
quantizing the segments of its PAA representation on the y-axis.
The first attempt to leverage the potential of the SAX
representation was presented in [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ], introducing the indexable Symbolic
Aggregate Approximation (iSAX), capable of a multi-resolution
representation for time series. The iSAX index was further
extended to iSAX 2.0 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] by enabling bulk loading of time series
data. Its next version is the iSAX2+ index [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], which handles
better the expensive I/O operations caused by the aggressive
node splitting while building the index. Finally, the ADS+
index [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] is another extension of iSAX, which overcomes the still
significantly expensive index build time by adaptively building
the index while processing the workload of queries issued by
the user. A comprehensive overview of the time series indexing
approaches based on the SAX representation is presented in [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ].
      </p>
      <p>
        Unfortunately, none of the abovementioned access methods
can inherently support geolocated time series, i.e., time series
inextricably associated with a location. To the best of our
knowledge, the only index in the literature that supports such time
series is the BTSR-tree index [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. This hybrid index follows a
similar rationale set by spatio-textual indices [
        <xref ref-type="bibr" rid="ref11 ref14 ref8 ref9">8, 9, 11, 14</xref>
        ] that
have been proposed to speed up evaluation of queries
combining location-based predicates with keyword search. Essentially,
this paradigm implies combining a spatial index structure (e.g.,
R-tree, Quadtree, Space-Filling Curve) with a textual index (e.g.,
inverted file, signature file). Depending on their structure, these
variants can be characterized either as spatial-first or textual-first
indices [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. In a similar spirit, our BTSR-tree is a spatial-first
index based on the R-tree that can additionally abstract similarity
of time series instead of a textual one. As a result, it can ofer
analogous improvements when searching against geolocated time
series data, as we discuss in more detail in Section 4.1.
3
      </p>
    </sec>
    <sec id="sec-3">
      <title>PROBLEM DEFINITION</title>
      <p>Next, we introduce notation and definitions used in our approach,
and we formally define the problem addressed in this paper.</p>
      <p>A time series is a time-ordered sequence of values T = {v1, . . . ,
vw }, where vi is the value at the i-th time point and w is the
length of the series. In particular, we deal with time series that
are additionally characterized by a location, denoted by T .loc.
Assuming a 2-dimensional space, we further use the notation
T .locx , T .locy to refer to the (x ,y) coordinates of T ’s location.</p>
      <p>
        In the spatial domain, the distance between two geolocated
time series T and T ′ of equal length w is calculated using the
Euclidean distance of their respective locations. Furthermore,
we normalize this distance with maxDistsp , i.e., the maximum
spatial distance of any pair of objects in the dataset, to obtain a
measure in the interval [
        <xref ref-type="bibr" rid="ref1">0, 1</xref>
        ]. Thus:
distsp (T ,T ′) =
q
(T .locx − T ′.locx )2 + (T .locy − T ′.locy )2
      </p>
      <p>maxDistsp</p>
      <p>
        In the time series domain, similarly to other prior works (e.g.,
[
        <xref ref-type="bibr" rid="ref28">28</xref>
        ]), we also apply the Euclidean distance to measure the
similarity of a pair of objects. In future work, we plan to make use of
more complex distance measures [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. More specifically, we
calculate the distance between two time series T and T ′ as follows:
distt s (T ,T ′) =
vt w
      </p>
      <p>X(vi − vi′)2
i=1
maxDistt s
(1)
(2)
where maxDistt s denotes the maximum distance of any pair of
objects in the dataset and is used for normalization, as above.</p>
      <p>To index and summarize time series, we use the notion of
Minimum Bounding Time Series (MBTS). An MBTS is a summarization
of a set of time series T , defined by a pair of upper and lower
time series bounds that contain them. Formally, given a set of
time series T , its MBTS consists of an upper bounding time series
Tup and a lower bounding time series Tlo , constructed by selecting
the maximum (for Tup ) and minimum (for Tlo ) of values at each
time point among all time series of the set as follows:
Tup = { max T [0], . . . , max T [w − 1]}</p>
      <p>T ∈T T ∈T (3)
Tlo = { min T [0], . . . , min T [w − 1]}</p>
      <p>T ∈T T ∈T</p>
      <p>We can now formally introduce our problem. Given a set of
geolocated time series and an area, our goal is to produce a
summary for visualization comprising the following two parts:
• Time series summary: A collection of MBTSs (bundles),
summarizing the time series located within the given area.
• Spatial summary: A set of MBRs, each one associated with
an object counter for each identified bundle.</p>
      <p>The bundles provide a summarization of the time series that
are contained within their MBTSs. Figure 2 depicts an example
of two time series bundles for two diferent sets of time series.
Regarding the spatial summary, for each MBR associated with
a certain bundle, the counter denotes the number of time series
contained in it.
4</p>
    </sec>
    <sec id="sec-4">
      <title>APPROACH</title>
      <p>We propose a visualization method for geolocated time series
that draws on a map the time series and spatial summaries for
the current visible area. Using this process, a user can select the
bundle of her preference and the proper spatial summary will
appear on the map after acquiring the necessary MBRs from
the BTSR-tree index. Whenever the user zooms in/out or moves
around the map, the BTSR-tree is traversed, and the
corresponding bundles, MBRs and object counts are obtained to drive the
visualization. In each case, the rectangle corresponding to the
visible part of the map is used to feed a traversal algorithm that
eficiently gathers the results. In the following, we describe this
process in detail, after providing some necessary background
information on the BTSR-tree index.
4.1</p>
    </sec>
    <sec id="sec-5">
      <title>The BTSR-tree Index</title>
      <p>
        To eficiently generate real-time visualizations of geolocated time
series data, we need early access to both spatial and time series
related information while traversing the index, in order to maintain
low latency levels when drawing the required graphic elements.
However, none of the approaches presented in Section 2 supports
geolocated time series indexing. To the best of our knowledge,
the recently proposed BTSR-tree index [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] is the only one that
provides the desired functionality.
      </p>
      <p>
        The BTSR-tree is based on the R-tree [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] for the spatial
indexing part. The R-tree organizes a hierarchy of nested
ddimensional rectangles. Each node corresponds to a disk page
and represents the MBR of its children or, for leaf nodes, the
MBR of its contained geometries. The number of entries per node
(excluding the root) is between a lower bound m and a
maximum capacity M. Query execution in R-trees starts from the root.
MBRs in any visited node are tested for intersection against the
search region. Qualifying entries are recursively visited until the
leaf level or until no further overlaps are found. Several paths
may be probed, as multiple sibling entries could overlap with
the search region. The BTSR-tree extends the information stored
within each node of the R-Tree with bundles of MBTSs. This
allows to eficiently prune the search space when evaluating hybrid
queries combining time series similarity with spatial proximity.
      </p>
      <p>As in the standard R-tree, each node of the BTSR-tree has at
least m and at most M entries and stores the MBRs of its
children. Additionally, for each child, a node stores a pre-specified
number of time series bundles, each consisted of an MBTS that
encloses all the time series indexed in its subtree. Each bundle is
calculated using Equation 3. Construction and maintenance of
the BTSR-tree follow the procedures of the R-tree for data
insertion, deletion and node splitting. Objects (i.e., geolocated time
series) are inserted into leaf nodes, and any resulting changes
are propagated upwards. Once the nodes have been populated,
the bundles of each node are calculated bottom-up. To construct
the time series bundles within each node, we rely on k-means
clustering. The objects contained in each node are clustered
according to their Euclidean distance on the time series domain.
The example in Figure 2 depicts the bundles (the two bands with
a thick outline) obtained for a set of time series (shown as thin
polylines) when the number of clusters is set to k = 2.</p>
      <p>
        For inner nodes, the bundles are constructed bottom-up. First,
in each leaf node, the contained time series are clustered into
k bundles. Then, the MBTS of each bundle is computed and
stored in the node. As a next step, each parent node receives
all the MBTSs of its children and computes its own k bundles
and respective set of MBTS by clustering them. The process
continues upwards, until reaching the root of the tree. Optionally,
Piecewise Aggregate Approximation [
        <xref ref-type="bibr" rid="ref19 ref30">19, 30</xref>
        ] can be applied over
the time series. As detailed in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], this allows a trade of between
the number of bundles per node and the MBTS resolution, thus
permitting a larger number (&gt; k) of bundles in nodes at higher
levels in the tree hierarchy.
      </p>
      <p>In addition, to support the required functionality of our
visualization method, we further extend here the information stored in
each node with the count of geolocated time series that are fully
contained within each bundle. This is also done bottom-up, while
the index is traversed to calculate the bundles. At each leaf node,
after the clustering, we propagate the number of members of
each cluster to its parent, which, in turn calculates its clusters and
aggregates the counts it has received for each bundle’s members.
This procedure continues up to the root of the tree.
4.2</p>
    </sec>
    <sec id="sec-6">
      <title>Summary Construction for Map-Based</title>
    </sec>
    <sec id="sec-7">
      <title>Visualization</title>
      <p>We now present our summarization approach for producing
mapbased visualizations of geolocated time series. The process is
outlined in Algorithm 1. It takes as input the query rectangle
(q), i.e., the area of the map for which the visualization is
produced, and the number of bundles k to be generated. The process
comprises three distinct steps. Initially, the BTSR-tree index is
traversed to obtain the MBRs contained in the query rectangle,
along with their bundles and the number of objects per bundle
(Line 1). Next, k-means clustering is applied using the average
time series per bundle as centroids (Line 2). Finally, the new
bundles are calculated and the proper MBRs and
corresponding object counts are assigned to each bundle (Line 3). Next, we
describe each step in more detail.</p>
      <p>Step 1: BTSR-tree Traversal. During this step, the BTSR-tree
index is traversed, with the target being the fast provision of a
predefined number k of geolocated time series bundles contained
within the given area q, along with the MBRs where these bundles
can be found and the total number of geolocated time series that
reside within each MBR. All required information is stored within
the nodes of the BTSR-tree, thus, when a node that is contained
within the query rectangle is found, the relevant information
is retrieved and added to the intermediate results, without any
need to continue searching in its sub-tree. The output of this step
is passed to the next step of the procedure.</p>
      <p>In more detail, the traversal is performed as follows. After
initializing a queue with the root’s children (Line 7), we loop
over it (Line 8) until it’s empty. For each inner node’s child N ′,
we check whether its MBR is contained within the given query
rectangle q (Lines 11–12). If so, its MBR, time series bundles and
the number of objects per bundle are added to the intermediate
results (Line 13) as a tuple with the following components:
⟨mbr , {(mbts1,cnt1), ..., (mbtsk ,cntk )}⟩
Each such tuple indicates the MBR of a node (mbr ), consisting
of the coordinates of the lower left and upper right point, as
well as k pairs denoting the bundles of the node along with the
corresponding number of objects per bundle. If the MBR is not
contained in the query rectangle, we check whether it overlaps
with it and if so, we add the child node to the queue (Line 15). If
not, this MBR is located outside the query rectangle, and thus
we can skip searching this child. Once no more nodes are left to
search, the intermediate results are finally returned (Line 16).</p>
      <p>Step 2: Bundles Clustering. The traversal algorithm returns
tuples, each containing the bundles residing in the query rectangle,
the corresponding nodes’ MBRs and the number of objects per
bundle. During step 2, k-means clustering is executed on the
average time series of each bundle.</p>
      <p>Line 2 of Algorithm 1 calls the clustering procedure. Initially,
for each tuple (Line 20), we loop over its bundles (Line 21) and
generate a new tuple per bundle of the following format:
⟨Tavд ,mbts,cnt ,mbr ⟩
This new tuple contains an average time series, the bundle itself
(mbts), the number cnt of objects enclosed in this bundle, and the
MBR (mbr ) this bundle belongs to (Line 23). The average time
series Tavд is calculated by averaging the upper and lower bound
of each bundle (Line 22), i.e., average value at each time point.
The resulting collection of tuples (Line 24) is fed to the k-means
algorithm in order to return the required number k of bundles
to be created. This clustering generates a clustered collection of
tuples using the calculated average time series (Line 25). These
results are then forwarded to step 3 (Line 26).</p>
      <p>Step 3: Bundles Calculation and MBR Assignment. During step 3,
the clustered tuples received from step 2 are used to calculate the
ifnal bundles, corresponding MBRs and total number of objects
per MBR are assigned to each bundle. The final bundles are
calculated in a similar manner to the MBTS bundles during BTSR-tree
construction. More specifically, at each time point, we obtain the
maximum and minimum value among the corresponding upper
and lower bounds for the bundles of each cluster (see Section
4.1). In case two MBRs that belong to the same final bundle are
the same, their number of objects is aggregated. The final result
is then forwarded to the visualization layer.</p>
      <p>Line 3 of Algorithm 1 calls the corresponding procedure. For
each cluster of tuples received from step 2 (Line 29), we loop
over its members (Line 32) and we use each tuple’s bundle and
MBR to update the upper and lower bounds and the collection of
MBRs that are contained in the final bundle (Lines 33–34). Once
the bounds and the corresponding list of MBRs for the current
bundle have been calculated, we issue an aggregated tuple to the
ifnal result (Line 35). This tuple has the following components:
⟨mbts ′, {(mbr1,cnt1), ..., (mbrn ,cntn )}⟩
where mbts ′ is a resulting bundle, along with the MBRs associated
with it. Note that the number n of MBRs (as well as their shape)
may be varying per bundle, reflecting the spatial distribution
of the respective pattern. Each MBR is accompanied with the
corresponding number of objects (i.e., raw time series) therein.
The final result with all such tuples is then returned in order to
generate the visualization (Line 36).
10
11
12
13</p>
    </sec>
    <sec id="sec-8">
      <title>EXPERIMENTAL EVALUATION</title>
      <p>In this section, we first describe our experimental setup, followed
by indicative examples of map-based visualizations of real-world
geolocated time series, as well as scalability results using a
synthetic dataset containing 4 million time series. The experiments
were conducted on a Dell PowerEdge M910 with 4 Intel Xeon
E7-4830 CPUs, each containing 8 cores clocked at 2.13GHz, 256
GB RAM and a total storage space of 900 GB. Finally, we assume
that the index fits in memory.</p>
      <p>Algorithm 1: Summarization of Geolocated Time Series
Input: The query rectangle q; the number of bundles to be
generated k
Output: A list R containing tuples of bundles, MBRs and
object counts
1 R ← IndexT raversal (q)
2 Rc ← BundlesClusterinд(R,k )
3 Rf ← BundlesCalculation(Rc )
// Step 1
// Step 2
// Step 3
4 return Rf
5 Procedure IndexT raversal (q)
6 R ← ∅
7 Q ← Root .дetChildren()
8 while Q , ∅ do
9 N ← Q .дet N ext ()
if N is not leaf then
foreach N ′ ∈ N .дetChildren() do
if q.contains (N ′.mbr ) then</p>
      <p>R ← R ∪ {⟨N ′.mbr , {N ′.mbts }, {N ′.cnt }⟩}
return R
else if q.overlaps (N ′.mbr ) then</p>
      <p>Q ← Q ∪ N ′.дetChildren()
17 Procedure BundlesClusterinд(R,k )</p>
      <p>Rc ← ∅
C ← ∅
foreach t ∈ R do
foreach b ∈ t .mbts do</p>
      <p>Tavд ← avд(b.up,b.lo)
t ′ ← ⟨Tavд ,b,t .cnt (b ),t .mbr ⟩</p>
      <p>C ← C ∪ {t ′}
Rc ← kmeans (C .avд,k )
return Rc
27 Procedure BundlesCalculation(Rc )
28 Rf ← ∅
29 foreach Cl ∈ Rc .clusters do
30 B ← ∅
31 M ← ∅
32 foreach t ∈ Cl do
33 B ← updateMBT S (B,t .mbts )
34 M ← updateMBRs (M,t .mbr ,t .cnt )</p>
      <p>Rf ← Rf ∪ {⟨{B}, {M }⟩}
return Rf
5.1
5.1.1 Datasets. We use two real-world datasets selected from
diferent application domains and with diverse characteristics. In
addition, we generated a synthetic dataset to test the scalability
of our method. Table 1 lists a summary of the characteristics of
each dataset.</p>
      <p>DAIAD Water Consumption (Water). Courtesy of the DAIAD
project, we acquired a geolocated time series dataset of hourly
water consumption for 822 households in Alicante, Spain from
1/1/2015 to 20/1/2017. In order to get a more representative
dataset for our tests, we calculated the average weekly time
series per household, which is the average consumption value
per hour of the week. Thus, the length of each resulting time
series is 24 × 7 = 168 values across the week.</p>
      <p>NYC taxi dropofs (Taxi) . This dataset contains time series
extracted from yellow taxi rides in New York City during 2015.
The original data2 provide pick-up and drop-of locations, as well
as corresponding timestamps for each ride. For each month, we
generated time series by applying a uniform spatial grid over the
entire city (cell side was 200 meters) and counting all drop-ofs
therein for each day of the week at the time granularity of one
hour. Thus, we obtained the number of drop-ofs for 24 × 7 time
intervals in every cell, which essentially captures the weekly
lfuctuation of taxi destinations there. The centroid of each cell is
used as the geolocation of the corresponding time series.</p>
      <p>Synthetic Water Consumption (Synthetic). To examine the
scalability of our method, we generated a synthetic dataset comprising
4 million geolocated time series by inflating the water
consumption dataset. This was achieved by using the original time series
as seeds and introducing some random variations in their
location and pattern. We chose the water dataset so as to generate
a more densely populated dataset (Alicante is a medium-sized
city), in order to stress-test our visualization method.</p>
      <p>
        5.1.2 Parameters. We built the BTSR-Tree index setting the
minimum and maximum number of entries per node to m = 60
and M = 200, respectively. Regarding the number of bundles,
we set k0 = 5 for its leaf nodes. The number of bundles for the
traversal algorithm is set to be equal to the number of bundles
at the leafs, i.e., k = k0 = 5. For an evaluation of the BTSR-Tree
index under diferent parameter settings, please refer to [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
5.2
      </p>
    </sec>
    <sec id="sec-9">
      <title>Map Visualizations</title>
      <p>Our visualization method depicts the MBTS derived for the most
representative patterns of time series at the currently visible
area of the map. Once our summarization method returns the
results, the corresponding MBRs contained in the current view
and zoom level are drawn on the map, along with the number
of the geolocated time series that belong to the selected bundle.
This number is depicted using circles, colored green for small
numbers, yellow for larger and red for more densely populated
MBRs, thus easily conveying the local intensity of this pattern.
2http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml</p>
      <p>The bundles are listed on the left of the map, using confidence
bands to indicate their upper and lower bounds. The average
time series of each bundle is also depicted. A user can scroll this
list and select the bundle of their preference. Once a bundle is
selected, the contents of the map are updated accordingly with
the respective MBRs and aggregates.</p>
      <p>
        Figure 3 shows an example of such visualization using the
water dataset. The depicted area is in the center of Alicante, in
the most densely populated zone of the city. In this example,
Bundle 4 is selected (indicated with a green colored frame) and
the relevant MBRs are shown on the map (using red colored
frames). This indicates that inside each depicted MBR there
exists a specific number of geolocated time series that have been
clustered to the chosen bundle. As mentioned, each geolocated
time series in this dataset represents hourly water consumption
of a household across one week. Diferent consumption
behaviors have been grouped together and a daily pattern for each
bundle can be noticed which is due to the Circadian rhythmic
way that people consume water [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The rather large number of
geolocated time series in the bundle, considering the zoom level
and the extent of the MBRs, intuitively suggests that neighboring
families tend to have similar water consumption behavior.
      </p>
      <p>Figure 4 illustrates another example, this time using the taxi
dataset in New York City. This dataset is significantly larger, and
the zoom level selected in this example is lower (a larger
geographic area is visible), hence the MBRs contain a larger number
of time series. In this figure, we choose Bundle 1, which
represents the rather quieter taxi dropof zones in Manhattan, as the
total number of dropofs there is rarely over 60 during any hour
of the week. In this example, there is also a clear daily routine in
all bundles, with the dropofs reaching a local maximum twice
per day, suggesting the rush hours in New York City, when
people commute to and from their work. In almost all bundles, the
daily pattern is significantly diferent on Saturdays and Sundays,
which confirms the intuition that during weekends people do
not tend to commute in a routinely fashion. Overall, such visual
representations of information digested from massive time series
data can easily catch users’ attention to important phenomena
and ongoing trends, confirming the usefulness of our approach.
5.3</p>
    </sec>
    <sec id="sec-10">
      <title>Performance Results</title>
      <p>In order to evaluate the performance of our approach on larger
datasets, we built the index using the synthetic dataset and
examined its response time for diferent zoom levels on the map. Since
this is intended as an interactive application, where the
summarization method is triggered as soon as the user moves the map,
response times must be adequately small. Ideally, the response
time should be in the order of milliseconds. In our method, this
is facilitated by the fact that the search along a path stops once
it encounters a node whose MBR is contained in the actual map
extent (rectangle). In this experiment, we measure the response
time for diferent zoom levels, since zooming-in requires deeper
traversal of the BTSR-tree index in order to locate the relevant
nodes. We use map scales to indicate the diferent zoom levels.</p>
      <p>Figure 5 depicts traversal costs for diferent map scales over
the areas covered by the three datasets. More specifically, the
water and synthetic datasets cover the area of the city of Alicante,
Spain, whereas the taxi dataset the wider metropolitan area of
New York City. Response time in all cases is equal or lower than
one second, which makes this method suitable for interactive
visualization. The synthetic dataset, due to its very high density
is significantly slower than the rest, however still the results are
obtained in less than a second. The response for the water dataset
is almost instant due to its small size and very low density.</p>
      <p>Initially, in all cases, at the largest scale, the visible area of the
map contains all the time series in the dataset, thus it only has to
retrieve information from the root of the index. Then, as we zoom
in, more nodes have to be visited, as the MBRs of the accessed
nodes begin to overlap with the map rectangle and their children
have to be retrieved. The worst case for the synthetic dataset is at
scale 1:5000, which roughly corresponds to a large neighborhood
of the city, where many time series are located. For the taxi
dataset, the worst case is at 1:20000, which corresponds to the
wider Manhattan area and then the response time gradually drops
due to the lower dataset density. The number of nodes accessed
in each case is proportional to the response times, ranging from
one node (the root) in case of the smaller map scale (all city) up to
165 at scale of 1:5000 for the synthetic dataset, one up to 53 for the
taxi dataset and one up to 15 for the water dataset. Interestingly,
fewer node accesses are required in all cases at the very large
scale of 1:500, since the respective small map area overlaps with
fewer nodes and most of the search space is pruned.</p>
      <p>Consequently, we deem that our method performs adequately
fast even against a heavily dense synthetic dataset, where a large
number of time series are contained within a small area.
6</p>
    </sec>
    <sec id="sec-11">
      <title>CONCLUSIONS AND FUTURE WORK</title>
      <p>In this paper, we introduced a method for map-based visual
exploration of geolocated time series data. To that end, we proposed
a summarization approach over geolocated time series, which
allows a visual analytics application to retrieve the required
information. Such retrieval can be achieved at low latency, thus being
suitable for interactive exploration of large volumes of such data.
The results can be displayed on a map, depicting the relevant
MBRs and the number of time series contained in each one, for a
selected pattern detected in the time series data. Thanks to the
support of a robust hybrid indexing technique, the patterns
detected at a given zoom level are calculated via k -means clustering
over the time series that reside in the currently visible part of the
map. Our experiments on a large-scale synthetic dataset indicated
that the visualization can be rendered adequately fast for use in
interactive map-based applications. Additionally, we presented
indicative demonstrations of the visualizations generated on two
real-world datasets from diferent domains, confirming that these
visualizations are helpful in revealing patterns both on the time
series themselves as well as their geographic distribution.</p>
      <p>Our ongoing and future work focuses on supporting more
detailed visual analytics and identifying more fine-grained patterns
through visual exploration. One possible extension would be to
enable zooms along time, so that the user can identify patterns
and their spatial distribution, not only over the entire time series,
but also over particular intervals. Further, it would be
interesting to drill-down in a particular summarized result and discover
whether there are diferentiations in the spatial distributions of
its constituent, more detailed patterns.</p>
    </sec>
    <sec id="sec-12">
      <title>ACKNOWLEDGMENTS</title>
      <p>This work has been partially funded and supported by the General
Secretariat for Research and Technology (GSRT), the Hellenic
Foundation for Research and Innovation (HFRI), the EU Project
HELIX (grant No 5002781).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Aschof</surname>
          </string-name>
          .
          <article-title>Circadian rhythms in man</article-title>
          .
          <source>Science</source>
          ,
          <volume>148</volume>
          (
          <issue>3676</issue>
          ):
          <fpage>1427</fpage>
          -
          <lpage>1432</lpage>
          ,
          <year>1965</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>L.</given-names>
            <surname>Battle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Chang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Stonebraker</surname>
          </string-name>
          .
          <article-title>Dynamic prefetching of data tiles for interactive visualization</article-title>
          .
          <source>In SIGMOD</source>
          , pages
          <fpage>1363</fpage>
          -
          <lpage>1375</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Camerra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Palpanas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shieh</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E. J.</given-names>
            <surname>Keogh</surname>
          </string-name>
          .
          <source>iSAX 2</source>
          .
          <article-title>0: Indexing and mining one billion time series</article-title>
          . In ICDM, pages
          <fpage>58</fpage>
          -
          <lpage>67</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Camerra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shieh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Palpanas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rakthanmanon</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E. J.</given-names>
            <surname>Keogh</surname>
          </string-name>
          .
          <article-title>Beyond one billion time series: indexing and mining very large time series collections with i SAX2+</article-title>
          . Knowl. Inf. Syst.,
          <volume>39</volume>
          (
          <issue>1</issue>
          ):
          <fpage>123</fpage>
          -
          <lpage>151</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>K.</given-names>
            <surname>Chan</surname>
          </string-name>
          and
          <string-name>
            <given-names>A. W.</given-names>
            <surname>Fu</surname>
          </string-name>
          .
          <article-title>Eficient time series matching by wavelets</article-title>
          .
          <source>In ICDE</source>
          , pages
          <fpage>126</fpage>
          -
          <lpage>133</lpage>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gerth</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Hanrahan</surname>
          </string-name>
          .
          <article-title>Maintaining interactivity while exploring massive time series</article-title>
          .
          <source>In IEEE VAST</source>
          , pages
          <fpage>59</fpage>
          -
          <lpage>66</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>G.</given-names>
            <surname>Chatzigeorgakidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Skoutas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Patroumpas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Athanasiou</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Skiadopoulos</surname>
          </string-name>
          .
          <article-title>Indexing geolocated time series data</article-title>
          .
          <source>In SIGSPATIAL</source>
          , pages
          <volume>19</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>19</lpage>
          :
          <fpage>10</fpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L.</given-names>
            <surname>Chen</surname>
          </string-name>
          , G. Cong,
          <string-name>
            <given-names>C. S.</given-names>
            <surname>Jensen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Wu</surname>
          </string-name>
          .
          <article-title>Spatial keyword query processing: An experimental evaluation</article-title>
          .
          <source>PVLDB</source>
          ,
          <volume>6</volume>
          (
          <issue>3</issue>
          ):
          <fpage>217</fpage>
          -
          <lpage>228</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Suel</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Markowetz</surname>
          </string-name>
          .
          <article-title>Eficient query processing in geographic web search engines</article-title>
          .
          <source>In SIGMOD</source>
          , pages
          <fpage>277</fpage>
          -
          <lpage>288</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Christoforaki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Dimopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Markowetz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Suel</surname>
          </string-name>
          .
          <article-title>Text vs. space: eficient geo-search query processing</article-title>
          .
          <source>In CIKM</source>
          , pages
          <fpage>423</fpage>
          -
          <lpage>432</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>G.</given-names>
            <surname>Cong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. S.</given-names>
            <surname>Jensen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Wu</surname>
          </string-name>
          .
          <article-title>Eficient retrieval of the top-k most relevant spatial web objects</article-title>
          .
          <source>PVLDB</source>
          ,
          <volume>2</volume>
          (
          <issue>1</issue>
          ):
          <fpage>337</fpage>
          -
          <lpage>348</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>H.</given-names>
            <surname>Ding</surname>
          </string-name>
          , G. Trajcevski,
          <string-name>
            <given-names>P.</given-names>
            <surname>Scheuermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E. J.</given-names>
            <surname>Keogh</surname>
          </string-name>
          .
          <article-title>Querying and mining of time series data: experimental comparison of representations and distance measures</article-title>
          .
          <source>PVLDB</source>
          ,
          <volume>1</volume>
          (
          <issue>2</issue>
          ):
          <fpage>1542</fpage>
          -
          <lpage>1552</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>B.</given-names>
            <surname>Eravci</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Ferhatosmanoglu</surname>
          </string-name>
          .
          <article-title>Diversity based relevance feedback for time series search</article-title>
          .
          <source>Proc. VLDB Endow</source>
          .,
          <volume>7</volume>
          (
          <issue>2</issue>
          ):
          <fpage>109</fpage>
          -
          <lpage>120</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>I. D.</given-names>
            <surname>Felipe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Hristidis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Rishe</surname>
          </string-name>
          .
          <article-title>Keyword search on spatial databases</article-title>
          .
          <source>In ICDE</source>
          , pages
          <fpage>656</fpage>
          -
          <lpage>665</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>A.</given-names>
            <surname>Graps</surname>
          </string-name>
          .
          <article-title>An introduction to wavelets</article-title>
          .
          <source>IEEE Comput. Sci. Eng</source>
          .,
          <volume>2</volume>
          (
          <issue>2</issue>
          ):
          <fpage>50</fpage>
          -
          <lpage>61</lpage>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A.</given-names>
            <surname>Guttman.</surname>
          </string-name>
          R-trees:
          <article-title>A dynamic index structure for spatial searching</article-title>
          .
          <source>In SIGMOD</source>
          , pages
          <fpage>47</fpage>
          -
          <lpage>57</lpage>
          ,
          <year>1984</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A.</given-names>
            <surname>Haar</surname>
          </string-name>
          .
          <article-title>Zur theorie der orthogonalen funktionensysteme</article-title>
          .
          <source>Mathematische Annalen</source>
          ,
          <volume>69</volume>
          (
          <issue>3</issue>
          ):
          <fpage>331</fpage>
          -
          <lpage>371</lpage>
          ,
          <year>1910</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kashyap</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Karras</surname>
          </string-name>
          .
          <article-title>Scalable kNN search on vertically stored time series</article-title>
          .
          <source>In SIGKDD</source>
          , pages
          <fpage>1334</fpage>
          -
          <lpage>1342</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>E. J.</given-names>
            <surname>Keogh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chakrabarti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Pazzani</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Mehrotra</surname>
          </string-name>
          .
          <article-title>Dimensionality reduction for fast similarity search in large time series databases</article-title>
          .
          <source>Knowl. Inf. Syst.</source>
          ,
          <volume>3</volume>
          (
          <issue>3</issue>
          ):
          <fpage>263</fpage>
          -
          <lpage>286</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. J.</given-names>
            <surname>Keogh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Wei</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Lonardi</surname>
          </string-name>
          .
          <article-title>Experiencing SAX: a novel symbolic representation of time series</article-title>
          .
          <source>Data Min. Knowl. Discov.</source>
          ,
          <volume>15</volume>
          (
          <issue>2</issue>
          ):
          <fpage>107</fpage>
          -
          <lpage>144</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>D.</given-names>
            <surname>Mottin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lissandrini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Velegrakis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Palpanas</surname>
          </string-name>
          .
          <article-title>New trends on exploratory methods for data analytics</article-title>
          .
          <source>Proc. VLDB Endow</source>
          .,
          <volume>10</volume>
          (
          <issue>12</issue>
          ):
          <fpage>1977</fpage>
          -
          <lpage>1980</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>R.</given-names>
            <surname>Neamtu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ahsan</surname>
          </string-name>
          , E. Rundensteiner, and
          <string-name>
            <surname>G. Sarkozy.</surname>
          </string-name>
          <article-title>Interactive time series exploration powered by the marriage of similarity distances</article-title>
          .
          <source>Proc. VLDB Endow</source>
          .,
          <volume>10</volume>
          (
          <issue>3</issue>
          ):
          <fpage>169</fpage>
          -
          <lpage>180</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>T.</given-names>
            <surname>Palpanas</surname>
          </string-name>
          .
          <article-title>Big sequence management: A glimpse of the past, the present, and the future</article-title>
          .
          <source>In SOFSEM</source>
          , pages
          <fpage>63</fpage>
          -
          <lpage>80</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>J.</given-names>
            <surname>Paparrizos</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Gravano.</surname>
          </string-name>
          k-shape:
          <article-title>Eficient and accurate clustering of time series</article-title>
          .
          <source>In SIGMOD</source>
          , pages
          <fpage>1855</fpage>
          -
          <lpage>1870</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>P.</given-names>
            <surname>Paraskevopoulos</surname>
          </string-name>
          , G. Pellegrini, and
          <string-name>
            <given-names>T.</given-names>
            <surname>Palpanas</surname>
          </string-name>
          .
          <article-title>Tweeloc: A system for geolocalizing tweets at fine-grain</article-title>
          .
          <source>In ICDM Workshops</source>
          , pages
          <fpage>1178</fpage>
          -
          <lpage>1183</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <surname>I. Popivanov</surname>
          </string-name>
          and
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Miller</surname>
          </string-name>
          .
          <article-title>Similarity search over time-series data using wavelets</article-title>
          .
          <source>In ICDE</source>
          , pages
          <fpage>212</fpage>
          -
          <lpage>221</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>K.</given-names>
            <surname>Rong</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Bailis</surname>
          </string-name>
          . Asap:
          <article-title>Prioritizing attention via time series smoothing</article-title>
          .
          <source>Proc. VLDB Endow</source>
          .,
          <volume>10</volume>
          (
          <issue>11</issue>
          ):
          <fpage>1358</fpage>
          -
          <lpage>1369</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>J.</given-names>
            <surname>Shieh</surname>
          </string-name>
          and
          <string-name>
            <given-names>E. J.</given-names>
            <surname>Keogh</surname>
          </string-name>
          .
          <article-title>iSAX: indexing and mining terabyte sized time series</article-title>
          .
          <source>In SIGKDD</source>
          , pages
          <fpage>623</fpage>
          -
          <lpage>631</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>E.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Battle</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Madden</surname>
          </string-name>
          .
          <article-title>The case for data visualization management systems: Vision paper</article-title>
          .
          <source>Proc. VLDB Endow</source>
          .,
          <volume>7</volume>
          (
          <issue>10</issue>
          ):
          <fpage>903</fpage>
          -
          <lpage>906</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>B.</given-names>
            <surname>Yi</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Faloutsos</surname>
          </string-name>
          .
          <article-title>Fast time sequence indexing for arbitrary Lp norms</article-title>
          .
          <source>In VLDB</source>
          , pages
          <fpage>385</fpage>
          -
          <lpage>394</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>K.</given-names>
            <surname>Zoumpatianos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Idreos</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Palpanas</surname>
          </string-name>
          .
          <article-title>Indexing for interactive exploration of big data series</article-title>
          .
          <source>In SIGMOD</source>
          , pages
          <fpage>1555</fpage>
          -
          <lpage>1566</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>K.</given-names>
            <surname>Zoumpatianos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Idreos</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Palpanas</surname>
          </string-name>
          . Rinse:
          <article-title>Interactive data series exploration with ads+</article-title>
          .
          <source>Proc. VLDB Endow</source>
          .,
          <volume>8</volume>
          (
          <issue>12</issue>
          ):
          <fpage>1912</fpage>
          -
          <lpage>1915</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>