<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Template-based Time Series Generation with Loom</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lars Kegel</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin Hahmann</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wolfgang Lehner</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Technische Universität Dresden, 01062 Dresden</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Time series analysis and forecasting are important techniques for decision-making in many domains. They are typically evaluated on given sets of time series that have a constant size and specified characteristics. Synthetic datasets are relevant because they are flexible in both size and characteristics. In this demo, we present our prototype Loom, which generates datasets with respect to the user's configuration of categorical information and time series characteristics. The prototype allows for the comparison of different analysis techniques.</p>
      </abstract>
      <kwd-group>
        <kwd>Time series analysis</kwd>
        <kwd>Data generation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        Time series describe the dynamic behavior of a monitored object, parameter, or process over time and are one of the most popular and useful data types. They can be found in a multitude of application domains, e.g. as item sales in commerce, as various sensor readings in manufacturing processes, or as demand and production figures in the energy domain. Obviously, this makes them a valuable source for diverse data analysis techniques, such as forecasting [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. This holds especially in the domain of renewable energy production, where the fluctuating character of renewable energy sources makes accurate forecasts vital in order to match electricity production and demand. Further applications on time series data include querying, classification, efficient storage, and much more. The ubiquity of this data type and the ongoing trend for data collection, storage, and analysis have led to a substantial amount of research that is dedicated to the handling and processing of large sets of time series data. While all these research endeavors can differ greatly with respect to their individual goals and application scenarios, they have one thing in common: they require large amounts of time series data in order to evaluate, verify, and optimize their findings. Although there are many stakeholders that have a substantial interest in using and exploiting time series, acquiring sophisticated data is not easy. Basically, there are two sources: first, public open repositories or single datasets [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], which are tailored to specific applications and only offer a small selection; second, "real" data owned by companies or organizations that is sometimes made available to partners in the context of closed research projects but rarely to the general public. Moreover, obtaining real data can be tedious due to the time and cost necessary to collect it [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Based on our own experience, we can state that some data allowing basic evaluations is always available. The situation normally becomes problematic when scalability, versatility, and robustness have to be examined. These require a more versatile selection of data, containing datasets with varying size, time series length, trends, seasonality, or just a different blend of time series characteristics. In general, such data is not available, which often leads researchers to use workarounds to create more data, e.g. duplication to increase the number of time series or their length.
      </p>
      <p>To cope with this problem, we demonstrate Loom, a user-friendly and flexible approach to generating sets of time series for the evaluation of arbitrary analysis techniques or the benchmarking of time series management systems. Loom stands for the process of weaving datasets of arbitrary size from different time series generators. In addition, users can generate categorical information to structure the time series hierarchically. Thus, they form a data cube that can be explored by usual OLAP queries, such as roll up or drill down. Generated datasets can be directly exported to relational databases or flat file formats in order to easily utilize them in different applications. The usage of Loom is template-driven at its core. This means that given datasets can be analyzed in order to extract a template containing their defining characteristics. These templates are then used to create different variants of datasets that are still similar to the template. This approach eases the application of our tool as users do not have to specify a completely synthetic time series model. In addition, this mechanism offers a certain degree of anonymization for otherwise closed data.</p>
      <p>In the remainder of this article, we present a general system overview in Section 2, before we describe our demonstration in detail in Section 3. Previous work related to dataset generation and time series models is presented in Section 4, before concluding remarks and pointers to future work are given in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>2. SYSTEM OVERVIEW</title>
      <p>
        The main workflow of Loom is depicted in Figure 1 and shows all steps necessary to create a set of time series. In this section, we give an outline of the idea behind each step before we describe its implementation in Section 3.
Template creation. This optional step at the beginning of the workflow allows the user to upload and analyze given time series data in order to create templates that can be used during the later steps of the data generation process. Currently, Loom employs three types of template creation: (1) If present, existing hierarchies of categorical information are extracted and stored; in order to anonymize the data, the original attributes are replaced with synthetic ids. (2) Given time series are extracted and taken as samples for time series generation. (3) The whole set of time series is analyzed to create a template that represents its characteristics and can be used to create multiple datasets that are similar to the original. While the first two types are straightforward, the third one is more complex. In order to create the described template, Loom uses an approach based on the hierarchical divisive analysis clustering (DIANA) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. With this method, the dataset is partitioned into groups of similar time series. From each of these clusters, the time series with the lowest average distance to the remaining members is selected as a prototype. By fitting an analytical time series model, e.g. ARIMA, to this time series, a generator that represents the characteristics of its underlying cluster is created. This generator can be used to create multiple variations of the original time series. To complete the template, the size of each group in relation to the whole dataset is stored. With the collected information it is possible to create multiple datasets that are different but still share the characteristic time series of the original and their distribution.
      </p>
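      <p>As an illustration of this step, the following Python sketch clusters a dataset, selects each cluster's medoid as a prototype, and fits an ARIMA generator to it. It is a minimal sketch under stated assumptions: scikit-learn's agglomerative clustering stands in for DIANA, statsmodels provides the ARIMA fit, and all names are our own, not Loom's API.</p>
      <preformat>
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.cluster import AgglomerativeClustering
from statsmodels.tsa.arima.model import ARIMA

def build_template(series, n_clusters, order=(1, 0, 1)):
    """Cluster the dataset, pick each cluster's medoid as prototype,
    and fit one ARIMA generator per cluster (sketch, not Loom's API)."""
    X = np.asarray(series)
    dist = squareform(pdist(X))                      # pairwise distances
    labels = AgglomerativeClustering(
        n_clusters=n_clusters, metric="precomputed", linkage="average"
    ).fit_predict(dist)
    template = []
    for c in range(n_clusters):
        members = np.where(labels == c)[0]
        sub = dist[np.ix_(members, members)]
        medoid = members[sub.mean(axis=1).argmin()]  # lowest avg. distance
        model = ARIMA(X[medoid], order=order).fit()  # cluster generator
        template.append((model, len(members) / len(X)))  # keep group share
    return template
      </preformat>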
      <p>
        Data cube modelling. Usually, a set of time series does not only contain sequences of measured values, but also categorical information, e.g. geography, purpose, or color. Sets of these attributes are organized in hierarchies, called dimensions, while sets of these dimensions form the skeleton of a data cube. The goal of this step is to allow users the configuration and creation of such cubes. In short, data cubes can be formally described as follows [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]:
      </p>
      <p>A data cube skeleton consists of a set of dimensions. A dimension is a lattice of levels $L = \{l_1, l_2, \ldots, l_m\}$. The constraint of this lattice states that the values of a level, called category attributes, functionally determine the values of its parent level, e.g., $l' \to l''$. More formally, for each level $l$, $l'$, $l''$ of the same dimension:
$$l \to l \quad \text{(reflexivity)}$$
$$l \to l' \wedge l' \to l \;\Rightarrow\; l = l' \quad \text{(antisymmetry)}$$
$$l \to l' \wedge l' \to l'' \;\Rightarrow\; l \to l'' \quad \text{(transitivity)}$$
For the sake of simplicity, we implemented totally ordered dimensions in Loom; this will be discussed in Section 5. A total order satisfies the following additional condition:
$$l \to l' \vee l' \to l \quad \text{(totality)}$$
As an example, Figure 2 shows the totally ordered dimension Geography of Australia with the two levels State and Region. The category attributes of Region functionally determine the category attributes of State, e.g. Melbourne and Ballarat determine Victoria.</p>
      <p>Three parameters are necessary to configure a data cube skeleton: the number of dimensions, the number of levels per dimension, and the outdegree of a category attribute. While the first two parameters are straightforward, the last one needs more explanation. The outdegree of a category attribute defines the number of subcategories within a category and thus describes the branching between the levels of a dimension. To illustrate this, consider the example from Figure 2, where we observe an outdegree of 2. This means a category attribute on the State level, e.g. New South Wales, is related to two category attributes on the lower Region level, e.g. Sydney and Blue Mountains. Category attributes that have no further subcategories are called base category attributes and form the leaves of a dimension's hierarchy. Thus, a data cube skeleton is the cross-product of all base category attributes of all dimensions.</p>
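      <p>A minimal sketch of how a skeleton could be generated from these three parameters; the function names and the nested-list representation are our own illustration, not Loom's implementation:</p>
      <preformat>
from itertools import product

def build_dimension(levels, outdegree, prefix="cat"):
    """One totally ordered dimension as a list of levels; each category
    attribute branches into `outdegree` children on the next level."""
    hierarchy = [[prefix + "0"]]                       # single root attribute
    for _ in range(levels):
        parents = hierarchy[-1]
        hierarchy.append([p + "." + str(i)
                          for p in parents for i in range(outdegree)])
    return hierarchy

def skeleton(dimensions, levels, outdegree):
    """Cross-product of all base category attributes of all dimensions."""
    dims = [build_dimension(levels, outdegree, prefix="d" + str(d))
            for d in range(dimensions)]
    return list(product(*(dim[-1] for dim in dims)))   # base attributes only

cells = skeleton(dimensions=2, levels=2, outdegree=2)  # 4 x 4 = 16 base cells
      </preformat>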
      <p>
        In the energy domain, categorical information is also beneficial because forecast models that are built for categories may lead to more accurate and robust forecast results. For instance, the Irish Smart Metering Project [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] gathers time series from smart meters in over 5,000 Irish households and businesses. In a survey, the owners provide additional information, such as social class, house type, and age of the house. We identify 8 dimensions, each consisting of one dimension level. Thus, forecast models can be created for individual time series or for time series aggregated along one or more dimensions.
      </p>
      <p>Time series generation. The data cube skeleton that was configured and created in the previous step must now be filled with facts, which in the case of our system are time series. In the following, we again give a quick formalization of time series and explain what is necessary to configure their creation. A time series is a sequence of successive observations $x_t$ ($1 \le t \le T$) recorded at specified time instances $t$. For this demonstration, we assume that observations are complete and equidistant, i.e. there exists an observation for every time instance and all time instances have the same distance. For configuration, a time frame must be defined that consists of a start and an end time instance as well as the distance between time instances. In addition, one or more measure columns must be defined, depending on whether univariate or multivariate time series should be generated. To fill these measures with actual values, our system uses time series models and samples.</p>
      <p>As an example, the user can create a synthetic model with a base component $b_t$, a season component $s_t$, and an error component $e_t$:
$$x_t = b_t \cdot s(t \bmod L) + e_t$$
The season length is given by $L$. The component $s_t$ is a seasonal mask of length $L$; thus, the weights repeat every $L$ time instances. The error component $e_t$ is normally distributed, $e_t \sim N(0, \sigma^2)$, and overlays the "perfect" model $x_t^* = b_t \cdot s(t \bmod L)$. The standard deviation $\sigma$ depends on the user's accuracy expectation, expressed as the mean absolute percentage error (MAPE):
$$\mathrm{MAPE} = \frac{1}{T} \sum_{t=1}^{T} \left| \frac{x_t - x_t^*}{x_t^*} \right| = \frac{1}{T} \sum_{t=1}^{T} \left| \frac{e_t}{b_t \cdot s(t \bmod L)} \right|$$
The error distribution is calculated such that the user-given MAPE holds on average over the whole time series.</p>
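      <p>To make the calibration concrete, here is a minimal Python sketch of this synthetic model; the function names and the linear-trend example are our own illustration, not Loom's API. It chooses $\sigma$ so that the expected MAPE of the generated series matches the user's target.</p>
      <preformat>
import numpy as np

def generate_series(T, base, seasonal_mask, target_mape, rng=None):
    """x_t = b_t * s(t mod L) + e_t, with sigma calibrated such that the
    expected MAPE equals target_mape (sketch, not Loom's API)."""
    rng = rng or np.random.default_rng()
    L = len(seasonal_mask)
    t = np.arange(T)
    perfect = base(t) * seasonal_mask[t % L]   # x*_t = b_t * s(t mod L)
    # For e_t ~ N(0, sigma^2), E|e_t| = sigma * sqrt(2/pi); hence calibrate
    # sigma so that the expected mean of |e_t| / |x*_t| hits the target MAPE.
    sigma = target_mape / (np.sqrt(2.0 / np.pi) * np.mean(1.0 / np.abs(perfect)))
    return perfect + rng.normal(0.0, sigma, size=T)

# Example: linear base, sinusoidal weekly mask, 5% expected MAPE.
series = generate_series(
    T=365,
    base=lambda t: 100.0 + 0.1 * t,
    seasonal_mask=1.0 + 0.3 * np.sin(2 * np.pi * np.arange(7) / 7),
    target_mape=0.05,
)
      </preformat>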
      <p>
        Datasets may also be created from a template. During template creation, a set of ARIMA models is created based on user-given data. Synthetic time series are generated by those ARIMA models, incorporating normally distributed errors or errors sampled from the template [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
Time series mapping. The mapping links time series to the data cube, i.e. it populates the skeleton with facts. As manual insertion of thousands of time series into a large data cube would not be feasible, Loom features an automatic mapping that randomly distributes the generated time series over the data cube skeleton. The user may set the model frequency by weighting each time series model. If templates are used, their distribution is used as the default.
      </p>
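      <p>A minimal sketch of such a weighted random mapping, assuming hypothetical names of our own rather than Loom's internals:</p>
      <preformat>
import random

def map_series_to_cube(base_cells, generators, weights, rng=None):
    """Assign each base cell of the cube skeleton one synthetic series,
    drawing the generator by its user-set weight (sketch only)."""
    rng = rng or random.Random()
    mapping = {}
    for cell in base_cells:
        generator = rng.choices(generators, weights=weights, k=1)[0]
        mapping[cell] = generator()   # generator() yields one series
    return mapping
      </preformat>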
      <p>
        Dataset export. After the configuration and mapping steps are done, the actual data is created and can be exported in a suitable format for further use. Our application offers two different export destinations: file or database. File export offers general formats like CSV and SQL script, allowing the use of our generated data in almost every application. In addition, we offer export as RData containing a data.table for the popular statistical workbench R [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The database export transforms the created data into fact tables that can be imported into any RDBMS. These tables have one time column and at least one measure column; categorical information may be stored in fact tables as well as in dimension tables. As we deal with a high amount of structured data, it is necessary to bring the dataset into an appropriate schema depending on the chosen export option.
      </p>
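      <p>For the universal schema, a minimal sketch of the file export could look as follows; the helper name and column layout are our own illustration of the redundancy described above, not Loom's exact output format:</p>
      <preformat>
import csv

def export_universal_csv(path, mapping, dimension_names):
    """Write every observation together with its full categorical context
    (universal schema: redundant, but loadable almost anywhere)."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([*dimension_names, "time", "measure"])
        for cell, series in mapping.items():
            for t, value in enumerate(series):
                writer.writerow([*cell, t, value])
      </preformat>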
    </sec>
    <sec id="sec-3">
      <title>3. DEMONSTRATION</title>
      <p>This section demonstrates the usage of Loom. Upon starting the application, the user sets a workspace directory that is used for configuration files and generated datasets. Below, we describe the configuration of a template, a dataset, and further generation steps. These steps correspond to the workflow shown in Figure 1.</p>
    </sec>
    <sec id="sec-4">
      <title>3.1 Template configuration</title>
      <p>The optional template configuration annotates a user-given dataset with information that is needed later for the template creation. The user inputs a CSV file and annotates each column with the corresponding semantics (time, measure, or category). Moreover, the time column needs information about the time type (integer, date, time) and the respective format. As shown in Figure 3, the user configures dimensions by drag and drop of the category names. Optionally, the user may indicate the primary attribute, which is the lowest-level category in every dimension. Once this configuration is finished, the template may be used for time series and/or data cube generation.</p>
      <p>After login, the user is greeted by an overview window that displays all datasets that he/she has already generated and those that are queued for generation, see Figure 4. Clicking "Create new" opens a wizard dialog that guides the user through the configuration. The first input by the user is a dataset name, which is needed to reference and handle the configuration and its results.</p>
      <sec id="sec-4-1">
        <title>3.2.1 Data cube configuration</title>
        <p>The next dialog, Figure 5, allows the configuration of the data cube. Loom offers two ways of configuration: template and synthetic.</p>
        <p>Template configuration: The user can select a template as the basis for the data cube skeleton. Templates are derived from real-world datasets or from existing data cubes and are ready-made configurations featuring default values for the number of dimensions, number of levels, and outdegrees. Thus, this type of configuration is more user-friendly than the synthetic one. Users can still customize the configuration by making selective adjustments to the template. It is possible to create different variants, e.g. smaller, larger, highly branched, etc., of an existing data cube.</p>
        <p>Synthetic configuration: Alternatively, this type of configuration allows users the full manual customization of the data cube skeleton. Users specify the number of dimensions, the number of levels per dimension, and the outdegree per level. These parameters can be provided fixed for each element or for the whole cube. In addition, Loom offers a random parameter distribution, allowing a probabilistic setting with randomly structured dimensions.</p>
        <p>Both template and synthetic configurations employ a random distribution of outdegrees to a certain extent. This means the specific number of facts that can be accommodated by the data cube skeleton is not known during configuration. As the user must know the actual structure of the data cube in order to configure an appropriate number of time series, our system offers a preview of the data cube skeleton directly after configuration. This preview is depicted in Figure 6 and shows all category attributes ordered by their size. The user may restart the modeling or accept the generated result.</p>
        <p>
          In many benchmarks such as TPC-H [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], it is common to set a scale factor SF for the database size, such as 10, 30, or 100. Loom adopts this functionality by using a scale dimension that consists of exactly one level with SF category attributes and can be used to adapt the size of the data cube as desired. Thus, the resulting data cube is inflated by this factor.
        </p>
        <p>While the parameters and configuration types described in this section provide users with a versatile and comfortable way of modeling the categorical information of a data cube, its configuration is not mandatory. If a user wishes for a plain set of unlabeled data, only a primary attribute is generated to identify the created time series.</p>
        <p>Time series configuration is split into three separate dialogs: time attribute, measure attributes, and time series models. The time attribute needs parameters such as the time type and the timeframe, i.e. start time, granularity, and end time.</p>
        <p>In the measure dialog, the user sets the number of measure columns of the dataset. For each measure attribute, the user sets a data type. Usually, a measure is of double-precision floating-point format, but in order to decrease the dataset size in a database, the user can also set single-precision floating-point or integer format.</p>
        <p>Most importantly, time series models have to be added to the configuration. For this, Loom offers four options:
Sampled from Template: All time series data is generated "as is" by taking values from given time series of existing datasets uploaded by users. According to the timeframe configuration, data is extracted from the original time series and added to the generated one. Mismatches with the timeframe or data cube size are resolved using duplication, cutting, granularity conversion, etc.</p>
        <p>
          Recombined from Template: Time series are created
via decomposition and recombination. Classical
decomposition strategies like decompose [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] and stl [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]
are used to extract the defining components trend, seasonality, and noise from an existing set of time series. By recombining these components, new time series are created and can be used to add volume and variety to a dataset.
        </p>
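        <p>A minimal sketch of such a recombination, assuming statsmodels' STL implementation and our own function name: it weaves the trend of one series with the seasonality and noise of another to obtain a new but similar series.</p>
        <preformat>
import numpy as np
from statsmodels.tsa.seasonal import STL

def recombine(series_a, series_b, period):
    """Decompose two series with STL and recombine their components
    (trend of A, seasonality and noise of B) into a new series."""
    dec_a = STL(np.asarray(series_a), period=period).fit()
    dec_b = STL(np.asarray(series_b), period=period).fit()
    return dec_a.trend + dec_b.seasonal + dec_b.resid
        </preformat>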
        <p>Modeled from Template: This option uses the third type of template we described in Section 2. Users can load the time series generators of an existing dataset along with their distribution. Customization is possible by changing the number of time series an individual generator creates or by removing/adding certain generators.</p>
        <p>Synthetic time series: The user configures a time series model from scratch, without relying on any given measures. Time series properties are defined freely and are synthetically generated, e.g. with a linearly rising trend, a regular Gauss-shaped seasonal component, and a normally distributed error series.</p>
      </sec>
      <sec id="sec-4-2">
        <title>3.2.3 Export configuration</title>
        <p>In the last dialog, the user sets the export configuration. The export destination has different options. CSV, SQL, and RData create flat files within the workspace. Alternatively, time series are exported to a database via a JDBC driver; in this case, the user sets up a database connection with a database location and login credentials. Finally, the user sets the schema. Loom supports different schemas: (1) Basic unnormalized export in a Universal schema, which creates high redundancy as it stores the categorical information with each value of a time series. (2) The partly resp. fully normalized Star and Snowflake schemas, which allow more compact exports and are common in database design. (3) The Parent-Child schema, which stores each functional dependency as a pair of parent attribute and child attribute in the respective dimension table.</p>
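        <p>For illustration, the Parent-Child schema can be sketched as follows; the helper and the Geography example from Figure 2 are our own, hypothetical rendering of the idea:</p>
        <preformat>
def parent_child_rows(hierarchy):
    """Emit one (parent, child) row per functional dependency in a
    dimension, given a dict mapping each attribute to its children."""
    for parent, children in hierarchy.items():
        for child in children:
            yield (parent, child)

# Example: the Geography dimension from Figure 2.
geography = {
    "New South Wales": ["Sydney", "Blue Mountains"],
    "Victoria": ["Melbourne", "Ballarat"],
}
rows = list(parent_child_rows(geography))
# [("New South Wales", "Sydney"), ("New South Wales", "Blue Mountains"), ...]
        </preformat>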
        <p>Particular attention has to be paid to the export when dimensions are unbalanced, i.e. their base categories are not on the same level. Figure 7 shows an example of such a dimension, where base category attributes are either on the region level (Darwin, Alice Springs) or on the state level (Australian Capital Territory). While both the universal schema and the parent-child schema support unbalanced dimensions, the star and snowflake schemas only support unbalanced dimensions when referential integrity can be guaranteed. To achieve this, the user may add a primary attribute column and a minimum outdegree. If the user did not set a data cube skeleton, then a primary attribute is automatically generated.</p>
        <p>Closing the configuration window brings the user back to the initial overview where a new entry has been added, see Figure 4. Now, the generation process can be invoked by clicking "Start Selected". The state switches to "In Progress" and indicates the current amount of data that has been generated.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. RELATED WORK</title>
      <p>Workload generation has been studied in many papers, each of which focuses either on the generation process or on time series modelling. To our knowledge, Loom is the first application that integrates both techniques and allows for flexible data cube and time series characteristics. In the following, we present selected sources that relate to our prototype.</p>
      <p>
        The IDAS dataset generator [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] offers data generation based on statistical distributions. Attributes form dependency graphs that are not necessarily lattices. The goal is the creation of a synthetic dataset for testing data mining algorithms. The workflow is similar to Loom's since the user creates a dataset by specifying the number of tables, setting the attributes, and initiating the data generation. Moreover, the authors experienced similar shortcomings with real data, such as privacy issues, a lack of training data, or unsatisfying categorical information. Still, measures do not depend on time; thus, time series generation is not supported.
      </p>
      <p>
        Schaffner and Januschowski [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] focus on benchmarking of databases under varying request rates. Request rates can be seen as time series of aggregate tenant traces. Since there is not enough real data available, they provide two methodologies for generating synthetic tenant traces. (1) The modelling approach fits a function as a model for a given aggregate tenant trace; the function's shape has been determined empirically. By adding an error term, they create diversity among the synthetic tenant traces. (2) Another way is the decomposition of time series by bootstrapping: a given trace is split into windows that are randomly shifted, resulting in synthetic traces. Both approaches are similar to Loom's template creation in that synthetic time series are either modelled or recombined from a template.
      </p>
      <p>
        The F-TPC-H benchmark [
        <xref ref-type="bibr" rid="ref7">7</xref>
          ] is a modified TPC-H benchmark for time series generation. This work reuses the given TPC-H schema in that customers submit orders of products for a certain quantity. While this quantity does not depend on time in TPC-H, the modified F-TPC-H adds this dependency in order to represent trend and seasonal effects via ARIMA models. Thus, this work represents a subset of Loom because it consists of a given schema and allows for synthetic time series in the sales domain. Loom additionally supports schema flexibility and allows for composing different time series generators.
      </p>
      <p>
        A specific use case for managing energy data is given by [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. This work proposes a unified Data Warehouse schema for storing workloads, given as information about actors like producers and consumers, offers, and time series about past measures. Their time series schema involves measures of different types such as energy, power, and price. Categorical information is necessary in order to store special annotations such as the aggregation level of a time series or additional information for each time series type. A time series is represented by several tables: (1) a time series table stores the primary attribute that identifies the time series and that links to each category, (2) another dimension table stores time frames with an identifier and the respective time frame information, (3) the fact table itself consists of the primary attribute, a measure column, and a foreign key to the time frame. Thus, this schema is not a traditional star or snowflake schema and cannot directly be covered by Loom. Moreover, Loom keeps time and measure together as a fact. This may increase redundancy, but we opt for this solution for several reasons: (1) there is no need for an additional description of time frames, (2) a time frame is encoded either as an integer or a short string, so the storage space is still affordable, (3) no join operation is needed in order to retrieve a time series. After all, time series from the energy domain may be generated by models integrated in Loom.
      </p>
    </sec>
    <sec id="sec-6">
      <title>5. CONCLUSIONS AND FUTURE WORK</title>
      <p>In this paper, we introduced Loom as a tool for generating large sets of synthetic time series data. Our prototype utilizes different time series generators to create multiple time series that share certain characteristics. In addition, our prototype allows the creation of dimensional categorical information for the description of time series. Besides the fully manual definition of a dataset, Loom features a template-driven approach that analyzes given datasets and allows the creation of synthetic variants of this template data.</p>
      <p>
        Currently, our approach only generates complete time series with equidistant time stamps. This is because forecast methods like exponential smoothing [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and ARIMA [
        <xref ref-type="bibr" rid="ref5">5</xref>
          ] rely on these properties, with few exceptions such as [
        <xref ref-type="bibr" rid="ref15">15</xref>
          ]. Part of our future work will be the integration of functions for generating incomplete time series with configurable gap patterns.
      </p>
      <p>Regarding data cube modelling, we assume that a dimension is a totally ordered set of levels, which is the case in most real-world datasets. However, there are exceptions, such as the modelling of a time dimension with the levels day, week, and month. There, a day functionally determines a week and a month, but a week does not determine the month. Such lattices are not supported by our prototype.</p>
      <p>Further future work will focus on the time series mapping to the data cube. Right now, we use a very simple approach for this and randomly distribute our time series over the data cube skeleton. This approach will be replaced with a more sophisticated method that allows the configuration of the distribution, such that, for example, time series from a certain generator only occur in a specified subset of the data cube skeleton.</p>
    </sec>
    <sec id="sec-7">
      <title>REFERENCES</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] CER Smart Metering Project. http://www.ucd.ie/issda/data, 2010.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] TPC Benchmark H. http://www.tpc.org/TPC_Documents_Current_Versions/pdf/tpch2.17.1.pdf, 2014.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] R data.table package. https://cran.r-project.org/web/packages/data.table/index.html, 2015.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] R forecast package. https://cran.r-project.org/web/packages/forecast/index.html, 2015.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>G. E. P.</given-names>
            <surname>Box</surname>
          </string-name>
          and
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Jenkins</surname>
          </string-name>
          .
          <article-title>Time series analysis forecasting and control</article-title>
          .
          <source>Holden-Day</source>
          , San Francisco,
          <year>1970</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R. B.</given-names>
            <surname>Cleveland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. S.</given-names>
            <surname>Cleveland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. E.</given-names>
            <surname>McRae</surname>
          </string-name>
          ,
          <string-name>
            <surname>and I. Terpenning.</surname>
          </string-name>
          <article-title>STL: A Seasonal-Trend Decomposition Procedure Based on Loess</article-title>
          .
          <source>Journal of O cial Statistics</source>
          ,
          <volume>6</volume>
          :3{
          <fpage>73</fpage>
          ,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>U.</given-names>
            <surname>Fischer</surname>
          </string-name>
          .
          <source>Forecasting in database systems</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>U.</given-names>
            <surname>Fischer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rosenthal</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.</given-names>
            <surname>Lehner</surname>
          </string-name>
          .
          <article-title>F2DB: The Flash-Forward Database System</article-title>
          .
          <source>In ICDE</source>
          , pages
          <volume>1245</volume>
          {
          <fpage>1248</fpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C. C.</given-names>
            <surname>Holt</surname>
          </string-name>
          .
          <article-title>Forecasting trends and seasonal by exponentially weighted averages</article-title>
          .
          <source>O ce of Naval Research Memorandum</source>
          ,
          <volume>52</volume>
          ,
          <year>1957</year>
          . Reprinted in:
          <source>International Journal of Forecasting</source>
          ,
          <volume>20</volume>
          (
          <issue>1</issue>
          ):
          <fpage>5</fpage>
          -
          <lpage>10</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Hyndman</surname>
          </string-name>
          .
          <article-title>Time series data library</article-title>
          . http://data.is/TSDLdemo. Accessed on 9-24-15.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Jeske</surname>
          </string-name>
          et al.
          <article-title>Generation of synthetic data sets for evaluating the accuracy of knowledge discovery systems</article-title>
          .
          <source>In Proc. of KDD</source>
          , pages
          <volume>756</volume>
          {
          <fpage>762</fpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>L.</given-names>
            <surname>Kaufman</surname>
          </string-name>
          and
          <string-name>
            <given-names>P. J.</given-names>
            <surname>Rousseeuw</surname>
          </string-name>
          . Finding Groups in Data. Wiley,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kendall</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Stuart</surname>
          </string-name>
          .
          <source>The Advanced Theory of Statistics</source>
          , volume
          <volume>3</volume>
          .
          <string-name>
            <surname>Gri</surname>
            <given-names>n</given-names>
          </string-name>
          ,
          <year>1983</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J.</given-names>
            <surname>Scha</surname>
          </string-name>
          ner and
          <string-name>
            <given-names>T.</given-names>
            <surname>Januschowski</surname>
          </string-name>
          .
          <article-title>Realistic tenant traces for enterprise DBaaS</article-title>
          .
          <source>In Workshops Proc. of ICDE</source>
          , pages
          <volume>29</volume>
          {
          <fpage>35</fpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>R. H.</given-names>
            <surname>Shumway</surname>
          </string-name>
          and
          <string-name>
            <surname>D. S.</surname>
          </string-name>
          <article-title>Sto er</article-title>
          .
          <source>Time Series Analysis and Its Applications</source>
          . Springer,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>L.</given-names>
            <surname>Siksnys</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Thomsen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T. B.</given-names>
            <surname>Pedersen. MIRABEL DW</surname>
          </string-name>
          <article-title>: managing complex energy data in a smart grid</article-title>
          .
          <source>In Proc. of DaWaK</source>
          , pages
          <volume>443</volume>
          {
          <fpage>457</fpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>P.</given-names>
            <surname>Vassiliadis</surname>
          </string-name>
          .
          <article-title>Modeling multidimensional databases, cubes and cube operations</article-title>
          .
          <source>In Proc. of SSDBM</source>
          , pages
          <volume>53</volume>
          {
          <fpage>62</fpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>