<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Workload-based Heuristics for Evaluation of Physical Database Architectures</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andreas LÜBCKE</string-name>
          <email>andreas.luebcke@ovgu.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin SCHÄLER</string-name>
          <email>martin.schaeler@ovgu.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Veit KÖPPEN</string-name>
          <email>veit.koeppen@ovgu.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gunter SAAKE</string-name>
          <email>gunter.saake@ovgu.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Computer Science, Otto-von-Guericke-University Magdeburg</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <fpage>3</fpage>
      <lpage>10</lpage>
      <abstract>
        <p>Database systems are widely used in different application domains. Therefore, it is difficult to decide which database management system best meets the requirements of a certain application. This observation also holds for scientific and statistical data management, due to new application and research fields. New requirements are often imposed on data management when unknown research and application areas are explored. Yet heuristics and tools to select an optimal database management system do not exist. In previous work, we proposed a decision framework based on application workload analyses. Our framework supports application performance analyses by mapping and merging workload information to patterns. In this paper, we present heuristics for performance estimation to select an optimal database management system for a given application. We show that these heuristics improve our decision framework by reducing complexity without loss of accuracy.</p>
      </abstract>
      <kwd-group>
        <kwd>Heuristics</kwd>
        <kwd>storage architecture</kwd>
        <kwd>design</kwd>
        <kwd>performance</kwd>
        <kwd>query processing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Database systems (DBS) are used pervasively in almost every branch of business activity.
Therefore, DBS have to meet different requirements for heterogeneous application
domains. New data management approaches are developed (e.g., NoSQL-DBMSs [
        <xref ref-type="bibr" rid="ref14 ref9">9,14</xref>
        ],
MapReduce [
        <xref ref-type="bibr" rid="ref12 ref13">12,13</xref>
        ], Cloud Computing [
        <xref ref-type="bibr" rid="ref16 ref3 ref7">3,16,7</xref>
        ], etc.) to make the growing amount of
data<sup>1</sup> manageable for special application domains. We argue that these approaches are
developed for special applications and require a high degree of expert knowledge for
usage, administration, and optimization. In this paper, however, we focus on relational
database management systems (DBMSs). Relational DBMSs are commonly
used for highly diverse applications and, moreover, are well known to
many IT-affine people.
      </p>
      <p>
        Relational DBMSs<sup>2</sup> were developed to manage data of daily business and to reduce
the paper trails of companies (e.g., financial institutions) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. This approach increasingly dominates
the way of data management that we know today as online transaction
processing (OLTP). Nowadays, faster and more accurate forecasts for revenues and expenses
are no longer enough. A new application domain has evolved that focuses on the analysis of
data to support business decisions. Codd et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] define this type of data analysis as
online analytical processing (OLAP). Consequently, two disjoint application domains
for relational data management exist with different scopes, impacts, and limitations (cf.
Section 1).
      </p>
      <p><sup>1</sup>Consider the data explosion problem [
        <xref ref-type="bibr" rid="ref22 ref26">22,26</xref>
        ].</p>
      <p><sup>2</sup>In the following, we use the term DBMS synonymously for relational DBMS.</p>
      <p>
        In recent years, business applications have shown a high demand for solutions that support
tasks from both OLTP and OLAP [
        <xref ref-type="bibr" rid="ref15 ref21 ref28 ref31 ref33 ref34">15,21,28,31,33,34</xref>
        ]; thus, coarse heuristics for
typical OLTP and/or OLAP applications have become obsolete (e.g., data warehouses
without updates always perform best on column-oriented DBMSs). Nevertheless, the new
approaches mentioned above also show impacts and limitations (e.g.,
in-memory DBMS only, focus on real-time or dimension updates, etc.), such that we argue there is
no DBMS that fits OLTP and OLAP in all application domains. Heuristics and current
approaches for physical design and query optimization only consider a certain
architecture<sup>3</sup> (e.g., design advisor [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ] and self-tuning [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] for row-oriented DBMSs or
equivalents for column-oriented DBMSs [
        <xref ref-type="bibr" rid="ref19 ref30">19,30</xref>
        ]). That is, the decision for a certain architecture
has to be made beforehand. Consequently, there is no approach that either advises
physical design spanning different architectures for OLTP, OLAP, and mixed OLTP/OLAP
workloads or estimates which architecture is optimal to process a query or database
operation.
      </p>
      <p>
        Therefore, we refine obsolete heuristics for the physical design of DBS (e.g., heuristics
from the classical OLTP domain). We consider OLTP, OLAP, and mixed application
domains for physical design. We present heuristics that propose the usage of row-oriented
DBMSs (row stores) or column-oriented DBMSs (column stores) under
certain circumstances. Furthermore, we present heuristics for query execution, or rather for
processing (relational) database operations on column and row stores. Our heuristics
show which query type and/or database operation performs better on a particular
architecture<sup>4</sup> and how single database operations affect the performance of a query or a
workload. We derive our heuristics from experiences in workload analyses with the help of
our decision model, presented in [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ].
      </p>
      <p>In the following section, we consider the main differences between row and
column stores as well as the advantages and disadvantages of column stores. Section 2
addresses our heuristics for the physical design of DBMSs concerning different storage
architectures (i.e., row or column store). In Section 3, we present heuristics for query
processing on row and column stores. Section 4 gives an overview of related research. Finally,
we summarize our discussion and give an outlook in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>1. Column or Row Store: Assets and Drawbacks</title>
      <p>
        In previous work, we already discussed differences between column and row stores
with respect to different subjects [
        <xref ref-type="bibr" rid="ref23 ref24 ref25">25,24,23</xref>
        ]. We summarize our major observations in
Table 1. Of course, the column-store architecture also has disadvantages. We only name
them here because they mostly mirror the advantages of the row-store architecture. Column
stores perform worse on update operations and concurrent non-read-only data access due
to partitioned data; thus, with frequent updates and consistency checks, tuple reconstructions
cause notable cost.
      </p>
      <p>
        Finally, neither architecture can outperform the other in the other's traditional application
domain. Other researchers confirm our consideration that one architecture
cannot substitute for the other architecture in its native domain (e.g., by simulating the other
architecture through a partitioned schema [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]).
      </p>
      <p><sup>3</sup>We use the term architecture synonymously for storage architecture.</p>
      <p><sup>4</sup>The term architecture refers to row- and column-oriented database architecture.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Heuristics on Physical Design</title>
      <p>Physical design of DBSs has been important for as long as DBMSs have existed. We do not restrict
our considerations to a certain architecture. Instead, we consider a set of heuristics that
can be used to forecast which architecture is more suitable for a given application.</p>
      <p>Some existing rules are still valid. First, pure OLTP applications perform
best on row stores. Second, classic OLAP applications with an ETL (extract, transform,
and load) process or (very) rare updates are well served by column stores<sup>5</sup>. In the
following, we consider a more interesting question: in which situations does one architecture
outperform the other, and in which cases do they perform nearly equivalently?</p>
      <p><bold>OLTP.</bold> For OLTP workloads, we simply recommend row stores, as has been done for
decades. A column store does not achieve competitive performance unless the column-store
architecture changes significantly.</p>
      <p><bold>OLAP.</bold> In this domain, one might expect a situation similar to that for OLTP workloads.
However, this is not true in general. We are aware that column stores outperform row
stores for many applications and/or queries in this domain; that is, for aggregates and
for accessing and processing a few columns. In most cases, column stores are the most
suitable choice for applications in this domain. Nevertheless, there exist complex OLAP queries
for which column stores lose their advantages (cf. Section 1). For these complex queries,
row stores can achieve competitive results, even if they consume more memory. These
queries have to be considered for architecture selection because they critically influence
the physical design estimation, even more so if they make up a significant share of
the workload.</p>
      <p><bold>OLTP/OLAP.</bold> In these scenarios, the physical design strongly depends on the ratio
between updates, point queries, and analytical queries. In our experience, column
stores perform about 100 times slower on OLTP transactions (updates, inserts, etc.) than
row stores. The actual gap is even larger because we do not consider concurrency (e.g.,
ACID); that is, we made this observation for single-user execution of transactions.
Assuming that transactions and analytical queries take the same time on average, a column store
pays off only if at most one transaction occurs per 100 analytical (OLAP) queries. Because analytical queries
take longer than a single transaction, the break-even ratio is smaller. Our experience shows that
10 executions of analytical queries on a column store yield a greater advantage than the
loss caused by a single transaction. If the ratio is smaller than 10:1 (analyses/Tx), we
cannot give a clear statement. We recommend using a row store in this situation unless you
know beforehand that the ratio will shift toward more analytical queries. If the ratio falls
below this threshold for a column store and the change is not temporary, then a system change is
appropriate. So, in mixed workloads, it is all about the ratio of analytical queries to
transactions. Note that the ratios 100:1 and 10:1 can change depending on the OLAP query type.
We address this issue in Section 3.</p>
      <p><sup>5</sup>Note: Not all column stores support updates; some support only ETL.</p>
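      <p>The ratio rule for mixed workloads can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the decision model itself: the names <monospace>WorkloadCounts</monospace>, <monospace>recommend_architecture</monospace>, and <monospace>BREAK_EVEN_RATIO</monospace> are hypothetical, the workload is reduced to two counters, and the influence of complex OLAP queries on the threshold is ignored.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass

# Break-even ratio of analytical queries to transactions named in the text
# (about 10:1). The text notes that it can shift with the OLAP query type.
BREAK_EVEN_RATIO = 10.0


@dataclass(frozen=True)
class WorkloadCounts:
    """Hypothetical workload summary: number of statements per class."""
    analytical_queries: int
    transactions: int


def recommend_architecture(workload: WorkloadCounts,
                           ratio_expected_to_grow: bool = False) -> str:
    """Return 'row store' or 'column store' following the coarse heuristics."""
    if workload.analytical_queries == 0:
        return "row store"       # pure OLTP (or an empty workload)
    if workload.transactions == 0:
        return "column store"    # classic OLAP with ETL or rare updates
    ratio = workload.analytical_queries / workload.transactions
    if ratio >= BREAK_EVEN_RATIO:
        return "column store"    # analytical gain outweighs transaction loss
    # Below the break-even ratio no clear statement is possible; default to a
    # row store unless the share of analytical queries is known to increase.
    return "column store" if ratio_expected_to_grow else "row store"
```
]]></preformat>
      <p>Under these assumptions, a workload with 50 analytical queries and 10 transactions (5:1) is assigned to a row store, whereas the same workload is assigned to a column store if the ratio is known to shift toward analytical queries.</p>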
      <p>
        Our heuristics can be used as a guideline for architecture decisions for certain
applications. That is, we select the most suitable architecture for an application and
afterwards use existing approaches (like IBM’s advisor [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ]) to tune the physical design of the
chosen architecture. If workload and DBSs are available for analysis, we recommend using
our decision model [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] to calculate an optimal architecture for a defined workload. The
presented heuristics for physical design extend our decision model to reduce calculation
costs (i.e., the solution space is pruned). Additionally, the heuristics make our decision model
applicable to scenarios where only restricted information is available.
      </p>
    </sec>
    <sec id="sec-4">
      <title>3. Heuristics on Query Execution</title>
      <p>In the following, we present heuristics for query execution on complex or hybrid DBS.
We assume a DBS that supports column- and row-store functionality or a complex
environment with at least two DBMSs that store data redundantly, where at least one DBMS
is set up for OLTP and one for OLAP. We only discuss query processing
heuristics for OLAP and OLTP/OLAP workloads due to our assumptions above and the
fact that there is no competitive alternative to row stores for OLTP.</p>
      <p>
        <bold>OLAP.</bold> In this domain, we face a huge amount of data that generally is not frequently
updated. Column stores are able to significantly reduce the amount of data through
aggressive compression. That is, more data can be loaded into main memory, and I/O between
storage and main memory is reduced. We state that this I/O reduction is the major benefit
of column stores. In our experience, row stores perform worse on many OLAP
queries because CPUs are often idle while
waiting for I/O from storage. Moreover, row stores read data that is not required because of
the physical layout of the data. In our example from the TPC-H benchmark [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ], one can see
that only a few columns of the lineitem relation have to be accessed (cf. Listing 1). In
contrast to column stores, row stores have to access the complete relation to answer this
query. We state that most OLAP queries fit this pattern. We recommend using column-store
functionality to answer this type of OLAP query as long as it only accesses a
minority of the columns of a relation for aggregation and predicate selection.
      </p>
      <preformat>select sum(l_extendedprice * l_discount) as revenue
from lineitem
where l_shipdate &gt;= date '1994-01-01'
  and l_shipdate &lt; date '1994-01-01' + interval '1' year
  and l_discount between .06 - 0.01 and .06 + 0.01
  and l_quantity &lt; 24;</preformat>
      <p>Listing 1. TPC-H query Q6</p>
      <p><bold>Complex Queries.</bold> OLAP is often used for complex analyses; thus, queries become more
complex, too. These queries describe complex issues and/or produce large business
reports. Our example (Listing 2) from the TPC-H benchmark could be part of a report (or
a more complex query). Complex queries access and/or aggregate many tuples; that is,
nearly the complete relation has to be read. This implies a number of tuple
reconstructions that significantly reduce the performance of column stores. Hence, row stores can
achieve competitive performance because nearly the complete relations have to be
accessed. Our example shows another reason for a number of tuple reconstructions: group
operations. Tuples have to be reconstructed before aggregating groups. Other reasons
for significant performance reduction by tuple reconstructions can be a large number of
predicate selections on different columns as well as complex joins. We argue that this
complex OLAP query type can be executed on both architectures. This fact can be used to
load-balance queries in complex environments and hybrid DBMSs.</p>
      <preformat>select c_count, count(*) as custdist
from (
  select c_custkey, count(o_orderkey)
  from customer left outer join orders on c_custkey = o_custkey
    and o_comment not like '%special%request%'
  group by c_custkey) as c_orders (c_custkey, c_count)
group by c_count
order by custdist desc, c_count desc;</preformat>
      <p>Listing 2. TPC-H query Q13</p>
      <p><bold>Mixed Workloads.</bold> In mixed workload environments, our first recommendation is to split
the workload into two parts, OLTP and OLAP. Both parts can be allocated to the
corresponding DBS with row- or column-store functionality. As we mentioned above, this
split methodology can also be used for load balancing if one DBS is too busy. With this
approach, we achieve competitive performance for both OLAP and OLTP because,
according to our assumptions, complex systems have to have at least one DBMS of each
architecture. Further, we state that a future hybrid system has to support both architectures,
too. Processing mixed workloads with our split approach has two additional advantages.
First, we can reduce issues with complex OLAP queries by allocating them correctly or dividing
them over both DBSs. Second, we can consider time-bound parameters for queries.</p>
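      <p>The query-level recommendations of this section can be illustrated with a small Python sketch. It is not an implementation of the decision model; <monospace>QueryProfile</monospace>, <monospace>candidate_stores</monospace>, and <monospace>route</monospace> are hypothetical names, and reading "a minority of columns" as "fewer than half" is an assumption made only for this example.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class QueryProfile:
    """Hypothetical per-query summary extracted from a workload."""
    is_transaction: bool            # update, insert, or point query (OLTP)
    columns_accessed: int = 0       # columns touched by the query
    columns_total: int = 1          # columns of the accessed relation(s)
    has_group_by: bool = False      # group operations force tuple reconstruction
    has_complex_join: bool = False  # complex joins do as well


def candidate_stores(q: QueryProfile) -> Tuple[str, ...]:
    """Return the stores that can run the query competitively, preferred first."""
    if q.is_transaction:
        return ("row",)             # no competitive alternative for OLTP
    # Assumption: "minority of columns" is read as fewer than half.
    few_columns = 2 * q.columns_accessed < q.columns_total
    if few_columns and not (q.has_group_by or q.has_complex_join):
        return ("column",)          # simple aggregation over few columns
    return ("column", "row")        # complex OLAP query: both architectures


def route(q: QueryProfile, load: Dict[str, float]) -> str:
    """Pick the least-loaded candidate; on a tie the preferred store wins."""
    return min(candidate_stores(q), key=lambda store: load[store])
```
]]></preformat>
      <p>With this sketch, a query that reads 4 of 16 columns without grouping always goes to the column store, while a query with grouping and a join is sent to whichever store currently reports the lower load.</p>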
      <p>
        As mentioned above, two integration methods for the query-processing heuristics
are available. First, we propose integration into a hybrid DBMS to decide where to
execute a query optimally. That is, the heuristics enable rule-based query optimization for
hybrid DBMSs, as known from row-store optimizers [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Second, we propose a global
manager on top of complex environments to distribute queries optimally (as known from
distributed DBMSs [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]). Our decision model analyzes queries and decides where to
distribute them. Afterwards, queries are optimized locally by the DBMS itself. Note that
time-bound requirements change over time and that system design determines how
up-to-date the data in the OLAP system is (e.g., whether real-time load [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] is available or not). We state that
such time-bound requirements can be passed as parameters to our decision model. That
is, even OLAP queries can be distributed to the OLTP part of the complex environment if
data in the OLAP part is insufficiently updated or, vice versa, if we have to ensure analysis on
the most up-to-date data. Additionally, such parameters can be used to allocate
high-priority queries to another part of the DBS if the query would otherwise have to wait.
      </p>
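      <p>How time-bound parameters could influence query placement is sketched below in Python. The sketch only illustrates the idea of passing such requirements as parameters; <monospace>TimeBounds</monospace>, <monospace>place_olap_query</monospace>, and the two measured inputs (data staleness and expected waiting time of the OLAP part, in seconds) are hypothetical and not taken from the decision model.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TimeBounds:
    """Hypothetical time-bound requirements attached to an OLAP query."""
    max_staleness_s: Optional[float] = None  # None: any data age is acceptable
    max_wait_s: Optional[float] = None       # None: the query may wait


def place_olap_query(bounds: TimeBounds,
                     olap_staleness_s: float,
                     olap_wait_s: float) -> str:
    """Return 'olap' or 'oltp' as the part that should execute the query."""
    # Data in the OLAP part is not fresh enough for this query.
    if bounds.max_staleness_s is not None and olap_staleness_s > bounds.max_staleness_s:
        return "oltp"
    # High-priority query that would otherwise wait too long.
    if bounds.max_wait_s is not None and olap_wait_s > bounds.max_wait_s:
        return "oltp"
    return "olap"
```
]]></preformat>
      <p>A query without time bounds stays on the OLAP part. A query that tolerates at most 60 seconds of staleness is redirected to the OLTP part when the last load into the OLAP part is older than that.</p>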
    </sec>
    <sec id="sec-5">
      <title>4. Related Work</title>
      <p>
        Several approaches have been developed to analyze and classify workloads (e.g., [
        <xref ref-type="bibr" rid="ref18 ref29">18,29</xref>
        ]).
These approaches recommend tuning and design of DBS, try to merge similar tasks, etc.
to improve the performance of DBSs. To date, workload-analysis approaches are
limited to the classification of queries into execution patterns or to design estimations for a certain
architecture; that is, the solution space is pruned before analysis and performance
estimations are done. This results in information loss due to an inappropriate reduction.
With our decision model and heuristics, we propose an approach that is independent of
architectural issues.
      </p>
      <p>
        Due to their success in the analytical domain, researchers devote more attention to
column stores, focusing on analysis performance [
        <xref ref-type="bibr" rid="ref1 ref19">1,19</xref>
        ] or on overcoming
update problems with separate storages [
        <xref ref-type="bibr" rid="ref1 ref15">15,1</xref>
        ]. However, a hybrid system in an architectural
sense does not exist. In-memory DBMSs have been developed in recent years (e.g.,
HyPer [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]) that satisfy requirements for mixed OLTP/OLAP workloads. Nevertheless, we
state that even today not all DBSs can run in-memory (due to monetary or environmental
constraints); thus, we propose a more general approach.
      </p>
      <p>
        Recommending indexes [
        <xref ref-type="bibr" rid="ref10 ref5">5,10</xref>
        ], materialized views [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], configurations [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], or
physical design in general [
        <xref ref-type="bibr" rid="ref36 ref37 ref6">6,37,36</xref>
        ] is a focus of research. However, to the best of our knowledge, all approaches
are limited to certain DBMSs or at least to one architecture.
Our approach is situated on top of these design and tuning approaches. That is, we
support a first coarse-grained design and tuning step for a global view on hybrid systems and
utilize existing approaches to optimize locally.
      </p>
      <p>
        In the literature, researchers compare the performance of column and row stores
considering certain scenarios. Cornell and Yu [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] focus on disk-access minimization
for transactions to improve query execution time. Abadi et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] compare different
approaches for column-oriented storage concerning analytical queries. We state that current
applications demand a combined consideration of OLTP and OLAP scenarios.
      </p>
      <p>
        In line with Zukowski et al. [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ], we observe benefits in converting from NSM (N-ary Storage Model, row-wise) to DSM
(Decomposition Storage Model, column-wise) and vice versa during query processing. In contrast to Zukowski et al., we do not
focus only on CPU trade-offs caused by tuple reconstructions. Our approach is used for both
scenarios and considers the overall benefit of global and local tuning.
      </p>
    </sec>
    <sec id="sec-6">
      <title>5. Conclusion</title>
      <p>In recent years, many new approaches as well as new requirements have driven the
performance of DBMSs in many application domains. Nevertheless, new approaches and
requirements also increase the complexity of DBMS selection, DBS design, and tuning for
a certain application. We focused our considerations on OLTP, OLAP, and OLTP/OLAP
workloads in general. That is, we considered which DBMS is most suitable.</p>
      <p>
        Therefore, we presented heuristics on design estimation and query execution for both
relational storage architectures, row and column stores. We use our decision model [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]
to observe the performance of several applications. Consequently, we derive heuristics
from our experiences while evaluating our decision model. We present these heuristics
for DBS design to select the most suitable DBMS a priori and to tune this system
afterwards. Our approach avoids misleading tuning if the architecture selection is wrong.
Furthermore, we present heuristics on query execution for both storage architectures. On the
one hand, we want to emphasize our heuristics for physical design. On the other hand,
we propose these heuristics for integration in a hybrid DBS, whether it is a real hybrid
DBMS or a complex system that consists of different DBMSs (at least one row and
one column store). In summary, there is no alternative to row stores in OLTP environments;
for OLAP applications, we recommend column stores, taking into account
that a number of complex OLAP queries can change this recommendation; and finally,
in mixed OLTP/OLAP environments, the focus is on the ratio of OLAP queries to OLTP
transactions (again taking very complex OLAP queries into account). Our heuristics are
a first step toward rule-based query optimization in hybrid systems/architectures.
      </p>
      <p>In future work, we will evaluate our heuristics using standard benchmarks to
achieve meaningful results. After evaluation, we will implement the heuristics in our
decision model; thus, we will obtain a design advisor for both relational storage architectures.
Finally, we plan to implement a hybrid DBMS using our heuristics for rule-based query
optimization.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>This paper has been funded in part by the German Federal Ministry of Education and
Science (BMBF) through the Research Program under Contract FKZ: 13N10817.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name><given-names>Daniel J.</given-names> <surname>Abadi</surname></string-name>.
          <article-title>Query execution in column-oriented database systems</article-title>.
          <source>PhD thesis</source>, Cambridge, MA, USA,
          <year>2008</year>. Adviser: Madden, Samuel.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name><given-names>Morton M.</given-names> <surname>Astrahan</surname></string-name>,
          <string-name><given-names>Mike W.</given-names> <surname>Blasgen</surname></string-name>,
          <string-name><given-names>Donald D.</given-names> <surname>Chamberlin</surname></string-name>,
          <string-name><given-names>Kapali P.</given-names> <surname>Eswaran</surname></string-name>,
          <string-name><given-names>Jim</given-names> <surname>Gray</surname></string-name>,
          <string-name><given-names>Patricia P.</given-names> <surname>Griffiths</surname></string-name>,
          <string-name><given-names>W. Frank</given-names> <surname>King</surname> <suffix>III</suffix></string-name>,
          <string-name><given-names>Raymond A.</given-names> <surname>Lorie</surname></string-name>,
          <string-name><given-names>Paul R.</given-names> <surname>McJones</surname></string-name>,
          <string-name><given-names>James W.</given-names> <surname>Mehl</surname></string-name>,
          <string-name><given-names>Gianfranco R.</given-names> <surname>Putzolu</surname></string-name>,
          <string-name><given-names>Irving L.</given-names> <surname>Traiger</surname></string-name>,
          <string-name><given-names>Bradford W.</given-names> <surname>Wade</surname></string-name>, and
          <string-name><given-names>Vera</given-names> <surname>Watson</surname></string-name>.
          <article-title>System R: Relational Approach to Database Management</article-title>.
          <source>ACM Trans. Database Syst.</source>
          <volume>1</volume>(<issue>2</issue>) (<year>1976</year>),
          <fpage>97</fpage>-<lpage>137</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name><given-names>Michael</given-names> <surname>Armbrust</surname></string-name>,
          <string-name><given-names>Armando</given-names> <surname>Fox</surname></string-name>,
          <string-name><given-names>Rean</given-names> <surname>Griffith</surname></string-name>,
          <string-name><given-names>Anthony D.</given-names> <surname>Joseph</surname></string-name>,
          <string-name><given-names>Randy H.</given-names> <surname>Katz</surname></string-name>,
          <string-name><given-names>Andrew</given-names> <surname>Konwinski</surname></string-name>,
          <string-name><given-names>Gunho</given-names> <surname>Lee</surname></string-name>,
          <string-name><given-names>David A.</given-names> <surname>Patterson</surname></string-name>,
          <string-name><given-names>Ariel</given-names> <surname>Rabkin</surname></string-name>,
          <string-name><given-names>Ion</given-names> <surname>Stoica</surname></string-name>, and
          <string-name><given-names>Matei</given-names> <surname>Zaharia</surname></string-name>.
          <article-title>Above the Clouds: A Berkeley View of Cloud Computing</article-title>.
          <source>Technical Report UCB/EECS-2009-28</source>, EECS Department, University of California, Berkeley,
          <month>Feb</month> <year>2009</year>.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name><given-names>Daniel J.</given-names> <surname>Abadi</surname></string-name>,
          <string-name><given-names>Samuel R.</given-names> <surname>Madden</surname></string-name>, and
          <string-name><given-names>Nabil</given-names> <surname>Hachem</surname></string-name>.
          <article-title>Column-stores vs. row-stores: How different are they really?</article-title>
          In <source>SIGMOD'08</source> (<year>2008</year>),
          <fpage>967</fpage>-<lpage>980</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name><given-names>Nicolas</given-names> <surname>Bruno</surname></string-name> and
          <string-name><given-names>Surajit</given-names> <surname>Chaudhuri</surname></string-name>.
          <article-title>To tune or not to tune? A lightweight physical design alerter</article-title>.
          In <source>VLDB'06</source>, <publisher-name>VLDB Endowment</publisher-name> (<year>2006</year>),
          <fpage>499</fpage>-<lpage>510</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name><given-names>Nicolas</given-names> <surname>Bruno</surname></string-name> and
          <string-name><given-names>Surajit</given-names> <surname>Chaudhuri</surname></string-name>.
          <article-title>An online approach to physical design tuning</article-title>.
          In <source>ICDE'07</source>, <publisher-name>IEEE</publisher-name> (<year>2007</year>),
          <fpage>826</fpage>-<lpage>835</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name><given-names>Rajkumar</given-names> <surname>Buyya</surname></string-name>,
          <string-name><given-names>Chee Shin</given-names> <surname>Yeo</surname></string-name>, and
          <string-name><given-names>Srikumar</given-names> <surname>Venugopal</surname></string-name>.
          <article-title>Market-Oriented Cloud Computing: Vision, Hype, and Reality for Delivering IT Services as Computing Utilities</article-title>.
          In <source>HPCC</source> (<year>2008</year>),
          <fpage>5</fpage>-<lpage>13</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name><given-names>Edgar F.</given-names> <surname>Codd</surname></string-name>,
          <string-name><given-names>Sally B.</given-names> <surname>Codd</surname></string-name>, and
          <string-name><given-names>Clynch T.</given-names> <surname>Salley</surname></string-name>.
          <article-title>Providing OLAP to User-Analysts: An IT Mandate</article-title>.
          Ann Arbor, Michigan (<year>1993</year>),
          <fpage>24</fpage>.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Fay</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Dean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Sanjay</given-names>
            <surname>Ghemawat</surname>
          </string-name>
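          ,
          <string-name>
            <given-names>Wilson C.</given-names>
            <surname>Hsieh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Deborah A.</given-names>
            <surname>Wallach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Michael</given-names>
            <surname>Burrows</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Tushar</given-names>
            <surname>Chandra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Andrew</given-names>
            <surname>Fikes</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Robert</given-names>
            <surname>Gruber</surname>
          </string-name>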
          .
          <article-title>Bigtable: A Distributed Storage System for Structured Data</article-title>
          . In OSDI (
          <year>2006</year>
          ),
          <fpage>205</fpage>
          -
          <lpage>218</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Surajit</given-names>
            <surname>Chaudhuri</surname>
          </string-name>
          and
          <string-name>
            <given-names>Vivek</given-names>
            <surname>Narasayya</surname>
          </string-name>
          .
          <article-title>Self-tuning database systems: A decade of progress</article-title>
          .
          <source>In VLDB'07</source>
          (
          <year>2007</year>
          ),
          <fpage>3</fpage>
          -
          <lpage>14</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Douglas W.</given-names>
            <surname>Cornell</surname>
          </string-name>
          and
          <string-name>
            <given-names>Philip S.</given-names>
            <surname>Yu</surname>
          </string-name>
          .
          <article-title>An effective approach to vertical partitioning for physical design of relational databases</article-title>
          .
          <source>Trans. Softw. Eng</source>
          .
          <volume>16</volume>
          (
          <issue>2</issue>
          ) (
          <year>1990</year>
          ),
          <fpage>248</fpage>
          -
          <lpage>258</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Dean</surname>
          </string-name>
          and
          <string-name>
            <given-names>Sanjay</given-names>
            <surname>Ghemawat</surname>
          </string-name>
          .
          <article-title>MapReduce: Simplified Data Processing on Large Clusters</article-title>
          .
          <source>In OSDI</source>
          (
          <year>2004</year>
          ),
          <fpage>137</fpage>
          -
          <lpage>150</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Dean</surname>
          </string-name>
          and
          <string-name>
            <given-names>Sanjay</given-names>
            <surname>Ghemawat</surname>
          </string-name>
          .
          <article-title>MapReduce: simplified data processing on large clusters</article-title>
          .
          <source>Commun. ACM</source>
          <volume>51</volume>
          (
          <issue>1</issue>
          ) (
          <year>2008</year>
          ),
          <fpage>107</fpage>
          -
          <lpage>113</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Giuseppe</given-names>
            <surname>DeCandia</surname>
          </string-name>
          , Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian,
          <string-name>
            <given-names>Peter</given-names>
            <surname>Vosshall</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Werner</given-names>
            <surname>Vogels</surname>
          </string-name>
          .
          <article-title>Dynamo: Amazon's highly available key-value store</article-title>
          .
          <source>In SOSP</source>
          (
          <year>2007</year>
          ),
          <fpage>205</fpage>
          -
          <lpage>220</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Clark D.</given-names>
            <surname>French</surname>
          </string-name>
          .
          <article-title>Teaching an OLTP database kernel advanced datawarehousing techniques</article-title>
          .
          <source>In ICDE'97</source>
          (
          <year>1997</year>
          ),
          <fpage>194</fpage>
          -
          <lpage>198</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Ian T.</given-names>
            <surname>Foster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Yong</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ioan</given-names>
            <surname>Raicu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Shiyong</given-names>
            <surname>Lu</surname>
          </string-name>
          .
          <article-title>Cloud Computing and Grid Computing 360-Degree Compared</article-title>
          . CoRR, abs/0901.0131,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Goetz</given-names>
            <surname>Graefe</surname>
          </string-name>
          and
          <string-name>
            <given-names>David J.</given-names>
            <surname>DeWitt</surname>
          </string-name>
          .
          <article-title>The EXODUS Optimizer Generator</article-title>
          .
          <source>In SIGMOD'87</source>
          (
          <year>1987</year>
          ),
          <fpage>160</fpage>
          -
          <lpage>172</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Marc</given-names>
            <surname>Holze</surname>
          </string-name>
          , Claas Gaidies, and
          <string-name>
            <given-names>Norbert</given-names>
            <surname>Ritter</surname>
          </string-name>
          .
          <article-title>Consistent on-line classification of DBS workload events</article-title>
          .
          <source>In CIKM'09</source>
          (
          <year>2009</year>
          ),
          <fpage>1641</fpage>
          -
          <lpage>1644</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Stratos</given-names>
            <surname>Idreos</surname>
          </string-name>
          .
          <article-title>Database Cracking: Towards Auto-tuning Database Kernels</article-title>
          .
          <source>PhD thesis</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Eva</given-names>
            <surname>Kwan</surname>
          </string-name>
          , Sam Lightstone,
          <string-name>
            <given-names>K. Bernhard</given-names>
            <surname>Schiefer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Adam J.</given-names>
            <surname>Storm</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Leanne</given-names>
            <surname>Wu</surname>
          </string-name>
          .
          <article-title>Automatic database configuration for DB2 Universal Database: Compressing years of performance expertise into seconds of execution</article-title>
          .
          <source>In BTW'03</source>
          ,
          <publisher-name>GI</publisher-name>
          (
          <year>2003</year>
          ),
          <fpage>620</fpage>
          -
          <lpage>629</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Alfons</given-names>
            <surname>Kemper</surname>
          </string-name>
          and Thomas Neumann.
          <article-title>HyPer: A hybrid OLTP&amp;OLAP main memory database system based on virtual memory snapshots</article-title>
          .
          <source>In ICDE'11</source>
          (
          <year>2011</year>
          ),
          <fpage>195</fpage>
          -
          <lpage>206</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Henry F.</given-names>
            <surname>Korth</surname>
          </string-name>
          and
          <string-name>
            <given-names>Abraham</given-names>
            <surname>Silberschatz</surname>
          </string-name>
          .
          <article-title>Database Research Faces the Information Explosion</article-title>
          .
          <source>Commun. ACM</source>
          <volume>40</volume>
          (
          <issue>2</issue>
          ) (
          <year>1997</year>
          ),
          <fpage>139</fpage>
          -
          <lpage>142</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Lübcke</surname>
          </string-name>
          , Veit Köppen, and
          <string-name>
            <given-names>Gunter</given-names>
            <surname>Saake</surname>
          </string-name>
          .
          <article-title>A Decision Model to Select the Optimal Storage Architecture for Relational Databases</article-title>
          .
          <source>In Proceedings of the Fifth IEEE International Conference on Research Challenges in Information Science</source>
          , RCIS (
          <year>2011</year>
          ),
          <fpage>74</fpage>
          -
          <lpage>84</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Lübcke</surname>
          </string-name>
          and
          <string-name>
            <given-names>Gunter</given-names>
            <surname>Saake</surname>
          </string-name>
          .
          <article-title>A Framework for Optimal Selection of a Storage Architecture in RDBMS</article-title>
          .
          <source>In DB&amp;IS</source>
          (
          <year>2010</year>
          ),
          <fpage>65</fpage>
          -
          <lpage>76</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Lübcke</surname>
          </string-name>
          .
          <article-title>Challenges in Workload Analyses for Column and Row Stores</article-title>
          .
          <source>In Grundlagen von Datenbanken</source>
          (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Ina</given-names>
            <surname>Naydenova</surname>
          </string-name>
          and
          <string-name>
            <given-names>Kalinka</given-names>
            <surname>Kaloyanova</surname>
          </string-name>
          .
          <article-title>Sparsity Handling and Data Explosion in OLAP Systems</article-title>
          .
          <source>In MCIS</source>
          (
          <year>2010</year>
          ),
          <fpage>62</fpage>
          -
          <lpage>70</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>M. Tamer</given-names>
            <surname>Özsu</surname>
          </string-name>
          and
          <string-name>
            <given-names>Patrick</given-names>
            <surname>Valduriez</surname>
          </string-name>
          .
          <source>Principles of Distributed Database Systems</source>
          .
          <publisher-name>Springer</publisher-name>
          ,
          <edition>3rd edition</edition>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Hasso</given-names>
            <surname>Plattner</surname>
          </string-name>
          .
          <article-title>A common database approach for OLTP and OLAP using an in-memory column database</article-title>
          .
          <source>In SIGMOD'09</source>
          ,
          <publisher-name>ACM</publisher-name>
          (
          <year>2009</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>2</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Kimmo E. E.</given-names>
            <surname>Raatikainen</surname>
          </string-name>
          .
          <article-title>Cluster Analysis and Workload Classification</article-title>
          .
          <source>SIGMETRICS Performance Evaluation Review</source>
          <volume>20</volume>
          (
          <issue>4</issue>
          ) (
          <year>1993</year>
          ),
          <fpage>24</fpage>
          -
          <lpage>30</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Stonebraker</surname>
          </string-name>
          , Daniel J. Abadi, Adam Batkin, Xuedong Chen, Mitch Cherniack, Miguel Ferreira, Edmond Lau, Amerson Lin, Samuel Madden,
          <string-name>
            <given-names>Elizabeth J.</given-names>
            <surname>O'Neil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Patrick E.</given-names>
            <surname>O'Neil</surname>
          </string-name>
          , Alex Rasin, Nga Tran, and
          <string-name>
            <given-names>Stanley B.</given-names>
            <surname>Zdonik</surname>
          </string-name>
          .
          <article-title>C-Store: A column-oriented DBMS</article-title>
          .
          <source>In VLDB'05</source>
          ,
          <publisher-name>VLDB Endowment</publisher-name>
          (
          <year>2005</year>
          ),
          <fpage>553</fpage>
          -
          <lpage>564</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>Ricardo Jorge</given-names>
            <surname>Santos</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jorge</given-names>
            <surname>Bernardino</surname>
          </string-name>
          .
          <article-title>Real-time data warehouse loading methodology</article-title>
          .
          <source>In IDEAS'08</source>
          (
          <year>2008</year>
          ),
          <fpage>49</fpage>
          -
          <lpage>58</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <collab>Transaction Processing Performance Council</collab>
          .
          <source>TPC BENCHMARK™ H</source>
          . White Paper,
          <year>April 2010</year>
          .
          <article-title>Decision Support Standard Specification, Revision 2.11.0</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>Alejandro A.</given-names>
            <surname>Vaisman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Alberto O.</given-names>
            <surname>Mendelzon</surname>
          </string-name>
          , Walter Ruaro, and Sergio G. Cymerman.
          <article-title>Supporting dimension updates in an OLAP server</article-title>
          .
          <source>Information Systems</source>
          <volume>29</volume>
          (
          <issue>2</issue>
          ) (
          <year>2004</year>
          ),
          <fpage>165</fpage>
          -
          <lpage>185</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>Youchan</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Lei</given-names>
            <surname>An</surname>
          </string-name>
          , and Shuangxi Liu.
          <article-title>Data Updating and Query in Real-Time Data Warehouse System</article-title>
          .
          <source>In CSSE'08</source>
          (
          <year>2008</year>
          ),
          <fpage>1295</fpage>
          -
          <lpage>1297</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>Marcin</given-names>
            <surname>Zukowski</surname>
          </string-name>
          , Niels Nes, and
          <string-name>
            <given-names>Peter</given-names>
            <surname>Boncz</surname>
          </string-name>
          .
          <article-title>DSM vs. NSM: CPU performance tradeoffs in block-oriented query processing</article-title>
          .
          <source>In Proceedings of the 4th international workshop on Data management on new hardware</source>
          ,
          <source>DaMoN'08</source>
          ,
          <publisher-name>ACM</publisher-name>
          (
          <year>2008</year>
          ),
          <fpage>47</fpage>
          -
          <lpage>54</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>Daniel C.</given-names>
            <surname>Zilio</surname>
          </string-name>
          , Jun Rao, Sam Lightstone,
          <string-name>
            <given-names>Guy M.</given-names>
            <surname>Lohman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Adam J.</given-names>
            <surname>Storm</surname>
          </string-name>
          , Christian Garcia-Arellano, and Scott Fadden.
          <article-title>DB2 Design Advisor: Integrated automatic physical database design</article-title>
          .
          <source>In VLDB'04</source>
          ,
          <publisher-name>VLDB Endowment</publisher-name>
          (
          <year>2004</year>
          ),
          <fpage>1087</fpage>
          -
          <lpage>1097</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>Daniel C.</given-names>
            <surname>Zilio</surname>
          </string-name>
          , Calisto Zuzarte, Sam Lightstone, Wenbin Ma,
          <string-name>
            <given-names>Guy M.</given-names>
            <surname>Lohman</surname>
          </string-name>
          , Roberta Cochrane, Hamid Pirahesh, Latha S. Colby, Jarek Gryz, Eric Alton, Dongming Liang, and
          <string-name>
            <given-names>Gary</given-names>
            <surname>Valentin</surname>
          </string-name>
          .
          <article-title>Recommending materialized views and indexes with IBM DB2 Design Advisor</article-title>
          .
          <source>In ICAC'04</source>
          (
          <year>2004</year>
          ),
          <fpage>180</fpage>
          -
          <lpage>188</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>