<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Cost (%CPU)</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Workload Representation across Different Storage Architectures for Relational DBMS</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andreas Lübcke</string-name>
          <email>andreas.luebcke@ovgu.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Veit Köppen</string-name>
          <email>veit.koeppen@ovgu.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gunter Saake</string-name>
          <email>gunter.saake@ovgu.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Computer Science, University of Magdeburg</institution>
          ,
          <addr-line>Magdeburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2011</year>
      </pub-date>
      <volume>16</volume>
      <issue>32</issue>
      <fpage>79</fpage>
      <lpage>84</lpage>
      <abstract>
        <p>Database systems range from small-scale, stripped-down database programs for embedded devices with a minimal footprint to large-scale OLAP applications for server devices. For relational database management systems, two storage architectures have been introduced: the row-oriented and the column-oriented architecture. To select the optimal architecture for a certain application, we need workload information and statistics. In this paper, we present a workload representation approach that enables us to represent workloads across different DBMSs and architectures. Our approach also supports fine-grained workload analyses based on database operations.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        New requirements for database applications [
        <xref ref-type="bibr" rid="ref23 ref26 ref27">23, 26, 27</xref>
        ] have emerged
in recent years. Therefore, database management system (DBMS)
vendors and researchers have developed new technologies, e.g.,
column-oriented DBMSs (column stores) [
        <xref ref-type="bibr" rid="ref1 ref22 ref30">1, 22, 30</xref>
        ]. As new approaches are
developed to satisfy these requirements, the number of
candidates in the decision process has also
increased. Moreover, new application fields make it more complex
to find a suitable DBMS for a certain use case.
      </p>
      <p>
        We need statistics to reach a suitable design decision. These
statistics have to be represented in a system-independent way to allow sound and
comparable decisions. This implies that the workload
representation must be independent of the storage architecture. In this paper,
we introduce a new approach to aggregating and maintaining workload statistics
across different DBMSs and architectures. We
showed in [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] that query-based workload analyses, as described
in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], are not suitable to select the optimal storage architecture. To
overcome drawbacks of query-based workload analyses, we define
workload patterns based on database operations. We introduce a
workload decomposition algorithm that enables us to analyze query
parts. Workload patterns represent the decomposed workloads to
compare the performance of database operations for column and
row stores. These workload patterns contain all statistics needed
for cost estimations. We simulate the statistics gathering process
with an exemplary workload.
      </p>
    </sec>
    <sec id="sec-2">
      <title>STATISTICS REPRESENTATION</title>
      <p>
        To select the optimal storage architecture, we have to analyze a
given workload; thus, we need to decompose this workload. We
have to map single operations of a workload (at least of one query)
and their optimizer statistics to evaluable patterns. Therefore, we
present our pattern framework which stores all necessary statistics
for subsequent performance analyses. In [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], we illustrate the
procedure of our decision process regarding the storage architecture
selection. Below, we outline the design of our pattern framework.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Pattern Types</title>
      <p>To analyze the influence of single operations, we propose three
patterns for operations in workload queries: tuple operations,
aggregations and groupings, and
join operations. We define a number of sub-patterns for each of
these three to characterize particular operations more precisely within
the patterns. This way, we support analyses based on the three
patterns and, additionally, fine-grained analyses based on sub-patterns,
i.e., we can determine where the majority of costs emerge within a
workload (at least one query).</p>
      <p>First, the tuple operation pattern covers all operations that
process or modify tuples, e.g., selection and sort operations. We propose
this pattern for performance analyses because row stores operate
directly on tuples, whereas column stores have to reconstruct
tuples at additional cost. We identify the following sub-patterns:</p>
      <p>Sort/order operation: A sort/order operation creates sequences of
tuples and affects all attributes of a tuple. We consider
duplicate elimination a sort operation because an internal sort
is necessary to find duplicates.</p>
      <sec id="sec-3-1">
        <p>Data access and tuple reconstruction: Row stores always access
tuples, whereas column stores must reconstruct tuples to access
more than one column.</p>
        <p>Projection: Projection returns a subset of tuple attribute values
and causes (normally) no additional costs for query
execution.</p>
        <p>Filtering: Filtering selects tuples from tables or intermediate
results based on a selection predicate, e.g., selection in the
WHERE clause and the HAVING clause.</p>
        <p>
          Second, we cover all column processing operations in the
aggregation and grouping pattern, e.g., COUNT and MIN/MAX. We
propose this pattern as counterpart to the tuple operation pattern.
The operations of this pattern work only on single columns except
for grouping operations which can also process several columns,
e.g., GROUP BY CUBE. Due to column-wise partitioned data and
single column processing, column stores perform well on
aggregations (cf. [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]). We identify the following sub-patterns:
        </p>
        <p>Min/Max operation: The min/max operation provides the
minimum/maximum value of a single attribute (column).</p>
        <p>Sum operation: This operation provides the sum of all values in
one column.</p>
        <p>Count operation: The count operation provides the number of
attribute values in a column. COUNT(*) only provides the
number of key values; thus, it also processes a single column.</p>
        <p>Average operation: Like the sum operation, the average operation
processes all values of a single column, but it can have
different characteristics, e.g., mean (avg) or median.</p>
        <p>Group by operation: This operation merges equal values
according to a certain column and results in a subset of tuples.
Grouping across a number of columns is also possible.</p>
        <p>Cube operations: The cube operation computes all feasible
combinations of groupings for the selected dimensions. This
requires the power set of the aggregating columns, i.e., n
attributes are computed by 2<sup>n</sup> GROUP BY clauses.</p>
        <p>Standard deviation: The standard deviation (or variance) is a
statistical measure for the variability of a data set and is
computed by a two-pass algorithm, i.e., it requires two passes over the data.</p>
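        <p>As a minimal sketch of the cube sub-pattern, the following Python snippet enumerates the power set of a list of grouping columns and thus the 2<sup>n</sup> GROUP BY clauses that a cube over n attributes implies. The function name cube_grouping_sets and the column names are hypothetical and serve only as an illustration; the snippet does not reproduce the implementation of any particular DBMS.</p>
        <preformat>
```python
from itertools import combinations


def cube_grouping_sets(columns):
    """Return every subset of `columns` (the power set), largest first."""
    subsets = []
    for size in range(len(columns), -1, -1):
        subsets.extend(combinations(columns, size))
    return subsets


# Hypothetical dimension columns, chosen only for illustration.
dimensions = ["region", "product", "year"]
grouping_sets = cube_grouping_sets(dimensions)

# A cube over n = 3 attributes yields 2 ** 3 = 8 groupings.
print(len(grouping_sets))
for subset in grouping_sets:
    # The empty subset stands for the grand total (no grouping column).
    print("GROUP BY", ", ".join(subset) if subset else "()")
```
</preformat>
        <p>The number of groupings doubles with every additional dimension, which is one reason why cube operations are kept as a separate sub-pattern.</p>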
        <p>
          Third, the join pattern matches all join operations of a workload.
Join operations are costly tasks for DBMSs. This pattern shows
differences of join techniques between column and row stores, e.g.,
join processing on compressed columns or on bitmaps. Within
this pattern, we evaluate the different processing techniques against
each other. Consequently, we define the following sub-patterns:
        </p>
        <p>
          Vector based: The column-oriented architecture naturally supports
vector-based join techniques, whereas row stores have to
create and maintain additional structures, e.g., bitmap (join) indexes [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ].
        </p>
        <p>Non-vector based: This pattern matches "classic" join techniques
(from row stores<sup>1</sup>) to differentiate the performance of
vector-based and non-vector-based joins; thus, we can estimate
the effect of the architecture on the join behavior.</p>
        <p>We only propose these two sub-patterns because the join concepts,
e.g., merge or nested loop join, exist for both architectures. Hence,
we assume that there is no necessity to map each join concept into
its own sub-pattern. Figure 1 shows all introduced patterns and
their relation to each other based on our exemplary workload.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Dependencies between Patterns</title>
      <p>Database operations are not always independent from each other.
We can identify dependencies between the following patterns: join,
filtering, sort/order, group/cube, and data access pattern.</p>
      <p>Join operations innately imply tuple selections (filtering pattern).
However, the tuple selection itself is part of the join operation by
definition, thus we assume that an additional decomposition of join
operations is not necessary. Moreover, new techniques would have
to be implemented to further decompose join operations and gather
the necessary statistics. Hence, the administrative cost for
tuning would increase noticeably. As a side effect, the comparison
of join techniques belonging to different architectures would no
longer be possible because of the system-specific decomposition.</p>
      <p>We distinguish two types of sort/order operations:
implicit and explicit sorts. An explicit sort is requested by the
workload or user; thus, we consider this operation in the sort/order
pattern. In contrast, we do not consider an implicit sort
in the sort/order pattern because it is introduced by the
optimizer, e.g., for a sort-merge join. Instead, we attribute its cost
to the operation that triggers it; for example, we assign all costs
of grouping, including the sort costs, to the GROUP BY (or CUBE) pattern
to sustain comparability.</p>
      <p>Third, tuple reconstruction is part of several operations for
column stores. We add these costs to the tuple operation pattern. This
sustains the comparability of operations across the architectures
because row stores are not affected by tuple reconstruction.</p>
      <p><sup>1</sup> Some column stores also support these join techniques.</p>
      <p>We assume that further workload decomposition is not meaningful
because the additional administrative costs would affect both the performance of
existing systems and the comparability of performance
between the architectures for certain workload parts. These
impacts would impair the usability of our pattern
framework.</p>
    </sec>
    <sec id="sec-5">
      <title>QUERY DECOMPOSITION</title>
      <p>In this section, we introduce the query decomposition approach.
First, we illustrate the (re)used DBMS functionality and how we
gather the necessary statistics from existing systems. Second, we
introduce the mapping of decomposed query parts to our established
workload patterns and show a decomposition result by example.
Our approach is applicable to each relational DBMS. Nevertheless,
we use a closed-source system for the following
considerations because its optimizer/query plan output
is more detailed and easier to understand. More detailed information
results in more accurate recommendations.</p>
    </sec>
    <sec id="sec-6">
      <title>Query Plans</title>
      <p>
        A workload decomposition based on database operations is
necessary to select the optimal storage architecture (cf. [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]).
Therefore, we use query plans [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] which exist in each relational DBMS.
On the one hand, we reuse database functionality and avoid new
calculation costs for optimization. On the other hand, we make
use of system optimizer estimations that are necessary for physical
database design [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>
        Based on query plans, we gather statistics directly from DBMS
and use the optimizer cost estimations. The example in Listing 1
shows an SQL query and we transform this to a query plan in
Table 1 [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. Table 1 already offers some statistics such as number
of rows, accessed bytes by the operation, or costs. Nevertheless,
Table 1 shows only an excerpt of gathered statistics. All possible
values for query plan statistics can be found in [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], Chapter 12.10.
Hence, we are able to determine the performance of operations on
a certain architecture (in our example, a row store) by statistics such
as CPU costs and/or I/O costs. In addition to performance
evaluation by estimated costs, we can gather further statistics from
query plans that may influence the performance of an operation on a
certain architecture, e.g., cardinality. For column stores, the
cardinality of an operation can indirectly affect performance if the operation
processes several columns because column stores then have to perform
tuple reconstructions, i.e., a high cardinality means many
reconstructions. Moreover, we use meta-data to estimate the influence of
the data itself on the performance, e.g., we can compute the selectivity
of attributes.
      </p>
    </sec>
    <sec id="sec-7">
      <title>From Query Plans to Workload Patterns</title>
      <p>
        We have to map the gathered statistics from DBMS to our
workload patterns. We use a second example [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] (Listing 2 and Table 2)
to simulate a minimal workload instead of a single query. In the
following, we illustrate the mapping approach using the
examples in Listings 1 and 2. In our naming convention, we define a unique
number<sup>2</sup> that identifies each query of the workload within our mapping
algorithm, i.e., 1.X represents query 1 (Listing 1) and 2.X
represents query 2 (Listing 2). Furthermore, we reuse the
operation IDs from the query plans (Tables 1 and 2) in the second hierarchy
level (for X), e.g., 1.4 is the operation with ID 4 of query 1 (cf.
Table 1). In the following, we refer to the CPU costs in Tables 1 and 2.
      </p>
      <p><sup>2</sup> In the following considerations, we start with 1, which represents
the first query.</p>
      <p>Listing 1:</p>
      <preformat>SELECT *
FROM employees e JOIN departments d
  ON e.department_id = d.department_id
ORDER BY last_name;</preformat>
      <table-wrap>
        <label>Table 1</label>
        <table>
          <thead>
            <tr><th>ID</th><th>Operation</th><th>Name</th></tr>
          </thead>
          <tbody>
            <tr><td>0</td><td>SELECT STATEMENT</td><td></td></tr>
            <tr><td>1</td><td>SORT ORDER BY</td><td></td></tr>
            <tr><td>2</td><td>HASH JOIN</td><td></td></tr>
            <tr><td>3</td><td>TABLE ACCESS FULL</td><td>DEPARTMENTS</td></tr>
            <tr><td>4</td><td>TABLE ACCESS FULL</td><td>EMPLOYEES</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>The first query (Listing 1) is decomposed into four patterns. First,
we see the data access operations on the departments (ID 3)
and employees (ID 4) tables in the corresponding query
plan in Table 1. The total cost of the data access operations is 5.
Second, the join operation (ID 2) is executed with a hash join
algorithm. The hash join itself costs only 1 because the costs in Table 1
are cumulative: the costs of its children (5) and its own
cost (1) sum to 6 for ID 2. Third, the sort
operation (ID 1) implements the ORDER BY clause with a cost of
1. The total cost of all operations processed so far is 7. Fourth,
the select statement (ID 0) represents the projection and causes
no additional cost (the total remains 7). Following our naming convention, the
identifiers 1.0 to 1.4 represent the operations of our first
query (Listing 1) in Figure 1.</p>
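      <p>The step from cumulative plan costs to the cost of each single operation can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation. The cumulative costs of IDs 0 to 2 and the plan shape follow the description above; the split of the data access total (5) between ID 3 and ID 4 is not stated in the text and is assumed here purely for illustration.</p>
      <preformat>
```python
# Cumulative optimizer costs per operation ID for the plan of Listing 1.
# IDs 0, 1, and 2 follow the text (7, 7, 6). The values for IDs 3 and 4
# are ASSUMED: the text only states that they add up to 5.
cumulative_cost = {0: 7, 1: 7, 2: 6, 3: 3, 4: 2}

# Plan tree: select -> sort -> hash join -> two full table accesses.
children = {0: [1], 1: [2], 2: [3, 4], 3: [], 4: []}


def own_cost(op_id):
    """Cost caused by the operation itself, excluding its children."""
    child_total = sum(cumulative_cost[c] for c in children[op_id])
    return cumulative_cost[op_id] - child_total


own = {op_id: own_cost(op_id) for op_id in cumulative_cost}

# Prefix the query number to obtain the identifiers 1.0 to 1.4.
for op_id, cost in sorted(own.items()):
    print(f"1.{op_id}", cost)
```
</preformat>
      <p>Summing the own costs of all operations gives the cumulative cost of the root operation again, which is a simple consistency check for a decomposed plan.</p>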
      <p>We also decompose the second example (Listing 2) into four
operation types (cf. Table 2). First, IDs 3, 7, and 8 represent
the data access operations and cause total costs of 14. Second,
the optimizer estimates both hash joins (ID 2 and 6) with no
(additional) costs because their costs consist only of the
summed costs of their children (ID 3, 4 and ID 7, 8). Third,
the GROUP BY clause in Listing 2 is implemented by
hash-based grouping operations (ID 1 and ID 5). The cost of each
HASH GROUP BY is 1, so the total cost of this operation type
is 2. Fourth, the projection (ID 0) and the sum operation,
both represented by the select statement, again cause no additional costs. If
the sum operation caused costs, it would be represented by a
separate operation (ID). Following our naming convention, the
identifiers 2.0 to 2.8 represent the operations of the second query
(Listing 2) in Figure 1. The view (ID 2.4) is not represented in
our workload patterns because its costs are already covered by its
child operations (ID 2.5-2.8).</p>
      <p>In our examples, we summarize single operations of similar types
(five for example query two). In the following, we list the five
operation types and assign them to our workload patterns and their
sub-patterns that we introduced in Section 2. The join operations
of our example queries (ID 1.2, 2.2, and 2.6) are assigned to
the non-vector based join pattern. We assign the operations with
ID 1.3, 1.4, 2.3, 2.7, and 2.8 to the data access
sub-pattern of the tuple operation pattern. We also assign the
projections (ID 1.0 and 2.0) and the sort operation (ID 1.1) to
the tuple operation pattern. Finally, we assign the group by
operations (ID 2.1 and 2.5) to the group by sub-pattern within the
aggregation and grouping pattern. We present the result in Figure 1,
where we only show the ID and cost of each operation for reasons
of readability. Note that we do not need to extract
statistics directly from existing systems. Our pattern framework is
system-independent; thus, we are also able to use already extracted (or
aggregated) data as well as estimated values.</p>
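      <p>The assignment of plan operations to workload patterns can be sketched as a lookup followed by an aggregation per sub-pattern. The following Python snippet is a minimal illustration under stated assumptions: the dictionary PATTERN_OF and the function aggregate are hypothetical names, and only operations whose own cost is stated in the text are listed. The data access operations are omitted from the sample data because their individual costs are not given in the text.</p>
      <preformat>
```python
from collections import defaultdict

# Plan operation name -> (pattern, sub-pattern), as described in the text.
PATTERN_OF = {
    "TABLE ACCESS FULL": ("tuple operation", "data access"),
    "SELECT STATEMENT": ("tuple operation", "projection"),
    "SORT ORDER BY": ("tuple operation", "sort/order"),
    "HASH GROUP BY": ("aggregation and grouping", "group by"),
    "HASH JOIN": ("join", "non-vector based"),
}

# (operation ID, operation name, own cost) for the two example queries.
operations = [
    ("1.0", "SELECT STATEMENT", 0),
    ("1.1", "SORT ORDER BY", 1),
    ("1.2", "HASH JOIN", 1),
    ("2.0", "SELECT STATEMENT", 0),
    ("2.1", "HASH GROUP BY", 1),
    ("2.2", "HASH JOIN", 0),
    ("2.5", "HASH GROUP BY", 1),
    ("2.6", "HASH JOIN", 0),
]


def aggregate(ops):
    """Sum the own costs of all operations per (pattern, sub-pattern)."""
    totals = defaultdict(int)
    for _op_id, name, cost in ops:
        totals[PATTERN_OF[name]] += cost
    return dict(totals)


totals = aggregate(operations)
for key, cost in sorted(totals.items()):
    print(key, cost)
```
</preformat>
      <p>Because the lookup only depends on operation names and costs, the same aggregation works for values that were extracted, aggregated, or estimated elsewhere.</p>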
    </sec>
    <sec id="sec-8">
      <title>Operations in Column Stores</title>
      <p>
        We state that we do not need a separate decomposition algorithm
for column stores, i.e., the query plan operations of column stores
can also be mapped to our workload patterns. As a representative example, we
take the C-Store/Vertica query plan operations
introduced in [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] and map them to our workload patterns as follows:
      </p>
      <p>
        Decompress: Decompress is mapped to the data access pattern.
This operation decompresses data for subsequent operations
in the query plan that cannot operate on compressed data
(cf. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]).
      </p>
      <p>Select: Select is equivalent to the selection of relational algebra
with the exception that the result is represented as a bitstring.
Hence, we map it to the filtering pattern.</p>
      <p>Mask: Mask operates on bitstrings and returns only those values
whose associated bits in the bitstring are 1. Consequently,
we map mask to the filtering pattern.</p>
      <p>Project: Projection is equivalent to the projection of relational
algebra; thus, this operation is mapped to the projection pattern.</p>
      <p>Sort: This operation sorts the columns of a C-Store projection
according to a (set of) sort column(s). This technique is
equivalent to sort operations on projected tuples, i.e., we can map
this operation to the sort/order pattern.</p>
      <p>
        Aggregation Operators: These operations compute aggregations
and groupings like in SQL [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]; thus, we directly map these
operations to the corresponding sub-patterns in the
aggregation and grouping pattern.
      </p>
      <p>Concat: Concat combines C-Store projections sorted in the same
order into a new projection. We regard this operation as tuple
reconstruction and map it to the corresponding pattern.</p>
      <p>Permute: This operation permutes the order of columns in C-Store
projections according to the order given by a join index. It
prevents additional replication overhead that would emerge
through the creation of join indexes and C-Store projections in
several orders. This operation is used for joins; thus, we map
its cost to the join pattern.</p>
      <p>
        Join: We map this operation to the join pattern and distinguish two
join types. First, if tuples are already reconstructed then we
process them as row stores, i.e., we map this join type to
the non-vector based join pattern. Second, the join operation
only processes columns that are needed to evaluate the join
predicate. The join result is a set of pairs of positions in the
input columns [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This join type can operate on compressed
data and can use vector-based join techniques; thus,
we map this join type to the vector based join pattern.
      </p>
      <p>Bitstring Operations: These operations (AND, OR, NOT)
process bitstrings and compute a new bitstring with respect to
the corresponding logical operator. These operations
implement the combination of different selection predicates.
Therefore, we map these operations to the filtering pattern.</p>
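      <p>The mapping described above can be summarized as a small lookup table. The following Python snippet is only a sketch: the names CSTORE_TO_PATTERN and map_operator are hypothetical, and the join operator is handled separately because its pattern depends on whether the tuples have already been reconstructed.</p>
      <preformat>
```python
# C-Store/Vertica plan operator -> workload (sub-)pattern, as listed above.
CSTORE_TO_PATTERN = {
    "Decompress": "data access",
    "Select": "filtering",
    "Mask": "filtering",
    "Project": "projection",
    "Sort": "sort/order",
    "Aggregation": "aggregation and grouping",
    "Concat": "tuple reconstruction",
    "Permute": "join",
    "Bitstring": "filtering",
}


def map_operator(name, tuples_reconstructed=False):
    """Return the workload pattern for a C-Store plan operator.

    A join on already reconstructed tuples is treated like a row-store
    join; a join on the columns of the join predicate only is mapped to
    the vector based join pattern.
    """
    if name == "Join":
        if tuples_reconstructed:
            return "non-vector based join"
        return "vector based join"
    return CSTORE_TO_PATTERN[name]


for operator in ["Decompress", "Mask", "Concat", "Permute"]:
    print(operator, "maps to", map_operator(operator))
print("Join (reconstructed) maps to", map_operator("Join", True))
print("Join (columns only) maps to", map_operator("Join"))
```
</preformat>
      <p>Supporting a further column store would, under this sketch, only require another lookup table and leave the patterns themselves unchanged.</p>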
      <p>Finally, we state that our approach can be used for each
relational DBMS. Each relational DBMS is based on the relational
data model, so it is also based on relational algebra
in some manner. Thus, we can reduce or map its
operations to our workload patterns; in the worst case, we have to add an
architecture-specific operation for hybrid DBMSs to our patterns,
e.g., tuple reconstruction for column stores. For a future (relational)
hybrid storage architecture, such an operation could be necessary
to map the cost of conversions between row- and column-oriented
structures.</p>
      <sec id="sec-8-1">
        <title>Listing 2: Example SQL query (119) [21]</title>
        <preformat>SELECT c.cust_last_name, SUM(revenue)
FROM customers c, v_orders o
WHERE c.credit_limit &gt; 2000
  AND o.customer_id(+) = c.customer_id
GROUP BY c.cust_last_name;</preformat>
        <p>Table 2 (excerpt, object names): CUSTOMERS, V_ORDERS, ORDERS, ORDER_ITEMS.</p>
        <p>Figure 1 (excerpt), group by pattern: ID 2.1, cost 1; ID 2.5, cost 1.</p>
      </sec>
    </sec>
    <sec id="sec-9">
      <title>DEMONSTRATING EXAMPLE</title>
      <p>We simulate the workload with the standardized
TPC-H benchmark (2.8.0) to show the usability of our approach. We use
the DBMSs Oracle 11gR2 Enterprise Edition and Infobright ICE
3.3.1 for our experiments<sup>3</sup>. We run all 22 TPC-H queries and
extract the optimizer statistics from the DBMSs. For reasons of
clarity and comprehensibility, we only map three representative TPC-H
queries, namely Q2, Q6, and Q14, to the workload patterns.</p>
      <p>The query structure, syntax, and execution time are not sufficient
to estimate the query behavior on different storage architectures.
We introduced an approach based on database operations that
provides analyses to find long-running operations (bottlenecks).
Moreover, we want to figure out the reasons for the behavior; thus, we have
to use additional metrics. We select the I/O cost to compare the
DBMSs and summarize the optimizer output in Table 3. Following
our previous naming convention, we define the query IDs according
to their TPC-H query number, i.e., we map the queries with the
IDs 2, 6, and 14. The operations are identified by their query
plan number (IDs in Table 3); thus, the root operation of TPC-H
query Q2 has the ID 2.0 in Figure 2. All values in Table 3 are
given in Kbytes. The given values are the input costs of each
operation except for the table access costs because no information on the input
costs of table access operations is available. Note that the
granularity of Oracle’s cost measurements is at the byte level, whereas the
measurements of ICE are at the data pack (65k) level.</p>
      <p><sup>3</sup> We also wanted to evaluate our approach with the DBMS
solutions from Vertica and Sybase because both DBMSs use cost-based
optimizers and we would be able to obtain more expressive results.
We requested permission to use the systems for our evaluation,
but the decision is still pending.</p>
      <p>In Figure 2, we present our workload patterns with the I/O costs of
the corresponding TPC-H queries. As mentioned before, the
projection operation causes no additional costs. Hence, the I/O
costs in Table 3 and Figure 2 represent the size of the final results.
The stored information can be analyzed and aggregated in decision
models with any necessary granularity. In our example, we only
sum up all values of the data access pattern for each query to
calculate the I/O costs per query in Kbytes. For these three queries, all
results and intermediate results are smaller than the available main
memory; thus, no data has to be reread subsequently. Oracle reads
1452.133 Kbytes for query Q2 and takes 8.14 seconds. ICE needs
41 seconds and accesses 2340 Kbytes. We suppose that the DBMS with
minimal I/O cost performs best. Our assumption is confirmed for
query Q14: Oracle accesses 7020.894 Kbytes and computes the
query in 22.55 seconds, whereas ICE computes it in 3 seconds and
reads 6240 Kbytes. Nevertheless, we cannot confirm our assumption
for query Q6. Oracle (3118 Kbytes) accesses less data than ICE
(5980 Kbytes), but ICE (2 seconds) computes this query about ten times
faster than Oracle (22.64 seconds). Hence, we cannot identify a
definite correlation for our sample workload.</p>
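      <p>The comparison above can be replayed with a few lines of Python. The snippet is a sketch for illustration only: the function name least_io_is_fastest is hypothetical, and the measurements are read with a decimal point, e.g., 1452.133 Kbytes and 8.14 seconds, which is an assumption about the notation of the reported values.</p>
      <preformat>
```python
# Per query and system: (I/O cost in Kbytes, runtime in seconds),
# taken from the comparison in the text.
measurements = {
    "Q2": {"Oracle": (1452.133, 8.14), "ICE": (2340.0, 41.0)},
    "Q6": {"Oracle": (3118.0, 22.64), "ICE": (5980.0, 2.0)},
    "Q14": {"Oracle": (7020.894, 22.55), "ICE": (6240.0, 3.0)},
}


def least_io_is_fastest(per_system):
    """Check whether the system with minimal I/O is also the fastest."""
    least_io = min(per_system, key=lambda name: per_system[name][0])
    fastest = min(per_system, key=lambda name: per_system[name][1])
    return least_io == fastest


result = {query: least_io_is_fastest(m) for query, m in measurements.items()}
for query, holds in result.items():
    print(query, "minimal I/O is fastest:", holds)
```
</preformat>
      <p>With these values, the assumption holds for Q2 and Q14 but not for Q6, which matches the observation that I/O cost alone does not determine the runtime.</p>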
      <p>We state that I/O cost alone is not sufficient to estimate the
behavior of database operations. However, I/O cost is one important
metric to describe the performance behavior on different storage
architectures because one of the crucial achievements of column stores is
the reduction of data size (i.e., I/O cost) by aggressive compression.
The I/O cost also gives an insight into the main memory necessary for
database operations and into whether operations have to access
secondary memory. Hence, we can estimate whether database operations are
completely computed in main memory or whether data have to be reread or read
stepwise<sup>4</sup>.</p>
      <p><sup>4</sup> Recall the performance gap (circa 10<sup>5</sup>) between main
memory and HDDs.</p>
      <p>Figure 2 (excerpt; panel label: Oracle; pattern labels: non-vector based join, group by). Values shown for Q2 (41 sec): ID4: 65; ID5: 65; ID6: 845; ID7: 65; ID8: 260; ID10: 65; ID11: 65; ID12: 65; ID13: 845; ID3: 1300; ID9: 1040.</p>
    </sec>
    <sec id="sec-10">
      <title>RELATED WORK</title>
      <p>
        Several column stores have been proposed [
        <xref ref-type="bibr" rid="ref1 ref14 ref30">1, 14, 30</xref>
        ] for OLAP
applications. But all systems are pure column stores and do not
support any row store functionality. Thus, a storage architecture
decision between row and column store is necessary. Abadi et
al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] compare row and column store with respect to performance
on the star schema benchmark. They simulate column store
architecture by indexing every single column or vertical partitioning of
the schema. They show that using column store architecture in a
row store is possible but the performance is poor. In this paper, we
do not compare end to end performance of DBMSs or architectures.
We support sound and comparable analyses based on database
operations across different DBMSs with our approach. We do not
discuss approaches like DSM [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], hybrid NSM/DSM schemes [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ],
or PAX [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] because the differences to state-of-the-art column stores
have already been discussed, e.g., by Harizopoulos et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>
        There are systems available which attempt to fill the gap between
a column and a row store. C-Store [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] uses two different storage
areas to overcome the update problems of column stores. A related
approach brings together a column store approach and the typical
row store domain of OLTP data [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. However, we do not develop
hybrid solutions that attempt to fill this gap for now.
      </p>
      <p>
        There exist a number of design advisors which are related to our
work, e.g., IBM DB2 Configuration Advisor [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. The IBM
Configuration Advisor recommends pre-configurations for databases.
Zilio et al. [
        <xref ref-type="bibr" rid="ref28 ref29">28, 29</xref>
        ] introduce an approach that gathers statistics like
our approach directly from DBMSs. The statistics are used to
advise index and materialized view configurations. Similarly,
Chaudhuri et al. [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ] present two approaches which illustrate the whole
tuning process using constraints such as a space threshold. However,
these approaches operate on single systems instead of comparing
two or more systems. In contrast to the mentioned approaches, our
approach does not consider tuning configurations, indexes, etc.
      </p>
      <p>
        Another approach for OLAP applications is Ingres/Vectorwise,
which integrates the Vectorwise (formerly MonetDB/X100)
architecture into the Ingres product family [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. In cooperation with
Vectorwise, Ingres develops a new storage manager, ColumnBM, for
the new Ingres/Vectorwise. However, the integration of the new
architecture into the existing environment remains unclear [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
    </sec>
    <sec id="sec-11">
      <title>CONCLUSION</title>
      <p>In recent years, column stores have shown good results for DWH
applications and have often outperformed established row stores.
However, new requirements arise in the DWH domain that cannot be
satisfied by column stores alone. The new requirements also demand
row store functionality, e.g., real-time DWHs need
sufficient update processing. Thereby, the complexity of the design
process increases because we have to choose the optimal architecture
for given applications. We showed with an experiment that
workload analyses based on query structure and syntax are not sufficient
to select the optimal storage architecture. Consequently, we
suggested a new approach based on database operations. We
introduced workload patterns that contain all workload information
across the architectures, e.g., statistics and operation costs. We
also presented a workload decomposition approach based on
existing database functionality that maps the operations of a given workload
to our workload patterns. We illustrated the methodology of our
decomposition approach using an example workload. Subsequently,
we stated that a separate decomposition algorithm for column stores
is not needed and that our approach is transparent
to any workload and any storage architecture based on the
relational data model. In the evaluation, we showed the usability of
our approach. Additionally, we demonstrated the comparability of
different systems using different architectures even if the systems
provide different information with respect to their query execution.
The decision process can be repeated periodically; thus, the storage
architecture selection is not static. Moreover, our approach can be
used for optimizer decisions in a hybrid relational DBMS that has
to select the storage method for parts of the data.</p>
      <p>
        In future work, we will investigate two strategies to implement
our workload patterns in a prototype. First, we will use a new DBS
to periodically export statistics and operation costs, which we map
to our workload patterns. This way, prediction computation will not
affect the performance of the analyzed systems. Second, we will adapt
existing approaches [
        <xref ref-type="bibr" rid="ref17 ref5">5, 17</xref>
        ] to automatically gather statistics, e.g.,
by mapping statistics and workload patterns directly into a graph
structure (query graph model). Additionally, aggregated or estimated
values from other sources can be stored. We will perform detailed
studies on OLAP, OLTP, and mixed workloads to gather meaningful
values for predictions.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Abadi</surname>
          </string-name>
          .
          <article-title>Query execution in column-oriented database systems</article-title>
          .
          <source>PhD thesis</source>
          , Cambridge, MA, USA,
          <year>2008</year>
          . Adviser: Madden, Samuel.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Abadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Madden</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Hachem</surname>
          </string-name>
          .
          <article-title>Column-stores vs. row-stores: How different are they really?</article-title>
          <source>In SIGMOD '08</source>
          , pages
          <fpage>967</fpage>
          -
          <lpage>980</lpage>
          , New York, NY, USA,
          <year>2008</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ailamaki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. J.</given-names>
            <surname>DeWitt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Hill</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Skounakis</surname>
          </string-name>
          .
          <article-title>Weaving relations for cache performance</article-title>
          .
          <source>In VLDB '01</source>
          , pages
          <fpage>169</fpage>
          -
          <lpage>180</lpage>
          , San Francisco, CA, USA,
          <year>2001</year>
          . Morgan Kaufmann Publishers Inc.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Astrahan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. W.</given-names>
            <surname>Blasgen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. D.</given-names>
            <surname>Chamberlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. P.</given-names>
            <surname>Eswaran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. P.</given-names>
            <surname>Griffiths</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. F.</given-names>
            <surname>King</surname>
            <suffix>III</suffix>
          </string-name>
          ,
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Lorie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. R.</given-names>
            <surname>McJones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. W.</given-names>
            <surname>Mehl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. R.</given-names>
            <surname>Putzolu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. L.</given-names>
            <surname>Traiger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. W.</given-names>
            <surname>Wade</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Watson</surname>
          </string-name>
          .
          <article-title>System R: Relational approach to database management</article-title>
          .
          <source>ACM TODS</source>
          ,
          <volume>1</volume>
          (
          <issue>2</issue>
          ):
          <fpage>97</fpage>
          -
          <lpage>137</lpage>
          ,
          <year>1976</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>N.</given-names>
            <surname>Bruno</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Chaudhuri</surname>
          </string-name>
          .
          <article-title>To tune or not to tune? A lightweight physical design alerter</article-title>
          .
          <source>In VLDB '06</source>
          , pages
          <fpage>499</fpage>
          -
          <lpage>510</lpage>
          . VLDB Endowment,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>N.</given-names>
            <surname>Bruno</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Chaudhuri</surname>
          </string-name>
          .
          <article-title>An online approach to physical design tuning</article-title>
          .
          <source>In ICDE '07</source>
          , pages
          <fpage>826</fpage>
          -
          <lpage>835</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chaudhuri</surname>
          </string-name>
          and
          <string-name>
            <given-names>V.</given-names>
            <surname>Narasayya</surname>
          </string-name>
          .
          <article-title>Autoadmin “what-if” index analysis utility</article-title>
          .
          <source>In SIGMOD '98</source>
          , pages
          <fpage>367</fpage>
          -
          <lpage>378</lpage>
          , New York, NY, USA,
          <year>1998</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>G. P.</given-names>
            <surname>Copeland</surname>
          </string-name>
          and
          <string-name>
            <given-names>S. N.</given-names>
            <surname>Khoshafian</surname>
          </string-name>
          .
          <article-title>A decomposition storage model</article-title>
          .
          <source>In SIGMOD '85</source>
          , pages
          <fpage>268</fpage>
          -
          <lpage>279</lpage>
          , New York, NY, USA,
          <year>1985</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D. W.</given-names>
            <surname>Cornell</surname>
          </string-name>
          and
          <string-name>
            <given-names>P. S.</given-names>
            <surname>Yu</surname>
          </string-name>
          .
          <article-title>An effective approach to vertical partitioning for physical design of relational databases</article-title>
          .
          <source>IEEE Trans. Softw. Eng.</source>
          ,
          <volume>16</volume>
          (
          <issue>2</issue>
          ):
          <fpage>248</fpage>
          -
          <lpage>258</lpage>
          ,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Finkelstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schkolnick</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Tiberio</surname>
          </string-name>
          .
          <article-title>Physical database design for relational databases</article-title>
          .
          <source>ACM TODS</source>
          ,
          <volume>13</volume>
          (
          <issue>1</issue>
          ):
          <fpage>91</fpage>
          -
          <lpage>128</lpage>
          ,
          <year>1988</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Harizopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Abadi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Madden</surname>
          </string-name>
          .
          <article-title>Performance tradeoffs in read-optimized databases</article-title>
          .
          <source>In VLDB '06</source>
          , pages
          <fpage>487</fpage>
          -
          <lpage>498</lpage>
          . VLDB Endowment,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12] Ingres/Vectorwise.
          <article-title>Ingres/VectorWise sneak preview on the Intel Xeon processor 5500 series-based platform</article-title>
          .
          <source>White Paper</source>
          ,
          <year>September 2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>E.</given-names>
            <surname>Kwan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lightstone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. B.</given-names>
            <surname>Schiefer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Storm</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Wu</surname>
          </string-name>
          .
          <article-title>Automatic database configuration for DB2 Universal Database: Compressing years of performance expertise into seconds of execution</article-title>
          .
          <source>In BTW '03</source>
          , pages
          <fpage>620</fpage>
          -
          <lpage>629</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T.</given-names>
            <surname>Legler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Lehner</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Ross</surname>
          </string-name>
          .
          <article-title>Data mining with the SAP NetWeaver BI Accelerator</article-title>
          .
          <source>In VLDB '06</source>
          , pages
          <fpage>1059</fpage>
          -
          <lpage>1068</lpage>
          . VLDB Endowment,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>A.</given-names>
            <surname>Lübcke</surname>
          </string-name>
          .
          <article-title>Cost-effective usage of bitmap-indexes in DS-Systems</article-title>
          .
          <source>In 20th Workshop "Grundlagen von Datenbanken"</source>
          , pages
          <fpage>96</fpage>
          -
          <lpage>100</lpage>
          . School of Information Technology, International University in Germany,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A.</given-names>
            <surname>Lübcke</surname>
          </string-name>
          .
          <article-title>Challenges in workload analyses for column and row stores</article-title>
          .
          <source>In 22nd Workshop "Grundlagen von Datenbanken"</source>
          , volume
          <volume>581</volume>
          .
          CEUR-WS.org,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A.</given-names>
            <surname>Lübcke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Geist</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Bubke</surname>
          </string-name>
          .
          <article-title>Dynamic construction and administration of the workload graph for materialized views selection</article-title>
          .
          <source>Int. Journal of Information Studies</source>
          ,
          <volume>1</volume>
          (
          <issue>3</issue>
          ):
          <fpage>172</fpage>
          -
          <lpage>181</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>A.</given-names>
            <surname>Lübcke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Köppen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Saake</surname>
          </string-name>
          .
          <article-title>A decision model to select the optimal storage architecture for relational databases</article-title>
          .
          <source>RCIS</source>
          , France, May
          <year>2011</year>
          . IEEE. To appear.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <collab>Oracle Corp</collab>
          .
          <source>Oracle Database Concepts 11g Release (11.2)</source>
          . 14 Memory Architecture (Part Number E10713-05),
          <year>March 2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <collab>Oracle Corp</collab>
          .
          <source>Oracle Performance Tuning Guide 11g Release (11.2)</source>
          . 12 Using EXPLAIN PLAN (Part Number E10821-05),
          <year>March 2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <collab>Oracle Corp</collab>
          .
          <source>Oracle Performance Tuning Guide 11g Release (11.2)</source>
          . 11 The Query Optimizer (Part Number E10821-05),
          <year>March 2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>H.</given-names>
            <surname>Plattner</surname>
          </string-name>
          .
          <article-title>A common database approach for OLTP and OLAP using an in-memory column database</article-title>
          .
          <source>In SIGMOD '09</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>2</lpage>
          , New York, NY, USA,
          <year>2009</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Santos</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Bernardino</surname>
          </string-name>
          .
          <article-title>Real-time data warehouse loading methodology</article-title>
          .
          <source>In IDEAS '08</source>
          , pages
          <fpage>49</fpage>
          -
          <lpage>58</lpage>
          , New York, NY, USA,
          <year>2008</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>J.</given-names>
            <surname>Schaffner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bog</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Krüger</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Zeier</surname>
          </string-name>
          .
          <article-title>A hybrid row-column OLTP database architecture for operational reporting</article-title>
          .
          <source>In BIRTE '08</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>M.</given-names>
            <surname>Stonebraker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Abadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Batkin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cherniack</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ferreira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Lau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Madden</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. J.</given-names>
            <surname>O'Neil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. E.</given-names>
            <surname>O'Neil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rasin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Tran</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S. B.</given-names>
            <surname>Zdonik</surname>
          </string-name>
          .
          <article-title>C-Store: A column-oriented DBMS</article-title>
          .
          <source>In VLDB '05</source>
          , pages
          <fpage>553</fpage>
          -
          <lpage>564</lpage>
          . VLDB Endowment,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Vaisman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. O.</given-names>
            <surname>Mendelzon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Ruaro</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S. G.</given-names>
            <surname>Cymerman</surname>
          </string-name>
          .
          <article-title>Supporting dimension updates in an OLAP server</article-title>
          .
          <source>Information Systems</source>
          ,
          <volume>29</volume>
          (
          <issue>2</issue>
          ):
          <fpage>165</fpage>
          -
          <lpage>185</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>An</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <article-title>Data updating and query in real-time data warehouse system</article-title>
          .
          <source>In CSSE '08</source>
          , pages
          <fpage>1295</fpage>
          -
          <lpage>1297</lpage>
          , Washington, DC, USA,
          <year>2008</year>
          . IEEE Computer Society.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>D. C.</given-names>
            <surname>Zilio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lightstone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Lohman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Storm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Garcia-Arellano</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Fadden</surname>
          </string-name>
          .
          <article-title>DB2 Design Advisor: Integrated automatic physical database design</article-title>
          .
          <source>In VLDB '04</source>
          , pages
          <fpage>1087</fpage>
          -
          <lpage>1097</lpage>
          . VLDB Endowment,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>D. C.</given-names>
            <surname>Zilio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zuzarte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lightstone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Lohman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cochrane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Pirahesh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. S.</given-names>
            <surname>Colby</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gryz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Alton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Liang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Valentin</surname>
          </string-name>
          .
          <article-title>Recommending materialized views and indexes with IBM DB2 Design Advisor</article-title>
          .
          <source>In ICAC '04</source>
          , pages
          <fpage>180</fpage>
          -
          <lpage>188</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>M.</given-names>
            <surname>Zukowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Boncz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Nes</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Heman</surname>
          </string-name>
          .
          <article-title>MonetDB/X100 - a DBMS in the CPU cache</article-title>
          .
          <source>IEEE Data Eng. Bulletin</source>
          ,
          <volume>28</volume>
          (
          <issue>2</issue>
          ):
          <fpage>17</fpage>
          -
          <lpage>22</lpage>
          ,
          <year>June 2005</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>