<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Distributed Processing in the Query Optimizer</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="editor">
          <string-name>Vancouver, Canada</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>supervised by Prof. Dr. Thomas Neumann, Technische Universität München</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Distributed database systems gain relevance both in industry and academia. However, existing research on query optimization for relational database systems focuses largely on systems running on a single machine. Work on distributed systems neglects available workload information in database systems. In this work, we present optimization strategies to fully leverage the potential of distributed systems running on modern cloud architectures with fast networks. We focus on the optimal assignment of tasks to compute nodes and the joint optimization of join ordering and distribution layout of data. Furthermore, we introduce distributed plans and simulation-based evaluations using a new cost model for computation time.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Considering the very high bandwidth available in modern cloud systems, distributed processing becomes increasingly attractive, not only to handle large data sizes but also to improve processing performance. Still, good physical plans are key for efficient execution. We argue that distributed execution engines require changes to existing query optimizers for optimal performance. Existing join ordering algorithms yield suboptimal results because they fail to model the cost of data transfers. Furthermore, optimizers need to spread computational load while avoiding waiting on data when assigning tasks to machines. We plan to contribute the following components to investigate new optimization opportunities:</p>
      <sec id="sec-1-1">
        <title>1. A strategy to transform query plans for efficient distributed execution.</title>
        <p>Methods like hash-distributed joins require repartitioning of their input data. Our strategy chooses favorable distributions and introduces the necessary data shuffling.
2. A new operator-based computation time estimation method that allows us to compare the cost of transferring data over the network against local processing time.
3. An optimization method for the task assignment problem which determines on which node each part of a distributed query should be executed to minimize query response time.</p>
      </sec>
      <sec id="sec-1-2">
        <title>4. A simulator that models the execution of distributed query plans on a cluster, considering each node's computation and network capabilities.</title>
        <p>This simulator uses our previous computation time estimations to track execution times accurately. We can verify the quality of assignments across various cluster setups using simulation-based evaluation.
5. A new join ordering algorithm that can take effects of distributed data partitioning and execution into account to jointly optimize the distribution layout.</p>
        <p>(3) will directly improve query response time. It makes use of (2) to estimate computational load and can be evaluated using (4). (5) also has a direct impact on query performance and can be compared against existing algorithms using (1).</p>
        <p>VLDB 2023 PhD Workshop, co-located with the 49th International Conference on Very Large Data Bases.</p>
        <p>[Figure 1: (a) Initial query plan over base relations; (b) New pipeline boundaries; (c) Distributed query plan with pipelines, pipeline breaks, required data, and a shuffle stage.]</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <sec id="sec-2-1">
        <title>There are several distributed database systems, but the number of publications on distributed optimization is rather limited.</title>
        <p>
          Microsoft extended the search space of SQL Server’s query optimizer with data distribution information for cost-based search [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. Its cloud-native
successor Polaris avoids the task assignment problem by
writing and reading all intermediate results from a decoupled
storage service [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. This removes many effects of data
locality. Redshift automatically chooses partition keys
and distribution for observed query workloads, but there
is little information published about its query optimizer
[
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. Snowflake [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] uses a classical Cascades-like query
optimizer [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] but fixes the data distribution at query
runtime. Vertica segments tables by their columns instead
of hash partitioning [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. It chooses join ordering with a
worklist-based approach that considers distribution
information and terminates when the memory budget is
exhausted. MemSQL performs cost-based query rewrites
in a heuristically pruned search space, weighing data
transfers with a constant factor [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. SparkSQL uses cost
and rule-based optimizations to broadcast small tables
and perform preaggregations [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. Rödiger et al. propose
network optimal partition assignment using MILP for
single join operations in [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. They approach data skew
with selective broadcast and Flow-Join that dynamically
broadcast partitions and tuples respectively [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ].
        </p>
        <p>
          There is also a lot of related work in the area of big
data that covers similar optimization aspects, such as
task scheduling [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. In contrast to big data systems like Hadoop and Spark, relational database systems have much more information available ahead of time. We build upon that research, utilizing this additional information and database-specific optimizations such as join ordering.
        </p>
        <p>
          Hash-distributed execution can be used to effectively perform aggregations and joins on large amounts of data. However, the processing speed of joins can be improved by broadcasting data units in cases of skewed data or vast differences in cardinalities [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. Furthermore, it is possible that the overhead of data transfers in further stages outweighs the advantages of distributed processing. Thus, the optimizer should also be able to decide that a data unit should only reside on a single node. In summary, data units can have the following four partition layouts:
• Hash-partitioned: The data unit is hash partitioned by a key.
• Broadcast: All data resides in a single partition that is broadcast to all eligible nodes.
• Single-node: All data resides on one node; further processing will not be distributed.
        </p>
        <p>• Scattered: Tuples are partitioned without a key.</p>
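        <p>The four partition layouts can be captured in a small sketch. This is our illustration in Python with invented names; the paper does not prescribe an implementation, and the rule that a broadcast data unit satisfies any required layout is our assumption.</p>
        <preformat>
```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class PartitionLayout(Enum):
    """The four partition layouts a data unit can have."""
    HASH_PARTITIONED = auto()  # hash partitioned by a key
    BROADCAST = auto()         # one partition, replicated to all eligible nodes
    SINGLE_NODE = auto()       # all data on one node, no further distribution
    SCATTERED = auto()         # tuples partitioned without a key


@dataclass(frozen=True)
class Layout:
    kind: PartitionLayout
    key: Optional[str] = None  # partition key, only for HASH_PARTITIONED


def needs_shuffle(produced: Layout, required: Layout) -> bool:
    """A shuffle stage is needed when the produced layout does not
    satisfy the layout required by the consuming pipeline."""
    if produced == required:
        return False
    # Assumption: a broadcast data unit is already available on every
    # eligible node and therefore needs no repartitioning.
    if produced.kind is PartitionLayout.BROADCAST:
        return False
    return True
```
        </preformat>
        <p>For example, a data unit hash partitioned on one key must be shuffled before a join that requires partitioning on a different key, while a broadcast input can be consumed as-is.</p>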
      </sec>
      <sec id="sec-2-2">
        <title>There are many metrics, such as throughput, query latency, cloud cost, and energy consumption.</title>
        <p>We choose to optimize for latency, as we expect that optimizing for lower execution and transfer times will improve results in all metrics.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4. Components for Distributed Query Optimization</title>
    </sec>
    <sec id="sec-4">
      <sec id="sec-4-1">
        <title>The main components of our research project are distributed plan generation, computation time estimation, task assignment, a simulator for distributed execution, and a new join ordering optimizer.</title>
        <p>4.1. Distributed Plan Generation</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>3. Distributed Processing Model</title>
      <sec id="sec-5-1">
        <title>We compose distributed query plans from three main components:</title>
        <p>
          First, we define the characteristics of distributed systems for which we want to optimize. We focus our work on OLAP systems, but the concepts are also applicable to transactional workloads. The system of concern has disaggregated compute from storage to allow flexible scaling of compute nodes, similar to Snowflake [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. Nodes can vary in computational and network capabilities to use available cloud instances cost-effectively. The full dataset has to be stored on a storage service. Tables are stored in a columnar fashion and hash partitioned on user-defined distribution keys. Similarly to Polaris [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], nodes may cache arbitrary partitions locally. In contrast, we explicitly do not disaggregate query state from compute nodes to avoid the latency overhead of writing back and reading all intermediate results from the storage service. We generalize relations, intermediate results, and final query results to data units.
        </p>
        <p>Data Units can be base relations from a database, intermediate results, or the final result of a query. They contain the available attributes and the estimated number of tuples in the data unit. Each data unit is annotated with a partition layout determining the type of partitioning and the partition key, if any.</p>
        <p>Pipelines represent the fused computation of operators that is not interrupted by data materialization or transfers. A pipeline always takes one data unit as input and creates one data unit as output. Additionally, a pipeline may require the presence of further data units, e.g., a pipeline performing a hash join would require the data unit of the build side while taking the data unit of the probe side as input.</p>
        <p>Shuffle Stages repartition data. A shuffle stage takes one data unit as input and returns one data unit as output. The only difference between input and output data unit is their respective partition layout.</p>
        <p>We initially create distributed query plans from physical plans created for single machines, as depicted in Figure 1a. Our method takes a query plan and partition layout information for all base relations and distributes the plan in several passes.</p>
        <p>[Figure 2 (histogram; x-axis: relative cost of assignment, y-axis: frequency): Cost distribution of 100 thousand sampled task assignments for TPC-H Q21 on 16 machines.]</p>
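        <p>The plan components described in this section can be sketched as plain data structures. This is a minimal Python sketch with names of our choosing; the paper does not specify its system at this level.</p>
        <preformat>
```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataUnit:
    """Base relation, intermediate, or final result, annotated with its
    attributes, a cardinality estimate, and a partition layout."""
    name: str
    attributes: List[str]
    est_tuples: int
    layout: str  # e.g. "hash(x)", "broadcast", "single-node", "scattered"


@dataclass
class Pipeline:
    """Fused operators between materialization points: consumes one data
    unit, produces one, and may require further data units
    (e.g. the build side of a hash join)."""
    input: DataUnit
    output: DataUnit
    required: List[DataUnit] = field(default_factory=list)


@dataclass
class ShuffleStage:
    """Repartitions a data unit; input and output differ only in layout."""
    input: DataUnit
    output: DataUnit


def link(producer: Pipeline, consumer: Pipeline) -> Optional[ShuffleStage]:
    """If the produced layout matches what the consumer scans, connect the
    pipelines directly; otherwise keep two data units and link them
    with a shuffle stage."""
    if producer.output.layout == consumer.input.layout:
        consumer.input = producer.output
        return None
    return ShuffleStage(input=producer.output, output=consumer.input)
```
        </preformat>
        <p>Linking a pipeline that produces a hash(x)-partitioned result to one that scans a hash(y)-partitioned input yields a shuffle stage; matching layouts need none.</p>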
        <p>First, the operators in the plan are combined to pipelines. Next, we determine the best partition layout at each operator and split pipelines where necessary, as in Figure 1b. Finally, we explicitly name the data units at the ends of each pipeline. The output of a pipeline may have a different partition layout than the required input layout of its scanning pipeline. In this case, we create two data units and link them with a shuffle stage, as shown for data units 5 and 6 in Figure 1c.</p>
        <p>Distributing single-node plans like this will not yield optimal results, as the original plan does not incorporate any information about the distributed system. Ultimately, the optimizer should consider distribution in all phases. Most stages of this method can be reused for such an end-to-end optimizer, and we can create distributed plans to conduct experiments early. Our work on this rule-based plan generation is mostly done.</p>
        <p>4.2. Computation Time Estimation</p>
        <p>
          Traditional single-node query optimizers rely on relatively simple cost models because cardinality estimation errors outweigh the effect of more detailed models [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. For distributed processing, however, we need to compare the relative cost of data transfers over the network to local computation to find good execution strategies. In the presence of fast modern networks, it is no longer sufficient to simply rely on the sizes of intermediate results and ignore the computation time. For an exact comparison, we want to accurately predict the computation time of pipelines. We build a fine-grained operator-based cost model to predict the average computation time required for each tuple at each operator. The profiling method proposed by Beischl et al. [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] provides very detailed data for modern compiled data processing engines. We use this data and available information about each operator, such as tuple size, input cardinality, and expression complexity, to create a detailed performance model. Optimization stages can consider the performance estimations of this model if they are included in the query plan. We have yet to implement this method.
        </p>
        <p>4.3. Task Assignment Optimizer</p>
        <p>Single pipelines can be distributed using data parallelism. If the scanned data unit is partitioned into n partitions, we create n tasks for this pipeline, where each task scans one partition and outputs one partition of the pipeline's result. The task assignment optimizer focuses on the choice of which node should execute which tasks. Each node can execute any task if the necessary data is transferred accordingly. However, this will have a significant impact on performance. We want to evenly spread the computational load among nodes and minimize time spent waiting on data transfers. Each assignment also has an effect on subsequent execution, as it determines on which node the resulting partition will reside.</p>
        <p>As depicted in Figure 2, good task assignments can improve performance by over 2x. The number of possible assignments, n_assign = n_nodes^n_tasks, grows exponentially in the number of tasks, which renders exhaustive enumeration of assignments infeasible. Using our computation time estimates and estimated time spent for data transfers, we plan to build a heuristic optimizer for the task assignment problem that is able to generate good assignments in a short time. We will consider sampling-based and greedy methods to find good initial plans quickly, and refining methods like iterative improvement and simulated annealing to further improve these plans. We have implemented first approaches to this problem.</p>
        <p>4.4. Distributed Execution Simulator</p>
        <p>The best way to evaluate optimizations is to conduct benchmarks on a real system. However, it is intricate to conduct thorough large-scale benchmarks on distributed systems. Experiments on large compute clusters are expensive and take substantial effort to realize with a work-in-progress system. Hence, we decided to simulate the distributed system without the need for a real implementation. This simulator is much more flexible, as we can make fundamental changes to the execution model with little effort. Also, the structure of the simulated cluster can be changed in compute node count, hardware, and network speeds effortlessly. It can also be used as a direct cost function for cost-based optimization methods.</p>
        <p>The simulator takes a distributed query plan, a cluster definition, each node's cached partitions of base relations, and a task assignment as input. It maintains pending and currently active tasks and data transfers over the network and their current progress in percent. First, it computes the estimated remaining time to finish for each active task and transfer. The shortest time t_min determines when the set of currently running tasks and transfers changes.</p>
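        <p>The sampling-plus-refinement idea for task assignment can be sketched as follows. This Python sketch is ours: the cost function stands in for the simulator, and all names are invented for illustration.</p>
        <preformat>
```python
import random
from typing import Callable, Dict, List

Assignment = Dict[int, int]  # task id -> node id
CostFn = Callable[[Assignment], float]


def sample_assignments(tasks: List[int], nodes: List[int],
                       cost: CostFn, samples: int,
                       rng: random.Random) -> Assignment:
    """Sampling-based initial search: draw random assignments, keep the cheapest."""
    best: Assignment = {t: nodes[0] for t in tasks}
    best_cost = cost(best)
    for _ in range(samples):
        cand = {t: rng.choice(nodes) for t in tasks}
        c = cost(cand)
        if c < best_cost:
            best, best_cost = cand, c
    return best


def iterative_improvement(assignment: Assignment, nodes: List[int],
                          cost: CostFn) -> Assignment:
    """Refinement: move single tasks to other nodes while that lowers the cost."""
    current = dict(assignment)
    improved = True
    while improved:
        improved = False
        for task in list(current):
            for node in nodes:
                if node == current[task]:
                    continue
                trial = dict(current)
                trial[task] = node
                if cost(trial) < cost(current):
                    current, improved = trial, True
    return current
```
        </preformat>
        <p>With a toy cost such as the load on the most-loaded node, iterative improvement moves tasks off overloaded nodes until the assignment is balanced; plugging in a simulator-based cost would instead balance estimated response time.</p>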
      </sec>
      <sec id="sec-5-2">
        <title>The simulator advances the progress of all operations by t_min.</title>
        <p>At least one of them will finish and therefore make a new partition available at some node. Finally, it finds all pending operations that can start now that new partitions are available. By accumulating all values of t_min, we can compute the overall runtime of the query.</p>
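        <p>The advance-by-t_min loop can be sketched as follows. This Python sketch simplifies heavily: operations are bare remaining durations, and a pending operation becomes runnable once the operations it waits on have finished; the names are ours.</p>
        <preformat>
```python
from typing import Dict, Set


def simulate(durations: Dict[str, float], deps: Dict[str, Set[str]]) -> float:
    """Advance all runnable operations (tasks and transfers alike) by the
    shortest remaining time t_min until everything has finished; the
    accumulated t_min values give the overall query runtime."""
    remaining = dict(durations)
    finished: Set[str] = set()
    runtime = 0.0
    while len(finished) < len(durations):
        # An operation runs once all operations it waits on have finished.
        running = [op for op in remaining
                   if op not in finished and deps.get(op, set()) <= finished]
        t_min = min(remaining[op] for op in running)
        runtime += t_min
        for op in running:
            remaining[op] -= t_min
            if remaining[op] <= 1e-12:
                finished.add(op)
    return runtime
```
        </preformat>
        <p>For instance, a scan and a transfer running in parallel followed by a dependent probe accumulate runtime max(scan, transfer) + probe.</p>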
        <p>
          Our implementation of this simulator is ready for use. We use it to evaluate randomly sampled task assignments and give their execution time distribution in Figure 2. As our implementation is fast, it can easily simulate tens of thousands of executions per second and is hence suitable for direct integration in the optimization loop.
        </p>
        <p>
          Not only the new problem of task assignment has optimization potential. Join ordering algorithms for single-node execution yield deficient plans for distributed execution [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. As large base relations are likely to be hash partitioned on join keys, it will be advantageous to execute the respective join first and avoid reshuffling the data, even if that might not be optimal for single-node execution. Furthermore, the join ordering algorithm can directly choose the distribution (hash distributed, broadcast, or simply using only a single node) of intermediate results. We will investigate the feasibility of applying exhaustive dynamic programming algorithms that extend solutions by physical properties, similar to SQL Server PDW [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. This will enlarge the search space significantly, and exhaustive search will be infeasible in many cases. Hence, we will work on further possibilities to restrict the search space and gracefully fall back to fast approximations. Additionally, we will work on a new cost model for enumeration algorithms which incorporates both computation and network time. We have not yet started working on this problem.
        </p>
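        <p>The envisioned enumeration can be illustrated with a toy dynamic program whose states are (relation set, distribution layout) pairs, so plans differing only in layout are kept separately, similar in spirit to tracking physical properties. The selectivity, costs, and the one-unit-per-tuple shuffle charge below are placeholders of our choosing, not the paper's cost model.</p>
        <preformat>
```python
from itertools import combinations
from typing import Dict, FrozenSet

SEL = 0.5  # placeholder join selectivity, chosen arbitrarily


def card(rels: FrozenSet[str], base: Dict[str, int]) -> float:
    """Toy cardinality estimate: product of base cardinalities times
    one selectivity factor per join."""
    c = 1.0
    for r in rels:
        c *= base[r]
    return c * SEL ** (len(rels) - 1)


def enumerate_joins(base: Dict[str, int],
                    layout: Dict[str, str]) -> Dict[FrozenSet[str], Dict[str, float]]:
    """Subset DP where each subset keeps one best cost per distribution
    layout of its intermediate result; repartitioning an input to the
    target layout is charged one cost unit per tuple."""
    best = {frozenset([r]): {layout[r]: 0.0} for r in base}
    rels = sorted(base)
    for size in range(2, len(rels) + 1):
        for subset in combinations(rels, size):
            s = frozenset(subset)
            plans = best.setdefault(s, {})
            for k in range(1, size // 2 + 1):
                for left in combinations(subset, k):
                    ls = frozenset(left)
                    rs = s - ls
                    if ls not in best or rs not in best:
                        continue
                    out = card(s, base)
                    for lay1, c1 in best[ls].items():
                        for lay2, c2 in best[rs].items():
                            for target in {lay1, lay2}:
                                cost = c1 + c2 + out  # processing cost placeholder
                                if lay1 != target:    # shuffle the left input
                                    cost += card(ls, base)
                                if lay2 != target:    # shuffle the right input
                                    cost += card(rs, base)
                                if cost < plans.get(target, float("inf")):
                                    plans[target] = cost
    return best
```
        </preformat>
        <p>On three relations where two share a partitioning key, the DP prefers joining the co-partitioned pair first and shuffling only the small third input, exactly the effect the text describes.</p>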
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Conclusion</title>
      <p>Distributed query processing opens potential for optimizations at many different stages of physical plan generation. This work proposes approaches to use that potential in several ways. We describe a way to lift current physical query plans for distributed execution. Then, we create a simulation-based evaluation method for these plans. We highlight the importance of the task assignment problem and sketch several methods to find good assignments. Finally, we present our vision for new enumeration-based join ordering algorithms that jointly optimize the distribution of data with the join ordering.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Shankar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. V.</given-names>
            <surname>Nehme</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Aguilar-Saborit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Elhemali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Halverson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Robinson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Subramanian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. J.</given-names>
            <surname>DeWitt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. A.</given-names>
            <surname>Galindo-Legaria</surname>
          </string-name>
          ,
          <article-title>Query optimization in microsoft SQL server PDW</article-title>
          , in: SIGMOD Conference, ACM,
          <year>2012</year>
          , pp.
          <fpage>767</fpage>
          -
          <lpage>776</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Aguilar-Saborit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ramakrishnan</surname>
          </string-name>
          ,
          <article-title>POLARIS: the distributed SQL engine in azure synapse</article-title>
          ,
          <source>Proc. VLDB Endow</source>
          .
          <volume>13</volume>
          (
          <year>2020</year>
          )
          <fpage>3204</fpage>
          -
          <lpage>3216</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>N.</given-names>
            <surname>Armenatzoglou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Basu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Bhanoori</surname>
          </string-name>
          , et al.,
          <article-title>Amazon redshift re-invented</article-title>
          , in: SIGMOD Conference, ACM,
          <year>2022</year>
          , pp.
          <fpage>2205</fpage>
          -
          <lpage>2217</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>B.</given-names>
            <surname>Dageville</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Cruanes</surname>
          </string-name>
          , et al.,
          <article-title>The snowflake elastic data warehouse</article-title>
          , in: SIGMOD Conference, ACM,
          <year>2016</year>
          , pp.
          <fpage>215</fpage>
          -
          <lpage>226</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>G.</given-names>
            <surname>Graefe</surname>
          </string-name>
          ,
          <article-title>The cascades framework for query optimization</article-title>
          ,
          <source>IEEE Data Eng. Bull</source>
          .
          <volume>18</volume>
          (
          <year>1995</year>
          )
          <fpage>19</fpage>
          -
          <lpage>29</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>N.</given-names>
            <surname>Tran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lamb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Shrinivas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bodagala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dave</surname>
          </string-name>
          ,
          <article-title>The vertica query optimizer: The case for specialized query optimizers</article-title>
          , in: ICDE, IEEE Computer Society,
          <year>2014</year>
          , pp.
          <fpage>1108</fpage>
          -
          <lpage>1119</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Jindel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Walzer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Jimsheleishvilli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Andrews</surname>
          </string-name>
          ,
          <article-title>The memsql query optimizer: A modern optimizer for real-time analytics in a distributed database</article-title>
          ,
          <source>Proc. VLDB Endow</source>
          .
          <volume>9</volume>
          (
          <year>2016</year>
          )
          <fpage>1401</fpage>
          -
          <lpage>1412</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Armbrust</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Xin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Huai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. K.</given-names>
            <surname>Bradley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Meng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Kaftan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Franklin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ghodsi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zaharia</surname>
          </string-name>
          ,
          <article-title>Spark SQL: relational data processing in spark</article-title>
          , in: SIGMOD Conference, ACM,
          <year>2015</year>
          , pp.
          <fpage>1383</fpage>
          -
          <lpage>1394</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>W.</given-names>
            <surname>Rödiger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mühlbauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Unterbrunner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Reiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kemper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Neumann</surname>
          </string-name>
          ,
          <article-title>Locality-sensitive operators for parallel main-memory database clusters</article-title>
          , in: ICDE, IEEE Computer Society,
          <year>2014</year>
          , pp.
          <fpage>592</fpage>
          -
          <lpage>603</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>W.</given-names>
            <surname>Rödiger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Idicula</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kemper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Neumann</surname>
          </string-name>
          ,
          <article-title>Flow-join: Adaptive skew handling for distributed joins over high-speed networks</article-title>
          ,
          <source>in: ICDE, IEEE Computer Society</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>1194</fpage>
          -
          <lpage>1205</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Soualhia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Khomh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tahar</surname>
          </string-name>
          ,
          <article-title>Task scheduling in big data platforms: A systematic literature review</article-title>
          ,
          <source>J. Syst. Softw</source>
          .
          <volume>134</volume>
          (
          <year>2017</year>
          )
          <fpage>170</fpage>
          -
          <lpage>189</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>V.</given-names>
            <surname>Leis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gubichev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mirchev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Boncz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kemper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Neumann</surname>
          </string-name>
          ,
          <article-title>How good are query optimizers, really?</article-title>
          ,
          <source>Proc. VLDB Endow</source>
          .
          <volume>9</volume>
          (
          <year>2015</year>
          )
          <fpage>204</fpage>
          -
          <lpage>215</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Beischl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Kersten</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bandle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Giceva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Neumann</surname>
          </string-name>
          ,
          <article-title>Profiling dataflow systems on multiple abstraction levels</article-title>
          , in: EuroSys, ACM,
          <year>2021</year>
          , pp.
          <fpage>474</fpage>
          -
          <lpage>489</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>