=Paper= {{Paper |id=Vol-3075/paper1 |storemode=property |title=Design Considerations Towards AI-Driven Co-Processor Accelerated Database Management |pdfUrl=https://ceur-ws.org/Vol-3075/paper1.pdf |volume=Vol-3075 |authors=Anh Trang Le,Bala Gurumurthy,Christoph Steup,Gabriel Campero Durand,David Broneske,Gunter Saake |dblpUrl=https://dblp.org/rec/conf/gvd/LeGSDBS21 }} ==Design Considerations Towards AI-Driven Co-Processor Accelerated Database Management== https://ceur-ws.org/Vol-3075/paper1.pdf
     Design Considerations Towards AI-Driven Co-Processor
              Accelerated Database Management

                Anh Trang Le                                  Bala Gurumurthy                       Christoph Steup
           Gabriel Campero Durand                             David Broneske                         Gunter Saake
           Otto-von-Guericke-Universität                Otto-von-Guericke-Universität         Otto-von-Guericke-Universität
               Magdeburg, Germany                           Magdeburg, Germany                    Magdeburg, Germany
                firstname.lastname@ovgu.de                  firstname.lastname@ovgu.de           firstname.lastname@ovgu.de


ABSTRACT

Adopting AI techniques for query optimization is an ongoing research interest in the database community. Currently, the search space for the best plan increases drastically, with the growing heterogeneity of the target hardware, the novel tuning choices offered, and co-processing. Hence, the need for AI techniques to identify such a best plan in a reasonable time-frame is imminent. Though AI-based solutions for improving query processing exist, there is still a need for principled system designs able to incorporate the different innovations, leverage synergy effects, and keep with production-readiness expectations when using AI. In this paper, we propose a series of seven ideal design characteristics we envision for such systems. We then make the case for revisiting the traditional Mariposa system, to consider its market concepts as a useful starting point for new system designs to support the identified characteristics. Altogether, we expect that this short paper could be a modest contribution towards AI-driven heterogeneous processing, emphasizing the practical aspects of a supportive and principled overall design.

Keywords

AI for DBMS, Self-driving DBMS, Heterogeneous query processing, Hardware-accelerated query processing

32nd GI-Workshop on Foundations of Databases (Grundlagen von Datenbanken), September 01-03, 2021, Munich, Germany.
Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1. INTRODUCTION

In the recent decade, computer systems composed of multiple heterogeneous processors have quickly become the norm, rather than the exception [27]. Along with this rapid growth, we also witness an increasing adoption of hybrid processor database systems that circumvent the ’power wall’ [3] and show great potential for speeding up query processing [24]. However, without tailored optimization strategies, these systems cannot achieve the best performance gains. In fact, studies show that some GPU-accelerated systems with better operator-level implementations than their CPU counterparts still have poor performance for overall query processing in analytical benchmarks [28]. This is partly explained by the fact that optimizing query processing for such systems encounters numerous intrinsic challenges caused by the diversity of tuning techniques per device, the uncertainty (given such techniques) in accurately modelling real-world performance impact factors (through parametric cost models), the influence of workloads, as well as the need to support scalability to more devices, data and new tuning choices [6]. All these aspects create a huge optimization space, which is hard to evaluate [17], turning the task of establishing a uniform research prototype for their detailed study into a tough nut to crack.

In this paper, we propose an early vision for a principled system architecture with the goal of exposing and easing query optimization in a co-processor accelerated data system. In traditional systems, optimization decisions are commonly addressed with hard-coded rules and heuristics. Such systems can generalize poorly to unknown workloads or devices, and their heuristics can be difficult to extend and maintain.

Throughout the last decade, there has been a shift towards employing AI techniques in handling these tasks more efficiently, alleviating the mentioned drawbacks of traditional methods. In the database community, there is a strong research trend that studies how AI can benefit database optimizations [20, 14, 21, 29, 4]. Though the prospect is bright, there are several obstacles coming from AI itself, especially in the deployment of AI solutions [13, 19]. Hence, adopting them in the co-processing domain forces us to solve a dual challenge: enhancing the performance of hybrid processor databases, as well as maintaining the AI, and especially machine learning (ML), models in production.

In order to confront this situation, we consider that principled designs are needed for heterogeneous hardware database systems, which can facilitate the inclusion of learning from the ground up, to address the different challenges in these systems. More precisely, we propose seven characteristics we deem essential for the system: C1) task modularization, C2) collaborative agents as building blocks, C3) exchangeability of optimizers, C4) the separation of representation and policies, C5) concepts for database administrators (DBAs) to manage AI components, C6) ease of adaptation to training scenarios, and C7) learning from demonstrations. Overall, we propose that all these characteristics would contribute to improving the solutions for heterogeneous database query processing, while simultaneously addressing the needs for AI production readiness.
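
As an illustration only, two of the characteristics listed above, exchangeability of optimizers (C3) and the separation of representation and policies (C4), can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the names TaskRegistry, count_features, heuristic_device and always_first, the task label "device_selection", and the toy rule that sends join-heavy queries to the last listed device are all hypothetical and are not taken from any existing system.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence

# Hypothetical types for illustration; none of these names come from the paper.
Features = List[float]
Representation = Callable[[str], Features]          # C4: how an entity (a query) is encoded
Policy = Callable[[Features, Sequence[str]], str]   # C4: a decision, given a representation


@dataclass
class TaskRegistry:
    """Maps each modularized task (C1) to an exchangeable policy (C3)."""
    represent: Representation
    policies: Dict[str, Policy] = field(default_factory=dict)

    def register(self, task: str, policy: Policy) -> None:
        # Plug-and-play: registering again replaces the optimizer for this task.
        self.policies[task] = policy

    def decide(self, task: str, query: str, options: Sequence[str]) -> str:
        # The shared representation is computed once and handed to the policy.
        return self.policies[task](self.represent(query), options)


def count_features(query: str) -> Features:
    """Toy representation: counts of JOIN keywords and of WHERE/AND keywords."""
    q = query.upper()
    return [float(q.count(" JOIN ")), float(q.count(" AND ") + q.count(" WHERE "))]


def heuristic_device(features: Features, options: Sequence[str]) -> str:
    """Arbitrary hard-coded rule: two or more joins pick the last option."""
    return options[-1] if features[0] >= 2 else options[0]


def always_first(features: Features, options: Sequence[str]) -> str:
    """Alternative policy that ignores the features entirely."""
    return options[0]


registry = TaskRegistry(represent=count_features)
registry.register("device_selection", heuristic_device)
query = "SELECT * FROM a JOIN b ON a.x = b.x JOIN c ON b.y = c.y WHERE a.z > 1"
choice_before = registry.decide("device_selection", query, ["cpu", "gpu"])

# Swap the optimizer for the task while keeping the same representation.
registry.register("device_selection", always_first)
choice_after = registry.decide("device_selection", query, ["cpu", "gpu"])
```

Swapping the policy leaves the representation untouched, which is the kind of re-use that C4 aims for; a learned model could be registered in place of either toy policy without changing the callers.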
In more detail, our core contributions in this paper are:

  • We present to the community a first proposal of seven ideal features that we deem as central to building and maintaining a practical AI-based DBMS for co-processor systems.

  • We propose an early high-level design that builds on the ideas of the market components of the classical Mariposa system developed by Stonebraker et al. [26], while seeking to support the ideal system features we identified.

The remainder of this paper is structured as follows: Sec. 2 points out various challenges in incorporating AI components to support co-processor database management. Sec. 3 outlines our proposed design needs and describes the high-level architecture of a system able to serve these needs. This section covers the system features, as well as the main workflow. Sec. 4 formalizes research questions that we aim to address with our proposed system design. Sec. 5 provides context to the design we consider, by reviewing related work. Finally, Sec. 6 wraps up this paper with a summary and points for future work.

2. CHALLENGES OF HETEROGENEOUS DATA MANAGEMENT

In developing an AI-driven DBMS for heterogeneous processors, the literature suggests several challenges.

Storage Engine Design: From the perspective of a storage engine, the trade-off between consistency, availability and partition-tolerance, for a given workload, is a foremost concern [22]. Addressing availability, some challenges are: mechanisms to efficiently use scale-out processing to an increasing number of heterogeneous processors, while keeping in mind aspects such as different data transfer rates [1]. Up-front data distribution strategies, such as hardware islands [24] or layered designs [28] for hot-cold data, are commonly adopted, but exploration of alternatives is limited in the domain (e.g., [16]). Furthermore, co-processor-friendly data structures and storage optimization adapted to the diversity of application scenarios, devices, and data characteristics (e.g., increasing relevance of textual and semi-structured data) remain important for availability. Addressing consistency, strong mechanisms for supporting isolation level guarantees are necessary for increasing system maturity.

Summary: Relevant directions for storage technologies to enhance their co-processing efficiency, while keeping with consistency constraints, include: workload-tuned data distribution, transfer-aware processing, ease of incorporating alternative processor-specific data optimizations (e.g., layouts, or compression), and finally the ability to seamlessly share data structures across processors.

Query Engine Design: The processing of single queries over heterogeneous hardware offers numerous optimization choices, as compared to the case of homogeneous hardware (e.g. just-in-time code generation to fuse pipelined operators into unified kernels, more diversity of operator variants, or opportunities for resource sharing of concurrent kernels). Furthermore, there are many variants for a single operator present depending on the underlying device [3, 5]. This number of choices only increases when considering distributing single plans across multiple devices, or optimizing groups of queries at-a-time (which is pertinent for high parallelism devices). Other than that, many relevant performance factors (e.g. device saturation, query expression complexity) or implementation details (e.g. cache consistency) can be difficult to model accurately for cost estimation.

Summary: For efficient query processing over varied hardware, it is important to consider methods that are able to deal with large-scale optimization, and to work with uncertain models for performance factors.

AI Adoption: It might be conceptually simple to alter a certain computer system task (e.g. magic number selection, to build a hash function), replacing hard-coded rules with an AI-friendly interface that provides experience for a model, to eventually master the task. However, in practice it is far from trivial to build and maintain such models at the highest production-readiness levels [13]. In essence, unlike traditional software components, AI models can be harder to test and can fail in unexpected ways, especially for deep learning. Models can often be black boxes, or require copious training to be efficiently used. Machine learning models, specifically, follow a distinct lifecycle going from data management tasks (including data collection and feature engineering), through model learning (and tuning), to validation and deployment, with challenges and cross-cutting concerns all through this lifecycle [19]. Some common challenges are: insufficient data, concept drift, and adversarial attacks.

Summary: The incorporation of AI components might affect the guarantees that a system can provide. To overcome this, it is fundamental that the system is made safe against critical AI errors with fall-back mechanisms, and finally that an easy-to-use interface is offered for administrators to engage with AI metrics and model lifecycle management.

3. DEVELOPING AN AI-ENABLED CO-PROCESSOR ACCELERATED DATABASE PROTOTYPE

In order to overcome the identified challenges of scalability as well as the need for adaptability and support for the AI/ML lifecycle, a principled design for building a system is required. To date, similar designs have already been considered by researchers in other areas, already offering design principles. In this section, we briefly consider some of these design concepts, from a general to a more specific case, which we then use to propose the series of ideal characteristics that become the basis of our system design.

From a general application perspective, university courses¹ and textbooks already study good practices for building systems that incorporate machine learning [9]. Furthermore, there are several papers that highlight intrinsic challenges which require designs to adapt to them [25]. They usually refer to difficulties such as modularization, or to specific problems of an ML approach (e.g., [7]).

¹ For example, SE4AI offered by Christian Kästner at CMU: https://github.com/ckaestne/seaibib

Moving to a more specific application, the authors of the AutoSys framework [18] suggest 4 principles organized around the goals of making systems learnable, and making the learning manageable: exposing system behavioral features for learning through well-defined interfaces (P1), careful monitoring of model behavior (P2), modularization of the
learning to scope complexity (P3), and resource management for system exploration and maintenance (P4). In further work, the authors report experience in applying their framework, providing further practical advice.

Similar design considerations have a long history in the community that studies self-driving data management, albeit not often coupled with AI (e.g., Babu et al. have argued for experiment-driven adaptive tuning by having replicated test databases [2]); most recently Kossmann and Schlosser also highlight the importance of modular designs and the plug-and-play nature of optimizations in designing such adaptive systems [11]. Furthermore, the authors identify co-related tasks as a core challenge to efficient modularization, proposing and testing a linear programming framework that enables them to deal with such complications.

In recent years, research in AI-based databases has proposed designs tailored to the needs in the area. Due to space limitations, we discuss in the following a few key ideas from a limited set of them: Pavlo et al. [20] build a system designed with a principled distinction between workload modeling (i.e., representation learning) and system control (i.e., policies). In further work, they continue their approach while distinguishing between externally and internally coupled intelligent mechanisms [21], illustrated by their work in OtterTune and NoisePage, respectively. Within the research scope of SageDB, Kraska et al. [12] present and evaluate a comprehensive vision for how common database components can be replaced with AI. Among their core ideas, the authors develop the concept of instance optimality, which posits that a learned model for a database needs only to be provably optimal for the intended workload and system configuration. Finally, the authors of XuanYuan [14] present a broad high-level design that focuses on identifying the learnable components of current databases, considering task modularization, and categorizing tasks according to the functionality they offer to the overall system (e.g., self-healing, self-assembling, self-optimizing).

Based on this preceding work, and on the challenges identified for heterogeneous co-processing, we propose the following seven characteristics that we deem an AI-based database should reasonably offer for this domain. We should note that these characteristics might not be exhaustive, but aim to serve as a starting point towards a principled design.

C1- Task Modularization: The growing hardware heterogeneity increasingly expands the space of all possible optimization choices for DBMSs. Query optimizers for such systems already employ staging, whereby optimizations are organized into stages and at each one there are specific sets of rules and mechanisms that can be adopted. Concerning ML components, employing a single, monolithic model to learn such a complex space and address the optimization in a single shot can escalate the learning cost and complexity. Task modularization is a good alternative, since the optimization problems to be tackled can be decomposed and solved separately, which makes them easier to learn.

C2- Collaborative agents as building blocks: The best models for a given task on a selected device are only required to be instance optimal (i.e., their strategies do not need to generalize to other devices). Hence, as much as possible, it might be beneficial for designs to strive towards supporting device-specific simple instance-optimal models by decomposing a task (e.g., a single query optimization) into parts that can be solved in a collaboration among agents (e.g., sub-query selection and optimization per device). To this end, clean abstractions for the task and communication protocols are required. This characteristic seeks to address the storage engine challenge for high device adaptability.

C3- Exchangeability of optimizers: Similar to current databases that already employ alternative optimizers in tasks like join order optimization, for a research-oriented prototype it becomes essential to support the use of alternative optimizers/models in a plug-and-play manner for dealing with a specifically modularized task. In particular, extending optimizers (with new features) and integrating new optimizers should also be supported with ease, to facilitate the evolution of the overall system.

C4- The separation of representation and policies: Following related work [20], we propose that a smart separation between representation learning (i.e., how a model decides to represent an entity) and policies (i.e., the decisions made by a model, for a task, given a representation) will be of value. This separation would facilitate representation re-use (transfer learning) across tasks that work on similar entities (e.g., a query) and the analysis of alternative multi-modal solutions for a task which could provide benefits in different scenarios (e.g., a query can be represented as multi-sets of traditionally encoded predicates, joins, tables; but it can also be represented as a graph of such features). As different policies can benefit from stable compact learned representations, this characteristic seeks to help in the aforementioned query engine challenges for large-scale optimization.

C5- Concepts for DBAs to manage AI components: In consideration of the many steps requiring human management in the ML lifecycle, we envision that the role of the DBA could be extended to incorporate some actions to manage this lifecycle. To support this, novel services exposing ML management with clearly-defined interfaces will be needed in the database context.

C6- Ease of adaptation to training scenarios: Different user scenarios will create different alternatives for training the ML models. It might be that some scenarios accept live training in the background for a given task, while other scenarios might require collecting experience data for offline learning at a later stage. Some scenarios might allow for ample large-scale training, while training in other scenarios might be severely resource-constrained. In either case, the design for the system components in charge of scheduling model training, with its intrinsic resource management, should be able to cater to such variations. The ability of models to schedule self-training should also be supported.

C7- Learning from demonstrations: The final feature that we believe is essential for successful adoption of AI models to solve computer system tasks has to do with robustness. In order for the model to be able to replace a current strategy, an efficient and reasonable solution would be to start with the model pre-trained on experience collected from the current strategy. Hence, mechanisms for creating and using demonstrations for training are important.

After listing these ideal system characteristics, we can now present a tentative design that can be adopted to fulfill them.

To be precise, we make the case for a design based on the Mariposa system, a market-based distributed DBMS [26], which we will discuss further in Sec. 5. In general, Mariposa operates the query processing in a decentralized manner
that allows for local autonomy regarding query execution in each site contained within a network, instead of centralized management. Mariposa’s working mechanism primarily bases itself on an economic paradigm, focusing on two separate markets, for data and query distribution, respectively. In our research, we argue for building on Mariposa’s market concepts, in two key ways: First, by considering an architecture with heterogeneous processing capabilities. Second, by investigating how AI-based solutions can augment the proposed markets. In this regard, we take as a hypothesis that the modularization of the optimization mechanisms presented in Mariposa (C1, C2) serves to scope the complexity of the learning tasks, serving as a workable basis for incorporating technological innovations as well as for production readiness. Figure 1 envisions a general architecture of our proposed design, which employs the market concepts from the Mariposa system as a starting point.

At a high level, four components are involved:

Global Optimizer: This component maps SQL queries to actual plans. It is in charge of global query optimization including: the generation of global plans, partial splitting of the plans (to distribute among devices), decision support for selection of plans returned from the device processor class optimizer, and (optionally) requests for data re-distribution.

Storage Manager: This component provides a centralized collection of statistics about devices and a tracking mechanism of data distribution schemes. It enables user-facing configurations of the overall storage, including schema management, index selection and coarse-grained partitioning. It is also intended to provide the DBA with an interface to the AI components, including learning from demonstration. Hence, this component realizes C5 and C7.

Device Processor Class Optimizer: This is the key component for decentralized modularized data management. As we could propose a component per processor, or per compute node/device (i.e., irrespective of the co-processor variety included), we find that a component per type of processor serves as a workable middle-ground. This component is in charge of local data fragmentation, local query optimization (and pricing), algorithm selection and the actual execution of queries. It is also responsible for autonomous data sharing.

AI Support System: This element encompasses the functionality required to support the ML lifecycle. It includes model management and model training, among others. It is intended to facilitate C6 and C7.

Our architecture enables the distribution of query and sub-query plans for cost estimation on the device processor class optimizers, as well as the distribution of data, driven by the device-specific component with (partial) input from the global optimizer.

Query Processing: At the start, a group of queries enters the system at a given time step, and at the global optimizer, they are ranked by their importance to overall performance goals. Second, they are globally partitioned and subsets of their plans are shipped to the devices, for pricing. Third, the different device processor class optimizers provide a set of optimizations and prices for the queries requested. To do this, they featurize the query plans (C4), and suggest different combinations of sub-queries to execute with different costs (for this they consider local data statistics and learned models for algorithm selection). The prices are then returned, in a fourth step, to the global optimizer, so this optimizer can select among the bids until all queries are served. Once choices are made, finally queries can be executed on the devices. This scheme describes a query market, as the one proposed by Stonebraker et al. [26]. By framing the problem in economic terms, this approach helps decentralized coordination and favors local strategies for optimization (C2). Some optimization choices include: variant selection, operator merging into unified kernels, different sub-query splitting and pipelining strategies, parallelism tuning with morsel-driven execution, locality awareness, intermediate results reuse and operator sharing across queries.

Data Management: The data distribution lifecycle, which occurs in the background, can be understood as follows: On system start, given the lack of information for distributing the data, some assumptions can be made by the storage manager to achieve fragmentation and distribute the data for load balancing. In general, data can be grouped into fragments that are commonly co-accessed and that provide a given utility. While the system is online there are two ways in which data can be redistributed: First, when an optimal plan for a query cannot be found, global requests for data reorganization can be made by the global optimizer (with some pre-designed mechanism). Following these requests, the device optimizers can organize autonomously how to serve the global hints. Second, device optimizers themselves are responsible for tracking the utility derived from a given data fragment (i.e., depending on the queries that can be served by such a fragment). This gives devices metrics to assess how much utility can be derived from fragments that are not locally available. Hence, by using information from the storage manager, device optimizers can participate in a data market. In this market, devices buy copies of fragments, and delete local copies of fragments, while keeping with some constraints (e.g., for co-location or availability). The market formulation is expected to facilitate adaptivity and work distribution (C2). Some learning tasks to be tackled with this system include: local algorithm selection, local query optimization, global plan selection, local fragment partitioning, data sharing, global management for data redistribution, query classification/prioritization.

Altogether, the proposed design is intended to realize the ideal characteristics we set as goals (C1-C7). To achieve this, the characteristics of the original Mariposa design are leveraged (C1, C2). Furthermore, the design seeks to facilitate the use of alternative optimizers or models for a task (C3), to provide chances for reusing representations across components (e.g., of queries with respect to the device optimizers), which can then adopt different policies (C4), and to create opportunities for components to have flexible AI training, engaging the DBA with the process and enabling learning from demonstration data (C5-C7).

4. OPEN QUESTIONS

Based on our proposed design, in this section we turn to open questions we envision our design to be able to help address. These questions relate to query engine (Q1-Q2), storage engine (Q3) or machine learning (Q4-Q5) challenges.

Q1: What building blocks for intelligent and collaborative query processing are necessary to achieve improvements on heterogeneous processors, considering single-query optimization (focusing on algorithm selection, parallelism tuning, splitting, merging and pipelining of operators), compared to strong baselines?

Q2: What strategic designs for intelligent and collaborative
query processing lead to performance gains on heterogeneous processors,
considering multi-query optimization (MQO), with a focus on intermediate
results reuse and operator sharing? To what degree do intelligent
methods compete with non-AI-based alternatives?
Q3: What precise contributions do different applications of AI bring to
the efficiency of data sharing across co-processors, contrasted with
competitive baselines?
Q4: How do AI-based approaches perform in robustness tests, compared to
heuristic baselines, with respect to changing assumptions such as novel
processors or unseen workloads/kinds of queries? What level of
improvement does curricula management bring regarding robustness and
sample-efficiency?
Q5: What techniques from learning management (such as learning from
demonstrations, or transfer optimization) or from database
implementation contribute the most to an efficient integration of the AI
components into the lifecycle of data management? Do these techniques
contribute to trade-off management between approaches? To what extent do
these techniques improve the overall readiness of our solution over
baseline choices?

Figure 1: General Architecture of our Proposed System
[Components shown: SQL Query Interface, System Administration Interface,
Global Optimizer, Storage Manager, Device Processor Class Optimizer,
AI Support System, Data market interface (bidding), Query market
interface (bidding).]

5.    RELATED WORK

5.1    AI-based database systems
   Incorporating AI components into traditional systems, to improve
overall system performance, is a significant topic that is currently
attracting great attention from researchers, in both theoretical and
applied aspects. On one hand, multiple studies investigate strategies
for an overall co-design of systems and AI, fitting for general
applications [18] and more specific ones in databases [21]. On the other
hand, much research zooms into particular problems that can benefit from
suitable AI techniques. In databases, these range from cost and
cardinality estimation assisted by deep neural networks [10], to join
order selection or partitioning supported by reinforcement learning
[15, 4, 8], and many more [29, 14]. In a bigger scope, other recent
literature also studies complete database management systems assisted by
machine intelligence, instead of focusing on only one or a few tasks
within them. Some highlighted work, which can be considered to be at a
relatively early stage, includes Peloton [20], SageDB [12] and GaussDB².

² https://e.huawei.com/en/solutions/cloud-computing/big-data/gaussdb-distributed-database

5.2    Market-based distributed database systems
   In economics, a market is defined as any structure that enables
trading activities among its participants, for any types of goods,
services or information, following a pricing mechanism that aims for
optimal distribution and allocation of resources. Interestingly, this
concept has been reformulated to efficiently solve the problems of query
optimization in many distributed data management systems. To help with
some of our ideal design characteristics (C2), we consider this concept
to be relevant.
   The system that we base our design on is Mariposa [26], which adopts
market concepts to achieve autonomous data sharing and query processing.
In a wide-area network, Mariposa allows each single site to take full
control over its own resources, enabling it to decide on data objects to
buy or sell and on queries to bid on for execution. A bidding protocol
is defined to regulate the transactions among all sites within the two
markets: 1) Query Market: each query Q enters the system with a budget
B(t) indicating the price that the user wants to pay for running Q
within time t. Also, Q is administered by a broker, which sends out to
bidder sites the requests for bids to execute subqueries Q_1, ..., Q_n
and then decides on the winning sites. 2) Data Market: each table
included in the FROM clause of a query can be split into a set of
fragments. A site needs to buy the fragments referenced in the subquery
that it wants to bid on, and can sell the fragments it has to evict at
any time by conducting an auction, following the system pricing
mechanism. The trading process runs continuously. Each site makes
decisions on storing, buying and selling fragments or the replicas of
fragments made by the site itself, aiming at maximizing its profit per
unit time.
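The query market described above can be illustrated with a small sketch. It is not Mariposa's actual algorithm: the `Bid` record, the `select_winners` function, the cheapest-bid-per-subquery heuristic, the assumption that subqueries run one after another, and the `linear_budget` curve are all hypothetical choices made here for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Bid:
    site: str        # bidder site offering to run the subquery
    subquery: str    # identifier of the subquery, e.g. "Q1"
    price: float     # price the site asks for
    delay: float     # time the site promises to finish in


def select_winners(
    subqueries: List[str],
    bids: List[Bid],
    budget: Callable[[float], float],
) -> Optional[Dict[str, Bid]]:
    """Broker step: take the cheapest bid per subquery, then check B(t)."""
    winners: Dict[str, Bid] = {}
    for sq in subqueries:
        offers = [b for b in bids if b.subquery == sq]
        if not offers:
            return None  # no site bid on this subquery
        # Cheapest price wins; ties are broken by the shorter delay.
        winners[sq] = min(offers, key=lambda b: (b.price, b.delay))
    total_price = sum(b.price for b in winners.values())
    # Assumption: subqueries run sequentially, so their delays add up.
    total_delay = sum(b.delay for b in winners.values())
    if total_price > budget(total_delay):
        return None  # the plan costs more than the user pays for time t
    return winners


def linear_budget(t: float) -> float:
    """Hypothetical budget B(t): 10 at t=0, minus 1 per time unit, never below 0."""
    return max(0.0, 10.0 - t)


bids = [
    Bid("site_a", "Q1", price=2.0, delay=1.0),
    Bid("site_b", "Q1", price=3.0, delay=0.5),
    Bid("site_b", "Q2", price=4.0, delay=2.0),
    Bid("site_c", "Q2", price=5.0, delay=1.0),
]
result = select_winners(["Q1", "Q2"], bids, linear_budget)
```

In this sketch the broker rejects the whole query when the chosen bids exceed the budget for their combined delay. A more faithful broker would presumably trade price against delay across bids instead of always taking the cheapest one.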
   Another framework that is based on an economic paradigm to achieve
self-adaptive query allocation in large-scale distributed systems is
SQLB [23], in which the authors highlight the importance of constantly
maintaining the interests of the participants throughout the ongoing
market. The system aims at preserving participants' satisfaction with
query allocation/execution and guaranteeing query load balancing within
the system, which then helps in minimizing the response time and
maximizing system throughput.
   NashDB [16] is a more recent framework that shows efficiency in
autonomously handling data fragmentation, replica generation, allocation
and cluster sizing to attain the Nash equilibrium, i.e. supply-demand
balance in markets.

6.   CONCLUSION
   The search space of traditional query optimizers is very large. This
search space grows manyfold with the introduction of co-processors for
query execution. AI techniques are promising solutions for traversing
such a large search space, identifying the best plan in an effective and
time-efficient way. As a consequence, there is a growing body of
research devoted to AI-based solutions [29]. However, turning these
solutions into a production-ready contribution remains a challenge,
since AI, and machine learning in particular, come with many intrinsic
challenges that require overall system considerations. In sum, system
builders are placed in the difficult position of having to
simultaneously tackle a set of heterogeneous co-processing challenges,
next to a set of AI adoption challenges.
   As a motivation for this work, we considered that a principled system
design could contribute to addressing the aforementioned two sets of
challenges, while at the same time helping in the integration of
different technological innovations. In order to contribute towards this
goal, in this paper we summarized preceding work that helped us to
identify 7 design characteristics (C1-C7), addressing needs for scoping
complexity and difficulty of learning (C1, C7), high
adaptability/instance optimality (C2-C3), scale (C4), and machine
learning issues in general (C5-C6). Based on this, we proposed an early
overall system design, based on concepts originally studied in the
visionary Mariposa system, specifically the economic concepts for a data
and query market. We propose this design to fulfill the design
characteristics, while offering an AI-based heterogeneous co-processing
database. To conclude, we listed open questions that we would like to
review, moving forward, by using our proposed design.

7.   ACKNOWLEDGMENTS
   This work was partially funded by the DFG (grant no.: SA 465/51-1 and
PI 447/9). The authors would like to thank Marcus Pinnecke, Andrey
Kharitonov, Rajatha Rao and Yash Shah for collaborations related to this
work.

8.   REFERENCES
 [1] I. Arefyeva, D. Broneske, G. Campero, M. Pinnecke, and G. Saake. Memory management strategies in CPU/GPU database systems: A survey. In BDAS. Springer, 2018.
 [2] S. Babu, N. Borisov, S. Duan, H. Herodotou, and V. Thummala. Automated experiment-driven management of (database) systems. In HotOS, 2009.
 [3] D. Broneske, S. Breß, M. Heimel, and G. Saake. Toward hardware-sensitive database operations. In EDBT. OpenProceedings.org, 2014.
 [4] G. C. Durand, R. Piriyev, M. Pinnecke, D. Broneske, B. Gurumurthy, and G. Saake. Automated vertical partitioning with deep reinforcement learning. In ADBIS. Springer, 2019.
 [5] B. Gurumurthy, D. Broneske, M. Pinnecke, G. Campero, and G. Saake. SIMD vectorized hashing for grouped aggregation. In ADBIS. Springer, 2018.
 [6] B. Gurumurthy, T. Drewes, D. Broneske, G. Saake, and T. Pionteck. Adaptive data processing in heterogeneous hardware systems. In GvDB, 2018.
 [7] A. Haj-Ali, N. K. Ahmed, T. Willke, J. Gonzalez, et al. A view on deep reinforcement learning in system optimization. arXiv preprint arXiv:1908.01275, 2019.
 [8] B. Hilprecht, C. Binnig, and U. Röhm. Learning a partitioning advisor for cloud databases. In SIGMOD, 2020.
 [9] G. Hulten. Building Intelligent Systems. Springer, 2019.
[10] A. Kipf, D. Vorona, J. Müller, T. Kipf, et al. Estimating cardinalities with deep sketches. In SIGMOD. ACM, 2019.
[11] J. Kossmann and R. Schlosser. Self-driving database systems: A conceptual approach. DAPD, 2020.
[12] T. Kraska, M. Alizadeh, A. Beutel, H. Chi, et al. SageDB: A learned database system. In CIDR, 2019.
[13] A. Lavin, C. M. Gilligan-Lee, A. Visnjic, S. Ganju, et al. Technology readiness levels for machine learning systems. arXiv, 2021.
[14] G. Li, X. Zhou, and S. Li. Xuanyuan: An AI-native database. IEEE Data Eng. Bull., 42(2), 2019.
[15] R. Marcus and O. Papaemmanouil. Deep reinforcement learning for join order enumeration. In aiDM@SIGMOD. Association for Computing Machinery, 2018.
[16] R. Marcus, O. Papaemmanouil, S. Semenova, and S. Garber. NashDB: An end-to-end economic method for elastic database fragmentation, replication, and provisioning. In SIGMOD. Association for Computing Machinery, 2018.
[17] A. Meister, S. Breß, and G. Saake. Toward GPU-accelerated database optimization. Datenbank-Spektrum, 15(2), 2015.
[18] C.-J. Mike Liang, H. Xue, M. Yang, and L. Zhou. The case for learning-and-system co-design. ACM SIGOPS Operating Systems Review, 53(1), 2019.
[19] A. Paleyes, R.-G. Urma, and N. Lawrence. Challenges in deploying machine learning: A survey of case studies. arXiv, abs/2011.09926, 2020.
[20] A. Pavlo, G. Angulo, J. Arulraj, H. Lin, et al. Self-driving database management systems. In CIDR, volume 4, 2017.
[21] A. Pavlo, M. Butrovich, A. Joshi, L. Ma, et al. External vs. internal: An essay on machine learning agents for autonomous database management systems. IEEE Data Eng. Bull., 42, 2019.
[22] M. Pinnecke, D. Broneske, G. C. Durand, and G. Saake. Are databases fit for hybrid workloads on GPUs? A storage engine's perspective. In ICDE. IEEE, 2017.
[23] J.-A. Quiane-Ruiz, P. Lamarre, and P. Valduriez. SQLB: A query allocation framework for autonomous consumers and providers. 2007.
[24] A. Raza, P. Chrysogelos, P. Sioulas, V. Indjic, et al. GPU-accelerated data management under the test of time. In CIDR, 2020.
[25] I. Stoica, D. Song, R. A. Popa, D. Patterson, et al. A Berkeley view of systems challenges for AI. arXiv preprint arXiv:1712.05855, 2017.
[26] M. Stonebraker, P. M. Aoki, W. Litwin, A. Pfeffer, et al. Mariposa: A wide-area distributed database system. VLDB Journal, 5(1), 1996.
[27] M. Zahran. Heterogeneous computing: Hardware and software perspectives. ACM, 2016.
[28] Y. Zhang, Y. Zhang, J. Lu, S. Wang, et al. One size does not fit all: Accelerating OLAP workloads with GPUs. DAPD, 38, 2020.
[29] X. Zhou, C. Chai, G. Li, and J. Sun. Database meets artificial intelligence: A survey. TKDE, 2020.