<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <title-group>
        <article-title>Ephemeral Per-query Engines for Serverless Analytics</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Michael Wawrzoniak</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rodrigo Bruno</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ana Klimovic</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gustavo Alonso</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>INESC-ID/Técnico, U. Lisboa</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Systems Group, Computer Science Department</institution>
          ,
          <addr-line>ETH Zürich</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>1</volume>
      <issue>2023</issue>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>We challenge the common assumption that queries are submitted to a pre-configured, already running engine and put forward the idea of dynamically instantiating a chosen data processing engine upon query submission by leveraging Function-as-a-Service (FaaS) platforms. We demonstrate the idea by running unmodified data processing engines (we use Apache Drill as an initial example) on real-world serverless FaaS platforms and show that such engines can be instantiated on demand when a query arrives. We aim to eventually support a wide range of queries and workloads. Wide access to such functionality would be a game changer in data processing. First, it would enable pay-per-query models supporting sporadic, interactive data analysis on arbitrary engines. Second, it would significantly increase the flexibility for data processing by enabling the possibility of dynamically choosing the actual engine, its configuration, and the resource allocation on a per-query basis. Logically, this amounts to dynamically attaching a query engine to the query rather than sending the query to a pre-configured and already deployed engine. In this paper we elaborate on this vision, outline the design of the MetaQ prototype that we are building to explore the idea, demonstrate that it is realistic through initial experiments, and discuss its many exciting practical implications.</p>
      </abstract>
      <kwd-group>
        <kwd>Serverless</kwd>
        <kwd>Data Analytics</kwd>
        <kwd>Functions-as-a-Service</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Operating a long-running query engine has several limitations. First, it generates costs even if it is idle. Second, most distributed query engines lack elasticity, which leads to deployments being over-provisioned to cope with potential peak loads [<xref ref-type="bibr" rid="ref1 ref2 ref3">1, 2, 3</xref>]. And third, as workload diversity increases, each query might benefit from a different configuration and/or engine deployment (e.g., involving accelerators, caches, parallelism level, etc.), resulting in the engine often running in a less than optimal setting for most queries [<xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>].</p>
      <p>In this paper we explore an ambitious and radically new design: one in which we take advantage of serverless computing to provide ephemeral per-query engines (EPQE), i.e., query engines dynamically instantiated for each query and discarded upon completion. The ultimate goal is to be able to select the optimal engine and configuration on a per-query basis, to eliminate the inefficiencies of using all-purpose configurations and resource overprovisioning.</p>
      <p>In the EPQE paradigm, given a query, a query engine is instantiated (potentially selected from a variety of engines) in the best possible configuration and deployment for the query, the query is executed by the engine, and, upon completion, the engine is shut down (unless there is a reason to keep it running, such as a similar query arriving while the engine is active). This eliminates the need for dynamic elasticity in the engine: every query gets an engine deployed on just the resources it needs (e.g., nodes, memory, bandwidth, CPUs). It also simplifies engine deployment (since the engine can be instantiated specifically for the query at hand, e.g., maximizing data source locality) and removes the need for auto-tuning [<xref ref-type="bibr" rid="ref5">5</xref>] of long-running engines (the engine settings need to be optimized only for the given query, which allows for more specialized and efficient solutions [<xref ref-type="bibr" rid="ref6">6</xref>]). The approach also eliminates the problem of idle resources: if there is no query, there is no engine running. Finally, another crucial aspect of the idea is the possibility of selecting among different data processing engines on a per-query basis. This opens up the opportunity to use different engines depending on factors such as data types (e.g., relational, semi-structured, graphs), file formats (e.g., Arrow, Parquet, CSV, JSON), expected performance (e.g., based on previous profiling), feature set (e.g., availability of required statistical functions), or suitability to the overall task (e.g., when the query is a step in an ML pipeline). The idea resembles unikernel operating systems [<xref ref-type="bibr" rid="ref7">7</xref>] where, for each application, a specialized operating system is constructed (e.g., from a library operating system [<xref ref-type="bibr" rid="ref8">8</xref>]) and instantiated, already optimized for the application.</p>
      <p>The vision of EPQE is enabled by the emergence of Function as a Service (FaaS). In serverless computing, users deploy and invoke fine-grain functions on-demand [9, 10]. There are three main characteristics of serverless that can help in realizing the EPQE idea. First, thanks to lightweight VM system infrastructure [<xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>], functions can be instantiated quickly. For example, in AWS Lambda [<xref ref-type="bibr" rid="ref13">13</xref>], function cold start initialization latency is ∼200ms. Such fast resource instantiation times allow starting a new engine for a query without contributing significantly to the overall execution time. Second, individual functions can be deployed with different CPU and memory configurations. Furthermore, thousands of functions can be instantiated in parallel. Such a level of resource availability and configurability allows us to right-size and right-configure engines at per-query granularity. Finally, FaaS platforms provide fine-grained resource accounting (e.g., AWS Lambda users pay at microsecond granularity), aligning the costs of the EPQEs to the work done; this accounting can also play a role in deciding which engine to instantiate.</p>
      <p>However, despite their advantages, today's FaaS serverless platforms are not adequate for general data processing [<xref ref-type="bibr" rid="ref14 ref15">14, 15</xref>], since running queries often requires features that are missing, such as caching, support for direct communication among functions, and persistent state. This is the result of a conscious choice by providers, who bundle functions with a very restricted programming model based on network-isolated, event-triggered modules composable into larger systems through workflow-based orchestration services [<xref ref-type="bibr" rid="ref16 ref17">16, 17</xref>]. To overcome this mismatch, a significant research effort is underway. One approach involves redesigning serverless platforms from scratch and developing a completely new FaaS platform to be run on VMs (e.g., the Anna key-value store [<xref ref-type="bibr" rid="ref18">18</xref>]). Another approach relies on commercial serverless FaaS offerings and tries to overcome some of the platform shortcomings from a data processing [<xref ref-type="bibr" rid="ref19 ref20">19, 20</xref>] or ML [<xref ref-type="bibr" rid="ref21 ref22">21, 22</xref>] perspective. These systems propose, among other things, complex ways to reduce the overhead of communicating through cloud storage and clever optimizations to minimize the amount of data exchanged, and suggest algorithms to reduce the impact of start-up times as the number of functions needed grows. In addition, there are efforts to leverage commercial serverless FaaS offerings to provide caching and storage services to data center applications running outside of the serverless functions [23, 24]. Unlike these existing efforts, which build custom experimental FaaS query engines to circumvent the limitations of serverless platforms, our approach is to leverage existing serverless infrastructure to run unmodified state-of-the-art data processing systems. By including existing unmodified engines, we will be able to take advantage of their wide variety, feature completeness, and the years of effort put into their development and optimization. However, all of the real-world FaaS platforms that we are aware of do not provide execution environments that support running off-the-shelf distributed query engines.</p>
      <p>Our approach is to leverage an evolution of the Boxer [25] system, which aims to overcome FaaS limitations (e.g., by enabling inter-function networking) to provide an execution environment on top of existing commercial FaaS platforms (such as AWS Lambda) that matches the requirements of unmodified off-the-shelf query engines. To explore the feasibility of the EPQE concept, in this paper we investigate (1) whether it is already possible to run existing query engines on a commercial serverless system (AWS Lambda); (2) whether the resulting performance is acceptable, since existing distributed query engines were not originally designed to operate on top of serverless functions; and (3) whether selecting engines on a per-query basis would bring an advantage. We build a prototype system, MetaQ, as a way to realize the EPQE model and conduct a feasibility study.</p>
      <p>Our focus is on distributed data processing platforms, such as Apache Spark or Apache Drill, instead of traditional database engines, such as PostgreSQL or MySQL. We do not analyze the cost tradeoffs of using AWS Lambda for data analytics; previous works [<xref ref-type="bibr" rid="ref19 ref20">19, 20</xref>] established that serverless can reduce costs for bursty query workloads. In particular, steady, similar, high-throughput workloads are better served by long-running systems utilizing more cost-effective infrastructure than AWS Lambda (e.g., AWS EC2 virtual machines).</p>
      <p>We report the result of using an unmodified version of Apache Drill [26] in a distributed configuration over serverless FaaS, and its performance running the TPC-H benchmark. This initial experiment shows that the EPQE approach is feasible and that, for all but one query, executing the query with the ephemeral approach is faster than the time it takes to simply instantiate a system with a matching configuration over AWS Fargate [27]/Elastic Container Service (ECS) [28] (without even starting to run any queries). We study the start-up time of a query processing engine in this context to examine its practical feasibility. Finally, we also discuss preliminary results indicating that some queries run faster in one engine (Apache Drill) than in others (Apache Spark [29]), and vice-versa, providing initial evidence that the per-query engine selection approach can bring important advantages.</p>
    </sec>
    <sec id="sec-1-1">
      <title>2. MetaQ Prototype</title>
      <p>We first outline the design of the MetaQ prototype, a proof-of-concept design of the EPQE paradigm.</p>
      <p>MetaQ has three main components: the session manager (SM), the platform provider (PP), and the meta-system optimizer (MO). The session manager oversees end-to-end query execution and its resources, including handling communication with the client. The platform provider orchestrates the required resources and configures the environment required for the query engine execution. The meta-system optimizer is used to determine the complete specification of the resources, the query engine to be used, its configuration, and possibly engine-specific query rewriting. In cases when users provide the complete specification, the meta-system optimizer can be bypassed, since the execution is fully specified.</p>
      <p>[Figure 1: Overview of query execution in MetaQ, showing the session manager (SM), the meta-system optimizer (MO), and the platform provider (PP) connecting FaaS compute to cloud storage. No compute resources are instantiated before query execution.]</p>
      <p>Figure 1 illustrates query execution in MetaQ. To execute a query, a user (step ①) starts MetaQ and specifies the query and (optionally) the specifications of the query engine and resources to use for the query execution. MetaQ launches as a serverless FaaS function that can be instantiated on demand via a request to an API proxy service of a cloud provider (such as AWS API Gateway [30]).</p>
      <p>MetaQ begins by instantiating the session manager (SM) for the given query. If the user-supplied specifications of the query engine or resources are not complete or are left underspecified, then (step ②) the meta-system optimizer (MO) is used to choose all of the missing specifications. The specification has three elements: (a) the initial resource allocation (e.g., where, how, and how many configured AWS Lambda functions should be started); (b) the query engine to use (such as Apache Drill, Apache Spark, Trino [31], etc.); and (c) the configuration of the query engine instantiation, including auxiliary systems such as Zookeeper [32] (e.g., mapping of engine executors onto the resources, configuring engine settings, required storage plugins, etc.).</p>
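      <p>To make the shape of such a specification concrete, the following is a hypothetical sketch of a complete specification covering elements (a)-(c), written here as a Python dictionary. The field names are invented for illustration and do not correspond to a documented MetaQ input format; the resource numbers mirror the configuration used in the feasibility study of Section 3.</p>
      <preformat>
# Hypothetical, illustrative MetaQ query specification covering
# (a) resources, (b) engine, and (c) engine configuration.
# Field names are invented for this sketch, not a documented format.
query_spec = {
    "query": "SELECT ... FROM lineitem ...",    # the user query (SQL)
    "resources": {                              # (a) initial resource allocation
        "platform": "aws_lambda",
        "functions": 10,
        "vcpus_per_function": 6,
        "memory_mb_per_function": 10240,
        "architecture": "x86_64",
    },
    "engine": {                                 # (b) query engine to instantiate
        "name": "apache-drill",
        "image": "catalog/drill-faas:latest",   # hypothetical image from the catalog
    },
    "configuration": {                          # (c) engine and auxiliary systems
        "roles": {"head": 1, "worker": 8, "zookeeper": 1},
        "storage_plugins": [{"type": "s3", "bucket": "tpch-sf10"}],
        "engine_settings": {},                  # stock settings, as in Section 3
    },
}
      </preformat>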
      <p>Once the complete specification is determined, it is used to instruct the platform provider (PP) (step ③) to instantiate and configure the specified resources and then start the configured query engine processes (and any auxiliary systems). The platform provider (step ④), using the specification of the initial resource allocation (a), requests the resources from the underlying platform, such as networked FaaS functions, configures their networking, and assigns the necessary names, roles, and ids to function instances. The query engine specification (b) determines which function (or container) images are instantiated from the available catalog. Finally, before the platform provider starts the query engine, the specification of the query engine configuration (c) is used to populate the necessary configuration files and environment variables for the query engine.</p>
      <p>Once the engine is started and ready to process queries, the session manager (step ⑤) submits the user query and awaits the results from the execution engine. When the query execution completes, the session manager retrieves the results (step ⑥) and returns them to the user (step ⑦). All of the resources are then released, and the system scales back to zero.</p>
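      <p>A minimal sketch of this control flow follows; all of the component interfaces (optimizer, platform provider, deployment handle) are hypothetical placeholders, since the paper does not fix the SM/MO/PP APIs.</p>
      <preformat>
# Illustrative end-to-end flow of a MetaQ session manager (steps 1-7).
# The objects and methods are hypothetical placeholders.

def run_query(request, optimizer, platform_provider):
    spec = request.spec                               # step 1: query + (partial) specification
    if not spec.is_complete():
        spec = optimizer.complete(spec)               # step 2: MO fills in missing elements
    deployment = platform_provider.instantiate(spec)  # steps 3-4: allocate, configure, start
    try:
        deployment.wait_until_ready()                 # engine and auxiliary systems come up
        result = deployment.submit(spec.query)        # step 5: submit the user query
        return result                                 # steps 6-7: retrieve and return results
    finally:
        deployment.release()                          # all resources released; scale to zero
      </preformat>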
      <p>We assume that the persistent data is stored in standard formats (such as Parquet, ORC, Avro, or CSV) and is available through cloud storage services compatible with the common query engines (such as S3 or EBS). We restrict the set of distributed query engines considered to ones that can be used in such networked shared-disk configurations.</p>
      <p>Our current prototype of MetaQ uses AWS Lambda FaaS functions. To run off-the-shelf query engines despite the restricted function execution environment, we utilize Boxer to provide the required but missing functionality. Boxer is a system that runs standard datacenter applications in FaaS environments, providing the expected network-of-hosts execution model. Boxer runs in every function, alongside the application processes, and establishes an ephemeral network between the participating functions. Boxer executes unmodified application processes (query engines and any auxiliary systems) in a FaaS environment while transparently exposing function-to-function networking via the standard POSIX interfaces (stream sockets, etc.). To facilitate configuring the unmodified distributed query engines in FaaS, Boxer is used to assign roles to functions, provide name resolution and host membership, and coordinate query engine process execution. The collection of these Boxer features provides an execution environment in AWS Lambda FaaS that closely matches what is expected by distributed query engines.</p>
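      <p>For illustration only: under this network-of-hosts model, an unmodified engine process inside a function can open ordinary TCP connections to its peers, exactly as it would on a cluster of VMs. The peer hostname below is hypothetical and assumes Boxer-provided name resolution; 31010 is Apache Drill's default user-server port.</p>
      <preformat>
# Sketch of the POSIX stream-socket model that off-the-shelf engines
# expect; the hostname is hypothetical, resolved by Boxer inside the functions.
import socket

with socket.create_connection(("drill-worker-3", 31010)) as conn:
    conn.sendall(b"...")          # engine-level protocol bytes, unchanged
    reply = conn.recv(4096)       # function-to-function traffic over Boxer's network
      </preformat>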
        <sec id="sec-1-3-1">
          <title>We first outline the design of the MetaQ prototype, a</title>
          <p>proof of concept design of the EPQE paradigm.</p>
          <p>MetaQ has three main components: session manager
(SM), platform provider (PP), and meta-system optimizer
(MO). The session manager oversees end-to-end query
execution and its resources, including handling
communication with the client. The platform provider orchestrates
the required resources and configures the environment
required for the query engine execution. The meta-system
optimizer is used to determine the complete specification
of the resources, the query engine to be used, its
conifguration, and possibly engine-specific query rewriting.</p>
          <p>In cases when users specify the complete specification,
the meta-system optimizer can be bypassed since the
execution is fully specified.</p>
          <p>Figure 1 illustrates query execution in MetaQ. To
execute a query, a user (step ○ 1 ) starts MetaQ and
speciifes the query and (optionally) the specifications of the
query engine and resources to use for the query
execution. MetaQ launches as a serverless FaaS function
that can be instantiated on demand via a request to an
API proxy service of a cloud provider (such as AWS API
Gateway [30]).</p>
          <p>MetaQ begins by instantiating the session manager</p>
          <p>Startup
Query Executution
Fargate/ECS/EC2 min. billable
Fargate/ECS Init
0.42 0.48
0
1
3
5
6
7
8</p>
          <p>9 12
TPC-H Query
13
14
16
17
18
20
establishes an ephemeral network between the partici- reduce costs for bursty query workloads. For this study,
pating functions. Boxer executes unmodified application we chose to use a variant of Boxer as our MetaQ platform
processes (query engines and any auxiliary systems) in a provider (PP) component, which allowed us to instantiate
FaaS environment while transparently exposing function- networked systems using AWS Lambda. For this initial
to-function networking via the standard POSIX interfaces validation, we assumed that along with every considered
(stream sockets etc.). To facilitate configuring the unmod- query, the user specifies the complete system
specifiified distributed query engines in FaaS, Boxer is used to cation (resource allocations, query engine specification,
assign roles to functions, provide name resolution, host and configuration). This bypasses the meta-engine
optimembership, and coordinate query engine process execu- mizer(MO), which we plan to explore in the next stages
tion. The collection of these Boxer features provides an of our research.
execution environment in AWS Lambda FaaS that closely We experiment with per-query instantiations of Apache
matches what is expected by distributed query engines. Drill, a general-purpose distributed SQL engine inspired</p>
          <p>Although we show how MetaQ can run in FaaS envi- by Google Dremel [33]. We used the TPC-H benchmark
ronments, its design is not tied to them. For example, to simulate the user queries to be evaluated using MetaQ.
MetaQ’s components (SM,MO,PP) could execute locally Using the benchmark tools, we populated S3 cloud
storon the user’s computer, and then could provide (a sub- age with data set at scale factor 10, resulting in 12 GBytes
set of) standard client protocols that many distributed of data and with the largest relation with almost 60
milquery engines often expose (such as PostgreSQL stan- lion tuples. Each TPC-H query evaluation request was
dard wire protocol or JDBC). Independently, there could accompanied by the complete query system specification
be diferent platform providers (PP) giving access to dif- specifying (a) resources for 10 AWS Lambda functions
ferent types of resources for query execution, from the with 6 vCPUs, x86_64 architecture, and 10GB of
memuser’s local resources (useful for smaller workloads) to ory each, (b) Apache Drill as the query engine (the only
serverless container services such as AWS Fargate or engine option in our experiment), and (c) stock
configfuture serverless platforms that may provide access to uration options for Apache Drill worker nodes, a head
heterogeneous hardware accelerators. node, and a single Apache Zookeeper node (required by
Apache Drill).</p>
          <p>
            The experiment emulates a session manager (SM) that
3. Feasibility study uses the Boxer system as the platform provided (PP) to
instantiate resources on AWS Lambda and to start Apache
3.1. Methodology Drill nodes (and Zookeeper). The experimental session
To validate the real-world feasibility of the EPQE paradigm, manager then waits for the query system to be available,
we experiment with some of the basic components of the and then submits the query and waits for the results,
MetaQ prototype design. We focus our analysis on the and returns on completion. In this study, to factor out
technical feasibility of MetaQ rather than analyzing its the efects of function caching, we ensure that only cold
cost tradeofs, [
            <xref ref-type="bibr" rid="ref19 ref20">19, 20</xref>
            ] have shown that serverless can functions are used for each query.
      <sec id="sec-1-2-2">
        <title>3.2. End-to-end Query Latency</title>
        <p>[Figure 2: Median end-to-end times per TPC-H query, split into startup and query execution, with the AWS Fargate/ECS initialization time and minimum billable duration shown for comparison.]</p>
        <p>Figure 2 shows the median end-to-end query execution times. Without optimizing the Drill configuration, the observed median end-to-end query execution times were between 30.42s and 65.13s. (Not all of the TPC-H queries were able to run on Drill with the current Boxer variant, due to its limit of fewer than 1024 file descriptors available to Drill, while for some queries Drill required more.) For comparison, if we chose an alternative platform provider (PP) based on a serverless container service such as AWS Fargate (using the AWS Elastic Container Service (ECS) or the AWS Elastic Kubernetes Service (EKS) [34]), we expect the execution times to be significantly higher. Such container services are not optimized for startup times, and their implementations rely on EC2 for on-demand resource allocation. We observed that the median time to just instantiate a comparable (serverless) container (8 GBytes of memory, with a 1024 MByte image size) using AWS Fargate/ECS is 54.9s (dashed line in Figure 2). This means that by the time the ECS container only begins to start the query engine, all but one of the queries executed by MetaQ are already finished and the resources are already released. Furthermore, the minimum billable duration for AWS Fargate/ECS is 1 minute, while AWS Lambda billing is at 1ms granularity with no set minimum.</p>
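        <p>The billing-granularity difference alone matters for short-lived, per-query deployments. As a back-of-the-envelope illustration (billable durations only; per-unit prices are omitted):</p>
        <preformat>
# Compare billable durations for a short-lived per-query deployment,
# using the billing rules quoted above: Fargate/ECS bills a minimum of
# 1 minute, while Lambda bills at 1 ms granularity with no minimum.

def billable_s_fargate(run_s):
    return max(run_s, 60.0)           # 1-minute minimum billable duration

def billable_s_lambda(run_s):
    return round(run_s, 3)            # 1 ms granularity, no minimum

for run_s in (30.42, 65.13):          # the fastest and slowest medians above
    print(run_s, billable_s_fargate(run_s), billable_s_lambda(run_s))
        </preformat>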
        <p>Takeaway 1: MetaQ improves performance and reduces resource usage by instantiating per-query data processing engines on FaaS infrastructure, compared to containers or virtual machines.</p>
        <p>Figure 2 shows that a significant fraction of the query execution is consumed by the startup time. The median time for the system to become ready to start executing a query is 19.67s, and (in terms of median values) that consumes between 30% (for query 9) and 67% (for query 6) of the total execution times for the queries we tested. There are many techniques that can be used to reduce this time (we have not optimized it in this experiment at all), from configuring the system to avoid starting unnecessary components to snapshotting JVM state [35, 36, 37]. Fortunately, because faster startup times are desirable for other use-cases of FaaS platforms as well, AWS Lambda recently started to offer the ability to fully snapshot the initial function state to avoid this issue [38]. We have not yet explored this feature, so the current results with FaaS should be treated as a conservative upper bound, since there are further optimizations that we can enable, such as restoring from snapshots.</p>
        <p>Takeaway 2: MetaQ does not interfere with potential optimizations that cloud providers could introduce to FaaS. Its performance will only improve with these optimizations, giving it an even bigger advantage over current solutions.</p>
      </sec>
      <sec id="sec-1-2-3">
        <title>3.3. Engine Startup Time</title>
        <p>We also examined the variance of the query execution times and the startup times. The error bars in Figure 2 show the maximum and minimum times for each query execution time relative to the median of the startup time (the variance due to the startup time is factored out). We observe a noticeable, but acceptable, variance in the query execution time, with the majority of the queries having median execution times within 10% of the slowest and fastest executions. The highest observed dispersion was for Query 20, with the slowest observed execution being 18% slower than the median.</p>
        <p>[Figure 3: Empirical CDF of the observed system startup times of all instantiations in the experiment: the time from resource instantiation to the time when the 8-worker-node Apache Drill system is ready to start executing the query.]</p>
        <p>However, when we inspect the distribution of all of the startup times during the experiment, shown in Figure 3, we observe significantly higher variance. (Note that in this experiment the startup time is independent of the executed query, since the client always specifies the same complete specification with each query request, so we do not factor the startup times by the query executed.) The startup times ranged widely, from 17.78s to 47.02s. In particular, the measurements form two groups: of the total of 70 measurements, the top 5 times (grey area in Figure 3) were above 43s, while all the remaining runs needed less than 27s to start executing the query. Our initial investigation into the source of this variance indicates that its main contributor is the time for the Apache Drill workers to become available after their processes are started. Since the variance does not persist into the query execution times, it suggests that these stragglers are not due to their function execution context being (permanently) resource constrained. It is possible that these functions had to fetch base images from a deeper storage hierarchy as the worker processes were loading blocks of data during startup. This suggests that meta-system optimizer (MO) strategies should consider instantiating additional workers to compensate for this straggler phenomenon; once enough workers are available, MetaQ could then terminate the unnecessary stragglers. A similar technique is already performed internally by the Boxer platform provider: depending on its configuration, Boxer already instantiates additional functions, proceeds with the requested number of functions that became available first, and immediately terminates the rest of the slower, unnecessary functions. Notice that such straggler-discarding techniques are feasible on FaaS because of the fine granularity of accounting and the absence of a minimum billing time.</p>
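        <p>A sketch of such a straggler-discarding start-up strategy follows. It is a hypothetical policy in the spirit of what Boxer already does internally, not its actual implementation: launch more workers than needed, proceed with the first ones that become ready, and terminate the rest as they appear.</p>
        <preformat>
# Hypothetical straggler mitigation: over-provision worker functions and
# keep only the first `needed` that become ready. `launch_worker()` is a
# placeholder that blocks until one worker is ready and returns a handle
# with a terminate() method.
import concurrent.futures

def start_workers(launch_worker, needed, extra=2):
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=needed + extra)
    futures = [pool.submit(launch_worker) for _ in range(needed + extra)]
    first_ready = []
    for fut in concurrent.futures.as_completed(futures):
        first_ready.append(fut)
        if len(first_ready) == needed:
            break                     # enough workers; do not wait for stragglers
    for fut in futures:
        if fut not in first_ready:
            # discard each straggler as soon as it comes up; fine-grained
            # billing means it stops costing money immediately
            fut.add_done_callback(lambda f: f.result().terminate())
    pool.shutdown(wait=False)
    return [f.result() for f in first_ready]
        </preformat>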
        <p>Takeaway 3: Although limited in scope, the experiment demonstrates the real-world feasibility of the per-query engine paradigm. This very simple experiment leaves many possibilities for future improvements, but it already highlights the potential of our vision and motivates the further exploration of the design space and future work on the MetaQ prototype.</p>
      </sec>
      <sec id="sec-1-2-4">
        <title>3.4. Selecting Query Engines</title>
        <p>A key aspect of EPQE is the possibility of choosing a different engine for each query. Although further investigation is necessary, our preliminary comparison of the query execution times of Apache Drill and Apache Spark indicates that there likely will be a performance gain from choosing different engines at per-query granularity. We measured the query execution times of TPC-H at scale factor 30 for Apache Drill and Apache Spark using 8 AWS Lambda worker nodes. Ignoring the startup times and based only on the relative query execution times, we observe that for 14 of the queries (1, 4, 5, 6, 7, 9, 10, 11, 12, 13, 15, 16, 17, 22) Drill noticeably outperformed Spark, for 3 queries (14, 19, 20) Spark outperformed Drill, for 2 queries (8, 18) performance was similar, while 3 queries were completed by only one of the two engines (2 and 21 by Spark only, 3 by Drill only). These initial results suggest that, indeed, the notion of instantiating a different engine depending on the query can be beneficial. This opens up very interesting research questions in terms of how an optimizer could decide which system to use.</p>
      </sec>
    </sec>
    <sec id="sec-1-3">
      <title>4. Use Cases</title>
      <p>In this section we explore use cases that could either be implemented on top of the MetaQ prototype or would require additional work on several aspects of the system and further research.</p>
      <sec id="sec-1-3-1">
        <title>4.1. More Efficient Data Analytics</title>
        <p>There is a growing amount of work exploring how to best use commercial serverless platforms for data analytics. Lambada [<xref ref-type="bibr" rid="ref19">19</xref>] and Starling [<xref ref-type="bibr" rid="ref20">20</xref>] both offer a data-analytics platform on top of serverless. Others have explored the benefits and pitfalls of running ML training and inference on FaaS [<xref ref-type="bibr" rid="ref21 ref22">21, 22</xref>]. In all these cases, a major limitation is that serverless functions are stateless and exchange data through remote storage services (e.g., S3). Hence, for each query or task deployed on FaaS, a significant portion of time is spent reading/writing data from/to storage. Complex queries that require shuffling data become even more of a problem by requiring multiple rounds of access to storage servers, thereby further increasing the overhead. A lot of prior work focuses on how to mitigate the data-passing limitations of FaaS infrastructure by constructing custom experimental systems.</p>
        <p>A first contribution, and potentially the first application, of the idea behind MetaQ is that it aims to run existing platforms without having to wait until a suitable new data processing or ML engine is developed matching the characteristics of serverless. Our approach enables running complex data processing tasks at a large scale using existing mature systems, using a variety of engines tailored to the query and data at hand, and deploying at the scale needed, while still maintaining all the advantages of serverless.</p>
      </sec>
      <sec id="sec-1-3-2">
        <title>4.2. Dynamically Extensible Engines</title>
        <p>Data processing engines, such as traditional relational databases or many SQL-centric distributed platforms, are limited along two dimensions. One is in terms of deployment, as only one configuration is available at any time. This leads to overprovisioning to make sure the system can cope with any possible workload. The other is in terms of functionality. Very often, data is processed in these engines and then needs to be moved to other systems for further processing (e.g., ML training, statistical analysis, visualization). MetaQ can be used as an extension of existing engines to address these two problems. In the same way that we show one can launch a complete data processing system on serverless when a query arrives, an existing engine running on a VM could do the same to obtain additional capacity when necessary. For example, the basic mechanism presented here can be used to have Apache Drill launch additional ephemeral engines when the long-running system is not able to cope with the additional load. Recently, a similar approach has been explored by modifying an existing system: Pixels-Turbo [39] is an extension of the Pixels [40] query engine that can instantiate query engines in AWS Lambda functions to add elasticity to a system instantiated on long-running VMs. In the case of missing functionality for some tasks, the transition to another system can be done by triggering the corresponding system once the data processing engine finishes. This eliminates the need to have both systems running all the time and helps to automate the process, rather than copying the data and transferring it manually to the other system (and then copying the results back).</p>
        <p>Complementary to these ideas is the notion of deploying a minimalist system (i.e., one requiring much fewer resources) on permanent infrastructure using VMs, and then using the mechanisms of MetaQ to launch a more complete version of the system (or one tailored exactly to the task at hand) when queries arrive that require the more advanced functionality.</p>
      </sec>
      <sec id="sec-1-3-3">
        <title>4.3. User-owned Data Analytics Stack</title>
        <p>Cloud providers offer a set of Query-as-a-Service platforms, such as AWS Athena [41], which provide a simplified interface for large-scale analytics and charge users per byte read. However, users may still prefer to run their queries on a data analytics stack that they fully control (e.g., to optimize parameters and hardware configurations for their workloads). MetaQ enables users to run their own data analytics stack while still benefiting from simple abstractions and a convenient pay-per-query cost model, as resources can be acquired and released on demand in response to load. As Palkar and Zaharia point out, users may also prefer to run their own analytics engines and web services, rather than relying on out-of-the-box cloud solutions, for privacy reasons [42].</p>
        <p>This is especially true when queries involve UDFs, as these are more difficult to securely isolate in shared infrastructure deployments. By operating their own data analytics stack, users get to control how the system is configured and monitor how they are billed for the work performed for a particular task.</p>
      </sec>
      <sec id="sec-1-3-4">
        <title>4.4. Data Lakes</title>
        <p>Data Lakes refer to collections of heterogeneous data that need to be processed in a variety of different ways. The problem with this notion is that the processing is also highly heterogeneous, and it is the user who is responsible for handling it. Lakehouses are a new iteration of the concept that incorporates data processing as a first-class citizen and provides support for different engines, languages, etc., while automating as much as possible the task of matching data to engines and tools [43].</p>
        <p>MetaQ is well-suited to Lakehouses, as it enables dynamically selecting the engine and processing tools on the fly, on the basis of, e.g., data types, data sizes, type of query, user requirements, or cost. Furthermore, the per-query engine vision enables an intriguing possibility: sharing auxiliary data structures across engines (indexes, partitions, zone maps, etc.), as well as creating a general infrastructure that is engine agnostic (e.g., a main-memory caching layer that avoids having to retrieve data from slow storage every time, or a results cache). Such infrastructure exists, but it is typically system specific. MetaQ opens up the possibility of treating these aspects as orthogonal to the actual engine. In the extreme, all common modules of query engines could become serverless components dynamically added to an engine as it is instantiated with the query-specific functionality.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>5. Research Opportunities</title>
      <sec id="sec-2-1">
        <title>The idea of EPQE behind MetaQ opens a number of interesting research directions which we now highlight.</title>
        <p>5.1. The Meta-Engine</p>
      </sec>
      <sec id="sec-2-2">
        <title>EPQE unlock a number of opportunities when it comes</title>
        <p>EPQE unlocks a number of opportunities when it comes to selecting the most appropriate engine for each query. This can be done in a very simple manner by, for instance, asking the user to specify which engine to use. However, we are interested in automating the selection process by building an end-to-end query system that handles this. In a scenario where users write queries in an engine-agnostic syntax (for example, in a declarative language such as SQL), MetaQ's meta-system optimizer could inspect the query and determine which engine is the most efficient given the data types, the nature of the data (static or streaming), the type of operations required, etc. This leads to cross-engine optimizations, such as picking the engine that is fastest at a given operation provided by several engines. The main research question is how to derive meta-system optimizer policies. One possible approach is to extend the domain of automatic configuration systems [<xref ref-type="bibr" rid="ref5">5</xref>] with the additional tasks of choosing not just the configuration parameters for a query engine, but also the query engine itself and the resource allocation, based on the query considered, eventually realizing the vision of vertically integrated per-query optimization.</p>
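        <p>As a strawman, a first meta-system optimizer policy could be a simple rule- and profile-based dispatcher. The rules and the profile data below are invented for illustration (loosely echoing the Drill/Spark observations of Section 3.4) and are not a proposed design:</p>
        <preformat>
# Strawman engine-selection policy for the meta-system optimizer.
# Rules and profile numbers are illustrative only.

PROFILE = {  # median runtimes (s) from hypothetical prior profiling
    ("drill", "tpch-q1"): 9.2,
    ("spark", "tpch-q1"): 14.8,
}

def choose_engine(query_class, is_streaming=False, in_ml_pipeline=False):
    if is_streaming:
        return "flink"   # rule: streaming input needs a streaming engine
    if in_ml_pipeline:
        return "spark"   # rule: the query is a step in an ML pipeline
    candidates = {e: t for (e, q), t in PROFILE.items() if q == query_class}
    if candidates:
        return min(candidates, key=candidates.get)  # fastest on profile data
    return "drill"       # default general-purpose SQL engine
        </preformat>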
      </sec>
      <sec id="sec-2-3">
        <title>With a new deployment being launched and shut down</title>
        <p>With a new deployment being launched and shut down per query, it is now possible to optimize, for every query, the deployment in which the engine will run. Such a deployment configuration could determine the amount of resources used, such as the CPU and/or memory budget.</p>
        <p>Such a configuration could be inferred by analyzing the query and the data inputs to estimate the amount of data that would be processed and, therefore, the amount of compute and memory necessary to finish the query within a particular time frame. From another perspective, it is now possible to dynamically find tradeoffs between execution time and price for each query. This tradeoff could also be exposed to users as a way to prioritize interactive queries over batch workloads. One illustrative sizing heuristic is sketched below.</p>
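        <p>Purely as an illustration, such a sizing heuristic could take the following shape; every constant below is invented:</p>
        <preformat>
# Illustrative per-query sizing heuristic; all constants are invented.

def size_deployment(est_input_bytes, target_seconds):
    per_worker_scan_rate = 200e6      # assumed bytes/s scanned per worker
    workers = max(1, round(est_input_bytes / (per_worker_scan_rate * target_seconds)))
    return {
        "workers": workers,
        "memory_mb_per_worker": 10240,  # cf. the configuration in Section 3.1
    }

# e.g., 12 GB of input with a 10 s target suggests about 6 workers
print(size_deployment(12e9, 10.0))
        </preformat>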
      </sec>
      <sec id="sec-2-3">
        <title>5.3. Query Scheduling and Caching</title>
        <p>Beyond automatically sizing and optimizing per-query deployments, it is also possible to schedule query execution on nodes that have some locally cached data or that are close to storage nodes. For example, if a workload requires two queries to be executed, the second query could be scheduled for execution on the same physical node(s) used to execute the previous one. To keep data local, caching approaches such as the Faa$t cache [44] can also be used to keep the output of queries.</p>
      </sec>
      <sec id="sec-2-4">
        <title>To implement inter-function communication, MetaQ pro</title>
        <p>totype uses Boxer as its platform provider. Boxer (and
therefore MetaQ) do not require any cloud provider
intervention and can be deployed today in AWS Lambda.
However, Boxer is not yet feature complete in terms of
interfaces, networking support, reliability, and
integration within larger systems. That is something that we
are working on at the moment so as to have a more
solid basis for the system. Similarly, Boxer was initially
built for AWS Lambda. We are in the process of
studying how to port Boxer to other commercial serverless
oferings. Doing so would open yet another wave of
exciting opportunities, like triggering serverless jobs across
heterogeneous clouds using the networking capabilities
available in Boxer.
5.5. Generalizing to Other Engines
Our experiments are only a first step towards the
perquery engine vision. We plan to test this paradigm and
our MetaQ prototype on a wider range of data processing
engines and platforms on top of the existing prototype
to make sure it can indeed be used as a general-purpose
distributed computing platform equivalent to what can
be done on a VM. Systems that we are in the process
of testing include Apache Spark, Trino, Databend [45],
Flink [46], Clickhouse [47]. Having them running on the
same serverless platform will also ofer a great
opportunity to study the engine designs that are most suitable
for serverless, providing very valuable information on
the road toward serverless native engines.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>6. Conclusion</title>
      <sec id="sec-3-1">
        <title>Distributed data processing engines often require to have a fixed underlying infrastructure to run in the form of pre-allocated VMs, Virtual Private Networks, and other</title>
        <p>[38] Improving startup performance with Lambda Snap- house: A new generation of open platforms that
Start, 2023. URL: https://docs.aws.amazon.com/ unify data warehousing and advanced analytics, in:
lambda/latest/dg/snapstart.html, (accessed: 2023- 11th Conference on Innovative Data Systems
Re03-01). search, CIDR 2021, Virtual Event, January 11-15,
[39] H. Bian, T. Sha, A. Ailamaki, Using cloud functions 2021, Online Proceedings, www.cidrdb.org, 2021.</p>
        <p>as accelerator for elastic data analytics 1 (2023). [44] F. Romero, G. I. Chaudhry, I. n. Goiri, P. Gopa, P.
Ba[40] H. Bian, A. Ailamaki, Pixels: An eficient column tum, N. J. Yadwadkar, R. Fonseca, C. Kozyrakis,
store for cloud data lakes, in: 2022 IEEE 38th Inter- R. Bianchini, Faa$t: A transparent auto-scaling
national Conference on Data Engineering (ICDE), cache for serverless applications, Association for
2022, pp. 3078–3090. Computing Machinery, New York, NY, USA, 2021.
[41] Amazon Athena, 2020. URL: http://docs.aws. [45] Databend, 2023. URL: https://databend.rs/,
(acamazon.com/athena/, (accessed: 2020-08-17). cessed: 2023-06-20).
[42] S. Palkar, M. Zaharia, Diy hosting for online privacy, [46] Apache Flink, 2023. URL: https://flink.apache.org/,
in: Proceedings of the 16th ACM Workshop on Hot (accessed: 2023-03-01).</p>
        <p>Topics in Networks, HotNets-XVI, 2017. [47] ClickHouse, 2023. URL: https://clickhouse.com/,
(ac[43] M. Zaharia, A. Ghodsi, R. Xin, M. Armbrust, Lake- cessed: 2023-06-20).</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Vuppalapati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Miron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Truong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Motivala</surname>
          </string-name>
          , T. Cruanes,
          <article-title>Building an elastic query engine on disaggregated storage</article-title>
          ,
          <source>in: Proceedings of the 17th USENIX Conference on Networked Systems Design and Implementation</source>
          , NSDI '20, USENIX Association, USA,
          <year>2020</year>
          , pp.
          <fpage>449</fpage>
          -
          <lpage>462</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Babu</surname>
          </string-name>
          ,
          <article-title>Tempo: Robust and self-tuning resource management in multi-tenant parallel databases</article-title>
          ,
          <source>Proc. VLDB Endow</source>
          .
          <volume>9</volume>
          (
          <year>2016</year>
          )
          <fpage>720</fpage>
          -
          <lpage>731</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. R.</given-names>
            <surname>Narasayya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. C.</given-names>
            <surname>König</surname>
          </string-name>
          ,
          <article-title>Automated demand-driven resource scaling in relational database-as-a-service</article-title>
          ,
          <source>in: Proceedings of the 2016 International Conference on Management of Data, SIGMOD '16</source>
          ,
          <year>2016</year>
          , p.
          <fpage>1923</fpage>
          -
          <lpage>1934</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Augusta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Idreos</surname>
          </string-name>
          , Jafar:
          <article-title>Near-data processing for databases</article-title>
          ,
          <source>in: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, SIGMOD '15</source>
          ,
          <year>2015</year>
          , p.
          <fpage>2069</fpage>
          -
          <lpage>2070</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D. V.</given-names>
            <surname>Aken</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Brillard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fiorino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Billian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pavlo</surname>
          </string-name>
          ,
          <article-title>An inquiry into machine learning-based automatic configuration tuning services on real-world database management systems</article-title>
          ,
          <source>Proc. VLDB Endow</source>
          .
          <volume>14</volume>
          (
          <year>2021</year>
          )
          <fpage>1241</fpage>
          -
          <lpage>1253</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhu</surname>
          </string-name>
          , J. Liu,
          <string-name>
            <given-names>M.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <article-title>Bestconfig: Tapping the performance potential of systems via automatic configuration tuning</article-title>
          , Association for Computing Machinery, New York, NY, USA,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Madhavapeddy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mortier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rotsos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Scott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Gazagnaire</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Crowcroft</surname>
          </string-name>
          ,
          <article-title>Unikernels: Library operating systems for the cloud</article-title>
          , Association for Computing Machinery, New York, NY, USA,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] S. Kuenzer, V.-A. Bădoiu, H. Lefeuvre, S. Santhanam, A. Jung, G. Gain, C. Soldani, C. Lupu, Ș. Teodorescu, C. Răducanu, C. Banu, L. Mathy, R. Deaconescu, C. Raiciu, F. Huici, Unikraft: Fast, specialized unikernels the easy way, Association for Computing Machinery, New York, NY, USA, 2021.</mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>[22] Y. Wu, T. T. A. Dinh, G. Hu, M. Zhang, Y. M. Chee, B. C. Ooi, Serverless data science - are we there yet? A case study of model serving, in: SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12-17, 2022, 2022.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] J. Schleier-Smith, V. Sreekanti, A. Khandelwal, J. Carreira, N. J. Yadwadkar, R. A. Popa, J. E. Gonzalez, I. Stoica, D. A. Patterson, What serverless computing is and should become: The next phase of cloud computing, Commun. ACM 64 (2021) 76-84.</mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>[23] A. Wang, J. Zhang, X. Ma, A. Anwar, L. Rupprecht, D. Skourtis, V. Tarasov, F. Yan, Y. Cheng, InfiniCache: Exploiting ephemeral serverless functions to build a cost-effective memory cache, in: USENIX FAST, 2020.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] P. Castro, V. Ishakian, V. Muthusamy, A. Slominski, The rise of serverless computing, Commun. ACM 62 (2019) 44-54.</mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>[24] J. Zhang, A. Wang, X. Ma, B. Carver, N. J. Newman, A. Anwar, L. Rupprecht, V. Tarasov, D. Skourtis, F. Yan, Y. Cheng, InfiniStore: Elastic serverless cloud storage 16 (2023).</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>Agache</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brooker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Iordache</surname>
          </string-name>
          ,
          <string-name>
            <surname>A</surname>
          </string-name>
          . Liguori,
          <source>cloud storage 16</source>
          (
          <year>2023</year>
          ). R. Neugebauer,
          <string-name>
            <given-names>P.</given-names>
            <surname>Piwonka</surname>
          </string-name>
          ,
          <string-name>
            <surname>D.-M. Popa</surname>
            , Firecracker: [25]
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Wawrzoniak</surname>
            ,
            <given-names>I. Müller</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bruno</surname>
          </string-name>
          , G. Alonso,
          <article-title>Lightweight virtualization for serverless applica- Boxer: Data analytics on network-enabled servertions</article-title>
          ,
          <source>in: NSDI</source>
          ,
          <year>2020</year>
          . less platforms,
          <source>in: CIDR</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ao</surname>
          </string-name>
          , G. Porter,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Voelker</surname>
          </string-name>
          , Faasnap: Faas made [26]
          <string-name>
            <given-names>Apache</given-names>
            <surname>Drill</surname>
          </string-name>
          ,
          <year>2022</year>
          . URL: https://drill.apache.org/,
          <article-title>fast using snapshot-based vms, Association for (accessed: 2022-</article-title>
          <volume>10</volume>
          -20). Computing Machinery, New York, NY, USA,
          <year>2022</year>
          . [27]
          <string-name>
            <given-names>AWS</given-names>
            <surname>Fargate -</surname>
          </string-name>
          <article-title>Serverless compute for containers,</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>AWS</given-names>
            <surname>Lambda</surname>
          </string-name>
          ,
          <year>2020</year>
          . URL: https://aws.amazon.com/ 2023-03-
          <fpage>01</fpage>
          . URL: https://aws.amazon.com/fargate/, lambda, (accessed:
          <fpage>2020</fpage>
          -08-17). (accessed:
          <fpage>2023</fpage>
          -03-01).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>J. M. Hellerstein</surname>
            ,
            <given-names>J. M.</given-names>
          </string-name>
          <string-name>
            <surname>Faleiro</surname>
          </string-name>
          , J. Gonzalez, [
          <volume>28</volume>
          ]
          <string-name>
            <given-names>Amazon</given-names>
            <surname>Elastic Container Service (Amazon</surname>
          </string-name>
          <string-name>
            <given-names>ECS</given-names>
            ), J.
            <surname>Schleier-Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Sreekanti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tumanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <year>2023</year>
          . URL: https://aws.amazon.com/ecs/,
          <article-title>(accessed: Serverless computing: One step forward</article-title>
          ,
          <source>two steps</source>
          <year>2023</year>
          -
          <volume>03</volume>
          -01). back, in: CIDR,
          <year>2019</year>
          . [29]
          <string-name>
            <given-names>M.</given-names>
            <surname>Zaharia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Xin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Wendell</surname>
          </string-name>
          ,
          <string-name>
            <surname>T. Das</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          Arm-
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>L.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , T. Ristenpart, M. Swift, brust,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Meng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rosen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Venkataraman</surname>
          </string-name>
          ,
          <article-title>Peeking behind the curtains of serverless platforms, M. J</article-title>
          .
          <string-name>
            <surname>Franklin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Ghodsi</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Gonzalez</surname>
          </string-name>
          , S. Shenker, in
          <source>: Proceedings of the 2018 USENIX Conference I. Stoica</source>
          ,
          <article-title>Apache spark: A unified engine for big on Usenix Annual Technical Conference</article-title>
          ,
          <source>USENIX data processing 59</source>
          (
          <year>2016</year>
          ).
          <source>ATC '18</source>
          ,
          <year>2018</year>
          . [30]
          <string-name>
            <surname>Amazon</surname>
            <given-names>API Gateway</given-names>
          </string-name>
          ,
          <year>2023</year>
          . URL: https://aws.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>AWS</given-names>
            <surname>Step Functions</surname>
          </string-name>
          ,
          <year>2023</year>
          . URL: https://aws. amazon.com/api-gateway/, (accessed:
          <fpage>2023</fpage>
          -03-01). amazon.com/step-functions/, (accessed:
          <fpage>2023</fpage>
          -03- [31]
          <string-name>
            <surname>Trino</surname>
          </string-name>
          ,
          <year>2023</year>
          . URL: https://trino.io/, (accessed:
          <fpage>2023</fpage>
          -
          <lpage>01</lpage>
          ).
          <fpage>06</fpage>
          -
          <lpage>20</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Azure</given-names>
            <surname>Durable Functions</surname>
          </string-name>
          ,
          <year>2023</year>
          . URL: [32]
          <string-name>
            <given-names>P.</given-names>
            <surname>Hunt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Konar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. P.</given-names>
            <surname>Junqueira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Reed</surname>
          </string-name>
          , https://learn.microsoft.com/en-us/azure/ Zookeeper:
          <article-title>Wait-free coordination for internetazure-functions/durable/, (accessed: 2023-03- scale systems</article-title>
          ,
          <source>USENIX ATC'10</source>
          ,
          <year>2010</year>
          .
          <volume>01</volume>
          ). [33]
          <string-name>
            <given-names>S.</given-names>
            <surname>Melnik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gubarev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Long</surname>
          </string-name>
          , G. Romer, S. Shiv-
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>V.</given-names>
            <surname>Sreekanti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X. C.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schleier-Smith</surname>
          </string-name>
          ,
          <string-name>
            <surname>J. E.</surname>
          </string-name>
          akumar, M. Tolton, T. Vassilakis, Dremel: InteracGonzalez,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Hellerstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tumanov</surname>
          </string-name>
          ,
          <article-title>Cloud- tive analysis of web-scale datasets, 2010. burst: Stateful functions-as-a-service</article-title>
          ,
          <source>Proc. VLDB [34] Amazon Elastic Kubernetes Service (EKS)</source>
          ,
          <year>2023</year>
          . Endow.
          <volume>13</volume>
          (
          <year>2020</year>
          )
          <fpage>2438</fpage>
          -
          <lpage>2452</lpage>
          . URL: https://aws.amazon.com/eks/, (accessed:
          <fpage>2023</fpage>
          -
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>I. Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Marroquín</surname>
          </string-name>
          , G. Alonso, Lambada: Inter-
          <volume>03</volume>
          -01).
          <article-title>active data analytics on cold data using serverless</article-title>
          [35]
          <string-name>
            <given-names>W.</given-names>
            <surname>Shin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-H.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Min</surname>
          </string-name>
          ,
          <article-title>Fireworks: A fast, cloud infrastructure</article-title>
          ,
          <source>in: SIGMOD</source>
          ,
          <year>2020</year>
          . eficient, and
          <article-title>safe serverless framework using vm-</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>M.</given-names>
            <surname>Perron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. Castro</given-names>
            <surname>Fernandez</surname>
          </string-name>
          ,
          <string-name>
            <surname>D.</surname>
          </string-name>
          <article-title>DeWitt, S. Mad- level post-jit snapshot, Association for Computing den, Starling: A scalable query engine on cloud Machinery</article-title>
          , New York, NY, USA,
          <year>2022</year>
          . functions, in: SIGMOD,
          <year>2020</year>
          . [36]
          <string-name>
            <given-names>D.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Zang</surname>
          </string-name>
          , G. Yan,
          <string-name>
            <given-names>C.</given-names>
            <surname>Qin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>J.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Alonso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Chen</surname>
          </string-name>
          , Catalyzer:
          <article-title>Sub-millisecond startup for A</article-title>
          .
          <string-name>
            <surname>Klimovic</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Singla</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>C. Zhang,</given-names>
          </string-name>
          <article-title>Towards serverless computing with initialization-less bootdemystifying serverless machine learning training, ing</article-title>
          , in: ASPLOS,
          <year>2020</year>
          . in: Proceedings of the 2021 International Confer- [37]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cadden</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Unger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Awad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Krieger</surname>
          </string-name>
          ,
          <source>ence on Management of Data</source>
          ,
          <year>2021</year>
          , p.
          <fpage>857</fpage>
          -
          <lpage>871</lpage>
          . J.
          <string-name>
            <surname>Appavoo</surname>
          </string-name>
          , Seuss:
          <article-title>Skip redundant paths to make serverless fast</article-title>
          , in: EuroSys,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>