<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Secure Data Processing at Scale</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kajetan Maliszewski supervised by Volker Markl</string-name>
          <email>maliszewski@tu-berlin.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Technische Universität Berlin</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <abstract>
        <p>Although the cloud is today the de facto standard for scalable data processing, many applications still cannot use it because of data or computation privacy concerns. Sensitive data, such as in the health domain, and sensitive computations, such as core-business AI pipelines, have grown into valuable assets, making secure data processing a hot topic in industry and academia. On the one hand, existing data processing systems prioritize performance and, to a certain degree, trade away users' privacy. On the other hand, privacy-preserving data processing systems sacrifice performance. In this PhD thesis, we envision a fully secure general-purpose data processing system for the cloud. Overall, we aim at devising: (i) algorithms that work with very limited memory, such as that exposed by trusted execution environments; (ii) scalable state management techniques; (iii) oblivious data-access algorithms; and (iv) privacy-preserving query optimization techniques to speed up query execution.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        Processing data in the cloud has become ubiquitous
nowadays [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. For example, services such as Amazon AWS
and Microsoft Azure have made it trivial for companies,
researchers, and organizations (users for short) to set up and
maintain compute nodes. The cloud has given
unprecedented power to users: they can now run applications and
analytics that they were not able to run before. For instance, a
small company can easily offer scalable data analytics
without owning its data infrastructure [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        However, there are still many applications from different
domains that cannot fully benefit from the cloud. Among
these, we mainly find users working with sensitive data,
e. g., medical or transactional data, and users
performing sensitive computations, e. g., machine/deep
learning pipelines defining the core business of a company. These
users typically have to classify their data or computations
and hence are subject to strict compliance rules that forbid
them from trading away privacy. As most cloud solutions do not
treat privacy as a first-class citizen, users end up working on
their own premises, sacrificing scalability and efficiency. This, for
example, is the case for most applications in the healthcare
domain, which use in-house solutions and infrastructure [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        Even though the hybrid cloud has recently appeared as a
possible solution to this problem [
        <xref ref-type="bibr" rid="ref17 ref18">17, 18</xref>
        ], sensitive data and
computations still cannot be moved out of the private cloud.
This is because existing data processing systems lack
features for preserving the privacy of data and computations.
Although a few systems work on encrypted data [
        <xref ref-type="bibr" rid="ref14 ref15">14, 15</xref>
        ], most
cloud-based data processing systems, such as Spark and
Flink, expose data and computations at the hardware level.
The research community has proposed using trusted execution
environments (TEEs) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] to address this
problem [
        <xref ref-type="bibr" rid="ref11 ref16 ref19 ref7 ref9">7, 9, 11, 16, 19</xref>
        ]. Other works have also used oblivious
algorithms to provide stronger security for TEEs by hiding
data access patterns [
        <xref ref-type="bibr" rid="ref19 ref7">7, 19</xref>
        ]. Nevertheless, all these
solutions suffer from several of the following problems: they
(i) lack basic performance optimizations; (ii) do not scale
out; (iii) support neither stateful operators nor large state
information; and (iv) are ad hoc, tailored to specific cases.
      </p>
      <p>Therefore, despite all these efforts, we are still missing a
holistic solution that addresses all the
aforementioned concerns. The Holy Grail would be to replicate
the success of general-purpose distributed data processing
systems for secure, scalable cloud data processing. Users
should focus on the logic of their applications, while the
system should take care of running and scaling out such
applications efficiently and without any data or computation leakage.
To the best of our knowledge, there is no general-purpose
system that provides support for fully secure and scalable
cloud data processing.</p>
      <p>Building such a system is particularly challenging for
many reasons. First, users must be able to easily define
privacy constraints over their data and computations. Second,
we have to revisit data processing algorithms and state
management techniques to work within secure environments.
Third, the system must ensure data and computation
privacy on public compute nodes without sacrificing
performance. Fourth, it is not clear how the system should optimize
queries in the cloud when privacy is a first-class citizen.</p>
      <p>In this thesis, we plan to tackle the above research
challenges and lay the foundations of the envisioned
general-purpose system. We plan to proceed as follows: first, we will
build a secure and efficient single-node data processing
engine; second, we will make this engine
distributed and scalable by considering public and trusted
nodes; third, we will devise a privacy-preserving query
optimizer. In summary, we plan to make the following major
contributions:
        <list list-type="order">
          <list-item>
            <p>We will devise an efficient general-purpose data
processing engine for TEEs. In particular, we will propose
new data processing algorithms that work
with the very limited memory provided by TEE
environments.</p>
          </list-item>
          <list-item>
            <p>We will extend our data processing engine to support
stateful operators. In particular, we will provide both
oblivious data-access algorithms and support for large
state information in TEE environments.</p>
          </list-item>
          <list-item>
            <p>We will propose bidirectional data anonymization
algorithms as well as data processing algorithms that
work over encrypted data.</p>
          </list-item>
          <list-item>
            <p>We will then devise different privacy-preserving query
optimization techniques to speed up query execution.</p>
          </list-item>
        </list>
      </p>
      <p>In the remainder of this paper, we first define the
problem we plan to tackle in this thesis in Section 2. We present
related work in Section 3. In Section 4, we depict our
envisioned solution. We follow, in Section 5, with the open
challenges we must tackle to make our envisioned solution a
reality. Lastly, we conclude this paper in Section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>2. PROBLEM STATEMENT</title>
      <p>Efficient and fully secure data processing on the cloud is
currently not possible at the terabyte scale. Guaranteeing
robustness and a consistent privacy level requires (i) the use of
novel hardware technologies for low-level security, (ii)
redesigning existing approaches for new environments, and
(iii) an efficient secure query processing engine. Data and
computations can only be fully protected using technologies such
as TEEs, combined with oblivious data access. Most of the
existing execution approaches cannot be easily mapped to
the new runtimes due to the heavy limitations that TEEs
impose on users. Data processing algorithms have to be
redesigned with these limitations in mind. Most
importantly, compute resources have to be fully and wisely utilized
to guarantee high query execution efficiency. Moreover,
worker nodes need to be classified by the level of security
they require: private compute nodes are considered secure,
trusted/secure compute nodes can prove their security but
require data anonymization for privacy, and public compute
nodes need oblivious processing in a TEE.</p>
      <p>Therefore, the problem, and main challenge, lies in
how to enable efficient and truly secure data processing jobs
in hybrid compute environments.</p>
    </sec>
    <sec id="sec-3">
      <title>3. RELATED WORK</title>
      <p>
        Efficient Execution &amp; State Management.
Sanctuary [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] is a distributed streaming system. The authors
present a set of algorithms to manage large state, but they
blindly spill the state information to disk. Additionally, the
state is not managed obliviously; hence, the longer the job
runs, the more information leaks out. SecureStreams [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]
is a lightweight streaming platform running in SGX
enclaves. Jobs are built using a set of simple operators,
but the system lacks optimizations (it performs unnecessary
encryption/decryption between operators) and does not
scale out well (inter-operator communication quickly becomes a
traffic bottleneck). It also does not protect against
access-pattern attacks. TrustedDB [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] is a secure database built
on a cryptographic coprocessor, an older trusted hardware
architecture. It stores large state externally and accesses
it using a Paging Module, which exposes the access
patterns. EnclaveDB [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] is a database utilizing SGX that only
handles state up to the size of the enclave memory
(approximately 90 MB). The authors assume that future releases
of SGX will support much larger memory; however, as of
the current release, this has not happened.
      </p>
      <p>
        The problem of tiny memory has been previously
addressed in small-footprint databases [
        <xref ref-type="bibr" rid="ref10 ref6">6, 10</xref>
        ]. They propose
lightweight solutions that, however, drastically limit
functionality to fit their very specific use cases, i. e., handheld
computers and smartcards.
      </p>
      <p>
        Data Access Privacy. ObliDB [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] is an oblivious database
core engine for general workloads. It hides access patterns
using oblivious query processing algorithms that require a
full table scan for each query. However, it runs only on a
single node in an SGX environment. Opaque [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] executes
encrypted Spark jobs using oblivious access. It proposes a
query optimizer to mitigate the cost of obliviousness;
however, it still incurs a performance degradation of up to 46x.
      </p>
      <p>
        Query Optimization. The existing systems utilizing
hybrid-cloud [
        <xref ref-type="bibr" rid="ref17 ref18">17, 18</xref>
        ] base their optimizations on
users manually tagging the data with its sensitivity.
Later, only the insensitive data is processed in the public
cloud. This is cumbersome for the users and harmful for
workloads consisting mainly of sensitive data. In contrast
to these works on hybrid-cloud, we aim at sending sensitive
data to the public cloud and omitting the tagging process by
leveraging secure hardware.
      </p>
    </sec>
    <sec id="sec-4">
      <title>4. OUR VISION</title>
      <p>To overcome the problem stated in Section 2, we envision a
system comprising a master node and three types of compute
nodes: (i) private compute nodes (fully trusted workers),
(ii) trusted/secure compute nodes (whose trust is guaranteed by
certificates of trust), and (iii) public compute nodes (fully untrusted
machines). We believe that these abstractions ideally fulfill
the performance needs while maintaining strong privacy.</p>
      <p>A user submits a query to the master node. In turn, the
master node parses the query and optimizes it by exploiting
the knowledge about the topology and available resources. It
then compiles, executes, and monitors the query. Note that
it is the compute nodes that carry out the actual execution.</p>
      <p>Figure 1 depicts the execution of a query over the three
types of machines, namely private, trusted/secure, and
public compute nodes. Each node has a rigorous way of
processing the query, enforced by the query execution plan.
Private compute nodes are allowed to process the input data in
the clear, i. e., neither encrypted nor anonymized (green arrow).
Trusted/secure compute nodes are considered trustworthy.
Hence, they can process data as a private node does but might
also be forced to process anonymized data, depending on their
degree of trustworthiness (orange arrow). Public compute nodes
are simply considered insecure. They thus receive
enclave-encrypted data and process it inside a TEE (red arrow).
Executing a query in such environments is far from
trivial, as data might be transferred from one kind of
compute node to another. For example, Figure 1 illustrates
such data transfers with different arrow colors: passing
from orange to green means data de-anonymization, while red
to green stands for decryption with the enclave key. At the
end of the query, the data is aggregated and sent to the output
as defined by the query.</p>
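      <p>As a rough illustration of these transfers, the following Python sketch maps each node type to the data representation it works on and lists the conversions needed when data moves between node types. It is only a sketch under simplifying assumptions: trusted nodes are assumed to always receive anonymized data, and the step names and the function <monospace>transfer_steps</monospace> are hypothetical and not part of any existing system.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from enum import Enum


class NodeType(Enum):
    PRIVATE = "private"  # fully trusted worker
    TRUSTED = "trusted"  # trust proven by certificates
    PUBLIC = "public"    # fully untrusted machine


# Data representation each node type works on (simplifying assumption:
# trusted nodes always receive anonymized data).
DATA_FORM = {
    NodeType.PRIVATE: "clear",
    NodeType.TRUSTED: "anonymized",
    NodeType.PUBLIC: "enclave-encrypted",
}

# Hypothetical step names for leaving and entering each representation.
UNDO = {
    "clear": [],
    "anonymized": ["de-anonymize"],
    "enclave-encrypted": ["decrypt with enclave key"],
}
APPLY = {
    "clear": [],
    "anonymized": ["anonymize"],
    "enclave-encrypted": ["encrypt with enclave key"],
}


def transfer_steps(src, dst):
    """Return the conversions needed when data moves from src to dst."""
    src_form, dst_form = DATA_FORM[src], DATA_FORM[dst]
    if src_form == dst_form:
        return []
    return UNDO[src_form] + APPLY[dst_form]
```
]]></preformat>
      <p>For instance, under these assumptions a transfer from a trusted node to a private node yields the single step "de-anonymize", which corresponds to the orange-to-green transition described above.</p>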
    </sec>
    <sec id="sec-5">
      <title>5. RESEARCH CHALLENGES</title>
      <p>Building a system as described in Section 4 comes with
several research challenges, mainly around data access
privacy, efficient query execution, state management, and query
optimization. We elaborate on each of these in the following.</p>
    </sec>
    <sec id="sec-6">
      <title>5.1 Efficient Query Execution</title>
      <p>Efficiently executing a query in TEEs is quite challenging
because of the extremely small main memory capacity of
TEEs, an expensive CPU instruction set, and the absence of system calls.
For example, Intel's proprietary TEE technology (SGX)
defines an enclave, a private and highly protected region in
memory that cannot be accessed from outside of its process.
It places the enclave code and data in a special memory
area of 128 MB. Yet, excluding space for the SGX metadata,
only approximately 90 MB are left for the application.</p>
      <p>As a result, the design of state-of-the-art data processing
operations (e. g., a join operator) cannot simply be mapped
to enclave-enabled versions. For example, consider the case
of a hash-join operator. In the build phase, this operator
takes the smaller table and builds the hash table for the
selected key. Once the table reaches 90 MB in size, it will
start spilling the records to the memory outside the enclave
in an encrypted form. These data spilling operations cause
expensive calls to the CPU due to the context-switching
instructions and costly encryption.</p>
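      <p>To make the memory constraint concrete, the following Python sketch shows one textbook way of keeping the resident hash table small: a partitioned (Grace-style) hash join that builds and probes one partition at a time. This is not the design proposed in this paper, only an illustration. The 90 MB budget comes from the text above; the function names, the even-split assumption in <monospace>num_partitions</monospace>, and the omission of encryption for partitions kept outside the enclave are simplifications of this sketch.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math
from collections import defaultdict

ENCLAVE_BUDGET_BYTES = 90 * 1024 * 1024  # approx. usable enclave memory


def num_partitions(build_bytes, budget_bytes=ENCLAVE_BUDGET_BYTES):
    """Smallest partition count so an evenly split build side fits the budget."""
    return max(1, math.ceil(build_bytes / budget_bytes))


def partitioned_hash_join(build, probe, n_parts):
    """Join two lists of (key, value) pairs one partition at a time."""
    build_parts = defaultdict(list)
    probe_parts = defaultdict(list)
    for key, val in build:
        build_parts[hash(key) % n_parts].append((key, val))
    for key, val in probe:
        probe_parts[hash(key) % n_parts].append((key, val))

    result = []
    for part in range(n_parts):
        # Only this partition's hash table is resident at any time.
        table = defaultdict(list)
        for key, val in build_parts[part]:
            table[key].append(val)
        for key, val in probe_parts[part]:
            for build_val in table.get(key, []):
                result.append((key, build_val, val))
    return result
```
]]></preformat>
      <p>Under the even-split assumption, a 1 GiB build side needs at least 12 partitions to stay within the 90 MB budget. Skewed keys would require more partitions or recursive partitioning, which this sketch does not handle.</p>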
      <p>Therefore, efficiently executing queries in TEEs requires
a radical change in the design of data processing operators.
We will investigate new techniques for cache management
inside the enclaves and CPU instruction-set optimizations
specifically for relational algebra. To speed up query
execution even further, we will investigate the use of query
compilation techniques to generate highly optimized code
for TEEs. Additionally, we will examine parallel enclave
execution on multi-core CPUs. We will design data
processing operators relying on these new techniques.</p>
    </sec>
    <sec id="sec-7">
      <title>5.2 State Management</title>
      <p>The state is an essential element of an operator. During
execution, it is used for storing metadata and
intermediate results, e. g., a rolling aggregation while scanning a
table. Handling large state efficiently for
enclave-enabled operators is challenging because of the extremely
small main memory capacity in TEEs (see Section 5.1).</p>
      <p>
        Existing systems store large states in the memory
outside of the enclave [
        <xref ref-type="bibr" rid="ref16 ref5">5, 16</xref>
        ]. However, constant paging, and
thus data encryption and decryption, causes severe
performance degradation. For example, TrustedDB [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] uses a
custom-built Paging Module that stores all pages outside of
the Secure Coprocessor. The pages are pulled on demand
as needed by the query processing engine. Thoma et al.
in [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] propose stateful operators for stream processing
using SGX's built-in paging mechanism. Yet, both systems are
not sufficiently optimized for secure hardware. For example,
in [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], a hash-join operator blindly spills the hash table to
the outside memory, leading to expensive calls outside the
enclave whenever it looks for a tuple.
      </p>
      <p>Thus, new state management techniques are required for
settings where the main memory is drastically limited. We
will investigate new data structures and data indexes that
allow for state management outside the enclave while at the
same time reducing unnecessary outside calls.</p>
    </sec>
    <sec id="sec-8">
      <title>5.3 Data Access Privacy</title>
      <p>
        Although TEEs (such as SGX) provide secure data
processing via hardware, data can still leak
when an attacker observes the data access patterns of the
operators inside the enclave. There have thus been many
proposals on how to achieve data access privacy in systems
designed for specific use cases, e. g., homomorphic
encryption [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and oblivious algorithms [
        <xref ref-type="bibr" rid="ref19 ref7">7, 19</xref>
        ].
      </p>
      <p>
        However, it remains an open problem how to combine these
proposals into a general-purpose data
processing engine that offers oblivious data access without
hurting performance. For instance, Opaque [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] provides
oblivious data processing on SGX, but comes with an overhead
of up to 46x. Similarly, ObliDB [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] comes with a huge
overhead as it performs a full table scan for each query to achieve
obliviousness.
      </p>
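      <p>The cost of a full-scan approach can be illustrated with a minimal Python sketch of an oblivious selection: every input row is read and exactly one output slot is written per row, so the sequence of reads and writes does not depend on which rows match. The function names and the dummy marker are hypothetical. Note that Python offers no constant-time guarantees, and a real system would additionally encrypt the output and avoid data-dependent branches, so this only conveys the access-pattern idea.</p>
      <preformat preformat-type="code"><![CDATA[
```python
DUMMY = None  # placeholder written for non-matching rows


def oblivious_filter(rows, predicate):
    """Scan every row and emit exactly one output slot per input row."""
    out = []
    for row in rows:
        keep = bool(predicate(row))
        # One write per input row, regardless of the predicate outcome.
        out.append(row if keep else DUMMY)
    return out


def overhead_factor(table_rows, matching_rows):
    """Rows touched by the full scan per row that is actually relevant."""
    return table_rows / max(1, matching_rows)
```
]]></preformat>
      <p>In this sketch, a query selecting 100 rows out of one million still touches all one million rows, i. e., 10,000 rows per relevant row, which hints at why full scans per query become expensive.</p>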
      <p>
        We will explore different ways of reducing the overhead
incurred by current oblivious data access algorithms.
In particular, we plan to investigate how to set a lower bound on
the number of non-relevant tuples that are necessary to hide
access patterns. Such a lower-bound guarantee is required
to not compromise privacy while not harming performance.
Another research direction we will explore is to store
metadata about table values inside the enclave (similar to Oracle's
Zone Maps [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]). Having such knowledge ahead of a query
can even make scanning a table unnecessary.
      </p>
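      <p>The following Python sketch illustrates the general idea of zone-map-style metadata: a minimum and maximum value are kept per block, and a range query only needs to scan blocks whose range overlaps the query range. The function names and the fixed block size are assumptions of this sketch, and it does not reflect Oracle's implementation. It also ignores that skipping blocks is itself an observable access pattern, which is part of the open question discussed above.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def build_zone_map(values, block_size):
    """Return (start, end, min, max) for each fixed-size block of values."""
    zones = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        zones.append((start, start + len(block), min(block), max(block)))
    return zones


def blocks_to_scan(zones, low, high):
    """Return the blocks whose [min, max] range overlaps [low, high]."""
    return [(s, e) for s, e, lo, hi in zones if hi >= low and lo <= high]
```
]]></preformat>
      <p>If no block overlaps the query range, <monospace>blocks_to_scan</monospace> returns an empty list and the table does not need to be scanned at all.</p>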
    </sec>
    <sec id="sec-9">
      <title>5.4 Privacy-Preserving Query Optimization</title>
      <p>
        Intel SGX is notorious for its performance degradation
and vulnerability to some attacks [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. End-to-end
encryption and data anonymization and de-anonymization are known to be
computationally intensive. Inter-cloud data transfers have been
identified as a bottleneck in cloud computing [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. These
are some of the aspects the query optimizer will consider
when deciding which data is to be processed on which nodes.
      </p>
      <p>
        Existing systems reduce the problem space by reducing
their functionality [
        <xref ref-type="bibr" rid="ref13 ref7">7, 13</xref>
        ]. However, while VC3 [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] does not
use oblivious access and is thus vulnerable to data access
pattern attacks, ObliDB [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] runs only in an SGX-enabled
environment. Essentially, all these systems are designed for
homogeneous environments, i. e., they assume all compute
nodes in the system provide a TEE. This is
not the case in our envisioned system, where heterogeneous
machines with different requirements coexist in
the same system. Enforcing a unified policy across all of
them would result in performance degradation.
      </p>
      <p>To solve this challenge, we will study the performance
of connecting operators with different privacy constraints.
This will add a new dimension to query optimization
and will force us to rethink the optimization techniques
for logical and physical query plans. Moreover, we will
investigate new techniques for data anonymization and
de-anonymization, data processing over encrypted data, and
query optimization for heterogeneous secure environments.</p>
    </sec>
    <sec id="sec-10">
      <title>6. CONCLUSION</title>
      <p>We presented our plan towards a general-purpose secure
and scalable data processing system for hybrid
environments (i. e., composed of private, trusted, and public
compute nodes). We showed that each of the existing systems
solves only a piece of the problem. State-of-the-art
solutions force users to work with sensitive data or computations
only within a private environment, hence underutilizing the
available computing resources. Moreover, some works trade
users' privacy or drastically limit the use-case, e. g., to a
centralized system not suitable for very large datasets. We
showed that we are still missing a system that sees the bigger
picture and solves the problem of truly secure data
processing at scale. We presented our envisioned system to solve
such a problem and discussed the main research challenges
that we must tackle to make our vision a reality.</p>
    </sec>
    <sec id="sec-11">
      <title>ACKNOWLEDGMENTS</title>
      <p>This work was funded by the German Ministry for
Education and Research as BIFOLD - Berlin Institute for the
Foundations of Learning and Data (ref. 01IS18025A and ref.
01IS18037A).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <title>REFERENCES</title>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <article-title>AWS Startup Stories</article-title>
          . https://aws.amazon.com/campaigns/aws-startups-stories/.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <article-title>Microsoft's Growth ReAzuring Under Nadella</article-title>
          . https://markets.businessinsider.com/news/stocks/microsoft-s-growth-reazuring-undernadella-1028372914.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <article-title>Oracle Database Concepts</article-title>
          . https://docs.oracle.com.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Antonopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Arasu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Eguro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hammer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kaushik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kossmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ramamurthy</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Szymaszek</surname>
          </string-name>
          .
          <article-title>Pushing the limits of encrypted databases with secure hardware</article-title>
          .
          <source>arXiv preprint arXiv:1809.02631</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Bajaj</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Sion</surname>
          </string-name>
          .
          <article-title>TrustedDB: A Trusted Hardware Based Database with Privacy and Data Confidentiality</article-title>
          .
          <source>In SIGMOD</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C.</given-names>
            <surname>Bobineau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bouganim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pucheral</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Valduriez</surname>
          </string-name>
          .
          <article-title>PicoDBMS: Scaling down database techniques for the smartcard</article-title>
          .
          <source>In VLDB</source>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Eskandarian</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Zaharia</surname>
          </string-name>
          .
          <article-title>ObliDB: oblivious query processing for secure databases</article-title>
          .
          <source>PVLDB</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L.</given-names>
            <surname>Griebel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-U.</given-names>
            <surname>Prokosch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Köpcke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Toddenroth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Christoph</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Leb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Engel</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Sedlmayr</surname>
          </string-name>
          .
          <article-title>A scoping review of cloud computing in healthcare</article-title>
          .
          <source>BMC Med. Inf. &amp; Decision Making</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Havet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Pires</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Felber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pasin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rouvoy</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Schiavoni</surname>
          </string-name>
          .
          <article-title>Securestreams: A reactive middleware framework for secure data stream processing</article-title>
          .
          <source>In DEBS</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Karlsson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Leung</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Pham</surname>
          </string-name>
          .
          <article-title>IBM DB2 Everyplace: A small footprint relational database system</article-title>
          .
          <source>In ICDE</source>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>C.</given-names>
            <surname>Priebe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Costa</surname>
          </string-name>
          .
          <article-title>EnclaveDB: A secure database using SGX</article-title>
          .
          <source>In S&amp;P</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sabt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Achemlal</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Bouabdallah</surname>
          </string-name>
          .
          <article-title>Trusted Execution Environment: What It is, and What It is Not</article-title>
          .
          <source>In Trustcom/BigDataSE/ISPA</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>F.</given-names>
            <surname>Schuster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Costa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Fournet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gkantsidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Peinado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Mainar-Ruiz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Russinovich</surname>
          </string-name>
          .
          <article-title>VC3: Trustworthy data analytics in the cloud using SGX</article-title>
          .
          <source>In S&amp;P</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Stephen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Savvides</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Sundaram</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Ardekani</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Eugster</surname>
          </string-name>
          .
          <article-title>STYX: Stream processing with trustworthy cloud-based execution</article-title>
          .
          <source>In SoCC</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S. D.</given-names>
            <surname>Tetali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lesani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Majumdar</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Millstein</surname>
          </string-name>
          .
          <article-title>MrCrypt: Static analysis for secure cloud computations</article-title>
          .
          <source>In OOPSLA</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>C.</given-names>
            <surname>Thoma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Labrinidis</surname>
          </string-name>
          .
          <article-title>Behind enemy lines: Exploring trusted data stream processing on untrusted systems</article-title>
          .
          <source>In CODASPY</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>X.</given-names>
            <surname>Xu</surname>
          </string-name>
          and
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhao</surname>
          </string-name>
          .
          <article-title>A framework for privacy-aware computing on hybrid clouds with mixed-sensitivity data</article-title>
          .
          <source>In HPCC/CSS/ICESS</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>K.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ruan</surname>
          </string-name>
          .
          <article-title>Sedic: Privacy-aware data intensive computing on hybrid clouds</article-title>
          .
          <source>In CCS</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>W.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Beekman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Popa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. E.</given-names>
            <surname>Gonzalez</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Stoica</surname>
          </string-name>
          .
          <article-title>Opaque: An oblivious and encrypted distributed analytics platform</article-title>
          .
          <source>In NSDI</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>