<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Model Plugins</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Viktor Sanca</string-name>
          <email>viktor.sanca@epfl.ch</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anastasia Ailamaki</string-name>
          <email>anastasia.ailamaki@epfl.ch</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <!-- Conference location: Vancouver, Canada -->
        <aff id="aff0">
          <label>0</label>
          <institution>EPFL</institution>
          ,
          <addr-line>Lausanne</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Google</institution>
          ,
          <addr-line>Sunnyvale</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>1</volume>
      <issue>2023</issue>
      <abstract>
        <p>Extracting value and insights from increasingly heterogeneous data sources involves multiple systems combining and consuming the data. With multi-modal and context-rich data such as strings, text, videos, or images, the problem of standardizing the data model and format for interchangeable use is further exacerbated by a non-uniform way of processing, extracting, and preserving content and context from the data. This makes the data movement, reuse, and exchange between different systems a non-composable, manual process. On the other hand, increasingly powerful and popular machine learning-driven data representation models map the input data into uniform high-dimensional vector embeddings for further processing, informed by particular models. However, using models is expensive, and the manual integration effort might incur unnecessary costs.</p>
      </abstract>
      <kwd-group>
        <kwd>Heterogeneous Data</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <p>© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>Technological advances, the proliferation of the Internet,
personal multimedia and sensor devices, and social media
have changed data types and formats. While tabular,
numerical, and generally relational data represent the
backbone of many applications, the main task of data
analytics is to provide and support timely, efficient, and
declarative value extraction. Therefore, supporting novel
and useful ways to process the data is a natural goal of
modern analytics.</p>
      <p>Machine learning methods are particularly dominant in extracting insights from context-rich data, which we analyze more deeply in a recent study on context-rich analytical engines and their future architectures [1], where the main takeaway is that different information sources, such as images, text, traditional relational data, and metadata, will interact in a complex and potentially ad-hoc manner.</p>
      <p>Work done entirely at EPFL. ORCID: 0000-0002-4799-8467 (V. Sanca), 0000-0002-9949-3639 (A. Ailamaki).</p>
      <sec id="sec-2-1">
        <p>On the other hand, data might come from specialized engines or object stores: images, videos, audio, and documents might be stored in separate engines. Finally, they might be combined with relational data in a complex hybrid ML-relational query plan to combine insights from the information coming from heterogeneous and multi-modal data sources.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>2. Contextualized Data Consumption</title>
    </sec>
    <sec id="sec-4">
      <p>Consuming information from heterogeneous and multi-modal data sources involves, for example, combining sentiment analysis, object detection, and performing joins with corresponding relational data.</p>
      <p>No matter the implementation, as a monolithic system with internal communication or as multiple standalone components and modules, the initial data is exchanged and transformed in a polystore-like fashion [4, 5]. While relational analytics come with strict schema and operator transformations for data compatibility, introducing multi-modality and model-driven transformations makes the data movement and exchange process manual and imperative, especially when using machine learning models that can perform many tasks. Therefore, the data, information, and task flow becomes an arbitrary process that is finally left to the end-user to optimize, as illustrated in Figure 1.</p>
      <p>Consuming contextual data becomes complex not only for specifying the resulting schema and data movement, but equally due to the comparatively more expensive data processing when using models. This can result in higher computational resource requirements, increased latency, and higher monetary costs, especially in ad-hoc imperative settings where data that is not relevant might be processed, or data of frequent relevance might be unnecessarily processed multiple times, resulting in inefficient resource utilization.</p>
      <p>A significant corpus of work in machine learning, especially in representation learning, is the key enabler of multi-modal and context-rich analytics. These models transform the human-centric, context-rich data into machine-centric formats amenable to further automated processing. Tasks such as sentiment analysis, similarity search, object detection, translation, and data generation become possible, and we consider these tasks as parts of a more complex system.</p>
      <p>While useful on their own, such models can be combined with traditional analytics to perform multi-modal value extraction. For example, performing sentiment analysis on the text associated with an image that might contain certain objects on a retail website, connected to transactions in a traditional RDBMS, can provide new sources of insight or decision-making, utilizing the data already collected and stored across different systems.</p>
      <p>Machine learning models for multi-modal tasks often
have an expensive model-specific input data embedding
step, followed by corresponding post-processing or
operations performed directly on embeddings, such as
similarity search. Conversely, this data might need to be
mapped back to the original representation for the
output or further analysis, requiring a link to the original
object.</p>
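      <p>A minimal sketch of the flow just described, an expensive model-specific embedding step whose output keeps a link back to the original object, might look as follows. All names and the trivial character-based "model" are illustrative assumptions, not E-Scan or any real model API:</p>

```python
import math

def embed(text):
    # Stand-in for an expensive model-specific embedding step
    # (a real system would call e.g. a BERT-like encoder).
    vec = [float(ord(c)) for c in text[:4]] + [0.0] * max(0, 4 - len(text))
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    # Embeddings are context-free vectors: similarity search needs
    # no knowledge of the original data format.
    return sum(x * y for x, y in zip(a, b))

# Each embedding keeps (model id, object id) so a result can be
# mapped back to the original object for output or further analysis.
store = {("toy-encoder", oid): embed(txt)
         for oid, txt in [("obj1", "cats"), ("obj2", "dogs")]}
q = embed("cats")
best = max(store, key=lambda k: cosine(store[k], q))
print(best)  # ('toy-encoder', 'obj1') - the link back to the original object
```

      <p>The point of the sketch is the key structure: the vector alone is context-free, and only the preserved (model, object ID) pair makes it interpretable and reversible.</p>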
      <p>We consider these requirements and:</p>
      <p>• describe model-driven processing, its associated costs, and several use cases in Section 2,</p>
      <p>• outline the interaction between the original objects and a common intermediate, model-driven vector data representation in Section 3,</p>
      <p>• propose a design for efficient and composable exchange of contextual data supporting multi-modal data and embedding-based models in Section 4.</p>
      <p>Our design proposal is named E-Scan and aims to motivate a common extensible interface with new models and data formats that enables efficient contextual data exchange, caching, and lightweight processing behind a common plugin/connector-based interface, with a focus on vector embeddings.</p>
      <sec id="sec-4-1">
        <title>2.1. Models for Multi-Modal Context-Rich Data</title>
        <p>Before the advent of ML-driven data processing, large-scale analysis of contextual data often involved a human in the loop, through crowdsourcing approaches such as Amazon Mechanical Turk [6] and reCAPTCHA [7], or through hiring domain experts. While useful in the early days of generating datasets for training the models, human-based analysis is slow, error-prone, and expensive for ad-hoc analytics.</p>
        <p>Natural language processing has long been the study of representation learning, resulting in approaches such as word2vec [8] or fastText [9] that allow operations such as string similarity and classification, even for misspellings [10] and out-of-dictionary words. More complex approaches based on the Transformer architecture [11] resulted in popular models such as BERT [12], or GPT-3 [13] and GPT-4 [14], that allow more complex tasks such as translation and text generation. The growing complexity of the embedding and processing methods has increased the computational (and monetary) cost of processing, but equally the functionality that those models offer as part of the data processing pipeline and of automated, machine-driven insights.</p>
        <p>Furthermore, a rich research area in machine learning drives embedding models that support other context-rich data formats, equally transforming the input into subsequently processed embeddings, depending on the task and the architecture. Having specialized models for tasks allows multi-modal processing by selecting an appropriate method for the task. Models such as the Segment Anything Model (SAM) [15], ResNet [16], or DALL-E [17] can be used for model-driven image processing and generation. PANNs [18] and Whisper [19] are designed for audio processing. Finally, models trained on web-scale data exist as Foundation Models [20], which can be re-trained and adapted for a specific task and dataset without expensive re-building from scratch.</p>
        <p>Using models, analytics evolve from joins and aggregations into more complex operations such as object detection, sentiment analysis, classification, and similarity operations. However, a potential issue is that the data storage, the model specification, and the processing using a particular model and framework are decoupled, and the interactions become increasingly complex, even for questions as simple as what the model inputs should be and what transformations and schema the model will output for subsequent processing. We consider this in our system design.</p>
        <sec id="sec-4-1-2">
          <title>2.2. The Cost of Model-Driven Embedding</title>
          <p>Model-based analytics introduce complex and expensive operations for large-scale data processing, and challenges in optimizing the cost and resources. Firstly, such operations often require computational resources like GPUs to instantiate the models and achieve desirable latency.</p>
          <p>However, some models might not be open-sourced, resulting in monetary costs that become increasingly high if the models are not used frugally. We outline the pricing of different text embedding models of OpenAI in Table 1.</p>
          <p>If we consider the cost of processing single words only (a token) and an operation that embeds the words in an object or column store, performing this operation on a relatively modest data size of 1M tuples would result in a cost of $3 to $30 for that operation alone, in the case of single words. The cost is likely higher in realistic scenarios over sentences or documents. We cannot discard the monetary and computational costs; this cost should be amortized, and models invoked only when necessary.</p>
          <p>Furthermore, it is unlikely to have 1M unique words, nor is their frequency equal or of the same interest. Embedding the same words once and then reusing the embedding motivates the caching mechanisms that we introduce in our plugin design.</p>
          <p>A similar situation holds for other data formats regarding the cost (Table 2), as formats such as images or audio might involve more complex architectures and processing and, consequentially, pricing. We note that open-source models of similar characteristics can be used, from repositories such as HuggingFace [21] or TensorFlow Hub [22]. Still, the cost of local or cloud resources for instantiating and running these models remains, depending on the particular cloud provider. Ideally, some processing can be avoided using caching, but this might not be immediately possible in the case of images or generative AI based on given data-driven prompts.</p>
          <p>Still, analytical queries are typically selective, and not all data is of equal interest for analysis, which motivated prior research in lazy data ingestion using NoDB [23] architectures. In our case, this motivates pull-based model invocation, where data is not eagerly embedded and processed, but only on demand from the consuming/invoking operator. Allowing this process to happen in an ad-hoc, imperative setting in complex systems risks higher-than-necessary costs and longer latency for processing data that would later be discarded or re-processed fully multiple times.</p>
          <p>We present the caching and pull-based plugin design that aims to reduce the cost and resource requirements and to reduce latency in Section 4.</p>
        </sec>
        <sec id="sec-4-1-3">
          <title>3. Intermediate Data Representation</title>
          <p>In this work, we focus on models which, at some processing phase, transform the context-rich data of various formats into embeddings. Embeddings are high-dimensional vectors (tensors) of values, which are on their own context-free data structures. The separation of concerns between the context- and processing-providing models and the context-free embeddings allows transparent caching and processing optimizations, where at least part of the cost can be amortized.</p>
        </sec>
        <sec id="sec-4-1-1">
          <title>3.1. Vector Embeddings with E-Scan</title>
        </sec>
      </sec>
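      <p>The token-pricing arithmetic of Section 2.2 can be reproduced in a few lines. The $3 to $30 per 1M single-word range follows the Table 1 discussion; the number of unique words is a purely illustrative assumption to show how caching amortizes the cost:</p>

```python
def embedding_cost(n_tokens, price_per_1k_tokens):
    # Monetary cost of pushing every token through a paid embedding API.
    return n_tokens / 1000 * price_per_1k_tokens

n = 1_000_000  # 1M single-word tuples, as in the Table 1 example
low, high = embedding_cost(n, 0.003), embedding_cost(n, 0.03)
print(round(low, 2), round(high, 2))  # 3.0 30.0

# Far fewer than 1M distinct words occur in practice; embedding each
# unique word once and caching the vector amortizes most of the cost.
unique_words = 50_000  # illustrative assumption
cached = embedding_cost(unique_words, 0.03)
print(round(cached, 2))  # 1.5
```

      <p>Even under the most expensive pricing, deduplicating to unique inputs before invoking the model cuts the bill by the repetition factor, which is the core argument for the caching layer below.</p>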
      <sec id="sec-4-2">
        <p>Embeddings represent an intermediate data representation that is contextualized or processed by a corresponding model. They are uniform no matter the data type, while their dimensionality may vary. This common representation motivates having a vector data management layer that would serve as a caching layer, offering exact and approximate retrieval and data access methods.</p>
        <p>[Figure 2: Multi-modal, heterogeneous context-rich data is mapped by model-driven embeddings into a uniform vector format: E-Scan, the model embedding data plugin, pairs AI-driven context with a uniform data representation.]</p>
        <p>Figure 2 shows a conceptual flow of information: rather than System A, System B, and the base data stored in various object stores being exposed as raw sources, we propose a scan connector/plugin-based mechanism called E-Scan. This allows a composable and common plugin for exchanging contextual data. First, the model information is preserved to provide context to the embeddings, which are transferred and exchanged as vector data. In case the original data is required, and for caching purposes, the object ID is also preserved to provide a link to the original data (black arrow). Then, if System A and System B need to process or exchange context-rich information, the plugin contains all the information and specification of the model, data, and original input in a uniform way.</p>
        <p>[Figure 3: The E-Scan plugin over multi-modal, heterogeneous context-rich data: model + metadata, a vector database/cache with lazy + efficient embedding, decoding/un-mapping, and data access + processing methods.]</p>
        <p>Input requirements should indicate the expected data format or schema and the particular characteristics of a type that can be instantiated, such as image size in pixels. The output should equally specify what is expected: an output schema, and the fields and types of the given schema. This can allow model-operator composability in a more traditional relational optimization case [1] or provide a blueprint for generating code for data exchange or ingestion between different components.</p>
        <p>As embedding might not be the first or the last step of a model, to enable interoperability with such plugins, the model design needs adaptation to start consuming from a plugin in an intermediate step rather than requiring the original input data. Thus, the green and blue lines represent the model and data exchange via a common descriptive connector, rather than ad-hoc exchange via the dashed arrows.</p>
      </sec>
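      <p>A plugin declaration along these lines, with explicit input requirements and an output schema, could look as follows. The field names and the compatibility check are our illustrative assumptions, not a fixed E-Scan format:</p>

```python
# Hypothetical E-Scan plugin descriptor: the input block states what the
# model expects (format, image size in pixels), the output block states
# the schema it produces, enabling composition and code generation.
sam_plugin = {
    "model": {"name": "SAM", "version": "1.0", "origin": "local-gpu"},
    "input": {"format": "image/png", "width_px": 1024, "height_px": 1024},
    "output": {
        "schema": [
            ("object_id", "string"),   # link back to the original object
            ("segment_id", "int"),
            ("embedding", "float32[256]"),
        ]
    },
}

def compatible(producer, consumer):
    # Model-operator composability check: a consumer can follow a
    # producer if every column it requires appears in the producer's
    # declared output schema.
    produced = {name for name, _ in producer["output"]["schema"]}
    required = set(consumer["input"].get("columns", []))
    return required.issubset(produced)

count_objects = {"input": {"columns": ["object_id", "segment_id"]}}
print(compatible(sam_plugin, count_objects))  # True
```

      <p>Such a declarative blueprint is what would let an optimizer, or generated glue code, wire components together without manual data movement.</p>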
    </sec>
    <sec id="sec-5">
      <title>4. E-Scan Plugin and System Design</title>
    </sec>
    <sec id="sec-6">
      <sec id="sec-6-1">
        <title>3.2. Schema and Metadata</title>
        <p>Rather than exchanging data and embeddings directly, in an ad-hoc manner, and manually keeping information about the models, we propose storing this provenance information in the plugin metadata. To the best of our knowledge, there is no uniform way to make such specifications, but the information about the particular model, its creator, its origin, and its input and output parameters represents the minimal information required for exchange between different components. Keeping and transferring only the data of need allows lightweight transfer by exchanging only the requested data, ideally not transferring the original data but rather only keeping a reference to the original file containing the exact source and an identifying key.</p>
      </sec>
      <sec id="sec-6-1-1">
        <p>Data access patterns and the cost of complex model-driven analytics (Section 2) motivated a system design that is pull-based, lazy, and proactively caches expensive operations. This section presents the system aspects of contextual data exchange with E-Scan.</p>
        <p>Beyond a specification intended for connecting various components that can customize the schema to facilitate data exchange and processing by different components, similar to the ideas behind Apache Arrow [24], a lightweight engine and access methods are also part of the design, as presented in Figure 3.</p>
      </sec>
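      <p>The minimal provenance information named above (model, creator, origin, input and output parameters) can be captured in a small record. Since no uniform specification exists, the exact fields below are our assumption:</p>

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class ModelProvenance:
    # Minimal metadata exchanged alongside embeddings so that any
    # component can interpret the vectors without the original model.
    model_name: str
    creator: str
    origin: str                    # e.g. a model-hub name or local path
    input_params: dict = field(default_factory=dict)
    output_params: dict = field(default_factory=dict)

bert_meta = ModelProvenance(
    model_name="bert-base-uncased",
    creator="Google",
    origin="huggingface",
    input_params={"max_tokens": 512},
    output_params={"dim": 768, "dtype": "float32"},
)
# Only the requested vectors travel between systems; the record above,
# plus an object ID, is enough to contextualize them on the other side.
print(bert_meta.output_params["dim"])  # 768
```
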
      <sec id="sec-6-2">
        <title>4.1. Caching and Intermediate Storage</title>
        <p>Caching is applicable to systems, components, models, or parts of models that create embeddings that can be reused. Naturally, the designation of invalidation, and of which embeddings might not be good candidates for caching, is part of the caching policy.</p>
        <p>The immediate purpose of caching is to avoid expensive model processing. This access path is represented by the solid blue and green arrows in Figure 3. As embeddings are all in a common vector format, data management engines specialized for vectors, such as Milvus [26], can be used as the caching support layer. Alternatively, lightweight storage and retrieval systems based on vector indexes that support searches on heterogeneous hardware, such as FAISS [27], are good candidates for the vector caching and storage layer.</p>
        <p>Beyond storing and caching embeddings, basic operations such as similarity or top-K search are often available, allowing more complex data processing and access patterns. Traditional index structures that link the records with their primary keys in the original object representations are also necessary to allow fast retrieval, besides full data scans. That an object has been embedded by a given model must also be registered, either explicitly in a data structure or via lightweight mechanisms such as Bloom filters.</p>
        <p>Only in case of a cache miss should the request be propagated to the system and model for explicit embedding (or batched as a group request on the system side). We call this lazy embedding: the mechanism avoids the expensive path of execution for previously visited objects, rather than eagerly embedding all the data. Cache hits can happen due to duplicates or due to re-visiting the same data in different queries. Furthermore, this mechanism can be adapted to avoid embedding if a sufficiently similar entry is already present among the embeddings. Still, to approximate similarity in the embedding domain, a cheaper embedding method is required, either through a lightweight model or by applying other input-similarity methods that should yield performance improvements, effectively trading off similarity and approximation for computational cost, as in traditional approximate query processing.</p>
        <p>Finally, as processing might require the original objects, not only embeddings, a corresponding key and path/identifier are saved to retrieve the object from the corresponding object-store component (black arrow, Figure 3), as not all use cases and models have a full encoder-decoder architecture that can produce the requested output. This process replaces manual data manipulation and embedding with a generalizable caching mechanism.</p>
        <p>The analysis of caching versus recomputing model-based data transformations opens up future work in performance estimation and in the benefits of such caching, involving data movement of input data and model state that might be larger than working memory, as well as hardware characteristics such as accelerators and interconnects with their monetary cost, extending embedding table caching approaches [25].</p>
        <sec id="sec-6-3">
          <title>4.2. Lazy Retrieval and Mapping</title>
          <p>The design of E-Scan aims to allow better and more efficient interoperability through a common plugin and interface, to encapsulate functionality and avoid re-implementing desirable data exchange characteristics in complex systems.</p>
          <p>In this work, we follow the principles behind NoDB [23], inasmuch as we do not want to embed all the data eagerly. Not all the data might be the object of interest, and this process should be pull-based, meaning that it should happen only upon requesting the embedding. This saves initial processing time and is progressively faster with the previously described caching mechanism in case of cache hits, without any implied prefetching mechanism, just using the access patterns requested by the consumer.</p>
          <p>The consumer has to specify to the plugin which model, from which schema, and which data it wishes to transform, and provide this information to the connector, for example, by extending industry-led formats [28, 29]. This also involves specifying which objects (context-rich data) should be involved in the query and potentially cached, which can be an explicit list or the result of another query.</p>
          <p>This mechanism and specification are also similar to adapting to different data types in ViDa [30], where code generation can also be used to avoid the overheads of function calls and to create custom-built access patterns and procedures tailored for specific data formats, beyond the common vector representation. This also allows pulling the plugin specification inside system components that support code generation, to avoid an explicit component and communication overhead, or designing an embedded, process-local component corresponding to what DuckDB is used for in analytical query processing [31].</p>
        </sec>
        <sec id="sec-6-4">
          <title>4.3. Example and Conceptual Use of E-Scan</title>
          <p>The main goal of E-Scan is to allow efficient, easy-to-use, and composable interoperability of components that involve embedding and vector data over context-rich, multi-modal formats. In Figure 4, we provide a short example of E-Scan in action. Without this component, the user would be forced to know and implement all the details of caching, processing, optimization, and system interaction, as in Figure 1.</p>
          <p>In this example, we use the Segment Anything Model (SAM) [15] as a sample state-of-the-art image processing tool that segments the objects in images. We indicate this processing part in green, with the corresponding steps under number 1). For text, we use BERT [12] as an example of a word-embedding method that allows semantic similarity matching and classification. Conversely, this processing path is represented in blue, and the corresponding steps are under number 2).</p>
          <p>[Figure 4: Example system functionalities of E-Scan. 1) Object detection and segmentation: 1.a) pull-based, lazy request, 1.b) SAM embeddings, 1.c) image cache + ID, via the Image Engine. 2) Semantic word embeddings: 2.a) pull-based, lazy request, 2.b) BERT embeddings, 2.c) text cache + ID, via the Text Engine. The E-Scan plugin holds model + metadata and offers exact model + metadata info, object-embedding mapping, cached + lazy computation, ID-based object lookup, and index-based vector search behind a common interface.]</p>
          <p>A consumer queries the common interface, starting from the image and text data and the instantiated models with corresponding metadata. Suppose this is driven by a query that wants to find all the images with more than three objects that contain cats or dogs, where the corresponding image description (text) is of positive sentiment and contains synonyms of the words "joy" or "cute".</p>
          <p>Rather than having this as a manual process, the request comes to the common interface, requesting text and image data processing. This work does not discuss cardinality estimation or cost differences, for example, whether the text embeddings should execute before the image embeddings. This would consequentially enable a filter pushdown and sideways information passing of the more selective processing over the query for finding images that contain more than three objects with cats or dogs; this remains a topic of future work but would be the task of the plugin or a corresponding query optimizer, as in our recently proposed work on holistic optimization of context-rich engines [1].</p>
          <p>In this case, the plugin dispatches the lazy requests to the Image Engine, which retrieves the images and performs embedding and processing 1.a). Conversely, image embeddings are created 1.b) and cached along with the original object ID 1.c). If there are similar images based on simpler embeddings or other metrics, the caching mechanism can deploy a similarity search before dispatching requests to the model.</p>
          <p>With the existing original image object IDs, we can filter out the corresponding text object IDs that qualify and perform the task similarly, dispatched lazily to the Text Engine. Finally, the requested data is exchanged, and the results of processing and the embeddings are exchanged via the common interface for upstream processing.</p>
          <p>Some details remain topics of future work, such as the granularity and the benefit of replacing the processing of embeddings and intermediate results with a data cache: while embedding data is expensive, other post-processing operations might also be useful to cache explicitly. Similarly, this can also motivate modifying existing model architectures and replacing explicit embedding with an intermediate lightweight caching mechanism. Still, the main goal of E-Scan is to provide a blueprint for a common interface for efficient and composable emerging data processing components.</p>
        </sec>
        <sec id="sec-6-5">
          <title>5. Related Work</title>
          <p>The provenance of new machine-learning-driven methods creates novel data processing and extraction challenges and opportunities. In our recent work, we propose holistic declarative optimization of context-rich analytical engines meant for hybrid model-relational processing [1].</p>
          <p>The challenges of integrating systems and components are related to polystores [4, 5]. In this work, we have tackled the initial blueprint of data exchange primitives and components; still, similar lessons should be applied for cross-system optimization, especially for more complex and hybrid analytical query processing that involves multiple heterogeneous systems.</p>
          <p>From the abstractions perspective, this work corresponds to the purpose of Apache Arrow for in-memory formats for flat and hierarchical data [24]. The main idea is to abstract the implementation details and allow easy data exchange between the components, delegating the complexity to the underlying system abstractions handled by the provided metadata and lightweight engine.</p>
          <p>Finally, this work relates to NoDB [23], as it does not process nor ingest data before needed, saving on preprocessing cost, considering the cost of processing the entire data, which might not be of immediate interest. Instead, we adopt lazy data loading and use caching and appropriate exact and approximate data access patterns to speed up the processing and avoid unnecessary computation, both due to loading and due to repeated requests for similar data, therefore proposing a new dimension to existing work on embedding table caching [25].</p>
        </sec>
      </sec>
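      <p>Putting the pull-based mechanisms of Section 4 together: a cache lookup keyed by (model, object ID), with model invocation only on a miss. The class below is a toy in-memory stand-in for the vector caching layer that a real deployment would back with an engine such as Milvus or a FAISS index; all names are hypothetical:</p>

```python
class EScanCache:
    # Toy stand-in for the E-Scan vector caching layer.
    def __init__(self, embed_fn):
        self.embed_fn = embed_fn   # the expensive model invocation
        self.vectors = {}          # (model_id, object_id) -> embedding
        self.misses = 0

    def get(self, model_id, object_id, payload):
        key = (model_id, object_id)
        if key not in self.vectors:
            # Lazy embedding: the model runs only on a cache miss,
            # and the result is registered for later reuse.
            self.misses += 1
            self.vectors[key] = self.embed_fn(payload)
        return self.vectors[key]

toy_embed = lambda text: [float(len(text))]   # placeholder "model"
cache = EScanCache(toy_embed)
cache.get("toy-model", "obj1", "a cat photo caption")
cache.get("toy-model", "obj1", "a cat photo caption")  # served from cache
cache.get("toy-model", "obj2", "a dog photo caption")
print(cache.misses)  # 2
```

      <p>Re-visiting the same object in a later query costs a dictionary lookup instead of a model call, which is exactly the amortization argument of Sections 2.2 and 4.1.</p>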
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>We thank the anonymous reviewers for their insightful comments and detailed feedback.</p>
    </sec>
    <sec id="sec-8">
      <title>6. Conclusion and Future Work</title>
      <sec id="sec-8-1">
        <p>In this work, we presented E-Scan: the initial vision and proposal for a plugin/connector that intends to simplify, and make efficient, the communication and data exchange between model-driven components in analytical query processing.</p>
        <p>We target composability via the common and extensible plugin-based interface that components can invoke, registering the necessary schema information and metadata specific to the particular data and model shape and requirements. Thanks to the common and context-free vector format of embeddings, we propose a reusable and efficient design of the caching layer, driven by a lightweight vector engine and access methods. This brings computationally heavy model processing closer and trades off part of it for database-inspired caching techniques.</p>
        <p>The main goal is to achieve functionality and
performance while decoupling the data management details
from the intent of the user via lightweight abstractions.</p>
        <p>Still, there are many available models and use cases
that are yet to be tested and remain the topics of future
work. Designing a concrete interface with established
frameworks and models is required to achieve a
wellencompassing set of functionalities and achieve desired
ease of use. Finally, exploring the diferent granularities
of replacing embedding computation with caching and
standardizing the data exchange protocols is a desired
long-term outcome for supporting the holistic
integration of model-driven analytics with data management
through optimizable, declarative, and composable
interfaces and components.
</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>