<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Performance Study of One-dimensional Learned Cardinality Estimation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yuchen Ji</string-name>
          <email>ji.yuchen@ist.osaka-u.ac.jp</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yuya Sasaki</string-name>
          <email>sasaki@ist.osaka-u.ac.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daichi Amagata</string-name>
          <email>amagata.daichi@ist.osaka-u.ac.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Takahiro Hara</string-name>
          <email>hara@ist.osaka-u.ac.jp</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Osaka University, JST Presto</institution>
          ,
          <addr-line>Osaka</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Osaka University</institution>
          ,
          <addr-line>Osaka</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The latest proximity query processing methods benefit from learned models (indexes) and have shown better performances than non-learned indexing approaches. This paper focuses on one-dimensional data, because querying one-dimensional data is one of the most important operations in database management systems. Specifically, we address the one-dimensional cardinality estimation problem and consider the questions: Are learned methods suitable for one-dimensional cardinality estimation? If so, how good are they? To answer these questions, we first design a prototype of a learned method for one-dimensional cardinality estimation. Second, we empirically evaluate the learned method together with existing methods. Then, we analyze the strong and weak points of these methods and find suitable cases for learned methods.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        Cardinality estimation is a fundamental problem in query
optimization. It aims at estimating the number of records required by
a query. Most query optimization techniques are cost based [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
and cardinality estimation can help estimate the cost. Recently,
learned cardinality estimation methods [
        <xref ref-type="bibr" rid="ref15 ref16">15, 16</xref>
        ] have shown
remarkable improvement compared with traditional methods such
as histogram-based methods. These learned methods mainly use
deep learning models to map given query predicates to the
estimated cardinality of query results, because they can handle
complex non-linear relationships better than traditional methods
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. However, existing learned cardinality estimation methods
focus on multi-dimensional data [
        <xref ref-type="bibr" rid="ref11 ref15 ref16 ref19 ref6">6, 11, 15, 16, 19</xref>
        ]. They ignore the
most fundamental case, namely one dimension. One-dimensional
data and queries play an essential role in database management
systems (DBMSs). They support basic operations such as
managements of user ids and queries according to timestamps.
Cardinality estimation on one dimension can help estimate
execution costs and better plan these operations [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Investigations on
one-dimensional cases can also help develop methods for
multidimensional cases. For example, traditional histogram-based and
sampling-based methods are both extended from one dimension
to multiple dimensions.
      </p>
      <p>
        In the field of machine learning for databases, indexes receive
benefits from machine learning [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. The learned index is first
proposed for point and range queries on one-dimensional data
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Learned indexes show much better performances compared
      </p>
    </sec>
    <sec id="sec-2">
      <title>PRELIMINARY</title>
    </sec>
    <sec id="sec-3">
      <title>Problem statement</title>
      <p>
        Consider a dataset  consisting of sorted one-dimensional data
with unique integer keys. A range query predicate  is specified
as a central key  and a range . The query result is the set
of records whose keys fall into [ − ,  + ]. Here, the goal
of cardinality estimation is to estimate the number of records
from  satisfying the query predicate. The estimated cardinality
is denoted by () and the real cardinality is denoted by
 (). (Another equivalent concept of cardinality is ,
and it represents the percentage of records satisfying the query
predicate [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].)
Error metrics. We evaluate Mean Absolute Percentage Error
(MAPE) and Q-error to see estimation errors. These errors are
widely used for cardinality estimation problems [
        <xref ref-type="bibr" rid="ref15 ref16 ref18">15, 16, 18</xref>
        ] and
are respectively defined as:
and
      </p>
      <p>MAPE = () −  ()</p>
      <p>()
Q-error =
max (),  ()
2.2</p>
    </sec>
    <sec id="sec-4">
      <title>Existing methods</title>
      <p>Sampling. This is the most straightforward approach to
estimating the cardinality of a given range query. The estimated
cardinality is obtained by scaling the cardinality on the samples.
The performance of sampling is directly related to the number
of samples (more samples yields a better accuracy but incurs a
higher computational cost).</p>
      <p>
        Histogram. Histogram-based approaches build a histogram on a
given dataset and sum up the counts of buckets intersected with
a given query predicate as estimated cardinality [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Histograms
have a long history of being used to solve these estimation tasks
[
        <xref ref-type="bibr" rid="ref17 ref7">7, 17</xref>
        ]. For one-dimensional cases, there are two classical
histograms: equal-width and equal-depth. A histogram is essentially
a group of linear models, where each bucket is a linear model
with a slope of the number of records divided by the width. This
one-level histogram usually uses a higher computation cost than
existing hierarchical learned models [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], because it needs
hundreds of add operations to sum up the counts of buckets.
Direct calculation. The cardinality of range queries on one
dimensional data can be easily estimated by two point queries
because the records in one-dimensional datasets are generally
sorted according to keys.
      </p>
      <p>More concretely, we can obtain the exact cardinality after
computing the orders of records with  −  and  +  as keys in
the dataset for the query :</p>
      <p>() =  ( + ) −  ( − ) +  ,
where  ( ) indicates the order of the first record whose key
is greater than or equal to  in the sorted dataset.  equals 1
if there exists a record whose key equals  +  in the dataset.
Otherwise  equals 0. Hence, the computation cost is equivalent
to the costs of the two point queries. The computational eficiency
can be improved by using indexes.
3</p>
    </sec>
    <sec id="sec-5">
      <title>LEARNED MODEL DESIGN</title>
      <p>
        We derive our idea from a learned index structure, Recursive
Model Index (RMI) [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. RMI is the first and state-of-the-art
learned index for the search problem on one-dimensional data. It
stacks learned models to approximate the cumulative distribution
function (CDF) and then computes the position of a given key in
a sorted array according to the CDF. Here the term "CDF" means
the function mapping keys to their corresponding positions in
an array.
      </p>
      <p>As shown in Figure 1, the approximated CDF, F, which maps
keys to the corresponding positions, can also be used for
cardinality estimation. The mapping done by indexes has an intrinsic
relationship with the mapping from range queries to estimated
cardinality. We can approximate the relationship between
cardinality and the range started from  by  ′:</p>
      <p>′ () = ( ( + ) −  ( − ))/2.</p>
      <p>RMI usually adopts linear models to approximate  and shows
good performance. Therefore, we also employ linear models to
approximate  ′.</p>
      <p>Based on the above observations, we design a prototypic
cardinality estimation by refining the original RMI 1. As shown in
Figure 2, this model has a two-level hierarchy. For an input query,
the first level directs to a specific model in the second level, and
models in the second level then predict the cardinality. We use a
two-layer neural network in the first level. In the second level,
we organize many multi-scale linear models into an array. Each
multi-scale linear model is responsible for a subset of the original
dataset.</p>
      <p>
        For the first level, another choice is a linear model with less
computation cost than a neural network. However, choosing a
linear model will result in a more skewed distribution of subsets
for the second level [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], thus negatively afecting performance
and training. The neural network is hence a better choice. It is
hard for a simple linear model in the second level to cover range
queries with totally diferent scales of selectivity. Therefore, we
train some models with diferent scales for each subset in the
second level2.
      </p>
      <p>During the estimation procedure, the first level takes the
central key  of the given query as input and directs it to the
corresponding model in the second level. The selected second-layer
model estimates the cardinality according to the query range.</p>
      <p>From the recent success of learned indexes, we expect that
this learned model works well, i.e., it will provide high eficiency
and accuracy. We report its practical performance in the next
section.
4</p>
    </sec>
    <sec id="sec-6">
      <title>EVALUATION</title>
      <p>
        Datasets. We used four real-world datasets with diferent
distributions from the SOSD benchmark [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. These datasets have
recently been used for evaluating learned index works [
        <xref ref-type="bibr" rid="ref14 ref5">5, 14</xref>
        ].
Each of them contains 200 million 64-bit key/value records:
1Although we follow the structure of RMI, our model is trained for estimating the
cardinality of a given query.
2The selectivity is application dependent. Hence, applications can train this model
by specifying the range of selectivity.
• amzn: book popularity data from Amazon.
• face: Facebook user IDs sampled randomly.
• osm: cell IDs from Open Street Map.
      </p>
      <p>• wiki: timestamps from Wikipedia.</p>
      <p>Environment. All experiments were conducted on a server
with an Intel Xeon Gold 6254 @3.10GHz CPU and 768GB RAM,
running Ubuntu 18.04 LTS. The evaluated methods were
implemented in Python.</p>
      <p>
        Model setting. We evaluated the methods listed in Table 1. In the
case of the sampling methods, we estimated the cardinality based
on calculated cardinality of sampled datasets by binary search.
An equal-depth histogram has the same number of records in
each bucket. For the direct calculation method, we built an RMI
index on the original dataset. Compared with classical B-Trees [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ],
RMI shows better computational and space costs [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Recall that
direct calculation returns the exact cardinality, so we measured
the computation time and model size for direct calculation. For
the learned method, we used a two-layer neural network with 16
neurons in the hidden layer for the first level. We set the number
of models in the second level to 1 ∗ 105 to leverage the storage
cost and accuracy. Table 2 shows the model sizes of all methods.
We controlled the sizes (space consumptions) of LC, HM, SA3,
and DC so that they were (almost) the same.
      </p>
      <p>Training of LC. For the first level, we assume that a given
dataset is equally divided into subsets for models of the second
level. We composed keys and corresponding ids of models into
pairs as the training data. After the training of the first level was
Method
Model size</p>
      <p>LC
1.6</p>
      <p>HM
0.8</p>
      <p>SA2
16</p>
      <p>SA3
1.6</p>
      <p>SA4
0.16</p>
      <p>DC
1.6
done, we partitioned the original dataset according to the output
of the first level. For the second level, we trained the group of
models sequentially. We prepared training data of multiple scales
for each model. For each scale, we generated 1000 query
predicates (, ) and calculated the true cardinality as the training
data.</p>
      <p>The keys and ids were both scaled to a maximum of 100 during
the training and evaluation. To train the neural network of the
ifrst level, we used Adam as the optimizer. We set the learning
rate to 1 ∗ 10−4. It took two epochs to finish the training. The loss
remained large when the network was trained on osm, because
a two-layer neural network cannot approximate the skewed
distribution well. We divided the datasets into subsets for models in
the second level according to the predicted results of the trained
neural network. Therefore, the high loss of the neural network
did not matter. We fitted the linear models of the second level to
the training data by using non-linear least squares. In the case of
the face dataset, there were tens of outliers that made it hard for
training. We removed the outliers when we scaled the keys for
model training.</p>
      <p>Workloads. For query predicates (, ), first, we selected scales
of selectivity3 for query range  from {10−5, 10−4, 10−3}. Then we
randomly generated 1000 pairs of (, ) for each scale of
selectivity. The keys generated are limited within the range of existing
keys in the datasets. The generated  was floated around the
selected selectivity with possible magnitudes within (10−1, 101).
Evaluation results. Table 3 shows the estimation errors on
all datasets and queries with a selectivity of around 10−4. The
50th/95th/99th values show the corresponding percentile errors.
Table 4 shows estimation errors on the wiki dataset with
selectivities of around 10−5, 10−4, and 10−3.</p>
      <p>The MAPE errors and Q-errors do not show significant
differences. Most MAPE errors are much smaller than 1, with
zeros after their decimal points. According to the definitions, the
similarity of MAPE errors and Q-errors is easy to understand.
However, for SA4, whose selectivity of sampling hardly matches
the queries’ selectivity, its Q-errors are extremely large.
Exp-1 (LC vs. SA). Sampling methods can easily find a
tradeof between accuracy and storage cost. According to the errors
shown in Tables 3 and 4, we study the following: (i) Generally,
LC in the current setting has a competitive accuracy over SA3.
(ii) The 50th errors of LC are usually slightly larger than those of
SA3, whereas LC has smaller 95th and 99th errors. This means
that the distribution of LC’s errors is more even than that of SA3.
(iii) Sampling methods have larger errors for smaller selectivity.
In contrast, the variation of LC’s errors for diferent scales of
selectivity is more stable.</p>
      <p>The MAPE errors of the three sampling methods are linear to
their sampling scales. This stable multiple of errors for sampling
methods of continuous scales makes sampling methods suitable
for a performance baseline. Sampling methods with diferent
scales can be treated as a trade-of between accuracy and storage
cost. It is always easy for sampling methods to reduce errors.
3We omit the term “1∗”.
Exp-2 (LC vs. HM). Tables 3 and 4 show that HM generally
returns smaller errors than LC. Similar to SA, HM shows degraded
performance with smaller selectivity of queries. HM has a more
significant advantage over LC for queries with a selectivity of
10−3.</p>
      <sec id="sec-6-1">
        <title>Exp-3 (Performances on osm). LC has the worst accuracy on</title>
        <p>
          osm among all datasets, whereas the other methods do not show
decreased performances on osm. This is mainly attributed to the
distribution of osm. Figure 3 shows that osm is the most skewed.
The small segment shown in the zoomed region of osm shows
that the distribution lacks local structures, making it dificult for
LC to learn the CDF. Similar situations are met in indexing works
[
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. HM and the sampling methods are not bothered a lot by
the distribution of osm, because they do not need to learn the
distribution.
        </p>
      </sec>
      <sec id="sec-6-2">
        <title>Exp-4 (Performances on amzn, face, and wiki). The CDFs</title>
        <p>of amzn and wiki show very smooth distributions. The zoomed
small segments of amzn and wiki also look similar. Considering
zoomed segments of face, face also lacks local structures,
similarly to osm. As a result, LC has more significant errors on face
compared with its errors on amzn and wiki.</p>
        <p>
          Exp-5 (Estimation time). Table 5 shows the average estimation
times on diferent datasets with a selectivity of 10−4. LC is much
faster than the other methods. DC takes less time than HM and
SA4 on amzn, face, and wiki. When RMI is used as the index, the
time cost of a point query consists of the inference time (evaluated
by RMI) and the search time (for searching around the predicted
order). As DC uses two point queries to estimate cardinality, it is
reasonable that DC needs longer time than LC. We see that LC,
HM, and SA have stable times on diferent datasets. DC needs
more time on osm than the other datasets. The point queries on
osm incur a long search time [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. HM scans the border values
of buckets, and this scanning contributes to a major part of the
time costs. The sampling methods with large samples involve
many record accesses, thereby SA2 and SA3 are very slow.
5
        </p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>DISCUSSION</title>
      <p>The main advantage of the prototype learned method LC is its
computational eficiency. This comes from its structure, which is
based on simple learned models. At the same time, the estimation
errors of LC seem a bit larger than those of the other evaluated
methods. In this section we summarize and discuss our findings.</p>
      <p>SA2
44821,5
42313.6
44743.2
47847.8</p>
      <p>SA3
1176.6
1162.3
1230.1
1271.0</p>
      <p>SA4
129.8
129.9
138.5
143.1</p>
      <p>DC
77.6
91.8
93.0
196.6
Best performance. LC has the best computational eficiency.
Its error is a bit larger than those of the other models but is
absolutely small. DC provides the true value of cardinality, showing
the best accuracy. At the same time, its estimation time normally
outperforms those of the sampling methods and HM.
Consequently, LC and DC are the options, but for estimation purpose,
we consider that LC is better suited, as it yields high eficiency
and small error.</p>
      <p>Diferences among LC, HM, and DC. As mentioned in Section
2.1, HM is a group of linear models. LC, HM, and DC with RMI
all make use of models. The model diferences lead to diferences
in their performances. DC calculates cardinality by finding the
orders of two border keys of the query range. It involves twice
the computation cost of RMI. Compared with DC, LC infers the
diference of the orders of two border keys instead of inferring
the orders. Hence, LC reduces the computation cost to one pass
of the learned models. Models of HM take charge of counting
their buckets. When the query range covers multiple buckets,
which is a typical case, HM accumulates counts of all covered
buckets. Only on the border buckets intersected by the query
range, inferences by models need to be executed. The errors in
HM come from estimations of the border buckets.</p>
      <p>
        Further optimization. If we treat LC as a baseline, better
methods for one-dimensional cardinality estimation should have
smaller estimation errors and competitive computational
eficiency. A possible way is to add some extra models to LC to get
more accurate estimations. However, complex models are not
suitable because they incur high time costs. Therefore, the issue is
how to make LC more accurate while retaining simple structures.
If we seek to optimize models compared with DC, then we need
to cut down its computation time. With RMI as an index, DC
consumes about 20 multiplication operations (varying among the
diferent models used in RMI) and tens of add operations. Less
computation than DC means that only several multiplication
operations are allowed. This is a new research challenge.
Training cost. For learned methods, we need to consider the
training costs when bringing them to applications. The training
cost of LC consists of two parts. (i) For model training, LC shares
similar costs with the original RMI model, which DC uses. In
our evaluation, to train the prototype learned model on a dataset
consisting of 200 million records, it took about a half hour for a
simple neural network and 10 minutes for 1∗105 linear models. (ii)
For the preparation of training data, LC has two ways, passive and
active. The query results produced by DBMSs can be utilized for
training. However, this passive way cannot guarantee suficient
amounts of data to train the models. Besides, active collection of
training data for learned models can be expensive [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. LC needs
a longer time than DC to collect training data, because DC does
not need to obtain true cardinality. However, both training data
collection and model training are done once, so the ofline cost of
LC would not be a drawback in practice.
      </p>
      <p>Summary. To summarize our discussion, LC and DC are options
of one-dimensional cardinality estimation if their training costs
are not bottlenecks for applications. They have accuracy-time
trade-of. We provided the directions of further optimizations for
LC and DC.</p>
    </sec>
    <sec id="sec-8">
      <title>6 CONCLUSION</title>
      <p>In this study, we addressed an unanswered question: Are learned
methods suitable for one-dimensional cardinality estimation? We
designed a prototype learned structure for one-dimensional
cardinality estimation. We empirically evaluated the performances
of the learned method and existing methods to investigate their
advantages and disadvantages. The evaluation results answer the
question: A learned method based on the current state-of-the-art
model is useful for quick and accurate cardinality estimation.
In future work, we will seek to optimize the prototypic learned
method to achieve better accuracy by using other structures. We
also plan to design a strategy to collect training data, e.g., to
judge whether collected training data satisfy the needs.</p>
    </sec>
    <sec id="sec-9">
      <title>ACKNOWLEDGMENTS</title>
      <p>This research is partially supported by JST PRESTO Grant
Number JPMJPR1931 and JST CREST Grant Number JPMJCR21F2.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Nicolas</given-names>
            <surname>Bruno</surname>
          </string-name>
          and
          <string-name>
            <given-names>Surajit</given-names>
            <surname>Chaudhuri</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>Exploiting statistics on query expressions for optimization</article-title>
          .
          <source>In SIGMOD</source>
          .
          <volume>263</volume>
          -
          <fpage>274</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Surajit</given-names>
            <surname>Chaudhuri</surname>
          </string-name>
          .
          <year>1998</year>
          .
          <article-title>An overview of query optimization in relational systems</article-title>
          .
          <source>In Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems</source>
          .
          <volume>34</volume>
          -
          <fpage>43</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Douglas</given-names>
            <surname>Comer</surname>
          </string-name>
          .
          <year>1979</year>
          .
          <article-title>Ubiquitous B-tree</article-title>
          .
          <source>ACM Computing Surveys (CSUR) 11</source>
          ,
          <issue>2</issue>
          (
          <year>1979</year>
          ),
          <fpage>121</fpage>
          -
          <lpage>137</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Graham</given-names>
            <surname>Cormode</surname>
          </string-name>
          , Minos Garofalakis,
          <string-name>
            <surname>Peter J. Haas</surname>
            , and
            <given-names>Chris</given-names>
          </string-name>
          <string-name>
            <surname>Jermaine</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Synopses for Massive Data: Samples, Histograms</article-title>
          , Wavelets, Sketches.
          <source>Found. Trends Databases 4</source>
          ,
          <fpage>1</fpage>
          -
          <lpage>3</lpage>
          (
          <year>2012</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>294</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Andrew</given-names>
            <surname>Crotty</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>Hist-Tree: Those Who Ignore It Are Doomed to Learn.</article-title>
          .
          <source>In CIDR.</source>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Shohedul</given-names>
            <surname>Hasan</surname>
          </string-name>
          , Saravanan Thirumuruganathan, Jees Augustine, Nick Koudas, and
          <string-name>
            <surname>Gautam Das</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Deep Learning Models for Selectivity Estimation of Multi-Attribute Queries</article-title>
          . In SIGMOD.
          <volume>1035</volume>
          -
          <fpage>1050</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Yannis</given-names>
            <surname>Ioannidis</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>The History of Histograms (Abridged) (VLDB</article-title>
          ).
          <volume>19</volume>
          -
          <fpage>30</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Kipf</surname>
          </string-name>
          , Ryan Marcus, Alexander van Renen,
          <string-name>
            <surname>Mihail Stoian</surname>
            , Alfons Kemper, Tim Kraska, and
            <given-names>Thomas</given-names>
          </string-name>
          <string-name>
            <surname>Neumann</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>RadixSpline: A Single-Pass Learned Index</article-title>
          .
          <source>In Proceedings of the Third International Workshop on Exploiting Artificial Intelligence Techniques for Data Management (aiDM) . Article 5.</source>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Tim</given-names>
            <surname>Kraska</surname>
          </string-name>
          , Alex Beutel, Ed H Chi,
          <string-name>
            <given-names>Jefrey</given-names>
            <surname>Dean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>and Neoklis</given-names>
            <surname>Polyzotis</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>The case for learned index structures</article-title>
          .
          <source>In SIGMOD</source>
          .
          <volume>489</volume>
          -
          <fpage>504</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Guoliang</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Xuanhe</given-names>
            <surname>Zhou</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Lei</given-names>
            <surname>Cao</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>AI Meets Database: AI4DB and DB4AI</article-title>
          . In SIGMOD.
          <volume>2859</volume>
          -
          <fpage>2866</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Qiyu</surname>
            <given-names>Liu</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Yanyan</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>and Lei</given-names>
            <surname>Chen</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>LHist: Towards Learning Multidimensional Histogram for Massive Spatial Data</article-title>
          .
          <source>In ICDE</source>
          .
          <volume>1188</volume>
          -
          <fpage>1199</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Marcel</given-names>
            <surname>Maltry</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jens</given-names>
            <surname>Dittrich</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>A Critical Analysis of Recursive Model Indexes</article-title>
          .
          <source>arXiv preprint arXiv:2106.16166</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Ryan</surname>
            <given-names>Marcus</given-names>
          </string-name>
          , Andreas Kipf, Alexander van Renen,
          <string-name>
            <surname>Mihail Stoian</surname>
            , Sanchit Misra, Alfons Kemper, Thomas Neumann, and
            <given-names>Tim</given-names>
          </string-name>
          <string-name>
            <surname>Kraska</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Benchmarking Learned Indexes</article-title>
          .
          <source>PVLDB 14</source>
          ,
          <issue>1</issue>
          (
          <year>2020</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Mihail</surname>
            <given-names>Stoian</given-names>
          </string-name>
          , Andreas Kipf, Ryan Marcus, and
          <string-name>
            <given-names>Tim</given-names>
            <surname>Kraska</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>PLEX: Towards Practical Learned Indexing</article-title>
          .
          <source>In International Workshop on Applied AI for Database Systems and Applications (AIDB).</source>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Ji</surname>
            <given-names>Sun</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Guoliang</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>and Nan</given-names>
            <surname>Tang</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>Learned Cardinality Estimation for Similarity Queries</article-title>
          .
          <source>In SIGMOD</source>
          .
          <volume>1745</volume>
          -
          <fpage>1757</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Yaoshu</surname>
            <given-names>Wang</given-names>
          </string-name>
          , Chuan Xiao, Jianbin Qin, Rui Mao, Makoto Onizuka,
          <string-name>
            <surname>Wei</surname>
            <given-names>Wang</given-names>
          </string-name>
          , Rui Zhang, and
          <string-name>
            <given-names>Yoshiharu</given-names>
            <surname>Ishikawa</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>Consistent and flexible selectivity estimation for high-dimensional data</article-title>
          .
          <source>In SIGMOD</source>
          .
          <volume>2319</volume>
          -
          <fpage>2327</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Edward</surname>
            <given-names>J</given-names>
          </string-name>
          <string-name>
            <surname>Wegman</surname>
          </string-name>
          .
          <year>1969</year>
          .
          <article-title>Nonparametric probability density estimation</article-title>
          .
          <source>Technical Report</source>
          . North Carolina State University. Dept. of Statistics.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Peizhi</given-names>
            <surname>Wu</surname>
          </string-name>
          and
          <string-name>
            <given-names>Gao</given-names>
            <surname>Cong</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>A Unified Deep Model of Learning from Both Data and Queries for Cardinality Estimation</article-title>
          . In SIGMOD.
          <year>2009</year>
          -
          <fpage>2022</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Zongheng</surname>
            <given-names>Yang</given-names>
          </string-name>
          , Eric Liang, Amog Kamsetty, Chenggang Wu, Yan Duan, Xi Chen, Pieter Abbeel,
          <string-name>
            <surname>Joseph M. Hellerstein</surname>
            , Sanjay Krishnan, and
            <given-names>Ion</given-names>
          </string-name>
          <string-name>
            <surname>Stoica</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Deep Unsupervised Cardinality Estimation</article-title>
          .
          <source>PVLDB 13</source>
          ,
          <issue>3</issue>
          (
          <year>2019</year>
          ),
          <fpage>279</fpage>
          -
          <lpage>292</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Xiaohui</given-names>
            <surname>Yu</surname>
          </string-name>
          and
          <string-name>
            <given-names>Ada</given-names>
            <surname>Fu</surname>
          </string-name>
          .
          <year>2001</year>
          .
          <article-title>Piecewise Linear Histograms for Selectivity Estimation</article-title>
          .
          <source>In International Symposium on Information Systems and Engineering (ISE)</source>
          .
          <volume>319</volume>
          -
          <fpage>326</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>