Position Caching in a Column-Store with Late Materialization: An Initial Study

Viacheslav Galaktionov 1, 2, Evgeniy Klyuchikov 1, 2, George Chernishev 1, 2, 3
1 Information Systems Engineering Lab, JetBrains Research
2 Saint Petersburg State University, Russia
3 National Research University Higher School of Economics

{viacheslav.galaktionov,evgeniy.klyuchikov,chernishev}@gmail.com

ABSTRACT

A common technique to speed up DBMS query processing is to cache parts of query results and reuse them later. In this paper we propose a novel approach aimed specifically at caching intermediates in a late-materialization-oriented column-store.

The idea of our approach is to cache positions (row numbers) instead of data values. The small size of the positional representation is a valuable advantage: the cache can accommodate more entries and consider intermediates that involve "heavy" operators, e.g. joins of large tables.

Position caching thrives in late materialization environments, since position exchange is prevalent in them. In particular, expensive predicates and heavy joins are usually processed based on positions. Our approach is able to cache them efficiently, thus significantly reducing system load.

To assess the importance of intermediates, our position caching technique features a cost model based on usage statistics and complexity estimations. Furthermore, to allow intermediate reuse for queries that are not fully identical, we propose an efficient query containment checking algorithm. Several policies for cache population and eviction are proposed. Finally, our approach is enhanced with lightweight compression schemes.

Experimental evaluation was performed using a stream of randomly generated Star-Schema-Benchmark-like queries. It showed up to 3 times improvement in query run times. Additionally, compressing the intermediates reduces the space requirements by up to 2 times without a noticeable performance overhead.

© Copyright 2020 for this paper held by its author(s). Published in the proceedings of DOLAP 2020 (March 30, 2020, Copenhagen, Denmark, co-located with EDBT/ICDT 2020) on CEUR-WS.org. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 INTRODUCTION

Late materialization [1] is a query processing technique that aims to operate on positions as long as possible. Essentially, it defers tuple reconstruction to later stages of a query plan. The goal is to conserve disk bandwidth by: 1) reading individual columns only when they are necessary for processing at a given point, and 2) organizing column access in such an order that subsequent reads require fewer disk accesses due to data filtering.

This approach leads to unconventional query plans with novel opportunities [10] for efficient query processing. In such query plans, operators: 1) usually ingest only the necessary columns, and 2) exchange not only data, but also positions. Positions are essentially row ids which will later be used to request data. An example of late materialization processing is presented in Figure 1.

[Figure 1: Late materialization example, adapted from [17]. The plan evaluates SELECT R1.B, R1.C, R2.E, R2.H, R3.F FROM R1, R2, R3 WHERE R1.A = R2.D AND R2.G = R3.K: the joins A=D and G=K operate on positions, and the projected columns are read late.]

Here, for column G, the query processor reads only the values that have passed the first join predicate. For columns E and H it is even better: the query processor has to read only the values that have passed both join predicates. Of course, the benefit of such a processing scheme depends on many parameters: join filtering power, data distribution, and disk parameters such as seek times and bandwidth.

Late materialization has formed the basis of several column-oriented DBMSes [1]. Many of its aspects, such as the design of LM-enabled operators [3, 17] and query processing schemes [6, 11], have been extensively studied. However, some aspects have received less attention.

In this paper we further investigate one of them: intermediate result caching. Unlike many existing caching techniques, our approach does not cache the data itself. Instead, it aims to cache useful data in positional form, which allows us to:
   (1) reduce the memory footprint, since table row data, which can be arbitrarily large, is replaced with a single position (a 32-bit number);
   (2) use the memory thus saved to store intermediates that are closer to the bottom of the plan. These are more costly to compute, but, on the other hand, have a higher chance of reuse.

Our caching approach also employs query containment checks, which allow reusing intermediates from queries that are not fully identical, where one of them subsumes the other.

To demonstrate the viability of the proposed position caching, we have implemented it inside PosDB, a distributed disk-based column-store with late materialization. Since PosDB currently lacks a full-fledged cost-based optimizer, we had to devise a number of simpler heuristic algorithms. While our approach is simpler than existing methods, it still allows us to obtain considerable performance gains.

Apart from that, we have also implemented compression for cached positions. Keeping only sequences of integers favours compression, which allows us to reduce the memory footprint even further. For now, we have implemented only delta encoding and null suppression, since our experiments indicated that these methods result in good space savings with a small overhead.
Overall, the contributions of this paper are the following:
   (1) A novel approach to caching intermediates in a column-store that caches positions instead of data.
   (2) A study of the applicability of compression to caching inside a column-store.
   (3) An experimental evaluation of several policies for cache population and eviction.

2 RELATED WORK

Caching intermediates. The idea of reusing intermediate results for query evaluation in databases has existed for a long time. It comes in two formulations. In the first one, there is a number of simultaneously running queries, and the goal is to generate query plans that maximize result sharing. This is called multi-query optimization [15, 16]. The second option is caching: parts of query plans that were evaluated in the past are stored and reused when an appropriate query arrives. In this paper we consider the second option.

The next two papers study caching of intermediates inside the MonetDB [11] system family, an in-memory column-store with late materialization. In reference [12] the authors propose to exploit MonetDB's operator-at-a-time processing model. The idea is to cache and reuse the BAT by-products of query execution. The proposed approach featured plan subsumption and both inter- and intra-query reuse. The study [13] addresses the case of pipelined operator evaluation inside the Vectorwise database engine.

In reference [9] another mechanism for in-memory result caching was proposed. The idea is to cache internal data structures of a database engine, in particular hash tables produced by hash-joins and hash-aggregations, in order to lower memory pressure and preserve cache and register locality.

Disk-based caching was considered in reference [4]. Apart from being able to cache data, this system can also cache positions. The authors also proposed a cost-based optimizer to find the best plan. Experiments showed that position caching performs better than data caching when cache entries are stored on disk.

Caching and compression. Compression has been used in databases for a long time [19]; however, it is in column-stores that it brought significant benefits. Compression in column-stores differs from row-stores in two ways: 1) columnar storage results in data homogeneity, which in turn leads to better compression rates, and 2) compression should be lightweight, i.e. it should not excessively load the CPU.

We reference the two papers which we believe to be the most important; many others can be found. The authors of reference [2] studied several lightweight compression algorithms in a disk-based column-store. They compared their behavior in the case of eager decompression and in the case when the query engine operates on compressed data directly. They also presented a decision tree for selecting a compression method depending on data characteristics. Compression for in-memory column-stores was studied in reference [20]. The authors proposed and evaluated three different compression schemes which aimed to use CPU resources efficiently.

To the best of our knowledge, no paper has considered compression in the context of caching intermediates in column-stores.

3 PROPOSED APPROACH

Preliminaries and prerequisites. A common workflow of a DBMS caching subsystem is as follows. When a query arrives, its plan is generated and inspected for subtrees that a) can be reused later, or b) can reuse the current cache. If any are found, they interact with the cache during query evaluation by either a) putting the resulting data into the cache or b) reusing cached data instead of evaluating the subtree.

In this study we assume that the target DBMS represents query plans as trees of operators, each of which returns either values or positions, exclusively. As we target late materialization, most of the filtering and join operators are positional and (due to OLAP specifics) tend to be the most costly parts of a query plan.

Our caching approach intentionally targets only positional data. As stated in the introduction, "positional" cache entries are lightweight and more susceptible to compression. Therefore, "heavy" operators from the bottom of the tree (usually costly filters and joins) can be cached with more ease, and a significant decrease in system load can be achieved.

An essential drawback of position-only caching is the repeated reading of data values. However, the disk subsystem decreases this overhead: caching is usually applied when the queries are somewhat similar in terms of referenced disk pages, so the pages necessary for materialization are often already present in the buffer manager.

Caching: general overview. Our caching algorithm consists of pre- and post-processing of query plans. The former adds special operators to gather statistical information and substitutes subtrees with existing cache entries. The latter collects the final statistical information and updates the cache if needed.

The cache consists of a query history and a set of cache entries (each being a list of position blocks). The query history is a list of subtrees of the N most recently executed queries, together with the subtrees that are still cached. Each history entry contains a pointer to the corresponding cache entry (if any), a high-level description of the subtree, the size of its result, and a usage statistic. A usage statistic is the set of queries the entry can potentially be useful for. All this information is used in the cost models of the cache population and eviction policies.

There might not be a perfectly matching cache entry for a subtree, but at the same time there may be an entry that provides a larger, subsuming result. This is determined with a containment check. If the check passes, the original subtree is replaced by this cache entry with a special filtering operator on top.

Caching: preprocessing. Preprocessing is the most sophisticated part, see Figure 2. The algorithm iterates through the query plan and for each subtree S: a) checks whether its result fits into the cache, b) adds a special operator PBCounter on top of it, c) updates the cache history, and d) tries to replace S with a (possibly filtered) cache entry.

The first two steps are related to the size estimation of intermediates. An auxiliary PBCounter is used to count the number of blocks passed during query evaluation. This number is then used to: a) update the benefit measure of S (described later), and b) filter out future subtrees that are guaranteed not to fit into the cache.

After size estimation has been taken care of, we add a history entry for S and update the usage statistics of other entries that can (potentially) be used by S. Usage statistics, in their turn, participate in the cost model of the population and eviction policies. Based on them, we try to admit the best yet-uncached entry into the cache.

Finally, after the cache has been adjusted, we search for a cache entry that can replace S with minimal additional filtering. If one is found, we use a special operator FetchCache instead of evaluating S directly. This operator passes position blocks along, decompressing them if needed.
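The following minimal C++ sketch summarizes this preprocessing loop. All types and names here (PlanNode, Cache, wrapWithPBCounter, fetchCache, filterOnTop) are hypothetical stand-ins for illustration, not PosDB's actual interfaces, and the cache methods are stubbed so that the control flow is the focus.

#include <memory>
#include <optional>

// Hypothetical stand-ins for the operator tree and cache interfaces.
struct PlanNode { /* operator type, children, output description, ... */ };
using NodePtr = std::shared_ptr<PlanNode>;

struct Cache {
    bool mayFit(const NodePtr&) { return true; }   // size pre-check
    void recordInHistory(const NodePtr&) {}        // history + usage statistics
    void admitBestUncachedEntry() {}               // population policy decision
    std::optional<NodePtr> bestContainingEntry(const NodePtr&) { return {}; }
};

NodePtr wrapWithPBCounter(NodePtr s) { return s; } // counts passed blocks
NodePtr fetchCache(NodePtr t)        { return t; } // replays cached blocks
NodePtr filterOnTop(NodePtr c)       { return c; } // extra filter if subsuming

// Preprocessing of a single subtree S; the planner applies it to every
// subtree of the query plan.
NodePtr preprocess(NodePtr s, Cache& cache) {
    if (cache.mayFit(s))              // (a) result size allows caching?
        s = wrapWithPBCounter(s);     // (b) count blocks during execution
    cache.recordInHistory(s);         // (c) update history, containment and
    cache.admitBestUncachedEntry();   //     usage statistics; maybe admit
    if (auto t = cache.bestContainingEntry(s))  // (d) reuse a cached entry,
        return filterOnTop(fetchCache(*t));     //     with filtering if needed
    return s;
}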
[Figure 2: PosDB caching workflow: preprocessing. For each subtree S, the loop checks whether the result size allows caching, wraps S with PBCounter, saves it in the cache history, updates containment and usability statistics, admits the best yet-uncached entry UT if the cache policies require it, and replaces S with FetchCache(T) or Filter(FetchCache(T)) for the best fitting cached intermediate T.]
Caching: postprocessing. This stage has two goals: to gather the results of every PBCounter that is part of the current query's operator tree, and to calculate the benefit estimates for each intermediate.

Policies for cache population and eviction. We have considered several different policies for adding and removing intermediates. We assume that for each history entry a numeric benefit can be calculated. We also define the total benefit as the product of the benefit and the usage frequency (the number of queries in the entry's usage statistics).
For cache population we have considered:
   • MIN: choose an uncached intermediate whose total benefit is larger than the minimum among the cached intermediates.
   • AVG: choose an uncached intermediate whose total benefit is larger than the average among the cached intermediates.

Cache eviction has a larger number of options:
   • LRU (Least Recently Used): evict the intermediates that were not used for the longest time.
   • LFU (Least Frequently Used): evict the intermediates with the shortest history of usage.
   • LCS (Largest Cache Space): evict the intermediates that occupy the largest amount of space.
   • BENEFIT: evict the entries that have the lowest benefit.
   • BENEFITHIST: evict the entries that have the lowest total benefit.
   • RANDOM: evict random intermediates.
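Victim selection under these policies (except RANDOM) reduces to taking the minimum under a per-policy ordering. A self-contained C++ sketch with illustrative field names (ours, not PosDB's):

#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative per-entry statistics that the policies consult.
struct CacheEntry {
    int64_t lastUsed;     // logical time of the last use       (LRU)
    int64_t useCount;     // how many times the entry was used  (LFU)
    int64_t sizeBytes;    // occupied cache space               (LCS)
    double  benefit;      // cost-model benefit                 (BENEFIT)
    double  totalBenefit; // benefit * usage frequency          (BENEFITHIST)
};

enum class Eviction { LRU, LFU, LCS, BENEFIT, BENEFITHIST };

// Index of the victim among a non-empty set of entries: the "minimum"
// under the per-policy ordering (RANDOM would just pick a random index).
std::size_t pickVictim(const std::vector<CacheEntry>& es, Eviction p) {
    auto moreEvictable = [p](const CacheEntry& a, const CacheEntry& b) {
        switch (p) {
            case Eviction::LRU:         return a.lastUsed     < b.lastUsed;
            case Eviction::LFU:         return a.useCount     < b.useCount;
            case Eviction::LCS:         return a.sizeBytes    > b.sizeBytes;
            case Eviction::BENEFIT:     return a.benefit      < b.benefit;
            case Eviction::BENEFITHIST: return a.totalBenefit < b.totalBenefit;
        }
        return false;
    };
    return static_cast<std::size_t>(
        std::min_element(es.begin(), es.end(), moreEvictable) - es.begin());
}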
Cost model. The benefit of a query history entry is defined as a function of:
   • Subtree complexity: a weighted sum over the subtree's operators. Each operator type corresponds to a constant weight, which is a model parameter;
   • Working set complexity: an estimate of the amount of data the query executor requests from the storage system to process the subtree;
   • Actual result size: computed by the corresponding PBCounter operator;
   • Usage statistics: a list of unique query numbers that could (or did) use it.

Note that we rely on two kinds of complexity: subtree and working set complexity. While the former favours "structurally complex" subtrees, the latter favours operations on larger amounts of data. Together they allow us to distinguish simple queries on large data from complex queries on small data.
Overall, the following formula is used to calculate the benefit:

    Benefit(I) = Complexity(I) × EstimatedSize(I) / ActualSize(I).
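As a worked example: an intermediate with Complexity = 12, EstimatedSize = 500 blocks, and ActualSize = 250 blocks has a benefit of 12 × 500 / 250 = 24; if three queries could use it, its total benefit is 72. The C++ sketch below makes this concrete; field names are illustrative, and how the subtree and working set complexities are folded into a single Complexity(I) value is a model parameter choice the paper leaves abstract.

#include <vector>

// Illustrative history-entry fields backing the cost model.
struct HistoryEntry {
    double complexity;           // Complexity(I), from operator weights
    double estimatedSize;        // EstimatedSize(I), in position blocks
    double actualSize;           // ActualSize(I), measured by PBCounter
    std::vector<int> usageStats; // ids of queries that could (or did) use it
};

double benefit(const HistoryEntry& e) {
    return e.complexity * e.estimatedSize / e.actualSize;
}

// Total benefit, used by BENEFITHIST and by the population policies.
double totalBenefit(const HistoryEntry& e) {
    return benefit(e) * static_cast<double>(e.usageStats.size());
}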
Query containment. Implementing a caching subsystem implies solving the query containment problem. It arises when it is necessary to determine which cache entry corresponds to a newly arrived query. Given the sheer expressive power of SQL, it is hard to check even the simple equivalence of two queries. However, in order to build an efficient caching subsystem, it is essential to find cached entries that can be used as base tables to evaluate the given query. For example, consider two queries:
   • SELECT * FROM T1 WHERE T1.A > 100
   • SELECT * FROM T1 WHERE T1.A > 110
If the first query is cached, then the second one can use the stored answer with additional filtering. Depending on the predicate selectivity and hardware parameters, this approach may result in savings.

It is well known that, in the general case, checking query containment is computationally demanding. For example, it was shown [8] that the problem of conjunctive query containment is NP-complete. Therefore, we had to restrict the admissible query class to "mostly" conjunctive SPJ (Select-Project-Join) queries. "Mostly" means that OR can appear in the WHERE clause only to list possible individual values of an attribute: e.g. T1.A = 5 OR T1.A = 6 OR T1.A = 7 is allowed, while T1.A < 5 OR T1.A = 10 is not. The resulting class is limited, but large enough to cover all queries from the Star Schema Benchmark (SSB). Note that, since only positional data is involved, there is no need to check containment of aggregation or sorting.

In order to perform containment checks we have designed a special data structure to represent query metadata. Essentially, it records: 1) the set of attribute fragments covered, 2) all conducted joins, and 3) all predicates that were used for filtering. Obtaining such a structure is just a matter of traversing the corresponding operator tree.

Now, in order to perform a containment check given two such structures, all that is required is to make sure that they share their sets of attribute fragments and joins, and that every predicate from the first structure is implied by some predicate from the second. If that condition holds, then the first structure describes a containing query, whose cached result can be filtered down to answer the second.
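A self-contained C++ sketch of this check, under simplifying assumptions (single-attribute comparison predicates, joins as attribute pairs; the names and representations are ours, not PosDB's):

#include <algorithm>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Simplified query metadata, mirroring the structure described above.
struct Predicate { std::string attr; std::string op; double value; };

struct QueryMeta {
    std::set<std::string> fragments;                     // attribute fragments
    std::set<std::pair<std::string, std::string>> joins; // conducted joins
    std::vector<Predicate> predicates;                   // filtering predicates
};

// Does p filter at least as strongly as q (p implies q)? Only a few
// operator pairs are handled in this sketch.
bool implies(const Predicate& p, const Predicate& q) {
    if (p.attr != q.attr) return false;
    if (p.op == ">" && q.op == ">") return p.value >= q.value;
    if (p.op == "<" && q.op == "<") return p.value <= q.value;
    if (p.op == "=" && q.op == "=") return p.value == q.value;
    return false;
}

// True if the cached entry subsumes the new query: same fragments and joins,
// and every cached predicate is implied by some predicate of the new query.
bool contains(const QueryMeta& cached, const QueryMeta& query) {
    if (cached.fragments != query.fragments || cached.joins != query.joins)
        return false;
    return std::all_of(cached.predicates.begin(), cached.predicates.end(),
        [&](const Predicate& cp) {
            return std::any_of(query.predicates.begin(), query.predicates.end(),
                [&](const Predicate& qp) { return implies(qp, cp); });
        });
}

For the pair of queries above, contains succeeds: T1.A > 110 implies T1.A > 100, so the cached result only needs an extra filter on top.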
Compression. Cached position blocks appear to be a good target for compression. First, the columnar representation guarantees data homogeneity, and positions are just fixed-size integers. Secondly, the structure of a query plan explicitly restricts the possible structure of a position block. For example, large tables like SSB LINEORDER are usually processed in a way that preserves their initial order; therefore, the corresponding position stream
is ordered and relatively dense. On the contrary, small tables which participate in joins often produce sparse and unordered sequences of positions.

Turning to concrete compression strategies, we have tried different encoding methods and found delta encoding and null suppression to be the most appropriate. The former narrows down the range of absolute values inside each position block, especially when they are in ascending or descending order. The latter then transforms 32-bit (signed or unsigned) numbers into 16-bit and 8-bit ones. Thus, up to four times space compression can be achieved with the almost negligible overhead of two for loops.

Two other apparently promising compression strategies are RLE and Range. However, experiments demonstrated their limited use in practice: after the first filtering or join operator, consecutive positions usually differ, or repeat just a little; in the latter case delta + null suppression give a comparable or better result.
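A self-contained C++ sketch of such a codec is shown below. For clarity it stores a one-byte width tag per value; a real implementation would amortize the tags (e.g. choose one width per block), which is what makes the up-to-4x ratio attainable.

#include <cstdint>
#include <vector>

// Encode: replace each position with its delta from the previous one, then
// store each delta in the narrowest of 1/2/4 bytes behind a 1-byte width tag.
std::vector<uint8_t> compress(const std::vector<uint32_t>& positions) {
    std::vector<uint8_t> out;
    uint32_t prev = 0;
    for (uint32_t pos : positions) {
        uint32_t delta = pos - prev;   // delta encoding (the first loop)
        prev = pos;
        uint8_t width = delta <= 0xFF ? 1 : delta <= 0xFFFF ? 2 : 4;
        out.push_back(width);          // null suppression: drop zero bytes
        for (uint8_t i = 0; i < width; ++i)
            out.push_back(static_cast<uint8_t>(delta >> (8 * i)));
    }
    return out;
}

std::vector<uint32_t> decompress(const std::vector<uint8_t>& in) {
    std::vector<uint32_t> positions;
    uint32_t prev = 0;
    for (std::size_t i = 0; i < in.size();) {
        uint8_t width = in[i++];
        uint32_t delta = 0;
        for (uint8_t b = 0; b < width; ++b)  // the second loop: re-widen
            delta |= static_cast<uint32_t>(in[i++]) << (8 * b);
        prev += delta;                       // wraps correctly for any stream
        positions.push_back(prev);
    }
    return positions;
}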
We have also assessed and turned down some of the more complex compression methods, such as dictionary compression and LZW. They either require a significant time overhead or fail to provide noticeable space gains, encoding the numbers into almost as many numbers.

4 POSDB
To evaluate our approach we have implemented it inside PosDB, a distributed column-store with a processing model that adheres to late materialization. Each PosDB query plan consists of two parts: position-based and tuple-based (see Figure 3b). In the former (the bottom one), operators exchange only positions, represented as a generalized join index [18] (Figure 3a). In our system this data structure is used to represent the results of filter and join operators. For a filter, it is just a list of positions (row numbers) that satisfy the predicate. The same structure can describe the results of an arbitrary number of joins: consider the join index presented in Figure 3a. Here, the second row of T1 was joined with the first row of T2, and then the resulting tuple was joined with the second row of T3.

[Figure 3: PosDB internals. (a) Example of a join index over tables T1, T2, T3 with entries (1, 2, 3), (2, 1, 2), (3, 3, 3). (b) Example query plan for SELECT sum(lo_revenue), d_year, p_brand1 FROM lineorder, date, part, supplier WHERE lo_orderdate = d_datekey AND lo_partkey = p_partkey AND lo_suppkey = s_suppkey AND p_category = 'MFGR#12' AND s_region = 'AMERICA' GROUP BY d_year, p_brand1 ORDER BY d_year, p_brand1: positional joins, filters, and data sources (DS) at the bottom, tuple-based Aggregate and Sort at the top, with readers providing column values via local or remote access.]
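A minimal sketch of such a generalized join index, populated with the example of Figure 3a (the layout is our guess for illustration; PosDB's actual representation may differ):

#include <cstdint>
#include <vector>

// A generalized join index [18]: one column of row numbers per participating
// table; entry i ties together the positions forming the i-th result tuple.
struct JoinIndex {
    std::vector<std::vector<uint32_t>> columns; // columns[t][i]: row of table t
};

int main() {
    // The join index of Figure 3a. Entry 1 (the middle one) records that the
    // second row of T1 joins the first row of T2 and the second row of T3.
    JoinIndex ji;
    ji.columns = {
        {1, 2, 3}, // T1
        {2, 1, 3}, // T2
        {3, 2, 3}, // T3
    };
    // For a filter, the index degenerates to a single column: a list of the
    // row numbers that satisfy the predicate.
}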
The upper part of any plan consists of operators that work with tuples, similarly to classic row-stores. The lowest of these operators is responsible for materialization, i.e. it transforms join indexes into tuples of actual values. Such materializing operators perform, for example, aggregation and window functions.

Often, positional operators (e.g. Join, Filter) require data values in addition to positions. To provide such operators with data, a set of auxiliary entities called readers was introduced. For example, ColumnReader retrieves the values of a specific attribute, while SyncReader provides the values of several attributes synchronously.

PosDB is both a distributed and a parallel column-store. It is distributed in terms of both data and query execution: a table may be fragmented and replicated across several nodes. A number of table-level fragmentation methods is supported: round-robin, hash, and range partitioning. Query distribution allows running a query on different nodes, possibly with individual parts of the query plan residing on distinct nodes. It is implemented by two pairs of special operators: {SendPos, ReceivePos} and {SendTuple, ReceiveTuple}. Therefore, both positional and tuple operators can be executed on arbitrary nodes, regardless of where their children reside.

Both inter- and intra-query parallelism [14] are supported. To implement intra-query parallelism, two special operators were created: Asynchronizer executes an operator tree in a separate thread, and UnionAll collects data from several subtrees that are executed in their own threads.

Recently [5], fully distributed operators such as distributed join and aggregation were added. A detailed description of the PosDB architecture can be found in papers [5, 7].
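The pattern can be illustrated as follows (purely illustrative, not PosDB code): Asynchronizer is modeled by launching each subtree on its own thread, and UnionAll drains the results. A real implementation would stream blocks as they are produced rather than collecting them wholesale.

#include <cstdint>
#include <future>
#include <vector>

using PositionBlock = std::vector<uint32_t>;
using Blocks = std::vector<PositionBlock>;

// Stand-in for evaluating one operator subtree to position blocks.
Blocks evaluateSubtree(int subtreeId) { return {{uint32_t(subtreeId)}}; }

// UnionAll collects the outputs of subtrees that Asynchronizer-style
// wrappers run in their own threads (modeled here with std::async).
Blocks unionAll(int numSubtrees) {
    std::vector<std::future<Blocks>> parts;
    for (int i = 0; i < numSubtrees; ++i)
        parts.push_back(std::async(std::launch::async, evaluateSubtree, i));
    Blocks all;
    for (auto& f : parts) {
        Blocks b = f.get();                        // blocks arrive per subtree
        all.insert(all.end(), b.begin(), b.end()); // merge in arrival order
    }
    return all;
}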
5 EXPERIMENTAL EVALUATION

We have evaluated the proposed approach with two experiments, designed to measure: a) the impact of different eviction policies on query execution performance, and b) the impact of compression on the amount of space used for storing intermediates.

We fixed the size of the cache history to the 50 last queries and used only the MIN population policy. The cache size was varied from 2 to 32 thousand position blocks, each of them being 32 KB large when uncompressed.

Finally, we relied on SSB with scale factor 10 and conducted a series of runs, each consisting of a thousand random "SSB-like" queries. These queries differ from the usual SSB ones only in the way the filtering predicates are generated: in addition to randomly choosing the boundary constants, we also randomly select the comparison operator. For example, DATE.D_YEAR = 1993 can become DATE.D_YEAR < 1995.
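For illustration, such a predicate could be produced as follows (a hypothetical generator; the paper does not publish its actual query generator):

#include <iostream>
#include <random>
#include <string>

// Generate one randomized SSB-like predicate on DATE.D_YEAR: both the
// comparison operator and the boundary constant are drawn at random.
std::string randomYearPredicate(std::mt19937& gen) {
    static const char* ops[] = {"=", "<", ">", "<=", ">="};
    std::uniform_int_distribution<int> op(0, 4);
    std::uniform_int_distribution<int> year(1992, 1998); // SSB DATE year range
    return "DATE.D_YEAR " + std::string(ops[op(gen)]) + " "
         + std::to_string(year(gen));
}

int main() {
    std::mt19937 gen(42);
    std::cout << randomYearPredicate(gen) << "\n"; // e.g. "DATE.D_YEAR < 1995"
}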
[Figure 4: Query execution performance. Execution time (s) vs. cache size (upper axis in blocks, 0 to 35000; lower axis in MB, 156 to 1094) for the LRU, LFU, LCS, RANDOM, BENEFIT, and BENEFITHIST eviction policies.]

[Figure 5: Memory usage before and after compression. Used cache space (in units of 32 KB blocks) vs. cache size (same axes as Figure 4) for the BENEFIT, LRU, and RANDOM policies with compression, against the uncompressed baseline.]
Experimental environment. We used a laptop with an Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz; 8 GB RAM; an HDD Samsung ST1000LM024, 5400 rpm; Funtoo Linux, kernel version 5.2.7; gcc 9.2.0.

Query execution performance. First, we measured the performance of query execution under different eviction policies. The results are given in Figure 4 and show that with smaller cache sizes, the choice of strategy had no significant impact on query performance. With larger cache sizes, however, BENEFIT turned out to be the best and RANDOM the worst.

Another interesting outcome is that BENEFITHIST, which took history into account, resulted in worse performance than BENEFIT, which did not. We suppose that relying on the usage frequency alone delays the eviction of already-cold cache entries (which were hot some time ago). We believe the situation could be improved by also considering the times of the several last uses.

Space consumption due to compression. Next, we enabled compression for intermediates and ran the same experiment using only the BENEFIT, LRU, and RANDOM eviction policies. Figure 5 shows the maximum amount of main memory used to store the intermediates during a particular run. The values were initially measured in bytes and then divided by the standard size of a position block (32 KB).

It is worth noting that enabling compression had no noticeable effect on query execution times. Therefore, for the sake of brevity, the corresponding graph has been omitted.
6 CONCLUSION AND FUTURE WORK

In this paper we have proposed a novel approach to in-memory positional caching inside a column-store with late materialization. Compared to data caching, positional caching allows the system to save space by storing only integers, which can be easily compressed. Experiments performed in PosDB demonstrated the viability of this approach. We have also evaluated a number of policies for cache population and eviction, including our own. Future work will include the evaluation of caching in a distributed environment.

REFERENCES
[1] Daniel Abadi, Peter Boncz, and Stavros Harizopoulos. 2013. The Design and Implementation of Modern Column-Oriented Database Systems. Now Publishers Inc., Hanover, MA, USA.
[2] Daniel Abadi, Samuel Madden, and Miguel Ferreira. 2006. Integrating Compression and Execution in Column-oriented Database Systems. In SIGMOD '06.
[3] D. J. Abadi, D. S. Myers, D. J. DeWitt, and S. R. Madden. 2007. Materialization Strategies in a Column-Oriented DBMS. In ICDE '07. 466–475.
[4] Chungmin Melvin Chen and Nicholas Roussopoulos. 1994. The Implementation and Performance Evaluation of the ADMS Query Optimizer: Integrating Query Result Caching and Matching. In EDBT '94. 323–336.
[5] George Chernishev, Viacheslav Galaktionov, Valentin Grigorev, Evgeniy Klyuchikov, Evgeniy Slobodkin, and Kirill Smirnov. [n.d.]. PosDB 2019: A Distributed Disk-Based Column-Store with Late Materialization (submitted). Proc. VLDB Endow. (Oct. [n.d.]).
[6] G. A. Chernishev, V. A. Galaktionov, V. D. Grigorev, E. S. Klyuchikov, and K. K. Smirnov. 2018. PosDB: An Architecture Overview. Programming and Computer Software 44, 1 (Jan. 2018), 62–74.
[7] G. A. Chernishev, V. A. Galaktionov, V. D. Grigorev, E. S. Klyuchikov, and K. K. Smirnov. 2018. PosDB: An Architecture Overview. Programming and Computer Software 44, 1 (Jan. 2018), 62–74.
[8] Rada Chirkova. 2009. Query Containment. Springer US, 2249–2253.
[9] Kayhan Dursun, Carsten Binnig, Ugur Cetintemel, and Tim Kraska. 2017. Revisiting Reuse in Main Memory Database Systems. In SIGMOD '17.
[10] Stavros Harizopoulos, Daniel Abadi, and Peter Boncz. 2009. Column-Oriented Database Systems. VLDB 2009 Tutorial. nms.csail.mit.edu/~stavros/pubs/tutorial2009-column_stores.pdf
[11] Stratos Idreos, Fabian Groffen, Niels Nes, Stefan Manegold, K. Sjoerd Mullender, and Martin L. Kersten. 2012. MonetDB: Two Decades of Research in Column-oriented Database Architectures. IEEE Data Eng. Bull. 35, 1 (2012).
[12] Milena G. Ivanova, Martin L. Kersten, Niels J. Nes, and Romulo A. P. Gonçalves. 2010. An Architecture for Recycling Intermediates in a Column-store. ACM Trans. Database Syst. 35, 4, Article 24 (Oct. 2010).
[13] Fabian Nagel, Peter Boncz, and Stratis D. Viglas. 2013. Recycling in Pipelined Query Evaluation. In ICDE '13. IEEE Computer Society, 338–349.
[14] M. Tamer Ozsu. 2007. Principles of Distributed Database Systems (3rd ed.). Prentice Hall Press, Upper Saddle River, NJ, USA.
[15] Prasan Roy and S. Sudarshan. 2009. Multi-Query Optimization. Springer US, Boston, MA, 1849–1852.
[16] Timos K. Sellis. 1988. Multiple-query Optimization. ACM Trans. Database Syst. 13, 1 (March 1988), 23–52.
[17] Dimitris Tsirogiannis, Stavros Harizopoulos, Mehul A. Shah, Janet L. Wiener, and Goetz Graefe. 2009. Query Processing Techniques for Solid State Drives. In SIGMOD '09. Association for Computing Machinery, 59–72.
[18] Patrick Valduriez. 1987. Join Indices. ACM Trans. Database Syst. 12, 2 (June 1987), 218–246.
[19] Till Westmann, Donald Kossmann, Sven Helmer, and Guido Moerkotte. 2000. The Implementation and Performance of Compressed Databases. SIGMOD Rec. 29, 3 (Sept. 2000), 55–67.
[20] M. Zukowski, S. Heman, N. Nes, and P. Boncz. 2006. Super-Scalar RAM-CPU Cache Compression. In ICDE '06.