<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>PhD Workshop, August</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Spatio-Temporal Locality in Hash Tables</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Matt A. Pugh</string-name>
          <email>matt.pugh@ed.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Supervised by  Stratis Viglas</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Edinburgh 10 Crichton St. Scotland</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
      <volume>28</volume>
      <issue>2017</issue>
      <abstract>
        <p>The overall theme of this Ph.D. is looking at ways to use emerging NVM (Non-Volatile Memory) technologies in realworld data-science scenarios. It is hoped that the exploitation of the characteristics of the technology will result in performance improvements, de ned as being either/or an increase in computational throughput and energy-use reduction. Primarily, this has been through the inclusion of temporal locality into HopH (Hopscotch Hashing) by [2]. The problem of highly-skewed access patterns a ecting lookup time and required computation is shown through a simple model. A simulator is then used to measure the expected performance gains of incorporating temporal locality, given di erent HT (Hash Table) con gurations. This work was originally motivated by NVM, as a way to mask the extra latency anticipated, but the work is applicable to HTs in DRAM (Dynamic Random-Access Memory) also. The second area of interest in the Ph.D. is looking at exploiting the characteristics of NVM for di erent families of machine learning algorithms, though this paper focuses solely on the former.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>At this stage, the primary focus has been at introducing
spatial and temporal locality into HopH { taking the concept
of minimising amortised average lookup time into HopH will
provide spatio-temporal locality with an amortised
complexity of O(1) for the typical HT operations; Get(), Insert(),
and Delete(). Details on the approaches undertaken are
provided in Section 3. This is motivated by skewed access
patterns, following Zipf's law, and high-performance
systems, such as RDBMS (Relational DataBase Management
System) join operators.</p>
      <p>In its simplest form, a Hash Table (HT) T is an M -bucket
long associative array-like data structure that maps a key k
of any hashable type to value v. A given bucket Bi in T
can have S slots for key/value tuples. The occupancy (or
load factor) of a HT is 0 L 11. The bene t of a HT is
that the associated value's position i in memory is calculated
directly using a hash function h(k), i = h(k) mod M .
1.1</p>
      <p>
        A ST (Splay Tree) is a self-adjusting binary search tree
introduced by [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] that has the property that the most
frequently accessed node is near the head of the tree. STs have
the amortised time-complexity for Get() and Insert()
operations of O(log n), and the goal of their invention was to
minimise average time of lookups in worst-case scenario in
real-time terms. Their solution to achieving this is to
minimise the number of node traversals by ensuring the
mostfrequently-accessed nodes are as close to the head as
possible, a property which is closely tied to temporal locality,
although not exactly the same thing. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] acknowledge that
the computational cost of ongoing reordering is high, and
may be problematic given the target application workload.
This is clear when considering the rotation of sub-trees in
a Splay() operation { there is no spatial locality, or
cachefriendliness, guaranteed by the structure of the tree in
memory, as such pointer-chasing in rotations is inevitable.
      </p>
      <p>
        In databases, [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] describe cracking; a method of self
organisation data by query workload. This process is tuned
for a RDBMS in which a column of interest ACRK is
retained as a copy, and is continuously reorganised based on
the queries touching it. This is analogous to data hotness
as we wish to view it, and can be considered a cache-tuning
problem, similar to a LRU (Least Recently Used) cache. As
ACRK is a column, they are able to e ectively partition the
data itself to improve probing response times signi cantly.
Partitioning within a HopH context is less intuitive as there
are, at any position, H overlapping neighbourhoods.
2.
      </p>
    </sec>
    <sec id="sec-2">
      <title>MODEL AND SIMULATOR</title>
      <p>The structure of a HT is entirely dependent on the order in
which keys were inserted into it. For example, the hottest
element of the neighbourhood, or even entire table, over
some given time period, may be in the last available bucket
in a neighbourhood, leading to multiple evaluations before
nding the tuple required.
2.1</p>
    </sec>
    <sec id="sec-3">
      <title>Model</title>
      <p>Assume a constructed Hash Table T , of which we have a
set of all keys K in T . Let X be the sequence of accesses,
1Assuming no chained-structures resulting in L &gt; 1 load
factor</p>
      <p>C C C δC C
Figure 1: This gure illustrates buckets (coloured blocks)
in neighbourhoods (colours) over some contiguous range of
buckets within the table T . For a given neighbourhood,
access patterns may naturally fall as the above distribution,
where C is the cache-line size and each blue block represents
a constant number of accesses to the associated key ki.
Suppose a request is made for the hottest item k4, there will be
3 + cache-invalidations before hitting the hottest data k4
if T [h(k4)] is not already cached. Reordering the data such
that k4 is in position 0, minimising cache-invalidations, is the
optimal distribution. Note that the distribution of colours
in T does not change.
such that every element x 2 X exists in K. We de ne a
simple cost model (T; X) that gives a unit cost of a bucket
traversal as , and a slot traversal and key comparison as
unit cost, where and are the number of respective
operations.</p>
      <p>(T; X) =
+
(1)</p>
      <p>It is simple to use this model with a su cient access-skew
in X to show good potential gains in the worst case of the
con guration of T . Consider a RO (Read-Only) T where
some key k has hashed into bucket Bi, but the only available
slot in the otherwise fully-occupied neighbourhood i is at
Bi+H 1, slot S. Assume that X is completely skewed, such
that there is only one k present, repeated A times. In this
case, the cost of accesses is:</p>
      <p>(T; X) = AH( ) + ASH( )</p>
      <p>In this example, it is clear to see that minimising the
values of = AH and/or = ASH are the only areas of
movement upon which we can optimise, if we assign some
real-world values, where S = 3; H = 64; = 15; = 15; A =
1 109, we obtain a cost of:
(T; X) ' 3:84
1012
When we assume that X0 the hottest element k0 in the
rst slot of the rst bucket in a neighbourhood, the cost is
naturally far lower:
(T; X0) = A( + ) ' 3:0
1010</p>
      <p>This is the lowest possible cost incurred for T and X0, and
is therefore referred to as the oracle. With this reordering,
99:22% of cycles would be saved versus T with X. This
approach has the problem that manually attempting to cover
a number of con gurations of T to nd the expected bene t
of rearrangement would be tedious. To deal with this, we
employ a simulator.
2.2</p>
    </sec>
    <sec id="sec-4">
      <title>Simulator</title>
      <p>
        The developed simulator provides the estimated
performance bene t of performing reordering for varying table
sizes, load factors and skews. This is done using a
number of discrete probability distributions to obtain di erent
con gurations of T , in order to explore the problem space
we are interested in:
1. The number of slots that each bucket will have
populated in the simulator's construction is sampled from
a Multinomial distribution, for bucket Bi this is &amp;i.
2. N = bS M Lc samples are drawn from a Zip an
distribution Z( ). These are the random hotness values
that are then inserted into each bucket Bi 2 T over &amp;i
slots in each bucket. This approach gives a load factor
L and skew over the data. T is then randomly
permuted to distribute the remaining (for L &lt; 1) empty
buckets throughout the table.
3. As the number of buckets occupied within a
neighbourhood i is not a constant, either b c or d e
buckets per neighbourhood are allocated with probability
= d e b c, where is the expected number of
elements in a neighbourhood, given in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. must be
within the [0; 1] interval, and is interpreted as the
probability that the neighbourhood contains the
upperbound of entries. At each neighbourhood root,
neighbourhood occupancy is sampled from a Binomial
distribution with probability .
4. Finding the locations within the neighbourhood for
occupancy should not be done linearly, and must be at
least partly stochastic to emulate the chronological
insertions over multiple neighbourhoods. Approaching
this in a linear manner would construct T such that
all insertions happened to populate the table in exactly
linear order; this behaviour is not realistic. Instead,
we randomly draw samples from a Poisson
distribution that has a mean = 2, in order that the mass of
the distribution is towards the beginning of the
neighbourhood, but may be further on.
2.2.1
      </p>
      <sec id="sec-4-1">
        <title>Output &amp; Discussion</title>
        <p>Table 1 shows the output obtained thus far from the
simulator, this shows that even on a coarse reordering policy,
we begin to approx a factor of 2 improvement in terms of
work performed. Performing ne reordering over the same
con guration invariably leads to better results than coarse.
A key caveat is that this metric concerns itself only with
the hottest element in the table, extensions are underway to
take a more wholistic view of potential performance gains.
The fact that as L ! 1, gains appear to disappear is entirely
expected. This is due to the fact that as a table becomes
full, it is to be expected that most neighbourhoods will only
have one bucket in T , therefore reordering is not possible.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>METHODS</title>
      <p>
        HopH is used as the basis for the solution. The argument
of reordering based on the hotness of the data itself given by
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] is highly aligned with the goals for this work. This work
di ers as the speci c objective is achieving spatio-temporal
locality, in order to minimise average lookup times. The
value of S is selected such that a bucket Bi ts within a
cache-line size C = 64B. In order that the size of the bucket
jBij C, we choose S, where the size of a tuple j j =
8 + 8 = 16B, and is the size of any required meta-data,
to be S = j C k. Di erent access patterns are simulated
j j
by drawing samples from a Zip an distribution Z, whose
parameter a ects the skew of the samples obtained.
3.1
      </p>
    </sec>
    <sec id="sec-6">
      <title>Reordering Strategies</title>
      <p>This section describes a number of re-ordering strategies
for the placement of data in a neighbourhood. These
methods look at coarse, down to ne tuple-level, and heuristic
reordering operations.
3.1.1</p>
      <sec id="sec-6-1">
        <title>Fine - Intra-Bucket (FIntrB)</title>
        <p>This method simply sorts the tuples within a bucket Bi
(that must be of the same neighbourhood) by hotness. This
is performed using a priority queue, inserting the tuples
and ordering by their number of accesses, before
reinserting them into Bi. As we know that jBij C, there are no
further bucket traversals required, and any potential gains
are purely in terms of operations performed within Bi for a
Contains() or Get() and, as such, will be minimal.
3.1.2</p>
      </sec>
      <sec id="sec-6-2">
        <title>Coarse Inter-Bucket (CIB)</title>
        <p>We can express the overall hotness of a bucket Bi by
the summation of accesses to all tuples contained within
it. With this method, we do not care about the order of the
tuples within slots, but simply that the most-frequently-hit
bucket is closest to the neighbourhood root. A clear
compromise of this strategy is that it does not care about the
distribution of accesses within Bi; in the example where
S = 3 and Bi has one element with many accesses, but
two without, and Bj has a uniform distribution whose total
(summed) access is more than Bi, Bj will be promoted rst.
3.1.3</p>
      </sec>
      <sec id="sec-6-3">
        <title>Fine Inter-Bucket (FIB)</title>
        <p>By far the most expensive operation, this seeks to
redistribute the neighbourhood at a tuple-level using a priority
queue to order tuples by hotness, before reinserting them
into all buckets within the neighbourhood. The positions of
buckets within the neighbourhood do not change. In terms
of memory use, this strategy is the most demanding.
Potentially, if every possible bucket in a neighbourhood i belongs
to that neighbourhood, there will be many elements to copy
and reinsert over H buckets. Although the memory traversal
will be sequential, this will involve H 1 cache-invalidations.
The trade-o for this cost is that we are guaranteed to have
all tuples in correct order, with none of the compromises of
FIB (Fine Inter-Bucket) or CIB (Coarse Inter-Bucket).
3.1.4</p>
      </sec>
      <sec id="sec-6-4">
        <title>Heuristic</title>
        <p>The simplest approach is a direct-swap heuristic, which
simply compares the current tuple i with the rst tuple of
the neighbourhood, r, if the i is the hotter of the two. This
approach should not have a high computational overhead,
as in traversing the neighbourhood to nd i, we already
stored a reference to r upon rst encountering it. If i is
hotter than r, swap them and their distribution entries.
3.1.5</p>
      </sec>
      <sec id="sec-6-5">
        <title>Approximate Sorting</title>
        <p>This approach exploits the spatial locality a orded by
HopH; we are guaranteed that all H buckets within
neighbourhood i are in the same, homogeneous region of
memory. Once these pages are in the cache-hierarchy, latencies
in accessing them are reduced and, depending on the cache
layer, very e cient. As the Get() or Contains()
operation traverses i, a Bubblesort-like operation can be applied
based on hotness.
3.2</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Epochs &amp; Squashing</title>
      <p>In order to be adaptive over time, there must be a series of
rules that govern how the hotness of data changes over times
and accesses. For approaches where there is an absolute
trigger for reordering, an epoch is de ned at a neighbourhood
level, and is its state before a reordering method is invoked.
For non-triggered reordering methods, there must be a
continual state of adaptation for the associated meta-data.</p>
      <p>Once tuples in i have been rearranged, and the numeric
values of tuple accesses have been reduced in some manner,
the hottest element in i should remain so (preservation of
hotness). For epoch-based approaches, this means the skew
and shape of the distribution should be roughly equal after
an epoch, but smaller in magnitude. For non-epoch-based
approaches, the distribution should be more uid,
dynamically changing all values in i regularly. Those tuples in i
that have not had many accesses should have historic
access counts reduced, until su cient epochs have passed that
they no longer have any hotness in epoch-based reordering
(cooling).
3.3</p>
    </sec>
    <sec id="sec-8">
      <title>Deletions</title>
      <p>Handling deletions in HotH (Hotscotch Hashing) should
follow two principles further to a baseline HopH deletion:
1. For per-slot level meta-data, the distribution entry
(number of reads) for the tuple to be deleted should
also be deleted from any per-bucket counter if present,
before being erased itself.
2. If any slot that is not the rst slot of a bucket contains
the tuple to be deleted, subsequent slots should be
shift towards the rst slot, to minimise unnecessary
traversal gaps in the bucket structure.</p>
      <p>After this, should the bucket be empty after tuple
deletion, the standard HopH process is followed.
3.4</p>
    </sec>
    <sec id="sec-9">
      <title>Invocation</title>
      <p>Finding the balance between the severity of the reordering
strategy employed, and how and when to trigger it are key
points in this work. We explore a number of invocation
strategies, and the various forms of meta-data required to
permit them.
3.4.1</p>
      <sec id="sec-9-1">
        <title>Trigger-based</title>
        <p>The simplest method of invocation is that of setting a
minimum number of accesses to a bucket or neighbourhood,
dependant upon the reordering strategy used, where a counter
= 0 is meta-data within the bucket structure. Upon every
Get() or Contains() call, is incremented and evaluated.
Once exceeds some instantiation-de ned value , the
reordering is invoked.
3.4.2</p>
      </sec>
      <sec id="sec-9-2">
        <title>Probabilistic-based</title>
        <p>Instead of storing any meta-data at a
bucket/neighbourhood level, we can simply state that with some probability
r = P(Perform Redistribution), we will perform a
reordering. A key point in using this method is to avoid paying
the cost of the given reordering strategy often; however the
generation and evaluation of PRN (Pseudo Random
Number)s is itself a non-zero cost. Conclusions drawn from
experimentation using a number of PRNG (Pseudo Random
Number Generator)s show that the xorshift a good
candidate. Further work is looking into simpler masking / shifting
of an integer value, as statistically-sound randomness isn't
mandatory.</p>
        <p>EXPERIMENTS</p>
      </sec>
    </sec>
    <sec id="sec-10">
      <title>DISCUSSION</title>
      <p>
        Implementations of all methods described have been
complete, with results expected soon. As the overarching theme
of this thesis is NVM, work will be conducted to exploit its
characteristics to aid HotH. An approach by [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] provides a
way for leaf nodes being stored in a DRAM/NVM hybrid
B+Tree, where the system prefers Contains() operations
to Get(), as keys are stored in quick DRAM and values in
NVM. Following a similar methodology for hybrid NVM /
DRAM should provide a fast key lookup for Contains(), as
we can now t many keys within C, but also gives slower
reordering due to the extra cost of moving around NVM
vs. DRAM. Unlike the problem solved by [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], we must still
ensure spatial and temporal locality. Intuitively, this could
mean analog structures of M elements in DRAM and NVM
with a translation function t(i; s) ! j that takes the bucket
and slot numbers i; s respectively, and maps it to a position
(o set) j in the NVM table.
      </p>
    </sec>
    <sec id="sec-11">
      <title>ACKNOWLEDGMENTS</title>
      <p>This work was supported in part by the EPSRC Centre
for Doctoral Training in Data Science, funded by the UK
Engineering and Physical Sciences Research Council (grant
EP/L016427/1) and the University of Edinburgh.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Albers</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Karpinski</surname>
          </string-name>
          .
          <article-title>Randomized splay trees: Theoretical and experimental results</article-title>
          .
          <source>Information Processing Letters</source>
          ,
          <volume>81</volume>
          (
          <issue>4</issue>
          ):
          <volume>213</volume>
          {
          <fpage>221</fpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Herlihy</surname>
          </string-name>
          .
          <source>Hopscotch Hashing</source>
          . pages
          <fpage>0</fpage>
          {
          <fpage>15</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Idreos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kersten</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S. Manegold. Database</given-names>
            <surname>Cracking</surname>
          </string-name>
          .
          <source>CIDR '07: 3rd Biennial Conference on Innovative Data Systems Research</source>
          , pages
          <volume>68</volume>
          {
          <fpage>78</fpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>I.</given-names>
            <surname>Oukid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lasperas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nica</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Willhalm</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.</given-names>
            <surname>Lehner. FPTree: A Hybrid SCM-DRAM Persistent</surname>
          </string-name>
          and
          <article-title>Concurrent B-Tree for Storage Class Memory</article-title>
          .
          <source>Proceedings of the 2016 International Conference on Management of Data - SIGMOD '16</source>
          , pages
          <fpage>371</fpage>
          {
          <fpage>386</fpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R.</given-names>
            <surname>Pagh</surname>
          </string-name>
          and
          <string-name>
            <given-names>F. F.</given-names>
            <surname>Rodler</surname>
          </string-name>
          . Cuckoo Hashing.
          <source>Journal of Algorithms</source>
          ,
          <volume>51</volume>
          (
          <issue>2</issue>
          ):
          <volume>122</volume>
          {
          <fpage>144</fpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D. D.</given-names>
            <surname>Sleator</surname>
          </string-name>
          and
          <string-name>
            <given-names>R. E.</given-names>
            <surname>Tarjan</surname>
          </string-name>
          .
          <article-title>Self-adjusting Binary Search Trees</article-title>
          .
          <source>Journal of the ACM</source>
          ,
          <volume>32</volume>
          (
          <issue>3</issue>
          ):
          <volume>652</volume>
          {
          <fpage>686</fpage>
          ,
          <year>1985</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>