<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Accelerating Prolly Trees: Simplified Chunking for Rapid Updates</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Abhimanyu Rawat</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tarun Kumar Vangani</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hanno Cornelius</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vanesa Daza</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>BITS Pilani</institution>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Information and Communication Technologies, Universitat Pompeu Fabra</institution>
          ,
          <addr-line>08018 Barcelona</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Status Research and Development</institution>
          ,
          <country country="SG">Singapore</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Amidst the growing reliance on Conflict-free Replicated Data Types (CRDTs) for synchronizing data across distributed networks, our research introduces a novel implementation of Prolly trees. This innovative approach overcomes the limitations of existing Prolly tree models by simplifying design complexities and enabling faster implementation. Our design strategically mitigates existing performance bottlenecks and operational intricacies by introducing an advanced architecture that curtails complexities associated with chunk splitting and minimizes the necessity for updating tree nodes. Furthermore, our implementation incorporates an ingenious Anchor Nodes design and supports batch insertions, substantially elevating the structure's proficiency in managing updates, insertions, and deletions. Developed using Golang, to leverage its execution speed, and Python, for its readability and straightforward adaptability, our version presents significant improvements over conventional Prolly tree configurations. Through exhaustive evaluation and comparative analysis against existing benchmarks, our model underscores a pronounced enhancement in performance metrics. Our method significantly improves both creation and insertion operations, achieving a 30% to 50% enhancement for Dolthub and a multiple-fold increase in performance for Canvas.</p>
      </abstract>
      <kwd-group>
        <kwd>eol&gt;Prolly Tree</kwd>
        <kwd>CRDT</kwd>
        <kwd>Merkle Tree</kwd>
        <kwd>Blockchain</kwd>
        <kwd>Distributed</kwd>
        <kwd>Data Structure</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        With the continuous growth of decentralized technologies, the necessity for scalable and eficient data
structures has intensified. Conflict-free Replicated Data Types (CRDTs) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] have gained prominence
within the decentralized computing arena. CRDTs are specifically engineered to allow replication
across multiple nodes within a network, facilitating convergence towards a consistent state without
necessitating a centralized authority [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        One notable example of a CRDT is the Prolly (Probabilistically Balanced) tree, a distributed data
structure optimized for replicating data across multiple nodes within a network, particularly beneficial
in the context of State Machine Replication systems such as blockchains [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. Here, Prolly trees
play a crucial role in assisting client software [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], operating atop communication layers like libp2p [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]
or devp2p [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], to synchronize states. These states may encompass sets of transactions, for example,
facilitating a seamless state synchronization process. The Prolly tree [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ], a variant of the Merkle tree
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], utilizes a tree-like structure to store the hashes of individual data blocks within a large dataset.
This arrangement facilitates eficient verification of the dataset’s contents. By calculating a set diference
between two Prolly trees in logarithmic time, it is possible to synchronize the state between multiple
parties. Designed for environments requiring weak consistency, the Prolly tree ofers both scalability
and eficiency, enabling the synchronization of large datasets under eventual consistency paradigms.
      </p>
      <p>
        The key performance metrics for a Prolly tree are multifaceted, including the chunking algorithm
which significantly influences the tree’s structure, the capability for inserts to minimally impact the
existing structure, and the eficiency of random reads amidst continuous insertions, among others.
Existing designs [
        <xref ref-type="bibr" rid="ref11 ref12 ref13">11, 12, 13</xref>
        ] have underscored the necessity for optimizing the Prolly tree to enhance
its performance and scalability. In production environments to synchronize the extensive datasets, the
goal is to develop a Prolly tree that not only processes large volumes of data efortlessly but also scales
eficiently in response to a high influx of messages. The performance of Prolly trees is significantly
afected by the order of message or data insertion. However, within the framework of causal broadcast,
data tend to follow a linear sequence, mitigating the complexities associated with random insertions
and thereby enhancing the operational eficiency of Prolly trees. Nevertheless, designing a system
that accommodates random insertions with minimal disruption to the existing tree structure remains
paramount.
      </p>
      <p>In this paper, we introduce an optimized iteration of the Prolly tree, specifically engineered for
enhanced eficiency and scalability, particularly under conditions of sequential insertion. This refined
version of the Prolly tree has been developed and tested using Python and Golang, with extensive
performance evaluations conducted to ascertain its eficacy and scalability. The findings from our
analysis indicate that the optimized Prolly tree exhibits superior capability in managing large datasets
and demonstrates potential to significantly improve scalability across an extensive network of nodes
compared to prior implementations.</p>
      <p>Our contributions are manifold: (1) We propose a novel chunking algorithm that markedly influences
the height and structure of the Prolly tree. (2) We unveil a design modification that ensures random
insertions into the Prolly tree exert minimal impact on its structural integrity. (3) We introduce an
innovative tree backbone design aimed at accommodating a sequence of insertions without impacting
the tree’s existing structure or its performance. (4) We provide an open-source implementation of the
optimized Prolly Tree, complemented by comprehensive performance testing to evaluate its eficiency
and scalability.</p>
      <p>In our experiments with the Prolly Tree implementation on commodity hardware, we observed a
minimum time improvement of around 30% compared to existing implementations. The structure of
this paper is organized as follows: Section 2 provides an overview of the CRDTs focused on Prolly
tree-related data structures. In Section 3, we introduce our proposed design for the optimized Prolly tree,
detailing how it diverges from existing models. Section 4 details the implementation of the optimized
Prolly Tree in Python and Golang, accompanied by a discussion of various performance metrics. Section
5 reviews relevant prior work on CRDT solutions akin to Prolly trees. We conclude the paper in Section
6, where we summarize our findings and outline directions for future research.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>In this section, we briefly explain the fundamentals of the CRDTs in the context of Prolly tree, a brief
introduction to the tree-based technology that facilitates data indexing and integrity preservation.</p>
      <sec id="sec-2-1">
        <title>2.1. Conflict-free Replicated Data Types (CRDTs)</title>
        <p>
          Conflict-free Replicated Data Types (CRDTs) are a class of data structures designed to facilitate reliable
data replication across multiple nodes in distributed systems, without necessitating central coordination
[
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. The fundamental idea of CRDTs is to permit updates at various nodes independently, ensuring that
such updates can be consolidated in a manner that deterministically resolves any conflicts, thereby
achieving eventual consistency throughout the system. CRDTs are particularly valuable in settings
where network partitions are common or where latency and continuous availability are critical concerns,
such as in decentralized applications. A set diference is calculated between two instances of a CRDT,
and the resulting data can be applied to synchronize the original data layers.
        </p>
        <sec id="sec-2-1-1">
          <title>2.1.1. B-Tree and N-ary Tree</title>
          <p>
            B-tree [
            <xref ref-type="bibr" rid="ref14 ref15">14, 15</xref>
            ] is a self-balancing tree data structure that maintains sorted data, allowing for eficient
insertion, deletion, and search operations, even in large datasets. They scale gracefully with the size of
the dataset, and are widely utilized in the implementation of databases and filesystems. Each node in a
B-tree contains a number of keys that are in a specific, sorted order and can have a variable number of
child nodes within predefined limits. This structure not only ensures that the tree remains balanced,
with all leaf nodes at the same depth, but also minimizes disk accesses—making B-trees particularly
eficient for storage systems that rely on sequential block access. Each internal node has at most 2
children in the case of B-tree.
          </p>
          <p>Similarly, n-ary trees share the same foundational properties as binary trees, with the key diference
being their capacity to have at most n children, as opposed to the binary tree’s limit of two. This
significantly reduces the height of the tree but impacts the width of the tree. For B-tree operations such
as search, insert, and delete have (log2  ) complexity where N is the number of keys in the tree. For
n-ary trees, operational complexity is (log  ).</p>
        </sec>
        <sec id="sec-2-1-2">
          <title>2.1.2. Merkle Tree</title>
          <p>
            Merkle trees, serve as a cryptographically secure and eficient structure for verifying the contents
of large datasets [
            <xref ref-type="bibr" rid="ref10">10</xref>
            ]. They are essentially binary trees with same operational complexity in which
every leaf node represents a data block’s cryptographic hash, and each non-leaf node is the hash of its
children. This hierarchical hashing strategy ensures that any alteration in a single data block can be
quickly detected by observing changes in the root hash, facilitating eficient and secure verification
processes. The sequential arrangement of leaf data within a Merkle tree is of paramount importance, as
altering the position of the same data can lead to a diferent root hash, fundamentally changing the
tree’s integrity. The construction of a Merkle tree exhibits a complexity of O(N), while the verification
process benefits from a more eficient complexity of O(log N), where N represents the total number of
data blocks.
          </p>
          <p>
            In the realms of blockchains and cryptocurrencies, Merkle trees play a crucial role for light clients
in verifying data inclusion within a tree structure. Notably, in leading cryptocurrencies like Bitcoin
and Ethereum, Merkle trees are instrumental in confirming the presence of transactions within a block.
This verification process is remarkably eficient, requiring only a logarithmic number of data nodes
from the Merkle tree to ascertain the inclusion of a specific transaction. Additionally, a specialized
variant known as the Merkle Patricia trie [
            <xref ref-type="bibr" rid="ref16">16</xref>
            ] is extensively utilized in Ethereum, serving a pivotal
function in maintaining the blockchain’s state.
          </p>
        </sec>
        <sec id="sec-2-1-3">
          <title>2.1.3. Prolly Tree</title>
          <p>
            Prolly trees data structures are designed to optimize storage and retrieval in distributed databases
and version control systems. First introduced by Noms [
            <xref ref-type="bibr" rid="ref9">9</xref>
            ], Prolly tree structure is a variant of the
B-tree and Merkle tree, distinguished by its probabilistic approach to node splitting and merging, which
enhances performance in distributed environments. Prolly trees are particularly adept at handling large
datasets that require eficient synchronization across multiple clients or servers, due to their eficient
data deduplication and conflict-free replication capabilities. Each node in a Prolly tree can contain a
mix of data and hashes, facilitating both direct data access and verification of large data blocks without
necessitating complete traversal of the tree. In the design the number of children each node can have is
probabilistically deterministic.
          </p>
          <p>
            Prolly Trees comprise of the following notable components:
• Node: It is the simplest unit of data storage in Prolly tree. It represents a key-value pair at Level
0 of the Prolly tree. At non-zero levels, only the key is used to traverse the tree. Each Node also
stores a Merkle hash of the children nodes below it. In this paper, Node is sometimes referred to
as a key-value pair or a key.
• Level: It is the level of the tree at which the node is present. Level 0 is the leaf level of the tree
where Nodes have both key and value filled in them. At Non-zero levels, only the key is used to
traverse the tree.
• Boundary Node: It is any node in the Prolly tree whose node_hash attribute falls below a certain
threshold, leading to its promotion to the subsequent level in the tree. This node contains the
rolling Merkle-hash of a group of non-boundary nodes present in the level just below it until the
ifrst boundary node or None is seen when moving from right to left. It is also called a Promoted
Node.
• Tail/Anchor Node: It is a special concept where a Node (non-data) is inserted at each level of
the Prolly Tree. It can also be termed as a Fake node. It acts as a backbone for the Prolly tree. It
is always a boundary node by default. This is a design put forward by Canvas [
            <xref ref-type="bibr" rid="ref11">11</xref>
            ], where the
position of Anchor Node is on the left side of the Tree, other Prolly tree implementations such as
Dolthub [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ] and IPLD [
            <xref ref-type="bibr" rid="ref12">12</xref>
            ] do not have this concept. In our proposed Prolly tree, the anchor Node
is positioned on the rightmost side of each level, so that for the most recent incoming Nodes that
are inserted in the tree and happen to be non-boundary nodes, their Merkle hash state is tracked
with such Node.
• Bucket/Chunk: It is a collection of nodes that are present at the same level in the Prolly tree and
on either end there are boundary nodes (left side boundary node excluded). The bucket/chunk
boundary starts from the last seen boundary node in a sequence until the first seen Boundary
node. The chunk is also represented using the rolling Merkle hash of all the nodes that are present
in the chunk/bucket. This Merkle hash is subsequently stored in the boundary node positioned at
the immediate upper level, thereby cryptographically encapsulating the chunk’s state.
          </p>
          <p>Throughout the rest of the document, the terms "bucket" and "chunks" are used interchangeably to
refer to the same concept.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Optimized Prolly Tree</title>
      <p>In this section, we first elaborate on the design of our optimized Prolly tree as per the diferent
components involved, then examine its distinctive properties that enhance its eficiency compared to
other implementations.</p>
      <sec id="sec-3-1">
        <title>3.1. Chunking and Height of Prolly Tree</title>
        <p>Before commencing the construction of the Prolly tree, it is essential to define the mechanisms for
controlling the tree’s width and height. The width pertains to the number of children each Node in the
tree can accommodate, essentially the chunk or bucket size. The height, conversely, is determined by
this width in relation to the total number of Nodes, reflecting the overall structure’s vertical dimension.
Width and height are crucial as they introduce a level of determinism to the Prolly tree’s structure.</p>
        <p>Adopting a probabilistic methodology for tree structure, wherein the sizes of chunks or buckets
are determined by the hash of key-value pairs—essentially, the content of the tree Nodes—we put
forward a model to estimate the chunk size and anticipated height of the tree. The hash of each Node
(content-defined structure) is utilized to ensure that, even within a probabilistic framework, any entity
in the network can reconstruct the identical Prolly tree. This model relies on the division of the hash
value space into Q equally sized segments and identifies boundary nodes according to specific criteria
that involve analyzing the last few hexadecimal digits of each node’s hash value.</p>
        <p>The model operates under the following assumptions:
• Uniform Distribution: The hash function is assumed to uniformly distribute values across its
range. This assumption ensures that any segment of its output, including the last few hex digits,
exhibits a uniform distribution.
• Independent Selection: Each node’s determination as a boundary node is considered an
independent event, with the probability  of being a boundary node defined as  = 1 .</p>
        <sec id="sec-3-1-1">
          <title>3.1.1. Model Description</title>
          <p>Under these assumptions, the expected height  of the tree, for a total of  nodes, can be estimated
by acknowledging the relationship between the number of nodes and the frequency at which nodes
become boundary nodes:
 =
1

 ≈ log( )</p>
          <p>Given , the expected height  of a tree with N number of unique Nodes can be approximated by:</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Construction of Prolly Tree</title>
        <p>The construction of the Prolly tree is pivotal, serving as the foundational step to initialize the tree
using existing key-value data. This data, irrespective of its origin, is seamlessly integrated into the
tree’s framework, demonstrating the implementation’s flexibility. The process of building the Prolly
tree adheres to two methodologies outlined in Base Level construction Algorithms 1 and Other Levels
construction Algorithm 2. These two algorithms play a vital role in bootstrapping a Prolly tree.</p>
        <sec id="sec-3-2-1">
          <title>3.2.1. Base Level Construction</title>
          <p>As detailed in Algorithm 1, it begins with the incorporation of key-value data store to form the tree’s
base level. Each piece of data undergoes hashing with the SHA-256 algorithm, and the resultant hash is
assigned as the node_hash attribute in a Prolly tree Node. A Node is elevated to the next level if the
hexadecimal digit furthest to the right of node_hash falls below a predefined threshold. This elevation
process is iteratively applied until all data is allocated to level 0, with a probabilistic selection of nodes
ascending to level 1. Promoted nodes encapsulate the rolling hash of their child nodes from the lower
level within the merkle_hash attribute, facilitating the identification and comparison of subtree or
chunk contents. As depicted in Figure 1, an Anchor node is introduced at the extremity of level 0,
(1)
(2)
Rolling Hash of child chunk</p>
          <p>Rolling Hash of child chunk
on the rightmost side, further elaborated upon in Section 2.1.3. The Nodes in the pink box are the
boundary/promoted nodes, they are forming a chunk of their own starting from the previously observed
boundary Node.</p>
        </sec>
        <sec id="sec-3-2-2">
          <title>3.2.2. Other Levels Construction</title>
          <p>For levels exceeding zero, as detailed in Algorithm 2, each Node, except the Anchor Node, situated at
these elevated levels is subjected to rehashing based on the value contained in its node_hash attribute.
The decision to promote a Node to the subsequent level again depends on whether its node_hash meets
a specified threshold value. This promotion process is repeated until the arrangement at a given level
is reduced to a single Node, in conjunction with an Anchor Node. At the final stage of this iterative
process, the Anchor Node is designated as the tree’s root, thereby establishing the ultimate hierarchical
structure of the Prolly tree.</p>
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Updating the Prolly Tree</title>
        <p>
          Updating a Prolly tree is similar to updating a conventional Merkle or B tree [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], with the additional
intricacy of chunking. The update operations include inserting, deleting, or modifying a key-value pair
(Node at Level 0) within the tree. As delineated in Algorithm 3, the update process entails three main
steps: (1) locating the position of the key-value pair to be updated, (2) executing the update operation,
and (3) updating the Merkle hash values of the impacted chunks or boundary Nodes.
        </p>
        <p>Search operations play a crucial role in identifying the position of the key-value pair that needs
updating. The search algorithm, denoted as SEARCH_POSITION(...), operates recursively, navigating
from the root node to the leaf node along the path of the key-value pair designated for updating. As
specified between lines 4-8, the search process, particularly during insertion, begins at the root and
proceeds downward to the leaf level to find the key that is immediately smaller than the target key.
This is the reason the algorithm evaluates at intermediate Nodes whether the left Node is higher. Upon
locating the target leaf Node, the algorithm inserts the Node, corrects the pointers, assesses the extent
to which the inserted Node will ascend, and then refreshes the Merkle hash values of the impacted
chunks. The Merkle hash update algorithm, UPDATE_HASH(...), also a recursive function, starts from
the leaf Node and moves up to the root Node, updating the Merkle hash values of the afected chunks
(maximum two in our case). Upon insertion or any other update, Merkle hash of the afected chunks
are updated first and then it moves towards the right side of the Prolly tree to update the chunks till
the root node. Line 17 Find next boundary Node is a procedure that finds the boundary Node of each
chunk on the right side of the updated site in the tree moving up in a reverse waterfall manner to
eficiently update the parts of the tree. During sequential updates within the tree, only the Anchor
Nodes, extending all the way up to the Root Node, are afected by the Merkle hash updates(if inserted
Node is non-boundary). Conversely, only the directly impacted chunk undergoes a structural update in
scenarios involving random updates.</p>
        <p>When updating content, the previously described algorithm operates similarly, except for the insertion
step. In such instances, only the Merkle hash of the impacted chunk is updated. This update process is
notably eficient because it ensures no other chunk is afected. On the other hand, during a deletion,
the search mechanism functions in the same manner. However, with deletion, the Node is removed
from all Levels, and alongside the pointers to adjacent nodes, the Merkle hash of the afected chunk is
updated. In the case of deletion, there exists a probability of  (︀ 1 )︀ (where b is the chunking factor) that
an existing chunk will be removed and merged with an adjacent chunk.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Features of The Optimized Prolly Tree</title>
        <p>Highlighted below are the distinguishing features of the optimized Prolly tree, contributing to its
eficiency compared to other implementations:
◁ Target position is left to this Node</p>
        <sec id="sec-3-4-1">
          <title>3.4.1. Right Side Anchor Node Design</title>
          <p>Anchor Nodes within the Prolly tree serve as a foundational framework, efectively integrating node
data pending inclusion in a chunk. This architecture significantly enhances eficiency over other designs,
as illustrated in Figure 2. Here, each insertion or update influences the entire pre-formed tree through
updates to the merkle_hash. Additionally, Figure 3 demonstrates that sequential insertions leave the
tree’s existing structure unafected; changes are confined to the upgrade path, where Merkle hashes
are updated following any node modification. It is a notable advantage over other designs where new
changes necessitate adjustments throughout the entire tree. In contrast, in our design, only the Anchor
Nodes across various levels are subject to modifications, thereby streamlining the update process and
maintaining the integrity of the tree’s structure.</p>
        </sec>
        <sec id="sec-3-4-2">
          <title>3.4.2. Subtree Segmentation</title>
          <p>
            The proposed Prolly tree design features an innovative approach for segregating pre-established tree
chunks, enabling these segments to operate independently as subtrees. This capability is crucial,
considering that a node’s height remains uniform across the network, thereby allowing these chunks
to autonomously function. As shown in Figure 1, bucket 4 exemplifies how a chunk or subtree can
serve as a standalone unit of comparison. Such functionality becomes particularly advantageous for
lightweight client applications unable to store large sized Prolly trees, exemplified by clients within
Internet of Things (IoT) blockchain ecosystems [
            <xref ref-type="bibr" rid="ref18">18</xref>
            ]. This segregation feature significantly enhances the
eficiency of synchronization activities across Prolly trees with varying heights. As detailed in Section
3.4.1 and Figure 3, in scenarios involving sequential updates, existing tree chunks can be segregated
from the primary tree structure. This allows for updates to be confined to Anchor Nodes or newly
developed subtrees.
          </p>
        </sec>
        <sec id="sec-3-4-3">
          <title>3.4.3. Batch Insertion</title>
          <p>A crucial innovation of the proposed Prolly tree is its support for batch insertion, a mechanism that
allows for the simultaneous insertion of multiple key-value pairs. This method is akin to embedding
a miniature subtree within the larger tree structure, markedly enhancing eficiency by diminishing
the requisite number of updates. The implementation of a sophisticated chunking algorithm alongside
a thoughtfully designed Anchor node system underpins this capability. This approach, favoring the
collective insertion of a subtree over individual additions, streamlines the entire process. Notably, this
feature proves invaluable for the synchronization of Prolly trees; insertions made between boundary
nodes obviate the need for the intricate restructuring typically necessary in other models, thereby
facilitating a more seamless integration process.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Implementation</title>
      <p>In this section, we provide the implementation details of the proposed Prolly tree along with a few
metrics over which it outperforms the existing designs.</p>
      <sec id="sec-4-1">
        <title>4.1. Python and Go Implementation</title>
        <p>Our proposed Prolly tree design, as detailed in Section 3, is available through an open-source
implementation. For this project, we selected Go due to its eficiency and speed, and Python for its simplicity,
readability, and ease of modification. We have structured our implementation into three fundamental
components of the Prolly tree, aiming to optimize both performance and accessibility in the development
process.</p>
        <sec id="sec-4-1-1">
          <title>4.1.1. Node Class</title>
        </sec>
        <sec id="sec-4-1-2">
          <title>4.1.2. Level Class</title>
          <p>This entity signifies a Node within the Prolly tree, characterized by attributes including key, value,
pointers to other Nodes within the tree, merkle_hash, node_hash, among others. Each Node serves
as a self-contained unit, enabling traversal throughout the entirety of the tree by leveraging these
interconnected components.</p>
          <p>This component embodies a specific level within the Prolly tree, distinguished by attributes like the
level number and the tail node of that level. The Level functions as a repository for Nodes situated at a
distinct tier within the tree, acting as a modular unit that can expand or contract according to the tree’s
structural needs.</p>
        </sec>
        <sec id="sec-4-1-3">
          <title>4.1.3. ProllyTree Class</title>
          <p>This constitutes the operational core of the Prolly tree, integrating both Node and Level components to
construct the tree structure. It is equipped with utility functions facilitating essential operations such as
traversal, insertion, deletion, and the updating of Merkle roots. Due to its component-based modularity,
this unit is highly programmable, allowing for swift adaptations in the tree’s behavior to meet various
requirements.</p>
          <p>For testing and demonstration purposes, we have introduced a Message class designed to generate
message objects for insertion into the tree. This class is characterized by attributes such as key and
value, facilitating the creation of message objects. Additionally, we have implemented a sample dif
algorithm tasked with comparing two Prolly trees to identify discrepancies between them. This feature
not only serves to verify the tree’s integrity but also showcases its capacity to manage updates eficiently.
We are making our codebase for the proposed Prolly tree publicly available1 for anyone to recreate and
verify benchmarks, and to further extend capabilities.
4.2. Performance Metrics
* Prolly tree initialization data used for Canvas in Table 1 is taken directly from the publicly available
ifgures 2 by Canvas, it is because the benchmarking code3 by Canvas does not support the Tree
initialization benchmarks. While the Dolthub and proposed Prolly tree benchmarks were executed. In the
1Available at: https://github.com/ABresting/Prolly-Tree-Waku-Message
2https://joelgustafson.com/posts/2023-05-04/merklizing-the-key-value-store-for-fun-and-profitempirical-maintenanceevaluation
3https://github.com/canvasxyz/okra
case of Dolthub, data showcases a trend that the proposed Prolly tree implementation outperforms the
existing solutions by around 30-45% for creating the Prolly tree. While it is around 3 times faster than
the Canvas Prolly tree.</p>
          <p>We tested the performance of the existing Prolly trees against our proposed implementation by
utilizing the publicly available metrics and open-source codebases of Canvas4 and Dolthub5, evaluating
across various metrics. Operations such as creation/initialization of a tree and insertions were performed
on Prolly trees of varying capacities to assess their eficiency. The following benchmarks were conducted
on hardware equipped with a 13th Gen Intel i7-13700H processor with 20 cores, 64 GB of main memory,
and a 1 TB M2 SSD, all running on Ubuntu 22.04 LTS. Specifically, our approach accelerates the tree
bootstrapping process from 30% to multiple-folds during creation time and demonstrates significantly
faster performance in handling insertions, achieving these results on average with commodity hardware.
Benchmarks from Canvas and Dolthub are also very interesting since they are running atop real-world
production environments with libp2p as a network layer in the case of Canvas. The performance metrics
are summarized in Table 1 and 2.</p>
          <p>The benchmarks presented in Table 2 were generated through the execution of the publicly available
benchmark codebase of both Dolthub and Canvas. In the table, the same number of unique entries was
inserted into an existing tree that already contained an identical(unique also) number of entries. For
instance, inserting 1,000 entries into a tree already holding 1,000 Nodes, similar to other numbers of
4https://github.com/canvasxyz/okra/blob/main/benchmarks/main.zig
5https://github.com/dolthub/dolt/blob/main/go/performance/kvbench/prolly_store.go</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Related Work</title>
      <p>In this section, we explore tree-based Conflict-free Replicated Data Types (CRDTs), including designs
akin to the Prolly tree that have been introduced in previous research.</p>
      <p>
        CRDTs are pivotal for achieving eventual consistency within distributed systems, ensuring that
datasets synchronize with minimal communication and computation overhead. Auvolat et al. [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]
introduced the Merkle Search Tree (MST), an n-ary Merkle tree that shares similarities with the Prolly
tree design. Their approach utilizes a chunking algorithm that hashes data (key-value pairs); the number
of leading zeros in the resultant hash dictates the node’s height within the MST. Hashes without a
leading zero remain at the leaf node level, while those with one or more leading zeros are elevated
to higher levels. Their design is reminiscent of the B-tree’s node tracking methodology, albeit with a
distinct structural configuration. Importantly, this construction maintains the order of transactions,
facilitating eficient remote comparisons or diferences with other MST instances.
      </p>
      <p>
        The Prolly tree, as developed by Jose et al. at Canvas [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], adopts a nuanced method for determining
node promotion, specifically concerning boundary node selection. This method entails assessing
whether certain bytes from the data node’s resultant hash fall below a predefined threshold. If so, the
node is elevated to the next hierarchical level. For nodes situated above the initial level, the evaluation
involves the aggregated hashes of their child nodes, proceeding until a new boundary node is identified.
To enhance the stability of the Prolly tree, a unique anchor node is introduced at the beginning of every
level. This node, invariably the first child of its parent, ensures structural consistency and provides a
backbone to the tree, as defined in Section 2.1.3. Notably, random updates within the tree can potentially
alter its structure both horizontally and vertically, extending up to the root. Such modifications, which
may result in the splitting of node buckets, add a layer of complexity to the tree’s architecture. Unlike
traditional data structures, the Prolly tree utilizes a content-addressable order, thereby not maintaining
lexicographical sequence.
      </p>
      <p>
        The Prolly tree implementation by Son et al. at Dolthub [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] exhibits considerable similarity to the
Prolly tree framework developed at Canvas. A primary distinction, however, lies in the utilization of
a rolling hash for the incoming data at the leaf level to determine the initiation point for boundary
node creation. Unlike the Canvas model, the Dolthub approach does not incorporate the concept of
an anchor node. Instead, it features a continuous bucket of nodes that receive incoming data nodes,
from which no boundary node is derived. Comparable to the Canvas Prolly tree, the Dolthub Prolly
tree encounters similar complexities in tree restructuring due to random updates or insertions. With
each insertion certain parts of the Prolly tree needs to update their hashes.
      </p>
      <p>
        The Prolly tree implementation by IPLD [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] shares notable similarities with those developed by
Canvas and Dolthub. It too employs a rolling hash mechanism for determining the formation of node
chunks and the subsequent creation of boundary nodes. The process of splitting existing chunks aligns
with the methodologies observed in both Canvas and Dolthub implementations. A distinctive feature
of the IPLD implementation is its high programmability and adaptability to changes, facilitated by a
commitment to component modularization. This approach enables a flexible architecture, allowing for
easier modifications and updates in response to evolving requirements.
      </p>
      <p>Our Prolly tree design is strategically oriented towards minimizing both the complexity inherent in
the tree’s structure and the necessity for its restructuring while seamlessly preserving the established
properties and standards outlined in Section 2.1.3. The design decisions employed facilitate simplified
updates to the tree, thereby reducing the impact on the tree’s overall structure and enhancing the
eficiency of parallel read and write operations. Although, the mechanisms by which two Prolly trees
are compared and the methodology for calculating diferences between them fall beyond the scope
of this paper, but we have provided a sample execution code in the provided codebase and a brief
explanation in the Appendix A.1</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>In conclusion, our study presents an innovative approach to optimizing Prolly trees, a pivotal structure
in distributed database systems and version control applications. Building on our enhanced and
optimized design, which surpasses a variety of existing approaches, we introduce several key features
that significantly enhance performance and eficiency. Notably, avoiding horizontal chunk splitting, the
unique Anchor Nodes design reduces the complexity of updates and synchronizations, streamlining
operations while maintaining high levels of data integrity and system resilience, batch insertion, etc.
Our open-source implementation, crafted in Go for its performance and Python for its ease of use,
underscores the practicality and adaptability of our design.</p>
      <p>Empirical evaluations, benchmarked against publicly available metrics and the codebases of Canvas
and Dolthub, validate the eficiency of our proposed Prolly tree implementation, which adheres to the
established standards and essential properties of a Prolly tree. Our approach demonstrates a 30% to 50%
improvement in tree initialization time for Dolthub and multiple-fold increases for Canvas, with similar
gains in insertion eficiency, all on commodity hardware. These findings underscore the robustness and
eficiency of our design and its potential as a foundational model for future enhancements in distributed
computing environments.</p>
      <p>Our research contributes to the ongoing dialogue in distributed systems design, ofering a practical
solution that balances eficiency, flexibility, and performance. In future, we sought to develop a
multithreaded Prolly tree design towards faster operations and cutting down additional time.</p>
    </sec>
    <sec id="sec-7">
      <title>A. Appendix A</title>
      <sec id="sec-7-1">
        <title>A.1. Synchronizing Prolly Trees</title>
        <p>The synchronization process initiates with a dif protocol, involving two Prolly tree root nodes from
diferent peer-to-peer clients. For efective comparison, these roots—or trees—must be at the same
height. If one tree is taller than the other, the taller tree’s root node descends to match the height of the
shorter tree. Notably, in our design, Anchor Nodes serve as the roots of the Prolly tree. Utilizing this
design, each level’s Anchor Node can act as a root for the tree below, ensuring uniform tree heights.
Consequently, the taller tree selects an Anchor Node at the same level as the shorter tree’s root. As
illustrated in Figures 4 and 5, both trees are adjusted to the same height, facilitating the commencement
of the comparison. The key node missing in this scenario is the node with key 7.</p>
        <p>The synchronization process involves comparing the Merkle Hash of each root nodes. If a match is
found, the algorithm concludes, as it indicates the trees are identical. If the Merkle hashes do not match,
the intermediate nodes beneath the root node are retrieved. Each node at this stage represents a subtree.
For example, at level 2, nodes 6 and 9, along with the Tail/Anchor Node, represent subtrees encapsulating
the state of all underlying nodes. Subtrees where the Merkle hashes match are not explored further.
Ultimately, the algorithm isolates the specific segment of leaf nodes where the diferences manifest.
These nodes can be retrieved using a method tailored to the bandwidth and resources of the clients.
As shown in Figure 4, node 7, located within chunk 4, exemplifies a missing node. This process may
identify multiple difering nodes across several chunks, which are then synchronized accordingly.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Shapiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Preguiça</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Baquero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zawirski</surname>
          </string-name>
          ,
          <article-title>Conflict-free replicated data types</article-title>
          , in: Stabilization, Safety, and
          <source>Security of Distributed Systems: 13th International Symposium, SSS</source>
          <year>2011</year>
          , Grenoble, France,
          <source>October 10-12</source>
          ,
          <year>2011</year>
          . Proceedings 13, Springer,
          <year>2011</year>
          , pp.
          <fpage>386</fpage>
          -
          <lpage>400</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>K.</given-names>
            <surname>Jannes</surname>
          </string-name>
          ,
          <article-title>Secure and resilient data replication for the client-centric decentralized web</article-title>
          ,
          <source>in: Proceedings of the 23rd International Middleware Conference Doctoral Symposium</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>V.</given-names>
            <surname>Buterin</surname>
          </string-name>
          ,
          <article-title>Ethereum: A next-generation smart contract and decentralized application platform, Ethereum project yellow paper (2014 (accessed March,</article-title>
          <year>2024</year>
          )). URL: https://ethereum.org/ 669c9e2e2027310b6b3cdce6e1c52962/Ethereum_Whitepaper_-_
          <string-name>
            <surname>Buterin</surname>
          </string-name>
          _
          <year>2014</year>
          .pdf.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Nakamoto</surname>
          </string-name>
          ,
          <article-title>Bitcoin: A peer-to-peer electronic cash system</article-title>
          , Cryptography Mailing list at https://metzdowd.com (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>O.</given-names>
            <surname>Thorén</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Taheri-Boshrooyeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Cornelius</surname>
          </string-name>
          ,
          <article-title>Waku: A family of modular p2p protocols for secure &amp; censorship-resistant communication</article-title>
          ,
          <source>in: 2022 IEEE 42nd International Conference on Distributed Computing Systems Workshops (ICDCSW)</source>
          , IEEE,
          <year>2022</year>
          , pp.
          <fpage>86</fpage>
          -
          <lpage>87</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>P.</given-names>
            <surname>Labs</surname>
          </string-name>
          , Libp2p,
          <source>2021 (accessed March</source>
          ,
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>G.</given-names>
            <surname>Wood</surname>
          </string-name>
          , Devp2p wire protocol,
          <source>2014 (accessed March</source>
          ,
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>T.</given-names>
            <surname>Sehn</surname>
          </string-name>
          , Prolly trees,
          <source>2024 (accessed March</source>
          ,
          <year>2024</year>
          ). URL: https://www.dolthub.com/blog/ 2024-03-03
          <string-name>
            <surname>-</surname>
          </string-name>
          prolly-trees/.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Boodman</surname>
          </string-name>
          ,
          <article-title>Prolly trees: Probabilistic b-trees</article-title>
          ,
          <source>2021 (accessed March</source>
          ,
          <year>2024</year>
          ). URL: https://github. com/attic-labs/noms/blob/master/doc/intro.md
          <article-title>#prolly-trees-probabilistic-b-trees.</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>R. C.</given-names>
            <surname>Merkle</surname>
          </string-name>
          ,
          <article-title>A digital signature based on a conventional encryption function</article-title>
          ,
          <source>in: Conference on the theory and application of cryptographic techniques</source>
          , Springer,
          <year>1987</year>
          , pp.
          <fpage>369</fpage>
          -
          <lpage>378</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>Gustafson</surname>
          </string-name>
          ,
          <article-title>Merklizing the key/value store for fun and</article-title>
          profit,
          <source>2023 (accessed March</source>
          ,
          <year>2024</year>
          ). URL: https://docs.canvas.xyz/blog/2023-05-04
          <article-title>-merklizing-the-key-value-store</article-title>
          .html.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <article-title>Ipld prolly trees specification</article-title>
          ,
          <source>2022 (accessed March</source>
          ,
          <year>2024</year>
          ). URL: https://github.com/ipld/ipld/blob/ prolly-trees/specs/advanced
          <article-title>-data-layouts/prollytree/spec</article-title>
          .md#structure.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Son</surname>
          </string-name>
          ,
          <article-title>How dolt stores table data</article-title>
          ,
          <source>2020 (accessed March</source>
          ,
          <year>2024</year>
          ). URL: https://www.dolthub.com/ blog/2020-04-01
          <article-title>-how-dolt-stores-table-data/#prolly-trees.</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>R.</given-names>
            <surname>Bayer</surname>
          </string-name>
          ,
          <string-name>
            <surname>E.</surname>
          </string-name>
          <article-title>McCreight, Organization and maintenance of large ordered indices</article-title>
          ,
          <source>in: Proceedings of the 1970 ACM SIGFIDET (Now SIGMOD) Workshop on Data Description, Access and Control</source>
          ,
          <year>1970</year>
          , pp.
          <fpage>107</fpage>
          -
          <lpage>141</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Adelson-Velskii</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. M.</given-names>
            <surname>Landis</surname>
          </string-name>
          ,
          <article-title>An algorithm for organization of information</article-title>
          , in: Doklady Akademii Nauk, volume
          <volume>146</volume>
          ,
          <string-name>
            <surname>Russian</surname>
            <given-names>Academy</given-names>
          </string-name>
          <source>of Sciences</source>
          ,
          <year>1962</year>
          , pp.
          <fpage>263</fpage>
          -
          <lpage>266</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>H. S. de Ocáriz Borde</surname>
          </string-name>
          ,
          <article-title>An overview of trees in blockchain technology: merkle trees and merkle patricia tries</article-title>
          , University of Cambridge: Cambridge, UK (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>D.</given-names>
            <surname>Comer</surname>
          </string-name>
          ,
          <string-name>
            <surname>Ubiquitous</surname>
          </string-name>
          b-tree,
          <source>ACM Comput. Surv</source>
          .
          <volume>11</volume>
          (
          <year>1979</year>
          )
          <fpage>121</fpage>
          -
          <lpage>137</lpage>
          . URL: https://doi.org/10.1145/ 356770.356776. doi:
          <volume>10</volume>
          .1145/356770.356776.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>S.</given-names>
            <surname>Popov</surname>
          </string-name>
          , The tangle,
          <source>2018 (accessed April</source>
          ,
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>A.</given-names>
            <surname>Auvolat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Taïani</surname>
          </string-name>
          ,
          <article-title>Merkle search trees: Eficient state-based crdts in open networks</article-title>
          ,
          <source>in: 2019 38th Symposium on Reliable Distributed Systems (SRDS)</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>221</fpage>
          -
          <lpage>22109</lpage>
          . doi:
          <volume>10</volume>
          .1109/ SRDS47363.
          <year>2019</year>
          .
          <volume>00032</volume>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>