=Paper= {{Paper |id=Vol-2622/paper11 |storemode=property |title=ElasticBloC: A Massively Scalable Architecture For Blockchain Based Applications |pdfUrl=https://ceur-ws.org/Vol-2622/paper11.pdf |volume=Vol-2622 |authors=Hadi Jibbawi,Rafiqul Haque,Yehia Taher,Ali Jaber |dblpUrl=https://dblp.org/rec/conf/bdcsintell/JibbawiHTJ19 }} ==ElasticBloC: A Massively Scalable Architecture For Blockchain Based Applications== https://ceur-ws.org/Vol-2622/paper11.pdf
ElasticBloC: A Massively Scalable Architecture For Blockchain Based Applications

Hadi Jibbawi
Lebanese University
Beirut, Lebanon
hadi.jibbawi@gmail.com

Yehia Taher
Université de Versailles – Paris-Saclay
Versailles, France
yehia.taher@uvsq.fr

Rafiqul Haque
Intelligencia R&D
Paris, France
Rafiqul.Haque@intelligencia.fr

Ali Jaber
Lebanese University
Beirut, Lebanon
ali.jaber@ul.edu.lb

   Abstract: Blockchain is an emerging technology that could disrupt existing centralized
financial systems and usher in a new technology era for the financial sector. Additionally,
new use cases such as healthcare and identity management suggest that Blockchain has much
wider applications. Blockchain is founded on distributed ledger technology that ensures trust
through consensus between parties in a peer-to-peer network, without the need for a third
party or central authority. However, blockchain has several limitations, such as scalability,
latency, and low throughput, which are the main barriers to its adoption by industry. Of
these, scalability is the most critical limitation and needs an efficient and effective
solution. In this paper, we aim to enhance the scalability of blockchain by designing and
implementing a massively scalable architecture for private blockchain-based applications,
called ElasticBloC. To evaluate our contribution, we conducted several experiments on
ElasticBloC. The results showed that ElasticBloC is a high-performance architecture that
scales massively.
   Index Terms: Blockchain, Performance, Scalability

 Copyright © 2019 for this paper by its authors. Use permitted under Creative
 Commons License Attribution 4.0 International (CC BY 4.0).

                            I. INTRODUCTION
   Over the last few years, Blockchain has drawn huge attention from industry experts,
technology evangelists, and academic researchers due to its immense potential, described in a
large body of literature and other sources such as blogs and forums. Many researchers and
industry experts have argued that it is a revolutionary technology, like the Internet, that
will provide a highly efficient way to transact in a secure, immutable, transparent, and
auditable manner. In 2016, it was one of the technologies that reached the peak of inflated
expectations; since then, interest in blockchain has spread to different types of industries.
In particular, interest in Blockchain technology within the financial industry started to grow
because it might enable the industry to avoid financial debacles such as the Heartland Payment
Systems data breach of 2008.
   Security is an ever-growing concern in the financial industry. With the advent of digital
financial information systems and rich transaction technologies, operations have become faster
and operational activities have expanded widely; at the same time, the risk of information
breaches has increased enormously. Although advanced encryption technologies enable encrypted
transmission of financial data between financial actors (e.g., banks, insider intruder),
security remains a problem because cryptographic algorithms are still vulnerable to many
attacks launched by adversaries. Blockchain was deemed a major breakthrough technology that
would address some unsolved security issues. Therefore, wide adoption of Blockchain was
forecast by many, such as Gartner, not only within the financial industry but also in other
industries.
   Furthermore, trust is critical in communication between two or more parties. Historically
and until now, trust has been achieved mostly through a third party, such as a bank or an
authority, that holds our data. In other words, communicating parties rely on a common ledger
that is held and managed by this third party. As a typical example, a clearing house validates
and manages the communication between trading parties. Blockchain technology emerges as a
potential alternative. Its disruptive aspect is that it eliminates the need for intermediaries
while performing transactions [7]. Hence, it can empower groups of parties to agree on events
without needing a third party; such is the promise of this new technology [8].
   Our study revealed several limitations of Blockchain technologies, as mentioned earlier.
However, the major concerns of Blockchain technologies are two-fold: scalability and
performance. Scalability is considered the critical drawback of blockchain technology. In
fact, it is the first limitation that must be addressed to make blockchain an acceptable
technology. The reason is obvious. Consider a Walmart payment system that processes more than
250 transactions every hour. Since Blockchain technology
replicates blocks on different consensus servers hosted in different locations to increase
trust and guard against any unauthorized modification, it requires a scalable infrastructure
not only at the operational level but also at the physical level. Blockchain technologies are
founded on the immutability principle, which means that a block, once created, cannot be
updated or replaced by a new block. New blocks can only be appended, which demands physical
scalability, especially for retail companies like Walmart, banks, and high-end manufacturing
companies.
   In the existing blockchain protocols such as Ethereum, Bitcoin, Ripple, and Tendermint,
each participating node should process every transaction in the network. In this case, nodes
need more storage, bandwidth, and computation power as the blockchain expands. Indeed, this
technology will lose its decentralization, because the blockchain will reach a limit at which
only specific nodes can process a block [19]. According to Vitalik Buterin, scalability cannot
be achieved with the current design of blockchain technology because the design focuses on
decentralization and security, not scalability. While a decentralized consensus mechanism
offers some critical benefits, such as fault tolerance, a strong guarantee of security,
political neutrality, and authenticity, it comes at the cost of scalability. Existing
solutions, some of which are discussed in Section 2, vary in their approach: some scale only
up to a limit, while others degrade performance [22].
   Realizing the significance of a massively scalable infrastructure for Blockchain
technology, which existing solutions cannot provide, in this paper we developed a scalable
Blockchain technology in which scalability is strongly correlated with the performance that
has an impact on blockchain efficiency.
   The remainder of this paper is organized as follows. Section 2 discusses works related to
the core issue of this paper. Section 3 explains our solution, ElasticBloC. Section 4 reports
our experiments with ElasticBloC and discusses the results. Section 5 presents a conclusion.

                            II. RELATED WORKS
   This research revolves around the scalability issue of blockchain technology. In this
section, we review works related to this issue. Ehmke et al. [10] proposed a solution based on
the Ethereum idea of keeping the state of the system explicitly in the current block, and
extended it by also including the relevant part of the current system state in new
transactions. This enables other participants to validate incoming transactions without having
to download the whole blockchain first. The scalability in the proposed solution is at the
logical level; that is, the authors developed techniques extending the Merkle Patricia Tree
[4]. In [9], Dorri et al. proposed a tiered Lightweight Scalable Blockchain (LSB) that is
optimized for IoT requirements. The authors explored LSB in a smart home setting as a
representative example of broader IoT applications. LSB achieves decentralization by forming
an overlay network where high resource devices jointly manage a public Blockchain (BC) that
ensures end-to-end privacy and security. The overlay is organized as distinct clusters to
reduce overheads, and the cluster heads are responsible for managing the public BC. LSB
incorporates several optimizations, which include algorithms for lightweight consensus,
distributed trust, and throughput management. To ensure scalability, the overlay nodes are
organized as clusters and only the cluster heads (CH) are responsible for managing the public
BC. Technically speaking, this is a conventional approach that yields only limited
scalability. Massive scalability is not achievable this way because the approach is not
supported by underlying system-level technologies such as the file system.
   Zamani et al. [26] developed RapidChain, a sharding-based public blockchain protocol that
is resilient to Byzantine faults from up to a 1/3 fraction of its participants and achieves
complete sharding of the communication, computation, and storage overhead of processing
transactions without assuming any trusted setup. RapidChain employs an optimal intra-committee
consensus algorithm that can achieve very high throughputs via block pipelining, a novel
gossiping protocol for large blocks, and a provably-secure reconfiguration mechanism to ensure
robustness. Using an efficient cross-shard transaction verification technique, the protocol
avoids gossiping transactions to the entire network. The empirical evaluations suggest that
RapidChain can process (and confirm) more than 7,300 tx/sec with an expected confirmation
latency of roughly 8.7 seconds in a network of 4,000 nodes with an overwhelming
time-to-failure of more than 4,500 years. RapidChain focuses more on performance. Limited
effort was put into scalability, and the scalability it offers is logical, like the one
proposed in [9]. It therefore does not address the massive scalability limitation.
   Eyal et al. [14] proposed a protocol called Bitcoin-NG that is founded on several novel
metrics of interest in quantifying the security and efficiency of Bitcoin-like blockchain
protocols. The authors implemented Bitcoin-NG and performed large-scale experiments at 15% the
size of the operational Bitcoin system, using unchanged clients of both protocols. These
experiments demonstrate that Bitcoin-NG scales optimally, with bandwidth limited only by the
capacity of the individual nodes and latency limited only by the propagation time of the
network. The scalability, yet again, is achieved at the logical level, not at the physical
level. Zhang and Jacobsen proposed DCS properties (Decentralization, Consistency, and
Scalability) as an analogy to the CAP theorem. The authors provided a general structure of the
blockchain platform which decomposes the distributed ledger into six layers: Application,
Modeling, Contract, System, Data, and Network. Finally, they classified research angles across
three dimensions: DCS properties impacted, targeted applications, and related layers. The
proposed solution is yet again limited in terms of scalability. Guo et al. [13] proposed a
solution that relies on two key techniques: a fair contract partition algorithm leveraging
integer linear programming to partition a set of smart contracts into multiple subsets, and a
random assignment protocol assigning subsets randomly
to a subgroup of users. This is a logical model for gaining scalability, like those proposed
by all the other authors.
   To sum up, the research on Blockchain technology is still limited. Most of the research
focuses on the performance of blockchain applications. The scalability solutions proposed in
the literature so far deal with logical scalability, such as partitioning/sharding blocks.
However, this is not adequate to gain massive scalability, which needs physical-level
scalability achieved through system-level technologies such as distributed file systems.

        III. ELASTICBLOC – THE MASSIVELY SCALABLE BLOCKCHAIN SOLUTION
   This section provides a detailed description of the core contribution of this paper. It
begins with an overview of ElasticBloC, followed by a description of its high-level
architecture and a presentation of the solution workflow. The functionalities of ElasticBloC
are briefly explained and, finally, we explain the implementation of ElasticBloC.

A. A Brief Overview of ElasticBloC
   ElasticBloC is a solution for developing blockchain applications. It is a generic solution
that aims to support building all types of blockchain-based applications, such as asset
management, smart contracts, or notarization, in a scalable manner. Users can deploy these
applications on ElasticBloC, which performs the internal blockchain functions such as
generating blocks, adding blocks to the storage, and retrieving blocks. ElasticBloC guarantees
scalability at the physical level for blockchain applications.
   ElasticBloC relies on a cluster computing paradigm that underpins an ecosystem consisting
of a massive number of physical nodes (servers). As mentioned earlier, the focus of the
solution proposed in this paper is to achieve scalability at the physical layer instead of the
logical layer. Scalability at the logical layer can be achieved in many ways, such as
logically partitioning blocks and storing the partitions on different nodes within a cluster
that consists of a limited number of nodes. However, scalability at the physical level needs
an extensible architecture. The ElasticBloC architecture is extensible: it enables users to
add physical nodes on the fly or offline to enhance the capability to store any number of
blocks generated by transaction applications. To make extension easy, it reuses an existing
distributed file system that simplifies adding new nodes. This file system supports commodity
hardware; therefore, building a large cluster using ElasticBloC is cost-effective.
   Performance is another issue dealt with by ElasticBloC. The adopted file system supports
extreme parallelism, as it underlies the MapReduce programming model used by the applications
to read and write blocks. Furthermore, ElasticBloC adopted a technology that avoids
computationally expensive proof of work [48]. Proof-of-work based consensus protocols are also
slow, requiring up to an hour to reasonably confirm a payment to prevent double-spending.
ElasticBloC uses a technology that relies on a Byzantine consensus protocol, which reduces the
computational cost significantly. The core of this protocol is Byzantine Fault Tolerance (BFT)
[5]. The extreme parallelism provided by the file system, together with the BFT-based
consensus protocol, makes ElasticBloC highly performant.

B. Architecture of ElasticBloC
                         Fig. 1. ElasticBloC Architecture
   ElasticBloC is composed of several components. Fig. 1 depicts the architecture of
ElasticBloC. The components are briefly explained in the following:
  • Transaction Gateway: A connection component that makes it possible to discover and
    connect to a blockchain endpoint. It is also the channel through which users launch a
    transaction request.
  • Request Receiver: A front-end server that receives the HTTP requests.
  • Communication Interface: A standard interface for communication with Python
    applications. Since ElasticBloC is developed in the Python programming language, this
    component is critical for communicating with Python applications.
  • Operation Synthesizer: It handles various operations, including scaling up complex
    applications, object-relational mapping, request validation, authentication checking,
    and upload handling.
  • Blockchain Engine: One of the key components; it performs a multitude of tasks including
    handling events, data modeling, and operation orchestration.
  • Functional Interface: Another important component of ElasticBloC. It allows for Byzantine
    Fault Tolerant replication of applications written in any programming language. The
    consensus engine communicates with the application via a socket protocol that satisfies
    the functional interface. The interface consists of three primary message types that are
    delivered from the core to the application; the application replies with corresponding
    response messages:
    – DeliverTx Messages: Each transaction in the blockchain is delivered with this message.
    – CheckTx Messages: The CheckTx message is similar to DeliverTx, but it is only for
       validating transactions.
    – Commit Messages: The Commit message is used to compute a cryptographic commitment to
       the current application state, to be placed into the next block header.
  • Broadcasting Interface: It receives the HTTP POST request from the blockchain engine and
    broadcasts it to the blockchain repository.
  • Blockchain Repository: It securely and consistently replicates an application on many
    nodes. It works even if up to 1/3 of machines fail in arbitrary ways [1]. Every machine
    that is not faulty sees the same transaction log and computes the same state. Secure and
    consistent replication is a fundamental problem in distributed systems; it plays a
    critical role in the fault tolerance of a broad range of applications, from currencies, to
    elections, to infrastructure orchestration, and beyond.
    The ability to tolerate machines failing in arbitrary ways, including becoming malicious,
    is known as Byzantine fault tolerance (BFT) [5].
  • Blockchain Database Network: It is a network of four or more nodes.
  • Scalable Block Storage Cluster: This is the most important component; it extends the
    physical infrastructure to a massive number of nodes. It enormously increases the
    capability of storing blocks as records. It is founded on column-oriented databases that
    are supported by a distributed file system that can support building a blockchain lake
    consisting of thousands of nodes. The cluster consists of one or more master nodes and
    hundreds of data nodes that store the blocks. It is highly fault-tolerant because each
    block is replicated on three nodes (or more, depending on the user's preference). If any
    node is not functioning, two other nodes are available.
    In fact, it is not only a storage cluster: the column-oriented database also enables
    querying and managing blocks.

C. ElasticBloC Operational Methods
   ElasticBloC enables us to perform different blockchain operations using various methods
that fall into two categories: BigchainDB methods that are provided by the BigchainDB server,
and HBase connection methods for establishing a connection with BigchainDB and performing
various operations. We implemented all HBase connection methods within the scope of this
paper. These methods are presented in the following subsections.
  a) BigchainDB Methods
      ElasticBloC provides a library that allows the client to perform ElasticBloC
      functionalities. This library is called "bigchaindb_driver". Table I lists the major
      methods that a client can use to transact or operate in ElasticBloC.

                                      TABLE I
                          BIGCHAINDB DRIVER MAIN METHODS
Class                   Method            Parameters                          Description
BigchainDB              BigchainDB        *nodes                              Creates an instance of bigchaindb_driver which is able to
                                                                              create, sign, and send transactions to several nodes
BigchainDB              api_info          headers                             Retrieves the HTTP API details provided by the BigchainDB
                                                                              server
BigchainDB              info              headers                             Retrieves information about the connected node via the root
                                                                              endpoint, such as the server version and an overview of all
                                                                              the endpoints
Transactions Endpoint   fulfill           transaction, private_keys           Fulfills the given transaction
Transactions Endpoint   get               *, asset_id, operation, headers     Retrieves a list of transactions that have the specified asset
Transactions Endpoint   prepare           *, operation (CREATE or TRANSFER),  Prepares a transaction payload, ready to be fulfilled
                                          signers, recipients, assets,
                                          metadata, inputs
Transactions Endpoint   retrieve          transaction_id, headers             Retrieves the transaction of the given id
Transactions Endpoint   send              transaction, mode, headers          Sends a transaction to the first specified nodes
Outputs Endpoint        get               public_key, spent, headers          Retrieves transaction outputs by the public key
Assets Endpoint         get               *, search, limit, headers           Retrieves the assets that match the search text
Crypto                  generate_keypair  None                                Generates a cryptographic key pair

      Table I represents the interface of functions that a client can use to communicate with
      ElasticBloC. This interface simplifies working with such a modular architecture: once a
      client transacts or operates via these functions, the rest of the flow is automated,
      i.e., the operation flows through the required components automatically. The client
      therefore communicates with only one component.
      In the implementation of its methods, the BigchainDB driver calls the methods of the
      BigchainDB-HBase connector, explained in the next subsection, to perform any operation
      that accesses HBase, such as retrieving data
      (transactions, assets, metadata, etc.), checking the pre-existence of a newly submitted
      transaction, or storing a committed block with its details.
  b) BigchainDB-HBase Connector
      The BigchainDB-HBase connector is the core of our contribution. In this connector, we
      implemented a group of methods that BigchainDB can expose in order to connect to and
      operate with HBase as a backend database. Table II describes the methods implemented
      with the connector.

                                      TABLE II
                 BIGCHAINDB-HBASE CONNECTOR FUNCTIONALITIES METHODS
Method                Parameters                                      Description
connect               backend, host, port, name, connection_timeout   Creates a new connection to the backend database (HBase)
create_tables         connection, dbname                              Creates tables in HBase to be used by BigchainDB
delete_tables         connection, dbname                              Deletes the created tables in HBase
store_transaction     connection, signed_transaction                  Stores a transaction in Transactions table
store_transactions    connection, signed_transactions                 Stores a list of transactions in Transactions table
get_transaction       connection, transaction_id                      Gets a transaction from Transactions table

      Hence the preceding methods represent the interface of the connector that integrates
      the BigchainDB server with HBase.
                                                                    get transactions           connection,                Gets a list of transactions
D. Solution Workflow                                                                           transaction ids            from Transactions table
   ElasticBloC has a unique general workflow. In fact, this         store metadatas            connection,                Stores metadata in Meta-
                                                                                               metadata                   data table
workflow differs in its small parts according to the submitted
                                                                    get metadata               connection,                Gets metadata from Meta-
transaction mode or the nature of the desired operation. The                                   transaction ids            data table
diagram below shows the general workflow of ElasticBloC.            store asset                connection, asset          Stores asset in Assets ta-
Once the client has a valid transaction i.e. the transaction                                                              ble
                                                                    store assets               connection,                Stores a list of assets in
                                                                                               assets                     Assets table
                                                                    get asset                  connection,                Gets an asset from Assets
                                                                                               asset id                   table
                                                                    get assets                 connection,                Gets a list of assets from
                                                                                               asset ids                  Assets table
                                                                    store block                connection, block          Stores a block in Blocks
                                                                                                                          table
                                                                    get block                  connection,                Gets a block from Blocks
                                                                                               block id                   table
                                                                    get spent                  connection,                Check if a transaction id
                                                                                               transaction id,            was already used as
                                                                                               output index               an input. A transaction
                                                                                                                          can be used as an input
                                                                                                                          for another transaction.
                                                                                                                          Bigchain needs to make
                                                                                                                          sure that a given txid is
                                                                                                                          only used once Gets the
                   Fig. 2. ElasticBloC Workflow                                                                           spending transaction
                                                                    get latest block           Connection                 Gets the latest committed
                                                                                                                          block
conforms to the BigchainDB Transactions Specification, he           get txids filtered         connection,                Gets all transactions for
submits it to one or more ElasticBloC nodes through the                                        asset id,                  a particular asset id and
BigchainDB HTTP API [17]. In particular, it embeds the                                         operation                  optional operation
                                                                    get owned ids              connection,                Gets a list of transactions
transaction in an HTTP request and specifies one of the                                        owner                      ids we can use which has
predefined ends points to send through. These endpoints are:                                                              inputs
   • POST /API/v1/transactions                                      get spending               connection,                Gets transactions which
                                                                     transactions              inputs                     spend given inputs
   • POST /API/v1/transactions?mode=async
                                                                    get block with             connection,                Gets block holding a spe-
   • POST /API/v1/transactions?mode=sync                             transaction               transaction id             cific transaction
   • POST /API/v1/transactions?mode=commit                          delete transaction         connection,                Deletes a transaction from
                                                                                               transaction id             database and its relevant
   After that, the HTTP request holding the transaction arrives                                                           asset and metadata
at the BigchainDB node at the Gunicorn [16] web server              delete transactions connection,                       Deletes transactions from
in that node. Then Gunicorn forwards the request towards                                transaction ids                   database and their relevant
                                                                                                                          assets and metadata
the BigchainDB server using its exposed Web Server Gate-            delete latest block connection                        Delete the latest commited
way Interface (WSGI). The request reaches the BigchainDB                                                                  block
server through the Flask web application development frame-         store unspent              connection,                Stores unspent outputs in
                                                                     outputs                   unspent outputs            utxos table
work which simplifies working with WSGI/Gunicorn. The
                                                                    get unspent                connection,    *,          Gets unspent outputs
BigchainDB server uses a Python method to check the trans-           outputs                   query
action’s validity. If the transaction is not valid, then the HTTP   delete unspent             connection,                Deletes unspent outputs
response status code is 400 which means error. Otherwise, it         outputs                   unspent outputs            from utxos table
                                                                    store pre commit           Connection, state          Stores pre commit state
is put into a new JSON string and sent to the local Tendermint       state
instance via Tendermint Broadcast API.                              get pre commit             Connection,                Gets pre commit state of
                                                                     state                     commit id                  a commit id
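The submission step described above can be sketched as follows. This is a minimal illustration, not part of ElasticBloC itself: the helper name `build_submit_request`, the lowercase path prefix `/api/v1`, and the node address `http://localhost:9984` used in the usage note are assumptions made for the example. The function only builds the HTTP request; it does not send it.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# The three explicit modes listed among the endpoints in the text.
MODES = ("async", "sync", "commit")


def build_submit_request(node_url, signed_tx, mode=None, api_prefix="/api/v1"):
    """Build (but do not send) the POST request that carries a transaction.

    `api_prefix` is an assumed default; adjust it to match the node.
    """
    if mode is not None and mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    url = node_url.rstrip("/") + api_prefix + "/transactions"
    if mode is not None:
        # Append ?mode=async|sync|commit as in the endpoint list.
        url += "?" + urlencode({"mode": mode})
    body = json.dumps(signed_tx).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

For example, `build_submit_request("http://localhost:9984", tx, mode="commit")` yields a request whose URL ends in `/transactions?mode=commit`; passing it to `urllib.request.urlopen` would submit it to a running node, which answers with status 400 if the transaction is invalid.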




   Now, the operations between the local Tendermint instance
and BigchainDB are carried out through the Application
Blockchain Interface (ABCI), which is an integral part of
Tendermint and is also implemented on the BigchainDB server
side. Tendermint uses the broadcast endpoint that corresponds
to the mode chosen in the initial BigchainDB request. For
example, if a client sent a transaction through the
/api/v1/transactions?mode=commit endpoint, Tendermint uses the
/broadcast_tx_commit endpoint. Tendermint stores the initially
validated transactions in its own mempool (memory pool). When
it decides to create a block, Tendermint sends the creation
request to BigchainDB through a specific ABCI method. Then,
using another ABCI method, it sends the initially validated
transactions that are to be grouped in the new block to
BigchainDB, which rechecks the validity of each transaction
before it is added to the block.
   The proposed block is then broadcast to the network by
Tendermint, which makes sure that all the nodes agree on this
block in a Byzantine fault-tolerant way. When the network
agrees on a new block, Tendermint appends the new block to the
blockchain in its local LevelDB, and the BigchainDB server
receives a commit message instructing it to write the new block
and the included transactions, assets, and metadata separately
to the HBase repository. HBase then writes these data into the
underlying Hadoop Distributed File System. The same process is
carried out at each node in the ElasticBloC network.

E. Implementation of ElasticBloC
   As mentioned earlier, our goal is to use existing technologies
for building ElasticBloC. The reason is two-fold: we avoid
reinventing technology that already exists, and it would be
impractically ambitious to develop a complex architecture like
ElasticBloC from scratch. We developed ElasticBloC using the
most advanced technologies available. The main technologies are
briefly described in the following:
  a) Tendermint
     Tendermint is a secure state-machine replication
     algorithm in the blockchain paradigm. It provides a form
     of BFT-ABC (Atomic Broadcast) that is furthermore
     accountable: if safety is violated, it is always possible
     to verify who acted maliciously [3].
     Tendermint begins with a set of validators, identified by
     their public keys, where each validator is responsible for
     maintaining a full copy of the replicated state, for
     proposing new blocks (batches of transactions), and for
     voting on them [20]. Each block is assigned an incrementing
     index, or height, such that a valid blockchain has only
     one valid block at each height. At each height, validators
     take turns proposing new blocks in rounds, such that for
     any given round there is at most one valid proposer. It
     may take multiple rounds to commit a block at a given
     height due to the asynchrony of the network, and the
     network may halt altogether if one-third or more of the
     validators are offline or partitioned [3]. Validators engage
     in two phases of voting on a proposed block before it
     is committed, and follow a simple locking mechanism
     which prevents any malicious coalition of less than
     one-third of the validators from compromising safety [2].
     In ElasticBloC, the broadcast interface and the blockchain
     repository (see Fig. 1) are implemented using Tendermint.
  b) BigchainDB
     The Blockchain Engine of ElasticBloC is implemented
     using BigchainDB, which provides database-style
     decentralized storage: a blockchain database. BigchainDB
     combines the key benefits of distributed DBs and
     traditional blockchains, with an emphasis on scale [21].
     BigchainDB is built on top of an enterprise-grade
     distributed DB, from which it inherits high throughput,
     high capacity, low latency, a full-featured NoSQL query
     language, and permissioning. Nodes can be added to
     increase throughput and capacity.
  c) HBase
     The scalable block storage cluster (see Fig. 1) is
     implemented using HBase. Although BigchainDB aims at
     increasing scalability, massive scalability could not be
     achieved using BigchainDB alone.
     Therefore, we implemented the scalable block storage
     using HBase. HBase [15] is modeled on Google’s
     BigTable database [6]. HBase provides a distributed,
     fault-tolerant, scalable database, built on top of the HDFS
     file system [24], with random real-time read/write access
     to data. Each HBase table is stored as a multidimensional
     sparse map, with rows and columns, each cell having a
     timestamp [6]. A cell value at a given row and column
     is uniquely identified by:
     (Table, Row, Column-Family: Column, Timestamp) ⇒ Value
     HBase has its own Java client API, and tables in HBase
     can be used both as an input source and as an output
     target for MapReduce jobs through TableInputFormat and
     TableOutputFormat. HBase has no single point of failure.
     HBase uses Zookeeper [27], another Hadoop subproject,
     for the management of partial failures.
     The HBase connector, which is the primary contribution
     of this paper, was implemented in Python. The
     first step of building the connector was to identify the
     tables that are needed to store the architecture data
     (blocks, transactions, assets, ...). The second step was
     to write a file that opens a connection to HBase, based
     on the connection parameters and values given by the
     BigchainDB configuration file, and returns an instance of
     this connection.
     Next, the schema file was written. The schema
     file defines and creates the database schema in HBase
     once BigchainDB is initialized. After that, the required
     querying methods were implemented. Some of these
     methods retrieve data; others store, update, or delete
     data. In addition, some web application development
     tools were used during development.
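The connector-building steps above (a connection, a schema, then query methods) can be illustrated with a small in-memory stand-in. This is only a sketch under stated assumptions and not the authors' connector: a dictionary replaces HBase, and the table list, the `dbname:table` naming, and the default host, port, and database name are hypothetical. The function names and parameters follow the connector methods listed in Table II.

```python
class InMemoryConnection:
    """Hypothetical stand-in for an HBase connection.

    Maps a table name to a dict of {row key: stored document}.
    """

    def __init__(self, host, port, name):
        self.host, self.port, self.name = host, port, name
        self.tables = {}


# Illustrative table list; the real schema file may define more tables.
TABLES = ("transactions", "assets", "metadata", "blocks", "utxos")


def connect(host="localhost", port=9090, name="bigchain"):
    """Step 2: return a connection built from configuration values (assumed defaults)."""
    return InMemoryConnection(host, port, name)


def create_tables(connection, dbname):
    """Step 3: create the schema once, leaving existing tables untouched."""
    for table in TABLES:
        connection.tables.setdefault(f"{dbname}:{table}", {})


def store_transaction(connection, signed_transaction):
    """Step 4 (store): key the transaction by its id."""
    table = connection.tables[f"{connection.name}:transactions"]
    table[signed_transaction["id"]] = signed_transaction


def get_transaction(connection, transaction_id):
    """Step 4 (retrieve): return the transaction, or None if it is unknown."""
    return connection.tables[f"{connection.name}:transactions"].get(transaction_id)


def get_spent(connection, transaction_id, output_index):
    """Return stored transactions whose inputs spend the given output."""
    spenders = []
    for tx in connection.tables[f"{connection.name}:transactions"].values():
        for tx_input in tx.get("inputs", []):
            fulfills = tx_input.get("fulfills") or {}
            if (fulfills.get("transaction_id") == transaction_id
                    and fulfills.get("output_index") == output_index):
                spenders.append(tx)
    return spenders
```

A real connector would replace the dictionary operations with calls to an HBase client; the scan in `get_spent` is deliberately naive and only shows what the double-spend check has to find.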


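The HBase cell addressing scheme described above, (Table, Row, Column-Family: Column, Timestamp) ⇒ Value, can be pictured as a sparse map with versioned cells. The class below is a toy model for illustration only; its name and methods are hypothetical and it does not use any HBase client API. One instance plays the role of one table.

```python
from collections import defaultdict


class SparseTable:
    """Toy model of one HBase table as a sparse, versioned map."""

    def __init__(self):
        # (row, "family:column") -> {timestamp: value}; absent cells store nothing.
        self._cells = defaultdict(dict)

    def put(self, row, column, timestamp, value):
        """Write one version of a cell."""
        self._cells[(row, column)][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Read a cell: the newest version by default, or an exact timestamp."""
        versions = self._cells.get((row, column))
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)
        return versions.get(timestamp)
```

Writing the same row and column twice with different timestamps keeps both versions, and a read without a timestamp returns the most recent one.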

F. Experiments & Results
   This section describes the experiments that we conducted
on ElasticBloC and discusses their results. The goal of the
experiments is to evaluate two characteristics of ElasticBloC:
scalability and performance. We conducted an initial loading
experiment, a functionality experiment, and a scalability
experiment, each with its results, followed by a performance
evaluation of ElasticBloC.
  a) Initial Loading Experiment
     • Purpose:
        Test the start-up and initialization of the
        whole architecture.
     • Requirements:
        The required steps are to run the ElasticBloC
        components and check the connectivity between these
        components. The following summarizes these steps:
          – Run the Hadoop cluster and HBase.
          – Run the BigchainDB server.
          – Run the Tendermint instance.
     • Results:
        The architecture components ran successfully and the
        connection between the components was established.
        Once it was established, BigchainDB executed the schema
        file in the implemented connector and created the
        needed tables in HBase. The following screenshots
        show some results of the successful initial start-up.

                   Fig. 3. BigchainDB Start-up.

                   Fig. 4. BigchainDB Web Interface After Startup.

                   Fig. 5. Tendermint Start-up.

        As we can see above, the components ran successfully,
        and Tendermint opened the required sockets and
        established the ABCI handshake. It compares the
        application’s highest height and the application hash
        to those stored in HBase in order to confirm that the
        data is the same.
  b) Functionality Experiment
     • Purpose: Test whether ElasticBloC performs operations
        normally.
     • Required Steps: The functionality of ElasticBloC is
        tested by writing and executing a Python script that
        makes use of the bigchaindb driver library.
        The required steps are:
          – Run the ElasticBloC components.
          – Write a Python script that creates a transaction,
            fulfills it with the sender’s private key, and sends
            the transaction in commit mode. The following
            figure shows the testing Python script.

                   Fig. 6. The Experiment’s Python Script

     • Results
        The transactions were successfully created and sent.
        Because the transactions were sent in commit mode,
        BigchainDB directly created a block for the
        transaction. This block was appended to the Tendermint
        local copy of the blockchain, and the block, including
        its transaction, assets, and metadata, was stored in the
        corresponding tables in HBase.
        Fig. 7 shows the created block in the blockchain stored
        at Tendermint.
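The steps of the functionality experiment (create a transaction, fulfill it with the sender's private key, send it in commit mode) could look roughly like the sketch below. It is not the authors' script from Fig. 6, which is not reproduced in the text; the function name and its parameters are hypothetical. The function takes the driver object as an argument so that the sequence of calls is visible without requiring a running node.

```python
def run_functionality_test(bdb, public_key, private_key, asset, metadata=None):
    """Create, fulfill, and send one CREATE transaction in commit mode.

    `bdb` is expected to behave like a bigchaindb_driver.BigchainDB instance.
    """
    # 1. Create the transaction.
    prepared = bdb.transactions.prepare(
        operation="CREATE",
        signers=public_key,
        asset=asset,
        metadata=metadata,
    )
    # 2. Fulfill (sign) it with the sender's private key.
    fulfilled = bdb.transactions.fulfill(prepared, private_keys=private_key)
    # 3. Send it in commit mode, i.e. wait until it is committed in a block.
    return bdb.transactions.send_commit(fulfilled)


# Usage with the real driver (assumes a node at this address):
#   from bigchaindb_driver import BigchainDB
#   from bigchaindb_driver.crypto import generate_keypair
#   bdb = BigchainDB("http://localhost:9984")
#   sender = generate_keypair()
#   run_functionality_test(bdb, sender.public_key, sender.private_key,
#                          asset={"data": {"name": "test asset"}})
```

Passing the driver in, rather than constructing it inside the function, also makes it easy to point the same script at different ElasticBloC nodes.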




      Fig. 7. The Appended Block in Tendermint Blockchain.

      Fig. 8. The Appended Block and Its Details in HBase.

     Fig. 8 shows the result of the submitted transaction
     with its details retrieved from HBase.
   • Discussion
     The experiment was conducted successfully and the
     architecture works well.
     One of the positive aspects of the architecture is that
     the creation, fulfillment, sending, and validation of the
     transaction, together with the block creation, its
     appending to the blockchain, and its storage in HBase,
     took only around one second.
     One major limitation is that these experiments were
     conducted on a limited blockchain dataset. The reason is
     that we had neither access to a large blockchain dataset
     nor time to build our own large dataset. Instead, we take
     into consideration previous benchmarks and experiments
     that were done using large volumes of data, which help
     us to evaluate our architecture.
c) Scalability Experiment
   • Purpose
     Test whether ElasticBloC can scale massively.
   • Required Steps
     Massive scalability means that ElasticBloC has the
     ability to scale as much as it needs on its physical layer.
     This can be confirmed if we succeed in adding new
     data nodes to the ElasticBloC node. To do that, we tried
     to add new Hadoop nodes to the Hadoop cluster both
     while ElasticBloC was running and while it was offline.
   • Results
     The experiment was conducted successfully, and
     ElasticBloC ran normally.

      Fig. 9. The New Data nodes Cluster.

   • Discussion
     As ElasticBloC scales by adding a new server to the
     Hadoop cluster, it is feasible to add data nodes either
     on the fly or offline. This means that ElasticBloC has the
     ability to scale massively.
d) Performance Evaluation
   We did not have access to a large blockchain dataset and
   would have had to build one ourselves. For this reason, we
   could not test ElasticBloC on a massive-scale dataset.
   Accordingly, in this section we rely on previously conducted
   experiments and benchmarks that give us, theoretically, a
   clear idea of the performance of the overall architecture.
   The overall performance of ElasticBloC is evaluated through
   its components, in particular the performance of Tendermint
   as a consensus engine and HBase as a backend database.
   Tendermint performs well in large distributed
   environments. According to the Cosmos white paper [18]:
   “Despite its strong guarantees, Tendermint provides
   exceptional performance. In benchmarks of 64 nodes
   distributed across 7 data centers on 5 continents, on
   commodity cloud instances, Tendermint consensus can
   process thousands of transactions per second, with commit
   latencies on the order of one to two seconds. Notably, the
   performance of well over thousands of transactions per
   second is maintained even in harsh adversarial conditions,
   with validators crashing or broadcasting maliciously
   crafted votes.”
   On the other hand, building MongoDB on top of HDFS
   is less efficient than building HBase on top of the
   same file system. The reason is that HBase was natively
   developed to run on top of HDFS, while MongoDB needs a
   third-party connector to be built on top of HDFS.
   Moreover, based on several benchmarks, such as [11] and
   [12], HBase acts more efficiently than MongoDB in large




clusters. For instance, End Point [23] performed a series
of tests of the performance of several NoSQL databases
including HBase and MongoDB. The following are some
comparison results for the performance of HBase and
MongoDB in different tests based on [11].

                               TABLE III
               THROUGHPUT COMPARISON WHILE LOADING DATA
      Number of Nodes    HBase (operations/sec)    MongoDB (operations/sec)
      1                  15617.98                  8368.44
      2                  23373.93                  13462.51
      4                  38991.82                  18038.49
      8                  74405.64                  34305.30
      16                 143553.41                 73335.62
      32                 296857.36                 134968.87

                               TABLE IV
              THROUGHPUT COMPARISON WHILE RETRIEVING DATA
      Number of Nodes    HBase (operations/sec)    MongoDB (operations/sec)
      1                  428.12                    2149.08
      2                  1381.06                   2588.04
      4                  3955.03                   2752.40
      8                  6817.14                   2165.17
      16                 16542.11                  7782.36
      32                 20020.73                  6983.82

                                TABLE V
                   THROUGHPUT IN BALANCED READ/WRITE
      Number of Nodes    HBase (operations/sec)    MongoDB (operations/sec)
      1                  527.47                    1278.81
      2                  1503.09                   1441.32
      4                  4175.8                    1501.06
      8                  7725.94                   2195.92
      16                 16381.78                  1230.96
      32                 20177.71                  2335.14

                               TABLE VI
                THROUGHPUT IN READ/UPDATE/WRITE OPERATIONS
      Number of Nodes    HBase (operations/sec)    MongoDB (operations/sec)
      1                  324.8                     1261.94
      2                  961.01                    1480.72
      4                  2749.35                   1754.30
      8                  4582.67                   2028.06
      16                 10259.63                  1114.13
      32                 16739.51                  2363.69

                               TABLE VII
   THROUGHPUT COMPARISON IN MIXED OPERATIONAL AND ANALYTICAL WORKLOAD
      Number of Nodes    HBase (operations/sec)    MongoDB (operations/sec)
      1                  269.30                    939.01
      2                  333.12                    30.96
      4                  1228.61                   10.55
      8                  2151.74                   39.28
      16                 5986.65                   337.4
      32                 8936.18                   227.80

The results above are tangible evidence that HBase is more
efficient than MongoDB, particularly as the number of nodes
grows.
Hence, according to theory and pre-existing experiments,
HBase would also enhance the overall performance of
ElasticBloC. Even in the worst case, replacing MongoDB
with HBase will not degrade the performance of
ElasticBloC.

G. Conclusion & Future Work
   Blockchain has not yet been widely adopted except in
cryptocurrency applications; however, it has been identified
as a potential technology for several areas that need trust and
security.
   It is a relatively new technology that needs various
improvements to reach maturity. It has several limitations that
are the main barriers to its wider adoption. Of these,
scalability and performance are the major limitations
that must be addressed.
   This research primarily aims at addressing the scalability
problem. There are some solutions that offer techniques,
methods, and guidelines for scalable blockchain. However, we
found that state-of-the-art technologies focus on scalability at
the logical level, which is an inadequate approach if scalability
at the physical level is to be guaranteed. In this paper,
we designed and implemented a scalable architecture called
ElasticBloC, which enables users to build a highly scalable
blockchain-based ecosystem consisting of tens of physical
nodes or more.
   In this paper, we proposed a solution for the scalability
limitation of blockchain technology. We discussed our solution,
ElasticBloC, which is a scalable architecture on which
blockchain-based applications such as payment systems,
notarization, and smart contracts can be implemented while
ensuring scalability. ElasticBloC is built on the cluster
computing paradigm, which builds infrastructure from a massive
number of nodes. We presented the components with a detailed
description. We discussed the workflow of ElasticBloC. We
discussed the technologies that we used in implementing the
proposed solution.
   We tested ElasticBloC to evaluate its scalability. Our
experiment shows that ElasticBloC has the ability to scale out:
servers can be added as needed. We discussed the results
of our experiments in this paper. Additionally, we provided




some previous benchmarks to provide a comparative view of the
abilities of ElasticBloC.
   Several extensions of ElasticBloC are planned as future work. The
most critical task in the near future is extending the functional
capabilities of our solution. We plan to add new modules to
ElasticBloC that enable users to develop both permissionless and
permissioned blockchain-based applications.

                              REFERENCES
 [1] Anonymous. Retrieved from
     https://tendermint.readthedocs.io/en/latest/introduction.html.
 [2] Branden, J. V. Building a Performance Model of the Tendermint
     Consensus Algorithm.
 [3] Buchman, E. (2016). Tendermint: Byzantine fault tolerance in the age
     of blockchains (Doctoral dissertation).
 [4] Buterin, V. (2014). A next-generation smart contract and decentralized
     application platform.
 [5] Castro, M., & Liskov, B. (2003). U.S. Patent No. 6,671,821. Washington,
     DC: U.S. Patent and Trademark Office.
 [6] Chang, F., Dean, J., Ghemawat, S., Hsieh, W. C., Wallach, D. A.,
     Burrows, M., ... & Gruber, R. E. (2008). Bigtable: A distributed storage
     system for structured data. ACM Transactions on Computer Systems
     (TOCS), 26(2), 4.
 [7] Condos, J., Sorrell, W. H., & Donegan, S. L. (2016). Blockchain
     technology: Opportunities and risks.
 [8] DBS Group Research. (2016). Understanding blockchain technology and
     what it means for your business.
 [9] Dorri, A., Kanhere, S. S., Jurdak, R., & Gauravaram, P. (2017). LSB: A
     Lightweight Scalable BlockChain for IoT Security and Privacy. arXiv
     preprint arXiv:1712.02969.
[10] Ehmke, C., Wessling, F., & Friedrich, M. C. (2018). Proof-of-property: a
     lightweight and scalable blockchain protocol. In Proceedings of the 1st
     International Workshop on Emerging Trends in Software Engineering
     for Blockchain (pp. 48-51). ACM.
[11] End Point. (2015). Benchmarking Top NoSQL Databases: Apache
     Cassandra, Couchbase, HBase, and MongoDB.
[12] Gandini, A., Knottenbelt, W. J., Osman, R., & Piazolla, P. (n.d.).
     Performance evaluation of NoSQL databases.
[13] Gao, Z., Xu, L., Chen, L., Shah, N., Lu, Y., & Shi, W. (2017, December).
     Scalable blockchain based smart contract execution. In Parallel and
     Distributed Systems (ICPADS), 2017 IEEE 23rd International Conference
     on (pp. 352-359). IEEE.
[14] Gencer, E. A., Sirer, G. E., Van Renesse, R., & Eyal, I. (2016).
     Bitcoin-NG: A Scalable Blockchain Protocol. In NSDI.
[15] George, L. (2011). HBase: the definitive guide: random access to your
     planet-size data. O’Reilly Media, Inc.
[16] Gunicorn - Python WSGI HTTP Server for UNIX. (n.d.). Retrieved from
     https://gunicorn.org.
[17] The HTTP Client-Server API — BigchainDB Server 0.8.2 documentation.
     (n.d.). Retrieved from
     http://docs.bigchaindb.com/projects/server/en/v0.8.2/drivers-clients/http-client-server-api.html.
[18] Internet of Blockchains - Cosmos Network. (n.d.). Retrieved from
     https://cosmos.network/resources/whitepaper.
[19] James-Lubin, K. (2015, January 22). Blockchain scalability. Retrieved
     from https://www.oreilly.com/ideas/blockchain-scalability.
[20] Kwon, J. (2014). Tendermint: Consensus without Mining.
[21] McConaghy, T., Marques, R., Müller, A., De Jonghe, D., McConaghy,
     T., McMullen, G., ... & Granzotto, A. (2016). BigchainDB: a scalable
     blockchain database. White paper, BigchainDB.
[22] Out of Asia. (2017, December 27). Five Issues Preventing
     Blockchain From Going Mainstream: The Insanely Popular
     Crypto Game Etheremon Is One Of Them. Retrieved from
     https://www.forbes.com/sites/outofasia/2017/12/22/five-issues-preventing-blockchain-from-going-mainstream-the-insanely-popular-crypto-game-etheremon-is-one-of-them/#6d364bb66fad.
[23] Secure Business Solutions — End Point. (n.d.). Retrieved from
     http://www.endpoint.com/.
[24] Shvachko, K., Kuang, H., Radia, S., & Chansler, R. (2010, May). The
     Hadoop distributed file system. In Mass Storage Systems and Technologies
     (MSST), 2010 IEEE 26th Symposium on (pp. 1-10). IEEE.
[25] Vukolić, M. (2015, October). The quest for scalable blockchain fabric:
     Proof-of-work vs. BFT replication. In International Workshop on Open
     Problems in Network Security (pp. 112-125). Springer, Cham.
[26] Zamani, M., Movahedi, M., & Raykova, M. (n.d.). RapidChain: A Fast
     Blockchain Protocol via Full Sharding.
[27] Apache ZooKeeper. (2017). The Apache Software Foundation. Accessed
     December 29, 2017.



