=Paper=
{{Paper
|id=Vol-2622/paper11
|storemode=property
|title=ElasticBloC: A Massively Scalable Architecture For Blockchain Based Applications
|pdfUrl=https://ceur-ws.org/Vol-2622/paper11.pdf
|volume=Vol-2622
|authors=Hadi Jibbawi,Rafiqul Haque,Yehia Taher,Ali Jaber
|dblpUrl=https://dblp.org/rec/conf/bdcsintell/JibbawiHTJ19
}}
==ElasticBloC: A Massively Scalable Architecture For Blockchain Based Applications==
Hadi Jibbawi, Lebanese University, Beirut, Lebanon, hadi.jibbawi@gmail.com

Yehia Taher, Université de Versailles – Paris-Saclay, Versailles, France, yehia.taher@uvsq.fr

Rafiqul Haque, Intelligencia R&D, Paris, France, Rafiqul.Haque@intelligencia.fr

Ali Jaber, Lebanese University, Beirut, Lebanon, ali.jaber@ul.edu.lb
Abstract: Blockchain is an emerging technology that could disrupt existing centralized financial systems and lead to a new technology era for the financial sector. Additionally, new use cases such as healthcare and identity management suggest that Blockchain has much wider applications. Blockchain is founded on distributed ledger technology that ensures trust through consensus between parties in a peer-to-peer network, without the need for a third party or central authority. However, blockchain has several limitations, such as scalability, latency, and low throughput, which are the main barriers to its adoption by industry. Of these, scalability is the most critical limitation and needs an efficient and effective solution. In this paper, we aim to enhance the scalability of blockchain by designing and implementing a massively scalable architecture for private blockchain-based applications, called ElasticBloC. To evaluate our contribution, we conducted several experiments on ElasticBloC. The results showed that ElasticBloC is a high-performance architecture that scales massively.

Index Terms: Blockchain, Performance, Scalability

I. INTRODUCTION

Over the last few years, Blockchain has drawn considerable attention from industry experts, technology evangelists, and academic researchers due to its potential, which is described in a large body of literature and other sources such as blogs and forums. Many researchers and industry experts have argued that it is a revolutionary technology, like the Internet, that will provide a highly efficient way to transact in a secure, immutable, transparent, and auditable manner. In 2016, it was one of the technologies that reached the peak of inflated expectations; since then, interest in blockchain has spread to many types of industries. In particular, interest in Blockchain technology within the financial industry started to grow, as it might enable the industry to avoid financial debacles such as the Heartland Payment Systems data breach of 2008.

Security is an ever-growing concern in the financial industry. With the advent of digital financial information systems and rich transaction technologies, operations have become faster and operational activities have expanded widely; at the same time, the risk of information breaches has increased enormously. Although advanced encryption technologies enable encrypted transmission of financial data between financial actors (e.g., banks, insider intruders), security remains a problem because cryptographic algorithms are still vulnerable to many attacks launched by adversaries. Blockchain was deemed a major breakthrough technology that would address some unsolved security issues. Therefore, wide adoption of Blockchain was forecast by many, such as Gartner, not only within the financial industry but also in other industries.

Furthermore, trust is critical in communication between two or more parties. Historically and until now, trust has been achieved mostly through a third party, such as a bank or an authority, that holds our data. In other words, communicating parties rely on a common ledger that is held and managed by this third party. As a typical example, clearing houses validate and manage the communication between trading parties. Blockchain technology offers a potential alternative. Its disruptive aspect is that it eliminates the need for intermediaries while performing transactions [7]. Hence, it can empower groups of parties to agree on events without needing a third party; such is the promise of this new technology [8].

Our study revealed several limitations of Blockchain technologies, as mentioned earlier. However, the major concerns are two-fold: scalability and performance. Scalability is considered the critical drawback of blockchain technology. In fact, it is the first limitation that must be addressed to make blockchain an acceptable technology. The reason is obvious. Consider a Walmart payment system that processes more than 250 transactions every hour. Since Blockchain technology
Copyright © 2019 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).
replicates blocks in different consensus servers hosted in different locations to increase trust and guarantee security against any odd modification, Blockchain requires a scalable infrastructure not only at the operational level but also at the physical level. This is because blockchain technologies are founded on the immutability principle, which means that a block, once created, cannot be updated or replaced by a new block. New blocks can only be appended, which requires physical scalability, especially for retail companies like Walmart, banks, and high-end manufacturing companies.

In existing blockchain protocols such as Ethereum, Bitcoin, Ripple, and Tendermint, each participating node must process every transaction in the network. As a result, nodes need more storage, bandwidth, and computation power as the blockchain expands. Indeed, this technology will lose its decentralization, because the blockchain will reach a limit at which only specific nodes can process a block [19]. According to Vitalik Buterin, with the current design of blockchain technology, scalability cannot be achieved because the design focuses on decentralization and security, not scalability. While a decentralized consensus mechanism offers some critical benefits, such as fault tolerance, a strong guarantee of security, political neutrality, and authenticity, it comes at the cost of scalability. Existing solutions, some of which are discussed in Section 2, vary in their approach: some scale only up to a limit, while others degrade performance [22].

Realizing the significance of a massively scalable infrastructure for Blockchain technology, which existing solutions do not provide, in this paper we developed a scalable Blockchain technology in which scalability is strongly correlated with performance, which in turn has an impact on blockchain efficiency.

The remainder of this paper is organized as follows. Section 2 discusses works related to the core issue of this paper. We explain our solution, ElasticBloC, in Section 3. We report our experiments with ElasticBloC and discuss the results in Section 4. We present a conclusion in Section 5.

II. RELATED WORKS

This research revolves around the scalability issue of blockchain technology. In this section, we review works related to this issue. Ehmke et al. [10] proposed a solution that builds on Ethereum's idea of keeping the state of the system explicitly in the current block, and pursues it further by including the relevant part of the current system state in new transactions as well. This enables other participants to validate incoming transactions without having to download the whole blockchain initially. The scalability in the proposed solution is at the logical level; that is, the authors developed techniques extending the Merkle Patricia Tree [4]. In [9], Dorri et al. proposed a tiered Lightweight Scalable Blockchain (LSB) that is optimized for IoT requirements. The authors explored LSB in a smart home setting as a representative example of broader IoT applications. LSB achieves decentralization by forming an overlay network where high-resource devices jointly manage a public Blockchain (BC) that ensures end-to-end privacy and security. LSB incorporates several optimizations, which include algorithms for lightweight consensus, distributed trust, and throughput management. To reduce overheads and ensure scalability, the overlay nodes are organized as distinct clusters, and only the cluster heads (CH) are responsible for managing the public BC. Technically speaking, this is a conventional approach to gaining scalability, and it is very limited. It cannot achieve massive scalability because it is not supported by underlying system-level technologies such as the file system.

Zamani et al. [26] developed RapidChain, a sharding-based public blockchain protocol that is resilient to Byzantine faults from up to a 1/3 fraction of its participants and achieves complete sharding of the communication, computation, and storage overhead of processing transactions without assuming any trusted setup. RapidChain employs an optimal intra-committee consensus algorithm that can achieve very high throughput via block pipelining, a novel gossiping protocol for large blocks, and a provably secure reconfiguration mechanism to ensure robustness. Using an efficient cross-shard transaction verification technique, the protocol avoids gossiping transactions to the entire network. The empirical evaluations suggest that RapidChain can process (and confirm) more than 7,300 tx/sec with an expected confirmation latency of roughly 8.7 seconds in a network of 4,000 nodes, with an overwhelming time-to-failure of more than 4,500 years. RapidChain is focused more on performance. Limited effort was put into scalability, and the scalability it offers is logical, like that proposed in [9]. It does not address the massive scalability limitation.

Eyal et al. [14] proposed a protocol called Bitcoin-NG that is founded on several novel metrics of interest in quantifying the security and efficiency of Bitcoin-like blockchain protocols. The authors implemented Bitcoin-NG and performed large-scale experiments at 15% of the size of the operational Bitcoin system, using unchanged clients of both protocols. These experiments demonstrate that Bitcoin-NG scales optimally, with bandwidth limited only by the capacity of the individual nodes and latency limited only by the propagation time of the network. Yet again, scalability is achieved at the logical level, not at the physical level. Zhang and Jacobsen proposed the DCS properties (Decentralization, Consistency, and Scalability) as an analogy to the CAP theorem. The authors provided a general structure of the blockchain platform that decomposes the distributed ledger into six layers: Application, Modeling, Contract, System, Data, and Network. Finally, the authors classify research angles across three dimensions: DCS properties impacted, targeted applications, and related layers. The proposed solution is yet again limited in terms of scalability. Guo et al. [13] proposed a solution that relies on two key techniques: a fair contract partition algorithm leveraging integer linear programming to partition a set of smart contracts into multiple subsets, and a random assignment protocol assigning subsets randomly
to a subgroup of users. This is a logical model for gaining scalability, like those proposed by the other authors.

To sum up, research on Blockchain technology is still limited. Most of it focuses on the performance of blockchain applications. The scalability solutions proposed in the literature so far deal with logical scalability, such as partitioning or sharding blocks. However, this is not adequate to gain massive scalability, which needs physical-level scalability achieved through system-level technologies such as distributed file systems.

III. ELASTICBLOC: THE MASSIVELY SCALABLE BLOCKCHAIN SOLUTION

This section provides a detailed description of the core contribution of this paper. It begins with an overview of ElasticBloC, followed by a description of its high-level architecture and a presentation of the solution workflow. The functionalities of ElasticBloC are briefly explained and, finally, its implementation is described.

A. A Brief Overview of ElasticBloC

ElasticBloC is a solution for developing blockchain applications. It is a generic solution that aims to support building all types of blockchain-based applications, such as asset management, smart contracts, or notarization, in a scalable manner. Users can deploy these applications on ElasticBloC, which performs the internal blockchain functions such as generating blocks, adding blocks to the storage, and retrieving blocks. ElasticBloC guarantees scalability at the physical level for blockchain applications.

ElasticBloC relies on a cluster computing paradigm that underpins developing an ecosystem consisting of a massive number of physical nodes (servers). As mentioned earlier, the focus of the solution proposed in this paper is to achieve scalability at the physical layer instead of the logical layer. Scalability at the logical layer can be achieved in many ways, such as logically partitioning blocks and then storing the partitions in different nodes within a cluster that consists of a limited number of nodes. However, scalability at the physical level needs an extensible architecture. The ElasticBloC architecture is extensible: it enables users to add physical nodes on the fly or offline to enhance the capability to store any number of blocks generated by transaction applications. To gain easy extensibility, it reuses an existing distributed file system that simplifies adding new nodes. This file system supports commodity hardware; therefore, building a large cluster using ElasticBloC is cost-effective.

Performance is another issue dealt with by ElasticBloC. The adopted file system supports extreme parallelism, as it underlies the MapReduce functional programming model used by the applications to read and write blocks. Furthermore, ElasticBloC adopts a technology that avoids computationally expensive proof of work [48]. Proof-of-work based consensus protocols are also slow, requiring up to an hour to reasonably confirm a payment to prevent double-spending. ElasticBloC uses a technology that relies on a Byzantine consensus protocol, which reduces the computational cost significantly. The core of this protocol is Byzantine Fault Tolerance (BFT) [5]. The extreme parallelism provided by the file system, together with the BFT-based consensus protocol, makes ElasticBloC highly performant.

B. Architecture of ElasticBloC

ElasticBloC is composed of several components. Fig. 1 depicts the architecture of ElasticBloC.

Fig. 1. ElasticBloC Architecture

The components are briefly explained in the following:
• Transaction Gateway: The transaction gateway is a connection component that makes it possible to discover and connect with a blockchain endpoint. It is also the channel through which users launch a transaction request.
• Request Receiver: It is a front-end server that receives the HTTP requests.
• Communication Interface: It is a standard interface for communication with Python applications. Since ElasticBloC is developed in the Python programming language, this component is critical for communicating with Python applications.
• Operation Synthesizer: It handles various operations, including scaling up complex applications, object-relational mapping, request validation, authentication checking, and upload handling.
• Blockchain Engine: It is one of the key components and performs a multitude of tasks, including handling events, data modeling, and operation orchestration.
• Functional Interface: It is another important component of ElasticBloC. It allows for Byzantine Fault Tolerant replication of applications written in any programming language. The consensus engine communicates with the application via a socket protocol that satisfies the functional interface. This interface consists of three primary message types that are delivered from the core to the application; the application replies with corresponding response messages:
– DeliverTx Messages: Each transaction in the blockchain is delivered with this message.
– CheckTx Messages: The CheckTx message is similar to DeliverTx, but it is only for validating transactions.
– Commit Messages: The Commit message is used to compute a cryptographic commitment to the current application state, to be placed into the next block header.
• Broadcasting Interface: It receives an HTTP POST request from the Blockchain Engine and broadcasts it to the Blockchain Repository.
• Blockchain Repository: It securely and consistently replicates an application on many nodes. It works even if up to 1/3 of machines fail in arbitrary ways [1]. Every machine that is not faulty sees the same transaction log and computes the same state. Secure and consistent replication is a fundamental problem in distributed systems; it plays a critical role in the fault tolerance of a broad range of applications, from currencies, to elections, to infrastructure orchestration, and beyond. The ability to tolerate machines failing in arbitrary ways, including becoming malicious, is known as Byzantine fault tolerance (BFT) [5].
• Blockchain Database Network: It is a network of four or more nodes.
• Scalable Block Storage Cluster: This is the most important component; it extends the physical infrastructure to a massive number of nodes. It enormously increases the capability of storing blocks as records. It is founded on a column-oriented database supported by a distributed file system that can support building a blockchain lake consisting of thousands of nodes. The cluster consists of one or more master nodes and hundreds of data nodes that store the blocks. It is highly fault tolerant because each block is replicated on three nodes (or more, depending on the user's preference). If any node is not functioning, two other nodes are available. In fact, it is not only a storage cluster: the column-oriented database also enables querying and managing blocks.

C. ElasticBloC Operational Methods

ElasticBloC enables us to perform different blockchain operations using various methods that fall into two categories: BigchainDB methods, which are provided by the BigchainDB server, and HBase connection methods, for establishing a connection with BigchainDB and performing various operations. We implemented all HBase connection methods within the scope of this paper. These methods are presented in the following subsections.

a) BigchainDB Methods

ElasticBloC provides a library that allows the client to perform ElasticBloC functionalities. This library is called "bigchaindb_driver". The table below lists the major methods that a client can use to transact or operate in ElasticBloC.

{| class="wikitable"
|+ TABLE I. BigchainDB driver main methods
! Class !! Method !! Parameters !! Description
|-
| BigchainDB || BigchainDB || *nodes || Creates an instance of bigchaindb_driver which is able to create, sign, and send transactions to several nodes
|-
| BigchainDB || api_info || headers || Retrieves the HTTP API details provided by the BigchainDB server
|-
| BigchainDB || info || headers || Retrieves information of the node connected to via the root endpoint, such as server version and overview of all the endpoints
|-
| Transactions Endpoint || fulfill || transaction, private_keys || Fulfills the given transaction
|-
| Transactions Endpoint || get || *, asset_id, operation, headers || Retrieves a list of transactions that have the specified asset
|-
| Transactions Endpoint || prepare || *, operation (CREATE or TRANSFER), signers, recipients, assets, metadata, inputs || Prepares a transaction payload, ready to be fulfilled
|-
| Transactions Endpoint || retrieve || transaction_id, headers || Retrieves the transaction of given id
|-
| Transactions Endpoint || send || transaction, mode, headers || Sends a transaction to the first specified nodes
|-
| Outputs Endpoint || get || public_key, spent, headers || Retrieves transaction outputs by the public key
|-
| Assets Endpoint || get || *, search, limit, headers || Retrieves the assets that match the search text
|-
| Crypto || generate_keypair || None || Generates a cryptographic key pair
|}

The above table represents the interface of functions that a client can use to communicate with ElasticBloC. This simplifies dealing with such a modular architecture: once a client transacts or operates via these functions, the rest of the flow is automated, i.e., the operation flows through the required components automatically. So, the client communicates with only one component.

In the implementation of its methods, the BigchainDB driver calls the methods of the BigchainDB-HBase connector, explained in the next section, to perform any operation that accesses HBase, such as retrieving data
(transactions, assets, metadata, etc.), checking the pre-existence of a newly submitted transaction, or storing a committed block with its details.

b) BigchainDB-HBase Connector

The BigchainDB-HBase connector is considered the core of our contribution. In this connector, we implemented a group of methods that BigchainDB can expose in order to connect and operate with HBase as a backend database. The following table describes the methods implemented with the connector.

{| class="wikitable"
|+ TABLE II. BigchainDB-HBase connector methods
! Method !! Parameters !! Description
|-
| connect || backend, host, port, name, connection_timeout || Creates a new connection to the backend database (HBase)
|-
| create_tables || connection, dbname || Creates tables in HBase to be used by BigchainDB
|-
| delete_tables || connection, dbname || Deletes the created tables in HBase
|-
| store_transaction || connection, signed_transaction || Stores a transaction in Transactions table
|-
| store_transactions || connection, signed_transactions || Stores a list of transactions in Transactions table
|-
| get_transaction || connection, transaction_id || Gets a transaction from Transactions table
|-
| get_transactions || connection, transaction_ids || Gets a list of transactions from Transactions table
|-
| store_metadatas || connection, metadata || Stores metadata in Metadata table
|-
| get_metadata || connection, transaction_ids || Gets metadata from Metadata table
|-
| store_asset || connection, asset || Stores asset in Assets table
|-
| store_assets || connection, assets || Stores a list of assets in Assets table
|-
| get_asset || connection, asset_id || Gets an asset from Assets table
|-
| get_assets || connection, asset_ids || Gets a list of assets from Assets table
|-
| store_block || connection, block || Stores a block in Blocks table
|-
| get_block || connection, block_id || Gets a block from Blocks table
|-
| get_spent || connection, transaction_id, output_index || Checks if a transaction id was already used as an input. A transaction can be used as an input for another transaction. Bigchain needs to make sure that a given txid is only used once. Gets the spending transaction
|-
| get_latest_block || connection || Gets the latest committed block
|-
| get_txids_filtered || connection, asset_id, operation || Gets all transactions for a particular asset id and optional operation
|-
| get_owned_ids || connection, owner_ids || Gets a list of transactions we can use which has inputs
|-
| get_spending_transactions || connection, inputs || Gets transactions which spend given inputs
|-
| get_block_with_transaction || connection, transaction_id || Gets block holding a specific transaction
|-
| delete_transaction || connection, transaction_id || Deletes a transaction from database and its relevant asset and metadata
|-
| delete_transactions || connection, transaction_ids || Deletes transactions from database and their relevant assets and metadata
|-
| delete_latest_block || connection || Deletes the latest committed block
|-
| store_unspent_outputs || connection, unspent_outputs || Stores unspent outputs in utxos table
|-
| get_unspent_outputs || connection, *, query || Gets unspent outputs
|-
| delete_unspent_outputs || connection, unspent_outputs || Deletes unspent outputs from utxos table
|-
| store_pre_commit_state || connection, state || Stores pre-commit state
|-
| get_pre_commit_state || connection, commit_id || Gets pre-commit state of a commit id
|}

Hence, the preceding methods represent the interface of the connector that integrates the BigchainDB server with HBase.

D. Solution Workflow

ElasticBloC has a single general workflow. Small parts of this workflow differ according to the submitted transaction mode or the nature of the desired operation. The diagram below shows the general workflow of ElasticBloC.

Fig. 2. ElasticBloC Workflow

Once the client has a valid transaction, i.e., one that conforms to the BigchainDB Transactions Specification, the client submits it to one or more ElasticBloC nodes through the BigchainDB HTTP API [17]. In particular, the client embeds the transaction in an HTTP request and specifies one of the predefined endpoints to send it through. These endpoints are:
• POST /api/v1/transactions
• POST /api/v1/transactions?mode=async
• POST /api/v1/transactions?mode=sync
• POST /api/v1/transactions?mode=commit

After that, the HTTP request holding the transaction arrives at the Gunicorn [16] web server in the BigchainDB node. Gunicorn then forwards the request to the BigchainDB server using its exposed Web Server Gateway Interface (WSGI). The request reaches the BigchainDB server through the Flask web application development framework, which simplifies working with WSGI/Gunicorn. The BigchainDB server uses a Python method to check the transaction's validity. If the transaction is not valid, the HTTP response status code is 400, which indicates an error. Otherwise, the transaction is put into a new JSON string and sent to the local Tendermint instance via the Tendermint Broadcast API.
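The submission step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the ElasticBloC implementation: the helper names `transaction_url` and `route_transaction`, and the `is_valid` callback standing in for the server's validity check, are hypothetical. Only the endpoint path, the three mode values, and the 400 status for invalid transactions come from the text.

```python
import json
from urllib.parse import urlencode

# The three mode values listed in the text; omitting the mode uses the bare endpoint.
MODES = ("async", "sync", "commit")


def transaction_url(node_url, mode=None):
    """Return the POST target for submitting a transaction to one node."""
    url = node_url.rstrip("/") + "/api/v1/transactions"
    if mode is None:
        return url
    if mode not in MODES:
        raise ValueError("unknown mode: %r" % (mode,))
    return url + "?" + urlencode({"mode": mode})


def route_transaction(transaction, is_valid):
    """Hypothetical server-side step: reject with 400 or forward as a JSON string."""
    if not is_valid(transaction):
        # Invalid transactions are answered with HTTP status 400.
        return ("reject", 400)
    # Valid transactions are serialized for the local Tendermint instance.
    return ("forward", json.dumps(transaction, sort_keys=True))
```

For example, `transaction_url("http://localhost:9984", "commit")` builds the commit-mode target for a node assumed to listen at that example address; the host and port are placeholders.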
The operations between the local Tendermint instance and BigchainDB are handled through the Application Blockchain Interface (ABCI), which is an integral part of Tendermint and is also implemented on the BigchainDB server side. Tendermint uses the broadcast endpoint that corresponds to the mode chosen in the initial BigchainDB request. For example, if a client sent a transaction through the /api/v1/transactions?mode=commit endpoint, Tendermint uses the /broadcast_tx_commit endpoint. Tendermint stores the initially validated transactions in its own mempool (memory pool). When it decides to create a block, Tendermint sends the creation request to BigchainDB through a specific ABCI method. Then, using another ABCI method, it sends the initially validated transactions that need to be grouped in the block to BigchainDB, which rechecks the validity of each transaction before it is added to the block.

The proposed block is then broadcast to the network by Tendermint, which makes sure that all the nodes agree on this block in a Byzantine fault tolerant way. When the network agrees on a new block, Tendermint appends the new block to the blockchain in its local LevelDB, and the BigchainDB server receives a commit message instructing it to write the new block and the included transactions, assets, and metadata, each separately, to the HBase repository. HBase then writes these data into the underlying Hadoop Distributed File System. The same process is carried out at each node in the ElasticBloC network.

E. Implementation of ElasticBloC

As mentioned earlier, our goal is to use existing technologies for building ElasticBloC. The reason is two-fold: we avoid reinventing technology that already exists, and it would be impractically ambitious to develop a complex architecture like ElasticBloC from scratch. We developed ElasticBloC using the most advanced technologies. The main technologies are briefly described in the following:

a) Tendermint

Tendermint is a secure state-machine replication algorithm in the blockchain paradigm. It provides a form of BFT-ABC (Atomic Broadcast) that is furthermore accountable: if safety is violated, it is always possible to verify who acted maliciously [3].

Tendermint begins with a set of validators, identified by their public key, where each validator is responsible for maintaining a full copy of the replicated state, for proposing new blocks (batches of transactions), and for voting on them [20]. Each block is assigned an incrementing index, or height, such that a valid blockchain has only one valid block at each height. At each height, validators take turns proposing new blocks in rounds, such that for any given round there is at most one valid proposer. It may take multiple rounds to commit a block at a given height due to the asynchrony of the network, and the network may halt altogether if one-third or more of the validators are offline or partitioned [3]. Validators engage in two phases of voting on a proposed block before it is committed, and follow a simple locking mechanism which prevents any malicious coalition of less than one-third of the validators from compromising safety [2].

In ElasticBloC, the Broadcasting Interface and the Blockchain Repository (see Fig. 1) are implemented using Tendermint.

b) BigchainDB

The Blockchain Engine of ElasticBloC is implemented using BigchainDB, which provides database-style decentralized storage: a blockchain database. BigchainDB combines the key benefits of distributed DBs and traditional blockchains, with an emphasis on scale [21]. BigchainDB is built on top of an enterprise-grade distributed DB, from which it inherits high throughput, high capacity, low latency, a full-featured NoSQL query language, and permissioning. Nodes can be added to increase throughput and capacity.

c) HBase

The Scalable Block Storage Cluster (see Fig. 1) is implemented using HBase. Although BigchainDB aims at increasing scalability, massive scalability could not be achieved using BigchainDB alone. Therefore, we implemented the scalable block storage using HBase. HBase [15] is modeled on Google's BigTable database [6]. HBase provides a distributed, fault-tolerant, scalable database, built on top of the HDFS file system [24], with random real-time read/write access to data. Each HBase table is stored as a multidimensional sparse map, with rows and columns, each cell having a timestamp [6]. A cell value at a given row and column is uniquely identified by:

(Table, Row, Column-Family:Column, Timestamp) ⇒ Value

HBase has its own Java client API, and tables in HBase can be used both as an input source and as an output target for MapReduce jobs through TableInputFormat and TableOutputFormat. HBase has no single point of failure. It uses Zookeeper [27], another Hadoop subproject, for the management of partial failures.

The HBase connector, which is the primary contribution of this paper, was implemented in Python. The first step of building the connector was to identify the tables that are needed to store the architecture data (blocks, transactions, assets, ...). The second step was to write a file that opens a connection to HBase, based on the connection parameters and values given by the BigchainDB configuration file, and returns an instance of this connection.

Next, the schema file was written. The schema file defines and creates the database schema in HBase once BigchainDB is initialized. After that, the required querying methods were implemented. Some of these methods retrieve data; others store, update, or delete data. In addition, some web application development tools were used during development.
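The cell addressing described above, (Table, Row, Column-Family:Column, Timestamp) ⇒ Value, can be illustrated with a small in-memory model. This is only a sketch of the data model, not of HBase itself or of the connector: the class `SparseTable` and the column names in the usage note are hypothetical, and the rule "return the newest version at or before the requested timestamp" is a simplification chosen for the example.

```python
from collections import defaultdict


class SparseTable:
    """In-memory stand-in for one HBase table (illustrative only)."""

    def __init__(self):
        # row key -> "family:qualifier" -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, value, timestamp):
        """Store one version of a cell; older versions are kept."""
        self._rows[row][column][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest version, or the newest one at or before `timestamp`."""
        versions = self._rows.get(row, {}).get(column, {})
        if timestamp is not None:
            versions = {ts: v for ts, v in versions.items() if ts <= timestamp}
        if not versions:
            return None  # sparse map: absent cells simply do not exist
        return versions[max(versions)]
```

For instance, after `blocks = SparseTable()` and `blocks.put("block-1", "data:height", 1, timestamp=100)`, the call `blocks.get("block-1", "data:height")` returns the stored height, while a cell that was never written returns `None`.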
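The CheckTx, DeliverTx, and Commit messages of the Functional Interface, together with the commit step of the workflow, can be sketched as a toy application. This is a hedged illustration, not Tendermint's ABCI or BigchainDB's code: the class `LedgerApp`, its method names, the rule that a transaction is valid when it is a dictionary with an unused `id`, and the use of SHA-256 over the committed state are all assumptions made for the example.

```python
import hashlib
import json


class LedgerApp:
    """Toy application reacting to CheckTx, DeliverTx, and Commit style messages."""

    def __init__(self):
        self.committed = {}  # transaction id -> transaction
        self.pending = []    # transactions delivered for the block being built
        self.blocks = []     # committed blocks, oldest first

    def check_tx(self, tx):
        """Validate only; nothing is stored (mempool admission)."""
        return isinstance(tx, dict) and "id" in tx and tx["id"] not in self.committed

    def deliver_tx(self, tx):
        """Recheck validity, then stage the transaction for the next block."""
        if not self.check_tx(tx) or any(p["id"] == tx["id"] for p in self.pending):
            return False
        self.pending.append(tx)
        return True

    def commit(self):
        """Persist the staged block and return a commitment to the new state."""
        block = {"height": len(self.blocks) + 1, "transactions": self.pending}
        self.blocks.append(block)
        for tx in self.pending:
            self.committed[tx["id"]] = tx
        self.pending = []
        state = json.dumps(self.committed, sort_keys=True).encode("utf-8")
        return hashlib.sha256(state).hexdigest()
```

In this sketch the in-memory `blocks` list plays the role that the HBase repository plays in ElasticBloC, and the returned hex digest stands in for the commitment that would be placed in the next block header.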
F. Experiments & Results
This section describes the experiments that we conducted on ElasticBloC and
discusses their results. The goal of the experiments is to evaluate two
characteristics of ElasticBloC: scalability and performance. We conducted the
following experiments: an initial loading experiment, a functionality
experiment, a scalability experiment, and a performance evaluation of
ElasticBloC.
a) Initial Loading Experiment
• Purpose:
Test the start-up and initialization of the whole architecture.
• Requirements:
The required steps are to run the ElasticBloC components and check the
connectivity between them. The following summarizes these steps:
– Run the Hadoop cluster and HBase.
– Run the BigchainDB server.
– Run a Tendermint instance.
• Results:
The architecture components ran successfully and the connection between the
components was established. Once it was established, BigchainDB executed the
schema file in the implemented connector and created the needed tables in
HBase. The following screenshots show some results of the successful initial
start-up.
Fig. 3. BigchainDB Start-up.
Fig. 4. BigchainDB Web Interface After Startup.
Fig. 5. Tendermint Start-up.
As we can see above, the components ran successfully, and Tendermint opened
the required sockets and performed the ABCI handshake. The handshake compares
the application's highest height and the application hash to those stored in
HBase in order to confirm that the data is the same.
b) Functionality Experiment
• Purpose: Test whether ElasticBloC performs operations normally.
• Required Steps: The functionality of ElasticBloC is tested by writing and
executing a Python script that makes use of the bigchaindb driver library.
The required steps are:
– Run the ElasticBloC components.
– Write a Python script that creates a transaction, fulfills it with the
sender's private key, and sends the transaction in commit mode. The following
figure shows the testing Python script.
Fig. 6. The Experiment’s Python Script
• Results
The transactions were successfully created and sent. Because the transactions
were sent in commit mode, BigchainDB directly created a block for the
transaction. This block was appended to the Tendermint local copy of the
blockchain, and the block, including its transaction, assets, and metadata,
was stored in the corresponding tables in HBase.
Fig. 7 shows the created block in the blockchain stored at Tendermint.
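The testing script for the functionality experiment is available only as a figure, so the following is a hedged reconstruction of the three steps (create, fulfill, send in commit mode) rather than the authors' code. It assumes the `bigchaindb_driver` package and a BigchainDB server reachable at a placeholder URL; the asset payload and the function names `build_payload` and `create_and_commit` are hypothetical.

```python
# Sketch of a functionality test, assuming the bigchaindb_driver package.
# ROOT_URL and the payload contents are placeholders, not values from the paper.

ROOT_URL = "http://localhost:9984"  # assumed BigchainDB HTTP endpoint


def build_payload(name):
    """Return the asset and metadata dictionaries for a CREATE transaction."""
    asset = {"data": {"item": name}}
    metadata = {"note": "functionality experiment"}
    return asset, metadata


def create_and_commit(root_url, name):
    """Create a transaction, fulfill it, and send it in commit mode."""
    # Imported here so the module loads even where the driver is not installed.
    from bigchaindb_driver import BigchainDB
    from bigchaindb_driver.crypto import generate_keypair

    bdb = BigchainDB(root_url)
    sender = generate_keypair()
    asset, metadata = build_payload(name)

    # 1) Create (prepare) the transaction.
    prepared = bdb.transactions.prepare(
        operation="CREATE",
        signers=sender.public_key,
        asset=asset,
        metadata=metadata,
    )
    # 2) Fulfill it with the sender's private key.
    fulfilled = bdb.transactions.fulfill(prepared, private_keys=sender.private_key)
    # 3) Send in commit mode, which waits for the transaction to be committed.
    return bdb.transactions.send_commit(fulfilled)
```

With all ElasticBloC components running, calling `create_and_commit(ROOT_URL, "bicycle")` would be expected to return the committed transaction, which can then be looked up in the Tendermint blockchain and in the HBase tables.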
Fig. 7. The Appended Block in Tendermint Blockchain.
Fig. 8. The Appended Block and Its Details in HBase.
Fig. 8 shows the result of the submitted transaction with its details
retrieved from HBase.
• Discussion
The experiment was conducted successfully and the architecture works
properly.
One of the positive aspects of the architecture is that the creation,
fulfillment, sending, and validation of the transaction, together with the
block creation, its appending to the blockchain, and its storage in HBase,
took only around one second.
One major limitation is that these experiments were conducted on a limited
blockchain dataset. The reason is that we had neither access to a large
blockchain dataset nor the time to build our own. Instead, we take into
consideration previous benchmarks and experiments that were conducted on
large volumes of data, which help us to evaluate our architecture.
c) Scalability Experiment
• Purpose
Test whether ElasticBloC can scale massively.
• Required Steps
Massive scalability means that ElasticBloC has the ability to scale as much
as needed at its physical layer. This is ensured if new data nodes can be
added to the ElasticBloC node. To verify this, we tried to add new Hadoop
nodes to the Hadoop cluster, either while ElasticBloC was running or while it
was offline.
• Results
The experiment was conducted successfully and ElasticBloC ran normally.
Fig. 9. The New Data nodes Cluster.
• Discussion
As ElasticBloC scales by adding a new server to the Hadoop cluster, it is
feasible to add data nodes either on the fly or offline. This means that
ElasticBloC has the ability to scale massively.
d) Performance Evaluation
We were not able to access a large blockchain dataset, nor did we have the
time to build one. Therefore, we could not test ElasticBloC on a
massive-scale dataset. Accordingly, we rely in this section on previously
conducted experiments and benchmarks that give us, theoretically, a clear
idea of the performance of the overall architecture.
The overall performance of ElasticBloC is evaluated through its components,
in particular the performance of Tendermint as the consensus engine and HBase
as the backend database.
Tendermint performs well in large distributed environments. According to the
Cosmos white paper [18]:
“Despite its strong guarantees, Tendermint provides exceptional performance.
In benchmarks of 64 nodes distributed across 7 data centers on 5 continents,
on commodity cloud instances, Tendermint consensus can process thousands of
transactions per second, with commit latencies on the order of one to two
seconds. Notably, the performance of well over thousands of transactions per
second is maintained even in harsh adversarial conditions, with validators
crashing or broadcasting maliciously crafted votes.”
On the other hand, building MongoDB on top of HDFS is less efficient than
building HBase on top of that file system. The reason is that HBase was
natively developed to run on top of HDFS, while MongoDB needs a third-party
connector to be built on top of HDFS.
Moreover, based on several benchmarks, such as [11] and [12], HBase performs
more efficiently than MongoDB in large
clusters. For instance, End Point [23] performed a series of performance
tests on several NoSQL databases, including HBase and MongoDB. The following
tables give some comparison results for the performance of HBase and MongoDB
in different tests, based on [11].

TABLE III
THROUGHPUT COMPARISON WHILE LOADING DATA
Number of Nodes    HBase (operations/sec)    MongoDB (operations/sec)
1                  15617.98                  8368.44
2                  23373.93                  13462.51
4                  38991.82                  18038.49
8                  74405.64                  34305.30
16                 143553.41                 73335.62
32                 296857.36                 134968.87

TABLE IV
THROUGHPUT COMPARISON WHILE RETRIEVING DATA
Number of Nodes    HBase (operations/sec)    MongoDB (operations/sec)
1                  428.12                    2149.08
2                  1381.06                   2588.04
4                  3955.03                   2752.40
8                  6817.14                   2165.17
16                 16542.11                  7782.36
32                 20020.73                  6983.82

TABLE V
THROUGHPUT IN BALANCED READ/WRITE
Number of Nodes    HBase (operations/sec)    MongoDB (operations/sec)
1                  527.47                    1278.81
2                  1503.09                   1441.32
4                  4175.8                    1501.06
8                  7725.94                   2195.92
16                 16381.78                  1230.96
32                 20177.71                  2335.14

TABLE VI
THROUGHPUT IN READ/UPDATE/WRITE OPERATIONS
Number of Nodes    HBase (operations/sec)    MongoDB (operations/sec)
1                  324.8                     1261.94
2                  961.01                    1480.72
4                  2749.35                   1754.30
8                  4582.67                   2028.06
16                 10259.63                  1114.13
32                 16739.51                  2363.69

TABLE VII
THROUGHPUT COMPARISON IN MIXED OPERATIONAL AND ANALYTICAL WORKLOAD
Number of Nodes    HBase (operations/sec)    MongoDB (operations/sec)
1                  269.30                    939.01
2                  333.12                    30.96
4                  1228.61                   10.55
8                  2151.74                   39.28
16                 5986.65                   337.4
32                 8936.18                   227.80

The above experiments provide tangible evidence that HBase is more efficient
than MongoDB, particularly as the number of nodes grows.
Hence, according to theory and pre-existing experiments, HBase would also
enhance the overall performance of ElasticBloC. However, in the worst case,
replacing MongoDB with HBase will not degrade the performance of ElasticBloC.
G. Conclusion & Future Work
Blockchain has not been adopted widely until now except for cryptocurrency
applications. However, it has been identified as a potential technology for
several areas that need trust and security.
It is a relatively new technology that needs various improvements to reach
maturity. It has several limitations that are the main barriers to the wider
adoption of this technology. Of all, scalability and performance are the
major limitations that must be addressed.
This research primarily aims at addressing the scalability problem. There are
some solutions that offer techniques, methods, and guidelines for scalable
blockchain. However, we found that state-of-the-art technologies focus on
scalability at the logical level, which is an inadequate approach if
scalability at the physical level is to be guaranteed. In this paper, we
designed and implemented a scalable architecture called ElasticBloC, which
enables users to build a highly scalable blockchain-based ecosystem
consisting of tens of physical nodes or more.
In this paper, we proposed a solution for the scalability limitation of
blockchain technology. We discussed our solution, ElasticBloC, which is a
scalable architecture with which blockchain-based applications such as
payment systems, notarization, and smart contracts can be implemented while
ensuring scalability. ElasticBloC is built on the cluster computing paradigm,
which enables building infrastructure with a massive number of nodes. We
presented the components with a detailed description. We discussed the
workflow of ElasticBloC. We discussed the technologies that we used in
implementing the proposed solution.
We tested ElasticBloC to evaluate its scalability. Our experiment shows that
ElasticBloC has the ability to scale up; it is flexible enough to add as many
servers as needed. We discussed the results of our experiments in this paper.
Additionally, we provided
some previous benchmarks to provide a comparative view of the abilities of
ElasticBloC.
Several extensions of ElasticBloC are planned as future work. The most
critical task to be accomplished in the near future is extending the
functional capabilities of our solution. We plan to add new modules to
ElasticBloC to enable users to develop both permissionless and permissioned
blockchain-based applications.
REFERENCES
[1] Anonymous. Retrieved from https://tendermint.readthedocs.io/en/latest/introduction.html.
[2] Branden, J. V. Building a Performance Model of the Tendermint Consensus Algorithm.
[3] Buchman, E. (2016). Tendermint: Byzantine fault tolerance in the age of blockchains (Doctoral dissertation).
[4] Buterin, V. (2014). A next-generation smart contract and decentralized application platform.
[5] Castro, M., & Liskov, B. (2003). U.S. Patent No. 6,671,821. Washington, DC: U.S. Patent and Trademark Office.
[6] Chang, F., Dean, J., Ghemawat, S., Hsieh, W. C., Wallach, D. A., Burrows, M., ... & Gruber, R. E. (2008). Bigtable: A distributed storage system for structured data. ACM Transactions on Computer Systems (TOCS), 26(2), 4.
[7] Condos, J., Sorrell, W. H., & Donegan, S. L. (2016). Blockchain technology: Opportunities and risks.
[8] DBS Group Research. (2016). Understanding blockchain technology and what it means for your business.
[9] Dorri, A., Kanhere, S. S., Jurdak, R., & Gauravaram, P. (2017). LSB: A Lightweight Scalable BlockChain for IoT Security and Privacy. arXiv preprint arXiv:1712.02969.
[10] Ehmke, C., Wessling, F., & Friedrich, M. C. (2018). Proof-of-property: a lightweight and scalable blockchain protocol. In Proceedings of the 1st International Workshop on Emerging Trends in Software Engineering for Blockchain (pp. 48-51). ACM.
[11] End Point. (2015). Benchmarking Top NoSQL Databases: Apache Cassandra, Couchbase, HBase, and MongoDB.
[12] Gandini, A., Knottenbelt, W. J., Osman, R., & Piazolla, P. (n.d.). Performance evaluation of NoSQL databases.
[13] Gao, Z., Xu, L., Chen, L., Shah, N., Lu, Y., & Shi, W. (2017, December). Scalable blockchain based smart contract execution. In Parallel and Distributed Systems (ICPADS), 2017 IEEE 23rd International Conference on (pp. 352-359). IEEE.
[14] Gencer, E. A., Sirer, G. E., Van Renesse, R., & Eyal, I. (2016). Bitcoin-NG: A Scalable Blockchain Protocol. In NSDI.
[15] George, L. (2011). HBase: the definitive guide: random access to your planet-size data. O’Reilly Media, Inc.
[16] Gunicorn - Python WSGI HTTP Server for UNIX. (n.d.). Retrieved from https://gunicorn.org.
[17] The HTTP Client-Server API - BigchainDB Server 0.8.2 documentation. (n.d.). Retrieved from http://docs.bigchaindb.com/projects/server/en/v0.8.2/drivers-clients/http-client-server-api.html.
[18] Internet of Blockchains - Cosmos Network. (n.d.). Retrieved from https://cosmos.network/resources/whitepaper.
[19] James-Lubin, K. (2015, January 22). Blockchain scalability. Retrieved from https://www.oreilly.com/ideas/blockchain-scalability.
[20] Kwon, J. (2014). Tendermint: Consensus without Mining.
[21] McConaghy, T., Marques, R., Müller, A., De Jonghe, D., McConaghy, T., McMullen, G., ... & Granzotto, A. (2016). BigchainDB: a scalable blockchain database. White paper, BigchainDB.
[22] Out of Asia. (2017, December 27). Five Issues Preventing Blockchain From Going Mainstream: The Insanely Popular Crypto Game Etheremon Is One Of Them. Retrieved from https://www.forbes.com/sites/outofasia/2017/12/22/five-issues-preventing-blockchain-from-going-mainstream-the-insanely-popular-crypto-game-etheremon-is-one-of-them/#6d364bb66fad.
[23] Secure Business Solutions - End Point. (n.d.). Retrieved from http://www.endpoint.com/.
[24] Shvachko, K., Kuang, H., Radia, S., & Chansler, R. (2010, May). The Hadoop distributed file system. In Mass Storage Systems and Technologies (MSST), 2010 IEEE 26th Symposium on (pp. 1-10). IEEE.
[25] Vukolić, M. (2015, October). The quest for scalable blockchain fabric: Proof-of-work vs. BFT replication. In International Workshop on Open Problems in Network Security (pp. 112-125). Springer, Cham.
[26] Zamani, M., Movahedi, M., & Raykova, M. (n.d.). RapidChain: A Fast Blockchain Protocol via Full Sharding.
[27] ZooKeeper, A. (2017). The Apache Software Foundation. Accessed December 29, 2017.