=Paper=
{{Paper
|id=Vol-2599/paper4
|storemode=property
|title=Using Triples as the Data Model for Blockchain Systems
|pdfUrl=https://ceur-ws.org/Vol-2599/paper4.pdf
|volume=Vol-2599
|authors=Dennis Przytarski
|dblpUrl=https://dblp.org/rec/conf/semweb/Przytarski19
}}
==Using Triples as the Data Model for Blockchain Systems==
<pdf width="1500px">https://ceur-ws.org/Vol-2599/paper4.pdf</pdf>
<pre>
              Using Triples as the Data Model
                  for Blockchain Systems

                                  Dennis Przytarski

              University of Stuttgart, IPVS, 70569 Stuttgart, Germany
                   Dennis.Przytarski@ipvs.uni-stuttgart.de


      Abstract. Current permissioned blockchain systems utilize the key-
      value data model to store and query the ledger. As the key-value pairs
      are not sufficiently expressive to represent relationships between data,
      we present a proposal for the utilization of triples as the data model for
      blockchain systems. This approach enables a powerful query engine and
      reduces the number of data stores that have to be maintained.

      Keywords: blockchain · data model · Merkle B-tree · query · triple


    Blockchain systems were initially designed for cryptocurrencies, but applica-
tions have started using them as immutable data stores to share business data
among many untrusted participants without a central authority.
    Current permissioned blockchain systems such as Hyperledger Fabric [1] store
the blockchain (① in Figure 1) serialized on the file system. The blockchain is
a chained list of blocks where each block contains a sequence of transactions. A
transaction contains important application data represented as key-value pairs.
The latest application state is computed by traversing the entire blockchain
from the first to the last block while considering every transaction within a
block. Efficient access to the latest application state is achieved through the
maintenance of a database called the world state (② in Figure 1).
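
The derivation of the latest application state described above can be sketched
as a replay over all blocks. This is a minimal illustration, not Hyperledger
Fabric's actual code: the Transaction and Block types, the function name
compute_world_state, and the sample data are assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical, simplified structures: a transaction is a list of
# (key, value) writes and a block is a list of transactions.
Transaction = List[Tuple[str, dict]]
Block = List[Transaction]


def compute_world_state(blockchain: List[Block]) -> Dict[str, dict]:
    """Replay every transaction from the first to the last block."""
    world_state: Dict[str, dict] = {}
    for block in blockchain:
        for transaction in block:
            for key, value in transaction:
                # A later write to the same key replaces the earlier value.
                world_state[key] = value
    return world_state


# Illustrative data only: three blocks with one transaction each.
blockchain = [
    [[("CAR0", {"color": "red", "make": "Ford"})]],
    [[("CAR1", {"color": "green", "make": "Audi"})]],
    [[("CAR0", {"color": "blue", "make": "Ford"})]],
]
state = compute_world_state(blockchain)
```

Because the replay touches every block, its cost grows with the length of the
chain, which is why a separately maintained world state is used for fast access.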
    This approach has several disadvantages. First, the absence of an explicit his-
tory means that if analytical queries on historical blockchain data are needed,
the blockchain data (fully or partially) either have to be analyzed manually or
exported to a separate analytics data store. Secondly, important relationships
between the transaction data are buried in the key-value representation and
must be reconstructed at a higher layer (i.e., in the application). Thirdly,
maintenance costs increase due to the additional data
stores. Finally, the tamper-resistance property, which is a fundamental property
of blockchain technology, is not ensured for the world state and the analytics
data store without the implementation of additional data integrity measures.
    Our approach seeks to eliminate these disadvantages by replacing the key-
value data model with a generic and flexible data model. We propose to store
the blockchain data in a triple data model using <entity, attribute, value>
triples. This structured data representation enables the generic modeling of rela-
tionships, flexible schemas, and an ad-hoc query facility. Complex queries execute


                      Copyright © 2019 for this paper by its authors.
  Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

[Figure 1 (diagram, summarized from the extracted labels)

 Left side (Hyperledger Fabric), labeled "Ledger":
   ① Blockchain: Block n-2, Block n-1, Block n; each block has a header and
     transactions (TRXN); a transaction holds key-value pairs.
     Example: key CAR0, value {color:"blue",make:"Ford"}
   ② World State: key-value store (document store), is derived from the
     blockchain; a query engine queries the latest application state.

 Right side (our approach):
   ③ Blockchain: Block n-2, Block n-1, Block n; each block has a header and a
     Merkle B-tree of <entity, attribute, value> triples; the last block holds
     the latest application state. A query engine queries both the historical
     and the latest application state.
     Example: <CAR0, car/color, blue>, <CAR0, car/make, Ford>]

Fig. 1. The blockchain storage architecture of Hyperledger Fabric [1] (left side) in
comparison with our approach (right side).


directly on the blockchain data, eliminating the maintenance of additional data
stores. Our approach addresses the key issues of preserving the integrity of the
blockchain’s data structure, maintaining an efficient data representation, and
supporting a powerful query engine.
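
As a small illustration of the triple data model, the key-value example from
Figure 1 can be flattened into <entity, attribute, value> triples. This is a
sketch under assumptions: the helpers to_triples and entities_with, the
attributes car/owner and person/name, and the entity PERSON0 are hypothetical
and do not appear in the paper.

```python
from typing import Any, Dict, List, Tuple

Triple = Tuple[str, str, Any]  # (entity, attribute, value)


def to_triples(entity: str, namespace: str, record: Dict[str, Any]) -> List[Triple]:
    """Turn one key-value record into one triple per field."""
    return [(entity, f"{namespace}/{field}", value) for field, value in record.items()]


def entities_with(triples: List[Triple], attribute: str, value: Any) -> List[str]:
    """Return all entities that carry the given attribute/value pair."""
    return sorted(e for e, a, v in triples if a == attribute and v == value)


# The key-value pair from Figure 1, rewritten as triples.
triples = to_triples("CAR0", "car", {"color": "blue", "make": "Ford"})

# A relationship is simply another triple whose value names an entity
# (hypothetical attributes and entities, for illustration only).
triples.append(("CAR0", "car/owner", "PERSON0"))
triples.append(("PERSON0", "person/name", "Alice"))

blue_cars = entities_with(triples, "car/color", "blue")
```

The relationship between CAR0 and PERSON0 is visible in the data itself, whereas
in a key-value store it would be hidden inside an opaque value.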
    Our blockchain implementation (③ in Figure 1) utilizes Merkle B-trees [2].
A Merkle B-tree contains all triples that reflect the respective application state.
Each block is represented by two Merkle B-trees where the first one is sorted by
<entity, attribute> and the second one by <attribute, entity>. The latest
application state is always stored in the last block, any historical application state
in one of the preceding blocks. A query engine supporting a SPARQL-like query
language uses these two Merkle B-trees within a block to efficiently compute the
result of a query.
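
The idea of keeping two sort orders per block, each protected by a hash, can be
sketched as follows. This is a simplified stand-in and not the proposed
implementation: sorted Python lists replace the B-tree nodes, and a plain binary
Merkle tree over the sorted triples replaces the Merkle B-tree of [2]. The names
BlockState, merkle_root, by_entity, and by_attribute are assumptions.

```python
import hashlib
from bisect import bisect_left
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (entity, attribute, value)


def merkle_root(leaves: List[bytes]) -> bytes:
    """Hash the leaves pairwise, level by level, until one digest remains."""
    if not leaves:
        return hashlib.sha256(b"").digest()
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last digest on odd levels
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


class BlockState:
    """Application state of one block, indexed in two sort orders."""

    def __init__(self, triples: List[Triple]) -> None:
        # Index 1 is ordered by (entity, attribute), index 2 by (attribute, entity).
        self.ea = sorted(triples)
        self.ae = sorted((a, e, v) for e, a, v in triples)
        self.ea_root = merkle_root(["|".join(t).encode() for t in self.ea])
        self.ae_root = merkle_root(["|".join(t).encode() for t in self.ae])

    def by_entity(self, entity: str) -> List[Triple]:
        """Range scan on the (entity, attribute) index."""
        i = bisect_left(self.ea, (entity,))
        result = []
        while i < len(self.ea) and self.ea[i][0] == entity:
            result.append(self.ea[i])
            i += 1
        return result

    def by_attribute(self, attribute: str) -> List[Triple]:
        """Range scan on the (attribute, entity) index."""
        i = bisect_left(self.ae, (attribute,))
        result = []
        while i < len(self.ae) and self.ae[i][0] == attribute:
            a, e, v = self.ae[i]
            result.append((e, a, v))
            i += 1
        return result


state = BlockState([
    ("CAR0", "car/make", "Ford"),
    ("CAR0", "car/color", "blue"),
    ("CAR1", "car/color", "blue"),
])
tampered = BlockState([
    ("CAR0", "car/make", "Ford"),
    ("CAR0", "car/color", "red"),
    ("CAR1", "car/color", "blue"),
])
```

A query that fixes the entity uses the first index, a query that fixes the
attribute uses the second, and any change to a stored triple changes both roots.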
    This implementation, however, has high storage requirements and therefore
needs several optimization mechanisms. Techniques such as data compression,
data deduplication, and data encoding, as well as the reuse of already stored
data structures, help to reduce storage usage.
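
One possible form of data encoding and deduplication is dictionary encoding, in
which every distinct string is stored once and triples refer to it by an integer
identifier. The paper does not specify which techniques are used, so the
following is only an assumed example; the class TermDictionary and the function
encode_triples are hypothetical names.

```python
from typing import Dict, List, Tuple


class TermDictionary:
    """Stores each distinct term once and hands out integer identifiers."""

    def __init__(self) -> None:
        self.ids: Dict[str, int] = {}
        self.terms: List[str] = []

    def encode(self, term: str) -> int:
        if term not in self.ids:
            self.ids[term] = len(self.terms)
            self.terms.append(term)
        return self.ids[term]

    def decode(self, term_id: int) -> str:
        return self.terms[term_id]


def encode_triples(triples: List[Tuple[str, str, str]],
                   dictionary: TermDictionary) -> List[Tuple[int, ...]]:
    """Replace every string in every triple by its dictionary identifier."""
    return [tuple(dictionary.encode(term) for term in triple) for triple in triples]


dictionary = TermDictionary()
encoded = encode_triples([
    ("CAR0", "car/color", "blue"),
    ("CAR0", "car/make", "Ford"),
    ("CAR1", "car/color", "blue"),
], dictionary)
decoded = [tuple(dictionary.decode(i) for i in triple) for triple in encoded]
```

Here nine strings shrink to six dictionary entries; the saving grows with the
number of triples that share entities, attributes, or values.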
    Future work will investigate further optimization mechanisms, define the
mechanism and format for exposing the triple data model to smart contracts, and
develop and evaluate a prototype implementation.


References
1. Androulaki, E., et al.: Hyperledger fabric: A distributed operating system for per-
   missioned blockchains. In: Proceedings of the Thirteenth EuroSys Conference. pp.
   30:1–30:15. EuroSys ’18, ACM, New York, NY, USA (2018)
2. Li, F., et al.: Dynamic authenticated index structures for outsourced databases. In:
   Proceedings of the 2006 ACM SIGMOD International Conference on Management
   of Data. pp. 121–132. ACM (2006)

</pre>