=Paper= {{Paper |id=Vol-2973/paper_262 |storemode=property |title=BLF: A Blockchain Logging Framework for Mining Blockchain Data |pdfUrl=https://ceur-ws.org/Vol-2973/paper_262.pdf |volume=Vol-2973 |authors=Paul Beck,Hendrik Bockrath,Tom Knoche,Mykola Digtiar,Tobias Petrich,Daniil Romanchenko,Richard Hobeck,Luise Pufahl,Christopher Klinkmüller,Ingo Weber |dblpUrl=https://dblp.org/rec/conf/bpm/Ven21 }} ==BLF: A Blockchain Logging Framework for Mining Blockchain Data== https://ceur-ws.org/Vol-2973/paper_262.pdf
BLF: A Blockchain Logging Framework for Mining
Blockchain Data
Paul Beck (1), Hendrik Bockrath (1), Tom Knoche (1), Mykola Digtiar (1), Tobias Petrich (1),
Daniil Romanchenko (1), Richard Hobeck (2), Luise Pufahl (2), Christopher Klinkmüller (3)
and Ingo Weber (2)
(1) Technische Universitaet Berlin, Berlin, Germany
(2) Software & Business Engineering, Technische Universitaet Berlin, Berlin, Germany
(3) CSIRO Data61, Sydney, Australia


Abstract
Blockchain technology is increasingly used to realize decentralized applications and execute
cross-organizational processes. Understanding how an application is used and how partners and
users participate is essential to avoid failures and plan improvements. This understanding can be
built by analyzing logs; but although the data is in principle available in the immutable ledger, log
extraction is currently still inconvenient, slow, and subject to interpretation. In this demo, we
present BLF, an extensible logging framework for decentralized applications deployed on a
blockchain. The framework is realized for Ethereum and Hyperledger, and has been tested for
applications on those networks, but is extensible to other blockchains. Practitioners can use it to
analyze their blockchain applications, and BPM researchers can use it to explore a new type of
event data: event logs from blockchain applications.

Keywords
Logging, Blockchain Application, Process Mining




1. Introduction
Blockchain technology enables a new generation of applications, commonly referred to as
decentralized applications (DApps), which can, e.g., support the execution of cross-organizational
business processes [1]. Whereas DApp developers have full control over their DApp’s features,
the shared nature of the networks on which the DApps are deployed limits the developers’
influence on when, where, and under what circumstances they are executed. Thus, the analysis
of DApp behavior based on logs is essential for avoiding failures and planning improvements.
Here, process mining techniques [2] support the analysis of events over time and
can provide useful insights, as shown, e.g., in [3, 4].
   In this demo, we present the Blockchain Logging Framework (BLF), whose main components
are summarized in Fig. 1. At its heart, BLF enables the generation of logs from DApp data by
allowing users to define an extract-transform-load (ETL) pipeline. To this end, users have to

Proceedings of the Demonstration & Resources Track, Best BPM Dissertation Award, and Doctoral Consortium at BPM
2021 co-located with the 19th International Conference on Business Process Management, BPM 2021, Rome, Italy,
September 6-10, 2021
⟨firstname⟩.⟨lastname⟩@tu-berlin.de (R. Hobeck); ⟨firstname⟩.⟨lastname⟩@tu-berlin.de (L. Pufahl);
christopher.klinkmueller@data61.csiro.au (C. Klinkmüller); ⟨firstname⟩.⟨lastname⟩@tu-berlin.de (I. Weber)
                                       © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073



Paul Beck et al. CEUR Workshop Proceedings                                                       1–5


[Figure 1 diagram. Blockchain: transactions, transaction receipts, state, DApp source code (arrow labels: emit, execute). Blockchain Logging Framework (BLF): manifest (BcQL), validator, extractor. Output: CSV, TXT, XES.]



Figure 1: Overview of the blockchain logging framework.


specify a manifest using BLF’s Blockchain Query Language (BcQL). To ease the definition of
manifests, BLF’s validator can verify their correctness and inform users about
potential specification errors. The framework itself is an extension of the Ethereum Logging
Framework (ELF) [4, 5], which was developed and tested for Ethereum [6]. BLF, by contrast, is
designed as a generic framework for generating logs from DApps on any blockchain platform.
While BLF currently provides adapters for Ethereum and Hyperledger [7], it is extensible so
that other platforms can be supported in the future. In the remainder, we present BLF’s main
functionality, its query language, and the possibilities for extending it. We conclude by outlining case
studies and a small demonstration for a Hyperledger DApp.


2. Main Functionality of BLF
BLF is written in Java and its source code is publicly available (see Footnote 1). It consists of three parts: BcQL,
the validator, and the extractor. In the following, we briefly present how BcQL can be applied to
specify manifest files and thus ETL pipelines for DApps. After that, we summarize the extractor
and validator functionality, which builds upon manifest files (see Fig. 1). Lastly, we elaborate on
possibilities to extend the framework.

Manifest and BcQL. A manifest file defines how logs are generated from DApp data. This
includes details regarding which blockchain to connect to, what data to extract, how to structure
the data, and where and in which format to store it. To this end, BcQL is designed as a declarative

   1 https://github.com/TU-ADSP/Blockchain-Logging-Framework







Figure 2: Manifest structure


query language that abstracts away low-level extraction details such as data decoding and the composition
of API calls, so that developers can focus on defining the actual ETL process.
   Fig. 2 shows the structure of a manifest document with its mandatory and optional
elements. In each manifest, the user first has to define the blockchain context (e.g., “ETHEREUM” or
“HYPERLEDGER”), a connection to a blockchain node, and an output folder. Additionally,
BcQL gives the option to specify (1) filters (e.g., block filter, transaction filter), which allow
users to narrow down the DApp data to be extracted, (2) expression statements, which provide
transformation and logic operators that can be used to process the data, and (3) emit statements
for formatting data in a specific target format. Furthermore, the user may configure the emission
mode and the error handling strategy. By default, the output files are emitted as
soon as all data of a blockchain application has been extracted, transformed, and loaded. Safe
batching and streaming instead emit data for each processed block: the former
continuously updates the main output files, whereas the latter produces new files for each block.
Regarding error handling, BLF can handle runtime errors in two distinct ways: errors
are either ignored, or they abort the current execution. Either way, the errors
are printed to the console and written to an error log file. A detailed step-by-step tutorial
on how to write a manifest is provided on the project website (see Footnote 2).
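
The difference between the emission modes and the error handling strategies can be made concrete with a small Java sketch. It is not taken from BLF's code base: the class `EmissionSketch`, the enums `EmissionMode` and `ErrorStrategy`, and the file names are assumptions introduced purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: none of these names are taken from BLF's code base.
class EmissionSketch {

    // The three emission modes described in the text.
    enum EmissionMode { DEFAULT, SAFE_BATCHING, STREAMING }

    // The two error handling strategies described in the text.
    enum ErrorStrategy { IGNORE, ABORT }

    // Lists the file writes that a run over `blockCount` blocks would perform.
    // The file names are made up for this example.
    static List<String> plannedWrites(EmissionMode mode, int blockCount) {
        List<String> writes = new ArrayList<>();
        for (int block = 0; block < blockCount; block++) {
            if (mode == EmissionMode.SAFE_BATCHING) {
                writes.add("log.csv");               // main file updated after each block
            } else if (mode == EmissionMode.STREAMING) {
                writes.add("log_" + block + ".csv"); // new file for each block
            }
        }
        if (mode == EmissionMode.DEFAULT) {
            writes.add("log.csv");                   // single write after all blocks
        }
        return writes;
    }

    // Processes blocks in order; blocks contained in `failing` simulate runtime errors.
    // Every error is recorded in `errorLog`; only ABORT stops the run.
    // Returns the number of blocks processed successfully.
    static int process(List<Integer> blocks, Set<Integer> failing,
                       ErrorStrategy strategy, List<String> errorLog) {
        int processed = 0;
        for (int block : blocks) {
            if (failing.contains(block)) {
                errorLog.add("error in block " + block);
                if (strategy == ErrorStrategy.ABORT) {
                    break;
                }
                continue;
            }
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(plannedWrites(EmissionMode.STREAMING, 3));
        List<String> errors = new ArrayList<>();
        int ok = process(List.of(1, 2, 3), Set.of(2), ErrorStrategy.IGNORE, errors);
        System.out.println(ok + " blocks processed, errors: " + errors);
    }
}
```

In this sketch, both strategies record the error, mirroring the statement that errors are reported either way; they differ only in whether the remaining blocks are still processed.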

Validator and Extractor. The main components of BLF are the validator and the extractor
which process user-defined manifest files (see Fig. 1). Here, we briefly summarize these two
components. The interested reader can find more information on these functions in [5].
   The validator supports the user by checking the manifest for specification errors. In a first
step, a parser built with the parser generator library ANTLR4 parses the textual manifest file based on a specification
of BcQL’s grammar. In doing so, it generates an intermediate representation of the
manifest and identifies syntactic errors. Semantic analysis of the intermediate representation
is implemented as a set of custom rules that, e.g., check if filters are correctly nested and that
    2 https://github.com/TU-ADSP/Blockchain-Logging-Framework/wiki/Manifest







Figure 3: Listeners to Extend BLF to new Blockchain technologies.


variable, parameter, and literal types are compatible.
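
To give an impression of what such a semantic rule may look like, the following Java sketch checks filter nesting. The rule table is an assumption made for this example (the paper does not list BcQL's actual nesting rules), and the class name `NestingRuleSketch` and the filter kind names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: the nesting rules below are assumed, not BcQL's real rules.
class NestingRuleSketch {

    // For each filter kind, the kinds it may appear directly inside.
    // "ROOT" stands for the top level of the manifest.
    static final Map<String, Set<String>> ALLOWED_PARENTS = Map.of(
            "BLOCKS", Set.of("ROOT"),
            "TRANSACTIONS", Set.of("BLOCKS"),
            "LOG_ENTRIES", Set.of("BLOCKS", "TRANSACTIONS"));

    // `path` lists filter kinds from the outermost to the innermost filter.
    // Returns one message per violated rule; an empty list means the path is valid.
    static List<String> check(List<String> path) {
        List<String> errors = new ArrayList<>();
        String parent = "ROOT";
        for (String kind : path) {
            Set<String> allowed = ALLOWED_PARENTS.get(kind);
            if (allowed == null) {
                errors.add("unknown filter: " + kind);
            } else if (!allowed.contains(parent)) {
                errors.add(kind + " not allowed inside " + parent);
            }
            parent = kind;
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(check(List.of("BLOCKS", "TRANSACTIONS", "LOG_ENTRIES")));
        System.out.println(check(List.of("TRANSACTIONS")));
    }
}
```

Collecting all violations instead of stopping at the first one matches the validator's stated purpose of informing users about potential specification errors.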
   With a validated manifest, the extractor is able to extract, transform, and load data from
DApps. The framework extracts data block by block in historical order, i.e., in the order in which the blocks were
created and included in the blockchain. During the extraction, the specified filters are applied.
For transformation, BLF provides a basic set of operators and additionally allows users to
integrate custom operators at compile time of the framework. Finally, data
can be formatted and exported as textual application logs, or in the comma-separated values
(CSV) and the eXtensible Event Stream (XES) [2] formats.
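
The extract, filter, transform, and emit steps can be sketched as a toy pipeline in Java. This is not BLF's extractor: the class `EtlSketch`, the `Event` type, the CSV column names, and the sample data are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Illustrative sketch only: a toy ETL loop, not BLF's extractor.
class EtlSketch {

    // Hypothetical stand-in for an event found in a block.
    static final class Event {
        final long blockNumber;
        final String name;
        final String payload;

        Event(long blockNumber, String name, String payload) {
            this.blockNumber = blockNumber;
            this.name = name;
            this.payload = payload;
        }
    }

    // Visits events in ascending block order, keeps those accepted by the
    // filter, applies the transformation, and renders each result as a CSV row.
    static List<String> run(List<Event> events,
                            Predicate<Event> filter,
                            Function<Event, String> transform) {
        List<Event> ordered = new ArrayList<>(events);
        // List.sort is stable, so events within one block keep their relative order.
        ordered.sort(Comparator.comparingLong((Event e) -> e.blockNumber));

        List<String> rows = new ArrayList<>();
        rows.add("block,event,value"); // header row; column names are made up
        for (Event e : ordered) {
            if (!filter.test(e)) {
                continue;
            }
            rows.add(e.blockNumber + "," + e.name + "," + transform.apply(e));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
                new Event(2, "Birth", "b"),
                new Event(1, "Transfer", "a"),
                new Event(1, "Birth", "c"));
        for (String row : run(events, e -> e.name.equals("Birth"), e -> e.payload.toUpperCase())) {
            System.out.println(row);
        }
    }
}
```

Passing the filter and the transformation as functions loosely mirrors the idea that the manifest, rather than the extraction loop, determines what is kept and how it is processed.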

Extending BLF. BLF interacts with a blockchain node through a
standardized interface called BaseBlockchainListener (see Fig. 3). By implementing this interface
for specific blockchain platforms, e.g., for Corda R3 or EOS, developers can add support for
additional platforms. Currently, BLF provides standard implementations for Hyperledger and
Ethereum. Besides functionality to extract data from blocks, transactions, log entries, and the
blockchain state, developers must declare the default variables that these entities have. For
example, on Ethereum, blocks are identified by a block number, i.e., the position of the block in
the blockchain, while transactions and log entries are identified by indices that encode their
position within a block. Developers can follow the step-by-step tutorial (see Footnote 3) to integrate further
platforms.
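
The extension mechanism can be pictured with the following Java sketch. It deliberately does not reproduce BaseBlockchainListener, whose actual methods are not given in this paper; the interface `PlatformListener`, its two methods, the registry, and the variable names "number" and "index" are hypothetical simplifications.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: a simplified, hypothetical view of per-platform listeners.
class ListenerSketch {

    // Hypothetical stand-in for a platform listener. BLF's real
    // BaseBlockchainListener is not reproduced here.
    interface PlatformListener {
        // Blockchain context name as used in a manifest, e.g. "ETHEREUM".
        String context();

        // Default variables per entity (entity name -> variable names).
        Map<String, List<String>> defaultVariables();
    }

    // Example implementation following the Ethereum description in the text:
    // blocks carry a number, transactions and log entries carry an index.
    static final class EthereumLikeListener implements PlatformListener {
        @Override
        public String context() {
            return "ETHEREUM";
        }

        @Override
        public Map<String, List<String>> defaultVariables() {
            Map<String, List<String>> vars = new LinkedHashMap<>();
            vars.put("block", List.of("number"));
            vars.put("transaction", List.of("index"));
            vars.put("logEntry", List.of("index"));
            return vars;
        }
    }

    // Registry that maps a blockchain context to its listener.
    static final Map<String, PlatformListener> REGISTRY = new LinkedHashMap<>();

    static void register(PlatformListener listener) {
        REGISTRY.put(listener.context(), listener);
    }

    static PlatformListener forContext(String context) {
        PlatformListener listener = REGISTRY.get(context);
        if (listener == null) {
            throw new IllegalArgumentException("No listener registered for " + context);
        }
        return listener;
    }

    public static void main(String[] args) {
        register(new EthereumLikeListener());
        System.out.println(forContext("ETHEREUM").defaultVariables());
    }
}
```

Under these assumptions, supporting a further platform amounts to adding one more implementation and registering it under its context name.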


3. Demonstration and Maturity
ELF, the predecessor of BLF, has already been used in several case studies, among others to examine
the popular Ethereum game CryptoKitties [4] and the Ethereum DApp Augur [3], a popular
prediction and betting market. These case studies demonstrate real-world applications of the
framework and illustrate how it can be used. After reworking ELF into the extensible BLF, we
again wrote a BcQL manifest for Augur, ran BLF, and thereby demonstrated BLF’s continued
capability to extract event logs from Ethereum DApps.
   Few Hyperledger use cases from production environments are available, because
Hyperledger is used as a private permissioned blockchain and has no public network.

   3 https://github.com/TU-ADSP/Blockchain-Logging-Framework/wiki/Adding-a-new-Blockchain-to-the-BLF





Thus, we implemented our own DApp on a Hyperledger node: HyperKitties (see Footnote 4), a Hyperledger
reimplementation of the CryptoKitties Ethereum smart contract. By porting CryptoKitties to
Hyperledger, we wanted to test whether log generation from Hyperledger results in
reasonable event logs, as observed in previous case studies. We used HyperKitties to write events
into our private Hyperledger blockchain. We then created a manifest file that lets BLF connect
to the local Hyperledger node, extract the HyperKitties events from the blocks, and generate an
event log. We opened this log in Disco, where we could confirm that it is a reasonable event log similar to the
Ethereum result. Examples of how to write a manifest and a screencast are provided on the
main project website (see Footnote 1).
   This demo presented a framework to log data of blockchain-based applications. It provides
functions to extract, transform, and load data. The framework additionally supports different
modes for data emission and error handling. It has already been applied in larger case
studies and offers BPM researchers a new source of data for process mining. Although it is usable
on production systems, BLF should still be classified as a fully functional research
prototype. In the future, we want to extend it to other blockchain technologies and improve the
usability of BcQL’s grammar.


References
[1] X. Xu, I. Weber, M. Staples, Architecture for blockchain applications, Springer, 2019.
[2] W. Van Der Aalst, Process mining - Data science in action, Springer, 2016.
[3] R. Hobeck, C. Klinkmüller, H. D. Bandara, I. Weber, W. van der Aalst, Process mining on
    blockchain data: A case study of augur, in: BPM 2021, accepted, Springer, 2021.
[4] C. Klinkmüller, A. Ponomarev, A. B. Tran, I. Weber, W. van der Aalst, Mining blockchain
    processes: Extracting process mining data from blockchain applications, in: BPM 2019:
    Blockchain Forum, Springer, 2019, pp. 71–86.
[5] C. Klinkmüller, I. Weber, A. Ponomarev, A. B. Tran, W. van der Aalst, Efficient logging for
    blockchain applications, arXiv preprint arXiv:2001.10281 (2020).
[6] G. Wood, et al., Ethereum: A secure decentralised generalised transaction ledger, Ethereum
    project yellow paper 151 (2014) 1–32.
[7] E. Androulaki, A. Barger, V. Bortnikov, C. Cachin, K. Christidis, A. De Caro, et al.,
    Hyperledger fabric: a distributed operating system for permissioned blockchains, in: EuroSys
    conference, 2018, pp. 1–15.




   4 https://github.com/TU-ADSP/HyperKitties


