<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>June</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>The Blockchain Role in Ethical Data Acquisition and Provisioning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sara Migliorini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mauro Gambini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alberto Belussi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carlo Combi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Verona</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Member of the IEEE Blockchain Technical Community</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <volume>4</volume>
      <issue>2019</issue>
      <abstract>
        <p>The collection of personal data through mobile applications and IoT devices represents the core business of many corporations. On the one hand, users are losing control over the ownership of their data and are rarely aware of what they are sharing and with whom; on the other hand, laws like the European General Data Protection Regulation try to bring data control and ownership back to users. In this paper we discuss the possible impact of blockchain technology on building independent and resilient data management systems able to ensure data ownership and traceability. The use of this technology could play a major role in creating a transparent global market of aggregated personal data where voluntary acquisition is subject to clear rules and some forms of incentive, not only making the process ethical but also encouraging the sharing of high-quality sensitive data.</p>
      </abstract>
      <kwd-group>
        <kwd>Blockchain</kwd>
        <kwd>Decentralized Autonomous Organization</kwd>
        <kwd>Network coalition</kwd>
        <kwd>Data ownership and traceability</kwd>
        <kwd>Voluntary provisioning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The acquisition and processing of personal data through mobile applications and
IoT devices is the core business of many IT corporations. The amount of
generated data is predicted to reach 44 ZB by 2020, while the number of IoT devices
is expected to be around 30 billion [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Nowadays, individuals have little control
over their data and over how they are collected, viewed and monetized. In the future,
all major countries are likely to introduce specific privacy laws for protecting
personal data. For example, any organization that provides goods and services
to EU citizens must comply with the new General Data Protection Regulation
(GDPR) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The GDPR protects any information that can be directly or
indirectly used to identify a person. This information ranges from user names, emails
and IP addresses to healthcare information and bank details. The GDPR grants
several rights, for instance the right to be forgotten (GDPR Art. 17) and the
right to be notified of data breaches (GDPR Art. 33–34). Any data processing
that is not compliant with the GDPR can result in significant fines, up to EUR
20 million or 4% of the company's global revenues. Laws like the GDPR increase
the security requirements of any business that processes personal data, which
means more burden and risk for the companies involved. Big IT corporations
can manage this risk by delegating the collection and aggregation of personal
data to new specialized companies.
      </p>
      <p>
        Open networks and public ledgers can provide an alternative business model
to control and trace the use of personal data. The blockchain can lead to the
development of independent and resilient data management systems able to
ensure data ownership and traceability and to increase user awareness [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. This
may guarantee a fairer use of the data, and it can be the only viable way
to collect sensitive data on a voluntary basis, such as healthcare-related information
and biological profiles. Nevertheless, the adoption of a blockchain infrastructure
comes with its own limits, and the open issues are manifold. For instance,
with regard to the right to be forgotten (GDPR Art. 17), blockchain deletions, or more
generally blockchain updates, should be carefully investigated [
        <xref ref-type="bibr" rid="ref1 ref9">1, 9</xref>
        ].
      </p>
      <p>The aim of this paper is to take a look at the potential offered by this new
technology for developing new ways to collect, maintain and use personal
data, and to discuss some problems and limitations that have to be overcome to
make it effective in this scenario. A blockchain infrastructure might be the
right way to bring the ownership of their data back to users. With a clear
idea about what is shared and with whom, users can be encouraged to share
more personal and sensitive data with specific companies, even on a voluntary
basis, revoking such privilege at their discretion. Moreover, this technology can
also provide the right infrastructure for creating a global market of aggregated
personal data: well-informed, conscious users can decide to share their personal
data on a voluntary basis in the presence of clear usage rules and economic benefits.</p>
      <p>The remainder of this paper is organized as follows: Sect. 2 summarizes some
previous investigations into the applicability of blockchain technology in
the field of data provisioning. Sect. 3 briefly illustrates the main concepts
underlying blockchain technology and the notion of network coalition. Sect. 4
investigates the idea of developing a network coalition for voluntary data
provisioning. Finally, Sect. 5 summarizes the paper and discusses future work.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        In recent years some blockchain-based solutions have been proposed for
sharing medical data among several hospitals while providing data access control,
provenance and auditing [
        <xref ref-type="bibr" rid="ref13 ref7">13, 7</xref>
        ]. However, blockchain was originally designed to
record transactional data, which is relatively small in size, while the information
to be stored can be large: in the healthcare domain, for instance, many images or
treatment plans have to be recorded. The notion of off-chain storage has been
proposed in the literature to deal with this problem. Essentially, data are kept
outside the blockchain, for instance in a traditional database, while the blockchain
is only used to store their digital fingerprints to ensure data authenticity [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
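      <p>The off-chain storage idea can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions and not the mechanism of any of the cited systems: two in-memory dictionaries stand in for the traditional database and for the blockchain, SHA-256 is assumed as the fingerprint function, and the names store_record and is_authentic are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import hashlib

# Hypothetical in-memory stand-ins: a real system would use a database
# for off-chain storage and a blockchain transaction for the fingerprint.
off_chain_store: dict[str, bytes] = {}
on_chain_fingerprints: dict[str, str] = {}


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def store_record(record_id: str, data: bytes) -> None:
    """Keep the bulky payload off-chain and only its digest on-chain."""
    off_chain_store[record_id] = data
    on_chain_fingerprints[record_id] = fingerprint(data)


def is_authentic(record_id: str) -> bool:
    """Check that the off-chain payload still matches the recorded digest."""
    return fingerprint(off_chain_store[record_id]) == on_chain_fingerprints[record_id]


store_record("scan-001", b"large medical image bytes")
authentic_before = is_authentic("scan-001")

# Tampering with the off-chain copy is detected through the digest mismatch.
off_chain_store["scan-001"] = b"altered image bytes"
authentic_after = is_authentic("scan-001")
```
]]></preformat>
      <p>In this sketch the ledger only ever holds a fixed-size digest, whatever the size of the payload, which is the property that makes off-chain storage attractive for large records.</p>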
      <p>
        Even though the blockchain was originally developed for storing only information
about financial transactions, in recent years it has also been used to certify the
existence and track the ownership of digital or physical assets [
        <xref ref-type="bibr" rid="ref3 ref6">3, 6</xref>
        ]. It is
estimated that Bitcoin transactions storing other kinds of information account for about
1% of the total transactions in the Bitcoin blockchain [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>Blockchain and Network Coalitions</title>
      <p>
        Currently, several variants of blockchain exist, and they are often classified
as distributed ledgers. In its original form [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] a blockchain is essentially a
temporally ordered list of permanent data blocks. The head of the list is called the genesis
block and includes some evidence about its release date, while every other block
is generated at fairly regular intervals and contains a cryptographic message
digest, or briefly a hash, of its predecessor, creating a chain of references. Each
block also includes a proof of work, namely evidence that a certain amount
of work has been spent on producing it. This proof is obtained by repeatedly
applying a cryptographic hash function to a block, varying its content at each
iteration by using a different nonce, until one of the target hashes is found. The
described operation is part of the mining process, which is simultaneously
performed by several competing network agents called miners. Altering a given
block requires the recomputation of the hashes of all its successors in a limited
amount of time. Since such an operation can be quite expensive, the probability
of observing a block replaced by another one decreases over time as new ones are
added after it. A block referring to a given one is said to confirm it and, after
a certain number of confirmations, a block is considered practically immutable.
      </p>
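      <p>The chaining and mining steps described above can be sketched in a few lines of Python. This is a simplification under stated assumptions, not the Bitcoin implementation: the set of target hashes is modelled as those starting with a fixed number of zero hex digits (DIFFICULTY), blocks are plain dictionaries, and the names block_hash, mine and is_valid are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import hashlib
import json

DIFFICULTY = 3  # assumed: number of leading hex zeros required in a block hash


def block_hash(block: dict) -> str:
    """Hash the block content (predecessor hash, payload and nonce)."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def mine(prev_hash: str, payload: str) -> dict:
    """Vary the nonce until the block hash meets the difficulty target."""
    block = {"prev_hash": prev_hash, "payload": payload, "nonce": 0}
    while not block_hash(block).startswith("0" * DIFFICULTY):
        block["nonce"] += 1
    return block


def is_valid(chain: list) -> bool:
    """Check every proof of work and every reference to the predecessor."""
    for i, block in enumerate(chain):
        if not block_hash(block).startswith("0" * DIFFICULTY):
            return False
        if i > 0 and block["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True


genesis = mine("0" * 64, "genesis block")
chain = [genesis]
for payload in ["block 1", "block 2"]:
    chain.append(mine(block_hash(chain[-1]), payload))

valid_before = is_valid(chain)

# Altering an early block changes its hash, so the reference stored in its
# successor no longer matches unless the block and its successors are re-mined.
chain[1]["payload"] = "tampered"
valid_after = is_valid(chain)
```
]]></preformat>
      <p>Raising DIFFICULTY by one multiplies the expected number of hash evaluations per block by 16, which gives a rough feeling for why rewriting a long suffix of the chain quickly becomes expensive.</p>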
      <p>The key innovation of blockchain technology is a decentralized emergent
consensus protocol that enables a group of agents to reach an agreement about
a global state by accepting data transmitted across an open Byzantine
Peer-to-Peer (P2P) network. The consensus can be considered emergent because there
is no specific point in time at which it is explicitly reached, while the network
is said to be open and Byzantine because agents can be self-interested, they can
enter and leave the system without authentication or secure connections, and
they can act strategically against the P2P protocol.</p>
      <p>
        The blockchain technology appeared for the first time in the implementation
of the Bitcoin protocol [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] as a clever solution to the double-spending problem
that does not require a trusted central authority. In Bitcoin, each block contains
a set of transactions representing the transfer of tokens from a source to
a destination account address. Following the Bitcoin protocol, each agent can
independently validate both transactions and blocks and reach a consensus about
the blockchain state in an autonomous way.
      </p>
      <p>
        A generalization of the Bitcoin protocol that properly extends the blockchain
technology has been proposed by Ethereum [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and similar platforms, in which a
global state can be updated not only by token transactions but also by generic
instructions previously stored in the blockchain. In particular, the Ethereum
platform provides a virtual machine, called Ethereum VM or EVM, that can
run general-purpose scripts encoding arbitrary state transition functions. These
scripts are called smart contracts, and they are considered autonomous software
agents executed by the EVM when a certain event occurs, for instance when a
transaction is scheduled or a message is received. When a contract is triggered,
it runs a sequence of predefined instructions that can control the related token
balance, the key-value store used to keep track of persistent variables and the
invocation of other contracts [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
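      <p>The view of a smart contract as a state transition function can be made concrete with a toy Python model. It does not use the EVM or any Ethereum API: the Contract class, its on_transaction method and the apply_transactions helper are hypothetical names, and the only assumption is that a contract owns a token balance and a key-value store that are updated by predefined instructions whenever a transaction reaches it.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass
class Contract:
    """Toy stand-in for a smart contract: a token balance plus a key-value store."""
    balance: int = 0
    storage: dict = field(default_factory=dict)

    def on_transaction(self, sender: str, value: int) -> None:
        """Predefined instructions run whenever a transaction reaches the contract."""
        self.balance += value
        # Persistent variable: total amount contributed by each sender.
        self.storage[sender] = self.storage.get(sender, 0) + value


def apply_transactions(contract: Contract, transactions: list) -> Contract:
    """State transition: replaying the same transactions gives the same state."""
    for sender, value in transactions:
        contract.on_transaction(sender, value)
    return contract


txs = [("alice", 5), ("bob", 3), ("alice", 2)]
state = apply_transactions(Contract(), txs)
replayed = apply_transactions(Contract(), txs)
```
]]></preformat>
      <p>Because the transition is deterministic, every agent that replays the same ordered transactions obtains the same state, which is what allows independent validation without a central authority.</p>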
      <p>
        Ethereum contracts are sufficiently expressive to create new
cryptocurrencies like Bitcoin and to found network coalitions [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], often called Decentralized
Autonomous Organizations (DAOs). A network coalition is a concerted form of
cooperation in which a group of actors decide to collaborate with the explicit
purpose of achieving a common goal. Supply chains, cooperatives, strategic
business alliances and joint ventures are good examples of coalitions.
Decentralization, namely the lack of an established central authority, is a main characteristic,
together with the possibility of a dynamic composition: new
components can freely join the coalition, while existing ones can leave it. Governance
rules are encoded inside a set of smart contracts, and members hold a certain
amount of tokens through which they can exercise their voting power.
      </p>
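      <p>Token-based governance with a dynamic membership can be sketched as follows. The voting rule is an assumption made only for illustration (a proposal passes with a strict majority of the tokens cast by current members); actual coalitions may encode different rules in their smart contracts, and the Coalition class and its methods are hypothetical names.</p>
      <preformat preformat-type="code"><![CDATA[
```python
class Coalition:
    """Toy coalition: members hold tokens and vote with a weight equal to their holdings."""

    def __init__(self) -> None:
        self.tokens: dict[str, int] = {}

    def join(self, member: str, tokens: int) -> None:
        """New components can freely join the coalition."""
        self.tokens[member] = tokens

    def leave(self, member: str) -> None:
        """Existing components can leave it at any time."""
        self.tokens.pop(member, None)

    def approves(self, ballots: dict[str, bool]) -> bool:
        """Assumed rule: a proposal passes with a strict majority of the tokens cast."""
        cast = {m: yes for m, yes in ballots.items() if m in self.tokens}
        total = sum(self.tokens[m] for m in cast)
        in_favour = sum(self.tokens[m] for m, yes in cast.items() if yes)
        return 2 * in_favour > total


coalition = Coalition()
coalition.join("alice", 60)
coalition.join("bob", 30)
coalition.join("carol", 30)

ballots = {"alice": True, "bob": False, "carol": False}
tied = coalition.approves(ballots)                 # 60 tokens in favour, 60 against

coalition.leave("carol")
after_carol_leaves = coalition.approves(ballots)   # carol's ballot no longer counts
```
]]></preformat>
      <p>The example shows how the outcome of the same ballots changes when the composition of the coalition changes, which is why governance rules need to state explicitly whose tokens count at voting time.</p>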
    </sec>
    <sec id="sec-4">
      <title>Voluntary-based Coalitions for Data Provisioning</title>
      <p>A blockchain infrastructure might be the right way to bring the ownership of
their data back to users. With a clear idea about what is shared and with whom,
users can be encouraged to share more personal data with specific companies,
even on a voluntary basis, revoking such privilege at their discretion. Moreover,
this technology can also provide the right infrastructure for creating a global
market of aggregated personal data: well-informed, conscious users can decide
to share their personal data on a voluntary basis in the presence of a clear benefit.
However, the utility of a single piece of personal data is difficult to quantify
and its value is typically very low. The value of personal data can increase only
after some aggregation and integration process. Such a process is usually performed
by external companies, which collect data from users and resell the aggregated
datasets to other companies. The current provisioning process is depicted in
Fig. 1, where users share their personal data, in a more or less conscious way, with
an organization, depicted in the middle, which takes care of aggregating such
raw data and producing useful information that in turn will be sold to other
organizations, depicted on the right. The users are unaware of how such data
are processed by the middle organization and have no control over the nature
of the organizations on the right or over the use they can make of these data.</p>
      <p>A blockchain infrastructure can trigger a paradigm shift in the acquisition
and aggregation process. With the right technology, users can voluntarily
collaborate in collecting and aggregating their personal data, producing valuable
information. The coalition can sell such data to other organizations while
potentially maintaining control over their use and transfer. This new form of data
collection and aggregation is exemplified in Fig. 2, where the centralized company
in Fig. 1 is replaced by a network coalition formed by an open P2P network of
users. This paradigm shift can be a win-win condition for both users and
companies. On the one hand, users have more control over their personal data and they
are more conscious of their roles and rights as a group. For instance, they can
bring a class action against improper data usage when this cannot be prevented
with cryptographic methods. The blockchain can act as a tamper-proof log and
be used as evidence in court in case of dispute. On the other hand,
companies can externalize some data acquisition costs and reduce the risk induced
by an improper treatment of personal data w.r.t. the existing privacy regulations.
In addition, this new form of data acquisition can be an effective way to collect
huge amounts of personal data that require a voluntary and incentivized effort.</p>
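      <p>The combination of revocable sharing and a tamper-evident log can be sketched in Python as an append-only list of consent events, each linked to the hash of the previous one. This is a hypothetical illustration and not a design proposed here: the ConsentLog class, the grant and revoke action names and the identifiers user-1 and acme-health are assumptions, and a real coalition would record such events as blockchain transactions.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import hashlib
import json


class ConsentLog:
    """Append-only log of consent events, each linked to the hash of the previous one."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def _digest(self, entry: dict) -> str:
        return hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()

    def record(self, user: str, company: str, action: str) -> None:
        """Append a grant or revoke event that references its predecessor."""
        prev = self._digest(self.entries[-1]) if self.entries else ""
        self.entries.append(
            {"user": user, "company": company, "action": action, "prev": prev}
        )

    def has_consent(self, user: str, company: str) -> bool:
        """The most recent event recorded for the pair decides."""
        for entry in reversed(self.entries):
            if entry["user"] == user and entry["company"] == company:
                return entry["action"] == "grant"
        return False

    def is_intact(self) -> bool:
        """Verify that every entry still matches the hash stored by its successor."""
        return all(
            self.entries[i]["prev"] == self._digest(self.entries[i - 1])
            for i in range(1, len(self.entries))
        )


log = ConsentLog()
log.record("user-1", "acme-health", "grant")
granted = log.has_consent("user-1", "acme-health")

log.record("user-1", "acme-health", "revoke")
revoked = not log.has_consent("user-1", "acme-health")
intact = log.is_intact()

# Rewriting an old event breaks the hash link held by the following entry.
log.entries[0]["action"] = "revoke"
tampering_detected = not log.is_intact()
```
]]></preformat>
      <p>In this sketch the last entry is protected only once a later event refers to it, which mirrors the role of confirmations in a blockchain.</p>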
    </sec>
    <sec id="sec-5">
      <title>Conclusion and Future Work</title>
      <p>
        This paper takes a first look at the applicability of the emerging blockchain
technology to building independent and resilient data management systems able to
ensure data ownership and traceability. The blockchain is considered
an enabling technology, namely a technology that opens the design space to
innovative applications and even to new ways of thinking about
algorithmic solutions in which economic aspects play a major role. We conjecture that
this technology may be useful both to encourage users to share their data even
in more sensitive contexts, such as healthcare, and to create a global
market of aggregated personal data. Despite the benefits of using a blockchain
infrastructure for data provisioning, its adoption comes with its own limits, and
the open issues are manifold. For instance, with regard to the right to be forgotten
(GDPR Art. 17), blockchain deletions, or more generally blockchain updates,
should be carefully investigated [
        <xref ref-type="bibr" rid="ref1 ref9">1, 9</xref>
        ]. With regard to the actual establishment of
network coalitions for data provisioning, particular attention has to be paid
to their legal recognition in different countries and to the way privacy laws, like
the GDPR, can be applied to them.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bartoletti</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pompianu</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>An Analysis of Bitcoin OP_RETURN Metadata</article-title>
          .
          <source>In: Financial Cryptography and Data Security</source>
          . pp.
          <fpage>218</fpage>
          -
          <lpage>230</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Buterin</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>A Next-generation Smart Contract and Decentralized Application Platform</article-title>
          (
          <year>2014</year>
          ), http://github.com/ethereum/wiki/wiki/White-Paper
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xue</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Bootstrapping a blockchain based ecosystem for big data exchange</article-title>
          .
          <source>In: 2017 IEEE International Congress on Big Data</source>
          . pp.
          <fpage>460</fpage>
          -
          <lpage>463</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Esposito</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Santis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tortora</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Choo</surname>
            ,
            <given-names>K.R.</given-names>
          </string-name>
          :
          <article-title>Blockchain: A Panacea for Healthcare Cloud-Based Data Security and Privacy?</article-title>
          <source>IEEE Cloud Computing</source>
          <volume>5</volume>
          (
          <issue>1</issue>
          ),
          <fpage>31</fpage>
          -
          <lpage>37</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <collab>EU Commission</collab>
          :
          <article-title>General Data Protection Regulation (GDPR)</article-title>
          ,
          <source>Regulation EU 2016/679</source>
          (
          <year>2016</year>
          ), https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:02016R0679-20160504 (acc.: 2019-03)
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Karafiloski</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mishev</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Blockchain solutions for big data challenges: A literature review</article-title>
          .
          <source>In: 17th Int. Conf. on Smart Technologies IEEE</source>
          . pp.
          <fpage>763</fpage>
          -
          <lpage>768</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>H.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuo</surname>
            ,
            <given-names>T.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ohno-Machado</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Blockchain distributed ledger technologies for biomedical and health care applications</article-title>
          .
          <source>Journal of the American Medical Informatics Association</source>
          <volume>24</volume>
          (
          <issue>6</issue>
          ),
          <fpage>1211</fpage>
          -
          <lpage>1220</lpage>
          (09
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Kugler</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>The War over the Value of Personal Data</article-title>
          .
          <source>Communications of the ACM</source>
          <volume>61</volume>
          (
          <issue>2</issue>
          ),
          <fpage>17</fpage>
          -
          <lpage>19</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Matzutt</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Henze</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ziegeldorf</surname>
            ,
            <given-names>J.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hiller</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wehrle</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Thwarting Unwanted Blockchain Content Insertion</article-title>
          .
          <source>In: 2018 IEEE International Conference on Cloud Engineering (IC2E)</source>
          . pp.
          <fpage>364</fpage>
          -
          <lpage>370</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Migliorini</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gambini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Combi</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>La Rosa</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>The Rise of Enforceable Business Processes from the Hashes of Blockchain-Based Smart Contracts</article-title>
          .
          <source>In: Enterprise, Business-Process and Information Systems Modeling</source>
          . pp.
          <fpage>130</fpage>
          -
          <lpage>138</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Nakamoto</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Bitcoin: A Peer-to-Peer Electronic Cash System</article-title>
          (
          <year>2008</year>
          ), http://www.bitcoin.org/bitcoin.pdf (acc.: 2018-11)
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Shafagh</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burkhalter</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hithnawi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duquennoy</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Towards blockchain-based auditable storage and sharing of IoT data</article-title>
          .
          <source>In: Proceedings of the 2017 on Cloud Computing Security Workshop</source>
          . pp.
          <fpage>45</fpage>
          -
          <lpage>50</lpage>
          . CCSW '17 (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Xia</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sifah</surname>
            ,
            <given-names>E.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Asamoah</surname>
            ,
            <given-names>K.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gao</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Du</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guizani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>MeDShare: Trust-Less Medical Data Sharing Among Cloud Service Providers via Blockchain</article-title>
          .
          <source>IEEE Access</source>
          <volume>5</volume>
          ,
          <fpage>14757</fpage>
          -
          <lpage>14767</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>