<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards a Security Reference Architecture for Big Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Julio Moreno</string-name>
          <email>Julio.Moreno@uclm.es</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eduardo Fernandez-Medina</string-name>
          <email>Eduardo.FdezMedina@uclm.es</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Manuel A. Serrano</string-name>
          <email>Manuel.Serrano@uclm.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eduardo B. Fernandez</string-name>
          <email>Fernande@fau.edu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Alarcos Research Group, University of Castilla-La Mancha</institution>
          ,
          <addr-line>Ciudad Real</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer and Electrical Engineering and Computer Science</institution>
          ,
          <institution>Florida Atlantic University</institution>
          ,
          <addr-line>Boca Raton, Florida</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>GSyA Research Group, University of Castilla-La Mancha</institution>
          ,
          <addr-line>Ciudad Real</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Companies are aware of the importance of Big Data because data are essential to their daily activities, but new technologies bring new problems, and Big Data is no exception; these problems relate not only to the 3Vs of Big Data but also to privacy and security. Security is crucial in Big Data systems; unfortunately, security problems arise because Big Data was not initially conceived as a secure environment. Furthermore, securing it is difficult because of the heterogeneous configurations that a Big Data system can have. One way to address this problem is to adopt a global perspective: a Reference Architecture (RA) is a high-level abstraction of a system that can be useful in the implementation of complex systems. Several RAs for Big Data have been proposed, such as those from IBM, Oracle, NIST, and ISO, but none of them has security as its main focus. It is widely accepted that adding elements to an RA that address threats and facilitate the definition of security requirements is a good starting point for dealing with these kinds of threats, thereby converting RAs into Security Reference Architectures (SRAs). In this paper, we define an SRA for Big Data using UML models, with the aim of easing secure Big Data implementations and allowing security patterns to be applied in order to secure the final Big Data systems.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        Companies are increasingly aware of the importance of Big Data [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. For
all of them, data are essential for conducting their daily activities and
for helping senior management achieve business objectives and, as
a result, make better decisions based on the information extracted
from such data [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. Big Data differs from
traditional techniques in three ways: the amount of
data (volume), the rate at which data are generated and transmitted
(velocity), and the heterogeneity of the structured and
unstructured data types that it can handle (variety) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. These properties
are known as the three Vs of Big Data [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ].
      </p>
      <p>
        New problems usually arise with new technologies, as is the
case with Big Data. These problems are related not only to the 3 Vs
of Big Data, but also to privacy and security. Big Data not only
increases the scale of the privacy and security problems
faced in traditional security management, but also adds
new ones that should be addressed with different techniques and
measures [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ]. These security problems occur because
Big Data was not initially conceived as a secure environment
[
        <xref ref-type="bibr" rid="ref33">33</xref>
        ]; therefore, the main security problems are related to the
specific architecture of Big Data itself, which makes it harder to
protect the privacy of the data being used [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        Whether an adequate level of security can be obtained in Big Data can
influence its adoption in an institution, because of the loss of
reputation the institution could suffer or the financial
penalties it could receive, due to regulations, in the case of data breaches;
in fact, without a security guarantee, Big Data will not reach
an appropriate level of acceptance [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ]. Hence, it is important
to have guidance, methodologies, and mechanisms to properly
implement not only the Big Data system, but also its security.
Big Data environments are very complex, so in order to address
their security, we need to start from a global perspective.
Security should be approached from high-level policies that can be
mapped to the lower levels [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Different authors [
        <xref ref-type="bibr" rid="ref2 ref23">2, 23</xref>
        ]
highlight that Reference Architectures (RAs) have been shown to be
valuable for guiding security in different environments; for example,
Cloud Computing [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] or Internet of Things [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
      <p>
        An RA is an abstract software architecture that is based on one
or more domains and has no implementation features [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
Moreover, an RA should be expressed at a high level of abstraction in
order to be reusable, extendable, and configurable. This kind of
architecture can be composed of different patterns to facilitate
the implementation of the system and improve the addition of
non-functional requirements [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. By adding security patterns to
control its identified threats, an RA becomes a Security Reference
Architecture (SRA). In this way, an SRA is a high-level architecture
that incorporates a set of elements facilitating the definition of
security requirements and allowing a better understanding of
security policies, threats, vulnerabilities, etc., and which can be used
to describe a conceptual model of security for Big Data systems
[
        <xref ref-type="bibr" rid="ref21">21</xref>
        ].
      </p>
      <p>
        Among our main concerns in computer security, our current
goal is to improve the security and trust of Big Data
environments. To achieve that objective, our first step is the
creation of an SRA for Big Data. To do that, we consider that
security patterns play a primary role in facilitating the
implementation of security mechanisms in a Big Data ecosystem.
Hence, we modified the RA proposed by the National Institute
of Standards and Technology (NIST) for Big Data [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] to create
a richer architecture, in which the relations between the
different parts of Big Data are clearly exposed in more granular
detail. This enhanced RA will allow a better understanding of
the Big Data ecosystem. To that end, our
reference architecture is specified by means of UML diagrams
[
        <xref ref-type="bibr" rid="ref29">29</xref>
        ]. Finally, along with the SRA, we created a partial example
of how to apply our architecture; we have considered some of
the different threats that can affect a Big Data system, and how
the different components that take part in addressing them can
be instantiated; for example, security patterns that can help
solve those problems.
      </p>
      <p>The paper is organized as follows: first, we
explain the main properties of the NIST proposal
of an RA for Big Data. After that, we present the components
and structure of our SRA, together with an example of how to
use security patterns to address threats in a particular Big Data
project. Subsequently, we compare our proposal with the main
Big Data RA proposals. Finally, we include a section in which
conclusions and future work are discussed.</p>
    </sec>
    <sec id="sec-2">
      <title>REFERENCE MODEL: NIST REFERENCE ARCHITECTURE FOR BIG DATA</title>
      <p>
        For the last several years, NIST has been defining an RA for Big Data
that has received the general consensus of the industry and
scientific community [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]. As of its latest version, released in
August 2017, this architecture collects many different ideas and
features for creating a Big Data ecosystem. These features
were extracted from the Big Data architecture proposals
made by the main companies of the sector, such as Oracle and
IBM. As a result, NIST produced the RA that can be seen in Figure
1. The architecture is divided into five components that
interact with each other and have different objectives. These
components are:
      </p>
      <list list-type="bullet">
        <list-item>
          <p>System Orchestrator (SO): This is one of the most
important components of a Big Data ecosystem because it is
in charge of defining and integrating the required
data application activities into the ecosystem. Its main
purpose is the configuration and
management of the other components of the Big Data
architecture. In an enterprise, this function is typically centralized
and can be mapped to the traditional role of system
governor, which supervises the requirements
and constraints that the Big Data system must fulfill; for example,
policies, architecture, or business requirements.</p>
        </list-item>
        <list-item>
          <p>Data Provider (DP): This component is responsible for feeding the
Big Data ecosystem with new data. To accomplish
that goal, the Data Provider has a collection of interfaces,
or services, between the Big Data system and the data sources.
This set of interfaces acts as a gate between the outside
world and the Big Data system.</p>
        </list-item>
        <list-item>
          <p>Big Data Application Provider (BDAP): The BDAP
component provides a specific set of services along the data life
cycle to meet the requirements established by the SO. Its
main purpose is to
encapsulate the business logic and functionality to be executed by
the architecture. In a regular Big Data scenario, there are
several applications executing over the same data. As data
propagate through the ecosystem, they are processed
and transformed in different ways to obtain valuable
information. To achieve that goal, the
BDAP is composed of different services or activities that
can be considered as the SaaS layer of the Big Data
system. These activities are: collection, preparation, analytics,
visualization, and access. Activities can be implemented
as independent functions and deployed as stand-alone
services. Furthermore, the activities can interact with the
underlying Big Data Framework Provider, as well as with
the Data Consumer, the DP, or even with each other.</p>
        </list-item>
        <list-item>
          <p>Big Data Framework Provider (BDFP): The BDFP
component can be considered as the platform implementation of
the Big Data logic. It supports the activities defined in the
BDAP. In general, Big Data implementations are hybrids
that combine multiple technologies. The BDFP has three main
activities: infrastructure (virtual or physical), platform (how
the data are distributed and organized), and processing (how
data will be processed to support Big Data applications). In
addition, the BDFP component also provides the support
services for the system, such as communications or resource
management.</p>
        </list-item>
        <list-item>
          <p>Data Consumer (DC): It is similar to the DP component.
Usually, the actor that interacts with this component is
an end-user or another system. Like the DP, it is
composed of a set of interfaces between the end-user and
the information.</p>
        </list-item>
      </list>
      <p>
        The NIST proposal cannot be considered an SRA, but it
recognizes the importance of security and privacy in a Big Data
environment. To deal with security problems, this
architecture has a Security and Privacy Fabric that addresses the needs
and solutions in this area. In fact, there is a
specific volume about privacy and security in Big Data [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ].
      </p>
      <p>
        From our point of view, this block-based representation
is not expressive enough. This kind of specification is too
abstract: it places little emphasis on the
details of the subcomponents and how they are connected, which
can make the design and implementation of a
Big Data ecosystem difficult. Following the same approach, ISO/IEC
is also working on the creation of an RA for Big Data
under the standard ISO/IEC 20547-3 [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Although it is a work
in progress, it is expected to follow a similar approach
to the NIST proposal.
      </p>
    </sec>
    <sec id="sec-4">
      <title>A SECURITY REFERENCE ARCHITECTURE (SRA) FOR BIG DATA</title>
      <p>In this section, we describe our SRA proposal, which is
structured using the same schema and components as the guidelines
proposed by NIST. We consider that if our SRA is aligned with the
RA proposed by NIST, it will be easier to implement. Furthermore,
this architecture highlights the importance of implementing
security solutions based on the concepts of the SRA.</p>
    </sec>
    <sec id="sec-6">
      <title>System Orchestrator (SO)</title>
      <p>The main purpose of this component is the enforcement of the
different requirements that the Big Data ecosystem must address.
It also organizes how the requirements are connected to all the
components of the architecture; in this section, we focus on
the security requirements and the relation between them and
the different components. Figure 2 shows the structure of our
SO proposal. Due to the characteristics of this component, the
security activities related to it are generally focused on the
requirements and how to implement and monitor them. Those
requirements must fulfill Big Data goals and should be aligned
with the different business goals and company policies. In this
regard, the role of the Security Administrator is crucial to
ensure the observance of the security requirements. These security
requirements must comply with the regulations affecting each
Big Data ecosystem context. In fact, there are many other kinds of
requirements that can address the needs of a Big Data ecosystem;
for example, architecture, quality, or governance requirements.</p>
      <p>
        There are many examples of security requirements that should
be addressed in a Big Data context. Topics like data privacy and
how to secure the Big Data architecture itself are the most
addressed by researchers [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. These problems can be tackled by
using general mechanisms such as user authorization and
authentication, fraud detection, risk control, auditing, encryption, network
access control, intrusion detection, or mechanisms that guarantee the quality and
security of the data when they come from different data sources
[
        <xref ref-type="bibr" rid="ref17 ref20 ref25 ref3 ref32">3, 17, 20, 25, 32</xref>
        ]. These are general security mechanisms, but
they must be adapted to specific types of systems
based on the possible threats.
      </p>
      <p>
        As shown in Figure 2, these security requirements can
be satisfied by means of different security solutions that follow
the security policies of the company and whose main objective is
to address threats in order to control vulnerabilities. An example of a
security policy in a company is the obligation to use secure
communications; this policy can give rise to a security requirement
in the Big Data environment specifying that data
transfer between components must be secure. One way to approach this
requirement is by using authentication methods, and the
implementation of this security solution can be supported by the
“Role-based access control” security pattern. These security
solutions should be specifically implemented in the BDAP and BDFP
components. However, these solutions are not easy to implement;
thus, our model uses security patterns as guidance. A security
pattern is a solution to a recurrent problem that indicates how
to defend against a threat, or a set of threats, in a concise and
reusable way [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Patterns are abstract solutions that must be
tailored to where they are applied. Furthermore, we can use
misuse patterns [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] as a way to understand each attack and guide
the application of the different security patterns that can be used
to stop a threat. Moreover, security metadata can be defined
as a way to facilitate the coordination and realization of security
requirements. Another topic covered by our architecture is the
context of the asset; for example, the security considerations for
a medical record are totally different from those for a
log file. It is important to evaluate the required security level for
each asset.
      </p>
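      <p>The chain from policy to requirement, solution, and pattern described above can be sketched in code. The following Python fragment is only an illustrative sketch, not the authors' model: the class names, their fields, and the trace function are assumptions chosen to mirror the concepts named in the text, and the instance data reproduce the secure-communications example.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass
class SecurityPattern:
    name: str


@dataclass
class SecuritySolution:
    name: str
    patterns: list = field(default_factory=list)    # patterns that guide the implementation
    components: list = field(default_factory=list)  # where the solution is realized


@dataclass
class SecurityRequirement:
    description: str
    solutions: list = field(default_factory=list)


@dataclass
class SecurityPolicy:
    statement: str
    requirements: list = field(default_factory=list)


def trace(policy):
    """Return one (policy, requirement, solution, pattern, component) row per link."""
    rows = []
    for req in policy.requirements:
        for sol in req.solutions:
            for pat in sol.patterns:
                for comp in sol.components:
                    rows.append((policy.statement, req.description,
                                 sol.name, pat.name, comp))
    return rows


# Instance data taken from the secure-communications example in the text.
rbac = SecurityPattern("Role-based access control")
solution = SecuritySolution("Authentication methods", [rbac], ["BDAP", "BDFP"])
requirement = SecurityRequirement(
    "Data transfer between components must be secure", [solution])
policy = SecurityPolicy("Use secure communications", [requirement])

for row in trace(policy):
    print(" -> ".join(row))
```
]]></preformat>
      <p>Running the sketch prints one line per component in which the solution is implemented, which makes the traceability from a company policy down to the BDAP and BDFP components explicit.</p>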
    </sec>
    <sec id="sec-7">
      <title>Data Provider (DP)</title>
      <p>
        The DP component creates an abstraction of the data sources
that considers their security metadata, if they exist. These
metadata allow the DP to identify the types of access and analysis
allowed by the data source and its security requirements. As
we explained in Section 2, the DP has a set of interfaces. Those
interfaces must consider the constraints of each data source and
also the different security policies and requirements specified by
the SO. In this element, there may be conflicts between the
security requirements of the data source and those of the Big
Data system itself. These clashes must be addressed in a way
that satisfies both sides. The security and privacy issues of this
component are mostly related to how to properly identify and
validate the end-point inputs. The DP interfaces must evaluate
the provenance of the data source. A critical challenge in
the data collection process is knowing how to validate that a data
source is not malicious and how to filter out those that are [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
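      <p>As a rough illustration of a DP interface that checks provenance and reconciles the two sets of security requirements, consider the following Python sketch. Everything in it is an assumption made for the example: the ordered protection levels, the list of trusted origins, the class and field names, and the rule of applying the stricter of the two levels as one possible way to satisfy both sides.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass

# Hypothetical ordered protection levels; a higher number means stricter.
LEVELS = {"public": 0, "internal": 1, "confidential": 2}


@dataclass(frozen=True)
class DataSource:
    name: str
    origin: str          # provenance claimed by the source
    required_level: str  # level demanded by the source's own security requirements


class DataProviderInterface:
    def __init__(self, trusted_origins, system_level):
        self.trusted_origins = set(trusted_origins)
        self.system_level = system_level  # level demanded by the SO requirements

    def admit(self, source):
        """Return the protection level to apply, or None if the source is rejected."""
        if source.origin not in self.trusted_origins:
            return None  # provenance cannot be validated: filter the source out
        # Satisfy both sides by applying the stricter of the two levels.
        return max(source.required_level, self.system_level,
                   key=LEVELS.__getitem__)


dp = DataProviderInterface(trusted_origins=["api.partner.example"],
                           system_level="internal")
print(dp.admit(DataSource("clinic feed", "api.partner.example", "confidential")))
print(dp.admit(DataSource("unknown feed", "unknown.example", "public")))
```
]]></preformat>
      <p>The first call prints confidential because the source is trusted and its own requirement is stricter than that of the system; the second prints None because the origin is not in the trusted list. A real provenance check would need far more than a list of origins.</p>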
      <p>In our SRA, the interfaces are connected with the Collector
service of the BDAP, which will be described in the next
subsection. Figure 3 represents the DP component with its interfaces.
In general, the elements that compose a data source
include the data itself, which can be structured, semi-structured,
or unstructured; the security requirements of the data source; and
the security metadata of the data source. Those elements are not
represented in the diagram because we consider the data source to be
an agent external to the Big Data system. Still, it is important to
know them in order to apply their constraints.</p>
    </sec>
    <sec id="sec-8">
      <title>Big Data Application Provider (BDAP)</title>
      <p>The BDAP component has the objective of meeting the
requirements established by the SO, including its security and privacy
requirements. To achieve that goal, the BDAP is composed of
different services or activities that can be considered as the SaaS
(Software as a Service) layer of the Big Data ecosystem; in our case,
we assume that, in general, Big Data is implemented on a Cloud
platform, which will affect how the SRA is defined in the BDFP
component. Figure 4 shows the different services that constitute
this component, and also the BDAP Security Solution, which must
map the SO security solutions to these stages; for example,
authorization may control here who can apply which operations to
perform data analysis.</p>
      <p>As represented in the diagram, not all the activities can
communicate with each other; there is a sequential order of
execution. Note also that some of these activities are not mandatory
in a Big Data ecosystem. The preparation step has the purpose
of validating, cleaning, and storing the data, but in a real-time
scenario where the data should be analysed as soon as they enter
the system, this activity might be skipped. Something similar
happens with the visualization step: if the data consumer is not
a human end-user but another system, such as a data warehouse
or even another Big Data ecosystem, this activity may not be
necessary.</p>
      <p>Nevertheless, the other three activities are basic in a Big Data
ecosystem: the collection activity acts like an ETL (Extract,
Transform, and Load) process and combines sets of data of similar
structure with the objective of unifying them; the analytics activity
includes a set of techniques to obtain valuable knowledge from
data, for example, MapReduce algorithms; and finally, the access
activity has the purpose of communicating with the DC, acting
as an interface between the DC and the visualization and analytics
activities. The relation between those different activities is
represented in Figure 4 by dotted lines, because it is a temporary
usage relation.</p>
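      <p>The sequential order of the BDAP activities, and the two conditions under which an activity may be skipped, can be summarized in a few lines of Python. This is a minimal sketch under stated assumptions: the function name and its two boolean parameters are hypothetical, and the rules encode only what the text says about real-time scenarios and non-human data consumers.</p>
      <preformat><![CDATA[
```python
def plan_activities(real_time, consumer_is_human):
    """Return the BDAP activities to run, in their sequential order."""
    activities = ["collection"]
    if not real_time:
        activities.append("preparation")    # validate, clean, and store the data
    activities.append("analytics")
    if consumer_is_human:
        activities.append("visualization")  # only needed for a human end-user
    activities.append("access")             # interface towards the Data Consumer
    return activities


print(plan_activities(real_time=False, consumer_is_human=True))
print(plan_activities(real_time=True, consumer_is_human=False))
```
]]></preformat>
      <p>The first call returns all five activities; the second returns only collection, analytics, and access, which are the three activities the text treats as basic.</p>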
    </sec>
    <sec id="sec-9">
      <title>Big Data Framework Provider (BDFP)</title>
      <p>In general, the BDFP component is composed of a set of
clusters which, in turn, are composed of nodes. Those nodes can be
deployed by means of Virtual Machines or Containers, which
interact with the hardware itself and the OS.</p>
      <p>The BDFP component in NIST is very abstract and lacks
detail about the subcomponents needed to perform its processes.
Therefore, our proposal places more emphasis on the different
elements and how they are connected. Figure 5 depicts the
different subcomponents of the BDFP. Our SRA highlights the idea of
a Big Data ecosystem with the possibility of implementing the
system with a Cloud environment and visualization techniques.</p>
      <p>
        With regard to security and privacy issues, in this component the
activities should focus on the encryption and key
management of the data, the isolation and containerization of process
execution, authorization, authentication, audit logging, and how
to secure the storage and the network. Those security issues
should be addressed by means of the security solutions defined
in the SO, which can be implemented at this level as BDFP
security solutions. The SO security solutions are now mapped to data
protection, including the application of cryptography and specialized
authorization mechanisms [
        <xref ref-type="bibr" rid="ref37 ref8">8, 37</xref>
        ].
      </p>
    </sec>
    <sec id="sec-10">
      <title>Data Consumer (DC)</title>
      <p>Like the DP, the DC component is composed of a set of
interfaces. The interaction could include interactive
visualization, report creation, or data drilling using business intelligence
techniques. It is important to highlight that these interfaces must
handle authorization and authentication, so that
the DC matches the metadata related to
the end-user against the security requirements and policies of the
information.</p>
      <p>Finally, Figure 6 summarizes our complete SRA for Big Data. In
this figure, the relationships between the different components
of the architecture can be seen in perspective. This figure is
important to better understand the example presented
in the following subsection.</p>
    </sec>
    <sec id="sec-11">
      <title>Examples of Application of Security Patterns</title>
      <p>
        To show the usefulness of our SRA, we explain an
example of how to employ security patterns using our architecture.
We created the example by identifying some of the threats that
can be found in the different activities of the BDAP component.
A systematic method for the enumeration of threats is shown
in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Those threats can be addressed by means of security
patterns, which, in some cases, should be modified from general
security patterns to meet the Big Data inherent features. The
modification of these patterns, and the creation of new ones if
needed, is beyond the purpose of this paper and is considered
as future work. Table I summarizes some of the threats of each
activity and the general patterns that can be applied to solve
them. Those patterns are defined in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>To better understand how to integrate the different
components of our SRA and the security patterns, we describe
how the threat TC1 can be addressed by using security patterns.
We use an object diagram, shown in Figure 7, to explain it.
In this scenario, the stored data are the main
asset to protect. This asset has a vulnerability: it has no protection,
and this vulnerability could be exploited by a threat like TC1. To
prevent that situation, it is necessary to implement a security
solution. To facilitate the implementation of the solution, two
security patterns can be used: Role-based access control and
Authentication. However, this security solution will still have a
high abstraction level because it is defined in the SO
component. Hence, a low-level implementation of the security
solution should be approached at the BDAP level. In this case,
TC1 can affect the different services provided by the BDAP, which
is the reason why the security solution should be implemented
there and not in another component.</p>
      <p>Figure 6: Big Data SRA complete diagram</p>
      <p>Table I, ID column: TC1, TC2, TC3, TC4, TCo1, TP1, TA1, TA2, TV1, TAc1</p>
      <p>Furthermore, we describe how to create an instance of the
two security patterns used to secure the Collector
subcomponent (the Authentication and Role-based access control security
patterns) by means of a partial example. In this example, we
focus on a Big Data system whose objective is to process tweets
from the Twitter platform to analyse the general sentiment about
a product. Figure 8 shows the object diagram for this example.
The main component is what we want to protect, in this case
the tweets that have been obtained to be analysed.</p>
      <p>The Authentication pattern allows us to verify the identity of
the user by using a proof of identity and an authenticator. On
the other hand, as its name indicates, one of the most important
steps in implementing Role-based access control is to define
the different roles. In this case, we have defined four roles: the
administrator of the Big Data system, the data scientist, the end
user, and the data owner. As we explained before, this example is
focused on the Collector phase, so the rights defined for the roles
must take this into account; for example, in this phase the end
user should not have any rights over the data. Hence, Figure
8 shows the different functions that users can perform over
the data according to their rights.</p>
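      <p>To make the combination of the two patterns more concrete, the following Python sketch instantiates an authenticator and a role table for the Collector phase. It is a hedged illustration rather than the object diagram of Figure 8: the four role names come from the text, and so does the empty set of rights for the end user, but the rights given to the other roles, the user names, and the password-based proof of identity are assumptions made for the example.</p>
      <preformat><![CDATA[
```python
import hashlib
import hmac
import os

# Hypothetical rights per role in the Collector phase. Only the end user's
# empty set comes from the text; the other assignments are illustrative.
ROLE_RIGHTS = {
    "administrator": {"configure"},
    "data_scientist": {"read"},
    "data_owner": {"read", "delete"},
    "end_user": set(),
}


def _digest(password, salt):
    # Salted key derivation so that stored proofs are not plain passwords.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


class Authenticator:
    """Checks a proof of identity (here, a password) for a registered user."""

    def __init__(self):
        self._records = {}  # user -> (salt, digest, role)

    def register(self, user, password, role):
        salt = os.urandom(16)
        self._records[user] = (salt, _digest(password, salt), role)

    def authenticate(self, user, password):
        """Return the user's role if the proof of identity is valid, else None."""
        record = self._records.get(user)
        if record is None:
            return None
        salt, expected, role = record
        if not hmac.compare_digest(expected, _digest(password, salt)):
            return None
        return role


def is_allowed(role, operation):
    """Role-based check: unknown or missing roles have no rights."""
    return operation in ROLE_RIGHTS.get(role, set())


auth = Authenticator()
auth.register("ana", "s3cret", "data_scientist")
auth.register("eve", "guest", "end_user")

role = auth.authenticate("ana", "s3cret")
print(role, is_allowed(role, "read"))
print(is_allowed(auth.authenticate("eve", "guest"), "read"))
print(auth.authenticate("ana", "wrong"))
```
]]></preformat>
      <p>In this sketch, the data scientist authenticates and may read the collected tweets, the end user authenticates but has no rights in the Collector phase, and a wrong proof of identity yields no role at all. Keeping authentication separate from the role table mirrors the use of two distinct patterns.</p>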
    </sec>
    <sec id="sec-13">
      <title>COMPARISON WITH OTHER PROPOSALS</title>
      <p>There are not many reference architectures for Big Data systems,
and even fewer that focus on security.
Nevertheless, different authors and organizations have proposed
reference architectures for Big Data. In this section, we
describe some of the most relevant proposals.</p>
      <p>
        Demchenko et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] propose a Big Data Framework
Architecture that establishes the data lifecycle in a Big Data ecosystem.
As in the NIST approach, they use a block representation, but
with more detail in the relationships between the different
components of the architecture. However, they address security only
superficially and as an isolated feature, not really
connected to the other components. In [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ] the authors propose an
architecture that is complete in terms of the relations between the
different components; however, we found a lack of consideration
given to security and privacy aspects. Klein et al. propose in
[
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] a specific reference architecture for Big Data in the national
security domain. Their architecture is very similar to the one
proposed by NIST. Our goal is to obtain a better abstraction of
the architecture, but it is still interesting how they address some
concerns by using solution patterns. They highlight the
importance of having a specific domain for the requirements. In our
case, requirements, and specifically the ones related to security,
are the main part of the SO component.
      </p>
      <p>
        Sqrrl [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ] and BlueTalon [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] propose Big Data models focused
on data-centric security. Their purpose is to embed security
information within the data itself. Sqrrl places
emphasis on access control for each field of data; to do
that, it uses a layered architecture built around the value or
sensitivity of the data. BlueTalon, on the other hand, includes in
its proposal the concept of data lakes, storage repositories that
hold a huge amount of raw data until it is needed. There are
other proposals made by the main IT companies, such as Oracle [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ],
NTT Data [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], IBM [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], Microsoft [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] or SAP [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ]. Table II
summarizes these RAs and compares them with our SRA proposal. The
criteria were selected based on a previous systematic mapping
study that we carried out on Big Data security concerns [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ].
As a side effect of this work, we detected some characteristics
that are usually not considered in the different proposals and
could be important for defining an SRA.
      </p>
      <p>Unlike the other proposals, our SRA treats requirements
as the main factor to consider in properly implementing a Big
Data ecosystem, more specifically the security requirements that
should be addressed in this phase. Moreover, we have found
in some proposals a lack of connection between the different
components of the architecture; our SRA clearly specifies those
relationships. Finally, our proposal has a medium abstraction
level, because we do not consider specific technology
solutions or applications.</p>
      <p>Although there are some SRAs for Cloud environments and
some of their contributions could be useful in a Big Data
environment, there are still differences significant
enough to justify a dedicated SRA for Big Data. For example, in some
cases the Big Data environment is supported by a Cloud
infrastructure, so Big Data RAs must consider that
possibility. In general, Cloud RAs focus on the
infrastructure, while a Big Data RA must also cover the services
associated with data analysis.</p>
    </sec>
    <sec id="sec-14">
      <title>CONCLUSION AND FUTURE WORK</title>
      <p>A more precise Reference Architecture (RA) is a better framework
to guide the use of security mechanisms to provide a high level
of security. Our Security Reference Architecture (SRA) subsumes
the published RAs, including the proposals made by NIST, Oracle,
NTT, and different researchers.</p>
      <p>We have created an SRA described by means of UML diagrams
that aim to facilitate the implementation of secure Big Data systems. We
decided to use UML diagrams because we found a lack of
proposals in which the relationships between the different components and
subcomponents are precisely defined. Also, this kind of
diagram makes it possible to apply different security patterns, which
are usually described as UML models. Security patterns address
recurrent security problems, and we have defined some of the security
patterns that can be implemented to protect the system against
threats. Our SRA emphasizes the idea of a Big Data ecosystem by
implementing the system using a Cloud Computing environment.</p>
      <p>
        We have also listed some of the threats that can be found
in a Big Data ecosystem; however, a deeper understanding of
the different threats that can affect these systems is needed.
We will address this problem by creating different use cases
and scenarios to identify those threats, following the method of [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
Once we have identified the threats, we will find, adapt, or create
security patterns that can solve those problems. We consider
these topics the next steps to complete our SRA. Furthermore,
it is important to analyze the different stakeholders
that interact with the Big Data use cases.
      </p>
    </sec>
    <sec id="sec-15">
      <title>ACKNOWLEDGMENTS</title>
      <p>This work was funded by the SEQUOIA project (Ministerio de
Economía y Competitividad and the Fondo Europeo de Desarrollo
Regional FEDER, TIN2015-63502-C3-1-R).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Jacky</given-names>
            <surname>Akoka</surname>
          </string-name>
          , Isabelle Comyn-Wattiau, and
          <string-name>
            <given-names>Nabil</given-names>
            <surname>Laoufi</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Research on Big Data - A systematic mapping study</article-title>
          .
          <source>SI: New modeling in Big Data</source>
          <volume>54</volume>
          , Part 2 (Nov.
          <year>2017</year>
          ),
          <fpage>105</fpage>
          -
          <lpage>115</lpage>
          . https://doi.org/10.1016/j.csi.2017.01.004
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Paris</given-names>
            <surname>Avgeriou</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>Describing, Instantiating and Evaluating a Reference Architecture: A Case Study</article-title>
          .
          <source>Default journal</source>
          (
          <year>2003</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E.</given-names>
            <surname>Bertino</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Big Data - Security and Privacy</article-title>
          .
          <source>In 2015 IEEE International Congress on Big Data</source>
          .
          <fpage>757</fpage>
          -
          <lpage>761</lpage>
          . https://doi.org/10.1109/BigDataCongress.2015.126
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>BlueTalon.</surname>
          </string-name>
          <year>2016</year>
          .
          <article-title>BlueTalon Data-Centric Security Platform: Bringing Order to Data Security Chaos</article-title>
          . (
          <year>2016</year>
          ). http://bluetalon.com/data-centric_security/
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Doug</given-names>
            <surname>Cackett</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Information Management And Big Data A Reference Architecture</article-title>
          . Oracle (February
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Min</given-names>
            <surname>Chen</surname>
          </string-name>
          , Shiwen Mao, and Yunhao Liu.
          <year>2014</year>
          .
          <article-title>Big data: A survey</article-title>
          .
          <source>Mobile Networks and Applications</source>
          <volume>19</volume>
          ,
          <issue>2</issue>
          (
          <year>2014</year>
          ),
          <fpage>171</fpage>
          -
          <lpage>209</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Big Data Working Group, Cloud Security Alliance (CSA)</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Expanded Top Ten Big Data Security and Privacy Challenges</article-title>
          . (April
          <year>2013</year>
          ). https://downloads.cloudsecurityalliance.org/initiatives/bdwg/Expanded_Top_Ten_Big_Data_Security_and_Privacy_Challenges.pdf
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Jason C.</given-names>
            <surname>Cohen</surname>
          </string-name>
          and
          <string-name>
            <given-names>Subrata</given-names>
            <surname>Acharya</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Towards a trusted HDFS storage platform: Mitigating threats to Hadoop infrastructures using hardware-accelerated encryption with TPM-rooted key protection</article-title>
          .
          <source>Journal of Information Security and Applications</source>
          <volume>19</volume>
          ,
          <issue>3</issue>
          (
          <year>2014</year>
          ),
          <fpage>224</fpage>
          -
          <lpage>244</lpage>
          . https://doi.org/10.1016/j.jisa.2014.03.003
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>IBM</given-names>
            <surname>Corporation</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>IBM Big Data &amp; Analytics RA</article-title>
          . (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>NTT</given-names>
            <surname>DATA</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>NTT DATA BigData Reference Architecture</article-title>
          . (
          <year>2015</year>
          ). http://www.nttdata.com/global/en/shared/pdf/bigdata_reference_architecture.pdf
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Yuri</given-names>
            <surname>Demchenko</surname>
          </string-name>
          , Cees De Laat, and
          <string-name>
            <given-names>Peter</given-names>
            <surname>Membrey</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Defining architecture components of the Big Data Ecosystem</article-title>
          .
          <source>In Collaboration Technologies and Systems (CTS), 2014 International Conference on</source>
          . IEEE,
          <fpage>104</fpage>
          -
          <lpage>112</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Eduardo B.</given-names>
            <surname>Fernandez</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Security patterns in practice: designing secure architectures using software patterns</article-title>
          . John Wiley &amp; Sons.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Eduardo B.</given-names>
            <surname>Fernandez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Raul</given-names>
            <surname>Monge</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Keiko</given-names>
            <surname>Hashizume</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Building a security reference architecture for cloud systems</article-title>
          .
          <source>Requirements Engineering</source>
          <volume>21</volume>
          ,
          <issue>2</issue>
          (June
          <year>2016</year>
          ),
          <fpage>225</fpage>
          -
          <lpage>249</lpage>
          . https://doi.org/10.1007/s00766-014-0218-7
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Eduardo B.</given-names>
            <surname>Fernandez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Nobukazu</given-names>
            <surname>Yoshioka</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Hironori</given-names>
            <surname>Washizaki</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Modeling misuse patterns</article-title>
          .
          <source>In Availability, Reliability and Security, 2009. ARES'09. International Conference on</source>
          . IEEE,
          <fpage>566</fpage>
          -
          <lpage>571</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Eduardo B.</given-names>
            <surname>Fernandez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Nobukazu</given-names>
            <surname>Yoshioka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Hironori</given-names>
            <surname>Washizaki</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Madiha H.</given-names>
            <surname>Syed</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Modeling and Security in Cloud Ecosystems</article-title>
          .
          <source>Future Internet</source>
          <volume>8</volume>
          ,
          <issue>2</issue>
          (April
          <year>2016</year>
          ),
          <elocation-id>13</elocation-id>
          . https://doi.org/10.3390/fi8020013
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16] ISO/IEC.
          <year>2018</year>
          .
          <article-title>ISO/IEC CD 20547-3 - Information technology - Big data reference architecture - Part 3: Reference architecture</article-title>
          . (
          <year>2018</year>
          ). https://www.iso.org/standard/71277.html?browse=tc
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kaushik</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Jain</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Challenges to big data security and privacy</article-title>
          .
          <source>International Journal of Computer Science and Information Technologies (IJCSIT)</source>
          <volume>5</volume>
          ,
          <issue>3</issue>
          (
          <year>2014</year>
          ),
          <fpage>3042</fpage>
          -
          <lpage>3043</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>John</given-names>
            <surname>Klein</surname>
          </string-name>
          , Ross Buglak, David Blockow,
          <string-name>
            <given-names>Troy</given-names>
            <surname>Wuttke</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Brenton</given-names>
            <surname>Cooper</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>A reference architecture for big data systems in the national security domain</article-title>
          .
          <source>In Proceedings of the 2nd International Workshop on BIG Data Software Engineering</source>
          . ACM, Austin, Texas,
          <fpage>51</fpage>
          -
          <lpage>57</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Srdjan</given-names>
            <surname>Krco</surname>
          </string-name>
          , Boris Pokric, and
          <string-name>
            <given-names>Francois</given-names>
            <surname>Carrez</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Designing IoT architecture(s): A European perspective</article-title>
          .
          <source>In Internet of Things (WF-IoT), 2014 IEEE World Forum on</source>
          . IEEE,
          <fpage>79</fpage>
          -
          <lpage>84</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Guillermo</given-names>
            <surname>Lafuente</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>The big data security challenge</article-title>
          .
          <source>Network Security</source>
          <year>2015</year>
          ,
          <volume>1</volume>
          (Jan.
          <year>2015</year>
          ),
          <fpage>12</fpage>
          -
          <lpage>14</lpage>
          . https://doi.org/10.1016/S1353-4858(15)70009-7
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Fang</given-names>
            <surname>Liu</surname>
          </string-name>
          , Jin Tong, Jian Mao, Robert Bohn, John Messina, Lee Badger, and
          <string-name>
            <given-names>Dawn</given-names>
            <surname>Leaf</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>NIST cloud computing reference architecture</article-title>
          .
          <source>NIST Special Publication 500-292</source>
          (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>V.</given-names>
            <surname>Mayer-Schönberger</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Cukier</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Big Data: A Revolution that Will Transform how We Live, Work, and Think</article-title>
          . Houghton Mifflin Harcourt. https://books.google.es/books?id=uy4lh-WEhhIC
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Nenad</given-names>
            <surname>Medvidovic</surname>
          </string-name>
          and
          <string-name>
            <given-names>Richard N.</given-names>
            <surname>Taylor</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Software architecture: foundations, theory, and practice</article-title>
          .
          <source>In Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering-Volume 2</source>
          . ACM,
          <fpage>471</fpage>
          -
          <lpage>472</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <surname>Microsoft</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Microsoft Big Data Solution Brief</article-title>
          . (
          <year>2014</year>
          ). http://download.microsoft.com/download/f/a/1/fa126d6d-841b-4565-bb26-d2add4a28f24/microsoft_big_data_solution_brief.pdf
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Julio</given-names>
            <surname>Moreno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Manuel A.</given-names>
            <surname>Serrano</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Eduardo</given-names>
            <surname>Fernández-Medina</surname>
          </string-name>
          <year>2016</year>
          .
          <article-title>Main Issues in Big Data Security</article-title>
          .
          <source>Future Internet</source>
          <volume>8</volume>
          ,
          <issue>3</issue>
          (
          <year>2016</year>
          ),
          <fpage>44</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>NIST</given-names>
            <surname>NBD-WG</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>NIST Big Data Reference Architecture</article-title>
          . (
          <year>2017</year>
          ). https://bigdatawg.nist.gov/_uploadfiles/M0639_v1_9796711131.docx
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>NIST</given-names>
            <surname>NBD-WG</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>NIST Big Data Security and Privacy</article-title>
          . (
          <year>2017</year>
          ). https://bigdatawg.nist.gov/_uploadfiles/M0638_v1_4829021654.docx
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Pekka</given-names>
            <surname>Pääkkönen</surname>
          </string-name>
          and
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Pakkala</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Reference architecture and classification of technologies, products and services for big data systems</article-title>
          .
          <source>Big Data Research</source>
          <volume>2</volume>
          ,
          <issue>4</issue>
          (
          <year>2015</year>
          ),
          <fpage>166</fpage>
          -
          <lpage>186</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>James</given-names>
            <surname>Rumbaugh</surname>
          </string-name>
          , Ivar Jacobson, and
          <string-name>
            <given-names>Grady</given-names>
            <surname>Booch</surname>
          </string-name>
          .
          <year>2004</year>
          .
          <article-title>The Unified Modeling Language Reference Manual</article-title>
          .
          <source>Pearson Higher Education</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>S.</given-names>
            <surname>Sagiroglu</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Sinanc</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Big data: A review</article-title>
          .
          <source>Collaboration Technologies and Systems (CTS), 2013 International Conference on</source>
          (May
          <year>2013</year>
          ),
          <fpage>42</fpage>
          -
          <lpage>47</lpage>
          . https://doi.org/10.1109/CTS.2013.6567202
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <surname>SAP</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>CIO Guide to Using the SAP HANA® Platform for Big Data</article-title>
          .
          (Feb.
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>B.</given-names>
            <surname>Saraladevi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Pazhaniraja</surname>
          </string-name>
          , P. Victer Paul, MS Saleem Basha, and
          <string-name>
            <given-names>P.</given-names>
            <surname>Dhavachelvan</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Big Data and Hadoop-A study in security perspective</article-title>
          .
          <source>Procedia Computer Science</source>
          <volume>50</volume>
          (
          <year>2015</year>
          ),
          <fpage>596</fpage>
          -
          <lpage>601</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>Priya P.</given-names>
            <surname>Sharma</surname>
          </string-name>
          and
          <string-name>
            <given-names>Chandrakant P.</given-names>
            <surname>Navdeti</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Securing big data hadoop: a review of security issues, threats and solution</article-title>
          .
          <source>Int. J. Comput. Sci. Inf. Technol</source>
          <volume>5</volume>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <surname>SQRRL</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Big Data and Data Centric Security</article-title>
          . (
          <year>2014</year>
          ). http://sqrrl.com/media/Data-Centric-Security-WP-final-.pdf
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>Bhavani</given-names>
            <surname>Thuraisingham</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Big data security and privacy</article-title>
          .
          <source>In Proceedings of the 5th ACM Conference on Data and Application Security and Privacy</source>
          . ACM,
          <fpage>279</fpage>
          -
          <lpage>280</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>Hua</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Xiaohong</given-names>
            <surname>Jiang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Georgios</given-names>
            <surname>Kambourakis</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Special issue on Security, Privacy and Trust in network-based Big Data</article-title>
          .
          <source>Information Sciences: an International Journal</source>
          <volume>318</volume>
          , C
          (
          <year>2015</year>
          ),
          <fpage>48</fpage>
          -
          <lpage>50</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>Jiaqi</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Lizhe</given-names>
            <surname>Wang</surname>
          </string-name>
          , Jie Tao, Jinjun Chen, Weiye Sun, Rajiv Ranjan, Joanna Kołodziej, Achim Streit, and
          <string-name>
            <given-names>Dimitrios</given-names>
            <surname>Georgakopoulos</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>A security framework in G-Hadoop for big data computing across distributed Cloud data centres</article-title>
          .
          <source>J. Comput. System Sci.</source>
          <volume>80</volume>
          ,
          <issue>5</issue>
          (
          <year>2014</year>
          ),
          <fpage>994</fpage>
          -
          <lpage>1007</lpage>
          . https://doi.org/10.1016/j.jcss.2014.02.006. Special Issue on Dependable and Secure Computing
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>