<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Towards a Security Reference Architecture for Big Data</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Julio</forename><surname>Moreno</surname></persName>
							<email>julio.moreno@uclm.es</email>
							<affiliation key="aff0">
								<orgName type="laboratory">GSyA Research Group</orgName>
								<orgName type="institution">University of Castilla-La Mancha Ciudad Real</orgName>
								<address>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Manuel</forename><forename type="middle">A</forename><surname>Serrano</surname></persName>
							<email>manuel.serrano@uclm.es</email>
							<affiliation key="aff1">
								<orgName type="laboratory">Alarcos Research Group</orgName>
								<orgName type="institution">University of Castilla-La Mancha Ciudad Real</orgName>
								<address>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Eduardo</forename><surname>Fernandez-Medina</surname></persName>
							<email>eduardo.fdezmedina@uclm.es</email>
							<affiliation key="aff2">
								<orgName type="laboratory">GSyA Research Group</orgName>
								<orgName type="institution">University of Castilla-La Mancha Ciudad Real</orgName>
								<address>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Eduardo</forename><forename type="middle">B</forename><surname>Fernandez</surname></persName>
							<email>fernande@fau.edu</email>
							<affiliation key="aff3">
								<orgName type="department">Department of Computer and Electrical Engineering and Computer Science</orgName>
								<orgName type="institution">Florida Atlantic University Boca Raton</orgName>
								<address>
									<region>Florida</region>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Towards a Security Reference Architecture for Big Data</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">A7D91EBEBF1D4446DD229939B54F63A5</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T03:33+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Companies are aware of the importance of Big Data, as data are essential to conduct their daily activities. However, new problems arise with new technologies, as is the case with Big Data; these problems are related not only to the 3Vs of Big Data, but also to privacy and security. Security is crucial in Big Data systems, but unfortunately, security problems occur because Big Data was not initially conceived as a secure environment. Furthermore, securing these systems is difficult due to the heterogeneous configurations that a Big Data system can have. One way to solve this problem is to adopt a global perspective; in this sense, a Reference Architecture (RA) is a high-level abstraction of a system that can be useful in the implementation of complex systems. Several initiatives have proposed an RA for Big Data, such as those from IBM, Oracle, NIST, or ISO, but none of them has security as its main focus. It is widely accepted that adding elements to an RA that address threats and facilitate the definition of security requirements is a good starting point for dealing with these kinds of threats and, in this way, converting RAs into Security Reference Architectures (SRAs). In this paper, an SRA for Big Data is defined using UML models in order to ease secure Big Data implementations and to allow security patterns to be applied to secure the final Big Data systems.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">INTRODUCTION</head><p>Companies are increasingly aware of the importance of Big Data <ref type="bibr" target="#b0">[1]</ref>. For all of them, data are essential to conduct their daily activities and to help senior management achieve business objectives and, as a result, make better decisions based on the information extracted from such data <ref type="bibr" target="#b20">[22]</ref>. Big Data implies a change compared to traditional techniques in three different ways: the amount of data (volume), the rate of generation and transmission of data (velocity), and the heterogeneity of the types of structured and unstructured data that it can handle (variety) <ref type="bibr" target="#b5">[6]</ref>. These properties are known as the three Vs of Big Data <ref type="bibr" target="#b26">[30]</ref>.</p><p>New problems usually arise with new technologies, as is the case with Big Data. These problems are related not only to the three Vs of Big Data, but also to privacy and security. Big Data not only increases the scale of the problems related to privacy and security, as faced in the traditional management of security, but also adds new ones that should be addressed with different techniques and measures <ref type="bibr" target="#b32">[36]</ref>.
These security problems occur because Big Data was not initially conceived as a secure environment <ref type="bibr" target="#b29">[33]</ref>; therefore, the main security problems are related to the specific architecture of Big Data itself, which makes it harder to protect the privacy of the data being used <ref type="bibr" target="#b6">[7]</ref>.</p><p>Obtaining an adequate level of security in Big Data can influence its implementation in an institution, because in the case of a data breach the institution could suffer a loss of reputation or receive financial penalties imposed by regulations; in fact, without a security guarantee, Big Data will not reach an appropriate level of acceptance <ref type="bibr" target="#b31">[35]</ref>. Hence, it is important to have guidance, methodologies, and mechanisms to properly implement not only the Big Data system, but also its security. Big Data environments are very complex, so in order to address their security, we need to start from a global perspective. Security should be approached from high-level policies that can be mapped to the lower levels <ref type="bibr" target="#b11">[13]</ref>. Different authors <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b21">23]</ref> highlight that Reference Architectures (RAs) have been shown to be valuable to guide security in different environments; for example, Cloud Computing <ref type="bibr" target="#b11">[13]</ref> or the Internet of Things <ref type="bibr" target="#b17">[19]</ref>.</p><p>An RA is an abstract software architecture that is based on one or more domains and has no implementation features <ref type="bibr" target="#b1">[2]</ref>. Moreover, an RA should be expressed at a high level of abstraction, in order to be reusable, extendable, and configurable.
This kind of architecture can be composed of different patterns to facilitate the implementation of the system and improve the addition of non-functional requirements <ref type="bibr" target="#b13">[15]</ref>. By adding security patterns to control its identified threats, an RA becomes a Security Reference Architecture (SRA). In this way, an SRA is a high-level architecture that incorporates a set of elements facilitating the definition of security requirements and allowing better understanding of security policies, threats, vulnerabilities, etc., and which can be used to describe a conceptual model of security for Big Data systems <ref type="bibr" target="#b19">[21]</ref>.</p><p>Among our main concerns in computer security, our current goal is to improve the security and trust of Big Data environments. In order to achieve that objective, our first step is the creation of an SRA for Big Data. To do that, we consider that security patterns have a fundamental role in facilitating the implementation of security mechanisms in a Big Data ecosystem. Hence, we modified the RA proposed by the National Institute of Standards and Technology (NIST) for Big Data <ref type="bibr">[26]</ref> to create a richer architecture, in which the relations between the different parts of Big Data are clearly exposed in more granular detail. This enhanced RA will allow a better understanding of the Big Data ecosystem. In order to achieve that purpose, our reference architecture is specified by means of UML diagrams <ref type="bibr" target="#b25">[29]</ref>.
Finally, along with the SRA, we created a partial example of how to apply our architecture; we have considered some of the different threats that can affect a Big Data system, and how the different components that take part in addressing them can be instantiated; for example, security patterns that can help to solve those problems.</p><p>The paper is organized as follows: first, we explain the main properties of the NIST proposal for an RA for Big Data. After that, we present the components and structure of our SRA, together with an example of how to use security patterns to address threats in a particular Big Data project. Subsequently, we compare our proposal with the main Big Data RA proposals. Finally, we include a section in which conclusions and future work are discussed.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">REFERENCE MODEL: NIST REFERENCE ARCHITECTURE FOR BIG DATA</head><p>Over the last several years, NIST has defined an RA for Big Data, which has received general consensus from industry and the scientific community <ref type="bibr">[26]</ref>. With the release of its latest version in August 2017, this architecture collects many different ideas and features for creating a Big Data ecosystem. This set of features was extracted from the Big Data architecture proposals made by the main companies of the sector, such as Oracle and IBM. As a result, NIST produced the RA that can be seen in Figure <ref type="figure" target="#fig_0">1</ref>. The architecture is divided into five different components that interact with each other and have different objectives. These components are:</p><p>• System Orchestrator (SO): This is one of the most important components of a Big Data ecosystem because it is the one in charge of defining and integrating the required data application activities into the ecosystem. The main purpose of this component is the configuration and management of the other components of the Big Data architecture. In an enterprise, this function is typically centralized and can be mapped to the traditional role of system governor, which supervises the requirements and constraints that the Big Data system must fulfill; for example, policies, architecture, or business requirements.</p><p>• Data Provider (DP): This component is in charge of feeding the Big Data ecosystem with new data. In order to accomplish that goal, the Data Provider has a collection of interfaces, or services, between the Big Data system and the data sources. This set of interfaces acts like a gate between the outside world and the Big Data system.</p><p>• Big Data Application Provider (BDAP): The BDAP component provides a specific set of services along the data life cycle to meet the requirements established by the SO. Its main purpose is to encapsulate the business logic and functionality to be executed by the architecture. In a regular Big Data scenario, there are several applications executing over the same data. As data propagates through the ecosystem, it is processed and transformed in different ways to obtain valuable information. In order to achieve that goal, the BDAP is composed of different services or activities that can be considered as the SaaS layer of the Big Data system. These activities are: collection, preparation, analytics, visualization, and access. Activities can be implemented as independent functions and deployed as stand-alone services. Furthermore, the activities can interact with the underlying Big Data Framework Provider, as well as with the Data Consumer, the DP, or even with each other.</p><p>• Big Data Framework Provider (BDFP): The BDFP component can be considered as the platform implementation of the Big Data logic. It supports the activities defined in the BDAP. In general, Big Data implementations are hybrids that combine multiple technologies. The BDFP has three main activities: infrastructure (virtual or physical), platform (how the data is distributed and organized), and processing (how data will be processed to support Big Data applications). In addition, the BDFP component also provides the support services for the system, like communications or resource management.</p><p>• Data Consumer (DC): This component is similar to the DP component. Usually, the actor that interacts with it is an end user or another system. Like the DP, it is composed of a set of interfaces between the end user and the information.</p><p>The NIST proposal cannot be considered an SRA, but it recognizes the importance of security and privacy in a Big Data environment. In order to face the security problems, this architecture has a Security and Privacy Fabric that addresses the needs and solutions related to this specific topic.
In fact, there is a specific volume on privacy and security in Big Data <ref type="bibr">[27]</ref>.</p><p>From our point of view, this representation based on blocks is not expressive enough. This kind of specification is at too high a level of abstraction: it places little emphasis on the details of the subcomponents and how they are connected. This approach can make the design and implementation of a Big Data ecosystem difficult. Following the same approach, ISO/IEC is also working on the creation of an RA for Big Data under the standard ISO/IEC 20547-3 <ref type="bibr" target="#b14">[16]</ref>. Although it is a work in progress, it is expected to follow an approach similar to the NIST proposal.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">A SECURITY REFERENCE ARCHITECTURE (SRA) FOR BIG DATA</head><p>In this section, we describe our SRA proposal, which is structured using the same schema and components as the guidelines proposed by NIST. We consider that if our SRA is aligned with the RA proposed by NIST, it will be easier to implement. Furthermore, this architecture highlights the importance of implementing security solutions based on the concepts of the SRA.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">System Orchestrator (SO)</head><p>The main purpose of this component is the enforcement of the different requirements that the Big Data ecosystem must address. Also, it organizes how the requirements are connected to all the components of the architecture; in this section, we will focus on the security requirements and the relation between them and the different components. Figure <ref type="figure" target="#fig_1">2</ref> shows the structure of our SO proposal. Due to the characteristics of this component, the security activities related to it are in general focused on the requirements and how to implement and monitor them. Those requirements must fulfill Big Data goals and should be aligned with the different business goals and company policies. In this regard, the role of the Security Administrator is crucial to ensure the observance of the security requirements. These security requirements must comply with the regulations affecting each Big Data ecosystem context. Besides security, there are many other kinds of requirements that can address the needs of a Big Data ecosystem; for example, architecture, quality, or governance requirements.</p><p>There are many examples of security requirements that should be addressed in a Big Data context. Topics like data privacy and how to secure the Big Data architecture itself are the most addressed by researchers <ref type="bibr" target="#b23">[25]</ref>. These problems can be tackled by using general mechanisms like user authorization and authentication, fraud detection, risk control, auditing, encryption, network access control, intrusion detection, or guaranteeing the quality and security of the data when they come from different data sources <ref type="bibr" target="#b2">[3,</ref><ref type="bibr" target="#b15">17,</ref><ref type="bibr" target="#b18">20,</ref><ref type="bibr" target="#b23">25,</ref><ref type="bibr" target="#b28">32]</ref>.
These are general security mechanisms, but they must be adapted to specific types of systems, based on the possible threats.</p><p>As shown in Figure <ref type="figure" target="#fig_1">2</ref>, these security requirements can be satisfied by means of different security solutions that follow the security policies of the company and whose main objective is to address threats in order to control vulnerabilities. An example of a security policy in a company is the obligation to use secure communications; this policy can give rise to a security requirement in the Big Data environment specifying that the data transfer between components must be secure. One way to approach this requirement is by using authentication methods; the implementation of this security solution can be supported by the "Role-based access control" security pattern. These security solutions should be specifically implemented in the BDAP and BDFP components. However, these solutions are not easy to implement; thus, our model uses security patterns as guidance. A security pattern is a solution to a recurrent problem that indicates how to defend against a threat, or a set of threats, in a concise and reusable way <ref type="bibr" target="#b10">[12]</ref>. Patterns are abstract solutions that must be tailored to where they are applied. Furthermore, we can use misuse patterns <ref type="bibr" target="#b12">[14]</ref> as a way to understand each attack and guide the application of the different security patterns that can be used to stop a threat. Moreover, security metadata can be defined as a way to facilitate the coordination and realization of security requirements. Another topic covered by our architecture is the context of the asset; for example, the security considerations for a medical record are totally different from those for a log file. It is important to evaluate the required security level for each asset.</p></div>
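<div xmlns="http://www.tei-c.org/ns/1.0"><p>The chain from policy to requirement to solution to pattern described in this subsection can be sketched as a small data model. The following Python sketch is only an illustration under our own assumptions: the class names, the attribute names, and the helper patterns_for are hypothetical and are not taken from the SRA diagrams; only the example policy, requirement, and pattern name come from the text.</p>

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SecurityPattern:
    name: str


@dataclass
class SecuritySolution:
    description: str
    patterns: list[SecurityPattern] = field(default_factory=list)


@dataclass
class SecurityRequirement:
    statement: str
    solutions: list[SecuritySolution] = field(default_factory=list)


@dataclass
class SecurityPolicy:
    statement: str
    requirements: list[SecurityRequirement] = field(default_factory=list)


def patterns_for(policy: SecurityPolicy) -> list[str]:
    """Collect the names of all patterns reachable from a policy."""
    return [
        pattern.name
        for requirement in policy.requirements
        for solution in requirement.solutions
        for pattern in solution.patterns
    ]


# The example from the text: a secure-communications policy leads to a
# requirement on data transfer, addressed by an authentication-based
# solution that is guided by the Role-based access control pattern.
rbac = SecurityPattern("Role-based access control")
solution = SecuritySolution("Authenticate components before data transfer", [rbac])
requirement = SecurityRequirement(
    "Data transfer between components must be secure", [solution]
)
policy = SecurityPolicy("All communications must be secure", [requirement])
```

<p>With a structure of this kind, a Security Administrator could trace which patterns support a given policy; in this sketch, patterns_for(policy) collects the single pattern attached to the example policy.</p></div>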
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Data Provider (DP)</head><p>The DP component creates an abstraction of the data sources considering their security metadata, if they exist. These metadata allow the DP to identify the types of access and analysis allowed by the data source and its security requirements. As we explained in Section 2, the DP has a set of interfaces. Those interfaces must consider the constraints of each data source and also the different security policies and requirements specified by the SO. In this component, there may be conflicts between the security requirements of the data source and those of the Big Data system itself. These clashes must be addressed in a way that satisfies both sides. The security and privacy issues of this component are mostly related to how to properly identify and validate the end-point inputs. The DP interfaces must evaluate the provenance of the data source. Knowing how to validate that a data source is not malicious, and how to filter out those that are, is a critical challenge in the data collection process <ref type="bibr" target="#b6">[7]</ref>.</p><p>In our SRA, the interfaces are connected with the Collector service of the BDAP, which will be described in the next subsection. Figure <ref type="figure" target="#fig_2">3</ref> represents the DP component with its interfaces. In general, the elements that compose a data source include: the data itself, which can be structured, semi-structured, or unstructured; the security requirements of the data source; and the security metadata of the data source. Those elements are not represented in the diagram because we consider the data source to be an agent external to the Big Data system. Still, it is important to know them in order to apply their constraints.</p></div>
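<div xmlns="http://www.tei-c.org/ns/1.0"><p>A DP interface that evaluates a data source before accepting it might look as follows. This is a minimal sketch, assuming that provenance can be reduced to a list of trusted origins and that the security metadata of a source lists the uses it permits; both are strong simplifications of the validation problem discussed above. The names DataSource, DataProviderInterface, trusted_origins, and required_use are hypothetical.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    name: str
    origin: str                            # where the data claims to come from
    allowed_uses: frozenset = frozenset()  # taken from its security metadata


class DataProviderInterface:
    """Hypothetical DP interface applying SO policy and source constraints."""

    def __init__(self, trusted_origins, required_use):
        self.trusted_origins = set(trusted_origins)
        self.required_use = required_use

    def accepts(self, source):
        # Provenance check (assumption: a simple allowlist of origins).
        origin_ok = source.origin in self.trusted_origins
        # Constraint check: the source must permit the intended use.
        use_ok = self.required_use in source.allowed_uses
        return origin_ok and use_ok

    def filter_sources(self, sources):
        """Keep only the sources that pass both checks, in input order."""
        return [source for source in sources if self.accepts(source)]
```

<p>Only the sources that pass both checks would be handed over to the Collector service of the BDAP; a source with an unknown origin, or one whose metadata does not permit the intended analysis, is filtered out.</p></div>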
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Big Data Application Provider (BDAP)</head><p>The BDAP component has the objective of meeting the requirements established by the SO, including its security and privacy requirements. To achieve that goal, the BDAP is composed of different services or activities that can be considered as the SaaS (Software as a Service) layer of the Big Data ecosystem; in our case, we assume that, in general, Big Data is implemented on a Cloud platform, which will affect how the SRA is defined in the BDFP component. Figure <ref type="figure">4</ref> shows the different services that constitute this component, and also the BDAP Security Solution that must map the SO security solutions to these stages; for example, authorization may control here who can apply which operations to perform data analysis.</p><p>As represented in the diagram, not all the activities can communicate with each other; there is a sequential order of execution. In addition, some of these activities are not mandatory in a Big Data ecosystem. The preparation step has the purpose of validating, cleaning, and storing the data, but in a real-time scenario where the data should be analysed as soon as it enters the system, this activity might be skipped.
Something similar happens with the visualization step: if the data consumer is not a human end user but another system, like a data warehouse or even another Big Data ecosystem, this activity may not be necessary.</p><p>Nevertheless, the other three activities are basic in a Big Data ecosystem: the collection activity acts like an ETL (Extract, Transform, and Load) process and combines sets of data of similar structure with the objective of unifying them; the analysis step includes a set of techniques to obtain valuable knowledge from data, for example, MapReduce algorithms; and finally, the access activity has the purpose of communicating with the DC, acting like an interface between the DC and the visualization and analytics activities.</p></div>
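<div xmlns="http://www.tei-c.org/ns/1.0"><p>The sequential order of the BDAP activities, and the conditions under which preparation and visualization may be skipped, can be summarized in a few lines of Python. This is a sketch under our own assumptions: the function name select_activities and its two boolean parameters are hypothetical, and the rule that preparation is always skipped in real-time scenarios is a simplification of "might be skipped".</p>

```python
def select_activities(real_time: bool, human_consumer: bool) -> list[str]:
    """Return the BDAP activities to run, in their sequential order."""
    activities = ["collection"]           # always required
    if not real_time:
        activities.append("preparation")  # validate, clean, and store
    activities.append("analytics")        # always required
    if human_consumer:
        activities.append("visualization")
    activities.append("access")           # always required, talks to the DC
    return activities
```

<p>For instance, a real-time deployment whose consumer is another system would run only collection, analytics, and access, whereas a batch deployment serving human end users would run all five activities.</p></div>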
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4">Big Data Framework Provider (BDFP)</head><p>In general, the BDFP component is composed of a set of clusters which, in turn, are composed of nodes. Those nodes can be deployed by means of Virtual Machines or Containers, which interact with the hardware itself and the OS.</p><p>The BDFP component in NIST is very abstract and lacks detail about the subcomponents needed to perform its processes. Therefore, our proposal places more emphasis on the different elements and how they are connected. Figure <ref type="figure">5</ref> depicts the different subcomponents of the BDFP. Our SRA highlights the idea of a Big Data ecosystem with the possibility of implementing the system with a Cloud environment and visualization techniques.</p><p>With regard to security and privacy issues, the activities in this component should be focused on the encryption and key management of the data, the isolation and containerization of process execution, authorization, authentication, audit logging, and how to secure the storage and the network. Those security issues should be addressed by means of the security solutions defined in the SO, which can be implemented at this level as BDFP security solutions. The SO security solutions are now mapped to data protection, including the application of cryptography and specialized authorization mechanisms <ref type="bibr" target="#b7">[8,</ref><ref type="bibr" target="#b33">37]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.5">Data Consumer (DC)</head><p>The DC component is, similarly to the DP, composed of a set of interfaces. The interaction could include interactive visualization, report creation, or data drilling using business intelligence techniques. It is important to highlight that these interfaces must address the authorization and authentication functions, so that the DC can match the metadata related to the end user against the security requirements and policies of the information.</p><p>Finally, Figure <ref type="figure" target="#fig_4">6</ref> summarizes our complete SRA for Big Data. In this figure, the relationships between the different components of the architecture can be seen in perspective. This figure is important to better understand the example presented in the following subsection.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.6">Examples of Application of Security Patterns</head><p>As a way to show the usefulness of our SRA, we explain an example of how to employ security patterns using our architecture. We created the example by identifying some of the threats that can affect a Big Data system. A systematic method for the enumeration of threats is shown in <ref type="bibr" target="#b10">[12]</ref>. Those threats can be addressed by means of security patterns, which, in some cases, should be modified from general security patterns to meet the inherent features of Big Data. The modification of these patterns, and the creation of new ones if needed, is beyond the scope of this paper and is considered future work. Table <ref type="table">I</ref> summarizes some of the threats of each activity and the general patterns that can be applied to address them. Those patterns are defined in <ref type="bibr" target="#b10">[12]</ref>. As a way to better understand how to integrate the different components of our SRA and the security patterns, we will describe how the threat TC1 can be addressed by using security patterns.</p><p>We will use an object diagram to explain it; this diagram is shown in Figure <ref type="figure" target="#fig_5">7</ref>. In this scenario, the stored data are the main asset to protect. This asset has a vulnerability: it has no protection, and this vulnerability could be exploited by a threat like TC1. In order to prevent that situation, it is necessary to implement a security solution. To facilitate the implementation of the solution, two security patterns can be used: Role-based access control and Authentication. However, this security solution will still have a high abstraction level because it is defined in the SO component.
Hence, a low-level implementation of the security solution should be approached at the BDAP level. In this case, TC1 can affect the different services provided by the BDAP, which is why the security solution should be implemented there and not in another component. Furthermore, we will describe how to create an instance of the two security patterns used to secure the Collector subcomponent (the Authentication and Role-based access control security patterns) by creating a partial example. In this example, we will focus on a Big Data system whose objective is to process tweets from the Twitter platform to analyse the general sentiment about a product. Figure <ref type="figure" target="#fig_6">8</ref> shows the object diagram for this example. The main component is what we want to protect; in this case, the tweets that have been obtained to be analysed.</p><p>The Authentication pattern allows us to verify the identity of the user by using a proof of identity and an authenticator. On the other hand, as its name indicates, one of the most important steps in implementing Role-based access control is to define the different roles. In this case, we have defined four roles: the administrator of the Big Data system, the data scientist, the end user, and the data owner. As we explained before, this example is focused on the Collector phase, so the defined rights of the roles must consider this situation; for example, in this phase the end user should not have any rights over the data. Hence, Figure <ref type="figure" target="#fig_6">8</ref> shows the different functions that the users can perform over the data according to their rights.</p></div>
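<div xmlns="http://www.tei-c.org/ns/1.0"><p>The combination of the Authentication and Role-based access control patterns for the Collector phase can be illustrated with a short Python sketch. It is not the implementation behind Figure 8: the class Authenticator, the function can_perform, the operation names, and the rights assigned to the administrator, data scientist, and data owner are all assumptions made for illustration. Only the four roles and the absence of rights for the end user come from the text. The password digest is likewise a placeholder for a real proof of identity.</p>

```python
import hashlib
import hmac

# Hypothetical rights per role during the Collector phase. Only the empty
# set for the end user comes from the text; the rest are assumptions.
COLLECTOR_RIGHTS = {
    "administrator": {"configure", "read"},
    "data_scientist": {"read"},
    "data_owner": {"read", "delete"},
    "end_user": set(),
}


class Authenticator:
    """Checks a proof of identity (here, a password) for a user."""

    def __init__(self):
        self._digests = {}

    def register(self, user, password):
        self._digests[user] = self._digest(user, password)

    def authenticate(self, user, password):
        expected = self._digests.get(user)
        if expected is None:
            return False
        # Constant-time comparison of the stored and presented digests.
        return hmac.compare_digest(expected, self._digest(user, password))

    @staticmethod
    def _digest(user, password):
        # Illustration only: a real system would use a salted, slow KDF.
        return hashlib.sha256(f"{user}:{password}".encode()).hexdigest()


def can_perform(authenticator, roles, user, password, operation):
    """Authenticate first, then check the rights of the user's role."""
    if not authenticator.authenticate(user, password):
        return False
    role = roles.get(user)
    return operation in COLLECTOR_RIGHTS.get(role, set())
```

<p>In this sketch, a request to read the collected tweets is granted only if the user presents a valid proof of identity and holds a role with the corresponding right; an authenticated end user is still denied, as required for the Collector phase.</p></div>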
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">COMPARISON WITH OTHER PROPOSALS</head><p>There are not many reference architectures for Big Data systems; if we focus on security as the main goal of the architecture, there are even fewer. However, different authors and organizations have proposed reference architectures for Big Data. In this section, we describe some of the most relevant proposals.</p><p>Demchenko et al. <ref type="bibr" target="#b9">[11]</ref> propose a Big Data Framework Architecture that establishes the data lifecycle in a Big Data ecosystem. As in the NIST approach, they use a block representation, but with more detail in the relationships between the different components of the architecture. However, they address security in a very sketchy way and as an isolated feature, not really connected to the other components. In <ref type="bibr" target="#b24">[28]</ref> the authors propose a complete architecture in terms of the relations between the different components; however, we found a lack of consideration given to security and privacy aspects. <ref type="bibr">Klein et al.</ref> propose in <ref type="bibr" target="#b16">[18]</ref> a specific reference architecture for Big Data in the national security domain. Their architecture is very similar to the one proposed by NIST. Our goal is to obtain a better abstraction of the architecture, but still it is interesting how they address some concerns by using solution patterns. They highlight the importance of having a specific domain for the requirements.</p><p>Sqrrl <ref type="bibr" target="#b30">[34]</ref> and BlueTalon <ref type="bibr" target="#b3">[4]</ref> propose a Big Data model focused on data-centric security. Their purpose is to embed security information within the data itself. In the case of Sqrrl, the emphasis is on access control for each field of data; to achieve this, they use a layered architecture built around the value or sensitivity of the data.
On the other hand, BlueTalon includes in its proposal the concept of data lakes: storage repositories that hold a huge amount of raw data until it is needed. There are other proposals made by the main IT companies, such as Oracle <ref type="bibr" target="#b4">[5]</ref>, NTT Data <ref type="bibr" target="#b8">[10]</ref>, IBM [9], Microsoft <ref type="bibr" target="#b22">[24]</ref>, or SAP <ref type="bibr" target="#b27">[31]</ref>. Table II summarizes these RAs and compares them with our SRA proposal. The criteria were selected based on a previous systematic mapping study that we carried out on Big Data security concerns <ref type="bibr" target="#b23">[25]</ref>. As a side effect of this work, we detected some characteristics that are usually not considered in the different proposals and could be important to define an SRA.</p><p>Unlike the other proposals, our SRA treats the requirements as the main factor to consider in order to properly implement a Big Data ecosystem, more specifically the security requirements that should be approached in this phase. Moreover, we have found in some proposals a lack of connection between the different components of the architecture; our SRA clearly specifies those relationships. Finally, our proposal has a medium abstraction level, because we do not consider specific technology solutions or applications.</p><p>Although there are some SRAs for Cloud environments and some of their contributions could be useful for a Big Data environment, there are still differences that are significant enough to justify creating an SRA for Big Data. For example, there are cases where the Big Data environment is supported by a Cloud infrastructure; in those cases, the Big Data RAs must consider that possibility. In general, Cloud RAs are focused on the infrastructure, while a Big Data RA must also contemplate the services associated with data analysis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">CONCLUSION AND FUTURE WORK</head><p>A more precise Reference Architecture (RA) provides a better framework to guide the use of security mechanisms and thus achieve a high level of security. Our Security Reference Architecture (SRA) subsumes the published RAs, including the proposals made by NIST, Oracle, NTT, and different researchers.</p><p>We have created an SRA described by means of UML diagrams that aim to facilitate the implementation of secure Big Data. We decided to use UML diagrams because we found a lack of proposals in which the relationships between the different components and subcomponents are precisely defined. In addition, this kind of diagram makes it possible to apply different security patterns, which are usually described as UML models. Security patterns address recurrent security problems, and we have defined some of the security patterns that can be implemented to protect the system against threats. Our SRA emphasizes the idea of a Big Data ecosystem by implementing the system using a Cloud Computing environment.</p><p>We have also listed some of the threats that can be found in a Big Data ecosystem; however, a deeper understanding of the different threats that can affect these systems is needed. We will address this problem by creating different use cases and scenarios to identify those threats, as in the method of <ref type="bibr" target="#b12">[14]</ref>.</p><p>Once the threats have been identified, we will find, adapt or create security patterns that can solve those problems. We consider these topics to be the next steps towards completing our SRA. Furthermore, it is important to perform an analysis of the different stakeholders that interact with the Big Data use cases.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: NIST proposal for a Big Data architecture [26]</figDesc><graphic coords="2,317.54,83.69,216.00,179.99" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: System Orchestrator (SO) diagram</figDesc><graphic coords="4,81.64,83.68,432.00,288.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Data Provider (DP) diagram</figDesc><graphic coords="4,97.74,409.54,143.99,129.60" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4 :Figure 5 :</head><label>45</label><figDesc>Figure 4: Big Data Application Provider (BDAP) diagram</figDesc><graphic coords="5,81.64,284.16,431.99,288.01" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 6 :</head><label>6</label><figDesc>Figure 6: Big Data SRA complete diagram</figDesc><graphic coords="6,74.45,83.68,446.38,590.40" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 7 :</head><label>7</label><figDesc>Figure 7: Using security patterns to address a specific threat</figDesc><graphic coords="7,61.75,390.12,215.99,180.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Figure 8 :</head><label>8</label><figDesc>Figure 8: Application of Authentication and Role-based access control patterns</figDesc><graphic coords="8,81.64,83.69,431.99,230.39" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Identified threats and security patterns for the different activities</figDesc><table><row><cell>ID</cell><cell>Activity</cell><cell>Threat</cell><cell>Security Pattern</cell></row><row><cell>TC1</cell><cell>Common to all the activities</cell><cell>Data modified</cell><cell>Authentication, Role-based access control</cell></row><row><cell>TC2</cell><cell>Common to all the activities</cell><cell>Data destroyed</cell><cell>Authentication, Role-based access control</cell></row><row><cell>TC3</cell><cell>Common to all the activities</cell><cell>Data illegally read</cell><cell>Encryption, Role-based access control, Authentication</cell></row><row><cell>TC4</cell><cell>Common to all the activities</cell><cell>Unapproved change in activity function</cell><cell>Logger and Auditor, Controlled access session, Role-based access control, Authentication</cell></row><row><cell>TCo1</cell><cell>Collection</cell><cell>Malicious data source</cell><cell>Authentication</cell></row><row><cell>TP1</cell><cell>Preparation</cell><cell>Malicious filter</cell><cell>Logger and Auditor, Controlled access session, Role-based access control, Authentication</cell></row><row><cell>TA1</cell><cell>Analysis</cell><cell>Infer PII* from anonymized data</cell><cell>Encryption, Logger and Auditor, Multilevel security, Role-based access control, Authentication</cell></row><row><cell>TA2</cell><cell>Analysis</cell><cell>Malicious analysis algorithms</cell><cell>Logger and Auditor, Controlled access session, Role-based access control, Authentication</cell></row><row><cell>TV1</cell><cell>Visualization</cell><cell>PII* exposed due to high graphic granularity</cell><cell>Multilevel security, Authentication, Role-based access control</cell></row><row><cell>TAc1</cell><cell>Access</cell><cell>Several malicious access</cell><cell>Authentication, Role-based access control</cell></row></table><note>*PII - Personal Identifiable Information</note></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 :</head><label>2</label><figDesc>Comparison between RAs</figDesc><table><row><cell>RA Proposal</cell><cell>Requirements concern</cell><cell>Security concern</cell><cell>Connection between components</cell><cell>Abstraction level</cell></row><row><cell>NIST</cell><cell>Medium</cell><cell>High</cell><cell>Low</cell><cell>High</cell></row><row><cell>Demchenko</cell><cell>Medium</cell><cell>Low</cell><cell>Medium</cell><cell>Medium</cell></row><row><cell>Klein</cell><cell>Low</cell><cell>Medium</cell><cell>Medium</cell><cell>Low</cell></row><row><cell>Pääkkönen and Pakkala</cell><cell>Medium</cell><cell>Low</cell><cell>High</cell><cell>Medium</cell></row><row><cell>SRA Proposal</cell><cell>High</cell><cell>High</cell><cell>High</cell><cell>Medium</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>ACKNOWLEDGMENTS</head><p>This work was funded by the SEQUOIA project (Ministerio de Economía y Competitividad and the Fondo Europeo de Desarrollo Regional FEDER, TIN2015-63502-C3-1-R).</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Research on Big Data -A systematic mapping study</title>
		<author>
			<persName><forename type="first">Jacky</forename><surname>Akoka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Isabelle</forename><surname>Comyn-Wattiau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nabil</forename><surname>Laoufi</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.csi.2017.01.004</idno>
		<ptr target="https://doi.org/10.1016/j.csi.2017.01.004" />
	</analytic>
	<monogr>
		<title level="j">SI: New modeling in Big Data</title>
		<imprint>
			<biblScope unit="volume">54</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="105" to="115" />
			<date type="published" when="2017-11">Nov. 2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Describing, Instantiating and Evaluating a Reference Architecture: A Case Study</title>
		<author>
			<persName><forename type="first">Paris</forename><surname>Avgeriou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Default journal</title>
		<imprint>
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Big Data -Security and Privacy</title>
		<author>
			<persName><forename type="first">E</forename><surname>Bertino</surname></persName>
		</author>
		<idno type="DOI">10.1109/BigDataCongress.2015.126</idno>
		<ptr target="https://doi.org/10.1109/BigDataCongress.2015.126" />
	</analytic>
	<monogr>
		<title level="m">IEEE International Congress on Big Data</title>
				<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="757" to="761" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><surname>Bluetalon</surname></persName>
		</author>
		<ptr target="http://bluetalon.com/data-centric_security/" />
		<title level="m">BlueTalon Data-Centric Security Platform: Bringing Order to Data Security Chaos</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">Information Management And Big Data A Reference Architecture</title>
		<author>
			<persName><forename type="first">Doug</forename><surname>Cackett</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2013-02">February 2013</date>
			<publisher>Oracle</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Big data: A survey</title>
		<author>
			<persName><forename type="first">Min</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Shiwen</forename><surname>Mao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yunhao</forename><surname>Liu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Mobile Networks and Applications</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="171" to="209" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<ptr target="https://downloads.cloudsecurityalliance.org/initiatives/bdwg/Expanded_Top_Ten_Big_Data_Security_and_Privacy_Challenges.pdf" />
		<title level="m">Expanded Top Ten Big Data Security and Privacy</title>
				<imprint>
			<publisher>CSA</publisher>
			<date type="published" when="2013-04">April 2013</date>
		</imprint>
		<respStmt>
			<orgName>Big Data Working Group Cloud Security Alliance</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Towards a trusted HDFS storage platform: Mitigating threats to Hadoop infrastructures using hardware-accelerated encryption with TPM-rooted key protection</title>
		<author>
			<persName><forename type="first">Jason</forename><forename type="middle">C</forename><surname>Cohen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Subrata</forename><surname>Acharya</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.jisa.2014.03.003</idno>
		<ptr target="https://doi.org/10.1016/j.jisa.2014.03.003" />
	</analytic>
	<monogr>
		<title level="j">Journal of Information Security and Applications</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="224" to="244" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<ptr target="http://www.nttdata.com/global/en/shared/pdf/bigdata_reference_architecture.pdf" />
		<title level="m">NTT DATA BigData Reference Architecture</title>
				<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
		<respStmt>
			<orgName>NTT DATA</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Defining architecture components of the Big Data Ecosystem</title>
		<author>
			<persName><forename type="first">Yuri</forename><surname>Demchenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Cees</forename><surname>De Laat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Peter</forename><surname>Membrey</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Collaboration Technologies and Systems (CTS), 2014 International Conference on. IEEE</title>
				<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="104" to="112" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">Security patterns in practice: designing secure architectures using software patterns</title>
		<author>
			<persName><forename type="first">Eduardo</forename><forename type="middle">B</forename><surname>Fernandez</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2013">2013</date>
			<publisher>John Wiley &amp; Sons</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Building a security reference architecture for cloud systems</title>
		<author>
			<persName><forename type="first">Eduardo</forename><forename type="middle">B</forename><surname>Fernandez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Raul</forename><surname>Monge</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Keiko</forename><surname>Hashizume</surname></persName>
		</author>
		<idno type="DOI">10.1007/s00766-014-0218-7</idno>
		<ptr target="https://doi.org/10.1007/s00766-014-0218-7" />
	</analytic>
	<monogr>
		<title level="j">Requirements Engineering</title>
		<imprint>
			<biblScope unit="volume">21</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="225" to="249" />
			<date type="published" when="2016-06">June 2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Modeling misuse patterns</title>
		<author>
			<persName><forename type="first">Eduardo</forename><forename type="middle">B</forename><surname>Fernandez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nobukazu</forename><surname>Yoshioka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hironori</forename><surname>Washizaki</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Availability, Reliability and Security, 2009. ARES&apos;09. International Conference on. IEEE</title>
				<imprint>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="566" to="571" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Modeling and Security in Cloud Ecosystems</title>
		<author>
			<persName><forename type="first">Eduardo</forename><forename type="middle">B</forename><surname>Fernandez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nobukazu</forename><surname>Yoshioka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hironori</forename><surname>Washizaki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Madiha</forename><forename type="middle">H</forename><surname>Syed</surname></persName>
		</author>
		<idno type="DOI">10.3390/fi8020013</idno>
		<ptr target="https://doi.org/10.3390/fi8020013" />
	</analytic>
	<monogr>
		<title level="j">Future Internet</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page">13</biblScope>
			<date type="published" when="2016-04">April 2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<ptr target="https://www.iso.org/standard/71277.html?browse=tc" />
		<title level="m">ISO/IEC CD 20547-3 - Information technology - Big data reference architecture - Part 3: Reference architecture</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
		<respStmt>
			<orgName>ISO/IEC</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Challenges to big data security and privacy</title>
		<author>
			<persName><forename type="first">M</forename><surname>Kaushik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Jain</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Computer Science and Information Technologies (IJCSIT)</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="3042" to="3043" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">A reference architecture for big data systems in the national security domain</title>
		<author>
			<persName><forename type="first">John</forename><surname>Klein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ross</forename><surname>Buglak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><surname>Blockow</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Troy</forename><surname>Wuttke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Brenton</forename><surname>Cooper</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2nd International Workshop on BIG Data Software Engineering</title>
				<meeting>the 2nd International Workshop on BIG Data Software Engineering<address><addrLine>Austin, Texas</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="51" to="57" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Designing IoT architecture (s): A European perspective</title>
		<author>
			<persName><forename type="first">Srdjan</forename><surname>Krco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Boris</forename><surname>Pokric</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Francois</forename><surname>Carrez</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Internet of Things (WF-IoT), 2014 IEEE World Forum on. IEEE</title>
				<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="79" to="84" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">The big data security challenge</title>
		<author>
			<persName><forename type="first">Guillermo</forename><surname>Lafuente</surname></persName>
		</author>
		<idno type="DOI">10.1016/S1353-4858(15)70009-7</idno>
		<ptr target="https://doi.org/10.1016/S1353-4858(15)70009-7" />
	</analytic>
	<monogr>
		<title level="j">Network Security</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="12" to="14" />
			<date type="published" when="2015-01">Jan. 2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">NIST cloud computing reference architecture</title>
		<author>
			<persName><forename type="first">Fang</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jin</forename><surname>Tong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jian</forename><surname>Mao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Robert</forename><surname>Bohn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">John</forename><surname>Messina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lee</forename><surname>Badger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dawn</forename><surname>Leaf</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NIST Special Publication 500-292</title>
		<imprint>
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<author>
			<persName><forename type="first">V</forename><surname>Mayer-Schönberger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Cukier</surname></persName>
		</author>
		<ptr target="https://books.google.es/books?id=uy4lh-WEhhIC" />
		<title level="m">Big Data: A Revolution that Will Transform how We Live, Work, and Think</title>
				<imprint>
			<publisher>Houghton Mifflin Harcourt</publisher>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Software architecture: foundations, theory, and practice</title>
		<author>
			<persName><forename type="first">Nenad</forename><surname>Medvidovic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Richard</forename><forename type="middle">N</forename><surname>Taylor</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering-Volume 2</title>
				<meeting>the 32nd ACM/IEEE International Conference on Software Engineering-Volume 2</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="471" to="472" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<ptr target="http://download.microsoft.com/download/f/a/1/fa126d6d-841b-4565-bb26-d2add4a28f24/microsoft_big_data_solution_brief.pdf" />
		<title level="m">Microsoft Big Data Solution Brief</title>
				<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
		<respStmt>
			<orgName>Microsoft</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Main Issues in Big Data Security</title>
		<author>
			<persName><forename type="first">Julio</forename><surname>Moreno</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Manuel</forename><forename type="middle">A</forename><surname>Serrano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Eduardo</forename><surname>Fernández-Medina</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Future Internet</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page">44</biblScope>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Reference architecture and classification of technologies, products and services for big data systems</title>
		<author>
			<persName><forename type="first">Pekka</forename><surname>Pääkkönen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Daniel</forename><surname>Pakkala</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Big Data Research</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="166" to="186" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<monogr>
		<title level="m" type="main">The unified modeling language reference manual</title>
		<author>
			<persName><forename type="first">James</forename><surname>Rumbaugh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ivar</forename><surname>Jacobson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Grady</forename><surname>Booch</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2004">2004</date>
			<publisher>Pearson Higher Education</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Big data: A review</title>
		<author>
			<persName><forename type="first">S</forename><surname>Sagiroglu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Sinanc</surname></persName>
		</author>
		<idno type="DOI">10.1109/CTS.2013.6567202</idno>
		<ptr target="https://doi.org/10.1109/CTS.2013.6567202" />
	</analytic>
	<monogr>
		<title level="m">Collaboration Technologies and Systems (CTS), 2013 International Conference on</title>
				<imprint>
			<date type="published" when="2013-05">May 2013</date>
			<biblScope unit="page" from="42" to="47" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<monogr>
		<title level="m" type="main">CIO Guide to Using the SAP HANA® Platform for Big Data</title>
		<author>
			<persName><surname>Sap</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2016-02">Feb. 2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Big Data and Hadoop-A study in security perspective</title>
		<author>
			<persName><forename type="first">B</forename><surname>Saraladevi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Pazhaniraja</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">Victer</forename><surname>Paul</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">S</forename><surname>Saleem Basha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Dhavachelvan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Procedia computer science</title>
		<imprint>
			<biblScope unit="volume">50</biblScope>
			<biblScope unit="page" from="596" to="601" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">Securing big data hadoop: a review of security issues, threats and solution</title>
		<author>
			<persName><forename type="first">Priya</forename><forename type="middle">P</forename><surname>Sharma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Chandrakant</forename><forename type="middle">P</forename><surname>Navdeti</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Int. J. Comput. Sci. Inf. Technol</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<monogr>
		<author>
			<persName><surname>Sqrrl</surname></persName>
		</author>
		<ptr target="http://sqrrl.com/media/Data-Centric-Security-WP-final-.pdf" />
		<title level="m">Big Data and Data Centric Security</title>
				<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">Big data security and privacy</title>
		<author>
			<persName><forename type="first">Bhavani</forename><surname>Thuraisingham</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 5th ACM Conference on Data and Application Security and Privacy</title>
				<meeting>the 5th ACM Conference on Data and Application Security and Privacy</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="279" to="280" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">Special issue on Security, Privacy and Trust in network-based Big Data</title>
		<author>
			<persName><forename type="first">Hua</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Xiaohong</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Georgios</forename><surname>Kambourakis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Sciences: an International Journal</title>
		<imprint>
			<biblScope unit="volume">318</biblScope>
			<biblScope unit="page" from="48" to="50" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">A security framework in G-Hadoop for big data computing across distributed Cloud data centres</title>
		<author>
			<persName><forename type="first">Jiaqi</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lizhe</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jie</forename><surname>Tao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jinjun</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Weiye</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Rajiv</forename><surname>Ranjan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Joanna</forename><surname>Kołodziej</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Achim</forename><surname>Streit</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dimitrios</forename><surname>Georgakopoulos</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.jcss.2014.02.006</idno>
		<ptr target="https://doi.org/10.1016/j.jcss.2014.02.006" />
	</analytic>
	<monogr>
		<title level="j">J. Comput. System Sci</title>
		<imprint>
			<biblScope unit="volume">80</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="994" to="1007" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
	<note>Special Issue on Dependable and Secure Computing</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
