<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Data Modeling for NoSQL Document-Oriented Databases</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Harley Vera</string-name>
          <email>harleyve@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wagner Boaventura</string-name>
          <email>wagnerbf@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maristela Holanda</string-name>
          <email>mholanda@cic.unb.br</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Valeria Guimarães</string-name>
          <email>valeriaguimaraes@hotmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fernanda Hondo</string-name>
          <email>fernandahondo@hotmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Brasília, Brasília</institution>
          ,
          <country country="BR">Brazil</country>
        </aff>
      </contrib-group>
      <fpage>129</fpage>
      <lpage>135</lpage>
      <abstract>
        <p>In database technologies, non-conventional applications are increasingly debated, including NoSQL (Not only SQL) databases, which were initially created in response to the need for better scalability, lower latency and higher flexibility in an era of big data and cloud computing. These non-functional aspects are the main reason for using NoSQL databases. However, there are currently no systematic studies on data modeling for NoSQL databases, especially document-oriented ones. Therefore, this article proposes a NoSQL data modeling standard in the form of ER diagrams, introducing modeling techniques that can be used on document-oriented databases. The purpose of this article is not to structure the data using the proposed model, but to help with the visualization of data. In addition, to validate the proposed model, a case study was implemented using genomic data.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Huge amounts of data are produced daily. Data
generated by smartphones, social networks,
bank transactions and sensor-equipped machines
that are part of the Internet of Things is
growing exponentially. The management of
this data is currently performed in most cases by
relational databases, which provide centralized
control of data, redundancy control and elimination
of inconsistencies
        <xref ref-type="bibr" rid="ref1">(Elmasri and Navathe, 2010)</xref>
        ;
however, some of these features become limiting
factors in certain scenarios. Consequently, these
limiting factors have led to alternative models of
databases in these scenarios. Motivated primarily
by the issue of system scalability, a new
generation of databases, known as NoSQL, is gaining
strength and space in information systems. The
term NoSQL emerged in the late 1990s, from
a database solution that did not provide an SQL
interface. Later, the term came to represent
solutions that promote an alternative to the Relational
Model, becoming an abbreviation for Not Only
SQL.
      </p>
      <p>
        The purpose of NoSQL solutions is therefore
not to replace the Relational Model as a whole,
but only in cases in which there is a need for
scalability and big data. In recent years, a
variety of NoSQL databases has been developed,
mainly by practitioners looking to fit their specific
requirements regarding scalability, performance,
maintenance and feature set. Subsequently, there
have been various approaches to classifying NoSQL
databases, each with different categories and
subcategories, such as key-value stores,
column-oriented, graph and document-oriented databases.
MongoDB
        <xref ref-type="bibr" rid="ref13 ref2">(MongoDB, 2015)</xref>
        , Neo4j
        <xref ref-type="bibr" rid="ref3">(Partner et al., 2013)</xref>
        , Cassandra
        <xref ref-type="bibr" rid="ref4">(D. Borthakur et al., 2011)</xref>
        and HBase
        <xref ref-type="bibr" rid="ref5 ref6">(F. Chang et al., 2008)</xref>
        are examples
of NoSQL databases. Because each NoSQL database
category has heterogeneous characteristics, this article
applies only to NoSQL document-oriented databases.
      </p>
      <p>
        Nonetheless, data modeling still has an
important role to play in NoSQL environments. The data
modeling process
        <xref ref-type="bibr" rid="ref1">(Elmasri and Navathe, 2010)</xref>
        involves the creation of a diagram that represents the
meaning of the data and the relationship between
the data elements. Thus, understanding is a
fundamental aspect of data modeling
        <xref ref-type="bibr" rid="ref5 ref6">(R. F. Lans, 2008)</xref>
        ,
yet few contributions propose a pattern for this kind of
representation in NoSQL databases.
      </p>
      <p>Addressing this issue, this article proposes a
standard for NoSQL data modeling. This proposal
uses NoSQL document-oriented databases,
aiming to introduce modeling techniques that can be
used on databases with document features.</p>
      <p>The remainder of the paper is organized as
follows: Section II presents related works. Section
III explores the concepts of modeling for NoSQL
databases based on documents, introducing the
different types of relationships and associations.
Section IV shows the proposed model in the
context of NoSQL databases based on documents.
Section V presents the case study used to validate the
proposed model. Finally, Section VI presents
the conclusions of the research and future work.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Works</title>
      <p>
        Katsov
        <xref ref-type="bibr" rid="ref7">(H. Scalable, 2015)</xref>
        presents a study of
techniques and patterns for data modeling using
different categories of NoSQL databases.
However, the approach is generic and does not define a
specific modeling engine for each database.
      </p>
      <p>
        Arora and Aggarwal
        <xref ref-type="bibr" rid="ref17 ref3 ref8">(R. Arora and R. Aggarwal, 2013)</xref>
        propose a data modeling approach restricted
to the MongoDB document database, using a
UML class diagram and the JSON format to
represent the documents.
      </p>
      <p>
        Similarly, Banker
        <xref ref-type="bibr" rid="ref9">(K. Banker, 2011)</xref>
        provides
some ideas on data modeling, but they are limited to the
MongoDB database and always refer to the JSON
        <xref ref-type="bibr" rid="ref10">(D. Crockford, 2006)</xref>
        format as a modeling solution.
      </p>
      <p>
        Kaur and Rani
        <xref ref-type="bibr" rid="ref11">(K. Kaur, K. Rani, 2013)</xref>
        present
a work on modeling and querying data in NoSQL
databases, specifically a case study of
document-oriented and graph-based data models.
For the document-oriented case, they propose a
data modeling approach restricted to the MongoDB document
database, describing the data model with a UML
class diagram to represent documents.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Data Modeling for Document-Oriented Database</title>
      <p>
        An important step in database implementation is
data modeling, because it facilitates the
understanding of the project through key features
that can prevent programming and operation
errors. For relational databases, data
modeling uses the Entity-Relationship Model
        <xref ref-type="bibr" rid="ref1">(Elmasri
and Navathe, 2010)</xref>
        . For NoSQL, it depends on
the database category. The focus of this article is
NoSQL document-oriented databases, where the
data format of these documents can be JSON,
BSON, or XML
        <xref ref-type="bibr" rid="ref12">(S. J. Pramod, 2012)</xref>
        .
      </p>
      <p>
        Documents are stored in
collections. Drawing a parallel with relational databases,
the equivalent of a collection is the relation
(table), and the equivalent of a document is the record (tuple).
Documents in the same collection can store completely different sets of
attributes, and can be mapped directly to a file
format that can be easily manipulated by a
programming language. However, it is difficult to abstract
the modeling of documents into the entity
relationship model
        <xref ref-type="bibr" rid="ref5 ref6">(R. F. Lans, 2008)</xref>
        .
      </p>
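      <p>As an illustration only, the following minimal Python sketch mimics a collection whose documents carry different attribute sets and map directly to JSON text. The list standing in for a collection, the field names and the helper attribute_sets are assumptions made for this example; no database driver is involved.</p>
      <preformat>
```python
import json

# Hypothetical collection: a plain list stands in for a database collection,
# and each dict stands in for one JSON document.
activities = [
    {"_id": 1, "name": "filtering", "program": "filter_tool"},
    # Same collection, different attribute set: no "program", extra "notes".
    {"_id": 2, "name": "alignment", "notes": ["paired-end", "human reference"]},
]


def attribute_sets(collection):
    """Return the sorted attribute names of every document in the collection."""
    return [sorted(doc.keys()) for doc in collection]


# Each document is serialized to JSON text and parsed back unchanged.
round_trip = [json.loads(json.dumps(doc)) for doc in activities]

if __name__ == "__main__":
    print(attribute_sets(activities))
    print(round_trip == activities)
```
      </preformat>
      <p>A relational table would force both rows into one fixed set of columns; here each document simply carries the attributes it needs.</p>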
      <sec id="sec-4-1">
        <title>Modeling Paradigm for document-oriented Database</title>
        <p>The relational model designed for SQL has some
important features such as integrity, consistency,
type validation, transactional guarantees, schemas
and referential integrity. However, some
applications do not need all of these features. The
elimination of these features has an important
influence on the performance and scalability of data
storage, bringing new meaning to data modeling.</p>
        <p>
          Document-oriented databases offer some
significant improvements, e.g., index management by
the database itself, flexible layouts and advanced
indexed search engines
          <xref ref-type="bibr" rid="ref7">(H. Scalable, 2015)</xref>
          . By
associating these improvements (among them
denormalization and aggregation) with the basic
principles of data modeling in NoSQL, it is
possible to identify some generic modeling standards
for document-oriented databases.
The documentation of the main
document-oriented databases, MongoDB
          <xref ref-type="bibr" rid="ref13 ref2">(MongoDB, 2015)</xref>
          and CouchDB
          <xref ref-type="bibr" rid="ref14">(CouchDB, 2015)</xref>
          , shows similar
representations of relationships between data:
References and Embedded Documents,
structures that allow one document to be associated
with another while meeting specific
performance needs and data retrieval patterns.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>References Relationship</title>
        <p>
          This type of relationship stores the data by
including links, or references, from one document to
another. These references are stored in the structure of the
document itself, and applications can resolve them to
access the related data
          <xref ref-type="bibr" rid="ref13 ref2">(MongoDB, 2015)</xref>
          . Figure 1 shows
two documents, one for FASTQ files and the
other for Activities.
        </p>
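        <p>A reference relationship between an Activity document and FASTQ file documents could be sketched as follows. This is a plain Python illustration under stated assumptions: the identifiers, the field names used_file_ids and generated_file_ids, and the helper resolve are hypothetical and are not taken from Figure 1.</p>
        <preformat>
```python
# Hypothetical documents; field names such as "used_file_ids" are illustrative.
fastq_files = {
    "f1": {"_id": "f1", "filename": "kidney_raw.fastq"},
    "f2": {"_id": "f2", "filename": "kidney_filtered.fastq"},
}

activity = {
    "_id": "a1",
    "name": "filtering",
    # References: only the identifiers of the related documents are stored.
    "used_file_ids": ["f1"],
    "generated_file_ids": ["f2"],
}


def resolve(doc, field, target_collection):
    """Follow the references stored in doc[field] and return the documents."""
    return [target_collection[ref_id] for ref_id in doc[field]]


if __name__ == "__main__":
    for fastq in resolve(activity, "used_file_ids", fastq_files):
        print(fastq["filename"])
```
        </preformat>
        <p>The application, not the database, performs the second lookup, which is the cost of keeping the two documents separate.</p>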
      </sec>
      <sec id="sec-4-3">
        <title>Embedded Documents</title>
        <p>
          This type of relationship stores related data in a single
document structure, in which the embedded documents
are placed in a field or an array. These
denormalized data models allow data manipulation in
a single database operation
          <xref ref-type="bibr" rid="ref13 ref2">(MongoDB, 2015)</xref>
          .
Unlike traditional relational databases, which
arrange data in rows and
columns, a document-oriented database stores
information in text format as
collections of records organized according to the key-value concept, i.e.,
each value is assigned a name (or label)
that describes its meaning. This storage
model is known as a JSON object; objects
are composed of multiple name/value pairs, whose values
may be arrays or other objects.
        </p>
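        <p>For comparison with references, the sketch below shows one way an embedded relationship might look: the related data lives inside the parent document, either in a field or in an array. The document content and the helper total_used_size are assumptions chosen for illustration.</p>
        <preformat>
```python
import json

# Hypothetical activity document with embedded documents (names are illustrative).
activity = {
    "_id": "a1",
    "name": "alignment",
    # Embedded in a field: a single sub-document.
    "agent": {"name": "analyst", "login": "analyst01"},
    # Embedded in an array: several sub-documents.
    "used": [
        {"filename": "kidney_filtered.fastq", "size": 1024},
        {"filename": "liver_filtered.fastq", "size": 2048},
    ],
}


def total_used_size(doc):
    """Sum the sizes of the embedded file documents without any extra lookup."""
    return sum(item["size"] for item in doc["used"])


if __name__ == "__main__":
    print(json.dumps(activity, indent=2))
    print(total_used_size(activity))
```
        </preformat>
        <p>Because everything needed is in one document, reading the activity and its files takes a single retrieval.</p>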
        <p>
          In this scenario, the number of objects in a
database increases the complexity of abstracting
the logical relationships between the stored
information, especially when objects have references
to other objects. Currently, there is a lack of
solutions to conceptually represent these associations
in a NoSQL document-oriented database. As
described in
          <xref ref-type="bibr" rid="ref17 ref3 ref8">(R. Arora and R. Aggarwal, 2013)</xref>
          ,
there is no standard to represent this kind of object
modeling; several different manners of modeling
may therefore arise, depending on each data administrator’s
understanding, which makes learning difficult for
those who need to read the database model.
        </p>
        <p>Therefore, this section proposes a standard for
document-oriented database viewing. Considering the
conceptual representation modeling type, our
proposal has the following properties:
          <list list-type="bullet">
            <list-item><p>Ensuring a single way of modeling for
the several NoSQL document-oriented
databases.</p></list-item>
            <list-item><p>Simplifying and facilitating the
understanding of a document-oriented database through
its conceptual model, leveraging the
abstraction and making the correct decisions about
the data storage.</p></list-item>
            <list-item><p>Providing an accurate, unambiguous and
concise pattern, so that database
administrators have substantial gains in abstraction
and understanding.</p></list-item>
            <list-item><p>Presenting different types of relationships
between collections, defined as References and
Embedded Documents.</p></list-item>
            <list-item><p>Assisting the recognition and arrangement of
the objects, as well as their features and
relationships with other objects.</p></list-item>
          </list>
        </p>
        <p>The following subsections present the concepts
and graphical notation used to build a conceptual model for
NoSQL document-oriented databases.</p>
      </sec>
      <sec id="sec-4-4">
        <title>Assumptions</title>
        <p>Before discussing the approach
to each type of conceptual modeling
representation, it is important to highlight some
basic concepts about objects and relationships in a
document-oriented database:
          <list list-type="bullet">
            <list-item><p>A document (or object) describes a set of
attributes that have their properties organized
in a key-value structure.</p></list-item>
            <list-item><p>Information contained in a document is
described by the identifier (key) and the value
associated with the key.</p></list-item>
            <list-item><p>Different types of relationships between
documents are defined as References and
Embedded Documents.</p></list-item>
            <list-item><p>Because NoSQL databases are
non-relational, the concepts of normalization do
not apply.</p></list-item>
            <list-item><p>Some concepts of relationships between
objects are similar to ER modeling, such as
cardinality (one-to-one, one-to-many,
many-to-many).</p></list-item>
          </list>
        </p>
        <p>The proposed solution for conceptual modeling
of NoSQL document-oriented databases has
two basic concepts: Documents and Collections.</p>
        <p>As noted previously, a document is usually
represented by the structure of a JSON object, and as
many fields as needed may be added to the
document. In this solution, a document and a
collection of documents are represented as shown in Figure 3.</p>
      </sec>
      <sec id="sec-4-5">
        <title>Embedded Documents 1..N</title>
        <p>
          A one-to-many relationship in embedded
documents is represented as shown in Figure 5. In this
case, the notation used to represent the cardinality
is the same as that used in UML
          <xref ref-type="bibr" rid="ref15">(G. Booch et al., 2005)</xref>
          and is placed in the upper right corner of the
embedded documents. According to the one-to-many
cardinality, the larger document has
multiple documents embedded within it.
        </p>
        <p>The following section presents the definitions of
relationship types and degrees for the features
of objects.</p>
      </sec>
      <sec id="sec-4-6">
        <title>Embedded Documents 1..1</title>
        <p>
          This section proposes a model that represents the
one-to-one relationship for documents embedded
in another document. In this case, the proposal
is to draw an individual
Document inside another element that represents a
Document. In Figure 4, the cardinality is also
shown to specify the one-to-one relationship type.
        </p>
        <p>
          A many-to-many relationship in embedded
documents is represented as shown in Figure 6. According to
the many-to-many cardinality, the larger document
has a many-to-many relationship with the
embedded document. The representation of the
cardinality is the same as that used in UML
          <xref ref-type="bibr" rid="ref15">(G. Booch et al., 2005)</xref>
          .
        </p>
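        <p>To make the cardinality labels concrete, the hypothetical helper below inspects one parent document and labels each embedded field as "1..1" (a single sub-document) or "1..N" (an array of sub-documents). It is a sketch for illustration, not part of the proposed notation, and it cannot detect the many-to-many case, which depends on how several parent documents share embedded content.</p>
        <preformat>
```python
def embedded_cardinalities(doc):
    """Label embedded fields: "1..1" for a sub-document, "1..N" for an array of them."""
    labels = {}
    for key, value in doc.items():
        if isinstance(value, dict):
            labels[key] = "1..1"
        elif isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
            labels[key] = "1..N"
    return labels


# Hypothetical parent document; all names and values are made up.
account = {
    "_id": "c1",
    "name": "run 1",
    "project": {"name": "RNA-seq comparison"},
    "activities": [{"name": "filtering"}, {"name": "alignment"}],
    "tags": ["kidney", "liver"],  # array of plain values, not embedded documents
}

if __name__ == "__main__":
    print(embedded_cardinalities(account))
```
        </preformat>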
      </sec>
      <sec id="sec-4-7">
        <title>References 1..1</title>
        <p>
          A document can reference another; in this
case, an arrow directed to the
referenced document must be used, as shown in Figure 7. The
directed arrow indicates that the left
document references the right document.
Furthermore, the cardinality of the relationship should be
specified above the arrow. The notation of
cardinality is based on UML
          <xref ref-type="bibr" rid="ref15">(G. Booch et al., 2005)</xref>
          .
        </p>
        <p>
          In NoSQL, a document can reference multiple
documents. To represent this relationship, an
arrow directed to the referenced
documents should be used, as shown in Figure 8. The left document
references multiple documents on the right side
through the directed arrow. Furthermore, the
cardinality of the relationship is represented by the
notation "1..N" as in UML
          <xref ref-type="bibr" rid="ref15">(G. Booch et al., 2005)</xref>
          .
        </p>
        <p>
          To represent a many-to-many reference relationship, a bidirectional arrow
is used between the referencing documents, as shown
in Figure 9. The left document references
multiple documents on the right side and the right
document references multiple documents on the left
side. Furthermore, the cardinality of the
relationship is represented by the notation "N..N" as in
UML
          <xref ref-type="bibr" rid="ref15">(G. Booch et al., 2005)</xref>
          .
        </p>
        <p>
          In order to evaluate our proposal, part of the
workflow described in
          <xref ref-type="bibr" rid="ref16">(J. C. Marioni et al., 2014)</xref>
          was
used. This workflow aimed at identifying and
comparing expression levels of human kidney and liver
RNA samples sequenced by Illumina. The
workflow was designed in three phases (Figure 10):
          <list list-type="bullet">
            <list-item><p>Filtering: all the sequenced transcripts were
filtered, generating new files with good
quality sequences.</p></list-item>
            <list-item><p>Alignment: transcripts were mapped to the
human genome used as reference.</p></list-item>
            <list-item><p>Statistical Analysis: a sort process was first
executed, followed by a statistical
analysis of the mapped transcripts to discover
which genes are most expressed in both
kidney and liver samples.</p></list-item>
          </list>
        </p>
        <p>
          After analyzing the previously mentioned
concepts, we chose to create a collection of
documents for each PROV-DM type used to
create a graph node. We also defined a collection
for genomic documents (raw data). The reference
relationship approach was chosen to connect all
PROV-DM components, complementary
information of PROV-DM and genomic documents. Based
on
          <xref ref-type="bibr" rid="ref17 ref8">(R. de Paula et al., 2013)</xref>
          we defined the
documents and the attributes, which form a set of minimum
information related to each one of these entities.
        <p>Figure 11 shows our document-based data
representation, explained as follows:</p>
        <list list-type="bullet">
          <list-item><p>Project: stores the different experiments of one
agent. Attributes: Id, name, description,
coordinator, start date, end date and
observation.</p></list-item>
          <list-item><p>FASTQ: files used or generated in the
activity. Attributes: Id, filename and description.</p></list-item>
          <list-item><p>Activity: represents the execution of a
program. Attributes: Id, name, program, program
version, command line, function, start date,
end date, account ID, used (name FASTQ,
local, size), wasGeneratedBy (name FASTQ,
local, size), and wasAssociatedWith (Agent
name).</p></list-item>
          <list-item><p>Account: represents the performance of an
experiment. Attributes: Id, name,
description, execution place, start date, end date,
observation, version, version date and
project Id.</p></list-item>
          <list-item><p>Agent: represents the person responsible for
a program or a phase in the workflow.
Attributes: Id, name, login and password.</p></list-item>
        </list>
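        <p>The collections listed above, connected through references, might be sketched in Python as shown below. This is only an illustration under assumptions: the values are invented, only a subset of the attributes is shown, the key names (for example project_id and account_id) are hypothetical spellings of the listed attributes, and dangling_references is a made-up helper that checks whether every reference points to an existing document.</p>
        <preformat>
```python
# All identifiers and values below are made up for illustration.
collections = {
    "project": [{"_id": "p1", "name": "RNA-seq comparison"}],
    "agent": [{"_id": "g1", "name": "analyst", "login": "analyst01"}],
    "account": [{"_id": "c1", "name": "run 1", "project_id": "p1"}],
    "fastq": [
        {"_id": "f1", "filename": "kidney_raw.fastq"},
        {"_id": "f2", "filename": "kidney_filtered.fastq"},
    ],
    "activity": [
        {
            "_id": "a1",
            "name": "filtering",
            "account_id": "c1",
            "used": [{"name": "kidney_raw.fastq"}],
            "wasGeneratedBy": [{"name": "kidney_filtered.fastq"}],
            "wasAssociatedWith": "analyst",
        }
    ],
}


def values_of(data, collection_name, field):
    """Collect the values of one field across a collection."""
    return {doc[field] for doc in data[collection_name]}


def dangling_references(data):
    """Return (collection, document id, field) for every unresolved reference."""
    problems = []
    project_ids = values_of(data, "project", "_id")
    account_ids = values_of(data, "account", "_id")
    agent_names = values_of(data, "agent", "name")
    filenames = values_of(data, "fastq", "filename")
    for account in data["account"]:
        if account["project_id"] not in project_ids:
            problems.append(("account", account["_id"], "project_id"))
    for activity in data["activity"]:
        if activity["account_id"] not in account_ids:
            problems.append(("activity", activity["_id"], "account_id"))
        if activity["wasAssociatedWith"] not in agent_names:
            problems.append(("activity", activity["_id"], "wasAssociatedWith"))
        for field in ("used", "wasGeneratedBy"):
            for item in activity[field]:
                if item["name"] not in filenames:
                    problems.append(("activity", activity["_id"], field))
    return problems


if __name__ == "__main__":
    print(dangling_references(collections))
```
        </preformat>
        <p>Since a document database does not enforce referential integrity, a check of this kind has to live in the application when references are used.</p>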
      </sec>
      <sec id="sec-4-8">
        <title>Implementation</title>
        <p>In this case study, we used the
MongoDB NoSQL database to store provenance and
data files. The primary motivation for this choice
was MongoDB’s ability to manipulate large
volumes of data. MongoDB is an open-source
document-oriented database designed to store
large amounts of data across multiple servers.</p>
        <p>
          It uses JSON-style documents with dynamic
schemas. The number of fields, content and size
of the document can differ from one document to
another. In practice, however, the documents in
a collection share a similar structure
          <xref ref-type="bibr" rid="ref13 ref2">(MongoDB, 2015)</xref>
          and can be mapped directly to a file format
that can be easily manipulated by a programming
language.
        </p>
        <p>
          MongoDB documents have a maximum size of
16 MB. This limit helps ensure that
a single document cannot use excessive amounts
of RAM. In order to store files larger than the
maximum size, MongoDB provides the GridFS API
          <xref ref-type="bibr" rid="ref13 ref2">(MongoDB, 2015)</xref>
          . It automatically divides large
data into 256 KB pieces and maintains metadata
for all pieces. GridFS allows for the retrieval of
individual pieces as well as entire documents.
        </p>
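        <p>The chunking arithmetic described above can be sketched as follows. The sketch uses the 16 MB document limit and the 256 KB chunk size stated in the text as constants; it does not call any MongoDB driver, and the function names and the "n" and "data" keys are assumptions made for illustration.</p>
        <preformat>
```python
CHUNK_SIZE = 256 * 1024               # chunk size in bytes, as stated in the text
MAX_DOCUMENT_SIZE = 16 * 1024 * 1024  # 16 MB document limit, as stated in the text


def needs_chunking(file_size):
    """A file larger than the document limit cannot be stored as one document."""
    return file_size > MAX_DOCUMENT_SIZE


def chunk_count(file_size, chunk_size=CHUNK_SIZE):
    """Number of pieces needed for file_size bytes (integer ceiling division)."""
    return -(-file_size // chunk_size)


def split_into_chunks(data, chunk_size=CHUNK_SIZE):
    """Split a bytes object into numbered pieces of at most chunk_size bytes."""
    return [
        {"n": n, "data": data[start:start + chunk_size]}
        for n, start in enumerate(range(0, len(data), chunk_size))
    ]


if __name__ == "__main__":
    one_gib = 1024 * 1024 * 1024
    print(needs_chunking(one_gib), chunk_count(one_gib))
    pieces = split_into_chunks(bytes(600 * 1024))
    print([len(piece["data"]) for piece in pieces])
```
        </preformat>
        <p>Only the last piece may be shorter than the chunk size, so concatenating the pieces in order of n restores the original file.</p>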
        <p>GridFS uses two collections to store the data:
the fs.files collection, containing metadata about
files, and the fs.chunks collection, which stores the
actual 256 KB data chunks. The fs.files collection
contains the name of the FASTQ file. Thus, it
was possible to implement the relationship
between this collection and the MongoDB Activity collection using a
Reference Document. In other words, we implemented
the connection between Level 1 and Level 2
through the File Name attribute that is present in both
fileprovenance.files and the Activity collection.
Figure 12 illustrates this particular implementation.</p>
        <p>In contrast to relational database management
systems, NoSQL databases are designed to be
schemaless and flexible. Therefore, the challenge
of this work was to introduce a data modeling
standard for NoSQL document-oriented databases, in
contrast to the original idea for NoSQL databases.
The objective was to build compact, clear and
intuitive diagrams for conceptual data modeling
for NoSQL databases. While current
studies propose generic techniques and do not define
a specific modeling engine for each NoSQL database,
our idea was to present a graphical model for any
NoSQL document-oriented database. Moreover,
while other studies describe techniques based on
UML class diagrams and the JSON format as a
modeling solution, we present a new approach to
the conceptual data modeling issue for NoSQL
document-oriented databases.</p>
        <p>Future work includes verifying our model for
other NoSQL database classifications, such as
key-value and column-oriented databases.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>R.</given-names>
            <surname>Elmasri</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Navathe</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Fundamentals of Database Systems</article-title>
          . Pearson Addison Wesley.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>MongoDB.</surname>
          </string-name>
          <year>2015</year>
          .
          <article-title>Document database</article-title>
          . [Online] Available: http://www.mongodb.org/ [Retrieved:April,
          <volume>15</volume>
          ].
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>J.</given-names>
            <surname>Partner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vukotic</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Watt</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Neo4j in Action, O'Reilly Media</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>D.</given-names>
            <surname>Borthakur</surname>
          </string-name>
          et al.
          <year>2011</year>
          .
          <article-title>Apache hadoop goes realtime at facebook</article-title>
          ,
          <source>in Proceedings of the 2011 ACM SIGMOD International Conference on Management of data. ACM</source>
          ,
          <year>2011</year>
          , pp.
          <fpage>1071</fpage>
          -
          <lpage>1080</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>F.</given-names>
            <surname>Chang</surname>
          </string-name>
          et al.
          <year>2008</year>
          .
          <article-title>Bigtable: A distributed storage system for structured data</article-title>
          ,
          <source>ACM Transactions on Computer Systems (TOCS)</source>
          , vol.
          <volume>26</volume>
          , no.
          <issue>2</issue>
          ,
          <year>2008</year>
          , p.
          <fpage>4</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>R. F.</given-names>
            <surname>Lans</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Introduction to SQL: mastering the relational database language, Addison-Wesley Professional</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>H.</given-names>
            <surname>Scalable</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Nosql data modeling techniques</article-title>
          . [Online] Available: http://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques/ [Retrieved: April, 15].
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>R.</given-names>
            <surname>Arora</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Aggarwal</surname>
          </string-name>
          ,
          <year>2013</year>
          .
          <article-title>Modeling and querying data in mongodb</article-title>
          ,
          <source>International Journal of Scientific and Engineering Research</source>
          (IJSER
          <year>2013</year>
          ), vol.
          <volume>4</volume>
          , no.
          <issue>7</issue>
          ,
          Jul.
          <year>2013</year>
          , pp.
          <fpage>141</fpage>
          -
          <lpage>144</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>K.</given-names>
            <surname>Banker</surname>
          </string-name>
          ,
          <year>2011</year>
          . MongoDB in action, Manning Publications Co.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>D.</given-names>
            <surname>Crockford</surname>
          </string-name>
          ,
          <year>2006</year>
          .
          <article-title>RFC 4627 (Informational) The application json Media Type for JavaScript Object Notation (JSON), IETF (Internet Engineering Task Force</article-title>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>K.</given-names>
            <surname>Kaur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Rani</surname>
          </string-name>
          ,
          <year>2013</year>
          .
          <article-title>Modeling and querying data in NoSQL databases</article-title>
          , In Big Data, IEEE International Conference on (pp.
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          ). IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Pramod</surname>
          </string-name>
          ,
          <year>2012</year>
          .
          <article-title>Nosql distilled: A brief guide to the emerging world of polyglot persistence,</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>MongoDB</surname>
          </string-name>
          ,
          <year>2015</year>
          .
          <article-title>Data modeling introduction</article-title>
          , [Online] Available: http://docs.mongodb.org/manual/core/data-modeling-introduction/ [Retrieved: April,
          <volume>15</volume>
          ].
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>CouchDB</surname>
          </string-name>
          ,
          <year>2015</year>
          .
          <article-title>Modeling entity relationships in couchdb</article-title>
          , [Online]. Available: http://wiki.apache.org/couchdb/ [retrieved: April, 15]
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>G.</given-names>
            <surname>Booch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rumbaugh</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>I.</given-names>
            <surname>Jacobson</surname>
          </string-name>
          ,
          <year>2005</year>
          .
          <article-title>The unified modeling language user guide</article-title>
          .,
          <source>Pearson Education India.</source>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Marioni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. E.</given-names>
            <surname>Mason</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Mane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stephens</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gilad</surname>
          </string-name>
          ,
          <year>2014</year>
          .
          <article-title>RNA-SEQ: An assessment of technical reproducibility and comparison with gene expression arrays</article-title>
          ,
          <source>Genome Research</source>
          , vol.
          <volume>18</volume>
          , no.
          <issue>9</issue>
          , pp.
          <fpage>1509</fpage>
          -
          <lpage>1517</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>R.</given-names>
            <surname>de Paula</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Holanda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. S. A.</given-names>
            <surname>Gomes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lifschitz</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. E. M. T.</given-names>
            <surname>Walter</surname>
          </string-name>
          ,
          <year>2013</year>
          .
          <article-title>Provenance in bioinformatics workflows</article-title>
          .,
          <source>BMC Bioinformatics 14 (Suppl</source>
          <volume>11</volume>
          ):
          <fpage>S6</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>