<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Decentralized Model Persistence for Distributed Computing</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Abel Gómez</string-name>
          <email>abel.gomez-llana@inria.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amine Benelallam</string-name>
          <email>amine.benelallam@inria.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Massimo Tisi</string-name>
          <email>massimo.tisi@inria.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>AtlanMod team (Inria, Mines Nantes, LINA)</institution>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The necessity of manipulating very large amounts of data and the wide availability of computational resources on the Cloud are boosting the popularity of distributed computing in industry. The applicability of model-driven engineering in such scenarios is hampered today by the lack of an efficient model-persistence framework for distributed computing. In this paper we present NeoEMF/HBase, a persistence backend for the Eclipse Modeling Framework (EMF) built on top of the Apache HBase data store. Model distribution is hidden from client applications, which are transparently provided with the model elements they navigate. Access to remote model elements is decentralized, avoiding the bottleneck of a single access point. The persistence model is based on key-value stores, which allow for efficient on-demand model persistence.</p>
      </abstract>
      <kwd-group>
        <kwd>Model Persistence</kwd>
        <kwd>Key-Value Stores</kwd>
        <kwd>Distributed Persistence</kwd>
        <kwd>Distributed Computing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The availability of large-scale data processing and storage in the Cloud is becoming a
key resource for part of today’s industry, within and outside IT. It offers
companies a tempting alternative for processing and analyzing data, and for discovering new
insights, in a cost-efficient manner. Thanks to existing Cloud computing companies,
this facility is widely available for rent [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. This ready-to-use IT
infrastructure is equipped with a wide range of distributed processing frameworks, for
companies that have to occasionally process large amounts of data.
      </p>
      <p>
        One of the principal ingredients behind the success of distributed
processing is distributed storage systems. They are designed to meet the data
processing requirements of distributed and computationally intensive applications,
i.e., wide applicability, scalability, and high performance. Appearing along with
MapReduce [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], BigTable [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] clearly met these requirements. Among
the most faithful open-source implementations of MapReduce and BigTable
are Apache’s Hadoop [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] and HBase [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], respectively.
      </p>
      <p>Another success factor for widespread distributed processing is the
appearance of high-level languages that simplify distribution through a user-friendly syntax
(mostly SQL-like). They transparently convert high-level queries into a series of
parallelizable jobs that can run in distributed frameworks, such as MapReduce,
thereby making distributed application development convenient.</p>
      <p>
        We believe that Model-Driven Engineering (MDE), especially its
query/transformation languages and engines, would be suitable for developing
distributed applications on top of structured data (models). Unfortunately, MDE
lacks some fundamental building blocks for fully distributed
transformation/query engines. In this paper we address one of those components, i.e.,
a model-persistence framework for distributed computing. Several distributed
model-persistence frameworks exist today [
        <xref ref-type="bibr" rid="ref16 ref3">3,16</xref>
        ]: for the Eclipse Modeling
Framework (EMF) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] two examples are Connected Data Objects (CDO) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], which is
based on object-relational mapping (see footnote 1), and EMF Fragments [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], which maps large
chunks of a model to separate URIs. We argue that these solutions are not
well-suited for distributed computing, as they exhibit one or more of the following shortcomings:
– Model distribution is not transparent, so queries and transformations need
to explicitly take into account that they are running on a part of the model
and not the whole model (e.g., EMF Fragments).
– Even when model elements are stored in different nodes, access to model
elements is centralized, since elements are requested from and provided by a
central server (e.g., CDO over a distributed database). This constitutes a
bottleneck and does not exploit a possible alignment between data distribution
and computation distribution.
– The persistence backend is not optimized for the atomic operations of model
handling APIs. In particular, files (e.g., XMI over HDFS [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]), relational databases,
and graph databases are widely used, whereas we have shown in previous work [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]
that key-value stores are very efficient for typical queries over very large
models. Moreover, key-value stores are more easily distributed than
other formats, such as graphs.
– The backend assumes that the model is split into balanced chunks (e.g., EMF
Fragments). This may not be suited to distributed processing, where the
optimization of computation distribution may require uneven data distribution.
      </p>
      <p>
        In this paper we present NeoEMF/HBase, a persistence backend for EMF
built on top of the Apache HBase data store. NeoEMF/HBase is
transparent w.r.t. model manipulation operations, decentralized, and based on key-value
stores. The tool is open-source and publicly available at the paper’s website (see footnote 2).
This paper is organized as follows: Section 2 presents HBase concepts and
architecture, Section 3 presents the NeoEMF/HBase architecture, data model and
properties; and finally, Section 4 concludes the paper and outlines future work.
      </p>
      <p>
        Footnote 1: CDO servers (usually called repositories) are built on top of different data storage
solutions (ranging from relational databases to document-oriented databases).
However, in practice, only relational databases are commonly used, and indeed, only
DB Store [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], which uses a proprietary Object/Relational mapper, supports all the
features of CDO and is regularly released in the Eclipse Simultaneous Release [
        <xref ref-type="bibr" rid="ref2 ref4 ref5">2,4,5</xref>
        ].
      </p>
      <p>
        Footnote 2: http://www.emn.fr/z-info/atlanmod/index.php/NeoEMF/HBase
      </p>
    </sec>
    <sec id="sec-2">
      <title>Background: Apache HBase</title>
      <p>
        Apache HBase [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] is the Hadoop [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] database, a distributed, scalable,
versioned and non-relational big data store. It can be considered an open-source
implementation of Google’s Bigtable proposal [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <sec id="sec-2-1">
        <title>HBase data model</title>
        <p>In HBase, data is stored in tables, which are sparse, distributed, persistent
multidimensional sorted maps. A map is indexed by a row key, a column key, and a
timestamp. Each value in the map is an uninterpreted array of bytes.</p>
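        <p>The "multidimensional sorted map" view can be illustrated with a minimal in-memory sketch. It does not use the HBase client API: the class SparseSortedTable and its methods are hypothetical names introduced only for this illustration, and the row key and timestamps in the usage note are assumed sample values.</p>
        <preformat><![CDATA[
```java
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical toy model of one table: row key -> column key -> timestamp -> bytes. */
class SparseSortedTable {
    // Rows and columns are kept sorted by key; versions are sorted newest-first.
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, byte[]>>> rows =
            new TreeMap<>();

    /** Stores one version of a cell as an uninterpreted byte array. */
    void put(String row, String column, long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<String, NavigableMap<Long, byte[]>>())
            .computeIfAbsent(column, c -> new TreeMap<Long, byte[]>(Comparator.reverseOrder()))
            .put(timestamp, value.getBytes(StandardCharsets.UTF_8));
    }

    /** Returns the newest version of a cell, or null if the cell is absent (the table is sparse). */
    String getLatest(String row, String column) {
        NavigableMap<String, NavigableMap<Long, byte[]>> columns = rows.get(row);
        if (columns == null) {
            return null;
        }
        NavigableMap<Long, byte[]> versions = columns.get(column);
        if (versions == null || versions.isEmpty()) {
            return null;
        }
        return new String(versions.firstEntry().getValue(), StandardCharsets.UTF_8);
    }

    /** Returns how many timestamped versions a cell holds (0 if the cell is absent). */
    int versionCount(String row, String column) {
        NavigableMap<String, NavigableMap<Long, byte[]>> columns = rows.get(row);
        if (columns == null) {
            return 0;
        }
        NavigableMap<Long, byte[]> versions = columns.get(column);
        return versions == null ? 0 : versions.size();
    }
}
```
]]></preformat>
        <p>For instance, writing three versions of a contents cell at timestamps 3, 5, and 6 and then calling getLatest would return the version written at timestamp 6, while a lookup of a column that was never written returns null rather than an empty placeholder.</p>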
        <p>
          HBase is built on top of the following concepts [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]:
        </p>
        <p>
          Fig. 1: Example of a table in HBase/BigTable (extracted from [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ])
text of any anchors that reference the page. CNN’s home page is referenced by
both the Sports Illustrated and the MY-look home pages, so the row contains
columns named anchor:cnnsi.com and anchor:my.look.ca. Each anchor cell
has one version; the contents column has three versions, at timestamps t3, t5,
and t6.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>HBase architecture</title>
        <p>
          Fig. 2 shows how HBase is combined with other Apache technologies to store
and look up data. Whilst HBase relies on HDFS to store different kinds of files of
configurable size, ZooKeeper [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] is used for coordination. Two kinds of nodes can
be found in an HBase setup: the HMaster and the HRegionServer. The
HMaster is responsible for assigning the regions (HRegions) to each
HRegionServer when HBase starts. Each HRegion stores a set of rows separated
into multiple column families, and each column family is hosted in an HStore. In
HBase, row modifications are tracked by two different kinds of resources, the
HLog and the Stores. The HLog is a store for the write-ahead log (WAL), and is
persisted into the distributed file system. The WAL records all changes to data
in HBase and, in the case of an HRegionServer crash, ensures that the changes to
the data can be replayed. Stores in a region contain an in-memory data store
(MemStore) and persistent data stores (HFiles, which are persisted into the
distributed file system). HFiles are local to each region, and are used for actual data
storage. The ZooKeeper cluster is responsible for providing the client with the
information about both the HRegionServer and the HRegion hosting the row the
client is looking up. This information is cached at the client side, so that
direct communication can be set up for subsequent requests without
querying the HMaster. When an HRegionServer receives a write request, it sends the
request to a specific HRegion. Once the request is processed, data is first
written into the MemStore and, when a certain threshold is met, the MemStore is
flushed into an HFile.
        </p>
        <p>Fig. 2: HBase architecture. [Diagram components: Client; ZooKeeper; HMaster; HRegionServers, each with a DFS Client, an HLog, and HRegions containing Stores (MemStore and StoreFiles/HFiles); Hadoop Distributed File System with a NameNode and DataNodes.]</p>
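        <p>The write path described above (write-ahead log, MemStore, flush to an HFile) can be sketched in memory as follows. This is only an illustration under stated assumptions: RegionStoreSketch is a hypothetical class, the flush threshold counts entries rather than bytes, and appending to the log before updating the MemStore is an assumption of this sketch rather than something stated in the text.</p>
        <preformat><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical toy model of a single store: log, in-memory buffer, and flushed files. */
class RegionStoreSketch {
    private final int flushThreshold;                       // assumed: number of buffered entries
    final List<String> writeAheadLog = new ArrayList<>();   // stands in for the HLog
    final TreeMap<String, String> memStore = new TreeMap<>();
    final List<TreeMap<String, String>> storeFiles = new ArrayList<>(); // stand in for HFiles

    RegionStoreSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void put(String rowKey, String value) {
        writeAheadLog.add(rowKey + "=" + value);   // record the change so it could be replayed
        memStore.put(rowKey, value);               // buffer the change in memory
        if (memStore.size() >= flushThreshold) {   // flush once the threshold is met
            storeFiles.add(new TreeMap<String, String>(memStore));
            memStore.clear();
        }
    }

    /** Looks in the MemStore first, then in flushed files from newest to oldest. */
    String get(String rowKey) {
        if (memStore.containsKey(rowKey)) {
            return memStore.get(rowKey);
        }
        for (int i = storeFiles.size() - 1; i >= 0; i--) {
            String value = storeFiles.get(i).get(rowKey);
            if (value != null) {
                return value;
            }
        }
        return null;
    }
}
```
]]></preformat>
        <p>With a threshold of two entries, for example, the second put empties the MemStore into a new file, and later reads of those rows are served from the flushed file.</p>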
      </sec>
      <sec id="sec-2-3">
        <title>HBase vs. HDFS</title>
        <p>HDFS is the primary distributed storage used by Hadoop applications, as it is
designed to optimize distributed processing of multi-structured data. It is well
suited for distributed storage and distributed processing using commodity
hardware. It is fault tolerant, scalable, and extremely simple to expand. HDFS is
optimized for delivering a high throughput of data, possibly at the
expense of latency, which makes it unsuitable for atomic model
operations. HBase is, on the other hand, a better choice for low-latency access.
Moreover, HDFS resources cannot be written concurrently by multiple writers
without locking, which results in locking delays. Also, writes are always made
at the end of the file. Thus, writing in the middle of a file (e.g., changing the value
of a feature) involves rewriting the whole file, leading to more significant delays.
In contrast, HBase allows fast random reads and writes. HBase is row-level
atomic, i.e., inter-row operations are not atomic, which might lead to dirty reads
depending on the data model used. Additionally, HBase only provides five basic
data operations (namely, Get, Put, Delete, Scan, and Increment), meaning that
complex operations are delegated to the client application (which, in turn, must
implement them as a combination of these simple operations).</p>
        <p>[Figure: diagram labeled NeoEMF/HBase, Model Manager, Persistence Manager, Persistence Backend]</p>
        <p>APIs are consistent between the model-management framework and the
persistence driver, keeping the low-level data structures and the code accessing the
database engine completely decoupled from the high-level code of the modeling
framework. Maintaining these uniform APIs between the different levels allows
additional functionality, such as different cache levels, to be included on top of
the persistence driver by using the decorator pattern.</p>
        <p>NeoEMF/HBase offers lightweight on-demand loading and efficient garbage
collection. Model changes are automatically reflected in the underlying storage,
making changes visible to all the clients. To do so, we first decouple dependencies
among objects by assigning a unique identifier to all model objects, and then:
– To implement lightweight on-demand loading and saving, for each live model
object, we create a lightweight delegate object that is in charge of loading the
element data on demand and keeping track of the element’s state. Delegates
load and save data from the persistence backend by using the object’s unique
identifier.
– For efficient garbage collection in the Java Runtime Environment, we avoid
maintaining hard Java references among model objects, so that the garbage
collector can deallocate any model object that is not directly referenced by
the application.</p>
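        <p>The two points above can be illustrated with a minimal sketch of a delegate that holds only an identifier and fetches data on demand. The names PropertyBackend, InMemoryBackend, and ModelObjectDelegate are hypothetical and are not part of the NeoEMF/HBase or EMF APIs; the in-memory map merely stands in for the persistence backend.</p>
        <preformat><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical backend contract: one value per (object identifier, feature name). */
interface PropertyBackend {
    Object read(String objectId, String feature);
    void write(String objectId, String feature, Object value);
}

/** In-memory stand-in for the persistence backend; counts reads to show on-demand access. */
class InMemoryBackend implements PropertyBackend {
    private final Map<String, Object> cells = new HashMap<>();
    int reads = 0;

    public Object read(String objectId, String feature) {
        reads++;
        return cells.get(objectId + "/" + feature);
    }

    public void write(String objectId, String feature, Object value) {
        cells.put(objectId + "/" + feature, value);
    }
}

/** Lightweight delegate: keeps the object's identifier, never a hard reference to other objects. */
class ModelObjectDelegate {
    private final String id;
    private final PropertyBackend backend;

    ModelObjectDelegate(String id, PropertyBackend backend) {
        this.id = id;
        this.backend = backend;
    }

    String id() {
        return id;
    }

    /** Loads the feature value from the backend only when it is requested. */
    Object get(String feature) {
        return backend.read(id, feature);
    }

    /** Writes through to the backend so the change is visible to other clients. */
    void set(String feature, Object value) {
        backend.write(id, feature, value);
    }

    /** Resolves a single-valued reference: the stored cell is the target's identifier. */
    ModelObjectDelegate getReference(String feature) {
        Object targetId = backend.read(id, feature);
        return targetId == null ? null : new ModelObjectDelegate((String) targetId, backend);
    }
}
```
]]></preformat>
        <p>Because a delegate stores an identifier instead of a Java reference to its target, a referenced object that the application no longer holds can be collected and simply re-created from the backend the next time the reference is navigated.</p>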
      </sec>
      <sec id="sec-2-4">
        <title>Map-based data model</title>
        <p>We have designed the underlying data model of NeoEMF/HBase to minimize
the data interactions of each method of the EMF model access API. The design
takes advantage of the unique identifier defined in the previous section to flatten
the graph structure into a set of key-value mappings.</p>
        <p>Fig. 4a shows a small excerpt of a possible Java metamodel that we will
use to exemplify the data model. This metamodel describes Java programs in
terms of Packages, ClassDeclarations, BodyDeclarations, and Modifiers. A
Package is a named container that groups a set of ClassDeclarations through the
ownedElements composition. A ClassDeclaration contains a name and a set of
BodyDeclarations. Finally, a BodyDeclaration contains a name, and its visibility
is described by a single Modifier.</p>
        <p>Fig. 4: (a) Excerpt of the Java metamodel: Package (name : String) contains many ClassDeclarations (name : String) through ownedElements; ClassDeclaration contains many BodyDeclarations (name : String) through bodyDeclarations; BodyDeclaration has exactly one Modifier (visibility : VisibilityKind) through modifier; VisibilityKind is an enumeration with the literals none, public, private, and protected. (b) Sample instance: p1 : Package (name ’package1’) owns c1 : ClassDeclaration (name ’class1’), whose bodyDeclarations are b1 (name ’bodyDecl1’) and b2 (name ’bodyDecl2’), with the Modifiers m1 and m2 (visibility public), respectively.</p>
        <p>Fig. 4b shows a sample instance of the Java metamodel, i.e., a graph of
objects conforming to the metamodel structure. The model contains a single
Package (package1), containing only one ClassDeclaration (class1). The
ClassDeclaration contains the bodyDecl1 and bodyDecl2 BodyDeclarations. Both of them are
public.</p>
        <p>NeoEMF/HBase uses a single table with three column families to store
models’ information: (i) a property column family, which keeps all objects’ data
stored together; (ii) a type column family, which tracks how objects interact with
the meta-level (such as the instance-of relationships); and (iii) a containment
column family, which defines the models’ structure in terms of containment
references. Table 1 (see footnote 3) shows how the sample instance in Fig. 4b is
represented using this structure.</p>
        <p>As Table 1 shows, row keys are the objects’ unique identifiers. The property
column family stores the objects’ actual data. As can be seen, not all rows have
a value for a given column. How data is stored depends on the property type and
cardinality (i.e., upper bound). For example, values for single-valued attributes
(like the name, which is stored in the name column) are directly saved as a single
literal value, while values for many-valued attributes are saved as an array of
single literal values (Fig. 4b does not contain an example of this). Values for
single-valued references, such as the modifier containment reference from
BodyDeclaration to Modifier, are stored as a single value (corresponding to the
identifier of the referenced object). Examples of this are the cells for ⟨b1, modifier⟩
and ⟨b2, modifier⟩, which contain the values ’m1’ and ’m2’ respectively. Finally,
multi-valued references are stored as an array containing the literal identifiers of
the referenced objects. An example of this is the bodyDeclarations containment
reference, from ClassDeclaration to BodyDeclaration, which for the c1
object is stored as { ’b1’, ’b2’ } in the ⟨c1, bodyDeclarations⟩ cell.</p>
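        <p>The encoding rules above can be tried out with a minimal in-memory sketch of the property column family. PropertyFamilySketch and its methods are hypothetical names, the nested maps only stand in for HBase rows and columns, and representing the visibility literal as the string "public" is an assumption of this sketch.</p>
        <preformat><![CDATA[
```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for the property column family: row key -> column -> cell value. */
class PropertyFamilySketch {
    private final Map<String, Map<String, Object>> rows = new TreeMap<>();

    /** Single-valued attribute or reference: one literal (a value or a target identifier). */
    void setSingle(String id, String feature, String value) {
        rows.computeIfAbsent(id, k -> new TreeMap<String, Object>()).put(feature, value);
    }

    /** Many-valued attribute or reference: an array of literals. */
    void setMany(String id, String feature, String... values) {
        rows.computeIfAbsent(id, k -> new TreeMap<String, Object>()).put(feature, values.clone());
    }

    /** Returns the literal stored in a cell, or null when the row has no such column. */
    String getSingle(String id, String feature) {
        Map<String, Object> row = rows.get(id);
        return row == null ? null : (String) row.get(feature);
    }

    /** Returns the literals stored in an array cell, or an empty list when the cell is absent. */
    List<String> getMany(String id, String feature) {
        Map<String, Object> row = rows.get(id);
        if (row == null || row.get(feature) == null) {
            return Collections.<String>emptyList();
        }
        return Arrays.asList((String[]) row.get(feature));
    }

    /** Builds the sample instance of Fig. 4b. */
    static PropertyFamilySketch sampleInstance() {
        PropertyFamilySketch table = new PropertyFamilySketch();
        table.setSingle("p1", "name", "package1");
        table.setMany("p1", "ownedElements", "c1");
        table.setSingle("c1", "name", "class1");
        table.setMany("c1", "bodyDeclarations", "b1", "b2");
        table.setSingle("b1", "name", "bodyDecl1");
        table.setSingle("b1", "modifier", "m1");
        table.setSingle("b2", "name", "bodyDecl2");
        table.setSingle("b2", "modifier", "m2");
        table.setSingle("m1", "visibility", "public"); // assumed literal encoding
        table.setSingle("m2", "visibility", "public"); // assumed literal encoding
        return table;
    }
}
```
]]></preformat>
        <p>Navigating from c1 to its body declarations and then to a modifier is therefore a sequence of independent cell lookups keyed by identifier, with no need to load the rest of the model.</p>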
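        <p>Structurally, EMF models are trees (a characteristic inherited from their
XML-based representation). That implies that every non-volatile object (except the
root object) must be contained within another object (i.e., referenced from
another object via a containment reference). The containment column family
maintains a record of which is the container for every persisted object. The
container column records the identifier of the container object, while the
feature column records the name of the property that relates the container object
with the child object (i.e., the object to which the row corresponds). Table 1
shows that, for example, the container of the Package p1 is ROOT through the
eContents property (i.e., it is a root object and is not contained by any other
object). In the next row we find the entry that describes that the Class c1 is
contained in the Package p1 through the ownedElements property.</p>
        <table-wrap>
          <label>Table 1</label>
          <table>
            <thead>
              <tr><th rowspan="2">Key</th><th colspan="3">property</th><th colspan="2">containment</th><th>type</th></tr>
              <tr><th>name</th><th>ownedElements</th><th>bodyDeclarations</th><th>container</th><th>feature</th><th>nsURI</th></tr>
            </thead>
            <tbody>
              <tr><td>’p1’</td><td>’package1’</td><td>{ ’c1’ }</td><td/><td>’ROOT’</td><td>’eContents’</td><td>’http://java’</td></tr>
              <tr><td>’c1’</td><td>’class1’</td><td/><td>{ ’b1’, ’b2’ }</td><td>’p1’</td><td>’ownedElements’</td><td>’http://java’</td></tr>
              <tr><td>’b1’</td><td>’bodyDecl1’</td><td/><td/><td>’c1’</td><td>’bodyDeclarations’</td><td>’http://java’</td></tr>
              <tr><td>’b2’</td><td>’bodyDecl2’</td><td/><td/><td>’c1’</td><td>’bodyDeclarations’</td><td>’http://java’</td></tr>
              <tr><td>’m1’</td><td/><td/><td/><td>’b1’</td><td>’modifiers’</td><td>’http://java’</td></tr>
              <tr><td>’m2’</td><td/><td/><td/><td>’b2’</td><td>’modifiers’</td><td>’http://java’</td></tr>
            </tbody>
          </table>
          <table-wrap-foot>
            <p>Footnote 3: Actual rows have been split for improved readability. The cells of the modifier, visibility, and EClass columns are not recoverable from the source and are omitted.</p>
          </table-wrap-foot>
        </table-wrap>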
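        <p>A short sketch may help to show how the containment column family answers "who contains this object?" with a single row lookup. ContainmentFamilySketch is a hypothetical class, and the map below only mimics the container and feature columns for part of the sample instance.</p>
        <preformat><![CDATA[
```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the containment column family: row key -> {container, feature}. */
class ContainmentFamilySketch {
    static final String ROOT = "ROOT";
    private final Map<String, String[]> containment = new HashMap<>();

    void setContainer(String id, String container, String feature) {
        containment.put(id, new String[] {container, feature});
    }

    /** Returns the container cell of a row; the row must have been registered first. */
    String containerOf(String id) {
        return containment.get(id)[0];
    }

    /** Returns the feature cell of a row; the row must have been registered first. */
    String featureOf(String id) {
        return containment.get(id)[1];
    }

    /** Walks the container cells upwards until the ROOT marker is reached. */
    List<String> pathToRoot(String id) {
        List<String> path = new ArrayList<>();
        String current = id;
        while (!ROOT.equals(current)) {
            path.add(current);
            current = containerOf(current);
        }
        return path;
    }
}
```
]]></preformat>
        <p>Registering p1 under ROOT, c1 under p1, and b2 under c1 lets pathToRoot("b2") visit b2, c1, and p1 in that order, reading one row per step.</p>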
        <p>The type column family groups the type information by means of the nsURI
and EClass columns. For example, the table specifies that the element p1 is an
instance of the Package class of the Java metamodel (which is identified by the
http://java nsURI).</p>
      </sec>
      <sec id="sec-2-5">
        <title>ACID properties</title>
        <p>
          NeoEMF/HBase is designed as a simple persistence layer that maintains the
same semantics as the standard EMF. Modifications in models stored using
NeoEMF/HBase are directly propagated to the underlying storage, making
changes visible to all possible readers immediately. As in standard EMF, no
transactional support is explicitly provided, and as such, ACID properties [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]
(Atomicity, Consistency, Isolation, Durability) are only supported at the object
level:
Atomicity: Modifications of an object’s properties are atomic. Modifications
involving changes in more than one object (e.g., bi-directional references)
are not atomic.
        </p>
        <p>Consistency: Modifications of an object’s properties are always consistent,
thanks to a compare-and-swap mechanism. In the case of modifications involving
changes in more than one object, consistency is only guaranteed when the
model is modified to grow monotonically (i.e., only new information is added,
and no existing data is deleted or modified).</p>
        <p>Isolation: Reads on a given object always succeed and always give a view
of the object’s latest valid state.</p>
        <p>Durability: Modifications on a given object are always reflected in the
underlying storage, even in the case of a DataNode failure, thanks to the
replication capabilities provided by HBase.</p>
        <p>These properties allow the use of NeoEMF/HBase as the persistence
backend for distributed and concurrent model transformations, since reads of the
source model are consistent and always succeed, and the creation of the target
model is a building process in which the model grows monotonically.</p>
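        <p>The compare-and-swap idea behind the consistency guarantee can be sketched with a retry loop over a single cell. This is an in-memory illustration only: CasCellStore is a hypothetical class, a ConcurrentHashMap stands in for the store, and the sketch makes no claim about how NeoEMF/HBase issues its conditional updates.</p>
        <preformat><![CDATA[
```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cell store whose updates only succeed if the cell is unchanged since it was read. */
class CasCellStore {
    private final ConcurrentHashMap<String, List<String>> cells = new ConcurrentHashMap<>();

    List<String> read(String cellKey) {
        List<String> value = cells.get(cellKey);
        return value == null ? Collections.<String>emptyList() : value;
    }

    /** Appends an identifier to a multi-valued reference cell, retrying until the swap succeeds. */
    void appendReference(String cellKey, String targetId) {
        while (true) {
            List<String> current = cells.get(cellKey);
            List<String> updated = new ArrayList<>();
            if (current != null) {
                updated.addAll(current);
            }
            updated.add(targetId);
            List<String> frozen = Collections.unmodifiableList(updated);
            boolean swapped = (current == null)
                    ? cells.putIfAbsent(cellKey, frozen) == null   // create only if still absent
                    : cells.replace(cellKey, current, frozen);     // swap only if still unchanged
            if (swapped) {
                return;
            }
        }
    }

    /** Runs concurrent appends on one cell and returns the final number of references. */
    static int concurrentAppendDemo(int threads, int perThread) {
        CasCellStore store = new CasCellStore();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int worker = t;
            Thread thread = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    store.appendReference("c1/bodyDeclarations", "b" + worker + "_" + i);
                }
            });
            workers.add(thread);
            thread.start();
        }
        for (Thread thread : workers) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return store.read("c1/bodyDeclarations").size();
    }
}
```
]]></preformat>
        <p>Since every writer only adds information and retries when another writer got there first, no append is lost, which matches the monotonic-growth scenario described above.</p>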
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Conclusion and Future Work</title>
      <p>In this paper we have outlined NeoEMF/HBase, an on-demand,
memory-friendly persistence layer for distributed and decentralized model persistence.
Decentralized model persistence is useful in scenarios where multiple clients
may access models when performing distributed computing. NeoEMF/HBase
is built on top of HBase, a distributed, scalable, versioned and non-relational
big data store, specifically designed to run together with Apache Hadoop.</p>
      <p>NeoEMF/HBase takes advantage of the HBase properties by using a
simple data model that minimizes data dependencies among stored objects. More
specifically, NeoEMF/HBase exploits the row-locking mechanisms of HBase
to provide limited ACID properties without requiring the use of transactions,
which may increase latency in model operations. NeoEMF/HBase provides
ACID properties at the object level, and guarantees that: (i) object queries
always return the last valid state of an object; (ii) attribute modifications always
succeed and produce a consistent model; and (iii) modifications of references
which make the model grow monotonically always succeed and produce a
consistent model.</p>
      <p>
        Previous work [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] shows that key-value stores present clear benefits for
storing big models, since the cost of model operations remains constant as model size
grows. However, NeoEMF/HBase still lacks a thorough performance
evaluation. Hence, immediate future work is focused on the development of an
evaluation benchmark. In particular, we aim to determine how the latency
introduced by HBase, especially on write operations, affects the overall performance.
      </p>
      <p>Additionally, a more advanced locking mechanism allowing arbitrary object
locks will be implemented. Such a mechanism will provide multi-object ACID
properties to the framework, allowing client applications to implement the
synchronization logic to perform arbitrary, distributed and concurrent
modifications.</p>
      <p>This work is partially supported by the MONDO (EU ICT-611125) project.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>CDO</given-names>
            <surname>DB Store</surname>
          </string-name>
          (
          <year>2014</year>
          ), http://wiki.eclipse.org/CDO/DB_Store
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>CDO</given-names>
            <surname>Hibernate Store</surname>
          </string-name>
          (
          <year>2014</year>
          ), http://wiki.eclipse.org/CDO/Hibernate_Store
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>CDO</given-names>
            <surname>Model Repository</surname>
          </string-name>
          (
          <year>2014</year>
          ), http://www.eclipse.org/cdo/
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>CDO MongoDB Store</surname>
          </string-name>
          (
          <year>2014</year>
          ), http://wiki.eclipse.org/CDO/MongoDB_Store
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>CDO</given-names>
            <surname>Objectivity Store</surname>
          </string-name>
          (
          <year>2014</year>
          ), http://wiki.eclipse.org/CDO/Objectivity_Store
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Eclipse</given-names>
            <surname>Modeling Framework</surname>
          </string-name>
          (
          <year>2014</year>
          ), http://www.eclipse.org/modeling/emf/
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Hadoop</given-names>
            <surname>Distributed File System</surname>
          </string-name>
          (
          <year>2015</year>
          ), http://hadoop.apache.org/docs/r1.2.1/hdfs_design.html
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Benelallam</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gómez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sunyé</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tisi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Launay</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Neo4EMF, A Scalable Persistence Layer for EMF Models</article-title>
          .
          <source>In: Modelling Foundations and Applications, Lecture Notes in Computer Science</source>
          , vol.
          <volume>8569</volume>
          , pp.
          <fpage>230</fpage>
          -
          <lpage>241</lpage>
          . Springer (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dean</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghemawat</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hsieh</surname>
            ,
            <given-names>W.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wallach</surname>
            ,
            <given-names>D.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burrows</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chandra</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fikes</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gruber</surname>
          </string-name>
          , R.E.:
          <article-title>Bigtable: A distributed storage system for structured data</article-title>
          .
          <source>In: Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation</source>
          - Volume
          <volume>7</volume>
          . pp.
          <fpage>15</fpage>
          -
          <lpage>15</lpage>
          . OSDI '06,
          <string-name>
            <given-names>USENIX</given-names>
            <surname>Association</surname>
          </string-name>
          , Berkeley, CA, USA (
          <year>2006</year>
          ), http://dl.acm.org/citation.cfm?id=1267308.1267323
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Dean</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghemawat</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <source>MapReduce: Simplified Data Processing on Large Clusters. In: Commun. ACM</source>
          . vol.
          <volume>51</volume>
          , pp.
          <fpage>107</fpage>
          -
          <lpage>113</lpage>
          . ACM, NY, USA (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Garrison</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wakefield</surname>
            ,
            <given-names>R.L.</given-names>
          </string-name>
          :
          <article-title>Success factors for deploying cloud computing</article-title>
          .
          <source>Commun. ACM</source>
          <volume>55</volume>
          (
          <issue>9</issue>
          ),
          <fpage>62</fpage>
          -
          <lpage>68</lpage>
          (
          <year>Sep 2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Gómez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tisi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sunyé</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cabot</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Map-based transparent persistence for very large models</article-title>
          . In:
          <string-name>
            <surname>Egyed</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schaefer</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          (eds.)
          <source>Fundamental Approaches to Software Engineering</source>
          , Lecture Notes in Computer Science, vol.
          <volume>9033</volume>
          , pp.
          <fpage>19</fpage>
          -
          <lpage>34</lpage>
          . Springer Berlin Heidelberg (
          <year>2015</year>
          ), http://dx.doi.org/10.1007/978-3-662-46675-9_2
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Haerder</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reuter</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Principles of transaction-oriented database recovery</article-title>
          .
          <source>ACM Comput. Surv</source>
          .
          <volume>15</volume>
          (
          <issue>4</issue>
          ),
          <fpage>287</fpage>
          -
          <lpage>317</lpage>
          (
          <month>Dec</month>
          <year>1983</year>
          ), http://doi.acm.org/10.1145/289.291
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Khurana</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Introduction to HBase Schema Design</article-title>
          .
          <source>;login: The Usenix Magazine</source>
          <volume>37</volume>
          (
          <issue>5</issue>
          ),
          <fpage>29</fpage>
          -
          <lpage>36</lpage>
          (
          <year>2012</year>
          ), https://www.usenix.org/publications/login/october-2012-volume-37-number-5/introduction-hbase-schema-design
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Scheidgen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <source>EMF fragments</source>
          (
          <year>2014</year>
          ), https://github.com/markus1978/emf-fragments/wiki
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Scheidgen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zubow</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Map/Reduce on EMF Models</article-title>
          .
          In:
          <source>Proceedings of the 1st International Workshop on Model-Driven Engineering for High Performance and Cloud Computing</source>
          . pp.
          <fpage>7:1</fpage>
          -
          <lpage>7:5</lpage>
          . MDHPCL '12, ACM
          , New York, NY, USA (
          <year>2012</year>
          ), http://doi.acm.org/10.1145/2446224.2446231
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <collab>The Apache Software Foundation</collab>
          :
          <source>Apache Hadoop</source>
          (
          <year>2015</year>
          ), http://hadoop.apache.org/
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <collab>The Apache Software Foundation</collab>
          :
          <source>Apache HBase</source>
          (
          <year>2015</year>
          ), http://hbase.apache.org/
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <collab>The Apache Software Foundation</collab>
          :
          <source>Apache ZooKeeper</source>
          (
          <year>2015</year>
          ), https://zookeeper.apache.org/
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>