<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>IWSG</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Towards a Metadata-driven Multi-community Research Data Management Service</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Richard Grunzke*, Wolfgang E. Nagel</string-name>
          <email>richard.grunzke@tu-dresden.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Volker Hartmann, Thomas Jejkal, Ajinkya Prabhune, Rainer Stotzka</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexander Hoffmann, Sonja Herres-Pawlis</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aline Deicke, Torsten Schrade</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hendrik Herold, Gotthard Meinel</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute for Data Processing and Electronics, Karlsruhe Institute of Technology</institution>
          ,
          <addr-line>Karlsruhe</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Center for Information Services and High Performance Computing, Technische Universität Dresden</institution>
          ,
          <addr-line>Dresden</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Digitale Akademie, Akademie der Wissenschaften und Literatur Mainz</institution>
          ,
          <addr-line>Mainz</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Institut für Anorganische Chemie, Rheinisch-Westfälische Technische Hochschule Aachen</institution>
          ,
          <addr-line>Aachen</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Monitoring of Settlement and Open Space Development, Institute of Ecological and Regional Development</institution>
          ,
          <addr-line>Dresden</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <volume>8</volume>
      <fpage>8</fpage>
      <lpage>10</lpage>
      <abstract>
        <p>Nowadays, the daily work of many research communities is characterized by an increasing amount and complexity of data. This makes it increasingly difficult to manage, access and utilize to ultimately gain scientific insights based on it. At the same time, domain scientists want to focus on their science instead of IT. The solution is research data management in order to store data in a structured way to enable easy discovery for future reference. An integral part is the use of metadata. With it, data becomes accessible by its content instead of only its name and location. The use of metadata shall be as automatic and seamless as possible in order to foster a high usability. Here we present the architecture and initial steps of the MASi project with its aim to build a comprehensive research data management service. First, it extends the existing KIT Data Manager framework by a generic programming interface and by a generic graphical web interface. Advanced additional features include the integration of provenance metadata and persistent identifiers. The MASi service aims at being easily adaptable for arbitrary communities with limited effort. The requirements for the initial use cases within geography, chemistry and digital humanities are elucidated. The MASi research data management service is currently being built up to satisfy these complex and varying requirements in an efficient way.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>findability, pre-processing for further use and the exploitation of existing data. An established method to describe complex data structures is the use of metadata. This encapsulates in aggregated form the substance of a data set. Metadata (“data about data”) plays a central role in making data available for the long term. It is essential for the comprehension and storage of data, its preservation, curation and discovery for future re-use. Metadata allows for the easier applying of complex and costly tasks such as searching for data based on metadata. Aside from a better discovery, other data management aspects, such as managing and utilizing similarities between data sets, are fostered. In diverse scientific communities partly very different metadata standards exist that each incorporate community specific data characteristics. This limited portability to new use cases causes established methods in a scientific field to be of limited use in other fields. Also, the number of standardized tools to extract metadata from heterogeneous data is limited.</p>
      <p>
        devices and data management systems (e.g. iRODS [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]) in
the data life cycle hierarchy [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Other systems including a
delimitation in regard to the KIT DM are described in Section
II-B.
      </p>
    </sec>
    <sec id="sec-2">
      <title>A. KIT Data Manager - A Repository Framework</title>
      <p>
        The KIT Data Manager [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] is a generic, highly customizable
open source software framework for building research data
repository systems. Horizontally, it is organized into a number
of well-defined high-level services providing functionalities
for data and metadata management and sharing as well as
administrative services for user and group management. Due
to the focus on research data, KIT Data Manager also provides
features beyond those of typical repository systems, namely a
flexible data transfer service that can support virtually any data
transfer protocol and a data workflow service that allows the
automatic execution of data processing tasks to be triggered
locally or remotely. These tasks are configured in the repository system
and include data transfer to the processing environment,
ingest of the processing results and provenance tracking.
High-level services can be accessed either via Java APIs, e.g.
to implement Web-based user interfaces or to extend the
basic framework by additional functionalities, or via RESTful
service interfaces, e.g. to access KIT Data Manager based
repository systems remotely using a programming language
of choice.
      </p>
      <p>Vertically, KIT Data Manager is organized into different
layers where the upper layer is formed by the high-level
services described before. For repository systems based on
KIT Data Manager this upper layer provides reliable and
well-defined extension points on the one hand and a high
degree of abstraction from underlying technologies on the
other hand. The lowest layer of the architecture interfaces
these technologies by defining a basic set of functionalities
that is provided by a corresponding technology, e.g. to store
and restore a predefined hierarchical data structure in case of
the interface that has to be implemented for integrating a data
storage technology. This offers a high degree of sustainability
as changing technologies only affects the lower layer whereas
upper layers are unaffected by technology changes.</p>
      <p>Currently, KIT Data Manager is used to implement
repository systems for various scientific disciplines, namely biology,
arts and humanities, and nano-science. Due to its extensibility
the base framework can be tailored to fulfil the specific needs
of each of these disciplines with reasonable effort.</p>
    </sec>
    <sec id="sec-3">
      <title>B. Other Systems and Delimitation</title>
      <p>
        The ICAT system [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] aims at supporting data management
for photon science facilities [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. This includes supporting
beamline proposals, access rights, experiments, studies and
instruments that produce the actual data. This data is collected
as datasets which can then be published. Attaching
metadata such as experiment parameters, instrument
parameters, and sample descriptions is supported. This closely follows
the physics requirements but at the same time makes it hard
to adapt for other use cases.
      </p>
      <p>Technically, ICAT relies on a Java EE application server, with Glassfish being the standard;
Oracle and MySQL databases are supported. It offers
a web service interface to support, for example, the ICAT
download manager TopCat. Authentication and authorization
mechanisms are supported via LDAP, local database or
anonymous access with a plugin interface being available for
extensions.</p>
      <p>
        DSpace [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] is a mature and ready-to-use solution for
institutional repositories and it is free and open source software. It is
adaptable to fit the needs of individual institutions and fosters
open access to all kinds of content. It supports submission
workflows and various ingest and export methods. Various file
types, persistent IDs and PostgreSQL and Oracle databases
are supported. Search capabilities via metadata (descriptive,
administrative, structural) are provided that foster the
long-term preservation and accessibility of data.
      </p>
      <p>
        Fedora (Flexible Extensible Digital Object Repository
Architecture) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] provides a framework with individual basic
components to build repositories. It is open source and aims
to be robust and modular. The main use case is to provide
specialized services that may be integrated with existing
environments and technologies. A main goal is to foster digital
content preservation for complex and large datasets. Metadata
for data organization is supported as well as descriptions of
relationships between and linking of datasets.
      </p>
      <p>
        EUDAT [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] is a European project aiming to create a
generically applicable infrastructure to manage, access, and
preserve research data. The EUDAT services B2SHARE and
B2FIND involve metadata. B2SHARE is for storing and
sharing research data via a web portal. It is also the central
means of uploading data. Uploads have to be done via the web portal
and metadata has to be entered manually, although community-specific
profiles can be defined. B2FIND enables access to
data sets via their metadata and allows them to be annotated with comments.
A B2NOTE service is planned which aims at enabling an
automatic annotation of metadata [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>iRODS, as a distributed data management system, is not
focused on metadata management although it offers some basic
metadata functionality. Metadata can be attached to data as
attribute-value-unit triples on a per-file basis, which can be used
for searching. Integrated capabilities for metadata extraction,
annotation and provenance support are missing.</p>
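      <p>The attribute-value-unit model can be illustrated with a few lines of Python. The following is a minimal sketch of the idea only and does not use any iRODS interface: the class names AVU and FileCatalog, the file paths and the metadata values are assumptions made for this example.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict
from typing import NamedTuple


class AVU(NamedTuple):
    """One attribute-value-unit triple; the unit may be empty."""
    attribute: str
    value: str
    unit: str = ""


class FileCatalog:
    """Toy in-memory catalog: each file path carries a set of AVU triples."""

    def __init__(self) -> None:
        self._avus = defaultdict(set)

    def attach(self, path: str, avu: AVU) -> None:
        """Attach one triple to one file."""
        self._avus[path].add(avu)

    def search(self, attribute: str, value: str) -> list:
        """Return the sorted paths carrying a triple with this attribute and value."""
        return sorted(
            path
            for path, avus in self._avus.items()
            if any(a.attribute == attribute and a.value == value for a in avus)
        )


catalog = FileCatalog()
catalog.attach("/zone/run1.dat", AVU("temperature", "293", "K"))
catalog.attach("/zone/run2.dat", AVU("temperature", "77", "K"))
catalog.attach("/zone/run2.dat", AVU("operator", "alice"))
print(catalog.search("temperature", "77"))
```
]]></preformat>
      <p>Searching works per file and per triple; anything beyond that, such as extraction or provenance, would have to be built on top.</p>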
      <p>In contrast to these systems, the KIT Data Manager is more
flexible. It can be specifically adapted in-depth to arbitrary
target communities with a close integration into community
workflows. It enables far-reaching automation for high
usage efficiency, with ready-made capabilities that can be adapted to
specific communities.</p>
      <sec id="sec-3-1">
        <title>III. MASI RESEARCH DATA MANAGEMENT SERVICE</title>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>A. Overarching Goals</title>
      <p>The MASi project is building a generic and sustainable
repository service. It will be operated sustainably for the involved
communities so that they can fully handle their data management
requirements by utilizing metadata. One part of the project is the
development of a generic model as a concrete best practice
implementation guide. It will make it easy to satisfy specific
community data management requirements by using metadata.
Following this guide, further communities will be supported in
building up their own MASi instances. The service is based on
the KIT Data Manager repository framework (see Section II-A),
which we are currently extending. On the one hand, this
includes creating a generic API to support further metadata
models. On the other hand, we are implementing and will
provide generic graphical interfaces to fundamentally lower
the effort to adapt MASi to new use cases. Furthermore, we are
closely collaborating within the Research Data Alliance (RDA)
with other data researchers to develop recommendations and
aim at implementing these within MASi. An example of such
an RDA recommendation is the support for PIDs in conjunction
with PID information types and a data type registry.</p>
    </sec>
    <sec id="sec-5">
      <title>B. Generic Metadata Interface</title>
      <p>
        The metadata groups within RDA compiled a set of
principles regarding metadata. The well-known definition of
metadata as “data about data” is the basis. Metadata differs in its
mode of use and should be easily machine-understandable. It
may cover the whole lifecycle of the data, from the idea of
a project through data acquisition to the publication which references
the data. MASi will be a single point of access for all such
kinds of metadata. The metadata is linked to the data via a
unique identifier. The identifier may be a custom one as long as
the data is only managed internally. As soon as the metadata is
available to the public, a persistent identifier (PID) such
as a DOI (Digital Object Identifier) is used to make the data
referenceable. Each PID contains at least two attributes, holding
a URL to the metadata and a URL to the data. The metadata is
available as a METS (Metadata Encoding and Transmission
Standard) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] document which is structured in XML. In
MASi it is used as the standard format for all interfaces.
In the final stage there will be a registered MASi profile of
METS which is valid in all configurations. METS defines
seven sections for different purposes. The metadata handled
by MASi itself is split into several packages (see Figure 1).
Some of the packages are very similar to the sections used
in METS. Others are arranged in the way that is most
suitable for MASi. Each package serves a specific
purpose (administrative, structural, content, bit preservation,
provenance and annotation metadata). Since not all packages
are needed by every community, the structure of the METS
document may differ slightly. In the future, new packages
can also be added without conflict.
      </p>
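      <p>The identifier handling described above can be sketched as follows. This is an illustration under stated assumptions rather than the MASi implementation: the names PidRecord and IdentifierRegistry, the example URLs and the DOI are made up, and a real PID would be minted by an external PID service rather than stored in a local dictionary.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass
from typing import Dict
import uuid


@dataclass(frozen=True)
class PidRecord:
    """The two attributes named in the text: where the metadata and the data live."""
    metadata_url: str
    data_url: str


class IdentifierRegistry:
    """Toy registry: custom identifiers internally, a PID once published."""

    def __init__(self) -> None:
        self._records: Dict[str, PidRecord] = {}

    def register_internal(self, record: PidRecord) -> str:
        """Store a record under a freshly generated custom identifier."""
        identifier = "internal:" + uuid.uuid4().hex
        self._records[identifier] = record
        return identifier

    def publish(self, internal_id: str, pid: str) -> None:
        """Make the same record reachable under a persistent identifier."""
        self._records[pid] = self._records[internal_id]

    def resolve(self, identifier: str) -> PidRecord:
        """Look up a record by custom identifier or PID."""
        return self._records[identifier]


registry = IdentifierRegistry()
record = PidRecord(
    metadata_url="https://repository.example.org/metadata/42",  # hypothetical
    data_url="https://repository.example.org/data/42",          # hypothetical
)
internal_id = registry.register_internal(record)
registry.publish(internal_id, "doi:10.99999/example-42")  # made-up DOI
print(registry.resolve("doi:10.99999/example-42").metadata_url)
```
]]></preformat>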
      <p>To store these different kinds of formats in an efficient way,
MASi offers a generic storage API to various kinds of
underlying data management systems. To keep the maintenance
effort manageable, we will focus on widely used standards.
MASi offers a REST interface (see Figure 1) which allows
for high extensibility. This interface supports the CRUD
(create, read, update, delete) operations for each package or
for whole metadata sets. In the case of published open access
data, anyone can perform read operations on metadata without
authentication. For all other operations the user has to be
authenticated and authorized.</p>
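      <p>The access rule for the CRUD operations can be made concrete with a small in-memory model. The sketch below is not the MASi REST interface; the class MetadataService, its method names and the example values are assumptions, and only authentication is modelled (a user is either given or None), while authorization is left out for brevity.</p>
      <preformat><![CDATA[
```python
class MetadataService:
    """In-memory stand-in for per-package metadata CRUD with the access rule above."""

    def __init__(self):
        self._documents = {}     # (dataset, package) -> metadata document
        self._published = set()  # datasets published as open access

    @staticmethod
    def _require_login(user):
        if user is None:
            raise PermissionError("authentication required")

    def create(self, user, dataset, package, document):
        self._require_login(user)
        if (dataset, package) in self._documents:
            raise KeyError("package already exists")
        self._documents[(dataset, package)] = document

    def read(self, user, dataset, package):
        # Anonymous reads are allowed only for published open access data.
        if dataset not in self._published:
            self._require_login(user)
        return self._documents[(dataset, package)]

    def update(self, user, dataset, package, document):
        self._require_login(user)
        if (dataset, package) not in self._documents:
            raise KeyError("no such package")
        self._documents[(dataset, package)] = document

    def delete(self, user, dataset, package):
        self._require_login(user)
        del self._documents[(dataset, package)]

    def publish(self, user, dataset):
        self._require_login(user)
        self._published.add(dataset)


service = MetadataService()
service.create("alice", "dataset-1", "provenance", {"workflow": "uv-vis-reduction"})
service.publish("alice", "dataset-1")
print(service.read(None, "dataset-1", "provenance"))
```
]]></preformat>
      <p>In a REST setting the four methods would typically map to POST, GET, PUT and DELETE on a per-package resource; that mapping is a common convention and is not taken from the MASi specification.</p>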
      <p>
        The following serves as an example regarding the
provenance package functionality: There are many workflow
engines each implementing its own format. But there are two
standards which are supported by the majority, Open
Provenance Model (OPM) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] and ProvOne [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. It is possible to
transform OPM metadata to the ProvOne format without loss.
      </p>
      <p>ProvOne describes the provenance as a graph represented as XML. To allow for sophisticated queries, the graph is stored
in a graph database. Therefore, it is possible to query, e.g., for
similar workflows and even more complex queries are
possible. METS has a pre-defined section for provenance metadata.
It uses digiProvMD which can be losslessly transformed to
ProvOne and vice versa.</p>
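      <p>A typical provenance query asks for everything a result was derived from. The sketch below shows such a lineage query on a plain dictionary standing in for the graph database; it uses neither ProvOne nor a real graph store, and the node names are hypothetical labels loosely borrowed from the chemistry use case.</p>
      <preformat><![CDATA[
```python
from collections import deque

# Edges point from a derived item to the items it was produced from.
derived_from = {
    "time_traces": ["raw_spectra"],
    "decay_constants": ["time_traces"],
    "final_dataset": ["decay_constants", "dft_results"],
}


def lineage(item, edges):
    """Return every upstream item reachable from `item` (breadth-first search)."""
    seen = set()
    queue = deque([item])
    while queue:
        current = queue.popleft()
        for parent in edges.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


print(sorted(lineage("final_dataset", derived_from)))
```
]]></preformat>
      <p>A graph database answers the same kind of question declaratively and at scale; the traversal is spelled out here only to show what is being asked.</p>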
      <p>For each package there can be a specialized database storing
the metadata. MASi will collect all metadata and compile
it into a METS document using the MASi profile or, in the case
of a data ingest, split the metadata into its packages to store
them accordingly. On the client side there will be MASi tools
supporting communities in compiling a valid METS document
matching the MASi profile. Subsequently, on the server side a
basic quality control is triggered during metadata ingest. It is
based on the respective XML schema, which is available for
all packages except content metadata, as each community
has its own specific content metadata. Such a schema needs
to be created for each community. Registered schemata can
be used for the quality control. If this control is required to
be more sophisticated, a Java plugin can be implemented and
easily activated in order to support any kind of quality control
capability.</p>
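      <p>The two-stage quality control, a basic structural check followed by optional community plugins, can be sketched as below. Several simplifications are assumed: the basic check only tests well-formedness and the presence of required child elements instead of validating against an XML schema, the plugin is a Python function rather than a Java plugin, and the element names (record, administrative, structural, content, wavelength) are invented for the example.</p>
      <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET

REQUIRED_PACKAGES = ("administrative", "structural")  # hypothetical element names


def wavelength_plugin(root):
    """Hypothetical community check: content/wavelength must hold a number."""
    node = root.find("content/wavelength")
    if node is None or node.text is None:
        return ["content: wavelength missing"]
    try:
        float(node.text)
    except ValueError:
        return ["content: wavelength is not a number"]
    return []


def run_quality_control(document, plugins=()):
    """Return a list of problems; an empty list means the document passed."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]
    problems = [
        f"missing package: {name}"
        for name in REQUIRED_PACKAGES
        if root.find(name) is None
    ]
    for plugin in plugins:
        problems.extend(plugin(root))
    return problems


GOOD = (
    "<record><administrative/><structural/>"
    "<content><wavelength unit='nm'>450</wavelength></content></record>"
)
print(run_quality_control(GOOD, [wavelength_plugin]))
```
]]></preformat>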
    </sec>
    <sec id="sec-6">
      <title>C. Generic Graphical Interface</title>
      <p>The motivation to develop and provide a generic web
interface for MASi is to significantly lower the time and
effort required to adapt the MASi service to further use cases.
Developers are then freed from the need to re-develop a user
interface for each new use case. This saves time for developers
familiar with the technology. For developers unfamiliar with it,
the generic interface is what makes an adaptation feasible in the first place.</p>
      <p>
        The generic web interface is partly built on the basis of
the Liferay portal framework [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] that provides ready-made
capabilities such as plugins, menus, groups, roles, separable
areas and user authentication management with systems such
as LDAP and Shibboleth. This enables the integration of
federations such as eduGAIN [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] for the easy re-use of
existing institute logins. Liferay is open-source, mature and
widely used. For this to seamlessly work within MASi, we
are currently developing a Liferay plugin to integrate Liferay
with the KIT Data Manager repository framework. The plugin
will ensure consistency between the user management systems
of Liferay and KIT DM by automatically syncing new Liferay
users to KIT DM. Adding users to KIT DM by any other means
will be disabled in this operation mode to ensure that all users
exist in both systems. This integration will enable the KIT DM
to transparently support authentication systems that are already
supported by Liferay such as LDAP and Shibboleth. Also, the
KIT DM admin interface will be integrated with Liferay. We
will provide a detailed installation and configuration guide in
order to further lower the barrier of adoption.
      </p>
      <p>The second main part is the current development of a
generic Liferay graphical interface portlet with common
functionality. Initially, it will include basic upload, search and
download capabilities. To fundamentally increase the impact of
this development, we will create extensive documentation
to enable developers to easily adapt the portlet to their
specific use case requirements. The documentation will cover
everything from code checkout, development project
configuration and adaptation examples to compilation. A main goal of
the documentation is to shorten the training period as much as
possible. All binaries, source code and documentation will be
open source and will become part of the KIT Data Manager
framework. In the course of MASi, the generic GUI portlet
will be continuously extended with increasingly advanced
generic capabilities. Consequently, the documentation will be
appropriately extended in order to enable quick community
adaptations.</p>
      <sec id="sec-6-1">
        <title>IV. INITIAL USE CASES</title>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>A. Historical Maps</title>
      <p>
        Historical topographic and cadastral maps are a valuable
and often the only source for reconstructing land use changes
over long periods of time. To access this information for large
scale spatial analyses and change detection, advanced image
analysis and pattern recognition algorithms have to be applied
to the scanned map documents. The retrieved information can
hence be used to “historize” existing land use and land cover
databases [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
The automatic information acquisition from historical maps
generates and necessitates a variety of metadata. The process
comprises three major components: firstly, the scanning of
the paper maps (which are only partially available as digital
images); secondly, the georeferencing of the scanned maps
(which is only provided for a minority of the digitally available
maps); and thirdly, the information extraction from the
georeferenced digital map images. Each of the three components
generates at least four obligatory metadata entries. Figure 2
shows the workflow of the information acquisition process as
well as the essential metadata that are generated during the
process. These metadata are essential both for the change
detection process and for the correct interpretation of the
retrieved information by third parties.
      </p>
    </sec>
    <sec id="sec-8">
      <title>B. Spectroscopy in Chemistry</title>
      <p>
        In bioinorganic chemistry, a multitude of spectroscopic
information can be obtained by experimental methods such
as UV/Vis, IR, Raman, EPR and XAS spectroscopy. In most
cases, these data are complemented by theoretical simulations
which help to interpret the experimental data and obtain
scientific insights. In a concrete case, we investigate metal
complexes and their redox behavior with oxidants
(electron-taking reagents) and reductants (electron-delivering reagents)
by UV/Vis spectroscopic measurements. Here, for instance, a
copper(I) complex is treated with a cobalt(III) complex, yielding
copper(II) and cobalt(II) complexes under exchange of an
electron. The copper(I) spectroscopic features decay and those
of copper(II) form. The progress of this reaction is recorded
every 1.5 ms for several seconds, producing a large amount of
raw data. This raw data is reduced by the researcher, e.g. by
choosing a suitable wavelength, and absorption time traces are
generated, currently by hand. From these time traces,
the kinetic decay constants are determined. This analysis is
performed for several ratios between oxidant and reductant to
resolve the second-order kinetics of the electron transfer. The
final data shall be stored together with the theoretical analyses
of the electron transfer by density functional theory. Metadata
annotation is important in all steps but is still an open issue: the
original raw data need annotation of who measured which
chemical system at which ratio, temperature, setup etc. This
information is traditionally documented manually in laboratory
notebooks which are stored in the working group. The reduced
data is then stored electronically. Here, metadata can comprise
all metadata of the raw data but additional information on
data reduction steps must be added to the metadata. The
theoretical data imply different metadata: the version of the
code, functional, basis set, dispersion and solvent modelling
as well as grid size should be noted in the corresponding
workflow [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Figure 3 summarizes these different levels of
data production for this example.
      </p>
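      <p>The data reduction from time traces to rate constants can be illustrated numerically. The sketch below rests on assumptions that are not stated in the text: one reagent is in large excess (pseudo-first-order conditions), so each absorption trace decays as A(t) = A0 exp(-k_obs t) towards a zero baseline, and k_obs = k2 c grows linearly with the concentration c of the excess reagent. The traces are synthetic and noise-free, and all numerical values apart from the 1.5 ms sampling interval are made up.</p>
      <preformat><![CDATA[
```python
import math


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


def observed_rate(times, absorbances):
    """k_obs from the linearised decay ln A(t) = ln A0 - k_obs * t."""
    return -least_squares_slope(times, [math.log(a) for a in absorbances])


DT = 1.5e-3                          # sampling interval in s (1.5 ms, as in the text)
TRUE_K2 = 2.0e3                      # L mol^-1 s^-1, made-up second-order constant
concentrations = [1e-3, 2e-3, 4e-3]  # mol/L of the excess reagent, made up

times = [i * DT for i in range(400)]
k_obs_values = []
for c in concentrations:
    # Synthetic, noise-free absorption time trace at one fixed wavelength.
    trace = [0.8 * math.exp(-TRUE_K2 * c * t) for t in times]
    k_obs_values.append(observed_rate(times, trace))

# The slope of k_obs against concentration is the second-order rate constant.
k2 = least_squares_slope(concentrations, k_obs_values)
print(f"k2 = {k2:.1f} L mol^-1 s^-1")
```
]]></preformat>
      <p>Each intermediate quantity in this chain (chosen wavelength, fitted k_obs, concentrations used) is exactly the kind of reduction step that the metadata of the reduced data needs to record.</p>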
    </sec>
    <sec id="sec-9">
      <title>C. Church Windows</title>
      <p>
        The “Corpus Vitrearum Deutschland” [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] is part of the
international “Corpus Vitrearum Medii Aevi” (CVMA).
      </p>
      <p>The main focus of this long-term research project, funded by the
Academy of Sciences and Literature Mainz and the
Berlin-Brandenburg Academy of Sciences and Humanities, lies in
the analysis of medieval stained glass preserved in church
windows, museums, galleries and other places all over Europe,
the US and Canada (see Figure 4 for an example).</p>
      <p>Due to its fragile nature, medieval stained glass is greatly
affected by environmental impacts. In a first step during
the research, all windowpanes are photographed and then
documented in schematic drawings. With this documentation
as a basis, the history of each window’s glazing and any
changes or restoration activities that might have been carried
out throughout the centuries are studied. Finally, the
iconography and the religious context of each window within its
ecclesiastical space are interpreted.</p>
      <p>The CVMA curates a digital image archive of the
photographs taken. For each image, an extensive set of metadata is
provided. The records are modeled according to the guidelines
of the internationally acknowledged XMP metadata standard.
All XMP information is directly embedded in the TIFF files.
Due to this approach, an accidental separation between the file
and its metadata becomes highly unlikely. In addition to XMP,
the ICONCLASS vocabulary is used to describe and classify
the contents of each image.</p>
      <p>The MASi service will open up the CVMA image archive
to further interested parties, e.g. providers of cultural heritage
photography such as the Prometheus, Foto-Marburg or the
Europeana online platforms. During the implementation of the
service, an OAI-PMH interface will be created. Also, a
proof-of-concept for the automatic matching of metadata records
with other cultural heritage data repositories will be drafted.</p>
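      <p>For harvesters, an OAI-PMH interface is addressed through plain HTTP requests with a verb parameter. The following sketch only builds such request URLs using the standard verbs ListRecords and GetRecord and the mandatory oai_dc metadata format; the base URL and the record identifier are hypothetical, since the CVMA endpoint does not exist yet.</p>
      <preformat><![CDATA[
```python
from urllib.parse import urlencode

BASE_URL = "https://repository.example.org/oai"  # hypothetical endpoint


def list_records_url(metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build a ListRecords request, optionally restricted by date and set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    if set_spec:
        params["set"] = set_spec
    return f"{BASE_URL}?{urlencode(params)}"


def get_record_url(identifier, metadata_prefix="oai_dc"):
    """Build a GetRecord request for a single record."""
    params = {
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    }
    return f"{BASE_URL}?{urlencode(params)}"


print(list_records_url(from_date="2016-01-01"))
print(get_record_url("oai:example.org:cvma-0001"))  # made-up identifier
```
]]></preformat>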
      <p>Finally, a configurable web interface will be built that will
allow the generic embedding of these automatically enriched
metadata records into the CVMA image files.</p>
      <sec id="sec-9-1">
        <title>V. CONCLUSION AND OUTLOOK</title>
        <p>The MASi research data management service provides a
solution for the highly relevant challenge of managing large
amounts of complex data. It builds on substantial previous
work that is further extended and broadened. The MASi
service, which is currently being built up, is designed to integrate
seamlessly with highly diverse use cases. In this capacity it plays
an essential role in fulfilling their complex requirements.
Further use cases are currently being planned.</p>
        <p>
          As future work, we are evaluating the integration of MASi
both with the UNICORE HPC middleware [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] and science
gateways such as MoSGrid [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]. We are also continuing to
work within the RDA and contribute our own expertise in
discussions to create RDA recommendations on how to best
handle various aspects of research data management. We aim
at implementing the resulting joint recommendations within
MASi. This will contribute to the creation of MASi as a
service that is efficient, future-proof and has a high user
acceptance.
        </p>
      </sec>
      <sec id="sec-9-2">
        <title>ACKNOWLEDGMENT</title>
        <p>The authors would like to thank the DFG (German
Research Foundation) for the opportunity to do research in the
MASi project (NA711/9-1).</p>
        <p>Furthermore, financial support
by the BMBF (German Federal Ministry of Education and
Research) for the competence center for Big Data ScaDS
Dresden/Leipzig is gratefully acknowledged. The research
leading up to these results has been supported by the LSDMA
project of the Helmholtz Association of German Research
Centres.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] MASi, “Metadata Management for Applied Sciences,”
          <year>2016</year>
          . [Online]. Available: http://www.scientific-metadata.de/
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Rajasekar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Moore</surname>
          </string-name>
          , C.-y. Hou,
          <string-name>
            <given-names>C. A.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Marciano</surname>
          </string-name>
          , A. de Torcy,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Schroeder</surname>
          </string-name>
          , S.-
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Gilbert</surname>
          </string-name>
          et al.,
          <source>“iRODS primer: Integrated Rule-Oriented Data System,” Synthesis Lectures on Information Concepts</source>
          ,
          <source>Retrieval, and Services</source>
          , vol.
          <volume>2</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>143</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R.</given-names>
            <surname>Grunzke</surname>
          </string-name>
          , J. Krüger, S. Gesing,
          <string-name>
            <given-names>S.</given-names>
            <surname>Herres-Pawlis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hoffmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Aguilera</surname>
          </string-name>
          , and W. E. Nagel, “
          <article-title>Managing complexity in distributed data life cycles enhancing scientific discovery,”</article-title>
          <source>in IEEE 11th International Conference on e-Science</source>
          ,
          <year>August 2015</year>
          , pp.
          <fpage>371</fpage>
          -
          <lpage>380</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>T.</given-names>
            <surname>Jejkal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vondrous</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kopmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Stotzka</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Hartmann</surname>
          </string-name>
          , “
          <article-title>KIT Data Manager: The Repository Architecture Enabling Cross-Disciplinary Research,”</article-title>
          in
          <source>Large-Scale Data Management and Analysis - Big Data in Science - 1st Edition</source>
          ,
          <year>2014</year>
          . [Online]. Available: http://digbib.ubka.uni-karlsruhe.de/volltexte/1000043270
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Flannery</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Matthews</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Griffin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bicarregui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gleaves</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lerusse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Downing</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ashton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sufi</surname>
          </string-name>
          , G. Drinkwater et al., “
          <article-title>ICAT: Integrating Data Infrastructure for Facilities Based Science,”</article-title>
          in
          <source>e-Science, 2009. e-Science '09. Fifth IEEE International Conference on</source>
          . IEEE,
          <year>2009</year>
          , pp.
          <fpage>201</fpage>
          -
          <lpage>207</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Grunzke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hesser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Starek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kepper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gesing</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hardt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Hartmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kindermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Potthoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hausmann</surname>
          </string-name>
          , R. Müller-Pfefferkorn, and R. Jäkel, “
          <article-title>Device-driven Metadata Management Solutions for Scientific Big Data Use Cases,”</article-title>
          in
          <source>22nd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP 2014)</source>
          ,
          <year>February 2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Barton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bass</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Branschofsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>McClellan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Stuve</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Tansley</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Walker</surname>
          </string-name>
          , “
          <article-title>DSpace: An Open Source Dynamic Digital Repository</article-title>
          ,”
          <year>2003</year>
          . [Online]. Available: hdl.handle.net/1721.1/29465
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8] FedoraCommons, “Fedora Commons Repository Software,”
          <year>2016</year>
          . [Online]. Available: http://fedora-commons.org/
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D.</given-names>
            <surname>Lecarpentier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Wittenburg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Elbers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Michelini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kanso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Coveney</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Baxter</surname>
          </string-name>
          , “
          <article-title>EUDAT: A New Cross-Disciplinary Data Infrastructure for Science,”</article-title>
          <source>International Journal of Digital Curation</source>
          , vol.
          <volume>8</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>279</fpage>
          -
          <lpage>287</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10] EUDAT, “Eudat semantics working group,”
          <year>2015</year>
          . [Online]. Available: http://eudat.eu/semantics
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J. P.</given-names>
            <surname>McDonough</surname>
          </string-name>
          , “
          <article-title>METS: Standardized Encoding for Digital Library Objects</article-title>
          ,”
          <source>International journal on digital libraries</source>
          , vol.
          <volume>6</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>148</fpage>
          -
          <lpage>158</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>L.</given-names>
            <surname>Moreau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Freire</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Futrelle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. E.</given-names>
            <surname>McGrath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Myers</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Paulson</surname>
          </string-name>
          , “
          <article-title>The Open Provenance Model: An Overview,”</article-title>
          in
          <source>Provenance and Annotation of Data and Processes</source>
          . Springer,
          <year>2008</year>
          , pp.
          <fpage>323</fpage>
          -
          <lpage>326</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>V.</given-names>
            <surname>Cuevas-Vicenttín</surname>
          </string-name>
          , B. Ludäscher, P. Missier,
          <string-name>
            <given-names>K.</given-names>
            <surname>Belhajjame</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Chirigati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kianmajd</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Koop</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bowers</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Altintas</surname>
          </string-name>
          , “
          <article-title>ProvONE: A PROV Extension Data Model for Scientific Workflow Provenance</article-title>
          ,” in DataONE Provenance Working Group,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Liferay</surname>
          </string-name>
          , “Enterprise Open Source Portal and Collaboration Software,”
          <year>2016</year>
          . [Online]. Available: http://www.liferay.com/
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Geant</surname>
          </string-name>
          , “eduGAIN - Interconnecting Federations to Link Services and Users Worldwide,”
          <year>2015</year>
          . [Online]. Available: http://www.geant.net/service/eduGAIN/Pages/home.aspx
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>H.</given-names>
            <surname>Herold</surname>
          </string-name>
          , G. Meinel,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hecht</surname>
          </string-name>
          , and E. Csaplovics, “
          <article-title>A GEOBIA Approach to Map Interpretation-Multitemporal Building Footprint Retrieval for High Resolution Monitoring of Spatial Urban Dynamics</article-title>
          ,” in
          <source>International Conference on Geographic Object-Based Image Analysis</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>252</fpage>
          -
          <lpage>256</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S.</given-names>
            <surname>Herres-Pawlis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hoffmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rosener</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Krüger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Grunzke</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Gesing</surname>
          </string-name>
          , “
          <article-title>Multi-layer Meta-metaworkflows for the Evaluation of Solvent and Dispersion Effects in Transition Metal Systems Using the MoSGrid Science Gateways,”</article-title>
          in
          <source>Science Gateways (IWSG), 2015 7th International Workshop on</source>
          ,
          <year>June 2015</year>
          , pp.
          <fpage>47</fpage>
          -
          <lpage>52</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18] “Corpus Vitrearum Deutschland,” http://www.corpusvitrearum.de/,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>K.</given-names>
            <surname>Benedyczak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Schuller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Petrova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rybicki</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Grunzke</surname>
          </string-name>
          , “
          <article-title>UNICORE 7 - Middleware Services for Distributed and Federated Computing,”</article-title>
          in
          <source>International Conference on High Performance Computing &amp; Simulation (HPCS)</source>
          ,
          <year>2016</year>
          , accepted.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>J.</given-names>
            <surname>Krüger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Grunzke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gesing</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Breuers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Brinkmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>de la Garza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Kohlbacher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kruse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. E.</given-names>
            <surname>Nagel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Packschies</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Müller-Pfefferkorn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Schäfer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Schärfe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Steinke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Schlemmer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. D.</given-names>
            <surname>Warzecha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zink</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Herres-Pawlis</surname>
          </string-name>
          , “
          <article-title>The MoSGrid Science Gateway - A Complete Solution for Molecular Simulations</article-title>
          ,”
          <source>Journal of Chemical Theory and Computation</source>
          , vol.
          <volume>10</volume>
          (
          <issue>6</issue>
          ), pp.
          <fpage>2232</fpage>
          -
          <lpage>2245</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>