8th International Workshop on Science Gateways (IWSG 2016), 8-10 June 2016


         Towards a Metadata-driven Multi-community
            Research Data Management Service
Richard Grunzke*, Wolfgang E. Nagel
Center for Information Services and High Performance Computing, Technische Universität Dresden, Dresden, Germany
richard.grunzke@tu-dresden.de

Volker Hartmann, Thomas Jejkal, Ajinkya Prabhune, Rainer Stotzka
Institute for Data Processing and Electronics, Karlsruhe Institute of Technology, Karlsruhe, Germany

Alexander Hoffmann, Sonja Herres-Pawlis
Institut für Anorganische Chemie, Rheinisch-Westfälische Technische Hochschule Aachen, Aachen, Germany

Aline Deicke, Torsten Schrade
Digitale Akademie, Akademie der Wissenschaften und Literatur Mainz, Mainz, Germany

Hendrik Herold, Gotthard Meinel
Monitoring of Settlement and Open Space Development, Institute of Ecological and Regional Development, Dresden, Germany


Abstract: Nowadays, the daily work of many research communities is characterized by an increasing amount and complexity of data. This makes it increasingly difficult to manage, access, and utilize the data to ultimately gain scientific insights from it. At the same time, domain scientists want to focus on their science instead of IT. The solution is research data management, which stores data in a structured way to enable easy discovery for future reference. An integral part is the use of metadata. With it, data becomes accessible by its content instead of only its name and location. The use of metadata shall be as automatic and seamless as possible in order to foster high usability.

Here we present the architecture and initial steps of the MASi project, which aims to build a comprehensive research data management service. First, it extends the existing KIT Data Manager framework by a generic programming interface and by a generic graphical web interface. Advanced additional features include the integration of provenance metadata and persistent identifiers. The MASi service aims at being easily adaptable to arbitrary communities with limited effort. The requirements for the initial use cases within geography, chemistry and digital humanities are elucidated. The MASi research data management service is currently being built up to satisfy these complex and varying requirements in an efficient way.

Keywords: Metadata, Communities, Research Data Management

I. INTRODUCTION

Today’s research landscape is characterized by steadily increasing amounts of data, caused by improved data recording, increasingly complex simulations, and the correlation of numerous, often heterogeneous data sources. This growth of the data basis promises more scientific insights. As the amount and complexity of data increase, so do the requirements on the data structure. A suitable and specific data description becomes paramount. Present data processing methods are often reaching their capacity limits. Novel management methods for newer and more complex data become essential. Especially important are improved data descriptions, sustainable storage, findability, pre-processing for further use, and the exploitation of existing data.

An established method to describe complex data structures is the use of metadata. It encapsulates the substance of a data set in aggregated form. Metadata (“data about data”) plays a central role in making data available for the long term. It is essential for the comprehension and storage of data, its preservation, curation, and discovery for future reuse. Metadata makes complex and costly tasks, such as searching for data, easier to carry out. Aside from better discovery, other data management aspects, such as managing and utilizing similarities between data sets, are fostered.

Very different metadata standards exist across scientific communities, each incorporating community-specific data characteristics. This limited portability to new use cases causes established methods in one scientific field to be of limited use in other fields. Also, the number of standardized tools to extract metadata from heterogeneous data is limited.

In the MASi (Metadata Management for Applied Sciences) project [1] of the DFG (German Research Foundation) we are developing a generic data management service for scientific data. We are demonstrating its applicability along heterogeneous scientific use cases. The kind and extent of the data of the participating communities is largely domain specific. Likewise, the use of metadata is not uniform across them, so that a suitable overarching research data management service is a fundamental requirement.

II. BACKGROUND

The MASi research data management service is being built using the KIT Data Manager repository framework (see Section II-A). Utilizing and extending the KIT DM enables MASi to offer elaborate metadata functionality with a large degree of automation and flexibility. Such metadata management capabilities are a higher-level abstraction based on basic storage
devices and data management systems (e.g. iRODS [2]) in the data life cycle hierarchy [3]. Other systems, and how they differ from the KIT DM, are described in Section II-B.

A. KIT Data Manager - A Repository Framework

The KIT Data Manager [4] is a generic, highly customizable open source software framework for building research data repository systems. Horizontally, it is organized into a number of well-defined high-level services providing functionalities for data and metadata management and sharing as well as administrative services for user and group management. Due to the focus on research data, KIT Data Manager also provides features beyond those of typical repository systems, namely a flexible data transfer service that can support virtually any data transfer protocol and a data workflow service that allows the automatic execution of data processing tasks to be triggered locally or remotely. These tasks are configured in the repository system and include data transfer to the processing environment, data ingest of the processing results, and provenance tracking. High-level services can be accessed either via Java APIs, e.g. to implement Web-based user interfaces or to extend the basic framework by additional functionalities, or via RESTful service interfaces, e.g. to access KIT Data Manager based repository systems remotely using a programming language of choice.

Vertically, KIT Data Manager is organized into different layers, where the upper layer is formed by the high-level services described before. For repository systems based on KIT Data Manager this upper layer provides reliable and well-defined extension points on the one hand and a high degree of abstraction from underlying technologies on the other hand. The lowest layer of the architecture interfaces with these technologies by defining a basic set of functionalities that a corresponding technology has to provide. For example, the interface that has to be implemented to integrate a data storage technology covers storing and restoring a predefined hierarchical data structure. This offers a high degree of sustainability, as changing technologies only affects the lowest layer whereas the upper layers remain unaffected.

Currently, KIT Data Manager is used to implement repository systems for various scientific disciplines, namely biology, arts and humanities, and nano-science. Due to its extensibility, the base framework can be tailored to fulfil the specific needs of each of these disciplines with reasonable effort.

B. Other Systems and Delimitation

The ICAT system [5] aims at supporting data management for photon science facilities [6]. This includes supporting beamline proposals, access rights, experiments, studies and instruments that produce the actual data. This data is collected as datasets which can then be published. Attaching metadata such as experiment parameters, instrument parameters, and sample descriptions is supported. This closely follows the physics requirements but at the same time makes the system hard to adapt to other use cases. Technically, ICAT relies on a Java EE application server, with Glassfish being the standard, and supports the Oracle and MySQL databases. It offers a web service interface to support, for example, the ICAT download manager TopCat. Authentication and authorization are supported via LDAP, a local database, or anonymous access, and a plugin interface is available for extensions.

DSpace [7] is a mature and ready-to-use solution for institutional repositories and is free and open source software. It is adaptable to fit the needs of individual institutions and fosters open access to all kinds of content. It supports submission workflows and various ingest and export methods. Various file types, persistent IDs, and the PostgreSQL and Oracle databases are supported. Search capabilities via metadata (descriptive, administrative, structural) are provided that foster the long-term preservation and accessibility of data.

Fedora (Flexible Extensible Digital Object Repository Architecture) [8] provides a framework with individual basic components to build repositories. It is open source and aims to be robust and modular. The main use case is to provide specialized services that may be integrated with existing environments and technologies. A main goal is to foster digital content preservation for complex and large datasets. Metadata for data organization is supported, as are descriptions of relationships between datasets and links among them.

EUDAT [9] is a European project aiming to create a generically applicable infrastructure to manage, access, and preserve research data. The EUDAT services B2SHARE and B2FIND involve metadata. B2SHARE is for storing and sharing research data via a web portal, which is also the central means to upload data. Uploads have to be done via the web portal and metadata has to be entered manually; community-specific profiles can be defined. B2FIND enables users to access data sets via their metadata and to annotate them with comments. A B2NOTE service is planned which aims at enabling the automatic annotation of metadata [10].

iRODS, as a distributed data management system, is not focused on metadata management, although it offers some basic metadata functionality. Metadata can be attached to data as attribute-value-unit triples on a per-file basis, which can be used for searching. Integrated capabilities for metadata extraction, annotation, and provenance support are missing.

In contrast to these systems, the KIT Data Manager is more flexible. It can be specifically adapted in depth to arbitrary target communities with a close integration into community workflows. It enables far-reaching automation for high usage efficiency, with ready-made capabilities to be adapted to specific communities.

III. MASi RESEARCH DATA MANAGEMENT SERVICE

A. Overarching Goals

The MASi project is building a generic and sustainable repository. It will be sustainably operated for the involved communities to fully handle their data management requirements by utilizing metadata. One part of the project is the development of a generic model as a concrete best practice




implementation guide. It will make it easy to satisfy specific community data management requirements by using metadata. Following this guide, further communities will be supported in building up their own MASi instances. The service is based on the KIT Data Manager repository framework (see Section II-A), which we are currently extending. On the one hand, this includes creating a generic API to support further metadata models. On the other hand, we are implementing and will provide generic graphical interfaces to fundamentally lower the effort to adapt MASi to new use cases. Furthermore, we are closely collaborating within the Research Data Alliance (RDA) with other data researchers to develop recommendations and aim at implementing these within MASi. An example of such an RDA recommendation is the support for PIDs in conjunction with PID information types and a data type registry.

Figure 1. The MASi architecture for generic research data management.

B. Generic Metadata Interface

The metadata groups within RDA compiled a set of principles regarding metadata. The well-known definition of metadata as “data about data” is the basis. Metadata differs in the mode of use and should be easily machine-understandable. It may cover the whole lifecycle of the data, from the idea of a project through data acquisition to the publication that references the data. MASi will be a single point of access for all such kinds of metadata. The metadata is linked to the data via a unique identifier. The identifier may be a custom one as long as the data is only managed internally. As soon as the metadata is publicly available, a persistent identifier (PID) such as a DOI (Digital Object Identifier) is used to make the data referenceable. Each PID contains at least two attributes, holding a URL to the metadata and a URL to the data. The metadata is available as a METS (Metadata Encoding and Transmission Standard) [11] document, which is structured in XML. In MASi it is used as the standard format for all interfaces. In the final stage there will be a registered MASi profile of METS which is valid in all configurations. METS defines seven sections for different purposes. The metadata handled by MASi itself is split into several packages (see Figure 1). Some of the packages are very similar to the sections used in METS. Others are allocated in a way that is most suitable for MASi. Each package is responsible for a special purpose (administrative, structural, content, bit preservation, provenance and annotation metadata). Since not all packages are needed by every community, the structure of the METS document may differ slightly. In the future, new packages can also be added without conflict.

To store these different kinds of formats in an efficient way, MASi offers a generic storage API to various kinds of underlying data management systems. To keep the maintenance effort manageable, we will focus on widely used standards. MASi offers a REST interface (see Figure 1) which allows for high extensibility. This interface supports the CRUD (create, read, update, delete) operations for each package or for whole metadata sets. In the case of published open access data, anyone can perform read operations on metadata without authentication. For all other operations the user has to be authenticated and authorized.

The following serves as an example of the provenance package functionality: There are many workflow engines, each implementing its own format. But there are two standards which are supported by the majority, the Open Provenance Model (OPM) [12] and ProvOne [13]. It is possible to transform OPM metadata to the ProvOne format without loss. ProvOne describes the provenance as a graph represented as




XML. To allow for sophisticated queries, the graph is stored in a graph database. It is therefore possible to query, e.g., for similar workflows, and even more complex queries are possible. METS has a pre-defined section for provenance metadata. It uses digiProvMD, which can be losslessly transformed to ProvOne and vice versa.

For each package there can be a specialized database storing the metadata. MASi will collect all metadata and compile it into a METS document using the MASi profile or, in the case of a data ingest, split the metadata into its packages and store them accordingly. On the client side there will be MASi tools that support communities in compiling a valid METS document matching the MASi profile. Subsequently, on the server side a basic quality control is triggered during metadata ingest. It is based on the respective XML schema, which is available for all packages except content metadata, as each community has its own specific content metadata. Such a schema needs to be created for each community. Registered schemata can be used for the quality control. If this control is required to be more sophisticated, a Java plugin can be implemented and easily activated in order to support any kind of quality control capability.

C. Generic Graphical Interface

The motivation to develop and provide a generic web interface for MASi is to significantly lower the time and effort required to adapt the MASi service to further use cases. Developers are then freed from the need to re-develop a user interface for each new use case. This saves time for developers familiar with the technology and, more importantly, is fundamentally enabling for developers unfamiliar with it.

The generic web interface is partly built on the basis of the Liferay portal framework [14], which provides ready-made capabilities such as plugins, menus, groups, roles, separable areas and user authentication management with systems such as LDAP and Shibboleth. This enables the integration of federations such as eduGAIN [15] for the easy re-use of existing institute logins. Liferay is open source, mature and widely used. For this to work seamlessly within MASi, we are currently developing a Liferay plugin to integrate Liferay with the KIT Data Manager repository framework. The plugin will ensure consistency between the user management systems of Liferay and KIT DM by automatically syncing new Liferay users to KIT DM. Adding users to KIT DM in any other way will be disabled in this operation mode to ensure that all users exist in both systems. This integration will enable the KIT DM to transparently support authentication systems that are already supported by Liferay, such as LDAP and Shibboleth. Also, the KIT DM admin interface will be integrated with Liferay. We will provide a detailed installation and configuration guide in order to further lower the barrier to adoption.

The second main part is the current development of a generic Liferay graphical interface portlet with common functionality. Initially, it will include basic upload, search and download capabilities. To fundamentally increase the impact of this development, we will create extensive documentation to enable developers to easily adapt the portlet to their specific use case requirements. The documentation will cover everything from code checkout and development project configuration to adaptation examples and compilation. A main goal of the documentation is to shorten the training period as much as possible. All binaries, source code and documentation will be open source and will become part of the KIT Data Manager framework. In the course of MASi, the generic GUI portlet will be continuously extended with increasingly advanced generic capabilities. Consequently, the documentation will be extended accordingly in order to enable quick community adaptations.

IV. INITIAL USE CASES

A. Historical Maps

Historical topographic and cadastral maps are a valuable and often the only source for reconstructing land use changes over long periods of time. To access this information for large-scale spatial analyses and change detection, advanced image analysis and pattern recognition algorithms have to be applied to the scanned map documents. The retrieved information can then be used to “historize” existing land use and land cover databases [16].

Figure 2. Workflow and metadata for historical maps.




   The automatic information acquisition from historical maps
generates and necessitates a variety of metadata. The process
comprises three major components: firstly, the scanning of
the paper maps (which are only partially available as digital
images); secondly, the georeferencing of the scanned maps
(which is only provided for a minority of the digitally available
maps); and thirdly, the information extraction from the geo-
referenced digital map images. Each of the three components
generates at least four obligatory metadata entries. Figure 2
shows the workflow of the information acquisition process as
well as the essential metadata that are generated during the
process. These metadata are essential both for the change
detection process and for the correct interpretation of the
retrieved information by third parties.
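
The rule that each of the three workflow components contributes at least four obligatory metadata entries lends itself to a mechanical completeness check at ingest time. The following is a minimal sketch under stated assumptions, not part of MASi: the step identifiers, the `MapRecord` class, and all entry names in the usage example are hypothetical placeholders, because the actual obligatory entries are defined in Figure 2 and are not reproduced in the text.

```python
from dataclasses import dataclass, field

# Taken from the text: three components, at least four obligatory entries each.
MIN_ENTRIES_PER_STEP = 4
# Hypothetical identifiers for the three components named in the text.
STEPS = ("scanning", "georeferencing", "information_extraction")


@dataclass
class MapRecord:
    """Metadata collected for one historical map sheet (illustrative only)."""
    map_id: str
    # Maps a step name to its metadata entries: {entry name: value}.
    metadata: dict = field(default_factory=dict)

    def incomplete_steps(self):
        """Return the steps that have fewer than the required filled entries."""
        incomplete = []
        for step in STEPS:
            entries = self.metadata.get(step, {})
            filled = [name for name, value in entries.items()
                      if value not in (None, "")]
            if len(filled) < MIN_ENTRIES_PER_STEP:
                incomplete.append(step)
        return incomplete


# Usage with placeholder entry names (assumptions, not the Figure 2 entries).
record = MapRecord(
    map_id="sheet-0001",
    metadata={
        "scanning": {"entry_a": "x", "entry_b": "x", "entry_c": "x", "entry_d": "x"},
        "georeferencing": {"entry_a": "x", "entry_b": ""},
    },
)
print(record.incomplete_steps())
```

A record would only be accepted for change detection once `incomplete_steps()` returns an empty list; in this example the georeferencing and information extraction steps are still incomplete.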

B. Spectroscopy in Chemistry

In bioinorganic chemistry, a multitude of spectroscopic information can be obtained by experimental methods such as UV/Vis, IR, Raman, EPR and XAS spectroscopy. In most cases, these data are complemented by theoretical simulations which help to interpret the experimental data and obtain scientific insights. In a concrete case, we investigate metal complexes and their redox behavior with oxidants (electron-taking reagents) and reductants (electron-delivering reagents) by UV/Vis spectroscopic measurements. Here, for instance, a copper(I) complex is treated with a cobalt(III) complex, yielding copper(II) and cobalt(II) complexes under exchange of an electron. The copper(I) spectroscopic features decay and those of copper(II) form. The progress of this reaction is monitored every 1.5 ms for some seconds, producing a large amount of raw data. This raw data is reduced by the researcher, e.g. by choosing a suitable wavelength, and absorption time traces are generated, at the moment manually. From these time traces, the kinetic decay constants are determined. This analysis is performed for several ratios between oxidant and reductant to resolve the second-order kinetics of the electron transfer. This final data shall be stored together with the theoretical analyses of the electron transfer by density functional theory. Metadata annotation is important in all steps but is still an open issue: the original raw data need annotation of who measured which chemical system at which ratio, temperature, setup etc. This information is traditionally documented manually in laboratory notebooks which are stored in the working group. The reduced data is then stored electronically. Here, the metadata can comprise all metadata of the raw data, but additional information on the data reduction steps must be added. The theoretical data imply different metadata: the version of the code, functional, basis set, dispersion and solvent modelling as well as grid size should be noted in the corresponding workflow [17]. Figure 3 summarizes these different levels of data production for this example.

Figure 3. Metadata for the electron transfer (subgroup of the spectroscopic use case).

C. Church Windows

The “Corpus Vitrearum Deutschland” [18] is part of the international “Corpus Vitrearum Medii Aevi” (CVMA). The main focus of this long-term research project, funded by the Academy of Sciences and Literature Mainz and the Berlin-Brandenburg Academy of Sciences and Humanities, lies in the analysis of medieval stained glass preserved in church windows, museums, galleries and other places all over Europe, the US and Canada (see Figure 4 for an example).

Due to its fragile nature, medieval stained glass is greatly affected by environmental impacts. In a first step during the research, all windowpanes are photographed and then documented in schematic drawings. With this documentation as a basis, the history of each window’s glazing and any changes or restoration activities that might have been carried out throughout the centuries are studied. Finally, the iconography and the religious context of each window within its ecclesiastical space are interpreted.

The CVMA curates a digital image archive of the photographs taken. For each image, an extensive set of metadata is provided. The records are modeled according to the guidelines of the internationally acknowledged XMP metadata standard. All XMP information is directly embedded in the TIFF files. Due to this approach, an accidental separation between the file and its metadata becomes highly unlikely. In addition to XMP, the ICONCLASS vocabulary is used to describe and classify the contents of each image.

The MASi service will open up the CVMA image archive to further interested parties, e.g. providers of cultural heritage photography such as the Prometheus, Foto-Marburg or the Europeana online platforms. During the implementation of the service, an OAI-PMH interface will be created. Also, a proof-of-concept for the automatic matching of metadata records with other cultural heritage data repositories will be drafted.
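
To illustrate how a harvesting party might address the planned OAI-PMH interface, the sketch below builds `ListRecords` request URLs as defined by the OAI-PMH protocol. It is a hedged example only: the base URL and the set name `cvma` are hypothetical, since the text does not specify the endpoint or set structure that MASi will expose. The `oai_dc` metadata prefix is the Dublin Core format that every OAI-PMH repository has to support.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     set_spec=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    base_url and set_spec are placeholders chosen by the caller; nothing
    here reflects an actual MASi endpoint.
    """
    if resumption_token is not None:
        # resumptionToken is an exclusive argument in OAI-PMH: when it is
        # given, no other argument besides the verb may be sent.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec is not None:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and set name, for illustration only.
BASE_URL = "https://repository.example.org/oai"
first_page = list_records_url(BASE_URL, set_spec="cvma")
next_page = list_records_url(BASE_URL, resumption_token="abc")
print(first_page)
print(next_page)
```

A harvester would request the first URL, read the `resumptionToken` element from the response if one is present, and keep requesting follow-up pages until the repository no longer returns a token.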




Finally, a configurable web interface will be built that will
allow the generic embedding of these automatically enriched
metadata records into the CVMA image files.

Figure 4. Mauritius and his companions refuse the idolothyte. Act of the
saints window in the Marktkirche Hannover.

                V. CONCLUSION AND OUTLOOK

   The MASi research data management service provides a
solution for the highly relevant challenge of managing large
amounts of complex data. It builds on substantial previous
work that is further extended and broadened. The MASi
service, which is currently being built, is designed to integrate
seamlessly with highly diverse use cases. In this capacity it plays
an essential role in fulfilling their complex requirements, and
further use cases are currently being planned.
   As future work, we are evaluating the integration of MASi
both with the UNICORE HPC middleware [19] and with science
gateways such as MoSGrid [20]. We are also continuing to
work within the RDA and to contribute our expertise to
discussions that aim at RDA recommendations on how best to
handle various aspects of research data management. We aim
at implementing the resulting joint recommendations within
MASi. This will contribute to making MASi a service that is
efficient, future-proof and widely accepted by its users.

                       ACKNOWLEDGMENT

   The authors would like to thank the DFG (German
Research Foundation) for the opportunity to do research in the
MASi project (NA711/9-1). Furthermore, financial support
by the BMBF (German Federal Ministry of Education and
Research) for the competence center for Big Data ScaDS
Dresden/Leipzig is gratefully acknowledged. The research
leading to these results has been supported by the LSDMA
project of the Helmholtz Association of German Research
Centres.

                          REFERENCES

 [1] MASi, “Metadata Management for Applied Sciences,” 2016. [Online].
     Available: http://www.scientific-metadata.de/
 [2] A. Rajasekar, R. Moore, C.-y. Hou, C. A. Lee, R. Marciano, A. de Torcy,
     M. Wan, W. Schroeder, S.-Y. Chen, L. Gilbert et al., “iRODS primer:
     Integrated Rule-Oriented Data System,” Synthesis Lectures on Information
     Concepts, Retrieval, and Services, vol. 2, no. 1, pp. 1–143, 2010.
 [3] R. Grunzke, J. Krüger, S. Gesing, S. Herres-Pawlis, A. Hoffmann,
     A. Aguilera, and W. E. Nagel, “Managing complexity in distributed data
     life cycles enhancing scientific discovery,” in IEEE 11th International
     Conference on e-Science, August 2015, pp. 371–380.
 [4] T. Jejkal, A. Vondrous, A. Kopmann, R. Stotzka, and V. Hartmann,
     “KIT Data Manager: The Repository Architecture Enabling
     Cross-Disciplinary Research,” in Large-Scale Data Management and Analysis
     - Big Data in Science - 1st Edition, 2014. [Online]. Available:
     http://digbib.ubka.uni-karlsruhe.de/volltexte/1000043270
 [5] D. Flannery, B. Matthews, T. Griffin, J. Bicarregui, M. Gleaves,
     L. Lerusse, R. Downing, A. Ashton, S. Sufi, G. Drinkwater et al.,
     “ICAT: Integrating Data Infrastructure for Facilities Based Science,” in
     e-Science, 2009. e-Science’09. Fifth IEEE International Conference on.
     IEEE, 2009, pp. 201–207.
 [6] R. Grunzke, J. Hesser, J. Starek, N. Kepper, S. Gesing, M. Hardt,
     V. Hartmann, S. Kindermann, J. Potthoff, M. Hausmann,
     R. Müller-Pfefferkorn, and R. Jäkel, “Device-driven Metadata Management
     Solutions for Scientific Big Data Use Cases,” in 22nd Euromicro
     International Conference on Parallel, Distributed, and Network-Based
     Processing (PDP 2014), February 2014.
 [7] M. Smith, M. Barton, M. Bass, M. Branschofsky, G. McClellan,
     D. Stuve, R. Tansley, and J. H. Walker, “DSpace: An Open
     Source Dynamic Digital Repository,” 2003. [Online]. Available:
     hdl.handle.net/1721.1/29465
 [8] FedoraCommons, “Fedora Commons Repository Software,” 2016.
     [Online]. Available: http://fedora-commons.org/
 [9] D. Lecarpentier, P. Wittenburg, W. Elbers, A. Michelini, R. Kanso,
     P. Coveney, and R. Baxter, “EUDAT: A New Cross-Disciplinary Data
     Infrastructure for Science,” International Journal of Digital Curation,
     vol. 8, no. 1, pp. 279–287, 2013.
[10] EUDAT, “Eudat semantics working group,” 2015. [Online]. Available:
     http://eudat.eu/semantics
[11] J. P. McDonough, “METS: Standardized Encoding for Digital Library
     Objects,” International journal on digital libraries, vol. 6, no. 2, pp.
     148–158, 2006.
[12] L. Moreau, J. Freire, J. Futrelle, R. E. McGrath, J. Myers, and P. Paulson,
     “The Open Provenance Model: An Overview,” in Provenance and
     Annotation of Data and Processes. Springer, 2008, pp. 323–326.
[13] V. Cuevas-Vicenttín, B. Ludäscher, P. Missier, K. Belhajjame,
     F. Chirigati, Y. Wei, S. Dey, P. Kianmajd, D. Koop, S. Bowers, and
     I. Altintas, “ProvONE: A PROV Extension Data Model for Scientific
     Workflow Provenance,” in DataONE Provenance Working Group, 2014.
[14] Liferay, “Enterprise Open Source Portal and Collaboration Software,”
     2016. [Online]. Available: http://www.liferay.com/
[15] Geant, “eduGAIN - Interconnecting Federations to Link
     Services and Users Worldwide,” 2015. [Online]. Available:
     http://www.geant.net/service/eduGAIN/Pages/home.aspx
[16] H. Herold, G. Meinel, R. Hecht, and E. Csaplovics, “A GEOBIA
     Approach to Map Interpretation-Multitemporal Building Footprint
     Retrieval for High Resolution Monitoring of Spatial Urban Dynamics,” in
     International Conference on Geographic Object-Based Image Analysis,
     2012, pp. 252–256.
[17] S. Herres-Pawlis, A. Hoffmann, T. Rosener, J. Krüger, R. Grunzke,
     and S. Gesing, “Multi-layer Meta-metaworkflows for the Evaluation of
     Solvent and Dispersion Effects in Transition Metal Systems Using the
     MoSGrid Science Gateways,” in Science Gateways (IWSG), 2015 7th
     International Workshop on, June 2015, pp. 47–52.
[18] “Corpus Vitrearum Deutschland,” http://www.corpusvitrearum.de/, 2016.




[19] K. Benedyczak, B. Schuller, M. Petrova, J. Rybicki, and R. Grunzke,
     “UNICORE 7 - Middleware Services for Distributed and Federated
     Computing,” in International Conference on High Performance
     Computing & Simulation (HPCS), 2016, accepted.
[20] J. Krüger, R. Grunzke, S. Gesing, S. Breuers, A. Brinkmann, L. de la
     Garza, O. Kohlbacher, M. Kruse, W. E. Nagel, L. Packschies,
     R. Müller-Pfefferkorn, P. Schäfer, C. Schärfe, T. Steinke, T. Schlemmer,
     K. D. Warzecha, A. Zink, and S. Herres-Pawlis, “The MoSGrid Science
     Gateway - A Complete Solution for Molecular Simulations,” Journal
     of Chemical Theory and Computation, vol. 10, no. 6, pp. 2232–2245, 2014.

