<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>IWSG</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>From the Desktop to the Grid and Cloud: Conversion of KNIME Workflows to WS-PGRADE</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Luis de la Garza</string-name>
          <email>delagarza@informatik.uni-tuebingen.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Center for Bioinformatics</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dept. of Computer Science</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Max Planck Institute for Developmental Biology</institution>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Quantitative Biology Center, University of Tübingen</institution>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>University of Tübingen</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>University of Tübingen</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <volume>8</volume>
      <fpage>8</fpage>
      <lpage>10</lpage>
      <abstract>
        <p>Computational analyses for research usually consist of a complicated orchestration of data flows, software libraries, visualization, selection of adequate parameters, etc. Structuring these complex activities into a collaboration of simple, reproducible and well-defined tasks brings down complexity and increases reproducibility. This is the basic notion of workflows. Workflow engines allow users to create and execute workflows, each having unique features. In some cases, certain features offered by platforms are royalty-based, hindering use in the scientific community. We present our efforts to convert whole workflows created in the Konstanz Information Miner Analytics Platform to the Web Services Parallel Grid Runtime and Developer Environment. We see the former as a great workflow editor due to its considerable user base and user-friendly graphical interface. We deem the latter a great backend engine able to interact with most major distributed computing interfaces. We introduce work that provides a platform-independent tool representation, thus assisting in the conversion of whole workflows. We also present the challenges inherent to workflow conversion across systems, as well as the ones posed by the conversion between the chosen workflow engines, along with our proposed solution to overcome these challenges. The combined features of these two platforms (i.e., intuitive workflow design on a desktop computer and execution of workflows on distributed high performance computing interfaces) greatly benefit researchers and minimize time spent on technical chores not directly related to their area of research.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>
        Computers are essential in various scientific fields.
Example domains requiring high-performance computing (HPC)
include vaccine design, astrophysics, or the multidisciplinary
field of bioinformatics. Here, the declining costs of both
data generation and storage in the last few years [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] pushed
bioinformaticians into using HPC resources such as grids and
clouds.
      </p>
      <p>
        Simultaneously, the scope of research is becoming more
refined and complex. As such, upholding the scientific
method becomes more difficult: reproducing
previously observed results while keeping all variables constant
can often be an arduous task. Consequently, journals and
news outlets have repeatedly reported cases of published but
irreproducible results [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>Oliver Kohlbacher</title>
    </sec>
    <sec id="sec-3">
      <title>Center for Bioinformatics</title>
    </sec>
    <sec id="sec-4">
      <title>Dept. of Computer Science</title>
    </sec>
    <sec id="sec-5">
      <title>Faculty of Medicine</title>
    </sec>
    <sec id="sec-6">
      <title>Germany</title>
      <p>Researchers often break down big, complicated analyses
into smaller units of work that are easier to manage. These
so-called tasks perform one specific function and take an input
along with controlling parameters to produce a defined output.
Input usually takes the form of files, whereas output could
also be, for example, a set of visualizations. The combination
of tasks is often referred to as a workflow. Task outputs can
be passed on as inputs to other tasks, defining an order of
execution for each step of the resulting workflow. Adoption
of workflows not only increases reproducibility but also offers
the following benefits:</p>
      <list list-type="bullet">
        <list-item><p>Storage of intermediate results (e.g., for troubleshooting, additional analysis, bottleneck identification)</p></list-item>
        <list-item><p>Simplified substitution of single tasks (e.g., for benchmarking, testing purposes)</p></list-item>
        <list-item><p>Parallel execution of workflow branches (i.e., parameter sweep)</p></list-item>
        <list-item><p>Reusability of components</p></list-item>
        <list-item><p>Independent, parallel development of specialized tasks</p></list-item>
      </list>
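      <p>The way task outputs define an order of execution can be illustrated with a short sketch. It is not taken from the tools discussed in this work: it assumes Python 3.9 or later (for the standard graphlib module), and all task names are hypothetical. Each task is mapped to the tasks whose outputs it consumes, and a topological sort yields one valid execution order.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks whose outputs it consumes.
workflow = {
    "feature_detection": {"load_samples"},
    "alignment": {"feature_detection"},
    "quantification": {"alignment"},
    "report": {"quantification", "load_samples"},
}


def execution_order(dependencies):
    """Return one valid execution order for a task -> prerequisites mapping."""
    # static_order() yields every task after all of its prerequisites and
    # raises graphlib.CycleError if the dependencies contain a cycle.
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(workflow)
print(order)
```
]]></preformat>
      <p>Tasks that do not depend on each other may appear in any relative order, which is what makes parallel execution of independent workflow branches possible.</p>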
      <sec id="sec-6-1">
        <title>A. Workflow Interoperability and Conversion</title>
        <p>
          Throughout this work we will use workflow terminology
and representation consistent with our previous work [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
Figures 1 and 2 briefly summarize this.
        </p>
        <p>Fig. 1. The abstract layer of a workflow. Vertices represent tasks, edges
indicate the execution order. At this point, no implementation or technical
details are represented.</p>
        <p>Since the abstract workflow layer contains solely
application domain information, it is independent of the execution
requirements. Thus, the abstract layer remains unchanged across
workflow engines. In contrast, the concrete workflow layer,
the workflow engine and the executing platform are tightly
coupled. This divergence of concrete layers across engines
makes workflow interoperability challenging. Furthermore,
workflow engines often contain distinct features, complicating
conversion across platforms.</p>
        <p>
          One way to alleviate these problems is the development of
platform-independent workflow representations, e.g., the
Interoperable Workflow Intermediate Representation (IWIR) [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]
and Yet another Workflow Language (YAWL) [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] to
enable fine-grained interoperability (FGI). However,
platform-independent workflow representations do not address workflow
implementations. The Sharing interoperable Workflows for
large-scale scientific Simulation on available distributed
computing interfaces project (SHIWA) [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], for instance, provides
execution of workflows built on different workflow engines
by uploading them to the SHIWA Simulation Platform. Users
handling data subject to privacy restrictions (e.g., patient data)
might find it an unsuitable solution.
        </p>
        <p>A proper workflow conversion across engines requires that
the abstract layer remains unchanged (i.e., source and target
workflow can be considered logically equivalent). The location
of resources, how different engines implement single nodes
and logical constructs (e.g., parameter sweep) are some of
the aspects to be considered. Features unique to one engine
represent a complication. Figure 3 shows an example
of a simplified workflow conversion.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>II. IMPLEMENTATION</title>
      <p>
        The Web Services Parallel Grid Runtime and Developer
Environment Portal (WS-PGRADE) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] is a web-based
workflow engine that interacts with a wide array of resource
managers (e.g., Moab, LSF) to access distributed computing
interfaces (DCIs). This makes it a great back-end workflow
execution engine. Tasks of the same workflow can be executed
on different DCIs. However, workflow creation is a multi-step
process, posing problems for users without adequate training.
      </p>
      <p>
        The Konstanz Information Miner Analytics Platform
(KNIME Analytics Platform) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] is hosted on a personal
computer. It features an intuitive interface and contains more than
1,000 pre-loaded tools and hundreds of sample workflows.
Adding new tools requires knowledge of the Java
programming language, an aspect that might keep some users
away from this feature. A couple of royalty-based variants (i.e.,
the so-called KNIME Collaborative Extensions) are offered to
remotely execute workflows; however, WS-PGRADE offers
wider support for resource managers to access DCIs.
      </p>
      <p>Fig. 3. Workflow conversion challenges. Two different engines (i.e., e1,
e2) running on two different platforms (i.e., p1, p2) contain different concrete
layers of the same workflow. The abstract layer, however, remains unchanged.
A successful workflow conversion must take into account not only the
differences between the source and target engines, but must also consider the
source and target platforms or operating systems.</p>
      <p>We focus on providing fine-grained interoperability between
a great workflow editor such as the KNIME Analytics Platform
and a versatile, scalable workflow execution platform such as
WS-PGRADE.</p>
      <p>The first step to provide interoperability is to represent tasks
in a platform-independent manner. Certain attributes of tool
execution remain unchanged across platforms (e.g., version
and parameters), while some others change (e.g., location
of executables, input and output files). Attributes in need of
adjustment have to be identified. A platform-independent tool
representation facilitates the task conversion across platforms
and thus the conversion of full workflows.</p>
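      <p>The split between attributes that stay constant and attributes that need adjustment can be sketched as follows. This is only an illustration under stated assumptions, not the representation used by the tools in this work: the class names, the tool name "mytool", the "-in" and "-out" flags and all paths are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class ToolDescription:
    """Platform-independent part: identical on every platform."""
    name: str
    version: str
    parameters: dict = field(default_factory=dict)


@dataclass
class PlatformBinding:
    """Platform-specific part: must be adjusted for each target."""
    executable_dir: str
    input_file: str
    output_file: str


def build_command(tool, binding):
    """Combine both parts into an argument list for one concrete platform."""
    executable = str(PurePosixPath(binding.executable_dir) / tool.name)
    # "-in" and "-out" are hypothetical flags used only for this sketch.
    command = [executable, "-in", binding.input_file, "-out", binding.output_file]
    for key, value in sorted(tool.parameters.items()):
        command += [f"-{key}", str(value)]
    return command


tool = ToolDescription("mytool", "1.0", {"threshold": 0.5})
desktop = PlatformBinding("/home/user/tools", "sample.mzML", "features.xml")
cluster = PlatformBinding("/opt/dci/tools", "/scratch/sample.mzML", "/scratch/features.xml")

desktop_cmd = build_command(tool, desktop)
cluster_cmd = build_command(tool, cluster)
print(desktop_cmd)
print(cluster_cmd)
```
]]></preformat>
      <p>Only the binding changes between the two calls; the tool description, including its parameters, is reused unchanged.</p>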
      <p>One of the first challenges in the conversion between these
engines is the maintenance of a database that relates tools
on the user’s computer with tools on each of the target DCI
platforms. The next set of challenges concerns the
implementation of nodes and logical workflow constructs. The KNIME
Analytics Platform implements parameter sweep via
node-delimited workflow sections (i.e., using ZipLoopStart,
ZipLoopEnd nodes). WS-PGRADE delimits such sections with
generator and collector ports. Furthermore, WS-PGRADE
allows users to assign data files directly to input ports. The
KNIME Analytics Platform, however, requires a dedicated
node (e.g., Input File, Input Files), whose output port refers
to a file and this reference can be channeled to an input port.</p>
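      <p>The difference between node-delimited and port-delimited parameter sweep sections can be pictured as a small graph rewrite. The following sketch is a simplified illustration and not the conversion code itself: it assumes a workflow given as a dictionary of node types plus a list of edges, assumes at least one regular node between the two loop nodes, and records the port kind as a label on the bridging edge.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def convert_parameter_sweep(nodes, edges):
    """Remove loop-delimiting nodes and label the bridging edges instead.

    nodes: dict mapping node id -> node type (e.g. "ZipLoopStart")
    edges: list of (source id, target id) pairs
    Returns (remaining node ids, list of (source, target, port kind)).
    """
    port_kind = {"ZipLoopStart": "generator", "ZipLoopEnd": "collector"}
    loop_nodes = {node for node, kind in nodes.items() if kind in port_kind}
    converted = []
    for source, target in edges:
        if source in loop_nodes:
            # Outgoing edges of a loop node are rebuilt when the edge
            # entering that loop node is processed.
            continue
        if target in loop_nodes:
            kind = port_kind[nodes[target]]
            # Bridge the predecessor to every successor of the loop node.
            for other_source, other_target in edges:
                if other_source == target:
                    converted.append((source, other_target, kind))
        else:
            converted.append((source, target, "plain"))
    remaining = [node for node in nodes if node not in loop_nodes]
    return remaining, converted


# Hypothetical workflow with one parameter sweep section.
nodes = {
    "input": "Input Files",
    "start": "ZipLoopStart",
    "tool": "Tool",
    "end": "ZipLoopEnd",
    "merge": "Tool",
}
edges = [("input", "start"), ("start", "tool"), ("tool", "end"), ("end", "merge")]

remaining, converted = convert_parameter_sweep(nodes, edges)
print(remaining)
print(converted)
```
]]></preformat>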
      <p>Some features present in the KNIME Analytics Platform
are not found in WS-PGRADE. The former requires ports to
declare which data types they are compatible with and supports
file lists as inputs; the latter is more flexible but lacks native
support of file lists as inputs (i.e., each input or output port is
related to one file). Unlike WS-PGRADE nodes, KNIME Nodes
produce outputs not only via output ports: they can also set
flow variables, which can be read further down the execution
flow.</p>
      <p>The KNIME Analytics Platform is a Java program with a
graphical interface. KNIME Nodes are thus instances of Java
classes that live inside the process that launched the KNIME
Analytics Platform. In other words, they require a running
instance of the KNIME Analytics Platform to be executed,
making their execution on a DCI a challenge.</p>
      <p>The following sections describe our approach to address the
mentioned challenges.</p>
      <sec id="sec-7-2">
        <title>A. Conversion of Nodes: Addressing Disparities between Workflow Engines</title>
        <p>The KNIME Analytics Platform features a node repository
in which users can select any of the available nodes (see Figure
4). Creation of workflows in the KNIME Analytics Platform
requires a single step, thus the abstract and concrete layers are
merged into the user-friendly workflow editor. Each KNIME
Node performs a specific task and defines a fixed number of
input and output ports. Each port is associated with a port type,
which is similar to content types (e.g., csv, pdb). Only ports of
compatible types can be interconnected. Furthermore, KNIME
Nodes rely on the assumption that incoming and outgoing data
are arranged in custom in-memory data tables. Each KNIME
Node iterates over the rows of incoming data and is able to
modify the contents of the input table, as well as its structure
(e.g., by adding columns or rows). File handling is done by
using these same data tables, their cells containing uniform
resource identifiers (URI) pointing to the needed files.</p>
        <p>WS-PGRADE, on the other hand, requires the creation of
an abstract and a concrete workflow in a multi-step process
(see Figure 5). During the creation of the concrete workflow,
users input the required attributes and command line to
associate a node with a specific remote binary. In contrast to the
KNIME Analytics Platform, WS-PGRADE allows files to be assigned
directly to input ports and does not perform strict type
checking: any output port can be connected to any input port.
Additionally, the structure of the incoming and outgoing files
is arbitrary.</p>
        <p>
          Adding nodes to the KNIME Analytics Platform
requires knowledge of the Java programming language. Generic
KNIME Nodes (GKN) [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] was developed to add nodes
without programming experience by allowing arbitrary
command line tools to behave as KNIME Nodes and to seamlessly
interact with other nodes inside the KNIME Analytics
Platform. The only requirement is the representation of the tools
by Common Tool Descriptors (CTDs), which are XML files
describing the inputs, outputs and parameters of a tool [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
Currently, several software suites [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] are able to
parse and generate CTDs (i.e., they are CTD-enabled). Figure
6 illustrates how CTDs interact with CTD-enabled tools.
        </p>
        <p>We introduce KNIME2gUSE, an extension to the KNIME
Analytics Platform which converts workflows from the
KNIME Analytics Platform to WS-PGRADE, combining the
features of both engines and overcoming their disadvantages.</p>
        <p>Conversion of KNIME Nodes that were imported using
GKN is somewhat trivial. Each of these nodes represents an
external tool that is independent of the KNIME Analytics
Platform. In this case, the matching binary for the represented
tool is required on each of the target DCIs.</p>
        <p>We identify native nodes as those KNIME Nodes that were
not imported using GKN (i.e., pre-packaged nodes, nodes
added as third-party extensions or nodes added by the user
via other means). Each native KNIME Node is an instance
of a Java class managed by the KNIME Analytics Platform.
Such nodes exist only in the context of the process that
hosts the KNIME Analytics Platform. Execution of a single
KNIME Node requires a running instance of the KNIME
Analytics Platform and converting these nodes is not trivial.
Furthermore, a suitable distribution of the KNIME Analytics
Platform must be present on each of the target DCIs.</p>
        <p>Data between KNIME Nodes can only be channeled
between ports with compatible data types. Since channeled data
are in-memory representations of table-formatted data (i.e.,
data tables), we have devised a solution that allows native
KNIME Nodes to be executed as if they were command line
tools: During the export process, native KNIME Nodes are
individually packed into a small KNIME workflow. Each such
generated workflow contains a copy of the original node, along
with any user-established settings. Since inputs and outputs
for the exported node won’t be channeled inside an instance
of the KNIME Analytics Platform, extra reader and writer
nodes (i.e., Table Reader and Table Writer) are also included
in this small workflow. These nodes allow the serialization
and deserialization of the in-memory data format required
by native KNIME Nodes. The KNIME Analytics Platform
can execute workflows in a so-called batch mode, without
the need of a graphical user interface. A suitable command
line is automatically generated during our export process.
When the batch mode execution of this generated workflow
is started, input files will be read into the KNIME data table
format; upon completion, any output will be serialized from
the KNIME data table format into a file.</p>
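        <p>A command line for batch mode execution of one wrapped node might be assembled as sketched below. The text above does not specify how KNIME2gUSE builds its command line, so this is an assumption-laden illustration: the options shown are the ones we recall from the KNIME batch application documentation and should be checked against the installed version, while the workflow variable names input_file and output_file and all paths are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def knime_batch_command(knime_executable, workflow_dir, input_file, output_file):
    """Build an argument list that runs one wrapped workflow without a GUI."""
    return [
        knime_executable,
        "-nosplash",
        "-application", "org.knime.product.KNIME_BATCH_APPLICATION",
        f"-workflowDir={workflow_dir}",
        "-reset",   # reset the workflow before executing it
        "-nosave",  # do not store the executed workflow afterwards
        # Hypothetical variable names that the wrapped reader and writer
        # nodes would have to be configured to read.
        f"-workflow.variable=input_file,{input_file},String",
        f"-workflow.variable=output_file,{output_file},String",
    ]


command = knime_batch_command(
    "knime", "/scratch/node_42", "/scratch/in.table", "/scratch/out.table"
)
print(" ".join(command))
```
]]></preformat>
        <p>Returning a list instead of a single string keeps the arguments separate, so paths containing spaces need no additional quoting when the list is handed to a process launcher.</p>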
        <p>
          Our previous work [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] showcased the conversion of
KNIME workflows composed solely of nodes that were
imported via GKN. We have since extended KNIME2gUSE
to convert workflows composed of any kind of node. Figures
7 and 8 depict how the conversion of nodes is performed.
        </p>
      </sec>
      <sec id="sec-7-4">
        <title>B. Conversion of Workflows: Exporting KNIME Workflows to WS-PGRADE</title>
        <p>The KNIME2gUSE plug-in produces files that can be
imported into WS-PGRADE, ready to be executed on any
configured DCI with minor modifications.</p>
        <p>We have chosen WS-PGRADE as the target engine for
the export process due to the fact that it interacts directly
with a wide selection of resource and cloud managers (a
feature not present in the royalty-based KNIME editions that
allow remote execution). It also features workflow submission,
control, monitoring and statistics. These are functionalities
which resource managers or cloud engines often lack.</p>
        <p>The KNIME Analytics Platform natively supports the
association of single input/output ports to a file list determined
at runtime, a functionality not present in WS-PGRADE. To
overcome this, a wrapper script is automatically generated
by KNIME2gUSE that zips corresponding files into a single
archive. To translate parameter sweep sections, conversion
removes KNIME Analytics Platform ZipLoopStart and
ZipLoopEnd nodes and substitutes suitable WS-PGRADE generator
and collector ports.</p>
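        <p>The idea of carrying a file list through a single port by zipping it can be sketched with the Python standard library. This is not the wrapper script generated by KNIME2gUSE; the function names are hypothetical, and the sketch assumes that all files in the list have distinct base names.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import zipfile
from pathlib import Path


def pack_file_list(files, archive_path):
    """Zip the given files into one archive so a single port can carry them."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file in files:
            # Store only the base name; assumes base names are unique.
            archive.write(file, arcname=Path(file).name)
    return archive_path


def unpack_file_list(archive_path, target_dir):
    """Restore the file list on the receiving side; returns the extracted paths."""
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
        return [str(Path(target_dir) / name) for name in archive.namelist()]
```
]]></preformat>
        <p>A producing task would call pack_file_list before handing its output to the port, and the consuming task would call unpack_file_list before starting the actual tool.</p>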
      </sec>
      <sec id="sec-7-9">
        <title>C. Example Application: Biomarker Discovery in Metabolomics</title>
        <p>
          Metabolomics is a mass spectrometry-based approach aimed
to evaluate the entirety of a metabolite sample. Applications
include the tracking of chemicals and their transformation
products in waste water [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], identification of cancer types via
biomarkers [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] and elucidation of disease-underlying
mechanisms [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. Compared to complementary omics
technologies (e.g., transcriptomics, proteomics), metabolomics is
closer to the actual biochemical processes that occur, making
it attractive for biomarker development.
        </p>
        <p>A common analysis approach for studies interested in
comparative metabolite concentrations is label-free quantification.
The independence from chemical labels allows the direct
comparison of small molecules across an arbitrary number of
samples. As a consequence, the need to evaluate hundreds
of gigabyte-sized samples in concert is already common.
Numbers and sizes of concurrently evaluated samples are
steadily increasing, emphasizing the necessity for distributed
computing.</p>
        <p>
          We provide an example workflow for metabolomics
biomarker discovery using OpenMS [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] for mass
spectrometry algorithms as well as various native KNIME Nodes
(including nodes for the R scripting language). The KNIME workflow
and its converted WS-PGRADE version are shown in Figure
9. We assume some initial preparations were performed prior
to the execution of the workflow, namely, conversion from
closed mass spectrometer vendor formats to the open mzML
format and data reduction by means of peak picking, which
could also be implemented in KNIME via OpenMS [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] tools.
        </p>
        <p>
          Using a detection method for so-called small
molecules [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ], we adapted a label-free quantification
pipeline [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]. The quantification part of our biomarker
discovery workflow consists of sample specific feature
detection (i.e., finding the convex hulls and respective
centroids of analyte mass traces) followed by temporal
alignment of samples and the quantification of corresponding
features across samples.
Downstream small molecule identification was done via
mass-based search in the Human Metabolome Database.
Included sample normalization allows for comparison of analyte
abundances across samples. Analytes whose abundances vary
significantly after false discovery rate correction are
annotated with the mass-based identifications and exported to a
Microsoft Excel Spreadsheet (XLS format).
        </p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>III. FUTURE WORK</title>
      <p>The KNIME Analytics Platform features Metanodes
encapsulating complete workflows. We would like to extend
KNIME2gUSE to support their conversion. Furthermore,
seeing that considerable effort has been put into creating
platform-independent workflow representation formats, we would like
to add IWIR and YAWL file generation to KNIME2gUSE.
We would also like to extend our converter to support other
workflow engines, such as Galaxy.</p>
    </sec>
    <sec id="sec-9">
      <title>IV. CONCLUSION</title>
      <p>Workflows assist reproducibility and minimize time spent
validating research by reducing analysis complexity. There are
currently several workflow engines with user-friendly
interfaces that support remote execution of workflows. However,
we feel that their scalability and support of major resource
managers is still lacking. In contrast, HPC infrastructures
and their resource managers rarely support the execution and
control of workflows. As a consequence, HPC users often
require programming skills to handle the channeling of data
as well as to submit, monitor and control the respective
computing jobs.</p>
      <p>We presented our efforts to support workflow export from
the KNIME Analytics Platform to WS-PGRADE, identified
challenges for both node and workflow conversion, and detailed
our solutions. KNIME offers remote workflow execution, but it
is a royalty-based solution and its support of DCIs is limited, an
aspect in which WS-PGRADE excels. KNIME2gUSE brings
together a user-friendly and intuitive workflow engine for
personal computers and a scalable HPC workflow
platform that interacts with several DCIs.</p>
      <p>We thus provide the individual advantages of both engines
without any of their shortcomings. Overall, our methods
decrease time spent designing workflows and troubleshooting
conversion for different workflow engines.</p>
    </sec>
    <sec id="sec-10">
      <title>ACKNOWLEDGMENT</title>
      <p>The authors would like to thank Bernd Wiswedel, Thorsten
Meinl, Patrick Winter and Michael Berthold for their support,
patience and help in developing the KNIME2gUSE extension.</p>
      <p>This work was supported by the German Network
for Bioinformatics Infrastructure (Deutsches Netzwerk für
Bioinformatik-Infrastruktur, de.NBI).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C. S.</given-names>
            <surname>Greene</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Moore</surname>
          </string-name>
          , and C. Cheng, “
          <article-title>Big data bioinformatics</article-title>
          .
          <source>” Journal of cellular physiology</source>
          , vol.
          <volume>229</volume>
          , no.
          <issue>12</issue>
          , pp.
          <fpage>1896</fpage>
          -
          <lpage>900</lpage>
          , Dec.
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>McNutt</surname>
          </string-name>
          , “Reproducibility.”
          <string-name>
            <surname>Science</surname>
          </string-name>
          (New York, N.Y.), vol.
          <volume>343</volume>
          , no.
          <issue>6168</issue>
          , p.
          <fpage>229</fpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3] “Trouble at the lab,” The Economist, Oct.
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Baker</surname>
          </string-name>
          , “
          <article-title>Over half of psychology studies fail reproducibility test,”</article-title>
          <string-name>
            <surname>Nature</surname>
          </string-name>
          , Aug.
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>de la Garza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Veit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Szolek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Röttig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Aiche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gesing</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Reinert</surname>
          </string-name>
          , and
          <string-name>
            <given-names>O.</given-names>
            <surname>Kohlbacher</surname>
          </string-name>
          , “
          <article-title>From the desktop to the grid: scalable bioinformatics via workflow conversion</article-title>
          ,
          <source>” BMC Bioinformatics</source>
          , vol.
          <volume>17</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
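          [6]
          <string-name>
            <given-names>L.</given-names>
            <surname>de la Garza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Krüger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Schärfe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Röttig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Aiche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Reinert</surname>
          </string-name>
          , and
          <string-name>
            <given-names>O.</given-names>
            <surname>Kohlbacher</surname>
          </string-name>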
          , “
          <article-title>From the desktop to the grid: conversion of knime workflows to guse</article-title>
          .”
          in
          <source>IWSG</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>K.</given-names>
            <surname>Plankensteiner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Montagnat</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Prodan</surname>
          </string-name>
          , “
          <article-title>IWIR: A Language Enabling Portability Across Grid Workflow Systems</article-title>
          ,” in SIGMOD Rec., vol.
          <volume>34</volume>
          , no.
          <issue>3</issue>
          ,
          <year>2011</year>
          , pp.
          <fpage>97</fpage>
          -
          <lpage>106</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
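          <string-name>
            <given-names>W.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>ter Hofstede</surname>
          </string-name>
          , “
          <article-title>YAWL: yet another workflow language</article-title>
          ,”
          <source>Information Systems</source>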
          , vol.
          <volume>30</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>245</fpage>
          -
          <lpage>275</lpage>
          , Jun.
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>G.</given-names>
            <surname>Terstyanszky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Kukla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Kiss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kacsuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Balasko</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Z.</given-names>
            <surname>Farkas</surname>
          </string-name>
          , “
          <article-title>Enabling scientific workflow sharing through coarse-grained interoperability</article-title>
          ,”
          <source>Future Generation Computer Systems</source>
          , vol.
          <volume>37</volume>
          , pp.
          <fpage>46</fpage>
          -
          <lpage>59</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>P.</given-names>
            <surname>Kacsuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Farkas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kozlovszky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Hermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Balasko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Karoczkai</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Marton</surname>
          </string-name>
          , “
          <article-title>WS-PGRADE/gUSE Generic DCI Gateway Framework for a Large Variety of User Communities</article-title>
          ,”
          <source>Journal of Grid Computing</source>
          , vol.
          <volume>10</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>601</fpage>
          -
          <lpage>630</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Berthold</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Cebron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Dill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. R.</given-names>
            <surname>Gabriel</surname>
          </string-name>
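          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Kötter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Meinl</surname>
          </string-name>
          ,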
          <string-name>
            <given-names>P.</given-names>
            <surname>Ohl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Thiel</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Wiswedel</surname>
          </string-name>
          , “
          <article-title>KNIME - the Konstanz information miner</article-title>
          ,”
          <source>ACM SIGKDD Explorations Newsletter</source>
          , vol.
          <volume>11</volume>
          , no.
          <issue>1</issue>
          , p.
          <fpage>26</fpage>
          , Nov.
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sturm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bertsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gröpl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hildebrandt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hussong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Lange</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Pfeifer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Schulz-Trieglaff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zerck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Reinert</surname>
          </string-name>
          , and
          <string-name>
            <given-names>O.</given-names>
            <surname>Kohlbacher</surname>
          </string-name>
          , “
          <article-title>OpenMS - an open-source software framework for mass spectrometry</article-title>
          ,”
          <source>BMC Bioinformatics</source>
          , vol.
          <volume>9</volume>
          , p.
          <fpage>163</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Döring</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Weese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rausch</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Reinert</surname>
          </string-name>
          , “
          <article-title>SeqAn an efficient, generic C++ library for sequence analysis</article-title>
          ,”
          <source>BMC Bioinformatics</source>
          , vol.
          <volume>9</volume>
          , no.
          <issue>1</issue>
          , p.
          <fpage>11</fpage>
          , Jan.
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A.</given-names>
            <surname>Hildebrandt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Dehof</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rurainski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bertsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schumann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. C.</given-names>
            <surname>Toussaint</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Moll</surname>
          </string-name>
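          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Stöckel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nickels</surname>
          </string-name>
          ,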
          <string-name>
            <given-names>S. C.</given-names>
            <surname>Mueller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-P.</given-names>
            <surname>Lenhof</surname>
          </string-name>
          , and
          <string-name>
            <given-names>O.</given-names>
            <surname>Kohlbacher</surname>
          </string-name>
          , “
          <article-title>BALL-biochemical algorithms library 1.3</article-title>
          ,”
          <source>BMC Bioinformatics</source>
          , vol.
          <volume>11</volume>
          , p.
          <fpage>531</fpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>E. L.</given-names>
            <surname>Schymanski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. P.</given-names>
            <surname>Singer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Longrée</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Loos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ruff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Stravs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ripollés Vidal</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Hollender</surname>
          </string-name>
          , “
          <article-title>Strategies to characterize polar organic contamination in wastewater: exploring the capability of high resolution mass spectrometry</article-title>
          ,”
          <source>Environmental Science &amp; Technology</source>
          , vol.
          <volume>48</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>1811</fpage>
          -
          <lpage>8</lpage>
          , Jan.
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sugimoto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. T.</given-names>
            <surname>Wong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hirayama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Soga</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Tomita</surname>
          </string-name>
          , “
          <article-title>Capillary electrophoresis mass spectrometry-based saliva metabolomics identified oral, breast and pancreatic cancer-specific profiles</article-title>
          ,”
          <source>Metabolomics: Official Journal of the Metabolomic Society</source>
          , vol.
          <volume>6</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>78</fpage>
          -
          <lpage>95</lpage>
          , Mar.
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>C.</given-names>
            <surname>Denkert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Budczies</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Kind</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Weichert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Tablack</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sehouli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Niesporek</surname>
          </string-name>
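          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Könsgen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dietel</surname>
          </string-name>
          , and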
          <string-name>
            <given-names>O.</given-names>
            <surname>Fiehn</surname>
          </string-name>
          , “
          <article-title>Mass spectrometry-based metabolic profiling reveals different metabolite patterns in invasive ovarian carcinomas and ovarian borderline tumors</article-title>
          ,”
          <source>Cancer Research</source>
          , vol.
          <volume>66</volume>
          , no.
          <issue>22</issue>
          , pp.
          <fpage>10795</fpage>
          -
          <lpage>804</lpage>
          , Nov.
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Irmler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hoene</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Scheler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Beckers</surname>
          </string-name>
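          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hrabě de Angelis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-U.</given-names>
            <surname>Häring</surname>
          </string-name>
          ,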
          <string-name>
            <given-names>B. K.</given-names>
            <surname>Pedersen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Plomgaard</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Weigert</surname>
          </string-name>
          , “
          <article-title>Type 2 diabetes alters metabolic and transcriptional signatures of glucose and amino acid metabolism during exercise and recovery</article-title>
          ,”
          <source>Diabetologia</source>
          , vol.
          <volume>58</volume>
          , no.
          <issue>8</issue>
          , pp.
          <fpage>1845</fpage>
          -
          <lpage>54</lpage>
          , Aug.
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>E.</given-names>
            <surname>Kenar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Franken</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Forcisi</surname>
          </string-name>
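          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Wörmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-U.</given-names>
            <surname>Häring</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          ,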
          <string-name>
            <given-names>P.</given-names>
            <surname>Schmitt-Kopplin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zell</surname>
          </string-name>
          , and
          <string-name>
            <given-names>O.</given-names>
            <surname>Kohlbacher</surname>
          </string-name>
          , “
          <article-title>Automated label-free quantification of metabolites from liquid chromatography-mass spectrometry data</article-title>
          ,”
          <source>Molecular &amp; Cellular Proteomics: MCP</source>
          , vol.
          <volume>13</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>348</fpage>
          -
          <lpage>59</lpage>
          , Jan.
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>H.</given-names>
            <surname>Weisser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nahnsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Grossmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Nilse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Quandt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Brauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sturm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kenar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Kohlbacher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Aebersold</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Malmström</surname>
          </string-name>
          , “
          <article-title>An automated pipeline for high-throughput label-free quantitative proteomics</article-title>
          ,”
          <source>Journal of Proteome Research</source>
          , vol.
          <volume>12</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>1628</fpage>
          -
          <lpage>44</lpage>
          , Apr.
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>