<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>International</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Using Scientific Workflows for Science and Engineering Optimisation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>David Abramson</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Queensland</institution>
          ,
          <addr-line>St Lucia, 4072</addr-line>
          <country country="AU">Australia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2000</year>
      </pub-date>
      <volume>2000</volume>
      <kwd-group>
        <kwd>Scientific Workflows</kwd>
        <kwd>Science Gateways</kwd>
        <kwd>High Performance Computing</kwd>
        <kwd>Numerical Optimization</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        The work described in this extended abstract concerns the
synthesis of three normally disconnected pieces of computing
infrastructure, namely Scientific Workflows, Engineering
Optimization and Science Gateways. When combined, they
provide a rich framework for performing engineering design.
Scientific workflows have been applied to a wide range of
problems from science and engineering to ecology. They deliver
infrastructure that simplifies scripting complex distributed
experiments. For example, data may be sourced from one or more
locations, and used to drive a pipeline of computational models.
Processing steps may vary from simple data reformatting
and pre-processing, which can be performed on local
workstations, through to computationally intensive models that
require supercomputers. Many workflow engines have been
produced over the years, and a reasonable summary of these can
be found in [
        <xref ref-type="bibr" rid="ref4">8</xref>
        ].
      </p>
      <p>Engineering optimization increasingly uses complex
computational models that represent some aspects of a system of
interest. For example, it can be applied to the problem of finding
optimal airfoil shapes as part of an aircraft design. It can also be
used to compute optimal air pollution control strategies, find
optimal shapes for radio antennas, and solve a wide range of other
problems. Importantly, optimization algorithms are usually
iterative and, when combined with computational models, require
repeated executions of a model to produce an “objective value”.
This objective value is then returned to the search algorithm so that
it can iterate and produce better solutions.</p>
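      <p>The iterative pattern described above can be sketched in a few
lines of Python. This is a minimal illustration only: the names model and
local_search, the quadratic objective and the random-perturbation search
are assumptions chosen for brevity. They stand in for an expensive
engineering model and a real optimization algorithm.</p>
      <preformat>
```python
import random
from typing import Callable, List, Tuple


def model(x: List[float]) -> float:
    """Stand-in for an expensive computational model (hypothetical).

    Returns the objective value; here a quadratic bowl whose
    minimum lies at (1.0, -2.0).
    """
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


def local_search(objective: Callable[[List[float]], float],
                 start: List[float],
                 step: float = 0.5,
                 iterations: int = 200,
                 seed: int = 0) -> Tuple[List[float], float, int]:
    """Random-perturbation descent: propose, evaluate, keep if better."""
    rng = random.Random(seed)
    best = list(start)
    best_value = objective(best)        # one execution of the model
    evaluations = 1
    for _ in range(iterations):
        candidate = [v + rng.uniform(-step, step) for v in best]
        value = objective(candidate)    # objective value returned to the search
        evaluations += 1
        if best_value > value:          # keep only improvements
            best, best_value = candidate, value
    return best, best_value, evaluations


best, best_value, evaluations = local_search(model, [5.0, 5.0])
print(best, best_value, evaluations)
```
      </preformat>
      <p>Each pass through the loop corresponds to one execution of the
model, so when the model is expensive the number of iterations largely
determines the total run time.</p>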
      <p>Science Gateways are Web portals that simplify access to
complex software services, and may be underpinned by large
databases and high performance computers. One of the earliest
Science Gateways was NanoHub [6], which provided access to a
wide range of engineering design tools through a simple
Web-based user interface. Since then, numerous Science Gateways
have been built. Traditionally, however, Science Gateways have
not supported Scientific Workflows per se, although some do
execute workflows behind the gateway as a way of performing
computation.</p>
      <p>In this keynote address I describe a system that integrates these
three technologies, and show how it supports automatic
engineering design optimization. Specifically, I
will show how it can be applied to very high-dimensional
airfoil design problems.</p>
    </sec>
    <sec id="sec-2">
      <title>2. BACKGROUND TECHNOLOGIES</title>
    </sec>
    <sec id="sec-3">
      <title>2.1 Kepler</title>
      <p>
        In general, scientific workflows can be data-intensive,
compute-intensive, analysis-intensive or visualisation-intensive [
        <xref ref-type="bibr" rid="ref3">7</xref>
        ]. While
there are numerous workflow systems, in this work we have
focussed on the Kepler system [
        <xref ref-type="bibr" rid="ref5">9</xref>
        ][
        <xref ref-type="bibr" rid="ref3">7</xref>
        ][
        <xref ref-type="bibr" rid="ref2">5</xref>
        ]. Kepler supports
different levels of workflows from low-level workflows for grid
engineers, to higher-level knowledge discovery workflows for
less-technical users. It provides domain scientists with an
easy-to-use, yet powerful, system for capturing the workflows they
engage with on a daily basis. It streamlines the workflow
construction and execution process so that scientists can focus on
analyses with minimal effort. Kepler’s actor-oriented modelling is
inherited from the Ptolemy II system. Ptolemy II provides
module-oriented programming with an emphasis on multiple
component interaction semantics. The key principle is to use
well-defined Models-of-Computation that govern interactions between
components, or actors.
      </p>
      <p>Actors operate like functions in traditional programming
languages. Unlike Ptolemy II, Kepler focuses on the design and
execution of scientific workflows, so in Kepler a scientific workflow
is formed by composing independent actors.</p>
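      <p>The analogy between actors and functions can be made concrete with
a small Python sketch. It is an assumption-laden illustration, not Kepler
or Ptolemy II code: the Actor type, the toy actors scale and offset, and
the compose helper are hypothetical names, and no Director or
Model-of-Computation is modelled.</p>
      <preformat>
```python
from functools import reduce
from typing import Callable, Iterator

# In this sketch an "actor" is a function from a token stream to a token stream.
Actor = Callable[[Iterator[float]], Iterator[float]]


def scale(tokens: Iterator[float]) -> Iterator[float]:
    """Toy pre-processing actor: double every incoming token."""
    for token in tokens:
        yield token * 2.0


def offset(tokens: Iterator[float]) -> Iterator[float]:
    """Toy model actor: add one to every incoming token."""
    for token in tokens:
        yield token + 1.0


def compose(*actors: Actor) -> Actor:
    """Wire actors output-to-input, in the order given."""
    def workflow(tokens: Iterator[float]) -> Iterator[float]:
        return reduce(lambda stream, actor: actor(stream), actors, tokens)
    return workflow


workflow = compose(scale, offset)
results = list(workflow(iter([1.5, 2.5])))
print(results)  # [4.0, 6.0]
```
      </preformat>
      <p>Because each actor only sees a stream of tokens, actors can be
developed independently and rearranged without changing their code.</p>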
      <p>Kepler’s use of Models-of-Computation, as implemented through
“Directors”, makes it relatively easy to change the execution
semantics. We adopted Kepler because we wanted a more
sophisticated execution mechanism, as discussed in Section 2.3.
While it would have been possible to add these semantics to other
open-source workflow tools, this was a relatively natural
extension for Kepler.</p>
    </sec>
    <sec id="sec-4">
      <title>2.2 Nimrod</title>
      <p>
        Nimrod enables users to conduct parametric experiments to study
behaviours of complex systems [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ][2][3][4]. Nimrod supports
repeated execution of the same experiments with different input
parameters, and it automates several repeated procedures such as
formulation, execution, monitoring and result gathering from
multiple experiments. Nimrod greatly reduces the programming
effort required for such experiments, and includes a distributed
scheduling component. Of the many versions of Nimrod, we
describe two here: Nimrod/G and Nimrod/O.
      </p>
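      <p>A parametric experiment of this kind can be pictured as the cross
product of the parameter ranges, with one job per combination. The sketch
below is plain Python and does not use Nimrod's own syntax or interfaces;
the experiment function and the angle and speed parameters are
hypothetical placeholders for a real model and its inputs.</p>
      <preformat>
```python
from itertools import product
from typing import Dict, List


def experiment(angle: float, speed: float) -> float:
    """Stand-in for one run of a computational model (hypothetical)."""
    return angle * speed


# Hypothetical parameter ranges; every combination becomes one job.
parameters: Dict[str, List[float]] = {
    "angle": [0.0, 2.0, 4.0],
    "speed": [10.0, 20.0],
}

names = list(parameters)
jobs = [dict(zip(names, values)) for values in product(*parameters.values())]

# Run each job and keep its result next to the inputs that produced it.
results = [(job, experiment(**job)) for job in jobs]
for job, value in results:
    print(job, value)
```
      </preformat>
      <p>Here three angles and two speeds give six jobs. In a real
experiment each job would be dispatched to a compute resource rather than
run in a local loop.</p>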
      <p>Nimrod/G allows users to explore many different scenarios and
select those that optimise the end results, but it does so by
exhaustive search. Nimrod/G can distribute computations to local
computers, remote machines connected by Grid middleware, and
Cloud resources. Its biggest drawback when
applied to real-world engineering problems is that an exhaustive
search might be infeasible. Nimrod/O's main goal is to combine
rapid application development, distributed computing and
optimization into a single tool. Unlike Nimrod/G, it uses
non-linear optimization techniques to search the outputs of
arbitrary computational models. This means that Nimrod/O
usually explores far fewer design alternatives than Nimrod/G,
making it more efficient. Nimrod/O can nevertheless use
Nimrod/G to perform a computation on a remote resource or
supercomputer.</p>
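      <p>The difference in cost between an exhaustive sweep and a directed
search can be illustrated on a toy problem. The sketch below is an
assumption throughout: the model, the TARGET optimum and the grid are
invented, and the simple coordinate search is a stand-in chosen for
brevity, not one of the non-linear techniques used by Nimrod/O.</p>
      <preformat>
```python
from itertools import product
from typing import Callable, Tuple

Point = Tuple[int, ...]
TARGET: Point = (3, 15, 9)   # hypothetical optimum of the toy model
LEVELS = range(0, 21)        # 21 candidate values per parameter


def model(x: Point) -> int:
    """Stand-in for an expensive model: squared distance to TARGET."""
    return sum((xi - ti) ** 2 for xi, ti in zip(x, TARGET))


def exhaustive(f: Callable[[Point], int]) -> Tuple[Point, int]:
    """Evaluate every grid point, as a full parameter sweep would."""
    points = list(product(LEVELS, repeat=len(TARGET)))
    return min(points, key=f), len(points)


def coordinate_search(f: Callable[[Point], int],
                      start: Point) -> Tuple[Point, int]:
    """Step one parameter at a time, keeping any move that lowers f."""
    best, best_value, evaluations = start, f(start), 1
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            for delta in (-1, 1):
                candidate = best[:i] + (best[i] + delta,) + best[i + 1:]
                if candidate[i] not in LEVELS:
                    continue            # stay inside the grid
                value = f(candidate)
                evaluations += 1
                if best_value > value:
                    best, best_value, improved = candidate, value, True
    return best, evaluations


grid_best, grid_runs = exhaustive(model)
search_best, search_runs = coordinate_search(model, (10, 10, 10))
print(grid_best, grid_runs)
print(search_best, search_runs)
```
      </preformat>
      <p>Both approaches reach the same design on this toy problem, but the
sweep needs one model run per grid point, a count that grows
exponentially with the number of parameters.</p>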
    </sec>
    <sec id="sec-5">
      <title>2.3 Nimrod/K</title>
      <p>
        As discussed, Nimrod and Kepler address different aspects of
computational science. Kepler makes it easy to specify a single
experiment, and Nimrod makes it easy to execute that experiment
across different input conditions. We have combined these into
Nimrod/K (Nimrod + Kepler). Nimrod/K provides similar
functionality to Nimrod/G, but is built on, and extends, Kepler’s
runtime engine. Thus, it is possible to create arbitrarily complex
pipelines, or workflows, of computations, and to stream different
parameter values through the workflow. By combining Kepler
with Nimrod/G, it is possible to run computations on a variety of
distributed infrastructure. Likewise, leveraging Nimrod/O’s
optimization approach makes it possible to search for optimal
outputs from a workflow, rather than a single stand-alone
computation. Nimrod/K builds on Kepler’s standard Directors
(SDF and PN), adding a new one for the Tagged Dataflow
Architecture (TDA) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The TDA Director builds dynamic
concurrency into the workflow and allows independent loops to
iterate in parallel.
      </p>
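      <p>The idea of streaming tagged parameter values through one workflow
can be sketched as follows. This is only a loose illustration of
tag-keyed tokens processed concurrently, not the TDA Director's
implementation: the Token class, the run_pipeline helper and the toy
actors are hypothetical, and a thread pool stands in for the Director's
scheduling.</p>
      <preformat>
```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Token:
    tag: int       # identifies the parameter set this token belongs to
    value: float


def square(x: float) -> float:
    return x * x


def add_one(x: float) -> float:
    return x + 1.0


def run_pipeline(token: Token,
                 actors: List[Callable[[float], float]]) -> Token:
    """Push one tagged token through every actor, preserving its tag."""
    value = token.value
    for actor in actors:
        value = actor(value)
    return Token(token.tag, value)


tokens = [Token(tag, float(tag)) for tag in range(4)]
with ThreadPoolExecutor(max_workers=4) as pool:
    # Tokens with different tags are independent, so they may run in parallel.
    outputs = list(pool.map(lambda t: run_pipeline(t, [square, add_one]),
                            tokens))

by_tag: Dict[int, float] = {t.tag: t.value for t in outputs}
print(by_tag)  # {0: 1.0, 1: 2.0, 2: 5.0, 3: 10.0}
```
      </preformat>
      <p>Keeping the tag with each token means results can be matched to
their inputs regardless of the order in which the parallel runs
finish.</p>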
    </sec>
    <sec id="sec-6">
      <title>2.4 Nimrod/OK</title>
      <p>Optimization algorithms may themselves be viewed as workflows,
usually involving repetitive looping so that results are passed from
one iteration to the next. When the features of Nimrod/K and
Nimrod/O are combined, optimisation loops can be expressed as
workflows; this tool variant is called Nimrod/OK. Nimrod/OK exposes the
tasks of an optimization loop and allows the user to assemble
novel arrangements of those components. Optimisation algorithms
are added as new actors in Kepler, and thus the functionality
previously available in Nimrod/O is integrated into Nimrod/OK
by building new actors.</p>
    </sec>
    <sec id="sec-7">
      <title>2.5 Science Gateways and WorkWays</title>
      <p>Science Gateways are Web-based portals that hide the complexity
of the underlying software and hardware infrastructure.
Traditionally, workflows sit behind Gateways and are executed
as if they were monolithic programs, and results may be rendered in
the gateway only on completion. This makes it difficult to interact with
a pipeline-based computation.</p>
      <p>
        WorkWays differs from this by implementing actors that can
interact with portal components whilst the workflow is still
running. This allows us to gather user input and present output
during execution, and even steer the computation as it proceeds.
We have demonstrated WorkWays on a number of interactive
workflow-based computations [
        <xref ref-type="bibr" rid="ref6">10</xref>
        ].
      </p>
    </sec>
    <sec id="sec-8">
      <title>3. CONCLUSION</title>
      <p>In this keynote I provide more information on the background
technologies discussed in Section 2, and show how combining
them provides an extremely powerful platform. This platform has
the following features:</p>
      <list list-type="bullet">
        <list-item>
          <p>Users can express complex computational pipelines using
Kepler as a Scientific Workflow Engine. Since Kepler has a
large library of pre-existing components, this makes it
relatively easy to build complex experiments. Further,
Kepler’s graphical user interface makes it fairly easy to treat
workflows as documentation;</p>
        </list-item>
        <list-item>
          <p>Nimrod/G can be used to perform computations on remote
high-end parallel machines. This means that simple actors can
be executed locally, but more complex computations, such as
engineering models, can be run on supercomputers;</p>
        </list-item>
        <list-item>
          <p>Nimrod/OK provides the ability to script optimization loops
as workflows. Nimrod/OK has a variety of different
optimization algorithms that can be matched to the problem at
hand;</p>
        </list-item>
        <list-item>
          <p>WorkWays exposes these workflows through Web
technology, allowing a user both to input data to a running
optimization workflow and to receive information (in graphical
form) on how the computation is proceeding. The user can
then steer the optimization further.</p>
        </list-item>
      </list>
      <p>Figure 1 is a screen capture that illustrates how these
combine. In the right-hand pane is a Nimrod/K workflow that
simulates an airfoil. The top left image shows a particular design
as a two-dimensional cross-section. The bottom image shows a
Parallel Coordinates visualisation of multiple input parameters
and multiple objective function values. These panes are all
rendered in the WorkWays web portal, which also allows users to
specify and configure the computing resources required to
perform the experiment.</p>
      <p>Figure 1. Optimizing a high-dimensional problem in WorkWays</p>
    </sec>
    <sec id="sec-9">
      <title>4. ACKNOWLEDGMENTS</title>
      <p>Many people and funding bodies have contributed to Nimrod over
a significant period. Thanks go to Blair Bethwaite, Colin Enticott,
Minh Dinh, Slavisa Garic, Jon Giddy, Chao Jin, Hoang Nguyen
and Tom Peachey, all of whom contributed to Nimrod/G,
Nimrod/K and Nimrod/O. Hoang Nguyen is responsible for the
most recent work on Science Gateways and WorkWays. Timos
Kipouros is responsible for recent work on Nimrod/OK and
engineering optimization applications.</p>
      <p>Funding has been provided by the Australian Research Council
and the Distributed Systems Technology Co-operative Research
Centre.</p>
      <p>[6] http://nanohub.org</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Abramson</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bethwaite</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Enticott</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garic</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peachey</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Michailova</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Amirrazi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <year>2010</year>
          .
          <article-title>Embedding optimization in computational science workflows</article-title>
          .
          <source>Journal of Computational Science</source>
          ,
          <volume>1</volume>
          ,
          <fpage>41</fpage>
          -
          <lpage>47</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Altintas</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berkley</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jaeger</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ludascher</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Mock</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <article-title>Kepler: an extensible system for design and execution of scientific workflows</article-title>
          .
          <source>Proceedings. 16th International Conference on Scientific and Statistical Database Management</source>
          , 21-23 June
          <year>2004</year>
          .
          <fpage>423</fpage>
          -
          <lpage>424</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Kepler</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>The Kepler Project [Online]</article-title>
          . Available: http://kepler-project.org/ [Accessed 4/11/2011].
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pacitti</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Valduriez</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Mattoso</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <article-title>A Survey of Data-Intensive Scientific Workflow Management</article-title>
          .
          <source>J. Grid Comput</source>
          .
          <volume>13</volume>
          ,
          <issue>4</issue>
          (December
          <year>2015</year>
          ),
          <fpage>457</fpage>
          -
          <lpage>493</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Ludascher</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Altintas</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berkley</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Higgins</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jaeger</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>E. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tao</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <year>2006</year>
          .
          <article-title>Scientific workflow management and the Kepler system</article-title>
          .
          <source>Concurrency and Computation: Practice and Experience</source>
          ,
          <volume>18</volume>
          ,
          <fpage>1039</fpage>
          -
          <lpage>1065</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abramson</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kipouros</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Janke</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Galloway</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <article-title>WorkWays: Interacting with Scientific Workflows</article-title>
          .
          <source>Concurrency Computat.: Pract. Exper.</source>
          ,
          <volume>27</volume>
          :
          <fpage>4377</fpage>
          -
          <lpage>4397</lpage>
          , 21 May
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>