<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>G-FLUXO: A workflow portal specialized in Computational BioChemistry</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>E. Gutiérrez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>A. Costantini</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>J. López Cacheiro</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>A. Rodríguez</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CESGA.</institution>
          <addr-line>Santiago de Compostela.</addr-line>
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Chemistry, University of Perugia.</institution>
          <addr-line>Perugia</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The development of a workflow-aware Grid portal specialized in Computational BioChemistry is presented. The P-GRADE Portal (GridSphere based) has been extended with specific portlets that add support both for Computational BioChemistry applications and for different distributed resources. As a first prototype, GROMACS, a versatile package that performs Molecular Dynamics simulations using the Newtonian equations of motion for systems with hundreds to millions of particles, has been implemented and supported. Starting from that, specific DAG workflows running GROMACS jobs that demand very different computational resources on different local and Grid infrastructures (e.g. EGEE, EELA) have been developed and tested. A Jmol-based portlet, tightly integrated into the portal, has been developed to help the user visualize both workflow progress and results.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 INTRODUCTION</title>
      <p>In the twentieth century, advances in BioChemistry and Computer Science
enabled the development of new mathematical models able to
simulate the behavior of complex systems at the molecular level,
making it possible to understand chemical reactions as well as the macroscopic
properties of such systems. The huge number of atoms and
interactions involved in the calculations requires an increasing
amount of computational resources that are not always available in local
laboratories.</p>
      <p>Advances in Computational Science, improvements
in CPU performance, new architecture designs and the increasing
availability of computing power on Grid platforms, together with
the adoption of new computational models and algorithms adapted
to better exploit the parallelism and memory management of such
resources, have contributed strongly to the computational simulation
of complex systems.</p>
      <p>
        Nowadays many Graphical User Interfaces for BioChemistry
programs have been developed. Special emphasis has been
put on the coordinated and collaborative use of different
applications in order to perform more complex simulations.
Schrödinger Software (http://www.schrodinger.com), Scienomics
MAPS (http://www.scienomics.com/Products/maps/index.php) and
Accelrys Materials Studio
(http://accelrys.com/products/materialsstudio) are commercial examples in which the integration of
specific scientific applications, execution on remote resources
and centralized visualization have been implemented. However,
although a lot of work has been done on application integration (compatibility,
input/output format interchange, visualization...), little
effort has been devoted to the easy and efficient
execution of applications on the very different computational
platforms available today (from a multicore desktop to the
heterogeneous and geographically distributed Grid infrastructure,
passing through the cluster platform located in a lab or in a
supercomputing center). In this sense, the G-Fluxo project is devoted
to the development of a workflow-aware Grid portal specialized in
Computational BioChemistry where very different computational
platforms can be used without the need for very specific computer
skills. G-Fluxo is built on existing and widely
used technology in order to avoid very specific requirements for
adding computational resources. The P-GRADE Portal (
        <xref ref-type="bibr" rid="ref4">Kacsuk
and Sipos (2005)</xref>
        ) has been chosen as the starting point of this
project, so the most widely used Grid middleware is supported.
Integration of the cluster platform is based on the SSH protocol
and the DRMAA standard (
        <xref ref-type="bibr" rid="ref6">Tröger et al. (2007)</xref>
        ), which is operative on
most of the clusters running today. Finally, a GridSphere based
solution (http://www.gridsphere.org) such as P-GRADE Portal also
allows the development of application-specific portlets that are fully JSR 168
compliant (http://www.jcp.org/en/jsr/detail?id=168).
      </p>
      <p>
        The first application chosen to be implemented and supported in
the G-Fluxo project is GROMACS (
        <xref ref-type="bibr" rid="ref7">van der Spoel et al. (2005)</xref>
        ).
GROMACS is a suite of applications for the simulation of complex
systems that makes use of a wide variety of Molecular Dynamics
techniques optimized for different architectures. The variety of
simulations needed to solve a particular problem makes GROMACS
suitable for exploiting different types of computer platforms to solve
one single problem. Workflows in which each part runs on a very
different platform are possible, sending one part of the
jobs to High Performance Computing (HPC) resources and the other part to
High Throughput Computing (HTC) resources.
      </p>
      <p>Grid computing resources can be considered a type of
HTC platform, whereas local clusters with high speed
interconnect networks are examples of HPC platforms. In the
present paper a system to split workflows between grid and
local clusters in a way that is transparent to the user is described.
Traditionally, the management of such workflows is performed
manually, which notably increases the effort the scientist must make to take
advantage of both platforms. The developed web portal
provides an abstraction of these systems, offering a user-friendly
and customizable interface.</p>
      <p>The Web Portal presented in this article makes use of both HPC
and HTC infrastructures, showing as an example an interface for
job submission using GROMACS. GROMACS output is visualized
through a specifically developed portlet based on Jmol
(http://www.jmol.org).</p>
      <p>The GROMACS and Jmol packages have also been implemented
in the COMPCHEM P-GRADE Portal
(http://ui.grid.unipg.it:8080/gridsphere/gridsphere) in order to test the porting of the
portlet applications developed at CESGA.</p>
      <p>The paper is organized as follows: in Section 2 the design
and implementation of the portal built upon existing technologies
are illustrated (a comparison of existing portals is also included);
in Section 3 a case study consisting of a workflow based on
GROMACS is presented. Our conclusions and future work are
summarized in Section 4.</p>
    </sec>
    <sec id="sec-2">
      <title>2 PORTAL DESIGN AND IMPLEMENTATION</title>
      <p>Nowadays the most common way of submitting jobs is through the
command line, both for the grid and for local clusters. An alternative
is to use a graphical interface.
A web portal has been chosen as the graphical interface because it
offers wide compatibility between different platforms on the client
side. Furthermore, users can access their data from different
places and machines, and they do not need to worry about
software installation or updates.</p>
      <p>In the following subsections, some existing technologies for submitting
simulations through a web portal are described. Then, taking an
existing web portal as a base, the modifications made to that portal
are described, both to send jobs to a local cluster and to visualize
simulation job output.</p>
      <sec id="sec-2-1">
        <title>2.1 Existing Technologies</title>
        <p>Four existing web portals were compared; the results are presented in this Section.
Afterwards, the characteristics of the portal chosen as the starting
point for our developments are discussed in more detail.</p>
        <p>2.1.1 Portal Comparison. There are several implementations
of web portals for sending jobs to the grid. Four Open Source
web portals were tested as a possible base for our work: Vine
(http://vinetoolkit.org), GridPort 4.0.1 (http://gridport.net/main),
OGCE Release 2 (http://www.ogce.org), and P-GRADE Portal. All
of these portals share a common execution platform: GridSphere.</p>
        <p>As shown in Table 2.1.1, Vine is remarkable for its
interface based on Adobe Flex (http://www.adobe.com/go/flex); for
example, it has a comfortable file manager for the grid. Additionally, it
has a development API with which it is possible to develop
applications both for the web portal and for
a desktop environment. Unfortunately, Vine lacks both a standard
language and a graphical environment for managing workflows. We
also tested GridPort 4.0.1 and OGCE Release 2. Both lack
support for gLite (http://glite.web.cern.ch/glite), a middleware
widely used in grid platforms (EGEE, EELA...) and currently
the main grid middleware used at CESGA.</p>
        <table-wrap>
          <label>Table 2.1.1</label>
          <table>
            <thead>
              <tr><th /><th>OGCE</th><th>GridPort</th><th>Vine</th><th>P-GRADE Portal</th></tr>
            </thead>
            <tbody>
              <tr><td>Framework</td><td>GridSphere 2.1.5</td><td>GridSphere 2.0.2</td><td>GridSphere 3.1</td><td>GridSphere 2.2.10</td></tr>
              <tr><td>gLite support</td><td>no</td><td>no</td><td>yes</td><td>yes</td></tr>
              <tr><td>Workflow editor</td><td>yes (through Java Web Start)</td><td>no</td><td>no</td><td>yes (through Java Web Start)</td></tr>
              <tr><td>Web interface</td><td>portlets</td><td>portlets</td><td>portlets and Adobe Flex</td><td>portlets</td></tr>
              <tr><td>VO certificate management</td><td>no or difficult</td><td>no or difficult</td><td>no or difficult</td><td>easy (but insecure)</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>P-GRADE Portal was chosen as the base for our work
because it is Open Source, it supports gLite 3.1, it has a graphical
editor for workflows, and it is able to manage the certificates that
each user has for each Virtual Organization (VO) of the grid.</p>
        <p>2.1.2 Portal Technology. Servlets are pieces of Java
code that run on the server side in response to a web client request.
The result of that execution is a web page. The concept of a “portlet”
is closely related to that of a servlet. The main difference is that, while
a servlet generates a whole web page, a portlet generates only one
part of a given page. Portlets are defined in the Sun specification
JSR 168.</p>
        <p>From the user’s point of view, a portlet is often represented
visually by a window embedded in a web page, with icons to get
help, to maximize, etc. These are called “modes” and can be specified
in an XML configuration file. Portlets can be grouped into tabs; this
approach allows the creation of separate sections in a web page,
giving more versatility in composition both to the administrator of a
site and to the user.</p>
        <p>Portlets are usually written in Java and JSP, with some XML
configuration files, and are deployed in a portlet container at the server
using a build tool such as Jakarta Ant.</p>
        <p>P-GRADE Portal is implemented using servlets and portlets. In
fact, P-GRADE Portal can be considered a set of portlets
distributed together with their dependencies. Those portlets are mainly
focused on grid computing. There are portlets to send jobs and
to manage certificates (using MyProxy), files in Storage Elements
(SE), and user accounts. They are deployed in a container called
GridSphere. P-GRADE Portal uses GridSphere
2.2.10, which is not the latest version and is
not well supported. GridSphere, in turn, uses Apache Tomcat
(http://tomcat.apache.org) as its servlet container.</p>
        <p>That platform is adequate as a basic working grid environment
and as a base for adding new portlets that extend the features of the web portal.
There is also a workflow editor, shown in the bottom-right part of
Fig. 1.</p>
        <p>A group of interconnected tasks is called a workflow. In
a workflow, the outputs of one or more tasks are the inputs of one or
more other tasks. In our case, those tasks are mainly jobs that are sent to
a grid or to a local cluster. Workflows can be depicted
graphically, which is very useful because it makes it easy to choose where each
job is executed.</p>
        <p>Each job in the workflow may need different
resources; for example, one job may need HTC resources and another
HPC resources. In this case, the job that needs HTC can be run on the
grid and the other on a local cluster. The user can indicate
this in the workflow editor, where other job parameters,
as well as the relations between jobs, can also be changed.</p>
        <p>The P-GRADE Portal has a graphical interface for workflows,
as shown in Fig. 1. The workflow is designed in a
workflow editor, which is implemented in Java
and runs on the client machine using Java Web Start
(http://java.sun.com/javase/technologies/desktop/javawebstart) technology.</p>
        <p>Each node in the graph, represented by an orange square,
corresponds to a job. Each job can be sent to a different grid,
and the Computing Element (CE) can be indicated explicitly.
The workflow implementation is based on a DAG (Directed
Acyclic Graph) and uses Condor (http://www.cs.wisc.edu/condor)
to schedule the submission of those jobs.</p>
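        <p>To illustrate why a DAG is a convenient model, the following minimal sketch orders a chain of jobs so that each one is submitted only after the jobs it depends on. It is not the P-GRADE or Condor implementation; the job names and the helper submission_order are hypothetical and chosen only for illustration.</p>
        <preformat>
```python
from graphlib import TopologicalSorter

# Hypothetical three-job chain: each key is a job and each value is the
# set of jobs whose output it needs (its predecessors in the DAG).
workflow = {
    "vacuum": set(),
    "water": {"vacuum"},
    "pr": {"water"},
}


def submission_order(dag):
    """Return the jobs in an order that respects every dependency."""
    # static_order() raises graphlib.CycleError if the graph has a cycle,
    # which is exactly the condition a DAG must rule out.
    return list(TopologicalSorter(dag).static_order())


order = submission_order(workflow)
print(order)
```
        </preformat>
        <p>Because the graph is acyclic, such an order always exists; a scheduler can then dispatch each job to its own platform once its predecessors have finished.</p>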
        <p>This workflow application also facilitates the implementation and
execution of parametric studies. In that case, the user specifies which
input parameters of a job are to be varied for the study. These parameters,
including the interval over which each parameter
varies and its increment or decrement, are specified in a node of the
graph (one of the orange squares that represent a job). One
job output is generated for each parameter value. A “collector” gathers
all of these outputs; it is also represented by an
orange square, configured as a collector, to which every output
of every execution of the job is connected. This method
makes it possible to process all the outputs of the parametric study and obtain
a single final result for the workflow.</p>
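        <p>The generator and collector pattern described above can be sketched as follows. This is only an illustration under stated assumptions: the functions parameter_values, run_job and collector, the temperature parameter and the fake energy value are all hypothetical and are not part of P-GRADE Portal.</p>
        <preformat>
```python
def parameter_values(start, stop, step):
    """Values from start to stop (inclusive) in increments of step."""
    if step == 0:
        raise ValueError("step must be non-zero")
    count = int((stop - start) / step) + 1
    return [start + i * step for i in range(max(count, 0))]


def run_job(temperature):
    # Placeholder for one job execution; a real job would run a simulation.
    return {"temperature": temperature, "energy": -2.0 * temperature}


def collector(outputs):
    # Gathers every job output and reduces them to one final result,
    # here the run with the lowest (fake) energy.
    return min(outputs, key=lambda out: out["energy"])


outputs = [run_job(t) for t in parameter_values(280, 320, 10)]
result = collector(outputs)
```
        </preformat>
        <p>Each generated value corresponds to one execution of the job, and the collector is the single point where all outputs meet.</p>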
        <p>The second step is for the portal administrator to create an SSH
public key for the portal and give that key to each user.</p>
        <p>Finally, each user must copy the SSH public key of the portal into
the file .ssh/authorized_keys of their account on the local cluster.
Users must trust the portal administrator, because all their remote
accounts are exposed to the portal, although not to the other users of the
portal. It is critical that the server hosting the portal be
very secure, because it is able to access all remote user
accounts. This procedure is represented in Fig. 2.</p>
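        <p>The last step can be automated on the user side. The sketch below appends a public key to an authorized_keys file only if it is not already present. The function authorize_key and the placeholder key are assumptions made for illustration; the demonstration writes to a temporary directory rather than to a real account.</p>
        <preformat>
```python
import tempfile
from pathlib import Path


def authorize_key(authorized_keys, public_key):
    """Append public_key to authorized_keys unless it is already listed."""
    path = Path(authorized_keys)
    key = public_key.strip()
    text = path.read_text() if path.exists() else ""
    if key in (line.strip() for line in text.splitlines()):
        return False  # already authorized, nothing to do
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a") as handle:
        if text and not text.endswith("\n"):
            handle.write("\n")  # keep one key per line
        handle.write(key + "\n")
    # With StrictModes, sshd rejects key files writable by other users.
    path.chmod(0o600)
    return True


# Demonstration in a throwaway directory standing in for the user's home.
with tempfile.TemporaryDirectory() as home:
    target = Path(home) / ".ssh" / "authorized_keys"
    portal_key = "ssh-rsa AAAAB3NzaC1yc2E_PLACEHOLDER portal@example"
    first = authorize_key(target, portal_key)
    second = authorize_key(target, portal_key)
    stored = target.read_text()
```
        </preformat>
        <p>Making the operation idempotent means a user can repeat it safely, for example after the portal key is redistributed.</p>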
        <p>To copy files between a local cluster and the grid, and between a local
cluster and the server machine (the “local” option in the editor), a new
syntax has been defined in the workflow editor:</p>
        <p>cluster:[host DN|host IP]:[path to file]</p>
        <p>For each job, the following folder is created in the local cluster:</p>
        <p>The files needed for job execution are copied into this folder.
The files specified by the user in the job ports are therefore sent to
this folder by default, or to another path if one is indicated in the
workflow job definition.</p>
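        <p>A port value written in the cluster syntax introduced above can be split into its host and path parts with a few lines of code. The sketch below is an assumption about how such parsing could look, not the portal's own parser; the names ClusterPath and parse_cluster_port are hypothetical, and hosts that themselves contain a colon (such as IPv6 literals) are not handled.</p>
        <preformat>
```python
from typing import NamedTuple


class ClusterPath(NamedTuple):
    host: str
    path: str


def parse_cluster_port(spec):
    """Split 'cluster:host:path' into its host and path parts."""
    prefix, sep, rest = spec.partition(":")
    if prefix != "cluster" or not sep:
        raise ValueError(f"not a cluster port: {spec!r}")
    # The first remaining colon separates the host from the file path.
    host, sep, path = rest.partition(":")
    if not host or not sep or not path:
        raise ValueError(f"missing host or path: {spec!r}")
    return ClusterPath(host, path)


port = parse_cluster_port(
    "cluster:ft.cesga.es:gfluxo/Gromacs_DEMO/Gromacs_vacuum/1LW9.top"
)
```
        </preformat>
        <p>Rejecting malformed values early gives the user a clear error before any file transfer is attempted.</p>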
        <p>For each job, the user can indicate the resources needed
for execution on the local cluster. Those resources are specified by
directives in the user script. These directives are parsed by a
script configured by the portal administrator. The administrator also
has to configure a file that maps each portal user to a local cluster user.</p>
        <p>One limitation of the P-GRADE Portal workflow editor is
that it is not possible to link ports between jobs if one job goes to a
local cluster, another goes to the grid, and the shared files do not
pass through the portal server. In this case a small reference file
(a semaphore), which is passed through the portal server, has to be
used.</p>
        <p>For the implementation, the files ‘wkf_pre_CLUSTER-GRID.sh’,
‘wkf_post_CLUSTER-GRID.sh’ and ‘ff_CLUSTER-GRID.sh’ have
been created in P-GRADE Portal. A diagram of the files and
functions modified or created is shown in Fig. 3.</p>
        <p>The input files are managed by P-GRADE Portal through the
‘wkf_pre.sh’ script, which calls functions located in
more specific scripts. In the case of a local cluster, those functions
are ‘local_input_copy_CLUSTER-GRID’, for unlinked ports, and
‘channel_input_copy_CLUSTER-GRID’, for linked ports. They are
located in the file ‘wkf_pre_CLUSTER-GRID.sh’.</p>
        <p>The output files are managed by P-GRADE Portal through the
‘wkf_post.sh’ script. The function ‘local_copy_output_CLUSTER-GRID’
has been added to the file ‘wkf_post_CLUSTER-GRID.sh’ to be called by
‘wkf_post.sh’.</p>
        <p>‘ff_CLUSTER-GRID.sh’ has been created to isolate the more
specific functions, those that depend most on the technology used to
implement file management. Those functions are
‘copy_portal2cluster’, ‘copy_cluster2portal’, ‘copy_cluster2cluster’,
‘copy_lfn2cluster’, ‘copy_gsiftp2cluster’ and ‘copy_cluster2gsiftp’.
Currently, the only implementation uses LCG for dealing with
the Grid platform and SSH (the ‘scp’ and ‘sshfs’ commands) for the local
cluster.</p>
        <p>Fig. 4. Platform architecture</p>
        <p>Each job is executed on one of the platforms: Grid or local cluster.
The part of P-GRADE Portal that sends jobs to the Grid has not been
altered. The script ‘wkf_CLUSTER-GRID.sh’ has been created to
send jobs to the local cluster.</p>
        <p>DRMAA is a standard API for the submission and control
of jobs to a Distributed Resource Management System (DRMS). Two programs
were developed using DRMAA: one to send jobs and another to monitor
job status. Those programs were compiled and executed on
the two GridEngine (GE) platforms present at CESGA: SVG
(http://www.cesga.es/content/view/409/42/lang,en) and Finis Terrae
(http://www.cesga.es/content/view/917/115/lang,en).</p>
        <p>Those two programs are permanently stored in the portal. They
are sent to the local cluster, using the same scripts discussed in the
previous subsection, when the user job is submitted for execution. Then they
are executed through SSH by a portal script, ‘wkf_CLUSTER-GRID.sh’.
To do that, the required remote environment
variables must be loaded before the DRMAA programs are executed.</p>
        <p>The user can explicitly specify some requirements for the local
cluster execution of each job, such as memory size or CPU
time. Those resources are indicated in the user’s main script
through directives with the following syntax:</p>
        <p># gfluxo: [resource name] [resource value]</p>
        <p>Those scripts are executed by the portal through Condor. Condor
is integrated in P-GRADE Portal and runs on the portal
server. When a user submits a job, the job is submitted to Condor
and the scripts are executed using the files created by the workflow
editor and the files uploaded by the user as input.</p>
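        <p>The gfluxo directives can be extracted from a job script with a short parser. The sketch below is an illustrative assumption, not the script actually configured by the portal administrator; the function parse_gfluxo_directives and the resource names memory and cpu_time are hypothetical.</p>
        <preformat>
```python
def parse_gfluxo_directives(script_text):
    """Collect '# gfluxo: name value' directives from a job script."""
    resources = {}
    for line in script_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("#"):
            continue  # only comment lines can carry directives
        body = stripped[1:].strip()
        if not body.startswith("gfluxo:"):
            continue  # ordinary comment or shebang line
        fields = body[len("gfluxo:"):].split(None, 1)
        if len(fields) == 2:
            resources[fields[0]] = fields[1].strip()
    return resources


# Hypothetical user script; the resource names are made up for the example.
script = """#!/bin/bash
# gfluxo: memory 2G
# gfluxo: cpu_time 01:00:00
mdrun -s topol.tpr
"""
resources = parse_gfluxo_directives(script)
```
        </preformat>
        <p>Because the directives are shell comments, the same script remains directly executable on the cluster.</p>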
        <p>An architecture diagram showing the pre/post file management
and job submission procedures is presented in Fig. 4. All
the SSH-based communication channels (SCP/SSHFS), linked
with the specific tasks and the different functions developed in the
‘ff_CLUSTER-GRID.sh’ file, are also shown.</p>
        <p>2.2.2 Visualization. GridSphere was used as the framework for
the development of a specific portlet for the visualization of
GROMACS output. For that, two source files, one in JSP and another
in Java, and the corresponding configuration files were created.
A Jakarta Ant (http://ant.apache.org) script was also developed to deploy
the portlet in GridSphere. Jmol is used for the specialized visualization that
GROMACS output requires. Everything is packaged
in one tar.gz file for easy distribution.</p>
        <p>Jmol is an open-source Java viewer for chemical structures.
JmolApplet, part of Jmol, is a web browser applet that can be integrated
into web pages; in our case it has been integrated into a
GridSphere portlet. It supports several input file formats, but
not the GROMACS format. One popular molecule format
is the Protein Data Bank (PDB) format (http://www.wwpdb.org/docs.html).
The pdb2gmx command of the GROMACS package is used for
the conversion from GROMACS to PDB file
format.</p>
        <p>The use of portlets allows us to easily add new tabs and
windows associated with simulation applications needed by the
user, with applications for file management in a grid, with credential
management, etc. Once a portlet has been implemented, we create
a Jakarta Ant script for its distribution and package everything in an archive. For
installation, the user only has to copy the file, unpack it, and
deploy the portlets in the container (GridSphere, in our case) by
running that script with Jakarta Ant.</p>
        <p>The source code of the Jmol portlet can be downloaded from the
G-Fluxo web site
(http://gfluxo.cesga.es/download/Gromacs/portlet_jmol_v0.1.tar.gz).</p>
        <p>The main class is ‘UiJmol’, a subclass of the
GridSphere ‘ActionPortlet’ class. The ‘UiJmol’ class calls
‘uiJmol.jsp’, where the main layout of the web page is described.</p>
        <p>The portlet layout consists of a form, used to choose the file to
visualize, and an applet, used to render the molecule.</p>
        <p>The form was developed using three list boxes (ListBoxBean
class). The first list box is used to choose the workflow, the second to
choose the job in that workflow, and the third to choose the molecule
file (output of GROMACS). Each list is obtained by executing a Bash script,
which is run from the Java code when the user clicks on the button
above the corresponding list box. The portlet searches for files only
in the current user’s P-GRADE Portal account; it cannot access
files in other users’ accounts. The Bash script receives the name of the
user account, which is obtained through the following Java method:</p>
        <p>event.getActionRequest().getRemoteUser()</p>
        <p>To clear a list box, a global variable is needed so that
the statement lbb_workflows.clear() is executed only if the
ListWorkflows method has already been called since the portlet was
initialized. Each time a button is pressed and an element is selected
in the previous list box, the subsequent list boxes are cleared.</p>
        <p>The visualization method creates the HTML and JavaScript code
needed to call the Jmol applet with the path of the file. The file
has to be copied to a path visible from the web browser. The path
chosen was:</p>
        <p>users/[user name]/[workflow name]/[job name]/[molecule file]</p>
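        <p>Since the path is assembled from names chosen in the list boxes, it is worth validating each component before the file is exposed to the browser. The following sketch shows one way to do so; the function molecule_path and the example names are hypothetical and do not come from the portlet source.</p>
        <preformat>
```python
from pathlib import PurePosixPath


def molecule_path(user, workflow, job, molecule_file):
    """Build users/[user]/[workflow]/[job]/[file], rejecting unsafe parts."""
    parts = [user, workflow, job, molecule_file]
    for part in parts:
        # Each component must be a single plain name, so that a value such
        # as '../other_user' cannot escape the user's own directory.
        if not part or part in (".", "..") or "/" in part:
            raise ValueError(f"unsafe path component: {part!r}")
    return str(PurePosixPath("users", *parts))


path = molecule_path("alice", "gromacs_demo", "water", "1LW9-water.pdb")
```
        </preformat>
        <p>This keeps the restriction stated above, that a user can only reach files in their own account, enforceable at the point where the path is built.</p>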
        <p>For distribution, the tar.gz file has to be unpacked
in the ‘gridsphere-2.2.10/projects’ directory (relative to the root
P-GRADE Portal directory). The GridSphere directory is deleted by
default during the P-GRADE Portal installation, so
the P-GRADE Portal installation script has to be modified to avoid this.</p>
        <p>The deployment is done with a Jakarta Ant script. To deploy
the portlet, a portal administrator has to execute the following
command in the directory
‘gridsphere-2.2.10/projects/portlet_jmol_v0.1’:</p>
        <p>ant install</p>
        <p>The deployment script is in the file ‘build.xml’. That script
compiles the Java and JSP source code described above,
creates the JAR files and copies all files to the ‘tomcat/webapps/jmol’
directory.</p>
        <p>The Jmol library is contained in the file ‘jmol-11.4.4-full.tar.gz’.
That file is unpacked into the ‘/webapps/gridsphere’ directory. This
location is needed because it is the root of the P-GRADE Portal
as seen by the user’s web browser.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3 CASE STUDY</title>
      <p>GROMACS, as stated before, was chosen as the test application
for the case study. The selection criteria were:
• The application should demand very different platforms
(Grid and local cluster) to run efficiently, depending on the
input case.
• The application must have a very active development community,
so that utilities and visualization tools related to the
application are available to be used and integrated into the portal.
• It is desirable that the application have an Open Source license, so
that there is no limitation on its use.</p>
      <p>In order to test the features added to the P-GRADE Portal
under the G-Fluxo project, several GROMACS workflows made up
of a chain of three jobs have been run. Each job is run on a different
platform (two clusters, Finis Terrae and SVG, and the EGEE Grid
infrastructure (http://eu-egee.org)).</p>
      <p>These workflows take advantage of all the extra functionalities
described previously:
• File management between different platforms, taking into
account the dependencies coming from the workflow definition.
• Job submission using the gLite middleware and the DRMAA
implementation included in the DRMS used in both local
clusters (GridEngine, http://gridengine.sunsource.net).
• Output visualization, including all the intermediate results
coming from all the jobs that belong to the workflow.</p>
      <p>The developed workflows are based on one of the available
GROMACS tutorials (http://md.chem.rug.nl/education/mdcourse/index.html).
The three jobs are:</p>
      <sec id="sec-3-1">
        <p>1. Vacuum: Energy minimization of the structure.
2. Water: Energy minimization of the solvated system.
3. PR: Relaxation of solvent and hydrogen atom positions: position restrained Molecular Dynamics.</p>
        <p>These jobs must be executed in this order. The workflows
used as test cases are shown in Figs. 5 and 6.</p>
        <p>In the first case (Fig. 5) the workflow is executed on the Finis
Terrae cluster (1st job), the SVGD cluster (2nd job) and the EGEE
grid (3rd job). The relations between jobs are established using
a single link (a semaphore) that defines the job dependency chain.
The user must set all the job ports appropriately, following
the syntax described in Section 2 and taking into account
where each job is executed, so that the portal is aware of all the
information needed for file transfers. This is needed to overcome a
limitation of the P-GRADE Portal file transfer management
system, which does not allow direct file transfer between different
platforms (different Grid middlewares, in P-GRADE Portal
nomenclature). Future work involving deeper changes to
P-GRADE Portal will be needed in this respect. In the second case
(Fig. 6) only the cluster platform is used, although it involves two
different DRMS (Finis Terrae and SVG). Here many links
are established between the jobs: the output files of the
first job become input files of the others. In this case (Fig. 7) the
path of each file (output port) needs to be specified only once, and it is
automatically assigned to the input port of the other job.</p>
        <p>[Figs. 5 to 7: workflow diagrams and port definitions for the Vacuum job (Ports 3 to 6) and the Water job (Ports 3 and 4). File paths appearing in the port definitions:
cluster:ft.cesga.es:gfluxo/Gromacs_DEMO/Gromacs_vacuum/1LW9-EM-vacuum.gro
cluster:ft.cesga.es:gfluxo/Gromacs_DEMO/Gromacs_vacuum/1LW9.top
cluster:ft.cesga.es:gfluxo/Gromacs_DEMO/Gromacs_vacuum/minim.mdp
cluster:ft.cesga.es:gfluxo/Gromacs_DEMO/Gromacs_vacuum/posre.itp
cluster:svgd.cesga.es:gfluxo/Gromacs_DEMO/Gromacs_water/1LW9.top
cluster:svgd.cesga.es:gfluxo/Gromacs_DEMO/Gromacs_water/1LW9-water.gro]</p>
        <p>Fig. 8 shows three screenshots of the Jmol visualization
portlet developed in the G-Fluxo project. Using this
portlet, the PDB file produced by the calculations can be easily
visualized without waiting for the completion of the
whole workflow.</p>
        <p>All the functionalities available in JmolApplet are also present
in the portlet. Moreover, additional Jmol functionalities such as the
RasMol/Chime scripting language and the JavaScript support library
can be integrated in the portlet, allowing the development of highly
specific visualization portlets tailored to a particular simulation.</p>
        <sec id="sec-3-1-1">
          <title>3.1 Integration in COMPCHEM</title>
          <p>
            In order to test the porting procedure of the GROMACS
workflow and Jmol portlet packages developed at CESGA,
both packages have been installed and tested in the P-GRADE
Portal deployed by the COMPCHEM VO. The COMPCHEM
VO (http://compchem.unipg.it) was created by a group of
molecular and material science laboratories committed to adapting their
computer codes to run on the EGEE production grid infrastructure.
More specifically, the design and implementation of grid
empowered versions of quantum reactive scattering codes, as
well as of advanced visualization tools devoted to the study of
the behavior of complex molecular systems (
            <xref ref-type="bibr" rid="ref1">Gervasi and Laganà
(2004)</xref>
            ;
            <xref ref-type="bibr" rid="ref2">Gervasi et al. (2006)</xref>
            ), is the task of the QDYN
and ELAMS working groups of the European Cooperation in
Science and Technology (COST) Action D37, called “Grid-Chem”
(http://www.cost.esf.org/index.php?id=189&amp;action_number=D37).
          </p>
          <p>Thanks to a recent Short Term Scientific Mission (STSM)
sponsored by COST Action D37, the GROMACS package and
the Jmol applet have been implemented in the COMPCHEM
P-GRADE Portal and made available to the Molecular Science
Community supported by the VO. The workflow used as a case
study is one of the jobs already discussed, as shown in Fig. 8:
• Water: Energy minimization of the solvated system.</p>
          <p>Thanks to the same STSM, a set of detailed wiki pages on
how to install and run GROMACS in the COMPCHEM P-GRADE
Portal has been written and made available to the COMPCHEM
Community at the following link: http://compchem.unipg.it/wiki.</p>
          <p>Two GROMACS tutorials have been written. In the first tutorial
the configuration and run procedure for a single job are described
and summarized in the following steps:
• Grid configuration, including certificates in MyProxy Server.
• Preparing a job.
• Script to GROMACS on CompChem VO, including how
to download the GROMACS application and configure the
environments variables.
• Set up of the input and output files of a job.
• Visualization of GROMACS output, using the Jmol portlet.</p>
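          <p>As a rough illustration of the script step listed above, the following Python sketch prepares the environment variables and the command lines for an energy minimization job. It is not the script distributed with the tutorial: the installation prefix, the <monospace>share/gromacs/top</monospace> library location and all file names are assumptions chosen for the example, and the commands are only assembled, not executed. The <monospace>grompp</monospace> and <monospace>mdrun</monospace> programs and the <monospace>GMXLIB</monospace> variable are those of the GROMACS 4 series.</p>
          <preformat><![CDATA[
```python
import os


def gromacs_environment(prefix, base_env=None):
    """Return a copy of the environment that exposes a GROMACS install.

    `prefix` is a hypothetical directory where the downloaded GROMACS
    package was unpacked on the worker node.
    """
    env = dict(os.environ if base_env is None else base_env)
    for name, value in (("PATH", prefix + "/bin"),
                        ("LD_LIBRARY_PATH", prefix + "/lib")):
        old = env.get(name)
        # Prepend so that the downloaded binaries and libraries win.
        env[name] = value + os.pathsep + old if old else value
    # Assumed location of the force field and topology library.
    env["GMXLIB"] = prefix + "/share/gromacs/top"
    return env


def minimization_commands(name, mdp, structure, topology):
    """Build the two command lines of an energy minimization run."""
    run_input = name + ".tpr"
    return [
        # Preprocess parameters, coordinates and topology into a run input.
        ["grompp", "-f", mdp, "-c", structure, "-p", topology,
         "-o", run_input],
        # -deffnm makes mdrun read <name>.tpr and write <name>.* outputs.
        ["mdrun", "-deffnm", name],
    ]


if __name__ == "__main__":
    # Hypothetical prefix and file names, for illustration only.
    env = gromacs_environment("/opt/gromacs")
    for command in minimization_commands(
            "em", "em.mdp", "solvated.gro", "topol.top"):
        print(" ".join(command))
```
]]></preformat>
          <p>On a worker node each command list could then be passed to <monospace>subprocess.run(command, env=env, check=True)</monospace>, so that a failing step stops the job instead of producing incomplete output files.</p>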
          <p>In the second tutorial a workflow is described. A more complex
GROMACS example, with three simulations linked to one another,
is shown.</p>
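          <p>The dependency structure of such a workflow can be sketched as a small directed acyclic graph (DAG). In the following Python fragment only the energy minimization step comes from the case study; the names of the other two jobs and the assumption that the three simulations form a linear chain are hypothetical. The sketch uses the standard library module <monospace>graphlib</monospace> (Python 3.9 or later) and says nothing about how P-GRADE Portal schedules jobs internally.</p>
          <preformat><![CDATA[
```python
from graphlib import CycleError, TopologicalSorter

# Each job maps to the set of jobs whose output files it consumes.
# Job names other than the energy minimization are hypothetical.
WORKFLOW = {
    "energy_minimization": set(),
    "equilibration": {"energy_minimization"},
    "production": {"equilibration"},
}


def execution_waves(dag):
    """Group jobs into waves; jobs in one wave have no mutual dependency.

    Raises graphlib.CycleError when the graph is not a DAG.
    """
    sorter = TopologicalSorter(dag)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs whose inputs are complete
        waves.append(ready)
        sorter.done(*ready)  # mark them finished to unlock their successors
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(WORKFLOW), start=1):
        print(number, ", ".join(wave))
```
]]></preformat>
          <p>For a linear chain every wave contains a single job. Two simulations that both depend only on the minimized structure would instead appear in the same wave and could be sent to different resources at the same time.</p>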
          <p>A set of video tutorials, made with the gtk-recordMyDesktop
application on Ubuntu 9.04, is also available.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>CONCLUSIONS</title>
      <p>Web portals to create workflows and send jobs to different Grid
infrastructures have been tested. Starting from that, a workflow
environment based on P-GRADE Portal that sends jobs to Grid and
local cluster resources has been developed. As a case study in
BioChemistry, a GROMACS workflow has been created and
tested, and a portlet to visualize molecules has been developed
and integrated into the portal. The output files produced by
the calculations have been successfully visualized in the new
portlet. The GROMACS workflows and the JMOL portlet have also
been successfully implemented in the COMPCHEM computational
environment to test the porting of such applications. This
paper has shown that combining a GridSphere-based portal
like P-GRADE Portal, which facilitates the use of
very different computational platforms, with application-aware
portlet development is flexible
enough to build specialized portals, not only ones devoted to the Life
Science Community. In this scenario, G-Fluxo can meet
computational needs of the e-science community such as:
• Orchestration of the use of many different Computational
infrastructures
• Specific support for applications
• Scientific Simulations Modeling: Implementation of different
methodologies</p>
      <p>
        Additional effort is needed on the local cluster platform
implementation in P-GRADE Portal. In particular, user access and
registration need to be implemented and tested. At present, the SSH
public key authentication in use involves some security
risks, which could be mitigated with a user access and registration
architecture such as the one described in the RETELAB project (
        <xref ref-type="bibr" rid="ref5">Mera
et al. (2009)</xref>
        ).
      </p>
      <p>
        Workflows inherently require applications to communicate with
one another. Developing common data
formats and conversion routines is therefore critical across the different
life science areas. For example, tools like Open Babel (
        <xref ref-type="bibr" rid="ref3">Guha
et al. (2006)</xref>
        ) should be integrated into the G-Fluxo portal. With a
conversion tool, job links can be implemented not only as a file copy
action but also as a format conversion action.
      </p>
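      <p>The idea of a job link that either copies or converts a file can be sketched as follows. This is an illustration under stated assumptions, not part of G-Fluxo: the function and table names are hypothetical, and the built-in converter is a deliberately small stand-in that only handles GROMACS <monospace>.gro</monospace> to XYZ and guesses the element from the first letter of the atom name. In a real portal the conversion would more sensibly be delegated to a tool such as Open Babel.</p>
      <preformat><![CDATA[
```python
import shutil
import tempfile
from pathlib import Path

NM_TO_ANGSTROM = 10.0  # .gro stores nanometres, XYZ stores angstroms


def gro_to_xyz(src, dst):
    """Convert a .gro structure file to XYZ (fixed-column parsing)."""
    lines = Path(src).read_text().splitlines()
    title = lines[0].strip()
    n_atoms = int(lines[1])
    out = [str(n_atoms), title]
    for line in lines[2:2 + n_atoms]:
        atom_name = line[10:15].strip()
        # Crude assumption: element symbol = first letter of the atom name.
        element = atom_name[0]
        x, y, z = (float(line[i:i + 8]) * NM_TO_ANGSTROM
                   for i in (20, 28, 36))
        out.append(f"{element} {x:.3f} {y:.3f} {z:.3f}")
    Path(dst).write_text("\n".join(out) + "\n")


# Hypothetical registry: (source suffix, target suffix) -> converter.
CONVERTERS = {(".gro", ".xyz"): gro_to_xyz}


def transfer(src, dst):
    """Job link: plain copy when formats match, conversion otherwise."""
    src, dst = Path(src), Path(dst)
    key = (src.suffix, dst.suffix)
    if key[0] == key[1]:
        shutil.copyfile(src, dst)
    elif key in CONVERTERS:
        CONVERTERS[key](src, dst)
    else:
        raise ValueError(f"no converter registered for {key[0]} -> {key[1]}")


def demo():
    """Copy and convert a one-atom .gro file inside a temporary directory."""
    atom = "%5d%-5s%5s%5d%8.3f%8.3f%8.3f"
    gro_text = "\n".join([
        "Single water oxygen",
        "    1",
        atom % (1, "SOL", "OW", 1, 0.126, 1.624, 1.679),
        "   1.86206   1.86206   1.86206",
    ]) + "\n"
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "em.gro"
        source.write_text(gro_text)
        transfer(source, Path(tmp) / "copy.gro")  # same format: file copy
        transfer(source, Path(tmp) / "em.xyz")    # other format: conversion
        copied = (Path(tmp) / "copy.gro").read_text() == gro_text
        return copied, (Path(tmp) / "em.xyz").read_text()


if __name__ == "__main__":
    print(demo()[1])
```
]]></preformat>
      <p>Keeping the converters in a lookup table means that the link itself does not change when a new format pair is supported; only a new entry is registered.</p>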
      <p>Workflow support will also be improved: aside from DAG workflow
definitions, more complex workflow
languages will be supported. A particular effort is focused on linking
P-GRADE Portal DAG workflow submission with the Nova Bonita
Console (http://www.bonitasoft.com), an open source workflow
solution that supports the standard XPDL process definition language
(http://www.wfmc.org/xpdl.html).</p>
    </sec>
    <sec id="sec-5">
      <title>ACKNOWLEDGMENTS</title>
      <p>We would like to thank all the people who created P-GRADE
Portal and released it under an open source license so that everyone can
contribute to its development. This work is supported by Xunta
de Galicia under the project G-Fluxo (07SIN001CT). The grid
infrastructures used are those of the FORMIGA (07TIC01CT) and
EGEE-III (INFSO-RI-222667) projects.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Gervasi</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Laganà</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2004</year>
          ).
          <article-title>SIMBEX: a portal for the a priori simulation of crossed beam experiments</article-title>
          .
          <source>Future Generation Computer Systems</source>
          ,
          <volume>20</volume>
          ,
          <fpage>703</fpage>
          -
          <lpage>715</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Gervasi</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tasso</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Laganà</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2006</year>
          ).
          <article-title>Immersive molecular virtual reality based on X3D and web services</article-title>
          .
          <source>Lecture Notes in Computer Science</source>
          ,
          <volume>3980</volume>
          ,
          <fpage>212</fpage>
          -
          <lpage>221</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Guha</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Howard</surname>
            ,
            <given-names>M. T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hutchison</surname>
            ,
            <given-names>G. R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Murray-Rust</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rzepa</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Steinbeck</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wegner</surname>
            ,
            <given-names>J. K.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Willighagen</surname>
            ,
            <given-names>E. L.</given-names>
          </string-name>
          (
          <year>2006</year>
          ).
          <article-title>The Blue Obelisk-Interoperability in Chemical Informatics</article-title>
          .
          <source>Journal of Chemical Information and Modeling</source>
          ,
          <volume>46</volume>
          ,
          <fpage>991</fpage>
          -
          <lpage>998</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Kacsuk</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Sipos</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          (
          <year>2005</year>
          ).
          <article-title>Multi-Grid, Multi-User Workflows in the P-GRADE Portal</article-title>
          .
          <source>Journal of Grid Computing</source>
          ,
          <volume>3</volume>
          (
          <issue>3-4</issue>
          ),
          <fpage>221</fpage>
          -
          <lpage>238</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Mera</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cotos</surname>
            ,
            <given-names>J. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Varela</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cotelo</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>López</surname>
            ,
            <given-names>J. I.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>An integration of several technologies in the architecture definition and deployment of a geospatial grid web portal</article-title>
          .
          In
          <source>WORLDCOMP'09 - The 2009 World Congress in Computer Science, Computer Engineering, and Applied Computing</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Tröger</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rajic</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haas</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Domagalski</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2007</year>
          ).
          <article-title>Standardization of an API for distributed resource management systems</article-title>
          .
          In
          <source>Proceedings of the Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2007)</source>
          , pages
          <fpage>619</fpage>
          -
          <lpage>626</lpage>
          , Rio de Janeiro, Brazil.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>van der Spoel</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lindahl</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hess</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Groenhof</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mark</surname>
            ,
            <given-names>A. E.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Berendsen</surname>
            ,
            <given-names>H. J. C.</given-names>
          </string-name>
          (
          <year>2005</year>
          ).
          <article-title>GROMACS: Fast, flexible, and free</article-title>
          .
          <source>J. Comp. Chem.</source>
          ,
          <volume>26</volume>
          ,
          <fpage>1701</fpage>
          -
          <lpage>1718</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>