=Paper= {{Paper |id=None |storemode=property |title=Recipes 2.0: Building for Today and Tomorrow |pdfUrl=https://ceur-ws.org/Vol-993/paper16.pdf |volume=Vol-993 |dblpUrl=https://dblp.org/rec/conf/iwsg/DooleyH13 }} ==Recipes 2.0: Building for Today and Tomorrow== https://ceur-ws.org/Vol-993/paper16.pdf
        Recipes 2.0: Building for Today and Tomorrow
                        Rion Dooley                                                        Matthew R. Hanlon
            Texas Advanced Computing Center                                        Texas Advanced Computing Center
             The University of Texas at Austin                                      The University of Texas at Austin
                       Austin, US                                                             Austin, US


    Abstract—The history of science gateway development has, in     multiple investments of time writing system-specific code
many ways, been a story of the “Haves” vs. the “Have-nots.”         when the web only requires a single investment.
Large infrastructure projects led the way, building thick client
portals to provide coherent interfaces to an incoherent                 Despite the cost and complexity, projects that could afford
environment. Contrast this with the way the modern web is           to make the investment in a portal did so gladly because the
designed using light, front end components and outsourcing          end product was well worth the cost. Portals brought cohesion
much of the heavy lifting to a mash-up of REST APIs, and it is      to complicated infrastructure and made computational science
easy to see why modern web applications can be prototyped and       accessible to researchers without computer science degrees.
refined into stable products in the time it previously took thick   They pulled the focus away from the machines and put it back
client portals to do an initial release. This paper argues that a   onto the science.
“build for today” philosophy can lead to the rapid development
of science gateways to serve the “Have-nots.” Also presented is a       Portals such as Cipres [1], GridChem [2], UltraScan [3],
set of responsive front end components built on top of the iPlant   Galaxy [4], and NanoHub [5], just to name a few examples in
Foundation API that provide the boilerplate for rapid               the United States, continue to provide tremendous value to
development of lightweight science gateways using only HTML,        their user communities. The researchers using these portals
JavaScript, and CSS. Using these components, developers can         have made discoveries leading to hundreds of published
easily stand up new gateways or quickly add new functionality to    papers, multiple theses and dissertations, and insights that
existing ones.                                                      would have taken significantly longer to realize, if at all. One
                                                                    cannot deny the value of such portals in today’s scientific
  Keywords— Science Gateway, REST, API, web service,                process.
AGAVE, HTML5, JavaScript, web
                                                                        The challenge portal-driven science faces is that for every
                                                                    scientist that has a portal like Cipres at their disposal there are
                      I.    INTRODUCTION
                                                                    hundreds more in the same domain who do not. No portal can
    The history of science gateway development has, in many         meet the needs of everyone. Successful portals find their niche
ways, been a story of the “Haves” versus the “Have-nots.”           and focus on providing value to the researchers in that niche.
Large infrastructure projects led the way, building thick client    Inherent in the design of a successful portal is the realization
portals to piece together incongruent service stacks and provide    that it cannot and will not meet the needs of the vast majority
cohesion to an incoherent environment. The field was                of scientists who could otherwise derive value from similar
dominated so thoroughly by these heavyweight portals that the       tools. Thus, even within the highly technical research
terms portal and gateway became interchangeable. A gateway          landscape there is a digital divide [6] between those who have
was no longer just a means of access, it was an ecosystem of        advanced portal technology to facilitate their work and those
moving parts that all had to be managed and maintained over         who do not.
time for the gateway to work. The concept of modular design
became a relative term. If one could take a component out of            Exact numbers are difficult to obtain, but a rough
one monolithic framework instance and add it to another             approximation is possible. The National Science Foundation’s
monolithic framework instance, then the component was               Science and Engineering Indicator Report for 2012 (SEI) states
modular. Disregard the background processes, supporting             that as of 2010, the US Science and Engineering (S&E)
services, and database that needed replication in order for the     workforce is 6.65 million people [7]. Of them, 31% describe
module to work. If the UI could be reused, the component was        research and development (R&D) as a major work activity. If
considered modular.                                                 we consider only those with doctorates, 12% of those who
                                                                    describe R&D as a major work activity remain. This indicates
    The resources required to build and maintain such portals       that there are at least 247,000 PhD level workers in S&E
made finding portals with long-term success rare. Whereas at        actively conducting research in the US. Add to this the
one time portals were built as thick desktop clients, one of the    estimated 100,000 medical researchers in the US according to
reasons that portals gravitated from the desktop to the web was     the Bureau of Labor Statistics and we come to a lower
the ongoing cost of maintaining software on multiple operating      bound of 347,000 for the number of researchers who could be
systems. Even applications written entirely in Java require         impacted by portal technologies [8].
some platform-specific attention. That means multiple sets of
unit tests, multiple testing environments, and most importantly,       Looking again at SEI, we see a reasonably proportional
                                                                    investment in R&D across US S&E companies of roughly 6%.
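The lower bound quoted above is simple arithmetic, and a minimal sketch reproducing it is shown here. The input figures are the ones cited in the text; the constant and function names are illustrative assumptions, and integer percentages are used only to keep the arithmetic exact.

```python
# Figures quoted in the text (SEI 2012 and the Bureau of Labor Statistics).
SE_WORKFORCE = 6_650_000        # US S&E workforce as of 2010
RD_MAJOR_ACTIVITY_PCT = 31      # % describing R&D as a major work activity
DOCTORATE_PCT = 12              # % of those R&D workers who hold doctorates
MEDICAL_RESEARCHERS = 100_000   # estimated US medical researchers


def researcher_lower_bound():
    """Return (PhD-level R&D workers, lower bound including medical researchers)."""
    # Two percentages are applied in sequence, so divide by 100 * 100.
    phd_rd = SE_WORKFORCE * RD_MAJOR_ACTIVITY_PCT * DOCTORATE_PCT // 10_000
    return phd_rd, phd_rd + MEDICAL_RESEARCHERS


phd_rd_workers, lower_bound = researcher_lower_bound()
print(f"PhD-level R&D workers: {phd_rd_workers:,}")
print(f"Lower bound on researchers: {lower_bound:,}")
```

The unrounded results are 247,380 and 347,380, which the text rounds down to 247,000 and 347,000.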
Given that the Pareto Principle applies to revenue distribution      release of GridChem provided federated identity management,
among businesses, we can infer an 80/20 split among industrial       job tracking, system monitoring, scheduling, enforcement of
researchers [9], with 20% having access to the latest high           proprietary software license agreements, distributed account
technology tools to perform their research and 80% utilizing         management, large data management, full experiment
effective, but cost-restricted technologies. In academia, SEI        reproducibility, and integration with application codes
shows that the top 100 spending universities spent 80% of the        installed on the user’s local system. Many of the features took
academic R&D money in the US. This is a significantly more           a significant amount of time to build, which pushed back the
lopsided ratio, but as a lower bound, the Pareto Principle holds     first release of the software by nearly a year. However, after
for academic research as well. Thus, it is reasonable to assume
                                                                     its first 3 years in production GridChem had enabled 500 plus
at least an 80/20 split between the haves and the have-nots
                                                                     researchers to publish over 60 papers and complete 6
across US R&D in both sectors today, indicating there are at
least 277,000 underserved researchers in the US alone.               dissertations. The software was used as a teaching tool in
                                                                     undergraduate chemistry classes at The Ohio State University,
                                                                     the University of Illinois, and the University of Kentucky to
                II.   BUILDING FOR TOMORROW                          expose hundreds of students each semester to computational
   How does one go about reaching the 277K scientists on the         chemistry. The value of GridChem is obvious; however, that
other side of the digital divide? Raising taxes to build 10,000      value came at an up-front cost of 6 man-years of development
portals is not realistic. It also does not address the               and $2.7M to provide enough features to
fundamentally deeper issue of utility. That portals provide          simultaneously support undergraduate students and full
value to their users is well documented [10][11][12]. What           professors alike. Further operation led to another $1M in
value they provide and at what cost are less well-documented         funding to support workflow integration and expanded support
questions. We look at 5 portals from the XSEDE Gateways              for determining appropriate parameters for use in different
Program [13] as short case studies.                                  experiments.

    Galaxy is an open, web-based platform for data intensive             The Cipres Science Gateway is a public resource for
biomedical research. Scientists can download a copy of               inference of large phylogenetic trees. As of this writing,
Galaxy for private use or they can use the hosted Galaxy             Cipres exposes 30 different tools for use on a preconfigured
instance, often called Galaxy Main. The Galaxy Main portal           set of systems ranging from large shared compute clusters to
contains over 2500 application codes in its “Shed” that users        private virtual machines. Users access these tools through a
can leverage for their work. Historically, the vast majority of      form-driven web interface. The process of developing Cipres
users select a small number of codes that they use for all their     included building multiple interfaces for each application, job
work. In 2012, users ran over 100k jobs a month through              scheduling heuristics, data management, accounting systems,
Galaxy Main. In addition to application registration and job         identity management, and integration with multiple
submission, Galaxy also supports visualization and data              infrastructure providers. These features took a significant
publication. Both are popular features, but neither is the           amount of time, $4.5M in funding from NSF, and a very
primary focus of the portal. Does that mean that they were a         talented team of programmers to develop. The result of that
waste of time? No. Galaxy Main serves over 28k users. There          work was a wildly successful portal. Cipres now serves over
are many other features built into Galaxy, but the point of this     700 users and has been used to run nearly 100k simulations
observation is that as a portal, Galaxy casts a wide net and         burning over 15M compute hours. After 18 months in
tries to provide something of value to every one of its users. The   production, Cipres’ usage was outgrowing its infrastructure.
price of doing so is added complexity, greater development           Due to the heavyweight nature of the infrastructure it took
costs, and a larger investment in supporting infrastructure to       another year of development and $1.5M in funding from NSF
run the application. Initially funded by two awards totaling         to allow them to scale out to other systems and move away
just under $1.4M in 2006 from the National Science                   from a community account model. While growth is a common
Foundation (NSF), the additional use cases necessitated              problem of success, this particular problem came at the end of
another round of funding totaling $1.1M from NSF. Recently,          the project’s original funding. Had it not been for the talent
to support the expanding user community and support                  and passion of the development team, Cipres would not have
different resource utilization patterns, another round of            been able to address its growing pains and, as such, would
funding totaling $5.8M was obtained from the National Institutes     have stalled until the next round of funding arrived.
of Health to carry the project through 2018. Even for a
successful portal like Galaxy Main, maintaining continuous               NanoHub is a web application built upon the Joomla CMS
funding and retaining talent are ongoing concerns.                   [14] and designed to support nanotechnology research and
                                                                     education. It provides over 270 simulation tools, 3800
   GridChem is a desktop application supporting the                  seminars, tutorials, and teaching materials, 200 distinct user
computational chemistry community. Its mission is to enable          groups, and a mature workflow engine called Pegasus, which
computational and experimental scientists to do more                 supports job execution across heterogeneous systems. Behind
computational chemistry by providing capability computing            NanoHub lies a series of web services, command line tools, a
resources and services at their fingertips. To that end, the first   full CMS, and an application-authoring tool. The portal as a
whole was built to support a large community and it does so        The development of the XUP is a continuation of the previous
very well. In 2010, NanoHub saw 10k users run 380k                 5 years of development of the TeraGrid User Portal [18]. The
simulations. In 2011, 11k users ran 400k simulations. In 2012,     initial cost of development for the first TeraGrid User Portal
12k users ran 410k simulations. Clearly a lot of people are        was on the order of $800k. Since then another $1.7M has been
doing a lot of work and the growth is cumulative year over         invested in the dedicated team of developers maintaining
year. Such usage indicates that the portal is reaching a           active development on XUP, adding features, addressing user
significant number of people, exposing them to some                issues, and providing the general maintenance required of a
functionality, helping them accomplish a specific task, and a      portal with over 12,000 registered users that supports a variety
percentage are coming back year after year. The numbers are        of user communities within the XSEDE organization, each
impressive, but the behavior is consistent with other portals.     with different needs.
Users come in, find a few tools and/or features of value to
them, and make a routine using those specific tools and/or             In order to provide all of this functionality, the XUP, and
features for the duration of their interaction with the portal.    to a lesser extent the XSEDE website, rely on a suite of
                                                                   services that provide the backend information services. These
    Success comes at a price, and the price of building            services include relational databases, non-relational “NoSQL”
NanoHub was $14M from NSF. Sustaining NanoHub amid                 databases, SOAP and REST web services, flat file parsers, and
rapid growth has been an even more expensive activity. Their       many other services that interact directly with the resources in
latest round of funding is $21.9M from NSF starting in 2013.       the XSEDE CI. The front end is built from many custom
To put that in perspective, NanoHub is a Joomla instance with      developed and specialized portlet applications, as well as out-
a lot of custom plugins and some back-end services to support      of-the-box Liferay portlets. The system works because the
running nanotechnology simulations at an average rate of one       development team has administrative access to the entire
simulation every 78 seconds. Looking at the CMS alone, the         XSEDE infrastructure. They are able to obtain information
site receives 8.5 million hits a month. That is roughly half the   that other gateways simply do not have access to. As a result,
traffic of edublogs.com, the leading educational blog provider     the portlets developed for XUP and the functionality they
in the world with 1.6 million blogs since 2005 [15]. Given         provide cannot easily be replicated simply by copying over the
comparable expenditures and team sizes between the two             portlet code.
organizations, custom development and support of
the back-end infrastructure of NanoHub cost roughly 200%               Each of the above portals is different in focus and function,
more than the total cost of running the website alone.             but they are all successful science gateway projects and
                                                                   provide broad functionality. That functionality is often
    The Extreme Science and Engineering Discovery                  targeted at a small set of users who, for a given portal, will
Environment (XSEDE) [16] is a National Science Foundation          only ever use a subset of the features. The costs of these portals
(NSF) funded national cyberinfrastructure (CI) that provides a     in terms of time and effort are all measured in multiple man-
set of large resources for scientific simulation and analysis.     years and millions of dollars before they ever had a single
The XSEDE User Portal (XUP), led by TACC, is the primary           user. They were designed to accommodate thousands of users
interface for users to XSEDE. It provides user account             when they went live and they made sure they could support a
management, project management, documentation, data                thousand users before they tried to support one. From their
management, and a myriad of other features to help users be        inception they were targeting long-term operational goals
productive on the XSEDE CI. It was built on the Liferay            rather than short-term results. To be clear, there is nothing
Portal platform [17], an enterprise open-source Java portal        wrong with that, but it is an important distinction to make. The
framework. The Liferay platform itself provides many features      image of a successful science gateway promoted over the last
out of the box, including a content management system, wikis,      decade was a portal built to support users of tomorrow rather
calendaring, web forms, user forums, security and access           than something that will get the results they need today.
control, and user notifications. Liferay also provides a plugin
development platform for extending the portal with plugins                           III.   BUILDING FOR TODAY
and portlets. As an enterprise portal, Liferay is one of the            The reality for many scientists on the wrong side of the
leaders in the field, but there is a significant financial and     digital divide is that they do not need portals built for
human cost associated with its use. The cost of training,          tomorrow; they need gateways built for today. They are
professional consulting, and enterprise support must be            content using their current workflows, but are willing to adopt
considered.                                                        technologies that make their workflows more efficient, more
                                                                   powerful, or less painful. They will gladly set down Outlook
    XUP has a very different focus than the previously             for Gmail, their departmental FTP server for Dropbox, and the
mentioned science gateways, and it is more of a “destination       server under their desk for a virtual machine on Amazon.
portal” than a “science gateway”. But it is another example of a   These scientists are not pushing the boundaries of size and
large, enterprise project that is designed to be a one-stop shop   scale, but they are, in aggregate, performing the bulk of the
that provides users of XSEDE everything they need to be            science done today.
productive on XSEDE, excepting streamlined job execution.
     These scientists do not live in an enterprise world and        memory, and disk available than the virtual machines
their experimental processes are much less rigid than those of      powering the hosted services we rely upon. Furthermore,
the organizations building the portals mentioned                    modern web browsers are constantly evolving with powerful
above. These scientists look for silver bullets, or the next best   new features both for the user and the developers of web
thing, to accelerate the time between proposal and discovery.       applications. At the same time, as more browsers have adopted
And if a miracle doesn’t come, simply squeezing an extra 5%         web standards put forth by the W3C [23] these features are
out of their week would be a huge win for them.                     more available for use natively in the browser without the
                                                                    need for plugins such as Adobe Flash [24]. Some of these
     Whether they realize it or not, these scientists have          features can even leverage advanced capabilities of the
embraced the spirit of Agile [19] development that drives           underlying system such as GPU accelerated CSS rendering
today’s web ecosystem. In contrast to the monolithic                and animations. The latest CSS modules, such as transforms
deliverable approach historically taken by portal projects,         [25] and transitions [26] are even beginning to push the
today's web creates and innovates at a blazing pace. Working        boundaries of 3D graphics. Combined, this makes the
from incremental release to incremental release, actively           development of feature-rich, high-performance, and reliable
engaging users, and obsessing over a results-first focus            web applications using only HTML, CSS, and JavaScript a
enables high quality sites and services to be created and           reality.
refined into stable products in the time it takes most portals to
make their first release. One notable example is an                      By moving away from monolithic frameworks and large,
application called Burbn, which over a seven-month                  server-side stacks to client-side applications built using only
timeframe morphed from a web application to an iPhone app           HTML, CSS, and JavaScript and leveraging RESTful APIs,
to a cross-platform application, then changed its focus and         one can rapidly develop powerful, targeted applications that
relaunched as Instagram [20]. A second example is the social        can be quickly deployed, are highly scalable and “cloud-
bulletin board site Pinterest, which spent 3 months in              friendly.”
development before launch, then constantly adapted to user
feedback over the next year before expanding as an iPhone                One example is a tool created by Andre Mercer, an
app and exploding into the giant of today [21]. A third             undergraduate student at the University of Arizona. Andre
example is a relatively new startup called GivePulse [22],          created a simple web page that submitted a request to the
which spent 4 months iterating over designs and features with       iPlant Foundation API to run a GeneSeqer job [27]. He spent
local philanthropic organizations before publicly launching as      an afternoon creating the page, then showed it to his
a site enabling the promotion, matchmaking, and coordination        supervisor, iterated a handful of times on the wording and
of volunteers with events. While each of these examples gives       default settings, then pushed it out into the group’s website.
launch timelines in terms of months, their feature development      Jon Duvick, a bioinformatician in a sister group saw the tool
cycles were on the order of 1-2 weeks with updates and bug          and decided to add it to his site as well as embedding it as part
fixes pushed out daily.                                             of his cloud-based annotation pipeline. Based on the success
                                                                    of the original tool, Andre is now adding data browsing via a
     In each of these examples, the product that went to market     jQuery [28] dialog box to the form so users can run analysis
was markedly different from the project that was originally         on files stored in the Cloud as well as on their desktop.
conceived. They survived due to their ability to leverage
existing open source technologies, prototype ideas, and add                    IV.   BUILDING ON A SOLID FOUNDATION
small bits of functionality that they could present to their             One of the reasons that applications can be built with such
audience and find out if it had enough promise to invest more       light front ends is that they now rely upon a growing number
time into its continued development.                                of web-friendly APIs for much of the work. The API
                                                                    watchdog Programmable Web has tracked the growth and
     When attempting to serve the needs of the lower half of        adoption of APIs since 2005 and has seen an explosion of new
the digital divide, developers would do well to learn from          APIs in the last 2 years [29]. Much of this growth has been
Instagram, Pinterest, and GivePulse and take these lessons to       attributed to the fact that, “APIs are helping companies do
heart. Start first by understanding that not every project can or   business, with the tradeoff between adding an external
should be the next big thing. Providing a tool that helps a         dependency being out-shined by the ability to move faster
researcher to see a problem in a different light and enables the    building upon someone else’s expertise [30].” In short, APIs
discovery of a solution is a significant contribution in its own    allow companies to run lighter and move faster.
right. The gateway does not need to serve every conceivable
user community to be a success.                                         For new applications the abundance of APIs completely
                                                                    changes the established paradigm. API providers offering
     While much of what one interacts with on the web is            access to cloud storage, authentication, identity management,
provided as a hosted service, i.e. Facebook, Gmail, DailyMile,      and Backend-as-a-Service (BaaS) [31] have redefined how
etc., there is no reason that every gateway should be a hosted      applications are built. Things that used to take months to build
service. Most new desktop computers have more CPUs,                 and test are now leveraged as hosted services and integrated in
an afternoon. One well-known beneficiary of building on the             •    Data: Acts as a Rosetta stone for biological data.
shoulders of other APIs is the communication platform                        Supports the conversion of data between known
provider Twilio [32]. From its inception Twilio has leveraged                formats.
Amazon Web Services to handle spikes in demand and offload              •    IO: provides multiprotocol data movement and
much of its compute load while focusing on the core part of                  management.
their service, the communication platform.                              •    Jobs: Handles the end-to-end execution of registered
                                                                             applications on a heterogeneous set of systems
     Of the thousands of public APIs available today, and the                ranging from HPC to raw VMs.
hundreds targeted towards science, there are remarkably few             •    Monitor: constantly monitors Foundation and its
that provide a generic platform for computational science. The               dependent services. Provides real-time and historical
SoapLab [33] project provides mechanisms for accessing                       monitoring test results.
SOAP services through a common interface, but it does not               •    PostIt: pre-authenticated URL shortening.
deal with federated identity, sharing, or access control. The           •    Profile: search and view profiles of other users within
NEWT project exposes HPC systems on the web, but is                          the API.
restricted in scope to systems and services at NERSC [34].
                                                                        •    Systems: provides information about systems
Recently, the CHAIN project has promoted an end-to-end
                                                                             available from Foundation including status, stats, and
solution for science gateway development based on open
                                                                             accessibility.
standards including JSR 168 and 286, OpenLDAP, SAGA,
and PKCS-11 [35]. The framework is still relatively new at the
                                                                         Since its initial release in November 2011, the Foundation
time of this writing and as such, could not be included in the
                                                                    API has supported over 250 unique projects representing 10k
evaluation process leading up to the development of the
                                                                    scientists worldwide. Users burned nearly 9M SUs running
solutions described in this paper. Based on early successes,
                                                                    over 10k jobs, leveraging 200 application codes installed on
CHAIN seems like an exciting project to watch going forward.
                                                                    HPC systems at PSC, SDSC, and TACC. Version 2, due out
The target audience and advertised use case, however, are
                                                                    prior to the publication of this paper, will add the following
more in line with traditional portal development than
                                                                    services as well as expanded support for system registration,
lightweight gateway creation. The gUSE project provides a
                                                                    federated identity management, additional execution
mature web service framework for running workflows, storing
                                                                    platforms, and a more mature callback system.
data, and registering applications [36]. Further, it has existing
integration with the WS-PGRADE portal to provide an out-of-
                                                                        •    Systems: discover and register storage,
the-box front end based on Liferay. As with CHAIN, the
                                                                             authentication, and execution systems for use
PGRADE and gUSE project timelines ran parallel to that of
                                                                             throughout the API.
the work in this paper. Futhermore, the approach taken by
                                                                        •    Transfer: move data from anywhere to anywhere
gUSE to provide a SOAP-based service stack runs counter to
                                                                             using multiple protocols.
the desire of current web developers to interact with REST
services in an asynchronous manner.                                     •    Metadata: create, search, and infer metadata about
                                                                             any resource (file, job, person, system, etc.) within
     In response to the dearth of platform APIs available for                the API.
general science the iPlant Collaborative created the
Foundation API [37]. The Foundation API is a RESTful                     By hiding all the heavy lifting of accessing systems,
Science-as-a-Service platform for building modern                   moving data, running simulations, and establishing
applications. It includes services that allow consumers to          relationships between people, data, and devices, consumers
securely conduct science, manage data, and share and curate         are freed up to focus on their science and developers are able
their work. Foundation exists as a hosted, multi-tenant cloud       to focus on innovation at the application layer rather than
service that is freely available to the open science community.     infrastructure at the system level.
Version 1 of Foundation supports the following services.                         V.    YOUR NEXT SCIENCE GATEWAY

    •    Apps: Allows users to register and discover scientific          Turning back to Andre’s GeneSeqr form, this tool is as
         codes that can be run via the Jobs service. There are      basic an example of a science gateway as one can find, but it
         currently over 160 scientific codes both public and        gets the job done. A scientist with remedial programming
         private that can be run across multiple high               capabilities can stand up a static web page on their personal
         performance compute systems.                               computer, a public web server, or on a CDN such as their
                                                                    public Dropbox folder, Amazon S3, or even a free Yahoo
    •    Auth: token-based authentication service. Issues
                                                                    Sitebuilder page. When technology is that easy to adopt and
         limited use tokens that can be restricted to a
                                                                    reuse, the possibility for it to reach a broad audience increases
         timeframe and number of uses and revoked when
                                                                    dramatically. The question then becomes, how can we build
         needed.
                                                                    tools to accomplish tasks requiring a bit more complexity and
                                                                    interaction and yet make them as simple to adopt and reuse as
                                                                    Andre’s GeneSeqr form?
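
    To give a sense of scale, the script behind a static submission page of this kind can be very
small. The following is a minimal sketch, not the actual GeneSeqr form: it turns form fields into a
description of a single REST request. The base URL, the /jobs path, the field names, and the
bearer-token header are all hypothetical placeholders chosen for illustration.

```javascript
// Hypothetical values: the base URL, the "/jobs" path, the field names, and
// the Authorization scheme are placeholders, not the documented Foundation API.
const API_BASE = "https://api.example.org/v1";

// Turn the fields of a submission form into a description of one HTTP request.
function buildJobRequest(fields, token) {
  const body = new URLSearchParams();
  for (const [name, value] of Object.entries(fields)) {
    if (value !== "" && value != null) {
      body.append(name, String(value)); // skip inputs the user left empty
    }
  }
  return {
    url: API_BASE + "/jobs",
    method: "POST",
    headers: {
      "Authorization": "Bearer " + token,
      "Content-Type": "application/x-www-form-urlencoded",
    },
    body: body.toString(),
  };
}

// Example: three form fields, one of them left blank by the user.
const request = buildJobRequest(
  { appId: "example-app", inputFile: "/home/user/seq.fa", notes: "" },
  "TOKEN"
);
console.log(request.method, request.url);
console.log(request.body);
```

    In a browser, the returned object could be handed to fetch(request.url, request) from the form's
submit handler, so nothing beyond the static page is needed on the hosting side.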
    In recent years, a variety of toolkits and frameworks for developing modern web applications
have emerged that aid in the development of lightweight, responsive, standards-driven, front-end
components. These projects are open source, have very large user communities, and are supported by
real companies such as Twitter (Hogan.js, Bootstrap) [38][39], DocumentCloud (Backbone.js,
Underscore.js) [40][41], and Google (Yeoman.io) [42]. Furthermore, as HTML5 has come into its own
and the development of single-page applications has become commonplace in the commercial web, it
makes sense to begin using these technologies in science gateways.

    With the goal of building tools that are simple to adopt and reuse in mind, we have developed a
toolkit using these frameworks with the intention of doing for science gateways what jQuery did for
JavaScript and Web 2.0. We leverage the iPlant Foundation API as a backend, and provide plugins for
Backbone.js that allow a Backbone.js application to easily use the Foundation API without the
developer needing detailed knowledge of its inner workings. These plugins provide implementations of
the objects in the API as Backbone Models and Collections that can be readily used to build science
gateways. This allows the gateway developer to focus more on gateway development and less on
handling the web service calls to the backing API.

    TABLE 1. THE FULL LIST OF FOUNDATION API BACKBONE.JS PLUGINS
               AND THE FUNCTIONALITY THEY PROVIDE.
 Plugin Name                   Functionality Provided
 backbone-foundation           Core support for using Foundation API
 backbone-foundation-apps      Application discovery and registration
 backbone-foundation-data      Data transformation and staging
 backbone-foundation-io        Data management and movement
 backbone-foundation-jobs      Job submission and monitoring
 backbone-foundation-profile   Identity management
 backbone-foundation-systems   Resource discovery and monitoring
 backbone-foundation-post-it   Pre-authenticated URL shortening

    The Backbone-Foundation library is broken into separate plugins that can be included in an ad
hoc manner based on the needs of the application. At the core is the main backbone-foundation.js
file, which provides functionality for basic interaction with the Foundation API. By providing
extensions of the default Backbone Model and Collection objects, the Foundation API can be used
transparently through the standard Backbone API. Also in this plugin is an implementation of the
Foundation Auth API and Model objects for authenticating and obtaining API tokens for authenticated
use of the API. Finally, we include an Events object that can be used to manage API-aware events
across the application.

    Support for the remaining Foundation services is provided by the additional Backbone plugins
listed in Table 1. Each plugin depends on the Backbone-Foundation core library to provide the API
integration. The only other dependency of the Backbone-Foundation library is Backbone.js itself.

    We selected Backbone as our platform for multiple reasons. Backbone adheres closely to our
build-for-today design philosophy. It is specifically designed for developing rich, yet lightweight
client-side applications that utilize a RESTful API backend. Backbone applications follow the
Model-View-Controller (MVC) design pattern, making for code that is easy to develop, maintain, and
extend. As a JavaScript application framework, it can be easily integrated into other environments
and web platforms such as Liferay or Drupal. Finally, Backbone is a widely used and popular
framework with an active user community and multiple examples of large-scale applications built on
top of it. Two examples of large-scale Backbone users are the Khan Academy [43] and Coursera [44],
both providers of massive open online courses (MOOCs). One can imagine the benefits of having a
computational science course with labs and homework that included hands-on access to a computational
environment where students can gain experience using actual large-scale, high-performance
computational systems. Using the Backbone-Foundation plugins we have developed, these MOOC providers
could easily integrate the Foundation API into coursework offered through those sites.

    Figure 1. A standalone boilerplate gateway built using Backbone.js and the Backbone-Foundation
    plugins. This application leverages the iPlant Foundation API to provide authentication, data
    management, application discovery, and job submission with no backend other than the Foundation
    API.

    We have also developed a complete Backbone application (Figure 1) as a boilerplate science
gateway using the Backbone-Foundation library. The Backbone-Foundation library and Foundation API
are white-label components that can be readily and easily used to develop your own science gateway.
This application is built using Backbone for the application framework and Twitter Bootstrap for the
front-end components and HTML structure, and it has no backend other than the Foundation API and a
web server to host the static assets (which could also be hosted out of the Foundation API).
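
    The plugin approach described above, in which API objects are exposed as Backbone Models and
Collections, can be sketched without the library itself. The option objects below are the kind of
thing one would pass to Backbone.Model.extend and Backbone.Collection.extend. The response envelope,
base URL, and resource path are assumptions made for illustration and are not taken from the
Backbone-Foundation source.

```javascript
// Assumption: API responses wrap their payload in an envelope such as
// { status: "success", message: "", result: ... }. The envelope shape, the
// base URL, and the "/jobs" path are illustrative placeholders.
const API_BASE = "https://api.example.org/v1";

// Shared options: unwrap the envelope so application code only sees the payload.
// Backbone calls parse() with the raw server response on fetch and save.
const foundationBase = {
  parse(response) {
    const isEnvelope =
      response !== null && typeof response === "object" && "result" in response;
    if (!isEnvelope) {
      return response; // already a bare payload, e.g. one item of a parsed list
    }
    if (response.status !== "success") {
      throw new Error(response.message || "API request failed");
    }
    return response.result;
  },
};

// Options for a job model. With Backbone loaded on the page:
//   const Job = Backbone.Model.extend(jobModelOptions);
const jobModelOptions = Object.assign({}, foundationBase, {
  urlRoot: API_BASE + "/jobs",
  defaults: { status: "PENDING" },
});

// Options for the matching collection. With Backbone loaded on the page:
//   const Jobs = Backbone.Collection.extend(jobCollectionOptions);
const jobCollectionOptions = Object.assign({}, foundationBase, {
  url: API_BASE + "/jobs",
});

// Exercise the parse hook directly with a canned response.
const envelope = {
  status: "success",
  message: "",
  result: { id: "42", name: "blast-run" },
};
const attrs = jobModelOptions.parse(envelope);
console.log(attrs.id, attrs.name);
```

    Run on its own, the block prints 42 blast-run. Because parse is the hook Backbone invokes on
server responses, a gateway developer working with such a model would see only the unwrapped
attributes and would not need to handle the web service envelope directly.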
    This boilerplate application took a single developer less than a month to complete and includes
authentication, data management, application discovery, and job submission.

    Figure 2. Embedding gateway widgets as a page in a privately hosted CMS. From left to right:
    WordPress, Drupal, Joomla, and Liferay sites.

    Lastly, we have developed a collection of embeddable “widgets” that provide discrete slices of
functionality that can be used immediately to add advanced capabilities to any web page or existing
portal or gateway with no more effort than adding a Twitter “Tweet this” or Facebook “Like” button.

    These widgets are also built on top of the Backbone-Foundation plugin library. To include a
widget in a page, the page author only needs to add a reference to the widgets script and a single
div tag with the widget configuration contained in data attributes on the tag. The widgets script
acts as a scout script to discover the widget div, determine the desired widget, and then inject the
appropriate widget into the page.

    Foundation widgets can be easily used in any HTML page and many CMS platforms such as WordPress,
Drupal, or Joomla (Figure 2). And because they leverage the Foundation API backend and don’t require
local server configuration to use, they can be used even on cloud-hosted sites such as
WordPress.com.

    The widgets available at the time of this writing include a drag-and-drop file uploader, an
application discovery widget, and a job execution widget. The uploader widget provides drag-and-drop
upload functionality using the HTML5 File API, allowing users to drag files from their desktop into
the web browser in order to upload to the iPlant Data Store. The application discovery widget allows
embedding up-to-date lists of available iPlant applications into any page, essentially providing an
application catalog for browsing and searching applications. The job execution widget allows the
embedding of an application-specific job submission form in any page.

                     VI.   GETTING FROM TODAY TO TOMORROW

    Up to this point, the discussion on building for today has been targeted at researchers
developing new gateways. Previous sections have demonstrated how one can bootstrap an idea into a
functional science gateway with a relatively short ramp-up using existing APIs and services like the
Foundation API. However, these same development principles can benefit existing gateways and
portals, enabling the rapid development and deployment of new features in a results-driven fashion
no matter how established an existing portal may be.

    Consider the example of the Liferay enterprise platform. Just as with the CMS platforms
mentioned above, one can drop in a Foundation application using only HTML and JavaScript by
utilizing the Liferay Web Content Display portlet, as shown in Figure 2. Or, if something more
robust is needed, the application can be wrapped in a portlet and deployed in the same way one would
deploy any custom portlet.

    Whether this functionality is packaged as content (HTML and JavaScript) or as a plugin, module,
extension, or JSR 286 portlet for a specific platform, migrating the functionality from a lean
prototype gateway using these tools to an enterprise solution with all the bells and whistles is a
trivial process. Deploying features built entirely on the front end is not a deliverable that
consumes months of time and effort. On the contrary, it is more akin to migrating static content
from one site to another.

    Finally, as mentioned above, forward integration isn’t limited to wrapping bits of functionality
as pages. It is possible to embed custom widgets to provide one-off functionality such as activity
streams, share buttons, data drop boxes, submission forms, and directory trees, just to name a few.

    The process of embedding a widget is the same as that of adding a page. However, for easier
adoption, an AJAX-driven widget generator is provided on the Foundation API developer’s website to
help users create widgets based on their unique constraints such as styling, default values, and
restricted permissions.

                              VII. CONCLUSION

    Science gateway development has historically been an enterprise effort. In recent years, the
introduction of lightweight web technologies and REST APIs has changed the way modern applications
are built. By leveraging the technologies of today and decoupling complex infrastructure from
gateway front ends, developers can respond to change faster, innovate more quickly, prototype more
easily, and drastically reduce their time to production. This paper presents a set of reusable,
white-label, front-end components written
entirely in HTML, JavaScript, and CSS that leverage the Foundation API and enable just such a
transformation. By utilizing the backbone-foundation plugins as fully functional, interchangeable
components, both new and existing gateways can shift their attention from tedious integration to
rapid innovation that can impact researchers today rather than tomorrow. Both the gateway components
and the Foundation API are freely available for use today at
https://foundation.iplantcollaborative.org.

                           ACKNOWLEDGMENT

    The iPlant Collaborative is funded by a grant from the National Science Foundation Plant
Cyberinfrastructure Program (#DBI-0735191). This work was also partially supported by a grant from
the National Science Foundation Cybersecurity Program (#1127210).

                               REFERENCES

[1]  Miller, M.A., Pfeiffer, W., and Schwartz, T. (2010) "Creating the CIPRES Science Gateway for
     inference of large phylogenetic trees" in Proceedings of the Gateway Computing Environments
     Workshop (GCE), 14 Nov. 2010, New Orleans, LA pp 1 - 8.
[2]  Dooley, Rion, Kent Milfeld, Chona Guiang, Sudhakar Pamidighantam, and Gabrielle Allen. 2006.
     From proposal to production: Lessons learned developing the computational chemistry grid
     cyberinfrastructure. Journal of Grid Computing 4, no. 2: 195-208.
[3]  Demeler, Borries. 2005. UltraScan: a comprehensive data analysis software package for
     analytical ultracentrifugation experiments. Modern analytical ultracentrifugation: Techniques
     and methods: 210-229.
[4]  Goecks, Jeremy, Anton Nekrutenko, James Taylor, and T Galaxy Team. 2010. Galaxy: a
     comprehensive approach for supporting accessible, reproducible, and transparent computational
     research in the life sciences. Genome Biol 11, no. 8: R86.
[5]  Klimeck, Gerhard, Michael McLennan, Sean P Brophy, George B Adams, and Mark S Lundstrom. 2008.
     nanoHUB.org: Advancing education and research in nanotechnology. Computing in Science &
     Engineering 10, no. 5: 17-23.
[6]  Norris, Pippa. 2003. Digital divide: Civic engagement, information poverty, and the Internet
     worldwide. Vol. 40. Cambridge: Cambridge University Press, September.
[7]  2012. Science and Engineering Indicators 2012. http://www.nsf.gov/statistics/seind12/start.htm.
[8]  Bureau of Labor Statistics, U.S. Department of Labor, Occupational Outlook Handbook, 2012-13
     Edition, Medical Scientists, on the Internet at
     http://www.bls.gov/ooh/life-physical-and-social-science/medical-scientists.htm (visited May 10,
     2013).
[9]  Fawcett, Henry. Manual of political economy. Macmillan, 1888.
[10] Wilkins-Diehr, Nancy, Dennis Gannon, Gerhard Klimeck, Scott Oster, and Sudhakar Pamidighantam.
     2008. TeraGrid science gateways and their impact on science. Computer 41, no. 11: 32-41.
[11] Lawrence, Katherine A., and Nancy Wilkins-Diehr. "Roadmaps, not blueprints: paving the way to
     science gateway success." In Proceedings of the 1st Conference of the Extreme Science and
     Engineering Discovery Environment: Bridging from the eXtreme to the campus and beyond, p. 40.
     ACM, 2012.
[12] Wilkins-Diehr, Nancy, and Katherine A. Lawrence. "Opening science gateways to future success:
     The challenges of gateway sustainability." In Gateway Computing Environments Workshop (GCE),
     2010, pp. 1-10. IEEE, 2010.
[13] "XSEDE | Overview." 2011. 29 Mar. 2013. https://www.xsede.org/gateways.
[14] 2005. Joomla! The CMS Trusted By Millions for their Websites. http://www.joomla.org/.
[15] 2005. Edublogs – education blogs for teachers, students and schools. http://edublogs.com/.
[16] "nsf.gov - National Science Foundation (NSF) News - XSEDE Project ..." 2011. 29 Mar. 2013.
     http://www.nsf.gov/news/news_summ.jsp?cntn_id=121181.
[17] 2002. Liferay.com: Enterprise open source portal and collaboration software.
     http://www.liferay.com/.
[18] Dahan, Maytal, Eric Roberts, and Jay Boisseau. 2007. TeraGrid User Portal v1.0: Architecture,
     Design, and Technologies. International Workshop on Grid Computing Environments. November 28.
[19] Martin, Robert Cecil. 2003. Agile software development: principles, patterns, and practices.
     Prentice Hall PTR, September 1.
[20] "Instagram." 2009. 29 Mar. 2013. http://instagram.com/.
[21] "Pinterest." 2009. 29 Mar. 2013. http://pinterest.com/.
[22] "GivePulse | Enabling Everyone to Volunteer." 2012. 22 Mar. 2013. https://www.givepulse.com/.
[23] "World Wide Web Consortium (W3C)." 29 Mar. 2013. http://www.w3.org/.
[24] 2006. Adobe - Flash Player. http://www.adobe.com/software/flash/about/.
[25] "CSS Transforms." 2012. 29 Mar. 2013. http://www.w3.org/TR/css3-transforms/.
[26] "CSS Transitions." 2009. 29 Mar. 2013. http://www.w3.org/TR/css3-transitions/.
[27] Schlueter, Shannon D, Qunfeng Dong, and Volker Brendel. "GeneSeqer@PlantGDB: Gene structure
     prediction in plant genomes." Nucleic Acids Research 31.13 (2003): 3597-3600.
[28] "jQuery." 2006. 29 Mar. 2013. http://jquery.com/.
[29] "ProgrammableWeb - Mashups, APIs, and the Web as Platform." 2005. 29 Mar. 2013.
     http://www.programmableweb.com/.
[30] 2012. 8,000 APIs: Rise of the Enterprise - ProgrammableWeb.com.
     http://blog.programmableweb.com/2012/11/26/8000-apis-rise-of-the-enterprise/.
[31] "Backend as a service - Wikipedia, the free encyclopedia." 2012. 29 Mar. 2013.
     http://en.wikipedia.org/wiki/Backend_as_a_service.
[32] "Twilio Cloud Communications - APIs for Voice, VoIP and Text ..." 2005. 29 Mar. 2013.
     http://www.twilio.com/.
[33] 2007. Soaplab2. http://soaplab.sf.net/.
[34] Cholia, Shreyas, David Skinner, and Joshua Boverhof. 2010. NEWT: A RESTful service for building
     High Performance Computing web applications. Gateway Computing Environments Workshop (GCE),
     2010. IEEE, November 14.
[35] CHAIN (2010). Co-ordination & Harmonisation of Advanced e-Infrastructures, EU FP7 project
     (http://www.chain-project.eu).
[36] Peter Kacsuk, Zoltan Farkas, Miklos Kozlovszky, Gabor Hermann, Akos Balasko, Krisztian
     Karoczkai and Istvan Marton: WS-PGRADE/gUSE Generic DCI Gateway Framework for a Large Variety
     of User Communities. Journal of Grid Computing, Vol. 9, No. 4, pp 479-499, 2012.
[37] Dooley, Rion, Matthew Vaughn, Dan Stanzione, Steve Terry, and Edwin Skidmore.
     Software-as-a-Service: The iPlant Foundation API.
[38] 2013. Hogan.js - Twitter on GitHub. http://twitter.github.io/hogan.js/.
[39] 2011. Bootstrap. http://twitter.github.com/bootstrap/.
[40] 2011. Backbone.js. http://backbonejs.org/.
[41] 2008. Underscore.js. http://underscorejs.org/.
[42] 2012. Yeoman - Modern workflows for modern webapps. http://yeoman.io/.
[43] 2009. Khan Academy - Wikipedia, the free encyclopedia.
     http://en.wikipedia.org/wiki/Khan_Academy.
[44] 2012. Coursera. https://www.coursera.org/.