=Paper=
{{Paper
|id=None
|storemode=property
|title=Recipes 2.0: Building for Today and Tomorrow
|pdfUrl=https://ceur-ws.org/Vol-993/paper16.pdf
|volume=Vol-993
|dblpUrl=https://dblp.org/rec/conf/iwsg/DooleyH13
}}
==Recipes 2.0: Building for Today and Tomorrow==
Recipes 2.0: Building for Today and Tomorrow Rion Dooley Matthew R. Hanlon Texas Advanced Computing Center Texas Advanced Computing Center The University of Texas at Austin The University of Texas at Austin Austin, US Austin, US Abstract—The history of science gateway development has, in multiple investments of time writing system-specific code many ways, been a story of the “Haves” vs. the “Have-nots.” when the web only requires a single investment. Large infrastructure projects led the way, building thick client portals to provide coherent interfaces to an incoherent Despite the cost and complexity, projects that could afford environment. Contrast this with the way the modern web is to make the investment in a portal did so gladly because the designed using light, front end components and outsourcing end product was well worth the cost. Portals brought cohesion much of the heavy lifting to a mash-up of REST APIs, and it is to complicated infrastructure and made computational science easy to see why modern web applications can be prototyped and accessible to researchers without computer science degrees. refined into stable products in the time it previously took thick They pulled the focus away from the machines and put it back client portals to do an initial release. This paper argues that a onto the science. “build for today” philosophy can lead to the rapid development of science gateways to serve the “Have-nots.” Also presented is a Portals such as Cipres [1], GridChem [2], UltraScan [3], set of responsive front end components built on top of the iPlant Galaxy [4], and NanoHub [5], just to name a few examples in Foundation API that provide the boilerplate for rapid the United States, continue to provide tremendous value to development of lightweight science gateways using only HTML, their user communities. The researchers using these portals JavaScript, and CSS. 
Using these components, developers can have made discoveries leading to hundreds of published easily stand up new gateways or quickly add new functionality to papers, multiple thesis and dissertations, and insights that existing ones. would have taken significantly longer to realize, if at all. One cannot deny the value of such portals in today’s scientific Keywords— Science Gateway, REST, API, web service, process. AGAVE, HTML5, JavaScript, web The challenge portal-driven science faces is that for every scientist that has a portal like Cipres at their disposal there are I. INTRODUCTION hundreds more in the same domain who do not. No portal can The history of science gateway development has, in many meet the needs of everyone. Successful portals find their niche ways, been a story of the “Haves” versus the “Have-nots.” and focus on providing value to the researchers in that niche. Large infrastructure projects led the way, building thick client Inherent in the design of a successful portal is the realization portals to piece together incongruent service stacks and provide that it cannot and will not meet the needs of the vast majority cohesion to an incoherent environment. The field was of scientists who could otherwise derive value from similar dominated so thoroughly by these heavyweight portals, that the tools. Thus, even within the highly technical research terms portal and gateway became interchangeable. A gateway landscape there is a digital divide [6] between those who have was no longer just a means of access, it was an ecosystem of advanced portal technology to facilitate their work and those moving parts that all had to be managed and maintained over who do not. time for the gateway to work. The concept of modular design became a relative term. If one could take a component out of Exact numbers are difficult to obtain, but a rough one monolithic framework instance and add it to another approximation is possible. 
The National Science Foundation’s monolithic framework instance, then the component was Science and Engineering Indicator Report for 2012 (SEI) states modular. Disregard the background processes, supporting that as of 2010, the US Science and Engineering (S&E) services, and database that needed replication in order for the workforce is 6.65 million people [7]. Of them, 31% describe module to work. If the UI could be reused, the component was research and development (R&D) as a major work activity. If considered modular. we consider only those with doctorates, 12% of those who describe R&D as a major work activity remain. This indicates The resources required to build and maintain such portals that there are at least 247,000 PhD level workers in S&E made finding portals with long-term success rare. Whereas at actively conducting research in the US. Add to this the one time portals were built as thick desktop clients, one of the estimated 100,000 medical researchers in the US according to reasons that portals gravitated from the desktop to the web was the Bureau of Labor and Statistics and we come to a lower the ongoing cost of maintaining software on multiple operating bound of 347,000 for the number of researchers who could be systems. Even applications written entirely in Java require impacted by portal technologies [8]. some platform-specific attention. That means multiple sets of unit tests, multiple testing environments, and most importantly, Looking again at SEI, we see a reasonably proportional investment in R&D across US S&E companies of roughly 6%. Given that the Pareto Principal applies to revenue distribution release of GridChem provided federated identity management, among businesses, we can infer an 80/20 split among industrial job tracking, system monitoring, scheduling, enforcement of researchers [9]. 
With 20% having access to the latest high proprietary software license agreements, distributed account technology tools to perform their research and 80% utilizing management, large data management, full experiment effective, but cost-restricted technologies. In academia, SEI reproducibility, and integration with application codes shows that the top 100 spending universities spent 80% of the installed on the user’s local system. Many of the features took academic R&D money in the US. This is significantly more a significant amount of time to build which pushed back the lopsided ratio, but as a lower bound, the Pareto Principal holds first release of the software by nearly a year. However, after for academic research as well. Thus, it is reasonable to assume its first 3 years in production GridChem had enabled 500 plus at least an 80/20 split between the haves and the have-not researchers to publish over 60 papers and complete 6 across US R&D in both sectors today, indicating there are at least 277,000 underserved researchers in the US alone. dissertations. The software was used as a teaching tool in undergraduate chemistry classes at The Ohio State University, the University of Illinois, and the University of Kentucky to II. BUILDING FOR TOMORROW expose hundreds of students each semester to computational How does one go about reaching the 277K scientists on the chemistry. The value of GridChem is obvious, however that other side of the digital divide? Raising taxes to build 10,000 value came at the up front cost of 6 man-years of development portals is not realistic. It also does not address the at a cost of $2.7M to provide enough features to fundamentally deeper issue of utility. That portals provide simultaneously support undergraduate students and full value to their users is well documented [10][11][12]. What professors alike. 
Further operation led to another $1M in value they provide and at what cost are less well-documented funding to support workflow integration and expanded support questions. We look at 5 portals from the XSEDE Gateways for determining appropriate parameters for use in different Program [13] as short case studies. experiments. Galaxy is an open, web-based platform for data intensive The Cipres Science Gateway is a public resource for biomedical research. Scientists can download a copy of inference of large phylogenetic trees. As of this writing, Galaxy for private use or they can use the hosted Galaxy Cipres exposes 30 different tools for use on a preconfigured instance, often called Galaxy Main. The Galaxy Main portal set of systems ranging from large shared compute clusters to contains over 2500 application codes in its “Shed” that users private virtual machines. Users access these tools through a can leverage for their work. Historically, the vast majority of form-driven web interface. The process of developing Cipres users select a small number of codes that they use for all their included building multiple interfaces for each applications, job work. In 2012, users ran over 100k jobs a month through scheduling heuristics, data management, accounting systems, Galaxy Main. In addition to application registration and job identity management, and integration with multiple submission, Galaxy also supports visualization and data infrastructure providers. These features took a significant publication. Both are popular features, but neither is the amount of time, $4.5M in funding from NSF, and a very primary focus of the portal. Does that mean that they were a talented team of programmers to develop. The result of that waste of time? No. Galaxy Main serves over 28k users. There work was a wildly successful portal. 
Cipres now serves over are many other features built into Galaxy, but the point of this 700 users and has been used to run nearly 100k simulations observation is that as a portal, Galaxy casts a wide net and burning over 15M compute hours. After 18 months in tries to provide something of value to every one its users. The production, Cipres’ usage was outgrowing its infrastructure. price of doing so is added complexity, greater development Due to the heavyweight nature of the infrastructure it took costs, and a larger investment in supporting infrastructure to another year of development and $1.5M in funding from NSF run the application. Initially funded by two awards totaling to allow them to scale out to other systems and move away just under $1.4M in 2006 from the National Science from a community account model. While growth is a common Foundation (NSF), the additional use cases necessitated problem of success, this particular problem came at the end of another round of funding totaling $1.1M from NSF. Recently, the project’s original funding. Had it not been for the talent to support the expanding user community and support and passion of the development team, Cipres would not have different resource utilization patterns, another round of been able to address its growing pains and, as such, would funding totally $5.8M was obtained from the National Institute have stalled until the next round of funding arrived. of Health to carry the project through 2018. Even for a successful portal like Galaxy Main, maintaining continuous NanoHub is web application built upon the Joomla CMS funding and retaining talent are ongoing concerns. [14] and designed to support nanotechnology research and education. It provides over 270 simulation tools, 3800 GridChem is a desktop application supporting the seminars, tutorials, and teaching materials, 200 distinct user computational chemistry community. 
Its mission is to enable groups, and a mature workflow engine called Pegasus, which computational and experimental scientists to do more supports job execution across heterogeneous systems. Behind computational chemistry by providing capability computing NanoHub lies a series of web services, command line tools, a resources and services at their fingertips. To that end, the first full CMS, and an application-authoring tool. The portal as a whole was built to support a large community and it does so The development of the XUP is a continuation of the previous very well. In 2010, NanoHub saw 10k users run 380k 5 years of development of the TeraGrid User Portal [18]. The simulations. In 2011, 11k users ran 400k simulations. In 2012, initial cost of development for the first TeraGrid User Portal 12k users ran 410k simulations. Clearly a lot of people are was on the order of $800k. Since then another $1.7M has been doing a lot of work and the growth is cumulative year over invested in the dedicated team of developers maintaining year. Such usage indicates that the portal is reaching a active development on XUP, adding features, addressing user significant number of people, exposing them to some issues, and providing the general maintenance required of a functionality, helping them accomplish a specific task, and a portal with over 12,000 registered users that supports a variety percentage are coming back year after year. The numbers are of user communities within the XSEDE organization, each impressive, but the behavior is consistent with other portals. with different needs. Users come in, find a few tools and/or features of value to them, and make a routine using those specific tools and/or In order to provide all of this functionality, the XUP, and features for the duration of their interaction with the portal. to a lesser extend the XSEDE website, rely on a suite of services that provide the backend information services. 
These Success comes at a price, and the price of building services include relational databases, non-relational “NoSQL” NanoHub was $14M from NSF. Sustaining NanoHub amid databases, SOAP and REST web services, flat file parsers, and rapid growth has been an even more expensive activity. Their many other services that interact directly with the resources in latest round of funding is $21.9M from NSF starting in 2013. the XSEDE CI. The front end is built from many custom To put that in perspective, NanoHub is a Joomla instance with developed and specialized portlet applications, as well as out- a lot of custom plugins and some back-end services to support of-the-box Liferay portlets. The system works because the running nanotechnology simulations at an average rate of one development team has administrative access to the entire simulation every 78 seconds. Looking at the CMS alone, the XSEDE infrastructure. They are able to obtain information site receives 8.5 million hits a month. That is roughly half the that other gateways simply do not have access to. As a result, traffic of edublogs.com, the leading educational blog provider the portlets developed for XUP and the functionality they in the world with 1.6 million blogs since 2005 [15]. Given provide cannot easily be replicated simply by copying over the comparable expenditures and team sizes between the two portlet code. organizations, the cost of custom development and supporting the back-end infrastructure of NanoHub costs roughly 200% Each of the above portals is different in focus and function, more than the total cost of running the website alone. but they are all successful science gateway projects and provide broad functionality. That functionality is often The Extreme Science and Engineering Discovery targeted at a small set of users who, for a given portal, will Environment (XSEDE) [16] is a National Science Foundation only ever use a subset of the features. 
The cost of these portals (NSF) funded national cyberinfrastructure (CI) that provides a in terms of time and effort are all measured in multiple man- set of large resources for scientific simulation and analysis. years and millions of dollars before they ever had a single The XSEDE User Portal (XUP), led by TACC, is the primary user. They were designed to accommodate thousands of users interface for users to XSEDE. It provides user account when they went live and they made sure they could support a management, project management, documentation, data thousand users before they tried to support one. From their management, and a myriad of other features to help users be inception they were targeting long-term operational goals productive on the XSEDE CI. It was built on the Liferay rather than short-term results. To be clear, there is nothing Portal platform [17], an enterprise open-source Java portal wrong with that, but it is an important distinction to make. The framework. The Liferay platform itself provides many features image of a successful science gateway promoted over the last out of the box, including a content management system, wikis, decade was a portal built to support users of tomorrow rather calendaring, web forms, user forums, security and access than something that will get the results they need today. control, and user notifications. Liferay also provides a plugin development platform for extending the portal with plugins III. BUILDING FOR TODAY and portlets. As an enterprise portal, Liferay is one of the The reality for many scientists on the wrong side of the leaders in the field, but there is a significant financial and digital divide is that they do not need portals built for human cost associated with its use. The cost of training, tomorrow; they need gateways built for today. They are professional consulting, and enterprise support must be content using their current workflows, but are willing to adopt considered. 
technologies that make their workflows more efficient, more powerful, or less painful. They will gladly set down Outlook XUP has a very different focus than the previously for Gmail, their departmental FTP server for Dropbox, and the mentioned science gateways, and it is more of a “destination server under their desk for a virtual machine on Amazon. portal” then “science gateway”. But it is another example of a These scientists are not pushing the boundaries of size and large, enterprise project that is designed to be a one-stop shop scale, but they are, in aggregate, performing the bulk of the that provides users of XSEDE everything they need to be science done today. productive on XSEDE, excepting streamlined job execution. These scientists do not live in an enterprise world and memory, and disk available than the virtual machines their experimental processes are much less rigid than those of powering the hosted services we rely upon. Furthermore, the organizations building the previously mentioned portals modern web browsers are constantly evolving with powerful above. These scientists look for silver bullets, or the next best new features both for the user and the developers of web thing, to accelerate the time between proposal and discovery. applications. At the same time, as more browsers have adopted And if a miracle doesn’t come, simply squeezing an extra 5% web standards put forth by the W3C [23] these features are out of their week would be a huge win for them. more available for use natively in the browser without the need for polyfills such as Adobe Flash [24]. Some of these Whether they realize it or not, these scientists have features can even leverage advanced capabilities of the embraced the spirit of Agile [19] development that drives underlying system such as GPU accelerated CSS rendering today’s web ecosystem. In contrast to the monolithic and animations. 
The latest CSS modules, such as transforms deliverable approach historically taken by portal projects, [25] and transitions [26] are even beginning to push the today's web creates and innovates at a blazing pace. Working boundaries of 3D graphics. Combined, this makes the from incremental release to incremental release, actively development of feature-rich, high-performance, and reliable engaging users, and obsessing over a results-first focus web applications using only HTML, CSS, and JavaScript a enables high quality sites and services to be created and reality. refined into stable products in the time it takes most portals to make their first release. One notable example being an By moving away from monolithic frameworks and large, application called Burbn, which over a seven-month server-side stacks to client-side applications built using only timeframe morphed from a web application to an iPhone app HTML, CSS, and JavaScript and leveraging RESTful APIs, to a cross-platform application, then changed its focus and one can rapidly develop powerful, targeted applications that relaunched as Instagram [20]. A second example is the social can be quickly deployed, are highly scalable and “cloud- bulletin board site Pinterest, which spent 3 months in friendly.” development before launch, then constantly adapted to user feedback over the next year before expanding as an iPhone One example is a tool created by Andre Mercer, an app and exploding into the giant of today [21]. A third undergraduate student at the University of Arizona. Andre example is a relatively new startup called GivePulse [22], created a simple web page that submitted a request to the which spent 4 months iterating over designs and features with iPlant Foundation API to run a GeneSeqer job [27]. 
He spent local philanthropic organizations before publicly launching as an afternoon creating the page, then showed it to his a site enabling the promotion, matchmaking, and coordination supervisor, iterated a handful of times on the wording and of volunteers with events. While each of these examples gives default settings, then pushed it out into the group’s website. launch timelines in terms of months, their feature development Jon Duvick, a bioinformatician in a sister group saw the tool cycles were on the order of 1-2 weeks with updates and bug and decided to add it to his site as well as embedding it as part fixes pushed out daily. of his cloud-based annotation pipeline. Based on the success of the original tool, Andre is now adding data browsing via a In each of these examples, the product that went to market jQuery [28] dialog box to the form so users can run analysis was markedly different from the project that was originally on files stored in the Cloud as well their desktop. conceived. They survived due to their ability to leverage existing open source technologies, prototype ideas, and add IV. BUILDING ON A SOLID FOUNDATION small bits of functionality that they could present to their One of the reasons that applications can be built with such audience and find out if it had enough promise to invest more light front ends is that they now rely upon a growing number time into its continued development. of web-friendly APIs for much of the work. The API watchdog Programmable Web has tracked the growth and When attempting to serve the needs of the lower half of adoption of APIs since 2005 and has seen an explosion of new the digital divide, developers would do well to learn from APIs in the last 2 years [29]. Much of this growth has been Instagram, Pinterest, and GivePulse and take these lessons to attributed to the fact that, “APIs are helping companies do heart. 
Start first by understanding that not every project can or business, with the tradeoff between adding an external should be the next big thing. Providing a tool that helps a dependency being out-shined by the ability to move faster researcher to see a problem in a different light and enables the building upon someone else’s expertise [30].” In short, APIs discovery of a solution is a significant contribution in its own allow companies to run lighter and move faster. right. The gateway does not need to serve every conceivable user community to be a success. For new applications the abundance of APIs completely changes the established paradigm. API providers offering While much of what one interacts with on the web is access to cloud storage, authentication, identity management, provided as a hosted service, i.e. Facebook, Gmail, DailyMile, and Backend-as-a-Service (BaaS) [31] have redefined how etc., there is no reason that every gateway should be a hosted applications are built. Things that used to take months to build service. Most new desktop computers have more CPUs, and test are now leveraged as hosted services and integrated in an afternoon. One well-known benefactor of building on the • Data: Acts as a Rosetta stone for biological data. shoulders of other APIs is the communication platform Supports the conversion of data between known provider Twilio [32]. From its inception Twilio has leveraged formats. Amazon Web Services to handle spikes in demand and offload • IO: provides multiprotocol data movement and much of its compute load while focusing on the core part of management. their service, the communication platform. • Jobs: Handles the end-to-end execution of registered applications on a heterogeneous set of systems Of the thousands of public APIs available today, and the ranging from HPC to raw VMs. 
hundreds targeted towards science, there are remarkably few • Monitor: constantly monitors Foundation and its that provide a generic platform for computational science. The dependent services. Provides real-time and historical SoapLab [33] project provides mechanisms for accessing monitoring test results. SOAP services through a common interface, but it does not • PostIt: pre-authenticated URL shortening. deal with federated identity, sharing, or access control. The • Profile: search and view profiles of other users within NEWT project exposes HPC systems on the web, but is the API. restricted in scope to systems and services at NERSC [34]. • Systems: provides information about systems Recently, the CHAIN project has promoted an end-to-end available from Foundation including status, stats, and solution for science gateway development based on open accessibility. standards including JSR 168 and 268, OpenLDAP, SAGA, and PKSC-11 [35]. The framework is still relatively new at the Since its initial release in November 2011, the Foundation time of this writing and as such, could not be included in the API has supported over 250 unique projects representing 10k evaluation process leading up to the development of the scientists worldwide. Users burned nearly 9M SUs running solutions described in this paper. Based on early successes, over 10k jobs, leveraging 200 application codes installed on CHAIN seems like an exciting project to watch going forward. HPC systems at PSC, SDSC, and TACC. Version 2, due out The target audience and advertised use case, however, are prior to the publication of this paper, will add the following more in line with traditional portal development than services as well as expanded support for system registration, lightweight gateways creation. The gUSE project provides a federated identity management, additional execution mature web service framework for running workflows, storing platforms, and a more mature callback system. 
data, and registering applications [36]. Further, it has existing integration with the WS-PGRADE portal to provide an out-of- • Systems: discovery and register storage, the-box front end based on Liferay. As with CHAIN, the authentication, and execution systems for use PGRADE and gUSE project timelines ran parallel to that of throughout the API. the work in this paper. Futhermore, the approach taken by • Transfer: move data from anywhere to anywhere gUSE to provide a SOAP-based service stack runs counter to using multiple protocols. the desire of current web developers to interact with REST services in an asynchronous manner. • Metadata: create, search, and infer metadata about any resource (file, job, person, system, etc.) within In response to the dearth of platform APIs available for the API. general science the iPlant Collaborative created the Foundation API [37]. The Foundation API is a RESTful By hiding all the heavy lifting of accessing systems, Science-as-a-Service platform for building modern moving data, running simulations, and establishing applications. It includes services that allow consumers to relationships between people, data, and devices, consumers securely conduct science, manage data, and share and curate are freed up to focus on their science and developers are able their work. Foundation exists as a hosted, multi-tenant cloud to focus on innovation at the application layer rather than service that is freely available to the open science community. infrastructure at the system level. Version 1 of Foundation supports the following services. V. YOUR NEXT SCIENCE GATEWAY • Apps: Allows users to register and discover scientific Turning back to Andre’s GeneSeqr form, this tool is as codes that can be run via the Jobs service. There are basic an example of a science gateway as one can find, but it currently over 160 scientific codes both public and gets the job done. 
A scientist with remedial programming private that can be run across multiple high capabilities can stand up a static web page on their personal performance compute systems. computer, a public web server, or on a CDN such as their public Dropbox folder, Amazon S3, or even a free Yahoo • Auth: token-based authentication service. Issues Sitebuilder page. When technology is that easy to adopt and limited use tokens that can be restricted to a reuse, the possibility for it to reach a broad audience increases timeframe and number of uses and revoked when dramatically. The question then becomes, how can we build needed. tools to accomplish tasks requiring a bit more complexity and interaction and yet make them as simple to adopt and reuse as Andre’s GeneSeqr form? In recent years, a variety of toolkits and frameworks for We selected Backbone as platform for multiple reasons. developing modern web applications have emerged that aid in Backbone adheres closely to our build-for-today design the development of lightweight, responsive, standards-driven, philosophy. It is specifically designed for developing rich, yet front-end components. These projects are open source, have lightweight client-side applications that utilize a RESTful API very large user communities, and are supported by real backend. Backbone applications follow the Model-View- companies such as Twitter (Hogan.js, Bootstrap) [38][39], Controller (MVC) design pattern making for code that is easy DocumentCloud (Backbone.js, Underscore.js) [40][41], and to develop, maintain, and extend. As a JavaScript application Google (Yeoman.io) [42]. Furthermore, as HTML5 has come framework, it can be easily integrated into other environments into its own and the development of single-page applications and web platforms such as Liferay or Drupal. Finally, has become commonplace in the commercial web, it makes Backbone is a widely used and popular framework with an sense to begin using these technologies in science gateways. 
active user community and multiple examples of large-scale applications built on top of it. Two examples of large-scale With the goal of building tools that are simple to adopt Backbone users are the Khan Academy [43] and Coursera and reuse in mind, we have developed a toolkit using these [44], both providers of massively open online courses frameworks with the intention of doing for science gateways (MOOC). One can imagine the benefits of having a what jQuery did for JavaScript and Web 2.0. We leverage the computational science course with labs and homework that iPlant Foundation API as a backend, and provide plugins for included hands-on access to a computational environment Backbone.js that allow a Backbone.js application to easily use where students can gain experience using actual large-scale, the Foundation API without the developer needing detailed high-performance computational systems. Using the knowledge of its inner workings. These plugins provide Backbone-Foundation plugins we have developed, these implementations of the objects in the API as Backbone MOOC providers could easily integrate the Foundation API Models and Collections that can be readily used to build into coursework offered through those sites. science gateways. This allows the gateway developer to focus more on gateway development and less on handling the web service calls to the backing API. TABLE 1. THE FULL LIST OF FOUNDATION API BACKBONE.JS PLUGINS AND THE FUNCTIONALITY THEY PROVIDE. 
{| class="wikitable"
! Plugin Name !! Functionality Provided
|-
| backbone-foundation || Core support for using Foundation API
|-
| backbone-foundation-apps || Application discovery and registration
|-
| backbone-foundation-data || Data transformation and staging
|-
| backbone-foundation-io || Data management and movement
|-
| backbone-foundation-jobs || Job submission and monitoring
|-
| backbone-foundation-profile || Identity management
|-
| backbone-foundation-systems || Resource discovery and monitoring
|-
| backbone-foundation-post-it || Pre-authenticated URL shortening
|}

The Backbone-Foundation library is broken into separate plugins that can be included in an ad hoc manner based on the needs of the application. At the core is the main backbone-foundation.js file, which provides functionality for basic interaction with the Foundation API. By providing extensions of the default Backbone Model and Collection objects, the Foundation API can be used transparently through the standard Backbone API. Also in this plugin is an implementation of the Foundation Auth API and Model objects for authenticating and obtaining API tokens for authenticated use of the API. Finally, we include an Events object that can be used to manage API-aware events across the application.

Support for the remaining Foundation services is provided by the additional Backbone plugins listed in Table 1. Each plugin depends on the Backbone-Foundation core library to provide the API integration. The only other dependency of the Backbone-Foundation library is Backbone.js itself.

Figure 1. A standalone boilerplate gateway built using Backbone.js and the Backbone-Foundation plugins. This application leverages the iPlant Foundation API to provide authentication, data management, application discovery, and job submission with no backend other than the Foundation API.

We have also developed a complete Backbone application (Figure 1) as a boilerplate science gateway using the Backbone-Foundation library. The Backbone-Foundation library and Foundation API are white-label components that can be readily used to develop your own science gateway. This application is built using Backbone for the application framework and Twitter Bootstrap for the front-end components and HTML structure, and it has no backend other than the Foundation API and a web server to host the static assets (which could also be hosted out of the Foundation API). The development of this boilerplate application took a single developer less than a month to complete and includes authentication, data management, application discovery, and job submission.

Lastly, we have developed a collection of embeddable “widgets” that provide discrete slices of functionality that can be used immediately to add advanced capabilities to any web page or existing portal or gateway with no more effort than adding a Twitter “Tweet this” or Facebook “Like” button.

These widgets are also built on top of the Backbone-Foundation plugin library. To include a widget in a page, the page author only needs to add a reference to the widgets script and a single div tag with the widget configuration contained in data attributes on the tag. The widgets script acts as a scout script to discover the widget div, determine the desired widget, and then inject the appropriate widget into the page.

Foundation widgets can be easily used in any HTML page and many CMS platforms such as WordPress, Drupal, or Joomla (Figure 2). And because they leverage the Foundation API backend and don’t require local server configuration to use, they can be used even on cloud-hosted sites such as WordPress.com.

Figure 2. Embedding gateway widgets as a page in a privately hosted CMS. From left to right: WordPress, Drupal, Joomla, and Liferay sites.

The widgets available at the time of this writing include a drag-and-drop file uploader, an application discovery widget, and a job execution widget. The uploader widget provides drag-and-drop upload functionality using the HTML5 File APIs, allowing users to drag files from their desktop into the web browser in order to upload to the iPlant Data Store. The application discovery widget allows embedding up-to-date lists of available iPlant applications into any page, essentially providing an application catalog for browsing and searching applications. The job execution widget allows the embedding of an application-specific job submission form in any page.

VI. GETTING FROM TODAY TO TOMORROW

Up to this point, the discussion on building for today has been targeted at researchers developing new gateways. Previous sections have demonstrated how one can bootstrap an idea into a functional science gateway with a relatively short ramp-up using existing APIs and services like the Foundation API. However, these same development principles can benefit existing gateways and portals, enabling the rapid development and deployment of new features in a results-driven fashion no matter how established an existing portal may be.

Consider the example of the Liferay enterprise platform. Just as with the CMS platforms mentioned above, one can drop in a Foundation application using only HTML and JavaScript by utilizing the Liferay Web Content Display portlet, as shown in Figure 2. Or, if something more robust is needed, the application can be wrapped in a portlet and deployed in the same way one would deploy any custom portlet.

Whether this functionality is packaged as content (HTML and JavaScript) or as a plugin, module, extension, or JSR 286 portlet for a specific platform, migrating the functionality from a lean prototype gateway using these tools to an enterprise solution with all the bells and whistles is a trivial process. Deploying features built entirely on the front end is not a deliverable that consumes months of time and effort. On the contrary, it is more akin to migrating static content from one site to another.

Finally, as mentioned above, forward integration isn’t limited to wrapping bits of functionality as pages. It is possible to embed custom widgets to provide one-off functionality such as activity streams, share buttons, data drop boxes, submission forms, and directory trees, just to name a few.

The process of embedding a widget is the same as that of adding a page. However, for easier adoption, an AJAX-driven widget generator is provided on the Foundation API developer’s website to help users create widgets based on their unique constraints such as styling, default values, and restricted permissions.

VII. CONCLUSION

Science gateway development has historically been an enterprise effort. In recent years, the introduction of lightweight web technologies and REST APIs has changed the way modern applications are built. By leveraging the technologies of today and decoupling complex infrastructure from gateway front ends, developers can respond to change faster, innovate more quickly, prototype more easily, and drastically reduce their time to production. This paper presents a set of reusable, white-label, front-end components written entirely in HTML, JavaScript, and CSS that leverage the Foundation API and enable just such a transformation. By utilizing the backbone-foundation plugins as fully functional, interchangeable components, both new and existing gateways can shift their attention from tedious integration to rapid innovation that can impact researchers today rather than tomorrow. Both the gateway components and the Foundation API are freely available for use today at https://foundation.iplantcollaborative.org.

ACKNOWLEDGMENT

The iPlant Collaborative is funded by a grant from the National Science Foundation Plant Cyberinfrastructure Program (#DBI-0735191). This work was also partially supported by a grant from the National Science Foundation Cybersecurity Program (#1127210).

REFERENCES

[1] Miller, M.A., Pfeiffer, W., and Schwartz, T. (2010) "Creating the CIPRES Science Gateway for inference of large phylogenetic trees" in Proceedings of the Gateway Computing Environments Workshop (GCE), 14 Nov. 2010, New Orleans, LA, pp. 1-8.
[2] Dooley, Rion, Kent Milfeld, Chona Guiang, Sudhakar Pamidighantam, and Gabrielle Allen. 2006. From proposal to production: Lessons learned developing the computational chemistry grid cyberinfrastructure. Journal of Grid Computing 4, no. 2: 195-208.
[3] Demeler, Borries. 2005. UltraScan: a comprehensive data analysis software package for analytical ultracentrifugation experiments. Modern analytical ultracentrifugation: Techniques and methods: 210-229.
[4] Goecks, Jeremy, Anton Nekrutenko, James Taylor, and The Galaxy Team. 2010. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol 11, no. 8: R86.
[5] Klimeck, Gerhard, Michael McLennan, Sean P Brophy, George B Adams, and Mark S Lundstrom. 2008. nanohub.org: Advancing education and research in nanotechnology. Computing in Science & Engineering 10, no. 5: 17-23.
[6] Norris, Pippa. 2003. Digital divide: Civic engagement, information poverty, and the Internet worldwide. Vol. 40. Cambridge: Cambridge University Press, September.
[7] 2012. Science and Engineering Indicators 2012. http://www.nsf.gov/statistics/seind12/start.htm.
[8] Bureau of Labor Statistics, U.S. Department of Labor, Occupational Outlook Handbook, 2012-13 Edition, Medical Scientists, on the Internet at http://www.bls.gov/ooh/life-physical-and-social-science/medical-scientists.htm (visited May 10, 2013).
[9] Fawcett, Henry. Manual of political economy. Macmillan, 1888.
[10] Wilkins-Diehr, Nancy, Dennis Gannon, Gerhard Klimeck, Scott Oster, and Sudhakar Pamidighantam. 2008. TeraGrid science gateways and their impact on science. Computer 41, no. 11: 32-41.
[11] Lawrence, Katherine A., and Nancy Wilkins-Diehr. "Roadmaps, not blueprints: paving the way to science gateway success." In Proceedings of the 1st Conference of the Extreme Science and Engineering Discovery Environment: Bridging from the eXtreme to the campus and beyond, p. 40. ACM, 2012.
[12] Wilkins-Diehr, Nancy, and Katherine A. Lawrence. "Opening science gateways to future success: The challenges of gateway sustainability." In Gateway Computing Environments Workshop (GCE), 2010, pp. 1-10. IEEE, 2010.
[13] "XSEDE | Overview." 2011. 29 Mar. 2013. https://www.xsede.org/gateways.
[14] 2005. Joomla! The CMS Trusted By Millions for their Websites. http://www.joomla.org/.
[15] 2005. Edublogs – education blogs for teachers, students and schools. http://edublogs.com/.
[16] "nsf.gov - National Science Foundation (NSF) News - XSEDE Project ..." 2011. 29 Mar. 2013. http://www.nsf.gov/news/news_summ.jsp?cntn_id=121181.
[17] 2002. Liferay.com: Enterprise open source portal and collaboration software. http://www.liferay.com/.
[18] Dahan, Maytal, Eric Roberts, and Jay Boisseau. 2007. TeraGrid User Portal v1.0: Architecture, Design, and Technologies. International Workshop on Grid Computing Environments. November 28.
[19] Martin, Robert Cecil. 2003. Agile software development: principles, patterns, and practices. Prentice Hall PTR, September 1.
[20] "Instagram." 2009. 29 Mar. 2013. http://instagram.com/.
[21] "Pinterest." 2009. 29 Mar. 2013. http://pinterest.com/.
[22] "GivePulse | Enabling Everyone to Volunteer." 2012. 22 Mar. 2013. https://www.givepulse.com/.
[23] "World Wide Web Consortium (W3C)." 29 Mar. 2013. http://www.w3.org/.
[24] 2006. Adobe - Flash Player. http://www.adobe.com/software/flash/about/.
[25] "CSS Transforms." 2012. 29 Mar. 2013. http://www.w3.org/TR/css3-transforms/.
[26] "CSS Transitions." 2009. 29 Mar. 2013. http://www.w3.org/TR/css3-transitions/.
[27] Schlueter, Shannon D, Qunfeng Dong, and Volker Brendel. "GeneSeqer@PlantGDB: Gene structure prediction in plant genomes." Nucleic Acids Research 31.13 (2003): 3597-3600.
[28] "jQuery." 2006. 29 Mar. 2013. http://jquery.com/.
[29] "ProgrammableWeb - Mashups, APIs, and the Web as Platform." 2005. 29 Mar. 2013. http://www.programmableweb.com/.
[30] 2012. 8,000 APIs: Rise of the Enterprise - ProgrammableWeb.com. http://blog.programmableweb.com/2012/11/26/8000-apis-rise-of-the-enterprise/.
[31] "Backend as a service - Wikipedia, the free encyclopedia." 2012. 29 Mar. 2013. http://en.wikipedia.org/wiki/Backend_as_a_service.
[32] "Twilio Cloud Communications - APIs for Voice, VoIP and Text ..." 2005. 29 Mar. 2013. http://www.twilio.com/.
[33] 2007. Soaplab2. http://soaplab.sf.net/.
[34] Cholia, Shreyas, David Skinner, and Joshua Boverhof. 2010. NEWT: A RESTful service for building High Performance Computing web applications. Gateway Computing Environments Workshop (GCE), 2010. IEEE, November 14.
[35] CHAIN. 2010. Co-ordination & Harmonisation of Advanced e-Infrastructures, EU FP7 project. http://www.chain-project.eu.
[36] Peter Kacsuk, Zoltan Farkas, Miklos Kozlovszky, Gabor Hermann, Akos Balasko, Krisztian Karoczkai and Istvan Marton: WS-PGRADE/gUSE Generic DCI Gateway Framework for a Large Variety of User Communities. Journal of Grid Computing, Vol. 9, No. 4, pp. 479-499, 2012.
[37] Dooley, Rion, Matthew Vaughn, Dan Stanzione, Steve Terry, and Edwin Skidmore. Software-as-a-Service: The iPlant Foundation API.
[38] 2013. Hogan.js - Twitter on GitHub. http://twitter.github.io/hogan.js/.
[39] 2011. Bootstrap. http://twitter.github.com/bootstrap/.
[40] 2011. Backbone.js. http://backbonejs.org/.
[41] 2008. Underscore.js. http://underscorejs.org/.
[42] 2012. Yeoman - Modern workflows for modern webapps. http://yeoman.io/.
[43] 2009. Khan Academy - Wikipedia, the free encyclopedia. http://en.wikipedia.org/wiki/Khan_Academy.
[44] 2012. Coursera. https://www.coursera.org/.
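As a supplementary illustration of the widget embedding mechanism described in the paper (a reference to the widgets script plus a single div tag whose data attributes carry the configuration), the following is a minimal sketch of how such a scout script might work. It is not the actual Foundation widgets code: the attribute name data-widget, the option names, the registry, and every function name below are assumptions made for illustration only.

```javascript
// Hypothetical sketch of a widget "scout" script. The data-widget attribute,
// the option names, and the registry are illustrative assumptions and are
// NOT taken from the real Foundation widget library.

// Escape a value before placing it inside generated markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Registry mapping a widget name to a function that returns its markup.
const widgetRegistry = {
  uploader: (opts) =>
    `<div class="fw-uploader" data-path="${escapeHtml(opts.path || "/")}"></div>`,
  apps: (opts) =>
    `<ul class="fw-apps" data-query="${escapeHtml(opts.query || "")}"></ul>`,
  job: (opts) =>
    `<form class="fw-job" data-app="${escapeHtml(opts.app || "")}"></form>`,
};

// Determine the desired widget and its options from a dataset-like object
// (in a browser this would be a copy of element.dataset).
function readConfig(dataset) {
  const { widget, ...options } = dataset;
  return { widget, options };
}

// Build the markup for one widget, or return null for an unknown widget name.
function renderWidget(dataset, registry = widgetRegistry) {
  const { widget, options } = readConfig(dataset);
  if (!Object.prototype.hasOwnProperty.call(registry, widget)) {
    return null;
  }
  return registry[widget](options);
}

// Discover every widget div under `root` and inject the rendered widget.
function scout(root) {
  root.querySelectorAll("div[data-widget]").forEach((el) => {
    const html = renderWidget({ ...el.dataset });
    if (html !== null) {
      el.innerHTML = html;
    }
  });
}

// Run automatically only when loaded in a browser page.
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", () => scout(document));
}
```

Under these assumptions, a page author would add the script reference and a tag such as `<div data-widget="job" data-app="my-app"></div>`, where my-app is a placeholder application identifier. Keeping renderWidget a pure function of the data attributes means the widget selection logic can be exercised without a browser, while scout holds the only DOM-dependent code.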