Lightning Talk: HydroShare – A Case Study in Software Engineering Best Practices and Culture Change for Developing Sustainable Community Software

Ray Idaszak¹, David G. Tarboton (PI)², Hong Yi¹, Michael Stealey¹, Pabitra Dash², Alva Couch³, Daniel P. Ames⁴, Jeffery S. Horsburgh², Tony Castronova², Jon Goodall⁵, Mohamed Morsy⁵, Venkatesh Merwade⁶, Mauriel Ramirez², Tian Gan², Drew (Zhiyu) Li⁴, Jeff Sadler⁴, Shawn Crawley⁴, Zhaokun Xue³, Lan Zhao⁶, Carol Song⁶, Christina Bandaragoda⁷

¹RENCI, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA, Email: {rayi, hongyi, stealey}@renci.org
²Utah State University, Logan, Utah, USA, Email: {dtarb, pabitra.dash, jeff.horsburgh, tony.castronova, mauriel.ramirez, tian.gan}@usu.edu
³Tufts University, Medford, Massachusetts, USA, Email: {alva.couch, zhaokun.xue}@tufts.edu
⁴Brigham Young University, Provo, Utah, USA, Email: {dan.ames, zhiyu.li, jeffrey.sadler, shawn.crawley}@byu.edu
⁵University of Virginia, Charlottesville, Virginia, USA, Email: {goodall, mmm4dh}@virginia.edu
⁶Purdue University, West Lafayette, Indiana, USA, Email: {venvmerwade, lanzhao, cxsong}@purdue.edu
⁷University of Washington, Seattle, Washington, USA, Email: cband@uw.edu

Abstract: Applying modern software engineering to scientific software development has many challenges. These include lack of time or incentives to learn software engineering best practices, a lack of understanding or appreciation of the value of modern software engineering, and a shortage of mechanisms to more broadly change the software engineering culture of a community of researchers in a singular concerted effort. The HydroShare project is a large distributed research software development project that has made significant inroads on these challenges. As the HydroShare project enters its fifth year of NSF funding, we discuss how research scientists and research software engineers from the ten collaborating institutions consistently produce high-quality HydroShare code releases every 2-3 weeks that are formally reviewed and tested.

I. INTRODUCTION

Applying modern software engineering to scientific software development has many challenges. These include lack of time or incentives to learn software engineering best practices, a lack of understanding or appreciation of the value of modern software engineering, and a shortage of mechanisms to more broadly change the software engineering culture of a community of researchers in a singular concerted effort. The HydroShare project is a large distributed research software development project that has made significant inroads on these challenges [1]. HydroShare is a hydrology community open-source cyberinfrastructure project supported by the National Science Foundation (NSF) through its Software Infrastructure for Sustained Innovation program (SI2) [2, 3]. Domain scientists, professional software engineers, and academic software developers from ten academic, research, and development organizations located across the United States collaborate to develop HydroShare – an online, collaborative system supporting the open sharing of hydrologic data, analytical tools, and computer models.

At the outset of the HydroShare project, most of the research scientist collaborators on the project were not clear on the difference between software development and software engineering, and they were not familiar with concepts such as iterative software development, test-driven development, code reviews, and continuous integration. Now, as the HydroShare project enters its fifth year of NSF funding, research scientists and research software engineers from the ten collaborating institutions consistently produce high-quality HydroShare code releases every 2-3 weeks that are formally reviewed and tested. The HydroShare team now understands and embraces the value of modern software engineering. Indeed, team members recognize that producing high-quality, sustainable code from the outset saves time: it frees more time for their research, where the alternative would be the time-consuming management of potentially poor-quality code.

II. SOFTWARE ENGINEERING BEST PRACTICES AND CULTURE CHANGE FOR DEVELOPING SUSTAINABLE COMMUNITY SOFTWARE

The HydroShare team has achieved a community rhythm in the continual deployment of high-quality community code releases of HydroShare. The community visibility of this rhythm has served as an incentive for community culture change in that team members are pleased with the resulting code, its evolving significant capabilities, and the efficiency with which new features contributed by the collaborating team members are tested and integrated. This culture change has naturally motivated collaborating researchers to take the time to teach modern software engineering best practices to their graduate and postdoctoral students, who, as time has shown, have now also embraced these practices. While host universities have provided the requisite training in computer programming to HydroShare researchers, they have not provided the accompanying instruction on software engineering best practices. However, the HydroShare project has provided numerous faculty, graduate students, postdoctoral students, and even undergraduate students the opportunity to learn modern software engineering best practices first-hand in the absence of formal classroom instruction on these practices.

What is important to convey is that this propagation and adoption of software engineering best practices across the HydroShare team is happening organically without the need to force it. It is a success story in that what is referred to herein as a “community rhythm” is in effect like an engine that, once started, sustains itself in part by visibly promoting its own success and efficiencies such that there is no questioning of its uptake by new team members. In other words, adoption of these software engineering best practices becomes the new norm – there is no alternative that new HydroShare team members are ever exposed to. This is especially important to those faculty and students who are early in their careers, as it has proven a viable mechanism for getting these individuals on a sustainable software path early on.

As it is beyond the scope of a lightning talk summary paper to fully describe the mechanism by which the success of HydroShare’s software engineering is achieved, we refer the reader to a book chapter titled “HydroShare – A case study of the application of modern software engineering to a large distributed federally-funded scientific software development project” that offers a comprehensive discussion of this work [4]. The book chapter discusses the HydroShare team’s use of iterative software development, continuous integration, and DevOps. HydroShare features are positioned as GitHub branches and worked on by subsets of the active HydroShare development community distributed across the ten collaborating organizations. Designs of proposed new features are first discussed extensively during one of the weekly HydroShare team calls involving hydrology domain researchers, developers, and software engineers from the HydroShare collaborating institutions as well as community stakeholders. Designs are revisited as required so as to adapt to new technologies and/or address changing requirements. Once a design is accepted, unit tests are written and integrated with Jenkins, an open source continuous integration tool. GitHub commits are made daily, and functional progress is reviewed weekly for functionality and usability through running-code demonstrations during the HydroShare weekly team calls on one of the HydroShare pre-release virtual machines. HydroShare GitHub feature branches are rebased regularly with the HydroShare main branch to keep the code from getting out of sync with the HydroShare production release. As with all HydroShare code, code reviews are performed by someone other than the author of the code, and only once a “+1” is given by the code reviewer (via GitHub issue tracking) and all unit tests pass will code be committed incrementally into the HydroShare main branch. The HydroShare main site offers mechanisms for the community at large to comment on issues (including submission of bugs) and contribute suggestions to HydroShare. The HydroShare GitHub site maintains active statistics demonstrating the vibrant, open, and diverse HydroShare research software development activity [5].

A concluding note on the software sustainability of HydroShare: since 2002, CUAHSI [6] – the primary U.S. hydrology consortium with 130 member universities and international organizations – collaborated in the predecessor to HydroShare called Hydrologic Information System, or HIS [7]. HIS is now maintained by the CUAHSI Water Data Center [8] as its community sustainability model. HydroShare is positioned as the successor to HIS, complementing but not replacing it. When the NSF-funded HydroShare award concludes, it will also be hosted by the CUAHSI Water Data Center – its long-term sustainability ensured by modern software engineering that will readily enable the broader community to continually make novel and useful contributions.

ACKNOWLEDGMENT

This material is based upon work supported by the USA National Science Foundation (NSF) under awards 1148453 and 1148090; any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.

REFERENCES

[1] Tarboton, D. G., Idaszak, R., Horsburgh, J. S., Heard, J., Ames, D. P., Goodall, J. L., Band, L., Merwade, V., Couch, A., Arrigo, J., Hooper, R., Valentine, D., and Maidment, D. (2014), "HydroShare: Advancing Collaboration through Hydrologic Data and Model Sharing." 7th International Conference on Environmental Modelling and Software. Ed. D. Ames and N. Quinn. San Diego, 2014.
[2] Implementation of NSF CIF21 Software Vision (SW-Vision), http://www.nsf.gov/si2/.
[3] NSF collaborative HydroShare award numbers 1148453 and 1148090, http://www.nsf.gov/awardsearch/showAward?AWD_ID=1148453/ and http://www.nsf.gov/awardsearch/showAward?AWD_ID=1148090/.
[4] Idaszak, R., Tarboton, D.G., Yi, H., Christopherson, L., Stealey, M.J., Miles, B., Dash, P., Couch, A., Spealman, C., Ames, D.P., Horsburgh, J.S. HydroShare – A case study of the application of modern software engineering to a large distributed federally-funded scientific software development project. Accepted for inclusion in: J. Carver, N.P.C. Hong, and G.K. Thiruvathukal (eds.) Software Engineering for Science, ISBN 9781498743853. Taylor&Francis CRC Press; November 2016.
[5] HydroShare GitHub site, https://github.com/hydroshare/hydroshare/graphs/commit-activity.
[6] A Vision for Hydrologic Science Research, Consortium of Universities for the Advancement of Hydrologic Sciences Inc., Technical Report Number 1, http://www.cuahsi.org/publications/cuahsi_tech_rpt_1.pdf.
[7] CUAHSI Hydrologic Information Systems, Consortium of Universities for the Advancement of Hydrologic Science, Inc. Technical Report Number 2 - Hydrologic Information Systems Committee, http://www.cuahsi.org/docs/dois/CUAHSI-TR2.pdf.
[8] https://www.cuahsi.org/wdc.

This work is licensed under a CC-BY-4.0 license.