Social Debt Analytics for Improving the Management of Software Evolution Tasks

Fabio Palomba
Delft University of Technology, Eindhoven University of Technology, The Netherlands
f.palomba@tudelft.nl, f.palomba@tue.nl

Alexander Serebrenik
Eindhoven University of Technology, The Netherlands
a.serebrenik@tue.nl

Andy Zaidman
Delft University of Technology, The Netherlands
a.e.zaidman@tudelft.nl

Abstract: The success of software engineering projects is in large part dependent on social and organizational aspects of the development community. Indeed, it depends not only on the complexity of the product or the number of requirements to be implemented, but also on people, processes, and how they impact the technical side of software development. Social debt represents patterns across the organizational structure around a software system that may lead to additional unforeseen project costs. Condescending behavior, disgruntlement or rage quitting are just some examples of social issues that may occur among the developers of a software project. While the research community has recently investigated the underlying dynamics leading to the introduction of social debt (e.g., the so-called “community smells”, which represent symptoms of the presence of social problems in a community), as well as how such debt can be paid off, there is still a noticeable lack of empirical evidence on how social debt impacts software maintenance and evolution. In this paper, we present our position on how social debt can impact technical aspects of source code by presenting a road map toward a deeper understanding of this relationship.

I. RESEARCH PROBLEM AND MOTIVATION

Software engineering is clearly characterized by two main components, i.e., technical and social [39]. While the former is related to the processes aimed at producing sound technical products that meet the expected requirements, the latter has to do with the relationships involving organizations, developers, and stakeholders who are responsible for the definition of the technical product [12]. According to the Netherlands Knowledge and Innovation Agenda ICT 2016 - 2019, “software and system complexity is not solely of technological nature but also defined by people and processes”¹. Indeed, the success of software engineering projects is strongly dependent on social and organizational aspects of the development community [11].

¹ https://www.4tu.nl/nirict/en/Research/knowledge-and-innovation-agenda-ict-2016-2020.pdf

In past and recent years the software evolution research community has actively focused on technical aspects of software projects, by (i) understanding the key factors making technical products easier to maintain [2], [4], [10], [29], [32], [44], [48] or (ii) devising techniques and tools to support developers during different evolutionary tasks [17], [25], [31], [36], [45]. For example, Cunningham [14] introduced the metaphor of technical debt, which refers to practices performed by developers that lead to the introduction of bad design solutions that will turn into an additional cost during software maintenance and evolution. One noticeable symptom of technical debt is the presence of code smells [19], i.e., bad programming practices applied by developers that make the source code less maintainable and more change- and fault-prone [30].

On the other hand, software communities have been the subject of several empirical investigations as well [5], [9], [15], [33], [40], [41]: in particular, they have mainly focused on how such communities evolve [21], [27], [28], [47], while they only partially considered the relationships between community-related information and software maintenance and evolution of technical products. Indeed, such studies especially targeted the so-called social debt, i.e., unforeseen project cost connected to a “suboptimal” development community (i.e., both in structure and behavior) [42], [43]. For instance, Tamburri et al. [41], [43] defined a set of community smells, a set of socio-technical characteristics (e.g., high formality) and patterns (e.g., recurrent condescending behavior, or rage-quitting), which may lead to the emergence of social debt. One of the typical community smells they found is the Organizational Silo Effect, which arises when a software community presents siloed areas that essentially do not communicate, except through one or two of their respective members: as a consequence, the development activities might be delayed due to lack of communication between developers.

Besides the studies on social debt, a considerable number of empirical analyses have been carried out on the so-called socio-technical congruence [8], i.e., the alignment between coordination requirements extracted from technical dependencies among tasks and the actual coordination activities performed by the developers. While studies in this category had the intention to investigate the relationship between social and technical sides of software communities (e.g., studying how the collaboration among developers influences their productivity [16], [18]), we believe there is still a lack of evidence on their connection, i.e., how social-related issues occurring among the developers of a software project and indicated by the presence of social debt might impact the quality of both technical products and processes. In other words, it is still unclear whether factors like coordination issues, lack of communication, or the structure of the development community have an effect on the quality of source code (e.g., code smells [19]) or the quality of software processes (e.g., code review [2] and bug resolution [22] completion time). We believe that such a relationship is of paramount importance for enlarging the knowledge on how software systems evolve and how to properly support developers during software evolution. While some exploratory studies in this direction have been conducted in the domain of requirements engineering for coordination by Cataldo et al. [8] and Kwan et al. [24], both of these lines of inquiry and related research are still far from relating social and technical debt with the goal of comprehending their co-creation and evolution.

In this paper, we present our position about the relationship between social and technical aspects of source code by describing our road map toward its proper management.

II. ON THE RELATIONSHIP BETWEEN SOCIAL AND TECHNICAL DEBT

The first envisioned step toward the exploration of the connection between social and technical debt is to assess whether such a relationship actually exists and, as a consequence, what its strength is. In particular, we hypothesize that community-related issues can be the cause of introducing technical debt. To verify this hypothesis, we believe that future research effort should be devoted to designing a number of empirical investigations aimed at characterizing the interaction between the two aspects, including but not limited to:

1) How social debt impacts the presence of technical debt. As briefly explained in the previous section, community smells indicate frequent negative patterns over a software community organization that might result in additional project costs. We suppose that such project costs can be basically due to (i) pure social-related issues arising between developers and (ii) technical issues arising in the source code. While the causes behind additional costs due to the presence of social issues within communities have been studied [5], [9], [15], [34], an important challenge for the research community is to assess the extent to which additional technical costs are due to social debt. Specifically, we believe that future research should focus on the relationship between social debt, e.g., measured in terms of community smells [41], and technical aspects such as (i) presence of code smells and (ii) introduction of defects. In the first case, a clear real-case scenario might involve the presence of sub-communities that do not communicate with each other (e.g., in the presence of an Organizational Silo smell) that, as a consequence, are not able to come up with a correct way to modularize the different modules of the systems they are developing, thus possibly introducing architectural or code smells like Promiscuous Package or Blob [7], [19]. In the second case, it is reasonable to think that the presence of issues within a community might lead developers to misunderstand requirements implementation or design choices to apply, thus possibly introducing bugs. For instance, Kwan et al. [24] suggested that social debt not only impacts software build success, but also may contribute to worsening program comprehension.

2) How different software community types are affected by technical debt. As is clearly visible from modern software development practices, current systems rely heavily on open-source software (OSS) [13], [35]. Indeed, over the past years OSS has moved from an academic curiosity to a mainstream focus within software communities [13]. Despite their high popularity, open-source development communities rarely rely on governance insights from organizational research and/or track their organizational status using social network analysis (SNA) [23], e.g., to evaluate the current salient social and organizational characteristics describing their community. Such a lack of structured information might lead to two undesirable consequences. On the one hand, the implosion of the community: for example, a growing body of reflections and empirical evidence on abandonware [26], [37] or even the failure of entire open-source forges indicates the need to understand and better support the organizational and social aspects of open-source communities [20], [38]. On the other hand, the introduction of specific technical debt based on the underlying community type of a software project: for instance, communities characterized by the presence of formal vs informal communications might suffer from different technical issues due to the different nature of developers’ interactions. Thus arises the need to support clients of OSS in the evaluation of the risks associated with the usage of such open-source code in their systems. Therefore, an important challenge is represented by a deeper analysis of (i) how to automatically identify the key characteristics of different community types so that they can be classified and (ii) to what extent different community types exhibit different technical debt such as code smells or bugs.

To face the challenges described so far, a mixture of quantitative and qualitative analyses is required. Indeed, on the one hand, the assessment of the relationship between social and technical debt needs to be performed by means of correlation, statistical, and predictive analyses. On the other hand, surveys or semi-structured interviews with developers might reveal additional insights on such a relationship, possibly explaining (i) the motivations behind it and (ii) effective ways to re-organize software communities to remove social debt.

III. ON THE IMPACT OF SOCIAL DEBT ON TECHNICAL TASKS COMPLETION EFFORT

Once we have assessed the actual relationship between social and technical debt of software systems, the subsequent step that we envision is to determine the impact of community-related aspects on different evolutionary tasks performed by developers during software maintenance. In particular, we believe that poor communication between developers increases the time needed to complete technical tasks such as code review, issue resolution, etc. This set of empirical investigations includes:

1) How social debt impacts code review and bug resolution time. The assessment of software quality during maintenance and evolution is typically guided by modern code reviews [2]: in a typical code review, one or more developers take on the role of reviewers and verify that the proposed source code changes meet the quality requirements with respect to readability, maintainability, and the absence of defects, before being deployed into production. In this context, social issues occurring within the development community might influence this process in two different ways: (i) how source code is reviewed, i.e., the quality of a review, and (ii) the time needed to verify that the changes applied meet the requirements. Indeed, following the socio-technical argumentations proposed so far [8], [24], problems in the technical development may be a reflection of issues occurring at the community level. At the same time, another critical software engineering process during maintenance and evolution is related to the way developers fix bugs [1]: we believe that the research community should focus on investigations aimed at understanding (i) how the presence of community-related issues impacts the ability to diagnose bugs and (ii) whether the time needed to resolve a bug is higher when the reviewer is involved in social debt.

2) How social debt impacts maintenance effort estimation models. Since its birth, software effort estimation has represented the Holy Grail of software engineering [6], since it supports project managers in the correct estimation of effort and costs needed to develop a software system. The problem is even trickier when it comes to maintenance and evolution: indeed, this phase is responsible for more than 80% of the total cost of software [3]. While some researchers proposed the use of product and process metrics for estimating the effort of future maintenance tasks [46], following the conjectures reported previously in this section we believe that a promising challenge is to exploit social debt as an additional factor to correctly estimate the costs of maintenance and evolution tasks.

IV. IMPROVING SOFTWARE EVOLUTION TOOLS BY EXPLOITING SOCIAL DEBT

Based on the results achieved from the studies described in Sections II and III, several software evolution tools might be improved by means of social debt analysis. For instance, it would be possible to study solutions aimed at (i) detecting social debt in software communities (e.g., by means of social network analysis techniques), (ii) prioritizing technical debt based on the co-occurrence of social debt, (iii) improving code review and bug resolution tools by exploiting social debt when recommending the developers that should perform such tasks, and (iv) improving maintenance effort estimation capabilities by adding social debt information as additional predictors. The results of such a line of research are intended to be useful for both managers, interested in solving potential social issues in a community, and developers, interested in estimating the effort (social and technical) required for maintaining the source code. Finally, the research might set a research agenda for socio-technical factors in software development.

V. CONCLUSION

Social debt is a manifestation of issues occurring within a software development team. For instance, community smells reflect sub-optimal organizational and socio-technical characteristics or patterns in the organizational structure of the software community. Much in the same way, technical debt represents sub-optimal solutions applied by programmers during the development of technical artifacts. Recent empirical studies provided some insights into a possible relationship between them which, however, has not yet been demonstrated.

In this paper, we highlight a possible outline of future research on the relationship between social and technical debt arising in software systems, by describing a set of empirical studies that might lead to a more comprehensive understanding of the phenomenon. Starting from the connection between symptoms of the presence of social and technical debt, i.e., community and code smells, important challenges for the software evolution research community include the understanding of (i) how different software development community types are affected by social and technical debt, (ii) how different software development practices, such as code review, bug resolution, or effort estimation, might be impacted by issues occurring at the community level, and (iii) whether and to what extent current software quality evaluation techniques and tools can be improved by exploiting social debt information.

We believe that addressing such challenges would help in the deeper understanding of how software systems evolve and how community-related factors impact this process.

VI. ACKNOWLEDGMENT

Fabio Palomba is funded by the 4TU-NIRICT project “Social Aspects of Software Quality”. Any opinions, findings, and conclusions expressed herein are the authors’ and do not necessarily reflect those of the sponsors.

REFERENCES

[1] J. Anvik, L. Hiew, and G. C. Murphy. Who should fix this bug? In Proceedings of the 28th international conference on Software engineering, pages 361–370. ACM, 2006.
[2] A. Bacchelli and C. Bird. Expectations, outcomes, and challenges of modern code review. In Proceedings of the 2013 international conference on software engineering, pages 712–721. IEEE Press, 2013.
[3] R. D. Banker, S. M. Datar, C. F. Kemerer, and D. Zweig. Software complexity and maintenance costs. Communications of the ACM, 36(11):81–95, 1993.
[4] P. Bhattacharya and I. Neamtiu. Fine-grained incremental learning and multi-feature tossing graphs to improve bug triaging. In Software Maintenance (ICSM), 2010 IEEE International Conference on, pages 1–10. IEEE, 2010.
[5] C. Bird, N. Nagappan, H. Gall, B. Murphy, and P. Devanbu. Putting it all together: Using socio-technical networks to predict failures. In Proceedings of the 2009 20th International Symposium on Software Reliability Engineering, ISSRE ’09, pages 109–119, Washington, DC, USA, 2009. IEEE Computer Society.
[6] L. C. Briand and I. Wieczorek. Resource estimation in software engineering. Encyclopedia of software engineering, 2002.
[7] W. H. Brown, R. C. Malveau, H. W. S. McCormick, and T. J. Mowbray. AntiPatterns: Refactoring Software, Architectures, and Projects in Crisis. John Wiley & Sons, Inc., New York, NY, USA, 1st edition, 1998.
[8] M. Cataldo, J. D. Herbsleb, and K. M. Carley. Socio-technical congruence: a framework for assessing the impact of technical and work dependencies on software development productivity. In H. D. Rombach, S. G. Elbaum, and J. Münch, editors, Proceedings of the Int’l Symposium on Empirical Software Engineering and Measurement (ESEM), pages 2–11. ACM, 2008.
[9] M. Cataldo and S. Nambiar. The impact of geographic distribution and the nature of technical coupling on the quality of global software development projects. Journal of Software: Evolution and Process, 24(2):153–168, 2012.
[10] A. Chatzigeorgiou and A. Manakos. Investigating the evolution of code smells in object-oriented systems. Innov. Syst. Softw. Eng., 10(1):3–18, Mar. 2014.
[11] M. Conway. How do committees invent. Datamation, 14(4):28–31, 1968.
[12] M. E. Conway. How do committees invent. Datamation, 14(4):28–31, 1968.
[13] K. Crowston, K. Wei, J. Howison, and A. Wiggins. Free/libre open-source software development: What we know and what we do not know. ACM Computing Surveys (CSUR), 44(2):7, 2012.
[14] W. Cunningham. The wycash portfolio management system. SIGPLAN OOPS Mess., 4(2):29–30, Dec. 1992.
[15] S. Datta, R. Sindhgatta, and B. Sengupta. Evolution of developer collaboration on the jazz platform: a study of a large scale agile project. In Proceedings of the 4th India Software Engineering Conference, ISEC ’11, pages 21–30, New York, NY, USA, 2011. ACM.
[16] C. R. de Souza and D. F. Redmiles. On the roles of apis in the coordination of collaborative software development. Computer Supported Cooperative Work (CSCW), 18(5-6):445, 2009.
[17] D. Di Nucci, F. Palomba, G. De Rosa, G. Bavota, R. Oliveto, and A. De Lucia. A developer centered bug prediction model. IEEE Transactions on Software Engineering, 2017.
[18] S. B. Fonseca, C. R. De Souza, and D. F. Redmiles. Exploring the relationship between dependencies and coordination to support global software development projects. In Global Software Engineering, 2006. ICGSE’06. International Conference on, pages 243–243. IEEE, 2006.
[19] M. Fowler. Refactoring: Improving the Design of Existing Code. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1999.
[20] J. Gamalielsson and B. Lundell. Sustainability of open source software communities beyond a fork: How and why has the libreoffice project evolved? Journal of Systems and Software, 3(11):128–145, 2013.
[21] Y. Gao, V. Freeh, and G. Madey. Analysis and modeling of the open source software community. NAACSOS, Pittsburgh, 2003.
[22] G. Jeong, S. Kim, and T. Zimmermann. Improving bug triage with bug tossing graphs. In Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering, pages 111–120. ACM, 2009.
[23] D. Knoke and S. Yang. Social network analysis, volume 154. Sage, 2008.
[24] I. Kwan, A. Schroter, and D. Damian. Does socio-technical congruence have an effect on software build success? a study of coordination in a software project. IEEE Trans. Softw. Eng., 37(3):307–324, May 2011.
[25] M. Lanza. The evolution matrix: Recovering software evolution using software visualization techniques. In Proceedings of the 4th international workshop on principles of software evolution, pages 37–42. ACM, 2001.
[26] T. Mens, M. Claes, P. Grosjean, and A. Serebrenik. Studying evolving software ecosystems based on ecological models. In Evolving Software Systems, pages 297–326. Springer, 2014.
[27] T. Mens and M. Goeminne. Analysing the evolution of social aspects of open source software ecosystems. In IWSECO@ ICSOB, pages 1–14, 2011.
[28] K. Nakakoji, Y. Yamamoto, Y. Nishinaka, K. Kishida, and Y. Ye. Evolution patterns of open-source software systems and communities. In Proceedings of the international workshop on Principles of software evolution, pages 76–85. ACM, 2002.
[29] W. Oizumi, A. Garcia, L. da Silva Sousa, B. Cafeo, and Y. Zhao. Code anomalies flock together: Exploring code anomaly agglomerations for locating design problems. In Software Engineering (ICSE), 2016 IEEE/ACM 38th International Conference on, pages 440–451. IEEE, 2016.
[30] F. Palomba, G. Bavota, M. Di Penta, F. Fasano, R. Oliveto, and A. De Lucia. On the diffuseness and the impact on maintainability of code smells: a large scale empirical investigation. Empirical Software Engineering, pages 1–34, 2017.
[31] F. Palomba, G. Bavota, M. D. Penta, R. Oliveto, D. Poshyvanyk, and A. D. Lucia. Mining version histories for detecting code smells. IEEE Transactions on Software Engineering, 41(5):462–489, May 2015.
[32] F. Palomba, A. Panichella, A. Zaidman, R. Oliveto, and A. De Lucia. The scent of a smell: An extensive comparison between textual and structural smells. IEEE Transactions on Software Engineering, 2017.
[33] M. Pinzger, N. Nagappan, and B. Murphy. Can developer-module networks predict failures? In Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of software engineering, SIGSOFT ’08/FSE-16, pages 2–12, New York, NY, USA, 2008. ACM.
[34] L. Prattico. Governance of open source software foundations: Who holds the power? Technology Innovation Management Review, 1(12):37–42, 2012.
[35] K. Raju. Is the Future of Software Development in Open Source? Proprietary vs Open Source Software: A Cross Country Analysis. Journal of Intellectual Property Rights, 12(2):21–42, 2007.
[36] R. Robbes and M. Lanza. A change-based approach to software evolution. Electronic Notes in Theoretical Computer Science, 166:93–109, 2007.
[37] G. Robles, J. M. González-Barahona, C. Cervigón, A. Capiluppi, and D. Izquierdo-Cortázar. Estimating development effort in free/open source software projects by mining software repositories: a case study of openstack. In Proceedings of the 11th Working Conference on Mining Software Repositories, pages 222–231. ACM, 2014.
[38] C. M. Schweik. Sustainability in open source software commons: Lessons learned from an empirical study of sourceforge projects. Technology Innovation Management Review, 3:13–19, 2013.
[39] I. Sommerville. Software Engineering. Addison-Wesley, Harlow, England, 9th edition, 2010.
[40] D. Surian, D. Lo, and E.-P. Lim. Mining collaboration patterns from a large developer network. In WCRE, pages 269–273, 2010.
[41] D. A. Tamburri, R. Kazman, and H. Fahimi. The architect’s role in community shepherding. IEEE Software, 33(6):70–79, 2016.
[42] D. A. Tamburri, P. Kruchten, P. Lago, and H. van Vliet. What is social debt in software engineering? In ICSE - CHASE Workshop Series, pages 40–49, 2013.
[43] D. A. Tamburri, P. Kruchten, P. Lago, and H. van Vliet. Social debt in software engineering: insights from industry. J. Internet Services and Applications, 6(1):10:1–10:17, 2015.
[44] M. Tufano, F. Palomba, G. Bavota, R. Oliveto, M. D. Penta, A. D. Lucia, and D. Poshyvanyk. When and why your code starts to smell bad (and whether the smells go away). IEEE Transactions on Software Engineering, PP(99):1–1, 2017.
[45] M. White, M. Tufano, C. Vendome, and D. Poshyvanyk. Deep learning code fragments for code clone detection. In Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering, pages 87–98. ACM, 2016.
[46] H. Wu, L. Shi, C. Chen, Q. Wang, and B. Boehm. Maintenance effort estimation for open source software: A systematic literature review. In Software Maintenance and Evolution (ICSME), 2016 IEEE International Conference on, pages 32–43. IEEE, 2016.
[47] J. Xu, Y. Gao, S. Christley, and G. Madey. A topological analysis of the open souce software development community. In System Sciences, 2005. HICSS’05. Proceedings of the 38th Annual Hawaii International Conference on, pages 198a–198a. IEEE, 2005.
[48] A. Zaidman, B. Van Rompaey, S. Demeyer, and A. Van Deursen. Mining software repositories to study co-evolution of production & test code. In Software Testing, Verification, and Validation, 2008 1st International Conference on, pages 220–229. IEEE, 2008.
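
As a purely illustrative aside, the Organizational Silo Effect described in Section I (sub-communities that communicate only through one or two of their members) and the detection step (i) envisioned in Section IV can be sketched as a simple check on a communication graph. This is not a detector taken from the cited literature. The function name `find_silo_pairs`, the input shapes, the assumption that sub-communities are disjoint and already known, and the threshold `max_bridges=2` are all hypothetical choices made for the example.

```python
from itertools import combinations


def find_silo_pairs(teams, contacts, max_bridges=2):
    """Flag pairs of sub-communities that talk only through a few members.

    teams: dict mapping a sub-community name to the set of its developers
        (assumed disjoint; this is an assumption of the sketch).
    contacts: iterable of (developer, developer) pairs that communicated.
    max_bridges: a pair counts as siloed when, on each side, at most this
        many members carry all cross-team communication. The default of 2
        mirrors the "one or two members" wording and is an assumption.
    """
    # Map each developer to the sub-community they belong to.
    team_of = {dev: name for name, members in teams.items() for dev in members}

    # For every unordered pair of teams, record who talks across the boundary.
    bridges = {}
    for a, b in contacts:
        ta, tb = team_of.get(a), team_of.get(b)
        if ta is None or tb is None or ta == tb:
            continue  # skip unknown developers and intra-team contacts
        key = tuple(sorted((ta, tb)))
        side = bridges.setdefault(key, {ta: set(), tb: set()})
        side[ta].add(a)
        side[tb].add(b)

    silos = []
    for t1, t2 in combinations(sorted(teams), 2):
        side = bridges.get((t1, t2), {t1: set(), t2: set()})
        if len(side[t1]) <= max_bridges and len(side[t2]) <= max_bridges:
            # Report the pair together with the members acting as bridges.
            silos.append((t1, t2, sorted(side[t1] | side[t2])))
    return silos


# Hypothetical toy community with three sub-communities.
teams = {
    "core": {"ann", "bob", "cy"},
    "ui": {"dee", "eli", "fay"},
    "docs": {"gus", "hal", "ivy"},
}
contacts = [
    ("ann", "bob"), ("ann", "dee"), ("bob", "dee"), ("bob", "eli"),
    ("cy", "fay"), ("dee", "eli"), ("gus", "hal"), ("ann", "gus"),
]
silos = find_silo_pairs(teams, contacts)
```

In this toy data, "core" and "docs" are linked only through one member each, and "docs" and "ui" do not communicate at all, so both pairs are flagged, whereas "core" and "ui" are not because three members on each side talk across the boundary. Note that under this naive rule any sub-community with at most `max_bridges` members is flagged trivially; an empirical study would need a more careful operationalization, for example one based on the community smell definitions in [41], [43].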