1st International Workshop on Technical Debt Analytics (TDA 2016)

Report on the First International Workshop on Technical Debt Analytics (TDA 2016)

Aiko Yamashita, CWI, the Netherlands & HiOA, Norway, aiko.yamashita@cwi.nl
Leon Moonen, Simula Research Laboratory, Norway, leon.moonen@computer.org
Tom Mens, University of Mons, Belgium, tom.mens@umons.ac.be
Amjed Tahir, Massey University, New Zealand, a.tahir@massey.ac.nz

Abstract: This report outlines the motivation and goals of the First International Workshop on Technical Debt Analytics (TDA 2016), presents the workshop programme, introduces the work accepted for presentation, and summarizes the major results and themes that emerged from the discussion and activities undertaken during the workshop.

I. INTRODUCTION

Technical debt (TD) is a metaphor reflecting technical compromises that can yield short-term benefit but may hurt the long-term health of a software system. This metaphor was initially concerned with software implementation (i.e., code smells), but it has been extended to software design and architecture (i.e., anti-patterns and architectural smells) as well as documentation, requirements, and testing [1].

A systematic literature review by Li et al. [2] indicates that the term “debt” has been used in different ways by different software communities, leading to ambiguous interpretations of the term. They found that code-related TD (i.e., code smells) and its detection and resolution have gained the most attention, whilst there is a need for more empirical studies with high-quality evidence on the whole Technical Debt Management (TDM) process and on the application of specific TDM approaches in industrial settings. The lack of empirically rooted evidence makes it difficult for organizations to align business value with the intrinsic quality of the software product itself.

Zazworka et al. [3] argue that, in many projects, the cost and benefit of software refactoring (an approach to repaying TD) cannot be easily quantified and estimated. Consequently, it is still an open challenge to translate TD into economic consequences, making it difficult for development teams to make a strong case to the business side to invest in fixing technical shortcuts.

One of the major challenges rooted in the “ambiguity” noted by Li et al. [2] is the lack of an underlying theory and models to aid TD identification and measurement. Seaman and Guo [4] argue that a comprehensive TD theory should be developed to formalize the relationship between the cost and benefit of the TD concept, and that practical TDM approaches should subsequently be developed and validated to exploit the TD theory in management decision making.

II. ABOUT TDA 2016

TDA 2016 was held in New Zealand on 6 December 2016, in conjunction with the 23rd Asia-Pacific Software Engineering Conference (APSEC 2016). The goal of TDA 2016 was to offer a specialised arena in TD to discuss:

1) Calibrating technical debt and technical wealth related terminologies and concepts that are used indistinctly and interchangeably in software engineering literature.
2) Comparing, integrating, compiling and even reconciling empirical work on the effects of technical debt/technical wealth from economic and organisational perspectives.

To reach these goals, the workshop gathered practitioners and researchers working in the field of TD, to share experiences, concur on terminologies and evaluation guidelines, and to build a common research agenda for the community.

TDA 2016 built further upon results proposed during the Dagstuhl 16162 seminar on Managing Technical Debt in Software Engineering (April 2016), and discussed during the eighth international workshop on Managing Technical Debt (MTD), held in October 2016 in conjunction with ICSME.

III. WORKSHOP PROGRAMME

The morning session of the workshop started with an introduction by the organisers, immediately followed by an invited keynote presentation entitled “Towards quantifying technical debt” by Ewan Tempero, Associate Professor in the Department of Computer Science at The University of Auckland, New Zealand. He discussed the current status of measuring TD and presented ideas as to what the TD community needs to do to develop the necessary tools to properly manage TD, and more specifically to quantify TD.

This keynote talk was followed by a short lightning talk by Jim Buchan on the relation between technical debt and legacy software. The extended abstract of his talk is reprinted, with permission, in section IV-C of this report.

The remainder of the morning session was filled with presentations of the accepted peer-reviewed contributions for TDA 2016. These are summarised in section IV-B.

The afternoon was devoted to a moderated discussion around the workshop goals, initiated by a brief summary by Clemente Izurieta (Assistant Professor at Montana State University), who presented the technical debt roadmap, future research perspectives and open research challenges discussed during the Dagstuhl seminar on Managing Technical Debt [5].

This presentation was followed by an interactive working session using discussion techniques such as card sorting and fishbowl panels. These sessions aimed at building a common understanding of the challenges, identifying future directions for potential solutions, and establishing a common research agenda.

IV. WORKSHOP CONTRIBUTIONS

A. Keynote address by Ewan Tempero: Towards quantifying technical debt

Summary: Technical debt (TD) is a metaphor that comes from the financial world; however, it breaks down almost immediately. In the financial world when considering taking on debt, we can use a financial planner to determine such things as what our regular payments need to be and what the total cost of the loan will be. In software development, those making a decision that creates TD have no idea how much debt they are taking on, and often do not even realise when they are taking on some debt. For the metaphor to be useful, we must develop the means to quantify TD, in particular to be able to do so before we take on TD. In this talk I will discuss the current status of measuring TD and present some ideas as to what we have to do to develop the necessary tools to properly manage TD, specifically what we need to do to quantify TD.

Bio: Ewan Tempero is an Associate Professor in the Department of Computer Science at The University of Auckland, New Zealand. He graduated from the University of Otago, New Zealand, with a B.Sc. (Honours) in Mathematics in 1983 and received his Ph.D. in Computer Science from the University of Washington, USA, in 1990. He has published over 170 papers in journals and internationally-refereed conferences, mainly in the areas of software reuse, software tools, and software metrics. His current research is developing metrics for measuring the quality of software designs. He is the developer and maintainer of the Qualitas Corpus.

B. Accepted submissions for TDA 2016

The following papers were accepted to be presented during TDA 2016:

• Norihiro Yoshida. When, why and for whom do practitioners detect technical debt? An experience report.
Based on his experience through industry-university collaboration, the author discusses when, why and for whom practitioners detect code clones, one of the most common code-level notions of technical debt.

• Yasutaka Kamei, Everton Maldonado, Emad Shihab and Naoyasu Ubayashi. Using Analytics to Quantify Interest of Self-Admitted Technical Debt.
In this paper, the authors determined ways to measure the ‘interest’ on the debt and used these measures to see how much of the technical debt incurs positive interest, i.e., debt that indeed costs more to pay off in the future. To measure interest, they used the LOC and Fan-In code metrics, and carried out a case study on the Apache JMeter project.

• Solomon Mensah, Jacky Keung, Michael Franklin Bosu and Kwabena Ebo Bennin. Rework Effort Estimation of Self-admitted Technical Debt information.
Programmers unintentionally leave incomplete, temporary workarounds and buggy code that require rework. This phenomenon in software development is referred to as Self-admitted Technical Debt (SATD). The authors report on an exploratory study that used a text mining approach to extract SATD from developers' source code comments and implemented an effort metric to estimate the rework effort that might be needed to resolve the SATD problem. The study confirms the results of a prior study that found design debt to be the most predominant class of SATD. This technique could support managerial decisions on whether to handle SATD as part of on-going project development or defer it to the maintenance phase.

• Aabha Choudhary and Paramvir Singh. Minimizing Refactoring Effort through Prioritization of Classes based on Historical, Architectural and Code Smell Information.
The authors present an approach for identifying and prioritizing object-oriented software classes in need of refactoring by identifying the most change-prone as well as the most architecturally relevant classes, and by generating class ranks based on code smell information. The approach also provides developers with an estimation of maximum code smell correction (paying off maximum technical debt) with minimum refactoring effort.

• Johannes Holvitie, Sherlock Licorish, Antonio Martini and Ville Leppänen. Co-Existence of the ‘Technical Debt’ and ‘Software Legacy’ Concepts.
Beyond strategic and accidental accumulation, technical debt may also occur due to delayed accumulation. In addition, technical debt and software legacy are concepts that share a lot of commonalities. Both concepts describe a state of software that is sub-optimal, and explain how this state can decrease an organization’s development efficiency. The authors report on an initial examination of technical debt and software legacy similarities, and their somewhat challenging co-existence.

• Clemente Izurieta, Ipek Ozkaya, Carolyn Seaman, Philippe Kruchten, Robert Nord, Will Snipes, Paris Avgeriou. Perspectives on Managing Technical Debt: A Transition Point and Roadmap from Dagstuhl.
This paper summarizes the outcomes of a Dagstuhl Seminar where the current state of managing technical debt in software engineering was discussed. Participants reflected on the significant advances that the Managing Technical Debt (MTD) community has made since its inception in 2010; reached a consensus on a definition, called the Dagstuhl 16K technical debt definition; and discussed avenues for future progress in the area. This paper offers a roadmap and a vision that describe the areas of research in TD where significant challenges remain.

C. Lightning talk by Jim Buchan: TD and legacy code

Many organisations have software that has evolved over many years and much of their focus is on enhancing and modifying this existing software product, some of which may be based on older technology. Although the older code may represent the company’s core intellectual property, representing previous innovations, its age often introduces constraints and compromises on the continued evolution of the product. Over time, the proportion of the code judged as “legacy” grows, resulting in increased effort and uncertainty in expanding, testing and modifying the legacy code. At some point in time this may become untenable, with lost opportunity offered by new technologies, shortage of expertise in the legacy system, or unacceptable levels of bugs. Often this will trigger a full or partial re-write of the system to replace parts of the code.

The growth of the legacy code can be viewed as increasing technical debt (TD) in the sense that the software design has become sub-optimal over time and the interest in not paying back the legacy code debt may increase. The past decisions to incur debt may have a component that is deliberate, with the acceptance of compromises to new code, due to constraints of the legacy code. There is also a component of the TD that is not deliberate, with the emergence of new, unforeseeable technologies that offer new design and business opportunities.

I present a brief case study of an organization in New Zealand and its challenges and issues related to dealing with TD in the form of legacy code. The case organisation has a software product that grew very quickly in the size of the client base as well as the code base in the late 90s and early 2000s. It was largely based on technology written in a 4GL language common at the time, with an integrated database and (limited) GUI input and output.

Over recent years the user interface and new modules were refreshed for a more modern look and feel, as well as to take advantage of the performance gains of new technologies. The changes were typically wrappers for the underlying 4GL code introducing new layers of processing, conversion, and presentation. There are over 1 million lines of code in the 4GL language and expertise in the 4GL language was becoming scarce. Extensions to the product were becoming increasingly difficult to develop and test with a high degree of uncertainty in dependencies and redundancies. After around 15 years of growth, the decision was made to initiate a project to port the existing product to a new technology stack, adding some new features at the same time.

In this presentation I will firstly establish a common vocabulary by exploring the meanings (and ambiguity) in the concepts of legacy code and TD and their relationship. Analysis of the case organization provides some grounded insights into some of the challenges and consequences of managing a large proportion of legacy code, as well as suggesting some recommendations for managing the legacy code debt. Based on the case study, as well as related research-based theory, I will address questions that include:

• When does code become “legacy”?
• How can the need to replace legacy code be identified?
• How can the economic value of a legacy code re-write be evaluated?
• What is the best way to approach the large re-write of a legacy code-base?

V. THEMES THAT EMERGED FROM THE WORKSHOP

The following research challenges and open issues emerged via discussions during the keynote, presentations, lightning talk and afternoon activities.

A. Heterogeneity of TD Definitions

One of the major obstacles to building a unified approach to quantifying TD was found to be the lack of consensus on what constitutes TD. There are many different definitions of TD, leading to many different interpretations. For example, one way to define TD is “do what you need to do to get a release on time”. However, the interpretation of this definition is highly context-dependent and misses some important details on the potential effects of TD. It is widely believed that TD does not have a commonly agreed vocabulary (i.e., different terms can mean the same things). The SonarQube tool is a typical example of this issue, as it indicates TD for each rule violation. However, these violations are often project-dependent, so the tool can display misleading or inaccurate results when used to determine TD.

B. Immature Measurement Theory in TD Studies

Another challenge for quantifying TD is the lack of understanding of general measurement theory and empirical assessment of software measurements [6]. For example, there is frequent confusion about what constitutes a metric as opposed to a measurement.

C. Unclear Relation between TD and Legacy Systems

Another challenge raised was the difficulty of relating problems stemming from legacy systems to problems stemming from TD. What are the boundaries and the differences between both phenomena? It appears that managers in industry are unsure of what these boundaries are. During the workshop, a literature review was presented in the form of a mapping study to verify whether the terms legacy and debt have been used together in previous studies.

D. Moderator and Contextual Factors

Another issue is the influence of moderator variables on TD. One of the case studies presented during the workshop indicated that the type of programming language used strongly influences the presence of code clones. One should therefore be careful when assessing TD based on code cloning, for example by taking into consideration the language used, the framework available, etc.

E. Self-Admitted Technical Debt

One advance presented and discussed during the workshop was the use of a text mining approach to identify instances of self-admitted technical debt (SATD). This can also help to better model and understand instances of TD and the context in which they were identified, as well as to estimate the rework resulting from SATD.

F. Technical Debt Research Community Agenda

During the workshop, the major outcomes from the Dagstuhl seminar on Managing Technical Debt were presented. That seminar tried to achieve the following goals:

• Identify the most pressing industry problems
• Identify the most promising research approaches
• Identify the “hard” research questions

A new definition of TD was proposed, focusing particularly on two quality aspects: maintainability and evolvability. The definition underscored the importance of the domain (e.g., design or implementation issue) and the technical context (e.g., degree of uncertainty, development and organisational context, time, causal chains).

Concerning a TD community agenda, it was deemed that the research and development of TD should lead to the following picture:

• More effort needs to be spent to develop a clear operational definition of minimum viable quality levels that can reconcile both technical and economic perspectives.
• There should be a clear way to translate developer concerns into manager concerns, which can be used as a basis for making decisions on investing in technical wealth (TW).
• TD would be incurred unintentionally most of the time.

VI. RESULTS FROM THE CARD SORTING ACTIVITY

All participants were asked to post the TD terminologies that they were aware of on the whiteboard. We suggested three categories to classify terms: TD (Technical Debt), TW (Technical Wealth), and “Others” to mark relevant terms that did not directly contribute to TD or TW. For each card, it was discussed with the whole audience why the term should be included or excluded. We then mapped similar terms together, resulting in the concept map shown in Figure 2.

Fig. 1. TDA participants during the card sorting activity

Many of the participants suggested a wide variety of domains that they believe to be related to TD/TW, ranging from requirement gathering and analysis to deployment and maintenance. As shown in Figure 2, the majority of included TD terms are related to design TD. This has been referred to as anti-patterns, architectural smells, design flaws or poor architectural decisions. Some of the terms used by participants referred to coding-related TD issues such as code smells, ignoring standards, and poor programming practices.

[Figure 2: concept map with three groups of terms: Existing Definitions of Technical Debt (code smells, anti-patterns, ...), Domains of usage, and Existing Definitions of Technical Wealth (design patterns, refactoring).]

Fig. 2. Concept Map resulting from Sorting Activity

For TW, participants suggested that good design practices are the key. They considered the use of design patterns and refactoring valuable for achieving TW. A suggested example was the use of aspects (as in Aspect-Oriented Programming) and the implementation of the “separation of concerns” principle. Other suggestions that are likely to contribute to TW were the use of proper documentation, comments in the code, and applying coding standards. Many organisational factors are also considered valuable for TW. Examples are the awareness and acknowledgment of TD, the acknowledgment of additional maintenance cost and the risk of immature refactoring decisions.

VII. RESULTS FROM THE PANEL DISCUSSION

A panel discussion was moderated by Jim Buchan, featuring three panelists: Ewan Tempero, Clemente Izurieta and Yasutaka Kamei. The discussion focused on two out of seven topics selected by participants. The audience voted for the following questions: What is needed beyond more empirical studies? and What are the likely reasons studies appear to display contradictory results on the same smells (TD)?

Question 1: What is needed beyond more empirical studies?
Ewan asserted the need to think more about how to do studies and how to replicate some of these studies. He asserted that current studies in the SE community lack sufficient details to make replication of results possible. Clemente asserted that the goal of empirical studies needs to be clearly outlined, as well as their motivation. Yasutaka added that it is important to have actionable results from these studies so that they can be used in industry.

Question 2: What are the likely reasons studies appear to display contradictory results on the same smells (TD)?
Ewan asserted that this is related to the first question and thus the answer is the same. Good study design should show similar results. In order to do that, the experimental design should be clear enough. He remarked that in general, “... we usually have issues as software engineers to compare results of similar studies as we do not take other factors into consideration ...” Clemente asserted that there are too many reasons, such as context, benchmark, no repository, poor methodology. He added that the question should be the opposite: how to show similar results? Yasutaka indicated that contradictory results are due to the studies depending on the context. If the context is different, then the results will look different.

Clemente added that there is a problem with students who are poorly prepared to conduct empirical studies, as well as to correctly produce and perform replication studies. This statement was confirmed by Ewan: students should learn more about research design/methods in order to be able to produce good empirical studies.

VIII. WORKSHOP SUMMARY

We summarise the main conclusions from the workshop in terms of identified challenges and subsequent strategies.

A. Challenges:

• We need a better classification of TD and standardised terminology to avoid confusion and quantify TD more accurately.
• Study replication is quite important in any empirical software engineering study, including TD. We should be able to replicate TD studies in order to be able to correctly compare results.
• More high quality empirical studies and evidence are needed, making replications possible and establishing an empirical basis and data science for TD.
• We need to better understand the interplay between model TD and implementation TD from methodological and instrumentation perspectives.
• Effective tooling needs to be developed to assist industry with assessing TD.

B. Strategy/Approach:

• Standardization efforts with members cross-cutting different TD domains are needed, via coordinated action events and/or a standardization task force. This should be initiated alongside cooperation with industrial practitioners.
• Incentives, frameworks and infrastructure need to be developed for facilitating the proliferation of Open Data, alongside support for describing the Methods used to Collect and Analyse the data. This can support and facilitate replication culture in SE research.
• Better preparation and culture for empirical replication must be supported by network actions and alongside industry-focused conferences (e.g., the XP conference).
• Better quantification (and tool support development) of TD can potentially be attained by:
– Finding, curating, and providing accessibility to experimental artefacts
– Reducing confounding factors by explicitly describing them and, when possible, controlling for them in empirical studies

IX. ACKNOWLEDGEMENTS

Our sincere gratitude goes to the PC members who helped in peer-reviewing the workshop submissions.

• Francesca Arcelli Fontana, University of Milano Bicocca
• Paris Avgeriou, University of Groningen
• Andrea Capiluppi, Brunel University
• Alexander Chatzigeorgiou, University of Macedonia
• Eleni Constantinou, University of Mons
• Steve Counsell, Brunel University
• Davide Falessi, California Polytechnic State University
• Yann-Gaël Guéhéneuc, École Polytechnique de Montréal
• Marouane Kessentini, University of Michigan-Dearborn
• Foutse Khomh, École Polytechnique de Montréal
• Ipek Ozkaya, Software Engineering Institute - Carnegie Mellon University
• Fabio Palomba, University of Salerno
• Gregorio Robles, Universidad Rey Juan Carlos
• Diomidis Spinellis, Athens University of Economics & Business
• Nikolaos Tsantalis, Concordia University
• Mel Ó Cinnide, University College Dublin

REFERENCES

[1] N. Brown et al. “Managing technical debt in software-reliant systems”. In: FSE/SDP workshop on Future of Software Engineering Research (2010), p. 47.
[2] Z. Li, P. Avgeriou, and P. Liang. “A systematic mapping study on technical debt and its management”. In: J. Systems and Software 101.11 (2015), pp. 193–220.
[3] N. Zazworka, C. Seaman, and F. Shull. “Prioritizing design debt investment opportunities”. In: 2nd Workshop on Managing Technical Debt. New York, New York, USA: ACM Press, 2011, p. 39.
[4] C. Seaman and Y. Guo. “Measuring and monitoring technical debt”. In: Advances in Computers 82.25-46 (2011), p. 44.
[5] P. Avgeriou et al. “Managing Technical Debt in Software Engineering (Dagstuhl Seminar 16162)”. In: Dagstuhl Reports 6.4 (2016), pp. 110–138.
[6] N. Fenton and J. Bieman. Software Metrics: A Rigorous and Practical Approach. CRC Press, Nov. 2014.