<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Technical Debt Measurement: An Exploratory Literature Review</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Donatien Koulla Moulla</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ernest Mnkandla</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hayatou Oumarou</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Thomas Fehlmann</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Euro Project Office</institution>
          ,
          <addr-line>Giblenstrasse 50, 8049 Zürich</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Maroua</institution>
          ,
          <addr-line>Maroua, P.O. Box 46</addr-line>
          ,
          <country country="CM">Cameroon</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of South Africa</institution>
          ,
          <addr-line>The Science Campus, Florida, 1710</addr-line>
          ,
          <country country="ZA">South Africa</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Measuring Technical Debt is important in guiding software development teams to make informed decisions and prioritize refactoring initiatives. This study presents an exploratory literature review of studies published between 2010 and 2023 to investigate the current state of Technical Debt measurement research. Through a set of four research questions, this study identifies the prevalent methodologies, metrics, and obstacles entailed in quantifying Technical Debt. Specifically, this study focuses on what is proposed to be measured through Technical Debt, the measurement solutions proposed for measuring Technical Debt, and how these approaches categorize and evaluate various aspects of Technical Debt. By scrutinizing the diverse approaches and challenges, this exploratory literature review identifies gaps (and related issues) in Technical Debt measurement research and contributes to a nuanced understanding of Technical Debt measurement practices, offering insights into enhancing software sustainability and maintainability.</p>
      </abstract>
      <kwd-group>
        <kwd>Technical debt measurement</kwd>
        <kwd>Technical debt quantification</kwd>
        <kwd>Technical debt identification</kwd>
        <kwd>Defect density</kwd>
        <kwd>Maintainability</kwd>
        <kwd>Exploratory Literature Review</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction and Background</title>
      <p>
        The Technical Debt (TD) metaphor was first introduced by Ward Cunningham in 1992 [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] in the
following way: “Shipping first-time code is like going into debt. A little debt speeds development so
long as it is paid back promptly with a rewrite. Objects make the cost of this transaction tolerable. The
danger occurs when the debt is not repaid. Every minute spent on not-quite-right code counts as interest
on that debt. Entire engineering organizations can be brought to a stand-still under the debt load of an
unconsolidated implementation, object-oriented or otherwise”.
      </p>
      <p>
        Since then, the definition and understanding of TD has evolved [
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2, 3, 4</xref>
        ]. Technical debt has recently
gained traction in both industry and the research community. As software systems evolve and grow,
managing TD becomes important to ensure long-term sustainability and maintainability of the
codebase. However, for TD to be managed, it must be identified and measured [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        Measuring TD is important for software organizations to understand the extent of the problem and
prioritize areas that require immediate attention. By quantifying TD, organizations can make informed
decisions about when and where to invest resources to address the accumulated debt [
        <xref ref-type="bibr" rid="ref6 ref7 ref8">6, 7, 8</xref>
        ]. Some
studies have highlighted the significant impact of TD on software quality, productivity, and overall
project success [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ].
      </p>
      <p>
        Researchers have explored various approaches to quantify TD. A widely adopted approach
involves leveraging code analysis tools that evaluate the quality of the codebase by employing
predefined metrics such as code complexity, code duplication, and coding style violations [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ]. An
alternative approach relies on subjective assessments provided by experienced developers, who can
provide insights into TD based on their understanding of the codebase and development process [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>
        TD measurement commonly combines several approaches, each focusing on a different
aspect of TD quantification. These approaches include identifying smells, quantifying the
Return on Investment (ROI) of refactoring, comparing the ideal state with the current state of software
quality, and evaluating alternative development paths to reduce technical debt [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. However, the lack
of consistency among existing tools at the approach and ruleset levels has made it challenging to
compare and evaluate these different measurement approaches effectively [
        <xref ref-type="bibr" rid="ref15 ref16">15, 16</xref>
        ]. To address this
issue, a conceptual model called the Technical Debt Quantification Model (TDQM) was developed,
which captures key concepts related to technical debt quantification and allows for comparisons and
evaluations between different approaches [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
      <p>
        Except for the literature review of TD in requirements [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], there is a lack of comprehensive and
up-to-date reviews synthesizing state-of-the-art techniques and approaches for measuring technical
debt. This exploratory literature review examines the current state of research on technical debt
measurement. By understanding the extent and impact of technical debt, organizations can make
informed decisions about when and where to invest resources to address it.
      </p>
      <p>The remainder of this paper is organized as follows. Section 2 presents the systematic literature
review method used in this study. Section 3 presents and discusses the results. Section 4 discusses
threats to validity. Section 5 concludes the paper with a summary of key findings and directions for
future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Review Method</title>
      <p>
        This section describes the review method used for this exploratory literature
review, which was conducted as a Systematic Literature Review (SLR), a rigorous and transparent method for
comprehensively analyzing existing research on a specific topic [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. This study provides a
comprehensive overview of existing research on technical debt measurement and identifies research
gaps in existing studies, which are used to highlight future research directions. To ensure
methodological rigor and transparency, this study followed the SLR guidelines proposed in [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] and
adhered to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)
framework [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>2.1. Research questions</title>
      <p>To delve deeper into technical debt measurement, this exploratory literature review addresses the
research questions (RQs) presented in Table 1, along with their rationales.</p>
    </sec>
    <sec id="sec-4">
      <title>2.2. Search strategy</title>
      <p>To identify literature relevant to this exploratory review, we conducted a comprehensive search
across five reputable digital libraries: Scopus, ScienceDirect, ACM Digital Library, IEEE Xplore, and
SpringerLink. The search encompassed peer-reviewed publications published between January 2010
and December 2023. The choice of 2010 as the starting point for the exploratory literature review on
technical debt measurement, instead of 1992 when the technical debt metaphor was introduced, is
primarily due to the significant increase in research activity around technical debt in the last decade.
This study focuses on this period to capture the most recent and relevant developments in the field.
Additionally, the selection of this time frame helps ensure that the review includes comprehensive and
up-to-date techniques and approaches, reflecting the current state of technical debt measurement
practices. These specific databases were chosen because of their extensive coverage of computer
science and engineering research, ensuring a high likelihood of capturing pertinent studies on technical
debt measurement.</p>
      <p>The search string was based on the key terms identified in the RQs as well as the commonly used
terminology associated with TD and measurement. First, each main search term was combined with its
related keywords using the Boolean operator “OR”. Subsequently, the resulting groups of
terms were linked with one another using “AND” to ensure a focused and relevant set of results. Table
2 lists the complete search strings used in this study.</p>
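      <p>As a minimal sketch of the construction described above, the following Python snippet joins the keywords of each group with OR and then links the groups with AND. The term groups shown are hypothetical placeholders chosen for illustration; they are not the actual search strings, which are listed in Table 2, and database-specific field syntax is not modeled.</p>
      <preformatted>
```python
def build_search_string(term_groups):
    """Join keywords with OR inside each group, then join the groups with AND."""
    or_clauses = []
    for group in term_groups:
        # Quote each keyword so that multi-word terms are searched as phrases.
        quoted = ['"{}"'.format(keyword) for keyword in group]
        or_clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(or_clauses)


# Hypothetical term groups, for illustration only (see Table 2 for the real strings).
EXAMPLE_GROUPS = [
    ["technical debt", "design debt"],
    ["measurement", "quantification", "metric"],
]

if __name__ == "__main__":
    print(build_search_string(EXAMPLE_GROUPS))
```
      </preformatted>
      <p>Keeping the groups as plain lists makes it straightforward to regenerate the string when a keyword is added, before adapting it to the syntax of each digital library.</p>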
      <p>Because search syntax varies across databases, we adapted our search strings to each
database (Scopus, ScienceDirect, ACM Digital Library, IEEE Xplore, and SpringerLink). The search
focused on titles, abstracts, and keywords to ensure relevant studies were captured. We conducted
separate searches on each database and then consolidated the identified papers. Subsequently, we
used EndNote reference management software to identify and remove duplicate studies, yielding a
deduplicated set of studies for further analysis.</p>
    </sec>
    <sec id="sec-5">
      <title>2.3. Study selection</title>
      <p>The selection aimed to identify relevant primary studies that addressed the measurement of technical
debt using the following inclusion criteria (IC) and exclusion criteria (EC):</p>
      <p>Inclusion criteria:
• Studies written in English and published between January 2010 and December 2023.
• Studies published in peer-reviewed journals, conference proceedings, workshop
proceedings, and book chapters.
• Studies where full texts are available.
• Studies that propose, evaluate, or discuss techniques, metrics, or approaches for measuring
technical debt.</p>
      <p>Exclusion criteria:
• Duplicate publications of the same study.
• Studies where full texts are not available.
• Studies whose titles and abstracts focus on non-technical debt aspects of software
development.
• Studies on TD but not on TD measurement.
• Studies on TD without a clear focus on measurement aspects.
• Studies that focus solely on identifying TD (without measurement) or that lack details on
the measurement approach.</p>
      <p>A three-stage process was followed to select the studies for this review.</p>
      <p>In the first stage, 808 primary studies were identified from the five digital libraries. Seventy (70)
studies were discarded as duplicate publications of the same study.</p>
      <p>In the second stage, an initial screening of the search results (738 studies) was performed based on
titles and abstracts. Studies whose full texts were not available were excluded (228 studies). Additionally,
480 studies that did not explicitly mention technical debt measurement or quantification techniques,
metrics, or approaches were excluded. After this stage, 30 studies remained for full-text review.</p>
      <p>In the third stage, a full-text review of the 30 remaining studies was conducted.
During this phase, we applied the defined inclusion and exclusion criteria to ensure the relevance and
quality of the selected studies. The following exclusion criteria were applied at this stage:
• Studies on TD but not on TD measurement.
• Studies on TD without a clear focus on measurement aspects.
• Studies that focus solely on identifying TD (without measurement) or that lack details on
the measurement approach.</p>
      <p>To ensure the reliability of the study selection process, two researchers independently performed
screening and full-text reviews. Any disagreements or conflicts were resolved by discussion and
consensus. After this phase, the 21 remaining studies were included in the review. The study selection
process, with the total number of studies retrieved and included in each phase, is shown in Figure 1.</p>
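      <p>The counts reported for the three stages can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text and in Figure 1; the function and variable names are our own and purely illustrative.</p>
      <preformatted>
```python
# Records identified per digital library, as reported in Figure 1.
RECORDS_PER_DATABASE = {
    "ACM Digital Library": 292,
    "IEEE Xplore": 161,
    "ScienceDirect": 72,
    "Scopus": 182,
    "SpringerLink": 101,
}

# Exclusion steps in the order they were applied, with the reported counts.
EXCLUSIONS = [
    ("duplicate records removed", 70),
    ("full text not available", 228),
    ("no explicit TD measurement focus in title or abstract", 480),
    ("excluded at full-text review", 9),
]


def selection_trail(identified, exclusions):
    """Return (reason, studies remaining) after each exclusion step, in order."""
    remaining = identified
    trail = []
    for reason, count in exclusions:
        if count > remaining:
            raise ValueError("cannot exclude more studies than remain: " + reason)
        remaining -= count
        trail.append((reason, remaining))
    return trail


if __name__ == "__main__":
    identified = sum(RECORDS_PER_DATABASE.values())
    print("records identified:", identified)
    for reason, remaining in selection_trail(identified, EXCLUSIONS):
        print("{}: {} remain".format(reason, remaining))
```
      </preformatted>
      <p>Running the sketch reproduces the reported sequence of 808, 738, 510, 30, and 21 studies.</p>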
      <p>[Figure 1: Flow of the study selection process. Records identified from databases (n = 808): ACM (n = 292), IEEE Xplore (n = 161), ScienceDirect (n = 72), Scopus (n = 182), SpringerLink (n = 101). Records removed before screening: duplicate records (n = 70). Records screened (n = 738); records excluded because the full text was not available (n = 228). Reports assessed for eligibility (n = 510); reports excluded because the title and abstract focused on non-technical debt aspects of software development (n = 480). Full-text studies reviewed (n = 30); reports excluded (n = 9) because they addressed TD but not TD measurement, addressed TD without a clear focus on measurement aspects, or focused solely on identifying TD (without measurement) or lacked details on the measurement approach. Studies included in review (n = 21).]</p>
      <sec id="sec-5-10">
        <title>2.4. Quality assessment</title>
        <p>
          Quality assessment was used to assess the relevance and credibility of the selected studies. The
papers were selected from five well-known databases of papers reviewed by experts before their
publication. The quality of the included primary studies was assessed using a customized quality
checklist adapted from Kitchenham and Charters [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] that was the most appropriate for our research
questions.
        </p>
        <p>The checklist consists of the following four quality criteria:
• QA1: Are the research aims clearly stated?
• QA2: Is the technical debt measurement approach clearly described?
• QA3: Are the findings clearly stated?
• QA4: Are the limitations of the study discussed?
Only studies that satisfied at least three of the above criteria were selected. All 21 studies
met this threshold.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>2.5. Data extraction and Synthesis</title>
      <p>A data extraction form was developed to retrieve relevant information from the included primary
studies, addressing the RQs. It should be noted that not all the selected studies addressed all four RQs.
Data synthesis aimed to collate and summarize the results of the included primary studies. We identified
and grouped all relevant data to answer the RQs using a descriptive synthesis.</p>
    </sec>
    <sec id="sec-7">
      <title>3. Results and Discussion</title>
      <p>This section provides answers to the SLR research questions based on a synthesis of the selected
studies.</p>
    </sec>
    <sec id="sec-8">
      <title>3.1. RQ1: Why, how, and what should be measured about technical debt (TD)?</title>
      <p>RQ1 covers three questions: “why”, “how”, and “what”. Why measure TD: understanding
the reasons behind measuring technical debt helps in recognizing its importance and the benefits
it brings to software projects. How to measure TD: this question focuses on the
methodologies and approaches used to quantify and analyze TD. What should be measured about TD:
this question identifies the specific aspects and metrics that need to be quantified to understand
and manage technical debt effectively.</p>
      <p>The primary measurement goals identified in the selected studies are:
1. Sizing TD in terms of the effort required to reduce it to zero (S1, S3, S4, S5, S8, S19, S20, S21).
2. Assessing the impact of TD on software quality attributes such as functionality, performance,
maintainability, reliability, and security (S1, S2, S6, S9, S10, S17, S18).
3. Estimating the rework/refactoring efforts required to enhance evolvability and mitigate
accumulated TD (S3).
4. Evaluating the accuracy and usefulness of TD measurement tools (S11).
5. Comparing different TD identification techniques (S13).</p>
      <p>Several approaches have been proposed for measuring (quantifying) TD. With respect to what is
measured, the existing approaches quantify TD in the following ways:
• Eight studies (S1, S3, S4, S5, S8, S19, S20, S21) base their quantification on the identification
of code, design, architectural smells, or defects.
• Seven studies (S3, S4, S5, S8, S19, S20, S21) have attempted to quantify the return on
investment (ROI) of refactoring activities to remove technical debt.
• Seven studies (S1, S2, S6, S9, S10, S17, S18) compare an ideal state with the current state of
the software in terms of quality attributes, such as maintainability and modularity.
• Seven studies (S3, S4, S5, S8, S19, S20, S21) compare alternative development paths with the
aim of reducing rework and quantifying the impacts of taking on technical debt versus not taking
it on.</p>
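      <p>To make the first two items concrete, the following sketch sizes TD as the summed remediation effort of identified issues and computes a simple ROI for refactoring. The ROI formula (avoided future effort minus remediation effort, divided by remediation effort) is a generic illustration assumed here, not the calculation used by any of the selected studies, and the issue names and hour values are hypothetical.</p>
      <preformatted>
```python
def td_principal(remediation_hours):
    """Size TD as the total effort (in hours) needed to remediate every identified issue."""
    return sum(remediation_hours.values())


def refactoring_roi(principal_hours, interest_avoided_hours):
    """Return the net saving from refactoring relative to the effort invested (assumed formula)."""
    if not principal_hours > 0:
        raise ValueError("ROI is only defined for a positive remediation effort")
    return (interest_avoided_hours - principal_hours) / principal_hours


# Hypothetical issues and effort estimates in hours, for illustration only.
EXAMPLE_ISSUES = {"duplicated module": 16, "cyclic dependency": 24}

if __name__ == "__main__":
    principal = td_principal(EXAMPLE_ISSUES)
    # Assume, hypothetically, that refactoring avoids 60 hours of future rework.
    print(principal, refactoring_roi(principal, 60))
```
      </preformatted>
      <p>Under these assumptions, a positive ROI indicates that the avoided rework outweighs the remediation effort, and a value of zero marks the break-even point.</p>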
      <p>In summary, the key measurements relate to sizing TD in terms of the required remediation effort,
assessing the impact on software quality attributes, estimating rework efforts for mitigating TD, and
evaluating the effectiveness of TD measurement approaches and tools.</p>
    </sec>
    <sec>
      <title>3.2. RQ2: What are the existing measurement solutions for measuring technical debt (TD)?</title>
      <p>The selected studies proposed various measurement solutions, including:
• Code metrics (complexity, coupling, cohesion, size, duplication, etc.) (S1, S3, S6, S9, S10, S11,
S14, S16, S18).
• Modularity and design quality metrics (S3, S14).
• Defect proneness, change proneness, and maintenance effort metrics (S7, S9, S10, S15, S17).
• Static code analysis issues/violations (S2, S9, S19, and S20).
• Test coverage and quality metrics (S2, S9).
• Technical debt principal and interest calculations (S5, S7, S12, S15, S19, S20, and S21).
• Machine Learning models for TD identification and forecasting (S6, S9, and S10).</p>
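      <p>The list above can be captured in a small data structure to tally how many studies propose each kind of solution and how many kinds each study covers. This is an illustrative sketch only: the study identifiers are those listed above, while the shortened category labels and function names are our own.</p>
      <preformatted>
```python
from collections import Counter

# Measurement solutions and the studies proposing them, as listed above.
SOLUTIONS = {
    "Code metrics": ["S1", "S3", "S6", "S9", "S10", "S11", "S14", "S16", "S18"],
    "Modularity and design quality metrics": ["S3", "S14"],
    "Defect/change proneness and maintenance effort": ["S7", "S9", "S10", "S15", "S17"],
    "Static code analysis issues/violations": ["S2", "S9", "S19", "S20"],
    "Test coverage and quality metrics": ["S2", "S9"],
    "TD principal and interest calculations": ["S5", "S7", "S12", "S15", "S19", "S20", "S21"],
    "Machine learning models": ["S6", "S9", "S10"],
}


def studies_per_solution(solutions):
    """Count the distinct studies listed under each measurement solution."""
    return {name: len(set(studies)) for name, studies in solutions.items()}


def solutions_per_study(solutions):
    """Count how many solution categories each study appears in."""
    counts = Counter()
    for studies in solutions.values():
        counts.update(set(studies))
    return counts


if __name__ == "__main__":
    for name, count in studies_per_solution(SOLUTIONS).items():
        print("{}: {} studies".format(name, count))
    print(solutions_per_study(SOLUTIONS).most_common(3))
```
      </preformatted>
      <p>A tally of this kind makes it easy to see which solution categories dominate the selected studies and which studies span several categories.</p>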
      <p>Table 3 presents the measurement solutions/metrics proposed for measuring Technical Debt (TD)
across the selected studies.</p>
      <p>As shown in Table 3, the measurement solutions/metrics proposed across the selected studies
covered a wide range of aspects, including code metrics, design/architecture metrics, static analysis
metrics, quality metrics, effort/cost metrics, machine learning features, and other metrics related to
defect proneness, change proneness, and maintainability.</p>
    </sec>
    <sec id="sec-9">
      <title>3.3. RQ3: How do these approaches categorize and evaluate various aspects of TD?</title>
      <p>The selected studies categorize and evaluate various aspects of TD in the following ways:
• Based on the type of TD: code debt, design debt, architectural debt, documentation debt, test
debt, etc. (S3, S4, S10, S12, S18, S19, and S20).
• Based on architectural levels: path, activity, and application levels (S3).
• Based on quality characteristics/attributes affected: reliability, security, maintainability,
portability, etc. (S4, S13, S19, and S20).
• Based on severity/priority of issues (S1, S19, S20).
• Based on financial cost/effort estimations (S5, S19, S20, and S21).</p>
      <p>In summary, the approaches categorize and evaluate TD based on numerous factors, including the
types of TD (code, design, architectural, etc.), architectural levels affected, quality characteristics
impacted, severity or priority of issues, and financial implications or effort estimations. By considering
these different categories and evaluation perspectives, this study aims to provide an understanding of
TD, its manifestations, its impact on various aspects of software quality, and the potential costs and
efforts required for its remediation.</p>
    </sec>
    <sec id="sec-10">
      <title>3.4. RQ4: What are the gaps (issues) identified in TD measurement research?</title>
      <p>From the analysis of selected studies, gaps (and related issues) in TD measurement research were
identified by the researchers themselves:
• Lack of validation on real-world projects (S3).
• Measuring other types of debt beyond code debt (documentation debt, test debt, architectural
debt, etc.) (S4, S10).
• Integrating developers’ opinions with code characteristics to improve TD severity identification
(S1).
• Holistic approaches combining financial and technical factors (S14).
• Interpretable thresholds in metric-based approaches (S14).
• Quantifying interest costs, risks/liabilities, and opportunity costs of TD (S19 and S20).
• Generalizability and application to complex systems (S21).
• Need for effective tooling and methodologies for managing TD across the software development
lifecycle (S10).</p>
      <p>These gaps and issues highlight the need for further research and improvements in TD measurement,
including validation on real-world projects, comprehensive coverage of different TD types, integration
of developer perspectives, holistic approaches, interpretable thresholds, consideration of TD costs and
risks, generalizability to complex systems, effective tooling and methodologies, direct TD
quantification, and accounting for non-functional requirements.</p>
      <p>Addressing these gaps and issues can contribute to more accurate, practical, and comprehensive TD
measurement approaches, thereby enabling better management and decision-making processes related
to TD in software development projects.</p>
    </sec>
    <sec id="sec-11">
      <title>3.5. Discussion</title>
      <p>This study presents several key findings. Firstly, the study identified a variety of measurement goals
such as sizing TD, assessing its impact on software quality attributes, estimating rework efforts for TD
mitigation, and evaluating the effectiveness of TD measurement tools. The research highlights the
prevalent methodologies, including code metrics, design/architecture metrics, static analysis metrics,
quality metrics, effort/cost metrics, and machine learning features for identifying and forecasting TD.</p>
    </sec>
    <sec id="sec-12">
      <title>3.5.1. Implications for Researchers</title>
      <p>The findings indicate significant research gaps and areas for further exploration. Researchers are
encouraged to focus on:
1. Validation in Real-World Projects: There is a need for more empirical validation of TD
measurement tools and methodologies in real-world software development projects to enhance
their practical applicability.</p>
      <p>2. Comprehensive Coverage of TD Types: Future research should expand beyond code debt to
include other types such as documentation debt, test debt, and architectural debt. This holistic approach
will provide a more accurate picture of TD and its implications.</p>
      <p>3. Integration of Developer Perspectives: Incorporating insights from developers regarding the
severity and impact of TD can improve the accuracy and relevance of measurement tools.</p>
      <p>4. Holistic Approaches Combining Financial and Technical Factors: Developing measurement
approaches that consider both financial and technical aspects of TD can offer more comprehensive
management strategies.</p>
      <p>5. Interpretable Thresholds in Metric-Based Approaches: Establishing clear and interpretable
thresholds for various TD metrics will facilitate better decision-making processes for software teams.</p>
    </sec>
    <sec id="sec-13">
      <title>3.5.2. Implications for Practitioners</title>
      <p>Practitioners can benefit from the study’s insights by:
1. Adopting Diverse Measurement Solutions: Utilizing a combination of code metrics,
design/architecture metrics, static analysis metrics, and machine learning models can provide a
multifaceted understanding of TD, aiding in more effective management and reduction
strategies.</p>
      <p>2. Focusing on High-Impact Areas: By identifying and prioritizing areas with high TD,
practitioners can allocate resources more efficiently, addressing the most critical issues that affect
software quality and maintainability.</p>
      <p>3. Continuous Monitoring and Refinement: Implementing continuous TD measurement and
monitoring processes will help in early detection and mitigation of TD, thereby reducing long-term
costs and improving software sustainability.</p>
    </sec>
    <sec id="sec-14">
      <title>3.5.3. High-Level Concepts and Lessons Learned</title>
      <p>From the synthesis of over a decade of research, several high-level concepts and lessons emerge:
1. The Complexity of TD Measurement: The measurement of TD is inherently complex, involving
multiple dimensions such as code quality, design, architecture, and financial implications. This
complexity necessitates sophisticated and integrated measurement approaches.</p>
      <p>2. The Importance of Contextual Factors: The impact of TD varies significantly depending on the
context of the software project, including factors like project size, complexity, and team expertise.
Tailoring TD measurement approaches to specific project contexts can enhance their effectiveness.</p>
      <p>3. Need for Standardization and Tool Integration: There is a pressing need for standardization in
TD measurement approaches and better integration of tools to facilitate more consistent and reliable
measurements across different projects and organizations.</p>
      <p>4. Continuous Evolution of Measurement Techniques: As software development practices evolve,
so too must the techniques and tools for measuring TD. Keeping abreast of emerging trends and
incorporating new methodologies will be crucial for maintaining effective TD management practices.</p>
      <p>In summary, addressing these insights and integrating them into both research and practice can
significantly enhance the management of TD, contributing to the long-term sustainability and quality
of software systems.</p>
    </sec>
    <sec id="sec-15">
      <title>4. Threats to Validity</title>
      <p>This exploratory literature review aimed to provide a comprehensive overview of the state of the art in
technical debt measurement research. However, there are potential threats to validity that should be
acknowledged:
• While the search string was carefully constructed to capture relevant studies, it is possible that
some relevant publications may have been missed because of the use of different terminologies
or the presence of relevant studies in sources not included in the selected digital libraries. This
review focused exclusively on studies published between January 2010 and December 2023.
Consequently, relevant earlier works or very recent publications may have been unintentionally
excluded.
• Although the study selection process followed well-defined inclusion and exclusion criteria and
was conducted independently by two researchers, there is an inherent risk of bias in the
selection and interpretation of studies.
• The quality assessment of the included studies was based on a customized checklist adapted
from established guidelines. However, the assessment process may have introduced bias
because of the subjective interpretation of quality criteria.
• The included studies exhibited substantial heterogeneity in terms of research methodologies,
measurement approaches, and evaluation contexts. This diversity may introduce challenges in
synthesizing and comparing findings across studies.</p>
      <p>Despite these potential threats, we used rigorous and systematic methods to conduct the literature
review, including following established guidelines and involving multiple researchers in the study selection
and data extraction processes. Additionally, we have transparently acknowledged the limitations and
potential threats to validity, which can inform the interpretation and applicability of the findings.</p>
    </sec>
    <sec id="sec-16">
      <title>5. Conclusion and Future Work</title>
    </sec>
    <sec id="sec-17">
      <title>5.1. Summary of findings</title>
      <p>
        The quantification and measurement of TD are important for software development teams and
organizations. By understanding the extent and impact of technical debt, organizations can make
informed decisions about when and where to invest resources to address it. A number of technical debt
issues have been investigated by researchers over the years, but relatively few have focused on technical
debt measurements/metrics. This SLR of studies proposing TD measurements, covering 2010 to 2023, followed the
guidelines proposed by Kitchenham and Charters [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] and adhered to the PRISMA framework [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. The 21
studies included in the review were selected from the Scopus, ScienceDirect, ACM Digital Library,
IEEE Xplore, and SpringerLink digital libraries using the specified
inclusion and exclusion criteria, and were then analyzed to answer the research questions.
      </p>
      <p>The key findings are as follows:
• RQ1: Why, how, and what should be measured about technical debt (TD)?
The primary measurement goals identified were:
1) sizing TD in terms of required remediation effort,
2) assessing the impact of TD on software quality attributes,
3) estimating rework efforts for TD mitigation,
4) evaluating the effectiveness of TD measurement approaches and tools.</p>
      <p>• RQ2: What are the existing measurement solutions for measuring TD?
A wide range of measurement solutions has been proposed, including code metrics (complexity,
coupling, cohesion, etc.), design and architecture metrics, static analysis metrics, quality metrics,
effort/cost metrics, machine learning features, and metrics related to defect proneness, change
proneness, and maintainability.</p>
      <p>• RQ3: How do these approaches categorize and evaluate various aspects of TD?
The approaches categorized and evaluated TD based on factors such as the types of TD (code,
design, architectural, etc.), architectural levels affected, quality characteristics impacted, severity or
priority of issues, and financial implications or effort estimations.</p>
      <p>• RQ4: What are the gaps (issues) identified in TD measurement research?</p>
      <p>The key gaps identified included the lack of real-world validation, limited coverage of non-code
debt types, need for holistic approaches integrating technical and financial factors, lack of interpretable
thresholds, quantification of TD costs and risks, generalizability to complex systems, and the need for
effective tooling and methodologies.</p>
      <p>In summary, this review identified a diversity of measurement solutions and categorization
approaches for technical debt while also highlighting significant gaps and areas for further research and
improvement in this field. The findings provide a nuanced understanding of the current state of technical
debt measurement research and offer insights into enhancing software sustainability and maintainability
through effective technical debt management practices.
      </p>
    </sec>
    <sec id="sec-18">
      <title>5.2. Future Work</title>
      <p>The findings from this exploratory literature review highlight promising directions for future
research on technical debt measurement. There is a need for more comprehensive measurement
approaches that can effectively quantify and consolidate distinct types of technical debt, such as design
debt, architectural debt, documentation debt, and test debt. Further research could also explore the
identification and measurement of technical debt arising from software functional requirements that
have not yet been implemented, as well as from system non-functional requirements that have not been
implemented and that could be realized as software functions distributed across a software
environment.</p>
    </sec>
    <sec id="sec-19">
      <title>Acknowledgment</title>
      <p>We are grateful to the anonymous reviewers.</p>
    </sec>
    <sec id="sec-20">
      <title>Appendix: Selected primary studies</title>
      <p>This appendix contains the supporting documentation for our article. Table 1 lists the 21 primary
studies selected for the SLR.</p>
      <p>Authors: Judith Perera; Lerina Aversano; Elvira-Maria Arvanitou et al.; Ana Melo et al.; Dimitrios Tsoukalas et al.; Dimitrios Tsoukalas et al.; Jason Lefever et al.; Paris Avgeriou et al.; Peter S. et al.; Makrina Viola Kosti et al.; Davide Falessi and Andreas Reichel; Clauirton A. Siebra et al.; Nico Zazworka et al.; Francesca Arcelli Fontana et al.; Bill Curtis et al.; Bill Curtis et al.; Ariadi Nugroho et al.</p>
      <p>Titles: Technical Debt: An Empirical Study; Modelling the Quantification of Technical Debt; On the Lack of Consensus Among Technical Debt Detection Tools; An Overview and Comparison of Technical Debt Measurement Tools; Comparing Maintainability Index, SIG Method, and SQALE for Technical Debt Identification; Technical Debt Principal Assessment through Structural Metrics; Towards an Open-Source Tool for Measuring and Visualizing the Interest of Technical Debt; Applying Metrics to Identify and Monitor Technical Debt Items during Software Evolution; Comparing four approaches for technical debt identification; Investigating the Impact of Code Smells Debt on Quality Code Evaluation; Estimating the Size, Cost, and Types of Technical Debt; Estimating the Principal of an Application’s Technical Debt; An Empirical Model of Technical Debt and Interest.</p>
      <p>Venues: Frontiers of Computer Science; Euromicro Conference on Software Engineering and Advanced Applications (SEAA); Journal of Systems and Software; SN Computer Science; IEEE Transactions on Software Engineering; IEEE/ACM 43rd International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP); Annual ACM Symposium on Applied Computing (SAC '20); International Workshop on Managing Technical Debt (MTD); Software Quality Journal; International Workshop on Managing Technical Debt (MTD); IEEE Software; Proceedings of the 2nd Workshop on Managing Technical Debt (MTD '11).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>W.</given-names>
            <surname>Cunningham</surname>
          </string-name>
          ,
          <article-title>The WyCash Portfolio Management System</article-title>
          ,
          <source>in: Proceedings of the 7th International Conference on Object-Oriented Programming, Systems, Languages, and Applications</source>
          , OOPSLA, Association for Computing Machinery, New York, NY, USA,
          <year>1992</year>
          , pp.
          <fpage>29</fpage>
          -
          <lpage>30</lpage>
          . doi: 10.1145/157709.157715.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>I.</given-names>
            <surname>Gat</surname>
          </string-name>
          , (Ed.), Special Issue: Technical Debt, Cutter IT J., vol.
          <volume>23</volume>
          , no.
          <issue>10</issue>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Fowler</surname>
          </string-name>
          , Technical Debt, blog,
          <year>2019</year>
          . URL: http://martinfowler.com/bliki/TechnicalDebt.html.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Kruchten</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. L.</given-names>
            <surname>Nord</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Ozkaya</surname>
          </string-name>
          ,
          <article-title>Technical debt: From metaphor to theory and practice</article-title>
          ,
          <source>IEEE Software</source>
          ,
          <volume>29</volume>
          (
          <issue>6</issue>
          ) (
          <year>2012</year>
          )
          <fpage>18</fpage>
          -
          <lpage>21</lpage>
          . doi: 10.1109/MS.2012.167.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>N.</given-names>
            <surname>Rios</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.G.d. Mendonça</given-names>
            <surname>Neto</surname>
          </string-name>
          , and
          <string-name>
            <given-names>O.R.</given-names>
            <surname>Spínola</surname>
          </string-name>
          ,
          <article-title>A tertiary study on technical debt: Types, management strategies, research trends, and base information for practitioners</article-title>
          ,
          <source>Information and Software Technology</source>
          ,
          <volume>102</volume>
          (
          <year>2018</year>
          )
          <fpage>117</fpage>
          -
          <lpage>145</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Guo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Seaman</surname>
          </string-name>
          ,
          <article-title>A portfolio approach to technical debt management</article-title>
          ,
          <source>in: Proceedings of the 2nd Workshop on Managing Technical Debt</source>
          , Association for Computing Machinery, New York, NY, USA,
          <year>2011</year>
          , pp.
          <fpage>31</fpage>
          -
          <lpage>34</lpage>
          . doi: 10.1145/1985362.1985370.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>V.</given-names>
            <surname>Lenarduzzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Besker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Taibi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Martini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. A.</given-names>
            <surname>Fontana</surname>
          </string-name>
          ,
          <article-title>A systematic literature review on Technical Debt prioritization: Strategies, processes, factors, and tools</article-title>
          ,
          <source>Journal of Systems and Software</source>
          ,
          <volume>171</volume>
          (
          <year>2021</year>
          )
          <fpage>1</fpage>
          -
          <lpage>16</lpage>
          . doi: 10.1016/j.jss.2020.110827.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>C.</given-names>
            <surname>Seaman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <article-title>Chapter 2 - Measuring and Monitoring Technical Debt</article-title>
          , in:
          <string-name>
            <given-names>M. V.</given-names>
            <surname>Zelkowitz</surname>
          </string-name>
          (Ed.),
          <source>Advances in Computers</source>
          , Elsevier,
          <volume>82</volume>
          (
          <year>2011</year>
          ), pp.
          <fpage>25</fpage>
          -
          <lpage>46</lpage>
          . doi: 10.1016/B978-0-12-385512-1.00002-5.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>G.</given-names>
            <surname>Freitas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Bernardo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sizílio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Da Costa</surname>
          </string-name>
          , and
          <string-name>
            <given-names>U.</given-names>
            <surname>Kulesza</surname>
          </string-name>
          ,
          <article-title>Analyzing the Impact of CI Sub-practices on Continuous Code Quality in Open-Source Projects: An Empirical Study</article-title>
          ,
          <source>in: Proceedings of the 37th Brazilian Symposium on Software Engineering (SBES '23)</source>
          , Campo Grande, Brazil,
          <year>2023</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          . doi: 10.1145/3613372.3613403.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>F. A.</given-names>
            <surname>Fontana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Roveda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vittori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Metelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Saldarini</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Mazzei</surname>
          </string-name>
          ,
          <article-title>On evaluating the impact of the refactoring of architectural problems on software quality</article-title>
          ,
          <source>in: Proceedings of the Scientific Workshop Proceedings of XP2016 (XP '16 Workshops)</source>
          , Edinburgh, Scotland, United Kingdom,
          <year>2016</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          . doi: 10.1145/2962695.2962716.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>N.</given-names>
            <surname>Zazworka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vetrò</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Izurieta</surname>
          </string-name>
          , et al.,
          <article-title>Comparing four approaches for technical debt identification</article-title>
          ,
          <source>Software Quality Journal</source>
          ,
          <volume>22</volume>
          (
          <issue>3</issue>
          ) (
          <year>2014</year>
          )
          <fpage>403</fpage>
          -
          <lpage>426</lpage>
          . doi: 10.1007/s11219-013-9200-8.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Melo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagundes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Lenarduzzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. B.</given-names>
            <surname>Santos</surname>
          </string-name>
          ,
          <article-title>Identification and measurement of Requirements Technical Debt in software development: A systematic literature review</article-title>
          ,
          <source>Journal of Systems and Software</source>
          ,
          <volume>194</volume>
          (
          <year>2022</year>
          )
          <fpage>1</fpage>
          -
          <lpage>21</lpage>
          . doi: 10.1016/j.jss.2022.111483.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>B.</given-names>
            <surname>Pérez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Castellanos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Correal</surname>
          </string-name>
          , et al.,
          <article-title>Technical debt payment and prevention through the lenses of software architects</article-title>
          ,
          <source>Information and Software Technology</source>
          ,
          <volume>140</volume>
          (
          <year>2021</year>
          )
          <fpage>1</fpage>
          -
          <lpage>16</lpage>
          . doi: 10.1016/j.infsof.2021.106692.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>U.</given-names>
            <surname>Vora</surname>
          </string-name>
          ,
          <article-title>Measuring the Technical Debt</article-title>
          ,
          <source>in: Proceedings of the 17th Annual System of Systems Engineering Conference (SOSE)</source>
          , Rochester, NY, USA,
          <year>2022</year>
          , pp.
          <fpage>185</fpage>
          -
          <lpage>189</lpage>
          . doi: 10.1109/SOSE55472.2022.9812632.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Mathioudaki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tsoukalas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Siavvas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Kehagias</surname>
          </string-name>
          ,
          <article-title>Comparing Univariate and Multivariate Time Series Models for Technical Debt Forecasting</article-title>
          ,
          <source>in: Proceedings of Computational Science and Its Applications - ICCSA 2022 Workshops</source>
          , Malaga, Spain,
          <year>2022</year>
          , pp.
          <fpage>62</fpage>
          -
          <lpage>78</lpage>
          . doi: 10.1007/978-3-031-10542-5_5.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M.</given-names>
            <surname>Mathioudaki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tsoukalas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Siavvas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Kehagias</surname>
          </string-name>
          ,
          <article-title>Technical Debt Forecasting Based on Deep Learning Techniques</article-title>
          ,
          <source>in: Proceedings of Computational Science and Its Applications - ICCSA 2021</source>
          , Cagliari, Italy,
          <year>2021</year>
          , pp.
          <fpage>306</fpage>
          -
          <lpage>322</lpage>
          . doi: 10.1007/978-3-030-87007-2_22.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>J.</given-names>
            <surname>Perera</surname>
          </string-name>
          ,
          <article-title>Modelling the Quantification of Technical Debt</article-title>
          ,
          <source>in: Companion Proceedings of the 2022 ACM SIGPLAN International Conference on Systems, Programming, Languages, and Applications: Software for Humanity (SPLASH Companion 2022)</source>
          , Auckland, New Zealand,
          <year>2022</year>
          , pp.
          <fpage>50</fpage>
          -
          <lpage>53</lpage>
          . doi: 10.1145/3563768.3565553.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>B. A.</given-names>
            <surname>Kitchenham</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Charters</surname>
          </string-name>
          ,
          <article-title>Guidelines for performing systematic literature reviews in software engineering</article-title>
          ,
          <source>Technical Report</source>
          , Keele University,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Page</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. E.</given-names>
            <surname>McKenzie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Bossuyt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Boutron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. C.</given-names>
            <surname>Hoffmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Mulrow</surname>
          </string-name>
          , et al.,
          <article-title>The PRISMA 2020 statement: an updated guideline for reporting systematic reviews</article-title>
          ,
          <source>BMJ</source>
          ,
          <year>2021</year>
          . doi: 10.1136/bmj.n71.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>