<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Mining Knowledge on Technical Debt Propagation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tomi 'bgt' Suovuo</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Johannes Holvitie</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jouni Smed</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ville Leppanen</string-name>
          <email>ville.leppaneng@utu.fi</email>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Turku Centre for Computer Science, Software Development Laboratory &amp; University of Turku, Department of Information Technology</institution>
          ,
          <addr-line>Turku</addr-line>
          ,
          <country country="FI">Finland</country>
        </aff>
      </contrib-group>
      <fpage>281</fpage>
      <lpage>295</lpage>
      <abstract>
        <p>ion of the channel patterns into rules is pursued so that development tools may automatically maintain technical debt information with them (the authors have introduced the DebtFlag tool for this). Hence, successfully implementing this study would allow further understanding and describing technical debt propagation at both the high level (longitudinal technical debt propagation effects for the project) and the low level (artifact level effects describing the mechanism of technical debt value accumulation).</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Technical debt is a software development concept that aims to expose
asset management characteristics for project trade-offs [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Working with scarce
resources to fulfill ever-changing requirements, software projects often need to
emphasize certain development-driving aspects over others, such as delivery
deadlines over thorough documentation. Further, invalid or lacking knowledge of
certain aspects of the development may lead to emphases that improperly
reflect the actual situation. In both cases, the informed and the uninformed decisions
result in trade-offs that accumulate technical debt [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>
        It has been argued [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] that a key factor for the adoption of technical debt
management into software development is the capability to produce and maintain
technical debt information within the project. That is, the project trade-offs must
be identified, their distribution and effects defined, and this information must
be maintained to reflect the true state of the software project. Failures in
information delivery result in unmanaged technical debt or in decisions being
made on outdated information, both of which, implicitly or explicitly,
affect the project.
      </p>
      <p>
        Technical debt research has been proficient in suggesting identification,
tracking, and governance solutions to overcome the technical debt information
production issues [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. The problem is that while solutions have been proposed and
trialed in various software contexts, no prior research has properly investigated
the whole software context space, that is, identified and classified where and
how technical debt exists and how it propagates within and between software
contexts. This higher-level structure may be described in some studies as the
concept of technical debt interest and its accumulation, but it has not been
explicitly examined, being less important to those studies' goals. Arguably,
however, in order to make technical debt management applicable, the various
solutions must function together, and the enabling factor for this is technical debt
propagation.
      </p>
      <p>Today, software projects that plug into social media services through
APIs (Application Programming Interfaces) are an exemplary field of software
context versatility. Updates to these APIs, invoked by their external authors,
indicate sources of technical debt accumulation and propagation in their clients'
software, which is often business critical. Mining Software Repositories (MSR) for the
clients that are subject to these updates enables studying the software context
space to address the gap in technical debt propagation knowledge.</p>
      <p>
        In the 1980s, software applications were relatively simple and they were
delivered as is. They were relatively bug free and needed no updates. Once an
application was released, any existing technical debt was outside the organization's
control. As software grew increasingly complex, especially with the emergence
of the Internet in the 1990s, bigger applications were released with more issues
remaining. Regularly released patches eventually became the norm, as they were
also easily distributed over the net. Technical debt
was feasible and also realized. Now, in the 2010s, we have complex applications
that utilize not only third-party libraries but also third-party services through
APIs. There are regular updates to the libraries and the APIs, as well as to the
client applications themselves. All of these are sources of technical debt. Further, as
previously shown [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], a single technical debt instance is rarely limited to a single
software development component but rather spans multiple ones (e.g., design,
implementation, and testing), making the emerging debt even more cumbersome
to track.
      </p>
      <p>Our intention is to understand the technical debt propagation context by
investigating the latest trends: the use of external APIs, especially those of social
media services. The paper is structured as follows. We begin in Section 2 by
reviewing the background. Section 3 builds on this and introduces our technical
debt propagation research objectives. In Section 4, we introduce our approach to
meeting the objectives together with initial results. The concluding remarks appear in
Section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>Background</title>
      <p>Here we introduce related work on technical debt, propagation in
the software context, and APIs. While defining core concepts for the article's
foundation, we also visit empirical work to further understand the state
of current research.</p>
      <sec id="sec-2-1">
        <title>Technical Debt and Its Propagation</title>
        <p>
          The term "technical debt" was initially coined by Ward Cunningham [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. In
his experience report, releasing code was paralleled to going into debt:
trade-offs are made in the software project to meet a deadline, and these trade-offs
can be considered debt that should be paid off when resources permit. Until
the debt is paid off, it will incur interest payments; that is, later work in the
project must accommodate the suboptimalities resulting from the trade-offs. This
description has remained applicable to this day. Later revisits to the definition
have mainly captured dimensions that further explain the role of the debt in the
project: McConnell [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] provides a definition for intentional and unintentional
technical debt, while Brown et al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] give a further description of the debt's
effects by reflecting on the financial domain and a discussion of the resolution
probability.
        </p>
        <p>
          Firstly, McConnell [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] provided a definition for the intentionality behind the
debt: intentional debt is a trade-off made whilst fully aware of its consequences,
an investment with an expected return. Unintentional debt, on the other hand,
is accumulated due to, for example, lack of knowledge. This type is a cause for
concern as it remains unmanaged until discovered. Secondly, Brown et al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] gave
a further description of the debt's effects by reflecting on the financial domain:
the earlier trade-offs accumulate interest payments that manifest as increased
future costs, and informed decisions should evaluate whether paying the interest is more
profitable than reducing the loan via refactoring. Differing from the financial
domain, the debt's interest here has a probability that captures whether the trade-off
will have visible effects on future development: debt within a software artifact
that will not be visited has a realization probability of zero.
        </p>
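        <p>The trade-off between paying interest and refactoring can be illustrated with a minimal sketch. It is not a model taken from Brown et al.; the record fields, the per-period view, and the assumption that realization is independent between periods are hypothetical simplifications introduced only for illustration.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebtItem:
    """Hypothetical record for one technical debt instance."""
    principal: float         # cost of removing the debt now (refactoring)
    interest: float          # extra cost per period if the debt is realized
    realization_prob: float  # assumed chance per period that the artifact is visited


def expected_interest(item: DebtItem, periods: int) -> float:
    """Expected total interest over a horizon, assuming independent periods."""
    return item.realization_prob * item.interest * periods


def should_refactor(item: DebtItem, periods: int) -> bool:
    """Suggest refactoring only when expected interest exceeds the principal."""
    return expected_interest(item, periods) > item.principal
```
]]></preformat>
        <p>Under these assumptions, an artifact that is never visited (realization probability of zero) has zero expected interest, so the sketch never suggests refactoring it, which matches the observation above.</p>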
        <p>
          Management of technical debt requires that we are capable of identifying and
tracking the trade-offs, the atomic instances, that form the debt for a project.
Without this information readily available, informed decisions regarding the debt's
governance cannot be made [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. The software context, however, makes the
identification and, especially, the tracking an arduous task: instances of technical
debt can span multiple development phases, and the most affected part is
the software implementation [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], which arguably grows exponentially more complex in
the future through various abstraction layers and techniques. Nevertheless, the
tracking should be able to follow a technical debt instance in this context.
        </p>
        <p>
          From the latest systematic mapping study on technical debt [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] we can see
that several solutions for tracking technical debt are available. However, we also
observe (see Figure 10 in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]) that there are areas in the software development
context that are not covered by any solution, whilst most of the solutions cover
sub-contexts focusing on predefined environments and specific parts of the
software life-cycle. Furthermore, from Kruchten et al. [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] and Izurieta et al. [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] we
can see that the causes of technical debt are various and that they can be described
using various characteristics. We consider all these findings indicative of the
multiformity of the context of technical debt in software projects. Thus, in addition
to searching for solutions in this context, technical debt research should pursue
a mapping of the full context space and an understanding of technical debt's value
in it.
        </p>
        <p>
          Lastly, we note that technical debt tracking is the process of indicating
technical debt propagation in the software context. To this end, we identify
only the work by McGregor et al. [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] as explicitly addressing this issue. Here,
considering mainly the software implementation, they note that (i) the technical debt of
a new software asset is affected by the technical debt in the assets it relies upon, (ii) the
number of abstraction layers may diminish the amount of technical debt that
propagates, and (iii) in another scenario, rather than being directly accumulated
from integrated assets, the technical debt has an indirect effect on the asset's
users, for example, by making adoption more difficult.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Software Change Analysis</title>
        <p>
          What is pursued herein is a better understanding of the context of technical debt
propagation in software. We argue that software change should be considered the
fundamental unit for this, something that Schmid [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] also considered core to
technical debt modelling during software evolution. Capturing software changes
and distinguishing between technical debt inclined changes and other changes (that is,
changes carrying information relatable to the technical debt properties described by
Brown et al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] and discussed in Section 2.1, and changes with no such
properties) would allow unrestricted observation of technical debt in the full software
context. Identifying software change retrospectively for projects corresponds to
Mining Software Repositories (MSR).
        </p>
        <p>
          Kagdi et al. [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] produce a taxonomy of MSR techniques, defining software
change as "the addition, deletion, or modification of any software artifact such
that it alters, or requires amendment of, the original assumptions of the subject
system." Here, a source code change is indicated as the fundamental unit of
software evolution, but as the causes [
          <xref ref-type="bibr" rid="ref10 ref7">10, 7</xref>
          ] and the manifestations [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] of
technical debt are not limited to the implementation, we adopt software change as the
fundamental unit.
        </p>
        <p>
          In this work, the mining efforts focus on large, open-source,
social-networking-enabled repositories in order to maximally cover the diversity of software change.
Tsay et al. [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] note that in GitHub the handling of pull requests is affected by social
factors: highly discussed requests have a lower acceptance rate, while
a submitter's relations to the accepting project, especially to its manager, increase
acceptance; this is also supported by [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. Kalliamvakou et al. [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] survey GitHub
as an MSR target. They conclude that the repository gives solid data on basic
project properties, such as programming language use, but synthesizing more
abstract conclusions requires careful assessment. The main cause for concern here
is GitHub's utilization as infrastructure for personal projects. This form of usage
deviates vastly from the others. To counter this bias, Kalliamvakou et al. [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] suggest
considering only projects with more than two authors and demonstrated activity
in both commits and pull requests.
        </p>
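        <p>The selection rule suggested above can be sketched as a simple filter. The record type and field names are hypothetical, and interpreting "demonstrated activity" as at least one commit and one pull request is our assumption, not a threshold given in the cited work.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class RepoSummary:
    """Hypothetical per-repository summary gathered before mining."""
    name: str
    author_count: int
    commit_count: int
    pull_request_count: int


def is_candidate(repo: RepoSummary) -> bool:
    """More than two authors, and activity in both commits and pull requests.

    Assumption: "activity" means a count above zero.
    """
    return (
        repo.author_count > 2
        and repo.commit_count > 0
        and repo.pull_request_count > 0
    )


def filter_candidates(repos: Iterable[RepoSummary]) -> List[RepoSummary]:
    """Keep only repositories that pass the selection rule."""
    return [repo for repo in repos if is_candidate(repo)]
```
]]></preformat>
        <p>A stricter study could raise the activity thresholds; the sketch only encodes the lower bound that the text states.</p>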
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Seeking Technical Debt Knowledge</title>
      <p>In the following, we address our ongoing technical debt propagation research on
two distinct levels: the inter-dependency effects at the software artifact level and
the longitudinal effects at the project level.</p>
      <sec id="sec-3-1">
        <title>Inter-Dependency Effects within Software Artifacts</title>
        <p>As discussed in Section 2, a multitude of solutions exist for both identifying
and tracking technical debt. However, most of the solutions are intended for
predefined software development contexts; for example, they limit their use to
a specific subset of implementation techniques and thereby, during continued
software development, to certain mechanisms of technical debt propagation.</p>
        <p>
          However, the ability to produce exhaustive technical debt information
requires that all possibilities for technical debt propagation are acknowledged. We
postulate, based on the properties of technical debt identified by Brown et al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]
and on the average coverage of single technical debt instances surveyed by Holvitie
et al. [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], that the propagation "stream" of technical debt is capable of leaving
the current host technique and merging into others. This points to several
sub-areas within technical debt research.
        </p>
        <p>The foremost research area for technical debt propagation in software artifacts
is (1) to show that technical debt propagates between software components that
can exist in external and independent projects and be implemented using different
technologies. The interest, and even the whole initial debt, can be created in an
external but linked project that is worked on by another team. The works referred to
here do not dispute this, and may even implicitly assume it, but
it is important to recognize this phenomenon explicitly and to conduct quantitative
research on it in order to demonstrate it indisputably.</p>
        <p>The second research area, partially reliant on the first, is (2) to accumulate
documentation that describes the possible ways in which technical debt can
propagate. Preferably, this would be a taxonomy capturing the unique
propagation channels for technical debt. Finally, in order to enable information delivery
for technical debt management purposes, (3) the channel descriptions must be
enriched with information regarding technical debt value accumulation for all
unique accounts of propagation. This would enable possibly automated
technical debt information maintenance, as the taxonomy would be capable of tracking and
valuating technical debt throughout the software project.</p>
        <p>One way to identify the propagation of technical debt is to make longitudinal
studies of increased debt in different phases of a project and connect them with
the root causes. Technical debt can be identified in matters such as discovered
vulnerabilities, updates, and feature discontinuation in systems related to the
project. Also, adding a new feature to a utilized external service API may cause
technical debt when the project customer wants the new feature implemented in
the project. We can identify different propagation paths by following how such
an event causes extra work in the chain of projects (COP) that are all linked
with each other.</p>
        <p>If an API is not interfaced directly but through a third-party library, the
customer may not be willing to wait until the library is updated with the
new feature. The project debt is then paid by implementing the
new feature quickly with an internal solution. This becomes a new kind of
debt, from the opposite end of the COP, when the referred library is finally
updated. Here, the internal solution becomes legacy and requires refactoring
into a solution that utilizes the library again, for example, in accordance with the
coding conventions followed by the programming team.</p>
        <p>There are cross waves moving back and forth in the COP from the root cause,
through the library, to the end of chain application. These can be tracked by
following the amount of increased work in each area.</p>
        <p>Figure 1 demonstrates a sample classification for COPs. Here, case 1
demonstrates a monolith project that has internally implemented services with no
outside dependencies. This is a classical, and probably the most studied,
scenario for technical debt management, where the debt is only internally caused,
felt, and managed. Cases 2 through 4 depict more modern scenarios, where the
projects depend on external service providers. In case 2, the project has a direct
dependency on the service and adapts explicitly and directly as invoked by the
service. A slightly dampened version, still fully managed by the project
organization, is presented in case 3, where the project, possibly alongside the
organization's other projects, uses an internally produced adapter to access the
service. Hence, the project itself does not directly feel changes in the external
service, but adaptation to them is still managed internally. Finally, in case 4
the project uses an external adapter to access the service. The external adapter
generally serves a broader range of projects and hence is not customized for the
needs of specific projects. On the other hand, external adapters tend to retain
compatibility as long as possible, which dampens the speed of change invoked by the
external service.</p>
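        <p>The four cases described above can be encoded as a small decision procedure. This is only an illustrative sketch: the enumeration names and the three boolean inputs are hypothetical and do not come from Figure 1 itself.</p>
        <preformat><![CDATA[
```python
from enum import Enum


class CopCase(Enum):
    """Hypothetical labels for the four chain-of-projects cases."""
    MONOLITH = 1          # no outside dependencies
    DIRECT = 2            # project calls the external service directly
    INTERNAL_ADAPTER = 3  # access through an adapter built in-house
    EXTERNAL_ADAPTER = 4  # access through a third-party adapter


def classify(uses_external_service: bool,
             via_adapter: bool = False,
             adapter_is_internal: bool = False) -> CopCase:
    """Map three yes/no observations about a project to a COP case."""
    if not uses_external_service:
        return CopCase.MONOLITH
    if not via_adapter:
        return CopCase.DIRECT
    if adapter_is_internal:
        return CopCase.INTERNAL_ADAPTER
    return CopCase.EXTERNAL_ADAPTER
```
]]></preformat>
        <p>Large hybrid projects may fit several cases at once, so a real classification would probably be applied per dependency rather than per project.</p>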
        <p>The classification in Figure 1 is especially important from the viewpoint
of distinguishing between the "noisy" and the technical debt inclined software
changes, as monolith projects of similar size can be used as the baseline
when studying how the external service invokes and propagates technical debt.
Further, as per the previous description, it can be expected that the invoked
technical debt will propagate quicker in the directly dependent case 2 than in the
indirect cases 3 and 4.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Exploiting Open-Source Projects</title>
      <p>Exploiting open-source code repositories enables us to make longitudinal surveys
of project history. The GitHub code repository service<sup>1</sup> appears to be a treasure trove
for this kind of research. We can take a project from GitHub and find
for it, neatly logged, each change and its date in great detail.</p>
      <sec id="sec-4-1">
        <title>1 See https://github.com/</title>
        <p>GitHub gives open access to several different projects. However, as mentioned in
Section 2.2, there is also an option of hosting private projects for premium users.
With only public access to the repositories, the sample is likely to
be biased: traditionally non-disclosed for-profit projects cannot
be found in GitHub in this way, which entails that a lot of professional work is not
covered by this study. However, it can be argued that functionality is delivered
via the same technologies in closed-source projects.</p>
        <p>Furthermore, regarding mapping the software change (as discussed in Section
2), the GitHub API gives easy access to the byte-wise size of source files and the
line-wise size of code change per commit. Through this we have the scale of
the whole project in bytes, but the scale of changes in lines of code. Optimally,
both variables would be measured identically, but we can only rely on these
two measures being sufficiently comparable. The only other option would be to
go through the source files and count the line breaks, which is outside the GitHub API's
support.</p>
        <p>As elaborated in Section 3, we want to observe the propagation of technical
debt at both the software project and the software artifact levels, and with as
few constraints as possible, so as to capture the propagation context as completely
as possible. Herein, we face the problem of how to identify technical debt in a
highly diverse setting, and this is the reason why we emphasize the novelty of
researching open-source, social-networking-enabled projects.</p>
        <p>Figure 2 captures the different technical debt accumulation classes for projects
with dependencies on external services. Case 3 depicts the most common
situation, in which the project accumulates technical debt that is realized at a certain
point in time. In case 1, factors external to the component and its development
invoke technical debt, which may be realized and invoke management needs at some
point in time. In case 2, technical debt has been realized (its interest probability is
one, or a decision to remove the debt has been made) and it affects the project.
In this scenario, the debt will propagate onwards, directly or through
intermediaries, and accumulate in dependants. Accumulation channels are addressed in
Figure 3.</p>
        <p>The classification in Figure 2 is important for distinguishing technical debt
inclined software change, as we must be able to distinguish between invoked
change (case 2) and internally accumulated debt (cases 1 and 3). This is because
the monolith projects (see Figure 1) are able to accumulate technical
debt internally, and we must form the baseline whilst aware of this.</p>
        <p>In addition to source code, open-source projects provide access to
documentation and other descriptors. Of these, the social-media-enabled ones form a set of
projects that share a joint technical debt inducer: the social media APIs. These
APIs provide business-critical functionality for the projects, and every time they
change, they cause several changes for their clients. Due to the massive adoption
of social media services, their APIs (e.g., the Facebook Graph API<sup>2</sup> and the
Google OpenID API<sup>3</sup>) integrate into and affect a vast number of projects. This
diverse collection of technologies, which all connect to the APIs that now cause
changes for them, unveils a unique opportunity for technical debt research. As
the changes propagate through various different technologies, they demonstrate
a variety of technical debt propagation paths. Whilst our survey of
social-media-involved open-source projects does not capture the full propagation space,
in particular propagation to business processes, it does yield a formidable library
of the propagation of technical debt in delivered software and its supporting
structures. Considering that this usually corresponds to the projects' delivered
value, research should have a special interest in it.</p>
        <p>Figure 3 demonstrates two channels, from a plethora of foreseeable options,
through which technical debt can propagate and accumulate in new components.
The upper channel captures a more problematic propagation method, in which
no explicit dependency exists. In this, accumulated technical debt in the form
of incomplete documentation causes a misunderstanding in a conceptualization
phase of software development and leads to a complex component design. The
lower channel demonstrates an explicit channel, where an interface change is felt
in the dependent project as component disconnection. For example, a referred
class is renamed in the service, due to which the client cannot access it in the
original fashion. This leads to an erroneous implementation state in the
dependent and undoubtedly invokes repair efforts. In our MSR of open-source
projects, going over both the human-produced messages and the automatically</p>
      </sec>
      <sec id="sec-4-2">
        <title>2 See http://graph.facebook.com/ 3 See https://profiles.google.com/</title>
        <p>identified changes should reveal instances that fit both channels shown in Figure
3, but due to its implicit nature, identification of cases in the upper channel will
be difficult.</p>
        <sec id="sec-4-2-1">
          <title>Study Approach</title>
          <p>We use the GitHub API through the PyGithub/PyGithub library<sup>4</sup>. Our crawler is
a Python program<sup>5</sup> designed to crawl through all commits of a given project
and report, for each commit, the date it was committed, the amount of changes
(as the number of added and removed lines), and the changed files. As such, our
crawler is in itself an end part of a COP.</p>
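          <p>A crawler of this kind could look roughly like the following sketch. It is not the authors' GitHubResearchDataMiner code; the record type and function names are hypothetical, and the sketch assumes a recent PyGithub release that provides token authentication through its Auth module.</p>
          <preformat><![CDATA[
```python
from datetime import datetime
from typing import Iterable, Iterator, List, NamedTuple, Tuple


class CommitRecord(NamedTuple):
    """Hypothetical per-commit record: date, size of change, touched files."""
    date: datetime
    additions: int
    deletions: int
    files: Tuple[str, ...]


def summarize_commits(commits: Iterable) -> Iterator[CommitRecord]:
    """Turn PyGithub Commit objects (or look-alikes) into plain records."""
    for commit in commits:
        yield CommitRecord(
            date=commit.commit.committer.date,
            additions=commit.stats.additions,
            deletions=commit.stats.deletions,
            files=tuple(changed.filename for changed in commit.files),
        )


def crawl(full_name: str, token: str) -> List[CommitRecord]:
    """Fetch every commit of a repository such as 'owner/project'.

    PyGithub is imported here so that summarize_commits stays usable
    (and testable) without the library or network access.
    """
    from github import Auth, Github

    client = Github(auth=Auth.Token(token))
    repo = client.get_repo(full_name)
    return list(summarize_commits(repo.get_commits()))
```
]]></preformat>
          <p>Reading the statistics and file list of a commit typically triggers an additional API request per commit, so a full crawl of a large project is likely to be bounded by the API rate limit.</p>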
          <p>
            For an initial proof of concept, we chose Google's closing of the OpenID 2.0 service
on April 20th 2015 [
            <xref ref-type="bibr" rid="ref4">4</xref>
            ] as a source of technical debt. We made a manual search
in GitHub and discovered two Java projects which had closed issues
mentioning Google closing the service. One was the Passport-based user authentication
system for sails.js applications (GitHub repository tjwebb/sails-auth). The other
was a Grails website that provides information about festivals (GitHub
repository domurtag/festivals). For a control project, we selected another Java project
that was, like sails-auth, a user authentication system for sails.js, but
did not appear to be involved with Google services (GitHub repository
waterlock/waterlock).
          </p>
        </sec>
        <sec id="sec-4-2-2">
          <title>Initial Results</title>
          <p>Our analysis produced the graphs shown in Figure 4. The blue colour is used for
sails-auth, red for festivals, and cyan for waterlock. The X-axis marks the time.
The dots denote the amount of changes in a commit. The bars denote commits
for a time period at least a week long. The lines denote commit frequency for the
previous time interval of at least a week. Finally, the date of interest,
April 20th 2015, is marked on the graph.</p>
          <p>[Figure 4: X-axis: time, October 2014 to June 2015; Y-axis: logarithmic, ticks from 10^0 to 10^4.]</p>
          <p><sup>4</sup> See https://github.com/PyGithub/PyGithub, and in similar fashion for the other
mentioned repositories as well. <sup>5</sup> GitHub repository tomibgt/GitHubResearchDataMiner.</p>
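          <p>The commit frequency per interval can be approximated with a fixed weekly binning, as in the sketch below. This is a simplification: the intervals described above are at least a week long and may vary, whereas the hypothetical function here always uses seven-day buckets counted from a chosen start date.</p>
          <preformat><![CDATA[
```python
from collections import Counter
from datetime import date, timedelta
from typing import Iterable, List, Tuple


def weekly_commit_counts(commit_dates: Iterable[date],
                         start: date) -> List[Tuple[date, int]]:
    """Count commits in consecutive seven-day buckets beginning at start.

    Dates before start are ignored; empty weeks are reported with zero.
    """
    buckets = Counter(
        (day - start).days // 7 for day in commit_dates if day >= start
    )
    if not buckets:
        return []
    last_week = max(buckets)
    return [
        (start + timedelta(weeks=index), buckets.get(index, 0))
        for index in range(last_week + 1)
    ]
```
]]></preformat>
          <p>Note that weeks with zero commits cannot be drawn on a logarithmic axis as they are, so a plot of these counts would need to skip or offset them.</p>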
          <p>The lines show a general decline, which would appear to indicate that as a
project progresses, fewer and fewer changes are made to it. Note that the Y-axis is
logarithmic, which makes the lines curve down instead of appearing linear.</p>
          <p>This would appear to support our hypothesis: after the marked
date, sails-auth and festivals show a slowing of the decline, unlike the control
project waterlock. With only three projects and without more precise
investigation we cannot, of course, claim this to be strong evidence, but it is enough to
encourage us to continue with this approach.</p>
          <p>With moderate work, the analyser can be modified to point out the files where
changes have increased in the commits correlating with the investigated
events. (See Tables 1 and 2.) Looking into the changes made to these files
should help us to further analyse the effort put in by the programmers to pay the
specific technical debt. Also, it should be possible to follow the wave of changes
throughout the COP and analyse the propagation of the debt and the involved
work and communication.</p>
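          <p>One possible form of such a modification is sketched below: it counts how often each file is touched before and after an event date and reports the files touched more often afterwards. The function name and input shape are hypothetical, and the sketch does not normalize for the lengths of the two time windows, which a real analysis would need to consider.</p>
          <preformat><![CDATA[
```python
from collections import Counter
from datetime import date
from typing import Iterable, List, Sequence, Tuple


def files_with_increased_changes(
    records: Iterable[Tuple[date, Sequence[str]]],
    event_date: date,
) -> List[str]:
    """Return files touched by more commits on or after event_date than before.

    Each record is a (commit date, changed file names) pair.
    """
    before: Counter = Counter()
    after: Counter = Counter()
    for commit_date, filenames in records:
        target = after if commit_date >= event_date else before
        target.update(filenames)
    # A Counter yields 0 for files never seen before the event.
    return sorted(name for name in after if after[name] > before[name])
```
]]></preformat>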
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Applicability and Limitations</title>
      <p>The approach described above is limited by certain factors, which we would like
to address here. Firstly, we described this method as a possibility to explore the
complete software context space, but the study design suggests using service calls
to APIs and libraries, especially those of social media, as the method. It can be assumed,
as previously discussed, that this approach does not capture all possible varieties
of software change (see Section 2.2). This is a foreseeable data limitation, even though it
can be argued that the volume of captured changes would produce a
representative set for analysis, accumulating enough assurance to allow generalization to
non-captured context areas.</p>
      <p>Second, there are limitations potentially affecting the identification of
technical debt instances. We discussed the technical debt properties which can be used
to associate a software change with managing technical debt. While this set of
properties currently represents the state of the art in technical debt research,
if it is not exhaustive, the properties may lead to missing particular sub-classes of
technical debt. The approach discussed in the following paragraph can be considered
a partial remedy to this.</p>
      <p>
        Finally, foreseeable limitations may also affect the tracking of technical debt
instances. As a premise for tracking, [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] showed the instances' ability to span
multiple components. Modelling of the chain of projects was introduced as the
method to allow capturing this behaviour. The current classification presented
in Figure 1 considers one dimension for the COPs, presumed to be the most
dominant. This classification can be a limiting factor, especially in large hybrid
COP projects, but we argue that this can be countered by iteratively exploring
more dimensions for the COPs until all technical-debt-inclined changes have
been successfully associated with the technical debt instances.
      </p>
      <p>Once the limitations are overcome and the study's objectives achieved, there are a
number of applications for the results (discussed in Section 3.1). First,
demonstrating technical debt's ability to propagate, almost boundlessly, between
software projects and artifacts should fuel the apparent paradigm shift in software
life-cycle management where the inter-connectivity of software project entities
carries increased value. Second, documenting the ways in which technical debt
can propagate should provide an interface for integrating knowledge from other
research domains to enhance technical debt management by, for example,
applying financial models to technical debt strategization. Lastly, associating the
documentation's technical debt propagation channels with information regarding
their value accumulation not only allows automated tooling approaches to be introduced,
but also makes technical debt an integral and explicit component of the software
project's value production and its assessment.</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusions and Future Work</title>
      <p>With similar studies in the future, using different event markers, it is possible
to map the propagation of technical debt by observing the amount of increased
work arising from different causes of technical debt. It is possible to observe who
pays the technical debt and how it is propagated from the original cause (e.g., a
change in a fundamental library used by many projects) through facade libraries
and components to the final applications.</p>
      <p>
        In an effort to efficiently analyse the propagation of technical debt through
propagation channels, a taxonomy of projects in GitHub should be created to
help describe and predict the characteristics of the projects. To this end,
and to achieve the goals stated herein, we have analysed over twenty-eight
thousand projects from GitHub and have successfully identified a number of projects
with references to suitable external services. According to Lambe [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], even
taxonomies founded on criteria that do not withstand all scrutiny can allow for reliable
predictions and descriptions of characteristics of new members of the taxonomy
based on very little information. A well-created taxonomy combined with our
expected mining results should help us identify different propagation channels
within the projects without even analysing them at the code level. Should we
find two or more clusters of different kinds of change behaviour within a single
taxonomy class, it could suggest that the propagation channels between these
clusters differ from each other.
      </p>
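      <p>As a rough illustration of how such clusters might be spotted within one taxonomy class, the following Python sketch splits one-dimensional values at their largest gap. The largest-gap split is a deliberately simple stand-in for a proper clustering method and is not the authors' technique. The function name <monospace>split_by_largest_gap</monospace>, the threshold, the class labels and the per-project values (assumed here to be ratios of change volume after an event to change volume before it) are hypothetical.</p>
      <preformat preformat-type="code">
```python
def split_by_largest_gap(values, min_gap):
    """Split values into two groups at the widest gap between neighbours.

    Returns a single group when the widest gap is smaller than min_gap,
    otherwise two groups (lower values first). Inputs are sorted first.
    """
    ordered = sorted(values)
    if 2 > len(ordered):
        return [ordered]
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    widest = max(range(len(gaps)), key=lambda i: gaps[i])
    if min_gap > gaps[widest]:
        return [ordered]
    return [ordered[: widest + 1], ordered[widest + 1 :]]


# Hypothetical change ratios per project, grouped by an assumed class label.
ratios_by_class = {
    "facade library": [1.0, 1.1, 1.2, 2.9, 3.1],
    "final application": [1.0, 1.1, 1.3],
}
for name, ratios in ratios_by_class.items():
    groups = split_by_largest_gap(ratios, min_gap=1.0)
    print(name, len(groups), groups)
```
      </preformat>
      <p>With the sample data, the first class separates into two groups while the second stays whole. Under the reasoning above, only the first would be a candidate for differing propagation channels.</p>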
      <p>
        There can, of course, be other causes of variance within a class. For example,
it would be beneficial to have information on the process maturity level of
each project team. This kind of information would be significant in
understanding the project's sensitivity to external changes and the general preparedness
and carefulness in the design [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
      <p>Such work would provide us with a better understanding of the economy of
technical debt, which in turn would help us give good estimates of the actual
costs of applying, for example, social media APIs in an application system and
compare them with the projected benefits and income. It would help in answering the
question: would applying certain features increase the revenue from the service?</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgment</title>
      <p>J. Holvitie is supported by the Nokia Foundation Scholarship and by grants from the Finnish
Foundation for Technology Promotion, the Ulla Tuominen Foundation, and the
Finnish Science Foundation for Economics and Technology.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Brown</surname>
          </string-name>
          , N.,
          <string-name>
            <surname>Cai</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guo</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kazman</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kruchten</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lim</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>MacCormack</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nord</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ozkaya</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          , et al.:
          <article-title>Managing technical debt in software-reliant systems</article-title>
          .
          <source>In: Proceedings of the FSE/SDP Workshop on Future of Software Engineering Research</source>
          . pp.
          <volume>47</volume>
          –
          <fpage>52</fpage>
          .
          <string-name>
            <surname>ACM</surname>
          </string-name>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Cunningham</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>The WyCash portfolio management system</article-title>
          .
          <source>In: Proceedings Addendum for Object-Oriented Programming Systems, Languages, and Applications (OOPSLA)</source>
          . pp.
          <volume>29</volume>
          –
          <fpage>30</fpage>
          . No.
          <volume>22</volume>
          (
          <year>1992</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Dabbish</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stuart</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsay</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Herbsleb</surname>
          </string-name>
          , J.:
          <article-title>Social coding in github: transparency and collaboration in an open software repository</article-title>
          .
          <source>In: Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work</source>
          . pp.
          <volume>1277</volume>
          –
          <fpage>1286</fpage>
          .
          <string-name>
            <surname>ACM</surname>
          </string-name>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Google</given-names>
            <surname>Developers</surname>
          </string-name>
          :
          <article-title>Migrating to google sign-in (</article-title>
          <year>2015</year>
          ), https://developers.google.com/identity/sign-in/auth-migration
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Guo</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seaman</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gomes</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cavalcanti</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tonin</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Da</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            ,
            <surname>Santos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Siebra</surname>
          </string-name>
          ,
          <string-name>
            <surname>C.</surname>
          </string-name>
          :
          <article-title>Tracking technical debt - an exploratory case study</article-title>
          .
          <source>In: 27th IEEE International Conference on Software Maintenance (ICSM)</source>
          . pp.
          <volume>528</volume>
          –
          <fpage>531</fpage>
          .
          <string-name>
            <surname>IEEE</surname>
          </string-name>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Holvitie</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , Leppanen, V.,
          <string-name>
            <surname>Hyrynsalmi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Technical debt and the effect of agile software development practices on it - an industry practitioner survey</article-title>
          .
          <source>In: Sixth International Workshop on Managing Technical Debt (MTD)</source>
          . pp.
          <volume>35</volume>
          –
          <fpage>42</fpage>
          .
          <string-name>
            <surname>IEEE</surname>
          </string-name>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Izurieta</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vetro</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zazworka</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cai</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seaman</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shull</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Organizing the technical debt landscape</article-title>
          .
          <source>In: Third International Workshop on Managing Technical Debt (MTD)</source>
          . pp.
          <volume>23</volume>
          –
          <fpage>26</fpage>
          .
          <string-name>
            <surname>IEEE</surname>
          </string-name>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Kagdi</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Collard</surname>
            ,
            <given-names>M.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maletic</surname>
            ,
            <given-names>J.I.:</given-names>
          </string-name>
          <article-title>A survey and taxonomy of approaches for mining software repositories in the context of software evolution</article-title>
          .
          <source>Journal of Software Maintenance and Evolution: Research and Practice</source>
          <volume>19</volume>
          (
          <issue>2</issue>
          ),
          <volume>77</volume>
          –
          <fpage>131</fpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Kalliamvakou</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gousios</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blincoe</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Singer</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>German</surname>
            ,
            <given-names>D.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Damian</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>The promises and perils of mining github</article-title>
          .
          <source>In: Proceedings of the 11th Working Conference on Mining Software Repositories</source>
          . pp.
          <volume>92</volume>
          –
          <fpage>101</fpage>
          .
          <string-name>
            <surname>ACM</surname>
          </string-name>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Kruchten</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nord</surname>
            ,
            <given-names>R.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ozkaya</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Technical debt: From metaphor to theory and practice</article-title>
          .
          <source>IEEE Software 29(6)</source>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Lambe</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Organising knowledge: taxonomies, knowledge and organisational effectiveness</article-title>
          .
          <source>Chandos Publishing</source>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Avgeriou</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liang</surname>
            ,
            <given-names>P.:</given-names>
          </string-name>
          <article-title>A systematic mapping study on technical debt and its management</article-title>
          .
          <source>Journal of Systems and Software</source>
          <volume>101</volume>
          ,
          <issue>193</issue>
          –
          <fpage>220</fpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>McConnell</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <source>Technical debt. 10x Software Development Blog,(Nov</source>
          <year>2007</year>
          ).
          <article-title>Construx Conversations</article-title>
          . URL: http://blogs.construx.com/blogs/stevemcc/archive/2007/11/01/technical-debt-2.aspx (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>McGregor</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Monteith</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , Zhang, J.:
          <article-title>Technical debt aggregation in ecosystems</article-title>
          .
          <source>In: Third International Workshop on Managing Technical Debt (MTD)</source>
          . pp.
          <volume>27</volume>
          –
          <fpage>30</fpage>
          .
          <string-name>
            <surname>IEEE</surname>
          </string-name>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Schmid</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>A formal approach to technical debt decision making</article-title>
          .
          <source>In: Proceedings of the 9th International ACM SIGSOFT Conference on Quality of Software Architectures</source>
          . pp.
          <volume>153</volume>
          –
          <fpage>162</fpage>
          .
          <string-name>
            <surname>ACM</surname>
          </string-name>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Seaman</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guo</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Izurieta</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cai</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zazworka</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shull</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vetro</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Using technical debt data in decision making: Potential decision approaches</article-title>
          .
          <source>In: Third International Workshop on Managing Technical Debt (MTD)</source>
          . pp.
          <volume>45</volume>
          –
          <fpage>48</fpage>
          .
          <string-name>
            <surname>IEEE</surname>
          </string-name>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Suenson</surname>
          </string-name>
          , E.: How Computer Programmers Work –
          <article-title>Understanding Software Development in Practise</article-title>
          .
          <source>Ph.D. thesis</source>
          , Turku Centre for Computer Science (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Tsay</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dabbish</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Herbsleb</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>Influence of social and technical factors for evaluating contribution in github</article-title>
          .
          <source>In: Proceedings of the 36th International Conference on Software Engineering</source>
          . pp.
          <volume>356</volume>
          –
          <fpage>366</fpage>
          .
          <string-name>
            <surname>ACM</surname>
          </string-name>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>