<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Self-Adaptation for Machine Learning Based Systems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Maria Casimiro</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paolo Romano</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>David Garlan</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gabriel A. Moreno</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eunsuk Kang</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mark Klein</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>INESC-ID, Instituto Superior Técnico, University of Lisbon</institution>
          ,
          <addr-line>Lisbon</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute for Software Research, Carnegie Mellon University</institution>
          ,
          <addr-line>Pittsburgh, PA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Software Engineering Institute, Carnegie Mellon University</institution>
          ,
          <addr-line>Pittsburgh, PA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Today's world is witnessing a shift from human-written software to machine-learned software, with the rise of systems that rely on machine learning. These systems typically operate in non-static environments, which are prone to unexpected changes, as is the case of self-driving cars and enterprise systems. In this context, machine-learned software can misbehave. Thus, it is paramount that these systems are capable of detecting problems with their machine-learned components and adapting themselves to maintain desired qualities. For instance, a fraud detection system that cannot adapt its machine-learned model to efficiently cope with emerging fraud patterns or changes in the volume of transactions is subject to losses of millions of dollars. In this paper, we take a first step towards the development of a framework aimed at self-adapting systems that rely on machine-learned components. We describe: (i) a set of causes of machine-learned component misbehavior and a set of adaptation tactics inspired by the literature on machine learning, motivating them with the aid of a running example; (ii) the required changes to the MAPE-K loop, a popular control loop for self-adaptive systems; and (iii) the challenges associated with developing this framework. We conclude the paper with a set of research questions to guide future work.</p>
      </abstract>
      <kwd-group>
        <kwd>Self-adaptive systems</kwd>
        <kwd>Machine Learning</kwd>
        <kwd>Model degradation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The field of self-adaptive systems (SAS) is an extensive and active research area that has made steady improvements for years. SAS react to environment changes, faults and internal system issues to improve the system’s behavior, utility and/or dependability [1]. These systems usually adopt an architecture, known as the MAPE-K loop, which monitors the system, decides when it needs adaptation, selects the best course of action to improve the system, and executes it [2]. The actions available for the system to execute are usually called tactics. The literature on SAS spans a broad range of systems such as enterprise systems, and cyber-physical systems (CPS).</p>
      <p>In parallel with the maturing of SAS research, a new class of systems has emerged: supervised and semi-supervised machine learning (ML) based systems are now becoming ubiquitous. Such systems embed one or more components, whose behavior is derived from training data, into a larger system containing traditional computational entities (web services, databases, operator interfaces). Examples include: fraud detection, which uses a classifier to detect fraudulent transactions [3]; medical diagnosis, which relies on ML for classifying types of diseases of sick patients [4]; self-driving cars, which use ML to determine whether they should stop based on how distant they are from the car in front [5]; robots, which rely on ML models to predict the amount of remaining battery power [6]; and targeted advertisement services, which rely on recommender systems to show users items that they may find interesting [7].</p>
      <p>For such systems, adaptation poses a key concern. In addition to the reasons that traditional systems must adapt (faults, changing requirements, unexpected loads, etc.), ML-based components may fail to perform as expected, thereby reducing system utility. For instance, changes in a system’s operating environment can introduce drifts in the input data of the ML models making them less accurate [8], or attacks may attempt to subvert the intended functionality of the system [9].</p>
      <p>Thankfully, there is a large number of emerging techniques that have been developed by the ML community for adapting supervised ML models and that could in principle be used as adaptation tactics in a self-adaptive system. These range from off-line, from-scratch model retraining and replacement, at one extreme, to incremental approaches performed in-situ, at the other [10, 11, 12, 13, 14, 15]. And more techniques are being developed constantly.</p>
      <p>Unfortunately, determining when and how to take advantage of such tactics to perform adaptation is highly non-trivial. First, there is a large number of possible adaptation tactics that could potentially be applied to an ML component, but not all approaches work with all forms of supervised ML models. For example, some training models may allow a system to selectively “forget” certain inputs, while others do not. Similarly, some ML models support transfer learning to incrementally update a learnt model, but not all do.</p>
      <p>Second, the value of investing in improving the accuracy of an ML component is strongly context-dependent – often depending on both the domain and timing considerations. For example, while a medical diagnosis system may support model retraining at run time, the latency of this tactic may make it infeasible for self-driving cars, which rely instead on swifter tactics (such as replacing the ML component entirely) that can address real-time system response requirements. In a different mode of operation, however, both types of tactics may be available, e.g., if the self-driving car is stopped (parked mode of operation), it may be feasible to retrain an underperforming model without compromising safety.</p>
      <p>Third, calculating the costs and benefits of these tactics is difficult, particularly in a whole-system context, where improving a particular component’s performance may or may not improve overall system utility. Costs include time, resources (processing, memory, power), and service disruption. Benefits derive for instance from increased accuracy or fairness of the ML component, which can in turn lead to better performing down-stream components and support overall business goals (e.g. by improving advertisement revenue). Both costs and benefits can be hard to quantify, however, and hence to reason about when determining whether an ML adaptation tactic makes sense.</p>
      <p>We argue, therefore, that in order to harness the potential of the rich space of ML adaptation mechanisms, it is necessary to develop methods that can reason about which tactics are available to adapt the ML component, which are the most effective to employ in a given context so that system utility is maximized, and how to integrate them into modern adaptive systems architectures. Specifically, in this paper we attempt to bring some clarity to this emerging but critical aspect of SAS by outlining (i) a set of causes of ML component performance degradation and a set of adaptation tactics derived from research on ML (§ 3); (ii) architectural and algorithmic changes required to incorporate effective ML adaptation into the MAPE-K loop, a popular framework for monitoring and controlling self-adaptive systems (§ 4); and (iii) the modeling and engineering challenges associated with realizing the full potential for adaptation of ML-based systems (§ 4). We conclude with a set of open research questions.</p>
      <p>SAML’21: International Workshop on Software Architecture and Machine Learning, September 13–17, 2021, Växjö, Sweden. maria.casimiro@tecnico.ulisboa.pt (M. Casimiro); romano@inesc-id.pt (P. Romano); garlan@cs.cmu.edu (D. Garlan); gmoreno@sei.cmu.edu (G. A. Moreno); eunsukk@andrew.cmu.edu (E. Kang); mk@sei.cmu.edu (M. Klein). © 2021 Copyright for this paper by Carnegie Mellon University and the authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073.</p>
      <p>2. Background &amp; Related Work</p>
      <p>Current literature on SAS focuses on managed systems that do not embed (nor rely upon) ML models [16]. That is, although the self-adaptation mechanism (i.e. managing system) may rely on ML to perform a given function (e.g., decide the tactic to execute), the actual system that is adapted (i.e. the managed system) does not rely on any ML component. These systems have at their disposal a set of tactics that, for instance, change a system’s architecture (e.g., adding/removing servers) or the quality of the service they provide (e.g., increasing/decreasing the rendering quality of images) in response to environment changes. Usually, tactic outcomes have some uncertainty that can be modeled via probabilistic methods given assumptions on the underlying hardware/software platforms and their characteristics. Further, one can measure the properties of such systems through the use of metrics such as latency, throughput and content quality.</p>
      <p>Determining the costs and benefits of such adaptation tactics has been well researched and there are numerous techniques and algorithms for that end [17]. However, new challenges arise when considering managed systems that depend on ML models. Not only are we missing a well-understood and generally applicable set of tactics that SAS can use to adapt ML-based systems, but also the properties of ML components, such as accuracy and fairness, may not change consistently with the tactic that is executed. For example, if we retrain an ML model, its accuracy is not always affected in the same way, but may depend on the samples available to retrain the model, on the duration of the retraining process, and on the model’s hyper-parameters. Similarly, model fairness may also be affected in different ways due to the training samples that are fed during re-training [18].</p>
      <p>To improve the self-adaptive capabilities of systems and their performance, recent research has proposed SASs that rely on ML techniques and models to adapt the system [19, 20]. Specifically, ML is used in the adaptation manager to: update adaptation policies, predict resource usage, update run-time models, reduce adaptation spaces, predict anomalies, and collect knowledge. Additionally, learning is typically leveraged to improve the Analysis and Plan components of the MAPE-K loop [19].</p>
      <p>In this paper, we focus on the problem of how to leverage self-adaptation to correct and adapt supervised ML components of a managed system, while increasing overall utility of ML-based systems when their ML components are underperforming. This vision is aligned with the one presented by Bures [21] in which the author claims that “self-adaptation should stand in equal-to-equal relationship to AI. It should both benefit from AI and enable AI.” Extending this vision further, we argue that the techniques developed in this context could also be applied, in a recursive fashion, to self-adapt adaptation managers that rely on ML components to enhance their effectiveness and robustness. For instance a planner that relies on ML to reduce the adaptation space could have its own self-adaptation manager to ensure that the ML component is working as expected.</p>
      <p>The vision presented in this paper differs from work on collective SAS since we are targeting systems with only one agent and with a centralized learning process, whereas this line of research focuses on systems with multiple agents that can share knowledge with each other.</p>
      <p>In contrast, our vision ties the field of lifelong/continual learning [22, 23], which deals with open-world problems, to the field of self-adaptive systems. In fact, dealing with open-world changes was identified by Gheibi et al. [19] as an open problem in the SAS domain. Specifically, Lifelong Learning deals with the problem of leveraging past knowledge to learn a new task better and Continual Learning is focused on solving the problem of maintaining the accuracy of old tasks when learning new tasks [23]. The techniques developed in this domain can be leveraged by SASs to improve ML components when unexpected changes occur in the environment or when the performance of the ML component is degraded and affects overall system utility. Overall, our focus is on SASs and on how to integrate techniques from these research domains into a generic, yet rigorous/principled framework that can decide which ML component to adapt, how and when. The next section provides details on possible causes of ML component degradation and repair tactics inspired by this field of research.</p>
      <p>3. Adaptation of ML-based Systems</p>
      <p>We now motivate the need for self-adaptive ML-based systems through an example from the enterprise systems domain. Then, we present a set of possible causes for ML component performance degradation and a set of adaptation tactics.</p>
      <p>3.1. Running Example – Fraud Detection System</p>
      <p>Consider a fraud detection system that relies on ML models for scoring credit/debit card transactions. The score attributed by the ML model is then used by a rule-based model to decide whether transactions are legitimate or fraudulent. Typical clients of companies that provide fraud detection services are banks and merchants. In this setting, system utility is typically defined based on attributes such as the cost of losing clients due to incorrectly declined transactions, fairness (no client is declined more often) [18] and the overall cost of service level agreement (SLA) violations (these systems have strict SLAs to process transactions in real time, e.g. at most 200ms on the 99.999th percentile of the latencies’ distribution [3]). While cost and revenue are directly affected by the ML model’s mispredictions, response time is affected by model complexity, i.e., more complex models may introduce higher latencies that compromise SLAs. However, the impact of these mispredictions varies not only from client to client, with whom different SLAs may have been agreed upon, but also in time, since during specific periods, e.g., Black Friday, the volume of transactions is substantially altered. During busy days such as these, adapting the ML models responsible for fraud detection so that they are less strict and reduce false alarms is crucial in order to preserve system utility. However, this adaptation entails a delicate trade-off, since less strict models can allow fraudulent transactions to be accepted. Further, these systems are subject to constantly evolving fraud patterns, to which the ML models must adapt [24].</p>
      <p>3.2. Causes of Degradation of ML Components’ Accuracy</p>
      <p>We now focus on problems that deteriorate the performance of ML components such that they are no longer able to maintain system utility at a desired level. In particular, we present two classes of problems, which, we argue, are general enough to be representative of most of the issues addressed by the existing ML literature.</p>
      <p>Data-set Shift. When the distribution of the inputs to a model changes, such that it becomes substantially different from the distribution on which the model was trained, we find ourselves in the presence of a problem commonly known as data-set shift [8, 11, 10, 25]. As recent work has shown, not all data-set shifts are malign [10]. As such, an effective SAS should not only detect shifts, but also be able to assess their actual impact on system utility. In a fraud detection system, data-set shift occurs when new fraud patterns emerge (e.g., charges at a particular merchant), or when patterns of legitimate transactions change, for instance due to busy shopping days like Black Friday and Christmas [24]. Although the actual features used for classification may not change, their distribution does. This means that different values of the features now characterize legitimate and fraudulent transactions.</p>
      <p>Incorrect Data. This problem arises when there are samples in the model’s training set that are incorrectly labeled [26] or when test data is tampered with, thus leading the model to mispredict when certain inputs arrive. The former can happen, for instance, when unsupervised techniques are used to label examples in order to bootstrap the training set of a second supervised model [26]. Incorrect data can also make their way into a model’s training set due to attackers that intentionally pollute it so as to cause the ML component to incorrectly predict outputs for certain inputs [12, 9]. For instance, in the fraud detection case, security breaches could lead to poisoning the data used for training ML models, hence causing them to make incorrect predictions.</p>
      <sec id="sec-1-1">
        <title>3.3. Repair Tactics</title>
        <p>Table 1 illustrates a collection of tactics that can be used to deal with issues introduced by ML-based components. These tactics were inspired by research on ML [22, 14, 27, 13, 15]. Next, we describe the tactics presented in the table, motivating them with scenarios in which they can be applied and discussing their costs and benefits.</p>
        <p>Component replacement. This tactic assumes the existence of a repository of components and respective meta-data that can be analyzed to determine if there exists a component that is better suited for the current system state. For example, when the volume of transactions changes, for instance on special days such as Black Friday, ML models may consider the increased frequency of transactions as an indicator of fraud and erroneously flag legitimate transactions as fraudulent. Such mispredictions can lead to significant financial losses [3], thus requiring timely fixes and rendering the use of high latency tactics infeasible (note that in this context, transactions need to be accepted/rejected within milliseconds [3]). As such, only low latency tactics can be applied. An example is to replace the underperforming models with rule-based models, e.g., developed by experts for specific situations, and/or to switch to previously trained models that are known to perform well in similar conditions. A benefit of this tactic, whenever it is available, is to enable a swift reaction to data-set shifts. Its main cost depends on the latency and resources used for the analysis of the candidate replacement components available in the repository.</p>
        <p>Human-based labeling. Humans are often able to recognize patterns, problems, and objects more accurately than ML components [14]. Thus, depending on the domain, humans may play a role in correcting these components or giving them correct samples [14]. For instance, whenever the ML component suspects a transaction of being fraudulent, it can be automatically canceled. Then, the user can be informed of the decision and asked whether the transaction should be authorized or declined in the future. Another possibility is to add humans to the loop when adding samples to the ML component’s training set. In this scenario, an expert can be asked to review the most uncertain classifications so as to improve the quality of the training samples. In the former scenario, the benefits are easily quantifiable, since the risk of accepting a possibly fraudulent transaction can be measured via its economic value. However, users may get annoyed if their transactions are canceled too often, to the extent that they may stop purchasing using that credit card provider. As for relying on experts to review uncertain classifications, having an on-demand expert performing this task is expensive and the latency of the manual labeling process may be unacceptable.</p>
        <p>Transfer learning. Transfer learning (TL) techniques leverage knowledge obtained when performing previous tasks that are similar to the current one so that learning the current task becomes easier [27]. Suppose that: (i) a fraud detection company has a set of clients (such as banks), (ii) the company has a unique ML model for each client, so that it complies with data privacy regulations<sup>1</sup>, and (iii) one of its clients is affected by a new attack pattern, which is eventually learned by that client’s model. In this scenario, TL techniques [29, 27] can be used to improve the other clients’ models so that they can react to the same attack. Estimating the benefits of executing this tactic for a given client boils down to estimating the likelihood that this client may suffer the same attack. Yet, the execution of this tactic typically implies high computational costs (e.g., if cloud resources are used) and non-negligible latency, which may render this tactic economically unfavorable, or even inadequate, e.g., if the attack on a different client is imminent and the TL process is slow.</p>
        <p><sup>1</sup> Since privacy is important in this domain, there are techniques that can be used to deal with the problem of ensuring data confidentiality and anonymity in information transfer between clients [28].</p>
        <p>Unlearning. This tactic corresponds to unlearning
data that no longer reflects the current environment/state
of the system and its lineage, thus eliminating the effect
of that data on current predictions [13], while avoiding
a full model retrain. A key problem that stands in the
way of the execution of this tactic is the identification of
incorrect labels. For instance, in a fraud detection system,
incorrectly classified transactions may all be eventually
identified for “free”, although with large latencies, when
users review their credit card statements. Conversely, in
scenarios in which the identification of incorrect
samples is not readily available, one may leverage automatic
techniques, such as the one described in [30], which are
faster but typically less accurate. As such, the cost and
complexity of this task vary depending on the context.</p>
        <p>Then, after identifying the incorrect samples, the model
must be updated to accurately reflect the correct data. At
this point, the advantage of unlearning techniques with
respect to a typical full model retrain is the time savings
(up to 9.5 × 10<sup>4</sup>) that can be achieved [13].</p>
        <p>Retrain and/or hyper-parameter optimization. This is a general tactic that involves retraining the model with new data that reflects recent relevant data-set drifts, e.g., a new kind of attack in a fraud detection system. There are many types of retraining, ranging from a simple model refresh (incorporating new data using old hyper-parameters) to a full retrain (including hyper-parameter optimization, possibly encompassing different model types/architectures), which imply different computational costs and can benefit the model’s accuracy to different extents. In the presence of data-set shift, when there is new data that already incorporates the new input distribution, this tactic often represents a simple, yet possibly expensive, approach to deal with this problem. The benefits of this tactic depend on the type of retrain process and on the quality of the new data. As for its cost, if retraining is performed on the cloud, it can be directly converted to the economic cost of renting the virtual machines, and several techniques exist to predict such costs [31, 32].</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>4. MAPE-K Loop for ML-Based Systems</title>
    </sec>
    <sec id="sec-3">
      <sec id="sec-3-1">
        <p>In SAS, the MAPE-K loop typically actuates over a system composed of non-ML components. To enable the development of self-adaptive ML-based systems, in which the MAPE-K loop actuates over a system composed of non-ML and ML components (Figure 1), we argue that each stage of the MAPE-K loop should be revised to effectively leverage tactics such as the ones mentioned.</p>
      </sec>
      <sec id="sec-3-2">
        <p>4.1. Monitor</p>
        <p>The Monitor stage has to keep track of the inputs used when querying ML components because shifts of the input distributions may affect the predictions. For instance, the detection of out-of-distribution inputs may mean that there has been a change in the environment and thus the model used by some ML component may no longer be representative of the current environment.</p>
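        <p>As a minimal sketch of how a Monitor might flag input shift for a single numeric feature, the following compares a reference sample (for example, values seen at training time) against a recent window of live inputs using the two-sample Kolmogorov-Smirnov statistic. The function names, the choice of this particular test, and the 5% significance level are illustrative assumptions, not part of the approaches cited in this section.</p>
        <preformat>
```python
from bisect import bisect_right
from math import sqrt


def ks_statistic(reference, live):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of the two samples."""
    ref, cur = sorted(reference), sorted(live)
    gap = 0.0
    for x in set(ref + cur):
        # Empirical CDF at x = fraction of samples that do not exceed x.
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def shift_detected(reference, live, coefficient=1.358):
    """Flag a shift when the statistic exceeds the large-sample critical
    value; the default coefficient 1.358 corresponds to a 5% level."""
    n, m = len(reference), len(live)
    critical = coefficient * sqrt((n + m) / (n * m))
    return ks_statistic(reference, live) > critical
```
        </preformat>
        <p>Calling shift_detected on a feature's training-time values and on the values observed in the latest monitoring window returns True when the two empirical distributions differ by more than the critical value. A per-feature test of this kind does not say whether the shift actually harms system utility; that assessment would still fall to the Analyze stage.</p>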
        <p>The challenge here is not only detecting the occurrence
of shifts in a timely and reliable fashion, but also how
to efectively characterize them — since diferent types
of shifts require diferent reaction methods. As in other
SAS, typical attributes that contribute to the system’s
utility (e.g., latency, throughput) or the satisfaction of
required system properties must be monitored. In addition
to these, the Monitor stage must also gather the outputs
of the ML component to account for situations in which
changes in the inputs go by unnoticed, perhaps because
they are too slow, but that manifest themselves faster in
the outputs [33]. Examples of outputs to monitor are, for
instance, shifts in the output distribution, model’s
accuracy and error – obtained by comparing predictions with
real outcomes. A relevant challenge here is that often real
outcomes are only known after a long time, if ever. For
instance, in fraud detection, false negatives (i.e., undetected
real fraud) are known only when users file a complaint
and false positives are normally undetectable (since no
feedback is obtained for transactions that are legitimate
but rejected by the system). Approaches such as those
proposed in [33, 11, 34] provide a good starting point
for the implementation of a Monitor for self-adaptive
ML-based systems.
approaches[38]. An additional concern is that some of
these tactics may require a considerable use of resources
to execute, either in the system itself or ofloaded. This
requires Plan to account for this impact or cost.</p>
        <p>For ML-based systems that rely on multiple ML
components, whenever a system property is (expected to be)
violated or when system utility decreases, fault
localization may be required to understand which component is
underperforming and should be repaired/replaced [39].</p>
        <p>Challenges. Monitoring input and output
distributions requires keeping track of a multitude of features
and parameters which would otherwise be disregarded. Challenges. Although there are several
apThis is already challenging due to the amount of data that proaches [31, 40] that attempt to predict the time/cost
needs to be stored, maintained, and analyzed. Finding of training ML models, this is a complex problem
suitable frequencies to gather these data and adapting that is strongly influenced by the type of ML models
them in the face of evolving time constraints is an even considered, their hyper-parameters and the underlying
bigger challenge in time-critical domains [35, 11]. (cloud) infrastructure. These techniques represent a
natural starting point to estimate the costs and benefits
4.2. Analyze of adaptation tactics such as the ones presented. Yet,
developing techniques for predicting the costs/benefits
The Analyze stage is responsible for determining whether of complex tactics, e.g. unlearning, remains an open
degradations of the prediction quality of ML components challenge. One interesting direction is to exploit
are afecting (or predicted to afect) other system com- techniques for estimating the uncertainty [25] of ML
ponents and system utility to such an extent that adap- models to quantify both the likelihood of models’
mispretation may be required. To accomplish this, one can dictions as well as the potential benefits deriving from
leverage techniques developed by the ML community to employing corrective adaptation tactics. Certain ML
detect possible issues in the inputs and outputs of the models can directly estimate their own uncertainty [41],
model [8, 11, 10, 33], errors in its training set [36] and the or additional techniques (e.g. ensembles [42]) can be
appearance of new features relevant for prediction [37]. used to obtain uncertainty estimations. Still, existing
These techniques must then be adjusted for the particular techniques can sufer from significant shortcomings in
case of each system, which includes adapting them to different ML models and tasks.</p>
        <p>Challenges. Estimating the impact of an ML component on other system components and on system utility can be challenging because often (mis)predictions affect the system’s utility/dependability in ways that are not only application- but also context-dependent. For instance, during periods with higher transaction volumes, such as on Black Friday, mispredictions have higher impact on system utility, since during these periods it is more critical to accurately detect fraud, while maximizing accepted transactions. Architectural models can capture the information flows among components, but the challenge is to estimate how the uncertainty in the output of the ML components propagates throughout the system.</p>
        <p>4.3. Plan</p>
        <p>The Plan stage is responsible for identifying which adaptation tactics (if any) to employ to address issues with ML components affecting the system. As with other self-adaptation approaches, this reasoning should consider the costs and benefits of each viable tactic. Further, most of the proposed tactics have a non-negligible latency, which needs to be accounted for as in latency-aware</p>
        <p>practical settings [25]. Finally, tactics that modify ML components are computationally expensive (e.g., non-negligible latency). Thus, Plan must have mechanisms to verify that the system can execute the tactic without compromising other components/properties, or even the entire system.</p>
        <p>4.4. Execute</p>
        <p>To execute a given adaptation tactic, the Execute stage must have access to mechanisms to improve or replace the ML component and/or its training set. As in the conventional MAPE-K loop, we require implementations of adaptation tactics that are not only efficient to execute, but also have predictable costs/benefits and are resilient to run-time exceptions.</p>
        <p>Challenges. A key challenge is how to enhance the predictability of the execution of the ML adaptation tactics, which often require the processing of large volumes of data (e.g., to re-train a large-scale model) possibly under stringent timing constraints. We argue that the community of SAS would benefit from the availability of open-source software frameworks that implement a range of generic adaptation tactics for ML components. This would allow one to mask complexity, promote interoperability and comparability of SAS. Further, it would also provide an opportunity to assemble, in a common framework, techniques that have been proposed over many years in different areas of the AI/ML literature.</p>
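        <p>As an illustration only, the cost/benefit and latency reasoning attributed to the Plan stage can be sketched in a few lines of Python. The sketch assumes that each tactic comes with point estimates of its benefit, cost, and latency, and that a single latency budget defines which tactics are viable. The Tactic class, its fields, the plan function, and the tactic names are hypothetical and are not taken from any existing framework; a realistic planner would also have to reason about uncertainty in these estimates.</p>
        <preformat>
```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Tactic:
    """Hypothetical description of one adaptation tactic (all fields are assumptions)."""
    name: str
    expected_benefit: float  # estimated gain in system utility if the tactic is applied
    cost: float              # estimated cost of executing it, in the same utility units
    latency_s: float         # estimated time until the tactic takes effect, in seconds


def plan(tactics: Sequence[Tactic], latency_budget_s: float) -> Optional[Tactic]:
    """Return the viable tactic with the highest net benefit, or None to not adapt.

    A tactic is viable when it fits within the latency budget and its
    estimated benefit exceeds its estimated cost.
    """
    best = None
    best_net = 0.0  # not adapting has a net benefit of zero
    for t in tactics:
        net = t.expected_benefit - t.cost
        # Keep the tactic only if it is fast enough and beats the current best.
        if latency_budget_s >= t.latency_s and net > best_net:
            best, best_net = t, net
    return best


# Hypothetical candidates; names and numbers are made up for illustration.
candidates = [
    Tactic("retrain_from_scratch", expected_benefit=10.0, cost=4.0, latency_s=3600.0),
    Tactic("incremental_update", expected_benefit=6.0, cost=1.0, latency_s=300.0),
    Tactic("replace_with_non_ml_component", expected_benefit=2.0, cost=0.5, latency_s=5.0),
]
chosen = plan(candidates, latency_budget_s=600.0)
print(chosen.name if chosen else "no adaptation")
```
        </preformat>
        <p>Returning None models the "if any" case in which no tactic is worth executing. Treating latency as a hard constraint is a simplification; a latency-aware planner could instead discount the benefit of slower tactics.</p>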
        <p>4.5. Knowledge</p>
        <p>Finally, the Knowledge module is responsible for maintaining information that reflects what is known about the environment and the system. For ML-based systems, the Knowledge component should evolve in order to keep track of the costs/benefits of each tactic on the affected ML components and system’s utility. This corresponds to: gathering knowledge on how each tactic altered an ML component and on the context in which the tactic was executed; and meta information on training sets, for instance characterizing the most important features for predicting the costs and benefits of the different tactics.</p>
        <p>This added knowledge should be leveraged to improve the decision making process and, thus, improve adaptation. By gathering knowledge on how each tactic altered an ML component and on the context in which the tactic was executed, the Analyze and Plan stages can take more effective decisions on when to adapt and which tactic to execute, respectively. Finally, for a tactic that replaces underperforming ML components with non ML-based ones, Knowledge must contain a repository of the available components and their meta-data. This meta-data, we argue, should provide information to enable reasoning on whether the necessary preconditions to enable a safe and timely reconfiguration hold.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgments</title>
      <sec id="sec-4-1">
        <p>Support for this research was provided by Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) through the Carnegie Mellon Portugal Program under Grant SFRH/BD/150643/2020 and via projects with references POCI-01-0247-FEDER-045915, POCI-01-0247-FEDER-045907, and UIDB/50021/2020. This material is based upon work funded and supported by the Department of Defense under Contract No. FA8702-15-D-0002 with Carnegie Mellon University for the operation of the Software Engineering Institute, a federally funded research and development center. DM21-0052</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>References</title>
      <p>[16] C. Krupitzer, et al., A survey on engineering approaches for self-adaptive systems (2018).</p>
      <p>[17] K. Ervasti, A survey on network measurement: Concepts, techniques, and tools (2016).</p>
      <p>[18] A. F. Cruz, et al., A bandit-based algorithm for fairness-aware hyperparameter optimization, CoRR abs/2010.03665 (2020).</p>
      <p>[19] O. Gheibi, et al., Applying machine learning in self-adaptive systems: A systematic literature review, arXiv preprint arXiv:2103.04112 (2021).</p>
      <p>[20] T. R. D. Saputri, S.-W. Lee, The application of machine learning in self-adaptive systems: A systematic literature review, IEEE Access 8 (2020).</p>
      <p>[21] T. Bureš, Self-adaptation 2.0, in: 2021 International Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS), 2021.</p>
      <p>[22] D. L. Silver, Q. Yang, L. Li, Lifelong machine learning systems: Beyond learning algorithms, in: 2013 AAAI spring symposium series, 2013.</p>
      <p>[23] B. Liu, Learning on the job: Online lifelong and continual learning, in: Procs. of the AAAI Conference on Artificial Intelligence, volume 34, 2020.</p>
      <p>[24] D. Aparício, et al., Arms: Automated rules management system for fraud detection, arXiv preprint arXiv:2002.06075 (2020).</p>
      <p>[25] Y. Ovadia, et al., Can you trust your model's uncertainty? evaluating predictive uncertainty under dataset shift, in: Procs. of NIPS, 2019.</p>
      <p>[26] D. Wu, et al., A highly accurate framework for self-labeled semisupervised classification in industrial applications, IEEE TII 14 (2018).</p>
      <p>[27] S. J. Pan, Q. Yang, A survey on transfer learning, IEEE TKDE 22 (2009).</p>
      <p>[28] Y. Liu, et al., A secure federated transfer learning framework, Procs. of IS 35 (2020).</p>
      <p>[29] K. Swersky, et al., Multi-task Bayesian optimization, Procs. of NIPS 26 (2013).</p>
      <p>[30] Y. Cao, et al., Efficient repair of polluted machine learning systems via causal unlearning, in: Procs. of Asia CCS, 2018.</p>
      <p>[31] M. Casimiro, et al., Lynceus: Cost-efficient tuning and provisioning of data analytic jobs, in: Procs. of ICDCS, 2020.</p>
      <p>[32] P. Mendes, et al., TrimTuner: Efficient optimization of machine learning jobs in the cloud via subsampling, in: MASCOTS, 2020.</p>
      <p>[33] X. Zhou, et al., A Framework to Monitor Machine Learning Systems Using Concept Drift Detection, Springer, 2019.</p>
      <p>[34] Z. Yang, M. H. Asyrofi, D. Lo, BiasRV: Uncovering biased sentiment predictions at runtime, CoRR abs/2105.14874 (2021). arXiv:2105.14874.</p>
      <p>[35] E. Bartocci, et al., Specification-based monitoring of cyber-physical systems: a survey on theory, tools and applications, in: Lectures on Runtime Verification, Springer, 2018.</p>
      <p>[36] Z. Abedjan, et al., Detecting data errors: Where are we and what needs to be done?, Procs. of VLDB 9 (2016).</p>
      <p>[37] D. Papamartzivanos, et al., Introducing deep learning self-adaptive misuse network intrusion detection systems, IEEE Access 7 (2019).</p>
      <p>[38] G. A. Moreno, et al., Flexible and efficient decision-making for proactive latency-aware self-adaptation, ACM Trans. Auton. Adapt. Syst. 13 (2018).</p>
      <p>[39] A. Christi, et al., Evaluating fault localization for resource adaptation via test-based software modification, in: Procs. of QRS, 2019.</p>
      <p>[40] O. Alipourfard, et al., Cherrypick: Adaptively unearthing the best cloud configurations for big data analytics, in: Procs. of NSDI, 2017.</p>
      <p>[41] M. A. Osborne, et al., Gaussian processes for global optimization, in: LION, 2009.</p>
      <p>[42] L. Breiman, Bagging predictors, in: Machine Learning, volume 24, Springer, 1996.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>