Software Configuration Diagnosis – A Survey of Existing Methods and Open Challenges
Artur Andrzejak1 and Gerhard Friedrich2 and Franz Wotawa3


1 Heidelberg University, Germany, email: artur.andrzejak@informatik.uni-heidelberg.de
2 University Klagenfurt, Austria, email: Gerhard.Friedrich@aau.at
3 TU Graz, Institute for Software Technology, Austria, email: wotawa@ist.tugraz.at

Abstract. As software systems become more complex and feature-rich, configuration mechanisms are needed to adapt them to different execution environments and usage profiles. As a consequence, failures due to erroneous configuration settings are becoming more common, calling for effective mechanisms for diagnosis, repair, and prevention of such issues. In this paper, we survey approaches for diagnosing software configuration errors, methods for debugging these errors, and techniques for testing against such issues. In addition, we outline current challenges of isolating and fixing faults in configuration settings, including improving fault localization, handling the case of multi-stack systems, and configuration verification at runtime.

1     Introduction

Tackling software configuration errors is recognized as an important research problem which has been investigated by many groups from academia and industry, see, e.g., [51]. In a recent study [52], the authors report empirical findings on the impact of configuration errors in practice. In particular, a study of over 500 real-world configuration issues revealed that this type of problem constituted the largest percentage (31%) of high-severity support requests. Moreover, a significant portion of these issues (16% to 47%) rendered systems fully unavailable or caused severe performance degradation. Other studies [30] and incident reports [5] also confirm that detecting and correcting configuration errors in software is of great importance for practical applications.

In this paper, we focus on providing an overview of current research in the area of software configuration diagnosis comprising fault detection, fault localization, and correction. Besides discussing research articles dealing with software configuration errors, we further discuss open issues and challenges that are worth tackling in future research activities. While the excellent survey [51] has a broader scope and also includes aspects such as configuration-free/easy-to-configure systems, hardening against configuration errors, automating deployment and monitoring etc., we consider in this paper primarily diagnosis aspects. We also cover the most recent state-of-the-art work like diagnosing cross-stack configuration errors [32]. In summary, this survey attempts to offer a compact and focused introduction to this research area, thus serving as a good starting point for further contributions.

Although there has also been work dealing with configurations and configuration errors for systems comprising hardware and software, we focus on methods and tools that have been developed within the area of software configuration. Restricting attention to software configuration allows for extracting and directly using information from programs, which could hardly be obtained when considering hardware. As a consequence, there are many approaches that work exclusively in the software configuration domain. Nevertheless, there are also approaches that can be generalized to serve diagnosis of system configuration as well. In particular, when it comes to large software comprising millions of lines of source code, and also to cases where source code is not available, approaches have to follow a more black-box oriented strategy. This strategy also enables diagnosis in the case of hardware or of systems in general where hardware and software are investigated together.

In more detail, given a program, its configuration parameters (or settings), and an execution environment, a software configuration error arises when the parameters assume incorrect values. The configuration parameters might specify multiple aspects of system behavior, including adaptation to the execution environment (paths, network settings, ...), functionality (enabled/disabled components, logging, ...), performance and resource policies (cache sizes, number of threads, ...), security settings, and others. Consequently, erroneous configuration settings can cause failures of multiple types: complete crashes, partially disabled functionality, performance issues, inappropriate resource usage, or security threats. A frequent scenario of a configuration error is a parameter value that does not fit the specific execution environment. For example, a path to a working directory of the application is specified, but the user executing the program does not have write access to this directory, causing the program to crash (or at least to terminate with an exception).

In the context of this survey, we consider the configuration error diagnosis problem in its most general form: detecting the root causes, i.e., isolating the configuration parameters with inappropriate values, and providing means for repair in terms of identifying correct values or value ranges for these parameters (or adapting the execution environment). This definition implies that we do not target diagnosis of “traditional” software bugs, since we assume that a repair is possible without code changes. Note that it might be difficult to decide whether a failure should be attributed to a configuration problem or a software bug, and this challenge remains one of the open issues (see Section 3). For example, if a failure-triggering sequence of statements in a faulty program is executed only because of a certain parameter setting, the subsequent failure might appear to be caused by a configuration error.

We organize this paper as follows: We first discuss in Section 2 previous research works dealing with software configuration diagnosis. In the following Section 3 we present open research challenges that have not been tackled so far. We discuss threats to validity in Section 4. Finally, we summarize the content and the findings of this paper (Section 5).

2     Previous Work on Software Configuration Diagnosis

In this section, we discuss research work that has been published in the area of software configuration diagnosis. We obtained the papers by searching relevant digital libraries from IEEE and ACM. We further focused on the most recent work in this area, not older than 10 years. Hence, we do not claim the survey to comprise all papers in the context of software configuration errors (for a more comprehensive collection see [51]). However, the presented papers are intended to give an overview of the current research directions in software configuration diagnosis and of the methods and techniques used for this purpose.

In order to present the discussed papers in an accessible way, we classify the papers according to the following categories: (i) diagnosing single-layer configuration errors, (ii) diagnosing cross-stack configuration errors, (iii) diagnosing using configuration knowledge, and (iv) other aspects of software configuration diagnosis. Single-layer configuration errors are errors found in one-component applications like MySQL, Hive, or Spark. Typically, such applications have one common configuration file/database and are developed as an integral project. Cross-stack configuration errors occur in multi-component applications or software stacks like LAMP (Linux, Apache Web Server, MySQL, PHP, Wordpress/Drupal), J2EE, or MEAN.

The rationale behind these categories is the following. Most previous work is available for diagnosing single-layer configuration errors, and this case offers an opportunity for an overview of existing diagnosis approaches. Diagnosis of cross-stack configuration errors poses additional challenges. In some cases, the source code of stack components might not be available, precluding usage of general program analysis techniques. More frequently, cross-stack configuration errors are caused by a mismatch between the configuration settings within separate components [32, 33]. To diagnose such issues, knowledge about the interactions between the components is needed.

If formal knowledge about configurations is available, i.e., configuration rules or constraints, diagnosis can be performed using this knowledge. Such formal knowledge bases may be applicable for single-layer or cross-stack applications.

Finally, there are other aspects that cannot be assigned to one of the former categories, for example testing configurable systems or optimization of software based on configuration parameters.

2.1    Diagnosing Single-Layer Configuration Errors

Single-layer programs are typically written in a single programming language and often the source code is available. Hence, static and dynamic program analysis techniques can be applied to obtain a mapping from configuration options to code regions. This information can be exploited for localizing the root cause behind configuration errors. Consequently, many approaches for diagnosing configuration errors in such programs have been proposed.

Linking configuration options and code regions. Approaches in this group attempt to find a correspondence between a configuration option and code regions impacted by this option. Frequently, such techniques exploit static [43] or dynamic program slicing [14]. In program slicing, one attempts to find the set of all code locations which might influence a target statement (the so-called seed), or all code locations which might be influenced by a seed statement. Hence, these approaches are mainly applicable in the software configuration setting and may not be generalizable to deal with hardware configuration diagnosis.

ConfAnalyzer [29] builds a map from each program point to the options that might cause an error at that point by static data-flow analysis. For diagnosis, it treats a configuration option as the root cause if its value flows into the crashing point. The approach does not require users to install or use additional tools, but it can use logs and stack traces to reduce the rate of false positives.

ConfDiagnoser [57, 56] uses static analysis, dynamic profiling, and statistical analysis to link undesired behavior, represented by predicates, to configuration options. When these predicates indicate behavior deviating from the one known for correct profiles, ConfDiagnoser lists the relevant configuration options as suspects.

Work [58] presents a technique and a tool to troubleshoot configuration errors caused by software evolution. The approach uses dynamic profiling, execution trace comparison, and static analysis to link the undesired behavior to its root cause, i.e., a configuration option which needs to be changed in the new software version.

ConfDoctor [7] is an approach based on static analysis to diagnose configuration defects. It does not require users to execute an instrumented program or to reproduce errors, which is an essential advantage compared to previous approaches. The only run-time information required is the stack trace of a failure. An evaluation on JChord, Randoop, Hadoop, and Hbase shows that the approach could successfully diagnose 27 out of 29 errors, with 20 of them ranked first.

Authors of [25] propose a lightweight dynamic analysis technique that automatically discovers a program’s interactions, i.e., logical formulae that give developers information about how a system’s configuration option settings map to particular code coverage. It is evaluated on 29 programs spanning five languages and could find precise interactions based on a very small fraction of the number of possible configurations.

Data flow analysis. ConfAid [3] applies dynamic information flow analysis techniques to track tokens from specified “configuration sources” and analyze dependencies between the tokens and the error symptoms, pinpointing which tokens are root causes.

Sherlog [53] uses static analysis to infer control and data information in case of a failure. It analyses source code by exploiting information from run-time logs and computes what must or may have happened during the failed run. One deficiency of this tool is that it may require guidance from developers about which function should be symbolically executed.

Paper [17] introduces Lotrack, an extended static taint analysis approach and tool to automatically track configuration options. It derives a configuration map that explains for each code fragment under which configurations it may be executed.
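
The common idea behind the option-to-code mappings described in this subsection (a configuration option is a suspect if its value may flow into the failing program point) can be illustrated with a toy forward propagation. The following is a minimal sketch only, not the implementation of ConfAnalyzer, ConfAid, Lotrack, or any other cited tool. It assumes straight-line code without branches or loops, and the statement labels, variable names, and option names (`work_dir`, `num_threads`) are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A statement is (label, assigned variable, variables read).
# This straight-line representation is an assumption made for the sketch.
Statement = Tuple[str, str, List[str]]


def options_reaching(program: List[Statement],
                     options: Set[str]) -> Dict[str, Set[str]]:
    """Map each statement label to the options whose values may flow into it."""
    # Every configuration option is initially tainted by itself.
    taint: Dict[str, Set[str]] = {opt: {opt} for opt in options}
    reach: Dict[str, Set[str]] = {}
    for label, target, reads in program:
        flowing: Set[str] = set()
        for var in reads:
            # Variables never assigned from an option carry no taint.
            flowing |= taint.get(var, set())
        taint[target] = flowing  # the assignment overwrites any old taint
        reach[label] = flowing
    return reach


# Hypothetical program: s4 opens a file below the configured working directory.
program: List[Statement] = [
    ("s1", "dir", ["work_dir"]),
    ("s2", "n", ["num_threads"]),
    ("s3", "path", ["dir", "file_name"]),
    ("s4", "handle", ["path"]),  # assumed crash point
    ("s5", "pool", ["n"]),
]

reach = options_reaching(program, {"work_dir", "num_threads"})
suspects = sorted(reach["s4"])
print(suspects)  # ['work_dir']
```

If the failure is observed at `s4`, only `work_dir` is reported as a suspect, because `num_threads` never flows into that statement. A realistic analysis would additionally have to handle control flow, aliasing, and inter-procedural calls, which is where the surveyed tools differ.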
 Supervised learning approaches. Relatively few authors propose to use machine learning approaches based on supervised learning (i.e., mainly classification). This can be explained by the fact that it is difficult to obtain or generate training data with appropriate structure and in sufficient amount. Similarly to the challenges of mutation testing, if training samples are generated, faults injected in the configuration files might not trigger a failure or might have unrealistic properties. Also, since a configuration file might contain hundreds of options, a training set is likely to contain only a few faulty cases per option, giving rise to the unbalanced class problem.

Authors of [41] use machine learning to predict whether a configuration error is responsible for a failure and, if so, what the category of the error is. To obtain training data, faults are injected into configuration files and the resulting error category is manually labeled.

Work [38] exploits statistical decision tree analysis to determine possible misconfigurations in data center environments. The authors further improve the accuracy of this approach via a pattern modification method.

 Replay-based techniques. One category of well-known tools [44, 37, 20] comprises the replay-based diagnosis techniques. They treat the system as a black box and automatically run it with possible configuration values, without damaging the rest of the system, until the misconfiguration is fixed. This class of techniques relies on having a working configuration; otherwise, it cannot be applied. Besides, these techniques require users with more domain knowledge.

 Signature-based approaches. Another family of tools mines a large amount of configuration data from different instances to infer rules about options and uses these rules to identify software misconfigurations.

EnCore [55] and CODE [54] belong to this category of work. EnCore takes into account the interaction between the configuration settings and the executing environment, as well as the correlations between configuration entries. It learns configuration rules from a given set of sample configurations and pinpoints configuration anomalies based on these rules.

Analogously, some tools such as Strider [42] or PeerPressure [40] adopt statistical techniques to compare values of configuration options in a problematic system with those in other systems to infer the root cause of a failure. All these techniques require substantial effort to collect the baseline data.

2.2    Diagnosing Cross-Stack Configuration Errors

Configuration options in multi-layer architectures (e.g., LAMP, J2EE, or MEAN “software stacks”) might easily contradict each other or be hard to trace to each other. Therefore, configuration error diagnosis in such architectures is particularly challenging [51]. On the other hand, so far there are very few research approaches or tools targeting this scenario [33].

Sayagh and Adams [32] conducted an empirical study on multi-layer configuration options across Wordpress (WP) plugins, WP, and the PHP engine. They discovered a large and increasing number of configuration options used by WP and its plugins. In addition, over 85% of these options are used by at least two plugins at the same time.

Sayagh et al. [33] perform a qualitative analysis of over 1,000 configuration errors to understand their impact and complexity. Based on this data they develop a slicing-based approach to identify error-inducing configuration options in heterogeneous software stacks. So far it is the only approach which attempts to provide a complete, end-to-end process for diagnosing cross-stack configuration errors.

Work [4] focuses on finding configuration inconsistencies between layers in complex, multi-component software. The proposed technique (based on static analysis) can handle software that is written in multiple programming languages and has a complex preference structure.

In [31] the authors target the identification of configuration dependencies in multi-tiered enterprise applications. They provide a method for analyzing existing deployments to infer the configuration dependencies in a probabilistic sense. This yields a rank-ordered list of dependencies that administrators can consult to systematically identify the true dependencies.

Authors of [12] attempt to quantify the challenges that configurability of complex, multi-component systems creates for software testing and debugging. They analyze a highly-configurable industrial application and two open source applications. They notice that all three applications consist of multiple programming languages, limiting the applicability of static analysis. Furthermore, they find that there are many access points and methods to modify configurations, and that the configuration state of an application on failure cannot be determined only from persistent data.

2.3    Rules, Constraints and Fixing their Violations

Once configuration knowledge is described using constraints or rules, these can be used for diagnosis as well. The use of such knowledge is restricted neither to single-layer nor to cross-stack applications. Hence, methods and techniques based on rules and constraints, which can also be seen as models of the applications, would provide a more general approach to solving the software configuration error problem. In this section, we distinguish methods for learning knowledge, fixing violations, and inconsistency detection between different software artifacts.

 Learning constraints and rules. Several existing approaches extract configuration models [42, 40, 54, 50, 55] and leverage them for configuration debugging, mainly via detecting value anomalies and rule violations.

The categories of extracted data constituting the models typically include the primitive and semantic data types of configuration options (e.g., integer, file path, port number, URL), the value ranges of options (minimum and maximum integer values or a list of acceptable values), the control dependencies (i.e., usage of parameter Q relies on the setting of another parameter P), and value relationships (e.g., the value of parameter S should be greater than that of parameter T). EnCore [55] additionally considers the properties of the execution environment as a part of its models.

CODE [54] takes a unique approach and uses dynamic execution information as the model content, namely sequences of (Windows) registry accesses and derived rules. Using these rules for efficient filtering of even large lists of events, CODE can detect not only configuration errors but also deviant program executions. It requires no source code, application-specific semantics, or heavyweight program analysis.

SPEX [50] analyzes source code to infer configuration option constraints and uses these constraints to diagnose software misconfigurations, to expose misconfiguration vulnerabilities, and to detect error-prone configuration design and handling.

 Build-time configuration settings. Another category of work addresses configurations and their constraints used at compilation and build time. Such configurations determine whether certain product features (e.g., logging, debugging) are activated, or even which software components are included in the shipped product. The latter aspect is relevant, e.g., for software product lines.

Works [22, 23] propose a static analysis approach to extract (build-time) configuration constraints from C code. Despite its simplicity, it has high precision (77% to 93% in the studied systems) and can recover 28% of existing constraints. A further study by the authors reveals that configuration constraints enforce correct runtime behavior, improve users’ configuration experience, and prevent corner cases.

 Fixing violations of configuration constraints. The problem of fixing a configuration that violates one or more constraints is addressed in [47, 48]. For this purpose, the authors introduce the concept of a range fix, which specifies the options to change and the ranges of values for these options. They also design and evaluate an algorithm that automatically generates range fixes for a violated constraint. Empirical studies show that the range fix approach provides mostly simple yet complete sets of fixes and has a moderate running time in the order of seconds.

Configurable software (e.g., Linux OS, eCos) can have a very high number of options (variables) and constraints. For example, Linux has over 6,000 variables and 10,000 constraints; eCos has over 1,000 variables and 1,000 constraints. Such systems typically use variability modeling languages and configuration tools (called configurators). Examples of variability languages include Linux Kconfig, eCos CDL, and feature models. With variability modeling languages and configurators, errors can be detected early, but users still have to resolve the errors, which is also not an easy task: the constraints in variability models can be very complex and highly interconnected. Therefore, researchers have proposed automated approaches that suggest a list of fixes for an error. A fix is a set of changes that, when performed on the configuration, resolves the current error. However, the recommended fixes in these approaches are sometimes large in number and size. For example, fix lists for eCos configurations contain up to nine fixes, and some fixes change up to nine variables.

In this context, work [39] proposes a method to reduce the size and complexity of error fixes by introducing a concept of dynamic priorities. The basic idea is to first generate one fix and then to gradually reach the desirable state based on user feedback. To this end, the approach (1) automatically translates user feedback into a set of implicit priority levels on variables, using five priority assignment and adjustment strategies, and (2) efficiently identifies potentially desirable fixes that change only the variables with low priorities.

 Detecting inconsistencies between code, documentation, and configuration files. Configuration options are widely used for customizing the behavior and initial settings of software applications, server processes, and operating systems. Their distinctive property is that each option is processed, defined, and described in different parts of a software project, namely in code, in the configuration file, and in documentation. This creates a challenge for maintaining project consistency as it evolves. It also promotes inconsistencies leading to misconfiguration issues in production scenarios.

Confalyzer [30] uses static analysis to extract a list of configuration options from source code and from associated options documentation. Confalyzer first marks configuration APIs in the configuration classes. Then it identifies calls to these APIs in the program by building a call graph and obtains option names by reading values of parameters of these calls.

PrefFinder [11], proposed by Jin et al., uses static analysis and dynamic analysis techniques to extract configuration options and stores them in a database for query and use.

The SCIC approach [4] exploits Confalyzer to implement the functionality of extracting configuration options in the key-value model and the tree-structured model.

Work [6] proposes an approach for detection of inconsistencies between source code and documentation based on static analysis. It identifies source code locations where options are read and for each such location retrieves the name of the option. Inconsistencies are then detected by comparing the results against the option names listed in documentation.

2.4    Other Aspects

Other papers dealing with diagnosis of software configuration errors do not fall into the previous categories and address topics like testing, end-user support, and performance optimization; we discuss them in this subsection.

 Testing of highly configurable systems. Paper [18] presents an initial study on the potential of using statistical testing techniques for improving the efficiency of test selection for configurable software. The study aims to answer whether statistical testing can reduce the effort of localizing the most critical software faults, seen from the user perspective.

Authors of [19] analyze program traces to characterize and identify where interactions occur on control flow and data. They find that the essential configuration complexity of these programs is indeed much lower than the combinatorial explosion of the configuration space indicates.

Work [36] proposes S-SPLat, a technique that combines heuristic sampling with symbolic search to explore the enormous space of configurations for testing of software product lines.

A more general approach for testing configurable systems including software is combinatorial testing [15, 16]. There the underlying assumption is that it is not necessarily one configuration parameter that reveals a fault but a certain combination of parameters. Combinatorial testing ensures that, for every subset of k configuration parameters, all combinations of their values are covered by at least one test. In the context of combinatorial testing, the resulting test suite is said to be of strength k. There are many algorithms and tools for combinatorial testing [13]. For a survey on combinatorial testing we refer the interested reader to [26].
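
The notion of a strength-k test suite used in combinatorial testing can be made concrete with a small coverage check. The sketch below is illustrative only: it verifies whether a given suite covers all value combinations of every k-subset of parameters, but it does not generate such suites (that is the task of the algorithms and tools cited in the text). The parameter names `cache`, `log`, and `tls` and their domains are hypothetical.

```python
from itertools import combinations, product
from typing import Dict, List, Tuple

Config = Dict[str, str]


def uncovered_combinations(domains: Dict[str, List[str]],
                           suite: List[Config],
                           k: int) -> List[Tuple[Tuple[str, ...], Tuple[str, ...]]]:
    """Return every (parameter subset, value tuple) of size k that no test covers."""
    missing = []
    for subset in combinations(sorted(domains), k):
        # Value tuples for this parameter subset that appear in some test.
        seen = {tuple(test[name] for name in subset) for test in suite}
        for values in product(*(domains[name] for name in subset)):
            if values not in seen:
                missing.append((subset, values))
    return missing


# Hypothetical configuration space: three boolean-like options.
domains = {"cache": ["on", "off"], "log": ["on", "off"], "tls": ["on", "off"]}

# Four tests instead of the 2**3 = 8 exhaustive configurations.
suite = [
    {"cache": "on", "log": "on", "tls": "on"},
    {"cache": "on", "log": "off", "tls": "off"},
    {"cache": "off", "log": "on", "tls": "off"},
    {"cache": "off", "log": "off", "tls": "on"},
]

pairwise_gaps = uncovered_combinations(domains, suite, 2)
threeway_gaps = uncovered_combinations(domains, suite, 3)
print(len(pairwise_gaps), len(threeway_gaps))  # 0 4
```

In this example the four tests form a suite of strength 2, since every pair of option values occurs in some test, while four of the eight three-way combinations remain uncovered. This shows how a strength-k suite trades exhaustive coverage for a smaller number of tests.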
                                                                                                                                                 5
Configuration and debugging support for end-users. A technique to detect inadequate (i.e., missing or ambiguous) diagnostic messages for configuration errors issued by a configurable software system is proposed in [59]. It injects configuration errors and uses natural language processing to analyze the resulting diagnostic messages. It then identifies messages which might be unhelpful in diagnosis or even negatively impact this process.

The authors of [49] study configuration settings of real-world users from multiple projects and reveal patterns of unnecessary complexity in configuration design. The authors also provide a few guidelines to reduce the configuration space. Finally, the existing configuration navigation methods are studied in terms of their effectiveness in dealing with the over-designed configuration.

Work [28] introduces ConfSeer, a system which recommends to users suitable knowledge base (KB) articles which are likely to describe the user’s current configuration problem and its fix. To this end, ConfSeer takes snapshots of configuration files from a user machine as input, then extracts the configuration parameter names and value settings from the snapshots and matches them against a large set of KB articles. If a match is found, ConfSeer pinpoints the configuration error with its matching KB article. The described system powers the recommendation engine behind Microsoft Operations Management Suite.

Optimizing performance via configuration settings. In [24], a rank-based approach to efficient creation of performance models is introduced. Such models can be exploited for finding an optimally performing configuration of a software system.

The authors of [10] conducted an empirical study on four popular software systems by varying software configurations and environmental conditions, to identify the key knowledge pieces that can be exploited for transfer learning for constructing performance models of configurable software systems.

Paper [35] proposes a multi-objective evolutionary algorithm to find the optimal solutions and addresses the configuration optimization problem for software product lines.

Finally, the work described in [27] employs random sampling and recursive search in a configuration space to find optimally performing configurations for an anticipated workload in software product lines.

2.5    Survey Summary

Many papers deal with configuration diagnosis of single-layer applications, often employing program analysis techniques but also making use of machine learning or replay methods. In case of more complicated applications comprising interacting and configurable software components, there have been fewer papers dealing with concrete solutions. One approach that can be used in both cases of software is to make use of formalized knowledge about configurations, i.e., the configuration parameters, their domains, and rules specifying limitations and relationships among parameters. It would be interesting to investigate whether classical approaches to diagnosis of knowledge-bases like [8, 45, 9, 34] can also be successfully applied for configuration diagnosis. Other aspects discussed in this section include testing configurations, end-user support, and performance optimization.

3    Challenges in Configuration Diagnosis

Based on the survey of papers presented in the previous section, we are able to identify several still open challenges. A general challenge that immediately arises is to distinguish whether an application failure is due to a fault in the configuration setup or a code defect in the program. This is a common problem when applying configuration debugging tools, which usually assume a certain cause. If we want to come up with a general approach for software configuration diagnosis, we have to adapt diagnosis to identify the underlying root cause.

A method that is able to separate these causes would take the current configuration, the program, the description of the execution environment, and the passing/failing tests as input. Based on these inputs, the possible causes of a failure are provided as output. In order to come up with such an approach, it is necessary to have a close look at various configuration diagnosis problems, which consequently gives rise to another challenge, i.e., providing an open repository of various configuration diagnosis problems that can be accessed by researchers in this field.

Such a general repository for software configuration diagnosis should include a larger set of different programs from single-layer to cross-stack applications together with configuration errors coming from different sources, test suites, and ideally also configuration knowledge bases. The repository should cover programs of different sizes and from different domains capturing currently available software to allow comparing different configuration diagnosis methods and techniques.

Besides these two general challenges, there are other challenges that are more specific to the applications (single-layer versus cross-stack) or the tasks to be tackled (i.e., fault localization and repair versus fault detection). In the following, we illustrate some of these more specific challenges in detail.

Diagnosis of single-layer applications. Despite the fact that there have been various methods already published in this domain, there are still some open issues.

• Transfer techniques from functional fault localization: In case of software debugging, there are various methods available going beyond program analysis including spectrum-based fault localization [1, 2] among others. In this approach, code regions are ranked (essentially) according to the number of times they are executed by passing or by failing tests (intuition: if a code line is executed primarily by failing tests, it is more likely to contribute to a failure). For a detailed look at current debugging techniques we refer the interested reader to Wong et al.’s survey [46]. In particular, spectrum-based fault localization offers superior performance compared to static and dynamic program analysis applied to debugging. The open research question is whether spectrum-based fault localization can be efficiently used for software configuration diagnosis as well.
• Study and exploit the trade-off between the type of data from users required for diagnosis (as well as the effort of obtaining this data, e.g., via instrumentation) and the achieved accuracy. The research goals that would go into this direction include:
  – For each type of diagnosis data (from static analysis to diagnosis data dynamically created from instrumentation and also
    for combinations) understand and quantify the degree of likely penalties (e.g., in terms of accuracy) of using only this data for diagnosis. Specifically, characterize error types which can be or cannot be diagnosed for each type of diagnosis data (when using state-of-the-art debugging approaches).
  – For each “class” of diagnosis data, attempt to improve the corresponding state-of-the-art diagnosis methods in terms of types of errors they are able to debug. This can be done, e.g., by an in-depth analysis of why they fail for some error types and by providing substitutes/replacements for the missing diagnosis data.

Diagnosis of cross-stack configuration errors. In the case of cross-stack applications, not much work is available. Important open research challenges include:

• Exploit work on consistency checking to detect potential inconsistencies between different stack layers.
• Leverage existing work on extraction of rules and constraints to model dependencies between layers. Then use the techniques for discovery and fixing of constraint violations to diagnose (and possibly repair) cross-stack configuration errors.
• As a further application of extracted rules, configurator-like tools (as used for configuring operating systems) could be used for safe configuration of cross-stack systems.
• Create models of expected behavior (given a current global configuration) of each layer from the perspective of each layer. Divergences in the behavior might indicate potential configuration inconsistencies or errors. For example, given the current configuration of a database layer (specifying n1 database connections), the PHP layer should also allow n1 database connections. However, if the expected behavior of the PHP layer, based on its own configuration, allows only n2 < n1 database connections, then an inconsistency between these two behavioral models is indicated.

It is worth noting that it is quite important which dependencies or interactions between layers can be observed or recorded. Moreover, in the context of these challenges, the application of model-based approaches for diagnosing (configuration) knowledge bases, e.g., [8, 45, 9, 34], might be worth considering.

Testing-related challenges and goals. In case of testing, we are interested in detecting faults caused by configuration settings. Here, the motivation is to improve testing approaches specifically for detecting faults in system configurations, ideally during software development. To clarify the meaning of “software testing” in the context of configuration (errors), we should consider that an application failure in this context does not necessarily imply that there is a defect in code (as in traditional testing). Such a failure rather indicates that:

• There is a mismatch between the state of the application environment (operating system, file system, hardware, location of input data, libraries, network properties, remote components, etc.) and the configuration settings. This implies that a test for this type of error must take into consideration the environment.
• There is an inconsistency between configuration values, either within a single layer or between layers in a multi-layer application. The corresponding tests might be independent of the application environment, but are probably more comprehensive if this is also taken into account.

Consequently, this discussion gives rise to the following goals:

• Attempt automated test generation that considers the state of the application environment and the configuration settings (maybe implicitly). Such tests would adapt to environment changes and target only the above-mentioned mismatch between environment and configuration. In order to avoid confusion with the meaning of traditional testing, we might call this a “configuration verification” step instead of testing.
• Generate tests that verify only the consistency of configurations between layers of a multi-stack system. In this case a test failure should indicate only an inconsistency, not a lack of adaptation to the production environment. For example, a test could only verify the consistency of configurations across layers, not execute the whole application.
• Generate tests which verify the correctness of the application’s behavior independently of the configuration settings. For example, an application should produce the same behavior independently of the exact path to input/output/libraries, number of used threads (in some range), used compiler (or its flags), etc.
• Generate tests that improve the outcome of fault localization. Here it would be necessary to identify those tests that can distinguish different computed root causes (see, e.g., [21]).

4    Threats to Validity

Several threats to validity of this paper exist. The main one is the risk of omitting important contributions to this field. To mitigate this risk, we have created lists of relevant works using several processes described below. We then merged and pruned the results according to the rank of the publishing venue and originality (i.e., works proposing a novel or distinctive approach were included even if published in a workshop). In the first literature collection process, we searched for publications containing the word “configuration” that were published in selected high-quality venues (ICSE, ASE, ISSTA, FSE, ISSRE, ICSME, ICPC, IEEE Trans. Software Eng., and some others) in the last five years; for each publication found, we verified via its abstract whether it indeed targets configuration errors (diagnosis). In the second process, we read the related work sections of the previously identified works, and created a list of the relevant papers discussed there (here, less prestigious venues were also considered). Finally, we screened the survey [51] to check that no important contribution was omitted.

Another threat to validity is the possibility of misinterpreting any of the discussed papers (e.g., due to a different understanding of terms) and stating inaccurate claims here. To reduce this risk, we have studied each described contribution in a depth sufficient to avoid a misinterpretation. In addition, information from related work sections was used to verify our interpretation where available.

5    Conclusion

In this paper, we presented a survey on methods and techniques used for detecting, localizing, and correcting faults in the context of software configurations. We distinguished the different cases of software
configuration diagnosis for single-layer and cross-stack applications as well as methods used in case of available configuration knowledge and further aspects. From the survey we were able to identify some still open challenges and research questions, including distinguishing different variants of potential root causes, the lack of repositories of application cases for validating and comparing research results, as well as the need for new fault localization and testing methods.

The motivation for this paper is to provide a solid basis for future research in this area and to identify some important challenges in software configuration diagnosis worth tackling. We also indicated some relationships with work on diagnosis of configuration knowledge bases and other approaches of software debugging that might stimulate this field. Because of the growing interest in providing programs comprising a stack of other programs that themselves can be configured, we see a growing need for research in this area.

REFERENCES

[1] Rui Abreu, Peter Zoeteweij, Rob Golsteijn, and Arjan J. C. van Gemund, ‘A practical evaluation of spectrum-based fault localization’, Journal of Systems and Software, 82(11), 1780–1792, (2009).
[2] Rui Abreu, Peter Zoeteweij, and Arjan J. C. van Gemund, ‘Spectrum-based multiple fault localization’, in ASE 2009, 24th IEEE/ACM International Conference on Automated Software Engineering, Auckland, New Zealand, November 16-20, 2009, pp. 88–99. IEEE Computer Society, (2009).
[3] Mona Attariyan and Jason Flinn, ‘Automating Configuration Troubleshooting with Dynamic Information Flow Analysis’, in 9th USENIX Conference on Operating Systems Design and Implementation, pp. 1–11, Vancouver, BC, Canada, (2010). USENIX Association.
[4] Farnaz Behrang, Myra B. Cohen, and Alessandro Orso, ‘Users Beware: Preference Inconsistencies Ahead’, in 2015 10th Joint Meeting on Foundations of Software Engineering, ESEC/FSE 2015, pp. 295–306, New York, NY, USA, (2015). ACM.
[5] Jon Brodkin. Why Gmail Went Down: Google Misconfigured Load Balancing Servers. https://goo.gl/Hdga7H. Accessed: 5 June 2018.
[6] Z. Dong, A. Andrzejak, D. Lo, and D. Costa, ‘ORPLocator: Identifying Read Points of Configuration Options via Static Analysis’, in 2016 IEEE 27th International Symposium on Software Reliability Engineering (ISSRE), pp. 185–195, (October 2016).
[7] Z. Dong, A. Andrzejak, and K. Shao, ‘Practical and accurate pinpointing of configuration errors using static analysis’, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME), pp. 171–180, (September 2015).
[8] A. Felfernig, G. Friedrich, D. Jannach, and M. Stumptner, ‘Consistency-based diagnosis of configuration knowledge bases’, Artificial Intelligence, 152(2), 213–234, (2004).
[9] A. Felfernig, M. Schubert, and C. Zehentner, ‘An efficient diagnosis algorithm for inconsistent constraint sets’, Artificial Intelligence for Engineering Design, Analysis and Manufacturing, 26(1), 53–62, (February 2012).
[10] Pooyan Jamshidi, Norbert Siegmund, Miguel Velez, Christian Kästner, Akshay Patel, and Yuvraj Agarwal, ‘Transfer Learning for Performance Modeling of Configurable Systems: An Exploratory Analysis’, in 32nd IEEE/ACM International Conference on Automated Software Engineering, ASE 2017, pp. 497–508, Piscataway, NJ, USA, (2017). IEEE Press.
[11] Dongpu Jin, Myra B. Cohen, Xiao Qu, and Brian Robinson, ‘PrefFinder: Getting the Right Preference in Configurable Software Systems’, in 29th ACM/IEEE International Conference on Automated Software Engineering, ASE ’14, pp. 151–162, New York, NY, USA, (2014). ACM.
[12] Dongpu Jin, Xiao Qu, Myra B. Cohen, and Brian Robinson, ‘Configurations Everywhere: Implications for Testing and Debugging in Practice’, in Companion Proceedings of the 36th International Conference on Software Engineering, ICSE Companion 2014, pp. 215–224, New York, NY, USA, (2014). ACM.
[13] Sunint Kaur Khalsa and Yvan Labiche, ‘An orchestrated survey of available algorithms and tools for combinatorial testing’, in 25th International Symposium on Software Reliability Engineering, pp. 323–334, (2015).
[14] Bogdan Korel and Janusz Laski, ‘Dynamic Program Slicing’, Information Processing Letters, 29, 155–163, (1988).
[15] D. R. Kuhn, R. N. Kacker, and Y. Lei, ‘Combinatorial testing’, in Encyclopedia of Software Engineering, ed., Phillip A. Laplante, Taylor & Francis, (2012).
[16] D. Richard Kuhn, Renee Bryce, Feng Duan, Laleh Sh. Ghandehari, Yu Lei, and Raghu N. Kacker, ‘Combinatorial testing: Theory and practice’, in Advances in Computers, volume 99, 1–66, Elsevier, (2015).
[17] Max Lillack, Christian Kästner, and Eric Bodden, ‘Tracking Load-time Configuration Options’, in 29th ACM/IEEE International Conference on Automated Software Engineering, ASE ’14, pp. 445–456, New York, NY, USA, (2014). ACM.
[18] Dusica Marijan, ‘Improving Configurable Software Testing with Statistical Test Selection’, in International Workshop on Formal Methods for Analysis of Business Systems, ForMABS 2016, pp. 5–8, New York, NY, USA, (2016). ACM.
[19] J. Meinicke, C. P. Wong, C. Kästner, T. Thüm, and G. Saake, ‘On essential configuration complexity: Measuring interactions in highly-configurable systems’, in 2016 31st IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 483–494, (September 2016).
[20] James Mickens, Martin Szummer, and Dushyanth Narayanan, ‘Snitch: Interactive Decision Trees for Troubleshooting Misconfigurations’, in 2nd USENIX Workshop on Tackling Computer Systems Problems with Machine Learning Techniques, pp. 8:1–8:6, Cambridge, MA, (2007). USENIX Association.
[21] Mihai Nica, Simona Nica, and Franz Wotawa, ‘On the use of mutations and testing for debugging’, Software: Practice and Experience, 43(9), 1121–1142, (2013).
[22] S. Nadi, T. Berger, C. Kästner, and K. Czarnecki, ‘Where Do Configuration Constraints Stem From? An Extraction Approach and an Empirical Study’, IEEE Transactions on Software Engineering, 41(8), 820–841, (August 2015).
[23] Sarah Nadi, Thorsten Berger, Christian Kästner, and Krzysztof Czarnecki, ‘Mining Configuration Constraints: Static Analyses and Empirical Results’, in 36th International Conference on Software Engineering, ICSE 2014, pp. 140–151, New York, NY, USA, (2014). ACM.
[24] Vivek Nair, Tim Menzies, Norbert Siegmund, and Sven Apel, ‘Using Bad Learners to Find Good Configurations’, in 2017 11th Joint Meeting on Foundations of Software Engineering, ESEC/FSE 2017, pp. 257–267, New York, NY, USA, (2017). ACM.
[25] ThanhVu Nguyen, Ugur Koc, Javran Cheng, Jeffrey S. Foster, and Adam A. Porter, ‘iGen: Dynamic Interaction Inference for Configurable Software’, in 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE 2016, pp. 655–665, New York, NY, USA, (2016). ACM.
[26] Changhai Nie and Hareton Leung, ‘A survey of combinatorial testing’, ACM Computing Surveys, 43(2), (January 2011).
[27] Jeho Oh, Don Batory, Margaret Myers, and Norbert Siegmund, ‘Finding Near-optimal Configurations in Product Lines by Random Sampling’, in 2017 11th Joint Meeting on Foundations of Software Engineering, ESEC/FSE 2017, pp. 61–71, New York, NY, USA, (2017). ACM.
[28] Rahul Potharaju, Joseph Chan, Luhui Hu, Cristina Nita-Rotaru, Mingshi Wang, Liyuan Zhang, and Navendu Jain, ‘ConfSeer: Leveraging Customer Support Knowledge Bases for Automated Misconfiguration Detection’, Proc. VLDB Endow., 8(12), 1828–1839, (August 2015).
[29] Ariel Rabkin and Randy Katz, ‘Precomputing Possible Configuration Error Diagnoses’, in 2011 26th IEEE/ACM International Conference on Automated Software Engineering, pp. 193–202, Washington, DC, USA, (2011). IEEE Computer Society.
[30] Ariel Rabkin and Randy Katz, ‘Static Extraction of Program Configuration Options’, in 33rd International Conference on Software Engineering, ICSE ’11, pp. 131–140, New York, NY, USA, (2011). ACM.
[31] Vinod Ramachandran, Manish Gupta, Manish Sethi, and Soudip Roy Chowdhury, ‘Determining Configuration Parameter Dependencies via Analysis of Configuration Data from Multi-tiered Enterprise Applications’, in 6th International Conference on Autonomic Computing, ICAC ’09, pp. 169–178, New York, NY, USA, (2009). ACM.
[32] M. Sayagh and B. Adams, ‘Multi-layer software configuration: Empirical study on wordpress’, in 2015 IEEE 15th International Working Conference on Source Code Analysis and Manipulation (SCAM), pp. 31–40, (September 2015).
[33] Mohammed Sayagh, Noureddine Kerzazi, and Bram Adams, ‘On Cross-stack Configuration Errors’, in 39th International Conference on Software Engineering, ICSE ’17, pp. 255–265, Piscataway, NJ, USA, (2017). IEEE Press.
[34] Kostyantyn Shchekotykhin, Gerhard Friedrich, Patrick Rodler, and Philipp Fleiss, ‘Sequential diagnosis of high cardinality faults in knowledge-bases by direct diagnosis generation’, in ECAI ’14, pp. 813–818, (2014).
[35] K. Shi, ‘Combining Evolutionary Algorithms with Constraint Solving for Configuration Optimization’, in 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME), pp. 665–669, (September 2017).
[36] S. Souto, M. D’Amorim, and R. Gheyi, ‘Balancing Soundness and Efficiency for Practical Testing of Configurable Systems’, in 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE), pp. 632–642, (May 2017).
[37] Ya-Yunn Su, Mona Attariyan, and Jason Flinn, ‘AutoBash: Improving Configuration Management with Operating System Causality Analysis’, in Proceedings of Twenty-first ACM SIGOPS Symposium on Operating Systems Principles, pp. 237–250, Stevenson, Washington, USA, (2007). ACM.
[38] T. Uchiumi, S. Kikuchi, and Y. Matsumoto, ‘Misconfiguration detection for cloud datacenters using decision tree analysis’, in Network Operations and Management Symposium (APNOMS), 2012 14th Asia-Pacific, pp. 1–4, (September 2012).
[39] Bo Wang, Leonardo Passos, Yingfei Xiong, Krzysztof Czarnecki, Haiyan Zhao, and Wei Zhang, ‘SmartFixer: Fixing Software Configurations Based on Dynamic Priorities’, in 17th International Software Product Line Conference, SPLC ’13, pp. 82–90, New York, NY, USA, (2013). ACM.
[40] Helen J. Wang, John C. Platt, Yu Chen, Ruyun Zhang, and Yi-min Wang, ‘Automatic Misconfiguration Troubleshooting with PeerPressure’, in OSDI, pp. 245–258, (2004).
[41] Mengliao Wang, Xiaoyu Shi, and K. Wong, ‘Capturing Expert Knowledge for Automated Configuration Fault Diagnosis’, in 2011 IEEE 19th International Conference on Program Comprehension (ICPC), pp. 205–208, (June 2011).
[42] Yi-min Wang, Chad Verbowski, John Dunagan, Yu Chen, Helen J. Wang, and Chun Yuan, ‘STRIDER: A Black-box, State-based Approach to Change and Configuration Management and Support’, in USENIX LISA, pp. 159–172, (2003).
[43] Mark Weiser, ‘Program slicing’, IEEE Transactions on Software Engineering, 10(4), 352–357, (July 1984).
[44] Andrew Whitaker, Richard S. Cox, and Steven D. Gribble, ‘Configuration Debugging As Search: Finding the Needle in the Haystack’, in 6th Conference on Symposium on Operating Systems Design & Implementation - Volume 6, pp. 6–6, San Francisco, CA, (2004). USENIX Association.
[45] Jules White, David Benavides, Douglas C. Schmidt, Pablo Trinidad, Brian Dougherty, and Antonio Ruiz Cortés, ‘Automated diagnosis of feature model configurations’, Journal of Systems and Software, 83(7), 1094–1107, (2010).
[46] W. Eric Wong, Ruizhi Gao, Yihao Li, Rui Abreu, and Franz Wotawa, ‘A survey on software fault localization’, IEEE Trans. Software Eng., 42(8), 707–740, (2016).
[47] Y. Xiong, H. Zhang, A. Hubaux, S. She, J. Wang, and K. Czarnecki, ‘Range Fixes: Interactive Error Resolution for Software Configuration’, IEEE Transactions on Software Engineering, 41(6), 603–619, (June 2015).
[48] Yingfei Xiong, Arnaud Hubaux, Steven She, and Krzysztof Czarnecki, ‘Generating Range Fixes for Software Configuration’, in 34th International Conference on Software Engineering, ICSE ’12, pp. 58–68, Piscataway, NJ, USA, (2012). IEEE Press.
[49] Tianyin Xu, Long Jin, Xuepeng Fan, Yuanyuan Zhou, Shankar Pasupathy, and Rukma Talwadker, ‘Hey, You Have Given Me Too Many Knobs!: Understanding and Dealing with Over-designed Configuration in System Software’, in 2015 10th Joint Meeting on Foundations of Software Engineering, ESEC/FSE 2015, pp. 307–319, New York, NY, USA, (2015). ACM.
[50] Tianyin Xu, Jiaqi Zhang, Peng Huang, Jing Zheng, Tianwei Sheng, Ding Yuan, Yuanyuan Zhou, and Shankar Pasupathy, ‘Do Not Blame Users for Misconfigurations’, in Twenty-Fourth ACM Symposium on Operating Systems Principles, pp. 244–259, Farmington, Pennsylvania, (2013). ACM.
[51] Tianyin Xu and Yuanyuan Zhou, ‘Systems Approaches to Tackling Configuration Errors: A Survey’, ACM Comput. Surv., 47(4), 70:1–70:41, (July 2015).
[52] Zuoning Yin, Xiao Ma, Jing Zheng, Yuanyuan Zhou, Lakshmi N. Bairavasundaram, and Shankar Pasupathy, ‘An Empirical Study on Configuration Errors in Commercial and Open Source Systems’, in Twenty-Third ACM Symposium on Operating Systems Principles, SOSP ’11, pp. 159–172, New York, NY, USA, (2011). ACM.
[53] Ding Yuan, Haohui Mai, Weiwei Xiong, Lin Tan, Yuanyuan Zhou, and Shankar Pasupathy, ‘SherLog: Error Diagnosis by Connecting Clues from Run-time Logs’, in Fifteenth Edition of ASPLOS on Architectural Support for Programming Languages and Operating Systems, ASPLOS XV, pp. 143–154, New York, NY, USA, (2010). ACM.
[54] Ding Yuan, Yinglian Xie, Rina Panigrahy, Junfeng Yang, Chad Verbowski, and Arunvijay Kumar, ‘Context-based Online Configuration-error Detection’, in 2011 USENIX Conference on USENIX Annual Technical Conference, pp. 28–28, Portland, OR, (2011). USENIX Association.
[55] Jiaqi Zhang, Lakshminarayanan Renganarayana, Xiaolan Zhang, Niyu Ge, Vasanth Bala, Tianyin Xu, and Yuanyuan Zhou, ‘EnCore: Exploiting System Environment and Correlation Information for Misconfiguration Detection’, in 19th International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 687–700, Salt Lake City, Utah, USA, (2014). ACM.
[56] Sai Zhang, ‘ConfDiagnoser: An Automated Configuration Error Diagnosis Tool for Java Software’, in 2013 International Conference on Software Engineering, ICSE ’13, pp. 1438–1440, Piscataway, NJ, USA, (2013). IEEE Press.
[57] Sai Zhang and Michael D. Ernst, ‘Automated diagnosis of software configuration errors’, in ICSE’13, 34th International Conference on Software Engineering, San Francisco, CA, USA, (May 2013).
[58] Sai Zhang and Michael D. Ernst, ‘Which Configuration Option Should I Change?’, in 36th International Conference on Software Engineering, ICSE 2014, pp. 152–163, New York, NY, USA, (2014). ACM.
[59] Sai Zhang and Michael D. Ernst, ‘Proactive Detection of Inadequate Diagnostic Messages for Software Configuration Errors’, in Int. Symp. on Software Testing and Analysis (ISSTA), pp. 12–23, NY, USA, (2015). ACM.