Towards a Software Product Line for Machine Learning Workflows: Focus on Supporting Evolution

Cécile Camillieri, Luca Parisi, Mireille Blay-Fornarino, Frédéric Precioso, Michel Riveill, Joël Cancela Vaz
camillie@i3s.unice.fr, parisi@i3s.unice.fr, blay@i3s.unice.fr, precioso@i3s.unice.fr, riveill@i3s.unice.fr, joel.cancelavaz@gmail.com
Université Côte d'Azur, CNRS, I3S, Bat Templier, 930 Route des Colles, Sophia Antipolis, France

ABSTRACT
The purpose of the ROCKFlows project is to lay the foundations of a Software Product Line (SPL) that helps the construction of machine learning workflows. Based on her data and objectives, the end user, who is not necessarily an expert, should be presented with workflows that address her needs in the "best possible way". To make such a platform durable, data scientists should be able to integrate new algorithms that can be compared to existing ones in the system, thus allowing the space of available solutions to grow. While comparing the algorithms is challenging in itself, Machine Learning, as a constantly evolving, extremely complex and broad domain, requires the definition of specific and flexible evolution mechanisms. In this paper, we focus on mechanisms based on meta-modelling techniques to automatically enrich a SPL while ensuring its consistency.

Keywords
Software Product Line, Machine Learning Workflow, Evolution

1. INTRODUCTION
The answer to the question "What Machine Learning (ML) algorithm should I use?" is always "It depends." It depends on the size, quality, and nature of the data. It also depends on what we want to do with the answer [21].

The industry of cloud-based machine learning (e.g., IBM's Watson Analytics, Amazon Machine Learning, Google's Prediction API) provides tools to learn "from your data" without having to worry about the cumbersome pre-processing and ML algorithms. To address such a challenge, they propose fully automated solutions to some classical learning problems such as classification. Other actors like Microsoft, with the Azure Machine Learning platform, allow users to build much more complex ML workflows in a graphical editor that is targeted towards ML experts.

The common point between these solutions is that they chose to select only a few algorithms, in comparison to the hundreds that are available. However, data scientists know that the best algorithm will not be the same for each dataset [22]. Moreover, new algorithms are regularly proposed by data scientists for dealing with more or less specific problems and for improving performance and accuracy [6]. Thus, in order to help users who want to build ML workflows, we have to propose a system that can present a large variety of algorithms to users, while helping them in their choices based on their data and objectives. At the same time, we should be able to extend the supported solutions, at least by incorporating new algorithms. The challenge is to hide the complexity of the choices from the end user and to revise our knowledge with each addition: an algorithm can become less efficient compared to a new one, while the introduction of new pre-processing operations can extend the reach of algorithms already present.

The contribution of this paper is thus to describe a tool-supported approach responding to this challenge: the ROCKFlows project (ROCKFlows stands for Request your Own Convenient Knowledge Flows).

The remainder of the paper is organized as follows. We discuss in the next section the challenges we face and some related works. Section 3 describes the architecture that supports the project and two usage scenarios, each focusing on a different user of ROCKFlows. We detail the evolution process and the correlated artefacts in Section 4. Section 5 concludes the paper and briefly discusses future work.

2. TOWARDS A SPL FOR MACHINE LEARNING WORKFLOWS
The purpose of the ROCKFlows project is to lay the foundations of a software platform that helps the construction of ML workflows. This task is highly complex because of the increasing number and variability of available algorithms and the difficulty in choosing suitable, parametrized algorithms and their combinations. The problem is not only choosing the proper algorithms, but also the proper transformations to apply to the input data. It is a trade-off between many requirements (e.g., accuracy, execution and training time).

Since Software Product Line (SPL) engineering is concerned with both variability and systematically reusing development assets in an application domain [5], we have based our project on SPL and model-driven techniques. SPL engineering separates two processes: domain engineering, for defining the commonality and variability of the product line, and application engineering, for deriving product line applications [15]. Similarly, the ROCKFlows project requires, on one hand, to build a consistent SPL, allowing end users to get reliable workflows, and on the other hand, to allow evolution of this SPL to integrate new algorithms and pre-processing treatments.

Based on these requirements, we have identified the following challenges, addressing the needs for building and evolving a SPL in a domain as complex and changing as ML, while ensuring a global consistency of the knowledge and the scalability of the system.

C1: Exploratory project in a complex environment.
Making a selection among the high number of data mining algorithms is a real challenge: hundreds of algorithms exist that can tackle a single ML problem such as classification. While work exists to try and rank their performance [6], it only gives an overview of which algorithms are best on average, not for a given specific problem and not according to different pre-processing pipelines.

Data scientists often approach new problems with a set of best practices acquired through experience. However, there is little scientific evidence as to why an algorithm or pre-processing technique leads to better results than another, and in which case. Thus, one of the biggest challenges for this project is to find a proper way to characterise algorithms and to compare them, relative to the very broad spectrum of user needs and data representations.

Collaboration between SPL developers and data scientists induces a complex software ecosystem [13], where some mathematical results may or may not find a correspondence at the end-user problem level. This heterogeneity of formalisms means that new evolutions are regularly discussed between the different stakeholders.

Given these domain requirements, meta-model driven engineering provides an efficient and powerful solution to address the complexity of the ecosystem through its support for separation of concerns and collaborations. In order to operationalize it, we chose to consider this environment as a set of components relying on different meta-models, for which the evolution mechanisms are exposed through services.

C2: SPL building in a constantly evolving environment.
While the number of ML algorithms and techniques constantly grows, the fundamental understanding of ML internal mechanisms is not stable enough to allow us to set any knowledge in stone. Both the domain and our understanding of it evolve quickly, forcing constant evolution of the SPL.

The line evolves in particular through the addition of new algorithms and pre-processings. We run experiments to identify dataset patterns leading to similar behavior of algorithms on different concrete datasets. The high number of possible combinations (variability of compositions and algorithms), as well as the frequent changes in ML, require evolution mechanisms that are both incremental and loosely coupled with the elements presented to the end user.

As of today, ROCKFlows' SPL contains roughly 300 features and 5000 constraints, representing 70 different ML algorithms and 5 pre-processing workflows, and is mostly focused on classification problems.

Evolution in SPLs has been a challenge for many years [15]. In particular, several works exist on the evolution of Feature Models (FM) [8, 1]. They propose different mechanisms for maintaining consistency of evolving FMs. In our case, we rely on these operations to update our models, but we had to encapsulate them in business-oriented services.

Moreover, despite the huge variability of the system, we have decided to propose to the end user only choices that can lead to a proper result. Hence, it should not be possible to build a configuration for which we would not be able to generate a workflow. For instance, it will not be possible for a user to select, relative to a given dataset, a performance value for which we have no algorithm that can reach such requirements. It is necessary for us to ensure a consistent configuration process [20].

Contrary to works that allow several users to modify a model in contradictory ways and then aim to reconcile those modifications [3], we are in a setting where only consistent evolutions are possible. Thus we did not have to handle co-evolution problems. Like the approach used in SPLEMMA [17], "Maintenance Services" define the semantics of evolution operations on the SPL, ensuring its consistency. However, the analogy between the meta-elements manipulated in ROCKFlows and in SPLEMMA is hard to establish, especially because our solution and problem spaces are in constant evolution. Thus the associated meta-models are not stable, implying intraspatial second degree evolutions [19], i.e., several spaces and mappings are simultaneously modified. For example, adding a new kind of non-functional property, due to some improvement of the experiment meta-model, involves extending the FM, the corresponding end user representation (problem space), and the generation tools to take this new feature into account.

C3: User Centered SPL.
We identify three stakeholders, each raising specific challenges.

- SPL users are the end users of the SPL: a user can be a neophyte who is looking for a solution to extract information from a dataset, as well as an expert who wants to check or learn dependencies among dataset properties, algorithms, targeted platforms and user objectives. They want to use a system that helps them to master the variability, i.e., to express their requirements and get their envisioned ML workflow. Neither of them knows about FMs, and they may need complementary information like examples of uses of the algorithms, details about the algorithm author or implementation, etc. Thus, the visualization of a FM as in standard tools is not adapted. Creating a user interface dedicated to the SPL is also problematic, given the changing nature of the system. At the same time, in an agile process, we need to test the SPL with users in order to align it with their needs, which are hard to identify a priori.

Some flowcharts have been designed to give users a rough guide on how to approach ML problems [16, 14]. ROCKFlows wants to make this approach operational. So, we do not aim for the construction of workflows by assembly, but for the automatic production of these workflows, without a direct contribution of the user to this construction process. Works such as these are, however, potential targets for the generation of the workflows, where the proposed optimizations could then be used automatically. Moreover, faced with the multitude of such systems (ClowdFlows [10], MLbase [11], Weka [9]), it is reasonable to allow the user to select her execution target(s), so that the production is limited only to the ML workflows implemented by these platforms.

- External Developers are domain experts who contribute to the SPL. They do not have all the knowledge of the system and contribute by adding new algorithms. They have to be able to contribute separately, with minimal interference.

- Internal developers are leaders of the SPL. They have the knowledge of the global architecture and manage the contributions of external developers in order to integrate them. They have to be able to maintain the platform and to ensure the consistency of all products despite the evolution of the ecosystem.

3. ARCHITECTURE FOR ROCKFlows

3.1 ROCKFlows Big Picture
Figure 1 represents the current proposed architecture for ROCKFlows at the component level. On the left of the central vertical line are the components that are necessary for the end user to configure her workflow. On the right are the components enabling users to add their own ML algorithms to the system. Relationships between the different components rely on meta-models.

Figure 1: High level architecture for ROCKFlows.

We now describe this architecture through two scenarios.

3.1.1 Scenario 1: Configuration of a ML Workflow
A user willing to configure a new ML workflow will do so through a web-based configuration interface (accessible at http://rockflows.i3s.unice.fr). The process requires at least the following steps, as visible on the left of Figure 1:

(a) The Graphical User Interface (GUI) requests display metadata on the Feature Model. Metadata associate each unique feature of the model with descriptions, references or other artefacts aiming to help non-experts in their choices. Figure 2 shows a screenshot of our GUI. Here, the user is presented with the choice of her main objective, in the form of questions.

(b) Once the FM is loaded and displayed properly with the metadata, the user configures the underlying FM by responding to questions. The Feature Model component in the figure exposes a web service that allows configuration of any FM, through the use of SPLAR's API [12].

(c) Once a valid and complete configuration has been defined, it is sent to the Workflow component. The configuration is then transformed into a Platform Independent workflow model that can, in a second step, be used to generate executable code for different target platforms.

(d) The Generator may require access to the base of algorithms handled by the system in order to be able to produce the proper code.

Figure 2: Screenshot of ROCKFlows' GUI.

Though it is not described here, depending on the user's preference, the generated workflow is either provided to the user or directly executed by the target platform.

3.1.2 Scenario 2: Submission of a new algorithm
We now focus on the introduction of new ML algorithms into the SPL. Such an action has impacts on several parts of the system:
- SPL: At least a new feature representing the algorithm should be added in the FM;
- GUI: Display metadata should be updated to reflect this new feature in the GUI;
- Generation: If we can execute the algorithm, the generator should be updated to allow it, either through code or through a reference to the corresponding element in the target platform(s);
- Experiments should be made with this new algorithm in order to compare it to the other known algorithms. Currently, the properties that are considered are the accuracy of the results, the execution time of the workflow, and memory usage. If experiments can be achieved, i.e., the Experiments module has access to the algorithm's execution and results, the SPL needs to be updated with performance information for this algorithm, but also for all the algorithms whose ranking has changed.

The role of the central component, SPLConsistencyManager, is to ensure that all required changes are made across the whole system. Through the present scenario, we describe how this component handles the impacts mentioned above.

(1) As an External Developer wants to add a new algorithm, the GUI presents her with the information that she needs to provide. Because this information is meant to change as the system encompasses more possibilities of Machine Learning, it should be easy to change. Hence, the SPLConsistencyManager provides the GUI with the information needed to add a new algorithm in the tool. The GUI then presents a generated form to the user.

(2) Once the External Developer has provided all information on the algorithm, it is sent to the Consistency Manager and dispatched among the other components.

(3) If the External Developer provided code to execute the algorithm, the Experiments component is requested to run tests on the algorithm, in order to find its performance. This component is described more precisely in the next section.

(4) Once experiments are finished, the manager analyses the results to find whether any inconsistency exists between the information provided by the user and the results. If not, the algorithm can be added both to the Feature Model and to the base of currently supported algorithms.

(5) Finally, once the feature has been added to the FM, its display metadata can be filled with the information provided by the expert.

3.2 Component for experiments on algorithms
As we want even non-expert users to be able to get the appropriate algorithm for their need, we chose to express higher level goals, such as best accuracy or quickest execution time, in the FM. This representation allows filtering out a number of algorithms by offering users trade-offs over these goals. In a second step, or in a more advanced mode, using other models such as requirements engineering Goal Models is considered. In combination with our FM, it could allow us, during configuration, to present the user with more precise expected values for her non-functional goals [2].

To be able to express knowledge such as "best", "average" or "worst" accuracy, we need an appropriate way to compare algorithms on a given problem. This ranking among algorithms is computed by our Experiment module. The component runs each known algorithm on the available compatible datasets and stores its performance on the different properties for each dataset. On top of that, it transforms the datasets with a set of available pre-processing operations and tests each algorithm again with those new sets.

Figure 3: Experiments

Algorithm results on similar datasets are then compared to one another to get a result similar to the one visible in Figure 3. Though it will not be discussed here, we have defined an algorithm based on classical ML and statistical methods to perform this comparison and regroup datasets into so-called dataset patterns. This knowledge can then be pushed into the FM in the form of constraints linking a functional objective, an algorithm, a dataset pattern and the ranking for each of the properties, also depending on the possible pre-processings for this dataset.

As presented in Section 3.1.2, the Experiment component is driven by the SPL Consistency Manager, which is in charge of starting experiments, then gathering and validating results before incorporating them into the SPL. As new algorithms are added, not all experiments need to be executed again; however, the ranking of the algorithms must be updated to take the newest algorithm into account.

4. ARTEFACTS TO SUPPORT EVOLUTION
This section discusses how we allow users to provide information about new algorithms, how we use it to update the system, and how we can ensure consistency despite the multiple impacts of these changes.

4.1 Meta-models
Figure 4 shows an excerpt of the meta-elements that deal with the evolution of the SPL. Each component described earlier corresponds to a meta-model, and the SPLConsistencyManager maintains consistency among them.

Figure 4: SPL Core Meta-models excerpt

4.1.1 SPL: Feature Model and Configuration
Our feature model is represented in the SXFM (Simple XML Feature Model) format (see http://ec2-52-32-1-180.us-west-2.compute.amazonaws.com:8080/SPLOT/sxfm.html). The rest of our SPL handling is also done through SPLOT's FM reasoning library, SPLAR (an excerpt of the meta-model used for both FM and configuration definition in the library can be found at https://github.com/FMTools/sxfm-ecore/blob/master/plugins/sxfm/model/sxfm.png).

4.1.2 Addressing non-expert users
In order to make the system as accessible as possible, additional information on features must be set, such as descriptions or examples, as well as the closed questions that will be asked to the end user during configuration. This metadata on the practical features is handled in a dedicated meta-model, AlgorithmDescriptionMM, and is used to build the GUIs that are presented to the end users and external developers. The model is briefly described in Section 4.2.

4.1.3 Handling the results of Experiments
Information such as the accuracy ranking of algorithms according to dataset patterns is managed by the Experiment component. The expected format of this data is designed in a dedicated meta-model, ExperimentPropertyMM. It evolves as we gain knowledge and experience.

4.2 Metadata on Algorithms
Data scientists adding their algorithms to the system need to provide at least:
- The high level ML objective the algorithm can be used for: classification of data, prediction of numerical values (regression), anomaly detection, etc.;
- Properties of the algorithm with regard to input data: which data types the algorithm supports, whether the algorithm can handle missing values in the data, etc.;
- A description of the algorithm, examples of its use, and references to publications or web pages describing the algorithm. As described in Section 4.1.2, those will be displayed in the configuration interface;
- If possible, code that will allow us to run the algorithm in our tool, so that we can both compare it to the other algorithms and provide executable workflows to end users.

Through the definition of these elements in the AlgorithmDescriptionMM meta-model, a form is automatically generated and presented to the domain expert. Thus, the model can be extended to add new properties for the algorithms, which will be automatically handled by the GUI. However, the impact of such changes on the FM still has to be handled by the SPL manager. We do not know whether tools such as the one described in [7] could help, because those changes mostly impact code.

4.3 Domain driven tooled approach to manage Feature Model evolution
Even though we have defined a single FM for ROCKFlows, we put a focus on separating concerns within it. The tree is currently separated into 4 sub-trees handling: input data description, user objectives, the processing algorithms, and the expected properties of the generated workflow. This separation of concerns provides a first level of modularity for the model.

Linking domain artefacts and FM structure. Metadata external to the FM itself defines particular points in the FM where features can be inserted. Only the sub-trees that need to be modified are considered. In our example, we add all algorithms responding to the classification problem in the same sub-tree. This mechanism enables us to extend the model cleanly, and to abstract ourselves from the exact hierarchy of the features. So, adding a new class of algorithms or modifying the structure of the FM can easily be achieved. Once the feature for the algorithm has been created, additional constraints need to be defined between the algorithm and other features, in particular those describing input data.

Generating domain constraints. Only certain types of constraints must be defined among those different sub-trees. For instance, algorithms can define constraints towards input data, such as "SVM implies Numerical Data", but never the other way around. However, such a constraint only applies in a workflow if no pre-processing is used. So, this constraint has to be transformed to express a constraint depending on the pre-processings that can be applied. The complexity of these cases, their multitude and the frequency of evolution led us to encapsulate the generation of those constraints in dedicated operators, working on given ensembles (e.g., the set of pre-processings that return numerical data). These operators also introduce features that are hidden from the end user, allowing us to tame this complexity.

It is interesting to note that, depending on the semantics associated with the features, different constraints should be generated. For instance, if an algorithm cannot deal with missing values, a constraint "algo excludes missing values" needs to be generated. In the opposite case, a constraint "algo implies missing values" should never be generated, because such an algorithm can still be used even if no missing value is present in the input data. As previously, this higher level knowledge about the features is defined in our metadata. It allows us to ensure that all necessary constraints are properly defined for all algorithms we add to the SPL.

5. CONCLUSION AND FUTURE WORK
In this paper, we have outlined some of the difficulties related to building and evolving a SPL for ML workflows. To handle the line's complexity and evolution, we have proposed an architecture organized around a set of meta-models and transformations encapsulated within services. A large number of complementary perspectives are considered, both on mechanisms to build the SPL and on the business approach of ML.

The complexity we are facing requires an agile and pragmatic approach, in which users are given the opportunity to provide feedback during the early stages of the project. The necessity for evolution of the system leads us today to consider moving towards approaches focused on the reuse of existing components [18]. This would allow for the necessary control over the system's evolution.

The domain of ML is currently booming with propositions for algorithms, distribution of execution and openness to a larger audience. We are considering extending our approach to integrate the parameters necessary for distributing workflow execution, as well as proposing deep learning workflows. A longer term question is to target Scientific Workflow Management systems, thus allowing us to explore data-driven execution through transformations targeting the specification interchange language called WISP [4].

Finally, we wish to allow the end user to express her needs in business terms, through an approach similar to IBM Watson Analytics (https://www.ibm.com/analytics/watson-analytics/). It is a question of integrating state of the art practices into our modelling, without losing the power of our evolutionary approach, driven by experiments.

6. REFERENCES
[1] M. Acher, P. Collet, P. Lahire, and R. France. Separation of Concerns in Feature Modeling: Support and Applications. In AOSD'12. ACM, 2012.
[2] O. Alam, J. Kienzle, and G. Mussbacher. Concern-Oriented Software Design, pages 604–621. Springer Berlin Heidelberg, 2013.
[3] A. Anwar, S. Ebersold, B. Coulette, M. Nassar, and A. Kriouile. A Rule-Driven Approach for composing Viewpoint-oriented Models. Journal of Object Technology, 9(2):89–114, 2010.
[4] B. F. Bastos, R. M. M. Braga, and A. T. A. Gomes. WISP: A pattern-based approach to the interchange of scientific workflow specifications. Concurrency and Computation: Practice and Experience, 2016.
[5] P. Clements and L. M. Northrop. Software Product Lines: Practices and Patterns. Addison-Wesley Professional, 2001.
[6] M. Fernández-Delgado, E. Cernadas, S. Barro, and D. Amorim. Do we Need Hundreds of Classifiers to Solve Real World Classification Problems? Journal of Machine Learning Research, 15:3133–3181, 2014.
[7] S. Getir, M. Rindt, and T. Kehrer. A generic framework for analyzing model co-evolution. In Proceedings of the Workshop on Models and Evolution co-located with MoDELS 2014, Valencia, Spain, pages 12–21, 2014.
[8] J. Guo, Y. Wang, P. Trinidad, and D. Benavides. Consistency maintenance for evolving feature models. Expert Systems with Applications, 39(5):4987–4998, 2012.
[9] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten. The WEKA data mining software: An update. SIGKDD Explor. Newsl., 11(1):10–18, Nov. 2009.
[10] J. Kranjc, V. Podpečan, and N. Lavrač. ClowdFlows: a cloud based scientific workflow platform. In Machine Learning and Knowledge Discovery in Databases, pages 816–819. Springer Berlin Heidelberg, 2012.
[11] T. Kraska, A. Talwalkar, J. Duchi, R. Griffith, M. J. Franklin, and M. Jordan. MLbase: A Distributed Machine-Learning System. In CIDR, 2013.
[12] M. Mendonça, M. Branco, and D. Cowan. S.P.L.O.T.: software product lines online tools. In OOPSLA, pages 761–762. ACM Press, 2009.
[13] T. Mens, M. Claes, P. Grosjean, and A. Serebrenik. Studying Evolving Software Ecosystems based on Ecological Models. In Evolving Software Systems, pages 297–326. Springer, 2014.
[14] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, et al. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011. http://scikit-learn.org/stable/tutorial/machine_learning_map/.
[15] K. Pohl, G. Böckle, and F. J. van der Linden. Software Product Line Engineering: Foundations, Principles and Techniques. Springer-Verlag, 2005.
[16] B. Rohrer. Machine learning algorithm cheat sheet for Microsoft Azure Machine Learning Studio. https://azure.microsoft.com/en-us/documentation/articles/machine-learning-algorithm-cheat-sheet/.
[17] D. Romero, S. Urli, C. Quinton, M. Blay-Fornarino, P. Collet, L. Duchien, and S. Mosser. SPLEMMA: a generic framework for controlled-evolution of software product lines. In International Workshop on Model-driven Approaches in SPL (MAPLE), pages 59–66, 2013.
[18] M. Schöttle, O. Alam, J. Kienzle, and G. Mussbacher. On the modularization provided by concern-oriented reuse. In Proceedings of MODULARITY'16, pages 184–189, New York, NY, USA, 2016. ACM.
[19] C. Seidl, F. Heidenreich, and U. Aßmann. Co-evolution of Models and Feature Mapping in Software Product Lines. In Proceedings of SPLC'12, pages 76–85, New York, NY, USA, 2012. ACM.
[20] S. Urli, M. Blay-Fornarino, and P. Collet. Handling Complex Configurations in Software Product Lines: a Tooled Approach. In Proceedings of SPLC'14, pages 112–121, Florence, Italy, 2014. ACM.
[21] I. H. Witten, E. Frank, and M. A. Hall. Data Mining: Practical Machine Learning Tools and Techniques. 2011.
[22] D. H. Wolpert. The lack of a priori distinctions between learning algorithms. Neural Comput., 8(7):1341–1390, Oct. 1996.
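The "only consistent configurations" policy of Section 2 can be sketched as follows. This is a minimal illustration, not ROCKFlows code: the dataset patterns, algorithm names and accuracy levels below are invented, and the real system reasons over the FM with SPLAR rather than over a lookup table.

```python
# Hypothetical knowledge base: (dataset_pattern, algorithm) -> accuracy level.
# In ROCKFlows this information comes from experiments; here it is hard-coded.
KNOWN_RESULTS = {
    ("small-numeric", "SVM"): "best",
    ("small-numeric", "NaiveBayes"): "average",
    ("large-sparse", "NaiveBayes"): "best",
}

def selectable_accuracy_levels(dataset_pattern):
    """Offer only the accuracy levels that at least one algorithm reaches,
    so every selectable choice can lead to a generatable workflow."""
    return sorted({level for (pattern, _), level in KNOWN_RESULTS.items()
                   if pattern == dataset_pattern})

def algorithms_reaching(dataset_pattern, level):
    """Algorithms that reach the requested level on this dataset pattern."""
    return sorted(algo for (pattern, algo), lvl in KNOWN_RESULTS.items()
                  if pattern == dataset_pattern and lvl == level)
```

For a "large-sparse" dataset, only "best" is offered: the user is never shown a performance value that no known algorithm can reach.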
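The step-wise configuration of Scenario 1 can be illustrated with a toy propagation step. This is a deliberately simplified sketch: the feature names and the single exclusion constraint are invented, and in ROCKFlows this reasoning is delegated to SPLAR over the full FM.

```python
# Invented example constraint: the two features cannot co-exist in a product.
EXCLUDES = {("SVM", "MissingValues")}

def remaining_choices(all_features, selected):
    """Features still selectable after propagating exclusion constraints,
    mimicking how the GUI only ever offers choices that stay consistent."""
    out = []
    for f in all_features:
        if f in selected:
            continue  # already part of the configuration
        if any((f, s) in EXCLUDES or (s, f) in EXCLUDES for s in selected):
            continue  # would violate a constraint
        out.append(f)
    return out
```

After the user states that her dataset contains missing values, an algorithm excluded by that choice simply disappears from the offered options.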
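The consistency check of Scenario 2, step (4), amounts to comparing the properties declared by the External Developer with what the experiments actually measured. A minimal sketch, assuming properties are plain key/value pairs (the property names below are invented stand-ins for AlgorithmDescriptionMM fields):

```python
def validate_submission(declared, measured):
    """Return the list of properties for which the developer's declaration
    contradicts the experiment results; an empty list means the algorithm
    may be added to the FM and to the base of supported algorithms."""
    issues = []
    for prop, claim in declared.items():
        if prop in measured and measured[prop] != claim:
            issues.append(prop)
    return issues
```

Only when the returned list is empty does the (sketched) manager proceed with updating the FM, the display metadata and the generator.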
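The incremental ranking update described for the Experiment component can be sketched as follows: stored results are reused, and only the ranking is recomputed when a new algorithm arrives. The accuracy figures and algorithm names are invented for illustration; the real module also covers execution time and memory usage.

```python
def rank_algorithms(accuracies):
    """Rank algorithms by measured accuracy on one dataset, best first."""
    return [algo for algo, _ in
            sorted(accuracies.items(), key=lambda kv: -kv[1])]

def add_algorithm(accuracies, name, score):
    """Incremental update: previous experiment results are kept as-is;
    only the new algorithm is measured, then the ranking is recomputed."""
    updated = dict(accuracies)
    updated[name] = score
    return rank_algorithms(updated)
```

This mirrors the paper's point that not all experiments need to be re-run: adding one algorithm costs one set of measurements plus a re-ranking.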
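To give an intuition of the feature-tree representation behind the FM, here is a much-simplified, SXFM-inspired sketch. It is not the real SXFM grammar (which is XML-based with mandatory/optional markers and cross-tree constraints); it only shows how an indented tree maps to (feature, parent) pairs.

```python
# Toy indented feature tree (feature names invented for illustration).
TREE = """\
Root
 InputData
  NumericalData
 Algorithms
  SVM
"""

def parse_tree(text):
    """Parse one-space-per-level indentation into (feature, parent) pairs."""
    pairs, stack = [], []
    for line in text.splitlines():
        name = line.strip()
        depth = len(line) - len(line.lstrip())
        stack = stack[:depth]          # drop siblings/descendants above depth
        parent = stack[-1] if stack else None
        pairs.append((name, parent))
        stack.append(name)
    return pairs
```

The separation into sub-trees discussed in Section 4.3 (input data, objectives, algorithms, workflow properties) hangs off such a root in the actual FM.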
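The form generation driven by AlgorithmDescriptionMM can be sketched as a mapping from meta-model fields to widgets. The field list and widget names below are simplified stand-ins, not the actual meta-model; the point is that adding a field to the model automatically extends the submission form.

```python
# Simplified stand-ins for AlgorithmDescriptionMM fields:
# (field name, field kind, allowed options).
FIELDS = [
    ("objective", "choice", ["classification", "regression", "anomaly detection"]),
    ("handles_missing_values", "boolean", None),
    ("description", "text", None),
]

def generate_form(fields):
    """Derive one form widget per meta-model field, so the GUI follows
    the meta-model's evolution without manual changes."""
    widgets = {"choice": "dropdown", "boolean": "checkbox", "text": "textarea"}
    return [{"name": name, "widget": widgets[kind], "options": options}
            for name, kind, options in fields]
```

Extending FIELDS with a new property immediately yields one more widget in the generated form; only the impact on the FM still needs manual handling, as noted above.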