Automatic Model Repair Using Reinforcement Learning

Angela Barriga, Adrian Rutle, Rogardt Heldal
Western Norway University of Applied Sciences, Bergen, Norway
abar@hvl.no, aru@hvl.no, rohe@hvl.no
ABSTRACT
When performing modeling activities, the chances of breaking a model increase together with the size of development teams and the number of changes in software specifications. One option to prevent and repair broken models is to automate this process with a software tool, using techniques like Machine Learning (ML). Despite its potential to help fix models, ML is still challenging to apply in the modeling domain due to the lack of available datasets. However, some ML branches that do not require initial training data, such as Unsupervised and Reinforcement Learning (RL), could offer promising results. In this paper we present a prototype tool for automatic model repair using RL algorithms. These algorithms could potentially reach human-quality model repair without requiring supervision. To conduct an initial evaluation we have tested our prototype on a set of broken models and studied its execution and results.

CCS CONCEPTS
• Computing methodologies → Reinforcement learning; Model verification and validation; • Software and its engineering → Risk management; • Theory of computation → Theory and algorithms for application domains;

KEYWORDS
Model Repair, Machine Learning, Reinforcement Learning

1   INTRODUCTION
Models are used to develop key parts of systems in engineering domains [23]. Therefore, the correctness and accuracy of such models are of the utmost importance to maintain good quality during the development of these systems. The complexity of keeping models free of errors grows together with the size of working teams and the number of changes during the development process. Tracking all versions of the same model and checking its correctness can be a challenging task, especially if the size of the model also increases. Automation can be a good solution to ease the complexity of this process by periodically checking whether a model is free of errors and repairing errors when they occur. Tools that automate or support error detection and repair of models can greatly improve how organizations deal with Model-Driven Engineering processes by reducing the burden of manually dealing with correctness issues and improving delivery time and final quality.
   To this end, there already exist many tools and research efforts [16] for automatically resolving software bugs that have shown promising results. Examples of developed prototypes for model repair can be found in [6], where the authors achieve repair by aligning models and their behavior logs, and in [17], where EMF models are repaired using a rule-based interactive approach. Other contributions take into account the editing history of a model, such as [18] with a tool to generate repair suggestions and [20] with a prototype based on graph transformation theory.
   However, despite the potential already shown by Machine Learning (ML) in achieving (and sometimes even surpassing) human performance in repetitive tasks, we could not find in the literature any research applying ML to improve how developers deal with cyclical tasks in modeling, such as repairing. The authors of [3] stress how the combination of these two fields, ML and modeling, could be really beneficial for the future of modeling, and state some application examples. ML has already shown its effectiveness in providing support and independently solving different tasks in software engineering such as testing [5], quality assurance [4, 15] or verification [8].
   The biggest challenge for ML adoption within modeling is the lack of publicly available historical data, since most ML algorithms need large amounts of data in order to achieve high-quality results. Some available model repositories are [1, 7, 11], yet the data they offer is limited in terms of diversity and labeling. Still, until the available model datasets grow and improve, some ML algorithms can work around this data issue. Unlike other ML techniques, Reinforcement Learning (RL) does not require training datasets, since its purpose is to find structure in the data it works with without supplying any target or outcome beforehand. This makes it useful in situations and domains which lack historical data.
   Another issue is that the size of the search space when performing automatic repair can grow exponentially when the target model has enough content. In addition, potentially interesting fixes are usually not unique; the same model could be repaired in many different ways, adding complexity and uncertainty to ML algorithms.
   RL offers algorithms able to learn by themselves how to interact with an environment by providing them a set of possible actions and rewards for each of them. Using these rewards they can learn which are the best actions to perform in said environment. Rewards can be established so that algorithms learn how to fix undesired situations or problems in the environment, such as conflicts with non-unique repairing actions. The search space can be reduced by classifying the available actions according to the error to fix: actions that could never interact with an error can be omitted (a minimal illustrative sketch of this idea is shown below).
   This technique dates from the 80s; however, it has been during the last years, due to the decreasing price of hardware and the increase in computational power, that it has produced breakthrough results in tasks as different as reaching above-human-level performance in video and board games [19], improving existing algorithms [14] or teaching simulated entities and robots to walk [9].
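   The following minimal Python sketch illustrates the per-error action masking mentioned above. It is only an illustration; the error kinds and action names are hypothetical stand-ins, not the tool's actual vocabulary or API.

   # Illustrative sketch of per-error action masking to shrink the search space.
   # Error kinds and editing actions are hypothetical examples.
   ACTIONS = ["rename_class", "delete_supertype", "rename_attribute", "add_attribute_type"]

   # Keep, for each error kind, only the actions that could possibly affect it.
   RELEVANT_ACTIONS = {
       "duplicated_attribute_name": ["rename_attribute"],
       "attribute_without_type":    ["add_attribute_type"],
       "class_supertype_of_itself": ["delete_supertype"],
       "duplicated_class_name":     ["rename_class"],
   }

   def candidate_actions(error_kind):
       """Return the reduced action set for an error kind, falling back to all actions."""
       return RELEVANT_ACTIONS.get(error_kind, ACTIONS)

   print(candidate_actions("attribute_without_type"))   # ['add_attribute_type']

   With such a mask, the learning algorithm only ever has to explore the actions that are relevant to the error at hand.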
   Given the potential these algorithms offer, together with the need for automated tools for model repair, we consider that studying how to apply RL to autonomous model repair could provide interesting results, with the possibility of achieving human performance in fixing models.
   Hence, in this paper we present a prototype of how an RL algorithm can repair errors in a set of broken models without human intervention, showing its execution and results. In this first approach, we consider errors related to violations of well-formedness constraints, with the idea of generalizing the types of errors supported after further development.

2   AUTOMATIC MODEL REPAIR TOOL
The Automatic Model Repair Tool (AMRT) is able, by using RL techniques, to choose and apply the best actions to fix a broken model. During this process the tool is independent from the user and does not require any supervision.

2.1   Algorithm
RL consists of algorithms that learn what to do, given a situation and a set of possible actions to choose from, in order to maximize a reward. These algorithms include an agent that must learn by itself what to do through interacting with the environment. Learning is accomplished by choosing the available actions with the best rewards.
   In the model repair scenario, the agent is the model to be repaired and the environment is the framework where it was defined (EMF in our approach). EMF provides error diagnostics about the model, which can be altered by the actions available in the framework. If the algorithm selects actions that reduce the number of errors in the model it gets a positive reward; otherwise it is penalized. This algorithm workflow is detailed in Fig. 1.
   When there are no more errors to solve, the environment returns a maximum reward, the ultimate goal to achieve. This guarantees that all errors in the model are solved, but since there might be many ways to solve them, the algorithm might not find the optimal solution. However, the optimal solution could be reached by providing the algorithm with more models to solve so that it learns further, or by manually favoring some rewards over others.

Figure 1: Reinforcement learning applied to model repair workflow

   AMRT is powered by a Q-Learning algorithm [22]. We will apply this algorithm to a sample EMF model in Section 2.3; here we only explain the main characteristics of the algorithm.
   In Q-Learning, the agent chooses the optimal action by consulting a table structure called the Q-table. This table is updated while the agent interacts with the environment through repeated calculations of the Bellman Equation [2]. This equation returns a Q-value stating that the maximum future reward is the reward (r) the agent received for entering the current state (s) with some action (a), plus the maximum future reward for the next state (s') and action (a'), reduced by a discount factor (gamma) to avoid falling into a local maximum (see Fig. 1). This allows inferring the value of the current state (s) based on the calculation of the next one (s'), which can be used to calculate an optimal policy to select actions.

   Q(s, a) = r + γ · max_{a'} Q(s', a')

   From here the algorithm can determine the utility of a state and the optimal action to take next until it reaches its final goal (maximum reward). Each iteration (attempt made by the agent) is called an episode. For each episode, the agent will try to achieve the goal state, and for every transition (pair of state and action) it will keep updating the values of the Q-table.

   Q_{t+1}(s_t, a_t) = Q_t(s_t, a_t) + α · (r_{t+1} + γ · max_a Q_t(s_{t+1}, a) − Q_t(s_t, a_t))

2.2   Phases
The tool undergoes two different phases: learning and running. During the learning phase, the Q-learning algorithm will be trained using one or several models, updating its Q-table and improving the general Q-value; in other words, learning which are the optimal actions to fix a broken model.
   The algorithm will receive as initial input:
   (1) An agent: the model in XML format.
   (2) Actions: given Ecore and the Ecore editor, the exhaustive set of edit operations, which are the possible actions available to perform on the model.
   (3) State: given a model, the error diagnostics obtained from the EMF tool.
   As the algorithm is executed, for each action performed on the model it will need to know the model's current state; namely, it will need to be connected to EMF to iteratively update the model and analyse its errors. The Q-value will be optimized until the model has no errors left. This structure can be seen in detail in Fig. 2 and as pseudocode in Alg. 1. The more models provided to the algorithm, the better the performance it can get.

Figure 2: Tool training structure
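   To make the learning phase concrete, the following is a minimal, self-contained Python sketch of tabular Q-learning over (error kind, action) pairs, in the spirit of Alg. 1. It is an illustration under simplifying assumptions, not the tool's implementation: the diagnose and apply_action helpers stand in for the connection to EMF, and the reward scheme (positive when an error disappears, negative otherwise, with learning stopping once no errors remain) follows the description above.

   import random
   from collections import defaultdict

   ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1   # learning rate, discount factor, exploration rate

   def q_learn_repair(model, actions, diagnose, apply_action,
                      episodes=20, steps=3, q_table=None):
       """Illustrative tabular Q-learning repair loop (not the actual AMRT code).

       Assumed helpers (hypothetical, standing in for EMF):
         diagnose(model)           -> list of error kinds (strings) found in the model
         apply_action(model, e, a) -> a new model with action a applied for error e
       """
       q = q_table if q_table is not None else defaultdict(float)   # Q[(error, action)]
       for _ in range(episodes):
           errors = diagnose(model)
           if not errors:
               break                                   # goal state reached: nothing left to fix
           for error in list(errors):
               for _ in range(steps):
                   if random.random() < EPSILON:       # exploration: pick a random action
                       action = random.choice(actions)
                   else:                               # exploitation: best known action
                       action = max(actions, key=lambda a: q[(error, a)])
                   candidate = apply_action(model, error, action)
                   remaining = diagnose(candidate)
                   reward = 1.0 if len(remaining) < len(diagnose(model)) else -1.0
                   best_next = max((q[(e, a)] for e in remaining for a in actions),
                                   default=0.0)
                   q[(error, action)] += ALPHA * (reward + GAMMA * best_next
                                                  - q[(error, action)])
                   if reward > 0:                      # the edit removed an error: keep it
                       model = candidate
                       break
       return model, q

   Because the table is keyed by (error kind, action) pairs, whatever is learned on one model carries over to the same error kinds in later models, which is what the evaluation in Section 2.4 exploits.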
   Once the algorithm has improved enough, the tool can enter the running phase as a trained package pluggable into EMF. As a plugin it will be able to analyze and repair models directly in EMF without any supervision required. Still, it would be possible to open its behavior to human interaction, allowing users to give feedback to the tool about the decisions made so that it can learn from human choices and personalize its performance.

2.3   Example
In this section, we introduce a running example in AMRT and demonstrate how the algorithm repairs all errors in a sample model from scratch. For this example we have assumed the available actions to perform are:
   (1) Action 0 - Rename class
   (2) Action 1 - Delete super type
   (3) Action 2 - Rename attribute
   (4) Action 3 - Add type to attribute

Algorithm 1 AMRT Algorithm
   Initialize: Q-matrix, Gamma
   INPUT: from EMF (Model, Actions, Errors)
   while Episodes not empty do
       for each Error s do
           for Steps in Episode do
               Select Action a            // random or highest Q-value
               Evaluate changes in Model produced by a
               Update Q-table
           if s is solved then
               Delete s
               Exit for
           else
               s' ← s                     // s' stores s modified by a
               Save s' in Errors

   We have preferred to omit delete class/attribute actions, keeping the example as simple as possible, since they could also technically solve errors in the models but would not provide correct solutions (e.g., by deleting the class Web, its attributes would no longer have duplicated names). However, this behavior can be controlled by reducing the rewards of this kind of actions, and they would not be a problem in a real environment.
   The algorithm runs with a maximum of 20 episodes and 3 steps per episode. This number is low, as is the number of (action, error) pairs available, allowing us to gain further insight into how the algorithm interacts with the model (otherwise it would find the solution faster, but its decision process would not be as visible).
   Additionally, the algorithm can sometimes pick random actions instead of the one with the biggest reward, in order to find new and perhaps more optimal solutions. Our random probability for this example is 10%, since it is a very controlled environment and our objective is to show how the algorithm learns by itself.
   Throughout this section errors are identified using numbers:
   (1) Error 0 - Duplicated attribute name.
   (2) Error 1 - Attribute with missing type.
   (3) Error 2 - Class super type of itself.
   (4) Error 3 - Duplicated class name.
   The selected model represents part of a Web structure and consists of three classes related with a reference, a composition and a super type link, as can be seen in Fig. 3.

Figure 3: First sample model

   Each class has an error:
   (1) Web: duplicated attribute name (Error 0).
   (2) Webpage: attribute without type (Error 1).
   (3) Section: the class is super type of itself (Error 2).
   Table 1 details the execution of the algorithm on the sample model step by step. The correct way to read the table is, for each episode, to read the steps from left to right consecutively (if there are two rows of steps in the same episode, the correct way is to read all steps of the first row and then the second). Cells in red indicate that the selected action does not solve the current error, while green implies solving the error. (R) after an action means it was selected randomly.

Table 1: Algorithm execution on sample model

   Episodes | Step 1                  | Step 2                  | Step 3
   1        | Error 0 <- Action 0     | Error 0 <- Action 2 (R) |
            | Error 2 <- Action 0     | Error 2 <- Action 1     |
   2        | Error 1 <- Action 0     | Error 1 <- Action 1     | Error 1 <- Action 2
   3        | Error 1 <- Action 1 (R) | Error 1 <- Action 3     |

   The algorithm starts without any knowledge about the environment; therefore it starts by exploring, for each error, the different actions available. The first attempt is to solve Error 0 using Action 0, without success. The algorithm then updates the Q-table and learns that Action 0 does not solve Error 0. The next action is picked randomly and it solves Error 0; here the randomness component has increased the speed of finding a solution. The next error to solve is Error 2; since it is the first time the algorithm processes this error, it tries the available actions one by one. It finds the solution by applying Action 1. Next the algorithm faces Error 1, again an unknown error. This time we can see how it explores all actions (even trying a random repetition of Action 1) until it finds the solution with the last one, Action 3.
   After a few iterations, the algorithm has been able to fix all errors in the model without any previous training.

2.4   Evaluation
Once the algorithm has learned about the environment, it is interesting to evaluate its flexibility in repairing other models and how its performance improves.
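   As a hedged illustration of how this knowledge transfer could look in code (reusing the q_learn_repair sketch and the hypothetical ACTIONS, diagnose and apply_action helpers shown earlier; the model variable names below are likewise hypothetical), the Q-table learned on one model is simply passed along to the next:

   # Illustrative only: reusing the Q-table learned on earlier models.
   trained_q = None
   for sample in [first_model, second_model, third_model, fourth_model, fifth_model]:
       repaired, trained_q = q_learn_repair(sample, ACTIONS, diagnose, apply_action,
                                            q_table=trained_q)
       assert not diagnose(repaired)   # each sample model ends up free of errors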
   We have applied the AMRT to a set of models created with errors reparable by the available actions stated in Sec. 2.3.
   The set contains four models. For each model we include an enumeration of the present errors, its diagram and a table showing how it is fixed by the algorithm. We present them in the order they are processed by the algorithm. For each model the Q-table is updated and saved for the next one.

   2.4.1   Second model.

Figure 4: Second sample model

   Errors:
   (1) Car: duplicated class name (Error 3).
   (2) Car: attribute without type (Error 1).
   (3) Car': duplicated attribute name (Error 0).

Table 2: Algorithm execution on the second model

   Episodes | Step 1                  | Step 2
   1        | Error 3 <- Action 1 (R) | Error 3 <- Action 0
            | Error 0 <- Action 1     | Error 0 <- Action 2
   2        | Error 1 <- Action 3     |

   With this model, a new error is introduced into the system, Error 3. One can see in Table 2 that the algorithm starts by selecting a wrong random action for it, but is able to fix it with the second action chosen. Next, although Error 0 is already known, the Q-table still needs to be tuned and it finds a solution in step 2. Finally, it is able to find a solution for Error 1 at the first attempt.
   Although it is only the second model processed by the algorithm, the fixing process is already performed faster.

   2.4.2   Third model.

Figure 5: Third sample model

   Errors:
   (1) Boat: duplicated attribute name, twice (Error 0).
   (2) Boat: the class is super type of itself (Error 2).

Table 3: Algorithm execution on the third model

   Episodes | Step 1
   1        | Error 2 <- Action 1
            | Error 1 <- Action 3
   2        | Error 0 <- Action 2

   The algorithm is able to fix this model directly, as can be seen in Table 3, already knowing which action solves each error.

   2.4.3   Fourth model.

Figure 6: Fourth sample model

   Errors:
   (1) House: duplicated attribute name (Error 0).
   (2) House: attribute without type (Error 1).

Table 4: Algorithm execution on the fourth model

   Episodes | Step 1
   1        | Error 0 <- Action 2
   2        | Error 1 <- Action 0 (R)
            | Error 1 <- Action 2

   In the execution shown in Table 4, the algorithm selects one wrong action, but only because it is selected randomly.

   2.4.4   Fifth model.

Figure 7: Fifth sample model

   Errors:
   (1) Machine: duplicated class name (Error 3).
   (2) Machine: duplicated attribute name (Error 0).
   (3) Machine: attribute without type (Error 1).
   (4) Machine: the class is super type of itself (Error 2).
   (5) Machine': duplicated attribute name (Error 0').
   (6) Machine': the class is super type of itself (Error 2').

Table 5: Algorithm execution on the fifth model

   Episodes | Step 1               | Step 2               | Step 3
   1        | Error 3 <- Action 0  | Error 1 <- Action 3  | Error 0 <- Action 2
   2        | Error 2 <- Action 1  | Error 2' <- Action 1 |
   3        | Error 0' <- Action 1 |                      |

   The final model is the biggest of the set, with six errors. Table 5 displays how it is fixed by the algorithm straightforwardly.

   2.4.5   Evaluation results.

   The evaluation shows that AMRT is able to automatically repair a model without any previous training. Once the algorithm gains enough knowledge, it is able to repair new models straightforwardly, as long as they are reparable by the set of known actions.
   Performance improves really quickly, allowing the AMRT to fix models almost directly after only a couple of models have been processed.
   Figure 8 shows how the accuracy percentage evolves through the models processed. The improvement is clear: the percentage increases with each iteration except for the fourth model. However, as shown in Table 4, the only wrong action picked for this model was caused by the random dimension of the algorithm (10% of the actions are selected randomly instead of picking the best possible one in the Q-table), and without it 100% accuracy would have been reached.

Figure 8: Accuracy evolution of the algorithm

3   RELATED WORK
Model repair is a research field whose results can greatly improve how engineers interact with model-driven projects. Thus, it has drawn the interest of many researchers to formulate approaches and build different tools to fix broken models.
   Some research tackles model repair by relating a given model to its past event logs. Fahland et al. [6] study model repair by aligning a given model with its behavior logs. Their model repair technique covers the whole state space even if there is total conformance between models and logs. Our proposed technique avoids processing the totality of the state space at once, allowing a subdivision for each error and its available actions.
   Other efforts focus on rule-based approaches. Nassar et al. [17] present a prototype where EMF models are automatically trimmed and completed, with user intervention in the process. Their tool works as a support guide for modelers. For us, it is also important to provide user interaction, but our aim is to give the prototype the freedom to make its own decisions (although they can be supervised by the user if so desired).
   Alternative contributions take into account the editing history of a model, such as Ohrndorf et al. [18] with a tool that generates repair suggestions. Their method presents the interesting idea of identifying which change in the editing history originated an error. However, the tool again works just as a recommendation system and does not have decision power. Additionally, their approach only works if the model comes together with its edit history, not if the model is introduced with errors from scratch.
   Taentzer et al. [20] present a prototype based on graph transformation theory for change-preserving model repair. In their approach, the authors check operations performed on a model to identify which caused inconsistencies and apply the corresponding consistency-preserving operations, maintaining already performed changes on the model. Their preservation approach is interesting; however, it only works assuming that the latest change of the model is the most significant. This might not work in all environments, therefore lacking adaptability and reusability, for instance in situations where inconsistencies happen due to simultaneous commits from team members.
   Kretschmer et al. introduce in [13] an approach for discovering and validating values for repairing inconsistencies automatically. Values are found by using a validation tree to reduce the state space size. Their approach is similar to ours, but we believe RL can reduce the state space even more than a tree-based search, especially when tuning rewards and tackling errors independently. Also, trees tend to lead to the same solutions again and again due to their exploitation nature (probing a limited region of the search space). In contrast, RL algorithms include both exploitation and exploration (randomly exploring a much larger portion of the search space in the hope of finding other promising solutions that would not normally be selected), allowing them to find new and sometimes more optimal solutions for a given problem.
   Lastly, Kretschmer et al. [12] also propose an automated tool to repair inconsistencies. They propose generator functions to make their tool more generic, and users have to add their own generator functions. We go a step further, using RL techniques that extend the scope of our approach and provide the flexibility to reuse the framework in a wide variety of model repairing cases. Also, we get the available actions directly from EMF, without needing further intervention from the user.

4   FUTURE WORK
In the following section, we discuss the next research steps and how to improve our current results.
   At the moment, the tool simulates being connected with EMF in order to receive models, errors and actions. The next immediate step is to actually implement and test the tool connected with the framework. Additionally, it will be necessary to develop a plugin and test how it works with models directly in the EMF environment.
   In order to reach better results and further validation, we will test the tool on a bigger set of models. These will be obtained by creating a number of EMF models (either manually or with tools like [21]), from a public repository such as [1, 7, 11], or by extracting them from general repositories like GitHub by using data mining techniques. These models will be analyzed and repaired by the tool using the complete set of actions available in EMF. We will study how the tool performs in this more complex scenario and modify it until it reaches an optimal performance. Additionally, we will research different modifications of the algorithm in order to find the best way to approach model repair, as well as testing other RL algorithms [10].
   Finally, we would like to study further to which degree the interaction between users and the algorithm can benefit the final result, which are the best terms to develop said interaction, and to identify the most crucial operations for model developers that could be improved by using ML. We would like to focus especially on how users can boost the algorithm's decidability when facing situations where different actions are able to apply the same fix to the model.

5   CONCLUSIONS
In this paper, we have proposed the use of RL algorithms to fix broken models. Contrary to other ML techniques, RL algorithms do not need as much data, as they do not require an initial dataset, and they are known for working well in finding solutions in unknown environments.
   We present AMRT, a tool that, given a broken model and a set of available editing actions, is able to find mistakes and determine which actions can repair them. This combination of RL and automatic fixing could potentially allow model repair at human level without requiring supervision.
   While the results obtained are still at an early stage, we consider them promising and an indicator of the potential of this research line.

REFERENCES
 [1] Francesco Basciani, Juri Di Rocco, Davide Di Ruscio, Amleto Di Salle, Ludovico Iovino, and Alfonso Pierantonio. 2014. MDEForge: an Extensible Web-Based Modeling Platform. In CloudMDE@MoDELS. 66–75.
 [2] Richard Bellman. 2013. Dynamic programming. Courier Corporation.
 [3] Jordi Cabot, Robert Clarisó, Marco Brambilla, and Sébastien Gérard. 2017. Cognifying Model-Driven Software Engineering. In Federation of International Conferences on Software Technologies: Applications and Foundations. Springer, 154–160.
 [4] Kanika Chandra, Gagan Kapoor, Rashi Kohli, and Archana Gupta. 2016. Improving software quality using machine learning. In Innovation and Challenges in Cyber Security (ICICCS-INBUSH), 2016 International Conference on. IEEE, 115–118.
 [5] Richard Chang, Sriram Sankaranarayanan, Guofei Jiang, and Franjo Ivancic. 2014. Software testing using machine learning. US Patent 8,924,938.
 [6] Dirk Fahland and Wil MP van der Aalst. 2015. Model repair - aligning process models to reality. Information Systems 47 (2015), 220–243.
 [7] Robert B France, James M Bieman, Sai Pradeep Mandalaparty, Betty HC Cheng, and Adam C Jensen. 2012. Repository for model driven development (ReMoDD). In Proceedings of the 34th International Conference on Software Engineering. IEEE Press, 1471–1472.
 [8] Oliver Friedrichs, Alfred Huger, and Adam J O'Donnell. 2015. Method and apparatus for detecting malicious software through contextual convictions, generic signatures and machine learning techniques. US Patent 9,088,601.
 [9] Nicolas Heess, Srinivasan Sriram, Jay Lemmon, Josh Merel, Greg Wayne, Yuval Tassa, Tom Erez, Ziyu Wang, Ali Eslami, Martin Riedmiller, et al. 2017. Emergence of locomotion behaviours in rich environments. arXiv preprint arXiv:1707.02286 (2017).
[10] Leslie Pack Kaelbling, Michael L Littman, and Andrew W Moore. 1996. Reinforcement learning: A survey. Journal of Artificial Intelligence Research 4 (1996), 237–285.
[11] Bilal Karasneh and Michel RV Chaudron. 2013. Online Img2UML Repository: An Online Repository for UML Models. In EESSMOD@MoDELS. 61–66.
[12] Roland Kretschmer, Djamel Eddine Khelladi, Andreas Demuth, Roberto E Lopez-Herrejon, and Alexander Egyed. 2017. From Abstract to Concrete Repairs of Model Inconsistencies: an Automated Approach. In 2017 24th Asia-Pacific Software Engineering Conference (APSEC). IEEE, 456–465.
[13] Roland Kretschmer, Djamel Eddine Khelladi, and Alexander Egyed. 2018. An automated and instant discovery of concrete repairs for model inconsistencies. In Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings. ACM, 298–299.
[14] Ryan Lowe, Yi Wu, Aviv Tamar, Jean Harb, OpenAI Pieter Abbeel, and Igor Mordatch. 2017. Multi-agent actor-critic for mixed cooperative-competitive environments. In Advances in Neural Information Processing Systems. 6382–6393.
[15] Ruchika Malhotra. 2015. A systematic review of machine learning techniques for software fault prediction. Applied Soft Computing 27 (2015), 504–518.
[16] Martin Monperrus. 2018. Automatic software repair: a bibliography. ACM Computing Surveys (CSUR) 51, 1 (2018), 17.
[17] Nebras Nassar, Hendrik Radke, and Thorsten Arendt. 2017. Rule-Based Repair of EMF Models: An Automated Interactive Approach. In International Conference on Theory and Practice of Model Transformations. Springer, 171–181.
[18] Manuel Ohrndorf, Christopher Pietsch, Udo Kelter, and Timo Kehrer. 2018. ReVision: a tool for history-based model repair recommendations. In Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings. ACM, 105–108.
[19] David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, et al. 2017. Mastering the game of go without human knowledge. Nature 550, 7676 (2017), 354.
[20] Gabriele Taentzer, Manuel Ohrndorf, Yngve Lamo, and Adrian Rutle. 2017. Change-preserving model repair. In International Conference on Fundamental Approaches to Software Engineering. Springer, 283–299.
[21] Antonio Vallecillo, Frank Hilken, Loli Burgueño, Martin Gogolla, et al. 2016. Generating Effective Test Suites for Model Transformations Using Classifying Terms. (2016).
[22] Christopher JCH Watkins and Peter Dayan. 1992. Q-learning. Machine Learning 8, 3-4 (1992), 279–292.
[23] Jon Whittle, John Hutchinson, and Mark Rouncefield. 2014. The state of practice in model-driven engineering. IEEE Software 31, 3 (2014), 79–85.