<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Model-Driven Reverse Engineering of Technology-Induced Architecture for Quality Prediction</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yves R. Kirschner</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Karlsruhe Institute of Technology (KIT)</institution>
          ,
          <addr-line>Karlsruhe</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <fpage>13</fpage>
      <lpage>17</lpage>
      <abstract>
        <p>In software engineering, continuous integration techniques allow short test cycles and thus fast feedback. However, when the deployment has to change or the functionality has to be updated, it becomes difficult to assess the impact of such changes on performance. In this context, we present an extensible approach for reverse engineering component-based systems for quality prediction. We build on the principles and techniques of model-driven engineering to analyze a system at the model level and to enable subsequent optimization.</p>
      </abstract>
      <kwd-group>
        <kwd>Analytical models</kwd>
        <kwd>Object oriented modeling</kwd>
        <kwd>Performance evaluation</kwd>
        <kwd>Predictive models</kwd>
        <kwd>Reverse engineering</kwd>
        <kwd>Software architecture</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Component-based software development builds systems by integrating reusable components
that interact with each other and form a clearly defined software architecture. The resulting
advantages, such as improved maintainability and scalability, come with new challenges. As the
number of software components grows, compliance with standards becomes more important for rapid
development and sustainable maintenance. To develop and maintain such a system, a new developer
must first understand the architecture of the component-based system. One way to improve this
understanding is to automatically recover an architecture model that captures the system
properties relevant for developers. Model-driven reverse engineering techniques can be used for
this purpose. Reverse engineering is the process of understanding software and creating a model
suitable for documentation, maintenance or reengineering.</p>
      <p>
        Garcia et al. conclude in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] that clustering of software entities is the almost universally applied
method for automated architecture recovery. In most cases, a graph structure is generated from
dependencies in the source code, and components are then reconstructed by clustering
or pattern matching. In contrast, our proposed approach uses mapping rules on a
structural level to reconstruct architectural models from source code. We assume
that components, their interfaces and their distribution can often be determined explicitly from the
technologies used, such as application frameworks like the Spring framework or API specifications
like JAX-RS. By taking these technologies into account, we expect better reverse engineering results, since
heuristics such as cohesion or coupling are no longer needed to determine components from
source code.
      </p>
      <p>The goal of our proposed approach is to support the creation and maintenance of software
architecture models for quality prediction. For this purpose, we want to capitalize on reusable
descriptions of concepts for the reverse engineering of such models. To achieve
this goal, we formulate the following research questions:
        <list list-type="simple">
          <list-item><p>RQ1 How do the technologies used induce the software architecture?</p></list-item>
          <list-item><p>RQ2 How can knowledge about technologies be implemented as rules for reverse engineering?</p></list-item>
          <list-item><p>RQ3 How can these rules be made composable across the different technologies of a software system?</p></list-item>
        </list>
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Foundations</title>
      <p>
        The application of model-driven software development techniques to reverse engineering
problems is called Model-Driven Reverse Engineering (MDRE). Favre [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] defines MDRE as the creation
of descriptive models from existing systems that have been previously created in some way.
MDRE is about transforming heterogeneous software development artifacts into homogeneous
models. According to Raibulet et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], an MDRE process consists of two steps: (1) converting the
software system to be analyzed into a set of models of its software development artifacts without
losing the necessary information; and (2) using these models to generate the desired output models
through model transformations.
      </p>
      <p>The motivation of Model-Driven Quality Prediction (MDQP) is to predict quality
characteristics of software systems, such as performance, reliability or data protection, early
at design time. MDQP attempts to automate the analysis of the information available in the
model and to provide software developers and architects with the analysis results as early as
possible as a basis for design decisions.</p>
      <p>
        One example of an MDQP approach is the Palladio approach [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. With the
Palladio Component Model (PCM), this approach provides a modeling language for defining
component-based software architectures for quality prediction. The model defines interfaces
that describe a set of services. A component can require or provide interfaces and
describes the resource demand of each provided service. Furthermore, the approach provides
an integrated modeling environment, the so-called Palladio-Bench, based on the Eclipse IDE. By
solving a PCM instance analytically or by simulation, the Palladio approach enables quality
predictions. We use the PCM as the underlying architectural model because it allows
model-driven quality prediction. However, the underlying idea of our approach is not limited
to the PCM. For example, for pure reverse engineering of the software architecture, a
UML-based model could also serve as the resulting model.
      </p>
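      <p>The PCM concepts named above (interfaces as sets of services, components that provide or require interfaces, and a resource demand per provided service) can be illustrated with a minimal Java sketch. It assumes Java 16 or newer. All type names, the <monospace>cpuDemand</monospace> map and the naive time estimate are hypothetical simplifications introduced for illustration only; they are not the PCM metamodel and not the analysis performed by Palladio.</p>
      <preformat preformat-type="code"><![CDATA[
```java
import java.util.List;
import java.util.Map;

// Hypothetical simplification; the real PCM metamodel is considerably richer.
record ServiceInterface(String name, List<String> services) {}

record ArchComponent(String name,
                     List<ServiceInterface> provided,
                     List<ServiceInterface> required,
                     Map<String, Double> cpuDemand) { // service name -> abstract CPU work units

    // True if every provided service carries a resource demand.
    boolean isFullySpecified() {
        return provided.stream()
                .flatMap(i -> i.services().stream())
                .allMatch(cpuDemand::containsKey);
    }

    // Naive estimate (assumption): demand divided by processing rate,
    // ignoring contention and calls to required services.
    double estimatedTime(String service, double workUnitsPerSecond) {
        return cpuDemand.get(service) / workUnitsPerSecond;
    }
}

class PcmSketch {
    static ArchComponent example() {
        ServiceInterface orders =
                new ServiceInterface("IOrders", List.of("placeOrder", "status"));
        ServiceInterface payment =
                new ServiceInterface("IPayment", List.of("charge"));
        return new ArchComponent("OrderComponent",
                List.of(orders), List.of(payment),
                Map.of("placeOrder", 50.0, "status", 5.0));
    }

    public static void main(String[] args) {
        ArchComponent c = example();
        System.out.println(c.isFullySpecified());
        System.out.println(c.estimatedTime("placeOrder", 1000.0));
    }
}
```
]]></preformat>
      <p>The sketch only shows why a reverse-engineered model needs resource demands in addition to structure: without them, no quality prediction can be derived from the component description.</p>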
    </sec>
    <sec id="sec-3">
      <title>3. Proposed Approach</title>
      <p>The idea of our approach is to model domain knowledge about the technologies
used in component-based software development in order to reverse engineer the architecture
from artifacts. This knowledge could describe, for example, how a component is implemented using a
certain framework. Our approach is module-based and comes with a predefined set of modules
for model discovery, understanding and generation, so that it can be easily
extended by project-specific rules or implementations. We integrate the following
steps into the continuous integration of a software system:
        <list list-type="order">
          <list-item><p>Discovery: Parse the existing artifacts and accumulate structure and behavior information about them in EMF-based models.</p></list-item>
          <list-item><p>Understanding: Analyze these models using rules represented by model-to-model transformations. Detected concepts are stored in the decorated model.</p></list-item>
          <list-item><p>Generation: Generate behavioral models based on the source code model and the decorated model.</p></list-item>
        </list>
      </p>
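      <p>The three steps can be sketched as a chain of exchangeable stages. The following Java sketch (assuming Java 16 or newer) is an illustration under stated assumptions, not the implementation of the approach: the record types stand in for the EMF-based models, the artifact format <monospace>ClassName:annotation</monospace> is made up, and the three lambdas are toy placeholders for real discoverer, rule and generator modules.</p>
      <preformat preformat-type="code"><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical, simplified stand-ins for the EMF-based models passed between steps.
record ClassInfo(String name, String annotation) {}
record SourceModel(List<ClassInfo> classes) {}
record DecoratedModel(SourceModel source, List<String> components) {}
record BehaviorModel(List<String> componentBehaviors) {}

// The three steps are exchangeable modules, modeled here as plain functions.
class Pipeline {
    private final Function<List<String>, SourceModel> discovery;
    private final Function<SourceModel, DecoratedModel> understanding;
    private final Function<DecoratedModel, BehaviorModel> generation;

    Pipeline(Function<List<String>, SourceModel> discovery,
             Function<SourceModel, DecoratedModel> understanding,
             Function<DecoratedModel, BehaviorModel> generation) {
        this.discovery = discovery;
        this.understanding = understanding;
        this.generation = generation;
    }

    BehaviorModel run(List<String> artifacts) {
        return discovery.andThen(understanding).andThen(generation).apply(artifacts);
    }
}

class PipelineDemo {
    static BehaviorModel demo() {
        // Discovery: each toy artifact has the made-up form "ClassName:annotation".
        Function<List<String>, SourceModel> discovery = artifacts -> {
            List<ClassInfo> classes = new ArrayList<>();
            for (String a : artifacts) {
                String[] parts = a.split(":", 2);
                classes.add(new ClassInfo(parts[0], parts.length > 1 ? parts[1] : ""));
            }
            return new SourceModel(classes);
        };
        // Understanding: one toy rule marks classes annotated "Service" as components.
        Function<SourceModel, DecoratedModel> understanding = source -> {
            List<String> components = new ArrayList<>();
            for (ClassInfo c : source.classes()) {
                if (c.annotation().equals("Service")) {
                    components.add(c.name());
                }
            }
            return new DecoratedModel(source, components);
        };
        // Generation: emit one placeholder behavior entry per detected component.
        Function<DecoratedModel, BehaviorModel> generation = decorated -> {
            List<String> behaviors = new ArrayList<>();
            for (String c : decorated.components()) {
                behaviors.add(c + ".behavior");
            }
            return new BehaviorModel(behaviors);
        };
        return new Pipeline(discovery, understanding, generation)
                .run(List.of("OrderService:Service", "PriceHelper"));
    }

    public static void main(String[] args) {
        System.out.println(demo().componentBehaviors());
    }
}
```
]]></preformat>
      <p>Because each stage is passed in from outside, a project could replace a single stage, for instance with a project-specific rule set, without touching the others.</p>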
      <sec id="sec-3-1">
        <title>3.1. Model Discovery</title>
        <p>The first step of our reverse engineering approach is to obtain models from existing artifacts
that allow a uniform view of the software system. Artifacts that are written during the
development of a software system, e.g. source code or configuration files such as deployment
descriptors, are taken into account. For these models to provide a uniform view, they
must conform to a given metamodel that expresses the selected artifacts. Such a metamodel can
be specific to a programming language or describe other configuration files. Moreover, these metamodels
are created as Ecore metamodels in order to enable uniform modeling. In model-driven reverse
engineering, the models themselves are constructed by so-called discoverers, which
depend on the associated metamodel. In our approach, discoverers are provided by additional
modules and can support different types of artifacts.</p>
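        <p>How discoverers for different artifact types might plug into a common registry is sketched below in Java (Java 16 or newer assumed). The <monospace>Discoverer</monospace> interface, the <monospace>ModelElement</monospace> record and both implementations are hypothetical; a real discoverer would produce instances of an Ecore metamodel and use a proper parser instead of the deliberately naive string handling shown here.</p>
        <preformat preformat-type="code"><![CDATA[
```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Uniform (hypothetical) model element; a real implementation would use Ecore-based models.
record ModelElement(String kind, String name, Map<String, String> attributes) {}

// A discoverer turns one type of artifact into a model element.
interface Discoverer {
    boolean supports(String fileName);
    ModelElement discover(String fileName, String content);
}

// Reads "key=value" lines, e.g. from a simple deployment configuration.
class PropertiesDiscoverer implements Discoverer {
    public boolean supports(String fileName) {
        return fileName.endsWith(".properties");
    }

    public ModelElement discover(String fileName, String content) {
        Map<String, String> attrs = new LinkedHashMap<>();
        for (String line : content.split("\n")) {
            int eq = line.indexOf('=');
            if (eq > 0) {
                attrs.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
            }
        }
        return new ModelElement("configuration", fileName, attrs);
    }
}

// Extracts only the first declared class name from Java source text (deliberately naive).
class JavaDiscoverer implements Discoverer {
    public boolean supports(String fileName) {
        return fileName.endsWith(".java");
    }

    public ModelElement discover(String fileName, String content) {
        String marker = "class ";
        int start = content.indexOf(marker);
        String name = "";
        if (start >= 0) {
            String rest = content.substring(start + marker.length()).trim();
            name = rest.split("[\\s{]", 2)[0];
        }
        return new ModelElement("class", name, Map.of("file", fileName));
    }
}

// Modules register their discoverers; the first one supporting a file is used.
class DiscovererRegistry {
    private final List<Discoverer> discoverers;

    DiscovererRegistry(List<Discoverer> discoverers) {
        this.discoverers = discoverers;
    }

    Optional<ModelElement> discover(String fileName, String content) {
        for (Discoverer d : discoverers) {
            if (d.supports(fileName)) {
                return Optional.of(d.discover(fileName, content));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        DiscovererRegistry registry = new DiscovererRegistry(
                List.of(new PropertiesDiscoverer(), new JavaDiscoverer()));
        System.out.println(registry.discover("app.properties", "server.port=8080"));
        System.out.println(registry.discover("OrderService.java", "public class OrderService { }"));
    }
}
```
]]></preformat>
        <p>Supporting a further artifact type then amounts to adding one more <monospace>Discoverer</monospace> implementation to the registry; artifacts without a matching discoverer are simply skipped.</p>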
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Model Understanding</title>
        <p>The second step is the main step of our reverse engineering approach. In this step, the previously
generated models are used to realize the desired reverse engineering scenario by analyzing
them with so-called rules. The underlying idea is that rules
capture domain knowledge about technologies. We use this knowledge to
reverse engineer the architecture of a system in which a technology is used. For this purpose,
rules capture how a certain concept is implemented in a technology and which effects this
concept has on the architecture of the system. These rules are expressed as model-to-model
transformations.</p>
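        <p>A rule of this kind, and the composition of rules for several technologies, might look as follows. This is a plain-Java sketch (Java 16 or newer assumed) rather than a model-to-model transformation language. The types <monospace>JavaClass</monospace>, <monospace>Decoration</monospace> and <monospace>Rule</monospace> are hypothetical, and the mappings from the annotation names <monospace>Service</monospace> (as used in Spring) and <monospace>Path</monospace> (as used in JAX-RS) to architectural concepts are assumptions made for illustration.</p>
        <preformat preformat-type="code"><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical, minimal source model: a class with the simple names of its annotations.
record JavaClass(String name, Set<String> annotations) {}

// A detected concept attached to a source element (an entry of the "decorated model").
record Decoration(String element, String concept, String technology) {}

// A rule encodes how one technology expresses an architectural concept.
interface Rule {
    List<Decoration> apply(List<JavaClass> model);

    // Composition: apply this rule, then the other, and merge the results.
    default Rule and(Rule other) {
        return model -> {
            List<Decoration> all = new ArrayList<>(apply(model));
            all.addAll(other.apply(model));
            return all;
        };
    }
}

class Rules {
    // Builds a rule that maps an annotation name to a concept for a given technology.
    static Rule annotated(String annotation, String concept, String technology) {
        return model -> {
            List<Decoration> out = new ArrayList<>();
            for (JavaClass c : model) {
                if (c.annotations().contains(annotation)) {
                    out.add(new Decoration(c.name(), concept, technology));
                }
            }
            return out;
        };
    }

    static List<Decoration> demo() {
        // Assumed mappings, for illustration only.
        Rule spring = annotated("Service", "component", "Spring");
        Rule jaxRs = annotated("Path", "provided REST interface", "JAX-RS");

        List<JavaClass> model = List.of(
                new JavaClass("OrderService", Set.of("Service")),
                new JavaClass("OrderResource", Set.of("Path")),
                new JavaClass("PriceHelper", Set.of()));

        // Rules for two technologies are combined and applied to the same model.
        return spring.and(jaxRs).apply(model);
    }

    public static void main(String[] args) {
        for (Decoration d : demo()) {
            System.out.println(d);
        }
    }
}
```
]]></preformat>
        <p>Keeping each rule independent of the others is one possible way to make rules reusable across projects: a system that uses only one of the technologies applies only the corresponding rule.</p>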
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Model Generation</title>
        <p>In the third step, the behavior of the interface methods of the recognized components is
transformed into a PCM instance. In this step, parts of the Palladio-Bench are used to transform
the behavior given in the Java model into an abstract behavior on the component level. For this
purpose, not only the Java model but also the references to the previously recognized concepts
are passed to the Palladio-Bench. For each architecture model entity, links to the original source classes
are stored. Through this mechanism, any later analysis result on the architecture level can
be traced back to low-level classes and methods.</p>
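        <p>The traceability mechanism can be pictured as a lookup from architecture entities to source classes. The Java sketch below is a minimal illustration; the class <monospace>TraceModel</monospace>, the entity names and the flagged component are all made up and do not reflect how the Palladio-Bench stores such links.</p>
        <preformat preformat-type="code"><![CDATA[
```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical trace store: architecture entity -> source classes it was derived from.
class TraceModel {
    private final Map<String, List<String>> links = new LinkedHashMap<>();

    void link(String architectureEntity, List<String> sourceClasses) {
        links.put(architectureEntity, List.copyOf(sourceClasses));
    }

    // Returns the source classes behind an entity, or an empty list if none are known.
    List<String> sourcesOf(String architectureEntity) {
        return links.getOrDefault(architectureEntity, List.of());
    }
}

class TraceDemo {
    static List<String> demo() {
        TraceModel trace = new TraceModel();
        trace.link("OrderComponent", List.of("shop.OrderService", "shop.OrderRepository"));
        trace.link("BillingComponent", List.of("shop.InvoiceService"));

        // Suppose a later analysis flags this component (made-up result).
        String flagged = "OrderComponent";
        return trace.sourcesOf(flagged);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```
]]></preformat>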
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Expected Results and Evaluation</title>
      <p>The expected main contribution is the design and development of the model-driven reverse
engineering approach described in the previous section, with a focus on the rules
for a diverse set of technologies. We plan to evaluate both the framework and these rules.
Evaluation methods will include industrial case studies as well as reference applications for
reverse engineering. In this evaluation, we will define goals, questions whose answers
indicate whether a goal has been achieved, and metrics that provide measurable answers
to the questions. On this basis, we will compare the quality predictions obtained from our reverse
engineering results with those of related approaches.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Related Work</title>
      <p>
        Tzerpos and Holt present ACDC [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], an algorithm for comprehension-driven clustering that
groups files based on name patterns. Mancoridis et al. present Bunch, a tool that generates
decompositions using similarity measurements and applies hill-climbing algorithms to group
files into clusters based on coupling and cohesion [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Andritsos et al. present LIMBO [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], an
information-theoretic clustering algorithm that, applied to software, uses hierarchical clustering
based on similarities between groups of files to reverse engineer an architecture. Even though
each of these reverse engineering methods follows a different principle, all of them divide
source code entities into mutually exclusive clusters, each based on a dominant principle such
as cohesion and coupling or naming patterns. Müller et al. present jQAssistant, an approach to
create a unified data source for software analysis and visualization [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. jQAssistant offers the
possibility to extract a model of the software architecture by graph querying. However, only
dependencies on the level of methods are considered, so that
fine-grained information, which is needed to create models for quality prediction, cannot be processed.
      </p>
      <p>
        Garzón et al. propose an approach to reverse engineer object-oriented code into Umple, a unified
language for both object-oriented programming and modeling [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. By using an incremental and
rule-based approach, UML class diagrams and state machines can be mixed with the associated
source code. However, these rules cover only the fundamental object-oriented constructs and no
specific technologies, and the resulting UML diagrams are not suitable as a basis for quality prediction. With RASCAL, Klint
et al. created a domain-specific language that integrates source code analysis,
transformation and generation on the language level [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. RASCAL is intended
for code analysis and manipulation, e.g. for refactoring. However, this code transformation does
not support the generation of models that have a different metamodel than the code.
      </p>
      <p>
        In their literature review, Raibulet et al. compare fifteen different model-driven reverse engineering approaches
in depth and find that both the approaches and their areas of application
are diverse [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Among them, MoDisco is the approach most closely related to ours in terms of overall
scope. Bruneliere et al. developed MoDisco [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], a generic and extensible model-driven reverse
engineering approach. It provides support for Java, JEE and XML technologies to generate
model-based views of the architecture. Although it is extensible, it supports neither the direct reuse
of common concepts nor the combination of several technologies.
      </p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>In this paper, we gave an overview of our model-driven approach to improve the reverse
engineering of component architectures. The main idea is to reverse engineer components
with their interfaces and their distribution from existing software development artifacts such
as source code or configuration files, taking the technologies used into account. We expect
improved reverse engineering results, since heuristics such as cohesion or coupling no longer
need to be used to determine components from the source code. Because we use knowledge
about the technologies to detect components, we expect our proposed approach to generate
models that are more in line with the real architecture. We also expect technology-specific rules to provide
a better understanding of the relationships between a technology, its underlying concepts
and the software architecture. Moreover, these technology-specific rules can be reused across projects that use the
same technology.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Garcia</surname>
          </string-name>
          , I. Ivkovic,
          <string-name>
            <given-names>N.</given-names>
            <surname>Medvidovic</surname>
          </string-name>
          ,
          <article-title>A comparative analysis of software architecture recovery techniques</article-title>
          ,
          <source>in: ASE'13</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.-M.</given-names>
            <surname>Favre</surname>
          </string-name>
          ,
          <article-title>Foundations of model (driven) (reverse) engineering</article-title>
          ,
          <source>in: Language Engineering for Model-Driven Software Development</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C.</given-names>
            <surname>Raibulet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. A.</given-names>
            <surname>Fontana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zanoni</surname>
          </string-name>
          ,
          <article-title>Model-driven reverse engineering approaches: A systematic literature review</article-title>
          ,
          <source>IEEE Access 5</source>
          (
          <year>2017</year>
          )
          <fpage>14516</fpage>
          -
          <lpage>14542</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R. H.</given-names>
            <surname>Reussner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Becker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Happe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Heinrich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Koziolek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Koziolek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kramer</surname>
          </string-name>
          ,
          <article-title>Modeling and simulating software architectures: The Palladio approach</article-title>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>V.</given-names>
            <surname>Tzerpos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. C.</given-names>
            <surname>Holt</surname>
          </string-name>
          ,
          <article-title>ACDC: an algorithm for comprehension-driven clustering</article-title>
          ,
          <source>in: WCRE'00</source>
          , IEEE,
          <year>2000</year>
          , pp.
          <fpage>258</fpage>
          -
          <lpage>267</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Mancoridis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mitchell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          , E. Gansner,
          <article-title>Bunch: a clustering tool for the recovery and maintenance of software system structures</article-title>
          ,
          <source>in: IEEE ICSM</source>
          ,
          <year>1999</year>
          , pp.
          <fpage>50</fpage>
          -
          <lpage>59</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>P.</given-names>
            <surname>Andritsos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Tsaparas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. C.</given-names>
            <surname>Sevcik</surname>
          </string-name>
          ,
          <article-title>LIMBO: Scalable clustering of categorical data</article-title>
          ,
          <source>in: EDBT'04</source>
          , Springer,
          <year>2004</year>
          , pp.
          <fpage>123</fpage>
          -
          <lpage>146</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>R.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Mahler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hunger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Nerche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Harrer</surname>
          </string-name>
          ,
          <article-title>Towards an open source stack to create a unified data source for software analysis and visualization</article-title>
          , in: VISSOFT'18, IEEE,
          <year>2018</year>
          , pp.
          <fpage>107</fpage>
          -
          <lpage>111</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Garzón</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. C.</given-names>
            <surname>Lethbridge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. I.</given-names>
            <surname>Aljamaan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Badreddin</surname>
          </string-name>
          ,
          <article-title>Reverse engineering of object-oriented code into Umple using an incremental and rule-based approach</article-title>
          ,
          <source>in: CASCON'14</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>91</fpage>
          -
          <lpage>105</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>P.</given-names>
            <surname>Klint</surname>
          </string-name>
          , T. van der Storm, J. Vinju,
          <article-title>Rascal: A domain specific language for source code analysis and manipulation</article-title>
          ,
          <source>in: ICSME'09</source>
          ,
          <year>2009</year>
          , pp.
          <fpage>168</fpage>
          -
          <lpage>177</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>H.</given-names>
            <surname>Bruneliere</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Cabot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Jouault</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Madiot</surname>
          </string-name>
          ,
          <article-title>Modisco: A generic and extensible framework for model driven reverse engineering</article-title>
          , in: ASE'11, ASE'10, ACM
          ,
          <year>2011</year>
          , pp.
          <fpage>173</fpage>
          -
          <lpage>174</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>