<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Natural Language in Agile Modeling</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dominik Fuchß</string-name>
          <email>dominik.fuchss@kit.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Informal Models, Formal Models, Sketches, Agile Modeling</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>KASTEL - Institute of Information Security and Dependability, Karlsruhe Institute of Technology</institution>
          ,
          <addr-line>Karlsruhe</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <fpage>13</fpage>
      <lpage>17</lpage>
      <abstract>
        <p>Hand-drawn sketches help to quickly grasp structures and facts about software architectures during software development. Therefore, software architects and software developers use them in their everyday work. A major problem with drawings is that, unlike more formal models, their information must be interpreted rather than read out directly. Thus, they are difficult to process automatically. This paper presents an approach to create links between elements in sketches and elements in more formal models such as architectural models. To recognize these elements, we consider natural language as a further source of information. We show various ways in which these links can support architects and developers in their design, for example by reporting inconsistencies directly.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        During agile software development, the project team creates Informal Artifacts (IAs) and Formal Artifacts (FAs).
In our case, FAs are artifacts that are instances of technical meta models and can therefore be processed automatically to extract information from them.
Examples of FAs are code and models or diagrams that have been created in a formal language editor.
Since the UML Diagram Interchange [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] specifies an exchange format for UML diagrams, they
can be seen as formal artifacts. IAs like hand-drawn sketches, transcripts of discussions, or
voice recordings are difficult to process automatically because they have to be interpreted first.
However, the interpretation of IAs is not only desirable but necessary to access their inherent
knowledge. In contrast, FAs can be mapped to their underlying meta model to access
their information directly.
      </p>
      <p>We illustrate the benefits of automatic analysis of IAs with an example in Figure 1. The figure
shows an architect explaining a part of the current system during a developer meeting using
a smart whiteboard. Unfortunately, the architect made a mistake while sketching that would
lead to inconsistencies with a stored formal artifact. This inconsistency could be detected by
automatically analyzing the sketch and comparing its elements with model elements.</p>
      <p>The overall goal of this thesis is to establish links between sketches and FAs like architecture
models by using sketch recognition techniques, natural language processing, and speech
recognition. This enables tool support for architects and developers to directly point out possible
inconsistencies. In addition to this scenario, other applications of such links are feasible, e.g.,
support for drawing diagrams on the basis of existing models.</p>
      <p>This work addresses the following problems: System knowledge from sketches is difficult to
use in an automated way (P1). Furthermore, there is a lack of support for finding inconsistencies
while working with sketches (P2). Therefore, our Research Questions (RQs) at the current stage
are the following:</p>
      <p>RQ1: How can formal artifacts be used to improve the recognition of (informal) sketches?</p>
      <p>RQ2: Does tool support for working with sketches improve the software development process?</p>
      <p>RQ3: How can inconsistencies between formal artifacts and sketches be detected?</p>
      <p>RQ1 concerns the recognition of elements in sketches, in particular whether knowledge of
existing model elements helps to resolve uncertainties and ambiguities during recognition.
RQ2 addresses the planned tool support that shall help developers
and architects during the Software Development Process (SDP). Here, we want to analyze
its influence on the SDP. RQ3 is the most complex so far. Assuming the existence of formal
artifacts (e.g., digital models of the (planned) system), it asks whether
inconsistencies between sketches (IAs) and such FAs are detectable. The problem here is to
differentiate between actual inconsistencies, the absence of certain elements because of simplifications,
and the presence of additional elements because of creation or planning activities.</p>
      <p>The paper is organized as follows: Section 2 discusses related work.
Section 3 presents our approach to working with sketches in SDPs.
Section 4 discusses the planned evaluation steps and expected results. Section 5
concludes this paper and reflects on the thesis.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>This section discusses the related work of the thesis. First, we focus on Modern Sketch Recognition
Systems that are related to the SDP. Afterwards, we analyze research regarding Assistance
Systems for Developers and Architects.</p>
      <p>
        <bold>Modern Sketch Recognition Systems.</bold> Wüest et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] developed FlexiSketch, a framework
that allows end-users to draw models on a tablet or computer. In addition, the system supports
the user by creating an on-the-fly meta model from the elements the user has drawn. The
system is able to use concepts like cardinality for the meta models. The meta model elements
are identified by comparing drawn elements and grouping similar ones. In contrast to our
approach, their focus lies on the creation of models and meta models.
      </p>
      <p>
        Schäfer [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] created a system that is able to detect sketched processes in the business
process domain. To achieve the recognition, they built Arrow R-CNN, a “deep learning detector
for flowchart structure recognition”. Their overall goal is “end-to-end flowchart
recognition”. In contrast to our approach, they focus on a single type of diagram: Business
Process Diagrams / Flowcharts. Furthermore, they consider using a formal model to correct
recognition errors but, in contrast to our approach, do not consider using the information
in informal models to correct the formal model. Besides that, they focus on business
process modeling instead of architectural models. In their recent research [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], they applied their
approach to a finite automata dataset. Since the recognition results are promising, we have to
consider whether their approach can be adapted to architectural models in our research.
      </p>
      <p>This related work supports the idea that automatic processing and recognition of sketches
should be facilitated. At the same time, the connection to existing formal models has yet to
be established.</p>
      <p>
        <bold>Assistance Systems for Developers and Architects.</bold> Samuelsson and Book [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] built a
sketch-based user interface for Integrated Development Environments (IDEs). They state
that “sketching could be a more intuitive way of expressing user intentions than navigating
nested menus”. Their system shall provide an IDE that is controlled by sketches. For example, a user
could mark a method with a circle and draw an arrow pointing to the method’s new location. In
contrast to our work, they focus on sketches as a user interface.
      </p>
      <p>
        OctoUML [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] is a system to bridge the “gap between informal designing […] and formal
design and documentation practices in subsequent development”. To this end, OctoUML can
generate formal models from sketched diagrams. The system can recognize sketched
UML classes and name them by means of voice commands. In contrast to our work, they focus
on the combination of formal and informal models rather than on finding inconsistencies between
sketches and existing models.
      </p>
      <p>In summary, assistance systems focus on user interaction or the creation of diagrams. The
linking of formal artifacts and sketches as informal artifacts still needs to be addressed.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Approach for Linking Sketches and Software Architecture</title>
      <p>
        This section explains the approach that we currently plan for tackling the research questions.
In this paper, we consider sketches, speech, and/or written text as Informal Artifacts (IAs).
Digital architectural models (e.g., the Palladio Component Model (PCM) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]) and code are the formal
ones (FAs).
      </p>
      <p>
        The overview in Figure 2 illustrates that the approach shall analyze information from informal
and formal artifacts to perform specific tasks (e.g., recovery of trace links). Sketch recognition,
as part of the analysis of IAs, is used to extract the structure of diagrams that describe a system
or parts of it and thus to obtain the inherent knowledge of sketches.
For the purpose of recognition, Arrow R-CNN [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] could be used, as it has already shown
good results for the recognition of flowcharts. Speech and natural language processing techniques
can be used here to, e.g., name the recognized elements (cf. Section 2). Imagine architects who
describe which component they are currently sketching. Alternatively, handwriting recognition
approaches can be used to identify the elements’ names. At the current state of the work, no
final decision has been made as to whether speech should be considered as a source of information.
The result of this analysis stage comprises all structural information obtained from diagrams as
well as any information gathered about possible names or element types (e.g., class, interface, …).
      </p>
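      <p>One way in which the names of existing model elements could help to name recognized sketch elements can be illustrated as follows. This is a minimal sketch and not part of the planned tool: the function resolve_label, the example element names, and the similarity cutoff of 0.6 are assumptions made for this illustration. It uses get_close_matches from the difflib module of the Python standard library to map a possibly misrecognized handwritten label to the most similar model element name.</p>
      <preformat>
```python
from difflib import get_close_matches


def resolve_label(recognized, model_names, cutoff=0.6):
    """Return the model element name closest to a recognized label, or None.

    `cutoff` is a hypothetical similarity threshold between 0 and 1.
    """
    # Compare case-insensitively, but return the original spelling.
    lowered = {name.lower(): name for name in model_names}
    hits = get_close_matches(recognized.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None


# Hypothetical element names taken from a formal architecture model.
model_names = ["OrderService", "PaymentGateway", "CustomerDatabase"]

# A label as a handwriting recognizer might return it, with one letter missing.
print(resolve_label("OrderServce", model_names))

# A label without a plausible counterpart in the model stays unresolved.
print(resolve_label("Logger", model_names))
```
      </preformat>
      <p>An unresolved label is not necessarily a recognition error: the element may be new or planned, which is why the sketch returns None instead of forcing a match.</p>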
      <p>
        After the analyses of the formal and informal artifacts, the system can use two knowledge
sources: General Knowledge (GK) and Transient Knowledge (TK). GK refers to knowledge
about the domain or knowledge about architecture in general (e.g., architecture patterns). TK
describes the information obtained from analyses of FAs and IAs, such as PCM models [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] or a
representation of the source code as formal artifacts and sketches as informal artifacts.
      </p>
      <p>
        Using GK and TK, the envisaged tool shall perform specific tasks (e.g., the creation of trace
links). Executing these tasks yields the Target Knowledge, which can be used to present certain
information (e.g., inconsistencies) to users of the approach. Since we first want to focus on
the structure of static models to link sketches and models, we plan to evaluate algorithms for
“inexact pattern matching” [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] in graphs. To this end, we conduct exploratory experiments to
test whether these algorithms can be applied in our use cases.
      </p>
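      <p>The structural linking of sketches and models can be illustrated with a deliberately simple stand-in for inexact pattern matching. The code below is neither one of the algorithms under evaluation nor the planned implementation: the functions link_nodes and compare_edges, the threshold of 0.8, and the example graphs are assumptions made for this illustration. Nodes are linked greedily by name similarity, and the edges of the sketch are then translated through these links and compared with the edges of the model.</p>
      <preformat>
```python
from difflib import SequenceMatcher


def link_nodes(sketch_nodes, model_nodes, threshold=0.8):
    """Greedily link each sketch node to its most similar unused model node."""
    links = {}
    used = set()
    for s in sketch_nodes:
        best, best_score = None, threshold
        for m in model_nodes:
            if m in used:
                continue
            score = SequenceMatcher(None, s.lower(), m.lower()).ratio()
            if score >= best_score:
                best, best_score = m, score
        if best is not None:
            links[s] = best
            used.add(best)
    return links


def compare_edges(sketch_edges, model_edges, links):
    """Translate sketch edges via the links and diff them against model edges."""
    translated = {(links[a], links[b]) for a, b in sketch_edges
                  if a in links and b in links}
    # Only model edges between linked nodes are comparable; other model
    # elements may simply have been omitted from the sketch.
    linked = set(links.values())
    relevant = {(a, b) for a, b in model_edges if a in linked and b in linked}
    return {"only_in_sketch": translated - relevant,
            "only_in_model": relevant - translated}


# Hypothetical sketch in which the UI accesses the database directly.
sketch_nodes = ["Web UI", "Order Service", "Database"]
sketch_edges = {("Web UI", "Order Service"), ("Web UI", "Database")}

# Hypothetical formal model of the same system.
model_nodes = ["WebUI", "OrderService", "Database", "PaymentGateway"]
model_edges = {("WebUI", "OrderService"), ("OrderService", "Database"),
               ("OrderService", "PaymentGateway")}

links = link_nodes(sketch_nodes, model_nodes)
result = compare_edges(sketch_edges, model_edges, links)
print(links)
print(result)
```
      </preformat>
      <p>The reported differences are only candidates. Whether an edge that appears only in the sketch is an inconsistency or a planned change, and whether an edge that appears only in the model was left out as a simplification, cannot be decided by this comparison alone.</p>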
    </sec>
    <sec id="sec-4">
      <title>4. Evaluation</title>
      <p>
        This section covers the planned evaluation of our work using a Goal-Question-Metric plan [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>The first goal of this work is the creation of links between sketches and (architectural)
models (G1). In this context, it is important that these links are made available to architects and
developers in a usable manner. Therefore, the key question regarding G1 is how well elements
of sketches can be linked to elements in models (Q1.A: Linking). In addition, we want to evaluate
whether architects and developers consider tool support for working with sketches useful (Q1.B:
Usability). The second goal of the thesis is the automatic detection of inconsistencies between
sketches and models (G2). The important questions are how inconsistencies between
sketches and models can be characterized (Q2.A: Categories) and how many of them are recognized
automatically (Q2.B: Classification). For Q1.A and Q2.B, measures for classification
tasks such as precision, recall, and F<sub>1</sub> score shall be applied to case studies. Regarding Q2.A,
the inconsistency classes that occur could be defined based on the case studies. In order to measure
the final usefulness of a tool for working with sketches and models (Q1.B), a demonstration
with expert interviews shall be used.</p>
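      <p>How the classification measures could be applied to links can be illustrated with a small example. It is a minimal sketch under stated assumptions: the function precision_recall_f1 and the link identifiers are hypothetical, and the gold standard stands for links that would have to be created manually in a case study. Precision is the share of predicted links that are correct, recall is the share of gold standard links that were found, and the F1 score is their harmonic mean.</p>
      <preformat>
```python
def precision_recall_f1(predicted, gold):
    """Compute precision, recall, and F1 score for a set of predicted links."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted.intersection(gold))
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical gold standard: pairs of (sketch element, model element).
gold = {("s1", "m1"), ("s2", "m2"), ("s3", "m3"), ("s4", "m4")}

# Hypothetical tool output: two correct links, one wrong, one link missed.
predicted = {("s1", "m1"), ("s2", "m2"), ("s3", "m9")}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```
      </preformat>
      <p>In this example, two of the three predicted links are correct and two of the four gold standard links are found. The same computation applies to Q2.B if the sets contain detected and actual inconsistencies instead of links.</p>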
      <p>Due to the early state of the work, this is an initial plan for a possible evaluation. We
expect that this type of evaluation allows us to investigate both technical and personal aspects
related to sketching.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>We presented our planned approach to tackle the missing connections between sketches as
informal artifacts and more formal artifacts such as code. In order to clarify our research, we
stated several research questions and goals we want to address. Even though the basic concepts
of the work had to be defined first, important questions and steps regarding the evaluation
could be identified. Although we are still at the beginning of the work, we have demonstrated
the importance of addressing the lack of automated processing of sketches.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <collab>Object Management Group (OMG)</collab>
          ,
          <source>UML Diagram Interchange</source>
          ,
          <year>2006</year>
          . URL: omg.org/spec/UMLDI/1.0/PDF.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.</given-names>
            <surname>Wüest</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Seyff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Glinz</surname>
          </string-name>
          ,
          <article-title>FlexiSketch: a lightweight sketching and metamodeling approach for end-users</article-title>
          ,
          <source>Software &amp; Systems Modeling</source>
          <volume>18</volume>
          (
          <year>2019</year>
          )
          <fpage>1513</fpage>
          -
          <lpage>1541</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>B.</given-names>
            <surname>Schäfer</surname>
          </string-name>
          ,
          <article-title>Business Process Sketch Recognition</article-title>
          ,
          <source>Proceedings of the Dissertation Award, Doctoral Consortium, and Demonstration Track at BPM 2019 - BPM Doctoral Consortium</source>
          (
          <year>2019</year>
          ) 6. URL: ceur-ws.org/Vol-2420/paperDC9.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>B.</given-names>
            <surname>Schäfer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Keuper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Stuckenschmidt</surname>
          </string-name>
          ,
          <article-title>Arrow R-CNN for handwritten diagram recognition</article-title>
          ,
          <source>International Journal on Document Analysis and Recognition (IJDAR)</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S. G.</given-names>
            <surname>Samuelsson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Book</surname>
          </string-name>
          ,
          <article-title>Towards Sketch-based User Interaction with Integrated Software Development Environments</article-title>
          ,
          <source>in: Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops</source>
          , ACM, Seoul Republic of Korea,
          <year>2020</year>
          , pp.
          <fpage>181</fpage>
          -
          <lpage>184</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>B.</given-names>
            <surname>Vesin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jolak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Chaudron</surname>
          </string-name>
          ,
          <article-title>OctoUML: An Environment for Exploratory and Collaborative Software Design</article-title>
          ,
          <source>in: 2017 IEEE/ACM 39th International Conference on Software Engineering Companion (ICSE-C)</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>7</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R.</given-names>
            <surname>Reussner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Becker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Burger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Happe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hauck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Koziolek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Koziolek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Krogmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kuperberg</surname>
          </string-name>
          ,
          <article-title>The Palladio Component Model</article-title>
          ,
          <source>Technical Report, Karlsruhe</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>R.</given-names>
            <surname>Pienta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tamersoy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Tong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. H.</given-names>
            <surname>Chau</surname>
          </string-name>
          ,
          <article-title>MAGE: Matching Approximate Patterns in Richly-Attributed Graphs</article-title>
          ,
          <source>IEEE International Conference on Big Data</source>
          (
          <year>2014</year>
          ) 6
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>V. R.</given-names>
            <surname>Basili</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <article-title>A Methodology for Collecting Valid Software Engineering Data</article-title>
          ,
          <source>IEEE Transactions on Software Engineering</source>
          <volume>SE-10</volume>
          (
          <year>1984</year>
          )
          <fpage>728</fpage>
          -
          <lpage>738</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>