=Paper= {{Paper |id=Vol-2978/mde4sa-paper1 |storemode=property |title=Ensuring threat-model assumptions by using static code analyses |pdfUrl=https://ceur-ws.org/Vol-2978/mde4sa-paper1.pdf |volume=Vol-2978 |authors=Johannes Geismann,Bastian Haverkamp, Eric Bodden |dblpUrl=https://dblp.org/rec/conf/ecsa/GeismannHB21 }} ==Ensuring threat-model assumptions by using static code analyses== https://ceur-ws.org/Vol-2978/mde4sa-paper1.pdf
Ensuring threat-model assumptions by using static code analyses
Johannes Geismann (1), Bastian Haverkamp (1) and Eric Bodden (1,2)
(1) Department of Computer Science, Heinz Nixdorf Institute, Paderborn University, Fürstenallee 11, 33102 Paderborn, Germany
(2) Fraunhofer IEM, Zukunftsmeile 1, 33102 Paderborn, Germany


Abstract
In recent years, the security of information systems has become more and more important. Threat modeling techniques are applied during the design phase of the development, helping to find potential threats as early as possible. However, assumptions made at this development step are often not considered in later steps or are not validated correctly, particularly not during the concrete implementation of the system. To overcome this problem, we present CARDS, a security modeling approach on the architectural level which utilizes code analyses to validate assumptions made during the threat modeling phase. CARDS helps ensure a correct implementation and also allows one to determine which effect code vulnerabilities can have on the overall architecture, as described through models. We implemented CARDS based on the Eclipse Modeling Framework, for Java-based system implementations. We evaluated CARDS based on the CoCoME case study to show its efficacy. The evaluation showed that CARDS can ease the validation of assumptions made during threat modeling and reduce the overall analysis effort.

Keywords
Threat-modeling, Security, Component-based, Static Code Analyses, Security-by-design



2nd International Workshop on Model-Driven Engineering for Software Architecture (MDE4SA 2021), September 13th 2021, Virtual (originally Växjö, Sweden)
Email: johannes.geismann@upb.de (J. Geismann); bastihav@mail.upb.de (B. Haverkamp); eric.bodden@upb.de (E. Bodden)
ORCID: 0000-0003-2015-2047 (J. Geismann); 0000-0002-1189-6290 (B. Haverkamp); 0000-0003-3470-3647 (E. Bodden)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1. Introduction

Security is an essential property when developing modern software-intensive systems. To ensure high security, it is important to consider security not only during the implementation but already when designing the system. Dataflows are of particular interest because confidential data represents an important asset of every information system, and also because attacker-controlled inputs need to be properly filtered before they are used. For this reason, one uses threat modeling approaches to reason about potential threats and corresponding countermeasures in early development steps [1].

Current approaches, however, are limited because of the lack of full traceability from the threat model to the system artifacts. In particular, due to a missing connection between threat model artifacts and the implementation, this implementation often differs from the specifications made during threat modeling [2]. Hence, assumptions made during the design or in the threat model are not correctly implemented or not even implemented at all, which leaves the security state of the actual system unclear.

Static code analyses can help to validate these assumptions and are used to ensure that specific dataflows are prevented. For example, when paying in the supermarket, such an assumption on the implementation of the cash desk could be that customer credit card information is only sent to system parts that have permission to process it. Especially for large-scale systems this becomes a challenge because large code bases have to be analyzed. Additionally, such systems consist of several subsystems that are possibly developed by different parties. Distributed systems, micro-services and "serverless" architectures are just some prominent examples.

Particularly in these areas, model-based approaches are promising for threat modeling and security by design [3]. However, most approaches are either fully model-driven approaches that are quite heavy-weight and usually hardly adaptable, e.g., UMLsec [4] or SEED [5], or light-weight approaches such as STRIDE [1] that only take threat modeling into account but do not consider the connection to the implemented system. To make threat modeling more effective for distributed systems, the following challenges need to be met.

Security requirements are usually defined by several disciplines and, therefore, should be specified on the architectural or system level such that they can be discussed independently from (and in the best case already before) the implementation phase. Countermeasures defined during such a threat modeling phase are usually assumptions made about the implementation. Hence, all assumptions made on the architectural level have to be made explicit in the model and have to be correctly refined into source code [2]. Because this is a tedious and error-prone task, one must validate these assumptions on
[Figure 1: Overview and main process steps of CARDS. Numbered steps in the figure: (1) Create System Architecture, (2) Specify Security-relevant Information, (3) Analyze Architecture for Violations, (4) Generate Source Code Analyses. Further boxes: Specify Restrictions and Assumptions, Implementation, Check System for Compliance.]



the source code level. An additional challenge is that the implemented system is usually not completely under the control of one development team. Static code analyses are a suitable solution to this end because they validate such assumptions on the source code and can be defined for a specific subsystem regardless of who is responsible for the implementation. However, if static code analyses are used, the results are most useful if fed back to the architecture and threat model. Unfortunately, current solutions fall short in this regard.

We see two main concepts as essential here: connection to the source code, and making the requirements and assumptions made during threat modeling explicit. To address these challenges, we have developed CARDS (Component-based Assumptions and Restrictions for Dataflow Specifications), a security modeling approach for dataflows in distributed systems. It provides a new DSL which operates on a generic component model and, therefore, can be adapted for existing component-based approaches. CARDS can be used to specify security requirements for dataflows, as well as assumptions made to fulfill these restrictions on the architectural level. CARDS further illustrates how static code analyses can be used to validate the assumptions on the code level.

In particular, this paper makes the following original contributions:

     • CARDS: a concept and a domain-specific language for the specification of dataflow restrictions and assumptions on the architectural level,
     • an analyzer checking the system for dataflow violations,
     • a concept for generating the corresponding static code analyses, and
     • an implementation of these concepts based on the Eclipse Modeling Framework and Sirius, providing a textual as well as graphical syntax.

This paper is structured as follows: In Section 2, we provide an overview of CARDS, describe our concept for security restrictions and assumptions, and explain our model analyses and the generation of code analyses. In Section 3, we describe the implementation of the prototype and present the evaluation of CARDS in Section 4. Section 5 compares CARDS with related approaches and Section 6 concludes this paper.

The source code for our implementation can be found on https://github.com/secure-software-engineering/cards

2. CARDS: Security Modeling and Validation

Effective threat modeling requires four basic steps: (1) finding security-relevant system parts and functions, (2) finding potential threats with regard to these parts, (3) risk assessment, i.e., prioritizing the threats, and (4) implementing appropriate countermeasures. While threat modeling in general targets all kinds of threats, CARDS focuses on dataflow-specific threats. We designed CARDS in such a way that its concepts can be applied to existing development processes. Figure 1 shows an overview of the main steps.

At first, the system designer creates a component model describing the basic architecture of the system (1). This step is not necessarily part of CARDS since an existing architectural model could also be adapted for the application of CARDS. Based on the component model, security and domain experts specify security-relevant information, e.g., confidential data. Also, security restrictions and security assumptions are specified explicitly. Security restrictions describe security requirements for specific data types of the system, e.g., data from the credit card reader are always sanitized before being sent to other components of the system. Dataflow-specific security requirements for the system can be refined to security restrictions. A security assumption makes an assumption about the implementation explicit, e.g., that confidential data will never be sent to an external entity. In short, a restriction describes global requirements the system should satisfy, whereas an assumption describes what the designer assumes to be implemented for each component. The concepts of both (1) and (2) are described in more detail in Section 2.1.

Next, the system can be analyzed to determine whether all security restrictions are satisfied, assuming that all assumptions will be implemented correctly (3). If a violated restriction is found, the security experts may add additional assumptions to mitigate this security issue and re-apply the analysis until all restrictions are satisfied. The assumptions can be useful for the actual implementation of the system, giving the developers guidelines for the implementation. Concepts for the analyses and potential use-cases in the development are explained in Section 2.2.

Finally, CARDS uses generated static code analyses to validate whether all assumptions are implemented correctly (4). For this, we provide in Section 2.3 a concept for how the assumptions can be mapped to static code analyses automatically. If all generated analyses pass and no violation is found in the source code, the restrictions made to the system can be seen as satisfied on code level, too.

2.1. Specifying Restrictions and Assumptions

In this section, we explain our concepts of restrictions, assumptions, and all concepts required. We developed a DSL for specifying security-relevant information of the system, security restrictions, and security assumptions. Since it is essential to refer to the actual system model, this DSL refers to a component model. For demonstration purposes, we are using a generic component model which is described in Section 2.1.1. However, since we use a generic component model, we see our concepts not restricted to one component model but adaptable to other component models. After that, we describe in Section 2.1.2 how security-relevant information can be formalized. Finally, we explain our concept of restrictions and assumptions in more detail and describe our DSL for this step.

2.1.1. Component Model

For demonstration purposes, we are using a generic component model. We therefore expect that our concepts can be applied to most other component-based system specifications as well. Figure 2 depicts the main parts of the underlying meta-model. A component model consists of a set of components which can be either CompositeComponents or AtomicComponents. Composite components can contain further components by defining ComponentParts, which allows for a hierarchical component model. Atomic components cannot contain further components. Components use Ports for communication with other components. In our component model, we assume communication to be asynchronous. Ports are connected via PortConnectors which are embedded into the parent composite component. For a better overview, we have omitted several parts of the meta-model that are mainly needed for technical reasons. The full meta-model can be found in our provided implementation artifacts.

[Figure 2: Overview of the used generic component model. Class diagram with the elements Port, Component, PortConnector, CompositeComponent, ComponentPart, and AtomicComponent; association labels shown: ports (0..*), from (1), to (1).]

2.1.2. Security-relevant Information

Based on the component model, CARDS utilizes several security-relevant pieces of information that can be specified within our DSL. In the following, we give a short overview of the supported language features and their purpose.

    DataTypes represent the security-relevant data. They are the data assets of the system because they represent the data that should be protected. We only consider data that are relevant for the analyses. DataTypes can have attributes for labels, e.g., to mark a datatype as external user input, a security level, and a type which can be interesting when mapping to the actual source code base. Listing 1 shows an excerpt of the example where three data types are defined (lines 1-5).

    Data Groups are used to combine several DataTypes, e.g., all data describing parts of credit card information. DataGroups are mainly used when defining Restrictions and Assumptions. In Listing 1, the data types CreditCardNumber and CreditCardPin are grouped (cf. line 11).

    Component Groups are used similarly to combine components that have something in common, e.g., (un)trusted components.

    Component Kinds can be used to categorize components, e.g., to mark components as external entities, datastores, or processes (similar to DFD threat modeling) [1].

    Data Sources describe which components are the sources for a specific DataType. In Listing 1, the component CardReader is marked as source for the types CreditCardPin and CreditCardNumber.

    Sanitizers are used to modify data, making them secure for further use, e.g., by escaping bad characters. At this stage, a sanitizer exists only on the conceptual level and can be used in the security assumptions (cf. Section 2.1.3). In the example, a sanitizer is defined that should sanitize all confidential credit card information, e.g., by replacing it with asterisks.

    Security Level can be used to assign a specific level of security or trust to components.

Listing 1: Example code of a CARDS specification.
    1    dataTypes {
    2     DataType BarCode { },
    3     DataType CreditCardNumber {securityLevel 3 },
    4     DataType CreditCardPin {securityLevel 4 }
    5    }
    6    components {
    7     AtomicComponent CardReader {
    8     ports { INOUTPort cardReaderPort ( )}
    9     sourceOf { CreditCardPin,CreditCardNumber }}
    10   }
    11   Groups {DataGroup CreditCardInfo {CreditCardPin, CreditCardNumber}}
    12   Sanitizer {CCSanitizer}

Listing 2: Example of a restriction using a CARDS specification.
    1   DataFlowRestrictions {
    2   GloballyPREVENT CreditCardInfo {
    3    Comp CreditCardPin , CreditCardNumber allow CardReader , Bank , CashDeskPC}}

Listing 3: Example code of security assumptions using CARDS.
    1   DataFlowAssumptions {
    2     componentAssumptions {
    3       Component CashDesk neverOut CreditCardInfo }
    4     portAssumptions {
    5       Port pcLightDisplay neverOut CreditCardInfo
    6       Port pcCashBoxPort neverOut CreditCardInfo}
    7     sanitizersAssumptions {
    8       Component CashDeskPC sanitizes DataFlow pcCardReaderPort -> pcPrinterPort of CreditCardInfo using CCSanitizer}}
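To make the specification elements above more tangible, the content of Listing 1 can be mirrored in a few plain data structures. The following is only a minimal sketch: the class and field names (`DataType`, `DataGroup`, `AtomicComponent`, `security_level`, `source_of`) and the helper `sources_of` are hypothetical and chosen for illustration. They are not the CARDS meta-model, which is implemented on top of the Eclipse Modeling Framework.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class DataType:
    """A security-relevant data asset (hypothetical mirror of a DataType entry)."""
    name: str
    security_level: Optional[int] = None  # BarCode in Listing 1 declares none


@dataclass(frozen=True)
class DataGroup:
    """A named set of data types, used later in restrictions and assumptions."""
    name: str
    members: frozenset


@dataclass
class AtomicComponent:
    """A component without inner parts; source_of lists the data it originates."""
    name: str
    ports: list = field(default_factory=list)
    source_of: set = field(default_factory=set)


# Values taken from Listing 1.
bar_code = DataType("BarCode")
cc_number = DataType("CreditCardNumber", security_level=3)
cc_pin = DataType("CreditCardPin", security_level=4)

card_reader = AtomicComponent(
    name="CardReader",
    ports=["cardReaderPort"],
    source_of={cc_pin, cc_number},
)
credit_card_info = DataGroup("CreditCardInfo", frozenset({cc_pin, cc_number}))
sanitizers = {"CCSanitizer"}


def sources_of(group, components):
    """Return the names of components that are a source for any group member."""
    return sorted(c.name for c in components if c.source_of & group.members)
```

With these definitions, `sources_of(credit_card_info, [card_reader])` identifies the card reader as the only origin of credit card information, which is the kind of lookup a model analysis needs before it can propagate data types through the architecture.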


     2.1.3. Dataflow Restrictions and Assumptions                      restrictions could not be validated. The security engineer
                                                                       can therefore specify assumptions of the implemented be-
     In the following, we describe our concepts for security           havior which must be met to achieve the restriction. We
     restrictions and corresponding assumptions and how                next explain how to specify such assumptions in cards.
     cards supports the security engineer specifying these.
     Essentially, restrictions formally describe security re-          Specifying Assumptions An assumption describes a
     quirements regarding the dataflow within the system.              required behavior of a component. cards provides dif-
     Assumptions are used to describe countermeasures that             ferent kinds of assumptions. At first, we distinguish
     are assumed to be in place in the source code.                    between two major kinds of assumptions: neverOut-
                                                                       assumptions and sanitzer-assumptions. A neverOut-as-
     Specifying Restrictions Restrictions are used to for-             sumption specifies that a context element will never leak
     mally describe security requirements for the data types           the given data type, e.g., that a component will never
     specified as assets. In essence, the security engineer has        send private data to another component. A sanitizer-
     to describe a security policy for each data type describing       assumption specifies that a context element will always
     which component is allowed to access the data. Basically,         sanitize the data before leaking it using a specific san-
     there are two options: 1. Globally allow all components           itizer, e.g., replacing some digits with asterisks when
     to access a data type and define exceptions that are not          sending credit card information to the printer.
     allowed to access the data type (deny-listing approach)              We support three different context elements: compo-
     and 2. globally prevent components from accessing the             nents, ports, and flows within a component. Assumptions
     data type and define exceptions describing components             for component parts are not useful because all parts of a
     that are allowed to access the data type (allow-listing           specific component type will have the same implementa-
     approach).                                                        tion. In the example in Listing 3, we show four different
        Corresponding to this, we distinguish between two              assumptions: 1. an assumption that the (composite) com-
     kinds of restrictions, so-called Allow-Restrictions and           ponent CashDesk will never leak the credit card info
     Prevent-Restrictions. For each datatype, the security en-         (line 3). 2. an assumption that the pcLightDisplay
     gineer has to specify such a restriction. One restriction         port will never leak the credit card info (line 6). 3. an
     may cover more than one data type. Listing 2 shows an             assumption that the pcCashBoxPort port will never
     example of a specified restriction. In particular, we de-         leak the credit card info (line 7). 4. an assumption that
     fine a prevent restriction describing, that the data types        the component CashDeskPc port will always sanitize
     CreditCardPin and CreditCardNumber should only                    dataflows of credit card info from pcCardReaderPort
     be accessed by the components CardReader, Bank, and               to pcPrinterPort, using the sanitizer CCSanitizer (line
     CashDeskPC by combining the prevent restriction and a             10).
     component refinement. Beside component refinements,                  cards provides an analysis to check whether the spec-
     restrictions cards also supports refinements for compo-           ified restriction is satisfied on model level if all assump-
     nent parts and component groups. Without any knowl-               tions are implemented correctly. This analysis is ex-
     edge of the concrete behavior of the components, this
plained in the next section. Section 2.3 describes our restriction if data types of the defined restriction are
concept how the correct implementation of the assump- illegally accessible at a component. Based on the anal-
tions can be validated using static code analyses.           ysis results, we can compare the list of available data
                                                             types and the list of data types specified in the restriction.
2.2. Analysis and Reporting                                  If a violation is found, it is essential to report it to the
                                                             engineers properly. For this, cards provides different
cards provides model-based analyses checking whether report features, e.g., visual feedback in the graphical edi-
all specified restrictions are satisfied and if all security tors, exported HTML and JSON reports, and an export to
assumptions have been implemented correctly. This anal- attack-defense graphs [6].
ysis should be part of the threat modeling activity dur-
ing system design and is also useful to find effects in
                                                             2.3. Using Code Analyses for Validation
the system’s architecture when a problem in the actual
implementation is found. The analysis can help secu- When all violations of dataflow restrictions are elimi-
rity experts to find unintended dataflows and to specify nated by specifying assumptions, these assumptions must
requirements for the implementation of a component also be met through correctly implemented source code.
by creating security assumptions. Besides the analysis, To validate this, we propose to use static code analysis
cards also provides several reporting features to assist (cf. Step 4 in Figure 1). We provide a general concept for
the security experts by exporting the analysis results in creating static code analyses for the given model assump-
useful formats. In this section, we describe how our anal- tions. Since these analyses base on a common structure,
ysis works at first and how the results can be reported it is reasonable to generate them and, thus, automating
afterward.                                                   this step. However, to generate the analysis, some man-
   In cards, we apply a two-step analysis. First, for each ual prerequisites must be met, i.e., a connection between
component, all possible paths through the model are the model and the code base has to be created. In the
determined. Second, for each component and compo- following, we explain how we propose to create such a
nent parts respectively, all data types are determined that connection first and how the analyses can be generated
might reach this component. For the first analysis, we automatically in a second step.
treat the component model as a directed graph where
components are the nodes and port connectors are the edges. Conceptually, the analysis is a
basic depth-first search. The output of the analysis is a mapping from components to all
(longest) paths through the model, i.e., for each component, we store which components it
could directly or indirectly communicate with. In the second analysis, for every component, a
set of available data types is determined, i.e., data types that could possibly be accessed by
this component. Initially, for every component that is a source for data types, the set of
available data types is set to these data types. Next, the analysis recursively propagates
data types through the system. The analysis iterates through the paths and, for each step in
the path, adds all currently available data types to an output set which is again propagated
to the next component in the path. In this step, we evaluate given assumptions of the
component to alter the set of available data. If a sanitizer-assumption is specified for this
component and data type, we add a flag to the data type that it becomes sanitized by this
component. If a neverOut-assumption is specified, the data type is removed from the output
set. The output of this analysis is a mapping of components to pairs of lists of paths and
data types, which are received on these paths. The advantage of this two-step analysis is that
the result does not only show available data for each component but also which path is the
source for a given data type.
   To find violations of restrictions, we check for each

2.3.1. Connection to Source Code
For connecting the (secured) component model to a given code base, we propose to use a
so-called mapping model. This mapping model is used to describe the connections between model
artifacts and parts of the source code. All required mappings are shown in Table 1. All
mappings have to specify the model element, a class, and a method.
   Since creating all mappings by hand is a tedious task, we provide a source code generator
that generates source code skeletons for a given composite component and also creates an
appropriate mapping model containing all required mappings. As proof of concept, we
implemented a generator for Java which is explained in Section 3 in more detail. Supporting
the engineers in creating a mapping model for an existing code base is not in the scope of
this paper, but we see potential in applying semi-automatic approaches such as the one by
Peldszus et al. [7]. However, both the mapping model itself and the generator are conceptually
not restricted to one programming language but can be easily adapted for other programming
languages.

2.3.2. Generating Static Analyses
After creating the mapping model, we use this information to create a suitable static code
analysis. Since we intend to analyze the flow of information, we use a taint analysis to
validate the flow of data through the program.
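To make the mapping model from Section 2.3.1 more tangible, the following Java sketch shows one possible in-memory representation of mapping entries and how candidate source and sink methods for a taint analysis could be looked up from them. It is only an illustration under stated assumptions: the class MappingModelSketch, the enum Kind, the lookup methods, the data type name CreditCardInfo, and all mapped method names are hypothetical and are not taken from the cards implementation, whose mapping model is an EMF model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MappingModelSketch {

    /** Kinds of mappings, following the rows of Table 1 (IN and OUT ports kept separate). */
    public enum Kind { COMPONENT, IN_PORT, OUT_PORT, DATA_SOURCE, SANITIZER }

    /** One mapping entry: a model element mapped to a class and a method (hypothetical layout). */
    public static final class Mapping {
        final Kind kind;
        final String component;    // component that owns the model element
        final String modelElement; // e.g. a port or sanitizer name
        final String dataType;     // handled data type, or null for COMPONENT mappings
        final String className;
        final String methodName;

        Mapping(Kind kind, String component, String modelElement,
                String dataType, String className, String methodName) {
            this.kind = kind;
            this.component = component;
            this.modelElement = modelElement;
            this.dataType = dataType;
            this.className = className;
            this.methodName = methodName;
        }

        String qualifiedMethod() {
            return className + "." + methodName;
        }
    }

    /** Collects the mapped methods of the given kinds for one component and one data type. */
    static List<String> select(List<Mapping> model, String component,
                               String dataType, Kind... kinds) {
        List<Kind> wanted = Arrays.asList(kinds);
        List<String> result = new ArrayList<>();
        for (Mapping m : model) {
            if (wanted.contains(m.kind)
                    && m.component.equals(component)
                    && dataType.equals(m.dataType)) {
                result.add(m.qualifiedMethod());
            }
        }
        return result;
    }

    /** Candidate taint sources: IN-port read methods and data source methods. */
    public static List<String> sourcesFor(List<Mapping> model, String component, String dataType) {
        return select(model, component, dataType, Kind.IN_PORT, Kind.DATA_SOURCE);
    }

    /** Candidate taint sinks: OUT-port write methods. */
    public static List<String> sinksFor(List<Mapping> model, String component, String dataType) {
        return select(model, component, dataType, Kind.OUT_PORT);
    }

    /** Example entries; port and sanitizer names follow the case study, methods are invented. */
    public static List<Mapping> exampleModel() {
        return List.of(
            new Mapping(Kind.COMPONENT, "CashDeskPC", "CashDeskPC", null,
                        "CashDeskPC", "run"),
            new Mapping(Kind.IN_PORT, "CashDeskPC", "pcCardReaderPort", "CreditCardInfo",
                        "CashDeskPC", "readPcCardReaderPort"),
            new Mapping(Kind.OUT_PORT, "CashDeskPC", "pcPrinterPort", "CreditCardInfo",
                        "CashDeskPC", "writePcPrinterPort"),
            new Mapping(Kind.SANITIZER, "CashDeskPC", "CCSanitizer", "CreditCardInfo",
                        "CashDeskPC", "sanitizeCreditCard"));
    }

    public static void main(String[] args) {
        List<Mapping> model = exampleModel();
        System.out.println("sources: " + sourcesFor(model, "CashDeskPC", "CreditCardInfo"));
        System.out.println("sinks:   " + sinksFor(model, "CashDeskPC", "CreditCardInfo"));
    }
}
```

In this sketch, the lookups are restricted to one component and one data type, which mirrors the idea that analyses are scoped to the component an assumption belongs to. How the selected methods are handed to a concrete taint analysis framework is deliberately left out.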
Table 1
Mappings defined in the mapping model.
 Model Element   Description
 Component       In general, a component is mapped to a class. However, this mapping is also
                 used to specify a method that describes the main entry point of the
                 component, e.g., a method that executes the behavior of the component.
 Port            This mapping is used to specify a method for writing to or reading from a
                 component port. We therefore distinguish between IN-port mappings and
                 OUT-port mappings. If an INOUT-port is used, both mappings have to be
                 specified.
 Data Source     This mapping is used to specify a method that returns a specific data type
                 if a component is specified as a source for a data type.
 Sanitizer       This mapping is used to specify a method that executes the sanitization of
                 a data type.

Instead of generating full analyses, we use the information stored in assumptions and the
mapping model to configure taint analyses provided by mature frameworks such as Boomerang
[8, 9].
   Since assumptions are always specified for a specific component, the analyses are
restricted to the corresponding implementation of this component as well. In general, both the
read methods for all IN-ports of the component that receive a specific data type and (if the
component is a source) the source method for the data type are potential sources for the taint
analysis. Similarly, all OUT-ports are potential sinks for the taint analysis. In the
following, we describe how a taint analysis can be specified for each assumption based on our
models.
   We assume that the mapping model is fully specified and, therefore, provides methods for
reading a data type from an IN-port, writing a data type to an OUT-port, sanitizing data types
for each sanitizer, and for executing the component’s behavior. The last method can be used as
an entry point for the code analyses. If it is not specified, all publicly accessible methods
have to be considered as potential entry points, e.g., public methods in Java. Methods for
ports and sanitizers are used to configure the taint analyses. Both the methods for reading
from all IN-ports that are capable of handling the data type to be analyzed and, if the
component is a source for the data type, the source method are considered as sources in the
taint analysis. Correspondingly, methods for OUT-ports are considered as sinks in the taint
analysis. In the case of a flow assumption that explicitly defines a flow from one port to
another, only the methods for these two ports are considered.
   When generating the analyses, we can reduce the search space by considering the information
of the component model. In particular, we only take methods for ports into account that are
capable of handling the data types under investigation. For example, let us assume that the
component of the card reader (cf. Listing 1) is connected to the cash desk. When analyzing the
implementation of the cash desk with respect to the flow of credit card information, it is
sufficient to take the port of this connection as a source for the credit card information.
   After executing the analyses, the result shows whether the assumptions are correctly
implemented in the given implementation. An advantage is that, if the source code of a
component changes, not all analyses have to be re-evaluated but only the analyses that are
relevant for this component. Also, the security engineer can use this information either to
account for the result in the security model, e.g., by adding additional assumptions to other
components, or to contact the developers of the components that do not comply with the
assumptions.

3. Implementation
We implemented a prototype of our DSL and analyses using the Eclipse Modeling Framework (EMF).
We chose to add a textual representation of the DSL using Xtext [10] and implemented a
graphical editor using Sirius [11].
   The source code for our implementation can be found at
https://github.com/secure-software-engineering/cards
   In the following, we briefly describe all parts of our implementation.

Textual and Graphical Editor The graphical editor for cards was implemented using Sirius.
Figure 3 shows an example of the graphical editor. In addition, we provide a textual editor
implemented using the Xtext framework. All changes made to the model in the graphical editor
are also reflected in the underlying Xtext model. Hence, developers can switch at any time to
the representation they prefer. Using the graphical editor, we can easily model systems or
create representations for existing models. The diagram representation can be analyzed using
Sirius’ own tool to verify diagrams, which invokes our analyses using EMF validation; the
results are shown in the model and in the Eclipse Problems view.

Analyses The analysis explained in Section 2.2 is implemented as a basic depth-first search.
We treat the model as a directed graph and recursively propagate the data types for which a
component is a source over outgoing edges. The output of this analysis is a mapping from
components to all paths through the model. The assumption analysis explained in Section 2.2
iterates through the paths and determines the processed data per component. The
output of this analysis is a mapping of components to pairs of lists of paths and data types,
which are received on these paths. To resolve restrictions, we check for each restriction
whether data types of the defined restriction are illegally accessible at a component.

Mapping Model As explained in Section 2.3, we created a mapping model, which maps model parts
to Java code to ease the generation of static code analyses. This mapping is implemented as an
EMF model. Empty mappings for new model parts are automatically added to this model when using
our graphical editor suite. Instead of providing an additional DSL for the mapping model, we
provide a properties view for relevant parts of the model in our graphical editor, where
mappings can be edited.

Generation of Glue Code Using the Xtend framework, we implemented a code generator whose
output can serve as glue code for Java implementations of a given model. Components are
implemented as Java threads and all connections and mappings between component parts are
implemented using the observer pattern. Communication is restricted to strings, but can be
extended to arbitrary objects. Similar to our DSL, composite components handle the
inter-component communication by instantiating connections. Additionally, all assumptions are
added as documentation for the developer using Java annotations. Upon code generation, the
mapping model is also created automatically.

Static Code Analyses Based on the concepts described in Section 2.3.2, we generate the
configuration code for the static code analysis automatically using the Xtend framework. The
generator takes the component model and the mapping model as input. All assumptions can be
validated using taint analysis. Since we are focusing on Java code in our implementation, we
decided to use the established analysis framework Boomerang [8, 9] for the specification and
execution of the taint analyses. We generate the required taint analyses for each assumption.
The generator can be adapted to any other framework that enables the specification and
execution of taint analyses. This also allows one to use different languages for the
implementation of the system’s components.

4. Case Study
We evaluated cards using a case study based on CoCoME [12]. CoCoME is an established example
for component modeling commonly used in research. The example system is a model of a store
which is part of an enterprise. An enterprise consists of a server, a client, and several
stores; each store consists of a server, a client, and several cash desks. A cash desk consists
of a bar code scanner, a card reader, a cash box, a printer, and a light display, all of which
are connected to a cash desk pc, which also connects to a bank. Figure 3 shows the component
model using our graphical editor. For our evaluation, we chose to base our model on CoCoME’s
first proposed use case, the sale. A sale is an interaction between a customer and a cashier.
We model the complete cash desk, a bank, and the store infrastructure. We adapted the data
types provided in the reference implementation of CoCoME [13], as they are not part of the
original definition. We used the case study as a proof of concept of cards itself. For our
example, we defined a restriction that the credit card number and PIN may only be accessed by
the card reader, bank, and cash desk pc. In the real world, the credit card number may be
printed if partly replaced with asterisks, so a sanitization is a sensible approach.

Figure 3: Component diagram based on the CoCoME case study.

   Using the provided models of CoCoME, this restriction is not directly clear, as dataflows
are not part of their modeling. With cards, we can already provide a formal restriction for
this use case. Listing 2 shows the textual representation of this restriction. Upon validating
the model, our analyses provide the developer with feedback that the current model violates
the restriction because the credit card information may be accessed at every component,
including the printer. To address this violation, we chose to define several dataflow
assumptions for our model. Listing 3 shows a representation of the assumptions we made to
resolve the violations. In particular, we assume that the credit card information will never
be leaked to the light display, the cash box, or anything outside the cash desk component.
Additionally, dataflows between pcCardReaderPort and pcPrinterPort of the cash desk pc
component will be sanitized using the CCSanitizer. With these assumptions in place, the
analysis does not show any violations for the restriction. Developers might find major
security flaws in their architecture based on restriction violations, which may lead to
architectural refactorings that resolve the violation. We used cards to generate a Java
project for the cash desk application and implemented the behavior code for the relevant
components based on the documentation of CoCoME. Also, the corresponding mapping model and the
static code analyses were created automatically.
   For the evaluation of the analyses, we created two versions of the implementation: one
version violating the assumptions, which should therefore lead to a report by the analysis,
and one version that respects the dataflow assumptions, e.g., by preventing dataflows or using
the desired sanitizer. The analyses were able to find the incorrect dataflows. However, the
evaluation showed that, in the current implementation, false positives might get reported if
one specifies different policies for data of the same port. To solve this problem, the
developer needs to either adjust the implementation making sure that the data are filtered and
correctly sanitized, or the result is fed back into the
component model where the security engineer can split the dataflows such that the flows are
analyzed separately.
   The evaluation showed one major advantage of the approach. When the source of one component
changes, only the analyses for this component have to be re-evaluated instead of analyzing the
whole source code again. For example, assuming that the implementation of the component
CashDeskPC changes, only the analyses for this component have to be executed. If the
implementation of other components changes, no re-evaluation of the CashDeskPC analyses is
required. Especially for large-scale systems, this compositional approach can help to reduce
the overall time for threat modeling and risk analysis.

5. Related Work
There are two major areas to which cards is related: threat modeling and model-based security
testing. It relates to threat modeling because cards enables threat modeling and analyses
based on the created threat model, and to security testing because cards aims to automate the
validation of the implemented security assumptions.

Threat Modeling For threat modeling, approaches based on dataflow diagrams are often applied
because of their simplicity and technology-agnostic modeling [2]. The most prominent examples
are the STRIDE approach [14] and LINDDUN [15] for privacy-focused threat modeling. cards is
related to these approaches since it also utilizes an architectural description of the system.
However, in contrast, cards focuses on seamless threat modeling by combining threat modeling
and analyses on the actual implementation. Currently, cards does not support finding known
threats automatically, but we plan to implement this in future work.
   Several approaches enhanced the use of data-flow diagrams to improve threat modeling and
risk analysis.

Extended dataflow diagrams Berger et al. present an approach using extended DFDs [16], which
are a more formal version of classical dataflow diagrams. Since these DFDs allow for formal
analyses and hierarchical system specification, they allow for more precise threat modeling.
In contrast, we base our threat modeling approach on established modeling artifacts, enabling
the integration of our concepts into existing approaches. Peldszus et al. [17] provide an
approach that aims at the connection from dataflow diagrams to source code and is therefore
also highly related to our approach. This approach enables more precise threat modeling
because the actual implementation is respected in the threat model. In contrast, cards focuses
on a top-down approach enabling early analyses without a code base.
   Also, model-driven and model-based security has grown into a large research area in recent
years [18]. An overview of approaches in general can be found in the mapping study by Nguyen
et al. [19]. Several approaches integrate security modeling into existing modeling approaches,
e.g., SEED [5] or UMLsec [4]. SEED [5] is an approach that aims at building a bridge between
embedded system experts and security experts. In SEED, security experts can define security
solutions that can be used during the system design and to validate the system based on the
integrated security solutions. In contrast, cards focuses on the definition of assumptions at
design time and the validation on source code level instead of defining concrete security
solutions that are integrated into the system design. UMLsec [4] provides a UML profile with
modeling concepts and analyses for security-relevant system properties. In contrast to UMLsec,
cards focuses on the connection of design-time assumptions and the source code implementation,
leaving out model-driven concepts like concrete behavior modeling.
Model-based Security Testing Following the classifications discussed in a survey by Felderer
et al. [20], two principal approaches to security testing are distinguished in general: testing
to find vulnerabilities and unknown threats in the system, and testing whether the security
mechanisms are implemented correctly [21]. The first category does not fit cards since we are
using threat modeling techniques to define security requirements and threats in the initial
steps, but cards does not contribute to finding new threats or vulnerabilities by itself.
   Following Schieferdecker et al. [22], models that are used for model-based security testing
can be categorized into three major categories: First, Architectural and functional models
which “are concerned with system requirements regarding the general behavior and setup of a
software-based system” [22]. Second, Threat, fault and risk models that “focus on what can go
wrong” [22] and are used to determine potential threats, corresponding risk factors, and their
relationships, e.g., STRIDE [1]. Third, Weakness and vulnerabilities models describing “the
weakness and vulnerabilities itself” [22], e.g., models referring to CVE or CWE but also
catalogs for generating threat lists like in the Microsoft Threat Modeling Tool [1]. cards
provides a combination of the approaches of the first and second category because it utilizes
architectural models for describing a secure system architecture but also concepts and
analyses for reasoning about dataflow threats in the system. In contrast to existing
approaches, cards combines a lightweight threat modeling approach on abstract design models
with concrete analyses on the implemented system and, therefore, enables seamless threat
modeling of a system. Providing vulnerability and attack catalogs or the integration of CVEs
is currently not supported and left for future work.

6. Conclusion
Modern information systems require development techniques that ensure security-by-design.
Dataflows within a system are of especially high interest since data is often a sensitive
asset of the system. The early creation of a threat model but also the seamless integration of
the threat model into all development steps of the system are essential to this end. In this
paper, we have presented cards, a model-based threat modeling approach for dataflows in
distributed systems. We discussed our concepts based on a generic component model. cards
allows one to formally specify security requirements for sensitive data of the system and to
validate these requirements on the architectural level by defining assumptions for the
system’s components that need to be fulfilled in the implementation. For this, we provide a
DSL that allows defining both requirements and assumptions for a component-based system
specification. Using this systematic approach helps designers identify required dataflow rules
for the implementation at early development steps. These rules (assumptions) can be useful in
different ways: On the one hand, when implementing a new system, they can be used as
requirements for the later implementation. On the other hand, they can be used to validate
whether an already implemented system complies with the security assumptions.
   Furthermore, we provide a concept of how these assumptions can be expressed by static code
analyses, which allows validating the assumptions automatically on a given implementation. The
advantage of this modular approach compared to approaches that validate security requirements
is that assumptions are defined component-wise and, therefore, only the code for affected
components has to be analyzed. This is especially important if the source code for only one
component changes and the requirements have to be re-evaluated. Also, connecting a threat
model on the architectural level with concrete analyses on the source code level helps feed
back analysis results into the threat model. This simplifies reasoning about the effects of
the analysis results.
   We provide a prototypical implementation of cards containing a graphical and textual editor
for the component model and our DSL for describing assumptions and restrictions, and we
evaluated our concepts based on a use case of the CoCoME case study. To ease the process of
connecting threat model and code, we provide a generator to Java code that automatically
creates a mapping model describing the connections from model elements to dedicated Java
methods. For existing system implementations, the approach is currently limited in efficacy
because the mapping model that connects the component model used for threat modeling and the
source code has to be created manually. However, we see potential to automate this step in
future work. We also plan to extend the approach by taking the kind and security level of data
types and components into account when analyzing the model. This would enable the security
engineers to apply concepts of DFD-threat modeling (like in STRIDE) on the component model and
to search for required restrictions and corresponding assumptions automatically.
   We see cards as a promising combination of lightweight threat modeling and concrete
security analyses on source code which can help system developers to create more secure
large-scale distributed systems.

References
 [1] A. Shostack, Threat Modeling: Designing for Security, John Wiley and Sons, Indianapolis,
     USA, 2014.
 [2] L. Sion, K. Yskout, D. Van Landuyt, A. van den Berghe, W. Joosen, Security threat
     modeling: Are data flow diagrams enough?, in: IEEE/ACM 42nd
     International Conference on Software Engineering Workshops, IEEE/ACM, 2020.
 [3] P. H. Nguyen, M. Kramer, J. Klein, Y. Le, An extensive systematic review on the
     Model-Driven Development of secure systems, Information and Software Technology 68 (2015)
     62–81. doi:10.1016/j.infsof.2015.08.006.
 [4] J. Jürjens, UMLsec: Extending UML for secure systems development, in: J.-M. Jézéquel,
     H. Hussmann, S. Cook (Eds.), UML 2002 – The Unified Modeling Language, Springer Berlin
     Heidelberg, Berlin, Heidelberg, 2002, pp. 412–425.
 [5] M. Vasilevskaya, L. A. Gunawan, S. Nadjm-Tehrani, P. Herrmann, Integrating security
     mechanisms into embedded systems by domain-specific modelling, Security and Communication
     Networks 7 (2014) 2815–2832.
 [6] B. Kordy, L. Piètre-Cambacédès, P. Schweitzer, DAG-based attack and defense modeling:
     Don’t miss the forest for the attack trees, Computer Science Review 13-14 (2014) 1–38.
     doi:10.1016/j.cosrev.2014.07.001. arXiv:1303.7397v1.
 [7] S. Peldszus, K. Tuma, D. Strüber, J. Jürjens, R. Scandariato, Secure Data-Flow Compliance
     Checks between Models and Code based on Automated Mappings, in: MODELS2019, i, 2019.
 [8] J. Späth, K. Ali, E. Bodden, Context-, flow-, and field-sensitive data-flow analysis using
     synchronized pushdown systems, Proceedings of the ACM SIGPLAN Symposium on Principles of
     Programming Languages 3 (2019) 48:1–48:29. doi:10.1145/3290361.
 [9] J. Späth, L. N. Q. Do, K. Ali, E. Bodden, Boomerang: Demand-driven flow- and
     context-sensitive pointer analysis for Java, in: European Conference on Object-Oriented
     Programming (ECOOP), 2016.
[10] Xtext, https://www.eclipse.org/Xtext/, [Online; accessed Dec-2020].
[11] Sirius, https://www.eclipse.org/sirius/, [Online; accessed Dec-2020].
[12] S. Herold, H. Klus, Y. Welsch, C. Deiters, A. Rausch, R. Reussner, K. Krogmann,
     H. Koziolek, R. Mirandola, B. Hummel, et al., CoCoME - the common component modeling
     example, in: The Common Component Modeling Example, Springer, 2008, pp. 16–53.
[13] CoCoME - the common component modelling example webpage, https://cocome.org, [Online;
     accessed Dec-2020].
[14] A. Shostack, Threat modeling: designing for security, Wiley, Indianapolis, Ind., 2014.
[15] M. Deng, K. Wuyts, R. Scandariato, B. Preneel, W. Joosen, A privacy threat analysis
     framework: supporting the elicitation and fulfillment of privacy requirements,
     Requirements Engineering 16 (2011) 3–32.
[16] B. J. Berger, K. Sohr, R. Koschke, Automatically extracting threats from extended data
     flow diagrams, in: J. Caballero, E. Bodden, E. Athanasopoulos (Eds.), Engineering Secure
     Software and Systems, Springer International Publishing, Cham, 2016, pp. 56–71.
[17] S. Peldszus, D. Strüber, J. Jürjens, Model-based security analysis of feature-oriented
     software product lines, in: Proceedings of the 17th ACM SIGPLAN International Conference
     on Generative Programming: Concepts and Experiences, 2018, pp. 93–106.
[18] A. V. Uzunov, E. B. Fernandez, K. Falkner, Engineering security into distributed systems:
     A survey of methodologies, J. UCS 18 (2012) 2920–3006.
[19] P. H. Nguyen, S. Ali, T. Yue, Model-based security engineering for cyber-physical
     systems: A systematic mapping study, Information and Software Technology 83 (2017)
     116–135.
[20] M. Felderer, P. Zech, R. Breu, M. Büchler, A. Pretschner, Model-based security testing: a
     taxonomy and systematic classification, Software Testing, Verification and Reliability 26
     (2016) 119–148.
[21] G. Tian-yang, S. Yin-Sheng, F. You-yuan, Research on software security testing, World
     Academy of Science, Engineering and Technology 70 (2010) 647–651.
[22] I. Schieferdecker, J. Grossmann, M. A. Schneider, Model-based security testing, in:
     A. K. Petrenko, H. Schlingloff (Eds.), Proceedings 7th Workshop on Model-Based Testing,
     MBT 2012, Tallinn, Estonia, 25 March 2012, volume 80 of EPTCS, ETAPS, Tallinn, Estonia,
     2012, pp. 1–12. doi:10.4204/EPTCS.80.1.