<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>International Conference on Software Engineering- requirements, Requirements Engineering</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1016/j</article-id>
      <title-group>
        <article-title>Ensuring threat-model assumptions by using static code analyses</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Johannes Geismann</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bastian Haverkamp</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eric Bodden</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, Heinz Nixdorf Institute, Paderborn University</institution>
          ,
          <addr-line>Fürstenallee 11, 33102 Paderborn</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Fraunhofer IEM</institution>
          ,
          <addr-line>Zukunftsmeile 1, 33102 Paderborn</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <volume>16</volume>
      <issue>2011</issue>
      <fpage>0000</fpage>
      <lpage>0003</lpage>
      <abstract>
        <p>In the past years, the security of information systems has become more and more important. Threat modeling techniques are applied during the design phase of the development, helping to find potential threats as early as possible. However, assumptions made at this development step are often not considered in later steps or are not validated correctly, particularly not during the concrete implementation of the system. To overcome this problem, we present cards, a security modeling approach on the architectural level which utilizes code analyses to validate assumptions made during the threat modeling phase. cards helps ensure a correct implementation but also allows one to determine which effect code vulnerabilities can have on the overall architecture, as described through models. We implemented cards based on the Eclipse Modeling Framework, for Java-based system implementations. We evaluated cards based on the CoCoME case study to show its efficacy. The evaluation showed that cards can ease the validation of assumptions made during threat modeling and reduce the overall analysis effort.</p>
      </abstract>
      <kwd-group>
        <kwd>Threat-modeling</kwd>
        <kwd>Security</kwd>
        <kwd>Component-based</kwd>
        <kwd>Static Code Analyses</kwd>
        <kwd>Security-by-design</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Security is an essential property when developing modern software-intensive systems. To ensure high security, it is important to consider security not only during the implementation but already when designing the system. Dataflows are of particular interest because confidential data are important assets of every information system, and also because attacker-controlled inputs need to be properly filtered before they are used. For this reason, one uses threat modeling approaches to reason about potential threats and corresponding countermeasures in early development steps [1].</p>
      <p>Current approaches, however, are limited because they lack full traceability from the threat model to the system artifacts. In particular, because threat model artifacts and the implementation are not connected, the implementation often differs from the specifications made during threat modeling [2]. Hence, assumptions made during the design or in the threat model are not correctly implemented or not even implemented at all, which leaves the security state of the actual system unclear. Static code analyses can help to validate these assumptions and are used to ensure that specific dataflows are prevented. For example, when paying in the supermarket, such an assumption on the implementation of the cash desk could be that customer credit card information is only sent to system parts that have permission to process it. Especially for large-scale systems this becomes a challenge because large code bases have to be analyzed. Additionally, such systems consist of several subsystems that are possibly developed by different parties. Distributed systems, micro-services, and “serverless” architectures are just some prominent examples.</p>
      <p>Particularly in these areas, model-based approaches are promising for threat modeling and security by design [3]. However, most approaches are either fully model-driven approaches that are quite heavy-weight and usually hardly adaptable, e.g., UMLsec [4] or SEED [5], or light-weight approaches such as STRIDE [1] that only take threat modeling into account but do not consider the connection to the implemented system. To make threat modeling more effective for distributed systems, the following challenges need to be met.</p>
      <p>Security requirements are usually defined by several disciplines and, therefore, should be specified on the architectural or system level such that they can be discussed independently from, and in the best case already before, the implementation phase. Countermeasures defined during such a threat modeling phase are usually assumptions made about the implementation. Hence, all assumptions made on the architectural level have to be made explicit in the model and have to be correctly refined into source code [2]. Because this is a tedious and error-prone task, one must validate these assumptions on the source code level. An additional challenge is that the implemented system is usually not completely under the control of one development team. Static code analyses are a suitable solution to this end because they validate such assumptions on the source code and can be defined for a specific subsystem regardless of who is responsible for the implementation. However, if static code analyses are used, the results are most useful if fed back to the architecture and threat model. Unfortunately, current solutions fall short in this regard.</p>
      <p>We see two main concepts as essential here: the connection to the source code and making the requirements and assumptions made during threat modeling explicit. To address these challenges, we have developed cards (Component-based Assumptions and Restrictions for Dataflow Specifications), a security modeling approach for dataflows in distributed systems. It provides a new DSL which operates on a generic component model and, therefore, can be adapted for existing component-based approaches. cards can be used to specify security requirements for dataflows, as well as assumptions made to fulfill these restrictions on the architectural level. cards further illustrates how static code analyses can be used to validate the assumptions on the code level.</p>
      <p>In particular, this paper makes the following original contributions:</p>
      <list list-type="bullet">
        <list-item><p>cards: a concept and a domain-specific language for the specification of dataflow restrictions and assumptions on the architectural level,</p></list-item>
        <list-item><p>an analyzer checking the system for dataflow violations,</p></list-item>
        <list-item><p>a concept for generating the corresponding static code analyses, and</p></list-item>
        <list-item><p>an implementation of these concepts based on the Eclipse modeling framework and Sirius, providing a textual as well as graphical syntax.</p></list-item>
      </list>
      <p>This paper is structured as follows: In Section 2, we provide an overview of cards, describe our concept for security restrictions and assumptions, and explain our model analyses and the generation of code analyses. In Section 3, we describe the implementation of the prototype and present the evaluation of cards in Section 4. Section 5 compares cards with related approaches and Section 6 concludes this paper.</p>
      <p>The source code for our implementation can be found on https://github.com/secure-software-engineering/cards.</p>
      <sec id="sec-1-1">
        <title>2. CARDS: Security Modeling and Validation</title>
        <p>Effective threat modeling requires four basic steps: (1) finding security-relevant system parts and functions, (2) finding potential threats with regard to these parts, (3) risk assessment, i.e., prioritizing the threats, and (4) implementing appropriate countermeasures. While threat modeling in general targets all kinds of threats, cards focuses on dataflow-specific threats. We designed cards in such a way that its concepts can be applied to existing development processes. Figure 1 shows an overview of the main steps.</p>
        <p>[Figure 1: Overview of the main steps. Box labels: Create System Architecture (1); Specify Security-relevant Information; Specify Restrictions and Assumptions; Analyze Architecture for Violations; Implementation; Check System for Compliance.]</p>
        <p>At first, the system designer creates a component model describing the basic architecture of the system (1). This step is not necessarily part of cards since an existing architectural model could also be adapted for the application of cards. Based on the component model, security and domain experts specify security-relevant information, e.g., confidential data. Also, security restrictions and security assumptions are specified explicitly. Security restrictions describe security requirements for specific data types of the system, e.g., data from the credit card reader are always sanitized before being sent to other components of the system. Dataflow-specific security requirements for the system can be refined to security restrictions. A security assumption makes an assumption about the implementation explicit, e.g., that confidential data will never be sent to an external entity. Thus, a restriction describes global requirements the system should satisfy, whereas an assumption describes what the designer assumes to be implemented for each component. The concepts of both (1) and (2) are described in more detail in Section 2.1.</p>
        <p>Next, the system can be analyzed to check whether all security restrictions are satisfied, assuming that all assumptions will be implemented correctly (3). If a violated restriction is found, the security experts may add additional assumptions to mitigate this security issue and re-apply the analysis until all restrictions are satisfied. The assumptions can be useful for the actual implementation of the system, giving the developers guidelines for the implementation. Concepts for the analyses and potential use-cases in the development are explained in Section 2.2.</p>
        <p>Finally, cards uses generated static code analyses to validate whether all assumptions are implemented correctly (4). For this, we provide in Section 2.3 a concept for how the assumptions can be mapped to static code analyses automatically. If all generated analyses pass and no violation is found in the source code, the restrictions made to the system can be seen as satisfied on the code level, too.</p>
        <sec id="sec-1-1-1">
          <title>2.1.2. Security-relevant Information</title>
          <p>Based on the component model, cards utilizes several security-relevant pieces of information that can be specified within our DSL. In the following, we give a short overview of the supported language features and their purpose.</p>
          <p>DataTypes represent the security-relevant data. They are the data assets of the system because they represent the data that should be protected. We only consider data that are relevant for the analyses. DataTypes can have attributes for labels, e.g., to mark a data type as external user input, a security level, and a type, which can be of interest when mapping to the actual source code base. Listing 1 shows an excerpt of the example where three data types are defined (lines 1-5).</p>
          <p>Data Groups are used to combine several DataTypes, e.g., all data describing parts of credit card information. DataGroups are mainly used when defining Restrictions and Assumptions. In Listing 1, the data types CreditCardNumber and CreditCardPin are grouped (cf. line 11).</p>
          <p>Component Groups are used similarly to combine components that have something in common, e.g., (un)trusted components.</p>
          <p>Component Kinds can be used to categorize components, e.g., to mark components as external entities, datastores, or processes (similar to DFD threat modeling) [1].</p>
          <p>Data Sources describe which components are the sources for a specific DataType. In Listing 1, the component CardReader is marked as source for the types CreditCardPin and CreditCardNumber.</p>
          <p>Sanitizers are used to modify data, making them secure for further use, e.g., by escaping bad characters. At this stage, a sanitizer exists only on the conceptual level and can be used in the security assumptions (cf. Section 2.1.3). In the example, a sanitizer is defined that should sanitize all confidential credit card information, e.g., by replacing it with asterisks.</p>
          <p>Security Level can be used to assign a specific level of security or trust to components.</p>
        </sec>
      </sec>
      <sec id="sec-1-2">
        <title>2.1. Specifying Restrictions and Assumptions</title>
        <p>In this section, we explain our concepts of restrictions, assumptions, and all concepts required. We developed a DSL for specifying security-relevant information of the system, security restrictions, and security assumptions. Since it is essential to refer to the actual system model, this DSL refers to a component model. For demonstration purposes, we are using a generic component model which is described in Section 2.1.1. However, since we use a generic component model, we see our concepts not restricted to one component model but adaptable to other component models. After that, we describe in Section 2.1.2 how security-relevant information can be formalized. Finally, we explain our concept of restrictions and assumptions in more detail and describe our DSL for this step.</p>
        <sec id="sec-1-2-1">
          <title>2.1.1. Component Model</title>
          <p>For demonstration purposes, we are using a generic component model. We therefore expect that our concepts can be applied to most other component-based system specifications as well. Figure 2 depicts the main parts of the underlying meta-model. A component model consists of a set of components which can be either CompositeComponents or AtomicComponents. Composite components can contain further components by defining ComponentParts, which allows for a hierarchical component model. Atomic components cannot contain further components. Components use Ports for communication with other components. In our component model, we assume communication to be asynchronous. Ports are connected via PortConnectors which are embedded into the parent composite component. For a better overview, we have omitted several parts of the meta-model that are mainly needed for technical reasons. The full meta-model can be found in our provided implementation artifacts.</p>
          <p>[Figure 2: Main parts of the underlying meta-model. Labels: Component, CompositeComponent, AtomicComponent, ports (0..*).]</p>
        </sec>
      </sec>
      <sec id="sec-1-3">
        <title>Listing 1: Example code of a cards-specification.</title>
        <preformat>1 dataTypes {
2   DataType BarCode { },
3   DataType CreditCardNumber {securityLevel 3 },
4   DataType CreditCardPin {securityLevel 4 }
5 }
6 components {
7   AtomicComponent CardReader {
8     ports { INOUTPort cardReaderPort ( )}
9     sourceOf { CreditCardPin,CreditCardNumber }}
10 }
11 Groups {DataGroup CreditCardInfo {CreditCardPin, CreditCardNumber}}
12 Sanitizer {CCSanitizer}</preformat>
      </sec>
      <sec id="sec-1-4">
        <title>Listing 2: Example of a restriction using a cards-specification.</title>
        <preformat>1 DataFlowRestrictions {
2   GloballyPREVENT CreditCardInfo {
3     Comp CreditCardPin , CreditCardNumber allow CardReader , Bank , CashDeskPC}}</preformat>
        <p>Listing 3: Example code of security assumptions using cards.</p>
        <preformat>1 DataFlowAssumptions {
2   componentAssumptions {
3     Component CashDesk neverOut CreditCardInfo }
4   portAssumptions {
5     Port pcLightDisplay neverOut CreditCardInfo
6     Port pcCashBoxPort neverOut CreditCardInfo}
7   sanitizerAssumptions {
8     Component CashDeskPC sanitizes DataFlow pcCardReaderPort-&gt;pcPrinterPort of CreditCardInfo using CCSanitizer}}</preformat>
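        <p>To make the component model and the CardReader specification of Listing 1 more tangible, the following Java sketch mirrors the concepts named in Section 2.1.1 (components, ports, port connectors, data sources). It is a hand-written illustration, not the EMF meta-model of cards: all class, field, and method names are hypothetical, and the connection between cardReaderPort and pcCardReaderPort is an assumption made for the example.</p>
        <preformat><![CDATA[
```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative stand-in for a component of the generic component model. */
abstract class Component {
    final String name;
    final List<String> ports = new ArrayList<>();        // port names
    final Set<String> sourceOf = new LinkedHashSet<>();  // data types this component is a source of

    Component(String name) { this.name = name; }
}

/** Atomic components cannot contain further components. */
final class AtomicComponent extends Component {
    AtomicComponent(String name) { super(name); }
}

/** Composite components contain parts and the connectors between the parts' ports. */
final class CompositeComponent extends Component {
    final List<Component> parts = new ArrayList<>();
    final List<String[]> portConnectors = new ArrayList<>(); // each entry: {fromPort, toPort}

    CompositeComponent(String name) { super(name); }

    void connect(String fromPort, String toPort) {
        portConnectors.add(new String[] { fromPort, toPort });
    }
}

final class ComponentModelSketch {
    /** Builds the CardReader of Listing 1 inside a (hypothetical) CashDesk composite. */
    static CompositeComponent cashDesk() {
        AtomicComponent reader = new AtomicComponent("CardReader");
        reader.ports.add("cardReaderPort");
        reader.sourceOf.add("CreditCardPin");
        reader.sourceOf.add("CreditCardNumber");

        AtomicComponent pc = new AtomicComponent("CashDeskPC");
        pc.ports.add("pcCardReaderPort");

        CompositeComponent desk = new CompositeComponent("CashDesk");
        desk.parts.add(reader);
        desk.parts.add(pc);
        desk.connect("cardReaderPort", "pcCardReaderPort"); // assumed wiring
        return desk;
    }

    public static void main(String[] args) {
        CompositeComponent desk = cashDesk();
        System.out.println(desk.name + " has " + desk.parts.size() + " parts");
    }
}
```
]]></preformat>
        <p>Keeping the connectors in the composite, rather than in the ports, follows the statement that PortConnectors are embedded into the parent composite component.</p>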
        <sec id="sec-1-4-1">
          <title>2.1.3. Dataflow Restrictions and Assumptions</title>
          <p>In the following, we describe our concepts for security restrictions and corresponding assumptions and how cards supports the security engineer in specifying these. Essentially, restrictions formally describe security requirements regarding the dataflow within the system. Assumptions are used to describe countermeasures that are assumed to be in place in the source code.</p>
        </sec>
      </sec>
      <sec id="sec-1-5">
        <title>Specifying Restrictions</title>
        <p>Restrictions are used to formally describe security requirements for the data types specified as assets. In essence, the security engineer has to describe a security policy for each data type describing which component is allowed to access the data. Basically, there are two options: 1. globally allow all components to access a data type and define exceptions that are not allowed to access the data type (deny-listing approach), and 2. globally prevent components from accessing the data type and define exceptions describing components that are allowed to access the data type (allow-listing approach).</p>
        <p>Corresponding to this, we distinguish between two kinds of restrictions, so-called Allow-Restrictions and Prevent-Restrictions. For each data type, the security engineer has to specify such a restriction. One restriction may cover more than one data type. Listing 2 shows an example of a specified restriction. In particular, we define a prevent restriction describing that the data types CreditCardPin and CreditCardNumber should only be accessed by the components CardReader, Bank, and CashDeskPC by combining the prevent restriction and a component refinement. Besides component refinements, cards also supports refinements for component parts and component groups. Without any knowledge of the concrete behavior of the components, these restrictions could not be validated. The security engineer can therefore specify assumptions about the implemented behavior which must be met to achieve the restriction. We next explain how to specify such assumptions in cards.</p>
      </sec>
      <sec id="sec-1-6">
        <title>Specifying Assumptions</title>
        <p>An assumption describes a required behavior of a component. cards provides different kinds of assumptions. At first, we distinguish between two major kinds of assumptions: neverOut-assumptions and sanitizer-assumptions. A neverOut-assumption specifies that a context element will never leak the given data type, e.g., that a component will never send private data to another component. A sanitizer-assumption specifies that a context element will always sanitize the data before leaking it using a specific sanitizer, e.g., replacing some digits with asterisks when sending credit card information to the printer. We support three different context elements: components, ports, and flows within a component. Assumptions for component parts are not useful because all parts of a specific component type will have the same implementation. In the example in Listing 3, we show four different assumptions: 1. an assumption that the (composite) component CashDesk will never leak the credit card info (line 3), 2. an assumption that the pcLightDisplay port will never leak the credit card info (line 5), 3. an assumption that the pcCashBoxPort port will never leak the credit card info (line 6), and 4. an assumption that the component CashDeskPC will always sanitize dataflows of credit card info from pcCardReaderPort to pcPrinterPort, using the sanitizer CCSanitizer (line 8).</p>
        <p>cards provides an analysis to check whether the specified restriction is satisfied on model level if all assumptions are implemented correctly. This analysis is explained below.</p>
        <p>cards provides model-based analyses checking whether all specified restrictions are satisfied if all security assumptions have been implemented correctly. This analysis should be part of the threat modeling activity during system design and is also useful to find effects on the system's architecture when a problem in the actual implementation is found. The analysis can help security experts to find unintended dataflows and to specify requirements for the implementation of a component by creating security assumptions. Besides the analysis, cards also provides several reporting features to assist the security experts by exporting the analysis results in useful formats. In this section, we first describe how our analysis works and then how the results can be reported.</p>
        <p>In cards, we apply a two-step analysis. First, for each component, all possible paths through the model are determined. Second, for each component and component part, respectively, all data types are determined that might reach this component. For the first analysis, we treat the component model as a directed graph where components are the nodes and port connectors are the edges. Conceptually, the analysis is a basic depth-first search. The output of the analysis is a mapping from components to all (longest) paths through the model, i.e., for each component, we store which components it could directly or indirectly communicate with. In the second analysis, for every component, a set of available data types is determined, i.e., data types that could possibly be accessed by this component. In the beginning, the set of available data types of each component that is a source for a data type is set to these data types. Next, the analysis recursively propagates data types through the system. The analysis iterates through the paths and, for each step in the path, adds all currently available data types to an output set which is again propagated to the next component in the path. In this step, we evaluate the given assumptions of the component to alter the set of available data. If a sanitizer-assumption is specified for this component and data type, we add a flag to the data type stating that it becomes sanitized by this component. If a neverOut-assumption is specified, the data type is removed from the output set. The output of this analysis is a mapping of components to pairs of lists of paths and data types, which are received on these paths. The advantage of this two-step analysis is that the result does not only show available data for each component but also which path is the source for a given data type.</p>
        <p>To find violations of restrictions, we check for each restriction whether data types of the restriction are illegally accessible at a component (cf. Section 3).</p>
        <sec>
          <title>2.3. Using Code Analyses for Validation</title>
          <p>When all violations of dataflow restrictions are eliminated by specifying assumptions, these assumptions must also be met through correctly implemented source code. To validate this, we propose to use static code analysis (cf. Step 4 in Figure 1). We provide a general concept for creating static code analyses for the given model assumptions. Since these analyses are based on a common structure, it is reasonable to generate them and, thus, to automate this step. However, to generate the analyses, some manual prerequisites must be met, i.e., a connection between the model and the code base has to be created. In the following, we first explain how we propose to create such a connection and then how the analyses can be generated automatically.</p>
          <sec>
            <title>2.3.1. Connection to Source Code</title>
            <p>For connecting the (secured) component model to a given code base, we propose to use a so-called mapping model. This mapping model is used to describe the connections between model artifacts and parts of the source code. All required mappings are shown in Table 1. All mappings have to specify the model element, a class, and a method. Since creating all mappings by hand is a tedious task, we provide a source code generator that generates source code skeletons for a given composite component and also creates an appropriate mapping model containing all required mappings. As proof of concept, we implemented a generator for Java which is explained in Section 3 in more detail. Supporting the engineers in creating a mapping model for an existing code base is not in the scope of this paper, but we see potential in applying semi-automatic approaches such as the one by Peldszus et al. [7]. However, both the mapping model itself and the generator are conceptually not restricted to one programming language but can be easily adapted for other programming languages.</p>
          </sec>
          <sec>
            <title>2.3.2. Generating Static Analyses</title>
            <p>After creating the mapping model, we use this information to create a suitable static code analysis. Since we intend to analyze the flow of information, we use a taint analysis to validate the flow of data through the program.</p>
          </sec>
        </sec>
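        <p>The way a mapping model can drive a taint analysis may be easier to follow with a small example. The Java sketch below derives the sources and sinks for a neverOut-assumption from port mappings: methods reading IN-ports that can carry the data type (plus an optional source method) become sources, and methods writing OUT-ports become sinks. It is a sketch under stated assumptions: the classes PortMapping and TaintConfig, the method forNeverOut, and the mapped method names such as CashDeskPC.readCardReader are hypothetical, and no API of an actual taint analysis framework is used.</p>
        <preformat><![CDATA[
```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class TaintConfigSketch {
    enum Direction { IN, OUT }

    /** Hypothetical mapping entry: a model port mapped to a method of the implementation. */
    static final class PortMapping {
        final String port;
        final Direction direction;
        final String method;          // e.g. "CashDeskPC.readCardReader" (illustrative name)
        final Set<String> dataTypes;  // data types the port can carry according to the model

        PortMapping(String port, Direction direction, String method, String... dataTypes) {
            this.port = port;
            this.direction = direction;
            this.method = method;
            this.dataTypes = new HashSet<>(Arrays.asList(dataTypes));
        }
    }

    /** Sources and sinks that a taint analysis would be configured with. */
    static final class TaintConfig {
        final List<String> sources = new ArrayList<>();
        final List<String> sinks = new ArrayList<>();
    }

    /**
     * Derives the configuration for "component neverOut dataType".
     * sourceMethod may be null if the component is not a source of the data type.
     */
    static TaintConfig forNeverOut(String dataType, String sourceMethod, List<PortMapping> ports) {
        TaintConfig config = new TaintConfig();
        if (sourceMethod != null) {
            config.sources.add(sourceMethod);
        }
        for (PortMapping mapping : ports) {
            if (mapping.direction == Direction.IN) {
                // Only IN-ports that can carry the data type are relevant sources.
                if (mapping.dataTypes.contains(dataType)) {
                    config.sources.add(mapping.method);
                }
            } else {
                // Every OUT-port is a potential sink.
                config.sinks.add(mapping.method);
            }
        }
        return config;
    }

    public static void main(String[] args) {
        List<PortMapping> ports = Arrays.asList(
            new PortMapping("pcCardReaderPort", Direction.IN, "CashDeskPC.readCardReader", "CreditCardPin"),
            new PortMapping("pcPrinterPort", Direction.OUT, "CashDeskPC.writePrinter", "BarCode"));
        TaintConfig config = forNeverOut("CreditCardPin", null, ports);
        System.out.println("sources=" + config.sources + " sinks=" + config.sinks);
    }
}
```
]]></preformat>
        <p>A flow assumption between two specific ports would restrict the lists further to the methods of exactly these two ports.</p>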
        <sec id="sec-1-6-1">
          <title>Description</title>
          <p>In general, a component is mapped
to a class. However, this mapping is
also used to specify a method that
describes the main entry point of the
component, e.g., a method that
executes the behavior of the component.</p>
          <p>This mapping is used to specify a
method for writing to or reading from
a component port. We therefore
distinguish between IN-port
mappings and OUT-port mappings. If an
INOUT-port is used, both mappings
have to be specified.</p>
          <p>This mapping is used to specify a
method that returns a specific data
type if a component is specified as a
source for a data type.</p>
          <p>This mapping is used to specify a
method that executes the
sanitization of a data type.</p>
          <p>Instead of generating full analyses, we use the information stored in assumptions and the mapping model to configure taint analyses provided by mature frameworks such as Boomerang [8, 9].</p>
          <p>Since assumptions are always specified for a specific component, the analyses are restricted to the corresponding implementation of this component as well. In general, both the read-methods for all IN-ports of the component that receive a specific data type and (if the component is a source) the source-method for the data type are potential sources for the taint analysis. Similarly, all OUT-ports are potential sinks for the taint analysis. In the following, we describe how a taint analysis can be specified for each assumption based on our models.</p>
          <p>We assume that the mapping model is fully specified and, therefore, provides methods for reading a data type from an IN-port, writing a data type to an OUT-port, sanitizing data types for each sanitizer, and for executing the component's behavior. The last method can be used as an entry point for the code analyses. If it is not specified, all publicly accessible methods have to be considered as potential entry points, e.g., public methods in Java. Methods for ports and sanitizers are used to configure the taint analyses. The methods for reading IN-ports that are capable of handling the data type to be analyzed and, if the component is a source for the data type, the corresponding source method are considered as sources in the taint analysis. Correspondingly, methods for OUT-ports are considered as sinks in the taint analysis. In the case of a flow assumption that explicitly defines a flow from one port to another, only methods for these two ports are considered.</p>
          <p>When generating the analyses, we can reduce the search space by considering the information of the component model. In particular, we only take methods for ports into account that are capable of handling the data types under investigation. For example, let us assume that the component of the card reader (cf. Listing 1) is connected to the cash desk. When analyzing the implementation of the cash desk with regard to the flow of credit card information, it is sufficient to take the port of this connection as a source for the credit card information.</p>
          <p>After executing the analyses, the result shows whether the assumptions are correctly implemented in the given implementation. An advantage is that not all analyses have to be re-evaluated if the source code for a component changes but only the analyses that are relevant for this component. Also, if an assumption is violated, the security engineer can either account for this in the security model, e.g., by adding assumptions to other components, or contact the developers of the components that do not comply with the assumptions.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>3. Implementation</title>
      <p>We implemented a prototype of our DSL and analyses using the Eclipse Modeling Framework (EMF). We chose to add a textual representation of the DSL using Xtext [10] and implemented a graphical editor using Sirius [11]. The source code for our implementation can be found on https://github.com/secure-software-engineering/cards. In the following, we briefly describe all parts of our implementation.</p>
      <p>Textual and Graphical Editor. The graphical editor for cards was implemented using Sirius. Figure 3 shows an example of the graphical editor. In addition, we provide a textual editor implemented using the Xtext framework. All changes made to the model in the graphical editor are also reflected in the underlying Xtext model. Hence, developers can switch at any time to the representation they prefer. Using the graphical editor, we can easily model systems or create representations for existing models. The diagram representation can be analyzed using Sirius' own tool to verify diagrams, which invokes our analyses using EMF validation; the results are shown in the model and the Eclipse problems view.</p>
      <p>Analyses. The analysis explained in Section 2.2 is implemented as a basic depth-first search. We treat the model as a directed graph and recursively propagate the data types that a component is a source for over outgoing edges. The output of this analysis is a mapping from components to all paths through the model. The assumption analysis explained in Section 2.2 iterates through the paths and determines the processed data per component. The output of this analysis is a mapping of components to pairs of lists of paths and data types, which are received on these paths. To resolve restrictions, we check for each restriction whether data types of the defined restriction are illegally accessible at a component.</p>
      <p>Mapping Model. As explained in Section 2.3, we created a mapping model, which maps model parts to Java code to ease the generation of static code analyses. This mapping is implemented as an EMF model. Empty mappings for new model parts are automatically added to this model when using our graphical editor suite. Instead of providing an additional DSL for the mapping model, we provide a properties view for relevant parts of the model in our graphical editor, where mappings can be edited.</p>
      <p>reader, a cash box, a printer and a light display, all of which are connected to a cash desk pc, which also connects to a bank. Figure 3 shows the component model using our graphical editor. For our evaluation, we chose to base our model on CoCoME's first proposed use case, the sale. A sale is an interaction between a customer and a cashier. We model the complete cash desk, a bank, and the store infrastructure. We adapted the data types provided in the reference implementation of CoCoME [13], as they are not part of the original definition. We used the case study as a proof of concept of cards itself. For our example, we defined a restriction that the credit card number and pin may only be accessed by the card reader, bank, and cash desk pc. In the real world, the credit card number may be printed if partly replaced with asterisks, so a sanitization is a sensible approach.</p>
      <p>Generation of Glue Code Using the Xtend frame- Using the provided models of CoCoME, this
restricwork, we implemented a code generation, whose output tion is not directly clear, as dataflows are not part of their
can serve as glue code for Java implementations of a given modeling. With cards, we can already provide a formal
model. Components are implemented as Java threads and restriction for this use case. Listing 2 shows the textual
all connections and mappings between component parts representation of this restriction. Upon validating the
are implemented using the observer pattern. Commu- model, our analyses provide the developer with feedback
nication is restricted to strings, but can be extended to that the current model violates the restriction because
arbitrary objects. Similar to our DSL, composite com- the credit card information may be accessed at every
components handle the inter-component communication by ponent, including the printer. To address this violation,
instantiating connections. Additionally, all assumptions we chose to define several dataflow assumptions for our
are added as documentation for the developer using Java model. Listing 3 shows a representation of the
assumpannotations. Upon code generation, the mapping model tions we made to resolve the violations. In particular,
is also created automatically. we assume that the credit card information will never be
leaked to the light display, cash box and anything
outStatic Code Analyses Based on the concepts described side the cash desk component. Additionally, dataflows
in Section 2.3.2, we generate the configuration code for between pcCardReaderPort and pcPrinterPort of
the static code analysis automatically using the Xtend the cash desk pc component will be sanitized using the
framework. The generator takes the component model CCSanitizer. With these assumptions in place, the
analyand the mapping model as input. All assumptions can sis does not show any violations for the restriction.
Debe validated using taint analysis. Since we are focusing velopers might find major security flaws in their
archion Java code in our implementation, we decided to use tecture based on restriction violations, which may lead
the established analysis framework Boomerang [9, 8] for to architectural refactorings that resolve the violation.
the specification and execution of the taint analyses. We We used cards to generated a Java project for the cash
generate the required taint analyses for each assumption. desk application and implemented the behavior code for
The generator can be adapted to any other framework the relevant components based on the documentation of
that enables the specification and execution of taint anal- CoCoME. Also, the corresponding mapping model and
yses. This also allows one to use diferent languages for the static code analyses were created automatically.
the implementation of the system’s components. For the evaluation of the analyses, we created two
versions of the implementation: one version violating the
assumptions which should therefore lead to a report by
4. Case Study the analysis, and one version that respects the dataflow
assumptions, e.g. by preventing dataflows or using the
We evaluated cards using a case study based on CoCoME desired sanitizer. The analyses were able to find the
in[12]. CoCoME is an established example for component correct dataflows. However, it showed that in the current
modeling commonly used in research. The example sys- implementation false positives might get reported if one
tem is a model of a store which is part of an enterprise. An specifies diferent policies for data of the same port. To
enterprise consists of a server, client and several stores, solve this problem, the developer needs to either adjust
each store consists of a server, client and several cash the implementation making sure that the data are filtered
desks. A cash desk consists of a bar code scanner, a card and correctly sanitized, or the result is fed back into the
component model where the security engineer can split the dataflows such that the flows are analyzed
separately.</p>
      <p>The evaluation showed one major advantage of the approach. When the source of one component changes,
only the analyses for this component have to be re-evaluated instead of analyzing the whole source code
again. For example, assuming that the implementation of the component CashDeskPC changes, only the analyses
for this component have to be executed. If the implementation of other components changes, no
re-evaluation of these analyses is required. Especially for large-scale systems, this compositional
approach can help to reduce the overall time for threat modeling and risk analysis.</p>
      <p>5. Related Work</p>
      <p>There are two major areas to which cards is related: threat modeling and model-based security
testing. cards is related to threat modeling because it enables threat modeling and analyses based on the
created threat model. It is related to security testing since it aims to automate the validation of the
implemented security assumptions.</p>
      <sec id="sec-2-1">
        <title>Threat Modeling</title>
        <p>For threat modeling, dataflow-diagram-based approaches are often applied because of their simplicity
and technology-agnostic modeling [2]. The most prominent examples are the STRIDE approach [14] and
LINDDUN [15] for privacy-focused threat modeling. cards is related to these approaches since it also
utilizes an architectural description of the system. In contrast, however, cards focuses on seamless threat
modeling by combining threat modeling and analyses on the actual implementation. Currently, cards does not
support finding known threats automatically, but we plan to implement this in future work.</p>
        <p>Several approaches enhanced the use of dataflow diagrams to improve threat modeling and risk
analysis.</p>
        <p>Extended dataflow diagrams. Berger et al. present an approach using extended DFDs [16], which are a
more formal version of classical dataflow diagrams. Since these DFDs allow for formal analyses and
hierarchical system specification, they allow for more precise threat modeling. In contrast, we base our
threat modeling approach on established modeling artifacts, enabling the integration of our concepts into
existing approaches. Peldszus et al. [17] provide an approach that aims at the connection from dataflow
diagrams to source code and is therefore also highly related to our approach. This approach enables more
precise threat modeling because the actual implementation is respected in the threat model. In contrast,
cards focuses on a top-down approach enabling early analyses without a code base.</p>
        <p>Also, model-driven and model-based security grew to a large research area in the last years [18]. An
overview of approaches in general can be found in the mapping study by Nguyen et al. [19]. Several
approaches integrate security modeling into existing modeling approaches, e.g., SEED [5] or UMLsec [4].
SEED [5] is an approach that aims at building a bridge between embedded system experts and security
experts. In SEED, security experts can define security solutions that can be used during the system design
and to validate the system based on the integrated security solutions. In contrast, cards focuses on the
definition of assumptions at design time and the validation on source code level instead of defining
concrete security solutions that are integrated into the system design. UMLsec [4] provides a UML profile
providing modeling concepts and analyses for security-relevant system properties. In contrast to UMLsec,
cards focuses on the connection of design-time assumptions and the source code implementation, leaving
model-driven concepts like concrete behavior modeling out.</p>
        <p>Model-based Security Testing. Following the classifications discussed in a survey by Felderer et
al. [20], two principal approaches to security testing are distinguished in general: testing to find
vulnerabilities and unknown threats in the system, and testing whether the security mechanisms are
implemented correctly [21]. The first category does not fit cards since we use threat modeling techniques
to define security requirements and threats in the initial steps, but cards does not contribute to finding
new threats or vulnerabilities by itself.</p>
        <p>Following Schieferdecker et al. [22], models that are used for model-based security testing can be
categorized into three major categories: First, Architectural and functional models, which “are concerned
with system requirements regarding the general behavior and setup of a software-based system” [22]. Second,
Threat, fault and risk models that “focus on what can go wrong” [22] and are used to determine potential
threats, corresponding risk factors, and their relationships, e.g., STRIDE [1]. Third, Weakness and
vulnerabilities models describing “the weakness and vulnerabilities itself” [22], e.g., models referring to
CVE or CWE but also catalogs for generating threat lists like in the Microsoft Threat Modeling Tool [1].
cards provides a combination of the approaches of the first and second category because it utilizes
architectural models for describing a secure system architecture but also concepts and analyses for
reasoning about dataflow threats in the system. In contrast to existing approaches, cards combines a
lightweight threat modeling approach on abstract design models with concrete analyses on the implemented
system and, therefore, enables seamless threat modeling of a system. Providing vulnerability and attack
catalogs or the integration of CVEs is currently not supported and left for future work.</p>
        <p>6. Conclusion</p>
        <p>Modern information systems require development techniques that ensure security-by-design. Especially,
dataflows within a system are of high interest since data is often a sensitive asset of the system. The
early creation of a threat model but also the seamless integration of the threat model into all development
steps of the system are essential to this extent. In this paper, we have presented cards, a model-based
threat modeling approach for dataflows in distributed systems. We discussed our concepts based on a generic
component model. cards allows one to formally specify security requirements for sensitive data of the
system and to validate these requirements on the architectural level by defining assumptions for the
system’s components that need to be fulfilled in the implementation. For this, we provide a DSL that allows
defining both requirements and assumptions for a component-based system specification. Using this
systematic approach helps designers identify required dataflow rules for the implementation at early
development steps. These rules (assumptions) can be useful in different ways: On the one hand, when
implementing a new system, they can be used as requirements for the later implementation. On the other
hand, they can be used to validate whether an already implemented system complies with the security
assumptions.</p>
        <p>Furthermore, we provide a concept of how these assumptions can be expressed by static code analyses,
which allows the assumptions to be validated automatically on a given implementation. The advantage of this
modular approach compared to approaches that validate security requirements is that assumptions are defined
component-wise and, therefore, only the code for affected components has to be analyzed. This is especially
important if the source code for only one component changes and the requirements have to be re-evaluated.
Also, connecting a threat model on the architectural level with concrete analyses on the source code level
helps feed back analysis results into the threat model. This simplifies reasoning about the effects of the
analysis results.</p>
        <p>We provide a prototypical implementation of cards containing a graphical and textual editor for the
component model and our DSL for describing assumptions and restrictions, and we evaluated our concepts based
on a use case of the CoCoME case study. To ease the process of connecting threat model and code, we provide
a generator to Java code that automatically creates a mapping model describing the connections from model
elements to dedicated Java methods. For existing system implementations, the approach is currently limited
in efficacy because the mapping model that connects the component model used for threat modeling and the
source code has to be created manually. However, we see potential to automate this step in future work. We
also plan to extend the approach by taking the kind and security level of data types and components into
account when analyzing the model. This would enable the security engineers to apply concepts of DFD-threat
modeling (like in STRIDE) on the component model and to search for required restrictions and corresponding
assumptions automatically.</p>
        <p>We see cards as a promising combination of lightweight threat modeling and concrete security analyses
on source code which can help system developers to create more secure large-scale distributed systems.</p>
        <p>References</p>
        <p>[1] A. Shostack, Threat Modeling: Designing for Security, John Wiley and Sons, Indianapolis, USA,
2014.</p>
        <p>[2] L. Sion, K. Yskout, D. Van Landuyt, A. van den Berghe, W. Joosen, Security threat modeling: Are
data flow diagrams enough?, in: IEEE/ACM 42nd</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>