=Paper=
{{Paper
|id=Vol-2978/mde4sa-paper1
|storemode=property
|title=Ensuring threat-model assumptions by using static code analyses
|pdfUrl=https://ceur-ws.org/Vol-2978/mde4sa-paper1.pdf
|volume=Vol-2978
|authors=Johannes Geismann,Bastian Haverkamp, Eric Bodden
|dblpUrl=https://dblp.org/rec/conf/ecsa/GeismannHB21
}}
==Ensuring threat-model assumptions by using static code analyses==
Johannes Geismann (1), Bastian Haverkamp (1) and Eric Bodden (1, 2)
(1) Department of Computer Science, Heinz Nixdorf Institute, Paderborn University, Fürstenallee 11, 33102 Paderborn, Germany
(2) Fraunhofer IEM, Zukunftsmeile 1, 33102 Paderborn, Germany
Abstract
In the past years, the security of information systems has become more and more important. Threat modeling techniques
are applied during the design phase of the development, helping to find potential threats as early as possible. However,
assumptions made at this development step are often not considered in later steps or are not validated correctly, particularly
not during the concrete implementation of the system. To overcome this problem, we present cards, a security modeling
approach on the architectural level which utilizes code analyses to validate assumptions made during the threat modeling
phase. cards helps ensure a correct implementation but also allows one to determine which effect code vulnerabilities
can have on the overall architecture, as described through models. We implemented cards based on the Eclipse Modeling
Framework, for Java-based system implementations. We evaluated cards based on the CoCoME case study to show its
efficacy. The evaluation showed that cards can ease the validation of assumptions made during threat modeling and reduce
the overall analysis effort.
Keywords
Threat-modeling, Security, Component-based, Static Code Analyses, Security-by-design
2nd International Workshop on Model-Driven Engineering for Software Architecture (MDE4SA 2021), September 13th 2021, Virtual (originally Växjö, Sweden)
Email: johannes.geismann@upb.de (J. Geismann); bastihav@mail.upb.de (B. Haverkamp); eric.bodden@upb.de (E. Bodden)
ORCID: 0000-0003-2015-2047 (J. Geismann); 0000-0002-1189-6290 (B. Haverkamp); 0000-0003-3470-3647 (E. Bodden)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1. Introduction

Security is an essential property when developing modern software-intensive systems. To ensure high security, it is important to consider security not only during the implementation but already when designing the system. Dataflows are of particular interest because confidential data represents an important asset of every information system, and also because attacker-controlled inputs need to be properly filtered before they are used. For this reason, one uses threat modeling approaches to reason about potential threats and corresponding countermeasures in early development steps [1].

Current approaches, however, are limited because of the lack of full traceability from threat model to the system artifacts. In particular, due to a missing connection of threat model artifacts and the implementation, this implementation often differs from the specifications made during threat modeling [2]. Hence, assumptions made during the design or in the threat model are not correctly implemented or not even implemented at all, which leaves the security state of the actual system unclear.

Static code analyses can help to validate these assumptions and are used to ensure that specific dataflows are prevented. For example, when paying in the supermarket, such an assumption on the implementation of the cash desk could be that customer credit card information is only sent to system parts that have permission to process it. Especially for large-scale systems this becomes a challenge because large code bases have to be analyzed. Additionally, such systems consist of several subsystems that are possibly developed by different parties. Distributed systems, micro-services and "serverless" architectures are just some prominent examples.

Particularly in these areas, model-based approaches are promising for threat modeling and security by design [3]. However, most approaches are either fully model-driven approaches that are quite heavy-weight and usually hardly adaptable, e.g., UMLsec [4] or SEED [5], or light-weight approaches such as STRIDE [1] that only take threat modeling into account but do not consider the connection to the implemented system. To make threat modeling more effective for distributed systems, the following challenges need to be met.

Security requirements are usually defined by several disciplines and, therefore, should be specified on the architectural or system level such that they can be discussed independently from (and in the best case already before) the implementation phase. Countermeasures defined during such a threat modeling phase are usually assumptions made about the implementation. Hence, all assumptions made on the architectural level have to be made explicit in the model and have to be correctly refined into source code [2]. Because this is a tedious and error-prone task, one must validate these assumptions on
the source code level. An additional challenge is that the implemented system is usually not completely under the control of one development team. Static code analyses are a suitable solution to this end because they validate such assumptions on the source code and can be defined for a specific subsystem regardless of who is responsible for the implementation. However, if static code analyses are used, the results are most useful if fed back to the architecture and threat model. Unfortunately, current solutions fall short in this regard.

We see two main concepts as essential here: connection to the source code, and making the requirements and assumptions made during threat modeling explicit. To address these challenges, we have developed cards (Component-based Assumptions and Restrictions for Dataflow Specifications), a security modeling approach for dataflows in distributed systems. It provides a new DSL which operates on a generic component model and, therefore, can be adapted for existing component-based approaches. cards can be used to specify security requirements for dataflows, as well as assumptions made to fulfill these restrictions on the architectural level. cards further illustrates how static code analyses can be used to validate the assumptions on the code level.

In particular, this paper makes the following original contributions:

• cards: a concept and a domain-specific language for the specification of dataflow restrictions and assumptions on the architectural level,
• an analyzer checking the system for dataflow violations,
• a concept for generating the corresponding static code analyses, and
• an implementation of these concepts based on the Eclipse Modeling Framework and Sirius, providing a textual as well as graphical syntax.

This paper is structured as follows: In Section 2, we provide an overview of cards, describe our concept for security restrictions and assumptions, and explain our model analyses and the generation of code analyses. In Section 3, we describe the implementation of the prototype and present the evaluation of cards in Section 4. Section 5 compares cards with related approaches and Section 6 concludes this paper.

The source code for our implementation can be found on https://github.com/secure-software-engineering/cards

2. CARDS: Security Modeling and Validation

Effective threat modeling requires four basic steps: (1) finding security-relevant system parts and functions, (2) finding potential threats with regard to these parts, (3) risk assessment, i.e., prioritizing the threats, and (4) implementing appropriate countermeasures. While threat modeling in general targets all kinds of threats, cards focuses on dataflow-specific threats. We designed cards in such a way that its concepts can be applied to existing development processes. Figure 1 shows an overview of the main steps.

Figure 1: Overview and main process steps of cards. Steps shown: (1) Create System Architecture; (2) Specify Security-relevant Information, Specify Restrictions and Assumptions; (3) Analyze Architecture for Violations; (4) Generate Source Code Analyses, Check System Implementation for Compliance.

At first, the system designer creates a component model describing the basic architecture of the system (1). This step is not necessarily part of cards since an existing architectural model could also be adapted for the application of cards. Based on the component model, security and domain experts specify security-relevant information, e.g., confidential data. Also, security restrictions and security assumptions are specified explicitly (2). Security restrictions describe security requirements for specific data types of the system, e.g., data from the credit card reader are always sanitized before being sent to other components of the system. Dataflow-specific security requirements for the system can be refined to security restrictions. A security assumption makes an assumption about the implementation explicit, e.g., that confidential data will never be sent to an external entity. Hence, a restriction describes global requirements the system should satisfy, whereas an assumption describes what the designer assumes to be implemented for each component. The concepts of both (1) and (2) are described in more detail in Section 2.1.

Next, the system can be analyzed to determine whether all security restrictions are satisfied, assuming that all assumptions will be implemented correctly (3). If a violated restriction is found, the security experts may add additional assumptions to mitigate this security issue and re-apply the analysis until all restrictions are satisfied. The assumptions can be useful for the actual implementation of the system, giving the developers guidelines for the implementation. Concepts for the analyses and potential use-cases in the development are explained in Section 2.2.

Finally, cards uses generated static code analyses to validate whether all assumptions are implemented correctly (4). For this, we provide in Section 2.3 a concept for how the assumptions can be mapped to static code analyses automatically. If all generated analyses pass and no violation is found in the source code, the restrictions made to the system can be seen as satisfied on code level, too.

2.1. Specifying Restrictions and Assumptions

In this section, we explain our concepts of restrictions, assumptions, and all concepts required. We developed a DSL for specifying security-relevant information of the system, security restrictions, and security assumptions. Since it is essential to refer to the actual system model, this DSL refers to a component model. For demonstration purposes, we are using a generic component model which is described in Section 2.1.1. However, since we use a generic component model, we see our concepts not restricted to one component model but adaptable to other component models. After that, we describe in Section 2.1.2 how security-relevant information can be formalized. Finally, we explain our concept of restrictions and assumptions in more detail and describe our DSL for this step.

2.1.1. Component Model

For demonstration purposes, we are using a generic component model. We therefore expect that our concepts can be applied to most other component-based system specifications as well. Figure 2 depicts the main parts of the underlying meta-model. A component model consists of a set of components which can be either CompositeComponents or AtomicComponents. Composite components can contain further components by defining ComponentParts, which allows for a hierarchical component model. Atomic components cannot contain further components. Components use Ports for communication with other components. In our component model, we assume communication to be asynchronous. Ports are connected via PortConnectors which are embedded into the parent composite component. For a better overview, we have omitted several parts of the meta-model that are mainly needed for technical reasons. The full meta-model can be found in our provided implementation artifacts.

Figure 2: Overview of the used generic component model. Classes shown: Component (with 0..* ports), Port, PortConnector (with one "from" and one "to" Port), CompositeComponent, ComponentPart, AtomicComponent.

2.1.2. Security-relevant Information

Based on the component model, cards utilizes several security-relevant pieces of information that can be specified within our DSL. In the following, we give a short overview of the supported language features and their purpose.

DataTypes represent the security-relevant data. They are the data assets of the system because they represent the data that should be protected. We only consider data that are relevant for the analyses. DataTypes can have attributes for labels, e.g., to mark a datatype as external user input, a security level, and a type which can be interesting when mapping to the actual source code base. Listing 1 shows an excerpt of the example where three data types are defined (lines 1-5).

Data Groups are used to combine several DataTypes, e.g., all data describing parts of credit card information. DataGroups are mainly used when defining Restrictions and Assumptions. In Listing 1, the data types CreditCardNumber and CreditCardPin are grouped (cf. line 11).

Component Groups are used similarly to combine components that have something in common, e.g., (un)trusted components.

Component Kinds can be used to categorize components, e.g., to mark components as external entities, datastores, or processes (similar to DFD threat modeling) [1].

Data Sources describe which components are the sources for a specific DataType. In Listing 1, the component CardReader is marked as source for the types CreditCardPin and CreditCardNumber.

Sanitizers are used to modify data making them secure for further use, e.g., escaping bad characters. At this stage, a sanitizer is only on conceptual level and can be used in the security assumptions (cf. Section 2.1.3). In the example, a
sanitizer is defined that should sanitize all confidential credit card information, e.g., by replacing it with asterisks.

Security Level can be used to assign a specific level of security or trust to components.

Listing 1: Example code of a cards-specification.
1 dataTypes {
2 DataType BarCode { },
3 DataType CreditCardNumber {securityLevel 3 },
4 DataType CreditCardPin {securityLevel 4 }
5 }
6 components {
7 AtomicComponent CardReader {
8 ports { INOUTPort cardReaderPort ( )}
9 sourceOf { CreditCardPin,CreditCardNumber }}
10 }
11 Groups {DataGroup CreditCardInfo {CreditCardPin, CreditCardNumber}}
12 Sanitizer {CCSanitizer}

2.1.3. Dataflow Restrictions and Assumptions

In the following, we describe our concepts for security restrictions and corresponding assumptions and how cards supports the security engineer in specifying these. Essentially, restrictions formally describe security requirements regarding the dataflow within the system. Assumptions are used to describe countermeasures that are assumed to be in place in the source code.

Specifying Restrictions. Restrictions are used to formally describe security requirements for the data types specified as assets. In essence, the security engineer has to describe a security policy for each data type describing which component is allowed to access the data. Basically, there are two options: 1. globally allow all components to access a data type and define exceptions that are not allowed to access the data type (deny-listing approach), and 2. globally prevent components from accessing the data type and define exceptions describing components that are allowed to access the data type (allow-listing approach).

Corresponding to this, we distinguish between two kinds of restrictions, so-called Allow-Restrictions and Prevent-Restrictions. For each datatype, the security engineer has to specify such a restriction. One restriction may cover more than one data type. Listing 2 shows an example of a specified restriction. In particular, we define a prevent restriction describing that the data types CreditCardPin and CreditCardNumber should only be accessed by the components CardReader, Bank, and CashDeskPC by combining the prevent restriction and a component refinement. Besides component refinements, cards also supports refinements for component parts and component groups. Without any knowledge of the concrete behavior of the components, this restriction could not be validated. The security engineer can therefore specify assumptions of the implemented behavior which must be met to achieve the restriction. We next explain how to specify such assumptions in cards.

Listing 2: Example of a restriction using cards-specification.
1 DataFlowRestrictions {
2 GloballyPREVENT CreditCardInfo {
3 Comp CreditCardPin , CreditCardNumber allow CardReader , Bank , CashDeskPC}}

Specifying Assumptions. An assumption describes a required behavior of a component. cards provides different kinds of assumptions. At first, we distinguish between two major kinds of assumptions: neverOut-assumptions and sanitizer-assumptions. A neverOut-assumption specifies that a context element will never leak the given data type, e.g., that a component will never send private data to another component. A sanitizer-assumption specifies that a context element will always sanitize the data before leaking it using a specific sanitizer, e.g., replacing some digits with asterisks when sending credit card information to the printer.

We support three different context elements: components, ports, and flows within a component. Assumptions for component parts are not useful because all parts of a specific component type will have the same implementation. In the example in Listing 3, we show four different assumptions:
1. an assumption that the (composite) component CashDesk will never leak the credit card info (line 3),
2. an assumption that the pcLightDisplay port will never leak the credit card info (line 5),
3. an assumption that the pcCashBoxPort port will never leak the credit card info (line 6),
4. an assumption that the component CashDeskPC will always sanitize dataflows of credit card info from pcCardReaderPort to pcPrinterPort, using the sanitizer CCSanitizer (line 8).

Listing 3: Example code of security assumptions using cards.
1 DataFlowAssumptions {
2 componentAssumptions {
3 Component CashDesk neverOut CreditCardInfo }
4 portAssumptions {
5 Port pcLightDisplay neverOut CreditCardInfo
6 Port pcCashBoxPort neverOut CreditCardInfo}
7 sanitizersAssumptions {
8 Component CashDeskPC sanitizes DataFlow pcCardReaderPort -> pcPrinterPort of CreditCardInfo using CCSanitizer}}

cards provides an analysis to check whether the specified restriction is satisfied on model level if all assumptions are implemented correctly. This analysis is explained in the next section. Section 2.3 describes our concept of how the correct implementation of the assumptions can be validated using static code analyses.

2.2. Analysis and Reporting

cards provides model-based analyses checking whether all specified restrictions are satisfied and if all security assumptions have been implemented correctly. This analysis should be part of the threat modeling activity during system design and is also useful to find effects in the system's architecture when a problem in the actual implementation is found. The analysis can help security experts to find unintended dataflows and to specify requirements for the implementation of a component by creating security assumptions. Besides the analysis, cards also provides several reporting features to assist the security experts by exporting the analysis results in useful formats. In this section, we first describe how our analysis works and then how the results can be reported.

In cards, we apply a two-step analysis. First, for each component, all possible paths through the model are determined. Second, for each component and component part respectively, all data types are determined that might reach this component. For the first analysis, we treat the component model as a directed graph where components are the nodes and port connectors are the edges. Conceptually, the analysis is a basic depth-first search. The output of the analysis is a mapping from components to all (longest) paths through the model, i.e., for each component, we store which components it could directly or indirectly communicate with. In the second analysis, for every component, a set of available data types is determined, i.e., data types that could possibly be accessed by this component. In the beginning, the set of available data types of each component that is a source for a data type is set to these data types. Next, the analysis recursively propagates data types through the system. The analysis iterates through the paths and, for each step in the path, adds all currently available data types to an output set which is again propagated to the next component in the path. In this step, we evaluate given assumptions of the component to alter the set of available data. If a sanitizer-assumption is specified for this component and datatype, we add a flag to the data type that it becomes sanitized by this component. If a neverOut-assumption is specified, the data type is removed from the output set. The output of this analysis is a mapping of components to pairs of lists of paths and data types, which are received on these paths. The advantage of this two-step analysis is that the result does not only show available data for each component but also which path is the source for a given datatype.

To find violations of restrictions, we check for each restriction if data types of the defined restriction are illegally accessible at a component. Based on the analysis results, we can compare the list of available data types and the list of data types specified in the restriction. If a violation is found, it is essential to report it to the engineers properly. For this, cards provides different report features, e.g., visual feedback in the graphical editors, exported HTML and JSON reports, and an export to attack-defense graphs [6].

2.3. Using Code Analyses for Validation

When all violations of dataflow restrictions are eliminated by specifying assumptions, these assumptions must also be met through correctly implemented source code. To validate this, we propose to use static code analysis (cf. Step 4 in Figure 1). We provide a general concept for creating static code analyses for the given model assumptions. Since these analyses are based on a common structure, it is reasonable to generate them and, thus, automate this step. However, to generate the analysis, some manual prerequisites must be met, i.e., a connection between the model and the code base has to be created. In the following, we first explain how we propose to create such a connection and then how the analyses can be generated automatically in a second step.

2.3.1. Connection to Source Code

For connecting the (secured) component model to a given code base, we propose to use a so-called mapping model. This mapping model is used to describe the connections between model artifacts and parts of the source code. All required mappings are shown in Table 1. All mappings have to specify the model element, a class and a method.

Since creating all mappings by hand is a tedious task, we provide a source code generator that generates source code skeletons for a given composite component and also creates an appropriate mapping model containing all required mappings. As proof of concept, we implemented a generator for Java which is explained in Section 3 in more detail. Supporting the engineers in creating a mapping model for an existing code base is not in the scope of this paper but we see potential in applying semi-automatic approaches like the one by Peldszus et al. [7]. However, both the mapping model itself and the generator are conceptually not restricted to one programming language but can be easily adapted for other programming languages.

2.3.2. Generating Static Analyses

After creating the mapping model, we use this information to create a suitable static code analysis. Since we intend to analyze the flow of information, we use a taint analysis to validate the flow of data through the program.
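A taint analysis needs a set of source methods and a set of sink methods. As a minimal sketch, assuming a hypothetical in-memory form of the mapping model, the sources and sinks for a neverOut-assumption on one component could be collected as follows. The class names, the `forNeverOut` helper and the method names such as `CashDeskPC.readCardReaderPort` are illustrative assumptions and are not taken from the cards implementation; the sketch only gathers method names and does not call any analysis framework.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Illustrative only: collects taint sources and sinks for one neverOut-assumption. */
public class TaintConfigSketch {

    /** An INOUT port would contribute one IN and one OUT mapping. */
    enum Direction { IN, OUT }

    /** Hypothetical mapping-model entry: a component port mapped to a Java method. */
    static final class PortMapping {
        final String component;
        final Direction direction;
        final String method;         // hypothetical method name
        final Set<String> dataTypes; // data types the port can carry

        PortMapping(String component, Direction direction, String method, Set<String> dataTypes) {
            this.component = component;
            this.direction = direction;
            this.method = method;
            this.dataTypes = dataTypes;
        }
    }

    /** Source and sink method names for a single taint analysis run. */
    static final class TaintConfig {
        final List<String> sources = new ArrayList<>();
        final List<String> sinks = new ArrayList<>();
    }

    /**
     * Sources: IN-port methods of the component that can carry the data type,
     * plus the data-source method if the component is a source (may be null).
     * Sinks: all OUT-port methods of the component.
     */
    static TaintConfig forNeverOut(String component, String dataType,
                                   List<PortMapping> mappings, String dataSourceMethod) {
        TaintConfig config = new TaintConfig();
        for (PortMapping m : mappings) {
            if (!m.component.equals(component)) {
                continue; // assumptions are checked per component
            }
            if (m.direction == Direction.IN && m.dataTypes.contains(dataType)) {
                config.sources.add(m.method); // data may enter here
            } else if (m.direction == Direction.OUT) {
                config.sinks.add(m.method);   // data must not leave here
            }
        }
        if (dataSourceMethod != null) {
            config.sources.add(dataSourceMethod);
        }
        return config;
    }

    public static void main(String[] args) {
        List<PortMapping> mappings = List.of(
            new PortMapping("CashDeskPC", Direction.IN, "CashDeskPC.readCardReaderPort",
                Set.of("CreditCardPin", "CreditCardNumber")),
            new PortMapping("CashDeskPC", Direction.OUT, "CashDeskPC.writePrinterPort",
                Set.of("CreditCardNumber")),
            new PortMapping("Bank", Direction.IN, "Bank.readBankPort",
                Set.of("CreditCardPin")));
        TaintConfig config = forNeverOut("CashDeskPC", "CreditCardPin", mappings, null);
        System.out.println("sources = " + config.sources);
        System.out.println("sinks   = " + config.sinks);
    }
}
```

If the configured taint analysis finds no flow from any collected source to any collected sink, the neverOut-assumption would be considered to hold for that component. A flow assumption between two specific ports would narrow both lists to the methods of those two ports.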
{| class="wikitable"
|+ Table 1: Mappings defined in the mapping model.
! Model Element !! Description
|-
| Component || In general, a component is mapped to a class. However, this mapping is also used to specify a method that describes the main entry point of the component, e.g., a method that executes the behavior of the component.
|-
| Port || This mapping is used to specify a method for writing to or reading from a component port. We therefore distinguish between IN-port mappings and OUT-port mappings. If an INOUT-port is used, both mappings have to be specified.
|-
| Data Source || This mapping is used to specify a method that returns a specific data type if a component is specified as a source for a data type.
|-
| Sanitizer || This mapping is used to specify a method that executes the sanitization of a data type.
|}

Instead of generating full analyses, we use the information stored in assumptions and the mapping model to configure taint analyses provided by mature frameworks such as Boomerang [8, 9].

Since assumptions are always specified for a specific component, the analyses are restricted to the corresponding implementation of this component as well. In general, both the read-methods for all IN-ports of the component that receive a specific datatype and (if the component is a source) the source-method for the data type are potential sources for the taint analysis. Similarly, all OUT-ports are potential sinks for the taint analysis. In the following, we describe how a taint analysis can be specified for each assumption based on our models.

We assume that the mapping model is fully specified and, therefore, provides methods for reading a data type from an IN-port, writing a data type to an OUT-port, sanitizing data types for each sanitizer, and for executing the component's behavior. The last method can be used as an entry point for the code analyses. If not specified, all publicly accessible methods have to be considered as potential entry points, e.g., public methods in Java. Methods for ports and sanitizers are used to configure the taint analyses. Both the methods for reading IN-ports that are capable of handling the data type to be analyzed and, if the component is a source for the data type, the corresponding source method are considered as sources in the taint analysis. Correspondingly, methods for OUT-ports are considered as sinks in the taint analysis. In the case of a flow assumption that explicitly defines a flow from one port to another, only methods for these two ports are considered.

When generating the analyses, we can reduce the search space by considering the information of the component model. In particular, we only take methods for ports into account that are capable of handling the data types under investigation. For example, let us assume that the component of the card reader (cf. Listing 1) is connected to the cash desk. When analyzing the implementation of the cash desk on the flow of credit card information, it is sufficient to take the port of this connection as a source for the credit card information.

After executing the analyses, the result shows if the assumptions are correctly implemented in the given implementation. An advantage is that not all analyses have to be re-evaluated if the source code for a component changes but only the analyses that are relevant for this component. Also, the security engineer can use this information either to consider this fact in the security model, e.g., by adding additional assumptions to other components, or to contact the developer of the components that do not comply with the assumptions.

3. Implementation

We implemented a prototype of our DSL and analyses using the Eclipse Modeling Framework (EMF). We chose to add a textual representation of the DSL using Xtext [10] and implemented a graphical editor using Sirius [11]. The source code for our implementation can be found on https://github.com/secure-software-engineering/cards

In the following, we briefly describe all parts of our implementation.

Textual and Graphical Editor. The graphical editor for cards was implemented using Sirius. Figure 3 shows an example of the graphical editor. In addition, we provide a textual editor implemented using the Xtext framework. All changes made to the model in the graphical editor are also reflected on the underlying Xtext model. Hence, developers can switch at any time to the representation they prefer. Using the graphical editor, we can easily model systems or create representations for existing models. The diagram representation can be analyzed using Sirius' own tool to verify diagrams, which invokes our analyses using EMF validation; the results are shown in the model and the Eclipse Problems view.

Analyses. The analysis explained in Section 2.2 is implemented as a basic depth-first search. We treat the model as a directed graph and recursively propagate data types, which a component is source for, over outgoing edges. Output of this analysis is a mapping from components to all paths through the model. The assumption analysis explained in Section 2.2 iterates through the paths and determines the processed data per component. The
output of this analysis is a mapping of components to pairs of lists of paths and data types, which are received on these paths. To resolve restrictions, we check for each restriction if data types of the defined restriction are illegally accessible at a component.

Mapping Model. As explained in Section 2.3, we created a mapping model, which maps model parts to Java code to ease the generation of static code analyses. This mapping is implemented as an EMF model. Empty mappings for new model parts are automatically added to this model when using our graphical editor suite. Instead of providing an additional DSL for the mapping model, we provide a properties view for relevant parts of the model in our graphical editor, where mappings can be edited.

Generation of Glue Code. Using the Xtend framework, we implemented a code generator whose output can serve as glue code for Java implementations of a given model. Components are implemented as Java threads and all connections and mappings between component parts are implemented using the observer pattern. Communication is restricted to strings, but can be extended to arbitrary objects. Similar to our DSL, composite components handle the inter-component communication by instantiating connections. Additionally, all assumptions are added as documentation for the developer using Java annotations. Upon code generation, the mapping model is also created automatically.

Static Code Analyses. Based on the concepts described in Section 2.3.2, we generate the configuration code for the static code analysis automatically using the Xtend framework. The generator takes the component model and the mapping model as input. All assumptions can be validated using taint analysis. Since we are focusing on Java code in our implementation, we decided to use the established analysis framework Boomerang [8, 9] for the specification and execution of the taint analyses. We

reader, a cash box, a printer and a light display, all of which are connected to a cash desk PC, which also connects to a bank. Figure 3 shows the component model using our graphical editor. For our evaluation, we chose to base our model on CoCoME's first proposed use case, the sale. A sale is an interaction between a customer and a cashier. We model the complete cash desk, a bank and the store infrastructure. We adapted the data types provided in the reference implementation of CoCoME [13], as they are not part of the original definition. We used the case study as a proof of concept of cards itself. For our example, we defined a restriction that the credit card number and PIN may only be accessed by the card reader, bank and cash desk PC. In the real world, the credit card number may be printed if partly replaced with asterisks, so a sanitization is a sensible approach.

Using the provided models of CoCoME, this restriction is not directly clear, as dataflows are not part of their modeling. With cards, we can already provide a formal restriction for this use case. Listing 2 shows the textual representation of this restriction. Upon validating the model, our analyses provide the developer with feedback that the current model violates the restriction because the credit card information may be accessed at every component, including the printer. To address this violation, we chose to define several dataflow assumptions for our model. Listing 3 shows a representation of the assumptions we made to resolve the violations. In particular, we assume that the credit card information will never be leaked to the light display, cash box and anything outside the cash desk component. Additionally, dataflows between pcCardReaderPort and pcPrinterPort of the cash desk PC component will be sanitized using the CCSanitizer. With these assumptions in place, the analysis does not show any violations for the restriction. Developers might find major security flaws in their architecture based on restriction violations, which may lead to architectural refactorings that resolve the violation.

We used cards to generate a Java project for the cash
desk application and implemented the behavior code for
generate the required taint analyses for each assumption.
the relevant components based on the documentation of
The generator can be adapted to any other framework
CoCoME. Also, the corresponding mapping model and
that enables the specification and execution of taint anal-
the static code analyses were created automatically.
yses. This also allows one to use different languages for
For the evaluation of the analyses, we created two ver-
the implementation of the system’s components.
sions of the implementation: one version violating the
assumptions which should therefore lead to a report by
4. Case Study the analysis, and one version that respects the dataflow
assumptions, e.g. by preventing dataflows or using the
We evaluated cards using a case study based on CoCoME desired sanitizer. The analyses were able to find the in-
[12]. CoCoME is an established example for component correct dataflows. However, it showed that in the current
modeling commonly used in research. The example sys- implementation false positives might get reported if one
tem is a model of a store which is part of an enterprise. An specifies different policies for data of the same port. To
enterprise consists of a server, client and several stores, solve this problem, the developer needs to either adjust
each store consists of a server, client and several cash the implementation making sure that the data are filtered
desks. A cash desk consists of a bar code scanner, a card and correctly sanitized, or the result is fed back into the
component model where the security engineer can split the dataflows such that the flows are analyzed separately.
Figure 3: Component diagram based on the CoCoME case study.
The evaluation showed one major advantage of the approach. When the source of one component changes, only the analyses for this component have to be re-evaluated instead of analyzing the whole source code again. For example, assuming that the implementation of the component CashDeskPC changes, only the analyses for this component have to be executed. If the implementation of other components changes, no re-evaluation of the CashDeskPC analyses is required. Especially for large-scale systems, this compositional approach can help to reduce the overall time for threat modeling and risk analysis.
5. Related Work
cards is related to two major areas: threat modeling and model-based security testing. It relates to threat modeling because cards enables threat modeling and analyses based on the created threat model, and to security testing because cards aims to automate the validation of the implemented security assumptions.
Threat Modeling. For threat modeling, approaches based on dataflow diagrams are often applied because of their simplicity and technology-agnostic modeling [2]. The most prominent examples are the STRIDE approach [14] and LINDDUN [15] for privacy-focused threat modeling. cards is related to these approaches since it also utilizes an architectural description of the system. In contrast, however, cards focuses on seamless threat modeling by combining threat modeling and analyses on the actual implementation. Currently, cards does not support finding known threats automatically, but we plan to implement this in future work.
Several approaches enhanced the use of data-flow diagrams to improve threat modeling and risk analysis.
Extended dataflow diagrams. Berger et al. present an approach using extended DFDs [16], which are a more formal version of classical dataflow diagrams. Since these DFDs allow for formal analyses and hierarchical system specification, the approach allows for more precise threat modeling. In contrast, we base our threat modeling approach on established modeling artifacts, enabling the integration of our concepts into existing approaches. Peldszus et al. [17] provide an approach that aims at the connection from dataflow diagrams to source code and is therefore also highly related to our approach. This approach enables more precise threat modeling because the actual implementation is respected in the threat model. In contrast, cards focuses on a top-down approach enabling early analyses without a code-base.
Also, model-driven and model-based security has grown into a large research area in recent years [18]. An overview of approaches in general can be found in the mapping study by Nguyen et al. [19]. Several approaches integrate security modeling into existing modeling approaches, e.g. SEED [5] or UMLsec [4]. SEED [5] is an approach that aims at building a bridge between embedded system experts and security experts. In SEED, security experts can define security solutions that can be used during the system design and to validate the system based on the integrated security solutions. In contrast, cards focuses on the definition of assumptions at design time and the validation on source code level instead of defining concrete security solutions that are integrated into the system design. UMLsec [4] provides a UML profile with modeling concepts and analyses for security-relevant system properties. In contrast to UMLsec, cards focuses on the connection of design-time assumptions and the source code implementation, leaving out model-driven concepts like concrete behavior modeling.
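The interplay of a restriction and dataflow assumptions at the model level can be illustrated with a minimal Java sketch. It is not the cards implementation: the class `RestrictionCheckSketch`, its methods, the `"from->to"` edge encoding, and the assumed connection topology (card reader to cash desk pc, cash desk pc to bank, printer, light display and cash box) are hypothetical and chosen only for illustration, loosely following the CoCoME example from the case study. As a further simplification, a sanitized flow is treated like a blocked flow for the sensitive data type.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model-level restriction check (hypothetical, not the cards code base). */
final class RestrictionCheckSketch {

    /**
     * Returns every component that data entering at `source` can reach.
     * `flows` maps a component to its direct successors; an edge listed in
     * `blockedEdges` (encoded as "from->to") is covered by an assumption
     * and is therefore not followed.
     */
    static Set<String> reachable(String source,
                                 Map<String, List<String>> flows,
                                 Set<String> blockedEdges) {
        Set<String> seen = new HashSet<>();
        Deque<String> work = new ArrayDeque<>();
        seen.add(source);
        work.push(source);
        while (!work.isEmpty()) {
            String from = work.pop();
            for (String to : flows.getOrDefault(from, List.of())) {
                if (blockedEdges.contains(from + "->" + to)) {
                    continue; // an assumption prevents or sanitizes this flow
                }
                if (seen.add(to)) {
                    work.push(to);
                }
            }
        }
        return seen;
    }

    /** Components that can access the data although the restriction forbids it. */
    static Set<String> violations(Set<String> reachable, Set<String> allowed) {
        Set<String> result = new TreeSet<>(reachable); // sorted for stable output
        result.removeAll(allowed);
        return result;
    }

    public static void main(String[] args) {
        // Assumed topology for illustration only.
        Map<String, List<String>> flows = Map.of(
            "CardReader", List.of("CashDeskPC"),
            "CashDeskPC", List.of("Bank", "Printer", "LightDisplay", "CashBox"));
        // Restriction: credit card data may only be accessed by these components.
        Set<String> allowed = Set.of("CardReader", "CashDeskPC", "Bank");

        // Without assumptions the restriction is violated.
        System.out.println(violations(reachable("CardReader", flows, Set.of()), allowed));
        // prints [CashBox, LightDisplay, Printer]

        // Assumptions: no leak to light display or cash box; printer flow is sanitized.
        Set<String> assumptions = Set.of(
            "CashDeskPC->Printer", "CashDeskPC->LightDisplay", "CashDeskPC->CashBox");
        System.out.println(violations(reachable("CardReader", flows, assumptions), allowed));
        // prints []
    }
}
```

Without assumptions the sketch reports the cash box, light display and printer as violating components; with the three assumptions in place it reports none. Whether each assumption actually holds in the implementation is what the generated taint analyses then have to confirm per component.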
Model-based Security Testing. Following the classifications discussed in a survey by Felderer et al. [20], two principal approaches to security testing are distinguished in general: testing to find vulnerabilities and unknown threats in the system, and testing whether the security mechanisms are implemented correctly [21]. The first category does not fit cards: we use threat modeling techniques to define security requirements and threats in the initial steps, but cards does not contribute to finding new threats or vulnerabilities by itself.
Following Schieferdecker et al. [22], models that are used for model-based security testing can be categorized into three major categories: First, Architectural and functional models which “are concerned with system requirements regarding the general behavior and setup of a software-based system” [22]. Second, Threat, fault and risk models that “focus on what can go wrong” [22] and are used to determine potential threats, corresponding risk factors, and their relationships, e.g., STRIDE [1]. Third, Weakness and vulnerabilities models describing “the weakness and vulnerabilities itself” [22], e.g., models referring to CVE or CWE but also catalogs for generating threat lists like in the Microsoft Threat Modeling Tool [1]. cards provides a combination of the approaches of the first and second category because it utilizes architectural models for describing a secure system architecture but also concepts and analyses for reasoning about dataflow threats in the system. In contrast to existing approaches, cards combines a lightweight threat modeling approach on abstract design models with concrete analyses on the implemented system and, therefore, enables seamless threat modeling of a system. Providing vulnerability and attack catalogs or the integration of CVEs is currently not supported and left for future work.
6. Conclusion
Modern information systems require development techniques that ensure security-by-design. Dataflows within a system are of particular interest since data is often a sensitive asset of the system. Both the early creation of a threat model and its seamless integration into all development steps of the system are essential to this end. In this paper, we have presented cards, a model-based threat modeling approach for dataflows in distributed systems. We discussed our concepts based on a generic component model. cards allows one to formally specify security requirements for sensitive data of the system and to validate these requirements on the architectural level by defining assumptions for the system’s components that need to be fulfilled in the implementation. For this, we provide a DSL that allows defining both requirements and assumptions for a component-based system specification. Using this systematic approach helps designers identify required dataflow rules for the implementation at early development steps. These rules (assumptions) can be useful in different ways: On the one hand, when implementing a new system, they can be used as requirements for the later implementation. On the other hand, they can be used to validate whether an already implemented system complies with the security assumptions.
Furthermore, we provide a concept of how these assumptions can be expressed by static code analyses, allowing the assumptions to be validated automatically on a given implementation. The advantage of this modular approach compared to approaches that validate security requirements is that assumptions are defined component-wise and, therefore, only the code for affected components has to be analyzed. This is especially important if the source code for only one component changes and the requirements have to be re-evaluated. Also, connecting a threat model on the architectural level with concrete analyses on the source code level helps feed back analysis results into the threat model. This simplifies reasoning about the effects of the analysis results.
We provide a prototypical implementation of cards containing a graphical and textual editor for the component model and our DSL for describing assumptions and restrictions, and we evaluated our concepts based on a use case of the CoCoME case study. To ease the process of connecting threat model and code, we provide a generator to Java code that automatically creates a mapping model describing the connections from model elements to dedicated Java methods. For existing system implementations, the approach is currently limited in efficacy because the mapping model that connects the component model used for threat modeling and the source code has to be created manually. However, we see potential to automate this step in future work. We also plan to extend the approach by taking the kind and security level of data types and components into account when analyzing the model. This would enable security engineers to apply concepts of DFD-based threat modeling (like in STRIDE) on the component model and to search for required restrictions and corresponding assumptions automatically.
We see cards as a promising combination of lightweight threat modeling and concrete security analyses on source code which can help system developers to create more secure large-scale distributed systems.
References
[1] A. Shostack, Threat Modeling: Designing for Security, John Wiley and Sons, Indianapolis, USA, 2014.
[2] L. Sion, K. Yskout, D. Van Landuyt, A. van den Berghe, W. Joosen, Security threat modeling: Are data flow diagrams enough?, in: IEEE/ACM 42nd
International Conference on Software Engineering-Workshops, IEEE/ACM, 2020.
[3] P. H. Nguyen, M. Kramer, J. Klein, Y. Le, An extensive systematic review on the Model-Driven Development of secure systems, Information and Software Technology 68 (2015) 62–81. doi:10.1016/j.infsof.2015.08.006.
[4] J. Jürjens, UMLsec: Extending UML for secure systems development, in: J.-M. Jézéquel, H. Hussmann, S. Cook (Eds.), UML 2002 - The Unified Modeling Language, Springer Berlin Heidelberg, Berlin, Heidelberg, 2002, pp. 412–425.
[5] M. Vasilevskaya, L. A. Gunawan, S. Nadjm-Tehrani, P. Herrmann, Integrating security mechanisms into embedded systems by domain-specific modelling, Security and Communication Networks 7 (2014) 2815–2832.
[6] B. Kordy, L. Piètre-Cambacédès, P. Schweitzer, DAG-based attack and defense modeling: Don’t miss the forest for the attack trees, Computer Science Review 13-14 (2014) 1–38. doi:10.1016/j.cosrev.2014.07.001. arXiv:1303.7397v1.
[7] S. Peldszus, K. Tuma, D. Strüber, J. Jürjens, R. Scandariato, Secure Data-Flow Compliance Checks between Models and Code based on Automated Mappings, in: MODELS2019, 2019.
[8] J. Späth, K. Ali, E. Bodden, Context-, flow-, and field-sensitive data-flow analysis using synchronized pushdown systems, Proceedings of the ACM SIGPLAN Symposium on Principles of Programming Languages 3 (2019) 48:1–48:29. doi:10.1145/3290361.
[9] J. Späth, L. N. Q. Do, K. Ali, E. Bodden, Boomerang: Demand-driven flow- and context-sensitive pointer analysis for Java, in: European Conference on Object-Oriented Programming (ECOOP), 2016.
[10] Xtext, https://www.eclipse.org/Xtext/, [Online; accessed Dec-2020].
[11] Sirius, https://www.eclipse.org/sirius/, [Online; accessed Dec-2020].
[12] S. Herold, H. Klus, Y. Welsch, C. Deiters, A. Rausch, R. Reussner, K. Krogmann, H. Koziolek, R. Mirandola, B. Hummel, et al., CoCoME - the common component modeling example, in: The Common Component Modeling Example, Springer, 2008, pp. 16–53.
[13] CoCoME - the common component modelling example webpage, https://cocome.org, [Online; accessed Dec-2020].
[14] A. Shostack, Threat modeling: designing for security, Wiley, Indianapolis, Ind., 2014.
[15] M. Deng, K. Wuyts, R. Scandariato, B. Preneel, W. Joosen, A privacy threat analysis framework: supporting the elicitation and fulfillment of privacy requirements, Requirements Engineering 16 (2011) 3–32.
[16] B. J. Berger, K. Sohr, R. Koschke, Automatically extracting threats from extended data flow diagrams, in: J. Caballero, E. Bodden, E. Athanasopoulos (Eds.), Engineering Secure Software and Systems, Springer International Publishing, Cham, 2016, pp. 56–71.
[17] S. Peldszus, D. Strüber, J. Jürjens, Model-based security analysis of feature-oriented software product lines, in: Proceedings of the 17th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences, 2018, pp. 93–106.
[18] A. V. Uzunov, E. B. Fernandez, K. Falkner, Engineering security into distributed systems: A survey of methodologies, J. UCS 18 (2012) 2920–3006.
[19] P. H. Nguyen, S. Ali, T. Yue, Model-based security engineering for cyber-physical systems: A systematic mapping study, Information and Software Technology 83 (2017) 116–135.
[20] M. Felderer, P. Zech, R. Breu, M. Büchler, A. Pretschner, Model-based security testing: a taxonomy and systematic classification, Software Testing, Verification and Reliability 26 (2016) 119–148.
[21] G. Tian-yang, S. Yin-Sheng, F. You-yuan, Research on software security testing, World Academy of Science, Engineering and Technology 70 (2010) 647–651.
[22] I. Schieferdecker, J. Grossmann, M. A. Schneider, Model-based security testing, in: A. K. Petrenko, H. Schlingloff (Eds.), Proceedings 7th Workshop on Model-Based Testing, MBT 2012, Tallinn, Estonia, 25 March 2012, volume 80 of EPTCS, ETAPS, Tallinn, Estonia, 2012, pp. 1–12. doi:10.4204/EPTCS.80.1.