-

Architecture-Driven Reduction of Specification Overhead for Verifying Confidentiality in Component-Based Software Systems

0 Kateryna Yurchenko, Moritz Behr, Heiko Klare, Max Kramer and Ralf Reussner Institute for Program Structures and Data Organization Karlsruhe Institute of Technology Karlsruhe , Germany

-Code verification techniques can be used confidentiality for architectural designs, developers can to guarantee that some of the information processed in account for these requirements when developing or selecting software systems remains confidential. For this, allowed components for reuse in a particular system. This can iunnfdoremraatnioanlysflios.wsRhedavuecintog btehespsepceificiefidcaftoironthoevseyrshteeamd prevent late errors that are costly to fix. could render code verification feasible where verification was considered too complex or costly so far. In this I I . B ac kg ro u n d a n d Fo u n dat i o n s paper, we introduce a model-driven approach to reduce the overhead for creating and maintaining such speci- Before we present our approach to architecture-driven fications. Independent of the verification input format, confidentiality specifications, we briefly discuss the fundadevelopers can specify confidentiality for component- mental techniques that we use. based architecture models, which are kept consistent with object-oriented code. They are supported in adapting the specifications to evolving systems in order to A. Component-Based Software Design detect information leaks with less effort and in earlier development stages.

I . I n t ro d u c t i o n a n d M o t i vat i o n

For many software systems, the confidentiality of processed data is important. To guarantee this non-functional property, it is possible to verify the flow of information in the code. For this, confidentiality requirements have to be specified for every system under analysis. The effort required to specify and verify confidentiality represents a design and development overhead. A part of this effort is inevitable, and another part is due to accidental complexity. As a result, it is either more costly than necessary to develop secure software, or the software is less secure than required.

In this paper, we present an approach for reducing the overhead for creating and maintaining confidentiality specifications using the component-based development paradigm and model-driven techniques. With this approach 1.) confidentiality can be specified for services that are defined for component interfaces, 2.) architectural specifications for the component architecture are automatically translated into proof obligations for code verification, and 3.) evolving architectures, confidentiality specifications as well as code are kept semi-automatically consistent with each other. As a result, developers do not need to consider the internal realization of components or details of verification techniques. Furthermore, they can read and modify confidentiality specifications on their level of abstraction because specifications in the architecture and in the code are kept consistent with each other. Finally, by specifying

Our approach supports confidentiality specifications

for component-based software. Components are reusable software units that provide and require services, for which signatures are defined in interfaces. We use the Palladio Component Model (PCM) [1] as architectural description language. With the PCM, service signatures are modeled with a return type, a name, and parameters, which have a name and a type. Confidentiality is specified based on these service signatures in line with the central idea of componentbased software development: the internal realization of components is hidden behind interface contracts, which specify properties of the input and output for provided and required services.

B. Code Generation and Consistency Techniques to automatically derive code from models

are called code generation or model-to-text transformation techniques. Software is, however, often not developed in a strict forward engineering process where models are not changed after they have been used to produce code. Therefore, techniques for preserving consistency between code and models were developed. We use the Vitruvius framework [2] and its reactions language to keep Java code consistent with PCM instances based on monitored changes.

C. Specification and Verification of Confidentiality using Non-Interference

Confidentiality is specified and verified with our approach based on the notion of non-interference. The intuition for preventing information leaks is that confidential in- C. Code Generation and Verification formation may not have any direct or indirect influence To obtain a fully functional and verified system, we on non-confidential information. For this, it has to be generate, complete, and verify code in four steps. First, specified which input data of a component’s service shall Java code and confidentiality annotations are generated not interfere. The observable behavior of the component from PCM instances and the architectural specifications. and output of the service has to be equivalent for all calls Then, the empty method stubs for component services that are equivalent with respect to this specification [3]. are manually completed. Next, the completed code with If information is classified as “high” (confidential) and confidentiality annotations is copied and the annotations “low” (non-confidential), then high inputs may not have any are translated into JML proof obligations. In this step, the observable influence on low outputs. Low inputs, however, generator produces all required JML proof obligations for may influence high outputs. In general, information can be cases that do not need to be distinguished by developers classified based on arbitrary lattices, i.e. partially ordered in architectural specifications, such as different obligations sets with unique supremum and infimum for any two for inputs and outputs to service calls. Finally, the code elements. copy is verified by proving the JML obligations using

From architectural specifications we derive proof obli- KeY (subsection II-C). Figure 1 illustrates these two code gations that can be verified using the theorem prover representations of confidentiality specifications as Java KeY [4]. KeY can be used for automated and interactive annotations (middle), and as JML proof obligations (right). proofs of requirements that are expressed using dynamic The purpose of the annotations is to reduce the complexity logic and specified with the Java Modeling Language for developers by providing them a code representation of (JML) [5]. Non-interference is verified with KeY using the architectural specifications. These two representations two symbolic executions that only differ in terms of a only have syntactic differences: data set definitions are confidential input [6]. turned into Java enums and stereotype applications are I I I . A rc h i t e c t u r e - D r i v e n C o n f i d e n t i a l i t y turned into annotation usages. The level of abstraction with data sets and information flow permissions only for

We will explain how confidentiality can be specified, inputs and outputs of component services is the same. verified and co-evolved in an architecture-driven way based on a simple groupware example.

D. Consistency across Models and Code A. Groupware Calendar Example

In Figure 1, we show three snippets for a groupware calendar system. The first snippet shows an architectural model for a groupware component with a confidentiality specification. The component provides a service yielding all periods that are blocked by calendar entries and that are scheduled between two given timestamps. The second service yields all details for a specific calendar entry, such as the location and participants.

B. Confidentiality Specification

Confidentiality is not directly specified on the code level but on the level of component-based architectures. This way, developers do not need to know the verification input format and information flow only needs to be specified for inter- but not for intra-component communication. I V . R e a l i z at i o n a n d E va l u at i o n

First, so-called data sets have to be defined. They group In our prototypical implementation1, data set definitions data that may interfere with each other but not with other and pairs that link data sets and parameters or return values data sets. If information may flow from and to these data are persisted in a confidentiality specification model that sets, it is specified for service signatures of architectural is kept separate from the architectural models. Stereotypes interfaces. Other flows are not permitted. of a confidentiality profile for PCM can be applied to

In our example in Figure 1, confidentiality specifica- service signatures of architectural interfaces in order to tions are shown as notes with the keyword includes. For refer to these pairs as tagged values. By using this nonthe service getBlockedPeriods, a star yields two allowed invasive profile mechanism the architecture models remain information flows: 1.) non-confidential information (low) compatible to all Palladio tooling. may influence all inputs, 2.) returned values may influence To obtain Java code and confidentiality annotations low information. For the service getFullCalendarEntry, from PCM instances and specifications, we have created a it is specified that 1.) low information may influence generator based on Xtend. An additional generator for JML the provided ID, 2.) returned values may only influence confidential information (high). 1github.com/KASTEL-SCBS/PCM2Java4Key We introduce the intermediate annotation representation for confidentiality specifications not only to reduce specification complexity, but also to ease automated consistency preservation for it. The goal is to support development processes in which code, architectural models and confidentiality specifications may co-evolve. To this end, we reuse a mechanism that keeps component-based architectures (PCM instances) and Java code consistent with each other by reacting to monitored changes (subsection II-B). We are currently extending this mechanism to keep confidentiality specifications on the architectural level consistent with specifications in code. The JML input for KeY is, however, always regenerated and therefore no consistency preservation is needed for it. enum DataSets {APPOINTMENT(),METADATA();} enum IFPairs { METADATA_STAR(DataSets.METADATA,"*"), METADATA_ID(DataSets.METADATA,"id"), APP_RETURN(DataSets.APPOINTMENT,"\return"); } interface Calendar { @NISPEC({IFPairs.METADATA_STAR}) BP[] getBlockedPeriods(TS from, TS to); @NISPEC({IFPairs.METADATA_ID, IFPairs.APP_RETURN}) CE getFullCalendarEntry(ID id); } dentiality specifications for the verification of componentbased software. We have presented architecture-driven confidentiality specifications for which JML proof obligations can be automatically derived so that developers do not need to know the verification input. Furthermore, we have proposed a mechanism for keeping evolving architectures, confidentiality specifications and code consistent with each other. With it, developers can design and realize software with confidential data in an incremental way and on their level of abstraction. The goal of the approach is to help developers in finding confidentiality leaks with less overhead and in earlier development stages. proof obligations is under development. Both generators process the complete input in batch mode and overwrite previous output. We are currently developing consistency preservation reactions, which update annotations after changes in architectural specifications. Java code and architectural models are already kept consistent based on previous work [2].

To ensure the quality of both batch generators and of the incremental reactions, we create unit tests that compare the obtained output with expected outputs for all types of specification possibilities and changes. Furthermore, we will perform cross-validation by always comparing the result of incremental consistency preservation with that of a batch generation of code and annotations.

We will evaluate how confidentiality can be specified and maintained in an architecture-driven way using three systems: a simple web shop, a cloud-based file sharing platform, and a common case study realizing a trading system for retail shops.

R e f e r e n c e s [1] R. H. Reussner, S. Becker, J. Happe, R. Heinrich, A. Koziolek,

H. Koziolek, M. Kramer, and K. Krogmann, Modeling and Simulating Software Architectures – The Palladio Approach.

Cambridge, MA: MIT Press, 2016, 408 pp. [2] M. E. Kramer, M. Langhammer, D. Messinger, S. Seifermann, and E. Burger, “Change-driven consistency for component code, architectural models, and contracts,” in 18th

V . R e l at e d Wo r k International Symposium on Component-Based Software Jürjens presents a model-driven approach where security- E20n1g5in,epepr.in2g1,–s2e6r.. CBSE ’15, Montréal, QC, Canada: ACM, rueslienvgantthienfoUrMmaLtsieocn eisxtsepnesciiofinedfoonr tthhee sUysntiefimeddeMsigondelleivnegl [3] dSe.clGasrseiifinceartiaonndinDc.omGproanheln,t“-bNaosned-instyesrtfeemresn,”cein w20it1h6 IwEhEaEtLanguage (UML) [7]. This approach does, however, not 29th Computer Security Foundations Symposium (CSF), support automatic proof obligations generation for code. 2016, pp. 253–267.

The IFlow approach [8] allows the model-driven develop- [4] B. Beckert, R. Hähnle, and P. H. Schmitt, Eds., Verification wmeebntseorfvidciesstrwibhuicthedarseyssteecmursecwon.rs.its.tiinnfgoromfamtioobnilfleoawp.pUssainndg o4f33O4b.jeScptr-iOnrgieern-tVeedrlSaogf,tw2a0r0e7:. The KeY Approach, ser. LNCS the system’s UML model, IFlow automatically generates [5] dGe.siTg n. Lofeajmveln:sA, Abe.hLa.viBoraakleirn,tearnfdaceC.spRecuibfiyc,at“iPonrellaimngiunaagrye code and a formal model of the system based on abstract for java,” ACM SIGSOFT Software Engineering Notes, 31, state machines, which is used to verify information-flow no. 3, pp. 1–38, 2006. properties. IFlow does not support automatic consistency [6] C. Scheben and S. Greiner, “Information flow analysis,” preservation on both design and code levels during the in Deductive Software Verification – The KeY Book: From system evolution. THhäehonrlye,tPo. PHr.aSccthicme,itWt, .anAdhMre.nUdtlb,rBic.hB,Eecdkse.rCt,hRam.:BSupbreinl,gRer.

To the best of our knowledge, there is no other approach International Publishing, 2016, pp. 453–471. for verifying component-based software against architec- [7] J. Jürjens, “Umlsec: Extending uml for secure systems tural confidentiality requirements or for preserving con- development,” in International Conference on The Unified sistency between confidentiality specifications in software Modeling Language, Springer, 2002, pp. 412–425. architecture descriptions and code. [8] K. Katkalov, K. Stenzel, M. Borek, and W. Reif, “Modeldriven development of information flow-secure systems with iflow,” in Social Computing (SocialCom), 2013 International

Conference on, IEEE, 2013, pp. 51–56.