=Paper=
{{Paper
|id=Vol-2245/mrt_paper_1
|storemode=property
|title=Reflecting on the Past and the Present with Temporal Graph-based Models
|pdfUrl=https://ceur-ws.org/Vol-2245/mrt_paper_1.pdf
|volume=Vol-2245
|authors=Antonio Garcia-Dominguez,Nelly Bencomo,Luis Hernan Garcia Paucar
|dblpUrl=https://dblp.org/rec/conf/models/Garcia-Dominguez18
}}
==Reflecting on the Past and the Present with Temporal Graph-based Models==
Antonio García-Domínguez, Nelly Bencomo, Luis H. Garcia Paucar
SEA, SARI, Aston University, UK
a.garcia-dominguez@aston.ac.uk, nelly@acm.org, garciapl@aston.ac.uk

MRT 2018, October 14, Copenhagen, Denmark

ABSTRACT

Self-adaptive systems (SAS) need to reflect on the current environment conditions, and on their past and current behaviour, to support decision making. Decisions may have different effects depending on the context. On the one hand, some adaptations may have run into difficulties. On the other hand, users or operators may want to know why the system evolved in a certain direction. Users may just want to know why the system is showing a given behaviour or has made a decision, as the behaviour may be surprising or unexpected. We argue that answering emerging questions related to situations like these requires storing execution trace models in a way that allows for travelling back and forth in time, qualifying the decision making against the available evidence. In this paper, we propose temporal graph databases as a useful representation for trace models to support self-explanation, interactive diagnosis or forensic analysis. We define a generic meta-model for structuring execution traces of SAS, and show how a sequence of traces can be turned into a temporal graph model. We present a first version of a query language for these temporal graphs through a case study, and outline the potential applications for forensic analysis (after the system has finished in a potentially abnormal way), self-explanation, and interactive diagnosis at runtime.

CCS CONCEPTS

• Software and its engineering → Software system models; Extra-functional properties; Designing software; • Computing methodologies;

KEYWORDS

Self-explanation, Temporal Graph Models, Runtime models, Self-adaptation

1 INTRODUCTION

In [31], it is argued that the self-explanation shown by the running system helps someone diagnosing the behaviour of the system to analyze and trace past actions, helping fix potential faults and fostering the trust of the end users. To enable these capabilities, we argue that self-adaptive systems should be equipped with traceability management facilities and offer temporal links to provide (i) the impacts of the adaptation actions on the quality properties of the system over time, and (ii) the history of the decisions of the system and the evidence that supports the decisions made, together with the environmental conditions observed.

In this paper we offer the first contributions towards allowing the system to support explanations to operators and end users based on a generic meta-model. Specifically, we define a generic meta-model for structuring execution traces of SAS, and show how a sequence of traces can be turned into a temporal graph model. We present a first version of a query language for these temporal graphs based on specific cases related to a case study. Our solution relies on temporal model-based graphs that abstract the decisions, the evidence collected and their corresponding estimated impacts on the quality properties of the system. We foresee two potential applications of our approach: forensic analysis of the SAS once the system has finished, and self-explanation supported by the self-adaptive system at runtime.

The paper is organised as follows. Section 2 presents the basic concepts in self-explanation and temporal graphs needed to understand the rest of the text. Section 3 describes our proposed approach for creating frameworks for reusable self-explanation, and outlines our proof-of-concept implementations of the key components. Section 4 presents a case study on an existing self-adaptive system, together with a number of time-aware queries targeted at users and developers. Section 5 relates this work to others in the fields of self-explanation and model versioning. Section 6 concludes the paper with some general remarks and our lines of future work.

2 BACKGROUND

This section will present some of the basic concepts that underlie our proposal: the need for self-explanation in self-adaptive systems, and our specific choice among the various definitions available in the literature for temporal graphs.

2.1 Self-explanation and diagnosis in self-adaptive systems

Our increasing reliance on software systems has made self-adaptation an expected capability. However, self-adaptation actions may run into problems or unexpected behaviour due to uncertainty in the environment [11]. Therefore, end users may require explanations of the reasons why the system is showing its current behaviour, and specifically why it has made particular adaptation actions that were not expected. Further, in case of a failure, the operators may perform diagnosis during runtime, or forensic studies after the system has terminated, to identify the origins of the failure. Surprisingly, this area of research has been rather limited, with scarce research efforts. We describe some of the few initiatives below.

Early work was done by Roth-Berghofer et al. [7, 28] on Explanation-aware Computing. The main idea was to help designers and engineers create explanations for users.
[Figure 1: Email exchange example of a temporal graph, by Kostakos [21]]
[Figure 2: Example of a time-evolving temporal graph, by Hartmann et al. [13]]

The explanations should cover why specific services were recommended and how the system infers that the end user will agree, therefore keeping the end user satisfied with the recommendations. In their work, explanation generation was an aim.

More recently, the need for self-explanation in self-adaptive systems was argued in [3, 29, 31]. The authors claim that the behaviour of self-adaptive systems is emergent, which means that the behaviour exposed by the running system may be seen as unexpected by its end users or its developers. They further argue that trust in the system and the resolution of the surprising behaviour can only be achieved if a self-adaptive system is also capable of explaining itself [29]. In [10] we presented how traceability (i.e. following the life of a requirement) and versioning (i.e. keeping track of how a specific artifact evolved over time) are needed for self-explanation and diagnosis. More recently, in [23], the authors presented a temporal model to support the interactive diagnosis of adaptive systems. The authors describe a temporal data model to represent, store and query decisions as well as their relationship with the context, requirements, and adaptation actions. Self-explanation and diagnosis support is still a young research area that needs more research efforts.

2.2 Temporal graphs

A graph is a well-understood concept in computer science: in its most basic form, it is a collection of nodes with edges connecting them, which may be directed or undirected. There are a number of ways to extend the concept of a graph with the time dimension: in this section we will present three, one of which is the basis for our proposal.

Kostakos [21] was one of the first to use the term temporal graph, as a graph encoding of a temporal dataset of events. Kostakos' proposal includes an example where the email exchanges between a number of people through time are transformed into a temporal graph like the one in Figure 1, in three steps:

(1) One node is created per person and point in time when it sent an email: a person A would have nodes At1, At2 and so on.
(2) Directed edges are used to link the various nodes of a person into a sequence. The edges have weights equal to the time elapsed between the two timepoints.
(3) Unweighted directed edges are used to link people who exchanged emails at a specific timepoint: for instance, an edge from At3 to Bt3 means that at timepoint 3, A sent an email to B.

This representation lends itself well to variants of traditional graph metrics, such as the temporal distance between two people, or the temporal availability of a path from one person to another (i.e. whether there is a chain of emails from one person to another while considering time ordering). It uses discrete time, where timepoints act as timeslices and events are assumed to be instantaneous.

In a later survey of temporal networks by Holme and Saramäki [15], this type of graph with instantaneous edge activations is called a contact sequence, and another type of temporal network is identified: interval graphs, where edges are active over a set of time intervals rather than at specific timepoints. Holme and Saramäki mentioned in the same survey how the use of temporal networks was becoming common across multiple disciplines, and how no standardized notation had been set out yet.

Regardless, these two previous works consider temporal graphs to be rearrangements of a sequence of events between persistent entities, which may or may not be instantaneous. In contrast, Hartmann et al. consider temporal graphs as attributed labelled graphs(1) whose state evolves over time [13]. In the most naive approach, one would think of simply storing each version of a time-evolving graph as separate snapshots, and visiting each snapshot as needed. Unfortunately, the space requirements for such a naive solution would skyrocket as we increased the number of timepoints, and the time needed to visit the various versions would rise as well. In the same paper, Hartmann et al. specifically considered Internet of Things devices and cyber-physical systems, where a network of sensors may be picking up readings frequently over a long period of time, at different rates.

(1) Attributed graphs have key-value pairs in their nodes and edges, and labelled graphs classify nodes and edges into equivalence sets, e.g. "person" nodes and "emailed" edges. Neo4j is a well-known implementation of this data structure.

Hartmann et al. proposed a more efficient data model and storage mechanism for these temporal graphs, and made it available as the Greycat open source project (https://github.com/datathings/greycat). In this data model (shown in Figure 2), the graph is stored as a collection of nodes, which are conceptual identifiers that are mapped to specific state chunks depending on the world and timepoint chosen to visit them. Nodes have a lifespan between two specific timepoints, and within that lifespan they may take on a sequence of state chunks. Each state chunk appears at a specific timepoint and overrides any previous state chunk.

In the example in Figure 2, during timepoint i+1 a "watched" edge is created from "Eve" to "Video", and in i+2 "Alice" enters the graph and posts a "friendReq" to "Bob". Instead of storing the three full graphs outright, we only create new state chunks for "Eve" and "Alice" as needed, in a copy-on-write style. State chunks are keyed by node, time and (in Greycat) by world. This third coordinate makes it possible to "fork" the graph into multiple branching paths, which enables what-if analyses.

The approach presented in this paper adopts the data model by Hartmann et al. of an evolving labelled attributed graph. If we can turn the models that the system operates upon into this type of temporal graph, we can allow the system to reflect on what it has been doing in the long and the short term, and provide clear explanations about its history to the user.

3 PROPOSED APPROACH

Our end goal is to develop a generic and reusable framework that allows self-adaptive systems using the models@run.time approach to reflect upon their past execution and to improve the explanations provided to the users about their behaviour. In this section we will describe the various components that we see as necessary to achieve this goal. The next section will present an initial case study of a SAS which must choose between multiple configurations.

3.1 Problem-independent execution trace models

Self-adaptive systems are generally built as feedback loops (e.g. those following the MAPE-K architecture [18]). At each timepoint or time slice, observations are made and analysed, then future behaviour is planned, and those plans are executed.

Each Decision is based on an Observation of the environment, which produces a set of Measurements of the Metrics. In this version of the language we do not include specific values, but rather between which of the various Thresholds the value was. For instance, if we had two thresholds x and y with values 10 and 20, position 0 would be for x < 10, position 1 would be 10 ≤ x < 20, and position 2 would be x ≥ 20. Using these measurements, the system would derive a set of beliefs about the satisficement of the NFRs and the value of the different Actions, and finally pick a specific Action from the Decision.

3.2 Transparent temporal graph storage

The next part of the approach is to store the models themselves in a temporal graph to facilitate querying. In the literature, there are essentially two approaches to integrating graph databases with modelling technologies: changing the storage layer of the system directly (as implemented by NeoEMF [6]), or having an external system watch the existing storage and update the graph when changes are detected (as done by our Hawk [9] tool).

Either option is valid, but we chose Hawk as it had a number of advantages over NeoEMF for this problem. First, using Hawk does not require modifying existing systems: Hawk has a component-based architecture, making it possible to change the database technology, the graph updating algorithms and the supported model storage locations to fit the situation. In addition, Hawk has been specifically designed to detect the parts of a model that have changed, and it only changes the subgraph that is impacted by these changes [1].

In our implementation, we have extended Hawk with the ability to use Greycat as a backend. We have also extended Hawk with a time-aware version of the incremental graph update algorithm, which tells time-aware backends (i.e. Greycat) to "travel in time" to the timepoint when the change was introduced before applying the detected changes. This allows for the preservation of the original graph at the previous timepoint, making it possible to travel back and forth in time to answer queries about the history of the trace execution model.
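The copy-on-write state-chunk mechanism of the Hartmann-style data model can be sketched as follows. This is a minimal illustration only, not Greycat's actual API: the `TemporalNode` class and its methods are invented for this example, and the "world" coordinate is omitted.

```python
import bisect

class TemporalNode:
    """Minimal sketch of a time-evolving node (NOT Greycat's real API):
    state chunks are keyed by timepoint, and resolving the node at time t
    returns the latest chunk at or before t (copy-on-write semantics)."""
    def __init__(self):
        self.times = []   # sorted timepoints that have explicit state chunks
        self.chunks = {}  # timepoint -> dict of attribute/reference values

    def set_state(self, t, **attrs):
        # Copy-on-write: start from the state visible at t, then override it.
        current = dict(self.resolve(t) or {})
        current.update(attrs)
        if t not in self.chunks:
            bisect.insort(self.times, t)
        self.chunks[t] = current

    def resolve(self, t):
        # Latest chunk whose timepoint is <= t, or None before the node existed.
        i = bisect.bisect_right(self.times, t)
        return self.chunks[self.times[i - 1]] if i else None

# Mirroring the Figure 2 example: only the deltas create new chunks.
eve = TemporalNode()
eve.set_state(0, name="Eve", friend=[3])
eve.set_state(1, watched=[2])              # timepoint i+1: "watched" edge added
assert eve.resolve(0) == {"name": "Eve", "friend": [3]}
assert eve.resolve(2)["watched"] == [2]    # timepoint i+2 reuses the chunk from 1
```

Resolving a timepoint between two explicit chunks falls back on the earlier chunk, which is what makes snapshot-free storage possible.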
Since we want to make the queries on the execution history reusable, the history must be expressed in a language that can be reused across multiple problems (e.g. network management and smart grids). Whether the language could be further reused across multiple types of self-adaptive systems would require further research. It may be necessary to allow extending these metamodels to accommodate algorithm-specific details.

As an example of a potential execution trace metamodel for self-adaptive systems that need to switch between multiple configurations, consider the metamodel shown in Figure 3. At the top level, the Log for a time slice records the requested non-functional requirements as NFR objects (which have specific satisficement thresholds between 0 and 1), together with the Metrics to be measured to check their satisficement, and the alternative Actions that can be taken. These are used in the various Decisions that must be taken by the system. The system is pre-configured with a RewardTable linking the satisfaction of certain NFRs with certain Actions to a reward value. These rewards may evolve over time.

[Figure 3: Execution trace metamodel for a decision-based self-adaptive system. It relates Log, NFR, Metric, Observation, Decision, Action, Threshold, Measurement, ActionBelief, NFRBelief, NFRSatisfaction, RewardTable and RewardTableRow, with attributes such as name, threshold, probability, measurementPosition, estimatedValue, estimatedProbability, satisfied and value.]

3.3 Reusable time-aware query language

Having a convenient way to write queries over the history of the graph is another important ingredient for reusable self-explanation. To simplify adoption, the most direct approach is to start from an existing model querying language (e.g. OCL), and then add the ability to traverse the history of a model element or a type through their lifespans. Our definitions for the history of a model element and a type are as follows:

• The history of a model element starts from the moment it is created, and ends when it is destroyed. Model elements are assumed to have a unique identity, which could be a natural or artificial identifier, or its location within the model. There will be a new version of a model element every time its state changes, whether by changing the value of an attribute or the target of one of its references to other model elements.

• Model element types are considered "immortal", in the sense that they are created at the first timepoint in the graph and last until the virtual "end of time" of the graph. We will have a new version of a model element type every time an instance of the type is created or destroyed.

For model elements and model element types, we consider these to be the basic time-aware operations that must be supported by the temporal graph backend (e.g. Greycat): i) retrieving all versions, ii) all versions within a range, iii) versions from/up to a certain timepoint (included), iv) earliest/previous/next/latest version, and v) retrieving the timepoint for that version.

Table 1: Implemented time-aware primitives in the Hawk EOL dialect, for a model element or type x

Operation                            | Syntax
All versions, from newest to oldest  | x.versions
Versions within a range              | x.getVersionsBetween(from, to)
Versions from a timepoint (included) | x.getVersionsFrom(from)
Versions up to a timepoint (included)| x.getVersionsUpTo(to)
Earliest / latest version            | x.earliest, x.latest
Next / previous version              | x.next, x.prev/x.previous
Version timepoint                    | x.time

Our approach is heavily inspired by the "history" fields proposed by Rose and Segev in their work on temporal object-oriented data models in the 90s [27]. Rose and Segev went further in their proposal, suggesting the availability of per-field histories and a wider variety of predicates covering linear temporal logic. We are considering providing pre-defined versions of these additional facilities on top of the basic primitives above.

For our proof-of-concept implementation, Hawk already had most of the elements, as it came with a number of backend-specific and backend-agnostic query engine components. The most mature query language at the time of writing is a dialect of the Epsilon Object Language (EOL) [20], essentially a mix between JavaScript and OCL. It was a matter of defining a new query engine based on the EOL one, with additional support for the previous primitives: Table 1 lists the syntax for these new primitives.

3.4 Reusable visualizations

The last piece in the puzzle would be to have a reusable set of visualizations for a certain class of self-adaptive system. These could be dashboards with the key instants in the self-adaptive system, to allow users to jump to the main changes in behaviour that were introduced automatically, or a predefined sequence of "why"-form questions for common queries: why was it doing this at that time, why did it stop doing the previous action, why did it reason that this was beneficial, why was the reasoning process in such a state, and why was the user configuration like that.

Adding these visualizations to a system should require less effort once we have achieved the definition of a standardized trace metamodel, and have reusable temporal storage and querying capabilities that we can always start from.
The visualizations would be backed by time-aware queries, and could be packaged together with the configuration of the self-adaptive system for a specific problem domain. Beyond getting the data, another challenge is finding an accessible way to present it, which steps into the realm of human-computer interaction and is outside the scope of this paper. Regardless, at this stage the reusable visualizations remain a future line of work.

4 CASE STUDY: DECISION-MAKING SAS FOR NETWORKS

As an example to demonstrate the feasibility of the ideas proposed, let us consider the case of the Remote Data Mirroring (RDM) self-adaptive system (SAS). RDM is a technique to protect data against inaccessibility and therefore provide further resistance to data loss [17, 26]. An RDM maintains data availability and prevents data loss by storing copies (i.e., replicates) on servers (i.e., data mirrors) in physically remote locations [8].

Fig. 4 presents the R-POMDP (Relational Partially Observable Markov Decision Process [22]) model of the RDM SAS for a given IT network infrastructure, which has been used as the case study in [2, 25].

[Figure 4: RDM Case Study]

The RDM described above has been designed to be configured using two different topologies: minimum spanning tree (MST) and redundant topology (RT). These two possible configurations allow the RDM to selectively activate and deactivate network links to change its overall topology at runtime [8]. The RDM SAS self-adapts by reconfiguring itself at runtime according to the changes in its environment, which may include delayed or dropped messages and network link failures. Each network link in the RDM brings an operational cost and has a measurable throughput, latency, and loss rate. The performance and reliability of the RDM are determined by these metrics, according to the following trade-off: while RT is more reliable than MST, RT can be prohibitively more expensive than MST in some contexts.

Each configuration provides its own levels of reliability and energy costs, which are taken into account while estimating the levels of satisficement of the NFRs observed: the Maximization of Reliability (MR) and the Minimization of Energy Consumption (MEC). As such, the RDM makes decisions about the topologies to use. The operators may find themselves asking the reasons why the RDM SAS has used one topology instead of the other.

The states of these NFRs are not directly observable. Observations about their states are obtained by using monitoring variables (called MON variables). Two MON variables, REC="Ranges of Energy Consumption" and NCC="Number of Concurrent Connections", have been specified. In [24], we have shown the requirements specification based on Partially Observable Markov Decision Processes (POMDPs), which enables reasoning and decision-making about the partial satisficement of non-functional requirements (NFRs) and their trade-offs, based on evidence collected at runtime and on the formalism for decision-making under uncertainty provided by POMDPs (see Fig. 4).

4.1 Log preprocessing

In its current implementation, the RDM SAS produces execution traces in JSON format for each time slice, mentioning the observations made, the currently estimated levels of satisficement of the NFRs, and the preferences currently being applied in the decision process. The JSON log is made available for forensic purposes, to debug the system after the fact. Listing 1 shows an excerpt of the log for the first time slice.

Listing 1: Excerpt of the original JSON trace execution logs from the Remote Data Mirroring self-adaptive system

{
  "0": {
    "current_belief_mec_true": 0.5,
    "current_belief_mr_true": 0.25,
    "current_observation_code": -1,
    "current_rewards": [
      [90.0, 45.0, 25.0, 5.0],
      [100.0, 10.0, 20.0, 0.0]
    ],
    "ev.mst": 465.104345236406,
    "ev.rt": 326.710194366562,
    "flagUpdatedRewards": 0,
    "observation_description": "None",
    "observation_probability": 0.0,
    "selected_action": "MST"
  },
  "1": {
    "current_belief_mec_true": 0.94, ...
  }, ...
}
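A preprocessing step over such per-time-slice JSON entries could be sketched along these lines. This is a hypothetical illustration, not the actual transformation used in the study (which is a Java program, as described in Section 4.1); `parse_trace` and the inline log excerpt are invented here, but the field names match Listing 1.

```python
import json

# Invented two-slice excerpt, with field names as in Listing 1.
log_text = """
{
  "0": {"current_belief_mec_true": 0.5, "current_belief_mr_true": 0.25,
        "selected_action": "MST"},
  "1": {"current_belief_mec_true": 0.94, "current_belief_mr_true": 0.30,
        "selected_action": "RT"}
}
"""

def parse_trace(text):
    """Turn the per-time-slice JSON log into a time-ordered list of
    (time_slice, selected_action, (mec_belief, mr_belief)) tuples,
    ready to become snapshots of the trace execution model."""
    log = json.loads(text)
    slices = []
    for key in sorted(log, key=int):   # time slices are stringified integers
        entry = log[key]
        slices.append((int(key), entry["selected_action"],
                       (entry["current_belief_mec_true"],
                        entry["current_belief_mr_true"])))
    return slices

assert parse_trace(log_text)[0] == (0, "MST", (0.5, 0.25))
assert [action for _, action, _ in parse_trace(log_text)] == ["MST", "RT"]
```

Each tuple would then be written out as one revision of the evolving trace model, which is what the SVN-based pipeline in Section 4.1 does with full EMF models.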
Due to time constraints, for this first feasibility study it was decided to collect a large number of JSON logs and transform them into a temporal graph, answering queries away from the system (in an "off-line" fashion). It is planned to revise the RDM SAS in future studies to have it maintain the temporal graph while it is running, so queries can be answered "on-line" for reflection and self-explanation. All the resources used in this study are available online(3), and Hawk is freely available as open source from Github(4).

(3) https://git.aston.ac.uk/garcia-a/hawk-mrt2018
(4) https://github.com/mondo-project/mondo-hawk

The transformation of the logs into a temporal graph was done on a Lenovo Thinkpad X1 Carbon laptop with an Intel i7-6600U CPU running at 2.60GHz, running Ubuntu Desktop 18.04, Linux 4.15.0 and Oracle Java 8u102, allocating 8GB of RAM through -Xmx8g -Xms8g. The process required a number of steps:

(1) The RDM SAS had been previously run on a different machine through a simulation over 1000 time slices, producing a sequence of entries in JSON format which took 536KB.

(2) The trace execution metamodels of Section 3.1 were implemented in the Eclipse Modelling Framework [30].

(3) A small Java program (381 lines of code) was created to transform the JSON logs into EMF models conforming to the trace execution metamodels, and store them in a Subversion(5) (SVN) repository as a sequence of revisions of the same trace execution model file. The SVN repository was produced after 48 seconds, taking up 7.3MB of disk space, and resulted in 888 commits. SVN naturally ignores cases when the model has not changed at all from one time slice to the next.

(4) Hawk was instructed to index the full history of the SVN repository into a Greycat temporal graph, using its new time-aware updater component. From the second revision onwards, Hawk used its incremental updating capabilities to propagate any differences observed since the previous revision. The Greycat temporal graph over the 888 commits was produced after 21 seconds, taking up 31MB of disk space(6).

(5) https://subversion.apache.org/
(6) A previous version with second-level rather than millisecond-level timestamps required 2.6MB instead. We intend to investigate this further in future studies.

We chose SVN over manually time-stamped files (for example, slice1.xmi, slice2.xmi, and so on) because the local folder indexing component in Hawk at the time would index all time-stamped files separately, rather than as a single evolving model. The SVN component in Hawk is designed to provide the full history of each file, which produced the results we intended. In the future, we may create a version of the local folder indexing component that understands that files time-stamped according to a certain convention are versions of the same model.

Regardless, ignoring those 112 timepoints when the model did not change did not result in a loss of information. Indeed, if we only have SVN revisions for timepoints 1 and 10, that means that the model did not change at all between timepoints 2 and 9. If we ask for the state of the model at timepoint 5, we will see the version at timepoint 1, as we should. Omitting timepoints which did not introduce any changes can result in significant space savings when changes are infrequent, i.e. in a stable system configuration. This also reduces the number of results that we will have to go through in our queries. It can be thought of as a form of compression.

4.2 Time-aware queries for developers

We argue that self-explanation needs to be tailored to the reader. SAS developers and integrators will be interested in a different type of explanations about the system. Particularly, they will often need to verify that certain desirable properties are being met, while other times they will want to identify points in time where the self-adaptive system misbehaved.

Listing 2 shows a first example of what can be done for the developers. It allows the RDM SAS developer to check if the internal reward values in the decision algorithm have evolved over time or if they have remained the same.

Listing 2: EOL query to check the evolution of belief levels through the lifespan of each action choice

return RewardTableRow.latest.all
  .collect(r_row | r_row.versions.size).max();

It operates as follows:

(1) RewardTableRow.latest returns the latest version of the RewardTableRow type node in the temporal graph. There are only two versions for this node: the one at the beginning of time with no instances, and the second one with all the instances. RewardTableRows are not created or deleted, they are simply modified with different reward values.
(2) .all returns all instances of that type at that point in time.
(3) .collect(x | expression) visits each instance, computing an expression and collecting the results into a new list.
(4) r_row.versions.size returns the number of versions for that instance. This would be the number of times that the reward values have changed.
(5) .max() computes the maximum value over the list with all the numbers of versions of the various RewardTableRows.

Essentially, if the query returns 1 we know that the reward values have remained the same, whereas if it returns 2 or higher we will know that they have changed at some point, and depending on the value we will know how often that happened. For this experiment, the query returned 442: there was a reward table row whose value had changed that many times.

Developers can easily expand upon the queries to produce more nuanced results. Listing 3 shows a more advanced example that computes some basic descriptive statistics of the reward table rows over time. This query makes use of EOL context operations to define the "reward shifts" of a specific version of a RewardTableRow. If we have only one version, it is the empty sequence. If it has more than one version, then it is the sequence of differences between the values of each version and the one immediately before it. For the values (0.1, 0.12, 0.11) we would have shifts of (0.02, -0.01). This can give us an idea of whether the reward recalculation is keeping shifts bounded, or if the values are wildly shifting from one timepoint to the next. In the case of the current log, the SAS kept shifts bounded to a symmetric range within ±0.034, with an average of -5.31 × 10^-9.

Listing 3: EOL query to check the min/max/average shift in reward values through the life of the SAS

var rewardShifts = RewardTableRow.latest.all
  .collect(row | row.getRewardShifts()).flatten();

return Sequence { rewardShifts.min(),
  rewardShifts.max(), rewardShifts.average() };

operation RewardTableRow getRewardShifts(): Sequence {
  var v = self.versions;
  if (v.size <= 1) {
    return Sequence {};
  } else {
    return v.subList(0, v.size - 1)
      .collect(v | v.value - v.prev.value);
  }
}
operation Sequence average() { return self.sum() / self.size(); }

Continuing with the theme of checking if the SAS is behaving appropriately, it may be important to notice situations in which the system may be "thrashing" between two actions, which suggests that the decision process may benefit from a "tolerance interval" within which it will not react just yet to an observed situation. Listing 4 shows a query designed to find the longest sequence of actions that are only backed by a single observation, i.e. intervals in which the system kept changing action after each observation.

Listing 4: EOL query to detect the longest sequence of action thrashing within the SAS (line numbers are referenced in the text below)

1  var decVersions = Decision.latest.all.first.versions;
2  var dvBeforeLast = decVersions.subList(1, decVersions.size);
3  var dvWithObs = dvBeforeLast.collect(dv | Sequence {
4    dv.time, dv.actionTaken.name, dv.supportingObservations()
5  });
6
7  // Find longest sequence of actions supported by 1 observation
8  var longestThrashing : Sequence;
9  var lastStart = -1;
10 for (i in 0.to(dvWithObs.size() - 1)) {
11   var nObs = dvWithObs.at(i).at(2);
12   if (nObs > 1) {
13     if (lastStart > -1 and i - lastStart > longestThrashing.size) {
14       longestThrashing = dvWithObs.subList(lastStart, i);
15     }
16     lastStart = -1;
17   } else if (lastStart = -1) {
18     lastStart = i;
19   }
20 }
21 return longestThrashing;
22
23 // Count supporting observations for a decision
24 operation Decision supportingObservations() : Integer {
25   var timeNextDecision = self.next.time;
26   var result = 0;
27   var observation = self.observation;
28   while (observation.time < timeNextDecision) {
29     observation = observation.next;
30     result += 1;
31   }
32   return result;
33 }

It is a rather complex query, but it can be broken down as follows:

(1) Lines 1–5 go from the earliest to the second last versions of the only Decision in the RDM SAS. For each of them, they compute a triplet with the timepoint, the name of the action taken, and the number of supporting observations.
(2) The number of supporting observations is defined by the context operation in lines 24–33, which counts the number of observations that existed before the next version of that decision.
(3) Lines 8–21 find the longest sequence of triplets with 1 supporting observation. For our trace, the query finds a sequence of 8 timepoints when the SAS is switching back and forth between RT and MST after each observation.

Queries can also be used to find less intuitive scenarios. Being probabilistic, the R-POMDP may infer that a certain NFR is not being met even though the current observation may say otherwise: this simulates sensor failures and noise. Listing 5 shows a query which found 18 time slices when the Minimization of Energy Consumption NFR satisfaction did not match the Ranges of Energy Consumption measurement. Either the NFR was met even though we were in the high ranges of REC, or the NFR was not met even though we were in the low ranges of REC. As a minor detail, for each version of the measurement we check the belief level at the same timepoint by using travelInTime on the belief node.

Listing 5: EOL query to find cases when observations clash against the understanding of satisficement within the SAS

var mecBelief = NFRBelief.latest.all.selectOne(
  nfrb | nfrb.nfr.name = 'Minimization of Energy Consumption'
);
var recMeasurement = Measurement.latest.all.selectOne(
  m | m.metric.name = 'Ranges of Energy Consumption'
);

var contradictions : Sequence;
for (v in recMeasurement.versions) {
  var vMecBelief = mecBelief.travelInTime(v.time);
  var meets = vMecBelief.estimatedProbability
    >= vMecBelief.nfr.threshold;
  if (meets and v.measurementPosition >= 2) {
    contradictions.add(Sequence {
      v.time, meets, v.eContainer.description});
  } else if (not meets and v.measurementPosition <= 1) {
    contradictions.add(Sequence {
      v.time, meets, v.eContainer.description});
  }
}

return contradictions;

4.3 Time-aware queries for users

Other queries may be more generally useful to the wider community around the SAS, and could be fed into dashboards. They would
Garcia Paucar Listing 6: EOL query to compute statistics about NFR satis- Listing 7: EOL query to find intervals of MR satisficement ficement above their thresholds states and reactions by the SAS upon the observations made. 1 return NFRBelief.latest.all.collect(nfrb | nfrb.stats()); 1 var vNfrMecB = NFRBelief.latest.all.selectOne(nfrb 2 2 | nfrb.nfr.name = 'Minimization of Energy Consumption' 3 operation NFRBelief stats() { 3 ).versions.reverse(); 4 var versions = self.versions; 4 5 var nAbove = versions.select(v | 5 var currentStates = computeStates(vNfrMecB.first.time); 6 v.estimatedProbability >= v.nfr.threshold).size; 6 var newStates : Map; 7 var nBelow = versions.size − nAbove; 7 var results : Sequence; 8 return Map { 'name' = self.nfr.name, 8 var length = 0; 9 'above' = nAbove, 'below' = nBelow }; 9 for (v in vNfrMecB) { 10 } 10 newStates = computeStates(v.time); 11 if (newStates.equals(currentStates)) { 12 length += 1; 13 } else { essentially start off from the NFRs and give increasingly more 14 var lastDecision = Decision.latest.all.first.travelInTime(v.time); detailed explanations of to what degree they were met, what was 15 results.add(Sequence { currentStates, length, done to correct situations when they were not met, and why those 16 v.time, // time of last decision taken in this interval corrective actions were chosen. 17 lastDecision.observation.description, Listing 6 shows a query which indicates how often the various 18 lastDecision.actionTaken.name // name of action NFRs were met. The query takes all the NFRBelief instances, and 19 }); visits all their versions, counting how many are above and below 20 currentStates = newStates; their thresholds7 . We compute a simple triplet with the name of 21 length = 1; the NFR and the number of times we believed it to be above/below 22 } the threshold. Regarding MEC, out of the 888 unique belief levels 23 } stated by the SAS, 670 passed the threshold and 218 did not. 
665 24 return results; 25 belief levels passed MR, and 223 did not. Deeper self-explanation requires looking at how the satisfice- 26 operation computeStates(instant: Integer): Map { ment of the MRs evolved over time, and how the system reacted 27 var nfrbs = NFRBelief.latest.all to it. Listing 7 shows a query that will produce a timeline of how 28 .collect(nfrb | nfrb.travelInTime(instant)); the NFRs changed between being met and not met, as shown in 29 var result : Map; Listing 8. 30 for (nfrb in nfrbs) { Looking at line 1 of the output, we see that the system started 31 result.put(nfrb.nfr.name, with both MR and MEC unmet, that it stayed like that for 1 obser- 32 nfrb.estimatedProbability >= nfrb.nfr.threshold); vation, and that since it observed that REC was low and NCC was 33 } high, it decided to go with RT as an action. Line 2 shows that the 34 return result; system started meeting MR and MEC, but then observed energy 35 } usage (REC) to be in the high ranges, so it went into MST. Interest- ingly, MEC started to fail later on, even though the observed energy Listing 8: Excerpt of output from Listing 7 about justifica- usage was not that high: again, this may be due to the probabilistic tion of the actions taken by the system. nature of R-POMDP observed in Listing 5. In general, the main advantage of the presented approach is that 1 [[{Maximization of Reliability=false, Minimization of Energy it allows for rapid development and iteration of new queries on the Consumption=false}, 1, 1532385574820, REC LOWER X history of the model, making it possible to create the explanations AND NCC GREATER S, Redundant Topology], for a category of SAS as required, and then later package them as 2 [{Maximization of Reliability=true, Minimization of Energy premade, reusable visualizations. 
Consumption=true}, 1, 1532385575022, REC IN Y_Z AND NCC GREATER S, Minimum Spanning Tree Topology],
3 [{Maximization of Reliability=true, Minimization of Energy Consumption=false}, 1, 1532385575166, REC LOWER X AND NCC GREATER S, Minimum Spanning Tree Topology],
4 ...]

7 Interestingly, this query could easily accommodate dynamic NFR thresholds without any changes, since we visit the NFR through the version of the belief.

5 RELATED WORK

This paper is based on a combination of various results from the areas of self-explanation for decision making systems, and model versioning. In this section we will relate the work to several key contributions in these fields.

5.1 Decision making, self-explanation, interactive diagnosis

The area of research about self-explanation and interactive diagnosis is still in its infancy. The need for it is exacerbated by the use of artificial intelligence and machine learning. However, few research initiatives exist. The authors in [3] use goal-based requirements models at runtime to offer self-explanation of how a system is meeting its requirements. Our case study also contemplates the use of runtime goal-based models, but supported by POMDPs. Different from the work in [3], our work uses Bayesian learning. Further, new future versions of the temporal graph models will be seen as runtime models to be consulted at runtime [4] to support decision making.

In [23], and as in our case, the authors present a temporal model to support interactive diagnosis of self-adaptive systems. The authors describe a temporal data model to represent, store and query decisions as well as their relationship with the context, requirements, and adaptation actions. So far, we do not include the context or the requirements; however, this is part of our future research avenues. They have used their approach in the area of smart grids, while we have used RDMs as the case study. While they use Greycat only, we have extended the Hawk model indexer with the ability to use Greycat as a backend. Using a model indexer makes it possible to reason over temporal graphs without the need to make changes to the existing system.

In the last few years, there has been increasing interest in having time-awareness as a native capability of the modelling framework itself. In 2012, Holmes et al. implemented a copy-on-write versioning scheme for models at an element level using UUIDs [16]. Hartmann showed in 2014 [14] a first version of the Kevoree Modelling Framework that supported reusable versioning of individual model elements, with the ability to travel back and forth in time. This would eventually evolve into their standalone Greycat temporal graph database. Our proposal takes this element-level versioning idea as a base, and proposes a general approach to use it for self-explanation across a variety of SAS, adding a generic metamodel for execution traces and an easy-to-use, database-agnostic query language. We have also integrated Greycat with a model indexer, making it possible to reason over temporal graphs without needing to re-engineer existing systems.

Another recent work in the area of temporal graph stores for models is ChronoSphere, developed by Haeusler et al. [12]. Similar to Greycat, it is also based on a key-value store where the key combines the timepoint and the element identifier. Unlike Greycat, which is a "pure" temporal graph database, the authors report capabilities as a model repository, supporting branches (without merges, for now), transactions, and some capabilities for metamodel evolution. The authors also mention the application of ChronoSphere in an "industrial IT Landscape Management tool" called Txture for model-based visualizations. We intend to evaluate ChronoSphere as an alternative to Greycat in future versions of our approach, in terms of performance and feature set.

5.2 Model versioning

Models are complex artefacts developed within teams, so keeping track of the various revisions that a model goes through is very important. According to the survey by Brosch et al. [5], versioning approaches can be classified across two orthogonal dimensions: the way they represent the artefacts, and the way they identify, represent and merge differences between versions. Artefact representations can be text-based as in most well-known tools (e.g. Subversion or Git), where artefacts are seen as collections of lines, or can be graph-based, as collections of nodes and edges, potentially with attributes and labels. Merging two versions developed in parallel from a common ancestor can be done in two ways: by comparing their states, or by combining the operations that were applied on the common ancestor on each side.

In terms of tools, many practitioners use simple and mature text-based version control systems (VCS) to keep track of their models (e.g. Git), together with standalone state-based model comparison and merging tools (e.g. EMF Compare8). Others accept additional complexity for the sake of additional functionality and use dedicated model repositories, which handle model revisions in terms of model elements and their references. Some well-known examples are Eclipse Connected Data Objects9, which stores model revisions inside a backing store (usually a relational database), or EMFStore [19], which uses a collection of XMI files. EMFStore is interesting in that it keeps both the states of the various revisions and the individual changes that were applied between those revisions, so it may use those for merging.

One last idea that Brosch et al. identified in their survey as an open research area was intention-aware model versioning, where merges could be simplified by encoding what the modelers wanted to accomplish with their changes. We have not found many research initiatives in the area since then, but we find that the idea of encoding the intentions of a change in the model could certainly be relevant and useful for self-explanation. For future work, we are considering model versioning approaches where the self-adaptive model-based system would encode its intentions upon the various changes. These intentions could also be indexed by Hawk into the temporal graph, and could be accessed from EOL queries.

8 https://www.eclipse.org/emf/compare/downloads/
9 https://www.eclipse.org/cdo/

6 CONCLUSIONS AND FUTURE WORK

In this work, we have described the key requirements for a reusable framework for self-explanation in self-adaptive systems: a generic and extensible execution trace metamodel, a temporal graph to store these traces, a time-aware query language that allows reasoning about the history of the models, and a set of reusable visualizations for the main types of self-adaptive systems in the wild. We have provided proof-of-concept implementations for the first three. We have also demonstrated different queries aimed at explaining the self-adaptive nature of the systems to developers and end users.

The present work can be considered as a first step towards that reusable framework, suggesting several lines of future work. First, it would be useful to have a taxonomy of the various types of queries that different audiences may ask of a self-adaptive system, and the different levels of detail that we could use for our answers. Some of those queries may cross over multiple types of SAS, while others may be specific to a class of SAS. Once we determine which queries
are the most valuable and reusable, we would develop visualizations based on them.

The trace execution metamodel shown in the paper captured most concepts used by the RDM SAS, but it may require further refinement to cover other SAS. We need to apply the approach to a wider range of SAS and to allow the extension of the trace execution metamodel with "profiles" for other types of self-adaptation approaches. This is a similar approach to UML and the use of profiles for specific domains.

The query language is based on a set of basic primitives around versions, but it lacks the richness of a more formal model such as linear temporal logic, with richer predicates such as "always", "never" or "eventually".

We also envision a different application of temporal graphs to produce better simulation models for the development of self-adaptive systems. If we kept track of the actions and their impacts in a self-adaptive system "deployed in the wild", we could produce a better probability matrix between NFR satisficement levels, actions and observations. Further, we envision that the temporal graph models will act as runtime models to support self-explanation, interactive diagnosis and even decision-making. The language will provide a way to access and change the runtime model during execution.

REFERENCES

[1] Barmpis, K., Shah, S., Kolovos, D.S.: Towards Incremental Updates in Large-Scale Model Indexes. In: Taentzer, G., Bordeleau, F. (eds.) Modelling Foundations and Applications. pp. 137–153. No. 9153 in Lecture Notes in Computer Science, Springer International Publishing (Jul 2015)
[2] Bencomo, N., Belaggoun, A., Issarny, V.: Dynamic decision networks to support decision-making for self-adaptive systems. In: SEAMS (2013)
[3] Bencomo, N., Welsh, K., Sawyer, P., Whittle, J.: Self-explanation in adaptive systems. In: 17th IEEE International Conference on Engineering of Complex Computer Systems, ICECCS 2012, Paris, France, July 18-20, 2012. pp. 157–166 (2012). https://doi.org/10.1109/ICECCS.2012.34
[4] Blair, G., France, R.B., Bencomo, N.: Models@ run.time. Computer 42, 22–27 (10 2009). https://doi.org/10.1109/MC.2009.326
[5] Brosch, P., Kappel, G., Langer, P., Seidl, M., Wieland, K., Wimmer, M.: An Introduction to Model Versioning. In: Bernardo, M., Cortellessa, V., Pierantonio, A. (eds.) Formal Methods for Model-Driven Engineering, vol. 7320, pp. 336–398. Springer Berlin Heidelberg, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-30982-3_10
[6] Daniel, G., Sunyé, G., Benelallam, A., Tisi, M., Vernageau, Y., Gómez, A., Cabot, J.: NeoEMF: A multi-database model persistence framework for very large models. Science of Computer Programming 149, 9–14 (Dec 2017). https://doi.org/10.1016/j.scico.2017.08.002
[7] Forcher, B., Agne, S., Dengel, A., Gillmann, M., Roth-Berghofer, T.: Semantic logging: Towards explanation-aware DAS. In: 2011 International Conference on Document Analysis and Recognition, ICDAR 2011, Beijing, China, September 18-21, 2011. pp. 1140–1144 (2011). https://doi.org/10.1109/ICDAR.2011.230
[8] Fredericks, E.M.: Mitigating uncertainty at design time and run time to address assurance for dynamically adaptive systems. PhD thesis, Michigan State University (2015)
[9] Garcia-Dominguez, A., Barmpis, K., Kolovos, D.S., Wei, R., Paige, R.F.: Stress-testing remote model querying APIs for relational and graph-based stores. Software & Systems Modeling pp. 1–29 (Jun 2017). https://doi.org/10.1007/s10270-017-0606-9
[10] García-Domínguez, A., Bencomo, N.: Non-human modelers: Challenges and roadmap for reusable self-explanation. In: Software Technologies: Applications and Foundations - STAF 2017 Collocated Workshops, Marburg, Germany, July 17-21, 2017, Revised Selected Papers. pp. 161–171 (2017). https://doi.org/10.1007/978-3-319-74730-9_14
[11] Giese, H., Bencomo, N., Pasquale, L., Ramirez, A.J., Inverardi, P., Wätzoldt, S., Clarke, S.: Living with Uncertainty in the Age of Runtime Models. pp. 47–100. Springer International Publishing, Cham (2014). https://doi.org/10.1007/978-3-319-08915-7_3
[12] Haeusler, M., Trojer, T., Kessler, J., Farwick, M., Nowakowski, E., Breu, R.: Combining Versioning and Metamodel Evolution in the ChronoSphere Model Repository. In: SOFSEM 2018: Theory and Practice of Computer Science. pp. 153–167. Lecture Notes in Computer Science, Edizioni della Normale, Cham (Jan 2018). https://doi.org/10.1007/978-3-319-73117-9_11
[13] Hartmann, T., Fouquet, F., Jimenez, M., Rouvoy, R., Traon, Y.L.: Analyzing Complex Data in Motion at Scale with Temporal Graphs. In: Proceedings of the 29th International Conference on Software Engineering & Knowledge Engineering (SEKE'17). pp. 596–601 (Jul 2017). https://doi.org/10.18293/SEKE2017-048
[14] Hartmann, T., Fouquet, F., Nain, G., Morin, B., Klein, J., Barais, O., Le Traon, Y.: A Native Versioning Concept to Support Historized Models at Runtime. In: Dingel, J., Schulte, W., Ramos, I., Abrahão, S., Insfran, E. (eds.) Model-Driven Engineering Languages and Systems, vol. 8767, pp. 252–268. Springer International Publishing, Cham (2014). https://doi.org/10.1007/978-3-319-11653-2_16
[15] Holme, P., Saramäki, J.: Temporal networks. Physics Reports 519(3), 97–125 (Oct 2012). https://doi.org/10.1016/j.physrep.2012.03.001
[16] Holmes, T., Zdun, U., Dustdar, S.: Automating the Management and Versioning of Service Models at Runtime to Support Service Monitoring. In: 2012 IEEE 16th International Enterprise Distributed Object Computing Conference. pp. 211–218. IEEE, Beijing, China (Sep 2012). https://doi.org/10.1109/EDOC.2012.32
[17] Ji, M., Veitch, A.C., Wilkes, J., et al.: Seneca: remote mirroring done write. In: USENIX Annual Conference. pp. 253–268 (2003)
[18] Kephart, J.O., Chess, D.M.: The vision of autonomic computing. Computer 36(1), 41–50 (Jan 2003). https://doi.org/10.1109/MC.2003.1160055
[19] Koegel, M., Helming, J.: EMFStore: A Model Repository for EMF Models. In: Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 2. pp. 307–308. ICSE '10, ACM, New York, NY, USA (2010). https://doi.org/10.1145/1810295.1810364
[20] Kolovos, D.S., Paige, R.F., Polack, F.: The Epsilon Object Language (EOL). In: Model Driven Architecture - Foundations and Applications, Second European Conference, ECMDA-FA 2006, Bilbao, Spain, July 10-13, 2006, Proceedings. pp. 128–142 (2006). https://doi.org/10.1007/11787044_11
[21] Kostakos, V.: Temporal graphs. Physica A: Statistical Mechanics and its Applications 388(6), 1007–1023 (Mar 2009). https://doi.org/10.1016/j.physa.2008.11.021
[22] Monahan, G.E.: State of the Art—A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms. Management Science 28(1), 1–16 (Jan 1982). https://doi.org/10.1287/mnsc.28.1.1
[23] Mouline, L., Benelallam, A., Fouquet, F., Bourcier, J., Barais, O.: A temporal model for interactive diagnosis of adaptive systems. In: 2018 IEEE International Conference on Autonomic Computing, ICAC 2018, Trento, Italy, September 3-7, 2018 (2018)
[24] Paucar, L.H.G., Bencomo, N.: RE-STORM: mapping the decision-making problem and non-functional requirements trade-off to partially observable Markov decision processes. In: Proceedings of the 13th International Conference on Software Engineering for Adaptive and Self-Managing Systems. pp. 19–25. SEAMS '18, ACM, New York, NY, USA (2018). https://doi.org/10.1145/3194133.3195537
[25] Paucar, L.H.G., Bencomo, N., Fung Yuen, K.K.: Juggling Preferences in a World of Uncertainty. In: RE NEXT, Lisbon (2017)
[26] Ramirez, A., Cheng, B., Bencomo, N., Sawyer, P.: Relaxing claims: Coping with uncertainty while evaluating assumptions at run time. In: MODELS (2012)
[27] Rose, E., Segev, A.: TOODM: A Temporal Object-Oriented Data Model with Temporal Constraints. Tech. Rep. LBL-30678; CONF-9110316-1, Lawrence Berkeley Lab., CA (United States) (Apr 1991), https://www.osti.gov/scitech/biblio/5969182
[28] Roth-Berghofer, T., Tintarev, N., Leake, D.B. (eds.): Explanation-aware Computing, Papers from the 2011 IJCAI Workshop, Barcelona, Spain, July 16-17, 2011 (2011)
[29] Sawyer, P., Bencomo, N., Whittle, J., Letier, E., Finkelstein, A.: Requirements-aware systems: A research agenda for RE for self-adaptive systems. In: Proceedings of the 2010 18th IEEE International Requirements Engineering Conference. pp. 95–103. RE '10, IEEE Computer Society, Washington, DC, USA (2010). https://doi.org/10.1109/RE.2010.21
[30] Steinberg, D., Budinsky, F., Paternostro, M., Merks, E.: EMF: Eclipse Modeling Framework. 2nd edn. Addison Wesley, Upper Saddle River, NJ (Dec 2008)
[31] Welsh, K., Bencomo, N., Sawyer, P., Whittle, J.: Self-explanation in adaptive systems based on runtime goal-based models. Trans. Computational Collective Intelligence 16, 122–145 (2014). https://doi.org/10.1007/978-3-662-44871-7_5