Trusted and Auditable Decision Aids over Data Streams

Dominic J. Duxbury (dominic.duxbury@manchester.ac.uk), Norman W. Paton (norman.paton@manchester.ac.uk), John A. Keane (john.keane@manchester.ac.uk)
University of Manchester, M13 9PL, Manchester, UK

First International Workshop on Data Science for Industry 4.0. Copyright ©2019 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors. Published in the Workshop Proceedings of the EDBT/ICDT 2019 Joint Conference (March 26, 2019, Lisbon, Portugal) on CEUR-WS.org.

ABSTRACT

Data stream management systems exist to support dynamic analysis of streaming data, often to inform decision-making. Decision support systems exist to enable decisions to be made that take into account user priorities. However, although these categories of system are now quite mature, there has been little work investigating their use together. In this paper we bring together a well established streaming platform (Storm) and a widely used decision-support methodology (Analytic Hierarchy Process) to provide dynamic decision support over data streams. In so doing, we also investigate approaches to making recommendations auditable (using provenance) and trustable (using explanations). The resulting stream decision support system is illustrated using an application that supports train journey planning.

1 INTRODUCTION

Data streams exist as an abstraction to support analysis of dynamic data as it is produced [11]. Decision support systems exist to support users in navigating a space of options [3]. These seem to be complementary paradigms, which can be brought together to support decision making with dynamic data. Current practice in stream data processing makes extensive use of Stream Processing Engines (SPEs), which provide a framework for acting upon elements in a stream. For decision support, an interesting problem is how to build on these capabilities to support real-time decision support over streams.

For real-time decision support systems, the choices made by decision makers often affect the state of the system. It is therefore useful to model decision makers not just as users, but as components of a cyber-physical-social system (CPSS). CPSS span the physical, information, cognitive and social domains. In the CPSS field, human users are considered a component of the system, falling within the cognitive domain [7]. Human components can be a necessary part of a system, such as when making life or death decisions. Decision support systems are therefore often vital, as they bridge the information and cognitive domains by distilling data to assist decision makers.

Decision support systems are enabled by decision analysis, the field concerned with the study of complex decisions. Multi-criteria decision analysis is a sub-discipline of decision analysis comprising techniques for evaluating solutions with multiple conflicting criteria [3]. A common example is purchasing a car: the safest car is not often the cheapest, and so these criteria conflict. Criteria can have different importance to different decision makers, so we require a method for users to specify their preferences. If the values of these criteria are also changing then we call the problem dynamic. In this paper we outline our approach to building a decision support platform for these dynamic multi-criteria optimisation problems.

Decision support systems are only useful if they are trusted by a decision maker. Trust is especially challenging when working with dynamic data; a decision maker does not have time to ascertain whether a black box system has made a mistake, and therefore it is beneficial to provide provenance data to the decision maker, ensuring that the information motivating a recommendation is readily available. Data provenance provides a historical record of data and its origins, which allows the user to assess data quality and suitability. In addition to the underlying evidence, it is also important that the user has some understanding of the space of possible solutions; as a result, some form of explanation mechanism is required that explains how a recommendation has been arrived at, and/or describes the relationship between alternative options.

All this is required in a context where there may be genuine uncertainty relating to criteria that inform a recommendation. As such, to maintain trust it is important to ensure that the uncertainty intrinsic in a recommendation is either presented to the user or reflected within the decision-making process.

Drawing this together, we have the following five desiderata for dynamic multi-criteria decision support systems:

(1) declarative specification of preferences,
(2) dynamic revision of recommendations,
(3) provenance capturing the data underpinning decisions,
(4) explanation of how a proposal was made, and
(5) explicit support for uncertain data.

To investigate how these desiderata can be supported in stream decision support, a running example based on train journey planning is introduced in Section 2. An architecture for dynamic decision support is described in Section 3. The application of the architecture to support the above desiderata is discussed in Section 4. Section 5 describes some related work, and conclusions are presented in Section 6.

2 MOTIVATING EXAMPLE

To illustrate multi-criteria decision support over streams, we consider an application relating to train journey planning. We assume that a user can state where they need to go from and to, along with the proposed start time. We also assume that the most suitable journey for a user may depend on different criteria, specifically the arrival time of the journey, the price of the journey, and the number of changes.

For example, in Figure 1, a decision maker must choose a route from A to F in a way that takes into account price, arrival time and number of changes.

Figure 1: Example Train Routing Scenario. Circles and arrows depict stations and trains respectively.

Solution   Price (£)   Changes   Arrival Time
ABF        15          1         14:00
ABDF       16          2         14:00
ACDF       9           2         14:40

Table 1: The solutions to Figure 1.

Table 1 shows the solutions to this example. We note that the solution ABF dominates ABDF, as it is equal or better for all criteria values. This leaves us with two potential solutions: ABF and ACDF. A business person may prefer ABF because it is quicker, whereas a student may prefer to save money and take ACDF. There is no optimal solution for everyone and so we require user specification of criteria preferences (Desideratum 1).

One such criterion, arrival time, indicates the expected arrival time of a journey. This is subject to change, as trains may be delayed or lines closed. Ticket prices are also subject to change up until the time of purchase. If a train is delayed or the price increases, the recommended solution may no longer be optimal; dynamically revising recommendations (Desideratum 2) to reflect the most recent information is therefore clearly beneficial. The user may also move between stations as a part of their interaction with the system, hence requiring an entirely new set of solutions.

A decision maker may see these solutions and choose option ACDF because they believe it will only take 10 minutes. However, this route could be unreliable due to engineering works, so it may be important for the user to understand the source and derivation of criteria values (Desideratum 3) to improve trustability, or to understand the uncertainty that is characteristic of this particular train service (Desideratum 5).

Finally, after expressing their preferences, accepting criteria values and understanding uncertain aspects, a user is left with a recommended journey. It may be difficult to trust this recommendation without understanding why it was selected. Therefore we should provide the user with an explanation of where the recommendation falls in the solution space, so that they can understand the trade-offs being made, and how this ties into their criteria preferences (Desideratum 4).

3 ARCHITECTURE

To evaluate our approach, a prototype platform has been developed. This platform implements our desiderata from Section 1, whilst providing decision support for train route planning. The system utilises a micro-services architecture, shown in Figure 2.

Figure 2: Prototype Architecture

The decision maker operates the decision support system through the user interface. The user inputs details for a planned trip: an origin station, a destination station and a departure time. The user must also specify their preferences with regard to the criteria. This information is sent with a request to open a web-sockets connection to the Application Controller. The Application Controller holds the state of the train journeys (solutions) within the system. The controller uses the planned trip to build an HTTP request to send to the Timetable Service.

Our architecture requires a solution service to generate the initial solution space. The Timetable Service is the implementation of the solution service for the train route planning scenario. The service generates a list of train journeys between the requested origin and destination stations at the specified departure time. Initial values are then calculated for all criteria. The Timetable Service returns an unranked list of train journeys, which is passed from the Application Controller to the Live Train Service. A streaming component is also required to update the dynamic criteria and to produce a new ranking in real time. The Live Train Service is an implementation of this component for the train scenario. In this case the Live Train Service must update the expected train arrival time. The Live Train Service is initialised with a list of train journeys, which are ranked by the Ranking Service. A stream of UK-wide train updates from National Rail is filtered, and matching updates are used to update criteria values. The updated list of train journeys is then re-ranked by the Ranking Service. The output stream of ranked train journeys is communicated to the User Interface over web-sockets.

The Ranking Service accepts a specification of preferences and a list of solutions, and produces a ranking. This ranking is calculated through the application of the Analytic Hierarchy Process, a popular method for multi-criteria decision analysis. The criteria and criteria behaviour are specified through the configuration. For example, we specify that price is a criterion and should be minimised. This allows the service to remain generic. The other generic component is the provenance sub-system. The provenance sub-system generates, stores and serves provenance data within the platform. This sub-system is made up of a message queue, a database (Prov DB) and two services: one for generating provenance (Prov Generator Service) and one for serving it (Prov Provider Service). The sub-system receives messages from the streaming service, which are processed to produce provenance graphs.

3.1 Architecture Components

In this subsection, we provide further details of the components in Figure 2.

Live Train Service. The Live Train Service applies Apache Storm to transform streams of tuples. Apache Storm is an open source SPE which utilises three abstractions: spouts, bolts and topologies. Spouts produce streams. Bolts consume any number of streams to produce new output streams. A topology describes a network of spouts and bolts. Within our streaming component we instrument these operators to extract provenance data. We extend the base classes for bolts and spouts to produce two new provenance aware classes: ProvenanceAwareBolt and ProvenanceAwareSpout. An example of a bolt extending this class is shown in Listing 1. The execute method defines how a bolt processes each tuple, and declareOutputFields declares the shape of tuples in the output stream. An operator inheriting from these classes will write provenance information concerning its inputs and outputs to the provenance sub-system.

    public class ExampleBolt extends ProvenanceAwareBolt {
        public void execute(Tuple tuple) { }
        public void declareOutputFields(Declarer declarer) { }
    }

Listing 1: Code for a provenance aware bolt

For the train route scenario we have three operators: NationalRailSpout, DelayBolt and RankingBolt. The NationalRailSpout produces a stream of delays, the DelayBolt applies relevant delays to a list of journeys, and the RankingBolt interfaces with the Ranking Service to calculate a score for each journey. Table 2 shows the input and output tuples for each operator. We instrument all the operators to supply us with provenance regarding the history of solutions, their criteria values and the resulting ranking.

Operator            Input               Output
NationalRailSpout   N/A                 …
DelayBolt           NationalRailSpout   …
RankingBolt         DelayBolt           …

Table 2: Input and Output types for each operator

Figure 3: Provenance graph for a train schedule update

Ranking Service. To calculate a recommendation we apply the Analytic Hierarchy Process (AHP) [14]. AHP is a structured technique for organising and analysing complex decisions. AHP consists of an overall goal, a group of options or alternatives for reaching the goal, and a group of factors or criteria that relate the alternatives to the goal; the criteria can be further broken down. These criteria generally have different values for different decision makers and so the algorithm requires users to express their preferences. The user preferences are expressed in the form of pairwise comparisons. For instance, a decision maker could express that "Price is more important than Travel Duration". Pairwise comparisons are easy for a user to express and model the user's knowledge within the system. The comparisons are then used to generate a weighting for each criterion.

To produce a ranking, criteria values must also be scored. To do this the values are first normalised according to the range of values across all solutions using the following formula:

\[ Norm(x) = \frac{x - minX}{maxX - minX} \]

where minX and maxX are the smallest and largest criteria values respectively. The values are then compared pairwise to generate a comparison matrix. For three solutions S_1, S_2 and S_3 and a criterion X with normalised criteria values x_1, x_2, x_3, we would generate a comparison matrix C whose rows and columns are indexed by S_1, S_2, S_3:

\[ C = \begin{pmatrix} 1 & f(x_1, x_2) & f(x_1, x_3) \\ f(x_2, x_1) & 1 & f(x_2, x_3) \\ f(x_3, x_1) & f(x_3, x_2) & 1 \end{pmatrix} \]

We provide two separate formulas for comparing criteria values, depending on whether the values fall along a linear scale (1) or an exponential scale (2). These formulas map two normalised values (x, y) to the fundamental scale proposed by Saaty [14]. For the train route planning scenario we apply the first formula (1), because all criteria form a linear scale. For example, train prices might be £10, £15, £20 for three alternative routes and not £10, £100, £1000.

\[ f(x, y) = |(x - y) \times 8| + 1 \qquad (1) \]

\[ f(x, y) = \frac{e^{x}}{e^{y}} \qquad (2) \]

The entries of the principal eigenvector of the comparison matrix for each criterion give the score for the respective criteria value of each solution. The criteria value scores are then multiplied by the relevant criteria weightings and summed across each solution. This process produces the scores which are used to derive a global ranking.

The normalisation of criteria values can cause some brittleness in the results when we only have a small range. If the algorithm is supplied with two journeys, one costing £50 and another £51, these are seen as the best and worst possible price and so scored accordingly. It would be beneficial for the algorithm to recognise that there is little difference between these two prices. We aim to solve this by allowing those implementing the framework to specify a range of possible values for a criterion.

The decision support component operates over web-sockets. The service requires a configuration file when a connection is opened, providing information about criteria. Critically, the configuration indicates the number of criteria and whether numerical criteria should be maximised or minimised. The configuration also allows us to indicate how we should compare non-numerical criteria. Once a connection is opened, AHP is applied to a stream of solutions, producing a stream of rankings.

Provenance Sub-system. The provenance sub-system processes messages from the streaming system and stores the output in a database for future querying. To store this data we choose to conform to the PROV standard [10]. PROV defines a data model consisting of a set of vertices and edges for modelling provenance as graphs. We adapt a subset of these to map to concepts from data stream analysis. For vertices we use entities, activities and agents. For edges we use wasGeneratedBy, used and wasAssociatedWith.
The PROV data model describes entities as "an immutable piece of state", activities as "dynamic aspects of the world which produce entities" and agents as "parties which take a role in activities". We model stream elements as entities, stream operations as activities and stream operators as agents. Note that we call a set of inputs and outputs a stream operation. The stream operator refers to the operator applied to these inputs to produce the outputs. Edges describe the relationships between two vertices. wasGeneratedBy links an entity to the activity which generated it. used links an activity to an entity it consumed. wasAssociatedWith links an activity to an agent associated with it. We say a stream element wasGeneratedBy a stream operation. The operation used a stream element or window of elements. The operation also wasAssociatedWith the operator which was applied.

An example provenance graph is shown in Figure 3. This example shows the derivation of an expected train arrival time. The new arrival time wasGeneratedBy an operation which used the scheduled arrival time and the schedule delay. The operation wasAssociatedWith the delay operator (DelayBolt).

3.2 Framework Concepts

In the remainder of this section, we explain what we mean by explanation and uncertainty and how these concepts surface within our architecture.

Explanation. The AHP algorithm outputs a weight vector for criteria and a score for each solution. Whilst this is useful for constructing a ranking, these values are difficult for a human to interpret. Therefore we require some further explanation of how the system arrived at a recommendation. Fundamentally, we define an explanation as a description of how a set of criteria preferences is used by AHP to select a solution from a solution space. Perhaps the most important part is an explanation of the trade-offs and benefits of a recommendation and how this ties into the specified user preferences. For instance, in the case of train route planning, a user could specify that price is critical to them. Assuming the system recommends ABC, the cheapest option, a simple explanation would be that ABC is the cheapest train and price is the most important criterion.

Our recommendations are dynamic and so it is important that an explanation can be processed by the user quickly. This led us towards visual forms of explanation such as bar and spider charts. Spider charts visualise multi-variate data as a shape constructed from three or more quantitative variables across axes stemming from the same point. Typically a chart with a larger area represents a better solution, but these charts can be misleading as the order of criteria can greatly affect the area. For this reason we chose instead to visualise the solution space through bar charts, where the values for each criterion and solution are plotted side-by-side. Bar charts are one of the simplest forms of data visualisation, leaving less room for misinterpretation.

Uncertainty. Uncertainty is modelled using cumulative distribution functions (CDFs) drawn from historical data. These functions capture information regarding the potential values of an uncertain criterion for a particular solution. Arrival time is an uncertain criterion for train route planning. We derive a CDF of arrival times for a journey from the historical performance of the trains travelling the same route. These CDFs are a simple model, capturing the distribution of potential criteria values. Through this distribution we can view the probability of the potential risks (lateness) for a journey. CDFs serve as an alternative to criteria values for uncertain criteria, but we require a method of comparing two CDFs. To do this we extract three key values from the distribution: optimistic, expected and pessimistic values. For a CDF f we define the optimistic, expected and pessimistic values as the x such that f(x) = 0.05, f(x) = 0.5 and f(x) = 0.95 respectively. An example for train arrival times is shown in Figure 4. The user interface allows the decision maker to toggle which of these three values is fed into the ranking algorithm.

Figure 4: Cumulative Distribution Function for Arrival Time

4 MOTIVATING EXAMPLE APPLICATION

In this section we explain how the user interacts with the system and how this interface supports the five desiderata from Section 1. The user interface targets end-users, rather than decision scientists [16]. The user interface for the train route planner is shown in Figure 5.

For a decision maker planning a train journey, the first task is to specify the planned trip. The top left corner shows the trip input form, where the user can input where they wish to travel From (Origin Station), To (Destination Station) and the time they are Leaving At (Departure Time). Once these values are set the user can click Calculate Routes to generate a set of possible journeys. The next task is for the user to specify their preferences (Desideratum 1). In our user interface these pairwise user preferences are located in the bottom left. In Figure 5 the preferences are set to default, with all criteria equal. Each pair can be set through a drop-down menu to one of five values:

(1) X is much more important than Y,
(2) X is more important than Y,
(3) X is just as important as Y,
(4) X is less important than Y,
(5) X is much less important than Y.

These preferences can be changed at any point, triggering the system to re-rank the journeys.

Once the planned trip and preferences have been detailed, the user is presented with the top five ranked journeys (the fourth and fifth fall below the fold). Immediately the user can
Immediately the user can Figure 5: Route Planning User Interface view criteria values of each journey (Price , Arrival Time and times (such as commuters) whereas pessimistic values would Transfers ). These values and the resultant ranking are updated be more important in a scenario where a user is travelling for continuously once routes have been calculated (Desiderata 2). something more time critical (such as a job interview). To prevent information overload some extra details are hidden. Clicking the plus next to Journey Path displays the information 5 RELATED WORK needed to undertake a journey, including the journey path and the This paper has proposed an approach for the integration of trains of which the journey is composed. Each journey also has streaming data with decision support methodologies, with a view a View Detail button, which allows the user to view provenance information in a pop-up window (Desiderata 3). The design for this window is shown in Figure 6. Here the user can view the history of values for Arrival Time and the data sources. The values for each of the criteria are shown in the bar charts at the top of Figure 5, with the x-axes ordered according to the ranking. These charts allow the user to visually compare a rec- ommendation (the furthest left value) to the solution space (all other values). The charts are also ordered according to the weight- ing calculated through AHP, with the most important criteria appearing on the left. This means a user can both understand the trade-offs of a recommendation and how this ties into their specified preferences (Desiderata 4). Finally the user can toggle between Pessimistic , Expected and Optimistic modes for the predicted arrival time by clicking the corresponding button. These modes simply change the value extracted from the CDF, as described in Section 3.2 (Desiderata 5). 
Expected values are more useful for users making a journey many Figure 6: Provenance Data for an Arrival Time to enabling users to make decisions that reflect their priorities in ACKNOWLEDGMENTS the context of a changing physical environment. In this section, Dominic Duxbury is supported by an EPSRC iCASE award in we review related work on the intersection of cyber-physical association with BAE Systems. The authors would also like to systems (CPS) with decision support, stream data analytics and recognise Andrew Campbell and Joseph Allen for their assistance provenance for data streams. in designing the user interface. In relation to CPS, decision support is growing in significance. CPS with key decision support components are being widely REFERENCES adopted in the medical field ([4, 19]). These systems advise doc- [1] J. BenÃŋtez, X. Delgado-GalvÃąn, J. Izquierdo, and R. PÃľrez-GarcÃŋa. 2012. tors in the diagnosis and treatment of patients. Liu et al. [7] An approach to AHP decision in a dynamic context. Decision Support Systems 53, 3 (2012), 499 – 506. DOI:http://dx.doi.org/10.1016/j.dss.2012.04.015 outlines a framework in the context of command and control; [2] Mohamed Medhat Gaber, Arkady Zaslavsky, and Shonali Krishnaswamy. 2005. highlighting how decision support can be integrated within a Mining Data Streams: A Review. SIGMOD Rec. 34, 2 (June 2005), 18–26. DOI: larger CPS and the benefits of doing so. Wang [18] et al. make the http://dx.doi.org/10.1145/1083784.1083789 [3] Salvatore Greco, Matthias Ehrgott, and JoseÌĄ Rui Figueira. 2016. Multiple argument for referring to CPS as cyber-physical-social systems Criteria Decision Analysis. Springer, Springer, New York, NY. (CPSS). This paper argues the importance of the human aspect [4] Yu Jiang, Houbing Song, Rui Wang, Ming Gu, Jiaguang Sun, and Lui Sha. 2017. within CPS, identifying that users should be more closely inte- Data-centered runtime verification of wireless medical cyber-physical system. 
IEEE Transactions on Industrial Informatics 13, 4 (aug 2017), 1900–1909. DOI: grated within the systems they control. Our architecture fulfils http://dx.doi.org/10.1109/TII.2016.2573762 this paradigm by improving extraction of knowledge (pairwise [5] M. Kontaki, A. N. Papadopoulos, and Y. Manolopoulos. 2008. Continuous K-dominant Skyline Computation on Multidimensional Data Streams. In Pro- comparisons) and presentation of knowledge (recommendations). ceedings of the 2008 ACM Symposium on Applied Computing (SAC ’08). ACM, There is a substantial body of work on stream data analyses, New York, NY, USA, 956–960. DOI:http://dx.doi.org/10.1145/1363686.1363908 often investigating how specific analyses can be carried out effi- [6] Hyo-Sang Lim, Yang-Sae Moon, and Elisa Bertino. 2010. Provenance-based Trustworthiness Assessment in Sensor Networks. In Proceedings of the Sev- ciently on rapidly streaming data (e.g. [2, 15]). Here the focus has enth International Workshop on Data Management for Sensor Networks (DMSN been more on the intersection of streaming and decision support ’10). ACM, New York, NY, USA, 2–7. DOI:http://dx.doi.org/10.1145/1858158. architectures than on algorithms for stream analytics, although 1858162 [7] Zhong Liu, Dong Sheng Yang, Ding Wen, Wei Ming Zhang, and Wenji Mao. this architectural work would benefit from, and presents specific 2011. Cyber-physical-social systems for command and control. IEEE Intelligent requirements for, efficient multi-dimensional optimization over Systems 26, 4 (2011), 92–96. DOI:http://dx.doi.org/10.1109/MIS.2011.69 [8] Peter Macko and Margo Seltzer. 2012. A General-purpose Provenance Li- streams (e.g. [5]). brary. In Proceedings of the 4th USENIX Conference on Theory and Prac- It has been recognised that multi-criteria decision support tice of Provenance (TaPP’12). USENIX Association, Berkeley, CA, USA, 6–6. systems need to operate in dynamic environments. 
For example, http://dl.acm.org/citation.cfm?id=2342875.2342881 [9] Archan Misra, Marion Blount, Anastasios Kementsietsidis, Daby Sow, and Benitez et al. [1] and Raharjo et al. [13] consider making incre- Min Wang. 2008. Advances and Challenges for Scalable Provenance in Stream mental responses to changes in criteria, but there has been less Processing Systems. In Provenance and Annotation of Data and Processes, Ju- of a focus on responding to changes in criteria values. liana Freire, David Koop, and Luc Moreau (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 253–265. It has also been recognized that provenance for data streams [10] Luc Moreau, Paolo Missier, James Cheney, and Stian Soiland-Reyes. 2013. is both important for specific streaming applications where deci- PROV-N: The Provenance Notation. World Wide Web Consortium, United States. sions may be audited, but also challenging in relation to scalabil- [11] S. Muthukrishnan. 2005. Data Streams: Algorithms and Applications. now, 2600 ity [9]. Previous work has involved designing generic approaches AD Delft, The Netherlands. https://ieeexplore.ieee.org/document/8186985 to collecting and storing provenance data [8, 12]. These systems [12] Priya. Narasimhan and Peter Triantafillou. 2012. SPADE: support for prove- nance auditing in distributed environments. In Proceedings of the 13th Inter- provide a generic interface for provenance management but no national Middleware Conference. Springer, Springer, New York, NY, 101–120. integration with streaming systems. Lim et al. have looked at https://dl.acm.org/citation.cfm?id=2442634 integrating provenance with streaming systems in the context of [13] Hendry Raharjo, Min Xie, and Aarnout C. Brombacher. 2009. On modeling dynamic priorities in the analytic hierarchy process using compositional data sensor networks [6], and Blount et al. provided provenance for analysis. European Journal of Operational Research 194, 3 (2009), 834 – 846. medical event streams [17]. 
These papers engineer a solution for DOI:http://dx.doi.org/10.1016/j.ejor.2008.01.012 [14] R. W. Saaty. 1987. The analytic hierarchy process-what it is and how it is used. generating and managing provenance specific to their respective Mathematical Modelling 9, 3-5 (1987), 161–176. DOI:http://dx.doi.org/10.1016/ areas rather than seeking to integrate provenance generation 0270-0255(87)90473-8 into generic SPEs. [15] Jonathan A. Silva, Elaine R. Faria, Rodrigo C. Barros, Eduardo R. Hruschka, André C. P. L. F. de Carvalho, and João Gama. 2013. Data Stream Clustering: A Survey. ACM Comput. Surv. 46, 1, Article 13 (July 2013), 31 pages. DOI: http://dx.doi.org/10.1145/2522968.2522981 [16] Sajid Siraj, Ludmil Mikhailov, and John A. Keane. 2015. PriEsT: an interactive decision support tool to estimate priorities from pairwise comparison judg- 6 CONCLUSIONS ments. ITOR 22, 2 (2015), 217–235. DOI:http://dx.doi.org/10.1111/itor.12054 Decision support systems use user-specified criteria to compare [17] Min Wang, Marion Blount, John Davis, Archan Misra, and Daby Sow. 2007. A Time-and-value Centric Provenance Model and Architecture for Medi- candidate solutions within a multi-dimensional space of alter- cal Event Streams. In Proceedings of the 1st ACM SIGMOBILE International natives. This requirement for user-driven comparison of can- Workshop on Systems and Networking Support for Healthcare and Assisted Liv- didate outcomes is widely recognised in decision support, and ing Environments (HealthNet ’07). ACM, New York, NY, USA, 95–100. DOI: http://dx.doi.org/10.1145/1248054.1248082 seems relevant to streaming applications in transport, health- [18] Ying Ming Wang and Kwai Sang Chin. 2011. Fuzzy analytic hierarchy process: care, command and control, etc. In this paper we have identified A logarithmic fuzzy preference programming methodology. International Journal of Approximate Reasoning 52, 4 (2011), 541–553. DOI:http://dx.doi. 
five desiderata for trusted and auditable decision aids over data org/10.1016/j.ijar.2010.12.004 streams, described an architecture that supports these desider- [19] Yin Zhang, Meikang Qiu, Chun-Wei Tsai, Mohammad Mehedi Hassan, and ata, and illustrated its application to an application in journey Atif Alamri. 2017. Health-CPS: Healthcare Cyber-Physical System Assisted by Cloud and Big Data. IEEE Systems Journal 11, 1 (mar 2017), 88–95. DOI: planning. Future work includes the evaluation of the approach in http://dx.doi.org/10.1109/JSYST.2015.2460747 different applications, scalability of decision support over high- velocity data streams, and investigation of different approaches to uncertainty.
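As a closing illustration of the Ranking Service computation from Section 3.1, the following Java sketch normalises the values of one criterion, builds its comparison matrix with the linear formula (1), and derives a score for each solution. It is a sketch under stated assumptions rather than the prototype's code. The class name AhpSketch and its method names are hypothetical. The matrix is made reciprocal by using f when the row solution has the preferred (here, lower) value and 1/f otherwise, which follows the usual AHP convention but is not spelled out in the paper. The scores are approximated by power iteration towards the principal eigenvector, one standard way of deriving AHP priorities. The input prices are those of ABF, ABDF and ACDF in Table 1.

```java
import java.util.Arrays;

// Hypothetical illustration of scoring one criterion; not the prototype's Ranking Service.
public class AhpSketch {

    // Norm(x) = (x - minX) / (maxX - minX) over all values of one criterion.
    public static double[] normalise(double[] values) {
        double min = values[0];
        double max = values[0];
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            // Assumption: if every value is identical, treat them all as 0.
            out[i] = (max == min) ? 0.0 : (values[i] - min) / (max - min);
        }
        return out;
    }

    // Formula (1): maps two normalised values onto the range 1..9.
    public static double linear(double x, double y) {
        return Math.abs((x - y) * 8.0) + 1.0;
    }

    // Comparison matrix for a criterion that should be minimised, such as price.
    // Assumption: entry (i, j) is f when solution i is at least as good as j, else 1 / f.
    public static double[][] comparisonMatrix(double[] norm) {
        int n = norm.length;
        double[][] c = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                double f = linear(norm[i], norm[j]);
                c[i][j] = (norm[i] <= norm[j]) ? f : 1.0 / f;
            }
        }
        return c;
    }

    // Power iteration: approximates the principal eigenvector, scaled to sum to 1.
    public static double[] priorities(double[][] c) {
        int n = c.length;
        double[] w = new double[n];
        Arrays.fill(w, 1.0 / n);
        for (int iter = 0; iter < 100; iter++) {
            double[] next = new double[n];
            double sum = 0.0;
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    next[i] += c[i][j] * w[j];
                }
                sum += next[i];
            }
            for (int i = 0; i < n; i++) {
                next[i] /= sum;
            }
            w = next;
        }
        return w;
    }

    public static void main(String[] args) {
        double[] prices = {15, 16, 9}; // ABF, ABDF, ACDF
        double[] scores = priorities(comparisonMatrix(normalise(prices)));
        System.out.println(Arrays.toString(scores));
    }
}
```

For the price criterion this ranks ACDF above ABF above ABDF, matching the order of the prices. A full ranking would repeat the computation for each criterion, multiply each score by the criterion weighting, and sum across criteria, as described in Section 3.1.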