=Paper=
{{Paper
|id=Vol-2530/paper8
|storemode=property
|title=The CitySPIN Platform: A CPSS Environment for City-Wide Infrastructures
|pdfUrl=https://ceur-ws.org/Vol-2530/paper8.pdf
|volume=Vol-2530
|authors=Amr Azzam,Peb R. Aryan,Alessio Cecconi,Claudio Di Ciccio,Fajar J. Ekaputra,Javier Fernández,Sotiris Karampatakis,Elmar Kiesling,Angelika Musil,Marta Sabou,Pujan Shadlau,Thomas Thurner
|dblpUrl=https://dblp.org/rec/conf/iot/AzzamACCEFKKMSS19
}}
==The CitySPIN Platform: A CPSS Environment for City-Wide Infrastructures==
Amr Azzam (WU Vienna, 1020 Vienna, Austria, aazzam@wu.ac.at); Peb R. Aryan (ISE Institute, TU Wien, 1040 Vienna, Austria, peb.aryan@tuwien.ac.at); Alessio Cecconi (WU Vienna, 1020 Vienna, Austria, cecconi@ai.wu.ac.at); Claudio Di Ciccio (WU Vienna, 1020 Vienna, Austria, claudio.di.ciccio@ai.wu.ac.at); Fajar J. Ekaputra (ISE Institute, TU Wien, 1040 Vienna, Austria, fajar.ekaputra@tuwien.ac.at); Javier Fernández (WU Vienna, 1020 Vienna, Austria, jfernand@wu.ac.at); Sotiris Karampatakis (Semantic Web Company, 1070 Vienna, Austria, sotiris.karampatakis@semantic-web.com); Elmar Kiesling (ISE Institute, TU Wien, 1040 Vienna, Austria, elmar.kiesling@tuwien.ac.at); Angelika Musil (ISE Institute, TU Wien, 1040 Vienna, Austria, angelika.musil@tuwien.ac.at); Marta Sabou (ISE Institute, TU Wien, 1040 Vienna, Austria, marta.sabou@tuwien.ac.at); Pujan Shadlau (Wiener Stadtwerke Holding AG, 1030 Vienna, Austria, Pujan.Shadlau@wienerstadtwerke.at); Thomas Thurner (Semantic Web Company, 1070 Vienna, Austria, t.thurner@semantic-web.at)

===ABSTRACT===
Cyber-physical Social Systems (CPSSs) are complex systems that span the boundaries of the cyber, physical and social spheres. They play an important role in a variety of domains ranging from industry to smart city applications. As such, these systems necessarily need to take into account, combine and make sense of heterogeneous data sources from legacy systems, from the physical layer and also from the social groups that are part of or use the system. The collection, cleansing and integration of these data sources represents a major effort not only during the operation of the system, but also during its engineering and design. Indeed, while ongoing efforts are concerned primarily with the operation of such systems, limited focus has been put on supporting the engineering phase of CPSSs. To address this shortcoming, within the CitySPIN project we aim to create a platform that supports stakeholders involved in the design of these systems, especially in terms of support for data management. To that end, we develop methods and techniques based on Semantic Web and Linked Data technologies for the acquisition and integration of heterogeneous data from disparate structured, semi-structured and unstructured sources, including open data and social data. In this paper we present the overall system architecture with a core focus on data acquisition and integration. We demonstrate our approach through a prototypical implementation of an adaptive planning use case for public transportation scheduling.

===KEYWORDS===
CPSS, Linked Data, Knowledge Graphs, Public Transport, Smart City.

===1 INTRODUCTION===
Cyber-physical Systems (CPSs) are systems that span the physical and cyber-world by linking objects and processes from these spaces. A typical CPS collects data from the physical world via sensors and applies computation resources from the cyber-space to integrate and analyze this data in order to decide on optimal feedback processes that can be put in place by physical actuators. CPSs have started to diffuse into many areas, including mission-critical public transportation, energy services, and industrial production and manufacturing processes.

The results of a recent study about adaptation in CPS [16] revealed an emerging trend to add an additional social layer in a CPS architecture to address human and social factors and evolve these systems into CPSSs [21]. The resulting systems consist not only of software and raw sensing and actuating hardware, but are fundamentally grounded in the behaviour of human actors, who both generate data and make informed decisions based on data [5, 12, 22].
1st Workshop on Cyber-Physical Social Systems (CPSS2019), October 22, 2019, Bilbao, Spain. Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

The CitySPIN<sup>1</sup> project aims to lay a foundation for the development of CPSSs in the context of Smart City infrastructure services. To this end, we develop both theoretical and conceptual foundations, as well as a set of innovative components, illustrated in Figure 1, that support a CPSS design process in a uniform platform. This platform supports key stakeholders involved in the design process through a prototyping environment that provides a visual interface which allows them to (i) access a wide range of data sources from sensors, social channels, and legacy systems; (ii) integrate and analyze heterogeneous data; and (iii) visualise results. This platform is made possible by methods and tools that make use of Semantic Web and Linked Data technologies to support the collection and integration of heterogeneous data sources.

In this paper, after a brief overview of the CitySPIN architecture in Section 2, we focus on the two core aspects of this technology stack: the knowledge graph construction, covered in Section 3, and the prototyping environment, described in Section 4. Furthermore, we discuss the prototypical implementation and illustrate the application of the platform by means of an example use case involving Vienna’s largest public transport provider in Section 5. Finally, we briefly review related work in Section 6 and conclude the paper with an outlook on future research in Section 7.

===2 CITYSPIN ARCHITECTURE OVERVIEW===
The design of cyber-physical social systems raises challenges due to the high complexity introduced by social systems in terms of:

(i) the number and heterogeneity of data sources that need to be integrated: CPSSs involve large amounts of heterogeneous, poly-structured data from a variety of sources, ranging from legacy databases to highly dynamic sensor data. To create CPSS applications and services, it is paramount to efficiently integrate not just the data produced by individual processes within the organization, but to achieve integration across processes, departments, organizational boundaries, and domains. Finally, external data, such as social media streams, are also of pivotal importance in the context of CPSSs. Hence, a major challenge is to develop flexible data integration infrastructures that are responsive to the varying needs of CPSSs.

(ii) privacy concerns associated with the processing of sensitive social data: Adequate privacy protection is a fundamental requirement in the context of CPSSs, which often make use of and integrate sensitive information from various sources. Additionally, the new EU General Data Protection Regulation imposes new demands in terms of transparency of data processing and also in terms of allowing data subjects to revoke or change their consent in parts, which calls for more flexible and dynamic compliance checking. This represents a significant barrier towards the development and provision of integrated smart city services and hinders product and process innovation.

(iii) uncertainty due to social dynamics: CPSS designers need a better understanding of the social dynamics of the groups involved in the CPSS, both at the design time and the run-time of the system (e.g., for on-the-fly adaptation).

All these challenges are amply reflected in a CitySPIN use case that aims to improve the daily schedule planning for the Viennese public transport network. In particular, this use case aims to support planners in their work by allowing them to treat the transportation system as a CPSS and accounting for the dynamics of the involved travelers (especially during large-scale events). This requires, amongst others, the integration of data from various sources including data internal to the organization (e.g., historic data about event attendance), open data (e.g., expected events), as well as real-time data from mobility operators. Some of these data sources can raise privacy concerns (e.g., when harvested by apps installed on individual mobiles) and therefore user consent about the use of this data needs to be appropriately captured and considered during data processing. Finally, network planners would like to understand recurring social behaviors and patterns, for example, the typical routes followed by participants of an event.

CitySPIN tackles these challenges in the design and prototyping phases of a CPSS and aims to offer support to key stakeholders involved in these stages including decision makers, project managers, software architects, and software engineers, as depicted in Figure 1. These stakeholders are provided with a CPSS Prototyping Environment that adopts a mashup-based paradigm to allow them to easily acquire, explore, combine and visualise a variety of data sources (e.g., legacy data, streaming data, social media data, open data). The CPSS Prototyping Environment relies on and is made possible by three key components, as described next.

Scalable Linked Data Integration. We adopt Linked Data technologies to address the integration of multiple, heterogeneous data sources. To this end, we developed dedicated components for the acquisition and semantic enrichment of data as well as the integration into a CPSS Knowledge Graph. The next sections of this paper will focus primarily on CitySPIN’s data integration architecture.

Secure Data Access and Privacy. To deal with privacy concerns typically associated with social data, we develop components for capturing user consent and making use of this consent during the entire data integration chain.

Process Mining on Linked Data. Finally, to support stakeholders in gaining a better insight into group dynamics, we develop a Process Mining & Analytics component that can be used to analyze behavioral patterns and make predictions based on the CPSS knowledge graph. Process mining is the discipline connecting data science and business process management that aims at discovering, checking, and enhancing business processes based on data logged by information systems [20]. In the context of this project, we resort in particular to declarative process mining to cater for the flexibility of the processes considered in this project [15]. Declarative process models specify dynamic systems through temporal-logic-based rules that establish the constraints with which the execution must comply. Therefore, we resort to expressing those constraints as queries over the CPSS knowledge graph to monitor and analyse the behavior of ongoing processes [9]. The query answers are thus routed to the CPSS Mashup Platform to allow for further complex analytics and refinements.

<sup>1</sup> http://cityspin.net

Figure 1: CitySPIN architecture components and stakeholders.

===3 CPSS KNOWLEDGE GRAPH CONSTRUCTION===
The broad scope of CPSSs and the large variety of technical infrastructure and data involved in them give rise to unique interoperability challenges when it comes to acquiring, enriching, integrating, managing, and processing data from various sources pertaining to the social, physical, and cyber dimensions of CPSSs. In CitySPIN a CPSS knowledge graph acts as an integration hub for all this heterogeneous data. In this section we describe the technologies used to construct this knowledge graph.

We rely on UnifiedViews<sup>2</sup>, developed at Semantic Web Company (SWC), as a core building block for the knowledge graph construction. Specifically, data sources are aggregated and transformed using so-called Data Processing Units (DPUs), which are assembled into data integration pipelines<sup>3</sup>. All input data is available in a structured format for further processing by subsequent elements of the pipeline. The pipelines transform data from various source formats and lift them into Resource Description Framework (RDF) format, a semantically explicit format standardized by the World Wide Web Consortium (W3C). This results in a knowledge graph that expresses the data using common standard vocabularies as well as vocabularies tailored to the use cases. Table 1 provides an overview of the key vocabularies used for the semantic alignment of the various datasets which underlie the public transportation planning use case used as an illustrative example in this paper.

{| class="wikitable"
|+ Table 1: Key vocabularies for semantic alignment
! Name !! Prefix !! Namespace !! Documentation !! Purpose
|-
| Time Ontology || time || https://www.w3.org/2006/time# || [7] || modeling time
|-
| Geovocab geometry || geom || http://geovocab.org/geometry# || [2, 17] || describing geographical regions
|-
| Geovocab spatial || spatial || http://geovocab.org/spatial# || [3, 17] || topological relations between features
|-
| wgs84 || wgs || http://www.w3.org/2003/01/geo/wgs84_pos# || [1] || lat(itude), long(itude) about spatially-located things
|-
| Event || event || http://w3id.org/cityspin/ns/event# || http://rebrand.ly/dmkrk0 || city event data (location, participants etc.)
|-
| Cellular data || mobile || http://w3id.org/cityspin/ns/mobile# || http://rebrand.ly/pkmq2j || cellular location data
|-
| SPECIAL-CPSS || special-cpss || http://w3id.org/cityspin/ns/special-cpss# || http://rebrand.ly/5m33q2 || CPSS usage policy and consent specification
|-
| Transport || tp || http://w3id.org/cityspin/ns/transport# || http://rebrand.ly/ee83mf || structuring and annotating public transport data (e.g., stops, routes, schedules)
|-
| Disruption || td || http://purl.org/td/transportdisruption# || [6] || modelling travel and transport related events that have a disruptive impact on an agent’s planned travel
|-
| Data Cube || qb || http://purl.org/linked-data/cube# || [8] || multi-dimensional data (e.g., district heating network statistics)
|}

The knowledge graph is stored in an RDF triple store, specifically Ontotext GraphDB<sup>4</sup>. Using the standard RDF query language SPARQL<sup>5</sup>, input data is queried and further transformed or aggregated as needed by other components of the CitySPIN platform. The following paragraphs discuss the concepts applied here in more detail.

Data Integration Lifecycle. Heterogeneous sources such as social media data, sensor data and business intelligence data have to be made available to the CPSS for further processing. Connectors to the source systems hook into APIs, CSV repositories or direct database calls (data acquisition). Various steps follow to remove outliers and noise from data (data cleansing) as well as to refine their structure and align their content (data preparation). Finally, data are merged, transformed and saved into pre-processable formats (data storage). The consolidated data are then available for subsequent analysis and reuse. Therefore, those data are fed back to the acquisition stage and the integration cycle restarts.

Data Acquisition and Enrichment. As we consistently follow an ontology-based data integration approach, we extract data according to a CPSS-wide ontology and transform the data into RDF. The RDF is, in turn, an interchange format which is used as the canonical one for further processing. By following the W3C standards for the Semantic Web, our approach ensures compatibility with a wide range of tools used in the CPSS stack.

Semantic Alignment for Data Integration. Aligning contents and data alongside an ontology enables the CPSS to access enriched contextual knowledge. This additional information forms a critical part of an integrated view on CPSS data and is essential for realizing the integrated user interface presenting the planning dashboard.

Data Cleansing. All gathered data is integrated and enriched by a processing pipeline, which lies at the functional core of the CPSS (e.g., prediction, analysis, decision). Based on domain knowledge and process knowledge, data are consolidated and made available for extraction and further processing by actuators, visualization and re-feeds into the learning pipeline.

Knowledge Graph Storage. The central processing pipeline acts as an interface to other algorithms, further user-driven explorations, or visual representations of the output. The loop-back to the Data Acquisition stage of the CPSS is realized through interim storage in a central triple store and actuation of external triggers.

<sup>2</sup> https://unifiedviews.eu
<sup>3</sup> cf. https://help.poolparty.biz/display/UDDOC/Basic+Concepts+for+DPU+developers for an introduction to the core concepts
<sup>4</sup> http://graphdb.ontotext.com
<sup>5</sup> https://www.w3.org/TR/sparql11-query/

To cater for the reporting and predicting needs of a CPSS, our architecture includes an intermediate layer in which data are extracted from the Querying component and fed to the Prediction component or directly to the Dashboard of the frontend layer. The Querying component of our prototyping environment relies on the data endpoint provided by the back-end module for the execution of queries. In this component, we are using the W3C-standard SPARQL query language<sup>6</sup> for querying the integrated data. Furthermore, we can also use SPARQL Construct queries to encode rules for inferring new knowledge.
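As a minimal sketch of how a SPARQL Construct query can encode an inference rule, the example below states one rule as query text and then applies the same rule by hand over a toy set of triples, so that it runs without a SPARQL engine. The `ex:` vocabulary (`atVenue`, `nearVenue`, `servedByStop`), the rule itself and the sample triples are assumptions made for illustration; they are not taken from the CitySPIN vocabularies or queries.

```python
# The rule is written as a SPARQL CONSTRUCT query for reference only.
# The ex: vocabulary is hypothetical and is not one of the CitySPIN vocabularies.
RULE_AS_SPARQL = """
PREFIX ex: <http://example.org/ns#>
CONSTRUCT { ?event ex:servedByStop ?stop }
WHERE {
  ?event ex:atVenue   ?venue .
  ?stop  ex:nearVenue ?venue .
}
"""

EX = "http://example.org/ns#"


def apply_rule(triples):
    """Evaluate the rule by hand: join both triple patterns on ?venue."""
    stops_near = {}
    for s, p, o in triples:
        if p == EX + "nearVenue":
            stops_near.setdefault(o, set()).add(s)
    return {
        (event, EX + "servedByStop", stop)
        for event, p, venue in triples
        if p == EX + "atVenue"
        for stop in stops_near.get(venue, ())
    }


# Toy graph: a set of (subject, predicate, object) IRI strings.
graph = {
    (EX + "event1", EX + "atVenue", EX + "venueA"),
    (EX + "stop1", EX + "nearVenue", EX + "venueA"),
    (EX + "stop2", EX + "nearVenue", EX + "venueB"),
}

inferred = apply_rule(graph)
graph |= inferred  # store the inferred triples next to the asserted ones
print(sorted(inferred))
```

In a deployment, the query text would presumably be sent to the triple store's SPARQL endpoint and the constructed triples written back to the graph; the hand-written join only stands in for that step.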
===4 CITYSPIN PROTOTYPING ENVIRONMENT===
For the implementation of the prototyping environment, the CitySPIN project proposes an architecture inspired by the Presentation Abstraction Control (PAC) architectural pattern [10]. In this section, we adapt the PAC architecture to the CPSS needs and integrate it with the modular approach of Linked Widgets [19] and UnifiedViews [13] to develop a CPSS prototyping environment. In the following subsections, we illustrate the longitudinal section of the software architecture to describe the associations and information flows between the main logical components at large. In line with the PAC pattern, the three-layered architecture of the CitySPIN CPSS prototyping environment consists of:

• the Back-end layer, in which data are loaded, pre-processed, and aggregated (abstraction); details on this layer are discussed in Section 3, together with the CPSS Knowledge Graph construction, and will therefore not be explained further in this section;

• the Service layer, in which those data are queried and analyzed to infer additional knowledge and later on generate prediction models (control), cf. Section 4.1;

• the Front-end layer (presentation), from which users can access the prediction models and data analysis reports to monitor the current status of the infrastructure, explore the historic performance, and make informed decisions on the future settings (Section 4.2).

====4.1 Service: Querying and Prediction====
There is a wide range of services required in the CPSS context due to the diversity of application domains, use cases and scenarios. In our CPSS Prototyping environment, we focused on two main services: (i) Querying, and (ii) Prediction.

The Prediction component is designed to allow for the application of Machine Learning (ML) techniques aimed to derive prediction models that, based on historical data, can be used as a decision support system for CPSS stakeholders to react ahead of time to predicted arising situations [4]. Example prediction results include the forecast of numerical trends of variables under analysis, the identification of changes in the classification of recently collected data to raise alerts in case of anomalies, or the recommendation on the next operation to undergo in light of the recent developments of the data under observation.

ML algorithms require learning, validation, and testing phases on historical data, prior to, or alternated with, run-time processing or reinforcement on live data. To cater for these requirements, our architecture binds the Querying and Prediction components with data-flow associations that proceed in both ways: (i) from Querying to Prediction for data feed, and (ii) from Prediction to Querying for updates on the classifications and predictions made. Notice that this architectural choice allows for the marshalling and storage of models learned from the Prediction component for further reuse. This is the basis through which ex-post data analyses conducted via process mining can be readily available for decision support and monitoring via successive queries, as suggested in [9]. Finally, we emphasize that both the Querying and Prediction components are containers for diverse ML modules that can be used alternatively in multiple use cases, e.g., as a plugin for Linked Widgets.

<sup>6</sup> https://www.w3.org/TR/2013/REC-sparql11-query-20130321/

as parts of city-wide infrastructures.

4.2 Frontend: Visualization and Decision
Support

The Frontend layer allows users to interact with, get informed about, and interactively explore the integrated knowledge acquired from the data and augmented by the Prediction component. To this end, the use of the Linked Widget Platform (LWP) [19] provides the necessary high degree of flexibility and customizability for CPSS prototyping. LWP combines semantic web and mashup concepts to support non-expert users in efficiently making use of various open and non-open data sources. In particular, the platform allows users to collaboratively and interactively integrate data in an ad-hoc and distributed manner. Each stakeholder can contribute their data and computing resources to a shared data processing flow in a shared interface that allows them to orchestrate the interaction among components within a CPSS.

Depending on their needs, users can directly construct analytical data flows, fine-tune queries, ML parameters, and visualization parameters within a single graphical interface. Bi-directional information flows between Querying and Dashboard components allow users to save their preferences and potentially store the relevant facts that they may have discovered in the Data Store. This would be crucial, for instance, to enable reinforcement learning for future projects building upon the CPSS Prototyping Environment, e.g., application developments based on the prototype results.

===5 USE CASE===
In this section, we introduce one of our real-world use cases in the public transport domain (Section 5.1), discuss data exploration (Section 5.2), describe the construction of the knowledge graph for the use case in Section 5.3 and illustrate the prototypical implementation within the CitySPIN platform (Section 5.4).

====5.1 Mobility Use Case Description====
The goal of the CitySPIN project is to deliver a generic platform for CPSS development that can support a wide variety of use cases in the context of city infrastructure services. To develop and prototype this platform, we chose use cases that cover a broad spectrum of smart city services (viz. public transportation and district heating network control) while, on the other hand, exhibiting synergies in terms of data and component requirements.

In this paper, we focus on the CitySPIN Event-Aware Mobility Planning (CaMP) use case, which allows planners at Wiener Linien (WL) to estimate mobility demands of large-scale events in order to tailor the mobility planning accordingly. To cater for the needs of participants of such large-scale events, WL already actively adapts its transportation network schedule. In particular, the types, capacities and frequencies of vehicles in service during such events are currently decided by planners based on historic data about the number of attendees of recurring events, which are recorded in event planning protocols saved as .pdf files.

This current approach makes it difficult to plan for new or non-recurring events for which no planning protocols exist.

Therefore, relevant data is collected from social sensors and data sources that act as proxies for human behavior (e.g., ticket sales). The relevant data is collected from a multitude of data sources (e.g., ticket sales, open government data, mobility data). The resulting Event-Aware Mobility Planner (CaMP) system enables WL planners to inspect attendance-specific information for a large number of events drawn from a variety of data sources. It allows integrated and visual access to attendance data (i) from legacy (historic) sources, (ii) open data sources and (iii) social data.

====5.2 Data exploration====
To elicit requirements and determine how they could be addressed with available data, several workshops were held to (i) review the organizational and technical context of the real-world use case, (ii) conduct a high-level survey of available data sources within the use case partner’s organization as well as externally available data, (iii) prioritize available data sources and the required data acquisition methods, (iv) evaluate design alternatives for data acquisition and semantic enrichment, (v) explore architectural options for a platform environment that supports integration of large-scale batch and high-frequency data flows.

This resulted in a set of preliminary data models, vocabularies, and guidelines used in the extraction, transformation and enrichment steps of the knowledge graph construction, as described next.

====5.3 Mobility Knowledge Graph Construction====
The knowledge graph constructed for the mobility use case covers (i) public transportation infrastructure (e.g., agencies, lines, schedule), (ii) internal planning protocols from WL, and (iii) event information.

Public Transportation Data. The first part of the mobility knowledge graph covers public transportation data in Vienna. Transportation data are often available as open data in GTFS format, which is widely used by Google for their online services<sup>7</sup>. This data format covers transport agencies/operators, the routes and the stop locations, trip schedule, and rules to describe the operation/service. In the context of our prototype, we rely on the existing GTFS ontology<sup>8</sup> and GTFS CSV converter<sup>9</sup> to transform the original GTFS data provided by the City of Vienna<sup>10</sup> to produce our GTFS transportation KG. In total, the resulting KG contains more than 20 million triples, which are now available online as a SPARQL endpoint<sup>11</sup> hosted in an HDT [11] server.

Event Information. In addition to public transportation data, we include the event information from the Wien-Ticket open data API<sup>12</sup> as the second part of the mobility knowledge graph. The Wien-Ticket data contains general event information in Vienna, e.g., event name, address of the event location, and performer’s name.
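To make the lifting of GTFS data into RDF more concrete, the following is a minimal sketch that turns one row of a GTFS `stops.txt` file into N-Triples. It does not reproduce the converter used in the project. The base IRI and the sample row are invented for illustration, the use of `rdfs:label` for the stop name is a choice of this sketch, and the `gtfs:Stop` class and namespace are assumed to match the Linked GTFS vocabulary.

```python
import csv
import io

# Assumed namespaces: Linked GTFS terms and W3C WGS84 geo; BASE is hypothetical.
GTFS = "http://vocab.gtfs.org/terms#"
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
BASE = "http://example.org/stops/"

# Invented sample in the shape of a GTFS stops.txt file.
STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
S1,Example Stop,48.2000,16.3700
"""


def escape(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def stop_to_ntriples(row):
    """Turn one stops.txt row into four N-Triples statements."""
    subject = "<" + BASE + row["stop_id"] + ">"
    return [
        subject + " <" + RDF_TYPE + "> <" + GTFS + "Stop> .",
        subject + " <" + RDFS_LABEL + '> "' + escape(row["stop_name"]) + '" .',
        subject + " <" + GEO + 'lat> "' + row["stop_lat"] + '" .',
        subject + " <" + GEO + 'long> "' + row["stop_lon"] + '" .',
    ]


lines = []
for row in csv.DictReader(io.StringIO(STOPS_TXT)):
    lines.extend(stop_to_ntriples(row))
print("\n".join(lines))
```

The same row-to-triples pattern would apply to the other GTFS files (routes, trips, stop times), each mapped to its own class and properties.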
Additionally, the current planning process does not take into account any feedback from social sources, such as event attendant profiles.

The CitySPIN project addresses this use case with the concept of Cyber-Physical Social Systems (CPSS), where citizens are seen

<sup>7</sup> https://gtfs.org
<sup>8</sup> https://github.com/OpenTransport/linked-gtfs
<sup>9</sup> https://github.com/OpenTransport/gtfs-csv2rdf
<sup>10</sup> https://www.data.gv.at/katalog/dataset/wiener-linien-fahrplandaten-gtfs-wien
<sup>11</sup> http://triple.ai.wu.ac.at
<sup>12</sup> http://data.opendataportal.at/dataset/wien-ticket-vorverkauf

Figure 2: Data Processing Units (DPUs) Orchestration for Wien-Ticket data

To extract this data, we implement a data extraction workflow as a pipeline in UnifiedViews, depicted in Figure 2. The pipeline consists of three main stages: (i) The first step is the event data extraction from the open data API, which is originally provided in CSV format. This process downloads the original data from the API (#1), translates it into RDF (#2 & #3), and merges it with a namespace graph (#4). (ii) After the event data is transformed into RDF format, the second step of the extraction performs the linking of the events and address dataset (#5). (iii) Finally, in the last step, the resulting linked graph is merged with the original event dataset (#6) and inserted into a triple store via SPARQL (#7).

As a result of this process, we extracted more than 2.2 million triples of event data. We do not yet provide the resulting data as open data due to server limitations, but we are investigating options for opening the dataset for public access in the future.

WL Planning Protocols. The third part of the mobility KG is transportation planning data, which originated from the internal transport planning protocols. The data is extracted from WL planning protocol documents, which are used internally to document mobility planners’ measures taken in response to demand expectations, including those due to special events. Such measures include increasing the frequency of transportation lines that have stations in the vicinity of the event in a time interval covering the event’s duration. The planning protocols are typically stored as Word or PDF documents, which makes automatic data extraction difficult. Part of the challenge in this task is dealing with various irregularities and inconsistencies in document layouts and extracting locally-used codes and abbreviations which are embedded within written comments. To address this issue, we employ a semi-automatic information extraction pipeline, using a combination of Natural Language Processing (NLP) techniques and human computation to extract the necessary information. In the end, we are able to extract information from more than 250 out of a set of 300 test planning documents, which accounts for more than 8,700 triples in total. We do not plan to make the raw information about this planning protocol public, as it may contain sensitive internal information.

====5.4 Interactive Planning Support====
To support the mobility use case, we developed an interactive planning support tool (cf. screenshot in Figure 3) by instantiating the CPSS prototyping environment. The intended user of this tool is the operation planning department at WL. In particular, the system is designed to support decisions on measures to optimize the transportation network in anticipation of a certain event, especially by taking into account historic records of such measures for the same type of events or for events that happened at the same or neighbouring venues.

Figure 3: Events-Public transportation planning prototype

We aim to support scenarios in which a transportation planner needs to decide traffic adjustment measures for an upcoming event. In this scenario, the planner will start by browsing a list of upcoming events, as shown by the top left widget in Figure 3, based on the second part of the knowledge graph on event information. From this list, they then choose a focus event (e.g., "Cirque du Soleil - TOTEM") for planning adjustments. Based on the selected event, a geo-map mashup will visualise the location of this event as a green pin on the map.

From this point, there are several possibilities for the planner to choose as follows:

• inspect the list of events that took place in nearby locations and for which a planning protocol has been produced ("event planning protocol" widget). This widget draws on the data extracted from historic planning protocols, which is the third part of the knowledge graph, and allows planners to easily access the decisions taken for the nearby events (e.g., for event "8" the frequency of 3 lines has been set to 5, 15, 15 minutes respectively).

• identify the public transportation stops in the immediate geographic vicinity of the focus event’s location ("Nearby Stops" widget) based on the first part of the knowledge graph on GTFS public transportation data. In our example scenario, Hermine-Jursa-Gasse and Maria-Jacobi-Gasse are the two nearest stops to the event location.

• browse social media messages related to the event in order to identify any additional information from social signals (e.g., general satisfaction with the transportation support etc.)

These functionalities for the CaMP use case are made available by the underlying infrastructure, which (i) ensures that data from various data sources is loaded and semantically integrated so that it can be (ii) visualised using a visual widget-based platform where various widget types can be combined into mashups in order to support the exploration of relevant planning information by the transportation planners.

We plan to continue the development of the current CaMP prototype<sup>13</sup> with the integration of additional social data, in particular data from mobile operators and results from the process mining components that should also allow planners to get a better understanding of the social aspect of the CPSS.

<sup>13</sup> http://rebrand.ly/mobility-mashup

===6 RELATED WORK===
A number of vision papers explored the applicability of CPSS in given domains. In the military domain, the CPSS concept fits naturally by spanning the boundaries of and connecting physical networks, the cyberspace, mental space and social networks that are the main components of command and control systems [14]. By integrating these spaces, CPSSs bring benefits such as synchronization across the spaces, self-adaptation and "chaotic control" as an alternative to precise control in order to deal with inherent uncertainties in the domain. In manufacturing [23], a new industrial revolution is emerging, enabled by socio-cyber-physical systems (SCPSs), which combine social elements with smart manufacturing thanks to the four technical pillars of Internet of Things (IoT) at the physical layer, Internet of Knowledge (IoK) and Internet of Services (IoS) at the cyber level, and Internet of People (IoP). A vision of Physical-Cyber-Social computing enabled by knowledge technologies and illustrated with an application in the medical domain is discussed in [18]. Smart City applications inherently subscribe to the concept of CPSS [5], as we also demonstrate in our own project with a transportation and a sustainable energy related use case.

Common to CPSS efforts in all domains is that they primarily focus on describing concrete systems, and how they function. In CitySPIN, on the contrary, we aim to support the engineering phase of these systems. A particular focus is on the ETL and data integration process, which takes up considerable effort. Similarly to our project, the QROWD project<sup>14</sup> also develops semantics-based data integration approaches. However, these do not support privacy-aware data integration as has been done in CitySPIN.

<sup>14</sup> http://qrowd-project.eu/

===7 CONCLUSION AND OUTLOOK===
In this paper, we provided an overview of the CitySPIN CPSS platform and development approach, focusing mainly on a data engineering perspective. Using multiple use cases developed with stakeholders in a city-scale context as a lens to explore challenges of heterogeneity, privacy, and process dynamics, we motivated the design of the CitySPIN architecture described in this paper. We illustrated the prototypical implementation of this architecture by means of a real-world use case in public transportation planning.

[9] Claudio Di Ciccio, Fajar J. Ekaputra, Alessio Cecconi, Andreas Ekelhart, and Elmar Kiesling. 2019. Finding Non-compliances with Declarative Process Constraints through Semantic Technologies. In CAiSE Forum. Springer, 60–74. https://doi.org/10.1007/978-3-030-21297-1_6

[10] A. Dix, J. Finlay, G. D. Abowd, and R. Beale. 2004. Human-computer interaction. Pearson Prentice Hall, England.

[11] Javier D. Fernández, Miguel A. Martínez-Prieto, Claudio Gutiérrez, Axel Polleres, and Mario Arias. 2013. Binary RDF representation for publication and exchange (HDT). Web Semantics: Science, Services and Agents on the World Wide Web 19 (2013), 22–41.

[12] W. Guo, Y. Zhang, and L. Li. 2015. The integration of CPS, CPSS, and ITS: A focus on data. Tsinghua Science and Technology 20, 4 (August 2015), 327–335.
https://doi.org/10.1109/TST.2015.7173449 In future work, we will investigate the integration of more real- [13] Tomas Knap, Petr Skoda, Jakub Klímek, and Martin Necaskỳ. 2015. UnifiedViews: time sensing and actuation components into the platform, which Towards ETL Tool for Simple yet Powerfull RDF Data Management.. In DATESO. 111–120. will enable CPSS developers to integrate additional social compo- [14] Z. Liu, D. Yang, D. Wen, W. Zhang, and W. Mao. 2011. Cyber-Physical-Social nents into the CPSS loop. In the long term, this could facilitate the Systems for Command and Control. IEEE Intelligent Systems 26, 4 (2011), 92–96. implementation of adaptive strategies in various use cases in the [15] Fabrizio Maria Maggi, Claudio Di Ciccio, Chiara Di Francescomarino, and Taavi Kala. 2018. Parallel algorithms for the automated discovery of declarative process mobility and energy domains. models. Inf. Syst. 74, Part 2 (2018), 136–152. https://doi.org/10.1016/j.is.2017.12.002 [16] Angelika Musil, Juergen Musil, Danny Weyns, Tomas Bures, Henry Muccini, and ACKNOWLEDGMENTS Mohammad Sharaf. 2017. Patterns for Self-Adaptation in Cyber-Physical Systems. In Multi-Disciplinary Engineering for Cyber-Physical Production Systems, Stefan This work was funded by the Austrian Research Promotion Agency Biffl, Arndt Lüder, and Detlef Gerhard (Eds.). Springer International Publishing, FFG under grant 861213 (CitySPIN). Chapter 13, 331–368. [17] Barry Norton, Luis M. Vilches, Alexander De León, John Goodwin, Claus Stadler, Suchith Anand, Dominic Harries, Boris Villazón-Terrazas, and Ghis- REFERENCES lain A. Atemezing. 2012. NeoGeo Vocabulary Specification. (2012). http: [1] 2003. Basic Geo (WGS84 lat/long) Vocabulary. (2003). https://www.w3.org/2003/ //geovocab.org/doc/neogeo/ [18] A. Sheth, P. Anantharam, and C. Henson. 2013. Physical-Cyber-Social Computing: 01/geo/ An Early 21st Century Approach. IEEE Intelligent Systems 28, 1 (Jan 2013), 78–82. [2] 2012. NeoGeo Geometry Ontology. (2012). 
http://geovocab.org/geometry https://doi.org/10.1109/MIS.2013.20 [3] 2012. NeoGeo Spatial Ontology. (2012). http://geovocab.org/spatial [19] Tuan-Dat Trinh, Peter Wetz, Ba-Lam Do, Elmar Kiesling, and A Min Tjoa. 2015. [4] Ethem Alpaydin. 2009. Introduction to machine learning. MIT press. Distributed mashups: a collaborative approach to data integration. International [5] Christos G. Cassandras. 2016. Smart Cities as Cyber-Physical Social Systems. Journal of Web Information Systems 11, 3 (2015), 370–396. Engineering 2, 2 (2016), 156 – 158. https://doi.org/10.1016/J.ENG.2016.02.012 [20] Wil M. P. van der Aalst. 2016. Process Mining - Data Science in Action, Second [6] David Corsar, Milan Markovic, Peter Edwards, and John D. Nelson. 2015. The Edition. Springer. https://doi.org/10.1007/978-3-662-49851-4 Transport Disruption Ontology. In The Semantic Web - ISWC 2015, Marcelo Are- [21] Y. Wang, W. Dai, B. Zhang, J. Ma, and A. V. Vasilakos. 2017. Word of Mouth nas, Oscar Corcho, Elena Simperl, Markus Strohmaier, Mathieu d’Aquin, Kavitha Mobile Crowdsourcing: Increasing Awareness of Physical, Cyber, and Social Srinivas, Paul Groth, Michel Dumontier, Jeff Heflin, Krishnaprasad Thirunarayan, Interactions. IEEE MultiMedia 24, 4 (October 2017), 26–37. https://doi.org/10. and Steffen Staab (Eds.). Vol. 9367. Springer International Publishing, Cham, 1109/MMUL.2017.4031317 329–336. https://doi.org/10.1007/978-3-319-25010-6_22 [22] G. Xiong, F. Zhu, X. Liu, X. Dong, W. Huang, S. Chen, and K. Zhao. 2015. Cyber- [7] Simon Cox, Chris Little, Jerry R. Hobbs, and Feng Pan. 2018. Time Ontology in physical-social system in intelligent transportation. IEEE/CAA Journal of Auto- OWL. W3C Recommendation. W3C. https://www.w3.org/TR/owl-time/ matica Sinica 2, 3 (July 2015), 320–333. https://doi.org/10.1109/JAS.2015.7152667 [8] Richard Cyganiak, Dave Reynolds, and Jeni Tennison. 2014. The RDF data cube [23] Xifan Yao and Yingzi Lin. 2016. 
Emerging manufacturing paradigm shifts for the vocabulary. W3C Recommendation. W3C. https://www.w3.org/TR/vocab-data- incoming industrial revolution. The International Journal of Advanced Manufac- cube/. turing Technology 85, 5 (01 Jul 2016), 1665–1676. 64