=Paper= {{Paper |id=Vol-2530/paper8 |storemode=property |title=The CitySPIN Platform: A CPSS Environment for City-Wide Infrastructures |pdfUrl=https://ceur-ws.org/Vol-2530/paper8.pdf |volume=Vol-2530 |authors=Amr Azzam,Peb R. Aryan,Alessio Cecconi,Claudio Di Ciccio,Fajar J. Ekaputra,Javier Fernández,Sotiris Karampatakis,Elmar Kiesling,Angelika Musil,Marta Sabou,Pujan Shadlau,Thomas Thurner |dblpUrl=https://dblp.org/rec/conf/iot/AzzamACCEFKKMSS19 }} ==The CitySPIN Platform: A CPSS Environment for City-Wide Infrastructures== https://ceur-ws.org/Vol-2530/paper8.pdf
                             The CitySPIN Platform:
                 A CPSS Environment for City-Wide Infrastructures
                    Amr Azzam                                            Peb R. Aryan                                 Alessio Cecconi
                   WU Vienna                                           ISE Institute, TU Wien                            WU Vienna
               1020 Vienna, Austria                                     1040 Vienna, Austria                         1020 Vienna, Austria
                aazzam@wu.ac.at                                       peb.aryan@tuwien.ac.at                         cecconi@ai.wu.ac.at

               Claudio Di Ciccio                                       Fajar J. Ekaputra                              Javier Fernández
                   WU Vienna                                         ISE Institute, TU Wien                               WU Vienna
              1020 Vienna, Austria                                    1040 Vienna, Austria                           1020 Vienna, Austria
          claudio.di.ciccio@ai.wu.ac.at                           fajar.ekaputra@tuwien.ac.at                         jfernand@wu.ac.at

            Sotiris Karampatakis                                        Elmar Kiesling                                 Angelika Musil
           Semantic Web Company                                      ISE Institute, TU Wien                        ISE Institute, TU Wien
             1070 Vienna, Austria                                     1040 Vienna, Austria                          1040 Vienna, Austria
       sotiris.karampatakis@semantic-                             elmar.kiesling@tuwien.ac.at                   angelika.musil@tuwien.ac.at
                   web.com

                   Marta Sabou                                           Pujan Shadlau                               Thomas Thurner
            ISE Institute, TU Wien                            Wiener Stadtwerke Holding AG                         Semantic Web Company
             1040 Vienna, Austria                                  1030 Vienna, Austria                              1070 Vienna, Austria
           marta.sabou@tuwien.ac.at                         Pujan.Shadlau@wienerstadtwerke.at                    t.thurner@semantic-web.at



ABSTRACT                                                                             KEYWORDS
Cyber-physical Social Systems (CPSSs) are complex systems that span                  CPSS, Linked Data, Knowledge Graphs, Public Transport, Smart City.
the boundaries of the cyber, physical and social spheres. They play
an important role in a variety of domains ranging from industry
to smart city applications. As such, these systems necessarily need                  1 INTRODUCTION
to take into account, combine and make sense of heterogeneous
data sources from legacy systems, from the physical layer and also                   Cyber-physical Systems (CPSs) are systems that span the physical
the social groups that are part of/use the system. The collection,                   and cyber-world by linking objects and processes from these spaces.
cleansing and integration of these data sources represents a major                   A typical CPS collects data from the physical world via sensors and
effort not only during the operation of the system, but also dur-                    applies computation resources from the cyber-space to integrate
ing its engineering and design. Indeed, while ongoing efforts are                    and analyze this data in order to decide on optimal feedback pro-
concerned primarily with the operation of such systems, limited                      cesses that can be put in place by physical actuators. CPSs have
focus has been put on supporting the engineering phase of CPSS.                      started to diffuse into many areas, including mission-critical public
To address this shortcoming, within the CitySPIN project we aim to                   transportation, energy services, and industrial production and man-
create a platform that supports stakeholders involved in the design                  ufacturing processes.
of these systems especially in terms of support for data manage-                     The results of a recent study about adaptation in CPS [16] revealed
ment. To that end, we develop methods and techniques based on                        an emerging trend to add an additional social layer in a CPS
Semantic Web and Linked Data technologies for the acquisition                        architecture to address human and social factors and evolve these
and integration of heterogeneous data from disparate structured,                     systems into CPSSs [21]. The resulting systems consist not only of
semi-structured and unstructured sources, including open data and                    software and raw sensing and actuating hardware, but are
social data. In this paper we present the overall system                             fundamentally grounded in the behaviour of human actors,
architecture with a core focus on data acquisition and integration. We               who both generate data and make informed decisions based on data
demonstrate our approach through a prototypical implementation                       [5, 12, 22].
of an adaptive planning use case for public transportation
scheduling.


1st Workshop on Cyber-Physical Social Systems (CPSS2019),
October 22, 2019, Bilbao, Spain.

Copyright © 2019 for this paper by its authors. Use permitted under
Creative Commons License Attribution 4.0 International (CC BY 4.0).




   The CitySPIN1 project aims to lay a foundation for the develop-                  All these challenges are amply reflected in a CitySPIN use case
ment of CPSSs in the context of Smart City infrastructure services.              that aims to improve the daily schedule planning for the Vien-
To this end, we develop both theoretical and conceptual foundations,             nese public transport network. In particular, this use case aims
as well as a set of innovative components — illustrated in Figure 1              to support planners in their work by allowing them to treat the
— that support a CPSS design process in a uniform platform. This                 transportation system as a CPSS and accounting for the dynam-
platform supports key stakeholders involved in the design process                ics of the involved travelers (especially during large-scale events).
through a prototyping environment that provides a visual interface               This requires, amongst others, the integration of data from various
which allows them to (i) access a wide range of data sources from                sources including data internal to the organization (e.g., historic
sensors, social channels, and legacy systems; (ii) integrate and ana-            data about event attendance), open data (e.g., expected events), as
lyze heterogeneous data; and (iii) visualise results. This platform              well as real-time data from mobility operators. Some of these data
is made possible by methods and tools that make use of Semantic                  sources can raise privacy concerns (e.g., when harvested by apps
Web and Linked Data technologies to support the collection and                   installed on individual mobiles) and therefore user consent about
integration of heterogeneous data sources.                                       the use of this data needs to be appropriately captured and con-
   In this paper, after a brief overview of the CitySPIN architecture            sidered during data processing. Finally, network planners would
in Section 2, we focus on the two core aspects of this technology                like to understand recurring social behaviors and patterns – for
stack: the knowledge graph construction, covered in Section 3, and               example, the typical routes followed by participants of an event.
the prototyping environment, described in Section 4. Furthermore,                   CitySPIN tackles these challenges in the design and prototyping
we discuss the prototypical implementation and illustrate the appli-             phases of a CPSS and aims to offer support to key stakeholders
cation of the platform by means of an example use case involving                 involved in these stages including decision makers, project man-
Vienna’s largest public transport provider in Section 5. Finally, we             agers, software architects, and software engineers, as depicted in
briefly review related work in Section 6 and conclude the paper                  Figure 1. These stakeholders are provided with a CPSS Prototyping
with an outlook on future research in Section 7.                                 Environment that adopts a mashup-based paradigm to allow them
                                                                                 to easily acquire, explore, combine and visualise a variety of data
2 CITYSPIN ARCHITECTURE OVERVIEW                                                 sources (e.g., legacy data, streaming data, social media data, open
                                                                                 data). The CPSS Prototyping Environment relies on and is made
The design of cyber-physical social systems raises challenges due
                                                                                 possible by three key components, as described next.
to the high complexity introduced by social systems in terms of:
                                                                                    Scalable Linked Data Integration. We adopt Linked Data tech-
  (i) the number and heterogeneity of data sources that need to be in-           nologies to address the integration of multiple, heterogeneous data
      tegrated: CPSSs involve large amounts of heterogeneous, poly-              sources. To this end, we developed dedicated components for the
      structured data from a variety of sources, ranging from legacy             acquisition and semantic enrichment of data as well as the integra-
      databases to highly dynamic sensor data. To create CPSS ap-                tion into a CPSS Knowledge Graph. The next sections of this paper
      plications and services, it is paramount to efficiently integrate          will focus on CitySPIN’s data integration architecture primarily.
      not just the data produced by individual processes within the                 Secure Data Access and Privacy. To deal with privacy concerns
      organization, but to achieve integration across processes, de-             typically associated with social data, we develop components for
      partments, organizational boundaries, and domains. Finally,                capturing user consent and making use of this consent during the
      external data, such as, for instance, social media streams, are            entire data integration chain.
      also of pivotal importance in the context of CPSSs. Hence, a                  Process Mining on Linked Data. Finally, to support stakeholders in
      major challenge is to develop flexible data integration infras-            gaining a better insight into group dynamics, we develop a Process
      tructures that are responsive to the varying needs of CPSSs.               Mining & Analytics component that can be used to analyze behav-
 (ii) privacy concerns associated with the processing of sensitive social        ioral patterns and make predictions based on the CPSS knowledge
      data: Adequate privacy protection is a fundamental require-                graph. Process mining is the discipline connecting data science and
      ment in the context of CPSSs, which often make use of and inte-            business process management that aims at discovering, checking,
      grate sensitive information from various sources. Additionally,            and enhancing business processes based on data logged by infor-
      the new EU General Data Protection Regulation imposes new                  mation systems [20]. In the context of this project, we resort in
      demands in terms of transparency of data processing and also               particular to declarative process mining to cater for the flexibility
      in terms of allowing data subjects to revoke or change their               of the processes considered in this project [15]. Declarative pro-
      consent in parts, which calls for more flexible and dynamic                cess models specify dynamic systems through temporal-logic-based
      compliance checking. This represents a significant barrier to-             rules that establish the constraints with which the execution must
      wards the development and provision of integrated smart city               comply. Therefore, we resort to the expression of those constraints
      services and hinders product and process innovation.                       as queries over the CPSS knowledge graph to monitor and analyse
(iii) uncertainty due to social dynamics: CPSS designers need a                  the behavior of ongoing processes [9]. The query answers are thus
      better understanding of the social dynamics of the groups                  routed to the CPSS Mashup Platform to allow for further complex
      involved in the CPSS, both at the design time and the run-time             analytics and refinements.
      of the system (e.g., for on-the-fly adaptation).
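
To make the notion of a declarative constraint from the Process Mining paragraph above more concrete, the following minimal sketch checks a single "response" rule (every occurrence of one activity must eventually be followed by another) over an ordered list of activity names. It is an illustration only: the platform expresses such constraints as queries over the CPSS knowledge graph [9], whereas this sketch works on a plain Python list, and the activity names and the function name are hypothetical.

```python
from typing import Iterable, List


def violates_response(trace: Iterable[str], trigger: str, target: str) -> List[int]:
    """Return the positions of `trigger` events that are never followed by `target`.

    An empty result means the trace satisfies the response constraint.
    """
    pending: List[int] = []
    for position, activity in enumerate(trace):
        if activity == trigger:
            # A new obligation: `target` must occur later in the trace.
            pending.append(position)
        elif activity == target:
            # All obligations opened so far are fulfilled.
            pending.clear()
    return pending


# Hypothetical trace of planning activities (names are assumptions).
trace = [
    "event_announced",
    "extra_service_scheduled",
    "event_announced",
    "crowd_detected",
]
violations = violates_response(trace, "event_announced", "extra_service_scheduled")
print(violations)
```

A non-empty result lists the trace positions at which the rule was triggered but never fulfilled, which is the kind of answer that could be routed on for further analysis.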


1 http://cityspin.net








                                           Figure 1: CitySPIN architecture components and stakeholders.


3     CPSS KNOWLEDGE GRAPH                                                              SPARQL5 , input data is queried and further transformed or aggre-
      CONSTRUCTION                                                                      gated as needed by other components of the CitySPIN platform.
                                                                                        The following paragraphs discuss the concepts applied here in more
The broad scope of CPSSs and the large variety of technical infras-
                                                                                        detail.
tructure and data involved in them give rise to unique interoperabil-
ity challenges when it comes to acquiring, enriching, integrating,                         Data Integration Lifecycle. Heterogeneous sources such as social
managing, and processing data from various sources pertaining to                        media data, sensor data and business intelligence data have to be
the social, physical, and cyber dimensions of CPSSs. In CitySPIN                        made available to the CPSS for further processing. Connectors to the
a CPSS knowledge graph acts as an integration hub for all this                          source systems hook into APIs, CSV repositories or direct database
heterogeneous data. In this section we describe the technologies                        calls (data acquisition). Various steps follow to remove outliers and
used to construct this knowledge graph.                                                 noise from data (data cleansing) as well as to refine their structure
   We rely on UnifiedViews2 , developed at Semantic Web Com-                            and align their content (data preparation). Finally data are merged,
pany (SWC), as a core building block for the knowledge graph                            transformed and saved into pre-processable formats (data storage).
construction. Specifically, data sources are aggregated and trans-                      The consolidated data are then available for subsequent analysis
formed using so-called Data Processing Units (DPUs), which are                          and reuse. Therefore, those data are fed back to the acquisition
assembled into data integration pipelines3 . All input data is avail-                   stage and the integration cycle restarts.
able in a structured format for further processing by subsequent
elements of the pipeline. The pipelines transform data from various                        Data Acquisition and Enrichment. As we consistently follow an
source formats and lift them into Resource Description Framework                        ontology-based data integration approach, we extract data accord-
(RDF) format, a semantically explicit format standardized by the                        ing to a CPSS-wide ontology and transform the data into RDF. The
World Wide Web Consortium (W3C). This results in a knowledge                            RDF is, in turn, an interchange format which is used as the canoni-
graph that expresses the data using common standard vocabularies                        cal one for further processing. By following the W3C standards for
as well as vocabularies tailored to the use cases. Table 1 provides an                  the Semantic Web, our approach ensures compatibility with a wide
overview of the key vocabularies used for the semantic alignment                        range of tools used in the CPSS stack.
of the various datasets which underlie the public transportation                           Semantic Alignment for Data Integration. Aligning contents and
planning use case used as an illustrative example in this paper.                        data alongside an ontology enables the CPSS to access enriched
   The knowledge graph is stored in an RDF triple store – specif-                       contextual knowledge. This additional information forms a critical
ically Ontotext GraphDB4 . Using the standard RDF query language                        part of an integrated view on CPSS data and is essential for realizing
                                                                                        the integrated user interface presenting the planning dashboard.
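
As an illustration of the acquisition, cleansing, and lifting steps described above, the following minimal sketch turns rows of a stops table into RDF triples in N-Triples syntax. It is not the UnifiedViews pipeline used in the project. The column names follow the GTFS stops.txt convention, the wgs and tp namespaces are taken from Table 1, and the class name Stop, the base URI, and the sample values are assumptions made for this example.

```python
import csv
import io
from typing import Iterator

WGS = "http://www.w3.org/2003/01/geo/wgs84_pos#"
TP = "http://w3id.org/cityspin/ns/transport#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"
BASE = "http://example.org/stop/"  # hypothetical base URI for stop resources


def lift_stops(csv_text: str) -> Iterator[str]:
    """Yield N-Triples lines for every stop row with usable coordinates."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Cleansing: skip rows whose identifier or coordinates are missing
        # or not numeric.
        try:
            stop_id = row["stop_id"]
            lat, lon = float(row["stop_lat"]), float(row["stop_lon"])
        except (KeyError, TypeError, ValueError):
            continue
        # Lifting: one typed resource plus two coordinate literals.
        # (No URI escaping is done here; real identifiers may need it.)
        subject = f"<{BASE}{stop_id}>"
        yield f"{subject} <{RDF_TYPE}> <{TP}Stop> ."  # class name is assumed
        yield f'{subject} <{WGS}lat> "{lat}"^^<{XSD_DOUBLE}> .'
        yield f'{subject} <{WGS}long> "{lon}"^^<{XSD_DOUBLE}> .'


# Made-up sample input; the second row has no coordinates and is dropped.
sample = (
    "stop_id,stop_name,stop_lat,stop_lon\n"
    "S1,Sample Stop,48.2,16.37\n"
    "S2,Broken Row,,\n"
)
triples = list(lift_stops(sample))
print("\n".join(triples))
```

The resulting lines could then be loaded into any triple store that accepts N-Triples.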
                                                                                           Data Cleansing. All gathered data is integrated and enriched by
2 https://unifiedviews.eu                                                               a processing pipeline, which lies at the functional core of the CPSS
3 cf. https://help.poolparty.biz/display/UDDOC/Basic+Concepts+for+DPU+developers        (e.g.: prediction, analysis, decision). Based on domain knowledge
for an introduction to the core concepts
4 http://graphdb.ontotext.com                                                           5 https://www.w3.org/TR/sparql11-query/




                                                                                   59
CPSS2019, October 22, 2019, Bilbao, Spain                                                                                                                  M. Sabou, et al.


          Name                   Prefix                    Namespace                             Documentation                             Purpose
 Time Ontology                time         https://www.w3.org/2006/time#                    [7]                        modeling time
 Geovocab geometry            geom         http://geovocab.org/geometry#                    [2, 17]                    describing geographical regions
 Geovocab spatial             spatial      http://geovocab.org/spatial#                     [3, 17]                    topological relations between features
 wgs84                        wgs          http://www.w3.org/2003/01/geo/wgs84_             [1]                        lat(itude), long(itude) about spatially-located
                                           pos#                                                                        things
 Event                        event        http://w3id.org/cityspin/ns/event#               http://rebrand.ly/dmkrk0   city event data (location, participants etc.)
 Cellular data                mobile       http://w3id.org/cityspin/ns/mobile#              http://rebrand.ly/pkmq2j   Cellular location data
 SPECIAL-CPSS                 special-cpss http://w3id.org/cityspin/ns/special-cpss#        http://rebrand.ly/5m33q2   CPSS usage policy and consent specification
 Transport                    tp           http://w3id.org/cityspin/ns/transport#           http://rebrand.ly/ee83mf   structuring and annotating public transport
                                                                                                                       data (e.g., stops, routes, schedules).
 Disruption                   td             http://purl.org/td/transportdisruption#        [6]                        modelling travel and transport related events
                                                                                                                       that have a disruptive impact on an agent’s
                                                                                                                       planned travel
 Data Cube                    qb             http://purl.org/linked-data/cube#              [8]                        multi-dimensional data (e.g., district heating
                                                                                                                       network statistics)
                                                 Table 1: Key vocabularies for semantic alignment



and process knowledge, data are consolidated and made available                            To cater for the reporting and predicting needs of a CPSS, our
for extraction and further processing by actuators, visualization                       architecture includes an intermediate layer in which data are ex-
and re-feeds into the learning pipeline.                                                tracted from the Querying component and fed to the Prediction
                                                                                        component or directly to the Dashboard of the frontend layer. The
   Knowledge Graph Storage. The central processing pipeline acts as                     Querying component of our prototyping environment relies on
an interface to other algorithms, further user-driven explorations,                     the data endpoint provided by the back-end module for the execu-
or visual representations of the output. The loop-back to the Data                      tion of queries. In this component, we are using the W3C-standard
Acquisition stage of the CPSS is realized through interim storage                       SPARQL query language6 for querying the integrated data. Further-
in a central triple store and actuation of external triggers.                           more, we can also use SPARQL Construct queries to encode rules
                                                                                        for inferring new knowledge.
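
The idea of encoding an inference rule as a SPARQL Construct query can be sketched as follows. The query text shows what such a rule might look like, and the Python function mimics its effect on a small in-memory set of triples without a SPARQL engine. All property names (ex:takesPlaceAt, ex:near, ex:affectedBy), the ex: namespace, and the sample data are assumptions made for this example and are not part of the CitySPIN vocabularies.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# How the rule might be written in SPARQL (not executed in this sketch).
RULE_AS_SPARQL = """
PREFIX ex: <http://example.org/ns#>
CONSTRUCT { ?stop ex:affectedBy ?event }
WHERE {
  ?event ex:takesPlaceAt ?venue .
  ?stop  ex:near         ?venue .
}
"""


def infer_affected_stops(graph: Set[Triple]) -> Set[Triple]:
    """Derive (stop, ex:affectedBy, event) triples by joining on the venue."""
    event_venues = {(s, o) for s, p, o in graph if p == "ex:takesPlaceAt"}
    stop_venues = {(s, o) for s, p, o in graph if p == "ex:near"}
    return {
        (stop, "ex:affectedBy", event)
        for event, venue in event_venues
        for stop, stop_venue in stop_venues
        if stop_venue == venue
    }


# Made-up sample graph using prefixed names as plain strings.
graph: Set[Triple] = {
    ("ex:event1", "ex:takesPlaceAt", "ex:venueA"),
    ("ex:stop1", "ex:near", "ex:venueA"),
    ("ex:stop2", "ex:near", "ex:venueB"),
}
inferred = infer_affected_stops(graph)
print(inferred)
```

In a triple store, the constructed triples would typically be written back to the graph so that later queries can use the inferred relation directly.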
4     CITYSPIN PROTOTYPING ENVIRONMENT                                                     The Prediction component is designed to allow for the appli-
For the implementation of the prototyping environment, the CitySPIN                     cation of Machine Learning (ML) techniques aimed to derive pre-
project proposes an architecture inspired by the Presentation Ab-                       diction models that – based on historical data – can be used as a
straction Control (PAC) architectural pattern [10]. In this section,                    decision support system for CPSS stakeholders to react ahead of
we adapt the PAC architecture to the CPSS needs and integrate it                        time to predicted arising situations [4]. Example prediction results
with the modular approach of Linked Widgets [19] and Unified-                           include the forecast of numerical trends of variables under anal-
Views [13] to develop a CPSS prototyping environment. In the                            ysis, the identification of changes in the classification of recently
following subsections, we illustrate the longitudinal section of the                    collected data to raise alerts in case of anomalies, or the recom-
software architecture to describe the associations and information                      mendation on the next operation to undergo in light of the recent
flows between the main logical components at large. In line with                        developments of the data under observation.
the PAC pattern, the three-layered architecture of the CitySPIN CPSS                       ML algorithms require learning, validation, and testing phases
prototyping environment consists of:                                                    on historical data, prior to, or alternated with, run-time processing
      • the Back-end layer, in which data are loaded, pre-processed,                    or reinforcement on live data. To cater for these requirements, our
        and aggregated (abstraction) - details on this layer are previ-                 architecture binds the Querying and Prediction components with
        ously discussed in Section 3, together with the CPSS Knowl-                     data-flow associations that proceed in both ways: (i) from Querying
        edge Graph construction and therefore will not be explained                     to Prediction for data feed, and (ii) from Prediction to Querying for
        further in this Section;                                                        updates on the classifications and predictions made. Notice that
      • the Service layer, in which those data are queried and ana-                     this architectural choice allows for the marshalling and storage of
        lyzed to infer additional knowledge and later on generate                       models learned from the Prediction component for further reuse.
        prediction models (control) - cf. Section 4.1;                                  This is the basis through which ex-post data analyses conducted via
      • the Front-end layer (presentation), from which users can                        process mining can be readily available for decision support and
        access the prediction models and data analysis reports to                       monitoring via successive queries, as suggested in [9]. Finally, we
        monitor the current status of the infrastructure, explore the                   emphasize that both the Querying and Prediction components are
        historic performance, and make informed decisions on the                        containers for diverse ML modules that can be used alternatively
        future settings (Section 4.2).                                                  in multiple use cases, e.g., as a plugin for Linked Widgets.

4.1     Service: Querying and Prediction
A wide range of services is required in the CPSS context
due to the diversity of application domains, use cases and scenarios.
In our CPSS Prototyping environment, we focused on two main
services: (i) Querying, and (ii) Prediction.                                            6 https://www.w3.org/TR/2013/REC-sparql11-query-20130321/






4.2 Frontend: Visualization and Decision                                       as parts of city-wide infrastructures. Therefore, relevant data is
    Support                                                                    collected from social sensors and data sources that act as proxies
                                                                               for human behavior (e.g., ticket sales). The relevant data is collected
The Frontend layer allows users to interact with, get informed
                                                                               from a multitude of data sources (e.g., ticket sales, open government
about, and interactively explore the integrated knowledge acquired
                                                                               data, mobility data). The resulting Event-Aware Mobility Planner
from the data and augmented by the Prediction component. To this
                                                                               (CaMP) system enables WL planners to inspect attendance specific
end, the use of Linked Widget Platform (LWP) [19] provides the
                                                                               information for a large number of events drawn from a variety of
necessary high degree of flexibility and customizability for CPSS
                                                                               data sources. It allows integrated and visual access to attendance
prototyping. LWP combines semantic web and mashup concepts to
                                                                               data (i) from legacy (historic) sources, (ii) open data sources and
support non-expert users in efficiently making use of various open
                                                                               (iii) social data.
and non-open data sources. In particular, the platform allows users
to collaboratively and interactively integrate data in an ad-hoc and
distributed manner. Each stakeholder can contribute their data and             5.2      Data exploration
computing resources to a shared data processing flow in a shared               To elicit requirements and how they could be addressed with avail-
interface that allows them to orchestrate the interaction among                able data, several workshops were held to (i) review the organiza-
components within a CPSS.                                                      tional and technical context of the real-world use case, (ii) conduct
   Depending on their needs, users can directly construct analyti-             a high-level survey of available data sources within the use case
cal data flows, fine-tune queries, ML parameters, and visualization            partner’s organization as well as externally available data, (iii) pri-
parameters within a single graphical interface. Bi-directional infor-          oritize available data sources and the required data acquisition
mation flows between Querying and Dashboard components allow                   methods, (iv) evaluate design alternatives for data acquisition and
users to save their preferences and potentially store the relevant             semantic enrichment, (v) explore architectural options for a plat-
facts that they may have discovered in the Data Store. This would              form environment that supports integration of large-scale batch
be crucial, for instance, to enable reinforcement learning for future          and high-frequency data flows.
projects building upon the CPSS Prototyping Environment, e.g.,                    This resulted in a set of preliminary data models, vocabularies,
application developments based on the prototype results.                       and guidelines used in the extraction, transformation and enrich-
                                                                               ment steps of the knowledge graph construction, as described next.
5     USE CASE
In this section, we introduce one of our real-world use cases in               5.3      Mobility Knowledge Graph Construction
the public transport domain (Section 5.1), discuss data exploration            The knowledge graph constructed for the mobility use case cov-
(Section 5.2), describe the construction of the knowledge graph for            ers (i) public transportation infrastructure (e.g., agencies, lines,
the use case in Section 5.3 and illustrate the prototypical implemen-          schedule), (ii) internal planning protocols from WL, and (iii) event
tation within the CitySPIN platform (Section 5.4).                             information.

5.1     Mobility Use Case Description                                             Public Transportation Data. The first part of the mobility knowl-
                                                                               edge graph covers public transportation data in Vienna. Transporta-
The goal of the CitySPIN project is to deliver a generic platform for          tion data are often available as open data in GTFS format, which
CPSS development that can support a wide variety of use cases in               is widely used by Google for their online services7 . This data for-
the context of city infrastructure services. To develop and prototype          mat covers transport agencies/operators, the routes and the stop
this platform, we chose use cases that cover a broad spectrum of               locations, trip schedule, and rules to describe the operation/service.
smart city services (viz. public transportation and district heating              In the context of our prototype, we rely on the existing GTFS
network control) while, on the other hand, exhibiting synergies                ontology8 and GTFS CSV converter9 to transform the original GTFS
in terms of data and component requirements.                                   data provided by the City of Vienna10 into our GTFS
   In this paper, we focus on the CitySPIN Event-Aware Mobility                transportation KG. In total, the resulting KG contains more than
Planning (CaMP) use case, which allows planners at Wiener Linien               20 million triples and is now available online via a SPARQL
(WL) to estimate mobility demands of large-scale events in order to            endpoint11 hosted on an HDT [11] server.
tailor the mobility planning accordingly. To cater for the needs of
participants of such large-scale events, WL already actively adapts               Event Information. In addition to public transportation data, we
its transportation network schedule. In particular, the types, ca-             include the event information from the Wien-Ticket open data
pacities and frequencies of vehicles in service during such events             API12 as the second part of the mobility knowledge graph. The
are currently decided by planners based on historic data about the             Wien-Ticket data contains general event information in Vienna,
number of attendees at recurring events, which are recorded in                 e.g., event name, address of the event location, and performer’s
event planning protocols saved as .pdf files.                                  name.
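
As a rough illustration of the GTFS-to-RDF transformation described for the public transportation data, the following Python sketch maps rows of a GTFS stops.txt file to N-Triples. It is not the gtfs-csv2rdf converter used in the prototype; the base URI, the sample row, and the choice of properties (a Linked GTFS stop class, FOAF name, WGS84 lat/long) are assumptions made for illustration.

```python
import csv
import io
from urllib.parse import quote

# Vocabulary namespaces (assumed here; check the Linked GTFS documentation).
GTFS = "http://vocab.gtfs.org/terms#"
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"
BASE = "http://example.org/gtfs/stops/"  # hypothetical base URI


def escape_literal(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def stops_to_ntriples(stops_csv):
    """Map each row of a GTFS stops.txt file to four N-Triples statements."""
    lines = []
    for row in csv.DictReader(io.StringIO(stops_csv)):
        stop = "<" + BASE + quote(row["stop_id"], safe="") + ">"
        name = escape_literal(row["stop_name"])
        lat, lon = row["stop_lat"], row["stop_lon"]
        lines.append(f"{stop} <{RDF_TYPE}> <{GTFS}Stop> .")
        lines.append(f'{stop} <{FOAF}name> "{name}" .')
        lines.append(f'{stop} <{GEO}lat> "{lat}"^^<{XSD_DECIMAL}> .')
        lines.append(f'{stop} <{GEO}long> "{lon}"^^<{XSD_DECIMAL}> .')
    return lines


# Made-up sample row, not taken from the Vienna dataset.
SAMPLE = "stop_id,stop_name,stop_lat,stop_lon\nS1,Example Stop,48.2,16.4\n"
for triple in stops_to_ntriples(SAMPLE):
    print(triple)
```

The same row-to-triples pattern would apply to the other GTFS files (agencies, routes, trips), each with its own class and properties.
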
   This current approach makes it difficult to plan for new or non-
recurring events for which no planning protocols exist. Addition-              7 https://gtfs.org
                                                                               8 https://github.com/OpenTransport/linked-gtfs
ally, the current planning process does not take into account any
                                                                               9 https://github.com/OpenTransport/gtfs-csv2rdf
feedback from social sources, such as event attendee profiles.                 10 https://www.data.gv.at/katalog/dataset/wiener-linien-fahrplandaten-gtfs-wien
   The CitySPIN project addresses this use case with the concept               11 http://triple.ai.wu.ac.at

of Cyber-Physical Social Systems (CPSS), where citizens are seen               12 http://data.opendataportal.at/dataset/wien-ticket-vorverkauf




1st Workshop on Cyber-Physical Social Systems (CPSS), Oct 22 - 25, 2019, Bilbao, Spain                                                             M. Sabou, et al.




                                  Figure 2: Data Processing Units (DPUs) Orchestration for Wien-Ticket data
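
The three stages shown in Figure 2 can be approximated in a few lines of Python. This is only a sketch of the data flow, not UnifiedViews code: DPUs are configured in the UnifiedViews tool itself, and the CSV columns, the example.org namespace, the address index, and the exact-match linking rule below are all hypothetical.

```python
import csv
import io

EX = "http://example.org/"  # hypothetical namespace for events and properties


def literal(text):
    """Render text as an escaped N-Triples/SPARQL string literal."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def events_to_triples(rows):
    """Stage (i): translate CSV rows into (subject, predicate, object) terms."""
    triples = set()
    for row in rows:
        event = f"<{EX}event/{row['id']}>"
        triples.add((event, f"<{EX}name>", literal(row["name"])))
        triples.add((event, f"<{EX}address>", literal(row["address"])))
    return triples


def link_addresses(rows, address_index):
    """Stage (ii): link events to address resources by normalised string match."""
    links = set()
    for row in rows:
        key = row["address"].strip().casefold()
        if key in address_index:
            event = f"<{EX}event/{row['id']}>"
            links.add((event, f"<{EX}locatedAt>", f"<{address_index[key]}>"))
    return links


def to_insert_update(triples, graph):
    """Stage (iii): serialise the merged triples as a SPARQL 1.1 INSERT DATA update."""
    body = "\n".join(f"    {s} {p} {o} ." for s, p, o in sorted(triples))
    return f"INSERT DATA {{\n  GRAPH <{graph}> {{\n{body}\n  }}\n}}"


# Made-up input standing in for the downloaded CSV and the address dataset.
CSV_TEXT = "id,name,address\n1,Sample Concert,Example Street 1\n"
ADDRESSES = {"example street 1": EX + "address/42"}

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
event_triples = events_to_triples(rows)
merged = event_triples | link_addresses(rows, ADDRESSES)
update = to_insert_update(merged, EX + "graph/events")
print(update)
```

In a real deployment, the resulting update string would be sent to the triple store's SPARQL update endpoint; that step is omitted here.
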


   To extract this data, we implement a data extraction workflow                         Language Processing (NLP) techniques and human computation to
as a pipeline in UnifiedViews – depicted in Figure 2. The pipeline                       extract the necessary information. In the end, we are able to extract
consists of three main stages: (i) The first step is the event data                      information from more than 250 out of a set of 300 test planning
extraction from the open data API, which is originally provided in                       documents, which accounts for more than 8,700 triples in total.
CSV format. This process downloads the original data from the API                        We do not plan to make the raw information about this planning
(#1), translates it into RDF (#2 & #3), and merges it with a namespace                   protocol public, as it may contain sensitive internal information.
graph (#4). (ii) After the event data is transformed into RDF format,
the second step of the extraction performs the linking of the events
                                                                                         5.4     Interactive Planning Support
and address dataset (#5). (iii) Finally, in the last step, the resulting
linked graph is merged with the original event dataset (#6) and                          To support the mobility use-case, we developed an interactive plan-
inserted into a triple store via SPARQL (#7).                                            ning support tool (cf. Screenshot in Figure 3) by instantiating the
   As a result of this process, we extracted more than 2.2 million                       CPSS prototyping environment. The intended user of this tool is
triples of event data. We do not yet provide the resulting data as                       the operation planning department at WL. In particular, the sys-
open data due to server limitations, but we are investigating options                    tem is designed to support decisions on measures to optimize the
for opening the dataset for public access in the future.                                 transportation network in anticipation of a certain event, especially
                                                                                         by taking into account historic records of such measures for the
                                                                                         same type of events or for events that happened at the same or
   WL Planning Protocols. The third part of the mobility KG is trans-                    neighbouring venues.
portation planning data, which originated from the internal trans-                          We aim to support scenarios in which a transportation planner
port planning protocols. The data is extracted from WL planning                          needs to decide traffic adjustment measures for an upcoming event.
protocol documents, which are used internally to document mobility                       In this scenario, the planner starts by browsing a list of upcoming
planners’ measures taken in response to demand expectations,                             events, shown in the top-left widget in Figure 3 and drawn from the
including those due to special events. Such measures include increasing                  second part of the knowledge graph (event information). From this list,
the frequency of transportation lines that have stations in the                          they then choose a focus event (e.g., "Cirque du Soleil - TOTEM")
vicinity of the event in a time interval covering the event’s duration.                  for planning adjustments. Based on the selected event, a geo-map
The planning protocols are typically stored as Word or PDF documents,                    mashup will visualise the location of this event as a green pin on
which makes automatic data extraction difficult. The challenges of                       the map.
this task include dealing with various irregularities and                                   From this point, the planner can choose among several options:
inconsistencies in document layouts and extracting locally used
codes and abbreviations embedded within written comments.
To address this issue, we employ a semi-automatic                                              • inspect the list of events that took place in nearby locations
information extraction pipeline, using a combination of Natural                                  and for which a planning protocol has been produced ("event








                                            Figure 3: Events-Public transportation planning prototype


       planning protocol" widget). This widget draws on the data               6    RELATED WORK
       extracted from historic planning protocols, which form the              A number of vision papers have explored the applicability of CPSS in
       third part of the mobility knowledge graph, and allows                  given domains. In the military domain, the CPSS concept fits naturally
       planners to easily access the decisions taken for nearby                by spanning the boundaries of and connecting physical networks,
       events (e.g., for event "8" the frequencies of 3 lines                  the cyberspace, mental space and social networks that are
       were set to 5, 15, and 15 minutes, respectively).                       the main components of command and control systems [14]. By
     • identify the public transportation stops in the immediate               integrating these spaces, CPSS bring benefits such as synchronization
       geographic vicinity of the focus event’s location ("Nearby              across the spaces, self-adaptation and “chaotic control” as an
       Stops" widget) based on the first part of the knowledge graph           alternative to precise control in order to deal with inherent
       (GTFS public transportation data). In our example scenario,             uncertainties in the domain. In manufacturing [23], a new industrial
       Hermine-Jursa-Gasse and Maria-Jacobi-Gasse are the two nearest          revolution is emerging, enabled by socio-cyber-physical systems
       stops to the event location.                                            (SCPS), which combine social elements with smart manufacturing
     • browse social media messages related to the event in order              thanks to the four technical pillars of Internet of Things (IoT at the
       to identify any additional information from social signals              physical layer), Internet of Knowledge (IoK) and Internet of Services
       (e.g., general satisfaction with the transportation support             (IoS) at the cyber level, and Internet of People (IoP). A vision of
       etc.)                                                                   Physical-Cyber-Social computing enabled by knowledge technologies
    These functionalities for the CaMP use case are made available             and illustrated with an application in the medical domain is
by the underlying infrastructure, which (i) ensures that data from             discussed in [18]. Smart City applications inherently subscribe to
various data sources is loaded and semantically integrated so that             the concept of CPSS [5] as we also demonstrate in our own project
it can be (ii) visualised using a visual widget-based platform where           with a transportation and a sustainable energy related use case.
various widget types can be combined into mashups in order to                     Common to CPSS efforts in all domains is that they primarily
support the exploration of relevant planning information by the                focus on describing concrete systems, and how they function. In
transportation planners.                                                       CitySPIN, on the contrary, we aim to support the engineering phase
    We plan to continue the development of the current CaMP                    of these systems. A particular focus is on the ETL and data integration
prototype13 with the integration of additional social data, in particular      process, which takes up considerable effort. Similarly to our
data from mobile operators and results from the process mining                 project, the QROWD project14 also develops semantics-based data
components, which should also allow planners to get a better                   integration approaches. However, these do not support privacy-aware
understanding of the social aspect of the CPSS.                                data integration as has been done in CitySPIN.
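
The "Nearby Stops" lookup from Section 5.4 can be illustrated with a small distance ranking. The paper does not state how the prototype computes proximity, so the haversine great-circle distance used here, the stop names, and all coordinates are assumptions for illustration only.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def nearest_stops(event, stops, k=2):
    """Return the k stops closest to the event as (name, distance) pairs."""
    ranked = sorted(
        ((name, haversine_m(event[0], event[1], lat, lon))
         for name, (lat, lon) in stops.items()),
        key=lambda pair: pair[1],
    )
    return ranked[:k]


# Hypothetical coordinates, not taken from the CitySPIN knowledge graph.
EVENT = (48.200, 16.400)
STOPS = {
    "Stop A": (48.201, 16.400),
    "Stop B": (48.200, 16.403),
    "Stop C": (48.250, 16.450),
}
for name, dist in nearest_stops(EVENT, STOPS):
    print(f"{name}: {dist:.0f} m")
```

With stop coordinates drawn from the GTFS part of the knowledge graph and the event location from the event part, the same ranking would yield the stops shown in the widget.
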



13 http://rebrand.ly/mobility-mashup                                           14 http://qrowd-project.eu/






7 CONCLUSION AND OUTLOOK                                                                      [9] Claudio Di Ciccio, Fajar J. Ekaputra, Alessio Cecconi, Andreas Ekelhart, and Elmar
                                                                                                  Kiesling. 2019. Finding Non-compliances with Declarative Process Constraints
In this paper, we provided an overview of the CitySPIN CPSS                                       through Semantic Technologies. In CAiSE Forum. Springer, 60–74. https://doi.
platform and development approach focusing mainly on a data                                       org/10.1007/978-3-030-21297-1_6
                                                                                             [10] A Dix, J Finlay, GD Abowd, and R Beale. 2004. Human-computer interaction:
engineering perspective. Using multiple use cases developed with                                  Pearson prentice hall. Inc, England (2004).
stakeholders in a city-scale context as a lens to explore challenges                         [11] Javier D Fernández, Miguel A Martínez-Prieto, Claudio Gutiérrez, Axel Polleres,
of heterogeneity, privacy, and process dynamics, we motivated the                                 and Mario Arias. 2013. Binary RDF representation for publication and exchange
                                                                                                  (HDT). Web Semantics: Science, Services and Agents on the World Wide Web 19
design of the CitySPIN architecture described in this paper. We                                   (2013), 22–41.
illustrated the prototypical implementation of this architecture by                          [12] W. Guo, Y. Zhang, and L. Li. 2015. The integration of CPS, CPSS, and ITS: A
means of a real-world use case in public transportation planning.                                 focus on data. Tsinghua Science and Technology 20, 4 (August 2015), 327–335.
                                                                                                  https://doi.org/10.1109/TST.2015.7173449
    In future work, we will investigate the integration of more real-                        [13] Tomas Knap, Petr Skoda, Jakub Klímek, and Martin Necaskỳ. 2015. UnifiedViews:
time sensing and actuation components into the platform, which                                    Towards ETL Tool for Simple yet Powerfull RDF Data Management.. In DATESO.
                                                                                                  111–120.
will enable CPSS developers to integrate additional social compo-                            [14] Z. Liu, D. Yang, D. Wen, W. Zhang, and W. Mao. 2011. Cyber-Physical-Social
nents into the CPSS loop. In the long term, this could facilitate the                             Systems for Command and Control. IEEE Intelligent Systems 26, 4 (2011), 92–96.
implementation of adaptive strategies in various use cases in the                            [15] Fabrizio Maria Maggi, Claudio Di Ciccio, Chiara Di Francescomarino, and Taavi
                                                                                                  Kala. 2018. Parallel algorithms for the automated discovery of declarative process
mobility and energy domains.                                                                      models. Inf. Syst. 74, Part 2 (2018), 136–152. https://doi.org/10.1016/j.is.2017.12.002
                                                                                             [16] Angelika Musil, Juergen Musil, Danny Weyns, Tomas Bures, Henry Muccini, and
ACKNOWLEDGMENTS                                                                                   Mohammad Sharaf. 2017. Patterns for Self-Adaptation in Cyber-Physical Systems.
                                                                                                  In Multi-Disciplinary Engineering for Cyber-Physical Production Systems, Stefan
This work was funded by the Austrian Research Promotion Agency                                    Biffl, Arndt Lüder, and Detlef Gerhard (Eds.). Springer International Publishing,
FFG under grant 861213 (CitySPIN).                                                                Chapter 13, 331–368.
                                                                                             [17] Barry Norton, Luis M. Vilches, Alexander De León, John Goodwin, Claus
                                                                                                  Stadler, Suchith Anand, Dominic Harries, Boris Villazón-Terrazas, and Ghis-
REFERENCES                                                                                        lain A. Atemezing. 2012. NeoGeo Vocabulary Specification. (2012). http:
 [1] 2003. Basic Geo (WGS84 lat/long) Vocabulary. (2003). https://www.w3.org/2003/                //geovocab.org/doc/neogeo/
                                                                                             [18] A. Sheth, P. Anantharam, and C. Henson. 2013. Physical-Cyber-Social Computing:
     01/geo/
                                                                                                  An Early 21st Century Approach. IEEE Intelligent Systems 28, 1 (Jan 2013), 78–82.
 [2] 2012. NeoGeo Geometry Ontology. (2012). http://geovocab.org/geometry
                                                                                                  https://doi.org/10.1109/MIS.2013.20
 [3] 2012. NeoGeo Spatial Ontology. (2012). http://geovocab.org/spatial
                                                                                             [19] Tuan-Dat Trinh, Peter Wetz, Ba-Lam Do, Elmar Kiesling, and A Min Tjoa. 2015.
 [4] Ethem Alpaydin. 2009. Introduction to machine learning. MIT press.
                                                                                                  Distributed mashups: a collaborative approach to data integration. International
 [5] Christos G. Cassandras. 2016. Smart Cities as Cyber-Physical Social Systems.
                                                                                                  Journal of Web Information Systems 11, 3 (2015), 370–396.
     Engineering 2, 2 (2016), 156 – 158. https://doi.org/10.1016/J.ENG.2016.02.012
                                                                                             [20] Wil M. P. van der Aalst. 2016. Process Mining - Data Science in Action, Second
 [6] David Corsar, Milan Markovic, Peter Edwards, and John D. Nelson. 2015. The
                                                                                                  Edition. Springer. https://doi.org/10.1007/978-3-662-49851-4
     Transport Disruption Ontology. In The Semantic Web - ISWC 2015, Marcelo Are-
                                                                                             [21] Y. Wang, W. Dai, B. Zhang, J. Ma, and A. V. Vasilakos. 2017. Word of Mouth
     nas, Oscar Corcho, Elena Simperl, Markus Strohmaier, Mathieu d’Aquin, Kavitha
                                                                                                  Mobile Crowdsourcing: Increasing Awareness of Physical, Cyber, and Social
     Srinivas, Paul Groth, Michel Dumontier, Jeff Heflin, Krishnaprasad Thirunarayan,
                                                                                                  Interactions. IEEE MultiMedia 24, 4 (October 2017), 26–37. https://doi.org/10.
     and Steffen Staab (Eds.). Vol. 9367. Springer International Publishing, Cham,
                                                                                                  1109/MMUL.2017.4031317
     329–336. https://doi.org/10.1007/978-3-319-25010-6_22
                                                                                             [22] G. Xiong, F. Zhu, X. Liu, X. Dong, W. Huang, S. Chen, and K. Zhao. 2015. Cyber-
 [7] Simon Cox, Chris Little, Jerry R. Hobbs, and Feng Pan. 2018. Time Ontology in
                                                                                                  physical-social system in intelligent transportation. IEEE/CAA Journal of Auto-
     OWL. W3C Recommendation. W3C. https://www.w3.org/TR/owl-time/
                                                                                                  matica Sinica 2, 3 (July 2015), 320–333. https://doi.org/10.1109/JAS.2015.7152667
 [8] Richard Cyganiak, Dave Reynolds, and Jeni Tennison. 2014. The RDF data cube
                                                                                             [23] Xifan Yao and Yingzi Lin. 2016. Emerging manufacturing paradigm shifts for the
     vocabulary. W3C Recommendation. W3C. https://www.w3.org/TR/vocab-data-
                                                                                                  incoming industrial revolution. The International Journal of Advanced Manufac-
     cube/.
                                                                                                  turing Technology 85, 5 (01 Jul 2016), 1665–1676.



