DIORAMA: Digital twIn fOR sustAinable territorial
MAnagement
Andrea Bianchi, Giordano d’Aloisio, Andrea D’Angelo, Antinisca Di Marco,
Alessandro Di Matteo, Jessica Leone, Giulia Scoccia, Giovanni Stilo and Luca Traini
Department of Engineering and Information Sciences and Mathematics, University of L’Aquila, Italy


Abstract
Territorial Management is a challenging task that goes beyond the concept of smart city, mainly because it involves decisions that take into consideration several aspects, including urban planning, risk assessment, disaster recovery, as well as the health, social, and economic management of a specific area. Territorial Management and the related decisions are still poorly supported by software and IT systems, even though a lot of information and data are already available, stored in institutional databases or released as open data. In this paper, we present DIORAMA, a digital twin for sustainable territorial management that, leveraging the results of the Territori Aperti and SoBigData projects, aims to provide citizens, associations, and institutions with a system that supports decision-makers in taking sustainable decisions in the field of territorial management, guaranteeing citizen participation and operating under the FAIR and Open Science principles.

Keywords
Territorial Management, Digital Twin, Sustainability




1. Introduction
Territorial management is always challenging, given the several domains involved, sometimes
in contrast to each other (e.g., health, economy, regulations, urban planning, risk assessment
and reduction, etc.). Managing a territory after a disaster is even more complex due to the
damage suffered by public/private buildings and infrastructure such as bridges, roads, gas pipelines, sewerage, water pipelines, and the electricity network [1]. Effective management of resources (such
as funding, time, data, software, and so on) is required to improve the efficiency of operations.
In addition, the population’s wellness must be considered to increase the effectiveness of
the management. The combination of effective management of resources, quick decisions
and wellness of the population comprises our definition of sustainability. We argue that the
management of the territory, especially under disaster recovery, must be sustainable. However,
decision-makers and public authorities often lack a comprehensive view of the territory and a

ITADATA2022: The 1st Italian Conference on Big Data and Data Science, September 20–21, 2022, Milan, Italy
andrea.bianchi@graduate.univaq.it (A. Bianchi); giordano.daloisio@graduate.univaq.it (G. d'Aloisio);
andrea.dangelo6@student.univaq.it (A. D'Angelo); antinisca.dimarco@univaq.it (A. Di Marco);
alessandro.dimatteo1@student.univaq.it (A. Di Matteo); jessica.leone@student.univaq.it (J. Leone);
giulia.scoccia@student.univaq.it (G. Scoccia); giovanni.stilo@univaq.it (G. Stilo); luca.traini@univaq.it (L. Traini)
ORCID: 0000-0002-6046-4355 (A. Bianchi); 0000-0001-7388-890X (G. d'Aloisio); 0000-0002-0577-2494 (A. D'Angelo);
0000-0001-7214-9945 (A. Di Marco); 0000-0002-2092-0213 (G. Stilo); 0000-0003-3676-0645 (L. Traini)
                                    © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
data science tool is required to perform sustainable, efficient and effective management.
   The availability of big data coming from heterogeneous sources (such as open data repositories,
geographic information systems, smartphones, and social media) enables a complete digital
representation of the territory covering each aspect of its management, including disaster
recovery. In this paper, we take a step forward and present DIORAMA, a Digital twIn
fOR sustAinable territorial MAnagement [2] that we are developing under the Territori Aperti project.
Territori Aperti is a documentation, training, and research center for sustainable territorial
management with a particular focus on disaster recovery [3]. DIORAMA represents the core of
Territori Aperti since it leverages all the work and research we are performing in the project.
Like Territori Aperti, DIORAMA also leverages SoBigData++ to implement the
Open Science principles.
   Through a digital twin (DT), it is possible to model any type of physical entity [4]; therefore,
DTs are increasingly used to model entities in many domains, such as the medical, corporate, and
manufacturing ones. In our particular case, we propose to model the full sustainable territorial
management through the DT concept. Papers such as [5] discuss the principles underlying
the construction of DTs and reference the main applications of DTs to territories, highlighting
that, since the topic falls within themes such as smart city development and sustainable urban
development, most DTs created for the territory are confined to the city level [6, 7].
There are also several reviews of city-level DT construction types and applications of this
type [8, 9]. In general, several projects aim to develop DTs at the urban level, at
different levels of abstraction [10, 11]. We recall that in the development of a DT for more general
territorial management, we must consider different and sometimes opposite goals. For this
reason, since several representations of a territory are already available (satellite images, GIS
systems, etc.), we have to rely on the integration of data from different sources and involve
different stakeholders from different domains. All the previously cited papers develop DTs to
model processes, conditions, and other concepts related to territorial physical entities.
None of them models the entire sustainable territorial management, also considering
non-physical concepts and developing applications and analyses useful to users in
territorial management. With DIORAMA we aim to fill this gap by proposing a DT for
sustainable territorial management.
   The main contributions of this paper are: i) to present the DIORAMA digital twin and
its high-level architecture; ii) to describe how Territori Aperti contributes to DIORAMA,
detailing how its achievements will be embedded in DIORAMA.
   The paper proceeds as follows: Section 2 describes in more detail the Territori Aperti project
and DIORAMA. Sections 3, 4, and 5 are then respectively related to in-depth descriptions of the
analysis, visualization, and application projects we are conducting in Territori Aperti and which
comprise DIORAMA. Finally, Section 6 describes some future works and concludes the paper.


2. DIORAMA Description
In this section, we describe the DIORAMA digital twin that we will develop under the Territori Aperti
project [3] by assembling all the applications, analyses, and visualizations developed within
the project. DIORAMA aims to create a complete representation of the territory by integrating
[Figure 1 shows a layered architecture: Institutions, Domain Experts, and Citizens (the final users) access Applications and Domain Services (Health, Economy, Disaster Recovery, and other domains), built on Analyses and Visualizations over a Territorial and environment representation; data flow in via APIs from Open Data Sources (e.g., open repositories, institutional Geographic Information Systems), ESA, citizen and social networks, IoT/Edge and Fog Computing, and non-open data sources (e.g., SIT, demographic services); the stack runs on distributed and super-computing IT infrastructures (Caliban & D4Science) and on the Network Infrastructure (INCIPICT & 5G), alongside the SoBigData Catalogue and services for FAIR data.]
Figure 1: Overview of DIORAMA


different existing data sources, each covering a specific domain of territorial management
(e.g., health management, economy management, and so on), helping institutions, citizens, and
domain experts in its sustainable management.
    Figure 1 reports the high-level layered architecture of DIORAMA, which represents the follow-up
of Territori Aperti and is founded on a Territorial and environment representation derived
from data. Using this constantly updated representation of the environment, we conducted in
Territori Aperti several research activities related to analyses and visualization. These research
activities have resulted in approaches and methodologies that we will deploy in the Analyses &
Visualization layer of DIORAMA digital twin. Such approaches and methodologies are then
used in the development of several applications addressing one or more domains related to
the territorial management. These applications are used by stakeholders (such as institutions,
domain experts, citizens, and so on) to actively participate in, be informed of, and effectively
take decisions about a territory.
    Data is the primary resource of DIORAMA and drives the analyses, visualizations, and
applications we will develop. In Territori Aperti we collected information through APIs coming
from different sources (e.g., online data sources1 , ESA information systems [12], citizen and
social networks, and so on). We will reuse them in DIORAMA as a first step and, as done in
Territori Aperti, all the resources produced in DIORAMA (such as data, methods, applications,
and so on) will be publicly shared following the Open Science and FAIR principles.
    We aim to support the DIORAMA implementation on the distributed and super-computing
IT infrastructures Territori Aperti leveraged: the Caliban Cluster and D4Science. The Caliban
Cluster supercomputer has a computing power of about 5.0 teraflops and allows the execution of
extensive and computationally complex analyses. It is a very important infrastructure for a broad
audience of researchers and students from various disciplines in the Department of Information
Engineering, Computer Science, and Mathematics of the University of L'Aquila. D4Science
is, instead, the distributed IT platform managed by the National Research Council on top of
which the Territori Aperti gateway and the SoBigData EU research infrastructure [13] are deployed.
Being one of the SoBigData exploratories, Territori Aperti (as well as the future DIORAMA) exploits
SoBigData and all the resources it provides, including the public IT platform that implements
    1
        In Territori Aperti we established conventions with some institutions to also obtain data that are not open.
[Figure 2 shows the MANILA workflow: the ML Expert defines a Quality & Feature Model; the ML Designer defines a Functional and Quality Requirements Specification; a slicing step derives 1-n Configurations, each generating an ML Pipeline that is checked against the requirements ("Requirements satisfied?"): Yes yields a Quality ML Pipeline, No loops back to the specification.]
Figure 2: High-level architecture of MANILA


Open Science and FAIR principles.
   Note that DIORAMA must be intended as an ecosystem of high-level services and end-user
applications. Thus, it must not be confused with other mid-level technologies such as Data
Lakes, Federated DBs, and Polystores [14, 15], which are part of the core components on which
DIORAMA can be implemented.
   As a final remark, due to the high amount of data DIORAMA could exchange, a suitable
network infrastructure is required. DIORAMA can count on two enabling technologies:
the fiber optic ring implemented within the INCIPICT project and the 5G infrastructure being
tested in the city of L'Aquila.
   In the following sections, we first describe the Analyses and Visualization techniques we
implemented in Territori Aperti (in Section 3 and Section 4, respectively), and then we present
the applications developed on top of such approaches (in Section 5). Each application can be
located in one or more of the domains depicted in Figure 1.


3. Analyses and Research Projects
This section describes the analysis techniques and methodologies we have developed in
Territori Aperti. Together with the visualization approaches, they represent the foundations for the
development of DIORAMA. In particular, we describe the following approaches: MANILA
(Section 3.1), BeFairest (Section 3.2), and COMPASS (Section 3.3).

3.1. Model bAsed developmeNt of ml pIpeLines with quAlity (MANILA)
In order to be sustainable and useful for end-users, machine learning-based (ML) data analysis
systems 2 must be accurate in their predictions and must satisfy quality attributes (QAs). We
consider the following QAs crucial: fairness [16], computational complexity [17], interpretability [18],
explainability [19], and privacy [20]. In order to build analysis tools based on ML pipelines
satisfying such QAs, a data scientist must have good knowledge of the underlying ML domain.
   To simplify the implementation of quality ML pipelines, we are developing MANILA, an
innovative model-driven framework that guides data scientists in developing ML pipelines
assuring quality requirements [21, 22]. Figure 2 depicts the high-level architecture of such a
framework. In particular, ML pipelines are made of typical phases [23] that embed a set of

   2
       In this context, we define ML systems as a set of one or more ML pipelines.
standard components identified by the system’s functional requirements and a set of variability
points. Product-Line Architectures, specified by Feature Models [24], represent a suitable model
to formalize ML pipelines with variability. However, they lack adequate means to specify quality
attributes and requirements, thresholds, and metrics. To address this issue, in the MANILA approach
(see Figure 2), we extend the feature-model meta-model to enable: i) the creation, by the ML
expert, of a Quality & Feature Model (as done in [25]); ii) the specification of functional and quality
requirements by the data scientist. In particular, the data scientist specifies a set of functional
and quality requirements compliant with the defined meta-model. These requirements are
used to automatically generate, from the extended feature model provided by the ML expert,
a set of ML pipeline configurations able to satisfy the defined functional requirements (the
Configuration boxes in the figure). The configurations are defined by removing from the feature
model all the components (and their relative specifications) not suitable to meet the specified
requirements. The derived configurations are used to generate a set of Python scripts, each
implementing an ML pipeline possibly satisfying the posed constraints. These pipelines are then
tested to verify whether the quality constraints are actually satisfied.
   The framework returns the set of Quality ML Pipelines satisfying the quality constraints, if any;
otherwise, it asks the data scientist to relax the quality requirements and repeat the process.
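   To make the generate-and-test idea concrete, the following is a minimal sketch, not the actual
MANILA framework: two hypothetical variability points (the scaler and the model) are enumerated
into pipeline configurations, each configuration becomes a runnable scikit-learn pipeline, and only
those meeting the posed quality threshold are returned.

   # A minimal sketch of the generate-and-test idea (not the actual MANILA
   # framework); the variability points and the threshold are illustrative.
   from itertools import product

   from sklearn.linear_model import LogisticRegression
   from sklearn.model_selection import cross_val_score
   from sklearn.pipeline import Pipeline
   from sklearn.preprocessing import MinMaxScaler, StandardScaler

   # Hypothetical variability points resolved from a feature-model slice.
   SCALERS = {"standard": StandardScaler, "minmax": MinMaxScaler}
   MODELS = {"logreg": LogisticRegression}

   def quality_pipelines(X, y, scoring="accuracy", threshold=0.8):
       """Return the (pipeline, score) pairs meeting the quality threshold."""
       accepted = []
       for scaler, model in product(SCALERS, MODELS):
           pipeline = Pipeline([("scaler", SCALERS[scaler]()),
                                ("model", MODELS[model]())])
           # Test phase: measure the quality attribute on held-out folds.
           score = cross_val_score(pipeline, X, y, scoring=scoring).mean()
           if score >= threshold:
               accepted.append((pipeline, score))
       # An empty list corresponds to the case in which the data scientist
       # is asked to relax the requirements and repeat the process.
       return accepted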

3.2. BeFairest
Fairness represents one of the critical quality attributes of ML systems, as it ensures that
they are unbiased and do not discriminate among groups in the input dataset. Its
importance motivated a joint effort to study the bias and fairness problem in ML through
the approach named BeFairest. In particular, we developed and tested the Debiaser for Multiple
Variables (DEMV), a novel preprocessing algorithm that improves fairness in binary and multi-class
classification problems with any number of sensitive variables [26].
   For an ML system, being fair means complying with several fairness metrics (for instance,
Statistical Parity [27]) within strict thresholds. DEMV re-balances the various combinations
of sensitive variables, each embodying one particular unprivileged group of samples, and by
doing so it significantly improves the dataset's fairness while keeping the inevitable
accuracy losses of the classifier down to negligible percentages.
   DEMV efficiently manages multiple sensitive variables, both binary and categorical, making
it highly flexible for any use. This is a considerable improvement over the baselines
described in the fairness literature, which are often limited to one sensitive variable (e.g., Exponen-
tiated Gradient [28]) or to binary labels only (e.g., Reweighing [29]). Thanks to these properties,
DEMV can be used in many of the application domains depicted in Figure 1.
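   As an illustration of the rebalancing idea (a simplified sketch, not the actual DEMV algorithm
published in [26]), each group identified by a combination of sensitive variables and a class label
can be resampled toward the size it would have if sensitive variables and label were statistically
independent; column names below are illustrative.

   # A simplified sketch of the rebalancing idea (not the actual DEMV
   # algorithm described in [26]); column names are illustrative.
   import pandas as pd

   def rebalance(df: pd.DataFrame, sensitive: list, label: str,
                 seed: int = 0) -> pd.DataFrame:
       """Resample each (sensitive combination, label) group toward the size
       it would have if sensitive variables and label were independent."""
       n = len(df)
       sens_sizes = df.groupby(sensitive).size()
       lab_sizes = df.groupby(label).size()
       parts = []
       for keys, group in df.groupby(sensitive + [label]):
           sens_keys, lab = keys[:-1], keys[-1]
           key = sens_keys[0] if len(sens_keys) == 1 else sens_keys
           target = max(1, round(sens_sizes[key] * lab_sizes[lab] / n))
           # Undersample over-represented groups, oversample the others.
           parts.append(group.sample(n=target, replace=len(group) < target,
                                     random_state=seed))
       return pd.concat(parts, ignore_index=True)

   # Example: balanced = rebalance(df, sensitive=["sex", "race"], label="income")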
   The implementation of DEMV is available at the Territori Aperti RI.

3.3. Improving prediction by integrating explainable cOmputational Models
     based on heterogeneouS data (COMPASS)
In complex scenarios, such as the medical domain, data vary in type and format and
usually come from different sources. Due to issues related to high redundancy, missing data,
untruthfulness, and data having been created for several different purposes, it
[Figure 3 shows genomic, clinical, and imaging data feeding single learning pipelines (one pipeline per source); their learning models, together with an extended knowledge base, feed an extended pipeline built around an interpretable hyper-model composed of explanation models, which produces a personalized prediction plus an explanation of the hyper-model.]
Figure 3: High-level visual description for the system to analyze in the CVD domain.


is difficult to integrate these heterogeneous data to meet the business information demand.
In addition, although there are approaches that bring together data of different types and
from different sources (neural networks trained to fuse heterogeneous and/or multi-source data),
in complex and sensitive scenarios this is not always possible, because these critical domains
are protected by different regulations possibly related to ethics and data privacy issues.
To better clarify our goal, let us refer, as a real case study, to cardiovascular disease (CVD)
risk prediction, where we can apply the process mentioned above and summarized in Figure 3.
Suppose there are different departments within a hospital (one responsible for the genetic
data, one responsible for the clinical data, one responsible for the images) that do not
share their data in their entirety. In these cases, it is necessary to proceed with local learning and
subsequently aggregate the individual predictions and tuned parameters to create a new
extended learning model. The aim of COMPASS is to define a system that generates accurate
predictions, exploits heterogeneous data, and aims to be interpretable and explainable, also using
pre-defined domain knowledge (medical, in the considered scenario) to assist the intelligent
learning models used in the system. In this case study, starting from single pipelines, we define
a new extended model, called Hyper Model (HM). The HM will be equipped with knowledge
of the application (e.g., medical) domain and will be composed of components providing
explainability and interpretability, which are essential in the (medical) domain to build trust in the
results of the predictions obtained by the single learning pipelines. Since the HM is driven by the
predictions and by the parameters/weights of the individual local pipelines, it does not directly
access raw data and hence solves the problems related to data privacy issues. Note the
difference with respect to architectural models such as federated learning [30], central learning,
or swarm learning [31], which, although similar and solving privacy problems in the
same way, do not address intrinsic prediction issues such as interpretability and explainability.
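   A minimal sketch of the hyper-model idea follows (an illustration under our assumptions, not
the COMPASS implementation): a meta-learner is trained only on the probability outputs of the
local, per-department pipelines, never on their raw data; class and method names are hypothetical.

   # An illustrative sketch of the hyper-model idea (not the COMPASS
   # implementation): the meta-learner sees only the local pipelines'
   # predictions, never the departments' raw data. Names are hypothetical.
   import numpy as np
   from sklearn.linear_model import LogisticRegression

   class HyperModel:
       """Aggregates per-source pipelines' probability outputs (binary case)."""

       def __init__(self, local_pipelines):
           # Local pipelines are trained elsewhere, each on its private view
           # (e.g., genomic, clinical, imaging data).
           self.local_pipelines = local_pipelines
           self.meta = LogisticRegression()

       def _meta_features(self, views):
           # One probability column per source: the only information shared.
           return np.column_stack([p.predict_proba(v)[:, 1]
                                   for p, v in zip(self.local_pipelines, views)])

       def fit(self, views, y):
           self.meta.fit(self._meta_features(views), y)
           return self

       def predict(self, views):
           return self.meta.predict(self._meta_features(views))

In such a sketch, the meta-learner's coefficients also give a coarse, interpretable weighting of each
source's contribution, in the spirit of the explanation models of Figure 3.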
4. Visual Analytics
The analyses within Territori Aperti, and hence within DIORAMA, are driven by vast sets of data
gathered from different sources. To derive meaningful insights from them, it is necessary to employ
robust and scalable analytics that go beyond the available data management systems, which only
enable views of small portions of the data. Data visualization plays a crucial role in digital twin analysis
and interpretation [4]. According to Vrabič et al. [32], a digital twin holds information that is
continuously visualised in a variety of ways to predict current and future conditions. Research
has shown that seeing and understanding a large amount of data together enables humans
to gather deeper knowledge and insights. Thus, approaches that integrate the exploration
capacity of experts with the enormous processing power of computers are the winning choice to
realize a powerful knowledge discovery tool that harnesses the best of both worlds [33]. Such
approaches are called Visual Analytics. The typical steps of the Visual Analytics
process can be summarized as follows: i) data pre-processing (i.e., cleaning, transformation,
integration); ii) data analysis; iii) simple data visualization; iv) users generate insightful knowledge
through human perception, cognition, and reasoning activities; v) users make new
hypotheses and integrate the newly generated knowledge into the analysis and visualization
through interactions; vi) an updated visualization is regenerated based on the interactions to reflect
the user's understanding of the data. A minimal sketch of one iteration of this loop is given below.
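   The sketch below is a toy example (the dataset, the feature names, and the choice of k-means
plus a scatter plot are our own assumptions, not DIORAMA components): it covers the automated
steps i)-iii) and vi), while steps iv)-v) are the human in the loop, who inspects the view and calls
the function again with a new hypothesis.

   # A toy sketch of one Visual Analytics iteration; dataset, feature names,
   # and the k-means/scatter-plot choices are placeholders.
   import matplotlib.pyplot as plt
   import pandas as pd
   from sklearn.cluster import KMeans

   def iterate(df: pd.DataFrame, features: list, k: int) -> pd.DataFrame:
       # i) Pre-processing: select the features of interest and clean them.
       data = df[features].dropna()
       # ii) Analysis: a simple clustering of the selected features.
       data["cluster"] = KMeans(n_clusters=k, n_init=10).fit_predict(data[features])
       # iii/vi) (Re)generate the visualization for the current hypothesis.
       data.plot.scatter(x=features[0], y=features[1], c="cluster",
                         colormap="viridis")
       plt.show()
       return data

   # iv-v) The human in the loop inspects the view, forms a new hypothesis,
   # and calls iterate() again with different features or a different k.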
   Visualization techniques can be classified according to [34] by: i) the type of data to be visualized:
one-dimensional data such as temporal data, two-dimensional data such as geographic maps,
relational tables, text and hypertext, hierarchies and graphs, etc.; ii) the visualization
technique: bubble chart, histogram, scatter plot, parallel coordinates, infographic, etc.; iii) the
interaction technique: zooming, linking, overview+detail, fisheye, etc. These dimensions
can be considered orthogonal: any visualization technique can be used in conjunction with
any interaction technique as well as with any type of data. In addition, a specific system can be
designed to support different types of data and can use a combination of multiple visualization
and interaction techniques. This allows one to quickly create different types of views that
help to dig deeper into the data. In Visual Analytics, the phases of querying, exploring, and
visualizing data come together in a single process, helping to interpret data more easily and
thus making analytics easier for non-experts. The data are displayed interactively and graphically;
users can discover insights into the data without having to know how to build charts and other
visualizations and without being proficient in analytical techniques, and can therefore make smarter
decisions faster. DIORAMA will implement a novel Visual Analytics approach that supports human
thinking, fast data exploration and iteration, stakeholder collaboration, and insight sharing.


5. Developed Applications
In this section, we describe the applications we developed in Territori Aperti that can be
implemented in DIORAMA. These applications are built on top of the outcomes of the
analyses and visualization techniques described in Sections 3 and 4, respectively. In this
paper we present the following applications: TA-Analytics (Section 5.1), Territori Aperti Toolkit
[Figure 4 shows the Data Importer App importing data from the USRA repository, i.Stat, and other sources and storing them in a database; the Data Analytics and Visualization platform gets the data from the database and is used by the User.]
Figure 4: High-level architecture of TA-Analytics


(Section 5.2), Evacuation and Reconstruction Planning (the DiReCT approach, Section 5.3), and
SismaDL (Section 5.4).

5.1. TA-Analytics
Following the Open Science and FAIR principles, each end-user of DIORAMA should be able to
access and use the collected data, if restrictions do not bind them. TA-Analytics is an application
for the collection and analysis of Open Data that provides an interface for building interactive
dashboards shareable among all users.
   TA-Analytics is made of two principal components, which are highlighted in Figure 4. The
first is the Data Importer (DI) component, which is responsible for automatically downloading
datasets from different Open Data repositories using the services (i.e., APIs) they expose.
The component can interact with different web services using different protocols. After down-
loading the datasets, DI stores the imported data inside a database. This process of downloading
and storing data is repeated periodically so that the most up-to-date data are always available.
At the time of writing, DI collects data from two Open Data repositories: i.Stat 3 and USRA 4 .
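   The following is a minimal sketch of such a periodic importer (the endpoint URLs, the SQLite
database, the JSON payload assumption, and the daily schedule are all illustrative placeholders,
not the actual TA-Analytics configuration):

   # A minimal sketch of a periodic Data Importer; endpoints, database,
   # payload format, and schedule are placeholders.
   import time

   import pandas as pd
   import requests
   from sqlalchemy import create_engine

   ENGINE = create_engine("sqlite:///ta_analytics.db")  # placeholder database

   def import_dataset(api_url: str, table: str) -> None:
       """Download one open dataset and store it in the database."""
       response = requests.get(api_url, timeout=60)
       response.raise_for_status()
       records = response.json()  # assumes the API returns a JSON list
       pd.DataFrame(records).to_sql(table, ENGINE, if_exists="replace",
                                    index=False)

   def run_periodically(sources: dict, every_seconds: int = 86400) -> None:
       """Refresh all registered datasets on a fixed schedule."""
       while True:
           for table, url in sources.items():
               import_dataset(url, table)
           time.sleep(every_seconds)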
   The second principal component is the Data Analytics and Visualization (DAV) application,
which interacts with the database to retrieve the downloaded datasets. DAV is the entry point
for end-users to all the datasets and services offered by TA-Analytics, allowing them to
build dashboards comprising analyses and charts. DAV offers users a graphical interface
to interact with the data and create interactive dashboards. Each TA-Analytics dashboard can
include different datasets, visualizations, and analyses. Finally, the implemented dashboards
can be shared among users and embedded in other applications.
   TA-Analytics represents one of the main results of the Territori Aperti project, embodying
the Open Science and FAIR philosophy by allowing users to build and share analyses using
Open Data from every domain of interest. Moreover, TA-Analytics is extensible, since it can embed
new methods and visualization techniques, as well as new data that users want to share.

5.2. Territori Aperti Toolkit
Natural disasters have a significant impact on the population and must be handled promptly by
the competent administrations. However, managing a disaster is not a simple activity. It


   3
       http://dati.istat.it/
   4
       https://bde.comuneaq.usra.it/bdeTrasparente/openData/openDataSet/
includes a series of critical aspects that must be considered and carefully managed, as demon-
strated by the experiences of the 2009 and 2016 seismic craters, which highlighted a series
of critical issues in the reconstruction process and in emergency management.
   For this reason, we have developed the Territori Aperti Toolkit5, which we believe will improve
the sustainability of post-disaster recovery procedures. This dynamic tool provides recommendations
to organizations, institutions, and citizens to better manage every critical aspect of a disaster.
The Toolkit is implemented as a website that maps every critical aspect of managing a disaster
to a series of good and bad practices. Recalling Figure 1, the Territori Aperti Toolkit can be
located in the Disaster Recovery domain, since its primary users are the institutions
and citizens involved in managing a disaster (e.g., municipalities, special offices, civil protection,
and so on). The Toolkit is made of cards, classified by phases and sectors of application, each
with a common set of fields highlighting several aspects of disaster management. Among the
main features of the Toolkit, in addition to the consultation of the various cards, there is
the possibility of filtering them by Phases and Sectors and of searching them using keywords, titles,
or names of the entities involved, all through a dedicated search panel. It is also
possible to generate a PDF of all the cards currently shown, possibly filtered through the defined
conditions.

5.3. Disaster Recovery: the DiReCT Approach
Natural disasters can cause widespread damage to buildings and infrastructure and kill thou-
sands of living beings. These events are difficult to overcome both for the population and
for government authorities. Two challenging issues in particular require to be addressed: finding
an effective way to evacuate people first, and later rebuilding houses and other infrastructure.
An adequate recovery strategy to evacuate people and start reconstructing damaged areas on
a priority basis can then be a game changer, allowing those terrible circumstances to be overcome
effectively. In this perspective, in [35] we present DiReCT, an approach based on i) a dy-
namic optimization model designed to timely formulate an evacuation plan for an area struck
by an earthquake, and ii) a decision support system, based on a double deep Q-network (DDQN),
able to efficiently guide the reconstruction of the affected areas. The latter works by considering
both the resources available and the needs of the various stakeholders involved (e.g., residents'
social benefits and political priorities). The ground on which both the above solutions stand is a
dedicated geographical data extraction algorithm, called “GisToGraph”, especially developed for
this purpose. To check the applicability of the whole approach, we applied it to the real use case
of the historical city center of L'Aquila (Italy), using detailed GIS data and information on urban
land structure and building vulnerability. Several simulations were run on the generated
underlying network. First, we ran experiments to safely evacuate, in the shortest possible time,
as many people as possible from an endangered area towards a set of safe places. Then, using
the DDQN, we generated different reconstruction plans and selected the best ones considering
both the social benefits and the political priorities of the building units. The described approaches
are part of a more general data science framework devoted to producing an effective response to
natural disasters. DIORAMA will embed the two services belonging to DiReCT.
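   For illustration only (DiReCT itself formulates evacuation as a dynamic optimization model [35],
and the graph below is a made-up toy network, not GisToGraph output), the basic ingredient of
routing people over a graph extracted from GIS data can be previewed with a shortest-path query:

   # An illustration only: a toy network previewing routing on a graph
   # extracted from GIS data (as GisToGraph does). Names are made up.
   import networkx as nx

   g = nx.Graph()
   # Edges of the urban network: (node, node, travel time in minutes).
   g.add_weighted_edges_from([
       ("piazza", "via_a", 3), ("via_a", "safe_area_1", 5),
       ("piazza", "via_b", 2), ("via_b", "safe_area_2", 9),
   ])

   def nearest_safe_place(source, safe_places):
       """Route from an endangered node to the closest reachable safe place."""
       lengths, paths = nx.single_source_dijkstra(g, source)
       best = min((n for n in safe_places if n in paths), key=lengths.get)
       return paths[best]

   print(nearest_safe_place("piazza", {"safe_area_1", "safe_area_2"}))
   # ['piazza', 'via_a', 'safe_area_1']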

    5
        https://toolkit.territoriaperti.univaq.it/
5.4. SismaDL
The emergency caused by a natural disaster must be tackled promptly by public institutions. In
this situation, governments enact specific laws (i.e., decrees) to handle the emergency and the
reconstruction of the destroyed areas. This happened in 2009 and 2016, when the Italian Government
issued several, very different decrees to face the L'Aquila and Centro Italia earthquakes, respectively.
In this work, we implemented SismaDL6 [36], an LKIF-based [37] ontology that models
laws in the domain of natural disasters. SismaDL has been used to model the aforementioned
laws to build a knowledge base useful to reason about why one regulation is less effective and
efficient than the other. In particular, SismaDL extends the LKIF ontology to add entities and
properties specific to the 2009 and 2016 regulations. Using this ontology, it is possible to analyze
the differences between the two regulations concerning, for instance, the Legal Model, the
Social Measures, or the Financing Mechanism. Recalling Figure 1, SismaDL can be located under
the Disaster Recovery domain, since it is mostly focused on understanding and analyzing the
regulatory context for managing a natural calamity.
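   As a hypothetical example of how such a knowledge base could be queried with rdflib (the
namespace, file name, and class/property names below are illustrative, not the actual SismaDL
vocabulary):

   # A hypothetical example of querying the knowledge base with rdflib;
   # namespace, file name, and vocabulary are illustrative.
   from rdflib import Graph

   g = Graph()
   g.parse("sismadl.owl", format="xml")  # placeholder path to the ontology

   # Compare the financing mechanisms defined by the 2009 and 2016 decrees.
   QUERY = """
   PREFIX sdl: <http://example.org/sismadl#>
   SELECT ?regulation ?mechanism WHERE {
       ?regulation a sdl:PostDisasterRegulation ;
                   sdl:definesFinancingMechanism ?mechanism .
   }
   """
   for regulation, mechanism in g.query(QUERY):
       print(regulation, mechanism)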


6. Conclusion and Future Works
In this paper, we described DIORAMA, a digital twin for sustainable territorial management
that we will develop as the follow-up of the Territori Aperti project. By leveraging the SoBigData
research infrastructure, it will integrate all the research results and applications developed in
Territori Aperti while promoting the Open Science and FAIR principles. We have also described
in more detail the analysis and visualization research we are conducting and the main
applications developed in Territori Aperti, each belonging to at least one of the domains depicted
in Figure 1. To the best of our knowledge, DIORAMA, with its innovative data science approach
and tools, goes beyond the state of the art in the field of territorial management because it realizes
innovative and efficient services for sustainable territorial management, targeting the several
stakeholders involved, by reusing existing data and providing novel data analysis and powerful
visualization techniques.
   As lessons learned in Territori Aperti and in the definition of DIORAMA, we want to highlight
two main aspects: i) a lot of quality data already exist and can be exploited both for new re-
search in territorial management and for the development of innovative and sustainable services
and applications for institutions, citizens, and decision-makers; ii) by reusing and integrating
the achievements of research projects, we are able to implement a novel DT that also implements the
Open Science principles and FAIR data capabilities.
   Future works are manifold. We still need to work on the foundational approaches described
in Sections 3 and 4 and on improving the applications reported in Section 5. In the future, we
aim to expand our work to other domains (such as urban planning) to make DIORAMA
more complete and valuable.
Acknowledgments. This work is partially supported by Territori Aperti, a project funded by Fondo
Territori Lavoro e Conoscenza CGIL CISL UIL, and by the SoBigData-PlusPlus H2020-INFRAIA-2019-1 EU
project, contract number 871042.

   6
       SismaDL is available in the Territori Aperti RI.
References
 [1] S. Tavakkol, H. To, S. H. Kim, P. Lynett, C. Shahabi, An entropy-based framework for
     efficient post-disaster assessment based on crowdsourced data, in: Proceedings of the 2nd
     ACM SIGSPATIAL International Workshop EM-GIS ’16, 2016, pp. 13:1–13:8.
 [2] M. Grieves, Digital twin: manufacturing excellence through virtual factory replication,
     White paper 1 (2014) 1–7.
 [3] Territori Aperti website, 2019. URL: https://territoriaperti.univaq.it/.
 [4] A. Fuller, Z. Fan, C. Day, C. Barlow, Digital twin: Enabling technologies, challenges and
     open research, IEEE access 8 (2020) 108952–108971.
 [5] A. Suvorova, Towards digital twins for the development of territories, in: Digital Trans-
     formation in Industry, Springer, 2022, pp. 121–131.
 [6] F. N. Abdeen, S. M. Sepasgozar, City digital twin concepts: A vision for community
     participation, Environmental Sciences Proceedings 12 (2022) 19.
 [7] G. Schrotter, C. Hürzeler, The digital twin of the city of zurich for urban planning, PFG–
     Journal of Photogrammetry, Remote Sensing and Geoinformation Science 88 (2020).
 [8] E. Shahat, C. T. Hyun, C. Yeom, City digital twin potentials: A review and research agenda,
     Sustainability 13 (2021) 3386.
 [9] T. Deng, K. Zhang, Z.-J. M. Shen, A systematic review of a digital twin city: A new pattern
     of urban governance toward smart cities, J. of Management Science and Eng. 6 (2021).
[10] A. V. Mukhacheva, M. N. Ugryumova, I. S. Morozova, M. Y. Mukhachyev, Digital twins
     of the urban ecosystem to ensure the quality of life of the population, in: International
     Scientific and Practical Conference Strategy of Development of Regional Ecosystems
     “Education-Science-Industry”(ISPCR 2021), Atlantis Press, 2022, pp. 331–338.
[11] G. Caprari, Digital twin for urban planning in the green deal era: A state of the art and
     future perspectives, Sustainability 14 (2022) 6263.
[12] T. Hörber, The european space agency and the european union, EU space policy (2015).
[13] V. Grossi, B. Rapisarda, F. Giannotti, D. Pedreschi, Data science at sobigdata: the european
     research infrastructure for social mining and big data analytics, International Journal of
     Data Science and Analytics 6 (2018) 205–216.
[14] F. Nargesian, E. Zhu, R. J. Miller, K. Q. Pu, P. C. Arocena, Data lake management: Challenges
     and opportunities, Proc. VLDB Endow. 12 (2019) 1986–1989. URL: https://doi.org/10.14778/
     3352063.3352116. doi:10.14778/3352063.3352116 .
[15] J. Duggan, A. J. Elmore, M. Stonebraker, M. Balazinska, B. Howe, J. Kepner, S. Madden,
     D. Maier, T. Mattson, S. Zdonik, The bigdawg polystore system, SIGMOD Rec. 44 (2015)
     11–16. URL: https://doi.org/10.1145/2814710.2814713. doi:10.1145/2814710.2814713 .
[16] N. Mehrabi, F. Morstatter, N. Saxena, K. Lerman, A. Galstyan, A Survey on Bias and
     Fairness in Machine Learning, ACM Computing Surveys 54 (2021) 1–35.
[17] S. A. Cook, An overview of computational complexity, Comm. of the ACM (1983).
[18] D. V. Carvalho, E. M. Pereira, J. S. Cardoso, Machine learning interpretability: A survey
     on methods and metrics, Electronics 8 (2019) 832.
[19] P. Linardatos, V. Papastefanopoulos, S. Kotsiantis, Explainable AI: A Review of Machine
     Learning Interpretability Methods, Entropy 23 (2020) 18.
[20] B. C. M. Fung, K. Wang, R. Chen, P. S. Yu, Privacy-preserving data publishing: A survey
     of recent developments, ACM Comput. Surv. 42 (2010).
[21] G. d’Aloisio, Quality-driven machine learning-based data science pipeline realization:
     a software engineering approach, in: 2022 IEEE/ACM 44th International Conference
     on Software Engineering: Companion Proceedings (ICSE-Companion), IEEE, 2022, pp.
     291–293.
[22] G. d’Aloisio, A. Di Marco, G. Stilo, Modeling Quality and Machine Learning Pipelines
     through Extended Feature Models, 2022. doi:10.48550/arXiv.2207.07528 .
[23] S. Amershi, A. Begel, C. Bird, R. DeLine, H. Gall, E. Kamar, N. Nagappan, B. Nushi,
     T. Zimmermann, Software engineering for machine learning: A case study, in: IEEE/ACM
     41st ICSE, SEIP track, IEEE, 2019, pp. 291–300.
[24] K. C. Kang, S. G. Cohen, J. A. Hess, W. E. Novak, A. S. Peterson, Feature-oriented domain
     analysis (FODA) feasibility study, Technical Report, Carnegie-Mellon Univ Pittsburgh Pa
     Software Engineering Inst, 1990.
[25] M. Asadi, S. Soltani, D. Gasevic, M. Hatala, E. Bagheri, Toward automated feature model
     configuration with optimizing non-functional requirements, Information and Software
     Technology 56 (2014) 1144–1165.
[26] G. d’Aloisio, G. Stilo, A. Di Marco, A. D’Angelo, Enhancing fairness in classification tasks
     with multiple variables: A data-and model-agnostic approach, in: International Workshop
     on Algorithmic Bias in Search and Recommendation, Springer, 2022, pp. 117–129.
[27] M. J. Kusner, J. Loftus, C. Russell, R. Silva, Counterfactual Fairness, in: Advances in Neural
     Information Processing Systems, volume 30, Curran Associates, Inc., 2017.
[28] A. Agarwal, A. Beygelzimer, M. Dudik, J. Langford, H. Wallach, A Reductions Approach to
     Fair Classification, in: Proc. of the 35th Int. Conf. on Machine Learning, 2018, pp. 60–69.
[29] F. Kamiran, T. Calders, Data preprocessing techniques for classification without discrimi-
     nation, Knowledge and Information Systems 33 (2012) 1–33.
[30] Q. Yang, Y. Liu, Y. Cheng, Y. Kang, T. Chen, H. Yu, Federated learning, Synthesis Lectures
     on Artificial Intelligence and Machine Learning 13 (2019) 1–207.
[31] S. Warnat-Herresthal, H. Schultze, K. L. Shastry, S. Manamohan, S. Mukherjee, V. Garg,
     R. Sarveswara, K. Händler, P. Pickkers, N. A. Aziz, et al., Swarm learning for decentralized
     and confidential clinical machine learning, Nature 594 (2021) 265–270.
[32] R. Vrabič, J. A. Erkoyuncu, P. Butala, R. Roy, Digital twins: Understanding the added value
     of integrated models for through-life engineering services, Procedia Manufacturing 16
     (2018) 139–146. doi:https://doi.org/10.1016/j.promfg.2018.10.167 , proceedings of
     the 7th International Conference on Through-life Engineering Services.
[33] P. C. Wong, Visual data mining, IEEE Computer Graphics and Applications 19 (1999).
[34] D. A. Keim, M. O. Ward, Visual data mining techniques, 2002.
[35] G. Mudassir, E. E. Howard, L. Pasquini, C. Arbib, E. Clementini, A. Di Marco, G. Stilo,
     Toward effective response to natural disasters: A data science approach, IEEE Access 9
     (2021) 167827–167844. doi:10.1109/ACCESS.2021.3135054 .
[36] F. Caroccia, D. D’Agostino, G. d’Aloisio, A. Di Marco, G. Stilo, SismaDL: an ontology to
     represent post-disaster regulation (2021) 14.
[37] R. Hoekstra, J. Breuker, M. Di Bello, A. Boer, et al., The LKIF core ontology of basic legal
     concepts., LOAIT 321 (2007) 43–63.