Empowering e-services through the Semantic Web

Raffaella Maria Aracri1,†, Dario Frisardi1,†, Roberta Radini1,*,† and Valerio Santarelli2,†

1 Italian National Institute of Statistics (ISTAT), Italy
2 OBDA Systems s.r.l., Italy

Abstract

This article illustrates how to enhance data interoperability among Public Administrations (PPAA) by publishing e-services based on Semantic Web (SW) technologies such as ontologies, controlled vocabularies, and data schemas, which, through standard languages like OWL [6], RDF [12], and SPARQL [5], ensure harmonization, integrability, and unique semantics for representing administrative data. Furthermore, the potential benefits of implementing semantic e-services through Ontology-based Data Management (OBDM), a data governance methodology that enables data services through ontologies, decoupling their implementation from the physical data sources of PPAA, are discussed. Finally, the presented use case highlights the advantages for PPAA of exchanging data through semantic e-services that use concepts published in the Schema platform, developed by the National Data Catalog project funded through the PNRR.

Keywords

PPAA, PNRR, Semantic Web, e-service, ontology, controlled vocabulary, data schema, OBDM

Ital-IA 2024: 4th National Conference on Artificial Intelligence, organized by CINI, May 29-30, 2024, Naples, Italy
* Corresponding author.
† These authors contributed equally.
Email: aracri@istat.it (R. M. Aracri); dario.frisardi@istat.it (D. Frisardi); radini@istat.it (R. Radini); santarelli@obdasystems.com (V. Santarelli)
ORCID: 0009-0005-4399-7589 (D. Frisardi)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

1. Introduction

As part of the investment initiatives delineated within the National Recovery and Resilience Plan (PNRR), which is part of the Next Generation EU (NGEU) program, a measure concerning the digital transition of Public Administrations (PPAA) in data management and interoperability has been designated as the National Digital Data Platform (PDND). This platform acts as the tool that centralizes the authentication and authorization methods for data exchange among parties. PDND therefore manages the authorization phase of accessing the Application Programming Interfaces (APIs), while PPAA set up their automatic connectors to make data accessible and interoperable, promoting their sharing among administrations, as well as with citizens and enterprises. In particular, this approach spares citizens from having to provide the same information multiple times to various administrations.

Additionally, investments aimed at enhancing data interoperability within the PDND also encompass the development of Schema, the National Catalog of data for semantic interoperability, which provides semantic clarity to administrative data through an extensive network of ontologies, controlled vocabularies, and data schemas [7]. These are expressed in standard languages such as OWL, RDF, and SPARQL, ensuring harmonization, integrability, and a unified semantics for representing administrative data.

The adoption of these standards ensures accessibility, reusability, and inferential capabilities across data originating from various sources, processes, and domains. To fully realize the benefits of these technologies, data should ideally be accessible via standard data access protocols, such as SPARQL [5], the W3C's reference language for querying RDF datasets and OWL ontologies. Among the solutions that ensure adherence to these protocols is the Ontology-based Data Management (OBDM) paradigm [8], which advocates a virtual approach to data governance, and consequently to data access, through ontologies.

The rest of this paper is structured as follows. In Section 2, we introduce Schema, the National Data Catalog for semantic interoperability, and explain how the semantic assets published therein can assist PPAA in implementing semantic e-services according to data schemas, leading to full data interoperability. Section 3 illustrates the potential benefits of implementing e-services through OBDM, primarily by decoupling the service layer from the data layer. In Section 4, we provide an example, through a use case, of the benefits that data interoperability guarantees to individuals.

2. Schema and the semantic e-services

Within the framework of semantic interoperability enhancement, as delineated and overseen by the National Data Catalog (NDC) project [7], the development of the Schema portal aims to make the semantic resources of PPAA available. Schema catalogs semantic assets, such as ontologies, controlled vocabularies, and data schemas, making them searchable and reusable, and thus fostering interoperability. The semantic structures of Schema enable the definition of a unified semantics that harmonizes data representation and facilitates information exchange for Italian PPAA. For further information about Schema and the available semantic assets, additional online resources can be found in Appendix A.

Leveraging ontology semantics ensures coherence and consistency in data, significantly enhancing interoperability. Within ontologies, entities are described and referenced uniformly, supported by cardinality constraints that govern the relationships among them.

Data schemas become an essential tool for PPAA to expose their data and facilitate communication through shared channels, exploiting the fundamental principle of a unified semantics. The strength of data schemas lies in their ability to expose the data structure and enforce type and format constraints, ensuring not only syntactic but also semantic interoperability, targeting entities and concepts defined within ontologies and controlled vocabularies [10].

In the context of e-services, the accurate definition of data schemas plays a fundamental role in ensuring data coherence and complete interoperability. Establishing a shared ontological semantics therefore becomes crucial to guarantee a uniform interpretation of data by all involved parties. Through a proper implementation of data schemas, e-services become essential tools for promoting the effective use of semantic technologies within public administration.

The data schema format for the e-service must adhere to the specifications of a YAML file [2] (if version 3.0 of OpenAPI is used, the YAML file name should end with the extension oas3.yaml). Within the e-service data schema, the main components of the service in question must be defined, taking into account the semantic references of ontologies and controlled vocabularies. An example of how the data schema should be structured to define the concept of "Person" is provided below, showing some of the key elements:

    Person:
      type: object
      # URI of the ontology class this schema represents
      description: https://w3id.org/italia/onto/CPV/Person
      # Maps each property name to the ontology property that defines it
      x-jsonld-context:
        # ...
        tax_code: https://w3id.org/italia/onto/CPV/taxCode
        date_of_birth: https://w3id.org/italia/onto/CPV/dateOfBirth
        family_name: https://w3id.org/italia/onto/CPV/familyName
        # ...
      properties:
        tax_code:
          $ref: "#/components/schemas/TaxCode"
        date_of_birth:
          format: date
          type: string
          # Quoted so that YAML reads the pattern as a string, not a sequence
          pattern: '[0-9]{4}-[0-1][0-9]-[0-3][0-9]'
        family_name:
          type: string
        # ...
    TaxCode:
      type: string
      description: https://w3id.org/italia/onto/CPV/taxCode
    # ...

Here the type is reported, and the URI pointing to the semantic resource is included in description. All properties, whether required or otherwise needed by the service, are inserted after the corresponding semantic reference has been specified. It can thus be observed how data schemas incorporate format and type constraints, as well as references to other components (e.g., tax_code), which in turn reference the semantic resource defining the concept. Currently, the JSON-LD standard [13] does not provide a unique way to define the context for primitive values (e.g., strings), and in such cases an ad hoc shared strategy must be adopted. More detailed data schemas are available in Appendix A.

3. Implementation of e-services with OBDM

Ontology-based Data Management (OBDM) [4, 8, 9] is a paradigm introduced and promoted by the Department of Computer, Control, and Management Engineering "Antonio Ruberti" (DIAG) at Sapienza University of Rome and by OBDA Systems (www.obdasystems.com). Its aim is the integration and governance of data stored in an organization's information system through an ontology. The purpose of this approach is to create a single conceptual access point to the organization's information assets, enabling all data governance services within a complex system to be realized at the conceptual level.

Indeed, OBDM can be viewed as a form of virtual data integration. However, it is based on the notion of replacing the global schema, which represents the unified view of the domain, with a conceptual and formal representation formulated through an ontology expressed in a logical language. This choice ensures that the integrated view offered by an OBDM system is not limited to a structure accommodating data from sources but constitutes a semantically rich description of the relevant concepts within the domain of interest and the relationships between them. As in data integration systems, conceptual relationships, or mappings, are used to establish semantic correspondences between the global schema and the data in the sources.

When seeking information, a query is expressed on the ontology (rather than on the information system's databases), and the correspondences established between the data and the ontology's concepts enable the ontological reasoning engine [11, 3] to derive the response. This relieves the user of the need to understand the technical aspects of data storage and the specifics of where and how data is stored. Similarly, when carrying out a data governance task (such as quality assessment, re-engineering, or data cleaning), direct access to the information sources is bypassed, and the appropriate functions are performed through the domain ontology.

The domain ontology not only formally describes the enterprise's information model but also serves as a means to embrace a declarative approach to data governance. Through the explicit delineation of the domain representation, knowledge reusability is achieved, which is not the case when the global schema merely provides a unified description of the underlying sources.

OBDM systems generally share a common structure divided into three layers or tiers: ontology, mappings, and data sources. The distinction between ontology and data sources reflects the separation between the conceptual or semantic level, which is presented to users, and the logical and physical level of the information system, stored in the sources, with mappings serving to reconcile these levels. In this scenario, the ontology and the corresponding mappings to the sources provide not only a tool for data access but also a common basis for documenting an enterprise's information assets. This approach brings significant benefits for the governance and management of the information system.

E-services represent an opportunity to leverage the wealth of shared ontological models and controlled vocabularies in Schema, not only as tools for conceptual sharing, but also as a means of semantically accessing data according to standardized and shared models. The value added by implementing e-services through OBDM techniques is two-fold.

Firstly, it resides in the capability to decouple the implementation of e-services from the physical repositories that host the data of PPAA. In this scenario, e-services could potentially be realized solely by articulating requirements or queries on the ontological models published on Schema by the data-owning PPAA. This approach delegates the task of mediating with the data structures of individual PPAA to the mappings, while the ontological reasoning engine is entrusted with leveraging ontologies and mappings to translate the ontological requirement into queries on the physical data.

Secondly, OBDM makes it possible to access data using SPARQL [5], the W3C standard language for querying ontologies and RDF datasets, without implementing data transformation and migration processes from their sources, typically relational DBMSs, to triple stores holding RDF-formatted data. Given the complexity and the volume of the data commonly managed by the information systems of PPAA, such processes naturally require significant efforts, including infrastructural ones. OBDM, on the other hand, offers a solution distinct from this scenario, favoring a virtual approach to data access, where queries expressed in SPARQL on the ontology are transformed into SQL queries on the physical sources at query time.

The decision to employ OBDM presents clear advantages both in terms of e-service implementation, which relies on standard Semantic Web languages for querying ontologies (i.e., SPARQL), and in terms of e-service maintenance and evolution.

Decoupling from the physical layer would isolate the e-service layer from the usual dynamics of reorganization, restructuring, distribution, or replication typically encountered by databases in information systems, particularly in large organizations. When such changes occur, the adjustments needed to keep data sharing services consistent would be confined to the mapping layer, that is, to the assertions that express the correspondences between elements of the data layer and those of the semantic layer.

4. A use case of semantic interoperability

Building on the previous sections, we present an illustrative use case of data interoperability among PPAA that also offers a valuable aid to citizens by implementing the once-only principle [1]. Suppose that the Ministry of Culture (MiC) intends to offer a promotional service for the enjoyment of cultural assets (such as certain museums) by higher education students from the local area, i.e., their municipality of residence.

This use case involves an exchange of information between e-services exposed by the relevant PPAA. For its modeling, as introduced in Section 2, a portion of the Schema semantic network is used, particularly the core Location (CLV), People (CPV), and Organization (COV) ontologies, along with domain-specific ontologies, such as the Italian Learning Ontology (Learning), the Resident Population Ontology (RPO), and the Cultural Heritage Ontology (CulturalHeritage). All these ontologies are available in Appendix A, while Figure 1 shows a portion of the semantic network.

Figure 1: Extract of Schema's semantic network.

To achieve this, the Ministry of Culture can provide an e-service, S12, to a cultural institution (e.g., a museum), which, given the unique identifier of a person (tax code) and the municipality where the cultural asset subject to promotion is located, retrieves:

  i. an e-service, S1, exposed by the National Register of the Resident Population (ANPR) system, which returns a confirmation if the person's current municipality of residence matches that of the cultural asset;
  ii. an e-service, S2, provided by the Ministry for Universities and Research (MUR), confirming that the person is a student enrolled in a higher education course.

The e-service S12, after retrieving the validations returned by the e-services S1 and S2, returns to the requesting institution the authorization to apply the promotion to the individual.

An example of the YAML configuration for the e-service S1 is provided below, with the key elements:

    components:
      schemas:
        RegisteredResidentPerson:
          x-jsonld-context:
            # Prefix for the Resident Population Ontology
            RPO: https://w3id.org/italia/onto/RPO/
            tax_code: https://w3id.org/italia/onto/CPV/taxCode
            currently_registered_residence:
              "@id": "RPO:currentlyHasRegisteredResidenceIn"
              "@context":
                # Values are resolved against the controlled vocabulary of cities
                "@base": "https://w3id.org/italia/controlled-vocabulary/territorial-classifications/cities"
          type: object
          description: https://w3id.org/italia/onto/RPO/RegisteredResidentPerson
          required:
            - tax_code
          properties:
            # Named tax_code to match the required list and the JSON-LD context
            tax_code:
              $ref: "#/components/schemas/TaxCode"
            currently_registered_residence:
              type: string
              enum: [...]
              example: '058103-(1871-01-15)'
        TaxCode:
          type: string
          description: https://w3id.org/italia/onto/CPV/taxCode

The reference to the tax code concept is ensured by the URI identifying the concept (https://w3id.org/italia/onto/CPV/taxCode), just as the controlled vocabulary of cities (https://w3id.org/italia/controlled-vocabulary/territorial-classifications/cities) is used for the current registered residence. Unique semantics therefore emerge as a fundamental tool that, in the reported use case, enables interoperability among the three e-services.

Figure 2 depicts the main components of the use case and their connections through e-services.

Figure 2: Use case representation.

5. Conclusions

As previously outlined, the use of e-services represents a pivotal element in automating data exchange among PPAA. Their implementation can play a crucial role as the driving force for achieving data interoperability, facilitated by a common semantics represented by the data schemas provided by the Schema platform.

Managing data access through the OBDM technique is an area worth investing in over the short term, as it allows the physical data layer to be decoupled from the semantic one and enables these innovative techniques to be applied even to existing data structures without the need for re-conversion. Furthermore, being based on a logical language, these techniques also enable Artificial Intelligence services that infer knowledge from data and facilitate quality management.

The proposed solutions move in this direction, while highlighting that much work still needs to be done. Only the commitment of PPAA to implement the semantic interconnection of concepts, and consequently of data, can fully achieve the objective of "digitalization, innovation and security in the Public Administration." This could provide the impetus for the digitalization of the entire country.

References

[1] AgID. Piano Triennale per l'informatica nella Pubblica Amministrazione 2024-2026. Agenzia per l'Italia Digitale, Rome, Italy, 2023. Available online at: https://docs.italia.it/italia/piano-triennale-ict/pianotriennale-ict-doc/it/2024-2026/index.html.
[2] Oren Ben-Kiki, Clark Evans, and Ingy döt Net. YAML Ain't Markup Language version 1.2. 2009. Available online at: http://yaml.org/spec/1.2/spec.html.
[3] Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, Antonella Poggi, Mariano Rodriguez-Muro, Riccardo Rosati, Marco Ruzzi, and Domenico Fabio Savo. The MASTRO system for ontology-based data access. Semantic Web, 2(1):43–53, 2011.
[4] Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, and Riccardo Rosati. Tractable reasoning and efficient query answering in description logics: The DL-Lite family. Journal of Automated Reasoning, 39:385–429, 2007.
[5] Steve Harris and Andy Seaborne. SPARQL 1.1 Query Language, 2013. Available online at: https://www.w3.org/TR/sparql11-query/.
[6] Pascal Hitzler, Markus Krötzsch, Bijan Parsia, Peter F. Patel-Schneider, Sebastian Rudolph, et al. OWL 2 Web Ontology Language Primer. W3C Recommendation, 27(1):123, 2009. Available online at: https://www.w3.org/TR/owl2-primer/.
[7] Istat. Trasformazione digitale della pubblica amministrazione. Metodi per l'interoperabilità per lo sviluppo di e-service. In Letture statistiche – Metodi. Istituto nazionale di statistica, 2024. Available online at: https://www.istat.it/it/archivio/293230.
[8] Maurizio Lenzerini. Ontology-based data management. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management, pages 5–6, 2011.
[9] Maurizio Lenzerini. Managing data through the lens of an ontology. AI Magazine, 39(2):65–74, 2018.
[10] Barry Nouwt. Tight integration of web APIs with Semantic Web. In SEMANTiCS (Workshops), 2017.
[11] Antonella Poggi, Domenico Lembo, Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini, and Riccardo Rosati. Linking data to ontologies. In Journal on Data Semantics X, pages 133–173. Springer, 2008.
[12] Guus Schreiber, Yves Raimond, Frank Manola, Eric Miller, and Brian McBride. RDF 1.1 Primer. World Wide Web Consortium, 2014. Available online at: https://www.w3.org/TR/rdf11-primer/.
[13] Manu Sporny, Dave Longley, Gregg Kellogg, Markus Lanthaler, and Niklas Lindström. JSON-LD 1.1. W3C Recommendation, July 2020. Available online at: https://www.w3.org/TR/json-ld11/.

A. Online Resources

The National Data Catalog (NDC) platform:
  • Schema

The sources for data schemas in Schema:
  • INAIL
  • INPS

The mentioned ontologies in Section 4:
  • CLV
  • COV
  • CPV
  • CulturalHeritage
  • L0
  • Learning
  • RPO
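
B. Illustrative sketch of the S12 composition

To make the use case of Section 4 more concrete, the following minimal Python sketch shows how an e-service such as S12 might combine the confirmations returned by S1 and S2. It is only an illustration under stated assumptions: the function names, the PromotionDecision type, the in-memory dictionaries standing in for the ANPR and MUR back ends, and the placeholder tax codes and municipality codes are all hypothetical and do not come from the actual e-services, which would be invoked over the network through PDND.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical in-memory stand-ins for the ANPR and MUR data sources.
# Tax codes and municipality codes are made-up placeholders.
RESIDENCE_BY_TAX_CODE = {"TAXCODE-A": "M-001", "TAXCODE-B": "M-002"}
ENROLLED_STUDENTS = {"TAXCODE-A"}


def s1_residence_matches(tax_code: str, municipality_code: str) -> bool:
    """Stub for S1 (ANPR): is the person resident in the given municipality?"""
    return RESIDENCE_BY_TAX_CODE.get(tax_code) == municipality_code


def s2_is_enrolled_student(tax_code: str) -> bool:
    """Stub for S2 (MUR): is the person enrolled in a higher education course?"""
    return tax_code in ENROLLED_STUDENTS


@dataclass(frozen=True)
class PromotionDecision:
    """Answer returned by S12 to the requesting cultural institution."""
    tax_code: str
    municipality_code: str
    authorized: bool


def s12_authorize_promotion(
    tax_code: str,
    municipality_code: str,
    s1: Callable[[str, str], bool] = s1_residence_matches,
    s2: Callable[[str], bool] = s2_is_enrolled_student,
) -> PromotionDecision:
    """Stub for S12 (MiC): authorize only if both S1 and S2 confirm."""
    authorized = s1(tax_code, municipality_code) and s2(tax_code)
    return PromotionDecision(tax_code, municipality_code, authorized)


# Example call: a resident student asking for the promotion in municipality M-001.
decision = s12_authorize_promotion("TAXCODE-A", "M-001")
print(decision.authorized)  # True
```

Passing S1 and S2 as parameters mirrors the decoupling argued for in Section 3: S12 depends only on the shared meaning of "tax code" and "municipality", not on how each administration stores or retrieves its data.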