Improved Analysis of Survey Data using Knowledge Graphs

Anna Gossen¹,², Dr. Eike Nicklas¹,²
¹ Bank for International Settlements
² HMS Analytical Software GmbH

Abstract
In this work, we present an internal knowledge-graph-based application for the storage and analysis of data regarding central banking practice on governance, management and organizational matters. We discuss a custom ontology as well as the high-level application architecture and the implementation challenges we experienced.

Keywords
CEUR-WS, Knowledge Graphs, Ontologies

Knowledge graphs are widely used in the financial sector for various purposes, including fraud detection, banking oversight and customer profiling. Traditional database technologies often cannot meet the expanding analytical needs of a growing market environment. By introducing semantically meaningful metadata and integrating instance data with the contextual structure, knowledge graphs offer a smart and efficient way to create, store, query and analyze data and to convert it into direct value.

In the context of central banking, the analysis of governance and organizational structures and processes involves various challenges due to heterogeneous but strongly interconnected organizational structures and often qualitative or textual data. In addition, hierarchical metadata is commonly required for grouping or aggregation of analytical results.

An exemplary use case demonstrating these challenges is the following:

The data analysts need to find central banks that have a supervisory board, where the governor is not chair, and a dedicated Monetary Policy Committee, where the governor is chair, and at least one member

With a standard relational database, this task requires a lot of time, effort, resources and table join operations. The exercise gets even more complicated when temporal aspects are taken into account. In addition, the data required for this analysis is often not available in one database, but distributed across multiple systems or files.

In this work, we present an internal knowledge-graph-based application for the storage and analysis of data regarding central banking practice on governance, management and organizational matters. It supports the management and analysis of data on various topics and with different data structures, some of which is collected via surveys. Its core is a custom Dataset ontology, which provides generic data structures to store information about given entities, including revisions, temporal versions, provenance and data access information. The ontology reuses W3C standards and open linked vocabularies (SKOS¹, PROV-O², DCAT³), which makes it easy to understand and to apply. Figure 1 provides a high-level overview.

Extensions to this ontology allow for the storage of additional metadata. For example, referenced entities and concepts can be structured in taxonomies for efficient data selection and aggregation, supported by standard ontologies such as the ORG ontology. In addition, related data can be grouped in datasets, e.g. to trace data that was collected in the same survey.

Supported by the use of knowledge graph technologies, the application offers:

• Generic, ontology-driven data analysis
• Advanced, inference-based full text search and contextual filtering
• Data provenance tracking
• Time-based data analysis
• Seamless integration of datasets from heterogeneous sources
• Data quality validation

In this presentation, we describe the generic dataset ontology and a domain-specific extension, as well as the high-level application architecture. In addition, we will share experiences regarding implementation challenges and discuss how the application supports the data analysis work in the context of central banking organization and governance.

Figure 1: Schematic overview of the main concepts in the dataset ontology.

¹ https://www.w3.org/TR/2009/REC-skos-reference-20090818/
² https://www.w3.org/TR/prov-o/
³ https://www.w3.org/TR/vocab-dcat-2/

Published in the Workshop Proceedings of the EDBT/ICDT 2022 Joint Conference (March 29-April 1, 2022), Edinburgh, UK
anna.gossen@bis.org (A. Gossen); eike.nicklas@bis.org (Dr. E. Nicklas)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073
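To make the exemplary use case more concrete, the following minimal sketch expresses it as a pattern match over subject-predicate-object triples: find central banks with a supervisory board that is not chaired by the governor and a Monetary Policy Committee that is chaired by the governor. It is an illustration only and not the application described here. The bank names, the predicates (hasBody, hasGovernor, chairedBy, type) and the in-memory triple set are all assumptions made for this example and are not taken from the Dataset ontology. The final condition of the use case, concerning committee members, is left out because the text does not spell it out.

```python
# Illustrative sketch only. Every identifier below (banks, predicates,
# body types) is hypothetical and not part of the paper's ontology.

TRIPLES = {
    # BankA: board chaired by an external person, MPC chaired by the governor.
    ("BankA", "hasGovernor", "BankA_Governor"),
    ("BankA", "hasBody", "BankA_Board"),
    ("BankA_Board", "type", "SupervisoryBoard"),
    ("BankA_Board", "chairedBy", "BankA_ExternalChair"),
    ("BankA", "hasBody", "BankA_MPC"),
    ("BankA_MPC", "type", "MonetaryPolicyCommittee"),
    ("BankA_MPC", "chairedBy", "BankA_Governor"),
    # BankB: the governor chairs both bodies.
    ("BankB", "hasGovernor", "BankB_Governor"),
    ("BankB", "hasBody", "BankB_Board"),
    ("BankB_Board", "type", "SupervisoryBoard"),
    ("BankB_Board", "chairedBy", "BankB_Governor"),
    ("BankB", "hasBody", "BankB_MPC"),
    ("BankB_MPC", "type", "MonetaryPolicyCommittee"),
    ("BankB_MPC", "chairedBy", "BankB_Governor"),
    # BankC: board chaired externally, but no Monetary Policy Committee.
    ("BankC", "hasGovernor", "BankC_Governor"),
    ("BankC", "hasBody", "BankC_Board"),
    ("BankC_Board", "type", "SupervisoryBoard"),
    ("BankC_Board", "chairedBy", "BankC_ExternalChair"),
}


def objects(subject, predicate):
    """Return all objects o such that (subject, predicate, o) is a triple."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def bodies_of_type(bank, body_type):
    """Return the bodies of a bank that carry the given type."""
    return {b for b in objects(bank, "hasBody") if body_type in objects(b, "type")}


def matches_use_case(bank):
    """Check both governance conditions for a single bank."""
    governors = objects(bank, "hasGovernor")
    # Condition 1: some supervisory board has a chair who is not the governor.
    board_ok = any(
        objects(board, "chairedBy") and not (objects(board, "chairedBy") & governors)
        for board in bodies_of_type(bank, "SupervisoryBoard")
    )
    # Condition 2: some Monetary Policy Committee is chaired by the governor.
    mpc_ok = any(
        objects(mpc, "chairedBy") & governors
        for mpc in bodies_of_type(bank, "MonetaryPolicyCommittee")
    )
    return board_ok and mpc_ok


def find_banks():
    """Return all banks in the triple set that satisfy the use case, sorted by name."""
    banks = {s for s, p, _ in TRIPLES if p == "hasGovernor"}
    return sorted(bank for bank in banks if matches_use_case(bank))


if __name__ == "__main__":
    print(find_banks())
```

In a relational schema, each of these conditions would typically require joins across tables for banks, bodies, roles and persons. In the graph form, each condition is a short path from the bank node. A production system would presumably express the same pattern in a graph query language such as SPARQL rather than in hand-written Python.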