=Paper=
{{Paper
|id=Vol-2977/paper2
|storemode=property
|title=Demo: GeoDataWizard for Linked Spatial Data Creation (short paper)
|pdfUrl=https://ceur-ws.org/Vol-2977/paper2.pdf
|volume=Vol-2977
|authors=Alexandra Rowland,Jorrit Overeem,Erwin Folmer
|dblpUrl=https://dblp.org/rec/conf/esws/RowlandOF21
}}
==Demo: GeoDataWizard for Linked Spatial Data Creation (short paper)==
<pdf width="1500px">https://ceur-ws.org/Vol-2977/paper2.pdf</pdf>
<pre>
Demo: GeoDataWizard for Linked Spatial Data Creation

                Alexandra Rowland1, Jorrit Overeem2 & Erwin Folmer3
           1 Kadaster & University of Twente, 7500 AE Enschede, The Netherlands

                                  lexi.rowland@kadaster.nl
                       2 Kadaster, 7311KZ Apeldoorn, The Netherlands

                                  jorrit.overeem@kadaster.nl
           3 Kadaster & University of Twente, 7500 AE Enschede, The Netherlands

                                  erwin.folmer@kadaster.nl


       Abstract. In order to assist users with the transformation and publication of
       spatial linked data on a small scale, the GeoDataWizard tool was developed by
       Kadaster, the Dutch Land Registry and Mapping Agency, as an open source
       project. The tool allows for the transformation of relational data, in CSV
       format, to spatial linked data. The results of this transformation can be downloaded
       or published to the Platform Linked Data Netherlands (PLDN) triple store. The
       intention of this paper is to support users in making use of this tool.

       Keywords: linked spatial data, linked data tooling, open source tooling.


1      Introduction

This paper demonstrates the GeoDataWizard tool developed by Kadaster, the Dutch
Land Registry. The tool is available as a demonstrator1 and through a GitHub
repository2. It is an extension of the first version of the LDWizard, an open source
project initiated by Network Digital Heritage3 (in Dutch: Netwerk Digitaal Erfgoed).
That project resulted in a product which allows small tabular datasets (CSV files) to
be transformed into linked data. The extension demonstrated here ensures that
geographically related elements in a dataset, such as coordinates or an address, are
also transformed correctly into linked spatial data. The GeoDataWizard is also available
as open source software.
   A unique feature of this tool is the ability to directly publish data to the triple store
maintained by Platform Linked Data Netherlands (PLDN). This makes data accessible
for (SPARQL) querying and visualisation in the triple store itself, but also for reuse in
other applications as an endpoint. At this moment, it is possible to make relations
between Dutch city names contained in a CSV file, the Base Registry for Addresses
and Buildings (Dutch acronym: BAG)4 and Topography (Dutch acronym: BRT)5.


Copyright ©2021 for this paper by its authors. Use permitted under Creative Commons License
  Attribution 4.0 International (CC BY 4.0).

1 https://labs.kadaster.nl/demonstrators/geodatawizard/#1
2 https://github.com/netwerk-digitaal-erfgoed/LDWizard-Core
3 https://github.com/netwerk-digitaal-erfgoed/LDWizard-ErfgoedWizard

   The overall vision for the development of this tool is to assist users with the
creation of spatial linked data. Publishing this newly created spatial linked data to
the PLDN triple store in particular provides the user with multiple out-of-the-box
techniques to perform analysis and visualization tasks on this data. In future, the
tool could serve as the first step in an overall tooling workflow in which
CSV geodata is transformed to spatial linked data, uploaded to the triple store for
analysis, and the results then visualized in a geographical viewer tool such as
the toponamenzoeker6, also developed and maintained by Kadaster. In this way, the
tool makes spatial linked data more accessible to a wider range of user groups and
contexts. Two additions are planned for future iterations of this tool: an address
matching functionality between addresses found in the CSV and those registered in the
BAG dataset, and the automatic generation of SPARQL queries using the uploaded
data. Both again support the skill sets of a wider range of user groups.


2       GeoDataWizard Demonstration: A Step-by-Step Guide

2.1     Step 1: Upload the Dataset

To upload a file, click the 'load your CSV file' button on the main page. Currently, the
Wizard only accepts CSV files. If you do not have a suitable one, an example CSV is
available just below the 'load your CSV file' button.


Fig. 1. Start page of the GeoDataWizard displaying the load your CSV file button.

Following this, a pop-up will appear asking you to choose the file you would like to
upload. Select the relevant CSV file on your local drive and then click open. The
window will disappear and you will be automatically redirected to the configuration step.
When the CSV file is loaded, the first 10 lines of your dataset will appear in a table.
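
The preview described above can be approximated outside the tool to check a file
before uploading it. The following is a minimal sketch in Python and not the
GeoDataWizard's own code: the function name preview_csv, the sample data, and the
choice to treat the first line as a header followed by up to 10 data rows are all
assumptions made for illustration.

```python
import csv
import io
from itertools import islice


def preview_csv(text, limit=10):
    """Return the header and at most `limit` data rows of CSV text."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, [])          # first line is assumed to be the header
    rows = list(islice(reader, limit))  # stop reading after `limit` data rows
    return header, rows


# Hypothetical sample: a header plus 15 data rows.
sample = "id,name,city\n" + "".join(
    f"{i},person{i},Apeldoorn\n" for i in range(1, 16)
)
header, rows = preview_csv(sample)
print(header)
print(len(rows))
```

For a file on disk, the same reader can be built from open(path, newline="")
instead of io.StringIO.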


4 https://bag.basisregistraties.overheid.nl/
5 https://brt.basisregistraties.overheid.nl/
6 https://labs.kadaster.nl/demonstrators/namen-app/#/


2.2     Step 2: Configuration

Two different configurations are possible in the GeoDataWizard when a header is
clicked: either a configuration based on the key column and resource class IRI,
which applies to the entire table (option A), or a separate configuration per column
(option B).

      A. Key Column Configuration
      You can set a key column on which the configuration is based. The values of
      this column are added to the resource class IRI as an ID, so the column must
      contain unique values. When you select resource class IRI, you can set the IRI
      of the resource to which the properties in question apply. If you leave
      the key column and resource class empty, the default values will apply.
      B. Column Configuration
      Each column can be configured separately with a number of settings. These
      settings are important for producing good linked spatial data.
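
To illustrate option A, the sketch below shows one general way in which a key column
value could be appended to a base IRI so that every row gets its own identifier, and
how the uniqueness requirement can be checked beforehand. This is an assumption
about the technique, not the tool's actual implementation; the base IRI
https://example.org/id/ and both function names are hypothetical.

```python
from urllib.parse import quote


def mint_subject_iri(base_iri, key_value):
    """Append a percent-encoded key value to a base IRI."""
    return base_iri + quote(str(key_value).strip(), safe="")


def find_duplicates(values):
    """Return the key values that occur more than once (ideally none)."""
    seen, duplicates = set(), set()
    for value in values:
        if value in seen:
            duplicates.add(value)
        seen.add(value)
    return sorted(duplicates)


# Hypothetical rows with "id" chosen as the key column.
rows = [{"id": "1", "name": "Town hall"}, {"id": "2", "name": "Old church"}]
if not find_duplicates(row["id"] for row in rows):
    for row in rows:
        print(mint_subject_iri("https://example.org/id/", row["id"]))
```

Percent-encoding the key keeps values that contain spaces or slashes from producing
an invalid or ambiguous IRI.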

Datatype Setup
Choose the datatype for each column with the types of analysis in mind that you are
likely to perform after transformation. There are several options:
1. String: this is for text. This is the default value if the data type input is left empty.
2. Integer (int): can be used for integer data types.
3. Float: can be used for numbers with decimals, such as a coordinate point.
4. WKT Literal: this can be used if coordinates are stored in a separate column. Note:
   these values must be indicated as POINT (lat, long) in your dataset.
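
The four options above correspond naturally to typed literals in RDF. The sketch
below shows what such literals look like in N-Triples syntax. The XML Schema and
GeoSPARQL datatype IRIs are standard, but the assumption that the GeoDataWizard
emits exactly these datatypes, as well as the dictionary keys and the function
name, is illustrative only.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"
GEO = "http://www.opengis.net/ont/geosparql#"

# Assumed mapping from the datatype options to datatype IRIs.
DATATYPE_IRIS = {
    "string": XSD + "string",
    "integer": XSD + "integer",
    "float": XSD + "float",
    "wkt": GEO + "wktLiteral",
}


def typed_literal(value, datatype="string"):
    """Serialise a cell value as an N-Triples typed literal.

    Only backslashes and double quotes are escaped, which is enough for
    single-line cell values.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"^^<{DATATYPE_IRIS[datatype]}>'


print(typed_literal("Apeldoorn"))           # default: string
print(typed_literal("42", "integer"))
print(typed_literal("52.21", "float"))
# The coordinate text is passed through unchanged; this sketch does not
# validate or reorder the coordinates.
print(typed_literal("POINT (5.96 52.21)", "wkt"))
```

Choosing a numeric datatype rather than string matters later, because SPARQL
comparisons and aggregates treat numeric and string literals differently.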

Property Settings
Each column value has its own property or properties in the linked (spatial) data
structure. The property indicates the type of value in the input. Here, a type can be
assigned specifically per column.

Value Configuration Settings
With this option, you have a number of input values, all of which are used to
transform the column values into IRIs. The options for this are as follows:

1. IRI Prefix transformation: converts the column values to an IRI as set in the
   resource class in combination with the property.
2. Search for cities in BAG: this option is only possible for columns containing
   the name of a city or village. Naturally, this will only work for Dutch data.
3. Link GeoPoint: this option is only possible if the column contains coordinate
   points with a POINT (...) value. This option will link the points with an area in
   the BRT by means of an identification number.
4. Search for places in BRT: this option is possible if you have an address column
   with a street name and house number and also have a place of residence or places
   of birth/death in the dataset. If selected, a new select box will appear with a choice
   of column names to which you want to link the address. Naturally, this will only
   work for Dutch data.
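
Because the Link GeoPoint option depends on every cell holding a POINT (...) value,
it can help to check a column before uploading. The following is a rough sketch of
such a check, not part of the tool. The pattern accepts both a space and a comma
between the two numbers, since the exact form the tool expects is an assumption
here, and the function name parse_point is hypothetical.

```python
import re

# Two decimal numbers inside POINT (...), separated by whitespace or a comma.
POINT_PATTERN = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s*[,\s]\s*(-?\d+(?:\.\d+)?)\s*\)\s*$",
    re.IGNORECASE,
)


def parse_point(cell):
    """Return the two numbers of a POINT value as floats, or None."""
    match = POINT_PATTERN.match(cell)
    if match is None:
        return None
    # Numbers are returned in the order they appear; no axis order is implied.
    return float(match.group(1)), float(match.group(2))


# Hypothetical column values; the last one would need fixing before upload.
column = ["POINT (5.96 52.21)", "POINT (5.96, 52.21)", "Apeldoorn"]
invalid = [cell for cell in column if parse_point(cell) is None]
print(invalid)
```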

Once the desired configuration is set, you can press confirm and next to proceed to the
publishing screen.


2.3     Step 3: Publication

To publish data to the triple store, you need an account in order to
request a token. If you are an existing TriplyDB7 user, you do not need to request a
new token; you can simply input your existing one. Where necessary, you can request
access by contacting Erwin Folmer (erwin.folmer@kadaster.nl). Alternatively,
you can download the data in three formats: CSV, RDF, and a script through
which the transformation can be run manually.
   Once you have access to the triple store, go back to the GeoDataWizard
publication page. Below the token input you will see ‘Create a new token at: Kadaster or
PLDN’. Right click on PLDN to open a new window or tab. You will be taken to the
login page for the PLDN triple store, where you input your username and password. You will
then be redirected to the following page (Figure 2).


Fig. 2. Generation of an access token for the PLDN triple store.

Click +create token, input a token name and set the management access. A token
works on three access levels and allows you to restrict your published data for other
users. By default, there is a read access restriction, under which your
data can be read but not edited by external users. Click create and your new
token will appear as a popup. Copy this token and store it for future use, as it is only
issued once.


7 https://triplyDB.com


   After copying, click close, return to the GeoDataWizard and paste the token
into the field. Click load token and the GeoDataWizard will display the account
to which the dataset will be published. Click publish to publish your dataset. You can
now view the results on the PLDN data platform through the click here link.

</pre>