<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Linked Data for the Norwegian State of Estate Reporting Service</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ling Shi</string-name>
          <email>Ling.Shi@statsbygg.no</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bjorg Elsa Pettersen</string-name>
          <email>BjorgElsa.Pettersen@statsbygg.no</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dina Sukhobok</string-name>
          <email>dina.sukhobok@sintef.no</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nikolay Nikolov</string-name>
          <email>nikolay.nikolov@sintef.no</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dumitru Roman</string-name>
          <email>dumitru.roman@sintef.no</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>SINTEF</institution>
          ,
          <addr-line>Forskningsveien 1a, 0373 Oslo</addr-line>
          ,
          <country country="NO">Norway</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Statsbygg</institution>
          ,
          <addr-line>Pb. 8106 Dep, 0032 Oslo</addr-line>
          ,
          <country country="NO">Norway</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The Norwegian State of Estate (SoE) report includes information about all Norwegian state-owned properties and buildings in the public sector and aims to help government decision makers allocate resources more effectively. A Linked Data based approach is presented here to increase transparency in the government administration and to improve both the report generation process and the report quality. Cross-domain government data originating from the business entity register, the cadastral system, the building accessibility register and the previous SoE report are acquired, prepared, cleaned, transformed to Linked Data format and published. The source datasets are then integrated, augmented and interlinked before the results are published through a SPARQL endpoint and used for data visualization and report generation.</p>
      </abstract>
      <kwd-group>
        <kwd>Linked Data</kwd>
        <kwd>data integration</kwd>
        <kwd>government data</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The SoE report, issued by Statsbygg<sup>3</sup> on behalf of the Ministry of Local
Government and Modernisation (KMD), is a governmental white paper providing a
complete list of state-owned properties and buildings in the Norwegian public
sector. The report has been produced every four years by manually
collecting information from multiple sources. The data collection and
quality control process has historically been resource demanding and error prone,
and the result was static: it did not reflect changes made after the report was
published. This paper describes the publishing and integration of several
cross-domain government datasets related to state-owned real estate as Linked Open
Data. Sharing data about state-owned properties in a Linked Data format
enables data reuse, opens up possibilities for using the data in innovative ways,
and helps to increase transparency in the government administration [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>In addition, we demonstrate a Web-based application for registering and
reporting state-owned properties in Norway. This represents a major improvement
over the tedious, manual collection of property data.</p>
      <fn-group>
        <fn><label>3</label><p>http://www.statsbygg.no/Om-Statsbygg/About-Statsbygg/</p></fn>
      </fn-group>
    </sec>
    <sec id="sec-2">
      <title>Approach and Implementation</title>
      <p><bold>Data Sources.</bold> Cross-domain government data from different open and
proprietary sources, listed below, are involved in the Linked Data generation process.</p>
      <list list-type="bullet">
        <list-item><p>The central government organization dataset (a subset of data from the Norwegian Business Entity Register administered by the Brønnøysund Register Centre<sup>4</sup>);</p></list-item>
        <list-item><p>The cadastral dataset (a subset of data from the Norwegian Cadastral System<sup>5</sup> administered by the Norwegian Mapping Authority);</p></list-item>
        <list-item><p>The building accessibility dataset from the Building Accessibility Register<sup>6</sup> administered by Statsbygg;</p></list-item>
        <list-item><p>The previous SoE report<sup>7</sup> dataset administered by Statsbygg;</p></list-item>
        <list-item><p>The municipality boundaries dataset<sup>8</sup> administered by the Norwegian Mapping Authority.</p></list-item>
      </list>
      <p>The non-geospatial datasets are prepared by dataset providers in tabular format
and the geospatial datasets are provided as shapefiles.</p>
      <p>Though the source data come from the most authoritative sources in their
respective domains, they are not always complete and accurate. Data inconsistency
between source systems is one of the main challenges in the integration process.
Missing values and keys also represent significant barriers to integration.
Such issues are addressed in the Linked Data generation process.</p>
      <p><bold>Ontology description.</bold> Property-related data are rich in attributes, with both
spatial and temporal characteristics. A common ontology model provides the
necessary semantic description of the data. We reuse standard and established
ontologies such as DBpedia-owl, DUL, GeoSPARQL and schema.org to
represent the data. For example, schema:leiCode, schema:legalName, dbpedia-owl:type,
schema:foundingDate and schema:parentOrganization are used to model the
organization number, name, type, founding date and parent organization, respectively,
in the central government organization dataset.</p>
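      <p>As a minimal illustration of this property assignment, and not the mapping actually configured in DataGraft, the following Python sketch turns one tabular organization record into subject-predicate-object tuples. The column names, the example record and the example.org subject URI are assumptions introduced here; only the five property names come from the text. For simplicity every object is kept as a plain string, whereas a real mapping would likely emit a resource URI for the parent organization.</p>
      <preformat>
```python
# Namespace URIs of the two vocabularies named in the text.
SCHEMA = "http://schema.org/"
DBO = "http://dbpedia.org/ontology/"

# Column name in a hypothetical tabular source -> property URI.
PROPERTY_MAP = {
    "org_number": SCHEMA + "leiCode",
    "name": SCHEMA + "legalName",
    "type": DBO + "type",
    "founded": SCHEMA + "foundingDate",
    "parent": SCHEMA + "parentOrganization",
}


def organization_triples(subject_uri, row):
    """Return (subject, predicate, object) tuples for one organization row.

    Empty or missing cells are skipped rather than emitted as empty values.
    """
    triples = []
    for column, predicate in PROPERTY_MAP.items():
        value = row.get(column)
        if value:
            triples.append((subject_uri, predicate, value))
    return triples


# Hypothetical example record; identifier, name and type code are made up.
row = {
    "org_number": "000000000",
    "name": "Example Directorate",
    "type": "ORGL",
    "founded": "1990-01-01",
    "parent": "",
}
for triple in organization_triples("http://example.org/org/000000000", row):
    print(triple)
```
      </preformat>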
      <p>
        In addition, we developed the proDataMarket<sup>9</sup> [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] ontology to model, among
others, the cadastral and building accessibility domains, based on existing
ontologies and standards. For the cadastral domain, the proDataMarket
ontology reuses the Land Administration Domain Model (LADM) defined in the ISO
19152:2012<sup>10</sup> standard and the cadastral parcel concept specified by the European
Union's INSPIRE data specifications<sup>11</sup>.
      </p>
      <p><bold>Linked Data Generation and Publication.</bold> The publication of SoE data as
Linked Data was performed with the help of DataGraft [
        <xref ref-type="bibr" rid="ref1 ref2">1,2</xref>
        ]<sup>12</sup>, a cloud-based
platform for data cleaning and Linked Data generation. DataGraft facilitates
interactive data cleaning and transformation, mapping data to Linked Data
ontologies, generating a semantic RDF graph and provisioning data both as an RDF
dump and through a SPARQL endpoint.
      </p>
      <fn-group>
        <fn><label>4</label><p>https://www.brreg.no/home/</p></fn>
        <fn><label>5</label><p>http://www.kartverket.no/en/Land-Registry-and-Cadestre/</p></fn>
        <fn><label>6</label><p>https://byggforalle.no/uu/sok.html?&amp;locale=en</p></fn>
        <fn><label>7</label><p>https://www.regjeringen.no/contentassets/f4346335264c4f8495bc559482428908/no/sved/stateigedom.pdf</p></fn>
        <fn><label>8</label><p>http://data.kartverket.no/download/content/geodataprodukter</p></fn>
        <fn><label>9</label><p>http://vocabs.datagraft.net/proDataMarket/</p></fn>
        <fn><label>10</label><p>http://www.iso.org/iso/catalogue_detail.htm?csnumber=51206</p></fn>
        <fn><label>11</label><p>http://inspire.ec.europa.eu/data-model/approved/r4618-ir/html/</p></fn>
      </fn-group>
      <p>Data cleaning and preparation activities for SoE data included assigning valid
cadastral parcel identifiers, unifying the representation of null values across attributes,
changing decimal formatting and converting geospatial data. After the source
data files were cleaned, they were mapped to the ontology described above and
published in DataGraft. Next, data augmentation was performed with the
help of SPARQL CONSTRUCT queries executed on the published data, which
made it possible to infer new data based on known business rules. One example
of such a business rule states that a building is owned by the owner or lessor
of the cadastral parcel on which the building stands. This step made it possible to select
the state-owned properties and buildings, calculate the area of real estate, and infer
building ownership from the information about the owner or lessor of the
corresponding cadastral parcel, among other things. In addition, the published data is interlinked
with several central LOD datasets (DBpedia<sup>13</sup>, GeoNames<sup>14</sup> and Lenka.no<sup>15</sup>) in
order to increase its reusability and to support queries on cross-domain distributed
datasets.</p>
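      <p>The cleaning steps and the ownership rule can be sketched in plain Python as follows. This is an illustrative stand-in only: in the work described here, cleaning was done interactively in DataGraft and the rule was expressed as a SPARQL CONSTRUCT query. The set of null spellings, the decimal-comma input format and all identifiers in the example are assumptions introduced for the sketch.</p>
      <preformat>
```python
# Spellings treated as "no value"; this list is an assumption.
NULL_MARKERS = {"", "-", "n/a", "na", "null", "none"}


def clean_value(raw):
    """Trim whitespace and map the various null spellings to None."""
    if raw is None:
        return None
    text = str(raw).strip()
    return None if text.lower() in NULL_MARKERS else text


def parse_decimal(raw):
    """Convert a decimal-comma number such as '1 234,5' to a float."""
    text = clean_value(raw)
    if text is None:
        return None
    # Drop thousands separators (plain or non-breaking spaces), then
    # switch the decimal comma to a decimal point.
    text = text.replace("\u00a0", "").replace(" ", "").replace(",", ".")
    return float(text)


def infer_building_owners(parcel_holders, building_parcels):
    """Apply the rule: a building is owned by the owner or lessor of its parcel.

    parcel_holders: parcel id -> set of organization ids (owners or lessors)
    building_parcels: building id -> id of the parcel the building stands on
    Returns building id -> sorted list of inferred owner ids. Buildings whose
    parcel has no known holder are left out instead of being guessed.
    """
    inferred = {}
    for building, parcel in building_parcels.items():
        holders = parcel_holders.get(parcel)
        if holders:
            inferred[building] = sorted(holders)
    return inferred


# Hypothetical identifiers, for illustration only.
parcel_holders = {"parcel-1": {"org-A"}, "parcel-2": {"org-B", "org-C"}}
building_parcels = {
    "bldg-1": "parcel-1",
    "bldg-2": "parcel-2",
    "bldg-3": "parcel-9",
}
print(parse_decimal("1 234,5"))
print(infer_building_owners(parcel_holders, building_parcels))
```
      </preformat>
      <p>In this sketch, a building on a parcel without a registered owner or lessor simply receives no inferred owner, so gaps in the source data remain visible instead of being filled silently.</p>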
      <p>The result of the data augmentation and interlinking process is published on
DataGraft and is available through a SPARQL endpoint<sup>16</sup> under the Norwegian
License of Open Data<sup>17</sup> (NLOD). SPARQL queries can be run against the
endpoint to assess the data quality of the SoE report dataset.</p>
    </sec>
    <sec id="sec-3">
      <title>Demonstration</title>
      <p>During the demonstration, we will present the process of generating Linked Data
from the SoE report using DataGraft, and a Web-based application for
registering and reporting state-owned properties in Norway. The demo scenario will
cover uploading raw SoE data to the DataGraft platform, data transformation
and publication as Linked Data, and the Web-based application. The
application allows users to configure visualizations and browse state-owned properties
and relevant data on a map created in CartoDB<sup>18</sup> (see Figure 1). The
state-owned properties absent from the previous SoE report are explicitly marked on the
map (in pink), helping users to identify data quality issues such as
inconsistencies or missing registrations in the source systems.</p>
      <fn-group>
        <fn><label>12</label><p>https://datagraft.io/</p></fn>
        <fn><label>13</label><p>http://wiki.dbpedia.org/wiktionary-rdf-extraction</p></fn>
        <fn><label>14</label><p>http://www.geonames.org/</p></fn>
        <fn><label>15</label><p>http://data.lenka.no/</p></fn>
        <fn><label>16</label><p>https://datagraft.io/prodatamarket_publisher/sparql_endpoints/norwegian-state-of-estate-report-04693e1f-4060-48c1-8ab9-888a6c95f6d6 SPARQL querying at this endpoint currently works only in Chrome.</p></fn>
        <fn><label>17</label><p>https://data.norge.no/nlod/en/1.0</p></fn>
        <fn><label>18</label><p>https://carto.com/</p></fn>
      </fn-group>
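      <p>The marking of properties that are missing from the previous report amounts to a set difference between the current property identifiers and those listed in the previous SoE report. A minimal Python sketch is given below; the function name and the identifiers are hypothetical, and the application itself may compute the flag differently.</p>
      <preformat>
```python
def mark_new_properties(current_ids, previous_report_ids):
    """Return property id -> True if the id is absent from the previous report."""
    previous = set(previous_report_ids)
    return {pid: pid not in previous for pid in current_ids}


# Hypothetical property identifiers, for illustration only.
current = ["prop-001", "prop-002", "prop-003"]
previous = ["prop-001"]
flags = mark_new_properties(current, previous)
for pid, is_new in flags.items():
    # Flagged entries are the candidates for highlighting on the map.
    print(pid, "absent from previous report" if is_new else "already reported")
```
      </preformat>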
      <p><bold>Acknowledgements.</bold> This work is partly funded by the EC H2020 project
proDataMarket (Grant number: 644497).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Roman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dimitrov</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nikolov</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Putlier</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sukhobok</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elvesæter</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berre</surname>
            ,
            <given-names>A. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ye</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Petkov</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <article-title>DataGraft: Simplifying Open Data Publishing</article-title>
          .
          <source>ESWC (Satellite Events)</source>
          <year>2016</year>
          :
          <fpage>101</fpage>
          -
          <lpage>106</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Roman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nikolov</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Putlier</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sukhobok</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elvesæter</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berre</surname>
            ,
            <given-names>A. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ye</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dimitrov</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zarev</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moynihan</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roberts</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berlocher</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Heath</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <article-title>DataGraft: One-stop-shop for open data management</article-title>
          . To appear in the
          <source>Semantic Web Journal (SWJ) Interoperability, Usability, Applicability</source>
          (published and printed by IOS Press, ISSN: 1570-0844),
          <year>2017</year>
          , DOI: 10.3233/SW-170263.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Shi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nikolov</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sukhobok</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tarasova</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Roman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <article-title>The proDataMarket Ontology for Publishing and Integrating Cross-domain Real Property Data</article-title>
          . To appear in the journal
          <source>"Territorio Italia. Land Administration, Cadastre and Real Estate"</source>
          . n.2/
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Shi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pettersen</surname>
            ,
            <given-names>B. E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Østhassel</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nikolov</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khorramhonarnama</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berre</surname>
            ,
            <given-names>A. J.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Roman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2015</year>
          , August).
          <article-title>Norwegian State of Estate: A Reporting Service for the State-Owned Properties in Norway</article-title>
          . In
          <source>International Symposium on Rules and Rule Markup Languages for the Semantic Web</source>
          (pp.
          <fpage>456</fpage>
          -
          <lpage>464</lpage>
          ). Springer International Publishing.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>